question about bst

Hi everyone,I'm trying to do some kind of things about bst.
actually,program finds occurencens of each word and write alphapectiaclly in to text using binary search tree.I have written code and it works but ı have a problem dealiing with read from file.

here is my code:
bst.h
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
 
#include <iostream>
#include <string>
#include <fstream>
#include "stack.h"

using namespace std;

struct node
{
int occurence;
string word;
node *left;
node *right;
int difWord;
};

class BST
{

public:
	BST();
	void insert(string word);
	void FindDuplicates();
	void InorderTraversal();
		node *search(string word);

	

private:
	node *createNode(string word);
	void INT(node *node);
	node * curr;
	node *pre;
	node *root;
};

BST::BST()
{
	root = NULL;

}

node *BST::createNode(string word)
{
	node *n = new node;
	
	n->word=word;
	n->left = NULL;
	n->right = NULL;
	n->occurence = 1;

	return n;
}

void BST::insert(string word){

 
	node *pre = NULL;
node *cur = root;

	if(root == NULL)
	{
		root = createNode(word);
		return;
	}
	
	
	while(cur != NULL)
	{	pre = cur;
		
		if(word== cur->word)
        {
                  cur->occurence++;
                  break;
        }
	
	
		 else if(word > cur->word)
			cur = cur->right;
		else
			cur = cur->left;
	}
	
	if(word > pre->word)
		pre->right = createNode(word);
	else
		pre->left = createNode(word);
 }






void BST::InorderTraversal()
{
	std::cout << "...Inorder Traversal..." << endl;
	BST::INT(root);
}

void BST::INT(node *node)
{

	ofstream myfile ("example.txt",ios::app);
	


	if (node == NULL)
	{
		return;
	}

	
	
	BST::INT(node->left);


	if(node->occurence!=1 &&node->occurence!=0)
	myfile<<node->occurence<<"  "<<node->word<<endl;


	
 	BST::INT(node->right);	


		                                                                                                                                                                                                                                                                                                                                                      
	

}








main()
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
#include<iostream>
#include "bst.h"
#include <string>
#include <sstream>
#include <vector>
using namespace std;

int main()
{
	BST st;
ifstream ine("input.txt");


string currentWord;

while(!ine.eof())
{
ine >> currentWord;
st.insert(currentWord);

}

		 	st.InorderTraversal();


	system("PAUSE");
	return 0;
}



Here is my problem,output:
4 'and
2 'poison,'
20 Alice
2 Alice,
3 And
24 I
3 I'll
2 I've
2 Latitude
2 ME'
5 Rabbit
2 So
6 There
4 VERY
88 the
5 to
51 was
3 way
9 what
4 when
2 whether
3 which
4 with
2 words
5 would
17 you


like this.It took for example Alice and Alice' and so on,for that reason occurences say wrong number.

How can I deal with this problem?

Regards.


Last edited on
Not really a bst question; you just have to do some more processing on your input to make sure the word you are inserting doesn't include such punctuation.
But also,in printing somethink like that:
for example

8 Alice
10 Alice

so,the correct result is 10 Alice.Every word seems like that step how can I prevent this stuation?
I notice you are opening the output file for append for each word.
You may want to try this:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
class BST
{

public:
    BST();
    void insert(string word);
    void FindDuplicates();
    void InorderTraversal();
    node *search(string word);

private:
    node *createNode(string word);
    void INT(node *node);
    void removePunct(string &word);
    node *cur;
    node *pre;
    node *root;
    ofstream myfile;
};

BST::BST()
{
    root = NULL;
    myfile.open("example.txt",ios::out);
}


A sample remove punct method

1
2
3
4
5
6
7
8
9
10
void BST::removePunct(string &word)
{
    for ( string::size_type ix = 0; ix < word.size(); ++ix )
    {   
        if ( ispunct(word[ix]) ) { 
            word.erase(ix,1);
            ix--;
        }
    }   
}


The removePunct should be added before any word comparisons are done.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
void BST::insert(string word)
{
    removePunct(word);

    if ( root == NULL )
    {
        root = createNode(word);
        return;
    }

    node *pre = NULL;
    node *cur = root;

    while ( cur != NULL )



Your output line can then check if the file is open

1
2
3
4
    if ( node->occurence > 1 && myfile.is_open() )
    {
        myfile << node->occurence << "  " << node->word << endl;
    }



Last edited on
Ow!
Thanks for your help,it works fine..But I could not understand these logic..

Appriciated lot!

I'll try to explain the changes. :)

Moving the file handle (myfile) to the class definition allows me to use one handle, and keep the file open for output during the recursive loop.

Originally, the file was being reopened for each word you were adding to the file. Also, by not overwriting the file, every additional run would add more duplicates.

As for the removePunct, I changed the assignments to only occur if root was not null. The position of the removePunct needs to be so that any compares or inserts we do ensures the word is clean.

For removePunct method, the assigning the variable to string::size_type will allow us to correctly iterate the string in memory. Assuming integer may be incorrect and lead to problems compiling on different systems.

http://cplusplus.com/reference/std/locale/ispunct/

ispunct compares if a letter belongs to punctuation group of characters. Each character is compared and if its a punctuation character, its erased from the word. When we erase a character, we need to adjust the index to reflect the change so the next iteration will point to the next letter.

Another approach would be to create a tmp string, and add non-punctuation characters to the string then return the tmp string.


1
2
3
4
5
6
7
8
9
10
11
void BST::removePunct(string &word)
{
    string str;
    for ( string::size_type ix = 0; ix < word.size(); ++ix )
    {
        if ( ! ispunct(word[ix]) ) {       
            str.push_back(word[ix]); // append chr to temp string
        }
    }
    word.assign(str);  // reassign temp string to word
}


Hope that helps. :)


Last edited on
Thanks for your advice and code,I really appriciated!
Topic archived. No new replies allowed.