Problems with vec.erase

I wrote a program that reads text from a text file. The program should erase all double words and all words with fewer than four letters, but it doesn't do either.
Can someone please tell me what's wrong with my program?

#include <string>
#include <algorithm>
#include <iostream>
#include <fstream>
#include <vector>
#include <functional>
#include <cctype>
#include <iterator>

using namespace std;

static struct transform_helper :unary_function <string, string> {
	string operator () (string &value) {
		int i=0;
		
		while (value[i]) {
			//change letters to lower case
			value [i] = tolower(value[i]);
			i++;
		}
		return value;
	}
}transform_helper;

int main () {
	vector <string> vec;

	ifstream stream ("words.txt");
	if(!stream){
		cout << "Can't open file" << endl;
		return 1;
	}

	
	//reads the text into the vector
	copy (istream_iterator <string> (stream), istream_iterator <string>(), back_inserter(vec));

	//calls the function to convert everything to lower case
	transform (vec.begin(), vec.end(), vec.begin(), transform_helper);
	
	//sorts the words
	sort (vec.begin (), vec.end());
	

	//removes all double words
	int test;
	int testeins;
	string a;
	string b;
	for (test=0; test<vec.size(); test++){
		a = vec[test]; b = vec[test+1];
		if (a==b){
			testeins = test+1;
			vec.erase (vec.begin()+testeins);
		}
	}

	int fourcount;
	for (fourcount=0; fourcount<vec.size(); fourcount++){
		if (vec[fourcount].length() < 4){
			int point=0;
			point = fourcount - 1;
			vec.erase (vec.begin()+point);
			fourcount = fourcount-1;
		}
	}

	int count;
	for (count=0; count<vec.size(); count++)
		cout << vec[count] << " " << endl;
	return 0;

}



Did you try debugging? How far did you get before you noticed a problem? Please be more clear about that so that we don't spend time troubleshooting the entire program. The reason that it isn't removing words with less than 4 characters is that your for loop is incorrect.

#include <fstream>
#include <string>
#include <vector>
#include <functional>
#include <cctype>
#include <iterator>

int main()
{
    int fourcount;
    std::vector<std::string> vec;
    vec.push_back("word");
    vec.push_back("the");
    vec.push_back("Greetings");
    for (fourcount=0; fourcount<vec.size(); fourcount++)
    {
        if (vec[fourcount].length() < 4)
        {
            // This is wrong!  You are erasing the wrong string.  If you debug
            // this you will notice that "word" is being erased.
            int point=0;
            point = fourcount - 1;
            vec.erase (vec.begin()+point);
            fourcount = fourcount-1;
        }
    }
}


A better solution is to use std::remove_if and define a predicate that returns true if the string size is less than 4.

Take a look at this thread on std::remove and std::remove_if.
http://www.codeguru.com/forum/showthread.php?t=231045
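
As a rough illustration of the std::remove_if suggestion, here is a minimal sketch. The names is_short_word and drop_short_words are made up for this example and are not from the original program. std::remove_if only moves the kept elements to the front, so its result has to be passed to erase to actually shrink the vector.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Hypothetical predicate: true for words shorter than four characters.
bool is_short_word(const std::string& word) {
    return word.length() < 4;
}

// Hypothetical helper: erase every word the predicate matches.
void drop_short_words(std::vector<std::string>& words) {
    // std::remove_if shifts the kept elements to the front and returns an
    // iterator to the new logical end; erase() then trims the leftover tail.
    words.erase(std::remove_if(words.begin(), words.end(), is_short_word),
                words.end());
}
```

Calling drop_short_words(vec) after the sort would replace the whole hand-written loop, and there is no index to keep in sync.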
Ok let me show you some of the output without the last vec.erase part:
to
to
to
to
to
to
to
to
train
transient
truths
unalienable
unanimous
under
united
usurpations,
we
when
whenever
which
which
while
will
with


And with the vec.erase I get this:
them
them,
themselves
these
these
they
they
thirteen
to
train
transient
truths
unalienable
unanimous
under
united
we
when
whenever
which
which
while
will
with


So it at least deletes all the double words that have fewer than 4 letters. But somehow not all of them.
I still have to read about std::remove, and I will show you the code after I have changed it.
Thanks for the help.

Unfortunately, nobody ever showed us how to use the debugger properly, so I really don't know how to run programs with it; we were taught to run programs without debugging them.
Consider the case where the vector contains the words

[0] one
[1] two
[2] three
[3] four

Your for() loop starts at 0 and checks to see if [0] has less than 4 letters. It does.
So you erase element 0. Your vector is now:

[0] two
[1] three
[2] four

You now increment fourcount to 1, and now you check element 1, which is "three".

See how you missed "two"?
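
One way to avoid skipping "two" in the walk-through above is to advance the index only when nothing was erased. This is just a sketch; the function name erase_short_by_index is invented for the example.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: erase words shorter than four characters by index.
void erase_short_by_index(std::vector<std::string>& words) {
    std::size_t i = 0;
    while (i < words.size()) {
        if (words[i].length() < 4) {
            // After erase, the next element slides into slot i,
            // so i must stay where it is and be checked again.
            words.erase(words.begin() + i);
        } else {
            ++i;  // nothing erased, move on to the next slot
        }
    }
}
```

With the vector one, two, three, four this leaves three and four, because index 0 is examined again after each erase.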
For the case of removing double words, it might be helpful to simply create a temporary vector that doesn't contain duplicates and then swap the two when you are finished. By double words, I am assuming that you are talking about duplicates, right? Even if a word appears 10 times, you want to end up with a vector that contains no duplicate strings? You could simply iterate over the original vector and use std::find on the temporary vector to determine whether the value was already copied into it. If not, copy it. This will greatly simplify your for loop. When you are finished, you simply swap the two vectors. It might not be the most efficient way of doing it, but it would work and it would keep your code simple and readable.
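
A minimal sketch of the temporary-vector idea might look like the following. The function name remove_duplicates_keep_order and the variable unique_words are assumptions made for this example.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Hypothetical helper: keep only the first occurrence of each word.
void remove_duplicates_keep_order(std::vector<std::string>& words) {
    std::vector<std::string> unique_words;
    for (const std::string& word : words) {
        // Copy the word only if the temporary vector does not hold it yet.
        if (std::find(unique_words.begin(), unique_words.end(), word) ==
            unique_words.end()) {
            unique_words.push_back(word);
        }
    }
    // Swap the de-duplicated contents back into the original vector.
    words.swap(unique_words);
}
```

This works whether or not the vector is sorted, at the cost of one linear search per word. If the vector is already sorted, std::unique followed by erase is the usual cheaper alternative.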
What IDE do you use? The debugger basics are not difficult to figure out. If you are smart enough to write a functor as you have I am fairly certain that you are smart enough to fiddle with the debugger a bit. The debugger allows you to step through the code line by line and evaluate the contents of each container.
Consider this program. It is your original program without the file I/O. In this case, the first for loop doesn't remove anything. It will only remove duplicates that happen to be adjacent (and none are here). Is that what you meant by double words? I'm unclear on your requirements. The second for loop will not remove the correct words because of the incorrect use of erase. When the program finds a word with fewer than 4 characters, it erases the previous entry.

#include <string>
#include <algorithm>
#include <iostream>
#include <fstream>
#include <vector>
#include <functional>
#include <cctype>
#include <iterator>

static struct transform_helper : std::unary_function <std::string, std::string> {
    std::string operator () (std::string &value) {
		int i=0;
		
		while (value[i]) {
			//change letters to lower case
			value [i] = tolower(value[i]);
			i++;
		}
		return value;
	}
}transform_helper;

int main()
{
    int fourcount;
    std::vector<std::string> vec;
    vec.push_back("WORD");
    vec.push_back("The");
    vec.push_back("Greetings");
    vec.push_back("WORD");
    vec.push_back("The");
    vec.push_back("Greetings");
    
    //calls the function to convert everything to lower case
    std::transform (vec.begin(), vec.end(), vec.begin(), transform_helper);

    int test;
    int testeins;
    std::string a;
    std::string b;
    for (test=0; test<vec.size();test++)
    {
	a = vec[test];
	b=vec[test+1];
	if (a==b)
	{
	    testeins=test+1;
	    vec.erase (vec.begin()+testeins);
	}
    }


    for (fourcount=0;fourcount<vec.size();fourcount++)
    {
	if (vec[fourcount].length() <4)
	{
	    int point=0;
	    point = fourcount -1;
	    vec.erase (vec.begin()+point);
	    fourcount=fourcount-1;
	}
    }
}


The contents of the vector at the end are the following. "word" is removed because it comes before the entry with fewer than 4 characters. That isn't correct, is it?

vec[0] = "the";
vec[1] = "greetings";
vec[2] = "the";
vec[3] = "greetings";
If by "double words" you mean consecutive repetitions, try this algorithm instead of writing the code yourself.
http://www.cplusplus.com/reference/algorithm/adjacent_find/

You could execute this over and over until you find all of them. In each case, simply erase the element returned by the algorithm if it is not equal to container.end(). When the algorithm returns a value equal to container.end() you are finished.
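
The repeated adjacent_find idea could be sketched roughly like this. The function name erase_adjacent_duplicates is invented for the example, and the sketch assumes the vector has been sorted first so that equal words sit next to each other. It also avoids reading vec[test+1] on the last iteration, which the hand-written loop does and which is past the end of the vector.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Hypothetical helper: erase repeated neighbours until none are left.
void erase_adjacent_duplicates(std::vector<std::string>& words) {
    std::vector<std::string>::iterator it =
        std::adjacent_find(words.begin(), words.end());
    while (it != words.end()) {
        // it points at the first element of an equal pair. Erase it and
        // resume the search from the iterator that erase() returns.
        it = words.erase(it);
        it = std::adjacent_find(it, words.end());
    }
}
```

Resuming from the returned iterator rather than from begin() avoids rescanning the part of the vector that is already known to be free of repeats.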

By the way, nice job trying to use the std template library. At least you were able to write a functor successfully. Your program isn't that far off from what you need it to do.
Thanks guys, the fewer-than-four-letters loop now works. It took me a little while to understand all your answers until it clicked.
I am still working on removing all the duplicates, and I have to remove all commas and periods attached to the words.
Then there is something else: I have a second file with words, and at the end my first vector isn't allowed to contain any words that are in the second file.
kempofighter, you said I should create a second, temporary vector, copy the words into it, and check whether the word I want to copy is already there.
I think this is the key to the last task with the second word file.
Sorry if my writing is sometimes hard to understand. I am trying my best.
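
For the two remaining tasks, a possible sketch is shown below. Both helper names (strip_trailing_punct and remove_excluded) are hypothetical, and the sketch assumes the words from the second file have been read into a vector and sorted so that std::binary_search can be used on it.

```cpp
#include <algorithm>
#include <cctype>
#include <string>
#include <vector>

// Hypothetical helper: drop commas, periods and other punctuation
// from the end of a word.
std::string strip_trailing_punct(std::string word) {
    while (!word.empty() &&
           std::ispunct(static_cast<unsigned char>(word[word.size() - 1]))) {
        word.erase(word.size() - 1);  // erase the last character
    }
    return word;
}

// Hypothetical helper: remove every word that also appears in excluded.
// Assumption: excluded is sorted, as std::binary_search requires.
void remove_excluded(std::vector<std::string>& words,
                     const std::vector<std::string>& excluded) {
    words.erase(
        std::remove_if(words.begin(), words.end(),
                       [&excluded](const std::string& w) {
                           return std::binary_search(excluded.begin(),
                                                     excluded.end(), w);
                       }),
        words.end());
}
```

Stripping the punctuation before sorting and de-duplicating would make "them" and "them," compare equal.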
#include <string>
#include <algorithm>
#include <iostream>
#include <fstream>
#include <vector>
#include <functional>
#include <cctype>
#include <iterator>

using namespace std;



static struct transform_helper :unary_function <string, string> {
	string operator () (string &value) {
		int i=0;
		
		while (value[i]) {
			//change letters to lower case
			value [i] = tolower(value[i]);
			i++;
		}
		return value;
	}
}transform_helper;

int main () {
	vector <string> vec;

	ifstream stream ("words.txt");
	if(!stream){
		cout << "Can't open file" << endl;
		return 1;
	}

	
	//reads the text into the vector
	copy (istream_iterator <string> (stream), istream_iterator <string>(), back_inserter(vec));

	//calls the function to convert everything to lower case
	transform (vec.begin(), vec.end(), vec.begin(), transform_helper);
	
	//sorts the words
	sort (vec.begin (), vec.end());
	

	//removes all words that have fewer than 4 letters
	int fourcount;
	for (fourcount=0; fourcount<vec.size(); fourcount++){
		if (vec[fourcount].length() < 4){
			int point=0;
			point = fourcount;
			vec.erase (vec.begin()+point);
			fourcount = fourcount-1;
		}
	}

	//this should copy each word only once into a new vector named a, but how can I get it back into the original vector?
	vector <string>::iterator a;

	int num;
	for (num=0; num<vec.size(); num++){

		a = adjacent_find (vec.begin (), vec.end());

		if (a!=vec.end());
	}

	int count;
	for (count=0; count<vec.size(); count++)
		cout << vec[count] << " " << endl;
	return 0;

}
Any particular reason why you didn't use iterators for the following part?
//removes all words that have fewer than 4 letters
int fourcount;
for (fourcount=0; fourcount<vec.size(); fourcount++){
	if (vec[fourcount].length() < 4){
		int point=0;
		point = fourcount;
		vec.erase (vec.begin()+point);
		fourcount = fourcount-1;
	}
}
Because I am not really sure how to use them.
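
For comparison, an iterator version of the same loop might look like the sketch below. The function name erase_short_with_iterators is made up for the example. The key point is that vector::erase returns an iterator to the element that followed the erased one, so there is no index to adjust by hand.

```cpp
#include <string>
#include <vector>

// Hypothetical helper: erase words shorter than four characters
// while walking the vector with an iterator.
void erase_short_with_iterators(std::vector<std::string>& words) {
    std::vector<std::string>::iterator it = words.begin();
    while (it != words.end()) {
        if (it->length() < 4) {
            // erase() invalidates it, so continue from the iterator
            // it returns, which points at the next element.
            it = words.erase(it);
        } else {
            ++it;
        }
    }
}
```

In main this would be called as erase_short_with_iterators(vec) in place of the fourcount loop.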