How to delete duplicates in a text file?

Hello everyone, I hope you are all fine.

I have a problem with my work. I have data in a .txt file.
The data looks like this:


4 5 4 4 1
6 4 4 3 3 2 0
2 3 1
6 8 4 4 2 1 1 figure 1
12 8 8 7 7 5 5 3 3 1 1 0 0
6 7 7 6 4 4 0
2 7 5
6 8 6 5 5 4 4
4 7 4 4 3


The leftmost value in each row tells how many numbers follow it. For instance, in the first row the leftmost value is 4, so there are 4 numbers after it.
The same goes for the rest.

You can see there are duplicated numbers. What I want to do is delete the duplicated values and keep only one of each, like below:


5 4 1
4 3 2 0
3 1
8 4 2 1
8 7 5 3 1 0 figure 2
7 6 4 0
7 5
8 6 5 4
7 4 3

I also don't want to keep the leftmost value (from figure 1) anymore.
I have tried to code this idea. Can anyone help me figure out why it doesn't work?


void readdata2()
{	
	ifstream infile;
	infile.open("neighbor.txt");

	for(int i=0;i<noofvert;i++)
	{
	//	cout << m[i] ;
		for(int j=1;j<m[i];j++)
		{
			cout << num[j];
			for(int k=0;k<m[i]-1;k++)
			{
				if(num[k] == num[k+1])
					num[k+1] = num[k];
				else
					num[k] = num[k+1];
				
			}
			
		}
		//cout << endl;
	
	}
}


I hope you can help me. Thank you very much.
You can do this easily using STL algorithms:
1. read the numbers into an array or vector
2. remove the duplicates with std::unique (http://www.cplusplus.com/reference/algorithm/unique/)
3. write everything back to a file
Thanks hamsterman. I tried what you suggested, but the output became like this:

5 4 1 1
4 3 2 0 2 0
3 1
8 4 2 1 1 1
8 7 5 3 1 0 3 3 1 1 0 0
7 6 4 0 4 0
7 5
8 6 5 4 4 4
7 4 3 3


From this output, I think the program is still trying to fill the original number of values in each row.

In figure 1 the leftmost value of the first row is 4, so the modified program still writes 4 values, even though after deleting the repetition it should write only 3.

This is the code that I have modified:


bool myfunction (int i, int j) {
  return (i==j);
}

void readdata2()
{	
	ifstream infile;
	infile.open("neighbor.txt");
	infile >> noofvert;

	ofstream outfile;
	outfile.open("cplus.txt");

	for(int i=0;i<noofvert;i++)
	{
	   infile >> m[i];
		for(int j=0;j<m[i];j++)
		{
			infile >> num[j];
			//cout << num[j] << " ";
			
		}
	
		unique (num, num+m[i], myfunction);          

		for(int j=0;j<m[i];j++)
		{
			outfile << num[j]<< " ";
		}
		outfile << endl;
		
	
	}

	//cout << *num<< endl;
	

}


Thank you again hamsterman. I hope you can help me figure out this problem.

Thank you.
Note that in the example the return value is kept: it = unique (myvector.begin(), myvector.end());

When the function is given an array, for example:
0 0 5 5 5 3 2 6 6
it removes the duplicates, but the new array is shorter than the original, so there is some rubbish left at the end:
0 5 3 2 6 3 2 6 6

What std::unique returns is a pointer (or iterator) pointing to the end of your new array.
To find the length of your new array, subtract the beginning pointer from the end pointer:
int* end = unique(num, num+m[i]); // note that you don't really need to write your own comparison function
int len = end - num;
Here you go. Just so you know, the leftmost value is now meaningless.
Unique only removes consecutive repeated values,
i.e. after unique, 1 2 2 2 3 4 2 3 would be 1 2 3 4 2 3.
From the look of the data set, a greater-than sort first seemed appropriate.

#include <iostream>
#include <fstream>
#include <string>
#include <sstream>
#include <iterator>
#include <algorithm>
#include <list>
using namespace std;


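// Works like greater-than but forces 'figure x' labels to sort last.
// Note: values are compared as strings, so the order matches numeric
// order only for single-digit values.
bool mySort(const string &s1, const string &s2) {
	const bool f1 = (s1[0]=='f'), f2 = (s2[0]=='f');
	if(f1 || f2)
		return !f1 && f2;	// a value sorts before a label; two labels are equivalent
	return (s1>s2);
}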


void readdata2()
{
	string parse,	// A line of text from the file
		   temp;	// A token read from the line
	ifstream infile("neighbor.txt");
	ofstream outfile("cplus.txt");	// Opening for output truncates any existing contents
	while(getline(infile, parse)) {	// Stops cleanly at end of file
		if(parse.empty())
			continue;
		list<string> line;
		istringstream iss(parse);
		while(iss >> temp) {	// Only store tokens that were actually read
			if(temp=="figure") {	// Keep figure # together
				string fig;
				if(iss >> fig)		// Get the figure #
					temp+=" "+fig;	// Put the space in
			}
			line.push_back(temp);
		}
		line.sort(mySort);	// Custom sort to handle the figure #, descending
		line.unique();		// Strip the duplicates (adjacent after sorting)
		copy(line.begin(), line.end(), ostream_iterator<string> (outfile, " "));
		outfile << endl;
	}
}

Dear hamsterman,

I will try to edit my code as you suggest. Thank you very much, I really appreciate it.

Dear Scott Vass,

Thank you for your help. I have tried your suggested code and it works. Thank you, but I still need to understand it. I really appreciate it.

If I have any problem understanding your suggestions, I'll contact both of you.

Thank you very much.
There is no need to read the string into a vector. You can apply the std algorithms to strings as well as vectors:

#include <iostream>
#include <string>
#include <algorithm>
#include <iterator>
#include <sstream>

int main()
{
	std::string line = "875310331100";

	std::ostringstream oss;
	std::ostream_iterator<char> ossi(oss);

	std::unique_copy(line.begin() + 1, line.end(), ossi);
	std::string uline = oss.str();

	std::cout << line << std::endl;
	std::cout << uline << std::endl;

	return 0;
}
875310331100
75310310
On reflection, looking at your data more carefully I realise that you have spaces separating the numbers. So the method I outlined above would not be sufficient.
This version scans the string and outputs to a vector:

#include <iostream>
#include <string>
#include <algorithm>
#include <iterator>
#include <sstream>
#include <fstream>
#include <vector>

int main()
{
	// declare input/output types and iterators
	typedef std::istringstream in_type;
	typedef std::istream_iterator<size_t> in_iter;
	typedef std::vector<size_t> out_type;
	typedef std::vector<size_t>::iterator out_iter;

	std::ifstream ifs("infile.txt");
	std::ofstream ofs("outfile.txt");

	if(ofs)
	{
		std::string line;
		while(std::getline(ifs, line))
		{
			in_type is(line);

			size_t length;
			if(is >> length)
			{
				out_type ov(length);
				out_iter ovi = ov.begin();

				in_iter isi(is);
				in_iter ise; // istream end

				out_iter ove = std::unique_copy(isi, ise, ovi);

				for(ovi = ov.begin(); ovi != ove; ++ovi)
				{
					ofs << *ovi << ' ';
				}
				ofs << std::endl;
			}
		}
	}

	return 0;
}
5 4 1 
4 3 2 0 
3 1 
8 4 2 1 
8 7 5 3 1 0 
7 6 4 0 
7 5 
8 6 5 4 
7 4 3 
Hi Galik, thanks a lot. I have tried your code and it works on all my data, but I need to study your code first.

A million thanks to everyone. I really appreciate it. I not only got an idea for solving my problem but also increased my knowledge. Thanks guys.
Hello guys,

I have made some adjustments to my previous code and it works. Below is the final code:

bool myfunction (int i, int j) {
  return (i==j);
}

void readdata2()
{	
	ifstream infile;
	infile.open("neighbor.txt");
	infile >> noofvert;

	ofstream outfile;
	outfile.open("cplus.txt");
	for(int i=0;i<noofvert;i++)
	{
	   infile >> mh[i];
		for(int j=0;j<mh[i];j++)
		{
			infile >> num[j];
			//cout << num[j] << " ";
		}
		//cout << endl;
		int *end = unique (num, num+mh[i], myfunction);
		int len = end - num;

		for(int j=0;j<len;j++)
		{
			outfile << num[j]<< " ";
		}
		outfile << endl;
	}

	//cout << *num<< endl;
	

}


The result is the same as Galik's and Scott Vass's. Thanks a lot guys.

To Galik and Scott Vass: I used your code to increase my knowledge. Thanks a lot :)