Reading a particular set of lines from a .txt file

Hi,
I have a .txt file which contains the x, y and z co-ordinates of particles which I am trying to cast for a particular compound. The number of particles is of the order of 2 billion, and hence the size of the text file is of the order of a few gigabytes. The particles have been cast layer-wise: if there are 15000 layers, there are approx. 2 billion/15000 particles in each layer, and all the particles in a layer have the same Y co-ordinate. Now, I need to read the particles at a given value of Y (say y = 10).

I wrote a small program in which I used fin.seekg( ). However, I realized that seekg positions the stream character-wise, not line-wise. Could someone please tell me how I could read the co-ordinates of all the particles at a particular value of Y using a simple C++ program? I know that this can be done on the command line using 'sed'. Is there a way to use sed but write the output to another file?
What is the exact format of your data? Can you give us one or two lines?
Hi,
The following is the format of the data:

21.2342 (\t) 11.2430 23.5453 0.005 2.25 86 (\n)
49.2429 (\t) 11.2430 45.4353 0.005 2.25 86 (\n)

So basically the first 3 numbers are the x, y and z co-ordinates of the particle, respectively. The other 3 are the radius, density and type. Each value is a floating-point number (except 'type', which is an int), and the values in each line are separated by tabs.
I can't say that this is fully tested but this is the approach I would take.

You would run this by piping in your data file and specifying the required value for y as a parameter. Then redirect the output to the target file:


cat input_file.txt | ./filter 11.2430 > output_file.txt


The output file should contain all the matches in the same format as the input data.


// filter.cpp

#include <string>
#include <sstream>
#include <iostream>

struct particle
{
	double x;
	double y;
	double z;

	double radius;
	double density;

	int type;
};

int main(int argc, char* argv[])
{
	if(argc < 2)
	{
		std::cerr << "Error, need to supply y as argument." << std::endl;
		return 1;
	}

	// get input parameter y
	double y;
	std::istringstream arg(argv[1]);
	if(!(arg >> y)) // this will fail for bad input
	{
		std::cerr << "Argument y was not valid." << std::endl;
		return 1;
	}

	std::string line;
	while(std::getline(std::cin, line)) // one line at a time
	{
		// Then put the line into a stringstream
		// to read out the individual values
		std::istringstream iss(line);
		particle p;
		if(!(iss >> p.x >> p.y >> p.z >> p.radius >> p.density >> p.type))
			continue; // skip blank or malformed lines

		// Exact comparison should be fine here because both values
		// are parsed from the same decimal text (e.g. "11.2430").
		if(p.y == y) // do we have a match?
		{
			// output the matching record exactly as it was read,
			// so the output keeps the input format
			std::cout << line << '\n';
		}
	}

	return 0;
}
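
Since the particles are written layer by layer, a possible refinement is to stop reading as soon as the requested layer has ended instead of scanning the rest of a multi-gigabyte file. The following is only a rough sketch, and it assumes that all lines with the same y value are contiguous in the file. The function name copy_layer is made up for this example; it takes generic streams so it works with files as well as std::cin.

```cpp
#include <cstddef>
#include <istream>
#include <ostream>
#include <sstream>
#include <string>

// Copies every line whose second column equals target_y from in to out
// and returns the number of lines copied.
// ASSUMPTION: all lines of one layer are contiguous in the input, so the
// scan can stop at the first non-matching line that follows a match.
std::size_t copy_layer(std::istream& in, std::ostream& out, double target_y)
{
    std::size_t copied = 0;
    std::string line;
    while (std::getline(in, line))
    {
        // Only x and y need to be parsed to decide whether the line matches.
        std::istringstream fields(line);
        double x = 0.0;
        double y = 0.0;
        if (!(fields >> x >> y))
            continue; // skip blank or malformed lines

        if (y == target_y)
        {
            out << line << '\n'; // write the record unchanged
            ++copied;
        }
        else if (copied > 0)
        {
            break; // the requested layer has ended, no need to read further
        }
    }
    return copied;
}
```

To read from and write to files directly, you could open a std::ifstream on input_file.txt and a std::ofstream on output_file.txt (both from <fstream>) and pass them in, e.g. copy_layer(in, out, 11.2430). If the layers turn out not to be contiguous, the early exit would miss records and the full scan above is the safer choice.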
Hey,
That really helped a lot. Thanks a ton! :)