Reading a .txt file into Vector

Hello, this is a simple, quick question. What is the best way to write a while loop that reads a file? I can do it as shown below, but it doesn't look good (for example what if I had like data1 to data10).

    ifstream in("duomenys.txt");
    int data1, data2;
    while(in >> data1 && in >> data2)
Assuming your data is whitespace-delimited (spaces, tabs, newlines)

(for example what if I had like data1 to data10).
ifstream in("duomenys.txt");
int data[10];
for (int i = 0; i < 10; i++)
{
    if (in >> data[i])
    {
        // success path
    }
    else
    {
        // failed to read in number, something went wrong
    }
}


If you don't know how many pieces of data you have, use a vector.
e.g.
// Example program
#include <iostream>
#include <fstream>
#include <vector>

int main()
{
    std::ifstream fin("t.txt");
    
    std::vector<int> data;
    
    int element;
    while (fin >> element)
    {
        data.push_back(element);
    }
}
That doesn't work if I have two different vectors. For example, my .txt file is:

10 25
78 3
14 65
78 1
98 2

The first column should then be assigned to a vector seq1, and the second column to seq2.

Here is the example code I had with the while loop:

    ifstream in("duomenys.txt");
    int data1, data2;
    while(in >> data1 && in >> data2)
    {
        seq1.push_back(data1);
        seq2.push_back(data2);
    }
    in.close();



vector 1
10 78 14 78 98
vector 2
25 3 65 1 2
So you're saying that
10 25 42
78 3 1
14 64 82
78 1 3
98 2 4

is also a valid file, each column is a separate sequence, and you don't know in advance how many columns there are?
Yeah, it's a valid file. What I'm trying to say is that if I had, say, 100000 sequences, it would make no sense to write:

while(in >> data1 && in >> data2 && in >> data3 && in >> data4 && in >> data5............)

And then

seq1.push_back(data1);
seq2.push_back(data2);
............................


etc.
Okay, it's a bit more complicated if you don't know in advance how many columns there are.
I would use getline combined with a stringstream to parse how many columns there are from the first line, and then make a vector of sequences (vector of vector) for each column. I'll post an example in a bit.
#include <iostream>
#include <fstream>
#include <sstream>
#include <string>
#include <vector>

int main()
{
    using Sequence = std::vector<int>;
    
    std::ifstream fin("d.txt");
    
    // first goal: Figure out how many columns there are
    // by reading in and parsing the first line of the file
    std::vector<Sequence> sequences;
    {
        std::string first_line;
        std::getline(fin, first_line);
        std::istringstream iss(first_line); // used to separate each element in the line
        
        int element;
        while (iss >> element)
        {
            sequences.push_back(Sequence()); // add empty sequence
            sequences.back().push_back(element); // insert first element
        }
    }
    
    // First line and all sequences are now created.
    // Now we just loop for the rest of the way.
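    // Start in the "end" state if no columns were found (empty or unreadable
    // first line); otherwise the loop below would never terminate.
    bool end = sequences.empty();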
    while (!end)
    {
        for (size_t i = 0; i < sequences.size(); i++)
        {
            int element;
            if (fin >> element)
            {
                sequences[i].push_back(element);
            }
            else
            {
                // end of data.
                // could do extra error checking after this
                // to make sure the columns are all equal in size
                end = true;
                break;
            } 
        }
    }

    // print results
    for (size_t i = 0; i < sequences.size(); i++)
    {
        std::cout << "seq " << i << ":";
        for (int elem : sequences[i])
        {
            std::cout << ' ' << elem;
        }
        std::cout << '\n';
    }
}
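The comment in the read loop mentions checking afterwards that the columns are all equal in size. A minimal sketch of such a check is below; the function name `columns_equal_length` is hypothetical. In the loop above, a file that ends partway through a row leaves the earlier columns one element longer than the later ones, which is the case this would catch.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper: returns true when every column (sequence) holds the
// same number of elements. An empty set of columns counts as consistent.
bool columns_equal_length(const std::vector<std::vector<int>>& sequences)
{
    for (std::size_t i = 1; i < sequences.size(); ++i)
    {
        if (sequences[i].size() != sequences[0].size())
        {
            return false;   // this column is shorter or longer than the first
        }
    }
    return true;
}
```

It could be called right after the read loop, for example to print a warning to std::cerr when it returns false.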
#include <iostream>
#include <fstream>
#include <sstream>
#include <iomanip>
#include <string>
#include <vector>
using namespace std;

using TYPE = string;                   // the type of your data; use string if unknown

int main()
{
   string filename = "duomenys.txt";
   vector< vector<TYPE> > data;
   
   ifstream in( filename );
   for ( string line; getline( in, line ); )
   {
      stringstream ss( line );
      vector<TYPE> row;
      for ( TYPE d; ss >> d; ) row.push_back( d );
      data.push_back( row );
   }

   cout << "Your data:\n";
   for ( auto &row : data )
   {
      for ( auto &item : row ) cout << setw( 10 ) << item << ' ';
      cout << '\n';
   }
}


Your data:
        10         25         42 
        78          3          1 
        14         64         82 
        78          1          3 
        98          2          4 

You guys write really advanced code! :) Anyway, I'll study both of your examples. Thank you!
Yep!

By the way, I purposefully avoided making a stringstream each time just as a self-challenge, but that is definitely more concise code. One difference, though, is that it works on a row basis instead of a column basis.
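If the data has been read row by row but column sequences are wanted (as in the original question), one option is to transpose afterwards. This is only a sketch; the function name `transpose` is an assumption of this example, and it deliberately does not require the rows to be the same length, so ragged input simply yields columns of different lengths.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper: turns row-major data (one vector per line of the file)
// into column-major data (one vector per column).
template <typename T>
std::vector<std::vector<T>> transpose(const std::vector<std::vector<T>>& rows)
{
    std::vector<std::vector<T>> columns;
    for (const auto& row : rows)
    {
        // Grow the set of columns if this row is wider than any seen so far.
        if (columns.size() < row.size())
        {
            columns.resize(row.size());
        }
        for (std::size_t c = 0; c < row.size(); ++c)
        {
            columns[c].push_back(row[c]);
        }
    }
    return columns;
}
```

With the two-column file from earlier in the thread, the first resulting column would hold 10 78 14 78 98 and the second 25 3 65 1 2.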
@DdavidDLT,
You do know that the Python program to do the same is just
import numpy as np
data = np.loadtxt( "duomenys.txt" )
print( data )

C++ has a way to go on the usability front.
loadtxt assumes 'rectangular' data?
SciPy wrote:
Each row in the text file must have the same number of values.
Ah yep

To be fair, NumPy is not part of the Python standard library. You could also find/make a C++ library that could call "loadtxt" to load in data from a file.
Ganado wrote:
loadtxt assumes 'rectangular' data?


According to the reference, yes. (https://docs.scipy.org/doc/numpy/reference/generated/numpy.loadtxt.html)
Each row in the text file must have the same number of values.


This is what it produced for the same input file (which I borrowed from you!):
[[10. 25. 42.]
 [78.  3.  1.]
 [14. 64. 82.]
 [78.  1.  3.]
 [98.  2.  4.]]



I think there's a numpy.genfromtxt that can handle "missing values".


Ganado wrote:
To be fair, NumPy is not part of the Python standard library.

True, but most scientists and engineers will have NumPy, SciPy and matplotlib as standard!
ganado wrote:
You could also find/make a C++ library that could call "loadtxt" to load in data from a file.


Voilà!

#include <iostream>
#include <fstream>
#include <sstream>
#include <iomanip>
#include <string>
#include <vector>
using namespace std;

using TYPE = double;

//------------------------------------------------

template< typename T > vector< vector<T> > loadtxt( const string &filename )
{
   vector< vector<T> > data;
   ifstream in( filename );
   for ( string line; getline( in, line ); )
   {
      stringstream ss( line );
      vector<T> row;
      for ( T d; ss >> d; ) row.push_back( d );
      data.push_back( row );
   }
   return data;
}

//------------------------------------------------

template< typename T > void print( const vector< vector<T> > &data )
{
   for ( auto &row : data )
   {
      for ( auto &item : row ) cout << setw( 10 ) << item << ' ';
      cout << '\n';
   }
}

//======================================================================

int main()
{
   auto data = loadtxt<TYPE>( "duomenys.txt" );
   print( data );
}


        10         25         42 
        78          3          1 
        14         64         82 
        78          1          3 
        98          2          4 
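The loadtxt above accepts rows of any length, whereas the thread notes that NumPy's version expects the same number of values in each row. Below is a sketch of a stricter variant that enforces that; the name `loadtxt_strict`, the choice to skip blank lines, and the use of std::runtime_error are all assumptions of this example rather than anything from the thread. It reads from a `std::istream&` so it can be fed an `ifstream` or a string stream.

```cpp
#include <cstddef>
#include <istream>
#include <sstream>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical stricter variant of loadtxt: throws if a row has a different
// number of values than the first row. Blank lines are skipped (an assumption).
template <typename T>
std::vector<std::vector<T>> loadtxt_strict(std::istream& in)
{
    std::vector<std::vector<T>> data;
    for (std::string line; std::getline(in, line); )
    {
        std::istringstream ss(line);
        std::vector<T> row;
        for (T d; ss >> d; ) row.push_back(d);

        if (row.empty()) continue;   // nothing parsed from this line

        if (!data.empty() && row.size() != data.front().size())
        {
            throw std::runtime_error("row " + std::to_string(data.size() + 1) +
                                     " has a different number of values");
        }
        data.push_back(row);
    }
    return data;
}
```

To read the file from the thread, one could open `std::ifstream in("duomenys.txt");`, check that it opened, and call `loadtxt_strict<double>(in)`. Note that a token that cannot be parsed as T cuts its row short, so it will usually surface as the same length error.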
