Optimal method to store string with spaces

I'm trying to read from a file that has several records within it. For instance... one line will include: school name, year, term, course name, etc. which is built from a struct.

I am trying to place one entire record in one index of an array. For example, in a theoretical array with 3 elements, where "record" is the data type of the struct and "myRecord" is the identifier:

#include<iostream>
#include<string>
 
using namespace std;

struct record {

string schoolName, term, courseName;
int year;

};

int main() {

record myRecord[3];

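// Values are listed in member declaration order: schoolName, term, courseName, year.
myRecord[0] = {"City College", "Fall", "Biology", 2014};
myRecord[1] = {"City College", "Fall", "Trigonometry", 2014};
myRecord[2] = {"State College", "Spring", "Calculus", 2015};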

}

The problem is the space inside names such as "City College" and "State College". How can I store the name of the school (which is a string) including the space when extracting from a txt file? I'm using ifstream to read from a local file.

Any help is much appreciated. Thanks
You use std::getline();

std::string name;
std::cout << "Type ya name m9" << std::endl;
std::getline(std::cin, name);

Input: Tarik Neaj

std::cout << "Your name is: " << name << std::endl;

Output: Your name is: Tarik Neaj


In your case, you want to read from your file stream, not std::cin:

std::getline(myFile, name);

http://www.cplusplus.com/doc/tutorial/files/
http://www.cplusplus.com/forum/beginner/11304/
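As a minimal sketch of the idea above: std::getline reads up to the next newline, so any spaces inside the line are kept. The helper name read_lines is made up for this illustration, and it takes a std::istream so the same code should work with an std::ifstream as well as std::cin.

```cpp
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: collect every line of a stream, spaces included.
std::vector<std::string> read_lines(std::istream& in) {
    std::vector<std::string> lines;
    std::string line;
    // getline returns the stream, which tests false once a read fails,
    // so the loop ends cleanly at end-of-file.
    while (std::getline(in, line)) {
        lines.push_back(line);
    }
    return lines;
}
```

Passing an std::istringstream holding "City College\nState College\n" would give two strings, each with its space intact.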
Thanks for the response Tarik Neaj.


I understand getline, though if one record consists of multiple items (members of a struct), how does getline know where to stop extracting for the specific struct member? In this case I want to store "City College" in the struct member (schoolName). How do I prevent it from continuing onto the next word after "College" in the row/record?

As of now, I'm extracting the contents of the txt file with a while loop, and each extraction stops at whitespace (see below). If I use getline(sourceFile, recordsArray[arraySize].school), how will getline know where to stop extracting from the txt file for that specific struct member?


while (!sourceFile.eof()) {

sourceFile >>
		recordsArray[arraySize].school >> recordsArray[arraySize].term >>
		recordsArray[arraySize].year >> recordsArray[arraySize].course >>
		recordsArray[arraySize].grade >> recordsArray[arraySize].units;
}



The source text file is formatted as such:


City College     Fall        2014   Biology
City College     Fall        2014   Trigonometry
State College    Spring      2015   PreCalculus

You can use the string find_first_of function passing in the comma as a parameter.
http://www.cplusplus.com/reference/string/string/find_first_of/

    string str("City College, 2014, Fall, Biology");
    record rec;

    size_t prev = 0, comma = str.find_first_of(',');
    rec.schoolName = str.substr(prev, comma);

    prev = comma + 1;
    comma = str.find_first_of(',', prev);
    rec.year = atoi(str.substr(prev, comma).c_str());

    prev = comma + 1;
    comma = str.find_first_of(',', prev);
    rec.term = str.substr(prev, comma);

    prev = comma + 1;
    comma = str.find_first_of(',', prev);
    rec.courseName = str.substr(prev, comma);

    cout << rec.schoolName + ", " << rec.year << ", " + rec.term + ", " + rec.courseName + "\n";


Though for some reason it prints Biology twice. :/

City College, 2014,  Fall, Biology,  Biology
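The duplicate appears because the second argument of std::string::substr is a length, not an end position. str.substr(prev, comma) therefore runs past the next comma, so term ends up holding " Fall, Biology" and courseName then adds " Biology" again. One possible fix is to pass comma - prev as the length. The sketch below does that in a loop; the helper name split_fields is hypothetical, and dropping one leading space per field is an assumption based on the "City College, 2014, Fall, Biology" layout used in the post.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: split on a delimiter, dropping one leading space per field.
std::vector<std::string> split_fields(const std::string& str, char delim) {
    std::vector<std::string> fields;
    std::size_t prev = 0;
    while (true) {
        std::size_t pos = str.find(delim, prev);
        // substr takes (start, length); npos as the length means "to the end".
        std::size_t len = (pos == std::string::npos) ? std::string::npos : pos - prev;
        std::string field = str.substr(prev, len);
        if (!field.empty() && field.front() == ' ') {
            field.erase(0, 1);  // remove the space that follows each comma
        }
        fields.push_back(field);
        if (pos == std::string::npos) {
            break;  // last field reached
        }
        prev = pos + 1;
    }
    return fields;
}
```

The year field comes back as text; it could then be converted with std::stoi before being stored in the struct.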
Since you are using a structure, read/write your entire struct to/from the file:
http://codereview.stackexchange.com/questions/26344/writing-reading-data-structure-to-a-file-using-c
http://stackoverflow.com/questions/5506645/how-to-read-write-a-struct-in-binary-files

If you have to use a human readable text file, then use getline() for each struct member.
Thanks for this solution, integralfx.


What if I don't want to delimit the struct members with a comma? Is there another solution?

I'm assuming the gap between each category is a tab. If so, you can just replace the comma passed to find_first_of() with the tab character literal '\t'. I don't know if this works as I haven't tested it yet, but it's worth a try.
If you know the format of the file won't change, you can read the two words of the school name separately and join them:
std::string part1, part2;
// Use the extraction itself as the loop condition so a failed read ends the loop.
while (sourceFile >> part1 >> part2 >> recordsArray[arraySize].term >>
		recordsArray[arraySize].year >> recordsArray[arraySize].course >>
		recordsArray[arraySize].grade >> recordsArray[arraySize].units) {
	recordsArray[arraySize].school = part1 + " " + part2;  // rejoin the two words
	++arraySize;  // move on to the next array slot
}
integralfx - I'll try that out. Thanks!

Semoirethe - Aren't the >> symbols an extractor? What is being extracted into the strings part1 and part2 in this snippet? I don't understand how two empty concatenated strings assign something. Sorry.. I'm a n00b.
It extracts the contents of the file. Each >> skips leading whitespace, reads characters up to the next whitespace, and stores them in the variable on its right.

For example, part1 stores "City", part2 stores "College", recordsArray[arraySize].term stores "Fall", etc.
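To see this in isolation, the sketch below feeds one line of the file through a std::istringstream instead of a file. The names Row and parse_row are made up for the illustration, and it assumes the school name is always exactly two words.

```cpp
#include <sstream>
#include <string>

// Hypothetical record type for the illustration.
struct Row {
    std::string school, term, course;
    int year = 0;
};

// Each >> skips leading whitespace and reads up to the next whitespace.
// Returns false if the line does not have the expected shape.
bool parse_row(const std::string& line, Row& out) {
    std::istringstream in(line);
    std::string part1, part2;
    if (!(in >> part1 >> part2 >> out.term >> out.year >> out.course)) {
        return false;
    }
    out.school = part1 + " " + part2;  // rejoin the two words of the name
    return true;
}
```

A school with a one-word or three-word name would shift every later field, which is the main weakness of this approach compared with a tab or comma delimiter.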
integralfx - I tried this method and got an error. When I put a break point to see what's being stored in the strings "part1" or "part2", it says "error reading characters of string". Not sure what's going on. Seems odd.
I have a question.
The source text file is formatted as such:
City College     Fall        2014   Biology
City College     Fall        2014   Trigonometry
State College    Spring      2015   PreCalculus

Am I right in assuming that this file is supplied to you by someone else and you cannot change it at all?

If so, is it properly represented here, with all the blanks being ordinary spaces, or is it possible that the original file has tab characters '\t' as separators between each field?
Chevril,


The file is supplied by me so it can change, though I would like this formatting to persist. The separator between each struct member is a tab '\t'.


If the fields are separated with tabs, then you can use the getline() variant that specifies the delimiter:
istream operator >> (istream &is, record &dst)
{
    getline(is, dst.schoolName, '\t');
    is >> dst.year;
    is.ignore();    // ignore the tab following year
    getline(is, dst.term, '\t');
    getline(is, dst.courseName);  // courseName ends with end-of-line
    return is;
}
@dhayden - Looks good. But did you get the year and term fields swapped out of sequence? Also the return type should be a reference to the stream.

const char tab = '\t';

istream & operator >> (istream &is, record &dst)
{
    getline(is, dst.schoolName, tab);
    getline(is, dst.term, tab);
    is >> dst.year;
    is.ignore();    // ignore the tab following year    
    getline(is, dst.courseName);  // courseName ends with end-of-line
    return is;
}

ostream & operator << (ostream &os, const record &dst)
{
     return os 
       << dst.schoolName << tab 
       << dst.term       << tab
       << dst.year       << tab
       << dst.courseName;
}
@ryanjamesdeveloper

I'm not sure whether you've followed the last couple of posts, and how that code relates to your original questions. So here's a complete example showing the code in context.
#include <iostream>
#include <fstream>
#include <string>
#include <vector>

using namespace std;

struct record 
{
    string schoolName;
    string term;
    string courseName;
    int year;
};

const char tab = '\t';

istream & operator >> (istream &is, record &dst)
{
    getline(is, dst.schoolName, tab);
    getline(is, dst.term, tab);
    is >> dst.year;
    is.ignore();    // ignore the tab following year    
    getline(is, dst.courseName);  // courseName ends with end-of-line
    return is;
}

ostream & operator << (ostream &os, const record &dst)
{
     return os 
       << dst.schoolName << tab 
       << dst.term       << tab
       << dst.year       << tab
       << dst.courseName;
}

int main()
{
    ifstream fin("data.txt");

    vector <record> recs;
    record rec;
    while (fin >> rec)
        recs.push_back(rec);
        
    for (auto & r : recs)
        cout << r << endl;
}


Output:
City College    Fall    2014    Biology
City College    Fall    2014    Trigonometry
State College   Spring  2015    PreCalculus

This may not look very exciting. The output looks a whole lot like the input file, so is the program simply copying the input file to the console?

Well no. With the good start suggested by dhayden above, the input was parsed into individual fields, stored in a record structure, and then stored in a vector (much like a dynamically sized array).
The two operator functions allow the record object to be input or output with the same syntax as for ordinary types such as int or string.
Thanks all for the contributions on this post.

I haven't used vector just yet (looks like the syntax is a bit different) so I'm not familiar, but I will look into that a bit more.

Also @chevril can you explain the block of code between lines 18-26 of your example (quoted below)? It looks like a function definition, but not in a form I'm familiar with. Much appreciated!

istream & operator >> (istream &is, record &dst)
{
    getline(is, dst.schoolName, tab);
    getline(is, dst.term, tab);
    is >> dst.year;
    is.ignore();    // ignore the tab following year    
    getline(is, dst.courseName);  // courseName ends with end-of-line
    return is;
}
I haven't used vector just yet


Learn about all the C++ STL container classes, and the algorithms used to access and manipulate the containers, so you don't have to reinvent a flat wheel writing your own containers and algorithms.

The time spent learning how to use one container class, vector for example, will teach you most of the basics how to use all the other container classes.

http://www.cplusplus.com/reference/stl/
http://www.cplusplus.com/reference/algorithm/
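As a small taste of vector, here is a sketch of the basics: it starts empty, grows with push_back, and can be walked with a range-based for loop just like an array. The Course struct and the names_for_year function are invented for this example.

```cpp
#include <string>
#include <vector>

// Hypothetical type for the illustration.
struct Course {
    std::string name;
    int year;
};

// Collect the names of all courses taken in a given year.
std::vector<std::string> names_for_year(const std::vector<Course>& courses, int year) {
    std::vector<std::string> names;  // starts empty, grows as needed
    for (const Course& c : courses) {
        if (c.year == year) {
            names.push_back(c.name);  // append one element at the end
        }
    }
    return names;
}
```

Unlike a fixed array such as record myRecord[3], the vector does not need its size chosen in advance, and names.size() reports how many elements it currently holds.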
explain the block of code between lines 18-26? They look like function definitions but not exactly what I'm familiar with.


They are functions that overload the C++ i/o streams insertion and extraction operators.
http://en.cppreference.com/w/cpp/language/operators

With custom structs and classes, you can't use cin >> or cout << on an object directly, as you can with the built-in data types. The two functions tell the compiler how to read and write your struct as if it were a built-in type, like int.
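A stripped-down illustration of the same idea, using a made-up Point struct rather than the record from this thread: each overload is an ordinary function that receives the stream and the object, does the work, and returns the stream by reference so calls can be chained.

```cpp
#include <istream>
#include <ostream>
#include <sstream>
#include <string>

// Hypothetical type for the illustration.
struct Point {
    int x = 0;
    int y = 0;
};

// Called whenever the compiler sees "stream >> point".
std::istream& operator>>(std::istream& is, Point& p) {
    return is >> p.x >> p.y;  // reuse the built-in int extraction
}

// Called whenever the compiler sees "stream << point".
std::ostream& operator<<(std::ostream& os, const Point& p) {
    return os << '(' << p.x << ", " << p.y << ')';
}
```

With these in place, a statement such as cin >> p or cout << p compiles for a Point, and a loop like while (fin >> rec) works because the returned stream can be tested for success.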