Reading fixed-format file (two delimiters)

I'm trying to read in a file that is in the format:

name:project:specification.

An example file:
John Doe:Cat:ABCDE.
Carly:Mouse:DEFG.



I also want to check if there are any errors in the formatting.

So far I have opened the file and am reading line by line:

std::ifstream inFile;
inFile.open("file.txt");

std::string line;
while(std::getline(inFile, line))
{

}


I know how to set a delimiter; however, I'm not sure of a clean way to extract each section while also checking for the period at the end.

I was thinking along the lines of using two getlines: the first with '.' as the delimiter, putting the result into a std::stringstream, and then getline again with ':' as the delimiter. However, I'm not sure how well this would work for error checking.

I'm not after something to copy-paste, just suggestions on something in line with good practice.

Any suggestions are welcome, thanks!

Hello pkdir,

std::ifstream inFile;

inFile.open("file.txt");

// <--- How do you know the file is open?

std::string name, project, specification;

while(std::getline(inFile, name, ':'))
{
    std::getline(inFile, project, ':');
    std::getline(inFile, specification);  //  <--- Notice no delimiter. Reads until it finds a "\n".

}


Andy

Edit:

I think for the last getline (the one that reads specification) you could use std::getline(inFile, specification, '.') if you do not want the ".". The newline left behind would then need to be skipped before the next name is read.
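
On the question in the code comment above (how do you know the file is open?), here is a minimal sketch of one way to check. The helper name readLines is made up for illustration; the only library behaviour relied on is that a std::ifstream converts to false when the open failed.

```cpp
#include <fstream>
#include <string>
#include <vector>

// Hypothetical helper (not from the thread): reads every line of `path`
// into `lines`. Returns false if the file could not be opened, so the
// caller can report the problem instead of silently reading nothing.
bool readLines(const std::string& path, std::vector<std::string>& lines)
{
    std::ifstream inFile(path);
    if (!inFile)                 // true when the open failed (failbit is set)
        return false;

    std::string line;
    while (std::getline(inFile, line))
        lines.push_back(line);
    return true;
}
```

A caller would typically print a message and stop when readLines returns false, rather than entering the read loop at all.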
#include <iostream>
#include <fstream>
#include <sstream>
#include <string>
#include <vector>
using namespace std;

struct Item
{
   string name;
   string project;
   string specification;
};

istream & operator >> ( istream &in, Item &item )
{
   string line;
   getline( in, line );
   stringstream ss( line );
   getline( ss, item.name, ':' );
   getline( ss, item.project, ':' );
   getline( ss, item.specification, '.' );
   return in;
}


int main()
{
// ifstream in( "file.txt" );
   istringstream in( "John Doe:Cat:ABCDE.\n"
                     "Carly:Mouse:DEFG.\n" );
                     
   vector<Item> things;
   for ( Item item; in >> item; ) things.push_back( item );
   
   for ( Item I : things ) cout << I.name << ", " << I.project << ", " << I.specification << '\n';   
}


John Doe, Cat, ABCDE
Carly, Mouse, DEFG
Thank you for the replies.

While these solutions work for valid data, as mentioned in OP my concern is not only extracting the data but also performing error checking, such as confirming the period exists. Unfortunately using getline(stream, string, '.') will still return the rest of the line even without a . present.
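
To illustrate the point about getline and a missing delimiter, here is a rough sketch of one way to tell the two cases apart. It relies on getline setting eofbit only when it runs out of input before finding the delimiter. The helper name readUntil is an assumption for illustration, and it expects a stream holding a single line.

```cpp
#include <sstream>
#include <string>

// Hypothetical helper: extracts text up to `delim` into `field` and
// reports whether the delimiter itself was actually seen.
bool readUntil(std::istringstream& ss, std::string& field, char delim)
{
    if (!std::getline(ss, field, delim))
        return false;            // nothing could be extracted at all
    // getline sets eofbit only if the input ended before delim was found.
    return !ss.eof();
}
```

This only shows that a '.' exists somewhere in the remaining text, not that it is the last character of the line, so a stricter check may still be needed.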
Use getline() as you have in your first post, then check that there are enough colons and that a period exists (and is in the correct place?) by using some of the different string functions (find, find_if, etc.). Then, if all the delimiters are present, parse the line using a stringstream with the delimiters. Hopefully you have a structure/class designed to hold the data items.


Heh, jlb beat me to the punch. Here’s something you can play with:

#include <ciso646>
#include <iostream>
#include <sstream>
#include <string>
#include <vector>

#define FAIL_HARD 1

struct record
{
  std::string name;
  std::string project;
  std::string specification;
};

using table = std::vector <record> ;

std::istream& operator >> ( std::istream& ins, record& r )
{
  std::string s;
  if (!getline( ins, s )) return ins;
  auto n0 = s.find( ':' );
  auto n1 = s.find( ':', n0+1 );
  if (s.empty() or (n0 == s.npos) or (n1 == s.npos) or (s.back() != '.'))  // s.back() on an empty line is undefined
  {
  #if FAIL_HARD
    ins.setstate( std::ios::failbit );
    return ins;
  #else
    return ins >> r;
  #endif
  }
  r.name          = s.substr( 0, n0 );
  r.project       = s.substr( n0+1, n1-n0-1 );
  r.specification = s.substr( n1+1, s.size()-n1-2 );
  return ins;
}

std::istream& operator >> ( std::istream&& ins, table& t )
{
  record r;
  while (ins >> r) t.emplace_back( r );
  return ins;
}

int main()
{
  std::string s = 
    "Vader:Empire:Father."    "\n"
    "Luke:Heroism:Jedi."      "\n"
    "Anna:Banana:Comedienne"  "\n"  // <-- missing record terminator (period)
    "Leah:Republic:Princess." "\n";
    
   table data;
   std::istringstream{ s } >> data;
   
   int n = 0;
   for (auto rec : data)
     std::cout << (++n) 
       << ": "        << rec.name 
       << " is a "    << rec.specification 
       << " for the " << rec.project << "\n";

#if FAIL_HARD       
   std::cout << "Miss anyone? Try _not_ setting failbit for bad lines; ignore them instead.\n"
                "(#define FAIL_HARD 0)\n";
#endif
}

Good luck!
#include <iostream>
#include <fstream>
#include <sstream>
#include <string>
#include <vector>
using namespace std;

struct Item
{
   string name;
   string project;
   string specification;
};

bool getItem( istream &in, Item &item )
{
   string line;
   if ( !getline( in, line ) ) return false;
   stringstream ss( line );
   if ( getline( ss, item.name, ':' ) && getline( ss, item.project, ':' ) && getline( ss, item.specification ) )
   {
      size_t pos = item.specification.find( '.' );
      if ( pos != string::npos )
      {
         item.specification.resize( pos );
         return true;
      }
   }
   cout << "*** Invalid input line " << line << '\n';
   return false;
}


int main()
{
// ifstream in( "file.txt" );
   istringstream in( "J.S. Bach:Brandenburg Concerto:BWV1050.\n"
                     "W.A. Mozart:Magic Flute:K620.\n"
                     "Rimsky-Korsakov:Scheherazade:Op35 \n"
                     "E.W. Elgar:Pomp & Circumstance:No 2:A-minor. \n"  );
                     
   vector<Item> things;
   for ( Item item; in; ) if ( getItem( in, item ) ) things.push_back( item );
   
   for ( Item I : things ) cout << I.name << ", " << I.project << ", " << I.specification << '\n';   
}


*** Invalid input line Rimsky-Korsakov:Scheherazade:Op35 
J.S. Bach, Brandenburg Concerto, BWV1050
W.A. Mozart, Magic Flute, K620
E.W. Elgar, Pomp & Circumstance, No 2:A-minor
Ugh, I absolutely loathe Pomp and Circumstance.
Ugh, I absolutely loathe Pomp and Circumstance.

You're not British!

Never mind, here's an American orchestra playing Nimrod (one of his Enigma Variations) instead. Awesome piece of music.
https://www.youtube.com/watch?v=sWm7HYzO-Lg
You're not British!

Nope, but 'mericans love that horrid composition as much as you brits. I’m an outlier.

I’ll check out Nimrod a little later, bad internet connection permitting.
I’ve now listened to both. They’re both crap. At least Elgar was consistent.
Just to confirm:
- The first and second fields can contain anything except '\n' or ':'
- The third field can contain ... what? Anything except '.' and '\n'? Can it contain ':'?
- The final '.' must be followed by a newline.

Is that the syntax you're looking for?
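
If those are the rules, a strict check might look something like the sketch below. It assumes the answer to the open question is "no", i.e. the third field may contain neither ':' nor '.'. The names Record and parseLine are invented for illustration.

```cpp
#include <string>

struct Record { std::string name, project, specification; };

// Parses one line (already stripped of its '\n') as name:project:specification.
// Assumption: the third field may contain neither ':' nor '.'.
// `out` is only modified when the line is valid.
bool parseLine(const std::string& line, Record& out)
{
    if (line.empty() || line.back() != '.')
        return false;                              // the '.' must end the line

    const std::string body = line.substr(0, line.size() - 1);
    const std::string::size_type first = body.find(':');
    if (first == std::string::npos) return false;
    const std::string::size_type second = body.find(':', first + 1);
    if (second == std::string::npos) return false;

    Record r{ body.substr(0, first),
              body.substr(first + 1, second - first - 1),
              body.substr(second + 1) };

    // Reject a stray ':' or '.' inside the third field.
    if (r.specification.find_first_of(":.") != std::string::npos)
        return false;

    out = r;
    return true;
}
```

Under these assumptions the Elgar line from the earlier example would be rejected because of its extra ':', while a '.' inside the name (as in "J.S. Bach") is still accepted.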
Yes that's the syntax, I also want to make sure it contains no invalid characters.

When do I check for the fail/bad bits? The eof bit putting the stream into an error state confuses me.
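
On the stream-state question, a small sketch may help. The usual pattern is to test the stream in the loop condition and look at the individual bits only after the loop ends. When getline finds nothing left to read it sets both eofbit and failbit, which is what stops the loop; badbit signals a real I/O error. The names ReadResult and countLines are made up for illustration.

```cpp
#include <cstddef>
#include <istream>
#include <sstream>
#include <string>

enum class ReadResult { ReachedEnd, IoError };

// Hypothetical helper: counts lines and reports why reading stopped.
ReadResult countLines(std::istream& in, std::size_t& count)
{
    count = 0;
    std::string line;
    // The condition becomes false once failbit or badbit is set.
    while (std::getline(in, line))
        ++count;

    if (in.bad())
        return ReadResult::IoError;    // unrecoverable read error
    return ReadResult::ReachedEnd;     // eofbit and failbit set after the last line
}
```

The function accepts any std::istream, so a std::istringstream can stand in for the file while testing, as in the earlier posts.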