separate string (including blanks) at tab

hello there,

i'm reading from a text file which includes a dictionary in UTF8.
it has the following structure:

word, bla "bla"\ttranslation

so there's a word and a translation separated by a tab. both can include any kind of characters (excluding tab of course), and in particular blanks. i found one piece of code that claimed to separate a string at tab, but it also split at blanks.

how can i separate one line into two (or more) strings at the tab? i've been searching for a while, but can't find anything useful...

thanks in advance!


-kay
(1) find + substr

#include <string>
#include <tuple> 
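std::tuple <std::string, std::string> 
split_at_tab( const std::string& s )
  {
  std::size_t n = s.find( '\t' );
  // No TAB: the whole line is the word and the translation is empty
  // (s.substr( npos ) would throw std::out_of_range)
  if (n == std::string::npos) return std::make_tuple( s, std::string() );
  // n + 1 skips the TAB itself so it does not end up in the translation
  return std::make_tuple( s.substr( 0, n ), s.substr( n + 1 ) );
  }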
string line, word, translation;
while (getline( f, line ))
  {
  std::tie( word, translation ) = split_at_tab( line );
  ...
  }


(2) getline

string word, translation;
while (getline( f, word, '\t' ) && getline( f, translation ))
  {
  ...
  }


(3) stringstream + getline
string line, word, translation;
while (getline( f, line ))
  {
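  istringstream ss( line );  // needs #include <sstream>
  getline( ss, word, '\t' );
  // If the line has no TAB this read fails and would leave the previous
  // translation untouched, so clear it explicitly
  if (!getline( ss, translation )) translation.clear();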
  ...
  }


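Options (1) and (3) are more robust, because they work on one line at a time and so can handle the possibility that a given line does not contain a TAB; in option (2) such a line makes the first getline read on past the newline into the next line. But if you can guarantee that every line in your dictionary is <word> TAB <translation> then option (2) is the simplest.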
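
To make the difference concrete, here is a minimal sketch that runs the option (2) and option (3) loops over the same stream. The names read_per_line, read_direct and entry are made up for this illustration, and collecting the results in a vector is just an assumption about what the loop body does.

```cpp
#include <istream>
#include <sstream>
#include <string>
#include <utility>
#include <vector>

typedef std::pair<std::string, std::string> entry;  // (word, translation)

// Option (3) style: read a whole line first, then split that line
std::vector<entry> read_per_line( std::istream& f )
  {
  std::vector<entry> entries;
  std::string line;
  while (std::getline( f, line ))
    {
    std::istringstream ss( line );
    std::string word, translation;    // fresh for every line, nothing stale survives
    std::getline( ss, word, '\t' );
    std::getline( ss, translation );  // stays empty when the line has no TAB
    entries.push_back( entry( word, translation ) );
    }
  return entries;
  }

// Option (2) style: split directly on the input stream
std::vector<entry> read_direct( std::istream& f )
  {
  std::vector<entry> entries;
  std::string word, translation;
  while (std::getline( f, word, '\t' ) && std::getline( f, translation ))
    entries.push_back( entry( word, translation ) );
  return entries;
  }
```

Fed the three lines "a\tb", "no tab here" and "c\td", read_per_line returns three entries, the second one with an empty translation. read_direct returns only two: its second word is "no tab here\nc", because the search for the TAB ran across the line break.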

Hope this helps.
yes, this helps! the 3rd one is especially great for my use! with it i can even handle the second tab that some lines have for information about word classes (\t noun). it's already working in my program.

thank you very much for your detailed answer!
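
For lines that carry more than one TAB, such as the optional word class mentioned above, one possible approach is to split into as many fields as there are. This is only a sketch: split_tabs is a hypothetical helper name, and it assumes that an empty field (two TABs in a row, or a TAB at the end of the line) should be kept as an empty string.

```cpp
#include <string>
#include <vector>

// Split s at every TAB; blanks are left alone.
// Always returns at least one field (the whole string if there is no TAB).
std::vector<std::string> split_tabs( const std::string& s )
  {
  std::vector<std::string> fields;
  std::size_t start = 0;
  while (true)
    {
    std::size_t n = s.find( '\t', start );
    if (n == std::string::npos)
      {
      fields.push_back( s.substr( start ) );  // last (or only) field
      break;
      }
    fields.push_back( s.substr( start, n - start ) );
    start = n + 1;                            // continue after the TAB
    }
  return fields;
  }
```

With a line like word TAB translation TAB noun this gives three fields, and the caller can check fields.size() to see whether the word class is present.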