boost::split does not work as expected

http://www.boost.org/doc/libs/1_39_0/doc/html/string_algo/usage.html#id3408774

The page linked above has an example of using boost::split:

#include <iostream>
#include <string>
#include <vector>
#include <boost/algorithm/string.hpp>

int main() {

    std::string str1("hello abc-*-ABC-*-aBc goodbye");

    typedef std::vector< std::string > split_vector_type;

    split_vector_type SplitVec; // #2: Search for tokens
    boost::split( SplitVec, str1, boost::is_any_of("-*") ); // SplitVec == { "hello abc","ABC","aBc goodbye" }

    for ( unsigned i = 0; i < SplitVec.size(); ++i ) {
        std::cout << "\"" << SplitVec[i] << "\"\n";
    }

    return 0;
}


It says in the comment that SplitVec contains { "hello abc", "ABC", "aBc goodbye" }, but when you run the code the following is printed:
"hello abc"
""
""
"ABC"
""
""
"aBc goodbye"


Shouldn't it discard those empty tokens?
It could be that is_any_of() matches each separator character individually, so the string is split like this:

[hello abc]-[]*[]-[ABC]-[]*[]-[aBc goodbye]

The bracketed parts are the resulting strings; the characters between them are the separators.
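To make the bracket diagram above concrete, here is a rough sketch of a splitter that treats every separator character as its own split point. The helper name naive_split is made up for illustration and this is not Boost's actual implementation; it only mimics the behavior observed in this thread.

```cpp
#include <string>
#include <vector>

// Hypothetical helper, NOT Boost's implementation: every separator character
// closes the current token, so two adjacent separators yield an empty token.
std::vector<std::string> naive_split(const std::string& input,
                                     const std::string& separators) {
    std::vector<std::string> tokens;
    std::string current;
    for (std::string::size_type i = 0; i < input.size(); ++i) {
        if (separators.find(input[i]) != std::string::npos) {
            tokens.push_back(current);  // may be empty between adjacent separators
            current.clear();
        } else {
            current += input[i];
        }
    }
    tokens.push_back(current);  // whatever follows the last separator
    return tokens;
}
```

Calling naive_split("hello abc-*-ABC-*-aBc goodbye", "-*") gives seven tokens, four of them empty, which matches the output posted above.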
I understand that, I just wanted to show the difference between the documentation and the implementation.
Do you think it is possible to achieve the expected behavior with some sort of similar call?
Ahh, I see now... It does seem like their documentation is incorrect in this case.

The only way I can see of doing it is to skip the empty strings when printing (although it's not really a "similar" call like you wanted):

for (unsigned int i = 0; i < SplitVec.size(); ++i ) {
    if(!SplitVec[i].empty()) std::cout << "\"" << SplitVec[i] << "\"\n";
}

// Requires <algorithm> and <boost/bind.hpp>
SplitVec.erase( std::remove_if( SplitVec.begin(), SplitVec.end(),
    boost::bind( &std::string::empty, _1 ) ), SplitVec.end() );


This removes the zero-length entries from the vector after the call to boost::split.