You want to split a string into tokens, but you require more sophisticated searching or flexibility than Recipe 4.7 provides. For example, you may want delimiters that are more than one character long or that can take on many different forms. Handling this by hand often results in complicated code and confuses consumers of your class or function.
Use Boost’s regex class template. regex enables the use of regular expressions on string and text data. Example 4-33 shows how to use regex to split strings.
Example 4-33. Using Boost’s regular expressions
#include <iostream>
#include <string>
#include <boost/regex.hpp>

int main() {
    std::string s = "who,lives:in-a,pineapple under the sea?";

    boost::regex re(",|:|-|\\s+");        // Create the reg exp
    boost::sregex_token_iterator          // Create an iterator using a
        p(s.begin(), s.end(), re, -1);    // sequence and that reg exp
    boost::sregex_token_iterator end;     // Create an end-of-reg-exp
                                          // marker
    while (p != end)
        std::cout << *p++ << '\n';
}
Example 4-33 shows how to use regex to iterate over the tokens in a string, with a regular expression describing the delimiters. The following line sets up the regular expression:
boost::regex re(",|:|-|\\s+");
What it says, essentially, is that each match of the regular expression is either a comma, a colon, a dash, or one or more whitespace characters. The pipe character is the alternation operator that ORs the delimiters together. The backslash in \\s+ is doubled because it must first pass through C++ string-literal escaping. The next two lines set up the iterator:
boost::sregex_token_iterator p(s.begin(), s.end(), re, -1);
boost::sregex_token_iterator end;
The iterator p is constructed using the regular expression and an input string. The final argument, -1, asks for the text between matches rather than the matches themselves, which is what turns the delimiters into split points. Once p has been built, you can treat it as you would an iterator on a standard library sequence. An sregex_token_iterator constructed with no arguments is a special value that represents the end of a regular expression token sequence, so you can compare against it to know when you have reached the end.
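Because the token iterator behaves like a standard iterator, the begin/end pair can also fill a container directly. The following is a minimal sketch rather than part of the recipe. It assumes std::regex from the C++11 standard library in place of boost::regex, so that it builds without Boost; the two interfaces are closely aligned for this use. The helper name collect and its submatch parameter are hypothetical and chosen only for illustration.

```cpp
#include <regex>
#include <string>
#include <vector>

// Hypothetical helper (not from the recipe). std::regex stands in for
// boost::regex here; that substitution is an assumption of this sketch.
//
// Collects the pieces of `s` selected by `submatch`:
//   -1 -> the text between matches of `re` (the tokens)
//    0 -> the matches themselves (here, the delimiters)
std::vector<std::string> collect(const std::string& s,
                                 const std::regex& re,
                                 int submatch) {
    std::sregex_token_iterator first(s.begin(), s.end(), re, submatch);
    std::sregex_token_iterator last;  // default-constructed: end of sequence

    // Each dereferenced element converts to std::string, so the iterator
    // pair can initialize the vector directly.
    return std::vector<std::string>(first, last);
}
```

Called with the string and pattern from Example 4-33, collect(s, re, -1) should yield the same tokens that the loop prints, while collect(s, re, 0) returns the delimiters that were matched. Note that both the string and the regex must outlive the iterators, which is why the sketch takes them by reference instead of building them as temporaries.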