Common Text Transformation Library http://cttl.sourceforge.net/
Category:
Format:
Algorithm:
Space sensitivity:
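Space sensitivity is enabled for the symbol() lexeme. In the fragment below, the policy_space<> policy skips the leading white space before symbol() matches the character A: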
#define CTTL_TRACE_EVERYTHING
#include "cttl/cttl.h"
using namespace cttl;
int main()
{
    std::string inp = " A B C ";
    const_edge< policy_space<> > substring( inp );
    const_edge<> token = substring;
    size_t result = token( symbol() ).match( substring );
    assert( result != std::string::npos );
    assert( token == "A" );
    return 0;
}
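The behavior asserted above can be approximated without the library. The sketch below is a hand-written analogue, not CTTL code: match_symbol is a hypothetical helper, and it assumes that the space policy skips leading white space (as classified by std::isspace) before a single character is consumed, and that failure is reported as std::string::npos.

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical stand-in for a space-sensitive single-character match.
// Skips leading white space, then consumes exactly one character.
// Returns the offset of the matched character, or npos if none is left.
std::size_t match_symbol( const std::string& text, std::size_t& end_offset )
{
    std::size_t pos = 0;
    while ( pos < text.size()
            && std::isspace( static_cast<unsigned char>( text[ pos ] ) ) )
        ++pos;                      // skip leading white space
    if ( pos == text.size() )
        return std::string::npos;   // nothing left to match
    end_offset = pos + 1;           // exactly one character is consumed
    return pos;
}
```

For the input " A B C " the helper reports offset 1 and an end offset of 2, which corresponds to the token "A" in the fragment above. An empty or all-white-space input yields std::string::npos.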
Searchability:
Search grammar evaluation algorithms are syntactically enabled for the symbol() lexeme, but they have no effect: in either case the lexer executes the match() grammar evaluation algorithm, as the following fragment demonstrates:
#define CTTL_TRACE_EVERYTHING
#include "cttl/cttl.h"
using namespace cttl;
int main()
{
    std::string inp = " A B C ";
    const_edge< policy_space<> > substring( inp );
    const_edge<> token = substring;
    size_t result = token(
        symbol()
    ).find( substring );
    assert( result != std::string::npos );
    assert( token == "A" );
    return 0;
}
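One way to see why a search adds nothing for a lexeme that accepts any character is sketched below. This is an illustration in plain C++, not a description of CTTL internals: match_any and find_any are hypothetical helpers, and the search is assumed to retry the match at successive offsets until one attempt succeeds.

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical match: skip white space starting at `from`,
// then accept any one character. Returns its offset or npos.
std::size_t match_any( const std::string& text, std::size_t from )
{
    std::size_t pos = from;
    while ( pos < text.size()
            && std::isspace( static_cast<unsigned char>( text[ pos ] ) ) )
        ++pos;
    return pos < text.size() ? pos : std::string::npos;
}

// Hypothetical search: retry the match at successive offsets
// and return the first successful result.
std::size_t find_any( const std::string& text )
{
    for ( std::size_t from = 0; from < text.size(); ++from ) {
        std::size_t hit = match_any( text, from );
        if ( hit != std::string::npos )
            return hit;
    }
    return std::string::npos;
}
```

Under these assumptions, if the match at offset 0 succeeds, the search returns that same position on its first attempt. If it fails, only white space (or nothing) remains, so every later attempt fails as well. The two functions therefore always agree.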
Usage example:
The symbol() lexeme can take part in a grammar that retrieves characters until a certain condition is met. The following grammar expression matches a sequence of characters enclosed in a pair of double quotes:
#define CTTL_TRACE_EVERYTHING
#include "cttl/cttl.h"
using namespace cttl;
int main()
{
    std::string inp = " \"A B C\" ";
    const_edge< policy_space<> > substring( inp );
    const_edge<> token = substring;
    size_t result = (
        symbol( '\"' )              // opening quote,
        +                           // followed by...
        token(
            *(                      // ...any number of...
                -symbol( '\"' )     // ...not a quote...
                +                   // ...but...
                symbol()            // ...any character
            )
        )
        +                           // followed by
        symbol( '\"' )              // a closing quote.
    ).match( substring );
    assert( result != std::string::npos );
    assert( token == "A B C" );
    return 0;
}
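For comparison, the same "opening quote, any number of non-quote characters, closing quote" pattern can be written by hand. The sketch below uses only the standard library. match_quoted is a hypothetical helper, not part of CTTL, and it does not try to reproduce how the space policy treats white space inside the quotes.

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical hand-written analogue of:
//   quote + token( *( not-a-quote, any character ) ) + quote
// On success, stores the text between the quotes in `token`.
bool match_quoted( const std::string& text, std::string& token )
{
    std::size_t pos = 0;
    while ( pos < text.size()
            && std::isspace( static_cast<unsigned char>( text[ pos ] ) ) )
        ++pos;                              // skip leading white space
    if ( pos == text.size() || text[ pos ] != '"' )
        return false;                       // opening quote required
    const std::size_t begin = ++pos;        // first character after the quote
    while ( pos < text.size() && text[ pos ] != '"' )
        ++pos;                              // any character that is not a quote
    if ( pos == text.size() )
        return false;                       // closing quote missing
    token = text.substr( begin, pos - begin );
    return true;
}
```

The loop that stops at the next quote plays the role of the negative lookahead -symbol( '\"' ) combined with symbol() in the grammar expression: a character is consumed only after it has been checked not to be a quote.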
Trace output format:
The trace of the symbol() lexeme includes the matched character, annotated with the "symbol" label:
---------------------- A@| A 2-7 symbol
One potential reason for the symbol() lexeme to fail is an empty parseable substring. In that case the trace symbol is 'L':
------------------------@?{e 0-0 :3:1
~~~~~~~~~~~~~~~~~~~~~~~~@~ L 0-0 FAIL empty substring
~~~~~~~~~~~~~~~~~~~~~~~~@~ e3 0-0 FAIL }
Copyright © 1997-2009 Igor Kholodov (cttl@users.sourceforge.net).
Permission to copy and distribute this document is granted provided this copyright notice appears in all copies. This document is provided "as is" without express or implied warranty, and with no claim as to its suitability for any purpose.