Common Text Transformation Library http://cttl.sourceforge.net/
Category:
Format:
!R
where operand R is a valid CTTL grammar expression.
Algorithm:
The CTTL lexical analysis interface provides access to the find algorithm, which performs terminal symbol search. Besides this high-level point of entry, a search request can also originate at the subexpression level of a grammar expression, when the search operator is applied to that subexpression.
The overloaded C++ unary logical NOT operator implements the search algorithm at the subexpression level.
The expression !R promotes the current grammar evaluation algorithm of R to a higher-level search for the nearest terminal symbol L of R:
If terminal symbol L is found, the rest of the grammar expression R, beyond L, is processed using the match evaluation algorithm.
The expressions
    ( !R ).match( substring )
    ( R ).find( substring )
are equivalent.
Similarly, the expressions
    ( !!R ).match( substring )
    ( !R ).find( substring )
    ( R ).bang_find( substring )
are also equivalent.
Example:
    #define CTTL_TRACE_EVERYTHING // define to turn tracing on
    #include "cttl/cttl.h"

    using namespace cttl;

    int main()
    {
        std::string inp = "123 ABC 456 DEF GHI 789";
        const_edge< policy_space<> > substring( inp );
        const_edge<> token = substring;
        size_t result = token(
            !(
                entity( isalpha )
                +
                entity( isdigit )
            )
        ).match( substring );
        assert( result != std::string::npos );
        assert( token == "ABC 456" );
        return 0;
    }
Space sensitivity:
The space sensitivity of !R is disabled: by design, the CTTL search algorithms find and bang_find ignore the space policy grammar while locating the nearest terminal symbol of R.
However, beyond the nearest terminal symbol, the remaining part of the grammar expression R is evaluated using the match algorithm. Therefore, space sensitivity is enabled for the second, third, and all subsequent terminal symbols of grammar R.
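The algorithm and space-sensitivity rules above can be modelled in a few lines of standalone C++. This is only a sketch and does not use CTTL: the names search_model, Span, match_alpha_digit, and find_alpha_digit are hypothetical, the grammar is hard-wired to an alphabetic run followed by a numeric run, and the model makes a single match attempt at the nearest hit (an assumption based on the description of !R above).

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>

// Illustrative model only: none of these names are part of the CTTL API.
namespace search_model {

constexpr std::size_t npos = std::string::npos;

// Half-open range [first, second) of the matched text; first == npos on failure.
struct Span {
    std::size_t first;
    std::size_t second;
};

inline bool is_alpha(char c) { return std::isalpha(static_cast<unsigned char>(c)) != 0; }
inline bool is_digit(char c) { return std::isdigit(static_cast<unsigned char>(c)) != 0; }
inline bool is_space(char c) { return std::isspace(static_cast<unsigned char>(c)) != 0; }

// Model of a white-space policy: step over white space before a terminal.
inline std::size_t skip_space(const std::string& s, std::size_t pos) {
    while (pos < s.size() && is_space(s[pos])) ++pos;
    return pos;
}

// Consume a non-empty run of characters accepted by pred, starting exactly at pos.
// Returns the offset one past the run, or npos if nothing matches at pos.
inline std::size_t match_run(const std::string& s, std::size_t pos, bool (*pred)(char)) {
    std::size_t end = pos;
    while (end < s.size() && pred(s[end])) ++end;
    return end == pos ? npos : end;
}

// Model of R.match(): both terminals must match in place, and the
// space policy is applied before each of them.
inline Span match_alpha_digit(const std::string& s, std::size_t pos) {
    const std::size_t start = skip_space(s, pos);
    const std::size_t alpha_end = match_run(s, start, is_alpha);
    if (alpha_end == npos) return {npos, npos};
    const std::size_t digit_end = match_run(s, skip_space(s, alpha_end), is_digit);
    if (digit_end == npos) return {npos, npos};
    return {start, digit_end};
}

// Model of ( !R ).match(), that is R.find(): scan for the nearest terminal
// without consulting the space policy, then match the rest of R from there.
inline Span find_alpha_digit(const std::string& s, std::size_t pos) {
    for (std::size_t i = pos; i < s.size(); ++i) {
        if (is_alpha(s[i])) return match_alpha_digit(s, i);  // single attempt, no retry
    }
    return {npos, npos};
}

// Usage with the input string of the example above.
inline void demo() {
    const std::string inp = "123 ABC 456 DEF GHI 789";
    assert(match_alpha_digit(inp, 0).first == npos);  // plain match fails at offset 0
    const Span hit = find_alpha_digit(inp, 0);
    assert(hit.first == 4u && hit.second == 11u);
    assert(inp.substr(hit.first, hit.second - hit.first) == "ABC 456");
}

}  // namespace search_model
```

In this model the white space between "ABC" and "456" is skipped only because the second terminal is evaluated in match mode; the scan for the first terminal walks the input character by character.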
Usage notes:
Search transitivity
Regardless of the grammar structure and complexity, only the nearest terminal symbol can be the target of a search algorithm. This property of a CTTL grammar expression is called search transitivity. Search transitivity is guaranteed by three common design characteristics of the find and bang_find implementations:
Overloaded unary operators, adaptors, and quotes maintain search transitivity properties by delegating the search requests to the underlying subexpressions.
Overloaded binary operators, which have left-to-right associativity, become search-transient by delegating incoming search requests to their left-hand-side expression operand.
As soon as the search algorithm reaches its first matching terminal symbol (which, by definition, is the nearest symbol in the expression), the request for search becomes consumed, that is, executed. All remaining terminal symbols in the expression (including the right-hand-side operands of the binary operators) are evaluated by the match() algorithm.
The following table enumerates the CTTL grammar constructs that support the search transitivity property:

    Search expression      Meaning                          Equivalent expression

All Kleene quantifiers
Transitivity is supported by the quantifier algorithm: the first occurrence of R is searched, all following occurrences are matched.
    !*R                    zero or more matches
    !(R*N)                 zero to N matches
    !+R                    one or more matches
    !(R+N)                 one to N matches
    !(R+pair(N,M))         N to M matches
    !(R+pair(N,npos))      exactly N matches

All grammar expression adaptors and assertions
Transitivity is supported by each expression adaptor: the search request is delegated to the underlying expression R.
    !-R                    negative lookahead assertion     -!R
    !begin(R)              positive lookahead assertion     begin(!R)
    !edge(R)               input token adaptor              edge(!R)
    !node(R)               node expression adaptor          node(!R)
    !entity(R)             nonempty match validator         entity(!R)

All symmetric quote formats
Transitivity is supported by the quote adaptor: the search request is delegated to the opening syntactic unit RL.
    !quote(RL, RM, RR)     adaptor                          quote(!RL, RM, RR)

Logical set operators
Transitivity is supported by the binary operator algorithm: the search request is delegated to the left-hand-side operand R1. The right-hand-side operand R2 is evaluated by the match algorithm.
    !(R1 + R2)             sequence                         (!R1) + R2
    !(R1 ^ R2)             concatenation                    (!R1) ^ R2
    !(R1 - R2)             set complement                   (!R1) - R2
    !(R1 & R2)             set intersection                 (!R1) & R2

Set union operators
The search request is distributive: both operands execute the search algorithm.
    !(R1 | R2)             set union                        (!R1) | (!R2)
    !(R1 || R2)            POSIX union                      (!R1) || (!R2)
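The delegation rules summarized in the table can be illustrated with a small expression model. The sketch below is standalone C++ and is not CTTL code: Cursor, Literal, Search, Adaptor, Seq, and Union are hypothetical types invented for this illustration, Seq stands in for any of the left-delegating binary operators, and Union simply tries its left operand before its right one (an assumption about ordering). Quantifiers and quotes are left out to keep the model short.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Illustrative model only: these types are hypothetical, not CTTL classes.
namespace transitivity_model {

constexpr std::size_t npos = std::string::npos;

struct Cursor {                 // parse position inside the input text
    const std::string& text;
    std::size_t pos;
};

struct Literal {                // terminal symbol: a fixed word
    std::string word;
    std::size_t match(Cursor& c) const {          // must start exactly at c.pos
        if (c.text.compare(c.pos, word.size(), word) != 0) return npos;
        const std::size_t start = c.pos;
        c.pos += word.size();
        return start;
    }
    std::size_t find(Cursor& c) const {           // may start at or after c.pos
        const std::size_t start = c.text.find(word, c.pos);
        if (start != npos) c.pos = start + word.size();
        return start;
    }
};

template <class E>
struct Search {                 // models !R: a match request becomes a search request
    E inner;
    std::size_t match(Cursor& c) const { return inner.find(c); }
};

template <class E>
struct Adaptor {                // models a unary adaptor: requests are delegated to R
    E inner;
    std::size_t match(Cursor& c) const { return inner.match(c); }
    std::size_t find(Cursor& c) const { return inner.find(c); }
};

template <class L, class R>
struct Seq {                    // models a left-delegating binary operator
    L lhs;
    R rhs;
    std::size_t match(Cursor& c) const {
        const std::size_t saved = c.pos;
        const std::size_t start = lhs.match(c);
        if (start != npos && rhs.match(c) != npos) return start;
        c.pos = saved;                            // restore position on failure
        return npos;
    }
    std::size_t find(Cursor& c) const {
        const std::size_t saved = c.pos;
        const std::size_t start = lhs.find(c);    // search goes to the left operand only
        if (start != npos && rhs.match(c) != npos) return start;  // right operand is matched
        c.pos = saved;
        return npos;
    }
};

template <class L, class R>
struct Union {                  // models a union: the search request is distributive
    L lhs;
    R rhs;
    std::size_t match(Cursor& c) const {
        const std::size_t start = lhs.match(c);
        return start != npos ? start : rhs.match(c);
    }
    std::size_t find(Cursor& c) const {
        const std::size_t start = lhs.find(c);
        return start != npos ? start : rhs.find(c);
    }
};

// Evaluate expr in match mode from the beginning of text.
template <class E>
std::size_t run(const E& expr, const std::string& text) {
    Cursor c{text, 0};
    return expr.match(c);
}

inline void demo() {
    const std::string text = "say hello world";
    const Literal hello{"hello"}, world{" world"}, bye{"bye"};
    using Pair = Seq<Literal, Literal>;
    using Either = Union<Literal, Literal>;
    assert(run(Pair{hello, world}, text) == npos);                   // R1 R2 in match mode
    assert(run(Search<Pair>{Pair{hello, world}}, text) == 4u);       // !(R1 op R2)
    assert(run(Seq<Search<Literal>, Literal>{Search<Literal>{hello}, world}, text) == 4u);  // (!R1) op R2
    assert(run(Search<Adaptor<Literal>>{Adaptor<Literal>{hello}}, text) == 4u);  // !adaptor(R)
    assert(run(Adaptor<Search<Literal>>{Search<Literal>{hello}}, text) == 4u);   // adaptor(!R)
    assert(run(Search<Either>{Either{bye, hello}}, text) == 4u);     // !(R1 | R2)
    assert(run(Union<Search<Literal>, Search<Literal>>{Search<Literal>{bye}, Search<Literal>{hello}}, text) == 4u);  // (!R1) | (!R2)
}

}  // namespace transitivity_model
```

Each pair of assertions in demo() evaluates a search expression and its equivalent form from the table and obtains the same offset, which is the property the table describes.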

Search algorithm unavailability in function adaptors
CTTL grammar rules are represented by C++ functions (i.e. production rules). Grammar expressions invoke such functions by using function adaptors. At run time, the need to execute a C++ function call creates a physical barrier between the two lexer components that represent the caller and the callee. When invoking production rule functions, the CTTL lexer allows only one argument: a reference to the parseable substring. Unfortunately, this type of argument gives the function adaptor no means to discriminate among different grammar evaluation algorithms while it prepares to make the function call. Because of that, CTTL function adaptors support only the match grammar evaluation algorithm, but not the search algorithms.
This limitation precludes usage of search algorithms with CTTL function adaptors. For example, none of the following "search mode" production rule calls compile:
    #define CTTL_TRACE_EVERYTHING // define to turn tracing on
    #include "cttl/cttl.h"

    using namespace cttl;

    size_t rule_dummy( const_edge<>& substr_ )
    {
        return substr_.first.offset();
    }

    int main()
    {
        std::string inp;
        const_edge<> substring( inp );

        size_t result = (
            rule( rule_dummy )
        ).match( substring );       // compiles fine

        result = (
            rule( rule_dummy )
        ).find( substring );        // error: no member named 'find' is available

        result = (
            !rule( rule_dummy )     // error: no member named 'find' is available
        ).match( substring );

        result = (
            rule( rule_dummy )
        ).bang_find( substring );   // error: no member named 'bang_find' is available

        result = (
            !!rule( rule_dummy )
        ).match( substring );       // error: no member named 'bang_find' is available

        return 0;
    }
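The reason for these compile-time errors can be reproduced outside CTTL. The following standalone sketch is an assumption-laden model, not the library's implementation: Substring, RuleFn, Terminal, RuleAdaptor, and has_find are hypothetical names. It shows that an adaptor wrapping a function which receives only the substring has nothing to forward a search request through, so it exposes match and no find.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <type_traits>
#include <utility>

// Illustrative model only: hypothetical types, not the CTTL implementation.
namespace adaptor_model {

struct Substring {              // stand-in for the parseable substring
    const std::string& text;
    std::size_t first;
};

// A production rule receives the substring and nothing else.
using RuleFn = std::size_t (*)(Substring&);

struct Terminal {               // ordinary lexer component: supports both algorithms
    char symbol;
    std::size_t match(Substring& s) const {
        if (s.first >= s.text.size() || s.text[s.first] != symbol) return std::string::npos;
        return s.first++;       // return the match offset, then advance
    }
    std::size_t find(Substring& s) const {
        const std::size_t at = s.text.find(symbol, s.first);
        if (at != std::string::npos) s.first = at + 1;
        return at;
    }
};

struct RuleAdaptor {            // function adaptor: can only forward a match request
    RuleFn fn;
    std::size_t match(Substring& s) const { return fn(s); }
    // No find(): the signature of fn has no parameter that could carry the algorithm choice.
};

// Compile-time check for a usable find( Substring& ) member.
template <class E>
auto has_find_impl(int)
    -> decltype(std::declval<const E&>().find(std::declval<Substring&>()), std::true_type{});
template <class E>
std::false_type has_find_impl(...);
template <class E>
using has_find = decltype(has_find_impl<E>(0));

static_assert(has_find<Terminal>::value, "terminals can be searched");
static_assert(!has_find<RuleAdaptor>::value, "rule adaptors cannot be searched");

inline std::size_t rule_dummy(Substring& s) { return s.first; }

inline void demo() {
    const std::string inp = "abc";
    Substring s{inp, 0};
    assert(RuleAdaptor{rule_dummy}.match(s) == 0u);  // match mode works
    assert(Terminal{'c'}.find(s) == 2u);             // search works for a terminal
}

}  // namespace adaptor_model
```

Calling find on RuleAdaptor in this model fails to compile for the same kind of reason as the calls marked as errors in the example above: the member does not exist.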
Trace output format:
The trace symbol of the CTTL operator! is the exclamation mark '!', enclosed in a pair of symmetrical braces. The program in the Example section above, which searches the input "123 ABC 456 DEF GHI 789", generates the following trace:
    @123 ABC 456 DEF GHI 789?{e 0-23 :3:1
    @123 ABC 456 DEF GHI 789? {! 0-23
    @123 ABC 456 DEF GHI 789?  {;! 0-23
    123 ABC@                    $ 7-23 isalpha
    123 ABC 456@                $ 11-23 isdigit
                               }
                              }
    @ABC 456                 e 4-11
                             }
The exclamation mark '!' after the semicolon emphasizes that the sequence of tokens
entity( isalpha ) + entity( isdigit )
was evaluated by the terminal symbol search grammar evaluation algorithm.
Copyright © 1997-2009 Igor Kholodov mailto:cttl@users.sourceforge.net.
Permission to copy and distribute this document is granted provided this copyright notice appears in all copies. This document is provided "as is" without express or implied warranty, and with no claim as to its suitability for any purpose.