<<< Alphabetical Index | Lambda Home | Connection to CTTL grammar >>> |
Common Text Transformation Library http://cttl.sourceforge.net/
Early releases of CTTL library evolved around EBNF grammar construction and text transformations via semantic actions. The library provided limited support for direct interaction between grammar expression and program components outside of the lexer boundaries.
At early stages of the project, for reasons of portability, CTTL implementation was limited to basic C++ template functionality. Starting with release 2.07, a significantly wider range of features became part of the library, using template recursion and partial specialization of class templates. Version 2.08 provided lambda expressions (in-line semantic actions), higher-order functions, and closures.
Although CTTL has no embedded support for automatic generation of syntax trees, lambda expressions, presented here, can instantly distribute parsed input tokens to the data structures representing symbol tables, syntax trees, and other types of storage containers of user's choice. The notion of an in-line semantic action is critical for writing grammars intended for annotation of syntax trees. Along with lambda expressions, CTTL 2.08 provided support for in-line data structure known as lambda composite, closely imitating homogeneous binary tree, accessible as a multi-dimensional array.
CTTL places no stringy limits on the coding style of the programmer. Using lambda part of the library is entirely optional, since the same results can be achieved by semantic actions, defined simply as C++ class member functions or free-standing functions. At the same time, a program has a potential for improvement in places where short, specific, and, often, non-reusable code can be written in-line, within the body of the grammar rule definition. Essentially, lambda expressions reduce the crowd of little functions with very small amount of code. For reasons of simplicity and ease of debugging, I decided against any looping constructs in the current version of the lambda library, although, this might change in the future.
CTTL relies heavily on C++ expression templates defining C++ types on the fly, bypassing ordinary class declarations. Code with expression templates generally has good runtime performance, although sometimes may increase the program compile time.
Finally, it is worth mentioning that CTTL lambda library, based on multiple template specializations and other similar pieces of repetitive code, contains substantial amount of the source code generated by gumus. Whenever there was a need to write duplicated code, gumus was there up for the challenge. This little tool has saved me many hours, if not weeks, by automating a large amount of mundane programming tasks.
In CTTL, lambda expression is similar to a C++ sub-expression, written as part of CTTL grammar expression. It means that the library defines a large number of overloaded operators, along with a few classes representing expression operands. To be more specific, the library provides a family of scalar primitive classes that serve as the operand placeholders in lambda interrelation.
Instances of scalar classes act as adaptors of
By type of storage, scalars are divided into two categories:
In addition to serving as a mere placeholder of an expression operand, lambda scalar can store a function pointer (or a pointer to a class member function.) Such scalar then becomes a closure, carrying the capability of a delayed function call. Closures provide means for a direct communication between grammar definitions and other parts of the program, such as STL containers. Closure represents an "outside call," made from lambda expression to a global function, or a member function of the user-defined type, as demonstrated by stack_alias.cpp example:
Keywords: stack_alias.cpp, stack, scalar_reference, CTTL_LAMBDA_ASSERT
// stack_alias.cpp // Program illustrates access to std::stack via alias helpers. #define CTTL_TRACE_RULES // automatically turns lambda tracing on #include "cttl/cttl.h" #include "lambda/lambda.h" using namespace cttl; int main(/*int argc, char* argv[]*/) { std::stack< size_t > st; lambda< std::stack< size_t > >::scalar_reference stack_ref( &st ); ( // Same as // assert( st.empty() ); CTTL_LAMBDA_ASSERT( alias::empty( stack_ref ) ), // Same as // st.push( size_t( 5 ) ); alias::push( &stack_ref, size_t( 5 ) ), // Same as // assert( st.top() == size_t( 5 ) ); CTTL_LAMBDA_ASSERT( alias::top( stack_ref ) == scalar( 5u ) ), // Same as // assert( !st.empty() ); CTTL_LAMBDA_ASSERT( !alias::empty( stack_ref ) ), // Same as // st.top() == size_t( 6 ); alias::top( &stack_ref ) = scalar( 6 ), // Same as // assert( st.top() == size_t( 6 ) ); CTTL_LAMBDA_ASSERT( alias::top( stack_ref ) == scalar( 6u ) ) ).evaluate(); return 0; }
Here is an "inventory" of parts from the example:
Element | Description |
---|---|
st | User stack of size_t values |
stack_ref | Reference-scalar holding mutable reference to the stack |
alias::empty() | Closure invoking st.empty() |
alias::push() | Closure invoking st.push() |
alias::pop() | Closure invoking st.pop() |
alias::top() | Closure invoking st.top() |
scalar( 6 ) | Scalar holding integer value of 6. |
CTTL_LAMBDA_ASSERT(expr) | A macro asserting that expr returns a non-zero value. |
Note that
cttl::alias::empty() cttl::alias::push() cttl::alias::pop() cttl::alias::top()
are helper functions returning the corresponding closure objects, and
scalar( 6 )
is another helper function, returning a scalar primitive that accommodates integer 6. The latter is an example of an in-line scalar, whereas the stack_ref is a standalone, explicitly instantiated scalar primitive.
Entire lambda library becomes available via inclusion of a single "lambda/lambda.h" header. However, lambda implementation itself depends on the presence of "cttl/cttl.h". In order to use lambda facilities, the program has to include both headers in a translation unit:
#include "cttl/cttl.h" #include "lambda/lambda.h"
In the previous stack_alias.cpp sample, the actual execution of lambda expression is triggered by the evaluate() call. This is a member function of the C++ class that is generated as a result of template instantiation of all C++ template classes, yielded by the combination of the above helper functions, scalar operands, and overloaded operators. The approach to cause C++ compiler to generate C++ template classes at compile time is known as template metaprogramming technique. Both CTTL lambda library and CTTL grammar parsing engine are utterly based on this technique.
The implementation goals of the lambda library are:
To provide CTTL grammar extensions for direct distribution of parsed input tokens to the data structures of programmer's choice. For example, lambda library shall become immediately useful under the following scenarios:
The flow of grammar needs to insert parsed data elements into STL containers by invoking contaner's member functions.
A parser program needs indirect writing into STL containers by using the corresponding STL iterators.
A grammar expression has to make function call to access the program's data structures.
To simplify input text transformations by making these operations available in-line.
A set of functional requirements for CTTL lambda implementation can be summarized as follows:
Implementation is thread safe;
Implementation is entirely compile-time, with minimal run-time overhead of closure calls via function pointers;
Implementation has familiar look and feel of C++ expressions, made possible by overloading commonly used logical, arithmetic, and assignment operators;
Implementation supports closures (delayed function calls), and higher-order functions (compositions of functions);
CTTL core parsing facilities remain independent from the lambda library;
A lambda application is not limited, or determined by, the CTTL core. That is, lambda expressions can be written, compiled, and executed as stand-alone, independent features.
Lambda sub-expressions can mix with CTTL grammars freely, performing instant semantic actions in-line.
If accommodating grammar rule re-interprets the value of a lambda expression as grammar evaluation result, the lambda expression becomes an autonomous grammar sub-unit.
If lambda expression updates the boundary of a parseable substring, the side effect of the change may have an indirect influence on the grammar flow.
In the following sample program, lambda_expression.cpp, the variable named var is passed to the scalar() helper function by its address. The function returns an automatic object called a reference-based scalar, which encapsulates the reference to the integer variable var. When automatic object is returned from a helper function, it's considered to be instantiated inline:
// lambda_expression.cpp // Program demonstrates lambda expression (delayed expression), // and delayed evaluation. #include "cttl/cttl.h" #include "lambda/lambda.h" using namespace cttl; template< typename LambdaT > void f( LambdaT lambda_ ) { lambda_.evaluate(); // delayed evaluation } void g() { int var = 0; f( scalar( &var ) = 5 ); // delayed expression "var = 5" assert( var == 5 ); } int main(/*int argc, char* argv[]*/) { g(); return 0; }
Overloaded assignment operator= in expression
scalar( &var ) = 5
creates an implementation adaptor, binding together -
Member function evaluate() of the resulting assignment adaptor class exposes encapsulated assignment expression to the outside world. Each call to evaluate() triggers execution of the actual assignment operation: var=5.
Since evaluation of "var=5" is delayed, the encapsulated expression has a meaning of lambda expression. Lambda expressions are unnamed functions (also called lambda functions), and are often described as small, in-line functions, defined on the fly by a single expression.
C++ result type of the expression constitutes the type of lambda function. One distinctive property of a lambda expression is its delayed evaluation, also known as lazy evaluation.
Incidentally, any CTTL grammar expression also fits this description of a lambda expression, since its evaluation is delayed. Yet it is worth clarifying that the design semantics of CTTL grammars are tailored specifically for the pattern recognition tasks.
By contrast, CTTL lambda expressions have general-purpose application, permitting interactions between grammar units and the rest of the program. Such interaction often requires data flow in two directions:
Parsed tokens need to be stored in a data structure for the future use by the program;
The program wants to alter (transform) the input text in the midst of the grammar evaluation. (A subset of this type of request is an operation that appends more input to assemble a meaningful message, which is a common task for stream parsers. )
Beyond scalar instantiation, writing lambda expressions is easy: there is little difference between lambda and native C++ expressions. The next sample, node_2_character_access.cpp , demonstrates lambda expression that updates positions of two cttl::node objects. Combined with subscript cttl::node::operator[], this type of lambda expression permits reading and writing of individual characters:
Keywords: node_2_character_access.cpp, std::string, node, scalar, alias::offset const_scalar
// node_2_character_access.cpp // Program demonstrates overloaded subscript operators. // Subscript access to characters of std::string via cttl::node operator[]. #include "cttl/cttl.h" #include "lambda/lambda.h" using namespace cttl; int main() { std::string inp = "abcdef"; node<> node1( inp ); node<> node2 = node1; lambda< char >::scalar ch; lambda< node<> >::scalar n1( node1 ); lambda< node<> >::scalar n2( node2 ); ( // update node positions: alias::offset( &n1, size_t( 0 ) ), alias::offset( &n2, size_t( 1 ) ), // swap characters: ch = n1[ const_scalar( 0 ) ], n1[ const_scalar( 0 ) ] = n2[ scalar( 0 ) ], n2[ const_scalar( 0 ) ] = ch ).evaluate(); assert( inp == "bacdef" ); }
Next sample, vector_element_access.cpp , demonstrates how to access the elements of a vector by using the overloaded subscript operator[]:
Keywords: vector_element_access.cpp, CTTL_TRACE_RULES, std::string, std::vector at, alias::at
// vector_element_access.cpp // Program demonstrates overloaded subscript operators. // Subscript access to the elements of std::vector< std::string > via operator[]. #define CTTL_TRACE_RULES // automatically turns lambda tracing on #include "cttl/cttl.h" #include "lambda/lambda.h" using namespace cttl; int main(/*int argc, char* argv[]*/) { std::vector< std::string > vec( 1 ); // vector with 1 element lambda< std::vector< std::string > >::scalar_reference vec_ref( &vec ); lambda< std::string >::scalar str( std::string( "ABC" ) ); ( vec_ref[ 0 ] = str , CTTL_LAMBDA_ASSERT( scalar( std::string( "ABC" ) ) == vec_ref[ 0 ] ) , CTTL_LAMBDA_ASSERT( scalar( std::string( "ABC" ) ) == alias::at( &vec_ref, size_t( 0 ) ) ) , vec_ref[ 0 ] = scalar( std::string( "DEF" ) ) , CTTL_LAMBDA_ASSERT( scalar( std::string( "DEF" ) ) == vec_ref[ 0 ] ) , CTTL_LAMBDA_ASSERT( scalar( std::string( "DEF" ) ) == alias::at( &vec_ref, size_t( 0 ) ) ) ).evaluate(); return 0; }
The vector of strings vec is constructed with one element. The vector is referenced by vec_ref scalar primitive. Another scalar primitive, named str, instantiates an std::string object, which is initialized by the "ABC" string. Lambda expression
vec_ref[ 0 ] = str
assigns new value to the first element of the vector.
A similar program, iterator_vector_subscript.cpp, shows how to gain access to the vector elements by using the iterator:
Keywords: iterator_vector_subscript.cpp, scalar_reference, alias::begin, dereference operator const_scalar, subscript operator
// iterator_vector_subscript.cpp // Program demonstrates: // - STL iterator access // - std::vector // - overloaded subscript operator #define CTTL_TRACE_RULES // automatically turns lambda tracing on #include "cttl/cttl.h" #include "lambda/lambda.h" using namespace cttl; int main(/*int argc, char* argv[]*/) { // create vector with three elements: ( 33, 33, 33 ) std::vector< int > vec( 3, 33 ); std::vector< int >::iterator it; // create integer scalar: lambda< int >::scalar sint; // create vector scalar: lambda< std::vector< int > >::scalar_reference svec( &vec ); // create iterator scalar: lambda< std::vector< int >::iterator >::scalar_reference sit( &it ); ( sit = alias::begin ( &svec ), CTTL_LAMBDA_ASSERT( *sit == scalar( 33 ) ), *sit = scalar( 34 ), CTTL_LAMBDA_ASSERT( *sit == scalar( 34 ) ), CTTL_LAMBDA_ASSERT( svec[ const_scalar( 0 ) ] == scalar( 34 ) ) ).evaluate(); return 0; }
Evidently, most common uses of lambda expressions originate from the idea of depositing parsed input tokens into the program data structures. The lambda_grammar.cpp sample demonstrates grammar rule with lambda expression in function named parse():
Keywords: lambda_grammar.cpp, const_edge, policy_space, std::stack, std::string, entity match
// lambda_grammar.cpp // Program parses tokens and inserts them into the stack. // Demonstrates lambda grammar, stack, kleene star, and epsilon parser. //#define CTTL_TRACE_EVERYTHING #include "cttl/cttl.h" #include "lambda/lambda.h" using namespace cttl; size_t parse( const_edge< policy_space<> >& substr_, std::stack< std::string >& str_stack_ ) { return ( *( entity( isalpha ) & *( scalar( &str_stack_ ) = scalar( &substr_ ) ) ) ).match( substr_ ) ; } int main(/*int argc, char* argv[]*/) { std::stack< std::string > str_stack; std::string inp = "abc def ghi"; const_edge< policy_space<> > substring( inp ); if ( parse( substring, str_stack ) == std::string::npos ) { std::cout << "*** parser failed ***" << std::endl; return 1; } assert( str_stack.size() == 3 ); assert( str_stack.top() == "ghi" ); return 0; }
Up to this point, lambda expressions were used in a standalone form:
( lambda_expr ).evaluate();
The last sample, lambda_grammar.cpp, presents a different form, connecting lambda expression with CTTL grammar rule:
( grammar_expr & lambda_expr ).match( substring );
The aspect of lambda accommodation inside grammar definitions is covered next.
Copyright © 1997-2009 Igor Kholodov mailto:cttl@users.sourceforge.net.
Permission to copy and distribute this document is granted provided this copyright notice appears in all copies. This document is provided "as is" without express or implied warranty, and with no claim as to its suitability for any purpose.
<<< Alphabetical Index | Lambda Home | Connection to CTTL grammar >>> |