
gumus.cpp





introduction


The gumus utility processes a small scripting language that transforms prefabricated units of free text into C++ stream output statements capable of reproducing the original text. For example, if the file input.txt contains a single line of text,

input.txt:
    Hello, world!

then gumus output will be

>gumus input.txt
  std::cout<< "    Hello, world!"<< std::endl;
>

Multiple lines of input are converted into multiple stream output statements. Given

input.txt:
Line one
Line two

the output will be

>gumus input.txt
  std::cout<< "Line one"<< std::endl;
  std::cout<< "Line two"<< std::endl;
>
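
The line-by-line conversion shown above can be approximated in a few lines of standard C++. This is a minimal sketch, not the gumus implementation: the function name lines_to_statements and its stream_name parameter are hypothetical, and the sketch assumes that the text contains no quotes, backslashes, or tabs, which gumus itself would escape.

```cpp
#include <sstream>
#include <string>

// Hypothetical helper, not part of gumus: wraps every line of `text`
// in a stream output statement, imitating the default gumus output.
// Assumption: the text needs no escaping.
std::string lines_to_statements(
    std::string const& text,
    std::string const& stream_name = "std::cout"
    )
{
    std::istringstream in( text );
    std::ostringstream out;
    std::string line;
    while ( std::getline( in, line ) ) {
        if ( line.empty() ) {
            // an empty line produces only the end-of-line suffix
            out << "  " << stream_name << "<< std::endl;\n";
        } else {
            out << "  " << stream_name << "<< \"" << line << "\"<< std::endl;\n";
        }
    }
    return out.str();
}
```

Calling lines_to_statements( "Line one\nLine two\n" ) returns the two statements listed in the example above, each terminated by a newline.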

Gumus scripts are simple to write and maintain. They become extremely useful when programs need to be generated instead of assembled by error-prone copy-and-paste operations. The CTTL lambda library, which is based on multiple template specializations and other similar pieces of repetitive code, contains a substantial amount of source generated by gumus scripts. Whenever you need to write a series of programs composed of duplicated code, consider writing a program to generate them. In many cases, a gumus script is a handy tool for automating such mundane programming tasks.


dot C++ code


Besides plain text, a gumus input file may contain lines of C++ code, marked by dots at the beginning of the line. For example:

input.txt:
.for (int idx = 0; idx < 3; ++idx ) {
    Hello, world!
.}

The output becomes

>gumus input.txt
 for (int idx = 0; idx < 3; ++idx ) {
 std::cout<< "    Hello, world!"<< std::endl;
 }
>

After processing, the dots of the original input have been replaced by spaces, while the C++ code remains intact. Dots thus introduce lines of C++ code into the gumus output, so that C++ code can be mixed freely with units of free text inside a gumus script. If, instead of displaying the output on the screen, we save it into a temporary C++ header file,

>gumus input.txt > temp.h
>

we can now construct a small driver program that includes the generated header as follows:

main.cpp:
#include <iostream>
#include <string>

void print_output()
{
#include "temp.h"
}

int main( int argc, char* argv[] )
{
    print_output();
    return 0;
}

When the driver program is compiled and run, it will produce the following output:

>a.out
    Hello, world!
    Hello, world!
    Hello, world!

>


<<.expr.>>


The input script also supports output expressions, delineated as <<.expr.>>. For example, we could modify our input file as follows:

input.txt:
.char one = '1';
.for (int idx = 0; idx < 3; ++idx ) {
    Hello, world! line <<.char( one + idx ).>>
.}

Then, if we rerun our script, recompile the driver program, and run it again, the output will become

>a.out
    Hello, world! line 1
    Hello, world! line 2
    Hello, world! line 3

>

Most gumus scripts are very simple and use only single variables, rather than complex expressions, inside text output units. In any case, output expressions <<.expr.>> must not contain character or string literals, because gumus escapes the quote characters of every literal it processes. For example,

input.txt:
<<.'A'.>>

will be converted to

>gumus input.txt
  std::cout<< ""<< \'A\'<< ""<< std::endl;
>

The result is no longer valid C++ and will not compile. To remedy the problem, store the character in a variable, so that the output expression contains no character literal:

input.txt:
.char literal_A = 'A';
<<.literal_A.>>

which now becomes

>gumus input.txt
 char literal_A = 'A';
 std::cout<< ""<< literal_A<< ""<< std::endl;
>
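
The reason literals break becomes clearer when the conversion of a text line is written out step by step. This is a hedged sketch in standard C++, not the CTTL-based implementation: the names replace_all and convert_text_line are hypothetical, the order of the steps is simplified, and the stream name std::cout is hard-coded. It escapes quotes, backslashes, and tabs across the whole line, expression included, and only then rewrites the <<. and .>> connectors.

```cpp
#include <cstddef>
#include <string>

// Replaces every occurrence of `from` in `s` with `to`.
void replace_all( std::string& s, std::string const& from, std::string const& to )
{
    std::size_t pos = s.find( from );
    while ( pos != std::string::npos ) {
        s.replace( pos, from.size(), to );
        pos = s.find( from, pos + to.size() );
    }
}

// Hypothetical helper, not part of gumus: converts one line of free
// text, possibly containing <<.expr.>>, into a std::cout statement.
std::string convert_text_line( std::string const& line, int indentation = 2 )
{
    // Step 1: escape quotes, backslashes, and tabs everywhere in the
    // line. Literals inside <<.expr.>> are escaped too, which is why
    // they must be avoided.
    std::string body;
    for ( char c : line ) {
        if ( c == '\t' ) {
            body += "\\t";
        } else {
            if ( c == '"' || c == '\'' || c == '\\' )
                body += '\\';
            body += c;
        }
    }

    // Step 2: close the string before the expression and reopen it after.
    replace_all( body, "<<.", "\"<< " );
    replace_all( body, ".>>", "<< \"" );

    // Step 3: wrap the line in the stream output statement.
    return std::string( indentation, ' ' ) + "std::cout<< \"" + body + "\"<< std::endl;";
}
```

With this sketch, convert_text_line( "<<.'A'.>>" ) yields the statement with the escaped \'A\' shown above, while convert_text_line( "<<.literal_A.>>", 1 ) yields the valid statement that uses the variable.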


dot indentation


Multiple dots can be specified at the beginning of a dotted line. When the line is processed, each dot is replaced by a space:

input.txt:
.//comment
..//comment
...//comment
...//comment
..//comment
.//comment

Thus, the dots are helpful when indentation level of the output is important:

>gumus input.txt
 //comment
  //comment
   //comment
   //comment
  //comment
 //comment
>
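
The dot replacement itself is a short loop. The sketch below uses plain std::string rather than CTTL edges, and the name dots_to_spaces is hypothetical; it only illustrates the rule that every leading dot becomes one space.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper, not part of gumus: replaces each leading dot
// of a line with a space and leaves the rest of the line untouched.
std::string dots_to_spaces( std::string line )
{
    // The bounds check precedes the character test, so the loop never
    // reads past the end of the string.
    for ( std::size_t i = 0; i < line.size() && line[ i ] == '.'; ++i ) {
        line[ i ] = ' ';
    }
    return line;
}
```

For example, dots_to_spaces( "...//comment" ) returns the comment indented by three spaces. Dots that appear later in the line are not affected.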


tracking script source


When large sets of files are generated, it is important to keep track of the script origins. The following script adds the current time, the name of the file, and the source line number to the output:

input.txt:
// generated by <<.__FILE__.>>:<<.__LINE__.>>
// on <<.cttl::time2string( cttl::current_time() ).>>.

To make time functions available, we may need to modify our driver program as follows:

main.cpp:
#include <iostream>
#include "cttl/cttl.h"
#include "utils/timeutils.h"

void print_output()
{
#include "temp.h"
}

int main( int argc, char* argv[] )
{
    print_output();
    return 0;
}

Rerun the gumus script and recreate the temporary header file:

>gumus input.txt > temp.h
>

Now recompile and run the program:

>a.out
// generated by temp.h:1
// on Thu Jun 29 17:05:30 2006.
>


gumus command line parameters


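There are two optional parameters that can be specified on the command line after the name of the input file. The first is the name of the stream output; the default is std::cout. The second is the end-of-line suffix; the default is std::endl; and it changes to the more general '\n' when the name of the stream output is given.

The name of the stream output is important if the driver program needs to accumulate generated text in memory, rather than send it to the standard output. This can be very helpful if the driver needs to accumulate multiple outputs and save them in different files.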
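
The way these defaults are chosen can be summarized in a small function. The sketch below is modeled on the argument handling in the main() function of gumus.cpp, but the names output_options and resolve_options are hypothetical, and the arguments are passed as a vector that includes the program name, as argv does.

```cpp
#include <string>
#include <vector>

// Hypothetical container for the two settings affected by the
// optional command line parameters.
struct output_options {
    std::string line_prefix = "std::cout";   // name of the stream output
    std::string line_suffix = "std::endl;";  // end-of-line suffix
};

// Hypothetical helper: args[ 0 ] is the program name, args[ 1 ] the
// input file, args[ 2 ] and args[ 3 ] the optional parameters.
output_options resolve_options( std::vector<std::string> const& args )
{
    output_options options;
    if ( args.size() == 3 ) {
        // stream name given: the suffix default changes to '\n';
        options.line_prefix = args[ 2 ];
        options.line_suffix = "'\\n';";
    } else if ( args.size() == 4 ) {
        // both stream name and suffix given
        options.line_prefix = args[ 2 ];
        options.line_suffix = args[ 3 ];
    }
    return options;
}
```

With the arguments gumus input.txt stroutput, the sketch selects stroutput as the stream name and '\n'; as the suffix.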

The following example shows how to redirect output to std::stringstream:

input.txt:
ABC
DEF

Add the stroutput argument to the command line:

>gumus input.txt stroutput > temp.h
>

Two lines of code are written to temp.h:

temp.h:
  stroutput<< "ABC"<< '\n';
  stroutput<< "DEF"<< '\n';

Finally, the driver program main.cpp uses a std::stringstream object to capture the output:

main.cpp:
#include <cassert>
#include <iostream>
#include <string>
#include <sstream>

void print_output( std::stringstream& stroutput )
{
#include "temp.h"
}

int main( int argc, char* argv[] )
{
    std::stringstream stroutput;
    print_output( stroutput );
    assert( stroutput.str() == "ABC\nDEF\n" );
    return 0;
}
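
The same technique extends to a driver that accumulates several outputs and saves them in different files. This is only a sketch under stated assumptions: print_first and print_second are hand-written stand-ins for functions that would normally include generated headers, and the file names first.txt and second.txt, like the helpers collect_outputs and save_outputs, are hypothetical.

```cpp
#include <fstream>
#include <map>
#include <ostream>
#include <sstream>
#include <string>

// Stand-ins for functions whose bodies would be generated headers,
// for example #include "temp.h".
void print_first( std::ostream& stroutput )  { stroutput<< "ABC"<< '\n'; }
void print_second( std::ostream& stroutput ) { stroutput<< "DEF"<< '\n'; }

// Accumulates each output in memory, keyed by its target file name.
std::map<std::string, std::string> collect_outputs()
{
    std::map<std::string, std::string> files;

    std::ostringstream first;
    print_first( first );
    files[ "first.txt" ] = first.str();

    std::ostringstream second;
    print_second( second );
    files[ "second.txt" ] = second.str();

    return files;
}

// Writes every accumulated output to its file; returns false if a
// file cannot be opened.
bool save_outputs( std::map<std::string, std::string> const& files )
{
    for ( auto const& entry : files ) {
        std::ofstream out( entry.first.c_str() );
        if ( !out )
            return false;
        out << entry.second;
    }
    return true;
}
```

Keeping the text in memory until all outputs are complete lets the driver decide where, and whether, to write each file.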


gumus source code


The source code of the gumus utility comes from the gumus.cpp file:

// sample code: gumus.cpp
// Gumus script preprocessor utility.
// Usage: specify a gumus source file to convert to C++.
// syntax:
//   text
//   text <<.variable.>> <<.variable.>>...
//   text
// .C++ code
// .//C++ comment, etc.
// Note, that dots specify indentation level for the line output.

//#define NDEBUG    // define before assert.h to stop assertions from being compiled 
//#define CTTL_TRACE_EVERYTHING //define to turn tracing on
//#define CTTL_TRACE_RULES  //define to turn light tracing on
//#define GUMUS_TRACE_VARS  //define to debug gumus script variables

#include <cassert>
#include <iostream>
#include <string>
#include "cttl/cttl.h"
#include "utils/fileio.h"
#include "utils/itos.h"

using namespace cttl;

struct gumus {
    std::string line_prefix;        // std::cout or str_. To be used as LHS with output operator.
    std::string output_operator;    // either << for std::cout, or += for string output.
    std::string line_suffix;        // std::endl or '\n'. To be used as RHS with output operator.
    int indentation_level;

    gumus(
        std::string const& line_prefix_     /*= "std::cout"*/,
        std::string const& output_operator_ /*= "<<"*/,
        std::string const& line_suffix_     /*= "std::endl;"*/,
        int indentation_level_ = 2
        )
        :
        line_prefix( line_prefix_ ),
        output_operator( output_operator_ ),
        line_suffix( line_suffix_ ),
        indentation_level( indentation_level_ )
    {
    }

    void remove_cr( edge<>& edge_ )
    {
        edge_.push();
        while ( ( !symbol( '\r' ) ).match( edge_ ) != std::string::npos )
        {
            edge_.push();
            edge_.second.offset( edge_.first.offset() );
            edge_.first.offset( edge_.first.offset() - 1 );
            edge_.text( "" );
            edge_.pop();
        }
        edge_.pop();
    }

    size_t match_lines( edge<>& edge_ )
    {
        return (
            +(
                -end()
                +
                quote(
                    true
                    ,
                    (
                        begin( '.' )
                        +
                        CTTL_RULE( gumus::event_code_line )
                    )
                    |
                    (
                        CTTL_RULE( gumus::event_line )
                        &
                        *CTTL_RULE( gumus::find_escape )
                        &
                        *CTTL_RULE( gumus::find_variable )
                    )
                    ,
                    '\n' | end()
                )
            )
        ).match( edge_ );
    }

    size_t event_code_line( edge<>& edge_ )
    {
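        // Count the leading dots and replace each one with a space.
        // The bounds check comes first, so that no character is read
        // past the end of the line.
        for (
            indentation_level = 0
            ;
            ( indentation_level < edge_.length() )
            &&
            ( edge_.first[ indentation_level ] == '.' )
            ;
            ++indentation_level
            )
        {
            edge_.first[ indentation_level ] = ' ';
        }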

        return edge_.second.offset();
    }

    size_t event_line( edge<>& edge_ )
    {
        // line decorations
        std::string indentation( indentation_level, ' ' );

        if ( edge_.length() ) {
            // non-empty line
            edge_.first.insert_go( indentation + line_prefix + output_operator + "\"" );
            edge_.second.insert_stay( "\"" + output_operator + line_suffix );
        } else {
            edge_.first.insert_go( indentation + line_prefix + output_operator + line_suffix );
        }

        return edge_.first.offset( edge_.second.offset() );
    }

    size_t find_escape( edge<>& edge_ )
    {
        return (
            (
                first( "\"\'\t\\" )
            )
            &
            CTTL_RULE( gumus::event_escape_char )

        ).find( edge_ );
    }

    size_t event_escape_char( edge<>& edge_ )
    {
        if ( edge_.first[ 0 ] == '\t' )
            edge_.text( "\\t" );
        else
            edge_.first.insert_go( "\\" );
        return edge_.second.offset();
    }

    size_t find_variable( edge<>& edge_ )
    {
        return (
            (
                symbol( "<<." )
                +
                !symbol( ".>>" )
            )
            &
            CTTL_RULE( gumus::event_variable )

        ).find( edge_ );
    }

    size_t event_variable( edge<>& edge_ )
    {
        // left connector
        edge_.first[ 0 ] = output_operator[ 0 ]; //'<' -or- ';'
        edge_.first[ 1 ] = output_operator[ 1 ]; //'<' -or- '+'
        edge_.first[ 2 ] = output_operator[ 2 ]; //' ' -or- '='

        edge_.first.insert_go( "\"" );

        edge_.second[ -3 ] = output_operator[ 0 ]; //'<' -or- ';'
        edge_.second[ -2 ] = output_operator[ 1 ]; //'<' -or- '+'
        edge_.second[ -1 ] = output_operator[ 2 ]; //' ' -or- '='

        edge_.second.insert_go( "\"" );
#ifdef GUMUS_TRACE_VARS
        edge_.second.insert_go( "/*" + edge_.text().substr( 3, edge_.length() - 7 ) + "*/" );
#endif // GUMUS_TRACE_VARS

        return 0;
    }

    bool parse( edge<>& universe_ )
    {
        remove_cr( universe_ );
        assert( universe_.length() == int( universe_.parent().length() ) );
        if ( match_lines( universe_ ) != std::string::npos )
            return true;
        return false;
    }
};

int main(int argc, char* argv[])
{
    std::string line_prefix( "std::cout" );
    std::string output_operator( "<< " );
    std::string line_suffix( "std::endl;" );

    if ( argc == 1 ) {
        std::cout
            << std::endl
            << "Usage: specify a gumus source file to convert to C++:"
            << std::endl
            << std::endl
            << '>' << argv[ 0 ] << " path/file.ext [str_] [end-of-line-suffix]"
            << std::endl
            << std::endl
            << "\t Second argument is optional. If specified, it becomes"
            << std::endl
            << "\t name of the stream output. The default is \'std::cout\'."
            << std::endl
            << std::endl
            << "\t Third argument is optional. Specifies"
            << std::endl
            << "\t end-of-line suffix. The default is \'std::endl;\'."
            << std::endl
            << "\t If second argument given, the default changes to"
            << std::endl
            << "\t more general \'\\n\'."
            << std::endl
            ;
        return 1;

    } else if ( argc == 3 ) {
        line_prefix = argv[ 2 ];
        line_suffix = "\'\\n\';";

    } else if ( argc == 4 ) {
        line_prefix = argv[ 2 ];
        line_suffix = argv[ 3 ];
    }

    input<> inp;
    file2string( argv[ 1 ], inp.text() );
    assert( inp.length() );
    edge<> universe( new_edge( inp ) );

    gumus parser( line_prefix, output_operator, line_suffix );
    if ( parser.parse( universe ) ) {
        std::cout << inp.text();
        return 0;
    }

    std::cout << "*** parser failed ***" << std::endl;
    return 1;
}



Copyright © 1997-2006 Igor Kholodov mailto:cttl@users.sourceforge.net.

Permission to copy, use, modify, sell and distribute this document is granted provided this copyright notice appears in all copies. This document is provided "as is" without express or implied warranty, and with no claim as to its suitability for any purpose.


Generated on Thu Nov 2 17:44:56 2006 for Common Text Transformation Library by  doxygen 1.3.9.1