C++ Notes


The C++ runtime and generated grammars look very much the same as the Java ones. There are some subtle differences, though; more on these later.

Building the runtime

The following is a bit Unix-centric. For Windows, some contributed project files can be found in lib/cpp/contrib. These may be slightly outdated.

The runtime files are located in the lib/cpp subdirectory of the ANTLR distribution. The runtime is generally built via the top-level configure script and the Makefile that it generates. Before configuring, please read INSTALL.txt in the top-level directory. The file lib/cpp/README may contain extra information on specific target machines.

./configure --prefix=/usr/local
make

Installing ANTLR and the runtime is then done by typing

make install
This installs the runtime library libantlr.a in /usr/local/lib and the header files in /usr/local/include/antlr. Two convenience scripts, antlr and antlr-config, are also installed into /usr/local/bin. The first takes care of invoking ANTLR; the second can be used to query the compiler options needed to build files with ANTLR.

Using the runtime

Generally you will compile the ANTLR generated files with something similar to:
c++ -c MyParser.cpp -I/usr/local/include
Linking is done with something similar to:
c++ -o MyExec <your .o files> -L/usr/local/lib -lantlr

Getting ANTLR to generate C++

To get ANTLR to generate C++ code you have to add

language="Cpp";
to the global options section. After that, things are pretty much the same as in Java mode, except that all token and AST classes are wrapped by a reference counting class (this makes life easier in some ways and much harder in others). The reference counting class uses
operator->
to reference the object it is wrapping. As a result, you use -> in C++ mode instead of the '.' of Java. See the examples in examples/cpp for some illustrations.
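
To make the role of operator-> concrete, here is a minimal sketch of a reference counting handle. It is an illustration only, not the class shipped with the ANTLR runtime: the names CountedRef, DemoToken and useCount are hypothetical and exist purely for this example.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Hypothetical stand-in for a token class; not part of the ANTLR runtime.
struct DemoToken {
    std::string text;
    explicit DemoToken(std::string t) : text(std::move(t)) {}
    const std::string& getText() const { return text; }
};

// Hypothetical reference counting handle (illustration only).
// It owns the wrapped object and deletes it when the last copy goes away.
template <typename T>
class CountedRef {
public:
    explicit CountedRef(T* p = nullptr)
        : ptr_(p), count_(p ? new long(1) : nullptr) {}

    CountedRef(const CountedRef& other)
        : ptr_(other.ptr_), count_(other.count_) {
        if (count_) ++*count_;          // one more handle shares the object
    }

    CountedRef& operator=(CountedRef other) {   // copy-and-swap
        std::swap(ptr_, other.ptr_);
        std::swap(count_, other.count_);
        return *this;
    }

    ~CountedRef() {
        if (count_ && --*count_ == 0) { // last handle: release the object
            delete ptr_;
            delete count_;
        }
    }

    // Forwards member access to the wrapped object, which is why
    // client code writes handle->member() rather than handle.member().
    T* operator->() const {
        assert(ptr_ != nullptr);
        return ptr_;
    }

    long useCount() const { return count_ ? *count_ : 0; }

private:
    T* ptr_;
    long* count_;
};
```

With such a handle, CountedRef<DemoToken> tok(new DemoToken("x")); followed by tok->getText() reads the wrapped token's text, and copying tok shares the same token rather than duplicating it.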

AST types

As of ANTLR 2.7.2, if you supply the

buildAST=true
option to a parser, you have to set and initialize an ASTFactory for the parser and for the tree walkers that use the resulting AST.
ASTFactory my_factory;	// generates CommonAST by default
MyParser parser( lexer );	// lexer: a previously constructed lexer
// Set up the parser from the AST factory; repeat this for all parsers using the AST
parser.initializeASTFactory( my_factory );
parser.setASTFactory( &my_factory );

In C++ mode it is also possible to override the AST type used by the code generated by ANTLR. To do this you have to do the following:

Using Heterogeneous AST types

This should now (as of 2.7.2) work in C++ mode, probably with some caveats.

The heteroAST example shows how to set things up. A short excerpt:

ASTFactory ast_factory;

parser.initializeASTFactory(ast_factory);
parser.setASTFactory(&ast_factory);

A small excerpt from the generated initializeASTFactory method:

void CalcParser::initializeASTFactory( antlr::ASTFactory& factory )
{
   factory.registerFactory(4, "PLUSNode", PLUSNode::factory);
   factory.registerFactory(5, "MULTNode", MULTNode::factory);
   factory.registerFactory(6, "INTNode", INTNode::factory);
   factory.setMaxNodeType(11);
}

After these steps, ANTLR should be able to decide which factory to use at what time.
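
The idea behind registering one factory per token type can be sketched with a small lookup table. This is a simplified stand-in under stated assumptions, not the ANTLR ASTFactory implementation: DemoNode, DemoPlusNode, DemoFactoryRegistry and its create method are hypothetical names, and the fallback to a plain DemoNode for unregistered types is a choice made for this sketch.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <vector>

// Hypothetical node classes standing in for AST types (not ANTLR classes).
struct DemoNode {
    virtual ~DemoNode() = default;
    virtual std::string typeName() const { return "DemoNode"; }
};

struct DemoPlusNode : DemoNode {
    std::string typeName() const override { return "DemoPlusNode"; }
    static std::unique_ptr<DemoNode> factory() {
        return std::unique_ptr<DemoNode>(new DemoPlusNode);
    }
};

// Hypothetical registry: maps a token type number to a node factory.
class DemoFactoryRegistry {
public:
    using FactoryFn = std::unique_ptr<DemoNode> (*)();

    // Make sure the table has a slot for every type up to maxType.
    void setMaxNodeType(int maxType) {
        assert(maxType >= 0);
        const std::size_t needed = static_cast<std::size_t>(maxType) + 1;
        if (needed > table_.size()) table_.resize(needed);
    }

    // Remember which factory builds nodes for the given token type.
    void registerFactory(int type, const std::string& name, FactoryFn fn) {
        setMaxNodeType(type);
        Entry& entry = table_[static_cast<std::size_t>(type)];
        entry.name = name;
        entry.fn = fn;
    }

    // Use the registered factory if there is one, else the default node.
    std::unique_ptr<DemoNode> create(int type) const {
        if (type >= 0 && static_cast<std::size_t>(type) < table_.size()) {
            const Entry& entry = table_[static_cast<std::size_t>(type)];
            if (entry.fn) return entry.fn();
        }
        return std::unique_ptr<DemoNode>(new DemoNode);
    }

private:
    struct Entry {
        std::string name;
        FactoryFn fn;   // null after value-initialization: no factory yet
    };
    std::vector<Entry> table_;
};
```

In this sketch, registering DemoPlusNode::factory for type 4 makes create(4) return a DemoPlusNode, while any type without a registration yields the default node.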

Extra functionality in C++ mode

In C++ mode ANTLR supports some extra functionality to make life a little easier.

Inserting Code

In C++ mode, some extra control is provided over where code can be placed in the generated files. These are extensions to the header directive. The syntax is:
header "<identifier>" {  }

identifier          where
pre_include_hpp     Code is inserted before ANTLR generated includes in the header file.
post_include_hpp    Code is inserted after ANTLR generated includes in the header file, but outside any generated namespace specifications.
pre_include_cpp     Code is inserted before ANTLR generated includes in the cpp file.
post_include_cpp    Code is inserted after ANTLR generated includes in the cpp file, but outside any generated namespace specifications.

Pacifying the preprocessor

Sometimes various tree building constructs with '#' in them clash with the C/C++ preprocessor. ANTLR's preprocessor for actions is slightly extended in C++ mode to alleviate these pains.

NOTE: At some point I plan to replace the '#' by something different that gives less trouble in C++.

The following preprocessor constructs are not touched. (As a result, you cannot use these as labels for AST nodes.)

As another extra, it is possible to escape '#' signs with a backslash, e.g. "\#". When the action lexer sees these, it translates them to plain '#' characters.
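
The effect of the backslash escape can be sketched as a simple text transformation. The helper below is hypothetical (unescapeHash is not an ANTLR function) and only illustrates the translation described above; the real action lexer does considerably more than this.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper: turns every "\#" into "#" and leaves all other
// characters, including lone backslashes, untouched.
std::string unescapeHash(const std::string& action) {
    std::string out;
    out.reserve(action.size());
    for (std::size_t i = 0; i < action.size(); ++i) {
        const bool escapedHash = action[i] == '\\'
                              && i + 1 < action.size()
                              && action[i + 1] == '#';
        if (escapedHash) {
            out += '#';
            ++i;                    // skip the '#' that was just consumed
        } else {
            out += action[i];
        }
    }
    assert(out.size() <= action.size());  // unescaping never grows the text
    return out;
}
```

For example, passing the action text a \# b through this helper yields a # b.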

A template grammar file for C++

header "pre_include_hpp" {
    // gets inserted before antlr generated includes in the header file
}
header "post_include_hpp" {
    // gets inserted after antlr generated includes in the header file
    // outside any generated namespace specifications
}

header "pre_include_cpp" {
    // gets inserted before the antlr generated includes in the cpp file
}

header "post_include_cpp" {
    // gets inserted after the antlr generated includes in the cpp file
}

header {
    // gets inserted after generated namespace specifications in the header
    // file, but outside the generated class.
}

options {
    language="Cpp";
    namespace="something";      // encapsulate code in this namespace
//  namespaceStd="std";         // cosmetic option to get rid of long defines
                                // in generated code
//  namespaceAntlr="antlr";     // cosmetic option to get rid of long defines
                                // in generated code
    genHashLines = true;        // generate #line directives; set to false to turn them off
}

{
   // global stuff in the cpp file
   ...
}
class MyParser extends Parser;
options {
   exportVocab=My;
}
{
   // additional methods and members
   ...
}
... rules ...

{
   // global stuff in the cpp file
   ...
}
class MyLexer extends Lexer;
options {
   exportVocab=My;
}
{
   // additional methods and members
   ...
}
... rules ...

{
   // global stuff in the cpp file
   ...
}
class MyTreeParser extends TreeParser;
options {
   exportVocab=My;
}
{
   // additional methods and members
   ...
}
... rules ...