
                  PUBLIC DOMAIN ANSI C RECOGNIZER

                            January 1995

               Terence Parr, Parr Research Corporation
               with Randy McRee, Tandem Corporation
               Released as public-domain by Tandem Corporation
               Originally taken from Tory Eneboe (tory@cs.montana.edu)
                  who typed in and partially "ANTLR-ized" the
                  K&R 2nd edition grammar.

GRAMMAR

This grammar is much nicer than the old pccts/lang/C stuff that came
with the PCCTS distribution.  That was junk.  This new one uses all
the tricks of later versions of ANTLR.  The rule names should be the
same or close to the names used in the ANSI C grammar given in the K&R
2nd edition book.

For parsing purposes, the only distinction between symbols is type
versus non-type.  Hence functions, variables, etc. are lumped together.
Also note that struct names are not types all by themselves in ANSI C,
so they are not types vis-a-vis parsing.
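
To illustrate the type/non-type split, here is a minimal sketch of a
scoped symbol table that records only that one bit per name.  It is not
the symbol table shipped with the grammar; ScopedSymbols, SymbolKind and
every member name are hypothetical and exist only for this example.  A
struct tag would not be entered as a type here, since the tag alone is
not a type name in ANSI C.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Hypothetical names throughout; the grammar's own symbol table may differ.
enum class SymbolKind { Type, NonType };

class ScopedSymbols {
public:
    // Start with a single (global) scope.
    ScopedSymbols() { pushScope(); }

    // Enter a nested scope, e.g. at the start of a compound statement.
    void pushScope() { scopes_.emplace_back(); }

    // Leave the innermost scope; the global scope is never popped.
    void popScope() {
        assert(scopes_.size() > 1 && "cannot pop the global scope");
        scopes_.pop_back();
    }

    // Declare a name in the innermost scope, shadowing outer declarations.
    void declare(const std::string& name, SymbolKind kind) {
        scopes_.back()[name] = kind;
    }

    // True only if the innermost visible declaration of `name` is a type.
    bool isTypeName(const std::string& name) const {
        for (auto it = scopes_.rbegin(); it != scopes_.rend(); ++it) {
            auto found = it->find(name);
            if (found != it->end()) {
                return found->second == SymbolKind::Type;
            }
        }
        return false;  // undeclared names are treated as non-types
    }

private:
    std::vector<std::map<std::string, SymbolKind>> scopes_;
};
```

A semantic predicate could consult something like isTypeName() to decide
whether an identifier starts a declaration.  Functions and variables
both go in as SymbolKind::NonType, which is all the parser needs.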

The grammar uses the C++ output mode of ANTLR and needs ANTLR version
1.32b1 or later.

There are syntactic and semantic predicates in the grammar to handle
the context-sensitive portions.  When given a choice, I chose
readability over parsing speed.  Comments throughout the grammar
should help you understand it--though it is very clean.

CURRENT BEHAVIOR

The current behavior of the grammar is to simply dump out the symbols
in the different scopes.

NOTE: you'll have to run your C code through /lib/cpp (the C
preprocessor) before this grammar will accept it.

MODIFICATIONS

You'll note a bunch of member function triggers embedded in the
grammar.  I use these to manage the symbol table.  You can subclass
CParser, as the example in main.C does, to redefine the various
operations.  This would be an easy way to build type trees, for
example.  [For expression trees, you might want to modify the
grammar using the ANTLR AST stuff.]

If you use MyParser instead of CParser in main.C, you'll note the series
of function calls.  For example,

int *p[3];

generates:

beginDeclaration();
declarationSpecifier();
declaratorID();
declaratorArray();
declaratorPointerTo();
global scope:
[non-type: p]

and

int (*p)[3];

generates:

beginDeclaration();
declarationSpecifier();
declaratorID();
declaratorPointerTo();
declaratorArray();
global scope:
[non-type: p]

There is a big difference between an array of pointers (first example)
and a pointer to an array (second example).  You'll see that the order
of function calls is different.
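
The call order above is enough to reconstruct the declared type.  The
following is a rough sketch of that idea, not code from main.C.
TypeDescriber is a hypothetical stand-in for a CParser subclass, and the
string arguments to declarationSpecifier() and declaratorID() are an
assumption made for illustration; the real triggers may have different
signatures.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for a CParser subclass.  Each trigger appends
// to a description in the order the parser would call it.
class TypeDescriber {
public:
    void beginDeclaration() { base_.clear(); name_.clear(); chain_.clear(); }
    void declarationSpecifier(const std::string& spec) { base_ = spec; }
    void declaratorID(const std::string& id) { name_ = id; }
    void declaratorArray() { chain_ += "array of "; }
    void declaratorPointerTo() { chain_ += "pointer to "; }

    // Read the declaration outward from the identifier.
    std::string describe() const {
        assert(!name_.empty() && "declaratorID() must be called first");
        return name_ + ": " + chain_ + base_;
    }

private:
    std::string base_, name_, chain_;
};

// Trigger order shown above for:  int *p[3];
std::string describeArrayOfPointers() {
    TypeDescriber d;
    d.beginDeclaration();
    d.declarationSpecifier("int");
    d.declaratorID("p");
    d.declaratorArray();
    d.declaratorPointerTo();
    return d.describe();  // "p: array of pointer to int"
}

// Trigger order shown above for:  int (*p)[3];
std::string describePointerToArray() {
    TypeDescriber d;
    d.beginDeclaration();
    d.declarationSpecifier("int");
    d.declaratorID("p");
    d.declaratorPointerTo();
    d.declaratorArray();
    return d.describe();  // "p: pointer to array of int"
}
```

Because the triggers fire from the identifier outward, appending one
node per call yields the type in reading order.  A real type tree could
be built the same way by linking nodes instead of concatenating strings.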


Good luck!
Terence Parr
PS	I did this on a free weekend...what a nice guy...



