              PUBLIC DOMAIN PCCTS-BASED C++ GRAMMAR

                          VERSION 1.2

                            AUTHORS

  Sumana Srinivasan, NeXT Inc.;             sumana_srinivasan@next.com
  Terence Parr, Parr Research Corporation;  parrt@parr-research.com
  Russell Quong, Purdue University;         quong@ecn.purdue.edu

[Restructured for public consumption by Terence.  Please bug Terence
 about problems, comments, generally-warm thoughts, or stern reprimands;
 don't bug Sumana or Russell.  Terence can also help you enhance this
 for your particular application if you ask him nicely (i.e., throw pizza
 money at him).]

[This is only the third release and is not perfect, but heh!  What
 ya want for nothin'?  We anticipate enhancing this grammar.]


                   BETA TESTERS (alphabetical order)

	Lutz Hamel, Lutz.Hamel@comlab.ox.ac.uk
	Scott Haney, haney@random.llnl.gov
	Thomas Herter, Thomas.Herter@mch.sni.de
	Thomas Hutto, TAHUTTO@ix.netcom.com
	Michael Richter, mtr@globalx.com
	Steven Robenalt, steven_robenalt@internet.uscs.com
	Dan Yoder, dyoder@pencom.com

[Heh, how come four of you guys have family names starting with 'H'?
 Conspiracy? Coincidence?  And...2 'R's?  Hmm...]


                          SOFTWARE RIGHTS

This software is free.  We do not reserve any LEGAL rights to its use
or distribution, but you may NOT claim ownership or authorship of this
grammar or support code.  An individual or company may otherwise do
whatever they wish with the grammar distributed herewith including the
incorporation of the grammar or the output generated by ANTLR into
commercial software.  You may redistribute in source or binary form
without payment of royalties to us as long as this header remains in
all source distributions.

We encourage users to develop parsers/tools using this grammar.  In
return, we ask that credit be given to us for developing this grammar.
By "credit", we mean that if you incorporate our grammar or the
generated code into one of your programs (commercial product, research
project, or otherwise) that you acknowledge this fact in the
documentation, research report, etc....  In addition, you should say
nice things about us at every opportunity.

As long as these guidelines are kept, we expect to continue enhancing
this grammar.  Feel free to send us enhancements, fixes, bug reports,
suggestions, or general words of encouragement at parrt@parr-research.com.


                            DISCLAIMER

We make no guarantees that this grammar works, makes sense, or can be
used to do anything useful.  "Your mileage may vary".


                           INSTALLATION

First, modify the makefile macro in Makefile:

PCCTS = /usr/local/pccts

to point to your PCCTS base directory.  Note that if your
binaries are not in $(PCCTS)/bin, then change

BIN = $(PCCTS)/bin

to the appropriate directory.  Now, simply do a

	make Cplusplus=g++

substituting your own compiler for g++.  The 'Cplusplus=' setting is
needed only if that makefile macro is not already defined.  The
executable cplusplus will then exist in the current directory.


                         DESIGN PHILOSOPHY

This grammar was developed using the NeXT C++ grammar as a guide (the
portions of the NeXT grammar relating to pure C and Objective-C
recognition are excluded, however).  The NeXT C++ grammar is
fine-tuned to parse C++ very quickly, as it is an important part of
the code browser.  The restructured grammar totally ignores speed
issues, striving for high readability.  The various objects in the C++
language such as function definitions, constructors, destructors, and
templates are easily identified in the grammar--they are recognized
by separate rules.  This separation also makes the development of
translators easier because it is not necessary to determine whether a
given sentence is a definition or a declaration.  If we wanted to be
funny, we might say that the guiding principle was

    "When in doubt...predicate"

The parsing speed is not great: about 70000 lines / minute on a
pentium-90 NeXTStep 3.2 machine.  However, the speed can easily be
improved with simple things like making symbol table entry allocation
faster.  It currently does a "new" for each entry (as well as the list
containers), which is definitely of non-linear complexity.  The
recipient can spend the time to increase the speed if they wish.
Most people don't care about speed as they are more concerned with
getting a robust and maintainable product out fast.
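
As a rough illustration of the "faster symbol table entry allocation"
idea, here is a minimal sketch of a block allocator that grabs many
entries per "new" instead of one.  SymEntry and SymEntryPool are
hypothetical names invented for this sketch; they are not the classes
used by the grammar's support code, and no speedup has been measured.

```cpp
#include <cstddef>
#include <memory>
#include <string>
#include <vector>

// Hypothetical symbol-table entry; the grammar's real entry class differs.
struct SymEntry {
    std::string name;
    int scope = 0;
};

// Hands out entries from large blocks instead of one "new" per entry.
class SymEntryPool {
public:
    explicit SymEntryPool(std::size_t blockSize = 1024)
        : blockSize_(blockSize ? blockSize : 1), used_(blockSize_) {}

    // Returns a pointer that stays valid for the lifetime of the pool.
    SymEntry* allocate(const std::string& name, int scope) {
        if (used_ == blockSize_) {          // newest block is full (or none yet)
            blocks_.emplace_back(new SymEntry[blockSize_]);
            used_ = 0;
        }
        SymEntry* e = &blocks_.back()[used_++];
        e->name = name;
        e->scope = scope;
        return e;
    }

    std::size_t blockCount() const { return blocks_.size(); }

private:
    std::size_t blockSize_;
    std::size_t used_;   // entries already handed out from the newest block
    std::vector<std::unique_ptr<SymEntry[]>> blocks_;
};
```

Entries are never freed individually; everything is released when the
pool is destroyed, which suits a symbol table that lives for the whole
parse.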


                           FUNCTIONALITY

***What the Executable Does

The current behavior of the grammar is to simply dump out the symbols
in the different scopes and to print out the function names of the
"triggers" in the grammar as they are encountered.

By turning on the -trace option ("cplusplus -trace < input.cpp"), the
invocation of each rule is displayed with the current 2-token
lookahead buffer.

***How To Use the Executable

The grammar does not include a preprocessor--therefore, the user must
explicitly run a C++ file through the preprocessor before our grammar
can chew on it.  For example:

lonewolf:/projects/cpp$ cat > test.cpp
#include <iostream.h>
#define A 3

int i = A;
f()
{
        cout << "wow...a C++ grammar\n";
}
^D
lonewolf:/projects/cpp$ g++ -E test.cpp > t
lonewolf:/projects/cpp$ ./cplusplus < t
beginDeclaration();
declarationSpecifier();
declaratorID(_G_clock_t);
defining _G_clock_t in scope 1
beginDeclaration();
declarationSpecifier();
declaratorID(_G_dev_t);
defining _G_dev_t in scope 1
 :
 :

The C++ preprocessor expands iostream.h and converts A to 3.  Directly
running cplusplus on test.cpp will result in a boatload of syntax
errors.

***For GNU C++ users

The g++ compiler has some predefined types that cplusplus, of course,
is not aware of.  After running it through the preprocessor, stick
the following two lines on the top of the file:

    typedef int __wchar_t;
    typedef int bool;

Other compilers may need this as well, but I haven't tried any.

***What Portion of C++ Does the Grammar Cover?

The grammar recognizes most of C++ (templates, exceptions, pointers to
member funcs, user-defined type-casts, etc...).  It does not cover the
recent "name-space" additions or bitfields.  It should handle the
majority of C++ code out there.

***What the Grammar Doesn't Do

No intermediate form trees (ASTs) are constructed and the set of
action "triggers" is not complete enough for most applications.  We
suggest that you extend this list of function triggers rather than
inserting actual actions directly into the grammar.


                        MODIFICATIONS

The member function triggers embedded in the grammar can be redefined
to do whatever you want.  We use these to manage the symbol table.
You can subclass CParser, as in the example in main.C, to redefine the
various operations.  This would be an easy way to build type trees,
for example.  [For expression trees, you might want to modify the
grammar using the ANTLR AST stuff.]  Here is some sample output
showing how an array of pointers (the first definition) is
distinguished from a pointer to an array.

lonewolf:/projects/cpp$ ./cplusplus
int *p[3];
beginDeclaration();
declarationSpecifier();
declaratorID(p);
defining p in scope 1
declaratorArray();         <--- note the order here
declaratorPointerTo();
global scope:
[non-type: p]
lonewolf:/projects/cpp$ ./cplusplus
int (*p)[3];
beginDeclaration();
declarationSpecifier();
declaratorID(p);
defining p in scope 1
declaratorPointerTo();     <--- and here
declaratorArray();
global scope:
[non-type: p]
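
To make the "redefine the triggers" idea concrete, here is a minimal
sketch that turns the trigger order shown above into a readable type
description.  TriggerBase is a stand-in written for this sketch, not
the generated CParser; only the trigger names come from the sample
output, and their signatures are assumptions.  TypeDescriber and
describe() are likewise hypothetical.

```cpp
#include <string>

// Stand-in for the generated parser class.  Only the trigger names are
// taken from the sample output; the signatures are assumed.
class TriggerBase {
public:
    virtual ~TriggerBase() {}
    virtual void beginDeclaration() {}
    virtual void declarationSpecifier() {}
    virtual void declaratorID(const char*) {}
    virtual void declaratorArray() {}
    virtual void declaratorPointerTo() {}
};

// Records the declarator triggers in the order they fire.
class TypeDescriber : public TriggerBase {
public:
    void beginDeclaration() override { name_.clear(); type_.clear(); }
    void declaratorID(const char* id) override { name_ = id; }
    void declaratorArray() override { type_ += "array of "; }
    void declaratorPointerTo() override { type_ += "pointer to "; }

    // The base type is passed in because the declarationSpecifier()
    // trigger, as shown in the output, carries no argument.
    std::string describe(const std::string& base) const {
        return name_ + ": " + type_ + base;
    }

private:
    std::string name_;
    std::string type_;
};
```

Firing the triggers in the order shown for "int *p[3];" and calling
describe("int") gives "p: array of pointer to int"; the order shown
for "int (*p)[3];" gives "p: pointer to array of int".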


                      IMPLEMENTATION NOTES

What version of ANTLR: at least 1.32b7.

New predicates: (...)? => <<...>>?
	Incomplete...sorry.

Meaning of #pragma approx
	Incomplete...sorry.

Symbol table management.
	Incomplete...sorry.

Known problems:
	No bitfields

	template<class T> T f() {...} doesn't work because the return
	type of f() will not have been entered into the symbol table
	in time to get parsed as a type.

	Can't have a declarator without a type or storage class.
	You get a syntax error (a bad one) from cplusplus.  For example,
	"a;".

	Recognizes the wrong construction for hexadecimal escape sequences.

	Doesn't handle this:

	    class A {
	            typedef int I;
	    };

	    class B : public A {
	            I a;
	    };

	since there is no class hierarchy check.
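
	One possible shape for such a check is sketched below: a type-name
	lookup that also searches base classes.  ClassScope and ClassTable
	are hypothetical names for this sketch and are not part of the
	distributed symbol table code.  The sketch ignores access control,
	name hiding, and ambiguity, and assumes the inheritance graph has
	no cycles.

```cpp
#include <map>
#include <set>
#include <string>
#include <vector>

// Hypothetical per-class scope: type names declared in the class, plus
// the names of its direct base classes.
struct ClassScope {
    std::set<std::string> typeNames;
    std::vector<std::string> bases;
};

class ClassTable {
public:
    void defineClass(const std::string& name,
                     const std::vector<std::string>& bases = {}) {
        classes_[name].bases = bases;
    }

    void defineType(const std::string& cls, const std::string& typeName) {
        classes_[cls].typeNames.insert(typeName);
    }

    // True if typeName is declared in cls or in any direct or indirect base.
    bool isTypeName(const std::string& cls, const std::string& typeName) const {
        std::map<std::string, ClassScope>::const_iterator it = classes_.find(cls);
        if (it == classes_.end()) return false;          // unknown class
        if (it->second.typeNames.count(typeName)) return true;
        for (const std::string& base : it->second.bases)
            if (isTypeName(base, typeName)) return true; // search the bases
        return false;
    }

private:
    std::map<std::string, ClassScope> classes_;
};
```

	With A defining I and B deriving from A, isTypeName("B", "I")
	returns true, so "I a;" inside B could be treated as a declaration.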


	