This is the Frequently Asked Questions (FAQ) list for the Purdue Compiler Construction Tool Set (PCCTS). The FAQ is currently at version 2.2. Any suggestions for improving this document should be forwarded to email@example.com. PLEASE DO NOT BUG TERENCE PARR ABOUT THE CONTENTS OF THIS DOCUMENT! He's busy enough enhancing PCCTS and making a go of a consulting business without having to contend with the FAQ.
This FAQ is based partly on the first draft of the FAQ written by Steve Robenalt as supplied by Terence Parr. It has since been heavily edited, and several new sections have been added. It was finally converted to HTML and published on the World Wide Web by Michael T. Richter.
The FAQ contains routine, non-technical to semi-technical questions about PCCTS and its related tools. Questions along the lines of "What is PCCTS and where can I get it?" are answered here.
Two new features were introduced in version 1.4 of the FAQ. The first, a list of uses to which PCCTS has been put, along with contact names where applicable, is intended to give new users an idea of what kinds of things PCCTS can be used for. The second, a list of consultants willing to do PCCTS-specific consulting, is provided as a service to the PCCTS-using community as a whole.
Anyone who wants their product or their services listed in either or both sections can feel free to drop me a line at the e-mail address listed above.
Please note: both of these FAQ features are separate pages on the World Wide Web as of revision 2.0 of the FAQ. Follow the menu frame to find them.
The FAQ is not an appropriate forum for answering questions about how to use PCCTS and its related tools. Consult Question 2.2 below for details on other documentation sources.
It would be nice if I could get a more complete list of sites which carry PCCTS. Of special interest would be mirror sites outside of the USA.
I can always use more work. Feel free to send in new questions, comments about format and style, and anything else you would like to see in this FAQ. Please note that, to avoid unnecessary duplication of effort, I will not place questions of a deeply technical nature into this FAQ. Tom Moog's "Notes for New Users" file (see below) is the best place to put these. Contact Tom (firstname.lastname@example.org) for technical issues.
The editor of this FAQ, Michael T. Richter, can be reached by e-mail as email@example.com. Since he maintains this FAQ in his Copious Free Time, there may be a few days' turnaround in processing incoming suggestions and requests. Rest assured, however, that all incoming e-mail is read and responded to where necessary. To speed processing, place the string "PCCTS FAQ:" somewhere in the subject header.
If I have forgotten to credit someone appropriately, please send me e-mail. Any corrections requested will be immediately entered into the FAQ for the next release. I'm not trying to steal anyone's thunder!
1.1 What is PCCTS?
1.2 Where can I get PCCTS?
1.3 How much does it cost?
2.1 What can I do with PCCTS?
2.2 What documentation is available?
2.3 What grammars are available?
2.4 What platforms can I use it on?
2.5 What is the difference between PCCTS and YACC/LEX?
3.1 What is SORCERER?
4.1 Where can I get PC-compatible tools for working with PCCTS?
4.2 Why is this FAQ not in comp.answers or on rtfm.mit.edu?
4.3 How do I port PCCTS to Win32?
PCCTS is the Purdue Compiler Construction Tool Set, so named because it began as a project in the School of Electrical Engineering at Purdue University. Several releases of the tools have been made publicly available; the current version is 1.33.
PCCTS consists of two tools, ANTLR (Another Tool for Language Recognition) and DLG (DFA-based Lexical analyser Generator), assorted support files, sample Pascal and C grammars, and full source code. ANTLR is a functional equivalent of YACC (Yet Another Compiler Compiler), which is distributed with most UNIX platforms. Both tools generate parsers from a BNF-like grammar description. Similarities and differences are covered under a later section.
DLG is a functional equivalent of LEX, which is also distributed with most UNIX platforms. Both tools generate lexical analysers (scanners) from a set of regular-expression rules.
Another tool commonly associated with PCCTS is SORCERER, a tree-parser generator that uses a grammar description similar to ANTLR's to build tree parsers (tree walkers). This tool is covered in detail in a later section. SORCERER will soon be incorporated into PCCTS.
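To give a feel for what "DFA-based" means, here is a minimal hand-written sketch of the kind of scanner such a generator produces. It is not DLG output and does not use the DLG API: the token kinds, state names and the `next_token` function are all hypothetical, and the two rules (identifiers and unsigned integers) were picked only for illustration.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Hypothetical token kinds for this sketch; not taken from DLG. */
enum token_kind { TOK_EOF, TOK_IDENT, TOK_NUMBER, TOK_ERROR };

/* DFA states for two illustrative regular-expression rules:
 *   IDENT  : a letter or '_' followed by letters, digits or '_'
 *   NUMBER : one or more digits                                  */
enum state { S_START, S_IDENT, S_NUMBER, S_DEAD };

/* Transition function: the next state for (state, input character). */
static enum state step(enum state s, int c)
{
    switch (s) {
    case S_START:
        if (isalpha(c) || c == '_') return S_IDENT;
        if (isdigit(c))             return S_NUMBER;
        return S_DEAD;
    case S_IDENT:
        return (isalnum(c) || c == '_') ? S_IDENT : S_DEAD;
    case S_NUMBER:
        return isdigit(c) ? S_NUMBER : S_DEAD;
    default:
        return S_DEAD;
    }
}

/* Scan one token starting at *cursor and advance *cursor past it.
 * Longest match: keep consuming input until the DFA has no move. */
enum token_kind next_token(const char **cursor)
{
    const char *p;
    enum state s = S_START;

    assert(cursor != NULL && *cursor != NULL);
    p = *cursor;
    while (*p == ' ' || *p == '\t' || *p == '\n')
        p++;                                  /* skip whitespace */
    if (*p == '\0') {
        *cursor = p;
        return TOK_EOF;
    }
    for (;;) {
        enum state next = step(s, (unsigned char)*p);
        if (next == S_DEAD)
            break;
        s = next;
        p++;
    }
    if (s == S_START) {                       /* no rule matched */
        *cursor = p + 1;                      /* skip the offending character */
        return TOK_ERROR;
    }
    *cursor = p;
    return s == S_IDENT ? TOK_IDENT : TOK_NUMBER;
}
```

Calling `next_token` repeatedly on a `const char *` that points at the text `count 42 x_1` yields an identifier, a number, an identifier and then end-of-input. A generated scanner typically encodes the transitions as tables rather than a `switch`, but the control loop has the same shape.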
The principal source of PCCTS is the primary anonymous FTP site:
Mirror sites exist at:
and possibly many other sites. Using archie may help you locate other sites if you are unable to access these for any reason.
The comp.compilers.tools.pccts newsgroup (actually the mailing list digest) is archived at the ftp sites in:
This archive appears to have stopped as of November 1994, however.
If you are unable to use ftp to access these files, you may want to consider the following alternatives:
e-mail: send a request to firstname.lastname@example.org with any "Subject:" line.
floppy disk or CD-ROM: don't hold your breath...
To quote from the source itself:
We reserve no LEGAL rights to the Purdue Compiler Construction Tool Set (PCCTS) -- PCCTS is in the public domain. An individual or company may do whatever they wish with source code distributed with PCCTS or the code generated by PCCTS, including the incorporation of PCCTS, or its output, into commercial software.
We encourage users to develop software with PCCTS. However, we do ask that credit is given to us for developing PCCTS. By "credit", we mean that if you incorporate our source code into one of your programs (commercial product, research project, or otherwise) that you acknowledge this fact somewhere in the documentation, research report, etc... If you like PCCTS and have developed a nice tool with the output, please mention that you developed it using PCCTS. In addition, we ask that this header remain intact in our source code. As long as these guidelines are kept, we expect to continue enhancing this system and expect to make other tools available as they are completed.
Please note that Terence Parr, the primary author, has stated that the PCCTS tools will always be free. He does, however, plan to release several commercial tools which will enhance the functionality of PCCTS.
The answer to this question depends on whether you acquired PCCTS with a project or goal in mind, or if you acquired it to learn more about compilers and tools. In either case, you will probably want to check out the tutorials which are available at the same ftp sites listed above in one of the older distribution directories. There are two tutorials for ANTLR and DLG which have documentation and sample code included. These will help explain how to write a grammar, and how to build appropriate semantic actions into the grammar. Note that these tutorials use "ancient" PCCTS technologies. Terence has no time to update them, however, so they will have to stand for now.
If you wish to learn more at this point, you'll probably want to have a look at the ANSI C grammar and the C++ grammar at the FTP site. These grammars are described briefly below. If, on the other hand, you already have a project in mind, you should probably just jump right in building or transcribing the grammar. It would be a good idea to review all of the relevant documentation as described below before beginning, since some of the older methods have been superseded by newer, more powerful ones.
At the time of this printing, the best single source of documentation for PCCTS v1.33 is The Book. (It is more formally entitled "Language Translation Using PCCTS and C++: A Reference Guide" but has been called The Book for so long that I'm not about to change it.) An "Initial Release to Internet for Review and General Bashing" is available at the main FTP site as:
After publication of The Book, it will be available commercially at a reasonable (hopefully!) cost. Terence will be sure to notify us of precisely how to get The Book after publication.
Another ABSOLUTELY ESSENTIAL source of documentation for PCCTS is Tom Moog's "Notes for New Users of PCCTS" document. It is appended to The Book and is also available at:
The distribution tar files contain several documents targeting platform-specific issues. These are named "NOTES.<platform>" and should be read by anyone not compiling on the canonical UNIX platform, which is the presumed default.
Several sample grammars will be available at the FTP site under /pub/pccts/contrib. These include ansi.tar, a sophisticated ANSI C grammar which uses all of the latest tricks, and a Pascal grammar which used to be distributed with the v1.06 PCCTS toolset. A very sophisticated C++ grammar is also at the FTP site. None of the grammars generates a full compiler, but rather a parser which executes some kind of action on the source file.
Other grammars are being built by many users of these tools. If you would like to know about the availability of a particular grammar, please post a question to the newsgroup. Someone else may be working on something similar and may be willing to share their work.
For some more detailed information, consult the following URL:
PCCTS was written to be as easy as possible to build and run under as many operating systems and C/C++ compilers as possible. It has been built successfully under many flavours of UNIX and many UNIX-like operating systems (NetBSD, FreeBSD, Linux), under OS/2 and Windows (including NT and 95), and even under DOS using DJGPP (GCC with DOS extender) and Watcom C/C++. However, it may be difficult to build using a 16-bit compiler, since the current version contains more than 64k (1 segment) of static data, which causes problems with most DOS C compilers. Most 32-bit C compilers should work with minimal trouble, using K&R, ANSI, or C++ compiler modes.
One problem which is reported frequently upon the initial build of ANTLR and DLG (as happens using the installation program) is the failure of the build process due to the inability to invoke ANTLR or DLG to build themselves. This is not a problem with GNU Make, but does occur frequently with other versions of make from other vendors. The solution to the problem is to simply comment out the target lines in the Makefiles for ANTLR and DLG which try to invoke ANTLR and/or DLG to build themselves. The C files which would be created by these target lines already exist, so the build will proceed normally once these lines are commented out. After the initial build, these lines can be restored to their original state for all subsequent builds.
Note that if you try to build first and see these errors, you should unpack all of the files from the original distribution since the failed build may have written over the valid files before failing. Using touch to make the target files appear to be up to date may or may not work, depending on which version of make you are using. Alternatively, since the C files necessary for building the tool have already been generated, commenting out the dependencies which require ANTLR/DLG will solve the problem.
This question is obviously flame-bait, but it does come up and will continue to do so. Thus, here is a brief summary of some distinctions:
SORCERER is a tool for generating tree parsers and tree transformers; it is to trees what ANTLR is to text. It is best suited to the class of translation problems lying between those solved by code-generator generators and those solved by full source-to-source translator generators.
SORCERER generates simple, flexible, top-down, tree parsers that, in contrast to code-generators, may execute actions at any point during a tree walk. SORCERER accepts extended BNF notation, allows predicates to direct the tree walk with semantic and syntactic context information, and does not rely on any particular intermediate form, parser generator, or other pre-existing application. Often a programmer has a front-end that constructs intermediate form trees and simply wants to traverse the trees and execute a few actions.
SORCERER shares several features with PCCTS including grammar description format, the use of semantic and syntactic predicates to direct the parse, and the generation of readable C code. SORCERER parsers are LL(1), compared to LL(k) for ANTLR, but this places only minimal constraints on the type of trees which can be parsed because programmers specifically design their intermediate forms/ASTs to be easy to walk.
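To make the idea of an LL(1) tree walk concrete, here is a minimal hand-written sketch. It is not SORCERER output: the node layout (child-sibling links named `down` and `right`), the node kinds, and the `eval_expr` function are assumptions made for illustration. It walks an expression tree top-down, picks the alternative from the current node's token alone, and runs an action between visiting the two children.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical node kinds and layout for this sketch. */
enum node_token { N_INT, N_PLUS, N_MULT };

struct ast {
    enum node_token token;  /* node type: the single symbol of lookahead */
    int value;              /* used by N_INT nodes only */
    struct ast *down;       /* first child */
    struct ast *right;      /* next sibling */
};

/* Count of operator nodes visited so far, updated by an action
 * that runs in the middle of the walk. */
static int operators_seen = 0;

/* Walks a tree where an expression is either a PLUS node with two
 * expression children, a MULT node with two expression children,
 * or an INT leaf. The alternative is chosen from t->token alone. */
int eval_expr(const struct ast *t)
{
    int lhs, rhs;

    assert(t != NULL);
    switch (t->token) {
    case N_INT:
        return t->value;
    case N_PLUS:
    case N_MULT:
        assert(t->down != NULL && t->down->right != NULL);
        lhs = eval_expr(t->down);
        operators_seen++;               /* action between the two children */
        rhs = eval_expr(t->down->right);
        return t->token == N_PLUS ? lhs + rhs : lhs * rhs;
    }
    return 0;  /* not reached for well-formed trees */
}
```

For the tree representing 2 + 3 * 4 (a PLUS root whose children are the leaf 2 and a MULT node over 3 and 4), `eval_expr` returns 14 and leaves `operators_seen` at 2. Because the tree was built with one distinct token per node shape, one symbol of lookahead is all the walker needs, which is the point made above about intermediate forms being designed to be easy to walk.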
While SORCERER is designed to work very well in conjunction with PCCTS, it can also be used with any other tools that generate the minimally-defined form of AST which SORCERER expects. The latest version is 1.00B15 and was recently heavily upgraded to handle tree transformations in a more reasonable manner. It also comes with a number of support libraries.
SORCERER will soon be integrated with PCCTS.
http://www.acs.oakland.edu (or ftp://oak.oakland.edu) is a *MAJOR* site worth a visit by seekers of free/shareware tools for a software development environment. A PC-compatible TAR is there. GZIP is there also. You will also find Ghostview and Ghostscript there in the Simtel/msdos/postscrp directory. These four tools are must-haves when dealing with Ter's PostScript documentation on an MS-DOS/Windows platform. Tom Moog's site, http://www.mcs.net/~tmoog/pccts.html, has links to various sources of such programs as well.
The current processes for placing material in the *.answers hierarchy (and by extension at rtfm) are baroque, bordering on silly. The editor of this FAQ maintains it in his Copious Free Time and does not feel that a document well in excess of 64KB explaining how to post to these sites is valid or even remotely useful.
Until the process to post to *.answers/rtfm is made saner, this editor will not do so. (If someone else wants to go through the headache, he can feel free to.) Yes, the editor was in a very bad mood when writing this particular answer. He has remained in a bad mood about this particular subject in subsequent bouts of editing as well.
Franz-Josef Kaiser was kind enough to post his answer to this question on the Web. Read of his experiences at:
You can reach me by e-mail at email@example.com.