


ANTLR(1)                      PCCTS Manual Pages                      ANTLR(1)




NAME

       antlr - ANother Tool for Language Recognition


SYNTAX

       antlr [options] grammar_files


DESCRIPTION

       Antlr converts an extended form of context-free grammar into a set of C
       functions which directly implement an efficient form  of  deterministic
       recursive-descent LL(k) parser.  Context-free grammars may be augmented
       with predicates to allow semantics to influence parsing; this allows  a
       form  of  context-sensitive  parsing.   Selective  backtracking is also
       available to handle non-LL(k) and even non-LALR(k)  constructs.   Antlr
       also produces a definition of a lexer which can be automatically
       converted into C code for a DFA-based lexer by dlg.  Hence, antlr
       serves a function much like that of yacc; however, it is notably more
       flexible and is more integrated with a lexer generator (antlr directly
       generates dlg code, whereas yacc and lex are given independent
       descriptions).  Unlike yacc, which accepts LALR(1) grammars, antlr
       accepts LL(k) grammars in an extended BNF notation, which eliminates
       the need for precedence rules.

       Like yacc grammars, antlr  grammars  can  use  automatically-maintained
       symbol  attribute  values  referenced  as  dollar  variables.  Further,
       because antlr generates  top-down  parsers,  arbitrary  values  may  be
       inherited  from  parent rules (passed like function parameters).  Antlr
       also has a mechanism for  creating  and  manipulating  abstract-syntax-
       trees.
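
       As a rough illustration of inherited values in a top-down parser, the
       hand-written C sketch below passes a value from a parent rule function
       down to a child rule function as an ordinary parameter.  This is not
       antlr output; the rule names (decl_list, decl), the token array, and
       the symbol arrays are all hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical pre-tokenized input: the words of "int a b c". */
static const char *toks[] = { "int", "a", "b", "c", NULL };
static int pos = 0;

/* Hypothetical symbol table filled in by the child rule. */
static const char *sym_name[8];
static const char *sym_type[8];
static int nsyms = 0;

/* Child rule: 'type' is inherited from the parent rule, and arrives
   exactly like a function parameter. */
static void decl(const char *type)
{
    assert(nsyms < 8);          /* stay inside the fixed-size table */
    sym_name[nsyms] = toks[pos++];
    sym_type[nsyms] = type;
    nsyms++;
}

/* Parent rule: reads the type word, then hands it down to each decl. */
static void decl_list(void)
{
    const char *type = toks[pos++];

    while (toks[pos] != NULL)
        decl(type);
}
```

       Calling decl_list() on the input above records a, b, and c, each with
       the type word inherited from the parent rule.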

       There  are  various  other  niceties in antlr, including the ability to
       spread one grammar over multiple files or even multiple grammars  in  a
       single  file,  the  ability  to  generate a version of the grammar with
       actions stripped out (for documentation purposes), and lots more.


OPTIONS

       -ck n  Use up to n symbols of lookahead when using  compressed  (linear
              approximation)  lookahead.  This type of lookahead is very cheap
              to compute and is attempted before full LL(k)  lookahead,  which
              is of exponential complexity in the worst case.  In general, the
              compressed lookahead can be much deeper (e.g., -ck 10) than the
              full lookahead (which usually must be less than 4).
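
              The difference between the two kinds of lookahead can be
              sketched in C.  Assume, purely for illustration, that one
              alternative is predicted by the two-token sequences A B and
              C D.  Full LL(2) lookahead tests the exact pairs, whereas a
              linear approximation keeps one token set per lookahead depth
              and so also accepts the mixed sequences A D and C B.  The
              token and function names are hypothetical, and this is not
              antlr's internal representation.

```c
enum token { A, B, C, D };   /* hypothetical token numbers */

/* Full LL(2): the exact set of two-token sequences {A B, C D}. */
static int full_ll2_predicts(enum token t1, enum token t2)
{
    return (t1 == A && t2 == B) || (t1 == C && t2 == D);
}

/* Linear approximation: one set per depth, {A, C} and then {B, D}. */
static int approx_predicts(enum token t1, enum token t2)
{
    return (t1 == A || t1 == C) && (t2 == B || t2 == D);
}
```

              The approximate test needs only one set per depth rather than
              a set of whole sequences, which is why it stays cheap at large
              depths; the price is that it accepts a superset of the real
              lookahead.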

       -CC    Generate C++ output from both ANTLR and DLG.

       -cr    Generate  a cross-reference for all rules.  For each rule, print
              a list of all other rules that reference it.

       -e1    Ambiguities/errors shown in low detail (default).

       -e2    Ambiguities/errors shown in more detail.

       -e3    Ambiguities/errors shown in excruciating detail.

       -fe file
              Rename err.c to file.

       -fh file
              Rename stdpccts.h header (turns on -gh) to file.

       -fl file
              Rename lexical output, parser.dlg, to file.

       -fm file
              Rename file with lexical mode definitions, mode.h, to file.

       -fr file
              Rename file which remaps globally visible symbols,  remap.h,  to
              file.

       -ft file
              Rename tokens.h to file.

       -ga    Generate ANSI-compatible code (default case).  This has not been
              rigorously tested to be ANSI X3J11 C compliant, but it is close.
              The normal output of antlr is currently compilable under K&R C,
              ANSI C, and C++; this option does nothing because antlr
              generates a bunch of #ifdef’s to do the right thing depending on
              the language.

       -gc    Indicates that antlr should generate no C code, i.e., only  per-
              form analysis on the grammar.

       -gd    C  code is inserted in each of the antlr generated parsing func-
              tions to provide for user-defined handling of a  detailed  parse
              trace.  The inserted code consists of calls to the user-supplied
              macros or functions called zzTRACEIN and zzTRACEOUT.   The  only
              argument  is  a char * pointing to a C-style string which is the
              grammar rule recognized by the current parsing function.  If  no
              definition is given for the trace functions, upon rule entry and
              exit, a message will be printed  indicating  that  a  particular
              rule has been entered or exited.
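
              A minimal sketch of user-supplied trace hooks, assuming they
              are defined as macros before the generated parser code is
              compiled.  Only the names zzTRACEIN and zzTRACEOUT and their
              single char * argument come from the description above; the
              indentation scheme, the trace_depth counter, and the stand-in
              rule functions expr and term are hypothetical and written by
              hand, not produced by antlr.

```c
#include <assert.h>
#include <stdio.h>

static int trace_depth = 0;   /* hypothetical nesting counter */

static void trace_in(char *rule)
{
    /* Indent two spaces per nesting level, then name the rule. */
    printf("%*senter %s\n", 2 * trace_depth, "", rule);
    trace_depth++;
}

static void trace_out(char *rule)
{
    assert(trace_depth > 0);  /* every exit must pair with an entry */
    trace_depth--;
    printf("%*sexit  %s\n", 2 * trace_depth, "", rule);
}

#define zzTRACEIN(r)  trace_in(r)
#define zzTRACEOUT(r) trace_out(r)

/* Hand-written stand-ins for generated rule functions. */
static void term(void)
{
    zzTRACEIN("term");
    zzTRACEOUT("term");
}

static void expr(void)
{
    zzTRACEIN("expr");
    term();
    zzTRACEOUT("expr");
}
```

              Calling expr() prints an indented entry and exit line for expr
              with the lines for term nested inside.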

       -ge    Generate an error class for each non-terminal.

       -gh    Generate  stdpccts.h  for  non-ANTLR-generated files to include.
              This file contains all defines needed to describe  the  type  of
              parser  generated  by antlr (e.g. how much lookahead is used and
              whether or not trees are constructed) and  contains  the  header
              action specified by the user.

       -gk    Generate  parsers  that  delay  lookahead  fetches until needed.
              Without this option, antlr generates parsers which always have k
              tokens of lookahead available.

       -gl    Generate line info about grammar actions in C parser of the form
              # line "file" which makes error messages from the C/C++ compiler
              make more sense, as they will point into the grammar file, not
              the resulting C file.  Debugging is easier as well, because you
              will step through the grammar file, not the C file.

       -gs    Do  not generate sets for token expression lists; instead gener-
              ate a ||-separated sequence of LA(1)==token_number.  The default
              is to generate sets.
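
              The two code shapes can be compared with a small hand-written
              C sketch.  The token numbers, the bitset layout, and the
              function names are assumptions made for illustration, and the
              la1 parameter stands in for LA(1); the set encoding antlr
              actually emits may differ.

```c
/* Hypothetical token numbers for illustration. */
enum { PLUS = 3, MINUS = 4, STAR = 9, SLASH = 10 };

/* Set form (the default style): test one bit in a precomputed bitset. */
static const unsigned char op_set[2] = {
    (1u << PLUS) | (1u << MINUS),              /* tokens 0..7  */
    (1u << (STAR - 8)) | (1u << (SLASH - 8))   /* tokens 8..15 */
};

static int in_op_set(int la1)
{
    return (op_set[la1 >> 3] >> (la1 & 7)) & 1;
}

/* -gs form: a ||-separated sequence of comparisons against LA(1). */
static int in_op_chain(int la1)
{
    return la1 == PLUS || la1 == MINUS || la1 == STAR || la1 == SLASH;
}
```

              Both functions give the same answer for every token number
              from 0 to 15; they differ only in the shape of the test.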

       -gt    Generate code for Abstract-Syntax Trees.

       -gx    Do  not  create  the lexical analyzer files (dlg-related).  This
              option should be given when the user wishes to  provide  a  cus-
              tomized  lexical  analyzer.  It may also be used in make scripts
              to cause only the parser to be rebuilt when a change not affect-
              ing the lexical structure is made to the input grammars.

       -k n   Set k of LL(k) to n; i.e. set tokens of look-ahead (default=1).

       -o dir Directory where output files should go (default=".").   This  is
              very  nice  for  keeping the source directory clear of ANTLR and
              DLG spawn.

       -p     The complete grammar, collected from all input grammar files and
              stripped of all comments and embedded actions, is listed to std-
              out.  This is intended to aid in viewing the entire grammar as a
              whole and to eliminate the need to keep actions concisely stated
              so that the grammar is easier to read.  Hence, it is  preferable
              to  embed  even  complex actions directly in the grammar, rather
              than to call them as  subroutines,  since  the  subroutine  call
              overhead will be saved.

       -pa    This  option  is  the same as -p except that the output is anno-
              tated with the first sets determined from grammar analysis.

       -prc on
              Turn on the computation and hoisting of predicate context.

       -prc off
              Turn off the computation  and  hoisting  of  predicate  context.
              This  option makes 1.10 behave like the 1.06 release with option
              -pr on.  Context computation is off by default.

       -rl n  Limit the maximum number of tree nodes used by grammar  analysis
              to  n.   Occasionally, antlr is unable to analyze a grammar sub-
              mitted by the user.  This rare situation can only occur when the
              grammar  is  large  and  the amount of lookahead is greater than
              one.  A nonlinear analysis algorithm is used by PCCTS to  handle
              the  general  case  of LL(k) parsing.  The average complexity of
              analysis, however, is near linear due to some fancy footwork  in
              the implementation which reduces the number of calls to the full
              LL(k) algorithm.  If this limit is reached, an error message is
              displayed which indicates the grammar construct being analyzed
              when antlr hit a non-linearity.  Use this option if antlr seems
              to go out to lunch and your disk starts thrashing; try n=10000
              to start.  Once the offending construct has been identified, try
              to remove the ambiguity that antlr was trying to overcome with
              large lookahead analysis.  The introduction of (...)?
              backtracking blocks eliminates some of these problems: antlr
              does not analyze alternatives that begin with (...)? (it simply
              backtracks, if necessary, at run time).
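
              The run-time idea behind a backtracking block can be sketched
              in plain C: remember the input position, try the first
              alternative, and rewind if it fails.  This is a generic
              hand-written illustration, not the code antlr generates; the
              token names, the two alternatives, and all function names are
              hypothetical.  The example is deliberately tiny, whereas real
              uses target constructs that fixed lookahead cannot resolve.

```c
#include <stddef.h>

enum token { ID, LPAREN, RPAREN, ASSIGN, NUM, END };  /* hypothetical */

static const enum token *input;  /* token stream being parsed */
static size_t pos;               /* index of the current token */

/* Consume the current token if it equals t; report success. */
static int match(enum token t)
{
    if (input[pos] != t)
        return 0;
    pos++;
    return 1;
}

/* Alternative 1: ID LPAREN RPAREN (a call). */
static int call(void)
{
    return match(ID) && match(LPAREN) && match(RPAREN);
}

/* Alternative 2: ID ASSIGN NUM (an assignment). */
static int assignment(void)
{
    return match(ID) && match(ASSIGN) && match(NUM);
}

/* Try alternative 1 speculatively; on failure rewind and try
   alternative 2.  Returns 1 or 2 for the alternative taken, or 0 if
   neither matches. */
static int statement(const enum token *toks)
{
    size_t mark;

    input = toks;
    pos = 0;
    mark = pos;
    if (call())
        return 1;
    pos = mark;                  /* backtrack: undo the failed guess */
    return assignment() ? 2 : 0;
}
```

              No lookahead analysis is needed to choose between the two
              alternatives here; the cost is paid at run time, when a failed
              guess is rescanned.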

       -w1    Set  low  warning  level.   Do  not  warn if semantic predicates
              and/or (...)? blocks are assumed  to  cover  ambiguous  alterna-
              tives.

       -w2    Ambiguous  parsing  decisions  yield  warnings  even if semantic
              predicates or (...)? blocks are used.  Warn if predicate context
              computed   and  semantic  predicates  incompletely  disambiguate
              alternative productions.

       -      Read grammar from standard input and  generate  stdin.c  as  the
              parser file.


SPECIAL CONSIDERATIONS

       Antlr  works...  we think.  There is no implicit guarantee of anything.
       We reserve no legal rights to the software known as the Purdue Compiler
       Construction Tool Set (PCCTS); PCCTS is in the public domain.  An
       individual or company may do whatever they wish with source code
       distributed with PCCTS or the code generated by PCCTS, including the
       incorporation of PCCTS, or its output, into commercial software.  We
       encourage users to develop software with PCCTS.  However, we do ask
       that credit is given to us for developing PCCTS.  By "credit", we mean
       that if you incorporate our source code into one of your programs
       (commercial product, research project, or otherwise), you acknowledge
       this fact somewhere in the documentation, research report, etc.  If
       you like PCCTS and have developed a nice tool with the  output,  please
       mention that you developed it using PCCTS.  As long as these guidelines
       are followed, we expect to continue enhancing this system and expect to
       make other tools available as they are completed.


FILES

       *.c    output C parser.

       *.cpp  output C++ parser when C++ mode is used.

       parser.dlg
              output dlg lexical analyzer.

       err.c  token  string array, error sets and error support routines.  Not
              used in C++ mode.

       remap.h
              file that redefines all globally visible  parser  symbols.   The
              use of the #parser directive creates this file.  Not used in C++
              mode.

       stdpccts.h
              list of definitions needed by C files, not generated  by  PCCTS,
              that reference PCCTS objects.  This is not generated by default.
              Not used in C++ mode.

       tokens.h
              output #defines for tokens  used  and  function  prototypes  for
              functions generated for rules.


SEE ALSO

       dlg(1), pccts(1)



ANTLR                           September 1995                        ANTLR(1)
