1 files changed, 579 insertions, 0 deletions
diff --git a/ext/ply/CHANGES b/ext/ply/CHANGES
index 9c7334066..d88f3e5d6 100644
--- a/ext/ply/CHANGES
+++ b/ext/ply/CHANGES
@@ -1,3 +1,582 @@
+Version 2.3
+-----------------------------
+02/20/07: beazley
+          Fixed a bug with character literals if the literal '.' appeared as the
+          last symbol of a grammar rule.  Reported by Ales Smrcka.
+
+02/19/07: beazley
+          Warning messages are now redirected to stderr instead of being printed
+          to standard output.
+
+02/19/07: beazley
+          Added a warning message to lex.py if it detects a literal backslash
+          character inside the t_ignore declaration.  This is to help catch
+          problems that might occur if someone accidentally defines t_ignore
+          as a Python raw string.  For example:
+
+              t_ignore = r' \t'
+
+          The idea for this is from an email I received from David Cimimi who
+          reported bizarre behavior in lexing as a result of defining t_ignore
+          as a raw string by accident.
+
+02/18/07: beazley
+          Performance improvements.  Made some changes to the internal
+          table organization and LR parser to improve parsing performance.
+
+02/18/07: beazley
+          Automatic tracking of line number and position information must now be 
+          enabled by a special flag to parse().  For example:
+
+              yacc.parse(data,tracking=True)
+
+          In many applications, it's just not that important to have the
+          parser automatically track all line numbers.  By making this an 
+          optional feature, it allows the parser to run significantly faster
+          (more than a 20% speed increase in many cases).    Note: positional
+          information is always available for raw tokens---this change only
+          applies to positional information associated with nonterminal
+          grammar symbols.
+          *** POTENTIAL INCOMPATIBILITY ***
+	  
+02/18/07: beazley
+          Yacc no longer supports extended slices of grammar productions.
+          However, it does support regular slices.  For example:
+
+          def p_foo(p):
+              '''foo : a b c d e'''
+              p[0] = p[1:3]
+
+          This change is a performance improvement to the parser--it streamlines
+          normal access to the grammar values since slices are now handled in
+          a __getslice__() method as opposed to __getitem__().
+
+02/12/07: beazley
+          Fixed a bug in the handling of token names when combined with
+          start conditions.   Bug reported by Todd O'Bryan.
+
+Version 2.2
+------------------------------
+11/01/06: beazley
+          Added lexpos() and lexspan() methods to grammar symbols.  These
+          mirror the functionality of lineno() and linespan().  For
+          example:
+
+          def p_expr(p):
+              'expr : expr PLUS expr'
+              p.lexpos(1)     # Lexing position of left-hand expression
+              p.lexpos(2)     # Lexing position of PLUS
+              start,end = p.lexspan(3)  # Lexing range of right-hand expression
+
+11/01/06: beazley
+          Minor change to error handling.  The recommended way to skip characters
+          in the input is to use t.lexer.skip() as shown here:
+
+             def t_error(t):
+                 print "Illegal character '%s'" % t.value[0]
+                 t.lexer.skip(1)
+          
+          The old approach of just using t.skip(1) will still work, but won't
+          be documented.
+
+10/31/06: beazley
+          Discarded tokens can now be specified as simple strings instead of
+          functions.  To do this, simply include the text "ignore_" in the
+          token declaration.  For example:
+
+              t_ignore_cppcomment = r'//.*'
+          
+          Previously, this had to be done with a function.  For example:
+
+              def t_ignore_cppcomment(t):
+                  r'//.*'
+                  pass
+
+          If start conditions/states are being used, state names should appear
+          before the "ignore_" text.
+
+10/19/06: beazley
+          The Lex module now provides support for flex-style start conditions
+          as described at http://www.gnu.org/software/flex/manual/html_chapter/flex_11.html.
+          Please refer to this document to understand this change note.  Refer to
+          the PLY documentation for PLY-specific explanation of how this works.
+
+          To use start conditions, you first need to declare a set of states in
+          your lexer file:
+
+          states = (
+                    ('foo','exclusive'),
+                    ('bar','inclusive')
+          )
+
+          This serves the same role as the %s and %x specifiers in flex.
+
+          Once a state has been declared, tokens for that state can be
+          declared by defining rules of the form t_state_TOK.  For example:
+
+            t_PLUS = r'\+'          # Rule defined in INITIAL state
+            t_foo_NUM = r'\d+'      # Rule defined in foo state
+            t_bar_NUM = r'\d+'      # Rule defined in bar state
+
+            t_foo_bar_NUM = r'\d+'  # Rule defined in both foo and bar
+            t_ANY_NUM = r'\d+'      # Rule defined in all states
+
+          In addition to defining tokens for each state, the t_ignore and t_error
+          specifications can be customized for specific states.  For example:
+
+            t_foo_ignore = " "     # Ignored characters for foo state
+            def t_bar_error(t):
+                pass               # Handle errors in bar state
+
+          Within token rules, the following methods can be used to change states:
+          
+            def t_TOKNAME(t):
+                t.lexer.begin('foo')        # Begin state 'foo'
+                t.lexer.push_state('foo')   # Begin state 'foo', push old state
+                                            # onto a stack
+                t.lexer.pop_state()         # Restore previous state
+                t.lexer.current_state()     # Returns name of current state
+
+          These methods mirror the BEGIN(), yy_push_state(), yy_pop_state(), and
+          yy_top_state() functions in flex.
+
+          Start states can be used as one way to write sub-lexers.
+          For example, the lexer or parser might instruct the lexer to start
+          generating a different set of tokens depending on the context.
+          
+          example/yply/ylex.py shows the use of start states to grab C/C++ 
+          code fragments out of traditional yacc specification files.
+
+          *** NEW FEATURE *** Suggested by Daniel Larraz with whom I also
+          discussed various aspects of the design.
+
+10/19/06: beazley
+          Minor change to the way in which yacc.py was reporting shift/reduce
+          conflicts.  Although the underlying LALR(1) algorithm was correct,
+          PLY was under-reporting the number of conflicts compared to yacc/bison
+          when precedence rules were in effect.  This change should make PLY
+          report the same number of conflicts as yacc.
+
+10/19/06: beazley
+          Modified yacc so that grammar rules could also include the '-' 
+          character.  For example:
+
+            def p_expr_list(p):
+                'expression-list : expression-list expression'
+
+          Suggested by Oldrich Jedlicka.
+
+10/18/06: beazley
+          Attribute lexer.lexmatch added so that token rules can access the re 
+          match object that was generated.  For example:
+
+          def t_FOO(t):
+              r'some regex'
+              m = t.lexer.lexmatch
+              # Do something with m
+
+
+          This may be useful if you want to access named groups specified within
+          the regex for a specific token. Suggested by Oldrich Jedlicka.
+          
+10/16/06: beazley
+          Changed the error message that results if an illegal character
+          is encountered and no default error function is defined in lex.
+          The exception is now more informative about the actual cause of
+          the error.
+      
+Version 2.1
+------------------------------
+10/02/06: beazley
+          The last Lexer object built by lex() can be found in lex.lexer.
+          The last Parser object built by yacc() can be found in yacc.parser.
+
+10/02/06: beazley
+          New example added:  examples/yply
+
+          This example uses PLY to convert Unix-yacc specification files to
+          PLY programs with the same grammar.   This may be useful if you
+          want to convert a grammar from bison/yacc to use with PLY.
+    
+10/02/06: beazley
+          Added support for a start symbol to be specified in the yacc
+          input file itself.  Just do this:
+
+               start = 'name'
+
+          where 'name' matches some grammar rule.  For example:
+
+               def p_name(p):
+                   'name : A B C'
+                   ...
+
+          This mirrors the functionality of the yacc %start specifier.
+
+09/30/06: beazley
+          Some new examples added:
+
+          examples/GardenSnake : A simple indentation based language similar
+                                 to Python.  Shows how you might handle 
+                                 whitespace.  Contributed by Andrew Dalke.
+
+          examples/BASIC       : An implementation of 1964 Dartmouth BASIC.
+                                 Contributed by Dave against his better
+                                 judgement.
+
+09/28/06: beazley
+          Minor patch to allow named groups to be used in lex regular
+          expression rules.  For example:
+
+              t_QSTRING = r'''(?P<quote>['"]).*?(?P=quote)'''
+
+          Patch submitted by Adam Ring.
+ 
+09/28/06: beazley
+          LALR(1) is now the default parsing method.   To use SLR, use
+          yacc.yacc(method="SLR").  Note: there is no performance impact
+          on parsing when using LALR(1) instead of SLR. However, constructing
+          the parsing tables will take a little longer.
+
+09/26/06: beazley
+          Change to line number tracking.  To modify line numbers, modify
+          the line number of the lexer itself.  For example:
+
+          def t_NEWLINE(t):
+              r'\n'
+              t.lexer.lineno += 1
+
+          This modification is both cleanup and a performance optimization.
+          In past versions, lex was monitoring every token for changes in
+          the line number.  This extra processing is unnecessary for a vast
+          majority of tokens. Thus, this new approach cleans it up a bit.
+
+          *** POTENTIAL INCOMPATIBILITY ***
+          You will need to change code in your lexer that updates the line
+          number. For example, "t.lineno += 1" becomes "t.lexer.lineno += 1"
+         
+09/26/06: beazley
+          Added the lexing position to tokens as an attribute lexpos. This
+          is the raw index into the input text at which a token appears.
+          This information can be used to compute column numbers and other
+          details (e.g., scan backwards from lexpos to the first newline
+          to get a column position).
+
+09/25/06: beazley
+          Changed the name of the __copy__() method on the Lexer class
+          to clone().  This is used to clone a Lexer object (e.g., if
+          you're running different lexers at the same time).
+
+09/21/06: beazley
+          Limitations related to the use of the re module have been eliminated.
+          Several users reported problems with regular expressions containing
+          more than 100 named groups. To solve this, lex.py is now capable
+          of automatically splitting its master regular expression into
+          smaller expressions as needed.   This should, in theory, make it
+          possible to specify an arbitrarily large number of tokens.
+
+09/21/06: beazley
+          Improved error checking in lex.py.  Rules that match the empty string
+          are now rejected (otherwise they cause the lexer to enter an infinite
+          loop).  An extra check for rules containing '#' has also been added.
+          Since lex compiles regular expressions in verbose mode, '#' is interpreted
+          as a regex comment, so it is critical to use '\#' instead.
+
+09/18/06: beazley
+          Added a @TOKEN decorator function to lex.py that can be used to 
+          define token rules where the documentation string might be computed
+          in some way.
+          
+          digit            = r'([0-9])'
+          nondigit         = r'([_A-Za-z])'
+          identifier       = r'(' + nondigit + r'(' + digit + r'|' + nondigit + r')*)'        
+
+          from ply.lex import TOKEN
+
+          @TOKEN(identifier)
+          def t_ID(t):
+              return t    # Do whatever, then return the token
+
+          The @TOKEN decorator merely sets the documentation string of the
+          associated token function as needed for lex to work.  
+
+          Note: An alternative solution is the following:
+
+          def t_ID(t):
+              return t    # Do whatever, then return the token
+   
+          t_ID.__doc__ = identifier
+
+          Note: Decorators require the use of Python 2.4 or later.  If compatibility
+          with old versions is needed, use the latter solution.
+
+          The need for this feature was suggested by Cem Karan.
+
+09/14/06: beazley
+          Support for single-character literal tokens has been added to yacc.
+          These literals must be enclosed in quotes.  For example:
+
+          def p_expr(p):
+               "expr : expr '+' expr"
+               ...
+
+          def p_expr(p):
+               'expr : expr "-" expr'
+               ...
+
+          In addition to this, it is necessary to tell the lexer module about
+          literal characters.   This is done by defining the variable 'literals'
+          as a list of characters.  This should be defined in the module that
+          invokes the lex.lex() function.  For example:
+
+             literals = ['+','-','*','/','(',')','=']
+ 
+          or simply
+
+             literals = '+-*/()='
+
+          It is important to note that literals can only be a single character.
+          When the lexer fails to match a token using its normal regular expression
+          rules, it will check the current character against the literal list.
+          If found, it will be returned with a token type set to match the literal
+          character.  Otherwise, an illegal character will be signalled.
+
+
+09/14/06: beazley
+          Modified PLY to install itself as a proper Python package called 'ply'.
+          This will make it a little more friendly to other modules.  This
+          changes the usage of PLY only slightly.  Just do this to import the
+          modules:
+
+                import ply.lex as lex
+                import ply.yacc as yacc
+
+          Alternatively, you can do this:
+
+                from ply import *
+
+          Which imports both the lex and yacc modules.
+          Change suggested by Lee June.
+
+09/13/06: beazley
+          Changed the handling of negative indices when used in production rules.
+          A negative production index now accesses already parsed symbols on the
+          parsing stack.  For example, 
+
+              def p_foo(p):
+                   "foo : A B C D"
+                   print p[1]       # Value of 'A' symbol
+                   print p[2]       # Value of 'B' symbol
+                   print p[-1]      # Value of whatever symbol appears before A
+                                    # on the parsing stack.
+
+                   p[0] = some_val  # Sets the value of the 'foo' grammar symbol
+                                    
+          This behavior makes it easier to work with embedded actions within the
+          parsing rules. For example, in C-yacc, it is possible to write code like
+          this:
+
+               bar:   A { printf("seen an A = %d\n", $1); } B { do_stuff; }
+
+          In this example, the printf() code executes immediately after A has been
+          parsed.  Within the embedded action code, $1 refers to the A symbol on
+          the stack.
+
+          To perform this equivalent action in PLY, you need to write a pair
+          of rules like this:
+
+               def p_bar(p):
+                     "bar : A seen_A B"
+                     do_stuff
+
+               def p_seen_A(p):
+                     "seen_A :"
+                     print "seen an A =", p[-1]
+
+          The second rule "seen_A" is merely an empty production which should be
+          reduced as soon as A is parsed in the "bar" rule above.  The
+          negative index p[-1] is used to access whatever symbol appeared
+          before the seen_A symbol.
+
+          This feature also makes it possible to support inherited attributes.
+          For example:
+
+               def p_decl(p):
+                     "decl : scope name"
+
+               def p_scope(p):
+                     """scope : GLOBAL
+                              | LOCAL"""
+                     p[0] = p[1]
+
+               def p_name(p):
+                     "name : ID"
+                     if p[-1] == "GLOBAL":
+                          pass   # ...
+                     elif p[-1] == "LOCAL":
+                          pass   # ...
+
+          In this case, the name rule is inheriting an attribute from the
+          scope declaration that precedes it.
+       
+          *** POTENTIAL INCOMPATIBILITY ***
+          If you are currently using negative indices within existing grammar rules,
+          your code will break.  This should be extremely rare, if not non-existent, in
+          most cases.  The argument to grammar rules is not usually
+          processed in the same way as a list of items.
+          
+Version 2.0
+------------------------------
+09/07/06: beazley
+          Major cleanup and refactoring of the LR table generation code.  Both SLR
+          and LALR(1) table generation is now performed by the same code base with
+          only minor extensions for extra LALR(1) processing.
+
+09/07/06: beazley
+          Completely reimplemented the entire LALR(1) parsing engine to use the
+          DeRemer and Pennello algorithm for calculating lookahead sets.  This
+          significantly improves the performance of generating LALR(1) tables
+          and has the added feature of actually working correctly!  If you
+          experienced weird behavior with LALR(1) in prior releases, this should
+          hopefully resolve all of those problems.  Many thanks to 
+          Andrew Waters and Markus Schoepflin for submitting bug reports
+          and helping me test out the revised LALR(1) support.
+
+Version 1.8
+------------------------------
+08/02/06: beazley
+          Fixed a problem related to the handling of default actions in LALR(1)
+          parsing.  If you experienced subtle and/or bizarre behavior when trying
+          to use the LALR(1) engine, this may correct those problems.  Patch
+          contributed by Russ Cox.  Note: This patch has been superseded by
+          revisions for LALR(1) parsing in Ply-2.0.
+
+08/02/06: beazley
+          Added support for slicing of productions in yacc.  
+          Patch contributed by Patrick Mezard.
+
+Version 1.7
+------------------------------
+03/02/06: beazley
+          Fixed infinite recursion problem in the ReduceToTerminals() function that
+          would sometimes come up in LALR(1) table generation.  Reported by 
+          Markus Schoepflin.
+
+03/01/06: beazley
+          Added "reflags" argument to lex().  For example:
+
+               lex.lex(reflags=re.UNICODE)
+
+          This can be used to specify optional flags to the re.compile() function
+          used inside the lexer.   This may be necessary for special situations such
+          as processing Unicode (e.g., if you want escapes like \w and \b to consult
+          the Unicode character property database).   The need for this was suggested by
+          Andreas Jung.
+
+03/01/06: beazley
+          Fixed a bug with an uninitialized variable on repeated instantiations of parser
+          objects when the write_tables=0 argument was used.   Reported by Michael Brown.
+
+03/01/06: beazley
+          Modified lex.py to accept Unicode strings both as the regular expressions for
+          tokens and as input. Hopefully this is the only change needed for Unicode support.
+          Patch contributed by Johan Dahl.
+
+03/01/06: beazley
+          Modified the class-based interface to work with new-style or old-style classes.
+          Patch contributed by Michael Brown (although I tweaked it slightly so it would work
+          with older versions of Python).
+
+Version 1.6
+------------------------------
+05/27/05: beazley
+          Incorporated patch contributed by Christopher Stawarz to fix an extremely
+          devious bug in LALR(1) parser generation.   This patch should fix problems
+          numerous people reported with LALR parsing.
+
+05/27/05: beazley
+          Fixed problem with lex.py copy constructor.  Reported by Dave Aitel, Aaron Lav,
+          and Thad Austin. 
+
+05/27/05: beazley
+          Added outputdir option to yacc()  to control output directory. Contributed
+          by Christopher Stawarz.
+
+05/27/05: beazley
+          Added rununit.py test script to run tests using the Python unittest module.
+          Contributed by Miki Tebeka.
+
+Version 1.5
+------------------------------
+05/26/04: beazley
+          Major enhancement. LALR(1) parsing support is now working.
+          This feature was implemented by Elias Ioup (ezioup@alumni.uchicago.edu)
+          and optimized by David Beazley. To use LALR(1) parsing do
+          the following:
+
+               yacc.yacc(method="LALR")
+
+          Computing LALR(1) parsing tables takes about twice as long as
+          the default SLR method.  However, LALR(1) allows you to handle
+          more complex grammars.  For example, the ANSI C grammar
+          (in example/ansic) has 13 shift-reduce conflicts with SLR, but
+          only has 1 shift-reduce conflict with LALR(1).
+
+05/20/04: beazley
+          Added a __len__ method to parser production lists.  Can
+          be used in parser rules like this:
+
+             def p_somerule(p):
+                 """a : B C D
+                      | E F"""
+                 if (len(p) == 4):
+                     pass   # Must have been first rule (len(p) counts p[0])
+                 elif (len(p) == 3):
+                     pass   # Must be second rule
+
+          Suggested by Joshua Gerth and others.
+
+Version 1.4
+------------------------------
+04/23/04: beazley
+          Incorporated a variety of patches contributed by Eric Raymond.
+          These include:
+
+           0. Cleans up some comments so they don't wrap on an 80-column display.
+           1. Directs compiler errors to stderr where they belong.
+           2. Implements and documents automatic line counting when \n is ignored.
+           3. Changes the way progress messages are dumped when debugging is on. 
+              The new format is both less verbose and conveys more information than
+              the old, including shift and reduce actions.
+
+04/23/04: beazley
+          Added a Python setup.py file to simplify installation.  Contributed
+          by Adam Kerrison.
+
+04/23/04: beazley
+          Added patches contributed by Adam Kerrison.
+ 
+          -   Some output is now only shown when debugging is enabled.  This
+              means that PLY will be completely silent when not in debugging mode.
+          
+          -   An optional parameter "write_tables" can be passed to yacc() to
+              control whether or not parsing tables are written.   By default,
+              it is true, but it can be turned off if you don't want the yacc
+              table file. Note: disabling this will cause yacc() to regenerate
+              the parsing table each time.
+
+04/23/04: beazley
+          Added patches contributed by David McNab.  These patches add two
+          features:
+
+          -   The parser can be supplied as a class instead of a module.
+              For an example of this, see the example/classcalc directory.
+
+          -   Debugging output can be directed to a filename of the user's
+              choice.  Use
+
+                 yacc(debugfile="somefile.out")
+
+          
 Version 1.3
 ------------------------------
 12/10/02: jmdyck