1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
|
PLY (Python Lex-Yacc) Version 1.2 (November 27, 2002)
David M. Beazley
Department of Computer Science
University of Chicago
Chicago, IL 60637
beazley@cs.uchicago.edu
Copyright (C) 2001 David M. Beazley
$Header: /home/stever/bk/newmem2/ext/ply/README 1.1 03/06/06 14:53:34-00:00 stever@ $
This library is free software; you can redistribute it and/or
modify it under the terms of the GNU Lesser General Public
License as published by the Free Software Foundation; either
version 2.1 of the License, or (at your option) any later version.
This library is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
Lesser General Public License for more details.
You should have received a copy of the GNU Lesser General Public
License along with this library; if not, write to the Free Software
Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
See the file COPYING for a complete copy of the LGPL.
Introduction
============
PLY is a 100% Python implementation of the common parsing tools lex
and yacc. Although several other parsing tools are available for
Python, there are several reasons why you might want to consider PLY:
- The tools are very closely modeled after traditional lex/yacc.
If you know how to use these tools in C, you will find PLY
to be similar.
- PLY provides *very* extensive error reporting and diagnostic
information to assist in parser construction. The original
implementation was developed for instructional purposes. As
a result, the system tries to identify the most common types
of errors made by novice users.
- PLY provides full support for empty productions, error recovery,
precedence specifiers, and moderately ambiguous grammars.
- Parsing is based on LR-parsing which is fast, memory efficient,
better suited to large grammars, and which has a number of nice
properties when dealing with syntax errors and other parsing problems.
Currently, PLY builds its parsing tables using the SLR algorithm which
is slightly weaker than LALR(1) used in traditional yacc.
- Like John Aycock's excellent SPARK toolkit, PLY uses Python
reflection to build lexers and parsers. This greatly simplifies
the task of parser construction since it reduces the number of files
and eliminates the need to run a separate lex/yacc tool before
running your program.
- PLY can be used to build parsers for "real" programming languages.
Although it is not ultra-fast due to its Python implementation,
PLY can be used to parse grammars consisting of several hundred
rules (as might be found for a language like C). The lexer and LR
parser are also reasonably efficient when parsing typically
sized programs.
The original version of PLY was developed for an Introduction to
Compilers course where students used it to build a compiler for a
simple Pascal-like language. Their compiler had to include lexical
analysis, parsing, type checking, type inference, and generation of
assembly code for the SPARC processor. Because of this, the current
implementation has been extensively tested and debugged. In addition,
most of the API and error checking steps have been adapted to address
common usability problems.
How to Use
==========
PLY consists of two files : lex.py and yacc.py. To use the system,
simply copy these files to your project and import them like standard
Python modules.
The file doc/ply.html contains complete documentation on how to use
the system.
The example directory contains several different examples including a
PLY specification for ANSI C as given in K&R 2nd Ed. Note: To use
the examples, you will need to copy the lex.py and yacc.py files to
the example directory.
A simple example is found at the end of this document
Requirements
============
PLY requires the use of Python 2.0 or greater. It should work on
just about any platform.
Resources
=========
More information about PLY can be obtained on the PLY webpage at:
http://systems.cs.uchicago.edu/ply
For a detailed overview of parsing theory, consult the excellent
book "Compilers : Principles, Techniques, and Tools" by Aho, Sethi, and
Ullman. The topics found in "Lex & Yacc" by Levine, Mason, and Brown
may also be useful.
Given that this is the first release, I welcome your comments on how
to improve the current implementation. See the TODO file for things that
still need to be done.
Acknowledgments
===============
A special thanks is in order for all of the students in CS326 who
suffered through about 25 different versions of these tools :-).
Example
=======
Here is a simple example showing a PLY implementation of a calculator with variables.
# -----------------------------------------------------------------------------
# calc.py
#
# A simple calculator with variables.
# -----------------------------------------------------------------------------
tokens = (
'NAME','NUMBER',
'PLUS','MINUS','TIMES','DIVIDE','EQUALS',
'LPAREN','RPAREN',
)
# Tokens
t_PLUS = r'\+'
t_MINUS = r'-'
t_TIMES = r'\*'
t_DIVIDE = r'/'
t_EQUALS = r'='
t_LPAREN = r'\('
t_RPAREN = r'\)'
t_NAME = r'[a-zA-Z_][a-zA-Z0-9_]*'
def t_NUMBER(t):
r'\d+'
try:
t.value = int(t.value)
except ValueError:
print "Integer value too large", t.value
t.value = 0
return t
# Ignored characters
t_ignore = " \t"
def t_newline(t):
r'\n+'
t.lineno += t.value.count("\n")
def t_error(t):
print "Illegal character '%s'" % t.value[0]
t.skip(1)
# Build the lexer
import lex
lex.lex()
# Precedence rules for the arithmetic operators
precedence = (
('left','PLUS','MINUS'),
('left','TIMES','DIVIDE'),
('right','UMINUS'),
)
# dictionary of names (for storing variables)
names = { }
def p_statement_assign(t):
'statement : NAME EQUALS expression'
names[t[1]] = t[3]
def p_statement_expr(t):
'statement : expression'
print t[1]
def p_expression_binop(t):
'''expression : expression PLUS expression
| expression MINUS expression
| expression TIMES expression
| expression DIVIDE expression'''
if t[2] == '+' : t[0] = t[1] + t[3]
elif t[2] == '-': t[0] = t[1] - t[3]
elif t[2] == '*': t[0] = t[1] * t[3]
elif t[2] == '/': t[0] = t[1] / t[3]
def p_expression_uminus(t):
'expression : MINUS expression %prec UMINUS'
t[0] = -t[2]
def p_expression_group(t):
'expression : LPAREN expression RPAREN'
t[0] = t[2]
def p_expression_number(t):
'expression : NUMBER'
t[0] = t[1]
def p_expression_name(t):
'expression : NAME'
try:
t[0] = names[t[1]]
except LookupError:
print "Undefined name '%s'" % t[1]
t[0] = 0
def p_error(t):
print "Syntax error at '%s'" % t.value
import yacc
yacc.yacc()
while 1:
try:
s = raw_input('calc > ')
except EOFError:
break
yacc.parse(s)
|