LALR Parsers
Why LALR when CLR is the most
powerful parser?
- The table obtained by LALR is smaller than the corresponding CLR table
- Most of the syntactic constructs can be conveniently expressed by using LALR
- It is much easier and economical to construct compared to CLR
- Using LALR we can construct parsers for some constructs on which SLR fails
- With the same number of states that of SLR, LALR can parse more construct
Note: For a typical programming language the number
of states for a SLR and LALR parser are equal and CLR for the same language
will have around 10 times the number of states present in SLR or LALR.
Constructing LALR parser from
LR(1) Items:
- An Lr(1) item is a combination of LR(0) item(core) followed by a lookahead as shown
[A → α.Bβ,
a]
Where α is core , B is lookahead
- Now, the steps to construct LALR(1) collection of items and LALR parsing table from LR(1) items is as follows
Merge the items with the same
core. Suppose Ii and Ij sets have the same core then merge them and create the
set of items Iij
As goto function depends on only
core, merge further the items produced in step1 if they have the same core
When no merging is possible,
modify the goto and action entries of the CLR parsing table accordingly
These steps can be described in a
more formal form as:
- Construct C ={I0,I1,…In}, the collection of sets of LR(1) itesm
- For each core present among the set of LR(1) items. The parsing actions for state ‘I’ are constructed from Ji. If there is a parsing action conflict, the algorithm fails to produce a parser and the grammar is said not to be LALR(1)
- The goto table is constructed as follows. If J is the union of one or more sets of LR(1) items. That is J = I1 ∪ I2 ∪ …… ∪Ik, then the cores of goto(I1, X), goto(I2, X),…..goto(Ik, X) are the same, since I1,I2…..,Ik all have the same core. Let K be the union of all sets of items having the same core as goto(I1, X). then goto(J, X) = K
Note:
The
table constructed by using the above algorithm is known as LALR parsing table
If
the LALR table has no conflicts the grammar is known as LALR(1)