Predictive Parser - LL(1) Parser
Structure of LL(1) Parser
As shown the parser program works with the following 3 components to produce output
INPUT: Contains string to be parsed with $ as it's end marker
STACK: Contains sequence of grammar symbols with $ as it's bottom marker. Initially stack contains only $
PARSING TABLE: A two dimensional array M[A,a], where A is a non-terminal and a is a Terminal
It is a tabular implementation of the recursive descent parsing, where a stack is maintained by the parser rather than the language in which parser is written
Important Points:
It eliminates the need for recursion facility in the language used for implementing parser
An overhead with this technique is explicit maintenance of stack
Implementation of a Predictive Parser:
Predictive parser use an algorithm to parse the input string, which makes use of entries in a table which are filled by using another algorithm using two important set computed for the given grammar
FIRST:
What is FIRST of string of symbols?
If ∝ is any string of grammar symbols, then the set FIRST(∝) is defined as, the set of terminals that begin strings derived from ∝
If ∝⇒* ε then ε is also in FIRST(∝)
What is FIRST of Grammar Symbols?
FIRST(X) for all grammar symbols X, use the following rules
until no more terminal or ε can be added to any FIRST set
If X is
a terminal then FIRST(X) = {X}
If X is
a non-terminal and X → Y1, Y2, ...... Yk is a production for some k ≥ 1, then
the place a in FIRST(X) if for some i, a is in FIRST(Y1), and ε is in all all
of FIRST(Y1),.. FIRST(Yi+1); that is Y1,....Yi-1 ⇒ ε. IF ⇒ is the FIRST(Yj) for all j = 1,2,....,k, then add ε to
FIRST(X). If Y1 does not derive ε, then we add nothing more to FIRST(X), but if
Y1 ⇒ ε, then we add FIRST(Y2) and so on
If X → ε is a production, then add ε to FIRST(X)
Now, we
can compute FIRST for any string X1,X2...Xn as follows
Add to
FIRST(X1,X2,...Xn) al non ε symbols of FIRST(X1)
Also
add the non ε symbols of FIRST(X2), if ε is in FIRST(X1); the non ε symbols of
FIRST(X3), if ε is in FIRST(X1) and FIRST(X2); and son on.
Finally,
add ε to FIRST(X1,X2,...Xn) if all i, ε is in FIRST(Xi)
Computing FIRST Example 1:
E
→ TE'
E'
→ +TE' | ε
T
→ FT'
T'
→ *FT' | ε
F
→ (E) |id
Variable's
are E, E', T, T' and F Terminals are +, *, (, ), ε and id
Now
FIRST(E)
= FIRST(T) = FIRST(F) = {(, id}
FIRST(E')
= {+, ε}
FIRST(T')
= {*, ε}
Computing FIRST Example 2:
S →
iCtSS' | a
S' → eS
| ε
C → b
In the
given Grammar, Variables are S, S' and & Terminals are i, t , a, ε and b
Now
FIRST(S) = {i, a}
FIRST(S') = {e, ε}
FIRST(C) = {b}
FOLLOW
What is
FOLLOW of Grammar Symbols?
FOLLOW of a
non-terminal A can be defined as a set of:
- Terminals a that can appear immediately to the right of A in some sentential form, S⇒* αAaβ for some α and β and
- $, if A can be the right most symbol in some sentential form
Computing
FOLLOW of Grammar Symbols?
For any
non-terminal A, the set FOLLOW(A) is computed by the following rules:
- $ is in FOLLOW(A), where A is the start symbol
- If there is a production of the form A → αBβ, β ≠ ε, then everything in FIRST(β) but ε is FOLLOW(B)
- If there is a production of the A → αB, or a production A → αBβ where FIRST(β) contains ε, then everything in FOLLOW(A) is in FOLLOW(B)
Computing FOLLOW Example:
S → iCtSS' | a
S' → eS | ε
C → b
In the given grammar
Variables are S, S' and C and Terminals are i, t, a, ε and b
Now
FOLLOW(S) = {$, e} [by the rule 1 and 2]
FOLLOWS(S') = {e, $} [by rule 3]
FOLLOW(C) = {t} [by rule 2]
S → iCtSS' | a
S' → eS | ε
C → b
In the given grammar
Variables are S, S' and C and Terminals are i, t, a, ε and b
Now
FOLLOW(S) = {$, e} [by the rule 1 and 2]
FOLLOWS(S') = {e, $} [by rule 3]
FOLLOW(C) = {t} [by rule 2]