Overview:
Step 1: Keep a pointer just before the first symbol in the
input string to be parsed
Step 2: Expand the start symbol
Step 3: Compare the terminal after the pointer with the
unmatched leaf(from left to right, initially it is the first leaf).
Here there may be three cases,
- The leaf is a non-terminal: Then expand it by its production
- A matching terminal: Advance the pointer by one position
- A non-matching terminal: Do not advance pointer, but
Undo
the current expansion/s and
Move
pointer backwards basing on the matched symbols in the undone expansions and Expand
the non-terminal by its other alternative if it exists
Step 4: Repeat step 2 until the
entire parse tree is constructed or failure occurs
Example:
Consider the grammar S→ cAd A → ab | a and
parsing the string w = cad
Initially we maintain an input
pointer to c in w and expand S by cAd as mentioned below
Now, the left most leaf c is matched with first symbol in w. So, we advance the input pointer to a, the second symbol in w and consider the next leaf
As the next leaf is a non-terminal A, expand it as displayed below
Left Recursion which is immediate
Now, the left most leaf c is matched with first symbol in w. So, we advance the input pointer to a, the second symbol in w and consider the next leaf
As the next leaf is a non-terminal A, expand it as displayed below
Now, we have a match for the second symbol that is a in w. So, we advance the input pointer to d.
Example:
Now, there is a mismatch between leaf b and d. So, we go back to A to expand it by another alternative
In going back to A, we must reset the pointer in w to point again to a
Now, we have a match for the second symbol that is a in w and third symbol d in w
So, we halt and announce successful completion of parsing of string w
In going back to A, we must reset the pointer in w to point again to a
So, we halt and announce successful completion of parsing of string w
Limitations:
It can't tolerate left recursion, which leads the parser into an infinite loop
The major problem is backtracking, which involves undoing of semantic effects like removing entries from the symbol table
The order in which the alternatives are tried can effect the language accepted
For Example, if we try alternative a for A first for string cabd
As S has no other alternative we report a failure even though cabd is in the language
When a failure is reported we have a little idea about where it actually occurred
Left Recursion:
A grammar G is said to be left recursive if it has an
non-terminal A such that there is a derivation of the form A=> Aa for some ∝
Examples for Left Recursive Grammars
S → Sa | b
S → Ab A → Ad | Sa | e
S → Ac A→Ab | Bc | a B → Bb | Aa | Sc | b
Types of Left Recursion:
This is indicated by production of the form A→A∝
Example: S →
Sa | b
Left Recursion Which is not immediate
Some productions cause left recursion after two or more
steps in the derivation process
Example:
Grammar S →
Ab A → Ad | Sa | e
S ⇒ A b ⇒ Sab
(S is left recursive)
Elimination of Left Recursion which is immediate:
The immediate left recursion among some A productions of the form,
The immediate left recursion among some A productions of the form,
A→A∝1
| A∝2 | ……. | A∝m | β1 | β2
| ….| βn
Where no βi
begins with A can be eliminated by replacing A productions by the following two
productions
Where no βi begins with A can be eliminated by replacing A productions by the following two productions,
A→β1A'|β2A'|......|βnA'
A'→∝1A'|∝2A'|....|∝mA'|∈
How to eliminate Backtracking?
The solution is, one must know that the given current input symbol a and the non terminal A are to be expanded in which one of the alternates of the production
A→β1|β2|......|βn
is the unique alternate that derives a string beginning with a
In other terms, the proper alternate is detectable by looking at the first symbol it derives
Example 1:
Consider the grammar S→aS | bS | c
If the current input symbol is a, then we can easily say that the production S→aS has to be used
Example 2:
Consider the grammar S→aS | aaS | aSa | b
If the current input symbol is a, then we can't decide uniquely which production is to be used
Left Factoring:
The left factored grammar for
A→∝β1|∝β2|......|∝βn|γ
where γ represents all alternates that do not begin with ∝ is,
A→∝A'|γ
A'→β1|β2|......|βn
where A' is the new non terminal
Where no βi begins with A can be eliminated by replacing A productions by the following two productions,
A→β1A'|β2A'|......|βnA'
A'→∝1A'|∝2A'|....|∝mA'|∈
Where A' is anew non-terminal introduced
Example 1: S→ Sa |b
After eliminating left recursion S→bS' S'→ aS'|∈
Example 2: E→E+T | T T→T*F|F F→(E)|id
Here only E and T are left recursive, after eliminating left recursion
E→TE' E'→+TE'|∈
T→FT' T'→FT'|∈
F→(E)|id
Elimination of Left Recursion which is not Immediate:
The following algorithm is used to eliminate left recursion which is not guide
1) arrange the non-terminals in some order A1,A2,....An
2) for(each i from 1 to n) {
3) for(each j from 1 to i-1) [
4) replace each production of the form Ai →Ajy by the
Ai→δ1y| δ2y|……| δky, where
Aj→δ1| δ2|…..| δk are all current Aj-productions
5) }
6) eliminate the immediate left recursion among the Ai-productions
7) }
How to eliminate Backtracking?
The solution is, one must know that the given current input symbol a and the non terminal A are to be expanded in which one of the alternates of the production
A→β1|β2|......|βn
is the unique alternate that derives a string beginning with a
In other terms, the proper alternate is detectable by looking at the first symbol it derives
Example 1:
Consider the grammar S→aS | bS | c
If the current input symbol is a, then we can easily say that the production S→aS has to be used
Example 2:
Consider the grammar S→aS | aaS | aSa | b
If the current input symbol is a, then we can't decide uniquely which production is to be used
Left Factoring:
The left factored grammar for
A→∝β1|∝β2|......|∝βn|γ
where γ represents all alternates that do not begin with ∝ is,
A→∝A'|γ
A'→β1|β2|......|βn
where A' is the new non terminal
Example 1: Grouping of common prefixes of alternative
Consider
the grammar, which is not left factored
S
→ aS | aaS | aSa | b
Common
prefix is a
After
left factoring,
S
→ aS' | b
S'→
S | aS | Sa
Example 2: Consider the grammar (dangling else problem),
which is not left factored
S
→ iEts | iEtSeS | a
E
→ b
Common
prefix is iEtS
After left factoring S
→ iEtSS' | a
S'
→ eS | ε
E
→ b