LR Parsers

Why LR Parsers

In LR parsing, R stands for ‘Rightmost derivation in reverse order’
They can be constructed to recognize all programming language constructs for which a CFG can be written
It is more general than any other shift reduce technique and can be implemented efficiently
I can detect syntactic errors as soon as it is possible to so on a left to right scan of the input

What is LR(K)?

It stands for left to right scan of the input to generates the right most derivation in reverse using k symbol look ahead
Generally k is considered to be ≥ 1. If k is omitted, it is considered as 1
There are three different types of LR Parsers:
Simple LR(SLR): it is very easy to implement, but may fail to produce parsers for certain grammars
Canonical LR(CLR): It is the most powerful LR technique and will work on a very large class of grammars. Unfortunately it is very expensive to implement
Look Ahead LR(LALR): Its power is intermediate in between SLR and CLR. This will work on most programming language grammars and can be implemented efficiently
So regarding their strength the order among LR parser is LR < LALR < CLR

Why LR parsers are better than LL parsers?

In LR(k), one must be able to recognize the occurrence of right side of a production in a right sentential form by using k input symbol look ahead
In LL(k), one must decide on which production to use by seeing only the first k symbols of its right side derives
The first requirement is less stringent than the second one thereby making LR grammars to describe more languages than LL grammars. That is LR grammars are superset of LL grammars
Any parser is powerful than top down parser

How to implement LR parser

It is very difficult to build an LR parser by hand for a programming language grammar
We need special tools to generate such parsers automatically by taking the CFG as input
One example for such tool is YACC (Yet Another Compiler Compiler)
These tools are very efficient and can report on

Ambiguities in the grammar

Constructs that are difficult to parse in a left to right scan of the input

Design of LR Parser

LR Parsers use an algorithm to parse the input string which makes use of entries in a table, which are filled by using another algorithm
For SLR parser, the table entries are filled by using an algorithm that use a set of items known as canonical collection of LR(1) items

Note: All LR parsers use the same algorithm to parse their input, but differ only in the way the entries in the table are computed.

What is an LR(0) Item?

It is defined as a production in a grammar G with a dot at some position of the right side

Example 1: The production A → .XYZ A → X.YZ A → XY.Z A → XYZ.

Each item says how much of a production ε have seen at a given point in the parsing process. For example, the first item says that we are expecting a string derivable form XYZ. The second item says that we have just seen on the input string derivable from X and we next expect to see the string derivable from YZ

Example 2: The production A → ε generates only one item A → .

Home

Recursive Best First Search (RBFS) of A* Algorithm