Structure of LR Parser
The parser has four important components:
INPUT
Contains the string to be parsed and pointer
PARSING TABLE
Contains two parts ACTION and GOTO which is used by the
parser program
STACK
Contains string of the form s0X1s1X2…..Xmsm where sm is on
the top of the stack. Each Xi denotes a grammar symbol and si a state. Each
state symbol summarizes the information contained in the stack below it
PARSER PROGRAM
This determines the state on the sm on the top of the stack
and the current input symbol a to determine the next step
Working of LR Parser Program (Common for all LR parsers)
After reading the stack top state sm and the current input
symbol a, the parser consults the action [sm ,a] entry in the parsing table,
which can cause one of the four changes in the parser configuration (Assuming (s0X1s1X2…..Xmsm, ai,ai+1,….an$) is the current configuration of the parser.
Configuration contains of stack and input)
If actions [sm, ai] = shift s, the parser executes a shift
move, entering the configuration
(Assuming
(s0X1s1X2…..Xmsm, ai,ai+1,….an$)
Here the parser has shifted both the current input symbol ai
and the next state s, which is given in action [sm, ai], onto the stack; ai+1 becomes the current input symbol.
If actions [sm, ai] =
reduce A →β, then the parser executes a reduce move, entering the configuration
(s0X1s1X2…..Xmsm, ai,ai+1,….an$)
Where s = goto[sm-r , A] and r is the length of β, the right side of the
production. Here the parser first popped 2r symbols off the stack (r state
symbols and r grammar symbols), exposing state
sm-r . The parser then pushed both A, the left side of
the production and s, the entry for goto[sm-r, A], onto
the stack. The current position of input pointer is not changed in a reduce
move. For the LR parsers we shall construct Xm-r+1 ….Xm , the sequence of grammar symbols popped off the stack, will
always match β, the right side of the reducing production.
If action [sm , ai] = accept, parsing is completed
If action [sm , ai] = error, the parser has discovered an
error and calls an error recovery routine
Note: Initially, only the initial
state is placed on the stack.
The method described can be framed as an algorithm given
below:
Set ip
to point to the first symbol of w$;
Repeat forever begin
Let s
be the state on top of the stack and a the symbol pointed to by ip;
if action[s, a] = shift s’ then begin
push a
then s’ on top of the stack;
advance
ip to the next input symbol
end
else if action [s, a] = reduce A → β then
begin
pop 2 *
|β| symbols off the
stack;
let s’
be the state now on top of the stack;
push A
then goto [s’ , A] on top of the stack;
output
the production A → β
end
else if action [s, a] = accept then
return
else error()
end