Visual PCYACC
Developing and Debugging with
Visual Pcyacc
by
Y. Jenny Luo
PCYACC® is a software product of ABRAXAS SOFTWARE INC.
For more information, contact
ABRAXAS SOFTWARE INC.
Post Office Box 42363
Portland, OR 97242 USA
TEL: 503-802-0810
FAX: (206) 309-0304
Internet: support@abxsoft.com
URL: http://www.abxsoft.com
Copyright© 1984-2008 by ABRAXAS SOFTWARE INC
I. OVERVIEW
II. LR Bottom-Up Parser
1. Definitions and Introductions
2. LR Bottom-up Parser
3. Example
III. How PCYDB Works
1. States
2. State Actions
a. Action: Shift to a new state
b. Action: Reduce one or more input tokens to a single nonterminal symbol, according to a grammar rule
c. Action: Go to a new state
d. Action: Accept the input
e. Action: Find an error
IV. Using Text Version PCYDB
1. Invoking PCYDB
2. Quitting PCYDB
V. PCYDB Function
1. BREAKSTATE
2. BREAKTOKEN
3. CLEARBREAK
4. GENSTATE
5. GO
6. HELP
7. INIT
8. LOADTOKEN
9. LOADSRC
10. QUIT
11. SAVE
12. SETGDF
14. STACK
15. STATE
16. STEP
18. SYMBOL
VI. How to Use the Parse Tree
VII. How to Use the Parsing Stack
VIII. How to Use Conflict Parse Trees
IX. How to Use Grammar Rule Matches
X. How to Use Regular Expression Matches
XI. How to Control the Flow of Your Input Data
XII. How to Use Parsing Tables
XIII. PCYPP – Handle Preprocessor and Comment in Integration of GDF and SDF
1. Separate *.ey file into *L.l and *Y.y files
2. Support minimum preprocessor
2. Support comment inside GDF and SDF
XIV. Using GUI Version PCYDB
1. Invoking GUI PCYDB
a. Select description source files for YACC Debugger
b. Select input source file for YACC debugger
c. Setting State Breakpoint for YACC Debugger
d. Setting Token Breakpoint for YACC Debugger
e. Single-Step Execution
f. Execute Until a Breakpoint or EOF Is Hit
g. Restart YACC Debugger
1. Quitting GUI PCYDB
2. How to Use the Parse Tree
3. How to Use the Parsing Stack
4. How to Use Conflict Parse Trees
5. How to Use Grammar Rule Matches
6. How to Use Regular Expression Matches
7. How to Control the Flow of your Input
8. How to Use Parsing Tables
9. Conclusion
GUI-based YACC debugging has recently become more popular than ever. Although a GUI application can provide a user-friendly interface, it tends to be slow, mainly because of graphics-library overhead, which is usually far less efficient than the YACC code itself. A stand-alone, portable, and efficient YACC debugger is therefore increasingly important for programmers who use parsing and lexing tools to build their own compilers and want a quick implementation. To meet this need, ABRAXAS SOFTWARE provides a powerful interactive YACC debugger called PCYDB.
PCYDB is a YACC debugger with both command-line and GUI front ends. It uses the most advanced lexing and parsing techniques available to bring everything inside the parsing process to your fingertips. It lets you stop parsing at any point, examine and change the grammar file, and “single step” through the parse. While execution is paused, the internal data of the parser can be displayed and examined to pinpoint problems. PCYDB provides several important functions that you can benefit from when building your own parser:
• See the Parse Tree
• See the Parsing Stack
• See Conflict Parse Trees
• See Grammar Rule Matches
• See Regular Expression Matches
• See the Flow of your Input Data
• See Various Tables
Detailed descriptions of these functionalities will be presented in their respective chapters.
LR parsing is currently the most popular parsing technique. The method is called bottom-up because it constructs a parse tree for an input string beginning at the leaves (the bottom) and working up toward the root (the top). It scans the input from left to right and constructs a rightmost derivation in reverse. There are several reasons for its popularity.
• LR parsers can be generated for virtually every programming-language construct that can be described by a context-free grammar.
• The LR parsing technique is more general than any of the other common shift-reduce techniques, yet it can be implemented as efficiently as those methods.
• Scanning the input from left to right, an LR parser detects a syntax error as soon as it is possible to do so.
However, implementing an LR parser efficiently is not an easy task. Fortunately, ABRAXAS Software provides PCYACC, an LR parser generator that spares you the work of writing an LR parser by hand for a typical programming-language grammar. PCYACC generates deterministic bottom-up parsers. The generated parser starts with the first word of the program source code, which is recognized internally as a token, and tries to match a syntactic structure against a string of tokens. When a string of tokens matches the right-hand side of a production in the context-free grammar, the parser reduces it to the nonterminal on the left-hand side. The parser then alternates between reading the next input symbol and performing as many reductions as necessary. The number of reductions needed is determined by the result of the preceding reductions and by a fixed-length section of the remaining input. The bottom-up parser finishes its job when it has read all of its input and reduced it to the start symbol of the context-free grammar.
Syntax-analysis parsers are based on the theory of automata and formal languages. The theorem that lays the foundation for syntax parsing concerns the relationship between context-free grammars and pushdown automata:
(1) For every context-free grammar, a pushdown automaton can be constructed that accepts the language defined by the grammar.
(2) The language accepted by a pushdown automaton is context-free, and therefore has a context-free grammar (which is even effectively constructible).
Because a pushdown automaton can be constructed for any language defined by a context-free grammar, almost all computer programming languages are defined using context-free grammars. An LR parser is a realization of a pushdown automaton that accepts the language specified by a context-free grammar.
An LR parser consists of a parsing table and a driver routine. The parsing table is generated from the context-free grammar of a language by a parser generator. The driver routine makes sure that the parser executes according to the parsing table. The driver routine is the same for all LR parsers; only the parsing table changes from one parser to another. The parser generator usually copies the driver routine into the parser code as well. The parsing table is the key component of an LR parser because it determines the characteristics of the parser. Figure 2-1 shows how the parsing table is generated and how the parser uses it.
[Figure: (a) Parsing table generation: Grammar → Parser Generator (PCYACC) → Parsing Table. (b) Parser functionality: Input → Driver Routine, which consults the Parsing Table → Output.]
Figure 2-1. Parsing Table and Parser Functionality
Abraxas’s PCYACC generates an LALR (Look-Ahead LR) parser. It builds the parsing tables from an input grammar description file. The resulting LALR parser is fairly powerful and can be implemented efficiently.
Figure 2-2 shows the internal operation of an LR parser. The parser comprises an input, a stack, a driver routine, and a parsing table. The parsing table has two parts, action and goto. The input is a stream of tokens.
[Figure: the input t_1 ... t_i ... t_n $ feeds the Driver Routine, which consults the Parsing Table and maintains a Stack holding, from top to bottom, s_m, X_m, s_{m-1}, X_{m-1}, ..., s_0.]
Figure 2-2. Diagram of LR bottom-up parser
Tokens are passed to the parser by a lexer. Each time the parser needs a token, it calls the lexer, which reads the input source code and translates it into tokens. For simplicity, Figure 2-2 shows the input as an integer array that represents every token of the input stream in its original order; the lexer itself is omitted. The driver routine reads the tokens from this input one at a time, from left to right. It populates the stack in the form s_0 X_1 s_1 X_2 s_2 ... X_m s_m, where s_m is on top of the stack. Each X_i is a grammar symbol and each s_i is a state symbol. A state symbol summarizes the information held in the stack below it. The state symbol on top of the stack, together with the current input symbol (token), indexes the parsing table and so determines the shift-reduce parsing decision. In an actual implementation the grammar symbols need not be put on the stack; they are included here to help describe the operation of an LR parser.
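The hand-off from lexer to parser can be sketched in a few lines of Python. This is only an illustration: the function name and the integer token codes are invented here and are not the codes that PCYACC or PCLEX assign.

```python
import re

# Hypothetical token codes; a real generator assigns its own numbers.
TOKEN_CODES = {"id": 1, "+": 2, "*": 3, "(": 4, ")": 5, "$": 0}

# One alternative per token class: identifiers, then single-character operators.
TOKEN_RE = re.compile(r"\s*(?:([A-Za-z_]\w*)|([+*()]))")


def tokenize(source):
    """Translate source text into a list of integer token codes, in input order."""
    codes = []
    pos = 0
    source = source.rstrip()
    while pos < len(source):
        match = TOKEN_RE.match(source, pos)
        if match is None:
            raise ValueError(f"unexpected character at offset {pos}")
        name = "id" if match.group(1) else match.group(2)
        codes.append(TOKEN_CODES[name])
        pos = match.end()
    codes.append(TOKEN_CODES["$"])  # end-of-input marker
    return codes


print(tokenize("a * (b + c)"))
```

The parser never sees the spelling of an identifier here, only its code; a lexer for a real grammar would also pass the token value alongside the code.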
The parsing table has two parts, the ACTION function and the GOTO function. The driver routine takes the state s_m currently on top of the stack, which summarizes the information saved on the stack below it, and reads the current input token t_i. It then consults ACTION[s_m, t_i], the parsing action table entry for state s_m and input token t_i. This entry can have one of four values:
· shift s
· reduce A → β
· accept
· error
The GOTO function determines the next state from the state on top of the stack and the current grammar symbol. It is essentially the transition table of a deterministic finite automaton whose input symbols are the terminals and nonterminals of the grammar.
A configuration of an LR parser consists of the stack contents followed by the remaining, not yet consumed token stream, as shown below:
( s_0 X_1 s_1 X_2 s_2 ... X_m s_m, t_i t_{i+1} ... t_n $ )
The parser decides on its next action by examining the state s_m on top of the stack and the current input token t_i. The parsing table entry ACTION[s_m, t_i] selects one of four types of action:
· If ACTION[s_m, t_i] = shift s, the parser executes a shift, after which the configuration is
( s_0 X_1 s_1 X_2 s_2 ... X_m s_m t_i s, t_{i+1} ... t_n $ )
Here s = GOTO[s_m, t_i] is the next state, determined by the current state s_m and the current input token t_i. The current input token and the next state are both shifted onto the stack, and t_{i+1} becomes the new current input token.
· If ACTION[s_m, t_i] = reduce A → β, the parser executes a reduce, after which the configuration is
( s_0 X_1 s_1 X_2 s_2 ... X_{m-r} s_{m-r} A s, t_i t_{i+1} ... t_n $ )
where s = GOTO[s_{m-r}, A] is determined by state s_{m-r} and the left-hand side A of the production. Here r is the length of β, the number of terminals and nonterminals on the right-hand side of the production. The parser pops r state symbols and r grammar symbols off the stack, leaving state s_{m-r} on top. It then pushes A, the left-hand side of the production, followed by the next state s given by GOTO[s_{m-r}, A]. A reduce action does not change the current input token. The sequence of grammar symbols popped off the stack,
X_{m-r+1} ... X_m,
always matches β, the right-hand side of the production used in the reduction.
· If ACTION[s_m, t_i] = accept, parsing is complete: the entire input has been reduced to the start symbol.
· If ACTION[s_m, t_i] = error, the parser has detected an error in the input token string and calls an error-handling routine to display messages and recover from the error.
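The two configuration changes described above can be written as small functions on a (stack, input) pair. This is a minimal sketch: the function names are chosen here for illustration, and the state numbers and the single GOTO entry in the usage lines are arbitrary values, not taken from any generated table.

```python
def shift(stack, tokens, next_state):
    """Shift: move the current token and the new state onto the stack."""
    token, rest = tokens[0], tokens[1:]
    return stack + [token, next_state], rest


def reduce_by(stack, tokens, lhs, rhs_len, goto):
    """Reduce by lhs -> beta, where beta has rhs_len symbols.

    Pops rhs_len grammar symbols and rhs_len states, then pushes lhs and
    the state GOTO[exposed state, lhs]. The input is left unchanged.
    """
    base = stack[: len(stack) - 2 * rhs_len]
    exposed_state = base[-1]
    return base + [lhs, goto[(exposed_state, lhs)]], tokens


# Illustrative values only: state numbers and the GOTO entry are made up.
goto = {(0, "F"): 3}
stack, tokens = shift([0], ["id", "*", "id", "$"], 5)
stack, tokens = reduce_by(stack, tokens, "F", 1, goto)
print(stack, tokens)
```

Both functions return a new configuration instead of modifying the old one, which keeps each step easy to compare with the configurations written out in the text.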
The algorithm the LR parser follows is very simple. It starts in a designated initial state s_0 with the initial configuration
( s_0, t_1 t_2 ... t_n $ )
where t_1 t_2 ... t_n is the token string to be parsed. The parser determines its next action from the current state and the current input token. This process repeats until the parser reaches an accept action or an unrecoverable error. Almost all LR parsers behave the same way; they differ only in the parsing table, which specifies the next state and the next action.
To illustrate the operation of an LR parser, we will use a very simple example. The simple grammar we will use in this example is:
(1) E → E + T
(2) E → T
(3) T → T * F
(4) T → F
(5) F → ( E )
(6) F → id
Assume that a parser generator such as PCYACC has already generated both the driver routine and the parsing table for us. The parsing table, which specifies the action and goto functions of the LR parser, is shown in Figure 2-3.
State | Action                            | Goto
      | id  | +   | *   | (   | )   | $   | E | T | F
0     | s5  |     |     | s4  |     |     | 1 | 2 | 3
1     |     | s6  |     |     |     | acc |   |   |
2     |     | r2  | s7  |     | r2  | r2  |   |   |
3     |     | r4  | r4  |     | r4  | r4  |   |   |
4     | s5  |     |     | s4  |     |     | 8 | 2 | 3
5     |     | r6  | r6  |     | r6  | r6  |   |   |
6     | s5  |     |     | s4  |     |     |   | 9 | 3
7     | s5  |     |     | s4  |     |     |   |   | 10
8     |     | s6  |     |     | s11 |     |   |   |
9     |     | r1  | s7  |     | r1  | r1  |   |   |
10    |     | r3  | r3  |     | r3  | r3  |   |   |
11    |     | r5  | r5  |     | r5  | r5  |   |   |
Figure 2-3. Parsing table for LR Bottom-up Parser.
The meanings of the actions are:
· si means shift and stack state i,
· rj means reduce by production numbered j,
· acc means accept,
· blank means error.
For a terminal token t, the next state GOTO[s, t] is given in the action field, as part of the shift action for state s on input t. The goto field gives GOTO[s, T] for each nonterminal T. How the entries are chosen is determined solely by the parser generator when it builds the parsing tables, and this is where the differences between LR parsers come from.
Now assume that the input token stream id * ( id + id ) is to be parsed. Figure 2-4 shows the sequence of actions taken by the parser, together with the stack and the remaining input at each step.
     | Stack                      | Input              | Action       |
(1)  | 0                          | id * ( id + id ) $ | shift        |
(2)  | 0 id 5                     | * ( id + id ) $    | reduce by r6 |
(3)  | 0 F 3                      | * ( id + id ) $    | reduce by r4 |
(4)  | 0 T 2                      | * ( id + id ) $    | shift        |
(5)  | 0 T 2 * 7                  | ( id + id ) $      | shift        |
(6)  | 0 T 2 * 7 ( 4              | id + id ) $        | shift        |
(7)  | 0 T 2 * 7 ( 4 id 5         | + id ) $           | reduce by r6 |
(8)  | 0 T 2 * 7 ( 4 F 3          | + id ) $           | reduce by r4 |
(9)  | 0 T 2 * 7 ( 4 T 2          | + id ) $           | reduce by r2 |
(10) | 0 T 2 * 7 ( 4 E 8          | + id ) $           | shift        |
(11) | 0 T 2 * 7 ( 4 E 8 + 6      | id ) $             | shift        |
(12) | 0 T 2 * 7 ( 4 E 8 + 6 id 5 | ) $                | reduce by r6 |
(13) | 0 T 2 * 7 ( 4 E 8 + 6 F 3  | ) $                | reduce by r4 |
(14) | 0 T 2 * 7 ( 4 E 8 + 6 T 9  | ) $                | reduce by r1 |
(15) | 0 T 2 * 7 ( 4 E 8          | ) $                | shift        |
(16) | 0 T 2 * 7 ( 4 E 8 ) 11     | $                  | reduce by r5 |
(17) | 0 T 2 * 7 F 10             | $                  | reduce by r3 |
(18) | 0 T 2                      | $                  | reduce by r2 |
(19) | 0 E 1                      | $                  | accept       |
Figure 2-4. Actions of LR parser on id * ( id + id )
The LR parser starts in the initial state 0 (line (1)). The current input token is id, and the action to take is found in the parsing table from the state number and the input token. The entry in row 0, column id of the action field of Figure 2-3 is s5, meaning shift (move one token from the input stream onto the stack) and push state 5 on top. After the action, the first token id and the state symbol 5 have both been pushed onto the stack, and id has been removed from the input token stream, leaving “* ( id + id )”. This is shown in line (2).
Now * becomes the current input token, and the action for state 5 on input * is to reduce by r6. Since r6 refers to F → id, two symbols (one state symbol and one grammar symbol) are popped off the stack, leaving state 0 on top. According to the parsing table, the goto entry for state 0 and nonterminal F is state 3, so the nonterminal F and state 3 are pushed onto the stack. The remaining moves on input id * ( id + id ) can be deduced in the same way. The parser finishes either by reaching an accept action or by stopping at an error action.
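The whole walk-through can be replayed with a small table-driven driver. The sketch below encodes the table of Figure 2-3 as Python dictionaries and records every configuration, so its output can be compared line by line with Figure 2-4. The data layout and the function name are choices made for this illustration and say nothing about how PCYACC stores its tables.

```python
# Productions by number: (left-hand side, length of right-hand side).
# (1) E -> E + T  (2) E -> T  (3) T -> T * F  (4) T -> F  (5) F -> ( E )  (6) F -> id
PRODUCTIONS = {1: ("E", 3), 2: ("E", 1), 3: ("T", 3), 4: ("T", 1), 5: ("F", 3), 6: ("F", 1)}

# Action field of Figure 2-3; a missing entry means error.
ACTION = {
    0: {"id": "s5", "(": "s4"},
    1: {"+": "s6", "$": "acc"},
    2: {"+": "r2", "*": "s7", ")": "r2", "$": "r2"},
    3: {"+": "r4", "*": "r4", ")": "r4", "$": "r4"},
    4: {"id": "s5", "(": "s4"},
    5: {"+": "r6", "*": "r6", ")": "r6", "$": "r6"},
    6: {"id": "s5", "(": "s4"},
    7: {"id": "s5", "(": "s4"},
    8: {"+": "s6", ")": "s11"},
    9: {"+": "r1", "*": "s7", ")": "r1", "$": "r1"},
    10: {"+": "r3", "*": "r3", ")": "r3", "$": "r3"},
    11: {"+": "r5", "*": "r5", ")": "r5", "$": "r5"},
}

# Goto field of Figure 2-3.
GOTO = {(0, "E"): 1, (0, "T"): 2, (0, "F"): 3, (4, "E"): 8, (4, "T"): 2,
        (4, "F"): 3, (6, "T"): 9, (6, "F"): 3, (7, "F"): 10}


def parse(tokens):
    """Run the driver loop; return a list of (stack, remaining input, action)."""
    stack, pos, trace = [0], 0, []
    tokens = tokens + ["$"]
    while True:
        entry = ACTION[stack[-1]].get(tokens[pos])
        trace.append((list(stack), tokens[pos:], entry or "error"))
        if entry is None:
            raise SyntaxError(f"unexpected {tokens[pos]!r} in state {stack[-1]}")
        if entry == "acc":
            return trace
        number = int(entry[1:])
        if entry[0] == "s":                       # shift: push token and state
            stack += [tokens[pos], number]
            pos += 1
        else:                                     # reduce by production `number`
            lhs, length = PRODUCTIONS[number]
            del stack[len(stack) - 2 * length:]   # pop symbols and states
            stack += [lhs, GOTO[(stack[-1], lhs)]]


for n, (stack, rest, action) in enumerate(parse("id * ( id + id )".split()), start=1):
    print(f"({n})", " ".join(map(str, stack)), "|", " ".join(rest), "|", action)
```

Only the three dictionaries are specific to this grammar; the loop in parse() would stay the same for any other LR table written in the same layout.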
PCYDB is a YACC debugger designed for debugging parsers generated by Abraxas PCYACC. It lets the user follow the entire parsing procedure and examine, almost in real time, changes in the stack, the parse tree, and the input token stream. It also lets the user compare the internal operation of the parser with the way grammar rules are matched.
The LR parser theory on which Abraxas PCYACC is based was covered in the previous chapter. The following chapters describe the internal functions of PCYDB and the debugger commands, with examples illustrating the use of PCYDB.
To use PCYDB effectively, it helps to understand some of the internal workings of the LR parser generated by the PCYACC tool.
An internal state of a parser is a point at which the parser is reading input from the token stream and is ready to handle one token. The driver routine inside the parser consults the parsing table to switch between states and take the appropriate actions.
The parser generated by YACC uses its internal states to subdivide the parsing process into simpler steps. At each step, the parser reads its input token stream, determines the action to take from the current state and the current input token, and then picks the next state by checking the lookahead token (the next token in the input stream).
Each state is assigned a number. The initial state is usually numbered 0, identifying it as the parser’s condition before any token has been read from the input stream. Other states are numbered when YACC generates the parser, and the numbering depends on the YACC implementation.
For each internal state of a parser, there are several actions that can be taken. The possible actions are:
· Shift to a new state
· Reduce one or more input tokens to a single nonterminal symbol, according to a grammar rule
· Go to a new state
· Accept the input
· Find an error
The action the parser actually takes is determined by its current state and the current input token. Most states offer a choice of several actions; there are also special states in which the parser can take only one action, whatever the lookahead token is.
The following is a description of each of the possible actions that the parser can take. Understanding these actions is a great help in comprehending the inner workings of a PCYACC parser.
The Shift action is taken when the parser is in the middle of recognizing a grammar construct. The lookahead token is read in, and the parser can select one of several possible states as the next state. The choice is made from the lookahead token and the current state. After entering the new state, the parser can shift to yet another state based on the next lookahead token read from the input stream.
Internally, the parser keeps a state stack to track the state transitions it makes. The stack records the history of the states that the parser has been in. When the parser shifts to a new state, the previous state is pushed onto the state stack.
In addition to the state stack, there is a value stack, which records the values of tokens and nonterminal symbols during the parsing process. The value of a token is supplied by the lexer “yylex”, which the parser calls; it is usually passed through the global variable “yylval”. The value of a nonterminal symbol appears in the grammar description file as $$. It is set by the recognition action associated with the symbol’s definition. If the symbol’s definition has no associated recognition action, the value of the symbol is the value of the first item in its definition.
The Shift action simultaneously pushes the current state onto the state stack and the global “yylval” (the value of the lookahead token) onto the value stack.
When the parser has recognized all the items that make up a nonterminal symbol, it takes the Reduce action irrespective of what the lookahead token is. A Reduce action is the result of the parser recognizing the nonterminal symbol in a grammar rule.
The Reduce operation first pops several states off the state stack. If the recognized nonterminal symbol has N components, the Reduce operation pops N-1 states off the state stack. The parser thereby returns to the state it was in when it first began to gather the recognized construct.
The value stack is modified next. If the grammar rule definition being processed has N components, the Reduce action pops a total of N values off the value stack. The symbols $N, $N-1, ..., down to $1 that usually appear in the grammar rule definition are assigned these values in the order they are popped.
After assigning the values $N, $N-1, ..., $1, the Reduce action invokes the recognition action associated with the grammar rule being processed. The value of the nonterminal symbol, represented by $$, is determined by the values $1 through $N and the grammar rule itself. The $$ value is then pushed onto the value stack in place of the N values that were popped off.
If there is no recognition action associated with the nonterminal, or if the recognition action does not set the value $$, the Reduce action simply puts the value $1 back onto the value stack. (In practice, $1 is simply not popped off the value stack in the first place.)
The last cleanup step performed by the Reduce action is to set up the lookahead symbol so that it appears to be the nonterminal symbol that was just recognized.
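The value-stack side of a Reduce can be sketched as follows. This is an illustration only: the function name is invented here, the recognition action is modeled as an ordinary Python callable that returns $$, and the case of an action that runs without setting $$ is not modeled.

```python
def reduce_values(value_stack, n, action=None):
    """Pop the n values $1..$n, run the recognition action, and push $$."""
    args = value_stack[len(value_stack) - n:]   # $1 .. $n, in rule order
    del value_stack[len(value_stack) - n:]
    if action is not None:
        result = action(*args)                  # the action computes $$
    else:
        result = args[0] if args else None      # default: $$ takes the value of $1
    value_stack.append(result)
    return result


# Hypothetical rule  E : E '+' T  { $$ = $1 + $3; }
values = [2, "+", 3]
reduce_values(values, 3, lambda left, op, right: left + right)
print(values)

# A rule with no recognition action keeps the value of its first component.
values = [7]
reduce_values(values, 1)
print(values)
```

The second call shows why $1 need not really be popped when a rule has no action: the stack ends up unchanged.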
The Goto action is a continuation of the Reduce process. It is almost identical to the Shift action; the only difference is that a Goto takes place when the lookahead symbol is a nonterminal symbol, whereas a Shift takes place when the lookahead symbol is a token.
While the Shift action pushes the current state onto the state stack, the Goto action does not have to: the current state is already on the state stack. The Shift action also pushes a value onto the value stack, but the Goto action does not, because it happens after the Reduce action, which has already put the value of the nonterminal symbol onto the value stack. The destination state is determined from the parsing table by the current state and the nonterminal symbol, and the Goto action replaces the top of the state stack with the destination state.
After the parser has moved to the destination state, the lookahead symbol is restored to whatever input token was current at the time of the Reduce action.
Thus the essential difference between the two is that a Goto action takes place when the parser returns to a state after completing a Reduce action, whereas a Shift action is based on the current parser state. In addition, a Shift action is driven by a single current input token, whereas a Goto action is driven by a nonterminal symbol prepared by the Reduce action.
The Accept action is the successful end point of the parsing process. It happens when the parser has processed all the input tokens correctly and has reduced the entire input to the start symbol. When the conditions for the Accept action are met, the yyparse() function returns zero to its caller, indicating that the input token stream was parsed successfully according to the grammar rules.
The Error action is taken when the parser encounters an input token that cannot legally appear at that point in the input. The parser usually cannot do much to handle an input error except in extreme cases. However, it is highly undesirable to stop processing the input token stream whenever an error is found. It is better for the parser to skip over the incorrect input and resume parsing as soon as possible. This is much more efficient, because the parser can then identify most syntax errors in a single pass through the input.
Most parser generators therefore try to generate a parser that can restart as soon as possible after an error occurs. YACC does this by letting the user specify the points at which the parser should pick up after errors. The user can also specify the actions to take when an error is found at those points.
The Error action has the following steps:
· See if the current state has a Shift action associated with the error symbol. If it does, take that Shift action.
· If the current state has no Shift action associated with the error symbol, pop the current state off the state stack and check the next state. To keep the state stack and the value stack in sync, the value at the top of the value stack is popped off as well.
· Repeat the previous step until the parser finds a state that has a Shift action on the error symbol.
· Once this state is found, take the Shift action associated with the error symbol. This pushes the current state, that is, the state that can handle errors, onto the state stack. No new value is pushed onto the value stack; the parser keeps whatever value was already associated with the state that can handle errors, which is already on the value stack.
After the parser shifts out of the state that can handle errors, the lookahead token is whatever token caused the error condition in the first place. The parser then tries to proceed with normal processing.
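The unwinding loop in the steps above can be sketched as follows. This is one simplified reading of the text, not PCYACC's driver: the function name and the error_shift table are invented for the illustration, the top of state_stack is treated as the current state, and the state numbers in the usage lines are arbitrary.

```python
def recover(state_stack, value_stack, error_shift):
    """Unwind to a state that can shift the error symbol, then take that shift.

    state_stack[-1] is treated as the current state. error_shift maps a
    state to the state reached by shifting the error symbol from it.
    Returns False when no state on the stack can handle the error.
    """
    while state_stack and state_stack[-1] not in error_shift:
        state_stack.pop()
        if value_stack:
            value_stack.pop()          # keep the value stack in step
    if not state_stack:
        return False
    # Shift on the error symbol; following the text, no value is pushed.
    state_stack.append(error_shift[state_stack[-1]])
    return True


states, values = [0, 4, 9], ["(", "x"]
print(recover(states, values, {4: 12}), states, values)
```

In a grammar file, the states that appear in error_shift are the ones reached just before a rule that mentions the error symbol, which is how the user chooses the recovery points.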
2. PCYDB Working Process
PCYDB accepts four different combinations of input files:
· A Grammar Description File (GDF) plus the input source code, with the lexer hand-written in the grammar description file (the lexer must be named yylex())
· A GDF and an SDF (Scan Description File) plus the input source code, with the lexer generated automatically by PCLEX
· An Extended Grammar File (GDF and SDF combined in a single file) plus the input source code
· A GDF plus a file specifying an array of integers that represents the input token stream
PCYDB initializes its internal parameters when invoked. The grammar file is turned into the required parser code either when it is specified on the command line or when the genstate command is issued at the PCYDB prompt. Along with the parser, the lexer code needed to obtain tokens from the input source code is also generated. This step is unnecessary if a binary token input stream is specified directly.
When the user enters commands to set breakpoints and the like, the breakpoints are recorded internally in PCYDB. When the user then enters a command that starts the parser, the actual parser code is executed. Each time the parser completes one round of processing an input token, it checks its internal records for any condition that needs attention, such as a breakpoint. If a breakpoint is hit, execution halts temporarily and waits for the user’s command to continue. While execution is halted, the internal state of the parser can be examined, including the parsing tables, the state stack, the value stack, and the token stream.
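The check-after-every-step idea can be sketched in a few lines. Everything named here is hypothetical: the class, its methods, and the list of (state, token) steps are invented for the illustration and do not describe PCYDB's internal data structures or command syntax.

```python
class BreakpointTable:
    """Records state and token breakpoints and checks them after each step."""

    def __init__(self):
        self.states = set()
        self.tokens = set()

    def break_state(self, state):
        self.states.add(state)

    def break_token(self, token):
        self.tokens.add(token)

    def clear(self):
        self.states.clear()
        self.tokens.clear()

    def hit(self, state, token):
        return state in self.states or token in self.tokens


def run_until_break(steps, breakpoints, start=0):
    """Advance through (state, token) steps; return the index where execution halts."""
    for index in range(start, len(steps)):
        state, token = steps[index]
        if breakpoints.hit(state, token):
            return index           # halted: the caller may now inspect the parser
    return len(steps)              # ran to the end of the input


steps = [(0, "id"), (5, "*"), (3, "*"), (2, "*"), (7, "(")]
table = BreakpointTable()
table.break_state(2)
print(run_until_break(steps, table))
```

Because the check is an ordinary set lookup inside the parser loop, no operating-system debugging service is involved in stopping at a breakpoint.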
If the grammar description file is changed during a PCYDB session, the genstate command can be used to regenerate the parsing tables and the parser. The new parser can be reloaded into PCYDB without having to exit PCYDB, and debugging can start again from the beginning on the modified grammar file. This makes it much easier to debug a YACC program.
PCYDB differs from conventional language debuggers, which use the operating system's debugging services by setting hardware breakpoints and involving interrupt and exception handling routines. Instead, PCYDB uses the internal state of the parser program itself to set breakpoints, so no interrupt or exception handling overhead is involved. Combined with the efficiency of the implemented LR parser, this makes PCYDB much faster than conventional debuggers. It also makes PCYDB independent of the operating system, which is a very desirable feature.
[Figure: diagram relating PCYDB, the LR Parser, the Token Stream, the Lexer, the Grammar Description File (GDF), and the Scan Description File (SDF)]
Figure 3-1. PCYDB Interactions
Figure 3-1 shows the interaction between PCYDB and the LR parser and various other components.
In this chapter, we will discuss how to start PCYDB, and how to get out of it. The essentials are:
· type “pcydb” to start the PCYACC debugger.
· type “quit” to exit from the PCYACC debugger.
You invoke the PCYACC debugger by running the program pcydb. Once started, the PCYACC debugger reads commands from the terminal until you tell it to exit.
You can also run pcydb with a variety of arguments and options to specify more of your debugging environment. The command-line options are described below.
The most common way to invoke the PCYACC debugger is to type “pcydb” without any options:
pcydb
If command line options are specified, they should follow one of these formats:
pcydb [-g <GDF filename with hand-written lexer>] [-i <Input Source Code>]
pcydb [-g <GDF filename>] [-s <SDF filename>] [-i <Input Source Code>]
pcydb [-e <Extended GDF filename>] [-i <Input Source Code>]
pcydb [-g <GDF filename>] [-t <Token Input Stream File>]
These four formats correspond to the four input file combinations accepted by PCYDB. Not all of the files need to be specified on the command line; some or all of them can be set after invoking PCYDB by issuing PCYDB commands.
Before the parser starts executing in debug mode, however, PCYDB checks that the grammar description file and either the scanner description file plus the input source code or a token input stream file are available. If any of them is missing, the LR parser cannot be started correctly and the debug process will not succeed. In this case, an error message is displayed asking the user to supply the missing information by issuing the related PCYDB commands before continuing.
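The availability check can be pictured with a short Python sketch. It is a hedged illustration rather than PCYDB's actual logic: the function classify_inputs, the descriptive strings it returns, and the filenames in the usage note are hypothetical. Only the option letters (g, s, e, i, t) come from the four command-line formats above, and treating an empty option set as the hand-written-lexer case is an assumption made for the sketch.

```python
# Hypothetical helper; the option letters mirror the command-line formats above.
def classify_inputs(opts):
    """opts maps option letters ('g', 's', 'e', 'i', 't') to filenames.

    Returns (combination, missing): a short description of the matching
    input combination and the option letters still needed before the
    parser can run."""
    if "e" in opts:
        name, needed = "extended grammar file + source", ["e", "i"]
    elif "t" in opts:
        name, needed = "GDF + token stream", ["g", "t"]
    elif "s" in opts:
        name, needed = "GDF + SDF + source", ["g", "s", "i"]
    else:
        # Assumption: with no -e, -t or -s, expect a GDF with its own yylex().
        name, needed = "GDF with hand-written lexer + source", ["g", "i"]
    missing = [letter for letter in needed if letter not in opts]
    return name, missing
```

For example, classify_inputs({"g": "calc.y", "s": "calc.l"}) reports the GDF + SDF combination with "i" still missing, which corresponds to the case where the user must supply the input source code before execution can start.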
To exit PCYDB after completing a debug session, type:
quit or q
PCYDB then cleans up all temporary files and terminates normally, returning the user to the terminal command prompt.
A PCYDB command is a single line of input, with no limit on its length. It starts with a command name, followed by arguments whose meaning depends on the command. For example, the command breaktoken accepts one argument, a token number, as in “breaktoken 258”. Some commands, such as step, take no argument; step simply executes one step of the parsing process starting from the current execution point.
Each PCYDB command has a corresponding abbreviation, listed below with each individual command. Unlike in other traditional debuggers, a blank line as input to PCYDB does not repeat the previous command, since unintentionally repeating some commands might cause trouble in the debugging session.
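The command-line conventions just described (a name, optional arguments, abbreviations, and no repeat on a blank line) can be modelled in a few lines of Python. This is a sketch under stated assumptions, not PCYDB's parser: the function parse_command and the table name ABBREVIATIONS are hypothetical, and the table holds only the abbreviations this manual gives for quit, breakstate, breaktoken, clearbreak and genstate.

```python
# Abbreviations given in this manual; other commands are omitted here.
ABBREVIATIONS = {
    "bs": "breakstate",
    "bt": "breaktoken",
    "cb": "clearbreak",
    "gs": "genstate",
    "q": "quit",
}


def parse_command(line):
    """Split one input line into (command_name, arguments).

    Returns None for a blank line, so the caller does nothing rather
    than repeating the previous command."""
    words = line.split()
    if not words:
        return None
    # Expand an abbreviation to its full name; full names pass through.
    name = ABBREVIATIONS.get(words[0], words[0])
    return name, words[1:]
```

With this model, "bt 258" and "breaktoken 258" resolve to the same command, and an empty line yields nothing to execute.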
Set a breakpoint at a specified state number. Because the parser's driver routine consults the parsing table to switch between states as it reads tokens from the input stream, setting a breakpoint at a particular state makes it possible to follow the execution of the parser and check its internal variables. This command has one required argument, statenumber. If the argument is missing, PCYDB displays a warning and takes no action.
Syntax format:
breakstate statenumber
Abbreviation:
bs statenumber
Set a breakpoint at the next occurrence of a specified token. PCYDB takes an integer token array as its input and switches state according to the input token and the current state of the parser. A breakpoint that triggers when the parser sees a particular token lets the user check the parser's internal variables and states to understand its operation. This command has one required argument, tokennumber. If the argument is missing, PCYDB displays a warning and takes no action.
Syntax format:
breaktoken tokennumber
Abbreviation:
bt tokennumber
This command clears all breakpoints set using the previous two commands. It takes no argument. An error message is displayed if any argument is specified.
Syntax format:
clearbreak
Abbreviation:
cb
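The argument rules for the three breakpoint commands can be summarised in a small Python model. It is an illustration only: the class Breakpoints, the method execute, and the message wording are invented for this sketch, and rejecting a non-numeric argument is an assumption that the manual does not state.

```python
# Illustrative model; message wording and names are invented for this sketch.
class Breakpoints:
    def __init__(self):
        self.states = set()   # state numbers set with breakstate
        self.tokens = set()   # token numbers set with breaktoken

    def execute(self, name, args):
        """Apply one breakpoint command; return a message string or None."""
        if name in ("breakstate", "breaktoken"):
            # A missing argument produces a warning and no action.
            # Assumption: a non-numeric argument is treated the same way.
            if len(args) != 1 or not args[0].isdigit():
                return "warning: " + name + " requires one numeric argument"
            target = self.states if name == "breakstate" else self.tokens
            target.add(int(args[0]))
            return None
        if name == "clearbreak":
            if args:
                return "error: clearbreak takes no argument"
            self.states.clear()
            self.tokens.clear()
            return None
        return "error: unknown command " + name
```

A session equivalent to typing "bs 12", "bt 258" and then "cb" would call execute three times and leave both breakpoint sets empty again.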
Generate a new state table from the grammar input for the PCYDB debugger. PCYDB creates the parsing tables, which are required before the grammar can be parsed. There are two ways to specify the grammar file:
· Entered on the command line in one of the four formats specified in the previous chapter. In this case, the input filenames are checked to make sure that both a GDF and an SDF (or a specified input token stream) exist, so that the parsing table can be generated and executed.
· If no command line option was given for the GDF and SDF, use the commands setgdf and setsdf to specify the GDF and SDF files used for generating the parser.
This command first checks that all the source files needed for generating the parser have been specified using either method above. It then generates the parsing tables and all internal data structures for the parser, loads the parser into memory, and leaves it ready to execute. If any file is missing, the command fails and an error message is displayed.
This command takes no argument. An error message is displayed if any argument is specified along with the command.
Syntax format:
genstate
Abbreviation:
gs