yacc program error handling

When the parser reads an input stream, that input stream might not match the rules in the grammar file.

The parser detects the problem as early as possible. If there is an error-handling subroutine in the grammar file, the parser can let the user enter the data again, ignore the bad data, or start a cleanup and recovery action. For example, when the parser finds an error, it may need to reclaim parse tree storage, delete or alter symbol table entries, and set switches to avoid generating further output.

When an error occurs, the parser stops unless you provide error-handling subroutines. To continue processing the input to find more errors, restart the parser at a point in the input stream where the parser can try to recognize more input. One way to restart the parser when an error occurs is to discard some of the tokens following the error. Then try to restart the parser at that point in the input stream.

The yacc command uses a special token name, error, for error handling. Put this token in the grammar file at places where an input error might occur so that you can provide a recovery subroutine. If an input error occurs in this position, the parser executes the action for the error token, rather than the normal action.

The following macros can be placed in yacc actions to assist in error handling:
Macro             Description
YYERROR           Causes the parser to initiate error handling.
YYABORT           Causes the parser to return with a value of 1.
YYACCEPT          Causes the parser to return with a value of 0.
YYRECOVERING()    Returns a value of 1 if a syntax error has been detected and the parser has not yet fully recovered.
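
As a rough illustration of where these macros might appear, the following sketch uses YYERROR, YYACCEPT, and YYABORT in actions. The tokens NUMBER and QUIT, the limits MAX_VALUE and MAX_ERRORS, and the counter error_count are assumptions made for this example only. The yylex and yyerror routines are assumed to be supplied elsewhere.

```yacc
%{
#include <stdio.h>

#define MAX_VALUE  1000   /* hypothetical range limit */
#define MAX_ERRORS 10     /* hypothetical error budget */

static int error_count = 0;

int yylex(void);                 /* assumed to be supplied elsewhere, for example by lex */
void yyerror(const char *msg);   /* assumed to be supplied elsewhere */
%}

%token NUMBER QUIT

%%

list  : /* empty */
      | list item
      ;

item  : value '\n'
        {
          printf("%d\n", $1);
        }
      | QUIT '\n'
        {
          YYACCEPT;    /* yyparse returns 0 */
        }
      | error '\n'
        {
          /* Give up once too many errors have been seen. */
          if (++error_count > MAX_ERRORS)
            YYABORT;   /* yyparse returns 1 */
        }
      ;

value : NUMBER
        {
          /* The input is syntactically valid, but the action rejects it:
             YYERROR starts error handling as if a syntax error had occurred. */
          if ($1 > MAX_VALUE)
            YYERROR;
          $$ = $1;
        }
      ;
```

The range check sits in its own value rule so that YYERROR runs before the newline is consumed. The error rule can then resume at the end of the same line.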

To prevent a single error from producing many error messages, the parser remains in the error state until it has successfully processed three tokens following the error. If another error occurs while the parser is in the error state, the parser discards the input token and does not produce a message.

For example, a rule of the following form:
stat  :  error ';'

tells the parser that when there is an error, it should skip the token at which the error was detected and all following tokens until it finds the next semicolon. After finding the semicolon, the parser reduces this rule and performs any cleanup action associated with it.
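
A cleanup action attached to such a rule might look like the following fragment. This is only a sketch: the expr nonterminal is assumed to be defined elsewhere in the grammar, and emit_statement and release_partial_statement are hypothetical routines, not part of yacc.

```yacc
stat  : expr ';'
        {
          emit_statement($1);              /* hypothetical normal action */
        }
      | error ';'
        {
          /* Reached only after the parser has skipped ahead to the ';'. */
          release_partial_statement();     /* hypothetical cleanup routine */
        }
      ;
```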

Providing for error correction

You can also allow the person entering the input stream in an interactive environment to correct any input errors by entering a line in the data stream again. The following example shows one way to do this.
input : error '\n'
        {
          printf(" Reenter last line: " );
        }
        input
        {
          $$ = $4;
        }
      ;
However, in this example, the parser stays in the error state for three input tokens following the error. If the corrected line contains an error in the first three tokens, the parser deletes the tokens and does not produce a message. To allow for this condition, use the following yacc statement:
yyerrok;

When the parser finds this statement, it leaves the error state and begins processing normally. The error-recovery example then becomes:

input : error '\n'
        {
          yyerrok;
          printf(" Reenter last line: " );
        }
        input
        {
          $$ = $4;
        }
      ;

Clearing the look-ahead token

The look-ahead token is the next token that the parser examines. When an error occurs, the look-ahead token becomes the token at which the error was detected. However, if the error-recovery action includes code to find the correct place to start processing again, that code must also discard the old look-ahead token, which no longer matches the new position in the input. To clear the look-ahead token, include the following statement in the error-recovery action:
yyclearin ;
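
For instance, a recovery action that repositions the input itself might be sketched as follows. The resynch routine is hypothetical: it stands for user-supplied code that reads ahead to the start of the next statement.

```yacc
stat  : error
        {
          resynch();    /* hypothetical: advance the input to the next statement */
          yyerrok;      /* leave the error state */
          yyclearin;    /* discard the stale look-ahead token */
        }
      ;
```

Because resynch has already moved past the bad input, the old look-ahead token is cleared so that the parser reads a fresh token when it resumes.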