Ch 3.4 Attribute Grammars

Static semantics refers to the context-sensitive information in a programming language that can be determined before run time. Semantics normally describes the behavior of a program during execution, but in this context it refers to the context-sensitive meaning of a program. Examples of static semantics include type checking and the location of declarations (for example, that a variable is declared before it is used). BNF and context-free grammars (CFGs) are well suited to describing the context-free syntax of programming languages, but they cannot easily describe context-sensitive information without becoming large and clumsy. Because the size of a grammar determines the size of the parser, it is better to keep the CFG small and let an attribute grammar add static semantics checking to the compiler.

An attribute grammar is a generative grammar that can describe both the context-free and the context-sensitive syntax of a language. Attribute grammars add context-sensitive information to CFGs by transporting semantic information in a controlled fashion up and down the parse tree. Information is passed through synthesized and inherited attributes that are attached to the nodes of the parse tree.

Synthesized attributes pass information from child to parent up the tree. The parse tree is "decorated" with synthesized attributes in a bottom-up fashion. If the value of a synthesized attribute on a leaf node is assigned before parsing begins (hence outside the parse tree), the attribute is called intrinsic.

Intrinsic attributes are synthesized attributes of leaf nodes, the value of which is known before the sentence is parsed. These attributes pass external information (such as from the symbol table) up the parse tree.

Inherited attributes pass information from parent to child and from sibling to sibling. The overall effect is to move information down the parse tree. The parse tree is "decorated" in a top-down fashion for inherited attributes. Each grammar symbol may have inherited or synthesized attributes, or both.

Note: If the tree contains both inherited and synthesized attributes, a combination of top-down and bottom-up order is used to decorate the tree.
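
As a minimal sketch of this combined order, the Python traversal below sets one inherited attribute on the way down and one synthesized attribute on the way back up. The names Node, depth, size, and decorate are hypothetical and chosen only for illustration; they are not part of the attribute grammar defined in these notes.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Node:
    """A parse-tree node with one inherited and one synthesized attribute."""
    label: str
    children: List["Node"] = field(default_factory=list)
    depth: Optional[int] = None   # inherited: flows from parent to child
    size: Optional[int] = None    # synthesized: flows from children to parent


def decorate(node: Node, depth: int = 0) -> int:
    """Decorate the subtree rooted at node and return its synthesized size."""
    node.depth = depth                                   # top-down step
    child_sizes = [decorate(c, depth + 1) for c in node.children]
    node.size = 1 + sum(child_sizes)                     # bottom-up step
    return node.size


# Hypothetical tree for the sentence "A = B".
tree = Node("<assign>", [
    Node("<var>", [Node("A")]),
    Node("="),
    Node("<expr>", [Node("<var>", [Node("B")])]),
])
decorate(tree)
```

Each node receives its depth before its children are visited and its size only after all of them have returned, which is the mixed top-down and bottom-up order described in the note.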

Attribute Grammar Formal Definition

  An attribute grammar is a context-free grammar G = <S, N, T, P> 
  with four additions:
 
 1. For each grammar symbol X ∈ N ∪ T, there is a set A(X) of 
    attributes

 2. Each rule has a set of functions that define certain attributes of the 
    nonterminals in the rule (also called semantic rules or semantic functions)

 3. For each syntactic category X ∈ N, there are two finite disjoint sets
    I[X] and S[X] of inherited and synthesized attributes. For X = S, 
    I[X] = ∅

 4. Each rule has a (possibly empty) set of predicates to check for attribute 
    consistency. Predicates (which return TRUE or FALSE) check semantic meaning,
    such as type rules. Predicates are inserted into the attribute grammar
    at any point in which you can determine whether the sentential form 
    conforms to the semantic rules of the grammar.

Example.
  Let   X_0 -> X_1 X_2... X_n  be a rule

  Functions of the form S(X_0) = f(A(X_1), ... , A(X_n)) define synthesized 
  attributes.

    In this case attributes for X_0 can be synthesized from any of X_0's 
    children; i.e., the terms  X_1 ... X_n in the production rule
     
      *synthesized attributes pass semantic information up a parse tree*

  Functions of the form I(X_j) = f(A(X_0), ... , A(X_n)), 1 <= j <= n, 
  define inherited attributes

    In this case attributes can be inherited from the parent and all siblings.

  Functions of the form I(X_j) = f(A(X_0), ... , A(X_(j-1))), 
  for 1 <= j <= n, can also define inherited attributes (this restricted 
  form excludes the symbol itself and its siblings to the right)

     *inherited attributes pass information down and across the parse tree*

Attribute Grammar Example
  A BNF grammar for a simple assignment statement:

  1. <assign> -> <var> = <expr>
  2. <expr> -> <var> + <var> 
  3. <expr> -> <var>
  4. <var> -> A | B | C

  This context-free grammar is limited: it cannot check type rules. To check
  types, the grammar needs two predicates, one for each form of statement:

  A = B
  A = B + C

  This grammar has two attributes:

  o actual_type: a synthesized attribute for nonterminals <var> and <expr>
                 used to store the actual type; e.g., int or real

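  o expected_type: an inherited attribute for nonterminal <expr>
                   used to store the type expected for the expression,
                   taken from the <var> on the left side of the assignment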
 
  1. Syntax rule:  <assign> -> <var> =  <expr>

     Semantic rule: <expr>.expected_type <- <var>.actual_type (inherited)

  2. Syntax rule:  <expr> -> <var>[2] + <var>[3]
     ([2] and [3] distinguish these two <var> occurrences from each other
      and from the <var> in rule 1)

     Semantic rule: <expr>.actual_type <- 
                          if (<var>[2].actual_type == int) and
                             (<var>[3].actual_type == int)
                          then int else real        (synthesized)

     Predicate: <expr>.actual_type == <expr>.expected_type

  3. Syntax rule:  <expr> -> <var>
     Semantic rule: <expr>.actual_type <- <var>.actual_type 

     Predicate: <expr>.actual_type == <expr>.expected_type

  4. Syntax rule:  <var> -> A | B | C 

     Semantic rule: <var>.actual_type <- lookup (<var>.string) 
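
The four rules above can be sketched in Python as follows. This is only an illustration under stated assumptions: SYMBOL_TABLE, expr_actual_type, and check_assign are hypothetical names, the table contents are invented for the example, and a real compiler would attach these attributes to parse-tree nodes rather than pass strings around.

```python
from typing import List

# Assumed symbol table; the notes only say that lookup consults one.
SYMBOL_TABLE = {"A": "int", "B": "int", "C": "real"}


def lookup(name: str) -> str:
    """Rule 4: <var>.actual_type <- lookup(<var>.string)."""
    return SYMBOL_TABLE[name]


def expr_actual_type(operands: List[str]) -> str:
    """Rules 2 and 3: synthesize <expr>.actual_type from its <var> children."""
    types = [lookup(v) for v in operands]
    if len(types) == 1:
        # Rule 3: <expr> -> <var>
        return types[0]
    # Rule 2: <expr> -> <var>[2] + <var>[3]
    return "int" if all(t == "int" for t in types) else "real"


def check_assign(target: str, operands: List[str]) -> bool:
    """Rule 1 followed by the predicate of rule 2 or rule 3."""
    expected_type = lookup(target)               # inherited by <expr>
    actual_type = expr_actual_type(operands)     # synthesized by <expr>
    return actual_type == expected_type          # the predicate
```

With the assumed table, check_assign("A", ["B"]) models the statement A = B and returns True, because both attributes of <expr> are int.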

The parse tree for A = B + C
                         <assign>
                        /   |    \    
                  <var>     =  <expr>
             .actual_type      .actual_type 
                  |               .expected_type
                  A               /    |      \    
                            <var>      +    <var>
                          .actual_type       .actual_type
                              |                   |
                              B                   C  

The parse tree for B = C
                        <assign>
                      /    |      \  
                  <var>    =  <expr>
             .actual_type      .actual_type 
                  |            .expected_type
                  B               |    
                                <var>   
                               .actual_type  
                                 |      
                                 C     

Assume:
   int A, B; real C
   A = B + C
   B = C 
   What happens?

* How are attribute values computed?

Initially, there are only intrinsic attributes on the leaves. For A = B + C,
the remaining attributes are then computed in this order:

  1. <var>.actual_type <- lookup (A)     (Rule 4) 
  2. <expr>.expected_type  <- <var>.actual_type (Rule 1)

  3. <var>[2].actual_type  <- lookup (B)   (Rule 4)
     <var>[3].actual_type  <- lookup (C)   (Rule 4)
  4. <expr>.actual_type <- either int or real (Rule 2) 

  5. <expr>.expected_type == <expr>.actual_type is either TRUE or FALSE (Rule 2)
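
The evaluation order can be traced with a short Python sketch for the statement A = B + C under the declarations int A, B; real C from the Assume block. The function name decorate_assign and the step labels are hypothetical; the code simply records each attribute in the order it becomes available.

```python
from typing import Dict, List, Tuple


def decorate_assign(target: str, left: str, right: str,
                    symbols: Dict[str, str]) -> List[Tuple[str, object]]:
    """Return the attribute-evaluation steps for `target = left + right`."""
    steps: List[Tuple[str, object]] = []

    var_type = symbols[target]                        # step 1 (Rule 4)
    steps.append(("<var>.actual_type", var_type))

    expected = var_type                               # step 2 (Rule 1)
    steps.append(("<expr>.expected_type", expected))

    t2, t3 = symbols[left], symbols[right]            # step 3 (Rule 4)
    steps.append(("<var>[2].actual_type", t2))
    steps.append(("<var>[3].actual_type", t3))

    # step 4 (Rule 2): int only if both operands are int
    actual = "int" if t2 == "int" and t3 == "int" else "real"
    steps.append(("<expr>.actual_type", actual))

    steps.append(("predicate", expected == actual))   # step 5 (Rule 2)
    return steps


symbols = {"A": "int", "B": "int", "C": "real"}       # int A, B; real C
trace = decorate_assign("A", "B", "C", symbols)
for name, value in trace:
    print(f"{name:24} {value}")
```

Under these declarations <expr>.actual_type is real while <expr>.expected_type is int, so the final predicate is FALSE and the statement fails the type check.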

More Examples