Chapter 7 - Expressions and Assignment Statements

resources:
C ref man
C/C++ precedence & associativity chart
hw07.c sample C code

Expressions and assignment statements are the fundamental means of computation in imperative languages. An expression is any construct that yields a value and can legally appear on the right side of an assignment statement. An expression can be as simple as a literal constant:

     num =  5.787;
     str = "hello"; 
or can involve any number of the operators in the language. Supported operators vary by language. C/C++ include these low-level bitwise operators:
        << - shift left
        >> - shift right
        & - bitwise and
        | - bitwise or
        ^ - bitwise exclusive or
and a comma (sequence) operator:

  while(read_string(s), s.len() > 5)
  {
     //do something
  }
Scripting languages and Java support string concatenation. The power of an imperative language is related to the number and type of operators that are supported. See Java Expressions. The more strongly typed a language is, the more complicated are the constraints placed on expressions (compare C with Java).
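The bitwise and comma operators listed above can be tried out with a few small C functions. This is a minimal sketch: the function names (set_bit, clear_bit, toggle_bit, high_nibble, count_steps) are invented for illustration and do not come from hw07.c, and the complement operator ~ is used in addition to the operators listed.

```c
/* Set bit n of x using shift-left and bitwise or. */
unsigned set_bit(unsigned x, unsigned n)    { return x | (1u << n); }

/* Clear bit n of x using bitwise and with an inverted mask. */
unsigned clear_bit(unsigned x, unsigned n)  { return x & ~(1u << n); }

/* Flip bit n of x using bitwise exclusive or. */
unsigned toggle_bit(unsigned x, unsigned n) { return x ^ (1u << n); }

/* Extract bits 4..7 of x using shift-right and a mask. */
unsigned high_nibble(unsigned x)            { return (x >> 4) & 0xFu; }

/* Comma operator: evaluate the left operand, discard its value, and
   yield the right operand. Each loop test first advances i by 2, then
   compares it with limit. */
int count_steps(int limit)
{
    int i = 0, steps = 0;
    while (i += 2, i <= limit)
        steps++;
    return steps;
}
```

For example, set_bit(0, 3) yields 8 and count_steps(6) yields 3 (i takes the values 2, 4, 6 before the test fails at 8).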

The two primary types of expressions in modern languages are expressions that return a number (arithmetic) and expressions that return true or false (relational and boolean expressions). In C, relational and boolean expressions also return a number (an int, 0 or 1), since the language was designed without a boolean type (C99 later added _Bool).

ARITHMETIC EXPRESSIONS

  • arithmetic computation influenced development of programming languages
  • expressions consist of operators, operands, and parentheses
  • in most modern languages, the return value from a function call can be an operand
     
    Design Issues
    o  operator precedence and associativity rules
    o  order of operand evaluation (issue if side-effects)
    o  operand evaluation side effects 
    o  operator overloading
    o  mode mixing expressions (e.g. float with integer)
    
    Operators
      Arity (number of operands)
      unary:   ++
      binary:  +
      ternary: ?:    (a>5) ? b++ : b--;   <= ternary conditional 
    
    Operator Precedence Rules 
    
       Typical precedence levels: C++ chart
    
          parentheses
          postfix ++, --
          prefix ++, --
          unary +, -
          *,/,%
          +, -
          =
    
    Operator Associativity Rules (see chart)
    o associativity sets evaluation order of adjacent operators of equal precedence 
    o binary operators typically associate left to right
    o unary operators typically associate right to left
    o Sample code in C
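The precedence and associativity rules above can be checked with a few one-line C functions. This is an illustrative sketch with made-up function names, not the course's sample code.

```c
/* Precedence: * binds tighter than +, so this is 2 + (3 * 4). */
int precedence_demo(void)  { return 2 + 3 * 4; }

/* Left associativity: 10 - 4 - 3 is (10 - 4) - 3, not 10 - (4 - 3). */
int left_assoc_demo(void)  { return 10 - 4 - 3; }

/* Right associativity of assignment: a = b = 7 is a = (b = 7). */
int right_assoc_demo(void)
{
    int a, b;
    a = b = 7;
    return a + b;
}

/* Unary operators associate right to left: - -x is -(-x). */
int unary_assoc_demo(int x) { return - -x; }
```

precedence_demo() returns 14, left_assoc_demo() returns 3, and right_assoc_demo() returns 14 because both variables receive 7.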
    
    
    Ternary Conditional Expressions
    
       average = (count == 0) ? 0 : sum / count
       means:
         if (count == 0) 
              average = 0
         else 
              average = sum /count
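As a compilable C version of the same idea, the conditional expression can be wrapped in a function. The name average and the choice of int operands are assumptions made for this sketch.

```c
/* Returns 0 for an empty data set instead of dividing by zero.
   With int operands, sum / count is integer division. */
int average(int sum, int count)
{
    return (count == 0) ? 0 : sum / count;
}
```

Only the selected branch is evaluated, so average(10, 0) returns 0 without ever performing the division.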
    			
    Operand Evaluation Order
    
    o  Variables: fetch the value from memory
    o  Constants: sometimes a fetch from memory; sometimes the constant is in the 
    	machine language instruction
    o  Parenthesized expressions: evaluate all operands and operators first
    o  postfix v. prefix increment/decrement operators: 
       y = x * z++;  the current value of z is used to evaluate the expression 
                     (i.e., y = x * z) then z is incremented 
       y = x * ++z;  z is incremented first
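The difference between the postfix and prefix forms can be observed in C with two small functions. The names below are hypothetical, and z is passed by pointer so the caller can see the increment.

```c
/* Postfix: the current value of *z is used in the product,
   then *z is incremented. */
int postfix_product(int x, int *z) { return x * (*z)++; }

/* Prefix: *z is incremented first, and the new value is used. */
int prefix_product(int x, int *z)  { return x * ++(*z); }
```

Starting from z = 4 and x = 3, the postfix version returns 12 and the prefix version returns 15; in both cases z ends up as 5.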
    
    Unwanted Functional Side Effects 
    
    o  A side effect is anything that changes the environment of a program during
       execution; imperative languages are built on side effects. A functional
       side effect occurs in the time between a function call and the function's
       return
    
    o  There is a potential "unwanted side-effect" when a function changes a 
       two-way parameter or a non-local (global or static) variable:
    
          int b, a = 10;
          b = a + fun(&a);  /* assume fun changes a to 5 and returns it*/
          What is the value of b?
          If 'a' becomes 5 before addition, then b=10, otherwise b=15
             
    o  Solution 
       1. Disallow functional side effects in language ; no two-way parameters or 
          non-local references in functions; Disadvantage: inflexibility 
    
       2. Demand operand evaluation order be fixed in language definition
          Disadvantage: limits some compiler optimizations
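Because C does not fix the evaluation order of the operands of +, the two outcomes of b = a + fun(&a) can be reproduced by spelling out each order with temporaries. This is a sketch: fun is a stand-in for the function assumed in the example above (it sets its argument to 5 and returns it), and the other names are invented.

```c
/* Stand-in for the fun in the notes: changes *p to 5 and returns it. */
int fun(int *p) { *p = 5; return *p; }

/* Left operand fetched before the call: 10 + 5. */
int left_operand_first(void)
{
    int a = 10;
    int left = a;          /* reads 10 */
    int right = fun(&a);   /* a becomes 5, call returns 5 */
    return left + right;
}

/* Call made before the left operand is fetched: 5 + 5. */
int function_call_first(void)
{
    int a = 10;
    int right = fun(&a);   /* a becomes 5, call returns 5 */
    int left = a;          /* reads 5 */
    return left + right;
}
```

Writing the order out with temporaries, as here, is also the usual way to make such code portable when the language leaves the order unspecified.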
    
     OVERLOADED OPERATORS
    
    o  Use of an operator for more than one purpose is called operator overloading
    o  Some are easy to understand (e.g., + for int and float)
    o  Some are not (*  in C/C++ is both multiplication and pointer dereferencing)
    o  Loss of readability if meaning is not intuitive
    o  Avoided by use of new symbols (e.g., Pascal's div for integer division)
    o  C++ and Ada allow user-defined overloaded operators
    o  Potential problems: 
       Users can define nonsense operations 
       Can increase code complexity
    
       Example in JavaScript of overloaded + operator:
     
       // JavaScript is dynamically typed but does not behave like Perl
       stuff = prompt('Enter an integer or a string:');
       myInt = 5;
       // + is overloaded to accept numbers or strings
       // by default myInt is coerced to a string - unlike Perl
       area.innerHTML = stuff + ' + ' + myInt + '=' + stuff + myInt;
       // if stuff is an integer OK, otherwise returns NaN
       // the parens around parseInt MUST be there or + is string concat
       area2.innerHTML = stuff + ' + parseInt(' + myInt + ')=' + (parseInt(stuff)+myInt);

    o Advantage: overload '=' operator to prevent cross-linked pointers in C++; (see C++ example)
    TYPE CONVERSIONS

    A narrowing conversion converts an object to a type that reduces the precision or range of values of the original type, e.g., float to int or int to short.

    A widening conversion converts an object to a type that increases precision or the range of values of original type e.g., int to float or short to int. There are some standards but see /usr/include/limits.h for the limits on your specific compiler. limits.h for sleipnir is shown below:

    Type                 Bytes  Bits  Range
    short int              2     16   -32,768 -> +32,767
    unsigned short int     2     16   0 -> +65,535
    unsigned int           4     32   0 -> +4,294,967,295
    int                    4     32   -2,147,483,648 -> +2,147,483,647
    long int               4     32   -2,147,483,648 -> +2,147,483,647
    signed char            1      8   -128 -> +127
    unsigned char          1      8   0 -> +255
    float                  4     32
    double                 8     64
    long double           12     96
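The limits for a particular compiler can also be printed from the macros in limits.h rather than read from the header. The sketch below uses only standard macros (CHAR_BIT, SHRT_MIN, INT_MAX, and so on); the helper names bits_in and print_limits are invented, and the output will differ between machines.

```c
#include <limits.h>
#include <stdio.h>

/* Number of bits in an object of the given size in bytes. */
int bits_in(size_t bytes) { return (int)(bytes * CHAR_BIT); }

/* Print size and range of the common integer types on this compiler. */
void print_limits(void)
{
    printf("short:          %2d bits  %d -> %d\n",
           bits_in(sizeof(short)), SHRT_MIN, SHRT_MAX);
    printf("unsigned short: %2d bits  0 -> %u\n",
           bits_in(sizeof(unsigned short)), (unsigned)USHRT_MAX);
    printf("int:            %2d bits  %d -> %d\n",
           bits_in(sizeof(int)), INT_MIN, INT_MAX);
    printf("unsigned int:   %2d bits  0 -> %u\n",
           bits_in(sizeof(unsigned)), UINT_MAX);
    printf("long:           %2d bits  %ld -> %ld\n",
           bits_in(sizeof(long)), LONG_MIN, LONG_MAX);
}
```

Calling print_limits() from main prints one line per type; the C standard only guarantees minimum ranges (for example, short is at least 16 bits), so the exact numbers are compiler specific.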
    (see types.c)

    Mixed Mode
    o  A mixed-mode expression contains operands of different types
    o  A coercion is an implicit type conversion made by the compiler or runtime system
       Disadvantage: decreases the type error detection ability of the compiler
    o  In most languages, numeric types are coerced using widening conversions
    o  In C++, polymorphism uses implicit coercions from derived to base class (upcasting) (see C++ code)

    Explicit Type Conversions
    o  Called casting in C-based languages. Examples:
          C:   int sum = 100; int num = 15; float avg = (float) sum / num;
          C++: static_cast<int>(num)

    Errors in Expressions
    o  Inherent limitations of arithmetic; e.g., division by zero
    o  Limitations of computer arithmetic; e.g., overflow
    o  either ignored by the run-time system or will give compiler-specific results:
          num = __INT_MAX__:   2147483647   01111111111111111111111111111111
          num2 = num + 1:     -2147483648   10000000000000000000000000000000
          num + num2 =                 -1   11111111111111111111111111111111
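Since the run-time system may silently ignore these errors, a C program that cares has to test for them before performing the operation. The following is one possible sketch; add_would_overflow and safe_divide are hypothetical helper names, not library functions.

```c
#include <limits.h>

/* Returns 1 if a + b would overflow int. The check is done before the
   addition because signed overflow is undefined behavior in C. */
int add_would_overflow(int a, int b)
{
    if (b > 0) return a > INT_MAX - b;
    if (b < 0) return a < INT_MIN - b;
    return 0;
}

/* Stores num / den in *result and returns 1, or returns 0 when the
   division is undefined (zero divisor, or INT_MIN / -1 overflow). */
int safe_divide(int num, int den, int *result)
{
    if (den == 0 || (num == INT_MIN && den == -1))
        return 0;
    *result = num / den;
    return 1;
}
```

With these helpers, add_would_overflow(INT_MAX, 1) reports the overflow shown in the output above, while INT_MAX + INT_MIN is reported as safe.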

    RELATIONAL AND BOOLEAN EXPRESSIONS

    Relational Expressions
    o  consists of relational operators and operands of various types
    o  evaluates to some boolean representation (e.g. T or 1)
    o  operator symbols vary among languages; e.g. not equal: !=, /=, .NE., <>, #
    o  relational expressions are a type of boolean expression
    o bitwise boolean operators are not boolean expressions (do not evaluate to T/F)
    
    Boolean Expressions
    o a boolean expression evaluates to T or F (or some representation of T/F)  
    o  boolean operators are: and, or, not, xor 
    o  most modern languages use C notation: && is AND, || is OR, ! is NOT (no XOR)
    o  operands are also boolean expressions; e.g. ((5 > 3) || (7 == 3)) is true  
    
    Languages Without a Boolean Type 
    o  C was designed without a boolean type (C99 later added _Bool)--it uses int type: 0 is false and nonzero is true
    o  For C's relational expressions, associativity is L to R: 
               a < b < c;  (legal code) 
    o  a and b are compared, producing 0 or 1; the result (0 or 1) is compared w/ c
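The effect of left-to-right associativity on a < b < c can be seen by comparing it with the test that was probably intended. In this sketch the function names are invented, and the grouping C applies is written out with explicit parentheses.

```c
/* What C computes for a < b < c: (a < b) yields 0 or 1,
   and that 0 or 1 is then compared with c. */
int chained_less(int a, int b, int c) { return (a < b) < c; }

/* The mathematical meaning of a < b < c. */
int strictly_increasing(int a, int b, int c) { return (a < b) && (b < c); }
```

For a = 3, b = 2, c = 1 the chained form is true (3 < 2 gives 0, and 0 < 1), even though the values are strictly decreasing.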
    
    Operator Precedence 
    o  Precedence in C-like languages:
          !
          <, >, <=, >=
          ==, !=
          &&
          ||                    
    
    
    SHORT CIRCUIT EVALUATION

    If the result of an expression can be determined without evaluating all operands, you can stop evaluation; i.e., short-circuit evaluation. Example:

         (13*a) * (b/13-1)       # if 'a' is zero, no need to evaluate (b/13-1) 
    

    A disjunctive boolean expression (clauses separated by ORs) can be short-circuited after the first true in the expression:

         ( (5 < 7) || (A > B) || (C == D) )     # stop at (5 < 7) 
    
    A conjunctive boolean expression (separated by ANDs) can be short-circuited after the first false in the expression:
         ( (5 > 7) && (A > B) && (C == D) )     # stop at (5 > 7) 
    

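    Without short-circuit evaluation, this loop has a problem:
       int LIST[MAX];
       int index = 0;
       while ( (index < MAX) && (LIST[index] != value) )
          index++;

    When index == MAX, evaluating LIST[index] reads past the end of the array (undefined behavior in C; an out-of-bounds exception in Java). With short-circuit evaluation, LIST[index] is never evaluated once (index < MAX) is false.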
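The search loop above can be packaged as a C function that relies on the short-circuit behavior of &&. This is a minimal sketch; find_index and its parameters are names chosen for illustration.

```c
#include <stddef.h>

/* Returns the index of the first element equal to value, or n if the
   value is absent. Because && short-circuits, list[index] is never
   read once index reaches n. */
size_t find_index(const int list[], size_t n, int value)
{
    size_t index = 0;
    while (index < n && list[index] != value)
        index++;
    return index;
}
```

Swapping the two operands of && would compile but would read list[n] on an unsuccessful search, which is exactly the problem described above.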

  • C, C++, Java use short-circuit evaluation for the usual Boolean operators (&& and ||)
  • bitwise Boolean operators (& | ^ ) are NOT short circuit
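  • short-circuit evaluation interacts with side effects in expressions: a side effect in a skipped operand never happens, and a side effect in an earlier operand is visible to later ones
    	    ((stuff[index++] == 99) || (index < SIZE))   #  what index do you mean? 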
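One way to see the difference between && and & in C is to count how often an operand with a side effect is evaluated. The sketch below uses an invented helper, is_positive, whose side effect is to bump a counter.

```c
static int calls = 0;   /* how many times is_positive has run */

/* Operand with a side effect: increments the call counter. */
int is_positive(int x) { calls++; return x > 0; }

/* Logical &&: the second call is skipped when the first is false. */
int calls_with_logical_and(int a, int b)
{
    calls = 0;
    int result = is_positive(a) && is_positive(b);
    (void)result;       /* only the call count matters here */
    return calls;
}

/* Bitwise &: both operands are always evaluated. */
int calls_with_bitwise_and(int a, int b)
{
    calls = 0;
    int result = is_positive(a) & is_positive(b);
    (void)result;
    return calls;
}
```

With a = -1 and b = 5, the && version makes one call and the & version makes two, so any side effect in the second operand happens only in the bitwise case.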
    

    ASSIGNMENT STATEMENTS

    Expressions can be part of a condition statement ((num+num2/5)>num3) or part of an output statement (cout << num + num2). But the most common use of expressions is to be the right hand side of an assignment statement. BNF syntax for an assignment statement:
    	<target_var> <assign_operator> <expression>
    
    The assignment operator differs by language
    ' = ' FORTRAN, BASIC, PL/I, C, C++, Java, Perl, php,...
    ' := ' ALGOL, Pascal, Ada
    *The use of '=' in C-like languages is problematic since '=' means equality in mathematics

    Conditional Target on Assignment

    Supported in Perl and C++; not legal in C or Java, where a conditional expression cannot be the target of an assignment
    (flag ? total : subtotal) = 0 means: if (flag) total = 0 else subtotal = 0
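In standard C a conditional expression does not yield an lvalue, so it cannot appear directly on the left of '='. A similar effect can be sketched by selecting between pointers and assigning through the result; the function name clear_selected is invented for this example.

```c
/* Sets *total to 0 when flag is nonzero, otherwise sets *subtotal to 0.
   The conditional picks a pointer, and the assignment goes through it. */
void clear_selected(int flag, int *total, int *subtotal)
{
    *(flag ? total : subtotal) = 0;
}
```

Called as clear_selected(flag, &total, &subtotal), this behaves like the if/else form shown above.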

    Compound Assignment Operators

    A compound assignment statement is a shorthand method of assignment introduced in ALGOL 68 and adopted by C and all C-like languages. Example:
     a += b    # is shorthand for a = a + b
     a *= b    # is shorthand for a = a * b
    

    Unary Assignment Operators

    Unary assignment operators in C-based languages combine increment and decrement operations with assignment; ex.
    	sum = ++count ( count incremented then assigned to sum )
    	sum = count++ ( count assigned to sum then incremented )
    	count++ (count incremented)
    
    *Modifying a variable more than once in the same statement, with no sequence point in between, is undefined behavior in C:
      count = 5; 
      count = -count++;   // what does this mean? should count be -4 or -6?
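Both readings of the ambiguous statement can be written unambiguously by putting each modification in its own statement. This is a sketch with hypothetical function names; each function shows one possible intent, not what a given compiler does with the original line.

```c
/* Reading 1: negate, then increment (5 -> -5 -> -4). */
int negate_then_increment(int count)
{
    count = -count;
    count++;
    return count;
}

/* Reading 2: increment, then negate (5 -> 6 -> -6). */
int increment_then_negate(int count)
{
    count++;
    count = -count;
    return count;
}
```

Splitting the statement this way gives every compiler the same result, at the cost of one extra line.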
    

    Assignment as an Expression

    In C-like languages, an assignment statement itself can be an expression; i.e., the result of the assignment becomes an operand for the expression; e.g.:
    	   while ((ch = getchar())!= EOF){. . .}
    
    ch = getchar() is carried out; the result is assigned to ch and becomes the left-hand operand of the != operator. The disadvantage shows up in languages that accept an integer as a condition and use '=' for assignment (as in C and C++; Java rejects this because the condition must be boolean):
       int n = 0; 
       if ( n = 0 )            // assignment is mistaken for an equality test; the condition is always false
          cout << "made it!";  
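The assign-and-test idiom and the '=' pitfall can both be shown in plain C. In this sketch the input comes from a string instead of getchar() so that it is easy to run, and both function names are made up.

```c
/* Counts characters with the assign-and-test idiom: each character is
   assigned to ch, and the assigned value is compared with '\0'. */
int count_chars(const char *s)
{
    int ch, n = 0;
    while ((ch = *s++) != '\0')
        n++;
    return n;
}

/* The value of an assignment expression is the value assigned, so a
   condition written as (n = 0) is always false in C. */
int value_of_assignment(void)
{
    int n = 5;
    int result = (n = 0);   /* n becomes 0 and result is 0 */
    return result;
}
```

Many compilers warn about an assignment used directly as a condition unless it is wrapped in an extra pair of parentheses, which is why the idiom is usually written as in count_chars.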
    

    Mixed Mode Assignment

    Assignment statements, just like expressions, can be mixed-mode to varying degrees (depending on the type rules of the language):
    	int a, b;
    	float c;
    	c = a / b;
    
  • In Java, only widening assignment coercions are supported (see docs)
  • In Ada, there is no assignment coercion
  • C/C++ support both widening and narrowing coercions in assignments (the compiler may issue a warning for a narrowing coercion, which can be ignored)
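The mixed-mode assignment c = a / b shown above is a common source of surprise in C, because the division is done on ints before the result is widened. The following sketch contrasts it with an explicit cast and with a narrowing assignment; all function names are invented.

```c
/* Integer division happens first (7 / 2 is 3); the int result is then
   widened to float by the assignment. */
float quotient_truncated(int a, int b)
{
    float c = a / b;
    return c;
}

/* Casting one operand makes the division itself floating point. */
float quotient_exact(int a, int b)
{
    float c = (float)a / b;
    return c;
}

/* Narrowing coercion on assignment: the fractional part is discarded. */
int narrow(double d)
{
    int n = d;
    return n;
}
```

With a = 7 and b = 2 the first function returns 3.0 and the second 3.5; narrow(3.9) returns 3.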