Chapter 9 - Subprograms

C++ code
Ada code
C# code
Python code
 
SUBPROGRAMS

+  also called a subroutine, a function, a procedure, a method
+  a collection of reusable statements - a process (not data) abstraction 
+  has a single entry point
+  suspends execution of the calling program during execution of the called 
   procedure 
+  returns control to caller when the called procedure terminates
+  includes a mechanism to pass parameters for parameterized computations and 
   to return a value
+  that is semantically modeled on a mathematical function should only return 
   a value and not produce side effects (i.e. functional languages) 
 
SUBPROGRAM TERMINOLOGY

+ A procedure definition describes interface and actions of the subprogram

+ A procedure call is an explicit request for the procedure to be 
  executed

+ A procedure header is the first part of the definition, includes 
name, type of procedure, and formal parameters

+ The parameter profile (aka signature) of a procedure is the number, 
  order, and types of its parameters

+ The protocol is a procedure's parameter profile and, if it is a 
  function, its return type  

+ A procedure declaration (or prototype) provides the protocol, but not 
the body, of the procedure 

A formal parameter is the variable name listed in the procedure header
and used in the procedure

An actual parameter (also called argument) represents a value or 
address used in the procedure call statement 

/* Actual to Formal Parameter Correspondence */ int test (int num) { # num is the formal parameter return num++; } int x = 5; test(x); # x is the actual parameter
Positional binding is the binding of actual parameters to formal parameters by position: the first actual parameter is bound to the first formal parameter and so forth (safe and effective); C/C++/Java do this. Keyword binding occurs when the name of the formal parameter to which an actual parameter is to be bound is specified with the actual parameter - parameters can appear in any order - (see Python) Formal Parameter Default Values In certain languages (e.g., C++, Ada), formal parameters can have default values (if not actual parameter is passed) In C++, default parameters must appear last because parameters are positionally associated (ex. C++) Variable Number of Parameters Python supports variable length parameter lists through *name syntax; C# methods accept a variable number of parameters of the same type DESIGN ISSUES + What parameter passing methods are provided? (in C pass-by-value only) + Are parameter types checked? (in C yes but in Perl no) + Are local variables static or dynamic? (in C dynamic by default) + Can procedures be nested? (not in ANSI C but yes in GNU C) + Are procedures treated as first class objects? (not in C but yes in Python) + Can procedures be overloaded? (not in C but yes in C++) + Can procedure be generic? (C++ templates) LOCAL REFERENCING ENVIRONMENTS stack-dynamic local variables + this is the default practice + variables are bound/unbound to storage at runtime + supports recursion + storage for locals is shared among some procedures + run-time overhead of allocation/de-allocation and initialization + indirect addressing (inefficient and security problems) + procedures cannot be history sensitive static local variables + variables are bound/unbound to storage at compile time + more efficient (no indirection): see timing.c + no run-time overhead + cannot support recursion FIVE PARAMETER PASSING METHODS The three semantic modes (in mode, out mode and inout mode) are implemented by one of the following 5 mechanisms. Pass-by-Value (In Mode) + semantics: pass a value in; do not change it within the function or in caller + The R-value of the actual parameter is copied to the R-value of the formal parameter + copying requires additional storage and copy operation (can be costly) + All parameters in C and Java are pass-by-value ("pass by reference" is really passing a reference or pointer by value) (see Java code)
// use of const to enforce In Mode parameter passing semantics test(const int formal) { int result = 100/formal; formal = formal + 1; <= compilation error } int actual = 5; test(actual); // R-value of actual is copied to R-value of formal
Pass-by-Value-Result (In-out Mode) (also called pass-by-copy) + pass the value coming in and pass the value out upon exit (two copies) + On call, r-values of actuals are passed and copied to r-values of formals + On return, r-values of formals are copied back to r-values of actuals + has the disadvantage of pass-by-value (copying cost) + advantage over pass-by-reference is no dereferencing + this mechanism is implemented in Fortran. Pass-by-Reference (In-out Mode) (also called pass-by-sharing) + the l-value of the actual parameter is assigned to the l-value of the formal parameter; i.e., formal becomes an alias for the actual + the semantics supports an assignment statement in the callee that modifies the value in the calling routine
// demonstrate the difference between C++ pass-by-ref and // C pass-by-value pointer void foo(int & x){ x = 5; <== C++ modifies actual parameter } void foo (int * x) { x = 0; <== C does NOT modify actual parameter }
+ Passing process is efficient (no copying or duplicated storage) + more secure than the use of pointers - cannot modify the pointer address + see C++ code with reference type (&) Disadvantages: + potentials for un-wanted side effects (as with any out mode parameter) + un-wanted aliases (access broadened)
// demonstrate pass by reference in C# // uses keyword 'ref' (see code) void Foo (ref int x) { x = 0; // will modify y in calling routine } ... int y=5; Foo (ref y); Console.WriteLine ("{0} \n",y); // will print 0 >
Pass-by-Result (Out Mode) + no value is transmitted to procedure + the corresponding formal parameter must be an L-value; its value is transmitted to caller's actual parameter upon return + cannot be accessed until AFTER procedure call + if implemented by copy rather than access path (typically is) requires extra storage and copy operation
// demonstrate pass-by-result in C# // uses keyword 'out' keyword (see code) void Foo (out int x) { // int a = x; // compilation error since x is not in x = 10; // assignment must occur for out variable int a = x; // The value of x can be read now } int y; // declare variable with no value Foo (out y); // pass y in as an output parameter Console.WriteLine (y); // prints 10
Semantic Difficulties: Passing the same L-value to more than one result parameter: sub(p1, p1); // what is the current value of p1? Potential problem: evaluation time of actual parameters can impact result; if list[index] is passed in and index changed in the subroutine (as global or pass by result), the address of list[index] will differ between call and return. Pass-by-Name (In-out Mode) + Formals are bound to an access method at the time of the call, but actual binding to a value or address takes place at the time of a reference or assignment at runtime + Allows flexibility in late binding + Used by Algo 60, Miranda, Haskell
// demonstrate pass-by-name in C++ // implemented as a virtual function (polymorphism) Circle c; // Circle is a Shape test(c); void test (Shape & s){ s.update(); // late binding of s.update() // to Circle c object }
IMPLEMENTING PARAMETER-PASSING METHODS + In most languages parameter communication takes place according to the conventions of the language - parameters are loaded into specific registers or places on the call frame of the runtime stack + Pass-by-reference is the simplest to implement - the address is easily loaded into a register or on the call frame + A subtle but fatal error occurs with pass-by-reference and pass-by-value- result when a formal parameter corresponding to a constant is mistakenly changed - modern compilers store immutable values in .rodata segment + any security of pass-by-value is eliminated when the value passed is a pointer (see C code) Parameter Passing Methods of Major Languages + Fortran Always used the In-out semantics model Before Fortran 77: pass-by-reference Fortran 77 and later: scalar variables are often passed by value-result + C Everything is pass-by-value Pass-by-reference is "simulated" by passing a pointer by value + C++ pass-by-value, pass-by-reference, pass-by-name (see code) + Java Everything is pass-by-value (see docs) For object parameters, a reference is passed by value (see Java docs) in the same way that C passes pointers - except programmer has no access to the pointer (see code) + Ada Three semantics modes of parameter transmission: IN, OUT, IN OUT; IN is the default mode Formal parameters declared out can be assigned but not referenced; those declared in can be referenced but not assigned; in out parameters can be referenced and assigned (see code) + C# Default method: pass-by-value Pass-by-reference is specified by preceding both a formal parameter and its actual parameter with 'ref' Pass-by-result is specified by preceding both a formal parameter and its actual parameter with 'out' (see code) + PHP: Default method: pass-by-value Use & to pass by ref ; more examples of parameter passing + Perl: all actual parameters are implicitly placed in a predefined array named @_ (see test.perl) + Python: A parameter reference (address) is passed by value (like C and Java); immutable objects in the caller cannot be changed in the callee but mutable objects (lists for example) can be changed in the callee. see functions.py for examples. The author of Python (Guido van Rossum) calls this parameter passing scheme "call by object reference." (see code):
def one(): x = 1 # a scalar is immutable two(x) print "x:" x alist = [0,1] # a list is mutable three(alist) print "alist: " alist def two(y): y = 2 def three(blist): # binds blist to alist blist.append(2) # blist and alist become [0,1,2] blist = [] # binds blist to null list; alist not changed one() # prints x: 1 alist: [0,1,2]
TYPE CHECKING PARAMETERS + Considered essential for reliability + FORTRAN 77 and original C: none Pascal, FORTRAN 90, Java, and Ada: it is always required ANSI C and C++: choice is made by the user via function prototypes Relatively new languages Perl, JavaScript, and PHP do not require type checking (see code)
<?php $num = 5; Test($num,"hello"); Test("hello",$num); function test($first, $second) { echo "$first and $second <br>"; } ?>
Parametric Polymorphism refers to languages that generally enforce parameter type checking and yet also support shifting types at runtime. This ability is part of a wider concept known as generic programming. Ada (static, strongly typed) and C++ (static, somewhat strongly typed) both support Parametric Polymorphism. In Ada it is through the keyword 'generic' applied to types, subscript ranges, and constant values. In C++ it is through inheritance and the keyword 'virtual' applied to base class methods.
generic type element is private; type Index is (<>); type Vector is array (Index) of Element; procedure Generic_Sort (List) : in out Vector) is Temp : Element; Lower_Bound : List'First; Upper_Bound : List'Range; Index_1: List'Range; Index_2 : List'Range; Outer_limit : List'Range; Inner_Begin : List'Range; begin Outer_Limit : Upper_Bound - 1; for Index_1 in Lower_Bound .. Outer_Limit loop Inner_Begin := Index_1 + 1; for Index_2 in Inner_Begin .. Upper_Bound loop if List(Index_1) > List(Index>2) then temp := List(Index_1); List(Index_1) := List(Index_2); List (Index_2) := Temp; end if; end loop; -- for Index_1 end Loop; -- for Index_2 end Generic_Sort;
MULTI-DIMENSIONAL ARRAYS AS PARAMETERS If a multi-dimensional array is passed to a procedure, the compiler needs the declared size of that array to build the storage mapping function (this makes procedures inflexible) In languages where n-dimen arrays are implemented as a single dimension, such as C and C++, the declared sizes of all but the first subscript must be included in the actual parameter. (see C code) Solution: pass a pointer to the array and the sizes of the dimensions as +ther parameters; the user must include the storage mapping function in terms of the size parameters Java and C# o Arrays are objects; they are all single-dimensioned, but the elements can be arrays o Each array inherits a named constant (length in Java, Length in C#) that is set to the length of the array when the array object is created PHP o all arrays (indexed or associative) are internally stored as associative Design Considerations for Parameter Passing + A tradeoff between efficiency (two-way) and security (one-way) + Security demands limited access to variables; i.e., one-way whenever possible + Pass-by-reference is more efficient than any other method - no dereferencing to acquire memory address; const & is the safest and most efficient. SUBPROGRAMS PASSED AS PARAMETERS + Are parameter types checked? + What is the correct referencing environment for the procedure? In C/C++, a function can be passed as a parameter by passing a pointer to the function; although void * is used - functions as parameters are type checked (see C code) PHP has built-in support for passing functions in functions run functions.php In Python functions are treated like first-class objects (like data); functions can be passed to a function and like any other data parameters are not checked. See functions.py. NESTED SUBPROGRAMS PASSED AS PARAMETERS ANSI C/C++ do not support nested procedures. Ada, GNU gcc, and JavaScript do. Referencing Environment for Nested Procedures Deep binding: The environment of the definition of the passed procedure (applicable to languages that use static scoping such as JavaScript) Shallow binding: The environment of the call statement that enacts the passed procedure (applicable to languages that use dynamic scoping) Ad hoc binding: The environment of the call statement of the passed procedure - binding depends on the function that the function is passed to
// JavaScript Example // What is the referencing environment of sub2? function sub1(){ var x = 1; function sub2(){ //x=1 if environment of function definition (lexical scope) alert(x); // this is deep binding }; function sub3(){ var x = 2; sub4(sub2); //x=2 if environment of procedure that passed sub2 }; // this is shallow binding function sub4(subx){ var x = 3; subx(); //x=3 if dynamic environment of function execution } // this is ad hoc binding sub3(); };
OVERLOADED SUBPROGRAMS
An overloaded procedure has the same name as another procedure in the same 
referencing environment - each version of the overloaded procedure must have a 
unique protocol

C++, Java, C#, and Ada include predefined overloaded procedures 

In Ada, the return type of an overloaded function can be used to disambiguate 
calls (thus two overloaded functions can have the same parameters)

Ada, Java, C++, and C# allow users to write multiple versions of procedures 
with the same name (e.g. name mangling) (see code)

PHP does not support function overloading, nor is it possible to undefine or 
redefine previously-declared functions.

GENERIC PROCEDURES
A generic or polymorphic procedure takes parameters of different types on 
different activations - Overloaded procedures provide ad hoc polymorphism

A procedure that takes a generic parameter in place of the type expression for
the parameters of the procedure provides parametric polymorphism

Examples of parametric polymorphism: C++

    template <class Type>
    Type max(Type first, Type second) {
        return first > second ? first : second;
    }
    
    The above template can be instantiated for any type for which operator > 
    is defined
    
    int max (int first, int second) {
    return (first > second) ? first : second;
    }

 DESIGN ISSUES FOR FUNCTIONS
+ Are side effects allowed? 
+ Parameters should always be in-mode to reduce side effect (like Ada)
+ What types of return values are allowed? 
  - most imperative languages restrict the return types
  - C allows any type except arrays and functions
  - C++ is like C but also allows user-defined types
  - Ada allows any type
  - Java and C# do not have functions but methods can have any type

USER-DEFINED OVERLOADED OPERATORS
An Ada example to overload * to support vectors (can also do this in C++):
Function R*S(A,B: in Vec_Type): return Integer is Sum: Integer := 0; begin for Index in A <== range loop Sum := Sum + A(Index) * B(Index) end loop return sum; end R*S; c = a * b; <== where a, b, and c are of type Vec_Type
COROUTINES (see wiki) Threads simulate simultaneous execution of multiple processes - coroutines involve synchronized execution between two processes P1 & P2: P1 -> P2 -> P1 -> P2,... Coroutines may yield and resume indefinitely Threads are designed to solve complex and processor intensive problems - using threads to solve a problem that could be solved with coroutines is an overkill In coroutines the caller and called coroutines are on an equal basis A traditional subroutine has one entry (the call) and one exit (the return) In coroutines there may be multiple entries and exits (a subroutine is a special case of coroutine) The first time a coroutine is invoked, execution starts at the beginning of the coroutine; each subsequent time a coroutine is invoked, execution resumes following the place where the coroutine last yielded A coroutine can return multiple times, making it possible to return additional values upon subsequent calls to the coroutine. Coroutines in which subsequent calls yield additional results are often known as generators. Example - producer yields to consumer and consumer yields to producer
var q := new queue coroutine produce loop while q is not full create some new items add the items to q yield to consume coroutine consume loop while q is not empty remove some items from q use the items yield to produce
Python generators * a Python generator is a specialized form of a coroutine Python coroutines