Homework 1 - Chapters 1, 2, 5

Due: Friday, January 18, 2013, by midnight
  1. (2 pts) Translate the following C/C++ loop into MIPS assembly. Give JUST the instructions associated with this loop, not a full MIPS assembly program. Assume the base address of the array a is already stored in register $s2, the variable i is already stored in register $s1, and the variable size is already stored in register $s3.
        for(i = 0; i < size; i++)
        {
          a[i] = i + 5;
        }
        i = 0;
    
  2. (4 pts) Translate the MIPS assembly you created in your answer to Question 1 into binary instructions. You may use any address you like for the immediate portion of the branch instruction.
  3. (2 pts) Translate the following MIPS assembly snippet back into a C/C++ loop:
          move  $v0, $zero      # initialize v0 to 0
          move  $t0, $zero      # initialize t0 to 0
    Loop: beq   $t0, $s0, Exit  # if t0 == s0, go to Exit
          sll   $t1, $t0, 2     # t1 = t0 * 4
          add   $t1, $t1, $s2   # t1 = s2 + (t0 * 4)
          lw    $t2, 0($t1)     # bring element in from memory
          add   $v0, $v0, $t2   # v0 += t2
          sw    $t2, 0($t1)     # write the value back to memory
          addi  $t0, $t0, 1     # increment t0
          j     Loop
    Exit: 
    
  4. (2 pts) What human-readable MIPS instruction is represented by the following binary string?
        1000 1101 0010 1000 0000 0100 1011 0000
    
  5. (2 pts) Convert the decimal number 1054 to binary and hexadecimal.
  6. (2 pts) Consider a matrix (2D array) implemented in C or C++. One may traverse the matrix either row by row or column by column. Which of these two traversal methods is better suited to a cache that takes advantage of spatial locality? Justify your answer by describing how the 2D array would be arranged in main memory.
  7. (2 pts) What is the motivation behind using multiple levels of caches with different sizes and mapping schemes (e.g. a level 1 cache with 16 direct-mapped blocks and a level 2 cache with 64 2-way set-associative blocks)?
  8. (4 pts) You are evaluating a design for the instruction cache. The design is a 16-block direct-mapped cache with one instruction per block. One instruction is stored into the cache on each cache miss.

    The cache row address and tag for this cache will be calculated as follows:

               31 ... 6|5 ... 2|1 0 
               --------------------
              |  Tag   |  Row  |0 0|  instruction address
               --------------------
                                /|\
                                 |
                               ignore these bits (byte offset)
    
    The instructions being executed are:
        Address    Instruction
        =======    ============================
        4000d      Loop:   beq $s0, $zero, Exit   # immediate = 6, offset to Exit
        4004d              add $t0, $s0, $s2      # compute read address
        4008d              add $t1, $s0, $s3      # compute write address
        4012d              lw $t2, 0($t0)         # read data
        4016d              sw $t2, 0($t1)         # write data
        4020d              sub $s0, $s0, $s1      # subtract offset
        4024d              j Loop                 # immediate = 1000 which is 4000/4
        4028d      Exit:
    
    Fill in the following cache table and state how many cache misses this design has. Assume that the code starts executing at the Loop: label, that it executes for EXACTLY two iterations, and that the cache is empty at the start.

    Direct Mapped Cache - 1 instruction per block
    Row (4 bits)  Valid?  Tag (26 bits)  Data (1 instruction)
    0000 (0)      
    0001 (1)      
    0010 (2)      
    0011 (3)      
    0100 (4)      
    0101 (5)      
    0110 (6)      
    0111 (7)      
    1000 (8)      
    1001 (9)      
    1010 (10)      
    1011 (11)      
    1100 (12)      
    1101 (13)      
    1110 (14)      
    1111 (15)
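
Note on Question 8: the tag/row split shown in the address diagram can be computed with shifts and masks. The C sketch below is only an illustration, assuming exactly that layout (bits 31..6 tag, bits 5..2 row, bits 1..0 byte offset). The helper names cache_row, cache_tag, and cache_address are made up here and are not part of the assignment.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helpers, assuming the layout from the diagram:
   bits 31..6 = tag, bits 5..2 = row, bits 1..0 = byte offset (ignored). */

/* Row: drop the 2 byte-offset bits, then keep the low 4 bits. */
static uint32_t cache_row(uint32_t addr) {
    return (addr >> 2) & 0xFu;
}

/* Tag: everything above the 4 row bits and the 2 offset bits. */
static uint32_t cache_tag(uint32_t addr) {
    return addr >> 6;
}

/* Inverse mapping: rebuild the word-aligned address from a tag and a row. */
static uint32_t cache_address(uint32_t tag, uint32_t row) {
    assert(row < 16u); /* only 16 rows exist in this design */
    return (tag << 6) | (row << 2);
}
```

For an arbitrary address such as 5000 (deliberately not one from the table above), cache_row gives 2 and cache_tag gives 78, since 78 * 64 + 2 * 4 = 5000. Two addresses land in the same row exactly when they differ by a multiple of 64 bytes.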