Instruction Level Parallelism

What is it?

We have seen that pipelining overlaps the execution of instructions, so that multiple instructions execute in parallel. This potential overlap among instructions is called instruction level parallelism (ILP).

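However, we also saw that the amount of parallelism can be limited by hazards in the pipeline. We can characterize the performance of the pipeline by the effective CPI (cycles per instruction), where each stall term is the average number of stall cycles per instruction from that cause: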


       Effective Pipeline CPI = 
                                 Ideal Pipeline CPI
                                 + Structural Stalls
                                 + RAW Stalls
                                 + WAW Stalls
                                 + WAR Stalls
                                 + Control Stalls
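
As a rough illustration of the sum above, the following C sketch adds per-instruction stall cycles to an ideal CPI. The struct, its field names, and the function name are hypothetical and chosen only for this example; the numbers a caller passes in would have to come from measurement or simulation.

```c
/* Average stall cycles per instruction, broken down by cause.
   The struct and field names are illustrative assumptions,
   not part of the notes. */
struct stall_cycles {
    double structural;  /* structural hazard stalls */
    double raw;         /* read-after-write stalls  */
    double waw;         /* write-after-write stalls */
    double war;         /* write-after-read stalls  */
    double control;     /* branch/control stalls    */
};

/* Effective CPI = ideal CPI + sum of all stall cycles per instruction. */
double effective_cpi(double ideal_cpi, const struct stall_cycles *s)
{
    return ideal_cpi + s->structural + s->raw + s->waw + s->war + s->control;
}
```

For instance, with an ideal CPI of 1.0, 0.25 RAW stall cycles and 0.125 control stall cycles per instruction, and no other stalls, the function returns 1.375: the pipeline takes 37.5% more cycles than it would with no stalls.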

We will look at techniques to minimize the effect of these stalls. There are two types of techniques:

       Static:   compile-time techniques
       Dynamic:  run-time techniques

The simplest source of instruction level parallelism is in loops.

Consider the loop:

          for( i=1; i <= 1000; i++)
               x[i] = x[i] + s;

We can compile this to:

          Loop: ld    f0, 0(r1)     ; f0 is array element
                addd  f4, f0, f2    ; add scalar in f2
                sd    f4, 0(r1)     ; store result into array
                subi  r1, r1, 8     ; decrement pointer (8 bytes)
                bnez  r1, Loop      ; branch if not done
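
Note that the compiled loop walks the array backwards: r1 is decremented by 8 each iteration and the loop exits when it reaches zero. The C sketch below mirrors that structure so the correspondence with the source loop is easier to see. It assumes (the notes do not state this) that r1 starts at the last element and can be modeled as a byte offset from the array base that reaches zero just after x[1] is processed. The function name and parameters are hypothetical.

```c
#include <stddef.h>

/* Adds s to x[1..n], walking from x[n] down to x[1] like the assembly loop.
   Assumption: x has at least n+1 elements and x[0] is left untouched.
   offset stands in for r1, modeled as a byte offset from the array base. */
void add_scalar_downward(double *x, size_t n, double s)
{
    if (n == 0)
        return;                           /* the assembly assumes at least one iteration */

    size_t offset = n * sizeof(double);   /* r1 starts at the last element */
    do {
        double *elem = (double *)((char *)x + offset);  /* address 0(r1) */
        *elem = *elem + s;                /* ld, addd, sd */
        offset -= sizeof(double);         /* subi r1, r1, 8 (8-byte doubles) */
    } while (offset != 0);                /* bnez r1, Loop */
}
```

Calling add_scalar_downward(x, 1000, s) on an array of 1001 doubles performs the same updates as the original for loop, in the opposite order. The order does not matter here because each iteration touches a different element.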

How would this run on the DLX pipeline?

