However, we also saw that the amount of parallelism can be limited by hazards in the pipeline. We can characterize the performance of the pipeline by the effective CPI:
Effective Pipeline CPI = Ideal Pipeline CPI + Structural Stalls + RAW Stalls + WAW Stalls + WAR Stalls + Control Stalls
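As a rough illustration of how the terms combine, the sketch below adds the ideal CPI to the average stall cycles per instruction for each hazard class. The struct, the function name, and the sample numbers in the usage note are hypothetical and are not measurements from any real pipeline.

```c
#include <assert.h>

/* Hypothetical container: average stall cycles per instruction,
   one field per hazard class in the CPI formula. */
struct stall_profile {
    double structural;
    double raw;
    double waw;
    double war;
    double control;
};

/* Effective CPI = ideal CPI + sum of the per-instruction stall terms. */
double effective_cpi(double ideal_cpi, struct stall_profile s)
{
    assert(ideal_cpi > 0.0);   /* a pipeline cannot have a non-positive CPI */
    return ideal_cpi + s.structural + s.raw + s.waw + s.war + s.control;
}
```

For example, with an ideal CPI of 1.0 and made-up stall rates of 0.05 (structural), 0.20 (RAW), and 0.15 (control), the function returns about 1.40, so the pipeline runs roughly 40% slower than its ideal rate.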
We will look at techniques to minimize the effect of these stalls.
There are two types of techniques:
for (i = 1; i <= 1000; i++)
    x[i] = x[i] + s;
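Assuming r1 initially holds the address of x[1000], f2 holds the scalar s, and the array is laid out so that r1 reaches 0 after x[1] is processed, we can compile this to the following loop. It walks the array from the highest address down, which is safe because the iterations are independent: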
Loop: ld   f0, 0(r1)    ; f0 is array element
      addd f4, f0, f2   ; add scalar in f2
      sd   f4, 0(r1)    ; store result into array
      subi r1, r1, 8    ; decrement pointer (8 bytes)
      bnez r1, Loop     ; branch if not done
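To make the correspondence with the source loop explicit, here is a minimal C sketch of what the assembly does, with r1 modeled as a byte offset that steps down by 8 until it reaches 0. The function name is hypothetical, and the sketch assumes sizeof(double) is 8 and that x has n+1 slots so that x[1] through x[n] are valid. Like the assembly, it tests the exit condition at the bottom of the loop; the n == 0 guard is an addition that the assembly does not have.

```c
#include <assert.h>
#include <stddef.h>

/* Mirrors the assembly loop: processes x[n] down to x[1], leaving x[0] alone. */
void add_scalar_descending(double *x, size_t n, double s)
{
    assert(x != NULL);
    if (n == 0)
        return;                              /* guard not present in the assembly */

    size_t r1 = n * sizeof(double);          /* byte offset of x[n] */
    do {
        double *elem = (double *)((char *)x + r1);
        *elem = *elem + s;                   /* ld, addd, sd */
        r1 -= sizeof(double);                /* subi r1, r1, 8 */
    } while (r1 != 0);                       /* bnez r1, Loop */
}
```

Calling add_scalar_descending(x, 1000, s) gives the same final array as the original for loop, since each element is updated exactly once regardless of direction.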