| loop: | LW | R1, 0(R2) |
| ADDI | R1,R1,#1 | |
| SW | R1, 0(R2) | |
| ADDI | R2, R2, #4 | |
| SUB | R4, R3, R2 | |
| BENZ | R4, loop |
Throughout this Exercise use the DLX integer pipeline and assume all memory accesses are cache hits.
a) Show the timing of this instruction sequence for the DLX pipeline without any forwarding but assuming a register read and a write in the same clock cycle "forwards" through the register file. Assume that a branch is handled by flushing the pipeline. If the memory references hit in the cache, how many cycles does this loop take to execute?
Solution:
| 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 | 13 | 14 | 15 | 16 | 17 | ||||
| l: | LW | R1,0(R2) | IF | ID | EX | ME | WB | |||||||||||||
| ADDI | R1,R1,#1 | IF | stall | stall | ID | EX | ME | WB | ||||||||||||
| SW | R1, 0(R2) | IF | stall | stall | ID | EX | ME | WB | ||||||||||||
| ADDI | R2,R2,#4 | IF | ID | EX | ME | WB | ||||||||||||||
| SUB | R4,R3,R2 | IF | stall | stall | ID | EX | ME | WB | ||||||||||||
| BENZ | R4, l | IF | stall | stall | ID | EX | MEM | WB | ||||||||||||
| IF | stall | stall | IF |
b) Show the timing of this instruction sequence for the DLX pipeline with normal forwarding hardware. Assume that the branch is handled by predicting it as not taken. If the memory references hit in the cache, how many cycles does this loop take to execute?
Solution:
| 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | ||||||
| l: | LW | R1,0(R2) | IF | ID | EX | ME | WB | ||||||
| ADDI | R1,R1,#1 | IF | ID | stall | EX | ME | WB | ||||||
| SW | R1, 0(R2) | IF | stall | ID | EX | ME | WB | ||||||
| ADDI | R2,R2,#4 | IF | ID | EX | ME | WB | |||||||
| SUB | R2,R3,R2 | IF | ID | EX | ME | WB | |||||||
| BENZ | R4, l | IF | ID | EX | ME | WB | |||||||
| if not taken | IF | idle | idle | idle | |||||||||
| l: (if taken) | IF |
c) Assuming the DLX pipeline with a single-cycle delayed branch and forwarding hardware, schedule the instructions in the loop including the branch delay slot. You may reorder instructions and modify the individual instruction operands. Show a pipeline timing diagram and compute the number of cycles needed to execute the entire loop.
Solution:
| 1 | 2 | 3 | 4 | 5 | 6 | |||||||
| l: | LW | R1,0(R2) | IF | ID | EX | ME | WB | |||||
| ADDI | R2,R2,#4 | IF | ID | EX | ME | WB | ||||||
| SUB | R4,R3,R2 | IF | ID | EX | ME | WB | ||||||
| ADDI | R1,R1,#1 | IF | ID | EX | ME | WB | ||||||
| BENZ | R4, l | IF | ID | EX | ME | WB | ||||||
| SW | R1, 0(R2) | IF | ID | EX | ME | WB | ||||||
| IF |