Consider the following pipeline with 8 stages for a version of DLX:
| Stage | Function |
|---|---|
| IF1 | Instruction fetch starts |
| IF2 | Instruction fetch completes |
| ID | Instruction decode and register fetch; begin computing branch target |
| EX1 | Execution starts; branch condition tested; finish computing branch target |
| EX2 | Execution completes; effective address or ALU result available |
| MEM1/ALUWB | First part of memory cycle plus WB of ALU operation |
| MEM2 | Memory access completes |
| LWB | Write back for a load instruction |
a) How many register read/write ports are required?
b) For each possible type of instruction source and each possible type of instruction destination, show a code example that depicts all possible forwarding requirements (not stalls).
c) Show the same information as part (b) but for stalls rather than forwards.
d) Assuming a predict-not-taken strategy, find the branch penalty for a taken branch and for an untaken branch. Assume that a predicted instruction can be executed up to, but not including, a pipe stage that does a write back.
We need 2 read ports, because an instruction has at most two register source operands and both are read in a single clock cycle in the ID stage.
We need 2 write ports, because the MEM1/ALUWB stage of an ALU instruction and the LWB stage of an earlier load can fall in the same clock cycle.
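The write-port argument can be checked with a small simulation. The following is a minimal Python sketch, not part of the original solution. It assumes an ideal pipeline that issues one instruction per cycle with no stalls, and the names `WRITE_STAGE`, `write_cycle`, and `max_writes_per_cycle` are hypothetical helpers introduced only for illustration.

```python
# Stage order taken from the pipeline table in the problem statement.
STAGES = ["IF1", "IF2", "ID", "EX1", "EX2", "MEM1/ALUWB", "MEM2", "LWB"]

# Assumption: an ALU instruction writes the register file in MEM1/ALUWB,
# a load writes it in LWB, and every other instruction writes nothing.
WRITE_STAGE = {"alu": "MEM1/ALUWB", "load": "LWB"}


def write_cycle(issue_cycle, kind):
    """Clock cycle in which an instruction that enters IF1 at issue_cycle
    writes the register file, or None if it does not write."""
    stage = WRITE_STAGE.get(kind)
    if stage is None:
        return None
    return issue_cycle + STAGES.index(stage)


def max_writes_per_cycle(kinds):
    """Largest number of register writes that land in the same cycle."""
    counts = {}
    for i, kind in enumerate(kinds):
        cycle = write_cycle(i + 1, kind)  # instruction i enters IF1 in cycle i + 1
        if cycle is not None:
            counts[cycle] = counts.get(cycle, 0) + 1
    return max(counts.values(), default=0)


# A load followed two instructions later by an ALU operation:
# the load reaches LWB and the ALU operation reaches MEM1/ALUWB in cycle 8.
print(max_writes_per_cycle(["load", "other", "alu"]))  # prints 2
```

A sequence of only ALU instructions, or only loads, never produces more than one write per cycle under these assumptions; the second port is needed only for the load/ALU overlap.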
ALU - ALU / ALU - Branch
1 ALU instr R1, _ , _
2 any instr
3 ALU instr _ , R1, _ / BNEZ R1, _
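| Instr | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 |
|---|---|---|---|---|---|---|---|---|---|---|
| 1 | IF1 | IF2 | ID | EX1 | **EX2** | MEM1 | MEM2 | LWB | | |
| 2 | | IF1 | IF2 | ID | EX1 | EX2 | MEM1 | MEM2 | LWB | |
| 3 | | | IF1 | IF2 | ID | **EX1** | EX2 | MEM1 | MEM2 | LWB |

Bold marks the forwarding path: from the end of EX2 of instruction 1 to EX1 of instruction 3.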
Memory - ALU / Memory - Branch / Memory - Memory
1 LW instr R1, _ , _
2 any instr
3 any instr
4 any instr
5 ALU instr _ , R1, _ / BNEZ R1, _ / SW _ , R1
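| Instr | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | IF1 | IF2 | ID | EX1 | EX2 | MEM1 | **MEM2** | LWB | | | | |
| 2 | | IF1 | IF2 | ID | EX1 | EX2 | MEM1 | MEM2 | LWB | | | |
| 3 | | | IF1 | IF2 | ID | EX1 | EX2 | MEM1 | MEM2 | LWB | | |
| 4 | | | | IF1 | IF2 | ID | EX1 | EX2 | MEM1 | MEM2 | LWB | |
| 5 | | | | | IF1 | IF2 | ID | **EX1** | EX2 | MEM1 | MEM2 | LWB |

Bold marks the forwarding path: from the end of MEM2 of instruction 1 to EX1 of instruction 5.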
ALU - Memory
1 ALU instr R1, _ , _
2 SW _ , R1
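| Instr | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 |
|---|---|---|---|---|---|---|---|---|---|
| 1 | IF1 | IF2 | ID | EX1 | EX2 | **MEM1** | MEM2 | LWB | |
| 2 | | IF1 | IF2 | ID | EX1 | EX2 | **MEM1** | MEM2 | LWB |

Bold marks the forwarding path: from MEM1 of instruction 1 to MEM1 of instruction 2.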
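The three forwarding cases follow one timing rule, which can be written down compactly. The sketch below is an illustration under stated assumptions, not part of the original solution: the producer's value is taken to be latched at the end of its marked stage, the consumer must enter its marked stage in a strictly later cycle, and the stage pairs are read off the three forwarding diagrams above. `min_forward_distance` is a hypothetical helper name.

```python
# Stage order from the problem statement (MEM1 also acts as ALUWB).
STAGES = ["IF1", "IF2", "ID", "EX1", "EX2", "MEM1", "MEM2", "LWB"]


def min_forward_distance(ready_stage, need_stage):
    """Smallest distance, in instructions, between producer and consumer
    that forwarding alone can cover without a stall.

    A consumer d instructions behind the producer enters need_stage d cycles
    after the producer entered that same stage, so it needs
    index(need_stage) + d > index(ready_stage).
    """
    ready = STAGES.index(ready_stage)
    need = STAGES.index(need_stage)
    return max(1, ready - need + 1)


# (stage where the value becomes available, stage where it is consumed)
cases = {
    "ALU -> ALU / branch": ("EX2", "EX1"),
    "load -> ALU / branch / store": ("MEM2", "EX1"),
    "ALU -> store data": ("MEM1", "MEM1"),
}
for name, (ready_stage, need_stage) in cases.items():
    print(name, min_forward_distance(ready_stage, need_stage))
# ALU -> ALU / branch 2
# load -> ALU / branch / store 4
# ALU -> store data 1
```

The results match the instruction numbers used in the examples: the consumer is instruction 3 after an ALU producer, instruction 5 after a load, and instruction 2 for a store that only needs the ALU result as its data.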
ALU - ALU / ALU - Branch
1 ALU instr R1, _ , _
2 ALU instr _ , R1, _ / BNEZ R1, _
| Instr | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 |
|---|---|---|---|---|---|---|---|---|---|
| 1 | IF1 | IF2 | ID | EX1 | EX2 | MEM1/ALUWB | MEM2 | LWB | |
| 2 | | IF1 | IF2 | stall | stall | ID | EX1 | EX2 | MEM1 |
Memory - ALU / Memory - Branch / Memory - Memory
1 LW instr R1, _ , _
2 ALU instr _ , R1, _ / BNEZ R1, _ / SW _ , R1
| Instr | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 |
|---|---|---|---|---|---|---|---|---|---|
| 1 | IF1 | IF2 | ID | EX1 | EX2 | MEM1 | MEM2 | LWB | |
| 2 | | IF1 | IF2 | stall | stall | stall | stall | ID | ... |
ALU - Memory
1 ALU instr R1, _ , _
2 SW _ , R1
| Instr | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 |
|---|---|---|---|---|---|---|---|---|---|
| 1 | IF1 | IF2 | ID | EX1 | EX2 | MEM1/ALUWB | MEM2 | LWB | |
| 2 | | IF1 | IF2 | stall | stall | ID | EX1 | EX2 | ... |
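The stall counts in these diagrams can be reproduced with a short calculation. This is a minimal sketch under assumptions read off the diagrams rather than stated in the solution: no forwarding is used, operands are read from the register file in ID, and a register written in a given cycle can be read in ID in that same cycle. `stalls_without_forwarding` is a hypothetical helper name.

```python
# Stage order taken from the pipeline table in the problem statement.
STAGES = ["IF1", "IF2", "ID", "EX1", "EX2", "MEM1/ALUWB", "MEM2", "LWB"]


def stalls_without_forwarding(write_stage, distance=1, read_stage="ID"):
    """Stall cycles for a consumer `distance` instructions behind a producer.

    The producer writes the register file in write_stage; the consumer reads
    it in read_stage and may do so in the very cycle of the write
    (assumption: write in the first half, read in the second half).
    """
    write = STAGES.index(write_stage)
    read = STAGES.index(read_stage)
    return max(0, write - read - distance)


print(stalls_without_forwarding("MEM1/ALUWB"))  # ALU producer, next instruction: 2
print(stalls_without_forwarding("LWB"))         # load producer, next instruction: 4
```

With these assumptions the consumer's ID stage lines up with the producer's write-back cycle, which is the alignment shown in each stall diagram.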
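Branch taken
| Instr | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 |
|---|---|---|---|---|---|---|---|---|---|
| 1 | IF1 | IF2 | ID | EX1 | EX2 | MEM1 | MEM2 | LWB | |
| 2 | | IF1 | IF2 | ID | | | | | |
| 3 | | | IF1 | IF2 | | | | | |
| 4 | | | | IF1 | | | | | |
| N | | stall | stall | stall | IF1 | IF2 | ID | EX1 | EX2 |

The branch condition is tested in EX1 (cycle 4). Instructions 2 to 4, fetched in the meantime, are squashed, and the target instruction N enters IF1 in cycle 5, so a taken branch costs 3 cycles.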
Branch not taken
1 BNEZ R1, N
2 any instr
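| Instr | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 |
|---|---|---|---|---|---|---|---|---|---|
| 1 | IF1 | IF2 | ID | EX1 | EX2 | MEM1 | MEM2 | LWB | |
| 2 | | IF1 | IF2 | ID | EX1 | EX2 | MEM1 | MEM2 | LWB |

Instruction 2 was fetched on the predicted path and continues without interruption, so an untaken branch has no penalty.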
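The branch penalties can be derived from the stage in which the branch resolves. The sketch below is illustrative only and rests on two assumptions: the branch resolves at the end of EX1, as the stage table states, and the target fetch starts in the following cycle. `branch_penalty` and `furthest_wrong_path_stage` are hypothetical helper names.

```python
# Stage order taken from the pipeline table in the problem statement.
STAGES = ["IF1", "IF2", "ID", "EX1", "EX2", "MEM1/ALUWB", "MEM2", "LWB"]


def branch_penalty(taken, resolve_stage="EX1"):
    """Cycles lost behind a branch under predict-not-taken."""
    if not taken:
        return 0  # the fall-through instructions were the right ones
    # One wrong-path instruction is fetched for every stage the branch
    # passes through before it resolves, i.e. index(resolve_stage) of them.
    return STAGES.index(resolve_stage)


def furthest_wrong_path_stage(resolve_stage="EX1"):
    """Stage reached by the oldest wrong-path instruction at resolution."""
    # The instruction right behind the branch is always one stage behind it.
    return STAGES[STAGES.index(resolve_stage) - 1]


print(branch_penalty(taken=True))    # prints 3
print(branch_penalty(taken=False))   # prints 0
print(furthest_wrong_path_stage())   # prints ID
```

Since the oldest wrong-path instruction has only reached ID when the branch resolves, it is squashed well before the first write-back stage (MEM1/ALUWB), which is consistent with the constraint in part (d).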