This is easier to grasp if we imagine the floating-point instructions
as having the same pipeline as the integer instructions, with two important
differences:
The EX cycle may be repeated as many times as needed to complete the operation.
There may be multiple floating-point functional units.
Let's assume that there are four separate functional units:
The main integer unit
FP and integer multiplier
FP adder (handles FP add, subtract, and conversion)
FP and integer divider
If we also assume that the execution stages of these functional units
are not pipelined, then the resulting pipeline looks like:
In reality, the intermediate results are probably not cycled around the EX unit; instead, the EX pipeline stage takes some number of clock cycles greater than 1. We can generalize the structure of the FP pipeline to allow pipelining of some stages and multiple ongoing operations.
To describe such a pipeline we must define the latency of the functional units and the initiation interval.
Latency is defined as the number of intervening cycles between an instruction that produces a result and an instruction that uses the result.
The initiation interval is the number of cycles that must elapse between issuing two operations of a given type.
For example, we will use the latencies and initiation intervals as shown:
| Functional unit | Latency | Initiation interval |
| Integer ALU | 0 | 1 |
| Data memory (integer and FP loads) | 1 | 1 |
| FP add | 3 | 1 |
| FP multiply (also integer multiply) | 6 | 1 |
| FP divide (also integer divide and FP sqrt) | 24 | 25 |
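The two definitions can be turned into a small stall calculation. The following is a minimal sketch, not a full pipeline model: it assumes a single-issue, in-order pipeline with full forwarding, and a consumer that needs its operand at the start of its own execution stage. The names `UNITS`, `data_stalls`, and `issue_stalls` are hypothetical helpers introduced for illustration, and only the fully pipelined rows of the table are included.

```python
# Latency and initiation interval for the pipelined units, copied from the table.
UNITS = {
    "Integer ALU": {"latency": 0, "initiation_interval": 1},
    "Data memory": {"latency": 1, "initiation_interval": 1},
    "FP add":      {"latency": 3, "initiation_interval": 1},
    "FP multiply": {"latency": 6, "initiation_interval": 1},
}


def data_stalls(latency, gap=0):
    """Stall cycles seen by a consumer of a result.

    `gap` is the number of independent instructions issued between the
    producer and the consumer. Latency counts the intervening cycles the
    result needs, so every independent instruction hides one of them.
    """
    return max(0, latency - gap)


def issue_stalls(initiation_interval):
    """Extra cycles the second of two back-to-back operations of one type waits."""
    return initiation_interval - 1


if __name__ == "__main__":
    for name, unit in UNITS.items():
        print(
            f"{name}: dependent instruction issued right after stalls "
            f"{data_stalls(unit['latency'])} cycle(s), "
            f"back-to-back issue stalls {issue_stalls(unit['initiation_interval'])} cycle(s)"
        )
```

Under these assumptions, an instruction that consumes the result of an FP multiply and is issued immediately after it stalls for 6 cycles, while placing two independent instructions in between reduces that to 4. An initiation interval of 1 means that operations of the same type can be issued in consecutive cycles without any structural stall.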
Pipeline latency is essentially equal to one cycle less than the depth of the execution pipeline, which is the number of stages from the EX stage to the stage that produces the result. Thus, for the example pipeline above, the number of stages in an FP add is 4, while the number of stages in FP multiply is 7.
The extended pipeline is shown below:
The FP multiplier and adder are fully pipelined and have a depth of
7 and 4 stages, respectively. The FP divider is not pipelined.
| MULTD | IF | ID | M1 | M2 | M3 | M4 | M5 | M6 | M7 | MEM | WB |
| ADDD | IF | ID | A1 | A2 | A3 | A4 | MEM | WB | | | |
| LD | IF | ID | EX | MEM | WB | | | | | | |
| SD | IF | ID | EX | MEM | WB | | | | | | |
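The rows of this table follow directly from the rule that the execution depth is one more than the latency. The sketch below regenerates them under that rule. The helper `stage_row` and the dictionary `ROWS` are hypothetical names introduced here, and the sketch assumes that LD and SD use the integer ALU (latency 0) for address calculation, which is why they get a single EX stage.

```python
def stage_row(latency, prefix=None):
    """Return the stage names of one instruction: IF, ID, execution stages, MEM, WB.

    The number of execution stages is latency + 1. A single execution
    stage is named EX; deeper units get numbered stages such as A1..A4.
    """
    depth = latency + 1
    if depth == 1 or prefix is None:
        exec_stages = ["EX"] * depth
    else:
        exec_stages = [f"{prefix}{i}" for i in range(1, depth + 1)]
    return ["IF", "ID"] + exec_stages + ["MEM", "WB"]


ROWS = {
    "MULTD": stage_row(6, prefix="M"),  # FP multiply: latency 6, depth 7
    "ADDD": stage_row(3, prefix="A"),   # FP add: latency 3, depth 4
    "LD": stage_row(0),                 # integer ALU computes the address
    "SD": stage_row(0),
}

if __name__ == "__main__":
    for name, stages in ROWS.items():
        print("| " + " | ".join([name] + stages) + " |")
```

Counting the entries in each row gives the total number of cycles an instruction spends in the pipeline: 11 for MULTD, 8 for ADDD, and 5 for LD and SD.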