
The mix of instructions in an instruction sequence can result in a variety of stalls. Dependency stalls
are the most common. A dependency occurs if one instruction uses as its source data the results from
another instruction. Such a dependency will cause a stall if the two instructions are placed right next to
each other. The 603e reduces the impact of most of these situations through use of the rename
registers and forwarding of results. However, in some situations, stalls can happen as follows.
Two orderings of a code sequence are shown. In both sequences, the add instruction uses as its
source the results of the lwzx load instruction. In the original code, the add occurs right after the
lwzx. In the reordered sequence, the add is separated from the lwzx by moving it down three
instructions.
Analysis of original code sequence:
Assuming the lwzx hits in the data cache, its data will return in 2 clocks. Although both the add and
the lwzx can be dispatched to the completion queue in the same clock, the add cannot begin
execution until the data from the lwzx returns. Therefore it cannot retire with the lwzx and is stalled
by one clock. The lis dispatches to the SRU, executes, and is ready to retire with the add in cycle 3.
In cycle 4, the stwu and ori can also retire together. Then in cycle 5, the cmpw retires alone. Total
time: 5 clocks.
Analysis of reordered code sequence:
The lwzx (cache hit) takes 2 clocks. Since the lis is not dependent on the lwzx, it can retire with the
lwzx in clock 2. The stwu and ori can also retire together on the next clock (clock 3). Finally, in
clock 4, the add and cmpw retire together. Total time: 4 clocks.
Thus, by separating the generation of a result from the subsequent use of that result, we were able to
prevent a stall. It is normally good practice to provide this separation; however, in some cases the
benefit gained in one place is lost in another place.
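
The clock counts in the two analyses can be reproduced with a small model. The sketch below is
only an illustration, not a description of the 603e pipeline. It assumes in-order retirement of at most
two instructions per clock, a 2-clock lwzx (cache hit), and a 1-clock latency for every other
instruction, and it models only the lwzx-to-add dependency. The instruction orders are inferred from
the retirement order described in the analyses, and the names Instr, retire_schedule, and sequence
are hypothetical helpers.

```python
from dataclasses import dataclass
from typing import Optional

RETIRE_WIDTH = 2  # assumption: at most two instructions retire per clock


@dataclass
class Instr:
    name: str
    latency: int = 1                  # clocks until the result is available
    depends_on: Optional[str] = None  # name of the producing instruction, if any


def retire_schedule(program):
    """Return {name: clock of retirement} for a toy in-order retirement model."""
    # A result is ready `latency` clocks after its producer's result is ready
    # (or after clock 0 when the instruction has no producer).
    ready = {}
    for ins in program:
        start = ready.get(ins.depends_on, 0)
        ready[ins.name] = start + ins.latency

    retired = {}
    clock = 0
    i = 0
    while i < len(program):
        clock += 1
        slots = RETIRE_WIDTH
        # Retire strictly in program order; stop at the first instruction
        # whose result is not yet available.
        while slots and i < len(program) and ready[program[i].name] <= clock:
            retired[program[i].name] = clock
            i += 1
            slots -= 1
    return retired


def sequence(order):
    """Build a program from mnemonics, using the latencies assumed above."""
    specs = {
        "lwzx": Instr("lwzx", latency=2),        # load, data cache hit
        "add": Instr("add", depends_on="lwzx"),  # consumes the loaded value
        "lis": Instr("lis"),
        "stwu": Instr("stwu"),
        "ori": Instr("ori"),
        "cmpw": Instr("cmpw"),
    }
    return [specs[name] for name in order]


original = sequence(["lwzx", "add", "lis", "stwu", "ori", "cmpw"])
reordered = sequence(["lwzx", "lis", "stwu", "ori", "add", "cmpw"])

print(max(retire_schedule(original).values()))   # 5
print(max(retire_schedule(reordered).values()))  # 4
```

In this model the add retires in clock 3 in the original order and in clock 4 in the reordered one,
yet the whole sequence finishes one clock sooner because the independent instructions fill the
retirement slots while the load data is outstanding.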