
MOTOROLA
Chapter 7. Instruction Timing
Timing Considerations
8. In cycle 8, instruction 7 is in the third FPU execute stage. Instructions 8 and 9 have
executed and they remain in the CQ until instruction 7 completes. Instruction 10 is
dispatched to the IU.
9. In cycle 9, instruction 7 completes, allowing instruction 8 to complete. Because the
CQ is full, instructions 12 and 13 cannot be dispatched.
10. In cycle 10, instructions 9 and 10 complete. Instruction 11 has executed but cannot
exit the CQ from CQ2. Instructions 12 and 13 are dispatched to the FPU and IU,
respectively. Instruction 14 drops into IQ0.
11. In cycle 11, instruction 11 completes and instruction 12 is in the second FPU
execute stage. Instruction 13 has executed but must remain in the CQ until
instruction 12 completes. Instruction 14 enters the first FPU execute stage.
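
The in-order completion behavior described in steps 8 through 11 can be sketched as a small model. This is a minimal illustration, not the core's actual logic: the function name retire, the max_per_cycle limit of two, and the queue contents are assumptions chosen to match the walkthrough, in which at most two instructions complete per cycle and an executed instruction cannot leave the CQ ahead of an older one.

```python
from collections import deque

def retire(cq, finished, max_per_cycle=2):
    """Return the instructions that leave the completion queue this cycle.

    cq            -- deque of instruction numbers in program order, oldest first
    finished      -- set of instruction numbers that have finished executing
    max_per_cycle -- retirement limit per cycle (assumed to be two here)
    """
    retired = []
    # Only the oldest entry may retire, so stop at the first unfinished one.
    while cq and len(retired) < max_per_cycle and cq[0] in finished:
        retired.append(cq.popleft())
    return retired

# Simplified replay of cycles 8 through 11 (queue contents are illustrative).
cq = deque([7, 8, 9, 10, 11])
cycle8 = retire(cq, {8, 9})         # 7 is still in the FPU, so nothing retires
cycle9 = retire(cq, {7, 8, 9})      # 7 finishes, so 7 and 8 retire
cycle10 = retire(cq, {9, 10, 11})   # 9 and 10 retire; 11 has executed but waits
cycle11 = retire(cq, {11})          # 11 retires
```

In this sketch instructions 8 and 9 stay queued in cycle 8 even though they have executed, because instruction 7 at the head of the queue has not finished.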
7.3.2.3 Cache Miss
Figure 7-5 shows an instruction fetch that misses the on-chip cache and how that miss
affects instruction dispatch. Note that a processor/bus clock ratio of 1:2 is used.
The same instruction sequence is used as in Section 7.3.2.2, “Cache Hit.”
A cache miss extends the latency of the fetch stage, so in this example, the fetch stage
represents not only the time the instruction spends in the IQ but also the time required for
the instruction to be loaded from system memory, beginning in clock cycle 3.
During clock cycle 2, the target instruction for the br instruction is not in the instruction
cache; therefore, a memory access must occur. During clock cycle 5, the address of the
block of instructions is sent to the system bus. During clock cycle 9, two instructions
(64 bits) are returned from memory on the first beat and are forwarded to both the cache
and the instruction fetcher.
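
The cycle numbers quoted above can be turned into a short arithmetic check. This is only a sketch of the bookkeeping: the constant and function names are hypothetical, the cycle values are taken from the text for this one example, and the 32-bit instruction width is the standard PowerPC instruction size.

```python
INSTRUCTION_BITS = 32   # PowerPC instructions are 32 bits wide
BEAT_BITS = 64          # width of one data beat, as stated in the text

# Clock cycles quoted in the text for this example only.
MISS_DETECTED = 2       # target of the br instruction is not in the cache
ADDRESS_ON_BUS = 5      # block address is sent to the system bus
FIRST_BEAT = 9          # first beat returns from memory

def instructions_per_beat(beat_bits=BEAT_BITS, instruction_bits=INSTRUCTION_BITS):
    """Number of whole instructions delivered by one data beat."""
    return beat_bits // instruction_bits

def cycles_between(start, end):
    """Processor clock cycles elapsed from one event to a later one."""
    return end - start

per_beat = instructions_per_beat()                         # 2 instructions
miss_to_address = cycles_between(MISS_DETECTED, ADDRESS_ON_BUS)
miss_to_data = cycles_between(MISS_DETECTED, FIRST_BEAT)
```

Under these numbers, seven processor cycles pass between detecting the miss and receiving the first two instructions; other clock ratios or memory systems would give different values.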
7.3.3 Instruction Dispatch and Completion Considerations
Several factors affect the ability of the G2 core to dispatch instructions at a peak rate of two
per cycle: the availability of the required execution unit, of destination rename registers, and
of completion queue entries, as well as the handling of completion-serialized instructions.
Several of these limiting factors are illustrated in the previous instruction timing examples.
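
The resource checks listed above can be sketched as a per-cycle dispatch count. This is a simplified model under stated assumptions, not the dispatch unit's actual logic: the names Resources and dispatch_count are hypothetical, rename registers are lumped into one pool, dispatch is assumed to proceed in program order, and completion-serialized instructions are not modeled.

```python
from dataclasses import dataclass

@dataclass
class Resources:
    free_units: set        # execution units able to accept an instruction
    free_renames: int      # free destination rename registers (single pool assumed)
    free_cq_entries: int   # free completion queue entries

def dispatch_count(iq, res, peak=2):
    """Count how many instructions at the head of the IQ can dispatch this cycle.

    iq -- list of (unit, needs_rename) tuples, oldest first
    """
    units = set(res.free_units)
    renames = res.free_renames
    cq = res.free_cq_entries
    count = 0
    for unit, needs_rename in iq[:peak]:
        # In-order dispatch is assumed: the first blocked instruction stops the rest.
        if unit not in units or cq == 0 or (needs_rename and renames == 0):
            break
        units.discard(unit)          # the unit is taken for this cycle
        cq -= 1                      # every dispatched instruction needs a CQ entry
        if needs_rename:
            renames -= 1
        count += 1
    return count

iq = [("FPU", True), ("IU", True)]
peak_case = dispatch_count(iq, Resources({"FPU", "IU"}, 2, 2))
cq_full_case = dispatch_count(iq, Resources({"FPU", "IU"}, 2, 0))
```

The cq_full_case call mirrors cycle 9 of the earlier example, where a full CQ keeps instructions 12 and 13 from being dispatched even though their execution units are free.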
To reduce dispatch unit stalls due to instruction data dependencies, the G2 core provides a
single-entry reservation station for the FPU, SRU, and each IU, and a two-entry reservation
station for the LSU. If a data dependency keeps an instruction from starting execution, that
instruction is dispatched to the reservation station associated with its execution unit (and
the rename registers are assigned), thereby freeing the positions in the instruction queue so
instructions can be dispatched to other execution units. Execution begins during the same
clock cycle that the rename buffer is updated with the data the instruction is dependent on.
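
The reservation station behavior described above can be sketched as follows. This is an illustrative model only: the class and method names are hypothetical, operands are identified by simple string tags, and entries are assumed to start execution oldest first. A capacity of 1 corresponds to the FPU, SRU, and IU stations, and a capacity of 2 to the LSU station.

```python
class ReservationStation:
    """Holds dispatched instructions that are waiting on source operands."""

    def __init__(self, capacity=1):
        self.capacity = capacity
        self.entries = []   # (name, set of pending operand tags), oldest first

    def accept(self, name, pending):
        """Take an instruction from the dispatcher; return False if the station is full."""
        if len(self.entries) >= self.capacity:
            return False
        self.entries.append((name, set(pending)))
        return True

    def rename_updated(self, tag):
        """Record that rename buffer `tag` now holds its result.

        Returns the name of the instruction that begins execution in this same
        cycle, or None if the oldest entry is still waiting on an operand.
        """
        for _, pending in self.entries:
            pending.discard(tag)
        if self.entries and not self.entries[0][1]:
            return self.entries.pop(0)[0]
        return None

fpu_rs = ReservationStation(capacity=1)
held = fpu_rs.accept("fadd", {"f1"})         # dispatched despite the dependency
rejected = fpu_rs.accept("fmul", {"f2"})     # single entry is occupied
still_waiting = fpu_rs.rename_updated("f3")  # unrelated result: no change
started = fpu_rs.rename_updated("f1")        # operand arrives: execution begins
```

Because the waiting instruction sits in the station rather than in the instruction queue, the dispatcher is free to send later instructions to other execution units in the meantime.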
Freescale Semiconductor, Inc.
For More Information On This Product,
Go to: www.freescale.com