
MOTOROLA
Chapter 2. PowerPC Processor Core
PowerPC Processor Core Features
2.2.4.2 Floating-Point Unit (FPU)
The FPU contains a single-precision multiply-add array and the floating-point status and
control register (FPSCR). The multiply-add array allows the processor to efficiently
implement multiply and multiply-add operations. The FPU is pipelined so that single-precision
instructions and double-precision instructions can be issued back-to-back. Thirty-two
floating-point registers are provided to support floating-point operations. Stalls due to
contention for FPRs are minimized by the automatic allocation of rename registers. The
processor writes the contents of the rename registers to the appropriate FPR when floating-point
instructions are retired by the completion unit.
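The rename-and-retire behavior described above can be pictured with a small software model. This is a minimal sketch, not a description of the actual hardware: the class name RenamedFprFile, its methods, and the default of four rename registers are hypothetical choices made only for illustration.

```python
class RenamedFprFile:
    """Toy model of 32 FPRs backed by rename registers (illustrative only)."""

    def __init__(self, num_rename=4):   # the rename count is an assumption
        self.fpr = [0.0] * 32           # architected floating-point registers
        self.free = num_rename          # rename registers still available
        self.pending = []               # (target FPR, value) in program order

    def execute(self, target, value):
        """Place a result in a rename register; return False if none is free."""
        if self.free == 0:
            return False                # the instruction would stall here
        self.free -= 1
        self.pending.append((target, value))
        return True

    def retire_oldest(self):
        """Completion: copy the oldest rename register into its FPR."""
        target, value = self.pending.pop(0)
        self.fpr[target] = value
        self.free += 1


regs = RenamedFprFile()
regs.execute(1, 2.5)       # first write to FPR 1
regs.execute(1, 7.0)       # second write to FPR 1 gets its own rename register
before = regs.fpr[1]       # architected FPR 1 is unchanged until retirement
regs.retire_oldest()
regs.retire_oldest()
after = regs.fpr[1]        # the last write in program order is the final value
```

In this model two writes to the same FPR do not contend with each other, because each result waits in its own rename register until it is retired in program order.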
The processor supports all IEEE 754 floating-point data types (normalized, denormalized,
NaN, zero, and infinity) in hardware, eliminating the latency incurred by software
exception routines.
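The five data types listed above are distinguished by the exponent and fraction fields of the encoding. The following sketch classifies a double-precision value by inspecting those fields. The function name classify_double is a hypothetical helper, and the code illustrates the IEEE 754 encoding rather than anything specific to this processor.

```python
import struct


def classify_double(x):
    """Return the IEEE 754 class of a Python float (a binary64 value)."""
    bits = struct.unpack(">Q", struct.pack(">d", x))[0]
    exponent = (bits >> 52) & 0x7FF        # 11-bit biased exponent
    fraction = bits & ((1 << 52) - 1)      # 52-bit fraction field
    if exponent == 0x7FF:                  # all ones: infinity or NaN
        return "infinity" if fraction == 0 else "NaN"
    if exponent == 0:                      # all zeros: zero or denormalized
        return "zero" if fraction == 0 else "denormalized"
    return "normalized"


examples = {
    "one": classify_double(1.0),                  # "normalized"
    "minus_zero": classify_double(-0.0),          # "zero"
    "smallest": classify_double(5e-324),          # "denormalized"
    "inf": classify_double(float("inf")),         # "infinity"
    "nan": classify_double(float("nan")),         # "NaN"
}
```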
2.2.4.3 Load/Store Unit (LSU)
The LSU executes all load and store instructions and provides the data transfer interface
between the GPRs, FPRs, and the cache/memory subsystem. The LSU calculates effective
addresses, performs data alignment, and provides sequencing for load/store string and
multiple instructions.
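As an illustration of effective address calculation and load-multiple sequencing, the sketch below assumes the register-indirect-with-immediate-index addressing form defined by the PowerPC architecture, in which the effective address is the base register contents (or zero when the base register field is 0) plus a sign-extended 16-bit displacement. The function names are hypothetical, and the model assumes 32-bit addresses and ignores translation, alignment, and timing.

```python
MASK32 = 0xFFFFFFFF


def sign_extend16(d):
    """Interpret the low 16 bits of d as a signed displacement."""
    d &= 0xFFFF
    return d - 0x10000 if d & 0x8000 else d


def effective_address(gpr, ra, d):
    """EA = (rA|0) + EXTS(d), truncated to 32 bits."""
    base = 0 if ra == 0 else gpr[ra]     # rA field of 0 means a base of zero
    return (base + sign_extend16(d)) & MASK32


def load_multiple_addresses(gpr, rd, ra, d):
    """Sequence a load-multiple: one word per register from rD through r31."""
    ea = effective_address(gpr, ra, d)
    return [(r, (ea + 4 * (r - rd)) & MASK32) for r in range(rd, 32)]


gpr = [0] * 32
gpr[3] = 0x1000
ea_pos = effective_address(gpr, 3, 8)          # 0x1008
ea_neg = effective_address(gpr, 3, 0xFFFC)     # displacement of -4: 0x0FFC
sequence = load_multiple_addresses(gpr, 29, 3, 0)
```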
Load and store instructions are issued and translated in program order; however, the actual
memory accesses can occur out of order. Synchronizing instructions are provided to
enforce strict ordering where needed.
Cacheable loads, when free of data dependencies, execute in an out-of-order manner with
a maximum throughput of one per cycle and a two-cycle total latency. Data returned from
the cache is held in a rename register until the completion logic commits the value to a GPR
or FPR. Store operations do not occur until a predicted branch is resolved. They remain in
the store queue until the completion logic signals that the store operation is definitely to be
completed to memory.
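The store queue behavior described above can be sketched as a simple software model. The class StoreQueue and its method names are hypothetical. The flush on a mispredicted branch is an assumption inferred from the statement that stores do not occur until a predicted branch is resolved; it is not a documented mechanism.

```python
class StoreQueue:
    """Toy model: stores wait in a queue until completion commits them."""

    def __init__(self):
        self.memory = {}      # address -> value actually written
        self.entries = []     # [address, value, committed] in program order

    def issue(self, address, value):
        """A store executes: it enters the queue but memory is untouched."""
        self.entries.append([address, value, False])

    def complete_oldest(self):
        """Completion logic marks the oldest uncommitted store as final."""
        for entry in self.entries:
            if not entry[2]:
                entry[2] = True
                return

    def flush_speculative(self):
        """Assumed misprediction handling: drop stores never committed."""
        self.entries = [e for e in self.entries if e[2]]

    def drain(self):
        """Write committed stores at the head of the queue to memory."""
        while self.entries and self.entries[0][2]:
            address, value, _ = self.entries.pop(0)
            self.memory[address] = value


q = StoreQueue()
q.issue(0x100, 1)         # store before the predicted branch
q.complete_oldest()       # completion logic commits it
q.issue(0x104, 2)         # store on the predicted (speculative) path
q.flush_speculative()     # the prediction turned out to be wrong
q.drain()
written = dict(q.memory)  # only the committed store reached memory
```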
The processor core executes store instructions with a maximum throughput of one per cycle
and a three-cycle total latency. The time required to perform the actual load or store
operation varies depending on whether the operation involves the cache, system memory,
or an I/O device.
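The throughput and latency figures quoted above give a rough best-case estimate for a run of back-to-back accesses. The sketch below is an idealized calculation that assumes every access hits in the cache, has no data dependencies, and can be issued on consecutive cycles. The function name pipelined_cycles is hypothetical.

```python
def pipelined_cycles(count, latency):
    """Best-case cycles for `count` operations issued one per cycle."""
    if count <= 0:
        return 0
    # The first operation takes `latency` cycles; each later one
    # finishes one cycle after its predecessor.
    return latency + (count - 1)


load_cycles = pipelined_cycles(4, latency=2)    # 4 loads: 2 + 3 = 5 cycles
store_cycles = pipelined_cycles(4, latency=3)   # 4 stores: 3 + 3 = 6 cycles
```

Accesses that go to system memory or an I/O device take longer than this estimate suggests.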
2.2.4.4 System Register Unit (SRU)
The SRU executes various system-level instructions, including condition register logical
operations, move to/from special-purpose register instructions, and integer add/compare
instructions. Because SRU instructions affect modes of processor operation, most
SRU instructions are completion-serialized. That is, the instruction is held for execution in
the SRU until all prior instructions issued have completed. Results from completion-
serialized instructions executed by the SRU are not available or forwarded for subsequent
instructions until the instruction completes.
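The effect of completion serialization on scheduling can be illustrated with a small calculation. This is a simplified sketch under stated assumptions: the function name and the cycle numbers are hypothetical, and the model only expresses that a completion-serialized instruction cannot begin executing before every earlier instruction has completed. It does not model whether one further cycle is needed after the last completion.

```python
def earliest_sru_execute_cycle(prior_completion_cycles, issue_cycle):
    """Lower bound on the cycle a completion-serialized instruction executes.

    prior_completion_cycles: completion cycle of each earlier instruction.
    issue_cycle: cycle at which the serialized instruction reached the SRU.
    """
    # The instruction is held until all prior instructions have completed,
    # so it cannot start before the latest of those completion cycles.
    return max([issue_cycle] + list(prior_completion_cycles))


held = earliest_sru_execute_cycle([3, 5, 4], issue_cycle=2)   # waits: 5
not_held = earliest_sru_execute_cycle([1], issue_cycle=4)     # no wait: 4
```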