
MOTOROLA
7-6
INSTRUCTION TIMING
RCPU
REFERENCE MANUAL
7.2.1.1 Update of the XER During Divide Instructions
Integer divide instructions have a relatively long latency. However, these
instructions can update XER[OV], the overflow bit in the integer exception
register, after one cycle. Data dependency on the XER is therefore limited to one
cycle, although the latency of an integer divide instruction can be up to eleven
clock cycles.
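The timing relationship above can be sketched as a small C model. This is an
illustration only, not a description of the RCPU pipeline logic: the function and
constant names are hypothetical, the divide is assumed to take its worst-case
eleven cycles, and all other issue constraints are ignored.

```c
#include <stdbool.h>

/* Cycle counts taken from the text; the names are hypothetical. */
enum {
    DIV_RESULT_LATENCY_MAX = 11, /* worst-case latency of the divide result */
    DIV_XER_OV_LATENCY     = 1   /* XER[OV] is updated after one cycle */
};

/*
 * Earliest cycle at which the operands of a following instruction are
 * available, given the cycle in which the divide was issued.
 * Assumption: the divide takes its worst-case latency.
 */
unsigned earliest_operand_ready(unsigned divide_issue_cycle,
                                bool reads_result,
                                bool reads_xer_ov)
{
    unsigned ready = divide_issue_cycle;

    if (reads_xer_ov) {
        ready = divide_issue_cycle + DIV_XER_OV_LATENCY;
    }
    if (reads_result) {
        /* The result dependency dominates the XER dependency. */
        ready = divide_issue_cycle + DIV_RESULT_LATENCY_MAX;
    }
    return ready;
}
```

Under these assumptions, an instruction that reads only XER[OV] waits one cycle
after the divide issues, while one that reads the quotient waits up to eleven.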
7.2.2 Floating Point Unit (FPU)
The floating-point unit contains a double-precision multiply array, the floating-point
status and control register (FPSCR), and the FPRs. The multiply-add array allows
the processor to efficiently implement floating-point operations such as multiply,
multiply-add, and divide.
The RCPU depends on a software envelope to fully implement the IEEE
floating-point specification. Overflows, underflows, NaNs, and denormalized
numbers cause floating-point assist exceptions that invoke a software routine to
deliver (with hardware assistance) the correct IEEE result. Refer to 6.11.10
Floating-Point Assist Exception (0x00E00) for additional information.
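As a rough illustration of which operand values fall into the classes named
above, the following C sketch uses the standard <math.h> classification macros.
It is not the RCPU's detection logic: the function name is hypothetical, and it
inspects operand values only, so overflow and underflow (which depend on the
result of an operation) are not modeled.

```c
#include <math.h>
#include <stdbool.h>

/*
 * Hypothetical helper: returns true if the operand is a NaN or a
 * denormalized (subnormal) number, two of the classes the text lists
 * as causing a floating-point assist exception.
 * Overflow and underflow are properties of a result and are not checked.
 */
bool operand_may_need_assist(double operand)
{
    int cls = fpclassify(operand);
    return cls == FP_NAN || cls == FP_SUBNORMAL;
}
```

Normal values, zeros, and infinities return false in this sketch.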
To accelerate time-critical operations and make them more deterministic, the
RCPU provides a mode of operation that avoids invoking the software envelope
and attempts to deliver results in hardware that are adequate for most
applications, if not in strict conformance with IEEE standards. In this mode,
denormalized numbers, NaNs, and IEEE invalid operations are treated as
legitimate, returning default results rather than causing floating-point assist
exceptions.
7.2.3 Load/Store Unit (LSU)
The load/store unit handles all data transfer between the integer and
floating-point register files and the chip-internal load/store bus (L-bus). The
load/store unit is implemented as an independent execution unit so that stalls in
the memory pipeline do not cause the master instruction pipeline to stall (unless
there is a data dependency). The unit is fully pipelined so that memory
instructions of any size may be issued on back-to-back cycles.
There is a 32-bit wide data path between the load/store unit and the integer register
file and a 64-bit wide data path between the load/store unit and the floating-point
register file.
Single-word accesses to on-chip data RAM require one clock cycle, for a latency
of two clock cycles. Double-word accesses require two clock cycles, for a latency
of three clock cycles. Because the L-bus is 32 bits wide, double-word transfers
require two bus accesses.
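The arithmetic behind these figures can be sketched in C. The function names are
hypothetical, accesses are assumed to be aligned, and the "one cycle per bus
access plus one" rule is simply the pattern that fits the two cases given in the
text, not a formula stated by the manual.

```c
/* The L-bus is 32 bits (4 bytes) wide, per the text. */
enum { LBUS_WIDTH_BYTES = 4 };

/* Number of L-bus accesses for an aligned transfer (rounds up). */
unsigned lbus_accesses(unsigned size_bytes)
{
    return (size_bytes + LBUS_WIDTH_BYTES - 1) / LBUS_WIDTH_BYTES;
}

/*
 * Latency in clock cycles for an access to on-chip data RAM.
 * Assumption: one cycle per bus access plus one, which reproduces the
 * single-word (2 cycles) and double-word (3 cycles) figures in the text.
 */
unsigned onchip_ram_latency_cycles(unsigned size_bytes)
{
    return lbus_accesses(size_bytes) + 1;
}
```

For a 4-byte word this gives one bus access and two cycles of latency; for an
8-byte double word it gives two bus accesses and three cycles of latency.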
The LSU interfaces with the external bus interface for all instructions that
access memory. Addresses are formed by adding the source one register operand
specified by the instruction (or zero) to either a source two register operand or
to a 16-bit immediate value embedded in the instruction.
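The address formation described above can be sketched in C as follows. The
function names and the gpr array are hypothetical stand-ins for the integer
register file. Two details are assumed from the PowerPC architecture rather than
stated in this section: a source one register number of 0 selects the value zero,
and the 16-bit immediate is sign-extended before the addition. Load/store forms
with other addressing rules are not covered.

```c
#include <stdint.h>

/* Source one operand: register ra, or literal zero when ra == 0 (assumed). */
static uint32_t base_operand(const uint32_t gpr[32], unsigned ra)
{
    return (ra == 0) ? 0u : gpr[ra];
}

/* Register-indexed form: (ra or 0) + rb, wrapping modulo 2^32. */
uint32_t ea_indexed(const uint32_t gpr[32], unsigned ra, unsigned rb)
{
    return base_operand(gpr, ra) + gpr[rb];
}

/* Immediate form: (ra or 0) + sign-extended 16-bit displacement. */
uint32_t ea_displacement(const uint32_t gpr[32], unsigned ra, int16_t d)
{
    return base_operand(gpr, ra) + (uint32_t)(int32_t)d;
}
```

With register 1 holding 0x1000, a displacement of -4 yields the address 0x0FFC,
and an index register holding 0x20 yields 0x1020.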