
G2 PowerPC Core Reference Manual
MOTOROLA
Overview
Cache lines in the core are loaded in four beats of 64 bits each (or eight 32-bit beats when
operating in 32-bit bus mode). The burst load is performed as a critical-double-word-first
operation. The cache that is being loaded is blocked to internal accesses until the load
completes (that is, no hits under misses). The critical-double-word is simultaneously
written to the cache and forwarded to the requesting unit, minimizing stalls due to load
delays.
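The beat ordering of a critical-double-word-first line fill can be illustrated with a small behavioral model for the 64-bit bus mode. This is only a sketch: the function name burst_beat_addresses is hypothetical, and the assumption that the remaining beats follow sequentially and wrap at the 32-byte line boundary is not stated in this section.

```python
LINE_BYTES = 32   # cache line size: four beats of 64 bits
BEAT_BYTES = 8    # one double word per beat on the 64-bit data bus


def burst_beat_addresses(miss_addr):
    """Return the double-word addresses of a line fill, critical one first."""
    line_base = miss_addr & ~(LINE_BYTES - 1)
    critical = (miss_addr & (LINE_BYTES - 1)) // BEAT_BYTES
    beats = LINE_BYTES // BEAT_BYTES
    # Assumption: beats continue sequentially and wrap within the line.
    return [line_base + ((critical + i) % beats) * BEAT_BYTES
            for i in range(beats)]
```

For example, under these assumptions a miss at address 0x1014 falls in the third double word of the line at 0x1000, so the model returns that double word first and wraps around to the start of the line.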
Cache lines are selected for replacement based on an LRU algorithm. Each time a cache
line is accessed, it is tagged as the most recently used line of the set. When a miss occurs,
if all lines in the set are marked as valid, the LRU line is replaced with the new data. When
data to be replaced is in the modified state, the modified data is written into a write-back
buffer while the missed data is being read from memory. When the load completes, the core
pushes the replaced line from the write-back buffer to main memory in a burst write
operation.
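The replacement policy described above can be sketched as a behavioral model of a single cache set. The class name CacheSet, the default of four ways, and the simplification that every line present in the model is valid are assumptions made for illustration; the model tracks only tags and the modified state, not data or bus timing.

```python
class CacheSet:
    """One set of a set-associative cache with LRU replacement."""

    def __init__(self, ways=4):  # ways=4 is an assumed associativity
        self.ways = ways
        # Ordered from LRU (front) to MRU (back); entries are [tag, modified].
        self.lines = []

    def access(self, tag, is_store=False):
        """Access a tag; return the tag cast out to the write-back
        buffer, or None if nothing modified was replaced."""
        for line in self.lines:
            if line[0] == tag:              # hit: mark as most recently used
                self.lines.remove(line)
                line[1] = line[1] or is_store
                self.lines.append(line)
                return None
        cast_out = None
        if len(self.lines) == self.ways:    # all lines valid: replace the LRU line
            victim_tag, modified = self.lines.pop(0)
            if modified:                    # modified data goes to the write-back buffer
                cast_out = victim_tag
        self.lines.append([tag, is_store])
        return cast_out
```

In this model a returned tag stands for the line that the core would later push from the write-back buffer to main memory; a clean victim is simply dropped.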
9.1.2 Operation of the System Interface
Memory accesses can occur in single-beat (1 to 8 bytes) and four-beat (32 bytes) burst data
transfers when the core is configured with a 64-bit data bus (core_32bitmode signal is
negated at reset). When the core is in the optional 32-bit data bus mode (core_32bitmode
signal is asserted at reset), memory accesses can occur in single-beat (1 to 4 bytes),
two-beat (8 bytes), and eight-beat (32 bytes) bursts. The address and data buses are
independent for memory accesses to support pipelining and split transactions. The core can
pipeline as many as two transactions and has limited support for out-of-order split-bus
transactions.
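The relationship between transfer size, bus mode, and beat count given above can be summarized in a short sketch. The function name beats_for_transfer and its bus_32bit_mode parameter (standing in for the state of core_32bitmode at reset) are hypothetical, and sizes not listed in the text are rejected rather than guessed.

```python
def beats_for_transfer(num_bytes, bus_32bit_mode=False):
    """Return the number of data beats for a transfer of num_bytes."""
    bus_bytes = 4 if bus_32bit_mode else 8
    if num_bytes == 32:                      # cache-line burst
        return 32 // bus_bytes
    if 1 <= num_bytes <= bus_bytes:          # single-beat transfer
        return 1
    if bus_32bit_mode and num_bytes == 8:    # double word on the 32-bit bus
        return 2
    raise ValueError("transfer size not covered by this sketch")
```

A 32-byte burst therefore takes four beats on the 64-bit bus and eight beats in 32-bit mode, while an 8-byte access takes one beat and two beats respectively.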
Access to the 60x bus interface is granted through an arbitration mechanism external to the
core that allows devices to compete for bus mastership. This arbitration mechanism is
flexible, allowing the core to be integrated into systems that implement various fairness and
bus-parking procedures to avoid arbitration overhead.
Typically, memory accesses are weakly ordered—sequences of operations, including
load/store string and multiple instructions, do not necessarily complete in the order they
begin—maximizing the bus efficiency without sacrificing data coherency. The core allows
load operations to precede store operations (except when a dependency exists). In addition,
the core can be configured to reorder high-priority store operations ahead of lower-priority
store operations. Because the processor can dynamically optimize run-time ordering of
load/store traffic, overall performance is improved.
Note that the Synchronize (sync) instruction can be used to enforce strong ordering.
The following sections describe how the G2 core interface operates, providing detailed
timing diagrams that show how the signals interact. A collection of more general timing
diagrams is included as examples of typical bus operations.
Freescale Semiconductor, Inc.