
accesses translated using that particular page table entry. The W (write-through) and I
(caching-inhibited) bits control how the processor executing the access uses its own cache.
The M (memory coherence) bit specifies whether the processor executing the access must
use the MEI (modified, exclusive, or invalid) cache coherence protocol to ensure all copies
of the addressed memory location are kept consistent. The G (guarded memory) bit controls
whether out-of-order data and instruction fetching is permitted.
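As a rough illustration only, the four attribute bits can be modeled as flags in C; the macro names, bit positions, and helper function below are hypothetical and do not reflect the actual page table entry layout.

    #include <stdint.h>

    /* Hypothetical flag encoding for the four attribute bits; the actual
     * bit positions within a page table entry differ from those shown. */
    #define WIMG_W  0x8  /* write-through: stores update memory as well as the cache */
    #define WIMG_I  0x4  /* caching-inhibited: the access bypasses the cache */
    #define WIMG_M  0x2  /* memory coherence: enforce MEI coherency for this access */
    #define WIMG_G  0x1  /* guarded: out-of-order accesses are not permitted */

    /* An access is cacheable only when the I bit is clear. */
    static inline int access_is_cacheable(uint8_t wimg) {
        return (wimg & WIMG_I) == 0;
    }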
The G2 core maintains data cache coherency in hardware by coordinating activity among the data cache, the memory system, and the bus interface logic. As other bus masters perform operations on the bus, the G2 core bus snooping logic monitors the referenced addresses and compares them with the addresses resident in the data cache. If there is a snoop hit, the bus snooping logic responds to the bus interface with the appropriate snoop status (for example, asserting core_artry_out). In some cases, a snoop hit also causes additional snoop action to be forwarded to the cache (a cache push of modified data or a cache block invalidation).
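The following simplified C model (with hypothetical types and names) illustrates the comparison described above: a snooped tag is checked against the resident tags of the indexed set, and a hit to a modified block is the case that leads to an address retry (core_artry_out) and a push of the modified data.

    #include <stdbool.h>
    #include <stdint.h>

    typedef enum { SNOOP_MISS, SNOOP_HIT_CLEAN, SNOOP_HIT_MODIFIED } snoop_result_t;

    typedef struct {
        bool     valid;     /* block contains valid data */
        bool     modified;  /* block differs from memory (must be pushed) */
        uint32_t tag;       /* physical address tag */
    } cache_line_t;

    /* Compare a snooped tag against every way of the indexed set.
     * SNOOP_HIT_MODIFIED corresponds to the case where the snoop
     * response would retry the bus operation so the data can be
     * pushed out before the other master proceeds. */
    snoop_result_t snoop_lookup(const cache_line_t *set, int ways, uint32_t tag) {
        for (int w = 0; w < ways; w++) {
            if (set[w].valid && set[w].tag == tag)
                return set[w].modified ? SNOOP_HIT_MODIFIED : SNOOP_HIT_CLEAN;
        }
        return SNOOP_MISS;  /* no snoop status asserted */
    }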
The G2 core supports a fully coherent 4-Gbyte physical memory address space. Bus
snooping is used to drive the MEI three-state cache-coherency protocol that ensures the
coherency of global memory with respect to the processor’s cache. See Section 4.7.1, “MEI
State Definitions.”
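As a sketch only, the three MEI states can be modeled as a C enumeration. The transition below covers one representative case, a snooped write hitting a cached block, rather than the full protocol defined in Section 4.7.1.

    typedef enum {
        MEI_MODIFIED,   /* valid and modified with respect to memory;
                         * only this cache holds the current data */
        MEI_EXCLUSIVE,  /* valid and consistent with memory;
                         * no other cache holds a copy */
        MEI_INVALID     /* block does not contain valid data */
    } mei_state_t;

    /* Snooped write hitting the block: a modified block is first pushed
     * back to memory, and the block is invalidated in either case. */
    mei_state_t on_snooped_write_hit(mei_state_t state) {
        if (state == MEI_MODIFIED) {
            /* push (copy back) the modified block to memory here */
        }
        return MEI_INVALID;
    }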
This chapter describes the organization of the G2 core's on-chip instruction and data caches, the MEI cache coherency protocol, the cache control instructions, various cache operations, and the interaction among the cache, the load/store unit (LSU), and the bus interface unit (BIU). G2 core-specific information is noted where applicable.
4.2 Instruction Cache Organization and Control
The instruction fetcher accesses the instruction cache frequently to sustain the high throughput provided by the six-entry instruction queue.
4.2.1 Instruction Cache Organization
The instruction cache organization is shown in Figure 4-1. Each cache block contains eight contiguous words from memory that are loaded from an eight-word boundary (that is, bits A27–A31 of the effective address are zero); thus, a cache block never crosses a page boundary. Misaligned accesses across a page boundary can incur a performance penalty. Address bits A20–A26 provide an index to select a set, and bits A27–A31 select a byte within a block. Each tag consists of bits PA0–PA19. Because bits A20–A31 lie within the page offset, address translation occurs in parallel with the set lookup, and the higher-order bits (the tag bits in the cache) are physical. Note that the replacement algorithm is strictly LRU; that is, the least recently used block in the selected set is filled with new instructions on a cache miss.
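To make the address split concrete, the following C sketch (with hypothetical helper names) extracts the three fields. PowerPC numbers bits from 0 (most significant) to 31 (least significant), so A20–A26 correspond to bits 11–5 and A27–A31 to bits 4–0 in conventional least-significant-bit-zero numbering.

    #include <stdint.h>
    #include <stdio.h>

    /* Hypothetical helpers splitting an address per the fields above:
     * PA0-PA19 -> 20-bit tag, A20-A26 -> 7-bit set index (128 sets),
     * A27-A31 -> 5-bit byte offset within the 32-byte block. */
    static inline uint32_t cache_tag(uint32_t pa)   { return pa >> 12; }
    static inline uint32_t set_index(uint32_t ea)   { return (ea >> 5) & 0x7F; }
    static inline uint32_t byte_offset(uint32_t ea) { return ea & 0x1F; }

    int main(void) {
        uint32_t addr = 0x0001F2A4;
        printf("tag=0x%05X index=%u offset=%u\n",
               (unsigned)cache_tag(addr),
               (unsigned)set_index(addr),
               (unsigned)byte_offset(addr));
        return 0;
    }

Because the index and offset bits fall within the page offset, they are identical in the effective and physical addresses; only the tag comparison depends on the translated bits PA0–PA19.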