
RISC Microprocessor Division
The load/store hierarchy within the PowerPC chip consists of the load/store unit (LSU), data cache
(DC), and the bus interface unit (BIU). The LSU stages consist of a two-element EIB, to receive
dispatched instructions and calculate effective addresses, and a two-element store queue, to hold
stores waiting for the data cache. The data cache stages consist of slots for a load miss and a store
miss. Only one miss can be handled at a time. The BIU stages consist of a number of one-element
queues, such as the data load and store queues. Each queue can hold a separate instruction waiting
for access to memory.
Instructions are first dispatched from the instruction queue (IQ) to the LSU EIB, which has two slots:
the “reservation station” slot (LSU RS) and the “effective address calculation” slot (LSU EA). An
instruction is held in the LSU EA slot until its address operand is available.
Normally, if the LSU is available for dispatch (see below) and both slots are empty, the instruction
is dispatched directly to the LSU EA slot. If the LSU EA slot is occupied, the instruction is
dispatched to the LSU RS slot instead.
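The dispatch rule above can be sketched as a small decision function. This is only an illustrative
model, not the chip's actual logic: the function name and the boolean arguments are hypothetical,
and any case the text does not cover (for example, an occupied LSU RS slot) is assumed here to
stall dispatch.

```python
from typing import Optional


def dispatch_slot(ea_occupied: bool, rs_occupied: bool) -> Optional[str]:
    """Pick the EIB slot for a newly dispatched load/store, or None to stall.

    Hypothetical helper: the slot names follow the text, everything else
    is an assumption made for illustration.
    """
    if not ea_occupied and not rs_occupied:
        # Both slots empty: go directly to effective address calculation.
        return "LSU EA"
    if ea_occupied and not rs_occupied:
        # EA slot busy: wait in the reservation station slot.
        return "LSU RS"
    # Assumption: an occupied RS slot means the LSU cannot accept a dispatch.
    return None


print(dispatch_slot(False, False))
print(dispatch_slot(True, False))
```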
Once the instruction’s effective address has been calculated, its progress through the pipeline depends
on whether it is a load or a store. A load then accesses the data cache (DC), as described later.
The load’s entry in the completion queue (CQ) is marked “finished” when the data for the load returns.
A store instead passes to the first LSU store queue slot, and its entry in the CQ is marked
“finished.” Thus, a store can be considered finished and even retired from the completion queue long
before its data is actually written to cache or to memory. On the next clock cycle, the store passes to
the second LSU store queue slot and, on the subsequent clock, it is free to access the data cache.
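The store timing described above can be written out as a simple cycle calculation. The sketch
below is a rough model under stated assumptions: the function name and dictionary keys are
hypothetical, and the cycle in which the store enters the first store queue slot is taken as
an input rather than derived from the rest of the pipeline.

```python
def store_timeline(sq1_cycle: int) -> dict:
    """Cycle numbers for a store moving through the two LSU store queue slots.

    sq1_cycle is the (assumed known) cycle in which the store enters the
    first store queue slot and its CQ entry is marked "finished".
    """
    return {
        "sq1_and_finished": sq1_cycle,     # first store queue slot
        "sq2": sq1_cycle + 1,              # second slot, one clock later
        "dc_earliest": sq1_cycle + 2,      # free to access the data cache
    }


print(store_timeline(5))
```

The last entry is only the earliest possible cycle; the store may wait longer if the data cache
is not available to it.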
Note that because a store must traverse two more slots than a load before accessing the data
cache, a load instruction may bypass preceding stores within the LSU. Also, if both a load (in the LSU
EA slot) and a store (in the second LSU store queue slot) are free to access the data cache, then the
load will take precedence.
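The load-over-store precedence can be illustrated with a minimal arbitration sketch. The function
names are hypothetical, and the model assumes that each access occupies the data cache for exactly
one cycle, which the text does not state.

```python
from typing import Iterable, Optional


def data_cache_winner(load_ready: bool, store_ready: bool) -> Optional[str]:
    """Return which instruction accesses the data cache in this cycle."""
    if load_ready:
        # A load in the LSU EA slot wins any conflict with a store.
        return "load"
    if store_ready:
        # A store in the second LSU store queue slot goes when no load competes.
        return "store"
    return None


def store_access_cycle(store_ready_cycle: int, load_ready_cycles: Iterable[int]) -> int:
    """First cycle at or after store_ready_cycle in which no load competes.

    Assumption: one cache access per cycle, one cycle per access.
    """
    busy = set(load_ready_cycles)
    cycle = store_ready_cycle
    while data_cache_winner(cycle in busy, True) == "load":
        cycle += 1
    return cycle


# A store ready in cycle 2 is held off by loads that are ready in cycles 2 and 3.
print(store_access_cycle(2, [2, 3]))
```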