PNX1300/01/02/11 Data Book
Philips Semiconductors
PRELIMINARY SPECIFICATION
al, n × (n - 1)/2 bits for n-way LRU). If set element k is referenced, the cache sets row k to ‘1’ and column k to ‘0’:
    R[k, 0..n-1] ← 1,  R[0..n-1, k] ← 0
The LRU element is the one for which the entire row is ‘0’ (or empty) and the entire column is ‘1’ (or empty):
    R[k, 0..n-1] = 0  and  R[0..n-1, k] = 1
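The update and selection rules above can be sketched in Python as follows. This is an illustrative model only, not the hardware implementation: the class name MatrixLRU and its methods are hypothetical, and it assumes that only the lower triangle R[i, j] with i > j is stored, which is what makes n × (n - 1)/2 bits sufficient and makes some rows and columns "empty".

```python
class MatrixLRU:
    """Illustrative n-way LRU model storing only R[i, j] for i > j."""

    def __init__(self, n):
        self.n = n
        # All-zero is one consistent state in this model (element n-1 is LRU).
        self.bits = {(i, j): 0 for i in range(n) for j in range(i)}

    def touch(self, k):
        """Reference element k: row k <- 1, column k <- 0 (stored bits only)."""
        for j in range(k):                  # stored part of row k
            self.bits[(k, j)] = 1
        for i in range(k + 1, self.n):      # stored part of column k
            self.bits[(i, k)] = 0

    def victim(self):
        """Return the element whose row is all 0 and whose column is all 1."""
        for k in range(self.n):
            row_zero = all(self.bits[(k, j)] == 0 for j in range(k))
            col_one = all(self.bits[(i, k)] == 1
                          for i in range(k + 1, self.n))
            if row_zero and col_one:
                return k
        raise ValueError("LRU bits do not describe a legal state")


lru = MatrixLRU(4)
for k in (2, 0, 3, 1):    # reference order; element 2 is the oldest reference
    lru.touch(k)
print(lru.victim())       # 2
```

In this model a stored bit R[i, j] = 1 reads as "element i was referenced more recently than element j", which is why the least recently used element has an all-zero row and an all-one column.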
For a 4-way set-associative cache, this algorithm requires six bits per set of four cache blocks. On every cache hit, the LRU info is updated by setting three of the six bits to ‘0’ or ‘1’, depending on the set element that was accessed. The bits need only be written; no read-modify-write is necessary. On a miss, the cache reads the six LRU bits to determine the replacement block.
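To illustrate why a hit touches exactly three of the six bits and needs no read, the sketch below lists the bits written for each element of a four-way set. The function name hit_writes is hypothetical, and the sketch assumes the six stored bits are R[i, j] with i > j.

```python
def hit_writes(k, n=4):
    """Return {(i, j): value} for the stored bits written on a hit to element k."""
    writes = {(k, j): 1 for j in range(k)}                # row k <- 1
    writes.update({(i, k): 0 for i in range(k + 1, n)})   # column k <- 0
    return writes


for k in range(4):
    # Each hit writes k bits of row k plus (3 - k) bits of column k.
    print(k, hit_writes(k))
```

The written values depend only on k, not on the previous contents of the bits, so the update is a pure write.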
PNX1300 combines the two-way and four-way algorithms into an 8-way hierarchical LRU algorithm. A total of ten administration bits are required: six to maintain the four-way LRU plus four to maintain the four two-way LRUs.
The hierarchical algorithm has performance close to full eight-way LRU, but it requires far fewer bits (ten instead of 28) and is much simpler to implement.
To update the LRU bits on a cache hit to element j (with 0 <= j <= 7), the cache applies m = (j div 2) to the four-way LRU administration and applies (j mod 2) to the two-way administration of pair m. To select a replacement victim, the cache first determines the pair p from the four-way LRU and then retrieves the LRU bit q of pair p. The overall LRU element is p × 2 + q.
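The hierarchical update and victim selection can be modeled as in the following Python sketch. It is an illustration under stated assumptions, not the hardware design: the class name HierarchicalLRU8 is hypothetical, the four-way bits are stored as R[i, j] with i > j, the initial values are the reset values given in Section 5.6.6, and each two-way bit is assumed to hold the index (0 or 1) of the least recently used element of its pair, which is consistent with the victim being p × 2 + q.

```python
class HierarchicalLRU8:
    """Illustrative 8-way hierarchical LRU: one four-way LRU over pairs
    plus one two-way LRU bit per pair (ten bits in total)."""

    def __init__(self):
        # Six four-way bits R[i, j], i > j, at their documented reset values.
        self.r = {(1, 0): 1, (2, 0): 1, (3, 0): 1,
                  (2, 1): 0, (3, 1): 0, (3, 2): 0}
        # Four two-way bits, reset to 0.
        self.two_way = [0, 0, 0, 0]

    def touch(self, j):
        """Cache hit to element j, 0 <= j <= 7."""
        m, b = divmod(j, 2)            # m = j div 2, b = j mod 2
        for c in range(m):             # four-way: row m <- 1
            self.r[(m, c)] = 1
        for i in range(m + 1, 4):      # four-way: column m <- 0
            self.r[(i, m)] = 0
        # Assumption: the bit names the LRU element of the pair,
        # so after a hit to b the other element becomes LRU.
        self.two_way[m] = 1 - b

    def victim(self):
        """Return the replacement element p * 2 + q."""
        for p in range(4):
            row_zero = all(self.r[(p, c)] == 0 for c in range(p))
            col_one = all(self.r[(i, p)] == 1 for i in range(p + 1, 4))
            if row_zero and col_one:
                return p * 2 + self.two_way[p]
        raise ValueError("four-way LRU bits are not in a legal state")


lru = HierarchicalLRU8()
print(lru.victim())    # 0: pair 0 is LRU after reset and its two-way bit is 0
lru.touch(0)
print(lru.victim())    # 6: pair 3 is now the LRU pair
```

A full eight-way matrix would need 8 × 7 / 2 = 28 bits; this model keeps 6 + 4 = 10.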
5.6.6 LRU Initialization
Reset causes the LRU administration bits to be initialized to a legal state:
    R[1,0] ← R[2,0] ← R[3,0] ← 1
    R[2,1] ← R[3,1] ← R[3,2] ← 0
    2_way[3] ← 2_way[2] ← 2_way[1] ← 2_way[0] ← 0
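As a quick check of the reset values above, the following Python sketch tests whether a set of six four-way bits describes a consistent recency order. The helper names are hypothetical, and the notion of "legal" used here (the bits encode a total order, so exactly one element qualifies as LRU) is an interpretation of the update rule rather than a definition taken from this data book.

```python
RESET = {(1, 0): 1, (2, 0): 1, (3, 0): 1,
         (2, 1): 0, (3, 1): 0, (3, 2): 0}


def lru_candidates(r, n=4):
    """Elements whose stored row is all 0 and whose stored column is all 1."""
    return [k for k in range(n)
            if all(r[(k, j)] == 0 for j in range(k))
            and all(r[(i, k)] == 1 for i in range(k + 1, n))]


def recency_order(r, n=4):
    """Return elements from most to least recently used, or None if the
    bits do not describe a total order."""
    def newer(a, b):
        # R[i, j] = 1 (i > j) is read as "i referenced more recently than j".
        return r[(a, b)] == 1 if a > b else r[(b, a)] == 0

    wins = [sum(newer(a, b) for b in range(n) if b != a) for a in range(n)]
    if sorted(wins) != list(range(n)):
        return None
    return sorted(range(n), key=lambda a: -wins[a])


print(recency_order(RESET))    # [1, 2, 3, 0]
print(lru_candidates(RESET))   # [0]
```

Under this reading, the reset state behaves as if pairs 0, 3, 2, 1 had been referenced in that order, so pair 0 is the first replacement candidate.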
5.6.7 LRU Bit Definitions
The ten LRU bits per set are mapped as shown in Figure 5-13. This is the format of the LRU field as returned by the special operation rdstatus for the data cache and by a ld32 from MMIO space (see Section 5.4.8, “Reading Tags and Cache Status”) for the instruction cache.
5.6.8 LRU for the Dual-Ported Cache
For the PNX1300 dual-ported data cache, two memory
operations to the same set are possible in a single clock
cycle. To support this concurrency, two updates of the
LRU bits of a single set must be possible.
The following rules are used by PNX1300:
1. LRU bits that are changed by exactly one port receive the value according to the algorithm described above.
2. LRU bits that are changed by both ports receive a value as if the algorithm were first applied for the access in port zero and then for the access in port one.
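The two rules can be modeled as a merge of the per-port write sets, as in the Python sketch below. The function names and the bit naming are hypothetical, and the two-way bit is assumed to hold the index of the least recently used element of its pair. Under these assumptions, the merged result is the same as applying the port-zero update followed by the port-one update.

```python
def hit_writes(j):
    """Bits written on a hit to element j (0..7) of one set, as {name: value}."""
    m, b = divmod(j, 2)
    writes = {("R", m, c): 1 for c in range(m)}               # row m <- 1
    writes.update({("R", i, m): 0 for i in range(m + 1, 4)})  # column m <- 0
    writes[("2_way", m)] = 1 - b    # assumption: bit names the pair's LRU element
    return writes


def dual_port_update(state, j_port0, j_port1):
    """Apply two same-cycle hits to one set's LRU bits."""
    w0, w1 = hit_writes(j_port0), hit_writes(j_port1)
    new_state = dict(state)
    for name in w0.keys() | w1.keys():
        if name in w1:
            # Written by port one only (rule 1) or by both ports (rule 2):
            # port one's value is the final one.
            new_state[name] = w1[name]
        else:
            # Written by port zero only (rule 1).
            new_state[name] = w0[name]
    return new_state


# Reset values of the ten LRU bits.
reset = {("R", 1, 0): 1, ("R", 2, 0): 1, ("R", 3, 0): 1,
         ("R", 2, 1): 0, ("R", 3, 1): 0, ("R", 3, 2): 0,
         ("2_way", 0): 0, ("2_way", 1): 0, ("2_way", 2): 0, ("2_way", 3): 0}

# Port zero hits element 2 (pair 1); port one hits element 5 (pair 2).
after = dual_port_update(reset, 2, 5)
print(after[("R", 2, 1)])   # 1: both ports write R[2,1]; port one's value is kept
```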
5.7 PERFORMANCE EVALUATION SUPPORT
The caches implement support for performance evaluation. Several events that occur in the caches can be counted using the PNX1300 timer/counters, by selecting the source CACHE1 and/or CACHE2, as described in Section 3.8, “Timers.” Two different events can be tracked simultaneously by using two timers.
The MMIO register MEM_EVENTS determines which events are counted. See Figure 5-14 for the format of MEM_EVENTS. Table 5-14 lists the events that can be tracked and the corresponding values for the MEM_EVENTS fields. Event1 selects the actual source
[Figure 5-13 diagram: the ten-bit LRU field (LRU bit 0 through LRU bit 9), holding the six four-way bits R[1,0], R[2,0], R[2,1], R[3,0], R[3,1], R[3,2] and the four two-way bits 2_way[0], 2_way[1], 2_way[2], 2_way[3].]
Figure 5-13. LRU bit definitions; 2_way[k] is the two-way LRU bit of pair k = (j div 2) for set element j.
[Figure 5-14 diagram: MEM_EVENTS (r/w), MMIO_BASE offset 0x10 000C. 32-bit register (bits 31..0) with fields Event2 and Event1; the remaining bits are shown as 0.]
Figure 5-14. Format of the memory_events MMIO register.