
Chapter 5 Operating System Issues
        jmpfdec gr64, next          ; test if all blocks tested
         add    gr65, gr65, 1*16    ; point to next block (delay slot)
;
        const   gr64, 0x100         ; ID-bit mask
        mfsr    gr65, cfg           ; read CFG register
        andn    gr65, gr65, gr64    ; clear the ID bit to enable cache
        mtsr    cfg, gr65           ; write CFG register
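The read-modify-write of the CFG register at the end of the listing can be modelled in C. This is only a sketch. It assumes, from the constant 0x100 and the comments in the listing, that the ID bit has mask 0x100 and that clearing it enables the cache. The names CFG_ID_BIT, cfg_enable_icache and cfg_disable_icache are hypothetical and do not come from any 29K header file.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption taken from the listing: the ID bit of CFG is mask 0x100. */
#define CFG_ID_BIT 0x100u

/* Return the CFG value with the ID bit cleared (cache enabled).
   Clearing a bit is an AND with the complemented mask. */
uint32_t cfg_enable_icache(uint32_t cfg)
{
    uint32_t out = cfg & ~CFG_ID_BIT;
    assert((out & CFG_ID_BIT) == 0u);   /* postcondition: ID bit is clear */
    return out;
}

/* Return the CFG value with the ID bit set (cache disabled). */
uint32_t cfg_disable_icache(uint32_t cfg)
{
    return cfg | CFG_ID_BIT;
}
```

All other CFG bits pass through unchanged, which is why the register is read first and written back afterwards rather than simply overwritten.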
With a 2/1 memory system, testing and invalidating each block takes 10 cycles
(2/1 refers to the memory system access times: 2 cycles for the first access and
1 cycle for each subsequent access). This amounts to 1280 cycles for all blocks in
column 0, or 51.2 microseconds for a 25 MHz processor. In practice the example code
incurs considerable overhead and is unlikely to achieve an overall system benefit
over simply invalidating the whole cache in a single cycle.
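The arithmetic behind these figures can be checked with a few lines of C. This is a sketch under stated assumptions: the block count of 128 is not given directly in the text but is inferred from 1280 cycles at 10 cycles per block, and the function names are hypothetical.

```c
#include <assert.h>

/* Total cycles needed to test and invalidate `blocks` cache blocks. */
unsigned long invalidate_cycles(unsigned long blocks,
                                unsigned long cycles_per_block)
{
    return blocks * cycles_per_block;
}

/* Convert a cycle count to microseconds for a clock given in MHz.
   One MHz is one cycle per microsecond, so this is a plain division. */
double cycles_to_us(unsigned long cycles, double clock_mhz)
{
    assert(clock_mhz > 0.0);
    return (double)cycles / clock_mhz;
}
```

With 128 blocks (inferred), 10 cycles per block and a 25 MHz clock, invalidate_cycles gives 1280 and cycles_to_us gives 51.2, matching the text. The same loop on a slower memory system would scale the per-block cost, and the total, accordingly.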
5.13.2 Instruction Cache Coherence
The 29K family does not contain unified instruction and data caches. Unified
caches can give a higher hit rate than split caches of the same total size. However,
separate instruction and data caches enable higher performance because both can be
accessed during the same processor cycle. There are fewer problems with
instruction cache coherence than with data cache coherence, because memory
supplying instructions is unlikely to be modified by another processor or an external
DMA controller. Yet a processor can use store instructions to place new instructions
in memory (assuming a write-through policy, described in the following Data Cache
Maintenance section). When this occurs, the affected memory may already be held in
the instruction cache. It is important that the instruction cache be
invalidated after self-modifying code has changed memory that will later be
accessed for instructions. Because cache invalidation can only be performed by
Supervisor mode code, a system call service may be required to invalidate the cache.
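The hazard can be illustrated with a toy model in C. This is not a description of 29K hardware: it is a minimal sketch that assumes a direct-mapped cache with one word per line and assumes that stores update memory without touching the instruction cache. All names (toy_system, toy_fetch, toy_store, toy_invalidate_icache) are hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define MEM_WORDS   64u   /* toy memory size, in words */
#define CACHE_LINES 8u    /* toy direct-mapped cache, one word per line */

typedef struct {
    uint32_t mem[MEM_WORDS];      /* backing memory */
    uint32_t line[CACHE_LINES];   /* cached instruction words */
    uint32_t tag[CACHE_LINES];    /* word address held by each line */
    int      valid[CACHE_LINES];  /* per-line valid bits */
} toy_system;

void toy_init(toy_system *s)
{
    memset(s, 0, sizeof *s);
}

/* Instruction fetch: served from the cache on a hit, filled on a miss. */
uint32_t toy_fetch(toy_system *s, uint32_t addr)
{
    assert(addr < MEM_WORDS);
    uint32_t i = addr % CACHE_LINES;
    if (!s->valid[i] || s->tag[i] != addr) {
        s->line[i]  = s->mem[addr];
        s->tag[i]   = addr;
        s->valid[i] = 1;
    }
    return s->line[i];
}

/* Data store: updates memory only (assumed: no instruction cache update). */
void toy_store(toy_system *s, uint32_t addr, uint32_t value)
{
    assert(addr < MEM_WORDS);
    s->mem[addr] = value;
}

/* Invalidate every line, as a Supervisor mode service might do. */
void toy_invalidate_icache(toy_system *s)
{
    memset(s->valid, 0, sizeof s->valid);
}
```

In this model, a fetch from an address, followed by a store of a new word to that address and a second fetch, still returns the old word. Only after toy_invalidate_icache is called does the fetch see the new instruction, which is the reason the invalidation step is needed after self-modifying code runs.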
The instruction cache operates with virtual address tags when address
translation is in use (Physical Instruction (PI) bit clear in the CPS register). The
cache tags do not contain any per-process identifiers, but they can distinguish
between User and Supervisor mode accesses. When address translation is used, a User
mode virtual address may map to the same physical address as a Supervisor mode
virtual address. The cache would nevertheless assign a separate block to each
virtual address, so the instructions on shared instruction pages could be cached twice.
This makes inefficient use of the cache but is unlikely to lead to any problems
unless the instructions on the shared physical page are modified. Note that two User
mode processes cannot both hold cached copies of a shared physical page in this way,
as the cache must be invalidated when a process context switch occurs.
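The double caching can be shown with a small C model. It is a sketch only: it assumes a tiny fully associative cache whose tag is the pair (virtual address, mode bit), with round-robin replacement. Neither the organization nor the names (vcache, vcache_access, vcache_copies, vcache_invalidate) reflect the real 29K cache.

```c
#include <assert.h>
#include <stdint.h>

#define SLOTS 4u   /* toy fully associative cache */

typedef struct {
    int      valid;
    int      supervisor;  /* mode bit stored alongside the tag */
    uint32_t vaddr;       /* virtual address tag */
    uint32_t paddr;       /* physical block the entry was filled from */
} slot;

typedef struct {
    slot     slots[SLOTS];
    unsigned next;        /* round-robin replacement pointer */
} vcache;

/* Invalidate all entries, as required on a process context switch. */
void vcache_invalidate(vcache *c)
{
    for (unsigned i = 0; i < SLOTS; i++)
        c->slots[i].valid = 0;
    c->next = 0;
}

/* Look up (vaddr, mode). On a miss, fill a slot from physical address paddr.
   Returns 1 on a hit and 0 on a miss. The physical address plays no part
   in the lookup, only the virtual tag and the mode bit do. */
int vcache_access(vcache *c, uint32_t vaddr, int supervisor, uint32_t paddr)
{
    assert(supervisor == 0 || supervisor == 1);
    for (unsigned i = 0; i < SLOTS; i++) {
        const slot *s = &c->slots[i];
        if (s->valid && s->vaddr == vaddr && s->supervisor == supervisor)
            return 1;
    }
    slot *victim = &c->slots[c->next];
    c->next = (c->next + 1u) % SLOTS;
    victim->valid      = 1;
    victim->supervisor = supervisor;
    victim->vaddr      = vaddr;
    victim->paddr      = paddr;
    return 0;
}

/* Count valid entries that were filled from the given physical block. */
unsigned vcache_copies(const vcache *c, uint32_t paddr)
{
    unsigned n = 0;
    for (unsigned i = 0; i < SLOTS; i++)
        if (c->slots[i].valid && c->slots[i].paddr == paddr)
            n++;
    return n;
}
```

Accessing one physical block through a User mode virtual address and then through a Supervisor mode virtual address produces two misses and leaves two copies in the cache. Calling vcache_invalidate, as on a context switch, removes both.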
5.13.3 Branch Target Cache
The Am29000 and Am29050 3-bus processors have a Branch Target Cache
(BTC) which can supply the first four instructions of a previously taken branch.