
Chapter 5 Operating System Issues
block boundary. This would lead to an expansion of code space requirements and is
likely to produce little performance improvement.
5.13.5 Am29240 and Am29040 Processors
The Am29240 microcontroller has a 4K byte instruction cache. The Am29040
2–bus microprocessor has an 8K byte instruction cache. The caches are implemented
using a similar two–way set associative architecture. The major difference from the
earlier Am29030 processor cache is that the block status information has a valid bit
per instruction. The resulting four valid bits per block allow partially filled
cache blocks to be supported. This has been shown to produce an average
performance gain of 4% over the method using one valid bit per block. However,
the performance difference may be larger for code which contains an unusually
large number of branch instructions. Note that the Am29240 microcontroller only
caches instructions held in DRAM or SRAM address regions.
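The effect of per–instruction valid bits can be illustrated with a small C model.
This is only a sketch, not a description of the actual hardware. The structure
icache_block and the functions icache_hit and icache_fill_word are hypothetical
names, the 16–byte block size is inferred from the four valid bits, and set
selection is ignored so that a single block can be shown in isolation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define WORDS_PER_BLOCK 4u   /* four instructions per block, one valid bit each */

/* Hypothetical model of one instruction cache block (not the hardware layout). */
typedef struct {
    uint32_t tag;                    /* address tag shared by the whole block    */
    uint8_t  valid;                  /* bit i set => word[i] holds a valid entry */
    uint32_t word[WORDS_PER_BLOCK];  /* cached instructions                      */
} icache_block;

/* Position of an instruction within its block (assumes 4-byte instructions). */
static unsigned word_index(uint32_t addr)
{
    return (addr >> 2) & (WORDS_PER_BLOCK - 1u);
}

/* Simplified tag: the block address. A real cache would also strip set bits. */
static uint32_t block_tag(uint32_t addr)
{
    return addr >> 4;
}

/* A hit needs a matching tag AND a set valid bit for the requested word. */
static bool icache_hit(const icache_block *b, uint32_t addr)
{
    assert(b != NULL);
    return b->tag == block_tag(addr) && ((b->valid >> word_index(addr)) & 1u);
}

/* Store one instruction; the other words of the block may remain invalid. */
static void icache_fill_word(icache_block *b, uint32_t addr, uint32_t insn)
{
    assert(b != NULL);
    if (b->tag != block_tag(addr)) {   /* block reallocated: drop old contents */
        b->tag = block_tag(addr);
        b->valid = 0;
    }
    b->word[word_index(addr)] = insn;
    b->valid |= (uint8_t)(1u << word_index(addr));
}
```

In this model, filling only the word at address 0x1008 leaves valid equal to 0x4.
A later fetch of 0x1008 hits, while a fetch of 0x1000 in the same block still
misses. With a single valid bit per block, the block could not be used at all
until every word had been loaded.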
Because validity is tracked per instruction rather than per block, it is possible
to interrupt cache reload with a higher priority operation. This means LOAD
instructions need not wait until the end of the current block reload before they
can gain access to the processor buses. Unlike the block–oriented cache of the
Am29030, cache reload begins with the target instruction of a branch, not the
first instruction of the block. As with the Am29030, instructions are forwarded
for execution in parallel with cache block reload. During instruction prefetch,
the next block is fetched ahead if it is not already in the cache or if any of
its valid bits are clear.
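The reload and prefetch rules can be sketched in C as follows. The function names
are hypothetical, and the model assumes four instructions per block, an
uninterrupted reload, and a reload that runs from the branch target to the end of
the block without wrapping back to the earlier words. That last point is an
assumption made for illustration, not something stated above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define WORDS_PER_BLOCK 4u
#define ALL_VALID       0xFu   /* all four per-instruction valid bits set */

/* Valid bits after a reload that starts at the branch target instruction.
   Assumption: words before the target are left invalid (no wrap-around). */
static uint8_t reload_mask_from_target(uint32_t target_addr)
{
    unsigned first = (target_addr >> 2) & (WORDS_PER_BLOCK - 1u);
    return (uint8_t)((ALL_VALID << first) & ALL_VALID);
}

/* Block-oriented reload, as on the Am29030: always starts at word 0. */
static uint8_t reload_mask_whole_block(void)
{
    return (uint8_t)ALL_VALID;
}

/* Prefetch rule from the text: fetch the next block if it is absent from
   the cache or if any of its valid bits are clear. */
static bool next_block_needs_fetch(bool tag_present, uint8_t valid_mask)
{
    assert(valid_mask <= ALL_VALID);
    return !tag_present || valid_mask != ALL_VALID;
}
```

For a branch target at 0x2008 (word 2 of its block) the model gives a mask of 0xC,
so words 0 and 1 remain invalid. If that block is later reached by sequential
prefetch, next_block_needs_fetch reports that it must still be fetched, because
two of its valid bits are clear.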
The instruction cache can be invalidated in a single cycle using an INV or
IRETINV instruction. These instructions also simultaneously invalidate the data
cache. To invalidate only the instruction cache, the INVI and IRETINVI
instructions are provided.
5.14 DATA CACHE MAINTENANCE
Newer members of the 29K family can operate with internal processor speeds
which are higher than the off–chip memory system speeds. This ability is known as
Scalable Clocking. To obtain the processing benefits of the higher internal pipeline
speed, it becomes important to prevent pipeline stalling due to accesses to any
off–chip data memory. For this reason, on–chip data cache has been incorporated into
the 29K family. When a cache hit occurs, the accessed data is supplied by the cache
rather than off–chip memory. If the number of cache hits can be kept high, the
potential pipeline stalling which results from a cache miss can be minimized.
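The relationship between hit ratio and stalling can be made concrete with a simple
average–cost calculation. The C sketch below is illustrative only. The function
names are hypothetical, the cycle counts passed in are assumptions rather than
figures for any particular 29K processor, and the model assumes that an off–chip
access lasting a given number of memory cycles occupies that number multiplied by
the clock ratio in processor cycles.

```c
#include <assert.h>

/* Processor cycles consumed by an off-chip access, assuming the processor
   runs clock_ratio times faster than the memory system (Scalable Clocking). */
static unsigned offchip_cycles_internal(unsigned mem_cycles, unsigned clock_ratio)
{
    assert(clock_ratio >= 1u);
    return mem_cycles * clock_ratio;
}

/* Average processor cycles per data access for a given cache hit ratio. */
static double avg_access_cycles(double hit_ratio, double hit_cycles,
                                double miss_cycles)
{
    assert(hit_ratio >= 0.0 && hit_ratio <= 1.0);
    return hit_ratio * hit_cycles + (1.0 - hit_ratio) * miss_cycles;
}
```

For example, if a hit is assumed to cost 1 cycle and a miss 6 cycles, a hit ratio
of 0.9 gives an average of about 1.5 cycles per access, while a hit ratio of 0.5
gives 3.5 cycles. Doubling the internal clock relative to the memory system
doubles the miss cost in processor cycles, which is why a high hit ratio matters
more as the clock ratio increases.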
As with instruction caches, two–way set associative addressing is used (see
section 6.2). However, unlike instruction caches, 29K family data caches are always
accessed with physical rather than potentially virtual addresses. Physically
addressed caches have advantages over virtually addressed caches. For example,