Chapter 5 Operating System Issues
memory. This seldom happens, but when it does, the write–buffer, which normally
has the lowest priority, is given a higher priority for accessing the system buses.
Because load instructions have a bigger impact on performance than store
instructions, cache reload may be performed before the write–buffer is emptied. The
Am29240 has dependency logic to detect if a load is performed on a data address
which is currently pending in the write–buffer. The data is forwarded from the
write–buffer when necessary. Because the Am29040 has a copy–back rather than
a write–through policy, the write–buffer is flushed before loads that miss in the
cache are performed; this is explained in the later Am29040 Microprocessor section.
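The Am29240 dependency check described above can be pictured with a small C model. This is only a sketch under stated assumptions, not the hardware implementation: the buffer depth WB_DEPTH, the type write_buffer, and the functions wb_push, wb_forward and load_word are hypothetical names introduced for illustration, and memory is modeled as a plain array of words.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define WB_DEPTH 2  /* hypothetical depth; the text does not state one */

typedef struct {
    uint32_t addr[WB_DEPTH];  /* byte address of each pending store */
    uint32_t data[WB_DEPTH];
    size_t   count;           /* pending entries, oldest at index 0 */
} write_buffer;

/* Queue a store. Returns false when the buffer is full, which in this
 * model stands for the processor having to wait. */
static bool wb_push(write_buffer *wb, uint32_t addr, uint32_t data)
{
    if (wb->count == WB_DEPTH)
        return false;
    wb->addr[wb->count] = addr;
    wb->data[wb->count] = data;
    wb->count++;
    return true;
}

/* Dependency check: search newest to oldest so the latest store wins. */
static bool wb_forward(const write_buffer *wb, uint32_t addr, uint32_t *out)
{
    for (size_t i = wb->count; i-- > 0; ) {
        if (wb->addr[i] == addr) {
            *out = wb->data[i];
            return true;
        }
    }
    return false;
}

/* A load that missed in the cache: take the data from the write buffer
 * if a store to the same address is still pending, else read memory. */
static uint32_t load_word(const write_buffer *wb, const uint32_t *mem,
                          uint32_t addr)
{
    uint32_t value;
    if (wb_forward(wb, addr, &value))
        return value;
    return mem[addr / 4];  /* mem is indexed by word, addr is in bytes */
}
```

In this model a load never waits for the buffer to drain; it either receives forwarded data or reads memory directly, which is the Am29240 behavior the text describes. A model of the Am29040 would instead drain the buffer before servicing the missed load.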
The write–buffer is disabled when the data cache is disabled. In this case the
processor is not decoupled from the performance of memory writes. Before interrupt
processing commences or when a serializing instruction is executed, the write buffer
is flushed. Additionally, execution of LOADL or LOADSET instructions (which
bypass the data cache) is preceded by write–buffer flushing. Store instructions are
properly ordered, and since the STOREM instruction bypasses the write–buffer, the
buffer is emptied before the STOREM commences.
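The flush conditions listed above can be summarized as a simple predicate. The sketch below is illustrative only: the enumeration pipeline_event and the function requires_wb_flush are hypothetical names, and the treatment of ordinary loads follows the Am29240 behavior described earlier (the Am29040 additionally flushes before loads that miss).

```c
#include <stdbool.h>

/* Hypothetical event names used only for this illustration. */
typedef enum {
    EV_LOAD,         /* ordinary load */
    EV_STORE,        /* ordinary store, queued in the write buffer */
    EV_INTERRUPT,    /* interrupt processing about to commence */
    EV_SERIALIZING,  /* serializing instruction */
    EV_LOADL,        /* bypasses the data cache */
    EV_LOADSET,      /* bypasses the data cache */
    EV_STOREM        /* bypasses the write buffer */
} pipeline_event;

/* Returns true when the write buffer must be emptied before the event
 * proceeds. With the data cache disabled the write buffer is disabled
 * too, so there is nothing to flush. */
static bool requires_wb_flush(pipeline_event ev, bool dcache_enabled)
{
    if (!dcache_enabled)
        return false;

    switch (ev) {
    case EV_INTERRUPT:
    case EV_SERIALIZING:
    case EV_LOADL:
    case EV_LOADSET:
    case EV_STOREM:
        return true;
    default:
        return false;  /* ordinary loads and stores do not force a flush */
    }
}
```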
Data cache reload, resulting from a load access which missed, always fills a
complete block. The process of reloading the cache is assisted with a reload buffer
which temporarily holds the data fetched from memory. The cache reload buffer is
four words deep. When the buffer is full, its contents are transferred into the cache in a single
cycle, at a time when the cache is not otherwise being accessed. Code continues to execute
during cache reload, and the cache continues to service accesses that hit.
However, if a further data load operation is performed on data not found in the cache,
the processor pipeline will stall until the current reload operation is complete. When
the reload buffer becomes available the second reload operation will commence (if
necessary) and the pipeline will restart instruction processing.
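The reload sequence can be modeled as a small state machine. The following C sketch rests on several assumptions that are not taken from the text: the cache holds a single block, a block is the same size as the four-word reload buffer, and one word arrives from memory per call to reload_step. All type and function names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define RELOAD_WORDS 4  /* reload buffer depth, per the text */

typedef struct {
    bool     active;                /* a reload is in progress */
    uint32_t block_addr;            /* word index of the block's first word */
    uint32_t words[RELOAD_WORDS];
    int      filled;                /* words fetched so far */
} reload_buffer;

typedef struct {                    /* one-block cache: a simplification */
    bool     valid;
    uint32_t block_addr;
    uint32_t words[RELOAD_WORDS];
} cache_block;

typedef enum { ACCESS_HIT, ACCESS_MISS_STARTED, ACCESS_STALL } access_result;

/* A load access. Hits are serviced even during a reload; a miss while a
 * reload is already in progress stalls the pipeline. */
static access_result load_access(const cache_block *c, reload_buffer *rb,
                                 uint32_t word_index)
{
    uint32_t block = word_index & ~(uint32_t)(RELOAD_WORDS - 1);

    if (c->valid && c->block_addr == block)
        return ACCESS_HIT;
    if (rb->active)
        return ACCESS_STALL;

    rb->active = true;
    rb->block_addr = block;
    rb->filled = 0;
    return ACCESS_MISS_STARTED;
}

/* Advance the reload by one step: fetch one word, or, once the buffer is
 * full and the cache is not being accessed, move the whole block into the
 * cache in a single step. */
static void reload_step(cache_block *c, reload_buffer *rb,
                        const uint32_t *mem, bool cache_busy)
{
    if (!rb->active)
        return;
    if (rb->filled < RELOAD_WORDS) {
        rb->words[rb->filled] = mem[rb->block_addr + (uint32_t)rb->filled];
        rb->filled++;
        return;
    }
    if (!cache_busy) {
        memcpy(c->words, rb->words, sizeof c->words);
        c->block_addr = rb->block_addr;
        c->valid = true;
        rb->active = false;         /* buffer is available again */
    }
}
```

A caller that receives ACCESS_STALL would keep calling reload_step until the buffer is released and then retry the access, at which point the second reload can begin.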
The following sections present further detail about data caching for individual
29K family members. Table 5-3 summarizes this information.
5.14.1 Am29240 Microcontroller
A block diagram of the Am29240 cache architecture is shown in Figure 5-7.
The precise cache implementation may differ from the diagram but the data flow
paths can be seen.
A buffered “write–through” policy is implemented for all data stores. If a store
hits a cached entry, then the cache is updated during the same cycle as
the store. All stores cause writes to off–chip memory, but the write–through buffer
enables the processor to continue code execution while the stores are completed in
parallel.
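The buffered write–through policy can be sketched in C as follows. This is a simplified model, not the Am29240 design: the direct-mapped cache of one-word lines, the sizes WT_LINES and WT_QDEPTH, and the names wt_cache, wt_queue, wt_store and wt_drain_one are all hypothetical. The sketch also assumes that a store which misses does not allocate a cache entry, since the text describes reload only for loads that miss.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define WT_LINES  4  /* hypothetical: direct-mapped, one word per line */
#define WT_QDEPTH 2  /* hypothetical write-through buffer depth */

typedef struct {
    bool     valid[WT_LINES];
    uint32_t tag[WT_LINES];   /* full word index kept as the tag */
    uint32_t data[WT_LINES];
} wt_cache;

typedef struct {
    uint32_t word[WT_QDEPTH]; /* word index of each pending store */
    uint32_t data[WT_QDEPTH];
    size_t   count;           /* oldest entry at index 0 */
} wt_queue;

/* Perform a store. On a hit the cache is updated at once; every store,
 * hit or miss, is also queued for off-chip memory. Returns false when
 * the buffer is full and the store would have to wait. */
static bool wt_store(wt_cache *c, wt_queue *q, uint32_t word, uint32_t value)
{
    if (q->count == WT_QDEPTH)
        return false;

    size_t line = word % WT_LINES;
    if (c->valid[line] && c->tag[line] == word)
        c->data[line] = value;       /* hit: update the cached copy */
    /* miss: assumed not to allocate a cache entry */

    q->word[q->count] = word;
    q->data[q->count] = value;
    q->count++;
    return true;
}

/* Complete the oldest pending store to memory; in the real part this
 * proceeds in parallel with code execution. */
static void wt_drain_one(wt_queue *q, uint32_t *mem)
{
    if (q->count == 0)
        return;
    mem[q->word[0]] = q->data[0];
    for (size_t i = 1; i < q->count; i++) {
        q->word[i - 1] = q->word[i];
        q->data[i - 1] = q->data[i];
    }
    q->count--;
}
```

The point of the model is that wt_store returns as soon as the store is queued, so memory may briefly hold stale data while the cache and the buffer already hold the new value.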
The cache is accessed in the execute stage of the pipeline even if address
translation is in use. This makes data that hits in the cache available for the instruction