
Evaluating and Programming the 29K RISC Family
The instruction cache hit–ratio is 47.7% for the Am29035 using a 2/1 DRAM
system. The Am29035 has the smallest instruction cache, 4K bytes. The Am29030
and Am29040 have an instruction cache hit–ratio of 95.1%. This is due to their
larger 8K byte cache.
20 MHz Memory Systems
The performance of processors operating with 20 MHz memory systems is
shown in Figure 8-7. The relatively inexpensive Am29035 processor is not available
at this or higher frequencies. With the Am29030 processor, the 3/2 SRAM system is
faster than the 3/1 DRAM. This is due to the higher cache hit–ratio of the Am29030
compared to the Am29035. The larger cache reduces the impact cache reload has on
performance. However, it reveals the cost of data memory access, which occurs
frequently with the LAPD benchmark. Accessing a single 32–bit data object in DRAM
costs 4 cycles (1–cycle precharge plus 3–cycle access). The SRAM requires only
3 cycles for the same task. Consequently, the SRAM, although it cannot sustain
1–cycle burst, is faster. Using the Am29040 processor, the 3/1 DRAM is again a
better choice than the 3/2 SRAM. This is a result of the on–chip data cache reducing
the effects of memory precharge for data accesses.
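The cycle counts above can be reproduced with a small calculation. The following is a minimal sketch, not taken from the book: the function name access_cycles, the reading of "X/Y" as X cycles for the first access and Y cycles per subsequent burst access, and the 4–word burst length are all assumptions made for illustration. It also assumes the 1–cycle DRAM precharge is paid once at the start of each new access.

```python
def access_cycles(first, burst, words, precharge=0):
    """Cycles needed to transfer `words` consecutive 32-bit words.

    first     -- cycles for the first access (the X in "X/Y")
    burst     -- cycles for each following burst access (the Y in "X/Y")
    precharge -- extra cycles paid once before the first access (assumed)
    """
    if words < 1:
        raise ValueError("words must be at least 1")
    return precharge + first + (words - 1) * burst


# Single 32-bit data access, as described in the text.
dram_single = access_cycles(3, 1, 1, precharge=1)  # 1 + 3 = 4 cycles
sram_single = access_cycles(3, 2, 1)               # 3 cycles

# Hypothetical 4-word burst (burst length is an assumption, not from the text).
dram_burst4 = access_cycles(3, 1, 4, precharge=1)  # 1 + 3 + 3*1 = 7 cycles
sram_burst4 = access_cycles(3, 2, 4)               # 3 + 3*2 = 9 cycles

print(dram_single, sram_single, dram_burst4, sram_burst4)
```

Under these assumptions the SRAM wins on isolated data accesses while the DRAM wins on longer bursts, which is consistent with the trade-off described for the Am29030 and Am29040.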
The 50 ns period of a 20 MHz system clock makes 1–cycle burst–mode access
possible using slightly faster DRAM. Additionally, 2–cycle first access (100 ns) is
possible with 80 ns DRAM. The 60 ns RAS precharge can be hidden if DRAM is
combined with ROM, with the precharge occurring during ROM access. Alternatively,
DRAM access, including RAS precharge, takes 140 ns (80+60), which in practice means
3 cycles for an initial new page access (150 ns).
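The conversion from nanoseconds to whole clock cycles used in this paragraph can be sketched as follows. This is an illustrative helper only; the names period_ns and cycles_needed are hypothetical, and the calculation ignores logic delays and setup times, which the paragraph also leaves out.

```python
def period_ns(mhz):
    """Clock period in nanoseconds for a clock frequency given in MHz."""
    return 1000 / mhz


def cycles_needed(access_ns, mhz):
    """Smallest whole number of clock cycles covering `access_ns`.

    Uses integer ceiling division to avoid floating-point rounding.
    Both arguments are expected to be integers.
    """
    return -(-(access_ns * mhz) // 1000)


# Figures quoted in the text for a 20 MHz system clock.
first_access = cycles_needed(80, 20)             # 80 ns DRAM fits in 2 cycles (100 ns)
with_precharge = cycles_needed(80 + 60, 20)      # 140 ns needs 3 cycles (150 ns)

print(period_ns(20), first_access, with_precharge)
```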
The results for 2/1 and 2/2 DRAM–only systems are highlighted on Figure 8-7.
Using an Am29030 processor, a 2/1 system has 15% more performance than the 2/2
system. The difference is 24% when using an Am29040 processor. Scalable
Clocking enables the Am29030 to be replaced with an Am29040 which improves the
performance of the 2/2 system by 96%. With a 2/1 system, the performance is
improved by 112%.
25 MHz Memory Systems
The performance of processors operating with 25 MHz memory systems is
shown in Figure 8-8. At 25 MHz the cycle time is reduced to 40 ns. This makes
1–cycle burst–mode access difficult to achieve. Let’s look at the arithmetic: fast
DRAMs have an access time of, say, 20 ns from CAS assertion, and 10 ns CAS
precharge. Let’s also assume fast 5 ns PAL logic, and a best–case input setup time of 6
ns for the Am29040 (12 ns for the Am29030). This results in a total access time of 41 ns (47
ns for the Am29030). This makes 1–cycle burst–mode impossible at 25 MHz
without an interleaved memory system. However, an Am29040 could be operated at
24 MHz and just achieve the timing requirements for 1–cycle burst–mode access. An
initial 2–cycle access time cannot be achieved with 80 ns DRAM. This would require