
Chapter 1 Architectural Overview
inadequate memory bandwidth, high memory access latency, bus access contention,
excessive program branching, and instruction dependencies. To get the best from a
processor, an understanding of instruction stream dependencies is required. Processors
in the 29K family all have pipeline interlocks supported by processor hardware.
The programmer does not have to ensure correct pipeline operation, as the processor
will take care of any dependencies. However, it is best that the programmer arranges
code execution to smooth the pipeline operation.
1.13 PIPELINE DEPENDENCIES
Modification of some registers has a delayed effect on processor behavior.
When developing assembly code, care must be taken to prevent unexpected behavior.
The easiest of the delayed effects to remember is the one cycle that must separate
setting an indirect pointer from using it. This occurs most often with the register
stack pointer, gr1. A local register cannot be accessed in the instruction that
immediately follows the instruction that writes to gr1. An instruction that does not
require gr1 (which rules out any local register, since these are referenced via gr1)
can be placed immediately after the instruction that updates gr1.
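The one-cycle separation can be sketched in 29K assembly as follows. This is an illustrative fragment only: the register numbers, the frame size of 16 bytes, and the choice of filler instruction are assumptions, not taken from the text.

```asm
        sub     gr1, gr1, 16      ; update the register stack pointer
        const   gr96, 0           ; filler: uses neither gr1 nor any local register
        add     lr2, lr3, lr4     ; local registers are now addressed via the new gr1
```

Any useful instruction that touches only global registers can take the place of the filler, so the cycle need not be wasted.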
Direct modification of the Current Processor Status (CPS) register must also be
done carefully, particularly where the Freeze (FZ) bit is reset. When the processor is
frozen, the special-purpose registers are not updated during instruction execution.
This means that the PC1 register does not reflect the actual program counter value at
the current execution address, but rather at the point where freeze mode was entered.
When the processor is unfrozen, either by an interrupt return or direct modification of
the CPS, two cycles are required before the PC1 register reflects the new execution
address. Unless the CPS register is being modified directly, this creates no problem.
Consider the following examples. If the FZ bit is reset and trace enable (TE) is
set at the same time, the next instruction should cause a trace trap, but the PC-buffer
registers frozen by the trap will not have had time to catch up with the current
execution address. Within the trap code, the processor will appear to have stopped at
some random address, held in PC1. If interrupts and traps are enabled at the same
time as the FZ bit is cleared, then the next instruction may suffer an external interrupt
or an illegal instruction trap. Once again, the PC-buffer registers will not reflect the
true execution address. An interrupt return would cause execution to commence at a
random address. The above problems can be avoided by resetting FZ two cycles
before enabling the processor to once again enter freeze mode.
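A minimal sketch of the two-step approach is given below: FZ is cleared first, and interrupts and traps are enabled only after two further instructions have passed. The bit masks (FZ as 0x400, DA and DI as the two low-order bits), the scratch registers gr64 and gr65, and the use of two nop instructions to stand in for the two cycles are all assumptions made for illustration; they should be checked against the processor manual.

```asm
        mfsr    gr64, cps         ; read the Current Processor Status register
        const   gr65, 0x400       ; assumed mask for the FZ bit
        andn    gr64, gr64, gr65  ; clear FZ, leave every other bit unchanged
        mtsr    cps, gr64         ; unfreeze; interrupts and traps still disabled
        nop                       ; first cycle for the PC-buffer to catch up
        nop                       ; second cycle
        andn    gr64, gr64, 0x3   ; assumed mask for DA and DI: enable interrupts and traps
        mtsr    cps, gr64         ; safe to re-enter freeze mode from here on
```

The FZ mask is loaded into a register because it is assumed not to fit in the 8-bit immediate field of andn. The same ordering applies when TE is to be set: write the CPS once to clear FZ, then again, at least two cycles later, to set TE.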
Instruction Memory Latency
The Branch Target Cache (BTC), or the Instruction Memory Cache, can be used
to remove the pipeline stalling that normally occurs when the processor executes a
branch instruction. To illustrate memory access latency, the effects of the BTC are
considered here. The address of a branch target appears on the address