Chapter 3 Assembly Language Programming
3.1.7 Floating–point
The Floating–Point instructions (Tables 3–10 and 3–11) provide operations on
single–precision (32–bit) or double–precision (64–bit) floating–point data. In addi-
tion, they provide conversions between single–precision, double–precision, and in-
teger number representations. In most 29K family members, these instructions cause
traps to routines which perform the floating–point operations in software. The
Am29050 processor supports all floating–point instructions directly in hardware. It
also has four additional instructions not shown in Tables 3–10 and 3–11. They are
FMAC, DMAC, FMSM, and DMSM, and are used to perform single– and double–precision
multiply–and–accumulate type operations (see section 3.3.5).
3.1.8 Branch
The Branch instructions (Table 3–12) control the execution flow of instructions.
Branch target addresses may be absolute, relative to the Program Counter (with the
offset given by a signed instruction constant), or contained in a general–purpose reg-
ister (indirect addressing). For conditional jumps, the outcome of the jump is based
on a Boolean value in a general–purpose register. Only the most significant bit of the
specified condition register is tested; Boolean TRUE is defined as bit 31 being set.
Procedure calls are unconditional, and save the return address in a general–purpose
register. All branches have a delayed effect; the instruction following the branch is
executed regardless of the outcome of the branch.
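The Boolean convention described above can be modeled in a few lines of C. This is only a sketch: the helper name is_true_29k is hypothetical and is not part of any 29K toolchain. It shows that a register holding the value 1 is FALSE under this convention, because only bit 31 is examined, whereas C itself treats any nonzero value as true.

```c
#include <stdint.h>

/* Hypothetical helper: models how a conditional jump interprets a
 * 32-bit condition register. Only bit 31 (the most significant bit)
 * is examined; all other bits are ignored. */
static int is_true_29k(uint32_t reg)
{
    return (int)((reg >> 31) & 1u);
}
```

Under this model, is_true_29k(0x80000000u) yields 1, while is_true_29k(0x00000001u) yields 0.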
The instruction following the branch instruction is referred to as the delay slot
instruction. Assembly level programmers may have some difficulty in always finding
a useful instruction to put in the delay slot. It is best to find an operation required
regardless of the outcome of the branch. As a last resort a NOP instruction can be
used, but this makes no effective use of the processor pipeline. When programming
in a high level language, the compiler is responsible for making effective use of delay
slots. Programmers not familiar with delayed branching often forget the delay slot is
always executed, with unfortunate consequences. For this reason, the example code
throughout this book shows delay slot instructions indented one space compared to
other instructions. This has proven to be a useful reminder.
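The effect of delayed branching can be illustrated with a toy interpreter written in C. This is a sketch under stated assumptions: the three operations (OP_INC, OP_JMP, OP_HALT), the struct insn layout, and the function run_delayed are all invented for illustration and do not correspond to 29K instructions or to any real simulator. A jump only records its target; the transfer of control happens after the following (delay slot) instruction has executed.

```c
#include <stddef.h>

/* Hypothetical toy machine, not the 29K instruction set. */
enum op { OP_INC, OP_JMP, OP_HALT };

struct insn {
    enum op op;
    size_t  target;   /* index of the jump target; used by OP_JMP only */
};

/* Runs prog with delayed-branch semantics and returns the number of
 * OP_INC instructions that were executed. */
static int run_delayed(const struct insn *prog, size_t len)
{
    size_t pc = 0;
    size_t pending = 0;      /* target recorded by the previous jump */
    int    has_pending = 0;  /* nonzero while a jump is waiting to take effect */
    int    count = 0;

    while (pc < len) {
        const struct insn *i = &prog[pc];
        size_t next = pc + 1;

        /* If the previous instruction was a jump, this instruction is its
         * delay slot: it still executes, then control moves to the target. */
        if (has_pending) {
            next = pending;
            has_pending = 0;
        }

        switch (i->op) {
        case OP_INC:
            count++;
            break;
        case OP_JMP:
            pending = i->target;
            has_pending = 1;
            break;
        case OP_HALT:
            return count;
        }
        pc = next;
    }
    return count;
}
```

For the four-instruction program { JMP 3, INC, INC, HALT }, the first INC sits in the delay slot and executes even though the jump is taken; the second INC is skipped, so run_delayed returns 1.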
The delay slots of unconditional branches are easier to fill than those of conditional
branches. The instruction at the target of the branch can be moved to, or duplicated at,
the delay slot, and the jump address changed to the instruction following the
original target instruction.
The JMPFDEC instruction is very useful for implementing control loops based
on a decrementing counter. The counter register (SRCA) is first tested to determine if the
value is FALSE, then it is decremented. The jump is then taken if a FALSE value was
detected. The code example below shows how count words of external memory can
be written with zero. Note how the address pointer is incremented in the delay slot of
the jump instruction. Additionally, the SRCA register must be initialized to count–2;