Philips Semiconductors
Arbiter
PRELIMINARY SPECIFICATION
20-7
For example, with the same settings as in the example of Section 20.5.1, $E_2$ is (3 + 2) / 2 = 2.5 and $E_{VO}$ is (3 + 7) / 3 * 2.5 = 8.33, which gives
$B_{VO}$ = (80 - 1.216) * 64 / [ 20 * 8.33 + 16 * 2 / (5/4) ]
resulting in 26.23 million B/sec, corresponding to 25.01 MB/sec.
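The arithmetic of this example can be checked with a short script. This is only a sketch: the function and variable names are hypothetical, and the constants are copied verbatim from the expression above without asserting which arbiter parameter each one represents. The conversion to MB/sec assumes 1 MB = 2^20 bytes, which is inferred from the two figures quoted in the text. Intermediate values are kept unrounded, so the last digit can differ slightly from the quoted result, which uses the rounded value 8.33.

```python
def vo_bandwidth_example():
    """Reproduce the worked B_VO example; constants are taken verbatim from the text."""
    e_2 = (3 + 2) / 2                  # 2.5
    e_vo = (3 + 7) / 3 * e_2           # about 8.33
    # B_VO = (80 - 1.216) * 64 / [20 * E_VO + 16 * 2 / (5/4)], in million B/sec
    b_vo = (80 - 1.216) * 64 / (20 * e_vo + 16 * 2 / (5 / 4))
    # Assumption: 1 MB = 2**20 bytes (inferred from 26.23 -> 25.01 in the text).
    b_vo_mb = b_vo * 1e6 / 2**20
    return e_2, e_vo, b_vo, b_vo_mb


if __name__ == "__main__":
    e_2, e_vo, b_vo, b_vo_mb = vo_bandwidth_example()
    print(f"E_2 = {e_2:.2f}, E_VO = {e_vo:.2f}")
    print(f"B_VO = {b_vo:.2f} million B/sec = {b_vo_mb:.2f} MB/sec")
```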
Note: To compute $B_x$ when a unit is not enabled, its weight has to be considered as ‘0’ in the $E_x$ equations, and in $E_{\{AI,AO,VLD\}}$ for AI, AO or VLD.
The maximum number of requests, $A_x$, allowed for unit x during a period of M cycles is:
$A_x = \mathrm{floor}(B_x / S)$
where floor(X) is the greatest integral value less than or equal to X.
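As a minimal sketch of this formula, the helper below applies the floor directly. The function name and the numeric values in the usage note are hypothetical, and the only assumption is that B_x and S are expressed in the same unit.

```python
import math


def max_requests(b_x, s):
    """A_x = floor(B_x / S): whole requests of size s that fit in the allowance b_x."""
    if s <= 0:
        raise ValueError("request size S must be positive")
    return math.floor(b_x / s)


if __name__ == "__main__":
    # Hypothetical values: an allowance of 1000 units and a request size of 64 units.
    print(max_requests(1000, 64))
```

With these illustrative values, 1000 / 64 = 15.625, so only 15 whole requests fit.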
Note: This number does not take into account the worst case pattern for request acknowledgment. Thus, if the period is too small, $A_x$ is not accurate.
20.6 EXTENDED BEHAVIOR ANALYSIS
The following sections describe the behavior of the PNX1300 arbitration system more accurately.
20.6.1 Extended Bandwidth Analysis
The minimum bandwidth allocation derived from the arbiter settings is accurate if one of the following two conditions is true:
• The units emit requests all the time (i.e. do back-to-back requests).
• After a request has been acknowledged, the unit emits a new request before the next arbitration point. Arbitration is decided roughly every 16 cycles; the exact interval depends on the direction of the transactions (read/write).
In PNX1300, the only unit almost able to sustain back-to-back requests is the data cache. The other units post a request and wait for the data before posting the next request. This behavior makes the bandwidth computation:
• almost accurate if the unit is low in the arbiter hierarchy (true if the units placed above it are enabled).
• rather inaccurate if large weights are used for a unit.
Since no back-to-back requests are implemented, the worst case is that a unit can only get one request out of three if all the others are asking. This limits the use of large weights for units other than the data cache.
However some units might be able to catch one request
out of two. This depends on the way requests interleave,
since the arbitration point is dependent on the type of the
request (read or write) as well as on the CPU ratio.
This makes it almost impossible to describe the behavior
precisely.
The exact bandwidth necessary for units like VO, VI, AO or AI is well known (see the dedicated sections in each corresponding chapter). If the arbiter settings allocate more bandwidth to these units than they can use, the extra bandwidth can be used by units that are located below these units (VO, VI) or at the same level as them (AO and AI) in the arbiter hierarchy.
As an example, with the default settings, VO gets 25% of the available bandwidth and the CPU gets 50%. If the SDRAM clock speed is 100 MHz, then 100 MB/sec are allocated to VO. If VO runs at 27 MHz (NTSC or PAL mode), then VO will not use all of this allocated bandwidth. Thus any of the units that are below VO in the arbiter hierarchy can potentially use the remaining allocated bandwidth.
In other words, even if only 10% is allocated to a unit such as the CPU, PCI or the ICP, it may use more.
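The allocation figures in this example can be reproduced with a small helper. This is a sketch under one stated assumption: the default of 4 bytes transferred per SDRAM clock cycle is inferred from the example itself (25% of a 100 MHz SDRAM giving 100 MB/sec) and is not stated in this section. The function name is hypothetical.

```python
def allocated_mb_per_sec(sdram_mhz, share, bytes_per_cycle=4):
    """Minimum bandwidth (in million bytes/sec) guaranteed by an arbiter share.

    bytes_per_cycle=4 is an assumption inferred from the 100 MHz / 25% example.
    """
    if not 0.0 <= share <= 1.0:
        raise ValueError("share must be a fraction between 0 and 1")
    return sdram_mhz * bytes_per_cycle * share


if __name__ == "__main__":
    print(allocated_mb_per_sec(100, 0.25))  # VO share with the default settings
    print(allocated_mb_per_sec(100, 0.50))  # CPU share with the default settings
```

The value returned is a guaranteed minimum rather than a cap: as described above, a unit may use bandwidth that units above it leave unused.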
20.6.2 Extended Latency Analysis
Some units (VO and VI) have a latency/bandwidth requirement, and their behavior needs to be simulated in order to find the correct settings. For example, the requirement for VO (in image mode 4:2:2 or 4:2:0 without upscaling, overlay disabled) is:
• During 128 VO clock cycles, the VO block needs to have 2 requests acknowledged ([2 Ys, one U and one V] / 2).
The default value ‘0’ for ARB_BW_CTL leads to a bus allocation of 50% for the CPU, 25% for VO and 25% for L3 blocks.
The worst case arbitration for VO is then: CPU L3 CPU VO, CPU L3 CPU VO, to which the refresh (K), internal delays (T) and E for the first CPU request need to be added.
The first VO request will require 129 SDRAM cycles ($D_{VO}$ = 5, or from the worst case pattern 19 + 10 + 20 + 4 * 20). The arbitration pattern shows that the following request will require (in the worst case) an extra 4 * 20 SDRAM cycles. Thus the VO clock speed cannot be greater than 61.24% (128 / [129 + 80]) of the SDRAM clock speed.
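The bound derived above can be sketched as a small calculation. The names are hypothetical; the fixed terms 19 + 10 + 20 and the 20 cycles per arbitration slot are copied from the worst case pattern quoted in the text, without asserting which term corresponds to K, T or E.

```python
FIXED_OVERHEAD = 19 + 10 + 20   # fixed terms of the worst case pattern in the text
SLOT_CYCLES = 20                # SDRAM cycles per arbitration slot in that pattern
VO_WINDOW = 128                 # VO clock cycles in which 2 requests must be acknowledged


def max_vo_to_sdram_ratio(slots_per_round):
    """Upper bound on VO clock / SDRAM clock for a pattern with the given slot count."""
    first_request = FIXED_OVERHEAD + slots_per_round * SLOT_CYCLES
    second_request = slots_per_round * SLOT_CYCLES
    return VO_WINDOW / (first_request + second_request)


if __name__ == "__main__":
    # Pattern "CPU L3 CPU VO" has 4 slots per round.
    print(f"{max_vo_to_sdram_ratio(4):.2%}")
```

For the four-slot pattern this evaluates 128 / (129 + 80), the ratio quoted above.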
By changing the settings to 33% for the CPU, 33% for VO and 33% for L3 blocks (i.e. CPU = ‘1’, L2 = ‘2’, VO = ‘1’, L3 = ‘1’), the new SDRAM/VO clock percentage becomes 75.74% (128 / [109 + 60]), corresponding to a worst case arbitration pattern of CPU L3 VO, CPU L3 VO.
Before changing the settings, the minimum SDRAM speed required to run VO at 74.25 MHz (high definition speed) was 122 MHz. With the new allocation, 100 MHz is sufficient. Note that here $D_{VO}$ remains equal to ‘5’.
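The minimum SDRAM speeds quoted here follow from the two worst case cycle counts. The sketch below uses hypothetical names and takes the cycle counts (129 + 80 and 109 + 60) directly from the text; rounding up to a whole MHz is an assumption that matches the 122 MHz figure.

```python
import math


def min_sdram_mhz(vo_mhz, first_cycles, next_cycles, vo_window=128):
    """Smallest SDRAM clock (MHz) that serves two VO requests within vo_window VO cycles."""
    max_ratio = vo_window / (first_cycles + next_cycles)  # VO clock / SDRAM clock
    return vo_mhz / max_ratio


if __name__ == "__main__":
    before = min_sdram_mhz(74.25, 129, 80)  # default settings
    after = min_sdram_mhz(74.25, 109, 60)   # equal CPU / VO / L3 shares
    print(math.ceil(before), f"{after:.2f}")
```

With the default settings the requirement rounds up to 122 MHz, while with the equal-share settings it falls below 100 MHz, in line with the text.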
$$E_5 = \frac{VI_{weight} + L5_{weight}}{L5_{weight}} \times E_4$$
$$E_6 = \frac{PCI_{weight} + L6_{weight}}{L6_{weight}} \times E_5$$