
7–56
Altera Corporation
Stratix Device Handbook, Volume 2
September 2004
Discrete Cosine Transform (DCT)
All of the additions in stages 1, 2, and 3 of Figure 7–32 appear in
symmetric add-and-subtract pairs. The first stage consists of four such
pairs in a typical cross-over (butterfly) pattern, and the same pattern is
repeated in stages 2 and 3. Multiplication operations are confined to
stage 4 of the algorithm. The next section shows this implementation in
more detail.
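The add-and-subtract pairing described above can be sketched in a few lines. This is a minimal illustration only: it assumes the usual mirrored pairing (sample i with sample 7 - i) for the first stage, which may not match the exact wiring of Figure 7–32, and the name `butterfly_stage` is hypothetical.

```python
def butterfly_stage(x):
    """One cross-over stage: pair x[i] with its mirror x[n-1-i].

    Assumed pairing (not taken from Figure 7-32): the sum is written to
    slot i and the difference to the mirrored slot n-1-i.
    """
    n = len(x)
    out = [0] * n
    for i in range(n // 2):
        out[i] = x[i] + x[n - 1 - i]          # add half of the pair
        out[n - 1 - i] = x[i] - x[n - 1 - i]  # subtract half of the pair
    return out


samples = [1, 2, 3, 4, 5, 6, 7, 8]
stage1 = butterfly_stage(samples)
print(stage1)  # [9, 9, 9, 9, -1, -3, -5, -7]
```

For eight inputs the loop runs four times, giving the four add-and-subtract pairs of the first stage. No multiplications are needed at this point.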
DCT Implementation
Because the DCT is a separable transform, the implementation can be
divided into separate stages: row processing and column processing.
However, the results of the row processing stage must be restructured
before the column processing stage is applied to them, so an
intermediate data buffering stage transposes the data first.
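The row, transpose, column flow can be illustrated with a short software model. This is a sketch under stated assumptions, not the handbook's hardware: `dct_1d` is a direct, unoptimized 8-point DCT-II with orthonormal scaling (the scaling is an assumption), and the final transpose is added only to return the result in the usual row-major orientation.

```python
import math

N = 8  # block size: 8-point 1-D DCT, 8x8 2-D DCT


def dct_1d(v):
    """Direct N-point DCT-II with orthonormal scaling (assumed scaling)."""
    out = []
    for k in range(N):
        scale = math.sqrt(1.0 / N) if k == 0 else math.sqrt(2.0 / N)
        acc = sum(v[n] * math.cos((2 * n + 1) * k * math.pi / (2 * N))
                  for n in range(N))
        out.append(scale * acc)
    return out


def transpose(m):
    """Swap rows and columns of a square block."""
    return [list(col) for col in zip(*m)]


def dct_2d(block):
    rows_done = [dct_1d(row) for row in block]     # row processing
    buffered = transpose(rows_done)                # data buffering: transpose
    cols_done = [dct_1d(row) for row in buffered]  # column processing, same 1-D block
    return transpose(cols_done)                    # back to row-major orientation


flat = [[1.0] * N for _ in range(N)]
coeffs = dct_2d(flat)
print(round(coeffs[0][0], 6))  # 8.0: a constant block has only a DC term
```

Because the transposed data is fed through the same `dct_1d` routine, one 1-D block serves both passes, which mirrors the shared-block arrangement described in the text.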
Figure 7–34. Three Separate Stages in Implementing the 2-D DCT
Because the row processing and column processing blocks share the same
1-D 8-point DCT algorithm, the hardware implementation shows this
block as being shared. The DCT algorithm requires a serial-to-parallel
conversion block at the input because it works on blocks of eight data
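\[
C =
\begin{bmatrix}
1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & C_4 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & C_6 & -C_2 & 0 & 0 & 0 & 0 \\
0 & 0 & C_2 & C_6 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & C_7 & -C_5 & C_3 & -C_1 \\
0 & 0 & 0 & 0 & C_5 & -C_1 & C_7 & C_3 \\
0 & 0 & 0 & 0 & C_3 & -C_7 & -C_1 & -C_5 \\
0 & 0 & 0 & 0 & C_1 & C_3 & C_5 & C_7
\end{bmatrix},
\qquad
C_x = \cos\frac{\pi x}{16}
\]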
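As a small numerical illustration of the coefficient definition above, the following sketch computes C_x = cos(πx/16) and applies the 4×4 block from the lower-right corner of the matrix to a four-element vector. The names `c`, `ODD_BLOCK`, and `apply_block` are hypothetical, and the block entries are taken as read from the matrix above.

```python
import math


def c(x):
    """Coefficient C_x = cos(pi * x / 16)."""
    return math.cos(math.pi * x / 16)


# Lower-right 4x4 block, entries as read from the matrix in the text.
ODD_BLOCK = [
    [c(7), -c(5),  c(3), -c(1)],
    [c(5), -c(1),  c(7),  c(3)],
    [c(3), -c(7), -c(1), -c(5)],
    [c(1),  c(3),  c(5),  c(7)],
]


def apply_block(block, v):
    """Dense matrix-vector product: one multiplication per block entry."""
    return [sum(coef * val for coef, val in zip(row, v)) for row in block]


def dot(a, b):
    return sum(p * q for p, q in zip(a, b))


print(round(c(4), 6))  # 0.707107, that is 1/sqrt(2)
print([round(val, 4) for val in apply_block(ODD_BLOCK, [1, 0, 0, 0])])
```

Evaluated this way, the rows of the block are mutually orthogonal and each has squared length 2, so the block behaves like a scaled rotation. A dense evaluation costs 16 multiplications for these four outputs.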
[Figure 7–34 block diagram: Row processing -> Transpose matrix -> Column processing]