A memory- and speed-efficient CAVLC (context-adaptive variable-length coding) decoder for the H.264/AVC baseline
profile is proposed. Because CAVLC decoding consists largely of bit-level operations, general-purpose processors (such as RISC or
MIPS) and DSPs with multiple parallel arithmetic units (such as SIMD or VLIW) decode it inefficiently.
Dedicated hardwired decoders are very effective for CAVLC, but they are expensive and not programmable.
In addition, the CAVLC lookup tables occupy more than 400 bytes. Demand is growing for fast, highly flexible implementations
of CAVLC decoding with a smaller memory footprint. Four instructions (ShowBits, GetBits, FlushBits,
and LeadingBits) are designed after analyzing the function coverage of the decoder. To reduce codeword matching
misses and the lookup table size, a new grouping and remainder generation method and a merged lookup table (LUT) are used;
and a dedicated group decoding instruction, GroupDecoding, is also presented. In summary, instruction-set-level
acceleration combined with a codeword group partition hardware architecture is proposed to speed up CAVLC decoding.
With these instructions and the hardware platform, CAVLC decoding is significantly faster than
the method used in the H.264 reference software. Simulation results show that the architecture is sufficient for CAVLC
decoding of HDTV video (1920 x 1080 pixels) at 30 frames per second.
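
To make the four bitstream instructions concrete, the following is a minimal software sketch in C of what such primitives are usually taken to mean in video decoders. It is not the proposed hardware or its instruction encoding. The semantics are assumptions based on common usage: ShowBits peeks without consuming, FlushBits discards, GetBits peeks and then discards, and LeadingBits counts the zero bits before the first 1. The `BitReader` type and the lower-case function names are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical MSB-first bit reader; not taken from the paper. */
typedef struct {
    const uint8_t *data;   /* bitstream buffer */
    size_t size_bits;      /* number of valid bits in the buffer */
    size_t pos;            /* current read position, in bits */
} BitReader;

/* Return the bit at absolute position p, or 0 past the end of the buffer. */
static uint32_t bit_at(const BitReader *br, size_t p) {
    if (p >= br->size_bits)
        return 0;
    return (br->data[p >> 3] >> (7 - (p & 7))) & 1u;
}

/* Assumed ShowBits: peek the next n bits (n <= 32) without consuming them. */
static uint32_t show_bits(const BitReader *br, unsigned n) {
    uint32_t v = 0;
    for (unsigned i = 0; i < n; i++)
        v = (v << 1) | bit_at(br, br->pos + i);
    return v;
}

/* Assumed FlushBits: discard the next n bits. */
static void flush_bits(BitReader *br, unsigned n) {
    br->pos += n;
}

/* Assumed GetBits: read the next n bits and consume them. */
static uint32_t get_bits(BitReader *br, unsigned n) {
    uint32_t v = show_bits(br, n);
    flush_bits(br, n);
    return v;
}

/* Assumed LeadingBits: count zero bits before the first 1, up to max,
 * without consuming anything. Stops at the end of the buffer. */
static unsigned leading_bits(const BitReader *br, unsigned max) {
    unsigned n = 0;
    while (n < max && br->pos + n < br->size_bits &&
           bit_at(br, br->pos + n) == 0)
        n++;
    return n;
}
```

In a software decoder, each of these calls costs a loop or several shift and mask operations per symbol, which illustrates why bit-level decoding maps poorly onto word-oriented arithmetic units. A leading-zero count is a natural primitive here because CAVLC's level_prefix syntax element is coded as a run of zeros terminated by a 1.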