The recent JPEG2000 standard specifies only the bitstream and file formats needed to ensure interoperability and leaves the actual implementation to the designer. As with many DSP applications, the designer can choose among a number of implementation platforms. This paper presents a complexity analysis of a JPEG2000 encoder implemented with a hardware/software co-design methodology on a Xilinx Virtex-II(TM) platform FPGA. Central to the performance of the encoder is a high-throughput tier-1 entropy coder. The paper describes the encoder design, which targets video surveillance applications, and compares it with two other implementation options.
The major bottleneck of a JPEG2000 encoder is well known to be bit/context modeling and arithmetic coding (together known as the tier-1 coding portion of EBCOT). Using multiple coding passes on multiple bit-planes follows a near-optimal path on the rate-distortion curve and helps create an elegant embedded codestream, but this tier-1 coding requires a large amount of computation for each block of data, as well as significant memory resources and memory accesses. The JPEG2000 standard does, however, allow a number of the tier-1 coding tasks to be performed in parallel. If this parallelism is exploited and the data is organized carefully, the throughput of a JPEG2000 system can be improved dramatically. This paper presents an efficient, optimized hardware implementation of a tier-1 coder that exploits the available parallelism, and describes its implementation on Xilinx FPGA platforms. The proposed technique is approximately 50% faster than the best technique described in the literature.
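To make the bit-plane structure mentioned above concrete, the following is a minimal Python sketch, not the hardware design described in this paper. It splits a code block of quantized coefficients into magnitude bit-planes and counts the resulting coding passes, assuming the usual EBCOT arrangement in which the most significant non-zero bit-plane receives only a cleanup pass and every lower bit-plane receives three passes. The function names (`bit_planes`, `coding_pass_count`, `count_passes_parallel`) and the use of a thread pool to stand in for parallel hardware units are illustrative assumptions.

```python
from concurrent.futures import ThreadPoolExecutor


def bit_planes(block):
    """Return the magnitude bit-planes of a code block, most significant first.

    `block` is a list of rows of signed integer coefficients (hypothetical
    input layout). Signs are ignored here; tier-1 coding handles them
    separately when a coefficient first becomes significant.
    """
    magnitudes = [[abs(c) for c in row] for row in block]
    max_mag = max(max(row) for row in magnitudes)
    num_planes = max_mag.bit_length()  # 0 for an all-zero block
    return [
        [[(m >> p) & 1 for m in row] for row in magnitudes]
        for p in range(num_planes - 1, -1, -1)
    ]


def coding_pass_count(block):
    """Count coding passes: one cleanup pass for the top bit-plane,
    then three passes for each remaining bit-plane (3K - 2 in total)."""
    k = len(bit_planes(block))
    return 3 * k - 2 if k > 0 else 0


def count_passes_parallel(blocks):
    """Process code blocks independently; the thread pool is only a
    software stand-in for parallel tier-1 coder instances."""
    with ThreadPoolExecutor() as pool:
        return list(pool.map(coding_pass_count, blocks))
```

Because each code block is coded independently of the others, the per-block work in this sketch can be distributed freely, which is the property a parallel tier-1 coder relies on. The pass count also shows why tier-1 coding is expensive: every additional bit-plane adds three more scans over the block.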