An image coding scheme based on a variant of the Haar wavelet transform that uses only addition and subtraction is presented. After the transform is computed, the coefficients are selected and coded using a methodology optimized for the lowest hardware implementation complexity. Coefficients are sorted into groups according to the number of pixels used to compute them. The underlying idea is to use a different threshold for each group of coefficients; these thresholds are obtained recursively from an initial one. The parameter values needed to achieve the desired compression level are established on-line and adapted to each image, which improves the quality obtained for a preset compression level. Despite its adaptive nature, the proposed coding scheme leads to a hardware implementation of markedly low circuit complexity. The compression achieved for images of 512×512 pixels (256 grey levels) is over 22:1 (≈0.4 bits/pixel) with an RMSE of 8-10%. A prototype image processor (excluding memory) designed to compute the proposed transform has been implemented using FPGA chips. The processor for images of 256×256 pixels has been implemented using a single general-purpose low-cost FPGA chip, which demonstrates the reliability of the design and its relative simplicity.
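The following minimal sketch illustrates the two ideas stated in the abstract above: a Haar-type decomposition that needs only additions and subtractions, and coefficient groups that each receive their own threshold derived from an initial one. It is an illustration under stated assumptions, not the published design. It works on a 1-D signal rather than an image, the function names are hypothetical, and the doubling recurrence for the thresholds is an assumption, since the abstract does not give the actual recurrence.

```python
def haar_addsub_1d(samples):
    """Unnormalized Haar decomposition using only + and -.

    Returns (total, groups). groups[k] holds the differences produced at
    level k, each computed from 2**(k + 1) input samples, so the grouping
    follows the number of samples behind each coefficient.
    """
    n = len(samples)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    approx = list(samples)
    groups = []
    while len(approx) > 1:
        pairs = range(0, len(approx), 2)
        sums = [approx[i] + approx[i + 1] for i in pairs]
        diffs = [approx[i] - approx[i + 1] for i in pairs]
        groups.append(diffs)
        approx = sums  # the next level works on the pairwise sums
    return approx[0], groups


def group_thresholds(initial, n_groups):
    """One threshold per group, derived from the initial one.

    ASSUMPTION: each threshold is twice the previous one (t + t), chosen
    only because it needs a single addition; the source does not state
    the recurrence it uses.
    """
    thresholds = []
    t = initial
    for _ in range(n_groups):
        thresholds.append(t)
        t = t + t
    return thresholds


def select_coefficients(groups, thresholds):
    """Zero every coefficient whose magnitude is below its group threshold."""
    return [[c if abs(c) >= t else 0 for c in group]
            for group, t in zip(groups, thresholds)]


if __name__ == "__main__":
    total, groups = haar_addsub_1d([10, 12, 9, 7, 100, 102, 50, 54])
    kept = select_coefficients(groups, group_thresholds(3, len(groups)))
    print(total, groups, kept)
```

In an adaptive scheme of the kind described, the initial threshold would be adjusted per image until the count of surviving coefficients matches the target compression level; that control loop is omitted here.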
The Discrete Cosine Transform (DCT) is the most widely used transform for image compression. The Integer Cosine Transform denoted ICT (10, 9, 6, 2, 3, 1) has been shown to be a promising alternative to the DCT because of its implementation simplicity, similar performance and compatibility with the DCT. This paper describes the design and implementation of an 8×8 2-D ICT processor for image compression that meets the numerical characteristics of IEEE Std. 1180-1990. The processor uses a low-latency data flow that minimizes the internal memory and a parallel pipelined architecture, based on a numerical strength reduction algorithm for the ICT (10, 9, 6, 2, 3, 1), in order to attain high throughput and continuous data flow. A prototype of the 8×8 ICT processor has been implemented using a standard-cell design methodology and a 0.35-μm CMOS CSD 3M/2P 3.3 V process on a 10 mm² die. Pipelining techniques have been used to reach the maximum operating frequency allowed by the technology, giving a critical path of 1.8 ns; this figure should be increased by 20% to allow for line delays, which places the estimated operating frequency at roughly 500 MHz. The circuit includes 12446 cells, of which 6757 are flip-flops. Two clock signals are distributed, an external one (fs) and an internal one (fs/2). The large number of flip-flops has required a strategy to minimize clock skew, combining large buffers on the periphery with wide metal lines (clock trunks) to distribute the signals.
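As an illustration of the transform named above, the sketch below builds an 8×8 integer matrix from the parameters (10, 9, 6, 2, 3, 1) and applies it to a block by direct matrix multiplication. Two assumptions should be noted. The sign pattern of the rows is the one commonly given for the order-8 ICT and is not taken from the abstract. The direct matrix product is also not the numerical strength reduction algorithm or the pipelined data flow of the processor; it only shows the arithmetic that such hardware has to reproduce. All function names are hypothetical.

```python
from fractions import Fraction


def ict_matrix(a=10, b=9, c=6, d=2, e=3, f=1, g=1):
    """Order-8 integer cosine transform matrix (assumed sign pattern).

    The rows are mutually orthogonal when a*b == a*c + b*d + c*d,
    which holds for (10, 9, 6, 2): 90 == 60 + 18 + 12.
    """
    return [
        [g,  g,  g,  g,  g,  g,  g,  g],
        [a,  b,  c,  d, -d, -c, -b, -a],
        [e,  f, -f, -e, -e, -f,  f,  e],
        [b, -d, -a, -c,  c,  a,  d, -b],
        [g, -g, -g,  g,  g, -g, -g,  g],
        [c, -a,  d,  b, -b, -d,  a, -c],
        [f, -e,  e, -f, -f,  e, -e,  f],
        [d, -c,  b, -a,  a, -b,  c, -d],
    ]


def matmul(x, y):
    """Plain matrix product of two lists of rows."""
    return [[sum(x[i][k] * y[k][j] for k in range(len(y)))
             for j in range(len(y[0]))] for i in range(len(x))]


def transpose(m):
    return [list(row) for row in zip(*m)]


def row_norms(t):
    """Squared length of each basis row (the diagonal of T * T^T)."""
    return [sum(v * v for v in row) for row in t]


def ict_2d(block):
    """Forward 2-D transform Y = T * X * T^T, integer arithmetic only."""
    t = ict_matrix()
    return matmul(matmul(t, block), transpose(t))


def inverse_ict_2d(coeffs):
    """Exact inverse X = T^T * (Y[i][j] / (n[i] * n[j])) * T.

    Fractions keep the check exact; real hardware would fold this
    scaling into the quantization step instead.
    """
    t = ict_matrix()
    n = row_norms(t)
    scaled = [[Fraction(coeffs[i][j], n[i] * n[j]) for j in range(8)]
              for i in range(8)]
    return matmul(matmul(transpose(t), scaled), t)


if __name__ == "__main__":
    block = [[(3 * i + 5 * j) % 17 for j in range(8)] for i in range(8)]
    assert inverse_ict_2d(ict_2d(block)) == block
    print(row_norms(ict_matrix()))
```

Because the basis rows have different lengths, the coefficients must be rescaled before they can be compared with DCT outputs; the sketch performs that rescaling only in the inverse, to keep the forward path in pure integer arithmetic.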