This paper describes a new method for lossless image compression where relative pixel values of localized neighborhoods in test images are stored in a codebook. In order to achieve decorrelation of an image's pixels, each neighborhood of a pixel is assigned to a neighborhood in the codebook, and the difference between the actual pixel value and the predicted value from the codebook is coded using an entropy coder. Using the same codebook, one can achieve perfect reconstruction of the image. The method is tested on several standard images and compared with previously published methods. These experiments demonstrate that the new method is an attractive alternative to existing lossless image compression techniques.
We have previously used a MAP estimation technique to remove quantization noise from decompressed images. This constructed optimization type of post-processor has a tendency to over-smooth the image removing detail information. The proposed quantizer provides additional information about the original image vectors to reduce the size of the constraint space. For each image vector, in addition to the index of the codebook vector, the quantizer also transmits the distance from the original image vector to the codebook vector. This distance information reduces the constraint space to a solid hyper-sphere which will generally be smaller than the polyhedron defined by the codebook vectors alone. The projection onto the solid hyper-sphere also requires much less computational effort than the projection onto the polyhedron. Preliminary experimental results have shown that the VQ with additional distance constraints allows the post-processor to reduce the visibility of quantization errors while retaining detail information that otherwise would have been smoothed out of the post- processed image. The size of the constraint regions could also have been reduced by increasing the codebook size. VQ with distance constraints will be compared with VQ with a larger codebook size at the same data rate. Experimental results will be shown.
We present a VQ based technique for coding image data that, like closed loop VXC, adopts a analysis by synthesis approach. We define a new type of spatial interaction model for image data, called prediction pattern, which we use along with a quantized excitation (residuals) vector, to generate an approximation of an input block of pixels. A prediction pattern is simply a k X k array with each element representing a prediction scheme from a given set of predictors. A prediction pattern captures the spatial dependencies present in an image block. Given an image, a set of prediction schemes and a codebook of prediction patterns, we encode an image by partitioning it into blocks and for each block identifying the prediction pattern from within the codebook that best models the spatial dependencies that are present in the block. Having identified this prediction pattern we then search the residual codebook for a code vector that in combination with the already chosen prediction pattern results in the synthesis of the closest approximation to the current image block. The problem is to design an optimal set of prediction schemes and an optimal codebook of prediction patterns, given an image (or class of images). We present algorithms for codebook design and give implementation results on a few standard images. Preliminary results give substantial (between 2 or 3 db) improvements over a simple implementation of full search VQ.
Though fractal compression is often characterized as a variant of vector quantization, the important step of codebook training has not migrated over. Unlike traditional VQ, the vector codebook is formed not prior to compression, but during. This often results in a number of similar, yet distinct, domain blocks. The result can be an unnecessarily large file. Coding efficiency is improved by pruning the codebook of redundant entries, with little or no resultant loss in image quality.
In fractal image compression an image is partitioned into ranges for each of which a similar subimage, called domain, is selected from a pool of subimages. A typical choice for the domain pool may consist of all square subimages of a particular size. However, only a fraction of this large pool is actually used in the fractal code. This subset can be characterized in two related ways: (1) It contains domains with relatively large intensity variation. (2) The collection of used domains are localized in those image regions with a high degree of structure. Both observations lead us to improvements of fractal image compression. Firstly, we accelerate the encoding process by a priori discarding those domains from the pool which are unlikely to be chosen for the fractal code. This comes at the expense of a slight loss in compression ratio. In our empirical studies (using Fisher's adaptive quadtree method) we have found that a twofold acceleration leads to a drop of only 2 to 3% in the compression ratio while the image quality even improves by 0.1 to 0.2 dB. Secondly, the localization of the domains can be exploited for an improved encoding in effect raising the compression ratio back up without any penalty.
Lapped Orthogonal Transforms (LOT) are becoming more widely used in image coding applications for image transmission and archival schemes. Previously sponsored U.S. Army Missile Command research has developed a LOT Recursive Residual Projection (RRP) that uses the following multiple bases functions: Discrete Cosine Transform (DCT), Discrete Walsh Transform (DWT), and Discrete Slant Transform (DST). For high compression ratios the LOTRRP was shown no outperform the single bases transforms at the cost increased computations. The work presented in this paper describes a VHSIC Hardware Description Language (VHDL) design of the LOTDCT, LOTDWT, and LOTDST targeted for implementation on Application Specific Integrated Circuits (ASICs). This hardware solution was chosen to compress RS-170 standard video for real-time image transmission on a very low bandwidth packetized data link.
New recursive VLSI architectures for arbitrary length discrete cosine transform (DCT) and inverse DCT (IDCT) are presented. Compared with previous methods, the proposed approach saves N-1 adders but requires one additional delay element for the DCT implementation, and saves (N/2) multipliers for the IDCT. The reduction in elements results from identifying common computations in the processing that generates the transform components.
We present in this paper a hardware implemented system which allows compression and decompression of 256 X 256 still-images at almost video rate. We have followed a scheme inspired from those presented by Barland. In the compression stage, the main steps are: biorthogonal wavelet transform (5 levels), scalar and vector quantization of the last two levels, lossless coding; the decompression stage follows the same steps in their inverse forms. We have tested the robustness of the reconstruction in the presence of truncatures made on the wavelet coefficients, and it seems that the use of wavelet with high regularity filter is not really important in the compression stage. Finally we have chosen FIR splines filters with 3 coefficients for analysis and 5 for synthesis. The wavelet coefficients are truncated and coded in a non linear way. The codebook is created once for all with a set of training images using an algorithm inspired from Chok-Ki Chan. The hardware implementation has required 4 FPGA circuits and some RAM. For a 256 X 256 image with gray levels coded on bytes the compression time is 64 ms while the decompression time is 17 ms with a compression ratio of 87 and a mean PSNR of 28 dB.
The nearest neighbor search procedure has numerous applications in pattern classification and image coding, specifically for vector quantization methods. Despite its good result performance it is normally avoided because of its expensive computational complexity (O(kN) arithmetic operations where N is the number of vectors and k is the vector dimension). Several methods have been proposed in the literature to speedup the search process, some of which do not guarantee to find the closest neighbor. This may cause misclassification for recognition applications and introduce higher image distortion levels for coding applications. In this paper we present a nearest neighbor search algorithm that utilizes some of the vectors statistical features, that could be computed off-line, to quickly reject quite a few vectors that can never be closest candidates to the given vector. In fact, most of the time, the off-line built lookup table index yields the closest neighbor immediately without any further search. The algorithm is, thus almost, O(k) arithmetic operations and is guaranteed to find the same closer vector that results from a full search method. Results of applying the algorithm to vector quantize few images will be reported.
JPEG, an international standard for still-image compression, is a widely used technique for compressing natural images. Popularity of JPEG stems from its flexibility, reasonable compression rate and ease of implementation. In baseline and progressive modes of JPEG, transform coding based on 8 X 8 block discrete cosine transform (DCT) is used. At high compression ratios (i.e. low bit rates), however, JPEG typically causes blockiness in the reconstructed image. In this paper, we highlight key factors that limit baseline JPEG's performance at low bit rates. Simple modifications within the JPEG construct are studied to improve the overall quality of images at low bit rates. In addition, a multiresolution-like organization of the 2 X 2 block DCT coefficients is considered and is shown to represent the Haar-based subband/wavelet transform.
Range images are a 3D representation of the skin of objects. They are usually acquired with a laser scanner, or structured light. In contrast with intensity images, each pixel represents the distance between an object and the sensor. Because of the very nature of these images, current compression techniques, such as JPEG, are not optimized for these images. Classical lossy image compression techniques are tuned to the Human Visual System. However, range images are not intended to be visualized, but to be rendered in 3D. This puts different requirements for range image compression techniques. For instance, the main quality criterion is the maximal error, instead of the mean square error.
JPEG-compressed images, when transmitted over a noisy channel, are highly susceptible to errors. The effects of bit errors on images compressed by any method, including JPEG, are significantly augmented since the majority of statistical redundancies are removed in the compression process. Furthermore, errors affecting the run-lengths and Huffman codewords within JPEG have a much more profound influence on reconstructed images than those affecting only coefficient values. This paper presents a detailed investigation of uncompressed and JPEG-compressed (baseline and progressive) image transmission in noisy channels. Several modifications within JPEG are proposed to improve reconstructed image quality. Among the techniques investigated, a hybrid technique, combining the concepts of the baseline and progressive JPEG modes, appears to give the best performance. Here, the DC coefficients (with absolute and/or fixed bit encoding), AC codewords, and some positional information are separately coded and protected using forward error correction. The actual AC coefficient values are sent separately. This partitioning of codewords and values enable isolating, for special protection, portions of the representation that are highly sensitive to errors and that may cause lengthy error propagation. It enables sending `view-quality' images in the presence of a channel bit error rate as high as 0.01.
A new post-processing method based on wavelet thresholding is proposed to enhance transform coded images. At the encoder, the optimal (in the mean square error sense) shift- invariant wavelet packet basis is searched using a fast optimization algorithm and the location of the best basis in the entire shift-invariant wavelet packet tree is transmitted as overhead information with the bit stream to the decoder. After the decoder reconstructs the decompressed image, the post-processor performs thresholding using the optimal basis. This method can achieve comparable enhancement performance to the method using the undecimated discrete wavelet transform, and requires the same low post-processing complexity as the method using the orthonormal discrete wavelet transform.
We present a technique for lossy image compression based on the joint-adaptive space and frequency decomposition of images. The algorithm adapts to image content by both developing wavelet packet bases for separate areas of the image and by segmenting image subbands as needed. The elements of the expansion are a two-channel filter bank and a complete and disjoint binary segmentation system. We construct the joint space and frequency library by cascading permutations of these elements. We also formulate the space and frequency operations to be commutative, which allows for the full cascade system to be organized into a graph. After the full expansion, a coding cost is assigned to all elements in the library. The best joint space and frequency basis is found by pruning the graph which indexes the library such that the embedded graph with least cost is found. Its terminal nodes correspond to the best complete basis. We show that encoding the image in its best joint space and frequency basis improves compression performance.
A new multispectral image compression technique based on the KLT and the DCT is proposed. The quadtree for determining the block size and the quantizer for encoding the transform coefficients are jointly optimized in a rate-distortion sense. The problem is solved by a Lagrange multiplier approach. This approach was also used in an optimal MPEG encoding problem by the author. After a quadtree is determined by this approach, the KLT is applied to the spectral axis for each block before the DCT is applied on the spatial domain. The eigenvectors of the autocovariance matrix, the quantization scale, and the quantized transform coefficients for each block are the output of the encoder. The overhead information required in this scheme is the bits for the quadtree representation.
The most widely used still image coding method is given by the JPEG coding standard which works well on natural images with compression ratios up to 1:10. The algorithm we propose in this paper allows for progressive transmission, and works well even at higher compression ratios such as 1:50 - 1:100. The proposed method is in its main aspects close to the classical pyramid approach of Burt and Adelson. While retaining the main idea of using a Laplacian pyramid type decomposition, the new proposal differs in the filters employed for pyramid decomposition and in the bit allocation and quantization. For decomposition an improved Laplacian pyramid is used. Bit allocation and quantization is done jointly using a zerotree coding algorithm. The zerotree algorithm automatically performs bit allocation within the pyramid, hence adapting to the nonstationary nature of images. The encoder outputs an embedded bit stream. Hence, the decoder may truncate the bitstream at any point, resulting in a more or less detailed image. This feature is especially useful for browsing in image data bases, where the user may stop the transmission at any time.
Wavelet transform is used efficiently for high compression ratio image coding. It is useful for a high resolution and color document image processing system. However, a large amount of memory is required in wavelet decompression for a large image. Conventional wavelet transform does not allow reconstructing a sub-region of the image. Therefore, it is hard to use wavelet transform in a color document image processing system. In this paper, a block wavelet transform method is proposed. An image is divided into blocks and the samples of the adjacent block are used to transform one block for removing the edge artifact. The required number of samples of the adjacent block are derived. By using the proposed method, the memory requirement for wavelet coding/decoding is reduced to that of the block size. Any targeted region of an image can be compressed or reconstructed. The proposed method can be used for a color document image processing system.