Directly encoding the image pixel values using noiseless coding techniques such as Lempel-Ziv-Welch codes or arithmetic codes is inefficient, as it fails to capitalize on the normally abundant correlation among neighboring pixels. A simple modification that exploits the correlation between pixels, as in the Delta Pulse Code Modulation scheme, can improve the compression efficiency. Motivated by the `adaptive coding by block classification' idea of Chen and Smith, we take into account the non-stationary nature of image pixels and present a more sophisticated lossless adaptive differential image coding system that can further improve the compression efficiency.
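As a rough illustration of why differential coding helps, the following minimal sketch (not the authors' system; a simple previous-pixel predictor is assumed) compares the first-order entropy of raw pixel values with that of the prediction residuals.

```python
import numpy as np

def dpcm_residuals(image):
    """Previous-pixel (horizontal) prediction: each pixel is predicted by its
    left neighbor and only the prediction error would be entropy coded."""
    img = image.astype(np.int16)
    residuals = img.copy()
    residuals[:, 1:] = img[:, 1:] - img[:, :-1]   # column 0 is kept verbatim
    return residuals

def empirical_entropy(values):
    """First-order entropy in bits/symbol, a rough proxy for the code length
    an entropy coder would achieve on these symbols."""
    _, counts = np.unique(values, return_counts=True)
    p = counts / counts.sum()
    return float(-(p * np.log2(p)).sum())

# For correlated data the residual entropy is far below the raw pixel entropy.
image = np.tile(np.arange(256, dtype=np.uint8), (64, 1))  # synthetic smooth image
print(empirical_entropy(image), empirical_entropy(dpcm_residuals(image)))
```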
Given a finite sequence x_1, x_2, ..., x_n, the essential problem in lossless data compression is to process the symbols in some order and assign a conditional probability distribution to the current symbol based on the previously processed symbols. For example, if we process x_1, x_2, ..., x_n in a sequential manner, then we need to estimate the distributions p(x_{j+1} | x_1, x_2, ..., x_j), 1 <= j < n. The number of bits needed to optimally encode the sequence x_1, ..., x_n is then given by the negative base-2 logarithm of the product of the assigned probabilities, -log_2 [ p(x_1) * prod_{j=1}^{n-1} p(x_{j+1} | x_1, ..., x_j) ].
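A minimal sketch of this relationship, assuming a hypothetical adaptive order-0 model with Laplace smoothing (any conditional model could be plugged in): the ideal code length is simply the sum of -log2 of the probabilities the model assigns.

```python
import math
from collections import defaultdict

def ideal_code_length(sequence, prob_model):
    """Ideal code length (in bits) of a sequence under a given conditional
    probability model: -log2 of the product of the assigned probabilities."""
    bits, context = 0.0, ()
    for symbol in sequence:
        p = prob_model(context, symbol)   # p(x_{j+1} | x_1..x_j), must be > 0
        bits += -math.log2(p)
        context = context + (symbol,)
    return bits

def make_adaptive_model(alphabet_size=256):
    """Hypothetical adaptive order-0 model with Laplace smoothing; the better
    the model fits the data, the fewer bits the ideal coder needs."""
    counts, total = defaultdict(int), [0]
    def model(context, symbol):
        p = (counts[symbol] + 1) / (total[0] + alphabet_size)
        counts[symbol] += 1
        total[0] += 1
        return p
    return model

print(ideal_code_length(b"aaaaabaaaa", make_adaptive_model()))
```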
Coding techniques that can encode the sequence at rates close to the optimum are known. Hence, the higher the probabilities assigned in the above product, the fewer bits are needed to encode the sequence. A model, in this context, is then simply a scheme for assigning conditional probability distributions. Clearly, it is the model that determines the rate at which we can encode the sequence, so the critical task in lossless data compression is finding good models for the data under consideration. Finding good models for a given data set is a difficult problem. In lossless compression applications, some structure is usually imposed on the data in the form of finite-state models, Markov models, tree models, finite-context models, etc., to make the problem mathematically and/or computationally tractable. Algorithms are then designed that encode the given data in an optimal or sub-optimal manner. Algorithms that 'learn' the parameters of the model that best describes the data and achieve rates that are asymptotically optimal are known as optimal universal coding schemes. Such schemes have been applied very successfully in text or string compression applications. Unfortunately, universal coding schemes and other standard modelling techniques do not work well in practice when applied to gray-scale image data.
The CB9 lossless image compression algorithm is context-based and codes prediction errors with an adaptive arithmetic code. It has been developed within an algorithm class that includes (in the order of their development) Sunset, JPEG lossless, sub8xb, and now CaTH (Centering and Tail Handling). Lossless compression algorithms using prediction errors are easily modified to introduce a small loss through quantization so that the absolute error at any pixel location does not exceed a prescribed value N. Here N varies from 1 to 7, the values for which the JPEG group issued a call for contributions. This work describes CB9 and the experiments with near-lossless compression using the JPEG test images. Included are experiments with some image processing operations, such as edge enhancement, for the purpose of studying the loss in fidelity from successively performing decompression, followed by an image processing operation, followed by recompression of the new result.
When describing data with arithmetic coding, coding parameters are estimated for each Markov state independently. However, in many cases the Markov states are not completely uncorrelated. By exploiting this correlation, we can decrease the estimation errors of the coding parameters and improve compression performance. This use of the correlation can be incorporated into the MDL (Minimum Description Length) framework. We have developed an MDL-based arithmetic coding scheme for correlated Markov states.
Textured images are generally difficult to compress because they contain a large number of high frequency components which are difficult to capture with traditional compression schemes such as transform coding, especially at high compression ratios. Since many textures possess a high degree of self-similarity at different scales, the fractal compression technique can be applied to encode such textured images effectively by exploiting this self-similar property. The main drawback of fractal compression is that the fractal encoding procedure is very time consuming. In this research, we focus on the speed-up of this procedure by introducing three schemes: dimensionality reduction, energy-based classification, and tree search. We have developed an algorithm that combines these three schemes together and achieves a speed-up factor of 177 at the expense of only 0.4 dB degradation in PSNR relative to the unmodified exhaustive search for a typical textured image encoded with 0.44 bpp.
In this paper, we will first briefly review fractal equations. Then we will present a greedy algorithm for solving encoding equations of class 2 fractals. The idea is to represent images in terms of blocks in a similar way as JPEG. The DCT in JPEG is replaced by fractal transformation. Image compression ratios for various selections of block size and parameter space are calculated.
In general fractal image compression, each range block is approximated by a contractive transform of the matching domain block under the mean-squared-error criterion. In this paper, we propose a fractal image compression algorithm with a perceptual distortion measure rather than the mean squared error. In the perceptual distortion measure, the background brightness sensitivity and edge sensitivity are used. To obtain the background brightness sensitivity for each pixel, the average value of the neighborhood is calculated and applied to a quadratic function. For the edge sensitivity of each pixel, the sum of the differences in the neighborhood is calculated and applied to a nonlinear function. The perceptual distortion measure is obtained by multiplying the background brightness sensitivity, the edge sensitivity, and the error between the range block and the transformed domain block. Range blocks with large distortion are split, and the same algorithm is applied to the smaller blocks. Compared to the method using the mean-squared-error measure, a 10% compression-ratio improvement is achieved at the same image quality.
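The following is a minimal per-block sketch of such a perceptual distortion measure. The quadratic brightness model, the nonlinear edge model, and the constants a, b, c are illustrative assumptions (and the measure is computed per block rather than per pixel), not the authors' exact functions.

```python
import numpy as np

def perceptual_distortion(range_block, transformed_domain_block,
                          a=1e-4, b=1.0, c=0.02):
    """Sketch: distortion = brightness sensitivity * edge sensitivity * block error."""
    rb = range_block.astype(float)

    # Background brightness sensitivity: quadratic function of the local mean.
    brightness_sens = a * (rb.mean() - 128.0) ** 2 + b

    # Edge sensitivity: nonlinear (saturating) function of summed local differences.
    gy, gx = np.gradient(rb)
    edge_activity = np.sum(np.abs(gx) + np.abs(gy))
    edge_sens = 1.0 / (1.0 + c * edge_activity)

    # Error between the range block and the transformed domain block.
    error = np.mean((rb - transformed_domain_block) ** 2)
    return brightness_sens * edge_sens * error
```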
Fractal coding of digital images offers many promising qualities. However the coding process suffers from the long search time of the domain block pool. We offer a new approach to speed up the search, based on an efficient solution of the following problem. Given a sphere and a collection of lines going through the origin in Euclidean space, enumerate those lines which intersect the sphere. Experimental results show that the algorithm speeds up the coding process by up to 50 times.
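The geometric subproblem stated above has a direct test: a line through the origin with unit direction d intersects a sphere with center c and radius r iff the perpendicular distance from c to the line is at most r. A minimal sketch (the mapping from domain blocks to lines is not reproduced here):

```python
import numpy as np

def lines_intersecting_sphere(directions, center, radius):
    """Return indices of lines through the origin that intersect the sphere.
    For a unit direction d, the closest point on the line to the center c is
    (c . d) d, so the line intersects iff ||c - (c . d) d|| <= radius."""
    d = np.asarray(directions, dtype=float)
    d = d / np.linalg.norm(d, axis=1, keepdims=True)   # normalize directions
    c = np.asarray(center, dtype=float)
    proj = d @ c                                       # (c . d) for every line
    foot = proj[:, None] * d                           # closest point on each line
    dist = np.linalg.norm(c - foot, axis=1)
    return np.nonzero(dist <= radius)[0]

# Usage: only the lines hitting the query sphere need further examination.
rng = np.random.default_rng(0)
dirs = rng.normal(size=(1000, 3))
print(lines_intersecting_sphere(dirs, center=[2.0, 0.0, 0.0], radius=0.5))
```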
The generalized Lloyd algorithm (GLA) plays an important role in the design of vector quantizers (VQ) for lossy data compression, and in feature clustering for pattern recognition. In the VQ context, this algorithm provides a procedure to iteratively improve a codebook and results in a local minimum which minimizes the average distortion function. We present a set of ideas that provide the basis for the acceleration of the GLA, some of which are equivalent to the exhaustive nearest neighbor search and some that may trade-off performance for the execution speed. More specifically, we use the maximum distance initialization technique in conjunction with either the partial distortion method or the fast tree-structured nearest neighbor encoding or the candidate-based constrained nearest neighbor search. As it is shown by the numerical experiments, all these methods provide significant improvement of the execution time of the GLA, in most cases together with an improvement of its performance. This improvement is of the order of 0.4 dB in the MSE, 15% in the entropy and more than 100 times in the execution time for some of the results presented.
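Of the techniques named above, the partial distortion method is the simplest to illustrate: a codeword's squared error is abandoned as soon as the running sum exceeds the best distortion found so far, giving the exact nearest neighbor with fewer operations. A minimal sketch:

```python
import numpy as np

def nearest_codeword_partial_distortion(x, codebook):
    """Exact nearest-neighbor search with partial distortion elimination.
    Returns (best_index, best_distortion)."""
    best_idx, best_dist = -1, float("inf")
    for i, c in enumerate(codebook):
        dist = 0.0
        for xj, cj in zip(x, c):
            dist += (xj - cj) ** 2
            if dist >= best_dist:          # early exit: cannot beat current best
                break
        else:
            best_idx, best_dist = i, dist  # completed all dimensions and improved
    return best_idx, best_dist

# Usage: identical result to exhaustive search, usually with far fewer operations.
rng = np.random.default_rng(0)
codebook = rng.normal(size=(256, 16))
x = rng.normal(size=16)
print(nearest_codeword_partial_distortion(x, codebook))
```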
This work focuses on the development of a two-level image block classification scheme and its application to low bit rate image coding. Using this classifier, we present two adaptive encoding structures, one based on vector quantization (VQ) and the other based on transform coding. The first stage of our system classifies the image blocks into K1 classes based on the block gain, similar to the well-known classification scheme of Chen and Smith, but allows for the possibility of a variable number of vectors per class. To do this, we develop an iterative mini-max algorithm that adjusts the vectors among the classes so that the resulting mean-normalized standard deviation of the gain values within any class is similar across all classes. After classifying based on block gain values, we further classify each gain class into K2 spectral classes. This is accomplished by performing a 1D LPC-type analysis of each block and clustering the resulting LPC vectors using a vector quantizer with K2 codevectors. In order to make this spectral matching meaningful, the VQ is designed and implemented using the Itakura-Saito distortion measure. The resulting two-level classification scheme thus classifies an image into K = K1K2 classes. A system consisting of a bank of K fixed-rate multi-stage VQs and a DCT-based system are then used to examine the usefulness of the proposed approaches to classification.
It is well known that random noise on images significantly affects the efficiency of compression algorithms. Traditional spectral filtering techniques are effective in many cases but may require some prior knowledge of the noise and image characteristics. Furthermore, the processing requirements of spectral filters strongly depend on their noise rejection properties. In this paper we present a block-based, non-linear filtering technique based on the Singular Value Decomposition (SVD). Traditional applications of SVD to image processing rely on heuristics to estimate the noise power and are usually applied to the entire image. The proposed scheme employs a complexity-theoretic criterion for noise estimation which exploits the well-known property that random noise is hard to compress. By combining SVD with a lossless compression algorithm, in our case lossless JPEG, we can estimate the noise power and derive accurate SVD thresholds for noise removal. Simulation results on grayscale images contaminated by additive noise show that the technique can effectively filter noisy images and improve compression performance with no prior knowledge of either the image or the noise characteristics. Furthermore, the technique does not cause any blurring, unlike linear filtering techniques or median filtering.
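A minimal sketch of the block-wise SVD filtering step is given below; the threshold is simply passed in, whereas in the proposed scheme it would be derived from the noise power estimated via lossless compression (that estimation step is not reproduced here).

```python
import numpy as np

def svd_denoise_block(block, threshold):
    """Reconstruct a block keeping only singular values above a noise-derived threshold."""
    u, s, vt = np.linalg.svd(block.astype(float), full_matrices=False)
    s_kept = np.where(s > threshold, s, 0.0)   # discard noise-dominated components
    return u @ np.diag(s_kept) @ vt

def svd_denoise_image(image, block_size=8, threshold=50.0):
    """Apply the block-wise SVD filter over non-overlapping blocks of the image."""
    out = image.astype(float).copy()
    h, w = image.shape
    for r in range(0, h - block_size + 1, block_size):
        for c in range(0, w - block_size + 1, block_size):
            out[r:r+block_size, c:c+block_size] = svd_denoise_block(
                image[r:r+block_size, c:c+block_size], threshold)
    return np.clip(out, 0, 255).astype(np.uint8)
```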
We present a predictive image coding technique based on the polynomial transform. This is an image representation model that analyzes an image by locally expanding it into a weighted sum of orthogonal polynomials. The scheme proposed in this paper is a Laplacian-like pyramidal structure. Based on interpolation and deblurring algorithms implemented by means of the polynomial transform, our method improves the prediction of an image at a certain spatial scale from its representation at a lower scale, thereby reducing the entropy of the prediction error image, in comparison with the Laplacian pyramid and the scale-space image coding schemes.
Several application fields, such as remote sensing, medical science, and multimedia, require the interpretation, archiving, and transmission of image data. The volume of data to be managed is very large, and this has stimulated research activities aimed at image compression. A topic of particular interest is the evaluation of the specific degradations introduced by compression. In general, such studies use only the mean square error and visual inspection to determine the quality of the reconstructed image. These criteria are not sufficient for an accurate evaluation of the reconstructed image. This paper introduces a new method for the quality evaluation of block-transform compressed images. It reflects the objective function of the compression algorithm, gives a measure of the visual quality, and is easily computed.
A Zerotree (ZT) coding scheme is applied as a post-processing stage to avoid transmitting zero data in the High-Speed Pyramid (HSP) image compression algorithm. This algorithm has features that increase the capability of the ZT coding to give very high compression rates. In this paper the impact of the ZT coding scheme is analyzed and quantified. The HSP algorithm creates a discrete-time multiresolution analysis based on a hierarchical decomposition technique that is a subsampling pyramid. The filters used to create the image residues and expansions can be related to wavelet representations. According to the pixel coordinates and the level in the pyramid, N2 different wavelet basis functions of various sizes and rotations are linearly combined. The HSP algorithm is computationally efficient because of the simplicity of the required operations, and as a consequence, it can be very easily implemented with VLSI hardware. This is the HSP's principal advantage over other compression schemes. The ZT coding technique transforms the different quantized image residual levels created by the HSP algorithm into a bit stream. The use of ZTs compresses the already compressed image even further by taking advantage of parent-child relationships (trees) between the pixels of the residue images at different levels of the pyramid. Zerotree coding uses the links between zeros along the hierarchical structure of the pyramid to avoid transmission of those that form branches of all zeros. Compression performance and algorithm complexity of the combined HSP-ZT method are compared with those of the JPEG standard technique.
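As a minimal illustration of the zerotree test (assuming the usual quad-tree parent-child mapping between pyramid levels, which is an assumption here rather than the HSP specifics): a coefficient is a zerotree root when it and all of its descendants are zero, so the whole branch can be skipped instead of transmitted.

```python
import numpy as np

def is_zerotree_root(levels, level, r, c):
    """levels[0] is the coarsest residue image; the coefficient at (r, c) in
    level k has four children at rows 2r..2r+1, cols 2c..2c+1 of level k+1."""
    if levels[level][r, c] != 0:
        return False
    if level + 1 == len(levels):
        return True
    return all(is_zerotree_root(levels, level + 1, 2 * r + dr, 2 * c + dc)
               for dr in (0, 1) for dc in (0, 1))

# Usage with three quantized residue levels of sizes 4x4, 8x8, 16x16 (all zero):
levels = [np.zeros((4, 4), int), np.zeros((8, 8), int), np.zeros((16, 16), int)]
print(is_zerotree_root(levels, 0, 0, 0))   # True: this branch need not be sent
```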
A coding technique based on the WT (wavelet transform) and TSVQ (tree-structured vector quantization) is proposed in this paper. A wavelet-transformed image is composed of several subimages according to resolutions and edge directions, and has a particular PDF (probability density function), the generalized Gaussian distribution. We propose an improved tree-structured VQ coder based on the properties of the wavelet transform. Edge information extracted from the subimages in the wavelet transform domain is used to reduce the distortion. A new vector formation scheme and a new tree growing algorithm are presented in this paper to reduce the distortion of the reconstructed image. Finally, in order to give the receiver a picture as quickly as possible at minimum cost, we propose a progressive transmission scheme using an unbalanced tree-structured codebook. It is shown that unbalanced TSVQ is well adapted to progressive transmission. Simulation results indicate that the quality of the reconstructed image is excellent in the range of 0.30 - 0.40 bit/pixel.
Spatially varying quantization schemes try to exploit the non-stationary nature of image subbands. One technique for spatially varying quantization is classification based on the AC energy of blocks. Several different methods of subband classification have been proposed in the literature. One method is to optimally classify each subband and send the classification maps as side information. Although image subbands can be shown to be roughly uncorrelated, they are not independent. Naveen and Woods proposed a method in which classification is done based on the AC energy of the block corresponding to the same spatial location, but from the lower frequency band. In their method, inter-subband dependence is exploited to almost completely eliminate side information, albeit at the cost of decreasing classification gain. In this paper, we propose a new method of classification based on vector quantization of AC energy n-tuples formed by the energies of blocks which correspond to the same spatial location in the original image but belong to different subbands. This method allows us to reduce the side information while at the same time maximizing the classification gain for each band under the vector constraint. The performance of the new method is compared with the other two methods. The comparison is made based on conditional entropies as well as actual bit rates.
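A minimal sketch of how the AC energy n-tuples could be formed before vector quantization. For simplicity it assumes equally sized subbands; in practice subband sizes differ and blocks must be mapped to co-located regions, and the VQ design itself is not shown.

```python
import numpy as np

def block_ac_energy(block):
    """AC energy of a block: total energy minus the DC (mean) component."""
    b = block.astype(float)
    return float(np.sum((b - b.mean()) ** 2))

def ac_energy_tuples(subbands, block_size=4):
    """Form an AC-energy n-tuple per spatial location from n subbands of equal
    size (simplifying assumption); the tuples are then clustered/VQ-coded."""
    h, w = subbands[0].shape
    tuples = []
    for r in range(0, h, block_size):
        for c in range(0, w, block_size):
            tuples.append([block_ac_energy(s[r:r+block_size, c:c+block_size])
                           for s in subbands])
    return np.array(tuples)
```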
This paper exploits three correlation patterns that exist among the discrete wavelet transform (DWT) coefficients of a decomposed image. The DWT is known as a very useful transform for image compression. Since the correlation patterns are among the DWT coefficients, they constitute post-DWT redundancy. By reducing this redundancy, quite significant improvement can be obtained, as shown in this paper. In real images, edges, which are discontinuities, are very important in representing an image. The DWT on edges is not as efficient as it is on smooth areas, so some correlated DWT residuals around an edge can be observed in our experiments. This is the most important reason why the post-DWT redundancy exists. To make use of this redundancy, two useful techniques are employed in this paper: Magnitude Partition and Coordinate Splitting. The first does not increase data entropy while the second can reduce data entropy. The combination of these two techniques is the key idea behind the schemes of this paper. Since this post-DWT redundancy has not been well pointed out in the current literature, the novelty of this paper is to give an overall examination of it and to provide useful schemes to reduce it.
In this paper, we propose a new image compression technique using the wavelet transform and human visually estimated noise sensitivities. These consist of frequency, background brightness, and edge height sensitivities. The background brightness sensitivity for each quantizing point is modeled by a quadratic function. The edge sensitivity for each quantizing point is modeled by a non-linear function. The minimum of the background brightness sensitivity and the edge height sensitivity is multiplied by the frequency sensitivity to determine the quantization step size. Quantization step sizes are calculated using the coefficients of the lowest frequency band, which are coded losslessly. Therefore, in the proposed method, no side information is needed to specify the quantization step sizes for the higher frequency bands. The coefficients of the high frequency bands are arithmetically coded in the horizontal and vertical directions depending on the edge direction. Compared with previous human-visual-system-based image compression methods, the proposed method shows improved image quality for the same compression ratio with less computational cost.
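A minimal sketch of such a step-size rule follows. The quadratic brightness model, the nonlinear edge model, and the constants a, b, c are illustrative assumptions, not the paper's exact functions; only the combination rule (frequency sensitivity times the minimum of the two local sensitivities, computed from the losslessly coded LL band) follows the description above.

```python
import numpy as np

def quantization_step(ll_block, base_freq_sensitivity, a=1e-4, b=1.0, c=0.05):
    """Sketch of a visually weighted quantization step for a high-frequency
    coefficient, derived only from the co-located LL-band block."""
    ll = ll_block.astype(float)

    # Background brightness sensitivity: quadratic function of the local mean.
    brightness_sens = a * (ll.mean() - 128.0) ** 2 + b

    # Edge height sensitivity: nonlinear function of summed local differences.
    edge_height = float(np.sum(np.abs(np.diff(ll, axis=1))) +
                        np.sum(np.abs(np.diff(ll, axis=0))))
    edge_sens = 1.0 + c * np.log1p(edge_height)

    # Combination rule from the abstract: frequency sensitivity times the
    # smaller of the two local sensitivities (functional forms assumed).
    return base_freq_sensitivity * min(brightness_sens, edge_sens)
```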
Researchers investigating multiresolution, pyramid, or wavelet-style subband images have repeatedly noted the obvious visual similarity between corresponding portions of subband images at adjacent resolutions, pyramid levels or wavelet scales. Fractal-based compression methods directly exploit the similarity between portions of the image and spatially contracted portions of the same image. Yet, the conventional strategy of independently coding the subband images ignores these relationships, thereby neglecting behavior that could provide significant leverage in reducing the compressed bitrate. We present an interband prediction method that capitalizes on interband relationships. Starting with a lower-frequency subband image, the method predicts the content of the next-higher-frequency subband image. The process repeats recursively until the desired resolution level is reached. Both the predictor and the prediction are determined using small local regions in the subband image, to accommodate the notoriously nonstationary behavior of image data. Occasionally, the prediction error is objectionably large and requires prediction-error correction. Examples of good image quality are presented with prediction across three subband levels and prediction-error correction at roughly 10% of the predicted pixels.
It has been proved recently that for Gaussian sources with memory an ideal subband split will produce a coding gain for scalar or vector quantization of the subbands. Following the methodology of the proofs, we outline a method for successively splitting the subbands of a source, one at a time, to obtain the largest coding gain. The subband with the largest theoretical rate reduction (TRR) is determined and split at each step of the decomposition process. The TRR is the difference between the rate in optimal encoding of N-tuples from a Gaussian source (or subband) and the rate for the same encoding of its subband decomposition. The TRR is a monotone increasing function of a so-called spectral flatness ratio, which involves the products of the eigenvalues of the source (subband) and subband decomposition covariance matrices of order N. These eigenvalues are estimated by the variances of the Discrete Cosine Transform coefficients, which approximate those of the optimal Karhunen-Loeve Transform. After the subband decomposition hierarchy or tree is determined through the criterion of maximal TRR, each subband is encoded with a variable-rate entropy-constrained vector quantizer. Optimal rate allocation to subbands is done with the BFOS algorithm, which does not require any source modelling. We demonstrate the benefit of using the criterion by comparing coding results on a two-level low-pass pyramidal decomposition with coding results on a two-level decomposition obtained using the criterion. For 60 MCFD (Motion Compensated Frame Difference) frames of the Salesman sequence an average rate-distortion advantage of 0.73 dB and 0.02 bpp, and for 30 FD (Frame Difference) frames of the Caltrain image sequence an average rate-distortion advantage of 0.41 dB and 0.013 bpp, are obtained with the optimal decomposition over the low-pass pyramidal decomposition.
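The following sketch illustrates the eigenvalue-estimation step and a spectral-flatness-style ratio built from those estimates. The DCT-as-KLT-approximation step follows the description above; the particular ratio of geometric means used in flatness_ratio is an illustrative assumption, since the paper's exact TRR expression is only characterized here as a monotone increasing function of such a ratio.

```python
import numpy as np

def dct_matrix(N):
    """Orthonormal DCT-II matrix of size N x N."""
    k = np.arange(N)[:, None]
    n = np.arange(N)[None, :]
    C = np.cos(np.pi * (2 * n + 1) * k / (2 * N))
    C[0, :] *= 1.0 / np.sqrt(N)
    C[1:, :] *= np.sqrt(2.0 / N)
    return C

def dct_variance_eigenvalues(signal, N):
    """Estimate the order-N covariance eigenvalues of a 1D (sub)band signal by
    the per-coefficient variances of its non-overlapping length-N DCT blocks."""
    x = np.asarray(signal, dtype=float)
    blocks = x[: len(x) // N * N].reshape(-1, N)
    coeffs = blocks @ dct_matrix(N).T
    return coeffs.var(axis=0)

def flatness_ratio(source_eigs, subband_eigs_list):
    """Spectral-flatness-style ratio: geometric mean of the source eigenvalue
    estimates over the geometric mean of the pooled subband eigenvalue estimates
    (an assumed form; the TRR is a monotone function of such a ratio)."""
    gm = lambda v: float(np.exp(np.mean(np.log(v))))
    return gm(source_eigs) / gm(np.concatenate(subband_eigs_list))
```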
At low bit rates most current image coding techniques will generate a smooth image where some of the important details of the original image are lost. For example, the compression of an image by JPEG down to a bit rate of 0.1 to 0.2 bit/pel usually results in a significant loss of detail and an annoying blocking effect. The basic idea in this paper is that in applications where the detail information in a low bit rate coded image is of more concern than the gray scale information, it is necessary to encode the former and disregard the latter. Based on the logarithmic image processing model, we have recently proposed a novel algorithm to generate a binary image (called the sketch image) from a gray scale image. We have shown that the sketch image retains much more information of the original image than those images generated by other image binarization techniques such as simple thresholding, ordered dither and error diffusion. In this paper, we further study the sketch image algorithm and propose a simple data compression scheme which incorporates a decimation and interpolation algorithm and the JBIG algorithm. Experimental results have shown that at low bit rate (about 0.1 bit/pel) the proposed algorithm preserves the fine details of the original image at the expense of loss of grey scale information.
The conventional method for sending halftone images via facsimile machines is inefficient. The ToneFac algorithm previously proposed improves the efficiency of halftone image transmission. ToneFac represents a halftone image by mean block values and an error image. To improve on ToneFac, this paper proposes additional processing techniques: searching for the error-minimizing gray value for each block; quantization and coding of block values; bit switching, which transforms the error image into a more compressible image; optimal block sizing; and spurious dot filtering, which removes perceptually insignificant dots. The new algorithm is compared to other methods including adaptive arithmetic coding, and is shown to provide improvement in bit rate.
We describe a procedure by which JPEG compression may be customized for grayscale images that are to be compressed before they are scaled, halftoned, and printed. Our technique maintains 100% compatibility with the JPEG standard, and is applicable with all scaling and halftoning methods. The JPEG quantization table is designed using frequency-domain characteristics of the scaling and halftoning operations, as well as the frequency sensitivity of the human visual system. In addition, the Huffman tables are optimized for low-rate coding. Compression artifacts are greatly reduced because they are masked by the halftoning patterns, and pushed into frequency bands where the eye is less sensitive. We present experimental results demonstrating that the customized JPEG encoder typically maintains `near visually lossless' image quality at rates below 0.2 bits per pixel (with reference to the final, printed image). In terms of the achieved bit rate, this performance is typically at least 20% better than that of a JPEG encoder using the suggested baseline tables.
A new block coding technique for the compression of bilevel text images is presented. The technique uses a combination of pre-processing the image to extract the edge information and block coding to compress the data. The key feature of the technique is simplicity in implementation. The pre-processing consists of an image differencing operation to decorrelate the strings of black pixels. The decorrelation is followed by the lossless coding of each block of the image. The performance of the new image differencing method is examined based on both theoretical and experimental code length data. Both theoretical and simulation results show that by pre-processing the image, the number of nonzero pixels can be significantly reduced and a more efficient block code is realized.
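A minimal sketch of one plausible differencing step is shown below. Horizontal XOR differencing is an illustrative assumption rather than the paper's exact operator, but it shows the intended effect: a run of black pixels collapses to nonzero pixels only at its ends, so fewer blocks contain nonzero data.

```python
import numpy as np

def horizontal_difference(bilevel):
    """Row-wise differencing of a bilevel image: XOR each pixel with its left
    neighbor (column 0 is kept verbatim)."""
    img = bilevel.astype(np.uint8)
    diff = img.copy()
    diff[:, 1:] = img[:, 1:] ^ img[:, :-1]
    return diff

# Usage: the differenced image has far fewer nonzero pixels, so a block code
# that spends bits only on nonzero blocks becomes more efficient.
text = np.zeros((4, 16), np.uint8)
text[:, 3:12] = 1                       # a horizontal run of black pixels
print(int(text.sum()), int(horizontal_difference(text).sum()))
```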