This paper presents a new adaptive Boxcar/Wavelet transform for image compression. Boxcar/Wavelet decomposition
emphasizes the idea of average-interpolation representation which uses dyadic averages and their
interpolation to explain a special case of biorthogonal wavelet transforms (BWT). This perspective for image
compression, together with the lifting scheme, offers the ability to train an optimum 2-D filter set for nonlinear prediction
(interpolation) that will adapt to the context around the low-pass wavelet coefficients for reducing energy
in the high-pass bands. Moreover, the filters obtained after training are observed to possess directional information
with some textural clues that can provide better prediction performance. This work takes a first step towards
obtaining this new set of training-based filters in the context of the Boxcar/Wavelet transform. Initial experimental
results show better subjective quality performance compared to popular 9/7-tap and 5/3-tap BWTs with
comparable results in objective quality.
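The average-interpolation view described above can be illustrated with a minimal 1-D sketch. It is not the trained, adaptive 2-D predictor proposed in the paper: it assumes a fixed quadratic average-interpolation predictor and edge replication at the boundaries, and the function names `boxcar_forward` and `boxcar_inverse` are hypothetical. The low-pass band holds dyadic (pairwise) averages, and the high-pass band holds whatever part of the within-pair difference the neighboring averages fail to predict.

```python
def _predict_difference(s, k):
    """Predict x[2k+1] - x[2k] from the neighboring averages.

    Assumption: fixed quadratic average-interpolation, which gives
    (s[k+1] - s[k-1]) / 4; boundaries replicate the edge average.
    """
    left = s[max(k - 1, 0)]
    right = s[min(k + 1, len(s) - 1)]
    return (right - left) / 4.0


def boxcar_forward(x):
    """One decomposition level: return (averages, details)."""
    if len(x) % 2 != 0:
        raise ValueError("input length must be even")
    half = len(x) // 2
    # Low-pass band: boxcar (dyadic) averages.
    s = [(x[2 * k] + x[2 * k + 1]) / 2.0 for k in range(half)]
    # High-pass band: actual difference minus predicted difference.
    d = [(x[2 * k + 1] - x[2 * k]) - _predict_difference(s, k)
         for k in range(half)]
    return s, d


def boxcar_inverse(s, d):
    """Invert boxcar_forward exactly by undoing the prediction step."""
    x = []
    for k in range(len(s)):
        diff = d[k] + _predict_difference(s, k)
        x.append(s[k] - diff / 2.0)
        x.append(s[k] + diff / 2.0)
    return x
```

For a linear ramp the interior details vanish, which is the energy reduction in the high-pass band that a better (for example, trained and context-adaptive) predictor aims to extend to more general image content.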
The performance of current face recognition algorithms degrades significantly when they are applied to low-resolution face images. To handle this problem, super-resolution techniques can be applied either in the pixel domain or in the face subspace. Since face images are high-dimensional data that are mostly redundant for the face recognition task, feature extraction methods that reduce the dimension of the data are becoming standard for face analysis. Hence, applying super-resolution in this feature domain, in other words in the face subspace rather than in the pixel domain, brings computational advantages together with robustness against noise and motion estimation errors. Therefore, we propose new super-resolution algorithms using Bayesian estimation and projection onto convex sets methods in the feature domain, and present a comparative analysis of the proposed algorithms with those already in the literature.
This paper describes a method for selecting key frames by using a number of parameters extracted from the MPEG video stream. The parameters are extracted directly from the compressed video stream without decompression. A combination of these parameters is then used in a rule-based decision system. The computational complexity of extracting the parameters and of applying the key frame decision rule is very small. As a result, the overall operation is performed very quickly, which makes our algorithm suitable for practical use. The experimental results show that this method can successfully select the distinctive frames of video streams.
This paper presents an object-based synthetic-natural hybrid image coding scheme, where each image object is encoded individually, provided that their boundaries are specified. This allows coding natural and synthetic image objects using different methods that are best suited to their content. It also allows object-based quality scalability, in addition to mixing lossy and lossless coding modes depending on the requirements of each image object. Furthermore, we propose a new object coding method using 2D mesh-based image sampling and interpolation, followed by encoding of the interpolation error image by a traditional data/waveform coding method. Experimental results on synthetic-natural hybrid test images are provided.
This paper presents an object-based, object-scalable mesh design and tracking algorithm for very low bitrate video coding, which consists of three stages: object segmentation, object boundary coding, and 2D mesh design and tracking within each object. Here, we use pre-segmented test sequences; hence, object/motion segmentation is not treated. The boundary of each individual object is approximated by a polygon. Next, a node point selection algorithm followed by constrained Delaunay triangulation is employed, where line segments representing the boundary of the object polygons form the constraints.
We propose two key modifications to a recent motion segmentation algorithm developed by Wang and Adelson, which greatly improve its performance. They are: (i) the adaptive k-means clustering step is replaced by a merging step, whereby the hypothesis (affine parameters of a block) which has the smallest representation error, rather than the respective cluster center, is used to represent each layer, and (ii) we implement it in multiple stages, where pixels belonging to a single motion model are labeled at each stage. Performance improvement due to the proposed modifications is demonstrated on real video clips.
Motion compensation using 2-D mesh models requires computation of the parameters of a spatial transformation within each mesh element (patch). It is well known that the parameters of an affine (bilinear or perspective) mapping can be uniquely estimated from three (four) node-point motion estimates. This paper presents closed-form overdetermined solutions for least squares estimation of the motion parameters, which also preserve mesh connectivity using node-based connectivity constraints. In particular, two new algorithms are presented: the first method, based on the dense motion estimates, can be viewed as post-processing of the dense motion field for best compact representation in terms of irregularly spaced samples, while the second one, which is based on spatio-temporal intensity gradients, offers closed-form solutions for direct estimation of the best node-point motion vectors. We show that the performance of the proposed closed-form solutions is comparable to that of the alternative search-based solutions at a fraction of the computational cost.
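As an illustration of the well-known result cited above, the following minimal sketch estimates the six parameters of an affine mapping x' = a1 x + a2 y + a3, y' = a4 x + a5 y + a6 from three or more point correspondences by solving the least squares normal equations in closed form. It is a generic textbook formulation, not the constrained algorithms of the paper: it handles a single patch and does not enforce the node-based connectivity constraints. The names `fit_affine`, `_det3`, and `_solve3` are hypothetical.

```python
def _det3(m):
    """Determinant of a 3x3 matrix given as nested lists."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))


def _solve3(m, b):
    """Solve the 3x3 system m * p = b with Cramer's rule."""
    det = _det3(m)
    if abs(det) < 1e-12:
        raise ValueError("points are collinear or too few")
    solution = []
    for col in range(3):
        replaced = [row[:] for row in m]
        for r in range(3):
            replaced[r][col] = b[r]
        solution.append(_det3(replaced) / det)
    return solution


def fit_affine(points, moved):
    """Least squares affine fit from N >= 3 correspondences.

    points: [(x, y), ...] positions in the reference frame
    moved:  [(x', y'), ...] positions after motion
    Returns [a1, a2, a3, a4, a5, a6].
    """
    ata = [[0.0] * 3 for _ in range(3)]
    atb_x = [0.0] * 3
    atb_y = [0.0] * 3
    for (x, y), (u, v) in zip(points, moved):
        row = (x, y, 1.0)
        for i in range(3):
            for j in range(3):
                ata[i][j] += row[i] * row[j]
            atb_x[i] += row[i] * u
            atb_y[i] += row[i] * v
    # Both output coordinates share the same design matrix.
    return _solve3(ata, atb_x) + _solve3(ata, atb_y)
```

With exactly three non-collinear points the fit is exact; with more points (for example, dense motion estimates inside a patch) the same routine returns the overdetermined least squares solution.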
We propose a new, efficient 2D object-based coding method for very low bit rate video compression based on affine motion compensation with triangular patches under connectivity constraints. We then compare this approach with a 3D object-based method using a flexible wireframe model of a head-and-shoulders scene. The two approaches are compared in terms of the resulting bitrates, peak signal-to-noise ratio (PSNR), visual image quality, and execution time. We show that 2D object-based approaches with affine transformations and triangular mesh models can simulate all capabilities of 3D object-based approaches using wireframe models under the orthographic projection, at a fraction of the computational cost. Moreover, 2D object-based methods provide greater flexibility in modeling arbitrary input scenes in comparison to 3D object-based methods.
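For reference, the PSNR measure used in such comparisons can be computed as in the sketch below. It uses the standard definition, 10 log10(peak^2 / MSE), and assumes 8-bit samples (peak value 255) supplied as flat sequences; the function name `psnr` and its signature are hypothetical and not taken from the paper.

```python
import math


def psnr(reference, decoded, peak=255.0):
    """Peak signal-to-noise ratio in dB between two equal-length sequences."""
    if len(reference) != len(decoded) or not reference:
        raise ValueError("inputs must be non-empty and of equal length")
    # Mean squared error between original and reconstructed samples.
    mse = sum((r - d) ** 2 for r, d in zip(reference, decoded)) / len(reference)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(peak * peak / mse)
```

A 2-D frame can be passed by flattening its rows into a single sequence first.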