We consider the problem of representing individual faces by maximum <i>L</i><sub>1</sub>-norm projection subspaces calculated from available face-image ensembles. In contrast to conventional <i>L</i><sub>2</sub>-norm subspaces, <i>L</i><sub>1</sub>-norm subspaces are seen to offer significant robustness to image variations, disturbances, and rank selection. Face recognition then becomes the problem of associating a new, unknown face image with the “closest,” in some sense, <i>L</i><sub>1</sub> subspace in the database. In this work, we also introduce the concept of adaptively allocating the available number of principal components to different face-image classes, subject to a given total number (budget) of principal components. Experimental studies included in this paper illustrate and support the theoretical developments.
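As a rough illustration of what a maximum <i>L</i><sub>1</sub>-norm projection means, the sketch below computes a single <i>L</i><sub>1</sub> principal component of a tiny data set. It relies on the known identity that maximizing the sum of absolute projections over unit vectors is equivalent to maximizing the Euclidean norm of a signed sum of the data points over all sign vectors. The exhaustive search, the function names, and the toy data are assumptions made for illustration only; the abstract does not say which algorithm the paper uses, and this search is exponential in the number of points.

```python
import itertools
import math


def l1_principal_component(points):
    """Return (q, value): a unit vector q maximizing sum_n |x_n . q|, and that maximum.

    Exhaustive search over sign vectors b in {+1, -1}^N, so the cost grows as 2^N.
    Assumes at least one point is nonzero.
    """
    dim = len(points[0])
    best_norm, best_vec = -1.0, None
    # b and -b describe the same line, so the first sign is fixed to +1.
    for tail in itertools.product((1.0, -1.0), repeat=len(points) - 1):
        signs = (1.0,) + tail
        combo = [sum(s * x[j] for s, x in zip(signs, points)) for j in range(dim)]
        norm = math.sqrt(sum(c * c for c in combo))
        if norm > best_norm:
            best_norm, best_vec = norm, combo
    return [c / best_norm for c in best_vec], best_norm


def l1_projection(points, q):
    """Sum of absolute projections of the points onto the direction q."""
    return sum(abs(sum(a * b for a, b in zip(x, q))) for x in points)


# Hypothetical toy data: four points on one axis plus one off-axis point.
toy_points = [[1.0, 0.0], [2.0, 0.0], [3.0, 0.0], [-2.0, 0.0], [0.0, 5.0]]
direction, value = l1_principal_component(toy_points)
print(direction, value)
```

A classifier in the spirit of the abstract would compute one such subspace per face class and assign a new image to the nearest one; that step is not shown here because the abstract does not define the distance used.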
We consider the problem of online foreground extraction from compressed-sensed (CS) surveillance videos. A technically novel approach is suggested and developed by which the background scene is captured by an <i>L</i><sub>1</sub>-norm subspace sequence directly in the CS domain. In contrast to conventional <i>L</i><sub>2</sub>-norm subspaces, <i>L</i><sub>1</sub>-norm subspaces are seen to offer significant robustness to outliers, disturbances, and rank selection. Subtraction of the <i>L</i><sub>1</sub>-subspace tracked background then leads to effective extraction of the foreground (moving objects). Experimental studies included in this paper illustrate and support the theoretical developments.
In this paper, an adaptive procedure based on a coarse-to-fine scheme is presented for segmenting a video sequence into background and moving objects, aimed at supporting content-based functionalities. The coarse stage provides pixel-based motion detection based on non-Gaussian signal extraction using Higher Order Statistics (HOS). The fine motion detection phase refines the coarse classification by introducing topological constraints on the segmentation map, essentially by means of simple morphological operators at low computational cost. The background model explicitly takes into account the apparent motion induced by background fluctuations that typically appear in outdoor sequences. Spatial adaptation of the algorithm is obtained by varying the threshold of the HOS-based motion detector on the basis of the local spectral characteristics of each frame, measured by a parameter representing the local spatial bandwidth. Simulation results show that introducing the local bandwidth to control the segmentation algorithm rejects the large apparent motion observed in outdoor sequences without degrading the detection performance in indoor sequences.
In this paper we extend a segmentation method aimed at separating the moving objects from the background in a generic video sequence by means of a higher order statistics (HOS) significance test performed on a group of inter-frame differences. The test is followed by the motion detection phase, which produces a preliminary binary segmentation map that is refined by a final regularization stage. The HOS threshold and the temporal extent of the motion detection phase are adaptively changed on the basis of the estimated background activity and of the detected presence of slowly moving objects. The regularization phase, which imposes a local connectivity constraint on the background-foreground map by basic morphological operators, plays an important role in eliminating misclassifications due to motion estimation ambiguities in the original video sequence. The algorithm performance is illustrated by typical results obtained on MPEG4 sequences.
In this paper we propose a segmentation method aimed at separating the moving objects from the background in a generic video sequence. This task, accomplished at the coder site, is intended to support new functionalities for accessing and decoding single objects of the coded video sequence, as foreseen by the innovative multimedia scenarios considered in the MPEG4 work. The proposed segmentation method comprises a motion detection stage, which produces a preliminary segmentation map, followed by a morphological regularization that plays an important role in eliminating misclassifications due to motion estimation ambiguities, noise, etc., in the original video sequence. The motion detection is essentially based on a higher order statistics (HOS) test that employs a temporally, non-linearly filtered version of the video sequence; this choice is motivated by HOS detection properties. The regularization phase, performed by basic morphological operators, imposes a local connectivity constraint on the background-foreground map. The performance of the segmentation algorithm is illustrated by experimental results obtained on MPEG4 test sequences.
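The morphological regularization described above can be sketched, under assumptions, as a closing followed by an opening on the binary background-foreground map. The 3x3 square structuring element, the order of the two operations, the zero padding at the image border, and all function names are choices made for this illustration; the abstract only states that basic morphological operators are used.

```python
def _pixel(mask, r, c):
    """Value of mask at (r, c); positions outside the image count as background."""
    if 0 <= r < len(mask) and 0 <= c < len(mask[0]):
        return mask[r][c]
    return 0


def _morph(mask, combine):
    """Apply `combine` (all or any) over each 3x3 neighbourhood of a binary mask."""
    offsets = (-1, 0, 1)
    return [
        [int(combine(_pixel(mask, r + dr, c + dc) for dr in offsets for dc in offsets))
         for c in range(len(mask[0]))]
        for r in range(len(mask))
    ]


def erode(mask):
    return _morph(mask, all)


def dilate(mask):
    return _morph(mask, any)


def regularize(mask):
    """Closing (fills small holes), then opening (removes isolated detections)."""
    closed = erode(dilate(mask))
    return dilate(erode(closed))


# Hypothetical preliminary map: a 5x5 object with a one-pixel hole,
# plus one isolated false detection.
preliminary = [[0] * 12 for _ in range(10)]
for r in range(2, 7):
    for c in range(2, 7):
        preliminary[r][c] = 1
preliminary[4][4] = 0
preliminary[4][10] = 1
cleaned = regularize(preliminary)
```

On this toy map the hole inside the object is filled and the isolated pixel is discarded, which is the kind of local connectivity constraint the abstract refers to.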
In this work we propose a method for adaptive quantization of motion compensation residuals, in an H.263 coding scheme, based on color information. In the proposed strategy, the perceptual distance between the original and the predicted version of each macroblock is evaluated in the perceptually uniform L, a, b color space. Exploiting the properties of perceptually uniform color spaces, the color distance is easily evaluated by means of the Euclidean distance, and its average value is compared with a threshold in order to choose the suitable quantization. Simulation results show that the adaptive quantization is very efficient in reducing the bit-rate when the sequences exhibit slow, regular motion, and that the performance of the quantization is upper-bounded by that of the motion compensation. In fact, for sequences with a low-to-medium amount of motion, the block-based motion compensation efficiently predicts the current frame from adjacent ones, and the residuals are due to noise or to small color variations. In these cases, the adaptive quantization can provide a significant bit-rate reduction without subjective quality degradation. The proposed strategy is strictly compatible with the H.263 coding standard; however, it is quite general and can be usefully exploited in different coding frameworks, based on various motion compensation techniques, whenever motion compensation residuals are evaluated.
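The decision rule described above can be sketched as follows. The sketch assumes that each macroblock is already available as a list of (L, a, b) triples, that a small average color distance selects the coarser quantizer, and that the threshold and the two quantizer values are free parameters; these numbers and names are hypothetical and are not taken from the paper.

```python
import math


def mean_color_distance(original, predicted):
    """Average Euclidean distance between corresponding (L, a, b) pixels."""
    distances = [math.dist(p, q) for p, q in zip(original, predicted)]
    return sum(distances) / len(distances)


def choose_quantizer(original, predicted, threshold=2.0, fine_q=8, coarse_q=16):
    """Pick a quantizer value for the residual of one macroblock.

    Assumption: when the prediction is perceptually close to the original,
    the residual can be quantized more coarsely. All default values are
    illustrative only.
    """
    if mean_color_distance(original, predicted) < threshold:
        return coarse_q
    return fine_q


# Hypothetical four-pixel "macroblocks" in (L, a, b) coordinates.
block = [(50.0, 0.0, 0.0)] * 4
close_prediction = [(50.0, 1.0, 0.0)] * 4
poor_prediction = [(60.0, 0.0, 0.0)] * 4
print(choose_quantizer(block, close_prediction))
print(choose_quantizer(block, poor_prediction))
```

Conversion from the coder's native color format to L, a, b is left out on purpose, since the abstract does not specify how it is performed.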
The paper describes a coding optimization strategy in conformity with the recent ITU-T H.263 Recommendation for videophone sequences at bitrates below 64 kbit/s. The optimization algorithm jointly selects the temporal position of the frames to be coded and the coding mode (I, P or PB) of the selected frames. The decision is based on the observation, over a group of frames, of an 'activity' parameter representing the variation of each frame with respect to the last coded one. The proposed strategy produces coded sequences with average frame rates lower than those produced by a non-optimized coder, and with a better visual quality of each single frame. However, the evaluation of the activity parameters and the observation of several candidates require a greater delay, buffer size and complexity of the coding algorithm.
The paper illustrates a method for affine-warping-based motion compensation that exploits the same prediction mechanism as the H.263 Advanced Prediction Mode, introducing only new constant weighting matrices in the H.263 Overlapped Motion Compensation algorithm. In particular, we show that, with reference to a regular-mesh-based motion estimation algorithm, image prediction using affine morphing can be easily performed with fixed coefficients when a proper linear resampling is used. The performance of an H.263-like coder based on affine transformation and linear resampling is illustrated through experimental data.
The paper illustrates a comprehensive method for motion compensation to be used in predictive video coding. The method is based on the observation that structured artifacts, such as those consisting of isolated points, lines, edges, and organized textures, are directly perceived by the user, while artifacts resembling realizations of Gaussian processes can be considered less important. A fidelity criterion based on the Mean Fourth-Cumulant, as an indirect estimate of the local entropy level, is then applied to drive both the segmentation and the motion estimation phases. The motion estimator is conceptually similar to the higher order moments techniques employed in time delay estimation, and takes advantage of the Gaussian signal rejection capability typical of higher order cumulants. The contribution describes the theoretical framework of cumulant-based motion estimation. The performance of a coder based on the discrimination of the temporal activity by means of cumulants is illustrated through experimental data.
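The Gaussian rejection property mentioned above comes from the fact that the fourth-order cumulant of a Gaussian variable is zero. A minimal sketch of a sample estimate over a block of prediction residuals is given below; the function names, the use of the absolute value as a structure indicator, and the threshold are assumptions for illustration and are not the paper's fidelity criterion.

```python
def fourth_order_cumulant(samples):
    """Sample estimate of the fourth-order cumulant: m4 - 3 * m2^2 about the mean."""
    n = len(samples)
    mean = sum(samples) / n
    centered = [s - mean for s in samples]
    m2 = sum(c ** 2 for c in centered) / n
    m4 = sum(c ** 4 for c in centered) / n
    return m4 - 3.0 * m2 ** 2


def looks_structured(samples, threshold):
    """Flag a block whose residual departs clearly from Gaussian behaviour.

    The threshold is a hypothetical tuning parameter.
    """
    return abs(fourth_order_cumulant(samples)) > threshold


# Hypothetical residual blocks: an alternating pattern and a sparse, spiky one.
alternating = [-1.0, 1.0, -1.0, 1.0]
spiky = [0.0, 0.0, 0.0, 4.0, 0.0, 0.0, 0.0, -4.0]
print(fourth_order_cumulant(alternating), fourth_order_cumulant(spiky))
```

For Gaussian-like noise the estimate fluctuates around zero, so a threshold on its magnitude tends to ignore such residuals while responding to sparse, structured ones.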
The paper reports on a video sequence coding method that takes advantage of the generic video-communication layout: some moving objects on a still background. The algorithm operates on the groups of frames into which the whole digital video sequence is divided, which satisfies the synchronization requirements and gives an acceptable level of compatibility with standard video coding (H.261, MPEG, etc.). Each group of frames is analyzed as a spatio-temporal continuum in order to detect a three-dimensional segmentation that identifies the moving objects by means of spatial regions. These regions can spread, as a sort of “pipes,” through the whole group of frames in the temporal direction. Various pipe construction and coding strategies, including techniques based on object recognition and coding, are allowed. In this work, a pipe identification method based on fixed-size moving blocks, and their coding by means of a 3D-DCT, is reported. This method allows pipes that start adjacent to one another to separate, leaving uncoded stripes at their boundaries. The proposed method does not code the stripes; instead, it minimizes their number and the amount of artifacts generated by their presentation. Finally, the paper reports some considerations on coding efficiency in relation to the quality of the reconstructed sequences, and on the compatibility characteristics.