A delay compensation algorithm is presented for a gaze-contingent video compression system (GCS) with robust targeted gaze containment (TGC) performance. The TGC parameter allows varying the compression level of a gaze-contingent video stream by controlling its perceptual quality. The delay compensation model is based on the Kalman filter framework, modeling the human visual system with eye position and velocity data. The model predicts future eye position and constructs a high-quality coded region of interest (ROI) designed to contain a targeted number of gaze samples while reducing perceptual quality in the periphery of that region. Several model parameterization schemes were tested with 21 subjects using a delay range of 0.02 to 2 s and a TGC of 60 to 90%. The results indicate that the model was able to achieve the TGC levels with compression of 1.4 to 2.3 times for TGC=90% and 1.8 to 2.5 times for TGC=60%. The lowest compression values were recorded for high delays, and the highest for small delays.
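The prediction step can be sketched with a constant-velocity Kalman filter over one gaze axis; the noise covariances, sampling interval, and delay below are illustrative assumptions, not the paper's parameterization:

```python
import numpy as np

def predict_gaze(positions, dt=0.004, delay=0.1):
    """Predict eye position `delay` seconds ahead with a constant-velocity
    Kalman filter over 1-D gaze samples (run once per axis)."""
    # State: [position, velocity]; constant-velocity transition over dt.
    F = np.array([[1.0, dt], [0.0, 1.0]])
    H = np.array([[1.0, 0.0]])          # only position is observed
    Q = np.diag([1e-4, 1e-2])           # process noise (assumed values)
    R = np.array([[1e-2]])              # measurement noise (assumed value)
    x = np.array([[positions[0]], [0.0]])
    P = np.eye(2)
    for z in positions[1:]:
        # Predict
        x = F @ x
        P = F @ P @ F.T + Q
        # Update with the new gaze sample
        y = np.array([[z]]) - H @ x
        S = H @ P @ H.T + R
        K = P @ H.T @ np.linalg.inv(S)
        x = x + K @ y
        P = (np.eye(2) - K @ H) @ P
    # Extrapolate the filtered state over the system delay.
    Fd = np.array([[1.0, delay], [0.0, 1.0]])
    return float((Fd @ x)[0, 0])
```

The ROI would then be centered on the predicted position, sized to contain the targeted fraction of gaze samples.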
This paper presents a new homomorphic image cryptosystem. The idea of this system is to encrypt the reflectance component after the homomorphic transform and to embed the illumination component as a least-significant-bit watermark into the encrypted reflectance component. A comparison study is conducted between the RC6 block cipher algorithm and the chaotic Baker map algorithm for the encryption of the reflectance component. We present a security analysis of the proposed cryptosystem against entropy, brute-force, statistical, and differential attacks from a strict cryptographic viewpoint. Experimental results verify that the proposed homomorphic image cryptosystem is highly secure from the cryptographic viewpoint. The results also show that the cryptosystem has a very powerful diffusion mechanism (a small change in the plain text makes a great change in the cipher image). The homomorphic encryption using the RC6 algorithm is more secure than that using the chaotic Baker map algorithm, but it is not robust to noise. Thus, the proposed homomorphic cryptosystem can be used in different applications, depending on the core algorithm used.
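A minimal sketch of the homomorphic split and the LSB embedding step follows; the box-filter illumination estimate and all parameters are assumptions, and the cipher applied to the reflectance component is omitted:

```python
import numpy as np

def homomorphic_split(img, eps=1.0):
    """Split an image into illumination (low-pass of the log image) and
    reflectance (residual), following the homomorphic model img = i * r."""
    log_img = np.log(img.astype(np.float64) + eps)
    # Crude low-pass: 3x3 box filter as the illumination estimate (assumed).
    pad = np.pad(log_img, 1, mode='edge')
    illum = sum(pad[dy:dy + img.shape[0], dx:dx + img.shape[1]]
                for dy in range(3) for dx in range(3)) / 9.0
    return illum, log_img - illum

def embed_lsb(carrier, bits):
    """Embed a flat bit array into the LSBs of an 8-bit carrier image."""
    out = carrier.copy().ravel()
    out[:bits.size] = (out[:bits.size] & 0xFE) | bits
    return out.reshape(carrier.shape)

def extract_lsb(stego, n):
    """Read back the first n embedded bits."""
    return stego.ravel()[:n] & 1
```

In the full scheme, `embed_lsb` would carry a quantized illumination component inside the encrypted reflectance image.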
Multiview video coding (MVC) is an ongoing standardization effort. In the working draft, both motion estimation and disparity estimation are employed in the encoding procedure. This achieves the highest possible coding efficiency but results in extremely long encoding times, which hinders practical application. We propose a macroblock (MB) level adaptive search range algorithm that exploits inter-view correlation for motion estimation in MVC to reduce encoder complexity. For multiview sequences, the motion vectors of the corresponding MBs in the previously coded view are first extracted to analyze motion homogeneity. On the basis of motion homogeneity, MBs are classified into three types (regions with homogeneous motion, medium homogeneous motion, or complex motion), and the search range is adaptively determined for each MB type. Experimental results show that our algorithm reduces the average computational complexity of motion estimation by 75%, with negligible loss of coding efficiency.
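The three-way classification might look like the following sketch, where the homogeneity measure, thresholds, and range values are illustrative rather than those of the paper:

```python
import numpy as np

def adaptive_search_range(neighbor_mvs, base=32):
    """Pick a search range for an MB from the spread of motion vectors of
    the corresponding MBs in the previously coded view
    (hypothetical thresholds and range values)."""
    mvs = np.asarray(neighbor_mvs, dtype=np.float64)
    spread = mvs.std(axis=0).max() if len(mvs) > 1 else 0.0
    if spread < 1.0:        # homogeneous motion: small window suffices
        return base // 4
    elif spread < 4.0:      # medium homogeneous motion
        return base // 2
    return base             # complex motion: keep the full range
```

A small range for homogeneous regions is what saves most of the motion-estimation work.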
Nonnegative matrix factorization (NMF) is a recently developed method for dimensionality reduction, feature extraction, and data mining. Currently, no NMF algorithm offers both satisfactory efficiency for applications and sufficient ease of use. To improve the applicability of NMF, we propose a new monotonic, fixed-point algorithm called FastNMF, which implements least-squares error-based nonnegative factorization essentially according to the basic properties of parabola functions. The minimization problem corresponding to each operation in FastNMF is solved analytically by that operation, a property no existing algorithm possesses; FastNMF therefore achieves much higher efficiency, as validated by a set of experimental results. Owing to the simplicity of its design philosophy, FastNMF remains among the easiest NMF algorithms to use and the most comprehensible. In addition, theoretical analysis and experimental results show that FastNMF tends to converge to better solutions than the popular multiplicative update-based algorithms in terms of approximation accuracy.
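The idea of solving each one-dimensional quadratic subproblem analytically (the minimizer of a parabola, clipped to be nonnegative) can be sketched with a HALS-style update; this is an illustrative sketch in the same spirit, not the published FastNMF code:

```python
import numpy as np

def nmf_hals(X, k, iters=200, seed=0):
    """NMF X ~ W H by analytically minimizing each one-column least-squares
    subproblem: the objective is a parabola in each column, so its
    nonnegative minimizer has a closed form (HALS-style sketch)."""
    rng = np.random.default_rng(seed)
    m, n = X.shape
    W = rng.random((m, k)) + 0.1
    H = rng.random((k, n)) + 0.1
    for _ in range(iters):
        XHt, HHt = X @ H.T, H @ H.T
        for j in range(k):
            # Closed-form minimizer of the quadratic in W[:, j], clipped at 0.
            W[:, j] = np.maximum(1e-12, W[:, j] + (XHt[:, j] - W @ HHt[:, j]) / HHt[j, j])
        WtX, WtW = W.T @ X, W.T @ W
        for j in range(k):
            H[j, :] = np.maximum(1e-12, H[j, :] + (WtX[j, :] - WtW[j, :] @ H) / WtW[j, j])
    return W, H
```

Each column update is monotonic, since it exactly minimizes the objective along that coordinate block.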
Identification, localization, and segmentation of the thoracic, abdominal, and pelvic organs are important steps in computer-aided diagnosis, treatment planning, landmarking, and content-based retrieval of biomedical images. In this context, to aid the identification of the lower abdominal organs, to assist in image-guided surgery or treatment planning, to separate the abdominal cavity from the lower pelvic region, and to improve the localization of abdominal pathology, we propose methods to automatically identify and segment the pelvic girdle in pediatric computed tomographic (CT) images. The opening-by-reconstruction procedure was used for segmentation of the pelvic girdle. The methods include procedures to represent the pelvic surface by a quadratic model using linear least-squares estimation and to refine the model using deformable contours. The result of segmentation of the pelvic girdle was assessed quantitatively and qualitatively by comparison with segmentation performed independently by a radiologist. On the basis of quantitative analysis with 13 CT exams of six patients, comprising a total of 277 slices containing the pelvis, the average Hausdorff distance was 5.95 mm, and the average mean distance to the closest point (MDCP) was 0.53 mm. The average MDCP is comparable to the size of one pixel.
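The two reported metrics can be computed from sampled contour points as follows (a small numpy sketch; the point sets `A` and `B` are stand-ins for the two segmentations):

```python
import numpy as np

def contour_distances(A, B):
    """Hausdorff distance and mean distance to the closest point (MDCP)
    between two point sets A, B of shape (n, 2) and (m, 2)."""
    D = np.linalg.norm(A[:, None, :] - B[None, :, :], axis=2)
    d_ab, d_ba = D.min(axis=1), D.min(axis=0)   # closest-point distances
    hausdorff = max(d_ab.max(), d_ba.max())     # worst-case disagreement
    mdcp = (d_ab.mean() + d_ba.mean()) / 2.0    # symmetric average
    return hausdorff, mdcp
```

The Hausdorff distance captures the largest local error, while MDCP summarizes typical boundary agreement.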
Downscaling between various video sizes is an important aspect of transcoding, especially in communication and consumer electronics. We propose architectures that offer an effective downscaling strategy, combining an intra refresh mechanism with a partial encode architecture. A new intra refresh decision mechanism is then proposed based on a visual quality constraint. Not only does the work improve transcoded video quality, but overall complexity is also kept low. Experimental results show that the proposed work achieves a 0.5- to 2.5-dB quality improvement compared with intra refresh in the open-loop and partial encode architectures.
We present an adaptive object segmentation method based on scene change detection techniques. First, the algorithm adaptively adjusts the number of frames to skip so that the motion displacement covers the entire object shape. Next, spatial processing consisting of noise removal and boundary smoothing discards background content and removes background noise to obtain a segmented object. To evaluate the quality of segmentation, both standard benchmarks and camera imaging are employed. Results show that the proposed algorithm achieves low error ratios on various standard test sequences. The segmentation algorithm is combined with MPEG-4 coding for camera imaging in video surveillance. This system demonstrates real-time video surveillance, achieving frame rates of 10 to 30 frames/s with a CPU-based software implementation.
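The change-detection and spatial-processing steps can be sketched as frame differencing followed by a 3x3 morphological opening; the threshold and structuring element are assumptions, not the paper's choices:

```python
import numpy as np

def segment_moving_object(prev, curr, thresh=20):
    """Segment a moving object from two frames: threshold the absolute
    frame difference, then suppress isolated noise with a 3x3 morphological
    opening (erosion followed by dilation)."""
    mask = np.abs(curr.astype(np.int16) - prev.astype(np.int16)) > thresh
    # Erosion: a pixel survives only if its whole 3x3 neighborhood changed.
    pad = np.pad(mask, 1)
    shifts = [pad[dy:dy + mask.shape[0], dx:dx + mask.shape[1]]
              for dy in range(3) for dx in range(3)]
    eroded = np.logical_and.reduce(shifts)
    # Dilation: grow the surviving object region back to its original extent.
    pad = np.pad(eroded, 1)
    shifts = [pad[dy:dy + mask.shape[0], dx:dx + mask.shape[1]]
              for dy in range(3) for dx in range(3)]
    return np.logical_or.reduce(shifts)
```

Opening removes isolated noise pixels while largely preserving the object mask.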
Image sharing is a popular technology to secure important images against damage. The technology decomposes and transforms an important image to produce several other images called shadows or shares. To decode, the shared image can be reconstructed by combining the collected shadows, as long as the number of collected shadows reaches a specified threshold value. A few sharing methods produce user-friendly (i.e., visually recognizable) shadows; in other words, each shadow looks like a reduced-quality replica of a given image rather than completely meaningless random noise. This facilitates visual management of shadows. (For example, if there are 100 important images and each creates 2 to 17 shadows of its own, then it is easy to visually recognize that a stored shadow is from, say, a House image, rather than from the other 99 images.) In addition to visually recognizable shadows, progressive decoding is also a convenient feature: it provides a convenient way to view a moderately sensitive image at progressively increasing quality. Recently, Fang combined the conveniences of visually recognizable shadows and progressive decoding [W. P. Fang, Pattern Recogn., 41, 1410-1414 (2008)]. However, that method was memory-expensive because its shadows were too large. To save memory space, we propose a novel method based on modulus operations. It retains both conveniences, but its shadows are two to four times smaller than Fang's, and the visual quality of each shadow can be controlled by a simple expression.
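For flavor, a polynomial (k, n)-threshold sharing sketch mod 251 in the Thien-Lin style is shown below; note that the paper's actual scheme is modulus-based with friendly, progressively decodable shadows, which this generic threshold sketch does not reproduce:

```python
PRIME = 251  # pixel values above 250 are truncated, as in Thien-Lin sharing

def make_shares(pixels, k, n):
    """(k, n)-threshold sharing: every k pixels become the coefficients of a
    degree-(k-1) polynomial mod 251; share i stores the value at x = i."""
    pixels = [min(p, PRIME - 1) for p in pixels]
    shares = [[] for _ in range(n)]
    for j in range(0, len(pixels), k):
        coeffs = pixels[j:j + k]
        for i in range(1, n + 1):
            y = sum(c * pow(i, e, PRIME) for e, c in enumerate(coeffs)) % PRIME
            shares[i - 1].append(y)
    return shares

def lagrange_coeffs(xs, ys):
    """Coefficients of the unique degree-(k-1) polynomial through the points."""
    k = len(xs)
    total = [0] * k
    for i in range(k):
        poly, den = [1], 1
        for m in range(k):
            if m == i:
                continue
            # Multiply the basis polynomial by (x - xs[m]).
            new = [0] * (len(poly) + 1)
            for d, c in enumerate(poly):
                new[d + 1] = (new[d + 1] + c) % PRIME
                new[d] = (new[d] - xs[m] * c) % PRIME
            poly = new
            den = den * (xs[i] - xs[m]) % PRIME
        inv = pow(den, PRIME - 2, PRIME)   # modular inverse (Fermat)
        for d, c in enumerate(poly):
            total[d] = (total[d] + ys[i] * c * inv) % PRIME
    return total

def recover(share_pairs, k, length):
    """Rebuild the pixel stream from any k (x, share-list) pairs."""
    xs = [x for x, _ in share_pairs[:k]]
    out = []
    for j in range(length // k):
        ys = [s[j] for _, s in share_pairs[:k]]
        out.extend(lagrange_coeffs(xs, ys))
    return out
```

Any k of the n shadows suffice; fewer reveal nothing about the block.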
Learning-based image steganalysis is an effective and universal approach to cope with two difficulties: unknown image statistics and unknown steganographic algorithms. A crucial part of the learning-based process is the selection of low-dimensional features, which strongly impacts the accuracy of classification. A novel principal feature selection and fusion (PFSF) method is presented to reduce features, and it is then applied to image steganalysis. First, we analyze the multicollinearity among features to eliminate redundant features. Next, we implement a linear transform based on principal components analysis (PCA) and use Savage decision-making to eliminate insignificant features. Last, to reduce features further, we fuse the selected features and then select the principal features from the fused features to form a new feature set. The advantage of the proposed method is that it needs only the cover images, without requiring the availability of the stego-images during feature selection. Moreover, the proposed method greatly reduces the computational time. Our method has been tested on two feature sets, Moulin's and Fridrich's. The experimental results show that our method not only reduces the number of features by 90%, but also provides more reliable detection results than previous steganalysis methods.
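The PCA-based transform step can be sketched generically; this is only the PCA projection, not the full PFSF pipeline with multicollinearity analysis, Savage decision-making, and fusion:

```python
import numpy as np

def pca_reduce(X, d):
    """Project feature vectors (rows of X) onto the top-d principal
    components of the sample covariance matrix."""
    Xc = X - X.mean(axis=0)
    cov = Xc.T @ Xc / (len(X) - 1)
    vals, vecs = np.linalg.eigh(cov)          # ascending eigenvalues
    order = np.argsort(vals)[::-1][:d]        # keep the d largest
    return Xc @ vecs[:, order]
```

Since PCA needs only the feature matrix of cover images, this step is consistent with the method's cover-only requirement.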
Color transforms are important in the analysis and processing of images. An image color transform and its inverse should be reversible for lossless image processing applications. However, conventional color conversions are not reversible due to the finite precision of the conversion coefficients. To overcome this limitation, reversible color transforms have been developed. A reversible integer color transform requires coefficient multiplications, which are implemented with shift and add operations in most cases. We propose to use the canonical signed digit (CSD) representation of reversible color transform coefficients and to exploit their common subexpressions to reduce the complexity of the hardware implementation significantly. We demonstrate roughly a 50% reduction in computation with the proposed method.
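The CSD recoding itself is standard and can be sketched as follows; since each nonzero digit costs one shift-and-add (or subtract) term, fewer nonzero digits directly reduce the hardware cost of each coefficient multiplication:

```python
def to_csd(n):
    """Canonical signed digit representation of a nonnegative integer:
    digits in {-1, 0, 1}, least significant first, with no two adjacent
    nonzero digits (the minimal-weight signed-digit form)."""
    digits = []
    while n:
        if n & 1:
            d = 2 - (n % 4)   # 1 if n % 4 == 1, -1 if n % 4 == 3
        else:
            d = 0
        digits.append(d)
        n = (n - d) // 2
    return digits
```

For example, 7 = 111 in binary (three adders) becomes 8 - 1 in CSD (one subtractor).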
We describe an information-theoretic method for quantifying overall image quality in terms of mutual information (MI). MI is used to express the amount of information that an output image contains about an input object. The larger the MI value, the better the image quality. Therefore, the overall quality of an image can be quantitatively evaluated by measuring MI. We demonstrated by image simulation that MI increases with increasing contrast and decreases with increasing noise and blur. We investigated the utility of this method by applying it to evaluate the performance of four imaging plate detectors. We also compared evaluation results in terms of MI against those in terms of the detective quantum efficiency conventionally used to characterize the efficiency of imaging systems. Our results demonstrate that the proposed method is simple to implement and potentially useful for evaluating overall image quality.
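MI between an input object and an output image can be estimated from their joint histogram; the bin count below is an arbitrary choice for illustration:

```python
import numpy as np

def mutual_information(a, b, bins=32):
    """Mutual information (in bits) between two images, estimated from
    their joint intensity histogram."""
    hist, _, _ = np.histogram2d(a.ravel(), b.ravel(), bins=bins)
    p = hist / hist.sum()                     # joint probability
    px = p.sum(axis=1, keepdims=True)         # marginal of a
    py = p.sum(axis=0, keepdims=True)         # marginal of b
    nz = p > 0
    return float((p[nz] * np.log2(p[nz] / (px @ py)[nz])).sum())
```

An output identical to the input yields MI equal to the input entropy, while added noise and blur drive MI toward zero.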
We present a robust stereo-image coding algorithm using digital watermarking in the fractional Fourier transform (FrFT) domain with singular value decomposition (SVD). For security, the original (left stereo) image is degraded, and the watermark (right disparity map) is embedded in the degraded image. This watermarked degraded stereo image is transmitted over an insecure channel. At the receiver's end, both the watermarked image (left stereo image) and the watermark are recovered by the decoding process. The use of the FrFT, SVD, and the degradation process makes it much harder to decode the stereo images and to extract the disparity map. Moreover, processing only the watermarked image provides both the stereo and 3-D information of the scene/object. Experimental results show that the proposed algorithm efficiently achieves stereo image security.
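A singular-value embedding step can be sketched in the classic SVD-watermarking style; the FrFT stage and the degradation process are omitted here, square images are assumed, and `alpha` is an assumed embedding strength:

```python
import numpy as np

def svd_embed(cover, watermark, alpha=0.1):
    """Embed a watermark image into the singular values of a square cover
    (classic SVD-domain watermarking sketch)."""
    U, S, Vt = np.linalg.svd(cover)
    Uw, Sw, Vwt = np.linalg.svd(np.diag(S) + alpha * watermark)
    keys = (U, Vt, S, Uw, Vwt)                 # side information for extraction
    marked = U @ np.diag(Sw) @ Vt
    return marked, keys

def svd_extract(marked, keys, alpha=0.1):
    """Recover the watermark from the marked image and the stored keys."""
    U, Vt, S, Uw, Vwt = keys
    Sw = np.linalg.svd(marked, compute_uv=False)
    D = Uw @ np.diag(Sw) @ Vwt                 # rebuild diag(S) + alpha * W
    return (D - np.diag(S)) / alpha
```

Modifying only singular values leaves the cover's visual structure (carried by U and Vt) intact, which is what makes SVD embedding robust.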
We introduce a new steganography method for embedding message bits into images by utilizing the mathematical relation between the image and transform domains. The proposed method matches message bit sequences with the coefficients of the discrete Haar wavelet transform (DHWT) by modifying the pixels related to those coefficients. The matching process is applied separately to each image block, whose size is defined by the dimension of the wavelet transform. The algorithm changes only one pixel per block to represent the message bit sequence to be embedded. This minimal degradation provides better stego-image quality than state-of-the-art methods. In addition, experiments performed on different image databases show that the proposed method is more robust against both blind and targeted steganalysis methods, especially at low payload capacities.
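The transform domain involved is the 2-D Haar transform; for a 2x2 block it reduces to one average and three detail coefficients, so a change of one unit in a single pixel shifts each coefficient by a quarter unit. This sketch shows only the transform, not the paper's embedding rule:

```python
def haar2x2(block):
    """One-level 2-D Haar transform of a 2x2 block: the average (LL) and
    the horizontal, vertical, and diagonal detail coefficients."""
    a, b = block[0]
    c, d = block[1]
    ll = (a + b + c + d) / 4   # average
    hl = (a - b + c - d) / 4   # horizontal detail
    lh = (a + b - c - d) / 4   # vertical detail
    hh = (a - b - c + d) / 4   # diagonal detail
    return (ll, hl, lh, hh)
```

Because every coefficient depends on every pixel of the block, one pixel change suffices to steer the whole coefficient pattern toward a target bit sequence.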
We describe a novel approach to using soft-tissue data sets, such as computed tomography or magnetic resonance imaging, in the minimally invasive image guidance of intra-arterial and intravenous endovascular devices in neuroangiography interventions. Minimally invasive x-ray angiography procedures rely on the navigation of endovascular devices, such as guide wires and catheters, through human vessels, using C-arm fluoroscopy. Although the bone structure may be visible and the injection of iodine contrast medium allows one to guide endovascular devices through the vasculature, the soft-tissue structures remain invisible in the fluoroscopic images. We present a method for the combined visualization of soft-tissue data, a 3-D rotational angiography (3-DRA) reconstruction, and the live fluoroscopy data stream in a single fused image. Combining the fluoroscopic image with the 3-DRA vessel tree offers the advantage that endovascular devices can be located within the vasculature without additional contrast injection, while the position of the C-arm geometry can be altered freely. The additional visualization of the soft-tissue data adds contextual information to the position of endovascular devices. We address the clinical applications, the real-time aspects of the registration algorithms, and the fast fused visualization of the proposed method.
Thermal excitation of electrons is a major source of noise in charge-coupled-device (CCD) imagers. These electrons are generated even in the absence of light, hence the name dark current. Dark current is particularly important for long exposure times and elevated temperatures. The standard procedure to correct for dark current is to take several pictures under the same conditions as the real image, except with the shutter closed. The resulting dark frame is later subtracted from the exposed image. We address the question of whether the dark current produced in an image taken with a closed shutter is identical to the dark current produced in an exposure in the presence of light. In our investigation, we illuminated two different CCD chips with different intensities of light and measured the dark current generation. A surprising result of this study is that some pixels produce a different amount of dark current under illumination. Finally, we discuss the implications of this finding for dark frame image correction.
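The standard correction described above can be sketched in a few lines; combining the closed-shutter frames with a per-pixel median is a common choice, assumed here:

```python
import numpy as np

def dark_correct(exposure, dark_frames):
    """Subtract a master dark (per-pixel median of several closed-shutter
    frames) from a light exposure; clip at zero for a physical result."""
    master_dark = np.median(np.stack(dark_frames), axis=0)
    return np.clip(exposure.astype(np.float64) - master_dark, 0, None)
```

The study's finding implies this correction can be biased for pixels whose dark current differs under illumination.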
In many applications, imaging quality is significantly reduced by static or time-varying random perturbation media. An example of such a medium is a diffusive window through which we wish to image an object located behind, and not in proximity to, the window. Another example is a localized flow of turbulence (above hot surfaces such as black roads) or of aerosols that distorts the imaging resolution of objects positioned behind the perturbation. We present a new deblurring approach for obtaining highly resolved images of objects positioned behind static or time-varying random perturbation media. The proposed approach for extracting the high spatial frequencies is based on iterative computation similar to the well-known Gerchberg-Saxton algorithm for phase retrieval. By focusing the camera onto three planes positioned between the imaging camera and the perturbation, we are able to retrieve the phase distribution at those planes and then reconstruct the intensity of the object by numerical free-space propagation of the extracted complex field to the estimated position of the object.
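The iterative core is easiest to see in the classic two-plane Gerchberg-Saxton form, where the planes are linked by a Fourier transform; the paper instead uses three focus planes and free-space propagation, which this sketch does not model:

```python
import numpy as np

def gerchberg_saxton(mag_in, mag_out, iters=50):
    """Classic two-plane Gerchberg-Saxton phase retrieval: alternately
    impose the measured magnitudes in the input plane and its Fourier
    plane, keeping the current phase estimate at each step."""
    field = mag_in.astype(complex)                     # start with zero phase
    for _ in range(iters):
        F = np.fft.fft2(field)
        F = mag_out * np.exp(1j * np.angle(F))         # impose output magnitude
        field = np.fft.ifft2(F)
        field = mag_in * np.exp(1j * np.angle(field))  # impose input magnitude
    return np.angle(field)
```

The magnitude-constraint error is non-increasing over iterations, which is the property the deblurring approach exploits.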
We propose a fast motion estimation (FME) algorithm that performs comparably to the full search. It also outperforms the best FME methods recommended for the JVT/H.264 standard, despite having much lower computational complexity than those methods. Our algorithm, called predictive and adaptive rood pattern search with large motion search, incorporates motion vector prediction using spatial and temporal correlation, an adaptive search pattern, multiple refinement search paths, and an adaptive moving search window scheme specifically designed for searching large and complex motion.
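One step of a rood (plus-shaped) search around a predicted motion vector can be sketched as follows; the cost function `sad`, the arm length, and the search range are placeholders supplied by the caller, not the paper's parameters:

```python
def rood_search(sad, pred_mv, arm, search_range):
    """One rood-pattern step: evaluate the predicted center and the four
    arm points, return the candidate with the lowest matching cost."""
    cands = [pred_mv] + [(pred_mv[0] + dx, pred_mv[1] + dy)
                         for dx, dy in ((arm, 0), (-arm, 0), (0, arm), (0, -arm))]
    cands = [(x, y) for x, y in cands
             if abs(x) <= search_range and abs(y) <= search_range]
    return min(cands, key=sad)
```

A full search pattern would repeat such steps, shrinking the arm until the center wins, with refinement paths started from several predictors.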