The goal of image processing is generally to identify objects and their relationships in a digital image. The first step in this process is normally the identification of the edges in the digital image. There are many ways to do this; they are discussed briefly in the background and literature section. This paper assumes that a given object is presented and the software program is asked to determine whether that particular object is present in a given image. The final result consists of informing the user that the object was represented in the image and computing the pixel position of the search object in the given image. Of course, if the object is not found, the user should be given that information. The proposed algorithm uses cubic Bezier curves to represent both the search object and the edges identified in the image. The advantages of the Bezier curve approach are discussed.
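For background, a cubic Bezier curve is determined by four control points and evaluated through the Bernstein basis; a minimal sketch follows (the control points below are illustrative placeholders, not the paper's edge data):

```python
# Hypothetical sketch: evaluating a cubic Bezier curve from its four control
# points with the Bernstein polynomials, as might be used to represent one
# edge segment. Control points and parameterization are assumptions.

def cubic_bezier(p0, p1, p2, p3, t):
    """Return the point on the cubic Bezier curve at parameter t in [0, 1]."""
    s = 1.0 - t
    x = s**3 * p0[0] + 3 * s**2 * t * p1[0] + 3 * s * t**2 * p2[0] + t**3 * p3[0]
    y = s**3 * p0[1] + 3 * s**2 * t * p1[1] + 3 * s * t**2 * p2[1] + t**3 * p3[1]
    return (x, y)

# The curve interpolates its first and last control points:
start = cubic_bezier((0, 0), (1, 2), (3, 2), (4, 0), 0.0)  # (0.0, 0.0)
end = cubic_bezier((0, 0), (1, 2), (3, 2), (4, 0), 1.0)    # (4.0, 0.0)
```

A curve-matching scheme would compare such control-point sets rather than raw pixel chains, which is one commonly cited advantage of the Bezier representation.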
A study was performed to assess the ability of five objective quality measures to predict perceptual quality difference ratings. The objective measures included peak signal-to-noise ratio (PSNR), root-mean-square error (RMSE), maximum absolute difference (MAD), the Image Quality Metric (IQM), and the just-noticeable-difference (JND) metric. Perceptual difference ratings used the National Image Interpretability Rating Scale (NIIRS). NIIRS data from four previous studies of bandwidth compression and image processing were compared to values of the five objective measures to determine whether any of the objective metrics could be used as a substitute for the labor-intensive NIIRS ratings.
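For reference, three of the simpler objective measures named above can be sketched in a few lines; the images here are stand-in flat pixel lists for 8-bit data, not the study's imagery:

```python
import math

def rmse(a, b):
    """Root-mean-square error between two equal-length images."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)) / len(a))

def mad(a, b):
    """Maximum absolute difference between two images."""
    return max(abs(x - y) for x, y in zip(a, b))

def psnr(a, b, peak=255.0):
    """Peak signal-to-noise ratio in dB (infinite for identical images)."""
    e = rmse(a, b)
    return float("inf") if e == 0 else 20.0 * math.log10(peak / e)

ref = [10, 20, 30, 40]
degraded = [12, 20, 30, 44]
# rmse is about 2.236, mad is 4, psnr is about 41.1 dB
```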
When evaluating an imaging system, it is important to have a trustworthy evaluation measure as well as an understanding of its limitations. The signal-to-noise ratio (SNR) and several variants, such as the peak signal-to-noise ratio (PSNR), have been used abundantly as quality measures in imaging and video systems. Debate over whether SNR reflects human perception in some cases has discouraged its use, but SNR remains common in basic research as a quality measure. Recent work on evaluating video sequences suggests that SNR can follow the human perception trend if the proper formulation is used. Likewise, this paper suggests that SNR can be a proper quality measure that follows human perception if the formulation of SNR is constructed with recognition of vision system attributes. In particular, this paper introduces a new variant of the basic PSNR measure for evaluating single-frame images, and a further variant for evaluating video sequences, both based on these vision attributes. The human visual measurements used to formulate the new PSNR are presented, along with a demonstration of the new PSNR on images.
The experience of retinex image processing has prompted us to reconsider fundamental aspects of imaging and image processing. Foremost is the idea that a good visual representation requires a non-linear transformation of the recorded (approximately linear) image data. Further, this transformation appears to converge on a specific distribution. Here we investigate the connection between numerical and visual phenomena. Specifically the questions explored are: (1) Is there a well-defined consistent statistical character associated with good visual representations? (2) Does there exist an ideal visual image? And (3) what are its statistical properties?
A new approach to sensor fusion and enhancement is presented. The retinex image enhancement algorithm is used to jointly enhance and fuse data from long wave infrared, short wave infrared and visible wavelength sensors. This joint optimization results in fused data which contains more information than any of the individual data streams. This is especially true in turbid weather conditions, where the long wave infrared sensor would conventionally be the only source of usable information. However, the retinex algorithm can be used to pull out the details from the other data streams as well, resulting in greater overall information. The fusion uses the multiscale nature of the algorithm to both enhance and weight the contributions of the different data streams forming a single output data stream.
Measures of image quality are presented here that have been developed to assess both the immediate quality of an image and the potential at intermediate points in an imaging chain for enhanced image quality. The original intent of the metric(s) was to provide an optimand for interpolator design, and the metrics have subsequently been used for a number of differential image quality analyses and imaging system component designs. The metrics presented are of the same general form as the National Imagery Interpretability Rating Scale (NIIRS), representing quality as the base-2 logarithm of linear resolution, so that one unit of differential quality represents a doubling or halving of the resolution of imagery. Analysis of a simple imaging chain is presented in terms of the metrics, with conclusions regarding interpolator design, consistency of the latent and apparent image quality metrics, and the relationship between interpolator and convolution kernel design in a system where both are present. Among the principal results are an optimized division of labor between interpolators and Modulation Transfer Function Correction (MTFC) filters, consistency of the analytical latent and apparent image quality metrics with each other and with visually optimized aim curves, and an introduction to sharpening interpolator design methodology.
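The stated metric form can be made concrete in one line; a minimal sketch, assuming only that differential quality is measured as the base-2 logarithm of the linear-resolution ratio:

```python
import math

def quality_delta(resolution_from, resolution_to):
    """Differential quality in NIIRS-like units: the base-2 logarithm of the
    linear-resolution ratio, so +1 unit corresponds to a doubling of the
    resolution and -1 unit to a halving."""
    return math.log2(resolution_to / resolution_from)

one_unit = quality_delta(1.0, 2.0)  # doubling the resolution gives +1.0
```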
Classification of images requires extraction of an optimal set of features. In this paper, a method is presented that uses a genetic algorithm to create texture descriptors from features computed by a feature extraction method. A feature extraction algorithm is applied to a database of images and a training feature matrix is created. This matrix is updated by a dynamic algorithm that finds the vectors closest to the real solution in the Euclidean norm. This set forms the texture descriptor, which can then be used for classification of unknown samples. A weighted fitness function that selects the best parents in each generation has been implemented. Examples of classification are presented with the features computed from a classification algorithm. Results show that the classification performance of the features improved after applying the genetic algorithm, and the algorithm is cost efficient. The algorithm is also compared with the Learning Vector Quantization method, which quantizes the training vectors to an optimal set of codebook vectors.
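As a generic illustration of fitness-weighted parent selection, a standard roulette-wheel scheme can be sketched as follows; the population and fitness values below are placeholders, not the paper's texture descriptors:

```python
import random

def select_parent(population, fitnesses, rng):
    """Pick one parent with probability proportional to its fitness
    (roulette-wheel selection)."""
    total = sum(fitnesses)
    pick = rng.uniform(0.0, total)
    acc = 0.0
    for individual, fit in zip(population, fitnesses):
        acc += fit
        if pick <= acc:
            return individual
    return population[-1]  # guard against floating-point edge cases

rng = random.Random(0)
pop = ["a", "b", "c"]
fit = [1.0, 1.0, 8.0]
picks = [select_parent(pop, fit, rng) for _ in range(1000)]
# "c" carries 80% of the total fitness and should dominate the selections
```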
The display is a key element in the softcopy image chain. If the display is not optimized, information is lost. Studies seeking to assess the effects of bandwidth compression and image enhancement will reach false conclusions unless the display system is optimized. Although standards exist for the display of text and symbology, no such standards exist for continuous tone imagery. To help remedy this situation, a series of studies was conducted to help define guidelines for the effective display of continuous tone imagery, with emphasis on surveillance and reconnaissance imagery. Imagery of various types (visible, IR, multispectral, SAR) was displayed on cathode ray tube (CRT) and active matrix liquid crystal displays (AMLCD) that varied in luminance and spatial resolution performance. Over a series of eight studies, trained imagery analysts provided National Imagery Interpretability Rating Scale (NIIRS) ratings and Briggs target ratings (a measure of minimum discriminable target size as a function of contrast) to assess the impact of display variations. From these studies, recommendations were derived for display pixel density, contrast modulation, and luminance measures, including dynamic range, ambient light level, color temperature, and perceptual linearization. This paper defines the display performance measures used, describes the performance measurement procedures, and presents guidelines for display optimization. Results of studies supporting the guidelines are summarized. Use of the guidelines is recommended in any study involving softcopy display of continuous tone imagery.
This paper explores the relationship between information efficiency and pattern classification in hyperspectral imaging systems. Hyperspectral imaging is a powerful tool for many applications, including pattern classification for scene analysis. However, hyperspectral imaging can generate data at rates that challenge communication, processing, and storage capacities. System designs with fewer spectral bands have lower data overhead, but also may have reduced performance, including diminished capability to classify spectral patterns. This paper presents an analytic approach for assessing the capacity of a hyperspectral system for gathering information related to classification and the system's efficiency in that capacity. Our earlier work developed approaches for analyzing information capacity and efficiency in hyperspectral systems with either uniform or non-uniform spectral-band widths. This paper presents a model-based approach for relating information capacity and efficiency to pattern classification in hyperspectral imaging. The analysis uses a model of the scene signal for different classes and a model of the hyperspectral imaging process. Based on these models, the analysis quantifies information capacity and information efficiency for designs with various spectral-band widths. Example results of this analysis illustrate the relationship between information capacity, information efficiency, and classification.
Aliasing due to undersampling is commonly present in all digital imagery. The imaging characteristics of the sensor, including optical blur, detector integration blur and electronic filter blur, determine the potential for aliasing in the digital images produced by that sensor. The actual extent of aliasing in any particular image depends on the scene spatial content. In this paper, we analyze the potential for aliasing in imagery from some current remote sensing systems, including the Landsat-7 Enhanced Thematic Mapper Plus (ETM+), Terra MODIS and EO-1 Advanced Land Imager (ALI). A metric is proposed and calculated for the aliasing potential of each sensor. The results show that the EO-1 ALI is the most susceptible to aliasing because of its relatively high MTF compared to that of ETM+ and MODIS. Real image examples are used to illustrate aliasing, including the use of aliasing to measure the sensor spatial response from certain types of targets.
Many visible and infrared sampled imaging systems suffer from moderate to severe amounts of aliasing. The problem arises because the large optical apertures required for sufficient light gathering ability result in large spatial cutoff frequencies. In consumer grade cameras, images are often undersampled by a factor of twenty times the suggested Nyquist rate. Most consumer cameras employ birefringent blur filters that purposely blur the image prior to detection to reduce Moire artifacts produced by aliasing. In addition to the obvious Moire artifacts, aliasing introduces other pixel level errors that can cause artificial jagged edges and erroneous intensity values. These types of errors have led some investigators to treat the aliased signal as noise in imaging system design and analysis. The importance of aliasing is dependent on the nature of the imagery and the definition of the assessment task. In this study, we employ a laboratory experiment to characterize the nature of aliasing noise for a variety of object classes. We acquire both raw and blurred imagery to explore the impact of pre-detection antialiasing. We also consider the post detection image restoration requirements to restore the in-band image blur produced by the anti-aliasing schemes.
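The folding effect behind these artifacts can be demonstrated numerically; a minimal sketch, assuming ideal point sampling rather than the laboratory setup used in the study:

```python
import math

# A 9 Hz sinusoid sampled at 10 Hz (Nyquist frequency 5 Hz) produces exactly
# the same samples as a 1 Hz sinusoid of opposite phase: the high frequency
# "folds" below Nyquist and is indistinguishable after sampling.

fs = 10.0  # sampling rate in Hz
n = range(20)
high = [math.sin(2 * math.pi * 9.0 * k / fs) for k in n]    # 9 Hz > Nyquist
alias = [-math.sin(2 * math.pi * 1.0 * k / fs) for k in n]  # folded to 1 Hz

max_gap = max(abs(h - a) for h, a in zip(high, alias))
# max_gap is ~0: the two signals are identical at the sample points
```

This is why pre-detection blur (e.g., a birefringent filter) is applied: once the samples are taken, no post-processing can separate the folded energy from genuine in-band content.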
Aliasing is introduced in sampled imaging systems when light level requirements dictate using a numerical aperture that passes spatial frequencies higher than the Nyquist frequency set by the detector. One method to reduce the effects of aliasing is to modify the optical transfer function so that frequencies that might otherwise be aliased are removed. This is equivalent to blurring the image prior to detection. However, blurring the image introduces a loss in spatial detail and, in some instances, a decrease in the image signal-to-noise ratio. The tradeoff between aliasing and blurring can be analyzed by treating aliasing as additive noise and using information density to assess the imaging quality. In this work we use information density as a metric in the design of an optical phase-only anti-aliasing filter. We used simulated annealing to determine a pupil phase that modifies the system optical transfer function so that the information density is maximized. Preliminary results indicate that maximization of the information density is possible. The increase in information density appears to be proportional to the logarithm of the electronic signal-to-noise ratio and insensitive to the number of phase levels in the pupil phase. We constrained the pupil phase to 2, 4, 8, and 256 phase quantization levels and found little change in the information density of the optical system. Random and zero initial-phase inputs also generated results with little difference in their final information densities.
Wavefront Coded imaging systems are jointly optimized optical and digital imaging systems that can increase the performance and/or reduce the cost of modern imaging systems by reducing the effects of aberrations. Aberrations that can be controlled through Wavefront Coding include misfocus, astigmatism, field curvature, chromatic aberration, temperature related misfocus, and assembly related misfocus. The design and simulation of these systems are based on a model that describes all of the important aspects of the optics, detector, and digital processing being used. These models allow theoretical calculation of ideal MTFs and signal processing related parameters for new systems. These parameters are explored for extended depth of field, field curvature, and temperature related misfocus effects.
Effective and efficient video streaming has become a popular research topic in recent years. Significant research has been done on streaming techniques for MPEG video; in comparison, research on streaming wavelet-compressed video is far from adequate. In this paper, we exploit the features of 3-D wavelet video in streaming applications to design a novel streaming framework. The new framework takes the video content into consideration in both the compression and transmission schemes for optimal performance. The framework consists of two parts: a compression module specially designed for video streaming and a robust transmission module. In the compression module, the input video is first segmented into GOFs (groups of frames) by a dynamic grouping approach such that the frames of the same GOF have similar content. Then an integer-based 3-D wavelet transform is used to achieve real-time computation. To prevent error propagation in the stream, a bounded coding scheme is employed such that the output bitstreams of different subbands are independent. In the transmission module, a content-based retransmission scheme is adopted to minimize the distortion caused by packet loss, subject to both rate and delay constraints. The optimized transmission is achieved by a fast decision-making system that employs a heuristic algorithm to drop the least significant packets, if necessary, to ensure the delivery of more important ones in the same GOF. Furthermore, optimal bandwidth allocation between different GOFs is implemented to keep the video quality consistent regardless of the packet loss rate. Experimental results are provided to show the effectiveness of the designed framework.
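The grouping step can be sketched generically: open a new GOF whenever the frame-to-frame difference exceeds a threshold. The frames and threshold below are illustrative assumptions, not the paper's actual measure of content similarity:

```python
# Hedged sketch of content-based GOF segmentation. Frames are flat pixel
# lists; a scene change is declared when the mean absolute difference from
# the previous frame exceeds an (assumed) threshold.

def segment_gofs(frames, threshold=10.0):
    """Split a frame sequence into groups of frames with similar content."""
    gofs = [[frames[0]]]
    for prev, cur in zip(frames, frames[1:]):
        diff = sum(abs(p - c) for p, c in zip(prev, cur)) / len(cur)
        if diff > threshold:
            gofs.append([cur])    # large change: start a new GOF
        else:
            gofs[-1].append(cur)  # similar content: extend the current GOF
    return gofs

scene_a = [[0, 0, 0, 0], [1, 0, 0, 1]]
scene_b = [[90, 90, 90, 90], [91, 90, 89, 90]]
gofs = segment_gofs(scene_a + scene_b)
# gofs -> two groups of two frames each, split at the scene change
```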
A new XML-based MPEG-4 coding system for video streaming is proposed in this research. The XML technology is applied to MPEG-4 coded video contents to provide more flexibility for video manipulation. To be more specific, for every compressed video file, we generate a corresponding XML document that works as an indexing file as well as a streaming facilitator. This XML-based MPEG-4 coded bitstream can be transmitted over the high-level HTTP protocol or the low-level IP or RTP protocols. In this work, we examine the design of this XML-based streaming system, including the overhead of the XML document, its transmission, the processing requirements, and error resilience. Experimental results are provided to compare the performance of the traditional MPEG-4 streaming solution and the proposed XML-based solution.
We present a novel data compression technique, called recursive interleaved entropy coding, that is based on recursive interleaving of variable-to-variable-length binary source codes. A compression module implementing this technique has the same functionality as arithmetic coding and can be used as the engine in various data compression algorithms. The encoder compresses a bit sequence by recursively encoding groups of bits that have similar estimated statistics, ordering the output in a way that is suited to the decoder. As a result, the decoder has low complexity. The encoding process for our technique is adaptable in that each bit to be encoded has an associated probability-of-zero estimate that may depend on previously encoded bits; this adaptability allows more effective compression. Recursive interleaved entropy coding may have advantages over arithmetic coding, including most notably the admission of a simple and fast decoder. Much variation is possible in the choice of component codes and in the interleaving structure, yielding coder designs of varying complexity and compression efficiency; coder designs that achieve arbitrarily small redundancy can be produced. We discuss coder design and performance estimation methods. We present practical encoding and decoding algorithms, as well as measured performance results.
In the present work, we propose a system for error-resilient coding of synthetic aperture radar imagery, whereby regions of interest and background information are coded independently of each other. A multiresolution, constant-false-alarm-rate (CFAR) detection scheme is utilized to discriminate between target regions and natural clutter. Based upon the detected target regions, we apply less compression to targets, and more compression to background data. This methodology preserves relevant features of targets for further analysis, and preserves the background only to the extent of providing contextual information. The proposed system is designed specifically for transmission of the compressed bit stream over noisy wireless channels. The coder uses a robust channel-optimized trellis-coded quantization (COTCQ) stage that is designed to optimize the image coding based upon the channel characteristics. A phase scrambling stage is also incorporated to further increase the coding performance, and to improve the robustness to nonstationary signals and channels. The resulting system dramatically reduces the bandwidth/storage requirements of the digital SAR imagery, while preserving the target-specific utility of the imagery, and enabling its transmission over noisy wireless channels without the use of error correction/concealment techniques.
A data hiding method called steganography has been studied. Steganography is a method of transmitting data secretly by embedding it in innocuous image data; the image serves only as a cover for the hidden information. In this summary, we propose a method that incorporates data hiding into an adaptive lossless coding method. Adaptive coding is effective for data compression because image signals are usually not stationary; for example, better coding performance can be obtained by dividing an image into blocks and using suitable coding parameters for each block. There are two types of adaptive coding schemes: one transmits the suitable coding parameters, block by block, as overhead information, and the other estimates the parameters from already transmitted data, so that no coding parameters need to be transmitted. We propose a data hiding method for the former type, i.e., the adaptive method with overhead information. In the proposed method, the coding parameters are controlled by the embedded data, and the original image data are never changed.
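One way such parameter-controlled embedding can work (an illustrative sketch, not the paper's exact scheme) is to choose between two near-equivalent coding parameters per block according to the hidden bit, so that the image data itself is never touched:

```python
# Assumed setup: each block has two candidate coding parameters of nearly
# equal coding efficiency, one even and one odd. The transmitted choice
# carries one hidden bit in its parity.

def embed(block_params, bits):
    """For each block, send the even candidate for bit 0, the odd for bit 1."""
    return [odd_p if bit else even_p
            for (even_p, odd_p), bit in zip(block_params, bits)]

def extract(sent_params):
    """Recover the hidden bits from the parity of the sent parameters."""
    return [p % 2 for p in sent_params]

candidates = [(4, 5), (2, 3), (6, 7)]  # (even, odd) candidate pairs per block
hidden = [1, 0, 1]
sent = embed(candidates, hidden)       # overhead actually transmitted
recovered = extract(sent)              # decoder reads back the hidden bits
```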
A novel real-time approach to optical image processing for evaluating the difference between two images is carried out by implementing a phase-reversal concept in a Twyman-Green interferometer. The phase reversal is accomplished by varying the pressure within an air-filled quartz cell inserted in one of the arms of the interferometer. Initially, the interferometer is aligned to obtain broad interference fringes in the cell region. Then the input images are introduced in both arms of the interferometer and adjusted for exact registration as seen in the plane of observation. By introducing a phase change of π radians between the two arms of the interferometer, the difference between the two input images is detected in real time. Phase-shift calibration and image subtraction carried out by the proposed method are presented with experimental results.
Enhanced false color images from mid-IR, near-IR (NIR), and visible bands of the Landsat thematic mapper (TM) are commonly used for visually interpreting land cover type. Described here is a technique for sharpening or fusion of NIR with higher resolution panchromatic (Pan) that uses a shift-invariant implementation of the discrete wavelet transform (SIDWT) and a reported pixel-based selection rule to combine transform coefficients. There can be contrast reversals (e.g., at soil-vegetation boundaries between NIR and visible band images) and consequently degraded sharpening and edge artifacts. To improve performance for these conditions, I used a local area-based correlation technique originally reported for comparing image-pyramid-derived edges for the adaptive processing of wavelet-derived edge data. Also, using the redundant data of the SIDWT improves edge data generation. There is additional improvement because sharpened subband imagery is used with the edge-correlation process. A reported technique for sharpening three-band spectral imagery used forward and inverse intensity, hue, and saturation transforms and wavelet-based sharpening of intensity. This technique had limitations with opposite contrast data, and in this study sharpening was applied to single-band multispectral-Pan image pairs. Sharpening used simulated 30-m NIR imagery produced by degrading the spatial resolution of a higher resolution reference. Performance, evaluated by comparison between sharpened and reference image, was improved when sharpened subband data were used with the edge correlation.
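A commonly reported pixel-based selection rule for combining transform coefficients is maximum-magnitude selection; a minimal sketch follows, assuming flattened coefficient arrays (the edge-correlation refinement described above is not shown):

```python
# Generic wavelet-fusion selection rule: at each position, keep the detail
# coefficient with the larger magnitude, on the assumption that larger
# coefficients correspond to stronger edges. Input arrays are placeholders.

def fuse_coefficients(coeffs_a, coeffs_b):
    """Select, per position, the transform coefficient of larger |value|."""
    return [a if abs(a) >= abs(b) else b for a, b in zip(coeffs_a, coeffs_b)]

nir_band = [0.1, -2.0, 0.3, 1.5]
pan_band = [0.5, 1.0, -0.2, -3.0]
fused = fuse_coefficients(nir_band, pan_band)
# fused == [0.5, -2.0, 0.3, -3.0]
```

Note that at a contrast reversal the two coefficients have opposite signs, so this plain rule can pick inconsistently from the two sources; that failure mode is precisely what the local-area edge-correlation technique is meant to mitigate.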
Effective management of multiple media servers integrated to deliver real-time multimedia content to users for video-on-demand (VoD) applications is examined in this research. We propose a random early migration (REM) scheme to reduce the service rejection rate, to balance the load of media servers and to reduce the service delay. When a new request is dispatched to a media server, REM compares the current service load with preset thresholds and decides whether request migration is needed with a certain probability, which is a function of the service load. To control the video access rate, we apply a time window to predict the video access probability in the near future and dynamically update the video content in media servers. Simulation results demonstrate that REM alone and/or REM with dynamic content update can achieve an enhanced system performance.
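The load-dependent migration decision can be sketched along the lines of RED queue management; the threshold values below are assumptions for illustration, not the paper's presets:

```python
# Hedged sketch of a random-early-migration decision: below min_th never
# migrate, above max_th always migrate, and in between migrate with a
# probability that grows linearly with the service load.

def migration_probability(load, min_th=0.5, max_th=0.9):
    """Probability of migrating a newly dispatched request at a given load."""
    if load <= min_th:
        return 0.0
    if load >= max_th:
        return 1.0
    return (load - min_th) / (max_th - min_th)

mid = migration_probability(0.7)  # midpoint of the thresholds: ~0.5
```

Migrating probabilistically before the server is saturated, rather than deterministically at a hard limit, is what lets the scheme smooth load across servers and avoid synchronized rejections.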
In this paper, we present a real-time implementation of image filtering for the removal of impulsive noise and mixtures of impulsive and multiplicative noise, with detail preservation, on the TMS320C6701 DSP. The filtering scheme consists of two filters connected in cascade. In the first stage, we use the MM-KNN (Median M-type K-Nearest Neighbor) filter to provide detail preservation and impulsive noise rejection. The second stage uses an M-filter to provide multiplicative noise suppression. We use different types of influence functions in the M-estimator to provide better noise suppression. Extensive simulation results demonstrate that the proposed filter consistently outperforms other filters by balancing the tradeoff between noise suppression and detail preservation.
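The role of the impulse-rejecting first stage can be illustrated with a plain 1-D median filter (a simplification: the MM-KNN filter described above is considerably more elaborate):

```python
# Why a median-type stage suppresses impulses: an isolated outlier never
# reaches the middle of a sorted neighborhood, while a genuine step edge
# does. Signal values here are illustrative.

def median_filter(signal, half_window=1):
    """Replace each sample with the median of its neighborhood."""
    out = []
    n = len(signal)
    for i in range(n):
        lo = max(0, i - half_window)
        hi = min(n, i + half_window + 1)
        window = sorted(signal[lo:hi])
        out.append(window[len(window) // 2])
    return out

noisy = [10, 10, 255, 10, 10, 80, 80, 80]  # one impulse at index 2
clean = median_filter(noisy)
# the impulse is removed while the 10 -> 80 step edge survives intact
```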
In this paper, the implementation of an object-based multiview 3D display using object segmentation and adaptive disparity estimation is proposed, and its performance is analyzed by comparison with conventional disparity estimation algorithms. In the proposed algorithm, segmented objects are first obtained by region growing from the input stereoscopic image pair; then, to synthesize the intermediate view effectively, the matching window size for intermediate view reconstruction (IVR) is adaptively selected according to the magnitude of the feature value extracted from the input stereo image pair. Experimental results on IVR using the proposed algorithm are also discussed and compared with those of the conventional algorithms.
In this paper, a new and effective 3D object remote-tracking system using disparity information is suggested, in which the target object is not only tracked in real time but also displayed in 3D at the receiver using only the disparity information taken from the stereo object images. Using the disparity information of the stereo image sequences, the target object is detected and its location coordinates are extracted; with these coordinate values, stereo object tracking is accomplished by controlling a stereo camera mounted on a pan/tilt unit. The disparity information is then sent to the receiver together with a reference image for effective 3D reconstruction of the tracked target image. Experimental results show that the proposed system requires much less transmitted data and a shorter processing time for real-time 3D target tracking than conventional algorithms.
In general, the stereo input images obtained by a parallel-axis camera system lack a common field of view and object disparity information because of the system's configuration limitations. In this paper, a new scheme is proposed in which this stereopsis is improved by adaptively controlling the disparity of the input stereo images. In the proposed method, each object in the stereo input image is segmented using its relative distance information, and each segmented object is then horizontally shifted according to these values; in particular, a differential shifting scheme is applied to maximize the stereopsis of the stereo input images. Experimental results show that the proposed method improves the disparity of the original stereo image by about 1.6 dB in PSNR. In addition, a 6-view stereoscopic image synthesized by the proposed method is presented.
A crucial step in image restoration involves deconvolving the true object from noisy and often poorly sampled image data. Deconvolution under these conditions represents an ill-posed inversion problem, in that no unique, computationally stable solution exists. We propose a statistical-information-based approach to regularize the deconvolution process. Using Shannon information, one monitors the information about the object that is processed during the deconvolution in order to obtain an optimal stopping criterion and hence the "best" solution to the inversion problem. The optimal stopping criterion is based on how Shannon information changes in the spatial frequency domain as the deconvolution proceeds. We present results for the Maximum Entropy Method (MEM) and Richardson-Lucy (RL) non-linear deconvolution techniques.
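The RL iteration itself is compact; a minimal 1-D sketch under simplifying assumptions (circular convolution, a symmetric normalized PSF so the blur operator is self-adjoint, no noise, and a fixed iteration count standing in for the information-based stopping rule proposed above):

```python
def convolve_circular(x, psf):
    """Circular convolution with the kernel centered on each output sample."""
    n, m = len(x), len(psf)
    c = m // 2
    return [sum(psf[j] * x[(i - j + c) % n] for j in range(m)) for i in range(n)]

def richardson_lucy(observed, psf, iterations=50):
    """RL iteration: estimate *= adjoint_blur(observed / blur(estimate)).
    With a symmetric PSF the same convolution serves as the adjoint."""
    estimate = [1.0] * len(observed)
    for _ in range(iterations):
        blurred = convolve_circular(estimate, psf)
        ratio = [o / max(b, 1e-12) for o, b in zip(observed, blurred)]
        correction = convolve_circular(ratio, psf)
        estimate = [e * c for e, c in zip(estimate, correction)]
    return estimate

psf = [0.25, 0.5, 0.25]                 # normalized, symmetric blur kernel
truth = [0.0, 0.0, 4.0, 0.0, 0.0, 0.0]  # a single bright point
observed = convolve_circular(truth, psf)
restored = richardson_lucy(observed, psf)
# total flux is conserved and the estimate re-concentrates near the spike
```

Because each iteration recovers more of the object's high-frequency content, stopping the loop at the right point acts as the regularizer; the information-based criterion replaces the fixed `iterations` count used in this sketch.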