This paper presents a method for performing model-based motion estimation for ground tracking from an airborne
video. Different model geometries are used for the input into a four-step search (4-SS) motion estimation
algorithm. The algorithm allows the user to select both the model geometry and the center of mass of the vehicle.
The estimated center of mass is determined from the 4-SS motion estimation and used as the input for the
comparison between the next two images in the sequence. The model geometries considered are a rectangle
internal to the vehicle, a rectangle external to the vehicle, or a polygon that matches the general shape of the
vehicle. Each geometry is compared against ground truth data. A goodness measure is used to compare tracking
quality. Initial results indicate both geometries track well as long as the model geometry is adjusted based on
aircraft motion and as long as variations such as specularities are kept to a minimum.
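The four-step search at the core of the method can be sketched as a standard block-matching routine. The sketch below is a minimal, generic 4-SS implementation over grayscale frames, not the paper's code; the sum-of-absolute-differences cost and the step schedule are the conventional choices for this algorithm.

```python
import numpy as np

def sad(block, frame, y, x):
    """Sum of absolute differences between a template block and the
    candidate block at (y, x) in the search frame."""
    h, w = block.shape
    if y < 0 or x < 0:
        return np.inf                      # candidate falls outside the frame
    cand = frame[y:y + h, x:x + w]
    if cand.shape != block.shape:
        return np.inf
    return float(np.abs(block.astype(int) - cand.astype(int)).sum())

def four_step_search(block, frame, y0, x0):
    """Classic four-step search (4-SS): up to three rounds with a 3x3
    checking pattern of step 2, then a final 3x3 step-1 refinement.
    Returns the estimated motion vector (dy, dx)."""
    cy, cx = y0, x0
    step = 2
    for _ in range(3):                     # large-pattern search rounds
        pts = [(cy + dy, cx + dx) for dy in (-step, 0, step)
                                  for dx in (-step, 0, step)]
        best = min(pts, key=lambda p: sad(block, frame, *p))
        if best == (cy, cx):               # minimum at center: go to final step
            break
        cy, cx = best
    # final 3x3 step-1 refinement
    pts = [(cy + dy, cx + dx) for dy in (-1, 0, 1) for dx in (-1, 0, 1)]
    cy, cx = min(pts, key=lambda p: sad(block, frame, *p))
    return cy - y0, cx - x0
```

A vehicle model (rectangle or polygon) would restrict which pixels of the block enter the SAD; here the whole rectangular block is used for simplicity.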
Selecting the correct feature set is an essential basis for video sequence analysis that leads to applications such as tracking and recognition of vehicles. This paper selects diverse multiple features and tests their accuracy for tracking a static vehicle. The static vehicle images are captured with airborne infrared and color video cameras. The camera collects 30 frames per second of compressed video in Motion JPEG format. The diverse features are selected from representative histogram-based, invariant-based, spatial-temporal-based and the center-symmetric autocorrelation-based family of features. A small dataset of airborne video sequences that include static vehicles with variations in quality, orientation, resolution, foreground lighting and background lighting is used to test feature selection and static tracking. The track of the vehicles is hand selected frame-to-frame to create a truth track. The ability of each feature to maintain track is tested and scored based on distance from the truth track. Once a significant break from track occurs, the truth data is used to reacquire track. One goodness score is based on how often a feature breaks track. This analysis shows promise for identifying appropriate features for improved tracking results. The suggested algorithm is demonstrated on only a few video sequences with limited variations in operating conditions but demonstrates improvement possibilities for near real-time application.
Resolution enhancement in video sequences can involve a variety of enhancement techniques including deinterlacing, de-blurring, motion compensation, super-resolution and more. Depending on the video artifacts present, one or more techniques are chosen and applied to each complete video frame. This paper suggests using resolution improvement techniques, not on the complete frame, but only on specific moving vehicles within the frame. Using improved resolution only on the vehicles being tracked offers the possibility for improved local vehicle identification and enhancement to the near real-time vehicle tracking approaches. Additionally, the approach can save the computational cost associated with complete image frame enhancement. The paper develops a spatial-temporal deinterlaced resolution-ratio enhanced approach for a vehicle being tracked. For deinterlacing, this paper uses a variant of a recently introduced spatial-temporal technique which uses directional interpolation and motion compensation. The technique uses intra-field spatial and inter-field temporal interpolation. The suggested algorithm is demonstrated on a few video sequences and shows promise for near real-time application. The final processing result integrates targets through feature tracking, creating the psychophysical "popout" effect with a higher target-to-background resolution ratio.
Object shape in an image can vary for many reasons. Therefore, a goal of object recognition research is to create algorithms that are able to accurately recognize objects with variability in shape. This paper suggests recognizing shape through an assessment of different measures. There are two primary approaches to measuring shape variations for recognition: measure, compare, and match; and compare, measure, and match. In the first approach, attributes of the object shape are measured and compared with the same measured attributes of the template shape. This paper focuses on the second approach, which first compares the object and template jointly and then creates a normalized measure for matching. This approach is called multiple joint comparative normalized measures (MJCNM). Confidence in the match is shown to be better against certain shape variabilities when using MJCNM than when using just one shape measure. In particular, the MJCNM approach here uses matched-filter, Procrustes, partial-directed Hausdorff, and percent-pixels-same measures. An experimental result is given that demonstrates the implementation and usefulness of that approach.
One of the key variables that have been used for identifying objects in two-dimensional imagery is shape. Humans have the ability to discriminate between shapes and can perceive an imperfect shape as belonging to a particular object class. Each object class has a boundary where a human perceives an object as belonging to one class or the other. Perceptual classification boundaries define the human perception that classifies a shape as belonging to a particular object class. In this paper, the perceptual difference between several
primitive two-dimensional object shapes is examined. Unlike humans, computer recognition algorithms are typically designed to recognize a finite number of classes of objects. This paper focuses on two-class and three-class recognition problems using simple primitive shapes consisting of a single-filled, closed loop
contour. To determine the perceptual classification boundary, one primitive shape is morphed into another, and a group of human observers is used to quantify where the perceived boundary is located between objects. Various shape measures are then applied to the primitive shapes to determine how well some current measures can
quantify the perceived classification boundary. The addition of Gaussian noise to the primitive two-dimensional shapes is also examined along with quantitative and perceived human results. The results suggest that the tested quantitative measures do not provide results similar to human perception. Some measures are better than others at achieving perceptual classification. The paper demonstrates that an approximate perceptual classification measure can be achieved by using human observer perceptual thresholds along with a quantitative measure.
In the real world of remote sensing, rarely does the extracted object precisely match a stored template of that object. A certain level of uncertainty must be permitted between the stored template and the extracted object. One solution to deal with this uncertainty is to evaluate measures between the object and template. Many measures have been introduced independently in the literature for evaluating the statistical nature of the extracted objects, such as a variety of shape and texture measures. The object is measured and compared to similar measures taken from the template. This paper suggests using measures extracted through a joint comparative process of template and object. It also suggests using multiple measures from the joint class of measures as opposed to using an individual measure to determine the sufficiency of the match. In particular, this paper demonstrates the value of using multiple comparative shape measures as opposed to one particular shape measure to achieve confidence in a match. The multiple-shape measure approach uses a matched filter measure, a Procrustes metric, a partial-directed Hausdorff measure, and a percent-pixels-same measure. Each shape measure gives slightly different insight about the shape comparison, which allows more confidence in the match. An experimental result is given that demonstrates the implementation and usefulness of the multiple comparative measure approach for recognizing objects from remotely sensed imagery.
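Two of the measures named above admit compact definitions on binary segmentation masks and boundary point sets. The following is a minimal sketch of my own (not the paper's implementation; the 90% quantile fraction is an assumed default) illustrating percent-pixels-same and a partial-directed Hausdorff distance:

```python
import numpy as np

def percent_pixels_same(a, b):
    """Percent-pixels-same between two equal-size binary masks:
    the fraction of pixels on which the two segmentations agree."""
    return float((a == b).mean())

def partial_directed_hausdorff(a_pts, b_pts, frac=0.9):
    """Partial-directed Hausdorff distance from point set A to B:
    the frac-th quantile (rather than the maximum) of each A point's
    distance to its nearest neighbor in B, which tolerates outliers."""
    # pairwise Euclidean distances via broadcasting (fine for small sets)
    d = np.sqrt(((a_pts[:, None, :] - b_pts[None, :, :]) ** 2).sum(-1))
    nearest = d.min(axis=1)            # distance from each A point to B
    k = max(int(np.ceil(frac * len(nearest))) - 1, 0)
    return float(np.sort(nearest)[k])
```

With frac=1.0 the function reduces to the ordinary directed Hausdorff distance; lowering frac discards the worst-matching boundary points before taking the maximum.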
In object recognition, one goal of matched filter design has been to define a matching function that produces an ideal correlation peak when a target object in an image scene precisely matches the pre-defined template object. The benefit of such a function is that it guarantees a precise detection/identification. The ideal correlation-based function that defines the match has been described as a Dirac delta in the correlation plane. This paper suggests that if similarity as opposed to precise matching is the goal of the correlation function, then only using current two-dimensional correlation techniques will result in a non-Dirac delta in the correlation plane. This paper suggests basing the design of the function on the object recognition goal. The approach for correlation function design is demonstrated using psychophysical evidence for class differentiation. A function is designed based on psychophysical experimental results for distinguishing between two simple objects and their deformations: a square and a circle.
Synthetic aperture radar (SAR) imagery is one of the most valuable sensor data sources for today's military battlefield surveillance and analysis. The collection of SAR images by various platforms (e.g. Global Hawk, NASA/JPL AIRSAR, etc.) and on various missions for multiple purposes (e.g. reconnaissance, terrain mapping, etc.) has resulted in a vast amount of data over wide surveillance areas. The pixel-to-eye ratio is simply too high for human analysts to rapidly sift through massive volumes of sensor data and yield engagement decisions quickly and precisely. Effective automatic target recognition (ATR) algorithms to process this growing mountain of information are clearly needed. However, even after many years of research, SAR ATR still remains a highly challenging research problem. What makes SAR ATR problems difficult is the amount of variability exhibited in the SAR image signatures of targets and clutter. There are many different factors that can cause the variability in SAR image signatures. It is conventional to categorize those factors into three major groups known as extended operating conditions (OC's) of target, environment and sensor. The group of sensor OC's includes SAR sensor parametric variations in depression angle, polarization, squint angle, frequencies (UHF, VHF, X band) and bandwidth, pulse repetition frequency (PRF), multi-look, antenna geometry and type, image formation algorithms, platform variations and geometric errors, noise level, etc. Many existing studies of SAR ATR have traditionally focused on the variability of SAR signatures caused by a sub-space of target OC's and environment OC's. Similar studies of SAR parametric variations in sensor OC's have been very limited due to the lack of data across the sensor OC's and the inherent difficulties as well as the high cost in supplying various sensor OC's during the data collections.
This paper will present the results of a comprehensive survey of SAR ATR research works involving the subjects of various sensor OC's. The survey found that, to date, very little research has been devoted to the problems of sensor OC's and their effects on the performance of SAR image based ATR algorithms. Due to the importance of sensor OC's in ATR applications, we have developed a research platform as well as important focus areas of future research in SAR parametric variations. A number of baseline ATR algorithms in the research platform have been implemented and verified. We have also planned and started a SAR data simulation process across the spectrum of sensor OC's. A road-map for the future research of SAR parametric variations (sensor OC's) and their impact on ATR algorithms is laid out in this paper.
Image segmentation is a process to extract and organize information energy in the image pixel space according to a prescribed feature set. It is often a key preprocess in automatic target recognition (ATR) algorithms. In many cases, the performance of image segmentation algorithms will have significant impact on the performance of ATR algorithms. Due to the variations in feature set definitions and the innovations in the segmentation processes, a large number of image segmentation algorithms exist in the ATR world. Recently, the authors have investigated a number of measures to evaluate the performance of segmentation algorithms, such as Percentage Pixels Same (pps), Partial Directed Hausdorff (pdh) and Complex Inner Product (cip). In the research, we found that the combination of the three measures shows effectiveness in the evaluation of segmentation algorithms against truth data (human master segmentation). However, it is still unknown what impact those measures have on the performance of ATR algorithms, which is commonly measured by Probability of detection (PDet), Probability of false alarm (PFA), Probability of identification (PID), etc. In all practical situations, ATR boxes are implemented without a human observer in the loop. The performance of synthetic aperture radar (SAR) image segmentation should therefore be evaluated in the context of ATR rather than human observers.
This research establishes a segmentation algorithm evaluation suite involving segmentation algorithm performance measures as well as the ATR algorithm performance measures. It provides a practical quantitative evaluation method to judge which SAR image segmentation algorithm is the best for a particular ATR application. The results are tabulated based on some baseline ATR algorithms and a typical image segmentation algorithm used in ATR applications.
This paper presents a new paradigm for feature extraction and segmentation of SAR imagery. Most of the existing segmentation algorithms explore features based on the variations in image intensity, contrast and texture, mimicking human SAR scene analysts. Like medical ultrasound imaging, CT imaging and magnetic resonance imaging, the imaging modality of SAR is not consistent with the natural ability of human vision. That is why we need trained experts to analyze those medical images as well as SAR images. In the ATR application, SAR imagery will be processed and segmented by automatic computer algorithms without human analysts in the loop. Therefore, in order to fully utilize the capability of SAR as an advanced surveillance instrument, we need to develop a feature space that is based on the physics of the SAR imaging modality, not on human visual perception. After the definition of the feature space, we can process the SAR sensor data in the image domain or even before image formation. In this research, we focus on establishing a new SAR image segmentation processing paradigm based on discrete frame theory. We will show the framework of the paradigm on a limited feature space covering some SAR attributes like targets and shadows. After setting up the feature space, we will develop a discrete frame to transform SAR sensor data into a feature space representation. The feature space representation consists of transform coefficients that indicate the location and strength of the features. Those transform coefficients can be further manipulated by classification algorithms for ATR exploitation.
When evaluating an imaging system, it is important to have a confident evaluation measure as well as an understanding of the limitations of the evaluation measure. The signal-to-noise ratio (SNR) and several variants such as the peak signal-to-noise ratio (PSNR) have been used abundantly as quality measures in imaging and video systems. Debate over whether SNR reflects human perception has in some cases discouraged the use of SNR, but SNR is still used in basic research as a quality measure. Recent work for evaluating video sequences suggests that SNR can follow the human perception trend if the proper formulation is used. Likewise, this paper suggests that SNR can be proper and follow human perception for evaluating quality if a proper formulation of SNR is constructed based on recognition of vision system attributes. In particular, this paper suggests a new variant of the basic PSNR measure for evaluating single frame images based on recognition of the vision attributes. A new variant of the PSNR is introduced for evaluating video sequences based on vision attributes. The human visual measurements used to formulate the new PSNR are presented along with a demonstration of the new PSNR on images.
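For reference, the baseline PSNR that the proposed variants build on is the standard formulation below. This is a generic sketch; the vision-attribute weighting described in the abstract is not reproduced here.

```python
import numpy as np

def psnr(ref, test, peak=255.0):
    """Standard peak signal-to-noise ratio in dB between a reference
    image and a test image, for a given peak pixel value."""
    mse = np.mean((ref.astype(float) - test.astype(float)) ** 2)
    if mse == 0:
        return float('inf')            # identical images: infinite PSNR
    return 10.0 * np.log10(peak ** 2 / mse)
```

Vision-based variants typically replace the plain mean-squared error with a perceptually weighted error term while keeping this logarithmic form.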
Correlators have been used for detecting shapes but not as often for measuring shape similarity. The complex inner product (CIP) has been used in various formulations as a shape similarity measure. The CIP is essentially a one-dimensional correlation approach to measuring similarity. One-dimensional variants of the correlation techniques including the matched filter (MF), phase-only filter (POF), and amplitude-modulated phase only filter (AMPOF) are shown to measure shape similarity in a trend that approaches human perception; however, clear performance differences are noted. The results show that the best correlator for measuring shape similarity is not the best correlator for detecting a shape. It is suggested that detection and shape similarity are fundamentally different functions that are in opposition to some degree. Ideal detection and ideal similarity measurement functions are explored. The degree to which various formulations of correlators approach the ideal functions of detection and similarity measurement is shown, as well as results from human psychophysical experiments.
Measuring a system's capability to acquire accurate three-dimensional shape is important for validating the system for a particular application. Various system factors are reviewed that contribute to inaccurate shape. As shown in this paper, different shape measures do not individually provide a complete evaluation but yield different information depending on the type of error. A partial-directed Hausdorff (PDH) and complex inner product (CIP) measure that were previously introduced to measure two-dimensional shapes are now extended to measure three-dimensional shapes. PDH measures how close the 3D surface is to the ideal 3D surface within a predefined acceptable error margin, while the CIP measures how well the 3D surface correlates to the ideal 3D surface. Two variants of the CIP measure are used in this paper including a pure phase only filter and a normalized matched filter. The CIP measure is compared to the Procrustes metric for comparing shapes. Using a test case shape, the measures are compared and shown to provide varying information. Alone, any one measure cannot provide complete shape information. Combining measures provides a more robust three-dimensional shape measurement system. The shape measures are demonstrated first on three-dimensional data with controlled variation and then on laser ranging data.
Proc. SPIE 4470, Photonic Devices and Algorithms for Computing III
KEYWORDS: Image processing algorithms and systems, Detection and tracking algorithms, Image segmentation, Video, Image analysis, Video surveillance, Distance measurement, Video compression, Video processing, Target recognition
Appropriate segmentation of video is a key step for applications such as video surveillance, video composing, video compression, storage and retrieval, and automated target recognition. Video segmentation algorithms involve dissecting the video into scenes based on shot boundaries as well as local objects and events based on spatial shape and regional motions. Many algorithmic approaches to video segmentation have been recently reported, but many lack measures to quantify the success of the segmentation, especially in comparison to other algorithms. This paper suggests multiple bench-top measures for evaluating video segmentation. The paper suggests that the measures are most useful when 'truth' data about the video is available, such as precise frame-by-frame object shape. When precise 'truth' data is unavailable, this paper suggests using hand-segmented 'truth' data to measure the success of the video segmentation. Thereby, the ability of the video segmentation algorithm to achieve the same quality of segmentation as the human is obtained in the form of a variance in multiple measures. The paper introduces a suite of measures, each scaled from zero to one. A score of one on a particular measure is a perfect score for a singular segmentation measure. Measures are introduced to evaluate the ability of a segmentation algorithm to correctly detect shot boundaries, to correctly determine spatial shape and to correctly determine temporal shape. The usefulness of the measures is demonstrated on a simple segmenter designed to detect and segment a ping pong ball from a table tennis image sequence.
Numerous approaches to segmentation exist, requiring an evaluation technique to determine the most appropriate technique to use for a specific ladar design. A benchtop evaluation methodology that uses multiple measures is used to evaluate ladar-specific image segmentation algorithms. The method uses multiple measures along with an inter-algorithmic approach that was recently introduced for evaluating Synthetic Aperture Radar (SAR) imagery. Ladar imagery is considered to be easier to segment than SAR since it generally contains less speckle and has both a range and intensity map to assist in segmentation. A system of multiple measures focuses on area, shape and edge closeness to judge the segmentation. The judgement is made on the benchtop by comparing the segmentation to supervised hand-segmented images. To demonstrate the approach, a ladar image is segmented using several segmentation approaches introduced in literature. The system of multiple measures is then demonstrated on the segmented ladar images. An interpretation of the results is given. This paper demonstrates that the original evaluation approach designed for evaluating SAR imagery can be generalized across differing sensor modalities even though the segmentation and sensor acquisition approaches are different.
To ensure that the best possible image segmentation algorithm transitions into a real world recognition system, the segmentation algorithm must be properly evaluated. A novel approach is introduced for evaluating image segmentation algorithms. Part of the approach is to use a system of multiple measures. Using intra-algorithmic comparisons, three measures are tested on a small suite of segmented image test cases. The results from using three measures on the test suite demonstrate significant differences in the ability of the measures to distinguish segmentation quality. Another part of the novel approach is the use of an inter-algorithmic comparison which is shown to assist the evaluation of segmentation algorithms where exact edge truth is unknown such as is the case with Synthetic Aperture Radar (SAR) imagery. Results are demonstrated by using segmentation on a SAR target chip provided by the Moving and Stationary Target Acquisition and Recognition (MSTAR) program.
Because of the large number of SAR images the Air Force generates and the dwindling number of available human analysts, automated methods must be developed. A key step towards automated SAR image analysis is image segmentation. There are many segmentation algorithms, but they have not been tested on a common set of images, and there are no standard test methods. This paper evaluates four SAR image segmentation algorithms by running them on a common set of data and objectively comparing them to each other and to human segmentations. This objective comparison uses a multi-measure approach with a set of master segmentations as ground truth. The measure results are compared to a Human Threshold, which defines the performance of human segmentors compared to the master segmentations. Also, methods that use the multi-measures to determine the best algorithm are developed. These methods show that of the four algorithms, Statistical Curve Evolution produces the best segmentations; however, none of the algorithms are superior to human segmentations. Thus, with the Human Threshold and Statistical Curve Evolution as benchmarks, this paper establishes a new and practical framework for testing SAR image segmentation algorithms.
In automatic target recognition and machine vision applications, segmentation of the images is a key step. Poor segmentation reduces the recognition performance. For some imaging systems such as MRI and Synthetic Aperture Radar (SAR) it is difficult even for humans to agree on the location of the edge which allows for segmentation. A real-time dynamic approach to determine the quality of segmentation can enable vision systems to refocus or apply appropriate algorithms to ensure high quality segmentation for recognition. A recent approach to evaluate the quality of image segmentation uses percent-pixels-different (PPD). For some cases, PPD provides a reasonable quality evaluation, but it has a weakness in providing a measure for how well the shape of the segmentation matches the true shape. This paper introduces the complex inner product approach for providing a goodness measure for evaluating the segmentation quality based on shape. The complex inner product approach is demonstrated on SAR target chips obtained from the Moving and Stationary Target Acquisition and Recognition (MSTAR) program sponsored by the Defense Advanced Research Projects Agency (DARPA) and the Air Force Research Laboratory (AFRL). The results are compared to the PPD approach. A design for an optoelectronic implementation of the complex inner product for dynamic segmentation evaluation is introduced.
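One common way to realize a complex-inner-product shape measure is to encode corresponding boundary points as complex numbers, normalize out position and scale, and take the magnitude of the inner product, which is then also invariant to rotation. This is a generic sketch of that idea, not the paper's exact formulation, and it assumes the two boundaries are sampled with the same number of points in corresponding order:

```python
import numpy as np

def complex_inner_product(a_pts, b_pts):
    """Complex inner product (CIP) shape similarity between two
    boundaries given as (N, 2) arrays of corresponding points.
    Returns a value in [0, 1]; 1 indicates identical shape up to
    translation, scale and rotation."""
    za = a_pts[:, 0] + 1j * a_pts[:, 1]    # encode points as complex numbers
    zb = b_pts[:, 0] + 1j * b_pts[:, 1]
    za = za - za.mean()                    # remove translation
    zb = zb - zb.mean()
    za = za / np.linalg.norm(za)           # remove scale
    zb = zb / np.linalg.norm(zb)
    # rotation multiplies zb by a unit complex number, which the
    # magnitude of the inner product ignores
    return float(abs(np.vdot(za, zb)))
```

Unlike percent-pixels-different, this score responds directly to boundary shape rather than to area overlap.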
The transmission and storage of digital video currently requires more bandwidth than is typically available. Emerging applications such as video-on-demand, web cameras, and collaborative tools with video conferencing are pushing the limits of the transmission media to provide video to the desktop computer. Lossy compression has succeeded in meeting some of the video demand, but it suffers from artifacts and low resolution. This paper introduces a content-dependent, frame-selective compression technique which is developed wholly as a preconditioner that can be used with existing digital video compression techniques. The technique is heavily dependent on a priori knowledge of the general content of the video and uses content knowledge to make smart decisions concerning the frames selected for storage or transmission. The velocital information feature of each frame is calculated to determine the frames with the most active changes. The velocital information feature along with a priori knowledge of the application allows prioritization of the frames. Frames are assigned priority values with the higher priority frames being selected for transmission based on available bandwidth. The technique is demonstrated for two applications: an airborne surveillance application and a worldwide web camera application. The airborne surveillance application acquires digital infrared video of targets at a standard frame rate of 30 frames per second, but the imagery suffers from infrared sensor artifacts and spurious noise. The web camera application selects frames at a slow rate but suffers from artifacts due to lighting and reflections. The results of using content-dependent, frame-selective video compression show improvement in image quality along with reduced transmission bandwidth requirements.
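The frame-prioritization step can be sketched with a simple stand-in for the velocital information feature: score each frame by its mean absolute difference from the previous frame and keep the most active frames. This is an illustrative simplification; the paper's actual feature and its a priori application weighting are not reproduced here.

```python
import numpy as np

def select_frames(frames, k):
    """Frame-selective preconditioning sketch: score each frame by its
    inter-frame change energy and return the sorted indices of the k
    highest-scoring frames for storage or transmission."""
    frames = np.asarray(frames, dtype=float)
    # change energy: mean absolute difference from the previous frame
    diffs = np.abs(np.diff(frames, axis=0)).mean(axis=(1, 2))
    scores = np.concatenate(([0.0], diffs))   # first frame has no predecessor
    return sorted(np.argsort(scores)[-k:].tolist())
```

A bandwidth budget would set k; frames below the cut are dropped before the conventional codec runs.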
When imaging the ground from the air, distortions can occur if the imagery was created from an electro-optical line scanner pointing to nadir and mounted on the bottom of an airborne platform. The inability of the aircraft to maintain a perfect trajectory can cause the distortions. In the worst case scenario, camera stabilizers fail, no geographical reference or navigation data is available, and the sensor periodically fails, leaving incomplete data for image construction. Motion compensation can restore the images. This paper describes various distortions that can be created for an airborne nadir-aimed line scanner. A motion-compensation technique is introduced that combines multiple cues from geographical reference and navigation data as well as line-scan matched filtering. A semi-automated restoration implementation is introduced followed by the automated line-scan matched filter implementation. These various compensation techniques provide backup for each other, thus creating a more efficient motion-compensation system. Even in the worst case scenario, the system continues to attempt motion compensation using an optimal line-scan matched filtering technique. The results of using this automated technique for motion compensation are demonstrated using simulated high-definition imagery and then using actual electro-optical and hyperspectral images that were obtained from the Dynamic Data Base (DDB) program sponsored by the Defense Advanced Research Projects Agency (DARPA).
An enhanced region-growing approach for segmenting regions is introduced. A region-growing algorithm is merged with stopping criteria based on a robust noise-tolerant edge-detection routine. The region-grow algorithm is then used to segment the shadow region in a Synthetic Aperture Radar (SAR) image. This approach recognizes that SAR phenomenology causes speckle in imagery even in the shadow area due to energy injected from the surrounding clutter and target. The speckled image makes determination of edges a difficult task even for the human observer. This paper outlines the edge-enhanced region-grow approach and compares the results to three other segmentation approaches including the region-grow only approach, an automated-threshold approach based on a priori knowledge of the SAR target information, and the manual segmentation approach. The comparison is shown using a tri-metric inter-algorithmic approach. The metrics used to evaluate the segmentation include percent-pixels-same (PPS), the partial-directed Hausdorff (PDH) metric, and a shape-based metric based on the complex inner product (CIP). Experimental results indicate that the enhanced region-growing technique is a reasonable segmentation for the SAR target image chips obtained from the Moving and Stationary Target Acquisition and Recognition (MSTAR) program.
Speckle noise and phase errors are two major sources of quality degradation for synthetic aperture radar imagery. In this work, we address this problem with the proposal of a spatio-temporal metric to benchmark these degradations by analyzing an azimuthal image sequence. Preliminary results of the metric with and without a multiresolutional formulation are reported.
DARPA's Moving and Stationary Target Acquisition and Recognition (MSTAR) program has shown that image segmentation of Synthetic Aperture Radar (SAR) imagery into target, shadow, and background clutter regions is a powerful tool in the process of recognizing targets in open terrain. Unfortunately, SAR imagery is extremely speckled. Impulsive noise can make traditional, purely intensity-based segmentation techniques fail. Introducing prior information about the segmentation image -- its expected 'smoothness' or anisotropy -- in a statistically rational way can improve segmentations dramatically. Moreover, maintaining statistical rigor throughout the recognition process can suggest rational sensor fusion methods. To this end, we introduce two Bayesian approaches to image segmentation of MSTAR target chips based on a statistical observation model and Markov Random Field (MRF) prior models. We compare the results of these segmentation methods to those from the MSTAR program. The technique we find by mapping the discrete Bayesian segmentation problem to a continuous optimization framework can compete easily with the MSTAR approach in speed, segmentation quality, and statistical optimality. We also find this approach provides more information than a simple discrete segmentation, supplying probability measures useful for error estimation.
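A minimal version of the MRF idea can be written with iterated conditional modes (ICM): a per-pixel Gaussian likelihood plus a Potts smoothness term whose weight beta is an assumed parameter. This is a two-class sketch for illustration only, not the paper's Bayesian formulation (which maps the discrete problem to a continuous optimization and returns probability measures rather than hard labels):

```python
import numpy as np

def icm_segment(img, means, sigmas, beta=2.0, iters=5):
    """Two-class MRF segmentation by iterated conditional modes:
    per-class Gaussian negative log-likelihood plus beta times the
    number of 4-neighbors that disagree with the candidate label."""
    img = np.asarray(img, dtype=float)
    nll = np.stack([(img - m) ** 2 / (2 * s ** 2) + np.log(s)
                    for m, s in zip(means, sigmas)])
    labels = np.argmin(nll, axis=0)        # likelihood-only initialization
    for _ in range(iters):
        pad = np.pad(labels, 1, mode='edge')
        nbrs = np.stack([pad[:-2, 1:-1], pad[2:, 1:-1],   # up, down
                         pad[1:-1, :-2], pad[1:-1, 2:]])  # left, right
        energy = np.stack([nll[c] + beta * (nbrs != c).sum(axis=0)
                           for c in (0, 1)])
        labels = np.argmin(energy, axis=0)
    return labels
```

The prior term is what lets the segmentation survive impulsive speckle: an isolated bright pixel inside a dark region is outvoted by its neighbors.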
Target recognition research for Synthetic Aperture Radar (SAR) has been made easier with the introduction of target chip sets. The target chips typically are of good quality and consist of three regions: target, shadow and background clutter. Target chip sets allow recognition researchers to bypass the quality filtering and detection phases of the automatic recognition process, so the researcher can focus on segmentation and matching techniques. A manual segmentation process using supervised quality control is introduced in this paper. Using 'goodness of fit' measures, the quality of manual segmentation on SAR target chips is presented. Using the expected metrics associated with the manual segmentation process, the performance of automated segmentation techniques can be evaluated. This approach of using manual segmentation to evaluate automated segmentation techniques is demonstrated on a simple automated segmentation technique that incorporates speckle removal and segmentation.
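One straightforward 'goodness of fit' measure for scoring an automated segmentation against a manual truth segmentation is the percent-pixels same (PPS) score named elsewhere in this collection. A minimal sketch, assuming both segmentations are label maps of the same shape:

```python
import numpy as np

def percent_pixels_same(seg, truth):
    """Percent-pixels same (PPS): percentage of pixels whose label
    agrees with the manual (truth) segmentation."""
    seg, truth = np.asarray(seg), np.asarray(truth)
    return 100.0 * float((seg == truth).mean())
```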
This paper addresses the problems associated with dynamic change detection of an image sequence in the compressed domain. In particular, wavelet compression is considered here. With its multiresolutional decomposition, wavelet compression offers many different routes of image compression. This paper presents some preliminary results of different compression schemes on spatio-temporal change detection metrics.
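To make the compressed-domain idea concrete, a one-level Haar decomposition splits each frame into four subbands; differencing corresponding subbands between consecutive frames then yields one change score per resolution channel. The sketch below is illustrative, not the paper's metric (the averaging Haar convention and the mean-absolute-difference score are assumptions):

```python
import numpy as np

def haar2d(img):
    """One-level 2-D Haar transform: returns (LL, LH, HL, HH) subbands."""
    a = np.asarray(img, dtype=float)
    lo_r = (a[0::2] + a[1::2]) / 2.0   # row averages
    hi_r = (a[0::2] - a[1::2]) / 2.0   # row differences
    ll = (lo_r[:, 0::2] + lo_r[:, 1::2]) / 2.0
    lh = (lo_r[:, 0::2] - lo_r[:, 1::2]) / 2.0
    hl = (hi_r[:, 0::2] + hi_r[:, 1::2]) / 2.0
    hh = (hi_r[:, 0::2] - hi_r[:, 1::2]) / 2.0
    return ll, lh, hl, hh

def subband_change(f0, f1):
    """Mean absolute subband difference between consecutive frames,
    one score per subband (LL, LH, HL, HH)."""
    return [float(np.abs(b1 - b0).mean())
            for b0, b1 in zip(haar2d(f0), haar2d(f1))]
```

A change confined to the LL score suggests a global brightness shift, while changes concentrated in the detail subbands indicate structural (edge) motion between frames.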
An airborne hyperspectral line scanner is used to image the ground as the aircraft moves along a single trajectory. In reality, it may be difficult for the aircraft to maintain a perfectly steady course, causing distortions in the imagery. So, special subsystems including stabilizers are used to keep the hyperspectral line scanner on the proper course. If the subsystems of an airborne hyperspectral line scanner are malfunctioning or if the proper stabilizers are not available, then a technique is needed to restore the imagery. If no stabilizers are used on the airborne line scanner but aircraft navigation information is available, including yaw, pitch and roll, then the restoration may be automated. However, if the stabilizers are malfunctioning or if the navigation information is corrupted or unavailable, then a different restoration technique is needed. This paper introduces an automated technique for restoring hyperspectral images that was used on some images obtained for the Dynamic Data Base program sponsored by the Defense Advanced Research Projects Agency. The automated approach is based on image flow vectors obtained from the unstable image. The approach is introduced along with results that demonstrate how successful the restoration is at the feature level.
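The restoration idea can be illustrated with a much simpler stand-in: estimate an integer horizontal offset for each scan line by correlating it against a reference line, then shift each line back. This is a hedged sketch, not the paper's image-flow-vector method (the integer shifts, circular correlation, and reference-line choice are all simplifying assumptions):

```python
import numpy as np

def estimate_line_shifts(img, ref_row=0, max_shift=5):
    """Estimate the integer horizontal offset of each scan line relative
    to a reference line by maximizing circular correlation."""
    img = np.asarray(img, dtype=float)
    ref = img[ref_row]
    shifts = []
    for row in img:
        scores = [np.dot(np.roll(row, -s), ref)
                  for s in range(-max_shift, max_shift + 1)]
        shifts.append(int(np.argmax(scores)) - max_shift)
    return shifts

def restore(img, shifts):
    """Shift each scan line back to undo the estimated jitter."""
    return np.array([np.roll(row, -s) for row, s in zip(img, shifts)])
```

A real restoration would estimate sub-pixel, two-dimensional flow vectors per line; the sketch only shows how per-line misregistration can be measured and undone from the unstable image alone.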
This paper proposes to use a multiresolutional spatio-temporal metric for the segmentation of an image sequence. In particular, scene-cut detection performance on an image sequence is presented. Wavelet decomposition is used for the multiresolutional analysis. The segmentation results obtained here can be used for video browsing and indexing in multimedia applications.
This paper introduces a spatio-temporal technique for selecting or filtering out lower quality digital image frames. The technique is demonstrated on Electro-Optical/Infrared image sequences, which suggests it is a candidate for exploiting reconnaissance (recce) imagery or could be part of a recce subsystem. For human vision exploitation, a few poor quality image frames out of hundreds in a digital image sequence may be only a minor irritation when the sequence runs at the typical 30 frames per second. Of course, if that human needs to examine each frame, a system that automatically removes or enhances lower quality image frames could be beneficial. For machine vision subsystems, a few poor quality image frames could cause a lower probability of recognition. The filter technique introduced in this paper can improve input into machine vision algorithms. Another application for this technique is digital transmission, to filter out unwanted images prior to transmission or to selectively enhance the poor quality frames. A major portion of current research into quality in digital image sequences focuses on transmission systems, where an input high quality image sequence can be compared to the lower quality image sequence received at the output of the transmission system. However, this paper shows a technique for judging the quality of the input image frames prior to transmission, without a transmission system or any knowledge of the higher quality image input. The impact of digital image artifacts on the spatio-temporal quality is shown. The quality variations in the individual frames of the input image sequence are charted to show which frames are of lower quality and thus need filtering.
This paper introduces a metric called Velocital Information Content (VIC), which is used to chart quality variations in digital image sequences. Both spatially-based and temporally-based artifacts are charted using this single metric. VIC is based on the velocital information in each image. A mathematical formulation for VIC is shown along with its relation to the spatial and temporal information content. Some strengths and weaknesses of the VIC formulation are discussed. VIC is tested on some standard image sequences with various spatio-temporal attributes. VIC is also tested on a standard image sequence with various degrees of blurring using a linear blurring algorithm. Additionally, VIC is tested using standard sequences that have been processed through a digital transmission algorithm. The transmission algorithm is based on the discrete cosine transform and thus introduces many of the known digital artifacts, such as blocking. Finally, the ability of VIC to chart image artifacts is compared to a few other traditional quality metrics. VIC plays a different role from traditional transmission-based quality metrics, which require two images, the original input and the degraded output, to calculate the metric. VIC can detect artifacts from a single image sequence by charting variations from the norm. Therefore, VIC offers a way to judge the quality of image frames prior to transmission, without a transmission system or any knowledge of the higher quality image input. These differences between VIC and transmission-oriented quality metrics give VIC a distinct role in analysis and image sequence processing.
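The VIC formulation itself is not reproduced here, but the spatial and temporal information content it is related to can be charted per frame in the spirit of the ITU-T P.910 SI/TI measures. A sketch with plain finite differences standing in for the Sobel filter (the function names and the gradient choice are assumptions):

```python
import numpy as np

def spatial_info(frame):
    """Spatial information: std-dev of a simple gradient magnitude
    (ITU-T P.910 uses a Sobel filter; plain differences here)."""
    f = np.asarray(frame, dtype=float)
    gx = np.diff(f, axis=1)[:-1, :]
    gy = np.diff(f, axis=0)[:, :-1]
    return float(np.sqrt(gx**2 + gy**2).std())

def temporal_info(prev, cur):
    """Temporal information: std-dev of the frame difference."""
    return float((np.asarray(cur, float) - np.asarray(prev, float)).std())

def chart(frames):
    """Per-frame (SI, TI) pairs, the raw material for flagging
    outlier frames by their deviation from the sequence norm."""
    return [(spatial_info(f), temporal_info(p, f))
            for p, f in zip(frames, frames[1:])]
```

Charting these values over a sequence, as the abstract describes for VIC, lets a no-reference system flag frames whose scores deviate sharply from their neighbors.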
Proc. SPIE 3370, Algorithms for Synthetic Aperture Radar Imagery V
KEYWORDS: Signal to noise ratio, Detection and tracking algorithms, Sensors, Image processing, Image quality, Image sensors, Digital imaging, Image transmission, Automatic target recognition, Analog electronics
For the Automatic Target Recognition (ATR) algorithm, the quality of the input image sequence can be a major determining factor as to the ATR algorithm's ability to recognize an object. Based on quality, an image can be easy to recognize, barely recognizable or even mangled beyond recognition. If a determination of the image quality can be made prior to entering the ATR algorithm, then a confidence factor can be applied to the probability of recognition. This confidence factor can be used to rate sensors; to improve quality through selectively preprocessing image sequences prior to applying ATR; or to limit the problem space by determining which image sequences need not be processed by the ATR algorithm. It could even determine when human intervention is needed. To get a flavor for the scope of the image quality problem, this paper reviews analog and digital forms of image degradation. It looks at traditional quality metric approaches such as peak signal-to-noise ratio. It examines a newer metric based on human vision data, a metric introduced by the Institute for Telecommunication Sciences. These objective quality metrics can be used as confidence factors primarily in ATR systems that use image sequences degraded due to transmission systems. However, to determine the quality metric, a transmission system needs the original input image sequence and the degraded output image sequence. This paper suggests a more general approach to determining quality using analysis of spatial and temporal vectors where the original input sequence is not explicitly given. This novel approach would be useful where there is no transmission system but where the ATR system is part of the sensor, on-board a mobile platform. The results of this work are demonstrated on a few standard image sequences.
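The traditional full-reference metric reviewed above, peak signal-to-noise ratio, needs both the original and the degraded frame, which is exactly the limitation the suggested no-reference approach works around. A minimal sketch of the standard formula:

```python
import numpy as np

def psnr(reference, degraded, peak=255.0):
    """Peak signal-to-noise ratio in dB between a reference frame and a
    degraded frame; a full-reference metric, so both frames are required."""
    err = np.asarray(reference, float) - np.asarray(degraded, float)
    mse = float((err ** 2).mean())
    if mse == 0:
        return float('inf')       # identical frames
    return 10.0 * np.log10(peak ** 2 / mse)
```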
A complex associative memory model based on a neural network architecture is proposed for tracking three-dimensional objects in a dynamic environment. The storage representation of the complex associative memory model is based on an efficient amplitude-modulated phase-only matched filter. The input to the memory is derived from the discrete Fourier transform of the edge coordinates of the to-be-recognized moving object, where the edges are obtained through motion-based segmentation of the image scene. An adaptive threshold is used during the decision-making process to indicate a match or identify a mismatch. Computer simulation on real-world data proves the effectiveness of the proposed model. The proposed scheme is readily amenable to optoelectronic implementation.
A complex associative memory model based on a neural network architecture is proposed for recognizing three-dimensional objects acquired from a dynamic environment. The storage representation of the complex associative memory model is based on an efficient amplitude modulated phase-only matched filter. The input to the memory is derived from the discrete Fourier transform of the edge coordinates of the to-be-recognized moving object, where the edges are obtained through motion-based segmentation of the image scene. An adaptive threshold is used during the decision making process to indicate a match or identify a mismatch. Computer simulation on real world data proves the effectiveness of the proposed model. The proposed scheme is readily amenable to opto-electronic implementation.
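The phase-only matched filter at the heart of both memory models normalizes the template's spectrum to unit amplitude before correlation, which sharpens the correlation peak relative to a classical matched filter. A minimal sketch of that filtering step alone, not the papers' amplitude-modulation scheme (the epsilon regularization and the peak-location helper are illustrative assumptions):

```python
import numpy as np

def pomf_correlate(scene, template, eps=1e-12):
    """Correlation with a phase-only matched filter: the template
    spectrum is normalized to unit amplitude, so only its phase
    contributes, yielding a sharp correlation peak."""
    S = np.fft.fft2(scene)
    T = np.fft.fft2(template, s=scene.shape)   # zero-pad to scene size
    H = np.conj(T) / (np.abs(T) + eps)         # phase-only filter
    return np.real(np.fft.ifft2(S * H))

def best_match(scene, template):
    """Location (row, col) of the correlation peak."""
    c = pomf_correlate(scene, template)
    return np.unravel_index(int(np.argmax(c)), c.shape)
```

In the papers, the memory input is the DFT of motion-segmented edge coordinates rather than raw pixels, and the filter amplitude is modulated rather than discarded; the sketch only shows why the phase-only normalization localizes a stored pattern so sharply.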