An operational video compressor was built using field-to-field differencing on fields processed with Landau and Slepian's Hadamard transform method. Because of disturbing motion artifacts, the system was partially redesigned to repeat fields or frames. Spatial resolution was improved using a fixed-rate, three-mode adaptive system to compress each field. A substantial performance improvement was achieved while the transmission rate, memory size, and much of the hardware remained unchanged.
The Space Shuttle will use a field-sequential color television system for its first few missions, but present plans are to switch to an NTSC color TV system for future missions. The field-sequential color TV system uses a modified black-and-white camera, producing a TV signal with a digital bandwidth of about 60 Mbps. This article discusses the characteristics of the Shuttle TV systems and proposes a bandwidth compression technique for the field-sequential color TV system that could operate at 13 Mbps while producing a high-fidelity signal. The proposed technique is based on a two-dimensional DPCM system that exploits the temporal and spectral, as well as spatial, correlation inherent in field-sequential color TV imagery. The proposed system requires about 60 watts and fewer than 200 integrated circuits.
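The DPCM feedback loop that this abstract builds on can be illustrated with a minimal sketch. The predictor below is a hypothetical spatial-only average of the reconstructed left and upper neighbours; the proposed Shuttle system additionally exploits temporal and spectral correlation, which is not modeled here.

```python
import numpy as np

def dpcm_encode(img, levels):
    """2-D DPCM loop (sketch): predict each pel from its reconstructed
    left and upper neighbours, quantize the residual to the nearest
    level, and track the decoder's reconstruction inside the loop."""
    h, w = img.shape
    recon = np.zeros((h, w))
    quantized = np.zeros((h, w))
    for r in range(h):
        for c in range(w):
            left = recon[r, c - 1] if c > 0 else 128.0
            up = recon[r - 1, c] if r > 0 else 128.0
            pred = 0.5 * (left + up)                 # simple planar predictor
            q = levels[np.argmin(np.abs(levels - (img[r, c] - pred)))]
            quantized[r, c] = q
            recon[r, c] = pred + q                   # what the decoder sees
    return quantized, recon
```

Because the quantizer sits inside the prediction loop, the reconstruction error never exceeds half the quantizer step and does not accumulate across the image.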
Three modifications of the Constant Area Quantization (CAQ) image bandwidth compression technique have been developed and tested in order to broaden the range of compression ratios obtainable with acceptable image quality. The first modification introduced an adaptive area threshold, the second was a two-threshold algorithm, and the third was a hybrid of CAQ with a Hadamard transform technique. Using these three algorithms together with the basic CAQ, images spanning the range from 0.2 to 2 bits per picture element were obtained from an 8-bit original.
A practical video bandwidth compression system is described in detail. Video signals are Haar transformed in two dimensions, and the resulting transform coefficients are adaptively filtered to reduce transmission data rates to less than 1 bit/pel while maintaining good picture quality. Error detection and compensation are built into the transmission bit structure.
The challenge of providing a robust six-megahertz video data channel for remotely piloted vehicles (RPVs) has caused several old standby techniques to be reviewed. Spread spectrum methods, the most promising class for robust channel signalling, require source bandwidths much smaller than those of the channel. Consequently, the application of video data compression algorithms to limit source bandwidths plays an important role. Several video data compression techniques that have been applied to the RPV problem to date are discussed, ranging from direct pictorial redundancy reduction, to reduction of frame rate, to data synthesis. These illustrate how the special requirements of the RPV video problem are addressed in the designs. This paper describes the general scenario in which the RPV video capability will be used. Major techniques discussed include transform coding on subpictures, hybrid coding, slow frame rate data processing, and source-content-adaptive variable resolution data processing. The RPV video communications system truly provides a new challenge to the video data compression community.
This paper discusses a high-resolution laser image transmission system recently designed, built, and tested under contract to Rome Air Development Center. The system and equipment were designed for two-way transmission of intelligence and reconnaissance imagery over narrowband communication links. The following technological advances embodied in the system are covered in this paper.
•The use of source encoding on the image data to achieve an average of 15:1 reduction in the transmitted data
•Economical achievement of high-resolution imagery with 20-Ip/mm scanning resolution
•Synchronizing encoding for transmission over 9.6-kb/s wire-line links in a heavily errored environment
•Use of a modulated laser as a scanning and reproducing light source
•Use of dry-silver photocopy as the output media
This paper summarizes a research study on data compression techniques applicable to Remotely Piloted Vehicles (RPVs). Interframe techniques are considered, and algorithms are determined to account for motion in frame-to-frame aerial photographs, resulting in large bit-reduction ratios provided that the various parameters of the mission, such as altitude, velocity, etc., are accurately known. Differential encoding is used to further reduce bandwidth. Intraframe techniques suitable for the RPV mission, including two-dimensional transform techniques and hybrid coding schemes, are investigated and evaluated. It is shown that hybrid schemes using the Hadamard transform in one spatial direction and DPCM in the other perform equivalently to, and at times better than, two-dimensional transform techniques. The effects of channel errors on both the transform and hybrid coding schemes are investigated. Although the hybrid coding scheme is shown to be more sensitive to noise, optimization of the prediction coefficient results in satisfactory performance in a noisy environment. An adaptive scheme is considered which shows improved resolution in regions of high activity within the picture.
This overview begins by emphasizing the importance of the human observer to the process of efficiently encoding picture material. By way of example, two aspects of threshold vision that bear importantly on the coding problem are discussed: the threshold of a perturbation presented against a plain background and threshold adjacent to a luminance step. Methods for incorporating threshold information into the coding operation are briefly described.
This paper describes how simulated hard copy can provide performance prediction and design optimization data during the development of high resolution electro-optic reconnaissance systems. A series of software models have been developed that act on an input image realistically introducing MTF degradation, distortion, and noise as they would occur in the actual system under study. Properties of the various functional elements such as the focal plane, signal processing, and the communications channel are discussed with particular emphasis on those areas where complex tradeoffs are involved. The mechanics of the simulation process are described together with several examples of its use.
Computers play a role in everyone's life in one way or another. Ever since Turing postulated his by now well-known "Turing test," there has been an enormous increase in the use of machines to do more than mere computation. Within the last five years or so there has been a spurt in the use of machines to perform assembly-line tasks, in particular welding, painting, assembly, inspection, and manufacturing. To quite an extent these various attempts are a blindfolded approach to manufacturing; in other words, they do not use Computer Vision. This paper is an attempt at explaining some of the results, the problems, and the shortcomings of using Computer Vision in the real world.
This paper describes a new image coding system which combines the detection and coding of visually significant edges in natural images. The edges are defined as amplitude discontinuities between different regions of an image. The edge detection system makes use of 3 x 3 masks, which are well suited for digital implementation. Edge angles are quantized to eight equally spaced directions, suitable for chain coding of contours. Use of an edge direction map improves the simple thresholding of gradient modulus images. The concept of local connectivity of the edge direction map is useful in improving the performance of this method as well as other edge operators such as Kirsch and Sobel. The concepts of an "edge activity index" and a "locally adaptive threshold" are introduced and shown to improve the performance even further.
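The 3 x 3 masks and eight-direction angle quantization described above can be sketched as follows, here using the Sobel masks the abstract mentions. The edge-activity index and locally adaptive threshold are not modeled; a fixed magnitude threshold stands in for them.

```python
import numpy as np

def sobel_edges(img, thresh):
    """Sketch: 3x3 Sobel masks, gradient-magnitude thresholding, and
    quantization of the edge angle to eight equally spaced directions
    (codes 0..7, each spanning 45 degrees), as used for chain coding."""
    gx_mask = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]])
    gy_mask = gx_mask.T
    h, w = img.shape
    mag = np.zeros((h, w))
    direc = np.zeros((h, w), dtype=int)
    for r in range(1, h - 1):
        for c in range(1, w - 1):
            win = img[r - 1:r + 2, c - 1:c + 2]
            gx = np.sum(gx_mask * win)
            gy = np.sum(gy_mask * win)
            mag[r, c] = np.hypot(gx, gy)
            ang = np.arctan2(gy, gx)                   # -pi .. pi
            direc[r, c] = int(np.round(ang / (np.pi / 4))) % 8
    return mag > thresh, direc
```

A vertical intensity step, for example, produces edge pels along the step with direction code 0 (gradient pointing in +x).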
The choice of the color coordinates for extraction of visually significant edges and boundaries in a color image is discussed. The compass gradient edge detection method developed by the author for monochrome images is extended to color edge detection and a quantitative measure is discussed in choosing the color coordinates for edge extraction.
This paper describes some analytical results relative to the effectiveness of applying data compression techniques for efficient transmission of synthetic aperture radar (SAR) signals and images. A Rayleigh target model is assumed in the analysis. It is also assumed that all surface reflectivity information is of interest and needs to be transmitted. Spectral characteristics of radar echo signals and processed images are analyzed. Analytical results generally indicate that due to the lack of high spatial correlation in the Rayleigh distributed radar surface reflectivity, application of data compression to SAR signals and images under the square difference fidelity criterion may be less effective than its application to images obtained using incoherent illumination. On the other hand, if certain random variations in radar images are considered as undesirable, substantial compression ratio may be achieved by removing such variations.
The problems of data rate and data volume are generally considered basic and fundamental ones in the development of real-time or near real-time multispectral scanner data systems. This paper suggests that current advances in the field of programmable microprocessors may provide a solution to these problems, particularly when applied in a preprocessing mode. Five such microprocessors are considered (2, 3.3, 4, 5, and 6 MHz) operating on a five-channel, conical, multispectral scanner data stream with a maximum 30 kc data rate. A general, linear, data channel coupling is examined in order to establish saturation points, primarily for the classification problem. Some data formats, including multiplexing, are suggested. Algorithms for obtaining quick, first-order data alignment (geometric corrections) are considered. A specific application, topographic mapping, is seen to be particularly suited to the stereo aspects of a conical scanner and is given as an example of a total system. Finally, each function above (general channel coupling, geometric corrections, and topographic mapping) is discussed and then subjected to the constraints of data rate and microprocessor speed in order to deduce a number of microprocessor/scanner configurations. These are summarized in tabular form.
Presenting two target stimuli at the proper spacing, duration, and time interval produces movement perception in human observers. The observer sees a single stimulus moving continuously across the physically empty space, from its first location to its second. The production of the movement perception depends upon such physical variables as duration of stimuli, X1; time between stimuli, X2; intensity of stimuli, X3; and distance between stimuli, X4. The research examines the optimum conditions for producing movement perception. The determination of the optimum movement perception was made in the ranges 90 < X1 < 310 milliseconds, 0 < X2 < 50 seconds, and 0 < X4 < 15 centimeters. A chi-squared test of independence reveals that the frequency of responses of optimum movement depends only on the variables X1 and X4. Therefore, by curvilinear regression analysis, the frequency of responses of optimum movement perception is approximated in terms of X1 for each experimental value of X4: for each value of X4, the parabola f(X1 | X4) = β0 + β1X1 + β2X1², and the value of X1 at which f attains its maximum, were found. The application areas of the results are further discussed in the second paper.
The present experiments are concerned with the analysis of the optimum apparent movement perception in relation to fast and slow movement perception. An attempt was made to find a theoretical interpretation of the apparent movement perception relative to real movement perception.
We define a Gauss-Markov field over E2 by a stochastic partial differential equation with space-invariant coefficients. This model may be used as an approximation to pictorial and other two-dimensional data. When we apply it to block coding it has some interesting properties. The solution of an associated homogeneous boundary value problem, with the field data as boundary conditions, defines a "pinning" function. The KLT of the data minus its pinning function is shown to be identical to the Fourier Sine Transform. Some properties of this PKLT as well as its relation to Jain's Fast KLT are discussed.
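A one-dimensional analogue of the pinning construction may clarify the idea: the harmonic function matching two boundary samples is simply the straight line between them, and the interior residual, which vanishes at both ends, is expanded in a sine basis. This is an illustrative sketch, not the paper's two-dimensional derivation.

```python
import numpy as np

def pinned_sine_transform(x):
    """1-D sketch of pinning: subtract the boundary-value solution (a line
    through the two end samples), then expand the interior residual,
    which is zero at both ends, in an orthogonal sine basis (DST-I)."""
    n = len(x)
    t = np.linspace(0.0, 1.0, n)
    pinning = x[0] + (x[-1] - x[0]) * t          # 1-D harmonic function
    resid = (x - pinning)[1:-1]                  # interior residual
    m = len(resid)
    k = np.arange(1, m + 1)
    basis = np.sin(np.pi * np.outer(k, np.arange(1, m + 1)) / (m + 1))
    coeffs = basis @ resid * 2.0 / (m + 1)       # DST-I analysis
    return pinning, coeffs, basis
```

The basis matrix is symmetric and self-inverse up to the factor 2/(m+1), so the residual is recovered exactly by applying it to the coefficients.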
Visual thresholds play an important role in the process of incorporating properties of the human visual system in encoding picture signals. They tell us how much the picture signal can be perturbed without the perturbations being visible to human observers. We describe psychovisual experiments to determine the amplitude thresholds at a single edge having a given slope. The perturbation of the picture signal caused by the encoding process should not exceed the threshold corresponding to the slope at any picture point, in order that the original and the encoded pictures be visually indistinguishable. We give methods to design quantizers for use in Differential Pulse Code Modulation (DPCM) systems, such that the quantization error is below the threshold and either (a) the number of quantizer levels or (b) the entropy of the quantized output is minimized. We also discuss the structure of these quantizers and evaluate their performance on real pictures, both in terms of picture quality and entropy.
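The level-minimizing design under a visibility constraint can be sketched greedily. As an illustration only, the threshold here is taken as a nondecreasing function of input amplitude (a Weber-like assumption); the paper's thresholds depend on local slope.

```python
def min_level_quantizer(lo, hi, thresh, step=1):
    """Greedy sketch: place each reconstruction level as far out as the
    visibility threshold allows, so that every input in [lo, hi] lies
    within thresh(x) of some level, using few levels. Assumes thresh is
    nondecreasing in amplitude (an illustrative assumption)."""
    levels = []
    x = lo
    while x <= hi:
        level = x + thresh(x)                # farthest level still invisible at x
        levels.append(level)
        x = level + thresh(level) + step     # first input this level cannot cover
    return levels
```

With a constant threshold of 4 over an 8-bit range, for example, the greedy rule recovers the familiar uniform quantizer with roughly (range)/(2·threshold) levels.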
This paper describes research which extends transform and linear predictive spatial domain image coding concepts to the coding of frame-to-frame sequences of digital images. The emphasis is directed towards interframe image coding systems that exploit temporal as well as spatial image redundancies. Interframe coder implementations investigated include three-dimensional unitary transform coders and hybrid coders which employ two-dimensional transforms in the spatial domain coupled with first-order DPCM predictive coding in the temporal domain. Based on a statistical image representation, models are developed for transform coefficient and transform coefficient temporal difference variance matrices. Using these models, theoretical MSE performance levels for both coders are determined as a function of spatial subblock size. Results are verified by computer simulation experiments.
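A minimal sketch of the hybrid coder investigated above: a 2-D orthonormal DCT on each frame, with first-order DPCM along the temporal axis applied to each transform coefficient. The uniform quantizer, block size, and predictor weight are illustrative choices, not the paper's parameters.

```python
import numpy as np

def dct_matrix(n):
    """Orthonormal DCT-II basis matrix."""
    k = np.arange(n)[:, None]
    j = np.arange(n)[None, :]
    m = np.cos(np.pi * (2 * j + 1) * k / (2 * n)) * np.sqrt(2.0 / n)
    m[0, :] /= np.sqrt(2.0)
    return m

def hybrid_encode(frames, qstep, a=1.0):
    """Hybrid interframe coder sketch: 2-D DCT per frame, first-order
    temporal DPCM (predictor weight a) on each coefficient."""
    n = frames[0].shape[0]
    C = dct_matrix(n)
    pred = np.zeros((n, n))                      # reconstructed coefficients
    coded = []
    for f in frames:
        coef = C @ f @ C.T
        resid = coef - a * pred                  # temporal difference
        q = np.round(resid / qstep) * qstep      # uniform quantizer
        pred = a * pred + q                      # decoder-tracked state
        coded.append(q)
    return coded

def hybrid_decode(coded, a=1.0):
    n = coded[0].shape[0]
    C = dct_matrix(n)
    pred = np.zeros((n, n))
    out = []
    for q in coded:
        pred = a * pred + q
        out.append(C.T @ pred @ C)               # inverse 2-D DCT
    return out
```

Because the DPCM loop operates on reconstructed coefficients, quantization error stays bounded by half the step size per coefficient and does not build up over frames.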
There have been many investigations to determine achievable compression ratios for various transform encoding schemes. Often, results are not comparable because they are obtained with different images, digitized to different numbers of bits and sampled at different rates. In this paper a typical image for a remotely piloted vehicle application was selected and compared utilizing most of the popular transform methods. The comparison was made at average bit allocations of 1 bit/pel, .75 bit/pel, and .5 bit/pel. The error criterion was a combination of visual, RMS, and RMS correlated error. The results indicated that the best performance was given by the Fast Karhunen-Loeve, followed closely by the Discrete Cosine and Discrete Linear Basis. Those transforms classed as "good performers" achieved a compression ratio of 12:1, or 1/2 bit per pixel. The autocorrelation of the error images was computed, and a characteristic decrease in correlation at lag 1 followed by an increase in correlation at lags 2 and 3 was observed. The Discrete Cosine basis set was also compared with the eigenvectors of a first-order Markov correlation matrix of dimension 16, verifying Ahmed's suggestion that the fit is good even for small dimensions.
In the usual transform compression encoding schemes, an image is divided into regularly shaped subimages or blocks, a transform is performed on each block, low-energy components are discarded, the remaining components are encoded and transmitted, and the received image is then reconstructed. This paper discusses ways in which the image pixels can be permuted before the image is blocked, and a special annihilation transform technique to perform the image coding. Experimental results show the compressed images to have no blocking effects, but a more mottled appearance compared to a discrete cosine transform coding method. The annihilation method on permuted images not only gives reconstructed images better visual quality, but also gives lower RMS error.
It is hypothesized that a rudimentary measure of human color perception can help define a fidelity measure for color image coding. A simple visual model is considered, which defines a non-linear transformation of the red, green, and blue image components into functions that are more closely related to the perceived colored stimuli. The effects of the transformations on image coding efficiency are discussed, and an optimal coder, which minimizes the average distortion of these "perceptual" image functions, is simulated. Results provide a tentative quality yardstick for actual coders operating at the same bit rate, but further experiments will be required to substantiate the results.
This paper describes a rather fundamental conflict between the theoretical and actual performance of differential pulse code modulation (DPCM) when operating on image data. Under the ubiquitous assumption of an exponential image autocorrelation function (Markov Model) and unity predictor weights in the DPCM loop, the one dimensional DPCM (1-D DPCM) predictor is theoretically "better than" the 2-D DPCM predictor. This result, we believe, conflicts with both intuition and practice. We begin by defining DPCM, the predictor error, and the exponential autocorrelation model. Theoretical performance measures for 1- and 2-D DPCM are computed and compared with the results for actual image data. Finally, we discuss possible reasons for the apparent discrepancies between theory and practice.
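The stated reversal can be checked numerically. Assuming an isotropic exponential autocorrelation R(Δr, Δc) = ρ^√(Δr²+Δc²), which is one common reading of the Markov model (the paper's exact assumptions may differ), the unity-weight 1-D predictor indeed yields a smaller theoretical error variance than the 2-D planar predictor:

```python
import numpy as np

def pred_error(rho, offsets, weights):
    """Error variance of a fixed-weight DPCM predictor under an isotropic
    exponential autocorrelation R(dr, dc) = rho ** sqrt(dr^2 + dc^2),
    with unit signal variance. The error is x - sum(w_i * neighbour_i),
    so its variance is w' * Cov * w over [x, neighbours]."""
    pts = [(0, 0)] + offsets
    w = np.array([1.0] + weights)
    n = len(pts)
    cov = np.empty((n, n))
    for i in range(n):
        for j in range(n):
            dr = pts[i][0] - pts[j][0]
            dc = pts[i][1] - pts[j][1]
            cov[i, j] = rho ** np.hypot(dr, dc)
    return w @ cov @ w

rho = 0.95
e1 = pred_error(rho, [(0, -1)], [-1.0])                     # 1-D: previous pel
e2 = pred_error(rho, [(0, -1), (-1, 0), (-1, -1)],
                [-1.0, -1.0, 1.0])                          # 2-D planar predictor
```

For ρ = 0.95 the 1-D error variance is 2(1 − ρ) = 0.10, while the 2-D value 4 − 8ρ + 4ρ^√2 is slightly larger, illustrating the counterintuitive theoretical ordering the paper examines.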
In this presentation some of the more recent results in frame-to-frame coding of television will be reviewed with an eye toward pointing out the advantages and disadvantages of applying these techniques to image transmission or image storage. Interframe coding, as the name implies, takes advantage of frame-to-frame redundancy in a television signal, of which there is a considerable amount if movement in the scene being televised is not excessive. In videotelephone or conference TV applications, for example, it has been known for some time [1, 2] that 1 MHz, 271-line, monochrome signals can be coded using between 3/4 and 1 bit per picture element (pel), or 1.5 to 2 megabits per second, compared with the 16 Mb/s required for 8-bit PCM. This bit-rate reduction is possible because in videotelephone or TV conferencing the cameras are usually stationary, with very little zooming or panning taking place, so the pictures produced typically contain large stationary areas which do not change from frame to frame. These areas do not have to be transmitted; at the receiver they can be obtained simply by repeating the information from the previous frame. Transmitting only the pels which change significantly from frame to frame is called "conditional replenishment."
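Conditional replenishment as described can be sketched in a few lines: only pels whose frame difference exceeds a significance threshold are addressed and transmitted, and the receiver repeats the previous frame's values elsewhere. The simple magnitude test is an illustrative stand-in for whatever significance criterion a real coder uses.

```python
import numpy as np

def conditional_replenishment(prev_recon, new_frame, thresh):
    """Sketch: transmit only pels whose frame difference is significant;
    the receiver repeats the previous frame's value everywhere else."""
    changed = np.abs(new_frame - prev_recon) > thresh
    addresses = np.argwhere(changed)          # pel positions to transmit
    values = new_frame[changed]               # new intensities at those pels
    # receiver side: start from the previous frame, patch the changed pels
    recon = prev_recon.copy()
    recon[changed] = values
    return addresses, values, recon
```

In a largely stationary scene the number of transmitted (address, value) pairs is a small fraction of the frame, which is the source of the bit-rate reduction quoted above.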
A binary facsimile coding technique capable of coding documents at compression ratios from 10:1 to 40:1 has been developed. In the basic system, individual document characters are isolated into small rectangular blocks. The location, size, and contents of each block are coded for transmission. A more sophisticated version of the coder utilizes matching of a character with previously scanned characters. If a match exists, a code word indicating the match is coded along with the character location.
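The basic system's character isolation step can be sketched as connected-component analysis followed by bounding-box extraction; each block is then described by its location, size, and contents. The 8-connectivity choice is an assumption, and the character-matching refinement is not modeled.

```python
import numpy as np
from collections import deque

def block_code(page):
    """Sketch of the basic scheme: isolate each connected black region into
    a rectangular block and emit (row, col, height, width, contents)."""
    h, w = page.shape
    seen = np.zeros((h, w), dtype=bool)
    blocks = []
    for r in range(h):
        for c in range(w):
            if page[r, c] and not seen[r, c]:
                q = deque([(r, c)])           # flood-fill one character
                seen[r, c] = True
                cells = []
                while q:
                    y, x = q.popleft()
                    cells.append((y, x))
                    for dy in (-1, 0, 1):
                        for dx in (-1, 0, 1):
                            ny, nx = y + dy, x + dx
                            if (0 <= ny < h and 0 <= nx < w
                                    and page[ny, nx] and not seen[ny, nx]):
                                seen[ny, nx] = True
                                q.append((ny, nx))
                ys = [y for y, _ in cells]
                xs = [x for _, x in cells]
                r0, c0 = min(ys), min(xs)
                hh, ww = max(ys) - r0 + 1, max(xs) - c0 + 1
                blocks.append((r0, c0, hh, ww,
                               page[r0:r0 + hh, c0:c0 + ww].copy()))
    return blocks
```

The more sophisticated coder would compare each extracted block against previously transmitted ones and send only a match code plus the location when a match is found.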
Three discrete cosine transform architectures are modeled to include the computational limitations expected in hardware implementations. The models are used to compare the performance of architectures using the chirp Z algorithm, prime transform algorithm, and a fast digital algorithm. The models have been simulated on a digital computer, and the resulting images are presented.
The channel rate equalization problem inherent in a variable rate coding problem is analysed in this paper. Specific solutions are developed for adaptive transform coding algorithms. The actual algorithms depend on either pretransform or post-transform buffering. Simulations indicate small performance variations between the techniques.
Historically, the data compression techniques utilized to process image data have been Unitary Transform encoding or time domain encoding. Recently, these two approaches have been combined into a hybrid transform domain time domain system. The hybrid system incorporates some of the advantages of both concepts and eliminates some of the disadvantages of each. However, the problems of picture statistics dependence and error propagation still exist. This is due to the fact that the transformed coefficients are non-stationary processes, which implies that a constant DPCM coefficient set cannot be optimal for all scenes. In this paper, an approach is suggested that has the potential of eliminating or greatly alleviating these problems. The approach utilizes modern adaptive estimation and identification theory techniques to "learn" the picture statistics in real time so that an optimal set of coefficients can be identified as the signal statistics change. In this way, the dependency of the system on the picture statistics is greatly reduced. Furthermore, by updating and transmitting a new set of predictor coefficients periodically, the channel error propagation problem is alleviated.
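Real-time "learning" of predictor coefficients can be illustrated with a stochastic-gradient (LMS) update, one standard identification technique; the abstract does not specify which adaptive estimation method the paper actually uses.

```python
import numpy as np

def lms_dpcm(signal, n_taps, mu):
    """Sketch: adaptively identify DPCM predictor coefficients with an LMS
    update, so the predictor tracks changing signal statistics."""
    w = np.zeros(n_taps)          # predictor coefficients being learned
    hist = np.zeros(n_taps)       # most recent samples, newest first
    residuals = []
    for s in signal:
        pred = w @ hist
        err = s - pred            # prediction residual (what DPCM would code)
        residuals.append(err)
        w = w + mu * err * hist   # gradient step toward lower error power
        hist = np.roll(hist, 1)
        hist[0] = s
    return w, np.array(residuals)
```

On a first-order autoregressive source the single learned coefficient settles near the true regression weight, and the residual power drops well below the signal power.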
Tsypkin's algorithm for optimum, adaptive, recursive quantization of data processes with a priori unknown statistics was simulated. Gaussian and random sinewave data types were used. It is shown in this paper how to choose the initial conditions and gain factors, which Tsypkin left incompletely specified in his English reports, so that the algorithm produces quantizers close to the known optimum quantizers. The cases of 2, 4, and 8 levels are demonstrated. To the author's knowledge, this is the first report on an application of the Tsypkin algorithm in the literature. Further study of Tsypkin's algorithm is recommended.
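Tsypkin's algorithm itself is not reproduced here; as an illustrative stand-in, the following stochastic-approximation recursion (a running-mean update of each reconstruction level, in the same spirit of recursively adapting a quantizer to a priori unknown statistics) shows the general mechanism.

```python
import numpy as np

def adapt_quantizer(samples, levels, gain0):
    """Sketch of recursive quantizer adaptation by stochastic approximation:
    each sample nudges its nearest reconstruction level toward it with a
    decreasing gain, driving the levels toward the conditional means of
    their quantization cells (a local optimum of the MSE)."""
    levels = np.array(levels, dtype=float)
    counts = np.zeros(len(levels))
    for s in samples:
        i = int(np.argmin(np.abs(levels - s)))       # cell containing s
        counts[i] += 1
        levels[i] += (gain0 / counts[i]) * (s - levels[i])  # decreasing gain
    return levels
```

With gain0 = 1 each level becomes the running mean of the samples assigned to it; on unit Gaussian data a two-level quantizer converges near the known optimum levels of about ±0.798.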