The Discrete Cosine Transform (DCT) followed by scaling and quantization is an important
operation in image processing. Because of the scaling, the DCT itself need not be
computed; a scalar multiple of the DCT suffices, with appropriate compensation
incorporated into the scaling. We present a fast method for computing such scaled output of
the 2-dimensional DCT on 8 x 8 points. We also present a similar algorithm for the inverse transform.
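One way to state the idea in symbols (our notation, not the paper's): if a fast algorithm delivers scaled coefficients d_{uv} X_{uv} instead of the true DCT coefficients X_{uv}, the diagonal scale factors can be absorbed into the quantization step sizes, leaving the quantized output unchanged:

$$\left\lfloor \frac{\hat X_{uv}}{q_{uv}} \right\rceil \;=\; \left\lfloor \frac{d_{uv}\,\hat X_{uv}}{q'_{uv}} \right\rceil, \qquad q'_{uv} = d_{uv}\,q_{uv}.$$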
Because of the ease of design and efficiency of
implementation, the McClellan transform is by far the most
popular and successful method for designing multi-dimensional finite
impulse response filters for image processing. The filters can
be implemented efficiently by the direct structure, cascade
structure and Chebyshev structure, with the number of operators required
proportional to N rather than N**2. The Chebyshev structure has the
best roundoff noise performance, but it has the drawbacks of
large storage requirements and a more complicated architecture.
Although the output roundoff noise of the cascade structure is
larger than that of the Chebyshev structure, the two are comparable.
In this paper we find that the output roundoff noise of the
cascade structure can be further reduced and the resulting structure
still has the same efficient implementation. We provide two
methods to reduce the output roundoff noise: (1) utilize the full
dynamic range of each stage, and (2) reduce the possible noise
sources in each stage. The cascade structure has the advantages of
low sensitivity and inherent pipelinability. It is also very
well suited to implementation with today's modular image processing
hardware. So, with the output roundoff noise reduced, the cascade
structure proves itself to be a suitable structure for the
fixed-point implementation of filters designed by the McClellan transform.
"Cascade coding," a technique of double coding an image is introduced in
this paper. Blocks of the image are first transform-coded and the retained
coefficients of the transform are then quantized by a Block Truncation Coding
C BTC) algorithm for transmission or storage. Upon reception or recall, the
quantized transform coefficients are used in the inverse transform to
reconstruct the image. The new method combines the spatial correlation
characteristics of the transform methods with the ease of implementation of the
BTC. Illustrations presented here on sub-bit image coding, shows it to perform
consistently better than straight Cosine Transform (DCI') coding.
This paper presents applications of Two Dimensional Convolute Integer Operators. These Operators are
frequency sensitive, feature selecting, classical replacement point and interstitial point convolvers. Interstitial point
generation permits enhanced high frequency digital magnification. These Operators are mathematically equivalent to
two dimensional partial derivatives.
Images varying from a frequency response test pattern to GOES data display roll-off from four different noise
Operators. High and low frequency gradients, resolution enhancements, three dimensional effects, Laplacians,
divergence, first and second order two dimensional partial derivatives and enhanced high frequency digital
magnification are demonstrated.
Two Dimensional Convolute Integer Technology generates image processing Operators for high speed two
dimensional frequency sensitive, feature enhancing, theoretically correct convolutions, interstitial point generation and
spurious value replacement. These Operators are a correct approach towards curl, divergence, Laplacian, gradients and
high frequency digital magnification.
Enhanced high frequency digital magnification doubles the number of lines per image and the number of
picture elements per line, without stairstepping diagonals or curves and blotches of blocky gray level areas associated
with magnification by pixel replication.
Due to the theoretical symmetry properties, multiplications per convolution are reduced by approximately 75%,
so they are well suited for video rate hardware convolutions.
This paper presents a method combining shape and shading models in order to obtain estimates
of 3D shape parameters directly from image grey values. The problem is considered as an
application of optimal parameter estimation theory, according to Liebelt [8]. This theory has been
applied previously, with the emphasis on time-delay [2] and motion estimation [3, 5, 9]. It
is applied here to provide an environment in which somewhat more complicated models can be
designed with relative ease and to indicate how the behaviour of the parameters can be
investigated. A shading model is added, offering explicit prediction of image grey values. We
consider the problem for a single image and for an image pair, showing the shade of the object at
two consecutive points of time. The latter problem also requires a model for the motion of the body.
The resulting non-linear estimation problem is linearized about the last parameter guess [8], so that a
linear estimator can be applied to compute a new estimate. The various stages of the modelling
process are separated by introducing several coordinate systems. Coordinate transformations will
show the object from other points of view, and perform an orthographic projection of the 3D scene
into the 2D image plane. The explicit grey value prediction yields a template, having a definite
extent in the image. Because of the shading model, this method requires no gradient images, as in
the case of motion estimation [6] or stereo [5]. The gradients can be computed analytically. To
demonstrate the usefulness and the flexibility of our method, we consider a solid cylinder,
irradiated with X-rays. The image is a shadow image originating from the absorption of radiation
by the cylinder.
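In symbols (our notation; the specific linear estimator shown here, a Gauss-Newton least-squares step, is an assumption): with observed grey values y, predicted grey values g(p) and the analytically computed Jacobian J evaluated at the last guess p_0,

$$g(\mathbf p) \approx g(\mathbf p_0) + J\,(\mathbf p - \mathbf p_0), \qquad \hat{\mathbf p} = \mathbf p_0 + \left(J^{\mathsf T} J\right)^{-1} J^{\mathsf T}\bigl(\mathbf y - g(\mathbf p_0)\bigr).$$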
In section 2 some background is given about the theory of parameter estimation from digital
images. In section 3 the various models for the shape and motion of the body and the imaging
process are given. In sections 4 and 5 we investigate the properties of the estimator. In section 4,
identifiability and uniqueness of the parameters are considered, yielding the parameters that can
be estimated uniquely from the image data. In section 5 some examples are given, elucidating the
stability properties of the algorithm.
To conclude, we mention the possibility of replacing the motion model with a model connecting
images taken from two different positions. Thus this method is also suited to handling a stereo problem.
The DE-1 satellite has gathered over 500,000 images of the Earth's aurora. Finding the location and shape of
the boundaries of the oval is of interest to geophysicists but manual extraction of the boundaries is extremely time
consuming. This paper describes a computer vision system that automatically provides an estimate of the inner
auroral boundary for winter hemisphere scenes. The system performs automatic checks of its boundary estimate.
If the boundary estimate is deemed inconsistent, the system does not output it. The performance of this system is
evaluated using 44 DE-1 images. The system provides boundary estimates for 37 of the inputs. Of these 37 estimates,
31 are consistent with the corresponding manual estimates. At this level of performance, the supervised use of the
system provides more than an order of magnitude increase in throughput compared to manual extraction of the
boundaries.
This paper describes a new method for identifying scene images following the example of human
vision. In identifying the scene, the major problems are the differences between the two images, namely the
difference in the field of view and the difference in color spectrum distribution, which result from
variations in the photographing conditions. We propose that these problems can be solved by describing
the rough structure of the scene.
NEWV.EEW is a highly interactive software environment designed especially for the manipulation, processing,
analysis and display of digital image (two-dimensional) data. It is designed using the paradigm of an algorithm
developer's workbench to support a wide variety of digital image processing applications, from remote sensing to
desktop publishing. The system consists of a comprehensive library of image processing algorithms and a library of
fast, novel, raster rendering routines for display manipulation. Combined with a mouse-driven, multi-window
display manager, it provides a unified and versatile environment for the development and testing of image
processing algorithms.
The Space-Interval Probability Distribution (SIPD) has been successfully used for photon-limited
images or signals. In this work the SIPD is used for binary images obtained from the
clipping of well-illuminated images. A statistical description of the SIPD and its behavior
under partial image modifications is also given.
We propose the use of a compressive function of a contrast measure ( - operator) as the spread of
a Gaussian, to be used in the context of the Intensity-Dependent Spread (IDS) and generalized IDS
filters. It is shown that the new approach improves the performance of the IDS filter in smoothing
(noise reduction/rejection) and avoids blurring of closely spaced edges, even when they are under low and
nonuniform illumination. Simulation results verify that the CDS filter is one
order of magnitude faster than the IDS filter in 1D. Illustrative examples comparing the two filters are presented.
This paper describes a novel approach to designing and applying spatial transformations to images, a technique that
has only recently been exploited. The concept is to create and save the transformation as a look-up table (LUT) and then
to use the look-up table to control the image resampling. This approach is very flexible; it is even amenable to
transformations that cannot be implemented using classical approaches.
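A minimal sketch of the idea (Python; function names, the nearest-neighbour resampling and the example transform are our assumptions for illustration, not details from the paper): the transformation is evaluated once per output pixel and stored as a table of source coordinates, which is then reused to resample any image of the same size.

    import numpy as np

    def build_lut(height, width, transform):
        # Evaluate the spatial transformation once and store source coordinates.
        ys, xs = np.mgrid[0:height, 0:width]
        src_y, src_x = transform(ys, xs)
        lut_y = np.clip(np.rint(src_y), 0, height - 1).astype(np.int32)
        lut_x = np.clip(np.rint(src_x), 0, width - 1).astype(np.int32)
        return lut_y, lut_x

    def resample(image, lut):
        lut_y, lut_x = lut
        return image[lut_y, lut_x]          # nearest-neighbour resampling via the table

    # Hypothetical example transform: a 10-degree rotation about the image centre.
    def rotate10(ys, xs, h=256, w=256, a=np.deg2rad(10)):
        cy, cx = (h - 1) / 2, (w - 1) / 2
        return (cy + (ys - cy) * np.cos(a) - (xs - cx) * np.sin(a),
                cx + (ys - cy) * np.sin(a) + (xs - cx) * np.cos(a))

Once the table is built, applying the transformation to further images reduces to a single indexing pass, which is what makes the approach flexible even for mappings with no closed-form inverse.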
This paper reports on the progress toward the design and implementation of a general
purpose image exploration, analysis, and annotation program that will deal with a
wide variety of 2D and 3D image sizes and pixel formats.
Digital Pulsed Laser Velocimetry (DPLV) is a novel full-field, two dimensional, noninvasive, quantitative flow
visualization technique. The technique described here includes the use of direct digitization of the images for flow
analysis using a high resolution imaging system. The image data is stored for further analysis by a series of new image
processing and analysis software developed for flow experiments.
The image processing and analysis software developed includes a compression program for reducing the storage
requirements of the image data to 10%. An image finding, smoothing, and defining program has also been developed.
Analysis time has been greatly reduced and the software now runs on a PC/AT compatible. This program groups
pixels that could logically be defined as one image, smooths that image and calculates important parameters for the image.
In the technique, images are acquired via a high resolution camera (1024 x 1024); ten consecutive frames of data, separated by a
time increment of 150 ms, are recorded. Each of these ten frames contains the images of the particles at one instant of
time. A third computer program was developed to match the images from each of the frames into tracks of the particles
through time. The program uses a statistical technique to determine the best possible path of the fluid seeds.
The most important capability of pulsed laser velocimetry with these image processing techniques is the capture of
simultaneous and quantitative, rather than qualitative, information.
Classification based on the minimum distance classifier has
been found to take less computing time than any of the maximum
likelihood classifiers. An efficient algorithm for classifying image
data based on the threshold distance from the 'means' of the classes
is presented. The naive algorithm computes the distance of a pixel
from every class mean and the pixel is classified to the nearest
class. Bryant reduced the number of computations by first
calculating the distance of the pixel from the class to which the
previous pixel was assigned, and truncating the computation of
the distance from subsequent classes whenever the latter is greater.
In the algorithm presented, the computation of the distances from
subsequent classes is avoided in most cases by finding a threshold
distance for each class: if the pixel to be classified lies
within the threshold distance of a class, it is classified to that
class. The minimum of the distances of the class mean from all
other class means is calculated; the threshold value of the class
is half this minimum. This algorithm is computationally efficient.
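A minimal sketch of the threshold test described above (Python; the array layout and the early-exit order are our assumptions):

    import numpy as np

    def class_thresholds(means):
        # means: (C, B) array of class means; threshold = half the distance
        # from each class mean to its nearest other class mean.
        d = np.linalg.norm(means[:, None, :] - means[None, :, :], axis=2)
        np.fill_diagonal(d, np.inf)
        return d.min(axis=1) / 2.0

    def classify_pixel(x, means, thresholds, start=0):
        # Begin with the class of the previously classified pixel (Bryant's idea)
        # and stop as soon as a class's threshold test succeeds.
        order = list(range(start, len(means))) + list(range(0, start))
        best, best_d = None, np.inf
        for c in order:
            d = np.linalg.norm(x - means[c])
            if d <= thresholds[c]:
                return c                    # within threshold: classify immediately
            if d < best_d:
                best, best_d = c, d
        return best                         # otherwise fall back to the nearest mean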
Multilevel unitary wavelet transform methods for image compression are described. The sub-band
decomposition preserves geometric image structure within each sub-band or level. This yields a multilevel
image representation. The use of orthonormal bases of compactly supported wavelets to represent a
discrete signal in 2 dimensions yields a localized representation of coefficient energy. Subsequent coding
of the multiresolution representation is achieved through techniques such as scalar/vector quantization,
hierarchical quantization, entropy coding, and non-linear prediction to achieve compression.
Performance advantages over the Discrete Cosine Transform are discussed. These include reduction of
errors and artifacts typical of Fourier-based spectral methods, such as frequency-domain quantization
noise and the Gibbs phenomenon. The wavelet method also eliminates distortion arising from data
blocking. The paper includes a quick review of past and present compression techniques, with special
attention paid to the Haar transform, the simplest wavelet transform, and to conventional Fourier-based
subband coding. Computational results are presented.
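For concreteness, a one-level 2-D Haar decomposition (the simplest wavelet, as noted above) is sketched below in Python; repeating it on the low-low band yields the multilevel representation. This is a generic illustration under our own assumptions, not the paper's specific coder (image dimensions assumed even).

    import numpy as np

    def haar2d(img):
        a = (img[0::2, :] + img[1::2, :]) / np.sqrt(2.0)   # rows: low-pass
        d = (img[0::2, :] - img[1::2, :]) / np.sqrt(2.0)   # rows: high-pass
        ll = (a[:, 0::2] + a[:, 1::2]) / np.sqrt(2.0)
        lh = (a[:, 0::2] - a[:, 1::2]) / np.sqrt(2.0)
        hl = (d[:, 0::2] + d[:, 1::2]) / np.sqrt(2.0)
        hh = (d[:, 0::2] - d[:, 1::2]) / np.sqrt(2.0)
        return ll, lh, hl, hh                              # four half-resolution subbands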
In this paper we develop two entropy-coded subband image coding schemes. The difference between
these schemes is the procedure used for encoding the lowest frequency subband: predictive coding is
used in one system and transform coding in the other. Other subbands are encoded using zero-memory
quantization. After a careful study of subband statistics, the quantization parameters, the corresponding
Huffman codes and the bit allocation among subbands are all optimized. It is shown that both schemes
perform considerably better than the scheme developed by Woods and O'Neil. Roughly speaking,
these new schemes perform the same as that scheme at half the encoding rate. To make a complete
comparison against those results, we have studied the performance of the two schemes developed here,
as well as that of the earlier scheme, in the presence of channel noise. After developing a codeword packetization
scheme, we demonstrate that one of the schemes exhibits significantly higher robustness against transmission errors.
The microgravity experiments to be implemented on Space Station Freedom generate
gigabytes of PCM data daily. The limited bandwidth of the NASA Tracking and Data
Relay Satellite System (TDRSS) will require the images from these experiments to be
compressed before they are transmitted to the ground control centers. The matrix of
compression ratios required for these experiments suggests compression ratios of 2:1 for
lossless coding and 2:1 to 6:1 for lossy predictive coding. For the Space Station program,
it would be highly desirable to implement one baseline compression system which would
meet both of these criteria.
This paper presents a LZW (Lempel-Ziv-Welch) hybrid coding system which is adaptable
to either mode of operation. The system has been designed to provide the scientist with
control over the data-quality versus data-volume trade-off. This control may be made
through the command and control system while an experiment is under way. This system
also has the ability to compress interleaved telemetered data while operating in the lossless mode.
This paper presents a two-dimensional code excited linear prediction (CELP) method for image coding.
This method is a two-dimensional extension of the CELP systems commonly used for speech coding. The
decoder is identical to a conventional DPCM decoder. However, at the encoder, the input images are first
decomposed into disjoint blocks. A single codeword from a table of N codewords is used to represent the
vector of quantized residuals for each block. The encoder selects the appropriate codeword by reconstructing N
versions of the current block, using each of the N vectors of the codebook. The index of the codeword giving
the least distortion is then transmitted. In designing the codebook, the LBG method of clustering failed
to converge, but we succeeded in finding a deterministic codebook based on a training set using the method
of successive clustering. The system has been extended by using adaptive prediction, where one of K possible
prediction filters is used for each block; the encoder chooses the prediction filter that results in the least mean
squared prediction error. An index is transmitted to the decoder indicating which prediction filter has been
used. With no additional overhead, K different codebooks can be used, corresponding to each of the prediction
filters. We have tested this system using five predictors. The five predictors were initially selected to give
good performance on different types of image material, e.g. edges of different orientation, and then refined by
minimizing the mean square prediction error on those pixels for which the initial predictor gave the lowest mean
squared error.
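A rough sketch of the encoder's exhaustive codebook search (Python): each of the N residual codewords is run through a decoder-identical reconstruction of the current block and the index giving the least distortion is kept. The simple first-order 1-D predictor used here is a placeholder assumption; the paper uses 2-D adaptive prediction.

    import numpy as np

    def encode_block(block, codebook, a=0.95):
        # codebook: (N, block_size) array of residual codewords
        flat = block.astype(float).reshape(-1)
        best_idx, best_err = 0, np.inf
        for idx, residuals in enumerate(codebook):
            recon, prev = np.empty_like(flat), 0.0
            for i, r in enumerate(residuals):
                prev = a * prev + r          # decoder-identical DPCM reconstruction
                recon[i] = prev
            err = np.sum((flat - recon) ** 2)
            if err < best_err:
                best_idx, best_err = idx, err
        return best_idx                      # only this index is transmitted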
The Vector Quantizer encoding process is based on a codebook designed to minimize some performance criterion.
The codebook is formed through the use of long training sequences, which are considered to be in the same
class as the source data to be encoded. With the penalty of a long training process, the approach has been
used successfully to encode speech and image signals. In this paper, we describe a model which generates image
signals suitable for coding with a stationary codebook. In this model, the image signal is represented by a
zero mean Gaussian stochastic process. Each block of n*n samples of the stochastic process is encoded into
one of M randomly generated Gaussian sequences of length n*n by maximizing the signal to noise ratio.
We find that the model can achieve an acceptable quality of coded image at low bit rates and low complexity.
In this paper, a simple image analysis/synthesis technique is proposed which provides exact
reconstruction of an image from a set of subimages with the overall number of samples of all
subimages being equal to the number of samples of the original image and each subimage extracting
a context-dependent feature. The technique has been developed in terms of nonoverlapping
divisions of an image followed by nonsingular linear transformations of the individual blocks.
The Hadamard transform is an example of a typical linear transformation that can be used in
the proposed technique. The above technique appears attractive in image processing, in particular,
in image coding. Basically there are two fundamental coding techniques: predictive coding
and transform coding. The proposed analysis/synthesis technique could be used for subband
coding of images using any of these two coding techniques. By making use of the correlation
between the subimages and setting samples of subimages to zero when their variances are less
than a given threshold, it is possible to improve the coding efficiency. In addition, we can use
the proposed technique to implement a two-source coding scheme.
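A small sketch of the block analysis/synthesis idea (Python) using a 2x2 orthonormal Hadamard transform as one possible nonsingular linear transformation: the subimages are rearranged so each holds one transform coefficient per block, and the total sample count equals that of the original image. The details below are our own assumptions for illustration.

    import numpy as np

    H = np.array([[1, 1], [1, -1]]) / np.sqrt(2.0)     # orthonormal 2x2 Hadamard

    def analyze(img):
        h, w = img.shape                                # dimensions assumed even
        blocks = img.reshape(h // 2, 2, w // 2, 2).transpose(0, 2, 1, 3)
        coeff = H @ blocks @ H.T                        # transform each 2x2 block
        # four subimages, one per coefficient position
        return [coeff[:, :, i, j] for i in range(2) for j in range(2)]

    def synthesize(subimages):
        n, m = subimages[0].shape
        coeff = np.stack(subimages, axis=-1).reshape(n, m, 2, 2)
        blocks = H.T @ coeff @ H                        # exact inverse (H is orthonormal)
        return blocks.transpose(0, 2, 1, 3).reshape(2 * n, 2 * m)

Because the block transform is nonsingular, synthesize(analyze(img)) reproduces the image exactly, which is the exact-reconstruction property claimed above.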
A digital still camera (DS camera) is composed of a CCD (charge coupled device), an A/D converter, a
signal processing block, and an IC memory card as the medium for image data storage (Fig.
1). The CCD output signal is digitized and processed in the DS camera to be stored on the IC memory card.
Pyramid data structures have found an important role in progressive image transmission.
In these data structures, the image is hierarchically represented where each level corresponds to
a reduced-resolution approximation. To achieve progressive image transmission, the pyramid is
transmitted starting from the top level. However, in the usual pyramid data structures, extra
significant bits may be required to accurately record the node values, the number of data to be
transmitted may be expanded and the node values may be highly correlated. In this paper, we
introduce a reduced-difference pyramid data structure where the number of nodes, corresponding
to a set of decorrelated difference values, is exactly equal to the number of pixels. Experimental
results demonstrate that the reduced-difference pyramid results in lossless progressive image
transmission with some degree of compression. By using an appropriate interpolation method,
reasonable quality approximations are achieved at a bit rate less than 0.1 bits/pixel and excellent
quality at a bit rate of about 1.2 bits/pixel.
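One way to realize the node-count property described above is sketched below (Python). This is an illustrative mean-pyramid variant under our own assumptions, not necessarily the paper's exact reduced-difference pyramid: the top node is sent first, and for each 2x2 group only three differences from its parent are sent, the fourth child being implied by the parent mean, so the number of transmitted nodes equals the number of pixels and reconstruction is lossless.

    import numpy as np

    def build(img):
        # Mean pyramid from the full image up to a single top node
        # (square image with power-of-two side assumed).
        levels = [img.astype(float)]
        while levels[-1].shape[0] > 1:
            x = levels[-1]
            levels.append((x[0::2, 0::2] + x[0::2, 1::2] +
                           x[1::2, 0::2] + x[1::2, 1::2]) / 4.0)
        return levels

    def encode(levels):
        stream = [levels[-1][0, 0]]                    # top of the pyramid first
        for parent, child in zip(levels[::-1], levels[::-1][1:]):
            d1 = child[0::2, 0::2] - parent            # three differences per 2x2 group
            d2 = child[0::2, 1::2] - parent
            d3 = child[1::2, 0::2] - parent
            stream.append((d1, d2, d3))
        return stream

    def decode(stream):
        rec = np.array([[stream[0]]])
        for d1, d2, d3 in stream[1:]:
            child = np.empty((rec.shape[0] * 2, rec.shape[1] * 2))
            child[0::2, 0::2] = rec + d1
            child[0::2, 1::2] = rec + d2
            child[1::2, 0::2] = rec + d3
            # fourth child is implied by the parent mean
            child[1::2, 1::2] = 4 * rec - (child[0::2, 0::2] +
                                           child[0::2, 1::2] + child[1::2, 0::2])
            rec = child
        return rec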
There are an increasing number of digital image processing systems that employ photographic image
capture; that is, a color photographic negative or transparency is digitally scanned, compressed, and stored
or transmitted for further use. To capture the information content that a photographic color negative is
capable of delivering, it must be scanned at a pixel resolution of at least 50 pixels/mm. This type of
high quality imagery presents certain problems and opportunities in image coding that are not present in
lower resolution systems. Firstly, photographic granularity increases the entropy of a scanned negative,
limiting the extent to which entropy encoding can compress the scanned record. Secondly, any MTF-related
chemical enhancement that is incorporated into a film tends to reduce the pixel-to-pixel correlation
that most compression schemes attempt to exploit. This study examines the effect of noise and MTF
on the compressibility of scanned photographic images by establishing experimental information theoretic
bounds. Images used for this study were corrupted with noise via a computer model of photographic grain
and an MTF model of blur and chemical edge enhancement. The measured bounds are expressed in terms
of the entropy of a variety of decomposed image records (e.g., DPCM predictor error) for a zeroth-order
Markov-based entropy encoder, and for a context model used by the Q-coder. The results show that the
entropy of the DPCM predictor error is 3-5 bits/pixel, illustrating a 2 bits/pixel difference between an
ideal grain-free case and a grainy film case. This suggests that an ideal noise filtering algorithm could
lower the bit rate by as much as 50%.
This paper describes the color imaging system in terms of vector space notation. This includes the effects of the
scanning filters and the response of the eye as defined by the CIE color matching functions. This formulation
allows many image processing techniques to be generalized to include more accurate models. The problem of image
restoration is used as an example for the vector space approach. The problem is presented in hierarchical steps.
Color scanning and the effect of the human observer are presented first. The problem is extended to spatial representation
and spatial processing. Finally, the effects of image reproduction are considered. The assumptions at each step of
the modelling process are made explicit and simplifications are noted. The choice of defining the most appropriate
optimization function is considered with respect to mathematical tractability as well as subjective accuracy.
We propose a new algorithm applicable to a variety of real-time color space
transformations: color correction for hardcopy systems, perceptual color
control in CIE-LAB space, color coordinate conversions, and color
recognition. This algorithm consists of color look up tables and a new 3D
color space interpolator. This interpolator makes it easy to design a simple
real-time color processor. The simulation shows how the flexible
transformations can be performed without degrading the color and tone.
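A hedged sketch of the table-plus-interpolator idea (Python): a coarse 3D LUT holds the output colour at grid points of the input space, and a trilinear interpolator fills in values between grid points. The LUT layout, value range and function names are our assumptions, not the paper's.

    import numpy as np

    def apply_3d_lut(rgb, lut):
        # rgb: float array in [0, 1], shape (..., 3); lut: shape (N, N, N, 3)
        n = lut.shape[0] - 1
        p = np.clip(rgb, 0.0, 1.0) * n
        i = np.minimum(p.astype(int), n - 1)           # lower grid index per channel
        f = p - i                                      # fractional position in the cell
        out = np.zeros(rgb.shape)
        for dr in (0, 1):                              # accumulate the 8 corner contributions
            for dg in (0, 1):
                for db in (0, 1):
                    w = (np.where(dr, f[..., 0], 1 - f[..., 0]) *
                         np.where(dg, f[..., 1], 1 - f[..., 1]) *
                         np.where(db, f[..., 2], 1 - f[..., 2]))
                    out += w[..., None] * lut[i[..., 0] + dr,
                                              i[..., 1] + dg,
                                              i[..., 2] + db]
        return out

Because the table is small and the interpolation involves only a handful of multiply-adds per pixel, the same structure maps naturally onto a simple real-time hardware processor.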
The Orbital Maneuvering Vehicle (OMV) is an unmanned spacecraft which is scheduled to be deployed in the early 1990's
by NASA. Its purpose is to relocate satellites and other orbiting objects in space, e.g., to reboost large observatories as
their orbits gradually decay. The OMV Video System (VS) was designed to be used for remote-controlled docking with
an orbiting object. The final approach and rendezvous with the OMV will be controlled by a ground-based pilot. The
OMV VS captures 5 frames of video data per second and compresses the data in each video frame by approximately 5.5 to 1
using a spatial resolution reduction scheme and a multi-mode intraframe coding scheme.
The OMV VS is crucial to a successful mission. The image quality must be sufficient for the pilot to precisely locate the
target object. The quantity of transmitted data (video and error correcting codes) must be limited due to communication
channel constraints. The compression hardware is constrained by power and heat dissipation limitations on the OMV.
The existing system and some possible improvements will be described.
In this paper a coding technique is presented which extends the methodology of
vector quantization to image sequence coding. The devised algorithm basically consists
of a processing stage in charge of building up the initial codebook, and a codebook
updating mechanism running on-line with the coding phase, to adaptively track the
varying statistics of the incoming images. The codebook is organized in a binary tree
structure where each leaf represents a reproduction vector. The tree grows
intrinsically unbalanced, in the sense that each node splits or not according to the
intensity statistics of the vector population which refers to it. By doing so, the
suboptimality of tree structured vector quantization with respect to full search
approaches is successfully overcome. During the coding phase the unbalanced tree is
left free to plastically track the temporal variations of the incoming data statistics. This
results either in some reproduction vectors being replaced by new ones, or in some
changes to the tree topology being carried out, typically a leaf which needs to split or
two leaves which, conversely, must be cut off. An updating code stream is periodically
delivered onto the channel, interlaced with coding data, to replenish the
reconstruction codebook. Application to videotelephone sequences has given promising
preliminary results which are presented and discussed.
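A rough sketch of the unbalanced-tree idea (Python): a leaf holds a reproduction vector (the centroid of the training vectors routed to it) and is split only where the vector population demands it. The population threshold and the simple centroid-perturbation split used here are illustrative assumptions, not the paper's design.

    import numpy as np

    class Node:
        def __init__(self, vectors):
            self.centroid = vectors.mean(axis=0)    # reproduction vector of this leaf
            self.vectors = vectors
            self.children = None

        def try_split(self, min_population=64):
            if len(self.vectors) < min_population:
                return                               # too few vectors: stay a leaf
            # perturb the centroid to seed two children, then one k-means-style pass
            c0, c1 = self.centroid * 1.01, self.centroid * 0.99
            d0 = np.sum((self.vectors - c0) ** 2, axis=1)
            d1 = np.sum((self.vectors - c1) ** 2, axis=1)
            left, right = self.vectors[d0 <= d1], self.vectors[d0 > d1]
            if len(left) and len(right):
                self.children = (Node(left), Node(right))
                for child in self.children:
                    child.try_split(min_population)  # the tree grows only where data demands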
In accordance with the features of an imaging spectrometer, a two-dimensional
compression has been converted into a series of one-dimensional
processing steps. A real-time "two-true-value linear prediction method",
which has the ability to compress the raw data on board in the
spectral direction, has been developed. This method can preserve
absorption band information during the compression/decompression process.
The results show that for most spectra a compression ratio of at least 2:1
can be obtained with 1% reconstruction accuracy; on average the
raw data rate can be reduced 3-4 times. The coding of the compressed
data and the reconstruction error are also described.
We describe a motion-compensated hybrid DCT/DPCM video compression scheme that incorporates arithmetic
coding. The scheme is based on a current ISO/CCITT standards proposal for compressing still images, with suitable
extensions to handle video sequences. Compared to Huffman coding, arithmetic coding can increase compression
efficiency for the same image quality or, alternately, improve image quality for the same transmission rate. We have
extended the use of arithmetic coding to motion vector data and motion-compensated interframe data. We have
also investigated the trade-offs in image quality, transmission bandwidth, and algorithm complexity between using
Huffman or arithmetic coding in motion video.
The variable bit rate concept has become one of the main
topics in the study of coding algorithms for packet video. These algorithms are
strongly dependent on the characteristics of the ATM network. In that sense, a
quasi-variable bit rate coding scheme for packet video, based on some
modifications of the codec in CCITT Rec. H.261, is presented in this paper.
The proposed solution consists in giving the codec the possibility of
switching the p value for each case, adapting it to the minimum value, with the
condition that the quality of the coded sequence is maintained.
After a full description of the introduced modifications, the results of
the simulations obtained with the proposed model and several standard sequences
will be presented.
This paper presents a solution for the hierarchical encoding of progressive HDTV sources, providing full
downwards and upwards compatibility between HDTV, TV and High Quality Videophony (HQVT).
As most of the video codecs at the present time are built with motion-compensated hybrid
predictive-transform structures, the solution is based on the splitting of transform coefficient blocks.
A basic coding structure providing compatibility between two successive levels of the hierarchy is presented.
A second point is devoted to the performance study of such a solution from the point of view of filtering
and aliasing. This is done in the light of multirate filter bank theory.
For low bit rate coding of moving video, motion information has to be exploited. Motion estimation algorithms
to be used in hybrid coding schemes with motion compensated prediction, e.g. three-step block matching, are chosen
with regard to computational requirements and/or estimation accuracy. But all algorithms, once employed, remain
fixed throughout coder operation and therefore lack any means of adaptivity to the scene.
The computational complexity of a block matching motion estimation scheme is directly proportional to the
number of search positions. Every search position requires the calculation of the error criterion. Employing codebook
generation techniques, as known from vector quantization, the selection and total number of search positions can be
optimized by a vectorbook design. The vectorbook may be tailored to the computational requirements. It can be
switched adaptively during coder operation (on-the-fly), in order to trade matching accuracy against computational
complexity. The vectorbook may also be altered adaptively to substitute rarely needed search positions by those
which are better suited for coding of the actual scene.
Due to the error criterion, which is based on averaged differences, the reliability of the displacement vector field is
strongly affected by changes in the overall illumination of a scene. Upon a change in mean brightness, blocks are usually
moved in order to fit the average brightness, to some degree regardless of the image structures inside the block.
By an illumination correction interlaced with motion estimation, block matching always works on mean-removed blocks.
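A small sketch of mean-removed block matching restricted to a designed set of search positions (a "vectorbook" of candidate displacements) is given below in Python; the candidate set, block size and function names are our assumptions for illustration only.

    import numpy as np

    def sad_mean_removed(block, cand):
        # sum of absolute differences after removing each block's mean
        return np.abs((block - block.mean()) - (cand - cand.mean())).sum()

    def match_block(cur, ref, y, x, size, vectorbook):
        block = cur[y:y + size, x:x + size]
        best, best_err = (0, 0), np.inf
        for dy, dx in vectorbook:                      # only the designed search positions
            yy, xx = y + dy, x + dx
            if 0 <= yy <= ref.shape[0] - size and 0 <= xx <= ref.shape[1] - size:
                err = sad_mean_removed(block, ref[yy:yy + size, xx:xx + size])
                if err < best_err:
                    best, best_err = (dy, dx), err
        return best, best_err

Swapping one vectorbook for a larger or smaller one at run time is what trades matching accuracy against computational complexity.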
We investigate high quality coding systems by exploiting both statistical and psychovisual video signal properties. Source image
filtering via Quadrature Mirror Filters (QMF) and quantization of the obtained subbands seem suitable to exploit both these
signal properties. All the investigated QMF filters are separable, non-recursive, and of two different types. The first type consists
of linear phase filters which allow, after synthesis, exact cancellation of all aliasing, but the overall characteristic of the
analysis-synthesis system has an only approximately constant amplitude. The second type is constituted by more sophisticated
filters which allow exact signal reconstruction and, naturally, cancellation of all aliasing; however, the single filters constituting
the bank can be nonlinear phase filters. These can be implemented by a lattice structure which induces, and thereby guarantees,
perfect reconstruction even if the lattice parameters are coarsely quantized. Uniform quantizers have been chosen for all
subbands, with DPCM for the lowest frequency band and PCM for the higher frequency bands. Certain quantizer parameters
are adapted to the type of filter used. Additional parameters are adaptively adjusted according to the statistics of the subbands.
This work aims at finding a satisfactory compromise between the characteristics of the QMF filter bank and those of the coders.
With an HDTV coding method for the interlaced format that is based on the DCT, mechanisms for local adaptivity are
described. There are two alternative adaptation methods addressing the special situation of the interlaced format. Further
mechanisms concern quality parameters such as a spatial frequency cut-off and the quantizer step size, which adapt to the local
image content. Results of a study which compares the different adaptive approaches show their influence on the image
quality. A high degree of quality can thus already be obtained. The application of a new motion compensation method
between the fields adds essential further quality reserves.
CCITT Study Group XV (Working Party XV/1) is charged with transmission systems. Under WP XV/1
a Specialists Group was established dealing with drafting recommendations for the second generation
sub-primary rate (n x 384 kbit/s, n = 1, ..., 5) or (p x 64 kbit/s, p = 1, ..., 30) video codecs.
The Specialists Group on Coding for Visual Telephony reaches its objectives by exchanging
results with the different partners involved (Europe, Japan, USA, Korea and Canada). During the study
period 1984-1989 the Specialists Group agreed upon the usage of a so-called reference model (hereafter
abbreviated RM) for simulation purposes. The specification for a flexible hardware is derived from these
simulations. In this paper a description of the reference model and the evolution towards the last reference
model (RM8), which is the basis for H.261, is given. The intention of this contribution is to show the
flexibility of the algorithm for different applications. The term universal approach therefore refers
to the usage of the algorithm for a range of possible applications. In a joint expert group of ISO/IEC,
the Moving Picture Experts Group (MPEG) ISO/IEC JTC1/SC2/WG8, work is carried out to select
a standard for the coded representation of moving images and sound for the provision of interactive moving
picture applications. In this expert group, members of the Specialists Group on Coding for Visual Telephony
are participating, trying to realize interworking between the standard in preparation by the joint expert group
MPEG and the coming CCITT Recommendation H.261. For the most important techniques used in the
CCITT Reference Model, among which are quantization, scanning, loop filter, entropy coding, multiple
versus single VLC and block type discrimination, theoretical information is provided and some examples of
possible improvements are included. A significant development is reported, i.e. a modification of the reported
n x 384 kbit/s algorithm paving the way to a universal standard codec capable of operating at p x 64 kbit/s
(p = 1, ..., 30).
Recent developments in edge detection have exposed different criteria to gauge the performance of edge detectors in the presence of noise. One of these criteria is "localization", which is the ability of the edge detector to produce from noisy data a detected edge that is as close as possible to the true edge in the image. In this paper, we show the limitation of the localization criterion as previously formulated and propose an alternative. This new performance measure is based on the theory of zero-crossings of stochastic processes. We show that the derivative of a Gaussian is the optimal edge detector for this new measure.
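For reference, a plain 1-D derivative-of-Gaussian detector (the filter identified above as optimal under the new measure) is sketched below in Python; the sigma, kernel support and threshold are arbitrary choices for the example, not values from the paper.

    import numpy as np

    def dog_kernel(sigma, radius=None):
        radius = radius or int(3 * sigma)
        x = np.arange(-radius, radius + 1, dtype=float)
        g = np.exp(-x**2 / (2 * sigma**2))
        return -x / sigma**2 * g / g.sum()             # derivative of a normalized Gaussian

    def detect_edges(signal, sigma=2.0, thresh=0.1):
        response = np.convolve(signal, dog_kernel(sigma), mode='same')
        mag = np.abs(response)
        # edge locations taken at local maxima of the filter response magnitude
        peaks = (mag > thresh) & (mag >= np.roll(mag, 1)) & (mag >= np.roll(mag, -1))
        return np.flatnonzero(peaks)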
Edge detection algorithms play an important role in automated vision analysis. In this sequel we propose
a model based edge detection algorithm based on the second directional derivative of each pixel in the
image. The new method models the picture as the output of a two dimensional all pole causal sequence
with a quarter plane region and a nonsymmetric half plane region of support. To estimate the parameters
of the model we used an overdetermined system of normal equations and utilized the least squares and total
least squares approach to solve for the unknown parameters. The estimated parameters were subsequently
used in a closed form to approximate the second directional derivative for detecting edges.
We compared our method, along with Haralick's and Zhou et al.'s, with that of Canny. The
first three algorithms are parametric algorithms: they are based on parametrizing the local behavior of the
image. By contrast, the last algorithm is non-parametric since it does not assume any particular model for
the image. We take into account previously introduced quantitative measures to study the performance of the
various algorithms for different synthetic images.
Keywords: QP: quarter plane; NSHP: non-symmetric half plane; LS: least squares; TLS: total least squares.
Various digital filters, edge detectors, histogram modifications, and three-dimensional display experiments are performed on
mosaicked Geologic LOng-Range Inclined Asdic (GLORIA) acoustic imagery. These experiments have the motivation of
establishing Navy capability for viewing the seafloor, especially in deep water and in three dimensions, detecting objects on the
seafloor, and enhancing existing monochrome GLORIA imagery. It was found that a Gaussian filter with a kernel size of 5 x 5
provided subjective enhancement to the lower intensity areas, while some of the other filtering techniques, e.g., difference and
gradient, destroyed the dynamic range of the image. Kernel sizes were found to be extremely crucial in the experiments with this
imagery, especially for the median filter, which provided excellent smoothing of the imagery without sacrificing the edges. The
digital mosaicking performed on this particular data set of acoustic imagery was determined to introduce multiple
artifacts. Image analysis showed the intensities (8 bit, 0-255) to follow the classic Gaussian distribution. Histogram equalization
yielded exceptional results for adding contrast (which allows the determination of geological boundaries and the detection of
various seafloor objects). The intensity profile offered an interesting future research objective: the correlation of
acoustic imagery to bathymetry, the measurement of the depth of large bodies of water.
A new algorithm for designing DPCM systems is proposed for image data compression. When transmitting images
over noiseless channels, the distortion between the original and reconstructed images is primarily due to quantization
noise. This is true when optimal predictor structures are employed. The quantization error becomes severe at low bit
rates. This is due to the large quantization error being directly fed back into the predictor and used in subsequent
estimation of future pixels. The DPCM scheme developed attempts to balance non-optimal predictor designs
against significantly reduced feedback effects due to quantization errors, with the objective of maximizing reconstructed
image quality. DPCM system performance using the algorithm is about 2.5 dB greater than that obtained from an
optimally designed conventional system. In addition, the algorithm is robust. Thus, the DPCM predictor does not
need to be re-designed using exact statistics of the input image data for each image to be transmitted.
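For context, a baseline first-order DPCM loop is sketched below (Python; a generic sketch, not the proposed algorithm): the prediction is formed from the quantized reconstruction, so quantization error feeds back into later predictions, which is exactly the effect the proposed design tries to limit at low rates. The predictor coefficient and step size are arbitrary illustrative values.

    import numpy as np

    def dpcm(line, a=0.97, step=8.0):
        recon_prev, codes, recon = 0.0, [], []
        for x in line:
            pred = a * recon_prev                      # predict from the reconstructed value
            q = int(np.round((x - pred) / step))       # uniform quantizer on the residual
            rec = pred + q * step                      # decoder-identical reconstruction
            codes.append(q)
            recon.append(rec)
            recon_prev = rec
        return codes, recon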
This paper describes an efficient coding algorithm developed for the bit rate reduction
of full motion video signals. The algorithm combines a motion adaptive subsampling
process with a Discrete Cosine Transform (DCT) based digital coding method in order
to achieve a compression factor greater than twenty without annoying artefacts.
Thus images with the Common Intermediate Format (CIF) can be coded at a bit
rate of about 1.2 Mbit/s with good quality.
A version of this algorithm has been used for coding some test sequences at 900
kbit/s. They have been presented to the Moving Picture Experts Group (MPEG)
meeting held in Japan in October 1989.
A new algorithm for opto-electronic hybrid pattern recognition,
which is based on the theory of sequency spectrum analysis, is
presented. In this method, a series of variable black-white ratio
gratings, realized with an array CCD, is employed as spatial filters
with which the pattern to be recognized is filtered. The algorithm is
implemented on a hybrid system of opto-electronics and computer,
and a hybrid algorithm for the Fast Walsh Transform is also realized.
The Joint Photographic Experts Group is an ISO/CCITT working group in the process
of developing an international standard for general-purpose, continuous-tone, still-image compression.
A brief history is presented as background to a summary of the past year's progress,
which was highlighted by definition of the overall structure of the proposed standard, and
agreement on the major technical issues. This is followed by a summary and explanation of
the overall algorithm structure, which consists of (1) the Baseline System, a simple coding
method sufficient for many applications, (2) a set of Extended System capabilities, which extend
the Baseline System to satisfy a broader range of applications, and (3) an Independent
Lossless method for applications needing that type of compression only. The Baseline System,
the heart of the JPEG standard, is then described in greater technical detail. The paper closes
with a summary of the current status of the JPEG committee, and the target schedule toward completion of the standard.