Electron beam pattern generators are now used extensively for the production of masks and for direct writing on wafers. For obvious reasons, electron beam pattern generators are optimized for integrated circuit fabrication. However, there is also considerable potential for the use of electron beam lithography in other areas. In this paper recent trends in the development of electron beam pattern generators are described, and the problems encountered in the application of electron beam machines in other areas, e.g. integrated optics, are discussed.
The objective lens is a single-element glass lens with one aspheric replicated surface and one flat surface. The manufacture of the aspheric surface by means of replication is suitable for high-quality mass production.
IRAS sky mapping data is being reconstructed as images, and an entropy-based restoration algorithm is being applied in an attempt to improve spatial resolution in extended sources. Reconstruction requires interpolation of non-uniformly sampled data. Restoration is accomplished with an iterative algorithm which begins with an inverse filter solution and iterates on it with a weighted entropy-based spectral subtraction.
A new technique is presented for the identification of thin objects whose appearance in an image is altered by contour noise, partial occlusions and distortions induced by 3-D flexible deformations. Starting from polygonal approximations of the shape to be analysed and of the object prototypes, the technique is based on an iterative deformation of each prototype which aims at reducing its dissimilarity with the shape under consideration. Identification is then decided in favor of the most similar deformed prototype.
This paper describes the optical implementation of banded matrix algorithms by means of Outer Product calculations. All the algorithms described are based on the Gaussian Elimination process, i.e. they only require elementary row operations that are suitable for Outer Product computations.
This paper proposes the image polynomial as an image descriptor. An image polynomial is defined by a power series whose coefficients coincide with the gray levels of a digital image and whose term orders correspond to the locations of its pixels. The image polynomial is also suitable for describing a shift-invariant image formation system. Furthermore, we derive an iterative image restoration algorithm which yields the original clear image in a finite number of iterations.
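The key correspondence can be sketched in one dimension: if gray levels are the coefficients of a polynomial, then a shift-invariant blur is polynomial multiplication by the system's own polynomial. A minimal sketch (the variable names are illustrative, not from the paper):

```python
import numpy as np

def poly_mul(a, b):
    """Multiply two polynomials given as coefficient arrays."""
    out = np.zeros(len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] += ai * bj
    return out

image = np.array([0.0, 1.0, 2.0, 1.0, 0.0])  # gray levels as coefficients
psf   = np.array([0.25, 0.5, 0.25])          # blur kernel as system polynomial

blurred = poly_mul(image, psf)
# polynomial multiplication coincides with direct convolution
assert np.allclose(blurred, np.convolve(image, psf))
```

This identity is what makes the polynomial description natural for shift-invariant image formation.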
One of the simplest classification algorithms utilizing a linear discriminant function is the minimum distance classifier, which is widely used in pattern recognition. However, it encounters the problem of useless dimension compensation when the feature dimensionality is very large. This is the situation when one wants to use textural features as input parameters for classification, as is now possible with remotely sensed high resolution images (optical or radar). To avoid this problem, we propose a hierarchical classification algorithm.
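For reference, the baseline minimum distance classifier the paper starts from can be sketched in a few lines (class means and test points here are made-up illustrations):

```python
import numpy as np

def min_distance_classify(x, class_means):
    """Assign x to the class whose mean vector is nearest (Euclidean)."""
    distances = [np.linalg.norm(x - m) for m in class_means]
    return int(np.argmin(distances))

# two hypothetical class means in a 2-D feature space
means = [np.array([0.0, 0.0]), np.array([5.0, 5.0])]
assert min_distance_classify(np.array([1.0, 0.5]), means) == 0
assert min_distance_classify(np.array([4.0, 6.0]), means) == 1
```

In high-dimensional textural feature spaces, the many near-irrelevant dimensions all contribute to the Euclidean distance, which is the compensation problem the hierarchical scheme is designed to avoid.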
Component labelling is an important part of region analysis in image processing. It consists of assigning labels to pixels in the image such that adjacent pixels are given the same label. There are various approaches to component labelling. Some require random access to the processed image; some assume a special structure of the image such as a quadtree. Algorithms based on a sequential scan of the image are attractive for hardware implementation. One method of labelling is based on a fixed-size local window which includes the previous line. Due to the fixed-size window and the sequential fashion of the labelling process, different branches of the same object may be given different labels and later found to be connected to each other. These labels are considered to be equivalent and must later be collected to correctly represent one single object. This approach can be found in [F,FE,R]. Assume an input binary image of size NxM. Using these labelling algorithms, the number of equivalent pairs generated is bounded by O(N*M). The number of distinct labels is also bounded by O(N*M). There is no known algorithm that merges the equivalent label pairs in time linear in the number of pairs, that is, in time bounded by O(N*M). We propose a new labelling algorithm which interleaves the labelling with the merging process; labelling and merging are combined in one algorithm. Merged label information is kept in an equivalence table which is used to guide the labelling. In general, the algorithm produces fewer equivalent label pairs. The combined labelling and merging algorithm is O(N*M), where NxM is the size of the image. Section II describes the algorithm. Section III gives some examples. We discuss implementation issues in Section IV, and further discussion and conclusions are given in Section V.
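To make the branch/equivalence problem concrete, here is a minimal two-pass, 4-connected labelling sketch that keeps the equivalence table as a union-find structure; this is the conventional baseline, not the paper's interleaved algorithm:

```python
def label_components(img):
    """Two-pass 4-connected component labelling with union-find."""
    rows, cols = len(img), len(img[0])
    parent = {}

    def find(x):                       # resolve a label via the equivalence table
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    def union(a, b):                   # record that two labels are equivalent
        ra, rb = find(a), find(b)
        if ra != rb:
            parent[rb] = ra

    labels = [[0] * cols for _ in range(rows)]
    nxt = 1
    for r in range(rows):              # first pass: provisional labels
        for c in range(cols):
            if not img[r][c]:
                continue
            up = labels[r - 1][c] if r > 0 else 0
            left = labels[r][c - 1] if c > 0 else 0
            if up and left:
                labels[r][c] = min(up, left)
                union(up, left)        # two branches of one object meet here
            elif up or left:
                labels[r][c] = up or left
            else:
                parent[nxt] = nxt
                labels[r][c] = nxt
                nxt += 1
    for r in range(rows):              # second pass: resolve equivalences
        for c in range(cols):
            if labels[r][c]:
                labels[r][c] = find(labels[r][c])
    return labels

# a U-shaped object: two branches that only meet on the last row
img = [[1, 0, 1],
       [1, 0, 1],
       [1, 1, 1]]
labels = label_components(img)
assert len({l for row in labels for l in row if l}) == 1
```

The U-shape is exactly the case that generates an equivalent label pair: the two vertical branches receive different provisional labels until the bottom row connects them.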
Temporal subsampling of image sequences is a technique often used to reduce the bit-rate in video codecs. If 2:1 time subsampling is used, the degradation introduced at the receiver by the repetition of the same frame is acceptable; if the subsampling rate drops to 4:1 or more, the interpolation of the missing frames by either repetition or linear interpolation is absolutely unsatisfactory. It is necessary, therefore, to use more sophisticated interpolation methods based on motion compensation. Using two successively transmitted images (called A and B), we estimate two motion fields: one indicates the displacements of the points of image A going towards image B, and the other the opposite. For this estimation we have used the pel-recursive displacement estimator proposed by Cafforio and Rocca. This algorithm exhibits a strong interaction between the direction of recursion and the direction of the displacement to be estimated. To minimize these effects, four different recursions are utilized and, using an adequate motion selection rule on the four fields, a new motion field is defined, homogeneous and without visible recursion effects. Using these two motion fields, a reliable connection between the points of the two images can be created. With this information, the interpolation of missing frames can be improved.
Three-dimensional VLSI (in short, 3-D VLSI) is a new device technology that is expected to realize high performance systems. In this paper, we propose an image processing architecture based on 3-D VLSI, which consists of optically connected layers. Since the optical inter-layer connection appears to provide useful functions due to the isotropic radiation of light, we formulate them algebraically as picture processing operators. As described later, those operators are applicable to tasks such as template matching. The usefulness of the proposed template matching algorithm is also verified by simulation.
In the context of expert systems there is a pressing need for efficient Image Processing algorithms to fit the various applications. This paper presents a new electronic card that performs Image Acquisition, Processing and Display, with an IBM-PC/XT or AT as a host computer. This card features a pipeline data-flow architecture, an efficient and cost-effective solution to most Image Processing problems.
A multiprocessor system for fast Image Processing (IP) is presented with two categories of high-speed processors, supervised by a general purpose microprocessor: the AD-p is optimized for pixel address calculations and the DA-p is optimized for integer arithmetic. Due to its highly hierarchical structure, problems can be split up into smaller (independent, parallel) subtasks which are easy to implement. The resulting AD-DA configuration can be used as a powerful general purpose IP system. Since both processors are each implemented on a single board, they offer a valuable solution for diverse industrial vision applications.
In this paper, we describe the design of a functionally distributed multiple array processor system for parallel vision processing. This new architecture blends the power of an associative processor for performing fast information retrieval with the capability of a cellular array processor to process various tasks in parallel. The pixel-based image resides in an Iconic Array Processor (IAP), and the symbolized image resides in a Symbolic Array Processor (SAP). The transfer from iconic to symbolic is accomplished by a Mapping Multi-Processor (MMP). The capabilities of this system allow for feedback between high- and low-level processing, and also support the parallel mapping of the pixel-based representation of an image into a symbolic representation (a semantic network) used for high-level vision processing.
Many image processing languages support the concept of a neighborhood or structuring element. In such image processing languages, a neighborhood is usually defined similarly to an image. Operations involving an image and a neighborhood yield a resultant image by some kind of convolution of this single fixed neighborhood with the source image. In this paper we show how generalizing the neighborhood concept to permit neighborhoods that are functionally specified and can vary from pixel to pixel within an image can provide a powerful algebraic language for image processing. Many complex image processing procedures requiring multiple operations to describe in other languages, such as rotation of an image by an arbitrary angle, can be described by a single algebraic operation involving a functionally specified neighborhood. We give examples of the use of functionally specified neighborhoods and argue that this general approach to neighborhood operations is appropriate for implementation on state of the art computer architectures.
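The rotation example can be sketched directly: when the neighborhood is a function of pixel position rather than a fixed mask, rotating an image becomes a single neighborhood operation. The sketch below uses nearest-neighbor sampling and a one-pixel "neighborhood" per output pixel; all names are illustrative, not the paper's notation:

```python
import math

def apply_neighborhood(img, nbhd):
    """Apply a functionally specified neighborhood: nbhd(r, c) returns
    the source pixel for output position (r, c)."""
    rows, cols = len(img), len(img[0])
    out = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            sr, sc = nbhd(r, c)
            if 0 <= sr < rows and 0 <= sc < cols:
                out[r][c] = img[sr][sc]
    return out

def rotation_nbhd(angle, cr, cc):
    """Neighborhood function describing rotation about (cr, cc)."""
    ca, sa = math.cos(angle), math.sin(angle)
    def nbhd(r, c):
        dr, dc = r - cr, c - cc
        return (round(cr + ca * dr - sa * dc),
                round(cc + sa * dr + ca * dc))
    return nbhd

img = [[0, 0, 0],
       [1, 2, 3],
       [0, 0, 0]]
rot90 = apply_neighborhood(img, rotation_nbhd(math.pi / 2, 1, 1))
assert rot90 == [[0, 1, 0], [0, 2, 0], [0, 3, 0]]
```

The same `apply_neighborhood` driver serves any functionally specified neighborhood; only the function passed in changes, which is the algebraic economy the paper argues for.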
In this paper we provide a neighbor finding algorithm for three-dimensional object models represented by octants. First, Cxy, Cyz, and Cxz, the stereographic projections of the object model C on the XY, YZ, and XZ planes, are found. Next, the stereographic projections of Cxy, Cyz, and Cxz on the XZ, XY, and YZ planes are found. The octants constructing C are levelized in a specified direction; octants with the same level number are grouped together, and the groups are ordered in ascending order of level number. Finally, surface neighbors of the octants are detected in a specified direction by employing an order-mapping asynchronous principle. All neighbors of each octant can be detected in a specified direction during a single scan.
In this paper, a measurement method for the distance between binary objects is presented. It has been developed for a specific purpose, the evaluation of rheumatic disease, but should be useful in other applications as well. It is based on a distance map of the area between the binary objects. A skeleton is extracted from the distance map by searching for local maxima. The distance measure is based on the average of skeleton points in a defined measurement area. An objective criterion for the selection of measurement points on the skeleton is proposed. Preliminary results indicate that good repeatability is attained.
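The pipeline (distance map, ridge of local maxima as skeleton, average ridge value as distance measure) can be sketched with a brute-force distance transform; the two-column test image and the factor of two are illustrative simplifications, not the paper's exact procedure:

```python
import math

def distance_map(obj_pixels, shape):
    """Brute-force Euclidean distance from every pixel to the nearest
    object pixel."""
    rows, cols = shape
    return [[min(math.hypot(r - orow, c - ocol) for orow, ocol in obj_pixels)
             for c in range(cols)] for r in range(rows)]

# two vertical objects: column 0 and column 4 of a 3x5 grid
obj = [(r, 0) for r in range(3)] + [(r, 4) for r in range(3)]
d = distance_map(obj, (3, 5))

# skeleton: per-row local maximum of the distance map
ridge = [max(range(5), key=lambda c: d[r][c]) for r in range(3)]

# distance between the objects: twice the average skeleton value
gap = sum(2 * d[r][ridge[r]] for r in range(3)) / 3
assert gap == 4.0
```

Averaging along the skeleton, rather than taking a single minimum, is what gives the measure its repeatability against contour noise.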
Over the past years, an extensive software package has been developed to determine quantitatively the location, extent and type of thallium-201 myocardial perfusion abnormalities from early and late post-exercise scintigrams. The analysis is based on circumferential profiles obtained from the early and late post-exercise images and a computed washout profile. A prototype of an expert system shell has been built, called ESATS, for the objective interpretation of the scintigraphic studies. Basically, ESATS consists of a knowledge base, a fact base and a control mechanism. The control mechanism features both top-down and bottom-up reasoning and allows external procedures to be called. The developments leading towards this rule-based expert system are described.
A new imaging modality based on the measurement of low angle (coherent) x-ray scatter radiation is described. This technique allows, in addition to conventional transmission computed tomography (CT), the two-dimensional distribution of the coherent scatter cross-section to be mapped. The potential of coherent scatter arises from the fact that it exhibits diffraction effects, which are present not only in regular crystals (interference patterns), but also in biological tissue, plastics and other construction materials. The form of the coherent scatter cross-section is related to the types of atoms present in the sample and also to their relative locations. Some diffraction patterns of biological samples and plastic materials are presented to illustrate the sensitivity of x-ray diffraction for diagnosing disease and characterizing materials. We present some typical coherent scatter images of simple objects obtained with an experimental system, illustrating the potential of our technique in computed tomography (CT) and computed radiography (CR) applications. Possibilities for improving the contrast/noise ratio using an angle-resolved measuring system are discussed.
A new method is described for the automated detection of left ventricular (LV) contours in contrast cineangiograms in the RAO projection. The method requires the manual definition of three reference points, two at the aortic valve and one at the apex. Next, a model, derived from a learning set of manually drawn contours, is fit through these points and edge features are extracted along scanlines perpendicular to the local model boundary direction. The edge detection method is based on dynamic programming techniques, thus allowing the determination of local contour points to be influenced by the entire global border path. A preliminary qualitative evaluation study showed that in 85-90% of the ED-frames and in 70-80% of the ES-frames the LV boundary could be detected fully automatically. The time required to detect an ED- or ES-contour in a 512x512 image is 10 seconds. On the basis of these preliminary data it may be concluded that reliable automated detection of LV boundaries in a routinely acceptable processing time is feasible.
The radiology practice is going through rapid changes due to the introduction of state-of-the-art computer-based technologies. Over the last twenty years we have witnessed the introduction of many new medical diagnostic imaging systems such as x-ray computed tomography, digital subtraction angiography (DSA), computerized nuclear medicine, single photon emission computed tomography (SPECT), positron emission tomography (PET) and, more recently, computerized digital radiography and nuclear magnetic resonance imaging (MRI). Beyond the imaging systems themselves, there has been a steady introduction of computer-based information systems for radiology departments and hospitals.
In recent years important new stimuli for studies of human stereo and motion perception have been the Julesz random-dot stereograms and Koenderink's random-dot kinematograms. With such stimuli the remarkable properties of human vision to perceive differentials of stereo depth or motion from pairs of individually formless images have been demonstrated. At the same time there has been considerable speculation as to how human vision can achieve such performance, it being usually assumed that a considerable amount of perceptual association and interpretation is involved. In this paper it is shown that, starting with a particular interpretation of early human vision, one can extract local stereo disparity or local motion to a very high accuracy from random-dot patterns very simply and directly, and generate approximate boundaries of stereo depth discontinuity or motion disruption. These findings provide both a simple explanation of some of the workings of human vision and a very simple practical technique for use in computer vision.
Quantitation of the severity of coronary obstructions from automatically detected contours is limited to the assessment of the obstruction diameter and percent diameter stenosis in that particular view. Even if an obstruction is analyzed from two views, the computed cross-sectional area does not provide a reliable measure for asymmetric lesions. A densitometric technique is described which attempts to provide measures of cross-sectional areas from a single view on the basis of the measured brightness levels within the arterial segment. Phantom studies show that the accuracy of the assessment of percent area stenosis of obstructions equals 2.79% (s.d. 1.76%) with a computed densitometric transfer function.
In this paper we present a modification of the back-projection (BP) method, well known as one of the tomographic reconstruction procedures. The classical back-projection method contains three steps: back-projection, summation and filtration; in our method the last step can be omitted. This is achieved by nonuniform smearing of the projections, in contrast to the classical back-projection method.
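For orientation, the first two steps of the classical method (back-projection and summation, before any filtering) can be sketched with the geometry restricted to two orthogonal views; this illustrates the baseline the paper modifies, not the proposed nonuniform smearing:

```python
import numpy as np

def backproject(proj0, proj90):
    """Smear a column-sum and a row-sum projection back over the grid
    and sum the contributions (unfiltered back-projection)."""
    n = len(proj0)
    img = np.zeros((n, n))
    for r in range(n):
        img[r, :] += proj90[r]   # smear the row-sum projection
    for c in range(n):
        img[:, c] += proj0[c]    # smear the column-sum projection
    return img

phantom = np.zeros((4, 4))
phantom[1, 2] = 1.0
p0  = phantom.sum(axis=0)        # column sums (view at 0 degrees)
p90 = phantom.sum(axis=1)        # row sums (view at 90 degrees)

recon = backproject(p0, p90)
# the true pixel receives the largest back-projected value
assert recon[1, 2] == recon.max()
```

The star-shaped smearing visible in `recon` (nonzero values all along row 1 and column 2) is exactly the blur that the filtration step, or the paper's nonuniform smearing, is meant to remove.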
Estimation of global and regional cardiac function from X-ray recordings acquired during contrast injection is a conventional clinical task which can be greatly facilitated by digital image processing techniques. Global left ventricular (LV) function is represented by ventricular volumes and volume changes. These are derived from the projected contours by assuming that the ventricle can be represented by an ellipsoid which has the same projected area and long axis as the ventricular silhouette (accuracy typically 10%). LV contours can be found by various automatic edge detection techniques, but so far the success rate in clinical settings is less than 100%. Attention to user-friendly correction methods is essential. Applying subtraction of an appropriate other image from the same acquisition run as a pre-processing operation improves success considerably. Quantitative analysis of regional ventricular function from these same contours can be applied nowadays ('Regional Ejection Fraction'), but the outcome is dependent on the model of ventricular contraction used. As an alternative, methods are being devised to perceptively enhance the images or some dynamic feature of the image series. The output of such operations may either be a new series, which is displayed dynamically, or a static, 'functional' image. Visual observation of such images should yield detection of abnormalities based on an acquired concept of normalcy. A recent addition to such qualitative images is the generation of 3D shaded ventricular shapes constructed from contours in single or biplane images. In the latter case simultaneous interpretation of both contours has become possible.
This paper describes how a rudimentary type of ray tracing has been implemented on standard CT scanning systems for routine clinical operation. In addition, an image coherence technique is described that speeds rendering of a series of 3-D views of complex anatomic structures. Object surface information from an existing view is used to help predict where ray tracing can begin to search for ray object intersections in the subsequent view. This method is shown to reduce the computational expense of finding ray-object intersections by beginning this search in the proximity of object surfaces. Our goal is to improve 3-D realism and, when clinically appropriate, make the application of 3-D presentations routine.
In this paper we present work in progress on 3-D surface reconstruction and 3-D integrated representation of objects defined by a set of parallel cross sections. Three problems have to be addressed: planar contour approximation, reconstruction of surfaces between successive parallel contours and representation/approximation of the reconstructed surface. This paper focuses on the first two issues, the solution to the third one still being undetermined.
Two simulated sea surfaces were generated by a numerical model, each with a different power spectrum. I used the Pierson-Neumann theoretical spectrum for the generation of the first sea surface and the Pierson-Moskowitz theoretical spectrum for the generation of the second. The images were recorded on photographic film by means of a microdensitometer in writing mode. To obtain the bidimensional power spectrum of these simulated sea surfaces, a coherent optical system was used. These power spectra contain information about the frequencies in the highest energy peak and the direction of the waves at a specific time. The two bidimensional power spectra are compared, and optical autocorrelations for each sea surface along the directions parallel and perpendicular to the wind were obtained. The cross-correlation of the two simulated images of the sea surface was also obtained. It was possible to obtain images at selected wavelengths using different filters in the Fourier plane of the optical system. Because each sea surface has a Gaussian distribution, and the heights of the waves are totally random, the filtering technique permits us to analyze the spatial behaviour of certain wavelengths.
It is shown that the performance of image coding techniques can be enhanced via the utilization of a priori knowledge. Critical features of the image are first identified and then accounted for more favorably in the coding process. For satellite imagery, thin lines and point objects constitute critical features of interest whose preservation in the coding process is crucial. For the human visual system, the impact of coding degradation at low rates is much more detrimental for these features than for the edges which constitute boundaries between regions of different contrasts. A highly non-linear, matched filter-based algorithm to detect such features has been developed. Pre-enhancement (highlighting) of the detected features within the image prior to coding is shown to noticeably reduce the severity of the coding degradation. A yet more robust approach is the pre-enhancement of the slightly smoothed image. This operation gives rise to an image in which all critical thin lines and point objects are crisp and well-defined, at the cost of non-essential edges of the image being slightly rounded off. For transform coding techniques, distortion parameter readjustment and variable block size coding provide promising alternatives to the pre-enhancement approaches. In the former, the sub-blocks containing any part of the detected critical features are kept within a low distortion bound via a local rate adjustment mechanism. The latter approach is similar to the former, except that the image is partitioned into varying-size sub-blocks based on the extracted feature map.
In this paper, we propose a raster scanning algorithm for component labeling which enables processing under a pipeline architecture. In the raster scanning algorithm, labels are provisionally assigned to each pixel of the components and, at the same time, the connectivities of labels are detected during the first scan. Those labels are classified into groups based on the connectivities. Finally, provisional labels are updated using the result of the classification, and a unique label is assigned to each pixel of each component. However, in the conventional algorithm, the classification process needs a vast number of operations. This prevents realizing pipeline processing. We have developed a preprocessing method to reduce the number of provisional labels, which limits the number of label connectivities. We have also developed a new classification method whose operation count is proportional only to the number of label connectivities itself. We have carried out computer simulations to verify this algorithm. The experimental results show that we can process 512 x 512 x 8 bit images at video rate (1/30 sec. per image) when this algorithm is implemented in hardware.
A realtime digital signal filtering system has been designed to investigate new methods of passive IR image processing with a combined staring and scanning IR array sensor. The image processing consists of cascaded spatial and temporal filters applied to the IR scene data to extract known features from cluttered and moving backgrounds. This approach concentrates on the detection of unresolved targets in large IR scenes. Testbed and simulation results have demonstrated the detection of a target whose intensity was three orders of magnitude less than the background value.
A method for the automatic recognition of defects in wood has been developed and implemented on the Visual Interpretation System for Technical Applications (VISTA). VISTA hardware modules for the computationally complex algorithms are available or under development. By means of these modules the method works in real time.
A fundamental problem in machine vision is to detect and identify specific objects in an image. In the field of machine reading of existing printed matter and books, a very important technique allows extracting and recognizing characters in desired text lines from a document image. This paper describes a hierarchical image segmentation, which separates a document image into its entities. Furthermore, a character segmentation with a minimum variance criterion and a character recognition method based on three improved loci features have been developed as two elemental methods for reading books. In experiments using different commercial Japanese pocket books, 99% of text lines were correctly extracted, and 99.30% of the Japanese characters and Chinese ideographs used in the printed text were read successfully.
The purpose of this paper is to consider the moment invariants of known forms under perspective transformation. These are important for machine vision systems because every lens system induces a perspective transformation. This approach considers the non-linear perspective transformation in a higher-dimensional, homogeneous space, in which the perspective transformation is linear. Algebraic invariant theory is used to determine absolute algebraic and moment invariants. By using the theory of conics, the known forms may be constructed and the general form may be reduced to the standard form. The cross ratio is a well known perspective invariant. The cross ratio, which is a geometrical feature independent of the perspective transformation, is expressed in terms of the absolute invariant. New moment invariants corresponding to the perspective transformation are also derived. Examples are presented to demonstrate the theoretical approach. The significance of this work lies in increasing the understanding of object recognition for human and machine vision.
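The cross ratio invariance mentioned above is easy to verify numerically: the cross ratio of four collinear points is unchanged by any 1-D projective map x -> (ax + b)/(cx + d). The coefficients below are arbitrary illustrative values:

```python
def cross_ratio(p1, p2, p3, p4):
    """Cross ratio of four collinear points (one common convention)."""
    return ((p1 - p3) * (p2 - p4)) / ((p2 - p3) * (p1 - p4))

def homography(x, a, b, c, d):
    """A 1-D projective (perspective-like) transformation."""
    return (a * x + b) / (c * x + d)

pts = [0.0, 1.0, 2.0, 4.0]
before = cross_ratio(*pts)
after = cross_ratio(*(homography(x, 2.0, 1.0, 0.5, 3.0) for x in pts))
assert abs(before - after) < 1e-9
```

This is the geometric fact the paper expresses in terms of its absolute algebraic invariants.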
An omnidirectional vision navigation system has interesting characteristics and potential applications for the guidance of a mobile system. Experimental results and analyses of the linearity between the zenith angle and the image location of an object, of the error due to the effect of relative motion, and of the time required to process an image are described in this paper. The significance of this work is to add to the knowledge of the characteristics of the omnidirectional vision navigation system designed for the guidance of a mobile robot.
ZSA, an industrial image processing system, is characterized by a modern parallel architecture for digital signal processing. In addition to standard video cameras, the system is intended to be used with one-dimensional sensors such as CCD line cameras and x-ray and infrared arrays. In the basic ZSA system the processing power is distributed between two processors: a programmable digital signal processor (DSP) for rapid processing of the incoming data, and a 16-bit standard microprocessor which not only performs system management and communication but can also be used for signal processing of the preprocessed image data coming from the DSP. In order to increase the computing power for handling fast data streams, special hardware preprocessor modules dramatically reduce the data rate so that the programmable units can operate in realtime. In a different approach, computing power is increased by parallel processing with a ZSA master board and three ZSA slaves, where the standard microprocessor manages the results of the four DSPs. Additionally, combining several master/slave configurations into a master/slave cluster further increases the computing power.
Measurement of the physical dimensions of produced parts is one of the application areas of visual inspection. High accuracy measurements (typical mechanical tolerances are about 0.1% or 0.05%) without needing image sizes of 1024 or 4096 pixels imply subpixel-accuracy interpolation techniques. An overview of some algorithms is given first. One of the subpixel accuracy algorithms has been implemented efficiently on an image computer designed in our laboratory. The general image processor is implemented as a set of two microprogrammable processors: the first is optimized for search operations and locates the edge of interest; the second is optimized for arithmetic operations and calculates the accurate edge positions. Both operations are relatively independent and can be pipelined to achieve maximum performance. Since on-line inspection is considered, execution time is of prime importance.
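One common subpixel technique (not necessarily the one the paper implements) fits a parabola through the three gradient samples around the gradient peak and takes its vertex as the edge position. A minimal sketch on a 1-D intensity profile:

```python
def subpixel_edge(profile):
    """Locate an edge with subpixel accuracy by parabolic interpolation
    of the discrete gradient peak."""
    grad = [profile[i + 1] - profile[i] for i in range(len(profile) - 1)]
    k = max(range(1, len(grad) - 1), key=lambda i: abs(grad[i]))
    g0, g1, g2 = grad[k - 1], grad[k], grad[k + 1]
    denom = g0 - 2 * g1 + g2
    offset = 0.0 if denom == 0 else 0.5 * (g0 - g2) / denom
    # +0.5 because each gradient sample sits between two pixels
    return k + 0.5 + offset

profile = [0, 0, 1, 9, 10, 10]   # a ramp edge between pixels 2 and 3
pos = subpixel_edge(profile)
assert abs(pos - 2.5) < 1e-9
```

In the two-processor split described above, the search processor would supply the integer index `k` and the arithmetic processor would evaluate the interpolation.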
The rapid development of flow velocimetry and visualisation by pulsed laser in the last five years has posed problems familiar in image processing in a new context. This paper outlines the PLV technique and the difficulties encountered in data reduction best solved by image processing methods. Finally, consideration is given to the potentially large cumulative computational times encountered in PLV. The potential for parallel architecture computers in PLV applications is discussed and illustrated by example algorithms run on the ICL DAP.
Very fast analog and logical processing of the T.V. signal allows edge detection, multiple segmentations and pseudo-phase-contrast imaging to be obtained in real time. Using only two bits per pixel, the reconstructed image, built in 684 x 512 pixels, is used for analyzing T.V. images over 128 grey levels, which can be transferred into pseudo-colours.
In the present work, the writing of microwave-electronic circuits (MWFC) is presented. The bits of information for the entire circuit are generated by means of a host computer. The computer-generated two-valued information is mapped directly to high- and low-intensity light pulses of a light source used to sensitize a high resolution photographic film on which the electronic circuit is to be written. With this method there are only two steps involved in producing the electronic circuit mask. A practical comparison is made by analyzing the steps on the film that define the limits of the circuit for several photographic emulsions. Finally, a practical change of the light source in the playback system of the microdensitometer is recommended in order to directly sensitize the high resolution Kodalith film placed on the copper board of the real circuit.
A method for real-time flow visualization using a speckle interferometer and a TV camera is presented. Instead of electronic subtraction of the two signals, deformed and undeformed, an intermediate photographic recording with a TV camera is used.
Tomosynthesis presents a simple procedure for 3D angiography, because the recording step requires only one injection of contrast medium. Digital flashing tomosynthesis is based on a new nonlinear reconstruction producing fewer artifacts than conventional backprojection techniques. In addition to reconstructed slices, synthetic projections can be calculated, which can be used in combination with the original projections for stereo views. The object structures can be analysed by a stereo cursor controlled by a 3D joystick. This may be an important adjunct to diagnostic and interventional angiographic procedures.
This paper describes the optical implementation of the 'Rotating' and 'Folding' mathematical approach for systolic array processing of the LU-factorization of tridiagonal linear systems. The optical implementation of the solution of the resulting triangular systems of equations is also discussed.
Digital image matching, the basic tool of computer stereovision, can be formulated indirectly by means of two object space models: one for the object space surface Z = Z(X,Y) and one for the optical density function D = D(X,Y). Both are represented by linear interpolation functions in facets. The parameters of these functions can be computed very efficiently by least squares adjustment.
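The parameter estimation step can be illustrated by fitting a single linear facet Z = aX + bY + c to sample points by least squares; this is only a sketch of the adjustment for one facet, not the full matching model, and the function name is hypothetical.

```python
import numpy as np

def fit_facet(X, Y, Z):
    """Least-squares fit of a linear facet Z = a*X + b*Y + c to
    sample points; each facet of the surface or density model
    would be estimated this way from its observations."""
    A = np.column_stack([X, Y, np.ones_like(X)])  # design matrix
    params, *_ = np.linalg.lstsq(A, Z, rcond=None)
    return params  # (a, b, c)
```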
A new hierarchical encoding scheme for grey-level pictures is presented here. The picture field is split by a modified quadtree algorithm into blocks of size 32 x 32, 16 x 16, 8 x 8 and 4 x 4 pels, according to their subjective importance in the picture. The larger cells, of size 32 x 32, 16 x 16 and 8 x 8 pels, corresponding to uniform or low-detail areas, are coded at very low rates by block truncation in the Discrete Cosine Transform domain. The smallest blocks, representing mainly high-detail areas such as edges or textures, are coded with a multi-codebook vector quantization scheme. Due to its structure, such an encoding scheme is especially well adapted for coding "head and shoulders" pictures, mostly encountered in videophone or videoconference applications, where large areas of background may appear. Concerning the vector quantization, several techniques were investigated in order to improve the subjective quality and to reduce the search time through the codebooks; this also permits a faster elaboration of the codebooks. Results are presented with bit rates ranging from 0.4 to 0.8 bits/pel, depending on the picture complexity.
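The block-splitting step can be sketched as a recursive quadtree that stops when a block is uniform enough or reaches the minimum size. The abstract does not specify the "subjective importance" criterion, so block variance stands in for it here; the threshold and function names are illustrative assumptions.

```python
import numpy as np

def quadtree_blocks(img, y, x, size, thresh, min_size=4, out=None):
    """Recursively split a square block of `img` until its variance
    falls below `thresh` or the minimum block size is reached.
    Variance is an illustrative stand-in for the paper's subjective
    importance criterion. Returns a list of (y, x, size) blocks."""
    if out is None:
        out = []
    block = img[y:y + size, x:x + size]
    if size == min_size or block.var() <= thresh:
        out.append((y, x, size))            # leaf: code this block
    else:
        h = size // 2
        for dy in (0, h):
            for dx in (0, h):
                quadtree_blocks(img, y + dy, x + dx, h, thresh, min_size, out)
    return out
```

Large leaves would then go to the DCT block-truncation coder and 4 x 4 leaves to the multi-codebook vector quantizer.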
Transform coding suffers from the necessary block segmentation of the entire image. The block size defines the image resolution through the low-pass characteristic of the bit assignment. Detail-dependent variation of the block aperture achieves an adaptive modification of the resolution parameters.
A classification method for adaptive scalar quantization and efficient encoding of the DCT coefficients of videophone signals is described. In a corresponding coder concept using hybrid vector and scalar quantization, the vector quantization (VQ) stage is used not only as an efficient prequantization method but also as a classifier for zonal coding by the following adaptive scalar quantization and encoding stage. To optimize the coder performance, the VQ classification result is combined with the output of an energy-based classifier. Our simulation results indicate that in 61 kbit/s encoding applications this concept improves the signal-to-noise ratio by up to 0.8 dB compared with a pure energy-based classification method.
The need to send photographic images over telecommunication networks is well established today, and numerous publications on this subject have been issued. On narrow-band channels, typically 64 kbit/s, known e.g. as the B channel on the emerging ISDN, real-time still-image services like Photovideotex are within reach and widely under study. One of these studies, funded by the C.E.C. (Commission of the European Community) and known as the ESPRIT-PICA project, aims to produce an algorithm capable of compressing a Photovideotex image to less than 1 bit/pel that can be decoded at ISDN data rates. Even if the network can today cater for real-time still-image services, the coding and decoding complexity of a renowned high-compression scheme like the A.D.C.T. might still remain the bottleneck of a real-time implementation. To remedy this situation, a new approach to the DCT algorithm has been sought and has led to a very fast implementation of the forward and inverse D.C.T. in distributed arithmetic, which is furthermore well suited to integration. The approach is based on the decomposition of the DCT into polynomial products and the evaluation of these polynomial products by distributed arithmetic. This leads to an LSI chip with great regularity and testability. Furthermore, the same structure can be used for FFT computation by modification of the ROM content of the chip. This architecture is theoretically based on a new formulation of a length-2N DCT as a cyclic convolution, which is described in the first section of the paper. Other sections describe the implementation of the D.C.T. using the distributed arithmetic approach and its evaluation in VLSI.
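For reference, the transform being accelerated can be written directly. The sketch below is a plain O(N^2) forward/inverse DCT-II pair (unnormalized forward, conventional 2/N inverse scaling), useful as a correctness reference for fast schemes like the cyclic-convolution formulation; it is not the distributed-arithmetic implementation itself.

```python
import math

def dct(x):
    """Direct (O(N^2)) forward DCT-II, unnormalized:
    X[k] = sum_n x[n] * cos(pi*(2n+1)*k / (2N))."""
    N = len(x)
    return [sum(x[n] * math.cos(math.pi * (2 * n + 1) * k / (2 * N))
                for n in range(N))
            for k in range(N)]

def idct(X):
    """Matching inverse (DCT-III with 1/N and 2/N scaling)."""
    N = len(X)
    return [X[0] / N
            + (2.0 / N) * sum(X[k] * math.cos(math.pi * (2 * n + 1) * k / (2 * N))
                              for k in range(1, N))
            for n in range(N)]
```

A hardware implementation replaces the inner multiply-accumulate loops with ROM lookups on the bit planes of the inputs, which is the essence of distributed arithmetic.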
Vector quantization (VQ) provides an appropriate technique to obtain high compression ratios. However, for image sequence coding, the vector quantizer has to maintain temporally stable codebooks of representative vectors, and this constraint of temporal stability has to be met with minimal computational complexity. We propose a new coding scheme based on visual classification and temporal refreshment procedures, which enables the design of subjectively optimal codebooks for any kind of real broadcast image sequence. This contribution reports simulation results on how such a classification may be introduced and evaluated and how the vector quantizer may be refreshed. We obtain significantly better reconstruction quality for a given codebook size and investigate several available solutions for implementing the codebook refreshment procedures.
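The basic encode/decode step that any such scheme builds on can be sketched in a few lines: each input vector is replaced by the index of its nearest codeword. This is a generic VQ sketch, not the classification or refreshment procedure of the paper; the refreshment procedures would update `codebook` between frames.

```python
import numpy as np

def vq_encode(vectors, codebook):
    """Map each row of `vectors` to the index of its nearest codeword
    in `codebook` (squared Euclidean distance, full search)."""
    d = ((vectors[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=2)
    return d.argmin(axis=1)

def vq_decode(indices, codebook):
    """Reconstruct vectors by codebook lookup."""
    return codebook[indices]
```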
The goal of image coding is to reduce the number of bits required to represent an image or a sequence of images. For the new generation of video coders, the coding should be tailored to image understanding and should increasingly replace purely statistical or waveform coding techniques. At present, the more conventional algorithms are preferred with a view to direct hardware realization. Most of these methods are block based and do not consider global image features. In this paper an attempt is made to extend a hybrid DPCM/transform coding configuration with object-based features. For videoconferencing and videophone applications we intend to use a simple method exploiting the knowledge of typical interpersonal video-communication scenes such as head and shoulders. The first step is the segmentation of the image into participants and background. For this segmentation, information available in the coder is combined with knowledge obtained from the actual analysis. To make the segmentation more robust to various input signals, we exploit the fact that objects are often more easily recognized in images with a very low spatial sampling rate. At this low level, details are not taken into account, which yields a more homogeneous segmentation. This leads to a pyramidal image data approach. By applying a coarse-to-fine structure, the procedure starts at a low resolution level and is refined at ever-increasing resolutions. A comparison is given of a coder with and without this a priori knowledge.
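The pyramidal data structure underlying the coarse-to-fine segmentation can be sketched as repeated 2x2 averaging. This is a minimal mean pyramid, assuming power-of-two image dimensions; the paper's actual filter and segmentation rules are not specified in the abstract.

```python
import numpy as np

def mean_pyramid(img, levels):
    """Build a resolution pyramid by repeated 2x2 averaging.
    pyr[0] is the full-resolution image; each further level halves
    both dimensions. Coarse-to-fine processing starts at pyr[-1]."""
    pyr = [img.astype(float)]
    for _ in range(levels - 1):
        a = pyr[-1]
        a = (a[0::2, 0::2] + a[1::2, 0::2]
             + a[0::2, 1::2] + a[1::2, 1::2]) / 4.0
        pyr.append(a)
    return pyr
```

A segmentation found at the coarsest level is then propagated down and refined at each finer level.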
2D-signal processing techniques for interframe hybrid coders with motion compensating prediction are investigated. "Equivalent" 2D-filters and "equivalent" coder structures are proposed to improve the subjective quality of moving pictures encoded at about 64 to m x 384 kbit/s (m = 1, 2, ..., 5). An optimum predictor is derived and it is concluded that this solution is quite close to the concept of equivalent filters. The complete coding scheme is described.
A new image coding technique based on an interpolative method is described. The interpolation strategy is highly adaptive to the local content (line and edge features) of the image. Results show that the method produces good-quality images at bit rates of 0.5 bits/pixel.
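The non-adaptive baseline of such interpolative coding is straightforward: transmit a subsampled grid and reconstruct the missing pels by separable linear interpolation. The sketch below shows only this baseline, under the assumption of a regular subsampling grid; the adaptive edge-dependent strategy of the paper is not reproduced.

```python
import numpy as np

def interp_reconstruct(sub, factor):
    """Reconstruct an image from a subsampled grid `sub` by bilinear
    interpolation with upsampling factor `factor` per axis."""
    h, w = sub.shape
    out = np.empty(((h - 1) * factor + 1, (w - 1) * factor + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            y, x = i / factor, j / factor
            y0, x0 = int(y), int(x)
            y1, x1 = min(y0 + 1, h - 1), min(x0 + 1, w - 1)
            fy, fx = y - y0, x - x0
            out[i, j] = ((1 - fy) * (1 - fx) * sub[y0, x0]
                         + (1 - fy) * fx * sub[y0, x1]
                         + fy * (1 - fx) * sub[y1, x0]
                         + fy * fx * sub[y1, x1])
    return out
```

An adaptive scheme would additionally transmit corrections wherever this prediction fails, i.e. across lines and edges.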