This paper discusses research in spectrum analysis, beamforming, and direction finding, using signal subspace techniques. Such methods can provide improvements in performance by effectively incorporating prior information concerning the spatial and temporal structure of the signals and noise. During the past decade, many researchers have contributed to broadening the range of applicability of such methods and increasing our understanding of their relation to traditional linear statistical modelling. Representative results will be described, together with an outline of future research directions and a brief introduction to the literature on systolic architectures and algorithms for parallel implementation of the required matrix computations.
This paper describes a new family of algorithms for adaptive weight computation for sensor arrays with arbitrary geometries. This family includes existing algorithms of the steepest-descent and matrix-factorization types, plus a range of new algorithms that are intermediate between these two classes, in terms of performance and computational complexity. This approach is particularly valuable for applications where steepest descent is too slow, but the standard matrix methods require too much computation.
We propose two novel algorithms for processing data from a rectangular (two-dimensional) antenna array, for which the number of complex operations needed to implement the algorithms is significantly smaller than for the naive extension of the MUSIC algorithm. We present simulation results for both algorithms and discuss these in relation to the information content of the data.
A theoretical analysis is carried out to arrive at expressions for quick estimation of the leakage noise due to jitter and drift motions in a surveillance sensor system. The expression for the mean square clutter leakage is first written in integral form, involving pixel footprint size, background spatial distribution density, optical parameters of the telescope, observation range, detector transfer function, and the filter function. The filter is assumed to be of the Nth-order temporal differencing type, and the background scene is assumed to have a Cauchy-type distribution with specific standard deviation and correlation length. The integral expression is then reduced to a simplified case in which the system motion is assumed to be one-dimensional, consisting of a linear combination of a uniform drift motion and a sinusoidal jitter vibration. For both the jitter-dominant and the drift-dominant cases, the integrations reduce to sums of sine and cosine integral functions. Asymptotic expansions are then used to express the clutter leakage in terms of sine and log functions. The resulting expressions are simple enough to be evaluated rapidly with a desktop calculator. The results are compared with curves generated by a digital computer using a more exact numerical integration method and agree well, except at the jitter vibrational peaks, where the approximation method overestimates.
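As a rough illustration of the Nth-order temporal differencing filter assumed in the abstract above, the following Python sketch applies binomial-weighted backward differences to a time series of pixel values. The abstract does not give the filter coefficients, so the standard backward-difference weights are an assumption, and the function names and sample data are hypothetical.

```python
from math import comb


def difference_weights(order):
    """Weights of an Nth-order backward difference: (-1)^k * C(N, k)."""
    return [(-1) ** k * comb(order, k) for k in range(order + 1)]


def temporal_difference(samples, order):
    """Apply the Nth-order difference to a time series of pixel values."""
    weights = difference_weights(order)
    return [
        sum(weights[k] * samples[n - k] for k in range(order + 1))
        for n in range(order, len(samples))
    ]


# Hypothetical pixel value that changes linearly in time (constant level plus ramp).
ramp = [5.0 + 0.5 * t for t in range(8)]

# A 2nd-order difference cancels both the constant level and the linear ramp.
residual = temporal_difference(ramp, 2)
```

An Nth-order difference of this kind cancels any polynomial trend of degree below N, which is why higher orders suppress slowly varying background at the cost of more frames per output sample.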
In this paper we present a geometric formalism by which it is possible to derive a large family of least-squares algorithms for systolic arrays. The geometric formalism is based upon that introduced by Shensal and Lee, et al., in their derivation of the Least-Squares Lattice Algorithm. Specifically, their time update theorem is used along with an order update theorem to derive general vector time and order recursion relations. As a result, a large number of new and previously known time and order recursive least-squares algorithms for systolic arrays can be derived in a unified manner.
There are many cases, in both signal and image processing applications, where the eigenvalues and eigenvectors of particular matrix operators are required. Furthermore, in many situations a priori information about the particular matrix operator is available. Most eigenvalue algorithms do not utilize all the available information. In this paper, the use of the Lie algebra of matrix operators is suggested for systolic array computation of eigenvectors and eigenvalues. After a brief survey of the Lie algebra relevant to such computation, a number of possible Lie signal processing algebra examples are presented.
Optical Cellular Array Processors (OCAPs) are a subject of current interest because they offer the potential of programmable optical computers. By a suitable combination of data formatting, systolic movement of data in an acousto-optic cell, parallel masking, and thresholding, fast OCAPs are readily constructable.
We examine the checksum schemes of Abraham et al. for the computation of the LU-factorization using a multiprocessor array. Their methods are very efficient for detecting a transient error, but quite expensive for correcting it due to the need for a computation rollback. In this paper, we show how to avoid the rollback and how to implement pivoting. We also introduce a new checksum method for solving triangular sets of linear equations.
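To make the checksum idea in the abstract above concrete, here is a minimal Python sketch of error detection during Gaussian elimination. It only illustrates the general principle that a row-sum checksum column is preserved by row operations; it does not reproduce the scheme of Abraham et al. or the rollback-free correction and pivoting described in the paper. The function names and the test matrix are hypothetical.

```python
def lu_with_checksum(a):
    """Gaussian elimination (no pivoting) on the augmented matrix [A | A*1]."""
    n = len(a)
    # Append a checksum column that holds the sum of each row.
    m = [row[:] + [sum(row)] for row in a]
    for k in range(n):
        for i in range(k + 1, n):
            factor = m[i][k] / m[k][k]
            # The same row operation is applied to the checksum column.
            for j in range(k, n + 1):
                m[i][j] -= factor * m[k][j]
    return m


def checksum_errors(m, tol=1e-9):
    """Return indices of rows whose checksum no longer matches the row sum."""
    n = len(m)
    return [i for i, row in enumerate(m) if abs(sum(row[:n]) - row[n]) > tol]


a = [[4.0, 3.0, 2.0], [2.0, 1.0, 3.0], [3.0, 4.0, 1.0]]
u_aug = lu_with_checksum(a)
clean = checksum_errors(u_aug)   # no mismatches expected

u_aug[1][2] += 0.5               # inject a simulated transient error
faulty = checksum_errors(u_aug)  # the corrupted row is flagged
```

Because elimination is a sequence of row operations, the checksum column of the reduced matrix still equals the row sums of the upper-triangular factor, so a mismatch localizes a fault to a row.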
A high-performance systolic array computer called Warp has been designed by CMU and is currently under construction. The full-scale machine has a systolic array of 10 or more linearly connected cells, each of which is a programmable processor capable of performing 10 million floating-point operations per second (10 MFLOPS). By the end of 1985 the first full-scale machine will be operational. Low-level vision processing for robots and autonomous vehicles is among the first applications of the machine. This paper describes a new boundary processor to be attached to one end of the linear systolic array in Warp. Extending Warp with this boundary processor will substantially enhance the performance and applicability of the machine. The extended machine will be efficient for new application areas such as the solution of linear systems of equations and adaptive signal processing.
A radar digital beamformer VLSI architecture is defined which provides very high-throughput data flow in a modular, failure-tolerant structure. A number of VHSIC/VLSI chip implementation approaches were evaluated, and tradeoff curves are presented here. The results indicate the affordability of radar elemental beamformers, including those for large two-dimensional arrays.
The pyramid is well suited for image enhancement, since its bandpass structure closely matches the human visual system, and it is efficient because successively lower octave frequency bands are represented by proportionally fewer samples. The algorithm is also well suited for pipelined real-time video processing, since each successively lower spatial-frequency band is generated from its predecessor using the same low-order filters. A prototype pyramid system was built to process off-air monochrome NTSC video in real time at 30 frames per second. The system digitizes input frames into 512 horizontal samples per line and 480 vertical lines, linearly quantized to 8 bits. It decomposes these images into 5 bandpass sub-images simultaneously using multiple cascaded 7-tap two-dimensional filters. The original image is then reconstructed from the bandpass components using the inverse process. The system is computer controlled, allowing variation of internal parameters such as filter coefficients, filter kernel size, arithmetic precision, and number of bandpass images generated. The hardware is configured to allow simultaneous viewing of the input, the 5 bandpass components, and the reconstructed original image. Image enhancement operations, such as band-optimized noise coring, can be added to the system as board-level features in line with the bandpass images' data paths prior to reconstruction. The complete pyramid system is modular and self-contained, performs real-time two-dimensional bandpass processing, and will likely have an impact on future television systems.
The Burt pyramid algorithm is a potentially powerful tool for advanced television image processing, enhancement, coding, and restoration. In order to exploit this algorithm for the real-time processing of digitized NTSC television signals, where 10-15 MBytes/second data rates are typical and multidimensional processing is essential, a novel cascaded-filter pipeline architecture was developed. The algorithm's two-dimensional, octave-bandpass filtering is accomplished using multiple arrangements of separable horizontal and vertical one-dimensional, 7-tap FIR (Finite Impulse Response) filters. These two-dimensional filters are configured into a serial-pipelined data structure to produce simultaneous bandpass image representations. This resampling architecture achieves the performance of high-order filters through the repetitive use of less complex low-order filters. In addition, each successive lower spatial-frequency band is calculated from previously computed bands, while the linear sampling density is reduced by a factor of two per dimension to eliminate redundant information that would waste both memory storage and processing time. An algorithm implemented using this architecture produces a very efficient and economical process well suited for real-time operation. Because the architecture is modular, highly structured, and arithmetically well-behaved, it is attractive for VLSI implementation.
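The pyramid decomposition and reconstruction described in the two abstracts above can be sketched in one dimension as follows. This is a simplified illustration, not the authors' hardware pipeline: the binomial 7-tap kernel, the edge-replication rule, and all function names are assumptions, and a real system would apply the filter separably along rows and columns.

```python
# Assumed 7-tap low-pass kernel (binomial weights); the abstracts do not give coefficients.
KERNEL = [1 / 64, 6 / 64, 15 / 64, 20 / 64, 15 / 64, 6 / 64, 1 / 64]


def smooth(signal, gain=1.0):
    """7-tap FIR filter with edge replication."""
    n = len(signal)
    half = len(KERNEL) // 2
    out = []
    for i in range(n):
        acc = 0.0
        for k, w in enumerate(KERNEL):
            j = min(max(i + k - half, 0), n - 1)
            acc += w * signal[j]
        out.append(gain * acc)
    return out


def reduce_band(signal):
    """Low-pass filter, then keep every second sample."""
    return smooth(signal)[::2]


def expand_band(signal, length):
    """Zero-stuff to `length` samples, then interpolate with the same filter."""
    up = [0.0] * length
    for i, v in enumerate(signal):
        up[2 * i] = v
    return smooth(up, gain=2.0)


def build_pyramid(signal, levels):
    """Return the bandpass bands (finest first) and the final low-pass residual."""
    bands = []
    current = list(signal)
    for _ in range(levels):
        low = reduce_band(current)
        predicted = expand_band(low, len(current))
        bands.append([c - p for c, p in zip(current, predicted)])
        current = low
    return bands, current


def reconstruct(bands, residual):
    """Invert the decomposition by adding each band back to the expanded low-pass."""
    current = residual
    for band in reversed(bands):
        predicted = expand_band(current, len(band))
        current = [b + p for b, p in zip(band, predicted)]
    return current


signal = [float((i * 7) % 13) for i in range(32)]
bands, residual = build_pyramid(signal, 3)
rebuilt = reconstruct(bands, residual)
```

Each band is stored as the difference between a level and the expanded version of the next coarser level, so reconstruction is exact up to rounding whatever low-pass kernel is chosen. The kernel affects only how cleanly the bands separate in frequency.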
A systolic clock generator has been designed and built with 320 identical computing elements on a single VLSI chip which can provide a considerable increase in the rate at which an accumulator may be used to produce a variable frequency clock. A single Arithmetic Frequency Synthesizer (AFS) chip (with appropriate external circuitry), having the equivalent of sixteen accumulators, can provide a programmable clock at frequencies sixteen times greater than that obtainable with a single accumulator. By using up to four AFS chips in parallel, a 64 times improvement of the system output clock frequency may be obtained.
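One way to picture how several accumulators can raise the output rate is the following Python sketch. It is an assumption-laden illustration, not the AFS chip's actual design: it models a phase accumulator that adds a fixed increment modulo 2^N on every step, and shows that a bank of slower accumulators, each advancing by `lanes` increments per slow tick and offset by one increment from its neighbour, yields the same phase sequence when interleaved. All names and parameter values are hypothetical.

```python
def serial_phases(increment, width_bits, steps):
    """Phase sequence from a single accumulator stepping once per fast clock."""
    modulus = 1 << width_bits
    phase, out = 0, []
    for _ in range(steps):
        phase = (phase + increment) % modulus
        out.append(phase)
    return out


def interleaved_phases(increment, width_bits, steps, lanes):
    """Same sequence from `lanes` accumulators, each stepping once per slow clock."""
    modulus = 1 << width_bits
    # Lane m starts at (m + 1) increments, so the lanes cover consecutive phases.
    state = [(lane + 1) * increment % modulus for lane in range(lanes)]
    out = []
    while len(out) < steps:
        out.extend(state)
        # Every lane jumps ahead by `lanes` increments on each slow tick.
        state = [(p + lanes * increment) % modulus for p in state]
    return out[:steps]


fast = serial_phases(3, 8, 64)
parallel = interleaved_phases(3, 8, 64, 16)
```

In this model each lane runs at one sixteenth of the output sample rate, which is the kind of trade the abstract describes between the number of accumulators and the achievable clock frequency.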
The Honeywell Parallel Programmable Processor (PPP) set of VHSIC chips is intended to provide a flexible architecture for two-dimensional real-time image processing. The chip set consists of three devices: the PPP, the Sequencer, and the Arithmetic Generator. In analyzing the use of these devices in an image processing application, the designer must often simulate the imaging algorithm so that it fits the architecture of the VHSIC devices. In this paper we describe a macro-model of the Honeywell PPP devices written in the DoD Ada language. The Ada model for the PPP uses the concurrency features of Ada to emulate the parallel processing internal to the PPP devices. An image processing algorithm performed on two-dimensional images using the Ada model is described. The Ada model is useful for understanding the processing characteristics of algorithms when they are to be executed on the Honeywell PPP architecture.
Novel circuitry that logarithmically reduces the number of adder delays needed for converting mixed binary to true binary is introduced. This circuitry, which has been built with standard TTL parts, can be easily improved using high speed VLSI CMOS, thereby increasing the overall efficiency of multiplier/converter modules.
ESL is currently building a 350 MFLOP Systolic Adaptive Beamformer. The beamformer implements the Minimum Variance Distortionless Response (MVDR) algorithm. The Systolic Adaptive Beamformer can process over 100 sensors and 280 frequencies in real time. This frequency-domain adaptive beamformer is being developed using a systolic architecture processor implemented with custom VLSI chips.
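For readers unfamiliar with MVDR, the textbook weight formula is w = R^{-1} s / (s^H R^{-1} s), where R is the sensor covariance matrix and s is the steering vector for the look direction. The Python sketch below evaluates it for a single frequency bin. It says nothing about how the ESL systolic processor organizes the computation; the three-sensor array, the interferer, and all names are hypothetical.

```python
import cmath
import math


def solve(matrix, rhs):
    """Solve matrix @ x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [list(row) + [b] for row, b in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            f = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= f * aug[col][c]
    x = [0j] * n
    for r in range(n - 1, -1, -1):
        acc = aug[r][n] - sum(aug[r][c] * x[c] for c in range(r + 1, n))
        x[r] = acc / aug[r][r]
    return x


def mvdr_weights(covariance, steering):
    """w = R^{-1} s / (s^H R^{-1} s)."""
    r_inv_s = solve(covariance, steering)
    denom = sum(s.conjugate() * v for s, v in zip(steering, r_inv_s))
    return [v / denom for v in r_inv_s]


# Hypothetical 3-sensor line array with half-wavelength spacing.
n = 3
look = [1 + 0j] * n                                           # broadside look direction
jam = [cmath.exp(1j * math.pi * 0.5 * k) for k in range(n)]   # interferer at 30 degrees
# Covariance: unit-power white noise plus an interferer of power 10.
cov = [[(1.0 if i == j else 0.0) + 10.0 * jam[i] * jam[j].conjugate()
        for j in range(n)] for i in range(n)]

w = mvdr_weights(cov, look)
look_gain = sum(wi.conjugate() * si for wi, si in zip(w, look))
jam_gain = sum(wi.conjugate() * di for wi, di in zip(w, jam))
```

The weights keep unit gain in the look direction (the distortionless constraint) while the response toward the interferer drops well below the 1/3 that uniform weights would give for this geometry.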
The potential development of digital optical computing systems is discussed. Previous approaches are reviewed and their shortcomings with respect to digital electronic circuitry are analyzed. Fundamental arguments are presented which indicate the potential advantages of optical logic over electronic logic. A generic classification of digital optical computing is presented.
Kalman filtering imposes formidable computational linear algebra requirements for each new input measurement vector. The air-to-air missile guidance problem is addressed, for which an extended Kalman filter (EKF) is required because the measurements are nonlinear in Cartesian coordinates. An explicit formulation is used. At the outset, we discretize the system dynamics and measurement model and incorporate a discrete-time EKF. A factorized LDL^T algorithm is used to propagate the covariance matrices between sample times. A simulation analysis of the number of data bits required in the computations is provided. Comparison with other EKF algorithms shows that this method requires only 18-bit accuracy (compared to 32-40 bits for other methods). Quantitative position, velocity, and acceleration estimates obtained for a highly maneuverable target are presented. A high-accuracy floating-point optical processor is presented that is capable of computing the full EKF to allow a new measurement update each msec.
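The factorization underlying the covariance propagation mentioned above can be sketched as follows. This Python fragment shows only a plain LDL^T factorization of a symmetric positive-definite matrix, with L unit lower triangular and D diagonal; the paper's factorized time-update and measurement-update steps are not reproduced, and the function name and example matrix are hypothetical.

```python
def ldlt(matrix):
    """Factor a symmetric positive-definite matrix as L * diag(d) * L^T."""
    n = len(matrix)
    low = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    d = [0.0] * n
    for j in range(n):
        # Diagonal entry: remove the contribution of earlier columns.
        d[j] = matrix[j][j] - sum(low[j][k] ** 2 * d[k] for k in range(j))
        for i in range(j + 1, n):
            acc = matrix[i][j] - sum(low[i][k] * low[j][k] * d[k] for k in range(j))
            low[i][j] = acc / d[j]
    return low, d


# Hypothetical 3x3 covariance matrix.
p = [[4.0, 2.0, 2.0], [2.0, 5.0, 3.0], [2.0, 3.0, 6.0]]
low, d = ldlt(p)   # d == [4.0, 4.0, 4.0]

# Rebuild P from its factors to confirm the factorization.
rebuilt = [[sum(low[i][k] * d[k] * low[j][k] for k in range(3)) for j in range(3)]
           for i in range(3)]
```

Factorized forms of this kind need no square roots and keep the covariance symmetric by construction, which is a common reason they are favored in short-word-length arithmetic.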
A proposed high performance, many Gigaflop signal processor is described in which 512 processors are interconnected with a 768 by 768 crossbar switch utilizing a spatial light modulator of the type presently under development. Optical fibers are used to provide high speed communication between the processors and the switch. The system, processor nodes, programming and functional operation are described. The advantages are discussed for reconfigurability, optical crossbar switches and programmed data flow. Efficient implementations are presented for: a systolic filter, a fast Fourier transform, a correlator and a matrix-vector multiplier.
A new high-accuracy optical linear algebra processor (OLAP) with many advantageous features is described. It achieves floating-point accuracy, handles bipolar data by sign-magnitude representation, performs LU decomposition using only one channel, partitions easily, and takes data flow into account. A new application for OLAPs, finite element (FE) structural analysis, is introduced, and the results of a case study are presented. Error sources in encoded OLAPs are addressed for the first time. Their modeling and simulation are discussed, and quantitative data are presented. Dominant error sources and the effects of composite error sources are analyzed.
The DMAC (Digital Multiplication by Analog Convolution) algorithm has been shown to be one technique for performing optical matrix multiplication with improved precision. Past work in this area has addressed fixed-point arithmetic only. Presented in this paper is an extension of the DMAC algorithm for handling floating-point binary numbers as well. However, the technique employed for handling floating-point numbers is based on fixed-point concepts. For this reason we choose to call the arithmetic flixed-point, since it is a hybrid of floating-point and fixed-point arithmetic. In this paper we also describe an acousto-optical time-integrating architecture using binary flixed-point arithmetic to perform matrix-vector multiplication. By employing an array of full adders in conjunction with the photodetector array at the back end of this architecture, it is possible to avoid generating the mixed binary outputs that normally result from the use of the DMAC algorithm. Hence, we eliminate the need for the analog-to-digital converters otherwise required to convert mixed binary to pure binary. Preliminary experimental results are also presented.
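The fixed-point core of DMAC, and the mixed binary output it produces, can be illustrated with a short Python sketch. Two integers are multiplied by convolving their bit sequences; the result is a digit sequence whose entries may exceed 1 (mixed binary), which is then converted to pure binary by carry propagation. The flixed-point extension and the full-adder back end described in the abstract are not modelled, and the function names are hypothetical.

```python
def to_bits(value, width):
    """Binary digits of a non-negative integer, least significant bit first."""
    return [(value >> i) & 1 for i in range(width)]


def convolve(a, b):
    """Discrete convolution of two digit sequences."""
    out = [0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return out


def mixed_to_binary(digits):
    """Propagate carries so that every digit becomes 0 or 1 (LSB first)."""
    bits, carry = [], 0
    for digit in digits:
        total = digit + carry
        bits.append(total & 1)
        carry = total >> 1
    while carry:
        bits.append(carry & 1)
        carry >>= 1
    return bits


def from_bits(bits):
    """Integer value of an LSB-first bit sequence."""
    return sum(b << i for i, b in enumerate(bits))


mixed = convolve(to_bits(13, 4), to_bits(11, 4))   # [1, 1, 1, 3, 1, 1, 1]
product = from_bits(mixed_to_binary(mixed))        # 143
```

The digit 3 in the mixed binary result is what an analog detector would have to resolve, which is why architectures that avoid producing mixed binary can dispense with multi-level analog-to-digital conversion.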
Optically implemented threshold logic systems that are characterized by thresholding operations concentrated at one functional location are considered. The objective is to identify architectures and associated integrated optical and holographic techniques that might be used to design superior register-level computation modules. A complete design for a lumped threshold 2-bit multiplier is presented as an example, and methods for general lumped threshold module synthesis are discussed.
A new technology for opto-electronics has been developed: semiconductor multiple quantum wells (MQWs). These MQWs, which are extremely thin (~100 Å) heterostructures, have opto-electronic properties greatly enhanced over conventional semiconductors. For example, we have fabricated a 150 μm long optical modulator with an on/off ratio of 10:1, and used it to generate an optical impulse less than 100 ps long. This technology is compatible with existing source and detector material systems, and produces devices that are compact, high speed, and suitable for integration.
Optical threshold logic implementations of register-level operations such as multiply-accumulate are considered. Specific 2- and 8-bit multiply-add designs using both conventional (Boolean) and threshold logic are described and compared. The threshold logic designs are shown to have advantages of factors of 2 or 3 in number of logic levels, number of logic elements, and number of interconnections. The possibility of all-optical implementations (optical logic elements and optical internal connections) of these designs is discussed in terms of integrated optical networks and nonlinear optical devices, particularly in GaAs structures.
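To show why threshold logic can need fewer elements than Boolean logic, the following Python sketch builds a one-bit full adder from two threshold gates. A threshold gate outputs 1 when the weighted sum of its inputs reaches a threshold. This is a standard textbook construction used here for illustration only; it is not one of the multiply-add designs compared in the paper, and the function names are hypothetical.

```python
def threshold_gate(inputs, weights, threshold):
    """Output 1 when the weighted input sum reaches the threshold, else 0."""
    return 1 if sum(w * x for w, x in zip(weights, inputs)) >= threshold else 0


def full_adder(a, b, carry_in):
    """One-bit full adder built from two threshold gates."""
    # Carry is a majority function: at least two of the three inputs are 1.
    carry_out = threshold_gate([a, b, carry_in], [1, 1, 1], 2)
    # Sum bit: feeding the carry back with weight -2 leaves the parity of the inputs.
    total = threshold_gate([a, b, carry_in, carry_out], [1, 1, 1, -2], 1)
    return total, carry_out


# Exhaustive truth table: (a, b, carry_in) -> (sum, carry_out).
table = {
    (a, b, c): full_adder(a, b, c)
    for a in (0, 1) for b in (0, 1) for c in (0, 1)
}
```

The adder uses two gates in two logic levels, whereas a conventional two-input Boolean gate realization of the same function needs several more gates.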
Lenslet array systems differ from conventional, single-aperture systems in that many, possibly somewhat dissimilar, apertures are used, each containing its own lens. This may add some (mostly conceptual) complexity, but it allows for a great deal of added flexibility. The availability of many independent optical channels offers many opportunities for optical information processing. One important possible class of applications for these systems is smart sensors, which combine image acquisition with preliminary processing for machine vision and efficient image transmission. Another potential application of the device is the DiLAP, or Digital Lenslet Array Processor. This is a combination of an analog lenslet array device with a point nonlinear electro-optical device. It provides highly parallel digital processing capability. This work describes the principles of operation of these Lenslet Array Processor (LAP) systems, reviews some specific experimental examples, and discusses their potential limitations and capabilities.
A real-time incoherent optical image correlator which employs a laser diode, an AM-FM AOD, and a TDI-mode CCD is described. A theoretical analysis is given to indicate that this system possesses high resolution capability by virtue of its special image feeding approach using an AOD in an AM-FM mode. Experimental results showing the system impulse response are presented to confirm the theoretical analysis. Correlation functions of some example images are demonstrated, together with the scale-invariant performance that this incoherent optical correlator achieves by adopting an adaptive AM-FM signal format.
A novel approach to building a large-capacity optical correlator using Vander Lugt's matched spatial filters (MSFs) is introduced. The system capacity is substantially increased by utilizing a wavelength-angle multiplexing scheme. An optical correlator consisting of a tunable dye laser as the light source and an optical dispersive element, such as a diffraction grating, placed in the input plane is presented. While sequentially tuning the wavelength of the line emitted from the dye laser and rotating the diffraction grating about the optical axis, the input signal spectrum is scanned along a trajectory of concentric rings in the Fourier plane. With this 2-D scanning approach, up to 1000 MSFs can be spatially multiplexed on a single holographic plate due to the more effective use of the system space-bandwidth product. In synthesizing the spatially multiplexed MSFs, a monochromatic laser is used instead of a dye laser, since the latter generally does not provide adequate coherence length. In order to accommodate the wavelength shift between the filter recordings and the signal correlation detections, a compensation technique which includes scaling the input signal spectrum and the reference beam angle according to the wavelength ratio is utilized during filter fabrication. The large capacity of the optical correlator can be used effectively at high speed and accuracy under electronic/computer control, either for multiple-signal detection or for the recognition of a single object with scale and orientation variations. Several illustrated experimental demonstrations are also presented.
We have built a parallel-shear interferometer, which provides the visibility of celestial objects in real time. The interferometer produces two pairs of images of the entrance aperture, and fringes of constant visibility form in the overlap area of each pair. Due to atmospheric phase distortions, the fringes have limited spacing and lifetime. This, combined with the sparsity of the photons, imposes strict limitations on the detectors and integration time. The fringes are modulated internally at 100 kHz, above the typical atmospheric frequencies. The image is transferred via a fiber optic bundle to a bank of photomultipliers and preamplifiers off the telescope. In our instrument, twenty digital channels operate in parallel to extract the average fringe modulation. In each channel the photoelectron pulses are fed into a simple counter and two quadrature lock-in counters. The results of all sixty counters are read every few milliseconds. The modulation amplitude is found through the sine and cosine counters, regardless of the random phase. The simple counter serves to remove the Poisson noise bias.
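The role of the quadrature counters above can be illustrated with a noise-free Python sketch. It demodulates a sampled fringe signal against sine and cosine references and recovers the modulation amplitude independently of the fringe phase, with the plain sum giving the mean level. This is an idealized continuous-valued model under stated assumptions: the instrument counts photoelectron pulses, and the Poisson bias correction mentioned in the abstract is not modelled. All names and values are hypothetical.

```python
import math


def fringe_samples(mean, amplitude, phase, per_period, periods):
    """Sampled fringe intensity: mean + amplitude * cos(2*pi*i/per_period + phase)."""
    step = 2 * math.pi / per_period
    return [mean + amplitude * math.cos(step * i + phase)
            for i in range(per_period * periods)]


def lock_in(samples, per_period):
    """Return (mean level, modulation amplitude) from plain, sine and cosine sums."""
    n = len(samples)
    step = 2 * math.pi / per_period
    plain_sum = sum(samples)
    sine_sum = sum(v * math.sin(step * i) for i, v in enumerate(samples))
    cosine_sum = sum(v * math.cos(step * i) for i, v in enumerate(samples))
    # The two quadrature sums combine into a phase-independent amplitude.
    amplitude = 2.0 * math.hypot(sine_sum, cosine_sum) / n
    return plain_sum / n, amplitude


# The same fringe observed at three arbitrary phases gives the same estimates.
estimates = [
    lock_in(fringe_samples(1.0, 0.3, phase, 16, 10), 16)
    for phase in (0.0, 1.0, 2.5)
]
```

In this model the ratio of the recovered amplitude to the mean level is 0.3 regardless of phase, so the random atmospheric phase does not need to be tracked.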
Time- and space-integrating folded spectrum techniques utilizing acousto-optic devices (AODs) as 1-D input transducers are investigated for a potential application as wideband, high-resolution, large-processing-gain spectrum analyzers in the search for extra-terrestrial intelligence (SETI) program. The space-integrating Fourier transform performed by a lens channelizes the coarse spectral components diffracted from an AOD onto an array of time-integrating narrowband fine-resolution spectrum analyzers. The pulsing action of a laser diode samples the interferometrically detected output, aliasing the fine-resolution components to baseband, as required for the subsequent CCD processing. The raster scan mechanism incorporated into the readout of the CCD detector array is used to unfold the 2-D transform, reproducing the desired high-resolution Fourier transform of the input signal.