In many modern signal processing applications, in particular real-time and adaptive signal processing, there is a strong relationship between the algorithms (nested-loop type) that compute solutions to problems and the architectures (parallel/pipeline like) onto which the algorithms can or should be mapped. From this observation we believe that, within this class of applications, the development of algorithms and the design of architectures are best considered simultaneously or even interchangeably. The framework in which such algorithms and architectures can best be expressed and designed is that of flow graphs, which naturally describe both computations and the communications between computations. To ensure a consistent design methodology, it is necessary to have a generic model and well-defined methods with which such designs can be formally undertaken. We introduce these ingredients in this paper by analyzing the prototype application of solving linear equations.
Algorithmic engineering provides a rigorous framework for describing and manipulating the type of building blocks commonly used to define parallel algorithms and architectures for digital signal processing. So far, the concept has only been illustrated by means of some relatively simple examples. These relate to the use of QR decomposition by Givens rotations for the purposes of adaptive filtering and beamforming. In this paper we present a much more challenging example whereby the techniques of algorithmic engineering are used to derive the QRD-based lattice algorithm for multi-channel least squares linear prediction. The elegant simplicity of this derivation, which comprises a sequence of straightforward diagrammatic manipulations, serves to demonstrate the potential power of algorithmic engineering as a formal design technique.
In this paper we derive a unitary eigendecomposition for a sequence of matrices which we call the periodic Schur decomposition. We prove its existence and discuss its application to the solution of periodic difference equations arising in control. We show how the classical QR algorithm can be extended to provide a stable algorithm for computing this generalized decomposition. We apply the decomposition also to cyclic matrices and two point boundary value problems.
Key words. Numerical algorithms, linear algebra, periodic systems, K-cyclic matrices, two-point boundary value problems
In this paper, we propose a method for computing the SVD of a product of two 2 x 2 triangular matrices. We show that our method is numerically desirable in that all relevant residual elements will be numerically small.
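As a concrete point of reference (not the method of the paper, which works on the factors implicitly), the following NumPy sketch forms the explicit 2 x 2 product, takes its SVD, and checks the residual; the example matrices are hypothetical.

```python
import numpy as np

# Baseline computation: form the explicit product of two 2x2 upper
# triangular matrices and take its SVD.  An implicit product-SVD
# scheme, as in the paper, avoids forming P and so avoids the
# accuracy loss that the explicit product can incur.
A = np.array([[2.0, 1.0],
              [0.0, 3.0]])
B = np.array([[1.0, 4.0],
              [0.0, 0.5]])

P = A @ B                          # explicit product
U, s, Vt = np.linalg.svd(P)

# Residual check: U diag(s) V^T should reproduce the product.
residual = np.linalg.norm(U @ np.diag(s) @ Vt - P)
assert residual < 1e-12
```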
Discretized 2-D deconvolution problems arising, e.g., in image restoration and seismic tomography, can be formulated as least squares computations, min ||b - Tx||_2, where T is often a large-scale rectangular Toeplitz-block matrix. We consider solving such block least squares problems by the preconditioned conjugate gradient algorithm using square nonsingular circulant-block and related preconditioners, constructed from the blocks of the rectangular matrix T. Preconditioning with such matrices allows efficient implementation using the 1-D or 2-D Fast Fourier Transform (FFT). It is well known that the resolution of ill-posed deconvolution problems can be substantially improved by regularization to compensate for their ill-posed nature. We show that regularization can easily be incorporated into our preconditioners, and we report on numerical experiments on a Cray Y-MP. The experiments illustrate the good convergence properties of these FFT-based preconditioned iterations.
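The FFT trick underlying such preconditioners can be sketched in one dimension (a toy example with hypothetical data, not the block preconditioner of the paper): a circulant matrix is diagonalized by the DFT, so applying its inverse costs two FFTs and a pointwise division.

```python
import numpy as np

# 1-D sketch: solve C x = b for a circulant C in O(n log n) via the
# FFT, instead of a dense O(n^3) solve.  This is the operation a
# circulant preconditioner performs at every CG iteration.
n = 8
c = np.array([4., 1., 0., 0., 0., 0., 0., 1.])  # first column of C
                                                # (diagonally dominant,
                                                # so C is well conditioned)
rng = np.random.default_rng(0)
b = rng.standard_normal(n)

# Eigenvalues of C are fft(c); solve by pointwise division in the
# Fourier domain.
x = np.fft.ifft(np.fft.fft(b) / np.fft.fft(c)).real

# Verify against the explicit circulant matrix.
C = np.array([np.roll(c, k) for k in range(n)]).T
assert np.allclose(C @ x, b)
```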
Algorithm-based fault tolerance (ABFT) provides low-cost error protection for VLSI processor arrays used in real-time digital signal processing. Several ABFT techniques (weighted checksum) have been proposed to design fault-tolerant LU decomposition, QR decomposition, and matrix inversion. In these schemes, encoding/decoding uses either multiplication or division, so the overhead is high. In this paper, we propose new encoding/decoding methods for designing fault-tolerant matrix operations. Since the new encoding/decoding methods use only additions and subtractions, the overhead of a fault-tolerant processor array can be drastically reduced.
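The checksum idea can be illustrated with the classical unweighted encoding of Huang and Abraham (a sketch, not the new encodings proposed in the paper): encoding and verification use only additions.

```python
import numpy as np

# Classical checksum ABFT for matrix multiplication: A gets an extra
# row of column sums, B an extra column of row sums; the product then
# carries both checks automatically.
A = np.array([[1., 2.], [3., 4.]])
B = np.array([[5., 6.], [7., 8.]])

Ac = np.vstack([A, A.sum(axis=0)])                  # column-checksum A
Br = np.hstack([B, B.sum(axis=1, keepdims=True)])   # row-checksum B

C = Ac @ Br                                         # full-checksum product

# Verification is again pure addition: the last row (column) must equal
# the sum of the data rows (columns).
assert np.allclose(C[-1, :], C[:-1, :].sum(axis=0))
assert np.allclose(C[:, -1], C[:, :-1].sum(axis=1))

# A single corrupted entry breaks exactly one row check and one column
# check, which locates the fault.
C[0, 0] += 1.0
assert not np.allclose(C[-1, :-1], C[:-1, :-1].sum(axis=0))
```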
In this paper we address the problem of implementing a least squares problem, such as that arising in adaptive beamforming, on a multiprocessor network comprising general-purpose DSP processors. Although optimal array processors have been proposed for this problem, an implementation on a generalized network is a more flexible, modular, and reconfigurable solution, suited to a highly dynamic environment with changing applications and problem sizes. The parallelization issues are explored and a scheme for an efficient implementation is described.
This paper explores novel techniques involving number theoretic concepts to perform real-time digital signal processing for high-bandwidth data stream applications. Often the arithmetic manipulations are simple in form (cascades of additions and multiplications in a well-defined structure), but the number of operations that have to be computed every second can be large. This paper discusses ways in which new number theoretic mapping techniques can be used to perform DSP operations, both by reducing the amount of hardware involved in the circuitry and by allowing the construction of very benign architectures down to the individual cells. Such architectures can be used in aggressive VLSI/ULSI implementations. We restrict ourselves to the computation of linear filter and transform algorithms, with the inner product form, which probably account for the vast majority of digital signal processing functions implemented commercially.
We present algorithms for n-bit multiplication and division suitable for execution in k < n bit-wide processing elements in massively parallel computers. A combined implementation is proposed and its performance, in terms of the number of instructions required, is evaluated.
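The basic decomposition such algorithms rely on can be sketched as schoolbook multi-limb multiplication (a generic illustration, not the paper's combined scheme): an n-bit operand is split into k-bit limbs so that every partial product fits the narrow datapath.

```python
# Schoolbook multiplication of n-bit numbers split into k-bit limbs,
# using only k-bit by k-bit products plus carry propagation -- the kind
# of operation a k-bit processing element can execute natively.
K = 4                      # hypothetical PE word width in bits
BASE = 1 << K

def to_limbs(x, n_limbs):
    return [(x >> (K * i)) & (BASE - 1) for i in range(n_limbs)]

def limb_mul(a, b, n_limbs):
    """Multiply two n_limbs-limb numbers limb by limb."""
    al, bl = to_limbs(a, n_limbs), to_limbs(b, n_limbs)
    acc = [0] * (2 * n_limbs)
    for i in range(n_limbs):
        carry = 0
        for j in range(n_limbs):
            t = acc[i + j] + al[i] * bl[j] + carry  # double-width accumulate
            acc[i + j] = t & (BASE - 1)
            carry = t >> K
        acc[i + n_limbs] += carry
    # Reassemble the 2n-bit result from the limbs.
    return sum(d << (K * i) for i, d in enumerate(acc))

assert limb_mul(0xBEEF, 0xCAFE, 4) == 0xBEEF * 0xCAFE
```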
Orthogonal matrix transformations form an important part of matrix-based signal processing applications. Systolic arrays for computing these algorithms have been developed, and the size of these arrays usually depends directly on the size of the problem. For large matrix sizes, implementing large numbers of processors in hardware is not physically feasible. In this paper, we examine two popular orthogonal transformations, Givens rotations and Householder transformations (HT), from the viewpoint of realizing a fixed-size parallel processor array that can handle large data matrices. An efficient scheduling procedure is used to compute the HT on a systolic-type array, and its performance is compared with that of an array designed for computing the Givens method. An important conclusion resulting from the comparison is that the performance of the HT array is superior to that of the Givens method when the matrices are large compared to the array size.
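The elementary operation the Givens-based arrays are built from can be sketched in a few lines (a generic illustration with a hypothetical matrix): a 2 x 2 plane rotation chosen to annihilate one matrix entry.

```python
import numpy as np

# A single Givens rotation: choose (c, s) so that the orthogonal matrix
# [[c, s], [-s, c]] rotates the vector (a, b) onto (r, 0), zeroing the
# subdiagonal entry.  Systolic QR arrays apply one such rotation per cell.
def givens(a, b):
    r = np.hypot(a, b)
    return (1.0, 0.0) if r == 0 else (a / r, b / r)

A = np.array([[3.0, 1.0],
              [4.0, 2.0]])
c, s = givens(A[0, 0], A[1, 0])
G = np.array([[c, s], [-s, c]])
R = G @ A                      # entry (1, 0) is rotated to zero

assert abs(R[1, 0]) < 1e-12
assert np.isclose(R[0, 0], 5.0)    # hypot(3, 4)
```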
The use of wireless personal communication networks (PCN) to increase the capacity of cellular radio systems is a topic currently receiving much attention. In addition to the increase in capacity one achieves by going to a microcellular system, one can increase the spectral efficiency even further by allowing the PCN to overlay a frequency band which is already occupied with narrowband waveforms. Various signal processing techniques can be used advantageously in this regard, and such techniques are outlined below.
The Viterbi algorithm (VA) for decoding convolutionally encoded data has historically been implemented on special-purpose digital electronic hardware. For short-to-moderate (K = 3 to 9) constraint length codes, a primary design goal is to maximize the decoded bit rate while minimizing circuit area. In recent years, a number of special-purpose architectures based upon shuffle-exchange networks, cube-connected cycles, ring-based networks, systolic arrays, or programmable processors have been designed for efficient implementation of the VA at these and longer constraint lengths. However, at the same time, the performance-to-cost ratio of high-end general-purpose computing machines has been improving dramatically. Recognizing the substantial investment in time and resources required to design and build an ASIC-based decoder for long (K = 10 to 15) constraint length codes, the feasibility of implementing the VA as a background process on a readily available general-purpose parallel processing machine deserves exploration. We consider the limitations and benefits of a Viterbi decoder for long constraint length codes implemented in software on a general-purpose parallel processing machine.
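The add-compare-select recursion that both the ASIC and the parallel software decoders distribute across processing elements can be sketched for the standard rate-1/2, K = 3 code with generators 7 and 5 (octal); the message below is a hypothetical example.

```python
# Minimal Viterbi decoder for the rate-1/2, K = 3 convolutional code
# with generator polynomials 7 and 5 (octal).  Each trellis step is one
# add-compare-select (ACS) pass over the 2^(K-1) = 4 survivor states.
G = (0b111, 0b101)

def encode(bits):
    state, out = 0, []
    for b in bits:
        state = ((state << 1) | b) & 0b111
        out.append(tuple(bin(state & g).count('1') & 1 for g in G))
    return out

def viterbi(symbols):
    n_states = 4
    dist = [0] + [10**9] * (n_states - 1)     # encoder starts in state 0
    paths = [[] for _ in range(n_states)]
    for sym in symbols:
        new_dist = [10**9] * n_states
        new_paths = [None] * n_states
        for s in range(n_states):
            for b in (0, 1):                  # hypothesized input bit
                full = (s << 1) | b           # 3-bit shift-register contents
                nxt = full & 0b11
                expect = tuple(bin(full & g).count('1') & 1 for g in G)
                metric = dist[s] + sum(x != y for x, y in zip(sym, expect))
                if metric < new_dist[nxt]:    # add-compare-select
                    new_dist[nxt] = metric
                    new_paths[nxt] = paths[s] + [b]
        dist, paths = new_dist, new_paths
    return paths[dist.index(min(dist))]

msg = [1, 0, 1, 1, 0, 0]                      # two tail zeros flush the code
assert viterbi(encode(msg)) == msg
```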
Central to `multimedia' image processing is the desire to encode computer graphics data into a standard television signal, complete with line, field, and color subcarrier synchronizing information. The numerous incompatibilities between television and computer display standards render this operation far less trivial than it sounds to anyone who hasn't worked with both types of signals. To simplify the task of encoding computer graphics signals into standard NTSC (North America and Japan) or PAL (most of Europe) television format for display, broadcast, or recording, TRW LSI Products Inc. has introduced the two newest members of its multimedia integrated circuit family, the TMC22090 and TMC22190 digital video encoders.
The NRL FLEX processor architecture is designed for real-time radar signal processing and was described at last year's conference in the context of its original application, the Point Defense Demonstration Radar project at NRL. This paper describes the current status of the processor and the application of the same architecture to a new problem which requires a different board configuration to satisfy a different set of latency and data flow requirements.
Many computational schemes in linear algebra can be studied from the point of view of (discrete) time-varying linear systems theory. For example, the operation `multiplication of a vector by an upper triangular matrix' can be represented by a computational scheme (or model) that acts on the entries of the vector sequentially. The number of intermediate quantities (`states') that are needed in the computations is a measure of the complexity of the model. If the matrix is large but its complexity is low, then not only multiplication, but also other operations such as inversion and factorization, can be carried out efficiently using the model rather than the original matrix. In the present paper we discuss a number of techniques in time-varying system theory that can be used to capture a given matrix into such a computational network.
This paper presents modifications of the continuous Hopfield and Hartline-Ratliff networks for use in signal restoration and parameter estimation. The particular parameter estimation problem of interest is concerned with the estimation of the directions of arrival of an unknown number of plane waves in unknown noise. Restoration of linearly distorted noisy images is considered as an example of regularized restoration of a signal with known dynamic range.
Coupled digital phase-locked loops (CDPLLs), derived from extended Kalman filter theory, have been shown to suppress adjacent and co-channel interferers. This paper addresses the ability of these CDPLLs to separate co-channel signals. State observability criteria are defined and examined for non-linear state observations which model phase modulation. These metrics can then be used to address signal separation and tracking behavior of CDPLLs.
Recovery of rotational parameters is a very important problem in a navigation system. A constrained least squares (CLS) method, where the estimated rotation matrix is constrained to be orthonormal, or a quaternion-based least squares method is used to estimate the parameters. However, the solution is restricted to analyzing the motion between two successive frames only. In this paper, we propose a computationally efficient recursive scheme to estimate the rotation parameters from a long sequence of data using the quaternion formulation. The proposed method is based on the rotation of the input data at different frames to approximately align them to a given subspace. The left (or right) singular subspace defined by the running sum of the outer products of the available data points is used as the alignment matrix. The exact alignment is then obtained using the singular value decomposition (SVD) technique. The recursive estimate of the quaternion representing the constant, fixed-axis rotation is shown to be given by the vector corresponding to the smallest singular value. Thus, the proposed formulation is identical to the problem of recursive updating of symmetric matrices. A recursive CLS method is also developed using a similar technique, which yields estimates identical to those of the previous method for a given data set. The details of these methods and simulation results are presented in this paper.
The problem of linearly constrained least squares has many applications in signal processing. In this paper, we present a perturbation analysis of a novel, linearly constrained least squares algorithm for adaptive beamforming. The perturbation bounds for the solution as well as for the latest residual element are derived. We also propose an error estimation scheme for the residual element. This error estimation scheme can be incorporated into a systolic array implementation of the algorithm. The additional time required for error estimation is small, as we show how the processor idle time can be used to carry out almost all the calculations.
The direction-of-arrival (DOA) estimation problem consists in determining the angles of arrival of a number of signals impinging on a sensor array. Recently, the so-called ESPRIT method of Roy and Paulraj has received a great deal of attention in the literature. The method employs matrix decomposition techniques, such as the singular value decomposition (SVD) and the generalized Schur decomposition (GSD). The computational complexity thus involved represents a serious impediment, especially when a real-time implementation is aimed for. Therefore, the aim here is to develop an ESPRIT-type algorithm which is fully adaptive and amenable to parallel implementation. By introducing adaptivity, the computational complexity per sampling period is reduced from O(m^3) to O(m^2), where m is the `problem size,' i.e., the number of antenna doublets. On a parallel processor array with O(m^2) processors, the throughput is then O(m^0), which means that the number of measurements that are processed per time unit is independent of the problem size. The algorithm is based on an adaptive SVD updating algorithm, combined with an adaptive GSD. The corresponding systolic implementation is based on the systolic SVD updating array of Moonen et al.
A robust focusing matrix, called unitary constrained array manifold focusing, is proposed for the coherent signal-subspace method, which is a novel approach to direction finding of broadband sources. The proposed focusing matrix is robust against size variations of the focusing sectors obtained from preliminary group angle estimates. To quantify the information loss due to source focusing, a new measure, the relative efficiency index, is introduced. This index is useful in assessing the statistical sufficiency of a data-reduced statistic formed by source focusing. The sensitivity issue, as well as estimation performance comparisons for various focusing schemes, based on simulations and the relative efficiency index, are also presented.
In this paper, under the assumption that the noise correlation is spatially limited, we propose a new parametric approach, using two separated arrays, for consistent direction-of-arrival (DOA) estimation in unknown noise environments. The theoretical performance analysis of the proposed DOA estimates is also presented. Using this theoretical performance, the best weighting matrices of the parametric criteria are derived. More significantly, it is shown that, within the best weighted criteria, using canonical decomposition we can achieve optimal DOA estimation performance among a large set of eigendecompositions.
A direction-finding algorithm employing the technique of array manifold interpolation is proposed for the estimation of directions-of-arrival of multiple narrow-band sources. The algorithm is formulated by applying array manifold interpolation to an ESPRIT-type method. As a result, a virtual array can be interpolated from the data collected from a real array, thereby eliminating the need for two identical arrays as in ESPRIT. Moreover, only a single interpolation matrix for the entire field-of-view is required, thus significantly reducing the computational load of the sector-divided interpolation scheme. Simulation results are provided to show the effectiveness of the proposed algorithm.
This paper consists of two parts. The first part reviews a class of higher-order Wigner-Ville distributions projected onto a single frequency axis. This class is referred to as polynomial Wigner-Ville distributions (PWVDs). For random processes, the expected value of a PWVD represents a time-varying higher-order moment spectrum. The second part defines and studies a particular member of a class of time-varying higher-order spectra based on PWVDs, namely a reduced Wigner-Ville trispectrum (RWVT). This novel time-frequency representation is shown to be a very efficient tool for the analysis of FM signals affected by Gaussian amplitude modulation. In the paper, we present a statistical comparison of RWVT- and WVD-based instantaneous frequency estimates for linear FM signals in the absence and presence of multiplicative noise. For multicomponent signals, the RWVT has to be re-defined and calculated in the full multi-lag domain.
A general overview of the definition and properties of the recently defined Wigner higher order moment spectra (WHOS) is given in this paper. It is shown how the properties satisfied by WHOS extend the well-known properties of the Wigner-Ville distribution to a higher-order domain. In addition, a general class formulation, the discrete implementation, a comparison with a cumulant-based definition, the reduced interference condition, and some applications are also discussed.
A new class of time-frequency kernels is introduced. Members of this class satisfy the desired time-frequency distribution properties and simultaneously provide local autocorrelation functions (LAF) which are amenable to high resolution techniques over periods of stationarity. These high spectral resolution kernels map the sinusoids in time into damped/undamped sinusoidal bilinear data products over the LAF lag variable. The damped sinusoids represent cross-terms. Using SVD-based backward linear prediction techniques, the signal zeros, the cross-term zeros, and the extraneous zeros, respectively, lie on, outside, and inside the unit circle, providing a mechanism to distinguish between different types of components. It is shown that the binomial kernel introduced is a member of this class.
Detection and classification of cyclostationary signals in noise of unknown distribution is addressed, and novel tests for cyclostationarity are proposed. Both cases of known and unknown signal statistics are considered. The proposed approaches exploit the asymptotic normality of sample cyclic-cumulant and polyspectrum estimators to derive asymptotically optimal chi-square tests. Simpler, but generally suboptimal, versions are also presented. Simulations are performed to test the proposed algorithms and illustrate their insensitivity to any stationary noise, as well as the ability of higher-than-second-order schemes to suppress cyclostationary Gaussian interference of unknown covariance.
Some recent results relating to system identification are described and illustrated in this contribution. The system considered is nonlinear and time-invariant, being represented by a Volterra series up to second order. Closed-form expressions for the transfer functions of first and second order are derived for a class of non-Gaussian stationary input processes. It is shown that the obtained parameters are optimum in the mean square sense. Once the system is identified, we derive a closed-form expression for the quadratic coherence, which is a measure of the goodness of fit of the quadratic model. It is shown that this expression simplifies to well-known results when the system is linear or its input is Gaussian. Furthermore, we develop estimates for the transfer functions and the quadratic coherence from spectral and bispectral estimates, based on averaged periodograms and biperiodograms of data stretches of the observed input and output of the system. This method is tested and validated using simulated input-output data of a known quadratically nonlinear system with known input signal statistics. Finally, we discuss the problem of testing a specified value of the quadratic coherence.
Techniques for machine recognition of speech exist, but use representations of the speech signal which cannot convey some of the speech information. Alternatives from time-frequency and time-scale analysis are compared with the traditional methods with a view to improving the word recognition rate. Experiments have shown the speech recognition rate improves in some cases.
Cohen's class of time-frequency distributions has been recognized to have significant potential for the analysis of complicated signals. The spectrogram, though it offers comparatively lower time-frequency resolution than other, more recently investigated members of Cohen's class, is still the most broadly used TFD today. Packages are available which perform SP evaluation and offer signal synthesis from the closely related short-time Fourier transform. These packages allow the SP to be the widely accessible signal processing tool that it is. In this paper, we introduce two decompositions of TFDs: the SP decomposition and the weighted reversal correlator decomposition. The decompositions are useful for TFD interpretation, for fast implementations using SP building blocks, and for a related high-resolution, linear signal synthesis method using STFT signal synthesis building blocks. It is hoped that these decompositions will facilitate the development of TFD signal analysis/synthesis packages (using existing SP analysis/synthesis packages), and that these packages will make TFDs other than the SP accessible to a broad audience.
The Radon transform of the Wigner spectrum has been shown to be the optimal detection scheme for linear FM signals, but the properties of this representation have not been well characterized. By the projection slice theorem, each line integral through the Wigner spectrum corresponds to the inverse Fourier transform of a radial slice through the ambiguity plane. Since line integrals through the Wigner spectrum can be calculated by dechirping, calculation of the Wigner spectrum may be viewed as a tomographic reconstruction problem. In this paper we show that all time-frequency transforms of Cohen's class may be achieved by simple changes in backprojection reconstruction filtering. The resolution-ringing trade-off that occurs in computed tomography is shown to be analogous to the resolution-cross-term trade-off that occurs in time-frequency kernel selection. `Ideal' reconstruction using a purely differentiating backprojection filter yields the Wigner distribution, while low-pass differentiating filters produce cross-term-suppressing distributions such as the spectrogram or the Born-Jordan distribution. The one-to-one identities between the Wigner, Radon-Wigner, and ambiguity planes suggest that the Radon-Wigner domain may be a new design space for time-frequency filtering and kernel design. The distribution of white noise in this space is presented, as well as some simple examples of time-varying filtering.
A simple formulation is given for generating convolution theorems in any representation. Using this method we obtain the convolution theorem for the scale representation. We generalize the concept of invariance to any basis set and devise a method for handling linear invariant systems for arbitrary quantities. The Wiener-Khinchin theorem is generalized to arbitrary power energy densities. Also, we show how standard probability theory can be formulated in terms of signals.
A tree-structured wavelet transform has been developed for texture classification in our previous work. The new transform, which offers a non-redundant representation, is able to zoom into dominant frequency channels containing significant information of textures and can be interpreted as the decomposition of a 2-D function with the wavelet packet basis. In this research, we extend our work to the texture segmentation problem. A new multiscale texture segmentation algorithm based on the tree-structured wavelet transform and hierarchical fuzzy clustering technique is proposed. Numerical experiments are given to demonstrate the performance of our new algorithm.
This paper describes a technique for classification of 2-D discrete signals. It consists of four modules, namely the partition, representation, measurement, and the classification modules. The first of these either passes the observed signal as a whole or divides it into subregions which may or may not overlap. The representation module first computes the shift-invariant multiscale wavelet representations (MSWAR) of the reference and the observed signals and then generates a corresponding set of 1-D signatures. The measurement module extracts those vital signal features to which the decision rules of the classification module are applied. The paper presents the design and implementation of each of these modules, emphasizing theoretical background behind the design and efficiency of their implementation. Also some preliminary results have been included that demonstrate the ability of this technique to classify observed signals that are corrupted by different types of deformities.
The problem of detecting micropotentials in high-resolution ECGs is of great importance for predicting ventricular tachycardia. Since both conventional time domain and frequency domain methods present serious limitations, we address this problem using a high-resolution time-frequency method. The possibilities of the spectrogram and the Wigner distribution (WD) are evaluated in order to analyze ECGs acquired from patients with and without ventricular tachycardia. Furthermore, quantitative features are extracted from those representations in order to discriminate between the two classes.
In this paper, a new framework for understanding the performance of adaptive IIR filters is proposed and shown to enhance the understanding of filter stability during adaptation in time-varying data environments. This new understanding arises from considering the error surface as defined by the data, rather than the statistics of the data, in a manner analogous to that used to develop recursive least squares (RLS) FIR adaptive filters. The stability of the filter is shown to be dependent on the problem formulation. Defining the time-varying error surface using frequency domain, rather than time domain, methods is shown to result in substantially better stability structure in the performance surface.
Cumulants are employed for classification and synthesis of textured images because they suppress additive Gaussian noise of unknown covariance and are capable of resolving phase and causality issues in stationary non-Gaussian random fields. Their performance is compared with existing autocorrelation based approaches which offer sample estimates of smaller variance and lower computational complexity. Nonlinear matching techniques improve over linear equation methods in estimating parameters of non-Gaussian random fields especially under model mismatch. Seasonal 1-D sequences allow for semi-stationary 2-D models and their performance is illustrated on synthetic space variant textures. The potential of prolate spheroidal basis expansion is also described for parsimonious nonstationary modeling of space variant textured images.
We propose a novel technique for the least-squares reconstruction of the Fourier phase of an image from the modulo-2π phase of its bispectrum. The proposed technique unwraps the given bispectral phase and reconstructs the Fourier phase of the image iteratively through alternating projections onto two particular constraint sets: (1) the set of bispectral phase functions that differ from the given modulo-2π bispectral phase by only integral multiples of 2π at every sample point, and (2) the set of bispectral phase functions that correspond to deterministic signals. We formally define the above sets and their corresponding projection operators. We describe the iterative algorithm in detail, and provide experimental results to demonstrate its performance.
Higher-order Wigner distributions are not unique: definitions differ in the lag separations between the terms used in the time-domain product, as well as in how many of the terms are conjugated. We study a class of third-order WDs (TWD), parameterized by a single parameter α, and show that there is a duality between the choices α = -1/3 (Gerr's definition) and α = -1. Interesting signal attributes, such as the instantaneous frequency, the derivative of the log-magnitude, and the group delay, can be recovered from the TWD. Important issues such as aliasing problems and sampling requirements, and whether or not the analytic form of a real signal should be used, are addressed. It is shown theoretically that the TWD with α = -1 is particularly useful for the detection of transients in the presence of colored Gaussian noise.