The combination of a 60 fps kV x-ray flat-panel imager, a 19-focal-spot kV x-ray tube enabled by a steered electron beam, and GPU-based SART or SIRT sliding reconstruction allows real-time, 6 fps, 3D-rendered digital tomosynthesis tracking of the respiratory motion of lung cancer lesions. The tube consists of a “U”-shaped vacuum chamber with 19 tungsten anodes, spread uniformly over three sides of a 30 cm x 30 cm square, each attached to a cylindrical copper heat sink cooled by flowing water. The beam from an electron gun is steered and focused onto each of the 19 anodes in a predetermined sequence by a series of dipole, quadrupole, and solenoid magnets. The imager consists of 0.194 mm pixels laid out in 1576 rows by 2048 columns, binned 4x4 to achieve 60 fps projection-image operation with 16 bits of dynamic range. The system is intended for use with free-breathing patients during ordinary linac C-arm radiotherapy, with only modest modifications to typical system hardware and to standard clinical treatment delivery protocols. A sliding digital tomosynthesis reconstruction is completed after every 10 projection images acquired at 60 fps, each reconstruction using the last 19 such projection images at less than 8 mAs exposure per 3D-rendered frame. Comparisons to “ground truth” optical imaging and to diagnostic 10-phase 4D CT images are being used to determine the accuracy and limitations of the various versions of this new “19-projection-image x-ray tomosynthesis fluoroscopy” motion tracking technique.
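The acquisition/reconstruction cadence described above (a new 3D frame after every 10 acquired projections, always built from the last 19) can be sketched as a sliding window. The function below is a minimal illustration of that schedule only, not of the GPU reconstruction itself; the names are our own.

```python
from collections import deque

def sliding_tomo_schedule(n_projections, window=19, stride=10):
    """Simulate the sliding-window schedule: a new 3D frame is
    reconstructed after every `stride` newly acquired projections,
    each time from the most recent `window` projections."""
    buf = deque(maxlen=window)          # holds indices of recent projections
    frames = []
    for i in range(n_projections):
        buf.append(i)
        # reconstruct only once a full window is available
        if len(buf) == window and (i + 1) % stride == 0:
            frames.append(list(buf))
    return frames

# At 60 projections/s and one frame per 10 projections, one second of
# acquisition yields ~6 reconstructed frames (5 here, since the first
# frame must wait for 19 projections).
frames = sliding_tomo_schedule(60)
```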
We propose an Automatic Threat Detection (ATD) algorithm for Explosive Detection Systems (EDS), consisting of our multi-stage Segmentation and Carving (SC) step followed by a Support Vector Machine (SVM) classifier. The multi-stage SC step extracts all suspect 3-D objects. A feature vector is then constructed for each extracted object and classified by an SVM previously trained on a set of ground-truth threat and benign objects. The trained SVM classifier has been shown to be effective in classifying different types of threat materials.
The proposed ATD algorithm robustly deals with CT data that are prone to artifacts due to scatter and beam hardening, as well as other systematic idiosyncrasies of the CT data. Furthermore, the proposed ATD algorithm is amenable to incorporating newly emerging threat materials as well as to accommodating data from newly developed sensor technologies.
The efficacy of the proposed ATD algorithm with the SVM classifier is demonstrated by the Receiver Operating Characteristic (ROC) curve, which plots the Probability of Detection (PD) as a function of the Probability of False Alarm (PFA). Tests performed using CT data of passenger bags show excellent performance characteristics.
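The PD-versus-PFA relationship can be computed directly from classifier scores. The sketch below sweeps a decision threshold over hypothetical SVM decision scores (the score values are illustrative, not from the paper) and reports (PFA, PD) pairs:

```python
def roc_points(scores_threat, scores_benign):
    """Sweep a decision threshold over classifier scores and return
    (PFA, PD) pairs: PD = fraction of true threats flagged,
    PFA = fraction of benign objects falsely flagged."""
    thresholds = sorted(set(scores_threat + scores_benign), reverse=True)
    points = []
    for t in thresholds:
        pd = sum(s >= t for s in scores_threat) / len(scores_threat)
        pfa = sum(s >= t for s in scores_benign) / len(scores_benign)
        points.append((pfa, pd))
    return points

# Hypothetical scores: higher means "more threat-like".
curve = roc_points([0.9, 0.8, 0.4], [0.7, 0.2, 0.1])
```

Lowering the threshold trades a higher PD for a higher PFA; the ROC curve traces exactly that trade-off.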
In this paper, we propose an algebraic reconstruction technique (ART) based discrete tomography method to reconstruct
an image accurately using projections from a few views. We specifically consider the problem of reconstructing an
image of bottles filled with various types of liquids from X-ray projections. By exploiting the fact that bottles are usually
filled with homogeneous material, we show that it is possible to obtain accurate reconstruction with only a few
projections by an ART-based algorithm. In order to deal with various types of liquids in our problem, we first introduce our discrete steering method, which generalizes the binary steering approach to our proposed multi-valued discrete reconstruction. The main idea of the steering approach is to use slowly varying thresholds instead of fixed thresholds. We further improve reconstruction accuracy by reducing the number of variables in ART, combining our discrete steering with the discrete ART (DART) method, which fixes the values of interior pixels of segmented regions considered reliable. Simulation studies show that our proposed discrete steering combined with DART yields more accurate reconstructions than either discrete steering alone or DART alone. The resulting reconstructions are quite accurate even with projections from only four views.
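A minimal sketch of the steering idea on a toy two-pixel system: Kaczmarz-style ART sweeps, followed after each sweep by a snap-to-level step whose tolerance is relaxed slowly over the iterations. This is a simplified illustration under our own snapping rule, not the paper's exact steering or DART machinery.

```python
def art_discrete_steering(A, b, levels, n_sweeps=30, tol0=0.05, grow=1.1):
    """ART (Kaczmarz) with a slowly varying steering threshold:
    after each sweep, any pixel within the current tolerance of a
    discrete grey level is snapped to it; the tolerance grows slowly,
    so early snapping is conservative."""
    n = len(A[0])
    x = [0.0] * n
    tol = tol0
    for _ in range(n_sweeps):
        for row, bi in zip(A, b):           # one ART sweep over all rays
            dot = sum(r * xi for r, xi in zip(row, x))
            norm2 = sum(r * r for r in row)
            lam = (bi - dot) / norm2
            x = [xi + lam * r for xi, r in zip(x, row)]
        for i, xi in enumerate(x):          # steering: snap near-level pixels
            nearest = min(levels, key=lambda v: abs(v - xi))
            if abs(nearest - xi) < tol:
                x[i] = nearest
        tol *= grow                         # relax the threshold slowly
    return x

# Two rays through two pixels; the true discrete image is [1, 0].
x = art_discrete_steering([[1.0, 1.0], [1.0, 0.0]], [1.0, 1.0], [0.0, 1.0])
```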
We present 3-D electronic unpacking techniques for airport security images, based on volume rendering methods developed for medical applications. Two electronic unpacking techniques are presented: (1) object-based unpacking and
(2) unpacking by bag-slicing. Both techniques provide photo-realistic 3-D views of contents inside a packed bag with
clearly marked threats. For the object-based unpacking, the 3-D objects within packed bags are unpacked (or isolated)
through object selection tools that cut away undesired regions to isolate the 3-D object from the background clutter.
With this selection tool, the operator is able to electronically unpack various 3-D objects and manipulate (rotate and
zoom) the 3-D photo-realistic views for the immediate classification of the suspect object. The unpacking-by-bag-slicing technique places arbitrary cut planes, showing the content beyond each plane, which can be stepped forward or backward electronically. The methods may be used to reduce the need for manual unpacking of suitcases.
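For axis-aligned planes, bag-slicing reduces to extracting and stepping a slice index through the volume. A minimal sketch, assuming the volume is stored as nested lists in `volume[z][y][x]` order (our assumption, not the paper's data layout):

```python
def cut_plane(volume, axis, index):
    """Extract the axis-aligned cut plane at `index` from a nested-list
    volume[z][y][x]; stepping `index` forward or backward pages
    electronically through the bag."""
    nz, ny, nx = len(volume), len(volume[0]), len(volume[0][0])
    if axis == 0:                         # z plane: one stored slice
        return [row[:] for row in volume[index]]
    if axis == 1:                         # y plane
        return [[volume[z][index][x] for x in range(nx)] for z in range(nz)]
    return [[volume[z][y][index] for y in range(ny)] for z in range(nz)]

# A 2x2x2 toy volume; stepping index 0 -> 1 along any axis pages through it.
v = [[[1, 2], [3, 4]], [[5, 6], [7, 8]]]
```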
In contrast to X-rays, ultrasound propagates along a curved path due to spatial variations in the refractive index of the medium. Thus, for ultrasonic TOF tomography, the propagation path of the ultrasound must be known to correctly reconstruct the slice image. In this paper, we propose a new path determination algorithm, which is essentially a numerical solution of the eikonal equation viewed as a boundary value problem. Due to the curved propagation path of ultrasound, the image reconstruction algorithm takes an algebraic approach, such as ART or SART. Note that the image reconstruction step requires the propagation paths, while the paths can be determined only if the image is known. Thus, an iterative approach is taken to resolve this apparent dilemma. First, the slice image is reconstructed assuming straight propagation paths. Then the paths are computed from the most recently reconstructed image using our path determination algorithm and used to update the reconstructed image. This alternation of image reconstruction and path determination repeats until convergence. The approach is tested using both simulated data and a real concrete structure scanned by a mechanical scanner.
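On a discrete grid, the minimum-travel-time path of the eikonal boundary-value problem can be approximated by a shortest-path search over slowness-weighted edges. The sketch below uses Dijkstra's algorithm as a simple stand-in for the paper's path determination step (the grid, edge-cost rule, and 4-connectivity are our own simplifications):

```python
import heapq

def fastest_path(slowness, src, dst):
    """Discrete analogue of the bent-ray path problem: minimum-travel-time
    path through a grid of slowness values (travel time per cell step),
    found with Dijkstra. A fast corridor bends the 'ray' off the straight line."""
    rows, cols = len(slowness), len(slowness[0])
    dist, prev, seen = {src: 0.0}, {}, set()
    pq = [(0.0, src)]
    while pq:
        d, u = heapq.heappop(pq)
        if u in seen:
            continue
        seen.add(u)
        if u == dst:
            break
        r, c = u
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            v = (r + dr, c + dc)
            if 0 <= v[0] < rows and 0 <= v[1] < cols:
                # edge cost: average slowness of the two cells
                nd = d + 0.5 * (slowness[r][c] + slowness[v[0]][v[1]])
                if nd < dist.get(v, float('inf')):
                    dist[v], prev[v] = nd, u
                    heapq.heappush(pq, (nd, v))
    path, node = [], dst
    while node != src:
        path.append(node)
        node = prev[node]
    path.append(src)
    return dist[dst], path[::-1]
```

With a fast channel on the right edge, the ray from (0,0) to (2,0) detours through it instead of crossing the slow middle row directly, mirroring how reconstructed slowness fields redirect the computed paths between iterations.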
Three-dimensional visualization of medical images using maximum intensity projection (MIP) requires isotropic volume data for the generation of realistic and undistorted 3-D views. However, the distance between CT slices is usually larger than the pixel spacing within each slice. Therefore, before the MIP operation, the axial slice images must be interpolated to prepare an isotropic data set. Of the many available interpolation techniques, linear interpolation is most popular for such slice interpolation due to its computational simplicity. However, as the resulting MIPs depend heavily upon the variance in the interpolated slices (due to the inherent noise), MIPs of linearly interpolated slices suffer from horizontal streaking artifacts when the projection direction is parallel to the axial slices (e.g., sagittal and coronal views). In this paper, we propose an adaptive cubic interpolation technique to minimize these horizontal streaking artifacts caused by the variation of the variance across interpolated slices. The proposed technique, designed for a near-constant variance distribution across interpolated slices, is shown to be superior to linear interpolation, completely eliminating the horizontal streaking artifacts in MIPs of both a simulated data set and a real CT data set.
Three-dimensional visualization of medical images based on X-CT 2-D slice images requires a specialized medical imaging workstation and a trained operator to effectively generate 3-D images of the human anatomy. We therefore propose a system that lets remote users visualize and manipulate 3-D medical images without specialized equipment or access to the full slice data. The 3-D data set is pre-rendered at appropriately chosen multiple viewpoints. The resulting images are compressed using the Moving Picture Experts Group (MPEG) standard and stored. The remote user can easily access the 3-D data for browsing and manipulation over the Internet using our newly developed 3-D viewer. The key features of our system are as follows: (1) the viewpoints for pre-rendering are spaced apart isotropically on the surface of a sphere, (2) the quantization level is determined at encode time to keep the PSNR of all images constant, (3) the encoder is designed to improve compression efficiency by exploiting the similarity between adjacent viewpoint images, and (4) the compressed 3-D data set is streamed over the network so that the user can view the received data while the rest is being streamed.
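One common way to place viewpoints "spaced apart isotropically on the surface of a sphere" is the golden-angle (Fibonacci) spiral. The abstract does not specify the construction used, so the sketch below is only one plausible choice, not the paper's method:

```python
import math

def fibonacci_sphere(n):
    """Place n approximately evenly spaced unit vectors on a sphere
    using the golden-angle spiral: latitudes step uniformly in z while
    longitudes advance by the golden angle."""
    golden = math.pi * (3 - math.sqrt(5))   # golden angle, ~2.39996 rad
    pts = []
    for i in range(n):
        y = 1 - 2 * (i + 0.5) / n           # uniform in z avoids pole pile-up
        r = math.sqrt(1 - y * y)
        theta = golden * i
        pts.append((r * math.cos(theta), y, r * math.sin(theta)))
    return pts

viewpoints = fibonacci_sphere(100)
```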
X-ray computed tomography (CT) scanners, due to their inherent scanning geometry, provide images with non-isotropic voxels. A typical CT scanner generates axial slices with a thickness on the order of a few millimeters and sub-millimeter pixels. The multi-slice images obtained with such a protocol must then be interpolated across the slices for effective and realistic 3-D visualization of the patient anatomy. In this paper we focus on the effects of slice interpolation on maximum intensity projection (MIP) images with the projection direction orthogonal to the z-axis, for instance, for the generation of coronal or sagittal views. Linear interpolation, although simple, generates, due to the inherent noise in the data, MIP images with noise whose variance varies quadratically along the z-axis. As such, the MIP images often suffer from horizontal streaking artifacts, exactly at the positions of the original slices. To combat this situation, we have developed a different interpolation technique based on a digital finite impulse response (FIR) filter. It is shown that this band-limited interpolation based on the FIR filter flattens the change in noise variance along the z-axis, the net effect being elimination or reduction of the horizontal streaking artifacts.
A new block-matching algorithm, based on the Fermat Number Transform (FNT), is presented. It declares the most correlated block to be the best matching block, as opposed to the block with the smallest sum of differences. The proposed number-theoretic approach significantly reduces the computational complexity of the estimation process.
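A 1-D sketch of FNT-based correlation (the paper's block matching is 2-D over image blocks, and its transform details may differ). All arithmetic is done modulo the Fermat prime F4 = 65537, for which 3 is a primitive root, so the correlation is exact: there is no floating-point round-off, provided the true correlation values stay below the modulus.

```python
P = 2 ** 16 + 1  # Fermat prime F4 = 65537; 3 is a primitive root mod P

def ntt(a, w):
    """Radix-2 number-theoretic transform of a (length a power of two),
    with w a primitive len(a)-th root of unity mod P."""
    n = len(a)
    if n == 1:
        return a[:]
    even = ntt(a[0::2], w * w % P)
    odd = ntt(a[1::2], w * w % P)
    out, t = [0] * n, 1
    for k in range(n // 2):
        out[k] = (even[k] + t * odd[k]) % P
        out[k + n // 2] = (even[k] - t * odd[k]) % P
        t = t * w % P
    return out

def fnt_correlate(x, h):
    """Cyclic cross-correlation r[k] = sum_i x[(i+k) mod n] * h[i],
    computed as a convolution with the index-reversed template via
    forward NTTs, pointwise products, and an inverse NTT."""
    n = len(x)                              # power of two dividing P - 1
    w = pow(3, (P - 1) // n, P)             # primitive n-th root of unity
    hrev = [h[0]] + h[:0:-1]                # h[(-j) mod n]
    y = [a * b % P for a, b in zip(ntt(x, w), ntt(hrev, w))]
    inv_w, inv_n = pow(w, P - 2, P), pow(n, P - 2, P)
    return [v * inv_n % P for v in ntt(y, inv_w)]
```

The lag with the largest correlation value plays the role of the best-matching block offset.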
We have recently proposed a new dequantization scheme for DCT-based transform coding built on regularization principles. The new approach sharply reduces blocking artifacts in decoded images, and its performance has been evaluated against standard JPEG, MPEG, and H.263+ in terms of the peak signal-to-noise ratio (PSNR) and our own blockiness measure (BM). Basically, the proposed dequantizer maps the received data to within the range +/- (quantizer spacing / 2), so that the final decompressed image is 'smooth' in the sense of minimizing a cost functional that includes a stabilizing term weighted by a regularization parameter. In this paper, we focus on several important aspects of this regularized dequantizer, namely the selection of the regularization parameter and the convergence of the dequantization algorithm.
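A toy 1-D version of the principle can be written as projected gradient descent: minimize a smoothness functional while keeping each sample within +/- (quantizer spacing)/2 of its received value. This sketches the constrained-smoothing idea only, not the paper's DCT-domain algorithm or its parameter-selection analysis; all names and step sizes are illustrative.

```python
def regularized_dequantize(q, delta, n_iter=500, step=0.1):
    """Find the smoothest 1-D signal (minimal sum of squared first
    differences) whose samples stay within +/- delta/2 of the received
    quantized values q, via projected gradient descent."""
    x = list(q)
    for _ in range(n_iter):
        g = [0.0] * len(x)                 # gradient of sum (x[i+1]-x[i])^2
        for i in range(len(x) - 1):
            d = x[i + 1] - x[i]
            g[i] -= 2 * d
            g[i + 1] += 2 * d
        # gradient step, then project back into each quantization interval
        x = [min(qi + delta / 2, max(qi - delta / 2, xi - step * gi))
             for xi, qi, gi in zip(x, q, g)]
    return x

# Two bin centres 0.0 and 2.0 with spacing 1.0: the smoothest feasible
# signal pushes each sample to its interval edge nearest the other.
x = regularized_dequantize([0.0, 2.0], 1.0)
```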
With the abstraction of digital video as the corresponding binary video (a process which, upon subjective experimentation, seems to preserve the intelligibility of video content), we can pursue a precise and analytic approach to digital video storage and retrieval algorithm design based upon geometrical and morphological intuition. The foremost and most tangible general benefit of such abstraction, however, is the immediate reduction of both the data and computational complexities involved in implementing various algorithms and databases. The general paradigm presented may be utilized to address all issues pertaining to video library construction, including visualization, optimum feedback query generation, and object recognition. However, the primary focus of this paper is the detection of fast and gradual scene changes, such as dissolves and fades, and various special effects, such as wipes. Upon simulation, we observed that we can achieve performances comparable to those of others with drastic reductions in both storage and computational complexities. The conversion from grayscale to binary videos can be performed directly (with minimal additional computation) in the compressed domain by thresholding the DCT DC coefficients themselves, or by using the contour information attached to MPEG-4 formats. The algorithms presented herein are ideally suited for performing fast (on-the-fly) determinations of scene change, object recognition, and/or tracking, as well as other, more intelligent tasks traditionally requiring heavy computational and/or storage complexities. The fast determinations may then be used on their own merit, or in conjunction with, or as a complement to, other higher-layer information in the future.
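The compressed-domain binarization and the cheap frame comparison it enables can be sketched as follows; the input is a hypothetical grid of DCT DC coefficients (one per 8x8 block), and the threshold and distance measure are illustrative choices, not the paper's exact ones:

```python
def binarize_dc(dc_frame, threshold=128):
    """Abstract a frame to binary by thresholding its DCT DC
    coefficients, directly in the compressed domain."""
    return [[1 if v >= threshold else 0 for v in row] for row in dc_frame]

def binary_frame_distance(a, b):
    """Fraction of differing binary pixels between two binary frames;
    a spike in this cheap measure suggests an abrupt scene change."""
    total = sum(len(row) for row in a)
    diff = sum(x != y for ra, rb in zip(a, b) for x, y in zip(ra, rb))
    return diff / total
```

Because the frames are binary, the distance is a pure bit-count, which is where the drastic storage and computation savings come from.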
Currently existing shot change detection algorithms detect abrupt changes fairly well. The more challenging task is detecting gradual changes, including fades, dissolves, and wipes, as these are often missed or falsely detected. In this paper, we focus on the detection of wipes. The proposed algorithm begins by processing the visual rhythm, a portion of the DC image sequence. The visual rhythm is a single image, a sub-sampled version of the full video in which the sampling is performed in a predetermined and systematic fashion. It contains distinctive patterns or visual features for many different types of video effects, each of which manifests itself differently on the visual rhythm. In particular, wipes appear as curves that run from the top to the bottom of the visual rhythm. Thus, using the visual rhythm, it becomes possible to detect wipes automatically, simply by determining various lines and curves on the visual rhythm.
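A minimal construction of a visual rhythm, assuming diagonal sampling as the predetermined, systematic sub-sampling (one common choice; the paper allows other sampling paths). Each frame contributes one column, so effects that sweep across the frame trace curves across the resulting image:

```python
def visual_rhythm(frames):
    """Build a visual rhythm image from square frames: sample each
    frame along its main diagonal and stack the samples as consecutive
    columns (row = diagonal position, column = time)."""
    size = len(frames[0])
    return [[f[i][i] for f in frames] for i in range(size)]

# Two 2x2 frames -> a 2x2 visual rhythm (2 diagonal samples x 2 instants).
vr = visual_rhythm([[[1, 2], [3, 4]], [[5, 6], [7, 8]]])
```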
Most image analysis/understanding applications require accurate computation of camera motion parameters. However, in multimedia applications, particularly in video parsing, the exact camera motion parameters such as the panning and/or zooming rates are not needed. The detection, i.e., a binary decision, of camera motion is all that is required to avoid declaring a false scene change. As camera motions can induce false scene changes for video parsing algorithms, we propose a fast algorithm to detect such camera motions: camera zoom and pan. As the algorithm is only expected to produce a binary decision, without the exact panning/zooming rates, it runs on a reduced data set, namely the projection data. The algorithm begins with a central portion of the image and computes the projection data (the line integrals along the x- or y-axis) to turn the 2D image data into 1D data. This projected 1D data is further processed via correlation to detect camera zoom and pan. Working with projection data saves processing time by an order of magnitude since, for instance, a 2D correlation takes N^2 multiplies per point whereas a 1D correlation takes only N multiplies per point. The efficacy of the proposed algorithm is tested on a number of image sequences, and the algorithm is shown to be successful in detecting camera motions. The proposed algorithm is expected to be beneficial for video parsers working with Motion-JPEG streams, where motion vectors are not available.
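A sketch of the projection-plus-alignment idea for pan detection: collapse each frame to a 1-D column projection, then find the horizontal shift that best aligns the two projections. The mismatch measure (mean absolute difference, rather than correlation) and the thresholds are our illustrative choices:

```python
def detect_pan(frame_a, frame_b, max_shift=3, threshold=1):
    """Binary pan decision from 1-D projections: a consistent nonzero
    best-aligning shift between the column projections of consecutive
    frames is taken as evidence of a horizontal pan."""
    def col_projection(f):                       # sum along rows -> 1D data
        return [sum(row[c] for row in f) for c in range(len(f[0]))]
    pa, pb = col_projection(frame_a), col_projection(frame_b)
    best_shift, best_err = 0, float('inf')
    for s in range(-max_shift, max_shift + 1):
        pairs = [(pa[i], pb[i + s]) for i in range(len(pa))
                 if 0 <= i + s < len(pb)]
        err = sum(abs(a - b) for a, b in pairs) / len(pairs)
        if err < best_err:
            best_err, best_shift = err, s
    return abs(best_shift) >= threshold, best_shift
```

Aligning 1-D projections instead of 2-D frames is exactly where the order-of-magnitude saving comes from.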
In a number of video compression standards, the motion vectors are generally obtained by the well-known block matching algorithm (BMA), and the error image is encoded along with the computed motion vectors. Unfortunately, this widely used approach generates artificial block boundary discontinuities, called blocky artifacts, between the blocks. Since the blocky artifacts are caused by synthesizing the predicted frame using one constant motion vector per block, we propose an algorithm that interpolates the motion vectors before the construction of the predicted image. Naturally, using spatially smooth motion vectors completely eliminates the blocky artifacts. However, we can then no longer use the motion vectors as provided by the BMA; the optimum motion vectors must minimize the norm of the error image. The proposed algorithm computes the optimum motion vectors with the interpolation process built into the algorithm. To obtain spatially smooth motion vectors, we use a band-limited interpolation, and thus we refer to our algorithm as band-limited motion compensation (BLMC). Our simulations indicate that BLMC completely eliminates the blocky artifacts, as expected, and in addition provides a higher peak signal-to-noise ratio than both the traditional BMA-based motion compensation (BMC) and the overlapped BMC.
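The core idea, one motion value per block interpolated to a smooth per-pixel field, can be sketched in 1-D. Note the paper uses band-limited interpolation; plain linear interpolation between block centres is used below only for brevity, and all names are our own:

```python
def interpolate_motion_field(block_mv, block_size):
    """Expand one (1-D) motion value per block into a per-pixel field
    by interpolating linearly between block centres, so the predictor
    no longer jumps at block boundaries."""
    centers = [i * block_size + block_size / 2 for i in range(len(block_mv))]
    field = []
    for x in range(len(block_mv) * block_size):
        if x <= centers[0]:                 # clamp before the first centre
            field.append(block_mv[0])
        elif x >= centers[-1]:              # clamp after the last centre
            field.append(block_mv[-1])
        else:
            j = next(k for k in range(len(centers) - 1)
                     if centers[k] <= x < centers[k + 1])
            t = (x - centers[j]) / (centers[j + 1] - centers[j])
            field.append((1 - t) * block_mv[j] + t * block_mv[j + 1])
    return field

# Two blocks of 4 pixels with motion 0 and 4: the field ramps smoothly
# across the block boundary instead of jumping from 0 to 4.
field = interpolate_motion_field([0.0, 4.0], 4)
```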
The automatic video parser, a necessary tool for the development and maintenance of a video library, must accurately detect video scene changes so that the resulting video clips can be indexed in some fashion and stored in a video database. With currently existing algorithms, abrupt scene changes are detected fairly well; however, gradual scene changes, including fade-ins, fade-outs, and dissolves, are often missed. In this paper, we propose a new gradual scene change detection algorithm, focusing on fade-ins, fade-outs, and dissolves. The proposed algorithm is based on the chromatic video edit model. The model indicates that, for sequences without motion, the second partial derivative with respect to time is zero during fade-ins, fade-outs, and dissolves. However, it is also zero for static scenes. Thus, the proposed algorithm computes both the first partial derivative (to disregard static scenes) and the second, and if the norm of the second derivative is 'small' relative to the norm of the first derivative, the algorithm declares a gradual scene change. The efficacy of our algorithm is demonstrated using a number of video clips, and some performance comparisons are made with other existing approaches.
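The edit-model test reduces to comparing first and second temporal differences over consecutive frames. A minimal sketch with illustrative thresholds (the paper's norms and decision rule may differ):

```python
def is_gradual_transition(frames, ratio_thresh=0.1, motion_thresh=1e-6):
    """Edit-model test on three consecutive frames (flat pixel lists):
    during an ideal fade or dissolve, intensities change linearly in
    time, so the second temporal difference is ~0 while the first is not."""
    f0, f1, f2 = frames
    d1 = [b - a for a, b in zip(f0, f1)]                  # first difference
    d2 = [c - 2 * b + a for a, b, c in zip(f0, f1, f2)]   # second difference
    n1 = sum(abs(v) for v in d1)
    n2 = sum(abs(v) for v in d2)
    if n1 < motion_thresh:          # static scene: first derivative ~0
        return False
    return n2 / n1 < ratio_thresh   # linear ramp: second << first
```

A linear fade (e.g. intensities 10 -> 5 -> 0) passes the test; a static scene fails on the first-difference check, and an abrupt cut fails on the ratio.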
In this paper, we present some fundamental theoretical results pertaining to the question of how many randomly selected labelled example points it takes to reconstruct a set in Euclidean space. Drawing on results and concepts from mathematical morphology and learnability theory, we pursue a set-theoretic approach and demonstrate some provable performance guarantees for Euclidean set reconstruction from stochastic samples. In particular, we demonstrate a stochastic version of the Nyquist Sampling Theorem: that, under weak assumptions on the situation under consideration, the number of randomly drawn example points needed to reconstruct the target set is at most polynomial in the performance parameters and in the complexity of the target set, as loosely captured by its size, dimension, and surface area. Utilizing only rigorous techniques, we can similarly establish many significant attributes, such as those relating to robustness, cumulativeness, and ease of implementation, pertaining to smoothing over labelled example points. In this paper, we formulate and demonstrate a certain fundamental well-behaving aspect of smoothing.
We propose an optimal radar pulse compression technique and evaluate its performance in the presence of Doppler shift. Traditional pulse compression increases the signal strength by transmitting a Barker-coded long pulse; the received signal is then processed by appropriate correlation processing. This Barker-code pulse compression enhances the detection sensitivity while maintaining the range resolution of a single chip of the Barker-coded long pulse. Unfortunately, the technique suffers from range sidelobes, which can mask weak targets in the vicinity of larger targets. Our proposed optimal algorithm completely eliminates the sidelobes at the cost of additional processing.
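The sidelobe behaviour that motivates the work is easy to exhibit: the matched-filter (autocorrelation) output for the length-13 Barker code has a mainlobe of 13 and sidelobes of magnitude at most 1 — the Barker property — and it is exactly those +/-1 sidelobes that can mask a weak target near a strong one. This sketch shows the problem only, not the proposed sidelobe-free algorithm.

```python
BARKER_13 = [1, 1, 1, 1, 1, -1, -1, 1, 1, -1, 1, -1, 1]

def matched_filter(code):
    """Aperiodic autocorrelation of a binary code: the compressed pulse
    seen at the matched-filter output for a point target (zero Doppler)."""
    n = len(code)
    return [sum(code[i] * code[i + k] for i in range(n - k))
            for k in range(n)]

acf = matched_filter(BARKER_13)   # mainlobe acf[0] = 13, sidelobes |.| <= 1
```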