Visualization methods are very important in biomedical imaging. As a technology for understanding living systems, biomedical imaging has the unique advantage of providing the most intuitive information in the form of images, and this advantage can be greatly enhanced by choosing an appropriate visualization method. The situation is more complicated for volumetric data. Volume data have the advantage of containing 3D spatial information, but the data themselves cannot directly convey their value: because images are always displayed in 2D space, visualization is the key step that realizes the value of volume data. However, rendering 3D data requires complicated visualization algorithms and imposes a high computational burden, so specialized algorithms and computational optimization are important issues for volume data. Photoacoustic imaging is a unique imaging modality that can visualize the optical properties of deep tissue. Because the color of an organism is mainly determined by its light-absorbing components, photoacoustic data can provide color information that is close to real tissue color. In this research, we developed realistic tissue visualization using acoustic-resolution photoacoustic volume data. To achieve realistic visualization, we designed a specialized color transfer function that depends on the depth of the tissue below the skin. We used a direct ray-casting method and processed color while computing the shading parameters. In the rendering results, we succeeded in obtaining realistic tissue texture from the photoacoustic data: rays reflected at the surface were visualized in white, while light returned from deep tissue was visualized in red, like real skin. We also implemented the algorithm in CUDA within an OpenGL environment for real-time interactive imaging.
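A depth-dependent color transfer function of the kind described above can be sketched as follows. This is a minimal illustration, not the paper's implementation (which runs in CUDA): the linear white-to-red ramp and the 10 mm depth scale are assumptions chosen for the example.

```python
def depth_color(intensity, depth_mm, max_depth_mm=10.0):
    """Map a photoacoustic sample to an RGB color that shifts from white
    (superficial reflections) toward red (deep tissue), mimicking how skin
    color arises from depth-dependent light absorption. The linear ramp and
    the 10 mm cutoff are illustrative choices, not values from the paper."""
    t = min(max(depth_mm / max_depth_mm, 0.0), 1.0)  # 0 at the surface, 1 at depth
    r = intensity                 # red channel kept at full strength
    g = intensity * (1.0 - t)     # green and blue fade with depth,
    b = intensity * (1.0 - t)     # leaving a reddish hue for deep samples
    return (r, g, b)
```

In a ray caster, such a function would be evaluated per sample while computing shading, so that surface hits composite toward white and deep returns toward red.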
Three-dimensional (3D) ultrasound data are acquired mostly using a dedicated mechanical probe that houses a 1D array
transducer. This 1D transducer swivels back and forth continuously in the elevation direction (continuous scanning) for
fast acquisition. When 3D ultrasound data are acquired via continuous scanning but the continuous motion of the
transducer is not taken into account during reconstruction, the reconstructed volume contains errors. In this study, we
systematically analyzed this error, which is a complex function of many parameters. The error increases when the
transducer angular speed (ω) increases. Also, it varies depending on the voxel location inside an acquired volume. The
mean error is calculated by averaging the errors at all acquired voxel locations. With a 60-degree volume angle, a 60-degree sector angle, 12-cm scan depth and 48 transmit beams per slice, the mean error is 5.3 mm when ω is 0.6
degrees/ms. When ω is reduced to 0.1 degrees/ms, the mean error decreases to 0.81 mm. We also assessed the impact
of this error on the reconstructed images of a 3D phantom using simulation. At high angular speeds, the error in
reconstructed images becomes noticeable and results in missing parts, geometric distortion and lowered image quality.
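The order of magnitude of this error can be checked with a back-of-the-envelope arc-length model: while one slice of N transmit beams is acquired, the transducer keeps rotating, and a voxel at range r is displaced by roughly r times the angle swept. This simple model is our illustration, not the paper's full analysis (which depends on the voxel location within the volume); the 1540 m/s sound speed and the mid-range voxel at 6 cm are assumptions.

```python
import math

C = 1540.0  # assumed speed of sound in tissue, m/s

def slice_sweep_error(omega_deg_per_ms, depth_m=0.12, beams_per_slice=48, r_m=0.06):
    """Approximate positional error for a voxel at range r_m when the
    transducer rotates continuously during the acquisition of one slice.
    Uses the small-angle approximation chord ~ arc length = r * dtheta."""
    beam_time_s = 2.0 * depth_m / C              # round-trip time of one transmit beam
    slice_time_s = beams_per_slice * beam_time_s # time to acquire one full slice
    dtheta_rad = math.radians(omega_deg_per_ms * slice_time_s * 1e3)
    return r_m * dtheta_rad
```

With the paper's parameters (12 cm depth, 48 beams per slice), this gives a few millimetres at ω = 0.6 degrees/ms for a mid-range voxel and scales linearly down as ω is reduced, consistent with the reported mean errors.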
Multi-volume rendering is a technique that renders and displays multiple volumes simultaneously. In ultrasound
imaging, multi-volume rendering is used for mixing 3D anatomical structures from B-mode imaging with blood flow
information from power Doppler imaging (PDI) or color Doppler imaging (CDI). A variety of multi-volume rendering
techniques have been proposed, such as post fusion (PF), composite fusion (CF) and progressive fusion (PGF). PF,
which combines independently-rendered volumes, is unable to depict a spatial relationship between B-mode images (i.e.,
tissue structure) and PDI/CDI images (i.e., blood flow). The CF technique suffers from color distortion due to
intermixing of hue values. In our recent study, the PGF technique was found to better retain and display tissue structures,
vasculature and their depth relationship. However, the disadvantages of PGF include its high computational cost due to
the requirement of maintaining a separate rendering pipeline for each volume (i.e., B-mode and power/color Doppler)
and potential artifacts of depth-order ambiguity. In this paper, we present a new, flexible, computationally efficient multi-volume
rendering technique, named volume fusion (VF), and compare it with existing techniques. We have evaluated
the VF method and other multi-volume rendering techniques with data acquired from a commercial ultrasound machine
and found that the VF technique preserves the spatial relationships amongst multiple volumes well, without color
distortion, while the same rendering pipeline can be used to support both PDI and CDI volume fusion.
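The appeal of a single-pipeline fusion scheme can be illustrated with a per-sample compositing sketch: at each position along a ray, either the Doppler (flow) sample or the B-mode (tissue) sample is classified, and the result is composited front-to-back as usual. The "flow sample wins above a threshold" rule below is an illustrative stand-in, not the paper's exact VF rule.

```python
def fuse_and_composite(bmode, doppler, flow_colors, flow_threshold=0.2):
    """One-pipeline multi-volume compositing sketch. bmode and doppler are
    per-sample scalar lists along one ray (front sample first); flow_colors
    gives the Doppler color per sample. At each sample the Doppler
    classification is used when flow is present, otherwise the grayscale
    B-mode classification, then standard front-to-back alpha blending is
    applied. Thresholding is an illustrative fusion rule."""
    out_rgb = [0.0, 0.0, 0.0]
    out_alpha = 0.0
    for b, d, flow_rgb in zip(bmode, doppler, flow_colors):
        if d > flow_threshold:          # flow sample wins this position
            rgb, a = flow_rgb, d
        else:                           # tissue sample
            rgb, a = (b, b, b), b
        w = (1.0 - out_alpha) * a       # front-to-back compositing weight
        out_rgb = [c + w * s for c, s in zip(out_rgb, rgb)]
        out_alpha += w
        if out_alpha > 0.99:            # early ray termination
            break
    return out_rgb, out_alpha
```

Because fusion happens per sample inside one traversal, the depth relationship between tissue and flow is resolved naturally, and hue values are never intermixed as in composite fusion.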
Three-dimensional (3D) ultrasound has become a useful tool in cardiac imaging, OB/GYN and other clinical
applications. It enables clinicians to visualize the acquired volume and/or planes that are not easily accessible using 2D
ultrasound, in addition to providing an intuitive understanding of the structural anatomy in three dimensions. One
effective way to examine the acquired volumetric data is by clipping away parts of the volume using cross-sectional cuts
to reveal the underlying anatomy masked by other structures. Ideally, such boundaries should reflect the orientation and
location of the clip surfaces without altering the information content of the original data. Because of the artificial surface
introduced by the clip boundary, shading employed to enhance the surfaces of the object gets modified, resulting in
inconsistent shading and noticeable artifacts in the case of ultrasound data. Consistent shading of clip surfaces was
previously studied for graphics hardware-based volume rendering, and an algorithm was developed and demonstrated in
MRI, CT and non-medical datasets. However, that algorithm cannot be applied directly to fast software-based rendering
approaches such as the shear-warp algorithm. Furthermore, ultrasound data require a different clipping approach due to
their fuzzy nature, lower signal-to-noise ratios, and real-time requirements. In this paper, we present a software-based
volume clipping technique that can effectively and efficiently overcome the difficulties associated with the shading of
the clip boundaries in ultrasound data using shear-warp. Our technique improves the viewer's comprehension of the clip
boundary without altering the original information content within the volume. The method has been implemented on
programmable processors while maintaining the interactive speed in rendering.
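One common way to make shading consistent at an artificial cut surface is to substitute or blend the data gradient with the clip-plane normal near the boundary, so the cut is lit as a coherent surface. The sketch below illustrates that general idea only; the paper's shear-warp-specific technique for fuzzy ultrasound data is not reproduced here, and the band width and linear blend are assumptions.

```python
def shading_normal(voxel_gradient, clip_normal, dist_to_clip, band=1.5):
    """Blend the data gradient toward the clip-plane normal inside a thin
    band (in voxels) around the clip boundary, so shading on the cut
    surface reflects the clip orientation instead of noisy data gradients.
    Band width and linear blending are illustrative choices."""
    if dist_to_clip >= band:
        return voxel_gradient                # far from the cut: use data gradient
    t = dist_to_clip / band                  # 0 on the plane, 1 at the band edge
    blended = [t * g + (1.0 - t) * n
               for g, n in zip(voxel_gradient, clip_normal)]
    norm = sum(c * c for c in blended) ** 0.5 or 1.0
    return [c / norm for c in blended]       # renormalize for lighting
```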
Volume rendering in 3D ultrasound is a challenging task due to the large amount of computation required for real-time rendering. The shear-warp algorithm has been traditionally used for 3D ultrasound rendering for its effectiveness in lowering computing cost. However, this lowered computing cost does come at the price of reduced image quality due to (a) the presence of the final warp interpolation, which smoothes out fine details, and (b) sampling only at discrete slice locations, which introduces aliasing, e.g., staircase artifacts. For 3D ultrasound, we have merged pre-integration with the shear-image-order algorithm to overcome both limitations of shear-warp while still enjoying the computational savings. Pre-integration overcomes the aliasing artifacts while shear-image-order preserves details. We have also developed a technique to integrate shading coefficients into pre-integrated rendering. This pre-integrated shear-image-order algorithm, with slightly higher computation than what is required to support the shear-warp algorithm, improves the quality of the rendered image significantly. In this paper, we discuss the pre-integrated shear-image-order algorithm and present the results of subjective quality evaluation on several data sets. We have also analyzed how this algorithm can be implemented on an advanced digital signal processor (DSP) to achieve real-time performance.
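The core of pre-integration is a precomputed 2D table: entry (f, b) holds the opacity (and, in a full implementation, the color) of a ray segment whose front and back scalar values are f and b, integrated through the transfer function. Because the integral between the endpoints is tabulated rather than point-sampled, high-frequency transfer-function content no longer causes aliasing. The opacity-only table below is a sketch with an assumed unit segment length; it is not the paper's DSP implementation.

```python
def build_preintegration_table(opacity_tf, n, steps=32):
    """Pre-integrated opacity table: entry [f][b] is the opacity of a
    unit-length ray segment whose front/back scalars are f/(n-1) and
    b/(n-1), computed by numerically integrating the opacity transfer
    function opacity_tf (a callable on [0, 1]) along the segment."""
    table = [[0.0] * n for _ in range(n)]
    for f in range(n):
        for b in range(n):
            transparency = 1.0
            for k in range(steps):  # supersample linearly between the endpoints
                s = (f + (b - f) * (k + 0.5) / steps) / (n - 1)
                transparency *= 1.0 - opacity_tf(s) / steps
            table[f][b] = 1.0 - transparency
    return table
```

At render time, each slab between two sample points is resolved with a single table lookup using the scalar values at its front and back faces, instead of evaluating the transfer function at discrete slice locations.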
We have developed a new adaptive clutter rejection technique where an optimum clutter filter is dynamically selected according to the varying clutter characteristics in ultrasound color Doppler imaging. The selection criteria have been established based on the underlying clutter characteristics (i.e., the maximum instantaneous clutter velocity and the clutter power) and the properties of various candidate clutter filters (e.g., projection-initialized infinite impulse response and polynomial regression). We obtained an average improvement of 3.97 dB and 3.27 dB in flow signal-to-clutter-ratio (SCR) compared to the conventional and down-mixing methods, respectively. These preliminary results indicate that the proposed adaptive clutter rejection method could improve the sensitivity and accuracy in flow velocity estimation for ultrasound color Doppler imaging. For a 192 x 256 color Doppler image with an ensemble size of 10, the proposed method takes only 57.2 ms, which is less than the acquisition time. Thus, the proposed method could be implemented in modern ultrasound systems, while providing improved clutter rejection and more accurate velocity estimation in real time.
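One of the candidate filters named above, polynomial regression, can be sketched as follows: the slowly varying clutter in the slow-time ensemble is modeled as a low-order polynomial, fitted by least squares, and subtracted. This is a plain real-valued sketch for clarity (Doppler ensembles are complex in practice), and the run-time filter selection of the adaptive method is not shown.

```python
def polyreg_clutter_filter(ensemble, order=2):
    """Polynomial-regression clutter rejection sketch: least-squares fit a
    polynomial of the given order to the slow-time ensemble (the clutter
    model) and return the residual (the flow signal). Solves the normal
    equations by Gauss-Jordan elimination for self-containment."""
    n = len(ensemble)
    m = order + 1
    # Polynomial basis over normalized slow time.
    X = [[(t / (n - 1)) ** p for p in range(m)] for t in range(n)]
    # Augmented normal equations: (X^T X | X^T y).
    A = [[sum(X[t][i] * X[t][j] for t in range(n)) for j in range(m)]
         + [sum(X[t][i] * ensemble[t] for t in range(n))] for i in range(m)]
    for col in range(m):
        piv = max(range(col, m), key=lambda r: abs(A[r][col]))
        A[col], A[piv] = A[piv], A[col]
        for r in range(m):
            if r != col:
                f = A[r][col] / A[col][col]
                A[r] = [a - f * b for a, b in zip(A[r], A[col])]
    c = [A[i][m] / A[i][i] for i in range(m)]
    fit = [sum(c[p] * X[t][p] for p in range(m)) for t in range(n)]
    return [y - f for y, f in zip(ensemble, fit)]
```

An adaptive scheme would estimate the clutter power and maximum instantaneous clutter velocity per pixel and pick, say, this filter or a projection-initialized IIR filter accordingly.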
3D ultrasound imaging is quickly gaining widespread clinical acceptance as a visualization tool that allows clinicians to obtain unique views not available with traditional 2D ultrasound imaging and an accurate understanding of patient anatomy. The ability to acquire, manipulate and interact with the 3D data in real time is an important feature of 3D ultrasound imaging. Volume rendering is often used to transform the 3D volume into 2D images for visualization. Unlike computed tomography (CT) and magnetic resonance imaging (MRI), volume rendering of 3D ultrasound data creates noisy images in which surfaces cannot be readily discerned due to speckles and low signal-to-noise ratio. The degrading effect of speckles is especially severe when gradient shading is performed to add depth cues to the image. Several researchers have reported that smoothing the pre-rendered volume with a 3D convolution kernel, such as 5x5x5, can significantly improve the image quality, but at the cost of decreased resolution.
In this paper, we have analyzed the reasons for the improvement in image quality with 3D filtering and determined that the improvement is due to two effects. The filtering reduces speckles in the volume data, which leads to (1) more accurate gradient computation and better shading and (2) decreased noise during compositing. We have found that applying a moderate-size smoothing kernel (e.g., 7x7x7) to the volume data before gradient computation combined with some smoothing of the volume data (e.g., with a 3x3x3 lowpass filter) before compositing yielded images with good depth perception and no appreciable loss in resolution. Providing the clinician with the flexibility to control both of these effects (i.e., shading and compositing) independently could improve the visualization of the 3D ultrasound data. Introducing this flexibility into the ultrasound machine requires 3D filtering to be performed twice on the volume data, once before gradient computation and again before compositing. 3D filtering of an ultrasound volume containing millions of voxels requires a large amount of computation, and doing it twice decreases the number of frames that can be visualized per second. To address this, we have developed several techniques to make computation efficient. For example, we have used the moving average method to filter a 128x128x128 volume with a 3x3x3 boxcar kernel in 17 ms on a single MAP processor running at 400 MHz. The same methods reduced the computing time on a Pentium 4 running at 3 GHz from 110 ms to 62 ms. We believe that our proposed method can improve 3D ultrasound visualization without sacrificing resolution and incurring an excessive computing time.
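The moving-average method mentioned above exploits the fact that a boxcar filter is both separable and incrementally updatable: each output follows from the previous one by adding the sample entering the window and subtracting the one leaving, so the cost per sample is constant regardless of kernel size. The 1D pass below is an illustrative sketch (the papers' implementations target the MAP processor and Pentium 4); a 3D boxcar is obtained by applying it along each axis in turn. Edge clamping is an assumed border policy.

```python
def boxcar_1d(data, k):
    """Moving-average (running-sum) boxcar filter along one axis with an
    odd kernel size k. Each output is updated from the previous running
    sum in O(1), independent of k. Borders are handled by clamping to the
    edge samples (an illustrative choice)."""
    n, r = len(data), k // 2
    padded = [data[0]] * r + list(data) + [data[-1]] * r
    s = sum(padded[:k])          # initialize the window sum once
    out = [s / k]
    for i in range(1, n):
        s += padded[i + k - 1] - padded[i - 1]  # slide the window
        out.append(s / k)
    return out
```

Applying this pass three times (along x, y, z) filters a volume with a k x k x k boxcar at roughly 6 operations per voxel, which is what makes filtering twice per frame (before gradient computation and before compositing) affordable.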
Unsharp masking is a widely used image enhancement method in medical imaging, e.g., in computed radiography, digital radiography, and digital mammography. It mainly consists of 3 steps: (1) convolving an input image with a lowpass filter, (2) obtaining a highpass-filtered image by subtracting the lowpass-filtered image from the original image, and (3) adding the weighted highpass-filtered image to the original image. It is computationally expensive, e.g., convolving a 2k x 2k image with a 21 x 21 kernel alone requires about 3.7 billion arithmetic operations. To support this high computational demand for unsharp masking, hardware-based solutions using ASIC, FPGA and FPLD have been developed and used. While they have reasonably met the computing requirement, they suffer from limited flexibility. On the other hand, software solutions using programmable processors are more flexible and can easily change algorithmic parameters, such as filter kernel size, and incorporate new features, but they have not been able to meet the fast computing requirement. Modern programmable mediaprocessors, such as MAP-CA and Texas Instruments TMS320C64x, can meet both fast computing and flexibility requirements due to their high computing power and full programmability. In this paper, we present an efficient implementation of adaptive unsharp masking on a MAP-CA mediaprocessor. For a 2k x 2k 16-bit image, our adaptive unsharp masking operation with a 149 x 149 boxcar kernel takes only 300 ms. This fast unsharp masking not only reduces the overall processing time in imaging modalities, but also allows the operator to adjust the selected parameters interactively for optimal image quality. Our implementation on the MAP-CA can be easily extended to other high-performance mediaprocessors, such as TMS320C64x and Pentium 4.
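The three steps above can be sketched on a 1D signal as follows. Direct convolution with edge clamping is used here for readability; as the abstract notes, product implementations rely on optimized kernels (e.g., boxcar running sums) or hardware for large kernel sizes, and the kernel and gain below are illustrative.

```python
def unsharp_mask_1d(signal, kernel, alpha):
    """Unsharp masking sketch: (1) lowpass-filter the input by direct
    convolution with a normalized kernel, (2) form the highpass signal as
    input minus lowpass, (3) add alpha times the highpass back to the
    input. Borders are handled by clamping (an illustrative choice)."""
    n, k = len(signal), len(kernel)
    r = k // 2
    low = []
    for i in range(n):
        acc = 0.0
        for j in range(k):
            idx = min(max(i + j - r, 0), n - 1)  # clamp at the borders
            acc += kernel[j] * signal[idx]
        low.append(acc)
    # Steps (2) and (3) combined: output = input + alpha * (input - lowpass).
    return [s + alpha * (s - l) for s, l in zip(signal, low)]
```

As expected, flat regions pass through unchanged while edges are steepened (overshoot on the bright side, undershoot on the dark side), which is the visual sharpening effect.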
To meet the computational requirements of mid-range and high-end programmable ultrasound systems, multiple processors are currently required. Algorithms optimized specifically for a single processor-based system may not perform well in a multiprocessor environment. They need to be efficiently remapped on multiple processors to take advantage of the increased computing power while minimizing the interprocessor data transfer and the latency between data acquisition and display. In this paper, we describe a multiprocessor-based implementation of scan conversion, a key processing task in an ultrasound system that geometrically transforms the acquired polar ultrasound data to Cartesian coordinates for display. The single processor-based scan conversion algorithm that was reported previously uses inverse mapping for geometric transformation, where the pixel values in the Cartesian display are determined from data in the polar domain. Inverse mapping requires access to a full frame of pre-scan-converted ultrasound data, which in a multiprocessor system can be located across multiple processors, thus requiring a significant amount of interprocessor data communications. Our modified scan conversion algorithm reduces the data movement by performing inverse-mapped scan conversion locally on the polar-domain data present in each processor's memory. Each processor handles a smaller amount of data, thus reducing the latency. The raster pixels generated by each processor are combined later. Interprocessor synchronization is used to ensure that each processor displays data belonging to the same frame. Data overlapping between processors avoids boundary artifacts between regions that are processed on different processors. Using four Hitachi/Equator Technologies' 300-MHz MAP-CA processors, scan conversion requires 5.6 ms for a 600x420 RGB frame, as compared to 14.6 ms using a single processor, and the latency is reduced by 33.3%. 
We believe that this type of parallel algorithm will facilitate the development and deployment of flexible multiprocessor-based ultrasound and other medical imaging systems.
New high-performance programmable processors, called mediaprocessors, have been emerging since the early 1990s for various digital media applications, such as digital TV, set-top boxes, desktop video conferencing, and digital camcorders. Modern mediaprocessors, e.g., TI's TMS320C64x and Hitachi/Equator Technologies' MAP-CA, can offer high performance by utilizing both instruction-level and data-level parallelism. During this decade, with continued performance improvement and cost reduction, we believe that mediaprocessors will become a preferred choice in designing imaging and video systems due to their flexibility in incorporating new algorithms and applications via programming and a faster time-to-market. In this paper, we will evaluate the suitability of these mediaprocessors for medical imaging. We will review the core routines of several medical imaging modalities, such as ultrasound and DR, and present how these routines can be mapped to mediaprocessors and the resulting performance. We will analyze the architecture of several leading mediaprocessors. By carefully mapping key imaging routines, such as 2D convolution, unsharp masking, and 2D FFT, to the mediaprocessor, we have been able to achieve comparable (if not better) performance to that of traditional hardwired approaches. Thus, we believe that future medical imaging systems will benefit greatly from these advanced mediaprocessors, offering significantly increased flexibility and adaptability, reducing the time-to-market, and improving the cost/performance ratio compared to existing systems while meeting the high computing requirements.
Scan conversion is an important ultrasonic processing stage that maps the acquired polar-coordinate data to Cartesian coordinates for display. This requires computationally expensive square root and arctangent calculations for the geometric transformation. Previously, we developed an algorithm for implementing scan conversion for gray-scale images using pre-computed lookup tables. In a clinical setting, however, interactive changes of scan conversion parameters, e.g., zoom and sector angle, require these tables to be recomputed often. In this paper, we describe a fast lookup table generation algorithm and its implementation on the Hitachi/Equator MAP-CA mediaprocessor architecture. In addition, we have extended the gray-scale scan conversion algorithm to color images, which requires interpolation between angular data. For a 600x420 output image, gray-scale scan conversion takes 12 ms while color scan conversion takes 20.3 ms on a 300 MHz MAP-CA. Interactive parameter changes take 102.5 ms for table regeneration. We believe that this high performance is an important step towards making software-based programmable ultrasound systems using mediaprocessors a reality. Such a system would provide more flexibility and improved cost/performance than existing hardwired solutions.
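The lookup-table idea can be sketched as follows: the expensive square root and arctangent are evaluated once per output pixel when the table is built, and each displayed frame then reduces to table lookups. The apex-at-top geometry, nearest-neighbour mapping, and unit scan depth below are illustrative assumptions, not the paper's exact formulation.

```python
import math

def build_scan_tables(width, height, n_beams, n_samples,
                      sector_deg=60.0, depth=1.0):
    """Precompute, for every Cartesian output pixel, the (beam, sample)
    polar coordinates it maps back to (inverse mapping). The sqrt and
    atan are paid only here; per-frame conversion is pure lookup.
    Geometry (apex at top-centre, unit scan depth) is illustrative."""
    half = math.radians(sector_deg) / 2.0
    table = []
    for y in range(height):
        for x in range(width):
            px = (x - width / 2.0) / (height - 1)  # lateral offset from apex
            py = y / (height - 1)                  # axial distance from apex
            r = math.hypot(px, py)                 # expensive sqrt, done once
            th = math.atan2(px, py)                # expensive atan, done once
            if r <= depth and -half <= th <= half:
                beam = int((th + half) / (2 * half) * (n_beams - 1) + 0.5)
                samp = int(r / depth * (n_samples - 1) + 0.5)
                table.append((beam, samp))
            else:
                table.append(None)                 # pixel outside the sector
    return table

def scan_convert(polar, table):
    """Apply the precomputed table to one frame of polar data."""
    return [0 if t is None else polar[t[0]][t[1]] for t in table]
```

Interactive changes of zoom or sector angle only require rebuilding the table (the 102.5 ms figure above), while steady-state display runs at the lookup rate; color scan conversion additionally interpolates between angular data rather than taking the nearest beam.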
Mathematical morphology has proven to be useful for solving a variety of image processing problems and plays a key role in certain time-critical machine vision applications. The large computational requirement of morphology poses a challenge for microprocessors to support in real time, and hardwired solutions such as ASICs and EPLDs have often been necessary. This paper presents a method to implement binary and gray-scale morphology algorithms efficiently on programmable VLIW mediaprocessors. Efficiency is gained by (1) mapping the algorithms to the mediaprocessor's parallel processing units, (2) avoiding redundant computations by converting the structuring element into a unique lookup table, and (3) minimizing the I/O overhead by using an on-chip programmable DMA controller. Using our approach, a 'C' implementation of gray-scale dilation takes 7.0 ms and binary dilation takes 1.2 ms on a 200 MHz MAP1000 mediaprocessor, which is more than 35 times faster than that reported for general-purpose microprocessors. With performance comparable to ASIC implementations and the flexibility of a programmable processor, real-time image computing with mediaprocessors will become more widely used in machine vision and other imaging applications in the future.
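The flavor of mapping binary morphology onto wide parallel units can be conveyed with a row-packing sketch: each image row is packed into one integer, so a single shift-and-OR processes a whole row per structuring-element offset. This is our illustration of the general technique, not the paper's lookup-table method.

```python
def binary_dilate(image, se_offsets):
    """Binary dilation sketch: output(y, x) is set if input(y-dy, x-dx) is
    set for any (dy, dx) in the structuring element. Rows are packed into
    Python integers so each offset is applied to an entire row with one
    shift and OR, in the spirit of wide SIMD/VLIW processing."""
    h, w = len(image), len(image[0])
    rows = [int("".join("1" if p else "0" for p in row), 2) for row in image]
    out = [0] * h
    for dy, dx in se_offsets:
        for y in range(h):
            sy = y - dy
            if 0 <= sy < h:
                # Positive dx moves content toward higher columns,
                # i.e. toward lower bit positions in this packing.
                shifted = rows[sy] >> dx if dx >= 0 else rows[sy] << -dx
                out[y] |= shifted & ((1 << w) - 1)
    # Unpack the row integers back into a 0/1 matrix.
    return [[(out[y] >> (w - 1 - x)) & 1 for x in range(w)] for y in range(h)]
```

Erosion follows the same pattern with AND in place of OR, and gray-scale dilation replaces the OR with a running maximum.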
Convolution is widely used as an effective tool for enhancing image features, such as points, lines, or edges, and smoothing noise. One major challenge in implementing convolution in real time has been its large computational requirement. For example, convolving a 512 x 512 image with a 7 x 7 kernel requires 50 million operations. Therefore, to achieve the computational performance needed in real-time applications, hardwired solutions with ASICs and/or fixed-function chips with little programmability have been used. The disadvantages associated with hardwired implementations are that they are rigid, unifunctional and not upgradable. Our approach has been programmable convolution, which is flexible, multi-functional, easily upgradable and has a performance comparable to the hardwired implementations. This paper describes an efficient algorithm for convolution, which can be implemented in software on the new generation of VLIW mediaprocessors. These processors can perform multiple multiplication, addition and load/store operations in a single instruction, which can be used effectively in convolution to reduce the execution time. We have implemented this algorithm on a new mediaprocessor called the MAP1000, where it takes 8.6 ms for the convolution of a 512 x 512 image with a 7 x 7 kernel. This performance is 7 times faster than the previously reported software-based convolution on the Texas Instruments TMS320C80 mediaprocessor and is comparable with the hardwired implementations for the same image and kernel size. This algorithm and its implementation on the next-generation programmable mediaprocessor clearly demonstrate the feasibility of software-based convolution.
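For reference, the operation being accelerated is the plain multiply-accumulate loop below (shown with zero padding and assuming a symmetric kernel, so no kernel flip is applied). On a VLIW mediaprocessor, several copies of this inner loop body execute per cycle after unrolling and vectorization; this Python sketch is for exposition only.

```python
def convolve2d(image, kernel):
    """Direct 2D convolution sketch with zero padding at the borders.
    The kernel is assumed symmetric, so correlation and convolution
    coincide and no kernel flip is needed. The innermost
    multiply-accumulate is the hot loop that VLIW hardware parallelizes."""
    ih, iw = len(image), len(image[0])
    kh, kw = len(kernel), len(kernel[0])
    ry, rx = kh // 2, kw // 2
    out = [[0.0] * iw for _ in range(ih)]
    for y in range(ih):
        for x in range(iw):
            acc = 0.0
            for ky in range(kh):
                for kx in range(kw):
                    sy, sx = y + ky - ry, x + kx - rx
                    if 0 <= sy < ih and 0 <= sx < iw:
                        acc += kernel[ky][kx] * image[sy][sx]  # multiply-accumulate
            out[y][x] = acc
    return out
```

Counting two operations per tap, a 512 x 512 image with a 7 x 7 kernel indeed requires on the order of tens of millions of operations per frame, which is why instruction-level parallelism matters here.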
Convolution is a fundamental operation in many image processing algorithms and applications. One such algorithm is unsharp masking, which is widely used in medical imaging. A major component of unsharp masking is the computation of a lowpass-filtered image, e.g., via generalized convolution with a Gaussian filter or via specialized convolution with a boxcar filter. Generalized convolution is computationally expensive, e.g., convolution with a 3 x 3 kernel on a 512 x 512 image takes 1.45 sec on a Sun SPARCstation 20/71. To achieve faster convolution, hardwired solutions with ASICs and/or fixed-function chips with little programmability have traditionally been used. The disadvantages associated with hardwired implementations are that they are rigid, uni-functional and not upgradable. Our approach has been programmable convolution, which is flexible, multi-functional, easily upgradable and has a performance comparable to the hardwired implementations. This paper describes efficient software implementations of both generalized and boxcar convolution on a programmable multimedia processor, the Texas Instruments TMS320C80, also known as the Multimedia Video Processor (MVP). Using the MVP's advanced digital signal processors (ADSPs), instruction-level parallelism and intelligent input/output interface, we have been able to significantly improve the performance of both generalized and boxcar convolution. For a 512 x 512 8-bit image, generalized convolution takes 19.5 ms for a 3 x 3 kernel. While boxcar convolution has similar performance for a 3 x 3 kernel, a performance improvement by a factor of up to 13 has been achieved for large kernels such as 21 x 21. Our implementation of convolution algorithms on a programmable mediaprocessor clearly demonstrates the feasibility of the software-based approach.