The inspection of a patient's data for diagnostics, therapy planning, or therapy guidance involves an increasing number
of 3D data sets, e.g. acquired with different imaging modalities, with different scanner settings, or at different times. To
enable viewing of the data in one consistent anatomical context, fused interactive renderings of multiple 3D data sets are
desirable. However, interactive fused rendering of typical medical data sets on standard computing hardware remains a challenge.
In this paper we present a method for the fused rendering of multiple 3D data sets. By introducing local rendering functions, i.e.
functions adapted to the complexity of the visible data contained in the different regions of a scene, we
ensure that the overall performance of fused rendering of multiple data sets depends on the actual amount of visible
data. This is in contrast to other approaches, where the performance depends mainly on the number of rendered data sets.
We integrate the method into a streaming rendering architecture with brick-based data representations of the volume
data. This enables efficient handling of data sets that do not fit into the graphics board memory, as well as good utilization of
the texture caches. Furthermore, transfer and rendering of volume data that does not contribute to the final image can be
avoided. We illustrate the benefits of our method by experiments with clinical data.
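The idea of local rendering functions can be sketched as follows: per brick, a precomputed value range is tested against each data set's transfer function, and a rendering function specialized to the subset of data sets actually visible there is selected. This is only an illustrative sketch under a simplified 1-D brick model; the class, function names, and the window-based transfer-function test are assumptions, not the paper's implementation.

```python
# Illustrative sketch of per-brick "local rendering functions".
# All names and the 1-D brick model are hypothetical, for exposition only.

class Volume:
    def __init__(self, bricks):
        # bricks: list of lists of scalar samples, one list per brick
        self.bricks = bricks
        # Precomputed per-brick (min, max), as used for visibility tests
        self.ranges = [(min(b), max(b)) for b in bricks]

def visible(tf, lo, hi):
    """True if transfer function tf (modeled as a visible value window
    (v_lo, v_hi)) maps any value in [lo, hi] to a nonzero opacity."""
    return not (hi < tf[0] or lo > tf[1])

def render_brick(i, volumes, tfs):
    # Which data sets contribute visible data in brick i?
    vis = [k for k, v in enumerate(volumes) if visible(tfs[k], *v.ranges[i])]
    if not vis:
        return "skip"               # nothing visible: no transfer, no rendering
    if len(vis) == 1:
        return f"single({vis[0]})"  # cheap single-volume rendering function
    return f"fused({vis})"          # full multi-volume fusion only where needed

# Two overlapping data sets; only the middle brick needs true fusion.
v0 = Volume([[0, 10], [50, 90], [0, 5]])
v1 = Volume([[0, 2],  [60, 80], [70, 99]])
tfs = [(40, 100), (55, 100)]        # visible value windows per data set
plan = [render_brick(i, [v0, v1], tfs) for i in range(3)]
# plan -> ["skip", "fused([0, 1])", "single(1)"]
```

The cost of each brick thus scales with the number of data sets visible in it, rather than with the total number of rendered data sets.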
Modern multi-slice CT (MSCT) scanners allow acquisitions of 3D data sets covering the complete heart at different phases of the cardiac cycle. This enables the physician to non-invasively study the dynamic behavior of the heart, such as wall motion. To this end, an interactive 4D visualization of the heart in motion is desirable. However, the application of well-known volume rendering algorithms enforces considerable sacrifices in image quality to ensure interactive frame rates, even when accelerated by standard graphics processors (GPUs). The performance of pure CPU implementations of direct volume rendering algorithms is limited, even for moderate volume sizes, by both the number of required computations and the available memory bandwidth. Despite offering higher computational performance and more memory bandwidth, GPU-accelerated implementations cannot provide interactive visualizations of large 4D data sets, since data sets that do not fit into the on-board graphics memory are often not handled efficiently. In this paper we present a software architecture for GPU-based direct volume rendering algorithms that allows the interactive high-quality visualization of large medical time series data sets. In contrast to other work, our architecture exploits the complete memory hierarchy for high cache and bandwidth efficiency. Additionally, several data-dependent techniques are incorporated to reduce the amount of volume data to be transferred and rendered. None of these techniques sacrifices image quality in order to improve speed. By applying the method to several multi-phase MSCT cardiac data sets we show that we can achieve interactive frame rates on currently available standard PC hardware.
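The memory-hierarchy exploitation can be illustrated with a brick cache: bricks are transferred (disk to main memory to GPU memory) only on a miss, and consecutive frames or timesteps that touch the same bricks reuse the cached copies. This is a minimal sketch, assuming an LRU eviction policy and simulated uploads; the class and method names are illustrative, not the architecture's actual API.

```python
# Hedged sketch of brick caching along the memory hierarchy.
# Uploads are simulated; names are assumptions for illustration.
from collections import OrderedDict

class BrickCache:
    """LRU cache keyed by (timestep, brick_id), evicting the least
    recently used brick when capacity is exceeded."""
    def __init__(self, capacity):
        self.capacity = capacity
        self.store = OrderedDict()
        self.uploads = 0                 # counts simulated expensive transfers

    def fetch(self, key, load):
        if key in self.store:
            self.store.move_to_end(key)  # cache hit: mark as recently used
            return self.store[key]
        data = load(key)                 # miss: transfer the brick
        self.uploads += 1
        self.store[key] = data
        if len(self.store) > self.capacity:
            self.store.popitem(last=False)
        return data

cache = BrickCache(capacity=4)
# Rendering the same timestep twice (e.g. during interactive rotation)
# revisits bricks 0..2; the second pass hits the cache entirely.
for t, b in [(0, 0), (0, 1), (0, 2), (0, 0), (0, 1), (0, 2)]:
    cache.fetch((t, b), load=lambda k: f"brick{k}")
# cache.uploads -> 3 (six accesses, but only three transfers)
```

Combined with data-dependent culling of bricks that do not contribute to the image, this keeps the transferred data volume close to what is actually visible.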
With the evolution of medical scanners towards higher spatial resolutions, the sizes of image data sets are increasing rapidly. To benefit from the higher resolution in medical applications such as 3D angiography, enabling more efficient and precise diagnosis, high-performance visualization is essential. However, to ensure that the performance of a volume rendering algorithm scales with the performance of future computer architectures, technology trends need to be considered. The design of such scalable volume rendering algorithms remains challenging. One of the major trends in the development of computer architectures is the wider use of cache memory hierarchies to bridge the growing gap between the rapidly evolving processing power and the more slowly evolving memory access speed. In this paper we propose ways to exploit the standard PC’s cache memories supporting the main processors (CPUs) and the graphics hardware (graphics processing unit, GPU), respectively, for computing Maximum Intensity Projections (MIPs). To this end, we describe a generic and flexible way to improve the cache efficiency of software ray casting algorithms and show, by means of cache simulations, that it enables cache miss rates close to the theoretical optimum. For GPU-based rendering we propose a similar, brick-based technique to optimize the utilization of on-board caches and the transfer of data to the GPU on-board memory. All algorithms produce images of identical quality, which enables a fair performance comparison of their implementations without trading quality for speed. Our comparison indicates that the proposed methods are superior, in particular for large data sets.
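A brick-based MIP traversal also enables a data-dependent optimization that preserves image quality: a brick whose precomputed maximum cannot exceed the running maximum along a ray need not be visited at all. The following is a sketch under a simplified 1-D ray model; the function name and the per-brick maxima bookkeeping are illustrative assumptions, not the paper's exact algorithm.

```python
# Illustrative sketch: MIP along one ray, processed brick by brick.
# Bricks whose precomputed maximum cannot raise the running maximum
# are skipped without touching their samples (and without any
# loss of image quality, since the MIP result is unchanged).

def mip_ray(bricks):
    """bricks: list of sample lists along the ray, front to back."""
    brick_max = [max(b) for b in bricks]   # precomputed per-brick maxima
    best, visited = 0, 0
    for b, m in zip(bricks, brick_max):
        if m <= best:
            continue                       # brick cannot change the MIP: skip
        visited += 1
        for s in b:                        # samples touched only when needed
            best = max(best, s)
    return best, visited

best, visited = mip_ray([[10, 90, 40], [30, 20], [95, 60], [50, 80]])
# best -> 95, visited -> 2 (two of four bricks skipped)
```

Because skipped bricks are never loaded, this interacts well with the brick-based cache layout: fewer bricks visited means fewer cache lines and transfers consumed per frame.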