Cardiac perfusion magnetic resonance imaging (MRI) has proven clinical significance in the diagnosis of heart disease.
However, analysis of perfusion data is time-consuming. Automatic detection of anatomic landmarks
and key frames from perfusion MR sequences helps to anchor cardiac structures and supports functional analysis of
the heart, leading toward fully automated perfusion analysis. Learning-based object detection methods have
demonstrated that they can handle large variations in the object by exploring a local region around it, i.e., its context.
Conventional 2D approaches take only spatial context into account, although the temporal signals in perfusion data
provide a strong cue for anchoring. We propose a joint context model that encodes both spatial and temporal evidence. In
addition, our spatial context is constructed not only from the landmark of interest, but also from correlated
landmarks in the neighboring anatomies. A discriminative model is learned through a probabilistic
boosting tree. A marginal space learning strategy is applied to efficiently learn and search in a high-dimensional
parameter space. A fully automatic system is developed to simultaneously detect anatomic landmarks and key
frames in both the right ventricle (RV) and the left ventricle (LV) from perfusion sequences. The proposed approach
was evaluated on a database of 373 cardiac perfusion MRI sequences from 77 patients. Experimental results of a
4-fold cross validation show that the proposed joint spatial-temporal approach detects landmarks more accurately
than the 2D approach based on spatial context only. The key-frame identification results are promising.
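As a rough illustration of the marginal space learning idea mentioned above, the following Python sketch searches a toy three-parameter space stage by stage instead of exhaustively. The parameter layout (x, y, frame index), the hand-written scoring functions, and the names marginal_space_search, make_scorer, TARGET, and keep are assumptions made for this example only. In the described system the scores would come from learned classifiers (the probabilistic boosting tree), which are not reproduced here.

```python
def marginal_space_search(stage_scorers, stage_ranges, keep=5):
    """Search a parameter space one parameter at a time.

    stage_scorers[i] scores a partial hypothesis (a tuple of length i + 1).
    stage_ranges[i] lists the candidate values for parameter i.
    Only the `keep` best partial hypotheses survive each stage.
    Returns the best full hypothesis and the number of hypotheses scored.
    """
    candidates = [()]
    evaluated = 0
    for scorer, values in zip(stage_scorers, stage_ranges):
        # Extend every surviving candidate by one more parameter.
        extended = [c + (v,) for c in candidates for v in values]
        evaluated += len(extended)
        # Rank by the stage-specific score and prune.
        extended.sort(key=scorer, reverse=True)
        candidates = extended[:keep]
    return candidates[0], evaluated


# Hypothetical ground truth: landmark at (x=12, y=7), key frame t=3.
TARGET = (12, 7, 3)


def make_scorer(n_params):
    """Toy stand-in for a trained classifier over the first n_params parameters."""
    def score(hypothesis):
        return -sum((h - t) ** 2 for h, t in zip(hypothesis[:n_params], TARGET))
    return score


scorers = [make_scorer(1), make_scorer(2), make_scorer(3)]
ranges = [range(32), range(32), range(20)]

best, evaluated = marginal_space_search(scorers, ranges)
exhaustive = 32 * 32 * 20  # hypotheses an exhaustive search would score

print(best, evaluated, exhaustive)
```

With these toy scorers the staged search scores 292 hypotheses instead of 20480 and still returns the target. That guarantee holds here only because the toy score is separable; with real classifiers, pruning can discard the true optimum, so the number of candidates kept per stage is a trade-off between speed and robustness.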
Non-rigid multi-modal volume registration is computationally intensive due to its high-dimensional parameter
space, with typical CPU computation times of several minutes. Medical imaging applications that use registration,
however, demand ever faster implementations for several purposes: matching the data acquisition speed,
providing smooth user interaction and steering for quality control, and performing population registration involving
multiple datasets. Current GPUs offer an opportunity to boost the registration speed through high
computational power at low cost. In previous work, we presented a GPU implementation of non-rigid
multi-modal volume registration that was 6 to 8 times faster than a software implementation. In this paper, we
extend this work by describing how new features of DX10-compatible GPUs and additional optimization
strategies can be employed to further improve the performance of the algorithm. We compared our optimized
version with the previous version on the same GPU and observed a speedup factor of 3.6. Compared with
the software implementation, we achieve a speedup factor of up to 44.
Non-rigid multi-modal registration of images and volumes is becoming increasingly necessary in many medical settings.
While efficient registration algorithms have been published, their speed remains a problem in
clinical applications. Harnessing the computational power of the graphics processing unit (GPU) for general-purpose
computations has become increasingly popular as a way to speed up algorithms further, but the algorithms have
to be adapted to the data-parallel, streaming model of the GPU. This paper describes the implementation of
a non-rigid, multi-modal registration using mutual information and the Kullback-Leibler divergence between
observed and learned joint intensity distributions. The entire registration process is implemented on the GPU,
including a GPU-friendly computation of two-dimensional histograms using vertex texture fetches as well as an
implementation of recursive Gaussian filtering on the GPU. Since the computation is performed on the GPU,
interactive visualization of the registration process can be done without bus transfer between main memory
and video memory. This allows the user to observe the registration process and to evaluate the result more
easily. Two hybrid approaches that distribute the computation between the GPU and the CPU are discussed. The first
approach uses the CPU for lower resolutions and the GPU for higher resolutions. The second approach uses the
GPU to compute a first approximation to the registration, which is then used as the starting point for registration on the
CPU in double precision. The results of the CPU implementation are compared to the different GPU-based approaches
in terms of both speed and image quality. The GPU performs up to 5 times faster per iteration
than the CPU implementation.
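To make the similarity terms named above more concrete, the following Python sketch computes a normalized two-dimensional joint intensity histogram, the mutual information derived from it, and the Kullback-Leibler divergence between an observed and a reference ("learned") joint distribution. It is a plain CPU illustration under stated assumptions, not the GPU implementation described here: the function names, the bin count, the intensity range, and the toy intensity lists are hypothetical, and the way the two terms are weighted in the actual registration cost is not specified in this text.

```python
import math


def joint_histogram(a, b, bins=4, max_value=256):
    """Normalized 2D histogram of corresponding intensities in a and b."""
    hist = [[0.0] * bins for _ in range(bins)]
    width = max_value / bins
    for ia, ib in zip(a, b):
        row = min(int(ia / width), bins - 1)
        col = min(int(ib / width), bins - 1)
        hist[row][col] += 1.0
    total = float(len(a))
    return [[v / total for v in row] for row in hist]


def mutual_information(p):
    """Mutual information (in nats) of a joint distribution p[i][j]."""
    pa = [sum(row) for row in p]        # marginal of the first image
    pb = [sum(col) for col in zip(*p)]  # marginal of the second image
    return sum(
        pij * math.log(pij / (pa[i] * pb[j]))
        for i, row in enumerate(p)
        for j, pij in enumerate(row)
        if pij > 0.0
    )


def kl_divergence(p_obs, p_learned, eps=1e-12):
    """KL(p_obs || p_learned) in nats; eps guards against empty learned bins."""
    return sum(
        po * math.log(po / max(pl, eps))
        for row_o, row_l in zip(p_obs, p_learned)
        for po, pl in zip(row_o, row_l)
        if po > 0.0
    )


# Toy "images" as flat intensity lists (hypothetical values).
fixed = [10, 70, 130, 200] * 2
moving_aligned = [255 - v for v in fixed]  # inverted contrast, fully dependent
moving_misaligned = [245] * 4 + [55] * 4   # statistically independent of fixed

p_aligned = joint_histogram(fixed, moving_aligned)
p_misaligned = joint_histogram(fixed, moving_misaligned)

mi_aligned = mutual_information(p_aligned)
mi_misaligned = mutual_information(p_misaligned)

# Treat the aligned histogram as the "learned" reference distribution.
kl_same = kl_divergence(p_aligned, p_aligned)
kl_off = kl_divergence(p_misaligned, p_aligned)

print(mi_aligned, mi_misaligned, kl_same, kl_off)
```

In this toy setup the aligned pair reaches the maximum mutual information for four equally likely bins (log 4 nats) even though the contrast is inverted, which is why such measures suit multi-modal data, while the misaligned pair yields zero. The divergence to the reference distribution is zero for the aligned pair and large for the misaligned one.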