1 November 2009 Real-time display on Fourier domain optical coherence tomography system using a graphics processing unit
J. of Biomedical Optics, 14(6), 060506 (2009). doi:10.1117/1.3275463
Fourier domain optical coherence tomography (FD-OCT) requires resampling of spectrally resolved depth information from wavelength to wave number, followed by an inverse Fourier transform. The display rates of OCT images are much slower than the image acquisition rates because of processing speed limitations on most computers. We demonstrate a real-time display of processed OCT images using a linear-in-wave-number (linear-k) spectrometer and a graphics processing unit (GPU). The linear-k spectrometer combines a diffraction grating with 1200 lines/mm and an F2 equilateral prism in the 840-nm spectral region, which removes the need for numerical resampling. The fast Fourier transform (FFT) calculations are accelerated by the GPU, whose many stream processors enable highly parallel processing. A display rate of 27.9 frames/s for processed images (2048-point FFT × 1000 lateral A-scans) is achieved in our OCT system using a line scan CCD camera operated at 27.9 kHz.
Watanabe and Itagaki: Real-time display on Fourier domain optical coherence tomography system using a graphics processing unit

Optical coherence tomography (OCT) is a noninvasive, noncontact imaging modality used to obtain cross-sectional images of tissue structures with high resolution. Broadly, OCT has been classified into two categories: time-domain (TD)-OCT and Fourier-domain (FD)-OCT. Conventional TD-OCT detects the echo time delays of light by measuring the interference signal as a function of time during a depth scan (A-scan) in a reference arm at each position of a lateral probe beam. In FD-OCT, instead of a mechanical A-scan, depth information is retrieved by detecting the interference signal as a function of wavelength. In spectral-domain (SD) OCT, this is achieved with a broadband source and a spectrometer on the detector arm.1 Optical frequency domain imaging (OFDI) uses a swept source and a point detector to acquire the same information.2 The main advantage of these schemes over TD-OCT is a marked increase in sensitivity and imaging speed.3 Spectrometers with a linear array detector and swept sources normally operate at tens of kHz, which allows data acquisition at video rates [30 frames/s (fps)]. FD-OCT images require resampling of the axial data from wavelength (λ) space to wavenumber (k = 2π/λ) space, followed by an inverse Fourier transform. Therefore, although high-speed FD-OCT acquires data faster than video rate, OCT images are often displayed at a much lower rate than they are acquired, because this heavy signal processing must be performed by a computer.
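The resampling step can be illustrated with a minimal sketch. Samples taken at equal wavelength steps are unevenly spaced in k = 2π/λ, so they are interpolated onto a uniform k grid before the Fourier transform. The function name `resample_to_linear_k` and the choice of linear interpolation are illustrative assumptions, not the method of any particular system.

```python
import math

def resample_to_linear_k(wavelengths_um, spectrum):
    """Linearly interpolate a spectrum sampled at increasing wavelengths
    onto a grid that is uniform in wavenumber k = 2*pi/lambda."""
    # k decreases as wavelength increases, so reverse to get ascending k.
    k_src = [2 * math.pi / lam for lam in reversed(wavelengths_um)]
    s_src = list(reversed(spectrum))
    n = len(k_src)
    step = (k_src[-1] - k_src[0]) / (n - 1)
    k_uniform = [k_src[0] + i * step for i in range(n)]

    resampled = []
    j = 0
    for k in k_uniform:
        # Advance to the source interval [k_src[j], k_src[j + 1]] containing k.
        while j < n - 2 and k_src[j + 1] < k:
            j += 1
        t = (k - k_src[j]) / (k_src[j + 1] - k_src[j])
        resampled.append(s_src[j] + t * (s_src[j + 1] - s_src[j]))
    return k_uniform, resampled

# Example: 8 samples spaced evenly in wavelength across 0.80-0.88 um.
wavelengths = [0.80 + 0.08 * i / 7 for i in range(8)]
spectrum = [2 * math.pi / lam for lam in wavelengths]  # test signal equal to k
k_grid, values = resample_to_linear_k(wavelengths, spectrum)
```

Because the test signal is k itself, the interpolated values should reproduce the uniform k grid, which makes the sketch easy to check. A spectrometer that is already linear in k makes this per-A-scan interpolation unnecessary.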

One example of real-time display has been demonstrated by digital signal processing (DSP) hardware using a single field programmable gate array (FPGA) integrated circuit (IC) and a custom electronics board.4 The display frame rate for processed OCT images (1024 axial pixels × 512 lateral A-scans) was 27 fps in the SD-OCT system. This type of equipment is expensive and must be custom built for FPGA technology.

To avoid a numerical k-space resampling prior to the Fourier transform, linear-in-wave-number (linear-k) spectrometers and k-space linear swept sources have been designed.5, 6, 7, 8 A linear-k spectrometer consisting of a grating and an optical glass prism in the 1.3-μm region not only saved computing time but also improved the SNR falloff.5 FD-OCT systems based on linear-k techniques still require high-speed fast Fourier transform (FFT) processing to realize real-time display of OCT images.

Recently, one approach to accelerating numerical calculations has been to use a graphics processing unit (GPU) instead of a central processing unit (CPU). A GPU with many stream processors enables highly parallel processing. The advantages of GPU computing are high-speed computation at low cost and simple programming on the host computer. In the field of optics, GPU techniques have been applied to reconstruct digital holograms.9, 10

In this work, we demonstrated real-time display on a linear-k SD-OCT system using GPU programming. We estimated the optimal combination of a diffraction grating and a prism for the linear-k spectrometer in the 840-nm spectral region. The computing time using the GPU was 6.1 ms for a data size of 2048 FFT points × 1000 lateral A-scans, which was shorter than the frame interval time (35.8 ms) using a line scan camera at 27.9 kHz. A display rate of 27.9 fps for processed images was achieved using a low-cost GPU.
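The real-time condition quoted above follows from simple arithmetic on the stated numbers. The short sketch below recomputes the frame interval and the share of it taken by the GPU; the variable names are illustrative.

```python
LINE_RATE_HZ = 27.9e3      # camera line rate (from the text)
A_SCANS_PER_FRAME = 1000   # lateral A-scans per frame (from the text)
GPU_TIME_MS = 6.1          # reported GPU computing time per frame

# Time needed to acquire one frame, and the resulting frame rate.
frame_interval_ms = A_SCANS_PER_FRAME / LINE_RATE_HZ * 1e3
frame_rate_fps = 1e3 / frame_interval_ms
gpu_share = GPU_TIME_MS / frame_interval_ms

print(f"frame interval: {frame_interval_ms:.1f} ms")    # 35.8 ms
print(f"frame rate: {frame_rate_fps:.1f} fps")          # 27.9 fps
print(f"GPU share of frame interval: {gpu_share:.0%}")  # 17%
```

Processing keeps pace with acquisition as long as the per-frame computing time stays below the frame interval, which is the case here by a wide margin.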

Figure 1 shows a schematic of our SD-OCT system. The output light of a superluminescent diode [SLD-370-HP, Superlum (County Cork, Ireland), center wavelength λ0 = 840.8 nm, full-width at half-maximum spectral width Δλ = 48.7 nm] was split into a sample arm and a reference arm, with the latter terminated by a mirror. A probe at the end of the sample arm delivered light to a sample and received backscattered light from within the sample. Achromatic lenses (f = 100 mm) were inserted in both arms. The predicted lateral resolution was 21.4 μm. The light returned from the two interferometer arms was recombined and directed to a linear-k spectrometer consisting of a diffraction grating (Wasatch Photonics, volume phase holographic grating, 1200 lines/mm) and an optical glass prism. When the incident angle is the blaze angle, $\theta_{\mathrm{in}} = \sin^{-1}(m\lambda_0/2)$, which gives the best diffraction efficiency, the first-order diffraction angle of light at a wavelength λ is

$$\theta_d(\lambda) = \sin^{-1}\left(m\lambda - \sin\theta_{\mathrm{in}}\right),$$

where m is the groove density of the grating. The output angle θout of the prism is


where n(λ) is the refractive index determined by the material. The parameters α and β are the apex angle of the prism and the angle between the grating and the prism, respectively. The light from the prism was focused on a line scan CCD camera (e2v Aviiva SM2, 2048 pixels, 14-μm pixel size, 12-bit resolution) using an achromatic lens (f = 250 mm). The location of the spectral component at the CCD camera is described as

$$x(\lambda) = f \tan\left[\theta_{\mathrm{out}}(\lambda) - \theta_0\right] + x_0,$$

where θ0 is the output angle of the light at the central location x0. The output of the camera was transferred to a personal computer (PC) via a Camera Link board (National Instruments, Austin, Texas, PCIe-1427, 16-bit resolution). The sampled data were transferred to a GPU on a graphics card [Nvidia (Santa Clara, California) GeForce GTX 280, processor clock of 1296 MHz, memory clock of 2214 MHz, 240 stream processors, and 1 Gbyte of memory]. We used Nvidia’s compute unified device architecture (CUDA),11 which allows the GPU to be programmed entirely in a C language environment. We developed software that included image acquisition, GPU programming, and a graphical user interface in Microsoft Visual C++ 2008 Express Edition.
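For orientation, the prism output angle can be written from two applications of Snell's law. The form below is a textbook sketch under an assumed geometry, in which the diffracted ray meets the prism entrance face at an incidence angle θ1 = θd + β. The sign convention for β and the intermediate symbols θ1, θ1′, and θ2′ are assumptions introduced here and may differ from the authors' definition.

```latex
% Assumed geometry: incidence angle on the entrance face is theta_d + beta.
\begin{align}
\theta_1 &= \theta_d(\lambda) + \beta, \\
\sin\theta_1 &= n(\lambda)\,\sin\theta_1' \quad \text{(refraction at the entrance face)}, \\
\theta_2' &= \alpha - \theta_1' \quad \text{(internal angles sum to the apex angle)}, \\
\theta_{\mathrm{out}}(\lambda) &= \sin^{-1}\!\left[ n(\lambda)\,\sin\theta_2' \right]
  = \sin^{-1}\!\left\{ n(\lambda)\,\sin\!\left[ \alpha - \sin^{-1}\!\left(
    \frac{\sin\left(\theta_d(\lambda) + \beta\right)}{n(\lambda)} \right) \right] \right\}.
\end{align}
```

The wavelength dependence enters twice, through the grating angle θd(λ) and through the glass dispersion n(λ), which is why the choice of prism material and the angle β both affect the linearity in k.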

Fig. 1

Schematic of spectral domain optical coherence tomography with a linear-in-wave-number spectrometer. SLD, superluminescent diode; BS, beamsplitter; L, achromatic lens; θd, diffraction angle; θin, incident angle; θout, output angle of prism; α, apex angle of prism; β, angle between grating and prism; x0, central location of CCD camera.


First, we calculated the location of the spectral component at the focal plane in a wave number range between 7 and 8 μm⁻¹ for each angle β between the prism and the grating to optimize the linearity of the spectrometer. Here, the prism materials are BK7, F2, and SF10, with the apex angle α = 60°. A comparison of the derivatives of the locations with respect to wave number at the optimal angle is shown in Fig. 2(a). The optimal angle was estimated from the standard deviation of the derivatives, as shown in Fig. 2(b). From these results, the combination of the grating with 1200 lines/mm and the F2 equilateral prism was found suitable for the linear-k spectrometer.
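The linearity search described above can be sketched as follows: compute the focal-plane position x(k) over 7 to 8 μm⁻¹, differentiate it numerically, and pick the angle β that minimizes the standard deviation of dx/dk. Several ingredients here are assumptions rather than details from the paper: the prism incidence angle is taken as θd + β, the F2 refractive index uses Schott catalogue Sellmeier coefficients, and all function names are hypothetical.

```python
import math
import statistics

GROOVE_DENSITY = 1.2        # grating lines per micrometre (1200 lines/mm)
LAMBDA0_UM = 0.8408         # centre wavelength in micrometres
FOCAL_LENGTH_UM = 250.0e3   # focusing lens, f = 250 mm
APEX = math.radians(60.0)   # prism apex angle alpha

# ASSUMPTION: Schott catalogue Sellmeier coefficients for F2 glass.
F2_B = (1.34533359, 0.209073176, 0.937357162)
F2_C = (0.00997743871, 0.0470450767, 111.886764)  # micrometres squared

THETA_IN = math.asin(GROOVE_DENSITY * LAMBDA0_UM / 2)

def n_f2(lam_um):
    """Refractive index of F2 from the Sellmeier equation."""
    lam2 = lam_um ** 2
    return math.sqrt(1 + sum(b * lam2 / (lam2 - c) for b, c in zip(F2_B, F2_C)))

def theta_d(lam_um):
    """First-order diffraction angle from the grating equation."""
    return math.asin(GROOVE_DENSITY * lam_um - math.sin(THETA_IN))

def theta_out(lam_um, beta):
    """Prism output angle; raises ValueError on total internal reflection."""
    n = n_f2(lam_um)
    theta1 = theta_d(lam_um) + beta  # ASSUMED sign convention for beta
    inner = APEX - math.asin(math.sin(theta1) / n)
    return math.asin(n * math.sin(inner))

def position_um(lam_um, beta):
    """Position on the detector relative to the centre wavelength (x0 = 0)."""
    theta0 = theta_out(LAMBDA0_UM, beta)
    return FOCAL_LENGTH_UM * math.tan(theta_out(lam_um, beta) - theta0)

def derivative_std(beta, k_min=7.0, k_max=8.0, steps=200):
    """Standard deviation of dx/dk over the wave number range (um^-1)."""
    ks = [k_min + (k_max - k_min) * i / steps for i in range(steps + 1)]
    xs = [position_um(2 * math.pi / k, beta) for k in ks]
    dk = ks[1] - ks[0]
    return statistics.pstdev((xs[i + 1] - xs[i]) / dk for i in range(steps))

def best_beta_deg(candidates_deg):
    """Return (beta in degrees, std) with the most uniform dx/dk, or None."""
    best = None
    for b in candidates_deg:
        try:
            s = derivative_std(math.radians(b))
        except ValueError:
            continue  # ray does not exit the prism at this angle
        if best is None or s < best[1]:
            best = (b, s)
    return best

best = best_beta_deg([10 + 0.5 * i for i in range(61)])  # scan 10 to 40 degrees
```

A perfectly linear-k spectrometer would give a constant dx/dk and therefore zero standard deviation. The angle this sketch returns depends on the assumed geometry and should not be read as the value reported in the paper.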

Fig. 2

(a) Derivatives of the location of the spectral component at the optimal angle β with respect to wave number. (b) Standard deviation of derivatives with respect to angle between the prism and the grating.


Next, we estimated the computing time using a GPU, which operated in 32-bit floating-point (single precision) mode. Figure 3 shows the flowchart of real-time OCT imaging. Initially, the reference intensity is measured for DC removal and then stored in the GPU memory. In Fig. 3, the dashed block shows the routine procedure in our system. First, a spectral interference image (2048 axial pixels × 1000 lateral pixels, 16-bit resolution) is captured on the host computer and then transferred to the GPU memory. Second, the data are converted from 16-bit integers to 32-bit floating-point values, and the DC removal process is performed using the stored reference intensity. Here, the real and imaginary parts of the complex data are set to the processed data and zero, respectively. Third, the 2048-point FFT is performed for 1000 A-scans, and then a log scaling process is performed to obtain an OCT image. We performed FFT processing using Nvidia’s CUDA FFT library (CUFFT).11 Finally, after the data are converted from 32-bit floating-point values to 16-bit integers, the calculated result is transferred to the memory on the host computer and then displayed on the monitor. The estimated computing time, from the data transfer to the GPU memory through the data transfer back from the GPU memory, was 6.1 ms, which was shorter than the frame interval time (35.8 ms) using a line scan camera at 27.9 kHz. Consequently, real-time display of the processed OCT images could be achieved using the GPU in our system.
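The per-A-scan steps in this flowchart can be sketched on the CPU for a single short A-scan. This is only an illustration of the data flow: it uses a plain recursive FFT in place of CUFFT, a 16-point signal in place of 2048 points, and an assumed 0 to 80 dB display range for the final 16-bit conversion. The function names are hypothetical.

```python
import cmath
import math

def fft(values):
    """Recursive radix-2 FFT; len(values) must be a power of two."""
    n = len(values)
    if n == 1:
        return [complex(values[0])]
    even = fft(values[0::2])
    odd = fft(values[1::2])
    out = [0j] * n
    for i in range(n // 2):
        t = cmath.exp(-2j * cmath.pi * i / n) * odd[i]
        out[i] = even[i] + t
        out[i + n // 2] = even[i] - t
    return out

def process_a_scan(raw_counts, reference, db_floor=0.0, db_ceiling=80.0):
    """Integer counts -> float -> DC removal -> FFT -> log scale -> 16-bit."""
    # Convert to float and subtract the stored reference spectrum.
    signal = [float(r) - float(d) for r, d in zip(raw_counts, reference)]
    # Real part holds the processed data, imaginary part is zero.
    transformed = fft([complex(s, 0.0) for s in signal])
    pixels = []
    for c in transformed:
        mag = abs(c)
        db = 20.0 * math.log10(mag) if mag > 0.0 else db_floor
        db = min(max(db, db_floor), db_ceiling)  # ASSUMED display range
        pixels.append(int(round((db - db_floor) / (db_ceiling - db_floor) * 65535)))
    return pixels

# Example: a fringe with 3 cycles across 16 pixels on a flat reference.
n_points = 16
reference = [1000] * n_points
raw = [1000 + round(100 * math.cos(2 * math.pi * 3 * i / n_points))
       for i in range(n_points)]
image_column = process_a_scan(raw, reference)
peak_bin = max(range(n_points // 2), key=lambda i: image_column[i])
```

In the example, the fringe frequency of three cycles maps to depth bin 3, so `peak_bin` identifies the simulated reflector. On a GPU, the same steps run for all 1000 A-scans of a frame in parallel, with the FFT executed as a batch.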

Fig. 3

Flowchart of real-time OCT imaging. The dashed block shows the routine procedure in our system.


High performance computing can be achieved with Intel’s Math Kernel Library (MKL), a library of highly optimized, thread-safe mathematical functions for Intel CPUs. Govindaraju et al. compared their novel discrete Fourier transform algorithms with CUFFT on the GPU and MKL on the CPU.12 Their algorithm was two times faster than CUFFT on the GPU (Nvidia GTX 280) and 12 times faster than MKL on the CPU (Intel QX9650 3.0-GHz quad-core processor and 4-GB memory) for a data size of 2048-point FFT × 4096 FFTs. From this comparison, we can infer that CUFFT on the GPU was about six times faster than MKL on the CPU.

Finally, we measured OCT images of a human finger pad in vivo. We used a commercially available F2 prism (Thorlabs Incorporated, Newton, New Jersey) for the linear-k spectrometer. The spectrometer settings provided a spectral resolution of 0.049 nm and a depth range of 3.6 mm. With a probing power of 5.0 mW and an integration time of 34 μs, a sensitivity of 99 dB was measured close to the zero delay and dropped to 94 dB at 1 mm depth. The probe beam was scanned at 27.9 Hz using a sawtooth waveform with a duty cycle of 90%, which was modified to reduce mechanical vibrations. Figure 4 and Video 1 show the OCT images with an imaging range of 4.0 × 3.6 mm² (lateral × axial). We could obtain the OCT images without the resampling process because the linear-k spectrometer improved the spectral linearity in wave number compared with a conventional spectrometer. Because the linearity is not perfect, however, high-precision OCT imaging would still require a resampling process using calibrated spectral data. With only the FFT process to perform, the computing time on the GPU left a wide margin in our SD-OCT system. Therefore, GPU programming has the potential to implement further necessary signal processing such as resampling, spectral shaping of non-Gaussian spectral data, and dispersion compensation. Real-time display performance is very important for clinical applications that need immediate diagnosis for screening or biopsy/surgery. The GPU is an attractive tool for clinical and commercial systems because of its high performance computing and low cost.
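The stated depth range is consistent with the standard spectral-domain relation z_max = λ0²/(4δλ). The check below assumes that the quoted spectral resolution is the per-pixel sampling interval δλ and that depth is measured in air; it is a consistency check, not a calculation taken from the paper.

```python
CENTER_WAVELENGTH_NM = 840.8     # from the text
SPECTRAL_RESOLUTION_NM = 0.049   # from the text, assumed per-pixel interval

# z_max = lambda0^2 / (4 * delta_lambda), refractive index 1 assumed.
depth_range_nm = CENTER_WAVELENGTH_NM ** 2 / (4 * SPECTRAL_RESOLUTION_NM)
depth_range_mm = depth_range_nm * 1e-6

print(f"depth range: {depth_range_mm:.1f} mm")  # 3.6 mm
```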

Fig. 4

In vivo OCT images of a human finger pad at different positions. Imaging range: 4.0 × 3.6 mm² (lateral × axial). E: epidermis. D: dermis. Scale bar: 500 μm.


Video 1

Human finger pad in vivo, consisting of 200 OCT images at the display rate of 27.9 fps (QuickTime, 4 MB).


In conclusion, we demonstrate a real-time display on the linear-k SD-OCT system using GPU programming. We use a linear-k spectrometer that combines a diffraction grating (1200 lines/mm) and an F2 equilateral prism at 840 nm to avoid resampling of the axial data from wavelength to wave number. The calculation of the FFT is accelerated by the GPU. The computing time is 6.1 ms for a data size of 2048 pixels × 1000 lateral A-scans, which is shorter than the frame interval time. Our system can display processed OCT images in real time at 27.9 fps. Because GPUs are cost-effective for real-time display of FD-OCT images, the potential applications for this technique are wide.


This study was partially supported by a Grant-in-Aid for Scientific Research (20700375) from the Japan Society for the Promotion of Science (JSPS) and by the Industrial Technology Research Grant Program in 2005 from the New Energy and Industrial Technology Development Organization (NEDO) of Japan.



1. G. Häusler and M. W. Lindner, “Coherence radar and spectral radar—new tools for dermatological diagnosis,” J. Biomed. Opt. 3, 21–31 (1998). doi:10.1117/1.429899

2. A. F. Fercher, C. K. Hitzenberger, G. Kamp, and S. Y. El-Zaiat, “Measurement of intraocular distances by backscattering spectral interferometry,” Opt. Commun. 117, 43–48 (1995). doi:10.1016/0030-4018(95)00119-S

3. R. A. Leitgeb, C. K. Hitzenberger, and A. F. Fercher, “Performance of Fourier domain vs. time domain optical coherence tomography,” Opt. Express 11, 889–894 (2003).

4. T. E. Ustun, N. V. Iftimia, R. D. Ferguson, and D. X. Hammer, “Real-time processing for Fourier domain optical coherence tomography using a field programmable gate array,” Rev. Sci. Instrum. 79, 114301 (2008). doi:10.1063/1.3005996

5. Z. Hu and A. M. Rollins, “Fourier domain optical coherence tomography with a linear-in-wavenumber spectrometer,” Opt. Lett. 32, 3525–3527 (2007). doi:10.1364/OL.32.003525

6. G. Y. Gelikonov, V. M. Gelikonov, and P. A. Shilyagin, “Linear wave-number spectrometer for spectral domain optical coherence tomography,” Proc. SPIE 6847, 68470N (2008). doi:10.1117/12.763541

7. C. M. Eigenwillig, B. R. Biedermann, G. Palte, and R. Huber, “K-space linear Fourier domain mode locked laser and applications for optical coherence tomography,” Opt. Express 16, 8916–8937 (2008). doi:10.1364/OE.16.008916

8. C. Chong, A. Morosawa, and T. Sakai, “High-speed wavelength-swept laser source with high-linearity sweep for optical coherence tomography,” IEEE J. Sel. Top. Quantum Electron. 14, 235–242 (2008). doi:10.1109/JSTQE.2007.911766

9. N. Masuda, T. Ito, T. Tanaka, A. Shiraki, and T. Sugie, “Computer generated holography using a graphics processing unit,” Opt. Express 14, 587–592 (2006). doi:10.1364/OPEX.14.000587

10. T. Shimobaba, Y. Sato, J. Miura, M. Takenouchi, and T. Ito, “Real-time digital holographic microscopy using the graphic processing unit,” Opt. Express 16, 11776–11781 (2008). doi:10.1364/OE.16.011776

12. N. K. Govindaraju, B. Lloyd, Y. Dotsenko, B. Smith, and J. Manferdelli, “High performance discrete Fourier transforms on graphics processors,” Proc. ACM/IEEE Conf. on Supercomputing, pp. 1–12, IEEE Press, Piscataway, NJ (2008).

Yuuki Watanabe, Toshiki Itagaki, "Real-time display on Fourier domain optical coherence tomography system using a graphics processing unit," Journal of Biomedical Optics 14(6), 060506 (1 November 2009). http://dx.doi.org/10.1117/1.3275463

Optical coherence tomography



Image processing

Computing systems

Fourier transforms

Graphics processing units
