Recent advances in Fd-OCT1, 2, 3 make possible in vivo acquisition of ultrahigh-resolution volumetric retinal OCT data in clinical settings. This technology has led to new and powerful tools that have the potential to revolutionize the monitoring and treatment of many retinal and optic nerve diseases similar to the advancement achieved in other medical areas due to the application of clinical volumetric imaging. However, in order to fully realize this potential, new tools allowing the visualization and measurement of retinal features are required. Attempts at visualization of OCT volumetric retina data have recently been presented by several groups4, 5, 6 and have included visualizations of highly magnified retinal structures imaged with adaptive optics (AO-OCT) systems.7, 8 Possible approaches to retinal layer segmentation have also been presented. 9, 10, 11, 12
In this paper we present a clinical Fd-OCT system that produces retinal volumes that are visualized and analyzed in real time with custom software. A fully automated approach that segments, classifies, and analyzes retinal layers would be ideal. However, the morphology of retinal layers varies dramatically from patient to patient and depends on the particular pathogenic changes of the disease in question. This causes problems for existing automatic retinal layer extraction methods used in clinical instruments.13 Therefore, to simplify this task, we have developed a system based on a semiautomatic segmentation method. A clinician interactively specifies the location of a retinal layer on a few select slices in order to create a segmentation of the entire volume. Our method uses an SVM-based classification system,14, 15 which is a machine learning method used to predict results based on a given set of inputs. An SVM is tolerant of misclassification by the user and of physiological variation between patients and diseases and easily adapts to the varying data properties that constitute a retinal layer. Also, our SVM approach gracefully handles the speckle noise that disturbs all data acquired through OCT by modeling it as a normal distribution characterized by a voxel’s local data mean and variance. Once the layers are segmented, we can extract a thickness map, layer profile, and -axis intensity projection or measure a volume of this structure. We compare our semiautomatic segmentations to manually segmented layers and test its performance for different retinal and optic nerve diseases.
Materials and Methods
The results presented in this paper have been acquired during the routine use of our clinical Fd-OCT system. Over the past two years, we have used this system to acquire volumetric scans from more than 200 subjects, with healthy and diseased retinas or optic nerve heads. More than half of these data sets have been used to create volumetric representations of the imaged retinal structures. SVM segmentation has been implemented with 30 normal and diseased retinas to evaluate classification of different retinal structures. Due to involuntary eye motion and reduced (or distorted) intensity of some OCT images, not all of the 3D scans are appropriate for volumetric reconstruction; the inappropriate scans are related to (especially among elderly patients) advanced cataract, significant eye aberrations, long eyelashes, and ptosis. As a standard method of qualifying retinal scans for volumetric reconstruction and subsequent segmentation, a movie showing all consecutive B-scans acquired in the volume is generated and viewed by the operator. The C-scan reconstructed from OCT images is also used to rate distortions caused by eye motion. Due to these problems, which are routine in any clinic, we have concentrated our efforts on a segmentation method that tolerates noisy data.
The base configuration of the experimental Fd-OCT system used for acquiring 3D volumes has been described previously.7 For data presented in this paper, a different light source has been used: a superluminescent diode with an 855-nm central wavelength and an FWHM of . The spectral bandwidth of this light source allows axial resolution in the retina (refractive index, ). The high power of this light source supports use of a 90/10 fiber coupler, directing 10% of the light toward the eye, resulting in the power at the subject’s cornea equal to and allowing 90% throughput of the light back-reflected from the eye to the detector. With our current spectrometer design (200-mm focal length of the air-spaced doubled), the maximum axial range (seen after the Fourier transform) is about in free space, corresponding to approximately in the eye. Figure 1 shows a schematic of our clinical Fd-OCT system.
Each subject was imaged with several OCT scanning procedures including 3D scanning patterns centered at the fovea and ONH. We used two different scanning arrangements for 3D scanning patterns including (1) equally spaced 200 B-scans, with each based on 500 A-scans, and (2) equally spaced 100 B-scans based on 1000 A-scans. In both cases, volumes consisted of the same number of A-scans (100,000), and the time required to acquire a volume was for CCD exposure time and for exposure time. The longer exposure time was mainly used to increase image intensity and with 100 B-scans based on 1000 A-scans per frame acquisition mode, at the cost of more motion artifacts. Commercial Fd-OCT acquisition software from Bioptigen, Inc. permitted real-time display (at the acquisition speed) of the imaged retinal structures and saved the last acquired volume.
After each imaging session, saved spectral data are processed in LabVIEW using standard Fd-OCT procedures,3, 7, 16 including spectral shaping, zero padding, and dispersion compensation. As a result, a set of high-quality OCT B-scans is saved in tiff format. It is possible to then analyze data and choose volumes with no eye motion or ones in which eye movement occurred only at the beginning or toward the end of the 3D acquisition, provided there was good fixation during the rest of the time (especially if the structure of clinical relevance was scanned without distortions). In these cases, the part of the image that was affected by eye motion is removed, resulting in a reduction in the number of B-scans and overall size of the reconstructed 3D volume. In addition, as a preprocessing step before volume visualization, OCT B-scans are registered using standard registration techniques. Figure 2 shows a single B-scan and also a cross section of several B-scans showing the alignment in different cross sections.
To correct for axial and lateral distortions, we used ImageJ,17 with the Turboreg plug-in developed in the laboratory of P. Thévenaz at the Biomedical Imaging Group, Swiss Federal Institute of Technology Lausanne,18 which computes rigid body transformations (translation and rotation) to minimize the difference between neighboring frames, as can be seen in the lower panel of Fig. 2.
Visualization and Analysis Software
To render the volumetric data, we used a technique similar to that described by Carbal 19 Every time the volume is rendered, it is sliced into evenly spaced view-aligned planes. Each slice is rendered through a pixel shader using the voxel’s data value, a color map, and alpha blending to create a volume rendering that is equivalent to a ray tracer that uses evenly spaced samples. This is done efficiently by implementing a highly optimized cube slicing algorithm and applying it to the bounding box of the volume. We also use adaptive slice spacing to balance performance/functionality and visual quality.
There has been substantial research on classification and segmentation of volumetric data. Most of this work focuses on visualization instead of explicit segmentation. The most common approach used in volume visualization is based on a one-dimensional transfer function to map scalar intensity to color and opacity. This effectively allows a region to be classified based solely on scalar intensity, hiding regions occluding areas of interest while highlighting the remaining areas with a chosen color scheme. This approach does not work for noisy data if it results in a region that cannot be classified solely by its scalar intensity (since these values occur somewhat randomly throughout the volume). This approach is also not applicable to volumes that contain volumetric objects of similar intensity values that occlude each other. A great deal of research has attempted to solve these inadequacies. Levoy 20 used gradient magnitude to highlight material boundaries in volumetric data to better express structure. Kindlmann and Durkin21 proposed the use of a two-dimensional transfer function, represented by a scatterplot, of scalar values versus gradient magnitudes. Others have described methods that help a user better utilize this 2D transfer function.20, 22, 23 Iserhardt-Bauer 24 combined a 2D transfer with a region growing method that expands to fill regions of interest. Region growing methods are typically ideal for large meandering regions. Multidimensional transfer functions tend to operate better than lower-dimensional counterparts since the method can use more information to segment the data. The method of Bordignon 25 demonstrates a multidimensional transfer function with a user interface based on star coordinates. This is a technique that maps multidimensional data to a 2D plane so that a user can interact with it. Pfister 26 compared a variety of transfer function methods including some of those listed above. Despite these advances, 2D transfer functions remain inaccessible to the common user, and their effectiveness is still predicated on noiseless data. When it comes to discrete and explicit segmentation, medical imaging research has used artificial neural networks as a means to assist in these tasks.27, 28 However, support vector machines14, 15, 29 have demonstrated better results to date than neural networks (at a larger computational cost) when applied to pattern recognition including object identification,30 text categorization,31 and face detection.32 Tzeng 33 compared the use of neural networks and support vector machines when trying to construct an N-dimensional transfer function that uses additional variables such as variance and color (if present) in addition to scalar values and gradient information. The ability of neural networks and SVMs to handle error (in the form of noise) makes them useful for noisy volumetric data sets. However, these methods still seek to classify a volumetric object solely based on discrete values, not on distributions. Our approach converts noise into a classifiable characteristic by using a specialized SVM that operates on distributions rather than scalar values.
Two main design decisions customize an SVM to a particular application. The first is the kernel used to map input (or world) space to feature space. As a consequence, the SVM can process problems that are not linearly separable in input space. In our case we chose a radial basis function kernel:is a constant scaling exponential decay and and are two vectors in the feature space.
The second design decision to be made is the dimensionality (data characteristics) of our input space. The first (and most obvious) value we add to the input (or feature) vector is scalar intensity. It allows for efficient segmentation of regions having a low standard deviation (noiseless data). We also include the spatial location of a voxel since we want to track spatial locality of data distributions, which allows the SVM to distinguish between features having similar data distribution characteristics but residing in different locations.
The method described by Tzeng, 33 from which ours is derived, suggests that six neighbor intensity values should be considered at to counter noise. They suggest that if a voxel value has been perturbed by noise, inclusion of its neighbors will help determine the “actual value.” We found this approach to dramatically decrease the effectiveness of the SVM and lead to unintuitive results. We modified this approach to instead use the mean intensity of the six neighbors. We also included the local variance (instead of the standard deviation) both to let the SVM gauge the accuracy of the mean and to characterize the local distribution. Our variance estimate is accurate as long as the set of neighboring voxels used to compute the variance belong to the same feature (i.e., are not on the border of two or more features, which would give an inaccurate representation of the variance for a particular layer). We detect border regions through a locally approximated gradient magnitude calculation. In summary, for a voxel with scalar intensity value , the data characteristics we use for our input vector areis a set of neighboring voxels around and is a voxel in this set. Thus, for each we store its intensity, location of a local approximation of the mean value , a local approximation of the variance , and a local approximation of the gradient using a local difference operator.
User interface and SVM training
The main advantage of using an SVM to perform segmentation is that it does not need to know what type of region it is segmenting a priori. Training data are provided at run time by a user through an intuitive interface, which is a part of our volumetric rendering software, to dynamically create a “segmentation function.” Our system allows a user to quickly classify features by using a small number of specification points. We perform this specification on B-scans of the volume, similarly to Tzeng 33 The user can interactively choose a B-scan from the volume and “paint” on that B-scan to mark points as “feature” or “background.” The user is required only to draw points on the regions of interest and indicate regions not of interest. The difficulty of obtaining an accurate segmentation is not selecting training points for the SVM, but finding the smallest set of points that provide the best result. This turns into an iterative process of selecting points and testing the result of the SVM segmentation first on the single frame and then on the whole volume. The cost of choosing excessive numbers of points is an increase in processing time by the SVM.
The classification provided by the SVM is a partition of the entire volume so we can use it to extract relevant morphological data. For example, one can track the change in the volume of the given structure (e.g., cup of the ONH) that may be useful for retinal or optic nerve diagnosis. Another potentially useful metric is the density of a volumetric object and its standard deviation. Density in an OCT volume represents the intensity per voxel volume; thus, the amount of backscattered light that can be used as a metric characterizing the internal structure of a given retinal layer or structure.
SVMs have a large computational cost that hinders a user’s ability to iteratively create segmentation. We have developed techniques to reduce the computational cost for both SVM training and classification. Our techniques aim to reduce the time needed between iterations for creating and testing an SVM by a clinician. With the first method, the user tests the segmentation on individual slices (B-scans). This method usually requires only a few seconds (even for volumes of resolution). The user can browse remaining slices and apply the trained SVM in order to preview the segmentation. A user can adjust the current SVM by marking misclassified voxels, retraining the SVM, and continuing. Once trained, an SVM classifies each voxel independently of other voxels. Thus, we take advantage of multiprocessor computers by using multithreading in order to significantly decrease computation time for the clinician.
Another effective method restricts the SVM to a subvolume of the data. Our application allows a user to specify axis-aligned clipping planes in order to specify a subregion. This approach can significantly reduce the number of voxels to be processed. Time is saved because a user does not have to train the SVM regarding “background” data existing outside this subvolume. Additionally, smaller training data reduce the mapping complexity of the SVM, resulting in both faster training and classification.
Assuming that objects are relatively large with respect to voxel size, we have implemented a checkerboard scheme. Thus, we only process every other voxel by the SVM. Each unclassified voxel is classified based on its neighbors. If no clear majority class is implied by the neighbors, we apply the SVM to that voxel to decide. This reduces classification time by about 40% and also leads to smoother object boundaries and fewer “orphaned” voxels, i.e., those that are disconnected from larger objects and appear to be noise.
Our visualization software is regularly used to render retinal volumes and allows interactive viewing by our clinicians. Due to the use of hardware acceleration in our rendering software (we use common features of modern graphic cards that have been optimized for interactive media applications), this is done in real time at high resolution. We tested our SVM segmentation software on a variety of retinal volumes and retinal structures that have been segmented. SVM training and segmentation time varies dramatically depending on the complexity of the segmented layer or the feature, the number of voxels in the volume, and the accuracy of segmentation. Total segmentation time for a whole volume performed on a PC workstation with two Intel Xeon 3.6-GHz processors and of main memory can be as short as a few minutes for small volumes and well-defined features to as long as 2 hours for large volumes and complex structures. Generally, the more training points needed to extract a feature, the more time-consuming is the segmentation. Thus, it is important for the operator to learn how to efficiently train the SVM to reduce time needed for segmentation. In order to quantify the accuracy of the segmentation, our visualization software has a built-in manual segmentation option, where the operator can draw the borders of the segmented feature on each B-scan. This process may be very time-consuming. However, it can be used to create a “reference” segmentation that can be directly compared to an SVM segmentation for verification purposes. Additionally, we can extract morphological data that have known normal values, such as retinal layer thickness maps or retinal layer profiles that can be directly used as a diagnostic tool in ophthalmology. Moreover, we can use our software to create -axis projection intensity maps of extracted layers, allowing visualization of specific retinal features.
Visualization and Analysis of Volumetric Data
To illustrate the performance of our clinical Fd-OCT and our 3D data visualization and analysis software, we present four examples of visualization and analysis of the clinical data.
The first example is data acquired from a 55-year-old patient with non-exudative age-related macular degeneration (AMD). Figure 3 shows the patient’s fundus photo used for regular clinical diagnosis. The top right image shows a reconstructed en face image (virtual C-scan) from the OCT volume (100 B-scans/1000 A-scans/B-scan acquired with /line exposure) superimposed on the fundus picture. This step allows precise registration of the acquired volume and estimation of the distortions caused by the subject’s eye motions during the experiment.
To demonstrate the performance of our Fd-OCT system, four B-scans (labeled A–D) are shown in the lower part of Fig. 3. The location of the diseased structures can be easily seen in these images. As described above, this data set is used to create the interactive volumetric visualization using our software. Figure 4 shows a screen shot of our volume renderer and segmentation system. Note the points used to define the SVM segmentation.
Figure 5 shows an example of different 3D visualizations of this data set. We used a false color intensity based on a black-red-yellow color lookup table with black representing low-intensity voxels and yellow representing high-intensity voxels. Transparency of each voxel is based on its intensity and can be interactively set by the operator.
To test the performance of our SVM segmentation algorithm, we used it to segment the retinal pigmented epithelium (RPE) and photoreceptor layers, structures that show the main changes associated with disease progression in AMD. The bumpy surface is due to large drusen, a defining feature of early stage AMD. A thickness map of these segmented structures is created from the SVM segmentation. Figure 6 shows the corresponding thickness maps from the SVM segmentation and from manual segmentation. As can be seen from the images in Fig. 6, the SVM method leads to clear separation of these layers, allowing its visualization as well as estimation of the thickness of the area of interest. The differences between the SVM and manual segmentations suggest that some refinement of the SVM may be necessary to better segment high density structures.
Figure 7 shows another SVM-based segmentation of the RPE-photoreceptor-outer segment complex from the retinal foveal region of a 56-year-old patient with early-stage dry AMD. In contrast to the first patient presented, the RPE-photoreceptor-complex disruptions are subtle. Figure 7 shows results generated with our custom visualization software (including -axis intensity projection).
Our algorithm was able to segment these structures. However, since the size of these disrupted regions was small and there was intensity variation across the volume, mainly at the corners of the image, not all drusen were identified in the thickness map. When the size of disruption is bigger, as in the previous example, the SVM segmentation algorithm has been able to pick out the features of interest very accurately.
In the next two examples, we tested the SVM segmentation algorithm in segmenting the RNFL around the ONH. Thinning of RNFL is known to be a good indicator of possible onset of glaucoma; therefore, being able to accurately segment and measure its thickness and follow it over time would be a good diagnosis and monitoring tool in glaucoma management. Figure 8 shows the segmentation results of the ONH from a healthy 30-year-old volunteer with our SVM segmentation software.
We also segmented the same structure in a 48-year-old glaucoma patient with a moderate amount of visual field loss. The results are shown in Fig. 9 .
As can be seen from these last two examples, SVM-based segmentation was able to successfully differentiate RNFL from other retinal structures, allowing visualization of the RNFL thickness map that shows thinning of this layer for the subject with glaucoma. It confirms good performance of the SVM-based segmentation for thick and well-defined structures.
Conclusion and Discussion
A major limitation in segmentation of OCT volumes is that image intensity can vary depending on B-scan location and due to the shadowing effects caused by blood vessels. These effects are visualized in Fig. 10 , where blood vessels do not allow much light to pass through them, which means that data in regions around and below them are obscured, distorted, or occluded. Our visual system can cope with this by recognizing global patterns and extrapolating them into these regions. But as stated earlier, SVM classification remains a local operation and is unable to “see” global structure. This makes segmentation in these areas difficult and inaccurate. Recently Mujat 9 have presented a fully automated segmentation method that can overcome this problem—however, the computational cost is larger, resulting in frame segmentation time similar to our volume segmentation time. The only way to cope with these anomalies in the SVM is to specify a large amount of additional training data. When spending only a small amount of time generating training data, these inaccuracies are pronounced enough to make any morphological data extracted from these regions useful only as a first approximation. Filters partially alleviate this problem, but a better solution is desirable. We plan to direct our attention to this problem in the future, using SVM-based segmentation methods.
It should be noted that when one performs analyses using morphological data obtained from SVM segmentation, the quality of a segmentation depends heavily upon the training data and SVM quality (i.e., how well the SVM can predict based on the given input data).
We will refine our method to take into account certain global information such as, in the case of OCT retinal data, the number of layers that make up the retina and the fact that these layers normally do not intersect one another. Using such known constraints should further improve segmentation results.
This research was supported by a grant from the National Eye Institute (EY 014743). We gratefully acknowledge the collaboration in designing this instrument with Prof. Joseph Izatt from Duke University, Durham, NC, and Bioptigen Inc. Research Triangle Park, NC. We thank the members of Vision Science and Advanced Retinal Imaging Laboratory, VSRI, and the visualization and computer graphics research group, IDAV, University of California, Davis, for their help.