Optical coherence tomography angiography (OCT-A) is an emerging imaging modality with which the retinal circulation can be visualized by computing the decorrelation signal on a pixel-by-pixel basis. Variants of OCT-A methods have been described in recent review articles.1–4 We have recently validated the speckle variance approach to OCT-A against standard invasive techniques, such as fluorescein angiography (FA),5,6 in which only the superficial capillaries can be distinguished due to excessive choroidal fluorescence,7 and against ex vivo histological analyses.6,8–10
The noninvasive, in vivo visualization of the retinal microvasculature using OCT-A can be instrumental in studying the onset and development of retinal vascular diseases. For example, OCT-A has enabled the visualization of the deep plexus layer and furthered the understanding of diseases, such as paracentral acute middle maculopathy11–13 and diabetic retinopathy.14,15 Quantitative measurements, such as capillary density, can be used to stratify the risk of disease progression and visual loss, and to monitor the course of disease.16,17 Due to projection artifact and poor contrast, it is often difficult to trace individual vessels in this layer when only one en face image is visualized. An additional challenge is the small caliber and pulsatile flow of the retinal capillaries, which makes them less consistently visible and more difficult to distinguish from speckle noise than larger vessels. This limits the detection sensitivity for changes in the retinal microvascular circulation due to disease, aging, or treatment. Methods for reliable visualization of the microvasculature in OCT-A images are required for studies conducting longitudinal and cross-sectional quantitative analyses.
Our work focuses on improving the visible definition of the retinal microvasculature in OCT-A by motion correction, registration, and averaging of sequentially acquired images. Detailed, high-quality OCT-A images are needed for clinical studies, such as comparisons of OCT-A with histology, fundus photography, and FA, and for studying the shunting of vessels in a focal area, such as the inner ring of vessels at the foveal avascular zone (FAZ) or in glaucomatous focal defects.18
Serially acquiring and averaging multiple OCT-A images can be an effective way to confirm the presence or absence of capillaries, as the discontinuous appearance of the capillary vessels cannot be remedied simply by image filtering.3,19 A crucial step in the serial acquisition approach is the registration of the multiple OCT-A images, a difficulty compounded by the fact that an OCT-A image is acquired over multiple seconds and is thus particularly susceptible to motion artifacts. The registration of sequential B-scans20 can help attenuate small motion artifacts, but not the larger motion artifacts associated with imaging subjects with pathologies. OCT-A with eye tracking has been implemented in commercial retinal imaging systems, although this increases hardware cost and complexity, on which the sensitivity and reliability of motion detection also depend. Previous works on posthoc motion artifact removal in en face summed volume projections have been reported in the literature; see, for example, Refs. 3, 21, and 22. In the work by Hendargo et al.,21 two to three sets of orthogonal (x- and y-fast) volumes were acquired and divided into motion-free strips. The visualization and contrast of the vessels were improved by multiresolution Gabor filtering, and the strips were registered one by one, first globally by the x- and y-translation that maximized the correlation in the overlapping region, and then locally by B-spline free-form deformation in the overlapping region. Zang et al.22 did not acquire orthogonal data sets, but instead serially acquired two OCT-A volumes in the same scan orientation that were divided into parallel motion-free strips. The strips were first registered by the x- and y-translation and rotation that minimized the squared difference of "large vessels," which were defined in the paper as pixels with decorrelation value greater than 1.3 times the mean value.
This was followed by B-spline free-form deformation on “small vessels,” defined as pixels with decorrelation value less than 1.3 times and greater than 0.6 times the mean value. Both groups presented mosaicking of OCT-A images into wide-field views, which has been reported in other works as well.19,23
In this work, we present averaging of up to 10 serially acquired OCT-A images with parallel strip-wise microsaccadic noise removal and localized nonrigid registration. Unlike the previous two methods,21,22 which concentrated on motion artifact removal and wide-field imaging, our purpose was to improve the contrast and signal-to-background ratio of the capillaries in focal regions. The details of our methodology are presented below. In brief, the serially acquired OCT-A images were divided into microsaccade-free strips. The target strips were first aligned to a template image by x- and y-translation based on maximum cross-correlation, followed by affine registration using the scale-invariant feature transform (SIFT),24 a feature extraction method robust to scaling, orientation changes, illumination changes, and affine distortions.
The image warping and local distortion due to slower eye movements are less obvious and more difficult to model than the strong stripe artifacts from microsaccadic motion. Instead of free-form deformation,21,22 our approach optimized the intensity value at each pixel location as the average of the values from each overlapping strip determined by translation and rotation of a windowed region in each strip. Thus pixel-wise correspondence across multiple OCT-A images was found by local neighborhood matching.
The remainder of this report is organized as follows. Section 2 describes our processing algorithm for OCT-A image averaging in detail, as well as the quantitative metrics used to evaluate the improvements to image quality. The algorithm was tested on OCT-A images of six healthy volunteers, with the vessel visibility improvement qualitatively demonstrated in the en face images of all retinal layers and of the superficial and deep plexus layers. Quantitatively, we evaluated the algorithm performance by contrast-to-noise ratio (CNR) and signal-to-noise ratio (SNR), with the background speckle noise statistics taken from the FAZ. We end with a discussion of the image quality improvements obtained by averaging serially acquired OCT-A images.
All subject recruitment and imaging took place at the Eye Care Centre of Vancouver General Hospital. The project protocol was approved by the Research Ethics Boards at the University of British Columbia, Simon Fraser University, and Vancouver General Hospital, and performed in accordance with the tenets of the Declaration of Helsinki. Written informed consent was obtained from all subjects.
Optical Coherence Tomography Instrumentation
The OCT-A images were acquired from a graphics processing unit-accelerated OCT clinical prototype; the details of the acquisition system have previously been reported.5 The OCT system used a 1060-nm swept source (Axsun Inc.) with a 100-kHz A-scan rate and a 111-nm 10-dB tuning bandwidth; the digitized spectrum used for imaging corresponded to an axial resolution of in tissue. The size of the focal waist on the retina was estimated using the Gullstrand–LeGrand model of the human eye to be (calculated using Gaussian optics), corresponding to a lateral FWHM of . The scan area was sampled in a grid with a field of view in 3.15 s. Ten serially acquired volumes centered at the FAZ were obtained per eye in . During this image acquisition period, subjects were asked to maintain their gaze on a fixation target and were encouraged to blink as necessary to prevent drying of the cornea. The automated parsing of the image data strips (Sec. 2.3.1) eliminated issues of motion artifact and partial volumes. For the angiogram, the speckle variance calculation25 was used,26,27

$$ sv_{jk} = \frac{1}{N}\sum_{i=1}^{N}\left(I_{ijk} - \frac{1}{N}\sum_{i=1}^{N} I_{ijk}\right)^{2}, $$

where $i$ indexes the $N$ repeat B-scans and $j$ and $k$ are the pixel coordinates within the B-scan.
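As a concrete illustration, the speckle variance computation over repeat B-scans can be sketched in a few lines of numpy; the function name and array layout are our own choices, not the authors' implementation.

```python
import numpy as np

def speckle_variance(frames):
    """Speckle variance angiogram from N repeat B-scans.

    frames: array-like of shape (N, Z, X) holding the OCT intensity of
    N B-scans repeated at the same location. Returns the per-pixel
    variance across the repeats; static tissue gives values near zero,
    while flow decorrelates the speckle and gives large values.
    """
    frames = np.asarray(frames, dtype=float)
    mean = frames.mean(axis=0)
    return ((frames - mean) ** 2).mean(axis=0)
```

With the three repeat B-scans used in this study, N = 3.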
En Face Angiogram Extraction
Postprocessing of the raw intensity data was performed to extract optimal-quality images of the retinal microvasculature. Coarse axial motion artifact was corrected using cross-correlation between adjacent B-scans. Subpixel registration was then performed on each set of three repeat B-scans before computing the speckle variance angiogram.28 Three-dimensional (3-D) bounded variance smoothing was applied to the motion-corrected intensity B-scans in order to reduce the effect of speckle while preserving and enhancing the boundaries between retinal layers. The inner limiting membrane (ILM), the posterior surface of the inner plexiform layer (IPL), and the posterior boundary of the outer nuclear layer (ONL) were segmented automatically in 3-D using a graph-cut algorithm.29 The automated segmentation was examined and, where needed, corrected by a trained researcher using Amira (version 5.1; Visage Imaging, San Diego, California). The angiogram data from the superficial plexiform layer (ILM to IPL), the deep plexiform layer (IPL to ONL), and all retinal vascular layers (ILM to ONL) were extracted and averaged in the axial direction to produce projected en face images of the microvasculature. Projection artifacts in the deep layer angiogram were attenuated using a modified slab-subtraction algorithm.30 In Eq. (2), the projection-resolved en face angiogram of the deep layer is computed through a normalization process from the angiogram together with the numbers of pixels in the deep layer, in all the retinal layers, and in the superficial layer.
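The axial averaging between segmented boundaries, and a much-simplified version of slab subtraction, might look as follows in numpy. The boundary arrays, the scaling factor alpha, and the function names are hypothetical; the actual modified slab-subtraction algorithm30 is more involved than this stand-in.

```python
import numpy as np

def en_face(volume, top, bottom):
    # volume: (Z, Y, X) angiogram; top/bottom: (Y, X) segmented boundary
    # indices (e.g., ILM and IPL surfaces). Average the axial samples that
    # fall between the two surfaces at each transverse location.
    z = np.arange(volume.shape[0])[:, None, None]
    mask = (z >= top[None]) & (z < bottom[None])
    counts = np.maximum(mask.sum(axis=0), 1)  # avoid division by zero
    return (volume * mask).sum(axis=0) / counts

def projection_resolved_deep(deep, superficial, alpha=1.0):
    # Simplified stand-in for slab subtraction: remove a scaled copy of the
    # superficial en face signal from the deep en face image (alpha assumed).
    return np.clip(deep - alpha * superficial, 0.0, None)
```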
Contrast-limited adaptive histogram equalization was then performed on all en face images to enhance the contrast.
The algorithm overview is shown in Fig. 1. The 10 serially acquired en face images of all retinal layers were divided into microsaccade-free strips, which were then registered to a template image, first using rigid registration for the coarse alignment, followed by nonrigid registration for finer features. Transforms applied to the en face image of the full retinal thickness were then applied to both the superficial and deep layer angiograms.
Microsaccade-free strip generation
For each eye, a microsaccade-free image from the 10 en face images was chosen as the template image. In the case that all images contained microsaccadic motion artifacts, a template was generated by stitching together microsaccade-free strips using the registration methods discussed below.
After the template image was chosen/generated, the remaining images were divided into strips at the positions in the image where patient fixation was lost, which appeared as vertical white stripes in the en face image, as shown in Fig. 2. Strips less than 40 pixels wide often contained large drift artifact and were therefore discarded. If multiple microsaccade-free images existed for an eye, the first was selected as the template, and the rest were divided into three equal-sized strips for registration. Each strip was zero padded to match the size of the template image and coarsely aligned to the template by x- and y-translation using maximum cross-correlation.29
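A minimal sketch of the strip extraction, assuming the microsaccade artifacts can be flagged as columns whose mean decorrelation exceeds a threshold; the threshold and function name are our own choices, not the authors' implementation.

```python
import numpy as np

def split_strips(img, thresh, min_width=40):
    # Columns whose mean intensity exceeds `thresh` are flagged as
    # microsaccade artifact lines (the vertical white stripes); the image
    # is split into the runs of clean columns between them, and strips
    # narrower than `min_width` pixels are discarded, as in the text.
    col_mean = img.mean(axis=0)
    bad = col_mean > thresh
    strips, start = [], None
    for x, b in enumerate(bad):
        if not b and start is None:
            start = x                      # a clean run begins
        elif b and start is not None:
            if x - start >= min_width:
                strips.append((start, x))  # close the run at the artifact
            start = None
    if start is not None and img.shape[1] - start >= min_width:
        strips.append((start, img.shape[1]))
    return strips
```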
Strip-based affine registration
SIFT keypoints were automatically extracted from both the template image and each strip to be registered.24 Briefly, keypoints are the locations of local scale-space extrema in the difference-of-Gaussian function convolved with the image. Further refinement to the keypoints can be made by assigning each keypoint an orientation to achieve invariance to image rotation. Finally, a local image descriptor is assigned to each keypoint using the image location, scale, and orientation as found above. Readers are encouraged to refer to Ref. 24 for a more detailed description of the SIFT algorithm.
As the SIFT feature descriptor is invariant to uniform scaling and orientation, it is well suited to identifying matching keypoints in noisy or speckled images, such as OCT-A angiograms. The calculation of Euclidean distances in MATLAB® is computationally expensive; therefore, matching keypoints between the template and strip were identified as the closest corresponding keypoints using a small-angle approximation to the Euclidean distance. Keypoints were considered matching if the ratio of the vector angles from the nearest to the second-nearest match was less than a threshold value of 0.75. As the image had been coarsely aligned in the previous step, a second check was included to ensure that the matched keypoints were no more than 40 pixels apart in the x- or y-direction.
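The ratio test at the heart of this matching step can be sketched with plain Euclidean distances (the small-angle approximation in the text is a faster stand-in for the same comparison); the descriptor arrays and function name here are illustrative.

```python
import numpy as np

def ratio_test_matches(desc_template, desc_strip, ratio=0.75):
    # For each strip descriptor, find the nearest and second-nearest
    # template descriptors; accept the match only if the nearest is
    # clearly closer than the runner-up (Lowe's ratio test).
    # desc_template: (M, D) with M >= 2; desc_strip: (N, D).
    matches = []
    for j, d in enumerate(desc_strip):
        dist = np.linalg.norm(desc_template - d, axis=1)
        order = np.argsort(dist)
        if dist[order[0]] < ratio * dist[order[1]]:
            matches.append((int(order[0]), j))
    return matches
```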
All strips that had a minimum of four matched keypoints were then transformed using an affine transform estimated with the matching keypoints as inputs to the estimateGeometricTransform function in MATLAB®. This function iteratively evaluates affine transformations computed from three randomly selected keypoint pairs, where the transformation with the smaller distance metric, calculated using the M-estimator sample consensus (MSAC) algorithm, is retained as the transformation matrix for the next comparison. The maximum number of random trials for finding the inliers was set to 5000 for improved robustness.
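The consensus step can be imitated with a basic RANSAC loop over three-point affine fits. This is a simplified stand-in for MATLAB's MSAC-based estimation, with hypothetical function names and an inlier tolerance we chose ourselves.

```python
import numpy as np

def fit_affine(src, dst):
    # Least-squares 2x3 affine matrix A such that [x, y, 1] @ A.T ~ dst.
    M = np.hstack([src, np.ones((len(src), 1))])
    A, *_ = np.linalg.lstsq(M, dst, rcond=None)
    return A.T

def ransac_affine(src, dst, trials=5000, tol=3.0, seed=0):
    # src, dst: (K, 2) matched keypoint coordinates, K >= 3. Repeatedly fit
    # an affine transform to three random correspondences and keep the
    # model with the largest inlier count within `tol` pixels.
    rng = np.random.default_rng(seed)
    best_A, best_inliers = None, -1
    for _ in range(trials):
        idx = rng.choice(len(src), 3, replace=False)
        A = fit_affine(src[idx], dst[idx])
        pred = src @ A[:, :2].T + A[:, 2]
        inliers = int(np.sum(np.linalg.norm(pred - dst, axis=1) < tol))
        if inliers > best_inliers:
            best_A, best_inliers = A, inliers
    return best_A, best_inliers
```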
Strip-based nonrigid registration
The vertical white lines in the target image in Fig. 2 mark image discontinuities due to microsaccades, which are accounted for by the strip-based affine registration; however, localized mismatch still remains in the aligned images after this affine registration step. The next step in our algorithm compensates for the smoother tremor and drift motions, which manifest as image warping and distortion, by using nonrigid registration. Prior to nonrigid registration, an averaging filter was applied to both the template and the aligned strip to smooth any fine speckle that might affect the nonrigid registration. The template and aligned strips were both then zero padded by 15 pixels. For each pixel in the strip, the normalized cross-correlation31 was calculated, defined as

$$ \gamma(u,v) = \frac{\sum_{x,y}\left[f(x,y) - \bar{f}_{u,v}\right]\left[t(x-u,\,y-v) - \bar{t}\,\right]}{\left\{\sum_{x,y}\left[f(x,y) - \bar{f}_{u,v}\right]^{2}\sum_{x,y}\left[t(x-u,\,y-v) - \bar{t}\,\right]^{2}\right\}^{1/2}}, $$

where $f$ is the template, $t$ is the strip window shifted by $(u,v)$, and $\bar{f}_{u,v}$ and $\bar{t}$ are the corresponding local means. Figure 2 shows a pictorial schematic of the registration pipeline described in this section. A smaller field-of-view demonstration of the coarse, affine, and nonrigid registration steps is shown in Fig. 3. The stack of registered strips could then be combined by taking either the mean or the median to generate a higher quality image.
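The local neighborhood matching can be illustrated by an exhaustive integer-shift search that maximizes the windowed normalized cross-correlation. The window size, search radius, and function names are illustrative choices; subpixel refinement and the rotation component are omitted from this sketch.

```python
import numpy as np

def ncc(a, b):
    # Normalized cross-correlation of two equally sized windows.
    a = a - a.mean()
    b = b - b.mean()
    denom = np.sqrt((a * a).sum() * (b * b).sum())
    return (a * b).sum() / denom if denom > 0 else 0.0

def best_local_shift(template, strip, y, x, win=7, search=5):
    # Find the integer shift (dy, dx) of a (2*win+1)-pixel window around
    # (y, x) in `strip` that maximizes NCC with the template window.
    t = template[y - win:y + win + 1, x - win:x + win + 1]
    best, best_dy, best_dx = -np.inf, 0, 0
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            s = strip[y + dy - win:y + dy + win + 1,
                      x + dx - win:x + dx + win + 1]
            if s.shape != t.shape:
                continue  # window ran off the image edge
            c = ncc(t, s)
            if c > best:
                best, best_dy, best_dx = c, dy, dx
    return best_dy, best_dx, best
```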
The performance of the algorithm was evaluated with qualitative observation and quantitative measures of the CNR, SNR, and structural similarity index (SSIM),34 with the SSIM used only for intravolume comparison to measure trends.
The SSIM35 is a quality metric used to measure the perceived relative quality of a digital image and is defined as

$$ \mathrm{SSIM}(x,y) = \frac{(2\mu_x \mu_y + C_1)(2\sigma_{xy} + C_2)}{(\mu_x^2 + \mu_y^2 + C_1)(\sigma_x^2 + \sigma_y^2 + C_2)}, $$

where $\mu_x$ and $\mu_y$ are the local means of images $x$ and $y$, $\sigma_x^2$ and $\sigma_y^2$ are the local variances, $\sigma_{xy}$ is the covariance, and $C_1$ and $C_2$ are small constants that stabilize the division.
A total of 10 eyes from six healthy volunteers (four males and two females) aged years were imaged according to the imaging protocol. A comparison of the template image and the final averaged OCT-A images for all retinal layers, as well as for the superficial and deep vascular layers, is shown in Figs. 4 and 5. In the template images, the vessels near the FAZ are relatively clear; however, it becomes harder to differentiate the vessels toward the periphery. In contrast, the vessels in the averaged images are clearly seen throughout. Improvement in vessel visibility is particularly marked in the deep layer, where the OCT signal strength is weaker. Qualitatively, the median images appear sharper than the mean images, as median averaging acts as a speckle-reducing filter, whereas the mean images appear smoother than the corresponding median images.
For quantitative comparison of the template and final averaged images, the average CNR and SNR of the images were calculated. The average CNR of the angiograms with all retinal layers increased from using the template images to with the mean images and with the median images. Additionally, the average SNR of the angiograms with all retinal layers increased from using the template images to with the mean images and with the median images. The mean improvement of both the CNR and SNR was statistically significant () using a paired t-test.
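Under the commonly used definitions (assumed here, since the text does not restate them), with background statistics drawn from the avascular FAZ, the two metrics can be computed as:

```python
import numpy as np

def cnr_snr(image, vessel_mask, background_mask):
    # Assumed definitions: CNR = (mu_v - mu_b) / sigma_b and
    # SNR = mu_v / sigma_b, with the background mean mu_b and standard
    # deviation sigma_b taken from the avascular FAZ region.
    mu_v = image[vessel_mask].mean()
    mu_b = image[background_mask].mean()
    sigma_b = image[background_mask].std()
    return (mu_v - mu_b) / sigma_b, mu_v / sigma_b
```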
To evaluate the change in perceptual quality per strip, the SSIM was calculated on each incremental averaged image of the template and registered strips. Although 10 volumes were acquired per eye, the number of microsaccades and of strips narrower than 40 pixels differed among the corresponding en face images, and therefore the number of strips used to generate the averaged images was not necessarily equal across eyes. The mean number of strips per eye was strips. As seen in Fig. 6, the SSIM values show a rapid increase as the first few strips are registered and applied to the template image, and then the rate of improvement slows with additional registered strips. This trend was observed in both the mean and median averaged images.
The major findings in this paper are as follows: averaging multiple registered sequentially acquired OCT-A images (1) qualitatively enhances the visualization of the retinal microvasculature networks, (2) increases the SNR and CNR of the angiograms, and (3) increases the perceptual visual quality when using SSIM as a metric.
After averaging multiple en face images, the vessels of the deeper capillary plexus are more readily identified, making quantification more reliable and thereby facilitating investigation of its role in the pathophysiology of retinal vascular disease. Although minimal projection artifact corresponding to the larger superficial vessels can still be seen in Fig. 4, the overall qualitative appearance of the en face images is improved.
The SNR and CNR both increased significantly by averaging the individual strips. Although both the SNR and CNR of the mean images are larger than those of the median images, there is no significant difference between the mean and median, and therefore no recommendation of an averaging method can be made based on these metrics.
The SSIM is a full reference metric where the final averaged image was taken to be the perfect quality reference image. Note that each of the 10 volumetric sets for each subject was divided differently, depending on the motion in a particular acquisition, and therefore a different number of strips were used for each averaged reference image. SSIM is a relative metric, and an SSIM value of 1 indicates that the image has reached the same quality as the reference image for that data set. The SSIM values cannot be compared across different images, for example, an SSIM value of 1 for different images does not indicate that they are all of the same quality. As shown in Fig. 6, the SSIM increases with each additional registered strip that is averaged to the template image. As expected when averaging images, the first few strips applied to the template affected the SSIM the most whereas the later strips provided only modest improvement to the SSIM. The deep plexus showed the greatest increase overall. By increasing the visibility of individual vessels, this technique has the potential to improve automated segmentation results thereby improving our ability to quantify capillary density in normal and diseased states.
Although we have demonstrated the ability to enhance the visualization of the retinal plexuses through averaging multiple sequentially acquired OCT-A images, we acknowledge several limitations of this work. This study assessed only relatively young subjects with clear ocular media and good fixation ability. The presence of media opacities in older subjects may limit the amount of capillary information that can be attained from images. Although the algorithm attenuates nonmicrosaccade motion in the registered strips, the template itself may contain distortions and image warping, which are not accounted for here.
The authors have no relevant financial interests in the paper and no other potential conflicts of interest to disclose.
The authors acknowledge funding support from the Natural Sciences and Engineering Research Council of Canada, the Canadian Institutes for Health Research, the Brain Canada Foundation, the Alzheimer Society Canada, the Pacific Alzheimer Research Foundation, the Michael Smith Foundation for Health Research, and the Genome British Columbia.