Optical coherence tomography (OCT) is a three-dimensional (3-D) imaging modality providing high-resolution and high-speed volumetric images with depth of penetration in tissue on the order of a millimeter, which has become more common in clinical and biomedical applications due to the ability to resolve diagnostically relevant features.1 OCT applications are currently being developed for many organs, including the small distal airways of the lung. In order to access these highly constrained and hard-to-reach areas, OCT systems are often catheter based. Catheter-based systems for in vivo clinical imaging have been developed for cardiology, gastroenterology, and pulmonology.12.3.4.–5 Specifically in the lung, OCT can visualize distal airway tissue structures at high resolution and when combined with autofluorescence imaging (AFI), can probe specific molecular components of airway tissue such as collagen and elastin.2,6,7 Therefore, combined OCT–AFI systems can produce complementary information that may enable increased detection and characterization of structural and functional features associated with different lung diseases. Our group has previously reported a combined endoscopic OCT–AFI instrument using a double-clad fiber (DCF) catheter that is capable of detecting pulmonary nodules and vascular networks.7
Successful application of catheter-based OCT for in vivo pulmonary imaging requires overcoming several challenges including motion artifacts associated with the cardiac cycle, breathing, and nonuniform rotation distortion (NURD), that make identification of structures such as blood vessels difficult.4 Cardiac and breathing motion artifacts are more prominent when the heartbeat and respiratory periods are much shorter than the total data acquisition time. Cardiac and respiratory motion artifacts can be reduced to some degree by decreasing the image acquisition time, but even then there remains a need to compensate for NURD artifacts. Catheters using micromotors to directly rotate the optical assembly are expected to have less severe NURD artifacts compared to proximally driven torque cable catheters;8 however, the miniaturization of these catheters to access the narrowest organ sites is limited by the relatively large size of the motor. Moreover, due to the difficulty in fabricating perfectly balanced micromotors, NURD can still degrade image quality.
Understanding and correcting motion artifacts may improve image quality and subsequent interpretation. Several techniques have been proposed to correct NURD in catheter-based OCT systems. Structural landmarks, or fiducial markers, have been used to register successive frames using extrinsic objects.8,9 Reflections from the sheath or optical components of the catheters can also be used for correcting rotational fluctuations caused by NURD.10 In other studies, adjacent A-lines or frames have been registered by maximizing crosscorrelation between the speckle in adjacent search regions.4,11 Another method measures the rotational speed of a catheter by determining the statistical variation in the speckle between adjacent A-lines.12 However, poor tissue apposition regions can result in inaccurate rotational speed interpolation. Yet, methods using cross correlation or phase information may be more sensitive to speckle noise and generally require highly correlated A-line data. Finally, some methods require disabling the pullback entirely.10,11
In this work, the motion artifacts in pulmonary OCT–AFI data sets are estimated from both AFI and OCT images based on azimuthal registration of slowly varying structures in the two-dimensional (2-D) en face image or the calculated en face image of a 3-D image data set. These estimations can be used to correct or reduce such artifacts. We present a new method called azimuthal en face-image registration (AEIR) for motion correction that is applicable to any 3-D or 2-D rotational catheter data with repeating angularly varying values that correlate with physical structures. Performance of the algorithm is evaluated on images generated from NURD phantoms, in vivo OCT–AFI datasets of peripheral lung airways, and known images with simulated artifacts applied.
Materials and Methods
The OCT–AFI system used in this study has been previously described.7 Briefly, the OCT subsystem employs a 50.4-kHz wavelength-swept source (SSOCT-1310, Axsun Technologies Inc., Billerica, Massachusetts) with the illumination centered at 1310 nm with 100 nm bandwidth. The AFI subsystem uses a 445-nm semiconductor laser (CUBE 445-40C, Coherent, Santa Clara, California). The OCT and AFI modalities are combined into a single DCF-based catheter. The catheter fiber-optic assembly consists of a length of DCF (9/105/125-20PI, FUD-3489, Nufern, East Granby, Connecticut) spliced to beam-shaping fiber optics (comprised of step-index multimode, graded-index, and angle-polished no-core fibers). A rotary-pullback drive unit allows volumetric OCT–AFI imaging of airways up to 7 cm in length. The OCT and AFI signals are collected simultaneously and custom data acquisition software collects and processes the data for immediate display.
Phantom and In Vivo Imaging
The NURD phantom was a 3-D-printed object that contained eight parallel, evenly spaced features that were oriented along the path, such that deviations from the expected geometry due to NURD could be quantified.13 This phantom can be created for catheters of various diameters and with complex imaging paths with multiple bends. OCT–AFI of this phantom was obtained to enable the identification of NURD artifacts.
In vivo pulmonary OCT–AFI imaging of human subjects was approved by the Research Ethics Board of the University of British Columbia and the British Columbia Cancer Agency. Informed consent was obtained from all participants, and optical imaging was performed during flexible bronchoscopy under local anesthesia applied to the upper airways and conscious sedation.
Motion Correction Method
In our study, the AFI and OCT imaging modalities generate 2-D and 3-D images, respectively. Each logarithmic-scaled OCT frame or AFI was resized to either 504 or 512 A-lines or pixels along the rotational direction depending on the original number of A-lines, which was variable between different image acquisition sessions. The approach used for motion correction, AEIR, is based on calculating the correlation between pixels along the rotational direction and the corresponding adjacent (in direction of pullback) pixels from a previous frame. This method assumes that slowly varying structures exist in the direction of the pullback in the en face image and that continuous angular mismatch corresponding to motion artifact can be estimated from these structures. These visible structures arise from biological features, such as the vascular networks, collagen network, and alveoli. The features that are detected in the 2-D image can be used to assess the degree and form of the motion artifacts.
The rotational catheter generates a continuous stream of equally time-spaced pixels or A-lines respective to AFI or OCT images with index . We represent the 2-D AFI or 2-D projection of OCT volume (en face image) as where is the frame index (integer division of by number of pixel or A-lines per frame), and is the rotation index position in pixels (remainder after division). Although and are functions of time, for simplicity, the time dependency of and is not explicitly stated here. If no abrupt discontinuities associated with motion artifacts occur, the continuity of the motion and the slow variations make the motion artifact problem ideally qualified for treatment with windowed dynamic time warping (WDTW).11 WDTW is a dynamic programming (DP) technique for matching and aligning two time series, by finding the optimal continuous path through a cost matrix while restricting search range within the matrix. The cost matrix measures similarities between pixels in adjacent frames. This optimal continuous path representing the correlations between adjacent rotations can be used to align the pullback data.
A quick overview of the proposed AEIR method is presented in following steps:
• Select data from a 2-D image (AFI) or calculate an en face projection of a 3-D image (OCT)
• Select two pullback frames and ;
• Construct a cost matrix by comparing a strip, , centered on within frame with multiple strips () within frame using Eq. (1);
• Rescale the cost matrix to reduce noise and ensure proper connectivity constraints by and factors;
• Compute the DP solution for the optimal continuous path representing the estimated motion artifacts;
• Rescale the path to its original size in the image;
• Apply the correction by reversing the obtained path.
• (For a 2-D image) A motion-corrected image is generated.
• (For a 3-D image) Apply the same correction to the original frames; moving A-lines in conjunction with their calculated en face pixel, this will give a motion-corrected 3-D image.
As the proposed method uses en face images, a mean intensity projection along the A-line for each B-scan is obtained for 3-D image data sets, which results in a 2-D image . When compared to the maximum intensity projection, we have found that the mean intensity projection gives an en face image with higher contrast. The en face image is smoothed using a median filter to reduce speckle noise in the image.
Figure 1 shows the first two steps needed to construct the cost matrix. The algorithm proposed here uses strips of length centered on each pixel along the rotational direction ( direction) as in Fig. 1(a) on the image (at the beginning and end of a frame, strips reach into the neighboring en face image column to the temporally closest pixels). Each strip from ()’th frame is compared to the corresponding () strips from the previous pullback frame , ; represents the ’th pixel/strips in the en face pullback frame and is a parameter of the algorithm determining the number of strips in the ’th column to be compared with them [Fig. 1(b)]. The strips are compared using the following equation as the measure of similarity to construct the cost matrix
In our method, each frame is corrected one by one and the corrected frame is used as the reference of comparison to correct the next frame. In order to maintain the continuity of a frame to its next frame, the cost matrix for the ()’th frame, , is concatenated to to construct , where . This cost matrix is resampled by stretching the vertical -direction with a parameter (to get subline precision) and downsampled along with a parameter reducing noise as well as constraining angle steps. The optimal continuous path through the cost matrix representing motion artifacts, which accounts for the continuous rotation of the catheter, can be found using DP, and then resampled to its original size. Image correction can be applied by reversing the obtained optimal path; pullback columns are aligned by replacing each pixel with one that is shifted based on the obtained path.14 The same correction is applied to the 3-D frames since each pixel in the en face-frame corresponds to an A-line in the 3-D frame. For this work, all images were processed in MATLAB®, and the interpolation methods were specified to use “bicubic” for resizing images and “Pchip” for aligning pixels or A-lines in MATLAB® R2014a.
For our system, the AFI and OCT images are obtained simultaneously and are therefore subject to the same motion artifacts. For motion correction of 3-D OCT images, we can use corrections from either the AFI or en face OCT image and apply it to the 3-D OCT frames. These two different correction options are denoted as () and (). We have applied our technique to the AFI and OCT images of an NURD phantom and in vivo clinical pulmonary images.
Van Soest et al.11 previously used DP for correction of NURD artifacts in OCT images called azimuthal registration of image sequences (ARIS). They used the L2 norm to measure similarity and calculated the cost matrix from full A-lines of OCT frames being aligned. In order to compare the corrections based on full A-lines in the OCT frame and the -pixel strip in the en face image, we also applied a similar method to construct the cost matrix based on full A-lines using Eq. (1). The result of the correction from this method is denoted as .
In order to quantitatively characterize the correction for each method, we have evaluated the correction using two approaches.
In the first approach, we have quantitatively evaluated the correction on the NURD phantom images. Each image contains four gray and black strips that create eight edges in total. Motion artifacts make length of the edges to be longer than an ideal straight edge. We detected the edges in each image and measured their lengths by calculating the Euclidean distance between each pixel along the edge. The measured lengths were normalized by length of an ideal strip with no motion artifacts. The average normalized length () was calculated for phantoms images.
The second approach for quantitative analysis of motion corrections was evaluated on both phantom and in vivo images. To perform a quantitative analysis of the amount of correction needed for each image and the amount of correction applied by each of the methods one needs to know the ground truth image, the starting image without detectable motion artifacts, to which a known amount and type of motion artifact is added, then the ground truth image with artifact can be corrected by the algorithm and the correction adjustments compared to the known applied artifact. In other words, we need to know the artifacts in the image to compare against the applied correction. For this purpose, we have simulated motion artifacts in endoscopic OCT and AF images with frequencies similar to those observed in our NURD phantom and in vivo image data sets. We have observed motion artifacts in these images to be noisy sinusoidal patterns along the pullback direction with different frequencies depending on the type of artifact;4,8,12,13 e.g., heart beat (frequency 1 to 2 Hz) and breathing artifacts () are generated with their respective frequencies, whereas nonbiological NURD artifacts can have high and/or low frequencies as it also can be seen in Figs. 2Fig. 3Fig. 4Fig. 5–6(a).
We created a model to simulate these motion artifacts that consists of a combination of wavelets for each respective type of artifact along pullback direction. Each wavelet can be calculated simply by placing a Gaussian envelope over a sine wave with a corresponding frequency for each artifact type.Figs. 8–11(g). Two simulated artifacts are applied to an NURD phantom and in vivo images as shown in Figs. 8Fig. 9Fig. 10–11(b).
A 3-D digital phantom with no artifacts, an NURD phantom, and an in vivo image with limited observable artifacts were the ground truth images we used in this study. We generated a digital 3-D phantom with four circular targets at the same depth in each frame. Each circular target has a different radius and intensity, which gives a similar pattern to the NURD phantom’s image. For the NURD phantom and in vivo image, we have chosen a scan/pullback with little visible motion artifacts. We have applied our method iteratively to correct for unobservable artifacts (that are detectable by the algorithm) until there was little change between corrections. The en face (linear mean) projections of the digital phantom, precorrected NURD phantom, and in vivo image were used as the ground truth images. Two simulated artifacts were applied to these ground truth images to generate ground truth images with artifacts for each image. These images were then corrected using three different correction methods. Each correction method generated a correction matrix () representing the artifacts detected in the image to be corrected. This correction matrix was compared to the artifact matrix to quantitatively evaluate the degree of correction achieved by each of the methods.
Two parameters were defined to quantitatively evaluate the amount of correction each method accomplishes: (1) correlation coefficient () and (2) average compensated difference (). The correlation coefficient was calculated between the correction and artifact matrixes using the following equation:
The difference between and was defined as the difference matrix (). In the correction method, each frame is compared with its previous frame; thus, an error in a frame can propagate to all subsequent frames. To compensate for this possible accumulation of an error measure over multiple subsequent frames and to localize the mistake to its original frame, a subtraction of the previous frame values was calculated for frame two and above and denoted as the compensated difference matrix (). The average of was named .
Figure 7(c) shows good correspondence between the visual assessment of the amount of artifacts corrected and this figure of merit.
NURD Phantom and In Vivo Image Corrections
Performance of the correction methods on the 3-D images was visually examined using the en face images. Different sets of parameters () were evaluated to apply correction where the parameters were allowed to vary between , , , and with step size 20 for and , and 1 for and . The optimal parameters for method on our datasets were found to be , , , and based on the visual assessment of correction performance results and the average run-time normalized to a per frame value. We also quantitatively evaluated the choice of the parameters. The same set of correction parameters were evaluated while applying simulated motion artifacts on both the NURD phantom image and an in vivo image. We show quantitative results for four sets of parameters on an NURD phantom in Fig. 2. As it is seen, the quantitative metrics also confirm the visualized optimization set of , , , and to have equal or better performance than the other sets while having the lowest computational cost. Although and values are comparable for the four sets, they are more similar for 20-20-5-1 and 100-60-5-1. The correction parameters were selected to be 20-20-5-1 by considering the run-time. (All subsequent figures in this paper were processed using these optimized parameters.)
The method also using , , and parameters was applied on the OCT images with the same parameter as AEIR methods (the optimal parameters were also the same, , , and ). Although the optimized parameters were selected as , , and by Van Soest et al.,11 we could not visually detect any difference between the correction results of these two sets of parameters. To achieve the best correction results for this method, the 3-D OCT data also had to be smoothed. We used a median filter to do intraframe filtering (along the A-line and azimuthal direction on each frame) and a mean filter of size five frames for interframe averaging.
Figure 3 compares the results of applying the three correction methods on the same NURD-phantom OCT image; we have presented the mean projection en face image of the corrected 3-D images to show the performance of correction methods on the 3-D images. As seen in Fig. 3, motion correction with our technique appears much more effective than the previously published method11 using full A-lines. Results from and corrections are comparable to each other. The AF image and en face OCT images from the NURD-phantom are similar, so we present only the results applied to OCT images in Fig. 3.
The method was applied to an in vivo 2-D AF image in Fig. 4. This technique demonstrates significant correction of both NURD (as seen by the reduction of the high frequency oscillations in the inset image) and cardiac/breathing artifacts (reduced large lower frequency oscillations) in these images. There is noticeable motion correction in Fig. 4 due to correction method.
Results of the three different methods on an in vivo 3-D OCT image are shown in Fig. 5. We also present AF images corresponding to each en face image with the same corrections as its en face image for better visual evaluation. Motion artifacts including NURD cause wavy patterns in the en face image as well as deformation of structures. After applying our correction method to the images, it is noticeable that performance of and OCT- are much better methods for reducing the wavy patterns than using the previously published method on 3-D OCT guided by A-line correlations. These techniques demonstrate significant correction of both NURD [as seen by the reduction of the high frequency oscillations in the enlarged orange box in Fig. 5(b)] and cardiac/breathing artifacts [as seen reduced high-amplitude, lower frequency oscillations in enlarged black box in Fig. 5(a) and yellow arrow in Fig. 5(b)] in these images.
In Fig. 6, dashed rectangles show where the method seems to have corrected the motion artifact better. The method did poorly in these areas. The yellow arrows show a region where did better with correction as there were contrast between structures to find artifacts and its correction.
The run-time is the average time required to apply the correction to all frames of one image. The average run-time was 0.10, 0.10, and 0.25 s per frame for , , and , respectively.
Average normalized length for raw image and corrected images with three different methods.
For an ideal strip , and an closer to one indicates better NURD correction. Using a student’s test to compare the edge lengths between the four images we found that all corrections resulted in edge length data (shorter) that was highly significantly different (, two-tailed test) from the raw image data. Similarly, the method presented here resulted in edge data that was highly significantly different (, two tailed test, shorter) than the edge length data from the previously published algorithm (OCT). We expect and methods have the same correction since the en face and AF image for the phantom are similar; however, there is a small difference between them due to differences in their image contrast. The detected edge lengths for the two methods ( and ) were not found to be significantly different as the test -value was found to be 0.386.
In the second approach, quantitative analyses of motion corrections on the NURD phantom and in vivo images were evaluated using two parameters, the correlation coefficient and the average compensated difference. These two parameters together evaluated the performance of the correction methods. Different artifacts have been applied to the same images, and restoration attempted with the different correction methods to evaluate the reproducibility of the methods. Figure 7(a) shows the eight motion artifacts we simulated and then applied to a NURD phantom and an in vivo image. These images and applied simulated motion artifacts were then corrected by the three correction methods. Figures 7(b) and 7(c) show the previously discussed two metrics evaluated on a 3-D OCT in vivo image with the eight artifacts. The reproducibility of each method across the eight simulated motion artifacts can be analyzed by comparing results of the correction methods for the different artifacts as seen in the box plots of Figs. 7(b) and 7(c).
A more detailed examination of artifacts 1 and 2 in Fig. 7 follows. Figures 8Fig. 9Fig. 10–11 show results of the artificial artifacts 1 and 2 on OCT en face images, their corrections, and the comparison between the artifacts and the corresponding three correction methods. The results of the application of the artifacts a.1 and a.2 and their corrections on an NURD phantom image are shown in Figs. 8 and 10, respectively, and on an in vivo image in Figs. 9 and 11, respectively. Figure 12 shows artifacts 1 and 2 on the corresponding in vivo AF images.
In Figs. 8Fig. 9Fig. 10–11, the original image is shown in (a). The artificial artifact was applied to this image and the result shown in (b). The corrected images using , , and methods are shown in (c–e). The artifacts and three correction matrices, where each pixel represents the corresponding pixel shift in each matrix, are shown in (g–j). For an excellent correction method, the correction matrix should be the same as the artifact matrix; in other words, the correlation of these two matrices should be 100%. is shown in images (l–o), where (l) has a correction matrix identical to the artifact matrix. The average compensated difference is calculated based on matrices in sections (l–o). The average pixel shift for each frame (row) of the artifact and correction matrices is shown in (f). The difference between the average pixel shifts of the artifact matrix and the correction matrices are shown in (k).
The two correction evaluation parameters for the NURD phantom and in vivo images are shown in Tables 2 and 3, respectively. The performances of the correction methods were evaluated considering both parameters. The -value is larger for and compared to indicating more similarity between the correction and artifact matrices. is closer to zero for in all cases. Although is smaller for than in phantom images, it is bigger in the in vivo ones. One reason might be due to some uncorrected imaging artifact still present in the corrected ground truth image even after multiple iterations of the correction process. As an example, there are two leaps in average pixel shift at frames 322-323 and 343-344 that cause to perform less optimally than [Fig. 10(f)]. This might arise from the corrected ground truth image, which is distorted dependant on the alignment algorithm (the AF data). It would be reasonable to assume that the same algorithm, based on the same data, is more likely to return to its previous stable state. The other two methods would likely have different stable states, and a nonoptimal error metric even if no motion artifact is applied, this could put them at a disadvantage.
Two parameters were calculated for quantitative and reproducibility analysis of the correction methods for artifacts 1 and 2 on NURD phantom image as shown in Figs. 8 and 10.
|Parameter||Correction matrix based on|
Two parameters were calculated for quantitative and reproducibility analysis of the correction methods for artifacts 1 and 2 on in vivo image as shown in Figs. 9 and 11.
|Parameter||Correction matrix based on|
Our procedure allows for correcting motion artifacts in rotary-pullback 2-D and 3-D image modalities along the azimuthal direction. For 3-D images, motion artifacts along the radial direction (A-lines) are not detected nor corrected with our method. Our method corrects and aligns images along the azimuthal direction using the mean projection of A-lines for better registration of the calculated cost matrix from en face contrast within its strips than the full A-lines data. Thus, proposed methods do not correct for radial artifacts on 3-D data. On the other hand, there are no radial artifacts originating from NURD, and only may originate from in vivo cardiac and breathing motions, which could be reduced by shorter scan times as mentioned in Sec. 1.
Performance of the correction methods on the en face image of the 3-D images was visually examined, and we concluded the AEIR methods are correcting for motion artifacts and improving visual image quality. The correction performs more artifact removal than the method for images that have strong AF signal. The AF guided method performs poorly when there is no AF signal from tissue, e.g., lumen. The method needs the en face image projection to have structures with good contrast to enable a good correction of motion artifacts in the image based on our results. In addition, the correction may get misled when a feature is not parallel to the pullback direction; also when there are no features present (e.g., lumen), artifacts may not be found and corrected.
Van Soest et al. applied a DP method to stationary 3-D OCT images for NURD corrections and compared full A-lines in B-scans to construct cost matrix using the L2 norm. However, we used the formula in Eq. (1) to construct the cost matrix, which is not a norm, and it strongly penalizes cost paths with nonoptimal intermediate steps. We have also tried using the L2 and L1 norms to construct the cost matrix; however, our method did not converge on the optimal continuous path using these norms for all images. We have found that comparing full A-lines are not sufficient due to reduced or absent feature correlation between A-lines in the data we have collected. Motion correction with our technique, using the enface contrast within the strips appears much more effective since each strip is a mean projection of W A-lines, which were compared to each other. Although the full A-line had more pixels, it was depth information, which was not used for azimuthal registration and motion correction.
Our method calculated the correction of motion artifacts about two to three times computationally faster than the method since we were using the en face image for correction rather than its full 3-D stack/OCT-volume. Our method may be applied in real time since it only needs two frames, one to be corrected and one is its previous frame. We have applied this method to multiple pullback catheter images and have shown that our methods can be guided by either or on in vivo images. We have concluded that and can be complimentary to each other when we have both modalities because they may be more effective in different parts of the pullback, and there is more likelihood of strong contrast existing in at least one of the modalities when combined than considered alone. An improved version of this algorithm could conceivably be constructed by making use of correlations in both modalities simultaneously for estimating motion artifacts. In the case of a dual modality imaging, e.g., OCT-AFI, these two methods could be combined to provide more complementary and efficient corrections.
The reproducibility of each method was analyzed by comparing results of the correction methods for different artificial artifacts. The reproducibility of the correction methods depends on artifacts needing to be corrected.
Based on our visual evaluation of the corrected images as well as quantitative analysis based on two metrics, we conclude that overall and appear to correct a larger fraction of the visible artifacts than does .
Our method allows applying the motion correction to 2-D images. Motion corrections of 2-D AFI were obtained by the method, and it might be generalized to other 2-D images for motion corrections.
In summary, correction of distortions of tissue features due to motion artifacts may enhance image interpretation of OCT-AFI. This enhancement may aid biopsy-guidance applications, diagnosis of neoplastic tissue, and/or aid monitoring disease progression in patients. Finally, the method could reduce motion artifacts from 3-D OCT catheter rotary-pullback data sets, and method can reduce motion artifacts for 2-D AF images. These 2-D and 3-D motion corrections methods may be generalized for other 2-D and 3-D image modalities to apply motion corrections.
The authors have no relevant financial interests in this article and no potential conflicts of interest to disclose.
This work was supported by Canadian Institutes of Health Research (CIHR) (ppp-141717).