Capillaries are embedded within the organs of the human body. As blood vessels of the smallest caliber, they are an important supply channel for blood, nutrients, gas exchange, and waste disposal of the cells.1 Changes in the architecture of the capillary network can be symptomatic of organ aging, inflammatory processes, disease development, and wound healing.2
Using optical coherence tomography angiography (OCTA), three-dimensional (3-D) maps of capillary perfusion are obtained in vivo with high resolution. Different methods have been developed for OCTA image reconstruction.3–9 They are based on the movement of red blood cells within the blood stream. These particles cause a variation in the OCT signal intensity and a shift in frequency of the backscattered light due to the Doppler effect.6 By contrasting the perfusion with static tissue, a 3-D angiographic image is generated. Often, such an angiographic volume is projected to a two-dimensional (2-D) image to condense the angiographic image and to avoid the appearance of image artifacts, such as projection shadows.10 Such an en face maximum-intensity projection (MIP) displays the capillary network of a tissue slab at a glance. Color coding can be used to additionally visualize the depth of the vessels.
For the comparison of the capillary structures across a cohort, such as in clinical studies or the preclinical research environment, it is necessary to extract the quantitative characteristics of the vascular network. Quantitative metrics are, for example, the number of vessels, their diameter, and shape, as well as the density and complexity of the vascular network.5,11 To derive these metrics from an angiographic image, first, a segmentation is required, which classifies each pixel as either vessel or background. To obtain such a segmentation map, manual labeling by an expert is the gold standard.5,12–14 Since this is a time-consuming procedure and the outcome might vary from expert to expert, efforts have been focused on the automatic segmentation of the vascular structures.
In the field of computer vision, segmentation is an important task. Often a pipeline composed of different image-processing steps is employed. Such a pipeline usually consists of a preprocessing step, which aims to remove noise and to enhance the contrast between foreground and background, a binarization step that separates the object and background, and a post-processing step that refines the segmented object boundaries.5,11,15
In computer vision, it is common to evaluate the results of automatic segmentations against a manual ground-truth segmentation by an experienced rater.16–18 The congruence of the ground truth and automatic segmentation can be described by the number of correctly and falsely classified pixels.5,16–18 Further, to derive quantitative characteristics of the vascular architecture, such as the number of vessels, their curvature, or the complexity of the network, it is common to reduce the binary vascular map to its skeleton,5 which substitutes each vessel with its 1-pixel thick centerline.
Most of the published algorithms for vessel segmentation and analysis combine these steps of image processing in various ways. For an interested user it is hard to compare the quality of the proposed algorithms and to estimate the segmentation error that a specific method carries. It is also impossible to directly compare findings of different studies based on OCTA with automatic evaluation and to critically assess published results. Furthermore, in dermatology, OCTA imaging is progressively applied by clinicians in office or bedside situations.19,20 In these real-world applications, imaging artifacts may be introduced due to movement of the bulk tissue, defocus21 and shading by air bubbles or lesions. Unfortunately, methods proposed in literature to segment and analyze vascular networks have not been evaluated for robustness against such artifacts that are common for OCTA imaging of the skin.5,11,15,22,23 However, with OCT as a valuable tool for clinicians and scientists, methods for an automated and robust vessel detection are demanded for a reliable quantification.
In our case, OCTA images of in-vivo mouse skin were obtained in order to observe the wound healing processes of capillaries after fractional photothermolysis.24 With this treatment, the vascular architecture changes dramatically and requires a segmentation method that can handle very different appearances and qualities of the OCTA images.
This work demonstrates the creation of an optimization-based segmentation pipeline for angiographic images of skin. It is capable of coping with imaging artifacts and especially is suitable for dermatological applications. A variety of state-of-the-art image-processing methods for denoising, contrast enhancement, binarization, and refinement were systematically tested and combined to a pipeline to optimize the quality of the segmentation and the validity of quantitative metrics. Thanks to our optimization-based vessel segmentation (OBVS), we are now able to analyze even low-quality data of a longitudinal mouse study, which would have been discarded otherwise. Although this optimization was done for OCTA images of cutaneous microvasculature, it might be applicable to retinal and cerebral OCTA images.
Material and Methods
A subset of 10 different OCTA volumes acquired during a study of cutaneous photoaging in mice was chosen to find the optimal pipeline for vessel segmentation (see Fig. S2 in the Supplementary Material). This subset of OCTA volumes contains examples of different imaging situations that occurred during this study, such as artifact-free images under optimal conditions; slightly defocused images due to epidermal thickening; and images corrupted by motion artifacts, lesions, or air bubbles between skin and objective. The sample volumes were chosen to be representative of skin studies in rodents.
To find the optimal segmentation approach (see Sec. 3.2) the set was split into a training group and a test group with five images each.
OCTA imaging was performed using the commercially available spectral-domain OCT scanner TELESTO II (Thorlabs Inc., Newton, New Jersey). The device operates at a central wavelength of 1300 nm and an axial resolution of in tissue. Using a lens with a lateral resolution of (LSM03, Thorlabs Inc.), we acquired two times oversampled images with a voxel size of . Field of view (FOV) was (); A-Scan rate was 76 kHz. For further analysis with reasonable cost for manual labeling, the angiographic volumes were cropped to , showing the capillary layer in between and depth in skin. After angiographic image processing (see Sec. 2.2), the volumes were converted to 2-D images as MIPs.
OCTA images were acquired in the dorsal region of nude mice (Foxn1nu). The tissue was immobilized by using a z-spacer (IMM3, Thorlabs Inc.) in front of the scanner with glycerol as immersion fluid. During an imaging session, the mice were anesthetized with isoflurane (3% induction, 1% to 1.5% maintenance) in 80% air and 20% oxygen. All procedures were approved by the Institutional Animal Care and Use Committee (IACUC) of the Massachusetts General Hospital (Protocol No.: 2015N000170).
Angiographic Processing of the OCT Data
We evaluated numerous other angiographic algorithms and settings, such as optical microangiography (OMAG),5,6 correlation mapping (cmOCT),7 complex differential variance (cdvOCT),8 and complex correlation (ccOCT).9 Example images of these algorithms are given in Fig. S1 in the Supplementary Material. The contrast and signal-to-noise ratio (SNR) were compared among these methods using formulas as proposed by Zhang et al.6 and Lozzi et al.25 However, as the results of these two assessments contradicted each other, we were unable to clearly identify the best performing method by contrast or SNR. Visually, in our case, the svOCT method generated more detailed maps of capillary perfusion than cmOCT, cdvOCT, and ccOCT, while it appeared less noisy and richer in contrast than OMAG.
All computations, including the optimization of the segmentation described in Sec. 3.2, were performed using MATLAB (R2016b, MathWorks, Natick, Massachusetts) running on a Dell Precision T1700 with an Intel Xeon(R) CPU with four cores (eight threads) at 3.6 GHz.
For each of the 10 images, four manual segmentations were created by the same rater using the ITK-SNAP tool.26 Even though manual segmentation is the gold standard for precise labeling, it is prone to inaccuracy and variation due to ambiguous pixels and the subjective decisions of the rater. Using the Simultaneous Truth and Performance Level Estimation (STAPLE) algorithm, as proposed by Warfield et al.,13 the most likely true, common segmentation is computed by iteratively assessing the rater’s performance and weighting each rating session in a combined segmentation. STAPLE terminates with a probability map that gives, for every pixel, the probability that it truly belongs to the foreground.
To obtain a binary mask, we further considered every pixel with probability of 75% and more to be the correct foreground and every other pixel as background (Fig. 1). These segmentations were used as the ground truth to optimize and evaluate the quality of the automatic segmentation algorithms.
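The consensus step can be illustrated with a minimal Python sketch of binary STAPLE followed by the 75% cut. It is an illustration only, not the MATLAB code used in this work: the function names `staple_binary` and `consensus_mask`, the initial values of 0.99, and the fixed iteration count are assumptions, and images are flattened to plain lists of 0/1 labels.

```python
def staple_binary(decisions, n_iter=50, p0=0.99, q0=0.99):
    """Binary STAPLE by expectation-maximization (illustrative sketch).

    decisions[j][i] is 1 if rating session j labeled pixel i as vessel.
    Returns (w, p, q): per-pixel foreground probability, and the estimated
    sensitivity p[j] and specificity q[j] of each rating session.
    """
    n_raters = len(decisions)
    n_pixels = len(decisions[0])
    # Prior foreground probability: overall fraction of vessel labels.
    prior = sum(sum(d) for d in decisions) / (n_raters * n_pixels)
    p = [p0] * n_raters
    q = [q0] * n_raters
    w = [0.0] * n_pixels
    for _ in range(n_iter):
        # E-step: probability that each pixel is truly foreground.
        for i in range(n_pixels):
            a = prior
            b = 1.0 - prior
            for j in range(n_raters):
                if decisions[j][i]:
                    a *= p[j]
                    b *= 1.0 - q[j]
                else:
                    a *= 1.0 - p[j]
                    b *= q[j]
            w[i] = a / (a + b) if (a + b) > 0 else 0.0
        # M-step: re-estimate the performance of each rating session.
        sum_w = sum(w)
        sum_not_w = n_pixels - sum_w
        for j in range(n_raters):
            if sum_w > 0:
                p[j] = sum(w[i] for i in range(n_pixels)
                           if decisions[j][i]) / sum_w
            if sum_not_w > 0:
                q[j] = sum(1.0 - w[i] for i in range(n_pixels)
                           if not decisions[j][i]) / sum_not_w
    return w, p, q


def consensus_mask(w, threshold=0.75):
    """Binarize the probability map: foreground if probability >= threshold."""
    return [1 if wi >= threshold else 0 for wi in w]


# Four rating sessions of the same six pixels (toy data).
sessions = [
    [1, 1, 1, 0, 0, 0],
    [1, 1, 0, 0, 0, 0],
    [1, 1, 1, 0, 0, 0],
    [1, 1, 1, 1, 0, 0],
]
w, p, q = staple_binary(sessions)
mask = consensus_mask(w)
```

In this toy example the third pixel, marked by three of four sessions, ends up in the consensus, while the fourth pixel, marked only once, does not.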
Segmentation Quality Metrics
During the optimization process (see Sec. 3.2), the quality of a segmentation result was evaluated by its congruence with the ground truth.5,17,27 Each pixel of the binary map was classified as either true-positive, true-negative, false-positive, or false-negative. Based on the number of pixels in each set, the following quality metrics are derived for a segmentation:
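The metrics named throughout this work (sensitivity, specificity, accuracy, and Youden’s index J) follow from the four pixel counts by their standard definitions. The Python sketch below shows one way to compute them; it is not the authors’ MATLAB implementation, and the function names and the nested-list image representation are assumptions made for illustration.

```python
def confusion_counts(segmentation, ground_truth):
    """Count TP, TN, FP, FN pixels for two binary maps (nested lists of 0/1)."""
    tp = tn = fp = fn = 0
    for seg_row, gt_row in zip(segmentation, ground_truth):
        for s, g in zip(seg_row, gt_row):
            if s and g:
                tp += 1
            elif s and not g:
                fp += 1
            elif g:
                fn += 1
            else:
                tn += 1
    return tp, tn, fp, fn


def quality_metrics(segmentation, ground_truth):
    """Standard pixel-wise quality metrics of a binary segmentation."""
    tp, tn, fp, fn = confusion_counts(segmentation, ground_truth)
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "accuracy": accuracy,
        # Youden's index combines both rates: J = sensitivity + specificity - 1.
        "youden_j": sensitivity + specificity - 1.0,
    }


seg = [[1, 1, 0, 0],
       [1, 0, 0, 0]]
gt = [[1, 0, 0, 0],
      [1, 1, 0, 0]]
metrics = quality_metrics(seg, gt)
```

J ranges from -1 to 1; a value of 0 corresponds to a segmentation that is no better than chance, which is why it is a convenient single objective for the optimization.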
Vascular Quantitative Metrics
Quantitative descriptive metrics of vascular networks were obtained from the skeleton of the segmentation map to characterize its architecture and condense the information into single values. This is especially helpful when comparing vascular network characteristics among many angiographic images, such as in large datasets of clinical studies. We utilized vascular quantitative metrics as proposed in previous work on OCTA.5,6,15,17
The fractal dimension (FD) quantifies the complexity of a vascular network28 and is obtainable from both the binary segmentation and the skeletonized map. It has been observed that it gives more characteristic results for skeletons.5
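A common estimator of the FD is box counting: the map is covered with boxes of decreasing size, and the FD is the slope of log(number of occupied boxes) against log(1/box size). The text does not state which estimator was used, so the following Python sketch is an assumption; the function name and the choice of box sizes are illustrative.

```python
import math


def box_counting_dimension(pixels, box_sizes=(1, 2, 4, 8)):
    """Estimate the fractal dimension of a set of foreground pixels.

    pixels: iterable of (row, col) coordinates of vessel or skeleton pixels.
    Returns the least-squares slope of log(N) versus log(1 / size), where N
    is the number of boxes of edge length `size` containing at least one pixel.
    """
    pixels = list(pixels)
    if not pixels:
        raise ValueError("at least one foreground pixel is required")
    xs, ys = [], []
    for size in box_sizes:
        occupied = {(r // size, c // size) for r, c in pixels}
        xs.append(math.log(1.0 / size))
        ys.append(math.log(len(occupied)))
    # Ordinary least-squares slope.
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    num = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    den = sum((x - mean_x) ** 2 for x in xs)
    return num / den


# Sanity checks: a filled square is 2-D, a straight line is 1-D.
square = [(r, c) for r in range(16) for c in range(16)]
line = [(0, c) for c in range(16)]
fd_square = box_counting_dimension(square)
fd_line = box_counting_dimension(line)
```

A capillary skeleton typically falls between these two limits, with denser and more branched networks giving values closer to 2.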
To assess the tortuosity of the skeleton, Bullitt et al.29 proposed to integrate the angular changes along the vessel segments and normalize by their length. This sum of angles metric (SOAM) is effective to recognize abnormalities characterized by high-frequency, “low-amplitude” coils.
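For a skeleton taken from a 2-D projection, the metric reduces to the sum of in-plane turning angles along a vessel segment divided by its length. The Python sketch below covers only this 2-D case and omits the torsional component of the original 3-D formulation; the function name and the point-list representation of a vessel segment are assumptions.

```python
import math


def sum_of_angles_metric(points):
    """2-D sum-of-angles tortuosity of one vessel segment (radians per length).

    points: ordered list of (x, y) centerline coordinates.
    """
    if len(points) < 3:
        return 0.0
    total_angle = 0.0
    for p0, p1, p2 in zip(points, points[1:], points[2:]):
        v1 = (p1[0] - p0[0], p1[1] - p0[1])
        v2 = (p2[0] - p1[0], p2[1] - p1[1])
        cross = v1[0] * v2[1] - v1[1] * v2[0]
        dot = v1[0] * v2[0] + v1[1] * v2[1]
        # Unsigned turning angle between consecutive direction vectors.
        total_angle += abs(math.atan2(cross, dot))
    length = sum(math.hypot(b[0] - a[0], b[1] - a[1])
                 for a, b in zip(points, points[1:]))
    return total_angle / length if length > 0 else 0.0


straight = [(0, 0), (1, 0), (2, 0), (3, 0)]
corner = [(0, 0), (1, 0), (1, 1)]
soam_straight = sum_of_angles_metric(straight)
soam_corner = sum_of_angles_metric(corner)
```

On pixel-level skeletons the result depends on how densely the centerline is sampled, so in practice the points are usually resampled or smoothed first.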
To assess these metrics, we implemented the code in MATLAB (R2016b, MathWorks, Natick, Massachusetts). The resulting values were further used to estimate the quality of OBVS and other established methods, as listed in Table 1. The methods were benchmarked against the ground-truth skeletons, which were obtained here by morphological skeletonization18 of the STAPLE result.
Different vessel segmentation algorithms for the evaluation of OCTA images, whose performance is compared to that of OBVS. Methods have been derived from the publications shown. We have optimized the parameters for maximal Youden’s index J with respect to the corresponding ground truth for the five training images.
|No.||Methods and parameters for segmentation||Site||Authors|
|1.||Fixed threshold binarization [optimal: th = 1.066 * mean(OCTA-signal)]; skeletonization: Voronoi30||Skin||Liew et al.,22 Carter et al.23|
|2.||2.1 k-means threshold binarization; 2.3 Particle filtering (delete objects < 100 px); 2.4 Dilation (disk, radius = 1 px); 2.5 Filling, skeletonization: distance transform ()||Retina||Khansari et al.11|
|3.||3.1 Global threshold for noise removal; 3.2 Vesselness31 (, 10, , , 10); 3.3 Local adaptive threshold (radius = 9), skeletonization: morphological||Retina||Chu et al.,15 Reif et al.5|
|4.||4.1 Niblack binarization32 (window = 32 px, ); 4.2 Opening (cube, radius = 8), skeletonization: morphological||Skin||Lozzi et al.25|
Optimizing Parameters of Four Previously Used Algorithms for Segmentation of Capillaries
Thus far, previous work in OCTA has mainly been applied to images of the retinal11,15,33 and cerebral vasculature. Both show vessels in low-scattering environments. The imaging and segmentation of capillaries in skin, as shown in mouse ear by Reif et al.,5 and in human skin by Liew et al.22 and Carter et al.,23 are more challenging, as skin is a more strongly scattering imaging environment and is more likely to cause image artifacts.
As a benchmark for OBVS, we have implemented four previously used algorithms for vessel analysis from OCTA images and optimized the respective free parameters for best overall congruence of the five training images with the respective ground-truth segmentation. Table 1 outlines these algorithms together with the parameters, which are chosen by maximizing the Youden’s index for the training datasets in a similar fashion as described for OBVS in Sec. 3.2.
Optimization of Processing Pipeline for OBVS
The segmentation pipeline consisted of the consecutive modules denoising, contrast enhancement, binarization, and refinement. Within each of these stages, several methods were evaluated for their impact on the quality of the result. As indicated in Fig. 2, for each method the input parameter combination was optimized with regard to Youden’s index J, as it combines the specificity with the sensitivity metric. Real-valued parameters, such as noise level or filter strength, were optimized using the Nelder–Mead simplex method,34 whereas integer parameters, such as kernel size, were set consecutively and analyzed. Each optimization step was performed on the whole set of training images.
First, to reduce noise and imaging artifacts in the angiographic images, methods of denoising were tested. We evaluated the Gaussian filter, median filter, bilateral filter,35 anisotropic diffusion filter,36 nonlocal means filter,37 BM3D filter,38 and the Hessian-based vesselness filter.31 Each of these methods was controlled by different input parameters, e.g., kernel size, filter strength, or search area. As indicated in Fig. 2, the optimal combination of input parameters was found using the downhill simplex optimization method34 and iteration over discrete parameters to maximize Youden’s index J. As we were only able to compare the congruence of two binary maps, the denoised images were provisionally binarized using the optimal global threshold, which was determined for each image individually by another downhill simplex optimization nested into the parameter optimization. Each of these best possible binary maps was then compared to the ground truth to assess Youden’s index J.
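The inner step, finding the best possible global threshold for one image given its ground truth, can be sketched in Python as follows. For simplicity this sketch replaces the nested downhill simplex search with an exhaustive scan over the intensity values present in the image; the function names and the flat-list image representation are assumptions, not the authors’ code.

```python
def youden_index(binary, truth):
    """Youden's index J = sensitivity + specificity - 1 for flat 0/1 lists."""
    tp = tn = fp = fn = 0
    for b, t in zip(binary, truth):
        if b and t:
            tp += 1
        elif b:
            fp += 1
        elif t:
            fn += 1
        else:
            tn += 1
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    return sensitivity + specificity - 1.0


def best_global_threshold(image, truth, candidates=None):
    """Return (threshold, J) maximizing J for one image.

    image: flat list of intensities; truth: flat list of 0/1 labels.
    Pixels with intensity >= threshold are classified as vessel.
    Exhaustive scan, used here in place of a simplex search.
    """
    if candidates is None:
        candidates = sorted(set(image))
    best_t, best_j = None, -2.0  # J is never below -1
    for t in candidates:
        binary = [1 if v >= t else 0 for v in image]
        j = youden_index(binary, truth)
        if j > best_j:
            best_t, best_j = t, j
    return best_t, best_j


image = [0.1, 0.2, 0.8, 0.9, 0.3, 0.7]
truth = [0, 0, 1, 1, 0, 1]
threshold, j = best_global_threshold(image, truth)
```

An outer loop would then vary the denoising parameters, call this function for every training image, and keep the parameter set with the highest mean J.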
We observed that each of these denoising methods strongly enhanced the quality of the segmentation result compared to a segmentation without denoising. In particular, the BM3D image filter, as proposed by Dabov et al.,38 scored highest, lifting the average J value from 0.62 without denoising to 0.74 after BM3D filtering. Hence, BM3D with the optimal noise level parameter was chosen as the denoising method for the following investigations.
In the second stage of the segmentation pipeline, after the denoising step with the BM3D filter, methods for contrast enhancement were evaluated. Among methods such as histogram equalization, Retinex,39 and local-phase-based filtering,40 the contrast-limited adaptive histogram equalization (CLAHE) led to slightly superior segmentation results. The CLAHE method performed optimally with a number of tiles and the contrast enhancement factor with a uniform distribution and was chosen for the following steps.
Based on the images that were denoised by the BM3D filter and enhanced by CLAHE, different methods for binarization were evaluated and optimized. The most intuitive approach for binarization is to set a global threshold. This value can either be chosen semiautomatically by the user15,22,23 or automatically derived from the image by using one of the various known methods.41 As an empirical threshold, such as utilized by Liew and Carter,22,23 we chose the average of the optimal thresholds for each sample image. These were found with knowledge of the ground-truth segmentation as leading to the maximal Youden’s index J. Comparing the results of this fixed threshold with automatic techniques such as Otsu’s method42 and Isodata,43 which adjust the threshold for each image individually, Otsu’s method led to superior segmentation results. Furthermore, the local adaptive threshold approach, as utilized, e.g., by Reif and Chu,5,15 and the method by Niblack,32 as proposed for the evaluation of OCTA images by Lozzi et al.,25 led to better results than the global approaches when they were optimized for window size and noise level. Even though for the test dataset the improvement over the global methods was not significant (), the local adaptive thresholding method with an optimal kernel size of and a Gaussian statistic with was chosen as the optimal binarization method, as it was expected to be more robust against artifacts and signal variation.
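As an illustration of the automatic global techniques, Otsu’s method picks the threshold that maximizes the between-class variance of the intensity histogram. The following Python sketch assumes integer intensities (e.g., 8-bit) given as a flat list; the function name and the tie-breaking rule (lowest maximizing threshold) are choices made for this example.

```python
def otsu_threshold(values, levels=256):
    """Otsu's threshold for integer intensities in [0, levels - 1].

    Returns t such that pixels with value > t are foreground and the
    between-class variance of the two resulting classes is maximal.
    """
    hist = [0] * levels
    for v in values:
        hist[v] += 1
    total = sum(hist)
    sum_all = sum(i * h for i, h in enumerate(hist))

    best_t, best_var = 0, -1.0
    w0 = 0      # number of pixels in the background class (value <= t)
    sum0 = 0    # intensity sum of the background class
    for t in range(levels):
        w0 += hist[t]
        sum0 += t * hist[t]
        if w0 == 0:
            continue  # background class still empty
        w1 = total - w0
        if w1 == 0:
            break     # foreground class empty from here on
        mu0 = sum0 / w0
        mu1 = (sum_all - sum0) / w1
        var_between = w0 * w1 * (mu0 - mu1) ** 2
        if var_between > best_var:
            best_var, best_t = var_between, t
    return best_t


# Toy bimodal image: dark background and bright vessels.
values = [10] * 50 + [200] * 50
t = otsu_threshold(values)
foreground = [1 if v > t else 0 for v in values]
```

Local adaptive methods apply the same idea of a data-driven threshold, but compute it from the statistics of a window around each pixel rather than from the whole image.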
As segmentation in OCTA images was prone to image artifacts, a postprocessing step was conducted to eliminate the small objects of misclassification and to smooth object boundaries in the binary maps. Here, as methods of refinement, kernel-based morphological operations, such as opening and closing, have been optimized for kernel size and evaluated.
However, in our investigations the improvement of the segmentation result was even more prominent with the max-flow min-cut (graph-cut) approach, as proposed by Boykov and Kolmogorov.44,45 This approach solved the active contour model optimization problem, which regularized the length of the object boundary in the binary map in balance with a low intra-object intensity variance of the segmented image. This eventually led to smoother object boundaries and the compensation of minor segmentation errors. Here, the optimization of Youden’s index resulted in a neighborhood size of weighted by the Potts model with a weighting of foreground and background of and , respectively.17
For the analysis of the organization of the segmented vascular network, the binary segmentation map was simplified to its skeleton, which was represented by a line along the center of the vessel. Here, three different methods to obtain such a skeleton were evaluated. The approach proposed by Haralick and Shapiro18 provided an unregularized skeleton using morphological operations. The resulting skeleton can be further refined by using the distance transform giving the vessel radius at each position on the skeleton.46 By applying a threshold for the vessel diameter , very small structures were discarded, e.g., here with a diameter of which is below the lateral imaging resolution of the OCT setup. Moreover, the shape of a skeleton can be even more regularized using the Voronoi diagram.30 In our investigations, however, the morphological skeletonization led to optimal results, as the segmentations were smoothened already due to the refinement step and further smoothening of the skeleton was not necessary.
Comparison of Performance
With the aim of developing a vessel segmentation method robust against artifacts in OCTA imaging of skin, we have benchmarked the proposed OBVS method and the established algorithms listed in Table 1 against the ground truth. In Figs. 3–5, the performance of each of the algorithms is demonstrated for different images of the test group. Furthermore, the segmentation quality metrics of the algorithms applied to the whole group of test images are given in Fig. 6, together with the quantitative metrics such as VLD and FD that are derived from the segmented vascular network and the ground truth.
Figure 3 shows an OCTA image of the test group containing only minor artifacts. A region of was extracted for the evaluation of the segmentation algorithms against the manual ground truth, which was obtained using the STAPLE algorithm (Sec. 2.3). The performance is shown as an example for a tile of . Surprisingly, the different methods provided very different results, even though the tile was not corrupted by imaging artifacts and had a contrast common for OCTA images of skin. The binary maps of the methods proposed by Liew et al.22 and Lozzi et al.,25 based on global and local thresholding, respectively, were affected by noise in the foreground and background. The method proposed by Khansari et al.,11 which mainly consisted of morphological operations, and the method proposed by Chu et al.,15 applying Hessian-based filtering, appeared to be less sensitive to such noise corruption. However, Khansari’s method11 was developed to detect large vessels and gives just a few vessels with an enlarged caliber. OBVS performed best, compensating for the noise successfully while also maintaining the integrity of vessel diameter, boundary, and connectivity.
The skeleton generated by the method of Liew et al.22 was obtained using Voronoi regularization30 and compensated most of the noise successfully, whereas the skeleton method of Chu et al.,15 which used morphological skeletonization, gave a few vessel fragments in the background. The result of Lozzi et al.25 appeared rather toothy and angular. Only OBVS appeared to match the ground truth well, finding the structure and smoothness of the vessel in the ground truth.
This observation could be made as well in Figs. 4 and 5, which contain artifacts. In Fig. 4, OBVS is evaluated on a blurry image that has a poor contrast and is corrupted by sample motion. Figure 5 shows the cutaneous capillary network one day after the fractional photothermolysis treatment with a laser.24 Here, the vascular structures are distributed sparsely with a low contrast in the upper tile and dilated surrounding a lesion in the bottom tile, due to inflammation responses of the skin during wound healing.
The quality of each method in terms of sensitivity, specificity, and Youden’s index with the ground-truth segmentation is averaged over all five test images and is given in Fig. 6(a). The evaluation shown in the plot emphasizes the superiority of OBVS, which scored the highest Youden’s index with . Note, the average congruence of single manual segmentations of the rater compared to the STAPLE result, as described in Sec. 2.3, scored with .
Furthermore, for each method individually, the error of the quantitative metrics derived from the vascular network was assessed and is given, averaged over the five test images, in Figs. 6(b)–6(d). Although this analysis indicates the integrity of each segmentation method, the desired method would produce results with the least difference to the ground truth. As shown in Figs. 6(b)–6(d), results of OBVS agreed almost completely in terms of VLD, FD, and SOAM with those derived from the ground truth, with 0.1%, 0.2%, and 0.5% errors, respectively.
Discussion and Conclusion
The comparison of OBVS with four published algorithms for vessel segmentation in OCTA images demonstrated improved quality and reliability, which was obtained using the pipeline proposed in this work. Although the composition of this pipeline was specifically optimized to cope with imaging artifacts, also for uncorrupted images, it showed results superior to the four cited methods. The validity of quantitative metrics assessed for vascular networks was also considerably improved.
Training and testing of OBVS are limited to OCTA images of mouse skin, although we believe that the composition of methods in our OBVS pipeline is applicable for OCTA images of other tissue types such as human skin. The parameters of the methods might need to be adapted.
While creating our pipeline, we found the denoising step to have the most critical impact on the segmentation quality. Despite the fact that the Hessian-based vesselness filter was specifically designed to enhance vascular structures in angiograms,31 in our investigations it performed with surprisingly poor specificity and thus scored a poor Youden’s index J. It picked up imaging artifacts, creating phantom vessels in noisy regions (similar to the results of the method of Chu et al. shown in Fig. 4 and the upper tile of Fig. 5), and produced overly dilated vessels. The BM3D filter, which had not been applied to OCTA images in prior work, showed superior performance for both sensitivity and specificity. It compensated for the noise well, while producing sharp and smooth vessel boundaries and enhancing the contrast between vessel and background.
The procedure to find the optimal segmentation pipeline was inspired by machine learning algorithms. In contrast to deep-learning-based segmentation, here, we did not apply a variety of generic filters but the ideal combination of established methods with optimal parameters. This tailoring of the pipeline enabled us to train with a set of training data that was much smaller than those for deep-learning approaches, while also containing images corrupted by artifacts. For example, Prentašić et al.47 used 80 normal OCTA images for a deep-learning-based segmentation and scored an accuracy of 77% to 83%, whereas OBVS scored here with an accuracy of for corrupted images. Note that, in comparison to deep learning, the optimization to obtain OBVS was done sequentially, stage by stage, and not globally, and OBVS was not evaluated on the data of Prentašić et al.
Our optimization was based on a metric for segmentation quality and hence strongly depended on the integrity of the ground-truth segmentation. Using the STAPLE method, we were able to combine the votes of four manual segmentations and find a consensus. Yet, we observed that even the single rater was unable to identify every vessel reliably [indicated as yellow and blue regions in Fig. 1(b)]. This might be due to the rater guessing in areas of artifacts and low contrast of the sample OCTA images. Note that manual labeling was performed on the original images prior to any enhancement. Some commercial OCTA devices might only provide images that already underwent image processing and show modified vessel networks. Nonetheless, with [Fig. 6(a)] the proposed automatic method approximately reached the average Youden’s index of the manual segmentations of the rater, whose four segmentations compared to the STAPLE result scored on average with .
This sample dataset consisted of images with various types of artifacts that might not occur in different imaging situations or at different anatomical sites. For example, air bubbles are absent when imaging the retina in an intact eye.21 The challenge of segmenting the vessels in these corrupted images might have led to scores that were lower than those commonly demonstrated in angiographic imaging modalities other than OCTA in both manual and automatic segmentations. For example, in fundus photography, the work of Zhao et al. scored with 17 and Fraz et al. reviewed methods with scores of 48 [we converted these values from given sensitivity and specificity using Eq. (5) in Sec. 2.4]. Those analyses of fundus photography were benchmarked on open-source image databases, such as STARE and DRIVE, with high-quality manual segmentations obtained by many experts on fundus photography images with minimal artifacts. As such a database is not yet available for OCTA images, especially of skin, our scores of the segmentation quality metrics are not directly comparable to work in other fields.
Yet the visual impression and the analysis of the quantitative metrics show the advantages of our method by providing results closer to the ground truth than other published methods, even when optimized in the same fashion. Moreover, this study demonstrates how much the quantitative vascular network characteristics that are derived from an OCTA image vary when different methods for vessel segmentation are applied to the same images [Figs. 6(b)–6(d)]. Consequently, in any work on automatic analysis of OCTA images, the error of the utilized segmentation method should be assessed by comparison to ground truth data and presented to the reader.
As the development and evaluation of image-processing methods are dependent on the quality of the ground truth, open-source databases have been created in many fields of medical imaging. Unfortunately, in OCTA, such a resource is not available, even though a centralized, open-source database would enable benchmarking of angiographic algorithms on raw data (see Sec. 2.2) and of image processing and quantification on angiographies. This could lead to a common standard methodology to obtain quantitative metrics of OCTA images. Furthermore, open-access OCTA databases could eventually enable the comparison of medical findings among different publications, and researchers would not need to redo redundant parts of studies but could derive cross-wise conclusions from their own and others’ findings. However, the image acquisition and processing among commercial OCTA devices have not been standardized yet, which leads to images of heterogeneous quality.49 Hence, the assembly of such a database at this point would be impracticable due to the variability and number of commercial devices and developments. Furthermore, its applicability to every single device would be limited.
The authors declare that there are no financial interests or conflicts of interest related to this article.
The authors would like to thank Dr. Cuc Nguyen and Dr. Garuna Kositratna for their help in conducting the animal experiments.