Reproducibility of an airway tapering measurement in computed tomography with application to bronchiectasis

Abstract. We propose a pipeline to acquire a scalar tapering measurement from the carina to the most distal point of an individual airway visible on computed tomography (CT). We show the applicability of using tapering measurements on clinically acquired data by quantifying the reproducibility of the tapering measure. We generate a spline from the centerline of an airway to measure the area and arclength at contiguous intervals. The tapering measurement is the gradient of the linear regression between area in log space and arclength. The reproducibility of the measure was assessed by analyzing different radiation doses, voxel sizes, and reconstruction kernel on single timepoint and longitudinal CT scans and by evaluating the effect of airway bifurcations. Using 74 airways from 10 CT scans, we show a statistical difference, p=3.4×10−4, in tapering between healthy airways (n=35) and those affected by bronchiectasis (n=39). The difference between the mean of the two populations is 0.011  mm−1, and the difference between the medians of the two populations was 0.006  mm−1. The tapering measurement retained a 95% confidence interval of ±0.005  mm−1 in a simulated 25 mAs scan and retained a 95% confidence of ±0.005  mm−1 on simulated CTs up to 1.5 times the original voxel size. We have established an estimate of the precision of the tapering measurement and estimated the effect on precision of the simulated voxel size and CT scan dose. We recommend that the scanner calibration be undertaken with the phantoms as described, on the specific CT scanner, radiation dose, and reconstruction algorithm that are to be used in any quantitative studies.


Introduction and Purpose
Bronchiectasis is defined as the permanent dilatation of the airways. Patients with bronchiectasis can suffer severe exacerbations requiring hospital admission and have a poorer quality of life. 1 Clinicians diagnose bronchiectasis on computed tomography (CT) imaging by visually estimating the diameter of the airway/bronchus and its adjacent pulmonary artery and calculating the broncho-arterial (BA) ratio. A BA ratio >1 indicates the presence of bronchiectasis. 2 Various groups have proposed methods to automatically and semiautomatically compute the BA ratio for bronchiectatic airways. [3][4][5] However, use of the BA ratio to diagnose bronchiectasis has two major flaws. First of all, the maximum healthy range of the BA ratio can be 1.5 times size of the artery. 6 Second, blood vessels can change size as a result of factors including altitude, 7 patient age, 8 and smoking status. 9 This conflicts with the assumption that the pulmonary artery is always at a constant size.
An alternative approach to diagnose and monitor bronchiectatic airways is to analyze the taper of the airways, i.e., the rate of change in the cross-sectional area along the airway. 2 In patients with bronchiectasis, the airway is dilated and so the tapering rate must be reduced. Airway tapering is difficult to assess visually and to measure manually. As described by Hansell,6 the observer would have to make multiple cross-sectional area measurements along the airway. As mentioned by Cheplygina et al, 10 measuring multiple lumen is a manually exhaustive task and prone to mistakes.

Related Work
There have been various strategies to quantify tapering in the airways. The initial proposed tapering measurements by Odry et al. 11 were restricted to short lengths of the airways. A segmented airway would be split into four equal parts. Each segment had an array of computed lumen diameters. The tapering was measured as the linear regression of the lumen diameters along the branch. The method shared similarity to Venkatraman et al., 12 but the diameter measurements were taken across the central half of each branch. Various analyses attempted to measure the taper of airways containing multiple branches. In the work of Oguma et al., 13 the region of interest from the carina to the fifth generation airway was measured; however, this was only performed in patients with COPD. Finally, Weinheimer et al. 14 used a graphical model of the airways for their proposed tapering measurement. The graphical model was based on a graphical tree originating at the trachea and extending into distal branches, depending on airway bifurcations. A tapering measure was assigned to the edge of the graph depending on the lumen area and generation. They also proposed a regional tapering measurement based on the segments of the lobes of the lungs.
The described tapering measurements have two key limitations. First of all, there is no detailed quantification of reproducibility when considering differences in specifications of the CT scanner, or reconstruction kernel, making it difficult to compare tapering statistics from different machines or from the same CT scanner employing different scanning parameters. Second, the region of interest for the tapering measurement was restricted to airways that were segmented using the respective airway segmentation software. Bronchiectasis is a heterogeneous disease-it can affect any area in the lung including the peripheral regions. 15 Thus to encapsulate the disease in the tapering measurement, one would need to consider the region of interest as the entire airway, from the trachea to the most distal point.
In all the proposed tapering measurements, obtaining the cross-sectional area is a necessary input for the algorithms. There have been various analyses attempting to validate the reproducibility and precision of measurements against dose, [16][17][18] voxel size, 19 reconstruction kernel, [20][21][22] and normal biological variation. 23 In most of the validation experiments, area measurements were taken from phantom, 20,21 porcine, 16,18 or cadaver 24 models. Fetita et al. 25 used synthetic models of the lung. None of these experiments were explicitly performed on scans with bronchiectasis. Furthermore, the area measurements were not taken at contiguous intervals along the lumen thus missing possible dilatations from a bronchiectatic airway.
For our work and the method from Oguma et al., 13 the tapering measurement involves the computation of the arc length of the airway at contiguous intervals. In the literature, investigations of the reproducibility of arc length computation in airways are limited. The work of Palágyi et al. 26 used simulated rotation of in vivo scans. The assessment of the reproducibility was based on the lengths of a single branch rather than multiple generations of branches, thereby precluding estimations of reproducibility of airway quantitation from the carina to an airway's most distal point.

Contributions of the Paper
To our knowledge, there is no detailed analysis on the reproducibility of a global tapering measurement of airways using CT. Thus the purpose of this paper is as follows. First of all, we will discuss in detail a tapering measurement of the airways on CT imaging. Second, we quantify the reproducibility of the measurement against variations in simulated dose and voxel sizes. In addition, we compare the variability of the tapering measurements across different CT reconstructions kernel. Finally, we analyze the effect of bifurcations on tapering measurements and consider measurement repeatability using longitudinal scans.

Method
We first describe in detail the steps to acquire the airway tapering measurement. The method was initially proposed by Quan et al. 27 ; it is summarized in Fig. 1. The pipeline required two inputs. First, the most distal point of each airway of interest was manually identified by an experienced radiologist (J. J.).  A single voxel was marked at the end of the airway centerline. The entire analysis was completed using ITK-SNAP. 28 Second, a complete segmentation of the airway was produced. We obtained an airway segmentation by implementing a method developed by van Rikxoort et al. 29 The algorithm was based on a region growing paradigm. In summary, a wave front was initialized from the trachea. Voxels on each new iteration were classed as airways based on a voxel criterion. The wave front continued until a wave front criteria was met. In certain cases, the airway segmentation was unable to reach the distal points, and, in these cases, we extended the airway segmentation manually to the distal points. Our method is designed so that it can be incorporated to any system that provides both the segmentation and distal point to the airway of interest. Once the inputs were available, we acquired the measurement using an automatic process.

Centerline
The centerline was used to identify and order the airway segments for the tapering measurement. We implemented a curve thinning algorithm developed by Palágyi et al. 26 At initialization, the algorithm used the airway segmentation and distal points acquired in Sec. 2. The final input was the start of the centerline at the trachea. The shape of the trachea was assumed to be tubular, with an approximate constant diameter and orientated near perpendicular to the axial slice. Thus the centerline of the trachea lay on the local maximum value of the distance transform of the segmented trachea. 30 Algorithm 1 was used to find the centerline start point.

Recentering and Spline Fitting
The next task was to separate the centerline of each individual airway from the centerline tree. To this end, we modeled the centerline tree as a graphical model similar to Mori et al. 31 The nodes corresponded to the centerline voxels and the edges linked neighboring voxels. We performed a breadth first search algorithm 32 on the centerline image. Starting from the carina, we iteratively found the next set of sibling branches. When a distal point was found at the end of a parent branch, the path leading to the distal point was saved. The output was an array of ordered paths describing the unique route from the trachea to the distal point. The proposed tapering measurement started at the carina. Thus centerline points corresponding to the trachea were removed from further analysis.
For each path, we corrected for the discretization errora process known as recentering. 33 We implemented a similar method to that described by Irving et al. 34 A five point smoothing was performed along each path. We modeled the centerline as a continuous model by fitting a cubic spline F∶½0; k n → R 3 denoted as E Q -T A R G E T ; t e m p : i n t r a l i n k -; e 0 0 1 ; 3 2 6 ; 7 1 9 FðtÞ ¼ 8 > > < > > : f n ðtÞ t ∈ ½k n−1 ; k n ; (1) where f i ðtÞ ¼ P 3 j¼0 c i;j t j and c i ∈ R 3 . The knots k i where taken on every smoothed point on the centerline. The spline fitting was performed using the cscvn 35 function in MATLAB. The continuous model should enable computations of the arc length and tangent at subvoxel intervals along the airway.

Arc Length
The tapering measurement required an array of arc lengths at contiguous intervals from the carina to the distal point. For our pipeline, we considered small parametric intervals t i on the cubic spline FðtÞ. At each interval t i , we computed the arc length from the carina to t i as 36 E Q -T A R G E T ; t e m p : i n t r a l i n k -; e 0 0 2 ; 3 2 6 ; 5 1 1 where _ F ¼ dF dt and ð·Þ is the dot product. For our work, we considered parametric intervals of 0.25 along the spline.

Plane Cross Section
We measured the cross-sectional area accurately by constructing a cross-sectional plane perpendicular to the airway. Using the interval t i from the arc length computation, we computed tangent vector q ∈ R 3 by E Q -T A R G E T ; t e m p : i n t r a l i n k -; e 0 0 3 ; 3 2 6 ; 3 7 2 From linear algebra, points on the plane can be generated by their corresponding basis vector. 37 To this end, we generated a set of orthonormal vectors v 1 , v 2 ∈ R 2 using the method stated by Shirley and Marschner. 38 The method is summarized in Algorithm 2.
Assuming Fðt i Þ was the origin, each point u ∈ R 3 on the plane can be written as E Q -T A R G E T ; t e m p : i n t r a l i n k -; e 0 0 4 ; 3 2 6 ; 2 4 7 We selected the scalars α 1 ; α 2 ∈ R such that the point spacing is 0.3 mm isotropically.

Lumen Cross-sectional Area
We calculated the cross-sectional area using the edge-cued segmentation-limited full-width half-maximum (FWHM ESL ), developed by Kiraly et al. 39 The method is as follows. The cross-sectional planes were aligned on both the CT image and airway segmentation. The intensities of the plane were computed for both images using cubic interpolation. Fifty rays were cast out in a radial direction, from the center of the plane. Each ray sampled the intensity of the two planes at a fifth of a pixel via linear interpolation. Thus each ray produced two 1-D images with the first from the binary plane r b and second from the CT plane r c . We then applied Algorithm 3 to find boundary point l.
The final output of the FWHM ESL was an array of 2-D points corresponding to the edge of the lumen. Finally, we fitted an ellipse based on the least square principle. The method was developed and implemented in MATLAB by Fitzgibbon et al. 40 We considered the cross-sectional area as the area of the fitted ellipse.

Tapering Measurement
We assumed for a healthy airway that the cross-sectional area was modeled by an exponential decay along its centerline. It has been shown in human cadaver studies that the average cross-sectional area in a branch reduces at an exponential rate at each generation. 41 The same observation has been noted in porcine models. 42 Using the decay assumption, we modeled the relationship between the arc length and the cross-sectional area as E Q -T A R G E T ; t e m p : i n t r a l i n k -; e 0 0 5 ; 3 2 6 ; 6 4 6 where x is the arc length of the spline, T is the proposed tapering measurement, y is the cross-sectional area, and A is an arbitrary constant.
In terms of implementation, for each airway track, we considered the array arc length and cross-sectional area computed for each individual airway. A logarithmic transform logðxÞ was applied only on the cross-sectional area array. We fitted a linear regression on the signal; the tapering measurement is defined as the gradient from the line of best fit.

Evaluation
An experienced radiologist (J. J.) selected a total of 74 airways from 10 scans. The CT images were analyzed from nine patients with bronchiectasis after obtaining written informed consent at the Royal Free Hospital, London. The voxel size ranged from 0.63 to 0.80 mm in plane and 0.80-to 1.5-mm slice thickness. The airways were classified as healthy (n ¼ 35) or bronchiectatic (n ¼ 39) by the same radiologist. Details including the make and model of the scanner are provided in Table 1. 43 From our dataset, many of the airways affected by bronchiectasis came from two patients. We used the same airways for the simulated low-dose and voxel size experiments. A subset of the same airways was used for CT reconstruction kernel and bifurcation experiments.

Simulated Images
In this experiment, we simulated images with differing radiation dose and voxel size. The purpose was to analyze the reproducibility of the tapering measurement against various properties of the CT image. Furthermore, we varied the noise and voxel sizes at regular intervals. Thus we also analyzed the sensitivity of the tapering measurement against the given parameters. Finally, we investigated the reproducibility of cross-sectional area and airway length measurements with changes in dose and voxel sizes, respectively.

Dose
To simulate the images acquired with different radiation doses, we used the method adapted from Ref. 44. We performed a Radon transform on each axial slice of the original CT image. The output is a sinogram of the respective axial slice. To simulate different radiation doses, Gaussian noise was added on each sinogram with standard deviation σ ¼ 10 λ , with a range of λ. The noisy sinograms were then transformed back into physical space using the filtered back projection. The final output is a noisy CT image in Hounsfield units in integer precision. A MATLAB implementation is displayed in Algorithm 4. For our experiment, we varied λ from 0.5 to 5 in increments of 0.5. An example of the output image is displayed in Fig. 2.
To relate λ to the physical dose from a CT scanner, we adopted the method described by Reeves et al. 45 This paper quantified the dose of an image with a homogeneous region in the chest CT scan. To this end, we used the homogeneous region inside the trachea. Using the airway segmentation, we considered the first 60 axial slices of the segmented trachea. To avoid the influence of the boundary, the tracheas were morphologically eroded 46 with a structuring element of a sphere of radius 5. All segmentations were visually inspected before further processing. Finally, we computed the standard deviation of the intensities inside the mask, denoted as T n . Table 2 shows values of T n on a selection of images against a range of λ. Using results from Reeves et al. 45 and Sui et al., 47 a low-dose scan with a tube current-time product 25 mAs has maximum T n of 55 HU. Thus we assume λ ¼ 3.5 approximately corresponds to a low-dose scan. We considered higher values of λ to verify any correlations in the results.
We computed the taper measurement on the noisy images using the same segmented airways and labeled distal point that were identified on the respective original image. The literature has shown in low-dose scans, airway segmentation software 29,48 cannot segment airways to the lung periphery as well as standard dose scans of the same patient. But these methods can still segment a large number of branches in low-and ultralow-dose scans. 29,48 Furthermore, research has shown that there are minor differences in the performance of radiologists when attempting to detect features from standard and low-dose CT scans. 47,49,50

Voxel size
We analyzed the effect of voxel sizes on the tapering measurement. For each CT image, the voxel spacing s x , s y , s z was subsampled to new spacing of σs x , σs y , σs z , where σ is a scalar constant. The intensities at each new voxel position were computed using sinc interpolation with a small amount of smoothing. We chose Sinc interpolation to preserve as much information as possible from the original image. To compute the tapering value, we resampled the segmented airway and distal point to the same coordinate system using nearest neighbor interpolation. Morphological filtering via a closing operation 46 was used on segmented airways to remove artefacts caused by the resampling. For our experiment, we used the parameters σ ¼ 1.1; : : : ; 2 with increments of 0.1.

CT Reconstruction
On a subset of images, four patients were scanned using the Toshiba Aquilion ONE scanner. On the same scan, two different images were computed. The images were reconstructed using the lung and body kernels, respectively. An example of the reconstruction kernels is displayed in Fig. 3. We acquired the airway segmentation and distal point from a single-reconstruction kernel as described in Table 3. The tapering measurement was computed on both reconstruction kernels using the same airway segmentation and distal points. We used the same airways as described in Table 1.

Effect of bifurcations
We analyzed the effect of airway bifurcations on the tapering measurement. To this end, we manually identified regions of bifurcating airways. On a selected subset of airways, we considered the reconstructed airway image described in Fig. 16. Using ITK-SNAP, the author (K. Q.) started at the cross-sectional plane corresponding to the carina and scrolled toward the distal point. Using visual inspection, the following protocol was developed to identify bifurcations on cross-sectional planes: 1. The scrolling stops when the airway is almost or at the point of separation.
2. The author scrolls back until the airway stops decreasing in diameter. An alternative interpretation is when the airways are about to enlarge due to the bifurcation.
3. Starting at the point of enlargement and scrolling forward, each slice is delineated as a bifurcating region until complete separation of bifurcating airways is reached.
The criteria for a complete separation are the lumen wall of both airways that are completely visible and separate. The entire protocol is summarized in Fig. 4.
For our experiment, we selected 19 airways from Table 1. The data consisted of 11 healthy and 8 bronchiectatic airways. The entire analysis was performed on ITK-SNAP..

Progression
We examined possible changes in tapering of airways in patients over time. In this experiment, we consider two sets of longitudinal scans. First, pairs of airways that were healthy on both baseline and follow up scans. Second, pairs of airways that were healthy on baseline scans and became bronchiectatic on follow up scans. For pairs of healthy airways, a trained radiologist (J. J.), manually identified 14 pairs of airways across 3 patients. The criteria were the airway track that must have a healthy appearance on both baseline and follow up scans. For the second population, the same radiologist manually identified 5 pairs of airways from a single patient P1. The scans were obtained from the University College London Hospital and acquired with written consent. The criteria for selection were airways that appear healthy on baseline scans and became bronchiectatic at the follow up scan, an example is displayed in Fig. 5. Details of the CT images are summarized in Tables 4 and 5.
Using two separate work stations, the airways were visually registered between the longitudinal scans. Airways were taken from various regions of the lungs and were different to the airways displayed in Table 1. The tapering measurements were taken from the method discussed in Sec. 2.

Dose
We analyzed the difference in cross-sectional area measurements and the final tapering measurements at different CT radiation doses. For the cross-sectional areas, Fig. 7 compares the crosssectional areas between the original image and one of the noisy images. Each graph contains ∼30; 000 unique lumen measurements. The correlation coefficients between the populations were r > 0.99 on all graphs. The 95% confidence intervals increase with the amount of noise. For the tapering measurement, Fig. 8 displays the measurements from all the noisy images compared to their respective original images. The correlation coefficient between noisy and original tapering measurements was r > 0.98 on all values of λ.
We analyzed the tapering difference between the original images and simulated images. We interpret the mean and standard deviation of the error difference as the bias and uncertainty, respectively. Figure 9 shows an overestimation bias with an increase in noise and a positive correlation between uncertainty and dose.

Voxel Size
We analyzed the computed spline and tapering for all the scaled images. We used the arclength of the spline as the metric for comparison for the computed spline. Figure 10 compares the arclengths computed from the scaled splines with the respective originals. On all scales σ, the correlation coefficients between measurements were r > 0.98. Furthermore, we analyzed the error difference in arclength. In Fig. 11, the mean difference shows a weak correlation coefficient with r ¼ 0.55 with scale σ. The mean difference shows both an overestimation and underestimation bias with the arclength measurement. Figure 11 shows a weak correlation between standard deviation and scale with r ¼ 0.51.
In terms of the tapering measurement, Fig. 12 compares the tapering values from the scaled images with the respective originals. The correlation coefficients between the scaled and original tapering values was r > 0.97 on all scales σ. In addition, we  Taper difference (mm -1 ) 10 -3  examined the error difference of the original minus the scaled tapering. Figure 13 shows a negative correlation with both overestimation and scale with r ¼ −0.98. Furthermore, Fig. 13 shows a positive correlation with uncertainty and scale with r ¼ 0.94.

CT Reconstruction
We analyzed the difference in cross-sectional area and tapering measurement between reconstruction kernel. Figure 14, compares the difference in area measurements. On all patients, in cross-sectional area measurements, the correlation coefficient between the two measurements was r > 0.99. The largest 95% confidence was in patient bx515 with AE1.98 mm 2 from the mean. Figure 15 compares the differences in tapering measurement. We collected n ¼ 44 tapering measurement from 4 patients. The correlation coefficient was r ¼ 0.99 between the reconstruction kernels.

Bifurcations
We compared tapering measurements with and without points corresponding to bifurcations. On the first dataset, the tapering measurements were computed using all area measurements. The second dataset had tapering measurements computed without area measurements from the bifurcating regions as described in Fig. 16. As we compared the measurements in Fig. 17, the correlation coefficient was r ¼ 0.99. The uncertainty of each tapering measurement was computed using the standard error of estimate s defined as 55 where y i is the arclength, Y i is the estimate from the linear regression from each computed area x i , and N is the number of points in the profile. Figure 16 compares the uncertainty between the two populations. There was a statistical difference between the populations, on a Wilcoxon rank-sum test, p ¼ 7.1 × 10 −7 .

Progression
For healthy airways, we grouped tapering values between the baseline and follow up point. Figure 18 compares the measurements between the two time points. The results demonstrated good agreement with an intraclass correlation coefficient 56 ICC > 0.99. The standard deviation of the tapering difference was 1.45 × 10 −3 mm −1 . For airways that became bronchiectatic, we considered the change in tapering, i.e., tapering value at follow up minus   tapering value at baseline, the results are displayed in Fig. 18.

Standard deviation difference
The results show that bronchiectatic airways have a greater tapering change in magnitude compared to airways that remained healthy p ¼ 0.0072.

Discussions
In this paper, we propose a tapering measurement for airways imaged using CT and validate the reproducibility of the measurement. The tapering measurement is the exponential decay constant between cross-sectional area and arclength from the carina to the distal point of the airway. Unlike other proposed tapering measurements, we assess reproducibility of the tapering measurement against simulated CT dose, voxel size, and CT reconstruction kernel. Finally, we assess the effect of tapering across airway bifurcations and examine repeatability over time using longitudinal scans. Part of the evaluations consists of analyzing the difference in tapering across longitudinal scans. The timescales between scans range from 9 to 35 months. The motivation for a long timescale is a proof of principle demonstration that the tapering measurement is reproducible for clinical studies. Examples include drug trails 57 and investigations in exacerbations, 58 where the timescales in monitoring patients were 12 months and 60 months, respectively. The pipeline consists of various established image processing algorithms. We chose the centerline algorithm developed by Palágyi et al. 26 Unlike other proposed methods, 30,59,60 the algorithm explicitly links the distal points to the carina. Furthermore, it has been shown that the algorithm of Palágyi et al. 26 can be used on images with nonisotropic voxel sizes. By modeling the centerline as a graphical model similar to Mori et al., 31 we performed a breadth first search 32 to avoid analyses of false airway branches. The removal of false branches is not a trivial task. 31,34,61 We corrected the centerline discretization error or recentering by smoothing points on the centerline. Smoothing has been an established method in the literature. 34,62 A recentering method was proposed by Kiraly et al., 61 which shifts the centerline voxels in relation to a distance transform. The process is iterative compared to a single computation of smoothing.
For our pipeline, we generated the orthonormal plane based on the method of Shirley and Marschner. 38 We set the pixel size   66 that the method can create high-frequency artefacts in the image. 67 Various methods have been proposed to measure the area of the airway lumen. [68][69][70] We used the FWHM ESL because of two distinct advantages. First, the method is parameter free. Second, the method is robust against slight variations in intensities. The method can, therefore, be applied to images from different scanners and images acquired using different image reconstruction kernels.

Limitations
In this study, we compared the tapering measurement for healthy and diseased airways using a Wilcoxon rank-sum test. The test assumes the data points are independent. However, we used a variety of airways from the same lung. Thus the tapering profiles of the same patients will have a degree of overlap. Future work is needed to analyze data points that are not dependent on each other.
A key limitation of the tapering measurement is the requirement of having a robust airway segmentation. In this paper, the airway segmentation software was often unable to reach the visible distal point of an airway. Thus time-consuming manual delineation was needed to extend the missing airways. The distal point is usually located at the periphery of the lungs. Thus to avoid manual labeling, a segmentation algorithm would need to automatically segment the airways past the sixth airway generation. From the literature, the state-of-the art software developed by Charbonnier et al. 71 using deep learning could still only consistently segment airways to the fourth generation. The segmentation of small and peripheral airways is not a trivial task. 66,72,73 In this paper, we analyze the reproducibility of all computerized components of the tapering algorithm. This paper does not address reproducibility of manual labeling of the airways. It is noted in the literature that semimanual labeling of small airways can take hours. 74 Future work is required to analyze the reproducibility of manual segmentation of the airways. We hypothesize that the segmented healthy peripheral airways consist of a small number of voxels; therefore, any errors in voxel labeling will be considerably smaller then a dilated peripheral airway affected by bronchiectasis.
In this work, we simulated low-dose scans through performing Radon transforms on existing CT images, adding Gaussian noise on the sinogram and using backprojection to reconstruct noisy CT images. There are proposed methods to simulate a low-dose scans by adding a combination of tailored Gaussian and Poisson noise on the sinogram. 75 These methods assume the original high-dose sinogram are available for simulation; however, it has been acknowledged that sinograms are generally not available in the medical imaging community. 76,77 Thus various groups have proposed low-dose simulations using reconstructed CT images. The methods involve adding Gaussian 76,77 or a combination of Gaussian and Poisson noise 78 on the sinogram of the forward projection of the CT image. Although there has been limited validation of the appearance of lung nodules against simulated low dose simulation, 79 there has been no validation on the efficacy of these methods on the appearance of airways. We believe that our low-dose simulation is sufficient because the measured standard deviation of the trachea mask T n is similar to results taken from low-dose scans from Reeves et al. 45 and Sui et al. 47 Similarly, with voxel size simulation, ideally one would reconstruct the images from the original sinogram. 19 However, as the sinograms were unavailable, we simulated the voxel size through interpolation of the original CT images similar to Robins et al. 80 We believe that the simulation is sufficient as it shows the robustness and precision of the centerline, recentering, and crosssectional plane algorithms in the pipeline. Changes in voxel sizes will change the combinatorics or arrangement of the binary image. By showing steps in the pipeline like centerline computation are repeatable across voxel sizes, we avoid resampling the image to isotropic lengths. Thus potentially avoiding a computationally expensive 48 preprocessing step.
We showed that the tapering measurement is reproducible by measuring the same airway across longitudinal scans with a minimum 5-month interval. The time between scans was on a similar scale from a reproducibility study on airway lumen by Brown et al. 23 An ideal experiment to assess reproducibility of the same airway from different scans would be to acquire follow up scans immediately after baseline scans similar to Hammond et al. 16 However, that work was performed on porcine models. Due to considerations of radiation dose, it is difficult to justify the acquisition of additional scans of no clinical benefit. 81 For our experiment, each airway was chosen by a subspecialist thoracic radiologist. The airway was inspected to ensure it was in a healthy state, e.g., with no mucus present. Thus we assume that each pair of airways is disease free and healthy.

Conclusions
In this paper, we show a statistical difference in tapering between healthy airways and those affected by bronchiectasis as judged by an experienced radiologist. From Fig. 6, the difference between the mean and median of the two populations was 0.011 and 0.006 mm −1 , respectively. In simulated low-dose scans, the tapering measurement retained a 95% confidence interval of AE0.005 mm −1 up to λ ¼ 3.5, equivalent to a 25-mAs low-dose scan. In simulations assessing different voxel sizes, the tapering measurement retained a 95% confidence between AE0.005 mm −1 up to σ ¼ 1.5. The tapering measurement retains the same 95% confidence, AE0.005 mm −1 interval against variations in CT reconstruction kernels, bifurcations, and, importantly, over time in evaluating sequential scans in normal airways. Importantly, we showed as a proof of principle that the magnitude change in tapering for healthy airways is smaller than those from airways that became bronchiectatic. From our previous work, 27 we showed that the measurements are accurate to a subvoxel level. Our findings suggest that our airway tapering measure can be used to assist in the diagnosis of bronchiectasis, to assess the progression of bronchiectasis with time and, potentially, to assess responses to therapy.
We analyzed the reproducibility of the components that constitute the tapering measurements. The reproducibility of area measurements was analyzed in relation to simulated radiation dose and CT reconstruction kernels. For simulated dose, we found the 95% confidence interval retains AE1.5 mm 2 in noisy images under λ ¼ 3, equivalent to a dose just higher than a 25-mAs low-dose scan. We note in Fig. 7, there is a bias toward overestimating larger lumen sizes at lower doses. As the centerline length remains constant and bias on the smaller lumen remain stable, the overestimation results in an increase in taper magnitude. For reconstruction kernels variation, we found the largest 95% confidence interval was AE1.9 mm 2 . The reproducibly of arclengths was tested against voxel sizes variability and showed that arclengths have a 95% confidence interval of up to AE5.0 mm for scales under σ ¼ 1.5. The increase in the standard deviation of arclength and area against voxel size and dose, respectively, correlate with uncertainty in tapering. This paper provides useful information for clinical practice and clinical trials. An accurate prediction of the noise amplitude in a particular CT scan and its distribution is a function of the limited radiation dose of the scan, scanner geometry, reconstructed voxel size, other sources of noise, the reconstruction algorithm, and any pre-and postprocessing used. Many of these factors are proprietary information of the CT manufacturer and hence not available to users. 82,83 We have undertaken an experiment to assess the dependence of our measurements on a simulated noise field added to the CT scan data and have presented the results. This gives an indication of the dependence on radiation dose assuming all other factors remain the same. We recommend that the accuracy experiment presented in this paper be repeated for the particular reconstruction, scan protocol, and scanner type used to make the measurements.
Bronchiectasis is often described as an orphan disease and has suffered a lack of interest and funding. 84,85 We have shown that the reproducibility of automated airway tapering measurements can assist in the diagnosis and management of bronchiectasis. In addition, we show that it is feasible to use our tapering measurement in large-scale clinical studies of the disease provided careful phantom calibration is taken.

Disclosures
Part of this work has been presented at the 2018 SPIE Medical Imaging Conference. 27 For potential conflicts of interest: Ryutaro Tanno has been employed by Microsoft, ThinkSono, and Butterfly Network (the employment is unrelated to the submitted work), Joseph Jacob has received fees from Boehringer Ingelheim and Roche (unrelated to the submitted work), and David Hawkes is a Founder Shareholder in Ixico plc (unrelated to the submitted work). The other authors have no conflicts of interest to declare.