Difference imaging from single measurements in diffuse optical tomography: a deep learning approach

Abstract. Significance “Difference imaging,” which reconstructs target optical properties using measurements with and without target information, is often used in diffuse optical tomography (DOT) in vivo imaging. However, taking additional reference measurements is time consuming, and mismatches between the target medium and the reference medium can cause inaccurate reconstruction. Aim We aim to streamline the data acquisition and mitigate the mismatch problems in DOT difference imaging using a deep learning-based approach to generate data from target measurements only. Approach We train an artificial neural network to output data for difference imaging from target measurements only. The model is trained and validated on simulation data and tested with simulations, phantom experiments, and clinical data from 56 patients with breast lesions. Results The proposed method has comparable performance to the traditional approach using measurements without mismatch between the target side and the reference side, and it outperforms the traditional approach using measurements when there is a mismatch. It also improves the target-to-artifact ratio and lesion localization in patient data. Conclusions The proposed method can simplify the data acquisition procedure, mitigate mismatch problems, and improve reconstructed image quality in DOT difference imaging.

1 Introduction without the target, is often used in DOT in-vivo imaging. Difference imaging can partially cancel out modeling errors that are invariant between the measurements. Typically, the image reconstruction is carried out using the difference between the measurements and a linearized approximation of the forward model. 9,18,19 For example, in clinical breast patient studies, our group normalized the measured diffuse reflectance from the lesion side breast to the contralateral normal breast (the reference side) to produce a "perturbation" for imaging reconstruction. 20 In practice, a sequence of reference measurements is acquired, and then, according to the chest wall shown in the coregistered US images and the perturbation, the best reference is manually selected for reconstruction. These procedures can significantly increase the data acquisition and postprocessing time. On the other hand, the assumption made in difference imaging that the background properties of the lesion side and reference side are the same may be inaccurate in some cases in clinical studies, and such mismatch can cause incorrect estimation of the lesion's optical properties and produce image artifacts. [21][22][23] Here, to simplify the data acquisition procedure and to avoid mismatch problems in DOT difference imaging, we propose a new approach using a deep artificial neural network (ANN) to generate perturbation from target measurements only. The ANN model is trained and validated on simulation data, including data generated with the realistic VICTRE breast phantom 24 and VICTRE breastMass software. 25 The performance of the model is then tested with simulations, phantom experiments, and patient data.

DOT System
A frequency-domain US-guided DOT method and system were used for simulations, phantom experiments, and patient data. The system was previously described in Ref. 26. Briefly, the light source consists of four laser diodes (with wavelengths of 730, 785, 808, and 830 nm) modulated at 140 MHz. The detection of diffuse reflectance is achieved by 14 parallel photomultiplier tube detectors. A local oscillator at 140.02 MHz is used to demodulate the detected signals to 20 kHz. Then the signals are digitized and sampled by two eight-channel A/D data acquisition cards. A US transducer in the middle of the DOT probe provides the coreregistered B-scan US images.

Simulation
A total of 43,398 sets of measurements from finite element method (FEM) and Monte Carlo simulations 27 were used: 80% were randomly chosen as the training set and the rest were used as the validation set. The training set was used to fit the model, and the validation set was used to determine the ANN structure and to select proper hyperparameters. Different target sizes, depths, shapes, and absorption and reduced scattering coefficients (μ a and μ s 0 , respectively) were considered. Tissue heterogeneity was simulated with realistic digital breasts. 24 The target and background properties of the training and validation dataset are given in Table 1. More details about the setup of the simulations can be found in Refs. 28 and 29. To demonstrate the generalization of the model, we generated 180 additional sets of simulation data for testing. This dataset covered various target sizes, depths, and absorption coefficients, and we made sure that the μ a and μ s 0 of targets, the breast tissue, and the chest wall in this dataset were different from that of the training and validation sets.

Phantom experiments
To further evaluate the performance of the model, phantom experiments and patient data were used as the test set because they are more representative of future unseen clinical data. Two sets of solid targets with high contrast (μ a ¼ 0.23 cm −1 ) and low contrast (μ a ¼ 0.11 cm −1 ) were placed in Intralipid ® solution with a μ a of 0.015 cm −1 and a μ s 0 of 7.3 cm −1 , both measured at 730 nm. 26 These targets had diameters of 1, 2, and 3 cm and were placed at different depths (central depths of 1.5, 2, and 2.5 cm). To demonstrate the advantage of the proposed method in avoiding reconstruction inaccuracy due to mismatch between the target and reference measurements, we also conducted phantom experiments with chest wall mismatch. A gelatin-intralipid phantom 30 with a μ a of 0.076 cm −1 and a μ s 0 of 9.8 cm −1 measured at 730 nm was used to simulate the chest wall. Chest wall mismatch was created by placing the chest wall deeper at the target side than the reference side.

Clinical data
Patient data from 43 benign (22 fibroadenomas or proliferative lesions (PL) and 21 low-risk benign lesions of fibrocystic changes or complex cysts) and 13 malignant cases were used to evaluate the performance of the proposed method. This group of 56 patients had an average age of 49.2 years (AE13.8 years). The lesion radii in the z-direction ranged from 0.5 to 1.5 cm. The study was approved by the local IRB and was HIPAA compliant. Written informed consent was obtained from all patients. The data used in this paper were deidentified.

Artificial Neural Network
In traditional difference imaging for US-guided DOT, the normalized perturbation, pert, was calculated from the frequency-domain measurements of the lesion-side breast normalized to the contralateral reference-side breast: E Q -T A R G E T ; t e m p : i n t r a l i n k -; e 0 0 1 ; 1 1 6 ; 1 9 2 where U l and U r are the lesion-side and the reference-side measurements, respectively. Under the assumption that the only difference between the lesion side and the reference side is that the lesion has a higher absorption coefficient than the background, A l should be smaller than A r . Simulations have shown that the phase difference between the two sides jϕ l − ϕ r j does exceed 90 deg; 31 thus the real part of the normalized perturbation should be between −1 and 0. Multilayer perceptron (MLP), a type of feedforward ANN, can approximate any continuous mapping function from one finite-dimensional space to another, 32 and this mapping function can Target μ a (cm −1 ) 0.08 to 0.30 0.10 to 0.30 Tissue μ a (cm −1 ) 0.01 to 0.06 Digital breasts with 20% to 80% fat; 24 no chest wall Chest wall depth (cm) 1.6 to 5 Chest wall tilting angle (deg) −10 to 10 For nonspherical targets, the radii are calculated as half of the largest dimensions in the z direction.
be learned using a data-driven approach. Hence, we trained an MLP-based ANN to output perturbation measurements pert from target measurements U l only. The nonlinear U l -to-pert mapping function, or the weight of the ANN, was learned from simulations with known U l and pert. As shown in Fig. 1, the input and output layers of the ANN both have 252 neurons, and the three hidden layers have 256, 128, and 256 neurons, respectively. To introduce nonlinearity to the neural network, a rectified linear unit activation function was used after each fully connected layer except the last one. The inputs of the network are the lesion side log amplitude, logðA l ρ 2 Þ, and phase, ϕ r , of simulated light reflectance on the lesion side with 2% random Gaussian noise added. Here, A l is the amplitude and ρ is the source-detector distance. The outputs are the real and imaginary parts of the normalized perturbation. To avoid scaling problems for different datasets, the logðA l ρ 2 Þ values were variously offset in each set of measurements to keep the maximums of logðA l ρ 2 Þ the same. Similarly, the minimums of ϕ r were kept the same. The output of the ANN is the real and imaginary parts of the normalized perturbation. The mean square error was used as the loss function. The network was implemented in PyTorch 33 and was trained for 200 epochs with a batch size of 64, using the Adam solver 34 with a learning rate of 1e-4 and a weight decay of 1e-5. Validation loss was monitored to avoid overfitting. The learning rate was reduced by a factor of 0.2 of its previous value when the validation loss did not decrease for three epochs. The training procedure took ∼20 min on an NVIDIA Quadro P2000, and the testing procedure took <0.1 s to generate a perturbation from one set of target measurements.

Image Reconstruction and Quantification
The DOT inverse problem was linearized using the Born approximation and was formulated as the following regularized optimization problem: 35 E Q -T A R G E T ; t e m p : i n t r a l i n k -; e 0 0 2 ; 1 1 6 ; 2 2 5 Here, δμ a represents the unknown changes in target μ a compared with the reference side, W is a sensitivity matrix calculated using the Born approximation, and λ is a regularization parameter. The conjugate gradient algorithm and a dual mesh scheme using coregistered US guidance were employed to solve the inverse problem. 36 For patient data, the total hemoglobin (tHb) distribution was calculated using the reconstructed absorption maps of all four wavelengths. For phantom data, the structural similarity index (SSIM) was used to evaluate the similarity between the reconstructed images and the ground truths. To evaluate the reconstructed image quality of patient data that did not have ground truths, we used a semiautomated CNN to segment lesions from the coregistered US images of the patient data. 37,38 After each lesion was segmented, its xand z-centroids were calculated (assuming the y-centroid to be 0), and the distance between the centroid and the reconstructed lesion center of mass was calculated. For each depth in the reconstructed image, we defined the lesion area to be three times the US-segmented width from the x∕y-centroid, defined outside to be the artifact area, and then we calculated maxðtHb target Þ∕ maxðtHb artifact Þ.
The Wilcoxon rank sum test was used to evaluate the statistical significance, with a p-value of <0.05 being considered statistically significant. All image reconstruction and statistical tests were conducted using MATLAB 2021a.

Testing on Simulation Data
For the 180 sets of simulation data in the test set, the averaged absolute difference between the reconstructed maximum μ a from ANN predicted perturbation and from matched perturbation was 0.0068 cm −1 (95% CI: 0.0062 to 0.0074 cm −1 ). Figure 2 shows one ground truth, simulated perturbation without mismatch, simulated perturbation with chest wall mismatch, ANNpredicted perturbation, and their corresponding reconstructed images. When there is no mismatch, the real part of the measured perturbation is negative and localized. When the reference side chest wall is shallower than the target side chest wall, both the real and imaginary parts of the perturbation move in the positive direction, and the target is not reconstructed. The ANNpredicted perturbation is similar to the perturbation without mismatch, and its corresponding reconstructed image shows a similar target shape, depth profile, and μ a to the image using perturbation without mismatch. Fig. 2 Ground truth, simulated perturbation without chest wall mismatch (middle row, left), simulated perturbation with chest wall mismatch (middle row, middle), ANN-predicted perturbation (middle row, right), and their corresponding reconstructed images (bottom row). The target has μ a ¼ 0.15 cm −1 , radius ¼ 1 cm, and depth ¼ 1.5 cm. Chest wall depth is 3 cm on the target side and the matched reference side and is 2 cm on the mismatched reference side. Figure 3 shows images of phantoms reconstructed using the ANN-predicted perturbation and the measured perturbation. For high contrast targets, the two methods give similar reconstructed target sizes, shapes, and μ a values. However, for low contrast targets, the reconstruction using the measured perturbation has distorted shapes, and the reconstructed μ a values are lower than the ground truths, whereas the reconstruction using the ANN-predicted perturbation gives shapes and μ a closer to the ground truths. Figure 4 shows the maximum reconstructed μ a for high and low contrast targets with different radii and depths. In all tested cases, except for the high contrast target with a 0.5-cm radius and a 1.5-cm depth, the ANN-predicted perturbations give the maximum reconstructed μ a values closer to the ground truth than the measured perturbation does.   4 The maximum reconstructed μ a for (a) high and (b) low contrast targets with different radii (r ) and depths (d ). Figure 5 shows the SSIM values between the reconstructed images using the corresponding ground truths. For all tested cases, images generated using ANN-predicted perturbations have SSIM values higher than or similar to (difference < 3%) those using measured perturbations. Figure 6 shows measured perturbations without and with chest wall mismatch and the ANNpredicted perturbation and their corresponding reconstructed images. When there is no chest wall mismatch, the real part of the measured perturbation is negative and localized, with a few outliers. The reconstructed target is mostly located in the bottom layer. When the reference side chest wall is shallower than the target side chest wall, both the real and imaginary parts of the measured perturbation move in the positive direction, and the target is not reconstructed. The ANN-predicted perturbation generated an image that shows the target with three layers, which is closer to the ground truth than the other two.   Figure 7(b) shows a 52-year-old patient with a benign cyst that is located 1.5 cm deep and measures 2.8 and 2.0 cm in the x and z directions, respectively. In Figs. 7(a) and 7(b), no significant mismatch was found by evaluating the coregistered US images and the measured perturbations. Compared with the measured perturbation, the ANN-predicted perturbation generated images with similar depth profiles and the maximum absorption coefficients, but they are more focused and centralized. Figure 7(c) shows a 48-year-old patient with a small pseudoangiomatous stromal hyperplasia that measures 1.4 and 0.8 cm in the x and z directions, respectively. By evaluating the measured perturbation, we found that there was a mismatch between the lesion and reference side measurements. In the reconstructed image, the lesion was not properly reconstructed and did not show up. Using the proposed deep learning-based approach, the lesion was successfully reconstructed in the corresponding area of the US image, using the perturbation generated from the target side measurements only. Figures 8(a) and 8(b) show box plots of the ratio of the maximum values of the reconstructed target tHb to the artifact tHb and the reconstructed center of mass to US segmented centroid distances for all 56 cases, respectively. The images reconstructed using the ANN-predicted perturbations have higher target tHb/artifact tHb ratios, and the lesions are closer to the US-segmented lesion positions. Figure 9 shows box plots of the maximum tHb for 21 low-risk benign lesions, 22 fibroadenomas (fibro) or PL, and 13 malignant cases, reconstructed using the measured and ANNpredicted perturbations. The tHb values calculated for all target files were averaged for each patient. The tHb values reconstructed using the ANN-predicted perturbations are overall lower than those using the measured perturbations. The tHb maps of low-risk benign and malignant lesions reconstructed from the ANN-predicted perturbations are significantly different, although the p-value is slightly higher than those reconstructed from the measured perturbations calculated using the best manually selected references. For reconstructions using ANN-predicted perturbations, the PL/fibro group has larger variations than the other two groups.

Discussion
We designed an ANN to generate data for DOT difference imaging from target measurements only. The ANN was trained and validated using simulation data that included a wide range of realistic background tissues as well as target sizes, shapes, depths, and optical properties to account for individual differences in patients. To simulate the noise in actual measurements, we included Monte Carlo simulations that contain photon noise and added Gaussian noise to the simulated target measurements. The model was then tested with phantom and clinical data. It performed comparably to the approach using measurements without mismatch, and it outperformed the approach using measurements when there was a mismatch between the target and reference measurements. This is because the ground truths for training the ANN are the perturbations calculated from matched target and reference measurements. Hence, unlike the traditional approach in which the perturbation highly depends on the reference measurements, the ANN will always output a matched perturbation. This predictive approach can simplify data Fig. 6 Ground truth for phantom experiment with chest wall (top row), measured perturbation without chest wall mismatch (middle row, left), measured perturbation with chest wall mismatch (middle row, middle), ANN-predicted perturbation (middle row, right), and their corresponding reconstructed images (bottom row). The target has μ a ¼ 0.11 cm −1 , radius ¼ 1 cm, and depth ¼ 2 cm. Chest wall depth is 3 cm on the target side and the matched reference side and is 2 cm on the mismatched reference side.  acquisition and mitigate mismatch problems in DOT difference imaging. It also improved the image quality in terms of the target-to-artifact-ratio and lesion localization.
This study has several limitations. First, the simulations included in the training set are limited. The proposed approach can be further improved using more complex simulations, such as larger lesions with peripheral Hb distribution, off-center lesions, and complex chest walls. Also, because perturbations without system noise were not available, we do not have ground truths for the phantom experiments, so we cannot fine-tune the ANN model with phantom experiments. However, when testing the trained model on phantom data, most of the images reconstructed using ANN-predicted perturbations were closer to the ground truth than those using true measurements without mismatch, indicating a good generalization of the model on experimental data. For patient data, the tHb values reconstructed using ANN-predicted perturbation are overall lower than those using measured perturbation; thus, a new and lower threshold for separating benign from malignant lesions must be set. The performance of the ANN is better than or comparable to that of the measured perturbation, regardless of target sizes, for phantom data (Figs. 4 and 5). Nevertheless, in clinical data, we found that the ANN performs better than the measurements in distinguishing smaller benign and malignant lesions with z dimensions ≤1 cm but worse in distinguishing larger lesions. This relatively poor performance likely reflects the fact that larger lesion produced perturbations are more difficult to completely separate from the tissue background. In addition, the ANN performance for 808 and 830 nm is worse than 730 and 785 nm, which may be related to the sensitivity of these two wavelengths to the chest wall. For a prospective clinical application, we would streamline the reconstruction based on lesion size measured by US and decide if the ANN model or measured perturbation would be used for reconstruction. Finally, for other DOT systems with different configurations, the ANN may be modified according to the specific DOT systems and retrained with the new data.

Disclosures
The authors declare that there are no conflicts of interest related to this article.