Overconstrained library-based fitting method reveals age- and disease-related differences in transcutaneous Raman spectra of murine bones

Abstract. Clinical diagnoses of bone health and fracture risk typically rely on measurements of bone density or structure, but the strength of a bone is also dependent on its chemical composition. Raman spectroscopy has been used extensively in ex vivo studies to measure the chemical composition of bone. Recently, spatially offset Raman spectroscopy (SORS) has been utilized to measure bone transcutaneously. Although the results are promising, further advancements are necessary to make noninvasive, in vivo measurements of bone with SORS that are of sufficient quality to generate accurate predictions of fracture risk. In order to separate the signals from bone and soft tissue that contribute to a transcutaneous measurement, we developed an overconstrained extraction algorithm that is based on fitting with spectral libraries. This approach allows for accurate spectral unmixing despite the fact that similar chemical components (e.g., type I collagen) are present in both bone and soft tissue. The algorithm was utilized to transcutaneously detect biochemical differences in the tibiae of wild-type mice between 1 and 7 months of age and between the tibiae of wild-type mice and a mouse model of osteogenesis imperfecta. These results represent the first diagnostically sensitive, transcutaneous measurements of bone using SORS.


Introduction
Although the strength of a bone is dependent on its chemical composition, 1 technologies that are used to monitor fracture risk primarily measure bone density or structure. For example, bone mineral density (BMD), which is typically measured with dual-energy x-ray absorptiometry, is used clinically as an indirect indicator of bone fragility. Despite its ubiquity, BMD provides limited information about chemical composition, and its discriminatory capacity to identify peri-and early postmenopausal women at high risk of fracture is low. 2 Vibrational spectroscopy has been used extensively in ex vivo studies to measure the chemical composition of both the mineral and organic matrix components of bone. 3,4 Specifically, Raman spectroscopy has revealed chemical perturbations to cortical bone in animal studies of aging, 5 lead exposure, 6 osteogenesis imperfecta, 7 early-onset osteoarthritis, 8 rheumatoid arthritis, 9 and glucocorticoid-induced osteoporosis 10,11 and in a study of women with postmenopausal osteoporosis. 12 Many studies have also revealed correlations between Raman spectroscopy-based measures of chemical composition and the biomechanical properties of bone. 5,13,14 In fact, preliminary studies from our group indicate that Raman spectra can generate predictions of biomechanical strength and toughness that are more accurate than those produced by the clinically used parameter of BMD. 6,9,10 These results on exposed-bone samples have inspired attempts to perform bone spectroscopy transcutaneously. [15][16][17][18][19][20][21][22][23][24] Tissue phantoms with controlled optical and chemical properties have also been developed to model transcutaneous measurements. [25][26][27] The majority of these studies have used spatially offset Raman spectroscopy (SORS) [28][29][30][31][32][33][34][35][36][37] in conjunction with scaled subtractions 18 or self-modeling curve resolution algorithms, such as band target entropy minimization (BTEM), 38,39 to separate the interfering spectrum of soft tissue from the spectrum of the underlying bone. In essence, these approaches consist of using multiple source-detector combinations to interrogate volumes with different ratios of bone and soft tissue and then exploiting the diversity in the spectral data to elucidate the bone signature. A number of transcutaneous studies employing these methods have reported plausible estimates of underlying bone spectra, with high correlations to reference measurements on exposed bones. [20][21][22][23] It has not yet been shown, however, that such methods are accurate enough to resolve the subtle spectral differences associated with disease and increased fracture risk.
In this publication, a new spectral extraction method is presented and compared to BTEM. The new method is overconstrained to fit a set of transcutaneous measurements of bone with varying amounts of only two spectra (one of bone and one of soft tissue), each of which is built from a unique and separately acquired spectral library. Via two-layer simulations, we demonstrate that the new method is robust when similar chemical components (e.g., type I collagen) are present in both layers, while the assumptions underlying BTEM make it fundamentally unstable. We also show experimentally that applying the overconstrained, library-based fitting method to data acquired transcutaneously from murine tibiae produces accurate estimates of spectra acquired directly from the underlying bone. Analysis shows that these estimates are sensitive to age-and disease-related chemical differences.

Biochemistry of Bone and Overlying Soft Tissues
Bone is a composite of inorganic mineral and organic matrix components. The inorganic component is made up of calcium and phosphate and is often characterized as a carbonated, poorly crystalline hydroxyapatite. The organic matrix of bone is composed of approximately 90% type I collagen by dry weight. 40 The biomechanical properties of bone and its susceptibility to fracture in both health and diseases is affected by changes to these components and is not limited to BMD. Bone can be classified into two types, cortical bone and trabecular bone, on the basis of porosity and microstructure. The long bones, such as the tibia, have an outer shell of dense cortical bone with the proximal and distal ends, in the regions known as the epiphyses and metaphyses, reinforced internally by a more porous, mesh-like trabecular bone.
The primary soft tissues that contribute interfering signals to transcutaneous Raman measurements include dermal tissue, adipose tissue, and muscle. Transcutaneous Raman spectroscopy is particularly challenging since type I collagen is abundant both in the organic matrix of bone and in soft tissue (e.g., dermal tissue is composed of approximately 80% type I collagen by dry weight 40 ). Although it exhibits a unique pattern of covalent cross-linking in bone, 41 type I collagen is genetically identical in all tissues, 42 so determining the amount to assign to each tissue region is difficult. Representative Raman spectra of murine bone and soft tissue are displayed in Fig. 1.

Spatially Offset Raman Spectroscopy and Band Target Entropy Minimization
In SORS, Raman scattered light is collected from surface locations that are spatially offset from the illumination location. The spectral contribution from the subsurface layers relative to the contribution from the superficial layer increases with the magnitude of the offset. In the case of transcutaneous Raman spectroscopy of bone, spectra have typically been acquired at multiple spatial offsets. The variation between spectra is then analyzed, often with BTEM, to estimate the spectrum of the underlying bone. BTEM is an example of a self-modeling curve resolution algorithm; i.e., it uses a set of mixture spectra (in the present case, from SORS data sets) to estimate the spectrum of a single pure component. The spectral estimate is constructed from the principal components of the mixture spectra based on the minimization of an objective function. In the case of transcutaneous measurements of bone, the dominant PO 3− 4 ν 1 peak (the largest peak in Fig. 1) is targeted, and the BTEM process dictates how the principal components are coadded to render a spectrum exhibiting that peak and others. Despite the success of BTEM in other applications, we show in this publication that the assumptions made by the algorithm are insufficient to produce accurate estimates of bone spectra due to the fact that spectrally similar chemical components (e.g., type I collagen) are present in both bone and soft tissue.

Simultaneous, Overconstrained, Library-Based Decomposition
In order to provide improved estimates of bone spectra extracted from transcutaneous measurements, we developed an overconstrained method that fits transcutaneously acquired data with spectral libraries. In this study, the method was used to model transcutaneous Raman data with two spectra, one built from a bone library and the other built from a soft-tissue library. In general notation, the simultaneous, overconstrained, library-based decomposition of m spectra that span the same space as n spectral components measured over l pixels is where M is an l × m matrix of measured spectra (e.g., SORS data), S is an l × n matrix of the underlying spectral components (e.g., bone and soft tissue), W is an n × m matrix of the relative weights of each spectral component, and E is an l × m error matrix of spectral features not fit by the model (in this section, boldface uppercase characters represent matrices, boldface lowercase characters represent column vectors, and lowercase italic characters represent scalars). For ease of reference, we refer to this fitting method (simultaneous, overconstrained, library-based decomposition) as SOLD for the remainder of this article. As noted earlier, utilizing transcutaneously acquired Raman data, BTEM has been used to estimate the underlying spectrum of bone (a single column of S) by asserting that the spectrum has a peak near 960 cm −1 (the PO 3− 4 ν 1 peak). In the SOLD model, the a priori information is more extensive: the assertions are that the number of spectral components, n, is known and that the regions of spectral space from which the columns of S can be built are each bounded by a separately acquired library of spectra. The k'th spectral column of S is related to its parent library via (2) where Lfkg is the library used to model the k'th spectral component and cfkg is a vector of fit coefficients that, along with the coefficients of W, are tuned to minimize the sum of squared errors, E. In order to prevent overfitting, cfkg and W are required to be non-negative and the model is overconstrained by requiring that the number of measured spectra, m, be greater than the number of underlying spectral components, n.
In this publication, each transcutaneous measurement of bone submitted to SOLD consisted of m ¼ 3 spectra acquired at different source-detector separations. The SOLD model was overconstrained by fitting this data with n ¼ 2 spectral components, one for bone and the other for the overlying soft tissue, which allowed for accurate spectral unmixing despite the fact that similar chemical components (e.g., type I collagen) are present in both bone and soft tissue. We elected to set the bounds on the estimated bone and soft-tissue spectra empirically, by acquiring a library of measurements of each tissue type and building our spectral estimates out of non-negative amounts of those spectra only. Other approaches, such as models with tunable spectral peak heights and shifts, could be substituted for the experimentally acquired libraries of spectra. Finally, it should be noted that the SOLD approach is robust against sample-to-sample variability in tissue thickness and optical properties (e.g., scattering and absorption) since the weighting coefficients, W, were not fixed a priori, but rather only required to be non-negative. The MATLAB® (version 7.8, The MathWorks™, Natick, MA) code used to fit transcutaneous data sets with SOLD can be obtained by contacting the corresponding author (A.J.B.).

Instrumentation and Spectral Processing
All soft-tissue, exposed-bone, and transcutaneous-bone spectra were acquired on a locally constructed Raman spectroscopy system that utilized an 830-nm semiconductor laser (Model PI-ECL-830-500-FS, Process Instruments Inc., Salt Lake City, UT) to deliver approximately 150 mW of power to the sample as shown in Fig. 2(a). The laser light was filtered with a bandpass filter (Chroma Technology Corp., Bellows Falls, Vermont) and the illumination numerical aperture was 0.34.
In order to collect the Raman scattered light, lenses L3 and L4 imaged the sample plane onto the face of an optical fiber bundle. The fiber bundle contained 61 multimode optical fibers with 100∕120 μm core/cladding diameters arranged in a circular array at the collection end as shown in Fig. 2(b). The circular array consisted of a center fiber surrounded by four annuli of fibers. For the transcutaneous measurements, the set of spectra acquired by all of the fibers was submitted to BTEM in accordance with the methods used in previous publications. For the transcutaneous measurements submitted to SOLD, the data were binned together yielding three total spectra per acquisition (one from the center fiber and first annulus, one from the next annulus, and one from the last two annuli). For the exposedbone and soft-tissue measurements, the spectra from all of the fibers were averaged yielding a single spectrum per acquisition.
Each fiber in the fiber bundle viewed a spot of diameter 100 μm at the sample surface. For the transcutaneous measurements, the illumination spot (1∕e 2 full-width ¼ 230 μm) overlapped with the image of the center fiber of the collection fiber bundle, introducing a range of spatial offsets between the illumination spot and the images of the collection fibers. Although the power delivered by our instrument is greater than the maximum permissible exposure set by the American National Standards Institute, 43 no thermal damage was observed. In addition, no adverse consequences were reported in other nearinfrared Raman spectroscopy studies of a variety of tissues (including skin) that used power levels greater than ours. 44 For the exposed-bone and soft-tissue measurements, the illumination spot was defocused (1∕e 2 full-width ¼ 1000 μm), by adjusting the position of the delivery end of the multimode fiber (MMF in Fig. 2), to overlap with the image of the entire collection fiber bundle. The Appendix contains additional information about the instrument and data processing routines.
After processing, the transcutaneously acquired spectra (and simulated spectra described below) were submitted separately to Fig. 2 (a) Raman spectroscopy system for transcutaneous and exposed-bone measurements. Light gray corresponds to the laser illumination path and dark gray corresponds to the Raman scattering collection path. Abbreviations: ST, shutter; L, lens; MMF, multimode fiber; BP, bandpass filter; BS, dichroic beam-splitter; MTS, motorized translation stage; NF, notch filter; FB, fiber bundle; SPEC, spectrograph; HG, holographic grating; CCD, charge-coupled device. (b) Image of the fiber bundle geometry. For the transcutaneous measurements submitted to SOLD, the data were binned together yielding three total spectra per acquisition (one from the center fiber and first annulus, one from the next annulus, and one from the last two annuli) as indicated by the semi-transparent green circle and rings. The circular array of collection fibers is rearranged into a line for delivery to the imaging spectrograph so that the Raman scattered light collected by each fiber can be separately recorded.
the SOLD and BTEM algorithms to estimate the spectrum of the underlying bone. The smallest subset of principal components that explained at least 70% of the total variance in each transcutaneous data set was submitted to BTEM, the PO 3− 4 ν 1 peak near 960 cm −1 was targeted, and the spectral estimate was formed by minimizing the objective function suggested by Ong et al. 39 The Nelder-Mead downhill simplex method 45 was used to minimize the BTEM objective function.

Simulated Data
Prior to analyzing experimental, transcutaneously acquired data, a simulated data set was constructed to compare the performances of SOLD and BTEM. The parameters described in this section were chosen to generate data sets that resemble transcutaneous spectra acquired from murine tibiae.

Generation of spectral libraries (L) used to fit simulated data
Spectra of some of the tissue types and chemicals present in bone and soft tissue were acquired as a part of this study. The average spectra of dermal tissue, adipose tissue, and muscle tissue acquired from multiple sites on four different mice are displayed in Fig. 3. Pure ceramic hydroxyapatite (Clarkson Chromatography Products Inc., South Williamsport, PA) was used to represent bone mineral, and dermal tissue was used as a surrogate for the organic matrix of bone since the chemical compositions of these components are similar (approximately 80 and 90% type I collagen, by dry weight, for dermal tissue and organic bone matrix, respectively 40 ). The spectral libraries used to generate and fit the simulated data sets were chosen to create an idealized comparison between SOLD and BTEM. More comprehensive libraries were constructed to fit experimental data as described in Sec. 4.3.

Generation of simulated spectral components (S)
To generate a simulated spectrum of bone, a linear combination of the hydroxyapatite spectrum [ Fig. 3(a)] and the organic bone matrix spectrum [ Fig. 3(b)] was formed, where the ratio of the coefficient multiplied by the hydroxyapatite spectrum to the coefficient multiplied by the organic bone matrix spectrum was varied over a factor of 3. Similarly, a linear combination of the spectra of dermal tissue, adipose tissue, and muscle tissue [ Fig. 3(c), 3(d), and 3(e), respectively] formed each simulated spectrum of soft tissue, where the coefficients multiplied by each spectrum were varied over a factor of 10. Finally, the simulated bone and soft-tissue spectra were normalized to their mean absolute deviations (similar to a calculation of the root mean squared spectral intensity).

Generation of simulated transcutaneous spectra (M) measured at a single location
The process of simulating the transcutaneous spectra from a single-site measurement is summarized in Fig. 4. The ensemble of SORS spectra was constructed by forming linear combinations of a single pair of bone and soft-tissue spectra [components of a single S matrix; Fig. 4(a) and 4(b), respectively]. For each source-detector pairing, the bone spectrum was multiplied by a coefficient between 0 and 2∕3 and added to the soft-tissue spectrum. This linear combination was normalized to its mean absolute deviation and added to a spectrum of zero-mean, Gaussian noise [ Fig. 4(c)] to form a single transcutaneous spectrum [ Fig. 4(d)]. This process was repeated to generate 30 transcutaneous spectra [ Fig. 4(e), 4(f), and 4(g)] with varying contributions of the same single pair of bone and soft-tissue spectra. Transcutaneous spectra without added noise were also generated.

BTEM-and SOLD-produced estimates of simulated bone spectrum (one column of S)
The set of 30 transcutaneous spectra was submitted to BTEM to estimate the underlying spectrum of bone. To simulate SORS measurements acquired at three different spatial offsets (corresponding to the three regions depicted in Fig. 2), the 30 transcutaneous spectra were sorted according to the contribution of the underlying bone lineshape. The spectra were then grouped based on their rank order [ Fig. 4(e), 4(f), and 4(g)] and each set of spectra was summed yielding three composite spectra [ Fig. 4(h), 4(i), and 4(j)] that were submitted to SOLD. The entire procedure described in Sec. 4.2 to generate and process simulated data is depicted as a flow chart in Fig. 5. This procedure was repeated 100 times (with different pairs of underlying bone and soft-tissue spectra and with or without added noise) to simulate transcutaneous data acquired from different samples. wild-type (WT) littermates (four per group). The oim∕oim mouse is a model of type III osteogenesis imperfecta, which is a genetic, musculoskeletal disorder characterized by mutations in type I collagen. Mice that are homozygous for the oim mutation are deficient in proα2ðIÞ collagen, which also affects the level of mineralization and ultimately results in a weaker bone. 46 In addition to the oim∕oim mice and WT controls, seven C57BL/6J WT mice (between 1 and 7 months of age) and six transgenic mice that constitutively overexpress a human tumor necrosis factor-α transgene 11,[47][48][49] (between 5 and 14 months of age; C57BL/6J background) were studied; the variations due to age and disease caused a greater spread in the biochemical properties of the bones. The mice that overexpress human tumor necrosis factor-α develop chronic inflammation and arthritis with secondary osteoporosis, while C57BL/6J mice are a commonly used inbred mouse strain. The mice were euthanized by inhalation of carbon dioxide, fur was removed from the right hindlimb with a depilatory agent (hair remover cream, Nair®, Church & Dwight Co., Inc., Princeton, NJ), and transcutaneous Raman spectra were acquired from all 21 intact mice. The samples were hydrated for the duration of each measurement in order to prevent laser-induced thermal damage of the tissue and spectra were acquired from the medial side of the mid-diaphysis region over an area of approximately 5 mm 2 ð5 mm × 1 mmÞ. For each mouse, five transcutaneous measurements were acquired spaced 1 mm apart along the shaft axis of the tibia with an integration time of 5 min per location. The average thickness of soft tissue overlying the measurement sites was approximately soft-tissue spectra were generated from the spectra in Fig. 3. A spectrum of zero-mean, Gaussian noise (c) was added to a linear combination of the bone and soft-tissue spectra to form a single transcutaneous spectrum (d). This process was repeated to generate 30 transcutaneous spectra [(e), (f), and (g)] with varying contributions of the bone and soft-tissue spectra. These 30 spectra represent a transcutaneous data set acquired from a single measurement site. The set of 30 spectra was submitted to BTEM to estimate the underlying spectrum of bone. In addition, each set of 10 spectra grouped together in (e), in (f), and in (g) was summed, yielding three spectra [(h), (i), and (j)] that were submitted to SOLD. The entire process summarized in this figure was repeated 100 times (with different pairs of underlying bone and soft-tissue spectra) to simulate transcutaneous data acquired from different samples. The gray bars highlight the PO 3− 4 ν 1 peak, which is present in the spectrum of bone. The amplitude of this peak qualitatively represents the contribution of bone to each transcutaneous spectrum. Fig. 5 Flow chart depicting the procedure used to generate and process simulated data with BTEM and SOLD. This procedure was repeated 100 times (with different pairs of underlying bone and soft-tissue spectra) to simulate transcutaneous data acquired from different samples. 1 mm. This study protocol was approved by the Committee on Animal Resources at the University of Rochester.

Measurement of underlying bone spectrum (one column of S) and generation of spectral libraries (L) used to fit experimental, transcutaneous data
Following acquisition of each set of transcutaneous measurements, the soft tissue was removed and gold-standard, colocalized measurements were acquired from the exposed bone for comparison. Similar to the transcutaneous measurements, spectra were acquired from five locations spaced 1 mm apart along the shaft axis of the tibia with an integration time of 5 min per location. A total of 160 spectra of excised soft tissue, including spectra of specific tissue types (Fig. 3) as well as spectra of bulk tissue, were also acquired from the 21 mice. In analyzing the transcutaneous data from each mouse, only data from the other mice were included in the spectral libraries (i.e., leaveone-sample-out analysis). Figure 6 presents the BTEM and SOLD spectral estimates of bone extracted from the single-site, simulated data set shown in Fig. 4. All spectra have been normalized to the height of the PO 3− 4 ν 1 peak near 960 cm −1 . Each algorithm was run twice, once upon the simulated data set without added noise [ Fig. 6(a) and 6(b)] and once upon the simulated data set with added noise [Fig. 6(c) and 6(d)]. In both cases, SOLD provided a superior estimate of the true underlying bone spectrum [ Fig. 6(e)] as evidenced by the accurate reconstruction of the organic bone matrix peaks (e.g., the starred CH 2 wag peak near 1450 cm −1 ). The correlation coefficients between the true and SOLD-estimated bone spectra were both greater than 0.999, while the correlation coefficients between the true and BTEMestimated bone spectra were 0.985 and 0.968 for the transcutaneous data sets without and with added noise, respectively. All of the spectral correlation coefficients reported in this publication were calculated over the 775 to 1500 cm −1 spectral range, which was chosen to match a previous study. 23 Qualitatively similar results were observed for correlations calculated over a larger spectral range that included the amide I peak near 1665 cm −1 .

Simulated Data
As described previously, the procedure used to generate a simulated, transcutaneous data set was repeated 100 times with a variety of simulated bone and soft-tissue spectra and the transcutaneous data were submitted separately to BTEM and SOLD. The mineral/matrix ratios of all 100 estimated spectra and the corresponding true underlying bone spectra are compared in Fig. 7. The mineral/matrix ratio of each bone spectrum was calculated as the ratio of the fit coefficients produced by leastsquares fitting with the pure spectra of hydroxyapatite and organic bone matrix [ Fig. 3(a) and 3(b), respectively].

Experimental Data
A representative, transcutaneously acquired data set and the corresponding SOLD fit are presented in Fig. 8. As described previously, the transcutaneously acquired spectra were binned together yielding three total spectra per acquisition [ Fig. 8(a), green circles] prior to fitting with SOLD. Variation between these spectra is due to differences in the relative sampling of the bone and soft tissue, which is due to different spatial offsets between the illumination spot and the image of each annulus of collection fibers on the surface of the sample [ Fig. 2(b)]. All three transcutaneously acquired spectra are well fit by a single bone and a single soft-tissue spectrum [ Fig. 8(c) and 8(d), respectively]; as a reminder, these two spectra are formed by linear combinations of the appropriate spectral library (Lfkg). Spectrum (c) in Fig. 8 is thus the SOLD estimate of the underlying bone spectrum for this particular sample and location on the tibia. Figure 9 presents the BTEM and SOLD estimates of the underlying bone spectrum extracted from the data set displayed in Fig. 8. All spectra have been normalized to the height of the PO 3− 4 ν 1 peak near 960 cm −1 . Again, SOLD provided a superior estimate of the true underlying bone spectrum [ Fig. 9(c)] as evidenced by the accurate reconstruction of the organic bone matrix peaks (e.g., the starred CH 2 wag peak near 1450 cm −1 ). The correlation coefficient between the exposed-bone measurement and the SOLD-estimated bone spectrum was 0.996, whereas the correlation coefficient between the exposed-bone measurement and the BTEM-estimated bone spectrum was 0.935.
The mean correlation coefficients between all of the SOLDestimated and BTEM-estimated bone spectra and the corresponding gold-standard reference measurements on exposed bone are compared in Table 1. The mean correlation coefficient between the exposed-bone spectra acquired from all of the mice described in Sec. 4.3.1 and the mean correlation coefficient reported in a study by Schulmerich et al. 23 are also tabulated. The Fisher transformation, 50 along with a Student's t test, was used to compare the correlations produced by each of the methods. SOLD produced spectra with the greatest correlations to the corresponding reference measurements and the difference between the mean correlation coefficient produced by SOLD and the mean correlation coefficient produced by each other method was statistically significant. (e) True simulated bone spectrum for comparison. To guide comparison, the asterisk highlights the CH 2 wag peak, which was more accurately reconstructed with SOLD.
The SOLD-estimated spectra were also used to predict the mineral/matrix ratios of the corresponding bones. A leaveone-out cross-validation approach utilizing partial least squares regression (PLSR) 51 was used to generate predictions based on the full SOLD-estimated spectra. A scatter plot of these predictions (referred to as PLSR-SOLD) versus the mineral/matrix ratios calculated from the corresponding exposed-bone spectra is displayed in Fig. 10. For the experimental, exposed-bone data  Fig. 9 Comparison of BTEM and SOLD bone-spectrum estimation methods operating on experimental data. (a) BTEM and (b) SOLD estimates of exposed-bone spectrum from representative transcutaneously acquired data set presented in Fig. 8. (c) True exposed-bone spectrum for comparison. To guide comparison, the asterisk highlights the CH 2 wag peak, which was more accurately reconstructed with SOLD.
sets, the mineral/matrix ratios were calculated as the ratio of the height of the PO 3− 4 ν 1 peak near 960 cm −1 to the heights of the peaks near 855 and 880 cm −1 , which are due to vibrations of the amino-acids proline and hydroxyproline, respectively. 52 All mineral/matrix ratios were normalized to the mean mineral/matrix ratio of the two-month-old WT mice. The relationship between the PLSR-SOLD-derived mineral/matrix ratios and the ratios calculated from the exposed-bone spectra was quantified by a chi-squared test. This test was used to compare the variance of the errors in the PLSR-SOLD estimates (0.030) with the variance of the exposed-bone mineral/matrix ratios (0.176). A one-tailed test revealed that the error variance was significantly less than the variance in the exposed-bone measurements (p < 0.01), suggesting that SOLD, in conjunction with PLSR, is capable of extracting useful estimates of mineral/matrix ratios.
Analysis of the data from the 11 WT mice between 1 and 7 months of age revealed a statistically significant correlation between age and mineral/matrix ratio (r ¼ 0.967, p < 0.01). This correlation with age was preserved in the PLSR-SOLDproduced mineral/matrix ratios (r ¼ 0.912, p < 0.01). Regarding disease differences, the mineral/matrix ratios of the two-monthold oim/oim mice and WT littermates are compared in Fig. 11. Both the exposed-bone and PLSR-SOLD-derived ratios completely separate the two groups of mice.

Discussion
The results of this study demonstrate that diagnostically sensitive estimates of the chemical composition of bone can be extracted from transcutaneous measurements using a spectral unmixing method that is overconstrained to fit the data with varying amounts of only two spectra (one of bone and one of soft tissue), each of which is built from a unique and separately acquired spectral library. A number of other studies have utilized BTEM to estimate bone spectra from transcutaneous Raman measurements. [17][18][19][20][21][22][23] As shown here, however, BTEM is unable to accurately reconstruct spectra of individual layers if spectrally similar chemical components (e.g., type I collagen) are present in more than one layer. This limitation is due to the algorithm not incorporating sufficient a priori information. In particular, BTEM does not assert that the transcutaneously acquired data are made up of a fixed number of spectral components.
In the simulated data sets without noise, BTEM consistently overestimated the mineral/matrix ratios of the underlying bone spectra [ Fig. 7(a)]. This behavior can be understood by considering the following simplified picture. Since the variation in each simulated transcutaneous data set [e.g., Fig. 4(e), 4(f), and 4(g) without noise] is due to a single bone and a single soft-tissue spectrum [e.g., Fig. 4(a) and 4(b), respectively], the first two principal components span the same space as these spectra. The spectral estimate formed by BTEM (built from these two components) can therefore be thought of as the linear combination of bone and soft-tissue spectra that minimizes the objective function. Since bone and soft tissue have significant levels of spectrally similar protein components (e.g., type I collagen), the bone and soft-tissue spectra can be combined in a weighted subtraction to produce a mineral-dominated Table 1 Mean correlation coefficients between estimated and/or directly measured bone spectra. The SOLD and BTEM entries refer to correlations between estimated bone spectra and their corresponding reference measurements. The second BTEM entry lists the result reported by Schulmerich et al. 23 in the most recent and comprehensive work that can be directly compared to our study. The mean correlation among the spectra of exposed bones from different mice is also listed. Notice that only SOLD produced a mean correlation coefficient greater than this baseline. Statistically significant differences were tested for with unpaired Student's t tests operating on the Fisher-transformed correlation coefficients. The correlation coefficients produced by each method were compared with those produced by SOLD and in each case the difference was statistically significant.  Fig. 10 Scatter plot of the PLSR-SOLD-estimated mineral/matrix ratios versus the ratios from the corresponding exposed-bone spectra. The filled-in marker (highlighted by the arrow annotation) corresponds to the mineral/matrix ratios of the spectra presented in Fig. 9.  Fig. 11 Bar plots of the mean mineral/matrix ratios of WT (white) and oim∕oim (gray) mice calculated from exposed-bone spectra and the corresponding PLSR predictions based on the SOLD-estimated spectra. Each circular marker represents the mineral/matrix ratio of a single sample. * p < 0.01 as determined by a two-sample, unpaired Student's t test.

Method
spectrum. Since this reduces the intensity of nontargeted bands without making any go negative, BTEM finds this linear combination attractive and, as seen in Fig. 6(a), the nonmineral peaks (e.g., the CH 2 wag peak near 1450 cm −1 ) in the BTEMderived spectrum are indeed underestimated. This behavior is a fundamental limitation that cannot be allayed by adjusting the parameters of the algorithm. When noise was added to the simulated data, the mineral/ matrix ratios of bone spectra estimated with BTEM were biased in the opposite direction [ Fig. 7(c)]. BTEM customarily penalizes negative spectral intensities, including those associated with noise, which can lead to overestimates in spectral intensity of nontargeted peaks [ Fig. 6(c)]. Although this particular bias could potentially be mitigated by tuning the objective function, the magnitude of the bias is dependent on the magnitude of the noise in the measured data and its principal components, and applying a correction that is robust over a variety of measurement settings is difficult.
Many studies have reported correlations between spectral estimates extracted from transcutaneous measurements with BTEM and reference measurements on exposed bones. [20][21][22][23] However, these studies did not provide context to the magnitudes of the correlations. For example, the mean spectral correlation coefficient reported in a study by Schulmerich et al. 23 was 0.96, a value that in general implies a useful degree of correlation. In our study, however, the spectra acquired from different bones were so similar that the mean spectral correlation coefficient among exposed bones from different mice was 0.988, while the mean spectral correlation with the corresponding BTEM estimates was 0.935. In other words, each spectrum of exposed bone resembled the corresponding BTEM estimate less than it resembled the spectra of other exposed bones. In contrast, the mean correlation coefficient between the SOLDestimated spectra and the corresponding exposed-bone spectra was 0.996, which was statistically significantly greater than the mean spectral correlation coefficients associated with the other methods.
There are many approaches that can be utilized to estimate spectra of subsurface layers in multilayer samples, including, for example, optical depth-sectioning with confocal microscopy. Unfortunately, confocal microscopy is not well suited to measure bone transcutaneously since the penetration depth of this technique is limited to the reduced mean free path of the sample, which is typically a few hundred microns for skin measured at visible and near-infrared wavelengths. 53,54 Another broad method is to sample the layers at various source-detector separations (e.g., SORS) and analyze the variation among the spectra. Application of one or more a priori constraints (e.g., targeting a peak as in BTEM or overconstraining a fitting algorithm by using a fixed number of spectral components as in SOLD) can drive the estimation procedure. Blind decomposition methods, 55,56 if properly constrained, could also potentially be used to estimate bone spectra based on transcutaneous measurements. The results of this study demonstrate that the mineral/matrix ratio of cortical bone, which has frequently been used as a Raman-based indicator of bone health and strength, 4-14 can be measured transcutaneously by processing SORS data sets with SOLD ( Figs. 10 and 11). Overconstraining the algorithm to fit multiple transcutaneous measurements with two lineshapes (one for bone and one for soft tissue) produced accurate estimates of the underlying exposed-bone spectra despite the fact that spectrally similar chemical components (e.g., type I collagen) are present in both bone and soft tissue.
As the first diagnostically sensitive, transcutaneous measurements of bone using SORS, our results support the feasibility of monitoring bone diseases noninvasively and in vivo in mice. Other studies have reported the transcutaneous acquisition of bone signal beneath up to 5 mm of soft tissue 16,17 and from the distal phalanx of a human thumb. 18 Although the measurement sites may be limited depending on the thickness of the overlying soft tissue, these studies, along with the results presented here, suggest that our approach could also provide a platform to noninvasively measure bone biochemistry in human patients.

Appendix: Additional Description of Raman Instrument and Data Processing Routines
The optical fiber bundle depicted in Fig. 2 delivered the Raman scattered light to a spectrograph (HolospecTM VPT System, Kaiser Optical Systems Inc., Ann Arbor, MI), where it was dispersed onto a 1024 × 256 array back-illuminated deep-depletion CCD camera, CCD1 (Model DU 420-BR-DD, Andor Technology, Belfast, Northern Ireland), achieving a spectral resolution of approximately 6 cm −1 . A dichroic beam splitter (Chroma Technology Corp., Bellows Falls, Vermont) and a holographic notch filter (Kaiser Optical Systems Inc., Ann Arbor, MI) were used to reject the elastically scattered light and pass only the Stokes-shifted light to the spectrograph. Finally, because of the limited height of the CCD array, only spectra from the 40 central fibers of the fiber bundle were imaged onto CCD1. The position of each sample with respect to the laser illumination spot was confirmed by a white-light image of the sample plane acquired with a second CCD camera, CCD2.
Raw spectral data were imported into MATLAB® and analyzed through a number of built-in and locally written scripts. Spectral processing included cosmic ray removal, readout and dark-current subtraction, correction for the frequency-dependent response of the system, and correction for the imaging aberrations of the spectrograph system. 57 The spectrum of Raman scattered light collected by each fiber was extracted from the full image by fitting with the separately measured spectral response of each fiber. 58 Background fluorescence lineshapes were modeled and removed by fitting with a seventh-order polynomial and the spectra were smoothed with a Savitzky-Golay filter 59 over a 6 cm −1 window, chosen to match the resolution of the spectrograph. Portions of the spectra below 745 cm −1 were discarded due to the presence of a strong uncorrelated fluorescence contribution. Portions of the spectra above 1740 cm −1 were discarded due to the absence of major Raman spectral features in this region and the falloff in CCD sensitivity. Finally, the Raman shift axis was calibrated with N-acetyl-para-aminophenol, better known by the brand name TYLENOL® (McNEIL-PPC Inc., Fort Washington, PA), to correct for spectral instabilities in the excitation source.