We are developing near-infrared (NIR) Raman spectroscopy as a method to measure the concentrations of blood analytes noninvasively. In this paper we describe our recent achievements with this technology, using glucose as an example.
It is estimated that the number of people afflicted with diabetes mellitus will increase from 150 million to 220 million worldwide from 2000 to 2010.1 There are many serious long-term complications, the most significant being cardiovascular, retinal, renal and neuropathic. The Diabetes Control and Complications Trial report makes it clear that tight control of blood glucose levels, which entails frequent blood sampling, significantly delays occurrence of these complications, resulting in improved quality of life and reduced burden on the health care system.2 Conventional blood sampling methods are painful and have other undesirable features. Noninvasive (“transcutaneous”) blood sampling methods are an attractive alternative for monitoring glucose, as well as other blood analytes. Several transcutaneous techniques are under development; for a review see Ref. 3. Methods employing near-infrared (NIR) spectroscopy combined with multivariate regression analysis are among the most promising.4 5 6 Of the noninvasive techniques for measuring glucose reported in the scientific literature, none has demonstrated sufficient accuracy for nonadjunctive clinical use.7 In addition, there has been no substantial proof that the measured signals result from the actual glucose concentrations.3 Instead, it has been shown that the calibration models derived easily become over determined, and that chance correlations can be interpreted as variations in glucose concentrations.8 9 This indicates the need for a noninvasive method providing greater specificity.
In this paper we demonstrate the use of another optical technique, Raman spectroscopy, for transcutaneous monitoring of glucose concentrations. Raman spectra exhibit distinct narrow features characteristic of the molecules present in the blood-tissue matrix, including glucose. Despite its weak signals, Raman spectroscopy has been shown to provide detailed quantitative information about the chemical composition of skin (proteins and lipids),10 11 and corresponding changes associated with the development of cancer12 13 and atherosclerosis.14 Because spectra from blood or tissue are composed of contributions from many constituents, extraction of quantitative information requires use of a reliable multivariate calibration method, such as partial least-squares (PLS) regression analysis.15 PLS analysis of Raman spectra has been successfully applied to quantitative measurements of glucose and other analytes in serum16 and whole blood samples.17 The present study employs Raman spectroscopy for quantitative transcutaneous measurements. We show that glucose concentration variations in human volunteers can be quantitatively measured. We also present clear spectral evidence that the spectrum of the glucose molecule is an important part of the calibration, the first such demonstration using a noninvasive optical technique.
Materials and Methods
Raman spectra were collected by means of a specially designed instrument, optimized to collect Raman light emitted from a scattering medium (tissue) with high efficiency. The setup (Fig. 1) used an 830 nm diode laser (PI-ECL-830-500, Process Instruments, Salt Lake City, UT) as the Raman excitation source. The beam was passed through a bandpass filter (Kaiser Optical Systems, Ann Arbor, MI), directed toward a paraboloidal mirror (Perkin-Elmer, Azusa, CA) by means of a small prism, and focused onto the forearm of a human volunteer with an average power of 300 mW and a spot area of ∼1 mm2. Backscattered Raman light was collected by the mirror and passed through a notch filter (Super Notch Plus, Kaiser Optical) to reject the backscattered Rayleigh peak and the specular reflection at 830 nm. The filtered light was transferred to a spectrometer (Holospec f/1.8i, Kaiser Optical) by means of an optical fiber bundle (Romack Fiber Optics, Williamsburg, VA), which converted the circular shape of the collected light to a single row of fibers in order to match the shape of the spectrometer entrance slit. The spectra were collected by a cooled charge-coupled device (CCD) array detector (1340×1300 pixels, Roper Scientific, Trenton, NJ), corrected for the image curvature in the vertical direction caused by the spectrometer optics and grating, and then binned in the vertical direction, resulting in a spectrum with intensities at 1340 frequency intervals.
The intensity level of excitation light used in this experiment was based upon a thorough study in which tissue samples were irradiated with various fluences (J/cm2) of 830 nm light. The samples were then examined by a pathologist for changes in histology. The selected 300 mW level was substantially lower than the levels that caused histological changes. Mechanisms for cooling present in vivo, such as blood flow, were not included in this study.18 With this result as an input, our protocol was approved by MIT’s Committee on the Use of Humans as Experimental Subjects. A dermatologist examined the skin of the first volunteer before and after the measurements and observed no change. Except for one volunteer who developed a small blister, none of the volunteers experienced any discomfort during the test or exhibited any skin damage afterwards.
At this power level, our signal to noise ratio (SNR), calculated as the ratio of the collected signal to the noise at each wave number value for a 3 min measurement averaged across the spectral measurement range, 355–1545 cm−1, was 6500:1.
In Vivo Data Collection
Raman spectra were collected from the forearms of 20 healthy Caucasian and Asian human volunteers following the intake of 220 mL of a beverage (SUN-DEX) containing 75 g of glucose. For each volunteer, all spectra were measured from the same area. The data from three of the volunteers were excluded from the study: two because of excessive movement during the test and one because of a small blister. Using the data from the remaining 17 volunteers, each spectrum was formed by averaging 90 consecutive 2 s acquisitions (3 min collection times). Spectra were acquired every 5 min over a period of 2–3 h (2.3 h, on average), forming a “measurement series” for each volunteer (27 spectra per series, on average). During this period, the blood glucose concentration typically doubled and then returned to its initial value. The glucose concentrations for all volunteers ranged from 68 to 223 mg/dL. During the measurements, reference capillary blood samples were collected from finger sticks every 10 min (277 total) and analyzed by means of a Hemocue glucose analyzer, with a one-standard-deviation precision specified by the manufacturer as ⩽6 mg/dL. Reference measurements with this amount of imprecision could have added approximately 10% to our reported error in glucose measurement. Spline interpolation was used to provide reference values at the 5 min intervals.
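The interpolation of the 10 min finger-stick references onto the 5 min spectral time points can be sketched as follows. The text does not say which kind of spline was used, so this sketch assumes a natural cubic spline. The function name `natural_cubic_spline` and the glucose readings are illustrative only and are not taken from the study.

```python
import numpy as np

def natural_cubic_spline(x, y, x_new):
    """Evaluate a natural cubic spline through (x, y) at the points x_new.

    Assumption: natural end conditions (zero second derivative at both
    ends); the source only says "spline interpolation".
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    x_new = np.asarray(x_new, dtype=float)
    n = len(x)
    h = np.diff(x)                       # knot spacings
    # Linear system for the second derivatives m at the knots.
    A = np.zeros((n, n))
    rhs = np.zeros(n)
    A[0, 0] = A[-1, -1] = 1.0            # natural end conditions: m = 0
    for i in range(1, n - 1):
        A[i, i - 1] = h[i - 1]
        A[i, i] = 2.0 * (h[i - 1] + h[i])
        A[i, i + 1] = h[i]
        rhs[i] = 6.0 * ((y[i + 1] - y[i]) / h[i] - (y[i] - y[i - 1]) / h[i - 1])
    m = np.linalg.solve(A, rhs)
    # Index of the interval [x[idx], x[idx + 1]] containing each new point.
    idx = np.clip(np.searchsorted(x, x_new) - 1, 0, n - 2)
    x0, x1 = x[idx], x[idx + 1]
    hi = h[idx]
    return (m[idx] * (x1 - x_new) ** 3 / (6.0 * hi)
            + m[idx + 1] * (x_new - x0) ** 3 / (6.0 * hi)
            + (y[idx] / hi - m[idx] * hi / 6.0) * (x1 - x_new)
            + (y[idx + 1] / hi - m[idx + 1] * hi / 6.0) * (x_new - x0))

# Hypothetical reference readings (mg/dL) taken every 10 min.
t_ref = np.arange(0, 61, 10)             # 0, 10, ..., 60 min
g_ref = np.array([85.0, 110.0, 150.0, 175.0, 160.0, 130.0, 100.0])
t_spec = np.arange(0, 61, 5)             # one spectrum every 5 min
g_interp = natural_cubic_spline(t_ref, g_ref, t_spec)
```

The interpolated series passes through every measured reference value, so only the intermediate 5 min points are estimates.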
Raman Spectral Pre-Processing
Raman spectra in the range 355–1545 cm−1 were selected for processing. Spectra collected in vivo consisted of large, broad backgrounds superposed with small, sharp Raman features. We utilized two methods of processing the collected spectra. In the first method, the background was removed by least-squares fitting each spectrum to a fifth order polynomial and subtracting this polynomial from the spectrum, leaving the sharp Raman features. In the second method, the spectra were analyzed without removal of the background. Removing the background offers the advantage of more clearly showing the Raman spectra. All of the Raman spectra illustrated in the figures were pre-processed in this way. However, we found that somewhat more accurate calibrations were obtained using data without the background removed (mean absolute error of 7.8% versus 9.2%). Intensity decreases and spectral shape changes in the background signal were observed during the course of measurements on each individual. The effect of the polynomial subtraction method on Raman spectra extracted from background signals with these changes may be the reason that the errors are higher when the background is removed. Therefore, the performance results discussed below are based upon measured spectra without background removal.
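The first pre-processing method, fitting a fifth order polynomial and subtracting it, can be sketched as below. This is a minimal illustration using NumPy's `Polynomial.fit`, not the authors' actual code; the function name `remove_background` and the synthetic background and peak are assumptions made for the example.

```python
import numpy as np

def remove_background(wavenumbers, spectrum, order=5):
    """Subtract a least-squares polynomial fit from a spectrum.

    Polynomial.fit rescales the wavenumber axis internally, which keeps
    the fifth-order fit numerically well conditioned.
    """
    fit = np.polynomial.Polynomial.fit(wavenumbers, spectrum, deg=order)
    return spectrum - fit(wavenumbers)

# Synthetic example (illustrative values, not measured data): a broad
# background with one small, sharp Raman-like feature on top of it.
wn = np.linspace(355.0, 1545.0, 1340)    # cm-1, one point per channel
u = (wn - 950.0) / 595.0                 # axis rescaled to [-1, 1]
background = 1000.0 + 300.0 * u - 200.0 * u**2 + 50.0 * u**5
peak = 5.0 * np.exp(-0.5 * ((wn - 1125.0) / 8.0) ** 2)
residual = remove_background(wn, background + peak)
```

Because the polynomial is fitted to the whole spectrum, it also absorbs a small part of each sharp feature, so the result depends on the shape of the background. That is in line with the observation above that a changing background may affect the extracted Raman spectra.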
The features of the observed in vivo Raman spectra were seen to be dominated by spectral components of human skin. These contributions were evaluated by least-squares fitting the observed Raman spectra to Raman spectra of the key constituents: human callus skin (thickened stratum corneum with high keratin content), collagen I and III to model dermal and epidermal structural protein, and triolein (a triglyceride) to model subcutaneous fat. A Raman spectrum of human hemoglobin was also included to account for the blood volume probed. The spectra of other possible components, such as water, cholesterol, elastin, phosphatidylcholine and actin, were also included. The spectrum for each component was normalized by its total Raman signal strength.
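The component analysis described above can be illustrated with an ordinary least-squares fit. This sketch assumes an unconstrained fit, since the text does not mention constraints such as non-negativity, and it uses synthetic Gaussian bands as stand-ins for the real component spectra. The function names and band positions are hypothetical.

```python
import numpy as np

def fit_components(spectrum, components):
    """Least-squares weights of component spectra in a measured spectrum.

    components: array of shape (n_components, n_channels). Each row is
    normalized by its total signal, as described in the text.
    Assumption: an ordinary, unconstrained least-squares fit.
    """
    basis = components / components.sum(axis=1, keepdims=True)
    weights, *_ = np.linalg.lstsq(basis.T, spectrum, rcond=None)
    return weights

def gaussian(x, center, width):
    """Gaussian band used as a stand-in for a Raman feature."""
    return np.exp(-0.5 * ((x - center) / width) ** 2)

# Synthetic stand-ins for component spectra (not real Raman data).
x = np.linspace(355.0, 1545.0, 400)
components = np.vstack([
    gaussian(x, 860.0, 30.0) + gaussian(x, 1250.0, 40.0),   # component A
    gaussian(x, 1080.0, 25.0) + gaussian(x, 1440.0, 20.0),  # component B
    gaussian(x, 1004.0, 10.0),                              # component C
])
true_weights = np.array([0.6, 0.3, 0.1])  # illustrative mixing fractions
basis = components / components.sum(axis=1, keepdims=True)
mixture = true_weights @ basis
weights = fit_components(mixture, components)
```

With noise-free synthetic data the fit recovers the mixing fractions. Measured spectra would give weights with the kind of spread reported in the results.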
Spectral Data Processing
The combined background/Raman spectra from each volunteer were analyzed by means of partial least-squares regression.15 The spectra were smoothed with a 13 point Savitzky–Golay algorithm to increase the effective SNR and then mean centered. A PLS calibration was created using Pirouette software (Infometrix, Bothell, WA) and validated using leave-one-out cross validation.19 A PLS calibration regression vector was formed from between 3 and 10 loading vectors from each calibration set. In most cases, the optimal number of factors was determined by first finding the number of factors that produced the minimum Standard Error of Validation (SEV). Then, to reduce the chance of overfitting, the model chosen was the one with the lowest number of factors whose error was not significantly different from that of the model with the lowest SEV.15 With four sets of data, we used more factors than the number determined optimal by the above method, in order to obtain calibrations that are more strongly influenced by glucose. This is explained further in the Analysis and Discussion section. The predicted glucose concentrations were then obtained as the scalar product of the measured Raman spectra and the calibration regression vector plus the mean value of reference glucose concentrations. A mean absolute error was calculated for the predicted glucose concentrations of the n samples in each data set as
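The calibration and validation steps of this section can be sketched in code. The study used Pirouette software; the sketch below is a generic NIPALS implementation of single-response PLS with leave-one-out cross validation, offered only as an illustration. The expression for the mean absolute error is not reproduced in this text, so the sketch assumes the relative form, the mean of |predicted − reference| / reference expressed in percent, which is consistent with the percentage errors reported below. All function names and the synthetic data are hypothetical.

```python
import numpy as np

def pls1_fit(X, y, n_factors):
    """PLS1 calibration (NIPALS). Returns regression vector and means."""
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xk, yk = X - x_mean, y - y_mean           # mean centering
    W, P, q = [], [], []
    for _ in range(n_factors):
        w = Xk.T @ yk
        w /= np.linalg.norm(w)                # weight vector
        t = Xk @ w                            # scores
        tt = t @ t
        p = Xk.T @ t / tt                     # spectral loading
        c = yk @ t / tt                       # concentration loading
        Xk = Xk - np.outer(t, p)              # deflate spectra
        yk = yk - c * t                       # deflate concentrations
        W.append(w)
        P.append(p)
        q.append(c)
    W, P, q = np.array(W).T, np.array(P).T, np.array(q)
    b = W @ np.linalg.solve(P.T @ W, q)       # regression vector
    return b, x_mean, y_mean

def pls1_predict(X, b, x_mean, y_mean):
    """Scalar product with the regression vector plus the mean reference."""
    return (X - x_mean) @ b + y_mean

def loo_predictions(X, y, n_factors):
    """Leave-one-out cross validation: predict each sample from the rest."""
    preds = np.empty(len(y))
    for i in range(len(y)):
        keep = np.arange(len(y)) != i
        b, xm, ym = pls1_fit(X[keep], y[keep], n_factors)
        preds[i] = pls1_predict(X[i], b, xm, ym)
    return preds

def mean_absolute_error_percent(pred, ref):
    """Assumed definition: mean of |pred - ref| / ref, in percent."""
    return 100.0 * np.mean(np.abs(pred - ref) / ref)

# Synthetic data (not from the study): one glucose-like band, one
# interferent band, and a small amount of noise.
rng = np.random.default_rng(0)
channels = np.arange(60)
glucose_band = np.exp(-0.5 * ((channels - 20) / 3.0) ** 2)
interferent_band = np.exp(-0.5 * ((channels - 40) / 6.0) ** 2)
conc = rng.uniform(70.0, 220.0, size=20)      # mg/dL, hypothetical
interf = rng.uniform(0.0, 100.0, size=20)
spectra = (np.outer(conc, glucose_band) + np.outer(interf, interferent_band)
           + rng.normal(0.0, 0.01, size=(20, 60)))
predicted = loo_predictions(spectra, conc, n_factors=2)
mae = mean_absolute_error_percent(predicted, conc)
```

The synthetic data contain only two sources of variation, so two factors suffice here. The in vivo calibrations described above needed between 3 and 10.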
In Vivo Raman Spectra
Figure 2 compares a typical Raman spectrum from the forearm of a volunteer to the Raman spectra of the primary chemical components of the superficial layers of human skin (epidermis, dermis, and subcutaneous fat). From visual inspection, as well as by fitting the spectral components to the in vivo spectra, the dominant spectral contribution was found to be that of collagen I, the main component of dermis. A fractional weight coefficient of 0.62±0.08 was obtained, averaged over the 461 in vivo spectra. This is more than twice that found for the second largest component, triolein (0.27±0.13), characteristic of subcutaneous fat. Keratinized tissue (0.08±0.06), hemoglobin (0.019±0.01) and collagen III (0.011±0.02) all contributed to a lesser extent. The contributions of water, cholesterol, elastin, phosphatidylcholine and actin were all found to be insignificant. The large standard deviations reflect the variations in chemical composition among volunteers, whereas within each measurement series the component weight coefficients were relatively constant (standard deviations an order of magnitude lower).
In Vivo Measurements
A comparison of the predicted glucose concentrations to the corresponding reference data from one of the volunteers is shown in Fig. 3. The mean absolute error (MAE) in the validated data is 5.0% with an R2 of 0.93.
This procedure was applied individually to data from each of the 17 volunteers. A summary of the results of cross validated calibrations on the data set from each volunteer is shown in Table 1. Although the example in Fig. 3 shows the calibration with the lowest MAE, the calibrations for many other volunteers are also good, as can be seen in Table 1.
Table 1. Summary of results from cross validated calibrations generated from the data set of measurements on each of the 17 volunteers, sorted by R2.
The cross validated calibration results from each of the 17 volunteers combined into one chart are shown in Fig. 4. For the data from all 17 volunteers considered as one set, the mean absolute error is 7.8% and the R2 is 0.87.
Analysis and Discussion
The ability to noninvasively monitor variations in glucose present at low concentrations in the blood-tissue matrix of skin, a complex molecular medium, requires a sensitive and highly specific method. This study has shown that Raman spectroscopy can be used for this purpose, thanks to its sharp, characteristic spectral features. (For a review see Ref. 21.) The fact that the multiple peaks of the Raman spectrum of glucose are distinct from those of human skin tissue (Fig. 5) enables differentiation of changes in glucose concentration from changes in tissue characteristics.
In order to measure glucose concentrations in human skin, it is necessary to sample the innermost skin layer, the viable dermis, which is well supplied by glucose from its capillary network. The penetration depth of 830 nm excitation light and the subcutaneous focal point of the collection optics facilitate sampling this layer. Evidence that the dermis is being sampled is provided by the fact that the Raman spectra collected from the forearms of the volunteers are dominated by collagen (approximately 90% of the total protein content, according to a least-squares fit), the major component of dermis.22 Its contribution is much stronger than that of the keratinized outermost skin layer. The underlying subcutaneous fat is also sampled, as evidenced by the fact that triglyceride is the second largest contribution to the skin spectrum. Comparison with the Raman spectrum of subcutaneous fat indicated that triglycerides are the major Raman scatterers in adipose tissue (data not shown). This establishes that the sampling depth extends beyond the dermis. Also worth noting is the small but significant contribution from hemoglobin.
This study was an initial evaluation of the ability of Raman spectroscopy to measure glucose noninvasively. Thus, the focus was on determining its capability on a range of subjects rather than on long-term tracking. The protocol did not include measurements on the volunteers over a number of days, and thus independent data were not obtained. We note that a mean absolute error based upon cross validated calibration provides only an indication of the calibration quality and is not a measure of the expected accuracy over a longer term.
However, even understanding these limitations, the results are promising. The calibrations appear good for many volunteers, with ten of the volunteers having an R2 of over 0.8 and mean absolute errors of 9% or less. All but two of the volunteers had an R2 of more than 0.7.
A question that arises with this kind of procedure is whether the calibration is based upon glucose. This question is relevant to many noninvasive measurement technologies, and particularly to a protocol such as a glucose tolerance test in which no independent data are available. It is possible that variations specific to an individual or an instrument that happen to be correlated with the glucose concentrations can dominate the calibration.8 9
Raman spectroscopy offers a unique way to address this question. Due to the sharp features of Raman spectra, it is possible to develop a sense of the importance of glucose in the calibration by comparing the calibration regression vector to the spectrum of glucose. As an example, Fig. 6 compares the regression vector for the calibration shown in Fig. 3 to the spectrum of glucose in water. The fact that numerous glucose spectrum peaks appear in the regression vector indicates that the glucose variation is indeed captured in this calibration. We have used the correlation between the regression vector and the spectrum of glucose as a numerical indicator of the importance of glucose in the calibration. We do not expect this correlation to be close to 1 because the regression vector also includes spectral contributions from interferents. In Fig. 6, the correlation is 0.31. We believe that this signifies that glucose is an important component in this calibration. We will continue to develop a baseline to help us determine what value of this measure to expect for a good calibration.
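A minimal sketch of this indicator follows. It assumes that "correlation" means the Pearson correlation coefficient between the two vectors sampled on the same wavenumber grid, which the text does not state explicitly. The function name and the synthetic bands are hypothetical.

```python
import numpy as np

def spectral_correlation(regression_vector, reference_spectrum):
    """Pearson correlation between two vectors on the same wavenumber grid."""
    return float(np.corrcoef(regression_vector, reference_spectrum)[0, 1])

# Synthetic illustration (not real spectra): a regression vector made of a
# glucose-like band plus a negative contribution from an interferent band.
x = np.linspace(355.0, 1545.0, 400)
glucose_like = np.exp(-0.5 * ((x - 1125.0) / 15.0) ** 2)
interferent = np.exp(-0.5 * ((x - 1450.0) / 15.0) ** 2)
reg_vector = glucose_like - 2.0 * interferent
r = spectral_correlation(reg_vector, glucose_like)
```

In this toy case the glucose band is fully present in the regression vector, yet the interferent term pulls the correlation well below 1. This illustrates why values far from 1 can still indicate a glucose-based calibration.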
The appearance of glucose peaks in the regression vector and the correlation between it and the glucose spectrum are not as strong for all volunteers as in the previous example. These results indicate that we can use this correlation as another factor, along with MAE, R2 and slope, with which to judge the quality of calibrations for Raman measurements.
Use of the correlation of the regression vector with the glucose spectrum as an additional metric with which to judge the quality of calibrations has helped us improve some of the calibrations. In the calibrations for four of the volunteers (2, 11, 13 and 17), the numbers of factors having the lowest SEVs were 2, 3 or 4. The regression vectors generated with these numbers of factors had a very low correlation (even negative in some cases) with the glucose spectrum. We found that increasing the number of factors beyond the point of lowest SEV significantly improved the correlation with glucose. This change brought the numbers of factors more in line with calibrations on other volunteers. In these cases, calibrations with a higher correlation with glucose, even though they have a higher SEV, are more strongly influenced by glucose. We have also found that for two volunteers (7 and 12), where the optimum number of factors is 3, increasing the number of factors does not increase a low correlation (0.06 in both cases) with glucose. The MAEs and R2 values for these calibrations are in the same range as those for other volunteers. However, the low correlations with glucose suggest that these calibrations may be based, in part at least, upon spurious factors. The calibration for Volunteer 4 also appears good, as judged by an MAE of 6.9% and an R2 of 0.91. However, a −0.03 correlation between its regression vector and glucose suggests that this calibration is also based upon spurious factors.
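For reference, the default factor-selection rule from which these four calibrations depart (the fewest factors whose SEV is not significantly worse than the minimum) can be sketched as follows. The text does not specify the significance test, so the sketch substitutes a simple relative tolerance. Both the `tolerance` parameter and the SEV values are hypothetical.

```python
def select_n_factors(sev_by_factors, tolerance=0.05):
    """Pick the smallest model whose SEV is close to the minimum SEV.

    sev_by_factors: dict mapping number of factors -> SEV.
    tolerance: hypothetical relative margin standing in for the
    significance test, which the text does not specify.
    """
    best = min(sev_by_factors.values())
    eligible = [k for k, sev in sev_by_factors.items()
                if sev <= best * (1.0 + tolerance)]
    return min(eligible)

# Illustrative SEV values (mg/dL), not taken from the study.
sev = {3: 14.0, 4: 11.2, 5: 9.9, 6: 9.6, 7: 9.5, 8: 9.8}
n_factors = select_n_factors(sev)   # the 5-factor model is within 5% of the minimum
```

A check of the correlation between the resulting regression vector and the glucose spectrum, as described above, would then decide whether to add factors beyond this choice.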
An additional way to determine the influence of glucose in the calibrations is to examine the results of calibrations formed by combining data sets from a number of volunteers together, as in the following procedure.
Data from a number of volunteers were combined into one set. A calibration algorithm was generated for the entire set and validated by leave-one-out cross validation. The mean absolute error is expected to rise as data from more volunteers are added to the set because the different chemical and physical characteristics among various people increase the spectral variability. However, a limited rise would indicate that the signal from the common variable, glucose, is strong enough to be seen among the other variations. We have found through simulation, in vitro testing and processing this transcutaneous data that the correlation between glucose and spurious factors that may exist with one volunteer is weakened by calibration using data from multiple volunteers. A factor which is due to the environment/instrument that happens to be correlated with glucose during the test protocol for one volunteer is less likely to be correlated to glucose during test protocols for multiple volunteers.
A calibration was generated on data comprising 244 samples from a group of nine volunteers whose calibration quality appears to be relatively high. The fact that the optimum number of factors for this calibration is 17 indicates that many differences among volunteers are being accounted for. The results are shown in Fig. 7. A mean absolute error for this group of 12.8% and an R2 of 0.70 are an indication that glucose is an important part of the calibration. Stronger evidence that this calibration is based on glucose is provided by the regression vector for the calibration on this data, also shown in Fig. 7. Many glucose spectrum peaks are seen in the calibration regression vector. The strong correlation of 0.45 between the regression vector and the glucose spectrum, even though there are 17 factors, indicates that the glucose signal is strong enough to be detected among the large variances in spectra that occur among nine different volunteers. This is direct evidence that the spectrum of the glucose molecule has a strong influence in the calibration.
When data from all 17 volunteers are combined into one group, the average error grows to 16.9%. Although this error is higher than our eventual target, this level of error is encouraging for an initial transcutaneous study. A very positive result is that even with this data set, the regression vector includes many peaks of glucose, as is shown in Fig. 8. Even though many more parameters are changing, as indicated by a model with 21 factors, the correlation between the regression vector and the glucose spectrum of 0.35 indicates that glucose is still a key factor.
Unlike many methods of measuring glucose, with which there are valid questions about whether glucose is being measured, the strong presence of glucose in the regression vector developed from Raman measurements provides direct spectral evidence that the measurements result from the actual glucose concentrations.
This study has provided us with important issues to address so as to better understand the scientific basis for the measurement and calibration processes and to bring this technology closer to practical use. We believe that determining the causes of the decreasing background signal observed during the course of measurements on each individual and reducing the impact of this change on the background subtracted Raman signal will improve our performance. We have realized that instrument wave number and intensity stability is critical to obtaining good performance using independent data. To this end we have improved the stability of our system for future studies. Creating improved methods of processing data to reduce prediction error and increase robustness is another important goal. We also are continuing our effort to increase our understanding of and ability to utilize the information that exists in the regression vector.
This study demonstrates the feasibility of noninvasive blood glucose measurements using Raman spectroscopy. This result combined with our earlier report on whole blood measurement of a number of analytes17 suggests the feasibility of noninvasive measurement of other blood analytes as well. It also projects the promise that technology based upon Raman spectroscopy can be developed to meet clinical accuracy requirements. To our knowledge, this is the first report of optical noninvasive glucose measurements to clearly demonstrate that the spectral features of the glucose molecule are an important part of the calibrations.
This work was performed at the MIT Laser Biomedical Research Center and supported by the NIH National Center for Research Resources, Grant No. P41-RR02594, and a grant from Bayer Health Care, LLC. A.E. acknowledges support from the Swedish Research Council. The participation of Eric Schwartz of the MIT Medical Department is gratefully acknowledged. We thank Tae-Woong Koo for his helpful comments.