1. Introduction

To date, there is no curative treatment for type 1 and type 2 diabetes.1,2 Therefore, patients must monitor their blood glucose levels (BGLs) to prevent further progression of the disease. Continuous glucose monitoring (CGM) is becoming more widely used among diabetic patients,3 and its accuracy is steadily improving.4 In addition, its availability is expected to further increase since the United States Food and Drug Administration (FDA) recently granted the first over-the-counter (OTC) approval for a specific model. Nevertheless, conventional self-monitoring of blood glucose (SMBG) is still widely used as the first step in BGL control because of its reliability, lower cost, accuracy,5 and accessibility as an OTC device. However, SMBG requires a painful finger prick, which carries the risk of infection6,7 and sometimes results in low patient adherence.8 To address these problems with conventional SMBGs, the authors proposed a non-invasive glucose monitoring (NIGM) method, called the metabolic index (MI) method, which is based on the phase delay between oxy- and deoxyhemoglobin pulsation signals induced by oxygen consumption in cell respiration.9 Although a smartwatch-based prototype successfully demonstrated the basic idea of the proposed phase-delay-based method, showing a proper correlation with the reference BGLs in sugary and non-sugary oral challenges, it was not yet suitable for accurate BGL estimation at that time. Therefore, another prototype based on a smartphone camera was introduced for the repeatability test. However, although the smartphone camera-based prototype is acceptable as a portable device, it is difficult to use continuously in daily life. Given the reported effectiveness of CGM devices for glycemic control,10,11 NIGM should preferably be integrated into portable devices and continuously monitor BGLs. The main difficulties in using wearable smartwatches as NIGM devices are as follows.
First, the size of the active area of the detector and sensitivity are limited,12 which may result in poor signal quality depending on the peripheral perfusion status.13 Second, smartwatches and smart rings are worn close to the extremities of the human body so that their longer moment arms can easily suffer from signal distortion and artifacts caused by body motion.14 These are serious obstacles not only for NIGM but also for heart rate (HR) and heart rate variability (HRV) monitoring. To address these difficulties that inevitably accompany photoplethysmography (PPG) techniques, many researchers have already proposed methods to evaluate PPG signal quality and to provide measures to mitigate low-quality data. Elgendi15 compared the performance of several signal quality indices (SQIs) for PPG signals previously proposed by other researchers and concluded that the skewness index showed better performance in discriminating between excellent and other lower-quality signals. Although the basic idea of the skewness index can also be applied to the NIGM, details need to be modified and optimized to fit into the MI method. A combination of inertial sensors with PPG sensors has also been proven to detect and remove motion artifacts. Wood and Asada16 and Lee et al.17 proposed motion artifact cancellation methods using accelerometers and gyroscopes, respectively. Although this type of solution would be best suited for traditional smart devices that include multiple inertial sensors for activity tracking and gesture recognition, this method can sometimes lead to high power consumption.18 To overcome the above concern, Tabei et al.19 and Afandizadeh Zargari et al.,18 respectively, investigated machine learning (ML)-based methods to detect and remove motion and noise artifacts (MNAs) without using accelerometers, and both achieved successful results. 
However, ML methods typically require a sufficient amount of data over a wide range of personal variations, including gender, age, and race. This process of data accumulation also requires a significant amount of time and money, which can be burdensome for small organizations and may hinder the advancement of technology. The signal-to-noise ratio (SNR) is also crucial for the PPG signal quality. However, the effect of SNR on the accuracy of BGL estimation in the MI method has not been analyzed. For the above reasons, a simple and specialized index to detect and reject motion artifacts for the MI method, which requires neither additional inertial sensors nor ML models, needs to be discussed to realize portable and continuous MI-based NIGM devices. Considering the form factor of wearable devices such as smartwatches, conserving battery power is essential for prolonging the lifespan of the NIGM function. Moreover, given the increasing penetration rate of smartwatches across various regions and age groups, along with the continuing spread of diabetes throughout the world, a signal quality method that is easy to implement and apply regardless of individual differences is essential for meeting the increasing worldwide demand for NIGM devices. In this study, the authors first explain the SQIs for the MI method. Then, the proposed quality indices and data rejection method are validated through multiple oral challenge tests using a smartwatch-based prototype.

2. Theory and Formulation

2.1. Basic Formulas of the MI-Based NIGM Method

Figure 1 shows a schematic diagram of a near-infrared spectroscopy (NIRS) measurement on a living body.
According to the modified Beer-Lambert law (MBLL), by using two different probe wavelengths and solving for the matrix calculation, changes in oxy- and deoxyhemoglobin NIRS signals and can be expressed as follows:20–24 where and are the molar concentrations of oxyhemoglobin and deoxyhemoglobin in the blood at time , is the optical path length with respect to the time , and the subscript 0 represents the initial condition, respectively.

Here, both oxy- and deoxyhemoglobin incremental NIRS signals and can be decomposed into quasi-DC and AC components as follows: Here, the subscripts AC and DC indicate the AC and quasi-DC components of the corresponding physical quantity, respectively. For reference, a visual explanation of this decomposition operation is shown in Fig. 2. Here, by applying several assumptions and approximations, the oscillation amplitude of the AC component of deoxyhemoglobin concentration is obtained as follows:9 where is the total hemoglobin concentration, is the oscillation amplitude of the AC component of the optical path length , and is a dimensionless metabolic index defined as follows: where is the tiny phase delay between oxy- and deoxyhemoglobin signals and , and is the arterial oxygen saturation, which can be approximated as follows: where and are the amplitudes of the AC components of the oxy- and deoxyhemoglobin signals, respectively. To help visually comprehend Eq. (8), Fig. 3 illustrates the behavior of the MI versus oxygen saturation when . As can be seen in Fig. 3, MI increases as the oxygen saturation decreases from 100% and reaches a maximum value at 50% saturation. Conversely, MI becomes zero at 100% oxygen saturation. This behavior suggests that the metabolic activity resulting from local cell respiration occurs concurrently with a lower peripheral oxygen saturation that is attributable to oxygen transactions in capillaries.

Here, by applying the assumption about the relationship between and , Eq.
(7) can be rewritten as follows:9 where

Here, and are AC and quasi-DC components of the optical path length in the initial condition, respectively. The term denotes an amplitude-corrected metabolic index, and denotes a dimensionless optical path length correction factor, expressed as follows:9 Here, can be obtained from and by employing the following relationship: As corresponds to the oxygen consumption in each cardiac cycle of the probed region and BGL affects the living body metabolism, can be assumed to exhibit a strong correlation with BGL. Furthermore, given that , , and are all constant, BGL can be estimated by monitoring .

2.2. Derivation of the Signal Quality Index for the MI Method

To evaluate the signal quality for the AC oxy- and deoxyhemoglobin NIRS signals and , the standard deviation concept is introduced. Figure 4(a) shows a typical example of the discretely sampled oxy- and deoxyhemoglobin NIRS signals, and Fig. 4(b) illustrates the band-pass-filter (BPF)-applied and normalized waveforms of Fig. 4(a), where components around the HR frequency are extracted. As can be seen in Fig. 4(b), the two waveforms exhibit a high degree of similarity with minimal discernible differences in ideal conditions. In this study, the authors sought to quantify the dissimilarity between the two normalized BPF-applied waveforms by employing the standard deviation , which is defined as where is the discretely sampled data length and and are the elements of normalized BPF-applied oxy- and deoxyhemoglobin data, respectively. Given that and are BPF-applied values, the standard deviation is nearly zero under regular conditions.

Here, consider the scenario where noise is added to the NIRS waveforms. Figure 5(a) shows an example of the discretely sampled oxy- and deoxyhemoglobin signals with noise, and Fig. 5(b) illustrates the normalized BPF-applied waveforms of Fig. 5(a), where components around the HR frequency are extracted.
In this example, a small spike is superimposed on the deoxyhemoglobin waveform around data number 200. Figure 5(b) illustrates that a minor fluctuation in the NIRS waveform can have a significant impact on the BPF-applied waveform, resulting in an increase in the standard deviation . Nevertheless, even under optimal conditions, the standard deviation is constrained by the inherent delay between the oxy- and deoxyhemoglobin NIRS waveforms, which is on the order of several tens of milliradians. Given a specific value for the phase delay , the reachable standard deviation limit can be expressed as follows: Here, the derivation process of Eq. (16) is described in detail in Sec. S1 of the Supplementary Material. Subsequently, by rearranging Eq. (16), the following is derived: where is the estimated phase delay based on the standard deviation . Equation (17) indicates that the phase delay can be calculated from the standard deviation in addition to the method of comparing the fast Fourier transform (FFT) phase at the main peak of each spectrum of and , which was employed in the previous research.9 Hereafter, for the sake of convenience, the FFT-phase-based is referred to as . Finally, the signal quality of the oxy- and deoxyhemoglobin NIRS signals can be evaluated by calculating the phase error , which is expressed as follows:

In this study, each calculated is initially screened by to be less than a predetermined value.

2.3. Derivation of Theoretical Phase Estimation Errors Defined by the Background Noise Level

Although in Eq. (18) is effective in identifying and rejecting visually recognizable errors in NIRS signals, it lacks the capacity to assess potential errors that are not visually apparent. In a strict sense, the FFT-based phase delay, , is inherently affected by perturbation from the background noise, which is typically defined according to the noise floor level. This results in an error in phase detection.
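To make the effect of background noise on the FFT phase concrete, the following minimal Python sketch treats the main spectral peak as a unit vector and adds a noise vector of relative amplitude `nsr` (the inverse of the SNR) whose argument is spread uniformly over one full turn. The model, the function name `rms_phase_error`, and the grid size are assumptions made for illustration and are not the authors' implementation. The sketch evaluates the root-mean-square argument of the combined vector numerically and compares it with the first-order small-noise value nsr/√2.

```python
import cmath
import math


def rms_phase_error(nsr: float, steps: int = 3600) -> float:
    """RMS argument of (1 + nsr * exp(i*phi)) with phi on a uniform grid over [0, 2*pi)."""
    total = 0.0
    for k in range(steps):
        phi = 2.0 * math.pi * k / steps
        # Unit signal vector plus a noise vector of amplitude nsr and argument phi.
        combined = 1.0 + nsr * cmath.exp(1j * phi)
        total += cmath.phase(combined) ** 2
    return math.sqrt(total / steps)


nsr = 0.1  # assumed example value: SNR = 10, i.e. 20 dB
numeric = rms_phase_error(nsr)
first_order = nsr / math.sqrt(2.0)  # small-noise approximation
print(f"numeric: {numeric * 1e3:.1f} mrad, first order: {first_order * 1e3:.1f} mrad")
```

Under these assumptions, both values land close to 70 mrad for an SNR of 10, and the gap between them shrinks further as the SNR improves.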
In this subsection, the authors attempt to formulate the phase estimation error defined by SNR. Figure 6 illustrates the basic idea of the phase estimation error caused by the random background noise. Here, represents the normalized main peak vector of the FFT-applied NIRS signal, and represents the background noise vector with amplitude and argument of and , respectively. In this representation, the phase estimation error, which is denoted by , can be replaced by the argument of the combined vector and can be expressed as follows: where is the inverse value of the SNR, which can also be called the noise-to-signal ratio, and it is assumed that can be treated as a constant within a limited measurement period.

Then, the typical phase estimation error due to the background noise, denoted by , can be calculated as the square root of the mean value of over , which is expressed as follows: Here, Eq. (20) cannot be solved analytically. However, by assuming that , the inverse value of the SNR, is sufficiently less than 1, Eq. (20) can be approximated as The derivation process of Eq. (21) is described in detail in Sec. S2 of the Supplementary Material. Here, is practically calculated by comparing the FFT phase of the oxy- and deoxyhemoglobin signals at the HR frequency. Therefore, the total phase estimation error, denoted by , should be expressed as the resultant error as follows: where and are FFT-phase estimation errors of the oxy- and deoxyhemoglobin NIRS signals at the HR frequency, and and are the SNRs of the oxy- and deoxyhemoglobin NIRS signals, respectively.

Typically, the signal amplitude of the deoxyhemoglobin NIRS signal is approximately one-tenth of that of the oxyhemoglobin NIRS signal. Assuming that the background noise amplitude is almost identical between oxy- and deoxyhemoglobin NIRS signals, the SNR of the deoxyhemoglobin signal is approximately 10 times worse than that of the oxyhemoglobin signal. Therefore, the deoxyhemoglobin term is dominant in Eq. (22), and Eq. (23) can be approximated as follows: As indicated by Eq.
(24), the total phase estimation error becomes as high as 70 mrad when the SNR of the deoxyhemoglobin signal is 10, or 20 dB. In the previous research, of BGL change induced of phase delay in .9 Consequently, suboptimal signal quality may readily compromise the precision of the BGL estimation.

2.4. Derivation of Theoretical Phase Estimation Errors Defined by the Data Sampling Frequency

It is also crucial to consider the sampling frequency of the NIRS signals to estimate the phase estimation error. When the sampling frequency and the HR frequency are defined as and , respectively, the mathematical phase division step size of the discretely sampled NIRS signals, denoted by , can be expressed as follows: Here, Figs. 7(a) and 7(b) illustrate examples of the potential range of temporal uncertainty resulting from discrete sampling, where Fig. 7(b) is an enlarged view of Fig. 7(a) at . For the sake of simplicity, the sampling frequency and the HR frequency are set to 100 and 1.0 Hz, respectively, in Figs. 7(a) and 7(b). In addition, the term represents the phase offset from the original waveform, the dashed vertical lines in Fig. 7(b) represent the horizontal sampling step size determined by and , and the dash-dot lines in Fig. 7(b) represent a waveform reconstructed from the discretely sampled data. Given that the temporal phase uncertainty resulting from discrete sampling is constrained by the boundaries of , its probability can be approximated by the continuous uniform distribution. Subsequently, by applying the formula for the standard deviation of the continuous uniform distribution, the typical phase uncertainty due to discrete sampling can be expressed as follows: As with the case of , both the oxy- and deoxyhemoglobin NIRS signals have their own sampling uncertainty. Consequently, the total sampling uncertainty, denoted by , can be given as follows:

2.5. Total Estimation Error of the Metabolic Index Resulting from Phase Estimation Errors

From Eqs.
(8), (10), (11), (24), and (27), the combined value of the latent estimation errors of the metabolic index , denoted by , can be approximated as follows: where is the resultant error of and . Here, the estimation errors of and resulting from the background noise are ignored because they have a limited impact on compared with the phase estimation errors introduced in this paper.

In this study, the value of screened by in Eq. (18) is subjected to further examination based on , to ensure the estimation quality.

3. Materials and Methods

This section presents an examination of a proposed low-quality data rejection method utilizing a two-stage screening process. The examination is conducted through a series of oral challenge tests using a smartwatch-based prototype device.

3.1. Smartwatch-Based Prototype

In this study, the Samsung Galaxy Watch 4 44 mm (SM-R870) was utilized as the experimental unit in a manner consistent with that employed in the previous research.9 Figure 8(a) shows the schematic rear view of the SM-R870. Red and infrared (IR) LEDs are located in the center, and their center wavelengths are and 930 nm, respectively. Eight photodetectors are arranged radially around the LEDs. The sampling frequency of the red and IR LED signals is 100 Hz. Here, no hardware modifications were made to the prototype, and the BGL estimation function was implemented in software. Figure 8(b) shows a schematic of how the prototype device was worn. Here, the device was wrapped around a finger pad in consideration of the capillary density of the measurement site, which correlates strongly with signal quality.

3.1.1. Experimental protocol

Figure 9(a) shows the schematic of the clinical test setup. A protocol was established while referencing factors that can affect reading values in the case of a pulse oximeter.25 To ensure consistency in the acquired data, a single, healthy, and non-diabetic male was selected as the sole test subject.
The test subject was asked to sit still during the experiment with the smartwatch resting on his finger pad to reduce fluctuations in peripheral blood flow caused by minor changes in posture and local blood pressure. The subject was also asked to rest his elbows on armrests. In addition, before starting the measurement, the subject held a hand warmer for a few minutes to ensure adequate peripheral blood flow. Figure 9(b) shows the typical course of a clinical test. A CGM device (Abbott, FreeStyle Libre®), which records BGL values at 15-min intervals, was used as a reference for the BGLs. To obtain the best performance from the CGM sensors, they were installed at least two days before the experiments for aging.26 Furthermore, the sensors were not utilized for clinical testing during the final two days of their specified expiration period. In addition, to correct for the systematic offset and delay specific to each CGM sensor, SMBG strips (Abbott, FreeStyle Precision®, Chicago, Illinois, United States) were used as needed before and after data recording by the smartwatch. To prevent possible distortion of the PPG signal due to body movements resulting from SMBG usage, no SMBG strips were used during the recording period. Data recording then started after at least 2 h of fasting, oral challenges were given to 30 min after the start of recording, and data recording continued after the oral challenges until the subject’s BGL had mostly returned to the initial level. As a result, of PPG data were collected in each data recording. The clinical tests were conducted over a period of four months, with a total of 30 repeatability tests performed. In this research, sugar-containing carbonated beverages and glucose-containing jelly beverages were used for oral challenges. For reference, Table S1 in Sec. S3 of the Supplementary Material shows the main nutritional values of the oral challenges.
When administering oral challenges, care was taken to ensure that the oral challenges were not excessively cold, thus avoiding peripheral blood flow reduction due to the lowering of the body temperature. All clinical trials described in this paper were conducted in accordance with the Clinical Trials Act of the Ministry of Health, Labor and Welfare of Japan, published on the basis of the Declaration of Helsinki, and were approved by the Ethical Committee of Hamamatsu Photonics K. K. Informed consent was obtained from the subject before measurements were performed. All clinical tests were performed under the supervision of a co-author holding a medical license.

3.1.2. Data processing

Figure 10 depicts the flowchart for the calculation and data screening process of the modified metabolic index based on the data quality metrics mentioned in Eqs. (18) and (28). Here, and represent arbitrarily defined thresholds for and , respectively. Initially, the retrieved data underwent preprocessing. Raw PPG data from red and IR LEDs were retrieved and accumulated to a certain data length. Then, the raw PPG data were converted to the incremental NIRS signals and . A BPF was then applied to each NIRS signal to remove unnecessary high-frequency components and low-frequency components due to respiratory cycles. Here, two different BPF parameter sets were used for different purposes. One parameter set was for calculation, which extracts the components of the NIRS signals around the HR frequency. The other parameter set was for FFT calculation, which extracts 0.8 to 10 Hz components. After applying the BPFs, the excess end portions in each waveform were trimmed so that the first and last points of the data corresponded to the beginning and end of the pulse wave. FFT was then applied to the filtered-and-trimmed NIRS signals. Here, each signal for FFT calculation was resampled to make the data length the smallest power of two greater than or equal to the original length.
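The resampling step can be sketched as follows. This is a minimal illustration only: the text does not state which resampling method was used, so linear interpolation is an assumption, and the names `next_power_of_two` and `resample_linear` as well as the placeholder waveform are hypothetical.

```python
from typing import List, Sequence


def next_power_of_two(n: int) -> int:
    """Smallest power of two greater than or equal to n (for n >= 1)."""
    return 1 << (n - 1).bit_length()


def resample_linear(signal: Sequence[float], target_len: int) -> List[float]:
    """Resample onto target_len evenly spaced points by linear interpolation (assumed method)."""
    n = len(signal)
    if n < 2 or target_len < 2:
        raise ValueError("need at least two input and output samples")
    out = []
    for k in range(target_len):
        # Position of output sample k on the input index axis [0, n - 1].
        pos = k * (n - 1) / (target_len - 1)
        i = min(int(pos), n - 2)
        frac = pos - i
        out.append(signal[i] * (1.0 - frac) + signal[i + 1] * frac)
    return out


# Placeholder for a filtered-and-trimmed NIRS segment (10 s at 100 Hz).
trimmed = [float(i % 100) for i in range(1000)]
fft_len = next_power_of_two(len(trimmed))
resampled = resample_linear(trimmed, fft_len)
print(len(trimmed), "->", len(resampled))
```

Because the same time span is now covered by more samples, the effective sampling frequency grows by the factor `fft_len / len(trimmed)`, and the FFT frequency axis has to be computed with that rescaled rate.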
Subsequently, the BPF-applied signals and FFT spectra were employed to calculate , , , , and , which comprise , , and . Finally, only those elements that satisfy the specified screening conditions were retained as valid data. Once the preprocessing was complete, the recorded dataset of was subjected to further postprocessing. Figure 11 shows a flowchart and its visual explanation of the postprocessing. First, the values were divided into 1-min chunks. Second, the representative value within each chunk was calculated. Third, linear interpolation was employed to address the absence of representative values resulting from the low-quality data rejection in the preprocessing stage. Finally, the moving average was applied to the representative values. In this study, the median value was employed as the representative value for each 1-min interval, and the window length for the moving average was set at 30 points, which is equivalent to 30 min for a 1-min interval dataset. This two-stage smoothing approach helps identify and eliminate residual outlier values, and the resulting smoothed values were shown to have a favorable correlation with the reference CGM values in the previous research.9

3.2. First-Stage Screening Results for the Obtained Data

Figure 12(a) depicts an example scatter plot of versus generated from a typical oral challenge test. Here, for illustrative purposes, the area of the error region is depicted in Fig. 12(a), and the color scale in the plot indicates the spatial density of nearby points. Although the majority of data points were distributed around the diagonal line, a notable number of outliers were confirmed. Figure 12(b) shows a time-series plot of generated from the same data as Fig. 12(a). Here, the color scale in each data point indicates , and the dashed red curve shows the smoothed values. Figure 12(b) shows that data points close to the smoothed curve have lower and vice versa. This result suggests that outlier data can be excluded by examining .
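How threshold screening feeds the smoothed curve can be sketched in a few lines of Python. The sketch below rejects samples whose phase error exceeds a threshold, takes the median of each 1-min chunk, fills empty chunks by linear interpolation, and applies a moving average. The data layout (tuples of time in seconds, index value, and phase error in mrad), the function names, and the edge handling of the interpolation and of the moving-average window are assumptions for illustration, not the authors' code.

```python
import statistics
from typing import List, Optional, Sequence, Tuple

Sample = Tuple[float, float, float]  # (time in s, index value, phase error in mrad)


def chunk_medians(samples: Sequence[Sample], threshold_mrad: float,
                  n_chunks: int, chunk_s: float = 60.0) -> List[Optional[float]]:
    """Median of the screened values per chunk; None where every sample was rejected."""
    buckets: List[List[float]] = [[] for _ in range(n_chunks)]
    for t, value, err in samples:
        idx = int(t // chunk_s)
        if err < threshold_mrad and 0 <= idx < n_chunks:
            buckets[idx].append(value)
    return [statistics.median(b) if b else None for b in buckets]


def fill_gaps(values: Sequence[Optional[float]]) -> List[float]:
    """Linear interpolation over None entries; edges copy the nearest valid value (assumed)."""
    known = [i for i, v in enumerate(values) if v is not None]
    if not known:
        raise ValueError("no valid chunk")
    out = []
    for i, v in enumerate(values):
        if v is not None:
            out.append(v)
            continue
        left = max((k for k in known if k < i), default=None)
        right = min((k for k in known if k > i), default=None)
        if left is None:
            out.append(values[right])
        elif right is None:
            out.append(values[left])
        else:
            w = (i - left) / (right - left)
            out.append(values[left] * (1.0 - w) + values[right] * w)
    return out


def moving_average(values: Sequence[float], window: int) -> List[float]:
    """Moving average over `window` points; the window is clipped at the edges (assumed)."""
    out = []
    for i in range(len(values)):
        lo = i - window // 2
        seg = values[max(0, lo):min(len(values), lo + window)]
        out.append(sum(seg) / len(seg))
    return out


# Tiny made-up example: the third sample is rejected, the second minute is empty.
demo = [(10.0, 1.0, 5.0), (20.0, 3.0, 5.0), (30.0, 100.0, 50.0), (130.0, 5.0, 2.0)]
medians = chunk_medians(demo, threshold_mrad=10.0, n_chunks=3)
smoothed = moving_average(fill_gaps(medians), window=3)
print(medians, smoothed)
```

For a real recording, the window would be set to 30 points to match the 30-min moving average described for the 1-min dataset.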
Then, an appropriate value for the acceptable limit needs to be determined. Figure 12(c) shows the typical transition of the pass rate and the fluctuation level according to different values. Here, the typical fluctuation level was derived by calculating the standard deviations of the difference between adjacent time-domain data after applying the -screening. Although it is possible to establish strict data screening by applying a tighter , eliminating an excessive amount of data may result in the generation of inaccurate representative values in the post-processing. To achieve a balance between the two opposing factors of stricter screening and higher pass rate, 10 mrad of , which yields of the pass rate and a quasi-minimum value in the fluctuation level, has been adopted in this study. Finally, Fig. 12(d) shows the binarized result of Fig. 12(b) for the passed and rejected data when . With reference to Fig. 12(d), it was revealed that a sufficient amount of outlier values could be eliminated by -screening while maintaining an adequate quantity of data for the post-processing. Here, it is more probable that the remaining outliers in Fig. 12(d) are the result of a random correlation caused by fluctuations in both and . Therefore, it was not possible to eliminate them by applying a stricter threshold. For reference, Figs. S2(a)–S2(d) in the Supplementary Material show binarized results of Fig. 12(b) for various values.

3.3. Post-Processed Results for the Obtained Data

Figures 13(a)–13(d) show typical post-processed results of oral challenge tests. For comparison, the evaluation result without -screening is indicated in each plot. The CGM delays were adjusted by using the sensor-specific constant value of each sensor, which was determined through comparison with SMBG readings during the pre- or post-experimental period, and the error bars calculated using the typical values of are indicated in each curve.
For the reference CGM values, the 10% vertical error bar range was applied, taking into account the typical accuracy of the sensor.4 In addition, the ranges of the left and right vertical axes are adjusted based on the relationship between and BGL as established in the previous research.9 In this study, of threshold and of power exponent in -correction have been applied, and the optimization process will be explained later on. Looking at each of Figs. 13(a)–13(d), it can be seen that the error bar width and the overestimation of have been effectively mitigated through the implementation of -screening. To illustrate the impact of the -screening on and , Figs. 14(a) and 14(b) present histograms of and generated from data points across 30 repeatability tests, with and without the -screening, respectively. Note that the y-axes of Figs. 14(a) and 14(b) are presented on a logarithmic scale to enhance the visibility of outlier values and that of threshold and of power exponent in -correction have been applied, as is the case in Fig. 13. Moreover, the standard deviation of each histogram, indicated by STD, is presented in the corresponding plot. Figures 14(a) and 14(b) demonstrate that -screening can effectively remove outlier values in , thereby reducing the standard deviation of and eliminating inappropriately large values.

3.4. Repeatability Test Results

Figure 15(a) shows a scatter plot of the acquired values versus reference CGM values, generated from the entire repeatability test results, without applying either stage of the screening process. Here, of power exponent was applied in the -correction. For illustrative purposes, the linear approximation and the correlation coefficient calculated by the linear least squares (LLS) fitting are plotted in Fig. 15(a). Furthermore, the highlighted region around the linear approximation indicates a range of .
Figure 15(b) shows a Parkes error grid for type 1 diabetes27 that was generated from the results and the conversion coefficients obtained by the LLS fitting presented in Fig. 15(a). Here, each numerical data for corresponding to each CGM data point has been calculated by interpolation, resulting in the generation of 183 data point pairs of and CGM values. The color scale in the plot indicates the spatial density of nearby points. In this study, CGM values were utilized as the reference values for error grid analysis, in lieu of blood glucose values obtained via venous blood glucose testing. Because of this substitution, note that error grids presented in this paper have combined errors from and the CGM. In addition, the mean absolute relative difference (MARD) and the root-mean-square error (RMSE) are also plotted in Fig. 15(b) as BGL-estimation accuracy metrics. Here, in fact, both Figs. 15(a) and 15(b) contain values significantly outside the vertical plot ranges. For reference, Fig. S3 in the Supplementary Material shows zoom-out views of Figs. 15(a) and 15(b). By applying both - and -screening to the repeatability test data presented in Fig. 15(a), the scatter plot and the Parkes error grid can be obtained as Figs. 15(c) and 15(d), respectively. Here, the same parameters for -screening and -correction applied in Fig. 13, along with the plotting methods employed in Figs. 15(a) and 15(b), were applied to Figs. 15(c) and 15(d). A comparison of Figs. 15(a) and 15(c) reveals that the number of data points within the region and the correlation coefficient are greater when the two-stage screening is applied. In the same way, a comparison of Figs. 15(b) and 15(d) reveals that the number of data points within Zone A and the accuracy metrics are improved by applying the screening process. For reference, zoomed-out views of Figs. 15(c) and 15(d) are shown in Fig. S4 in the Supplementary Material. In addition, Fig. 
S5 in the Supplementary Material presents a movie showing the transition between the results without and with the two-stage screening process (Video 1, Mp4, 191 KB [URL: https://doi.org/10.1117/1.JBO.29.10.107001.s1]). Moreover, Figs. S6–S9 in the Supplementary Material illustrate the individual results of the entire oral challenge test. Finally, Table 1 shows the comparison of key performance metrics with and without the two-stage screening process. For purposes of comparison, the performance metrics observed in the previous research using a smartphone camera-based prototype9 are also listed in the table. As demonstrated in Table 1, the smartwatch prototype exhibited comparable or enhanced BGL estimation performance by adopting the proposed two-stage screening process. Moreover, although the BGL range and the number of test subjects are limited and cannot be compared directly, the performance metrics obtained through the proposed screening process are comparable to those obtained through other NIGM methods, including Raman spectroscopy28 and radio frequency (RF) spectroscopy.29

Table 1. Comparison of key performance indicators of BGL estimation with and without the screening process.

3.5. Parameter Optimization

Figure 16(a) shows a transition of the correlation coefficient computed by the LLS fitting according to different -screening threshold , with the power exponent for the -correction fixed at 0.5. In this plot, the region below a typical least value of is filled in gray. Figure 16(a) demonstrates that the correlation coefficient reaches its maximum value at approximately . Figure 16(b) shows a transition of MARD and RMSE according to different . Similar to Fig. 16(a), both MARD and RMSE exhibited their optimal values around . In general, a stricter threshold leads to more effective control of data variations. However, in this case, an excessively strict threshold leads to the rejection of most of the obtained data. This results in insufficient data points in the post-processing, which in turn leads to a deterioration in accuracy. Therefore, it can be inferred that the modest screening threshold exhibited the best performance. Next, Fig. 17(a) shows a transition of the correlation coefficient computed by the LLS fitting according to different power exponent for -correction, with the threshold for the -screening fixed at . Here, the area where -correction is non-applicable due to significant deterioration in accuracy is filled in gray. In this case, the correlation coefficient attained its optimal value at , which implies that the AC amplitude of the optical path length, , is approximately proportional to the square root of the total optical path length, . Similarly, Fig. 17(b) shows a transition of MARD and RMSE according to different power exponents . The optimal values for MARD and RMSE were observed at in this case as well.

4. Discussion

Although 10.5% of MARD has been confirmed by applying the proposed two-stage screening process in Fig. 15(d), it should be noted that this MARD value is equivalent to a training result in the case of the ML method.
It is occasionally proposed that only predictive results should be considered in the context of NIGM development.30 It is therefore more appropriate to evaluate the acquired oral challenge results by separating them into a training stage and a test stage. Figure 18 shows the average MARD in the test stage as a function of the training data ratio. In the case of a 30% training data ratio, for instance, nine datasets were randomly selected as training data from a total of 30 datasets. The LLS fitting coefficients were calculated from the training data. Subsequently, the MARD was calculated from the remaining 21 datasets for the test stage, using the LLS coefficients that had been derived in the training stage. Given that MARD values in the test stage depend on which datasets are selected as test data, the calculation was repeated a sufficient number of times with different selections of test data to obtain a reliable average MARD. In addition, the highlighted region in Fig. 18 indicates the standard deviation range of MARD. Figure 18 indicates that, at a training data ratio of 30%, the average MARD is below 12%, which is comparable to the value observed in Fig. 15(d). The average MARD curve then exhibits a nearly flat trend above a training data ratio of 50%, accompanied by a minimal standard deviation between 60% and 70%, which is a typical training data ratio in a regular ML process. These results demonstrate that the MI method with the proposed two-stage screening process exhibited sufficient performance in the BGL prediction stage as well as in the training stage. For reference, Fig. 19(a) shows a typical example of the training result at a training data ratio of 30%. Figure 19(b) shows an error grid analysis result generated from the training result for Fig. 19(a). In addition, Fig. 19(c) illustrates a test result generated from the learned parameters derived from Fig. 19(a).
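The repeated random-split evaluation described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: a single-variable linear model fitted with `numpy.polyfit` stands in for the LLS fitting, and the function names (`mard`, `split_counts`, `average_test_mard`, `make_synthetic_datasets`), the repeat count, and the synthetic data are all hypothetical.

```python
import numpy as np


def mard(reference, estimate):
    """Mean absolute relative difference (MARD) in percent."""
    reference = np.asarray(reference, dtype=float)
    estimate = np.asarray(estimate, dtype=float)
    return 100.0 * float(np.mean(np.abs(estimate - reference) / reference))


def split_counts(n_datasets, train_ratio):
    """Number of datasets used for training and for testing."""
    n_train = int(round(train_ratio * n_datasets))
    if not 0 < n_train < n_datasets:
        raise ValueError("train_ratio must leave at least one dataset on each side")
    return n_train, n_datasets - n_train


def average_test_mard(datasets, train_ratio, n_repeats=200, seed=0):
    """Mean and standard deviation of the test-stage MARD over random splits.

    datasets: list of (x, bgl) array pairs, one pair per oral challenge test.
    ASSUMPTION: BGL is modeled as a * x + b, a stand-in for the LLS fitting.
    """
    rng = np.random.default_rng(seed)
    n_train, _ = split_counts(len(datasets), train_ratio)
    scores = []
    for _ in range(n_repeats):
        order = rng.permutation(len(datasets))
        train = [datasets[i] for i in order[:n_train]]
        test = [datasets[i] for i in order[n_train:]]
        # Training stage: fit the coefficients on the training datasets only.
        a, b = np.polyfit(np.concatenate([x for x, _ in train]),
                          np.concatenate([y for _, y in train]), 1)
        # Test stage: reuse the learned coefficients on the held-out datasets.
        x_test = np.concatenate([x for x, _ in test])
        y_test = np.concatenate([y for _, y in test])
        scores.append(mard(y_test, a * x_test + b))
    return float(np.mean(scores)), float(np.std(scores))


def make_synthetic_datasets(n_datasets=30, n_points=20, noise_sd=10.0, seed=1):
    """Hypothetical data: a linear index-to-BGL relation plus Gaussian noise."""
    rng = np.random.default_rng(seed)
    datasets = []
    for _ in range(n_datasets):
        x = rng.uniform(0.0, 1.0, n_points)
        bgl = 90.0 + 80.0 * x + rng.normal(0.0, noise_sd, n_points)
        datasets.append((x, bgl))
    return datasets


if __name__ == "__main__":
    data = make_synthetic_datasets()
    for ratio in (0.3, 0.5, 0.7):
        mean, sd = average_test_mard(data, ratio)
        print(f"train ratio {ratio:.0%}: test MARD {mean:.1f}% +/- {sd:.1f}%")
```

With 30 datasets and a ratio of 0.3, `split_counts` gives the nine training and 21 test datasets quoted in the text. Splitting by whole dataset rather than by individual sample keeps each oral challenge test entirely on one side of the split, which is how the text describes the procedure.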
Here, given that the reference CGM has typically exhibited a MARD of approximately or below 10%, it can be reasonably assumed that of the average MARD observed in Fig. 18 at a training data ratio of 70% is close to the achievable limit under the current experimental protocol. To evaluate the performance of the MI method more precisely, venous blood draws or finger-prick SMBG must be adopted as the source of reference BGL values. As may be seen in Table 1, the smartwatch prototype exhibited an RMSE of 28.1 mg/dL without the proposed two-stage screening process, which represents a 1.5-fold increase in error relative to the smartphone camera-based prototype used in the previous research. This gap in RMSE can be attributed primarily to the difference in SNR between the prototypes. Specifically, the smartphone camera prototype employs a high-sensitivity CMOS camera with 108 megapixels, whereas the active area of the PD sensor on the smartwatch is optimized for the detection of HR and the measurement of oxygen saturation. Moreover, the LEDs in most commercially available smartwatches are driven at a current optimized for battery conservation. These factors collectively contribute to the observed difference of over 20 dB in SNR between the two types of prototypes, which ultimately led to the higher RMSE of the smartwatch prototype. Although it has been demonstrated that this specific SNR disadvantage of smartwatches can be compensated for through the proposed two-stage screening process, this screening approach is not a fundamental solution. In this study, of acquired data were rejected through the two-stage screening process in each of the 30 oral challenge tests. Under lower ambient temperatures, or when peripheral blood flow is reduced for other reasons, the data rejection rate is expected to increase, creating multiple void sections in the continuously monitored NIGM data.
To address the fundamental issue of smartwatches, it is necessary to consider enhancing the SNR, particularly that of the deoxyhemoglobin NIRS signal. As previously stated, an increase in LED current reduces battery life and may increase the risk of optical skin burn. Consequently, it would be preferable in this instance to pursue an approach that enhances the photodetector. Indeed, the smartphone camera-based prototype, which exhibited an SNR 10 times (20 dB) higher than that of the smartwatch prototype, is largely free from the effects of background noise. For the smartphone prototype, the error rooted in the sampling frequency is the most significant factor to address in reducing noise. Figure 20 shows a typical transition of the , , , and without the proposed screening process, all normalized at ; the plot is derived from a specific oral challenge test. Note that the y-axis is on a logarithmic scale. Figure 20 illustrates that the reached its lowest value at due to a reduction in the amplitude of the NIRS signal pulsation. Then, according to Eq. (13) with and Eq. (24), and increased in inverse proportion as , resulting from a lower NIRS signal pulsation amplitude. Finally, exhibited an increasing trend at a rate proportional to the inverse of the , to the power of greater than one. For comparison, the curve is plotted in Fig. 20. As illustrated in Fig. 20, may increase to five times its initial value. This result indicates that may reach as high as even if its typical value under normal conditions is . Therefore, for practical use, the must have an ample safety margin if the NIGM system is applied to smartwatches without the proposed screening process.
Given that the typical value of the smartwatch prototype under normal conditions is or 34 dB, a value at least two or even four times higher, equivalent to 40 to 46 dB, is preferable. With 40 dB of typical , the typical will be only 7 mrad, which is approximately one-tenth of the typical in a 100 Hz sampling state. In consideration of the form factor of smartwatches, it is not viable to expand the active region of photodiodes by a factor of 4 while maintaining the conventional background noise level.12 It is therefore necessary to consider enhancing the photoelectric conversion efficiency and reducing the thermal noise at the amplifier circuit, in addition to increasing the photodiode size as much as possible. In light of the above, it can be reasonably deduced that increasing the sampling frequency without an accompanying SNR enhancement would be ineffective, particularly for smartwatches, where the SNR is currently the dominant source of the BGL estimation error. On the other hand, it is also necessary to discuss the lowest acceptable SNR. In cases where the -correction is not required, namely, when the optical path length is kept nearly constant or is compensated for through suitable opto-mechanical measures, can be suppressed to some extent, and the SNR requirement can be eased. Given that , in accordance with the typical conditions for the smartwatch-based prototype, it is possible to maintain below 10 mrad with a minimum of 10, or 20 dB, which is five times lower than the typical . We can therefore propose with reasonable certainty that implementing opto-mechanical measures to compensate for the optical path length will provide an effective alternative to improving the SNR.
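The decibel figures quoted above can be cross-checked with a short conversion. The sketch below assumes the amplitude-ratio convention, dB = 20·log10(ratio); this convention is not stated explicitly in the text but is consistent with its own pairing of a ratio of 10 with 20 dB. The function names are hypothetical.

```python
import math


def ratio_to_db(ratio):
    """Amplitude ratio -> decibels (ASSUMPTION: 20*log10 convention)."""
    return 20.0 * math.log10(ratio)


def db_to_ratio(db):
    """Decibels -> amplitude ratio under the same convention."""
    return 10.0 ** (db / 20.0)


if __name__ == "__main__":
    print(f"34 dB corresponds to a ratio of about {db_to_ratio(34.0):.0f}")
    for factor in (2, 4):
        print(f"{factor}x higher: {34.0 + ratio_to_db(factor):.0f} dB")
    print(f"5x lower: {34.0 - ratio_to_db(5):.0f} dB")
```

Under this assumption, a typical value of 34 dB corresponds to a linear ratio of about 50, a two- to four-fold improvement gives 40 to 46 dB, and a five-fold reduction gives 20 dB, which matches the figures in the text.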
To apply the proposed data screening method more effectively, future studies should also test multiple subjects with different characteristics, such as age, sex, body mass index, and BGL.
5. Conclusion
This study presents an analytical derivation of an SQI for the detection of tangible noise on NIRS signals and an investigation of the factors that may affect BGL estimation accuracy. Subsequently, a two-stage data screening process was proposed, utilizing the derived SQI and the identified error factors. The effectiveness of this process was validated through 30 oral challenge tests. The proposed screening process improved the accuracy of BGL estimation for the smartwatch-based prototype, whose SNR is constrained by the device’s form factor. The proposed screening process would facilitate the integration of wearable and continuous BGL monitoring into size- and SNR-limited devices such as smartwatches and smart rings. In future studies, to reduce the data rejection ratio of the proposed screening process and enhance data utilization, it is essential to consider fundamental improvements to the SNR. This can be achieved by enlarging the active area of the photodetectors while reducing the noise in the amplifier circuit.
Code and Data Availability
The data underlying the results presented in this paper are not publicly available at this time due to privacy and ethical concerns but may be obtained from the authors upon reasonable request.
Acknowledgments
The authors would like to express their gratitude to Masato Kitabayashi, Koji Tsubota, Hideki Maeda, and Yu Hashimoto of Hamamatsu Photonics K. K. for their invaluable assistance in the NIGM project, both in their official and personal capacities.
References
J. Divya, “Clinical medicine and curative treatment for diabetes mellitus,” Can. J. Biotechnol. 1, 183 (2017). https://doi.org/10.24870/cjb.2017-a169
N. Rathwa et al., “β-cell replenishment: possible curative approaches for diabetes mellitus,” Nutr. Metab. Cardiovasc. Diseases 30(11), 1870–1881 (2020). https://doi.org/10.1016/j.numecd.2020.08.006
O. Didyuk et al., “Continuous glucose monitoring devices: past, present, and future focus on the history and evolution of technological innovation,” J. Diabetes Sci. Technol. 15(3), 676–683 (2021). https://doi.org/10.1177/1932296819899394
S. Alva et al., “Accuracy of a 14-day factory-calibrated continuous glucose monitoring system with advanced algorithm in pediatric and adult population with diabetes,” J. Diabetes Sci. Technol. 16, 70–77 (2020). https://doi.org/10.1177/1932296820958754
L. Heinemann and A. Stuhr,
“Self-measurement of blood glucose and continuous glucose monitoring—is there only one future?,”
Eur. Endocrinol. 14, 24 (2018). https://doi.org/10.17925/EE.2018.14.2.24
Y. Suzuki, Y. Atsumi and K. Matsuoka, “Finger infection resulting from self-monitoring of blood glucose and a new aid for reducing risk,” Diabetes Care 21(8), 1373 (1998). https://doi.org/10.2337/diacare.21.8.1373
B. Han et al., “Association between self-monitoring of blood glucose and hepatitis b virus infection among people with diabetes mellitus: a cross-sectional study in Gansu Province, China,” BMJ Open 11(10), e048463 (2021). https://doi.org/10.1136/bmjopen-2020-048463
G. A. Puckrein et al., “Assessment of glucose monitoring adherence in Medicare beneficiaries with insulin-treated diabetes,” Diabetes Technol. Ther. 25(1), 31–38 (2023). https://doi.org/10.1089/dia.2022.0377
T. Nakazawa et al., “Non-invasive blood glucose estimation method based on the phase delay between oxy- and deoxyhemoglobin using visible and near-infrared spectroscopy,” J. Biomed. Opt. 29(3), 037001 (2024). https://doi.org/10.1117/1.JBO.29.3.037001
W. H. Polonsky et al., “The impact of continuous glucose monitoring on markers of quality of life in adults with type 1 diabetes: further findings from the DIAMOND randomized clinical trial,” Diabetes Care 40, 736–741 (2017). https://doi.org/10.2337/dc17-0133
M. I. Maiorino et al., “Effects of continuous glucose monitoring on metrics of glycemic control in diabetes: a systematic review with meta-analysis of randomized controlled trials,” Diabetes Care 43, 1146–1156 (2020). https://doi.org/10.2337/dc19-1459
Y. Na et al., “Quarter-annulus Si-photodetector with equal inner and outer radii of curvature for reflective photoplethysmography sensors,” Biosensors 14(2), 109 (2024). https://doi.org/10.3390/bios14020109
J. Park et al., “Photoplethysmogram analysis and applications: an integrative review,” Front. Physiol. 12, 808451 (2022). https://doi.org/10.3389/fphys.2021.808451
A. K. Maity, A. Veeraraghavan and A. Sabharwal, “PPGMotion: model-based detection of motion artifacts in photoplethysmography signals,” Biomed. Signal Process. Control 75, 103632 (2022). https://doi.org/10.1016/j.bspc.2022.103632
M. Elgendi, “Optimal signal quality index for photoplethysmogram signals,” Bioengineering 3(4), 21 (2016). https://doi.org/10.3390/bioengineering3040021
L. B. Wood and H. H. Asada, “Noise cancellation model validation for reduced motion artifact wearable PPG sensors using MEMS accelerometers,” in Int. Conf. IEEE Eng. Med. and Biol. Soc., 3525–3528 (2006). https://doi.org/10.1109/IEMBS.2006.260359
H. Lee, H. Chung and J. Lee, “Motion artifact cancellation in wearable photoplethysmography using gyroscope,” IEEE Sens. J. 19(3), 1166–1175 (2018). https://doi.org/10.1109/JSEN.2018.2879970
A. H. Afandizadeh Zargari et al., “An accurate non-accelerometer-based PPG motion artifact removal technique using CycleGAN,” ACM Trans. Comput. Healthc. 4(1) (2023). https://doi.org/10.1145/3563949
F. Tabei et al., “A novel personalized motion and noise artifact (MNA) detection method for smartphone photoplethysmograph (PPG) signals,” IEEE Access 6, 60498–60512 (2018). https://doi.org/10.1109/ACCESS.2018.2875873
I. Oshina and J. Spigulis, “Beer-Lambert law for optical tissue diagnostics: current state of the art and the main limitations,” J. Biomed. Opt. 26(10), 100901 (2021). https://doi.org/10.1117/1.JBO.26.10.100901
N. Shah et al., “The role of diffuse optical spectroscopy in the clinical management of breast cancer,” Dis. Markers 19, 95–105 (2004). https://doi.org/10.1155/2004/460797
S. Matcher et al., “Performance comparison of several published tissue near-infrared spectroscopy algorithms,” Anal. Biochem. 227(1), 54–68 (1995). https://doi.org/10.1006/abio.1995.1252
D. A. Boas et al., “Noninvasive imaging of cerebral activation with diffuse optical tomography,” in In Vivo Optical Imaging of Brain Function, 193–221, CRC Press (2002).
H. Zhao et al., “Maps of optical differential pathlength factor of human adult forehead, somatosensory motor and occipital regions at multi-wavelengths in NIR,” Phys. Med. Biol. 47(12), 2075 (2002). https://doi.org/10.1088/0031-9155/47/12/306
W. A. Bowes III, B. C. Corke and J. Hulka, “Pulse oximetry: a review of the theory, accuracy, and clinical applications,” Obstetr. Gynecol. 74(3, Part 2), 541–546 (1989).
M. Tsoukas et al., “Accuracy of FreeStyle Libre in adults with type 1 diabetes: the effect of sensor age,” Diabetes Technol. Ther. 22, 203–207 (2019). https://doi.org/10.1089/dia.2019.0262
J. L. Parkes et al., “A new consensus error grid to evaluate the clinical significance of inaccuracies in the measurement of blood glucose,” Diabetes Care 23(8), 1143–1148 (2000). https://doi.org/10.2337/diacare.23.8.1143
A. Pors et al., “Accurate post-calibration predictions for noninvasive glucose measurements in people using confocal Raman spectroscopy,” ACS Sens. 8(3), 1272–1279 (2023). https://doi.org/10.1021/acssensors.2c02756
D. Klyve et al., “1003-P: a new machine-learning model and expanded dataset for a noninvasive BGM,” Diabetes 73, 1003 (2024). https://doi.org/10.2337/db24-1003-P
J. Smith, “The pursuit of noninvasive glucose: ‘hunting the Deceitful Turkey’,” (2023). https://www.nivglucose.com/The%20Pursuit%20of%20Noninvasive%20Glucose%208th%20Edition.pdf
Biography
Tomoya Nakazawa is an R&D engineer at the Electron Tube Division of Hamamatsu Photonics K.K. His current research interests include near-infrared spectroscopy (NIRS) for optical diagnostic devices. He received his master’s degree in engineering from the Department of Mechanical Engineering and Science, Kyoto University, and his bachelor’s degree in engineering from the Undergraduate School of Engineering Science, Kyoto University.