Infants born preterm are vulnerable to hypoxic and ischemic cerebral insults that can lead to long-term morbidity.1 Diagnostic methods to detect these conditions early and to prevent lesions are urgently needed. Optical methods such as near-infrared spectroscopy (NIRS) have the potential to fulfill this need by assessing cerebral oxygenation and blood perfusion (hemodynamics) noninvasively, especially because they can measure the tissue oxygenation (StO2) as an absolute parameter.2,3 The measurement of cerebral StO2 may be useful for detecting situations in which the oxygenation of the brain is impaired; the hope is that by properly adjusting the cerebral oxygenation, brain lesions will be prevented. However, the in vivo precision of NIRS-based oximetry has repeatedly been reported to be insufficient for clinical practice.2,3
In previous studies, the reproducibility of StO2 was examined for various NIRS-based oximeters. For example, Sorensen and Greisen4 reported a precision of 5.2% for the NIRO 300 (Hamamatsu Photonics, Hamamatsu, Japan). Recent studies showed that a higher precision in neonates of 2.0% (term) to 4.2% (preterm) can be achieved by excluding inhomogeneous tissue for the OxiplexTS (ISS, Champaign, Illinois).5
Technically, precision is quantified by the variation of repeated measurements with an instrument on an unchanged specimen, i.e., the true value of StO2 is assumed to be constant. In in vivo studies, this is not necessarily the case, as numerous regulation processes may cause physiological changes and thus changes in true StO2. These physiological changes contribute to the observed variability, causing it to be higher than the variation caused by the device (i.e., the precision). However, vital parameters associated with changes in true StO2, such as heart rate or arterial oxygen saturation (SpO2), were unfortunately not monitored during the measurements in previous studies, such as Refs. 5 and 6. SpO2 variability matters in particular because it acts as an “input parameter,” and changes in SpO2 are directly reflected in StO2.6
Data on the influence of these physiological changes or spontaneous systemic hemodynamic fluctuations on precision of NIRS measurements are sparse and inconclusive. While Hyttel-Sorensen et al.7 found that on the adult forearm most of the variation was not due to spontaneous fluctuations, Menke et al.8 stated that “most of the variation between repeated measurements is due to physiological variation” in their measurements on the forehead of preterm and term neonates.
The aim of our study was first, to determine the precision of the OxyPrem NIRS-based oximeter device in preterm neonates and second, to assess physiological changes during such measurements. Furthermore, we aimed to present several approaches reducing the impact of spontaneous systemic hemodynamic fluctuations on the measurements.
The study was performed at the Department of Neonatology, University Hospital Zurich. The study protocol was accepted by the ethical committee of Zurich (KEK 2010-0102/2) and received regulatory approval from Swissmedic (2010-MD-0019). Parental consent was obtained in all cases prior to enrollment.
Thirty-seven clinically stable preterm infants breathing spontaneously on room air were enrolled in our study. The demographic data of the evaluated infants are shown in Table 1. Infants with congenital malformations were excluded.
Table 1. Demographic data of preterm infants (n=35).
|Gestational age (weeks)||33(2/7)||25(1/7) to 36(1/7)|
|Postmenstrual age (weeks)||35(5/7)||32(1/7) to 38(5/7)|
|Birth weight (g)||1630||730 to 2820|
|Weight at measurement (g)||2070||1460 to 3000|
StO2 was measured with our in-house built NIRS-based oximeter OxyPrem v1.3. Peripheral arterial oxygen hemoglobin saturation (SpO2) was acquired by a Sensmart Model X-100 Universal Oximetry System with “6100CN cloth sensor—neonatal” pulse oximetry sensor (Nonin Medical, Inc., Plymouth, Minnesota).
The OxyPrem v1.3 sensor consists of four continuous-wave light sources and two light detectors [Fig. 1(c)], linearly and symmetrically arranged with the sources placed in between the detectors. Each source is equipped with multiple LEDs (690, 760, 805, and 830 nm). The two detectors D1 and D2 and sources S1 and S4 form an outer region of interest (ROI1) with source–detector separations (SDS) of 15 and 35 mm, respectively. Both detectors and sources S2 and S3 form an inner region (ROI2) with SDS of 20 and 30 mm, respectively. Except for this difference in source positions and SDS, both ROIs are technically identical; the major difference is that slightly different tissues are probed, with ROI1 offering deeper tissue sensitivity (due to the longer SDS). Sources and detectors are embedded in flexible silicone, and the sensor’s shape is adapted for easy application to strongly curved surfaces such as the tiny head of neonates.
The multidistance algorithm is based on the following steps. First, the ambient light intensity is subtracted. From the slope of the intensity decrease over distance, the effective attenuation is calculated, which is then transformed into concentrations of oxyhemoglobin and deoxyhemoglobin. Finally, StO2 is calculated as the ratio of oxyhemoglobin over the sum of oxyhemoglobin and deoxyhemoglobin. For each ROI, StO2 is calculated individually. The mean of both values for StO2 is calculated as well.
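The last step of this description can be sketched as follows. This is only a minimal illustration of the ratio and of the two-ROI mean; the function names and the example concentrations are hypothetical, and the earlier steps (ambient light subtraction, slope fitting, conversion of attenuation to concentrations) are not reproduced here.

```python
def tissue_oxygen_saturation(o2hb: float, hhb: float) -> float:
    """Return StO2 in percent from oxy- and deoxyhemoglobin concentrations.

    Both concentrations must be given in the same (arbitrary) unit.
    """
    total = o2hb + hhb
    if total <= 0:
        raise ValueError("total hemoglobin concentration must be positive")
    return 100.0 * o2hb / total


def mean_sto2(sto2_roi1: float, sto2_roi2: float) -> float:
    """Return the mean of the two per-ROI StO2 values (ROI1+2)."""
    return 0.5 * (sto2_roi1 + sto2_roi2)


# Hypothetical concentrations, chosen only to illustrate the ratio.
sto2_roi1 = tissue_oxygen_saturation(o2hb=30.0, hhb=20.0)
sto2_roi2 = tissue_oxygen_saturation(o2hb=35.0, hhb=15.0)
sto2_mean = mean_sto2(sto2_roi1, sto2_roi2)
```

Because StO2 is a ratio, a common scaling of both concentrations leaves the result unchanged.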
We employ a self-calibrating algorithm,9 which exploits the sensor’s symmetry to cancel most influential factors. For instance, differences in light coupling between sources or detectors are automatically canceled to a high degree. Mainly, this algorithm enables OxyPrem to compensate for local inhomogeneities such as an unequal distribution of hair underneath the sensor, birth marks, hematoma, smears of blood, or meconium under a source or detector. This algorithm also increases robustness to motion artifacts.10 OxyPrem thus provides a robust measurement.
In addition, in contrast to many other oximeters, OxyPrem’s StO2 is insensitive to the total hemoglobin concentration.11 Although designed for cerebral oximetry in neonates, it has a similar capability to assess the oxygenation in deeper layers of tissue as common adult sensors12 and is therefore a tool that determines StO2 in a variety of patients and anatomical locations.
Five 1-min measurements were conducted with an OxyPrem sensor placed over the left frontotemporal lobe (FTL) on the head of a preterm neonate. In between, the sensor was removed and reattached (resited) at approximately the same location without marking. The FTL was selected because it is a common region to place NIRS sensors as it is not covered by hair. Before the start of the next measurement, we waited for sensor LEDs to regulate (see Fig. 2 for a typical example).
To assess the spontaneous systemic hemodynamic fluctuations, we recorded SpO2 on the right arm and placed a second OxyPrem sensor over the occipital lobe (OL) with the center of the sensor positioned 1 cm above the inion. The OL was selected as a reference location because it is far away from the FTL and thus avoids cross talk between the two sensors. To obtain data with high signal-to-noise ratio, we preferably recruited light-haired subjects. The placement of the OxyPrem sensors over the head of a neonate is shown in Figs. 1(a) and 1(b).
The StO2 and SpO2 signals were synchronized by setting an event mark directly after the recording had been started. Data selection and analysis were performed as follows:
I. Typically, five median values (one for each 1-min measurement, called “block median”) per subject were analyzed by a linear mixed-effects model (LMEM) with subject as factor13,14 using the statistics software R (Version 3.3.2).15 The average residual corresponds to the variability within subjects and is therefore called intrasubject standard deviation (SDintraSubj). This approach has been used in previous studies, for instance, in Ref. 16. We evaluated SDintraSubj of ROI1 and ROI2 individually as well as of their average (ROI1+2) and included all data, irrespective of artifacts caused by drops in SpO2 or subject movement (“all available”).
II. Visual inspection of the data revealed several subjects in which out of 5 blocks were containing artifacts, mostly motion artifacts or desaturations. These subjects were excluded for further analysis with the LMEM (“manual rating”). This approach was inspired by functional studies, where artifacts are often rated manually.
III. In another analysis with an LMEM, subjects showing high physiological variation between block medians were removed. The criterion for subject removal was a threshold on the SD between the five block medians of SpO2 (“stable SpO2”). SpO2 reflects systemic physiology, and hence thresholds for variability in SpO2 lead to a dataset with smaller physiological changes, but also less data for analysis. Our thresholds were selected as a trade-off between removing physiologically unstable subjects and retaining data. We set a strict threshold because changes in SpO2 lead to concurrent changes in StO2 anywhere in the body.7
IV. In this dataset, an additional criterion for stable OL StO2 was added to exclude drifts in cerebral StO2, which are possible even if SpO2 is stable. The threshold was set to remove only the most severe cases, i.e., it removed only one severely affected subject.
V. and VI. Two additional analyses were performed, requiring variability of StO2 within a block to be below a threshold. Variability during the blocks was determined by taking the median absolute deviation (MAD17) of all samples within a block. We chose MAD as a measure because it is more robust against outliers than the SD. For better comparability, a factor of 1.4826 was applied (scaled MAD18) to obtain similar values to SD for artifact-free, normally distributed data. We then compared scaled MAD to two different thresholds, 2.5% (V) and 1% (VI), and analyzed these two datasets with the LMEM. The 2.5% threshold was chosen to filter only the most varying blocks, and the 1% threshold was chosen to ensure negligible variation during the blocks.
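The quantities used in analyses I to VI can be sketched in a few lines. This is a simplified illustration under stated assumptions, not the R analysis used in the study: the within-subject SD below is the pooled one-way ANOVA estimate, used here as a stand-in for the LMEM residual SD, and all function names, data values, and the SpO2 threshold in the example are hypothetical.

```python
import math
import statistics

MAD_SCALE = 1.4826  # factor from the text; makes MAD comparable to SD for normal data


def scaled_mad(samples):
    """Scaled median absolute deviation of all samples within one block."""
    med = statistics.median(samples)
    return MAD_SCALE * statistics.median([abs(x - med) for x in samples])


def within_subject_sd(block_medians_by_subject):
    """Pooled within-subject SD of block medians (stand-in for SDintraSubj).

    Squared deviations from each subject's own mean are pooled and divided
    by the residual degrees of freedom (blocks minus subjects).
    """
    sum_sq, dof = 0.0, 0
    for medians in block_medians_by_subject.values():
        if len(medians) < 2:
            continue  # a single block carries no within-subject information
        mean = statistics.mean(medians)
        sum_sq += sum((x - mean) ** 2 for x in medians)
        dof += len(medians) - 1
    if dof == 0:
        raise ValueError("need at least one subject with two or more blocks")
    return math.sqrt(sum_sq / dof)


def subject_is_stable(spo2_block_medians, max_sd):
    """Analysis III style criterion: SD between block medians below max_sd."""
    return statistics.stdev(spo2_block_medians) < max_sd


def keep_quiet_blocks(blocks, threshold):
    """Analysis V/VI style criterion: keep blocks with scaled MAD below threshold."""
    return [block for block in blocks if scaled_mad(block) < threshold]


# Hypothetical block medians of StO2 (%) for two subjects.
medians = {"subj_a": [60.0, 62.0], "subj_b": [70.0, 74.0]}
sd_intra = within_subject_sd(medians)

# Hypothetical samples of two blocks; the second block fluctuates strongly.
blocks = [[64.0, 65.0, 66.0, 65.0, 64.5], [55.0, 70.0, 60.0, 75.0, 50.0]]
quiet = keep_quiet_blocks(blocks, threshold=2.5)
```

Filtering whole subjects (III, IV) and filtering individual blocks (V, VI) trade data quantity against physiological stability in different ways, which is why the number of remaining subjects and blocks is reported per analysis.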
Ratios between variability during the blocks for different ROIs give insight into the amount of physiological changes present in the StO2. Calculation starts with the scaled MAD of all samples of a block. Then, the scaled MAD of one ROI is divided by that of the other, and in the next step the mean of the five blocks is calculated. The mean and median over all subjects are shown in Table 3. As this analysis only takes variability within one block into account, i.e., without sensor resiting, these ratios are not affected by this procedure and we would not expect higher variability in FTL compared with OL StO2. Short-term artifacts have marginal influence on MAD within a block of one minute. We, therefore, associate the numbers in Table 3 with the level of physiological changes contained in the StO2 of the different ROIs.
In total, 37 patients were included in this study. Data from two subjects were discarded due to a complete data loss. From the remaining 35 patients, all available data were analyzed. In some of these subjects, it was not possible to obtain sufficiently reliable data and datasets are partially incomplete: only four measurements were performed (ID 19); due to technical problems with the pulse oximeter instrument, SpO2 recordings are not available (IDs 3 and 18); OL data were only available in a single complete measurement and therefore disregarded (ID 7).
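The ratio calculation described above can be sketched as follows, for one subject. The function names and sample values are hypothetical, and the guard against a zero denominator is an added assumption that the text does not mention.

```python
import statistics

MAD_SCALE = 1.4826  # scaling factor quoted in the text


def scaled_mad(samples):
    """Scaled median absolute deviation of all samples within one block."""
    med = statistics.median(samples)
    return MAD_SCALE * statistics.median([abs(x - med) for x in samples])


def mean_variability_ratio(blocks_numerator, blocks_denominator):
    """Mean over blocks of scaled MAD (numerator ROI) / scaled MAD (denominator ROI).

    Both arguments are lists of blocks recorded at the same time, e.g.
    OL and FTL, or ROI1 and ROI2 of one sensor.
    """
    ratios = []
    for num, den in zip(blocks_numerator, blocks_denominator):
        den_mad = scaled_mad(den)
        if den_mad == 0:
            raise ValueError("denominator block shows no variability")
        ratios.append(scaled_mad(num) / den_mad)
    return statistics.mean(ratios)


# Hypothetical StO2 samples (%) of two simultaneous blocks per location.
ol_blocks = [[60.0, 62.0, 64.0], [61.0, 65.0, 69.0]]
ftl_blocks = [[70.0, 71.0, 72.0], [70.0, 72.0, 74.0]]
ratio_ol_ftl = mean_variability_ratio(ol_blocks, ftl_blocks)
```

A ratio above 1 indicates more within-block variability in the numerator signal than in the denominator signal.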
A typical time series of one patient (ID 12), comprising StO2 of the FTL and OL and SpO2, is shown in Fig. 2; all signals are strongly affected by spontaneous fluctuations.
Block medians of SpO2, which indicate the degree of systemic physiological changes, are shown in Fig. 3(a). The SD of SpO2 throughout the whole recording varies considerably between subjects and is shown in Fig. 3(b). In a number of subjects, there are high values, meaning significant variation over time, and only in very few cases SD is .
The block medians of StO2 are graphically presented as box plots in Fig. 4. These show StO2 for ROI1, ROI2, and ROI1+2 of FTL and OL sensors, respectively. In all figures, a number of subjects with a high variability of values were found, whereas for many subjects there is very little variation.
This finding is illustrated even better when plotting the variability of block medians of StO2: SD and scaled MAD for the block medians of StO2 of each subject are shown in Fig. 5. Data are sorted in ascending order of SD, and the median values of both are indicated by horizontal lines. Sorting by SD allows for subject-wise comparison of SD and scaled MAD.
The resulting SDintraSubj for StO2 and SpO2 obtained by different analyses (I to VI) are shown in Table 2. The number of subjects (subj) and the number of blocks considered for each analysis type are provided. Different numbers of subjects with available data or blocks filtered out by the applied criteria are the causes for the given spans. For example, “all available” contains FTL data from 35, OL data from 34, and SpO2 data from 33 subjects, resulting in a span of 33 to 35, respectively.
Table 2. Variability (Var) between the 1-min measurements (blocks) of SpO2 and StO2 of the FTL and OL. Variability expressed as within-subject (SDintraSubj) in StO2 [%] calculated by LMEM based on all data and after applying different criteria (I–IV). For each analysis, the minimal and maximal number of available subjects and blocks passing the quality criteria is given, e.g., II “manual rating” contains FTL data from 30, OL data from 29, and SpO2 data from 29 subjects, i.e., a span of 29 to 30.
|Analysis||Method||Subjects (n)||Blocks (n)||ROI1 FTL [%]||ROI2 FTL [%]||ROI1+2 FTL [%]||ROI1 OL [%]||ROI2 OL [%]||ROI1+2 OL [%]||SpO2 [%]|
|I||All available||33 to 35||164 to 174||2.64||8.21||4.84||3.06||5.79||4.15||2.35|
|II||Manual rating||29 to 30||144 to 149||2.02||4.14||2.77||2.76||5.10||3.61||1.44|
|V||Var <2.5%||30 to 35||120 to 155||2.11||3.93||2.77||2.32||2.86||2.62||-|
|VI||Var <1%||20 to 27||54 to 88||2.04||2.65||2.56||1.77||2.70||2.21||-|
An analysis of the mean of variability during the 1-min measurements for StO2 is shown in Table 3.
Table 3. Ratios of variability during the 1-min measurements.
|ROI1 OL/FTL||ROI2 OL/FTL||ROI1+2 OL/FTL||ROI1/ROI2 FTL||ROI1/ROI2 OL|
Taking all available data of our precision assessment in clinically stable preterm infants into account results in a variability of 2.64% for OxyPrem. However, we attribute a major share of this number to physiological changes occurring during the measurement. Different methods for reducing the influence of physiological changes suggest that the precision of OxyPrem is actually superior to this figure.
What does precision in cerebral monitoring with NIRS-based oximeters mean for clinical applications? Precision corresponds to the reliability of the numbers displayed to the clinician and is device specific. If an instrument displays an StO2 of 60% and we assume that the error of measurement is normally distributed around the 60%, we would like to know the probability that the true StO2 is in reality below 55%, values that are considered to be too low.19 For an instrument with 5% precision, in 1 out of 6 cases the true StO2 is indeed below 55%, whereas for 2.5% precision it is only 1 out of 46 and for 2% precision 1 out of 172 cases, respectively. Thus, precision is directly linked to how trustworthy a displayed StO2 value is, and it represents a crucial parameter when clinically applying cerebral oximetry.
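The reasoning above can be reproduced approximately with a normal tail probability. The sketch below assumes a lower limit of 55%, an assumption consistent with the 1-in-6 figure for 5% precision, and treats the precision as the SD of a normal error around the displayed value; the function name is hypothetical. A plain normal model yields odds of the same order as those quoted in the text, but not necessarily the identical numbers.

```python
from statistics import NormalDist


def prob_true_below(displayed, lower_limit, precision_sd):
    """Probability that the true value lies below lower_limit.

    Assumes the true value is normally distributed around the displayed
    value with a standard deviation equal to the precision.
    """
    return NormalDist(mu=displayed, sigma=precision_sd).cdf(lower_limit)


DISPLAYED_STO2 = 60.0  # displayed value from the text (%)
LOWER_LIMIT = 55.0     # assumed lower limit (%), see lead-in

for precision in (5.0, 2.5, 2.0):
    p = prob_true_below(DISPLAYED_STO2, LOWER_LIMIT, precision)
    print(f"precision {precision}%: p = {p:.4f}, about 1 in {1 / p:.0f}")
```

The steep drop of the tail probability between 5% and 2% precision is what makes precision so relevant for interpreting a single displayed value.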
To determine the precision in vivo, a sensor is repeatedly placed over the same tissue, and the variation of the values between these resitings corresponds to the precision. This is only true if the measured physiological parameter changes negligibly. Under this assumption, the results show that our new in-house developed NIRS-based oximeter OxyPrem, with a precision of 2.64%, meets the precision requirements requested for clinical practice in cerebral oximetry of preterm infants.2,3
Typically, the precision of other instruments is approximately 5%: e.g., 5.2% in neonates4 and 6.1% in anesthetized children20 were reported for the NIRO-300 (Hamamatsu), 6.7% in neonates21 and 7.1% in anesthetized children20 for the INVOS 5100C adult SomaSensor (Medtronic), and 4.8% in term-born neonates for the INVOS 5100C OxyAlert neonatal sensor.22 A better repeatability in neonates of 1.7% was reported8 for a discontinued device7 and of 2.8% for the FORE-SIGHT neonatal small dual sensor (Casmed).22 However, for both sensors a decreased dynamic range of changes in StO2 was reported,22,23 i.e., a lower sensitivity. A precision of 2.76% in 30 mostly preterm infants was demonstrated for a prototype NIRS device employing a self-calibrating algorithm.16 By excluding inhomogeneous tissue, a higher precision of 2.0% in term infants and 4.2% in preterm infants was determined for the OxiplexTS (ISS).5
OxyPrem is capable of measuring two independent StO2 values (ROI1 and ROI2). We expected that the average of these two (ROI1+2) would perform better than each individual ROI, which is not the case. The reason is that ROI1 incorporates the longest source–detector distance available (35 mm), which adds stability. We consequently refer to ROI1 when we speak of the precision of OxyPrem.
Precision is only correctly determined if the true StO2 does not change between measurements. However, this assumption is incorrect because the human body constantly regulates and creates physiological changes. So far, these fluctuations were deemed negligible. One major result of this study is that it is the first to quantify these fluctuations and to estimate the respective errors in the precision calculation. Indeed, it was shown that these fluctuations are often not negligible at all. Therefore, an assessment of the precision of an instrument requires a correction for this influence.
Contribution of Spontaneous Physiological Changes
In subject 12, SpO2 fluctuates with an SD of almost 4% (Fig. 3). Figure 2 shows that in SpO2 and both StO2 signals of subject 12, there are significant oscillations present, which is due to coupling between SpO2 and StO2.7 In addition, most of the variation in the FTL is not actually caused by the sensor resiting. Furthermore, in the OL even stronger changes are visible, although the sensor was kept in place (Fig. 2).
There are different types of contributions to the variability observed in this study. The first type is variations occurring when the true StO2 is unchanged. This type is defined as the precision of a device, i.e., (1) random noise, (2) changes in shape of the sensor, and (3) differences in optical coupling due to the resiting of the sensor. In an in vivo assessment, however, changes in the true StO2 are present. Some are caused by the sensor resiting: (4) tissue heterogeneity, if the measurement location is not kept exactly the same, (5) changes in skin perfusion and oxygenation caused by the manipulation, and (6) changes in brain perfusion and in arterial and venous compartment sizes because the head was moved during the manipulation. In addition, spontaneous physiological changes may affect cerebral NIRS measurements.24–27 These occur independently of the sensor’s removal and reattachment process and reflect true alterations in StO2: (7) systemic changes in SpO2 (desaturations and fluctuations and/or oscillations) and (8) changes in blood flow, hemoglobin concentration, and oxygen consumption of the brain due to different regulation processes.
The definition of precision is the random error associated with measuring an unchanged quantity repeatedly. However, most of the contributions to the variability listed above correspond to changes of the true StO2 (contributions 4 to 8). In analysis I, including all subjects (“all available”) (Table 2), SDintraSubj for OL StO2, SpO2, and FTL StO2 are almost the same. This shows that the major contribution to variability cannot be the resiting process and that SDintraSubj corresponding to the observed variability is systematically higher than the instrument-related variability in repeated measurements (i.e., precision, contributions 1 to 3).
Figure 2 shows another effect that matters for in vivo precision experiments. The shorter the measurement periods, the stronger is the effect of physiological fluctuations. In this subject (ID 12), one period of physiological fluctuation is . Due to 1-min averaging, these changes cancel to a certain degree. Therefore, although the SD of SpO2 is almost 4% (Fig. 3), this measurement passed the criteria for stable physiology (“stable physiology”) with an SD of 1.1% between the block medians. With shorter block sizes, this smoothing effect diminishes, resulting in higher SDintraSubj. This may explain the different precisions determined by different research groups for the same sensor on the same patient groups.
In Table 3, the ratio for OL/FTL is >1 in all three cases (ROI1, ROI2, and ROI1+2) and for ROI1/ROI2 is <1 in both cases (FTL and OL). As MAD is robust to short-time artifacts (e.g., motion artifacts), we assume that the variation during the blocks originates from spontaneous physiological fluctuations in the StO2 (Fig. 2). Generally, physiological variation in StO2 in the OL is higher than in the FTL. In addition, we see less variation in StO2 of ROI1 than in ROI2. ROI1, with SDS of 15 and 35 mm, collects data from deeper layers of the brain and averages a larger volume of tissue, leading to a more stable signal.
How to Reduce the Effect of Spontaneous Physiological Changes in Case of NIRS Reproducibility Measurements
For correctly calculating precision, the true StO2 needs to be unchanged. However, several subjects showed strong physiological changes during the measurement, invalidating this assumption. This causes a systematic overestimation of the precision value, i.e., the instrument appears less precise than it is. To minimize this effect, we applied different criteria to remove individual blocks or subjects and analyzed the remaining data with the LMEM. Table 2 shows the resulting SDintraSubj and the number of subjects and blocks that passed the respective criteria. For clinical application, there is of course no requirement for StO2 to remain unchanged, and a patient with variable physiology can be reliably monitored with a NIRS instrument. Such patients probably are among those benefiting the most from such monitoring.
II. The approach with manual artifact rating depends on the person performing the rating. This is a difficult task because not only movement artifacts but also continuous spontaneous physiological changes have to be considered. We, therefore, do not favor this approach.
III. The approach to include subjects with stable systemic physiology (“stable SpO2”) requires additional sensors in place (SpO2). The criterion allows periodic fluctuations within the blocks but requires the physiology to be stable throughout the whole experiment. This approach results in smaller SDintraSubj (Table 2). In at least one subject (ID 8), a substantial slow trend in StO2 was observed, which clearly violates the assumption of stable “true” StO2. This infant was not identified by the SpO2 criterion. This shows that although both are coupled,7 SpO2 cannot predict cerebral StO2. Therefore, monitoring systemic physiology alone is not sufficient to guarantee stable cerebral StO2.
IV. Thus, we added another criterion based on variability of OL StO2. We consider the remaining subjects as having “stable physiology,” which is reflected in even lower SDintraSubj (Table 2) than in the previous approaches. However, precision is probably still considerably overestimated due to a major physiological contribution, because SDintraSubj of the FTL sensor is almost identical to that of the OL sensor, which was not resited.
V. and VI. The approach limiting variability within each block (thresholds of 2.5% and 1%) discards all blocks with periodic fluctuation irrespective of whether or not they influence the block median. However, a slow change in oxygenation is not removed and still causes an overestimation of SDintraSubj. Especially the strict criterion with the 1% threshold discards many blocks, with only approximately three left per subject on average, which reduces statistical power.
The box plot in Fig. 4 and the sorted variabilities of all subjects in Fig. 5 show that the majority of subjects have relatively small deviation between the blocks and only a few subjects considerably increase the overall SDintraSubj. Generally, SD and scaled MAD are similar for normally distributed data. Our dataset is not normally distributed, as indicated by scaled MAD providing on average lower values than SD. The reason is that MAD provides better robustness to outliers, whereas SD is only valid for normally distributed data. This is further demonstrated by median values of scaled MAD in Fig. 5 being typically lower than the SDintraSubj of LMEM (Table 2). Therefore, even in datasets with strong physiological variability excluded, LMEM systematically overestimates the variability. This also means that the precision of OxyPrem is better than 1.85% in reality. Although the resulting variability still depends on physiological changes and is still higher than the true precision of the instrument, it definitely is closer to the true precision.
For all discussed methods to obtain the true precision, SDintraSubj of the repositioned sensor was substantially lower than in the analysis based on all available data. This indicates that physiological changes contributed a major part to SDintraSubj. Neglecting these physiological changes leads to a substantial overestimation of the precision. The proposed methods lead to a much reduced and more correct precision, which is still somewhat overestimated. For future precision studies, we recommend employing the method requiring stable SpO2 with some additional criteria to detect trends in StO2, as both systemic and cerebral physiology need to be monitored and restricted to certain limits to reduce their influence on the precision determination. We thus refer to the results of the approach “stable physiology” (IV) as the best estimate for OxyPrem precision.
Strengths and Limitations of this Study
A strength of this study is that we clearly demonstrated how systemic physiological changes have an impact on precision assessments of cerebral oximeters in clinically stable preterm infants. We further presented several approaches to cope with this undesired effect; all of them estimate precision more realistically than approaches neglecting the effect of physiological changes. A limitation of this study is that we acquired data only in preterm infants. We, therefore, cannot conclude a dominant influence of physiological changes in other patient groups, although it is very likely that this factor also plays a significant part in reproducibility studies based on other populations. We measured SpO2 with a clinical pulse oximeter for the approaches applying thresholds on variability (III and IV). We consider this pulse oximeter to be reliable, in particular because we only considered changes in SpO2 and the sensor was not resited. Therefore, random errors due to sensor placement or a device-specific bias do not affect the exclusion of subjects.
Do FTL and OL represent cerebral signals? Influence of extracerebral tissues has been reported for cerebral measurements in adults28,29 and could potentially also affect our study. However, the neonatal skull and other extracerebral layers are thin30 and are therefore less relevant. Furthermore, OxyPrem employs a multidistance algorithm that effectively cancels influence of superficial tissue. Therefore, we can safely state that OxyPrem reflects cerebral oxygenation in neonates. OxyPrem has similar deep-tissue sensitivity as common adult oximeters.12
We presented a methodology to assess the precision of NIRS-based oximetry devices. For the first time, we showed that in neonates, systemic physiology and cerebral regulation indeed cause major changes in StO2 that invalidate conventional precision estimates. Therefore, in addition to StO2 of the sensor that was removed and reattached, SpO2 and StO2 of a sensor kept in place need to be measured continuously to identify such variations. It is necessary to correct for this type of variability. Several such methods were presented. For future studies, we recommend using an analysis that limits the influence of systemic and cerebral physiology to obtain a proper precision estimate. OxyPrem has been shown to be a highly precise instrument, with a precision better than that of instrumentation currently used in clinics.
Dr. Kleiser, Mr. Ostojic, Dr. Nasseri, and Prof. Wolf are inventors of a pending European patent filed by University of Zurich, which covers the features related to the superior precision of OxyPrem.
This work was funded by the Nano-Tera projects ObeSense, and NewbornCare, the Swiss National Science Foundation (project 159490), and the BioEntrepreneur-Fellowship of the University of Zurich, Reference No. BIOEF-17-004.