Currently, the method used for determining burn-wound depth is a visual inspection, which relies heavily on the experience of the physician. In some instances, an accurate assessment is easily achieved, as is the case in identifying superficial and full-thickness injuries. Unfortunately, assessment of burn depth is sometimes difficult, especially in differentiating intermediate partial-thickness wounds from deep partial-thickness injuries. The latter usually require surgery while the former will generally heal with appropriate wound care. In this situation, even experienced burn clinicians make an incorrect diagnosis 40% of the time.1, 2
To make the distinction between a burn that will heal spontaneously and one that requires excision and grafting, an assessment of the dermis and subcutaneous tissue is critical. The depth of injury is directly related to the healing potential of burns, which are usually classified according to their depth (Fig. 1). If dermal elements, including the microvascular blood supply, remain intact or are only minimally damaged, the injury has the potential to heal. Under favorable conditions, such superficial and intermediate partial-thickness injuries generate new skin spontaneously, and the wound heals on its own. If dermal elements are extensively destroyed, the blood supply cannot support the injured area and the site of injury has little or no chance of healing. Such injuries should be classified as deep partial or full thickness. Treatment decisions are based on the evaluation of burn depth, with inaccurate evaluations leading to unnecessary surgeries or lengthened hospitalization.
The diagnostic limitations of visual assessment of burn injury have led many researchers to search for instrumental methods to help differentiate or classify burn injuries. A wide range of diagnostic tools has been reported in the literature, including laser Doppler, various dyes, ultrasonography, thermography, nuclear magnetic resonance or magnetic resonance imaging, and optical coherence tomography.2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16 However, to date, none of these technologies has achieved widespread clinical acceptance or utilization.3, 17 Few of these technologies have demonstrated the ability to accurately distinguish shallow healing wounds from deeper wounds requiring surgery. One technique that was reported to be accurate and has undergone limited human trials is indocyanine green fluorescence.7, 8, 9 The subjectivity underlying the interpretation of the fluorescence images and the need to deliver the fluorescent dye intravenously seem to have limited the clinical adoption of this method.
Noninvasive near-infrared spectroscopy (NIRS) is sensitive to changes in the scattering properties of the underlying tissue. The structural derangements that accompany a burn injury would be expected to change the scattering properties of tissue. In addition to being able to infer structural alterations by detecting changes in the scattering of tissue, NIRS is also sensitive to the local hemodynamics of the burn wound by measuring the absorption of near-infrared light by hemoglobin. Inadequate blood supply or oxygen delivery to the dermis seals the fate of the injury and indicates that surgery is warranted. NIRS was used to examine the hemodynamics of burn injuries in the early post-burn period in a porcine model.18, 19 The regional hemodynamic response of a severe burn injury in the early post-burn period shows a drop in tissue water content, total hemoglobin, and oxygenation, indicating a loss of circulation to the dermis. The response at milder burns was less dramatic, indicating a residual blood supply to the skin. These studies demonstrated statistical associations between NIRS measurements of tissue oxygenation and blood volume and the severity of the injury. While the strong statistical associations between the hemodynamics as measured by NIRS and burn severity are compelling, the studies fell short of demonstrating that the hemodynamic parameters obtained from NIRS were capable of being used as a predictive tool for classifying the severity of the injury. In this paper, we demonstrate the ability of NIRS to perform individual-level classification of burn severity using an acute porcine model where burn depth can be precisely controlled and confirmed through biopsy.
Methods and Materials
All procedures conformed to the guidelines set out by the Canadian Council on Animal Care regarding the care and use of experimental animals and were approved by the local Animal Care Committee of the National Research Council of Canada. The animals were anesthetized for the entire duration of the protocol and then euthanized at the end of the experiment without any recovery of consciousness.
The burn study was performed on the dorsal surface of pigs ( animals; with 4 burns/animal). A heated brass rod, in diameter, was used to create superficial and intermediate partial-thickness burns, which are considered to be shallow healing injuries, as well as deep partial- and full-thickness burns that generally require surgery. Two types of shallow injuries were created where the brass rod was in contact with the skin for 3 and , while two types of deep injuries were created using contact times of 20 and respectively. Depth of injury was confirmed at the end of the experiment by histological analysis. Biopsies from each burn were acquired and stained with hematoxylin and eosin (H&E). Under a dissecting microscope, the extent of epidermal and dermal damage in the stained biopsy was determined visually and graded as superficial, intermediate partial, deep partial, and full thickness according to the extent of the burn as depicted in the upper panel of Fig. 1. Representative images of superficial, intermediate, and deep partial-thickness injuries as well as a full-thickness wound are presented in the bottom panel of the figure.
Since the tissue’s response to thermal damage can extend approximately away from the site of the injury,20 each burn was spaced apart from the next. Such spacing prevents the response arising from one injury from being observed at another site. The model also ensures that the total body surface area (TBSA) burned remains below 1%. Such a design minimizes the possibility of a systemic reaction, which typically occurs following larger burn injuries. Waltman21 has reported that burns covering less than 15% TBSA are considered to have only minor systemic effects, so long as full-thickness burns do not represent more than 2% of this area. The overall model was adapted from Schomaker 22 and Danilenko 23 and the details of the study design are described in prior publications.18, 19
Reflectance spectra were collected with an NIRSystems 6500 (Foss, Silver Springs, MA) spectrometer using a custom bifurcated fiber optic bundle (Fiberguide Industries, Stirling, NJ). A 99% Spectralon® reflectance standard (LabSphere Inc., North Sutton, NH) was used as a reference to convert raw data into reflectance spectra. Each reflectance spectrum consisted of two 32 co-added scans collected between at resolution.
Reflectance data were converted to optical density units through a ratio of the tissue reflectance against the 99% Spectralon® reflectance standard by using the formula . To eliminate variation in the offset between optical density spectra and to reduce noise, an 11-point Savitzky–Golay first-derivative filter based on a third-order polynomial was applied to all spectra. To control for interanimal variation, difference spectra (burn minus control) were computed over the wavelength range of by subtracting the first-derivative attenuation spectra taken at matched control sites from the corresponding first-derivative attenuation spectra of the burns. All further statistical analysis was carried out on these first-derivative difference spectra.
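The preprocessing chain described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' implementation: the optical density is taken as the conventional negative base-10 logarithm of the reflectance ratio (an assumption, since the formula is not given here), the derivative is expressed per data point rather than per nanometre, edge points without a full 11-point window are dropped, and all function names are hypothetical.

```python
import math

def optical_density(tissue, reference):
    """Assumed conventional definition: OD = -log10(R_tissue / R_reference)."""
    return [-math.log10(t / r) for t, r in zip(tissue, reference)]

def savgol_first_derivative_weights(half_width=5):
    """Weights of a cubic Savitzky-Golay first-derivative filter.

    For a symmetric window only the odd powers of the fitted cubic affect
    the slope at the centre, so the least-squares solution reduces to a
    2x2 system in the even moments S2, S4 and S6 of the sample offsets.
    """
    offsets = range(-half_width, half_width + 1)
    s2 = sum(i ** 2 for i in offsets)
    s4 = sum(i ** 4 for i in offsets)
    s6 = sum(i ** 6 for i in offsets)
    det = s2 * s6 - s4 * s4
    return [(s6 * i - s4 * i ** 3) / det for i in offsets]

def first_derivative(spectrum, half_width=5):
    """Smoothed first derivative (per data point); edge points are dropped."""
    weights = savgol_first_derivative_weights(half_width)
    return [
        sum(w * spectrum[centre - half_width + k] for k, w in enumerate(weights))
        for centre in range(half_width, len(spectrum) - half_width)
    ]

def difference_spectrum(burn_od, control_od, half_width=5):
    """First-derivative difference spectrum, burn minus matched control."""
    d_burn = first_derivative(burn_od, half_width)
    d_control = first_derivative(control_od, half_width)
    return [b - c for b, c in zip(d_burn, d_control)]
```

Because the first derivative removes any constant offset, two spectra that differ only by a baseline shift give identical derivative spectra, which is the property the text relies on to suppress pigmentation-related offsets.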
Using only the NIRS data acquired after the burn injury, we developed a partial least squares (PLS)–logistic regression (LR) model to predict the probability of the injury being a deep burn. PLS models were built using the SIMPLS algorithm24 (PLS Toolbox© 1997–98 Eigenvector Research, Inc., Manson, WA) in the Matlab scripting language (Matlab Version 6, Mathworks, Natick, MA). To perform discrimination, the two classes (shallow and deep burns) were assigned dummy regression labels of and ( number of samples, of samples in class 1, of samples in class 2). A truncated set of scores from the PLS model, in this case two latent variables, was used as independent variables in LR models, effectively reducing the number of independent variables in the LR model and avoiding potential collinearity problems. The LR models express the probability that the underlying tissue suffers from a deep injury as a function of the scores of the first and second latent variables from the PLS model.
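For reference, a logistic regression on two latent-variable scores is conventionally written as shown below. The symbols are assumptions chosen for illustration and are not taken from the original notation: $p$ is the probability of a deep injury, $t_1$ and $t_2$ are the scores of the first and second PLS latent variables, and $\beta_0$, $\beta_1$, $\beta_2$ are the regression coefficients.

```latex
\ln\!\left(\frac{p}{1-p}\right) = \beta_0 + \beta_1 t_1 + \beta_2 t_2
\qquad\Longleftrightarrow\qquad
p = \frac{1}{1 + \exp\!\left[-\left(\beta_0 + \beta_1 t_1 + \beta_2 t_2\right)\right]}
```

In this form, $p = 0.5$ corresponds to the line $\beta_0 + \beta_1 t_1 + \beta_2 t_2 = 0$ in the plane of the two scores, so a probability threshold defines a linear boundary between the shallow and deep classes.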
Spectra labeled as arising from shallow (superficial or intermediate partial-thickness) injuries or deep (deep partial and full-thickness) injuries are required for building or training a PLS-LR model as well as for testing or validating the predictive ability of the model. These partitions of the data are commonly referred to as the training and test data sets. Cross-validation procedures iteratively split data between the training and testing phases of model building and testing. A leave-one-animal-out (LAO) cross-validation (CV) strategy was used to determine PLS scores and logistic regression coefficients, as well as to estimate the performance of the model. PLS scores and maximum likelihood estimates of the regression coefficients were determined using all the data except that from one animal. The predictive ability of this surrogate model was tested using the data from the animal that was excluded from building the surrogate model. Additional surrogate models were built in which the data from each animal were, in turn, left out of the model building and reserved for model testing. The median values of the regression coefficients over the series of five surrogate models were used as the final regression coefficients for the LR model.
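The leave-one-animal-out procedure can be sketched as follows. This is an illustrative outline only: `fit` and `predict` are hypothetical caller-supplied placeholders standing in for the PLS-LR model building and scoring steps, and each fitted model is assumed to be representable as a tuple of coefficients so that an element-wise median can be taken across folds.

```python
from statistics import median

def leave_one_animal_out(animal_ids):
    """Yield (held_out_animal, train_indices, test_indices) for each animal."""
    for animal in sorted(set(animal_ids)):
        train = [i for i, a in enumerate(animal_ids) if a != animal]
        test = [i for i, a in enumerate(animal_ids) if a == animal]
        yield animal, train, test

def median_coefficients(surrogate_coefficients):
    """Element-wise median over the coefficient tuples of the surrogate models."""
    return [median(column) for column in zip(*surrogate_coefficients)]

def cross_validate(samples, labels, animal_ids, fit, predict):
    """Return held-out predictions and the median coefficients across folds.

    fit(train_samples, train_labels) -> tuple of coefficients (hypothetical)
    predict(coefficients, sample)    -> predicted value       (hypothetical)
    """
    predictions = [None] * len(samples)
    coefficients = []
    for _, train, test in leave_one_animal_out(animal_ids):
        model = fit([samples[i] for i in train], [labels[i] for i in train])
        coefficients.append(model)
        # Each sample is predicted only by the model that never saw its animal.
        for i in test:
            predictions[i] = predict(model, samples[i])
    return predictions, median_coefficients(coefficients)
```

Splitting by animal rather than by individual spectrum keeps all measurements from one animal on the same side of the train/test partition, which is what prevents within-animal correlation from inflating the performance estimate.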
The general performance of the classification approach is inferred by summing the performance over the surrogate models. The LAO-CV procedure ensures that all the data are used for both training and testing the classifier model, but each surrogate model has independent training and test data sets, thereby providing a nearly unbiased estimate of the classifier error. Most importantly, LAO-CV ensures that, within any iteration of cross-validation, the data from a given animal appear in either the training set or the test set, never in both.
For each difference spectrum presented to the PLS-LR algorithm, an output value is generated between 0 and 1 that is related to the probability that the burn injury is deep. By choosing a threshold value, in the two-class classification scheme, spectra can be assigned as belonging to one of the two classes, shallow or deep. Because there are two types of classification errors, misclassifying a deep burn as shallow and misclassifying a shallow burn as deep, the performance of the classifier is best summarized by reporting the true positive fraction (TPF), also known as the sensitivity, and the true negative fraction (TNF), or the specificity. By determining the TPF and TNF at various probability thresholds and plotting TPF versus (1-TNF), the false positive fraction (FPF), a receiver operating characteristic (ROC) plot can be generated, which is a graphical representation of the classifier performance. The area under the ROC curve or AUC is often used as a summary measure of the performance of the classifier.25, 26 The feasibility of using a classification approach to distinguish between shallow and deep burns based on their near-infrared reflectance spectrum is determined by examining the ROC plots and the AUC summary measure of classifier performance.
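The relationship between a probability threshold, the TPF and TNF, and the points of an ROC plot can be illustrated with a short sketch. The function names are hypothetical, and calling an injury deep when its probability equals the threshold exactly is an assumption made here for definiteness; both classes must contain at least one sample.

```python
import math

def tpf_tnf(probabilities, is_deep, threshold=0.5):
    """Return (TPF, TNF), calling an injury deep when p >= threshold.

    Treating a probability exactly equal to the threshold as deep is an
    assumption of this sketch.
    """
    tp = fn = tn = fp = 0
    for p, deep in zip(probabilities, is_deep):
        called_deep = p >= threshold
        if deep and called_deep:
            tp += 1
        elif deep:
            fn += 1
        elif called_deep:
            fp += 1
        else:
            tn += 1
    return tp / (tp + fn), tn / (tn + fp)

def roc_points(probabilities, is_deep):
    """Return (FPF, TPF) pairs from the strictest to the loosest threshold."""
    # An infinite threshold calls nothing deep and gives the (0, 0) corner;
    # the smallest observed probability calls everything deep and gives (1, 1).
    thresholds = [math.inf] + sorted(set(probabilities), reverse=True)
    points = []
    for t in thresholds:
        tpf, tnf = tpf_tnf(probabilities, is_deep, t)
        points.append((1.0 - tnf, tpf))
    return points
```

Scanning the thresholds in descending order traces the curve from the lower-left corner to the upper-right corner, so the resulting list can be plotted directly.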
Nonparametric ROC curves were generated using a linear scan algorithm27 based on LAO-CV test set samples ranked according to LR prediction values, and the nonparametric AUC was calculated using the Mann–Whitney U test with the standard error of the AUC calculated using the usual formula.25, 26, 28 Wounds with predicted probabilities below 0.5 were assigned to the shallow class of burns, whereas wounds with probabilities above 0.5 were assigned to the deep class of burns.
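The nonparametric AUC is the fraction of (deep, shallow) pairs in which the deep injury receives the higher predicted probability, with ties counted as one half; this is the Mann–Whitney U statistic divided by the number of pairs. The sketch below computes it directly. For the standard error it uses the Hanley–McNeil approximation, on the assumption that this is what "the usual formula" refers to; the function names are hypothetical.

```python
import math

def auc_mann_whitney(probabilities, is_deep):
    """AUC as the normalised Mann-Whitney U statistic (ties count one half)."""
    deep = [p for p, d in zip(probabilities, is_deep) if d]
    shallow = [p for p, d in zip(probabilities, is_deep) if not d]
    wins = 0.0
    for p_deep in deep:
        for p_shallow in shallow:
            if p_deep > p_shallow:
                wins += 1.0
            elif p_deep == p_shallow:
                wins += 0.5
    return wins / (len(deep) * len(shallow))

def auc_standard_error(auc, n_deep, n_shallow):
    """Hanley-McNeil approximation to the standard error of the AUC.

    Assumed here to be the formula meant in the text; not confirmed by it.
    """
    q1 = auc / (2.0 - auc)
    q2 = 2.0 * auc ** 2 / (1.0 + auc)
    variance = (
        auc * (1.0 - auc)
        + (n_deep - 1) * (q1 - auc ** 2)
        + (n_shallow - 1) * (q2 - auc ** 2)
    ) / (n_deep * n_shallow)
    return math.sqrt(variance)
```

A two-sided 90% confidence interval then follows as the AUC plus or minus 1.645 standard errors under a normal approximation.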
Near-infrared spectroscopy is sensitive both to changes in the scattering properties of the underlying tissue and to the altered hemodynamics that result from a burn injury. However, there was a considerable offset between spectra that was not correlated with burn injury. Some of this offset could be attributed to interanimal and intersite variation in pigmentation. By using the first derivative of the reflectance spectrum, the offset between spectra could be effectively eliminated as a confounding factor. In addition, an adjacent or nearby area of uninvolved tissue was used as a control measurement and helped account for much of the interanimal variation. The mean first derivative difference spectra, burn minus control, are plotted in the middle trace of Fig. 2 for the two burn groups and compared to the overall mean attenuation spectrum in the upper trace of Fig. 2. The spectral differences between the two groups are rather subtle and distributed over the NIR region. Thus, rather than focus on the difference in the NIR attenuation at specific wavelengths, a partial least-squares approach was used to determine the appropriate weightings of the first derivative of the optical attenuation across the broad NIR range that best discriminated the two burn groups. The first two PLS loadings are plotted in the lower trace of Fig. 2.
AUC statistics were used to determine the number of latent variables to include in the final PLS-LR model. The AUC measures the ranking quality of a classifier and provides a general measure of the predictive ability of the classifier. Figure 3 plots the 90% confidence intervals of the nonparametric AUC obtained from LAO-CV predictions as a function of the number of latent variables in the PLS model. The AUC shows a small improvement in going from 0.83 to 0.88 in the one- to two-latent variable model, respectively. No further increase in the AUC was observed for increasing PLS model complexity. Classification of deep versus shallow burns was based on the two-latent variable PLS-LR model.
In Fig. 4, the nonparametric ROC curves are plotted for the two-latent variable PLS-LR classifier. The 90% confidence interval for the AUC ranges from 0.805 to 0.955. For an uninformative classifier, AUC = 0.5, while the perfect classifier has unity area, AUC = 1. The ROC curve generated using the two-latent variable PLS-LR classifier shows reasonable performance, well above the random guess classifier. This is displayed more graphically in Fig. 5, where the LAO-CV predicted probability of an injury being a deep burn is plotted versus the latent variable scores from the PLS-LR model. Open circles indicate shallow burn injuries, while solid diamonds indicate deep burns. The injuries separate rather cleanly into the two classes with only a small overlap between the two classes. If a probability threshold of 0.5 is used as a boundary between the two classes, the two-latent variable PLS-LR classifier has a 90% sensitivity (TPF) and an 83% specificity (TNF), leading to an overall accuracy of 86.7%. This point is singled out on the ROC plot with estimated 95% confidence intervals in both TPF [0.81, 0.98] and FPF [0.07, 0.27].
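As a consistency check on the figures quoted at the 0.5 threshold, overall accuracy is the class-size-weighted average of sensitivity and specificity. The worked arithmetic below assumes equal numbers of deep and shallow observations; the class sizes and the symbols $N_{\text{deep}}$ and $N_{\text{shallow}}$ are assumptions for illustration, since the counts are not stated here.

```latex
\text{accuracy}
  = \frac{N_{\text{deep}}\,\mathrm{TPF} + N_{\text{shallow}}\,\mathrm{TNF}}
         {N_{\text{deep}} + N_{\text{shallow}}}
\quad\xrightarrow{\;N_{\text{deep}} = N_{\text{shallow}}\;}\quad
\frac{\mathrm{TPF} + \mathrm{TNF}}{2}
  = \frac{0.90 + 0.83}{2} \approx 0.865
```

If the reported 83% is a rounding of $5/6 \approx 0.833$, the same average gives approximately 0.867, in line with the stated 86.7%. The corresponding false positive fraction, $1 - \mathrm{TNF} \approx 0.17$, lies at the center of the quoted FPF interval [0.07, 0.27].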
The ability to accurately and reproducibly assess burn injuries is critical in the early post-burn period. The importance stems from the shift in burn management from more traditional nonsurgical treatments to the current standard of early excision and grafting of deep dermal and full-thickness burns.5 Earlier studies demonstrated statistical associations between NIRS measurements of tissue oxygenation and blood volume and the severity of the injury, but fell short of demonstrating that NIRS was capable of being used as a predictive tool for classifying the severity of the injury.18 In this study, NIRS assessments of burn injuries were demonstrated to be able to distinguish between shallow injuries (superficial and intermediate partial-thickness injuries) and deep burns (deep partial- and full-thickness injuries). While the results should be considered as preliminary and require clinical validation, using an animal model where the burn injury could be carefully controlled and the burn depth confirmed histologically, we could demonstrate the potential of NIRS in the assessment of burn injuries with a relatively small sample population.
A number of instrumental methods, many photonics-based, have been adapted to examine burn injuries. Most of these methods are highly complementary to NIRS. For example, optical coherence tomography and high-frequency ultrasound scans probe the various layers of the skin and can detect the structural derangements of these layers due to the burn injury. Perfusion imaging techniques such as indocyanine green fluorescence or noninvasive methods such as laser Doppler or speckle provide measures of blood flow at the wound. Similar to these perfusion-based methods, NIRS provides information on the local hemodynamics, blood volume, and oxygenation of the wound. However, like optical coherence tomography, NIRS is also sensitive to changes in the optical scattering properties of the underlying tissue. Thus, in addition to being able to infer structural alterations by detecting changes in the scattering of tissue, NIRS is also sensitive to the local hemodynamics. These features are latent in the NIR reflectance spectrum and, as demonstrated here, have the potential to be used to predict the likelihood that the injury is deep and requires surgery.
The technical simplicity of fiber optic-based diffuse reflectance NIRS measurements makes the technique clinically practical. Its ability to provide a measure of tissue water content, blood volume, and tissue oxygenation at the site of the injury, along with structural changes that can be inferred from how the light scattering properties of the injured tissue change, makes the technique clinically relevant to the assessment and monitoring of burn injuries. The approach taken in this report takes a different tack. By using the inherent differences in the spectral signature of skin subject to a shallow burn injury compared to skin with a deep injury, an individual-level classification of burn severity is developed. This method does not explicitly attempt to extract concentration estimates of constituents in the tissue, and thereby does not rely on having reliable estimates of the optical pathlength and, by inference, the optical scattering and absorption coefficients of the tissue. The probabilistic output of the classifier indicates the likelihood that the burn injury is severe and requires early excision and grafting. The classification approach does not preclude extracting concentration estimates of water and hemoglobin from the tissue spectra and providing hemodynamic indices of the burn. The diagnostic information from the classifier along with the hemodynamic indices also available from NIRS measurements provides unprecedented insight into the biochemical, physiological, and structural changes occurring at the wound site, which should be a significant aid to the burn specialist in managing these injuries.
The assistance of Rachelle Mariash, Lori Gregorash, and Mark Hewko is gratefully acknowledged.