Cancer is considered to be a serious threat to public health, next only to coronary diseases.1 Oral cancer is one of the major cancers in developing countries.2 It is also the sixth most common cancer in men and the 14th most common among women worldwide. About 30,000 new cases of oral cancer are reported in the United States each year. World oral cancer statistics report that the five-year survival rate after diagnosis is only 54%, and that approximately one person per hour is killed by this cancer.3, 4, 5, 6 It is the most common cancer in India, accounting for 50 to 70% of total cancer mortality.7 In India, the age-standardized incidence rate is reported as 12.6 per 100,000 population.8
It is well recognized that early detection is essential for successful therapy in all types of malignancy. Although the oral cavity is amenable to easy observation, current methods for early detection of conditions leading to oral malignancy are still not satisfactory, for several reasons. A large fraction of practicing physicians and dentists do not feel adequately trained for an effective oral cancer examination.9 Even after observation of a lesion, current methods for decision making require biopsy and histopathology. Because field cancerization tends to produce multicentric lesions, locating the correct sites for biopsy is difficult. This necessitates repeated biopsies and often carries the risk of underdiagnosis.10 It has been suggested that “biopsies are not representative of lesions/the whole premalignant lesion,” and that “none of the associated variables including presence of any degree of epithelial dysplasia in the whole lesion, site, demarcation, and smoking had influence on the risk of malignant development.”11 Moreover, it has been shown that “carcinomas were induced by the incision.”11, 12 In view of this, no biopsy/histopathology was done for premalignant conditions. For early detection and successful therapy of oral cancer, it will be very useful to have a method not based on biopsy. The method should be able to identify the condition of the organ as different not only from normal but, more importantly, as not yet malignant. In other words, premalignancy should be judged as premalignancy, and it should be possible to detect the changeover to a malignant condition as early as possible. We have developed an objective serum protein profiling method capable of diagnosing whether the subject under investigation has a normal, premalignant, or malignant oral condition.
Only a homogeneous sample (serum) is used, which has the same composition irrespective of any “field cancerization.” The method is thus independent of the possible errors mentioned before that arise from inhomogeneity of tissue samples. It is therefore highly suitable for regular surveillance of premalignant conditions to detect any transformation to a malignant state as early as possible. This is very relevant in oral cancer, because once transformation to malignancy has occurred, the lesion is amenable to easy visual examination, and diagnosis is only a formality requiring confirmation by histopathology.
Many optical techniques have been reported recently for the early detection of different cancers.13, 14, 15, 16, 17, 18, 19, 20 Earlier studies in our laboratories have shown that protein profiling of body fluids by high-performance liquid chromatography laser-induced fluorescence (HPLC-LIF) can be a very promising technique for the early detection of many forms of cancer. It is a minimally invasive method and is extremely sensitive, with limits of detection of protein markers of the order of subfemtomoles.13, 21, 22, 23, 24 We have shown that this technique is highly suitable for early detection of cervical cancer using blood serum as the clinical sample.21, 24 Our main aim in the present work is to evaluate the suitability of the analysis of protein profiles of serum samples, recorded by HPLC-LIF technique, as a diagnostic tool for the early detection of oral cancer.
There are many ways to describe diagnostic accuracy. Statistical evaluation is one of the common ways to validate any kind of diagnostic test. In general, statistical evaluation gives parameters such as sensitivity and specificity pairs, the receiver operating characteristic (ROC), Youden's index, etc. These parameters can be used in objective decision making, in contrast to clinical and pathological methods that are subject to errors from visual judgment, experience of the clinician/pathologist, inhomogeneity of sample, fatigue factor, etc.25, 26, 27 The statistical parameters mentioned before play an important role in deciding the threshold in any kind of diagnostic test.
The present study has been done using serum samples collected from 30 healthy volunteers, 26 clinically diagnosed premalignant conditions, and 26 pathologically certified malignant subjects. All malignant serum samples were collected from the volunteers before undergoing biopsy to avoid any differences due to induced inflammation. The healthy group has an average age of 35 years with male/female ratio of 0.36; the malignant and premalignant groups of samples have average ages of 54 and 39 with male/female ratios of 2.7 and 5.5, respectively.
The sample collection protocol was approved by the University Ethics Committee (UEC) (reference UEC/16/2007), Manipal. Informed consent was taken from the volunteers who participated in this study. The ROC curve [i.e., plot of sensitivity versus (1-specificity)] and the role of Youden's index in deciding the threshold were employed to evaluate the success of the technique. The results are presented and discussed.
Materials and Methods
Instrumentation and Data Analysis
The home assembled HPLC-LIF system consists of the following components. The HPLC unit consists of an HP 1100 gradient HPLC system with G1322A degasser, G1311A pump, and a manual injector (model number 7725, Rheodyne, Idex Health and Science, Oak Harbor, Washington) coupled to a Vydac 219TP52 biphenyl reversed phase narrow bore column (diphenyl, 2.1 × 250 mm, 5 μm, 300 Å). The effluent from the column was sent into a capillary flow cell made of a quartz capillary (75 μm i.d. 300 μm O.D. Hewlett Packard, G1600–64311). The 257-nm laser emission from a frequency-doubled Ar+ laser (Innova 90C FreD, Coherent, Santa Clara, California) with 15-mW power was used to excite the sample. Fluorescence at 340 nm was detected by a photomultiplier tube (PMT operated at –850 V), coupled through a preamplifier to a lock-in amplifier. The fluorescence was chopped with an EG&G model 651 (Signal Recovery, Oak Ridge, Tennessee) chopper at the entrance slit at 20 Hz for lock-in detection.14, 24
Blood was collected from healthy, malignant, and premalignant subjects at Manipal College of Dental Sciences (MCODS), Manipal University, India. Soon after collection, the sample was centrifuged at 5000 rpm for 10 min, following which the supernatant was collected and centrifuged again for 10 min at 5000 rpm. Filtered supernatant, if not immediately used for analysis, was stored at –80°C in a deep freezer. The serum supernatant was diluted 1:1000 with HPLC grade water (Merck, Whitehouse Station, New Jersey) and injected into the HPLC to record the protein profile. The conditions for the liquid chromatography run were optimized to get a suitable chromatogram that successfully separates the proteins present in the sample. We used water with 0.1% trifluoroacetic acid (TFA) and acetonitrile with 0.1% TFA (HPLC grade) for the gradient runs. Each time, a blank was run before the gradient to confirm the stability and to ensure that no residual contamination was present in the column. 20 μl of sample was injected into the narrow bore biphenyl column and then eluted with the water-acetonitrile gradient. The gradient started with 70% water + 30% acetonitrile and changed to 40% water + 60% acetonitrile in 60 min. A plot of fluorescence intensity versus the elution time of the proteins gives the protein profile.
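As a small illustration of the gradient program described above, the following sketch computes the mobile-phase composition at a given time. It assumes a linear ramp between the stated start (30% acetonitrile) and end (60% acetonitrile) compositions over 60 min; the text gives only the end points, so the linearity and the function name `acetonitrile_percent` are assumptions.

```python
def acetonitrile_percent(t_min: float, start: float = 30.0,
                         end: float = 60.0, duration_min: float = 60.0) -> float:
    """Percent acetonitrile at time t_min, ASSUMING a linear ramp.

    Times outside the run are clamped to the start or end composition.
    """
    t = min(max(t_min, 0.0), duration_min)
    return start + (end - start) * t / duration_min


# Print the composition at a few points of the 60-min run.
for t in (0, 15, 30, 60):
    b = acetonitrile_percent(t)
    print(f"{t:>2} min: {100 - b:.1f}% water / {b:.1f}% acetonitrile")
```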
The HPLC technique coupled with an ultrasensitive laser-induced fluorescence (LIF) detection system assembled and standardized in our laboratory is highly efficient and capable of detecting trace amounts of proteins (of the order of femtomoles) in microliter volumes of sample. The system is also highly reproducible and can be operated for several months (even years) without any noticeable change in the performance, as we have verified from protein profiles of normal serum samples run over periods of 1 to 2 years. Typical protein profiles of normal, premalignant, and malignant serum samples are shown in Fig. 1. We have shown that intensities and retention times of the component proteins in normal serum are highly reproducible.23 Even small changes observed in the protein profiles from normal to other types can thus be considered as significant for diagnostic applications. To illustrate the importance of this point, in Fig. 2 we show sections of the protein profile expanded 50 times. It is clear that the serum protein profile is very complex, with about eight to ten proteins present in large concentrations (albumin, globulins, transferrin, IgG, IgA, etc.), while many other proteins are present only in very small quantities. The concentrations of the proteins present in small amounts change noticeably from normal to premalignant and malignant conditions. Also several new peaks are observed on induction of malignancy. For example, while peaks at 360, 445, and 1360 (transferrin) become very weak or disappear, new peaks come at 555, 585, 850, 1250, etc., shown in Fig. 2. This is expected, since induction of malignancy is accompanied by expression of several new proteins by the activation of the oncogenes and loss of others by inactivation of tumor suppressor genes.28
For the validation of the HPLC-LIF protein profiling method for diagnostic applications, principal component analysis (PCA) was used for the discrimination of protein profiles. Details of our method of PCA have been given earlier.23 Sample details are given in Table 1. Normal and premalignant samples were clinically confirmed as “true” normal and premalignant, respectively. Malignant samples were diagnosed as “true” malignant by histopathology. In our method, standard sets of a given class, say normal, are formed by random selection from clinically/pathologically certified samples of that class. All of the normal, premalignant, and malignant samples are tested against a standard set (say, normal samples) using a match/no-match criterion.15 In this test, PCA of the standard set is carried out first to decide the number of factors required to express the protein profiles of the standard set with the desired accuracy. With this set of factors, the protein profile of the sample to be tested is then added to the standard set, and the scores of the factors from PCA are derived for the test sample. The scores are used to simulate the observed protein profile of the test sample. The scores, the squared residuals, i.e., (observed protein profile − simulated protein profile)^2, and the Mahalanobis distance (M-distance) of the test sample are compared with the same parameters for the members of the standard set to decide match/no-match. The M-distance (D^2) for any sample is defined by the equation:
Table 1. Sample details.

|Sample no.|Habits|Sex|Age (years)|Diagnosis|
|---|---|---|---|---|
|1 to 30|29 none, 1 alcohol|8 M, 22 F|20 to 64|Healthy|
|31|Smoking, tobacco chewing, alcohol|M|38|Carcinoma|
|33|Tobacco chewing, alcohol|M|39|Carcinoma|
|35|Smoking, tobacco chewing|M|65|Carcinoma|
|37|Smoking, tobacco chewing|M|52|Carcinoma|
|38|Smoking, tobacco chewing|M|75|Carcinoma|
|39|Tobacco chewing, alcohol|M|50|Carcinoma|
|40|Smoking, tobacco chewing, alcohol|M|60|Carcinoma|
|42|Smoking, tobacco chewing, alcohol|M|50|Carcinoma|
|49|Smoking, tobacco, alcohol|M|45|Carcinoma|
|55|Smoking, tobacco, alcohol|M|63|Carcinoma|
|56|Smoking, tobacco, alcohol|M|60|Carcinoma|
|58|Smoking, tobacco chewing, alcohol|M|54|Leukoplakia|
|60|Tobacco chewing, alcohol|M|40|OSMF|
|62|Smoking, tobacco chewing|M|34|OSMF|
|63|Smoking, tobacco chewing, alcohol|M|57|Leukoplakia|
|64|Smoking, tobacco chewing, alcohol|M|40|Leukoplakia|
|65|Smoking, tobacco chewing, alcohol|M|45|Speckled leukoplakia|
|67|Smoking, tobacco chewing|M|35|OSMF|
|68|Smoking, tobacco chewing, alcohol|M|54|Speckled leukoplakia|
|69|Tobacco chewing, alcohol|M|32|OSMF|
|71|Smoking, tobacco chewing|M|28|OSMF|
|72|Tobacco chewing, alcohol|M|48|Lichen planus|
|73|Smoking, tobacco chewing, alcohol|M|25|OSMF|
|77|Tobacco chewing, alcohol|M|24|Lichen planus|
|78|Smoking, alcohol|M|42|Speckled leukoplakia|
|79|Tobacco chewing, alcohol|M|24|OSMF|
|80|Smoking, tobacco chewing, alcohol|M|26|Leukoplakia|
|81|Smoking, tobacco chewing, alcohol|M|48|OSMF|
Here, S is the vector of scores of factors and squared residual for a standard/sample, and N is the number of protein profiles in the standard set. It can be seen that the M-distance is expressed in units of standard deviation.15 Any sample that has an M-distance of two or more can be considered to be out of the standard group, with only a very small probability (<0.1%) of belonging to that group. Results from these match/no-match tests are used to estimate the sensitivity and specificity pairs, receiver operating characteristic (ROC) analysis, and Youden's index.
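To make the match/no-match quantity concrete, here is a minimal sketch of a squared Mahalanobis distance computed from the rows of a standard-set matrix (one row per protein profile, columns holding factor scores and squared residual). It uses the textbook form, with the distance measured from the standard-set mean and the covariance normalized by N − 1; the normalization used in this work may differ, and the function name and toy data are hypothetical.

```python
import numpy as np


def mahalanobis_sq(standard: np.ndarray, sample: np.ndarray) -> float:
    """Squared Mahalanobis distance of `sample` from the standard set.

    standard: (N, k) array, one row per profile in the standard set.
    sample:   (k,) array for the profile under test.
    Textbook form (an assumption): D^2 = (s - mean)^T C^{-1} (s - mean),
    with C the sample covariance of the standard set (N - 1 normalization).
    """
    mean = standard.mean(axis=0)
    cov = np.cov(standard, rowvar=False)
    diff = np.asarray(sample, dtype=float) - mean
    return float(diff @ np.linalg.solve(cov, diff))


# Toy standard set (hypothetical values, two columns for readability).
standard = np.array([[1.0, 0.0], [-1.0, 0.0], [0.0, 1.0], [0.0, -1.0]])
d2 = mahalanobis_sq(standard, np.array([2.0, 0.0]))
print(f"D^2 = {d2:.3f}, D = {np.sqrt(d2):.3f}")  # D^2 = 6 for this toy set
```

Taking the square root gives a distance in units of standard deviation, which is the quantity compared against the threshold.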
To reduce the arbitrariness and subjective nature of diagnostic decision making, statistical parameters like specificity, sensitivity, positive predictive value, and negative predictive value can be used for performance evaluation. Before considering these measures, one should have a clear idea about the terms true positive (TP), false negative (FN), true negative (TN), and false positive (FP), since they are required for the calculation of specificity and sensitivity.30 Generally, positive refers to the disease state and negative refers to the nondisease state. A result is a true positive (TP) if a diseased subject is diagnosed correctly; otherwise it is a false negative (FN). Similarly, if a nondiseased subject is diagnosed correctly, the result is a true negative (TN); otherwise it is a false positive (FP). In the present studies, decision making involves not disease states, but premalignant conditions. Hence “true positive” here means “true positive decision,” meaning premalignant. Similarly, “true negative” means “not premalignant.” In the latter situation, the subject can be either normal or malignant. It is shown that the present method can discriminate between these two situations as well (see Table 2). Sensitivity and specificity, as applied to the premalignant condition, are:

$$\text{Sensitivity} = \frac{TP}{TP + FN}, \qquad \text{Specificity} = \frac{TN}{TN + FP}.$$
|Actual state|Test result (diagnosis): Positive (T+)|Test result (diagnosis): Negative (T−)|Total actual state|
|---|---|---|---|
|Positive (T+)|True positive (TP)|False negative (FN)|(TP+FN)|
|Negative (T−)|False positive (FP)|True negative (TN)|(FP+TN)|
|Total test results (diagnosis)|(TP+FP)|(FN+TN)| |
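The entries of this table translate directly into the performance measures named in the text. The sketch below computes them from the four counts using the standard definitions; the counts in the usage example are hypothetical and are not results from this study.

```python
def sensitivity(tp: int, fn: int) -> float:
    """Fraction of actual positives that the test calls positive."""
    return tp / (tp + fn)


def specificity(tn: int, fp: int) -> float:
    """Fraction of actual negatives that the test calls negative."""
    return tn / (tn + fp)


def positive_predictive_value(tp: int, fp: int) -> float:
    """Fraction of positive test results that are actually positive."""
    return tp / (tp + fp)


def negative_predictive_value(tn: int, fn: int) -> float:
    """Fraction of negative test results that are actually negative."""
    return tn / (tn + fn)


# Hypothetical counts, for illustration only.
tp, fn, tn, fp = 8, 2, 9, 1
print(sensitivity(tp, fn), specificity(tn, fp))
print(positive_predictive_value(tp, fp), negative_predictive_value(tn, fn))
```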
Receiver operating characteristic curve
A receiver operating characteristic (ROC) curve illustrates the relationship between the sensitivity and specificity of a diagnostic test. It helps us to find the optimal operating region for a diagnostic test,29, 30, 31 and is a measure of the performance of a diagnostic test. An ROC curve is obtained by plotting sensitivity (y axis) against (1-specificity) along the x axis. Finding an area under the curve (AUC) is a popular measure in the analysis of ROC curves. AUC of a ROC curve evaluates the overall performance of the diagnostic test, and is considered the mean value of sensitivity for all possible values of specificity.31, 32, 33
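A minimal sketch of how such a curve and its AUC could be computed from M-distances follows. It assumes that a test sample is called a "match" (positive) when its M-distance is at or below the threshold, and it integrates the curve with the trapezoidal rule; the function names and the toy data are hypothetical.

```python
import numpy as np


def roc_points(distances, is_member, thresholds):
    """Return (1 - specificity, sensitivity) for each threshold.

    A sample is predicted to belong to the standard class when its
    M-distance is <= threshold (an assumption of this sketch).
    """
    d = np.asarray(distances, dtype=float)
    member = np.asarray(is_member, dtype=bool)
    points = []
    for t in thresholds:
        predicted = d <= t
        tp = np.sum(predicted & member)
        fn = np.sum(~predicted & member)
        fp = np.sum(predicted & ~member)
        tn = np.sum(~predicted & ~member)
        points.append((float(fp / (fp + tn)), float(tp / (tp + fn))))
    return points


def auc_trapezoid(points):
    """Area under the ROC curve, with (0, 0) and (1, 1) added as end points."""
    pts = sorted(set(points) | {(0.0, 0.0), (1.0, 1.0)})
    return sum((x1 - x0) * (y0 + y1) / 2.0
               for (x0, y0), (x1, y1) in zip(pts, pts[1:]))


# Toy data: two members with small distances, two non-members with large ones.
pts = roc_points([0.5, 1.0, 3.0, 4.0], [True, True, False, False], [0, 2, 5])
print(pts, auc_trapezoid(pts))
```

Because the toy classes are perfectly separated at a threshold of 2, the curve passes through the top-left corner and the area is 1.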
Diagnostic tests can be regarded as continuous measurements, since they can be evaluated over a range of threshold values or cutoff operating points. To decide the value of an ideal threshold or cutoff operating point that should discriminate between disease and nondisease states, it is the usual practice to choose a point that has high values for both sensitivity and specificity. But this may not be sufficient to evaluate the performance of a diagnostic test. It is well known that sensitivity and specificity have opposite trends in any diagnostic test: attempts to increase one can result in a decrease of the other. In such cases it is difficult to decide the threshold, and Youden's index (J) can be used to choose an appropriate cutoff. J is defined by

$$J = \text{sensitivity} + \text{specificity} - 1.$$
J can vary between −1 and 1. A negative index has no useful meaning for diagnostic tests, so 0 is usually taken as the minimum value instead of −1. When J = 1 the diagnostic test is perfect, and when J = 0 it carries no diagnostic information.34, 35
Youden's index is thus a combined measure of specificity and sensitivity. It can be written as follows,

$$J = \text{sensitivity} - (1 - \text{specificity}).$$
From this we can relate sensitivity and specificity in a way such that one can decide the reliability of the test in a quantitative manner. In ROC analysis, the operator has to decide a cutoff point that will give a maximum area under the curve. This can be decided using Youden's index. In this case, the operator has to choose the operating point (cutoff point/threshold) for which Youden's index (J) is maximum. From this, one can say whether the test is reliable or not. The test is reliable if J is positive (i.e., the sum of specificity and sensitivity will be greater than 1). The operator can plot J for different operating points (threshold), and the ideal operating point (threshold) can be selected as that for which J is maximum. In this case, sensitivity will be maximum and (1-specificity) will be minimum.
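The threshold selection described above can be sketched in a few lines: compute J at each candidate operating point and keep the one where it is largest. The sensitivity and specificity values in the usage example are hypothetical, and the function names are illustrative only.

```python
def youden_index(sens: float, spec: float) -> float:
    """Youden's index J = sensitivity + specificity - 1."""
    return sens + spec - 1.0


def best_threshold(thresholds, sensitivities, specificities):
    """Return (threshold, J) at the operating point where J is maximum."""
    js = [youden_index(s, p) for s, p in zip(sensitivities, specificities)]
    i = max(range(len(js)), key=js.__getitem__)
    return thresholds[i], js[i]


# Hypothetical operating points: sensitivity rises and specificity falls
# as the M-distance threshold is relaxed.
thr, j = best_threshold([1, 2, 3], [0.5, 0.9, 1.0], [1.0, 0.9, 0.5])
print(f"best threshold = {thr}, J = {j:.2f}")
```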
Results and Discussion
Intercomparison of protein profiles recorded over several months requires a rigorous data analysis protocol. For this, all protein profiles were preprocessed as follows. Since the background fluorescence varied across the 60-min gradient run, possibly because of the varying TFA-acetonitrile gradient, a background correction was carried out using a polynomial fit to the background. Small day-to-day changes in experimental conditions lead to minor shifts in peak positions from run to run. These were corrected by standardizing all the protein profiles along the time scale, using mean values of the positions of protein peaks common to all samples. All the protein profiles were then normalized with respect to the serum peak at 1666 s, which remained more or less unchanged in intensity and retention time in all runs.
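Two of these preprocessing steps, background subtraction and normalization to the 1666 s peak, can be sketched as follows. The polynomial degree, the choice of peak-free baseline points, and the window used to locate the reference peak are all assumptions of this sketch, since the text does not specify them; the retention-time standardization is not shown.

```python
import numpy as np


def subtract_background(t, y, baseline_mask, degree=2):
    """Fit a polynomial to baseline-only points and subtract it everywhere."""
    coeffs = np.polyfit(t[baseline_mask], y[baseline_mask], degree)
    return y - np.polyval(coeffs, t)


def normalize_to_peak(t, y, peak_time=1666.0, half_width=10.0):
    """Scale the profile so the reference peak near peak_time has height 1."""
    window = np.abs(t - peak_time) <= half_width
    return y / y[window].max()


# Synthetic profile: sloping background plus one Gaussian peak at 1666 s.
t = np.arange(0.0, 3000.0, 1.0)
raw = 0.5 + 1e-4 * t + 2.0 * np.exp(-0.5 * ((t - 1666.0) / 5.0) ** 2)
baseline = np.abs(t - 1666.0) > 50.0  # hypothetical peak-free region
corrected = subtract_background(t, raw, baseline, degree=1)
profile = normalize_to_peak(t, corrected)
```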
A highly objective approach was used for data analysis. We applied principal component analysis (PCA) to the full region of the protein profiles to cover all the observed peaks, and the match/no-match tests were performed by comparing the test samples with standard sets (normal, malignant, and premalignant).23 In practice, not only the test sample but also each member of the standard set is tested against the standard set, by rotating the members out one at a time and matching each against the rest of the set.
As mentioned before, PCA was done with the whole protein profiles. In this case, the protein profiles in the region 400 to 2800 s were used, as there was not much information in the regions below 400 s and above 2800 s. The trial runs showed that about five to six factors account for more than 95% of the variation of any sample in a class from the mean of that class. This indicates that about the first five factors are sufficient to represent that class. We used 15 samples for the normal standard set, and 10 and 12 samples for the premalignant and malignant standard sets, respectively. For all three standard sets, five factors have been used. Plots of squared residuals against M-distance for the normal, malignant, and premalignant standard sets are shown in Fig. 3. The results are shown in Tables 3, 4, and 5. It is seen that the PCA of complete protein profiles gives better classification, and better sensitivity and specificity, for the normal group than for the premalignant and malignant groups.
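The factor selection and the score/residual calculation can be illustrated with a short sketch based on a singular value decomposition of the mean-centered standard set. It is a simplification under stated assumptions: the test profile is projected onto factors derived from the standard set alone, whereas the text describes adding the test profile to the standard set before deriving scores. The 95% variance target follows the text; the function names and toy data are hypothetical.

```python
import numpy as np


def fit_pca(standard: np.ndarray, variance_target: float = 0.95):
    """Return (mean profile, factors) keeping enough factors to reach the target.

    standard: (N, p) array, one preprocessed protein profile per row.
    """
    mean = standard.mean(axis=0)
    _, s, vt = np.linalg.svd(standard - mean, full_matrices=False)
    explained = np.cumsum(s ** 2) / np.sum(s ** 2)
    k = int(np.searchsorted(explained, variance_target)) + 1
    return mean, vt[:k]


def scores_and_residual(profile: np.ndarray, mean: np.ndarray, factors: np.ndarray):
    """Scores on each factor and the summed squared residual of the simulation."""
    centered = profile - mean
    scores = factors @ centered
    simulated = scores @ factors  # profile reconstructed from the kept factors
    residual = float(np.sum((centered - simulated) ** 2))
    return scores, residual


# Toy standard set whose variation lies along a single direction.
standard = np.array([[1.0, 0, 0], [2.0, 0, 0], [3.0, 0, 0], [4.0, 0, 0]])
mean, factors = fit_pca(standard)
print(factors.shape[0], scores_and_residual(np.array([2.5, 3.0, 4.0]), mean, factors)[1])
```

A profile that the factors cannot reproduce, as in the last line, shows up as a large squared residual even when its scores are unremarkable, which is why both quantities enter the match/no-match decision.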
Table 3. Match/no-match test results for the region 400 to 2800 s when tested with a normal standard set.
Table 4. Match/no-match test results for the region 400 to 2800 s when tested with a malignant standard set.
Table 5. Match/no-match test results for the region 400 to 2800 s when tested with a premalignant standard set.
In Table 3, it can be seen that with a normal standard set, the normal validation test samples (as well as the members of the standard set) lie close to the origin having small values of M-distance and spectral residual, while all other test samples (premalignant and malignant) show large values, indicating that they do not belong to the normal class. Except for some suspicious cases, almost similar results are obtained when all the test samples are subjected to the match/no-match test with the premalignant standard set.
Similarly, from Tables 4 and 5 and Fig. 3, one can see that for the malignant and premalignant standard sets, the samples of the respective class lie very close to the origin, whereas samples from other classes are scattered away. This has been done for the cross validation of the respective classes.
Using the results for the full protein profile, ROC curves, Youden's indices, and areas under the ROC curves were obtained for all the serum samples. The ROC curves were plotted using specificity and sensitivity values corresponding to selected cutoff thresholds for M-distance. The ROC and Youden's index curves for the normal, malignant, and premalignant standard sets are shown in Fig. 4. The areas under the ROC curves for the normal, malignant, and premalignant standard sets were found to be 0.907, 0.829, and 0.787, respectively. As mentioned earlier, Youden's index plays an important role in deciding the threshold, since it combines specificity and sensitivity in a single measure. The Youden's index plot in Fig. 4 shows the value of Youden's index at each cutoff threshold. The figures give a clear idea of the optimum threshold that has to be used for screening. In this case, the ideal M-distance threshold observed is 2 for all of the normal, malignant, and premalignant standard sets. The nonlinear fits to the Youden's index plots were found to be significant, with regression values of 0.984, 0.948, and 0.973 for the normal, malignant, and premalignant tests, respectively. The specificities at the ideal threshold (M-distance = 2) for the normal, malignant, and premalignant standard sets are found to be 100, 69.5, and 61.5%, and the corresponding sensitivities are 86.5, 87.5, and 87.5%, respectively.
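As a worked check, Youden's index at the M-distance threshold of 2 can be computed from the specificity and sensitivity pairs quoted above, using J = sensitivity + specificity − 1. The J values below are derived here from those reported percentages and are not figures stated in the text.

```python
# (specificity %, sensitivity %) at M-distance = 2, as quoted in the text.
reported = {
    "normal": (100.0, 86.5),
    "malignant": (69.5, 87.5),
    "premalignant": (61.5, 87.5),
}


def youden_from_percent(spec_pct: float, sens_pct: float) -> float:
    """Youden's index from percentages: (spec + sens) / 100 - 1."""
    return (spec_pct + sens_pct) / 100.0 - 1.0


youden = {name: round(youden_from_percent(*pair), 3)
          for name, pair in reported.items()}
print(youden)  # {'normal': 0.865, 'malignant': 0.57, 'premalignant': 0.49}
```

All three values are positive, so by the criterion given earlier each test carries diagnostic information at this threshold, with the normal standard set performing best.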
It is to be noted that the premalignant set included leukoplakia, OSMF, and lichen planus samples. The serum protein profiling method can classify normal, all premalignant, and malignant conditions separately. This is highly advantageous in deciding, by periodic screening and without the need for repeated biopsies, whether a subject with a premalignant condition is going over to malignancy. Moreover, biopsy and histopathology examine a heterogeneous sample (tissue) and can give results that do not represent the true condition.12 Serum samples, on the other hand, are homogeneous and will reflect the true state of the disease condition. Serum protein profiles are thus highly suitable for monitoring premalignant conditions to decide whether they are crossing over to malignancy. Decision making involves only a simple blood test, which can be done in any clinical chemistry laboratory by a trained technician. Efforts are under way to collect the unknown peaks and establish their identity.
The results presented here show that the method of protein profiling by HPLC-LIF can be used for the diagnosis of oral cancer. Diagnostic accuracy evaluated by the combination of Youden's index and ROC analysis gives a clear idea of the suitability of the method for decision making. Youden's index coupled with ROC analysis can be taken as a good approach for deciding the operating threshold. By using these methods, the operator can decide the ideal threshold for any kind of screening test. The ideal screening threshold values of M-distance derived from the present study, using the complete region of the protein profile, are found to be 2.0 for all of the normal, malignant, and premalignant standard sets. The HPLC-LIF technique combined with statistical analysis (match/no-match test) is shown to be capable of discriminating between malignant, premalignant, and normal conditions, and can in turn be extended to the early detection of oral cancer.
Part of this work was carried out with financial support from Manipal University (MU). Ajeetkumar Patil, Vijendra Prabhu, and K.S. Choudhari are thankful to MU for fellowships. C. Santhosh is thankful to the Abdus Salam International Centre for Theoretical Physics, Trieste, Italy, for the regular associate fellowship.