Polyp classification is a feature selection and clustering process. Selecting the most effective features from multiple polyp descriptors without retaining redundant information is a major challenge in this procedure. We propose a multilayer feature selection method that constructs an optimized descriptor for polyp classification using a feature-grouping strategy in a hierarchical framework. First, the proposed method uses image metrics, such as intensity, gradient, and curvature, to divide the corresponding polyp descriptors into several feature groups, which are the preliminary units of the method. Each preliminary unit then generates two ranked descriptors, i.e., its optimized variable group (OVG) and its preliminary classification measurement. Next, a feature dividing-merging (FDM) algorithm performs feature merging hierarchically and iteratively. Unlike traditional feature selection methods, the proposed FDM algorithm consists of two steps: feature dividing and feature merging. At each layer, feature dividing selects the OVG with the highest area under the receiver operating characteristic curve (AUC) as the baseline, while the other descriptors are treated as its complements. In the merging step, the FDM algorithm iteratively merges variables that yield gains from the complementary descriptors into the baseline, layer by layer, until the final descriptor is obtained. The proposed model (including the forward step algorithm and the FDM algorithm) is a greedy method that guarantees clustering monotonicity of all OVGs from the bottom layer to the top layer. In our experiments, the selected results from each layer are reported through both graphical illustration and data analysis. The proposed method is compared with five existing classification methods on a polyp database of 63 samples with pathological reports. The experimental results show that the proposed method outperforms the other methods, with AUC gains of 4% to 23%.
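The dividing and merging steps described above can be illustrated with a minimal single-layer sketch. It is not the authors' implementation: the function names, the toy data, and the scorer (`subset_auc`, which simply averages the selected columns in place of a trained classifier) are all assumptions made for illustration, and "gain" is assumed here to mean an increase in AUC.

```python
def auc(scores, labels):
    """AUC via the Mann-Whitney statistic; tied pairs count as half."""
    pos = [s for s, l in zip(scores, labels) if l == 1]
    neg = [s for s, l in zip(scores, labels) if l == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def subset_auc(X, y, cols):
    # Hypothetical stand-in scorer: the mean of the selected columns.
    # A real pipeline would train and validate a classifier here.
    scores = [sum(row[c] for c in cols) / len(cols) for row in X]
    return auc(scores, y)


def divide(X, y, groups):
    """Feature dividing: the group with the highest AUC becomes the baseline."""
    ranked = sorted(groups, key=lambda g: subset_auc(X, y, g), reverse=True)
    return ranked[0], ranked[1:]


def merge(X, y, baseline, complements):
    """Feature merging: greedily add complement variables that raise the AUC."""
    selected = list(baseline)
    best = subset_auc(X, y, selected)
    for group in complements:
        for col in group:
            if col in selected:
                continue
            trial = subset_auc(X, y, selected + [col])
            if trial > best:  # keep the variable only if it yields a gain
                selected.append(col)
                best = trial
    return selected, best


# Toy data (assumed): 6 samples, 4 features, binary labels.
X = [
    [0.1, 0.2, 0.9, 0.3],
    [0.4, 0.3, 0.5, 0.1],
    [0.6, 0.1, 0.2, 0.2],
    [0.5, 0.4, 0.6, 0.7],
    [0.8, 0.2, 0.3, 0.6],
    [0.9, 0.6, 0.4, 0.8],
]
y = [0, 0, 0, 1, 1, 1]
groups = [[0], [1, 2], [3]]  # hypothetical feature groups (column indices)

baseline, complements = divide(X, y, groups)
baseline_auc = subset_auc(X, y, baseline)
selected, final_auc = merge(X, y, baseline, complements)
```

Because a variable is merged only when it strictly improves the score, the AUC of the merged descriptor can never fall below that of the baseline, which is the monotonic behaviour a greedy scheme of this kind provides on the data it is scored on.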
Colorectal cancer (CRC) remains one of the leading causes of cancer deaths today. Because precancerous colorectal polyps progress slowly into cancer, screening is highly effective in reducing the overall mortality rate of CRC, since polyps can be removed before they develop into later stages. Virtual colonoscopy has been shown to be a practical screening method and to provide high sensitivity and specificity in differentiating hyperplastic polyps from precancerous adenomas or adenocarcinomas through texture feature analysis. We hypothesize that effects from nonhyperplastic polyps, such as angiogenesis from adenocarcinomas, may result in changes to the texture of the colon wall that could help with computer-aided diagnosis of colorectal polyps. Here we present preliminary results of incorporating the texture features of neighboring colon wall tissue into the diagnostic classification. We use gray-level co-occurrence matrices to calculate the established Haralick features and a set of supplemental features for colorectal polyp regions of interest, as well as for the neighboring colon wall environment of each polyp. A random forest package was then used to perform classification tests on different sets of features, with and without the environment features, and to obtain the area under the curve (AUC) of the receiver operating characteristic (ROC). Experiments show an increase of approximately 1% in overall classification performance when the environment features are included.
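As a rough illustration of the texture step, the sketch below builds a symmetric, normalized gray-level co-occurrence matrix for one pixel offset and derives four commonly used Haralick-style measures from it. The function names, the 2D toy regions, the single offset, and the choice of four measures are assumptions for illustration only; the supplemental features and the random forest classification are not reproduced here.

```python
import math


def glcm(image, levels, dx=1, dy=0):
    """Symmetric, normalized co-occurrence matrix for the offset (dx, dy)."""
    counts = [[0] * levels for _ in range(levels)]
    rows, cols = len(image), len(image[0])
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dy, c + dx
            if 0 <= r2 < rows and 0 <= c2 < cols:
                i, j = image[r][c], image[r2][c2]
                counts[i][j] += 1
                counts[j][i] += 1  # count each pair in both directions
    total = sum(map(sum, counts))
    return [[v / total for v in row] for row in counts]


def texture_features(p):
    """A small subset of Haralick-style measures computed from a GLCM."""
    n = len(p)
    cells = [(i, j, p[i][j]) for i in range(n) for j in range(n)]
    return {
        "contrast": sum((i - j) ** 2 * v for i, j, v in cells),
        "asm": sum(v * v for _, _, v in cells),  # angular second moment
        "homogeneity": sum(v / (1 + (i - j) ** 2) for i, j, v in cells),
        "entropy": -sum(v * math.log(v) for _, _, v in cells if v > 0),
    }


# Toy regions (assumed), already quantized to 4 gray levels.
polyp_roi = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 2, 2, 2],
    [2, 2, 3, 3],
]
wall_roi = [
    [1, 1, 2, 3],
    [1, 2, 2, 3],
    [0, 1, 3, 3],
    [0, 0, 1, 2],
]

polyp_glcm = glcm(polyp_roi, levels=4)
polyp_feats = texture_features(polyp_glcm)
wall_feats = texture_features(glcm(wall_roi, levels=4))

# Feature vector with the environment included: polyp measures, then wall measures.
feature_vector = list(polyp_feats.values()) + list(wall_feats.values())
```

Comparing a classifier trained on `list(polyp_feats.values())` alone against one trained on `feature_vector` mirrors the "with and without the environment" comparison described above, under the stated assumptions.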