Reviewing videos from medical procedures is tedious work that requires concentration for extended hours and
usually involves screening thousands of frames to find only a few positive cases that indicate the probable presence of disease.
Computational classification algorithms are sought to automate the reviewing process. The class imbalance
problem becomes challenging when the learning process is driven by relatively few minority class samples. Learning
algorithms trained on imbalanced data sets generally produce a large number of false negatives. In this article,
we present an efficient rebalancing method for finding video frames that contain bleeding lesions. The majority
class generally contains clusters of data. Here we cluster the majority class and under-sample each
cluster based on its variance so that useful examples are not lost during the under-sampling process. The
balance of bleeding to non-bleeding frames is restored by combining the proposed cluster-based under-sampling with
over-sampling using the Synthetic Minority Over-sampling Technique (SMOTE). Experiments were conducted using
synthetic data and videos manually annotated by medical specialists for obscure bleeding detection. Our method
achieved a high average sensitivity and specificity.
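
The rebalancing idea described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the abstract does not name the clustering algorithm or the exact variance rule, so plain k-means and a per-cluster quota proportional to cluster variance are assumed here, and the function names (`kmeans`, `undersample_by_variance`, `smote`) and all parameters are hypothetical. The SMOTE step is a simplified interpolation between minority samples and their nearest minority neighbours.

```python
import numpy as np

def kmeans(X, k, rng, n_iter=50):
    """Plain Lloyd's k-means; returns a cluster label for each row of X."""
    centers = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iter):
        dists = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):  # keep the old center if a cluster is empty
                centers[j] = X[labels == j].mean(axis=0)
    return labels

def undersample_by_variance(X_maj, n_keep, k=3, seed=0):
    """Keep roughly n_keep majority samples, spread over k clusters.

    ASSUMPTION: each cluster's quota is proportional to its total variance,
    so spread-out clusters keep more examples than tight ones.
    """
    rng = np.random.default_rng(seed)
    labels = kmeans(X_maj, k, rng)
    clusters = [np.flatnonzero(labels == j) for j in range(k)]
    clusters = [c for c in clusters if len(c) > 0]
    var = np.array([X_maj[c].var(axis=0).sum() for c in clusters]) + 1e-12
    quota = np.maximum(1, np.round(n_keep * var / var.sum()).astype(int))
    keep = [rng.choice(c, size=min(q, len(c)), replace=False)
            for c, q in zip(clusters, quota)]
    return X_maj[np.concatenate(keep)]

def smote(X_min, n_new, k=5, seed=0):
    """Create n_new synthetic minority samples (needs at least 2 real ones).

    Each synthetic point lies on the segment between a random minority sample
    and one of its k nearest minority neighbours.
    """
    rng = np.random.default_rng(seed)
    k = min(k, len(X_min) - 1)
    dists = np.linalg.norm(X_min[:, None, :] - X_min[None, :, :], axis=2)
    np.fill_diagonal(dists, np.inf)  # a point is not its own neighbour
    neighbours = np.argsort(dists, axis=1)[:, :k]
    base = rng.integers(0, len(X_min), size=n_new)
    nb = neighbours[base, rng.integers(0, k, size=n_new)]
    gap = rng.random((n_new, 1))
    return X_min[base] + gap * (X_min[nb] - X_min[base])

# Toy data standing in for frame features: 600 "non-bleeding", 20 "bleeding".
rng = np.random.default_rng(42)
X_maj = np.vstack([rng.normal(0, 0.3, (300, 2)), rng.normal(5, 2.0, (300, 2))])
X_min = rng.normal(2.5, 0.5, (20, 2))

X_maj_small = undersample_by_variance(X_maj, n_keep=60, k=2)
X_min_big = np.vstack([X_min, smote(X_min, n_new=40)])
print(len(X_maj_small), len(X_min_big))
```

In this toy setup the wide cluster receives most of the quota, which mirrors the stated goal of not discarding useful majority examples, while SMOTE raises the minority count towards the reduced majority count.
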

Computer-aided diagnosis (CAD) usually screens thousands of instances to find only a few positive cases that indicate
the probable presence of disease. The amount of patient data increases continuously. In the diagnosis of new
instances, disagreement between a CAD system and physicians suggests an inaccurate classifier.
Intuitively, misclassified instances and the previously acquired data should be used to retrain the classifier.
This, however, is very time-consuming and, in cases where the dataset is too large, becomes infeasible. In
addition, only a small percentage of the patient data shows positive signs, a situation known as imbalanced
data. We present an incremental Support Vector Machine (SVM) as a solution to the class imbalance problem
in the classification of anomalies in medical images. The support vectors provide a concise representation of the
distribution of the training data. Here we use bootstrapping to identify potential candidate support vectors
for future iterations. Experiments were conducted using images from endoscopy videos, and the sensitivity
and specificity were close to those of an SVM trained on all samples available at a given incremental step, with
significantly improved efficiency in training the classifier.
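
The incremental scheme described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes scikit-learn's `SVC` as the base classifier, and it assumes that "candidate support vectors" are the union of support vectors from the current fit and from a few stratified bootstrap refits. The function names (`stratified_bootstrap`, `fit_and_select`, `incremental_fit`), the kernel, and `n_boot` are hypothetical choices made for the example.

```python
import numpy as np
from sklearn.svm import SVC

def stratified_bootstrap(y, rng):
    """Resample indices with replacement inside each class, so no class vanishes."""
    return np.concatenate([
        rng.choice(np.flatnonzero(y == c), size=int((y == c).sum()), replace=True)
        for c in np.unique(y)
    ])

def fit_and_select(X, y, n_boot=5, seed=0):
    """Fit an SVM and return it with the indices of candidate support vectors.

    ASSUMPTION: a sample is a candidate if it is a support vector in the
    main fit or in any of n_boot bootstrap refits.
    """
    rng = np.random.default_rng(seed)
    clf = SVC(kernel="rbf", class_weight="balanced").fit(X, y)
    keep = set(clf.support_.tolist())
    for _ in range(n_boot):
        idx = stratified_bootstrap(y, rng)
        boot = SVC(kernel="rbf", class_weight="balanced").fit(X[idx], y[idx])
        keep.update(idx[boot.support_].tolist())  # map back to original rows
    return clf, np.array(sorted(keep))

def incremental_fit(batches, n_boot=5):
    """Train step by step, carrying forward only the candidate support vectors."""
    X_keep = y_keep = clf = None
    for X_new, y_new in batches:
        if X_keep is None:
            X, y = X_new, y_new
        else:
            X = np.vstack([X_keep, X_new])
            y = np.concatenate([y_keep, y_new])
        clf, keep = fit_and_select(X, y, n_boot=n_boot)
        X_keep, y_keep = X[keep], y[keep]  # compact summary of the data so far
    return clf, X_keep, y_keep

# Toy imbalanced batches standing in for image features (200 negative, 20 positive).
rng = np.random.default_rng(0)

def make_batch(n_neg=200, n_pos=20):
    X = np.vstack([rng.normal(0, 1, (n_neg, 2)), rng.normal(3, 1, (n_pos, 2))])
    y = np.array([0] * n_neg + [1] * n_pos)
    return X, y

batches = [make_batch() for _ in range(3)]
clf, X_keep, y_keep = incremental_fit(batches)
print(len(X_keep), "samples retained out of", sum(len(b[1]) for b in batches))
```

Each step trains only on the retained candidates plus the new batch rather than on all data seen so far, which is where the training-time saving would come from. The bootstrap refits widen the retained set slightly, on the assumption that samples near the margin may become support vectors once new data shifts the decision boundary.
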