In this paper we present a novel method to detect abnormal regions from capsule endoscopy images. Wireless Capsule
Endoscopy (WCE) is a recent technology where a capsule with an embedded camera is swallowed by the patient to
visualize the gastrointestinal tract. One challenge is that a single diagnostic procedure produces over 50,000 images,
making the physicians' review process expensive. Review involves identifying the images
containing abnormal regions (tumor, bleeding, etc.) in this long image sequence. We construct
a framework for robust, real-time detection of abnormal regions in large collections of capsule endoscopy images.
The detected candidate regions can be labeled automatically for further review by physicians, thereby
reducing the overall review effort. The framework has the
following advantages: 1) Trainable. Users can define and label any type of abnormal region they want to find.
Abnormal regions, such as tumor or bleeding, can be pre-defined and labeled using the graphical user interface tool we
provide. 2) Efficient. Because of the large number of images, detection speed is very important. Our system
detects efficiently at different scales thanks to the integral image features we use. 3) Robust. After feature selection
we use a cascade of classifiers to further improve detection accuracy.
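To illustrate why integral image features make multi-scale detection cheap, the following is a minimal sketch, not the implementation used in this work. It assumes Haar-like rectangle features in the style commonly paired with integral images; the function names (`integral_image`, `box_sum`, `two_rect_feature`, `cascade_accepts`) and the stage thresholds are hypothetical. Once the table is built, the sum over any rectangle costs four lookups regardless of its size, and a cascade can reject most windows after only a few cheap stages.

```python
import numpy as np

def integral_image(img):
    """Summed-area table with an extra zero row and column at the top-left."""
    img = np.asarray(img, dtype=np.int64)
    ii = np.zeros((img.shape[0] + 1, img.shape[1] + 1), dtype=np.int64)
    ii[1:, 1:] = img.cumsum(axis=0).cumsum(axis=1)
    return ii

def box_sum(ii, top, left, height, width):
    """Sum of the pixels in a rectangle, using four table lookups."""
    bottom, right = top + height, left + width
    return ii[bottom, right] - ii[top, right] - ii[bottom, left] + ii[top, left]

def two_rect_feature(ii, top, left, height, width):
    """Assumed Haar-like feature: left half minus right half (even width)."""
    half = width // 2
    return (box_sum(ii, top, left, height, half)
            - box_sum(ii, top, left + half, height, half))

def cascade_accepts(ii, window, stages):
    """Reject a window at the first stage whose feature falls below its threshold.

    `stages` is a list of (feature_function, threshold) pairs; in a trained
    system both would come from feature selection, here they are placeholders.
    """
    top, left, height, width = window
    for feature, threshold in stages:
        if feature(ii, top, left, height, width) < threshold:
            return False
    return True
```

Because the cost of `box_sum` does not depend on the rectangle size, the same features can be evaluated at larger scales without rescaling the image.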
Text quality can significantly affect the results of text detection and recognition in digital video. In this paper we address the problem of estimating text quality. The quality of text that appears in video is often much lower than that in document images, and can be degraded by factors such as low resolution, background variation, uneven lighting, motion of the text and camera, and, in the case of scene text, projection from 3D. Features based on text resolution, background noise, contrast, illumination and texture are selected to describe text quality; they are normalized and fed into a trained RBF network that outputs the quality estimate. The performance of different training schemes is compared.
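The pipeline of normalizing features and passing them through an RBF network can be sketched as follows. This is an illustration under stated assumptions, not the trained network described here: it assumes z-score normalization, Gaussian basis functions centered on the training samples, and output weights fitted by linear least squares. The five feature columns stand in for resolution, background noise, contrast, illumination and texture; the data, `gamma`, and all function names are hypothetical.

```python
import numpy as np

def normalize(X, mean, std):
    """Z-score normalization using statistics from the training set."""
    return (X - mean) / std

def rbf_design(X, centers, gamma):
    """Gaussian activations exp(-gamma * ||x - c||^2) for every sample/center pair."""
    d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-gamma * d2)

def fit_rbf(X, y, centers, gamma):
    """Fit linear output weights (plus a bias term) by least squares."""
    Phi = np.hstack([rbf_design(X, centers, gamma), np.ones((len(X), 1))])
    w, *_ = np.linalg.lstsq(Phi, y, rcond=None)
    return w

def predict_rbf(X, centers, gamma, w):
    """Quality estimate for each row of normalized features."""
    Phi = np.hstack([rbf_design(X, centers, gamma), np.ones((len(X), 1))])
    return Phi @ w

# Synthetic stand-in data: 40 text blocks, 5 raw features on different scales.
rng = np.random.default_rng(0)
X_raw = rng.normal(size=(40, 5)) * np.array([10.0, 0.5, 50.0, 30.0, 2.0])
y = rng.uniform(0.0, 1.0, size=40)          # made-up quality scores in [0, 1]

mean, std = X_raw.mean(axis=0), X_raw.std(axis=0)
X = normalize(X_raw, mean, std)
gamma = 0.5                                  # assumed kernel width
w = fit_rbf(X, y, centers=X, gamma=gamma)
pred = predict_rbf(X, X, gamma, w)
```

Normalization matters here because the Gaussian basis depends on Euclidean distance, so a feature on a large numeric scale would otherwise dominate the others.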
One difficulty with using text from digital video for indexing and retrieval is that video images are often of low resolution and poor quality, and as a result the text cannot be recognized adequately by most commercial OCR software. Text image enhancement is necessary to achieve reasonable OCR accuracy. Our enhancement consists of two main procedures: resolution enhancement based on Shannon interpolation, and separation of text from complex image backgrounds. Experiments show that our enhancement approach improves OCR accuracy considerably.
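The resolution enhancement step can be illustrated with a minimal sketch of Shannon (sinc) interpolation. It assumes a separable 2D formulation, an integer upsampling factor, and an image treated as zero outside its borders; the windowing and boundary handling actually used are not specified here, and the names `sinc_matrix` and `shannon_upsample` are hypothetical. Each output sample is the sum of the input samples weighted by sinc(t - n), where t is the output position in units of the input sampling interval.

```python
import numpy as np

def sinc_matrix(n_in, factor):
    """Interpolation weights mapping n_in samples to n_in * factor samples."""
    t = np.arange(n_in * factor) / factor   # output positions in input-sample units
    n = np.arange(n_in)                     # input sample positions
    return np.sinc(t[:, None] - n[None, :]) # np.sinc(x) = sin(pi x) / (pi x)

def shannon_upsample(img, factor):
    """Separable sinc interpolation along rows, then columns."""
    img = np.asarray(img, dtype=float)
    rows = sinc_matrix(img.shape[0], factor)
    cols = sinc_matrix(img.shape[1], factor)
    return rows @ img @ cols.T
```

A useful sanity check on this formulation is that the original pixels are reproduced exactly at every `factor`-th output position, since sinc is 1 at zero and 0 at every other integer.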