Ear diseases are frequently occurring conditions affecting the majority of the pediatric population, potentially resulting in hearing loss and communication disabilities. The current standard of care in diagnosing ear diseases includes a visual examination of the tympanic membrane (TM) by a medical expert with a range of available otoscopes. However, visual examination is subjective and depends on various factors, including the experience of the expert. This work proposes a decision fusion mechanism that combines predictions obtained from digital otoscopy images and biophysical measurements (obtained through tympanometry) for the detection of eardrum abnormalities. Our database consisted of 73 tympanometry records along with digital otoscopy videos. For the tympanometry aspect, we trained a random forest (RF) classifier using raw tympanometry attributes. Additionally, we mimicked a clinician’s decision on tympanometry findings using the normal range of tympanogram values provided by a clinical guide. Moreover, we re-trained Inception-ResNet-v2 to classify TM images selected from each otoscopic video. After obtaining predictions from each of the three sources, we applied a majority voting-based decision fusion technique to reach the final decision. Experimental results show that the proposed decision fusion method improved the classification accuracy, positive predictive value, and negative predictive value compared to the individual classifiers. The accuracies were 64.4% for the clinical evaluation of tympanometry, 76.7% for the computerized analysis of tympanometry data, and 74.0% for the TM image analysis, while our decision fusion methodology increased the classification accuracy to 84.9%. To the best of our knowledge, this is the first study to fuse data from digital otoscopy and tympanometry.
Preliminary results suggest that fusing data from different sensing modalities may provide complementary information for accurate, computerized diagnosis of TM-related abnormalities.
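The majority voting step described above can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: the function and variable names (`majority_vote`, `tymp_clinical`, `tymp_rf`, `tm_image`) are hypothetical, and the toy predictions are fabricated for demonstration only.

```python
import numpy as np

def majority_vote(preds_a, preds_b, preds_c):
    """Fuse three binary prediction arrays (0 = normal, 1 = abnormal)
    by per-sample majority voting: abnormal if at least 2 of 3 agree."""
    stacked = np.stack([preds_a, preds_b, preds_c])  # shape (3, n_samples)
    return (stacked.sum(axis=0) >= 2).astype(int)

# Toy predictions for four samples from the three sources:
tymp_clinical = np.array([1, 0, 1, 0])  # clinician's tympanometry reading
tymp_rf       = np.array([1, 1, 0, 0])  # RF on raw tympanometry attributes
tm_image      = np.array([0, 1, 1, 0])  # CNN on otoscopy TM images
print(majority_vote(tymp_clinical, tymp_rf, tm_image))  # [1 1 1 0]
```

With three voters there are no ties for a binary label, so the fused decision is always well defined.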
In this study, we proposed an approach to report the condition of the eardrum as “normal” or “abnormal” by ensembling two different deep learning architectures. In the first network (Network 1), we applied transfer learning to the Inception V3 network using 409 labeled samples. As a second network (Network 2), we designed a convolutional neural network that takes advantage of auto-encoders, using an additional 673 unlabeled eardrum samples. The individual classification accuracies of Network 1 and Network 2 were 84.4% (±12.1%) and 82.6% (±11.3%), respectively. Only 32% of the errors of the two networks were the same, making it possible to combine the two approaches for better classification accuracy. The proposed ensemble method yields robust classification, achieving high accuracy (84.4%) with the lowest standard deviation (±10.3%).
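The abstract does not specify how the two networks' outputs are combined; one common choice is a weighted average of their per-class probabilities. The sketch below assumes both networks emit softmax outputs over two classes; the function name `ensemble_predict` and the toy probabilities are hypothetical.

```python
import numpy as np

def ensemble_predict(probs_1, probs_2, w1=0.5):
    """Combine per-class probabilities from two networks by a weighted
    average, then take the argmax (0 = normal, 1 = abnormal)."""
    combined = w1 * probs_1 + (1.0 - w1) * probs_2
    return combined.argmax(axis=1)

# Toy softmax outputs for three eardrum images:
# columns are [p(normal), p(abnormal)].
net1 = np.array([[0.9, 0.1], [0.4, 0.6], [0.3, 0.7]])
net2 = np.array([[0.8, 0.2], [0.3, 0.7], [0.2, 0.8]])
print(ensemble_predict(net1, net2))  # [0 1 1]
```

Because only 32% of the two networks' errors overlap, averaging their confidences lets each network correct the other's mistakes on the samples where they disagree, which is what stabilizes the ensemble's standard deviation.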
In this study, we propose an automated otoscopy image analysis system called Autoscope. To the best of our knowledge, Autoscope is the first system designed to detect a wide range of eardrum abnormalities using high-resolution otoscope images and to report the condition of the eardrum as “normal” or “abnormal.” To achieve this goal, we first developed a preprocessing step to reduce camera-specific problems, detect the region of interest in the image, and prepare the image for further analysis. Subsequently, we designed a new set of clinically motivated eardrum features (CMEF). Furthermore, we evaluated the potential of the visual MPEG-7 descriptors for the task of tympanic membrane image classification. Then, we fused the information extracted from the CMEF and state-of-the-art computer vision features (CVF), which included the MPEG-7 descriptors and two additional features, using a state-of-the-art classifier. In our experiments, 247 tympanic membrane images with 14 different types of abnormality were used, and Autoscope was able to classify the given tympanic membrane images as normal or abnormal with 84.6% accuracy.
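Feature-level fusion of the kind described here typically concatenates the two feature sets into a single vector per image before classification. The sketch below assumes that setup; the function name `fuse_features` and the feature dimensions (12 CMEF values, 64 CVF values) are hypothetical, chosen only for illustration.

```python
import numpy as np

def fuse_features(cmef, cvf):
    """Concatenate clinically motivated eardrum features (CMEF) with
    computer-vision features (CVF) along the feature axis, producing
    one fused vector per image for a downstream classifier."""
    return np.concatenate([cmef, cvf], axis=1)

rng = np.random.default_rng(0)
cmef = rng.random((5, 12))   # hypothetical: 12 CMEF values per image
cvf  = rng.random((5, 64))   # hypothetical: 64 CVF / MPEG-7 values per image
X = fuse_features(cmef, cvf)
print(X.shape)  # (5, 76)
```

The fused matrix `X` would then be passed, together with the normal/abnormal labels, to the classifier's training routine.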