Presentation + Paper
4 April 2022 Selecting training samples for ovarian cancer classification via a semi-supervised clustering approach
Author Affiliations +
Abstract
Machine learning techniques have shown great promise in digital pathology. However, a major bottleneck is the difficulty of annotating necessary amount of tissue to deal with several variability factors, namely chemical fixation, sample slicing, or staining. Usually, models are trained using sets of annotated small image patches, but then, the number of required patches may increase exponentially and yet they must represent such variability. This paper presents a method for automatic sample selection to train a classifier for ovarian cancer by integrating a novel soft clustering strategy. The method starts by classifying a large set of patches with a previously trained classifier and divide patches from the cancer class as highly and moderately confident. An unsupervised selection of moderately confident patches by a Probabilistic Latent Semantic Analysis (PLSA), picks samples from relevant and meaningful groups with maximum within-group variance. A new model is re-trained using the highly confident patches together with patches obtained from the associated PLSA. This strategy outperforms a model trained with a larger set of annotated patches while the training times and the number of samples are much more smaller. The strategy was evaluated in a set of patches from 18 patients with Serous Ovarian Cancer, obtaining a reduction of 54.62% in the training time and 73.66% in the number of samples, while recall rate improved from 0.69 to 0.73.
Conference Presentation
© (2022) COPYRIGHT Society of Photo-Optical Instrumentation Engineers (SPIE). Downloading of the abstract is permitted for personal use only.
Jennifer Salguero L., Prateek Prasanna, Germán Corredor, Angel Cruz-Roa, David Becerra, and Eduardo Romero "Selecting training samples for ovarian cancer classification via a semi-supervised clustering approach", Proc. SPIE 12039, Medical Imaging 2022: Digital and Computational Pathology, 1203906 (4 April 2022); https://doi.org/10.1117/12.2612984
Advertisement
Advertisement
RIGHTS & PERMISSIONS
Get copyright permission  Get copyright permission on Copyright Marketplace
KEYWORDS
Ovarian cancer

Cancer

Diagnostics

Feature extraction

Pathology

Statistical modeling

Machine learning

Back to Top