30 March 2007 Automatic evaluation of uterine cervix segmentations
In this work we focus on the generation of reliable ground truth data for a large medical repository of digital cervicographic images (cervigrams) collected by the National Cancer Institute (NCI). This work is part of an ongoing effort conducted by NCI together with the National Library of Medicine (NLM) at the National Institutes of Health (NIH) to develop a web-based database of the digitized cervix images in order to study the evolution of lesions related to cervical cancer. As part of this effort, NCI has gathered twenty experts to manually segment a set of 933 cervigrams into regions of medical and anatomical interest. This process yields a set of images with multi-expert segmentations. The objectives of the current work are: 1) to generate multi-expert ground truth and assess the difficulty of segmenting an image, 2) to analyze observer variability in the multi-expert data, and 3) to use the multi-expert ground truth to evaluate automatic segmentation algorithms. The work is based on STAPLE (Simultaneous Truth and Performance Level Estimation), a well-known method for generating ground truth segmentation maps from multiple experts' observations. We have analyzed both intra- and inter-expert variability within the segmentation data. We propose novel measures of "segmentation complexity" that automatically identify cervigrams the experts found difficult to segment, based on their inter-observer variability. Finally, the results are used to assess our own automated algorithm for cervix boundary detection.
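For readers unfamiliar with STAPLE, the following is a minimal sketch of its binary (foreground/background) form as an expectation-maximization loop. It is not the authors' implementation and does not cover the multi-region cervigram setting. The function name `staple_binary`, the initial sensitivity and specificity of 0.9, the fixed iteration count, and the choice of the overall foreground fraction as the prior are all assumptions made for illustration.

```python
def staple_binary(decisions, n_iter=50, init_p=0.9, init_q=0.9, eps=1e-6):
    """Binary STAPLE sketch (hypothetical helper, pure Python).

    decisions[j][i] is 1 if rater j marked pixel i as foreground, else 0.
    Returns (w, p, q): per-pixel foreground probabilities, per-rater
    sensitivities, and per-rater specificities.
    """
    n_raters = len(decisions)
    n_pixels = len(decisions[0])

    # Assumed prior: fraction of all rater labels that are foreground.
    gamma = sum(sum(d) for d in decisions) / (n_raters * n_pixels)

    p = [init_p] * n_raters  # sensitivity of each rater
    q = [init_q] * n_raters  # specificity of each rater
    w = [0.5] * n_pixels     # P(true label is foreground) per pixel

    for _ in range(n_iter):
        # E-step: posterior probability that each pixel is truly foreground.
        for i in range(n_pixels):
            a = gamma
            b = 1.0 - gamma
            for j in range(n_raters):
                if decisions[j][i]:
                    a *= p[j]
                    b *= 1.0 - q[j]
                else:
                    a *= 1.0 - p[j]
                    b *= q[j]
            total = a + b
            w[i] = a / total if total > 0 else 0.5

        # M-step: re-estimate each rater's sensitivity and specificity.
        sum_w = sum(w)
        sum_not_w = n_pixels - sum_w
        for j in range(n_raters):
            hit = sum(w[i] for i in range(n_pixels) if decisions[j][i])
            rej = sum(1.0 - w[i] for i in range(n_pixels) if not decisions[j][i])
            if sum_w > 0:
                p[j] = hit / sum_w
            if sum_not_w > 0:
                q[j] = rej / sum_not_w
            # Clamp away from 0 and 1 so the E-step products never vanish.
            p[j] = min(max(p[j], eps), 1.0 - eps)
            q[j] = min(max(q[j], eps), 1.0 - eps)

    return w, p, q
```

Thresholding `w` at 0.5 gives one possible consensus map, and the estimated `p` and `q` give a per-rater performance summary that could be compared across experts or against an automated segmentation treated as an additional rater.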
© (2007) COPYRIGHT Society of Photo-Optical Instrumentation Engineers (SPIE). Downloading of the abstract is permitted for personal use only.
Shelly Lotenberg, Shiri Gordon, Rodney Long, Sameer Antani, Jose Jeronimo M.D., and Hayit Greenspan "Automatic evaluation of uterine cervix segmentations", Proc. SPIE 6515, Medical Imaging 2007: Image Perception, Observer Performance, and Technology Assessment, 65151J (30 March 2007);
