11 March 2010 An automatic system to detect and extract texts in medical images for de-identification
In recent years there has been an increasing need to share medical images for research purposes. To respect and preserve patient privacy, most medical images are de-identified by removing protected health information (PHI) before they are shared for research. Because manual de-identification is time-consuming and tedious, an automatic de-identification system is necessary to help doctors remove text from medical images. Many papers have been written about text detection and extraction algorithms; however, little of this work has been applied to the de-identification of medical images. Since the de-identification system is designed for end users, it should be effective, accurate, and fast. This paper proposes an automatic system to detect and extract text from medical images for de-identification purposes while keeping the anatomic structures intact. First, because text has a marked contrast with the background, a region-variance-based algorithm is used to detect the text regions. In post-processing, geometric constraints are applied to the detected text regions to eliminate over-segmentation, e.g., lines and anatomic structures. After that, a region-based level set method is used to extract text from the detected text regions. A GUI for the prototype application of the text detection and extraction system is implemented, which shows that our method can detect most of the text in the images. Experimental results validate that our method can detect and extract text in medical images with a 99% recall rate. Future research on this system includes algorithm improvement, performance evaluation, and computation optimization.
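The detection and post-processing steps described above can be illustrated with a minimal sketch. The abstract does not give the paper's actual formulation, so everything specific here is an assumption made for illustration: the non-overlapping 8x8 tiles, the variance threshold, the 4-connected grouping, the aspect-ratio and area limits, and all function names. The level set extraction step is not shown.

```python
from collections import deque

import numpy as np


def block_variance(image, block=8):
    """Intensity variance of each non-overlapping block x block tile."""
    h, w = image.shape
    h, w = h - h % block, w - w % block  # crop to a whole number of tiles
    tiles = image[:h, :w].astype(np.float64)
    tiles = tiles.reshape(h // block, block, w // block, block)
    return tiles.var(axis=(1, 3))


def connected_boxes(mask):
    """Bounding boxes (top, bottom, left, right) of 4-connected True regions.

    Coordinates are in tile units; bottom and right are exclusive.
    """
    seen = np.zeros(mask.shape, dtype=bool)
    boxes = []
    for r, c in zip(*np.nonzero(mask)):
        if seen[r, c]:
            continue
        seen[r, c] = True
        queue = deque([(r, c)])
        top, bottom, left, right = r, r, c, c
        while queue:  # breadth-first flood fill over neighbouring tiles
            y, x = queue.popleft()
            top, bottom = min(top, y), max(bottom, y)
            left, right = min(left, x), max(right, x)
            for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                inside = 0 <= ny < mask.shape[0] and 0 <= nx < mask.shape[1]
                if inside and mask[ny, nx] and not seen[ny, nx]:
                    seen[ny, nx] = True
                    queue.append((ny, nx))
        boxes.append((int(top), int(bottom) + 1, int(left), int(right) + 1))
    return boxes


def passes_geometry(box, min_area=2, max_aspect=10.0):
    """Assumed geometric constraint: reject tiny or very elongated regions."""
    top, bottom, left, right = box
    height, width = bottom - top, right - left
    if height * width < min_area:
        return False
    return max(height, width) / min(height, width) <= max_aspect


def detect_text_boxes(image, block=8, var_threshold=1000.0):
    """High-variance tiles, grouped into regions, filtered by geometry."""
    mask = block_variance(image, block) > var_threshold
    return [box for box in connected_boxes(mask) if passes_geometry(box)]
```

On a synthetic image containing a high-contrast striped patch and a one-pixel-wide line spanning the full width, both produce high-variance tiles, but the line's region is one tile tall and many tiles wide, so the aspect-ratio check discards it. The returned boxes are in tile units; multiplying by `block` gives pixel coordinates.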
© (2010) COPYRIGHT Society of Photo-Optical Instrumentation Engineers (SPIE). Downloading of the abstract is permitted for personal use only.
Yingxuan Zhu, P. D. Singh, Khan Siddiqui, Michael Gillam, "An automatic system to detect and extract texts in medical images for de-identification", Proc. SPIE 7628, Medical Imaging 2010: Advanced PACS-based Imaging Informatics and Therapeutic Applications, 762803 (11 March 2010); doi: 10.1117/12.855588; https://doi.org/10.1117/12.855588
