Digital pathological image retrieval plays an important role in computer-aided diagnosis for breast cancer. The retrieval
results of an unknown pathological image, which are generally previous cases with diagnostic information, can provide
doctors with assistance and reference. In this paper, we develop a novel pathological image retrieval method for breast
cancer based on stain components and the probabilistic latent semantic analysis (pLSA) model. Specifically, the
method first uses color deconvolution to obtain representations of the stain components corresponding to cell nuclei and
cytoplasm, and then extracts block Gabor features from the cell nuclei component, which are used to construct the codebook.
Furthermore, the connection between the words of the codebook and the latent topics among images is modeled by
pLSA. Each image can therefore be represented by its topics, which also describe the high-level semantic concepts of the
image. Experiments on a pathological image database for breast cancer demonstrate the effectiveness of our method.
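
As a minimal sketch of the topic-modeling step, the standard pLSA expectation-maximization updates can be written for a matrix of codebook word counts per image. This illustrates generic pLSA rather than the exact implementation used in this work. The function name `plsa`, the number of topics, the iteration count, and the toy count matrix are all assumptions made for illustration. In practice, the counts would come from quantizing block Gabor features against the codebook.

```python
import numpy as np

def plsa(counts, n_topics, n_iter=50, seed=0):
    """Fit pLSA by EM on an (n_images, n_words) count matrix.

    Returns P(z|d) with shape (n_images, n_topics) and
    P(w|z) with shape (n_topics, n_words).
    """
    rng = np.random.default_rng(seed)
    n_docs, n_words = counts.shape
    eps = 1e-12  # guards against division by zero

    # Random initialization, normalized to valid distributions.
    p_z_d = rng.random((n_docs, n_topics))
    p_z_d /= p_z_d.sum(axis=1, keepdims=True)
    p_w_z = rng.random((n_topics, n_words))
    p_w_z /= p_w_z.sum(axis=1, keepdims=True)

    for _ in range(n_iter):
        # E-step: P(z|d,w) is proportional to P(z|d) * P(w|z).
        post = p_z_d[:, :, None] * p_w_z[None, :, :]   # (d, z, w)
        post /= post.sum(axis=1, keepdims=True) + eps

        # M-step: re-estimate both distributions from expected counts.
        weighted = counts[:, None, :] * post           # (d, z, w)
        p_w_z = weighted.sum(axis=0)                   # (z, w)
        p_w_z /= p_w_z.sum(axis=1, keepdims=True) + eps
        p_z_d = weighted.sum(axis=2)                   # (d, z)
        p_z_d /= p_z_d.sum(axis=1, keepdims=True) + eps

    return p_z_d, p_w_z

# Toy data (hypothetical): 6 images, codebook of 20 visual words.
toy_rng = np.random.default_rng(1)
counts = toy_rng.integers(1, 10, size=(6, 20))
p_z_d, p_w_z = plsa(counts, n_topics=3)
```

Each row of `p_z_d` is the topic distribution of one image and can serve as its compact representation when comparing a query image against the database.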