13 March 1996 Similarity indexing: algorithms and performance
Abstract
Efficient indexing support is essential to allow content-based image and video databases using similarity-based retrieval to scale to large databases (tens of thousands up to millions of images). In this paper, we take an in-depth look at this problem. One of the major difficulties in solving it is the high dimension (6-100) of the feature vectors that are used to represent objects. We provide an overview of the work in computational geometry on this problem and highlight the results we found most useful in practice, including the use of approximate nearest neighbor algorithms. We also present a variant of the optimized k-d tree, which we call the VAM k-d tree, and provide algorithms to create an optimized R-tree, which we call the VAMSplit R-tree. We found that the VAMSplit R-tree provided better overall performance than all competing structures we tested for both main memory and secondary memory applications. We observed large improvements in performance relative to the R*-tree and SS-tree in secondary memory applications, and modest improvements relative to optimized k-d tree variants.
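As a rough illustration of the kind of structure the abstract refers to, the sketch below builds a plain k-d tree over feature vectors and runs a (1 + eps)-approximate nearest neighbor search on it. It is not the VAM k-d tree or the VAMSplit R-tree, whose details the abstract does not give. The split rule (widest-spread dimension, exact median) and the names `build`, `nearest`, and `eps` are assumptions chosen for this example only.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

Point = Tuple[float, ...]


@dataclass
class Node:
    point: Point                 # the point stored at this node
    axis: int                    # dimension this node splits on
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def sq_dist(a: Point, b: Point) -> float:
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def build(points: List[Point]) -> Optional[Node]:
    """Build a k-d tree, splitting on the dimension with the widest spread."""
    if not points:
        return None
    dims = len(points[0])
    # Assumed split rule: pick the axis where max - min is largest.
    axis = max(
        range(dims),
        key=lambda d: max(p[d] for p in points) - min(p[d] for p in points),
    )
    ordered = sorted(points, key=lambda p: p[axis])
    mid = len(ordered) // 2
    return Node(ordered[mid], axis, build(ordered[:mid]), build(ordered[mid + 1:]))


def nearest(root: Optional[Node], query: Point, eps: float = 0.0):
    """Return (point, distance). With eps > 0 the returned distance is at
    most (1 + eps) times the true nearest distance; eps = 0 is exact."""
    best_point: Optional[Point] = None
    best_sq = float("inf")
    shrink = (1.0 + eps) ** 2

    def visit(node: Optional[Node]) -> None:
        nonlocal best_point, best_sq
        if node is None:
            return
        d = sq_dist(node.point, query)
        if d < best_sq:
            best_point, best_sq = node.point, d
        diff = query[node.axis] - node.point[node.axis]
        near, far = (node.left, node.right) if diff < 0 else (node.right, node.left)
        visit(near)
        # Every point on the far side is at least |diff| away from the query.
        # Skip that side unless it could beat the current best by a factor
        # of more than (1 + eps).
        if diff * diff * shrink < best_sq:
            visit(far)

    visit(root)
    return best_point, best_sq ** 0.5
```

Calling `nearest(tree, q)` gives the exact neighbor, while `nearest(tree, q, eps=0.5)` may visit fewer nodes in exchange for a result that can be up to 1.5 times farther away than the true nearest neighbor. Pruning of this kind tends to become less effective as the dimension grows, which is the difficulty the abstract highlights.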
© (1996) COPYRIGHT Society of Photo-Optical Instrumentation Engineers (SPIE). Downloading of the abstract is permitted for personal use only.
David A. White, Ramesh C. Jain, "Similarity indexing: algorithms and performance", Proc. SPIE 2670, Storage and Retrieval for Still Image and Video Databases IV, (13 March 1996); doi: 10.1117/12.234810; https://doi.org/10.1117/12.234810
PROCEEDINGS
12 PAGES

