1 November 1996 3D shape inferencing and modeling for semantic video retrieval
Proceedings Volume 2916, Multimedia Storage and Archiving Systems; (1996) https://doi.org/10.1117/12.257292
Event: Photonics East '96, 1996, Boston, MA, United States
Abstract
In this paper we present a geometry-based indexing method for the semantic retrieval of large video databases. It combines two separate modules, 3D object shape inferencing from a video sequence and geometric modeling from the reconstructed shape, to achieve better performance. First, a motion-based segmentation algorithm employing feature block tracking and hierarchical principal component split is used for multi-moving-object motion classification and segmentation. After segmentation, the feature blocks belonging to an individual moving scene or object are used to reconstruct the 3D motion and shape structure of that scene or object by a factorization method. We assume the object is rigid and relatively far from the camera, so that perspective distortion can be ignored. The estimated shape structure and motion parameters are then used to generate the implicit polynomial (IP) representation of the object. The system starts with a very coarse representation of the 3D shape. As more frames become available from the video stream and are properly segmented and classified, the IP representation is updated by varying the coefficients of the implicit polynomial to minimize the estimation error. This process stops when enough information has been obtained to generate a reliable IP shape representation or when the video stream runs out. The semantic retrieval of the video databases is achieved by using the geometric structure of the objects and their spatial relationships. We generalize the 2D string concept to 3D to compactly encode the spatial relationships among objects. The algebraic invariants of the implicit polynomial are used as the geometric feature vector for the object. A similarity value can be computed for two sets of objects or two video sequences to allow fast retrieval from video databases.
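The abstract does not say which factorization method is used. As an illustration only, the sketch below assumes a standard rank-3 factorization of tracked feature positions under orthographic projection, which matches the stated assumptions of a rigid object far from the camera. The function names `synthetic_tracks` and `factorize_tracks`, the row layout of the measurement matrix, and the synthetic data are all hypothetical. The result is recovered only up to an affine ambiguity, and the metric upgrade step is omitted.

```python
import numpy as np


def synthetic_tracks(num_frames=6, num_points=20, seed=0):
    """Hypothetical test data: orthographic projections of a rigid point set."""
    rng = np.random.default_rng(seed)
    X = rng.normal(size=(3, num_points))          # 3D points of one rigid object
    xs, ys = [], []
    for _ in range(num_frames):
        Q, _ = np.linalg.qr(rng.normal(size=(3, 3)))  # random orthogonal matrix
        t = rng.normal(size=2)                        # 2D image translation
        xs.append(Q[0] @ X + t[0])                    # image x-coordinates
        ys.append(Q[1] @ X + t[1])                    # image y-coordinates
    # Assumed layout: first F rows hold x-coordinates, last F rows hold y.
    return np.vstack(xs + ys)


def factorize_tracks(W):
    """Split a (2F, P) measurement matrix into motion (2F, 3) and shape (3, P)."""
    # Subtracting each row's mean removes the per-frame translation.
    W_reg = W - W.mean(axis=1, keepdims=True)
    U, s, Vt = np.linalg.svd(W_reg, full_matrices=False)
    # For a rigid object under orthography, W_reg has rank at most 3.
    root = np.sqrt(s[:3])
    M = U[:, :3] * root            # motion factor, up to an affine ambiguity
    S = root[:, None] * Vt[:3]     # shape factor, up to the same ambiguity
    return M, S, s


W = synthetic_tracks()
M, S, singular_values = factorize_tracks(W)
```

With noisy tracks, the fourth and later singular values would no longer vanish, and their size relative to the first three gives a rough check on how well the rigid, orthographic assumption holds for a segmented object.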
© (1996) COPYRIGHT Society of Photo-Optical Instrumentation Engineers (SPIE). Downloading of the abstract is permitted for personal use only.
Zhibin Lei, Yun-Tin Lin, "3D shape inferencing and modeling for semantic video retrieval", Proc. SPIE 2916, Multimedia Storage and Archiving Systems, (1 November 1996); doi: 10.1117/12.257292; https://doi.org/10.1117/12.257292
PROCEEDINGS
12 PAGES

