1 November 1996 3D shape inferencing and modeling for semantic video retrieval
Proceedings Volume 2916, Multimedia Storage and Archiving Systems; (1996); doi: 10.1117/12.257292
Event: Photonics East '96, 1996, Boston, MA, United States
In this paper we present a geometry-based indexing method for the semantic retrieval of large video databases. It combines two separate modules, 3D object shape inferencing from a video sequence and geometric modeling of the reconstructed shape, to achieve better performance. First, a motion-based segmentation algorithm employing feature block tracking and hierarchical principal component split is used for multi-moving-object motion classification and segmentation. After segmentation, the feature blocks for an individual moving scene or object can be used to reconstruct the 3D motion and shape structure of that scene or object by a factorization method. We assume the object is rigid and relatively far from the camera, so that perspective distortion can be ignored. The estimated shape structure and motion parameters are then used to generate the implicit polynomial (IP) representation of the object. The system starts with a very coarse representation of the 3D shape. As more frames become available from the video stream and are properly segmented and classified, the IP representation is updated by varying the coefficients of the implicit polynomial to minimize the estimation error. This process stops when enough information has been obtained to generate a reliable IP shape representation, or when the video stream runs out. Semantic retrieval from the video databases is achieved by using the geometric structure of the objects and their spatial relationships. We generalize the 2D string concept to 3D to compactly encode the spatial relationships among objects. The algebraic invariants of the implicit polynomial are used as the geometric feature vector for the object. A similarity value can be computed for two sets of objects or two video sequences to allow fast retrieval from video databases.
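The abstract does not say which factorization method is used. As an illustration only, the sketch below assumes a rank-3 factorization of tracked feature positions under orthographic projection, which matches the stated assumption of a rigid, distant object. The function names `factorize_affine` and `synthetic_tracks`, the row layout of the measurement matrix, and the synthetic data are all hypothetical and are not taken from the paper. The result is recovered only up to an affine ambiguity; a metric upgrade is omitted.

```python
import numpy as np

def factorize_affine(W):
    """Rank-3 factorization of a 2F x P measurement matrix (illustrative).

    Assumed layout: rows 0..F-1 hold the x image coordinates of P tracked
    features over F frames, rows F..2F-1 hold the y coordinates.
    Returns motion M (2F x 3), shape S (3 x P) and per-row translation t
    such that M @ S + t approximates W. M and S are determined only up to
    an invertible 3 x 3 (affine) transform.
    """
    W = np.asarray(W, dtype=float)
    # Subtracting each row's centroid removes the image-plane translation.
    t = W.mean(axis=1, keepdims=True)
    W_reg = W - t
    # For a rigid object under orthographic projection, W_reg has rank <= 3.
    U, s, Vt = np.linalg.svd(W_reg, full_matrices=False)
    root = np.sqrt(s[:3])
    M = U[:, :3] * root           # scale the first three left singular vectors
    S = root[:, None] * Vt[:3]    # scale the first three right singular vectors
    return M, S, t

def synthetic_tracks(num_frames=6, num_points=20, seed=0):
    """Build noise-free orthographic tracks of a random rigid point cloud."""
    rng = np.random.default_rng(seed)
    X = rng.normal(size=(3, num_points))      # hypothetical 3D feature points
    rows_x, rows_y = [], []
    for f in range(num_frames):
        a, b = 0.15 * f, 0.10 * f             # rotation angles for this frame
        Ry = np.array([[np.cos(a), 0.0, np.sin(a)],
                       [0.0, 1.0, 0.0],
                       [-np.sin(a), 0.0, np.cos(a)]])
        Rx = np.array([[1.0, 0.0, 0.0],
                       [0.0, np.cos(b), -np.sin(b)],
                       [0.0, np.sin(b), np.cos(b)]])
        R = Rx @ Ry
        # Orthographic projection: keep the first two rows, add a 2D shift.
        proj = R[:2] @ X + np.array([[0.5 * f], [0.2 * f]])
        rows_x.append(proj[0])
        rows_y.append(proj[1])
    return np.vstack(rows_x + rows_y)

if __name__ == "__main__":
    W = synthetic_tracks()
    M, S, t = factorize_affine(W)
    print("motion:", M.shape, "shape:", S.shape)
    print("max reconstruction error:", np.abs(M @ S + t - W).max())
```

With real feature block tracks the measurement matrix is noisy, so the fourth and later singular values are small but nonzero; truncating to three components then gives a least-squares estimate rather than an exact one.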
© (1996) COPYRIGHT Society of Photo-Optical Instrumentation Engineers (SPIE). Downloading of the abstract is permitted for personal use only.
Zhibin Lei, Yun-Tin Lin, "3D shape inferencing and modeling for semantic video retrieval", Proc. SPIE 2916, Multimedia Storage and Archiving Systems, (1 November 1996); doi: 10.1117/12.257292; https://doi.org/10.1117/12.257292

3D modeling


Motion models


Semantic video


Motion estimation


