20 July 2001 Multiscale audio-video analysis and processing: segmentations and arrangements
Proceedings Volume 4519, Internet Multimedia Management Systems II; (2001) https://doi.org/10.1117/12.434277
Event: ITCom 2001: International Symposium on the Convergence of IT and Communications, 2001, Denver, CO, United States
We propose a multi-scale, multi-modal analysis and processing scheme for audio-video data. Using a non-linear scale-space technique, audio-video data is analyzed and processed such that the result is invariant under various imaging and hearing conditions. Degradations due to Lyapunov and structural instabilities are suppressed by this scale-space technique without destroying essential semantic relations. On the basis of an audio-video segmentation, its arrangements are quantified in terms of spatio-temporal inclusion relations and dynamic ordering relations by means of scaling connectivity relations. These relations impose a topological structure on top of the audio-video scale-space, inducing unimodal and multi-modal semantics. Our scheme is illustrated separately for video, audio, and audio-video material, the latter pointing out the added value of integrating audio and video.
© (2001) COPYRIGHT Society of Photo-Optical Instrumentation Engineers (SPIE). Downloading of the abstract is permitted for personal use only.
Raango Aldershoff, Alfons H. Salden, "Multiscale audio-video analysis and processing: segmentations and arrangements", Proc. SPIE 4519, Internet Multimedia Management Systems II, (20 July 2001); https://doi.org/10.1117/12.434277

