25 February 2014
MPEG-4 AVC saliency map computation
A saliency map provides information about the regions inside some visual content (image, video, ...) at which a human
observer will spontaneously look. For saliency map computation, current research studies consider the uncompressed
(pixel) representation of the visual content and extract various types of information (intensity, color, orientation, motion
energy), which are then fused. This paper goes one step further and computes the saliency map directly from the
MPEG-4 AVC stream syntax elements, with minimal decoding operations. To this end, an a priori in-depth study of
the MPEG-4 AVC syntax elements is first carried out so as to identify the entities that attract visual attention.
Secondly, the MPEG-4 AVC reference software is extended with software tools that parse these elements
and allow their subsequent use in objective benchmarking experiments. In this way, it is demonstrated that an MPEG-4
saliency map can be obtained as a combination of static saliency and motion maps.
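The combination of a static saliency map and a motion map can be illustrated with a minimal sketch. The abstract does not state the fusion rule used in the paper, so the weighted average below, the min-max normalization, the weight `alpha`, and the function names `normalize` and `fuse_maps` are all assumptions made for illustration only. Each map is taken to be a 2-D grid with one value per block.

```python
def normalize(grid):
    """Rescale a 2-D list of numbers to the range [0, 1] (min-max)."""
    flat = [v for row in grid for v in row]
    lo, hi = min(flat), max(flat)
    if hi == lo:
        # A constant map carries no saliency information.
        return [[0.0 for _ in row] for row in grid]
    return [[(v - lo) / (hi - lo) for v in row] for row in grid]


def fuse_maps(static_map, motion_map, alpha=0.5):
    """Fuse two equally sized maps by a weighted average.

    alpha is a hypothetical weight for the static map; the motion map
    receives 1 - alpha. This is not the fusion rule from the paper.
    """
    if len(static_map) != len(motion_map) or any(
        len(s) != len(m) for s, m in zip(static_map, motion_map)
    ):
        raise ValueError("static and motion maps must have the same shape")
    s_norm = normalize(static_map)
    m_norm = normalize(motion_map)
    return [
        [alpha * sv + (1.0 - alpha) * mv for sv, mv in zip(s_row, m_row)]
        for s_row, m_row in zip(s_norm, m_norm)
    ]


# Toy 2x2 example: one value per block.
static_map = [[0, 2], [4, 8]]
motion_map = [[1, 1], [3, 5]]
fused = fuse_maps(static_map, motion_map)
```

Normalizing both maps first keeps either one from dominating the result only because of its numeric range; other fusion rules (maximum, product) would fit the same interface.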
This saliency map is experimentally validated within a robust watermarking framework. When included in an m-QIM
(multiple-symbol Quantization Index Modulation) insertion method, average PSNR gains of 2.43 dB, 2.15 dB, and 2.37
dB are obtained for data payloads of 10, 20, and 30 watermarked blocks per I frame, i.e. about 30, 60, and 90 bits/second,
respectively. These quantitative results are obtained by processing 2 hours of heterogeneous video content.
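For readers unfamiliar with the two quantities named above, the following sketch shows a generic textbook form of multi-symbol QIM on a single scalar coefficient, together with the usual PSNR formula. It is not the m-QIM scheme of the paper: the quantization step `delta`, the alphabet size `m`, the peak value of 255, and all function names are assumptions chosen for illustration.

```python
import math


def qim_embed(x, symbol, delta, m):
    """Quantize x onto the lattice associated with `symbol` (0 .. m-1).

    Each symbol owns a lattice of step `delta`, shifted by symbol * delta / m.
    """
    offset = symbol * delta / m
    return math.floor((x - offset) / delta + 0.5) * delta + offset


def qim_detect(y, delta, m):
    """Return the symbol whose lattice has the point closest to y."""
    def distance(symbol):
        return abs(y - qim_embed(y, symbol, delta, m))
    return min(range(m), key=distance)


def psnr(original, modified, peak=255.0):
    """Peak signal-to-noise ratio in dB between two equal-length sequences."""
    mse = sum((a - b) ** 2 for a, b in zip(original, modified)) / len(original)
    if mse == 0:
        return math.inf
    return 10.0 * math.log10(peak ** 2 / mse)


# Embed symbol 3 of a 4-symbol alphabet into a coefficient of value 10.3.
marked = qim_embed(10.3, symbol=3, delta=4.0, m=4)
# Detection still succeeds after a small perturbation (below delta / (2 * m)).
recovered = qim_detect(marked + 0.3, delta=4.0, m=4)
```

In this generic form, a larger `delta` tolerates stronger perturbations but moves the coefficient further from its original value, which lowers the PSNR. A saliency map can be used to steer such changes toward regions a viewer is less likely to look at.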
M. Ammar, M. Mitrea, M. Hasnaoui, "MPEG-4 AVC saliency map computation," Proc. SPIE 9014, Human Vision and Electronic Imaging XIX, 90141A (25 February 2014); https://doi.org/10.1117/12.2042450