19 November 2003 SIMD-aware loop unrolling for embedded code optimization
Proceedings Volume 5241, Multimedia Systems and Applications VI; (2003) https://doi.org/10.1117/12.513540
Event: ITCom 2003, 2003, Orlando, Florida, United States
As modern embedded media applications (EMAs) grow more complex, instruction-level parallelism (ILP) alone is no longer sufficient to meet their performance demands. Compilers must also be able to exploit superword-level parallelism (SLP), which exposes more of the concurrency present in applications, reduces the latency caused by memory accesses, and hence produces more efficient code. Loops are good candidates for SLP extraction because their iterations share a parallel structure. This work analyzes the memory access patterns found in EMAs and presents a loop unrolling method that exploits these patterns to generate efficient Single Instruction Multiple Data (SIMD) instructions. Experiments on the TriMedia TM-1300 processor with an H.264 encoder show performance improvements ranging from 3 to 30 times, with an average of 12 times.
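As a rough illustration of the general idea (not the authors' method, which the abstract does not detail), the sketch below unrolls a byte-wise sum-of-absolute-differences loop by four so that adjacent memory accesses can be grouped into one packed 32-bit load per operand. The function names are hypothetical, and packed_sad4 is only a portable stand-in for whatever 4 x 8-bit packed operation a target might provide; it is not a real TriMedia intrinsic.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Scalar reference: one byte pair is loaded and processed per iteration. */
static unsigned sad_scalar(const uint8_t *a, const uint8_t *b, size_t n)
{
    unsigned sum = 0;
    for (size_t i = 0; i < n; i++)
        sum += (unsigned)abs((int)a[i] - (int)b[i]);
    return sum;
}

/* Hypothetical stand-in for a packed 4 x 8-bit SAD operation.
 * It is emulated lane by lane here so the sketch runs anywhere. */
static unsigned packed_sad4(uint32_t x, uint32_t y)
{
    unsigned sum = 0;
    for (int k = 0; k < 4; k++) {
        int xa = (int)((x >> (8 * k)) & 0xFF);
        int ya = (int)((y >> (8 * k)) & 0xFF);
        sum += (unsigned)abs(xa - ya);
    }
    return sum;
}

/* Unrolled by four: the four adjacent byte accesses of the original loop
 * body become one 32-bit load per operand plus one packed operation. */
static unsigned sad_unrolled(const uint8_t *a, const uint8_t *b, size_t n)
{
    unsigned sum = 0;
    size_t i = 0;
    for (; i + 4 <= n; i += 4) {
        uint32_t wa, wb;
        memcpy(&wa, a + i, 4);   /* alignment-safe packed load */
        memcpy(&wb, b + i, 4);
        sum += packed_sad4(wa, wb);
    }
    for (; i < n; i++)           /* leftover iterations when n % 4 != 0 */
        sum += (unsigned)abs((int)a[i] - (int)b[i]);
    return sum;
}
```

The unroll factor of four is assumed here only because four 8-bit pixels fit in one 32-bit word; a different element width or register size would suggest a different factor.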
© (2003) COPYRIGHT Society of Photo-Optical Instrumentation Engineers (SPIE). Downloading of the abstract is permitted for personal use only.
Yunyang Dai, Qing Li, Qi Zhang, C.-C. Jay Kuo, "SIMD-aware loop unrolling for embedded code optimization", Proc. SPIE 5241, Multimedia Systems and Applications VI, (19 November 2003); doi: 10.1117/12.513540; https://doi.org/10.1117/12.513540

