This work profiles the energy consumption of the H.264 video decoder on VLIW embedded processors using the Trimaran simulator. The profiling shows that branch operations in quarter-pixel (QP) interpolation and the DCT lower the issue rate of VLIW processors. To address this issue, several new instruction set architecture (ISA) extensions are proposed. The new instructions raise the issue rate and reduce the total energy consumption. Finally, experimental results for the proposed instruction-level power-efficient strategies on the TI C6416 processor are reported and discussed.
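To illustrate where branches arise in QP interpolation, the following minimal sketch applies the standard H.264 six-tap luma filter (1, -5, 20, 20, -5, 1) and the rounded quarter-sample average, with the clip to [0, 255] written both with branches and without. The function names and the shift-and-mask clip are illustrative assumptions and do not represent the instructions proposed in the work.

```python
def clip255_branchy(x):
    """Reference clip to [0, 255] using two conditional branches."""
    if x < 0:
        return 0
    if x > 255:
        return 255
    return x


def clip255_branchless(x):
    """Same clip using only shifts and masks (valid for |x| < 2**31)."""
    x &= ~(x >> 31)            # x >> 31 is -1 for negative x, so negatives become 0
    over = (255 - x) >> 31     # -1 when x > 255, otherwise 0
    return (x & ~over) | (255 & over)


def half_pel(row, i):
    """Half-sample value between row[i] and row[i + 1] (six-tap filter)."""
    s = (row[i - 2] - 5 * row[i - 1] + 20 * row[i]
         + 20 * row[i + 1] - 5 * row[i + 2] + row[i + 3])
    return clip255_branchless((s + 16) >> 5)


def quarter_pel(row, i):
    """Quarter-sample value: rounded average of row[i] and the half-sample."""
    return (row[i] + half_pel(row, i) + 1) >> 1
```

A clip of this kind runs once per interpolated sample, so replacing its two data-dependent branches with straight-line operations is one plausible way to keep more VLIW issue slots filled.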
This research analyzes the power consumption of battery-powered, DSP-embedded multimedia systems on a test platform, the TI C64x. We focus on the behavior of frequently used compression/decompression functional modules. In particular, an MPEG-4 simple-profile decoder built from these modules is evaluated at the highest compiler optimization level to understand power allocation in embedded multimedia systems. Two DCT schemes are compared for their power behavior: the integer DCT achieves a 47% power saving over a floating-point DCT implementation. Overall, our study provides a better understanding of system-level power modeling and consumption estimation for embedded multimedia applications, and suggests some optimization methods.
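The difference between the two DCT schemes can be sketched as follows, assuming the integer scheme is a fixed-point approximation of the 8-point DCT-II with coefficients scaled by 2^13. The abstract does not specify the actual schemes, so the scaling factor `SHIFT` and all function names here are hypothetical.

```python
import math

N = 8
SHIFT = 13  # assumed fixed-point precision: coefficients are scaled by 2**13


def _basis(k, n):
    """Orthonormal 8-point DCT-II basis value for frequency k and sample n."""
    scale = math.sqrt(0.5) if k == 0 else 1.0
    return 0.5 * scale * math.cos((2 * n + 1) * k * math.pi / (2 * N))


FLOAT_COEF = [[_basis(k, n) for n in range(N)] for k in range(N)]
INT_COEF = [[round(c * (1 << SHIFT)) for c in row] for row in FLOAT_COEF]


def dct8_float(x):
    """Floating-point 1-D DCT of eight samples."""
    return [sum(c * v for c, v in zip(row, x)) for row in FLOAT_COEF]


def dct8_int(x):
    """Integer-only 1-D DCT: integer multiply-accumulate, then a rounding shift."""
    half = 1 << (SHIFT - 1)
    return [(sum(c * v for c, v in zip(row, x)) + half) >> SHIFT
            for row in INT_COEF]
```

For 8-bit samples the two versions agree to within one unit per coefficient, while the integer version needs only integer multiplies, adds, and shifts, which a fixed-point DSP executes natively.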
Elliptic curve cryptography (ECC) is an excellent candidate for secure embedded multimedia applications because of its small key size and high level of security. This research profiles the performance of an ECC implementation, including execution time and data cache stalls, on the TriMedia TM1300 and the Intel Pentium 4. Based on this study, we identify the main bottlenecks of the ECC implementation and propose micro-architectural features that favor this application. Moreover, several integer multiplication schemes are presented for the TM1300 processor to enhance performance. In particular, the FIR-based multiplication is built on the special FIR instruction provided by the TM1300. The performance improvement of the proposed schemes is reported and discussed. Overall, we aim to provide a good understanding of the system architecture of secure embedded multimedia applications and of hardware and software cryptography implementation, with ECC as an example.
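The idea behind an FIR-based multiplication can be sketched as follows: when large integers are split into 16-bit limbs, each column of the product is a sum of limb products, which is the same multiply-accumulate pattern as one output of an FIR filter. The limb width and function names are assumptions for illustration, and the sketch does not model the operand widths or tap count of the actual TM1300 FIR instruction.

```python
LIMB_BITS = 16                  # assumed limb width
MASK = (1 << LIMB_BITS) - 1


def to_limbs(n, count):
    """Split a non-negative integer into `count` little-endian 16-bit limbs."""
    return [(n >> (LIMB_BITS * i)) & MASK for i in range(count)]


def from_limbs(limbs):
    """Reassemble an integer from little-endian limbs."""
    return sum(limb << (LIMB_BITS * i) for i, limb in enumerate(limbs))


def fir_multiply(a, b):
    """Multiply two limb arrays column by column, FIR (convolution) style."""
    n, m = len(a), len(b)
    columns = []
    for k in range(n + m - 1):
        acc = 0
        # One column is one FIR output: a window of `a` times reversed `b`.
        for j in range(max(0, k - m + 1), min(k, n - 1) + 1):
            acc += a[j] * b[k - j]
        columns.append(acc)
    # Propagate carries so that every output limb fits in 16 bits.
    out, carry = [], 0
    for c in columns:
        carry += c
        out.append(carry & MASK)
        carry >>= LIMB_BITS
    out.append(carry)           # final carry is always below 2**16
    return out
```

For example, `from_limbs(fir_multiply(to_limbs(x, 6), to_limbs(y, 4)))` equals `x * y` for a 96-bit `x` and a 64-bit `y`. Python integers do not overflow, so the column accumulator here is unbounded; a hardware implementation would have to size or split the accumulator accordingly.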