Digital video decoding, enabled by the MPEG-2 Video standard, is an important future application for embedded systems, particularly PDAs and other information appliances. Many such system require portability and wireless communication capabilities, and thus face severe limitations in size and power consumption. This places a premium on integration and efficiency, and favors software solutions for video functionality over specialized hardware. The processors in most embedded system currently lack the computational power needed to perform video decoding, but a related and equally important problem is the required data bandwidth, and the need to cost-effectively insure adequate data supply. MPEG data sets are very large, and generate significant amounts of excess memory traffic for standard data caches, up to 100 times the amount required for decoding. Meanwhile, cost and power limitations restrict cache sizes in embedded systems. Some systems, including many media processors, eliminate caches in favor of memories under direct, painstaking software control in the manner of digital signal processors. Yet MPEG data has locality which caches can exploit if properly optimized, providing fast, flexible, and automatic data supply. We propose a set of enhancements which target the specific needs of the heterogeneous types within the MPEG decoder working set. These optimizations significantly improve the efficiency of small caches, reducing cache-memory traffic by almost 70 percent, and can make an enhanced 4 KB cache perform better than a standard 1 MB cache. This performance improvement can enable high-resolution, full frame rate video playback in cheaper, smaller system than woudl otherwise be possible.