The difficulty of parallelizing entropy coding increasingly limits the data throughput achievable in media compression. In this work we analyze the fundamental limitations, using finite-state-machine models to identify the best way of separating tasks that can be processed independently while minimizing compression losses. This analysis confirms previous work showing that effective parallelization is feasible only if the compressed data is organized in a proper way, which is quite different from conventional formats. The proposed new formats exploit the fact that optimal compression is not affected by the arrangement of coded bits, and go further by exploiting the decreasing cost of data processing and memory. Additional advantages include the ability to use, within this framework, increasingly complex data modeling techniques, and the freedom to mix different types of coding. We confirm the effectiveness of the parallelization using coding simulations that run on multi-core processors, show how throughput scales with the number of cores, and analyze the additional bit-rate overhead.
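As a rough illustration of why data organization matters for parallel decoding, the following minimal sketch compresses independent segments and stores their sizes in an index, so that each segment can be handed to a separate worker. It is not the format proposed in this work: `zlib` is swapped in as a stand-in for an entropy coder, and the names `encode_segments` and `decode_segments`, the equal-size split, and the size index are all assumptions made for the example.

```python
import zlib
from concurrent.futures import ThreadPoolExecutor


def encode_segments(data: bytes, num_segments: int) -> tuple[list[int], bytes]:
    """Compress equal-sized slices independently; return (sizes, payload).

    The size list plays the role of an index that lets a decoder locate
    every segment without decoding the ones before it.
    """
    step = max(1, -(-len(data) // num_segments))  # ceiling division
    chunks = [data[i:i + step] for i in range(0, len(data), step)]
    coded = [zlib.compress(chunk) for chunk in chunks]
    return [len(c) for c in coded], b"".join(coded)


def decode_segments(sizes: list[int], payload: bytes, workers: int = 4) -> bytes:
    """Decode all segments concurrently, then concatenate in original order."""
    offsets = [0]
    for size in sizes:
        offsets.append(offsets[-1] + size)
    parts = [payload[offsets[i]:offsets[i + 1]] for i in range(len(sizes))]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map preserves input order, so the segments rejoin correctly.
        return b"".join(pool.map(zlib.decompress, parts))
```

Comparing `len(payload)` plus the index size against `len(zlib.compress(data))` for increasing `num_segments` gives a crude picture of the bit-rate overhead that segmentation introduces in this stand-in setting.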
Buffer- or counter-based techniques are adequate for dealing with carry propagation in software implementations of arithmetic coding, but create problems in hardware implementations because of the difficulty of handling worst-case scenarios, defined by very long propagations. We propose a new technique for constraining carry propagation, similar to “bit-stuffing,” but designed for encoders that generate data as bytes instead of individual bits. It is based on the fact that the encoder and decoder can maintain the same state, so both can identify the situations in which it is desired to limit carry propagation. The new technique adjusts the coding interval in a way that corresponds to coding an unused data symbol, but selected to minimize overhead. Our experimental results demonstrate that the loss in compression can be made very small using regular precision for arithmetic operations.
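To make the worst-case concern concrete, the following minimal sketch shows the conventional buffer-based handling of a carry in a byte-oriented encoder: the carry ripples backward through every trailing 0xFF byte that has already been written. This illustrates the problem, not the proposed technique; the function name `propagate_carry` and the use of a `bytearray` as the output buffer are assumptions made for the example.

```python
def propagate_carry(out: bytearray) -> int:
    """Add a carry to the last emitted byte; return the number of bytes changed.

    Every trailing 0xFF byte wraps to 0x00 and passes the carry on, so the
    cost is one plus the length of the 0xFF run at the end of the buffer.
    """
    i = len(out) - 1
    touched = 0
    while i >= 0 and out[i] == 0xFF:
        out[i] = 0x00  # 0xFF + 1 wraps to 0x00, carry continues
        i -= 1
        touched += 1
    if i < 0:
        # Assumed not to occur in a correct arithmetic encoder, since the
        # code value stays below 1; flagged here rather than ignored.
        raise OverflowError("carry propagated past the start of the buffer")
    out[i] += 1  # this byte absorbs the carry
    return touched + 1
```

For example, calling `propagate_carry` on a buffer holding `12 FF FF FF` (hex) rewrites all four bytes to `13 00 00 00`. Because the length of the 0xFF run is not bounded in advance, neither is the number of bytes that must remain modifiable, which is the difficulty for hardware implementations.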