From Event: SPIE Optical Engineering + Applications, 2018
Video coding is a powerful enabling technology for networked multimedia transmission and communication, and it has been improving steadily for decades. The upcoming VVC video codec, due in 2020 from the ITU|ISO/IEC standards committees, aims to achieve compression on the order of 1000:1 on high-resolution, high-dynamic-range video, a stunning landmark. But the basic structure of codecs has remained largely unchanged over time, with the gains obtained mainly through increases in complexity. Moreover, video encoders have for decades used the same measures, mean squared error or sum of absolute differences, to optimize coding decisions. At the same time, the rapid rise of deep learning (DL) techniques poses the question: can DL fundamentally reshape how video is coded? While that question is highly complex, we first see a path for DL methods to make inroads into how video quality is measured, which in turn can change how video is coded. In particular, we study a recently introduced video quality metric called VMAF and find ways to improve it further, which can lead to more powerful encoder designs that employ such measures in their coding decisions.
Pankaj Topiwala, Madhu Krishnan, and Wei Dai, "Deep learning techniques in video coding and quality analysis," Proc. SPIE 10752, Applications of Digital Image Processing XLI, 1075211 (Presented at SPIE Optical Engineering + Applications: August 21, 2018; Published: 17 September 2018); https://doi.org/10.1117/12.2322025.