Paper
19 September 2017 Dynamic frame resizing with convolutional neural network for efficient video compression
Jaehwan Kim, Youngo Park, Kwang Pyo Choi, JongSeok Lee, Sunyoung Jeon, JeongHoon Park
Abstract
In the past, video codecs such as VC-1 and H.263 used a technique that encodes video at reduced resolution and restores the original resolution at the decoder to improve coding efficiency. In VC-1 this technique is called dynamic frame resizing, and in H.263 Annex Q it is called reduced-resolution update mode. However, these techniques have not been widely used because their performance gains are limited and appear only under specific conditions. In this paper, a machine-learning-based video frame resizing (reduce/restore) technique is proposed to improve coding efficiency. In the proposed method, a convolutional neural network (CNN) in the encoder produces low-resolution video, and a CNN in the decoder reconstructs the original resolution. The proposed method shows improved subjective performance across all the high-resolution videos, which dominate current video consumption. To assess the subjective quality of the proposed method, Video Multi-method Assessment Fusion (VMAF), which has shown high reliability among many subjective measurement tools, was used as the subjective metric. Moreover, to assess general performance, diverse bitrates were tested. Experimental results showed that the VMAF-based BD-rate improved by about 51% compared to conventional HEVC. VMAF values improved especially at low bitrates. In addition, in subjective testing, the method delivered better visual quality at similar bitrates.
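The abstract reports a BD-rate computed on VMAF but does not describe the exact procedure. The following is a minimal sketch of the commonly used Bjøntegaard formulation (cubic fit of log-bitrate against quality, averaged over the shared quality range), assuming VMAF scores are used on the quality axis in place of PSNR. The function name `bd_rate` and the sample numbers are hypothetical and are not taken from the paper.

```python
import numpy as np


def bd_rate(rate_ref, qual_ref, rate_test, qual_test):
    """Average bitrate difference (%) of a test codec vs. a reference
    at equal quality. Negative values mean the test codec needs fewer bits.

    Assumption: quality is any monotone score (here VMAF), and each
    curve has at least four rate/quality points.
    """
    log_ref = np.log(np.asarray(rate_ref, dtype=float))
    log_test = np.log(np.asarray(rate_test, dtype=float))
    qual_ref = np.asarray(qual_ref, dtype=float)
    qual_test = np.asarray(qual_test, dtype=float)

    # Fit log-bitrate as a cubic polynomial of quality for each codec.
    poly_ref = np.polyfit(qual_ref, log_ref, 3)
    poly_test = np.polyfit(qual_test, log_test, 3)

    # Compare only over the quality interval covered by both curves.
    lo = max(qual_ref.min(), qual_test.min())
    hi = min(qual_ref.max(), qual_test.max())
    if hi <= lo:
        raise ValueError("quality ranges of the two curves do not overlap")

    # Integrate each fitted curve over the shared interval.
    int_ref = np.polyint(poly_ref)
    int_test = np.polyint(poly_test)
    area_ref = np.polyval(int_ref, hi) - np.polyval(int_ref, lo)
    area_test = np.polyval(int_test, hi) - np.polyval(int_test, lo)

    # Mean log-rate gap, converted back to a percentage.
    avg_log_diff = (area_test - area_ref) / (hi - lo)
    return float((np.exp(avg_log_diff) - 1.0) * 100.0)


# Hypothetical rate (kbps) / VMAF points, for illustration only.
vmaf_points = [60.0, 75.0, 85.0, 92.0]
reference_kbps = [1000.0, 2000.0, 4000.0, 8000.0]
halved_kbps = [r / 2.0 for r in reference_kbps]

# A codec that reaches the same VMAF at half the bitrate gives about -50.
saving = bd_rate(reference_kbps, vmaf_points, halved_kbps, vmaf_points)
```

Under this sign convention a bitrate saving is a negative number, so an improvement "by about 51%" would correspond to a value near -51 if the paper follows the same convention, which the abstract does not state.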
© (2017) COPYRIGHT Society of Photo-Optical Instrumentation Engineers (SPIE). Downloading of the abstract is permitted for personal use only.
Jaehwan Kim, Youngo Park, Kwang Pyo Choi, JongSeok Lee, Sunyoung Jeon, and JeongHoon Park "Dynamic frame resizing with convolutional neural network for efficient video compression", Proc. SPIE 10396, Applications of Digital Image Processing XL, 103961R (19 September 2017); https://doi.org/10.1117/12.2270737
CITATIONS
Cited by 34 patents.
KEYWORDS
Video, Video compression, Video coding, Machine learning, Multimedia, Neural networks, Reliability