As a video encoding standard, High Efficiency Video Coding (HEVC) achieves excellent compression performance at the cost of a dramatic increase in coding complexity. In particular, the coding tree unit (CTU) depth decision is the most computationally demanding stage of the entire HEVC intra coding process. Therefore, this study applies a deep learning-based method to directly predict the CTU depth level for each frame. In addition, a large-scale dataset containing coding unit image files and their corresponding depths was generated with HM16.15 to train and test the deep learning model. A convolutional neural network, LeNet, is fine-tuned by modifying its original architecture, and a model with a more complex structure is then evaluated and compared on the acquired dataset. The experiments show that the fine-tuned deep learning model accurately identifies the CTU depth level, with recognition accuracy reaching over 98.6%.
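As background for the depth labels the model above predicts, here is a minimal sketch (not from the paper) of how HEVC quadtree depth relates to coding unit size in a 64x64 CTU:

```python
# In HEVC intra coding, a 64x64 CTU split to quadtree depth d yields
# CUs of size (64 >> d), so depths 0..3 correspond to CU sizes
# 64, 32, 16 and 8, and a CTU fully split to depth d holds 4**d CUs.
def cu_size(depth: int) -> int:
    if not 0 <= depth <= 3:
        raise ValueError("HEVC CTU depth is in the range 0..3")
    return 64 >> depth

def cus_per_ctu(depth: int) -> int:
    # Number of CUs in one CTU when every block is split to this depth.
    return 4 ** depth
```

A classifier that predicts this depth per block can skip the exhaustive rate-distortion search over all split candidates, which is the source of the complexity the abstract describes.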
In the High Efficiency Video Coding (HEVC) standard, the best intra prediction mode is decided by choosing the smallest rate-distortion cost of actual encoding among the 35 candidate modes, with the Most Probable Mode (MPM) scheme compressing the mode signaling by reference to the adjacent reference blocks of the current prediction unit. This causes heavy computational complexity. In this paper, a deep neural network is designed and evaluated as a candidate module for the intra prediction mode decision process within the HEVC encoding scheme. The neural network is trained and tested on a ground-truth dataset constructed from actual HEVC intra encoding of original images. Performance is measured as accuracy: the percentage of modes output by the designed neural network that match the ground-truth mode. The experimental results show that the neural network does not achieve high accuracy on the exact mode; however, accuracy increases when a similar angular mode is also counted as correct. The special DC and Planar modes for MPM are also analyzed in this paper.
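The relaxed accuracy described above, where a nearby angular mode is also counted as correct, can be sketched as follows; the tolerance parameter `angle_tol` is an assumption for illustration, not the paper's exact criterion:

```python
def relaxed_accuracy(predicted, truth, angle_tol=1):
    # HEVC intra modes: 0 = Planar, 1 = DC, 2..34 = angular.
    # A prediction counts as correct if it matches exactly, or if both
    # modes are angular and their indices differ by at most angle_tol
    # (i.e. the predicted direction is visually similar).
    correct = 0
    for p, t in zip(predicted, truth):
        if p == t or (p >= 2 and t >= 2 and abs(p - t) <= angle_tol):
            correct += 1
    return correct / len(predicted)
```

Note that Planar (0) and DC (1) have no angular neighbors under this rule, which is consistent with treating them as the special cases the abstract analyzes separately.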
This paper presents a correlation model based error concealment method for polyphase downsampling (PD)
applied to multiple description coding (MDC) using H.264. With MDC, when one description is lost due to errors,
the lost description can be reconstructed from the other descriptions that were correctly received. PD-based MDC
produces a coded stream in which pixels of the lost description can be reconstructed from neighboring pixels coded in
the other descriptions. The proposed error concealment method exploits the correlation between each pixel to be concealed
and its neighboring pixels, then applies a correlation model based concealment method to reconstruct the lost
pixels. Simulation results show that the proposed method improves the error resiliency of the coded stream,
yielding an average PSNR gain of 0.62 dB over an existing PD-based MDC method.
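To make the PD-based MDC idea concrete, here is a minimal sketch (assumed for illustration, not the paper's method): a 2x2 polyphase split produces four descriptions, and a lost description can be concealed from the co-located pixels of the surviving ones. The paper uses a correlation model; the sketch below substitutes a plain neighbor average.

```python
import numpy as np

def polyphase_split(img):
    # 2x2 polyphase downsampling: four descriptions, each taking one
    # pixel out of every 2x2 block (phases (0,0), (0,1), (1,0), (1,1)).
    return [img[i::2, j::2] for i in (0, 1) for j in (0, 1)]

def conceal_phase00(d01, d10):
    # Reconstruct the lost (0,0) phase from its horizontal neighbor
    # (phase (0,1)) and vertical neighbor (phase (1,0)).
    # Simple averaging stands in for the paper's correlation model.
    return (d01.astype(float) + d10.astype(float)) / 2.0
```

Each description is a quarter-resolution copy of the frame, so every lost pixel has immediate spatial neighbors in the other descriptions, which is exactly the property the concealment method relies on.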
KEYWORDS: Visualization, Cameras, Mobile devices, 3D displays, 3D visualizations, 3D modeling, 3D image processing, Image compression, 3D applications, Java
In this paper, we introduce a graphics-to-Scalable Vector Graphics (SVG) adaptation framework with a mechanism for
vector graphics transmission, to overcome the limited real-time representation and interaction experience of 3D
graphics applications running on mobile devices. Based on the proposed framework, we develop an interactive 3D
visualization system that rapidly represents a 3D scene on mobile devices without having to download it from the
server. The system consists of a client viewer and a graphics-to-SVG adaptation server. The client viewer
lets the user access the same 3D content from different devices according to consumer interactions.
KEYWORDS: Visualization, Video, 3D video streaming, Video coding, 3D video compression, Mobile devices, Cameras, Video compression, Computer programming, Navigation systems
In this paper, we propose a 3D graphics-to-video encoding and streaming method embedded in a remote
interactive 3D visualization system, which rapidly represents a 3D scene on mobile devices without having to download it
from the server. In particular, a 3D graphics-to-video framework is presented that increases the visual quality of regions
of interest (ROI) in the video by allocating more bits to the ROI during H.264 video encoding. The ROI are
identified by projecting 3D objects onto the 2D plane during rasterization. The system lets users navigate the 3D scene
and interact with objects of interest to query their descriptions. We developed an adaptive media streaming server
that provides an adaptive video stream with object-based quality to the client, according to the user's preferences
and the variation of network bandwidth. Results show that with ROI mode selection, the PSNR of the test samples changes
only slightly while the visual quality of the objects increases noticeably.
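A common way to allocate more bits to an ROI during H.264 encoding is to lower the quantization parameter (QP) for macroblocks inside the ROI. The sketch below assumes this QP-offset approach; the paper does not specify its exact bit allocation mechanism, and the offset value here is hypothetical.

```python
def macroblock_qp(base_qp, in_roi, roi_qp_offset=-4):
    # Lower QP means finer quantization and more bits. Macroblocks in
    # the ROI get an assumed negative offset; the result is clamped to
    # H.264's valid QP range of 0..51.
    qp = base_qp + (roi_qp_offset if in_roi else 0)
    return max(0, min(51, qp))
```

Because only ROI macroblocks get the finer quantization, overall frame PSNR stays roughly constant while the objects of interest are rendered with visibly higher fidelity, matching the trade-off reported in the results.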