With the proliferation of camera-equipped cell phones and the deployment of higher-data-rate 2.5G and 3G infrastructure, providing consumers with video-enabled cellular communication is highly desirable and can drive the development of a large number of valuable applications. However, on an uplink wireless channel, both the bandwidth and the battery energy available to a mobile phone for video communication are limited. In this paper, we pursue an energy-efficient video communication solution through joint video summarization and transmission adaptation over a slow-fading wireless channel. Coding and modulation schemes and the packet transmission strategy are optimized and adapted to the unique packet-arrival and delay characteristics of the video summaries. In addition to the optimal solution, we also propose a heuristic that is greedy but achieves close-to-optimal performance. The operational energy efficiency versus summary distortion performance is characterized under an optimal summarization setting. Simulation results show the advantage of the proposed scheme with respect to energy efficiency and video transmission quality.
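The greedy heuristic mentioned above can be illustrated with a minimal sketch. This is not the paper's actual algorithm: the frame list, the assumption that each candidate frame has a known distortion gain and transmission energy cost, and the ratio-based selection rule are all illustrative assumptions.

```python
# Illustrative sketch (not the paper's exact method): greedily build a
# video summary by picking frames with the best distortion-reduction
# per unit of transmission energy, until the energy budget is spent.

def greedy_summary(frames, energy_budget):
    """frames: list of (frame_id, distortion_gain, energy_cost) tuples,
    where distortion_gain and energy_cost are assumed known per frame."""
    chosen = []
    spent = 0.0
    # Rank frames by distortion reduction per unit of energy (greedy).
    for fid, gain, cost in sorted(frames, key=lambda f: f[1] / f[2],
                                  reverse=True):
        if spent + cost <= energy_budget:
            chosen.append(fid)
            spent += cost
    return sorted(chosen), spent

# Hypothetical per-frame (gain, cost) values for a 4-frame clip.
frames = [(0, 9.0, 3.0), (1, 4.0, 1.0), (2, 6.0, 2.0), (3, 2.0, 2.0)]
print(greedy_summary(frames, energy_budget=5.0))  # → ([0, 1], 4.0)
```

The optimal solution in the paper would instead search jointly over summaries and transmission schedules; the greedy pass above trades that search for a single ranking sweep.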
The goal of video summarization is to select key frames from a video sequence in order to generate an optimal summary that accommodates constraints on viewing time, storage, or bandwidth. While video summary generation without transmission considerations has been studied extensively, the problem of rate-distortion optimized summary generation and transmission over a packet-lossy network has received little attention. We consider the transmission of summarized video over a packet-lossy network such as the Internet. We depart from traditional rate control methods by not sacrificing the image quality of each transmitted frame, focusing instead on which frames can be dropped without seriously affecting the quality of the video sequence. We take the packet loss probability into account and use the end-to-end distortion to optimize the video quality given constraints on the temporal rate of the summary. Different network scenarios are considered: when a feedback channel is not available, and when a feedback channel is available with the possibility of retransmission. In each case, we assume a strict end-to-end delay constraint so that the summarized video can be viewed in real time. We show simulation results for each case, and also discuss the case when the feedback delay may not be negligible.
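The expected end-to-end distortion used above can be sketched under simplifying assumptions: independent packet losses with probability p, one frame per packet, and a copy-previous-frame concealment whose distortion is known per frame. The function name and the additive per-frame model are illustrative, not the paper's exact formulation.

```python
# Hedged sketch: expected end-to-end distortion of a transmitted
# summary over a packet-lossy channel. Assumes independent losses with
# probability p and a known concealment distortion per lost frame.

def expected_distortion(d_recv, d_conceal, p):
    """d_recv[i]: distortion if summary frame i arrives intact;
    d_conceal[i]: distortion if frame i is lost and concealed;
    p: packet loss probability (assumed i.i.d. across frames)."""
    n = len(d_recv)
    # Each frame contributes its received or concealed distortion,
    # weighted by the probability of each outcome.
    return sum((1 - p) * dr + p * dc
               for dr, dc in zip(d_recv, d_conceal)) / n

# Example: two summary frames, mild loss probability.
print(expected_distortion([1.0, 2.0], [5.0, 9.0], p=0.1))  # → 2.05
```

A summary optimizer would evaluate this cost for each candidate set of frames and pick the set meeting the temporal-rate constraint with the lowest expected distortion.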
A critical component of any video transmission system is an objective metric for evaluating the quality of the video signal as seen by the end-user. In packet-based communication systems, such as a wireless channel or the Internet, the quality of the received signal is affected by both signal compression and packet losses. Due to the probabilistic nature of the channel, the distortion in the reconstructed signal is a random variable. In addition, the quality of the reconstructed signal depends on the error concealment strategy. A common approach is to use the expected mean squared error of the end-to-end distortion as the performance metric, but this approach can lead to unpredictable perceptual artifacts. A better approach is to account for both the mean and the variance of the end-to-end distortion. We explore the perceptual benefits of this approach. By accounting for the variance of the distortion, the variability of the difference between the transmitted and the reconstructed signal can be decreased without a significant increase in the expected value of the distortion. Our experimental results indicate that for low to moderate probabilities of loss, the proposed approach offers significant advantages over strictly minimizing the expected distortion. We demonstrate that controlling the variance of the distortion limits perceptually annoying artifacts such as persistent errors.
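The mean-plus-variance idea can be sketched as a risk-weighted cost that penalizes outcome spread. The weight `lam` and the two-outcome distortion distributions below are illustrative assumptions, not the paper's parameters.

```python
# Illustrative sketch of a variance-aware distortion criterion:
# rank transmission policies by E[D] + lam * Var[D] instead of E[D]
# alone. `lam` trades expected quality against predictability.

def risk_cost(outcomes, lam):
    """outcomes: list of (probability, distortion) pairs summing to 1;
    lam: weight on the distortion variance (an assumed parameter)."""
    mean = sum(p * d for p, d in outcomes)
    var = sum(p * (d - mean) ** 2 for p, d in outcomes)
    return mean + lam * var

# Policy A: lower expected distortion but an occasional severe error.
# Policy B: slightly higher expected distortion, far more predictable.
a = risk_cost([(0.9, 1.0), (0.1, 20.0)], lam=0.1)
b = risk_cost([(0.9, 3.0), (0.1, 5.0)], lam=0.1)
print(a > b)  # B is preferred once variance is penalized → True
```

With `lam = 0` the criterion reduces to expected distortion and policy A wins; penalizing variance flips the preference, which mirrors the abstract's point that rare but severe errors are perceptually costly.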
A motion-compensated wavelet video coder is presented that uses adaptive mode selection (AMS) for each macroblock (MB). Block-based motion estimation is performed in the spatial domain, and an embedded zerotree wavelet (EZW) coder is employed to encode the residue frame. In contrast to other motion-compensated wavelet video coders, where all MBs are forced into INTER mode, we construct the residue frame by combining the prediction residual of the INTER MBs with the coding residual of the INTRA and INTER_ENCODE MBs. Unlike the INTER MBs, whose prediction residuals are left to the wavelet coder, the INTRA and INTER_ENCODE MBs are encoded separately by a DCT coder. By adaptively selecting the quantizers of the INTRA and INTER_ENCODE coded MBs, our goal is to equalize the characteristics of the residue frame and thereby improve the overall coding efficiency of the wavelet coder. The mode selection is based on the variance of the MB, the variance of the prediction error, and the variance of the neighboring MBs' residuals. Simulations show that the proposed motion-compensated wavelet video coder achieves a gain of around 0.7-0.8 dB in PSNR over MPEG-2 TM5 and a PSNR comparable to other 2D motion-compensated wavelet-based video codecs, while also offering potential visual quality improvements.
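A variance-driven mode decision of the kind described above can be sketched as follows. The thresholds `t_intra` and `t_encode` and the decision order are assumptions for illustration; the paper's rule uses the same three variances but its exact criterion is not reproduced here.

```python
# Hedged sketch of variance-based macroblock mode selection.
# Inputs are the three statistics the abstract names: the MB variance,
# the prediction-error variance, and the neighboring MBs' residual
# variance. Thresholds are illustrative, not the paper's values.

def select_mode(var_mb, var_pred_err, var_neighbor_res,
                t_intra=2.0, t_encode=1.5):
    # Motion compensation failed badly relative to the block itself:
    # code the MB independently.
    if var_pred_err > t_intra * var_mb:
        return "INTRA"
    # Residual much stronger than its neighbors': DCT-code it separately
    # so the wavelet residue frame stays statistically homogeneous.
    if var_pred_err > t_encode * var_neighbor_res:
        return "INTER_ENCODE"
    # Otherwise leave the prediction residual to the wavelet coder.
    return "INTER"

print(select_mode(1.0, 3.0, 1.0))  # → INTRA
print(select_mode(1.0, 1.8, 1.0))  # → INTER_ENCODE
print(select_mode(1.0, 0.5, 1.0))  # → INTER
```

The design intent matches the abstract: blocks whose residuals would disturb the wavelet coder's statistics are pulled out and DCT-coded, equalizing the residue frame.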