As images are commonly transmitted or stored in compressed form such as JPEG, we extend the applicability of our previous work by proposing a new scheme for embedding a watermark in the compressed domain without resorting to cryptography. In this work, a target image is first DCT-transformed and quantised. Then, all the coefficients are implicitly watermarked in order to minimize the risk of attacks on unwatermarked coefficients. The watermarking is done by registering/blending the zero-valued coefficients with a binary sequence to create the watermark, and by involving the unembedded coefficients in the process of embedding the selected coefficients. The second-order neighbors and the block itself are considered during watermark embedding in order to thwart attacks such as cover-up, vector quantisation, and transplantation. The experiments demonstrate that the proposed scheme can thwart local tampering, geometric transformations such as cropping, and common signal operations such as lowpass filtering.
Selective encryption exploits the relationship between encryption and compression to reduce encryption requirements, saving in complexity and facilitating new system functionality. Selective encryption of MPEG video streams has been proposed in a number of variations, yet has seen little application to date. Here we focus on high encryption savings, targeting 10% of the bit stream or less encrypted, moderate security in the sense that the content is degraded to the point that purchase would be preferred over consuming free content, no impact on compression efficiency, and a cryptanalytic approach to validating security. We find that adequate security is plausible if the compressor is cooperative or at least neutral with respect to the selective encryption system, but implausible if the compressor is operated antagonistically. The unusually low encryption targeted makes application of this solution appealing.
In this work, we investigate the congestion control problem for layered video multicast in IP networks with active queue management (AQM), using a simple random early detection (RED) queue model. AQM support from networks improves the visual quality of video streaming but makes network adaptation more difficult for existing layered video multicast protocols that use the event-driven timer-based approach. We perform a simplified analysis of the response of the RED algorithm to burst traffic. The analysis shows that the primary problem lies in the weak correlation between the network feedback and the actual network congestion status when the RED queue is driven by burst traffic. Finally, a design guideline for layered multicast protocols is proposed to overcome this problem.
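To illustrate why RED feedback can lag behind a burst, the following minimal sketch uses the textbook RED rules: an exponentially weighted moving average of the queue length and a linear drop probability between two thresholds. It is not the analysis from the paper. The parameter values (`weight`, `min_th`, `max_th`, `max_p`) and the queue trace are assumptions chosen for illustration, and the count-based correction of the drop probability used in full RED is omitted.

```python
def red_drop_probability(avg, min_th, max_th, max_p):
    """Textbook RED drop probability as a function of the average queue length."""
    if avg < min_th:
        return 0.0
    if avg >= max_th:
        return 1.0
    return max_p * (avg - min_th) / (max_th - min_th)


def update_average(avg, queue_len, weight):
    """Exponentially weighted moving average of the instantaneous queue length."""
    return (1.0 - weight) * avg + weight * queue_len


def simulate(queue_samples, weight=0.002, min_th=5, max_th=15, max_p=0.1):
    """Return (instantaneous queue, average queue, drop probability) per arrival."""
    avg = 0.0
    trace = []
    for q in queue_samples:
        avg = update_average(avg, q, weight)
        trace.append((q, avg, red_drop_probability(avg, min_th, max_th, max_p)))
    return trace


# Hypothetical burst: the queue jumps to 40 packets for 50 arrivals, then drains.
burst_trace = simulate([0] * 10 + [40] * 50 + [0] * 10)
peak_queue = max(q for q, _, _ in burst_trace)
peak_average = max(avg for _, avg, _ in burst_trace)
peak_drop_probability = max(p for _, _, p in burst_trace)
```

With these assumed values the instantaneous queue far exceeds `max_th` during the burst, yet the averaged queue stays below `min_th`, so the sketch signals no congestion at all.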
Media server scheduling in video-on-demand (VOD) systems includes video content allocation and request migration among servers. In this paper, we present a greedy algorithm to allocate video copies to media servers. It uses a graph model and minimizes the average shortest distance among media servers at each step. To facilitate the performance analysis of the random early migration (REM) algorithm proposed in our previous work, we introduce a formal description of the media service. Based on this system formalization, we develop a state transition method to study the parameter effect on the REM performance and compare the real time performance between REM and traditional migration with early start (TMES). The analytical result shows that REM introduces smoother migrations between media servers and thus leads to less real time system load than TMES.
To ease multimedia streaming over the QoS-deficient Internet, network-adaptive streaming has recently been introduced. For a unicast media streaming environment, TCP-friendly end-to-end congestion control is widely suggested to handle network congestion. However, congestion control usually causes abrupt changes in the available bandwidth between streaming systems, and the performance of media streaming can be degraded. To alleviate this, the playback speed of audio and video can be adaptively controlled by adopting the time-scale modification technique. This can mitigate the effect of network variations in delay and loss, especially in low-latency video streaming. In this paper, we attempt to improve the streaming quality when congestion control is applied by taking advantage of the adaptive playout mechanism. With the adaptive playout capability, the system can proactively prepare for imminent changes by estimating the expected buffer level, adjusting the change in transmission rate, and controlling the playback rate.
Recently we have proposed a multicast media streaming framework that enhances the media synchronization by employing a server-client coordinated adaptive playout and error control. Focusing on the adaptive playout with time-scale modification, we adopt the adaptive playout to help both intra-client and inter-client synchronization in a multicast streaming environment. However, only the usability of adaptive playout in assisting the synchronized multicast streaming is demonstrated while leaving several detailed issues untouched. As an effort to enhance the proposed framework, in this paper, we concentrate on refining the adaptive playout between two (i.e., intra/inter) synchronization modes. We attempt to enhance each synchronization mode and enable appropriate switching between them. More specifically, we zoom into the tradeoff of adaptive playout for buffer control and playout time control by adopting a dynamic playback factor adjustment. To evaluate the proposed scheme, network simulation and analysis (performed in NS2 simulator over the PIM-SSM multicast network) are conducted. Results show that the proposed scheme can enhance the performance of intra-client as well as inter-client synchronization.
Multi-hypothesis motion compensated prediction (MHMCP) predicts a block from a weighted sum of multiple reference blocks in the frame buffer. By efficiently combining these reference blocks, MHMCP can reduce the prediction error and hence the coding bit rate. Although MHMCP was originally proposed to achieve high coding efficiency, it has been observed recently that MHMCP can also enhance the error resilience of compressed video. In this work, we investigate the error propagation effect in the MHMCP coder. More specifically, we study how the number of hypotheses as well as the hypothesis coefficients influence the strength of propagating errors. Simulation results are given to confirm our analysis. Finally, several design principles for the MHMCP coder are derived based on our analysis and simulation results.
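The weighted-sum prediction can be sketched as follows. This is a minimal illustration, not the coder studied in the paper: blocks are flattened to short lists, and the sample values, the two equal coefficients, and the constant error of 16 are all assumptions. It shows only the basic linear effect that an error in one hypothesis enters the prediction scaled by that hypothesis's coefficient.

```python
def mhmcp_predict(reference_blocks, coefficients):
    """Predict a block as the weighted sum of several reference blocks."""
    if len(reference_blocks) != len(coefficients):
        raise ValueError("one coefficient per hypothesis is required")
    size = len(reference_blocks[0])
    return [
        sum(h * block[i] for h, block in zip(coefficients, reference_blocks))
        for i in range(size)
    ]


# Two hypothetical reference blocks (flattened) with equal weights.
clean_refs = [[100.0, 102.0, 98.0, 101.0], [104.0, 100.0, 96.0, 99.0]]
coeffs = [0.5, 0.5]

# Corrupt only the first hypothesis with a constant error of 16.
corrupt_refs = [[v + 16.0 for v in clean_refs[0]], clean_refs[1]]

clean_prediction = mhmcp_predict(clean_refs, coeffs)
corrupt_prediction = mhmcp_predict(corrupt_refs, coeffs)
propagated_error = [c - p for c, p in zip(corrupt_prediction, clean_prediction)]
```

In this example the error that reaches the predicted block is half of the error in the corrupted reference, because that reference carries a coefficient of 0.5.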
The emerging H.264 video coding standard can achieve a substantial coding gain as compared with existing coding standards. One major contribution to this gain comes from a very rich syntax for motion compensated prediction, at the expense of a higher computational complexity. To be more specific, seven modes of different block sizes and shapes (i.e., 16x16, 16x8, 8x16, 8x8, 8x4, 4x8 and 4x4) are supported. A full search over all modes requires an extremely large amount of computation. We propose a fast search algorithm for variable block size motion estimation. The proposed algorithm includes three stages. First, an initial estimate of the motion vector is obtained by a multi-resolution motion search. Then, based on the estimated motion vector and its corresponding distortion measure, a rate-distortion model is used to select the initial mode for motion search. Finally, some early-termination rules are adopted to reject impossible block sizes and motion locations sooner. By avoiding a search through all block sizes, the amount of computation involved in the motion search can be substantially reduced. The proposed algorithm can achieve a speed-up factor of up to 120 times when compared to the fastest full-search algorithm.
In this paper, we present a low-complexity RVLC decoding scheme for MPEG-4 video (including the effect of DC/AC prediction) that recovers more blocks, and sometimes more MBs, from the error propagation region of corrupted video packets than the MPEG-4 scheme does. The remaining blocks and MBs are concealed using a maximally smooth error concealment scheme. It is shown that the proposed scheme achieves better data recovery, both in terms of PSNR and perceptual quality. In addition, we present more conditions for better error detection than those suggested in MPEG-4, and also discuss properties of error propagation in corrupted video packets. Since the schemes are purely decoder-based, compliance with the standard is fully maintained.
In this work, we investigate the encoding speed improvement for H.264 with a special focus on fast intra-prediction mode selection. It is possible to adopt the rate-distortion (RD) optimized mode in H.264 to maximize the coding gain at the cost of a very high computational complexity. To reduce the complexity associated with the intra-prediction mode selection, we propose a two-step fast algorithm. In the first step, we make a coarse-level decision to split all possible candidate modes into two groups: the group to be examined further and the group to be ignored. The sizes of these two groups are adaptively determined based on the block activities. Then, in the second step, we focus on the group of interest and consider an RD model for final decision-making. Experimental results demonstrate that the proposed scheme performs 5 to 7 times faster than the current H.264 encoder (JM5.0c) with little degradation in the coding gain.
MPML is an XML-based MPEG video coding method that was proposed in our previous work. A comprehensive study on the tradeoff between the coding bit rates and the PSNR performance with MPML under the noisy channel environment is conducted in this research. The original MPML compression algorithm has been modified for partial protection to reduce the XML overhead. A scheme to realize partial MPML protection is proposed. Simulation results demonstrate that the proposed MPML-based error resilient technique can achieve good performance for wireless video transmission.
A detailed study of the impact of memory bank conflict on the performance of EMAs is presented. Based on the study, novel schemes utilizing SIMD and array padding are described to solve the memory bank conflict problem. Since the parameter in array padding has a great impact on the overall behavior of the memory system, how to achieve optimal padding is an important research topic. Here, we analyze the padding effect and develop a probabilistic model to determine the optimal padding distance. Preliminary experimental results are given to verify the correctness of this model.
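The effect of array padding on bank conflicts can be illustrated with a small sketch. It assumes a simple word-interleaved memory in which the bank index is the word address modulo the number of banks, a row-major array with one word per element, and hypothetical sizes (16-word rows, 8 banks). It does not reproduce the probabilistic model of the paper.

```python
def banks_touched(row_width, pad, num_rows, num_banks, column=0):
    """Return the bank hit by each access when one column of a row-major array is read."""
    stride = row_width + pad  # words between vertically adjacent elements
    return [(row * stride + column) % num_banks for row in range(num_rows)]


def max_conflicts(banks):
    """Largest number of accesses that land in the same bank."""
    return max(banks.count(b) for b in set(banks))


# Assumed configuration: 16-word rows, 8 banks, 8 consecutive rows of one column.
unpadded = banks_touched(row_width=16, pad=0, num_rows=8, num_banks=8)
padded = banks_touched(row_width=16, pad=1, num_rows=8, num_banks=8)
```

Under these assumptions every unpadded column access falls into the same bank, while a padding distance of one word spreads the eight accesses over eight different banks.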
We present a necessary and sufficient condition that a given n-dimensional generalized interpolation approximation minimizes various worst-case measures of error of approximation at the same time among all the approximations, including nonlinear approximation, using the same set of sample values. As a typical example of the optimum approximation satisfying the above necessary and sufficient condition, we present an n-dimensional generalized interpolation approximation using a finite number of sample values. Then, we consider an n-dimensional generalized discrete interpolation approximation based on n-dimensional FIR filter banks that uses a finite number of sample values in the approximation of each pixel of an image but scans the image over all pixels. For this scanning-type discrete approximation, we prove that discrete interpolation functions exist that simultaneously minimize various measures of error of approximation defined at discrete sample points xp=p, where p are the n-dimensional integer vectors. The presented discrete interpolation functions vanish outside the prescribed domain in the integer-vector space. Hence, these interpolation functions are realized by n-dimensional FIR filters. In this discussion, we prove that there exist continuous interpolation functions with extended bandwidth that interpolate the above discrete interpolation functions and satisfy the condition called discrete orthogonality. This condition is one of the two conditions that constitute the necessary and sufficient condition presented in this paper. Several discrete approximations are presented that satisfy both of the conditions constituting the necessary and sufficient condition. The above discrete interpolation functions have much flexibility in their frequency characteristics if appropriate analysis filters are selected.
While the price of mobile devices is dropping quickly, the set of features and capabilities of these devices is advancing dramatically. As a result, and thanks to the availability of high-speed mobile networks such as UMTS and Wireless LAN, new mobile multimedia applications are conceivable. However, creating such applications is still difficult due to the huge diversity of features and capabilities of mobile devices. Software developers also have to take into account the severe limitations on processing capabilities and display possibilities, and the limited battery life of these devices. On top of that, the availability of the device resources fluctuates strongly during execution of an application, directly influencing the user experience, whereas equivalent fluctuations on traditional desktop PCs are far less prominent. Using new technologies such as MPEG-4, -7 and -21 can help application developers to overcome these problems. We have created an MPEG-21-based Video-on-Demand application optimized for mobile devices that is aware of the usage environment (i.e., user preference, device capabilities, device conditions, network status, etc.) of the client and adapts the MPEG-4 videos to it. The application is compliant with the Universal Multimedia Access framework, supports Time-Dependent Metadata, and relies on both MPEG-4 and MPEG-21 technology.
This paper presents the development of an ATSC terrestrial data receiver to be used in multimedia datacasting for remote and rural communities.
In the context of a project put forward at the Communications Research Centre Canada to deliver interactive multimedia to remote and rural areas in Canada, a receiver capable of extracting Internet Packets (IP) as well as video and audio transport streams (TS) based on the Advanced Television Systems Committee (ATSC) terrestrial digital television system is developed. This paper describes the design, implementation, functionality, and several applications of this receiver, which plays an important role in the interactive multimedia datacasting system. The ATSC terrestrial transmission chain from the server at the transmitter site to the receiver (client) at the end-user's premises is presented. A number of possible implementations for a return channel are proposed. Preliminary results of laboratory trials as well as plans for field trials are presented.
Due to the rising complexity of modern embedded media applications (EMAs), instruction level parallelism (ILP) is no longer sufficient to meet their needs. Compilers must be able to exploit superword level parallelism (SLP), which can expose more of the concurrency in applications, minimize the latency created by memory access, and hence produce more efficient code. The loop is a good candidate for SLP extraction because of the parallel structure between its iterations. This work analyzes the memory access patterns found in EMAs and presents our method of loop unrolling to fully utilize these patterns to generate efficient Single Instruction Multiple Data (SIMD) instructions. Experimental results obtained on the TriMedia TM-1300 processor for the H.264 encoder show performance improvement by a factor ranging from 3 to 30 times, with an average of 12 times.
In this paper a novel platform, HOUCOMTM, for the development of team-based distributed collaborative applications is presented. Its implementation within a scenario for distributed cooperative virtual product development, ProViT, is shown, describing the features and advantages of the presented platform. The specified platform consists of a decentrally organized, dynamic and user-configurable architecture. The main entities within the given platform are Conferences (working groups), Sessions (sub-groups), Users, Components, and Shared Resources. The system provides support for hierarchical Session Management, allowing for arbitrarily nested groups and multi-conferencing. Within the given platform, Users can be individuals as well as technical devices, e.g. a streaming framework. The ProViT scenario builds a collaborative environment for interactive distributed VR design reviews for the mechanical engineering industry. Here several distributed clusters form a working group, allowing individual partners to immersively collaborate on 3D models and supplementary documents and communicate via A/V streaming. This paper is divided into three chapters, the first describing the ProViT scenario and deriving its requirements. The subsequent chapter examines the novel concept in general and the features that helped meet the given project requirements in particular. In the conclusion the authors give an outlook on future extensions and applications of the developed platform.
The distributed Multiplayer Online Game (MOG) system is complex since it involves technologies in computer graphics, multimedia, artificial intelligence, computer networking, embedded systems, etc. Due to the large scope of this problem, the design of MOG systems has not yet been widely addressed in the literature. In this paper, we review, analyze, and evaluate the current MOG system architecture. Furthermore, we propose a clustered-server architecture, together with a region-oriented allocation strategy, to provide a scalable solution. Two key issues, i.e., interest management and synchronization, are discussed in depth. Some preliminary ideas to deal with the identified problems are described.
In this paper we introduce steganalysis and present three detection types. By comparing several watermark benchmark tools, namely Stirmark, Checkmark and Optimark, we discuss the validity of watermarking systems and propose a basic condition and an outline for evaluating the validity of an image watermarking system.
We propose an advanced time synchronization scheme with an additional pre-processor for OFDM systems. The proposed pre-processor makes symbol synchronization and guard-interval length detection possible even when no guard-interval length information is available. This pre-processor is particularly useful in systems such as DVB-T, which support several guard-interval lengths. Simulation results show that the proposed scheme gives satisfactory performance in guard-interval length detection and OFDM symbol synchronization over AWGN.
The flexible and semi-structured eXtensible Markup Language (XML) is used in various application domains as well as the database field. The proposed XML harmonization system attempts to re-utilize existing markup languages by extracting and integrating them. This idea is analogous to the join operation in the database domain. When new structures are created, new document types are defined. Structures of XML instances can be viewed from various viewpoints, e.g. the relational database or the object-oriented database. In this work, we propose a new way to achieve harmonization, i.e., by defining axioms on atomic elements of the selected data structure. The advantage of using axioms is that they can be extended to other data structures easily. Measurements of harmonization are discussed, and harmonization examples are given to illustrate the axiom-based design principle.
In this paper, a real-time stereo image watermarking scheme using the discrete cosine transform (DCT) and a disparity map is proposed. A watermark image is embedded into the right image of a stereo image pair in the frequency domain through the conventional DCT operation, and disparity information between the watermarked right image and the left image is extracted. The disparity data and the left image are then simultaneously transmitted to the recipient through the communication channel. At the receiver, the watermarked right image is reconstructed from the received left image and disparity data through the adaptive matching algorithm, and the watermark image is finally extracted from this reconstructed right image through the decoding algorithm. In experiments using a stereo image pair captured by a CCD camera and a watermark image of the English letters 'NRL', the PSNR of the right image reconstructed through the DCT and adaptive matching algorithm improves by up to 9.3 dB compared with that of the right images reconstructed through the conventional pixel-based and block-based matching algorithms. At the same time, the PSNR of the watermark image extracted from the reconstructed right image improves by up to 7.72 dB compared with the others. These experimental results suggest that a practical implementation of a disparity map-based stereo image watermarking scheme is possible.
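For reference, the PSNR figures quoted above follow the standard definition, PSNR = 10 log10(peak^2 / MSE). The sketch below assumes 8-bit images (peak value 255) flattened to lists of pixel values. The function name and the sample data are hypothetical, and the code is not taken from the paper.

```python
import math


def psnr(reference, test, peak=255.0):
    """Peak signal-to-noise ratio in dB between two equally sized pixel lists."""
    if len(reference) != len(test):
        raise ValueError("images must have the same number of pixels")
    mse = sum((r - t) ** 2 for r, t in zip(reference, test)) / len(reference)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(peak * peak / mse)


# Hypothetical 4-pixel example: every pixel is off by the full 8-bit range.
worst_case = psnr([0, 0, 0, 0], [255, 255, 255, 255])
identical = psnr([10, 20, 30, 40], [10, 20, 30, 40])
```

A PSNR improvement between two reconstruction methods is the difference between the two values computed against the same reference image.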