Most video watermarking algorithms embed the watermark in I-frames, which are essential for the video signal,
but refrain from embedding anything in P- and B-frames that are highly compressed by motion compensation.
Furthermore, these algorithms do not take advantage of temporal masking in subjective perception of the video.
In this paper, we explore the capacity of P-frames and the temporal masking effect in the video signal. The
challenge in embedding watermark bits in P-frames is that the video bit rate can increase significantly. Thus, we
choose to embed watermark bits only in nonzero AC residuals in P-frames. Changing zero-valued coefficients to
nonzero values can significantly increase the video bit rate because H.264 (and earlier coders as well) uses
run-length codes. We show that avoiding zero-valued coefficients significantly reduces the percentage increase in the
compressed video bit rate and makes watermark embedding in P-frames practical. Since the nonzero residuals
in P-frames correspond to non-flat areas that are in motion, temporal and texture masking are exploited
simultaneously. We confirm this by showing that plots of the number of nonzero residuals per frame closely
resemble motion-intensity plots.
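The embedding rule described above can be sketched as follows. This is a minimal illustration, not the paper's actual algorithm: it assumes parity (LSB) modulation of quantized coefficients, and the function names and 4x4 block layout are hypothetical. The key constraint it demonstrates is that zero-valued coefficients are never modified, so the run-length structure of the entropy-coded block, and hence the bit rate, is preserved.

```python
import numpy as np

def embed_bits_in_nonzero_ac(residual_block, bits):
    """Embed watermark bits into the nonzero AC coefficients of a
    quantized 4x4 residual block via parity (LSB) modulation.

    Zero coefficients are left untouched so the run-length codes of
    the block are unaffected.  Returns the watermarked block and the
    number of bits actually embedded.
    """
    block = residual_block.copy()
    flat = block.ravel()
    k = 0
    for i in range(1, flat.size):      # index 0 is the DC coefficient
        if k >= len(bits):
            break
        c = flat[i]
        if c == 0:
            continue                   # never turn a zero nonzero
        if (abs(c) & 1) != bits[k]:    # flip parity to match the bit
            # step toward zero unless that would create a new zero
            c = c - np.sign(c) if abs(c) > 1 else c + np.sign(c)
        flat[i] = c
        k += 1
    return block, k

def extract_bits_from_nonzero_ac(residual_block, n_bits):
    """Recover watermark bits as the parity of nonzero AC coefficients."""
    flat = residual_block.ravel()
    bits = []
    for i in range(1, flat.size):
        if len(bits) >= n_bits:
            break
        if flat[i] != 0:
            bits.append(int(abs(flat[i])) & 1)
    return bits
```

Because the embedder only nudges coefficients that are already nonzero, the number of nonzero residuals, and therefore the run-length coding cost, is unchanged.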
As H.264 digital video becomes more prevalent, the industry needs copyright protection and authentication methods appropriate for this standard. The goal of this paper is to propose a robust watermarking algorithm for H.264. To achieve this goal, we employ a human visual model adapted to the 4x4 DCT block to obtain a larger payload and greater robustness while minimizing visual distortion. We use a key-dependent algorithm to select a subset of the coefficients with visual watermarking capacity for embedding, which provides robustness to malicious attacks. Furthermore, we spread the watermark over frequencies and within blocks to avoid error pooling. The error pooling effect, introduced by Watson, has not been considered in previous perceptual watermarking algorithms. Our simulation results show that by reducing this effect we can increase the payload and robustness without a noticeable change in perceptual quality. We embed the watermark in the residuals to avoid decompressing the video and to reduce the complexity of the watermarking algorithm. However, we extract the watermark from the decoded video sequence to make the algorithm robust to intra-prediction mode changes. Our simulation results show robustness to filtering, 50% cropping, and requantization attacks.
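The key-dependent coefficient selection could be realized along the following lines. This is a hedged sketch, not the paper's implementation: the function name, the use of SHA-256 to derive a per-block seed, and the choice of four positions per block are all illustrative assumptions. The point it shows is that without the secret key an attacker cannot tell which AC coefficients may carry watermark bits.

```python
import hashlib
import numpy as np

def select_positions(key, frame_idx, block_idx, n_select=4):
    """Key-dependent pseudorandom selection of AC coefficient positions
    (indices 1..15 within a 4x4 block) for watermark embedding.

    The secret key and the block coordinates are hashed into a PRNG
    seed, so the same (key, frame, block) triple always yields the
    same positions, while different keys yield unrelated selections.
    All names and parameters here are illustrative assumptions.
    """
    digest = hashlib.sha256(f"{key}:{frame_idx}:{block_idx}".encode()).digest()
    rng = np.random.default_rng(int.from_bytes(digest[:8], "big"))
    # draw n_select distinct AC positions (index 0 is the DC coefficient)
    return sorted(rng.choice(np.arange(1, 16), size=n_select,
                             replace=False).tolist())
```

Deriving the seed from both the key and the block coordinates spreads the selected positions across the frame, which also helps distribute the watermark over frequencies and within blocks.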