In content fingerprinting, the same media covertext - image, video, audio, or text - is distributed to many users.
A fingerprint, a mark unique to each user, is embedded into each copy of the distributed covertext. In a collusion
attack, two or more users may combine their copies in an attempt to "remove" their fingerprints and forge a
pirated copy. To trace the forgery back to members of the coalition, we need fingerprinting codes that can reliably
identify the fingerprints of those members. Research to date has focused on designing or testing fingerprints
for Gaussian host signals under the mean-square-error (MSE) distortion measure and particular classes of
collusion attacks, with the detector's error probability in identifying colluders as the performance criterion.
For example, under the assumptions
of Gaussian fingerprints and Gaussian attacks (the fingerprinted signals are averaged and then the result is
passed through a Gaussian test channel), Moulin and Briassouli [1] derived optimal strategies in a game-theoretic
framework that uses the detector's error probability as the performance measure for a binary decision problem
(whether a user participates in the collusion attack or not); Stone [2] and Zhao et al. [3] studied average and other
non-linear collusion attacks for Gaussian-like fingerprints; Wang et al. [4] stated that the average collusion attack
is the most efficient one for orthogonal fingerprints; Kiyavash and Moulin [5] derived a mathematical proof of the
optimality of the average collusion attack under some assumptions.
In this paper, we also consider Gaussian cover signals, the MSE distortion, and memoryless collusion attacks.
We do not make any assumption about the fingerprinting codes used other than an embedding distortion
constraint. Also, our only assumptions about the attack channel are an expected distortion constraint, a memoryless
constraint, and a fairness constraint. That is, the colluders are allowed to use any arbitrary nonlinear
strategy subject to the above constraints. Under those constraints on the fingerprint embedder and the colluders,
fingerprinting capacity is obtained as the solution of a mutual-information game involving probability density
functions (pdf's) designed by the embedder and the colluders. We show that the optimal fingerprinting strategy
is a Gaussian test channel, in which the fingerprinted signal is the sum of an attenuated version of the cover
signal and independent Gaussian information-bearing noise; the optimal collusion strategy is to average the
fingerprinted signals possessed by all the colluders and pass the averaged copy through another Gaussian test
channel. The capacity result
and the optimal strategies are the same for both the private and public games. In the former scenario, the original
covertext is available to the decoder, while in the latter setup, the original covertext is available to the encoder
but not to the decoder.
An accurate statistical model of cover images is essential to the success of both steganography and steganalysis. We study the statistics of the full-frame two-dimensional discrete Fourier transform (DFT) coefficients of natural images and show that the independently and identically distributed model with unit exponential distribution is not a sufficiently accurate description of the statistics of normalized image periodograms. Consequently, the stochastic quantization index modulation (QIM) algorithm that aims at preserving this model is detectable in principle. To discriminate the resulting stego images from cover images, we train a learning system on them. Building upon a state-of-the-art steganalysis method using the statistical moments of wavelet characteristic functions, we propose new features that are more sensitive to data embedding. The addition of these features significantly improves the steganalyzer's receiver operating characteristic (ROC) curve.
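The model in question is easy to state concretely. The sketch below (ours, not the paper's code) computes a normalized full-frame periodogram and checks the first two moments against the i.i.d. unit-exponential model, whose mean and variance are both 1. White Gaussian noise satisfies the model almost exactly; the paper's point is that natural images do not, which is what makes model-preserving stochastic QIM detectable.

```python
import numpy as np

rng = np.random.default_rng(1)

def normalized_periodogram(img):
    """Full-frame 2-D DFT periodogram, normalized to unit sample mean."""
    F = np.fft.fft2(img)
    P = np.abs(F) ** 2
    return P / P.mean()

# Under the i.i.d. unit-exponential model, the normalized periodogram bins
# have mean 1 and variance 1 (the variance of Exp(1)).  A white Gaussian
# "image" conforms; for natural images the variance deviates noticeably.
noise_img = rng.normal(size=(128, 128))
p = normalized_periodogram(noise_img).ravel()
print("mean:", p.mean(), "variance:", p.var())
```

Replacing `noise_img` with a real grayscale image array is the quickest way to see the mismatch the abstract describes.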
Probability-of-error exponents have recently been derived for watermarking systems based on spread-spectrum and quantization-index modulation methods. This paper takes this work one step further and presents minmax error exponents for any embedding scheme and any attack (subject to distortion constraints) at all rates below capacity. The decoders used are universal: they do not know the attack being used. Randomized codes outperform deterministic codes, except in the case of memoryless attacks, where the same performance is obtained using either kind of code.
We study a detection-theoretic approach to steganalysis. The
relative entropy between covertext and stegotext determines the
steganalyzer's difficulty in discriminating them, which in turn
defines the detectability of the stegosystem. We consider the case
of Gaussian random covertexts and a mean-squared embedding-distortion
constraint. We derive a lower bound on the relative entropy
between covertext and stegotext for block-based embedding
functions. This lower bound can be approached arbitrarily closely
using a spread-spectrum method and secret keys with large entropy.
The lower bound can also be attained using a stochastic
quantization index modulation (QIM) encoder, without need for
secret keys. In general, perfect undetectability can be achieved
for blockwise memoryless Gaussian covertexts. For general Gaussian
covertexts with memory, the relative entropy increases
approximately linearly with the number of blocks observed by the
steganalyzer. The error probabilities of the best steganalysis
methods decrease exponentially with the number of blocks.
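The approximately linear growth of the relative entropy with the number of observed blocks rests on a basic property: for independent blocks, relative entropy is additive. The sketch below (our illustration; the covariance values are made up) computes the Gaussian relative entropy in closed form and verifies that N independent blocks contribute N times the per-block divergence.

```python
import numpy as np

def kl_gaussian(mu0, S0, mu1, S1):
    """Relative entropy D( N(mu0,S0) || N(mu1,S1) ) in nats."""
    k = len(mu0)
    S1_inv = np.linalg.inv(S1)
    d = mu1 - mu0
    return 0.5 * (np.trace(S1_inv @ S0) + d @ S1_inv @ d - k
                  + np.log(np.linalg.det(S1) / np.linalg.det(S0)))

# Per-block divergence between a covertext block and a stegotext block
# (illustrative 2-D covariances; the stego covariance is slightly inflated).
mu = np.zeros(2)
S_cover = np.eye(2)
S_stego = np.array([[1.2, 0.1],
                    [0.1, 1.2]])
d1 = kl_gaussian(mu, S_cover, mu, S_stego)

# For independent (blockwise memoryless) blocks, the joint covariance is
# block-diagonal and the relative entropy is additive: a steganalyzer
# observing N blocks faces N times the per-block divergence.
N = 5
blockdiag = lambda S: np.kron(np.eye(N), S)
dN = kl_gaussian(np.zeros(2 * N), blockdiag(S_cover),
                 np.zeros(2 * N), blockdiag(S_stego))
print(dN / d1)   # ~ 5.0, up to floating-point rounding
```

By Stein's lemma, this divergence governs the exponential decay of the steganalyzer's error probabilities, which is the exponential-decrease statement in the final sentence above.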