In this paper we consider the problem of document authentication in electronic and printed forms. We formulate this problem from an information-theoretic perspective and present joint source-channel coding theorems that establish the performance limits of such protocols. We analyze the security of document authentication methods and present optimal attacking strategies with corresponding complexity estimates that, contrary to existing studies, crucially rely on the information leaked by the authentication protocol. Finally, we present the results of an experimental validation of the developed concept that demonstrates the practical efficiency of the proposed framework.
In this paper, we propose a new theoretical framework for the data-hiding problem for digital and printed text documents. We explain how this problem can be seen as an instance of the well-known Gel'fand-Pinsker problem. The main idea behind this interpretation is to consider a text character as a data structure consisting of multiple quantifiable features such as shape, position, orientation, size, and color. We also introduce color quantization, a new semi-fragile text data-hiding method that is fully automatable, has a high information embedding rate, and can be applied to both digital and printed text documents. The main idea of this method is to quantize the color or luminance intensity of each character in such a manner that the human visual system cannot distinguish the original from the quantized characters, while a specialized reader machine can do so reliably. We also describe halftone quantization, a related method that applies mainly to printed text documents. Since these methods may not be completely robust to printing and scanning, an outer coding layer is proposed to address this issue. Finally, we describe a practical implementation of the color quantization method and present experimental results comparing it with existing methods.
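The luminance-quantization idea above can be sketched as a scalar quantization-index scheme: the per-character luminance is moved to the nearest level of one of two interleaved sub-lattices, and the sub-lattice index carries a hidden bit. This is a minimal illustrative sketch, not the paper's exact scheme; the level spacing and the one-bit-per-character mapping are assumptions chosen so that the shift stays visually small.

```python
# Illustrative sketch of per-character luminance quantization for data
# hiding.  The level tables below are assumptions: two interleaved
# sub-lattices whose spacing is small enough to be visually unnoticeable
# but large enough for a reader machine to separate.

LEVELS_0 = [0, 8, 16, 24]    # luminance levels that encode bit 0
LEVELS_1 = [4, 12, 20, 28]   # interleaved levels that encode bit 1

def embed_bit(luminance: int, bit: int) -> int:
    """Move the character's luminance to the nearest level of the
    sub-lattice associated with `bit` (a scalar QIM step)."""
    levels = LEVELS_1 if bit else LEVELS_0
    return min(levels, key=lambda q: abs(q - luminance))

def extract_bit(luminance: int) -> int:
    """Decode by minimum distance over the two sub-lattices."""
    d0 = min(abs(q - luminance) for q in LEVELS_0)
    d1 = min(abs(q - luminance) for q in LEVELS_1)
    return 0 if d0 <= d1 else 1
```

The semi-fragile behavior mentioned in the abstract follows from the small inter-level spacing: moderate print-and-scan noise leaves the nearest sub-lattice unchanged, while stronger tampering flips the decoded bits.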
In data hiding, maximizing the achievable rate is closely related to the problem of host interference cancellation. Optimal host interference cancellation relies on knowledge of the host realization and the channel statistics (the additive white Gaussian noise (AWGN) variance) being available at the encoder prior to transmission. The latter assumption can rarely be met in practice. Contrary to the Costa setup, where the encoder is optimized for a particular state of the independent and identically distributed (i.i.d.) Gaussian attacking channel, we address the design of an optimal encoder for asymmetrically informed data hiding, assuming that the host interference probability density function (pdf) is i.i.d. Laplacian and the channel variance lies in some known interval. The presented experimental results demonstrate the advantages of the developed embedding strategy.
In this paper, we tackle the problem of improving the performance of quantization-based data hiding in the middle watermark-to-noise ratio (WNR) regime. The objective is to define a quantization-based framework that maximizes the performance of known-host-state data hiding in the middle-WNR regime, taking into account the host probability density function (pdf). The experimental results show that uniform deadzone quantization (UDQ) achieves higher performance than uniform quantization (UQ) or spread-spectrum (SS)-based data hiding. The performance enhancement is demonstrated for both achievable-rate and error-probability criteria.
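To make the UQ/UDQ distinction concrete, the sketch below contrasts a plain uniform quantizer with a uniform deadzone quantizer, which widens the cell around zero where a Laplacian host pdf concentrates its mass. This is a generic illustration under an assumed parametrization (deadzone width `z`, midpoint reconstruction), not the paper's specific embedding construction.

```python
import math

def uq(x: float, delta: float) -> float:
    """Uniform quantizer with step delta (nearest lattice point)."""
    return delta * round(x / delta)

def udq(x: float, delta: float, z: float) -> float:
    """Uniform deadzone quantizer: inputs with |x| < z/2 map to 0;
    outside the deadzone, uniform cells of width delta apply, with
    reconstruction at the cell midpoint.  The parametrization is an
    illustrative assumption."""
    if abs(x) < z / 2:
        return 0.0
    k = math.floor((abs(x) - z / 2) / delta)       # cell index past the deadzone
    return math.copysign(z / 2 + delta * (k + 0.5), x)
```

Because small-magnitude host samples collapse to zero, a UDQ matched to a peaked host pdf spends its quantization cells where they matter for the middle-WNR trade-off, which is the intuition behind the reported gain over UQ.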
In a data-hiding communications scenario, geometrical attacks lead to a loss of reliable communication due to synchronization problems when the applied attack is unknown. In our previous work, an information-theoretic analysis of this problem was performed for asymptotic setups, i.e., when the length of the communicated data sequences approaches infinity. Assuming that the applied geometrical attack belongs to a set of finite cardinality, it was demonstrated that the attack does not asymptotically affect the achievable rate in comparison to the scenario without any attack. The main goal of this paper is to investigate the upper and lower bounds on the rate reliability function achievable in a data-hiding channel with a geometrical state. In particular, we investigate the random coding and sphere-packing bounds in channels with a random parameter for the case when the interference (channel state) is not taken into account at the encoder. Furthermore, only geometrical transformations that preserve the input dimensionality and input type class are considered. For this case we show that a conclusion similar to the asymptotic one remains valid: within the considered class of geometrical attacks, the rate reliability function is bounded in the same way as in the case with no geometrical distortions.
The main goal of this study is the development of the additive worst-case attack (WCA) for quantization-based methods from two points of view: the bit error probability and the information-theoretic performance. Our analysis focuses on the practical scheme known as distortion-compensated dither modulation (DC-DM). From a mathematical point of view, the problem of WCA design with the probability of error as the cost function can be formulated as the maximization of the average probability of error subject to the introduced distortion for a given decoding rule. When mutual information is selected as the cost function, the WCA design establishes the global maximum of the optimization problem independently of the decoding process. Our results contribute to the common understanding and the development of fair benchmarks. They show that, within the class of additive noise attacks, the developed attack degrades the performance of the considered class of embedding techniques more strongly than the AWGN or uniform noise attacks.
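For reference, DC-DM embedding for a single host sample can be sketched as follows: the host is quantized onto the dithered sub-lattice carrying the message bit, and only a fraction alpha of the quantization step is applied (the distortion-compensation parameter). This is the standard textbook form of DC-DM, shown here as a minimal one-dimensional sketch with a fixed per-bit dither; the attack construction itself is not reproduced.

```python
def dc_dm_embed(x: float, bit: int, delta: float, alpha: float) -> float:
    """Distortion-compensated dither modulation for one sample:
    y = x + alpha * (Q_delta(x - d) + d - x), where d is the dither
    selecting the sub-lattice for `bit` (0 or delta/2 here)."""
    d = 0.0 if bit == 0 else delta / 2
    q = delta * round((x - d) / delta) + d      # nearest shifted lattice point
    return x + alpha * (q - x)                  # move only a fraction alpha

def dc_dm_decode(y: float, delta: float) -> int:
    """Minimum-distance decoding over the two dithered sub-lattices."""
    d0 = abs(delta * round(y / delta) - y)
    d1 = abs(delta * round((y - delta / 2) / delta) + delta / 2 - y)
    return 0 if d0 <= d1 else 1
```

The parameter alpha trades embedding distortion against robustness; an attacker exploiting the residual self-noise of this compensation is precisely what the worst-case-attack analysis in the abstract targets.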
In this paper we advocate an image compression technique within the distributed source coding framework. The novelty of the proposed approach is twofold: classical image compression is considered from the perspective of source coding with side information and, contrary to existing scenarios where the side information is given explicitly, the side information is created from a deterministic approximation of local image features. We consider an image in the transform domain as a realization of a source with a bounded codebook of symbols, where each symbol represents a particular edge shape. The codebook is image-independent and plays the role of an auxiliary source. Due to the partial availability of the side information at both encoder and decoder, we treat our problem as a modification of the Berger-Flynn-Gray problem and investigate the possible gain over solutions where the side information is either unavailable or available only at the decoder. Finally, we present a practical compression algorithm for passport photo images based on our concept that demonstrates superior performance in the very-low-bit-rate regime.