## 1.

## Introduction

In the digital era, various types of media, including audio, video, and images, can easily be duplicated and distributed without permission from the original owner/creator. This is undesirable because the consequences of such actions may discourage the owner/creator from developing future work. One possible solution to this problem is the use of digital watermarking to discourage people from making and distributing unauthorized copies of digital media.^{1} Image watermarking is a method used to imperceptibly embed information (the watermark) into a host image before public distribution. The degradation of a watermarked image must be unnoticeable to an observer of the image. The embedded watermark must be robust against both unintentional and intentional attacks while remaining extractable so that the watermark can be “read”.^{2,3} A watermarking method used for image distribution should also be capable of blind detection, so that the watermark can be extracted without the original image.^{4}

At present, various image watermarking schemes have been proposed and shown to be robust against various types of attacks. Several of them embed a watermark within the transform domain of the host image^{5,6,7,8} so that the embedded watermark can survive most compression schemes, such as JPEG and JPEG2000. Some studies have also demonstrated that such approaches are robust against geometrical attacks, e.g., cropping.^{4,7,8} However, most of these methods suffer from low capacity, in that few watermark bits can be added to the host image.^{9} A simple and fast approach, based on spatial domain watermarking, was thus considered as an alternative. Many studies have shown that a watermark embedded in the spatial domain can survive most geometrical attacks while simultaneously providing considerably high watermark capacity. For example, the blind watermarking method proposed by Verma et al. in 2007 described watermark embedding by modifying the pixel values in the blue (B) component of a color image.^{10} Note that the blue component was modified because the human visual system (HVS) is least sensitive to blue.^{1} In that scheme, a $3\times 3$ pixel block within a predefined $8\times 8$ pixel block was modified in such a way that watermark extraction could be achieved by comparing the average intensity of subsets of pixels of the $8\times 8$ block. An error-correcting code was used with the embedded bits in order to enhance the performance. However, this scheme provided a small watermark capacity of 2500 bits per $512\times 512$ color image. In 2008, another color image watermarking scheme based on linear discriminant analysis (LDA) was proposed, where all three color channels [i.e., red (R), green (G), and blue (B)] were used to carry a watermark in the form of a binary logo image.
A trained LDA machine was used for watermark extraction.^{11} The scheme provided a small watermark capacity of 800 bits per $512\times 512\times 3$ color image pixels and also required a reference watermark to train the LDA in the extraction process. In 2009, a localized image watermarking scheme resistant to geometric attacks was proposed by Li and Guo.^{12} In their scheme, the watermark was embedded repeatedly into all local invariant regions in the spatial domain of a color image and could be extracted from the distorted image directly with the help of an odd–even bit detector. Since the embedding positions were restricted to lie within the local invariant regions in order to guard against geometric attacks (such as rotation, scaling, and translation), only a small number of bits (e.g., 16 bits) could be embedded into a $512\times 512$ gray level image. Recently, Hussein proposed a nonblind watermarking scheme based on log-average luminance,^{13} whereby some $8\times 8$ pixel blocks, chosen spirally from the center of the embedding image and having a log-average luminance greater than or equal to that of the entire image, were used for watermark embedding. However, in this scheme, apart from using an inconvenient nonblind approach, modifying the luminance components of the host image significantly degraded the visual image quality as perceived by a human observer. Although the author mitigated this drawback by allowing only 16 blocks to be modified, the performance of the method was then limited to a small watermark capacity, as only 1024 bits were embedded into $512\times 512$ color image pixels.

A blind watermarking scheme based on the modification of image pixels enabling a large number of embedding bits was first proposed by Kutter et al.,^{14} where watermark embedding was performed by modifying the blue component of color image pixels, and watermark extraction was achieved by using a prediction method based on a linear combination of pixel values in a cross-shaped neighborhood around the embedded pixels. With this method, $512\times 512$ bits could be embedded into $512\times 512$ color image pixels. The method was experimentally observed to be robust against various types of image attacks, including geometrical attacks. The extraction performance was later improved by introducing a Gaussian pixel-weighting mask into the embedding process and employing a linear combination of all nearby pixel values around the embedded pixel.^{15} However, if the numbers of watermark bits “1” and “$-1$” around the embedding pixel were not equal or balanced,^{16} the summation of those watermark bits would result in a large value, which directly affects the accuracy of the original pixel prediction step in the watermark extraction process, and the probability of extracting the watermark correctly would decrease. Such circumstances can frequently occur when the watermark to be embedded consists of recognizable patterns. The extraction probability also decreases when the host image is highly detailed, that is, when two nearby pixel values are substantially different. A similar concept of watermark embedding was also presented in Ref. 17, where the proposed perceptual mark was based on the adaptive least squares (LS) prediction error sequence of the host image and was claimed to match well with the properties of the HVS.
Together with a new blind detection scheme based on an efficient prewhitening process and a correlation-based detector, their proposed mark exhibited impressive performance, and the watermark capacity of the scheme was comparable to that of Ref. 15. However, watermark embedding in the luminance component greatly degraded the perceptual quality of the watermarked image when compared with watermark embedding in the blue component at the same watermark strength. Watermark embedding in luminance translates to watermark embedding in all three color components, i.e., RGB; thus, the resultant image quality is degraded in accordance with the changes in each of the R, G, and B components. Based on the weaknesses identified in Ref. 15, three improvement techniques were proposed in Ref. 16: balancing the watermark bits around the embedding pixel, tuning the embedding strength in accordance with the nearby luminance components, and reducing the effect caused by substantially different values between the nearby watermarked components and the center one in the prediction area. A different approach for improving the performance of this watermarking scheme was presented in Ref. 18, where the watermark is embedded into the chrominance components of the $Y{C}_{\mathrm{b}}{C}_{\mathrm{r}}$ color space, which exhibit less variation. Although it achieved better extraction performance, the accuracy of the extracted watermark still suffered under most compression schemes; e.g., a low-quality watermark was obtained after applying JPEG compression.

In this article, we present a new watermarking scheme based on the modification of image pixels in order to improve the accuracy of the extracted watermark and the robustness of the embedded watermark relative to the schemes proposed in Refs. 14, 15, 16, 17, and 18, especially against image compression standards. Three different methods are proposed to improve the overall performance. First, we propose a new method that embeds the watermark in the luminance component of the host image instead of a color component, so that the watermark better survives the heavy loss applied to chrominance information by many image compression methods. This approach is usually overlooked because the quality of the host image is severely degraded. Second, we reduce the number of watermark bits to be embedded, based on the discrete wavelet transform (DWT), without decreasing the watermark image size, in order to reduce the number of modified luminance components in the host image. Third, we propose a new watermark extraction method based on predicting the original pixel from the weighted watermarked components, in order to suit the high variation of the watermarked luminance components. The performance of all three proposed methods is evaluated and compared with previous watermarking schemes. The next section describes the proposed methods, including our watermarking scheme. Section 3 presents the experimental settings and the performance of our proposed scheme compared to the others. The conclusion is given in Sec. 4.

## 2.

## Proposed Methods

## 2.1.

### Watermark Embedding Based on Luminance Modification

We first consider embedding a watermark into the luminance component rather than into the color components. This is because, in general, image compression methods strongly decrease the chrominance quality of a color image through subsampling processes.^{19} A watermark embedded in the luminance component should therefore be more robust against image compression than one embedded in the color components. $Y{C}_{\mathrm{b}}{C}_{\mathrm{r}}$ is one of the most well-known color spaces and is widely used to represent images. We thus choose the component $Y$ in this color space, which is separately encoded, for watermark embedding. Recall that in $Y{C}_{\mathrm{b}}{C}_{\mathrm{r}}$ color space, $Y$ represents the luminance component of a color image, whereas ${C}_{\mathrm{b}}$ and ${C}_{\mathrm{r}}$ represent the blue and red chrominance components, respectively.^{20} In addition, an image in RGB color space can be converted to $Y{C}_{\mathrm{b}}{C}_{\mathrm{r}}$, or vice versa, by the following equations:

## (1)

$$\left[\begin{array}{c}Y\\ {C}_{\mathrm{b}}\\ {C}_{\mathrm{r}}\end{array}\right]=\left[\begin{array}{ccc}0.257& 0.504& 0.098\\ -0.148& -0.291& 0.439\\ 0.439& -0.368& -0.071\end{array}\right]\left[\begin{array}{c}\mathrm{R}\\ \mathrm{G}\\ \mathrm{B}\end{array}\right]+\left[\begin{array}{c}16\\ 128\\ 128\end{array}\right],$$

## (2)

$$\left[\begin{array}{c}\mathrm{R}\\ \mathrm{G}\\ \mathrm{B}\end{array}\right]=\left[\begin{array}{ccc}1.164& 0& 1.596\\ 1.164& -0.392& -0.813\\ 1.164& 2.017& 0\end{array}\right]\left[\begin{array}{c}Y\\ {C}_{\mathrm{b}}\\ {C}_{\mathrm{r}}\end{array}\right]+\left[\begin{array}{c}-222.921\\ 135.576\\ -276.836\end{array}\right].$$

However, since the HVS is very sensitive to changes in the luminance component, changing the values of $Y$ undoubtedly causes a more severe effect on perception than changing the color and/or chrominance components.
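As a reference implementation, Eqs. (1) and (2) can be written as a small sketch in Python (the function names are ours):

```python
import numpy as np

# Eq. (1): RGB -> YCbCr.
M_FWD = np.array([[ 0.257,  0.504,  0.098],
                  [-0.148, -0.291,  0.439],
                  [ 0.439, -0.368, -0.071]])
OFF_FWD = np.array([16.0, 128.0, 128.0])

# Eq. (2): YCbCr -> RGB.
M_INV = np.array([[1.164,  0.0,    1.596],
                  [1.164, -0.392, -0.813],
                  [1.164,  2.017,  0.0]])
OFF_INV = np.array([-222.921, 135.576, -276.836])

def rgb_to_ycbcr(rgb):
    """Apply Eq. (1) to an (..., 3) array of RGB values."""
    return rgb @ M_FWD.T + OFF_FWD

def ycbcr_to_rgb(ycbcr):
    """Apply Eq. (2) to an (..., 3) array of YCbCr values."""
    return ycbcr @ M_INV.T + OFF_INV
```

Because the published coefficients of Eq. (2) are rounded, a round trip reproduces the original RGB values only to within a fraction of a gray level.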

One efficient solution we consider here is to decrease the number of embedded bits in order to reduce the effect of the embedded watermark bits on the $Y$ component while simultaneously improving the quality of the watermarked image. Nevertheless, this solution must neither affect the size of the embedded watermark nor excessively degrade its quality. Figure 1 shows zoomed versions of the host and watermarked images “Lena” for the $B$ and $Y$ components. Figure 1(d) demonstrates the result obtained from embedding only $1/16$ of the watermark bits, with the same strength as used in Fig. 1(c), into the same host image. Note that the watermarking scheme in Ref. 16 was used in this test, and the quality of the watermarked image was controlled to achieve a peak signal-to-noise ratio (PSNR) of 30 dB. The PSNR for watermark embedding in the $Y$ component is given by:

## (3)

$$\mathrm{PSNR}\text{\hspace{0.17em}}(\mathrm{dB})=20\text{\hspace{0.17em}}\mathrm{log}\frac{255\sqrt{3MN}}{\sqrt{\sum _{i=1}^{M}\sum _{j=1}^{N}{[{Y}^{\prime}(i,j)-Y(i,j)]}^{2}}}.$$

Figure 1 shows that at the same PSNR, the watermarked image in the $Y$ component, Fig. 1(c), was perceptually poorer than that in the $B$ component, Fig. 1(b). However, when the number of watermark bits in the same $Y$ component was reduced to $1/16$ of the original number, with the same strength, the change in quality is barely visible [see Fig. 1(d)], and the PSNR increased to 43.3 dB.
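A minimal sketch of Eq. (3); the factor $\sqrt{3MN}$ spreads the distortion in $Y$ over all three color components of an $M\times N$ image (the function name is ours):

```python
import numpy as np

def psnr_y(Y_wm, Y):
    """PSNR of Eq. (3); the factor sqrt(3MN) reflects that a change in Y
    propagates to all three RGB components of an M x N image."""
    M, N = Y.shape
    sse = np.sum((Y_wm.astype(float) - Y.astype(float)) ** 2)
    return 20.0 * np.log10(255.0 * np.sqrt(3 * M * N) / np.sqrt(sse))
```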

## 2.2.

### Watermark Preparation/Reconstruction Based on DWT

To accomplish the embedding in the $Y$ component without affecting the watermark excessively, we develop a new watermark preparation/reconstruction method consisting of three processing steps. The first two are based on the two-dimensional (2-D) DWT and are used to reduce the size of the embedded watermark and to enlarge the extracted watermark back to its original dimensions. The last processing step is based on image denoising and is used to diminish the negative consequences of propagation error. The 2-D DWT of a function $f(x,y)$ of size $M\times N$ is defined as follows^{21}:

## (4)

$${W}_{\phi}({j}_{0},m,n)=\frac{1}{\sqrt{MN}}\sum _{x=0}^{M-1}\sum _{y=0}^{N-1}f(x,y){\phi}_{{j}_{0},m,n}(x,y),$$

## (5)

$${W}_{\psi}^{i}(j,m,n)=\frac{1}{\sqrt{MN}}\sum _{x=0}^{M-1}\sum _{y=0}^{N-1}f(x,y){\psi}_{j,m,n}^{i}(x,y).$$

Equation (6) identifies the scaled and translated scaling function as follows:

## (6)

$${\phi}_{j,m,n}(x,y)={2}^{j/2}\phi ({2}^{j}x-m,{2}^{j}y-n).$$

In this article, we use the unit-height, unit-width scaling function and the Haar wavelet function^{21} for the 2-D DWT in order to decompose an image into four quarter-size subimages, namely, ${W}_{\phi}$, ${W}_{\psi}^{H}$, ${W}_{\psi}^{V}$, and ${W}_{\psi}^{D}$. Both functions are given in Eqs. (12) and (13):

## (12)

$$\phi (x)=\{\begin{array}{l}1\phantom{\rule[-0.0ex]{1em}{0.0ex}}0\le x<1\\ 0\phantom{\rule[-0.0ex]{1em}{0.0ex}}\text{elsewhere}\end{array}$$

## (13)

$$\psi (x)=\{\begin{array}{l}1\phantom{\rule[-0.0ex]{1em}{0.0ex}}0\le x<0.5\\ -1\phantom{\rule[-0.0ex]{1em}{0.0ex}}0.5\le x<1\\ 0\phantom{\rule[-0.0ex]{1em}{0.0ex}}\text{elsewhere}\end{array}.$$

Note that the decomposition process can be applied again to the approximation subimage to obtain another set of four subband images. The resulting decomposition of the image after applying the 2-D DWT twice is illustrated in Fig. 2.
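For the Haar pair of Eqs. (12) and (13), one decomposition level reduces to simple sums and differences over non-overlapping $2\times 2$ blocks. The following sketch is our own minimal implementation (sign and naming conventions for the detail bands vary between texts):

```python
import numpy as np

def haar_dwt2(f):
    """One level of the 2-D Haar DWT over non-overlapping 2x2 blocks.
    Returns the approximation W_phi and the details W_psi^H, W_psi^V,
    W_psi^D (sign/naming conventions for the detail bands vary)."""
    a = f[0::2, 0::2].astype(float)
    b = f[0::2, 1::2].astype(float)
    c = f[1::2, 0::2].astype(float)
    d = f[1::2, 1::2].astype(float)
    return ((a + b + c + d) / 2.0,   # W_phi
            (a - b + c - d) / 2.0,   # W_psi^H
            (a + b - c - d) / 2.0,   # W_psi^V
            (a - b - c + d) / 2.0)   # W_psi^D

def haar_idwt2(ll, lh, hl, hh):
    """Inverse of haar_dwt2 (perfect reconstruction)."""
    M, N = ll.shape
    f = np.empty((2 * M, 2 * N))
    f[0::2, 0::2] = (ll + lh + hl + hh) / 2.0
    f[0::2, 1::2] = (ll - lh + hl - hh) / 2.0
    f[1::2, 0::2] = (ll + lh - hl - hh) / 2.0
    f[1::2, 1::2] = (ll - lh - hl + hh) / 2.0
    return f
```

Applying `haar_dwt2` once more to the returned approximation yields the two-level decomposition of Fig. 2.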

It should be noted that ${W}_{\phi}(j+1,m,n)$ can be reconstructed from ${W}_{\phi}(j,m,n)$ and ${W}_{\psi}^{i}(j,m,n)$, and ${W}_{\phi}(j+2,m,n)$ from ${W}_{\phi}(j+1,m,n)$ and ${W}_{\psi}^{i}(j+1,m,n)$, via the inverse DWT.

## (14)

$$f(x,y)=\frac{1}{\sqrt{MN}}\sum _{m}\sum _{n}{W}_{\phi}({j}_{0},m,n){\phi}_{{j}_{0},m,n}(x,y)+\frac{1}{\sqrt{MN}}\sum _{i=H,V,D}\sum _{j={j}_{0}}^{\infty}\sum _{m}\sum _{n}{W}_{\psi}^{i}(j,m,n){\psi}_{j,m,n}^{i}(x,y).$$

To use the 2-D DWT to construct our watermark, the watermark image ${I}_{w}(i,j)\in \{0,1\}$, of the same size as the host image, is first created from a black-and-white recognizable pattern and then decomposed twice using the 2-D DWT to obtain seven subimages. Next, each coefficient ${\mathrm{co}}_{w}$ in ${W}_{\phi}(j,m,n)$ (see Fig. 2) is thresholded to a two-level value as follows:

## (15)

$${\mathrm{co}}_{w\_\mathrm{mod}}(i,j)=\{\begin{array}{cc}1& {\mathrm{co}}_{w}(i,j)\ge 2\\ 0& \text{elsewhere}\end{array}.$$

After the extracted watermark ${\mathrm{co}}_{w\_\mathrm{mod}}^{\prime}$ is recovered, the second step is applied in order to restore it to its original size. That is, each coefficient is modified in accordance with the following equation to obtain a two-level value ${\mathrm{co}}_{w\_\mathrm{new}}^{\prime}$.

## (16)

$${\mathrm{co}}_{w\_\mathrm{new}}^{\prime}(i,j)=\{\begin{array}{cc}4& {\mathrm{co}}_{w\_\mathrm{mod}}^{\prime}(i,j)=1\\ 0& \text{elsewhere}\end{array}.$$

The new subimage ${W}_{\phi \_\mathrm{new}}(j,m,n)$ containing ${\mathrm{co}}_{w\_\mathrm{new}}^{\prime}$, together with the newly created subimages ${W}_{\psi \_\mathrm{new}}^{i}(j,m,n)$ and ${W}_{\psi \_\mathrm{new}}^{i}(j+1,m,n)$ containing all-zero coefficients, is inverse transformed to reconstruct the watermark image ${I}_{w}^{\prime}$ in its original size. Note that the lower and upper bound values, i.e., 0 and 4, are used in Eq. (16) because, based on our observations, the quality of the image reconstructed using these two values is closer to the original than with any other values. However, in the second step, if an erroneous coefficient occurs in ${W}_{\phi \_\mathrm{new}}(j,m,n)$, it will lead to a group of 16 erroneous pixels in ${I}_{w}^{\prime}$. Hence, in the last step, a $5\times 5$ pixel denoising filter with the following property is applied to ${I}_{w}^{\prime}$ to reduce the effect of the propagation error. This denoising filter sets the output pixel value according to the majority of ${I}_{w}^{\prime}$ values within a $5\times 5$ pixel area.

## (17)

$${I}_{w}^{\prime \prime}(i,j)=\{\begin{array}{cc}1& \text{when}[\sum _{m=-2}^{2}\sum _{n=-2}^{2}{I}_{w}^{\prime}(i+m,j+n)]\ge 13\\ 0& \text{elsewhere}\end{array}.$$

An example of the subimage ${\mathrm{co}}_{w\_\mathrm{mod}}$ from the image Scout Logo, together with six newly created subimages and its enlarged version, is shown in Figs. 4(a) and 4(b), while the extracted watermark images Scout Logo before and after the denoising filter are shown in Figs. 4(c) and 4(d), respectively.
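With the orthonormal Haar transform, the two-level approximation coefficient of a binary watermark is simply the corresponding $4\times 4$ block sum divided by 4, and the zero-detail inverse transform spreads each coefficient (divided by 4) over a $4\times 4$ block. The three processing steps can thus be sketched compactly as follows (border handling for the majority filter is our assumption, since the article does not specify it):

```python
import numpy as np

def prepare_watermark(Iw):
    """Step 1: two-level Haar approximation of the binary watermark Iw
    (dimensions divisible by 4), thresholded to bits via Eq. (15).
    With the orthonormal Haar transform, the two-level approximation
    coefficient equals the 4x4 block sum divided by 4."""
    M, N = Iw.shape
    co_w = Iw.astype(float).reshape(M // 4, 4, N // 4, 4).sum(axis=(1, 3)) / 4.0
    return (co_w >= 2).astype(int)                    # Eq. (15)

def reconstruct_watermark(bits):
    """Steps 2-3: map bits back to coefficients via Eq. (16), inverse
    transform with all-zero detail subbands (each coefficient spreads as
    value/4 over a 4x4 block), then apply the 5x5 majority filter of
    Eq. (17). Nearest-pixel padding at the borders is our assumption."""
    co_new = np.where(bits == 1, 4.0, 0.0)            # Eq. (16)
    Iw_rec = np.kron(co_new / 4.0, np.ones((4, 4)))   # zero-detail inverse DWT
    p = np.pad(Iw_rec, 2, mode='edge')
    M, N = Iw_rec.shape
    out = np.zeros((M, N), dtype=int)
    for i in range(M):                                # Eq. (17): majority vote
        for j in range(N):
            out[i, j] = 1 if p[i:i + 5, j:j + 5].sum() >= 13 else 0
    return out
```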

Based on the above watermark construction, a new watermarking scheme based on luminance modification is proposed. The block diagram showing the steps in watermark embedding is illustrated in Fig. 5. The steps in the watermark embedding process are as follows. First, after obtaining the reduced-size version of ${I}_{w}$ by the 2-D DWT decompositions, ${\mathrm{co}}_{w\_\mathrm{mod}}$ is XORed with a pseudorandom bit stream of the same length, generated by a key-based stream cipher, in order to obtain a balanced set of bits around each embedding component, thereby also providing security for the embedded watermark. That is, without the secret key, no one can reproduce the pseudorandom bit stream used in the embedding process and, as a result, recover the embedded watermark. The bit positions of the result are then permuted and spread in accordance with the uniform distribution to disperse groups of 0 and 1 bits over the entire embedding area. In practice, all $k$ watermark bits are first permuted and spread randomly, based on the uniform distribution, over $16k$ pixel positions. Finally, the 0 bits are converted into $-1$ so that the watermark to be embedded becomes $w(i,j)\in \{-1,1\}$. Note that the remaining $15/16$ of the pixels of the host image remain unchanged.
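The bit preparation above might be sketched as follows; NumPy's keyed generator stands in for the key-based stream cipher, which the article does not specify:

```python
import numpy as np

def scramble_and_spread(bits, shape, key=12345):
    """Embedding-side bit preparation: XOR the k watermark bits with a keyed
    pseudorandom stream, scatter them uniformly over the 16k pixel positions
    of the host, and map {0,1} -> {-1,+1}. NumPy's keyed generator is our
    stand-in for the key-based stream cipher."""
    rng = np.random.default_rng(key)
    k = bits.size
    xored = bits.ravel() ^ rng.integers(0, 2, k)          # keyed XOR
    positions = rng.permutation(shape[0] * shape[1])[:k]  # uniform spread
    w = np.zeros(shape[0] * shape[1], dtype=int)
    w[positions] = 2 * xored - 1                          # 0 -> -1, 1 -> +1
    return w.reshape(shape), positions

def recover_bits(w, k, key=12345):
    """Detector side: the same key regenerates the identical stream and
    permutation, so the bits can be descrambled from the sign pattern."""
    rng = np.random.default_rng(key)
    stream = rng.integers(0, 2, k)
    positions = rng.permutation(w.size)[:k]
    est = (w.ravel()[positions] > 0).astype(int)          # sign -> {0,1}
    return est ^ stream
```

Positions where $w$ is zero correspond to the $15/16$ of host pixels left unchanged when $k$ equals $1/16$ of the image size.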

To watermark a host color image, the luminance component of the host image at coordinates ($i,j$) is pseudorandomly modified by using addition or subtraction, depending on the value of $w(i,j)$, the watermark strength $s$, and the luminance component of the embedding pixel $Y(i,j)$. The tuning factor $s$ is included here in order to control the overall quality of the watermarked image. In practice, $s$ is a constant value used to achieve an expected PSNR and may differ depending on the host image. According to Eq. (1), the luminance component is determined by $Y(i,j)=0.257R(i,j)+0.504G(i,j)+0.098B(i,j)+16$. Note that no luminance component of the host image is embedded twice, so that only $1/16$ of the entire luminance component is modified. The watermarked luminance component ${Y}^{\prime}(i,j)$ can be represented by:

## (18)

$${Y}^{\prime}(i,j)=Y(i,j)+w(i,j)\,s\,{Y}_{g}(i,j),$$

where ${Y}_{g}(i,j)$ is the modified luminance value of the $3\times 3$ pixel block obtained from the Gaussian pixel-weighting mask,^{15} which is considered an HVS-based tuning factor for the watermark strength. In practice, $s$ must be carefully selected to obtain the best trade-off between imperceptibility and robustness.
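Assuming a standard normalized $3\times 3$ Gaussian kernel for the pixel-weighting mask (the exact mask of Ref. 15 may differ), the luminance modification can be sketched as:

```python
import numpy as np

# An assumed normalized 3x3 Gaussian kernel; the exact mask of Ref. 15
# may differ.
G = np.array([[1.0, 2.0, 1.0],
              [2.0, 4.0, 2.0],
              [1.0, 2.0, 1.0]]) / 16.0

def embed(Y, w, s):
    """Y'(i,j) = Y(i,j) + w(i,j) * s * Y_g(i,j), where Y_g is the
    Gaussian-weighted luminance of the 3x3 block around (i,j).
    w is zero at the 15/16 unembedded positions, which pass unchanged."""
    p = np.pad(Y.astype(float), 1, mode='edge')
    M, N = Y.shape
    Yg = sum(G[m, n] * p[m:m + M, n:n + N] for m in range(3) for n in range(3))
    return Y + w * s * Yg
```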

## 2.3.

### Original Pixel Prediction Based on Weighted Components

The block diagram showing steps in the proposed watermark extraction process is illustrated in Fig. 6.

From the figure, an embedded watermark can be recovered based on two assumptions. First, we assume that any pixel value within an image is close to its surrounding neighbors so that a pixel value at a given coordinate ($i,j$) can be estimated using the average of the values of its nearby pixels. Hence, a prediction of $Y(i,j)$, which we denote as ${Y}^{\prime \prime}(i,j)$, is determined from the nearby watermarked components around ($i,j$) as follows:

## (19)

$${Y}^{\prime \prime}(i,j)=\frac{1}{8}\{[\sum _{m=-1}^{1}\sum _{n=-1}^{1}{Y}^{\prime}(i+m,j+n)]-{Y}^{\prime}(i,j)\}.$$

Second, we assume that the summation of $w$ around ($i,j$) is close to zero, so that the embedded bit at ($i,j$) can be estimated by the following equation:

## (20)

$${w}^{\prime}(i,j)={Y}^{\prime}(i,j)-{Y}^{\prime \prime}(i,j).$$

It was shown in Ref. 16 that replacing the surrounding neighbor around ($i,j$) that differs most from ${Y}^{\prime}(i,j)$ by ${Y}^{\prime}(i,j)$ itself can help improve the accuracy of ${Y}^{\prime \prime}$. Since ${w}^{\prime}(i,j)$ can be either positive or negative, zero is set as its threshold, and its sign is used to estimate the value of $w(i,j)$. That is, if ${w}^{\prime}(i,j)$ is positive (or negative), $w(i,j)$ is estimated as 1 (or $-1$). Note that the magnitude of ${w}^{\prime}(i,j)$ reflects the confidence level in estimating $w(i,j)$. Finally, the $-1$ bits are converted into 0, and the result is despread, repermuted, and then XORed with the same pseudorandom bit stream used in the embedding process to obtain the recovered black-and-white image ${I}_{w}^{\prime}(i,j)\in \{0,1\}$. Note that the same pseudorandom bit stream can be reproduced only if the watermark detector knows the secret key.
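The eight-neighbor prediction of Eq. (19) and the sign-based bit estimate ${w}^{\prime}={Y}^{\prime}-{Y}^{\prime \prime}$ can be sketched as follows (nearest-pixel border handling is our assumption, following the edge treatment described in Sec. 3.1):

```python
import numpy as np

def predict_Y(Yp):
    """Eq. (19): predict each pixel as the mean of its eight neighbours
    (nearest-pixel padding at the border is our assumption)."""
    M, N = Yp.shape
    p = np.pad(Yp.astype(float), 1, mode='edge')
    s = sum(p[m:m + M, n:n + N] for m in range(3) for n in range(3))
    return (s - Yp) / 8.0

def estimate_w(Yp):
    """w'(i,j) = Y'(i,j) - Y''(i,j); its sign estimates the embedded bit."""
    return Yp - predict_Y(Yp)
```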

From the above two assumptions, the accuracy of the extracted watermark depends mainly on the variation of the image pixels. For example, two neighboring pixels with highly different values have a high chance of producing an erroneous prediction of ${w}^{\prime}(i,j)$. In fact, the variation of the $Y$ component always increases after being watermarked using the above scheme, and hence unavoidably lowers the accuracy of watermark extraction. To enhance the performance of the ${w}^{\prime}(i,j)$ estimation in the $Y$ component, we consider a new prediction technique for $Y(i,j)$ that takes into account the difference between two nearby components, i.e., the center and its neighbor. That is, instead of using the true value of each neighbor component around ($i,j$) in the prediction process, we first apply a weighting factor to every neighbor component around ($i,j$), so that all neighbor components move closer to the center pixel. Conceptually, the weighting factor is determined by the difference between the component being predicted and its neighbors: since a component value at coordinates ($i,j$) is assumed to be predictable from its neighbors, the neighbor component values should be close to the predicted one. Also, since the range of the $Y$ component varies from 16 to 235, the difference between two components can vary from 0 to 219, and the weighting factor is applied directly to each nearby component in accordance with its difference from the component being predicted. Based on this concept, the weighted neighbor component ${\overline{Y}}^{\prime}$ around ${Y}^{\prime}(i,j)$ within a $3\times 3$ pixel area can be represented by the following equation:

## (21)

$${\overline{Y}}^{\prime}(i+m,j+n)={Y}^{\prime}(i+m,j+n)+\alpha [{Y}^{\prime}(i,j)-{Y}^{\prime}(i+m,j+n)],$$

## (22)

$${\overline{Y}}^{\prime \prime}(i,j)=\frac{1}{8}\{[\sum _{m=-1}^{1}\sum _{n=-1}^{1}{\overline{Y}}^{\prime}(i+m,j+n)]-{\overline{Y}}^{\prime}(i,j)\}.$$

Note that ${\overline{Y}}^{\prime}(i,j)={Y}^{\prime}(i,j)$, and ${w}^{\prime}(i,j)$ is now obtained by:

## (23)

$${w}^{\prime}(i,j)={Y}^{\prime}(i,j)-{\overline{Y}}^{\prime \prime}(i,j).$$

The differences between the proposed watermarking scheme and the previous comparable schemes are summarized in Table 1. Note that, in this comparison, we consider only watermarking schemes that can embed a two-color watermark image of the same size as the original host image.
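The weighted prediction of Eqs. (21) and (22), followed by the sign-based bit estimate, can be sketched as follows (again with our assumed nearest-pixel border handling):

```python
import numpy as np

def estimate_w_weighted(Yp, alpha=0.475):
    """Eqs. (21) and (22): pull every neighbour toward the centre pixel by
    the weighting factor alpha before averaging, then estimate the bit from
    w' = Y' - Ybar''. Nearest-pixel border handling is our assumption."""
    M, N = Yp.shape
    p = np.pad(Yp.astype(float), 1, mode='edge')
    acc = np.zeros((M, N))
    for m in range(3):
        for n in range(3):
            nb = p[m:m + M, n:n + N]
            acc += nb + alpha * (Yp - nb)   # Eq. (21): weighted neighbour
    Ybar2 = (acc - Yp) / 8.0                # Eq. (22); note Ybar'(i,j) = Y'(i,j)
    return Yp - Ybar2
```

Setting `alpha=0` recovers the unweighted prediction of Eq. (19).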

## Table 1

Differences between five image watermarking schemes.

Scheme | Host and watermark images | Embedding method | Prediction method
---|---|---|---
Hussein’s work, denoted by 〈Lu−Log,Y,org〉 | Host color image; black-and-white logo of the same size as the host image | Uses the log-average luminance value; embeds in $Y$ components chosen spirally from the center of the embedding image | No prediction method; needs the original host image as reference
Kutter’s work, denoted by 〈Lu,B,4n〉 | Host color image; black-and-white logo of the same size as the host image | Uses the luminance value of each embedding pixel only; embeds in all blue components without the XORing operation | Uses a cross-shaped neighborhood (four watermarked components)
Karybali’s work, denoted by 〈Lu−LS,Y,8n〉 | Color or gray-scale host image; black-and-white logo of the same size as the host image | Uses a spatial perceptual mask based on the adaptive LS prediction error of the host image; embeds in all $Y$ components with the XORing operation | Uses the eight surrounding neighbors (eight watermarked components)
Amornraksa’s work, denoted by 〈Lu−G,B,7+1n〉 | Host color image; black-and-white logo of the same size as the host image | Uses the luminance value weighted from the embedding pixel and its nearby pixels; embeds in all blue components with the XORing operation | Uses seven surrounding neighbors plus the center (eight watermarked components)
Proposed method, denoted by 〈Lu−G,Y/16,w8n〉 | Color or gray-scale host image; black-and-white logo of the same size as the host image | Uses the luminance value weighted from the embedding pixel and its nearby pixels; embeds in only 1/16 of the $Y$ components with the XORing operation | Uses weighted values from the surrounding neighbors (eight watermarked components)

## 3.

## Experimental Results

In all the experiments, sixteen $256\times 256$ pixel color images with various characteristics, namely, “Lena,” “Airplane,” “Fish,” “Pepper,” “Tower,” “Baboon,” “House,” “Bird,” “Always running,” “A water trick,” “Couple,” “Golden Gate,” “Sail boat on lake,” “San Francisco,” “Splash,” and “Tree,” were used as original host images. Most of them were taken from Refs. 22 and 23. A black-and-white image Scout Logo of the same size as the host image was created and used as the watermark. To obtain a fair comparison between the different watermarking schemes, the embedding parameters used in each scheme were adjusted until the watermarked images reached a PSNR of 35 dB.^{24} When the watermark was extracted, its accuracy was evaluated by a metric known as the normalized correlation (NC); the robustness of the embedded watermark was also evaluated by the NC. The NC is a similarity measure between two different signals, given as follows:

## (24)

$$\mathrm{NC}=\frac{\sum _{i=1}^{M}\sum _{j=1}^{N}{I}_{w}(i,j){I}_{w}^{\prime}(i,j)}{\sqrt{\sum _{i=1}^{M}\sum _{j=1}^{N}{I}_{w}{(i,j)}^{2}}\sqrt{\sum _{i=1}^{M}\sum _{j=1}^{N}{I}_{w}^{\prime}{(i,j)}^{2}}}.$$

Normally, when two different versions of a watermark are compared, the value of NC varies from 0 to 1, provided that each compared watermark contains at least one component with the value 1. Note that $\mathrm{NC}=1$ implies that the two compared signals are identical; the higher the NC, the more accurate the extracted watermark. Apart from using the NC, the quality of the extracted watermark may be evaluated without comparing it to the original version. Since the watermark image contains recognizable patterns and/or logos, its quality may be judged from the intelligibility of its content. In this article, we mainly used the NC to evaluate the performance of the watermarking schemes, although we sometimes used human observers to rapidly validate the extracted watermark.
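Eq. (24) translates directly into code (a small sketch; the function name is ours):

```python
import numpy as np

def nc(Iw, Iw_rec):
    """Normalized correlation of Eq. (24) between two binary watermarks."""
    num = np.sum(Iw * Iw_rec)
    den = np.sqrt(np.sum(Iw ** 2)) * np.sqrt(np.sum(Iw_rec ** 2))
    return num / den
```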

For the experiments, we first explored the impact of the proposed watermark embedding method and the proposed original-pixel prediction technique separately, before employing them together in our watermarking scheme. We then evaluated and compared the performance of our scheme with the previous schemes under the same circumstances, i.e., embedding a black-and-white image into a color image of the same size. Finally, we evaluated and compared the robustness of the seven watermarking schemes against various types of attacks, including JPEG-based compression schemes. Two of the seven schemes were in fact adapted from the blue component embedding method in Kutter’s and Amornraksa’s schemes to the $Y$ component and are denoted by $\langle \mathrm{Lu},Y,4n\rangle $ and $\langle \mathrm{Lu}-G,Y,7+1n\rangle $, respectively.

## 3.1.

### Impacts of the Proposed Methods

To demonstrate that the proposed watermark constructed for the $Y$ component helped to improve the accuracy of the extracted watermark, the root mean square error (RMSE) between the extracted and original watermarks, that is, between ${w}^{\prime}$ and $w$, was measured to compare the performance of the proposed method with existing methods. Note that a smaller RMSE indicates a smaller difference between the two. The results, in terms of average RMSE at various PSNRs, from the watermarking scheme in Ref. 16 with the different embedding methods and channels discussed above are shown in Fig. 7. In the figure, we denote the embedding methods of Kutter (in the $B$ and $Y$ components), Karybali, Amornraksa (in the $B$ and $Y$ components), and the proposed one, as described in Table 1, by $\langle \mathrm{Lu},B\rangle $, $\langle \mathrm{Lu},Y\rangle $, $\langle \mathrm{Lu}-\mathrm{LS},Y\rangle $, $\langle \mathrm{Lu}-G,B\rangle $, $\langle \mathrm{Lu}-G,Y\rangle $, and $\langle \mathrm{Lu}-G,Y/16\rangle $, respectively. The proposed method achieved the highest accuracy of the extracted watermark compared to the other methods.

We then demonstrated that the quality of the predicted image obtained from the weighting-based prediction method is closer to the original image than that obtained from the other existing methods. Again, we measured the RMSE between the predicted and original components at every embedding position in order to observe the difference between the various prediction methods. For instance, in the $B$ component, the RMSE is computed from ${B}^{\prime \prime}$ and $B$, while in the $Y$ component, it is computed from ${Y}^{\prime \prime}$ and $Y$. Note that in this situation, a smaller RMSE indicates a better prediction of the original component. The RMSE results, averaged over all host images at various PSNRs, from the watermarking scheme in Ref. 16 with the different prediction methods described in Table 1 are presented and compared in Fig. 8. In the figure, we denote the prediction methods of Kutter, Karybali, Amornraksa, and the proposed one by $\langle 4n\rangle $, $\langle 8n\rangle $, $\langle 7+1n\rangle $, and $\langle w8n\rangle $, respectively. The results verify that our prediction method obtained the highest-quality predicted image at all PSNR values. Note that in the case of pixel prediction at the edge of the image, the values of the missing pixels were replaced by the nearest pixel. The Hussein scheme^{13} was not compared here because it does not need a prediction step.

## 3.2.

### Performance Comparison

First, we needed to identify the value of NC used to differentiate the extracted watermark from a fake one. To accomplish this, we deployed a watermark counterfeit attack by computing the average NC of the watermarks extracted from all watermarked testing images and comparing the results from the seven watermarking schemes with 993 different watermarks. In the experiments, the quality of all watermarked images was controlled to achieve 35 dB, and the value of $\alpha $ was set to 0.475 to obtain the best prediction performance. Note that $\alpha $ was obtained experimentally by a full-search approach, i.e., by searching for the value of $\alpha $ that gave the highest NC value, on average, over all testing images. In the search, the value of $\alpha $ was varied in steps of 0.1, from 0 to 1. According to the results shown in Fig. 9, the average NC value between the extracted genuine watermark and the other 993 watermarks was approximately 0.5. Hence, if the value of NC for an extracted watermark was lower than 0.5, the watermark could be presumed to be a fake. This threshold may also be used to indicate the absence of an embedded watermark because the value of NC for a valid watermark after the XORing step was equivalent to that obtained by using a pseudorandom bit stream (7 genuine and 993 random watermarks).
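The fake-detection rule can be sketched as below. Here `normalized_correlation` uses the fraction of agreeing bits, one common NC definition for binary watermarks (the paper's exact formula may differ); a random bit stream likewise scores about 0.5 under this definition, consistent with the threshold above:

```python
import numpy as np

NC_THRESHOLD = 0.5  # below this, the extracted watermark is presumed fake

def normalized_correlation(w: np.ndarray, w_ext: np.ndarray) -> float:
    """NC sketch for binary {0,1} watermarks: fraction of agreeing bits."""
    return float(np.mean(w == w_ext))

def is_genuine(nc: float) -> bool:
    """Decision rule from the text: NC below 0.5 indicates a fake
    (or absent) watermark."""
    return nc >= NC_THRESHOLD
```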

Next, we compared the performance of the seven watermarking schemes. The results in terms of average NC at various PSNRs are presented in Fig. 10. The proposed scheme outperformed the other schemes. It should be noted that the performance of $\langle \mathrm{Lu}-\mathrm{Log},Y,\mathrm{org}\rangle $ was poor even though it used the original host image to help extract the watermark. This is because a black-and-white logo of the same size as the host image was used in the experiments, and some image areas with too low a log-average luminance were not used to carry watermark bits.

Examples of the original color image Lena, the two-color, black-and-white watermark Scout Logo, the watermarked image, and the watermarks extracted using the seven different schemes at a PSNR of 35 dB are given in Fig. 11. The values of average NC obtained from $\langle \mathrm{Lu}-\mathrm{Log},Y,\mathrm{org}\rangle $, $\langle \mathrm{Lu},B,4n\rangle $, $\langle \mathrm{Lu},Y,4n\rangle $, $\langle \mathrm{Lu}-\mathrm{LS},Y,8n\rangle $, $\langle \mathrm{Lu}-G,B,7+1n\rangle $, $\langle \mathrm{Lu}-G,Y,7+1n\rangle $, and $\langle \mathrm{Lu}-G,Y/16,w8n\rangle $ were 0.8430, 0.7396, 0.7226, 0.8260, 0.8603, 0.8229, and 0.9643, respectively.

Since embedding a watermark with the same strength into different components results in different PSNRs, the watermarked images from the different schemes and components were then fairly compared using another objective quality measure that matches the properties of the HVS well, i.e., the weighted PSNR (wPSNR) taken from the Checkmark benchmark.^{25} The wPSNR is an adaptation of PSNR that assigns different weights to perceptually different image areas, taking into account that noise is more visible in flat image areas than in textures and edges.^{26} The calculation of wPSNR is given by:

## (25)

$$\mathrm{wPSNR}\;(\mathrm{dB})=20\,{\mathrm{log}}_{10}\frac{255\sqrt{3MN}}{\sqrt{\sum _{i=1}^{M}\sum _{j=1}^{N}{\{\mathrm{NVF}[{Y}^{\prime}(i,j)-Y(i,j)]\}}^{2}}}.$$^{26}

In this experiment, the wPSNR values of all testing watermarked images at a PSNR of $35\pm 0.01\;\mathrm{dB}$ were evaluated and compared. The results in terms of average wPSNR among the seven watermarking schemes are shown in Table 2. The average wPSNR value from the proposed watermarking scheme was only slightly lower than those of three of the comparing schemes, i.e., $\langle \mathrm{Lu},B,4n\rangle $, $\langle \mathrm{Lu},Y,4n\rangle $, and $\langle \mathrm{Lu}-\mathrm{LS},Y,8n\rangle $.
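Eq. (25) translates directly into code. This sketch assumes the NVF map is already available (near 1 in flat areas, near 0 on edges and textures, so errors in flat areas are penalized more); computing the NVF itself is outside the sketch:

```python
import numpy as np

def wpsnr(Y: np.ndarray, Yw: np.ndarray, nvf: np.ndarray) -> float:
    """wPSNR of Eq. (25): the luminance error at each pixel is scaled by the
    noise visibility function (NVF) before the usual PSNR-style ratio.

    Y, Yw : original and watermarked luminance, shape (M, N)
    nvf   : precomputed noise visibility map, same shape
    """
    M, N = Y.shape
    err = nvf * (Yw.astype(np.float64) - Y.astype(np.float64))
    denom = np.sqrt(np.sum(err ** 2))
    return 20.0 * np.log10(255.0 * np.sqrt(3.0 * M * N) / denom)
```

With `nvf` identically 1, the expression reduces to an ordinary PSNR-style measure, which is a quick sanity check on an implementation.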

## Table 2

Comparison of the average wPSNR at PSNR of 35±0.01 dB.

| Scheme | 〈Lu−Log,Y,org〉 | 〈Lu,B,4n〉 | 〈Lu,Y,4n〉 | 〈Lu−LS,Y,8n〉 | 〈Lu−G,B,7+1n〉 | 〈Lu−G,Y,7+1n〉 | 〈Lu−G,Y/16,w8n〉 |
|---|---|---|---|---|---|---|---|
| Average wPSNR (dB) | 37.4598 | 37.9030 | 37.9657 | 37.9127 | 37.7418 | 37.8430 | 37.8590 |
| Difference (dB) | 0.3992 | −0.0439 | −0.1066 | −0.0537 | 0.1173 | 0.0160 | 0 |

## 3.3.

### Robustness Against Attacks

Various types of attacks were next applied to the watermarked images using the Stirmark benchmark (version 4)^{27}^{,}^{28} and common image processing techniques, after which we attempted to extract the embedded watermark. If the size of an attacked image differed from its original version, we rescaled it to the original size. In the case of a cropped image, we replaced the missing part(s) of the image with white pixels. It should be noted that the quality of all attacked images fell below 35 dB, depending on the type and strength of the attack. As demonstrated in Figs. 12–21 and Table 3, the average NC of the watermark extracted by the proposed scheme after each attack was superior to that of the other schemes.
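The pre-extraction normalization described above can be sketched as two helpers. `resize_nn` is a crude nearest-neighbor stand-in for whatever interpolation the experiments actually used, and `restore_cropped` assumes the cropped fragment's original position is known; both names are illustrative:

```python
import numpy as np

def resize_nn(img: np.ndarray, H: int, W: int) -> np.ndarray:
    """Nearest-neighbor rescale of a 2-D image to H x W."""
    h, w = img.shape[:2]
    rows = (np.arange(H) * h // H).clip(0, h - 1)
    cols = (np.arange(W) * w // W).clip(0, w - 1)
    return img[rows][:, cols]

def restore_cropped(img: np.ndarray, H: int, W: int,
                    offset: tuple[int, int] = (0, 0), fill: int = 255) -> np.ndarray:
    """Paste a cropped fragment back onto a white canvas of the original
    size, as done before extraction in the text."""
    canvas = np.full((H, W), fill, dtype=img.dtype)
    r, c = offset
    canvas[r:r + img.shape[0], c:c + img.shape[1]] = img
    return canvas
```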

## Table 3

Robustness comparison against various types of attack.

All entries are the average normalized correlation (NC).

| Attack | Strength | 〈Lu−Log,Y,org〉 | 〈Lu,B,4n〉 | 〈Lu,Y,4n〉 | 〈Lu−LS,Y,8n〉 | 〈Lu−G,B,7+1n〉 | 〈Lu−G,Y,7+1n〉 | 〈Lu−G,Y/16,w8n〉 |
|---|---|---|---|---|---|---|---|---|
| Median filter | 3×3 pixels | 0.6796 | 0.6104 | 0.6450 | 0.6628 | 0.6685 | 0.6657 | 0.7107 |
| Convolution filter | Smoothing | 0.6997 | 0.4819 | 0.5726 | 0.7147 | 0.7091 | 0.7068 | 0.9079 |
| Convolution filter | Sharpening | 0.7118 | 0.7079 | 0.7095 | 0.8315 | 0.8424 | 0.8255 | 0.9732 |
| Self-similarities | Type 1 | 0.7217 | 0.6926 | 0.6748 | 0.7813 | 0.7762 | 0.7263 | 0.8942 |
| Self-similarities | Type 2 | 0.7014 | 0.6180 | 0.7239 | 0.7576 | 0.7595 | 0.8179 | 0.9680 |
| Self-similarities | Type 3 | 0.7233 | 0.7384 | 0.6568 | 0.7624 | 0.7757 | 0.6719 | 0.7828 |
| Small random distortions | 0.95 | 0.7462 | 0.6310 | 0.6345 | 0.6571 | 0.6571 | 0.6539 | 0.7025 |
| Small random distortions | 1 | 0.7469 | 0.6298 | 0.6343 | 0.6577 | 0.6565 | 0.6535 | 0.7013 |
| Small random distortions | 1.05 | 0.7474 | 0.6288 | 0.6345 | 0.6578 | 0.6568 | 0.6540 | 0.7013 |
| Small random distortions | 1.1 | 0.7479 | 0.6285 | 0.6341 | 0.6564 | 0.6571 | 0.6537 | 0.7017 |
| Latest small random distortions | 0.95 | 0.7429 | 0.6356 | 0.6342 | 0.6575 | 0.6570 | 0.6539 | 0.7048 |
| Latest small random distortions | 1 | 0.7438 | 0.6349 | 0.6336 | 0.6574 | 0.6574 | 0.6539 | 0.7043 |
| Latest small random distortions | 1.05 | 0.7445 | 0.6345 | 0.6336 | 0.6572 | 0.6574 | 0.6531 | 0.7068 |
| Latest small random distortions | 1.1 | 0.7450 | 0.6342 | 0.6330 | 0.6574 | 0.6571 | 0.6532 | 0.7075 |
| Brightness enhancement | +50 | 0.7371 | 0.6616 | 0.6710 | 0.7838 | 0.7911 | 0.7985 | 0.9224 |
| Brightness enhancement | −50 | 0.7306 | 0.6947 | 0.6847 | 0.7976 | 0.8173 | 0.8203 | 0.9271 |
| Contrast enhancement | +50 | 0.7588 | 0.6324 | 0.6443 | 0.7836 | 0.7828 | 0.7982 | 0.9205 |
| Contrast enhancement | −50 | 0.7455 | 0.7197 | 0.7297 | 0.7985 | 0.8297 | 0.8302 | 0.9279 |

Examples of the watermarked image Lena produced by the proposed scheme after JPEG compression at 100% and 75% quality factors, together with the corresponding watermark Scout Logo extracted using the seven different schemes, are shown in Fig. 22, whereas similar examples after JPEG2000 compression at ratios of $4:1$ and $12:1$ with 5 decompression layers are shown in Fig. 23.

Finally, we demonstrated the robustness of the embedded watermark against a watermark removal attack. In this experiment, the prediction ${Y}^{\prime \prime}$ of the ${Y}^{\prime}$ component was combined with the ${C}_{\mathrm{b}}$ and ${C}_{\mathrm{r}}$ components to recreate an image without a watermark. The same process was applied to the resulting image several times with the aim of completely removing the embedded watermark. For example, the first-round combination was ${Y}^{\prime \prime}+{C}_{\mathrm{b}}+{C}_{\mathrm{r}}$, the second-round combination was ${({Y}^{\prime \prime})}^{\prime \prime}+{C}_{\mathrm{b}}+{C}_{\mathrm{r}}$, and so on. The values of NC for the watermark extracted from the different versions of the recreated image Lena under the four watermarking schemes are given in Table 4. Again, the Hussein scheme^{13} was not included because it has no prediction step. The results in the table confirmed that the embedded watermark remained within all versions of the recreated image and could be reliably extracted. Also note that after the first round, the PSNR of the recreated image fell below 35 dB and decreased with each additional round.
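The round-by-round removal attack can be sketched as below. Here `predict` stands for any luminance-prediction function (for instance, a weighted-neighbor predictor), and the chroma components are simply carried along unchanged for recombination:

```python
import numpy as np

def removal_attack(Y_w, Cb, Cr, predict, rounds=3):
    """Iterated watermark-removal attack: each round replaces the current
    luminance with its prediction and recombines it with the chroma
    components, i.e., Y'' + Cb + Cr, then (Y'')'' + Cb + Cr, and so on.

    Returns the list of (luminance, Cb, Cr) triples, one per round.
    """
    Y = np.asarray(Y_w, dtype=np.float64)
    recreated = []
    for _ in range(rounds):
        Y = predict(Y)                         # predict current luminance
        recreated.append((Y.copy(), Cb, Cr))   # recombine with chroma
    return recreated
```

Measuring the NC of the watermark extracted from each round's luminance reproduces the kind of comparison reported in Table 4.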

## Table 4

Robustness comparison against watermark removal attack.

Entries are the NC of the extracted watermark.

| Version of recreated image “Lena” | 〈Lu,B,4n〉 | 〈Lu−LS,Y,8n〉 | 〈Lu−G,B,7+1n〉 | 〈Lu−G,Y/16,w8n〉 |
|---|---|---|---|---|
| First round combination (35.18 dB) | 0.7731 | 0.8739 | 0.9039 | 0.9858 |
| Second round combination (31.27 dB) | 0.7082 | 0.7415 | 0.8831 | 0.9718 |
| Third round combination (29.42 dB) | 0.4839 | 0.6182 | 0.7851 | 0.9566 |
| Fourth round combination (28.28 dB) | 0.5846 | 0.6769 | 0.7362 | 0.9309 |
| Fifth round combination (27.50 dB) | 0.4426 | 0.6415 | 0.7054 | 0.8997 |
| Sixth round combination (26.90 dB) | 0.5328 | 0.6544 | 0.6918 | 0.8732 |
| Seventh round combination (26.43 dB) | 0.4204 | 0.6462 | 0.6817 | 0.8599 |
| Eighth round combination (26.04 dB) | 0.4990 | 0.6502 | 0.6744 | 0.8512 |
| Ninth round combination (25.71 dB) | 0.4070 | 0.6451 | 0.6719 | 0.8406 |

## 4.

## Conclusions

We have presented a new image watermarking scheme based on luminance modification. The watermark is embedded into the luminance component of the host image without significant perceptible degradation, and a luminance component prediction based on a weighting factor is employed to enhance the performance of the scheme. The experimental results showed significant improvement over existing schemes in terms of both the accuracy of the extracted watermark and the robustness of the embedded watermark, especially against two popular image compression methods, JPEG and JPEG2000.

For a practical system, other techniques such as error control coding or multiple embedding might be incorporated to provide extra reliability for watermark extraction, provided the complexity in the new system is not too high and enough watermark bits can still be embedded.

## Acknowledgments

This research work was supported by the Commission on Higher Education scholarship (CHE-PhD-SW-NEWU) granted to Mr. Narong Mettripun. The authors would like to sincerely thank Mr. Suwat Tachaphetpiboon and Miss Kharittha Thongkor for their fruitful discussions.

## References

## Biography

**Narong Mettripun** received an MSc degree in computer engineering from King Mongkut’s University of Technology Thonburi (KMUTT), Thailand, in 2004. He is currently a lecturer in the Electrical Engineering Department, Rajamangala University of Technology Lanna Chiang Rai, and is pursuing a PhD degree in electrical and computer engineering at KMUTT. From July 2011 to July 2012, he was a visiting researcher at the Video and Image Processing Laboratory, Department of Electrical and Computer Engineering, Purdue University, USA.

**Thumrongrat Amornraksa** received MSc and PhD degrees from University of Surrey, England, in 1996 and 1999, respectively. He is currently an associate professor in the Computer Engineering Department, King Mongkut’s University of Technology Thonburi (KMUTT). His research interests are digital image processing and digital watermarking.

**Edward J. Delp** received his BSEE and MS degrees from the University of Cincinnati, and a PhD degree from Purdue University. In 2002, he received an Honorary Doctor of Technology from the Tampere University of Technology in Tampere, Finland. He is currently The Charles William Harrison Distinguished Professor of Electrical and Computer Engineering, Professor of Biomedical Engineering, and Professor of Psychological Sciences (Courtesy). His research interests include image and video compression, multimedia security, medical imaging, multimedia systems, communication, and information theory. He is a Fellow of the IEEE, a Fellow of the SPIE, a Fellow of the Society for Imaging Science and Technology (IS&T), and a Fellow of the American Institute of Medical and Biological Engineering. In 2008, Dr. Delp received the Society Award from the IEEE Signal Processing Society (SPS), the society's highest award, for his work in multimedia security and image and video compression.