Noise is one of the main factors that degrade image quality.1,2 In spite of considerable efforts to reduce noise intensity in originally acquired images, noise often remains visible and disturbing in many practical applications. Different types of noise can be present in images, such as additive white Gaussian noise (AWGN), spatially correlated additive noise, signal-dependent and mixed noise, speckle, etc.3–6 There are also various groups of methods for image denoising. Nevertheless, researchers continue their attempts to design new, more efficient techniques for both general and more specific applications.
One reason is that the image processing community and customers are not satisfied with the results already obtained. Another reason is that until recently it was not clear whether there is room for further improvement of image filtering performance. Fortunately, a new approach to estimating the potential limit output (PLO) mean square error (MSE) for grayscale (one-component) images has been put forward by Chatterjee and Milanfar.7 This approach presumes that the noise is AWGN and that a noise-free image is available. Later, this approach was advanced further8 to allow predicting the PLO MSE without having an accurate corresponding noise-free image.
The results presented in Refs. 8–10 demonstrate the following. For a given image, the PLO MSE decreases as the noise variance decreases. For a given noise variance, the PLO MSE can vary severalfold depending on the image. It can be easily concluded from the data presented in Ref. 7 that the PLO MSE is considerably larger, by up to 10 times, for images with a more complex (highly textural) structure. Within the approach in Ref. 7, the PLO MSE is practically reached by the most efficient modern filters for complex-structure images.
The PLO MSE in Ref. 7 has been derived within a nonlocal filtering approach. Many techniques belong to this family nowadays. They are based on searching for similar patches and processing them jointly.11–14 Among them, the block-matching three-dimensional (BM3D) filter14 has been shown to be the most efficient for processing most grayscale test images7 and for component-wise denoising of color test images10 corrupted by AWGN.
Meanwhile, the approach in Ref. 7 might not be the only way to determine the PLO MSE. From linear filtering theory, the Wiener filter is known to be optimal in the sense of providing minimal output MSE under the condition of a priori known spectra of a stationary signal and noise.15 Wiener filtering applied to an entire image in the spatial two-dimensional (2-D) Fourier domain is not as efficient as in the case of one-dimensional (1-D) stationary signal filtering (stationarity is required for proper operation of the Wiener filter16), since images are nonstationary random 2-D processes. Because of this, quasi-Wiener filtering is often implemented locally in the spatial domain. The widely known local statistic Lee17 and Kuan18 filters are good examples of such algorithms. There are also versions of the Wiener filter that use orthogonal transforms other than Fourier, e.g., wavelet,16,19–21 DCT,4,22,23 and others.22 An attempt to implement a nonlocal Wiener filter in the spatial domain using image “photometric similarities” is presented in Ref. 24.
Reference 22 compares the Wiener-based filtering efficiency for different orthogonal bases. Although this is done for the 1-D case, an important conclusion is that DCT domain Wiener filtering approaches the performance of the optimal Karhunen-Loeve transform basis. This is due to the very good data de-correlation and energy compaction properties of the DCT, which are widely exploited in image and video compression.25 The efficiency and usefulness of the local DCT, commonly carried out in blocks, have also been proven for image denoising applications in Refs. 26–31. Thus, below we focus on the DCT as the basic orthogonal transform.
In this paper, our goal is to analyze the potential of DCT image filtering in detail, including an ideal (hypothetical) case of a priori known global and local power spectra and a more practical case when only information on noise statistics (variance) is available. Next, we determine the potential limits of the DCT-based filtering efficiency for fully overlapping blocks of 4×4, 8×8, and 16×16 pixels within the Wiener approach and compare them to the results obtained by Chatterjee’s approach7,24 for a wide set of standard test images. Also, we analyze the filtering efficiency of the proposed multiscale DCT-based filters and compare them to the state-of-the-art BM3D filter.
The paper is organized as follows. The image Wiener filtering principle is considered in Sec. 2, where it is also shown how it reduces to a hard switching filter. Details of multiscale DCT-based filtering are presented in Sec. 3. Numerical simulation results for the two proposed multiscale filters in comparison to the best known ones are presented in Sec. 4, providing wide opportunities for analysis and comparisons. A brief discussion of what else can be done in DCT-based filtering is presented in Sec. 5. Finally, the conclusions follow.
Image Wiener Filtering in DCT Domain
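Let us consider an additive observation equation (model)
$$u(x,y) = s(x,y) + n(x,y), \qquad (1)$$
where $u(x,y)$ is the observed noisy image, $s(x,y)$ is the noise-free image (2-D signal), and $n(x,y)$ is the noise, which is assumed to be uncorrelated with the signal. For the model in Eq. (1), the impulse response $h(x,y)$ of the linear filter that provides minimal output MSE satisfies the Wiener-Hopf equation
$$R_s(x,y) = h(x,y) * [R_s(x,y) + R_n(x,y)],$$
where $*$ denotes the convolution operation, $R_s(x,y)$ is the auto-correlation function of the 2-D signal $s(x,y)$, and $R_n(x,y)$ is the auto-correlation function of the noise. Using the Fourier transform property for convolution and the Wiener-Khinchin theorem, which relates correlation and power spectrum, one can obtain the Wiener-Hopf equation in the spectral domain, given for the 2-D case as
$$H(\omega_x,\omega_y) = \frac{P_s(\omega_x,\omega_y)}{P_s(\omega_x,\omega_y) + P_n(\omega_x,\omega_y)},$$
where $H(\omega_x,\omega_y)$ is the filter frequency response, and $P_s(\omega_x,\omega_y)$ and $P_n(\omega_x,\omega_y)$ are the power spectral densities of the signal and the noise, respectively.
In the case of additive white Gaussian noise, the model for the noise power spectral density is given by $P_n(\omega_x,\omega_y) = \sigma^2$, where $\sigma^2$ is the noise variance, so that the Wiener filter frequency response becomes
$$H(\omega_x,\omega_y) = \frac{P_s(\omega_x,\omega_y)}{P_s(\omega_x,\omega_y) + \sigma^2}. \qquad (9)$$
The last expression for the Wiener filter frequency response, Eq. (9), can be simplified by assigning unit gain to all spatial frequencies where $|U(\omega_x,\omega_y)| > \beta\sigma$ and zero gain otherwise, where $U(\omega_x,\omega_y)$ is the spectrum of the observed image and $\beta$ is a threshold parameter. This results in a hard thresholding technique5:
$$H_{HT}(\omega_x,\omega_y) = \begin{cases} 1, & |U(\omega_x,\omega_y)| > \beta\sigma, \\ 0, & \text{otherwise}. \end{cases} \qquad (10)$$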
In this case, the threshold parameter β was proven to have the quasi-optimal value β = 2.7.6,23,26 To confirm this, let us present some results. Figure 1(a) shows a three-component Landsat TM image (optical bands) in red-green-blue representation. AWGN has been added to all three components, and they have been processed by the DCT filter component-wise (8×8 blocks with full overlapping, see details in the next sections). The dependences of the output MSE on β for all three components are presented in Figs. 1(b) and 1(c) for noise standard deviations 7 and 10, respectively. All dependences have obvious minima for β slightly larger than 2.5. Since the component images are quite similar (characterized by a cross-correlation factor of about 0.9), all dependences are very similar. A general tendency is that the optimal β shifts to larger values for less complex images and/or larger noise standard deviations, and vice versa. Meanwhile, setting β equal to 2 or, e.g., 3.4 instead of 2.7 leads to an MSE increase of about 10%. Thus, the optimal setting of β (which is individual for each image and noise standard deviation) instead of the recommended quasi-optimal β = 2.7 is able to produce an output MSE that is only a few percent smaller than the one obtained for β = 2.7.
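The relation between the Wiener gain and its hard thresholding simplification can be illustrated with a minimal sketch for a single 8×8 block. The function names, the synthetic ramp block, and the use of the squared noise-free DCT coefficients as the signal power spectrum are assumptions made here for illustration; they are not taken from the original work.

```python
import numpy as np

def dct_matrix(n):
    """Orthonormal DCT-II matrix C: C @ x is the 1-D DCT of a length-n vector x."""
    k = np.arange(n)[:, None]          # frequency index
    m = np.arange(n)[None, :]          # sample index
    c = np.sqrt(2.0 / n) * np.cos((2 * m + 1) * k * np.pi / (2 * n))
    c[0, :] = np.sqrt(1.0 / n)         # the DC row has a different normalization
    return c

def wiener_gain(signal_power, sigma):
    """Wiener gain for white noise of variance sigma**2."""
    return signal_power / (signal_power + sigma ** 2)

def hard_threshold_gain(coeffs, sigma, beta=2.7):
    """Unit gain where |coefficient| > beta * sigma, zero gain otherwise."""
    return (np.abs(coeffs) > beta * sigma).astype(float)

def mse(a, b):
    return float(np.mean((a - b) ** 2))

rng = np.random.default_rng(0)
n, sigma = 8, 10.0
C = dct_matrix(n)
clean = 100.0 + 10.0 * np.tile(np.arange(n), (n, 1))   # synthetic horizontal ramp
noisy = clean + rng.normal(0.0, sigma, size=(n, n))

S = C @ clean @ C.T    # DCT of the noise-free block (known only in the ideal case)
U = C @ noisy @ C.T    # DCT of the observed block

est_wiener = C.T @ (wiener_gain(S ** 2, sigma) * U) @ C
est_hard = C.T @ (hard_threshold_gain(U, sigma) * U) @ C

print("noisy MSE:         ", mse(noisy, clean))
print("ideal Wiener MSE:  ", mse(est_wiener, clean))
print("hard threshold MSE:", mse(est_hard, clean))
```

The Wiener gain needs the signal power spectrum, whereas the hard thresholding gain needs only the noise standard deviation and β, which is what makes the latter practical.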
Locally Adaptive Wiener Image Filter in DCT Domain
The more accurate the estimates of the signal power spectrum used for Wiener filtering, the better the results achieved in the sense of the output MSE [or, equivalently, in the sense of the peak signal-to-noise ratio defined for byte-represented images as $\mathrm{PSNR} = 10\log_{10}(255^2/\mathrm{MSE})$]. Thus, one can use local spectral estimates to take into account local data activity for better noise filtering. For this purpose, the filtering may be performed within blocks of $N \times N$ pixels, and such blocks are allowed to overlap for better noise suppression. In this paper, we assume that the blocks are maximally (fully) overlapped, i.e., neighboring blocks have an overlapping area of $N \times (N-1)$ pixels if their upper left corner positions are shifted with respect to each other by only one pixel. In Refs. 23 and 26, it was shown that DCT-based filtering with block overlapping reduces blocking effects and produces better output PSNR. DCT-based denoising with full overlapping is more efficient in the sense of the output MSE criterion than processing with partial overlapping or in nonoverlapped blocks.23 Meanwhile, denoising in fully overlapped blocks takes more time. However, since the DCT can be easily implemented using fast algorithms and/or specialized software or hardware, DCT-based denoising in fully overlapped blocks is fast enough.
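As a small illustration of the two quantities used above, the following sketch computes the PSNR for 8-bit images and counts how many fully overlapped blocks cover each pixel. The function names are hypothetical and chosen only for this example.

```python
import numpy as np

def psnr(reference, estimate, peak=255.0):
    """Peak signal-to-noise ratio in dB for byte-represented images."""
    err = np.asarray(reference, dtype=float) - np.asarray(estimate, dtype=float)
    return 10.0 * np.log10(peak ** 2 / np.mean(err ** 2))

def overlap_count(rows, cols, n):
    """Number of fully overlapped n x n blocks that cover each pixel."""
    def along(length):
        i = np.arange(length)
        first = np.maximum(0, i - n + 1)    # first block start that covers pixel i
        last = np.minimum(i, length - n)    # last block start that covers pixel i
        return last - first + 1
    return np.outer(along(rows), along(cols))

counts = overlap_count(16, 16, 8)
print(counts[0, 0], counts[8, 8])   # corner pixel vs. interior pixel
```

An interior pixel is covered by $N^2$ blocks, while a corner pixel is covered by a single block, which is why filtering near image borders relies on fewer aggregated estimates.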
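So, for a locally adaptive Wiener DCT-based image filter, we use a normalized DCT-2 transform32 given by
$$U(k,l) = c(k)c(l)\sum_{m=0}^{N-1}\sum_{n=0}^{N-1} u(m,n)\cos\frac{(2m+1)k\pi}{2N}\cos\frac{(2n+1)l\pi}{2N}, \qquad (11)$$
$$u(m,n) = \sum_{k=0}^{N-1}\sum_{l=0}^{N-1} c(k)c(l)\,U(k,l)\cos\frac{(2m+1)k\pi}{2N}\cos\frac{(2n+1)l\pi}{2N}, \qquad (12)$$
where $u(m,n)$ is an image block of $N \times N$ pixels, $U(k,l)$ are its DCT coefficients, $c(0) = \sqrt{1/N}$, and $c(k) = \sqrt{2/N}$ for $k > 0$. With the DCT in Eq. (11), the frequency response of the local hard thresholding filter is
$$H_{HT}(k,l) = \begin{cases} 1, & |U(k,l)| > \beta\sigma, \\ 0, & \text{otherwise}, \end{cases} \qquad (13)$$
and the filtered block is obtained by the inverse DCT, Eq. (12), of the thresholded coefficients:
$$\hat{s}_{HT}(m,n) = \mathrm{DCT}^{-1}\{H_{HT}(k,l)\,U(k,l)\}. \qquad (14)$$
Next, we propose to use the estimate in Eq. (14) to determine the local power spectrum as
$$\hat{P}_s(k,l) = |\hat{S}_{HT}(k,l)|^2, \qquad (15)$$
where $\hat{S}_{HT}(k,l)$ is the DCT of $\hat{s}_{HT}(m,n)$. With the spectrum estimate in Eq. (15), the frequency response of the local Wiener DCT-based image filter can be formulated as
$$H_W(k,l) = \frac{\hat{P}_s(k,l)}{\hat{P}_s(k,l) + \sigma^2}, \qquad (16)$$
and the corresponding filtered block is
$$\hat{s}_W(m,n) = \mathrm{DCT}^{-1}\{H_W(k,l)\,U(k,l)\}. \qquad (17)$$
Similarly to Eq. (14), Eq. (17) results in a high redundancy of the filtered data that has to be aggregated to produce the filtered image $\hat{s}(i,j)$. The aggregation can be performed by averaging the block pixels where the overlapping occurs. It can also be performed using some weighting as proposed in Ref. 14, or using weighted least square patch averaging. However, we have determined by simulations that a simple mean calculation is adequate for block data aggregation:
$$\hat{s}(i,j) = \frac{1}{K(i,j)}\sum_{b \ni (i,j)} \hat{s}_b(i,j), \qquad (18)$$
where $\hat{s}_b$ is a block of size $N \times N$ filtered according to Eq. (14) or Eq. (17), the sum is taken over all blocks $b$ that contain the pixel $(i,j)$, and $K(i,j)$ denotes the number of overlapping blocks in the $(i,j)$'th pixel. Note that filtering efficiency might be slightly worse for pixels near image edges, since for these pixels a smaller number of filtered values from processed overlapped blocks is aggregated (for example, only one for the four image corner pixels).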
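The two-stage procedure described above can be sketched as follows. This is a minimal, unoptimized illustration under stated assumptions: the function names are hypothetical, the second stage takes its local spectrum estimate from blocks of the aggregated hard-thresholded image, and the test image is synthetic. It is not the authors' implementation.

```python
import numpy as np

def dct_matrix(n):
    """Orthonormal DCT-II matrix (normalized DCT-2)."""
    k = np.arange(n)[:, None]
    m = np.arange(n)[None, :]
    c = np.sqrt(2.0 / n) * np.cos((2 * m + 1) * k * np.pi / (2 * n))
    c[0, :] = np.sqrt(1.0 / n)
    return c

def filter_blocks(noisy, n, gain_fn):
    """Filter every fully overlapped n x n block and average the overlaps."""
    C = dct_matrix(n)
    acc = np.zeros(noisy.shape)
    count = np.zeros(noisy.shape)
    rows, cols = noisy.shape
    for i in range(rows - n + 1):
        for j in range(cols - n + 1):
            U = C @ noisy[i:i + n, j:j + n] @ C.T        # forward 2-D DCT
            block = C.T @ (gain_fn(i, j, U) * U) @ C     # apply gain, inverse DCT
            acc[i:i + n, j:j + n] += block
            count[i:i + n, j:j + n] += 1.0
    return acc / count                                   # simple mean aggregation

def hard_threshold_filter(noisy, sigma, n=8, beta=2.7):
    """Stage 1: DCT-based denoising with hard thresholding."""
    return filter_blocks(noisy, n, lambda i, j, U: np.abs(U) > beta * sigma)

def two_stage_wiener_filter(noisy, sigma, n=8, beta=2.7):
    """Stage 2: Wiener gain from the spectrum of the stage-1 output (assumption)."""
    pilot = hard_threshold_filter(noisy, sigma, n, beta)
    C = dct_matrix(n)

    def gain(i, j, U):
        P = (C @ pilot[i:i + n, j:j + n] @ C.T) ** 2     # local power spectrum estimate
        return P / (P + sigma ** 2)

    return filter_blocks(noisy, n, gain)

rng = np.random.default_rng(1)
sigma = 10.0
x = np.arange(32)
clean = 80.0 + 2.0 * x[None, :] + 60.0 * (x[:, None] > 15)   # ramp with a step edge
noisy = clean + rng.normal(0.0, sigma, size=clean.shape)

out_ht = hard_threshold_filter(noisy, sigma)
out_w = two_stage_wiener_filter(noisy, sigma)
for name, img in (("noisy", noisy), ("hard threshold", out_ht), ("two-stage Wiener", out_w)):
    print(name, float(np.mean((img - clean) ** 2)))
```

The double loop is kept for readability; a practical implementation would use a fast DCT and vectorized block extraction.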
Next, we have found by simulations that the aggregation of overlapped blocks of different sizes might further improve noise suppression. To this end, at each pixel position, different values of $N$ in Eqs. (11), (12), (14), and (17) are used, and then the processed overlapped blocks of different sizes are aggregated using some weighting. In particular, we have determined that the weighting given by Eq. (19) produces good results for different images and different noise levels. The choice of the weights in Eq. (19) is based on the results presented in the next section.
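A multiscale aggregation of this kind can be sketched as a weighted mean of the outputs obtained for several block sizes. The weights below are illustrative placeholders only, since the actual weights of Eq. (19) are not reproduced here; the function names and the synthetic test image are likewise assumptions.

```python
import numpy as np

def dct_matrix(n):
    """Orthonormal DCT-II matrix (normalized DCT-2)."""
    k = np.arange(n)[:, None]
    m = np.arange(n)[None, :]
    c = np.sqrt(2.0 / n) * np.cos((2 * m + 1) * k * np.pi / (2 * n))
    c[0, :] = np.sqrt(1.0 / n)
    return c

def hard_threshold_filter(noisy, sigma, n, beta=2.7):
    """DCT hard thresholding in fully overlapped n x n blocks, mean aggregation."""
    C = dct_matrix(n)
    acc = np.zeros(noisy.shape)
    count = np.zeros(noisy.shape)
    rows, cols = noisy.shape
    for i in range(rows - n + 1):
        for j in range(cols - n + 1):
            U = C @ noisy[i:i + n, j:j + n] @ C.T
            U = np.where(np.abs(U) > beta * sigma, U, 0.0)
            acc[i:i + n, j:j + n] += C.T @ U @ C
            count[i:i + n, j:j + n] += 1.0
    return acc / count

def multiscale_filter(noisy, sigma, weights, beta=2.7):
    """Weighted mean of single-scale outputs; weights maps block size -> weight."""
    total = float(sum(weights.values()))
    out = np.zeros(noisy.shape)
    for n, w in weights.items():
        out += (w / total) * hard_threshold_filter(noisy, sigma, n, beta)
    return out

rng = np.random.default_rng(2)
sigma = 10.0
x = np.arange(32)
clean = 100.0 + 50.0 * np.sin(2 * np.pi * x[None, :] / 32) + 1.5 * x[:, None]
noisy = clean + rng.normal(0.0, sigma, size=clean.shape)

# Placeholder weights for block sizes 4, 8, and 16 (NOT the weights of Eq. (19)).
example_weights = {4: 1.0, 8: 2.0, 16: 2.0}
out = multiscale_filter(noisy, sigma, example_weights)
print(float(np.mean((noisy - clean) ** 2)), float(np.mean((out - clean) ** 2)))
```

Because each scale is processed independently before the weighted mean is taken, the per-scale filters can run in parallel.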
The simulations have been performed using a wide set of standard grayscale test images33 shown in Fig. 2, all of the same size. This provides a fairly complete picture of the properties and performance of the different filtering algorithms and approaches considered in this paper. The noise variance (standard deviation) has been varied in a very wide range as well. Although the largest noise standard deviation values considered are almost never met in practice for grayscale images in 8-bit representation, the corresponding data are often presented in the literature dealing with filter efficiency analysis and comparisons.7,12,14 Thus, we have decided to obtain and present such data for the considered techniques.
DCT Domain Hard Thresholding and Wiener Denoising
Let us start by applying filtering to the entire image: the DCT hard thresholding [Eq. (13)], practical Wiener filtering [with spectrum estimation from the DCT filtered image; Eq. (16)], and the ideal Wiener filtering (when the power spectra of the noise-free image and the noise are both known). The obtained results are presented in Table 1.
Table 1. Performance (in terms of the output PSNR, in dB) of the standard DCT-based filtering techniques [Eqs. (9) and (10)] and the ideal Wiener filtering, all operating on the transformed data of the entire image.

| Image | σ | DCT hard thresholding | Wiener filtering | Ideal Wiener filtering |
|---|---|---|---|---|
| Stream & bridge | 2 | 39.808 | 39.843 | 43.373 |
As can be expected, the output PSNR decreases as the noise standard deviation becomes larger (this tendency is observed for any filtering approach). However, the output PSNR values differ a lot. For example, for the noise standard deviation equal to 10, the DCT-based filtering with hard thresholding (the quasi-optimal β = 2.7 has been used for all images and values of noise standard deviation) produces output PSNR ranging from 31.14 dB for the simple-structure Elaine image to 26.24 dB for the complex-structure Baboon image. Similarly, the output PSNR for the ideal Wiener filter ranges from 36.33 to 31.94 dB (again, for the test images Elaine and Baboon, respectively).
A more detailed analysis shows that the output PSNR values for the ideal Wiener filter are usually substantially larger than for the DCT-based filter with hard thresholding (by several dB in the examples above). The difference slightly increases as the noise standard deviation becomes larger. The difference is smaller for the test images with more complex structure such as Baboon and Stream & bridge.
The two-stage procedure of practical Wiener filtering produces intermediate results which are considerably closer to the outputs of the DCT-based filter with hard thresholding than to the ideal Wiener filter. The resulting PSNR for the practical Wiener filter can be up to 0.4 dB better than for the DCT-based filtering with hard thresholding. This means that the estimates of the power spectrum are not accurate enough. Note that the largest improvement for the practical Wiener filter occurs for the test images with quite simple structure and if the noise variance is large.
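The three entire-image filters compared above can be sketched as follows. The sketch assumes a synthetic test image, hypothetical function names, and takes the squared DCT coefficients of the noise-free image as the known signal spectrum in the ideal case; the numbers it prints are therefore not comparable to those in Table 1.

```python
import numpy as np

def dct_matrix(n):
    """Orthonormal DCT-II matrix (normalized DCT-2)."""
    k = np.arange(n)[:, None]
    m = np.arange(n)[None, :]
    c = np.sqrt(2.0 / n) * np.cos((2 * m + 1) * k * np.pi / (2 * n))
    c[0, :] = np.sqrt(1.0 / n)
    return c

def psnr(reference, estimate, peak=255.0):
    return 10.0 * np.log10(peak ** 2 / np.mean((reference - estimate) ** 2))

def whole_image_filters(clean, noisy, sigma, beta=2.7):
    """Hard thresholding, practical (two-stage) Wiener, and ideal Wiener filtering,
    each applied to the DCT of the entire image."""
    Cr, Cc = dct_matrix(noisy.shape[0]), dct_matrix(noisy.shape[1])
    dct2 = lambda img: Cr @ img @ Cc.T
    idct2 = lambda spec: Cr.T @ spec @ Cc

    U = dct2(noisy)
    hard = idct2(np.where(np.abs(U) > beta * sigma, U, 0.0))

    P_est = dct2(hard) ** 2                  # spectrum estimated from the filtered image
    practical = idct2(U * P_est / (P_est + sigma ** 2))

    P_true = dct2(clean) ** 2                # available only in the hypothetical ideal case
    ideal = idct2(U * P_true / (P_true + sigma ** 2))
    return hard, practical, ideal

rng = np.random.default_rng(3)
sigma = 10.0
y, x = np.mgrid[0:64, 0:64]
clean = 128.0 + 60.0 * np.sin(2 * np.pi * x / 64) * np.cos(2 * np.pi * y / 32)
noisy = clean + rng.normal(0.0, sigma, size=clean.shape)

hard, practical, ideal = whole_image_filters(clean, noisy, sigma)
for name, img in (("noisy", noisy), ("hard", hard), ("practical", practical), ("ideal", ideal)):
    print(name, round(float(psnr(clean, img)), 2))
```

The practical filter differs from the ideal one only in where the signal spectrum comes from, which is why its accuracy is bounded by the quality of the first-stage estimate.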
As mentioned in the Introduction, images are 2-D nonstationary processes for which the shapes of local spatial spectra differ considerably from the spectral shapes of the corresponding entire images. Although 8×8 blocks are usually employed in DCT-based filtering, we have considered the question of block size selection in more detail. For this purpose, the output PSNR values have been obtained for three values of the block size $N$, namely 4, 8, and 16, taking into account that in such cases the DCT-based filtering can be carried out faster than for other block sizes that are, in general, also possible. The obtained results are presented in Table 2. As before, the results are given for the DCT-based filtering with hard thresholding, the practical (two-stage) Wiener filtering [Eq. (17)], and the ideal Wiener filtering. Besides, we present results for the lower bound of filtering efficiency obtained according to Ref. 7 using the software tool offered by the authors34 (according to the recommendations in Ref. 7, 5 clusters were used with the patch size 11). These results are expressed not in terms of the output MSE, as produced by the software, but in terms of PSNR for convenience of comparison. The same test image set is used, and AWGN with the same values of the standard deviation has been simulated.
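Table 2. Output PSNR (in dB) of the DCT-based image filters [Eqs. (14), (17), and (18)] for block sizes N = 4, 8, and 16 in comparison to the noise suppression bound calculated according to Ref. 7 (5 clusters were used with the patch size 11). HT denotes DCT with hard thresholding.

| Image | σ | HT, N=4 | HT, N=8 | HT, N=16 | Wiener, N=4 | Wiener, N=8 | Wiener, N=16 | Ideal Wiener, N=4 | Ideal Wiener, N=8 | Ideal Wiener, N=16 | PSNR bound (Ref. 7) |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Stream & bridge | 2 | 42.489 | 42.544 | 42.472 | 42.625 | 42.6 | 42.519 | 44.923 | 45.017 | 44.952 | 44.448 |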
The first observation that follows from comparison of the corresponding data in Tables 1 and 2 is that block-wise image filtering produces considerably better results than filtering with the DCT applied to the entire image. The output PSNR values for the block-wise version of the DCT-based filtering with hard thresholding are noticeably better than those of its entire-image counterpart (by about 2.7 dB for the image Stream & bridge with the noise standard deviation equal to 2, for example). This once more confirms the expedience of the local image processing approach (with block overlapping). Similar observations hold for the practical and ideal Wiener filters.
As is seen, the block size has a significant impact on the DCT-based filter performance. The results for N = 4 are worse than for N = 8 or 16 in practically all cases. The only exceptions are the results for the test image Stream & bridge for small noise standard deviations, where the PSNR for N = 4 is slightly better than for N = 16. Meanwhile, the PSNR values for N = 8 and N = 16 usually do not differ a lot, and simulations for larger blocks revealed a reduction of filtering efficiency in comparison to N = 16. The general tendency is the following: N = 16 is a better choice if the noise standard deviation is larger and the processed image has a simpler structure.
We use the terms “simple structure” and “complex structure” images. Intuitively these terms are clear where the latter relates to more textural images. Unfortunately, until now there is no commonly accepted metric for image complexity.
The practical Wiener filter [Eq. (17)] again improves performance compared to the DCT-based processing with hard thresholding. Due to applying the Wiener filter at the second stage, the output PSNR can be increased by up to 0.5 dB. We would like to stress here that the practical Wiener filtering can be performed in a pipeline manner, where the second-stage processing is applied as soon as the necessary output data of the DCT-based thresholding are obtained. Thus, although the computational expense of the proposed two-stage procedure is larger than that of the standard DCT-based denoising, the two-stage filtering is still considerably faster than the most efficient denoising techniques, which search for similar blocks (patches), an operation that is usually time consuming.
The ideal Wiener filter again produces output PSNR values that are noticeably larger than those corresponding to practically implementable methods. Note that for the ideal Wiener filter the results depend on the block size, and the PLO PSNR can differ by almost 0.8 dB between block sizes.
It is interesting to compare these results (that can be considered as PLO PSNR) to the corresponding data produced by Chatterjee’s approach.7 Such comparisons can be easily made by considering, e.g., the data in the last (rightmost) two columns of Table 2 (the best attainable values of PLO PSNR are marked bold). The PLO PSNR for Chatterjee’s approach can be almost 5 dB better (this takes place for simple-structure images corrupted by AWGN with small standard deviation). Meanwhile, for complex-structure images such as Baboon and Stream & bridge, the PLO PSNR for Chatterjee’s approach can be almost 4 dB smaller than for the ideal Wiener filter. For images of middle complexity (e.g., Boat), Chatterjee’s approach produces larger PLO PSNR than the ideal Wiener filter for small noise standard deviations, and vice versa. One possible explanation of this effect is that it is more difficult to find similar patches and to take advantage of nonlocal processing for images of more complex structure and under conditions where the noise is intensive (has large variance).
The results presented in Table 2 also confirm one observation emphasized earlier in Ref. 9. The output PSNR for the DCT-based filtering with hard thresholding is quite close to Chatterjee’s limit7 for complex-structure images corrupted by intensive noise (see, e.g., the data for the test images Baboon and Stream & bridge for the noise standard deviation equal to 10 and larger). The difference is smaller than 1 dB. Meanwhile, there is room for efficiency improvement for simpler-structure images if the noise standard deviation is not large.
Comparison to the State-of-the-Art
It is interesting to compare the performance of the proposed DCT-based filters, the MDF and the two-stage Wiener MDF, with the state-of-the-art BM3D filter. The data that allow such a comparison are presented in Table 3. First of all, the presented PSNR values for a given image and noise standard deviation are quite close (the best results are marked bold). They differ by not more than 1 dB (this happens for simple-structure images corrupted by AWGN with large variance values; see the data for the image Lena). The BM3D filter performs better for some test images, while the two-stage Wiener filter is better for others. It is difficult to establish an obvious dependence of the performance of these filters on image complexity. For the two simple-structure images, Lena and Elaine, BM3D results are better for Lena, and the two-stage Wiener filter produces, on average, better results for Elaine. Similarly, for the two complex-structure test images, Baboon and Stream & bridge, the two-stage Wiener filter is better for Stream & bridge, and BM3D is better for Baboon.
Table 3. Performance (PSNR, in dB) of the proposed image filters [Eqs. (14), (17), and (19)] in comparison to the images filtered by the state-of-the-art BM3D filter.14

| Image | σ | MDF [Eqs. (14) and (19)] | Wiener MDF [Eqs. (17) and (19)] | BM3D |
|---|---|---|---|---|
| Stream & bridge | 2 | 42.553 | 42.573 | 42.662 |
In setting the weights in Eq. (19), we have taken into account that DCT-based denoising with 8×8 blocks usually provides filtering that is not worse than with 16×16 blocks, while fewer artifacts are observed in the neighborhoods of high-contrast edges and small-sized objects. In turn, denoising in 4×4 blocks is less efficient than for larger block sizes. Also, note that the DCT-based processing in blocks of different sizes can be carried out in parallel, which reduces processing time.
Figure 3 illustrates filtering efficiency for a fragment of the test image “Lena.” As is seen, noise removal is efficient and edge/detail preservation is good for both output images. Figure 4 presents an example of processing the test image “Baboon” by the proposed Wiener filter in comparison to the state-of-the-art BM3D filter. The BM3D filter suppresses noise better in “flat” (homogeneous) image regions, while the proposed filter better preserves texture and details; the filtered image in this case has a more natural appearance.
It is worth briefly discussing here the mechanism of DCT-based denoising with hard thresholding. Noise is removed in the DCT components of a block for which the coefficient magnitude does not exceed the threshold, $|U(k,l)| \le \beta\sigma$ (although the hard thresholding operation simultaneously introduces distortions in the corresponding signal components). Meanwhile, noise is preserved in the components for which $|U(k,l)| > \beta\sigma$. Therefore, noise reduction should increase if the number of DCT coefficients with $|U(k,l)| \le \beta\sigma$ is larger.
All simulation results presented above for the DCT-based denoising have been obtained for hard thresholding with the fixed β = 2.7 in Eq. (13). However, as mentioned above, such a threshold setting is quasi-optimal. Let us demonstrate this by several examples. We have selected eight test images of different complexity widely used in image processing applications. For three values of the noise standard deviation (5, 10, 15), the optimal β values that provide maximal output PSNR have been determined. Besides, we have determined two probabilities: the probability that the DCT coefficient absolute values do not exceed a lower threshold, and the probability that the DCT coefficient absolute values are larger than an upper threshold. One more characteristic of filtering efficiency has been determined: the ratio $\mathrm{MSE}_{out}/\sigma^2$, where $\mathrm{MSE}_{out}$ is the output MSE after denoising. The obtained data are presented in Table 4. The test images are ordered so that the value in the fourth column increases.
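Table 4. DCT-based filter efficiency and DCT coefficient statistics for different test images and noise variances.

| Image | σ | Optimal β | Probability of not exceeding the lower threshold | Probability of exceeding the upper threshold | MSE_out/σ² |
|---|---|---|---|---|---|
| Stream & bridge | 5 | 2.38 | 0.369 | 0.204 | 0.71 |
| Stream & bridge | 10 | 2.37 | 0.474 | 0.105 | 0.52 |
| Stream & bridge | 15 | 2.37 | 0.521 | 0.067 | 0.4 |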
The first observation is that the two probabilities are highly correlated. If the probability of small coefficient values is smaller, then the probability of large coefficient values is usually larger. The second observation is that the probability of small coefficient values is smaller, and the probability of large coefficient values is larger, for more complex-structure images and smaller noise variance values. This is clear since for more complex-structure images the DCT coefficients of the noise-free image have a wider distribution. The third observation is that the optimal β increases if image complexity reduces and/or noise variance becomes larger. The optimal β varies from 2.3 to 2.8, and for most typical practical situations it is within the limits from 2.6 to 2.7.
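Statistics of this kind can be gathered with a short sketch such as the one below. The two thresholds (`low` and `high`, in units of the noise standard deviation), the exclusion of the DC coefficient, and the function names are assumptions made for illustration; the exact definitions used for Table 4 are not reproduced here.

```python
import numpy as np

def dct_matrix(n):
    """Orthonormal DCT-II matrix (normalized DCT-2)."""
    k = np.arange(n)[:, None]
    m = np.arange(n)[None, :]
    c = np.sqrt(2.0 / n) * np.cos((2 * m + 1) * k * np.pi / (2 * n))
    c[0, :] = np.sqrt(1.0 / n)
    return c

def dct_coefficient_statistics(image, sigma, n=8, low=1.0, high=2.7):
    """Fractions of AC DCT coefficients with |U| <= low*sigma and |U| > high*sigma,
    computed over all fully overlapped n x n blocks."""
    C = dct_matrix(n)
    ac = np.ones((n, n), dtype=bool)
    ac[0, 0] = False                      # skip the DC term (assumption)
    small = large = total = 0
    rows, cols = image.shape
    for i in range(rows - n + 1):
        for j in range(cols - n + 1):
            a = np.abs(C @ image[i:i + n, j:j + n] @ C.T)[ac]
            small += np.count_nonzero(a <= low * sigma)
            large += np.count_nonzero(a > high * sigma)
            total += a.size
    return small / total, large / total

rng = np.random.default_rng(4)
sigma = 10.0
pure_noise = rng.normal(0.0, sigma, size=(64, 64))
p_small, p_large = dct_coefficient_statistics(pure_noise, sigma)
print(p_small, p_large)
```

For a noise-only image, the orthonormal DCT keeps the coefficients Gaussian with the same variance, so the fraction of small coefficients is high and the fraction of large ones is close to zero; image content moves probability mass from the first group to the second.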
It seems that if the optimal β is determined in advance for a given image under the condition of exactly known noise variance, it can provide a more careful threshold setting and thereby certain benefits in filtering efficiency. Such a strategy can be treated as image/variance adaptive threshold setting. However, in our opinion, the benefits of this strategy are too small to be used in practice. A more reasonable way seems to be locally adaptive setting of the thresholds, but currently we are unable to propose an algorithm to do this.
The data presented in Table 4 show that, for noisy images, their complexity (or, more strictly, the complexity of the image denoising task) can be indirectly characterized by the probability that the DCT coefficient absolute values exceed the upper threshold. Filtering is more efficient (smaller $\mathrm{MSE}_{out}/\sigma^2$ is provided) if this probability is smaller. Note that $\mathrm{MSE}_{out}/\sigma^2$ can vary from 0.78 (about 1 dB increase of the output PSNR compared to the input PSNR) to 0.13 and even less (about 9 dB and more increase). Thus, it seems possible to predict $\mathrm{MSE}_{out}/\sigma^2$ (or, equivalently, $\mathrm{MSE}_{out}$ for a priori known $\sigma^2$) from the analysis of this probability with a practically acceptable degree of accuracy. This can be one possible direction of future research. It can also be expected that the use of polynomial threshold operators and other more sophisticated thresholds35,36 can improve the performance of the DCT-based denoising.
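The correspondence between the ratio of the output MSE to the noise variance and the PSNR improvement follows directly from the PSNR definition: the improvement in dB equals minus ten times the decimal logarithm of the ratio. A minimal sketch (the function name is chosen here for illustration):

```python
import math

def psnr_gain_db(mse_ratio):
    """PSNR improvement (output minus input, in dB) for a given MSE_out / sigma^2."""
    return -10.0 * math.log10(mse_ratio)

# Ratios quoted in the text (0.78, 0.13) and in Table 4 (0.71, 0.52, 0.4).
for ratio in (0.78, 0.71, 0.52, 0.40, 0.13):
    print(f"MSE_out/sigma^2 = {ratio:.2f} -> PSNR gain = {psnr_gain_db(ratio):.2f} dB")
```

A ratio of 0.78 thus corresponds to a gain of roughly 1 dB, and a ratio of 0.13 to a gain of roughly 9 dB.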
Different approaches to filtering grayscale images corrupted by AWGN are considered, including the DCT-based denoising with hard thresholding, the two-stage Wiener filter, and the ideal Wiener filter, which are compared to the state-of-the-art BM3D technique. Several sizes of fully overlapped image blocks are studied, and it is shown that processing in 8×8 and 16×16 blocks produces approximately the same results. It has been demonstrated that the performance can be slightly improved by combining the outputs of filters that process blocks of different sizes. Following this approach, two multiscale DCT-based filters, MDF and Wiener MDF, are proposed and their properties analyzed.
Potential limits of output PSNR (or MSE) for the ideal Wiener filter and Chatterjee’s approach are obtained and compared. These limits are, on average, of the same order but can differ by up to 5 dB depending on the image processed and noise variance. Thus, we can state that the potential limits of filtering efficiency are “approach-dependent.”
The state-of-the-art filters including the DCT-based denoising and the Wiener-based techniques provide filtering performances quite close to Chatterjee’s limit for complex-structure images and large noise variance. Performance characteristics of the state-of-the art BM3D filter and the proposed Wiener MDF are very close while the latter filter is simpler and faster.
The proposed MDF techniques require less computational time than the BM3D filter and, especially, the Chatterjee filter, which requires image clustering to perform nonlocal averaging. The MDF technique [Eqs. (14) and (19)] is about two times faster than the Wiener MDF [Eqs. (17) and (19)] and produces good visual quality of the filtered images when the noise variance is low.
It has also been shown that filtering efficiency depends considerably on DCT coefficient statistics. A more detailed study of this dependence can be a direction of future research to further improve performance of the block-wise DCT-based filters.
We are thankful to anonymous reviewers for their valuable comments and propositions. This work was partially supported by Instituto Politecnico Nacional as a part of the research project SIP20120530.
Oleksiy Pogrebnyak received his PhD degree from Kharkov Aviation Institute (now National Aerospace University), Ukraine, in 1991. Currently, he is with The Center for Computing Research of National Polytechnic Institute, Mexico. His research interests include digital signal/image filtering and compression, and remote sensing.
Vladimir V. Lukin graduated from Kharkov Aviation Institute (now National Aerospace University) in 1983 and got his diploma with honors in radio engineering. Since then he has been with the Department of Transmitters, Receivers and Signal Processing of National Aerospace University. He defended the thesis of Candidate of Technical Science in 1988 and Doctor of Technical Science in 2002 in DSP for Remote Sensing. Since 1995 he has been in cooperation with Tampere University of Technology. Currently, he is department vice chairman and professor. His research interests include digital signal/image processing, remote sensing data processing, image filtering, and compression.