Image resizing in the discrete cosine transform (DCT) domain is of interest for transcoding (Refs. 1-5). It allows a fast implementation that omits the inverse DCT, because the downsizing operation is performed implicitly by truncating the high-frequency components in the DCT domain. In general, downsizing requires an anti-aliasing filter prior to downsampling. The DCT-domain method contains an anti-aliasing filter implicitly, because the DCT filter bank is ordered from low to high frequency and truncation discards the high-frequency components. The frequency response of this method is well shaped, with a narrow transition band (Ref. 2). However, most transcoding applications target small displays such as cell phones and mobile PCs, which need quarter common intermediate format (QCIF) or common intermediate format (CIF) resolution. Although the DCT-domain downsizing has a narrow transition band, the downsized image shows severe aliasing, as shown in Fig. 1. The choice of a suitable anti-aliasing filter remains an open question in image processing; we propose a simple method that improves the visual appearance with a windowing operation that adjusts the DCT coefficients.
One-dimensional (1-D) twofold downsizing in the spatial domain using the DCT is expressed by a combination of the DCT and the inverse discrete cosine transform (IDCT) as follows (Ref. 6):

$$y = C_{N/2}^{T}\,[C_N]_{N/2}\,x, \tag{1}$$

where $C_N$ denotes the $N$-point 1-D DCT kernel, and $x$ and $y$ are the original image and the downsampled image, respectively. $[C_N]_{N/2}$ represents the upper kernels of the DCT from row 1 to $N/2$, and the superscript $T$ represents the transpose of the matrix. Downsizing in the DCT domain and Eq. 1 using the DCT kernel are one and the same. Therefore, we present the proposed method in the spatial domain for simplicity and ease of comprehension. Let $W$ be the weighting matrix, which is diagonal. When $W$ is the identity, the downsizing matrix is identical to that of the previous method. In that case, however, the downsized image shows severe aliasing, since the implicit anti-aliasing is not sufficient for a pleasing visual appearance. We propose a windowing operation in the DCT domain to reduce the aliasing, where windowing simply scales the DCT coefficients. The new downsizing matrix is written as follows:

$$y_W = C_{N/2}^{T}\,W\,[C_N]_{N/2}\,x = D_{N,W}\,x, \tag{2}$$

where $y_W$ and $D_{N,W}$ denote the result of the proposed downsizing operation and the combined downsizing matrix with $N$ data points and windowing matrix $W$.
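The windowed downsizing described above can be sketched in a few lines of plain Python. This is a minimal illustration, not the authors' implementation: it builds an orthonormal DCT-II kernel, keeps its upper half, scales the retained coefficients by a diagonal weighting, and applies a half-size IDCT. The function names, the example weights, and the 1/sqrt(2) gain normalization (added so that a constant signal keeps its level with an orthonormal kernel) are assumptions made for this example.

```python
import math

def dct_kernel(n):
    """Orthonormal n x n DCT-II kernel; row k is the k-th basis function."""
    rows = []
    for k in range(n):
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        rows.append([scale * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
                     for i in range(n)])
    return rows

def transpose(a):
    return [list(col) for col in zip(*a)]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def downsizing_matrix(n, weights=None):
    """Combined (n/2) x n matrix: half-size IDCT * diagonal window * upper DCT rows."""
    half = n // 2
    if weights is None:
        weights = [1.0] * half            # identity window: the unweighted method
    upper = dct_kernel(n)[:half]          # rows 1..n/2 of the n-point DCT kernel
    windowed = [[weights[k] * v for v in upper[k]] for k in range(half)]
    return matmul(transpose(dct_kernel(half)), windowed)

def downsize(x, weights=None):
    """Twofold 1-D downsizing of one block x (length n) to length n/2."""
    n = len(x)
    d = downsizing_matrix(n, weights)
    gain = 1.0 / math.sqrt(2.0)           # assumed normalization, see lead-in
    return [gain * sum(d[i][j] * x[j] for j in range(n)) for i in range(n // 2)]

if __name__ == "__main__":
    block = [10.0, 12.0, 9.0, 40.0, 41.0, 8.0, 11.0, 10.0]
    hypothetical_window = [1.0, 0.9, 0.6, 0.3]   # illustrative values only
    print(downsize(block))                        # unweighted truncation
    print(downsize(block, hypothetical_window))   # windowed truncation
```

Because the window only rescales coefficients that are already retained, the combined matrix can be precomputed once per block size and window.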
The Joint Video Team (JVT) (Ref. 7) recommends a twofold downsizing filter with twelve taps, which has a phase shift in the downsizing (Ref. 6). The frequency response of the JVT filter is shown in Fig. 2 (method 1); it provides strong anti-aliasing but sacrifices detail preservation. Nevertheless, we adopt the JVT filter as the target for visual appearance. The proposed method finds the weighting parameters whose frequency response is closest to that of the JVT filter, using least-squares optimization to determine the diagonal weighting matrix $W$. The frequency response of the DCT-based downsizing is given in Ref. 5 in terms of $H_k(z)$, the $z$-transform of the $N$-tap filter represented by the $k$th row of the combined downsizing matrix $D_{N,W}$. As shown in Ref. 5, the magnitude of one of these components is dominant in comparison with the others, so we consider only the frequency response of that component, written $H(z)$, in deriving the proposed filter. The problem of finding the optimal matrix is written as follows:

$$W^{*} = \arg\min_{W} \left\| \, |H(z)| - |H_{\mathrm{JVT}}(z)| \, \right\|^{2}, \tag{4}$$

where $|\cdot|$ denotes the magnitude of the $z$-transformed result, and $H_{\mathrm{JVT}}(z)$ is the $z$-transform of the JVT downsampling filter. The matrix cannot be calculated directly because the problem is nonlinear, so we used the Levenberg-Marquardt optimization method to find it. The obtained matrix has the form $W = \operatorname{diag}(w_1, \ldots, w_{N/2})$, where $\operatorname{diag}(\cdot)$ denotes the diagonal elements of the matrix. The obtained weighting parameters decrease toward the high-frequency indices; hence, the window reduces the aliasing caused by the high-frequency data, at the cost of some image detail.

The upsampling operation in the DCT domain, applied to the downsized image $y_W$, is written as follows:

$$\hat{x} = [C_N^{T}]_{N/2}\,W^{-1}\,C_{N/2}\,y_W = [C_N^{T}]_{N/2}\,W^{-1}\,C_{N/2}\,C_{N/2}^{T}\,W\,[C_N]_{N/2}\,x = [C_N^{T}]_{N/2}\,[C_N]_{N/2}\,x = \tilde{x}, \tag{6}$$

where $[C_N^{T}]_{N/2}$ and $\tilde{x}$ represent the left kernel of the IDCT from column 1 to $N/2$ and the result of the previous method after upsampling (Ref. 2), respectively. The inverse matrix $W^{-1}$ is inserted in the DCT domain to restore the DCT coefficients that were adjusted during downsampling with the proposed method. When the upsampling method of Eq. 6 is applied after downsizing with the proposed method, the peak signal-to-noise ratio (PSNR) is identical to that of the previous approach (Ref. 2), as Eq. 6 shows. When the proposed method is applied in a downsizing transcoder, no computational overhead is incurred, because the matrix $W$ is embedded in the DCT-domain down-upsizing matrix in precalculated form, as in the previous method (Ref. 2). Therefore, the proposed down-upsampling method in the DCT domain reduces aliasing in the downsized image, with no loss of PSNR after upsampling with the proposed method and no added complexity during down-upsampling in the DCT domain.
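The cancellation of the window over a down-up round trip can be checked numerically. The sketch below is an illustration under stated assumptions rather than the authors' code: it uses an orthonormal DCT-II kernel, omits any constant gain normalization, and uses hypothetical window values and function names. It downsizes a block with a diagonal window, upsizes it with the inverse window, and compares the result with the unweighted round trip.

```python
import math

def dct_kernel(n):
    """Orthonormal n x n DCT-II kernel; row k is the k-th basis function."""
    rows = []
    for k in range(n):
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        rows.append([scale * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
                     for i in range(n)])
    return rows

def transpose(a):
    return [list(col) for col in zip(*a)]

def matvec(a, x):
    return [sum(r * v for r, v in zip(row, x)) for row in a]

def down_up(x, weights):
    """Return (downsized block, reconstructed block) for a diagonal window."""
    n = len(x)
    half = n // 2
    upper = dct_kernel(n)[:half]          # upper half of the n-point DCT kernel
    c_half = dct_kernel(half)             # half-size DCT kernel

    # Downsizing: truncate, scale by the window, half-size IDCT.
    coeffs = matvec(upper, x)
    y = matvec(transpose(c_half), [w * c for w, c in zip(weights, coeffs)])

    # Upsizing: half-size DCT, undo the window, zero-padded n-point IDCT
    # (multiplying by the transposed upper kernel is the zero-padded IDCT).
    restored = [c / w for w, c in zip(weights, matvec(c_half, y))]
    x_hat = matvec(transpose(upper), restored)
    return y, x_hat

if __name__ == "__main__":
    block = [3.0, 1.0, 4.0, 1.0, 5.0, 9.0, 2.0, 6.0]
    y_plain, x_plain = down_up(block, [1.0, 1.0, 1.0, 1.0])
    y_win, x_win = down_up(block, [1.0, 0.8, 0.5, 0.25])  # hypothetical window
    print(max(abs(a - b) for a, b in zip(x_plain, x_win)))  # reconstruction gap
    print(max(abs(a - b) for a, b in zip(y_plain, y_win)))  # downsized images differ
```

The downsized blocks differ, which is where the anti-aliasing takes effect, while the reconstructions agree to floating-point precision, which is why the PSNR after upsampling is unchanged. The window entries must be nonzero for the inverse to exist.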
We used the twofold downsizing matrix of Eq. 1, in which the number of DCT coefficients in each block is halved to produce the downsized image. The visual appearance of “Mobile Calendar” is shown in Fig. 1. The proposed method shows a good compromise between reduced aliasing and loss of image detail, whereas the previous method shows severe aliasing. The visual appearance of the proposed method is similar to that of the JVT filter.
Figure 2 shows the frequency responses of the JVT filter (method 1), the previous method (method 2), and the proposed method. Method 1 shows strong anti-aliasing, whereas method 2 shows good preservation of high-frequency details. The frequency response of the proposed method has a shape similar to that of method 1. It shows additional attenuation in the high-frequency band, but the visual appearance is similar. A larger number of data points $N$ may improve the frequency response at the cost of increased complexity. Moreover, adaptive determination of the weighting parameters could provide better visual quality. For example, a block containing large high-frequency components becomes severely aliased after downsizing, so strong anti-aliasing with the matrix $W$ could be applied to it, accepting some blur for comfortable viewing, while low-frequency blocks receive only weak anti-aliasing. A method of selecting the proper matrix for various images is still under investigation.
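A frequency-response comparison of this kind can be reproduced in outline by treating one row of the combined downsizing matrix as an N-tap FIR filter and evaluating its magnitude on the unit circle. The sketch below is only an illustration: the choice of row, the window values, and the function names are assumptions, and the JVT filter coefficients are not reproduced here.

```python
import cmath
import math

def dct_kernel(n):
    """Orthonormal n x n DCT-II kernel; row k is the k-th basis function."""
    rows = []
    for k in range(n):
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        rows.append([scale * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
                     for i in range(n)])
    return rows

def downsizing_row(n, weights, row):
    """One row of (half-size IDCT) * diag(weights) * (upper DCT rows)."""
    half = n // 2
    if weights is None:
        weights = [1.0] * half
    c_n = dct_kernel(n)
    c_half = dct_kernel(half)
    return [sum(c_half[k][row] * weights[k] * c_n[k][j] for k in range(half))
            for j in range(n)]

def magnitude_response(taps, omegas):
    """|H(e^{jw})| for H(z) = sum_n taps[n] z^{-n}, at each w in omegas."""
    return [abs(sum(h * cmath.exp(-1j * w * n) for n, h in enumerate(taps)))
            for w in omegas]

if __name__ == "__main__":
    n = 16
    row = 3                                      # hypothetical choice of row
    window = [1.0, 1.0, 0.95, 0.85, 0.7, 0.5, 0.3, 0.15]  # illustrative values
    omegas = [math.pi * i / 8 for i in range(9)]
    plain = magnitude_response(downsizing_row(n, None, row), omegas)
    windowed = magnitude_response(downsizing_row(n, window, row), omegas)
    for w, a, b in zip(omegas, plain, windowed):
        print(f"{w:.3f}  {a:.4f}  {b:.4f}")
```

A least-squares fit of the window, as described in the text, would minimize the squared difference between such a magnitude curve and the magnitude response of the target filter over a grid of frequencies.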
We proposed a simple and efficient windowing method for a downsizing transcoder. The experimental results show that the proposed method improves visual quality by reducing aliasing artifacts. Windowing in the DCT domain has an effect similar to that of conventional windowing in the frequency domain. Compared with the previous DCT-domain downsizing method (Ref. 2), the proposed method has the same computational complexity and the same PSNR after upsampling with the proposed approach, because the windowing operation in the DCT domain can be embedded in the down-upsizing operation. We expect that downsizing transcoders using this method will provide better visual quality. An extension to arbitrary-ratio downsizing is being pursued by the author.