The recognition and classification of objects based on their visual similarity has become a central task in current industrial imaging systems. With increasing amounts of real-world image data to be processed and stored, the development of powerful retrieval tools has also become necessary in machine vision applications. Along with texture and color, shape is an essential feature used to describe the objects in the images. Effective shape description is therefore a key requirement in retrieval systems.
Due to the increasing number of on-line solutions, computational lightness is nowadays considered as important as classification accuracy. In retrieval, the computational efficiency of a particular descriptor generally depends on two factors: the descriptor dimensionality and the matching procedure.
The Fourier descriptor (FD)1 is probably the best-known boundary-based shape descriptor. It has been proven to outperform most other boundary-based methods in terms of retrieval accuracy and efficiency.2 In addition to good retrieval and classification performance, the main advantages of FDs are that (1) they are compact and computationally light, (2) they are easy to implement, (3) their matching is straightforward, (4) they are very easy to normalize to be scale and rotation invariant, and (5) their sensitivity to noise is low.
Wavelet transforms3 have been widely used in multiscale image analysis and also have a few applications in shape description. In Ref. 4, the wavelet descriptors (WDs) are based on zero-crossing points of the wavelet approximation of the shape, and hence the similarity measurement is dependent on the shape complexity. In Ref. 5, moment invariants are employed in shape description using wavelets. It is also possible to combine wavelets with Fourier descriptors, which yields rotation and scale invariance. This can be done using the polar coordinates of a shape6 or by Fourier transforming the wavelet coefficients obtained from the complex-valued boundary function.7 On the other hand, when WDs are formed using several scales, the resulting feature vector is typically high dimensional, because spatial information is retained at each of the scales.
In this paper, we present an effective approach to wavelet-based shape representation at a single scale. We show that it is possible to form rotation- and translation-invariant WDs whose matching is as simple and fast as that of FDs. The proposed approach is applied to a practical industrial image retrieval and classification problem.
The contour-based shape description is based on a one-dimensional boundary function (shape signature). Let $(x_k, y_k)$, $k = 0, 1, \ldots, N-1$, represent the object boundary coordinates, in which $N$ is the boundary length. The complex coordinate function $z_k$ (Ref. 2) expresses the boundary points in an object-centered coordinate system:

$$z_k = (x_k - x_c) + j\,(y_k - y_c), \tag{1}$$

in which $(x_c, y_c)$ is the object centroid.
Fourier descriptors can be formed for the boundary function using the discrete Fourier transform (DFT):

$$F_n = \frac{1}{N}\sum_{k=0}^{N-1} z_k\, e^{-j 2\pi n k / N}, \tag{2}$$

in which $n = 0, 1, \ldots, N-1$ and $F_n$ are the transform coefficients of $z_k$. The descriptors can be made rotation invariant using the magnitudes of the transform coefficients, $|F_n|$. The scale can be normalized by dividing the magnitudes of the coefficients by $|F_1|$.
The general shape of the object is represented by the low-frequency coefficients, which are usually selected to be the descriptor. In the contour Fourier method,2 the feature vector of length $L$ is formed from the scale-normalized magnitudes of the low-frequency coefficients (Eq. 3).
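The steps described above (complex coordinate function, DFT, magnitude, scale normalization, low-frequency selection) can be sketched in a few lines of NumPy. This is only an illustrative sketch, not the authors' implementation: the function name, the use of the mean of the boundary points as the centroid, and the index set (frequencies $-L/2, \ldots, -1$ and $2, \ldots, L/2+1$) are assumptions made here so that the vector has exactly `length` entries. The sketch also assumes that the first harmonic is nonzero.

```python
import numpy as np


def contour_fourier_descriptor(x, y, length=8):
    """Sketch of a scale- and rotation-invariant Fourier descriptor.

    x, y   : boundary coordinates of a closed contour, in tracing order
    length : number of descriptor entries (assumed even)
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    # Complex coordinate function in an object-centered frame.
    # Assumption: the centroid is approximated by the mean boundary point.
    z = (x - x.mean()) + 1j * (y - y.mean())
    # np.fft.fft omits the 1/N factor; it cancels in the normalization below.
    mag = np.abs(np.fft.fft(z))
    half = length // 2
    # Assumed index set: n = -half..-1 and n = 2..half+1.
    negative = mag[-half:]
    positive = mag[2:half + 2]
    # Dividing by |F_1| removes the dependence on object size.
    return np.concatenate([negative, positive]) / mag[1]
```

Because only magnitudes are kept, a rotation of the shape or a change of the starting point (both of which alter only the phases of the coefficients) leaves the vector unchanged. Two such vectors can be compared directly with the Euclidean distance.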
Wavelet Shape Descriptor Using Fourier Transform
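In the wavelet-based approach, the boundary function $z_k$ is transformed using some wavelet $\psi$.3 The complex wavelet transform8 is based on the continuous wavelet transform (CWT). The CWT of the boundary is defined as:

$$C_a(b) = \frac{1}{\sqrt{a}} \int_{\mathbb{R}} z(k)\, \psi^{*}\!\left(\frac{k-b}{a}\right) dk, \tag{4}$$

in which $a$ is the scale and $b$ is the position. In this way, the complex coefficients $C_a(b)$ of scale $a$ are obtained. The coefficients are defined for all positions $b = 0, 1, \ldots, N-1$.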
The problem with the CWT coefficients is that they are dependent on the starting point of the object boundary. Hence, the obtained descriptor is not rotation invariant. Also, the dimensionality of the feature vector depends on the boundary length. Therefore, the coefficient vectors of different shapes cannot be directly matched. The proposed solution for this problem is to apply the Fourier transform to the whole set of wavelet coefficients. This way the normalization and matching are straightforward operations. The proposed descriptor is formed by applying the DFT to the coefficients $C_a(b)$:

$$F_n^a = \frac{1}{N}\sum_{b=0}^{N-1} C_a(b)\, e^{-j 2\pi n b / N}. \tag{5}$$

The feature vector is then formed from the coefficients $F_n^a$ in the same way as in Eq. 3.
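A minimal sketch of this idea is given below: the boundary function is transformed with a complex wavelet at a single scale, and the DFT is then applied to the resulting coefficients. Several details are assumptions made for illustration and are not taken from the paper: the wavelet is taken as the first derivative of $e^{-jt}e^{-t^2}$ without its normalization constant, the closed boundary is treated as periodic, the scale value `4.0` is arbitrary, and the low-frequency index set is chosen so that the vector has exactly `length` entries.

```python
import numpy as np


def complex_gaussian_wavelet(t):
    """First derivative of exp(-1j*t) * exp(-t**2), up to a constant factor."""
    return (-1j - 2.0 * t) * np.exp(-1j * t - t ** 2)


def wavelet_fourier_descriptor(x, y, scale=4.0, length=8):
    """Sketch: single-scale wavelet coefficients followed by a DFT."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    # Complex boundary function (centroid approximated by the boundary mean).
    z = (x - x.mean()) + 1j * (y - y.mean())
    n = z.size
    idx = np.arange(n)
    # Signed circular offsets k - b: rows are positions b, columns are samples k.
    # Assumption: the closed contour is handled as a periodic signal.
    offsets = (idx[None, :] - idx[:, None] + n // 2) % n - n // 2
    kernel = np.conj(complex_gaussian_wavelet(offsets / scale)) / np.sqrt(scale)
    # Discrete approximation of the wavelet coefficients at every position b.
    coeffs = kernel @ z
    # A change of starting point shifts coeffs circularly, so the DFT
    # magnitudes are unaffected by it.
    mag = np.abs(np.fft.fft(coeffs))
    half = length // 2
    # Same assumed low-frequency selection and normalization as for an FD.
    return np.concatenate([mag[-half:], mag[2:half + 2]]) / mag[1]
```

The output has a fixed length regardless of the number of boundary points, so descriptors of different shapes can be matched with a plain Euclidean distance. The direct matrix product costs $O(N^2)$ operations and is used here only for readability.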
Experiments with Industrial Defect Shapes
The validation presented in this section is twofold. First, simple classification experiments are carried out to show the influence of scale selection on the shape description. Second, the retrieval accuracy of the proposed methods is compared to that of an ordinary FD (contour Fourier). In all the experiments, Euclidean distance and the “leave one out” validation principle are used.
For testing purposes, we use defect images that are collected from an industrial process using a paper inspection system.9 A reason for collecting defect image databases in the process industry is the practical need to control the quality of production.9 When retrieving images from a database, the defect shape is one essential property describing the defect class. Therefore, effective methods for shape representation are necessary. The test set consisted of 1204 paper defect shapes, which represented 14 defect classes with each class consisting of 27–103 images (Fig. 1).
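The leave-one-out principle with Euclidean distance can be sketched as follows. Each descriptor in turn is removed from the database and classified by a majority vote among its nearest remaining neighbors. The function name and the tie-breaking rule (the smallest class label wins) are assumptions made for this sketch, not details given in the paper.

```python
import numpy as np


def leave_one_out_knn_rate(features, labels, k=5):
    """Fraction of samples classified correctly by leave-one-out k-NN."""
    features = np.asarray(features, dtype=float)
    labels = np.asarray(labels)
    correct = 0
    for i in range(len(labels)):
        # Euclidean distance from the query to every descriptor.
        dist = np.linalg.norm(features - features[i], axis=1)
        dist[i] = np.inf  # leave the query itself out
        nearest = np.argsort(dist, kind="stable")[:k]
        classes, votes = np.unique(labels[nearest], return_counts=True)
        # Majority vote; ties go to the smallest label (assumption).
        predicted = classes[np.argmax(votes)]
        correct += int(predicted == labels[i])
    return correct / len(labels)
```

With low-dimensional descriptors, each query costs only one pass over the database, which is why the descriptor dimensionality matters for on-line use.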
Classification and Retrieval
The feature extraction in the testing database was carried out by calculating the descriptors for the images in the database. The dimensionality was 8 with all the descriptors [Eq. 3]. In the case of the wavelet-based approach, the selected wavelets were first and second order complex Gaussian wavelets that have been implemented in the Matlab wavelet toolbox.8 To compare different scales, we made preliminary $k$-nearest neighbor ($k$-NN) classification experiments. Figure 2a presents the average classification rates of the proposed wavelet descriptors at different scales using a 5-NN classifier. In this figure, the classification rate of the contour Fourier descriptor (41.87%) is also presented. The scales that produce the highest classification rates were compared to contour Fourier in the retrieval experiment by calculating average precision versus recall curves for the queries [Fig. 2b].
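An average precision versus recall curve of the kind mentioned above could be computed along the following lines. Every database item is used once as a query, the remaining items are ranked by Euclidean distance, and the precision is read at the first rank where each recall level is reached. The paper does not state how the curves of individual queries were sampled before averaging, so the fixed grid of recall levels used here is an assumption, as is the function name.

```python
import numpy as np


def average_precision_recall(features, labels, recall_levels=None):
    """Mean precision at fixed recall levels over leave-one-out queries."""
    features = np.asarray(features, dtype=float)
    labels = np.asarray(labels)
    if recall_levels is None:
        recall_levels = np.linspace(0.1, 1.0, 10)  # assumed sampling grid
    curves = []
    for i in range(len(labels)):
        dist = np.linalg.norm(features - features[i], axis=1)
        order = np.argsort(dist, kind="stable")
        order = order[order != i]  # the query is not retrieved
        relevant = labels[order] == labels[i]
        n_relevant = relevant.sum()
        if n_relevant == 0:
            continue  # no other item of this class exists
        hits = np.cumsum(relevant)
        precision = hits / np.arange(1, len(order) + 1)
        recall = hits / n_relevant
        # First rank at which each recall level is reached; the small
        # offset guards against floating-point round-off in the grid.
        ranks = np.searchsorted(recall, recall_levels - 1e-9, side="left")
        curves.append(precision[ranks])
    return recall_levels, np.mean(curves, axis=0)
```

A descriptor whose curve lies higher at every recall level ranks images of the same class closer to the query.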
In this paper, we showed that it is possible to overcome the difficulties with shape description using wavelet coefficients (rotational variance and complicated matching) by Fourier transforming the coefficients. The results of the classification and retrieval experiments reveal that the proposed wavelet-based shape description approach clearly outperforms ordinary FDs in defect shape description. It is also essential to note that the proposed descriptors have the same dimensionality and matching procedure as FDs. The computational cost of the feature extraction is somewhat higher than that of FDs due to the wavelet transform. However, the dimensionality of the descriptors is more important than the feature extraction time, because in retrieval applications the feature extraction is usually an off-line operation. If the computational efficiency of feature extraction is critical, the cost of the wavelet transform can be decreased using the algorithm presented in Ref. 10.
The authors wish to thank ABB Oy (Mr. Juhani Rauhamaa) for the paper defect image database used in the experiments.