Improved color texture descriptors for remote sensing image retrieval

Abstract Texture features are widely used in image retrieval literature. However, conventional texture features are extracted from grayscale images without taking color information into consideration. We present two improved texture descriptors, named color Gabor wavelet texture (CGWT) and color Gabor opponent texture (CGOT), respectively, for the purpose of remote sensing image retrieval. The former consists of unichrome features computed from color channels independently and opponent features computed across different color channels at different scales, while the latter consists of Gabor texture features and opponent features mentioned above. The two representations incorporate discriminative information among color bands, thus describing well the remote sensing images that have multiple objects. Experimental results demonstrate that CGWT yields better performance compared to other state-of-the-art texture features, and CGOT not only improves the retrieval results of some image classes that have unsatisfactory performance using CGWT representation, but also increases the average precision of all queried images further. In addition, a similarity measure function for proposed representation CGOT has been defined to give a convincing evaluation.


Introduction
Texture features have shown significant advantages in image classification, 1 image segmentation, 2 and content-based image retrieval (CBIR). 3,4 In particular, texture is a low-level feature that has been widely used in the CBIR community because it is independent of image color and intensity. Popular texture descriptors, such as the gray level co-occurrence matrix (GLCM), 5 Gabor filter, 6 wavelet transform, 7 and local binary pattern (LBP), 8 have been used extensively. Unfortunately, these conventional texture features are extracted directly from grayscale images and ignore the discriminative information carried by the different color channels, which can be regarded as complementary information for different texture patterns. Many studies have sought to exploit this discriminative information to improve the retrieval results of remote sensing images.
Such research can be roughly divided into two strategies: (1) combining color and texture features and (2) texture features that integrate opponent process theory. Some works based on the former strategy are as follows. Lin et al. 9 proposed a smart CBIR system based on color and texture features. Chun et al. 10 presented a CBIR method based on a combination of color and texture features extracted in the multiresolution wavelet domain. Liapis and Tziritas 11 described an image retrieval mechanism based on a combination of texture and color features using discrete wavelet frames analysis and one-dimensional histograms of CIELab chromaticity coordinates, respectively. This strategy has also been adopted as one of the main retrieval mechanisms in some well-known image retrieval systems, such as query by image content (QBIC). 12,13 Similar works can be found in Refs. 14-17. Although these works take both color information and texture features into consideration, problems such as computational complexity and the choice of weight parameters for the combined features remain open.
In the 1950s, Hurvich and Jameson 18 proposed an opponent process theory of human color vision, and texture features integrating this theory have drawn increasing attention in recent years. Jain and Healey 19 proposed a multiscale representation based on the opponent process theory for texture recognition, and this method was later applied to hyperspectral image texture recognition. 20 In a recent work by Choi et al., 21 two features, namely color local Gabor wavelets and color LBP, were proposed for face recognition; they share similar principles and can be treated as an extended application of the theory proposed in Ref. 19. The opponent process theory provides complementary information among color channels and yields a simple but effective feature representation.
*Address all correspondence to: Weixun Zhou, E-mail: weixunzhou1990@whu.edu.cn
Motivated by the aforementioned applications of opponent process theory, we propose a descriptor named color Gabor wavelet texture (CGWT) for remote sensing image retrieval. In addition, a color Gabor opponent texture (CGOT) descriptor based on Gabor wavelets is presented to improve the retrieval results of certain image classes for which the CGWT representation gives inferior precision.
The rest of this study is organized as follows. Section 2 shows the framework of remote sensing image retrieval based on the proposed descriptors and details the proposed features, the parameters used, and the similarity measure defined for the CGOT descriptor. In Sec. 3, comparative experimental results and discussions are presented. Conclusions and future work constitute Sec. 4.
Generally, an image retrieval system contains an image database, a feature database, and some important functional modules, such as feature extraction, an indexing mechanism, and a similarity measure. Figure 1 illustrates the framework of the improved color texture descriptors for remote sensing image retrieval in this study, which mainly contains two parts: feature extraction and image retrieval.
The feature extraction part shows how the proposed features are extracted. Given an RGB remote sensing image, the three color channel images R, G, and B are obtained first. Then, unichrome features for each color channel are extracted using a Gabor filter with orientation and scale $(u, v)$. Finally, the R, G, and B unichrome features are combined to form the unichrome feature. For the opponent feature, two Gabor filters with orientation and scale $(u, v)$ and $(u, v')$ are applied to two color channel images, respectively. As with the unichrome feature, the RG, RB, and GB opponent features are combined to form the opponent feature.
The image retrieval part illustrates a simple retrieval procedure. All images and features are stored in the image database and the feature database, respectively, and images are associated with their features through an indexing mechanism. Given a query image, the distances between the query image and the images in the database are calculated using a predefined similarity measure, and the k most similar images are then returned in descending order of similarity.
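The retrieval step described above can be sketched in a few lines of Python. This is only an illustrative sketch: the function names `l2_distance` and `retrieve`, the dictionary used as the feature database, and the toy two-dimensional features are all assumptions made for the example, and the plain L2 distance stands in for whichever similarity measure is actually chosen.

```python
import math


def l2_distance(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def retrieve(query_feature, feature_db, k, distance=l2_distance):
    """Return the ids of the k database images closest to the query.

    feature_db maps an image id to its feature vector, which plays the
    role of the indexing mechanism in this toy example.
    """
    ranked = sorted(
        feature_db,
        key=lambda image_id: distance(query_feature, feature_db[image_id]),
    )
    return ranked[:k]  # smallest distance first, i.e., most similar first


# Hypothetical feature database with three images.
feature_db = {"img_a": [0.0, 0.0], "img_b": [1.0, 1.0], "img_c": [0.2, 0.1]}
top2 = retrieve([0.1, 0.1], feature_db, k=2)
print(top2)
```

Passing the distance as a parameter keeps the ranking logic independent of the feature type, so a feature-specific measure can be swapped in without changing `retrieve`.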
Feature extraction is an indispensable part of an image retrieval system. The extraction of the proposed representations is detailed in Sec. 2.2. The similarity measures used in this study, which form the most important step of the image retrieval part, are discussed in Sec. 2.3.

Feature Extraction
In our methodology, all images are represented in RGB color space for convenience. Both CGWT and CGOT features are based on the Gabor filter, defined as follows:

$$\psi_{u,v}(z) = \frac{\|k_{u,v}\|^2}{\sigma^2} \exp\left(-\frac{\|k_{u,v}\|^2 \|z\|^2}{2\sigma^2}\right) \left[\exp(i k_{u,v} z) - \exp\left(-\frac{\sigma^2}{2}\right)\right], \quad (1)$$

where $u$ and $v$ denote the orientation and scale of the Gabor kernels, respectively, $z = (x, y)$, $\|\cdot\|$ denotes the norm operator, and $k_{u,v}$ is defined as follows:

$$k_{u,v} = k_v e^{i\phi_u}, \quad (2)$$

where $k_v = k_{\max}/f^v$ and $\phi_u = \pi u/8$. $k_{\max}$ is the maximum frequency, $f$ is the spacing factor, and $\sigma$ is the standard deviation. Note that the Gabor filter can be written in many forms; the form of Eq. (1), taken from Ref. 22, is chosen because it is concise and convenient for setting parameters, such as orientation and scale, in our algorithm.
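As a rough illustration of how such a Gabor filter bank could be generated, the following sketch samples a complex kernel on a small grid. It assumes the widely used kernel form with a Gaussian envelope and a DC-compensated complex carrier, with $k_v = k_{\max}/f^v$ and $\phi_u = \pi u/8$ as stated above. The default values $\sigma = 2\pi$, $k_{\max} = \pi/2$, and $f = \sqrt{2}$, the 9 × 9 window, and the name `gabor_kernel` are assumptions made for the example only.

```python
import cmath
import math


def gabor_kernel(u, v, size=9, sigma=2 * math.pi,
                 k_max=math.pi / 2, f=math.sqrt(2)):
    """Complex Gabor kernel for orientation u and scale v on a size x size grid.

    Assumed form: (|k|^2 / sigma^2) * exp(-|k|^2 |z|^2 / (2 sigma^2))
                  * (exp(i k.z) - exp(-sigma^2 / 2)).
    """
    k_v = k_max / f ** v            # radial frequency of scale v
    phi_u = math.pi * u / 8         # orientation angle
    kx, ky = k_v * math.cos(phi_u), k_v * math.sin(phi_u)
    k_sq = k_v ** 2                 # squared norm of the wave vector
    half = size // 2
    kernel = []
    for y in range(-half, half + 1):
        row = []
        for x in range(-half, half + 1):
            envelope = (k_sq / sigma ** 2) * math.exp(
                -k_sq * (x * x + y * y) / (2 * sigma ** 2))
            # Subtracting exp(-sigma^2 / 2) removes the DC component.
            carrier = cmath.exp(1j * (kx * x + ky * y)) - math.exp(-sigma ** 2 / 2)
            row.append(envelope * carrier)
        kernel.append(row)
    return kernel


# Bank of 8 orientations x 5 scales, indexed by (u, v).
bank = {(u, v): gabor_kernel(u, v) for u in range(8) for v in range(5)}
print(len(bank), abs(bank[(0, 0)][4][4]))
```

The small window only keeps the example fast; a larger window changes nothing in the construction itself.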

Extraction of CGWT descriptor
As illustrated in Fig. 1, the CGWT representation consists of two parts, the unichrome feature and the opponent feature. The terms "unichrome feature" and "opponent feature" follow the definitions in Ref. 19, which gives detailed information about both. Let R, G, and B be the three grayscale images corresponding to the color channels of an RGB image. The convolutions of the three grayscale images with the Gabor kernel $\psi_{u,v}$ are denoted as follows:

$$R_{u,v}(z) = R(z) * \psi_{u,v}(z), \quad G_{u,v}(z) = G(z) * \psi_{u,v}(z), \quad B_{u,v}(z) = B(z) * \psi_{u,v}(z), \quad (3)$$

where $z$ has the same meaning as in Eq. (1) and $*$ denotes the convolution operator. $R_{u,v}(z)$, $G_{u,v}(z)$, and $B_{u,v}(z)$ are the convolution results of the three grayscale images at orientation $u$ and scale $v$. The unichrome feature representation of a color image is then given by Eq. (4), whose three components are the R, G, and B unichrome features, respectively. Unichrome features are thus values extracted from a single image band.
Then, the difference of the normalized $R_{u,v}(z)$, $G_{u,v}(z)$, and $B_{u,v}(z)$ is defined by Eq. (5), where $u$ denotes the orientation and $v$ and $v'$ denote the scales of the Gabor filters used. Note that, given the Gabor kernels in Eq. (1), we choose $v$ and $v'$ as adjacent scales, which means they satisfy the restriction $|v - v'| \le 1$. The opponent features are then defined by Eq. (6), whose three components are the RG, RB, and GB opponent features, respectively. Opponent features are thus values extracted from the difference of two image bands. From Eqs. (5) and (6), we obtain the three equations in Eq. (7). Dimension and efficiency are two factors that need to be considered during feature extraction, whereas the works by Jain and Healey 19 and Choi et al. 21 do not take them into account. In our study, we choose just three of them, given in Eq. (7), to constitute the opponent feature so as to decrease the feature dimension and increase efficiency. Finally, the CGWT representation of an image is given by Eq. (8).
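The following sketch illustrates one way unichrome and opponent features could be computed from precomputed Gabor responses. It assumes, in the spirit of Ref. 19, that a unichrome feature is the energy (root of the summed squared magnitude) of a single-band response, and that an opponent feature is the energy of the difference between two energy-normalized responses at the same orientation and adjacent scales. The exact definitions and the particular subset of opponent features used for CGWT may differ. All function names and the toy responses are hypothetical, and responses are assumed to have nonzero energy.

```python
import math


def energy(response):
    """Root of the summed squared magnitude of a 2-D filter response."""
    return math.sqrt(sum(abs(val) ** 2 for row in response for val in row))


def unichrome_features(responses):
    """responses[c][(u, v)] is the Gabor response of channel c at (u, v)."""
    return {(c, uv): energy(resp)
            for c, channel in responses.items()
            for uv, resp in channel.items()}


def opponent_feature(resp_a, resp_b):
    """Energy of the difference of two energy-normalized responses."""
    e_a, e_b = energy(resp_a), energy(resp_b)  # assumed nonzero
    return math.sqrt(sum(abs(a / e_a - b / e_b) ** 2
                         for row_a, row_b in zip(resp_a, resp_b)
                         for a, b in zip(row_a, row_b)))


def opponent_features(responses, pairs=(("R", "G"), ("R", "B"), ("G", "B"))):
    """Opponent features for same orientation and adjacent scales |v - v'| <= 1."""
    feats = {}
    for c1, c2 in pairs:
        for (u, v), resp_a in responses[c1].items():
            for (u2, v2), resp_b in responses[c2].items():
                if u2 == u and abs(v - v2) <= 1:
                    feats[(c1, c2, u, v, v2)] = opponent_feature(resp_a, resp_b)
    return feats


# Toy 1 x 2 "responses" for one orientation (u = 0) and two scales (v = 0, 1).
resp = {
    "R": {(0, 0): [[1.0, 0.0]], (0, 1): [[0.0, 2.0]]},
    "G": {(0, 0): [[3.0, 0.0]], (0, 1): [[0.0, 1.0]]},
    "B": {(0, 0): [[0.0, 1.0]], (0, 1): [[1.0, 1.0]]},
}
uni = unichrome_features(resp)
opp = opponent_features(resp)
print(len(uni), len(opp))
```

Because both responses are normalized by their own energy before subtraction, two bands whose responses differ only by a scale factor give an opponent feature of zero, which is why the feature captures differences in spatial structure rather than in brightness.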

Extraction of CGOT descriptor
The CGOT representation combines the Gabor texture feature 6 and the opponent feature, which substantially decreases the feature dimension compared with the CGWT representation. Given a grayscale image $I$, the convolution of $I$ with the Gabor kernel $\psi_{u,v}$ of orientation $u$ and scale $v$, $I(z) * \psi_{u,v}(z)$, is given by Eq. (9). The mean $\mu_{u,v}$ and standard deviation $\sigma_{u,v}$ of the transform coefficients are defined by Eq. (10). The Gabor texture feature composed of $\mu_{u,v}$ and $\sigma_{u,v}$ is denoted by $T = \{\mu_{u,v}, \sigma_{u,v}\}$. Then, the CGOT representation of an image is denoted by

$$\mathrm{CGOT} = \{\mu_{u,v}, \sigma_{u,v}, \mathrm{opp}\}. \quad (11)$$
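A minimal sketch of the Gabor texture part is given below. It assumes that the mean and standard deviation are taken over the magnitudes of the transform coefficients and normalized by the number of pixels; the exact normalization is an assumption. The function name `gabor_texture`, the toy responses, and the placeholder opponent values are hypothetical.

```python
import math


def gabor_texture(responses):
    """Mean and standard deviation of coefficient magnitudes per filter.

    responses maps (u, v) to a 2-D list of (possibly complex) coefficients.
    Returns [mu_00, sigma_00, mu_01, sigma_01, ...] in sorted (u, v) order.
    """
    features = []
    for uv in sorted(responses):
        mags = [abs(val) for row in responses[uv] for val in row]
        mu = sum(mags) / len(mags)
        sigma = math.sqrt(sum((m - mu) ** 2 for m in mags) / len(mags))
        features.extend([mu, sigma])
    return features


# Toy response for a single filter: magnitudes 1, 3, 1, 3.
texture = gabor_texture({(0, 0): [[1.0, 3.0], [1.0, 3.0]]})

# A CGOT-style vector concatenates the texture part with opponent features;
# the opponent values here are placeholders, not computed quantities.
opponent_part = [0.5, 0.7, 0.2]
cgot = texture + opponent_part
print(texture, len(cgot))
```

With 8 orientations and 5 scales, this function returns 2 × 40 = 80 values, one mean and one standard deviation per filter.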

Extraction of comparative texture features
Some widely used traditional texture features, namely wavelet texture, LBP, and GLCM, are introduced as comparative methods for quantitative analysis. Before these features are extracted, the color images are converted into intensity images using the equation $\mathrm{gray} = 0.299 r + 0.587 g + 0.114 b$, where $r$, $g$, and $b$ denote the red, green, and blue channels, respectively. Details of the comparative methods are given below.
The wavelet transform plays an important role in texture analysis. Let $I$ be an original image; the extraction procedure is as follows. First, the "haar" wavelet is used to construct two decomposition filters, one low-pass and one high-pass. Then, a 2-level two-dimensional wavelet decomposition is applied to $I$ using these filters, and six subband images are obtained. Note that the decomposition level is an important parameter and the smallest subimage should not be smaller than 16 × 16. 7 Finally, the energy of each subband image is calculated from its coefficients $x(m, n)$, where $M \times N$ is the size of the original image, $1 \le m \le M$, and $1 \le n \le N$. The wavelet texture feature of image $I$ is then defined by the mean and standard deviation of the energy of each subband image, $T = \{\mu_1, \sigma_1, \mu_2, \sigma_2, \ldots, \mu_6, \sigma_6\}$.
LBP describes the local structure of image texture by calculating the differences between each image pixel and its neighboring pixels. Ojala et al. 8 improved the original LBP operator and developed a generalized grayscale and rotation invariant operator $\mathrm{LBP}^{riu2}_{P,R}$, which detects "uniform" patterns and is denoted by

$$\mathrm{LBP}^{riu2}_{P,R} = \begin{cases} \sum_{p=0}^{P-1} s(g_p - g_c), & \text{if } U(\mathrm{LBP}_{P,R}) \le 2 \\ P + 1, & \text{otherwise,} \end{cases}$$

where

$$U(\mathrm{LBP}_{P,R}) = |s(g_{P-1} - g_c) - s(g_0 - g_c)| + \sum_{p=1}^{P-1} |s(g_p - g_c) - s(g_{p-1} - g_c)|$$

and $s(x)$ is defined by

$$s(x) = \begin{cases} 1, & x \ge 0 \\ 0, & \text{otherwise.} \end{cases}$$

$R$ is the radius of the circularly symmetric neighbor set and $P$ is the number of equally spaced pixels on the circle. $g_c$ is the center pixel of the circular neighborhood and $g_p$ $(p = 0, 1, \ldots, P - 1)$ are the neighbor pixels on the circle. $U$ is a uniformity measure corresponding to the number of spatial transitions in the "pattern," and riu2 stands for rotation invariant "uniform" patterns having a $U$ value of at most 2.
In our study, a circular neighborhood of 8 pixels with radius 1, i.e., the $\mathrm{LBP}^{riu2}_{8,1}$ operator, is used, and a grayscale and rotation invariant LBP histogram with a total of 59 bins is adopted.
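The rotation invariant uniform labelling of Ojala et al. can be sketched as follows. The sketch assumes the standard riu2 rule (count the neighbors that are at least as bright as the center if the circular bit pattern has at most two transitions, otherwise assign the label P + 1). As a simplification it uses the 8-connected neighbors directly instead of interpolating the diagonal samples on a circle of radius 1. The function names are hypothetical.

```python
def s(x):
    """Thresholding function: 1 for non-negative differences, 0 otherwise."""
    return 1 if x >= 0 else 0


def lbp_riu2(neighbors, center):
    """riu2 label for P neighbors given in circular order."""
    P = len(neighbors)
    bits = [s(g - center) for g in neighbors]
    # Number of 0/1 transitions around the circle; index -1 wraps to P - 1.
    U = sum(abs(bits[p] - bits[p - 1]) for p in range(P))
    return sum(bits) if U <= 2 else P + 1


# 8-connected neighbors in circular order (simplification: no interpolation).
OFFSETS = [(0, 1), (-1, 1), (-1, 0), (-1, -1), (0, -1), (1, -1), (1, 0), (1, 1)]


def lbp_histogram(image):
    """Histogram of riu2 labels over the interior pixels of a 2-D list."""
    hist = [0] * (len(OFFSETS) + 2)  # labels 0..P for uniform, P + 1 otherwise
    for r in range(1, len(image) - 1):
        for c in range(1, len(image[0]) - 1):
            nb = [image[r + dr][c + dc] for dr, dc in OFFSETS]
            hist[lbp_riu2(nb, image[r][c])] += 1
    return hist


flat = [[5] * 3 for _ in range(3)]
print(lbp_histogram(flat))
```

With P = 8 this labelling takes P + 2 = 10 distinct values, so the histogram in this sketch has 10 bins; a 59-bin histogram is what the non-rotation-invariant uniform operator (u2) produces for P = 8.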
GLCM is a widely used texture analysis method that considers the spatial dependencies of gray levels. In the work by Haralick et al., 5 14 statistical measures extracted from the GLCM are introduced. Nevertheless, many of them are strongly correlated with each other, and there are no definitive conclusions about which features are more important and discriminative than others. How to choose appropriate features for texture analysis from the 14 statistical measures remains an open question. Haralick et al. selected four features, energy, entropy, correlation, and contrast, as texture features, conducted classification experiments on a satellite imagery data set, and obtained good classification results. 5 Considering the good performance of these four features on remote sensing images, energy, entropy, correlation, and contrast are used in our study. They are defined by Eq. (15), where $N_g$ is the number of distinct gray levels in the quantized image and $p_{d,\theta}(i, j)$ is the normalized GLCM for pixel distance $d$ and direction $\theta$. In our study, the pixel distance is set to 1 and four directions $\theta \in \{0^\circ, 45^\circ, 90^\circ, 135^\circ\}$ are chosen. In addition, because the images used have 256 gray levels and an excessive number of gray levels drastically increases the workload of calculating GLCMs, we quantize the images to eight gray levels, which means the GLCM is an 8 × 8 symmetric matrix and $N_g = 8$. Consequently, a total of eight texture features, composed of the mean and standard deviation of the four features in Eq. (15) over the four directions, is obtained.
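The GLCM pipeline described above can be sketched as follows. The formulas used for energy, entropy, correlation, and contrast are the standard textbook definitions and are an assumption here, since the exact form used in the study may differ. The function names, the natural logarithm in the entropy, and the convention of returning a correlation of 1.0 for a constant image are likewise choices made for this example. The input image is assumed to be at least 2 × 2.

```python
import math

# (row, column) offsets for pixel distance 1 at 0, 45, 90, and 135 degrees.
OFFSETS = {0: (0, 1), 45: (-1, 1), 90: (-1, 0), 135: (-1, -1)}


def quantize(image, levels=8, max_gray=256):
    """Map gray values in [0, max_gray) to integer levels in [0, levels)."""
    return [[px * levels // max_gray for px in row] for row in image]


def glcm(image, dr, dc, levels=8):
    """Symmetric, normalized co-occurrence matrix for offset (dr, dc)."""
    counts = [[0] * levels for _ in range(levels)]
    rows, cols = len(image), len(image[0])
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dr, c + dc
            if 0 <= r2 < rows and 0 <= c2 < cols:
                i, j = image[r][c], image[r2][c2]
                counts[i][j] += 1
                counts[j][i] += 1  # count both directions: symmetric matrix
    total = sum(map(sum, counts))
    return [[n / total for n in row] for row in counts]


def haralick4(p):
    """Energy, entropy, correlation, and contrast of a normalized GLCM."""
    n = len(p)
    cells = [(i, j, p[i][j]) for i in range(n) for j in range(n)]
    energy = sum(v * v for _, _, v in cells)
    entropy = -sum(v * math.log(v) for _, _, v in cells if v > 0)
    contrast = sum((i - j) ** 2 * v for i, j, v in cells)
    mu = sum(i * v for i, _, v in cells)  # symmetric matrix: mu_x == mu_y
    var = sum((i - mu) ** 2 * v for i, _, v in cells)
    corr = sum((i - mu) * (j - mu) * v for i, j, v in cells) / var if var > 0 else 1.0
    return energy, entropy, corr, contrast


def glcm_features(image, levels=8):
    """Mean and standard deviation of the four measures over four directions."""
    per_dir = [haralick4(glcm(image, dr, dc, levels)) for dr, dc in OFFSETS.values()]
    feats = []
    for k in range(4):
        vals = [d[k] for d in per_dir]
        mean = sum(vals) / len(vals)
        std = math.sqrt(sum((x - mean) ** 2 for x in vals) / len(vals))
        feats += [mean, std]
    return feats


constant = quantize([[100] * 4 for _ in range(4)])
print(glcm_features(constant))
```

Quantizing to eight levels keeps each matrix at 8 × 8, so the cost of the four measures is negligible compared with accumulating the co-occurrence counts.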

Parameters setting
How to choose optimal parameters for Gabor wavelets remains an open question, because different parameters may lead to different experimental results even for the same problem. In this study, we adopt the default parameters used in Ref. 22, described as follows. Gabor wavelets of five scales $v \in \{0, 1, 2, 3, 4\}$ and eight orientations $u \in \{0, 1, 2, 3, 4, 5, 6, 7\}$, which are used in most cases, are adopted because they extract texture features over a wide range of scales and orientations. For the remaining parameters, $\sigma = 2\pi$, $k_{\max} = \pi/2$, and $f = \sqrt{2}$ are adopted, which can be regarded as empirical values. In addition, the size of the Gabor window is also an important parameter, and it is set to 128 × 128 in this study. A total of 80 Gabor texture features (a mean and a standard deviation for each of the 5 × 8 = 40 filters) is thus obtained.

Similarity Measure
The similarity measure is an indispensable step in image retrieval systems, and different measures may give very different results even for identical query images. Widely used similarity measures, such as Minkowski distance, histogram intersection, K-L distance, and Jeffrey divergence, each tend to have their own scope of application. Therefore, specific similarity measures are defined for certain features in this study.
Given two images $I_i$ and $I_j$ with corresponding CGWT representations $f_i^{CGWT}$ and $f_j^{CGWT}$, the distance measure for CGWT is defined as in Ref. 19 by Eq. (16), where $\sigma^{CGWT}$ is the standard deviation of the CGWT representation over the entire image database. This distance measure has been used to classify texture images with very good classification results. 19
For the CGOT representation, which combines the Gabor texture feature and the opponent feature, we integrate the distance measure of Eq. (16) with the distance measure for Gabor texture features in Ref. 6 and define a much simpler distance measure, Eq. (17), where $\sigma^{CGOT}$ is the standard deviation of the CGOT representation over the entire image database, and $f_i^{CGOT}$ and $f_j^{CGOT}$ are the CGOT representations of images $I_i$ and $I_j$, respectively. Note that Eq. (17) has a form similar to that of Eq. (16) but a different meaning. Since the Gabor texture and opponent features together constitute the CGOT representation, a distance measure that takes both into consideration, as Eq. (17) does, is appropriate. In this similarity measure, the CGOT representation is regarded as a unitary feature, so its components need not be treated separately when calculating the standard deviation $\sigma^{CGOT}$.
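To make the role of the database-wide standard deviation concrete, the sketch below implements one plausible standard-deviation-normalized distance: each feature component is divided by its standard deviation over the database before the squared differences are summed. This form is an assumption chosen for illustration and is not claimed to be identical to Eq. (16) or Eq. (17). The names `component_std` and `normalized_distance` are hypothetical.

```python
import math


def component_std(feature_db):
    """Per-component standard deviation over a list of feature vectors."""
    n = len(feature_db)
    dims = len(feature_db[0])
    stds = []
    for k in range(dims):
        column = [f[k] for f in feature_db]
        mean = sum(column) / n
        stds.append(math.sqrt(sum((x - mean) ** 2 for x in column) / n))
    return stds


def normalized_distance(f_i, f_j, stds):
    """Sum of squared differences, each scaled by its database-wide std.

    Components with zero spread carry no information and are skipped.
    """
    return sum(((a - b) / sd) ** 2 for a, b, sd in zip(f_i, f_j, stds) if sd > 0)


# Toy database: the second component has a five times larger range.
db = [[0.0, 0.0], [2.0, 10.0], [4.0, 20.0]]
stds = component_std(db)
d = normalized_distance(db[0], db[2], stds)
print(d)
```

In this toy database both components contribute equally to the distance despite their different ranges, which is the point of the normalization: no single high-magnitude component dominates the comparison.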

Data Set
To evaluate the performance of the proposed descriptors, eight land-use/land-cover (LULC) classes from the UC Merced LULC data set are chosen as the retrieval image database. The original LULC data set is manually constructed and consists of 21 image classes; the 100 images in each class are 256 × 256 tiles cut from large aerial images of some US regions with a spatial resolution of 30 cm. 23 The LULC data set has been used in many similar studies 24,25 and is publicly available to other researchers. Some image patches of the eight LULC classes used in our experiments are shown in Fig. 2. From left to right, they are agricultural, airplane, beach, buildings, chaparral, residential, forest, and harbor, respectively.

Performance of Proposed Descriptors
Accurate and objective evaluation criteria have long been a topic of interest in the CBIR community. Precision, recall, precision-recall curves, and ANMRR are widely accepted evaluation criteria. However, owing to the semantic gap, evaluating CBIR is not easy. In addition, different evaluation methods may yield different performance even when the same data set is used. 26 To avoid such problems, precision and precision-recall curves are chosen as evaluation methods in this study, because they evaluate the same results from different perspectives. Precision is the fraction of correct retrievals and recall is the fraction of ground truth items retrieved for a given result set. 23
Figure 3 shows the performance of the proposed features and the conventional texture features. The last bin of the histogram, labeled "average," gives the average precision of each feature. The chart indicates that, compared with wavelet texture, the CGOT and CGWT representations perform better on six classes, i.e., airplane, beach, chaparral, residential, forest, and harbor, and worse on the other two classes, i.e., agricultural and buildings. Nevertheless, the two proposed features achieve the highest average precision over all image classes. Moreover, CGOT further improves on the precision of CGWT for agricultural, airplane, beach, buildings, residential, and harbor, and the improvement is particularly obvious for agricultural and harbor owing to the abundant texture information in these image classes.
To further demonstrate the superiority of the proposed representations, precision-recall curves for the different features, obtained by varying the number of returned images, are presented in Fig. 4. As the number of returned images increases, the precision of the conventional texture features decreases rapidly, particularly for GLCM and LBP. Among the remaining three features, CGOT clearly gives the best performance. For the CGWT representation and wavelet texture, a recall of 0.5 can be treated as the crossover point: when recall is below 0.5, the CGWT representation performs better, and the two perform the same when recall exceeds 0.5. These results agree with those in Fig. 3, and both validate the effectiveness and good performance of the proposed color texture descriptors.
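Precision and recall for a ranked result list, and the precision-recall points obtained by varying the number of returned images, can be computed as in the sketch below. The function names and the toy ranking are hypothetical, and `class_size` is assumed to be the number of ground truth images for the query class.

```python
def precision_recall(ranked_labels, query_label, k, class_size):
    """Precision and recall over the first k returned images.

    ranked_labels holds the class labels of the returned images, most
    similar first; class_size is the number of relevant images in total.
    """
    hits = sum(1 for label in ranked_labels[:k] if label == query_label)
    return hits / k, hits / class_size


def pr_curve(ranked_labels, query_label, class_size, ks):
    """One (precision, recall) point per number of returned images."""
    return [precision_recall(ranked_labels, query_label, k, class_size) for k in ks]


# Toy ranking for a "harbor" query with 4 relevant images in the database.
ranked = ["harbor", "harbor", "beach", "harbor", "forest", "harbor"]
curve = pr_curve(ranked, "harbor", class_size=4, ks=[2, 4, 6])
print(curve)
```

As more images are returned, recall can only grow while precision typically falls, which is the trade-off a precision-recall curve makes visible.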

Comparisons of Used Similarity Measures
As mentioned above, an appropriate similarity measure is necessary in CBIR. For the conventional texture features, i.e., GLCM, LBP, and wavelet texture, we choose the $L_2$ distance as the similarity measure. For the CGWT representation, the distance measure presented in Ref. 19 is used. For the CGOT representation, the characteristics of the existing distance measures for Gabor texture, unichrome, and opponent features are considered, and a simpler distance measure is defined. Table 1 compares the performance of the CGOT representation using the proposed similarity measure in Eq. (17) with that obtained using other similarity measures, namely the $L_1$ distance, the $L_2$ distance, Jeffrey divergence, 27 and the distance measure in Ref. 19.
For each group of returned images, the proposed similarity measure achieves the highest precision, and its average performance is the best as well. Table 1 shows that the proposed distance measure is an appropriate and effective similarity measure.

Examples of Remote Sensing Image Retrieval
Figure 5 shows one remote sensing image retrieval example using the two proposed descriptors. Figure 5(a) is the query image from the agricultural class, and Figs. 5(b) and 5(c) are the first 30 retrieved images of CGOT and CGWT, respectively. Note that these images are returned in order of descending similarity, so images ranked nearer the front are more similar to the query image.
According to the retrieval results of two descriptors, CGOT retrieves more similar images than CGWT.In addition, among the first 12 retrieved images, CGOT returns two irrelevant images, while CGWT returns five irrelevant images, which also indicates the better performance of CGOT descriptor.

Discussion
From the previous experiments on remote sensing image retrieval, several conclusions can be drawn.
1. The proposed color texture descriptors, CGWT and CGOT, describe the content of remote sensing images well and achieve good performance compared with wavelet texture, LBP texture, and GLCM texture. The reason is that they take the discriminative information among color bands into consideration.
2. As shown in Fig. 3, CGOT improves the performance of CGWT and achieves the highest average precision over the entire image database, and similar performance is observed in Fig. 4. These results indicate that Gabor texture has better descriptive power than the unichrome feature in terms of image texture.
3. The similarity measure defined for CGOT is appropriate. This shows that the characteristics of a feature should be taken into consideration when defining a similarity measure, because the measure plays an important role in the performance of the proposed representations.
In this study, all experiments are conducted using aerial remote sensing images from one public image database. However, not all of the selected images have a regular texture structure, which affects the performance of the proposed descriptors. In addition, the proposed descriptors are likely to be suitable for hyperspectral image retrieval, because hyperspectral images have high spectral resolution and more discriminative information can be extracted from their bands.

Conclusion
With the rapid development of remote sensing technology, the amount of accessible remote sensing data has been increasing at an incredible rate, which not only provides researchers with more choices for various applications but also brings more challenges. Under these circumstances, CBIR is a good choice for the effective organization and management of massive remote sensing data.
Traditionally, low-level features, particularly texture features, are widely used in the CBIR community for their special characteristics. Nevertheless, conventional texture features tend to be extracted directly from grayscale images and ignore the complementary information between color bands, which is of great importance.
To exploit this complementary information for remote sensing image retrieval, the CGWT and CGOT representations have been proposed based on the Gabor filter and opponent process theory. Images filtered by Gabor filters with five scales and eight orientations are obtained first, and then unichrome features, opponent features, and Gabor texture features are extracted. Finally, the CGWT and CGOT representations are constructed and used in remote sensing image retrieval.
Considering the semantic gap and some other difficulties, two similar evaluations, i.e., precision and precision-recall curves, are chosen to evaluate the performance of all texture features. Results demonstrate that CGWT and CGOT perform better than GLCM, LBP, and wavelet texture, and that CGOT not only improves on CGWT for some image classes but also increases the overall precision over all queried remote sensing images. In addition, a similarity measure for CGOT based on two existing distance measures has been defined. Compared with some widely used distance measures, the proposed similarity measure shows better performance.
In the future, the fusion mechanism of unichrome features and opponent features, Gabor texture and opponent features, as well as the influence of color space on proposed descriptors will be considered.

Fig. 1 Architecture of remote sensing image retrieval based on proposed descriptors.

Fig. 3 Average precision of each image class with the proposed features and other texture features.

Fig. 5 One retrieval example of agricultural image: (a) query image, (b) performance of color Gabor opponent texture, (c) performance of color Gabor wavelet texture.

Table 1 Comparisons of CGOT using different distance measures.