As one of the most popular research topics in current applications of remote sensing, change detection for multitemporal remote sensing images is essentially the process of determining geophysical change information from remote sensing images of the same area acquired at different times.1 Application fields include monitoring a city's dynamic development, updating geographical information databases, etc. As a major application field, urban change detection has played an important role in city planning and management. For moderate- and low-resolution remote sensing images, scholars have proposed various effective change detection methods, and most of these methods can achieve reliable results by comparing images pixel by pixel.2–8
In recent years, meter- and submeter-level high-resolution remote sensing images, represented by SPOT 5, QuickBird, IKONOS, etc., have been widely applied.9 The improvement in spatial resolution not only provides more spectral, textural, and geometric information, but also brings new challenges. First, the phenomenon of "the same object with different spectra" becomes much more serious, while the phenomenon of "the same spectrum for different objects" still exists, making it difficult to differentiate changed areas from unchanged areas.10 Second, urban landscapes include various ecological environments and complex artificial objects. Consequently, it is hard for traditional pixel-oriented change detection methods to incorporate the concept of an "object," and they show poor robustness against the pseudochanges caused by slight spectral differences inside an object. In addition, pixel-oriented change detection methods place high requirements on registration accuracy and radiometric correction and are sensitive to viewpoint changes. Finally, topographic shadow, cloud cover, etc., also complicate change detection. Therefore, it is difficult to directly apply traditional pixel-oriented change detection methods to high-resolution remote sensing change detection.11
Compared with traditional pixel-oriented methods, object-oriented change detection (OOCD) methods choose the geographic object as the basic unit for change detection and provide a new solution to the difficulties mentioned above. OOCD methods extract an object's features based on its natural shape and size, thus improving the category separability of different geographic objects and facilitating deep analysis of the change information inside objects.12,13 Scholars have proposed several effective OOCD methods;14–18 e.g., Miller et al. proposed a method to detect blob changes between gray-scale images, first using connectivity analysis to obtain objects and then finding the matching object in the other image for comparison.14 Lefebvre et al. further validated the application of geometry (i.e., size, shape, and location) and content (i.e., texture) information in OOCD algorithms.15
Currently, OOCD methods for high-resolution remote sensing images face several major challenges. First, meaningful image-objects that represent geographic objects should be completely extracted by a suitable segmentation, but no specific segmentation algorithm can currently claim to be adaptable to all OOCD algorithms. Moreover, in most OOCD algorithms, a great deal of the spectral and texture information generated during image segmentation is used merely to extract the objects and is not fully exploited, especially for object-based feature extraction.16 Second, using the spectral feature of an image alone to describe the change information in objects imposes high requirements on image registration precision and leaves the detection result vulnerable to noise; therefore, extra features, especially texture features, are increasingly applied in change detection. A suitable combination of multiple object-based spectral and texture features can effectively improve the accuracy and reliability of an algorithm.17–19 Finally, change detection results are scale dependent; that is, a single scale is insufficient to capture all the characteristics of objects of different sizes, shapes, etc. Based on the human visual system and expert knowledge, combining multiscale analysis tools with OOCD can analyze the changes of each object across different times more deeply and produce more reliable results than single-scale analysis.20–22 Therefore, designing an effective fusion strategy becomes another critical issue.
Based on the above analysis, this paper proposes a new OOCD approach for high-resolution remote sensing images based on multiscale fusion. The J-segmentation (JSEG) algorithm23 is currently one of the most popular methods for color image segmentation. The proposed approach uses the JSEG algorithm to extract the image-objects and performs multiscale feature extraction and object comparison on the sequence of J-images generated during the segmentation process. Two fusion strategies are then presented to construct an integrated change detection framework and derive the final detection results. Experiments show that both strategies produce satisfying results and have their respective advantages regarding false and missed detections. Finally, the detection results classify object areas into different change intensities.
This paper consists of four sections. The basic principles and specific implementation of the approach are introduced in the next section. Section 3 analyzes and compares the experimental results, and the last section provides the conclusion.
In order to effectively extract, describe, and compare geographic objects from high-resolution remote sensing images, the method proposed in this paper mainly includes three components: object extraction, object analysis and comparison, and multiscale fusion.
The purpose of object extraction is to extract the areas belonging to the same geographic objects through segmentation. The JSEG algorithm proposed by Deng and Manjunath is a multiscale color texture segmentation method that shows a strong detection capability for the homogeneity of regional color texture features and has been successfully applied to remote sensing image segmentation.24,25
During the JSEG process, a sequence of multiscale J-images is generated. A J-image reflects the color distribution of the original image; in essence, it is a gradient image with scale features. Therefore, for J-images at the same scale from different multitemporal images, a similar gray-value description of a certain object in the segmentation results actually reflects the overall similarity of this object's spectrum, texture, and scale features in the different temporal images. In this manner, the limitations mentioned above of using only the spectral feature of the original image can be effectively overcome. It also means that there is no need to recalculate multiscale images for the subsequent multiscale change detection. Compared with well-known commercial software such as eCognition, the JSEG algorithm not only implements precise image segmentation, but its J-images can also be used directly for further object-based feature extraction and comparison. In addition, the JSEG algorithm gives the proposed change detection framework better transparency and robustness. For these reasons, objects in this paper are extracted using the JSEG algorithm, which includes two steps: color quantization and space segmentation.
Color quantization applies the method proposed by Deng et al.26 First, the image is converted to the LUV color space. Then, peer group filtering is used to smooth and denoise the image. Finally, the quantized image is obtained by applying the classic Hard C-means algorithm.
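The clustering step can be sketched as follows: a minimal NumPy version of Hard C-means (k-means with hard assignments). The deterministic initialization and the class count are illustrative choices, and the peer group filtering step is omitted here.

```python
import numpy as np

def hard_c_means(pixels, n_classes=4, n_iter=20):
    """Cluster pixel colors into n_classes quantized levels (Hard C-means).

    pixels: (N, 3) array of color values (e.g., in LUV space).
    Returns the class label of each pixel and the class centers.
    """
    pixels = np.asarray(pixels, dtype=float)
    # Deterministic initialization: centers spread across the pixel list.
    idx = np.linspace(0, len(pixels) - 1, n_classes).astype(int)
    centers = pixels[idx].copy()
    for _ in range(n_iter):
        # Assign each pixel to its nearest center (hard assignment).
        d = np.linalg.norm(pixels[:, None, :] - centers[None, :, :], axis=2)
        labels = d.argmin(axis=1)
        # Recompute each center as the mean of its assigned pixels.
        for c in range(n_classes):
            if np.any(labels == c):
                centers[c] = pixels[labels == c].mean(axis=0)
    return labels, centers

# Example: quantize a toy 8x8 "image" containing two distinct colors.
img = np.zeros((8, 8, 3))
img[:, 4:] = [100.0, 20.0, 5.0]       # right half has a different color
labels, _ = hard_c_means(img.reshape(-1, 3), n_classes=2)
class_map = labels.reshape(8, 8)      # the quantized image (class map)
```

The resulting class map of quantized labels is the input to the J-value computation in the space segmentation phase.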
In the space segmentation phase, a local homogeneity index, the J value, is calculated from the quantized image, thereby generating the J-image sequence. The detailed process of the subsequent segmentation in JSEG can be found in Refs. 23 and 26.
In particular, the J value is defined as follows. Let the value of each pixel z in the quantized image be its location, z = (x, y), and z ∈ Z, where Z is the set of all pixels inside a specific-sized window centered on the pixel under consideration. Figures 1 and 2 show two such windows of different sizes centered on the pixel. In order to maintain consistency in each direction, the corners of each window are removed.
The J value can then be calculated according to the following formulas. Suppose Z is classified into C classes Z_i (i = 1, ..., C) according to the quantized colors, m is the mean position of all pixels in Z, and m_i is the mean position of the pixels in Z_i. With the total variance S_T = Σ_{z∈Z} ||z − m||² and the within-class variance S_W = Σ_{i=1}^{C} Σ_{z∈Z_i} ||z − m_i||², the J value is J = (S_T − S_W) / S_W.
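This computation can be sketched as follows: a minimal NumPy version of the standard JSEG J value over a window of the class map, where a high J indicates spatially separated classes (a likely region boundary) and a low J indicates uniformly mixed classes (a region interior). The window contents are toy examples.

```python
import numpy as np

def j_value(class_map):
    """JSEG J value over a window of the quantized class map.

    class_map: 2-D array of quantized class labels inside the window.
    J = (S_T - S_W) / S_W, where S_T is the total spatial variance of
    the pixel positions and S_W is the within-class spatial variance.
    """
    ys, xs = np.indices(class_map.shape)
    pts = np.stack([ys.ravel(), xs.ravel()], axis=1).astype(float)
    labels = class_map.ravel()
    m = pts.mean(axis=0)
    s_t = ((pts - m) ** 2).sum()                      # total variance S_T
    s_w = 0.0
    for c in np.unique(labels):
        cls = pts[labels == c]
        s_w += ((cls - cls.mean(axis=0)) ** 2).sum()  # within-class S_W
    return (s_t - s_w) / s_w

# Two classes split into halves (a boundary) vs. mixed like a checkerboard.
half = np.zeros((6, 6), dtype=int); half[:, 3:] = 1
checker = np.indices((6, 6)).sum(axis=0) % 2
```

For the half-split window the classes are spatially separated, so J is large; for the checkerboard the class means coincide with the window center, so J is essentially zero.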
Object Analysis and Comparison
In view of the above-mentioned characteristics of the J-image, we separately analyze and compare each object in the multiscale J-images based on the segmentation results. At this point, it is critical to select an appropriate similarity measurement to describe the similarity of a certain object at different times. Common measurements include various "distances" such as the Euclidean distance and the Mahalanobis distance, histogram matching, covariance, etc. Structural similarity (SSIM),27 first proposed by Wang et al., takes the mean value, variance, and covariance of vectors into account and can therefore express the similarity between vectors well. The SSIM between vectors x and y is defined in Eq. (2) as SSIM(x, y) = [l(x, y)]^α [c(x, y)]^β [s(x, y)]^γ, with the luminance, contrast, and structure terms given in Eqs. (3) to (5): l(x, y) = (2 μ_x μ_y + C_1)/(μ_x² + μ_y² + C_1), c(x, y) = (2 σ_x σ_y + C_2)/(σ_x² + σ_y² + C_2), and s(x, y) = (σ_xy + C_3)/(σ_x σ_y + C_3).
In Eqs. (2) to (5), μ_x, μ_y, σ_x, σ_y, σ_x², and σ_y² refer to the mean values, standard deviations, and variances of x and y, respectively, and σ_xy refers to the covariance between x and y. α, β, and γ are the weights of the three terms, and C_1, C_2, and C_3 are constants added to the formulas to prevent instability when a denominator approaches zero.
When α = β = γ = 1 and C_3 = C_2/2, Eq. (2) can be simplified as Eq. (6): SSIM(x, y) = (2 μ_x μ_y + C_1)(2 σ_xy + C_2) / [(μ_x² + μ_y² + C_1)(σ_x² + σ_y² + C_2)].
The larger the SSIM is, the smaller the change in the object between the multitemporal images and the higher the similarity. In addition, according to its definition, SSIM has the following characteristics: (1) it is bounded, with SSIM(x, y) ≤ 1; (2) it is symmetric: SSIM(x, y) = SSIM(y, x); (3) it has a unique maximum: SSIM(x, y) = 1 if and only if x = y. Normally, a similarity measurement satisfying these three criteria is considered to describe the similarity of vectors well.
Compared with SSIM, the various "distances" do not satisfy the boundedness property, histogram matching is not symmetric, and covariance does not meet the unique-maximum criterion. Consequently, this paper selects SSIM to describe the similarity of each object between the multitemporal images. For the J-image at a certain scale, the SSIM of every object in the segmentation results is calculated to obtain the change detection result at that single scale.
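The simplified measure of Eq. (6) can be sketched as follows: a minimal NumPy version computed over the gray values of one object in the two temporal J-images. The constants c1 and c2 here are placeholders, not the settings used in the experiments.

```python
import numpy as np

def ssim(x, y, c1=1e-4, c2=9e-4):
    """Simplified SSIM (the Eq. (6) form) between two feature vectors.

    x and y hold the gray values of the same object in the two
    temporal J-images; c1 and c2 are small stabilizing constants.
    """
    x, y = np.asarray(x, dtype=float), np.asarray(y, dtype=float)
    mx, my = x.mean(), y.mean()
    vx, vy = x.var(), y.var()
    cov = ((x - mx) * (y - my)).mean()
    return ((2 * mx * my + c1) * (2 * cov + c2)) / (
        (mx ** 2 + my ** 2 + c1) * (vx + vy + c2))

obj = np.array([0.2, 0.4, 0.6, 0.8])          # object pixels at time 1
same, inverted = ssim(obj, obj), ssim(obj, 1 - obj)
# 'same' is 1 (identical object); 'inverted' is far below 1 (strong change).
```

The three properties above are easy to verify numerically: the measure is at most 1, symmetric in its arguments, and reaches 1 only for identical inputs.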
Considering the dependence of the objects and the changes on scale, and in order to improve change detection precision, two multiscale fusion strategies are presented in the proposed approach.
Fusion strategy 1 is based on Dempster-Shafer (D-S) evidence theory,28 which analyzes the whole system through multisource information and thereby supports making the right decision. D-S evidence theory is an effective tool for solving uncertain reasoning problems; its basic concepts are explained below.
Let Θ be defined as a recognition framework (frame of discernment). A basic probability assignment function (BPAF) is defined as a function m: 2^Θ → [0, 1] satisfying m(∅) = 0 and Σ_{A⊆Θ} m(A) = 1.
In fusion strategy 1, the D-S recognition framework is defined as Θ = {c_1, c_2, u}, where c_1 stands for dramatically changed objects, c_2 refers to obviously changed objects, and u means unchanged objects. Thus, the nonempty subsets of Θ include {c_1}, {c_2}, {u}, and Θ itself. For each object O_k (k = 1, 2, ..., K, with K being the total number of objects in the segmentation results), define SSIM_k^s as the SSIM of O_k between the multitemporal J-images at the same scale s, and the corresponding BPAF is established from SSIM_k^s. Since the small scale is suited to detecting the detailed changes of objects, while detection at the large scale can effectively reduce the interference from noise and isolated points, the parameter values in the approach need to be set manually based on experience or the actual requirements of specific applications.
Step 1: For each object, combine the BPAFs obtained at the different scales according to Eq. (8) to derive the fused masses of the dramatic-change, obvious-change, and unchanged hypotheses and of the full framework Θ.
Step 2: If the fused evidence supports the dramatic-change hypothesis beyond its decision threshold, the object is classified as dramatically changed.
Step 3: Otherwise, if the fused evidence supports the obvious-change hypothesis beyond its decision threshold, the object is classified as obviously changed.
Step 4: Otherwise, the object is unchanged.
Step 5: Repeat steps 1 to 4 until all objects in the segmentation results have been processed.
In order to further confirm that, compared with single-scale detection, a multiscale fusion strategy can effectively improve detection precision and yield more reliable results, fusion strategy 2 uses weighted data fusion. Define w_s (with Σ_s w_s = 1) as the weight for the detection result at each scale s. The decision rule for fusion strategy 2 can be explained as follows:
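A minimal sketch of this weighted data fusion is given below; the weights and the decision thresholds are illustrative, not the settings used in the experiments.

```python
def weighted_fusion(ssim_per_scale, weights):
    """Weighted data fusion of single-scale SSIM values for one object.

    The weights must sum to 1; the fused score is then thresholded to
    assign a change-intensity level (thresholds here are illustrative).
    """
    assert abs(sum(weights) - 1.0) < 1e-9
    fused = sum(w * s for w, s in zip(weights, ssim_per_scale))
    if fused < 0.4:        # illustrative threshold for dramatic change
        return "dramatic change"
    if fused < 0.7:        # illustrative threshold for obvious change
        return "obvious change"
    return "unchanged"

# Low SSIM at every scale: the object is classified as dramatically changed.
label = weighted_fusion([0.35, 0.30, 0.45], [0.5, 0.3, 0.2])
```

Compared with the decision fusion of strategy 1, this rule fuses the raw single-scale scores before any decision is made, which is why it tends to recover more changed pixels (a lower miss rate) at the cost of more false alarms.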
Specific Implementation of Approach
As presented above, the specific implementation process of the proposed approach is illustrated in Fig. 3.
As shown in Fig. 3, the two temporal remote sensing images first need to be radiometrically corrected and geometrically registered. Then, the JSEG algorithm is used to extract objects. It should be noted that the temporal image with less noise or shadow is chosen for segmentation in the proposed approach. In order to extract the same geographic objects, the boundaries of the segmentation results are mapped directly to all J-images from the different temporal images based on the registration results. Alternatively, each temporal image can be segmented separately and all the segmented boundaries mapped to the J-images from the different temporal images based on the registration results. In either case, the J-image sequences of the multitemporal images must be calculated with the same set of window sizes, and both segmentation approaches are allowed in the proposed change detection framework.
In the phase of object analysis and comparison, the corresponding region of each object is found in every J-image from the different temporal images based on the segmentation and registration results. On this basis, the SSIM of each object at a single scale is calculated according to Eq. (6). Finally, the detection results from the multiple scales are fused according to the fusion strategies proposed in Sec. 2.3, and the entire detection process is accomplished.
Experiment Results and Analysis
For the purpose of comprehensively analyzing the performance of the proposed approach, this paper not only compares the method with traditional pixel-oriented and OOCD algorithms, but also analyzes the effects that changes of scale and fusion strategy have on the detection results. In addition, in order to further test the validity and reliability of the approach on remote sensing images from different sensors, two different types of datasets were selected for the experiments.
For pixel-oriented change detection, we choose the classic change vector analysis (CVA) method and the improved CVA-expectation-maximization (CVA-EM) algorithm29 proposed by Bruzzone et al. for comparison. The CVA-EM algorithm uses the difference image generated by the CVA method and introduces the EM algorithm to estimate the relevant parameters of the Gaussian model, which yields an obviously higher detection precision. Experiments were performed on both datasets, with the branch number of the Gaussian mixture model and the initial values for the EM algorithm set in the same way as in Ref. 29.
As for object-oriented methods, this paper uses the multiscale object-specific approach (MOSA)30 proposed by Hall et al. for comparison. MOSA extracts objects using multiscale marker-controlled watershed segmentation; it then thresholds the difference image adaptively to obtain the final change results, which can effectively identify scale-related change information. Hall et al. believe that for the MOSA method the finest scale produces the best detection results; therefore, this paper only evaluates the detection precision of MOSA at that scale.
Analysis of Experiment Results on Dataset 1
Images #1 and #2 were selected as dataset 1 for the experiments, as shown in Figs. 4(a) and 4(b). They are airborne remote sensing digital orthophoto map images acquired in March 2009 and February 2012, respectively, over the Jiangning campus of Hohai University, Nanjing, Jiangsu province, China. Dataset 1 has a spatial resolution of 0.5 m.
Images from datasets 1 and 2 (see Fig. 5) were acquired in early spring (February to March) and late spring (June to July), respectively, which means that the vegetation types are similar, which is helpful for change detection. The matching precision for both datasets is maintained within 0.5 pixel after radiometric correction and geometric accuracy correction. A comparison between the two datasets, as shown in Figs. 4 and 5, indicates several aspects of the complexity and typicality of the scenes: they all include typical changes, i.e., obvious changes of complex artificial objects over large areas and small changes such as in tiny plants; images in both datasets contain various geographic objects such as vegetation, lakes, roads, and buildings. In addition, affected by illumination changes, there are large areas of shadow in image #2 of dataset 1; image #1 was therefore chosen for segmentation.
In Eq. (6), the constants C_1 and C_2 were set empirically. In fusion strategy 1, the decision thresholds and the scale weights were likewise set empirically. In order to fairly compare the two strategies, the weight values in fusion strategy 2 were set the same as in strategy 1.
The final change detection results of the two fusion strategies on dataset 1 are shown in Figs. 9(a), 9(b), and 9(c). In these figures, areas with different colors refer to objects belonging to dramatically changed, obviously changed, and unchanged areas, respectively.
For the convenience of visual analysis, the locations of typically changed ground objects on the Jiangning campus of Hohai University during 2009 to 2012 are marked in dataset 1 with letters A to D. The changed items include buildings, a basketball court, vegetation, and other irregular artificial objects. Location A is the newly built gymnasium of the university; location B is the new basketball court adjoining the new handball field; location C is a degraded lawn; and D is a temporary house.
Visual observation and comparison among Figs. 4, 9, and 10 reveal the following: (1) Both the CVA and CVA-EM algorithms largely miss the basketball court and handball field at location B, and the MOSA method performs poorly in detecting changes in complex structures such as location D. (2) Both fusion strategies can effectively detect the change information at the four marked locations. The detection results under the two strategies are substantially the same for regular anthropogenic objects such as A and B, while they differ in areas with a complex background mixing various objects, such as location D. They also determine the intensity levels of changes differently in some areas, such as location C. In addition, fusion strategy 2 detects more changed areas in the whole scene. (3) The large blocks of shadow in image #2 result in a significant number of false alarms with the CVA and CVA-EM methods. However, the object-oriented MOSA and the algorithm proposed in this paper can effectively reduce the interference from shadows, as in the road areas on the right side of location A.
In order to further quantitatively analyze the performance of the different detection methods, based on field visits and visual observation of the detection results, a sample set of 7523 changed pixels and 8861 unchanged pixels was selected as the reference data. The overall accuracy, false alarm rate, miss detection rate, and Kappa index were calculated to evaluate the performance of each method, with the results listed in Table 1.
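These four measures follow directly from the change/no-change confusion matrix. A minimal sketch is given below; the confusion counts are an illustrative split of a 7523/8861 reference sample, not the actual experimental counts.

```python
def detection_metrics(tp, fp, fn, tn):
    """Accuracy measures from a change/no-change confusion matrix.

    tp: changed pixels detected as changed; fp: unchanged detected as
    changed (false alarms); fn: changed pixels missed; tn: unchanged
    pixels correctly detected.
    """
    n = tp + fp + fn + tn
    overall = (tp + tn) / n
    false_alarm = fp / (fp + tn)     # fraction of unchanged flagged
    miss = fn / (tp + fn)            # fraction of changed missed
    # Kappa: agreement beyond what class proportions would give by chance.
    pe = ((tp + fp) * (tp + fn) + (fn + tn) * (fp + tn)) / (n * n)
    kappa = (overall - pe) / (1 - pe)
    return overall, false_alarm, miss, kappa

# Illustrative counts: 7523 changed and 8861 unchanged reference pixels.
overall, fa, miss, kappa = detection_metrics(tp=6200, fp=980, fn=1323, tn=7881)
```

The Kappa index corrects the overall accuracy for chance agreement, which is why it drops faster than the overall accuracy when one class dominates the sample.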
Detection accuracy of different methods for dataset 1.
|Methods/parameters|Overall accuracy (%)|False alarm rate (%)|Miss detection rate (%)|Kappa index|
|Fusion strategy 1|87.3|11.12|17.21|0.7212|
|Fusion strategy 2|86.8|12.95|15.96|0.7074|
Based on Table 1, the following can be observed: (1) The OOCD approach proposed in this paper is clearly better than MOSA and the two pixel-oriented detection methods, which is consistent with the visual analysis. The overall accuracies and Kappa indexes of the two fusion strategies are 87.3% and 0.7212, and 86.8% and 0.7074, respectively, and their false alarm rates are considerably lower than those of the two pixel-oriented algorithms. Even though fusion strategy 1 has a slightly higher miss detection rate than the MOSA algorithm, its false alarm rate is lower and its overall accuracy is higher. (2) Strategy 1 applies decision fusion based on D-S evidence theory and yields the best performance in the experiments, even though it has a slightly higher miss detection rate than strategy 2. (3) Strategy 2 adopts weighted data fusion of the detection results at different scales; its false alarm rate is a little higher than that of the CVA-EM method, but its miss detection rate is the lowest in the experiments.
Analysis of Experiment Results on Dataset 2
Dataset 2 consists of SPOT 5 pan-sharpened multispectral images #3 and #4 with a spatial resolution of 5 m, as shown in Figs. 5(a) and 5(b). They are fused from four SPOT 5 bands: the panchromatic, red, green, and near-infrared bands. Images #3 and #4 were acquired in June 2004 and July 2008, respectively, over Shanghai, China.
Compared with dataset 1, dataset 2 has a lower spatial resolution and a more complex background. Therefore, smaller windows were used for object extraction in the experiment with the proposed approach, and the constants, thresholds, and weights were set accordingly. The detection results are shown in Figs. 11(a), 11(b), and 11(c).
With reference to the previous experiments, a sample set containing 7523 changed pixels and 8861 unchanged pixels in the images was selected as the change reference. The accuracy parameters calculated for the different methods are shown in Table 2.
Detection accuracy of different methods for dataset 2.
|Methods/parameters|Overall accuracy (%)|False alarm rate (%)|Miss detection rate (%)|Kappa index|
|Fusion strategy 1|85.2|13.75|16.18|0.7058|
|Fusion strategy 2|85.1|14.83|15.42|0.6996|
Table 2 can be summarized in the following aspects: (1) The performance of the different algorithms on dataset 2 is basically consistent with the conclusions obtained from dataset 1, which further validates the effectiveness and reliability of the proposed method. Compared with traditional pixel-oriented change detection methods, the method proposed in this paper significantly improves detection precision for high-resolution remote sensing images. In addition, compared with the traditional object-oriented MOSA method, the proposed detection algorithm yields better accuracy parameters, except for a slightly higher miss detection rate under fusion strategy 1. (2) The overall detection accuracy of each algorithm on dataset 2 is lower than on dataset 1, which is mainly driven by the lower spatial resolution of the images in dataset 2: the reduction in resolution increases the proportion of mixed pixels containing multiple objects in the scene. (3) The results on the two datasets indicate that fusion strategy 1 effectively controls the false alarm rate, while strategy 2 effectively reduces the miss detection rate.
Scale Dependence and Fusion Strategy Analysis
In order to analyze the dependence of change on scale and the effects of the two fusion strategies on detection results, further comparisons are performed in two aspects: accuracy parameters of detection results and area proportion of regions with different change intensity.
With reference to the previous two experiments, the detection results at each J-image scale and the corresponding accuracy parameters are illustrated in Figs. 13(a) to 13(d). In these figures, the dotted curves represent dataset 1 and the solid curves represent dataset 2.
The following conclusions can be drawn from the comparison of the detection accuracy parameters in Fig. 13 at different scales and under the different fusion strategies: the change detection results at the individual scales differ markedly, and the corresponding detection precisions are lower than those under the fusion strategies. Therefore, applying multiscale fusion to the single-scale detection results effectively improves the detection precision and the reliability of the algorithm. A comparison of Tables 1 and 2 with Fig. 13 indicates that even the single-scale object-oriented results of this paper are still clearly better in overall accuracy than those of the CVA and CVA-EM algorithms.
Tables 3(a) and 3(b) list the proportions of area at each change intensity level in the detection results of both fusion strategies.
Proportion of areas at different change intensity levels (%).

|Fusion strategy/change intensity|Dramatic change (%)|Obvious change (%)|No change (%)|Total (%)|
|(a) Dataset 1|
|(b) Dataset 2|
As shown in the tables, the dramatically changed areas detected under the two strategies mostly overlap (Figs. 9 and 11) and are basically of the same size for the same dataset (the proportions are 10.2% to 11.3% for dataset 1 and 16.1% to 18.7% for dataset 2). Thus, dramatically changed areas can be regarded as the areas where actual changes are most likely to have occurred and should be the primary detection target in practice. Obviously changed areas can be treated as the "hot areas" for the next phase of field investigations.
This paper established an integrated OOCD framework based on multiscale fusion and evaluated its detection performance. The following conclusions can be drawn:
1. The proposed detection framework is effective and reliable for urban change detection in high-resolution remote sensing images. The use of the JSEG algorithm not only achieves accurate extraction of the objects in the scene, but also exploits the multiple features contained in the J-image sequence to perform change detection, and the final results are acquired by further applying two different fusion strategies. The experiments show that this method overcomes the uncertainty of single-scale detection, thus producing detection results that are closer to the real changes. In addition, thanks to the J-image's multiple features, the calculation of SSIM between objects based on J-images is less susceptible to noise, and the interference from shadows in city scenes is effectively reduced, so that the actual change locations can be narrowed down and identified, thereby increasing detection precision.
2. Compared with the traditional pixel-oriented and object-oriented detection methods, the approach proposed in this paper has obviously higher precision. In the experiments on the two datasets, the algorithm performs better than the two pixel-oriented detection algorithms even at a single scale, which shows that pixel-oriented change detection algorithms can hardly satisfy the demands of high-resolution remote sensing images.
3. Both fusion strategies in the framework have their own advantages. Strategy 1 can effectively control the false alarm rate, while strategy 2 is better at reducing the miss detection rate. In practice, the actual demands need to be taken into consideration in order to select the appropriate fusion strategy.
4. Dramatically changed areas detected by both strategies can serve as the primary target areas for fieldwork, and obviously changed areas can then be examined as important prospecting areas. This division of change intensity can provide valuable reference information for fieldwork, thereby reducing workload and saving resources.
Hence, future work will focus on further improving the detection precision of the proposed framework and on the application of multiscale analysis in OOCD algorithms.
This study is supported by the National Natural Science Foundation of China (No. 61271386), the Industrialization Project of Universities in Jiangsu Province (No. JH10-9), and Youth Foundation of Nanjing Institute of Technology (No. QKJA201204).
Chao Wang received his master's degree in communication and signal processing from China University of Mining and Technology, China, in 2010. He is currently a PhD candidate in computer science at Hohai University, China. His major research interests include remote sensing image processing and pattern recognition.
Mengxi Xu received her MSc degree from UNESCO-IHE in the Netherlands in 2007. She is now a lecturer in the School of Computer Engineering at Nanjing Institute of Technology. Her current research interests include image processing and information processing systems and their applications.
Xin Wang is in the College of Computer and Information, Hohai University, China. She obtained her Ph.D. degree from Nanjing University of Science and Technology of China in 2010. Her research interests include image processing, pattern recognition, and computer vision.
Shengnan Zheng received her master's degree in signal and information processing from the Department of Electronic Information Engineering and Technology, Hohai University, in 2010. Her research orientation is digital image processing. She is now an assistant engineer of computer engineering at Nanjing Institute of Technology.