A rapid earthquake damage assessment, right after a seismic event, can address civil protection interventions toward the most affected areas. Satellite data can be very useful for this purpose, thanks to the wide coverage, the high spatial resolution, and their intrinsic capability to provide a synoptic view over remote regions all over the world. The damage assessment is based on change detection techniques capable of observing an object at different times and of identifying changes. The basic premise in such a use of satellite data is that changes in land cover correspond to changes in radiance values. These latter must be large with respect to radiance changes caused by other disturbing factors (i.e., atmospheric conditions, soil moisture, etc.).1,2 Both optical and radar sensors can be exploited for change detection purposes and several examples are reported in the literature.
Synthetic aperture radar (SAR) is strongly sensitive to the surface changes and the signal is not affected by weather and lighting conditions that typically affect the observations in the optical spectral range. Serious damage or building collapses due to earthquakes modify the observed scenario and its electromagnetic behavior, thus allowing their detection by comparing pre- and postdisaster SAR images.
An index to estimate the damage level from SAR data combining amplitude changes and the related correlation coefficient has been proposed by Refs. 34.–5, for the Hyogoken-Nanbu and the Izmit earthquakes. For this latter case, the extracted damage distributions were in good agreement with the actual situations investigated by field surveys. Hoffman et al.6 defined a damage index based on interferometric SAR (InSAR) complex coherence, which takes into account the phase of the echoed signal. Yonezawa and Takeuchi7 compared changes in the SAR backscattering intensity and phase with the damage observed in Kobe city. This showed that intensity decorrelation and interferometric coherence reduction behave similarly. Matsuoka and Yamazaki8 investigated the different appearances of undamaged and damaged buildings in SAR imagery. By comparing pre- and postevents European Remote Sensing and Japanese Earth Resources Satellite SAR intensity data of the 1995 Hyogoken-Nanbu earthquake, the authors found out that the SAR backscatter values decrease with increasing damage. Ito et al.9 assessed different SAR change indicators, derived from L-band and C-band sensors and evaluated the frequency-dependent effects of spatial and temporal decorrelations.
Stramondo et al.10 applied SAR intensity correlation and complex coherence, focusing on the 1999 Izmit earthquake (Turkey) and on the 2003 Bam earthquake (Iran). The authors found out that the intensity correlation and the coherence detect different kinds of changes on the ground.
Pan and Tang11 investigated the relationship between building damage level and the differences of the backscattered SAR intensity values in pre- and postdisasters ALOS PALSAR imagery for the 2008 Wenchuan earthquake. Based on ENVISAT ASAR data, Chini at al.12 reported a decrease of the SAR backscattering in a damaged urban area due to the 2011 Tohoku (Japan) earthquake that was caused by the decrease of the double bounce effect.
Optical data can also provide valuable information on settlement conditions after an earthquake.13 The spatial resolution of satellite optical sensors has rapidly improved in the last few years, reaching (WorldView, QuickBird, EROS B) for panchromatic imaging types, thus becoming a reliable tool for detecting changes of individual buildings. However, the information content of high-resolution images can be affected by differences in the acquisition time and changes in the sight angle. The presence of shadows, variation in Sun illumination, and geometric distortions may prevent using automatic damage detection procedures,2 and the most used method for damage assessment from space images is still visual interpretation.
By exploiting both optical and radar sensors, and using change features achieved with statistical analysis, a more accurate and reliable damage mapping can be obtained. Bignami et al.14 and Stramondo et al.10 have analyzed the possible advantages of combining radar and optical satellite data.
As far as the optical data are concerned, the most significant results are related to the normalized difference index (NDI), image rationing (IR),15 Kullback–Libler divergence (KLD),16 and mutual information (MI) indices.17 All these features show a good sensitivity to the collapse ratio (CR). IR and NDI have very similar behaviors, whereas NDI is less sensitive to possible calibration errors, atmospheric effects (haze), and different illuminations of the scene in the two images. The SAR data methodologies indicate that phase coherence is not a good damage indicator, as carried out during the EU funded project advanced procedures for volcanic and seismic monitoring (APhoRISM),18 because of its sensitivity to other changes (e.g., geometric decorrelation) in the scene not related to damage, and is not able to distinguish different grades of damage. The intensity correlation difference (ICD) showed itself to be a very good damage proxy.10 KLD and MI19 are also suitable features that can contribute to damage estimation.
In this paper, we used the NDI, KLD, and MI indices for the optical data, and the ICD and the KLD parameters for the SAR data.
In order to analyze the capability of previous parameters to correctly detect damaged areas, two different classifiers were used: the Naive Bayes (NB) and the support vector machine (SVM).
The main innovation of the proposed analysis concerns the estimation of derived features at object scale. In order to exploit the object-oriented approaches, it is, therefore, necessary to perform a segmentation of the study area into several regions. Aiming at testing an operational approach, we have segmented and then generated the damage map of Port-au-Prince by considering a set of polygons extracted from the open source open street map (OSM) geo-database. The resulting damage map was calculated in terms of CR: this quantity is defined considering a city block and by calculating the number of collapsed buildings with respect to the total number of buildings within the block. The CR has been calculated by using ground truth (GT) information collected during a postearthquake survey and available from the Joint Research Centre (JRC) database.
In a preliminary analysis, we have attempted to produce a damage map by using supervised classification algorithms, with the objective to evaluate the performance of selected features to estimate the damage level. Furthermore, an unsupervised algorithm to estimate the damage level by satellite features, named in the paper as features stepwise thresholding (FST), has been proposed and tested.
Case Study and Datasets
The investigated case study is the earthquake that hit Haiti on January 12, 2010 (Mw 7.0). The available dataset is made up of two GeoEye-1 satellite optical images (one preseismic and one postseismic), three TerraSAR-X SAR satellite images (two preseismic and one postseismic), and a GT survey.
Regarding the optical data, the preseismic image was acquired on October 1, 2009, with a nominal collection elevation of 65.9 deg, whereas the postseismic one was acquired on January 13, 2010 (one day after the earthquake), with a nominal collection elevation of 86.9 deg. Such images consist of four spectral bands (blue/green/red/near-infrared) at 2 m ground resolution. The two optical images have been ortho-rectified by using the ASTER GDEM (available in Ref. 20) with a horizontal resolution of 1 arc sec (30 m). Then, with the aim to obtain panchromatic bands, the Geo-Eye-1 RGB spectral bands have been averaged for each pre- and postseismic image. In order to calculate the change features, both pre- and postseismic images need to be geometrically and radiometrically comparable. Therefore, they were spatially coregistered by manually selecting a set of control points, and processed by using the “flat field” procedure, available on the ENVI software, for radiometric correction.21 Figure 1 shows the location map of the study area and footprints of the employed satellite data.
Regarding the SAR data, the two preseismic images are dated May 1, 2009, and October 13, 2009, whereas the postseismic one was acquired on January 20, 2010. Such images, collected by the German TerraSAR-X mission, are X-band data in HH polarization and were acquired in Stripmap Mode, on beam 011 and right descending view, with 3 m per pixel ground resolution and an incidence angle at the center swath of 39 deg.
As already introduced, the GT has been expressed in terms of CR. With this aim in mind, we used the data from JRC where a building-by-building damage evaluation is reported in EMS98 scale. For each OSM polygon of the city, we have considered the damage grade 5, i.e., totally collapsed, to calculate the CR. Note that only polygons containing at least 10-surveyed buildings are considered in the analyses aiming at obtaining a self-consistent dataset. A total of 1513 objects have been finally considered.
The resulting GT map is shown in Fig. 2.
Methodology and Features for the Change Detection
Previous studies, carried out during the APhoRISM project, allowed us to select features having a good sensitivity to damage at the object scale. In particular, concerning the optical data, we identified the NDI, and two quantities coming from the information theory, namely the KLD and the MI. In addition, for the SAR data, we picked out the ICD and the KLD parameter.
The NDI parameter is defined asFigure 3 shows the resulting NDI map.
Actual NDI values range from 0.10 to 0.33, with the highest values centered on the west and northeast areas of the city (see Fig. 3), where the largest number of buildings are collapsed (see also the CR map in Fig. 2 for a visual and qualitative comparison).
The KLD parameter is defined asFig. 4).
The MI index measures the correlation between pre- and postseismic imagesFig. 5).
As far as the SAR data are concerned, the KLD feature is derived in the same way as the optical data, whereas the ICD can be computed on the base of the Pearson correlation coefficient () estimated on the preseismic SAR image pair, i.e., the preseismic intensity correlation, and on the coseismic SAR image pair, i.e., the coseismic intensity correlation
The behavior of the features above (NDI, KLD, and MI for optical data; ICD and KLD for SAR data) was analyzed as a function of the CR obtained from the GT survey. As examples, Fig. 6 shows the trends of NDI and ICD parameters as a function of the CR.
The relationship between the considered features and the CR is unknown a priori. With the aim to identify the best suitable features for damage assessment purposes, a linear dependence is hypothesized for all features (see scatterplots in Fig. 6). Under this condition, the Pearson correlation coefficient can be used to measure the linear dependence between each feature and the CR parameter. The Pearson coefficient is calculated for each feature and reported in Table 1.
Correlation between the satellite features and the CR from GT data.
|NDI from optical||0.69|
|KLD from optical||0.71|
|MI from optical||0.46|
|ICD from SAR||0.66|
|KLD from SAR||0.33|
For the NDI and KLD features from optical data, the correlation is very good (respectively, 0.69 and 0.71). Differently, the correlation is significantly lower for the MI feature (). Regarding the SAR data, the correlation is very good for the ICD parameter (), whereas it is very low for KLD (). This preliminary check allowed understanding, which are the most sensitive and promising features for a damage classification exercise, i.e., NDI and KLD for optical data, and ICD for the SAR images.
In order to perform a classification test, we have classified the CR data into a map with three classes of CR, following the requirements of civil protection centers, which emerged during the APhoRISM project work. In particular, polygons with are associated to damage class 1, polygons with to damage class 2, and polygons with to damage class 3. The resulting map is shown in Fig. 7.
With this classification, we have obtained 911 objects of damage class 1, 423 objects of class 2, and 179 objects of class 3.
The optical and SAR feature values (NDI, KLD, MI, and ICD) are also assembled in three classes following the previous CR thresholds. In Figs. 8 and 9, the trend and distribution of calculated values (for reasons of synthesis, only NDI and MI features) are shown. In particular, in the trend plots (see left sides of the figures), the mean value for each class (continuous line) and the mean value ± the normalized standard deviation (NSD) are reported (dashed lines). The NSD value for each class is also reported as a percentage (%) with respect to the mean value.
The NSD ranges from 12% to 16% and from 41% to 51%, respectively, for the NDI and MI feature.
In order to evaluate the capability of these features to correctly detect the damage grade, two different classifiers were used: the Naïve-Bayes (NB) and the SVM. Both were employed by using the -fold mathematical method for training and test set selection. Several analyses have been performed considering different input datasets and different output classes: the first sets analyze the ability of these features to discriminate class 3 (the highest damage class) from all others, whereas the second sets evaluate the separability of the three classes. The tests were performed by using only optical features, only SAR features or both, or only the best three features resulting from the correlation studies (see Table 1). Results, in term of overall accuracy and coefficient, are reported in Tables 2 and 3.
NB and SVM classification (class 3 versus others).
|NDI, KLD, MI from optical data||ICD and KLD from SAR data||NB accuracy (%) and K coefficient||SVM accuracy (%) and K coefficient|
|SET 4||Yes (only NDI and KLD)||Yes (only ICD)||82.0%;||83.7%;|
NB and SVM classification (class 1 versus class 2 versus class 3).
|NDI, KLD, MI from optical data||ICD and KLD from SAR data||NB accuracy (%) and K coefficient||SVM accuracy (%) and K coefficient|
|SET 4||Yes (only NDI and KLD)||Yes (only ICD)||65.8%;||60.9%;|
Both classifiers indicate a very good performance to detect and discriminate the highest damage class. The calculated overall accuracy is 80% and 84%, respectively, for NB and SVM classifiers, using all available features. The simultaneous use of combined features (optical and SAR) leads to the best performances for both classifiers.
The discrimination of the three classes is clearly more difficult; however, the results are satisfying, with an overall accuracy ranging from 61% to 66% (see Table 3).
Algorithm for Unsupervised Damage Estimation
The change detection features can be employed to assess the damage level and detect the most affected areas after a destructive earthquake. Each feature provides a qualitative assessment of the damage distribution within the city. (For example, see Fig. 3 for the NDI map at object scale.)
In this section, an attempt to fuse all information derived from all available features is described. In order to achieve a damage map useful in the rescue operations after an earthquake, an unsupervised algorithm based on these satellite features has been developed and tested. In the following, we refer to this algorithm as the FST method. The scheme of the proposed method is shown in Fig. 10; the algorithm is described and applied in the following.
The input data are composed of all change features obtained by both optical and SAR images. We used the NDI, KLD, and MI parameters achieved from optical data in addition to the ICD and KLD from SAR data. It is noteworthy that the FST algorithm can work by employing any number of parameters of different types. In other words, it is not feature-dependent. Several tests demonstrated that a better performance can be obtained by adopting all five features studied during the APhoRISM project.
First, the statistical distribution of values of each feature is analyzed to detect possible outliers due to different factors (i.e., a nonperfect images coregistration, natural phenomena affecting the data, such as strong surface changes due to seasonal effect, etc.). In particular, points with values more than 3 standard deviations away from the mean are considered outliers and are not considered for classification purposes. At the first iteration, all dynamic ranges (obtained excluding the outliers) of the values of each feature are divided into three intervals by setting two thresholds for each feature, and each interval is associated with a damage class. In order to consider several possible occurrences of polygons in the three damage classes (clearly unknown a priori), this subdivision of ranges, and the corresponding classification, is repeated -times, varying the width of the three intervals by changing the values of the two thresholds at each of the iteration. To better explain the iteration steps of the algorithm, in Fig. 11, an example of an iteration for NDI and MI features is shown. The example refers to iteration No. 1 reported in Table 4. Specifically, points with values between the minimum and the minimum plus 35% of the entire range are assigned to class 1 [see the blue range of the plot in Fig. 11(b)]; points with values between 35% and 55% () of the range are assigned to class 2 [see the yellow range of the plot in Fig. 11(b)]; the remaining points, with values within the remaining 45% of the range, are assigned to class 3 [see the red range of the plot in Fig. 11(b)]. The employed mathematical scheme with the width of each class at each iteration is completely reported in Table 4. In particular, the total number of 21 iterations results from the attempt to consider a great number of different possible distributions at a step of 5%, which is a reasonable value for the sensibility of the proposed method as several tests have demonstrated.
Width in percentage (%) of each class for each iteration.
|No. iteration||Class 1 (%)||Class 2 (%)||Class 3 (%)||No. iteration||Class1 (%)||Class2 (%)||Class3 (%)||No. iteration||Class1 (%)||Class2 (%)||Class3 (%)|
In this way, several possible occurrences of polygons in the three damage classes are considered. The assumptions on which the algorithm is based are only the trend (increasing or decreasing) and the linearity (confirmed by the analysis in Secs. 3 and 4) of the features with respect to the damage grade. These last hypotheses allow the algorithm to maintain its unsupervised characteristic.
Therefore, the value of each feature at each iteration defines the damage level, for that feature and that iteration, of the polygon itself. After the 21 iterations, we have 21 possible classifications for each polygon, and in order to assign the final damage level (1, 2, and 3 corresponding to low, medium, and high, respectively) to each one, the modal value of all resulting classifications is considered for that polygon. In this way, even if there are polygons with missing values (outliers) for one or more features, because not all the five features are missing, a class can still be assigned. Finally, a damage-retrieved map can be obtained by associating the computed damage level to each OSM polygon. The result for the Haitian earthquake test case is shown in Fig. 12.
This map can be compared, in a semi quantitative manner, with the map obtained from the GT survey (see Fig. 7). A good agreement is observed in the west area of the city, where a large number of collapsed buildings is present [high damage (class 3), i.e., ]. In order to assess the goodness of the algorithm, the error in assigning to each polygon the estimated damage level is computed. The error results from the difference between the estimated and the surveyed damage levels (see paragraph 4 and Fig. 7). The results are reported in the confusion matrix (Table 5) and shown in the histogram (Fig. 13). In the confusion matrix, the accuracies (in %) for the identification of each class are also reported, in terms of producer’s and user’s accuracy.
Confusion matrix by comparing the FST and the GT classifications. Bold values indicate the correct guesses for each class.
|GT classes (user accuracy)|
|Class 1 (81.20%)||Class 2 (41.57%)||Class 3 (59.13%)|
|FST classification (producer accuracy)||Class 1 (65.42%)||596||111||27|
|Class 2 (65.25%)||304||276||84|
|Class 3 (37.99%)||11||36||68|
The damage class is correctly identified for 62% of polygons and the coefficient, computed from the confusion matrix, is equal to 0.34. The performance is quite good considering that the NB and SVM classifiers reach an overall accuracy of 61% to 66% (see Table 3) and they use the GT information for training. In particular, class 1 and class 2 are correctly detected for 65% of the polygons, whereas class 3 is detected for only 38% (see Table 5). Note a clear underestimation of class 3 in the south-central and north-east areas of the city. Furthermore, a moderate overestimation of one class (22%) does occur. The wider error ( or classes) occurs in a small percentage(3% in total). In general, the proposed unsupervised algorithm, though simple, provides overall good results.
In order to evaluate the FST algorithm performance, in terms of accuracy, a comparison has been done by considering a well-known and consolidated unsupervised classification algorithm: the -means. The -means algorithm22 has been applied in the past years in many applications related to land cover classification by using remotely sensed data.23,24 -means, one of the most popular and simple clustering algorithms, was proposed over 50 years ago.25 In spite of its old age and the dozens of other clustering algorithms that have been published since then, -means is still widely used.26 The damage map resulting from the -means classification is shown in Fig. 14. The confusion matrix and the histogram of error, obtained as above for the FST method, are reported in Table 6 and in Fig. 15, respectively.
Confusion matrix by comparing the K-means and the GT classifications. Bold values indicate the correct guesses for each class.
|GT classes (user accuracy)|
|Class 1 (71.92%)||Class 2 (38.57%)||Class 3 (17.5%)|
|-means classification (producer accuracy)||Class 1 (77.61%)||707||218||58|
|Class 2 (44.68%)||187||189||114|
|Class 3 (3.91%)||17||16||7|
The overall accuracy is 60%, whereas the coefficient (computed from the confusion matrix in Table 6) is equal to 0.22. Moreover, class 3 is correctly identified for only 4% of the polygons.
Therefore, we can assert that the FST method performs better than the -means method in terms of overall accuracy (62% with respect to 60%) and of the resulting coefficient (0.34 with respect to 0.22). In particular, the FST method can better identify the highest damage class (see Fig. 12 and 14).
The proposed method is a quite robust approach that is based on unsupervised classification and cannot be considered to be case-dependent. Despite that, it has some limitations that are generally common to all change detection-based algorithms. The main one is that in some areas, a pre-event image (SAR or/and optical) cannot be available. This limitation nowadays is quite reduced because of the increased number of satellites and the global coverage of historical data. Of course, it would important to have updated images as much as possible to limit false alarms due to changes in the scene not due to damage. A very important opportunity to overcome these limitations will be, in the future, the availability of Copernicus Sentinel mission’s archives. Sentinel missions are able to routinely obtain a global coverage, which is expected to minimize the lack of updated pre-event data. Actually, it is not easy to determine a requirement on the maximum temporal separation between pre- and postimages. It typically depends on the regions we are considering (some places in the world are changing faster than others), and only a general consideration can be done: a low temporal span reflects (nominally) more accurate results. In this study, the temporal baseline is 3 and 5 months, respectively, for optical and SAR data. Clearly, the change detection indices are affected by other factors if the period between the images’ acquisition is much longer. It is worth noting that the proposed method, being based on city block areas, is quite robust with respect to changes not related to an earthquake, because we can guess that modifications caused by temporal changes are small (within the blocks) if compared with the changes due to such catastrophic event. In other words, these sources of errors are averaged within the blocks. However, some of these affecting factors are taken into account. For example, changes due to different illuminations at different times are taken into account by applying the atmospheric correction to images; the vegetation coverage is removed by using an appropriate mask.
Finally, it is worth noticing that the proposed damage classification is on a relative scale. Indeed, the method always identifies three classes of damage associated with low, medium, and high damage for the specific regions we are analyzing. In other words, the maximum damage class can be relative, for example, to a maximum absolute CR of 25%. On one hand, this characteristic can be considered as a limitation of the algorithm, but from an operational point of view, we think that during rescue activities, it is important to know where to intervene promptly and independently from the absolute value of damage.
We have investigated a number of change detection features obtained from satellite images in order to test the capability to assess damages due to an earthquake. The main innovation of the proposed analysis concerns the estimation of derived features at the object scale, assuming a segmentation of the area of interest (the city of Port-au-Prince) based on the open street map geo-database. Among the satellite derived features, the NDI and KLD change indices from optical and the ICD from SAR reach the best correlation with damage that occurred in the city. The correlation is significantly lower for the MI feature from the optical and the KLD from the SAR data.
The employed supervised classifiers (NB and SVM) show that the simultaneous use of all features (both optical and SAR) leads to the best performance. A first set of analyses indicates an overall accuracy of 80% to 84% in the identification of the highest damage class (30% to 100% of CR). The discrimination in three damage classes (clearly more difficult) is correct for 61% to 66% of considered objects.
With the aims above, an unsupervised algorithm (the FST method) to estimate the damage level by satellite features has been developed and tested. The performance of the FST algorithm is quite good, leading to a correct discrimination for 62% of objects, and taking into account that the wider error ( classes) is concentrated in a small area (3% in total). The comparison between the FST and the -means methods demonstrates the better performance of the first (with a coefficient of 0.34 in respect to 0.22), especially in detection of the highest damage class.
In conclusion, the proposed unsupervised algorithm, though simple, provides encouraging results and it can be a valid starting point to develop more sophisticated and accurate algorithms.
The research leading to these results has received funding from the European Union’s Seventh Framework Programme (FP7/2007-2013) under Grant Agreement No. 606738, APhoRISM project.
Vito Romaniello received his degree in physics from the Sapienza University of Rome in 2006, and his PhD in methods and technologies for environmental monitoring from the University of Basilicata in 2011. He is a postdoctoral researcher at the Istituto Nazionale di Geofisica e Vulcanologia, Rome, Italy. His current research interests include satellite change detection and atmospheric physics.