Above-ground biomass estimates based on active and passive microwave sensor imagery in low-biomass savanna ecosystems

Abstract. Although many studies exist on the estimation and monitoring of above-ground biomass (AGB) of forest ecosystems by remote sensing, very little research has been carried out for ecosystems of low primary production, such as grasslands, steppes, or savannas. Our study addresses this gap and investigates the correlation between space-borne radar information and AGB at the scale of 10 tons per hectare and below. We also introduce passive brightness temperature as an additional covariate for biomass estimation, based on the hypothesis that it contains information complementary to the microwave backscatter of the active sensors. Our findings show that large-scale estimates of AGB can be derived for grasslands and savannas with a coefficient of determination ($R^2$) of up to 0.52. We further found that the integration of passive radar can increase the quality of AGB estimates in terms of explained variance for selected cases. We hope that these indications are a starting point for more integrated approaches toward biomass estimation based on Earth observation methods.


Introduction
Satellite imagery has become the primary source for regional biomass estimates. 1 It is used for the assessment of agricultural yields, 2-4 monitoring resources over long time periods, [5][6][7] and forecasting carbon stocks and emissions. [8][9][10] In particular, the effects of climate change and desertification have become a focus of Earth observation in sub-Saharan Africa. [11][12][13] As grasslands cover nearly one-fifth of the global land surface, their contribution to the carbon stock is expected to be between 10% and 30%. 14,15 Recent studies also indicate that carbon losses caused by land-use changes and deforestation, especially of African grasslands and savannas, are 3 to 6 times larger than previously thought. 16 Monitoring of grassland ecosystems is, therefore, of crucial interest for the evaluation of carbon emissions. 17 The use of synthetic aperture radar (SAR) satellite data has been found to be of special value because of its wavelength and the sensitivity of the signal toward ramified structures and voluminous canopies. 18 A stable relationship between SAR backscatter and above-ground biomass (AGB) has been observed and exploited in numerous studies. [19][20][21][22][23][24] Many of them report signal saturation between 100 and 250 tons per hectare (t/ha), above which the measured SAR backscatter no longer increases correspondingly. 25,26 To overcome this limit, approaches based on radar polarimetry, 27 multifrequency exploitation, 28 integration of optical 29 or LIDAR data, 30 and SAR interferometry 31 have been proposed. However, while most studies concentrate on tropical and boreal forest biomass, only a few deal with biomass estimation in arid or semiarid regions with a herbaceous ground layer and sparse vegetation cover. [32][33][34]
This gap exists primarily because SAR backscatter alone does not significantly represent biomass variations below 5 t/ha and because other surface characteristics, such as roughness and soil moisture, are superimposed on the signal at these levels. [35][36][37] One further constraint is that the coverage of cross-polarized acquisition modes (HV or VH), which are found to be most sensitive toward biomass, is often not large enough to cover whole regions under the same acquisition conditions (date, incidence angle). [38][39][40] One way to make use of the sensitivity of microwaves toward different scattering mechanisms is to use sensors that employ different frequencies. 41 While combinations of short and long microwaves have been successfully applied for forest biomass estimates, 42,43 only Srivastava et al. 44 and Herold et al. 45 confirmed their potential for thin vegetation layers. The integration of passive radar for large-scale mapping is mostly limited to soil moisture 46 or surface temperature. 47 Ferrazzoli et al. 48 demonstrated the sensitivity of passive measurements toward vegetation biomass because of the water content of plants, and Eom also mentions their potential to detect changes in vegetation, 49 but none of these studies created spatial predictions from these data. Only Liu et al. 50 mapped terrestrial biomass between 1993 and 2012 at the global scale based on space-borne passive radar and report an estimated decrease of 0.07 Pg C/yr, mostly resulting from deforestation in the tropics, but also because of changes in savannas and shrublands.
In this letter, we propose a method to derive AGB estimates for large parts of Sénégal, ranging between tropical, semiarid, and arid climate zones, based on the integrated use of radar imagery from sensors of different wavelengths. We investigate their contribution to wide-area biomass estimation and want to initiate discussion about the potential for automated long-term monitoring of resources in sub-Saharan African countries.

Satellite Data
Three types of satellite data were used in this study: (1) ENVISAT Advanced Synthetic Aperture Radar (ASAR) Wide Swath (WS) mode products at a spatial resolution of 150 m, (2) Advanced Land Observing Satellite (ALOS) Phased Array L-band Synthetic Aperture Radar (PALSAR) wide beam (WB) mode products at a spatial resolution of 100 m, and (3) Special Sensor Microwave Imager (SSM/I) data in V and H polarizations, representing surface brightness temperature ($T_B$), at a spatial resolution of 12.5 km. 51 Because a consistent polarization was not available, SSM/I data were acquired in both horizontal (H) and vertical (V) receiving polarization so that they could be combined with the other data in either configuration. Table 1 shows the data used in this study. Making use of the one-day revisit frequency of SSM/I, a mean raster was calculated from the 31 days of December of the respective year.
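The monthly averaging of the daily SSM/I acquisitions can be illustrated with a minimal Python sketch. It assumes that the 31 daily brightness temperature grids have already been read into a NumPy array of shape (days, rows, cols); the function name `monthly_mean`, the `nodata` parameter, and the synthetic values are illustrative assumptions and not part of the original processing chain.

```python
import numpy as np

def monthly_mean(daily_stack, nodata=None):
    """Per-pixel mean over the first axis (days), ignoring missing values."""
    stack = np.asarray(daily_stack, dtype=float)
    if nodata is not None:
        # Treat the (assumed) nodata flag as a missing observation.
        stack = np.where(stack == nodata, np.nan, stack)
    valid = ~np.isnan(stack)
    count = valid.sum(axis=0)
    total = np.where(valid, stack, 0.0).sum(axis=0)
    with np.errstate(invalid="ignore", divide="ignore"):
        # Pixels without a single valid day become NaN (0 / 0).
        return total / count

# Synthetic example: 31 daily grids of brightness temperature in kelvin.
rng = np.random.default_rng(0)
december = 250.0 + rng.normal(0.0, 2.0, size=(31, 4, 5))
december[3, 0, 0] = np.nan  # one missing daily observation
mean_tb = monthly_mean(december)
```

Averaging per pixel over the valid days only keeps a single missing acquisition from invalidating the whole monthly value.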
Although field data are available for the years 1985 to 2013 (see Sec. 2.2), the selection of satellite imagery was restricted by the availability of images from all three sensors during the dry season. This limited the usable satellite data to the years 2006, 2009, and 2010.

Field Data
Biomass collection was carried out on a yearly basis at the time of maximum phytomass, which occurs between the end of September and late October before the seeds are grown. Forty-eight representative sites were selected in Sénégal based on the intersection of topographic and phytogeographic maps (see spatial distribution in Fig. 2, prediction maps). These sites have been used consistently for sampling since 1987. Each identified site has a size of 9 km² and was divided into nine units of 1 km². Biomass data were then collected within each of these units by identifying homogeneous vegetation patterns (subunits) at the local level. Depending on the vegetation pattern of the subunits (homogeneous, gradient, or mosaic), a representative number of stratified measurements was undertaken as described in the following.
For herbaceous units, linear transects of 200 m were drawn, as suggested by Poissonet et al., 52 and both destructive and nondestructive allometric measures were taken (fr = relative frequency of the layer along the transect, pm = average productive green weight [g/m²], and ms = dry matter rate). This allows the characterization of different spatial compositions, such as homogeneous landscapes, gradients, and mosaics. The herbaceous biomass was then calculated by the following equation: $P_H\,[\mathrm{kg/ha}] = fr \times pm \times ms \times 10$. 53 Vegetation communities in the study area consist of open woodlands, mainly dominated by tree species of the Combretaceae (Combretum collinum, Combretum glutinosum, and Guiera senegalensis), Caesalpiniaceae (Burkea africana and Cordyla pinnata), Apocynaceae (Saba senegalensis), and Fabaceae (Pterocarpus lucens) families. 54,55 Woody biomass of individual trees was determined within square plots of 50 × 50 m by measuring their diameter at breast height (DBH, 1.30 m) and inserting it into the following allometric equation: $P_i = a \times \mathrm{DBH}^{b}$, where a and b are constants depending on the tree species. 53 The sum over all trees within a plot gives the woody biomass of an area, $P_L$, which was then added to the herbaceous biomass $P_H$ to obtain the full AGB of this area. A detailed description of the data collection methodology is given by Yameogo et al. 53 and the background on the regressions used is documented by Cissé et al. 56 Biomass data used in this study were also used in studies on monitoring sub-Saharan biomass 57 and the establishment of a pastoral early warning system based on medium-resolution optical satellite imagery at a spatial resolution of 1 km. 58 After the removal of sites with missing values or sites outside the area covered by the satellite imagery, a total number of 63 measurements remained (
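The two field equations for herbaceous and woody biomass described above can be written as a short worked example in Python. This is only a sketch under stated assumptions: fr and ms are treated as fractions between 0 and 1, tree biomass is assumed to be in kg with DBH in cm, the conversion of the plot total to kg/ha via the 0.25 ha plot area is our assumption, and all numeric values and coefficients are hypothetical.

```python
def herbaceous_biomass_kg_ha(fr, pm, ms):
    """P_H = fr * pm * ms * 10; with pm in g/m^2, the factor 10 yields kg/ha."""
    return fr * pm * ms * 10.0

def tree_biomass(dbh, a, b):
    """Allometric estimate for one tree: P_i = a * DBH**b."""
    return a * dbh ** b

def plot_agb_kg_ha(p_h, trees, plot_area_ha=0.25):
    """Add woody biomass P_L (sum over trees, scaled to one hectare) to P_H.

    trees is a list of (dbh, a, b) tuples; the default plot area of
    0.25 ha corresponds to a 50 m x 50 m plot (assumed conversion).
    """
    p_l = sum(tree_biomass(dbh, a, b) for dbh, a, b in trees) / plot_area_ha
    return p_h + p_l

# Hypothetical transect values and species coefficients.
p_h = herbaceous_biomass_kg_ha(fr=0.8, pm=250.0, ms=0.4)
trees = [(12.0, 0.1, 2.3), (20.0, 0.1, 2.3), (8.5, 0.08, 2.5)]
agb = plot_agb_kg_ha(p_h, trees)
```

With the hypothetical transect values, the herbaceous term evaluates to about 800 kg/ha, because 1 g/m² corresponds to 10 kg/ha.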

Data Preparation
SAR data were radiometrically calibrated to radar brightness (beta naught) by applying the scaling constant as described by Henry et al. 60 and Shimada et al., 61 respectively. To compensate for topographic variations and their impact on backscatter intensity, incidence angle normalization was performed using the 1 arc-second Shuttle Radar Topography Mission (SRTM) digital elevation model, 62 resulting in terrain-flattened gamma naught. 63 Range-Doppler terrain correction was applied to all SAR images to compensate for topographically induced geometric distortions. 64 Backscatter data were converted to the dB scale for better contrast in the low-value ranges. SSM/I products were used directly as provided by the NSIDC because of their extensive and accurate internal calibration. 65,66 All raster processing was performed in the Sentinel Application Platform (SNAP) provided by the European Space Agency (ESA).
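The conversion of backscatter intensity to the dB scale follows the usual relation dB = 10 · log10(intensity). A minimal sketch in Python is given below; the function name `to_db` and the handling of non-positive pixels as NaN are assumptions made for illustration, not a description of the software used in the study.

```python
import numpy as np

def to_db(intensity):
    """Convert linear backscatter intensity to decibels.

    Non-positive values have no logarithm and are returned as NaN.
    """
    intensity = np.atleast_1d(np.asarray(intensity, dtype=float))
    db = np.full(intensity.shape, np.nan)
    positive = intensity > 0
    db[positive] = 10.0 * np.log10(intensity[positive])
    return db

# Example: linear gamma naught values and their dB equivalents.
gamma0 = np.array([1.0, 0.1, 0.01, 0.0])
gamma0_db = to_db(gamma0)
```

The logarithm stretches the low-value range, which is where most of the variation over sparse vegetation is found.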

Correlation Analysis
To analyze correlations between AGB measured in the field and the radar values from the images, backscatter intensities of ENVISAT and ALOS PALSAR were extracted at the sampling sites within a radius of 1.5 km (about 7 km²) and averaged to one mean value per SAR sensor and site. Pixels within this radius represent nearly the same area as defined in the sampling process of the field reference data (9 km²). This ensures that both the independent and the dependent variable refer to a similar area. Furthermore, the influence of speckle at the local scale is reduced. For the SSM/I data, one brightness temperature value was extracted for each sampling site. Although an SSM/I pixel represents a clearly larger area, enlarging the representative area of the SAR sensors to the radius of 1.5 km reduced the ratio between the footprints of the active and passive radar systems used in this study from 1:6000 to 1:22.
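The footprint averaging and the resulting area ratio can be checked with a few lines of Python. The sketch assumes a north-up raster with square pixels and a site location given in pixel coordinates; `circular_mean` and its parameters are hypothetical names introduced here for illustration.

```python
import numpy as np

def circular_mean(raster, row, col, pixel_size_m, radius_m=1500.0):
    """Mean of all pixels whose centers lie within radius_m of (row, col)."""
    raster = np.asarray(raster, dtype=float)
    rows, cols = np.indices(raster.shape)
    distance_m = np.hypot(rows - row, cols - col) * pixel_size_m
    return float(np.nanmean(raster[distance_m <= radius_m]))

# Area of the 1.5 km footprint and of one 12.5 km SSM/I pixel, in km^2.
footprint_km2 = np.pi * 1.5 ** 2
ssmi_pixel_km2 = 12.5 ** 2
ratio = ssmi_pixel_km2 / footprint_km2

# Example: constant backscatter raster with 150 m pixels.
raster = np.full((41, 41), -12.0)
site_mean = circular_mean(raster, row=20, col=20, pixel_size_m=150.0)
```

The footprint covers about 7.07 km² and one SSM/I pixel covers 156.25 km², which gives the ratio of roughly 1:22 stated above.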
The extracted radar values were correlated with the AGB measurements of the corresponding sites and years using an exponential regression in the R software package. 67 The relationship between backscatter and AGB is described by an exponential function, $y = a\,e^{bx}$, where y represents AGB, x represents the image variable, and a and b are the equation coefficients. Although there are also linear, 23,24 logarithmic, 42 and power law-based 68 approaches, the majority of studies report best results based on nonlinear exponential regressions, which also reflects our observations. 5,20,28,39,69 Furthermore, the conversion of SAR data to a dB scale already eliminates the logarithmic nature of the relationship between AGB and backscatter intensity. 25 In addition to the single measurements of each sensor (Table 2, a, b, and d), several combinations of the different sensors' values were derived to generate a variety of predictors to be inserted as further independent variables of the regression (Table 2, c, e-i). This is based on the assumption that the information retrieved at different wavelengths is complementary for the description of the present biomass. Combining these data for the regression helps to analyze how well these joint measures are suited to improving the prediction accuracy. Furthermore, it helps to overcome the difference in spatial resolution between active and passive radar data by simply extracting the values at similar footprints (see Sec. 2.3) without the need for spatially upsampling or downsampling either dataset, which was found to potentially have unwanted side effects on the quality of results. [70][71][72]
To also create spatial predictor variables that are based on all three input sensors, a principal component analysis (PCA) was conducted as an acknowledged method of sensor combination. 73 However, because active and passive radar produce data of different ranges and units (see Sec. 2.3), all images were first transformed into an eight-bit integer format. Based on these rasters, a PCA was performed for each investigated year using all three satellite scenes available. The first principal component (PC1) of each year was used for the regression analysis (maps g, h, and i); it showed an explained variance of 0.502 (2006), 0.515 (2009), and 0.418 (2010), respectively. Although the rasters of ALOS and ENVISAT were found to be spatially correlated, the integration of SSM/I data introduced newly emerging patterns in PC1. This integrated spatial representation of all three sensors is believed to contribute to the prediction accuracy of the regressions because it features wavelengths of various sizes that might interact with different parts of the vegetated surface.
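The eight-bit rescaling followed by a PCA can be sketched in Python as follows. Several details are assumptions, since the text does not specify them: the layers are taken to be co-registered arrays on a common grid without missing values, the eight-bit conversion is a min-max stretch, and the PCA is computed from the covariance matrix. The synthetic layers only stand in for the three sensors.

```python
import numpy as np

def to_uint8(layer):
    """Min-max rescale a raster to the range 0..255 (assumed scaling)."""
    layer = np.asarray(layer, dtype=float)
    lo, hi = layer.min(), layer.max()
    return np.round((layer - lo) / (hi - lo) * 255.0).astype(np.uint8)

def first_principal_component(layers):
    """Return PC1 as a raster and its share of the total variance."""
    shape = np.asarray(layers[0]).shape
    # One column per sensor, one row per pixel.
    X = np.stack([to_uint8(l).ravel().astype(float) for l in layers], axis=1)
    X = X - X.mean(axis=0)
    eigvals, eigvecs = np.linalg.eigh(np.cov(X, rowvar=False))
    top = np.argmax(eigvals)
    pc1 = (X @ eigvecs[:, top]).reshape(shape)
    return pc1, float(eigvals[top] / eigvals.sum())

# Synthetic stand-ins: two correlated SAR layers and one unrelated layer.
rng = np.random.default_rng(1)
base = rng.normal(size=(20, 20))
alos = -15.0 + 2.0 * base + rng.normal(scale=0.5, size=base.shape)
envisat = -10.0 + 1.5 * base + rng.normal(scale=0.5, size=base.shape)
ssmi = 260.0 + rng.normal(scale=3.0, size=base.shape)
pc1, explained = first_principal_component([alos, envisat, ssmi])
```

Rescaling to a common value range before the PCA keeps the sensor with the largest numeric range from dominating the first component.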
To test the strength of the relationship between AGB and the independent variables, the coefficient of determination ($R^2_{cd}$) was calculated for each regression as shown in Table 2. Greater values indicate a stronger relation: 0 means no correlation between the dependent variable (AGB) and the independent variable (radar information), and 1 means that the independent variable fully explains the variance of the dependent variable. Figure 1 shows selected scatter plots of the relationship between active and passive radar information and the measured biomass, as listed in Table 2. As all scatter plots show, a strong relationship exists between observed biomass and the layers retrieved from satellite imagery and a clear trend is visible. However, it has to be noted that many more combinations of configurations and derivatives were tested (examples of combinations are presented by Omar et al. 42 ) and that the selection shows the variables with the highest correlations that were found.
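The exponential regression and the coefficient of determination described above can be illustrated in Python. The study itself used R, and the exact fitting routine is not stated, so this sketch makes its own choices: the coefficients are estimated by ordinary least squares on ln(y), and $R^2$ is computed on the original AGB scale as 1 - SS_res / SS_tot. The data are synthetic.

```python
import numpy as np

def fit_exponential(x, y):
    """Fit y = a * exp(b * x) by least squares on ln(y); requires y > 0."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    b, ln_a = np.polyfit(x, np.log(y), 1)  # slope first, then intercept
    return float(np.exp(ln_a)), float(b)

def r_squared(y, y_hat):
    """Coefficient of determination on the original scale."""
    y = np.asarray(y, dtype=float)
    y_hat = np.asarray(y_hat, dtype=float)
    ss_res = np.sum((y - y_hat) ** 2)
    ss_tot = np.sum((y - y.mean()) ** 2)
    return float(1.0 - ss_res / ss_tot)

# Synthetic, noise-free example: backscatter in dB, AGB in kg/ha.
backscatter_db = np.linspace(-20.0, -8.0, 12)
agb_kg_ha = 20000.0 * np.exp(0.25 * backscatter_db)
a, b = fit_exponential(backscatter_db, agb_kg_ha)
r2 = r_squared(agb_kg_ha, a * np.exp(b * backscatter_db))
```

A nonlinear least-squares fit on the original scale would weight high AGB values more strongly than the log-linear fit used here, so the two approaches can give different coefficients on noisy data.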

Statistical Prediction of Above-Ground Biomass
In general, ALOS and ENVISAT regressions alone already deliver a coefficient of determination ($R^2_{cd}$) of 0.3, with slightly higher correlation for ENVISAT. The results indicate that microwaves of shorter wavelength (ENVISAT, λ = 5.6 cm) are more sensitive toward the roughness and volume of fine vegetation surfaces, such as grasslands or small bushes, while larger wavelengths (ALOS, λ = 23.5 cm) are partly able to penetrate canopies and ground vegetation layers and typically interact more strongly with objects of comparable size, such as large branches or stems of savanna trees. 28,44,45 As shown in scatter plot c, combining short- and long-wave information of ALOS and ENVISAT potentially increases the significance of the correlation, even if only to a small degree. This can be explained with the observation that horizontally and vertically polarized microwaves interact with different parts of the vegetation layer. 74,75 For scatter plot d, only SSM/I data were used. It shows that passive radar data alone cannot predict the distribution of AGB to a sufficient degree. Looking at the maps of Fig. 2, the overall distribution is rather inverted compared with the other cases. Additionally, the low spatial resolution of 12.5 km might not be sufficient to represent the spatial scale of biomass variations at the ground. Our assumption, however, is that it can be employed as a complementary source of information to depict patterns that are not fully related to active SAR backscatter measures of ALOS and ENVISAT.
For this reason, we calculated the ratio between active and passive information to be used in the correlation. Plots e and f demonstrate how a combination of both could statistically improve the predictions of AGB. As shown, both regression lines match the measured AGB values quite well, resulting in cd . This can be partially explained by the distribution of samples in the scatter plots: while plot h lacks a clear trend and has larger outliers, samples were nearly split into two populations in plot i. The description of both distributions by an exponential function was not successful. As later shown in the prediction maps in Fig. 2, this resulted in an underestimation of high AGB values in case h and an overestimation in case i. This leads to the conclusion that, if suitable configurations and measurements without outliers are present, merging active and passive radar information via PCA can lead to very high correlations. If there are inconsistencies in either the microwave imagery or the measured AGB, merging sensors might not improve the result compared with single-sensor correlations. Another possible reason for the different accuracies between these three years could be that SSM/I data from 2006 were acquired at a frequency of 85 GHz, whereas the data from 2009 and 2010 were only available at 91 GHz.

Spatial Prediction of Above-Ground Biomass
The regressions shown in Table 2 were applied to the whole study area using the input rasters as variables in a raster calculator. Accordingly, the resulting AGB predictions (Fig. 2) show biomass variations at the detail level of the sensor with the highest spatial resolution, while still containing contributions from the variations of the other sensors involved in the regression. The maps support the indications made above that microwave information can generally be a very suitable proxy for the spatial distribution of biomass, even in sparsely vegetated areas. The low-biomass areas in the northern center of the study area coincide with the Ferlo rangelands, a pastoral savanna with variable rainfall and a lack of permanent water availability. 76 Highest AGB values are found along the Sénégal river at the border between Sénégal and Mauritania, as well as in the denser savannas in the Tambacounda region in the subhumid south. 77 Although all maps show similar patterns, it is also evident that some of them tend to overestimate or underestimate the measured biomass. Although the purely SAR-based estimates (maps a, b, and c) lie within a similar range of values, the estimates that involve SSM/I generally predict lower AGB values in the study area, especially maps e and g. Map i can be considered a special case because no clear trend between the sampled data for 2010 and the microwave observations was found [see Fig. 2(i)] and a large proportion of samples is clearly underestimated by the regression.
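Applying a fitted regression to a predictor raster, as a raster calculator does, amounts to evaluating the exponential function for every pixel. The following Python sketch shows this step under the assumption that the predictor is available as a NumPy array with NaN marking missing pixels; the coefficients and pixel values are hypothetical and are not taken from Table 2.

```python
import numpy as np

def predict_agb(predictor, a, b):
    """Evaluate AGB = a * exp(b * x) per pixel; NaN pixels stay NaN."""
    predictor = np.asarray(predictor, dtype=float)
    return a * np.exp(b * predictor)

# Hypothetical coefficients and a small backscatter raster in dB.
a_hyp, b_hyp = 20000.0, 0.25
backscatter = np.array([[-18.0, -12.0],
                        [np.nan, -9.0]])
agb_map = predict_agb(backscatter, a_hyp, b_hyp)
```

Because the function is monotonic for b > 0, the spatial pattern of the prediction map follows the pattern of the predictor raster directly.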

Validation
Because of the limitations regarding data availability of both satellite and field data, no independent validation could be performed based on points that were not used in the regressions. As a consequence of the limited sample size (n = 21 for some cases) and the required trade-off between bias and variance inherent in our data, we selected leave-one-out cross validation over a k-fold cross validation. 78,79 The results are shown in Table 2 in the columns $R^2_{val}$ and root mean square error (RMSE). Compared with a 10-fold cross validation, leave-one-out produced significantly lower $R^2$ values, but the RMSEs of between 1500 and 2500 kg per hectare were similar. Compared with the scatter plots in Fig. 1, the validation scores might be a bit pessimistic, yet we considered leave-one-out the more suitable and honest measure given the sample size and distribution of our data. 80 Notably, the lowest RMSEs are produced by the regressions based on both active and passive radar images. This further underlines the potential of passive radar data to be integrated into AGB predictions, despite its significantly lower spatial resolution.
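The leave-one-out procedure can be sketched in Python as follows. Each sample is predicted by a regression fitted to the remaining n - 1 samples, and RMSE and $R^2$ are computed from these held-out predictions. The log-linear fit and the way the validation $R^2$ is computed are assumptions of this sketch, and the 21 samples are synthetic.

```python
import numpy as np

def loocv_exponential(x, y):
    """Leave-one-out cross validation for y = a * exp(b * x)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    preds = np.empty_like(y)
    for i in range(y.size):
        keep = np.arange(y.size) != i  # drop sample i from the fit
        b, ln_a = np.polyfit(x[keep], np.log(y[keep]), 1)
        preds[i] = np.exp(ln_a + b * x[i])
    residuals = y - preds
    rmse = float(np.sqrt(np.mean(residuals ** 2)))
    r2_val = float(1.0 - np.sum(residuals ** 2) / np.sum((y - y.mean()) ** 2))
    return rmse, r2_val, preds

# Synthetic sample of n = 21 sites with multiplicative noise.
rng = np.random.default_rng(42)
x = rng.uniform(-20.0, -8.0, size=21)
y = 20000.0 * np.exp(0.25 * x) * rng.lognormal(0.0, 0.3, size=21)
rmse, r2_val, preds = loocv_exponential(x, y)
```

With so few samples, every held-out prediction rests on nearly the full dataset, which is the main argument for leave-one-out over a k-fold split here.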
It became evident in this study that the availability of a sufficient number of representative samples is essential for a robust validation of the models. Splitting the samples into independent training and testing subsets would clearly increase the robustness of the validation scores and ultimately the significance of the study.

Discussion
This study showed how satellite microwave imagery can be used to estimate AGB over large areas. Although most studies focus on tropical forest biomass, estimates in ecosystems of low primary productivity, such as savannas or grasslands of the African Sahel zone, are still underrepresented in the literature. Our findings show good correlations with explained variances between 0.31 and 0.52. The combination of radar information from active and passive sensors offers high potential because different types of scattering mechanisms and surface emittance are combined. This might be of particular interest for sub-Saharan countries, which often span a wide range of climate zones along their latitudinal gradient. The contribution of passive radar brightness temperature to the estimation quality of AGB is indicated by selected results, but clearly more research has to be carried out to prove its large-scale applicability.
Although the PCA of the year 2006 reached the highest $R^2$ values in this study, which partially confirms our hypothesis that combinations of active and passive radar, as well as short and long wavelength SAR data, lead to increased accuracies, the scores of the PCAs of 2009 and 2010 were significantly lower. We identified two possible reasons for this. First, the temporal difference between the acquisition of ALOS and ENVISAT is the shortest for 2006 (see Table 1), which reduces phenological variations, whereas the larger differences in the other years potentially lead to ill-trained regressions. Second, negative western Sahelian rainfall anomalies were reported for the year 2006, leading to a more pronounced seasonality and therefore more distinct vegetation patterns. 81 Despite the promising results, we identified several points of concern. Starting with data selection, the temporal gap between the data collection (September and October) and the utilized radar imagery (November and December) was unavoidable because neither ENVISAT nor ALOS imagery was available at an earlier point of the investigated years. Yet, we see little potential for larger errors as the rainy season lasts until November in the northwest and the start of the dry season has no immediate effect on above-ground vegetation. 82 It should still be attempted to keep the image acquisitions as close to the field measurements as possible.
Another critical point is the extreme difference in pixel sizes of active and passive radar data. This discrepancy has been lowered by averaging SAR backscatter within a radius of 1.5 km around each sampling site, thus reducing the impact of extreme values on the regression. In the case of speckle, this might be desirable, but as shown in Fig. 1, underestimation of high AGB values was a general problem and it could partially have its origin here.
Furthermore, the combination of SAR bands of both horizontal and vertical polarization is not ideal. Whenever possible, X/C-band and L-band of the same polarization should be combined to make the best use of their complementary information. The missions of Sentinel-1 and ALOS-2, which were both launched in 2014, might serve as an ideal constellation as they both acquire images at regular intervals and with extensive coverage.
Due to the limited number of reference data collected in the field, a robust validation was not possible and the temporal difference between sampling and image acquisition was comparatively large. For future studies, we advocate systematic field campaigns that take stratified and spatially representative samples and are aligned with the observation plans of the respective satellite missions. This could clearly reduce the error caused by phenological dynamics. 83 As a consequence of the limitations described above, it can also be expected that the different number of samples used for the regression had an impact on its $R^2$, as did the number of outliers. This again supports the necessity of a well-considered study design.

Conclusion and Outlook
Our approach showed that there are also good correlations between spaceborne SAR backscatter and AGB under low-biomass conditions, with values below 10 t/ha. However, our regressions showed clear underestimations for values above 5 t/ha. A statistical evaluation of the range and distribution of the collected AGB values is needed to select a suitable regression technique. While we used an exponential model to fit the data distribution, other cases might be more successful with linear, logarithmic, or multivariate techniques, as the logarithmic nature of the relationship between AGB and radar backscatter has often been demonstrated. 25 Future SAR missions bring large potential regarding the multifrequency prediction of AGB: in addition to the missions of Sentinel-1 and ALOS-2 mentioned above, high-resolution SAR satellites are planned which fulfill the criteria of both large wavelengths (TanDEM-L, NovaSAR-S 84,85 ) and temporal resolution (ICEYE-1 86 ) to complement long-term passive radar missions. Despite the limitations of this study, we want to encourage other researchers to include passive radar information in biomass estimates and to further develop methods for areas with low primary production because there is a great need for monitoring seasonal variations and the effects of climate change on these systems. [87][88][89]