Developing an algorithm for enhancement of a digital terrain model for a densely vegetated floodplain wetland

Abstract. An airborne laser scanning survey was conducted with a scanning density of 4 points/m² to accurately map the surface of a unique central European complex of wetlands: the lower Biebrza River valley (Poland). A method to correct the degrading effect of vegetation (the so-called “vegetation effect”) on digital terrain models (DTMs) was applied, utilizing remotely sensed images, real-time kinematic global positioning system elevation measurements, topographical surveys, and vegetation height measurements. Geographic object-based image analysis (GEOBIA) was performed to map vegetation within the study area; the mapped vegetation classes served as categories from which vegetation height information was derived for the DTM correction. The final DTM was compared with a model in which the additional correction of the “vegetation effect” was neglected. The comparison between the corrected and uncorrected DTMs demonstrated the importance of accurate topography through a simple presentation of the discrepancies arising in flood features computed with the different DTM products. An overall map classification accuracy of 80% was attained with the use of GEOBIA. Correction factors developed for the various vegetation types ranged from 0.08 to 0.92 m.


Introduction
Airborne laser scanning (ALS) is widely applied to retrieve topographic data used to develop digital terrain models (DTMs). ALS remains a cutting-edge methodology for retrieving high-resolution data on land relief and has multiple advantages over traditional surveying techniques. It is considered the most time-efficient and advanced method because it can derive topographic (bare ground elevation) data by filtering out vegetation or other objects on the surface and can produce centimeter-resolution DTMs over large areas. 1 Within near-natural floodplains, however, dense vegetation masks the true terrain elevation and can cause ALS terrain measurements to be overestimated. In this regard, the quality of ALS-based DTMs of vegetated and relatively flat areas, such as lowland floodplains, requires special scientific attention.
Floodplain topography and its complexity influence water flow and the pattern of floodplain inundation, creating spatial and temporal patterns related to the processes of erosion, sediment transport, and deposition. [2][3][4][5][6] The (micro)topographic heterogeneity of riparian habitats shapes the unique hydrological conditions of riverine landscapes, which in turn determine biodiversity. 7,8 These conditions depend on the type of water supply and can be considered catchment-scale variables. The topography of a floodplain underpins the extent of floods and the degree of hydration in particular riparian habitats, and thus remains a strongly local but important element of these geosystems. The importance of the exchange of water, sediments, nutrients, organic matter, and biota between the river and the floodplain has been emphasized in many studies. [9][10][11][12] Thus, analyzing ecohydrological feedbacks in floodplain wetlands with respect to terrain relief and elevation requires, on top of a broad spectrum of hydrological data, appropriate data on floodplain surface topography. 12,13 Most topographic analyses in hydrological sciences are performed using DTMs created from data retrieved either from traditional ground-surveyed optical measurements or, more commonly, from remote sensing, ALS in particular. High-quality ALS-based DTMs, which are digital representations of bare terrain (the surface elevation of the ground without any structures or vegetation), have significantly improved the quality of hydraulic modeling, fluvial geomorphology analyses, and river dynamics studies. [14][15][16] The quality of topographic data for floodplains is critically important because, in these low-relief landscapes, even small local differences in elevation can entail different ecosystem functions and hence should be accurately represented by topographic data.
However, when variations in DTMs are the result of errors introduced from the chosen data processing method and do not represent the real terrain features, such DTM data may introduce significant errors in habitat assessment and, likely, bias conclusions.
The traditional approach has some limitations related to the high costs of field measurements and/or terrain inaccessibility. Topographic contour maps, especially the ones produced in the mid-20th century (in our case, these maps were used in the analysis as they are the only source of data on land relief available at the appropriate scale of 1:10,000), also face technical limitations. The uncertainty in elevations introduced by the large spacing between contours in flat areas (such as floodplains) and between ground measurement control points negatively influences the DTM products obtained through interpolation. By contrast, high spatial resolution DTMs derived by ALS have been shown to be more representative of field slope measurements 17 and field-determined elevations 18 than DTMs created using topographic contour maps. 19 They provide a comprehensive and highly accurate source of elevation data from exposed ground and vegetation surfaces. Nevertheless, while ALS data collection and processing methods are considered cutting-edge technology, achieving an accurate DTM for low-relief floodplains with dense vegetation and periodic inundation remains challenging. Hence, specific efforts are required to obtain an accurate DTM that is appropriate for hydrological applications. This is because ALS height and backscatter measurements are affected by dense and sometimes tall wetland vegetation, a mosaic of flooded and dry land, the structure of trees and shrubs, and the time of data acquisition, all of which degrade the quality of a DTM. Consequently, estimates of terrain elevation are often invalid because height measurements are exaggerated by the so-called "vegetation effect" or by inundation effects. This problem has been pointed out in numerous studies. Gorte et al. 20 have shown that low vegetation (heights below 20 cm, e.g., grass or branches lying on the ground) causes problems in determining the bare earth elevation. Hodgson et al. 21 found high elevation errors in areas covered by shrubs compared to the other vegetation types. Ahokas et al. 22 found the upward shift of the laser points compared to the ground points to be ±11 cm for grass and ±17 cm for forest. Bollweg and de Lange 23 found an upward shift of 8 cm for long dense grass. Su and Bork 24 indicated that many ALS last-return pulses may originate from the forest canopy or understory vegetation rather than the true ground, causing overestimations of ground elevations within the extents of shrubs and forests. Hladik and Alber 25 found that mean errors (MEs) of DTM-derived ground elevations for different land cover classes ranged from 0.03 to 0.25 m compared to the ground truth data, with larger offsets for taller vegetation. They also developed species-specific correction factors for 10 selected land cover classes and used these correction factors to modify the ALS-derived DTM. Będkowski and Stereńczak 26 showed that the time of year at which ALS data are acquired may considerably influence the quality of a final DTM. They concluded that elevation differences between DTMs developed from ALS data acquired in two seasons (spring and summer) may be as high as a few meters. They also concluded that atypical relations between the spring and summer DTMs are apparently connected with the presence of dense broadleaved species in understory canopies. In general, the complexity of DTM development for areas with dense vegetation is not a new issue in the contemporary literature. 20,25 However, the development of ALS DTM processing algorithms and methods needs to be revisited in terms of accuracy, especially for floodplains with high and dense vegetation.
The overall goal of the research presented in this paper is to develop an efficient process to correct ALS data for the "vegetation effect" in order to obtain an accurate DTM that represents the actual terrain surface of the research area, which is a broad temperate floodplain. We contend that a methodology based on ALS data and extended with geographic object-based image analysis (GEOBIA), 27 along with the application of vegetation height correction coefficients, is an efficient algorithm for the development of accurate DTMs for floodplain wetlands. By comparing spatial analyses of selected hydrological features of a known flood event (flood extent, depths of water within the floodplain, and the volume of floodplain water storage), we show that the DTM developed with the proposed methodology provides much better quality elevation data than the DTM developed with standard data processing procedures.

Study Site
The northern part of the lower Biebrza River valley in Poland (Fig. 1; 53°23′49″N, 22°28′34″E bottom left and 53°28′12″N, 22°38′9″E upper right) was chosen as the research site due to its unique environmental features, including natural vegetation, low human pressure, a natural and dynamically changing river bed with multiple natural oxbows, and a nearly unmodified flow regime. 28 Wetlands of the analyzed stretch of the valley are supplied with surface water and groundwater, with spring thaw flooding being a main driver of floodplain ecology. 29 The significance of the lower Biebrza River valley in catchment-scale water retention processes has been reported as an important ecosystem service. 30 [...] of terrain elevations requires special efforts in order to make the DTM representative for hydrological and ecological purposes. Due to the numerous oxbows and meanders of the Biebrza River, its length in the analyzed stretch reaches 18.5 km. Most of the northern part of the valley is covered by floodplain vegetation, which is mainly dependent on seasonal flooding. The presence of dense vegetation cover consisting of alder forest, willow shrubs, reeds, and sedges makes the development of an accurate DTM challenging.

Procedure
The procedure applied in this study involved various types of data and consisted of several tasks. The overall scheme of the data processing steps is presented in Fig. 2. The procedure includes an additional correction method (task 7) used to obtain an accurate DTM in areas covered by very dense vegetation such as shrubberies or reed. Given the point cloud density (4 points/m²), in some places the laser beam did not reach the ground. In these cases, extraction and interpolation procedures were used.

Airborne Laser Scanning Survey Details
This study used near-infrared (1064 nm) ALS data acquired with a Leica ALS70 laser scanner in autumn 2011 at a scanning density of 4 points/m², within the nationwide Polish ISOK program (IT System of the Country's Protection against Extreme Hazards). 31 The data were gathered after the growing season, during a period when there was little or no flooding or inundation within the research area. This is important because surface water can contribute to inaccurate ALS elevation measurements. In the project, 73 tiles (1 × 1 km) provided by CODGiK (Central Agency for Geodetic and Cartographic Documentation) were used, covering around 7300 ha in the northern part of the lower Biebrza River valley. The point cloud was acquired with an average elevation error of 0.15 m and an average location error of 0.5 m. 32 The analysis in this study generated a DTM on a 1 m grid. The data were delivered in the Polish National Spatial Reference System frames, geodetic reference frame: National Geodetic Coordinate System 1992 (ETRS89/Poland CS92, EPSG code: 2180), and vertical reference frame: Kronstadt normal height coordinate system (PL-KRON86-NH) based on GRS80 ellipsoid.

Airborne Laser Scanning Data Processing
ALS data were delivered by CODGiK in the LAS (LASer) file format. First, the degree of spatial detail in the unprocessed ALS data was quantified: the point density was 4 points/m² and the point spacing was 0.389 points/m.
The lasground module of the LAStools software (rapidlasso GmbH) was used with the wilderness setting, as recommended for areas with natural vegetation. Nonground points (mainly trees, shrubs, reed, and sedges) were excluded from the analysis to obtain primarily bare earth points. This is important because in areas covered by high and dense vegetation the laser pulse is mainly reflected from the vegetation canopy and rarely reaches the ground. The filtering excluded high objects from the analyzed data by taking into account only the points from the last-return pulse (the lowest points measured by the laser scanner).
During the ALS point cloud data processing, several products were developed at a spatial resolution of 1 m × 1 m: the DTM, a digital surface model (DSM), and a normalized digital surface model (nDSM) (Fig. 3), as well as slope and ALS intensity maps.
The nDSM was calculated by subtracting the last-return product (DTM) from the first-return product (DSM) to give the relative heights of vegetation. 33 The ALS intensity, defined as the ratio of incoming to outgoing radiation of a laser pulse, is measured during data acquisition and was used in this study to produce the ALS intensity map. However, the resulting DTM was not a reliable final product for areas covered by trees and willow shrubs. In these areas, the laser pulse could not reach the ground surface due to the presence of branches, resulting in an overestimation of the ground elevation because the algorithm used in the analysis treated the lowest points as ground. The subsequent use of GEOBIA classification in the processing procedure helped detect such areas and allowed the application of vegetation correction factors, yielding the final corrected DTM. Finally, vertical accuracy was determined by ground truthing after DTM creation, and an additional analysis was performed to present the importance of DTM accuracy for some crucial hydrological characteristics.
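The nDSM subtraction described above can be illustrated with a minimal sketch. It assumes the DSM and DTM are co-registered grids of equal shape stored as nested lists, with `None` marking no-data cells. The function name `normalized_dsm`, the clipping of negative heights to zero, and the sample elevations are illustrative assumptions, not part of the original workflow.

```python
from typing import List, Optional

Grid = List[List[Optional[float]]]  # None marks a no-data cell


def normalized_dsm(dsm: Grid, dtm: Grid) -> Grid:
    """Return nDSM = DSM - DTM cell by cell.

    Negative differences are clipped to 0 (an assumption made here so that
    heights stay non-negative); no-data cells propagate as None.
    """
    ndsm: Grid = []
    for dsm_row, dtm_row in zip(dsm, dtm):
        row: List[Optional[float]] = []
        for top, ground in zip(dsm_row, dtm_row):
            if top is None or ground is None:
                row.append(None)
            else:
                row.append(max(top - ground, 0.0))
        ndsm.append(row)
    return ndsm


# Hypothetical 2 x 2 tiles (elevations in metres)
dsm = [[103.0, 118.5], [102.5, None]]
dtm = [[102.5, 103.0], [102.5, 102.0]]
ndsm = normalized_dsm(dsm, dtm)
```

In this toy example the cell with a relative height of 15.5 m would fall into a forest-like height class, whereas the 0.5 m cell would correspond to low vegetation.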

Geographic Object-Based Image Analysis Classification
GEOBIA was used in this study to integrate ALS data, optical remote sensing imagery (satellite and airborne), and thematic vector data to develop an accurate map of vegetation type and height across the northern part of the lower Biebrza Basin. Data processing was conducted with eCognition software (Trimble GeoSpatial).
Today, this type of image classification is used by the world's leading research centers involved in the processing of remote sensing images. [34][35][36] Processing using the GEOBIA approach is based on objects, i.e., groups of pixels representing various features. In the first step of the analysis, the pixels are segmented into objects. Then, objects are assigned to land cover classes defined by the user. This process can be expanded and can consist of many steps, including reshaping, resegmenting, and reclassifying the initially created objects. This allows the creation of rule sets for processing large numbers of datasets without any user interaction. In this research, GEOBIA was used to process different types of spatial data (Fig. 2). In the first step (task 1), vector thematic maps, including streams, ditches, and oxbows, were rasterized. Using the existing vector data in the analysis allowed these objects to be represented more accurately in the final DTM. In the second step (task 2), the DTM, DSM, and nDSM were obtained from the ALS point cloud data, and the nDSM was segmented and classified in order to obtain a vegetation map based on differences in vegetation height. The use of satellite (Landsat 8) and airborne images in GEOBIA (tasks 4 and 5) allowed vegetated classes to be distinguished from nonvegetated classes and helped delineate the river bed and oxbows. Multiresolution segmentation was chosen for the image segmentation process. The scale parameter was set to 20. We used the default value of 0.1 for the shape factor, 0.2 for smoothness, and a value of 0.7 for compactness to obtain a better representation of forest and shrub boundaries. These parameters control the similarity and heterogeneity criteria for each image object.
The use of three-dimensional point clouds from laser scanners, particularly when integrated with multispectral information from digital cameras, significantly enhanced the process of image interpretation. 37 In task 4, several vegetation indices were tested in order to differentiate areas covered by alder forests, willow shrubs, common reed, reed-manna grass, sedges, and grass. The chosen vegetation indices, which are presented in Table 1, included the normalized difference vegetation index (NDVI 38 ), optimized soil-adjusted vegetation index (OSAVI 39 ), soil-adjusted vegetation index (SAVI 40 ), modified simple ratio (MSR 41 ), and the vegetation vitality ratio (VVR 42 ). All vegetation indices (Table 1) and the intensity map were assigned the same weight during the segmentation process. The resulting image objects were further classified into seven land cover classes.
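For orientation, the sketch below computes four of the listed indices from red and near-infrared reflectance using their commonly published forms. The exact formulas and band assignments used in Table 1 are not reproduced in this text, so the forms here (including the SAVI soil factor of 0.5 and the OSAVI constant of 0.16) are assumptions; VVR is omitted because its definition is not given here. The function names and the sample reflectances are hypothetical.

```python
import math


def ndvi(nir: float, red: float) -> float:
    """Normalized difference vegetation index."""
    return (nir - red) / (nir + red)


def savi(nir: float, red: float, soil_factor: float = 0.5) -> float:
    """Soil-adjusted vegetation index; soil_factor = 0.5 is the usual default."""
    return (1.0 + soil_factor) * (nir - red) / (nir + red + soil_factor)


def osavi(nir: float, red: float) -> float:
    """Optimized SAVI with the commonly used constant 0.16."""
    return (nir - red) / (nir + red + 0.16)


def msr(nir: float, red: float) -> float:
    """Modified simple ratio, built on the simple ratio NIR / red."""
    ratio = nir / red
    return (ratio - 1.0) / (math.sqrt(ratio) + 1.0)


# Hypothetical reflectances for a single vegetated pixel
nir, red = 0.5, 0.1
indices = {
    "NDVI": ndvi(nir, red),
    "SAVI": savi(nir, red),
    "OSAVI": osavi(nir, red),
    "MSR": msr(nir, red),
}
```

In a segmentation workflow such as the one described, each index would be computed per pixel and supplied as a separate, equally weighted input layer.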

Digital Terrain Model Accuracy Enhancement
The ALS DTM developed in task 2 represented a false terrain surface caused by the so-called "vegetation effect." Therefore, an additional correction was performed in task 7 of the presented procedure (Fig. 2) in order to decrease the influence of vegetation on the resulting DTM. Real-time kinematic global positioning system (RTK GPS) measurements taken in the different vegetation types within the research area allowed the differences between the ALS DTM and the elevations of the ground control points to be calculated. Positive values indicate that the DTM was above the measured ground control points. The mean value of this shift was calculated for each vegetation class and then subtracted from the DTM within that vegetation class in order to correct the elevation values (Table 2). The correction factors for areas overgrown by shrubberies higher than about 2.5 m and reeds higher than about 3.0 m did not allow sufficient correction of the ground elevation. An insufficient number and spatial distribution of reference points measured with the RTK GPS within patches of these vegetation types are likely to have led to the unsatisfactory correction of these types. These problems arose from difficulties in acquiring high-accuracy RTK GPS measurements, because the RTK GPS signal in these high and dense vegetation types was poor. Fortunately, these vegetation types of the floodplain could be accurately mapped using GEOBIA. For areas covered by very dense clusters of high shrubs (represented here by Cornus sericea or willow shrubs) or by dense and high reeds, the elevation data were excluded from the DTM, and interpolation from known ground points surrounding the erased vegetation patch was applied to fill in the gaps. The GEOBIA classification attained a satisfactory accuracy (Table 2), with an overall value of 80%.
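The per-class correction described above (mean DTM minus RTK GPS difference per vegetation class, subtracted from the DTM within that class) can be sketched as follows. This is a simplified illustration, not the processing chain used in the study: the function names, the class labels, and all elevations are hypothetical, and cells whose class has no correction factor are left unchanged by assumption.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def class_correction_factors(
    samples: Iterable[Tuple[str, float, float]]
) -> Dict[str, float]:
    """Mean (DTM - RTK GPS) elevation difference per vegetation class.

    Each sample is (vegetation_class, dtm_elevation, rtk_elevation).
    Positive values mean the DTM lies above the surveyed ground.
    """
    sums: Dict[str, float] = defaultdict(float)
    counts: Dict[str, int] = defaultdict(int)
    for veg_class, dtm_z, rtk_z in samples:
        sums[veg_class] += dtm_z - rtk_z
        counts[veg_class] += 1
    return {veg_class: sums[veg_class] / counts[veg_class] for veg_class in sums}


def apply_correction(
    dtm: List[List[float]],
    class_map: List[List[str]],
    factors: Dict[str, float],
) -> List[List[float]]:
    """Subtract each class's correction factor from the DTM cells of that class."""
    return [
        [z - factors.get(veg_class, 0.0) for z, veg_class in zip(z_row, c_row)]
        for z_row, c_row in zip(dtm, class_map)
    ]


# Hypothetical control points: (class, DTM elevation, RTK GPS elevation) in metres
samples = [
    ("sedges", 102.25, 102.0),
    ("sedges", 102.75, 102.5),
    ("reed", 103.5, 103.0),
    ("reed", 104.0, 103.5),
]
factors = class_correction_factors(samples)

dtm = [[102.5, 103.75], [102.0, 101.0]]
class_map = [["sedges", "reed"], ["sedges", "water"]]
corrected = apply_correction(dtm, class_map, factors)
```

Patches excluded from the DTM (very dense high shrubs and reeds) would be handled separately by interpolation and are outside the scope of this sketch.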

Verification of the Digital Terrain Model Processing Results and Influence of Digital Terrain Model Accuracy on Habitat Assessment
All topographic products developed from laser scanning data were verified using RTK GPS surveys performed within the research area [43][44][45] (Fig. 4). A regular distribution of the measurement points was not possible due to the very demanding field conditions (flooding) and the lack of roads or paths allowing the penetration of the valley. These measurements were taken during surveys carried out in recent years in Biebrza National Park 46 and by field measurements performed within this study. Data from the current study were collected using high-precision Topcon dual-frequency (L1/L2) RTK Global Navigation Satellite System (GNSS) receivers (GR-3 and GRS-1), and archival data were collected using dual-frequency (L1/L2) Topcon Legacy E receivers with a PGA-1 antenna. The RTK GPS surveying method was applied using the real-time NAWGEO service of the multifunctional precise satellite positioning system of ASG-EUPOS. Estimated precision of the real-time NAWGEO service is as high as 0.03 m (horizontally: XY coordinates) and 0.05 m (vertically: Z coordinate).
The RTK GPS measurements were taken using only GPS reference stations as no GLONASS and GALILEO reference stations are present in northeastern Poland.
All measurements were taken using multipath reduction and only in RTK fixed mode with position dilution of precision values lower than 4. Data points for which the GPS was not able to achieve a good GPS solution (RTK fixed solution), due to signal attenuation by the vegetation, were excluded from further use. RTK fixed solution means that the GPS can see at least five satellites in common and is receiving corrections from the base stations. The measurement accuracy in RTK mode is estimated as ±1.5 cm horizontally and ±2 cm vertically. Points measured were not referenced to any geodetic control network reference point because no such network exists in the research area.
Measurements were scattered throughout the river valley depending on topography and included 307 points taken randomly in different types of vegetation within the study area. At each location point, up to four measurements were taken, and then averaged in order to improve RTK GPS accuracy due to the lack of geodetic benchmarks in the research area. This allowed data to be obtained with an accuracy close to the one recommended by the American Society for Photogrammetry and Remote Sensing 47 and the European Spatial Data Research Network 48 regarding accuracy validation of ALS data. To quantify DTM error, several measures were computed (Table 3).
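As an illustration of how such error measures can be computed, the sketch below derives the mean error (ME), mean absolute error, root mean square error (RMSE), sample standard deviation (SD), and maximum absolute error from paired DTM and RTK GPS elevations. The exact set of measures in Table 3 is not reproduced in this text, so this selection is an assumption, as are the function name and the sample values.

```python
import math
from typing import Dict, Sequence


def dtm_error_stats(dtm_z: Sequence[float], rtk_z: Sequence[float]) -> Dict[str, float]:
    """Summary statistics of the DTM minus RTK GPS elevation errors (metres)."""
    errors = [d - r for d, r in zip(dtm_z, rtk_z)]
    n = len(errors)
    me = sum(errors) / n
    return {
        "ME": me,
        "MAE": sum(abs(e) for e in errors) / n,
        "RMSE": math.sqrt(sum(e * e for e in errors) / n),
        # Sample standard deviation (n - 1 in the denominator)
        "SD": math.sqrt(sum((e - me) ** 2 for e in errors) / (n - 1)),
        "max_abs": max(abs(e) for e in errors),
    }


# Hypothetical paired elevations at four control points
dtm_z = [102.5, 103.0, 101.75, 104.0]
rtk_z = [102.0, 103.0, 102.0, 103.5]
stats = dtm_error_stats(dtm_z, rtk_z)
```

A positive ME indicates that the DTM lies, on average, above the surveyed ground, which is the signature of the "vegetation effect".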
To present the relevance of DTM enhancement for quantification of hydrological features of the floodplain, we compared flood extents, depths of water within the flooded plain, and volumes of flood computed with the uncorrected and corrected DTMs applied to the analysis. Analyses were based on the intersection of grid datasets containing elevations of water table in various flood scenarios. (We used data of flooding in 1%, 5%, 10%, 20%, and 50% recurrence intervals computed by Grygoruk et al. 49 ) Differences in assessed values of these parameters are expected to present the relevance of DTM correction for hydrological estimation of floodplains and their role in shaping ecological features of riparian wetlands.
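The grid intersection described above can be sketched as follows: a cell is flooded where the water table elevation exceeds the terrain elevation, the depth is their difference, and extent and volume follow from the cell area. This is a minimal illustration under the assumption of co-registered grids with square cells; the function name, the cell size default, and the sample grids are hypothetical and do not reproduce the study's data.

```python
from typing import Dict, List


def flood_statistics(
    dtm: List[List[float]],
    water_level: List[List[float]],
    cell_size: float = 1.0,
) -> Dict[str, float]:
    """Flood extent, volume, and depth statistics from two co-registered grids."""
    cell_area = cell_size ** 2
    depths = [
        wl - z
        for z_row, wl_row in zip(dtm, water_level)
        for z, wl in zip(z_row, wl_row)
        if wl > z  # only cells below the water table are flooded
    ]
    n_flooded = len(depths)
    return {
        "extent_m2": n_flooded * cell_area,
        "volume_m3": sum(depths) * cell_area,
        "max_depth_m": max(depths, default=0.0),
        "mean_depth_m": sum(depths) / n_flooded if n_flooded else 0.0,
    }


# Hypothetical 2 x 2 example: the same water table over two terrain models
water = [[102.75, 102.75], [102.75, 102.75]]
dtm_uncorrected = [[102.0, 102.5], [103.0, 103.5]]
dtm_corrected = [[101.75, 102.0], [102.5, 103.5]]

stats_uncorrected = flood_statistics(dtm_uncorrected, water)
stats_corrected = flood_statistics(dtm_corrected, water)
```

In this toy case the lower, corrected terrain yields a larger flooded area, greater depths, and a larger stored volume, which is the direction of the discrepancy examined in the comparison.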

Accuracy Assessment Results
The accuracy assessment was performed on two developed DTMs (Figs. 5 and 6). A comparison of the accuracy of the analyzed DTMs shows that the best-fit between the ground-measured elevations and DTM-derived elevations was obtained in the case of the processed ALS DTM with "vegetation effect correction" (Table 4).
Considering that the variance of the mean of n independent variables having the same distribution equals 1/n of the variance of each individual variable, the combined influence of GPS and ALS accuracy on the calculated correction factor equals $[(0.05^2 + 0.15^2)/n]^{0.5}$ and, in our case, never exceeds 0.03 m (Table 2). This value is significantly lower than the correction factors developed for the particular vegetation types analyzed (Table 2).
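The magnitude of this estimate can be checked with a short calculation. The sketch below assumes that the two error sources (0.05 m vertical GPS precision and 0.15 m ALS elevation error) are independent and add in quadrature before being divided by the number of control points n in a class; the function name and the values of n are hypothetical.

```python
import math


def correction_factor_uncertainty(
    n_points: int, gps_sd: float = 0.05, als_sd: float = 0.15
) -> float:
    """Standard error (m) of a mean DTM-minus-GPS difference over n_points."""
    return math.sqrt((gps_sd ** 2 + als_sd ** 2) / n_points)


# Hypothetical numbers of control points per vegetation class
for n in (27, 28, 50, 100):
    print(n, round(correction_factor_uncertainty(n), 4))
```

Under these assumptions the uncertainty drops below 0.03 m once a class contains about 28 or more control points.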
The ME of the ALS DTM versus the RTK GPS field measurements of ground elevation was below 0.13 m, which we consider acceptable for application in hydrological analyses, such as the calculation of flood depths and extents. Although the maximum error of this DTM reached 1.1 m, the standard deviation (SD) of all the values analyzed allows the conclusion that, although errors higher than 0.5 m occurred [32% of the data have an error of (0.12 ± 0.4 m)], their frequency of occurrence was considerably lower than in the case of the uncorrected DTM (Fig. 7). The errors higher than 5 m were associated with the forest class.
The application of GEOBIA presented in our study is the first use of this methodology for improving a DTM on the basis of the vegetation classes distinguished. With a classification accuracy of 80%, which we consider more than satisfactory and very promising for further development, our study opens the field for the improvement of ALS-based DTMs of densely vegetated wetlands. Among the main issues we faced in the presented approach was the unsatisfactory accuracy of the terrain elevation measurements made with RTK GPS in patches of high (>2.5 m) bushes and dense and high (>3.0 m) reeds, which were used to calculate the correction factors for particular vegetation types. The next step in the development of the DTM improvement methodology presented here would be to make accurate measurements of terrain elevations within the extents of those vegetation types, preferably with standard optical geodetic devices such as a total station. Such measurements, being much more accurate than those made with RTK GPS, would provide high-quality elevation data within patches of high and dense vegetation. Although low accessibility of the study area and harsh field conditions did not allow such measurements to be undertaken, we expect that optical geodetic measurements would bring the resulting DTMs into even closer agreement with the field-collected data.

Flood Mapping Results
To present how the different DTMs applied to flood-mapping procedures affected the final results of the flood area and floodplain volume calculations, we analyzed floods of 1%, 5%, 10%, 20%, and 50% recurrence (Table 5). Water levels were modeled using a one-dimensional hydrodynamic model. 50 Because microtopography and the presence of dense vegetation may significantly modify flood distribution and should therefore be determined with high accuracy, the ALS-based DTM with additional correction was expected to give the most reliable results of flood analysis. The results show that the application of coarse-quality elevation data in flood mapping can lead to a significant underestimation of flood extent and flood depth. This can result in inappropriate decision making when hydrological characteristics play important roles in ecosystem management (Fig. 8). As ecological analyses frequently refer to flood depths, we evaluated maximum and mean depths of the flood along with SDs (Table 6).
Maximum flood depth reached ∼2.18 m (ALS DTM without additional correction), which is 0.24 m lower than the maximum depth obtained in calculations using the ALS DTM with additional processing. Again, the results of flood depth calculations obtained with a coarse DTM applied to lowland floodplains do not allow comprehensive conclusions on floodplain hydrology to be derived.
Knowing that the quality of the DTMs affected flood extent and flood depth calculations, we also compared floodplain water retention capacities (Table 7). The calculated flood volumes were higher by 13.04 million m³ on average for the DTM with correction. The flood volume was higher by 15.93 million m³ for the 1% flood recurrence and by 9.95 million m³ for the 50% flood recurrence. Our analyses confirmed that, in addition to the importance of vegetation influence on computed hydrological features of floodplain wetlands (volumes of vegetation versus volumes of flood), 51 it is highly important to consider the vegetation-type-dependent quality of the DTM to obtain reliable results of topographic and ecohydrological analyses of these ecosystems.

Conclusions
The main goal of this research was to efficiently process an ALS point cloud dataset to obtain an accurate DTM of a densely vegetated floodplain without losing relevant terrain information that may affect future hydrological analyses in which the DTM is to be applied. The combination of different types of data, including topography and spatial patterns of wetland vegetation, provided a basis for developing an accurate DTM. The methodology for DTM development based on ALS and including GEOBIA classification and plant height correction coefficients proved to be an effective tool for high-quality DTM development. Comparison of the DTM produced using the standard procedures with the DTM developed with the proposed methodology showed that RMSE values were up to three times lower for the vegetation-height-corrected DTM. The application of GEOBIA for vegetation mapping, which was a novel approach to DTM improvement, achieved an overall classification accuracy of 80%. Correction factors developed for the various vegetation types ranged from 0.08 to 0.92 m. Removing dense vegetation from the ALS point cloud and then interpolating terrain elevations across the resulting spatial gaps considerably increased the accuracy of the DTM. Although the values of the vegetation height correction coefficients seem universal for the plant communities analyzed, we strongly recommend deriving appropriate values of these coefficients for research sites other than the Biebrza Valley. Flood volume assessment discrepancies between the two DTMs applied reached values as high as 66%. Although the proposed method proved capable of providing a good-quality DTM for hydrological analyses of temperate floodplain wetlands, it is more resource- and time-consuming than the standard one, mainly due to the extensive field measurements required to obtain accurate field elevation data.
In spite of this disadvantage, we recommend this methodology for all riparian wetlands where an accurate representation of floodplain topography may play a crucial role in appropriate wetland management.