Estimation of shallow bathymetry using Sentinel-2 satellite data and random forest machine learning: a case study for Cheonsuman, Hallim, and Samcheok Coastal Seas

Abstract. Bathymetry, the measurement of sea depth, has conventionally been conducted using echo-sounders on vessels. However, various factors limit conventional shipborne surveys in coastal regions, including data continuity, geographic obstacles, diplomatic concerns, and marine infrastructures. Remote sensing technology can address these limitations, particularly with the advancement of satellite imaging technology. Indeed, many studies are underway to develop machine learning-based water depth estimation technologies. However, previous studies have focused on clear waters with low turbidity or uniform seabed sediment. Therefore, in this study, we developed a satellite-derived bathymetry (SDB) model using the random forest machine learning algorithm, which was applied to three coastal areas around the Korean Peninsula with distinct characteristics: clear waters (Samcheok), high turbidity (Cheonsuman), and varied seabed sediments (Hallim). We then compared the accuracy of the bathymetric mapping data derived in these three areas. The estimated depth values exhibited the highest accuracy in Samcheok, followed by Hallim and Cheonsuman. Based on Worldview-3 images and on-site surveys, we confirmed the presence of basalt on the seabed. However, the remote reflectance was attenuated due to the effect of the black rock, leading to an overestimation of the depth. In the future, additional satellite images will be applied as training data for the machine learning model to advance the SDB technology using turbidity and seabed sediment distribution data for each area. Ultimately, the SDB results will be applied as depth monitoring data to facilitate safe ship passage in coastal areas, including ports that require periodic and consistent coastal bathymetry. In addition, they can be applied as input data for numerical ocean models, contributing to various fields.


Introduction
The depth of coastal waters is a crucial factor influencing marine environmental management, marine hydrodynamic structures, marine infrastructures, and ship navigation safety. Traditionally, bathymetric surveys have been conducted using shipborne echo-sounders, which calculate ocean depths based on the time it takes for sound waves to reflect off the seabed and return.
To overcome the limitations of contact-based bathymetry methods, extensive research on satellite-derived bathymetry (SDB) is underway globally. [5][6][7][8][9] SDB technology estimates water depth based on correlations between the remotely sensed reflectance values of satellite imagery observed with optical multispectral sensors and the water depth during image acquisition. While SDB can generally be applied to depths of up to 20 m, it may only be applicable to depths of up to 10 m, depending on the characteristics of the marine region. 5,10,11 Indeed, the reflectance of light penetrating the water decreases exponentially with increasing water depth, and the attenuation of remote reflectance varies by wavelength. 3,7 The Lyzenga linear band model is a widely used simple SDB model that defines a linear relationship, assuming that the seabed reflectance is linearly related to the depth variation. [14] However, these models adopt empirical methodologies; consequently, the input values for depth estimation vary between marine regions, impeding the construction of a universally applicable model. 31,15
Random forest (RF) is a machine learning algorithm that falls under the category of decision tree learning. It is commonly employed for tasks involving classification and regression analysis. 16 In particular, its capacity to readily adjust variables and parameters, combined with its capacity to efficiently handle large amounts of data, makes RF a commonly used method in SDB research for the construction of regression models. 17,27,28,29
The coastal waters of the Korean Peninsula's West, South, and East Seas differ considerably in marine environmental characteristics, including depth distribution, water turbidity, and sediment composition. The Yellow Sea (West Sea) seabed comprises sand and mud and is characterized by continuous sediment influx from rivers, seabed topography with low-gradient slopes, extensive tidal flats due to a large tidal range, shallower depths, strong tidal influence, and high underwater turbidity due to consistent tidal currents. In contrast, the East Sea has a simple coastline and a narrow continental shelf, leading to rapid depth increases from the coast and relatively clear water with low turbidity. Meanwhile, the South Sea presents characteristics intermediate between the East and Yellow Seas, with a more complex coastline dotted with many small- to medium-sized islands, and a seabed composed primarily of sand and mud. 30 In addition, the coastal waters around Jeju Island have a mix of sandy and fine-grained shell sediments alongside basalt reefs. 31,32 Domestic and international studies have applied various AI techniques to a single study area with clear waters and determined the most appropriate AI approach, 15,33 or have applied an AI technique to multiple study areas with marine characteristics less affected by turbidity. 5 However, no study has quantitatively evaluated satellite-based bathymetry results estimated using an SDB model developed with the same methodology for multiple study areas with distinct marine environmental characteristics, such as the West, South, and East Seas of the Korean Peninsula.
In this study, we aimed to develop a model for estimating water depths in three selected coastal areas with distinct marine environmental characteristics, specifically in terms of tide, seabed sediment, and turbidity. To this end, we utilized Sentinel-2 satellite imagery with a 10-m resolution and multibeam bathymetric data provided by the Korea Hydrographic and Oceanographic Agency (KHOA). This model employed the RF machine learning algorithm for training and evaluation datasets. The application of SDB was restricted to areas with depths up to 20 m, and a comparative test was performed using the bathymetric data acquired from KHOA. Furthermore, the potential sources of estimation errors were analyzed by considering the marine environmental characteristics unique to each area, and the feasibility of implementing SDB technology was evaluated.
2 Materials and Methods

Study Area
In this study, the East Sea, Yellow Sea, and South Sea, and the waters around Jeju Island were included in the analyses. To accurately represent the diverse marine environmental characteristics of the three seas surrounding the Korean Peninsula and generate optimal machine learning data along with credible depth estimation results, we selected Samcheok (East Sea), Hallim (South Sea), and Cheonsuman Bay (West Sea) as our training areas (Fig. 1 and Table 1).
Sentinel-2 captures images of the same area at 5-day intervals. For the machine learning model that estimated depths using the Sentinel-2 satellite imagery, we employed five bands: blue (B2), green (B3), red (B4), vegetation red edge (B5), and near-infrared (NIR, B8). Band B5 had a spatial resolution of 20 m and was, thus, resampled using a linear method to correspond with the 10-m spatial resolution of the other bands (Table 2).
For the training dataset, we utilized six to seven multi-temporal images per training area. This approach was adopted because, in areas with large tidal variation, the real water depth differs from one satellite image to the next, which could be a limitation when using a single image. 5 Subsequently, we selected images of the training areas captured in 2020, the same year as the nautical chart's production, focusing on images with minimal cloud coverage (≤ 10%) and little influence from turbidity and waves (Table 3).

Bathymetric data
The depth data provided by KHOA were water depth values used for the latest nautical chart (echo-sounding in 2020), extracted and edited based on echo-sounder data, with the referenced depths based on the datum level (DL) (Fig. 2). Because the bathymetric data were in a vector format, they were resampled to align with the 10-m resolution of the satellite imagery grid. Drawing on prior research regarding the limitations of light penetration depth, our study focused solely on areas with depths < 20 m. 3,5,14 To adjust for the datum, we employed tidal values from the corresponding timestamps for both model training and result testing.

Tidal data
To obtain tidal data, we employed the NAO99.Jb tidal model provided by the National Astronomical Observatory of Japan. 35 The NAO99.Jb tidal model, with a spatial resolution of 1/12 deg (about 1 km), was designed for use in the Northwest Pacific region. 36 To ascertain the precision of the NAO99.Jb model, its outputs were systematically compared to observation data. The detailed analysis was performed at three positions in the Yellow Sea (Incheon, Pyeongtaek, and Anheung) with a high tidal range [Fig. 3(a)]. The tidal amplitudes obtained from the tide stations at these locations were compared to the values predicted by the model. The error at all stations was calculated to be < 5 cm (Table 4). These findings suggest that the NAO99.Jb model can be effectively utilized as tidal calibration data.

Beer-Lambert law
The total upwelling radiance ($L_t$) observed from the satellite was defined as the sum of atmospheric path radiance ($L_p$), specular radiance ($L_s$), subsurface volumetric radiance ($L_v$), and bottom radiance ($L_b$), as expressed by Eq. (1) (Fig. 4):

$$L_t = L_p + L_s + L_v + L_b \tag{1}$$

In Eq. (1), $L_p$ can be removed through atmospheric correction, and $L_s$ and $L_v$ can be removed through sun-glint and deep-water corrections, respectively, leaving only $L_b$, which is defined by the Beer-Lambert law as per Eq. (2), where $\lambda$ is the wavelength, $\rho(\lambda)$ is the bottom reflectance, $\alpha(\lambda)$ is the water's attenuation coefficient, $\theta$ is the viewing angle (from the nadir), $\Phi$ is the solar-illumination angle (from the vertical), and $Z$ is the water depth. Equation (2) indicates that reflectance decreases exponentially as the water depth increases; this decrease is more pronounced for longer wavelengths.
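
As an illustration of the exponential attenuation described by Eq. (2), one commonly used Beer-Lambert form written with the symbols defined above is sketched below. The proportionality (rather than an explicit leading irradiance factor) and the way the two slant-path terms are combined are assumptions and may differ from the exact expression used in this study.

```latex
% Sketch only: a common Beer-Lambert form, not necessarily the study's Eq. (2).
L_b(\lambda) \;\propto\; \rho(\lambda)\,
  \exp\!\left[-\alpha(\lambda)\left(\frac{1}{\cos\theta}+\frac{1}{\cos\Phi}\right) Z\right]
% Taking the natural logarithm gives a relation that is linear in depth Z:
\ln L_b(\lambda) \;=\; C(\lambda) \;-\; \alpha(\lambda)\left(\frac{1}{\cos\theta}+\frac{1}{\cos\Phi}\right) Z
```

Here $C(\lambda)$ is a hypothetical constant that collects the bottom reflectance and the proportionality factor. Under this form, a larger $\alpha(\lambda)$, as is typical at longer wavelengths, makes the radiance fall off more quickly with $Z$.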

Sun-glint correction
Sun-glint is observed in satellite images when sunlight reflects directly into the sensor due to a tilted surface. This can arise from various factors, including the sea surface, sun's position, sensor's viewing angle, and wind. To enhance the accuracy of water reflectance results, sun-glint correction was applied. 37 This correction typically involves establishing a linear relationship between the NIR band and other bands, followed by adjustments for outlier pixels. 38,39 This procedure was conducted using the Sentinel Application Platform offered by the European Space Agency (Fig. 5).

Table 4 The amplitude of observed values from the three regions compared with the amplitude of values from the NAO99.Jb model.
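
The NIR-based linear relationship mentioned in the sun-glint correction above can be sketched as follows. This is a minimal illustration of the general regression approach, not the Sentinel Application Platform implementation: the function names, the sample reflectances, and the choice of the sample minimum as the glint-free NIR level are all assumptions.

```python
def glint_slope(band, nir):
    """Least-squares slope of a visible band regressed against NIR.

    Both inputs are reflectances from a sample of optically deep pixels,
    where any covariation is assumed to be caused by glint alone.
    """
    n = len(band)
    mean_band = sum(band) / n
    mean_nir = sum(nir) / n
    cov = sum((b - mean_band) * (x - mean_nir) for b, x in zip(band, nir))
    var = sum((x - mean_nir) ** 2 for x in nir)
    return cov / var


def deglint(band, nir, slope, nir_min):
    """Subtract the glint contribution predicted from the NIR excess."""
    return [b - slope * (x - nir_min) for b, x in zip(band, nir)]


# Hypothetical deep-water sample (values are illustrative only).
sample_nir = [0.010, 0.020, 0.030, 0.040]
sample_green = [0.050, 0.070, 0.090, 0.110]

slope = glint_slope(sample_green, sample_nir)
corrected = deglint(sample_green, sample_nir, slope, min(sample_nir))
```

In this toy sample the green band varies only with NIR, so the corrected values collapse to a single glint-free level; real imagery would retain the water-leaving signal after the subtraction.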

Land masking
In the NIR band, water intensely absorbs light, resulting in pixel reflectance values that approach 0. The normalized difference water index (NDWI) exploits this characteristic to differentiate between water and land by employing the NIR and green bands. 40 This relationship is represented by Eq. (3), with NDWI values ranging from −1 to 1:

$$\mathrm{NDWI} = \frac{\mathrm{Green} - \mathrm{NIR}}{\mathrm{Green} + \mathrm{NIR}} \tag{3}$$

where Green and NIR are the reflectances of the green (B3) and NIR (B8) bands, respectively. Regions with an NDWI value > 0 were categorized as sea, while those with a value < 0 were identified as land [Fig. 5(c)].
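
The land-masking rule above can be sketched per pixel as follows. The function names and reflectance values are hypothetical, and treating a zero denominator as NDWI = 0 is an assumption made only to keep the sketch safe to run.

```python
def ndwi(green, nir):
    """Normalized difference water index from green and NIR reflectance."""
    denom = green + nir
    if denom == 0:
        return 0.0  # assumption: undefined ratio is treated as neutral
    return (green - nir) / denom


def is_water(green, nir):
    """Pixels with NDWI > 0 are kept as sea; all others are masked as land."""
    return ndwi(green, nir) > 0


# Hypothetical (green, NIR) reflectance pairs: one water-like, one land-like.
pixels = [(0.08, 0.02), (0.10, 0.30)]
mask = [is_water(g, n) for g, n in pixels]
```

Because water absorbs strongly in the NIR, the first pixel yields a positive index and is retained, whereas the second is masked.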

Tidal correction
The depth indicated on the nautical chart is based on the DL standard, while the SDB result reflects the real water depth from the sea surface, influenced by the tide at the time of satellite imaging. Both the Yellow Sea and South Sea of Korea are strongly influenced by tides. Notably, in the Yellow Sea, the tidal range can span up to 10 m. 41,42 Consequently, to generate an SDB result aligned with the DL standard, a datum correction accounting for the tidal level during the imaging time is necessary [Eq. (4)], where $D_s$ (m) is the depth at the time of satellite imaging, $tl_{MSL}$ is the tidal level from the mean sea level (MSL), and $H_m$, $H_s$, $H_o$, and $H_k$ denote the semi-range of the tidal constituents $M_2$, $S_2$, $O_1$, and $K_1$, respectively. The tidal height data during satellite imaging were sourced from the tidal model, using the grid value nearest to the training area. Tidal heights are positive and negative for high and low tides, respectively, relative to the mean sea level. These values were input into Eq. (4) to determine the corrected depth from the nautical chart during imaging. The corrected depth was then utilized as the reference data for SDB model training. Similarly, this approach can convert the depth value derived from the SDB model to the DL. Predicting bathymetry through satellite imaging measures the depth at the time the satellite images were taken. Therefore, for quantitative assessment, the depth value adjusted to the nautical chart datum must be used.
We constructed a model to estimate water depth using Sentinel-2A/B satellite images, electronic nautical chart data, and the tidal model (Fig. 6). The input data for the machine learning model were extracted from the Sentinel-2A/B images of the training areas, followed by sun-glint correction and land masking. Subsequently, we employed a mean filter to mitigate noise in the satellite images by replacing the value of each target pixel with the mean of its 3 × 3 pixel neighborhood. Our model used depth data from the electronic nautical chart, corrected based on the tidal height values extracted for the time each satellite image was captured. We then matched the preprocessed satellite imagery of five bands with the reference depth data corrected through tidal model data. A dataset was created using matched data corresponding to the number of multi-temporal images. We grouped the band values and depth values for every pixel where band values of each image were present. Finally, the dataset was composed by creating a random sample from the data. This dataset was used as training material for the machine learning model. For depth estimation, we used the RF method, which involves training multiple decision trees. This algorithm facilitates easy variable modification and demands high dataset accuracy for training. 5 We constructed the RF-based SDB model using the ensemble package in Python's scikit-learn tool [Fig. 6(a)]. The trained model was then applied to independently observed satellite images to estimate water depths in the training and test areas. Depths estimated from the satellite imagery were adjusted to the DL standard for comparison with the nautical chart data [Fig. 6(b)].
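
The RF regression step described above can be sketched with scikit-learn's ensemble package as follows. This is a minimal illustration on synthetic data, not the configuration used in this study: the attenuation coefficients, noise level, sample size, and hyperparameters (`n_estimators`, `random_state`) are all assumptions chosen only so that the example runs quickly.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

# Synthetic reference depths (m) standing in for tide-corrected chart depths.
depth = rng.uniform(0.0, 20.0, size=2000)

# Hypothetical per-band attenuation coefficients for B2, B3, B4, B5, and B8;
# reflectance decays exponentially with depth, plus a little sensor noise.
attenuation = np.array([0.05, 0.08, 0.30, 0.50, 2.00])
bands = 0.2 * np.exp(-np.outer(depth, attenuation))
bands += rng.normal(0.0, 0.002, size=bands.shape)

# 80% of the samples train the model; the remaining 20% validate it.
X_train, X_val, y_train, y_val = train_test_split(
    bands, depth, test_size=0.2, random_state=0
)

model = RandomForestRegressor(n_estimators=100, random_state=0)
model.fit(X_train, y_train)

pred = model.predict(X_val)
rmse = float(np.sqrt(np.mean((pred - y_val) ** 2)))
```

In practice, each row of `bands` would hold the five preprocessed band values of one pixel, and `depth` would hold the matching reference depth at the imaging time.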

Depth estimation model training
The SDB model was developed using satellite images and categorized into three groups based on their respective training areas. For each area, datasets were created by combining data from five bands of the satellite images with nautical chart data, specifically targeting pixels representing depths between −5 and 25 m. Given the depth distribution across the training areas and to avoid training bias at certain depths, a consistent number of data points (15,000) was randomly sampled at 5-m intervals to create the training dataset. Of each dataset, 80% of the data points were utilized for model training, and the remaining 20% were randomly selected for validation [Fig. 6(a)]. The accuracy of the satellite (estimated) data in relation to the in situ (actual) data was measured through Pearson's correlation coefficient (r) and the root mean square error (RMSE). The calculations for r and RMSE are represented by Eqs. (5) and (6), respectively:

$$r = \frac{1}{n-1}\sum_{k=1}^{n}\left(\frac{D_{i,k}-\bar{D}_i}{S_i}\right)\left(\frac{D_{s,k}-\bar{D}_s}{S_s}\right) \tag{5}$$

$$\mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{k=1}^{n}\left(D_{i,k}-D_{s,k}\right)^2} \tag{6}$$

where $D_i$ is the actual depth, $D_s$ is the estimated depth, $\bar{D}_i$ and $\bar{D}_s$ are the averages of $D_i$ and $D_s$, $S_i$ and $S_s$ are the standard deviations of $D_i$ and $D_s$, and $n$ is the number of data points (indexed by $k$).
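
The depth-balanced sampling, the 80/20 split, and the two accuracy metrics can be sketched in plain Python as follows. The record layout, function names, and the small per-bin count are hypothetical (the study sampled 15,000 points per 5-m interval), and the sample standard deviation (divisor n − 1) is an assumption about how $S_i$ and $S_s$ are computed.

```python
import math
import random


def stratified_sample(records, per_bin, bin_width=5.0, seed=0):
    """Draw up to per_bin records from each depth bin.

    Each record is a (features, depth) tuple; bins are bin_width metres wide.
    """
    rng = random.Random(seed)
    bins = {}
    for rec in records:
        bins.setdefault(math.floor(rec[1] / bin_width), []).append(rec)
    sample = []
    for key in sorted(bins):
        group = bins[key]
        sample.extend(rng.sample(group, min(per_bin, len(group))))
    return sample


def pearson_r(actual, estimated):
    """Pearson's r using sample standard deviations (divisor n - 1)."""
    n = len(actual)
    mean_a = sum(actual) / n
    mean_e = sum(estimated) / n
    s_a = math.sqrt(sum((a - mean_a) ** 2 for a in actual) / (n - 1))
    s_e = math.sqrt(sum((e - mean_e) ** 2 for e in estimated) / (n - 1))
    total = sum(((a - mean_a) / s_a) * ((e - mean_e) / s_e)
                for a, e in zip(actual, estimated))
    return total / (n - 1)


def rmse(actual, estimated):
    """Root mean square error between actual and estimated depths."""
    n = len(actual)
    return math.sqrt(sum((a - e) ** 2 for a, e in zip(actual, estimated)) / n)


# Hypothetical records covering depths from -5 m up to just under 25 m.
records = [((), -5 + i / 20) for i in range(600)]
sample = stratified_sample(records, per_bin=20)

# Shuffle, then keep 80% for training and 20% for validation.
random.Random(1).shuffle(sample)
cut = len(sample) * 4 // 5
train, val = sample[:cut], sample[cut:]
```

Sampling the same number of points from every 5-m bin keeps the model from being dominated by whichever depth range happens to cover the most pixels.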

Depth estimation model evaluation
To evaluate the performance of the depth estimation model tailored for each training area, we derived depth predictions from satellite images taken on dates distinct from those used in training (Table 5). The satellite images used for testing were preprocessed identically to the training data and then applied to the corresponding SDB model for the training area. Since these predictions represent depths at the precise moment of satellite imaging, they underwent a tidal correction to align with the DL standard. Subsequently, only predictions within the 0 to 20 m range were compared with the real depth.
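
The datum conversion applied to the predictions can be sketched as follows. The sketch assumes that the DL lies below mean sea level by the sum of the four constituent semi-ranges ($H_m + H_s + H_o + H_k$) and that the tidal level is added on top of that offset; this is an assumed reading of Eq. (4), and the function names and numerical values are hypothetical.

```python
def datum_offset(h_m, h_s, h_o, h_k):
    """Assumed height of mean sea level above the datum level (m)."""
    return h_m + h_s + h_o + h_k


def chart_to_imaging_depth(chart_depth, tide_msl, offset):
    """Chart (DL) depth converted to the water depth at imaging time."""
    return chart_depth + offset + tide_msl


def imaging_to_chart_depth(depth_at_imaging, tide_msl, offset):
    """SDB depth at imaging time converted back to the DL standard."""
    return depth_at_imaging - offset - tide_msl


# Hypothetical semi-ranges (m) for M2, S2, O1, and K1 and a tide of +1.2 m.
offset = datum_offset(2.0, 0.8, 0.3, 0.4)
reference = chart_to_imaging_depth(5.0, 1.2, offset)   # training target
recovered = imaging_to_chart_depth(reference, 1.2, offset)
```

The forward conversion yields the reference depths used for training, and the inverse conversion brings model predictions back to the DL before they are compared with the chart.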

Depth Estimation Based on Satellite Imagery
The model accuracy was validated using 20% of the learning dataset, corresponding to 3000 randomly selected data points per depth segment (totaling 12,000). As measured by RMSE, the errors were 5.0526, 4.6616, and 2.0749 m for Cheonsuman, Hallim, and Samcheok, respectively. Samcheok demonstrated a higher r of 0.9152 than the other locations (Table 6). A previous study on waters with clarity similar to Samcheok, specifically in areas where the annual average chlorophyll-a concentration is less than 1.0 mg/m³, reported similar RMSE values between 1.77 and 1.97 m. 5

Quantitative Evaluation
To compare the performance of the SDB model across the three training areas, a quantitative evaluation was conducted on the model using statistical analyses. For objective testing, data points were randomly extracted at 5-m depth intervals from satellite images that were not used during the depth estimation model training process. These evaluation results were then compared with depths from electronic nautical charts (Table 7). Significant variances were observed among the training areas: Samcheok exhibited the highest estimation accuracy (r = 0.9, RMSE = 2.5861 m), followed by Hallim (r = 0.52, RMSE = 5.4863 m) and Cheonsuman (r = −0.05, RMSE = 6.4603 m). The shallow-water estimations for Samcheok were precise, with accuracy diminishing linearly as depth increased. Interestingly, the RMSE for depths between 15 and 20 m was lower than that for depths between 5 and 15 m.

Evaluation of SDB Results and Accuracy Affected by Turbidity
In water bodies such as the Yellow Sea, which are significantly affected by tides and dominated by fine-grained sediment, the seabed sediments are readily resuspended by strong and recurring tidal currents, thus increasing the seawater turbidity. 38 Scattering by the suspended sediment often leads to shallower depth estimates than the actual depths. Cheonsuman, embodying the marine environmental characteristics typical of the Yellow Sea, exhibited a depth estimation accuracy lower than that in the East Sea.
The density scatter plot and correlation coefficient also failed to align with the actual depth value distributions. In areas with high turbidity, shallow depths are overestimated, and deep depths are underestimated, leading to errors. 33 This was observed in Cheonsu Bay. In regions deeper than 10 m, an inversion phenomenon was observed, where the estimated depth decreased as the actual depth increased. In areas with heightened turbidity, underwater suspended matter typically scatters light, resulting in a higher reflectance than in clearer waters.
To address these issues, we conducted an additional analysis using the normalized difference turbidity index (NDTI) to assess the impact of turbidity on the depth estimation model. The NDTI method measures the concentrations of soil sediments, microalgae, and other suspended materials that contribute to water turbidity, utilizing the green (B3) and red (B4) bands, as defined in Eq. (7). 43 NDTI values range from −1 to 1, with those nearing −1 indicating less turbidity and clear waters:

$$\mathrm{NDTI} = \frac{\mathrm{Red} - \mathrm{Green}}{\mathrm{Red} + \mathrm{Green}} \tag{7}$$

We calculated the NDTI using the reflectance of pixels from the depth estimation test across the three areas (Fig. 9). Cheonsuman displayed a mid-level NDTI value without correlating to the depth estimation results [Fig. 9 [...] 0.56 m, respectively, whereas in the shallow waters (0 to 5 m), the RMSE improved slightly, decreasing by 0.20 and 0.08 m, respectively. These results indicate that satellite image-based depth estimation models developed without considering turbidity can produce significant errors in turbid waters. They therefore suggest that the accuracy for turbid waters can be improved only by taking turbidity into account (Fig. 10).
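
The turbidity index used in this analysis can be sketched per pixel as follows. The function names and reflectance values are hypothetical, and appending the NDTI to the five band values as a sixth model input is only one plausible way of supplying the additional NDTI data.

```python
def ndti(red, green):
    """Normalized difference turbidity index from red and green reflectance."""
    denom = red + green
    if denom == 0:
        return 0.0  # assumption: undefined ratio is treated as neutral
    return (red - green) / denom


def with_ndti(band_values, red, green):
    """Append the NDTI to a pixel's band values as an extra feature."""
    return list(band_values) + [ndti(red, green)]


# Hypothetical reflectances: clear water is greener, turbid water is redder.
clear = ndti(red=0.02, green=0.06)
turbid = ndti(red=0.09, green=0.07)

# Hypothetical five-band pixel (B2, B3, B4, B5, B8) extended with its NDTI.
features = with_ndti([0.05, 0.06, 0.02, 0.01, 0.005], red=0.02, green=0.06)
```

Values closer to −1 correspond to clearer water, so the first pixel scores lower than the second.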

Impact of Seabed Sediment on the SDB Model
Coastal areas with tides typically experience resuspension of fine sedimentary particles due to recurring tidal currents, which affects underwater turbidity. Despite the partial removal of the turbidity effect in Hallim at 0 to 10 m depth, which caused an approximate 0.2 m reduction in the RMSE, a relatively high error was still observed in the range of 5.7102 to 7.5959 m. To discern the cause of this discrepancy, depth profiles were plotted for the overestimated depth range and areas with relatively low deviation, and a depth trend analysis was conducted (Fig. 11).
Figures 11(a) and 11(b) show that the depths estimated by the machine learning model were ∼10 m deeper than the actual depths, although the patterns of depth changes aligned well. As verified in Sec. 2.2.3, the tidal model error was within 5 cm, and considering that the tidal range in this region is < 4.5 m, such a 10 m discrepancy cannot be attributed solely to tidal influences.
To determine the cause of this overestimation, we created a scatter plot of reflectivity by depth [Fig. 12(a)]. With reference to the Beer-Lambert law, we distinguished areas where reflectivity showed an exponential decrease with depth (blue) from those where reflectivity increased with depth (red). This characteristic was observed in all bands, although there were differences in slope. Although reflectivity decreases exponentially in most sea areas with clear waters, 1 the results for Hallim showed a different characteristic, implying the involvement of other factors.
The locations were confirmed by marking them according to their characteristics [Fig. 12(b)], and visual readings were obtained using WorldView-3 satellite R (MS2, 630 nm)-G (MS3, 545 nm)-B (MS4, 480 nm) images (with a spatial resolution of 1.2 m) (Fig. 13). As observed in Figs. 13(a)①, 13(a)②, and 13(b)③, areas where the model overestimated the depth displayed a noticeably darker seabed than their surroundings. In conducting on-site investigations of these regions, we identified a basaltic seabed interspersed with patches of white sand (Fig. 14). In shallow waters, the seabed color could influence SR. Remote reflectance was affected by the color of the seabed materials. That is, areas with dark seabed materials had low reflectance, and their depths were overestimated. For Hallim Harbor, the overestimation in shallow areas can likely be attributed to the basaltic nature of the seabed. Indeed, we found that reflectance characteristics varied based on seabed materials. Thus, if we incorporate additional seabed spatial data into the training dataset in the future, we anticipate enhancements in model performance. The sediment distribution map, created from airborne hyperspectral imaging, is scheduled to be provided by KHOA.

Application of the SDB Model in Test Area
To assess whether the results of this study could represent different marine environments, we applied the three SDB models to regions with marine characteristics similar to those of the training areas. For this purpose, we selected Deokjeok, Seongsan, and Sokcho, all within 100 km and where the latest nautical chart data exist (Table 8). To determine the appropriateness of applying the SDB model in regions with similar coastal characteristics, we compared predictions for Seongsan using two different SDB models (Fig. 15).
The results from the SDB model of Hallim showed r = 0.69 and RMSE = 4.7903 m, which were more accurate than the results from the SDB model of Samcheok, r = 0.40 and RMSE = 5.4924 m (Fig. 15). Despite the higher validation accuracy of the Samcheok SDB model, the more accurate evaluations using the Hallim SDB model indicate the effectiveness of applying SDB models trained on areas with similar characteristics (Table 6). When predictions were made for the three areas using the SDB models with similar oceanic environmental characteristics, the results were also similar to those in Sec. 3 (Fig. 16). As measured by RMSE, the prediction accuracies were 5.8292, 4.7903, and 3.0220 m for Deokjeok, Seongsan, and Sokcho, respectively. Sokcho demonstrated a higher correlation coefficient (r) of 0.8848 than the other locations (Table 9).

Fig. 1 Geographic location of training areas. Sentinel-2A/B RGB images of (a) Cheonsuman, (b) Hallim, and (c) Samcheok. The blue boxes indicate three additional test areas, different from the training areas: Deokjeok in the Yellow Sea, Seongsan in the South Sea, and Sokcho in the East Sea.

Fig. 5 (a) Sentinel-2 red-green-blue (RGB) image of Cheonsuman in the Yellow Sea, (b) RGB image before sun-glint correction, and (c) RGB image after land masking and sun-glint correction.

2.5 Depth Estimation Using a Machine Learning Model
2.5.1 Depth estimation model based on random forest

Fig. 6 Flowchart of the SDB model. (a) Training and (b) predicting processes. The red box is the SDB model created during the training process and used for predicting.

3.3 Qualitative Evaluation
Independent satellite images were used for the evaluation dataset, similar to the quantitative evaluation. To investigate the cause of the accuracy differences among the training areas, we conducted a density scatter plot analysis, which allows for the visualization of data distribution and patterns. A density scatter plot was generated to visually compare the real and estimated depths across the three training areas, with the x and y axes representing the real and estimated depths, respectively (Fig. 7). In the density scatter plot, a higher proportion of red indicates a higher density of points, which can be interpreted as being influenced by certain factors. A plot closer to the 1:1 line indicated that the model better represented the real values. For Cheonsuman, the estimated depth values predominantly clustered within the 5- to 12-m range irrespective of the variations in actual depth [Fig. 7(a)]. Meanwhile, higher accuracy was achieved at depths of 10 to 20 m in Hallim. Nevertheless, regions with high-density scatter points appeared in the 0- to 7.5-m depth range [Fig. 7(b)].
[...] certain areas indicating overestimation. Cheonsuman had the highest RMSE value at 6.4603 m. The northern region of this training area saw alternating patterns of over- and underestimations, while the southern region predominantly showed an overestimation trend [Fig. 8(c)]. The causes behind the errors observed in both Hallim and Cheonsuman are discussed in Sec. 4.

Fig. 10 Results of the SDB model with additional NDTI data.Note: The number under the r and RMSE values indicates the change from the existing SDB model.Blue and red indicate positively and negatively altered main results, respectively.

Fig. 12 (a) Scatter plot of the Hallim area results with the actual depth as the x-axis and band-2 as the y-axis. Black dots express all points in the area, among which blue dots indicate that the difference between the actual and predicted depths is ≤ 3 m; red dots indicate that the difference between the actual and predicted depths is > 3 m. (b) Depths within 10 m classified based on (a).

Fig. 11 (a)-(c) Water depth on the ①-③ transect lines in (d), respectively.Black and red in (a)-(c) indicate real and SDB model depths, respectively.The location of 0 m in (a)-(c) implies a point close to the coastline in (d).

Fig. 15 Evaluation results for Seongsan using different SDB models. (a) Real depth map and (b), (d) SDB model result maps. (c), (e) Density scatter plot analysis between the real and estimated depths. (b), (c) Results of using the SDB model of Hallim. (d), (e) Results of using the SDB model of Samcheok.

Table 1
Geographic coordinates of training areas.

Table 2
Characteristics of the Sentinel-2 band data used in this study.

Table 3
Sentinel-2 imagery used as training data for each area.

Table 5
Sentinel-2 data used as evaluation data for model verification.

Table 6
Validation results of three SDB models trained with Sentinel-2.

Table 7
Sentinel-2 data used as evaluation data.

Table 8
Information on regions with characteristics similar to the study area.

Table 9
Evaluation results of a test area using the SDB model.