Stand density extraction and analysis of plantations based on QuickBird and Worldview-2 images

Abstract. Stand density is one of the important forest structure parameters. QuickBird and Worldview-2 high spatial resolution remote sensing images were compared. The Jiangle state-owned forest farm in Fujian Province was the experimental area. A spectral local maximum filtering method was used to extract the number of individual trees in a mountain forest. Nonlinear quadratic polynomial regression models were established for the number of local maximum points and the actual stand density. The near-infrared II waveband of Worldview-2 imagery was used to extract the pure Chinese fir stands with an accuracy of 72.5%, R2  =  0.502, and RMSE  =  35.77 and the Masson pine stands with an accuracy of 78.35%, R2  =  0.754, RMSE  =  41.46, while the accuracy of all unclassified stands was only 0.2907. The results showed that the classification of tree species can improve the accuracy of modeling. The quadratic polynomial model using Worldview-2 image data achieved better results. Stand density in the Jiangle state-owned forest farm was extracted using the NIR II band after fusion of Worldview-2. A stand density planning map was constructed using the best models applied to the different forest types. The method of combining high-spatial resolution imagery and sampling plots to estimate stand density was proven to be feasible.


Introduction
The accurate acquisition of forest structural parameters is an important task for forest resource planning and investigation. 1,2 As one of the important forest structure parameters, stand density is closely related to the tree crown, tree height, diameter at breast height (DBH), wood properties, biomass, and carbon stock of a plantation. 3,4 At the same time, stand density is one of the most important variables in forest management as it directly affects the distribution of ecological factors, such as light, heat, and water, in the plantation community and changes the species diversity and structure within the stand. 5,6 The composition and spatial distribution of the fine root biomass are affected by the plant community composition and stand structure as well as the soil physical and chemical properties and nutrient content. 7 Therefore, the accurate estimation of stand density is helpful for the scientific management and planning of forests. Remote sensing is an important method of estimating forest density at the area and regional level (forest farm level). Traditional sample plot survey methods rarely reflect the state and dynamic changes of the diversity in large-scale forest structure in a timely and accurate manner. 8 The acquisition of the forest structure parameters mentioned above has some limitations, such as high labor intensity, high cost of human and financial resources, and long investigation period. 9 The remote sensingbased method of forest density acquisition mainly requires assistance from sample plot survey data, such as National Forestry Resource Inventory (NFRI) data, which can derive large area forestry resource data. 10 Therefore, exploring remote sensing-based methods can provide *Address all correspondence to Shuhan Wang, E-mail: shuhan2018@qq.com a reference for the future application and promotion of remote sensing. The method of extracting stand density through remote sensing was studied to provide estimates at the forest farm level. 11 Although moderate-resolution satellite imagery (e.g., Landsat) is reasonably sensitive to variation between managed forest stands, it is not sensitive to canopy height variation within the stands in comparison with aerial photography. Moderate-resolution image data (TM) are extremely inferior to LiDAR data in their ability to capture variability in forest structure. 12 Satellite image data are less sensitive to attributes related to canopy height (e.g., basal area) than attributes related to canopy cover (e.g., tree density). This difference also illustrates the value of the integration of multiple types of remote sensing data with field data to efficiently, objectively, and accurately characterize forests to support forest science and management. 6 Very high-spatial resolution (VHR) remote sensing imagery is usually considered a promising and alternative method for acquiring stand density because its spatial resolution is sufficient for identifying individual treetops. [13][14][15] Studies have shown that stand density can be represented by the stem density or the number of trees. 16,17 There are two kinds of methods to extract stand density using VHR. The first method locates every tree in the plot, i.e., the tree level. The second uses a statistical fitting method to predict the stand density with an unknown exact position of each tree, i.e., the plot-level. In general, the second method with the use of VHR remote sensing imagery has low precision for the number of trees that is estimated using spectral and texture information from the imagery. 18 Gougeon et al. used spectral imagery with a pixel size of 31 cm × 31 cm to identify individual trees in plantations of red pine (Pinus resinosa), red spruce (Picea abies), white spruce (Picea glauca), and Norway spruce (P. abies). This method had a stand density accuracy that ranged from 11% underestimation to 5% overestimation. 19,20 McCombs et al. 21 noted that previous studies using remote sensing data compared the field-measured stand density to derive stand density without ascertaining whether the identified trees actually existed. LiDAR demonstrates a very promising capability for predicting forest stand attributes. 22 Some studies have shown that the accuracy of stand density estimation that uses LiDAR data with topographic information is not high and needs to be improved. 23,24 LiDAR data have some shortcomings in the identification of dead trees because a standing dead tree would not be indicated as a tree peak in the spectral data. LiDAR data do not have universal coverage. Application to large areas is restricted due to the high cost. 25 In our study, we used the spatial information from the imagery to explore the ability of VHR imagery to extract individual trees. Many studies have focused on individual tree crown extraction algorithms with the use of different data sources, especially VHR imagery such as that from QuickBird, to extract individual tree crowns. Automatic detection of tree crowns from aerial images is generally accomplished in two steps: (1) tree crown detection and (2) tree crown delineation. 26,27 Validation shows that direct depiction of tree crowns in aerial photographs with VHR will lead to significant errors among tree crowns of different sizes because such overlap is common in a real forest. 13 Therefore, the detection of individual trees remains a challenge to obtaining the stand density for a large area because of the complexity of the algorithm and forest stand structure. 28 In this study, two mainstream high spatial resolution data sources were used to extract stand density. A method of extracting individual trees in an image using spectral local maximum (LM) filtering was proposed; this method has certain feasibility and limitations. 29 A regression model of the number of LM values and the actual stand density was established to estimate the stand density of the study area. The results can provide a reference for the healthy development of plantations and provides a scientific basis for improving the level of management in plantations.

Research Area and Data Acquisition
The study area was located in the central part of Jiangle County, Fujian Province (26°25′-27°04′ N, 117°05′-117°40′ E, as shown in Fig. 1). It belongs to the middle subtropical climate and has both marine and continental climate characteristics. The area's annual average temperature is 18.7°C in winter and 18.6°C in summer, with a maximum annual temperature of 28.6°C.
With an average annual frost-free period of 287 days, the study area is dominated by the mountainous areas where the overall slope is ∼30 deg and the highest elevation does not exceed 1000 m. The main soil type in the area comprises red soil. The Jiangle state-owned forest farm, which was the state-owned administration organization for timber production, has different working areas, including Shuinan and Mingtoushan, covering an area of ∼23 km 2 . 30 The Jiangle state-owned forest farm issued permission for each location, and the field studies did not involve endangered or protected species. Since long-term research and practices have taken place in the study area, good ecological and economic benefits have been produced by a variety of broad-leaved trees mixed with Pinus massoniana, such as Schima superba, Ziziphus spinose, Quercus acutissima, etc. 31 P. massoniana and China-fir (Cunninghamia lanceolata) are the most commonly grown afforestation species in Jiangle County, Fujian Province. 32 The Worldview-2 satellite was launched on October 6, 2009, into a solar synchronous orbit with an orbit height of 770 km and swath width of 16.4 km at nadir. WorldView-2 imagery covering the study sites were obtained on August 12, 2013, from DigitalGlobe. WorldView-2 has eight spectral ranges, including 400 to 450 nm (B1-coastal), 450 to 510 nm (B2-blue), 510 to 581 nm (B3-green), 585 to 625 nm (B4-yellow), 630 to 690 nm (B5-red), 705 to 745 nm (B6-red edge), 770 to 895 nm (B7-near-infrared-1), and 860 to 1040 nm (B8-near-infrared-2). 33 The images were orthorectified and geometrically corrected by DigitalGlobe. Radiance images were atmospherically corrected and transformed to canopy reflectance using the fast line-ofsight atmospheric analysis of a spectral hypercubes algorithm built in ENVI 4.7 software (Environment for Visualising Images: ENVI, 2009). 34 QuickBird-2 was the third satellite to be successfully launched by DigitalGlobe, and it reached orbit on October 18, 2001, via a Delta-2 rocket. This was the third in a series of VHR commercial satellites launched by DigitalGlobe. The VHR of the QuickBird-2 satellite in the panchromatic band and multispectral band is at the submeter level with a panchromatic image resolution of 0.6 m and a multispectral image resolution of 2.4 m. With pushbroom imaging, the maximum zenith angle of the satellite is 25 deg, the orbital height is 450 km, and stereo imaging is carried out along the orbit transverse trajectory. The irradiation width is 272 km around the orbit of the subsatellite point, and the band of the subsatellite mode is 16.5 km × 165 km; the single scene is 16.5 km × 16.5 km, the image width is 16.5 km, and the quantization value is 11 bits. 33,35 QuickBird-2 has four additional multispectral bands: blue band (0.45 to 0.52 nm), green band (0.52 to 0.60 nm), red band (0.63 to 0.69 nm), and nearinfrared (NIR) band (0.76 to 0.90 nm).
Typical sample plots were set up in the study area through the random sampling method. A total of 54 20 m × 20 m sample plots were collected in 2013 and 2014. Figure 2 shows the plot distribution. Information about the tree species, DBH, tree height, and crown width of trees with DBH over 5 cm were recorded by the field investigation. The precise coordinates of the four corners of the sample plot were recorded by total station and hand-held GPS. All plots (unclassified plots) included 11 mixed forests, 10 Masson pine (P. massoniana) forests, 27 Chinese fir (C. lanceolata) forests, and 6 broad-leaved forests. The P. massoniana and C. lanceolata forests accounted for the largest proportion of forests in the forest farm. For all 54 plots, the minimum stand density (trees per plot) was 13 (195 trees per ha), the maximum stand density was 265 (3975 trees per ha), the average stand density was 83.8 (1257 trees per ha), the sum of all trees in all plots was 4439, and the standard deviation was 49.8. The crown information from all coniferous forests of P. massoniana and C. lanceolata were analyzed. The average crown size (north and south) was 3.16 m, the crown width (east and west) was 3.06 m, the average minimum crown width was 1.78 m, and the average maximum crown width was 8.74 m.
In addition to collecting sample plot data in the forest areas, this paper obtained auxiliary data on the forest resources in forest areas by communicating with local forest farm management departments, which mainly included two types of survey data: forest resource inventory data and topographic maps. Since C. lanceolata and P. massoniana were the main distributed tree species, the main forest types can be divided into pure Chinese fir forest, pure Masson pine forest, Masson pine-Chinese fir mixed forest, and other broad-leaved or bamboo forest with Chinese fir or Masson pine mixed forest. The range of pure Chinese fir forests and pure Masson pine forests was determined according to the forest classification field in the forest resource inventory data. The raster topographic map was converted into binary images, and then the binary topographic map was vectorized by ARCSCAN 10.1 to extract the contours. Then, the contours were transformed into an irregular triangular network, and a digital elevation model (DEM) was extracted.

Image Preprocessing
First, orthorectification based on the control points for the panchromatic image and the multispectral image of Worldview-2 was carried out using a 1:10,000 topographic map. The root mean square error (RMSE) of the Worldview-2 panchromatic image and multispectral image orthophoto correction was 0.99 and 0.7854, respectively. The specific method of stand density extraction can be referred to as the extraction process in the Jiufeng Forest Farm. 29 The density of the coniferous forest in the Jiangle Forest Farm was extracted using the Worldview-2 and QuickBird-2 images. Matching different types of images was needed to ensure the geometric accuracy of them. Because the topographic map has a long history and the feature points are not prominent, it was difficult to select the orthorectification points. Therefore, it was necessary to select some points that do not change with time as much as possible.
The QuickBird image from 2013 was calibrated based on the orthorectified Worldview-2 image. The registration process was mainly carried out in the geometric correction module of the AutoSync Workstation using Erdas Imagine 2010 (Hexagon AB, Norcross, Georgia). The AutoSync Workstation can automatically generate thousands of connection points for the correction of images, provided that at least three points are selected manually. In this paper, 50 control points were generated for the geometric correction. It was found that the corrected QuickBird image was still not well matched with the Worldview-2 image, and there were still some deviations in some areas. It was necessary to adjust the image. It was preferable to use the spline method to re-edit the image in ArcGIS to complete the image correction and to match the two VHR images. The matching experiment process shows that the coverage area of the DEM file used for orthorectification must cover the entire study area to ensure the smooth progress of the calibration process. In addition, the high-precision DEM was the premise for the accurate matching of the two VHR images. Finally, the RMSE of the QuickBird image orthophoto correction was 0.503.

Individual Tree Identification Method and Statistical Analysis
If the grayscale value of a pixel in the image slice is defined as elevation and the image is taken as three dimensions, a tree crown will have a peak point, i.e., LM. LM reflectance was used to represent spectral reflectance maximum points and to identify the center of individual tree crowns. In other studies, LM filtering has been widely used as an alternative to peak extraction methods, and this technique has been adopted to detect individual trees. 36 The LM filtering method can be used for different data sources, such as QuickBird panchromatic bands, Worldview-2 panchromatic bands, or LiDAR data. In this paper, we used the LM filtering method to detect each individual tree crown.
The spectral reflectance peak maximum points extracted by the LM algorithm were not all derived from vegetation points. Points incorrectly located on a road were considered pseudo crown points. Normalized difference vegetation index (NDVI) filtration removed these pseudo crown points. A threshold NDVI value was set for removing the pseudo points. The tree crowns in each plot were then identified by spatial association statistics in ArcGIS software. We also tested the effect of different window sizes: 3 × 3, 5 × 5, 7 × 7, 9 × 9, and 11 × 11. Pixel was used as the unit of window size here. An optimal window size was required for identification. Five different ranges of NDVI were chosen, from 0.1 to 0.5. Points under the threshold value were removed to refine the reflectance maximum points layer by referring to the NDVI layer. The spectral reflectance maximum point layers were overlaid on the sample plots, and the number of reflectance maximum points (N 0 ) was calculated inside each plot. To identify the optimal combination of NDVI threshold and window size, we used Pearson correlation analysis to evaluate each effect of different NDVI threshold ranges and window sizes. A correlation analysis was then performed between the N 0 within the plot and the number of trees within the corresponding plot (N). We took 54 sample plots to evaluate the stand density of all stands.
To ensure the accuracy of the modeling, P. massoniana and C. lanceolata mixed forests, bamboo forests, and broad-leaved forests were excluded, leaving 12 pure forests of P. massoniana and 27 sample plots of pure Chinese fir forest. Pearson's correlation coefficient was used to analyze the linear correlation between variables X and Y, which are often two continuous variables. Pearson's correlation coefficient, when applied to a sample, is commonly represented by the letter r. Two-tailed t-tests were used to determine whether the correlation was statistically significance, and p ¼ 0.05 was used as the threshold. The highest r value indicates the best combination of NDVI threshold, window sizes, and the band. After selecting the best NDVI threshold value, optimal window size and band, considering that there is only one explanatory variable, a simple regression with one element was used to build the model. N 0 was treated as the independent variable, and the true stand density N was treated as an dependent variable. A nonlinear regression model and a linear regression model were both tested. The Shapiro-Wilk test was used to verify the distribution of the residuals, and a regression model was established according to the assumption of a normal distribution of the residuals (predicted value minus observed value). If the p-value was significantly >0.05, then the residuals were distributed normally. The produced models were evaluated for precision using the coefficient of determination (R 2 ) and the RMSE. RMSE measures the differences between the values predicted by a model and the values actually observed. The unit of RMSE is the same as the unit of stand density. We used the leave-one-out cross-validation approach to calculate the cross-validated coefficient of determination (r2cv) and root mean square error (RMSEcv) to validate the models in predicting the forest stand density. The prediction value of the i'th observation was calculated using the regression equation obtained by fitting the model leaving the i'th observation out.
According to tree species composition in each plot, all plots were divided into three types, pure Chinese fir forest, pure P. massoniana forest, and unclassified forest. According to the technical regulations of the National Forest Inventory in China, the definition of a pure forest is one in which a single woodland species has more than 65% of total stock volume. The definition of a mixed forest with coniferous and broadleaf trees is one in which no tree species (Group) have more than 65% of the total volume. In our study, we increased the percentage to maintain the consistency of tree species. If the percentage of Chinese fir exceeded 70%, the plot was grouped as a pure Chinese fir plot. And if the percentage of pure P. massoniana trees exceeded 70%, this plot was grouped as a pure P. massoniana plot. All of the plots were considered to be unclassified forest. Inventory data in the study area were used as the reference data. The study area was divided into two compartments according to the inventory map as reference data, the C. lanceolata stand area and the P. massoniana stand area. After the estimation models were established, we used Create Fishnet tools in ArcGIS 10.1 to create the 20 m × 20 m grid. The rectangles were evaluated using the Spatial Join tool of Analysis tool in ArcGIS 10.1 to count the number of LM in the whole study area. Then, we changed the fishnet vector layer with points counted into a raster layer. We used these models to estimate the stand density in the whole area.

Forest Density Extraction from Different Images
Pearson correlation analysis was used to evaluate the correlation between the stand density and LM points and to evaluate the advantages and disadvantages of the different LM methods.   Figure 4 shows the statistical results with only the C. lanceolata plots, and Fig. 5 shows the statistical results with the only P. massoniana plots. The results showed that the highest correlation coefficient of stand density was 0.50, which corresponded to the settings with a 5 × 5 window and an NDVI ≥ 0.6. After the classification of the forest types, the correlation coefficients improved significantly. The maximum value of the correlation coefficients of the stand density of the Chinese fir stands was 0.54, corresponding to a 7 × 7 window and an NDVI ≥ 0.6. Fig. 4 Correlation coefficients for C. lanceolata between the stand density derived from the different window sizes and NDVI thresholds and the actual stand density (NIR band). From visual analysis, the LM filtering actually achieved the effect of blurring the images. With the increase in the window size, the tree crown in the image became blurred, and the boundaries of the individual tree crowns became clear. For the Worldview-2 panchromatic image, the spatial resolution for the 7 × 7 filter window size defaults to 7 × 0.5 m. A 7 × 7 filter window size is disadvantageous for detecting trees with tree crown sizes smaller than 3.5 m because those crowns were almost indistinguishable. It was easy to blur the difference in the shadow area with the 3 × 3 window, which made the shadow position not obvious. From the visual point of view, the filtering effect of the window size of 5 × 5 was the best; it not only highlighted the crown contour of single trees but also retained the gap between the original tree and the filtered tree, and the shadows were clearly visible. The original cluttered individual tree crowns were displayed as granular by LM filtering and then were patchy after LM filtering, which was conducive to further extraction of the tree crowns. The specific steps of the crown position extraction can be referred to as the QuickBird stand density extraction method, which used the same operational process. 29 Because the stand density extracted by the spectral LM filter method was not currently the true stand density, it needed to be fitted with the actual stand density of the sample plots on the ground.
The panchromatic band was taken as an example, and the fitting results of the actual stand density and the number of LM points with five window sizes, 3 × 3, 5 × 5, 7 × 7, 9 × 9, and 11 × 11, were analyzed. Considering the limited space, some tables were not listed. The graph showed that there was a certain linear relationship between two of them, and with an increasing NDVI threshold, the correlation coefficient tended to change. Figure 6 shows the results of the correlation analysis between the number of spectral LM points extracted from the panchromatic band of the Worldview-2 image and the actual stand density, which corresponded to the whole unclassified sample plot, the Chinese fir sample plot, and the Masson pine sample plot. The results showed that the fitting results were low for all unclassified stand densities extracted in the panchromatic bands of Worldview-2. The correlation coefficient of the best extraction result was only 0.48 and corresponded to a 5 × 5 window and an NDVI ≥ 0.5. The correlation coefficient of the best extraction result was 0.64 for the stand density of Chinese fir forest, which corresponded to a 7 × 7 window and an NDVI ≥ 0.6. The stand density of P. massoniana was extracted, and the correlation coefficient of the optimum extraction result was 0.62, which corresponded to a 5 × 5 window and an NDVI ≥ 0.5.
The orthorectified panchromatic images were processed by spectral LM filtering with window sizes of 3 × 3, 5 × 5, 7 × 7, 9 × 9, and 11 × 11. The multispectral image with a spatial resolution of 1.8 m was fused with a 0.5-m panchromatic image through the HPF method, and the ability to improve stand density detection was analyzed in each band of the multispectral image after improving the resolution. The detailed analysis results and processes are shown in Figs. 7 and 8.
The window sizes 3 × 3, 5 × 5, 7 × 7, 9 × 9, and 11 × 11 were used, and the NDVI was calculated using the red and NIR bands. Then, the LM spectral points were extracted. Principal component analysis (PCA) can integrate multiband information and retain image information to a great extent. Therefore, it is necessary to explore and analyze the effect of extracting Fig. 6 Correlation coefficients of the unclassified, C. lanceolata and P. massoniana between the stand density based on the different window sizes and NDVI thresholds and the actual stand density (panchromatic band). the stand density from the first principal component and to compare the results with those of extracting stand density from multispectral images to select the optimal band for extracting extract stand density. The first principal component was extracted with the spectral LM filtering method. The results are shown in Table 1. The results showed that the first principal component extracted the stand densities of all unclassified stands, and the coefficient of determination of the  best extraction result was 0.2719, which corresponded to a 7 × 7 window and an NDVI ≥ 0.5. In the fourth band, the stand densities of all unclassified stands were extracted. The coefficient of determination of the best extraction result was 0.2183, which corresponded to a 7 × 7 window and an NDVI ≥ 0.5. In the fifth band, the stand densities of all unclassified stands were extracted. The coefficient of determination of the best extraction result was 0.2095, which corresponded to a 7 × 7 window and an NDVI (≥0.5). In the sixth band, the stand densities of all unclassified stands were extracted, and the coefficient of determination of the best extraction result was 0.2598, which corresponded to a 5 × 5 window and an NDVI ≥ 0.5. In the seventh band, the stand densities of all unclassified stands were extracted, and the coefficient of determination of the best extraction result was 0.2731, which corresponded to a 5 × 5 window and an NDVI ≥ 0.5. In the eighth band, the stand densities of all unclassified stands were extracted. The coefficient of determination of the best extraction result was 0.2907, which corresponded to a 7 × 7 window and an NDVI ≥ 0.5. The results show that regardless of which band or NDVI threshold was used to extract the stand density, there were one or two fixed filtering windows that achieved the best fit. For the first principal component, regardless of the NDVI thresholds used, the 7 × 7 window could always maximize the coefficient of determination. Table 1 shows the best combination for stand density extraction for each band with the highest coefficient of determination selected from Figs. 4 to 9. Both the QuickBird NIR band and Worldview-2 bands could achieve good results. The highest determinant coefficient was 0.2907 in the eighth band of Worldview-2. Table 1 compares the results of the stand density extraction using a 7 × 7 window and NDVI ¼ 0.5 threshold combination for each band.

Stand Density Extraction Result from Worldview-2 Image
For the example of the Worldview-2 panchromatic band of Fujian from 2013, refer to Fig. 9. With the increasing NDVI threshold, the correlation of the LM stand density showed an increasing trend. When the NDVI value reached 0.5, the determination coefficient reached a maximum; when the maximum filter of the 5 × 5 window was used, the determination coefficient reached a maximum of R 2 ¼ 0.2351, and the best fitting result was obtained. The maximum filtering size of 7 × 7 was generally more stable, while the window size of 3 × 3 was the least stable. The maximum determinant coefficient for the window size of 3 × 3 appeared to be the highest R 2 value of 0.1427 at the threshold level of 0.5.
Comparing the results from the different data sources, the best result was R 2 ¼ 0.2907, which has the highest correlation in all bands, using the eighth band of Worldview-2, the window size of 7 × 7, and the NDVI of 0.5 to estimate the stand density. Therefore, we decided to use the Table 1 Results of the comparison of the stand density extractions with variable filtering windows and NDVI thresholds.

Data
Year Band All unclassified forestry eighth band of the Worldview-2 to model and retrieve the stand density. All samples were classified preliminarily according to the tree species composition. That is, all plots were divided into pure Chinese fir forest and pure Masson pine forest. Figures 10 and 11 show the scatter plot of the linear fitting between the estimated stand density and the actual stand density of Chinese fir or Masson pine forests, respectively, based on the eighth band division. Through the comparison of Tables 2 and 3, it was found that the quadratic polynomial regression model with the independent variable and dependent variable was preferable with nonlinear fitting than with the linear regression model. The quadratic polynomial regression model had a higher determination coefficient value and lower RMSE. The regression models for the sample plots of Chinese fir and Masson pine forests were significant at the level of p ¼ 0.01. Through the observation of Fig. 9, the nonlinear regression model was compared with the linear regression model, and it was found that the nonlinear regression model could better reflect the change trend in the number of LM spectral points and the actual stand density, whether in the Chinese fir or Masson pine sample plots. Therefore, this paper used the established nonlinear regression model to estimate the stand density. Figures 11 and 12 show the results of the model validation, mainly using the predicted and true values for fitting. Table 2 shows the results of the linear regression model. The precision of the estimation model of the Chinese fir stand density based on the eighth band of Worldview-2 was 72.5%. Comparing the actual value with the predicted Fig. 10 Model validation for the C. lanceolate forest. value, we found that the precision of the estimation model of the P. massoniana stand density was 78.35%.
A total of 39 coniferous forest plots contributed to the statistics of the unclassified stands. Among these 39 plots, there were 10 pure Masson pine forests and 27 pure Chinese fir forest plots. Two mixed forests of Chinese fir and Phyllostachys pubescens and P. pubescens forests and broad-leaved forests were removed from the dataset. The suitable window size for LM filtering and the NDVI threshold value for filtering were analyzed. A gridded single-band image of Worldview-2 was created. An initial stand density raster map was obtained using the eighth band of Worldview-2, and the combination of the 7 × 7 window size and an NDVI ≥ 0.6 was used to extract the spectral LM points.
With the help of National Forest Resources Inventory data, the data could be superimposed on the VHR remote sensing image after orthorectification pretreatment. The range of the forest classification types could be determined by the tree composition field in the NFRI data. Thus, the final stand density estimation map could be created using the panchromatic image as the background. The final stand density estimation results are shown in Fig. 12.

Discussion
Further analysis of the spectral LM filtering method applied to VHR remote sensing images in the same research area could infer some helpful conclusions. We compared the results from Table 1 with the results in the literature that used similar technological processes. 29 Two QuickBird images were used to extract the stand density in deciduous forest and coniferous forest in Beijing, China. First of all, from both analysis results, it is found that the selection of the window sizes and the NDVI thresholds had certain consistency and regularity regardless of QuickBird or Worldview-2 images. The window size selected in the Jiangle area was larger than the window size in Jiufeng area, and the probability of selecting a 7 × 7 window was obviously higher than that in the Jiufeng research area. In a comparison of Table 1 and similar results in the literature, 29 the ratio of selected 5 × 5 window and 7 × 7 window sizes in Jiangle accounted for more than half of the total, while the ratio of selected 5 × 5 window and 7 × 7 window sizes in the Jiufeng area were less. The value of NDVI threshold selected in the Jiangle area (0.5 was chosen as the processing threshold) was higher than those selected in the Jiufeng area (0.3 was chosen as the processing threshold). By analyzing and comparing the results of the stand density extraction from the multispectral band and panchromatic band of Worldview-2 and QuickBird imagery, we can see that the NIR band using the LM method obtained the highest determination coefficient of 0.2907.
Research results in both the Jiufeng and Jiangle areas proved that the NIR band was indeed the best extraction spectral band compared with the panchromatic band. For the panchromatic band of Worldview-2, the 5 × 5 and 7 × 7 window sizes corresponded to an average 3 × 3 m canopy size. That window size value was coincidentally close to the average canopy size of the sample plot (see descriptive statistics in Sec. 2.1). This indicates that the window size used in the models was directly related to the actual ground canopy size. Actually, the window size should be close to the true size of the crown to achieve higher accuracy in the literature. 37 QuickBird images and Worldview-2 images were used to extract the stand densities of different study areas through the use of LM filtering. The comparison from both study result shows that using a QuickBird image in the Jiufeng research area in Beijing, China, is better than using a QuickBird and Worldview 2 image in the Jiangle study area in Fujian, China. Especially for the coniferous forest, the coefficient of determination between the number of LM spectral points in the sample plot and the actual stand density of coniferous forest reached R 2 ¼ 0.79 (P < 0.001), while the coefficient of determination between the number of spectral LM points obtained by Worldview-2 and the actual stand density of the sample plots was relatively lower, such as the stand density extracted for the unclassified forest by worldview-2 NIR band with the coefficient of determination only 0.2907. However, according to the composition ratios of the tree species, all sample plots could be divided into different forest types, and then the quadratic polynomial regression models of the different forest types could achieve higher fitting accuracy, as previously shown in Refs. 38 and 39. The stand density extraction method based on a spectral LM filter could extract the coniferous forest densities very well. This method achieved good results.
Interpreting the results was quite different for different sensor types used in the different research areas. There were three main reasons for this uncertainty. (1) Sample location accuracy. The accuracy of sample location had a very large influence on the result. Because VHR images had high spatial resolutions, the small displacements of the sample plots may have cause considerable drift, and those mistakes may have affected the extraction accuracy. The accuracy of orthorectification is another factor which will further affect the sample location. A 1: 2000 scale topographic map was adopted in the Jiufeng Forest Farm, while a 1:10,000 scale topographic map was used in the Jiangle Forest Farm. (2) From statistic results of both research areas, the actual stand density of the Jiangle Forest Farm in Fujian Province was higher than that of the Jiufeng Forest Farm in Beijing, as shown in the literature. 29 With actual stand density increasing, spectral LM points to estimate stand density became less effective. Relevant literatures also pointed out that a high stand density is vulnerable to saturation of number of spectral LM points. Viewing from the fitting process, when the stand density of sample plots reached a certain level, the number of LM spectral points did not increase with increasing stand density. As shown in Fig. 9, when the actual stand density reached 300 trees per plot, increasing the number of LM spectral points obtained through statistical analysis was difficult, which indicated that the number of LM spectral points tends to saturate with the increase of stand density. (3) Satellite altitude angle, azimuth angle, solar altitude angle, azimuth angle, and other external factors during imaging could result in uncertainty in the analysis. Although high accuracy was obtained from the results, the confidence of the results needs to be further verified due to the small number of P. massoniana plots. Additional samples need to be obtained for verification in the future, and the accuracy of estimation needs to be improved. LiDAR data and aerial image data with high spatial resolution and the integration of multisource data are worthy of additional attempts in the future.

Conclusion
In this study, the stand densities in the Jiangle research area in Fujian Province were extracted using Worldview-2 and QuickBird VHR remote sensing images. The following conclusions can be drawn: the NIR band after fusion with higher spatial resolution had more potential to extract the stand densities than the panchromatic band. The sample sites involved in the modeling were divided into different types according to the tree species composition ratios. The plots were divided into Masson pine sample plots and Chinese fir sample plots for statistics, which can significantly improve the extraction accuracy of stand density. The regression models for the number of LM points and the actual stand density were established. The regression models were fitted by linear fitting and quadratic polynomial fitting. The quadratic polynomial model achieved better results. The extraction model of the Chinese fir forest based on Worldview-2 NIR band was y ¼ −4E − 05x 2 þ 0.1559x þ 15.977 with R 2 ¼ 0.502, RMSE ¼ 35.77, and extraction accuracy of 72.5%. The extraction model of Masson pine forest was y ¼ 0.1301x 2 − 3.436x þ 38.633 with R 2 ¼ 0.754, RMSE ¼ 29.05, and extraction accuracy of 78.35%. Finally, the models were applied to the areas of P. massoniana and the Chinese fir forest area, which were determined by the NFRI data, and a stand density distribution map of the study area was generated.