Multisensorial Close-Range Sensing Generates Benefits for Characterization of Managed Scots Pine (Pinus sylvestris L.) Stands

Terrestrial laser scanning (TLS) provides a detailed three-dimensional representation of the surrounding forest structure. However, due to the close-range hemispherical scanning geometry, the ability of TLS to comprehensively characterize all trees, and especially the upper parts of the forest canopy, is often limited. In this study, we investigated how much forest characterization capacity can be improved in managed Scots pine (Pinus sylvestris L.) stands if a TLS point cloud is complemented with a photogrammetric point cloud acquired from above the canopy using an unmanned aerial vehicle (UAV). In this multisensorial (TLS+UAV) close-range sensing approach, the UAV point cloud data were considered feasible especially for characterizing the vertical forest structure, and improvements were obtained in the estimation accuracy of tree height as well as plot-level basal-area weighted mean height (Hg) and mean stem volume (Vmean). Most notably, the root mean square error (RMSE) in Hg improved from 0.88 m to 0.58 m and the bias improved from -0.75 m to -0.45 m with the multisensorial close-range sensing approach. However, in managed Scots pine stands TLS alone also captured the upper parts of the forest canopy rather well. Both approaches were capable of deriving stem number, basal area, Vmean, Hg and basal area-weighted mean diameter with a relative RMSE less than 5.5% for all of the sample plots. Although the multisensorial close-range sensing approach mainly enhanced characterization of forest vertical structure in single-species, single-layer forest conditions, representation of more complex forest structures may benefit more from point clouds collected with sensors of different measurement geometries.


Introduction
Trees are among the most important plants for the terrestrial biosphere [1,2]. To improve our understanding of tree populations at varying scales, improvements in methodologies and technologies are needed for characterizing single trees and forests. This methodological knowledge gap is also listed among the most important ecological research topics by Sutherland et al. [3]. Remote and close-range sensing techniques provide the state of the art in mapping and characterizing trees and forests [4,5]. Laser scanning, an active remote sensing technique that records the three-dimensional (3D) environment using billions of 3D points, has been the main driving force behind the development of tree characterization during the last two decades [6][7][8]. An alternative to laser scanning is image matching [9][10][11][12], which can also be used to derive dense point clouds with high geometric accuracy. Dense point clouds acquired from aircraft or unmanned aerial vehicles (UAVs) can be used to detect single trees [13], reconstruct 3D crown structure [14], derive height and stem dimensions [15] and predict tree attributes such as species, biomass and stem volume [16][17][18][19].
However, due to incomplete tree detection and inaccurate tree characterization, some tree attributes (e.g. stem diameters) have been challenging to obtain reliably with remote and close-range sensing techniques [20,21]. In varying forest conditions, only part of the trees have been detected from point clouds collected above the canopy, as suppressed trees have most often been occluded [20,22,23]. The use of the whole point cloud information, instead of canopy height model-based techniques (see e.g. [24]), has improved detection of suppressed trees [6], but a robust solution is still missing. A greater part of the trees can be detected from terrestrial point clouds acquired below the canopy [4,7]. Terrestrial laser scanning (TLS) provides a detailed 3D representation of the surrounding forest structure, enabling automated characterization of trees and stands [4,25,26]. Compared to conventional forest inventory methods, such as the use of clinometers for measuring height and calipers or measurement tapes for measuring stem diameters [27], the use of TLS point clouds enables non-destructive approaches to estimate stem profile and volume [28][29][30] and to characterize the branching structure of trees [31,32], which further improves modelling of tree biomass [33,34]. However, due to the close-range hemispherical scanning geometry, the ability of TLS to comprehensively characterize the upper parts of the forest canopy is often limited [4,21,35]. Therefore, errors of several meters in TLS-based tree height estimates are common [21]. Furthermore, errors in tree height estimates lead to erroneous estimates of stem volume and mean tree height at plot level as well. In addition, occlusion due to dense undergrowth vegetation, branches and other trees still makes automatic detection of all trees challenging (e.g. [21]). Other approaches require more manual work and quality control of the trees identified from TLS point clouds [36,37].
To overcome the challenges with detection of suppressed trees, upper canopy characterization and occlusion, TLS and aerial point cloud data could be combined for improved forest characterization [21,38]. While better capturing the upper canopy structure for more reliable tree height measurement, the multisensorial approach could also enable improved tree detection due to the different measurement geometries of the sensors used. Theoretically, a more complete set of forest observations should lead to improved estimates of forest structure. In recent years, the use of UAVs has become a feasible option for small-scale forest monitoring [18,39,40,41]. By applying image processing techniques such as the Structure from Motion approach for generating point clouds from overlapping images [42], detailed 3D information on forest canopy structure can be acquired even with a consumer-grade UAV equipped with an RGB camera [43]. With its theoretically high temporal resolution in data acquisition, UAV photogrammetry is expected to be well suited to complementing TLS data in forest monitoring applications.
The objective of this study is to investigate the feasibility of combining photogrammetric UAV and TLS point clouds (i.e. the multisensorial close-range sensing approach) to improve the accuracy of detecting trees, measuring tree height and estimating forest structural attributes in managed boreal forest stands. We hypothesize that the different measurement geometries of TLS and UAV point cloud data will lead to more complete characterization of single trees and therefore to more complete characterization of the vertical and horizontal structure of forest stands. We assess the vertical and horizontal characterization performance by using the most common forest structural attributes. Number of trees per hectare (TPH), basal area (G) and basal-area weighted mean diameter (Dg) are used as measures of the horizontal structure of forests, whereas basal-area weighted mean height (Hg) is a measure of their vertical structure. Mean stem volume (Vmean) is a forest structural attribute that is affected by both vertical and horizontal forest structure.
The study materials consisted of tree-level field inventory, TLS data, and aerial imagery collected from 27 sample plots (900-1200 m²) located in pure Scots pine (Pinus sylvestris L.) stands at three study sites in Southern Finland: Palomäki and Pollari located in Vilppula (62°02'N 24°29'E) and Vesijako in Padasjoki (61°21'N 25°06'E) (Figure 1). Elevation varies between 120 m and 150 m above sea level, and the typical temperature sum in the study areas is around 1200 d.d. Site fertility for all of the sample plots is mesic heath. The tree-level field inventory data consisted of 2102 Scots pine trees that were measured using calipers and clinometers [27,44] with an expected precision of 0.3 cm for diameter at breast height (dbh) and 0.5 m for tree height [27]. The plot-level forest structural attributes were computed based on tree species and measured dbh and height.
Basal area for each tree was computed by assuming the cross-section of the stem to be circular. Stem volume for each tree was estimated using the nationwide species-specific volume equations by Laasasenaho [45], with dbh and tree height as explanatory variables. Then, the plot-level forest structural attributes, namely TPH, G, Dg, Hg and Vmean, were computed as a sum or a basal area-weighted mean of single-tree variables according to the following:

$$\mathrm{TPH} = \frac{n}{A} \qquad (1)$$
$$G = \frac{\sum_{i=1}^{n} g_i}{A} \qquad (2)$$
$$V_{mean} = \frac{\sum_{i=1}^{n} v_i}{n} \qquad (3)$$
$$D_g = \frac{\sum_{i=1}^{n} d_i g_i}{\sum_{i=1}^{n} g_i} \qquad (4)$$
$$H_g = \frac{\sum_{i=1}^{n} h_i g_i}{\sum_{i=1}^{n} g_i} \qquad (5)$$
where n is the number of trees in a sample plot, A is the area of the sample plot in hectares, g_i is the basal area of the i-th tree, v_i is the stem volume of the i-th tree, d_i is the dbh of the i-th tree, and h_i is the height of the i-th tree. Based on the field inventory, the sample plots represented managed, even-aged Scots pine stands where G ranged between 13.3 m²/ha and 43.3 m²/ha, indicating large variation in forest horizontal structure (Table 1). Correspondingly, Hg, which describes vertical structure, varied from 16.9 m to 24.6 m.
Preprints (www.preprints.org) | NOT PEER-REVIEWED | Posted: 27 March 2020 doi:10.20944/preprints202003.0399.v1
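The aggregation of tree-level measurements into the plot-level attributes TPH, G, Vmean, Dg and Hg can be illustrated with a minimal Python sketch. It assumes dbh in centimetres, height in metres and a stem volume that has already been estimated per tree (the Laasasenaho volume equations are not reproduced here). The function names and the example values are hypothetical and serve illustration only.

```python
import math


def basal_area_m2(dbh_cm):
    """Cross-sectional area (m^2) of a stem assumed circular, from dbh in cm."""
    return math.pi * (dbh_cm / 200.0) ** 2


def plot_attributes(trees, plot_area_ha):
    """Aggregate (dbh_cm, height_m, volume_m3) tuples into plot-level attributes."""
    n = len(trees)
    g = [basal_area_m2(d) for d, _, _ in trees]
    sum_g = sum(g)
    return {
        "TPH": n / plot_area_ha,                      # trees per hectare
        "G": sum_g / plot_area_ha,                    # basal area, m^2/ha
        "Vmean": sum(v for _, _, v in trees) / n,     # mean stem volume, m^3
        "Dg": sum(d * gi for (d, _, _), gi in zip(trees, g)) / sum_g,  # cm
        "Hg": sum(h * gi for (_, h, _), gi in zip(trees, g)) / sum_g,  # m
    }


# Illustrative two-tree plot of 0.1 ha (made-up values, not study data).
example = plot_attributes([(20.0, 18.0, 0.27), (30.0, 22.0, 0.70)], 0.1)
print(example)
```

Because Dg and Hg are weighted by basal area, large trees influence them more than small trees, which is why they are commonly preferred over arithmetic means in stand descriptions.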

Acquisition and preprocessing of terrestrial laser scanning and photogrammetric point clouds
The multisensorial point cloud data consisted of TLS point clouds and photogrammetric UAV point clouds. TLS data acquisition was carried out with a Trimble TX5 3D laser scanner (Trimble Navigation Limited, USA) at all three study sites between September and October 2018 (see [44]). Eight scans were acquired on each sample plot with a scan resolution corresponding to a point spacing of approximately 6.3 mm at a distance of 10 m. Artificial constant-sized spheres (diameter 198 mm) were placed around the sample plots and used as reference objects for registering the eight scans into a single, aligned coordinate system. The registration was carried out with FARO Scene software (version 2018) with a mean distance error of 2.9 mm (standard deviation 1.2 mm), a mean horizontal error of 1.3 mm (standard deviation 0.4 mm) and a mean vertical error of 2.3 mm (standard deviation 1.2 mm).
The UAV point clouds were acquired using a Gryphon Dynamics quadcopter equipped with two Sony A7R II digital cameras mounted at +15° and -15° zenith angles. Flight paths ran from southeast to northwest at the Palomäki and Pollari study sites and from northeast to southwest at the Vesijako study site. With a flying altitude of 140 m and a flying speed of 5 m/s, a total of 1916 images were captured, resulting in a 1.6 cm ground sampling distance and 93% forward and 75% side overlaps. Eight ground control points (GCPs) were precisely measured at each study site using a Topcon Hiper HR RTK GNSS receiver (Topcon, Tokyo, Japan). Photogrammetric processing was carried out using Agisoft Metashape Professional software [46], following a processing workflow similar to that presented in [47]. The point clouds were generated using the quality setting "high", i.e. using images with two-times magnified pixel size, and the depth filtering setting "mild". In the bundle adjustment, the root-mean-square errors (RMSE) were 0.29-1.75 cm for the X-, Y- and Z-coordinates. As a result, dense UAV point clouds were obtained with a reprojection error of 0.65-0.70 pixels, a point cloud resolution of 3.11-3.53 cm/pixel, and a point density of 804-1030 points/m² depending on the study site.
TLS and UAV point clouds were normalized and registered to obtain a multisensorial dataset (Fig. 2). Point cloud normalization was conducted separately for the TLS and UAV point clouds using LAStools software [48]. The TLS point clouds were normalized following a procedure similar to that presented in [49], whereas a publicly available 2 m x 2 m digital terrain model (DTM) with an expected vertical accuracy of 30 cm (National Land Survey of Finland) was used when normalizing the UAV point clouds. The normalized datasets were then registered and merged using a 3D rigid transformation, where the transformation matrix was computed from the coordinates of tie points manually extracted for each sample plot.
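The two operations described above, height normalization against a gridded DTM and application of a rigid transformation, can be sketched in plain Python as follows. This is an illustration only and not the LAStools procedure used in the study: the nearest-cell DTM lookup, the grid layout (row 0 at `origin_y`, column 0 at `origin_x`) and both function names are assumptions. Estimating the transformation matrix from tie points is not shown; the matrix is taken as given.

```python
def normalize_heights(points, dtm, origin_x, origin_y, cell=2.0):
    """Subtract the ground elevation of the enclosing DTM cell from each z.

    points: iterable of (x, y, z); dtm: list of rows of ground elevations.
    Points falling outside the DTM extent are dropped (assumed behaviour).
    """
    normalized = []
    for x, y, z in points:
        col = int((x - origin_x) // cell)
        row = int((y - origin_y) // cell)
        if 0 <= row < len(dtm) and 0 <= col < len(dtm[0]):
            normalized.append((x, y, z - dtm[row][col]))
    return normalized


def apply_rigid_transform(points, matrix):
    """Apply a 4x4 homogeneous rigid transformation (row-major) to 3D points."""
    transformed = []
    for x, y, z in points:
        transformed.append(tuple(
            matrix[r][0] * x + matrix[r][1] * y + matrix[r][2] * z + matrix[r][3]
            for r in range(3)
        ))
    return transformed


# Illustrative values: a 2 x 2 cell DTM and a 90-degree rotation about the
# z-axis combined with a 10 m shift along x.
dtm = [[100.0, 101.0], [102.0, 103.0]]
print(normalize_heights([(1.0, 1.0, 120.0), (3.0, 1.0, 121.5)], dtm, 0.0, 0.0))
rotation_z = [[0.0, -1.0, 0.0, 10.0],
              [1.0, 0.0, 0.0, 0.0],
              [0.0, 0.0, 1.0, 0.0],
              [0.0, 0.0, 0.0, 1.0]]
print(apply_rigid_transform([(1.0, 0.0, 5.0)], rotation_z))
```

A rigid transformation preserves distances within the point cloud, so registration changes only the position and orientation of a dataset, not the dimensions of the trees in it.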

Deriving forest structural attributes using terrestrial laser scanning and multisensorial point clouds
A point cloud processing method presented in Yrttimaa et al. [50] was used in this study to segment trees, classify the point cloud into stem and crown points, and measure tree attributes from the point clouds (Fig. 2). It should be noted that the same processing workflow was used for the TLS and the multisensorial (TLS+UAV) point clouds. After single-tree attributes were derived, the forest structural attributes were aggregated from the tree attributes using Equations 1-5. The method is explained in detail in the following. First, raster-based canopy segmentation was carried out to partition the point clouds into smaller areas to speed up the computations in later stages of the processing workflow (Fig. 2a). Canopy height models (CHMs) at a 20 cm resolution were generated from the normalized point clouds using the LAStools software [48]. UAV point clouds were used in the multisensorial approach. The Variable Window Filter approach [51] was used to identify the local maxima in the CHMs, and Marker-Controlled Watershed Segmentation [52] was applied to delineate the canopy segments. The point clouds were then split according to the extracted crown segments using a point-in-polygon approach.
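The idea behind variable-window local maxima detection can be sketched as follows: a CHM cell is accepted as a treetop candidate if it is the highest cell within a search window whose size grows with the cell's own height. The linear radius function, the minimum height threshold and the square (rather than circular) window are assumptions made for this sketch and are not taken from [51]. The watershed segmentation step is not shown.

```python
def find_treetops(chm, cell=0.2, min_height=2.0,
                  radius_fn=lambda h: 0.6 + 0.04 * h):
    """Return (row, col, height) for CHM cells that are strict local maxima
    within a height-dependent square search window.

    chm: list of rows of canopy heights in metres; cell: raster resolution (m).
    radius_fn is a hypothetical height-to-radius relation in metres.
    """
    rows, cols = len(chm), len(chm[0])
    tops = []
    for r in range(rows):
        for c in range(cols):
            h = chm[r][c]
            if h < min_height:
                continue  # ignore ground and low vegetation
            k = max(1, round(radius_fn(h) / cell))  # window half-width in cells
            is_top = True
            for rr in range(max(0, r - k), min(rows, r + k + 1)):
                for cc in range(max(0, c - k), min(cols, c + k + 1)):
                    if (rr, cc) != (r, c) and chm[rr][cc] >= h:
                        is_top = False
                        break
                if not is_top:
                    break
            if is_top:
                tops.append((r, c, h))
    return tops


# Illustrative 3 m x 3 m CHM with two crowns and one side peak.
chm = [[0.0] * 15 for _ in range(15)]
chm[3][3], chm[3][4], chm[10][10] = 20.0, 18.0, 15.0
print(find_treetops(chm))
```

Note that with the strict comparison used here, a flat plateau of equal heights yields no treetop; a production implementation would need a tie-breaking rule.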
At this point, it was assumed that each crown segment may contain multiple trees if the crowns of adjacent trees overlap. These trees were separated from each other, and the segmented point clouds were further classified into stem points and non-stem points using a point cloud classification approach (Fig. 2b). The classification was based on the general assumption that stem points have more planar, vertical and cylindrical characteristics than points representing branches and foliage [30,49]. These characteristics were distinguished by applying surface normal filtering, point cloud clustering, and Random Sample Consensus (RANSAC)-cylinder filtering on horizontal point cloud slices. A more detailed description of the point cloud classification procedure can be found in [50].
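To illustrate the RANSAC principle on a horizontal slice, the sketch below fits a circle to 2D points by repeatedly sampling three points, constructing the circle through them and counting the points that lie within a tolerance of its perimeter. It is a simplification: the method in [50] applies cylinder filtering, whereas this example fits a 2D circle, and the tolerance, iteration count and function names are assumptions.

```python
import math
import random


def circle_from_three(p1, p2, p3):
    """Centre and radius of the circle through three 2D points, or None if
    the points are (nearly) collinear."""
    (x1, y1), (x2, y2), (x3, y3) = p1, p2, p3
    d = 2.0 * (x1 * (y2 - y3) + x2 * (y3 - y1) + x3 * (y1 - y2))
    if abs(d) < 1e-12:
        return None
    s1, s2, s3 = x1 * x1 + y1 * y1, x2 * x2 + y2 * y2, x3 * x3 + y3 * y3
    cx = (s1 * (y2 - y3) + s2 * (y3 - y1) + s3 * (y1 - y2)) / d
    cy = (s1 * (x3 - x2) + s2 * (x1 - x3) + s3 * (x2 - x1)) / d
    return cx, cy, math.hypot(x1 - cx, y1 - cy)


def ransac_circle(points, tolerance=0.01, iterations=200, seed=0):
    """Return the (cx, cy, r) model with the most inliers and those inliers.

    tolerance: maximum distance (m) from the circle perimeter for an inlier.
    """
    rng = random.Random(seed)  # fixed seed keeps the sketch reproducible
    best_model, best_inliers = None, []
    for _ in range(iterations):
        model = circle_from_three(*rng.sample(points, 3))
        if model is None:
            continue
        cx, cy, r = model
        inliers = [p for p in points
                   if abs(math.hypot(p[0] - cx, p[1] - cy) - r) <= tolerance]
        if len(inliers) > len(best_inliers):
            best_model, best_inliers = model, inliers
    return best_model, best_inliers


# Illustrative slice: 40 stem points on a 0.15 m radius circle plus 10
# branch-like points on a straight line.
slice_points = [(2.0 + 0.15 * math.cos(2 * math.pi * k / 40),
                 3.0 + 0.15 * math.sin(2 * math.pi * k / 40)) for k in range(40)]
slice_points += [(2.5 + 0.03 * k, 3.4) for k in range(10)]
model, inliers = ransac_circle(slice_points)
print(model, len(inliers))
```

Points that are not inliers of the best model are candidates for the non-stem class, which is the role such a filter plays in separating stems from branches and foliage.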
Tree attributes, namely dbh, tree height, and stem volume, were extracted from the classified point clouds as follows (see Fig. 2c). Tree height was determined as the vertical distance between the highest and lowest points of each tree. The stem taper curve was estimated by measuring diameters through circle fitting to the stem points at 20 cm vertical intervals. Outliers in the diameter-height observations were filtered out by comparing the measured diameters to the mean of the three previous (or, at the bottom of the stem, the three closest) diameters. Then a cubic spline curve was fitted to the diameter-height observations to smooth out unevenness in the diameter measurements and to interpolate the missing diameters as in [29]. Dbh was then obtained as the diameter at 1.3 m height from the taper curve. Stem volume was estimated by considering the stem as a sequence of vertical cylinders with a height of 10 cm. Finally, the plot-level forest structural attributes (TPH, G, Vmean, Dg, Hg) were computed from the tree attributes according to Equations 1-5.
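The tree-level measurements described above can be sketched in Python as follows. Several details are assumptions, since the text does not specify them: the outlier threshold (a 20% relative deviation), the comparison against previously accepted diameters, and the placement of each cylinder's diameter at its mid-height. Linear interpolation is used here in place of the cubic spline for brevity. All function names are hypothetical.

```python
import math


def tree_height(z_values):
    """Vertical distance between the highest and lowest point of a tree."""
    return max(z_values) - min(z_values)


def filter_taper_outliers(heights, diameters, max_rel_dev=0.2):
    """Keep (height, diameter) pairs whose diameter deviates at most
    max_rel_dev (assumed threshold) from the mean of the three previously
    accepted diameters, or of the three closest ones near the stem base."""
    pairs = list(zip(heights, diameters))
    kept = []
    for i, (h, d) in enumerate(pairs):
        if len(kept) >= 3:
            ref = [kd for _, kd in kept[-3:]]
        else:
            ref = [pairs[j][1] for j in range(min(4, len(pairs))) if j != i][:3]
        mean_ref = sum(ref) / len(ref)
        if abs(d - mean_ref) <= max_rel_dev * mean_ref:
            kept.append((h, d))
    return kept


def interpolate_diameter(taper, h):
    """Linearly interpolate diameter at height h from sorted (height, diameter)
    pairs; values outside the measured range are held constant."""
    if h <= taper[0][0]:
        return taper[0][1]
    if h >= taper[-1][0]:
        return taper[-1][1]
    for (h0, d0), (h1, d1) in zip(taper, taper[1:]):
        if h0 <= h <= h1:
            return d0 + (d1 - d0) * (h - h0) / (h1 - h0)


def stem_volume(taper, height, step=0.1):
    """Stem volume (m^3) as a stack of cylinders of height `step` (m), each
    using the taper-curve diameter (m) at its mid-height."""
    volume = 0.0
    for k in range(int(round(height / step))):
        d = interpolate_diameter(taper, (k + 0.5) * step)
        volume += math.pi * (d / 2.0) ** 2 * step
    return volume


# Illustrative taper observations (made-up values), one of them an outlier.
obs = filter_taper_outliers([0.2, 0.4, 0.6, 0.8, 1.0, 1.2],
                            [0.30, 0.29, 0.29, 0.28, 0.60, 0.27])
print(obs)
print(interpolate_diameter(obs, 1.3), stem_volume(obs, 1.2))
```

With a 10 cm cylinder height the discretization error is small relative to the diameter measurement error, which is one reason a simple stacked-cylinder model is adequate for volume estimation.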

Performance assessments
Performance of the methods to characterize forest structure was assessed by comparing the point cloud-derived tree height and plot-level forest structural attributes (TPH, G, Dg, Hg, Vmean) with the field-measured ones, using bias (mean error) and root-mean-square error (RMSE) as accuracy measures:
$$\mathrm{bias} = \frac{\sum_{i=1}^{n} (\hat{X}_i - X_i)}{n} \qquad (6)$$
$$\mathrm{RMSE} = \sqrt{\frac{\sum_{i=1}^{n} (\hat{X}_i - X_i)^2}{n}} \qquad (7)$$
where n is the number of sample plots, \hat{X}_i is the TLS- or multisensorial point cloud-derived tree attribute or forest structural attribute for plot i, and X_i is the corresponding attribute based on field inventory. In addition, the coefficient of determination (R²) was used to describe the proportion of the variance in the forest structural attributes that the point cloud-based approaches were capable of capturing. Tree detection accuracy was evaluated by computing how large a part of the field-measured trees was automatically detected from the point cloud and how large a portion of the total stem volume these trees represented.
There were no differences in tree detection accuracy between the use of TLS data only and multisensorial point clouds (Table 2, Fig. 3). Out of the total of 2102 Scots pine trees, 2076 (98.8%) were automatically detected with both point clouds. The stem volume of the detected trees accounted for 99.5% of the stem volume of all the trees. On average, tree height was underestimated by 0.33 m (1.7%) and the RMSE of the tree height measurements was 1.47 m (7.4%) with the multisensorial approach. The accuracy was slightly lower when the measurements were based on TLS data only, as tree height was underestimated by 0.65 m (3.3%) with an RMSE of 1.64 m (8.3%). For TPH, G and Dg, the attributes describing forest horizontal structure, there were no differences in bias or RMSE between TLS and multisensorial point clouds (Table 2). TPH, G and Dg were all slightly underestimated (1%-2.5%). The RMSEs of TPH, G and Dg were 4.8%, 3.3% and 1.5%, respectively. Hg and Vmean estimation accuracies differed between the TLS and multisensorial approaches.
For Hg, the bias decreased from -0.75 m (-3.6%) to -0.45 m (-2.2%) and the RMSE from 0.88 m (4.3%) to 0.58 m (2.8%) when the multisensorial approach was used instead of TLS data only. For Vmean, the multisensorial approach provided a slightly lower RMSE (12.81% vs. 14.55%) than TLS alone, but the estimates included more bias (4.97% vs. 0.82%). Negative bias denotes underestimation.
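The accuracy measures used above can be computed with a few lines of Python. The sketch follows the sign convention stated in the text (estimate minus reference, so a negative bias denotes underestimation). Expressing errors relative to the mean of the reference values is an assumption, as the text does not define how the percentages were obtained. The example values are made up.

```python
import math


def bias(estimates, references):
    """Mean error: average of (estimate - reference)."""
    n = len(references)
    return sum(e - r for e, r in zip(estimates, references)) / n


def rmse(estimates, references):
    """Root-mean-square error of the estimates against the references."""
    n = len(references)
    return math.sqrt(sum((e - r) ** 2 for e, r in zip(estimates, references)) / n)


def relative(value, references):
    """Express an error as a percentage of the mean reference value
    (assumed definition of the relative measures)."""
    return 100.0 * value / (sum(references) / len(references))


# Illustrative plot-level Hg values in metres (not study data).
estimated = [19.0, 21.0, 23.5]
measured = [20.0, 22.0, 24.0]
print(bias(estimated, measured), rmse(estimated, measured))
print(relative(bias(estimated, measured), measured))
```

Reporting both measures is informative because RMSE combines systematic and random error, whereas bias isolates the systematic part.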

Discussion
We investigated how much forest characterization capacity can be improved in managed Scots pine stands if TLS point clouds are complemented with photogrammetric point clouds acquired from above the canopy using a UAV (i.e. the multisensorial close-range sensing approach). We hypothesized that the different measurement geometries of TLS and UAV point cloud data would lead to more complete characterization of single trees and therefore to more complete characterization of the vertical and horizontal structure of forests. We assessed the vertical and horizontal characterization performance by using the most common forest structural attributes. TPH, G and Dg represented the horizontal variation of the Scots pine stands, whereas Hg described their vertical structure, and Vmean was affected by both vertical and horizontal structure. The results supported our hypothesis, as forest structural attributes directly related to the vertical forest structure were more accurately estimated from the multisensorial point clouds (Table 2). Additional benefits from the photogrammetric UAV point clouds were found in the tree segmentation stage of the data processing. However, the multisensorial close-range sensing approach did not considerably improve characterization of the horizontal forest structure compared to the use of TLS point clouds only. This finding was somewhat unexpected, as the different measurement geometries were assumed to also lead to improved tree detection because different trees were anticipated to be occluded in the TLS and UAV data. It should be noted, however, that the sample plots were located in managed Scots pine stands and that the TLS-based tree detection rate was already better than has been reported in most studies in boreal forests [21,49]. In more complex forests, the use of multisensorial data should theoretically improve the tree detection rate and therefore the estimation of TPH, G, Dg as well as Vmean.
Tree detection from TLS point clouds depends greatly on how comprehensively the point cloud describes the trees. A grid of 10 m between scan positions produced uniform data for characterization of even sub-order branches [53]. In [54] it was reported that the highest tree detection rate of 82% was obtained in temperate forests with seven scanning locations, at the vertices of a hexagon together with a centre scan. In [21] the RMSE of tree height estimates varied between 2.8 and 4.7 m (13%-30%), which is considerably higher than the results obtained here with TLS data only (i.e. RMSE 1.6 m). It should be noted, however, that the scan design differed between these two studies (i.e. five scan locations in [21] and eight in our study) and that the plots in Liang et al. [4] had a more complex species composition than the practically pure Scots pine plots in this study. The improvements in the estimates of TPH, G, Dg, Hg, and Vmean between [49] and our study are also noticeable, indicating the effect of forest structure but also of scan design on reliable characterization of forest structure. In [55] automatic and manual measurements of tree height were carried out from 3D point clouds acquired with a laser sensor mounted on a UAV, and the results showed an RMSE of ≥10% and a bias of ~3%. Although the inclusion of UAV point clouds provided only a slightly lower RMSE (2.8%) and bias (-2.2%) for Hg compared to TLS data only (4.3% and -3.6%, respectively), the RMSE especially was notably lower than that reported in [55]. Although the UAV sensors differed, the difference in the heterogeneity of forest structure is assumed to be the main reason for the differences between these two studies.
Forest structure can be characterized using field inventories [27], close-range sensing [4,27,56] or remote sensing [5,57]. In field inventories, calipers are typically used for dbh measurements and clinometers for tree height measurements. If tree positions are mapped, a global navigation satellite system and a tacheometer are needed. Field inventories can be time-consuming, require skilled personnel, and the measurements are prone to human error. Nevertheless, field inventories provide rather accurate information on trees, although for a limited number of attributes (such as species, dbh and height). Still, forest field inventories are most often considered the reference for other forest characterization methods [27]. Technologies such as airborne laser scanning (ALS) and digital stereo interpretation of aerial imagery (e.g. [58]) provide the state of the art in forest characterization based on remote sensing. Large forested areas and landscapes can be characterized with these techniques. Most often, direct observation of the attributes of interest is not possible, as the height of the canopy, height variation, canopy cover, and spectral properties of forests are observed instead [58]. Thus, reference information on the attributes of interest is required to build prediction models between the attributes of interest and the remote sensing observations. Another limitation relates to the number of detected trees: typically only dominant trees or trees contributing to the canopy surface can be identified [20,22,23]. Close-range sensing technologies include, but are not limited to, TLS and the use of UAVs [4,56,59]. Close-range sensing technologies can be used to characterize single forest patches or small forested landscapes [56]. Multiple TLS scans are required from each forest stand to capture its structure. Time consumption on site with TLS is slightly less than in tree-wise field inventories with calipers and clinometers.
However, TLS provides a more complete description of forest structure, especially of the structure below the crown base height [32], and the measurements are objective. In dense forest stands, occlusion may limit the number of observations received from stems and crowns, which decreases the forest characterization capacity [4,60]. In general, the number of scans required for automatic detection of each tree [23,49], tree species recognition [61], and systematic underestimation of tree height [21] are the major bottlenecks when TLS alone is used for forest characterization. The use of UAVs has mainly the same limitations as airborne remote sensing technologies, such as the lack of direct observation methods for single-tree attributes. However, the use of UAVs offers flexibility in data acquisition and lower flying costs, at the price of covering smaller areas than can be covered with aircraft [60]. Considering the strengths and weaknesses of the above alternatives for characterizing forest stands at single-tree level, it seems feasible to combine TLS and UAV data. This statement is also supported by our findings: the multisensorial approach was capable of improving characterization of forest structure. It should be noted that we did not test the use of UAV data alone. However, based on existing knowledge, complete tree detection and tree stem characterization are still challenging with sensors similar to those used here [17,62].
We aimed to develop methodologies and technologies that can be used to characterize single trees and forest patches and, further, to improve our understanding of tree populations at varying scales. Instead of calipers and clinometers, researchers and forest organizations are increasingly interested in using close-range sensors for characterizing forests. Based on our study, it is beneficial to combine point clouds collected below and above the canopy using TLS and UAV for characterization of the horizontal and vertical structure of forests.

Conclusions
The automatic forest characterization based on multisensorial data consisting of photogrammetric UAV and multi-scan TLS point clouds was capable of detecting almost all the Scots pine trees on the sample plots while providing reliable estimates of the forest structural attributes. Compared to the use of TLS point clouds only, an improvement in the accuracy of tree height measurements as well as of Hg and Vmean estimates was recorded when photogrammetric UAV and TLS point clouds were combined. However, in managed Scots pine forests TLS alone also captured the upper parts of the forest canopy rather well, and tree height measurements and Hg estimates were only slightly underestimated, with a bias of 0.65 m and 0.75 m, respectively. Although in single-species, single-layer forest conditions, multisensorial approach enhanced characterization of forests vertical
Funding: This research was funded by the Academy of Finland, grant numbers 315079 ("The effects of stand dynamics on tree architecture of Scots pine trees"), 272195 ("Centre of Excellence in Laser Scanning Research") and 327861 ("Autonomous tree health analyzer based on imaging UAV spectrometry").