The Potential of Widespread UAV Cameras in the Identification of Conifers and the Delineation of Their Crowns

Abstract: With the ever-improving advances in computer vision and Earth observation capabilities, Unmanned Aerial Vehicles (UAVs) allow extensive forest inventory and the indirect description of stand structure. We performed several flights with different UAVs and popular sensors over two sites with coniferous forests of various ages, at various flight levels, using the custom settings preset by the solution suppliers. The data were processed using image-matching techniques, yielding digital surface models, which were further analysed using the lidR package in R. Consumer-grade RGB cameras were consistently more successful in the identification of individual trees at all of the flight levels (84–77% for the Phantom 4), compared to the success of the multispectral cameras, which decreased with higher flight levels and smaller crowns (77–54% for the RedEdge-M). Regarding the accuracy of the measured crown diameters, the RGB cameras yielded satisfactory results (Mean Absolute Error (MAE) of 0.79–0.99 m and 0.88–1.16 m for the Phantom 4 and Zenmuse X5S, respectively); the multispectral cameras overestimated the diameters, especially in the full-grown forests (MAE = 1.26–1.77 m). We conclude that widely used low-cost RGB cameras yield very satisfactory results for the description of structural forest information at a 150 m flight altitude. When (multi)spectral information is needed, we recommend reducing the flight level to 100 m in order to acquire sufficient structural forest information. The study contributes to the current knowledge by directly comparing widely used consumer-grade UAV cameras and by providing a clear elementary workflow for inexperienced users, thus helping entry-level users with the initial steps and supporting the usability of such data in practice.


Introduction
Forests are complex terrestrial ecosystems covering almost 4 billion hectares globally and providing an immense number of ecosystem services [1], i.e., ecological, climatic, economic, cultural, and social services. Forest ecosystems accumulate most of the standing biomass [2], protect watersheds, prevent soil erosion, sequester large quantities of carbon via photosynthesis, mitigate climate change [3], and harbour a massive number of species [4]. However, the pressure on the production of these ecosystem services is increasing [5,6], forest disturbances are intensifying [7], and the global forested area is continuously decreasing. This is why it is necessary to keep improving our ability to monitor forest ecosystems.
With the availability of Earth observation techniques, the knowledge of the distribution, health status, biomass storage, and stand structure and composition of the world's forests continuously increases [8]. Satellites and airborne systems supplement the terrestrial data inventories and facilitate the monitoring, mapping, and modelling of forests with unprecedented spatial resolution across large extents. The choice of the platform always relates to the analysis or application we would like to perform. While satellites may capture

Imagery Acquisition
Flights with four sensors mounted on three different UAVs were performed in the study area (see Table 1 for details). The imagery was collected during late winter, on 26 February 2019; the flight conditions were convenient: a mostly cloudy sky with a temperature of around 12 °C and a northwest wind of 2–5 m·s⁻¹. A total of eight Ground Control Points (GCPs), surveyed with a Leica 1200 GNSS receiver in RTK mode, were placed throughout the study site.

The three UAVs included: (a) a lightweight fixed-wing UAV, the Disco Pro Ag (Parrot SA, Paris, France), a ready-to-deploy solution for agricultural and forestry applications with a maximum take-off weight (MTOW) of 0.94 kg, mounted with a Sequoia camera. The remaining two were rotary-wing systems, namely (b) the Phantom 4 Pro (DJI, Shenzhen, China), probably the most popular lightweight (MTOW of 1.39 kg) universal commercial UAV, with an integrated camera, and (c) the Matrice 210 (DJI, Shenzhen, China), a professional adjustable enterprise solution with an MTOW of 6.6 kg, carrying a Zenmuse X5S FC6520 camera. In addition, (d) a fourth UAS (Unmanned Aerial System, i.e., a ready-to-fly solution including all of the necessary components) was created by mounting a RedEdge-M camera on the Phantom 4 Pro platform. Except for the fixed-wing UAV, for which the producer regulates the flight altitude, the flights were performed at 100, 150, and 200 m above ground level (AGL). For the fixed-wing UAV, we used the minimum and maximum allowed flight altitudes (125 m and 150 m) and supplemented them with one in between (135 m). The flight missions were performed using (i) the DJI Ground Station Pro application for both the Phantom 4 Pro and the Matrice 210, and (ii) the Pix4Dcapture application for the Parrot Disco Pro Ag. The flights were conducted along perpendicular flight lines with 80% forward (longitudinal) and 70% side (lateral) overlap. The UAVs followed predefined flight plans across the study sites. The sensors' triggering option was set to Overlaps (one image acquisition per waypoint) or, where that was not possible, Time-lapse (a fixed time between any two acquisitions). In order to simulate the behaviour of a user without in-depth knowledge of remote sensing techniques, we evaluated the performance of the systems in the default mode, i.e., with no adjustments to the vendor-preset image acquisition parameters (shutter speed/aperture preference, ISO, etc.).
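The trade-off between flight altitude and ground detail can be made concrete with a back-of-the-envelope ground sampling distance (GSD) calculation. The sketch below is a minimal Python illustration, not part of the study's workflow; the sensor parameters (13.2 mm sensor width, 8.8 mm focal length, 5472 px image width, roughly those of a 1-inch 20 MP RGB camera) are assumptions for illustration only, so check the actual camera datasheet before real flight planning.

```python
def ground_sampling_distance(altitude_m, focal_mm, sensor_width_mm, image_width_px):
    """Ground sampling distance (m/pixel) for a nadir-pointing camera."""
    return altitude_m * sensor_width_mm / (focal_mm * image_width_px)

def waypoint_spacing(footprint_m, overlap):
    """Distance between flight lines (or triggers) for a given fractional overlap."""
    return footprint_m * (1.0 - overlap)

# Illustrative sensor values only (roughly a 1-inch, 20 MP RGB camera).
for agl in (100, 150, 200):
    gsd = ground_sampling_distance(agl, focal_mm=8.8,
                                   sensor_width_mm=13.2, image_width_px=5472)
    footprint = gsd * 5472                       # across-track ground footprint (m)
    spacing = waypoint_spacing(footprint, 0.70)  # 70% side overlap, as flown above
    print(f"{agl} m AGL: GSD {gsd * 100:.1f} cm/px, line spacing {spacing:.0f} m")
```

The output makes the text's pattern visible: doubling the altitude doubles the GSD (coarser detail) but widens the footprint, so fewer images cover the same site.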

Image Alignment and Surface Reconstruction
Agisoft Metashape Professional (version 1.5.5, Agisoft LLC, St. Petersburg, Russia) image-matching software was used to generate point clouds and reconstruct the 3D surface [44]. Metashape uses band-related metadata from EXIF to load the image description, including the coordinates taken from the on-board GNSS units. As the first step, we loaded the images and estimated the image quality. Images taken during take-off, landing and taxiing, as well as those with a quality below 0.5 (automatically evaluated by Agisoft), were excluded from further processing [45]. The numbers of acquired images are tabulated below (Table 2).
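The screening step reduces to a simple filter. The records and field names below are hypothetical stand-ins for Metashape's per-image quality estimate, not its actual API; only the 0.5 threshold comes from the workflow described above.

```python
# Hypothetical image records; "quality" mimics Metashape's 0-1 sharpness estimate.
images = [
    {"name": "IMG_0001.JPG", "quality": 0.83, "phase": "survey"},
    {"name": "IMG_0002.JPG", "quality": 0.41, "phase": "survey"},   # blurred
    {"name": "IMG_0003.JPG", "quality": 0.77, "phase": "takeoff"},  # off the flight plan
    {"name": "IMG_0004.JPG", "quality": 0.91, "phase": "survey"},
]

def keep(img, threshold=0.5):
    # Discard take-off/landing/taxiing frames and anything below the
    # quality threshold used in the study (0.5).
    return img["phase"] == "survey" and img["quality"] >= threshold

kept = [img["name"] for img in images if keep(img)]
print(kept)  # ['IMG_0001.JPG', 'IMG_0004.JPG']
```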
After determining the image orientations, i.e., after geometrical processing [46], the sparse point clouds were checked for outliers and, subsequently, densified using the surveyed GCPs. Following the Metashape manual, a digital surface model (DSM) was constructed using the dense point clouds and an orthomosaic was built; see the processing parameters in Table 2. The ground point classification of the dense point clouds was performed in order to enable the construction of a digital terrain model (DTM). In accordance with [47], point cloud filtering was performed with filtering parameter tuning in order to obtain the best possible terrain model for all of the inputs.

Deriving Tree Variables and Statistical Analysis
The evaluated tree parameters (the number of detected trees and the crown diameter) were derived in R (version 3.4.3, R Core Team, Vienna, Austria). First, we subtracted the terrain models from the surface models to obtain normalised heights, i.e., Canopy Height Models (CHMs). Subsequently, the CHMs from all 24 datasets (3 altitude levels, 2 sites and 4 cameras) were processed using the same workflow in the lidR package [48,49]. The workflow included (i) the identification of individual trees and (ii) the delineation of the tree crowns. The detection of the individual trees was based on local maxima filtering using focal statistics [50,51], while the crown delineation was performed by watershed-based object detection [52,53]. Subsequently, the crown diameter was calculated using an automatic methodological workflow consisting of (i) the approximation of individual detected crowns using circles with areas corresponding to those of the tree crown polygons, and (ii) the calculation of the circle diameter. Thus, the final output contained information about the total number of trees along with the location (coordinates) and crown diameter (m) of each individual tree/shrub.
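The two steps above can be sketched compactly. The following is a minimal pure-Python illustration of the idea on a toy raster, not the lidR implementation (lidR operates on georeferenced rasters and uses watershed segmentation for the crown polygons); the window size, height threshold, and toy CHM values are assumptions for demonstration.

```python
import math

def local_maxima(chm, win=1, min_height=2.0):
    """Treetop candidates: cells higher than all neighbours within `win` cells
    and above a minimum height (to skip ground/shrub noise)."""
    rows, cols = len(chm), len(chm[0])
    tops = []
    for r in range(rows):
        for c in range(cols):
            h = chm[r][c]
            if h < min_height:
                continue
            neigh = [chm[i][j]
                     for i in range(max(0, r - win), min(rows, r + win + 1))
                     for j in range(max(0, c - win), min(cols, c + win + 1))
                     if (i, j) != (r, c)]
            if all(h > n for n in neigh):
                tops.append((r, c, h))
    return tops

def crown_diameter(area_m2):
    """Diameter of the circle whose area equals the delineated crown polygon's."""
    return 2.0 * math.sqrt(area_m2 / math.pi)

# Toy 5x5 CHM (heights in m) containing two crowns.
chm = [
    [0.2, 0.5, 0.4, 0.3, 0.2],
    [0.4, 6.1, 5.2, 0.6, 0.4],
    [0.3, 5.0, 4.8, 0.5, 7.9],
    [0.2, 0.4, 0.5, 0.7, 7.0],
    [0.1, 0.3, 0.4, 0.6, 6.2],
]
print(local_maxima(chm))               # [(1, 1, 6.1), (2, 4, 7.9)]
print(round(crown_diameter(39.6), 2))  # 7.1, a Site-I-sized crown
```

The equal-area-circle step is the key simplification: once a crown polygon's area is known, its reported diameter is simply d = 2·sqrt(A/π).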
Reference values for the number and diameters of the tree crowns were obtained by the operator through manual detection from the orthomosaic. The number of trees was also surveyed in the field. The tree crown diameters were manually measured in the north–south and west–east directions in ArcGIS software, version 10.7.1 (ESRI, Redlands, CA, USA). Due to the slight differences in the treetop position in the CHMs from the individual UASs and the reference, the treetops on the individual orthomosaics were automatically overlaid (Near function in ArcGIS) before further processing. Multiple treetops within a radius of 2 m were considered to represent a single tree [15]. In addition, the results were visually inspected.
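The 2 m treetop-matching rule can be illustrated with a simple greedy nearest-neighbour pairing. This is a sketch of the idea only, not a reimplementation of ArcGIS's Near tool; the coordinates are made up.

```python
import math

def match_treetops(detected, reference, radius=2.0):
    """Pair each detected treetop with the closest unused reference top within
    `radius` metres (cf. the 2 m rule above); unmatched detections are
    commission errors, unmatched references are omissions."""
    pairs, used = [], set()
    for i, (dx, dy) in enumerate(detected):
        best, best_d = None, radius
        for j, (rx, ry) in enumerate(reference):
            if j in used:
                continue
            d = math.hypot(dx - rx, dy - ry)
            if d <= best_d:
                best, best_d = j, d
        if best is not None:
            used.add(best)
            pairs.append((i, best, round(best_d, 2)))
    return pairs

# Made-up local coordinates (m): reference from the orthomosaic, detected from a CHM.
reference = [(0.0, 0.0), (6.0, 1.0), (12.0, 0.5)]
detected = [(0.8, 0.4), (6.5, 2.9), (20.0, 0.0)]  # the last one has no match
print(match_treetops(detected, reference))  # [(0, 0, 0.89), (1, 1, 1.96)]
```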
The accuracy of individual tree detection, i.e., the proportion of correctly detected trees at both study sites, was evaluated and expressed as the total accuracy with 95% confidence intervals (Table 3). In total, 100 trees were randomly selected within each study site. The normality of the distribution of the individual treetop areas was tested by the Shapiro–Wilk test with outliers both included and excluded. Depending on the results, we applied Student's t-test or the Wilcoxon signed-rank test, respectively (Table 4), for the detection of significant differences between the automatically detected treetop areas and the reference values (i.e., for the comparison of the performance of the models). The accuracy of the detected crowns was evaluated and expressed as the Mean Absolute Error (MAE) and the Mean Absolute Percentage Error (MAPE); the model performance was expressed as 1 − MAPE (Table 4). The study pipeline is shown in Figure 2.

Figure 2. Study processing pipeline. Using four UAVs, data were collected at three different flight levels at two sites. The data were subsequently processed by SfM-MVS methods into the form of orthomosaics and digital surface/terrain models. Then, the treetops were automatically detected and the tree crowns were delineated using the lidR package in R. The accuracy of the automatic methods against the reference data was further evaluated statistically.

Table 3. Descriptive statistics of the models' performance: the accuracy as a percentage of the identified trees (Detected Trees) and the 95% confidence interval (CI 95) for the achieved accuracy; accuracies of 70% or more (within the range), together with the 95% confidence intervals, are in bold. In total, 612 detected trees were used for reference at each site.
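The accuracy metrics above reduce to a few one-liners. In this minimal sketch, the crown diameters and detection counts are illustrative rather than the study's data, and the confidence interval uses the simple Wald approximation, which may differ from the authors' exact method.

```python
import math

def mae(pred, ref):
    """Mean Absolute Error."""
    return sum(abs(p - r) for p, r in zip(pred, ref)) / len(ref)

def mape(pred, ref):
    """Mean Absolute Percentage Error (as a fraction)."""
    return sum(abs(p - r) / r for p, r in zip(pred, ref)) / len(ref)

def wald_ci(k, n, z=1.96):
    """Wald 95% confidence interval for a detection proportion k/n."""
    p = k / n
    half = z * math.sqrt(p * (1 - p) / n)
    return p - half, p + half

# Illustrative crown diameters (m): automatic vs. manual reference.
auto = [6.8, 7.5, 4.1, 5.9, 7.2]
ref = [7.1, 6.9, 4.6, 6.4, 7.0]

print(f"MAE = {mae(auto, ref):.2f} m")
print(f"performance (1 - MAPE) = {1 - mape(auto, ref):.1%}")
lo, hi = wald_ci(84, 100)  # e.g., 84 of 100 sampled trees detected
print(f"detected 84% (CI 95: {lo:.1%}-{hi:.1%})")
```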

Results
In total, 24 canopy height models, together with orthorectified mosaics, were derived from the low-altitude aerial surveys using three different types of unmanned aerial systems across two forested study sites. Details of the orthomosaics, with the resulting spatial resolutions, are shown in Figure 3.

Number of Trees
As far as the number of trees is concerned, all of the models showed a similar trend: cameras with a higher resolution and a larger sensor were able to capture the forest canopy in more detail than those with a lower resolution and smaller sensors (the latter tended to smooth out slight vertical differences). Generally, the sensors with higher spatial resolution, i.e., the Phantom 4 Pro and the Zenmuse X5S, performed better at higher flight levels (150 or 200 m AGL), while the sensors with lower resolution (RedEdge-M and Sequoia) performed better at the lowest flight level (100 and 125 m AGL, respectively); see the descriptive statistics in Table 3.
Two sites were tested: a full-grown, 80–100-year-old forest with a mean crown size of 7.1 ± 1.4 m (Site I), and a younger, 20–40-year-old forest with a 4.5 ± 0.9 m mean crown size (Site II). The Phantom 4 Pro and Zenmuse X5S offered similar results for both sites, while the RedEdge-M and Sequoia performed much better at Site I, probably due to the larger tree crown size and the lower resolution of the sensors. Considering the success rate of the detected individual trees, the best results were achieved using the widely used consumer-grade DJI Phantom 4 Pro flying at 150 m AGL, yielding 84% correctly identified trees at both typical central European forest sites. The Zenmuse X5S achieved very similar results, with a detection rate of 81–83% at a 150 m flight height. On the other hand, the RedEdge-M detected 75–78% and the Sequoia 50–71% of the trees across both sites at 100 m AGL. In general, the Sequoia camera performed worse than the RedEdge-M, which could be caused by its poorer optical system quality, resulting in lower detail. However, the inferior results could also be associated with its different fixed capture settings compared to the RedEdge-M camera.

Tree Crown Diameters
As far as the tree crown diameters are concerned, the best results (i.e., the lowest errors) were achieved using the Phantom 4 Pro; the results are summarised in Table 4 and visualised in scatterplots (Figures A1 and A2).
Sensors with better resolution performed better at higher flight altitudes (150–200 m AGL), with accuracies above 80%. They performed especially well at the 150 m altitude (note the 86% accuracy of the Phantom 4 Pro and the 84–85% accuracy of the Zenmuse X5S at 150 m). Following the trend of tree identification, the multispectral cameras with much lower resolution performed better at the low flight altitude (100 m AGL), where the RedEdge-M and Sequoia achieved promising accuracies of 81% and 69–80%, respectively.
The crown-diameter performance of the higher-resolution sensors remained balanced across the sites. In contrast, the performance of the low-resolution sensors declined with the decreasing diameters of the tree crowns (especially for the Sequoia). At Site I, with a mean crown diameter of 7.1 ± 1.4 m, the accuracy of the RedEdge-M and Sequoia ranged between 72% and 81%. When the mean crown diameter decreased to 4.5 ± 0.9 m (Site II), the accuracy dropped significantly, to 51–81%. However, the lowest value (51%) was only observed for the fixed-wing UAV at the highest altitude, which may be unsuitable for this type of analysis; all of the remaining flights, even with the low-resolution sensors, yielded accuracies of 65% or more.

Discussion
Many studies have pointed out the potential of remote sensing for forestry purposes. Over the last few years, the popularity of UAVs has increased, and many applications have been explored and described [54]. Specifically, UAVs equipped with various cameras were successfully used to count trees and to measure their crowns and heights [21,55,56]. The potential of consumer-grade (low-cost) solutions is also being explored in practice [57]. This is why this paper compares the performance of four widely used cameras for the counting of the trees and the measurement of their crown diameter. The presented results indicated that even consumer-grade non-professional cameras have potential for measuring tree parameters, especially for users without in-depth education in remote sensing data acquisition and processing, such as forest managers. Moreover, it is possible to process the data solely in open-source software [58], thereby reducing the necessary costs.
The best success rate of tree detection and the best accuracy of the delineated crowns using the consumer-grade RGB cameras were observed when the imagery was taken from a flight altitude of 150 m AGL (corresponding approximately to 120–140 m above the canopy, depending on the site) at both sites, i.e., in the full-grown as well as the young dense forest. The RGB cameras generally yielded better results at 200 m AGL than at 100 m AGL. On the other hand, the multispectral cameras yielded better results at the lowest flight altitude, i.e., 100 m AGL (approximately 70–90 m above the canopy). Increasing flight altitude led to a significant decline in the results of the multispectral cameras, especially in tree detection.
Lidar is often considered to be the superior method for scanning forests, as it can better penetrate the canopy. This is true where tree heights are being measured [59]; however, where tree detection and tree crown delineation are concerned, SfM methods can provide results of similar accuracy. While the best result achieved using our simple workflow was 84% of detected individual trees, St-Onge et al. [60] achieved comparable accuracy (83%) using lidar in boreal forests. On the other hand, Kuželka et al. [29] reported as much as 98–99% successfully identified trees using UAV-lidar; however, they analysed a 100–130-year-old coniferous forest with a low tree density. Similarly to our study, an accuracy of more than 80% was reported for a mixed coniferous forest by Mohan et al. [21] using a DJI Phantom 3. As far as a forest under standard silvicultural treatment is concerned, a success rate exceeding 80% of detected trees is excellent. On the other hand, the success rate may exceed 90% in plantations and orchards; Guerra-Hernández et al. [61] reported 80–96% accuracy in eucalyptus plantations.
Where the tree crown diameters are concerned, Panagiotidis et al. [52] reported a lower accuracy than our study using a Sony NEX-5R in a very similar environment; specifically, they reported MAEs of 2.6 and 2.8 m, respectively. Qiu et al. [62] achieved accuracies of 76% and 63% in a rainforest and a coniferous forest, respectively. Our results correspond to those of Zhou et al. [63], who achieved an accuracy of 86% in a mixed-growth forest. On the other hand, the same authors report a success rate as high as 92% in monoculture environments.
UAV image acquisition results may generally be affected by the weather conditions, especially the wind speed, precipitation, shadows and light conditions, resulting in differences in exposure or blurriness [45]. This study aimed to eliminate these factors by performing the flights at low wind speeds, with no precipitation, and under stable light conditions. However, the capture settings were left in the default/auto mode without user adjustment. The differences in the cameras' behaviour (F-stop, shutter speed, ISO) might therefore also have affected the image quality and, in effect, the detection results.
In addition, the quality of the forest canopy geometry may also be affected by the angle of the acquired imagery. Imagery acquisition combining different camera angles may significantly reduce the models' systematic error and the error in the determined internal orientation parameters [64]; using oblique image acquisition might, therefore, make the forest canopy imagery even more suitable for tree crown delineation. Moreover, flight levels of a few tens of meters are usually believed to improve the overall accuracy. However, lower flight levels increase the number of images and, in effect, the computational time and costs; such (perhaps unnecessarily high) spatial resolution may also increase the bias and uncertainty in the canopy geometry.
The results may also be affected by the processing workflow used for the tree detection and crown delineation. The lidR package, which was used in this study, was developed for forestry applications of airborne LiDAR data [49]; however, it can be applied to UAV-borne data as well [65]. The method used consists of local maxima filtering and watershed-based object detection on a (normalised) digital elevation model. Such a workflow is typical for the detection and delineation of tree crowns in various environments [56,66,67]. On the other hand, other methods, such as Geographic Object-Based Image Analysis (GEOBIA) [68,69], neural network approaches and machine learning methods, or semi-supervised feature extraction [70,71], have also been successfully employed [72,73]. Methods based on spectral information analysis can increase the accuracy compared to CHM-based methods [74]; however, CHM-based tree crown detection and delineation methods offer approximately 70–90% accuracies (depending on the environment). Thanks to ready-made processing workflows, CHM-based methods can be used even by users without in-depth education in remote sensing data processing. In contrast, GEOBIA or machine learning methods require advanced knowledge and experience in remote sensing data processing.

Conclusions
Close-range remote sensing techniques support current forestry practices and may be beneficial even to forest managers without in-depth education in remote sensing. The successful implementation of UAVs depends not only on the UAV's availability, flexibility and user-friendliness but also on its financial affordability and reliability. UAV-borne canopy height models allow the derivation of numerous parameters, including tree crown detection and delineation, and the determination of tree heights, which could be used, among other things, to estimate the aboveground forest biomass. This study assessed the effect of widely used consumer-grade cameras and flight altitudes on the detection and delineation of typical central European conifer trees. The study provides a comparison of the performance of four solutions (three of which are ready-made commercial solutions suitable for "out of the box" use), providing information on the reliability of data acquired by inexperienced users of these systems. In addition, a clear elementary workflow for inexperienced users is presented, opening the door to the easier usability of such data in forestry practice.
Our results prove the relationship between the flight altitude and the quality of the resulting product. While the RGB cameras with higher spatial resolutions performed better at higher altitudes, the multispectral cameras with much lower spatial resolution required lower flight altitudes to yield sufficient accuracy. The success of tree detection and crown diameter determination for the multispectral cameras decreased proportionately with increasing flight altitude, i.e., with the declining detail of the forest canopy.
We found that the imagery acquisition altitude of 150 m AGL was associated with the best performance, especially for the sensors with a higher spatial resolution (i.e., the consumer-grade RGB cameras); it is also necessary to mention the additional benefit of this flight altitude compared to lower ones, i.e., the lower number of images, leading to lower computational time and costs. On the other hand, the more expensive multispectral cameras with a much lower spatial resolution can also achieve promising results, but lower flight levels of approximately 100 m AGL are necessary for these sensors.
Author Contributions: All of the authors contributed in a substantial way to the manuscript. J.K. and T.K. invented, conceived, designed and performed the experiment, and wrote significant parts of the manuscript. J.K. was responsible for project management and administration. P.K. contributed by acquiring the UAV imagery. J.K. processed all of the input remote sensing data. K.H. performed the formal analysis and statistical evaluation. J.K. and T.K. ensured the project funding. All authors have read and agreed to the published version of the manuscript.

Funding:
The research was supported by the Technology Agency of the Czech Republic under the Grant Nos. CK02000203 and TJ02000283.

Data Availability Statement:
The data presented in this study are available on request from the corresponding author. The data are not publicly available due to data licensing reasons.

Acknowledgments:
We would like to thank David Balhar for organising and performing the aerial surveys and Dominika Gulková for measuring the crown diameters. Also, many thanks to Jaroslav Janošek for his helpful comments.

Conflicts of Interest:
The authors declare no conflict of interest.