1. Introduction
Urban vegetation is a vital component of the urban environment, providing a broad spectrum of ecosystem services that directly and indirectly impact human health and well-being [
1,
2,
3]. Trees and shrubs form the basis of urban forests, constituting a significant portion of urban vegetation with a substantial influence on urban populations [
4,
5,
6]. Urban forests contribute to physical and mental health [
7], regulate the urban microclimate [
8], enhance water and air quality while mitigating their pollution [
9], reduce noise [
10], provide aesthetic benefits, and support biodiversity conservation [
11,
12]. The rapid acquisition of up-to-date information on urban vegetation, including the number of trees and their condition, enables timely decision-making in urban planning and management. This underscores the importance of green spaces and urban greening initiatives, which, for instance, contribute to climate change adaptation through carbon sequestration and oxygen production [
13,
14,
15].
Currently, two principal approaches are used to gather the vegetation information needed to assess ecosystem services: direct field measurements and remote sensing. Compared to airborne and satellite remote sensing techniques, field inventories have traditionally been the most accurate method for collecting tree-related data, including attributes such as species affiliation, trunk diameter, crown proportions, and dieback. However, this approach has several limitations, including high time and labor costs, a restricted survey area, and potential disturbance to the objects under study [
16,
17,
18].
The limitations of traditional methods are overcome by using active and passive remote sensing techniques [
19,
20], which enable the rapid collection of extensive data over large areas without direct contact with the objects of study [
21,
22]. Satellite data is increasingly informative and accessible; however, sub-meter resolution is typically available only for specific regions and often entails a high monetary cost [
23,
24]. For most areas, the spatial resolution of free-access satellite sensors, such as Landsat 7 and 8 or Sentinel 2, only permits a general estimation of vegetation cover [
25,
26]. Aerial imaging offers a distinct approach, providing centimeter-resolution images within study areas, thus enabling the study of individual trees, albeit with operational challenges and limited area coverage compared to satellite imagery [
27,
28]. Additionally, active aerial sensing methods like Light Detection and Ranging (LiDAR) provide precise and detailed three-dimensional information in the form of point clouds on the structure and spatial arrangement of studied objects [
29,
30]. For instance, LiDAR is effective for tree detection and digital modelling [
31], facilitating automatic measurement of forest traits [
22]. However, this method also presents difficulties related to the three-dimensional representation of data [
32] and the interpretation of spectral features within the study area [
33].
Unmanned Aerial Vehicles (UAVs) equipped with sensors represent cutting-edge, practical, and adaptable instruments for remote vegetation data gathering [
27]. A more conventional application of UAVs in tree stand studies [
20] involves using passive systems to collect the range of spectral information reflected from the surface of objects under study, such as optical imaging in the visible spectrum [
34,
35,
36,
37], multispectral [
38,
39], and hyperspectral imaging [
40,
41]. Photogrammetric processing of the acquired images enables the reconstruction of digital models of the topographic surface and tree canopy. The obtained spectral characteristics can be used, for example, in calculating various spectral indices [
38,
39,
41] or for tree species classification [
35,
40]. Concurrently, active remote sensing technologies like LiDAR are becoming more accessible. LiDAR can be employed for more accurate reconstruction of both terrain features and the spatial structure of stands [
10], which can subsequently be used to delineate tree stands or detect individual trees [
42,
43], or applied to different analyses and classifications [
44,
45]. Given that both approaches have their drawbacks, numerous studies utilize these techniques synergistically [
18,
24,
46,
47].
Simultaneously, techniques for object detection and semantic segmentation of urban areas, with a focus on green spaces using remote sensing data such as aerial imagery, are developing rapidly [
5,
48]. Consequently, urban forests, and in more detailed works, individual trees are becoming the focus of such research [
49,
50,
51]. Methods for the remote estimation of the main spectral and structural parameters of individual trees are thus being developed. These methods can be used, for instance, both in the detection of damaged and diseased trees [
41,
52,
53] and for the calculation of ecosystem services [
25,
54]. At the same time, UAVs are becoming increasingly popular in agriculture, forest management, and landscape ecology due to their flexibility and their ability to carry various instruments as payload for image capture and translation [
48,
55,
56].
Regardless of the data acquisition method, a solution for identifying individual trees and their crowns is necessary, serving as either the initial analytical step or the primary objective [
57,
58]. Recently, computer vision methods have advanced significantly, enabling more efficient detection of various objects, including trees [
59,
60,
61]. For tree crown detection, sophisticated machine learning models employing Convolutional Neural Networks with instance segmentation [
59,
62,
63] are frequently utilized. The ongoing demand for improving and examining the raster-based methods also stems from the preference of many researchers to work with raster data, specifically Canopy Height Models (CHM), when studying tree stands [
20,
63,
64]. Consequently, region growing and watershed segmentation algorithms are the most common raster-based techniques for image segmentation in this context [
18,
52], with the Silva et al. [
65] algorithm also being frequently used for segmenting CHMs [
43,
44,
45,
47].
All established segmentation techniques used to delineate tree crowns, regardless of the type of raw data (raster-based or point cloud-based), require user intervention when selecting input parameter values [
65,
66] because these algorithms are applied to a large number of specific cases: individual tree/plot-level studies, coniferous/broadleaf/mixed forest types, forest/urban areas, remote sensing data types, etc. [
10,
18,
39,
43,
45,
52,
63]. In other words, each application case needs almost unique fine-tuning of parameter values to maximize the applied algorithm’s effectiveness.
The common workflow involves selecting a working configuration of parameters through a combination of defaults and adjustments based on ground-truth data, often with manual tuning [
10,
45,
46,
52,
67]. However, the problem of selecting the optimal parameter configuration remains unresolved and requires deeper consideration to achieve more accurate and reliable results. Several studies address the issue of using the most appropriate parameters, such as the shape and size of the search window for the local maximum filter algorithm [
39,
63]. Despite this, few studies employ systematic approaches to identify the most suitable parameter set for various segmentation algorithms when processing digital models. For instance, Hastings et al. [
47] utilized a simple random search within the multivariate parameter space. This approach eliminates the need for manual parameter setting, ostensibly increasing the impartiality and objectivity of the results. However, random search in a multidimensional space is a basic algorithm that may not yield satisfactory outcomes [
68].
We propose applying specialized optimization techniques to search for optimal parameter sets for the most widespread segmentation algorithms. We utilized the Random Search method and the Differential Evolution strategy to identify the possibilities and limitations of the proposed approach. The campus of the Lobachevsky State University of Nizhny Novgorod was selected as the model object for this study. Its tree composition is typical of temperate climate zone forests in central Russia.
This study addresses the following questions: which segmentation method provides the most accurate segmentation of UAV-derived images using default parameter values; how the segmentation quality of each algorithm changes following parameter optimization; and whether the considered segmentation methods exhibit differences in the degree and speed of parameter optimization. We hypothesize that the optimization of parameter values for CHM segmentation algorithms increases the efficiency of individual tree crown recognition.
4. Discussion
In this study, we investigated how different parameter selection methods affect tree recognition quality using four common segmentation algorithms: Watershed, Marker-Controlled Watershed (MCWS), Dalponte, and Silva. We broadly distinguish two parameter selection approaches: passive, where parameters are set using default values and/or field data, and active, where optimal parameter configurations are determined through manual or algorithmic search. Our results indicate that adopting active parameter selection substantially improved the overall performance of all segmentation algorithms. Initially, the Silva (F-score = 0.2287) and Dalponte (F-score = 0.2153) methods showed higher performance than MCWS (F-score = 0.1896) and Watershed (F-score = 0.1468), potentially suggesting greater “universality” in segmenting vegetation images from various regions. However, after optimization, all studied methods exhibited a similar segmentation pattern with only minor differences in overall efficiency (within the range of 0.3 F-score points). This suggests that the key to accurate tree detection lies not in the segmentation method itself, but in its preliminary tuning and optimization.
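For readers less familiar with the metric, the following minimal sketch shows how an F-score of this kind is commonly computed from detection counts. It assumes the standard harmonic-mean definition over true positives, false positives, and false negatives; the rule for matching a segmented crown to a reference tree is not reproduced here, and the function name and example counts are hypothetical.

```python
def detection_scores(true_pos, false_pos, false_neg):
    """Return (precision, recall, F-score) for a tree detection result.

    Assumes the standard definitions; how a detected crown is matched to a
    reference tree is outside the scope of this sketch.
    """
    detected = true_pos + false_pos      # all crowns produced by the algorithm
    reference = true_pos + false_neg     # all trees in the reference data
    precision = true_pos / detected if detected else 0.0
    recall = true_pos / reference if reference else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f_score = 2 * precision * recall / (precision + recall)
    return precision, recall, f_score


# Hypothetical counts, chosen only to illustrate the arithmetic.
precision, recall, f_score = detection_scores(true_pos=30, false_pos=10, false_neg=20)
print(f"precision={precision:.3f} recall={recall:.3f} F={f_score:.3f}")
```

Because the F-score is a harmonic mean, it stays low whenever either oversegmentation (many false positives) or undersegmentation (many false negatives) dominates.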
4.1. Tuning the Values of the Segmentation Algorithm Parameters and Searching for the Optimal Configuration
The optimization procedure reached a better parameter configuration after only 40 iterations (for the MCWS, Dalponte, and Silva algorithms), increasing efficiency by 0.11–0.16 F-score points and removing the influence of the human factor on the selection of input parameter values (
Figure 7). After 300 iterations, segmentation efficiency saw an improvement of 0.12–0.19 F-score points. Consequently, our study’s results indicate that conducting reasonable optimization with random selection of input parameter values can enhance the overall effectiveness of individual tree crown detection and delineation algorithms. Furthermore, applying Differential Evolution may prove to be a sound solution in the long run (>1000 iterations), while Random Search can provide an approximate solution after just a few dozen iterations.
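The difference between the two search strategies can be illustrated with a minimal, self-contained sketch. It is not the implementation used in this study: the objective `toy_f_score` is a hypothetical stand-in for "segment the CHM and score the result", the search bounds are assumed, and the Differential Evolution variant shown (rand/1/bin with assumed `f` and `cr` settings) is only one common formulation.

```python
import random


def toy_f_score(params):
    """Hypothetical smooth objective standing in for a real segmentation run."""
    th_seed, th_cr = params
    return max(0.0, 0.4 - (th_seed - 0.2) ** 2 - (th_cr - 0.3) ** 2)


BOUNDS = [(0.0, 1.0), (0.0, 1.0)]  # assumed search range for each parameter


def random_search(objective, bounds, n_iter, rng):
    """Sample parameter sets uniformly and keep the best one seen so far."""
    best_params, best_score = None, float("-inf")
    for _ in range(n_iter):
        candidate = [rng.uniform(lo, hi) for lo, hi in bounds]
        score = objective(candidate)
        if score > best_score:
            best_params, best_score = candidate, score
    return best_params, best_score


def differential_evolution(objective, bounds, pop_size, n_gen, rng, f=0.8, cr=0.9):
    """DE/rand/1/bin maximizing the objective (one common variant, assumed here)."""
    dim = len(bounds)
    pop = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(pop_size)]
    scores = [objective(ind) for ind in pop]
    for _ in range(n_gen):
        for i in range(pop_size):
            # Three distinct individuals, all different from the current one.
            a, b, c = rng.sample([j for j in range(pop_size) if j != i], 3)
            forced = rng.randrange(dim)  # at least one gene comes from the mutant
            trial = []
            for d in range(dim):
                if d == forced or rng.random() < cr:
                    value = pop[a][d] + f * (pop[b][d] - pop[c][d])
                    lo, hi = bounds[d]
                    value = min(max(value, lo), hi)  # clip to the search range
                else:
                    value = pop[i][d]
                trial.append(value)
            trial_score = objective(trial)
            # Greedy selection: a good configuration survives across generations.
            if trial_score >= scores[i]:
                pop[i], scores[i] = trial, trial_score
    best = max(range(pop_size), key=lambda k: scores[k])
    return pop[best], scores[best]


rng = random.Random(42)
rs_params, rs_score = random_search(toy_f_score, BOUNDS, n_iter=40, rng=rng)
de_params, de_score = differential_evolution(toy_f_score, BOUNDS, pop_size=10, n_gen=30, rng=rng)
print(f"Random Search:          F={rs_score:.3f} at {rs_params}")
print(f"Differential Evolution: F={de_score:.3f} at {de_params}")
```

Random Search spends every evaluation on an independent sample, which is why it can give a usable configuration quickly, whereas Differential Evolution reuses information from the surviving population and therefore tends to pay off over longer runs.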
In the context of studying tree vegetation, a common practice for working with such algorithms involves choosing parameter values based on the interpretation of field measurements. For instance, Popescu and Wynne [
77] proposed using a linear regression with a quadratic model (separate models for deciduous, coniferous, and mixed forests), derived from measurements of 424 trees, to predict crown width based on tree height. Technically, their research integrates these regression models into the Local Maximum Filtering (LMF) function [
77], enabling the prediction of crown boundaries within the CHM. The core ideas of this approach have been applied in studies focused on tree detection using LMF [
10,
46].
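The idea of a height-dependent search window can be sketched as follows. This is a simplified illustration rather than the LMF implementation referenced above: the quadratic crown-width coefficients `a` and `b` are hypothetical placeholders (not the published regression coefficients), the CHM is a tiny hand-made grid, and the function names are assumptions.

```python
def window_width(height, a=2.0, b=0.01):
    """Hypothetical crown-width model: width = a + b * height**2 (in metres)."""
    return a + b * height ** 2


def find_treetops(chm, cell_size=1.0, min_height=2.0):
    """Variable-window local maximum filter on a CHM given as a list of rows.

    A cell is a treetop if no cell inside its height-dependent circular
    window is strictly higher. Returns (row, col, height) tuples.
    """
    rows, cols = len(chm), len(chm[0])
    tops = []
    for r in range(rows):
        for c in range(cols):
            h = chm[r][c]
            if h < min_height:
                continue  # ignore ground and low vegetation
            radius = window_width(h) / 2.0 / cell_size  # window radius in cells
            reach = int(radius)
            is_top = True
            for dr in range(-reach, reach + 1):
                for dc in range(-reach, reach + 1):
                    rr, cc = r + dr, c + dc
                    inside = 0 <= rr < rows and 0 <= cc < cols
                    if inside and dr * dr + dc * dc <= radius * radius and chm[rr][cc] > h:
                        is_top = False
            if is_top:
                tops.append((r, c, h))
    return tops


# Toy CHM: one tall tree with a sloping crown and one small isolated tree.
chm = [
    [0, 0, 0, 0, 0, 0, 0],
    [0, 5, 6, 5, 0, 0, 0],
    [0, 6, 10, 6, 0, 3, 0],
    [0, 5, 6, 5, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 0],
]
print(find_treetops(chm))
```

Taller cells receive a wider window, so crown shoulders of a large tree are suppressed while a small neighbouring tree is still detected; the coefficients of the width model are exactly the kind of parameter that can be taken from field data or optimized.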
With the development of more advanced CHM segmentation methods, specifically adapted for delineating tree crowns (e.g., [
65,
66]), additional empirical parameters have been introduced to enhance canopy segmentation accuracy. Silva et al. [
65] estimated the expected tree crown diameter as 60% of its height, based on preliminary field observations. This relationship is incorporated into their segmentation algorithm as a specific parameter, max_cr_factor, with a default value of 0.6. Silva and colleagues also suggested excluding all pixels with height values below a predetermined, fixed proportion of the maximum height of the detected tree, a coefficient termed exclusion. They proposed a default value of 0.3 for this parameter in their algorithm.
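The two rules described above can be written as a short check on a single pixel. This is a minimal sketch of the rules only, assuming that the crown diameter limit is applied as a radius around the detected treetop; the assignment of pixels to the nearest treetop and all other steps of the algorithm are omitted, and the function name is hypothetical.

```python
def silva_pixel_kept(pixel_height, distance_to_top, tree_height,
                     max_cr_factor=0.6, exclusion=0.3):
    """Apply the two empirical rules to one CHM pixel.

    distance_to_top is the horizontal distance (m) from the pixel to the
    treetop it was assigned to; heights are in metres.
    """
    max_radius = tree_height * max_cr_factor / 2.0  # diameter limit as a radius
    min_height = tree_height * exclusion            # height cut-off for the crown
    return distance_to_top <= max_radius and pixel_height >= min_height


# For a 20 m tree the defaults give a 6 m radius limit and a 6 m height cut-off.
print(silva_pixel_kept(pixel_height=10.0, distance_to_top=5.0, tree_height=20.0))
print(silva_pixel_kept(pixel_height=10.0, distance_to_top=7.0, tree_height=20.0))
print(silva_pixel_kept(pixel_height=5.0, distance_to_top=2.0, tree_height=20.0))
```

Lowering max_cr_factor shrinks the permitted crown footprint, while raising exclusion removes more of the low crown periphery, so the two parameters jointly control how tightly a crown is drawn around its treetop.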
A somewhat different situation arose with the algorithm proposed by Dalponte and Coomes [
66]. In their work describing the algorithm’s principles, the authors do not mention specific values for key parameters used to delineate crowns, only noting that these parameter values “should be defined by user”. However, to compare our results with a hypothetical benchmark, we followed Pu et al. [
45], Hastings et al. [
47], and Tatum and Wallin [
44]. We defined default values for the parameters describing the inclusion of neighboring pixels into the crown, th_seed and th_cr, as 0.45 and 0.55, respectively.
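To make the role of these two thresholds concrete, the sketch below applies them to a single candidate pixel during crown growing. It assumes the commonly documented form of the rule (a pixel joins the crown if it is higher than the treetop height multiplied by th_seed and higher than the current mean crown height multiplied by th_cr); the remaining conditions of the algorithm, such as the maximum crown distance, are left out, and the function name is hypothetical.

```python
def dalponte_accepts(pixel_height, seed_height, crown_mean_height,
                     th_seed=0.45, th_cr=0.55):
    """Return True if a neighbouring pixel may be added to the growing crown."""
    above_seed_fraction = pixel_height > seed_height * th_seed
    above_crown_fraction = pixel_height > crown_mean_height * th_cr
    return above_seed_fraction and above_crown_fraction


# A 6 m pixel next to a crown with a 20 m treetop and a 15 m mean crown height.
with_defaults = dalponte_accepts(6.0, seed_height=20.0, crown_mean_height=15.0)
# Same pixel with the Random Search values reported for this algorithm
# in this section (approximately 0.195 and 0.15).
with_optimized = dalponte_accepts(6.0, seed_height=20.0, crown_mean_height=15.0,
                                  th_seed=0.195, th_cr=0.15)
print(with_defaults, with_optimized)
```

Lower thresholds let the crown extend further down its flanks, which illustrates why a shift of a few tenths in either parameter can change the delineated crown area considerably.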
The most complex and ambiguous situation regarding parameter selection arises with watershed segmentation algorithms, where key threshold values are often not even mentioned when the algorithm is used for tree crown delineation (e.g., [
9,
39,
43]). This significantly complicates the decision-making process for determining optimal parameter values.
Where the aforementioned algorithms are applied, parameter values are most probably chosen manually by an operator. Some studies emphasize the screening of window or kernel sizes for LMF algorithms, without detailed descriptions of further parameter tuning [
33,
46,
52,
63]. Zhang et al. [
94] employed a “trial-and-error” method to determine optimal segmentation algorithm parameter configurations during testing and parameter adjustment. Conversely, utilizing machine learning models for individual tree detection and crown delineation obviates the need for such parameter value optimization owing to the way these models operate, although their decision-making process is typically not entirely transparent. From the current perspective, comparing machine learning models with “embedded” optimization solutions to simpler, user-configured segmentation algorithms appears unconvincing.
We employed the Random Search and Differential Evolution optimization algorithms to identify the optimal parameter values. For most parameters, the optimized values differed from the default reference values. For instance, optimizing the Dalponte algorithm’s parameters using Random Search yielded a configuration where the th_seed parameter was approximately 0.195 (−0.255 compared to the default value) and the th_cr parameter was approximately 0.15 (−0.4). In contrast, applying Differential Evolution resulted in values of 0.06 (−0.39) for th_seed and 0.55 for th_cr, with the latter parameter’s optimized value matching the default to three decimal places.
For the Silva segmentation algorithm, the values of the parameters max_cr_factor and exclusion showed relatively similar optimization regardless of the method used, with differences not exceeding 5%. However, compared to the default values, the optimized max_cr_factor parameter received lower values (−0.15 to −0.20), while the exclusion parameter received higher values (+0.12 to +0.15). For the Watershed and MCWS methods, assessing the correspondence between optimized and default parameters was difficult due to a lack of information on reference values, underscoring the need for active parameter selection.
This critical role of parameter optimization was further quantified by our ablation study (
Section 3.3.4). The study demonstrated that for the Dalponte, Silva, and MCWS algorithms, the Sigma parameter was particularly influential, as reverting it to its default value resulted in the most substantial F-score degradation. Similarly, for the Watershed algorithm, the tol parameter exhibited the strongest individual impact on performance.
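The logic of such an ablation can be sketched in a few lines: starting from the optimized configuration, each parameter is reverted to its default in turn and the resulting loss in F-score is recorded. Everything numeric below is hypothetical, including the toy objective and the parameter values; only the procedure is illustrated.

```python
def ablation(objective, optimized, defaults):
    """Revert one parameter at a time to its default and measure the score drop."""
    baseline = objective(optimized)
    drops = {}
    for name in optimized:
        reverted = dict(optimized)
        reverted[name] = defaults[name]   # only this parameter changes
        drops[name] = baseline - objective(reverted)
    return baseline, drops


def toy_objective(params):
    """Hypothetical score that is far more sensitive to sigma than to th_seed."""
    return 0.4 - 0.2 * (params["sigma"] - 0.5) ** 2 - 0.1 * (params["th_seed"] - 0.2) ** 2


optimized = {"sigma": 0.5, "th_seed": 0.2}   # hypothetical optimized values
defaults = {"sigma": 1.5, "th_seed": 0.45}   # hypothetical default values

baseline, drops = ablation(toy_objective, optimized, defaults)
most_influential = max(drops, key=drops.get)
print(baseline, drops, most_influential)
```

Because only one parameter is changed at a time, this procedure measures individual influence around the optimized point and does not capture interactions between parameters.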
4.2. The Effect of Parameters on the Accuracy of Tree Delineation
The determination of the actual impact of each parameter on segmentation efficiency is complicated by the fact that our chosen optimization methods operate within a multi-dimensional parameter space. In this space, each unique parameter contributes to an efficiency increase only in conjunction with other parameters. This notably explains the exceptionally wide spread of efficiency metrics observed for specific sigma values during the optimization of other key parameters (
Figure 9), and a similar situation exists for other parameters.
A somewhat ambiguous situation arises concerning the smoothing of the CHM and the sigma parameter, which determines the strength of this smoothing. Initial tests indicate that the model’s segmentation effectiveness improves with lower smoothing strength values, assuming other parameter values are set to “default” (
Figure 9). However, the results from simultaneously optimizing all parameters in a multi-dimensional space did not reveal a strict relationship between segmentation effectiveness and CHM smoothing strength, even when employing Differential Evolution methods where individual parameters contributing to increased tree canopy boundary recognition accuracy could be maintained across multiple generations. This issue likely deserves further detailed investigation.
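Why the effect of smoothing strength can be ambiguous is easier to see on a one-dimensional height profile. The sketch below is an illustration under simplifying assumptions: it smooths a hypothetical transect through two neighbouring crowns with a truncated, renormalized Gaussian kernel, whereas CHM smoothing operates on a two-dimensional raster and may use a different kernel definition.

```python
import math


def gaussian_smooth(profile, sigma):
    """Smooth a 1D height profile with a Gaussian kernel truncated at 3 sigma.

    Weights are renormalized near the edges so that the profile is not
    artificially pulled towards zero at its ends.
    """
    radius = math.ceil(3 * sigma)
    weights = [math.exp(-(d * d) / (2 * sigma * sigma)) for d in range(radius + 1)]
    smoothed = []
    for i in range(len(profile)):
        total, norm = 0.0, 0.0
        for j in range(max(0, i - radius), min(len(profile), i + radius + 1)):
            w = weights[abs(i - j)]
            total += w * profile[j]
            norm += w
        smoothed.append(total / norm)
    return smoothed


def count_peaks(profile):
    """Count interior cells that are strictly higher than both neighbours."""
    return sum(
        1
        for i in range(1, len(profile) - 1)
        if profile[i] > profile[i - 1] and profile[i] > profile[i + 1]
    )


# Hypothetical transect (heights in metres) crossing two adjacent crowns.
profile = [0.0, 0.0, 8.0, 6.0, 8.0, 0.0, 0.0]
for sigma in (0.5, 2.0):
    print(sigma, count_peaks(gaussian_smooth(profile, sigma)))
```

Weak smoothing preserves both treetops, while strong smoothing merges them into one, so the same operation that suppresses spurious maxima within a crown can also erase real neighbouring trees; which effect dominates depends on the other parameters and on stand structure.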
In other similar studies, smoothing of the CHM is frequently employed [
11,
65] to fill in “empty” pixels [
32], enhance the accuracy of tree delineation in broadleaf forest conditions [
45], and address crown oversegmentation [
75]. Fujimoto et al. [
67] also highlight the potential use of a Gaussian filter for smoothing Digital Terrain Models. Conversely, Hastings et al. [
47] point out the experimentally identified negative effects of such smoothing, which can diminish the effectiveness of crown delineation. Zhen et al. [
20] identify “extraction, interpolation, and smoothing procedures” as drawbacks of raster-based methods that may lead to information loss or other potential errors, a concern also raised by Silva et al. [
65], who utilized both smoothed and unsmoothed versions of the CHM.
Concurrently, our findings demonstrate that even the application of a straightforward segmentation algorithm, such as the Watershed method, can yield competitive results compared to more complex ones. We hypothesize that our “optimized” parameter value sets may represent local extrema or heuristic solutions. This could explain, for instance, why the optimal values of key parameters, like the coefficients of the regression equation for treetops, obtained by applying different optimization methods may differ significantly within the same segmentation method (
Figure 10). Therefore, we believe that finding the best solution requires multiple runs of the optimization procedure or a larger sample size.
Our findings indicate that excessive generalization and the use of default parameter values, including those derived from empirical data, prevent segmentation algorithms from achieving maximum efficiency. It is important to emphasize that the optimized regression coefficient values do not correspond to those estimated from field observations (
Figure 10), suggesting that empirical data are not fully suitable for describing and analyzing digital models of the corresponding plant communities.
The specific optimized parameter sets derived from the Lobachevsky State University campus are intrinsically data-driven and thus highly tailored to the unique characteristics of our UAV-derived CHMs and local vegetation structure. This data-specific optimization, especially given the absence of explicit mathematical regularization terms in our chosen Random Search and Differential Evolution methods, carries a risk of overfitting to the training data. Therefore, direct transferability of these precise parameter values to other geographical locations, different sensor types, or diverse remote sensing datasets should not be assumed without empirical re-validation. Instead, our study strongly advocates for the application of similar optimization procedures in any subsequent research or practical applications to ensure the highest attainable accuracy and reliability across varied environments and data types. This approach acknowledges the inherent variability in urban forest characteristics and sensor data, promoting a methodologically sound path to robust crown delineation.
In practical applications for broader deployment, this strategy implies that optimization can be performed on a carefully selected subset of the surveyed area where ground truth measurements are available. The parameters optimized on this representative subset can then be reliably applied across larger, ecologically homogeneous local territories or ‘unseen’ landscapes with similar characteristics, thereby extending the utility of the method beyond areas with exhaustive ground truth data.
4.3. An Impartial and Objective Selection of the Best Algorithm for Individual Tree Crown Detection and Delineation
Several studies, including those by Minařík et al. [
52], Pu et al. [
45], and You et al. [
43] have compared different CHM segmentation methods to identify the most appropriate and effective approach for tree detection. These studies indicate that raster-based and point cloud-based approaches have distinct advantages and disadvantages, performing with varying effectiveness under different scenarios. However, an objective comparison of methods necessitates an unbiased selection of parameter values, which can be achieved through a user-independent search for optimal parameter values that may deviate from default settings.
Hastings et al. [
47] conducted an iterative search for optimal parameter configurations for the Watershed, MCWS, Dalponte, and Silva algorithms. Similar to our approach, they optimized parameters within the R environment and utilized a comparable set of parameters, ranges, and standard performance metrics. Using default parameters, the F-scores reported by Hastings et al. [
47] were 0.08 for Watershed, 0.49 for MCWS, 0.46 for Dalponte, and 0.48 for Silva, which substantially exceeded the performance of these algorithms with default parameters applied to the campus area in our research. The improvement in overall accuracy after parameter tuning was substantial only for the Watershed method (+43%), while the gains for other algorithms ranged from 1% (Silva) to 6% (MCWS). The variation in overall accuracy among the four methods was 5% (0.48–0.53). It is worth noting that Hastings et al. used LiDAR point clouds from NASA’s Goddard LiDAR, Hyperspectral and Thermal remote sensing datasets for implementing segmentation techniques, whereas in our work we used photogrammetric point clouds derived after processing UAV images. Thus, their CHM has a spatial resolution of 1 m, while our CHM has 10 cm. Nevertheless, we strongly believe that this fact does not affect our main findings that follow from this comparison.
Comparing the parameter values obtained by Hastings et al. [
47] with our optimization results, we find that the optimal threshold for minimum height for the MCWS algorithm falls within the range of 3.7 to 8.3. However, our findings suggest that the minimum height for treetop detection (hmin.tt) should be set slightly higher than that for crown detection (hmin.cr), whereas Hastings et al. reported that these values could be nearly identical. For the Watershed, Dalponte, and Silva algorithms, the final values for the corresponding parameters show differences between our results and those of Hastings et al. Furthermore, the optimal parameter values obtained by Hastings et al. varied across different study plots within a single area, underscoring the limitations of generalizing parameter values during extrapolation and highlighting the importance of tuning parameters for each unique tree stand.
It is worth noting that canopy segmentation using different methods requires varying amounts of computational time (
Table 9 and 
Table 11), which may serve as a decisive argument in favor of one segmentation method over another. Evidently, the differences in computational speed among various algorithms are attributable to inherently different approaches to image segmentation and non-identical composition of constituent variables. However, the overall duration and speed of optimization should also be considered crucial factors. The MCWS, Dalponte, and Silva algorithms demonstrate a sharp and significant increase in efficiency after the first dozen iterations during the optimization procedure (
Figure 7), after which segmentation effectiveness plateaus with minor improvements throughout the remaining iterations. This observation places the three algorithms on equal footing in terms of crown delineation accuracy. However, the Dalponte method required more time per iteration than the MCWS and Silva methods (
Table 9 and 
Table 11), which may limit its applicability and requires more detailed study. The Watershed algorithm, on the contrary, requires the least computational time among the presented methods but has a relatively slow optimization rate. This might be a consequence of its “versatility” in addressing image segmentation tasks, where crown delineation represents only a specific application.
While our study demonstrates the profound impact of parameter optimization on raster-based segmentation algorithms, it is important to acknowledge that the overall accuracy of individual tree crown delineation can be further enhanced through improvements across the entire processing pipeline. This includes exploring more advanced preliminary data processing techniques for robust vegetation masking, such as object-based image analysis or deep learning-based semantic segmentation, which could provide cleaner inputs to the CHM. Furthermore, the application of sophisticated machine learning and deep learning models for direct canopy detection and instance segmentation, as opposed to traditional raster-based CHM methods, offers another promising direction for future research, potentially circumventing some of the inherent limitations of traditional approaches. Future work could systematically compare the performance gains from such innovations against the benefits of parameter optimization for traditional algorithms, providing a comprehensive understanding of the most effective strategies for urban forest inventory.