2.1. Dataset Introduction
The dataset was collected in mid-April 2025 with standard smartphones as the imaging equipment. The collection site was a single orchard in the core apple cultivation area of Tianshui City, Gansu Province, China, and all fruit trees were planted under similar management conditions.
To support yield prediction at the flowering stage, image samples with varying flower quantities and distribution states are needed. Images were therefore captured at different times of day (morning, noon, evening) under both front-lit and backlit conditions, from several shooting angles (frontal, 30° top-down, 30° bottom-up), and against varied background scenes (sky, buildings, distant fruit trees, etc.). This diversity is intended to improve model generalization and to provide representative flowering-stage samples for subsequent spatial distribution modeling and yield prediction. A total of 141 high-quality original images were obtained during collection. After screening, 100 images were retained, each covering the blooming state of one entire tree from 100 different sample apple trees, and all images were resized to 3072 × 4096 pixels.
To support the use of apple flowering-period images in thinning simulation and yield prediction, this study applied a sliding-window cropping strategy with a step size of 512 pixels to divide each original image into multiple 1024 × 1024 pixel sub-images. The resulting 4800 sample images were labeled with Labelme and split into training and testing sets at an 8:2 ratio. After 100 epochs of training, DeepLabv3+ achieved a mean Intersection over Union (mIoU) of 92.92%, and a mean Precision (mPrecision), mean Recall (mRecall), and mean F1 score (mF1) of 96.64%, 95.86%, and 96.24%, respectively. All experiments were conducted in Python (version 3.8) using PyTorch (version 2.4) with CUDA 12.4 acceleration, and the code was developed in PyCharm (version 2024). The trained DeepLabv3+ model was used to separate the target flowers from the background and to create the corresponding flower mask images as dataset samples. The content of the image dataset is shown in Figure 1.
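The cropping step can be sketched as follows. This is a minimal illustration rather than the authors' implementation: the function name `sliding_window_tiles` and the `pad_edges` option are hypothetical, and zero-padding of edge windows is an assumption. It is noted only because starting a window at every multiple of 512 gives 6 × 8 = 48 tiles per 3072 × 4096 image, which would be consistent with 4800 tiles from 100 images.

```python
import numpy as np

def sliding_window_tiles(image, tile=1024, step=512, pad_edges=True):
    """Cut an image into tile x tile windows taken every `step` pixels.

    pad_edges=True (an assumption, not stated in the text) starts a window at
    every multiple of `step` and zero-pads windows that overrun the border.
    pad_edges=False keeps only windows that fit entirely inside the image.
    """
    h, w = image.shape[:2]
    if pad_edges:
        ys = list(range(0, h, step))
        xs = list(range(0, w, step))
        pad_h = max(0, ys[-1] + tile - h)
        pad_w = max(0, xs[-1] + tile - w)
        pad_spec = [(0, pad_h), (0, pad_w)] + [(0, 0)] * (image.ndim - 2)
        image = np.pad(image, pad_spec, mode="constant")
    else:
        ys = list(range(0, h - tile + 1, step))
        xs = list(range(0, w - tile + 1, step))
    # Each entry is ((top, left), tile_array); tiles are views, not copies.
    return [((y, x), image[y:y + tile, x:x + tile]) for y in ys for x in xs]

# Blank stand-in for one 3072 x 4096 image.
demo_image = np.zeros((3072, 4096), dtype=np.uint8)
padded_tiles = sliding_window_tiles(demo_image, pad_edges=True)
inner_tiles = sliding_window_tiles(demo_image, pad_edges=False)
print(len(padded_tiles), len(inner_tiles))
```

With a 50% overlap between neighboring windows, a flower cut by one tile border appears whole in an adjacent tile, which is the usual motivation for choosing a step of half the tile size.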
In this study, 100 apple trees sampled during the flowering period served as the research objects. After image acquisition, experienced fruit farmers carried out routine flower thinning and fruit thinning on the trees and harvested the apples by hand at fruit maturity. Electronic weighing devices (SENSSUN, Zhongshan, Guangdong, China) were used to weigh and record the yield of apples harvested from each of the 100 trees (apples that fell naturally during growth were excluded). The yield of each tree is recorded in kilograms; the sample yields range from 4.6 to 35.9 kg, with an average of 18.45 kg. This sampling method ensures that the yield distribution is reasonably representative, which facilitates establishing a mapping between flowering characteristics and final yield. The yield distribution is shown in Figure 2.
2.2. Flower Cluster Recognition and Clustering Methods
After sample collection, the flowering-period mask images only separate flowers from the complex background, and the raw feature information they provide (total number of flowers, overall area) cannot express spatial distribution or the size of cluster structures. Further spatial structure analysis of the flowers is therefore needed to extract more representative phenotypic features of the flowering period, providing a reliable data basis for subsequent thinning simulation and yield prediction. The method takes the flower segmentation results as input and builds a complete analysis framework from flowering-period images to yield estimation in four main steps: flower cluster clustering, thinning simulation, feature construction, and yield prediction. The overall technical roadmap is shown in Figure 3.
First, the flower mask image obtained by DeepLabv3+ is used as input to extract the spatial coordinates of flower pixels, and the flower cluster structure is identified with a density clustering algorithm. Because apple flower clusters differ significantly in scale and density, traditional clustering algorithms with fixed parameters adapt poorly to the complex distribution of flowers during the flowering period. Therefore, this paper introduces an adaptive scale adjustment mechanism and a KDTree acceleration structure into the DBSCAN algorithm, constructing an adaptive multi-scale density clustering algorithm (AMS-DBSCAN) that effectively identifies both single flowers and flower cluster structures.
After obtaining the flower cluster structure, this article draws on the experience of manual flower thinning in orchards to design two thinning simulation strategies. The first is a dynamic retention strategy based on flower cluster density, which simulates the thinning principle of “thin where dense” by moderately reducing high-density flower clusters. The second is a uniform thinning strategy for single flowers, which controls the spacing between flowers so that the retained flowers are distributed more evenly in space. Simulating these two strategies generates multidimensional features such as the number of flower clusters after thinning, the retained area, and spatial uniformity.
Next, the statistical features of the original flower clusters are fused with the features generated by the two thinning strategies to construct a multidimensional phenotype feature set for flowering images. The Lasso method is used to screen the features, reducing redundant information and improving the model’s generalization ability. Finally, the selected features are input into multiple machine learning models (such as XGBoost, BPNN, and SVR) for training and comparison, establishing a mapping between flowering image features and actual fruit yield at maturity and achieving early prediction of apple yield.
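The Lasso screening step could look roughly like the sketch below. It assumes scikit-learn's `LassoCV` with standardized inputs; the helper name `lasso_screen`, the feature names, and the synthetic data are all hypothetical stand-ins, since the actual feature set and regularization settings are not given here.

```python
import numpy as np
from sklearn.linear_model import LassoCV
from sklearn.preprocessing import StandardScaler

def lasso_screen(X, y, feature_names, cv=5, random_state=0):
    """Keep the features whose Lasso coefficient is non-zero.

    The regularization strength is chosen by cross-validation (an assumption;
    the text only states that Lasso is used for screening).
    """
    X_scaled = StandardScaler().fit_transform(X)
    model = LassoCV(cv=cv, random_state=random_state).fit(X_scaled, y)
    keep = np.flatnonzero(np.abs(model.coef_) > 1e-8)
    return [feature_names[i] for i in keep], model

# Synthetic stand-in: 100 trees, 8 candidate features, 2 of them informative.
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(100, 8))
y_demo = 3.0 * X_demo[:, 0] - 2.0 * X_demo[:, 3] + rng.normal(scale=0.1, size=100)
names = [f"feat_{i}" for i in range(8)]
selected, fitted = lasso_screen(X_demo, y_demo, names)
print(selected)
```

Standardizing before the fit matters because the L1 penalty acts on coefficient magnitude, so features measured in large units (for example pixel areas) would otherwise be penalized differently from counts or ratios.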
2.3. Improved Clustering Algorithm AMS-DBSCAN
2.3.1. AMS-DBSCAN
In this study, because of the irregular shooting perspectives and the irregularity of the target objects themselves, the flower clusters in the raw images differed significantly in size, shape, and density distribution. The traditional DBSCAN algorithm uses a fixed neighborhood radius eps, which makes it difficult to identify both sparse independent flowers and dense large flower clusters, and can easily lead to single flowers being misjudged as noise. In addition, the traditional algorithm performs a neighborhood query for every data point, giving a complexity of $O(n^2)$ and low processing efficiency in large-scale image scenes [15]. In response to these issues, this study proposes an adaptive multi-scale DBSCAN clustering method (AMS-DBSCAN); the implementation flowchart of the method is shown in Figure 4.
While retaining the core of the DBSCAN algorithm, this framework introduces an area-adaptive adjustment mechanism and a KDTree acceleration strategy to improve the adaptability and computational efficiency of clustering on flowering mask images, providing a reliable flower cluster structure for the subsequent thinning simulation. The algorithm first clusters with the neighborhood radius EPS fixed at 10 to obtain the area and number of flower clusters. Because isolated flowers and dense flower clusters differ significantly in spatial distribution, single-flower-level clustering and cluster-level clustering are distinguished by the EPS value used. Where flowers are densely distributed, a smaller EPS is required to prevent adjacent flowers from being merged into one cluster. Conversely, at the flower cluster level the EPS can be set to a larger value, so that small variations in EPS do not leave parts of a flower cluster unmerged. The proposed method therefore introduces a conditional branching mechanism that dynamically adjusts the clustering parameters according to the estimated cluster features. This adaptive strategy improves clustering stability across scenarios and provides more reliable inputs for subsequent processing.
2.3.2. Definition of Feature Space
To identify spatially clustered flower clusters in flowering images, the AMS-DBSCAN algorithm performs clustering in a two-dimensional pixel coordinate space. The flower region mask obtained by semantic segmentation is denoted $M$, and its non-zero pixels represent all flower regions. For each input image, the spatial coordinates of all flower pixels are first extracted as follows:

$$P = \{(x_i, y_i) \mid M(y_i, x_i) \neq 0,\ i = 1, 2, \ldots, N\}$$

where $N$ is the total number of flower pixels, and $x_i$ and $y_i$ are the column and row coordinates of the $i$-th flower pixel in the image, respectively. After extracting the pixel coordinates of all flowers, a two-dimensional feature matrix is obtained:

$$X = \begin{bmatrix} x_1 & y_1 \\ x_2 & y_2 \\ \vdots & \vdots \\ x_N & y_N \end{bmatrix} \in \mathbb{R}^{N \times 2}$$

This matrix constitutes the feature space of AMS-DBSCAN clustering. Clustering adopts the Euclidean distance metric and uses spatial proximity as the criterion for merging points into clusters. For two flower pixels $p_i = (x_i, y_i)$ and $p_j = (x_j, y_j)$, the distance is:

$$d(p_i, p_j) = \sqrt{(x_i - x_j)^2 + (y_i - y_j)^2}$$
Because flowers in their natural state show an obvious clustered distribution, pixels that are adjacent on the image plane also correspond to adjacent positions in real space. Clustering on two-dimensional pixel coordinates can therefore describe the spatial organization of flower clusters effectively, directly reflects the spatial distribution of flowers in the image, and has good physical consistency and interpretability. In addition, distance calculation in two-dimensional Euclidean space is simple and combines efficiently with the KDTree index structure, significantly improving the algorithm’s running efficiency while preserving clustering accuracy.
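The feature-space construction above can be illustrated with a short sketch that turns a binary mask into (column, row) coordinates and clusters them with scikit-learn's `DBSCAN` using a KD-tree index. The function name `cluster_flower_pixels` is hypothetical, `eps=10` follows the fixed initial radius mentioned earlier, and `min_samples=5` is an assumed value that the text does not specify.

```python
import numpy as np
from sklearn.cluster import DBSCAN

def cluster_flower_pixels(mask, eps=10.0, min_samples=5):
    """Cluster the non-zero pixels of a mask in (x, y) pixel space.

    Returns the N x 2 coordinate matrix and one label per pixel
    (-1 marks points that DBSCAN treats as noise).
    """
    rows, cols = np.nonzero(mask)                  # row = y, column = x
    coords = np.column_stack([cols, rows]).astype(float)
    if len(coords) == 0:
        return coords, np.empty(0, dtype=int)
    labels = DBSCAN(
        eps=eps,
        min_samples=min_samples,                   # assumed value
        metric="euclidean",
        algorithm="kd_tree",
    ).fit_predict(coords)
    return coords, labels

# Toy mask with two well-separated square "flower" regions.
toy_mask = np.zeros((100, 100), dtype=np.uint8)
toy_mask[10:20, 10:20] = 1
toy_mask[60:75, 60:75] = 1
toy_coords, toy_labels = cluster_flower_pixels(toy_mask)
n_clusters = len(set(toy_labels.tolist()) - {-1})
print(toy_coords.shape, n_clusters)
```

Note that `np.nonzero` returns row indices first, so the columns are swapped when stacking to keep the (x, y) ordering used in the text.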
2.3.3. Double Layer Adaptive EPS Regulation Mechanism
The fixed EPS parameters in traditional DBSCAN are difficult to adapt to image regions of different sizes. This study proposes an adaptive neighborhood radius adjustment mechanism. The core idea of this mechanism is that the area of the clustering region is positively correlated with the required neighborhood radius, that is, the larger the region area, the larger the required EPS value, in order to avoid misclassifying a large cluster into multiple small clusters.
- (1)
Macro-clustering: This level aggregates spatially adjacent flower pixels into individual flower clusters. To handle the significant differences in flower cluster size between images, this study designed a dynamic adaptive EPS mechanism for this level: as the area of a flower cluster increases, the neighborhood radius increases accordingly, so that a large cluster is not split into multiple small clusters. The adaptive EPS is computed from five quantities: the total area of the flower region currently being clustered; an area threshold for flower clusters, determined by analyzing the area distribution of the flower cluster samples; an area ratio coefficient; a base radius; and an adjustment coefficient that controls how strongly area influences the EPS. The resulting EPS is finally constrained to a preset interval to keep the algorithm’s parameters stable. A logarithmic term in the function suppresses the parameter for extremely large flower clusters, so that the EPS increases slowly with area and parameter explosion is prevented. The range of parameter values is shown in Table 1.
- (2)
Micro-clustering: After the region of each flower cluster has been identified, fine-grained clustering is performed within the cluster to separate the individual flowers it contains. The EPS at this level is also fine-tuned according to the total area of the flower cluster, but its range of variation is much smaller than at the flower cluster level. This ensures the sensitivity and consistency of the algorithm in handling local details.
By combining flower clusters with microscopic single flowers in a two-level design, the AMS-DBSCAN algorithm can effectively solve the problem of clustering failure in traditional algorithms in scenarios with large scale differences. It retains the overall shape of flower clusters while capturing individual flowers, providing diverse and comprehensive feature information for subsequent yield prediction research.
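To make the area-dependent radius concrete, the sketch below shows one possible functional form. It is an assumption, not the paper's formula: the expression `eps_base * (1 + alpha * ln(1 + area / area_threshold))`, the function name `adaptive_eps`, and every numeric value are hypothetical. It is chosen only because it matches the stated properties, namely that the radius grows with area, grows logarithmically, and is clipped to a preset interval.

```python
import math

def adaptive_eps(area, area_threshold, eps_base, alpha, eps_min, eps_max):
    """Hypothetical area-adaptive neighborhood radius.

    area            total area (in pixels) of the region being clustered
    area_threshold  reference cluster area used to normalize `area`
    eps_base        base radius
    alpha           strength of the area influence
    eps_min/eps_max preset interval that bounds the result
    """
    ratio = area / area_threshold                 # area ratio
    eps = eps_base * (1.0 + alpha * math.log1p(ratio))
    return min(max(eps, eps_min), eps_max)

# Macro level: wide interval, stronger area influence (values are made up).
macro_small = adaptive_eps(2_000, 5_000, 10.0, 0.3, 8.0, 25.0)
macro_large = adaptive_eps(50_000, 5_000, 10.0, 0.3, 8.0, 25.0)
macro_huge = adaptive_eps(1e9, 5_000, 10.0, 0.3, 8.0, 25.0)

# Micro level: same form, but a much narrower interval and weaker influence.
micro_small = adaptive_eps(2_000, 5_000, 3.0, 0.05, 2.5, 4.0)
micro_large = adaptive_eps(50_000, 5_000, 3.0, 0.05, 2.5, 4.0)
print(macro_small, macro_large, macro_huge, micro_small, micro_large)
```

Using the same function with a tighter interval at the micro level reflects the statement that the single-flower radius varies far less than the cluster-level radius.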
2.3.4. Double Level Clustering Architecture Based on KDTree Acceleration
KDTree is a spatial partitioning structure based on a binary tree that recursively divides the dataset into hyper-rectangular regions along different dimensions, thereby achieving efficient indexing of point sets [16]. During a neighborhood search, KDTree quickly eliminates regions that cannot contain neighbors and continues searching only in nodes that intersect the query radius. Compared with the $O(n)$ per-query complexity of brute-force search, a KDTree neighborhood query requires only $O(\log n)$ time on average. Using this structure reduces the overall time complexity of DBSCAN to $O(n \log n)$. Therefore, this article introduces the KDTree data structure into the clustering algorithm and builds it on the two-dimensional coordinate set of flower pixels to achieve fast retrieval of adjacent pixels within a specified radius.
In this study, the flower pixels are first clustered over the entire image to identify adjacent flower clusters. The amount of data at this stage is very large, and traditional neighborhood search would be computationally expensive. KDTree completes large-scale neighbor retrieval quickly and supports dynamic adjustment of the adaptive EPS for clusters of different areas, thereby ensuring clustering efficiency. After the rough distribution of flower clusters has been obtained through coarse clustering, fine-grained single-flower clustering is performed. Although the point sets at this stage are relatively small, single-flower clustering places higher demands on the accuracy and sensitivity of the neighborhood search, and the efficient indexing of KDTree still improves running speed while maintaining high-resolution recognition, so the algorithm keeps stable efficiency in batch image processing. The average running time of the KDTree-accelerated AMS-DBSCAN algorithm is only about 30–40% of that of traditional DBSCAN, and the advantage becomes more apparent as the data volume increases. Combined with the two-level adaptive parameter mechanism, this optimization gives the algorithm stable clustering accuracy and meets the real-time and scalability requirements of large-scale image processing.
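The radius query that DBSCAN repeats for every point can be sketched with SciPy's `cKDTree`, checked against a brute-force scan. This only illustrates the index structure and is not the authors' code; the point set is random synthetic data, and the radius of 10 echoes the initial EPS mentioned earlier.

```python
import numpy as np
from scipy.spatial import cKDTree

rng = np.random.default_rng(0)
points = rng.uniform(0.0, 100.0, size=(500, 2))   # synthetic pixel coordinates
radius = 10.0

# Build the index once, then answer one radius query per point.
tree = cKDTree(points)
tree_neighbors = tree.query_ball_point(points, r=radius)

def brute_force_neighbors(pts, i, r):
    """Scan every point: O(n) work per query."""
    dist = np.linalg.norm(pts - pts[i], axis=1)
    return np.flatnonzero(dist <= r)

# Both approaches must return the same neighbor sets.
same = all(
    sorted(tree_neighbors[i]) == brute_force_neighbors(points, i, radius).tolist()
    for i in range(len(points))
)
print(same)
```

The tree is built once per point set, so in a two-level scheme a new tree would be needed for each cluster's point subset at the micro level; those subsets are small, which keeps the rebuild cost low.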
2.4. Flower Thinning Strategy
From a single flowering-period image, some information about the physiological status of flowers and the bearing capacity of branches cannot be obtained directly, so vision-based methods cannot fully reproduce the experience of manual thinning. However, the spatial distribution, local density, and the morphology and area of individual flowers within flower clusters are readily observable and reflect, to some extent, the spatial competition between flowers.
Based on this observable information, the two-level clustering of the AMS-DBSCAN algorithm yields the flower cluster distribution structure and the single-flower segmentation results for the apple tree flowering period. Combining the flower cluster results with the characteristics of individual flowers, two thinning simulation strategies were designed from the perspectives of flower cluster density control and single-flower quantity control, to approximate at the visual level some of the empirical principles of manual thinning. These thinning simulation methods are simulations only and do not replace thinning operations in real orchards.
- (1)
Dynamic retention strategy for flower cluster density: Flower cluster density reflects the degree of aggregation of flowers in local space, and overly dense flower clusters lead to nutrient competition among the subsequent fruits and reduce fruit quality. Therefore, this article designs a density-constrained retention function for thinning simulation, based on the visually observable area and spatial density of flower clusters, to approximate the empirical principle of “thin where dense” in manual thinning. The retention rate combines four factors: an area factor, which assigns one of several retention ratios according to the area of the flower cluster (the larger the area, the lower the retention ratio); a count factor based on the number of single flowers in the current flower cluster (the more single flowers, the lower the retention rate); a shape factor, which evaluates regularity through the roundness of the cluster, so that clusters with regular shapes receive a moderately higher retention rate; and a compactness factor, which is negatively correlated with cluster compactness, so that highly compact clusters receive a slightly lower retention rate. A clip operation constrains the retention rate to a bounded range. The range of parameter values is shown in Table 2.
The implementation process of the dynamic retention strategy for flower cluster density is shown in Figure 5. The algorithm first traverses the area of each flower cluster based on the clustering results. Clusters with very small areas are assigned a retention rate of 0 and removed directly as noise. The retention function is then used to obtain the retention ratio of each remaining cluster, with priority given to retaining pixels near the center of the cluster, forming a representative spatial structure of flower density.
- (2)
Uniform thinning strategy for single flower space: After the preliminary parameters and data of the flower cluster have been obtained, the second strategy selects specific flowers to retain, controlling the number of flowers while keeping their distribution within the cluster more uniform and avoiding local overcrowding or a bias toward the cluster edges. The thinning process is shown in Figure 6.
Before thinning within clusters begins, the distances between clusters are filtered, because clusters whose separation is only slightly greater than the neighborhood radius would otherwise cause local overcrowding. If the distance between the centroids of two clusters is less than the threshold, only the flower cluster with the larger area is retained, which avoids redundancy caused by overly fine cluster segmentation and makes the thinning strategy more stable. Under a target quantity T (positively correlated with cluster size), the flowers to retain are selected by visiting each flower in the cluster in turn and computing its basic score and spatial score. The basic score is related to local features such as flower area and roundness; the spatial score balances the two requirements of “center first” and “uniform coverage”. Within one flower cluster, the single-flower sub-clusters are traversed sequentially. Initially, the sub-cluster located at the center of the large cluster is prioritized. In subsequent selections, the average distance to the already selected sub-clusters is considered together with the distance from the cluster center, so that the selected sub-clusters are spatially dispersed rather than concentrated in a local area.
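The two strategies can be sketched together as below. Everything numeric here is an assumption made for illustration: the additive form of `retention_ratio`, its tier boundaries and weights, the `min_area` noise cutoff, and the way `select_flowers` mixes the "center first" and "uniform coverage" terms through `w_center` are hypothetical and do not reproduce the paper's functions or the values in Table 2.

```python
import numpy as np

def retention_ratio(area, n_flowers, roundness, compactness,
                    min_area=50, r_min=0.3, r_max=0.9):
    """Hypothetical retention ratio for one flower cluster."""
    if area < min_area:
        return 0.0                      # tiny clusters are dropped as noise
    # Tiered base ratio: the larger the area, the lower the ratio.
    base = 0.9 if area < 2_000 else 0.7 if area < 10_000 else 0.5
    ratio = (base
             - 0.02 * n_flowers         # more single flowers -> lower
             + 0.10 * roundness         # regular shape -> slightly higher
             - 0.05 * compactness)      # high compactness -> slightly lower
    return float(np.clip(ratio, r_min, r_max))

def select_flowers(centroids, base_scores, target, w_center=0.5):
    """Greedy pick of `target` flowers: central first, then spread out."""
    centroids = np.asarray(centroids, dtype=float)
    d_center = np.linalg.norm(centroids - centroids.mean(axis=0), axis=1)
    scale = d_center.max() if d_center.max() > 0 else 1.0
    chosen, remaining = [], list(range(len(centroids)))
    while remaining and len(chosen) < target:
        best, best_score = None, -np.inf
        for i in remaining:
            center_term = 1.0 - d_center[i] / scale
            if chosen:
                # Mean distance to the flowers already kept (uniform coverage).
                spread = np.linalg.norm(
                    centroids[chosen] - centroids[i], axis=1).mean() / scale
                spatial = w_center * center_term + (1.0 - w_center) * spread
            else:
                spatial = center_term   # first pick: center first
            score = base_scores[i] + spatial
            if score > best_score:      # strict '>' keeps the earliest on ties
                best, best_score = i, score
        chosen.append(best)
        remaining.remove(best)
    return chosen

demo_centroids = [(0, 0), (1, 1), (1, -1), (-1, 1), (-1, -1)]
kept = select_flowers(demo_centroids, base_scores=[0.5] * 5, target=3)
print(retention_ratio(1_000, 3, 0.8, 0.5), kept)
```

In the toy example the central flower is picked first, and the third pick lands on the corner opposite the second one, which is the dispersing behavior the spatial score is meant to produce.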
2.5. Production Forecasting Model
Because the mapping between flower spatial distribution structure, quantitative statistical characteristics, and final yield is highly complex, a single model can hardly capture the statistical relationship between multi-scale spatial characteristics and yield response. Stepwise multiple regression [17] (SMR) is chosen to establish a linear model; it gradually screens out the main influencing factors when the number of features is large and constructs a concise prediction equation. Decision-tree-based algorithms are also introduced, including Random Forest [18], XGBoost [19], and LightGBM [20]. Random Forest combines multiple decision trees to improve model robustness, XGBoost improves generalization through iterative optimization and regularization, and LightGBM uses an efficient histogram-based splitting method to shorten training time significantly while maintaining accuracy. SVR [21] is based on statistical learning theory and uses kernel functions to map data into a high-dimensional space to capture complex nonlinear relationships. Although its computational cost is high, it is suitable for prediction problems with limited sample sizes and high feature dimensions. MLP [22] and BPNN [23] are included as neural network models, which automatically learn latent relationships from the input features and fit complex nonlinear mappings well. The model selection in this study therefore spans linear and nonlinear models, statistical regression and ensemble learning, and shallow methods and deep networks, aiming to reveal the relationship between flowering phenotype characteristics and yield from different perspectives.
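A comparison of this kind can be sketched with scikit-learn's cross-validation utilities. The example is an assumption-laden illustration: the data are synthetic, the hyperparameters are arbitrary, and only three model families are shown. XGBoost and LightGBM ship scikit-learn-compatible regressors (`XGBRegressor`, `LGBMRegressor`) that could be added to the same dictionary if those packages are installed.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import KFold, cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVR

def compare_models(X, y, n_splits=5, seed=0):
    """Return the mean cross-validated R^2 of a few candidate regressors."""
    models = {
        "linear": LinearRegression(),
        "random_forest": RandomForestRegressor(n_estimators=100,
                                               random_state=seed),
        # SVR is scale-sensitive, so it is wrapped with a scaler.
        "svr_rbf": make_pipeline(StandardScaler(), SVR(kernel="rbf", C=10.0)),
    }
    cv = KFold(n_splits=n_splits, shuffle=True, random_state=seed)
    return {
        name: float(cross_val_score(model, X, y, cv=cv, scoring="r2").mean())
        for name, model in models.items()
    }

# Synthetic stand-in for 100 trees with 5 selected features.
rng = np.random.default_rng(0)
X_feat = rng.normal(size=(100, 5))
y_yield = X_feat @ np.array([4.0, -2.0, 1.0, 0.0, 0.5]) + rng.normal(scale=0.2, size=100)
scores = compare_models(X_feat, y_yield)
print(scores)
```

With only 100 trees, using the same shuffled folds for every model keeps the comparison fair, since differences between folds can otherwise be as large as differences between models.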
2.6. Evaluation Metric
For a comprehensive performance evaluation of the yield prediction models, four metrics are employed in this paper: coefficient of determination ($R^2$), root-mean-square error (RMSE), mean absolute error (MAE), and relative error (RE). The coefficient of determination ($R^2$) quantifies the goodness of fit between the model predictions and the ground-truth values. The root-mean-square error (RMSE) reflects the overall deviation between predicted values and ground-truth values. The mean absolute error (MAE) represents the average absolute difference between predicted values and ground-truth values. The relative error (RE) measures the proportional deviation of predicted values relative to the ground-truth values. Their calculation formulas are given in Equations (8)–(11).
Among them, $y_i$ represents the true value of the $i$-th sample, $\hat{y}_i$ indicates the predicted result, $\bar{y}$ denotes the sample mean, and $n$ is the total number of samples.
$R^2$, RMSE, and MAE are widely used in crop yield prediction research [24]. RE is introduced to normalize the prediction error relative to the actual yield value, allowing for a more intuitive comparison across different yield ranges. These indicators are used to comprehensively evaluate the accuracy and reliability of the model.
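The four metrics can be computed as in the sketch below. $R^2$, RMSE, and MAE follow their standard definitions. For RE, the sketch assumes the mean of the per-sample absolute error divided by the true value, expressed as a percentage; that choice is an assumption, and a per-sample RE without averaging would be an equally plausible reading. The function name and the yield values are made up for illustration.

```python
import numpy as np

def regression_metrics(y_true, y_pred):
    """Compute R^2, RMSE, MAE and a mean relative error (in percent)."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    err = y_true - y_pred
    ss_res = np.sum(err ** 2)                        # residual sum of squares
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)   # total sum of squares
    return {
        "R2": 1.0 - ss_res / ss_tot,
        "RMSE": float(np.sqrt(np.mean(err ** 2))),
        "MAE": float(np.mean(np.abs(err))),
        # Assumed definition: average of |error| / true value, as a percentage.
        "RE_percent": float(np.mean(np.abs(err) / np.abs(y_true)) * 100.0),
    }

# Made-up yields in kilograms for three trees.
metrics = regression_metrics([10.0, 20.0, 30.0], [12.0, 18.0, 33.0])
print(metrics)
```

Because RE divides by the true yield, the same 2 kg error weighs twice as much on a 10 kg tree as on a 20 kg tree, which is what makes it comparable across the 4.6–35.9 kg yield range.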