1. Introduction
Sunflowers rank among the world’s most significant economic crops [1], finding extensive application in vegetable oil processing and seed consumption. Their seeds are rich in unsaturated fatty acids and protein and form an essential component of balanced, healthy diets [2]. According to statistics from the Food and Agriculture Organisation of the United Nations, sunflower oil has become one of the world’s primary vegetable oil sources [3]. Current global oil consumption totals approximately 85 million tonnes, with vegetable oils accounting for 75% of this figure, among which sunflower oil holds a leading position [4]. As consumer demands for high oil content and seed quality continue to rise, the stability of sunflower’s commercial traits is receiving increasing attention. In particular, phenotypic factors such as flower head size and marginal thickness increasingly influence marketability and economic value, and are becoming key indicators for assessing ‘consistency of marketability’ and ‘oil stability’. Consequently, sunflower structural phenotypes are not only crucial for vegetable oil supply security but also closely linked to sustainable agricultural development [5].
To address the pressure on grain and oil demand arising from population growth, enhancing sunflower yield, improving quality, and strengthening stress tolerance have become core tasks in crop breeding and agricultural production. In this process, the precise acquisition of crop phenotypes has become a critical step. Phenotypic analysis not only provides vital information on plant morphology, physiology, and developmental status but also serves as an indispensable tool for variety improvement, agronomic optimisation, and high-throughput breeding [6]. However, current research on sunflower phenotyping predominantly focuses on traits such as plant height and leaf area index. Although some studies have addressed geometric characteristics like flower head diameter [7], quantitative analysis of flower head size and marginal thickness remains severely inadequate. Indeed, these two structural parameters not only determine the spatial distribution and filling uniformity of seeds but also directly influence final seed yield and oil accumulation levels [8]. As key phenotypic indicators, flower head diameter and rim thickness exert direct effects on seed filling, oil accumulation, and performance during mechanised seed harvesting. The kernel accounts for approximately 70% of total seed weight, with oil content comprising about 55%. This indicates that overall flower head size and marginal thickness play a decisive role in determining seed plumpness and oil quality performance [9].
Furthermore, studies have revealed that east-facing flower heads exhibit an 11.2% increase in average seed weight compared to west-facing treatments, with more fully filled seeds. This indirectly corroborates the importance of flower head structure in pollination efficiency and seed development [10]. Further investigations reveal that plants with more uniform disc rim thickness exhibit lower seed damage rates during harvesting, alongside superior seed plumpness and oil stability [11]. During mechanised harvesting, adaptive picking based on flower head diameter and marginal thickness is crucial for reducing seed shedding and damage rates. This is because varying marginal thicknesses directly impact the gripping stability, mechanical control, and path planning of harvesting robot arms, thereby determining both harvesting efficiency and the integrity of marketable seed. Research indicates that when harvesting is delayed beyond full ripeness, seed loss approximately doubles by the fifth day and rises to a tenfold to twelvefold increase by the fifteenth day [12]. This further underscores the importance of precision harvesting and timing control based on flower head diameter and rim thickness. Consequently, achieving efficient identification and adaptive harvesting of sunflowers with varying flower head diameters and rim thicknesses has become a critical technical bottleneck in intelligent breeding and precision agricultural management. Currently, measuring sunflower phenotypic parameters primarily relies on manual methods, such as rulers, protractors, and other handheld devices, to obtain data including plant height, stem diameter, leaf width, flower head diameter, and rim thickness [13]. Whilst these methods offer simplicity and practicality, they are time-consuming and labour-intensive, costly, prone to significant human error, and poorly repeatable. Furthermore, repeated measurements often cause plant damage, making it difficult to meet demands for efficient and real-time feedback.
With the advent of non-contact technologies such as remote sensing, image processing, and artificial intelligence, crop phenotyping has progressively evolved towards automation and digitalisation [14]. For instance, Fieuzal et al. achieved high-precision estimation of sunflower leaf area index and plant height using multi-temporal optical and SAR satellite data [15]. Additionally, Sunoj et al. employed digital image processing techniques to measure inflorescence size, attaining considerable accuracy and semi-automated results [16]. Other studies have attempted to utilise techniques such as threshold segmentation and image binarisation for identifying flower heads, receptacles, and petals. While possessing a degree of automation capability, these methods face significant challenges in terms of recognition accuracy and robustness under field conditions involving natural lighting, complex backgrounds, and petal occlusion. These limitations make it difficult for such approaches to reliably obtain geometric structure parameters of flower heads, particularly regarding marginal thickness and overall integrity. Consequently, integrating advanced deep learning models with three-dimensional reconstruction techniques has emerged as a novel research trend. Furthermore, the recent combination of instance segmentation models with depth cameras has opened new avenues for phenotyping research. For instance, YOLOv11 can reliably identify targets in complex environments by learning the morphological and textural features of the receptacle, while the point cloud data generated by depth cameras provides a robust foundation for true geometric measurements [17].
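As an illustration of how a segmentation mask and a depth image can be combined into a point cloud, the following minimal Python sketch back-projects the masked depth pixels with the standard pinhole camera model. The function name, the intrinsics fx, fy, cx, cy, and the assumption that the depth image is in metres and already aligned to the colour image are illustrative choices and are not taken from the implementation described in this paper.

```python
import numpy as np

def mask_to_points(depth_m, mask, fx, fy, cx, cy):
    """Back-project masked depth pixels to camera-frame XYZ coordinates.

    depth_m : (H, W) depth image in metres (assumed aligned to the colour image)
    mask    : (H, W) boolean instance mask from the segmentation model
    fx, fy, cx, cy : pinhole intrinsics in pixels (hypothetical values in tests)
    """
    depth_m = np.asarray(depth_m, dtype=float)
    valid = np.asarray(mask, dtype=bool) & (depth_m > 0)  # drop missing depth
    v, u = np.nonzero(valid)          # row (v) and column (u) indices
    z = depth_m[v, u]
    x = (u - cx) * z / fx             # pinhole model: u = fx * x / z + cx
    y = (v - cy) * z / fy             # pinhole model: v = fy * y / z + cy
    return np.column_stack((x, y, z))
```

Pixels with zero depth are discarded here because depth cameras commonly report missing measurements as zero; a real pipeline would take the intrinsics from the camera calibration.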
To address the aforementioned issues, this paper proposes a non-contact measurement method integrating instance segmentation with three-dimensional point cloud analysis, specifically designed for the precise identification and estimation of sunflower flower head diameter and edge thickness. Although this method can directly estimate flower head thickness and diameter through single-angle lateral observation, front-view observations are additionally employed to validate diameter measurements, ensuring rigorous results. During the training phase, the model learns structural features of the flower head surface and edge regions through image annotation, while non-target interference is excluded by leaving petals and calyxes unannotated. Following mapping of recognition results into a three-dimensional point cloud space, denoising is performed using accelerated filtering and statistical filtering. Missing regions are then completed via interpolation and upsampling. Combined with PCA for pose standardisation, this generates structurally complete and unified standardised point cloud models for both the disc surface and margins. Considering that the central disc faces directly forward with an approximately flat structure, while the margins exhibit curved, winding patterns with indistinct thickness and highly variable growth forms, this study employs differentiated geometric extraction strategies: within the central disc point cloud, the disc diameter is calculated using principal plane projection combined with boundary maximum diameter measurement; within the marginal point cloud, PCA principal axis fitting analysis is employed to determine margin thickness. This approach accommodates the three-dimensional structural characteristics of different flower head regions, enhancing measurement completeness and accuracy. It also avoids redundant point cloud stitching and error accumulation, significantly reduces computational burden on equipment, and better meets the requirements for real-time performance and stability in field environments.
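The geometric steps described above (PCA pose standardisation, principal plane projection with maximum boundary diameter, and extent along a principal axis) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names are hypothetical, the rim patch is assumed to be an elongated strip whose width along the second principal axis corresponds to the thickness, and the percentile limits are an illustrative way to resist stray points.

```python
import numpy as np

def pca_align(points):
    """Centre the cloud and rotate it so its axes follow decreasing variance."""
    points = np.asarray(points, dtype=float)
    centred = points - points.mean(axis=0)
    # Rows of vt are principal directions, ordered by decreasing singular value.
    _, _, vt = np.linalg.svd(centred, full_matrices=False)
    return centred @ vt.T

def max_diameter(surface_points):
    """Largest pairwise distance after projection onto the principal plane.

    Uses an O(N^2) distance matrix, so it suits downsampled clouds only.
    """
    plane = pca_align(surface_points)[:, :2]   # discard the normal direction
    diff = plane[:, None, :] - plane[None, :, :]
    return float(np.sqrt((diff ** 2).sum(axis=-1)).max())

def extent_along_axis(edge_points, axis=1, lo=2.0, hi=98.0):
    """Percentile-to-percentile extent along one principal axis.

    axis=1 (second principal axis) is an assumption about how the rim strip
    is oriented; lo/hi are illustrative percentile limits.
    """
    coord = pca_align(edge_points)[:, axis]
    return float(np.percentile(coord, hi) - np.percentile(coord, lo))
```

For dense clouds, the pairwise distance search would normally be restricted to boundary points (for example the convex hull of the projected points) to keep memory use manageable.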
In summary, existing research still exhibits several shortcomings in non-contact measurement and three-dimensional point cloud analysis. Firstly, multi-view point cloud registration relies on complex alignment algorithms, which are prone to cumulative errors and result in insufficient real-time performance. Secondly, quantitative studies on key parameters such as sunflower head edge thickness remain scarce, with existing methods struggling to balance computational efficiency and accuracy. Thirdly, existing deep learning models impose excessive computational overhead when deployed on resource-constrained edge devices, limiting their real-time application in field scenarios.
To address these technical gaps, this study proposes a non-contact measurement method combining an improved MBLA-YOLO approach with dual-view point clouds. Its innovations manifest in three key aspects: (1) Methodological: enhancing feature representation capabilities for complex boundaries and small targets through the introduction of the CKMB module and LADH detection head; (2) Process: designing a ‘dual-view direct measurement’ strategy that bypasses traditional multi-view point cloud fusion, reducing error accumulation and improving real-time performance; (3) Application: integrating PCA pose standardisation for efficient edge thickness estimation, delivering a scalable lightweight solution for intelligent harvesting equipment and high-throughput agronomic phenotyping research.
Consequently, this study not only addresses methodological shortcomings in existing techniques regarding precise edge thickness measurement and lightweight deployment but also provides novel insights and technical support for intelligent equipment design and efficient field measurement in precision agriculture.
The remainder of this paper is organised as follows:
Section 2 details data acquisition, model refinement, and point cloud geometric analysis;
Section 3 presents experimental results and analysis;
Section 4 discusses performance variations, application strategies, and limitations;
Section 5 concludes the paper.
3. Results
Given that CKMB and LADH act independently on the backbone feature extraction and detection head stages, respectively, a series of ablation experiments was designed. By introducing CKMB and LADH in turn, we compared the performance of the original YOLOv11n-seg model, the single-module improved versions, and the integrated MBLA-YOLO. This allowed a systematic evaluation of the impact of each module on segmentation accuracy, edge detection capability, and inference efficiency, validating the effectiveness of the structural optimisation and providing an empirical basis for subsequent model deployment and lightweight adaptation.
Given the differences in texture structure and boundary characteristics between the two regions, this paper conducts independent training and evaluation for the disc surface and edge tasks to better accommodate their respective segmentation requirements. This approach ensures fair comparison and accurate reflection of performance metrics. All comparison models were trained using identical training datasets and parameter settings to guarantee the scientific validity and comparability of experimental results. The ablation study findings are presented in Table 2 and Table 3.
In the baseline YOLOv11n-seg model, without any of the proposed enhancements, Precision on flower disc edges reached 0.961, Recall reached 0.939, and mAP50 was 0.960. For the flower disc surface, Precision reached 0.967, Recall reached 0.946, and mAP50 was 0.971. The model has 2.83 million parameters and a computational cost of 10.2 GFLOPs. While demonstrating respectable segmentation performance under standard testing conditions, the model leaves room for improvement in detection accuracy within complex scenarios characterised by diverse morphologies and significant perspective variations. This indicates that both feature extraction and robustness require further refinement.
Upon replacing the C3k2 module in the backbone network with the proposed CKMB module, Precision, Recall, and mAP50 for flower disc edges improved to 0.984, 0.945, and 0.972, respectively, while Precision, Recall, and mAP50 for the disc surface improved to 0.987, 0.951, and 0.978, respectively. This demonstrates the module’s advantage in enhancing feature extraction capabilities. By incorporating the MBConv architecture and a channel attention mechanism, the CKMB module further enhances the network’s perception and modelling capabilities for key region features, thereby improving overall segmentation performance. Although the number of parameters increased slightly to 3.12 million, the computational cost decreased to 10.0 GFLOPs, demonstrating that this structure maintains high computational efficiency while improving accuracy.
Furthermore, upon replacing the original detection head with the proposed LADH detection head, the Precision, Recall, and mAP50 metrics for the flower disc edge improved to 0.988, 0.953, and 0.978, respectively, while Precision, Recall, and mAP50 on the flower disc surface improved to 0.99, 0.958, and 0.983, respectively. Concurrently, the number of parameters was reduced to 2.40 million, and the computational cost decreased to 8.6 GFLOPs. These results demonstrate that LADH’s asymmetric tri-branch architecture, through decoupled modelling of classification, regression, and confidence prediction tasks combined with lightweight convolution design, effectively enhances detection accuracy while substantially reducing model complexity. This further boosts inference efficiency and deployment adaptability. As illustrated in Figure 7, the asymmetric three-branch design of LADH simplifies the detection head architecture by separating classification, regression, and IoU prediction into lightweight convolutional branches. This architectural modification achieves improved accuracy while significantly reducing parameter count (from 2.83 million to 2.40 million) and computational cost (from 10.2 GFLOPs to 8.6 GFLOPs). The ablation study results in Table 2 and Table 3 further validate the effectiveness of this lightweight design.
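The text attributes part of the LADH savings to lightweight convolutions without stating their type in this section. Assuming, purely for illustration, that they are depthwise separable convolutions (a common lightweight choice, not confirmed here), the following sketch compares weight counts for a single layer. The channel widths and kernel size are hypothetical, and bias terms are ignored.

```python
def standard_conv_params(c_in, c_out, k):
    """Weights in a standard k x k convolution (bias ignored)."""
    return k * k * c_in * c_out

def depthwise_separable_params(c_in, c_out, k):
    """Weights in a k x k depthwise conv followed by a 1 x 1 pointwise conv."""
    depthwise = k * k * c_in      # one k x k filter per input channel
    pointwise = c_in * c_out      # 1 x 1 conv that mixes channels
    return depthwise + pointwise

# Hypothetical layer: 64 -> 64 channels with a 3 x 3 kernel.
standard = standard_conv_params(64, 64, 3)
separable = depthwise_separable_params(64, 64, 3)
print(standard, separable, round(standard / separable, 2))
```

The per-layer saving grows with the kernel size and the output width, which is one reason a head built from such layers can shrink while the backbone stays unchanged.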
In summary, the proposed CKMB module and LADH detection head both demonstrate significant performance advantages in instance segmentation tasks. The jointly constructed MBLA-YOLO model not only enhances the precision of segmenting both the surface and edges of flower heads but also effectively controls model complexity. This achieves a favourable balance between accuracy and efficiency, exhibiting superior practicality and deployment potential. It is particularly well-suited for resource-constrained agricultural scenarios demanding high real-time performance.
Figure 10 illustrates the comparison between model predictions and measured averages across 100 flower disc samples. The horizontal axis denotes sample identification (Plant ID) for individual differentiation; the left vertical axis displays measured averages (mm, blue line), while the right vertical axis shows model predictions (mm, orange line). It can be observed that the predicted curve aligns closely with the measured curve at peak and trough positions, indicating the model effectively captures relative variation trends across samples. This is consistent with the coefficient of determination R2 = 0.95, signifying the model explains approximately 95% of measured value variation. Prediction errors were generally small for most samples (MAE = 0.64 mm, RMSE = 0.69 mm), with only a few samples exhibiting notable discrepancies. Predictions tended to be slightly lower than actual measurements in the high-value range, while some samples showed slightly higher predictions in the low-value range. However, no significant unidirectional systematic bias was observed overall. Furthermore, the magnitude of errors remained relatively stable across sample IDs, with no discernible drift.
Figure 11 presents a three-dimensional scatter plot of positive and negative residuals between measured and predicted values. The X-axis represents model predictions (mm), the Y-axis denotes measured averages (mm), and the Z-axis indicates residuals (mm). Red spheres denote positive residuals (model underestimation), blue spheres indicate model overestimation, while the grey plane signifies the ideal prediction state (zero residual). The results indicate that the residual point cloud exhibits broadly symmetrical distribution above and below the zero plane, with comparable numbers of positive and negative residuals, showing no significant overall unidirectional bias. Blue points demonstrate slightly greater vertical deviation than red points in certain samples, potentially indicating slightly larger error margins during overestimation. The residual distribution spans the entire magnitude range between predicted and measured values without clustering within specific intervals, though dispersion increases marginally in regions of extreme high and low values.
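The sign convention used in the residual plot (residual = measured minus predicted, so a positive residual means the model underestimated) and the balance check described above can be sketched as follows. The function name and the sample values in the usage lines are hypothetical and are not data from this study.

```python
import numpy as np

def residual_summary(measured, predicted):
    """Count under- and overestimates and report the mean signed error."""
    residuals = np.asarray(measured, dtype=float) - np.asarray(predicted, dtype=float)
    return {
        "n_under": int((residuals > 0).sum()),   # positive residual: prediction too low
        "n_over": int((residuals < 0).sum()),    # negative residual: prediction too high
        "mean_bias": float(residuals.mean()),    # near zero when errors are balanced
    }

# Hypothetical values in mm, for illustration only.
summary = residual_summary([10.0, 12.0, 14.0, 16.0], [9.5, 12.5, 13.5, 16.5])
print(summary)
```

Comparable counts on both sides together with a mean bias close to zero correspond to the absence of unidirectional bias reported in the text.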
The causes of the aforementioned phenomenon may include: an insufficient number of extreme value samples in the dataset, resulting in limited generalisation capability of the model in both high-value and low-value regions. The disc surface, as the central area of the flower head, exhibits a relatively flat structure but may feature central depressions or localised protrusions. These subtle undulations can vary significantly across different samples, particularly during the later stages of development, where disc thickness, density, and colouration fluctuate in response to pollen development status and humidity changes. Certain samples exhibiting large localised errors may stem from measurement inaccuracies, calibration discrepancies, or individual sample variations (such as morphological anomalies, uneven development, or external environmental factors). If extreme-value samples themselves contain substantial noise—for instance, due to damage to the central disk, surface contamination by foreign matter, or intense light reflection—this can further amplify residuals.
Furthermore, the disc surface’s numerous tubular flowers, complex textures, and high light-shadow contrast can cause features extracted by the model to deviate from actual morphology, with these variations not fully captured by the model. To address these issues, the proportion of samples exhibiting significant central variation, differing developmental stages, and morphological abnormalities should be increased during training to ensure the model adequately learns feature representations across various growth stages and structural states. During subsequent data collection, maintaining uniform illumination and avoiding strong reflective interference will enhance the consistency between captured images and actual morphology. Furthermore, to evaluate the model’s predictive performance across different structural types of the surface, the data has been segmented into the following ranges (low, medium, high values) for analysis. This provides targeted insights for model optimisation.
The numerical indicators in Table 4 demonstrate the model’s exceptional regression capability in predicting flower head size. With an R2 value as high as 0.95, the model accounts for the majority of variation in flower head size. The Mean Absolute Error (MAE) stands at 0.64 mm and the Root Mean Square Error (RMSE) at 0.69 mm. All three metrics indicate the prediction results possess high precision and practical applicability. This study not only analyses the model holistically but also conducts a stratified analysis across different flower head size intervals to further validate its robustness and stability.
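For reference, the three metrics reported above follow their standard definitions, which can be computed as in the minimal sketch below. The function name and the sample values are hypothetical; the data are not from this study.

```python
import numpy as np

def regression_metrics(measured, predicted):
    """Return (R2, MAE, RMSE) using the standard definitions."""
    y = np.asarray(measured, dtype=float)
    p = np.asarray(predicted, dtype=float)
    err = y - p
    mae = float(np.abs(err).mean())
    rmse = float(np.sqrt((err ** 2).mean()))
    ss_res = float((err ** 2).sum())               # unexplained variation
    ss_tot = float(((y - y.mean()) ** 2).sum())    # total variation of measurements
    r2 = 1.0 - ss_res / ss_tot
    return r2, mae, rmse

# Hypothetical values in mm, for illustration only.
r2, mae, rmse = regression_metrics([1.0, 2.0, 3.0, 4.0], [1.5, 2.0, 3.0, 3.5])
```

RMSE is never smaller than MAE, and the two are close only when the errors have similar magnitudes; the small gap between the reported 0.69 mm and 0.64 mm is therefore consistent with the observation that few samples show large discrepancies.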
The partitioned analysis revealed the model performed most effectively in the larger flower head range (p > 22 mm), exhibiting the lowest errors (MAE = 0.578 mm; RMSE = 0.631 mm). This indicates the model extracts features more stably and delivers accurate predictions when target structures are distinct and boundary contours well-defined. The high-precision predictions within this range also demonstrate the model’s strong generalisation capability when processing image targets with high contrast and structural integrity. In contrast, the medium-sized range (17 ≤ p ≤ 22 mm) exhibited a slight increase in error, with RMSE reaching 0.74 mm, suggesting some uncertainty in handling flower heads near the interval boundaries. Potential causes include reduced contour sharpness and less pronounced texture features within this size range, coupled with increased background interference in images. These factors collectively impacted the model’s accuracy in estimating boundary positions and dimensions. Within the small-size range (p < 17 mm), although the model’s predicted values were slightly underestimated, its overall fitting capability remained high (R2 = 0.742), with errors controlled within reasonable limits. Analysis suggests that smaller flower heads occupy fewer pixels within images, possess fewer structural details, and display weaker boundary-background contrast. Consequently, they are particularly susceptible to factors such as image resolution and minor imaging angle deviations, which complicate feature contour recognition and thereby compromise prediction accuracy.
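When interpreting interval-level R2 values such as those above, it helps to note that R2 depends on the spread of the measured values within each interval as well as on the error size. The sketch below uses the 17 mm and 22 mm thresholds from the text but entirely synthetic data, and it assumes the intervals are defined on the measured diameter; the function names are hypothetical.

```python
import numpy as np

def r2_score(measured, predicted):
    """Coefficient of determination, 1 - SS_res / SS_tot."""
    y = np.asarray(measured, dtype=float)
    p = np.asarray(predicted, dtype=float)
    return float(1.0 - ((y - p) ** 2).sum() / ((y - y.mean()) ** 2).sum())

def stratified_r2(measured, predicted, low=17.0, high=22.0):
    """R2 per size interval, binning on the measured value (an assumption)."""
    y = np.asarray(measured, dtype=float)
    p = np.asarray(predicted, dtype=float)
    masks = {
        "small": y < low,
        "medium": (y >= low) & (y <= high),
        "large": y > high,
    }
    return {name: r2_score(y[m], p[m]) for name, m in masks.items()}

# Synthetic data (not from the paper): evenly spaced diameters with a constant
# 0.7 mm error of alternating sign, so the error size is identical everywhere.
measured = np.linspace(12.0, 27.0, 151)
predicted = measured + 0.7 * (-1.0) ** np.arange(measured.size)

overall = r2_score(measured, predicted)
per_bin = stratified_r2(measured, predicted)
```

In this synthetic case every interval has a lower R2 than the pooled data even though the error magnitude is the same throughout, because each interval contains less variation to explain. Interval-level MAE and RMSE are therefore the more direct indicators of where accuracy actually changes.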
Figure 12 presents a comparison of predicted values against measured averages (sorted by Plant ID). The horizontal axis denotes sample identification (Plant ID) for individual differentiation; the left vertical axis displays measured averages (mm, blue line), while the right vertical axis shows model predictions (mm, orange line). The predicted curve exhibits a high degree of consistency with the measured curve in overall trend. Predicted values closely align with measured values for the majority of samples, with only minor amplitude differences observed at isolated peaks and troughs. The model achieves a coefficient of determination of R2 = 0.93 for this metric, explaining approximately 93% of observed variance. The Mean Absolute Error (MAE) is 0.27 mm, while the Root Mean Square Error (RMSE) is 0.28 mm, indicating low error levels and high predictive accuracy and stability.
Figure 13 depicts the three-dimensional distribution of positive and negative residuals between measured and predicted values. The X-axis represents model predictions (mm), the Y-axis denotes measured averages (mm), and the Z-axis indicates residuals (mm). Positive residuals (red spheres) signify model underestimation, while negative residuals (blue spheres) indicate overestimation. The grey plane represents the ideal prediction state. Residual points are uniformly distributed above and below the zero plane, with nearly equal numbers of positive and negative residuals, indicating no significant overall bias. The dispersion of overestimation and underestimation is relatively balanced, though residual fluctuations are slightly greater in extreme value regions. This suggests a slight reduction in prediction stability for samples exhibiting extreme morphological patterns at the disk margins.
The aforementioned issues may arise because the disc margins, as annular continuous structures, exhibit uneven surfaces and irregular geometric variations in some specimens. Whilst most samples provide stable and clear feature information for the model, their overall prediction accuracy falls short of that achieved for the disc surface. In specimens with extreme thickness or localised curling, the marginal contours become more susceptible to growth morphology and angular influences within the image, leading to increased edge detection errors.
Furthermore, samples exhibiting abnormal morphology (such as damage, disease spots, or uneven development) may present gaps or protrusions at the edges, introducing additional noise into predictions and amplifying residual errors. To address these issues, the proportion of thin-edged and morphologically anomalous samples can be purposefully increased to broaden the model’s training coverage of extreme forms. During imaging, uniform lighting should be maintained to minimise contour distortion. Regarding feature extraction, edge enhancement algorithms alongside geometric features such as local curvature and width variation rate can be employed to improve recognition of irregular edges. Furthermore, segmented data analysis (low, medium, high value intervals) has been supplemented to precisely evaluate prediction performance across different ranges.
The numerical indicators in Table 5 demonstrate that the model exhibits satisfactory fitting performance relative to the measured mean values. Overall trend analysis reveals that predicted and measured values fluctuate synchronously across plant ID dimensions, exhibiting highly consistent periodic variation characteristics. Minor deviations occur only at isolated sample points, with no systemic errors emerging. This indicates the model possesses robust trend capture capability and strong predictive stability, accurately reflecting the pattern of inflorescence characteristics across samples. Further spatial visualisation of prediction errors via three-dimensional residual plots revealed performance variations across different thickness intervals. Within the lower thickness range (b < 2.8 mm), predicted values exhibited a systematic tendency towards underestimation, with negative residuals concentrated in this zone. Conversely, positive deviations were observed in the higher thickness region (b > 3.5 mm). This asymmetric residual distribution indicates relatively weaker model fitting capability at boundary intervals, potentially attributable to uneven data distribution or insufficient data volume in these regions. The model demonstrated overall excellent performance, achieving an R2 value of 0.926, indicating strong explanatory power for the response variable.
Concurrently, the Mean Absolute Error (MAE) and Root Mean Square Error (RMSE) were 0.27 mm and 0.28 mm, respectively, keeping overall errors within a manageable range and demonstrating strong practical applicability. However, upon grouping samples by thickness intervals, the model’s goodness-of-fit diminishes across segments, particularly within the b < 2.8 mm range where R2 drops to 0.484, indicating markedly reduced predictive capability. This phenomenon arises not only from potential sample size limitations but also from inherent physical constraints in image recognition processes. Specifically, when the disc margin is thin, its contour area appears minute and blurred within the image. During actual capture, it is prone to partial occlusion by adjacent structures such as petals or leaf margins, leading to incomplete image information or unclear edge extraction. This not only compromises the accuracy of subsequent feature extraction but also introduces interference to model learning. Furthermore, as the disc margin occupies the image periphery, variations in illumination, shadow effects, and minor shifts in imaging sharpness may cumulatively introduce prediction errors. Particularly during automated image capture, non-standardised angles or focal lengths can further amplify recognition inaccuracies for smaller structures. In contrast, the thickness boundaries of medium to thicker regions (2.8 mm ≤ b ≤ 5 mm) exhibit clearer delineation, with contours more readily and stably recognisable within images. Consequently, models demonstrate greater reliability and stability within this range.
In summary, this study constructed predictive models based on two dimensions, disc diameter and edge thickness, and systematically evaluated their overall and interval-level predictive performance. Results indicate that disc diameter estimation outperforms edge thickness prediction, achieving a coefficient of determination (R2) of 0.95. It demonstrates high fitting accuracy and stability across all size intervals. The model’s effectiveness is particularly pronounced for larger flower heads, primarily due to the inherent advantages of their overall structural morphology: regularity, near-circular shape, closed boundaries, and complete contours. This morphological characteristic facilitates easier segmentation and recognition of the target region within images. Consequently, it significantly reduces identification challenges arising from occlusion, reflections, and background interference during modelling, enabling the model to learn and converge more stably. In contrast, predicting disc edge thickness presents greater difficulty, exhibiting particular instability in low-thickness regions (e.g., b < 2.8 mm). The edge structure itself is minute and morphologically complex, prone to occlusion by surrounding petals, leaves, and other elements within the image. Furthermore, the edge contour is often difficult to extract precisely due to blurring, variations in illumination, or changes in viewing angle, resulting in incomplete feature representation. This limits the model’s accuracy and generalisation capability. Consequently, although the overall model has achieved a high standard, edge thickness prediction remains a relatively weaker aspect of the current system.
Figure 14. Visual comparison of detection results between the baseline YOLOv11n-seg model and the proposed MBLA-YOLO on sunflower heads. (a) Original image; (b) Detection results of the baseline YOLOv11n-seg; (c) Detection results of the proposed MBLA-YOLO.
As evident from Figure 14, MBLA-YOLO achieves performance enhancements across multiple aspects. Firstly, in detecting the flower head region, MBLA-YOLO’s segmentation mask provides more complete coverage of the flower head area with more precise boundary localisation. Secondly, regarding the identification of flower head edge regions, the improved model effectively detects multiple edge areas, whereas the original model exhibits missed detections. This demonstrates MBLA-YOLO’s superior perceptual capability in recognising fine structural details. Furthermore, MBLA-YOLO demonstrates superior robustness in multi-object detection tasks, simultaneously identifying multiple flower disc targets while maintaining high confidence levels. Overall, the improved model outperforms the original YOLOv11n-seg in detection accuracy, edge structure reconstruction capability, and multi-object recognition performance, rendering it more suitable for plant detection tasks involving complex structures in natural environments.
4. Discussion
The overall results of this study demonstrate that the proposed MBLA-YOLO model outperforms the baseline YOLOv11n-seg in terms of accuracy, robustness, and efficiency. For the flower disc edge task, Precision, Recall, and mAP50 improved by approximately 3.2%, 2.3%, and 2.7%, respectively; for the disc surface task, improvements were approximately 2.7%, 1.9%, and 1.5%, respectively. Concurrently, the model’s parameter count decreased from 2.83 million to 2.69 million (a reduction of approximately 4.9%), while the computational cost decreased from 10.2 GFLOPs to 8.4 GFLOPs (a reduction of approximately 17.6%). This achieves lightweight design and enhanced inference efficiency while maintaining high accuracy. Ablation studies further demonstrate that the CKMB module enhances feature extraction and boundary perception capabilities, while the LADH module significantly reduces model complexity whilst improving detection accuracy. The combination of both achieves optimal overall performance (disc surface mAP50 = 0.986, edge mAP50 = 0.988), validating the synergistic gains and application potential of the proposed architecture.
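The complexity reductions quoted above are relative changes with respect to the baseline, and can be reproduced with the short sketch below. The helper name is hypothetical; the input figures are those stated in the text.

```python
def percent_reduction(before, after):
    """Relative reduction with respect to the baseline, in percent."""
    return 100.0 * (before - after) / before

# MBLA-YOLO versus the YOLOv11n-seg baseline (figures from the text).
params_cut = percent_reduction(2.83, 2.69)   # parameters, in millions
gflops_cut = percent_reduction(10.2, 8.4)    # computational cost, in GFLOPs

# LADH-only variant versus the baseline (figures from the ablation study).
ladh_params_cut = percent_reduction(2.83, 2.40)
ladh_gflops_cut = percent_reduction(10.2, 8.6)

print(round(params_cut, 1), round(gflops_cut, 1))
print(round(ladh_params_cut, 1), round(ladh_gflops_cut, 1))
```

The combined model gives up part of the parameter saving of the LADH-only variant because CKMB adds parameters, while its computational cost is the lowest of the configurations reported.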
Experimental results validated the improved network architecture’s adaptability to complex field environments. However, variations in predictive performance across different measurement metrics and size ranges revealed both the method’s advantages and limitations in practical application. For diameter measurement, the overall R2 reached 0.95, with a Mean Absolute Error (MAE) of 0.64 mm and Root Mean Square Error (RMSE) of 0.69 mm, indicating the model’s effective capture of key geometric features of the flower head surface. Regarding partition performance, large-sized flower heads (p > 22 mm) exhibited the best prediction (R2 = 0.76, MAE = 0.58 mm, RMSE = 0.63 mm), likely owing to their distinct boundaries, regular morphology, and high feature extraction stability, whereas errors increased for medium-sized (17 ≤ p ≤ 22 mm) and small-sized (p < 17 mm) flower heads. The reduced accuracy for small-sized targets may be related to limited pixel coverage, blurred contours, and higher susceptibility to occlusion, which warrants further validation with expanded datasets.
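For readers who wish to reproduce this kind of partitioned evaluation, the following is a minimal sketch of how R2, MAE, and RMSE could be computed overall and per size class. It assumes that R2 denotes the coefficient of determination (the text does not state which definition was used) and that samples are assigned to classes by their measured diameter. The function names and the sample values are hypothetical and are not data from this study.

```python
import math


def regression_metrics(measured, predicted):
    """Return (R2, MAE, RMSE) for paired measured/predicted values."""
    n = len(measured)
    if n == 0 or n != len(predicted):
        raise ValueError("inputs must be non-empty and of equal length")
    errors = [p - m for m, p in zip(measured, predicted)]
    mae = sum(abs(e) for e in errors) / n
    ss_res = sum(e * e for e in errors)
    rmse = math.sqrt(ss_res / n)
    mean_m = sum(measured) / n
    ss_tot = sum((m - mean_m) ** 2 for m in measured)
    # Assumption: R2 is the coefficient of determination, 1 - SS_res / SS_tot.
    r2 = 1.0 - ss_res / ss_tot if ss_tot > 0 else float("nan")
    return r2, mae, rmse


def size_class(diameter):
    """Size classes from the text: small p < 17, medium 17 <= p <= 22, large p > 22."""
    if diameter < 17:
        return "small"
    if diameter <= 22:
        return "medium"
    return "large"


def metrics_by_class(measured, predicted):
    """Group samples by measured size class and compute metrics per group."""
    groups = {}
    for m, p in zip(measured, predicted):
        ms, ps = groups.setdefault(size_class(m), ([], []))
        ms.append(m)
        ps.append(p)
    return {name: regression_metrics(ms, ps) for name, (ms, ps) in groups.items()}


# Made-up illustrative values, NOT data from the study.
measured = [15.0, 16.0, 18.0, 20.0, 23.0, 25.0]
predicted = [15.5, 15.5, 18.5, 19.5, 23.5, 24.5]

overall = regression_metrics(measured, predicted)
per_class = metrics_by_class(measured, predicted)
print("overall:", overall)
for name, values in per_class.items():
    print(name, values)
```

In this toy example every sample has the same absolute error, yet the per-class R2 values are lower than the overall R2 because each class spans a narrower range of diameters. A similar effect may contribute to the gap between the overall and partitioned R2 values reported above, so MAE and RMSE are the more directly comparable quantities across size classes.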
The overall R2 for edge thickness measurement was 0.93, with an MAE of 0.27 mm and an RMSE of 0.28 mm, similarly demonstrating high accuracy. However, partition analysis revealed marked underperformance in the low-thickness range (b < 2.8 mm), where R2 was only 0.48 and a systematic negative bias was present. This may be associated with low pixel proportion in edge regions, occlusion effects, or sensitivity to illumination and resolution limits. Prediction performance was best in the medium thickness range (2.8–4 mm) (R2 = 0.71) and slightly lower in thicker regions (b > 4 mm) (R2 = 0.65), indicating that boundary clarity and texture influence thickness prediction. In contrast, diameter prediction exhibited relatively minor performance variations across different ranges, demonstrating the advantage of morphological structural features for model stability.
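A systematic bias of the kind described for the low-thickness range can be quantified as the mean signed error (predicted minus measured) within each thickness range. The sketch below illustrates this under the assumption that samples are binned by measured thickness; the function names and the sample values are hypothetical and are not data from this study.

```python
from statistics import mean


def thickness_class(b):
    """Thickness ranges from the text: low b < 2.8, medium 2.8 <= b <= 4, thick b > 4."""
    if b < 2.8:
        return "low"
    if b <= 4.0:
        return "medium"
    return "thick"


def bias_by_class(measured_mm, predicted_mm):
    """Mean signed error per thickness range; negative means underestimation."""
    errors = {}
    for m, p in zip(measured_mm, predicted_mm):
        errors.setdefault(thickness_class(m), []).append(p - m)
    return {name: mean(values) for name, values in errors.items()}


# Made-up illustrative values, NOT data from the study.
measured_mm = [2.0, 2.5, 3.0, 3.5, 4.5, 5.0]
predicted_mm = [1.75, 2.25, 3.0, 3.5, 4.5, 5.25]

bias = bias_by_class(measured_mm, predicted_mm)
for name, value in bias.items():
    print(f"{name}: mean signed error = {value:+.3f} mm")
```

A consistently negative value in one range, as in the "low" class of this toy example, indicates underestimation that a range-specific correction or additional training samples might address.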
In terms of network architecture enhancements, the CKMB module improves the network’s ability to capture subtle curvature variations and textural details along the flower head margins during feature extraction, outperforming the traditional C3k2 module. The LADH detection head enhances spatial information decoupling through its asymmetric multi-branch structure, significantly improving both mask generation quality and localisation accuracy. The combined MBLA-YOLOv11-seg model outperforms baseline models in both diameter and thickness prediction, achieving lightweight efficiency by reducing parameter counts and computational demands while maintaining high precision. LADH not only improves segmentation accuracy but also demonstrates strong lightweight characteristics. Its simplified tri-branch structure (Figure 7) directly reduces parameters and computational cost, which is crucial for deployment in embedded and edge devices operating under resource constraints. It is worth emphasising that lightweight design is not merely an ancillary advantage of model architecture, but rather a prerequisite for deployment on edge computing platforms such as handheld devices and drones. This research effectively reduces model complexity through structural refinements, enabling high-precision real-time inference on computationally constrained devices. This enhances the feasibility and practical value of agricultural field applications.
Compared to conventional two-dimensional methods reliant on threshold segmentation and contour extraction, this approach employs organ-level recognition to distinctly separate the flower head from non-target areas. It combines this with depth point cloud measurement to obtain true geometric parameters and utilises interpolation upsampling to compensate for point cloud deficiencies caused by occlusion, thereby enhancing the stability of geometric analysis. This method shows potential scalability to other crops and may be deployed on drones and edge devices for real-time recognition and measurement, though further validation is needed.
In terms of application strategy, for equipment control scenarios such as agricultural mechanical harvesting, it is recommended to directly adopt the maximum distance obtained from two structural measurements as the critical dimension to ensure safety redundancy for gripping or cutting devices. Conversely, in plant phenotyping research and population structure analysis, robust statistical indicators such as averages are more suitable for accurately reflecting overall structural trends and variation characteristics. It should be noted that whilst a single lateral angle observation suffices to estimate both flower head thickness and diameter in this experimental measurement process, an additional frontal perspective was incorporated to validate diameter measurements, thereby ensuring rigorous results. This design, whilst increasing data acquisition and computational demands to some extent, enhances the reliability of conclusions. Future research may explore lightweight measurement approaches based on a single angle, provided accuracy is maintained. This could further reduce computational resource consumption and enhance the flexibility of system deployment.
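The two usage strategies described above can be sketched as a pair of small helpers. This assumes that the "two structural measurements" are the diameter estimates from the lateral and frontal views; the function names and the example values are hypothetical.

```python
from statistics import mean


def critical_dimension(lateral_mm: float, frontal_mm: float) -> float:
    """Equipment control: take the larger estimate to leave a safety margin."""
    return max(lateral_mm, frontal_mm)


def phenotyping_estimate(lateral_mm: float, frontal_mm: float) -> float:
    """Phenotyping: average the estimates to reflect the overall trend."""
    return mean([lateral_mm, frontal_mm])


# Made-up illustrative values, NOT data from the study.
lateral, frontal = 21.5, 22.5
print("harvest clearance:", critical_dimension(lateral, frontal))
print("phenotype value:  ", phenotyping_estimate(lateral, frontal))
```

Taking the maximum errs on the side of a larger clearance for gripping or cutting devices, whereas the average is less sensitive to a single over- or underestimate when summarising a population.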
Nevertheless, this study retains certain limitations. Variations in illumination during the image recognition phase may interfere with YOLO detection results, and the stability of predicting low-thickness, small-sized structures remains inadequate [38]. It should be noted that this study is primarily based on relatively uniform field conditions and has not yet been validated under varying planting densities. Future comparative experiments could be conducted in both dense (such as densely planted sunflower fields) and sparse environments to further examine the model’s robustness and adaptability under conditions of occlusion, blurred boundaries, and multiple targets.
Furthermore, in the prediction of geometric parameters based on point cloud data, this paper ultimately selected point cloud results processed through denoising and imputation as the predicted values for thickness and diameter. This choice was primarily based on their advantages in terms of morphological integrity and geometric continuity, which effectively mitigate the interference of anomalous noise points and missing regions on the fitting results, thereby enhancing measurement stability. Nevertheless, this choice does not imply that processed values universally outperform raw data. Under varying sample morphologies, noise distribution characteristics, and interpolation parameter settings, predicted values may still exhibit fluctuations.
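To make the trade-off between raw and processed values concrete, the following is a simplified sketch of denoising followed by imputation. It is not the pipeline used in this study: it works on a one-dimensional depth profile rather than a three-dimensional point cloud, and it substitutes a median-absolute-deviation outlier test and linear interpolation for whatever denoising and upsampling methods were actually applied. The function name, the threshold `k`, and the sample values are all hypothetical.

```python
from statistics import median


def denoise_and_impute(profile, k=3.0):
    """Drop outliers from a 1D profile, then fill gaps by linear interpolation.

    `profile` is a list of floats, with None marking missing samples.
    """
    valid = [v for v in profile if v is not None]
    if len(valid) < 2:
        raise ValueError("need at least two valid samples")
    med = median(valid)
    mad = median(abs(v - med) for v in valid)
    # Samples further than k scaled MADs from the median are treated as noise.
    # If the MAD is zero, no sample is rejected.
    limit = k * 1.4826 * mad
    cleaned = [
        v if v is not None and (mad == 0 or abs(v - med) <= limit) else None
        for v in profile
    ]
    known = [i for i, v in enumerate(cleaned) if v is not None]
    filled = list(cleaned)
    for i, v in enumerate(cleaned):
        if v is not None:
            continue
        left = max((j for j in known if j < i), default=None)
        right = min((j for j in known if j > i), default=None)
        if left is None:
            filled[i] = cleaned[right]   # leading gap: copy nearest value
        elif right is None:
            filled[i] = cleaned[left]    # trailing gap: copy nearest value
        else:
            t = (i - left) / (right - left)
            filled[i] = cleaned[left] + t * (cleaned[right] - cleaned[left])
    return filled


# Made-up illustrative profile with one gap and one noise spike.
raw = [10.0, 10.5, None, 11.5, 50.0, 12.5, 13.0]
processed = denoise_and_impute(raw)
print(processed)
```

Changing `k` or the interpolation scheme changes which samples are discarded and how gaps are filled, which illustrates why processed values can still fluctuate with parameter settings.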
Consequently, application of these results should involve comprehensive assessment considering practical requirements, tolerance for accuracy, and specific contextual factors to ensure methodological reliability and applicability. Future research may explore enhancing robustness by increasing the proportion of low-thickness and small-sized samples, incorporating edge enhancement or background suppression algorithms tailored for fine structures during image preprocessing, integrating multi-modal fusion of near-infrared and multispectral information, and implementing illumination normalisation. Concurrently, the method’s real-time performance and resource consumption on low-power edge computing platforms should be evaluated to ensure its feasibility and practical value for deployment on devices such as drones and mobile terminals.