Article

EffiFormer-CGS: Deep Learning Framework for Automated Quantification of Fusarium Spore Germination

1 College of Mechanical and Electronic Engineering, Northwest A&F University, Yangling, Xianyang 712100, China
2 Institute of Plant Protection and Agro-Products Safety, Anhui Academy of Agricultural Sciences, Hefei 230031, China
* Authors to whom correspondence should be addressed.
These authors contributed equally to this work.
Agriculture 2026, 16(1), 131; https://doi.org/10.3390/agriculture16010131
Submission received: 15 November 2025 / Revised: 22 December 2025 / Accepted: 1 January 2026 / Published: 4 January 2026
(This article belongs to the Section Crop Protection, Diseases, Pests and Weeds)

Abstract

Fusarium head blight (FHB), caused mainly by the Fusarium graminearum species complex, is a devastating cereal disease associated with yield losses and mycotoxin contamination. Early infection is closely linked to spore germination and germ tube elongation, yet conventional monitoring methods are labor-intensive and poorly suited for dynamic phenotypic quantification. We present EffiFormer-CGS, a three-module deep learning framework integrating object detection, key point localization, and phenotypic quantification for microscopic images of FHB spores. A dataset of 2381 images was generated from systematic experiments with triazole fungicides (Prochloraz, Prothioconazole, and Tebuconazole) across multiple concentrations and time points. Spores were annotated with bounding boxes and fine-grained geometric key points, enabling calculation of germination degree as the ratio of germ tube length to body length. EffiFormer-CGS achieved 90.8% mAP@0.5:0.95 in object detection and 91.4% mAP@0.5 in key point localization. All fungicides significantly inhibited germination, with Prochloraz showing the strongest effect. Predictions closely matched manual counts, with germination rate errors ≤ 5.18%. EffiFormer-CGS provides an efficient, automated, and high-precision approach for spore germination analysis, supporting high-throughput fungicide screening, resistance monitoring, and sustainable FHB management.

1. Introduction

Fusarium head blight (FHB) is a major fungal disease of global significance, especially prevalent in warm and humid agroecological zones where it causes substantial yield losses and grain quality deterioration [1]. The primary pathogens belong to the Fusarium graminearum species complex (FGSC), which produce mycotoxins such as deoxynivalenol (DON) and zearalenone (ZEN), posing serious risks to food security as well as human and livestock health [2]. Field investigations have demonstrated a strong correlation between the proportion of FGSC spores in airborne transmission and both disease severity and toxin accumulation in grain [3]. The early infection process, comprising spore arrival, attachment, and germination, plays a decisive role in determining both the likelihood of disease onset and the extent of toxin contamination. In particular, during the flowering stage, the emergence and elongation rate of spore germ tubes serve as critical phenotypic markers of pathogen activity and provide highly sensitive indicators for assessing fungicide efficacy [4].
Conventional methods for spore monitoring and identification include direct microscopic counting, culture-based biochemical assays, immunological detection (e.g., ELISA), and molecular techniques such as PCR and sequencing [1]. While essential for pathogen confirmation and routine surveillance, these approaches suffer from limitations including labor-intensive protocols, low throughput, and reliance on specialized personnel and equipment. More critically, they are poorly suited to directly and quantitatively characterizing the dynamic phenotype of active germination [5]. Fluorescence-based assays employing dipicolinic acid (DPA) as a marker have demonstrated the potential of chemical approaches for spore detection. However, their applicability is largely restricted to bacterial spores and provides limited insight into fungal conidia or ascospores and their germination morphology. Consequently, such methods cannot adequately support fine-grained geometric quantification or comparative evaluations of fungicide efficacy [6].
Recent advances in computer vision have yielded notable progress in spore analysis. For example, GSD-YOLO, an optimized variant of YOLOv7-tiny with decoupled heads and lightweight convolutions, achieved 98.0% mAP for spore classification [7], and CRF_ResUNet++ integrated residual modules with conditional random fields to address adherent spore segmentation (F1-score = 0.943) [8]. Nevertheless, significant limitations persist: object detection models often focus on holistic spore identification while overlooking morphological details of germ tubes; segmentation networks, although alleviating adhesion issues, fail to establish geometric associations between the spore body and germ tube. It is worth noting that although human pose estimation technologies have achieved precise localization of joint points in fields like pose analysis, similar concepts have not been fully exploited in microbial image analysis [9]. Existing methods still largely rely on handcrafted features or specialized imaging equipment, making it difficult to adapt to fine-grained quantification under complex microscopic conditions. In particular, there is a lack of an end-to-end framework integrating object detection, keypoint localization, and phenotypic computation. As a result, the automated and high-precision quantification of germ tube extension length—a core phenotypic metric—remains a technological gap [10].
To overcome the challenges of inadequate germ tube quantification and the heavy reliance on manual analysis in microscopic imaging, this study targets the development of an integrated approach combining object detection, key point localization, and phenotypic quantification of Fusarium head blight (FHB) spore germination [11]. A systematic experimental protocol—including inoculation, fungicide treatment, and microscopic imaging—was implemented. Focusing on the critical observation window of 2.0–2.5 h and employing standardized image acquisition at 10× and 40× magnifications, we constructed a comprehensive dataset comprising 24 groups (2381 images) that included both control samples and triazole fungicide treatments (Prochloraz, Prothioconazole, and Tebuconazole) at multiple concentrations and time points. For dataset annotation, we performed both object detection and binary classification of spores into germinated (Ag) and non-germinated (Ng). In addition, a fine-grained geometric annotation scheme was implemented, defining four key points: the two termini of the spore body (Head1, Head2) and the terminal ends of the germ tube (Tail1, Tail2). Spore body length was calculated as the Euclidean distance between Head1 and Head2, while germ tube length was determined by summing the Head–Tail distances in unilateral or bilateral germination. The ratio of germ tube length to body length was then used as a quantitative indicator of “germination degree.” This approach approximates the germ tube as a straight line segment, enabling consistency and comparability across samples. For spores with branched or curved germ tubes, annotation rules can be expanded and curve fitting applied to reduce measurement error.
To accomplish the dual objectives of determining whether germination occurs (qualitative analysis) and quantifying germination degree (quantitative analysis), we propose EffiFormer-CGS: a three-module fusion framework that integrates object detection (EfficientDet), key point localization (UniFormer-CGS), and phenotypic quantification for microscopic images of spore germination. Beginning with EfficientDet for robust multi-scale detection, the framework incorporates geometric consistency-guided key point localization and statistically reliable morphological quantification, thereby establishing a complete information chain from germination occurrence to the extent of germination.
For the key point localization module, UniFormer was adopted as the backbone network to address challenges such as blurred boundaries of fine structures and uneven responses in spore microscopic images. Three major innovations were incorporated:
(1) CBAM attention mechanism: By combining channel and spatial attention, CBAM enhances the model’s sensitivity to critical regions, including the spore body and germ tube tips [12].
(2) Gradient Focal Heatmap Loss (GFHL): This novel loss function emphasizes regions of high uncertainty or structural variability in prediction distributions, thereby improving learning in cases of blurred boundaries and morphological heterogeneity.
(3) SimCCLabel coordinate encoding: By reformulating two-dimensional coordinate regression into two independent one-dimensional classification tasks, this method achieves sub-pixel-level localization accuracy while substantially reducing memory usage [13].
The resulting EffiFormer-CGS framework synergistically enhances both the accuracy and efficiency of key point localization. The complete system demonstrates strong robustness under challenging microscopic imaging conditions—including spore overlap, occlusion, uneven illumination, and focal plane variation—thereby offering an efficient and reliable solution for rapid fungicide screening and large-scale plant phenotyping. On an NVIDIA GeForce RTX 4090, the end-to-end pipeline runs at 16.4 ms per image (~61 FPS), enabling approximately 3000–3600 microscopy images per minute in batch-oriented screening workflows.
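As a rough check of the throughput figures above, the per-image latency can be converted to a frame rate and a per-minute capacity. This assumes strictly sequential processing with no image loading, saving, or batching overhead, which the text does not specify, so it is an idealized upper bound rather than a measured value.

$$\mathrm{FPS} = \frac{1000\ \mathrm{ms/s}}{16.4\ \mathrm{ms/image}} \approx 61\ \mathrm{images/s}, \qquad 61\ \mathrm{images/s} \times 60\ \mathrm{s/min} \approx 3660\ \mathrm{images/min}$$

The quoted 3000–3600 images per minute sits at or slightly below this bound, which would be consistent with some allowance for input/output overhead.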

2. Materials and Methods

2.1. Experimental Design and Data Acquisition

This study was conducted at the Institute of Plant Protection and Agro-products Safety, Anhui Academy of Agricultural Sciences, from 12 August 2024 to 22 August 2024. Using Fusarium graminearum spores as the experimental subject, a systematic protocol encompassing spore inoculation, fungicide treatment, and microscopic image acquisition was established. The overall process is shown in Figure 1.

2.1.1. Spore Inoculation and Culture Conditions

A pre-prepared suspension of Fusarium graminearum spores (10 μL) was dropped onto the surface of a Petri dish containing Water Agar solid medium and left to stand until the liquid was fully absorbed. Subsequently, the Petri dish was placed in a constant-temperature incubator for cultivation, with the temperature set at 28 °C, humidity at 60%, and incubation carried out under light-proof conditions. Preliminary experiments indicated that 2.0–2.5 h after cultivation represented the critical window for spore germination [14]. At this stage, spore morphology was distinct and germination status remained stable, facilitating both morphological analysis and subsequent image processing.

2.1.2. Fungicide Treatment Protocol and Grouping Design

Three widely used triazole fungicides—Prochloraz, Prothioconazole, and Tebuconazole—were selected for treatment [15]. For each fungicide, three concentration gradients (5, 6, and 7 ppm) were applied, with a control group (CK, without fungicide) included, resulting in 24 treatment groups in total. The specific grouping and sampling times are shown in Table 1.

2.1.3. Microscopic Image Acquisition and Dataset Construction

Images of the spore germination state were acquired using a MOTIC BA210 digital microscope with 10× and 40× objectives. The image resolution was set to 1600 × 1200, and images were saved in BMP format. During image acquisition, light source intensity, focal length, and exposure parameters were standardized to ensure image quality stability and comparability of analyses. For the control group, images were acquired at multiple time points (0.5–2.5 h and 5.0–5.5 h) to fully capture the dynamic changes in spore germination. For the fungicide-treated groups, the focus was on the critical observation window of 2.0–2.5 h to obtain the germination phenotypic characteristics under each treatment condition.
In total, microscopic image data were collected from 24 treatment groups, comprising 2381 images. These images were uniformly numbered and classified according to fungicide type, concentration gradient, and time point.

2.2. Image Preprocessing and Annotation

2.2.1. Preprocessing

To ensure the quality of the dataset and enhance the generalization ability of the model, this study conducted systematic preprocessing on the collected microscopic images.
First, rigorous data cleaning was performed to exclude images that were uninterpretable due to blurring, abnormal exposure (over- or underexposure), severe spore overlap, or morphological distortion. This step was intended to minimize the influence of noisy samples on subsequent training of object detection and key point regression models, thereby enhancing analytical stability and ensuring that only high-quality images were retained for annotation [16].
Second, to further enhance the model’s adaptability to common variations in actual microscopic imaging (such as uneven illumination and focal length fluctuations), a comprehensive data augmentation strategy was adopted, including random rotation, horizontal/vertical flipping, brightness adjustment, Gaussian noise injection, and scaling [17]. To prevent data leakage and ensure unbiased evaluation, the data preprocessing followed a strict sequence: (1) the cleaned set of original images was first partitioned into training, validation, and test sets using a group-wise split, guaranteeing that all images from the same biological sample remained within the same partition; (2) data augmentation techniques were subsequently applied exclusively to the training set images. The validation and test sets contained only original, unaugmented images. Following augmentation, the dataset was expanded to 8000 images and partitioned into training (6400 images), validation (800 images), and test (800 images) sets at an 8:1:1 ratio. The augmented images effectively simulate complex imaging conditions and provide more abundant learning samples for the model.
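The split-then-augment sequence described above can be sketched as follows. This is a minimal illustration, not the authors' code: the function names, the `image_groups` mapping from image identifier to biological sample, and the string-based stand-in for an augmentation are all hypothetical. Because the split is made over groups rather than individual images, the per-image ratio is only approximately 8:1:1 unless groups are of equal size.

```python
import random
from collections import defaultdict


def group_split(image_groups, ratios=(0.8, 0.1, 0.1), seed=0):
    """Split image ids into train/val/test so that no group spans two partitions.

    image_groups: dict mapping image id -> group id (e.g. one biological sample).
    """
    by_group = defaultdict(list)
    for image_id, group_id in image_groups.items():
        by_group[group_id].append(image_id)

    groups = sorted(by_group)
    random.Random(seed).shuffle(groups)  # deterministic shuffle of whole groups

    n_train = int(len(groups) * ratios[0])
    n_val = int(len(groups) * ratios[1])
    parts = {
        "train": groups[:n_train],
        "val": groups[n_train:n_train + n_val],
        "test": groups[n_train + n_val:],
    }
    return {name: [img for g in gs for img in by_group[g]]
            for name, gs in parts.items()}


def augment_training_only(split, augment, copies=2):
    """Add augmented variants to the training partition; val/test stay untouched."""
    out = dict(split)
    out["train"] = list(split["train"]) + [
        augment(img, k) for img in split["train"] for k in range(copies)
    ]
    return out


# Toy example: 50 images from 10 samples, 5 images per sample.
image_groups = {f"img_{i:03d}": f"sample_{i // 5}" for i in range(50)}
split = augment_training_only(
    group_split(image_groups),
    augment=lambda img, k: f"{img}_aug{k}",  # placeholder for a real transform
)
```

Splitting before augmenting matters because an augmented copy of a test image in the training set would leak information and inflate the reported metrics.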

2.2.2. Image Annotation Process

A hierarchical annotation strategy was adopted in this study to sequentially complete object detection annotation and key point annotation.
In the object detection annotation phase, Labelme software was employed to identify and classify spore instances in microscopic images. The categories were clearly defined to distinguish between germinated spores (Ag) and non-germinated spores (Ng): germinated spores (Ag) refer to those with at least one visible protruding structure (germ tube) outside the spore body; non-germinated spores (Ng) refer to those with no visible protruding structures outside the spore body. The annotation method used a minimum bounding box to frame the main region of the spore. The generated JSON format file contains category labels (Ag or Ng) and the corresponding bounding box coordinate information.
Subsequently, key point annotation was performed. For spores identified as Ag class, the geometric vertices at both ends of the long axis of the spore body (Head1, Head2) and the terminal endpoints of the germ tube (Tail1, Tail2) were annotated. For Ng class spores, only the vertices at both ends of the spore body (Head1, Head2) were annotated. Key point annotation followed specific rules: Head1 and Head2 are located at both ends of the long axis of the spore body; Tail1 and Tail2 are positioned at the terminal points of the germ tube. For unilateral germination, the Tail point on the non-germinated side was annotated at the same position as its corresponding Head point to maintain consistency in calculation logic.
To link object detection results with key point annotations and facilitate subsequent geometric quantification, each spore instance was assigned a unique Group ID, associating its bounding box with the corresponding key point annotations.
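One way to picture the annotation record produced by this two-stage scheme is shown below. The `SporeAnnotation` class, its field names, and the `make_annotation` helper are illustrative assumptions, not the Labelme JSON schema or the authors' data format; only the rules encoded (Ag/Ng labels, Head and Tail points, the unilateral convention, and the Group ID link) come from the text.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

Point = Tuple[float, float]


@dataclass
class SporeAnnotation:
    group_id: int                             # links the box to its key points
    label: str                                # "Ag" (germinated) or "Ng"
    bbox: Tuple[float, float, float, float]   # x_min, y_min, x_max, y_max
    head1: Point
    head2: Point
    tail1: Optional[Point] = None             # None for Ng spores
    tail2: Optional[Point] = None


def make_annotation(group_id, label, bbox, head1, head2, tail1=None, tail2=None):
    """Build one record, applying the annotation rules described above."""
    if label not in ("Ag", "Ng"):
        raise ValueError("label must be 'Ag' or 'Ng'")
    if label == "Ng":
        if tail1 is not None or tail2 is not None:
            raise ValueError("Ng spores carry only Head1 and Head2")
        return SporeAnnotation(group_id, label, bbox, head1, head2)
    if tail1 is None and tail2 is None:
        raise ValueError("Ag spores need at least one germ tube endpoint")
    # Unilateral rule: the Tail on the non-germinated side coincides with its Head.
    return SporeAnnotation(
        group_id, label, bbox, head1, head2,
        tail1 if tail1 is not None else head1,
        tail2 if tail2 is not None else head2,
    )


# A unilaterally germinated spore: only the Head1 side has a germ tube.
ann = make_annotation(7, "Ag", (10, 10, 60, 40), (15, 25), (45, 25), tail1=(5, 12))
```

Placing the missing Tail on its Head makes that side contribute zero length, so the same length formula can be applied to unilateral and bilateral spores.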

3. EffiFormer-CGS Three-Module Fusion Framework

To simultaneously extract dual information—germination status (qualitative) and germination degree (quantitative)—from microscopic images of Fusarium graminearum spore germination, this study introduces the EffiFormer-CGS three-module fusion framework (Figure 2).
The framework integrates three core modules (object detection, key point localization, and phenotypic quantification) into a complete workflow. It enables spore recognition and germination status classification (Ag/Ng), sub-pixel localization of spore body and germ tube structures (Head1, Head2, Tail1, Tail2), and automated calculation of the core phenotypic parameter, germination degree. Its core advantages are as follows:
(1) The object detection module (based on the improved EfficientDet) provides a foundation for high-precision spore localization and classification.
(2) The key point localization module (UniFormer-CGS) combines the global–local modeling of the UniFormer backbone network, the ability of the CBAM attention mechanism to focus on tiny key regions, the optimized learning of boundary-blurred or abruptly varying regions provided by the GFHL function, and the SimCCLabel coordinate encoding strategy for high-precision, low-memory localization.
(3) The phenotypic quantification module defines and calculates “germination degree”, a sensitive indicator, from key point coordinates.
Efficient collaboration between modules is achieved through intelligent path selection (Ng spores skip key point localization) and Group ID binding. Characterized by high precision, efficiency, and scalability, the framework effectively supports automated spore germination analysis under challenging microscopic conditions (e.g., overlap, uneven illumination) and across multiple magnifications, fungicides, and time points.

3.1. Object Detection Module

Microscopic analysis of spore germination presents challenges such as complex backgrounds, uneven illumination, clustered overlap, and morphological polymorphism, particularly under fungicide treatment where subtle structural changes in germ tubes critically affect phenotypic evaluation. Traditional manual counting is inefficient and prone to subjective error, limiting its suitability for high-throughput dynamic monitoring [18]. Therefore, constructing an automated object detection module with high robustness and accuracy is the foundation of the subsequent analysis process. The primary function of this module is to accurately detect spore instances and classify them as germinated (Ag) or non-germinated (Ng), providing reliable input for subsequent key point localization and phenotypic quantification.
In this study, an improved EfficientDet architecture was adopted as the backbone of object detection. This architecture combines the efficient feature extraction of EfficientNet with the multi-scale feature fusion capability of BiFPN, yielding strong representational power and generalization. It is particularly suitable for recognizing tiny targets such as spores in the presence of complex backgrounds, uneven illumination, clustered overlap, and morphological polymorphism [19].
(1) EfficientNet Backbone Network: It balances model complexity and performance through a compound scaling strategy (depth, width, resolution), and effectively extracts the edge, texture, and overall morphological features of spores.
(2) BiFPN Feature Fusion: By weighted fusion of feature maps at different levels, the model can focus on both the overall outline and local details of spores, significantly improving the ability to recognize spores in overlapping regions and with blurred boundaries.
(3) Detection Head and Classification Criteria: The detection head outputs the bounding boxes and category labels (Ag/Ng) of spores, without involving key point information. Classification was based on structural visibility: spores exhibiting any visible germ tube were classified as Ag, whereas those lacking protrusions were classified as Ng. This criterion emphasizes structural recognition rather than length thresholds, avoiding missed detection caused by short germ tubes.
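The weighted fusion in item (2) can be illustrated with the fast normalized fusion rule from the published EfficientDet design, in which each input feature map receives a learnable non-negative weight and the weights are normalized by their sum. Whether the improved variant used here keeps exactly this rule is an assumption; the function name, the list-of-lists feature maps, and the weight values below are purely illustrative.

```python
def fast_normalized_fusion(features, weights, eps=1e-4):
    """Fuse same-shaped 2D feature maps: O = sum_i(w_i * I_i) / (eps + sum_j w_j).

    Weights are passed through ReLU so that each one stays non-negative.
    """
    relu_w = [max(0.0, w) for w in weights]
    total = sum(relu_w) + eps
    rows, cols = len(features[0]), len(features[0][0])
    return [
        [sum(w * f[r][c] for w, f in zip(relu_w, features)) / total
         for c in range(cols)]
        for r in range(rows)
    ]


# Two toy 2x2 feature maps from different pyramid levels (already resized).
coarse = [[1.0, 1.0], [1.0, 1.0]]
fine = [[3.0, 3.0], [3.0, 3.0]]
fused = fast_normalized_fusion([coarse, fine], weights=[1.0, 3.0])
```

With weights 1 and 3, the fused map sits close to 2.5, that is, three quarters of the way toward the more heavily weighted input.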

3.2. UniFormer-CGS Key Point Localization Module

The geometric changes during spore germination (especially the formation and elongation of germ tubes) hold significant phenotypic value, directly reflecting pathogenic activity and the inhibitory effect of fungicides. Conventional object detection localizes the overall spore position but fails to capture fine internal structures (e.g., spore body termini, germ tube tips), thereby limiting quantitative analysis of germination degree. To address this, a key point localization module was built on top of the object detection output, aiming to accurately identify the geometric structural features of spore bodies and germ tubes. This module adopts UniFormer as its backbone network and integrates three key technologies: the CBAM attention mechanism, the Gradient Focal Heatmap Loss (GFHL) function, and the SimCCLabel coordinate encoding strategy, forming the UniFormer-CGS architecture. This architecture enables precise and robust spore key point localization, providing a solid foundation for phenotypic quantification.
The network architecture of UniFormer-CGS is shown in Figure 3.

3.2.1. UniFormer Backbone Network

The target structures in spore microscopic images are characterized by small size, diverse morphologies, blurred boundaries, and complex background interference. Traditional convolutional neural networks (e.g., ResNet, HRNet) are often limited by local receptive fields and semantic expression capabilities when dealing with such unstructured small targets, occlusions, and multi-scale morphologies [20].
This study introduces UniFormer (Unified Transformer) as the backbone network for key point localization. UniFormer innovatively integrates the local feature extraction capability of convolutional neural networks (CNN) and the global semantic modeling capability of self-attention mechanism (Transformer). It overcomes the problem of separation between local and global modeling in traditional networks, and significantly improves the model’s feature modeling and localization accuracy in complex microscopic scenarios [21]. Its core advantages are reflected in:
(1) Unified Architecture Design: UniFormer integrates both convolution and self-attention modules simultaneously at each stage. The convolution module is responsible for capturing local details of spores, such as edges and textures; the self-attention module models the spatial relationships and global contextual semantics of spores. The two modules operate synergistically, enhancing stability in processing spores with overlap, occlusion, or complex morphologies.
(2) Multi-Stage Feature Extraction: A feature pyramid is constructed through layer-wise embedding and deep stacking, outputting multi-scale feature maps that cover rich information from low-level textures to high-level semantics. This multi-scale representation capability is crucial for the accurate localization of fine-grained structures such as the two ends of the spore body (Head1, Head2) and the tips of germ tubes (Tail1, Tail2), providing sufficient contextual support and spatial resolution.

3.2.2. CBAM Attention Module

To enhance the model’s ability to focus on key regions, this study introduces the CBAM (Convolutional Block Attention Module) attention mechanism between the backbone network output and the upsampling module (Figure 3). CBAM is a lightweight and general-purpose module that serially connects the Channel Attention and Spatial Attention sub-modules. By explicitly modeling the “channel dimension importance” and “spatial position saliency”, it guides the network to focus on regions with higher semantic value [22].
(1) Channel Attention: It generates channel description vectors through global average pooling and max pooling, produces channel weights via a shared multi-layer perceptron (MLP), and performs channel-wise weighting on feature maps to enhance the response to key structures.
(2) Spatial Attention: It conducts average and max pooling along the channel dimension, concatenates the results, and generates a spatial weight map through convolution. This map is used to highlight regions with high localization value in the image.
CBAM was integrated after the backbone output, allowing the attention mechanism to act directly on high-level semantic features to enhance discriminability while avoiding interference with low-level texture details [23].
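The two sub-modules described above can be sketched in plain Python on a tiny feature map. This is a didactic sketch only: the weights `w1`, `w2`, and `kernel` are arbitrary hand-picked values rather than learned parameters, a 3 × 3 spatial kernel is used for brevity where CBAM is usually described with a larger one, and a real implementation would use a tensor library.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def shared_mlp(vec, w1, w2):
    """Two-layer perceptron with ReLU, shared by both pooled descriptors."""
    hidden = [max(0.0, sum(w * v for w, v in zip(row, vec))) for row in w1]
    return [sum(w * h for w, h in zip(row, hidden)) for row in w2]


def channel_attention(fmap, w1, w2):
    """Reweight each channel using global average- and max-pooled descriptors."""
    avg = [sum(map(sum, ch)) / (len(ch) * len(ch[0])) for ch in fmap]
    mx = [max(map(max, ch)) for ch in fmap]
    scores = [a + m for a, m in zip(shared_mlp(avg, w1, w2), shared_mlp(mx, w1, w2))]
    gates = [sigmoid(s) for s in scores]
    return [[[g * v for v in row] for row in ch] for g, ch in zip(gates, fmap)]


def spatial_attention(fmap, kernel):
    """Reweight each position using channel-wise avg/max maps and one convolution."""
    C, H, W = len(fmap), len(fmap[0]), len(fmap[0][0])
    avg = [[sum(fmap[c][y][x] for c in range(C)) / C for x in range(W)] for y in range(H)]
    mx = [[max(fmap[c][y][x] for c in range(C)) for x in range(W)] for y in range(H)]
    pooled = [avg, mx]                  # the two "concatenated" input planes
    k = len(kernel[0])
    pad = k // 2
    out = [[[0.0] * W for _ in range(H)] for _ in range(C)]
    for y in range(H):
        for x in range(W):
            acc = 0.0
            for p in range(2):
                for dy in range(k):
                    for dx in range(k):
                        yy, xx = y + dy - pad, x + dx - pad
                        if 0 <= yy < H and 0 <= xx < W:   # zero padding
                            acc += kernel[p][dy][dx] * pooled[p][yy][xx]
            gate = sigmoid(acc)
            for c in range(C):
                out[c][y][x] = gate * fmap[c][y][x]
    return out


def cbam(fmap, w1, w2, kernel):
    """Channel attention followed serially by spatial attention."""
    return spatial_attention(channel_attention(fmap, w1, w2), kernel)


# Toy input: 2 channels of 3x3; channel 0 has a single bright "tip" pixel.
fmap = [
    [[0.0, 0.0, 0.0], [0.0, 4.0, 0.0], [0.0, 0.0, 0.0]],
    [[1.0, 1.0, 1.0], [1.0, 1.0, 1.0], [1.0, 1.0, 1.0]],
]
w1 = [[0.5, 0.5]]            # hidden x channels (reduction to one hidden unit)
w2 = [[1.0], [-1.0]]         # channels x hidden
kernel = [[[0.1] * 3 for _ in range(3)] for _ in range(2)]
out = cbam(fmap, w1, w2, kernel)
```

Both gates lie strictly between 0 and 1, so the module rescales features rather than creating new activations.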

3.2.3. Gradient Focal Heatmap Loss (GFHL) Module

Spore key points have blurred boundaries, uneven responses, and variable morphologies, so traditional loss functions struggle to focus on the key regions; in particular, learning tends to be insufficient, or gradients tend to vanish, at locations with sharp gradient changes or distribution peaks. To address this, this study proposes the GFHL (Gradient Focal Heatmap Loss) function to optimize the prediction of key point coordinate distributions under the SimCC representation [24].
GFHL is a weighted loss function that integrates error magnitude and gradient information. Its core idea is to guide the model to focus on regions with high uncertainty or structural mutations in the predicted distribution during training. Loss calculation is performed separately in the x and y directions:
(1) Error term ($\mathrm{error}_i$): Measures the absolute deviation between the predicted distribution ($P_i$) and the ground-truth distribution ($T_i$) at position $i$:
$$\mathrm{error}_i = \left| P_i - T_i \right|$$
(2) Gradient term ($\mathrm{grad}_i$): Calculates the gradient magnitude of the predicted distribution via central difference to capture regions with drastic changes:
$$\mathrm{grad}_i = \frac{\left| P_{i+1} - P_{i-1} \right|}{2}$$
(3) Focal weighting factor ($w_i$): Adds the error and gradient, then performs exponential amplification ($\gamma$ is the focal exponent that controls the rate of weight growth):
$$w_i = \left( \mathrm{error}_i + \mathrm{grad}_i \right)^{\gamma}$$
(4) Final loss ($L_{\mathrm{GFHL}}$): Adopts the form of weighted squared error:
$$L_{\mathrm{GFHL}} = \sum_i w_i \cdot \mathrm{error}_i^{2}$$
GFHL supports the target_weights mechanism, which can dynamically adjust the learning intensity based on the importance or visibility of key points. This loss function operates on the x- and y-coordinate distributions output by the SimCCHead and supports multiple reduction strategies (mean, sum, none) [25].
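The four steps of GFHL can be sketched for a single key point as follows. Several details here are assumptions that the text does not specify: the edge handling of the central difference (the boundary value is repeated), the way the x and y terms are combined (simple concatenation before reduction), and the use of one scalar `target_weight` per key point. In an automatic differentiation framework, whether the weight $w_i$ is treated as a constant during backpropagation is likewise not stated.

```python
def gfhl_1d(pred, target, gamma=2.0):
    """Per-bin GFHL terms w_i * error_i^2 for one 1D coordinate distribution."""
    n = len(pred)
    terms = []
    for i in range(n):
        error = abs(pred[i] - target[i])
        left = pred[max(i - 1, 0)]           # assumed: replicate value at the edges
        right = pred[min(i + 1, n - 1)]
        grad = abs(right - left) / 2.0       # central-difference magnitude
        weight = (error + grad) ** gamma     # focal weighting factor
        terms.append(weight * error ** 2)
    return terms


def gfhl(pred_x, pred_y, target_x, target_y,
         target_weight=1.0, gamma=2.0, reduction="mean"):
    """Combine the x- and y-direction terms with an optional key point weight."""
    terms = gfhl_1d(pred_x, target_x, gamma) + gfhl_1d(pred_y, target_y, gamma)
    terms = [target_weight * t for t in terms]
    if reduction == "none":
        return terms
    if reduction == "sum":
        return sum(terms)
    if reduction == "mean":
        return sum(terms) / len(terms)
    raise ValueError("reduction must be 'mean', 'sum', or 'none'")


# Toy 3-bin distributions: the prediction peaks at the right bin but is too flat.
pred = [0.1, 0.8, 0.1]
target = [0.0, 1.0, 0.0]
terms = gfhl_1d(pred, target, gamma=2.0)
loss = gfhl(pred, pred, target, target, reduction="sum")
```

In this toy case the two side bins receive a larger weight than the peak bin, because the predicted distribution changes sharply around them, which is the behavior the gradient term is meant to produce.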
Hyperparameter Selection for Focal Exponent γ: The focal exponent γ in Equation (3) controls the strength of weighting on uncertain regions. To determine its optimal value, we conducted a targeted hyperparameter search on the validation set, evaluating γ ∈ {1.0, 1.5, 2.0, 2.5}. The setting γ = 2.0 was found to provide the best balance, yielding the highest keypoint localization accuracy while maintaining stable training convergence. Model performance demonstrated moderate sensitivity to γ; values below 2.0 weakened the emphasis on uncertain regions, while larger values occasionally caused over-focusing and gradient instability. Robust performance was consistently observed within the range γ ∈ [1.5, 2.5].

3.2.4. SimCCLabel Encoding

Traditional key point encoding methods based on 2D Gaussian heatmaps have problems such as output resolution limitations (difficulty in achieving sub-pixel accuracy), high memory usage, and blurred boundaries. This study adopts the SimCCLabel (Simple Coordinate Classification) strategy (Figure 3), converting the 2D coordinate regression task of each key point into two independent 1D classification tasks (corresponding to the x-axis and y-axis, respectively):
(1) Discretize the image coordinate axes into a fixed number (B) of intervals (bins).
(2) The model outputs the probability distribution over the bins via softmax ($p_{x,i}$, $p_{y,j}$).
(3) The final coordinates are calculated via the weighted average of the probability distribution:
$$\hat{x} = \sum_{i=1}^{B} i \cdot p_{x,i}, \qquad \hat{y} = \sum_{j=1}^{B} j \cdot p_{y,j}$$
SimCCLabel substantially reduces memory usage, enhances coordinate prediction accuracy (achieving sub-pixel localization), and improves adaptability at structural boundaries. It provides an efficient and stable coordinate modeling foundation for the accurate localization of tiny structures in complex spores [26].
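The decoding step in item (3) can be sketched as below, using bin indices 1 to B as in the formula. The `bins_per_pixel` parameter is a hypothetical name for the ratio between the number of bins and the image side length; the text does not state the value used, nor the exact convention for mapping a bin index back to a pixel position.

```python
import math


def softmax(logits):
    """Numerically stable softmax over one axis of bins."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def decode_axis(logits):
    """Expected bin index: sum over i = 1..B of i * p_i."""
    probs = softmax(logits)
    return sum(i * p for i, p in enumerate(probs, start=1))


def decode_keypoint(logits_x, logits_y, bins_per_pixel=1.0):
    """Decode one key point from its two independent 1D distributions."""
    return (decode_axis(logits_x) / bins_per_pixel,
            decode_axis(logits_y) / bins_per_pixel)


# Probability mass split evenly between bins 3 and 4 on x, concentrated on bin 2 on y.
x_hat, y_hat = decode_keypoint([0, 0, 5, 5, 0, 0], [0, 9, 0, 0, 0, 0])
```

Because the output is an expectation rather than an arg-max, a distribution that straddles two bins decodes to a position between them, which is how sub-bin resolution arises.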
In summary, the UniFormer-CGS architecture integrates the powerful feature modeling capability of the UniFormer backbone network, the prominent regional focusing capability of CBAM, the optimization capability of GFHL for uncertain/mutated regions, and the efficient and accurate coordinate representation of SimCCLabel, constructing a spore key point recognition system with high robustness and precision. This architecture achieves collaborative improvement in multiple dimensions including feature extraction, attention guidance, loss optimization, and coordinate encoding, providing reliable technical support for subsequent phenotypic quantification.

3.3. Phenotypic Quantification Module

Based on the coordinates of germinated spore key points (Head1, Head2, Tail1, Tail2) output by the key point localization module, this module uses geometric calculation methods to quantify key spore germination phenotypic parameters, as shown in the example on the right side of Figure 2. The specific process is as follows:
(1) Calculation of spore body length ($L_{\mathrm{body}}$):
The spore body was approximated as a straight line segment between the two termini of its major axis (Head1 and Head2), although its actual morphology may exhibit slight curvature, and its length is calculated using the Euclidean distance formula:
$$L_{\mathrm{body}} = \sqrt{(x_{h2} - x_{h1})^2 + (y_{h2} - y_{h1})^2}$$
where $(x_{h1}, y_{h1})$ and $(x_{h2}, y_{h2})$ are the pixel coordinates of Head1 and Head2, respectively.
(2) Calculation of Germ Tube (Protrusion) Length ($L_{\mathrm{protrusion}}$):
Calculated separately according to the germination direction (unilateral or bilateral):
a. Unilateral germination: Only a unilateral germ tube (e.g., Tail1) is detected. The protrusion length is the straight-line distance from the Head point on this side (e.g., Head1) to the corresponding Tail point (Tail1):
$$L_{\mathrm{protrusion}} = \sqrt{(x_{t1} - x_{h1})^2 + (y_{t1} - y_{h1})^2}$$
b. Bilateral germination: Bilateral germ tubes (Tail1 and Tail2) are detected. The protrusion length is the sum of the extension distances on both sides:
$$L_{\mathrm{protrusion}} = \sqrt{(x_{t1} - x_{h1})^2 + (y_{t1} - y_{h1})^2} + \sqrt{(x_{t2} - x_{h2})^2 + (y_{t2} - y_{h2})^2}$$
where $(x_{t1}, y_{t1})$ and $(x_{t2}, y_{t2})$ are the pixel coordinates of Tail1 and Tail2, respectively.
(3) Calculation of Germination Degree ($R_{\mathrm{germ}}$):
Germination degree was defined as the percentage of protrusion length ($L_{\mathrm{protrusion}}$) relative to spore body length ($L_{\mathrm{body}}$):
$$R_{\mathrm{germ}} = \frac{L_{\mathrm{protrusion}}}{L_{\mathrm{body}}} \times 100\%$$
The “germination degree” ($R_{germ}$), defined as the ratio of germ tube length to spore body length, is a normalized metric that offers several advantages over absolute germ tube length alone. First, it accounts for the natural size variation among individual spores, providing a size-invariant measure of germination vigor and enabling more consistent comparisons across samples and treatments. Second, it is a more sensitive indicator of fungicide effects, because it quantifies the suppression of germ tube outgrowth relative to the spore’s own potential rather than being confounded by initial spore dimensions. This relative measure also aligns with the functional relevance of germ tube extension in host infection potential and exhibits reduced variance in statistical analysis, enhancing the robustness of phenotypic comparisons.
a. In the case of unilateral germination, $R_{germ}$ represents the ratio of the length of the unilateral germ tube to the length of the spore body.
b. In the case of bilateral germination, $R_{germ}$ reflects the ratio of the total length of the bilateral germ tubes to the length of the spore body.
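The three calculation steps above can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' implementation: the function names `euclidean` and `germination_degree` are hypothetical, key points are assumed to be `(x, y)` pixel tuples, and a side without a germ tube is assumed to be passed as `None`.

```python
import math
from typing import Optional, Tuple

Point = Tuple[float, float]  # (x, y) in pixels


def euclidean(p: Point, q: Point) -> float:
    """Straight-line distance between two key points."""
    return math.hypot(q[0] - p[0], q[1] - p[1])


def germination_degree(head1: Point, head2: Point,
                       tail1: Optional[Point] = None,
                       tail2: Optional[Point] = None) -> float:
    """Return R_germ in percent; a tail of None means no germ tube on that side."""
    body = euclidean(head1, head2)  # L_body: Head1 to Head2
    if body == 0:
        raise ValueError("Head1 and Head2 coincide; body length is zero")
    protrusion = 0.0  # L_protrusion: sum over the sides that germinated
    if tail1 is not None:
        protrusion += euclidean(head1, tail1)
    if tail2 is not None:
        protrusion += euclidean(head2, tail2)
    return protrusion / body * 100.0


# Illustrative coordinates (not taken from the dataset).
unilateral = germination_degree((0, 0), (10, 0), tail1=(0, 5))
bilateral = germination_degree((0, 0), (10, 0), tail1=(0, 5), tail2=(13, 4))
```

With these illustrative coordinates, the unilateral case has a 5-pixel germ tube on a 10-pixel body, and the bilateral case adds a second 5-pixel tube, so the second value is twice the first.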

3.4. Three-Module Fusion Mechanism

The core of the EffiFormer-CGS framework is the integration of its three modules: the target detection module (EfficientDet), the key point localization module (UniFormer-CGS), and the phenotypic quantification module. Rather than simply stacking these modules, the framework links them through designed data flows and complementary functions into an end-to-end automated analysis pipeline that runs from spore recognition, through fine structure localization, to phenotypic parameter calculation.
The entire fusion process starts with the target detection module. This module receives raw microscopic image input, uses the improved EfficientDet architecture to efficiently detect all spore instances in the image, and accurately classifies them into two categories—“Germinated (Ag)” or “Non-germinated (Ng)”—based on their germination status (whether visible germ tubes exist). For each detected spore, the module outputs its bounding box coordinates and category label (Ag/Ng). This output serves as the foundation for subsequent modules.
The results of target detection directly drive the execution path of subsequent modules, enabling logical integration in terms of functionality. For spores classified as “Non-germinated (Ng)”, since they do not have a germ tube structure, their germination degree is directly defined as 0%. Therefore, the information of these Ng-type spores (position, category, $R_{germ}$ = 0%) skips the key point localization module and is directly transmitted to the final phenotypic output stage, which significantly saves computing resources. For spores classified as “Germinated (Ag)”, their bounding box information serves as the input window for the key point localization module. Based on the bounding box coordinates, the system crops out the image region containing the Ag spore, performs necessary scaling preprocessing, and then inputs it into the UniFormer-CGS key point localization module.
The key point localization module (UniFormer-CGS) is a core link that connects the preceding and subsequent steps in the fusion framework, specifically responsible for processing Ag spores. It receives the cropped images of Ag spores provided by the target detection module, and by leveraging a powerful architecture integrated with the UniFormer backbone network, CBAM attention mechanism, GFHL gradient focal loss function, and SimCCLabel coordinate encoding strategy, it accurately predicts the sub-pixel-level coordinates of the four key points of the spore: the two ends of the major axis of the spore body (Head1, Head2) and the end points of the germ tube (Tail1, Tail2). Crucially, through the unique Group ID assigned to each spore instance, the coordinate data output by the key point localization module establishes a stable associative binding with the category label (Ag) and original bounding box information of the spore provided by the target detection module, ensuring data consistency throughout the process.
Finally, the phenotypic quantification module receives the integrated information from the first two modules. For Ng spores, it directly assigns $R_{germ}$ = 0%. For Ag spores, it uses the precise coordinates (Head1, Head2, Tail1, Tail2) provided by the key point localization module to perform geometric calculations based on Euclidean distance: first, it calculates the spore body length ($L_{body}$, i.e., the distance from Head1 to Head2); then, depending on whether the germ tube is unilateral or bilateral, it calculates the germ tube protrusion length ($L_{protrusion}$, i.e., the distance from the Head to the Tail on the unilateral side or the sum of the distances from the Heads to the Tails on both sides); and finally, it calculates the core phenotypic indicator, germination degree ($R_{germ}$), which is defined as $L_{protrusion} / L_{body} \times 100\%$. The output of this module is the complete phenotypic information of each spore, including its position (bounding box), germination status (Ag/Ng), and quantitative germination degree ($R_{germ}$).
The entire fusion framework ensures process efficiency and result robustness through shared preprocessing strategies (e.g., unified data augmentation), a unified annotation system (Group ID association), and optimized collaboration between modules (e.g., the high-precision bounding boxes from target detection provide high-quality input regions for key point localization, and the GFHL optimizes key point localization accuracy to ensure the reliability of phenotypic quantification). This deep integration enables EffiFormer-CGS to operate like a sophisticated assembly line, automatically completing the conversion from raw images to accurate phenotypic data, and providing a powerful integrated tool for the research on spore germination of wheat scab and pesticide screening. Its workflow is shown in Figure 4.
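The routing logic described in this section (Ng spores bypass key point localization, Ag spores are cropped, localized, and quantified, and every record stays bound to its Group ID) can be sketched as follows. This is an assumption-laden illustration, not the released code: `Detection`, `Phenotype`, `run_pipeline`, and the `locate_keypoints` callable are hypothetical names, and the callable stands in for the crop, scaling, and UniFormer-CGS inference steps.

```python
import math
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional, Tuple

Point = Tuple[float, float]
Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2) in pixels
Keypoints = Dict[str, Optional[Point]]   # keys: Head1, Head2, Tail1, Tail2


@dataclass
class Detection:
    group_id: int
    box: Box
    label: str  # "Ag" (germinated) or "Ng" (non-germinated)


@dataclass
class Phenotype:
    group_id: int
    box: Box
    label: str
    r_germ: float  # germination degree in percent


def run_pipeline(detections: List[Detection],
                 locate_keypoints: Callable[[Box], Keypoints]) -> List[Phenotype]:
    results = []
    for det in detections:
        if det.label == "Ng":
            # No germ tube: skip key point localization and assign 0%.
            results.append(Phenotype(det.group_id, det.box, det.label, 0.0))
            continue
        kp = locate_keypoints(det.box)  # stand-in for crop + UniFormer-CGS
        body = math.dist(kp["Head1"], kp["Head2"])
        protrusion = sum(
            math.dist(kp[head], kp[tail])
            for head, tail in (("Head1", "Tail1"), ("Head2", "Tail2"))
            if kp.get(tail) is not None
        )
        # A zero-length body is degenerate; flag it as NaN instead of dividing.
        r_germ = protrusion / body * 100.0 if body > 0 else float("nan")
        results.append(Phenotype(det.group_id, det.box, det.label, r_germ))
    return results


calls = []


def fake_locator(box: Box) -> Keypoints:
    """Toy locator that records its calls and returns fixed key points."""
    calls.append(box)
    return {"Head1": (0.0, 0.0), "Head2": (10.0, 0.0),
            "Tail1": (0.0, 5.0), "Tail2": None}


detections = [Detection(1, (0, 0, 20, 20), "Ag"),
              Detection(2, (30, 30, 40, 40), "Ng")]
phenotypes = run_pipeline(detections, fake_locator)
```

In this toy run the locator is invoked only for the Ag spore, which mirrors the computational saving described for Ng spores.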

4. Experimental Results

4.1. Experimental Environment and Parameter Settings

The experiments were conducted on a workstation equipped with an Intel® Xeon® Gold 6430 processor (2.60 GHz), an NVIDIA GeForce RTX 4090 GPU with 24 GB of video memory, and 16 GB of system RAM. These computational resources were hosted on the AutoDL cloud platform. The software environment was based on Ubuntu 20.04 with Python 3.8, employing PyTorch 2.0.1 as the deep learning framework and CUDA 11.8 for GPU acceleration.
For the object detection module, we employed the EfficientDet-B0 architecture, initialized with pre-trained weights from the OpenMMLab official open-source model repository. The loss function for detection consisted of a classification loss and a bounding box regression loss, with weighting factors set to 1.0 and 50.0, respectively. Detailed hyperparameters for model training (e.g., input resolution, initial learning rate, batch size, and epochs) are summarized in Table A1 of Appendix A.
To substantiate the high-throughput capability of EffiFormer-CGS, we benchmarked inference efficiency on an NVIDIA GeForce RTX 4090 GPU. The detection module uses EfficientDet-B0 with an input resolution of 640 × 640. The keypoint localization module (UniFormer-CGS) is executed only on cropped regions produced by the detector, and phenotype computation is lightweight. We report single-image latency (batch size = 1) averaged over repeated runs after warm-up. Timing excludes disk I/O and focuses on GPU inference and post-processing.
Overall, EffiFormer-CGS achieves 16.4 ms per image (~61 FPS) on RTX 4090, enabling approximately 3000–3600 images per minute in batch-oriented screening workflows.
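As a quick consistency check on these figures, the reported single-image latency converts to frame rate and per-minute throughput as follows (a worked calculation from the numbers above, assuming strictly sequential processing with no additional overhead):

$$\text{FPS} = \frac{1000\ \text{ms/s}}{16.4\ \text{ms/image}} \approx 61.0, \qquad 61.0\ \text{images/s} \times 60\ \text{s/min} \approx 3659\ \text{images/min}$$

The quoted range of 3000–3600 images per minute therefore sits just below this ceiling, which is plausible once costs excluded from the timing, such as disk I/O, are taken into account.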

4.2. Performance Comparison of Target Detection and Key Point Localization

4.2.1. Comparison of Target Detection Models

To evaluate the performance of the target detection module, five mainstream models were compared on the same test set (800 images). Evaluation followed standard target detection metrics: mAP@0.5:0.95 (mean average precision across IoU thresholds from 0.5 to 0.95) and mAP@0.5 (average precision at an IoU threshold of 0.5). The results are shown in Table 2:
Conclusion: EfficientDet is significantly superior to other models in terms of the comprehensive metric (mAP@0.5:0.95) and exhibits robust performance in high-confidence detection (mAP@0.5). Accordingly, EfficientDet was selected as the target detection backbone of the framework.
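To make the two metrics concrete, the sketch below computes the intersection over union (IoU) of two boxes and lists which of the ten thresholds used in mAP@0.5:0.95 a single prediction would satisfy. It is only an illustration of the thresholding idea, not a full mAP implementation (which also involves confidence ranking and precision-recall integration); the `(x1, y1, x2, y2)` box format and the function names are assumptions.

```python
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2)

# The ten IoU thresholds averaged in mAP@0.5:0.95 (0.50, 0.55, ..., 0.95).
IOU_THRESHOLDS = [round(0.5 + 0.05 * i, 2) for i in range(10)]


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = ((a[2] - a[0]) * (a[3] - a[1])
             + (b[2] - b[0]) * (b[3] - b[1]) - inter)
    return inter / union if union > 0 else 0.0


def thresholds_passed(pred: Box, truth: Box) -> List[float]:
    """Thresholds at which this prediction would count as a match."""
    score = iou(pred, truth)
    return [t for t in IOU_THRESHOLDS if score >= t]


# A prediction shifted by one pixel on a 10 x 10 ground-truth box.
passed = thresholds_passed((1, 0, 11, 10), (0, 0, 10, 10))
```

A box that easily passes the 0.5 threshold can still fail the stricter ones, which is why mAP@0.5:0.95 is the more demanding of the two metrics.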

4.2.2. Ablation Experiment on the Key Point Localization Module

Based on the UniFormer backbone network, the CBAM attention mechanism, GFHL function, and SimCCLabel encoding strategy are integrated step by step to evaluate their impacts on the key point localization accuracy (the evaluation metrics are the same as those in Table 2). The results are shown in Table 3:
The ablation experiment results (Table 3) highlight the contribution of each core component within the UniFormer-CGS architecture to key point localization accuracy and confirm the effectiveness of their combined optimization. First, the UniFormer backbone network proved to be a key foundation for performance improvement. Compared with the baseline model (RTMPose), adopting the UniFormer backbone improves localization accuracy, increasing mAP@0.5 by 1.5%. This improvement is consistent with UniFormer’s design, which combines the local feature extraction capability of convolutional neural networks (CNNs) with the global semantic modeling capability of Transformers. This combination enables UniFormer to provide more discriminative feature representations when addressing challenges such as the tiny structure of spores, blurred boundaries, and complex microscopic backgrounds.
Notably, the introduction of the SimCCLabel (Simple Coordinate Classification Label) strategy yielded the greatest improvement in both accuracy and efficiency. This strategy increases the comprehensive key point localization metric (mAP@0.5:0.95) by 2.7% while greatly reducing the model’s memory footprint. By reformulating 2D coordinate regression into two independent 1D classification tasks, SimCCLabel effectively overcomes resolution limitations and boundary blurring inherent in traditional heatmap methods, thereby providing a robust foundation for sub-pixel precision localization of spore structures. This result fully highlights the important value of SimCCLabel in balancing high-precision localization and computational resource consumption.
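The idea of recasting 2D coordinate regression as two 1D classification tasks can be illustrated with a simplified encoder and decoder. The sketch below is a minimal version under stated assumptions: it uses one-hot labels (practical implementations may smooth them), a splitting factor of 2.0 bins per pixel chosen only for illustration, and hypothetical function names; it is not the configuration used in the paper.

```python
from typing import List, Tuple


def encode_axis(coord: float, size: int, split: float = 2.0) -> List[float]:
    """One-hot label over size * split bins for a single coordinate."""
    n_bins = int(size * split)
    index = min(max(int(round(coord * split)), 0), n_bins - 1)
    label = [0.0] * n_bins
    label[index] = 1.0
    return label


def decode_axis(scores: List[float], split: float = 2.0) -> float:
    """Recover the coordinate from the highest-scoring bin."""
    best = max(range(len(scores)), key=scores.__getitem__)
    return best / split


def encode_point(x: float, y: float, width: int, height: int,
                 split: float = 2.0) -> Tuple[List[float], List[float]]:
    """Encode x and y independently as two 1D classification targets."""
    return encode_axis(x, width, split), encode_axis(y, height, split)


# Illustrative key point inside a 64 x 64 crop.
x_label, y_label = encode_point(12.3, 40.7, width=64, height=64)
x_hat, y_hat = decode_axis(x_label), decode_axis(y_label)
```

With two bins per pixel, the decoded coordinate differs from the original by at most a quarter of a pixel for points inside the crop, which is how this kind of encoding reaches sub-pixel resolution without a dense 2D heatmap.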
Finally, the complete UniFormer-CGS architecture, which integrates all core components (the UniFormer backbone network, CBAM attention mechanism, GFHL function, and SimCCLabel encoding strategy), achieves a peak performance of 91.4% in the most representative high-confidence detection metric (mAP@0.5). This outstanding result strongly proves the overall effectiveness of the collaborative work of various technical modules: the UniFormer provides robust basic feature extraction capabilities; the CBAM attention mechanism enhances the ability to focus on key regions such as spore body endpoints and germ tube tips, effectively suppressing background interference; the GFHL function optimizes the model’s learning process for regions with blurred boundaries and variable morphologies; and the SimCCLabel enables efficient and accurate coordinate modeling. This deep integration and synergy across multiple dimensions—including feature extraction, attention guidance, loss optimization, and coordinate encoding—explain the ability of the UniFormer-CGS architecture to achieve high-precision, robust key point localization under complex and variable microscopic imaging conditions.

4.3. Effects of Different Agents and Concentrations on Spore Germination

Based on the complete framework, the spore germination rate (Ag/(Ag + Ng)) and germination degree ($R_{germ}$) of the control group and the triazole fungicide-treated groups (5, 6, and 7 ppm) were calculated. The comparison between the manual counting results and the model prediction results is shown in Table 4, and the visualizations are presented in Figure 5 and Figure 6.
Experimental results demonstrated that all tested triazole fungicides significantly inhibited the germination of Fusarium graminearum spores. Compared with the control group (germination rate: 90.44%, germination degree: 67.36%), the germination rate and germination degree of the fungicide-treated groups decreased to 56.82–80.12% and 29.63–61.49%, respectively. Among the fungicides, Prochloraz exhibited the strongest inhibitory activity, yielding the lowest average germination rate (57.34%) and germination degree (31.53%), followed by Prothioconazole (average germination rate: 76.25%).
In terms of concentration dependence, the inhibitory effects of Prochloraz and Tebuconazole increased with concentration. For example, the germination degree under Prochloraz decreased from 32.39% in the 5 ppm treatment group to 29.63% in the 7 ppm group. By contrast, Prothioconazole showed a rebound at 6 ppm, with a germination rate of 80.12%, higher than the 72.49% observed at 5 ppm. This non-monotonic trend may be related to the specific mechanism of action of the fungicide.
Comparison between manual counts and model predictions verified the reliability of the framework. The relative error of germination rate prediction does not exceed 5.18%, and the error range of germination degree is within 16.85% (the latter error is mainly caused by measurement differences of tiny germ tubes). The 5 ppm Prothioconazole group demonstrated the highest consistency, with a relative error in germination rate prediction of only 0.18%, underscoring the model’s high-precision quantitative capability under specific conditions.
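For reference, the germination rate and the comparison against manual counts can be expressed as below. The counts are hypothetical and are not taken from Table 4, and the relative error is assumed to be the absolute difference divided by the manual value, since the exact formula is not stated in the text.

```python
def germination_rate(n_ag: int, n_ng: int) -> float:
    """Germination rate Ag / (Ag + Ng), in percent."""
    total = n_ag + n_ng
    if total == 0:
        raise ValueError("no spores counted")
    return n_ag / total * 100.0


def relative_error(predicted: float, manual: float) -> float:
    """Assumed definition: |predicted - manual| / manual, in percent."""
    return abs(predicted - manual) / manual * 100.0


# Hypothetical counts for one treatment group (not from Table 4).
manual_rate = germination_rate(n_ag=90, n_ng=10)
predicted_rate = germination_rate(n_ag=88, n_ng=12)
error = relative_error(predicted_rate, manual_rate)
```

In this made-up example, a two-spore disagreement out of 100 yields a relative error of roughly 2.2%, which gives a sense of scale for the reported bound of 5.18%.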

4.4. Comparison of Model Visualization Results

The comparative analysis of model visualization (Figure 7) demonstrates the actual performance of the proposed framework in identifying the germination status of Fusarium graminearum spores and localizing key points under different lighting conditions. As illustrated in Figure 7, the framework demonstrates strong recognition capability under both bright and dark microscopic backgrounds, effectively distinguishing germinated (Ag) from non-germinated (Ng) spores and completing the corresponding key point annotations.
For individual spores identified as non-germinated (Ng), the visualization results clearly show that the model only labels them as the Ng category, without the need to annotate key points. This aligns with the framework’s design logic: since Ng spores lack germ tube structures, their germination degree is consistently defined as 0. Therefore, there is no need for key point localization and subsequent length calculation, thereby avoiding unnecessary computational redundancy.
These visualization results further demonstrate the robustness of the model in complex microscopic imaging environments. Whether under bright or dark background conditions, the model can stably perform its core functions: first, the target detection module accurately locates spore instances and classifies their germination status (Ag/Ng) correctly; subsequently, for spores identified as Ag, the key point localization module (UniFormer-CGS) can precisely mark the key point positions of the two ends of the spore body (Head1, Head2) and the tips of the germ tubes (Tail1, Tail2); finally, the phenotypic quantification module calculates the germination degree based on these key point coordinates. The framework’s stable performance under varying light intensities can be attributed to its integrated technologies: data augmentation (e.g., random brightness adjustment) enhanced generalization to illumination changes; CBAM attention improved focus on critical regions such as spore bodies and germ tube tips; and GFHL optimized localization accuracy in boundary-blurred regions. The comparative results in Figure 7 intuitively verify the effectiveness of this integrated framework in addressing the challenge of uneven illumination, which is common in practical microscopic imaging.

5. Discussion

5.1. Comparison with Existing Methods

The EffiFormer-CGS framework proposed in this study offers substantial advantages over existing methods by addressing the core challenges of microscopic image analysis for Fusarium graminearum spore germination—specifically, achieving high-precision object detection, fine-grained key point localization, and robust phenotypic quantification simultaneously.
(1) Compared with traditional manual detection: Traditional microscopic counting and morphological measurement methods are highly dependent on the experience of operators, with cumbersome processes, low throughput, and strong subjectivity. Errors increase markedly when dealing with clustered or overlapping spores or when identifying tiny germ tubes, limiting the suitability of traditional methods for high-throughput fungicide screening [27]. The framework proposed in this study realizes full-process automation, improves the analysis efficiency by several orders of magnitude, and significantly enhances the objectivity and consistency in the determination of germination status (Ag/Ng) and the quantification of germination degree ($R_{germ}$) (as shown in Table 4, the relative error of germination rate prediction is ≤5.18%). This provides reliable technical support for large-scale and dynamic monitoring of spore germination phenotypes.
(2) Compared with the “hybrid paradigm” of classical image processing: Traditional methods based on threshold segmentation, edge detection, morphological operations, and geometric fitting tend to fail in their rule-based pipelines under complex scenarios such as uneven illumination, focal plane fluctuation, variable spore morphologies (especially after being affected by agents), and severe overlapping. This leads to segmentation drift and scale sensitivity, making it difficult to stably characterize the fine geometric relationship between the spore body and germ tubes (e.g., Head/Tail point localization). The proposed framework, as a deep learning–driven end-to-end solution, leverages the multi-scale detection capability of EfficientDet to effectively mitigate overlap and background interference [28]. Furthermore, by virtue of the global–local modeling (UniFormer), salient region focusing (CBAM), optimized loss for regions with blurred/abrupt boundaries (GFHL), and high-precision low-memory coordinate encoding (SimCCLabel) integrated in UniFormer-CGS, the framework achieves sub-pixel-level accurate localization of spore key points (Head1, Head2, Tail1, Tail2). This approach fundamentally overcomes the limitations of traditional methods in quantifying germ tube microstructures and establishes a solid foundation for accurate calculation of germination degree (the contributions of each module are verified by the ablation experiments in Table 3).
(3) Compared with existing deep learning object detection or segmentation models: As shown in Table 2, the proposed framework adopts the improved EfficientDet, which significantly outperforms other mainstream detection models (e.g., YOLOv8s, ViTDet, DiffusionDet) in terms of the comprehensive metric mAP@0.5:0.95. This indicates its advantage in detecting tiny spore targets. Importantly, most existing methods emphasize spore counting or binary classification (germinated vs. non-germinated), while few address accurate quantitative analysis of germ tube extension length, a key dynamic phenotype [29]. This study fills this gap by integrating the object detection, key point localization, and geometric calculation modules, connecting the complete information chain from “whether germination occurs” to “how much germination occurs”. It thereby provides a highly sensitive indicator ($R_{germ}$) for evaluating the inhibitory effects of agents on germ tube extension.
We implemented this quantitative analysis via key point localization rather than semantic or instance segmentation based on several technological considerations specific to our goal. The germination degree ($R_{germ}$) is fundamentally a ratio of lengths, requiring precise coordinates of endpoints (Head/Tail points) rather than full pixel-wise masks. While segmentation excels at shape delineation, extracting accurate lengths from segmented germ tubes (especially thin, faint, or curved ones) introduces additional challenges: it requires flawless boundary detection followed by error-prone post-processing steps like skeletonization and endpoint extraction. In contrast, our key point localization framework offers a more direct and robust pathway: (i) Direct Regression: It simplifies the task to directly predicting the coordinates of the definitive geometric points, avoiding the cascading errors from segmentation and subsequent processing. (ii) Precision Focus: Techniques like SimCCLabel and the GFHL function are engineered to achieve sub-pixel coordinate accuracy specifically for these key points, yielding more reliable length measurements. (iii) Inherent Structural Encoding: The annotation scheme inherently encodes the semantic connection between the spore body and germ tube (via paired points), a relational feature not directly provided by segmentation outputs. Therefore, for the precise objective of calculating a length-based phenotypic ratio, a dedicated key point localization approach is more efficient, accurate, and functionally aligned than a segmentation-based pipeline.

5.2. Biological Significance and Implications

The results of this study are of phytopathological significance and provide valuable insights for the development of Fusarium head blight (FHB) prevention and control strategies as well as new fungicides.
(1) Revealing differences in inhibitory effects among triazole fungicides: The experimental results (Table 4, Figure 5 and Figure 6) clearly quantify the inhibitory effects of different triazole agents and their concentrations on the germination of Fusarium graminearum spores [30,31]. Prochloraz exhibited the strongest inhibitory activity (with the lowest average germination rate of 57.34% and average germination degree of 31.53%), followed by Prothioconazole and Tebuconazole. Prothioconazole exhibited a slight rebound in germination rate at 6 ppm (higher than at 5 ppm), suggesting that its mechanism of action may involve specificity or concentration-threshold effects, warranting further molecular-level investigation (e.g., target protein affinity and metabolic pathways). Tebuconazole displayed a trend of concentration-dependent enhancement in inhibition. These quantitative data provide a direct basis for pesticide selection, dosage optimization, and resistance risk assessment in field applications.
(2) Comparative Advantage of Germination Degree over Germination Rate: This study introduces and validates the “germination degree” ($R_{germ}$) as a novel phenotypic metric, which offers distinct advantages over the conventional binary germination rate (Ag%). The germination rate provides a fundamental measure of whether germination initiation is inhibited, classifying spores simply as germinated or not. In contrast, $R_{germ}$ quantifies the extent of germination by measuring the relative elongation of the germ tube. This continuous variable captures subtler phenotypic responses. For instance, a fungicide may not completely block germination (resulting in a moderately reduced germination rate) but can severely impair germ tube extension (leading to a dramatically lower $R_{germ}$). As evidenced in our results (Table 4, Figure 5 and Figure 6), $R_{germ}$ revealed more pronounced differences between fungicide treatments and concentrations compared to the germination rate alone, particularly for Prothioconazole and Tebuconazole. This enhanced sensitivity makes $R_{germ}$ a superior indicator for early fungicide efficacy screening, as it directly measures the suppression of the pathogen’s invasive structure (the germ tube), whose length is closely correlated with host infection potential. Therefore, integrating $R_{germ}$ with the traditional germination rate provides a more comprehensive phenotypic profile for evaluating antifungal compounds and understanding resistance mechanisms.
(3) Value of germ tube morphology as a core phenotypic indicator: This study emphasizes and successfully quantifies the “germination degree” ($R_{germ}$), defined as the percentage of the germ tube extension length relative to the spore body length. Compared with the simple germination rate (Ag%), $R_{germ}$ can more sensitively reflect the inhibitory effect of agents on the extension of germ tubes, a key invasive structure of spores after germination. Germ tube formation and elongation are critical steps in successful host infection and are directly linked to pathogen virulence [10]. Thus, as a dynamic and continuous quantitative indicator, germination degree offers unique advantages for evaluating early-stage fungicide effects (e.g., blocking infection during anthesis), screening highly effective fungicides, and investigating mechanisms of pathogen resistance.
(4) Enabling high-throughput screening and resistance research: The efficiency and automation of this framework position it as an ideal tool for large-scale, systematic screening of novel compounds or formulations for their inhibitory effects on spore germination and early infection. The precise phenotypic data (germination rate, germination degree) it provides can be combined with molecular biology and biochemistry data to deeply analyze the molecular targets and mechanisms of action of agents [32]. Moreover, long-term monitoring of germination phenotypes in field strains under fungicide pressure can provide critical phenomic evidence for early detection and assessment of resistance development. The application scenarios of this framework are shown in Figure 8.

5.3. Limitations and Potential Improvements

Although the proposed framework has achieved significant results in spore germination phenotype analysis, it still has certain limitations, which point out directions for future research:
(1) Simplification of germ tube morphology:
The current method calculates germ tube length as the Euclidean distance between the annotated Head and Tail points, effectively approximating the tube as a straight line segment. While this is a practical simplification, it inevitably introduces a quantification error when germ tubes exhibit curvature. The magnitude of this error is contingent upon the degree of curvature: for a slightly curved tube, the straight-line distance provides a good estimate; however, for a highly sinuous germ tube, the linear approximation will systematically underestimate the true contour length. This leads to a corresponding underestimation of the absolute germ tube length and, consequently, the germination degree ($R_{germ}$). Regarding the comparative sensitivity of the $R_{germ}$ metric (e.g., for assessing differential fungicide efficacy), the impact of this linear approximation is context-dependent. If the curvature pattern is relatively consistent across different treatment groups, the systematic underestimation may affect absolute values but could preserve the relative differences in $R_{germ}$ between treatments. The comparative sensitivity of $R_{germ}$ would be more significantly compromised if different treatments induce fundamentally different germ tube morphologies (such as one causing straight tubes and another prompting highly curved growth), as the linear length metric would no longer be a fair basis for comparison.
It is also important to note that our primary validation focused on the early germination stage (2.0–2.5 h), which is most critical for assessing initial fungicide inhibition. While the detection and keypoint localization modules are architecturally capable of operating on images from later stages (e.g., 5.0–5.5 h), where germ tubes are longer and more curved, the accuracy of the quantification (specifically, the linear length approximation) for such complex morphologies requires further investigation and validation. Extending the framework for precise dynamic monitoring across the entire germination time course remains an important direction for future work.
Potential improvements include expanding key point annotation rules (e.g., incorporating intermediate or branch points) and employing curve-fitting algorithms (e.g., Bézier or spline curves) to more accurately estimate germ tube length.
(2) Applicability to other fungicide classes: Another notable limitation concerns the scope of our experimental dataset, which was constructed using three triazole fungicides (Prochloraz, Prothioconazole, and Tebuconazole). While triazoles represent a major class of fungicides widely deployed against Fusarium graminearum, other chemical classes (e.g., strobilurins, benzimidazoles, or succinate dehydrogenase inhibitors) may induce fundamentally different morphological alterations or germination disorders in spores, such as severe curling, swelling, or multiple aberrant germ tubes. The current version of EffiFormer-CGS has been trained and validated primarily on triazole-induced phenotypes; therefore, its performance on spores treated with other fungicide classes cannot be guaranteed without further adaptation. However, the architecture itself—particularly the key point localization module—is designed to learn discriminative features from annotated structures. Future work should focus on expanding the training dataset to include spores exposed to diverse fungicide chemistries. This would enhance the framework’s robustness and generalizability, enabling its application in comprehensive, high-throughput fungicide screening pipelines across multiple modes of action.
(3) Adaptability to complex morphologies: Framework performance may be reduced when spores are densely overlapping, severely deformed, or exhibit atypical morphologies. Future improvements could include incorporating a more advanced instance segmentation module to complement object detection, exploring graph neural networks (GNNs) to model spatial relationships among spores, and expanding datasets with extreme morphologies for targeted training.
(4) Lack of 3D Information: Because microscopic images are essentially 2D projections of 3D structures, germ tubes outside the focal plane may not be fully captured or accurately localized, potentially biasing measurements. Future improvements may involve integrating Z-stack microscopy and 3D reconstruction algorithms to capture the spatial characteristics of spore germination.
(5) Cross-species Generalization: The current model was primarily trained and validated on Fusarium graminearum spores. Future work should evaluate the applicability of the framework to other plant pathogenic fungi (e.g., rust fungi, powdery mildew fungi, Magnaporthe oryzae) with similar germination structures (e.g., germ tubes, appressoria), and construct multi-species datasets for transfer learning or joint training to enhance generalization.
(6) Real-Time Performance Optimization: Although efficiency was incorporated into the model design (e.g., EfficientDet, SimCCLabel), further optimization is needed for deployment on mobile devices or online monitoring systems. Future improvements may include model lightweighting (e.g., knowledge distillation, pruning, quantization) and hardware-accelerated deployment (e.g., TensorRT, ONNXRuntime).
(7) Quantitative analysis under extreme overlap: While the current framework demonstrates robustness through high overall performance on a test set containing common overlap scenarios, its performance limits under conditions of extreme spore overlap have not been isolated and quantified. Future work should involve constructing a dedicated benchmark with annotated levels of overlap severity to perform a granular analysis of performance degradation, providing a more stringent stress test for model robustness.
(8) Comprehensive Performance Benchmarking: The current study primarily focuses on validation of accuracy and biological application. A comprehensive benchmark of the end-to-end inference speed across diverse hardware platforms, along with a comparative analysis against other lightweight architectures, was not within the scope of this work but represents a valuable direction for future research. Such studies would further optimize the framework for scenarios demanding real-time processing or deployment on resource-constrained edge devices.
(9) Broader Benchmarking of Keypoint Localization Architectures: We selected RTMPose as the baseline for the keypoint localization ablation experiments because it is a modern, high-performance architecture that has reported state-of-the-art results on general pose estimation benchmarks, outperforming established models such as HRNet and SimpleBaseline. Keypoint localization in human pose estimation and in microscopic spore analysis both require precise detection of fine-grained structures under potential occlusion and scale variation, so we considered RTMPose a strong and relevant benchmark for evaluating the proposed UniFormer-CGS module. The ablation study (Table 3) shows the incremental benefits of the integrated components (UniFormer, CBAM, GFHL, SimCCLabel) over this baseline. Nevertheless, a more extensive comparison with a wider array of architectures (e.g., HRNet, SimpleBaseline, or other Transformer-based models) could provide additional context and further validate the generalizability of the approach. Such benchmarking remains a valuable direction for future research, particularly for applications requiring deployment across diverse imaging conditions or pathogen species.
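As an illustration of the overlap-severity annotation proposed in item (7), the following minimal sketch assigns each detected spore a severity level from its bounding box. The box format (x1, y1, x2, y2), the score (largest fraction of a box covered by any single neighbouring box), and the thresholds 0.1 and 0.4 are assumptions made for illustration and are not values from this study.

```python
from typing import List, Tuple

# Hypothetical box format: (x1, y1, x2, y2) in pixels, with x1 <= x2 and y1 <= y2.
Box = Tuple[float, float, float, float]


def box_area(a: Box) -> float:
    """Area of an axis-aligned box (0 for degenerate boxes)."""
    return max(a[2] - a[0], 0.0) * max(a[3] - a[1], 0.0)


def intersection_area(a: Box, b: Box) -> float:
    """Area shared by two axis-aligned boxes (0 if they do not overlap)."""
    w = min(a[2], b[2]) - max(a[0], b[0])
    h = min(a[3], b[3]) - max(a[1], b[1])
    return max(w, 0.0) * max(h, 0.0)


def overlap_fraction(box: Box, others: List[Box]) -> float:
    """Largest fraction of `box` covered by any single box in `others`."""
    area = box_area(box)
    if area == 0.0 or not others:
        return 0.0
    return max(intersection_area(box, o) for o in others) / area


def severity_level(fraction: float, mild: float = 0.1, severe: float = 0.4) -> str:
    """Bin an overlap fraction into a label; the thresholds are illustrative."""
    if fraction < mild:
        return "none"
    if fraction < severe:
        return "moderate"
    return "extreme"


def label_image(boxes: List[Box]) -> List[str]:
    """Assign a severity label to every box in one image."""
    labels = []
    for i, box in enumerate(boxes):
        others = boxes[:i] + boxes[i + 1:]
        labels.append(severity_level(overlap_fraction(box, others)))
    return labels
```

Reporting detection and key point metrics separately for each severity level would then expose where performance begins to degrade. A mask-based or key-point-based overlap measure may suit elongated germ tubes better than bounding boxes.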

6. Conclusions

This study designed and validated the EffiFormer-CGS three-module fusion framework, which innovatively integrates object detection (EfficientDet), key point localization (UniFormer-CGS), and phenotypic quantification (geometric calculation). The framework effectively addresses the dual task of extracting qualitative information (“whether germination occurs”) and quantitative information (“germination degree”) from microscopic images of Fusarium head blight (FHB) spore germination. Experimental results demonstrate that the framework achieves high robustness and precision under complex microscopic conditions (e.g., overlap, uneven illumination), substantially outperforming traditional manual methods and mainstream models. Through systematic analysis of three triazole fungicides, the framework accurately quantified their inhibitory effects on spore germination rate and germ tube extension length, revealed activity differences and concentration dependence among agents, and highlighted germination degree as a high-sensitivity phenotypic indicator for evaluating early fungicide effects.
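The geometric calculation step can be illustrated with a minimal sketch. Germination degree is taken, as defined in the abstract, to be the ratio of germ tube length to spore body length. The key point layout used here (ordered points along the spore body and along the germ tube) is an assumption for illustration; the annotation scheme actually used by the framework may differ.

```python
import math
from typing import Sequence, Tuple

Point = Tuple[float, float]  # (x, y) in pixels


def polyline_length(points: Sequence[Point]) -> float:
    """Sum of straight-line distances between consecutive key points."""
    return sum(math.dist(p, q) for p, q in zip(points, points[1:]))


def germination_degree(body: Sequence[Point], tube: Sequence[Point]) -> float:
    """Ratio of germ tube length to spore body length.

    `body` and `tube` are hypothetical ordered key point lists; a spore
    without a germ tube can pass an empty `tube` and receives 0.0.
    """
    body_len = polyline_length(body)
    if body_len == 0:
        raise ValueError("spore body needs at least two distinct key points")
    return polyline_length(tube) / body_len
```

For example, a body of length 10 px with a germ tube traced by the points (10, 0) and (13, 4) gives a tube length of 5 px and a germination degree of 0.5, which corresponds to 50% in the units of Table 4.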
The framework provides a powerful phenomic analysis tool for agricultural plant protection, particularly for high-throughput fungicide screening, resistance monitoring and mechanism research, and studies of pathogenic infection biology in airborne fungal diseases such as Fusarium head blight. Future work will aim to address current limitations (e.g., modeling complex morphologies and integrating 3D information) and extend the application of this system to the study of germination and early infection processes in other important plant pathogens. By integrating imageomics with multi-omics big data analysis, the framework could provide more comprehensive technical support for advancing understanding of pathogen–host interactions and developing green, sustainable prevention and control strategies.

Author Contributions

Z.W.: Writing – review & editing, Writing – original draft, Visualization, Validation, Software, Methodology, Investigation, Formal analysis, Data curation, and Conceptualization. X.B.: Writing – review & editing, Writing – original draft, Visualization, Validation, Software, Methodology, Formal analysis, and Data curation. T.C.: Supervision, Methodology, and Conceptualization. Z.D.: Supervision, Methodology, and Conceptualization. D.H.: Writing – review & editing, Writing – original draft, Supervision, Visualization, Software, and Methodology. D.Z.: Writing – review & editing, Supervision, Project administration, Funding acquisition, and Conceptualization. S.X.: Visualization. T.G.: Visualization. X.Y.: Resources. C.G.: Resources and Funding acquisition. All authors have read and agreed to the published version of the manuscript.

Funding

This work was supported by the National Natural Science Foundation of China (Grant No. 42271364) and the Transformation and Application Special Project in Agricultural Scientific and Technological Achievements of Anhui Province (Grant No. 2024ZH004).

Data Availability Statement

The original contributions presented in this study are included in the article. Further inquiries can be directed to the corresponding authors.

Conflicts of Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Appendix A

Table A1. Model training parameters.
| Module | Input Resolution | Initial Learning Rate | Pretrained Model | Backbone Version | Batch Size | Epochs | Loss Weights (cls/bbox) |
|---|---|---|---|---|---|---|---|
| Target Detection Module | 640 × 640 | 4 × 10⁻⁴ | OpenMMLab EfficientDet-B0 | EfficientDet-B0 | 32 | 300 | 1.0/50.0 |
| Key Point Localization Module | 256 × 192 | 5 × 10⁻⁴ | No | - | 64 | 210 | - |

References

  1. Inbaia, S.; Farooqi, A.; Ray, R.V. Aggressiveness and Mycotoxin Profile of Fusarium Avenaceum Isolates Causing Fusarium Seedling Blight and Fusarium Head Blight in UK Malting Barley. Front. Plant Sci. 2023, 14, 1121553. [Google Scholar] [CrossRef] [PubMed]
  2. Blackwell, B.A.; Schneiderman, D.; Thapa, I.; Bosnich, W.; Pimentel, K.; Kebede, A.Z.; Reid, L.M.; Harris, L.J. Assessment of Deoxynivalenol and Deoxynivalenol Derivatives in Fusarium Graminearum-Inoculated Canadian Maize Inbreds. Can. J. Plant Pathol. 2022, 44, 504–517. [Google Scholar] [CrossRef]
  3. Wang, A.; Shang, Z.; Jiang, R.; Zhang, M.; Wang, J.; Li, H.; Zhang, B.; Tang, H.; Xu, F.; Hu, X.; et al. Development and Application of a qPCR-Based Method Coupled with Spore Trapping to Monitor Airborne Pathogens of Wheat Causing Stripe Rust, Powdery Mildew, and Fusarium Head Blight. Plant Dis. 2025, 109, 257–264. [Google Scholar] [CrossRef] [PubMed]
  4. Kim, T.; Yeo, B.C. Recovering Microscopic Images in Material Science Documents by Image Inpainting. Appl. Sci. 2023, 13, 4071. [Google Scholar] [CrossRef]
  5. Ballesteros, D.; Hill, L.M.; Lynch, R.T.; Pritchard, H.W.; Walters, C. Longevity of Preserved Germplasm: The Temperature Dependency of Aging Reactions in Glassy Matrices of Dried Fern Spores. Plant Cell Physiol. 2019, 60, 376–392. [Google Scholar] [CrossRef]
  6. Camerlingo, C.; Di Meo, G.; Lepore, M.; Lisitskiy, M.; Poli, A.; Portaccio, M.; Romano, I.; Di Donato, P. Graphene-Based and Surface-Enhanced Raman Spectroscopy for Monitoring the Physio-Chemical Response of Thermophilic Bacterial Spores to Low Temperatures Exposure. Sensors 2020, 20, 4150. [Google Scholar] [CrossRef]
  7. Zhang, D.; Tao, W.; Cheng, T.; Zhou, X.; Hu, G.; Qiao, H.; Guo, W.; Wang, Z.; Gu, C. GSD-YOLO: A Lightweight Decoupled Wheat Scab Spore Detection Network Based on Yolov7-Tiny. Agriculture 2024, 14, 2278. [Google Scholar] [CrossRef]
  8. Zhang, D.; Zhang, W.; Cheng, T.; Lei, Y.; Qiao, H.; Guo, W.; Yang, X.; Gu, C. Segmentation of Wheat Scab Fungus Spores Based on CRF_ResUNet++. Comput. Electron. Agric. 2024, 216, 108547. [Google Scholar] [CrossRef]
  9. Jin, L.; Wang, X.; Nie, X.; Wang, W.; Guo, Y.; Yan, S.; Zhao, J. Rethinking the Person Localization for Single-Stage Multi-Person Pose Estimation. IEEE Trans. Multimed. 2024, 26, 1436–1447. [Google Scholar] [CrossRef]
  10. Praetorius, J.-P.; Hitzler, S.U.J.; Gresnigt, M.S.; Figge, M.T. Image-Based Quantification of Candida Albicans Filamentation and Hyphal Length Using the Open-Source Visual Programming Language JIPipe. FEMS Yeast Res. 2025, 25, foaf011. [Google Scholar] [CrossRef]
  11. Genze, N.; Bharti, R.; Grieb, M.; Schultheiss, S.J.; Grimm, D.G. Accurate Machine Learning-Based Germination Detection, Prediction and Quality Assessment of Three Grain Crops. Plant Methods 2020, 16, 157. [Google Scholar] [CrossRef]
  12. Zeng, W.; He, M. Rice Disease Segmentation Method Based on CBAM-CARAFE-DeepLabv3+. Crop Prot. 2024, 180, 106665. [Google Scholar] [CrossRef]
  13. Liu, Y.; Wang, G.; Dong, H.; Chen, C. SimCC Coordinate Based Learning of Human Pose Constraint Information. Digit. Signal Prog. 2024, 144, 104286. [Google Scholar] [CrossRef]
  14. Yang, A.; Wang, T.; Gan, W.; Lai, H.; Lu, K.; Hao, C.; Xu, Z.; Zeng, R.; Wang, Z.; Ran, Z.; et al. Efficient Adsorption of Triazole Fungicides Using a Porous Organic Polymer with Imine/Aminal Linkages. Sep. Purif. Technol. 2025, 354, 129117. [Google Scholar] [CrossRef]
  15. Roman, D.L.; Voiculescu, D.I.; Filip, M.; Ostafe, V.; Isvoran, A. Effects of Triazole Fungicides on Soil Microbiota and on the Activities of Enzymes Found in Soil: A Review. Agriculture 2021, 11, 893. [Google Scholar] [CrossRef]
  16. Huang, J.-T.; Ting, C.-H. Deep Learning Object Detection Applied to Defect Recognition of Memory Modules. Int. J. Adv. Manuf. Technol. 2022, 121, 8433–8445. [Google Scholar] [CrossRef]
  17. Ma, J.; Hu, C.; Zhou, P.; Jin, F.; Wang, X.; Huang, H. Review of Image Augmentation Used in Deep Learning-Based Material Microscopic Image Segmentation. Appl. Sci. 2023, 13, 6478. [Google Scholar] [CrossRef]
  18. Kawai, S.; Nakano, M. Analysis of Vegetative Cells and Spore Germination in Food Using Flow Cytometry. J. Microbiol. Methods 2025, 232, 107137. [Google Scholar] [CrossRef]
  19. Saleh, M.A.; Ameen, Z.S.; Altrjman, C.; Al-Turjman, F. Computer-Vision-Based Statue Detection with Gaussian Smoothing Filter and EfficientDet. Sustainability 2022, 14, 11413. [Google Scholar] [CrossRef]
  20. Wang, J.; Sun, K.; Cheng, T.; Jiang, B.; Deng, C.; Zhao, Y.; Liu, D.; Mu, Y.; Tan, M.; Wang, X.; et al. Deep High-Resolution Representation Learning for Visual Recognition. IEEE Trans. Pattern Anal. Mach. Intell. 2021, 43, 3349–3364. [Google Scholar] [CrossRef]
  21. Li, K.; Wang, Y.; Zhang, J.; Gao, P.; Song, G.; Liu, Y.; Li, H.; Qiao, Y. UniFormer: Unifying Convolution and Self-Attention for Visual Recognition. IEEE Trans. Pattern Anal. Mach. Intell. 2023, 45, 12581–12600. [Google Scholar] [CrossRef] [PubMed]
  22. Chakraborty, S.; Mali, K. Microscopic Image Segmentation Approach Based on Modified Affinity Propagation-Based Clustering. Multimed. Tools Appl. 2024, 83, 78161–78182. [Google Scholar] [CrossRef]
  23. Kong, H.; Yuan, Z.; Zhou, H.; Liang, G.; Yan, Z.; Cheng, G.; Hu, Z. Synthetic High-Energy Computed Tomography Image via a Wasserstein Generative Adversarial Network with the Convolutional Block Attention Module. Quant. Imaging Med. Surg. 2023, 13, 4365–4379. [Google Scholar] [CrossRef] [PubMed]
  24. Zhang, X.; Jiang, M.; Chen, H.; Zheng, J.; Pan, Z. Incorporating Geometry Knowledge into an Incremental Learning Structure for Few-Shot Intent Recognition. Knowl.-Based Syst. 2022, 251, 109296. [Google Scholar] [CrossRef]
  25. Li, X.; Lv, C.; Wang, W.; Li, G.; Yang, L.; Yang, J. Generalized Focal Loss: Towards Efficient Representation Learning for Dense Object Detection. IEEE Trans. Pattern Anal. Mach. Intell. 2023, 45, 3139–3153. [Google Scholar] [CrossRef]
  26. Zheng, Q.; Guo, H.; Yin, Y.; Zheng, B.; Jiang, H. LFSimCC: Spatial Fusion Lightweight Network for Human Pose Estimation. J. Vis. Commun. Image Represent. 2024, 99, 104093. [Google Scholar] [CrossRef]
  27. Roshanfekrrad, M.; Papadopoulos, C.; Calonne-Salmon, M.; Schneider, C.; Zhang, K.; Karpouzas, D.; Declerck, S. Development of a High-Throughput Spore Germination Test to Assess the Toxicity of Pesticides on Arbuscular Mycorrhizal Fungi. Mycorrhiza 2025, 35, 38. [Google Scholar] [CrossRef]
  28. Zheng, X.; Chen, F.; Lou, L.; Cheng, P.; Huang, Y. Real-Time Detection of Full-Scale Forest Fire Smoke Based on Deep Convolution Neural Network. Remote Sens. 2022, 14, 536. [Google Scholar] [CrossRef]
  29. Cardini, A.; Pellegrino, E.; Del Dottore, E.; Gamper, H.A.; Mazzolai, B.; Ercoli, L. HyLength: A Semi-Automated Digital Image Analysis Tool for Measuring the Length of Roots and Fungal Hyphae of Dense Mycelia. Mycorrhiza 2020, 30, 229–242. [Google Scholar] [CrossRef]
  30. Chen, J.; Wei, J.; Fu, L.; Wang, S.; Liu, J.; Guo, Q.; Jiang, J.; Tian, Y.; Che, Z.; Chen, G.; et al. Tebuconazole Resistance of Fusarium Graminearum Field Populations from Wheat in Henan Province. J. Phytopathol. 2021, 169, 525–532. [Google Scholar] [CrossRef]
  31. Ortiz, S.C.; Easter, T.; Valero, C.; Bromley, M.J.; Bertuzzi, M. A Microscopy-Based Image Analysis Pipeline for the Quantification of Germination of Filamentous Fungi. Fungal Genet. Biol. 2025, 176, 103942. [Google Scholar] [CrossRef]
  32. Yin, Y.; Miao, J.; Shao, W.; Liu, X.; Zhao, Y.; Ma, Z. Fungicide Resistance: Progress in Understanding Mechanism, Monitoring, and Management. Phytopathology 2023, 113, 707–718. [Google Scholar] [CrossRef]
Figure 1. Overall flow chart.
Figure 2. EffiFormer-CGS three-module fusion framework.
Figure 3. Architecture of the UniFormer-CGS key point localization network.
Figure 4. Workflow diagram of EffiFormer-CGS.
Figure 5. Comparison chart of spore germination rate and germination degree under different agents.
Figure 6. Graph of relative errors between manual counting and machine prediction under different agents.
Figure 7. Model recognition results under bright and dark background.
Figure 8. Application scenarios of this framework.
Table 1. Grouping setup for Fusarium graminearum spore treatment under different fungicide concentrations and microscopic magnifications.
| Fungicide Group | Magnification | Time Point | Concentration (ppm) | Number of Groups |
|---|---|---|---|---|
| Control Group | 10× | 2.0–2.5 h, 5.0–5.5 h, 0.5–2.5 h | No Fungicide | 3 |
| Control Group | 40× | 2.0–2.5 h | No Fungicide | 1 |
| Prochloraz | 10× | 2.0–2.5 h | 5, 6, 7 | 3 |
| Prochloraz | 10× | 0.5–2.5 h | 6 | 1 |
| Prochloraz | 40× | 2.0–2.5 h | 5, 7 | 2 |
| Prothioconazole | 10× | 2.0–2.5 h | 5, 6, 7 | 3 |
| Prothioconazole | 10× | 0.5–2.5 h | 6 | 1 |
| Prothioconazole | 40× | 2.0–2.5 h | 5, 6, 7 | 3 |
| Tebuconazole | 10× | 2.0–2.5 h | 5, 6, 7 | 3 |
| Tebuconazole | 10× | 0.5–2.5 h | 6 | 1 |
| Tebuconazole | 40× | 2.0–2.5 h | 5, 6, 7 | 3 |
Table 2. Target detection results.
| Model | mAP@0.5:0.95/% | mAP@0.5/% |
|---|---|---|
| EfficientDet | 90.8 | 93.4 |
| DiffusionDet | 78.6 | 94.3 |
| RmDet | 58.4 | 71.5 |
| VitDet | 52.5 | 79.1 |
| YOLOv8s | 68.7 | 84.4 |
Table 3. Ablation experiment results of the key point localization module (unit: %; √: used, ×: not used).
| RTMPose / UniFormer / CBAM / GFHL / SCL | mAP@0.5:0.95 | mAP@0.5 |
|---|---|---|
| ×××× | 81.0 | 85.6 |
| ×××× | 81.0 | 87.1 |
| ××× | 82.2 | 87.2 |
| ××× | 82.6 | 86.5 |
| ××× | 83.7 | 88.3 |
| ×× | 85.2 | 89.5 |
| ×× | 84.3 | 86.6 |
| ×× | 83.9 | 88.7 |
| × | 83.7 | 91.4 |
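Table 4. Effects of different chemical agents and concentrations on spore germination.
| Reagent | Concentration (ppm) | Germination Rate, Manual/% | Germination Rate, Machine/% | Germination Rate, Average/% | Germination Rate, Relative Error/% | Germination Degree, Manual/% | Germination Degree, Machine/% | Germination Degree, Average/% | Germination Degree, Relative Error/% |
|---|---|---|---|---|---|---|---|---|---|
| Control Group | No Reagent | 89.37 | 91.51 | 90.44 | 2.39 | 64.07 | 70.65 | 67.36 | 10.27 |
| Prochloraz | 5 | 58.33 | 55.31 | 56.82 | 5.18 | 35.37 | 29.41 | 32.39 | 16.85 |
| Prochloraz | 6 | 58.32 | 56.25 | 57.29 | 3.55 | 34.89 | 30.23 | 32.56 | 13.36 |
| Prochloraz | 7 | 58.47 | 59.35 | 58.91 | 1.51 | 31.41 | 27.84 | 29.63 | 11.37 |
| Prothioconazole | 5 | 72.55 | 72.42 | 72.49 | 0.18 | 60.03 | 62.94 | 61.49 | 4.85 |
| Prothioconazole | 6 | 78.86 | 81.38 | 80.12 | 3.20 | 53.15 | 46.01 | 49.58 | 13.43 |
| Prothioconazole | 7 | 77.05 | 75.20 | 76.13 | 2.40 | 51.77 | 45.98 | 48.88 | 11.18 |
| Tebuconazole | 5 | 68.20 | 65.70 | 66.95 | 3.67 | 35.97 | 39.34 | 37.66 | 9.37 |
| Tebuconazole | 6 | 70.11 | 68.54 | 69.33 | 2.24 | 48.61 | 50.90 | 49.76 | 4.71 |
| Tebuconazole | 7 | 71.24 | 67.66 | 69.45 | 5.03 | 54.08 | 46.51 | 50.30 | 14.00 |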
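The Average and Relative Error columns of Table 4 are consistent with taking the mean of the manual and machine values, and the absolute manual-machine difference divided by the manual value. This definition is inferred from the tabulated numbers rather than stated explicitly, so the sketch below should be read as an assumption; the function names are hypothetical.

```python
def average(manual: float, machine: float) -> float:
    """Mean of the manual and machine measurements (both in %)."""
    return (manual + machine) / 2


def relative_error_pct(manual: float, machine: float) -> float:
    """Absolute deviation of the machine value from the manual value, in %.

    Assumes the manual count is the reference (denominator).
    """
    if manual == 0:
        raise ValueError("manual reference value must be non-zero")
    return abs(manual - machine) / manual * 100
```

With the Prochloraz 5 ppm germination rates (manual 58.33, machine 55.31), these functions give an average of 56.82 and a relative error of 5.18 after rounding to two decimals, matching the corresponding table row.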
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

MDPI and ACS Style

Wang, Z.; Bai, X.; Cheng, T.; Ding, Z.; Han, D.; Zhang, D.; Xie, S.; Guo, T.; Yang, X.; Gu, C. EffiFormer-CGS: Deep Learning Framework for Automated Quantification of Fusarium Spore Germination. Agriculture 2026, 16, 131. https://doi.org/10.3390/agriculture16010131