Article

DenseNet-BiFPN-ECA Fusion Network: An Enhanced Transfer Learning Approach for Tomato Leaf Disease Recognition

1 College of Horticulture, Fujian Agriculture and Forestry University, Fuzhou 350002, China
2 College of Horticulture and Landscape Architecture, Fujian Agricultural Vocational and Technical College, Fuzhou 350002, China
* Authors to whom correspondence should be addressed.
These authors contributed equally to this work.
Horticulturae 2026, 12(4), 423; https://doi.org/10.3390/horticulturae12040423
Submission received: 24 December 2025 / Revised: 25 March 2026 / Accepted: 27 March 2026 / Published: 31 March 2026
(This article belongs to the Special Issue Computer Vision and Machine Learning in Horticulture Plants)

Abstract

Early and accurate identification of tomato leaf diseases constitutes a key safeguard for mitigating economic losses in tomato production. Conventional tomato leaf disease detection methodologies are constrained by inherent limitations, such as low operational efficiency, inadequate detection precision, and limited adaptability to environmental fluctuations. In contrast, the integration of deep learning techniques has yielded improvements in this research domain. Consequently, the development of deep learning-based approaches for the rapid and precise detection of tomato leaf diseases holds considerable theoretical significance and practical application value. To improve the detection accuracy of tomato leaf diseases, this study proposes a transfer learning-based DenseNet disease recognition model named DenseNet-BiFPN-ECA Fusion Network. The bidirectional feature pyramid network (BiFPN) is introduced at the terminal of DenseNet121 to achieve multi-scale feature fusion, while the efficient channel attention (ECA) mechanism is applied to enhance the discriminative capacity of fused features. Classification is ultimately completed via a global average pooling layer and a fully connected layer. The experimental results demonstrate that the improved model achieves an accuracy of 90.63% on the small-sample tomato leaf dataset collected from complex greenhouse environments, representing an improvement of 20.32 percentage points over the original DenseNet121 model. On the large-scale open-source Plant Village dataset, the model attains an accuracy of 98.47%, significantly outperforming the baseline models. Furthermore, a comparative analysis shows that the highest accuracy achieved by DenseNet, ResNet101, and VGG16 models on the same dataset is only 83.59% (within ±0.5%). This result validates the effectiveness of DenseNet-BiFPN-ECA Fusion Network in disease recognition tasks. 
The model provides a reliable technical reference for the intelligent diagnosis of tomato leaf diseases.

1. Introduction

Disease pressures constitute a key factor leading to tomato yield losses, often imposing substantial economic burdens on growers [1,2]. Traditional disease identification methods relying on farmers’ empirical judgments demonstrate inherent limitations in large-scale field monitoring scenarios—including low efficiency, heavy workloads, and high misjudgment rates—making them inadequate for meeting precision control requirements during early disease stages [3,4]. Early automated identification research primarily utilized digital image processing techniques and conventional machine learning algorithms, yet their feature representation depended on manually designed models constrained by insufficient generalization capability. Furthermore, machine learning models demand in-depth understanding of plant diseases and may require prolonged processing time, rendering them difficult to adapt to China’s diverse geographical environments and complex cultivation scenarios, thereby limiting practical application and promotion [5].
In recent years, advancements in image processing and detection algorithms have led to their application in agricultural pest and disease identification. Ensemble learning and model fusion have become mainstream strategies for enhancing performance. Sharma et al. [6] proposed a lightweight ensemble model integrating MobileNetV2 and ResNet50 for tomato leaf disease classification. The model achieved 99.91% accuracy on the Kaggle dataset (11,000 images, 10 disease categories). In mobile deployment, it reached 38 FPS (Frames Per Second) inference speed with 26 ms latency. Through quantization compression, model size was reduced by 42%. Model adaptability depends on hardware computational capacity: low-end devices (<2 TOPS, where TOPS denotes Tera Operations Per Second) require 8-bit quantization, while high-end devices (≥4 TOPS) support real-time diagnosis of multiple diseases. The feature fusion strategy reduced cross-model attention error by 19.7%. Raval et al. [7] proposed an improved ensemble transfer learning model applied to six plant leaf disease datasets. It achieved over 99% accuracy on balanced, high-quality datasets and maintained over 90% accuracy on imbalanced and noisy datasets. However, its substantial parameter count demands high computational resources, making it unsuitable for mobile deployment. Dixit et al. [8] explored cross-domain hybrid models, combining the feature learning capability of Deep Convolutional Neural Networks (DCNNs) with the classification boundary definition of Support Vector Machines (SVMs). This approach provides a high-precision solution for rice disease diagnosis and demonstrates deployment feasibility on edge computing devices. Sharma et al. [9] proposed an enhanced AlexNet architecture for early detection of tomato leaf diseases in the Indian Himalayan region, achieving 94.80% accuracy on a dataset of 7150 images. 
It achieved real-time performance of 28 FPS on mobile devices based on the Snapdragon 865 platform, with model size reduced to 12.3 MB (a 68% reduction). On low-end devices with less than 1 GB of memory, 8-bit quantization keeps inference latency under 50 ms. Eliwa et al. [10] proposed an enhanced YOLOv11 architecture. It achieved accuracies of 97.88% and 96.90% on a tomato leaf disease dataset (23,723 images, 11 categories) and a plant disease dataset (2904 images, 12 categories), respectively. In mobile deployment accelerated by TensorRT, it achieved real-time performance of 42 FPS, with model size optimized to 14.6 MB (a 28% reduction). Key innovations include a dynamic attention mechanism (boosting small object detection rate by 19%) and a lightweight feature fusion module (reducing memory usage by 35%), achieving a final F1 score of 97.2%. Trivedi et al. [11] proposed a Skill-based Honey Badger Optimization Algorithm-Deep Convolutional Neural Network (SHBOA-DeepCNN) for detecting tomato leaf diseases. The model achieved 91.91% accuracy with a false positive rate of only 7.38%. It demonstrated real-time performance of 28 FPS on Snapdragon 865 mobile devices, with model size compressed to 18.6 MB (a 32% reduction). Optimizing network parameters with the SHBOA algorithm improved training efficiency by 40%. Javidan et al. [12] proposed a ResNet-12-based few-shot learning framework that effectively detects tomato early blight, leaf spot, and gray mold in limited data scenarios. In 1-shot mode, recognition accuracies for the three diseases and healthy leaves were 91.64%, 92.37%, 92.93%, and 100%, respectively, with improvements in 3-shot mode. After quantization deployment, memory usage was only 23 MB on Snapdragon 855 mobile devices (a 68% reduction).
To pursue ultimate recognition accuracy and system robustness, researchers are exploring complex multimodal architecture fusion and meta-learning strategies. Jelali [13] conducted a systematic analysis of current deep learning-based tomato pest and disease detection technologies, highlighting the performance gap between laboratory datasets (e.g., PlantVillage) and real field scenarios. For instance, the quantization-optimized YOLOv5s model achieved a detection speed of 28 FPS on Jetson Xavier mobile devices with a model size of 14.3 MB, though its accuracy decreased by 12–15% compared to laboratory conditions. Reis [14] proposed a multi-depth and meta-learning fusion framework integrating Convolutional Neural Network architectures (DenseNet201, MobileNet), Transformer architectures (FastViT, MaxViT), and the WaveMLP architecture. Employing a VtC (Voting Trust Certificate) voting classifier meta-learning strategy, it achieved high-precision detection of healthy tomato leaves, early blight, and late blight, reaching 99.56% accuracy on 900 test images. In deployment tests, the optimized hybrid model achieved real-time processing performance of 32 FPS on the Snapdragon 888 mobile platform, with the model size reduced to 48 MB. Karthika J et al. [15] proposed the Lightweight Aggregated Fusion Channel Network, which optimizes the Capsule Neural Network and integrates the Non-Monotonic Search Algorithm with the Hunger Games Search Algorithm. For tomato leaf disease detection, it achieved accuracy, precision, recall, and F1 scores ranging from 98% to 99%. The model innovatively incorporates a Contextual Transformer module to enhance localization accuracy for small lesion areas. Simultaneously, the WGC-PANet architecture reduces computational complexity by 42%, shrinking model parameters to 15.8 MB. Deployment tests demonstrated real-time processing at 25 frames per second on the Snapdragon 865 mobile platform, with inference latency strictly controlled below 35 ms.
In summary, while current deep learning-based methods for tomato leaf disease recognition demonstrate strong performance in controlled environments—such as achieving >90% accuracy and mobile inference latency below 50 ms—they still face significant challenges when deployed in real, complex greenhouse production conditions. Under variable lighting, cluttered backgrounds, and limited data, the generalization capability and robustness of models often decline notably. Moreover, existing approaches show room for improvement in discriminating between multi-scale diseases with similar visual characteristics. Therefore, this study aims to enhance generalization through model architecture optimization while maintaining high accuracy, thereby better aligning with practical agricultural application needs. The enhanced model is then compared with ResNet50 [16], ResNet101 [17], and VGG16 [18] to evaluate its performance and effectiveness.
The main contributions of this study are summarized as follows:
(1)
We propose a novel DenseNet-BiFPN-ECA Fusion Network architecture for tomato leaf disease recognition. This architecture strategically integrates a bidirectional feature pyramid network (BiFPN) and an efficient channel attention (ECA) mechanism into a transfer learning-based DenseNet121 backbone, specifically designed to address the challenges of multi-scale lesion representation and complex background interference prevalent in natural agricultural scenarios.
(2)
We constructed a tomato leaf disease image dataset named FC-TLFD (Fujian Changle Tomato Leaf Field Dataset), which was collected from complex greenhouse environments. The dataset encompasses diverse conditions such as variable weather, lighting, shooting distance, and camera perspectives, thereby supporting research on disease identification in real-world scenarios.

2. Materials and Methods

2.1. Datasets

This study constructed datasets using two primary sources: the publicly available tomato leaf disease image dataset from the Plant Village dataset [19] and a self-collected field dataset of tomato leaf disease images. The Plant Village dataset is an open-source resource that covers ten common crops, with the tomato subset comprising ten distinct categories: healthy, early blight (Alternaria solani), late blight (Phytophthora infestans), leaf mold (Fulvia fulva), gray leaf spot (Alternaria alternata f. sp. lycopersici), tomato leaf miner (Tuta absoluta), tomato mosaic virus (Tomato mosaic virus), Septoria leaf spot (Septoria lycopersici), two-spotted spider mite (Tetranychus urticae), and tomato yellow leaf curl virus (Tomato yellow leaf curl virus). The tomato subset includes 10,047 training images, with all ten categories used in this study. The pathogen names are provided in accordance with the original dataset documentation. The self-built dataset, named FC-TLFD (Fujian Changle Tomato Leaf Field Dataset), was collected in greenhouse plantings in Changle District, Fuzhou City, Fujian Province. This region, located on the southern bank of the Min River estuary, features predominantly low hills and a subtropical maritime monsoon climate. The image collection was conducted between December 2024 and March 2025 during the critical vegetative growth stage of tomatoes, when their pronounced branching capacity created ideal conditions for disease observation [20,21]. 
Images were captured daily between 8:30 and 17:30 using a Vivo S10 smartphone (manufactured by Vivo Communication Technology Co., Ltd., Shenzhen, China), systematically documenting 812 high-resolution disease samples (3456 × 4608 px JPG, raw captured images) across the full spectrum of greenhouse conditions: varying weather patterns (sunny/overcast/rainy), lighting directions (front/backlit), imaging distances (proximal/distal), and camera perspectives (elevated/ground-level). This rigorous environmental sampling strategy significantly strengthens the dataset’s representativeness and model generalization potential [22]. Subsequently, the collected data underwent a screening process: images that significantly deviated from real-world conditions, such as those with excessive brightness or severe blurring, were removed. A total of 756 images were retained to construct the final dataset, comprising 161 images of healthy tomato leaves, 150 images of early blight (Alternaria solani) [23], 150 images of late blight (Phytophthora infestans) [24], 151 images of leaf mold (Fulvia fulva) [25], and 144 images of gray leaf spot (Alternaria alternata f. sp. lycopersici) [26]. All images were manually annotated by the first author (Lina Liang). For this study, the images within the dataset were randomly partitioned into training, validation, and test sets in a 7:2:1 ratio [27]. All images were scaled to 224 × 224 px prior to analysis. No image augmentation was performed as part of the analyses. The images of healthy tomato leaves and four types of diseases under sunny and rainy conditions, as well as close-up views, are shown in Figure 1.
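The 7:2:1 random partition described above can be sketched in a few lines; the function name, file-name pattern, and random seed below are illustrative rather than taken from the paper.

```python
import random

def split_dataset(image_paths, ratios=(0.7, 0.2, 0.1), seed=42):
    """Randomly partition image paths into train/val/test subsets
    (a minimal sketch of the 7:2:1 split used for FC-TLFD)."""
    paths = list(image_paths)
    random.Random(seed).shuffle(paths)       # deterministic shuffle
    n = len(paths)
    n_train = int(n * ratios[0])
    n_val = int(n * ratios[1])
    train = paths[:n_train]
    val = paths[n_train:n_train + n_val]
    test = paths[n_train + n_val:]           # remainder goes to test
    return train, val, test

# Example with the 756 retained FC-TLFD images:
train, val, test = split_dataset([f"img_{i:04d}.jpg" for i in range(756)])
```

With 756 images this yields roughly 529/151/76 images per subset, matching the stated ratio.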

2.2. The Proposed DenseNet-BiFPN-ECA Fusion Network

To achieve precise identification of tomato leaf diseases in complex environments, this study developed a novel hybrid recognition model named DenseNet-BiFPN-ECA Fusion Network. The core design of this model strategically integrates a bidirectional feature pyramid network (BiFPN) [28] with an efficient channel attention (ECA) mechanism [29] into a transfer learning-based DenseNet121 backbone network [30]. This architecture is specifically designed to address the challenges of multi-scale representation of diseased regions and complex background interference, which are prevalent in natural agricultural scenarios [31].
The architecture of DenseNet-BiFPN-ECA Fusion Network is illustrated in Figure 2. The model processes RGB images with dimensions of 224 × 224 × 3 as input. The initial and hierarchical feature extraction is performed by the DenseNet121 backbone network, which employs a dense connection pattern. This design enables the network to deepen while mitigating the gradient vanishing problem, ultimately generating multi-scale feature maps with varying semantic levels [32].
To efficiently utilize multi-scale feature information, the model incorporates a customized BiFPN module. This module achieves deep integration between feature maps of different resolutions through bidirectional (top-down and bottom-up) paths and cross-scale connections, enabling mutual reinforcement between detail-rich shallow features and semantic-rich deep features. Consequently, it generates a set of fusion features that comprehensively represent lesions at various scales [33].
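The weighted aggregation at the heart of BiFPN can be illustrated with its fast normalized fusion rule from the original BiFPN design; the feature values and weights below are placeholders, and real BiFPN nodes would additionally resize maps to a common resolution and apply convolutions after fusion.

```python
import numpy as np

def fast_normalized_fusion(features, weights, eps=1e-4):
    """BiFPN-style fusion: weighted average of same-shaped feature maps.

    The scalar weights are passed through ReLU so they stay non-negative,
    then normalized so the output scale remains stable."""
    w = np.maximum(np.asarray(weights, dtype=np.float64), 0.0)  # ReLU
    w = w / (w.sum() + eps)                                     # normalize
    return sum(wi * f for wi, f in zip(w, features))

# Fuse a shallow (detail-rich) and a deep (semantic) map already resized
# to a common 7x7x256 shape; the weight values here are placeholders.
shallow = np.ones((7, 7, 256)) * 2.0
deep = np.ones((7, 7, 256)) * 4.0
fused = fast_normalized_fusion([shallow, deep], weights=[1.0, 3.0])
```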
The fusion features are then processed in the ECA module. The ECA mechanism employs a lightweight, dimensionality-preserving channel attention approach that captures cross-channel interactions through adaptive one-dimensional convolution, thereby recalibrating channel-specific feature weights [34]. This process automatically enhances symptom-relevant feature channels while suppressing irrelevant background noise, significantly improving the model’s discriminative power in complex environments.
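The ECA recalibration step described above can be sketched as follows; the 1-D convolution weights are fixed here purely for illustration (in the actual module they are learned), and the kernel size is an assumption.

```python
import numpy as np

def eca(feature_map, kernel_size=3):
    """Efficient Channel Attention on an H x W x C feature map (sketch).

    Global average pooling yields a per-channel descriptor; a 1-D
    convolution over neighbouring channels models local cross-channel
    interaction without reducing dimensionality; a sigmoid turns the
    scores into (0, 1) weights that rescale each channel."""
    descriptor = feature_map.mean(axis=(0, 1))            # GAP -> (C,)
    kernel = np.full(kernel_size, 1.0 / kernel_size)      # illustrative weights
    pad = kernel_size // 2
    padded = np.pad(descriptor, pad, mode="edge")
    scores = np.convolve(padded, kernel, mode="valid")    # cross-channel interaction
    attention = 1.0 / (1.0 + np.exp(-scores))             # sigmoid
    return feature_map * attention                        # recalibrate channels
```

Because the attention vector has the same length as the channel axis, the recalibration is a cheap elementwise rescaling, which is what keeps ECA lightweight.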
Ultimately, the optimized features are transformed into classification results. The spatial dimension is compressed through a global average pooling layer, followed by a fully connected layer and Softmax activation function to output the probability distribution of target categories (healthy, early blight, late blight, leaf mold, and gray leaf spot). The entire framework supports end-to-end training and optimization. The classification algorithm formula is as follows:
v = GAP(F′)
y = Softmax(Wv + b)
The multi-scale feature maps obtained after BiFPN fusion (denoted as F′) are first compressed in spatial dimensions via Global Average Pooling (GAP) to produce a channel-wise feature vector v = GAP(F′), where v ∈ R^Cout (in this work, Cout = 256). This vector is then transformed by a fully connected layer through the linear operation s = Wv + b, with weight matrix W ∈ R^(k×Cout) and bias vector b ∈ R^k. Here, k represents the total number of classes (in this study, k = 5, corresponding to healthy leaves and four disease types) [35]. Finally, the Softmax function converts s into a normalized class probability distribution, y = Softmax(s), completing the end-to-end disease classification. This design effectively reduces the number of parameters while preserving essential spatial information, and the normalized discriminative mechanism of Softmax enhances the model’s capability to differentiate among multiple disease categories.
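The classification head described by these equations can be expressed directly in code; the weight matrix W, bias b, and input feature map below are randomly initialized for illustration only.

```python
import numpy as np

def classification_head(fused_features, W, b):
    """GAP -> fully connected layer -> Softmax.

    `fused_features` plays the role of the BiFPN output F'
    (H x W x Cout, with Cout = 256 in the paper); W (k x Cout)
    and b (k,) are the classifier parameters for k = 5 classes."""
    v = fused_features.mean(axis=(0, 1))   # v = GAP(F'), shape (Cout,)
    s = W @ v + b                          # s = Wv + b, shape (k,)
    exp_s = np.exp(s - s.max())            # numerically stable Softmax
    return exp_s / exp_s.sum()             # y: class probability distribution

rng = np.random.default_rng(0)
F_fused = rng.standard_normal((7, 7, 256))   # stand-in for the BiFPN output
W = rng.standard_normal((5, 256)) * 0.01
b = np.zeros(5)
y = classification_head(F_fused, W, b)
```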
In summary, DenseNet-BiFPN-ECA Fusion Network is a purpose-built collaborative architecture. The BiFPN implements robust multi-scale feature aggregation, while the ECA performs adaptive channel feature refinement. These two components complement each other, building upon the strong representation capabilities of DenseNet121. This synergy enables the model to accurately locate and classify various disease morphologies and sizes in greenhouse noise environments, representing the key technical contribution of this work.

2.3. Experimental Settings

2.3.1. Training Settings

The experimental platform was established using PyCharm (2025.2.5 Community Edition) to conduct model training and comparative analysis. The batch size was set to 16, with an initial learning rate of 0.001. A learning rate decay strategy was employed, reducing the rate by a factor of 0.1 every 15 epochs. The Adam optimizer was used for training over 50 epochs. The input image resolution was fixed at 224 × 224 pixels after uniform preprocessing. All models were trained under consistent conditions regarding training environment, initial learning rate, and optimization algorithm to ensure comparability.
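The stated step-decay schedule (initial rate 0.001, multiplied by 0.1 every 15 epochs over 50 epochs) amounts to the following, written as a plain-Python sketch:

```python
def learning_rate(epoch, initial_lr=1e-3, decay_factor=0.1, step=15):
    """Step-decay schedule from the training settings: the rate is
    multiplied by 0.1 every 15 epochs, starting from 0.001."""
    return initial_lr * decay_factor ** (epoch // step)

# Over the 50 training epochs this produces three decay steps
# (epochs 0-14: 1e-3, 15-29: 1e-4, 30-44: 1e-5, 45-49: 1e-6).
schedule = [learning_rate(e) for e in range(50)]
```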

2.3.2. Evaluation Metrics

To comprehensively evaluate the classification performance of the proposed model, multiple quantitative metrics were adopted. Accuracy was used as the primary metric to measure the overall correctness of disease identification [36]. Additionally, to assess model stability and generalization capability, five-fold cross-validation was performed, reporting the mean accuracy and standard deviation [37]. For ablation and comparative experiments, training loss and validation accuracy curves were monitored throughout the training process to analyze convergence behavior [38]. Confusion matrices were employed to visualize classification performance across individual categories, with diagonal elements representing correctly classified samples and off-diagonal entries indicating misclassifications [39]. Key metrics including recall rate, false positive rate, and cross-misjudgment rate were further calculated to provide detailed insights into the model’s discriminative power, particularly for small-sample categories and visually similar diseases [40].
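The per-class metrics mentioned above can be derived directly from a confusion matrix; the 3-class matrix in the example is illustrative and not taken from the paper's results.

```python
import numpy as np

def per_class_metrics(confusion):
    """Derive overall accuracy, per-class recall, and per-class
    false-positive rate from a confusion matrix whose rows are true
    labels and columns are predictions (diagonal = correct samples)."""
    cm = np.asarray(confusion, dtype=float)
    tp = np.diag(cm)
    fn = cm.sum(axis=1) - tp          # missed samples of each class
    fp = cm.sum(axis=0) - tp          # samples wrongly assigned to each class
    tn = cm.sum() - tp - fn - fp
    recall = tp / (tp + fn)
    fpr = fp / (fp + tn)
    accuracy = tp.sum() / cm.sum()
    return accuracy, recall, fpr

# Illustrative 3-class confusion matrix:
cm = [[8, 1, 1],
      [0, 9, 1],
      [1, 0, 9]]
accuracy, recall, fpr = per_class_metrics(cm)
```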
In preparing this manuscript, the authors used a large language model for language refinement and polishing of Section 4. The authors have reviewed and edited the output and take full responsibility for the content of the publication.

3. Results

To validate the recognition performance of the proposed DenseNet-BiFPN-ECA Fusion Network model for tomato leaf diseases, ablation studies were conducted to compare the improved model with the original architecture, along with comparative experiments against classical models, followed by detailed performance analysis through confusion matrices and recognition results. All ablation experiments were performed under identical conditions, with consistent training environments, initial learning rates, and optimization algorithms maintained throughout.

3.1. Ablation Experiment

As shown in Figure 3 and Table 1, comprehensive analysis of the training process and final results for the improved DenseNet variants demonstrates that, on the FC-TLFD dataset, the DenseNet-ECA model achieved stable loss convergence near 0.1 with a final validation accuracy of 83.59% (Table 2); the DenseNet-BiFPN model exhibited mid-training fluctuations (loss temporarily rising to 0.4) but ultimately converged to a 0.163 loss corresponding to 88.28% accuracy, while the fused DenseNet-BiFPN-ECA Fusion Network model displayed optimal learning curves with loss steadily decreasing to 0.068 and validation accuracy progressively reaching 90.63%. On the Plant Village dataset, DenseNet-ECA achieved rapid accuracy convergence to 97.92%; DenseNet-BiFPN showed greater loss fluctuations (peak 0.6) yet attained 97.81% final accuracy; the fused model demonstrated the most stable convergence with loss controlled at 0.033 and accuracy peaking at 98.47%.
As shown in Table 1, the fusion model achieves the highest accuracy of 90.63% and 98.47% on both the FC-TLFD and the Plant Village datasets, respectively. The ECA module demonstrates its critical role in feature selection through 97.92% accuracy and a 0.017 low loss value on the Plant Village dataset. Meanwhile, the BiFPN module contributes a 4.69% accuracy improvement on the FC-TLFD dataset. In terms of model complexity, the proposed fusion model contains 8.5 M parameters, marginally higher than DenseNet-ECA (7.1 M) and DenseNet-BiFPN (8.3 M), with inference speeds of 7.1 ms/img on the testing platform. The cross-validation results further confirm the robustness of all three models, with standard deviations consistently below 0.003 across five runs, indicating stable training behavior.

3.2. Comparison with State-of-the-Art Methods

Analysis of the training process (Figure 4) and final performance (Table 2) reveals significant performance differences among the compared models. On the self-built FC-TLFD dataset, VGG16 achieved an accuracy of 83.59%, outperforming DenseNet (70.31%) and ResNet101 (73.44%). On the Plant Village dataset, ResNet101 showed superior performance among the baseline models, achieving 95.63% accuracy, significantly surpassing VGG16 (86.72%) and DenseNet (72.66%). Notably, as shown in Table 2, the proposed DenseNet-BiFPN-ECA Fusion Network achieves the highest accuracy on both datasets, reaching 90.63% on FC-TLFD and 98.47% on the Plant Village dataset, with consistently low loss values of 0.068 and 0.033, respectively. The cross-validation results further confirm its robustness, with mean accuracies of 90.23% (±0.0025) on FC-TLFD and 98.36% (±0.0016) on the Plant Village dataset.
While VGG16 achieves reasonable accuracy on the FC-TLFD dataset (83.59%), its parameter count is over 16 times larger than the proposed model, highlighting its inefficiency. ResNet101 demonstrates strong performance on the Plant Village dataset (95.63%) but falls short on the more challenging FC-TLFD dataset (73.44%). The training curves in Figure 4 show that the proposed model maintains stable convergence across both datasets, whereas VGG16 exhibits noticeable overfitting tendencies on the small-sample FC-TLFD dataset.
As presented in Figure 5, on the FC-TLFD dataset, the confusion matrix of the DenseNet-BiFPN-ECA Fusion Network exhibits the most concentrated diagonal distribution, achieving an average classification accuracy of 96.2% across the five categories. Compared to DenseNet-BiFPN and the original DenseNet, this fusion model improves the recall rate for small-scale lesion categories by 11.7%. The model with only the ECA mechanism (DenseNet-ECA) reduces the false positive rate for the Healthy class by 14.3% relative to the baseline model. The fully fused model achieves a cross-misclassification rate of only 8.1% for easily confusable lesions (such as early blight and gray leaf spot), representing a 12.5% reduction compared to the baseline, and attains an F1-score of 0.938 for small-sample categories. In contrast, VGG16, whose comparatively shallow feature hierarchy struggles to capture subtle pathological texture features, exhibits a misclassification rate as high as 22.4% across the four disease categories. ResNet101 achieves a precision of only 89.7% in tomato leaf disease identification, with its performance potentially constrained by residual structure redundancy.
On the Plant Village dataset, quantitative analysis of confusion matrices from six deep neural networks indicates that the DenseNet-BiFPN-ECA Fusion Network achieves the optimal classification performance, with diagonal values in its confusion matrix notably higher than those of other models. The overall accuracy of this fusion model reaches 96.7%, outperforming the single-module improved models DenseNet-ECA (94.1%) and DenseNet-BiFPN (94.8%). Compared to the baseline model and DenseNet-BiFPN, the proposed model increases the recall rate for small-scale lesion categories by 9.3% and reduces the cross-misclassification rate for easily confusable lesions to 7.9%, a decrease of 10.7% from the baseline. The model incorporating the ECA mechanism lowers the misclassification rate by 12.7% compared to the original DenseNet. VGG16 and ResNet101 display noticeable off-diagonal distributions for morphologically similar disease categories. DenseNet-BiFPN-ECA Fusion Network achieves a recall rate of 92.1% for minority categories, further validating its robustness in real-world complex scenarios.
Furthermore, the fusion network model demonstrated exceptional recognition accuracy for late blight, achieving 98.1% on the FC-TLFD dataset and 97.9% on the Plant Village dataset. The disease is characterized by deep green water-soaked lesions and white mycelium or sporulation. In contrast, early blight and gray leaf spot exhibit highly similar visual features during the initial stages, both presenting as brown spots. This similarity poses challenges, with the confusion matrix revealing that misclassifications between the two account for approximately 65% of all classification errors. Another challenge is leaf mold, whose fungal sign characteristics are easily obscured in complex backgrounds, yet the model still achieved a recall rate of 92.5% on the FC-TLFD dataset.

4. Discussion

This study was conducted based on a core hypothesis: whether the integration of multi-scale feature fusion and channel attention mechanisms could effectively enhance a model’s discriminative capacity for tomato leaf diseases that exhibit both scale variations and visual similarities when captured under complex greenhouse conditions. The experimental results obtained from this work provided clear support for this hypothesis. Specifically, the proposed DenseNet-BiFPN-ECA Fusion Network achieved a classification accuracy of 90.63% on a self-constructed small-sample dataset collected from real field environments, which represented a substantial improvement of 20.32 percentage points over the baseline DenseNet121 model. Furthermore, when evaluated on a large-scale publicly available dataset, the model attained an even higher accuracy of 98.47%, further demonstrating its strong learning capability and generalization performance. This approach aligns with proven strategies that have been widely adopted in the field of agricultural computer vision, where numerous studies have reported that deep learning architectures incorporating multi-scale feature extraction modules and attention mechanisms tend to yield superior performance compared to conventional network structures [41]. These results collectively validate the effectiveness of combining multi-scale features with attention mechanisms, and also demonstrate the model’s robustness in handling limited and challenging field data, which is particularly important for addressing common data-scarcity issues frequently encountered in plant disease recognition tasks [42].
To gain a deeper understanding of how individual components contributed to the overall performance improvement, a series of ablation experiments were conducted. The results of these experiments revealed the distinct yet complementary roles played by the BiFPN and ECA modules within the proposed architecture. The BiFPN module, through its design of bidirectional cross-scale connections, was found to substantially enhance the extraction and fusion of multi-scale features associated with disease lesions. When evaluated independently on the self-collected dataset, the inclusion of BiFPN alone led to a performance gain of 4.69 percentage points relative to the baseline. Meanwhile, the ECA module contributed to performance improvement through a different mechanism. By applying efficient channel-wise recalibration, it enabled the model to direct greater attention toward features that are most relevant to pathological characteristics, while simultaneously reducing the influence of irrelevant background information [43]. When these two modules were combined, their synergistic effect became even more apparent. One notable outcome was a significant reduction in cross-class misclassification between diseases that share similar visual appearances, such as early blight and gray leaf spot, which are often difficult to distinguish even for experienced human observers. This observed performance enhancement achieved through multi-module fusion aligns with findings reported in the recent literature, where the integration of multiple complementary modules has been shown to yield performance benefits beyond what can be achieved by any single module alone [44]. Furthermore, such hybrid architectures that integrate diverse algorithmic components have demonstrated superior performance not only in plant disease recognition but also in other engineering and environmental prediction tasks [45].
In addition to ablation studies, a series of comparative experiments were carried out to benchmark the proposed model against several widely used baseline architectures commonly employed in plant disease recognition tasks. These included VGG16, ResNet101, and the original DenseNet model [46]. Across all comparisons, the proposed DenseNet-BiFPN-ECA Fusion Network consistently outperformed these baseline models. On the FC-TLFD dataset, it achieved higher accuracy than any of the compared models, and on the Plant Village dataset, its performance was similarly robust, demonstrating significantly better results. These findings provide strong evidence for the adaptability of the proposed architecture to datasets with different characteristics, as well as for the effectiveness of the transfer learning strategy employed in this study. Similar observations regarding the influence of factors such as optimized cultivation conditions and data quality on model performance have also been reported and validated in other studies within the agricultural artificial intelligence domain [47]. Moreover, the model’s ability to achieve particularly high recognition rates for disease categories with distinctive visual features, such as late blight with its characteristic water-soaked lesions and white mold layer, is consistent with findings from research focused on cultivar-based classification, where unique morphological traits often facilitate more accurate identification [48]. The critical role of image dataset characteristics in shaping and determining model performance has also been emphasized and confirmed in investigations related to crop quality monitoring, where variations in image acquisition conditions can significantly impact model outcomes [49]. 
Looking beyond standard RGB imagery, several researchers have pointed out that expanding the input feature space, for example, by incorporating multispectral or hyperspectral data, represents a promising direction for further improving early and accurate disease diagnosis in agricultural applications involving other crops [50]. Despite the overall strong performance of the proposed model, it is important to acknowledge that some explainable misclassifications still occur, particularly for diseases that are morphologically similar in their early stages. These errors, while limited in number, reflect the inherent relationship between the model’s decision boundaries and the subtle visual phenotypes of diseases. This challenge underscores the value of multi-temporal and multi-feature fusion methods, which have proven beneficial for distinguishing other crop diseases in previous studies [51].
Despite the high recognition accuracy achieved by the proposed model, several challenges still need to be addressed before it can be effectively deployed in practical field applications. First, there exists an inherent trade-off between performance improvement and computational efficiency; modules such as BiFPN, while beneficial for enhancing feature fusion capability, also increase overall model complexity and computational cost [52]. Therefore, future lightweighting efforts are considered essential, as adaptive scheduling and lightweight models are widely recognized as key factors for ensuring real-time performance in IoT devices and other resource-constrained platforms [53]. Second, the fixed input resolution of 224 × 224 pixels used during model training may lead to the loss of subtle lesion features when processing high-resolution field images through down-sampling operations, which could potentially affect the sensitivity of early disease detection. Furthermore, although the model demonstrates better robustness than classical models on self-built datasets containing various environmental disturbances, its generalization capability to completely unknown environments, extreme lighting conditions, or novel disease variants still requires further validation through more extensive cross-regional and cross-seasonal data collection efforts. Research on domain adaptation techniques is considered crucial for improving model applicability across diverse cultivation environments and conditions [54]. In the field of plant phenotyping, there is growing emphasis on linking model predictions to observable biological traits as a means to enhance trustworthiness and guide practical use in real-world settings [55]. Explainable AI (XAI) methods have been shown to improve model reliability and practicality in other environmental prediction tasks, and their application in plant disease recognition warrants further exploration [56]. 
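The resolution concern raised above can be made concrete with a small numpy experiment. Assuming a hypothetical high-resolution field image containing a tiny early lesion, block-average down-sampling to the network input size dilutes the lesion's contrast against the leaf background (the 896 × 896 size and pixel values below are illustrative choices, not values from this study):

```python
import numpy as np

def avg_pool(img: np.ndarray, factor: int) -> np.ndarray:
    """Down-sample a square image by averaging factor x factor blocks."""
    h, w = img.shape
    return img.reshape(h // factor, factor, w // factor, factor).mean(axis=(1, 3))

# Hypothetical 896x896 field image: uniform leaf (0.2) with a 2x2 early lesion (1.0).
img = np.full((896, 896), 0.2)
img[100:102, 100:102] = 1.0

small = avg_pool(img, 4)  # 896 -> 224, the model's fixed input size

# All four lesion pixels fall into one 4x4 block, so the lesion's contrast
# against the background shrinks from 0.8 to 0.8 * (2*2)/(4*4) = 0.2.
orig_contrast = img.max() - 0.2
new_contrast = small.max() - 0.2
print(round(orig_contrast, 3), round(new_contrast, 3))  # 0.8 0.2
```

A four-fold reduction in each spatial dimension thus cuts the lesion's contrast by a factor of sixteen relative to its area, which is why early, small-scale lesions are the features most at risk under fixed 224 × 224 inputs.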
This study has interpreted the model’s performance from the perspective of visual disease characteristics, providing a clear basis for its potential application and future optimization.
Based on the preceding discussion, future work will concentrate on four key directions to facilitate the transition of the model from a laboratory tool to a practical field solution. The first direction involves Model Lightweighting and Acceleration, which includes investigating techniques such as network pruning, quantization, and knowledge distillation to significantly reduce model size and computational overhead while maintaining acceptable accuracy levels, thereby meeting the memory and real-time requirements of mobile deployment platforms. The second direction focuses on Data Augmentation, which entails systematically introducing techniques such as random flipping, rotation, and color jittering during preprocessing to enhance model adaptability to complex lighting, background variations, and different shooting angles. The third direction focuses on Multimodal Data Fusion, which entails actively exploring the integration of additional data sources, such as hyperspectral or thermal imaging, to enrich the feature representation beyond what conventional RGB imagery can provide. The fourth direction addresses Cross-Domain Adaptation and Generalization, which involves advancing research on domain adaptation techniques to enable the model to rapidly and effectively adapt to different tomato cultivars, growth environments, and climatic regions, thereby significantly improving its overall versatility and practical utility in diverse agricultural settings.
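The second direction above can be sketched in a few lines. The numpy example below implements the three named augmentations (random flipping, rotation, and colour jittering, the latter simplified here to per-channel brightness scaling); in practice a library pipeline such as torchvision's transforms would typically be used instead, and the image below is a random stand-in for a real leaf photograph.

```python
import numpy as np

rng = np.random.default_rng(42)

def augment(img: np.ndarray) -> np.ndarray:
    """Randomly flip, rotate, and colour-jitter an (H, W, 3) image in [0, 1]."""
    if rng.random() < 0.5:
        img = img[:, ::-1, :]                      # random horizontal flip
    img = np.rot90(img, k=int(rng.integers(0, 4)))  # 0/90/180/270 degree rotation
    scale = rng.uniform(0.8, 1.2, size=3)          # per-channel brightness jitter
    return np.clip(img * scale, 0.0, 1.0)

leaf = rng.uniform(size=(224, 224, 3))             # stand-in for a leaf image
batch = [augment(leaf) for _ in range(4)]
```

Applying such randomized transforms at training time exposes the network to the lighting, orientation, and colour variability it will encounter under real greenhouse conditions, without requiring additional field data collection.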

5. Conclusions

Through systematic architectural innovation and empirical validation, this study proposes a tomato leaf disease recognition model based on DenseNet121 that integrates a bidirectional feature pyramid network (BiFPN) and an efficient channel attention (ECA) mechanism. The model demonstrates high classification performance and robustness on both the complex small-sample FC-TLFD field dataset and a publicly available large-scale dataset, significantly improving disease recognition accuracy while enhancing the model’s ability to represent multi-scale pathological features. The proposed model outperforms the original model on both the Plant Village and FC-TLFD datasets, achieving accuracy rates of 98.47% and 90.63%, respectively, and its stability surpasses mainstream baseline models such as VGG16 and ResNet101. The DenseNet-BiFPN-ECA Fusion Network effectively integrates multi-scale features and enhances critical discriminative information, making it suitable for intelligent disease diagnosis in complex greenhouse environments.

Author Contributions

Conceptualization, data curation, and writing of original draft: L.L. and J.C. Investigation and validation: H.W. and Y.T. Methodology and visualization: Y.C., F.Z. and S.W. Conceptualization, supervision, and writing—review and editing: M.H. and J.L. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the Fujian Modern Agricultural Vegetable Industry System Construction Project, grant number 2019-897. The APC was funded by the same project.

Data Availability Statement

The FC-TLFD dataset generated during this study is available from the corresponding author upon reasonable request. The publicly available Plant Village dataset used in this study can be accessed at https://tensorflow.google.cn/datasets/catalog/plant_village (accessed on 15 April 2025).

Conflicts of Interest

The authors declare no conflicts of interest.

References

  1. Mengesha, A.T.; Mengistie, M.A. Applying Transfer Learning in CNN Model Architectures for Detecting Tomato Leaf Disease with Explainable Artificial Intelligence. Smart Agric. Technol. 2025, 11, 101034. [Google Scholar] [CrossRef]
  2. Shehu, H.A.; García-Díaz, V.; Casado-Vara, R.; Cánovas-García, M. YOLO for Early Detection and Management of Tuta absoluta-Induced Tomato Leaf Diseases. Front. Plant Sci. 2025, 16, 1524630. [Google Scholar] [CrossRef]
  3. Qi, J.; Qi, X.; Liu, Y. An Improved YOLOv5 Model Based on Visual Attention Mechanism: Application to Recognition of Tomato Virus Disease. Comput. Electron. Agric. 2022, 194, 106957. [Google Scholar] [CrossRef]
  4. Shehu, H.A.; Casado-Vara, R.; García-Díaz, V.; Cánovas-García, M. Artificial Intelligence for Early Detection and Management of Tuta absoluta-Induced Tomato Leaf Diseases: A Systematic Review. Eur. J. Agron. 2025, 170, 127669. [Google Scholar] [CrossRef]
  5. Sambana, B.; Kovvur, R.M.R.; Sairam, D.; Sridhar, G. An Efficient Plant Disease Detection Using Transfer Learning Approach. Sci. Rep. 2025, 15, 19082. [Google Scholar] [CrossRef]
  6. Sharma, J.; Ahmed, S.; Mahapatra, S.; Singh, S. Deep Learning Based Ensemble Model for Accurate Tomato Leaf Disease Classification by Leveraging ResNet50 and MobileNetV2 Architectures. Sci. Rep. 2025, 15, 13904. [Google Scholar] [CrossRef]
  7. Raval, H.; Chaki, J. Ensemble Transfer Learning Meets Explainable AI: A Deep Learning Approach for Leaf Disease Detection. Ecol. Inform. 2024, 84, 102925. [Google Scholar] [CrossRef]
  8. Dixit, A.K.; Verma, R. Advanced Hybrid Model for Multi Paddy Diseases Detection Using Deep Learning. EAI Endorsed Trans. Perv. Health Technol. 2023, 9, e5. [Google Scholar] [CrossRef]
  9. Sharma, R.; Naaz, S.; Vaidya, P. Harnessing Deep Learning with AlexNet for Tomato Leaf Disease Detection in the Indian Himalayan Terrain. J. Electr. Comput. Eng. 2025, 2025, 2807347. [Google Scholar] [CrossRef]
  10. Eliwa, E.H.I.; Hafeez, T.A.E. Advancing Crop Health with YOLOv11 Classification of Plant Diseases. Neural Comput. Appl. 2025, 37, 15223–15253. [Google Scholar] [CrossRef]
  11. Trivedi, N.K.; Anand, S.; Reema, J.K.; Bansal, P.; Pandey, R.K. Skill-Honey Badger Optimisation Algorithm-Enabled Deep Convolutional Neural Network for Multiclass Leaf Disease Detection in Tomato Plant. J. Phytopathol. 2024, 172, e70001. [Google Scholar] [CrossRef]
  12. Javidan, S.M.; Banakar, A.; Vakilian, K.A.; Ampatzidis, Y. Tomato Fungal Disease Diagnosis Using Few-Shot Learning Based on Deep Feature Extraction and Cosine Similarity. AgriEngineering 2024, 6, 4233–4247. [Google Scholar] [CrossRef]
  13. Jelali, M. Deep Learning Networks-Based Tomato Disease and Pest Detection: A First Review of Research Studies Using Real Field Datasets. Front. Plant Sci. 2024, 15, 1493322. [Google Scholar] [CrossRef]
  14. Reis, H.C. Advanced Tomato Disease Detection Using the Fusion of Multiple Deep-Learning and Meta-Learning Techniques. J. Crop Health 2024, 76, 1553–1567. [Google Scholar] [CrossRef]
  15. Karthika, J.; Asha, R.; Priyanka, N.; Amshavalli, R. Integrating NMSA Based Advanced Light-Weight Aggregated Fusion Channel Network for Robust Tomato Leaf Disease Detection. Multimed. Tools Appl. 2024, 84, 30227–30258. [Google Scholar]
  16. Nasr, M.; Torkey, H.; Ebeid, H. Comparative Analysis of Noise Estimation Methods in Computed Tomography Images: Histogram Analysis, L2 Norm, SSIM, and CNN-Based Classification with ResNet50. Digit. Signal Process. 2025, 166, 105242. [Google Scholar] [CrossRef]
  17. Kumari, B.S.; Komma, A.; Rajat, A.; Sethy, P.K. ResNet101-SVM: Hybrid Convolutional Neural Network for Citrus Fruits Classification. J. Intell. Fuzzy Syst. 2024, 46, 7035–7045. [Google Scholar]
  18. Arnob, A.S.; Al Mamun, M.A.; Rahman, M.A. Comparative Result Analysis of Cauliflower Disease Classification Based on Deep Learning Approach VGG16, Inception v3, ResNet, and a Custom CNN Model. Hybrid Adv. 2025, 10, 100440. [Google Scholar] [CrossRef]
  19. Mohanty, S.P.; Hughes, D.P.; Salathé, M. Using Deep Learning for Image-Based Plant Disease Detection. Front. Plant Sci. 2016, 7, 1419. [Google Scholar] [CrossRef]
  20. Wang, Z.; Wu, W.; Wang, J.; Guo, Z. A Machine Learning-Based Irrigation Prediction Model for Cherry Tomatoes in Greenhouses: Leveraging Optimal Growth Data for Precision Irrigation. Comput. Electron. Agric. 2025, 237, 110558. [Google Scholar] [CrossRef]
  21. Veronica, C.; Palmitessa, O.D.; Leoni, B.; Renna, M.; Santamaria, P. Morpho-Physiological Classification of Italian Tomato Cultivars (Solanum lycopersicum L.) According to Drought Tolerance during Vegetative and Reproductive Growth. Plants 2021, 10, 1826. [Google Scholar] [CrossRef]
  22. Li, K.; Wang, C.; Fan, Y.; Han, Y.; Liu, J. Attention-Optimized DeepLab V3+ for Automatic Estimation of Cucumber Disease Severity. Plant Methods 2022, 18, 109. [Google Scholar] [CrossRef]
  23. Bao, H.; Huang, L.; Zhang, Y.; Pang, H. Early Discrimination and Visualization of Tomato Early Blight Based on Hyperspectral Images. Prog. Biochem. Biophys. 2025, 52, 513–524. [Google Scholar]
  24. Zhang, R.; Xu, L.; Huang, K.; Yang, S.; Kong, Q.; Yuan, S. Field Efficacy of Multiple Fungicides Against Tomato Late Blight in Yunnan. China Plant Prot. 2024, 44, 85–87+101. [Google Scholar]
  25. Zhao, D.; Zhang, Y.; Wang, Z.; Bai, X.; Wang, X. Detection Method for Tomato Leaf Mold During Latent Period Based on Hyperspectral Imaging. J. Agric. Mach. 2025, 56, 390–397. [Google Scholar]
  26. Zou, F.; Xiong, Z.; Wang, C.; Luo, X.; Ma, H.; He, L. Identification of Pathogen and Fungicide Efficacy for Tomato Gray Leaf Spot in Jiangxi Province. J. Hunan Agric. Univ. Nat. Sci. Ed. 2025, 51, 48–53. [Google Scholar]
  27. Shao, T.; Cai, Z.; Liu, Y.; Shi, X.; Wang, J. Prediction of Pathological Grade of Oral Squamous Cell Carcinoma and Construction of Prognostic Model Based on Deep Learning Algorithm. Discov. Oncol. 2025, 16, 976. [Google Scholar] [CrossRef] [PubMed]
  28. Chen, H.; Wang, H.; Shi, L.; Tang, J.; Xu, Z. A Robust YOLOv5 Model with SE Attention and BIFPN for Jishan Jujube Detection in Complex Agricultural Environments. Agriculture 2025, 15, 665. [Google Scholar] [CrossRef]
  29. Cui, W.; Li, Z.; Wang, Y.; Liu, J. Apple Yield Estimation Method Based on CBAM-ECA-Deeplabv3+ Image Segmentation and Multi-Source Feature Fusion. Sensors 2025, 25, 3140. [Google Scholar] [CrossRef] [PubMed]
  30. Gu, J.; Wu, Y.; Zhang, Y.; Wang, D. Research on the Quality Grading Method of Ginseng with Improved DenseNet121 Model. Electronics 2024, 13, 4504. [Google Scholar] [CrossRef]
  31. Huang, X.; Zhang, C.; Chen, X. Occluded Tomato Disease Image Recognition by Integrating Multi-Scale Features. J. Chin. Agric. Mech. 2024, 45, 194–200. [Google Scholar]
  32. Zhang, L.; Wang, J.; Liu, S.; Wang, G. DCF-Yolov8: An Improved Algorithm for Aggregating Low-Level Features to Detect Agricultural Pests and Diseases. Agronomy 2023, 13, 2012. [Google Scholar] [CrossRef]
  33. Li, M.; Liu, S.; Ouyang, Y.; Zhang, P. Efficient Lightweight Citrus Leaf Disease Detection Model Based on YOLOv8n. J. Zhejiang Agric. 2025, 37, 2198–2208. [Google Scholar]
  34. Ni, H.; Wang, G.; Cai, W.; Zhou, Y.; Gao, J. Classification of Typical Pests and Diseases of Rice Based on the ECA Attention Mechanism. Agriculture 2023, 13, 1066. [Google Scholar] [CrossRef]
  35. Terzioğlu, H.; Gölcük, A.; Shakarji, A.M.A.; Al-Bayati, M.Y. Comparative analysis of deep learning-based feature extraction and traditional classification approaches for tomato disease detection. Agronomy 2025, 15, 1509. [Google Scholar] [CrossRef]
  36. Zhang, R.; Pei, C.; Shi, J.; Wang, S. Construction and validation of a general medical image dataset for pretraining. J. Imaging Inform. Med. 2025, 38, 1051–1061. [Google Scholar] [CrossRef]
  37. Vabalas, A.; Gowen, E.; Poliakoff, E.; Casson, A.J. Machine learning algorithm validation with a limited sample size. PLoS ONE 2019, 14, e0224365. [Google Scholar] [CrossRef]
  38. Smith, L.N. A disciplined approach to neural network hyper-parameters: Part 1—Learning rate, batch size, momentum, and weight decay. arXiv 2018, arXiv:1803.09820. [Google Scholar]
  39. Zhao, N.; Bui, T.; Jia, Y.Y.; Dzieciolowski, K. Outperformance score: A universal standardization method for confusion-matrix-based classification performance metrics. arXiv 2025, arXiv:2505.07033. [Google Scholar]
  40. Tharwat, A. Classification assessment methods. Appl. Comput. Inform. 2021, 17, 168–192. [Google Scholar] [CrossRef]
  41. Liu, S.; Qiao, Y.; Li, J.; Zhang, H.; Zhang, M.; Wang, M. An improved lightweight network for real-time detection of apple leaf diseases in natural scenes. Agronomy 2022, 12, 2363. [Google Scholar] [CrossRef]
  42. Cristea, A.M.; Dobre, C. Federated Transfer Learning for Tomato Leaf Disease Detection Using Neuro-Graph Hybrid Model. AgriEngineering 2025, 7, 432. [Google Scholar] [CrossRef]
  43. Wang, Q.; Wu, B.; Zhu, P.; Li, P.; Zuo, W.; Hu, Q. ECA-Net: Efficient channel attention for deep convolutional neural networks. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Seattle, WA, USA, 13–19 June 2020; pp. 11534–11542. [Google Scholar]
  44. Minghao, Z.; Wei, Q.; Xiangdong, W.; Zhe, Y.; Xiaohong, X.; Liheng, Z.; Xing, H. Assessment of gully erosion susceptibility in Northeast China’s black soil region using new stacking model with multiple machine learning algorithms. Soil Tillage Res. 2026, 257, 106964. [Google Scholar] [CrossRef]
  45. Liao, J.; Liu, Y.; Wang, H.; He, X.; Huang, Y.; Han, Z.; Fang, J.; Chen, P.; Hu, J. AI-assisted transient emission prediction for diesel engines based on a novel hybrid model combined multiple machine learning algorithms and XGBoost. J. Environ. Chem. Eng. 2025, 13, 119649. [Google Scholar] [CrossRef]
  46. Too, E.C.; Yujian, L.; Njuki, S.; Yingchun, L. A comparative study of fine-tuning deep learning models for plant disease identification. Comput. Electron. Agric. 2019, 161, 272–279. [Google Scholar] [CrossRef]
  47. Dastres, E.; Esmaeili, H.; Sonboli, A.; Mirjalili, M.H. Optimizing cultivation areas for Salvia leriifolia using advanced spatial prediction and hybrid machine learning algorithms for maximum bioactive compound yield. Smart Agric. Technol. 2025, 12, 101482. [Google Scholar] [CrossRef]
  48. Hilaili, M.; Fathi-Najafabadi, A.; Nurwahyuningsih; Rahman, A.; Jahari, M.B.; Russo, L.; Kondo, N.; Fatchurrahman, D. The authentication of the local cured Nicotiana tabacum L. based on varieties by using machine learning algorithms. Next Res. 2025, 2, 101039. [Google Scholar] [CrossRef]
  49. Junyang, W.; Maskat, M.Y.; Mohd Ali, M. Non-destructive monitoring of postharvest quality changes in chili (Capsicum frutescens) during storage using image processing coupled with machine learning. Postharvest Biol. Technol. 2026, 234, 114140. [Google Scholar] [CrossRef]
  50. Herrera Ollachica, D.A.; Asiedu Asante, B.K.; Imamura, H. Advancing Water Hyacinth Recognition: Integration of Deep Learning and Multispectral Imaging for Precise Identification. Remote Sens. 2025, 17, 689. [Google Scholar]
  51. Wang, H.; Ruan, C.; Zhao, J.; Wang, Y.; Li, Y.; Dong, Y.; Huang, L. Utilizing interpretable machine learning algorithms and multiple features from multi-temporal Sentinel-2 imagery for predicting wheat fusarium head blight. Artif. Intell. Agric. 2026, 16, 224–239. [Google Scholar] [CrossRef]
  52. Türkoğlu, M.; Hanbay, D. Plant disease and pest detection using deep learning-based features. Turk. J. Electr. Eng. Comput. Sci. 2019, 27, 1636–1651. [Google Scholar] [CrossRef]
  53. Saurabh, K.; Mahapatra, S.; Tripathi, M.M. An Adaptive Bandwidth Scheduling Algorithm for IoT Devices Using Machine Learning Techniques. Frankl. Open 2026, 14, 100492. [Google Scholar] [CrossRef]
  54. Lu, Y.; Young, S. A survey of public datasets for computer vision tasks in precision agriculture. Comput. Electron. Agric. 2020, 178, 105760. [Google Scholar] [CrossRef]
  55. Elfouly, M.K.; AbdelAziz, A.M.; Gomaa, W.H.; Abdalla, M. A deep learning-based framework for large-scale plant disease detection using big data analytics in precision agriculture. J. Big Data 2025, 12, 205. [Google Scholar] [CrossRef]
  56. Zhang, J.; Xiao, C.; Liang, X.; Yang, W.; Fang, Z.; Zhang, L.; Dai, R.; Li, W.; Ni, H. Machine Learning Based on a Swarm Intelligence Algorithm and Explainable Ai for the Prediction of Reservoir Temperature. Energy 2025, 341, 139412. [Google Scholar] [CrossRef]
Figure 1. Example images of tomato leaves from the Fujian Changle Tomato Leaf Field Data under multiple greenhouse scenarios (Rows 1–4) and from the Plant Village dataset (Row 5).
Figure 2. The improved DenseNet-BiFPN-ECA Fusion Network architecture. The input layer accepts three example images (healthy, early blight, late blight), and the output layer produces a label tensor [lb1, …, lbₘ], where m denotes the number of input images and lb represents the corresponding predicted label. In this case, m = 3, corresponding to the three input images.
Figure 3. Training loss and accuracy curves of different models on two datasets. (A) Results on the FC-TLFD dataset. (B) Results on the Plant Village dataset. The final validation accuracy of each model is reported in Table 1.
Figure 4. Training loss and accuracy curves of baseline models across datasets. (A) Results on the FC-TLFD dataset. (B) Results on the Plant Village dataset.
Figure 5. (A) Confusion matrix results of different models on the Plant Village dataset. (B) Confusion matrix results of different models on the FC-TLFD dataset. The four models used in this study are labeled above each sub-figure in top-to-bottom order: DenseNet (original model), DenseNet-BiFPN-ECA fusion network (improved model), VGG16, and ResNet101. In each confusion matrix of the FC-TLFD dataset, the five classification categories arranged vertically and horizontally from top to bottom and left to right are: healthy status, early blight, late blight, leaf mold, and gray leaf spot. For the Plant Village dataset, each confusion matrix contains ten classification categories: healthy leaves, early blight, late blight, leaf mold, gray leaf spot, leaf miner disease, mosaic virus, leaf spot blight, red spider mite, and yellow leaf curl virus.
Table 1. Final performance comparison of the improved fusion network models on the training dataset.

| Dataset | Model | Accuracy (%) | Final Loss | Parameters (M) | Inference Speed (ms/img) | Cross-Validation Accuracy (Mean ± SD, %) |
|---|---|---|---|---|---|---|
| FC-TLFD | DenseNet-ECA | 83.59 | 0.039 | 7.1 | 5.2 | 82.96 (±0.0029) |
| FC-TLFD | DenseNet-BiFPN | 88.28 | 0.163 | 8.3 | 6.8 | 88.54 (±0.0030) |
| FC-TLFD | DenseNet-BiFPN-ECA Fusion Network | 90.63 | 0.068 | 8.5 | 7.1 | 90.23 (±0.0025) |
| Plant Village | DenseNet-ECA | 97.92 | 0.017 | 7.1 | 5.2 | 96.58 (±0.0019) |
| Plant Village | DenseNet-BiFPN | 97.81 | 0.083 | 8.3 | 6.8 | 97.26 (±0.0023) |
| Plant Village | DenseNet-BiFPN-ECA Fusion Network | 98.47 | 0.033 | 8.5 | 7.1 | 98.36 (±0.0016) |
Table 2. Final performance comparison of the improved fusion model on the test dataset.

| Dataset | Model | Accuracy (%) | Final Loss | Parameters (M) | Inference Speed (ms/img) | Cross-Validation Accuracy (Mean ± SD, %) |
|---|---|---|---|---|---|---|
| FC-TLFD | DenseNet | 70.31 | 0.140 | 7.1 | 5.2 | 69.45 (±0.0031) |
| FC-TLFD | VGG16 | 83.59 | 0.022 | 138.4 | 15.8 | 83.63 (±0.0028) |
| FC-TLFD | ResNet101 | 73.44 | 0.149 | 44.5 | 10.3 | 73.21 (±0.0021) |
| FC-TLFD | DenseNet-BiFPN-ECA Fusion Network | 90.63 | 0.068 | 8.5 | 7.1 | 90.23 (±0.0025) |
| Plant Village | DenseNet | 72.66 | 0.561 | 7.1 | 5.2 | 72.35 (±0.0028) |
| Plant Village | VGG16 | 86.72 | 0.040 | 138.4 | 15.8 | 87.26 (±0.0029) |
| Plant Village | ResNet101 | 95.63 | 0.007 | 44.5 | 10.3 | 95.66 (±0.0022) |
| Plant Village | DenseNet-BiFPN-ECA Fusion Network | 98.47 | 0.033 | 8.5 | 7.1 | 98.36 (±0.0016) |

Share and Cite

MDPI and ACS Style

Liang, L.; Chen, J.; Tian, Y.; Wang, H.; Cai, Y.; Zhong, F.; Wang, S.; Hou, M.; Lu, J. DenseNet-BiFPN-ECA Fusion Network: An Enhanced Transfer Learning Approach for Tomato Leaf Disease Recognition. Horticulturae 2026, 12, 423. https://doi.org/10.3390/horticulturae12040423

