An Explainable Machine Learning Framework for the Hierarchical Management of Hot Pepper Damping-Off in Intensive Seedling Production
Abstract
1. Introduction
2. Materials and Methods
2.1. Data Acquisition
2.2. Data Preprocessing
2.2.1. Outlier Handling
2.2.2. Kalman Filter Smoothing
2.2.3. One-Hot Encoding
2.3. Feature Engineering
2.3.1. Feature Creation
2.3.2. Feature Selection
2.4. Dataset Preparation
2.4.1. Dataset Partitioning and Standardization
2.4.2. Handling Class Imbalance with SMOTE-ENN
2.5. Model Development, Evaluation, and Statistical Testing
2.5.1. Model Operating Environment
2.5.2. Development of Baseline Model
2.5.3. Model Evaluation Metrics
2.5.4. Model Performance Statistical Testing
2.5.5. Hyperparameter Tuning
2.5.6. Model Explainability Analysis
2.6. Overall Modeling Workflow
3. Results
3.1. Kalman Filter Smoothing Results
3.2. Feature Selection Results
3.3. Results of Unbalanced Sample Processing
3.4. Baseline Model Prediction Results
3.5. Model Difference Test Results
3.6. Hyperparameter Tuning and Optimal Model Selection
3.7. SHAP Explainability Analysis of the Model
3.7.1. Global Importance Analysis
3.7.2. Local Dependency Analysis
4. Discussion
4.1. Machine Learning for Predicting Disease Severity
4.2. Comparison with Existing Models
4.3. Horticultural Insights and Model Limitations
5. Conclusions
- (1) The ET model achieved an F1-score of 0.9734 and an AUC of 0.9969 in predicting hot pepper damping-off severity.
- (2) SHAP analysis was employed for both global and local interpretability, leading to the formulation of a hierarchical management and control strategy for damping-off.
- (3) Key interacting environmental variables were identified. Based on the dependency analysis results, threshold-based environmental control measures were proposed and implemented in the platform’s real-time prediction system.
Author Contributions
Funding
Data Availability Statement
Conflicts of Interest
References
- Xie, J.; Yu, J.; Chen, B.; Feng, Z.; Li, J.; Zhao, C.; Lyu, J.; Hu, L.; Gan, Y.; Siddique, K.H.M. Facility Cultivation Systems “设施农业”: A Chinese Model for the Planet. In Advances in Agronomy; Elsevier: Amsterdam, The Netherlands, 2017; Volume 145, pp. 1–42. ISBN 978-0-12-812417-8.
- Xie, J.; Yu, J.; Chen, B.; Feng, Z.; Lyu, J.; Hu, L.; Gan, Y.; Siddique, K.H.M. Gobi Agriculture: An Innovative Farming System That Increases Energy and Water Use Efficiencies. A Review. Agron. Sustain. Dev. 2018, 38, 62.
- Kowalska, A.; Lingham, S.; Maye, D.; Manning, L. Food Insecurity: Is Leagility a Potential Remedy? Foods 2023, 12, 3138.
- Sundari, M.T.; Darsono, D.; Sutrisno, J.; Antriyandarti, E. Analysis of Trade Potential and Factors Influencing Chili Export in Indonesia. Open Agric. 2023, 8, 20220205.
- Zou, Z.; Zou, X. Geographical and Ecological Differences in Pepper Cultivation and Consumption in China. Front. Nutr. 2021, 8, 718517.
- Delai, C.; Muhae-Ud-Din, G.; Abid, R.; Tian, T.; Liu, R.; Xiong, Y.; Ma, S.; Ghorbani, A. A Comprehensive Review of Integrated Management Strategies for Damping-off Disease in Chili. Front. Microbiol. 2024, 15, 1479957.
- Zhao, C.-J.; Li, M.; Yang, X.-T.; Sun, C.-H.; Qian, J.-P.; Ji, Z.-T. A Data-Driven Model Simulating Primary Infection Probabilities of Cucumber Downy Mildew for Use in Early Warning Systems in Solar Greenhouses. Comput. Electron. Agric. 2011, 76, 306–315.
- Deguine, J.-P.; Aubertot, J.-N.; Flor, R.J.; Lescourret, F.; Wyckhuys, K.A.G.; Ratnadass, A. Integrated Pest Management: Good Intentions, Hard Realities. A Review. Agron. Sustain. Dev. 2021, 41, 38.
- Ghorbani, A.; Emamverdian, A.; Pishkar, L.; Chashmi, K.A.; Salavati, J.; Zargar, M.; Chen, M. Melatonin-Mediated Nitric Oxide Signaling Enhances Adaptation of Tomato Plants to Aluminum Stress. S. Afr. J. Bot. 2023, 162, 443–450.
- Nanehkaran, F.M.; Razavi, S.M.; Ghasemian, A.; Ghorbani, A.; Zargar, M. Foliar Applied Potassium Nanoparticles (K-NPs) and Potassium Sulfate on Growth, Physiological, and Phytochemical Parameters in Melissa officinalis L. under Salt Stress. Environ. Sci. Pollut. Res. 2024, 31, 31108–31122.
- Corkley, I.; Fraaije, B.; Hawkins, N. Fungicide Resistance Management: Maximizing the Effective Life of Plant Protection Products. Plant Pathol. 2022, 71, 150–169.
- Liu, K.; Mu, Y.; Chen, X.; Ding, Z.; Song, M.; Xing, D.; Li, M. Towards Developing an Epidemic Monitoring and Warning System for Diseases and Pests of Hot Peppers in Guizhou, China. Agronomy 2022, 12, 1034.
- Liakos, K.G.; Busato, P.; Moshou, D.; Pearson, S.; Bochtis, D. Machine Learning in Agriculture: A Review. Sensors 2018, 18, 2674.
- Bai, R.; Wang, J.; Li, N.; Chen, R. Short- and Long-Term Prediction Models of Rubber Tree Powdery Mildew Disease Index Based on Meteorological Variables and Climate System Indices. Agric. For. Meteorol. 2024, 354, 110082.
- Ghorbani, A.; Emamverdian, A.; Pehlivan, N.; Zargar, M.; Razavi, S.M.; Chen, M. Nano-Enabled Agrochemicals: Mitigating Heavy Metal Toxicity and Enhancing Crop Adaptability for Sustainable Crop Production. J. Nanobiotechnol. 2024, 22, 91.
- Scortichini, M. Sustainable Management of Diseases in Horticulture: Conventional and New Options. Horticulturae 2022, 8, 517.
- Baker, K.M.; Kirk, W.W. Comparative Analysis of Models Integrating Synoptic Forecast Data into Potato Late Blight Risk Estimate Systems. Comput. Electron. Agric. 2007, 57, 23–32.
- Liu, K.; Zhang, C.; Yang, X.; Diao, M.; Liu, H.; Li, M. Development of an Occurrence Prediction Model for Cucumber Downy Mildew in Solar Greenhouses Based on Long Short-Term Memory Neural Network. Agronomy 2022, 12, 442.
- Wadhwa, D.; Malik, K. A Generalizable and Interpretable Model for Early Warning of Pest-Induced Crop Diseases Using Environmental Data. Comput. Electron. Agric. 2024, 227, 109472.
- Fenu, G.; Malloci, F.M. Artificial Intelligence Technique in Crop Disease Forecasting: A Case Study on Potato Late Blight Prediction. In Intelligent Decision Technologies; Czarnowski, I., Howlett, R.J., Jain, L.C., Eds.; Smart Innovation, Systems and Technologies; Springer: Singapore, 2020; Volume 193, pp. 79–89. ISBN 9789811559242.
- Sriwanna, K. Weather-Based Rice Blast Disease Forecasting. Comput. Electron. Agric. 2022, 193, 106685.
- Saha, S.; Kucher, O.D.; Utkina, A.O.; Rebouh, N.Y. Precision Agriculture for Improving Crop Yield Predictions: A Literature Review. Front. Agron. 2025, 7, 1566201.
- Liu, F.T.; Ting, K.M.; Zhou, Z.-H. Isolation Forest. In Proceedings of the 2008 Eighth IEEE International Conference on Data Mining, Pisa, Italy, 15–19 December 2008; pp. 413–422.
- Kalman, R.E. A New Approach to Linear Filtering and Prediction Problems. J. Basic Eng. 1960, 82, 35–45.
- Bitmead, R.R.; Hovd, M.; Abooshahab, M.A. A Kalman-Filtering Derivation of Simultaneous Input and State Estimation. Automatica 2019, 108, 108478.
- Wang, Z.; Li, W.; Tang, Z. Enhancing the Genomic Prediction Accuracy of Swine Agricultural Economic Traits Using an Expanded One-Hot Encoding in CNN Models. J. Integr. Agric. 2024, 24, 3574–3582.
- Kim, M.K.; Jeong, H.B.; Yu, N.; Park, B.M.; Chae, W.B.; Lee, O.J.; Lee, H.E.; Kim, S. Comparative Heat Stress Responses of Three Hot Pepper (Capsicum annuum L.) Genotypes Differing Temperature Sensitivity. Sci. Rep. 2023, 13, 14203.
- Bita, C.E.; Gerats, T. Plant Tolerance to High Temperature in a Changing Environment: Scientific Fundamentals and Production of Heat Stress-Tolerant Crops. Front. Plant Sci. 2013, 4, 273.
- Chaves, S.W.P.; Coelho, R.D.; Costa, J.d.O.; Tapparo, S.A. Micrometeorological Modeling and Water Consumption of Tabasco Pepper Cultivated under Greenhouse Conditions. Ital. J. Agrometeorol. 2021, 21–36.
- Spearman, C. The Proof and Measurement of Association between Two Things. Am. J. Psychol. 1904, 15, 72–101.
- Cohen, J. The Analysis of Variance. In Statistical Power Analysis for the Behavioral Sciences; Routledge: New York, NY, USA, 2013; ISBN 978-0-203-77158-7.
- Schober, P.; Boer, C.; Schwarte, L.A. Correlation Coefficients: Appropriate Use and Interpretation. Anesth. Analg. 2018, 126, 1763–1768.
- Senter, H.F. Applied Linear Statistical Models. J. Am. Stat. Assoc. 2008, 103, 880.
- Zhang, Q.; Sun, S. Weighted Data Normalization Based on Eigenvalues for Artificial Neural Network Classification; Springer: Berlin/Heidelberg, Germany, 2017.
- Batista, G.E.A.P.A.; Prati, R.C.; Monard, M.C. A Study of the Behavior of Several Methods for Balancing Machine Learning Training Data. ACM SIGKDD Explor. Newsl. 2004, 6, 20–29.
- Shahriari, B.; Swersky, K.; Wang, Z.; Adams, R.P.; de Freitas, N. Taking the Human Out of the Loop: A Review of Bayesian Optimization. Proc. IEEE 2016, 104, 148–175.
- Scikit-Learn Developers. User Guide (Version 1.4) [Computer Software Documentation]. 2024. Available online: https://scikit-learn.org/1.4/user_guide.html (accessed on 4 October 2025).
- Mohammed, S.; Arshad, S.; Alsilibe, F.; Moazzam, M.F.U.; Bashir, B.; Prodhan, F.A.; Alsalman, A.; Vad, A.; Ratonyi, T.; Harsányi, E. Utilizing Machine Learning and CMIP6 Projections for Short-Term Agricultural Drought Monitoring in Central Europe (1900–2100). J. Hydrol. 2024, 633, 130968.
- Khan, N.; Sachindra, D.A.; Shahid, S.; Ahmed, K.; Shiru, M.S.; Nawaz, N. Prediction of Droughts over Pakistan Using Machine Learning Algorithms. Adv. Water Resour. 2020, 139, 103562.
- Aneece, I.; Thenkabail, P.S. Classifying Crop Types Using Two Generations of Hyperspectral Sensors (Hyperion and DESIS) with Machine Learning on the Cloud. Remote Sens. 2021, 13, 4704.
- Tageldin, A.; Adly, D.; Mostafa, H.; Mohammed, H.S. Applying Machine Learning Technology in the Prediction of Crop Infestation with Cotton Leafworm in Greenhouse. bioRxiv 2020.
- Gao, Y.; Huang, C.; Zhang, X.; Zhang, Z.; Chen, B. Vertical Stratification-Enabled Early Monitoring of Cotton Verticillium Wilt Using in-Situ Leaf Spectroscopy via Machine Learning Models. Front. Plant Sci. 2025, 16, 1599877.
- Nagesh, O.S.; Budaraju, R.R.; Kulkarni, S.S.; Vinay, M.; Ajibade, S.-S.M.; Chopra, M.; Jawarneh, M.; Kaliyaperumal, K. Boosting Enabled Efficient Machine Learning Technique for Accurate Prediction of Crop Yield towards Precision Agriculture. Discov. Sustain. 2024, 5, 78.
- Zhao, Y.; Dong, H.; Huang, W.; He, S.; Zhang, C. Seamless Terrestrial Evapotranspiration Estimation by Machine Learning Models across the Contiguous United States. Ecol. Indic. 2024, 165, 112203.
- Branstad-Spates, E.H.; Castano-Duque, L.; Mosher, G.A.; Hurburgh, C.R.; Owens, P.; Winzeler, E.; Rajasekaran, K.; Bowers, E.L. Gradient Boosting Machine Learning Model to Predict Aflatoxins in Iowa Corn. Front. Microbiol. 2023, 14, 1248772.
- Ghosh, S.S.; Mandal, D.; Kumar, S.; Bhogapurapu, N.; Banerjee, B.; Siqueira, P.; Bhattacharya, A. An Evidence Modified Gaussian Process Classifier (EM-GPC) for Crop Classification Using Dual-Polarimetric C- and L- Band SAR Data. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 2024, 17, 18683–18702.
- Vázquez-Veloso, A.; Toraño Caicoya, A.; Bravo, F.; Biber, P.; Uhl, E.; Pretzsch, H. Does Machine Learning Outperform Logistic Regression in Predicting Individual Tree Mortality? Ecol. Inform. 2025, 88, 103140.
- Shahoveisi, F.; Riahi Manesh, M.; Del Río Mendoza, L.E. Modeling Risk of Sclerotinia sclerotiorum-Induced Disease Development on Canola and Dry Bean Using Machine Learning Algorithms. Sci. Rep. 2022, 12, 864.
- Kim, Y.; Roh, J.-H.; Kim, H.Y. Early Forecasting of Rice Blast Disease Using Long Short-Term Memory Recurrent Neural Networks. Sustainability 2017, 10, 34.
- Xiao, Q.; Li, W.; Kai, Y.; Chen, P.; Zhang, J.; Wang, B. Occurrence Prediction of Pests and Diseases in Cotton on the Basis of Weather Factors by Long Short Term Memory Network. BMC Bioinform. 2019, 20, 688.
- Fu, R.; Zhang, Z.; Li, L. Using LSTM and GRU Neural Network Methods for Traffic Flow Prediction. In Proceedings of the 2016 31st Youth Academic Annual Conference of Chinese Association of Automation (YAC), Wuhan, China, 11–13 November 2016; pp. 324–328.
- Friedman, M. The Use of Ranks to Avoid the Assumption of Normality Implicit in the Analysis of Variance. J. Am. Stat. Assoc. 1937, 32, 675–701.
- Demšar, J. Statistical Comparisons of Classifiers over Multiple Data Sets. J. Mach. Learn. Res. 2006, 7, 1–30.
- Borror, C.M. Practical Nonparametric Statistics, 3rd Ed. J. Qual. Technol. 2001, 33, 260.
- Brown, I.; Mues, C. An Experimental Comparison of Classification Algorithms for Imbalanced Credit Scoring Data Sets. Expert Syst. Appl. 2012, 39, 3446–3453.
- García, S.; Fernández, A.; Luengo, J.; Herrera, F. Advanced Nonparametric Tests for Multiple Comparisons in the Design of Experiments in Computational Intelligence and Data Mining: Experimental Analysis of Power. Inf. Sci. 2010, 180, 2044–2064.
- Akiba, T.; Sano, S.; Yanase, T.; Ohta, T.; Koyama, M. Optuna: A Next-Generation Hyperparameter Optimization Framework. In Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, Anchorage, AK, USA, 4–8 August 2019.
- Jones, D.R.; Schonlau, M.; Welch, W.J. Efficient Global Optimization of Expensive Black-Box Functions. J. Glob. Optim. 1998, 13, 455–492.
- Lundberg, S.M.; Lee, S.-I. A Unified Approach to Interpreting Model Predictions. In Proceedings of the 31st International Conference on Neural Information Processing Systems, Long Beach, CA, USA, 4–9 December 2017; Curran Associates Inc.: Red Hook, NY, USA, 2017; pp. 4768–4777.
- Mishra, P. Model Explainability and Interpretability. In Practical Explainable AI Using Python: Artificial Intelligence Model Explanations Using Python-Based Libraries, Extensions, and Frameworks; Mishra, P., Ed.; Apress: Berkeley, CA, USA, 2022; pp. 1–22. ISBN 978-1-4842-7158-2.
- Shuqin, J.; Fang, Z. Zero Growth of Chemical Fertilizer and Pesticide Use: China’s Objectives, Progress and Challenges. J. Resour. Ecol. 2018, 9, 50–58.
- Lázaro, E.; Makowski, D.; Vicent, A. Decision Support Systems Halve Fungicide Use Compared to Calendar-Based Strategies without Increasing Disease Risk. Commun. Earth Environ. 2021, 2, 224.
- Magarey, R.D.; Travis, J.W.; Russo, J.M.; Seem, R.C.; Magarey, P.A. Decision Support Systems: Quenching the Thirst. Plant Dis. 2002, 86, 4–14.
- Liang, L.; Shi, H.; Wang, Z.; Wang, S.; Li, C.; Diao, M. Research on Time Series Prediction Model for Multi-Factor Environmental Parameters in Facilities Based on LSTM-AT-DP Model. Front. Plant Sci. 2025, 16, 1652478.
- Zhao, G.; Zhao, Q.; Webber, H.; Johnen, A.; Rossi, V.; Nogueira Junior, A.F. Integrating Machine Learning and Change Detection for Enhanced Crop Disease Forecasting in Rice Farming: A Multi-Regional Study. Eur. J. Agron. 2024, 160, 127317.
Variable Name | Measurement Range & Accuracy | Statistical Parameters | Sampling Interval | Unit |
---|---|---|---|---|
Air temperature | −30 to 70 °C, ±0.20 °C | Max: 53.83; Min: 11.61; Mean: 28.04; SD: 8.96 | 15 min | °C |
Air relative humidity | 0 to 100%, ±2% | Max: 90.58; Min: 4.71; Mean: 40.64; SD: 20.30 | 15 min | % |
Solar radiation | 0 to 1800 W/m², ±5% | Max: 665.86; Min: 0; Mean: 64.76; SD: 116.85 | 15 min | W/m² |
Substrate moisture content | 0 to 100%, ±0.02% | Max: 81; Min: 39; Mean: 68; SD: 8 | 10 min | % |
Outdoor wind speed | 0 to 67 m/s, ±0.30 m/s | Max: 13.10; Min: 0; Mean: 1.46; SD: 1.76 | 10 min | m/s |
Disease Grading | Symptom Description |
---|---|
0 | No visible symptoms |
1 | Slight discoloration or faint lesions at the stem base |
3 | Distinct lesions at the stem-root junction, but plant growth is unaffected |
5 | Lesions or rot covering 1/3 to 1/2 of the stem base or root collar |
7 | Complete girdling of the stem base or root collar with discoloration and rot |
9 | Whole plant wilts and dies |
Feature Name | Input Variable | Unit |
---|---|---|
Maximum air temperature | max_t | °C |
Minimum air temperature | min_t | °C |
Mean air temperature | mean_t | °C |
Temperature range (max_t − min_t) | dt | °C |
Duration of temperature >28 °C | tt_28 | minutes |
Duration of temperature >30 °C | tt_30 | minutes |
Maximum relative humidity | max_rh | % |
Minimum relative humidity | min_rh | % |
Mean relative humidity | mean_rh | % |
Relative humidity range (max_rh − min_rh) | drh | % |
Maximum solar radiation | max_sr | W/m² |
Mean solar radiation | mean_sr | W/m² |
Maximum substrate moisture content | max_w | % |
Minimum substrate moisture content | min_w | % |
Substrate moisture range (max_w − min_w) | de | % |
Mean wind speed | mean_ws | m/s |
Duration of non-zero wind speed | wst | minutes |
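
The derived features in the table above can be sketched as simple aggregations over one window of raw sensor readings. The sketch below is illustrative only: the function names are hypothetical, the aggregation window is assumed to be whatever list of readings is passed in, and the default intervals (15 min for air temperature, 10 min for wind speed) are taken from the sensor table. Duration features are approximated as the number of readings meeting the condition multiplied by the sampling interval.

```python
from statistics import mean


def temperature_features(temps_c, interval_min=15):
    """Aggregate one window of air-temperature readings (°C) into features."""
    if not temps_c:
        raise ValueError("at least one reading is required")
    max_t, min_t = max(temps_c), min(temps_c)
    return {
        "max_t": max_t,
        "min_t": min_t,
        "mean_t": mean(temps_c),
        "dt": max_t - min_t,
        # Duration features: count of readings above the threshold
        # multiplied by the sampling interval, giving minutes.
        "tt_28": sum(t > 28 for t in temps_c) * interval_min,
        "tt_30": sum(t > 30 for t in temps_c) * interval_min,
    }


def wind_duration(speeds_ms, interval_min=10):
    """wst: minutes with non-zero outdoor wind speed."""
    return sum(s > 0 for s in speeds_ms) * interval_min


# Illustrative readings, not data from the study.
readings = [20.0, 27.0, 29.0, 31.0, 25.0]
features = temperature_features(readings)
```

The humidity, radiation, and substrate-moisture features follow the same max/min/mean/range pattern and could reuse the same structure.
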
Model | Important Parameters | Research Task |
---|---|---|
RF | n_estimators, max_depth, max_features, min_samples_leaf | Short-term agricultural drought monitoring [38] |
SVM | C, kernel, gamma | Potato blight disease [20] |
KNN | n_neighbors, weights | Prediction of droughts [39] |
NB | alpha | Classifying Crop Types [40] |
MLP | hidden_layer_sizes, activation, learning_rate_init | Weather-based rice blast disease forecasting [21] |
XGBoost | n_estimators, learning_rate, max_depth, gamma, lambda | Predict the manifestation of Egyptian cotton leaf worm [41] |
CatBoost | iterations, learning_rate, depth | Early warning of pest-induced crop diseases [19] |
LightGBM | num_leaves, learning_rate, n_estimators | Early monitoring of cotton Verticillium wilt [42] |
AdaBoost | n_estimators, learning_rate, base_estimator | Crop yield prediction [43] |
ET | n_estimators, max_depth, max_features | Seamless terrestrial evapotranspiration estimation [44] |
BRF | n_estimators, sampling_strategy, base_estimator | Early warning systems for pest-induced crop diseases [19] |
DT | max_depth, min_samples_split, criterion | Short-term agricultural drought monitoring [38] |
GBM | n_estimators, learning_rate, max_depth | Predicting aflatoxin contamination in Iowa corn [45] |
GPC | kernel, optimizer, n_restarts_optimizer | Accurate crop classification [46] |
OVR-Logistic | C, penalty, solver, max_iter | Predicting individual tree mortality [47] |
ANN | layers, units, learning_rate_init, dropout | Diseases on canola and dry bean crops [48] |
RNN | num_layers, hidden_size, units, activation, optimizer, dropout | Predict rice blast disease [49] |
LSTM | num_layers, hidden_size, units, activation, optimizer, dropout | Predict the occurrence of cotton pests and diseases [50] |
GRU | num_layers, hidden_size, units, activation, optimizer, dropout | Prediction of plant sap flow in precision agriculture [51] |
Input Variable | Original VIF | Final VIF | Selection Status |
---|---|---|---|
max_t | ∞ | 3.18 | retained |
min_t | ∞ | 2.06 | retained |
mean_t | 48.29 | — | removed |
dt | ∞ | — | removed |
tt_28 | 14.67 | — | removed |
tt_30 | 13 | — | removed |
max_rh | ∞ | 2.52 | retained |
min_rh | ∞ | 2.67 | retained |
drh | ∞ | — | removed |
mean_rh | 23.35 | — | removed |
mean_sr | 8.92 | 3.38 | retained |
max_sr | 4.24 | 2.94 | retained |
max_w | 2.11 | 2.05 | retained |
min_w | 2.12 | 2.02 | retained |
de | 1.84 | 1.75 | retained |
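
The infinite initial VIF values can be read directly off the feature definitions. As a reminder of the standard definition (not quoted from the source), the variance inflation factor of feature j is obtained from the coefficient of determination of regressing that feature on all the others. Because dt is defined as max_t minus min_t, and drh as max_rh minus min_rh, each of these triples is exactly linearly dependent:

```latex
\mathrm{VIF}_j = \frac{1}{1 - R_j^2}

% dt is an exact linear combination of max_t and min_t, so regressing
% any one of the three on the other two fits perfectly:
\mathtt{dt} = \mathtt{max\_t} - \mathtt{min\_t}
\;\Longrightarrow\; R_j^2 = 1
\;\Longrightarrow\; \mathrm{VIF}_j \to \infty
\quad \text{for } j \in \{\mathtt{max\_t}, \mathtt{min\_t}, \mathtt{dt}\}
```

Dropping the range features breaks the exact dependence, which is consistent with max_t, min_t, max_rh, and min_rh all showing finite final VIFs in the table.
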
Performance Metric | Test Statistic | p-Value | Conclusion |
---|---|---|---|
F1-score | 23.7808 | 8.8374 × 10⁻⁵ | Highly Significant |
AUC value | 18.9434 | 8.0633 × 10⁻⁴ | Highly Significant |
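
As a hedged illustration of how such a test can be computed, the sketch below assumes the statistic is a Friedman chi-square (the reference list cites Friedman and Demšar) comparing five models, which gives 4 degrees of freedom. Neither assumption is stated in the table itself. Under that assumption, the closed-form upper tail of the chi-square distribution for even degrees of freedom reproduces the tabulated p-values to within rounding. The function names are hypothetical, and `friedman_statistic` applies no tie-correction factor.

```python
import math


def friedman_statistic(scores):
    """Friedman chi-square for scores[i][j] = metric of model j on fold i."""
    n, k = len(scores), len(scores[0])
    rank_sums = [0.0] * k
    for row in scores:
        order = sorted(range(k), key=lambda j: row[j])
        i = 0
        while i < k:
            # Group tied values and assign each the average of their ranks.
            j = i
            while j + 1 < k and row[order[j + 1]] == row[order[i]]:
                j += 1
            avg_rank = (i + j) / 2 + 1
            for m in range(i, j + 1):
                rank_sums[order[m]] += avg_rank
            i = j + 1
    total = sum(r * r for r in rank_sums)
    return 12.0 / (n * k * (k + 1)) * total - 3.0 * n * (k + 1)


def chi2_sf_even_df(x, df):
    """Upper-tail chi-square probability; closed form valid for even df only."""
    if df <= 0 or df % 2:
        raise ValueError("closed form requires a positive even df")
    half = x / 2.0
    terms = sum(half ** i / math.factorial(i) for i in range(df // 2))
    return math.exp(-half) * terms


# Tabulated statistics, with df = 4 as an assumption (five models).
p_f1 = chi2_sf_even_df(23.7808, df=4)
p_auc = chi2_sf_even_df(18.9434, df=4)
```

For cross-validated scores, `friedman_statistic` would take one row per fold and one column per model; with odd degrees of freedom a library routine for the chi-square tail would be needed instead of the closed form.
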
Model | Parameter | Default Value | Search Range | Optimal Value |
---|---|---|---|---|
ET | n_estimators | 100 | 5–2000 | 1770 |
ET | max_depth | None | 3–200 | 200 |
ET | min_samples_split | 2 | 2–60 | 2 |
ET | min_samples_leaf | 1 | 1–60 | 1 |
ET | max_leaf_nodes | None | 10–200 | 166 |
ET | max_features | sqrt | 0.1–1.0 | 0.1 |
ET | max_samples | None | sqrt / log2 / None | None |
CatBoost | Learning_rate | 0.1 | 0.001–0.1 | 0.0986 |
CatBoost | Depth | 6 | 4–15 | 14 |
CatBoost | Iterations | 1000 | 10–60 | 51 |
CatBoost | L2_leaf_reg | 3 | 1–10 | 1 |
CatBoost | Random_strength | 1 | 1–10 | 3.0605 |
CatBoost | Border_count | 255 | 5–15 | 14 |
SVM | C | 1.0 | 0.001–1000 | 1000 |
SVM | gamma | scale | 1 × 10⁻⁵ to 10 | 0.0159 |
SVM | Kernel | rbf | Linear / rbf / poly | rbf |
GPC | Kernel_type | RBF | RBF / Matern | Matern |
GPC | Length_scale | 1.0 | 0.001–10 | 0.0054 |
GPC | alpha | 0 | 1 × 10⁻⁷ to 1 | 0.0287 |
GPC | n_restarts | 0 | 2–60 | 37 |
GPC | max_iter_predict | 100 | 20–300 | 145 |
LightGBM | Learning_rate | 0.1 | 0.003–0.1 | 0.1 |
LightGBM | Max_depth | −1 | 3–15 | 3 |
LightGBM | N_estimators | 100 | 100–1000 | 1000 |
LightGBM | subsample | 1.0 | 0.6–1.0 | 0.8845 |
LightGBM | Colsample_bytree | 1.0 | 0.6–1.0 | 0.7035 |
LightGBM | Lambda_l1 | 0 | 0.003–1.0 | 0.6485 |
LightGBM | Lambda_l2 | 0 | 0.003–1.0 | 0.0213 |
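
For readers who want to reproduce the tuned ET configuration, the optimal values in the table map onto scikit-learn's `ExtraTreesClassifier` arguments as sketched below. This is a minimal sketch, not the authors' code: the dictionary keys are scikit-learn's parameter names, while `build_tuned_extra_trees` and its `random_state` argument are hypothetical additions.

```python
# Optimal ET values copied from the hyperparameter table.
ET_TUNED_PARAMS = {
    "n_estimators": 1770,
    "max_depth": 200,
    "min_samples_split": 2,
    "min_samples_leaf": 1,
    "max_leaf_nodes": 166,
    "max_features": 0.1,  # float: fraction of features considered per split
    "max_samples": None,  # only has an effect when bootstrap=True
}


def build_tuned_extra_trees(random_state=0):
    """Instantiate an ExtraTreesClassifier with the tabulated optimum.

    random_state is an assumption added for reproducibility; the source
    does not report a seed.
    """
    # Imported lazily so the parameter dictionary is usable on its own.
    from sklearn.ensemble import ExtraTreesClassifier

    return ExtraTreesClassifier(random_state=random_state, **ET_TUNED_PARAMS)
```

The returned estimator would then be fitted on the standardized, SMOTE-ENN-resampled training split with `fit(X_train, y_train)`.
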
Model | F1-Score (Before Tuning) | F1-Score (After Tuning) | F1-Score (Test Set) | AUC (Before Tuning) | AUC (After Tuning) | AUC (Test Set)
---|---|---|---|---|---|---|
ET | 0.9754 | 1 | 0.9734 | 0.9984 | 1 | 0.9969 |
CatBoost | 0.9441 | 1 | 0.9504 | 0.9957 | 1 | 0.9988 |
SVM | 0.9357 | 1 | 0.9734 | 0.9948 | 1 | 0.9877 |
GPC | 0.9475 | 0.9687 | 0.8738 | 0.9938 | 0.9985 | 0.9766 |
LightGBM | 0.9379 | 1 | 0.9457 | 0.9878 | 1 | 0.9716 |
Stage of Intervention | Control Measures |
---|---|
Low severity | When the minimum air temperature falls below 20.01 °C, the minimum substrate moisture content should be maintained above 73.2% to ensure disease control. |
Moderate severity | A maximum substrate moisture content greater than 78.7% significantly inhibits disease progression. |
High severity | When relative humidity exceeds 67.94% and the minimum temperature drops below 19.24 °C, environmental humidity should be actively reduced to mitigate disease risk. |
Full stage | When solar radiation exceeds 262.1 W/m², precise water management becomes critical: irrigation volume and humidity control must be dynamically coordinated. |
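
The threshold rules in the table above lend themselves to a small rule-based check. The following is a sketch under stated assumptions, not the platform's implementation: the function name, argument names, stage labels, and advisory strings are all hypothetical, the table does not say which humidity reading the high-severity rule uses (a generic `rh` is assumed), and the moderate-severity row is read here as a recommendation to keep maximum substrate moisture above 78.7%.

```python
def control_advice(stage, min_t=None, min_w=None, max_w=None, rh=None, sr=None):
    """Return advisory strings for one predicted severity stage.

    stage: "low", "moderate" or "high" (hypothetical labels).
    Readings left as None are treated as unavailable and skipped.
    Units: temperatures in °C, moisture and humidity in %, radiation in W/m².
    """
    advice = []
    if stage == "low" and min_t is not None and min_w is not None:
        if min_t < 20.01 and min_w <= 73.2:
            advice.append("Raise minimum substrate moisture above 73.2%.")
    elif stage == "moderate" and max_w is not None:
        if max_w <= 78.7:
            advice.append("Raise maximum substrate moisture above 78.7%.")
    elif stage == "high" and rh is not None and min_t is not None:
        if rh > 67.94 and min_t < 19.24:
            advice.append("Actively reduce air humidity.")
    # The solar-radiation rule applies at every stage.
    if sr is not None and sr > 262.1:
        advice.append("Coordinate irrigation volume with humidity control.")
    return advice
```

For example, `control_advice("high", rh=75.0, min_t=18.5, sr=300.0)` returns both the humidity-reduction message and the irrigation-coordination message, because the full-stage radiation rule is evaluated independently of the stage-specific rule.
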
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Wang, Z.; Liu, K.; Liang, L.; Li, C.; Ji, T.; Xu, J.; Liu, H.; Diao, M. An Explainable Machine Learning Framework for the Hierarchical Management of Hot Pepper Damping-Off in Intensive Seedling Production. Horticulturae 2025, 11, 1258. https://doi.org/10.3390/horticulturae11101258