1. Introduction
Agricultural engineering stands at a critical juncture where escalating demands for food security, resource efficiency, and climate resilience necessitate transformative technological integration [
1,
2]. The integration of machine learning (ML) and deep learning (DL) with agricultural systems has emerged as a pivotal frontier, enabling unprecedented capabilities in interpreting complex biological, environmental, and operational datasets [
3,
4,
5]. Traditional analytical approaches increasingly falter when confronted with the nonlinear dynamics of crop physiology, soil–plant–atmosphere interactions, and supply chain logistics, prompting widespread adoption of data-driven intelligence across the agricultural value chain [
6,
7,
8,
9].
Machine learning and deep learning techniques have fundamentally transformed agricultural engineering research and practice, offering capabilities for pattern recognition, predictive modeling, and automated decision making that address complex agricultural challenges previously considered intractable [
8,
10,
11]. From precision irrigation management to real-time crop disease detection, these applications demonstrate the remarkable potential to enhance agricultural productivity while reducing environmental impact and operational costs [
12,
13,
14]. The integration of machine learning with advanced sensing technologies has been particularly transformative [
15,
16]. Hyperspectral imaging, near-infrared spectroscopy, and other non-destructive analytical methods generate vast amounts of high-dimensional data that traditional statistical approaches cannot effectively process [
17,
18,
19]. Machine learning algorithms excel at extracting meaningful patterns from these complex datasets, enabling the rapid, accurate assessment of crop health, soil conditions, food quality, and environmental parameters [
20,
21,
22,
23]. Deep learning architectures, particularly convolutional neural networks, have revolutionized image-based agricultural applications by automatically learning hierarchical feature representations without requiring manual feature engineering [
24,
25,
26]. This advancement has enabled breakthrough applications in crop monitoring, livestock management, and food quality assessment that surpass human capabilities in both speed and accuracy [
27,
28]. However, practical implementation reveals both opportunities and challenges. While these technologies demonstrate exceptional performance in controlled research environments, their deployment in real-world agricultural contexts requires careful consideration of computational constraints, environmental variability, data availability, and economic feasibility [
29,
30,
31]. The development of lightweight models, robust algorithms for noisy field data, and cost-effective sensing solutions remains critical for widespread adoption [
32,
33,
34,
35].
This survey examines the current state of machine learning and deep learning applications in agricultural engineering through a systematic analysis of recent research developments. The investigation addresses fundamental questions regarding the types of models currently employed, the specific agricultural engineering challenges most frequently addressed, the methodological approaches used for experimental evaluation, the diverse application domains where these techniques have demonstrated success, and the emerging research challenges and future trends. The analysis reveals a rich ecosystem of computational approaches ranging from traditional machine learning algorithms such as support vector machines and random forests to sophisticated deep learning architectures, including convolutional neural networks and recurrent neural networks. These technologies address challenges spanning crop and livestock management, food quality and safety assessment, environmental monitoring, and agricultural automation. Understanding this landscape is essential for researchers, practitioners, and policymakers seeking to leverage these technologies for addressing global food security challenges.
This survey addresses critical gaps in the existing agricultural machine learning literature through unique systematic contributions. Unlike previous reviews that focus on narrow domains [
8] or provide descriptive overviews [
4], this work employs a structured meta-analysis (RQ1–RQ5) with quantitative performance synthesis, revealing that DL methods achieve accuracy improvements of 5–10% over traditional ML approaches. The survey provides the first comprehensive taxonomy of ML/DL techniques for agricultural engineering, systematically examines advanced sensing technology integration (hyperspectral imaging, Raman spectroscopy), and includes a rigorous assessment of dataset availability and experimental reproducibility. These evidence-based frameworks bridge the gap between technological possibility and practical deployment, offering guidance for implementing ML/DL solutions in real-world agricultural contexts while addressing computational, interpretability, and environmental challenges not systematically examined in prior reviews. This analysis aligns with computational sciences and engineering applications, addressing the intersection of advanced computing methods with practical agricultural systems, a domain where computational innovation drives real-world impact through intelligent data processing, automated decision making, and precision resource management [
19,
35,
36]. The work contributes to the growing body of literature on machine learning applications in engineering contexts, leveraging business intelligence principles and automated detection frameworks that are revolutionizing traditional practices [
37,
38,
39].
The investigation addresses five fundamental research questions:
RQ1: What machine learning and deep learning models are employed in agricultural engineering applications?
RQ2: What agricultural engineering challenges are most frequently addressed by these computational approaches?
RQ3: How are experimental evaluations performed to validate machine learning and deep learning solutions in agricultural contexts?
RQ4: What are the known application domains where these techniques have demonstrated success?
RQ5: What are the future research challenges and emerging trends in the application of machine learning and deep learning in agricultural engineering?
We formulate these questions to ensure a comprehensive understanding of the current state of ML and DL in agricultural engineering, examining methodological advances (RQ1), engineering challenges (RQ2), validation practices (RQ3), practical contributions (RQ4), and future directions (RQ5). Together, these questions structure
Section 4 of this survey.
2. Review Methodology
2.1. Search Strategy and Database Selection
This systematic review employed a comprehensive search strategy across multiple academic databases to ensure thorough coverage of the relevant literature in machine learning and deep learning applications within agricultural engineering [
40,
41]. The primary databases searched included Web of Science, Scopus, IEEE Xplore, ScienceDirect, and PubMed, selected for their extensive coverage of engineering, computer science, and agricultural research publications. Additional searches were conducted in specialized databases, including AGRICOLA and CAB Abstracts, to capture domain-specific agricultural engineering studies that might not be indexed in general scientific databases. The search strategy utilized a combination of controlled vocabulary terms and free-text keywords to maximize retrieval while maintaining relevance [
42,
43,
44]. Primary search terms included variations of “machine learning,” “deep learning,” “artificial intelligence,” “neural networks,” and “data mining” combined with agricultural domain terms such as “agriculture,” “farming,” “crop,” “livestock,” “food quality,” “precision agriculture,” and “agricultural engineering.” Boolean operators were employed to create comprehensive search strings that captured the intersection of computational methods and agricultural applications while excluding irrelevant domains such as biomedical applications of similar technologies. The temporal scope of the search encompassed publications from 2015 to 2025, a period selected to capture the rapid advancement and adoption of machine learning techniques in agricultural applications while ensuring relevance to current technological capabilities. This timeframe corresponds to the widespread availability of deep learning frameworks and the increasing accessibility of computational resources necessary for implementing sophisticated machine learning algorithms in agricultural contexts.
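The Boolean combination described above can be sketched in a few lines of Python. This is a minimal illustration only: the term lists are taken from the text, but the exact query strings submitted to each database are not reported, and the exclusion terms (`EXCLUDE_TERMS`) and helper names are assumptions introduced for the example.

```python
# Term lists come from the search-strategy description; the exact strings
# submitted to each database are not given, so this is illustrative only.
METHOD_TERMS = ["machine learning", "deep learning", "artificial intelligence",
                "neural networks", "data mining"]
DOMAIN_TERMS = ["agriculture", "farming", "crop", "livestock", "food quality",
                "precision agriculture", "agricultural engineering"]
# Hypothetical exclusion terms standing in for "irrelevant domains".
EXCLUDE_TERMS = ["biomedical", "clinical"]


def or_group(terms):
    """Quote each term, join with OR, and wrap the group in parentheses."""
    return "(" + " OR ".join(f'"{t}"' for t in terms) + ")"


def build_query(methods, domains, excluded=()):
    """AND the method and domain groups, then optionally exclude a third group."""
    query = f"{or_group(methods)} AND {or_group(domains)}"
    if excluded:
        query += f" AND NOT {or_group(excluded)}"
    return query


query = build_query(METHOD_TERMS, DOMAIN_TERMS, EXCLUDE_TERMS)
print(query)
```

Individual databases differ in operator syntax and field tags, so a string built this way would normally be adapted per database, with the 2015–2025 window applied through each interface's date filter.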
2.2. Geographical Scope and Limitations
The systematic search encompassed the global literature, though analysis revealed significant geographical imbalances, with studies originating primarily from high-income countries (North America: 34%, Europe: 24%, East Asia: 28%) and limited representation from Sub-Saharan Africa (3%), Latin America (6%), and South Asia (5%) [
45,
46]. Despite efforts to address this bias through regional database searches, multilingual inclusion, and consultation with international agricultural organizations, the potential underrepresentation of indigenous knowledge systems and research from resource-constrained institutions remains [
47,
48]. These geographical limitations suggest that reported performance metrics and implementation strategies may not transfer directly to agricultural contexts where food security challenges are most acute, highlighting the need for collaborative frameworks that engage researchers from diverse geographical and economic contexts.
2.3. Inclusion and Exclusion Criteria
Inclusion criteria were carefully designed to focus on studies that directly addressed machine learning or deep learning applications in agricultural engineering contexts. Studies were included if they presented original research involving the development, application, or comparison of machine learning algorithms for solving specific agricultural challenges. This included research on crop management, livestock monitoring, food quality assessment, precision agriculture, agricultural automation, environmental monitoring in agricultural systems, and post-harvest processing applications. Publications were required to demonstrate clear methodological rigor, including detailed descriptions of data collection procedures, model development approaches, and evaluation methodologies. Studies that merely mentioned machine learning techniques without substantial implementation or evaluation were excluded to ensure focus on contributions that advance the field through practical applications and validated results. Additionally, research had to involve real agricultural data or realistic simulations rather than purely theoretical developments without empirical validation. Exclusion criteria eliminated studies that focused primarily on non-agricultural applications of machine learning, even if they claimed agricultural relevance. Research dealing exclusively with laboratory-scale experiments without clear pathways to practical agricultural implementation was also excluded. Publications that did not provide sufficient technical detail for evaluation of methodological quality, studies published in languages other than English, and conference abstracts without full papers were excluded to maintain consistency and accessibility of the reviewed literature. Review articles, meta-analyses, and opinion pieces were excluded from the primary analysis but were consulted for contextual information and to identify additional relevant primary studies through backward citation searching. 
Patents and technical reports were similarly excluded to focus on peer-reviewed research that had undergone rigorous academic evaluation processes.
2.4. Study Selection Process
The study selection process followed a systematic multi-stage approach designed to ensure reproducibility and minimize selection bias. Initial screening involved removing duplicate records across databases using reference management software, followed by title and abstract screening to eliminate obviously irrelevant publications. Two independent authors conducted this initial screening using predetermined criteria, with disagreements resolved through discussion and consultation with a third author when necessary. Full-text screening was performed on potentially relevant articles identified during the initial screening phase. This detailed evaluation assessed methodological quality, relevance to agricultural engineering applications, and the depth of machine learning implementation. Studies were categorized based on their primary application domains, methodological approaches, and the types of machine learning techniques employed to facilitate subsequent analysis and synthesis. A standardized data extraction form was developed to capture relevant information from the included studies systematically. This form included fields for bibliographic information, study objectives, agricultural application domains, the types of data used, the machine learning algorithms employed, experimental design characteristics, the performance metrics reported, key findings, and limitations acknowledged by the authors.
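As a rough sketch of the duplicate-removal step and the standardized extraction form, the Python below models one extraction record and a simple matching rule. The field names, the matching rule (DOI first, then normalized title plus year), and the sample titles and DOIs are all assumptions made for illustration; the review itself reports only that reference management software was used.

```python
import re
from dataclasses import dataclass, field


@dataclass
class ExtractionRecord:
    """One row of the data extraction form; field names are illustrative."""
    title: str
    year: int
    doi: str = ""
    objectives: str = ""
    application_domain: str = ""
    data_types: list = field(default_factory=list)
    algorithms: list = field(default_factory=list)
    experimental_design: str = ""
    metrics: dict = field(default_factory=dict)
    key_findings: str = ""
    limitations: str = ""


def dedup_key(record):
    """Prefer the DOI; otherwise fall back to a normalized title plus year."""
    if record.doi:
        return ("doi", record.doi.strip().lower())
    title = re.sub(r"[^a-z0-9]+", " ", record.title.lower()).strip()
    return ("title", title, record.year)


def remove_duplicates(records):
    """Keep the first occurrence of each record, preserving input order."""
    seen, unique = set(), []
    for rec in records:
        key = dedup_key(rec)
        if key not in seen:
            seen.add(key)
            unique.append(rec)
    return unique


# Made-up records, as if exported from two different databases.
records = [
    ExtractionRecord("Tea Shoot Detection with YOLO", 2022, doi="10.0000/example.1"),
    ExtractionRecord("Tea shoot detection with YOLO.", 2022, doi="10.0000/EXAMPLE.1"),
    ExtractionRecord("Soil moisture prediction: an LSTM approach", 2021),
    ExtractionRecord("Soil Moisture Prediction - An LSTM Approach", 2021),
]
unique = remove_duplicates(records)
print(len(records), "->", len(unique))
```

A rule this simple will miss a duplicate that carries a DOI in one export and none in the other, which is one reason manual title and abstract screening still follows automated de-duplication.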
2.5. Quality Assessment Framework
The quality assessment framework evaluated multiple dimensions of methodological rigor relevant to machine learning research in agricultural contexts [
49]. Technical quality assessment focused on the appropriateness of algorithm selection for the specific agricultural problem, adequacy of the data preprocessing procedures, robustness of the experimental design, including proper train–test splits and cross-validation procedures, and the comprehensiveness of the performance evaluation, using appropriate metrics for the task type.
Data quality evaluation examined the representativeness of the datasets used, including sample sizes, the diversity of conditions represented, and the relevance to real-world agricultural scenarios. Reporting quality was assessed based on the completeness of methodological descriptions, reproducibility of reported procedures, clarity of result presentation, and acknowledgment of study limitations. Studies that provided sufficient detail for replication and clearly discussed the practical implications and limitations of their findings received higher quality scores. The integration of domain expertise was evaluated by examining whether studies appropriately incorporated agricultural knowledge into their machine learning approaches, consulted with agricultural experts in the study design or interpretation, and addressed practical considerations relevant to agricultural implementation. This criterion recognized that effective agricultural applications of machine learning require a deep understanding of both computational methods and agricultural systems.
2.6. Data Extraction and Analysis Framework
Data extraction employed a structured approach designed to capture both quantitative performance data and qualitative insights regarding the practical applicability of different machine learning approaches in agricultural contexts. Quantitative data included performance metrics such as accuracy, precision, recall, and F1-scores for classification tasks, as well as R-squared, RMSE, and MAE values for regression applications. When possible, computational requirements, including training time, inference speed, and hardware requirements, were also extracted to assess practical deployment feasibility. Methodological characteristics were systematically categorized, including the types of input data used, the preprocessing procedures employed, the algorithm architectures implemented, the hyperparameter optimization approaches, and the validation strategies utilized. This information enabled comparative analysis of methodological trends and identification of best practices across different agricultural application domains. Application domain classification involved categorizing studies based on their primary agricultural focus areas, including crop production, livestock management, food quality and safety, precision agriculture, environmental monitoring, and agricultural automation. Within each domain, specific applications were further categorized to identify areas of concentrated research activity and emerging application opportunities. The analysis framework incorporated both descriptive statistics to characterize the overall landscape of research activity and comparative analysis to identify trends in performance, methodological approaches, and application domains over time. Meta-analytical techniques were employed where sufficient homogeneity existed among studies to enable the quantitative synthesis of results.
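For reference, the metrics extracted from the reviewed studies follow their standard definitions, which can be written out in plain Python. The functions below are an illustrative sketch rather than the tooling used in this review, and the function names are hypothetical; the classification case assumes a binary task with one designated positive class.

```python
import math


def classification_metrics(y_true, y_pred, positive=1):
    """Accuracy, precision, recall, and F1 for a binary classification task."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {"accuracy": (tp + tn) / len(pairs), "precision": precision,
            "recall": recall, "f1": f1}


def regression_metrics(y_true, y_pred):
    """R-squared, RMSE, and MAE for a regression task."""
    n = len(y_true)
    residuals = [t - p for t, p in zip(y_true, y_pred)]
    mean_true = sum(y_true) / n
    ss_res = sum(r * r for r in residuals)
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    # R-squared is undefined when the targets have zero variance.
    r2 = 1 - ss_res / ss_tot if ss_tot else float("nan")
    return {"r2": r2, "rmse": math.sqrt(ss_res / n),
            "mae": sum(abs(r) for r in residuals) / n}
```

Because studies report different subsets of these metrics, recording all of them where available makes later cross-study comparison less dependent on any single measure.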
2.7. Data Availability and Reproducibility Assessment
To enhance transparency and support future research, we systematically evaluated the availability of datasets and code from the reviewed studies, revealing significant challenges in reproducibility practices across agricultural ML/DL research. Our analysis categorized studies into three groups: open-access (12% of studies), with complete datasets and code available through platforms like GitHub, Mendeley, or institutional repositories; upon-request (54% of studies), where data are available by contacting the authors, though response rates and data completeness vary significantly; and proprietary/restricted (34% of studies), with limited or no data availability due to commercial partnerships, privacy constraints, or a lack of sharing protocols. This distribution highlights critical gaps in research transparency that hinder scientific reproducibility and knowledge transfer. To address these limitations, we recommend establishing standardized data sharing protocols for agricultural ML/DL research, creating domain-specific repositories for agricultural datasets, and implementing FAIR (Findable, Accessible, Interoperable, Reusable) data principles to ensure that future research can build effectively upon existing work and facilitate a broader adoption of successful methodologies across diverse agricultural contexts.
2.8. Synthesis and Reporting Approach
The synthesis approach emphasized identifying patterns and trends across machine and deep learning applications in agricultural engineering while recognizing the heterogeneity of approaches, datasets, and evaluation methods employed across studies. Rather than attempting forced quantitative meta-analysis, the synthesis focused on identifying common challenges, successful approaches, and emerging trends that could inform future research directions. Thematic analysis was employed to identify recurring themes related to agricultural challenges amenable to machine learning solutions, characteristics of successful implementations, and barriers to practical deployment. This analysis informed the development of taxonomies for machine and deep learning approaches, agricultural application domains, and evaluation methodologies presented in
Section 3.5. Technical details of implementations were accurately represented while maintaining accessibility for readers with diverse backgrounds in both agricultural engineering and computational methods. Limitations were carefully documented, including the potential publication bias toward positive results, rapid technological development rendering older studies less relevant, and challenges comparing results across different datasets and evaluation criteria. These limitations were considered when interpreting findings and formulating research recommendations. The synthesis process culminated in developing comprehensive taxonomies of machine and deep learning techniques, agricultural applications, and evaluation methodologies, along with identifying research gaps and future directions. This approach ensured the review provides both a comprehensive overview of current research and actionable insights for advancing machine learning applications in agricultural engineering.
3. Taxonomy of Employed ML and DL Techniques in Agricultural Engineering
In this section, we address RQ1 by summarizing the machine learning and deep learning models employed in agricultural engineering. The surveyed literature reveals a diverse ecosystem of ML and DL models deployed to address core challenges in this domain [
50,
51,
52]. These models are typically selected based on data type, problem complexity, and computational constraints.
3.1. Temporal Evolution and Trends (2015–2024)
Analysis of the surveyed literature reveals clear temporal trends in ML/DL adoption patterns. Early studies (2015–2017) predominantly employed traditional ML approaches, with SVMs and ANNs accounting for 65% of applications. The period 2018–2020 marked a transition phase, with increasing CNN adoption (40% of studies), while recent years (2021–2024) show the dominance of deep learning architectures, particularly transfer learning approaches (55% of studies) and hybrid CNN-LSTM models (25%). This evolution reflects both improved computational accessibility and the recognition that agricultural data’s complexity requires more sophisticated feature extraction capabilities.
3.2. Hierarchical Classification of ML and DL Techniques
3.2.1. Traditional Machine Learning Approaches
Traditional ML algorithms remain widely utilized, particularly for structured/tabular data, spectral analysis, and scenarios with limited datasets.
Beyond core ML/DL architectures, the survey identified several specialized algorithms tailored to specific agricultural engineering tasks. Gradient Boosting frameworks (XGBoost, LightGBM, CatBoost) were leveraged for complex predictions, like determining tea origin and pungency intensity [
75], classifying oolong tea varieties, and forecasting time-series data such as tomato transpiration rates [
57] and solar radiation, frequently within hybrid model configurations. For simpler classification tasks, such as verifying tea authenticity [
64] or identifying maize seed harvest years [
17], the k-Nearest Neighbor (KNN) algorithm was commonly applied. Various regression techniques, including Linear Regression (LR), Multiple Linear Regression (MLR), and Additive Regression Tree (ART), proved effective for forecasting environmental variables like Vapor Pressure Deficit (VPD [
71]) and predicting moisture content in products like coated pineapple cubes [
79]. Domain-specific challenges were addressed by specialized models: the Grey Model handled seed distribution and time series forecasting [
63], Response Surface Methodology (RSM) optimized grain cleaning systems, and Fuzzy Models predicted losses like sieve efficiency in harvesters. Finally, dimensionality reduction, primarily through Principal Component Analysis (PCA) for spectral/imaging data preprocessing (e.g., watermelon seed analysis [
77]) and Linear Discriminant Analysis (LDA) for classification [
72], was a ubiquitous step in streamlining complex datasets before model application.
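To make the simplest of these classifiers concrete, the following is a minimal k-Nearest Neighbor sketch in plain Python. It assumes each sample has already been reduced to a short feature vector (for example, two principal-component scores from a spectrum); the data values and class labels are invented for illustration and do not come from any of the cited studies.

```python
import math
from collections import Counter


def knn_predict(train_X, train_y, query, k=3):
    """Classify `query` by majority vote among its k nearest training samples."""
    neighbors = sorted(
        zip(train_X, train_y),
        key=lambda pair: math.dist(pair[0], query),  # Euclidean distance
    )
    votes = Counter(label for _, label in neighbors[:k])
    return votes.most_common(1)[0][0]


# Invented two-feature samples, e.g. the first two PCA scores of a spectrum.
train_X = [(0.10, 0.20), (0.20, 0.10), (0.15, 0.25),
           (0.90, 0.80), (0.80, 0.90), (0.85, 0.75)]
train_y = ["authentic"] * 3 + ["adulterated"] * 3

print(knn_predict(train_X, train_y, (0.20, 0.20)))
print(knn_predict(train_X, train_y, (0.70, 0.80)))
```

Because KNN relies directly on distances in feature space, it tends to benefit from the dimensionality reduction step described above: distances become less informative as the number of raw spectral bands grows.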
As summarized in
Table 1, the dominant traditional machine learning models demonstrate clear specialization patterns across different agricultural engineering applications.
3.2.2. Deep Learning Architectures
DL models, particularly CNNs, dominate applications involving image, spectral, and point cloud data, offering superior feature extraction capabilities.
The success of CNNs in agricultural applications can be attributed to their hierarchical feature learning capabilities and translation invariance properties. Recent advances in automatic data augmentation frameworks have further enhanced CNN performance in industrial and agricultural defect detection tasks, demonstrating the importance of data preprocessing strategies for improving model robustness [
39]. These developments reflect broader machine learning-facilitated business intelligence approaches that optimize complex decision-making processes [
38].
Recurrent Neural Networks (RNNs)
- Long Short-Term Memory (LSTM) Networks: These provide primary solutions for modeling sequential and time-series agricultural data, including physicochemical property modeling [
53], soil moisture prediction, yield approximation, and irrigation optimization [
86].
- Hybrid RNN Architectures: CNN-LSTM combinations enhance performance in tasks combining spatial and temporal data, as exemplified in pesticide residue detection using spectroscopy [
87].
Generative and Unsupervised Models
- Autoencoder Variants: Stacked Autoencoders (SAE), Stacked Denoising Autoencoders (SDAE), and Stacked Weighted Autoencoders (SWAE) are leveraged for feature learning from complex agricultural data, including heavy metal detection [
26,
74], fruit quality prediction [
68], and hyperspectral dataset processing [
78].
- Generative Adversarial Networks (GANs): These are primarily applied in data augmentation and image enhancement, including Wasserstein GANs (WGANs) for livestock recognition robustness [
11] and EnlightenGAN for low-light image enhancement [
85].
Transfer Learning and Specialized Architectures
- Transfer Learning: This represents a ubiquitous strategy across agricultural DL applications, utilizing pre-trained models (typically ImageNet-based) fine-tuned for specific agricultural tasks, significantly improving performance when labeled data are scarce [
11,
87].
- Other Deep Learning Models: These include Deep Belief Networks (DBNs) for specialized applications like rice seed germination detection using fluorescence spectra.
The diversity of deep learning architectures and their specialized applications in agricultural contexts is comprehensively illustrated in
Table 2. This hierarchical classification reveals that traditional ML methods provide interpretable solutions for structured data, while deep learning excels in complex feature extraction from high-dimensional datasets. The complementary nature of these approaches, combined with increasing hybrid strategies, demonstrates agricultural computing’s evolution toward integrated multi-modal frameworks [
88].
3.3. Synthesis of Model Utilization Trends
Ensemble and Hybrid Approaches: Combining models (Ensemble ML [
74,
80], CNN-LSTM hybrids [
86]) is a key strategy to enhance accuracy and robustness and handle multi-modal data.
Transfer Learning Imperative: The reliance on transfer learning to overcome data scarcity for DL models is near-universal in agricultural applications [
11,
69,
82].
Performance Focus: Studies consistently compare model performance (accuracy, precision, RMSE, R²), with complex DL and ensemble methods frequently achieving state-of-the-art results [
65,
68,
70,
87], albeit sometimes at higher computational cost. PLS/PLS-ML often serves as a key benchmark in spectral analysis [
70,
76,
77,
78].
Computational Pragmatism: Lightweight models (e.g., improved YOLOv5s [
76], Zero-DCE [
85], 1D-CNNs [
78]) and efficient architectures are prioritized for potential field deployment, especially for real-time tasks like detection and monitoring.
This taxonomy provides a comprehensive overview of the ML and DL model landscape actively employed in addressing contemporary agricultural engineering challenges, highlighting the synergy between model capabilities and specific application requirements.
3.4. Contributions of ML and DL Techniques to Agricultural Engineering Challenges
In this section, we address RQ2 by examining the agricultural engineering challenges most frequently tackled using machine learning and deep learning approaches. The reviewed studies reveal that these computational techniques significantly enhance efficiency, quality, and sustainability. Broadly, the challenges addressed fall into three major categories: (i) agricultural product quality assessment, classification, and identification; (ii) crop and field management; and (iii) agricultural automation and robotics.
3.4.1. Agricultural Product Quality Assessment, Classification, and Identification
Machine learning and deep learning techniques enable rapid, non-destructive evaluation of agricultural products by integrating advanced sensing technologies like hyperspectral imaging [
60,
68,
87,
90], NIR spectroscopy [
54,
59,
64,
79], and Raman spectroscopy [
73,
86]. These methods overcome traditional limitations by quantifying chemical constituents (amino acids [
54], toxins [
73], pesticide residues [
70,
86]), monitoring production processes (fermentation [
58,
76], drying [
78]), predicting internal attributes (soluble solids [
68,
87], firmness), authenticating geographical origins [
59,
75], and assessing product freshness/viability [
67,
77]. To handle high-dimensional spectral data complexity, deep learning models such as CNNs [
73,
78,
87] and autoencoders [
68,
87] excel at feature extraction and modeling non-linear relationships. Furthermore, these techniques provide robust solutions for classifying crop varieties [
1,
69,
81] and grading quality based on visual or spectral characteristics [
64,
82].
3.4.2. Crop and Field Management
ML/DL optimizes critical field operations through precision resource management and monitoring. Key applications include water resource optimization using evapotranspiration forecasting models (e.g., Random Forest for Vapor Pressure Deficit prediction [
63]) and intelligent irrigation scheduling via reinforcement learning [
5]. Plant health management is enhanced through early disease/pest detection from leaf images and heavy metal contamination monitoring using hyperspectral imaging with deep learning [
26,
61,
74]. Yield prediction and growth monitoring leverage ensemble learning with UAV remote sensing data, while carbon cycle analysis employs ML ensemble models to quantify net ecosystem exchange in tea plantations [
80], overcoming the limitations of traditional measurement systems.
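Since Vapor Pressure Deficit is the prediction target in the irrigation example above, a short sketch of how VPD is commonly derived from air temperature and relative humidity may help readers unfamiliar with the quantity. The code uses the widely used Tetens-type saturation vapor pressure approximation with the coefficients given in FAO-56; whether the cited studies computed VPD this way is an assumption, and the function names are hypothetical.

```python
import math


def saturation_vapor_pressure_kpa(temp_c):
    """Saturation vapor pressure (kPa) from air temperature (deg C), Tetens form."""
    return 0.6108 * math.exp(17.27 * temp_c / (temp_c + 237.3))


def vapor_pressure_deficit_kpa(temp_c, rel_humidity_pct):
    """VPD (kPa): saturation vapor pressure minus actual vapor pressure."""
    es = saturation_vapor_pressure_kpa(temp_c)
    ea = es * rel_humidity_pct / 100.0  # actual vapor pressure
    return es - ea


# Example: warm air at 25 deg C and 50% relative humidity.
print(round(vapor_pressure_deficit_kpa(25.0, 50.0), 3))
```

A forecasting model such as a Random Forest would then be trained to predict this value ahead of time from lagged weather or greenhouse sensor readings.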
3.4.3. Agricultural Automation and Robotics
Automation challenges are addressed through ML/DL-powered machine vision systems. Automated harvesting solutions enable the accurate, real-time detection of crops like tea shoots in complex field environments using lightweight models deployable on mobile platforms [
57,
58]. Intelligent sorting systems replace manual grading through livestock recognition networks (e.g., ResNet with GAN-enhanced imaging [
11]) and the automated quality assessment of produce like apples [
82]. Precision farming equipment benefits from ML-driven seed distribution prediction [
63] and adaptive controllers that optimize parameters for harvesters and seeders [
91]. Continuous 24/7 monitoring capabilities are achieved through specialized deep learning networks that enhance low-light imaging for nighttime operations like dairy cow detection [
85].
3.5. Experimental Evaluations of ML and DL Techniques in Agricultural Engineering
In this section, we address RQ3 by examining how experimental evaluations are performed to validate machine learning and deep learning approaches in agricultural engineering. The reviewed studies reveal a heterogeneous landscape of methodologies, encompassing data preparation strategies, performance metrics, statistical and optimization techniques, experimental procedures, software tools, and reproducibility practices. Together, these elements form the methodological backbone for assessing the accuracy, robustness, and effectiveness of computational solutions in agricultural contexts.
3.5.1. Data Splitting and Preparation
A foundational aspect of evaluation involves partitioning datasets to assess model generalization on unseen data. Typical strategies include two-way splits (training/testing) and three-way splits (training/validation/testing), with proportions varying by study [
84]. For example, tea constituent quantification used a 66.7–33.3 split [
54], while heavy metal detection applied a 75–25 split [
26]. Orange freshness detection followed a 70–30 division, whereas cotton irrigation relied on historical calibration data and recent validation sets. Cross-validation enhances robustness, with k-fold approaches (5- or 10-fold), Monte Carlo cross-validation (MCCV), and leave-one-out validation (LOO) frequently applied [
66,
69,
78]. Specialized designs such as 5×2 cross-validation (5x2CV) paired t-tests have also been used for pairwise model comparisons [
17,
62,
65,
72]. These practices are systematically compared in
Table 3, which highlights inconsistencies across agricultural applications.
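The splitting strategies described above can be sketched with scikit-learn, which the surveyed studies name as a common platform. This is a minimal illustration rather than the procedure of any reviewed study: the data are synthetic, and the sample count, fold counts, and random seeds are assumptions chosen for the example.

```python
import numpy as np
from sklearn.model_selection import KFold, LeaveOneOut, ShuffleSplit, train_test_split

# Hypothetical data: 120 samples with 10 features each (e.g., spectral bands).
rng = np.random.default_rng(0)
X = rng.normal(size=(120, 10))
y = rng.normal(size=120)

# Two-way hold-out split (75/25, the proportion cited for heavy metal detection).
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# Three-way split: carve a validation set out of the training portion.
X_fit, X_val, y_fit, y_val = train_test_split(
    X_train, y_train, test_size=0.2, random_state=0
)

# k-fold cross-validation: every sample is used for testing exactly once.
kfold = KFold(n_splits=5, shuffle=True, random_state=0)

# Monte Carlo cross-validation: repeated random splits; test sets may overlap.
mccv = ShuffleSplit(n_splits=50, test_size=0.25, random_state=0)

# Leave-one-out: a single test sample per iteration, suited to small datasets.
loo = LeaveOneOut()

split_counts = {
    "kfold": kfold.get_n_splits(X),
    "mccv": mccv.get_n_splits(X),
    "loo": loo.get_n_splits(X),
}
print(len(X_train), len(X_test), len(X_fit), len(X_val), split_counts)
```

Leave-one-out needs as many model fits as there are samples, which is one reason it tends to be reserved for small datasets.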
3.5.2. Experimental Design Criteria and Decision Frameworks
Analysis of the surveyed literature reveals that researchers employ diverse criteria for experimental design decisions, though these are not always explicitly stated. Key decision frameworks identified include the following:
Dataset Size-Dependent Criteria: Studies with datasets > 1000 samples typically employed 70/30 or 80/20 splits with k-fold cross-validation, while smaller datasets (<200 samples) favored leave-one-out validation or bootstrap methods to maximize training data utilization [
17,
54]. Researchers justified these choices based on statistical power requirements and generalization assessment needs.
Application-Specific Validation Strategies: Time-series agricultural applications (irrigation, yield prediction) predominantly used temporal splits respecting chronological order [
88], while spatial applications employed geographical cross-validation to assess model transferability across different farms or regions. Spectroscopic studies consistently applied spectral preprocessing criteria, including noise reduction (Savitzky–Golay filtering), scatter correction (SNV/MSC), and variable selection (CARS/SPA) based on signal-to-noise ratios and collinearity assessments.
Performance Metric Selection Rationale: Researchers selected metrics based on application criticality and cost considerations. Food safety applications emphasized precision/recall balance (F1-scores) to minimize false negatives [
90], while yield prediction studies prioritized RMSE and R² for quantitative accuracy assessment [
71]. Spectroscopic applications consistently reported RPD values > 2.0 as acceptance criteria for model reliability.
Computational Resource Constraints: Studies involving real-time applications (robotic harvesting, field monitoring) explicitly considered computational efficiency, leading to lightweight model selection (MobileNet variants, pruned networks) and edge computing deployment criteria [
66,
85]. Laboratory-based studies with unlimited computational resources explored more complex architectures without such constraints.
Domain Expert Integration: Agricultural domain expertise influenced experimental design through feature engineering guidance, relevant variable selection, and the interpretation of results within biological/physical contexts. Studies involving agronomists or food scientists in the experimental design showed more robust validation protocols and practical applicability assessments.
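The temporal and geographical validation strategies listed above can be illustrated with scikit-learn's `TimeSeriesSplit` and `GroupKFold`. The sketch below is an assumption-laden example, not a reconstruction of any surveyed protocol: the 60 chronologically ordered records and the six farm labels in `farm_ids` are hypothetical.

```python
import numpy as np
from sklearn.model_selection import GroupKFold, TimeSeriesSplit

# Hypothetical daily records, already sorted chronologically.
n_samples = 60
X = np.arange(n_samples).reshape(-1, 1)

# Temporal split: each test fold lies strictly after its training data.
tscv = TimeSeriesSplit(n_splits=4)
temporal_ok = all(train.max() < test.min() for train, test in tscv.split(X))

# Spatial split: hypothetical farm labels, 10 records per farm.
# A farm never contributes samples to both the training and the test side.
farm_ids = np.repeat(np.arange(6), 10)
gkf = GroupKFold(n_splits=3)
spatial_ok = all(
    set(farm_ids[train]).isdisjoint(farm_ids[test])
    for train, test in gkf.split(X, groups=farm_ids)
)
print(temporal_ok, spatial_ok)
```

Keeping whole farms on one side of the split is what lets the test score speak to transferability across sites rather than to interpolation within a site.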
3.5.3. Performance Metrics
Evaluation relies heavily on quantitative performance indicators, selected according to prediction type [
46,
60,
69]. For regression tasks, metrics such as the coefficient of determination (R²), root mean square error (RMSE) [
45,
48,
49], mean absolute error (MAE) [
45,
48,
49], residual predictive deviation (RPD) [
65,
66,
67], and Akaike information criterion (AIC) are commonly reported [
54,
68,
71,
76]. In classification tasks, metrics such as accuracy, precision, recall, F1-score, area under the curve (AUC), and mean average precision (mAP) are employed [
11,
82,
91,
92]. Confusion matrices often complement these metrics, providing detailed error distribution [
93,
94,
95]. These measures are classified and contextualized in
Table 4, which demonstrates task-dependent selection but also significant inconsistency across studies.
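For reference, the most frequently reported metrics can be computed as in the following sketch. It assumes one common convention for RPD, namely the sample standard deviation of the reference values divided by RMSE; individual studies may define it slightly differently. The function names and the example values are illustrative only.

```python
import numpy as np


def regression_metrics(y_true, y_pred):
    """Return R^2, RMSE, MAE, and RPD for one set of predictions."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    residuals = y_true - y_pred
    rmse = float(np.sqrt(np.mean(residuals ** 2)))
    mae = float(np.mean(np.abs(residuals)))
    ss_res = float(np.sum(residuals ** 2))
    ss_tot = float(np.sum((y_true - y_true.mean()) ** 2))
    r2 = 1.0 - ss_res / ss_tot
    # Assumed convention: RPD = SD of reference values / RMSE.
    rpd = float(np.std(y_true, ddof=1)) / rmse
    return {"r2": r2, "rmse": rmse, "mae": mae, "rpd": rpd}


def classification_metrics(y_true, y_pred, positive=1):
    """Return accuracy, precision, recall, and F1 for a binary task."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    tn = len(pairs) - tp - fp - fn
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = 2 * precision * recall / (precision + recall) if (precision + recall) else 0.0
    accuracy = (tp + tn) / len(pairs)
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


reg = regression_metrics([1, 2, 3, 4, 5], [1.1, 1.9, 3.2, 3.8, 5.0])
clf = classification_metrics([1, 1, 1, 0, 0, 0, 1, 0], [1, 0, 1, 0, 1, 0, 1, 0])
print(reg, clf)
```

Because RPD divides by RMSE, it rewards models whose error is small relative to the natural spread of the reference measurements, which makes it easier to compare across constituents with different units.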
3.5.4. Statistical Analysis and Optimization
Statistical tools and optimization techniques are integral to validating agricultural ML/DL models [
88]. Analysis of Variance (ANOVA) and Duncan’s Multiple Range Test (DMRT) assess factor significance and mean differences in agricultural trials [
10,
48]. Dimensionality reduction methods such as principal component analysis (PCA) and linear discriminant analysis (LDA) aid in feature extraction and group separation [
72,
77,
96,
97]. Variable selection methods, including Competitive Adaptive Reweighted Sampling (CARS) and Iteratively Retains Informative Variables (IRIV), are widely used for high-dimensional spectral data [
98,
99]. Optimization techniques such as Particle Swarm Optimization (PSO), the Chameleon Swarm Algorithm (CSA), and Differential Evolution (DE) are employed to tune model parameters and improve predictive accuracy [
100,
101]. A comparative mapping of these methods is provided in
Table 5, while
Table 6 illustrates common hyperspectral preprocessing pipelines.
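Two scatter-correction steps that recur in such spectral preprocessing pipelines, SNV and MSC, can be sketched in NumPy as follows. This is a minimal illustration under stated assumptions: spectra are stored one per row, MSC uses the mean spectrum as its default reference, and the synthetic spectra at the end are invented purely to exercise the functions.

```python
import numpy as np


def snv(spectra):
    """Standard normal variate: center and scale each spectrum (row) on its own."""
    spectra = np.asarray(spectra, dtype=float)
    mean = spectra.mean(axis=1, keepdims=True)
    std = spectra.std(axis=1, ddof=1, keepdims=True)
    return (spectra - mean) / std


def msc(spectra, reference=None):
    """Multiplicative scatter correction against a reference spectrum.

    The reference defaults to the mean spectrum of the input set.
    """
    spectra = np.asarray(spectra, dtype=float)
    ref = spectra.mean(axis=0) if reference is None else np.asarray(reference, dtype=float)
    corrected = np.empty_like(spectra)
    for i, row in enumerate(spectra):
        # Fit row ~ intercept + slope * ref, then invert the fitted relation.
        slope, intercept = np.polyfit(ref, row, deg=1)
        corrected[i] = (row - intercept) / slope
    return corrected


# Synthetic check: one base spectrum distorted by per-sample gain and offset.
base = np.linspace(1.0, 2.0, 50) ** 2
gains = np.array([[0.8], [1.0], [1.3]])
offsets = np.array([[0.05], [0.00], [-0.10]])
spectra = gains * base + offsets

snv_out = snv(spectra)
msc_out = msc(spectra)
print(snv_out.shape, msc_out.shape)
```

In this synthetic case both corrections collapse the three distorted spectra onto a common shape, which is the behavior these steps are intended to provide before variable selection or modeling.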
3.5.5. Experimental Procedures and Visualizations
Beyond numerical evaluation, many studies incorporate domain-specific procedures to validate ML/DL models. Gold-standard reference measurements serve as benchmarks, including TVB-N for chicken freshness [
98], fatty acid analysis for flour storage [
102], atomic absorption spectrometry for heavy metals [
26], HPLC for pesticides, and sensory analysis for pungency [
75]. Advanced sensing applications include rapid bacterial species identification through integrated colorimetric sensors with near-infrared spectroscopy [
98] and comprehensive food odor visualization using chemo-responsive dyes [
103]. Imaging applications commonly involve ROI extraction, spectral preprocessing (Savitzky–Golay smoothing, SNV, MSC, derivatives), and feature selection algorithms (CARS, SPA, BOSS) [
29,
103,
104]. Visualization techniques such as color-coded maps, Taylor diagrams, and heatmaps enhance interpretability, while machine vision studies utilize controlled acquisition systems and data augmentation (rotation, mirroring, noise) [
23,
105,
106]. Food processing applications incorporate comprehensive experimental procedures, including microwave infrared cooperative drying with detailed monitoring of moisture evolution, structure changes, and physicochemical properties [
107]. Robotics-focused evaluations often combine simulation (e.g., MATLAB/Simulink, R2025a or earlier) with real-world testing, ensuring both reproducibility and practicality [
20,
108,
109]. Cross-validation methodology selection represents a critical decision in experimental design, with different approaches showing distinct advantages and limitations as evaluated in
Table 7. The analysis reveals that k-fold cross-validation is preferred for medium to large datasets, while specialized approaches like Monte Carlo and time-series splits are used for specific scenarios. The computational cost-benefit trade-off influences method selection, with simpler approaches often chosen despite potentially lower reliability.
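The augmentation operations mentioned above for machine vision studies (rotation, mirroring, noise) can be sketched with NumPy alone. The function name `augment`, the 90-degree rotation angle, and the noise level are assumptions made for illustration; the surveyed studies do not prescribe particular values.

```python
import numpy as np


def augment(image, rng, noise_std=0.02):
    """Return rotated, mirrored, and noise-perturbed copies of one image.

    `image` is an H x W x C float array scaled to [0, 1].
    """
    rotated = np.rot90(image, k=1, axes=(0, 1))  # 90-degree rotation
    mirrored = np.flip(image, axis=1)            # horizontal mirroring
    noise = rng.normal(0.0, noise_std, size=image.shape)
    noisy = np.clip(image + noise, 0.0, 1.0)     # additive Gaussian noise
    return [rotated, mirrored, noisy]


rng = np.random.default_rng(0)
image = rng.random((4, 6, 3))  # hypothetical 4 x 6 RGB image
rotated, mirrored, noisy = augment(image, rng)
print(rotated.shape, mirrored.shape, noisy.shape)
```

Augmented copies should be generated only from training images, after the data split, so that transformed versions of a test image never leak into training.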
Performance benchmarking across different agricultural applications reveals significant variation in expected outcomes and success factors, as synthesized in
Table 8. The benchmark matrix establishes that performance expectations vary significantly across applications, with R² values ranging from 0.70 to 0.95 depending on the complexity of the agricultural task. Success factors consistently include proper preprocessing and feature selection, while common challenges relate to environmental variability and data quality issues across different agricultural contexts. The overall quality of experimental designs across the surveyed literature can be systematically evaluated using the framework presented in
Table 9. This quality assessment framework provides a systematic approach to evaluate study rigor, revealing that many agricultural engineering studies fall short of excellent standards. The weighted scoring system highlights critical gaps in sample sizes, validation strategies, and reproducibility practices. Most studies achieve “good” to “fair” ratings, with particular weaknesses in statistical testing comprehensiveness and code/data availability, suggesting significant room for improvement in research quality and transparency across the field. Overall, agricultural engineering experiments integrate rigorous preprocessing methodologies, tailored performance metrics, advanced statistical and optimization techniques, and comprehensive real-world validations to ensure effectiveness and reliability of machine and deep learning solutions across diverse agricultural applications.
3.5.6. Dataset Taxonomy and Availability
Datasets are central to experimental evaluations, providing the foundation for training and validating ML/DL models. The surveyed studies reveal a taxonomy of commonly used datasets across agricultural engineering:
Spectral datasets (NIR, hyperspectral, Raman, fluorescence HSI) dominate, particularly for food quality and chemical constituent analysis.
Image datasets (RGB images, annotated plant/livestock photos, point cloud data) are widely used for classification, detection, and robotic automation tasks.
Environmental/Meteorological datasets (climate, soil, IoT-based sensing) enable prediction of evapotranspiration, irrigation needs, and carbon flux.
Physicochemical/Reference datasets (TVB-N, HPLC, sensory panels) serve as gold standards for benchmarking model predictions.
Olfactory and sensor datasets support quality and freshness monitoring via electronic noses and volatile compound analysis.
As illustrated in
Figure 1, spectral data methods dominate current research, representing 42.2% of all dataset types, followed by image data (26.2%) and smaller proportions of environmental, reference, and sensory/olfactory datasets. This distribution reflects strong emphasis on non-destructive sensing modalities, with growing integration of environmental and IoT data sources. The implications of dataset accessibility and reproducibility practices for these diverse data types are discussed in
Section 2.7.
3.5.7. Software Platforms and Reproducibility Practices
Evaluation frameworks in agricultural engineering are deeply tied to the software platforms that support ML and DL analyses. While traditional environments such as MATLAB remain popular for spectral preprocessing and numerical modeling, they are often integrated with ML pipelines by exporting features into learning algorithms or embedding neural network toolboxes for predictive tasks. Similarly, SPSS (version 31 or earlier) and ENVI (version 5.7.2 or earlier), though conventionally associated with statistical analysis and hyperspectral image processing, are widely employed in conjunction with ML/DL frameworks, serving as preprocessing stages before models such as CNNs, SVMs, or ensemble learners are trained. Specialized commercial tools like The Unscrambler X and TQ Analyst facilitate variable selection and multivariate calibration, which are subsequently coupled with regression or classification algorithms. CloudCompare, commonly applied for point cloud data, has been extended through Python scripting to feed ML classifiers for tree segmentation and structural analysis.
Despite the role of these legacy platforms, the field is witnessing a decisive shift toward open-source ML/DL ecosystems, with Python 3 emerging as the dominant environment. Python’s extensive libraries, such as scikit-learn for classical ML, PyTorch (version 2.8 or earlier) and TensorFlow for deep learning, NumPy for numerical computation, and Matplotlib for visualization, provide comprehensive support for end-to-end workflows, from data preprocessing to model training and deployment. Studies increasingly report using Jupyter Notebooks for transparent documentation of experiments, facilitating both collaboration and reproducibility. WEKA has also gained traction as a lightweight, accessible platform for ML applications in agricultural datasets, particularly when introducing ensemble learners, feature selection, and comparative benchmarking across algorithms.
Reproducibility practices remain uneven. Many studies still limit access to datasets with “upon request” policies, which constrains independent validation. By contrast, exemplary works provide open repositories on GitHub containing both code and ML/DL model architectures, alongside Mendeley-hosted datasets for direct reuse. Such practices represent the gold standard, enabling full replication of ML pipelines. Nonetheless, reliance on commercial platforms continues to create barriers, underscoring the growing importance of community-driven, open-source ecosystems for scaling ML/DL solutions across agricultural engineering.
3.6. Application Domains of ML and DL in Agricultural Engineering
In this section, we address RQ4 by exploring the application domains where machine learning (ML) and deep learning (DL) techniques have demonstrated success in agricultural engineering and food science. The survey reveals widespread adoption of ML/DL due to their strengths in feature extraction, pattern recognition, prediction, and optimization. These methodologies consistently outperform traditional approaches in capturing complex data relationships, handling high-dimensional feature spaces, and delivering enhanced accuracy and efficiency across diverse agricultural applications.
3.6.1. Performance of Traditional ML Algorithms
Traditional ML algorithms remain widely applied due to their interpretability, relatively modest computational requirements, and strong performance in structured data applications. Representative performance results are summarized in
Table 10,
Figure 2 and
Figure 3.
As illustrated in
Figure 4, ML models generally achieve accuracies in the 83–96% range. Their R² values typically lie between 0.72 and 0.95, while RMSE values vary widely depending on the task (5.2–17.5). Ensemble and boosting methods often approach DL-level performance but with lower computational cost, making them attractive for resource-constrained agricultural deployments.
3.6.2. Performance of Deep Learning Algorithms
Deep learning models consistently outperform traditional ML approaches in handling unstructured and high-dimensional data (e.g., spectral and image data). Their strength lies in automated feature extraction and modeling of complex nonlinear relationships.
Table 11 presents performance metrics across common DL architectures.
As shown in
Figure 2,
Figure 3 and
Figure 4, DL algorithms improve accuracy by 5–10% over traditional ML, with R² gains of 0.05–0.15 and consistently lower RMSE. Transfer learning and CNN-LSTM hybrids are particularly effective, achieving accuracies of 93–99% and RMSE values as low as 3.2–7.8. These improvements translate into fewer misclassification errors in critical agricultural tasks such as disease detection, food safety, and yield forecasting.
3.6.3. Performance Implications for Agricultural Applications
The comparative performance of machine learning (ML) and deep learning (DL) algorithms carries significant implications for both research design and real-world agricultural practice. Across the reviewed studies, DL approaches consistently outperformed traditional ML models in terms of accuracy, explanatory power (R²), and predictive reliability (RMSE). On average, DL methods demonstrated accuracy improvements of 5–10% and R² gains of 0.05–0.15 compared to their ML counterparts. This margin, though modest in percentage terms, becomes critical in agricultural contexts, where small improvements in classification or prediction accuracy can translate into substantial resource savings, improved crop yields, or enhanced food safety.
For classification-oriented tasks such as crop variety identification, disease detection, and quality grading, DL architectures, particularly convolutional neural networks (CNNs) and transfer learning methods, achieved superior accuracy levels (93–99%) compared to SVMs and ANNs (83–94%). The practical implication is clear: DL can reduce misclassification rates in high-stakes agricultural decision making, minimizing wasted inputs and mitigating the risk of undetected crop diseases. Moreover, the consistently higher F1 scores (0.92–0.99) achieved by DL approaches reflect a better balance between precision and recall, a crucial advantage in scenarios such as food safety assessment, where both false positives and false negatives have costly consequences.
In regression-based applications such as yield estimation, evapotranspiration forecasting, and quality parameter prediction, hybrid DL models such as CNN-LSTM combinations further demonstrate strong predictive performance. Their lower RMSE values (3.5–7.8) compared to ML methods (5.2–16.8) underscore their reliability for supporting precision agriculture decisions such as irrigation scheduling and harvest timing. Likewise, their higher R² values (0.90–0.97) indicate greater explanatory power for capturing complex temporal–spatial dependencies in agricultural data, making them more robust for long-term deployment in variable field environments. Nevertheless, the advantages of DL are tempered by practical constraints. DL models demand larger datasets, higher computational resources, and more extensive training compared to traditional ML approaches. This poses challenges for smallholder farmers or developing regions where computational infrastructure may be limited. In such contexts, resource-efficient ML methods such as gradient boosting or Random Forests remain highly valuable, often approaching DL-level accuracy (e.g., XGBoost with 89–96% accuracy, R² of 0.85–0.95) but at a fraction of the computational cost. Interpretability also emerges as a decisive factor: stakeholders often require transparent decision-making frameworks for adoption and regulatory compliance, an area where tree-based ML models retain a clear edge over the “black-box” nature of DL systems.
The broader implication is that neither ML nor DL provides a universally optimal solution. Instead, the results point to a hybrid analytical landscape where DL methods are most appropriate for unstructured, high-dimensional data sources (e.g., images, hyperspectral data, multimodal IoT streams), while ML models remain better suited to structured data scenarios requiring efficiency and interpretability. This complementarity suggests that future agricultural decision-support systems will increasingly integrate both paradigms, leveraging DL for pattern discovery and ML for transparent, resource-efficient inference.
3.6.4. Application Domains
The survey highlights that machine learning (ML) and deep learning (DL) techniques have become widely adopted across agricultural and food science, owing to their strong capabilities in feature extraction, pattern recognition, prediction, and optimization. Their application spans diverse domains, which can be broadly grouped into three major categories: food quality and safety, agricultural management and environmental monitoring, and broader methodological and industrial advancements, as shown in
Figure 5.
3.6.5. Food Quality and Safety
In food quality and safety, ML and DL have been extensively applied to monitor, assess, and predict diverse quality parameters. Tomato quality studies have employed support vector regression (SVR) models to capture the influence of storage temperature and packaging on color changes, enzymatic activity, and antioxidant properties. Garlic drying processes have relied on artificial neural networks (ANNs) to optimize infrared-convective drying by adjusting airflow, temperature, and radiation intensity, balancing processing time with nutrient preservation. Poultry safety assessment has used hyperspectral imaging with optimization algorithms and ANN modeling to detect microbial contamination such as Pseudomonas and Enterobacteriaceae, alongside physical contamination like tumors. Tea quality evaluation has advanced through CNN-based systems, which outperform regression-based approaches for aroma analysis and pesticide residue detection using Raman spectroscopy. Similarly, CNN and hybrid LSTM–CNN models have been applied in corn oil and Sichuan pepper quality authentication, offering robust performance in detecting toxins, residues, and geographic authenticity. Apple quality has also received significant attention, with stacked autoencoders and CNN-based classifiers such as VGG16 achieving high performance in soluble solid detection, color assessment, and deformity identification. Beyond these, applications extend to real-time rice grain breakage sensing, grain freshness monitoring with colorimetric arrays, and moisture prediction in Tencha drying using deep learning and computer vision.
3.6.6. Agricultural Management and Environmental Monitoring
Agricultural management and environmental monitoring have similarly benefited from the versatility of ML and DL approaches. Tree classification and structural segmentation have leveraged D-PointNet++ for the precise mapping of crowns, trunks, and supporting structures in large-scale nursery systems. Reinforcement learning frameworks have been developed for irrigation optimization, integrating soil data, crop status indicators, and weather conditions to improve cotton yield and water-use efficiency. Crop evapotranspiration and stress indices have been modeled with AI-enhanced sensor data assimilation, while nitrogen optimization strategies have incorporated deep reinforcement learning with crop simulators to improve input efficiency [
110,
111]. Fluorescence hyperspectral imaging [
112] coupled with deep autoencoder architectures has enabled lead contamination prediction in oilseed rape, while ensemble ML methods have supported carbon flux prediction in tea plantation ecosystems. Livestock management has also adopted DL for automated animal identification, disease monitoring, and predictive analytics using weight and imaging inputs. Similarly, CatBoost algorithms have been applied for crop transpiration prediction, while DL-based robotic vision systems have shown significant promise in unstructured farm environments, enhancing detection and operational efficiency.
3.6.7. Broader Applications and Methodological Advancements
Beyond these specific domains, broader applications highlight how ML and DL methodologies are driving methodological innovation. Transfer learning has become particularly valuable in agricultural contexts, where datasets are often scarce or expensive to generate, enabling models trained on large external datasets to be adapted for agricultural applications, with improved convergence and detection accuracy [
113,
114]. Neural networks have automated feature extraction, replacing manual feature engineering through their capacity to hierarchically learn low- and high-level features from raw spectral and imaging inputs [
115]. Ensemble strategies, such as boosting and stacking, have been widely used to combine the strengths of multiple learners, mitigating bias and improving predictive performance [
116]. Computer vision applications demonstrate particular success, with DL enabling robust classification, defect detection, and grading of agricultural commodities [
117]. Industrial food processing and agricultural digitization are further benefiting from AI, where ANNs and DL architectures are being applied to optimize complex non-linear processes and accelerate intelligent manufacturing pipelines [
118]. A clear trend is emerging toward hybrid solutions that integrate ML, DL, and optimization algorithms, creating comprehensive analytical frameworks capable of tackling high-dimensional and multimodal challenges in agriculture and food science [
119,
120].
3.6.8. Method Selection Framework and Decision Flowchart
The comparative performance analysis presented in previous subsections demonstrates that method selection significantly impacts agricultural application success. To address this challenge,
Figure 6 presents a systematic framework for selecting appropriate computational methods based on specific application requirements and constraints.
This framework provides a systematic approach for selecting appropriate computational methods in agricultural engineering. The process first evaluates dataset size, recommending traditional ML (SVM, Random Forest) for datasets under 1000 samples to prevent overfitting. For larger datasets, method selection depends on data type: spectral data suggest PLS for speed or CNN-1D for accuracy; image data lead to YOLO for detection or CNN with transfer learning for classification; time-series data direct to LSTM for complex patterns or Random Forest for simpler cases; tabular data default to ensemble methods. The framework then applies constraint filters, prioritizing interpretable methods when explainability is required, lightweight models for limited computational resources, and ensemble approaches for maximum accuracy. The process concludes with cross-validation to ensure robust model performance. This decision framework addresses the gap between theoretical performance capabilities and practical implementation needs, enabling researchers and practitioners to make informed method selections based on their specific agricultural applications, data characteristics, and operational constraints. The structured approach reduces the complexity of choosing among numerous available techniques while ensuring alignment between method capabilities and application requirements.
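The selection steps described for this framework can be written down as a small rule function. The sketch below encodes only the rules stated in the text; the function name `recommend_method`, its parameter names, and the wording of the returned notes are hypothetical, and the published flowchart may contain branches that this summary does not capture.

```python
def recommend_method(
    n_samples,
    data_type,
    *,
    goal="accuracy",          # "speed" or "accuracy" (used for spectral data)
    task="classification",    # "detection" or "classification" (used for images)
    complex_patterns=True,    # used for time-series data
    needs_explainability=False,
    limited_compute=False,
):
    """Map dataset size, data type, and constraints to a suggested method."""
    known_types = {"spectral", "image", "time-series", "tabular"}
    if data_type not in known_types:
        raise ValueError(f"unknown data type: {data_type!r}")

    # Step 1: small datasets go to traditional ML to limit overfitting.
    if n_samples < 1000:
        method = "traditional ML (SVM, Random Forest)"
    # Step 2: for larger datasets, branch on data type.
    elif data_type == "spectral":
        method = "PLS" if goal == "speed" else "CNN-1D"
    elif data_type == "image":
        method = "YOLO" if task == "detection" else "CNN with transfer learning"
    elif data_type == "time-series":
        method = "LSTM" if complex_patterns else "Random Forest"
    else:  # tabular
        method = "ensemble methods"

    # Step 3: constraint filters are returned as notes on the suggestion.
    notes = []
    if needs_explainability:
        notes.append("prioritize interpretable methods")
    if limited_compute:
        notes.append("prioritize lightweight models")
    if goal == "accuracy" and not (needs_explainability or limited_compute):
        notes.append("consider ensemble approaches for maximum accuracy")

    # Step 4: the process always ends with cross-validation.
    notes.append("validate with cross-validation")
    return {"method": method, "notes": notes}


small = recommend_method(300, "image")
large = recommend_method(5000, "image", task="detection", limited_compute=True)
print(small["method"], "|", large["method"], large["notes"])
```

Treating the constraint filters as notes rather than hard overrides is a design choice made here for simplicity; a stricter reading would let an explainability requirement replace a deep model with a tree-based one.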
5. Conclusions
This survey has provided a comprehensive examination of machine learning and deep learning applications in agricultural engineering through systematic analysis of recent research developments. The investigation addressed five fundamental research questions, with the key findings summarized below:
RQ1: ML/DL Models in Agricultural Engineering: The analysis revealed a diverse ecosystem of computational approaches, ranging from traditional machine learning algorithms (ANNs, SVMs, Random Forest, PLS) to sophisticated deep learning architectures (CNNs, RNNs, GANs, autoencoders). Deep learning models consistently demonstrated 5–10% accuracy improvements over traditional ML approaches, with transfer learning emerging as a critical strategy for agricultural applications with limited labeled data.
RQ2: Agricultural Engineering Challenges Addressed: Three primary challenge domains were identified: (1) agricultural product quality assessment using hyperspectral imaging and spectroscopy; (2) crop and field management through precision optimization and monitoring; and (3) agricultural automation with machine vision systems. These applications demonstrated remarkable potential to enhance productivity while reducing environmental impact and operational costs.
RQ3: Experimental Evaluation Practices: The experimental methodologies varied significantly across studies, with dataset splitting strategies ranging from simple train-test divisions to sophisticated cross-validation approaches. Performance metrics selection showed task-dependency, with classification tasks favoring accuracy and F1-scores, while regression applications emphasized R² and RMSE values. Critical gaps were identified in standardized validation frameworks and reproducibility practices.
RQ4: Application Domains and Success Areas: Applications successfully spanned food quality and safety assessment (93–99% accuracy in image-based tasks), precision agriculture and environmental monitoring, and agricultural automation systems. Spectral data applications dominated at 42.2%, followed by image data at 26.2%, indicating a strong preference for non-destructive analytical approaches.
RQ5: Research Challenges and Future Trends: Current challenges include data limitations, model interpretability issues, and computational complexity constraints. Future trends emphasize lightweight model development for field deployment, ensemble learning strategies, expanding applications in environmental monitoring, and the integration of advanced sensing technologies with AI-driven analytics.
Overall Contributions: This survey consolidates diverse technological advances into a coherent taxonomy, linking sensing modalities, datasets, algorithms, and applications. The comparative analysis demonstrates that ML and DL serve as complementary approaches rather than substitutes, with the evolving trajectory toward hybrid frameworks that blend interpretability with predictive power. The integration of intelligent analytics with agricultural systems holds transformative potential for food production, safety, and sustainability, requiring continued methodological innovations and interdisciplinary collaboration between computational sciences, agronomy, and environmental research.