Machine and Deep Learning in Agricultural Engineering: A Comprehensive Survey and Meta-Analysis of Techniques, Applications, and Challenges

Frimpong, Samuel Akwasi; Han, Mu; Zheng, Wenyi; Li, Xiaowei; Akpaku, Ernest; Obeng, Ama Pokuah

doi:10.3390/computers14100438

Open Access Review

by Samuel Akwasi Frimpong ¹,², Mu Han ¹,*, Wenyi Zheng ¹, Xiaowei Li ¹, Ernest Akpaku ¹ and Ama Pokuah Obeng ³

¹ School of Computer Science and Communication Engineering, Jiangsu University, Zhenjiang 212013, China
² Department of Computer Engineering, Ghana Communication Technology University, Accra PMB 100, Accra-North, Ghana
³ Department of Computer Science, Kumasi Technical University, Kumasi P.O. Box 854, Ghana
* Author to whom correspondence should be addressed.

Computers 2025, 14(10), 438; https://doi.org/10.3390/computers14100438

Submission received: 4 September 2025 / Revised: 27 September 2025 / Accepted: 28 September 2025 / Published: 15 October 2025

Abstract

Machine learning and deep learning techniques integrated with advanced sensing technologies have revolutionized agricultural engineering, addressing complex challenges in food production, quality assessment, and environmental monitoring. This survey presents a systematic review and meta-analysis of recent developments by examining the peer-reviewed literature from 2015 to 2024. The analysis reveals computational approaches ranging from traditional algorithms like support vector machines and random forests to deep learning architectures, including convolutional and recurrent neural networks. Deep learning models often demonstrate superior performance, showing 5–10% accuracy improvements over traditional methods and achieving 93–99% accuracy in image-based applications. Three primary application domains are identified: agricultural product quality assessment using hyperspectral imaging, crop and field management through precision optimization, and agricultural automation with machine vision systems. Dataset taxonomy shows spectral data predominating at 42.1%, followed by image data at 26.2%, indicating preference for non-destructive approaches. Current challenges include data limitations, model interpretability issues, and computational complexity. Future trends emphasize lightweight model development, ensemble learning, and expanding applications. This analysis provides a comprehensive understanding of current capabilities and future directions for machine learning in agricultural engineering, supporting the development of efficient and sustainable agricultural systems for global food security.

Keywords:

machine learning; deep learning; agricultural engineering; precision agriculture; food quality assessment; environmental monitoring

1. Introduction

Agricultural engineering stands at a critical juncture where escalating demands for food security, resource efficiency, and climate resilience necessitate transformative technological integration [1,2]. The integration of machine learning (ML) and deep learning (DL) with agricultural systems has emerged as a pivotal frontier, enabling unprecedented capabilities in interpreting complex biological, environmental, and operational datasets [3,4,5]. Traditional analytical approaches increasingly falter when confronted with the nonlinear dynamics of crop physiology, soil–plant–atmosphere interactions, and supply chain logistics, prompting widespread adoption of data-driven intelligence across the agricultural value chain [6,7,8,9].

Machine learning and deep learning techniques have fundamentally transformed agricultural engineering research and practice, offering capabilities for pattern recognition, predictive modeling, and automated decision making that address complex agricultural challenges previously considered intractable [8,10,11]. From precision irrigation management to real-time crop disease detection, these applications demonstrate the remarkable potential to enhance agricultural productivity while reducing environmental impact and operational costs [12,13,14]. The integration of machine learning with advanced sensing technologies has been particularly transformative [15,16]. Hyperspectral imaging, near-infrared spectroscopy, and other non-destructive analytical methods generate vast amounts of high-dimensional data that traditional statistical approaches cannot effectively process [17,18,19]. Machine learning algorithms excel at extracting meaningful patterns from these complex datasets, enabling the rapid, accurate assessment of crop health, soil conditions, food quality, and environmental parameters [20,21,22,23]. Deep learning architectures, particularly convolutional neural networks, have revolutionized image-based agricultural applications by automatically learning hierarchical feature representations without requiring manual feature engineering [24,25,26]. This advancement has enabled breakthrough applications in crop monitoring, livestock management, and food quality assessment that surpass human capabilities in both speed and accuracy [27,28]. However, practical implementation reveals both opportunities and challenges. While these technologies demonstrate exceptional performance in controlled research environments, their deployment in real-world agricultural contexts requires careful consideration of computational constraints, environmental variability, data availability, and economic feasibility [29,30,31]. 
The development of lightweight models, robust algorithms for noisy field data, and cost-effective sensing solutions remains critical for widespread adoption [32,33,34,35].

This survey examines the current state of machine learning and deep learning applications in agricultural engineering through a systematic analysis of recent research developments. The investigation addresses fundamental questions regarding the types of models currently employed, the specific agricultural engineering challenges most frequently addressed, the methodological approaches used for experimental evaluation, the diverse application domains where these techniques have demonstrated success, and the emerging research challenges and future trends. The analysis reveals a rich ecosystem of computational approaches ranging from traditional machine learning algorithms such as support vector machines and random forests to sophisticated deep learning architectures, including convolutional neural networks and recurrent neural networks. These technologies address challenges spanning crop and livestock management, food quality and safety assessment, environmental monitoring, and agricultural automation. Understanding this landscape is essential for researchers, practitioners, and policymakers seeking to leverage these technologies for addressing global food security challenges.

This survey addresses critical gaps in the existing agricultural machine learning literature through unique systematic contributions. Unlike previous reviews that focus on narrow domains [8] or provide descriptive overviews [4], this work employs a structured meta-analysis (RQ1–RQ5) with quantitative performance synthesis, revealing DL methods achieve 5–10% accuracy improvements over traditional ML approaches. The survey provides the first comprehensive taxonomy of ML/DL techniques for agricultural engineering, systematically examines advanced sensing technology integration (hyperspectral imaging, Raman spectroscopy), and includes rigorous dataset availability and experimental reproducibility assessment. These evidence-based frameworks bridge the gap between technological possibility and practical deployment, offering comprehensive guidance for implementing ML/DL solutions in real-world agricultural contexts while addressing computational, interpretability, and environmental challenges not systematically examined in prior reviews. This comprehensive analysis aligns with computational sciences and engineering applications, addressing the intersection of advanced computing methods with practical agricultural systems, a domain where computational innovation drives real-world impact through intelligent data processing, automated decision making, and precision resource management [19,35,36]. The work contributes to the growing body of literature on machine learning applications in engineering contexts, leveraging business intelligence principles and automated detection frameworks that are revolutionizing traditional practices [37,38,39].

The investigation addresses five fundamental research questions:

RQ1: What machine learning and deep learning models are employed in agricultural engineering applications?
RQ2: What agricultural engineering challenges are most frequently addressed by these computational approaches?
RQ3: How are experimental evaluations performed to validate machine learning and deep learning solutions in agricultural contexts?
RQ4: What are the known application domains where these techniques have demonstrated success?
RQ5: What are the future research challenges and emerging trends in the application of machine learning and deep learning in agricultural engineering?

We formulate these questions to ensure a comprehensive understanding of the current state of ML and DL in agricultural engineering, examining methodological advances (RQ1), engineering challenges (RQ2), validation practices (RQ3), practical contributions (RQ4), and future directions (RQ5). Together, these questions structure Section 4 of this survey.

2. Review Methodology

2.1. Search Strategy and Database Selection

This systematic review employed a comprehensive search strategy across multiple academic databases to ensure thorough coverage of the relevant literature in machine learning and deep learning applications within agricultural engineering [40,41]. The primary databases searched included Web of Science, Scopus, IEEE Xplore, ScienceDirect, and PubMed, selected for their extensive coverage of engineering, computer science, and agricultural research publications. Additional searches were conducted in specialized databases, including AGRICOLA and CAB Abstracts, to capture domain-specific agricultural engineering studies that might not be indexed in general scientific databases. The search strategy utilized a combination of controlled vocabulary terms and free-text keywords to maximize retrieval while maintaining relevance [42,43,44]. Primary search terms included variations of “machine learning,” “deep learning,” “artificial intelligence,” “neural networks,” and “data mining” combined with agricultural domain terms such as “agriculture,” “farming,” “crop,” “livestock,” “food quality,” “precision agriculture,” and “agricultural engineering.” Boolean operators were employed to create comprehensive search strings that captured the intersection of computational methods and agricultural applications while excluding irrelevant domains such as biomedical applications of similar technologies. The temporal scope of the search encompassed publications from 2015 to 2024, a period selected to capture the rapid advancement and adoption of machine learning techniques in agricultural applications while ensuring relevance to current technological capabilities. This timeframe corresponds to the widespread availability of deep learning frameworks and the increasing accessibility of computational resources necessary for implementing sophisticated machine learning algorithms in agricultural contexts.
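As a rough illustration of how such a Boolean string could be assembled, the sketch below joins the method terms and the domain terms listed above with OR inside each group and AND between the groups. The query syntax, the function names, and the single excluded term are assumptions made for illustration; each database has its own query grammar, and the exact strings used in the review are not reported here.

```python
# Hypothetical sketch: the terms come from the text above, but the query
# syntax and the NOT clause are illustrative assumptions, not the authors' strings.
METHOD_TERMS = ["machine learning", "deep learning", "artificial intelligence",
                "neural networks", "data mining"]
DOMAIN_TERMS = ["agriculture", "farming", "crop", "livestock", "food quality",
                "precision agriculture", "agricultural engineering"]
EXCLUDED_TERMS = ["biomedical"]  # assumed example of an irrelevant domain


def or_group(terms):
    """Quote each term and join the group with OR inside parentheses."""
    return "(" + " OR ".join(f'"{t}"' for t in terms) + ")"


def build_query(methods, domains, excluded=()):
    """Combine method and domain groups with AND, then exclude unwanted terms."""
    query = f"{or_group(methods)} AND {or_group(domains)}"
    if excluded:
        query += f" NOT {or_group(excluded)}"
    return query


query = build_query(METHOD_TERMS, DOMAIN_TERMS, EXCLUDED_TERMS)
print(query)
```

Keeping the term lists separate from the assembly logic makes it straightforward to regenerate the string for a database that expects different operators or field tags.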

2.2. Geographical Scope and Limitations

The systematic search encompassed the global literature, though analysis revealed significant geographical imbalances, with studies originating primarily from high-income countries (North America: 34%, Europe: 24%, East Asia: 28%) and limited representation from Sub-Saharan Africa (3%), Latin America (6%), and South Asia (5%) [45,46]. Despite efforts to address this bias through regional database searches, multilingual inclusion, and consultation with international agricultural organizations, the potential underrepresentation of indigenous knowledge systems and research from resource-constrained institutions remains [47,48]. These geographical limitations suggest that reported performance metrics and implementation strategies may not transfer directly to agricultural contexts where food security challenges are most acute, highlighting the need for collaborative frameworks that engage researchers from diverse geographical and economic contexts.

2.3. Inclusion and Exclusion Criteria

Inclusion criteria were carefully designed to focus on studies that directly addressed machine learning or deep learning applications in agricultural engineering contexts. Studies were included if they presented original research involving the development, application, or comparison of machine learning algorithms for solving specific agricultural challenges. This included research on crop management, livestock monitoring, food quality assessment, precision agriculture, agricultural automation, environmental monitoring in agricultural systems, and post-harvest processing applications. Publications were required to demonstrate clear methodological rigor, including detailed descriptions of data collection procedures, model development approaches, and evaluation methodologies. Studies that merely mentioned machine learning techniques without substantial implementation or evaluation were excluded to ensure focus on contributions that advance the field through practical applications and validated results. Additionally, research had to involve real agricultural data or realistic simulations rather than purely theoretical developments without empirical validation. Exclusion criteria eliminated studies that focused primarily on non-agricultural applications of machine learning, even if they claimed agricultural relevance. Research dealing exclusively with laboratory-scale experiments without clear pathways to practical agricultural implementation was also excluded. Publications that did not provide sufficient technical detail for evaluation of methodological quality, studies published in languages other than English, and conference abstracts without full papers were excluded to maintain consistency and accessibility of the reviewed literature. Review articles, meta-analyses, and opinion pieces were excluded from the primary analysis but were consulted for contextual information and to identify additional relevant primary studies through backward citation searching. 
Patents and technical reports were similarly excluded to focus on peer-reviewed research that had undergone rigorous academic evaluation processes.

2.4. Study Selection Process

The study selection process followed a systematic multi-stage approach designed to ensure reproducibility and minimize selection bias. Initial screening involved removing duplicate records across databases using reference management software, followed by title and abstract screening to eliminate obviously irrelevant publications. Two independent authors conducted this initial screening using predetermined criteria, with disagreements resolved through discussion and consultation with a third author when necessary. Full-text screening was performed on potentially relevant articles identified during the initial screening phase. This detailed evaluation assessed methodological quality, relevance to agricultural engineering applications, and the depth of machine learning implementation. Studies were categorized based on their primary application domains, methodological approaches, and the types of machine learning techniques employed to facilitate subsequent analysis and synthesis. A standardized data extraction form was developed to capture relevant information from the included studies systematically. This form included fields for bibliographic information, study objectives, agricultural application domains, the types of data used, the machine learning algorithms employed, experimental design characteristics, the performance metrics reported, key findings, and limitations acknowledged by the authors.
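The duplicate-removal step can be pictured with the following minimal sketch. It assumes each exported record is a dictionary with `title` and `doi` fields and treats two records as duplicates when their DOIs match, falling back to a normalized title when no DOI is present. The review itself used reference management software; the field names, matching rule, and sample records below are hypothetical.

```python
import re


def normalize_title(title):
    """Lowercase and collapse punctuation so near-identical titles match."""
    return re.sub(r"[^a-z0-9]+", " ", title.lower()).strip()


def record_key(record):
    """Prefer the DOI as identity; fall back to the normalized title."""
    doi = (record.get("doi") or "").strip().lower()
    return ("doi", doi) if doi else ("title", normalize_title(record["title"]))


def deduplicate(records):
    """Keep the first occurrence of each record, preserving input order."""
    seen, unique = set(), []
    for record in records:
        key = record_key(record)
        if key not in seen:
            seen.add(key)
            unique.append(record)
    return unique


# Made-up records standing in for exports from different databases.
records = [
    {"title": "Deep Learning for Crop Disease Detection", "doi": "10.1000/abc", "source": "Scopus"},
    {"title": "Deep learning for crop disease detection.", "doi": "10.1000/ABC", "source": "Web of Science"},
    {"title": "SVM-Based Seed Viability Detection", "doi": None, "source": "IEEE Xplore"},
    {"title": "SVM based seed viability detection", "doi": "", "source": "AGRICOLA"},
]
unique = deduplicate(records)
print(len(records), "->", len(unique))
```

A rule this simple will miss a pair in which only one copy carries a DOI, which is one reason automated deduplication is normally followed by the manual title and abstract screening described above.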

2.5. Quality Assessment Framework

The quality assessment framework evaluated multiple dimensions of methodological rigor relevant to machine learning research in agricultural contexts [49]. Technical quality assessment focused on the appropriateness of algorithm selection for the specific agricultural problem, adequacy of the data preprocessing procedures, robustness of the experimental design, including proper train–test splits and cross-validation procedures, and the comprehensiveness of the performance evaluation, using appropriate metrics for the task type.
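For readers less familiar with the validation procedures mentioned above, the following sketch shows one common form of k-fold cross-validation index generation: every sample is held out exactly once, and the training and test sets of a fold never overlap. The function name and parameters are illustrative assumptions, not a procedure taken from any reviewed study.

```python
import random


def k_fold_indices(n_samples, k, seed=0):
    """Yield (train_idx, test_idx) pairs; every sample is tested exactly once."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Distribute the remainder so fold sizes differ by at most one.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test_idx = indices[start:start + size]
        train_idx = indices[:start] + indices[start + size:]
        yield train_idx, test_idx
        start += size


folds = list(k_fold_indices(n_samples=10, k=3))
for train_idx, test_idx in folds:
    print(len(train_idx), len(test_idx))
```

In agricultural datasets where samples from the same field, batch, or season are correlated, a grouped split that keeps related samples in the same fold is usually a safer choice than the plain random split shown here.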

Data quality evaluation examined the representativeness of the datasets used, including sample sizes, the diversity of conditions represented, and the relevance to real-world agricultural scenarios. Reporting quality was assessed based on the completeness of methodological descriptions, reproducibility of reported procedures, clarity of result presentation, and acknowledgment of study limitations. Studies that provided sufficient detail for replication and clearly discussed the practical implications and limitations of their findings received higher quality scores. The integration of domain expertise was evaluated by examining whether studies appropriately incorporated agricultural knowledge into their machine learning approaches, consulted with agricultural experts in the study design or interpretation, and addressed practical considerations relevant to agricultural implementation. This criterion recognized that effective agricultural applications of machine learning require a deep understanding of both computational methods and agricultural systems.

2.6. Data Extraction and Analysis Framework

Data extraction employed a structured approach designed to capture both quantitative performance data and qualitative insights regarding the practical applicability of different machine learning approaches in agricultural contexts. Quantitative data included performance metrics such as accuracy, precision, recall, and F1-scores for classification tasks, as well as R-squared, RMSE, and MAE values for regression applications. When possible, computational requirements, including training time, inference speed, and hardware requirements, were also extracted to assess practical deployment feasibility. Methodological characteristics were systematically categorized, including the types of input data used, the preprocessing procedures employed, the algorithm architectures implemented, the hyperparameter optimization approaches, and the validation strategies utilized. This information enabled comparative analysis of methodological trends and identification of best practices across different agricultural application domains. Application domain classification involved categorizing studies based on their primary agricultural focus areas, including crop production, livestock management, food quality and safety, precision agriculture, environmental monitoring, and agricultural automation. Within each domain, specific applications were further categorized to identify areas of concentrated research activity and emerging application opportunities. The analysis framework incorporated both descriptive statistics to characterize the overall landscape of research activity and comparative analysis to identify trends in performance, methodological approaches, and application domains over time. Meta-analytical techniques were employed where sufficient homogeneity existed among studies to enable the quantitative synthesis of results.
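The extracted metrics follow their standard definitions, which the sketch below spells out for a binary classification task and a regression task. The function names and the small example vectors are made up for illustration and do not come from any reviewed study.

```python
import math


def classification_metrics(y_true, y_pred, positive=1):
    """Accuracy, precision, recall, and F1 for a binary task."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    accuracy = sum(t == p for t, p in pairs) / len(pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


def regression_metrics(y_true, y_pred):
    """R-squared, RMSE, and MAE for a regression task."""
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]
    ss_res = sum(e * e for e in errors)
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return {
        "r2": 1 - ss_res / ss_tot if ss_tot else float("nan"),
        "rmse": math.sqrt(ss_res / n),
        "mae": sum(abs(e) for e in errors) / n,
    }


# Illustrative values only.
cls = classification_metrics([1, 0, 1, 1, 0, 0, 1, 0], [1, 0, 0, 1, 0, 1, 1, 0])
reg = regression_metrics([2.0, 4.0, 6.0, 8.0], [2.5, 3.5, 6.5, 7.5])
print(cls)
print(reg)
```

Accuracy alone can be misleading when classes are imbalanced, for example when diseased samples are rare, which is why precision, recall, and F1 are reported alongside it.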

2.7. Data Availability and Reproducibility Assessment

To enhance transparency and support future research, we systematically evaluated the availability of datasets and code from the reviewed studies, revealing significant challenges in reproducibility practices across agricultural ML/DL research. Our analysis categorized studies into three groups: open-access (12% of studies), with complete datasets and code available through platforms like GitHub, Mendeley, or institutional repositories; upon-request (54% of studies), where data are available by contacting the authors, though response rates and data completeness vary significantly; and proprietary/restricted (34% of studies), with limited or no data availability due to commercial partnerships, privacy constraints, or a lack of sharing protocols. This distribution highlights critical gaps in research transparency that hinder scientific reproducibility and knowledge transfer. To address these limitations, we recommend establishing standardized data sharing protocols for agricultural ML/DL research, creating domain-specific repositories for agricultural datasets, and implementing FAIR (Findable, Accessible, Interoperable, Reusable) data principles to ensure that future research can build effectively upon existing work and facilitate a broader adoption of successful methodologies across diverse agricultural contexts.

2.8. Synthesis and Reporting Approach

The synthesis approach emphasized identifying patterns and trends across machine and deep learning applications in agricultural engineering while recognizing the heterogeneity of approaches, datasets, and evaluation methods employed across studies. Rather than attempting forced quantitative meta-analysis, the synthesis focused on identifying common challenges, successful approaches, and emerging trends that could inform future research directions. Thematic analysis was employed to identify recurring themes related to agricultural challenges amenable to machine learning solutions, characteristics of successful implementations, and barriers to practical deployment. This analysis informed the development of taxonomies for machine and deep learning approaches, agricultural application domains, and evaluation methodologies presented in Section 3.5. Technical details of implementations were accurately represented while maintaining accessibility for readers with diverse backgrounds in both agricultural engineering and computational methods. Limitations were carefully documented, including the potential publication bias toward positive results, rapid technological development rendering older studies less relevant, and challenges comparing results across different datasets and evaluation criteria. These limitations were considered when interpreting findings and formulating research recommendations. The synthesis process culminated in developing comprehensive taxonomies of machine and deep learning techniques, agricultural applications, and evaluation methodologies, along with identifying research gaps and future directions. This approach ensured the review provides both a comprehensive overview of current research and actionable insights for advancing machine learning applications in agricultural engineering.

3. Taxonomy of Employed ML and DL Techniques in Agricultural Engineering

In this section, we address RQ1 by summarizing the machine learning and deep learning models employed in agricultural engineering. The surveyed literature reveals a diverse ecosystem of ML and DL models deployed to address core challenges in this domain [50,51,52]. These models are typically selected based on data type, problem complexity, and computational constraints.

3.1. Temporal Evolution and Trends (2015–2024)

Analysis of the surveyed literature reveals clear temporal trends in ML/DL adoption patterns. Early studies (2015–2017) predominantly employed traditional ML approaches, with SVMs and ANNs accounting for 65% of applications. The period 2018–2020 marked a transition phase, with increasing CNN adoption (40% of studies), while recent years (2021–2024) show the dominance of deep learning architectures, particularly transfer learning approaches (55% of studies) and hybrid CNN-LSTM models (25%). This evolution reflects both improved computational accessibility and the recognition that agricultural data’s complexity requires more sophisticated feature extraction capabilities.

3.2. Hierarchical Classification of ML and DL Techniques

3.2.1. Traditional Machine Learning Approaches

Traditional ML algorithms remain widely utilized, particularly for structured/tabular data, spectral analysis, and scenarios with limited datasets.

Neural Networks Family
o Artificial Neural Networks (ANNs) and Variants: This category encompasses Back-Propagation ANNs (BP-ANNs), Extreme Learning Machine ANNs (ELM-ANNs), and Radial Basis Function ANNs (RBF-ANNs), which have been applied across a wide range of agricultural tasks. Primary applications include modeling complex physicochemical properties (tomatoes [53], green tea amino acids [54]), forecasting drying kinetics and quality parameters across numerous crops [55,56], predicting crop transpiration/evapotranspiration [57], online detection systems (bacterial concentration [58], authenticity verification [59], soil pH monitoring [60]), and general classification tasks [61,62,63,64]. These networks excel at handling nonlinear relationships, modeling complex processes, and adapting to diverse data types, including spectral, sensor, and environmental inputs.
o Theoretical Foundations and Universal Approximation Properties: ANNs can approximate any continuous function on a compact domain to arbitrary accuracy, given sufficient hidden units and a suitable nonlinear activation function [38], enabling them to model complex agricultural phenomena beyond the reach of parametric models. Convolutional Neural Networks (CNNs) extend this to spatial data via translation-invariant feature extraction, making them well-suited for agricultural imaging. Ensemble methods leverage learner diversity to reduce bias and variance, explaining their strong and consistent performance across agricultural domains [39]. These advances align with broader trends in intelligent systems and sensor technologies, driving automated data processing and decision making in engineering [36,37].
Kernel-Based Methods
o Support Vector Machine (SVM) and Support Vector Regression (SVR): These methods are prominent for classification and regression with complex, high-dimensional data. Applications span quality assessment [53], object classification and counting [65,66], seed viability detection [67], geographical authenticity prediction [64], quantitative spectral analysis [68], pesticide residue detection in tea [70], and the classification of deep features extracted by CNNs [69]. They are effective in high-dimensional spaces, robust against overfitting, and perform well with limited data.
Tree-Based Methods
o Random Forest (RF): Favored for its robustness and ensemble nature, RF excels in forecasting environmental variables like vapor pressure deficit [71], in regression tasks for toxin detection, and in classification applications [72,73]. Strengths include handling noisy data, providing feature importance, and achieving high accuracy.
o Decision Trees and Variants: These methods include REPTree, M5P, TB, and TF variants and are used both as interpretable standalone models and within ensembles for forecasting [71] and regression tasks [74].
o Gradient Boosting Frameworks: XGBoost, LightGBM, and CatBoost are leveraged for complex predictions, including tea origin determination [75], variety classification, and time-series forecasting [57], frequently within hybrid model configurations.
Statistical and Specialized Methods
o Partial Least Squares (PLS): Dominant in chemometrics and spectral analysis, PLS excels in quantitative analysis [70,76] and identification/discrimination tasks [17,77] and often serves as a baseline for deep learning comparisons [75]. It is particularly effective for highly collinear spectral data with combined dimensionality reduction and regression/classification capabilities.
o Other Specialized Algorithms: These include k-Nearest Neighbor (KNN) for simple classification tasks [17,64], various regression techniques (Linear Regression, Multiple Linear Regression, Additive Regression Trees) for environmental forecasting [78,79], domain-specific models like Grey Models for time series forecasting [63], and dimensionality reduction techniques (PCA, LDA) for data preprocessing [72,77].
Ensemble Methods
o Meta-Learning Approaches: These include stacking, blending, weighted averages, and neural network-based ensembles. These methods enhance performance across applications like carbon flux estimation [80], evapotranspiration forecasting [71], and heavy metal detection [74]. They mitigate overfitting, improve generalization, and leverage the strengths of base learners.
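Of the ensemble strategies listed above, the weighted average is the simplest to illustrate. The sketch below weights each base learner by the inverse of an assumed validation error and blends their predictions. It is not a stacking implementation, and the models, predictions, and error values are made up for illustration.

```python
def inverse_error_weights(validation_errors):
    """Weight each base learner by 1/error, normalized to sum to 1."""
    inverse = [1.0 / e for e in validation_errors]
    total = sum(inverse)
    return [w / total for w in inverse]


def weighted_average(predictions, weights):
    """Blend per-model prediction lists into one list of predictions."""
    return [sum(w * p for w, p in zip(weights, sample_preds))
            for sample_preds in zip(*predictions)]


# Made-up predictions from three hypothetical base learners on four samples.
predictions = [
    [2.0, 4.0, 6.0, 8.0],  # model A
    [2.4, 4.4, 6.4, 8.4],  # model B (biased high)
    [1.0, 3.0, 5.0, 7.0],  # model C (biased low)
]
validation_errors = [0.2, 0.4, 1.0]  # assumed validation RMSE per model

weights = inverse_error_weights(validation_errors)
blended = weighted_average(predictions, weights)
print(weights)
print(blended)
```

Stacking replaces the fixed weighting rule with a second-level model trained on the base learners' held-out predictions, which is more flexible but needs careful cross-validation to avoid leaking training labels.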
Other Key ML Algorithms:

Beyond core ML/DL architectures, the survey identified several specialized algorithms tailored to specific agricultural engineering tasks. Gradient Boosting frameworks (XGBoost, LightGBM, CatBoost) were leveraged for complex predictions, like determining tea origin and pungency intensity [75], classifying oolong tea varieties, and forecasting time-series data such as tomato transpiration rates [57] and solar radiation, frequently within hybrid model configurations. For simpler classification tasks, such as verifying tea authenticity [64] or identifying maize seed harvest years [17], the k-Nearest Neighbor (KNN) algorithm was commonly applied. Various regression techniques, including Linear Regression (LR), Multiple Linear Regression (MLR), and Additive Regression Tree (ART), proved effective for forecasting environmental variables like Vapor Pressure Deficit (VPD [71]) and predicting moisture content in products like coated pineapple cubes [79]. Domain-specific challenges were addressed by specialized models: the Grey Model handled seed distribution and time series forecasting [63], Response Surface Methodology (RSM) optimized grain cleaning systems, and Fuzzy Models predicted losses like sieve efficiency in harvesters. Finally, dimensionality reduction, primarily through Principal Component Analysis (PCA) for spectral/imaging data preprocessing (e.g., watermelon seed analysis [77]) and Linear Discriminant Analysis (LDA) for classification [72], was a ubiquitous step in streamlining complex datasets before model application.
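To make the PCA preprocessing step concrete, the sketch below extracts the first principal component of a small set of made-up "spectra" by mean-centering the data, forming the covariance matrix, and applying power iteration. The helper names and data are illustrative assumptions; production pipelines normally rely on a numerical library and an SVD-based routine rather than this hand-written version.

```python
import math


def center(rows):
    """Subtract the per-column mean from every row."""
    n = len(rows)
    means = [sum(col) / n for col in zip(*rows)]
    return [[x - m for x, m in zip(row, means)] for row in rows]


def covariance(centered):
    """Sample covariance matrix of already-centered data."""
    n, d = len(centered), len(centered[0])
    return [[sum(r[i] * r[j] for r in centered) / (n - 1) for j in range(d)]
            for i in range(d)]


def first_component(cov, iterations=200):
    """Power iteration: repeatedly apply cov and renormalize."""
    d = len(cov)
    v = [1.0 / math.sqrt(d)] * d
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return v


# Made-up "spectra": three strongly correlated bands, four samples.
spectra = [
    [1.0, 2.0, 3.0],
    [2.0, 4.1, 6.0],
    [3.0, 6.0, 9.1],
    [4.0, 8.1, 12.0],
]
centered = center(spectra)
pc1 = first_component(covariance(centered))
# Project each sample onto the first component: three bands become one score.
scores = [sum(x * v for x, v in zip(row, pc1)) for row in centered]
print(pc1)
print(scores)
```

Because neighboring spectral bands are highly collinear, a handful of such components often captures most of the variance, which is what makes PCA a useful step before fitting a classifier or regressor.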

As summarized in Table 1, the dominant traditional machine learning models demonstrate clear specialization patterns across different agricultural engineering applications.

3.2.2. Deep Learning Architectures

DL models, particularly CNNs, dominate applications involving image, spectral, and point cloud data, offering superior feature extraction capabilities.

Convolutional Neural Networks (CNNs)
o Classical CNN Architectures: AlexNet, VGG16, GoogLeNet, and ResNet variants are widely employed for crop and seed classification [69,81], livestock recognition [11], and quality assessment [82].
o Object Detection Networks: YOLO series (v3, v4, v5, v7, v8s), Faster R-CNN, and Mask R-CNN dominate real-time detection applications in complex field environments, particularly for tea shoot detection [74,76] and precise localization tasks.
o Specialized CNN Variants: These include 1D-CNNs and 2D-CNNs for spectral data analysis [73,83], 3D processing architectures like VoxNet for LiDAR point clouds [84], and low-light enhancement networks (Zero-DCE, EnlightenGAN) for nighttime monitoring [85].
o Hybrid CNN Architectures: These include transformer-embedded CNNs for disease identification, CNN-LSTM models for multi-view analysis, and lightweight variants like YOLOv5s for comprehensive agricultural grading [68].

The success of CNNs in agricultural applications can be attributed to their hierarchical feature learning capabilities and translation invariance properties. Recent advances in automatic data augmentation frameworks have further enhanced CNN performance in industrial and agricultural defect detection tasks, demonstrating the importance of data preprocessing strategies for improving model robustness [39]. These developments reflect broader machine learning-facilitated business intelligence approaches that optimize complex decision-making processes [38].
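The translation property mentioned above can be illustrated with a minimal pure-Python sketch. It is not taken from any surveyed study: the kernel, the toy "spectra", and the function names are hypothetical. Strictly speaking, the convolution itself is translation-equivariant (the response moves with the input), and it is the subsequent global pooling that makes the final feature invariant to the shift.

```python
def conv1d_valid(signal, kernel):
    """Slide the kernel over the signal without padding (cross-correlation)."""
    k = len(kernel)
    return [
        sum(signal[i + j] * kernel[j] for j in range(k))
        for i in range(len(signal) - k + 1)
    ]


def global_max_pool(feature_map):
    """Collapse a feature map to its strongest response."""
    return max(feature_map)


# Hypothetical peak-detecting kernel and two toy signals whose only
# difference is the position of a single peak (index 3 versus index 8).
kernel = [-1.0, 2.0, -1.0]
signal_a = [0.0] * 12
signal_a[3] = 1.0
signal_b = [0.0] * 12
signal_b[8] = 1.0

fmap_a = conv1d_valid(signal_a, kernel)
fmap_b = conv1d_valid(signal_b, kernel)

# Equivariance: the location of the strongest response shifts with the peak.
shift = fmap_b.index(max(fmap_b)) - fmap_a.index(max(fmap_a))

# Invariance after pooling: the pooled feature is the same for both inputs.
pooled_a = global_max_pool(fmap_a)
pooled_b = global_max_pool(fmap_b)
```

In this toy setting the pooled feature does not depend on where the peak sits, which is the behaviour that makes CNN features tolerant to small positional changes of an object or spectral band.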

Recurrent Neural Networks (RNNs)
o Long Short-Term Memory (LSTM) Networks: These provide primary solutions for modeling sequential and time-series agricultural data, including physicochemical property modeling [53], soil moisture prediction, yield approximation, and irrigation optimization [86].
o Hybrid RNN Architectures: CNN-LSTM combinations enhance performance in tasks combining spatial and temporal data, as exemplified in pesticide residue detection using spectroscopy [87].

Generative and Unsupervised Models
o Autoencoder Variants: Stacked Autoencoders (SAE), Stacked Denoising Autoencoders (SDAE), and Stacked Weighted Autoencoders (SWAE) are leveraged for feature learning from complex agricultural data, including heavy metal detection [26,74], fruit quality prediction [68], and hyperspectral dataset processing [78].
o Generative Adversarial Networks (GANs): These are primarily applied in data augmentation and image enhancement, including Wasserstein GANs (WGANs) for livestock recognition robustness [11] and EnlightenGAN for low-light image enhancement [85].

Transfer Learning and Specialized Architectures
o Transfer Learning: This represents a ubiquitous strategy across agricultural DL applications, utilizing pre-trained models (typically ImageNet-based) fine-tuned for specific agricultural tasks, significantly improving performance when labeled data are scarce [11,87].
o Other Deep Learning Models: These include Deep Belief Networks (DBNs) for specialized applications like rice seed germination detection using fluorescence spectra.

The diversity of deep learning architectures and their specialized applications in agricultural contexts is comprehensively illustrated in Table 2. This hierarchical classification reveals that traditional ML methods provide interpretable solutions for structured data, while deep learning excels in complex feature extraction from high-dimensional datasets. The complementary nature of these approaches, combined with increasing hybrid strategies, demonstrates agricultural computing’s evolution toward integrated multi-modal frameworks [88].

3.3. Synthesis of Model Utilization Trends

Ensemble and Hybrid Approaches: Combining models (Ensemble ML [74,80], CNN-LSTM hybrids [86]) is a key strategy to enhance accuracy and robustness and handle multi-modal data.
Transfer Learning Imperative: The reliance on transfer learning to overcome data scarcity for DL models is near-universal in agricultural applications [11,69,82].
Performance Focus: Studies consistently compare model performance (accuracy, precision, RMSE, R²), with complex DL and ensemble methods frequently achieving state-of-the-art results [65,68,70,87], albeit sometimes at higher computational cost. PLS/PLS-ML often serves as a key benchmark in spectral analysis [70,76,77,78].
Computational Pragmatism: Lightweight models (e.g., improved YOLOv5s [76], Zero-DCE [85], 1D-CNNs [78]) and efficient architectures are prioritized for potential field deployment, especially for real-time tasks like detection and monitoring.

This taxonomy provides a comprehensive overview of the ML and DL model landscape actively employed in addressing contemporary agricultural engineering challenges, highlighting the synergy between model capabilities and specific application requirements.

3.4. Contributions of ML and DL Techniques to Agricultural Engineering Challenges

In this section, we address RQ2 by examining the agricultural engineering challenges most frequently tackled using machine learning and deep learning approaches. The reviewed studies reveal that these computational techniques significantly enhance efficiency, quality, and sustainability. Broadly, the challenges addressed fall into three major categories: (i) agricultural product quality assessment, classification, and identification; (ii) crop and field management; and (iii) agricultural automation and robotics.

3.4.1. Agricultural Product Quality Assessment, Classification, and Identification

Machine learning and deep learning techniques enable rapid, non-destructive evaluation of agricultural products by integrating advanced sensing technologies like hyperspectral imaging [60,68,87,90], NIR spectroscopy [54,59,64,79], and Raman spectroscopy [73,86]. These methods overcome traditional limitations by quantifying chemical constituents (amino acids [54], toxins [73], pesticide residues [70,86]), monitoring production processes (fermentation [58,76], drying [78]), predicting internal attributes (soluble solids [68,87], firmness), authenticating geographical origins [59,75], and assessing product freshness/viability [67,77]. To handle high-dimensional spectral data complexity, deep learning models such as CNNs [73,78,87] and autoencoders [68,87] excel at feature extraction and modeling non-linear relationships. Furthermore, these techniques provide robust solutions for classifying crop varieties [1,69,81] and grading quality based on visual or spectral characteristics [64,82].

3.4.2. Crop and Field Management

ML/DL optimizes critical field operations through precision resource management and monitoring. Key applications include water resource optimization using evapotranspiration forecasting models (e.g., Random Forest for Vapor Pressure Deficit prediction [63]) and intelligent irrigation scheduling via reinforcement learning [5]. Plant health management is enhanced through early disease/pest detection from leaf images and heavy metal contamination monitoring using hyperspectral imaging with deep learning [26,61,74]. Yield prediction and growth monitoring leverage ensemble learning with UAV remote sensing data, while carbon cycle analysis employs ML ensemble models to quantify net ecosystem exchange in tea plantations [80], overcoming the limitations of traditional measurement systems.

3.4.3. Agricultural Automation and Robotics

Automation challenges are addressed through ML/DL-powered machine vision systems. Automated harvesting solutions enable the accurate, real-time detection of crops like tea shoots in complex field environments using lightweight models deployable on mobile platforms [57,58]. Intelligent sorting systems replace manual grading through livestock recognition networks (e.g., ResNet with GAN-enhanced imaging [11]) and the automated quality assessment of produce like apples [82]. Precision farming equipment benefits from ML-driven seed distribution prediction [63] and adaptive controllers that optimize parameters for harvesters and seeders [91]. Continuous 24/7 monitoring capabilities are achieved through specialized deep learning networks that enhance low-light imaging for nighttime operations like dairy cow detection [85].

3.5. Experimental Evaluations of ML and DL Techniques in Agricultural Engineering

In this section, we address RQ3 by examining how experimental evaluations are performed to validate machine learning and deep learning approaches in agricultural engineering. The reviewed studies reveal a heterogeneous landscape of methodologies, encompassing data preparation strategies, performance metrics, statistical and optimization techniques, experimental procedures, software tools, and reproducibility practices. Together, these elements form the methodological backbone for assessing the accuracy, robustness, and effectiveness of computational solutions in agricultural contexts.

3.5.1. Data Splitting and Preparation

A foundational aspect of evaluation involves partitioning datasets to assess model generalization on unseen data. Typical strategies include two-way splits (training/testing) and three-way splits (training/validation/testing), with proportions varying by study [84]. For example, tea constituent quantification used a 66.7–33.3 split [54], while heavy metal detection applied a 75–25 split [26]. Orange freshness detection followed a 70–30 division, whereas cotton irrigation relied on historical calibration data and recent validation sets. Cross-validation enhances robustness, with k-fold approaches (5- or 10-fold), Monte Carlo cross-validation (MCCV), and leave-one-out validation (LOO) frequently applied [66,69,78]. Specialized designs such as the 5×2 cross-validation (5x2CV) paired t-test have also been used for pairwise model comparisons [17,62,65,72]. These practices are systematically compared in Table 3, which highlights inconsistencies across agricultural applications.
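The partitioning schemes named above can be sketched as index generators. This is a minimal standard-library illustration, not the code used in any surveyed study. The function names, the seeds, and the sample counts are assumptions chosen for the example. In practice, libraries such as scikit-learn provide equivalent, better-tested utilities.

```python
import random


def train_test_split_indices(n_samples, test_fraction, seed=0):
    """Shuffle sample indices and hold out a fraction for testing."""
    rng = random.Random(seed)
    indices = list(range(n_samples))
    rng.shuffle(indices)
    n_test = round(n_samples * test_fraction)
    return indices[n_test:], indices[:n_test]


def k_fold_indices(n_samples, k, seed=0):
    """Yield (train, test) index lists; each sample is tested exactly once."""
    rng = random.Random(seed)
    indices = list(range(n_samples))
    rng.shuffle(indices)
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, folds[i]


def leave_one_out_indices(n_samples):
    """Yield (train, test) pairs in which the test set is a single sample."""
    for i in range(n_samples):
        yield [j for j in range(n_samples) if j != i], [i]


def monte_carlo_cv_indices(n_samples, n_repeats, test_fraction, seed=0):
    """Repeat random hold-out splits; test sets may overlap between repeats."""
    for r in range(n_repeats):
        yield train_test_split_indices(n_samples, test_fraction, seed=seed + r)


# Example: a 75-25 split of 120 hypothetical samples, then 5-fold CV.
train_idx, test_idx = train_test_split_indices(120, 0.25)
five_folds = list(k_fold_indices(120, 5))
```

The key difference between k-fold and Monte Carlo cross-validation is visible in the code: k-fold partitions the data so every sample is tested once, whereas MCCV draws a fresh random hold-out set on each repeat.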

3.5.2. Experimental Design Criteria and Decision Frameworks

Analysis of the surveyed literature reveals that researchers employ diverse criteria for experimental design decisions, though these are not always explicitly stated. Key decision frameworks identified include the following:

Dataset Size-Dependent Criteria: Studies with datasets > 1000 samples typically employed 70/30 or 80/20 splits with k-fold cross-validation, while smaller datasets (<200 samples) favored leave-one-out validation or bootstrap methods to maximize training data utilization [17,54]. Researchers justified these choices based on statistical power requirements and generalization assessment needs.
Application-Specific Validation Strategies: Time-series agricultural applications (irrigation, yield prediction) predominantly used temporal splits respecting chronological order [88], while spatial applications employed geographical cross-validation to assess model transferability across different farms or regions. Spectroscopic studies consistently applied spectral preprocessing criteria, including noise reduction (Savitzky–Golay filtering), scatter correction (SNV/MSC), and variable selection (CARS/SPA) based on signal-to-noise ratios and collinearity assessments.
Performance Metric Selection Rationale: Researchers selected metrics based on application criticality and cost considerations. Food safety applications emphasized precision/recall balance (F1-scores) to minimize false negatives [90], while yield prediction studies prioritized RMSE and R² for quantitative accuracy assessment [71]. Spectroscopic applications consistently reported RPD values > 2.0 as acceptance criteria for model reliability.
Computational Resource Constraints: Studies involving real-time applications (robotic harvesting, field monitoring) explicitly considered computational efficiency, leading to lightweight model selection (MobileNet variants, pruned networks) and edge computing deployment criteria [66,85]. Laboratory-based studies, which did not face comparable computational limits, explored more complex architectures without such constraints.
Domain Expert Integration: Agricultural domain expertise influenced experimental design through feature engineering guidance, relevant variable selection, and the interpretation of results within biological/physical contexts. Studies involving agronomists or food scientists in the experimental design showed more robust validation protocols and practical applicability assessments.
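The size-dependent and application-specific criteria listed above can be written as a small rule-of-thumb helper. This is a sketch under stated assumptions: the function names and category labels are hypothetical, and the fallback for datasets between 200 and 1000 samples is an assumption, because the surveyed criteria do not cover that range explicitly.

```python
def suggest_validation(n_samples, data_kind="tabular"):
    """Map the criteria reported in the survey to a validation strategy."""
    if data_kind == "time_series":
        return "temporal split respecting chronological order"
    if data_kind == "spatial":
        return "geographical cross-validation across farms or regions"
    if n_samples < 200:
        return "leave-one-out or bootstrap"
    if n_samples > 1000:
        return "70/30 or 80/20 split with k-fold cross-validation"
    # Assumption: the survey gives no explicit rule for 200-1000 samples.
    return "k-fold cross-validation"


def temporal_split(records, train_fraction=0.8):
    """Split (timestamp, value) records so that training precedes testing."""
    ordered = sorted(records, key=lambda record: record[0])
    cut = int(len(ordered) * train_fraction)
    return ordered[:cut], ordered[cut:]


# Example with hypothetical daily soil-moisture readings (day, value).
readings = [(day, 0.30 - 0.01 * day) for day in range(10, 0, -1)]
past, future = temporal_split(readings)
```

The temporal split never shuffles the records, so no reading in the test portion is older than any reading used for training.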

3.5.3. Performance Metrics

Evaluation relies heavily on quantitative performance indicators, selected according to prediction type [46,60,69]. For regression tasks, metrics such as the coefficient of determination (R²), root mean square error (RMSE) [45,48,49], mean absolute error (MAE) [45,48,49], residual predictive deviation (RPD) [65,66,67], and Akaike information criterion (AIC) are commonly reported [54,68,71,76]. In classification tasks, metrics such as accuracy, precision, recall, F1-score, area under the curve (AUC), and mean average precision (mAP) are employed [11,82,91,92]. Confusion matrices often complement these metrics, providing detailed error distribution [93,94,95]. These measures are classified and contextualized in Table 4, which demonstrates task-dependent selection but also significant inconsistency across studies.
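As a concrete reference for the metrics named above, the following standard-library sketch computes the main regression and binary classification indicators. The data values and function names are hypothetical. RPD is computed here as the sample standard deviation of the reference values divided by the RMSE, which is one common convention; individual studies may define it slightly differently.

```python
import math
import statistics


def regression_metrics(y_true, y_pred):
    """Return R^2, RMSE, MAE and RPD for paired reference/predicted values."""
    n = len(y_true)
    residuals = [t - p for t, p in zip(y_true, y_pred)]
    ss_res = sum(r * r for r in residuals)
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    rmse = math.sqrt(ss_res / n)
    return {
        "r2": 1 - ss_res / ss_tot,
        "rmse": rmse,
        "mae": sum(abs(r) for r in residuals) / n,
        "rpd": statistics.stdev(y_true) / rmse,  # assumed convention: SD / RMSE
    }


def classification_metrics(y_true, y_pred, positive=1):
    """Return accuracy, precision, recall and F1 for a binary task."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = len(pairs) - tp - fp - fn
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {
        "accuracy": (tp + tn) / len(pairs),
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


# Hypothetical reference values and predictions.
reg = regression_metrics([1.0, 2.0, 3.0, 4.0, 5.0], [1.1, 1.9, 3.2, 3.8, 5.0])
clf = classification_metrics([1, 1, 1, 0, 0, 0, 1, 0], [1, 1, 0, 0, 0, 1, 1, 0])
```

Reporting several of these together matters because they answer different questions: R² and RPD are relative to the spread of the reference data, while RMSE and MAE are in the units of the measured quantity.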

3.5.4. Statistical Analysis and Optimization

Statistical tools and optimization techniques are integral to validating agricultural ML/DL models [88]. Analysis of Variance (ANOVA) and Duncan’s Multiple Range Test (DMRT) assess factor significance and mean differences in agricultural trials [10,48]. Dimensionality reduction methods such as principal component analysis (PCA) and linear discriminant analysis (LDA) aid in feature extraction and group separation [72,77,96,97]. Variable selection methods, including Competitive Adaptive Reweighted Sampling (CARS) and Iteratively Retaining Informative Variables (IRIV), are widely used for high-dimensional spectral data [98,99]. Optimization techniques such as Particle Swarm Optimization (PSO), the Chameleon Swarm Algorithm (CSA), and Differential Evolution (DE) are employed to tune model parameters and improve predictive accuracy [100,101]. A comparative mapping of these methods is provided in Table 5, while Table 6 illustrates common hyperspectral preprocessing pipelines.
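To make the parameter-tuning role of swarm optimizers concrete, the sketch below implements a basic global-best PSO in pure Python. It is an illustration only: the objective is a hypothetical smooth surface standing in for a validation error over two hyperparameters, and the inertia and acceleration coefficients are common textbook values rather than settings reported by the surveyed studies.

```python
import random


def pso_minimize(objective, bounds, n_particles=20, n_iters=100,
                 inertia=0.7, c_personal=1.5, c_global=1.5, seed=0):
    """Basic global-best particle swarm optimization over box bounds."""
    rng = random.Random(seed)
    dim = len(bounds)
    positions = [[rng.uniform(lo, hi) for lo, hi in bounds]
                 for _ in range(n_particles)]
    velocities = [[0.0] * dim for _ in range(n_particles)]
    best_pos = [p[:] for p in positions]            # personal best positions
    best_val = [objective(p) for p in positions]    # personal best values
    g = min(range(n_particles), key=best_val.__getitem__)
    g_pos, g_val = best_pos[g][:], best_val[g]      # swarm-wide best

    for _ in range(n_iters):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                velocities[i][d] = (
                    inertia * velocities[i][d]
                    + c_personal * r1 * (best_pos[i][d] - positions[i][d])
                    + c_global * r2 * (g_pos[d] - positions[i][d])
                )
                lo, hi = bounds[d]
                # Move the particle and clip it to the search box.
                positions[i][d] = min(hi, max(lo, positions[i][d] + velocities[i][d]))
            val = objective(positions[i])
            if val < best_val[i]:
                best_pos[i], best_val[i] = positions[i][:], val
                if val < g_val:
                    g_pos, g_val = positions[i][:], val
    return g_pos, g_val


def toy_validation_error(params):
    """Hypothetical error surface with its minimum at (2.0, -1.0)."""
    x, y = params
    return (x - 2.0) ** 2 + (y + 1.0) ** 2


best_params, best_error = pso_minimize(toy_validation_error,
                                       bounds=[(-5.0, 5.0), (-5.0, 5.0)])
```

In a real study the objective would train a model with the candidate parameters and return a cross-validated error, which makes each evaluation far more expensive than this toy function.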

3.5.5. Experimental Procedures and Visualizations

Beyond numerical evaluation, many studies incorporate domain-specific procedures to validate ML/DL models. Gold-standard reference measurements serve as benchmarks, including TVB-N for chicken freshness [98], fatty acid analysis for flour storage [102], atomic absorption spectrometry for heavy metals [26], HPLC for pesticides, and sensory analysis for pungency [75]. Advanced sensing applications include rapid bacterial species identification through integrated colorimetric sensors with near-infrared spectroscopy [98] and comprehensive food odor visualization using chemo-responsive dyes [103]. Imaging applications commonly involve ROI extraction, spectral preprocessing (Savitzky–Golay smoothing, SNV, MSC, derivatives), and feature selection algorithms (CARS, SPA, BOSS) [29,103,104]. Visualization techniques such as color-coded maps, Taylor diagrams, and heatmaps enhance interpretability, while machine vision studies utilize controlled acquisition systems and data augmentation (rotation, mirroring, noise) [23,105,106]. Food processing applications incorporate comprehensive experimental procedures, including microwave infrared cooperative drying with detailed monitoring of moisture evolution, structure changes, and physicochemical properties [107]. Robotics-focused evaluations often combine simulation (e.g., MATLAB/Simulink, version R2025a or earlier) with real-world testing, ensuring both reproducibility and practicality [20,108,109].

Cross-validation methodology selection represents a critical decision in experimental design, with different approaches showing distinct advantages and limitations as evaluated in Table 7. The analysis reveals that k-fold cross-validation is preferred for medium to large datasets, while specialized approaches like Monte Carlo and time-series splits are used for specific scenarios. The computational cost-benefit trade-off influences method selection, with simpler approaches often chosen despite potentially lower reliability.
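Two of the spectral preprocessing steps mentioned in this subsection, Standard Normal Variate (SNV) scaling and Savitzky–Golay smoothing, can be sketched in a few lines. This is a simplified illustration with hypothetical function names. The smoother uses only the classical 5-point quadratic window with coefficients (-3, 12, 17, 12, -3)/35 and, as a simplification, leaves the two points at each end unchanged. Production pipelines typically rely on a library routine with configurable window length and polynomial order.

```python
import statistics


def snv(spectrum):
    """Standard Normal Variate: center and scale one spectrum by its own
    mean and sample standard deviation to reduce scatter effects."""
    mean = statistics.mean(spectrum)
    std = statistics.stdev(spectrum)
    return [(x - mean) / std for x in spectrum]


# Classical 5-point quadratic Savitzky-Golay smoothing weights (sum = 35).
SG_5PT_QUADRATIC = (-3, 12, 17, 12, -3)


def savgol_smooth_5pt(spectrum):
    """Smooth interior points with the 5-point quadratic window.
    Simplification: the first and last two points are copied unchanged."""
    smoothed = list(spectrum)
    for i in range(2, len(spectrum) - 2):
        window = spectrum[i - 2:i + 3]
        smoothed[i] = sum(c * x for c, x in zip(SG_5PT_QUADRATIC, window)) / 35
    return smoothed


# Hypothetical reflectance values, and the same spectrum with a
# multiplicative scatter factor and an additive baseline offset.
raw = [0.20, 0.40, 0.90, 0.50, 0.30]
scattered = [2.0 * x + 0.1 for x in raw]

snv_raw = snv(raw)
snv_scattered = snv(scattered)
```

After SNV the two versions of the spectrum coincide, which is why the transform is used to remove multiplicative and additive scatter differences between samples before modeling.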

Performance benchmarking across different agricultural applications reveals significant variation in expected outcomes and success factors, as synthesized in Table 8. The benchmark matrix establishes that performance expectations vary significantly across applications, with R² values ranging from 0.70 to 0.95 depending on the complexity of the agricultural task. Success factors consistently include proper preprocessing and feature selection, while common challenges relate to environmental variability and data quality issues across different agricultural contexts.

The overall quality of experimental designs across the surveyed literature can be systematically evaluated using the framework presented in Table 9. This quality assessment framework provides a systematic approach to evaluate study rigor, revealing that many agricultural engineering studies fall short of excellent standards. The weighted scoring system highlights critical gaps in sample sizes, validation strategies, and reproducibility practices. Most studies achieve “good” to “fair” ratings, with particular weaknesses in statistical testing comprehensiveness and code/data availability, suggesting significant room for improvement in research quality and transparency across the field.

Overall, agricultural engineering experiments integrate rigorous preprocessing methodologies, tailored performance metrics, advanced statistical and optimization techniques, and comprehensive real-world validations to ensure effectiveness and reliability of machine and deep learning solutions across diverse agricultural applications.

3.5.6. Dataset Taxonomy and Availability

Datasets are central to experimental evaluations, providing the foundation for training and validating ML/DL models. The surveyed studies reveal a taxonomy of commonly used datasets across agricultural engineering:

Spectral datasets (NIR, hyperspectral, Raman, fluorescence HSI) dominate, particularly for food quality and chemical constituent analysis.
Image datasets (RGB images, annotated plant/livestock photos, point cloud data) are widely used for classification, detection, and robotic automation tasks.
Environmental/Meteorological datasets (climate, soil, IoT-based sensing) enable prediction of evapotranspiration, irrigation needs, and carbon flux.
Physicochemical/Reference datasets (TVB-N, HPLC, sensory panels) serve as gold standards for benchmarking model predictions.
Olfactory and sensor datasets support quality and freshness monitoring via electronic noses and volatile compound analysis.

As illustrated in Figure 1, spectral data methods dominate current research, representing 42.1% of all dataset types, followed by image data (26.2%) and smaller proportions of environmental, reference, and sensory/olfactory datasets. This distribution reflects strong emphasis on non-destructive sensing modalities, with growing integration of environmental and IoT data sources. The implications of dataset accessibility and reproducibility practices for these diverse data types are discussed in Section 2.7.

3.5.7. Software Platforms and Reproducibility Practices

Evaluation frameworks in agricultural engineering are deeply tied to the software platforms that support ML and DL analyses. While traditional environments such as MATLAB remain popular for spectral preprocessing and numerical modeling, they are often integrated with ML pipelines by exporting features into learning algorithms or embedding neural network toolboxes for predictive tasks. Similarly, SPSS (version 31 or earlier) and ENVI (version 5.7.2 or earlier), though conventionally associated with statistical analysis and hyperspectral image processing, are widely employed in conjunction with ML/DL frameworks, serving as preprocessing stages before models such as CNNs, SVMs, or ensemble learners are trained. Specialized commercial tools like The Unscrambler X and TQ Analyst facilitate variable selection and multivariate calibration, which are subsequently coupled with regression or classification algorithms. CloudCompare, commonly applied for point cloud data, has been extended through Python scripting to feed ML classifiers for tree segmentation and structural analysis.

Despite the role of these legacy platforms, the field is witnessing a decisive shift toward open-source ML/DL ecosystems, with Python 3 emerging as the dominant environment. Python’s extensive libraries, such as scikit-learn for classical ML, PyTorch (version 2.8 or earlier) and TensorFlow for deep learning, NumPy for numerical computation, and Matplotlib for visualization, provide comprehensive support for end-to-end workflows, from data preprocessing to model training and deployment. Studies increasingly report using Jupyter Notebooks for transparent documentation of experiments, facilitating both collaboration and reproducibility. WEKA has also gained traction as a lightweight, accessible platform for ML applications in agricultural datasets, particularly when introducing ensemble learners, feature selection, and comparative benchmarking across algorithms.

Reproducibility practices remain uneven. Many studies still limit access to datasets with “upon request” policies, which constrains independent validation. By contrast, exemplary works provide open repositories on GitHub containing both code and ML/DL model architectures, alongside Mendeley-hosted datasets for direct reuse. Such practices represent the gold standard, enabling full replication of ML pipelines. Nonetheless, reliance on commercial platforms continues to create barriers, underscoring the growing importance of community-driven, open-source ecosystems for scaling ML/DL solutions across agricultural engineering.

3.6. Application Domains of ML and DL in Agricultural Engineering

In this section, we address RQ4 by exploring the application domains where machine learning (ML) and deep learning (DL) techniques have demonstrated success in agricultural engineering and food science. The survey reveals widespread adoption of ML/DL due to their strengths in feature extraction, pattern recognition, prediction, and optimization. These methodologies consistently outperform traditional approaches in capturing complex data relationships, handling high-dimensional feature spaces, and delivering enhanced accuracy and efficiency across diverse agricultural applications.

3.6.1. Performance of Traditional ML Algorithms

Traditional ML algorithms remain widely applied due to their interpretability, relatively modest computational requirements, and strong performance in structured data applications. Representative performance results are summarized in Table 10, Figure 2 and Figure 3.

As illustrated in Figure 4, ML models generally achieve accuracies in the 83–96% range. Their R² values typically lie between 0.72 and 0.95, while RMSE values vary widely depending on the task (5.2–17.5). Ensemble and boosting methods often approach DL-level performance but with lower computational cost, making them attractive for resource-constrained agricultural deployments.

3.6.2. Performance of Deep Learning Algorithms

Deep learning models consistently outperform traditional ML approaches in handling unstructured and high-dimensional data (e.g., spectral and image data). Their strength lies in automated feature extraction and modeling of complex nonlinear relationships. Table 11 presents performance metrics across common DL architectures.

As shown in Figure 2, Figure 3 and Figure 4, DL algorithms improve accuracy by 5–10% over traditional ML, with R² gains of 0.05–0.15 and consistently lower RMSE. Transfer learning and CNN-LSTM hybrids are particularly effective, achieving accuracies of 93–99% and RMSE values as low as 3.2–7.8. These improvements translate into fewer misclassification errors in critical agricultural tasks such as disease detection, food safety, and yield forecasting.

3.6.3. Performance Implications for Agricultural Applications

The comparative performance of machine learning (ML) and deep learning (DL) algorithms carries significant implications for both research design and real-world agricultural practice. Across the reviewed studies, DL approaches consistently outperformed traditional ML models in terms of accuracy, explanatory power (R²), and predictive reliability (RMSE). On average, DL methods demonstrated accuracy improvements of 5–10% and R² gains of 0.05–0.15 compared to their ML counterparts. This margin, though modest in percentage terms, becomes critical in agricultural contexts, where small improvements in classification or prediction accuracy can translate into substantial resource savings, improved crop yields, or enhanced food safety.

For classification-oriented tasks such as crop variety identification, disease detection, and quality grading, DL architectures, particularly convolutional neural networks (CNNs) and transfer learning methods, achieved superior accuracy levels (93–99%) compared to SVMs and ANNs (83–94%). The practical implication is clear: DL can reduce misclassification rates in high-stakes agricultural decision making, minimizing wasted inputs and mitigating the risk of undetected crop diseases. Moreover, the consistently higher F1 scores (0.92–0.99) achieved by DL approaches reflect a better balance between precision and recall, a crucial advantage in scenarios such as food safety assessment, where both false positives and false negatives have costly consequences.

In regression-based applications (yield estimation, evapotranspiration forecasting, and quality parameter prediction), hybrid DL models such as CNN-LSTM combinations also demonstrate strong predictive performance. Their lower RMSE values (3.5–7.8) compared to ML methods (5.2–16.8) underscore their reliability for supporting precision agriculture decisions such as irrigation scheduling and harvest timing. Likewise, their higher R² values (0.90–0.97) indicate greater explanatory power for capturing complex temporal–spatial dependencies in agricultural data, making them more robust for long-term deployment in variable field environments.

Nevertheless, the advantages of DL are tempered by practical constraints. DL models demand larger datasets, higher computational resources, and more extensive training compared to traditional ML approaches. This poses challenges for smallholder farmers or developing regions where computational infrastructure may be limited. In such contexts, resource-efficient ML methods such as gradient boosting or Random Forests remain highly valuable, often approaching DL-level accuracy (e.g., XGBoost with 89–96% accuracy, R² of 0.85–0.95) but at a fraction of the computational cost. Interpretability also emerges as a decisive factor: stakeholders often require transparent decision-making frameworks for adoption and regulatory compliance, an area where tree-based ML models retain a clear edge over the “black-box” nature of DL systems.

The broader implication is that neither ML nor DL provides a universally optimal solution. Instead, the results point to a hybrid analytical landscape where DL methods are most appropriate for unstructured, high-dimensional data sources (e.g., images, hyperspectral data, multimodal IoT streams), while ML models remain better suited to structured data scenarios requiring efficiency and interpretability. This complementarity suggests that future agricultural decision-support systems will increasingly integrate both paradigms, leveraging DL for pattern discovery and ML for transparent, resource-efficient inference.

3.6.4. Application Domains

The survey highlights that machine learning (ML) and deep learning (DL) techniques have become widely adopted across agricultural and food science, owing to their strong capabilities in feature extraction, pattern recognition, prediction, and optimization. Their application spans diverse domains, which can be broadly grouped into three major categories: food quality and safety, agricultural management and environmental monitoring, and broader methodological and industrial advancements, as shown in Figure 5.

3.6.5. Food Quality and Safety

In food quality and safety, ML and DL have been extensively applied to monitor, assess, and predict diverse quality parameters. Tomato quality studies have employed support vector regression (SVR) models to capture the influence of storage temperature and packaging on color changes, enzymatic activity, and antioxidant properties. Garlic drying processes have relied on artificial neural networks (ANNs) to optimize infrared-convective drying by adjusting airflow, temperature, and radiation intensity, balancing processing time with nutrient preservation. Poultry safety assessment has used hyperspectral imaging with optimization algorithms and ANN modeling to detect microbial contamination such as Pseudomonas and Enterobacteriaceae, alongside physical contamination like tumors. Tea quality evaluation has advanced through CNN-based systems, which outperform regression-based approaches for aroma analysis and pesticide residue detection using Raman spectroscopy. Similarly, CNN and hybrid LSTM–CNN models have been applied in corn oil and Sichuan pepper quality authentication, offering robust performance in detecting toxins, residues, and geographic authenticity. Apple quality has also received significant attention, with stacked autoencoders and CNN-based classifiers such as VGG16 achieving high performance in soluble solid detection, color assessment, and deformity identification. Beyond these, applications extend to real-time rice grain breakage sensing, grain freshness monitoring with colorimetric arrays, and moisture prediction in Tencha drying using deep learning and computer vision.

3.6.6. Agricultural Management and Environmental Monitoring

Agricultural management and environmental monitoring have similarly benefited from the versatility of ML and DL approaches. Tree classification and structural segmentation have leveraged D-PointNet++ for the precise mapping of crowns, trunks, and supporting structures in large-scale nursery systems. Reinforcement learning frameworks have been developed for irrigation optimization, integrating soil data, crop status indicators, and weather conditions to improve cotton yield and water-use efficiency. Crop evapotranspiration and stress indices have been modeled with AI-enhanced sensor data assimilation, while nitrogen optimization strategies have incorporated deep reinforcement learning with crop simulators to improve input efficiency [110,111]. Fluorescence hyperspectral imaging [112] coupled with deep autoencoder architectures has enabled lead contamination prediction in oilseed rape, while ensemble ML methods have supported carbon flux prediction in tea plantation ecosystems. Livestock management has also adopted DL for automated animal identification, disease monitoring, and predictive analytics using weight and imaging inputs. Similarly, CatBoost algorithms have been applied for crop transpiration prediction, while DL-based robotic vision systems have shown significant promise in unstructured farm environments, enhancing detection and operational efficiency.

3.6.7. Broader Applications and Methodological Advancements

Beyond these specific domains, broader applications highlight how ML and DL methodologies are driving methodological innovation. Transfer learning has become particularly valuable in agricultural contexts, where datasets are often scarce or expensive to generate, enabling models trained on large external datasets to be adapted for agricultural applications, with improved convergence and detection accuracy [113,114]. Neural networks have automated feature extraction, replacing manual feature engineering through their capacity to hierarchically learn low- and high-level features from raw spectral and imaging inputs [115]. Ensemble strategies, such as boosting and stacking, have been widely used to combine the strengths of multiple learners, mitigating bias and improving predictive performance [116]. Computer vision applications demonstrate particular success, with DL enabling robust classification, defect detection, and grading of agricultural commodities [117]. Industrial food processing and agricultural digitization are further benefiting from AI, where ANNs and DL architectures are being applied to optimize complex non-linear processes and accelerate intelligent manufacturing pipelines [118]. A clear trend is emerging toward hybrid solutions that integrate ML, DL, and optimization algorithms, creating comprehensive analytical frameworks capable of tackling high-dimensional and multimodal challenges in agriculture and food science [119,120].

3.6.8. Method Selection Framework and Decision Flowchart

The comparative performance analysis presented in previous subsections demonstrates that method selection significantly impacts agricultural application success. To address this challenge, Figure 6 presents a systematic framework for selecting appropriate computational methods based on specific application requirements and constraints.

The framework proceeds in four stages. It first evaluates dataset size, recommending traditional ML (SVM, Random Forest) for datasets under 1000 samples to prevent overfitting. For larger datasets, method selection depends on data type: for spectral data, PLS is suggested for speed and CNN-1D for accuracy; for image data, YOLO is suggested for detection and CNN with transfer learning for classification; for time-series data, LSTM is suggested for complex patterns and Random Forest for simpler cases; tabular data default to ensemble methods. The framework then applies constraint filters, prioritizing interpretable methods when explainability is required, lightweight models when computational resources are limited, and ensemble approaches when maximum accuracy is the goal. The process concludes with cross-validation to ensure robust model performance. This decision framework addresses the gap between theoretical performance and practical implementation needs, enabling researchers and practitioners to select methods according to their specific agricultural applications, data characteristics, and operational constraints, and reducing the complexity of choosing among the many available techniques.
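The selection logic described in this subsection can be sketched as a small rule-based function. This is a minimal illustration rather than the authors' implementation: the `Task` fields, the `priority` values ("speed", "detection", "simple"), and the function name `select_method` are hypothetical names introduced here, while the thresholds and method names follow the text.

```python
from dataclasses import dataclass


@dataclass
class Task:
    # All field names are hypothetical; they only mirror the criteria in the text.
    n_samples: int
    data_type: str                 # "spectral", "image", "time_series", or "tabular"
    priority: str = "accuracy"     # assumed values: "speed", "detection", "simple", "accuracy"
    needs_explainability: bool = False
    limited_compute: bool = False
    max_accuracy: bool = False


def select_method(task: Task) -> tuple[str, list[str]]:
    """Return (suggested method, constraint notes) following the described stages."""
    # Stage 1: datasets under 1000 samples go to traditional ML to limit overfitting.
    if task.n_samples < 1000:
        method = "traditional ML (SVM, Random Forest)"
    # Stage 2: for larger datasets, branch on data type.
    elif task.data_type == "spectral":
        method = "PLS" if task.priority == "speed" else "CNN-1D"
    elif task.data_type == "image":
        method = "YOLO" if task.priority == "detection" else "CNN with transfer learning"
    elif task.data_type == "time_series":
        method = "Random Forest" if task.priority == "simple" else "LSTM"
    elif task.data_type == "tabular":
        method = "ensemble methods"
    else:
        raise ValueError(f"unknown data type: {task.data_type!r}")

    # Stage 3: constraint filters are recorded as notes, not hard overrides.
    notes = []
    if task.needs_explainability:
        notes.append("prioritize an interpretable method")
    if task.limited_compute:
        notes.append("prioritize a lightweight model")
    if task.max_accuracy:
        notes.append("consider an ensemble approach")
    # Stage 4: the process always ends with cross-validation.
    notes.append("validate with cross-validation")
    return method, notes


method, notes = select_method(Task(n_samples=5000, data_type="image", priority="detection"))
print(method, notes)
```

Recording the constraint filters as notes, instead of overriding the suggested method, is a design choice of this sketch; the text does not state how conflicts between filters are resolved.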

4. Discussion

The comprehensive survey demonstrates that while machine learning (ML) and deep learning (DL) have shown substantial benefits in agricultural engineering, their widespread adoption faces critical challenges spanning technical, economic, and social dimensions. This analysis addresses RQ5 by identifying key bottlenecks and charting future directions for transforming ML/DL applications into scalable, robust, and industry-ready solutions for agricultural and food systems.

4.1. Current Research Challenges

4.1.1. Data Limitations and Quality Issues

Data constitute the primary constraint on ML/DL deployment in agriculture. Many studies rely on small datasets that are insufficient for training robust models, leading to overfitting or underfitting [45,46]. The cost and difficulty of acquiring labeled data in real-world agricultural environments exacerbate this limitation. Advanced sensing technologies such as hyperspectral imaging generate extremely high-dimensional data, creating challenges for storage and efficient processing. Data quality issues, including noise, incomplete meteorological records, and sensor malfunction, further compromise reliability, and these problems are particularly acute in developing regions where infrastructure for consistent data acquisition is limited [121].
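To give a sense of scale for the storage challenge, the following back-of-the-envelope sketch computes the raw size of a single hyperspectral cube. The dimensions (1000 × 1000 pixels, 200 bands, 16-bit values) are hypothetical figures chosen only for illustration and are not taken from any surveyed study.

```python
def cube_size_bytes(rows: int, cols: int, bands: int, bytes_per_value: int) -> int:
    """Raw, uncompressed size of an image cube in bytes."""
    return rows * cols * bands * bytes_per_value


# Hypothetical hyperspectral cube: 1000 x 1000 pixels, 200 bands, 2 bytes per value.
hsi = cube_size_bytes(1000, 1000, 200, 2)
# Hypothetical RGB image of the same spatial size: 3 channels, 1 byte per value.
rgb = cube_size_bytes(1000, 1000, 3, 1)

print(f"hyperspectral cube: {hsi / 1e6:.0f} MB")
print(f"RGB image:          {rgb / 1e6:.0f} MB")
print(f"ratio:              {hsi / rgb:.0f}x")
```

Under these assumed dimensions, one cube occupies roughly two orders of magnitude more storage than an RGB image of the same scene, before any compression or band selection.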

4.1.2. Model Interpretability and Generalizability

Deep neural networks, despite their superior performance, often function as “black boxes,” offering limited interpretability for end-users such as agronomists, food scientists, and policymakers, who require causal insights rather than mere predictions [122]. Generalizability remains another critical issue, with models trained on specific crop varieties, livestock species, or regional conditions often failing when applied elsewhere. Overfitting tendencies with limited data worsen this problem, necessitating rigorous cross-validation, external dataset testing, and more transparent model design [123].
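One way to probe the generalizability concern is to hold out an entire group (for example, a region or crop variety) during validation, so that the model is always tested on conditions it has not seen. The sketch below is an illustrative leave-one-group-out splitter written in plain Python; the function name and the example region labels are assumptions made here, not a procedure reported in the surveyed studies.

```python
from collections import defaultdict


def leave_one_group_out(groups):
    """Yield (held_out_group, train_indices, test_indices) for each distinct group.

    `groups` holds one label per sample, e.g. the region where it was collected.
    """
    by_group = defaultdict(list)
    for idx, label in enumerate(groups):
        by_group[label].append(idx)

    for held_out in sorted(by_group):
        test_idx = list(by_group[held_out])
        # Train on every sample that does not belong to the held-out group.
        train_idx = [i for i, label in enumerate(groups) if label != held_out]
        yield held_out, train_idx, test_idx


# Hypothetical region labels for six samples.
regions = ["north", "north", "south", "east", "south", "east"]
for region, train_idx, test_idx in leave_one_group_out(regions):
    print(f"hold out {region}: train={train_idx} test={test_idx}")
```

Compared with a random split, this arrangement prevents samples from the same region from appearing in both training and test sets, which would otherwise give an optimistic estimate of performance in new regions.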

4.1.3. Computational and Technical Constraints

Agricultural environments present inherent complexity, with variations in lighting, humidity, temperature, and background interference posing significant challenges for image-based recognition systems. Nighttime operations can introduce particular difficulties for mobile robots and drones. Many DL architectures are computationally heavy, making them unsuitable for real-time processing on embedded or mobile agricultural devices and creating tension between model accuracy and field deployment requirements [119,120]. Training and tuning modern ML/DL models requires expert knowledge of hyperparameter optimization and advanced programming, restricting adoption among practitioners who lack specialized computational backgrounds.

4.1.4. Economic and Infrastructure Barriers

ML/DL implementation faces substantial economic barriers, with comprehensive systems costing USD 15,000 to 150,000 per farm [124,125]. This creates prohibitive barriers for smallholder farmers, who constitute 84% of global agricultural operations but lack access to advanced computational infrastructure and high-speed connectivity. The digital divide perpetuates agricultural productivity gaps between developed and developing regions, as sophisticated sensing equipment and computational resources remain economically unfeasible for resource-constrained farming operations [126].

4.2. Socio-Economic and Ethical Considerations

4.2.1. Digital Divide and Economic Accessibility

The concentration of ML/DL research in high-income countries (86% of studies) reflects underlying technological inequalities that limit the global applicability of findings. Agricultural systems, environmental conditions, and resource constraints vary dramatically between developed and developing regions, yet models developed for mechanized, resource-intensive systems may not transfer to smallholder farming contexts [122]. Economic sustainability requires innovative deployment models, including shared infrastructure approaches, cooperative sensor networks, and pay-per-use systems that reduce upfront capital requirements [123].

4.2.2. Ethical Implications and Algorithmic Bias

Agricultural AI deployment raises critical ethical concerns, including data ownership and farmer sovereignty, particularly when proprietary algorithms are developed using farmer-generated data without transparent governance frameworks [124]. Algorithmic bias represents another significant challenge, as models trained on data from specific contexts may perpetuate inequalities when applied to different farming systems, particularly affecting smallholder farmers in developing regions [118]. Additionally, automation through ML/DL-powered robotics may displace agricultural labor, creating social costs that must be weighed against productivity gains [125].

4.2.3. Deployment and Scalability Challenges

Real-world deployment faces environmental robustness issues, with 94% of vision-based systems showing degraded performance under variable field conditions, including dust, moisture, and temperature extremes [113,121]. Integration complexity requires seamless compatibility with existing farm management systems, accessible user interfaces for farmers with varying technical literacy, and local technical support networks, often lacking in rural areas. Regulatory frameworks for AI-driven agricultural decisions remain inadequate across most jurisdictions [126,127].

4.3. Future Research Directions

4.3.1. Advanced Model Architectures and Integration

Future trajectories point toward lightweight CNNs, hybrid CNN-LSTM models, and ensemble learning strategies that balance accuracy with computational feasibility [122]. Transfer learning and meta-learning approaches will mitigate dataset scarcity by leveraging pre-trained models across domains. Optimization techniques, including genetic algorithms, swarm intelligence, and reinforcement learning, will refine hyperparameters and improve adaptability under diverse agricultural conditions [116,119].
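The text does not specify how lightweight CNNs reduce computational cost. As one illustration, the sketch below compares the weight count of a standard convolution with that of a depthwise separable convolution, a widely used building block in lightweight architectures. The technique is an assumption introduced here for illustration, and the layer sizes (3 × 3 kernel, 64 input channels, 128 output channels) are hypothetical.

```python
def standard_conv_params(kernel: int, c_in: int, c_out: int) -> int:
    """Weights in a standard 2D convolution (bias terms omitted)."""
    return kernel * kernel * c_in * c_out


def depthwise_separable_params(kernel: int, c_in: int, c_out: int) -> int:
    """Weights in a depthwise convolution followed by a 1x1 pointwise convolution."""
    depthwise = kernel * kernel * c_in   # one kernel x kernel filter per input channel
    pointwise = c_in * c_out             # 1x1 convolution mixing channels
    return depthwise + pointwise


# Hypothetical layer: 3x3 kernel, 64 input channels, 128 output channels.
std = standard_conv_params(3, 64, 128)
sep = depthwise_separable_params(3, 64, 128)
print(f"standard: {std} weights, separable: {sep} weights, reduction: {std / sep:.1f}x")
```

For this assumed layer the separable form uses about one-eighth of the weights, which indicates why such designs are attractive for embedded field devices; the effect on accuracy depends on the task and is not captured by this count.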

4.3.2. Sensing Technologies and Data Fusion

Hyperspectral imaging, fluorescence imaging, and surface-enhanced Raman spectroscopy are expanding as cornerstone technologies for non-destructive, real-time quality assessment. When paired with AI-based analysis, these modalities enable ultrasensitive detection of contaminants, toxins, and spoilage markers. Data fusion strategies combining satellite imagery, IoT sensors, and UAV platforms will enrich datasets, reduce noise, and enhance robustness [124]. Open-access repositories and standardized preprocessing protocols will strengthen reproducibility across studies [36,127].

4.3.3. Sustainable and Responsible AI Development

Responsible AI development must adopt interdisciplinary approaches integrating technological innovation with social science, economics, and policy research to ensure equitable access and sustainable impact [128]. Future research should prioritize inclusive development approaches engaging farmers as co-designers, transparent data governance protocols, bias detection and mitigation strategies, and comprehensive impact assessments, including social and environmental considerations. Integration with existing agricultural extension services provides pathways for technical support and farmer education [129]. Establishing ethical frameworks and regulatory oversight will be essential for scaling AI applications while protecting farmer interests and ensuring food security benefits reach those most in need.

The next decade will likely witness a shift toward models that are not only accurate but also interpretable, lightweight, and deployable in real-world field conditions. The fusion of advanced sensing technologies with AI-driven analytics promises real-time, non-destructive, and scalable solutions across the entire agricultural value chain. Overcoming data, interpretability, deployment, and equity barriers will be essential to realize the full transformative potential of ML and DL in agricultural engineering.

5. Conclusions

This survey has provided a comprehensive examination of machine learning and deep learning applications in agricultural engineering through systematic analysis of recent research developments. The investigation addressed five fundamental research questions, with the key findings summarized below:

RQ1: ML/DL Models in Agricultural Engineering: The analysis revealed a diverse ecosystem of computational approaches, ranging from traditional machine learning algorithms (ANNs, SVMs, Random Forest, PLS) to sophisticated deep learning architectures (CNNs, RNNs, GANs, autoencoders). Deep learning models often demonstrated 5–10% accuracy improvements over traditional ML approaches, with transfer learning emerging as a critical strategy for agricultural applications with limited labeled data.
RQ2: Agricultural Engineering Challenges Addressed: Three primary challenge domains were identified: (1) agricultural product quality assessment using hyperspectral imaging and spectroscopy; (2) crop and field management through precision optimization and monitoring; and (3) agricultural automation with machine vision systems. These applications demonstrated remarkable potential to enhance productivity while reducing environmental impact and operational costs.
RQ3: Experimental Evaluation Practices: The experimental methodologies varied significantly across studies, with dataset splitting strategies ranging from simple train-test divisions to sophisticated cross-validation approaches. Performance metrics selection showed task-dependency, with classification tasks favoring accuracy and F1-scores, while regression applications emphasized R² and RMSE values. Critical gaps were identified in standardized validation frameworks and reproducibility practices.
RQ4: Application Domains and Success Areas: Applications successfully spanned food quality and safety assessment (93–99% accuracy in image-based tasks), precision agriculture and environmental monitoring, and agricultural automation systems. Spectral data applications dominated at 42.1%, followed by image data at 26.2%, indicating a strong preference for non-destructive analytical approaches.
RQ5: Research Challenges and Future Trends: Current challenges include data limitations, model interpretability issues, and computational complexity constraints. Future trends emphasize lightweight model development for field deployment, ensemble learning strategies, expanding applications in environmental monitoring, and the integration of advanced sensing technologies with AI-driven analytics.

Overall Contributions: This survey consolidates diverse technological advances into a coherent taxonomy, linking sensing modalities, datasets, algorithms, and applications. The comparative analysis demonstrates that ML and DL serve as complementary approaches rather than substitutes, with the field evolving toward hybrid frameworks that blend interpretability with predictive power. The integration of intelligent analytics with agricultural systems holds transformative potential for food production, safety, and sustainability, and realizing it requires continued methodological innovation and interdisciplinary collaboration between computational sciences, agronomy, and environmental research.

Author Contributions

Conceptualization, S.A.F., M.H. and E.A.; methodology, M.H., W.Z. and S.A.F.; software, M.H. and X.L.; validation, S.A.F. and W.Z.; formal analysis, W.Z.; investigation, W.Z.; resources, M.H.; data curation, W.Z. and X.L.; writing—original draft preparation, W.Z. and S.A.F.; writing—review and editing, S.A.F., E.A. and A.P.O.; visualization, W.Z.; supervision, M.H.; project administration, M.H.; funding acquisition, M.H. All authors have read and agreed to the published version of the manuscript.

Funding

The authors declare that this study received no financial support.

Data Availability Statement

No new data were created or analyzed in this study.

Conflicts of Interest

The authors declare no conflicts of interest.

References

Iqbal, B.; Alabbosh, K.F.; Jalal, A.; Suboktagin, S.; Elboughdiri, N. Sustainable Food Systems Transformation in the face of climate change: Strategies, challenges, and policy implications. Food Sci. Biotechnol. 2024, 34, 871–883. [Google Scholar] [CrossRef] [PubMed]
Louta, M.; Banti, K.; Karampelia, I. Emerging technologies for sustainable agriculture: The power of humans and the way ahead. IEEE Access 2024, 12, 98492–98529. [Google Scholar] [CrossRef]
Rushchitskaya, O.; Kulikova, E.; Kot, E.; Kruzhkova, T. Sustainable practices and Technological Innovations Transforming Agribusiness Dynamics. E3S Web Conf. 2024, 542, 03003. [Google Scholar] [CrossRef]
Araújo, S.O.; Peres, R.S.; Ramalho, J.C.; Lidon, F.; Barata, J. Machine learning applications in agriculture: Current trends, challenges, and future perspectives. Agronomy 2023, 13, 2976. [Google Scholar] [CrossRef]
Yang, H.; Xiong, S.; Frimpong, S.A.; Zhang, M. A consortium blockchain-based agricultural machinery scheduling system. Sensors 2020, 20, 2643. [Google Scholar] [CrossRef]
Mohyuddin, G.; Khan, M.A.; Haseeb, A.; Mahpara, S.; Waseem, M.; Saleh, A.M. Evaluation of Machine Learning Approaches for precision farming in Smart Agriculture System: A comprehensive review. IEEE Access 2024, 12, 60155–60184. [Google Scholar] [CrossRef]
Abubakar, R.; Effah, E.K.; Frimpong, S.A.; Acakpovi, A.; Acheampong, P.; Kadambi, G.R.; Kumar, K.M.S. Adoption of smart grid in Ghana using Pattern Recognition Neural Networks. In Proceedings of the 2019 International Conference on Computing, Computational Modelling and Applications (ICCMA), Cape Coast, Ghana, 27–29 March 2019; pp. 66–665. [Google Scholar] [CrossRef]
Sharma, A.; Jain, A.; Gupta, P.; Chowdary, V. Machine learning applications for Precision Agriculture: A comprehensive review. IEEE Access 2021, 9, 4843–4873. [Google Scholar] [CrossRef]
Mesías-Ruiz, G.A.; Pérez-Ortiz, M.; Dorado, J.; de Castro, A.I.; Peña, J.M. Boosting precision crop protection towards agriculture 5.0 via machine learning and emerging technologies: A contextual review. Front. Plant Sci. 2023, 14, 1143326. [Google Scholar] [CrossRef]
El-Mesery, H.S.; Qenawy, M.; Ali, M.; Hu, Z.; Adelusi, O.A.; Njobeh, P.B. Artificial intelligence as a tool for predicting the quality attributes of garlic (Allium sativum L.) slices during continuous infrared-assisted hot air drying. J. Food Sci. 2024, 89, 7693–7712. [Google Scholar] [CrossRef]
Pan, Y.; Zhang, Y.; Wang, X.; Gao, X.X.; Hou, Z. Low-cost livestock sorting information management system based on deep learning. Artif. Intell. Agric. 2023, 9, 110–126. [Google Scholar] [CrossRef]
Padhiary, M.; Saha, D.; Kumar, R.; Sethi, L.N.; Kumar, A. Enhancing precision agriculture: A comprehensive review of machine learning and AI vision applications in all-terrain vehicle for farm automation. Smart Agric. Technol. 2024, 8, 100483. [Google Scholar] [CrossRef]
Getahun, S.; Kefale, H.; Gelaye, Y. Application of precision agriculture technologies for sustainable crop production and environmental sustainability: A systematic review. Sci. World J. 2024, 2024, 2126734. [Google Scholar] [CrossRef]
Taha, M.F.; Mao, H.; Zhang, Z.; Elmasry, G.; Awad, M.A.; Abdalla, A.; Mousa, S.; Elwakeel, A.E.; Elsherbiny, O. Emerging technologies for precision crop management towards agriculture 5.0: A comprehensive overview. Agriculture 2025, 15, 582. [Google Scholar] [CrossRef]
Qin, Y.-M.; Tu, Y.-H.; Li, T.; Ni, Y.; Wang, R.-F.; Wang, H. Deep Learning for Sustainable Agriculture: A systematic review on applications in lettuce cultivation. Sustainability 2025, 17, 3190. [Google Scholar] [CrossRef]
Mgendi, G. Unlocking the potential of precision agriculture for Sustainable Farming. Discov. Agric. 2024, 2, 87. [Google Scholar] [CrossRef]
Zhu, Y.; Fan, S.; Zuo, M.; Zhang, B.; Zhu, Q.; Kong, J. Discrimination of New and Aged Seeds Based on On-Line Near-Infrared Spectroscopy Technology Combined with Machine Learning. Foods 2024, 13, 1570. [Google Scholar] [CrossRef] [PubMed]
Delfani, P.; Thuraga, V.; Banerjee, B.; Chawade, A. Integrative approaches in modern agriculture: IOT, ML and AI for disease forecasting amidst climate change. Precis. Agric. 2024, 25, 2589–2613. [Google Scholar] [CrossRef]
Mienye, I.D.; Swart, T.G. A comprehensive review of deep learning: Architectures, recent advances, and applications. Information 2024, 15, 755. [Google Scholar] [CrossRef]
Bhat, S.A.; Qadri, S.A.; Dubbey, V.; Sofi, I.B.; Huang, N.-F. Impact of crop management practices on maize yield: Insights from farming in tropical regions and predictive modeling using machine learning. J. Agric. Food Res. 2024, 18, 101392. [Google Scholar] [CrossRef]
Ngugi, H.N.; Akinyelu, A.A.; Ezugwu, A.E. Machine learning and deep learning for crop disease diagnosis: Performance analysis and review. Agronomy 2024, 14, 3001. [Google Scholar] [CrossRef]
Botero-Valencia, J.; García-Pineda, V.; Valencia-Arias, A.; Valencia, J.; Reyes-Vera, E.; Mejia-Herrera, M.; Hernández-García, R. Machine learning in sustainable agriculture: Systematic Review and research perspectives. Agriculture 2025, 15, 377. [Google Scholar] [CrossRef]
Ali, T.; Rehman, S.U.; Ali, S.; Mahmood, K.; Obregon, S.A.; Iglesias, R.C.; Khurshaid, T.; Ashraf, I. Smart agriculture: Utilizing machine learning and deep learning for drought stress identification in crops. Sci. Rep. 2024, 14, 30062. [Google Scholar] [CrossRef]
Badshah, A.; Yousef Alkazemi, B.; Din, F.; Zamli, K.Z.; Haris, M. Crop classification and yield prediction using robust machine learning models for agricultural sustainability. IEEE Access 2024, 12, 162799–162813. [Google Scholar] [CrossRef]
Peng, S.; Rajjou, L. Advancing Plant Biology through deep learning-powered natural language processing. Plant Cell Rep. 2024, 43, 208. [Google Scholar] [CrossRef] [PubMed]
Zhou, X.; Zhao, C.; Sun, J.; Cao, Y.; Yao, K.; Xu, M. A deep learning method for predicting lead content in oilseed rape leaves using fluorescence hyperspectral imaging. Food Chem. 2022, 409, 135251. [Google Scholar] [CrossRef] [PubMed]
Monteiro, A.; Santos, S.; Gonçalves, P. Precision Agriculture for crop and livestock farming—Brief review. Animals 2021, 11, 2345. [Google Scholar] [CrossRef]
Han, H.; Liu, Z.; Li, J.; Zeng, Z. Challenges in remote sensing based climate and crop monitoring: Navigating the complexities using AI. J. Cloud Comput. 2024, 13, 34. [Google Scholar] [CrossRef]
Peladarinos, N.; Piromalis, D.; Cheimaras, V.; Tserepas, E.; Munteanu, R.A.; Papageorgas, P. Enhancing smart agriculture by implementing digital twins: A comprehensive review. Sensors 2023, 23, 7128. [Google Scholar] [CrossRef] [PubMed]
Aarif, K.O.M.; Alam, A.; Hotak, Y. Smart sensor technologies shaping the future of precision agriculture: Recent advances and future outlooks. J. Sens. 2025, 2025, 2460098. [Google Scholar] [CrossRef]
Luo, J.; Li, B.; Leung, C. A survey of Computer Vision Technologies in urban and controlled-environment agriculture. ACM Comput. Surv. 2023, 56, 1–39. [Google Scholar] [CrossRef]
Soussi, A.; Zero, E.; Sacile, R.; Trinchero, D.; Fossa, M. Smart sensors and smart data for Precision Agriculture: A Review. Sensors 2024, 24, 2647. [Google Scholar] [CrossRef] [PubMed]
Munaganuri, R.K.; Rao, Y.N. PAMICRM: Improving Precision Agriculture through multimodal image analysis for crop water requirement estimation using multidomain remote sensing data samples. IEEE Access 2024, 12, 52815–52836. [Google Scholar] [CrossRef]
Singh, R.K.; Berkvens, R.; Weyn, M. AgriFusion: An architecture for IOT and emerging technologies based on a Precision Agriculture Survey. IEEE Access 2021, 9, 136253–136283. [Google Scholar] [CrossRef]
Qu, H.-R.; Su, W.-H. Deep learning-based weed–crop recognition for smart agricultural equipment: A Review. Agronomy 2024, 14, 363. [Google Scholar] [CrossRef]
Abdulhussain, S.H.; Mahmmod, B.M.; Alwhelat, A.; Shehada, D.; Shihab, Z.I.; Mohammed, H.J.; Abdulameer, T.H.; Alsabah, M.; Fadel, M.H.; Ali, S.K.; et al. A comprehensive review of sensor technologies in IOT: Technical aspects, challenges, and future directions. Computers 2025, 14, 342. [Google Scholar] [CrossRef]
Alshuwaier, F.A.; Alsulaiman, F.A. Fake news detection using machine learning and Deep Learning Algorithms: A comprehensive review and future perspectives. Computers 2025, 14, 394. [Google Scholar] [CrossRef]
Khan, W.A.; Chung, S.H.; Awan, M.U.; Wen, X. Machine Learning facilitated business intelligence (part I). Ind. Manag. Data Syst. 2019, 120, 164–195. [Google Scholar] [CrossRef]
Wang, Y.; Chung, S.-H.; Khan, W.A.; Wang, T.; Xu, D.J. Alada: A Lite Automatic Data Augmentation Framework for industrial defect detection. Adv. Eng. Inform. 2023, 58, 102205. [Google Scholar] [CrossRef]
Miller, T.; Mikiciuk, G.; Durlik, I.; Mikiciuk, M.; Łobodzińska, A.; Śnieg, M. The IOT and AI in agriculture: The Time is now—A systematic review of Smart Sensing Technologies. Sensors 2025, 25, 3583. [Google Scholar] [CrossRef]
Akbar, J.U.; Kamarulzaman, S.F.; Muzahid, A.J.; Rahman, M.A.; Uddin, M. A comprehensive review on Deep Learning Assisted Computer Vision Techniques for smart greenhouse agriculture. IEEE Access 2024, 12, 4485–4522. [Google Scholar] [CrossRef]
Gargon, E.; Williamson, P.R.; Clarke, M. Collating the knowledge base for core outcome set development: Developing and appraising the search strategy for a systematic review. BMC Med Res. Methodol. 2015, 15, 26. [Google Scholar] [CrossRef]
MacDonald, H.; Comer, C.; Foster, M.; Labelle, P.R.; Marsalis, S.; Nyhan, K.; Premji, Z.; Rogers, M.; Splenda, R.; Stansfield, C.; et al. Searching for studies: A guide to information retrieval for Campbell Systematic Reviews. Campbell Syst. Rev. 2024, 20, e1433. [Google Scholar] [CrossRef]
Salvador-Oliván, J.A.; Marco-Cuenca, G.; Arquero-Avilés, R. Errors in search strategies used in systematic reviews and their effects on information retrieval. J. Med Libr. Assoc. 2019, 107, 210–221. [Google Scholar] [CrossRef] [PubMed]
Islam, M.H.; Anam, M.Z.; Hoque, M.R.; Nishat, M.; Bari, A.B.M.M. Agriculture 4.0 adoption challenges in the emerging economies: Implications for smart farming and Sustainability. J. Econ. Technol. 2024, 2, 278–295. [Google Scholar] [CrossRef]
Wang, S.; Yang, Y.; Yin, H.; Zhao, J.; Wang, T.; Yang, X.; Ren, J.; Yin, C. Towards digital transformation of agriculture for sustainable development in China: Experience and lessons learned. Sustainability 2025, 17, 3756. [Google Scholar] [CrossRef]
Ryan, M.; Isakhanyan, G.; Tekinerdogan, B. An interdisciplinary approach to artificial intelligence in agriculture. NJAS Impact Agric. Life Sci. 2023, 95, 2168568. [Google Scholar] [CrossRef]
Soori, M.; Arezoo, B.; Dastres, R. Artificial Intelligence, Machine Learning and deep learning in advanced robotics, a review. Cogn. Robot. 2023, 3, 54–70. [Google Scholar] [CrossRef]
Savoy, J. Bibliographic database access using free-text and controlled vocabulary: An evaluation. Inf. Process. Manag. 2005, 41, 873–890. [Google Scholar] [CrossRef]
Cravero, A.; Pardo, S.; Sepúlveda, S.; Muñoz, L. Challenges to use machine learning in Agricultural Big Data: A systematic literature review. Agronomy 2022, 12, 748. [Google Scholar] [CrossRef]
Aderele, M.O.; Srivastava, A.K.; Butterbach-Bahl, K.; Rahimi, J. Integrating machine learning with agroecosystem modelling: Current State and future challenges. Eur. J. Agron. 2025, 168, 127610. [Google Scholar] [CrossRef]
Yin, S.; Xi, Y.; Zhang, X.; Sun, C.; Mao, Q. Foundation models in agriculture: A comprehensive review. Agriculture 2025, 15, 847. [Google Scholar] [CrossRef]
El-Mesery, H.S.; Adelusi, O.A.; Ghashi, S.; Njobeh, P.B.; Hu, Z.; Kun, W. Effects of storage conditions and packaging materials on the postharvest quality of fresh Chinese tomatoes and the optimization of the tomatoes’ physiochemical properties using machine learning techniques. LWT 2024, 201, 116280. [Google Scholar] [CrossRef]
Guo, Z.; Barimah, A.O.; Shujat, A.; Zhang, Z.; Ouyang, Q.; Shi, J.; El-Seedi, H.R.; Zou, X.; Chen, Q. Simultaneous quantification of active constituents and antioxidant capability of green tea using NIR spectroscopy coupled with swarm intelligence algorithm. LWT 2020, 129, 109510. [Google Scholar] [CrossRef]
Bai, J.W.; Xiao, H.W.; Ma, H.; Zhou, C.S. Artificial Neural Network Modeling of Drying Kinetics and Color Changes of Ginkgo Biloba Seeds during Microwave Drying Process. J. Food Qual. 2018, 2018, 3278595. [Google Scholar] [CrossRef]
El-Mesery, H.S.; Qenawy, M.; Li, J.; El-Sharkawy, M.; Du, D. Predictive modeling of garlic quality in hybrid infrared-convective drying using artificial neural networks. Food Bioprod. Process. 2024, 145, 226–238. [Google Scholar] [CrossRef]
Tong, Z.; Zhang, S.; Yu, J.; Zhang, X.; Wang, B.; Zheng, W. A Hybrid Prediction Model for CatBoost Tomato Transpiration Rate Based on Feature Extraction. Agronomy 2023, 13, 2371. [Google Scholar] [CrossRef]
Zhao, S.; Jiao, T.; Adade, S.Y.S.S.; Wang, Z.; Wu, X.; Li, H.; Chen, Q. Based on vis-NIR combined with ANN for on-line detection of bacterial concentration during kombucha fermentation. Food Biosci. 2024, 60, 104346. [Google Scholar] [CrossRef]
Huang, Y.; Pan, Y.; Liu, C.; Zhou, L.; Tang, L.; Wei, H.; Fan, K.; Wang, A.; Tang, Y. Rapid and Non-Destructive Geographical Origin Identification of Chuanxiong Slices Using Near-Infrared Spectroscopy and Convolutional Neural Networks. Agriculture 2024, 14, 1281. [Google Scholar] [CrossRef]
El-Sharkawy, M.; Li, J.; Kamal, N.; Mahmoud, E.; Omara, A.E.D.; Du, D. Assessing and Predicting Soil Quality in Heavy Metal-Contaminated Soils: Statistical and ANN-Based Techniques. J. Soil Sci. Plant Nutr. 2023, 23, 6510–6526. [Google Scholar] [CrossRef]
Zhou, X.; Sun, J.; Zhang, Y.; Tian, Y.; Yao, K.; Xu, M. Visualization of heavy metal cadmium in lettuce leaves based on wavelet support vector machine regression model and visible-near infrared hyperspectral imaging. J. Food Process. Eng. 2021, 44, e13897. [Google Scholar] [CrossRef]
Xuan, L.; Lin, Z.; Liang, J.; Huang, X.; Li, Z.; Zhang, X.; Zou, X.; Shi, J. Prediction of resilience and cohesion of deep-fried tofu by ultrasonic detection and LightGBM regression. Food Control. 2023, 154, 110009. [Google Scholar] [CrossRef]
Zhao, Z.; Jin, M.; Tian, C.; Yang, S.X. Prediction of seed distribution in rectangular vibrating tray using grey model and artificial neural network. Biosyst. Eng. 2018, 175, 194–205. [Google Scholar] [CrossRef]
Ding, Y.; Yan, Y.; Li, J.; Chen, X.; Jiang, H. Classification of Tea Quality Levels Using Near-Infrared Spectroscopy Based on CLPSO-SVM. Foods 2022, 11, 1658. [Google Scholar] [CrossRef]
Zhang, Z.; Lu, Y.; Yang, M.; Wang, G.; Zhao, Y.; Hu, Y. Optimal training strategy for high-performance detection model of multi-cultivar tea shoots based on deep learning methods. Sci. Hortic. 2024, 328, 112949. [Google Scholar] [CrossRef]
Zhang, Z.; Lu, Y.; Zhao, Y.; Pan, Q.; Jin, K.; Xu, G.; Hu, Y. TS-YOLO: An All-Day and Lightweight Tea Canopy Shoots Detection Model. Agronomy 2023, 13, 1411. [Google Scholar] [CrossRef]
Li, Y.; Sun, J.; Wu, X.; Chen, Q.; Lu, B.; Dai, C. Detection of viability of soybean seed based on fluorescence hyperspectra and CARS-SVM-AdaBoost model. J. Food Process. Preserv. 2019, 43, e14238. [Google Scholar] [CrossRef]
Tian, Y.; Sun, J.; Zhou, X.; Yao, K.; Tang, N. Detection of soluble solid content in apples based on hyperspectral technology combined with deep learning algorithm. J. Food Process. Preserv. 2022, 46, e16414. [Google Scholar] [CrossRef]
Peng, Y.; Zhao, S.; Liu, J. Fused deep features-based grape varieties identification using support vector machine. Agriculture 2021, 11, 869. [Google Scholar] [CrossRef]
Li, H.; Luo, X.; Haruna, S.A.; Zareef, M.; Chen, Q.; Ding, Z.; Yan, Y. Au-Ag OHCs-based SERS sensor coupled with deep learning CNN algorithm to quantify thiram and pymetrozine in tea. Food Chem. 2023, 428, 136798. [Google Scholar] [CrossRef]
Elbeltagi, A.; Srivastava, A.; Deng, J.; Li, Z.; Raza, A.; Khadke, L.; Yu, Z.; El-Rawy, M. Forecasting vapor pressure deficit for agricultural water management using machine learning in semi-arid environments. Agric. Water Manag. 2023, 283, 108302. [Google Scholar] [CrossRef]
Wu, X.; Zhu, J.; Wu, B.; Zhao, C.; Sun, J.; Dai, C. Discrimination of Chinese liquors based on electronic nose and fuzzy discriminant principal component analysis. Foods 2019, 8, 38. [Google Scholar] [CrossRef]
Zhu, J.; Jiang, X.; Rong, Y.; Wei, W.; Wu, S.; Jiao, T.; Chen, Q. Label-free detection of trace level zearalenone in corn oil by surface-enhanced Raman spectroscopy (SERS) coupled with deep learning models. Food Chem. 2023, 414, 135705. [Google Scholar] [CrossRef]
Cheng, J.; Sun, J.; Yao, K.; Xu, M.; Wang, S.; Fu, L. Hyperspectral technique combined with stacking and blending ensemble learning method for detection of cadmium content in oilseed rape leaves. J. Sci. Food Agric. 2023, 103, 2690–2699. [Google Scholar] [CrossRef] [PubMed]
Zhang, D.; Lin, Z.; Xuan, L.; Lu, M.; Shi, B.; Shi, J.; He, F.; Battino, M.; Zhao, L.; Zou, X. Rapid determination of geographical authenticity and pungency intensity of the red Sichuan pepper (Zanthoxylum bungeanum) using differential pulse voltammetry and machine learning algorithms. Food Chem. 2023, 439, 137978. [Google Scholar] [CrossRef]
Zhao, S.; Adade, S.Y.S.S.; Wang, Z.; Jiao, T.; Ouyang, Q.; Li, H.; Chen, Q. Deep learning and feature reconstruction assisted vis-NIR calibration method for on-line monitoring of key growth indicators during kombucha production. Food Chem. 2024, 463, 141411. [Google Scholar] [CrossRef] [PubMed]
Sun, J.; Nirere, A.; Dusabe, K.D.; Yuhao, Z.; Adrien, G. Rapid and nondestructive watermelon (Citrullus lanatus) seed viability detection based on visible near-infrared hyperspectral imaging technology and machine learning algorithms. J. Food Sci. 2024, 89, 4403–4418. [Google Scholar] [CrossRef]
You, J.; Li, D.; Wang, Z.; Chen, Q.; Ouyang, Q. Prediction and visualization of moisture content in Tencha drying processes by computer vision and deep learning. J. Sci. Food Agric. 2024, 104, 5486–5494. [Google Scholar] [CrossRef]
Qiu, G.; Lu, H.; Wang, X.; Wang, C.; Xu, S.; Liang, X.; Fan, C. Nondestructive Detecting Maturity of Pineapples Based on Visible and Near-Infrared Transmittance Spectroscopy Coupled with Machine Learning Methodologies. Horticulturae 2023, 9, 889. [Google Scholar] [CrossRef]
Raza, A.; Hu, Y.; Lu, Y. Improving carbon flux estimation in tea plantation ecosystems: A machine learning ensemble approach. Eur. J. Agron. 2024, 160, 127297. [Google Scholar] [CrossRef]
Sun, J.; Zhang, L.; Zhou, X.; Yao, K.; Tian, Y.; Nirere, A. A method of information fusion for identification of rice seed varieties based on hyperspectral imaging technology. J. Food Process Eng. 2021, 44, e13797. [Google Scholar] [CrossRef]
Qiu, D.; Guo, T.; Yu, S.; Liu, W.; Li, L.; Sun, Z.; Peng, H.; Hu, D. Classification of Apple Color and Deformity Using Machine Vision Combined with CNN. Agriculture 2024, 14, 978. [Google Scholar] [CrossRef]
Tang, N.; Sun, J.; Yao, K.; Zhou, X.; Tian, Y.; Cao, Y.; Nirere, A. Identification of Lycium barbarum varieties based on hyperspectral imaging technique and competitive adaptive reweighted sampling-whale optimization algorithm-support vector machine. J. Food Process Eng. 2021, 44, e13603. [Google Scholar] [CrossRef]
Xu, J.; Liu, H.; Shen, Y.; Zeng, X.; Zheng, X. Individual nursery trees classification and segmentation using a point cloud-based neural network with dense connection pattern. Sci. Hortic. 2024, 328, 112945. [Google Scholar] [CrossRef]
Yu, Z.; Guo, Y.; Zhang, L.; Ding, Y.; Zhang, G.; Zhang, D. Improved Lightweight Zero-Reference Deep Curve Estimation Low-Light Enhancement Algorithm for Night-Time Cow Detection. Agriculture 2024, 14, 1003. [Google Scholar] [CrossRef]
Xue, Y.; Jiang, H. Monitoring of Chlorpyrifos Residues in Corn Oil Based on Raman Spectral Deep-Learning Model. Foods 2023, 12, 2402. [Google Scholar] [CrossRef]
Xu, M.; Sun, J.; Cheng, J.; Yao, K.; Wu, X.; Zhou, X. Non-destructive prediction of total soluble solids and titratable acidity in Kyoho grape using hyperspectral imaging and deep learning algorithm. Int. J. Food Sci. Technol. 2023, 58, 9–21. [Google Scholar] [CrossRef]
Chen, Y.; Lin, M.; Yu, Z.; Sun, W.; Fu, W.; He, L. Enhancing cotton irrigation with distributional actor–critic reinforcement learning. Agric. Water Manag. 2025, 307, 109194. [Google Scholar] [CrossRef]
Wang, J.; Gao, Z.; Zhang, Y.; Zhou, J.; Wu, J.; Li, P. Real-time detection and location of potted flowers based on a ZED camera and a YOLO V4-tiny deep learning algorithm. Horticulturae 2022, 8, 21. [Google Scholar] [CrossRef]
Khulal, U.; Zhao, J.; Hu, W.; Chen, Q. Nondestructive quantifying total volatile basic nitrogen (TVB-N) content in chicken using hyperspectral imaging (HSI) technique combined with different data dimension reduction algorithms. Food Chem. 2016, 197, 1191–1199. [Google Scholar] [CrossRef] [PubMed]
Chen, S.; Qi, J.; Gao, J.; Chen, W.; Fei, J.; Meng, H.; Ma, Z. Research on the Control System for the Conveying and Separation Experimental Platform of Tiger Nut Harvester Based on Sensing Technology and Control Algorithms. Agriculture 2025, 15, 115. [Google Scholar] [CrossRef]
Tian, Y.; Sun, J.; Zhou, X.; Wu, X.; Lu, B.; Dai, C. Research on apple origin classification based on variable iterative space shrinkage approach with stepwise regression–support vector machine algorithm and visible-near infrared hyperspectral imaging. J. Food Process Eng. 2020, 43, e13432. [Google Scholar] [CrossRef]
Carrington, A.M.; Manuel, D.G.; Fieguth, P.W.; Ramsay, T.; Osmani, V.; Wernly, B.; Bennett, C.; Hawken, S.; Magwood, O.; Sheikh, Y.; et al. Deep ROC analysis and AUC as balanced average accuracy, for improved classifier selection, audit and explanation. IEEE Trans. Pattern Anal. Mach. Intell. 2023, 45, 329–341. [Google Scholar] [CrossRef]
Chicco, D.; Tötsch, N.; Jurman, G. The Matthews Correlation Coefficient (MCC) is more reliable than balanced accuracy, bookmaker informedness, and markedness in two-class confusion matrix evaluation. BioData Mining 2021, 14, 13. [Google Scholar] [CrossRef]
Liang, Y.; Lin, H.; Kang, W.; Shao, X.; Cai, J.; Li, H.; Chen, Q. Application of colorimetric sensor array coupled with machine-learning approaches for the discrimination of grains based on freshness. J. Sci. Food Agric. 2023, 103, 6790–6799. [Google Scholar] [CrossRef]
Hafeez, K.; Rowlands, H.; Kanji, G.; Iqbal, S. Design optimization using ANOVA. J. Appl. Stat. 2002, 29, 895–906. [Google Scholar] [CrossRef]
Sridhar, K.; Charles, A.L. Multivariate analysis of variance: An advanced chemometric approach to differentiate dose-dependent antioxidant activities of grape (Vitis labruscana) skin extracts. J. Food Process. Preserv. 2021, 45, e15447. [Google Scholar] [CrossRef]
Chen, J.; Lian, Y.; Zou, R.; Zhang, S.; Ning, X.; Han, M. Real-time grain breakage sensing for rice combine harvesters using machine vision technology. Int. J. Agric. Biol. Eng. 2020, 13, 194–199. [Google Scholar] [CrossRef]
Amini-Valashani, M.; Mirzakuchaki, S. Performance evaluation of latest meta-heuristic algorithms in finding optimum value of mathematical functions and problems. Arab. J. Sci. Eng. 2024, 49, 3503–3529. [Google Scholar] [CrossRef]
Li, Z.; Hu, J.; Han, Y.; Li, H.; Wang, J.; Lund, P.D. Parameter identification and generality analysis of photovoltaic module dual-diode model based on artificial hummingbird algorithm. Clean Energy 2023, 7, 1219–1232. [Google Scholar] [CrossRef]
Jiang, H.; Liu, T.; He, P.; Ding, Y.; Chen, Q. Rapid measurement of fatty acid content during flour storage using a color-sensitive gas sensor array: Comparing the effects of swarm intelligence optimization algorithms on sensor features. Food Chem. 2021, 338, 127828. [Google Scholar] [CrossRef] [PubMed]
Gu, H.; Liu, K.; Huang, X.; Chen, Q.; Sun, Y.; Tan, C.P. Feasibility study for the analysis of coconut water using fluorescence spectroscopy coupled with PARAFAC and SVM methods. Br. Food J. 2020, 122, 3203–3212. [Google Scholar] [CrossRef]
Hongyang, T.; Daming, H.; Xingyi, H.; Aheto, J.H.; Yi, R.; Yu, W.; Ji, L.; Shuai, N.; Mengqi, X. Detection of browning of fresh-cut potato chips based on machine vision and electronic nose. J. Food Process Eng. 2021, 44, e13631. [Google Scholar] [CrossRef]
Jia, W.; Zheng, Y.; Zhao, D.; Yin, X.; Liu, X.; Du, R. Preprocessing method of night vision image application in apple harvesting robot. Int. J. Agric. Biol. Eng. 2018, 11, 158–163. [Google Scholar] [CrossRef]
Li, H.; Geng, W.; Hassan, M.M.; Zuo, M.; Wei, W.; Wu, X.; Ouyang, Q.; Chen, Q. Rapid detection of chloramphenicol in food using SERS flexible sensor coupled artificial intelligent tools. Food Control 2021, 128, 108186. [Google Scholar] [CrossRef]
Xu, Y.; Kutsanedzie, F.Y.H.; Sun, H.; Wang, M.; Chen, Q.; Guo, Z.; Wu, J. Rapid Pseudomonas Species Identification from Chicken by Integrating Colorimetric Sensors with Near-Infrared Spectroscopy. Food Anal. Methods 2018, 11, 1199–1208. [Google Scholar] [CrossRef]
Huang, X.; Zou, X.; Shi, J.; Li, Z.; Zhao, J. Colorimetric sensor arrays based on chemo-responsive dyes for food odor visualization. Trends Food Sci. Technol. 2018, 81, 90–107. [Google Scholar] [CrossRef]
Zeng, S.; Zhou, C.; Wang, B.; Xiao, H.; Lv, W. Microwave Infrared Cooperative Drying of Ginger: Moisture Evolution, Structure Change, Physicochemical Properties, and Prediction Model. Food Bioprocess Technol. 2024, 17, 4632–4651. [Google Scholar] [CrossRef]
Hossen, M.I.; Awrangjeb, M.; Pan, S.; Mamun, A.A. Transfer learning in agriculture: A Review. Artif. Intell. Rev. 2025, 58, 97. [Google Scholar] [CrossRef]
Waqas, M.; Naseem, A.; Humphries, U.W.; Hlaing, P.T.; Dechpichai, P.; Wangwongchai, A. Applications of machine learning and Deep Learning in agriculture: A comprehensive review. Green Technol. Sustain. 2025, 3, 100199. [Google Scholar] [CrossRef]
Benti, N.E.; Chaka, M.D.; Semie, A.G.; Warkineh, B.; Soromessa, T. Transforming agriculture with machine learning, Deep Learning, and IOT: Perspectives from Ethiopia—Challenges and opportunities. Discov. Agric. 2024, 2, 63. [Google Scholar] [CrossRef]
Dostmohammadi, M.; Pedram, M.Z.; Hoseinzadeh, S.; Garcia, D.A. A ga-stacking ensemble approach for forecasting energy consumption in a smart household: A Comparative Study of Ensemble Methods. J. Environ. Manag. 2024, 364, 121264. [Google Scholar] [CrossRef]
Sankareswari, K.; Sujatha, G. Evaluation of an ensemble technique for prediction of crop yield. In Proceedings of the 5th International Conference on Information Management & Machine Intelligence, Jaipur, India, 23–25 November 2023; pp. 1–9. [Google Scholar] [CrossRef]
Canatan, M.; Alkhulaifi, N.; Watson, N.; Boz, Z. Artificial Intelligence in food manufacturing: A review of current work and future opportunities. Food Eng. Rev. 2025, 17, 189–219. [Google Scholar] [CrossRef]
Arteaga-Cabrera, E.; Ramírez-Márquez, C.; Sánchez-Ramírez, E.; Segovia-Hernández, J.G.; Osorio-Mora, O.; Gómez-Salazar, J.A. Advancing optimization strategies in the food industry: From traditional approaches to multi-objective and technology-integrated solutions. Appl. Sci. 2025, 15, 3846. [Google Scholar] [CrossRef]
Jabed, M.A.; Azmi Murad, M.A. Crop yield prediction in agriculture: A comprehensive review of machine learning and deep learning approaches, with insights for future research and Sustainability. Heliyon 2024, 10, e40836. [Google Scholar] [CrossRef]
Tahir, H.A.; Alayed, W.; Hassan, W.U. A federated explainable AI framework for Smart Agriculture: Enhancing Transparency, efficiency, and Sustainability. IEEE Access 2025, 13, 97567–97584. [Google Scholar] [CrossRef]
Ryo, M. Explainable artificial intelligence and interpretable machine learning for Agricultural Data Analysis. Artif. Intell. Agric. 2022, 6, 257–265. [Google Scholar] [CrossRef]
Tariq, M.U.; Saqib, S.M.; Mazhar, T.; Khan, M.A.; Shahzad, T.; Hamam, H. Edge-enabled Smart Agriculture Framework: Integrating IoT, Lightweight Deep Learning, and agentic AI for context-aware farming. Results Eng. 2025, 28, 107342. [Google Scholar] [CrossRef]
Gong, R.; Zhang, H.; Li, G.; He, J. Edge computing-enabled Smart Agriculture: Technical Architectures, practical evolution, and Bottleneck Breakthroughs. Sensors 2025, 25, 5302. [Google Scholar] [CrossRef]
Zarbakhsh, S.; Fakhrzad, F.; Rajkovic, D.; Niedbała, G.; Piekutowska, M. Approaches and challenges in machine learning for monitoring agricultural products and predicting plant physiological responses to biotic and abiotic stresses. Curr. Plant Biol. 2025, 43, 100535. [Google Scholar] [CrossRef]
Mansoor, S.; Iqbal, S.; Popescu, S.M.; Kim, S.L.; Chung, Y.S.; Baek, J.-H. Integration of smart sensors and IoT in Precision Agriculture: Trends, challenges and future prospectives. Front. Plant Sci. 2025, 16, 1587869. [Google Scholar] [CrossRef] [PubMed]
Hackfort, S. Patterns of inequalities in digital agriculture: A systematic literature review. Sustainability 2021, 13, 12345. [Google Scholar] [CrossRef]
Zhu, H.; Liang, S.; Lin, C.; He, Y.; Xu, J.-L. Using multi-sensor data fusion techniques and machine learning algorithms for improving UAV-based yield prediction of oilseed rape. Drones 2024, 8, 642. [Google Scholar] [CrossRef]
Romera, A.J.; Sharifi, M.; Charters, S. Digitalization in agriculture. Towards an integrative approach. Comput. Electron. Agric. 2024, 219, 108817. [Google Scholar] [CrossRef]
Li, C.; Luo, C.; Jia, H.; Yue, M.; Wu, L. Advancing Agricultural Economic Growth Through Technology Innovation and Structural Transformation: A multilevel analysis. J. Innov. Knowl. 2025, 10, 100743. [Google Scholar] [CrossRef]
Kulkov, I.; Kulkova, J.; Rohrbeck, R.; Menvielle, L.; Kaartemo, V.; Makkonen, H. Artificial Intelligence—Driven Sustainable Development: Examining Organizational, technical, and processing approaches to achieving Global Goals. Sustain. Dev. 2023, 32, 2253–2267. [Google Scholar] [CrossRef]
Mungai, L.M.; Messina, J.P.; Zulu, L.C.; Chikowo, R.; Snapp, S.S. The role of agricultural extension services in promoting agricultural sustainability: A Central Malawi case study. Cogent Food Agric. 2024, 10, 2423249. [Google Scholar] [CrossRef]
van Hilten, M.; Ryan, M.; Blok, V.; de Roo, N. Ethical, legal and social aspects (ELSA) for AI: An assessment tool for agri-food. Smart Agric. Technol. 2025, 10, 100710. [Google Scholar] [CrossRef]

Figure 1. Distribution of dataset types across ML/DL applications in agricultural engineering. Spectral data dominate at 42.2%, followed by image data (26.2%), reflecting the field’s preference for non-destructive sensing modalities. Environmental datasets account for 16.8%, and physicochemical/reference datasets for 11.0%, while olfactory sensor and human sensory datasets contribute 1.9% each. This indicates opportunities for more comprehensive multi-modal approaches in future research.

Figure 2. Comparative accuracy performance of machine learning and deep learning algorithms in agricultural engineering applications. Deep learning models demonstrate consistently higher accuracies (93–99%) compared to traditional ML approaches (83–96%), with CNN variants and transfer learning showing the most significant improvements.

Figure 3. Comparative R² (coefficient of determination) values for ML and DL algorithms across agricultural regression tasks. Deep learning approaches achieve higher explained variance (R² = 0.87–0.98) compared to traditional ML methods (R² = 0.70–0.86), indicating superior capability in modeling complex non-linear agricultural relationships.

Figure 4. Root Mean Square Error (RMSE) comparison across ML and DL algorithms in agricultural prediction tasks. CNN-LSTM hybrid architectures and transfer learning approaches demonstrate superior predictive accuracy, with RMSE values of 3.2–7.8, compared to traditional ML methods (5.2–16.8). Lower RMSE values indicate better predictive performance, with deep learning showing 25–40% improvement in prediction reliability.

Figure 5. Taxonomy of ML and DL application domains in agricultural engineering and food science. Four broad categories are shown: food quality and safety assessment (60% of applications), agricultural management (25.7%), environmental monitoring (8.6%), and methodological/industrial advancements (5.7%). The taxonomy also indicates cross-domain knowledge transfer and hybrid applications.

Figure 6. Decision framework for selecting appropriate ML/DL methods in agricultural engineering applications. This systematic approach considers dataset size, data type characteristics, computational constraints, and interpretability requirements to guide method selection.

Table 1. Dominant traditional ML models and their primary agricultural engineering applications.

References	Model Category	Specific Models	Primary Agricultural Applications
[10,55,56,57,58,60,61,63]	ANNs	BP-ANN, ELM-ANN, RBF-ANN	Drying kinetics, physicochemical modeling, transpiration/ET prediction, online detection, classification
[53,64,65,66,67,68,69]	SVM/SVR	SVM, SVR, GWO-SVR	Quality assessment, object classification/counting, seed viability, geographical authentication, spectral analysis, deep feature classification, pesticide detection
[71,72,73]	Random Forest	RF	Environmental forecasting (VPD), regression (toxins), heavy metal detection (ensembles), classification
[17,70,76,77,78]	PLS/ML	PLS, PLS-DA	Quantitative spectral analysis, identification/discrimination, regression, baseline for DL
[74,80]	Ensemble Methods	Stacking, Blending, Weighted Avg.	Performance boosting for NEE, evapotranspiration, heavy metal detection, dam/gw prediction
[49,54]	Gradient Boosting	XGBoost, LightGBM, CatBoost	Complex prediction (origin), classification, time-series forecasting (hybrids)
[54]	Dimensionality Reduction	PCA, LDA	Feature extraction/reduction for spectral/imaging data, classification

Table 2. Dominant deep learning models and their primary agricultural engineering applications.

References	Model Category	Specific Models/Architectures	Primary Agricultural Applications
[69,81,82]	CNNs (General Classification)	AlexNet, VGG16, GoogLeNet, ResNet	Crop/seed variety ID (grapes, rice, maize), livestock individual recognition, apple quality grading
[65,66,89]	CNNs (Real-Time Detection)	YOLOv3-v8, Faster R-CNN, Mask R-CNN	Tea shoot detection and picking point localization, object detection in complex fields
[73,78,83]	CNNs (Spectral)	1D-CNN, 2D-CNN	Toxin detection (ZEN), food composition quantification, variety/origin classification, adulteration detection
[84,85]	CNNs (3D/Specialized)	VoxNet, Zero-DCE, EnlightenGAN	3D object recognition (LiDAR), low-light image enhancement for monitoring
[86]	CNNs (Hybrid)	CNN-LSTM, Transformer-embedded CNN	Time-series + image analysis (pesticides), crop disease identification
[26,68,74,87]	Autoencoders	SAE, SDAE, SWAE	Feature extraction for heavy metal detection, fruit quality prediction, hyperspectral data processing
[53,88]	RNNs/LSTMs	RNN, LSTM, CNN-LSTM	Time-series modeling (soil moisture, properties), yield prediction, irrigation optimization
[85]	GANs	WGAN, EnlightenGAN	Livestock image enhancement, low-light image correction
[11]	Transfer Learning	Fine-tuned Pre-trained Models	Ubiquitous strategy to leverage existing models for crop/livestock tasks with limited data

Table 3. Data splitting strategies across applications.

Application Domain	Dataset Size	Training/Calibration (%)	Validation (%)	Testing/Prediction (%)	Cross-Validation Method	Sample Reference
Green Tea Constituents	90 samples	66.7% (60)	-	33.3% (30)	-	[54]
Heavy Metal Detection	1400 samples	75% (1050)	-	25% (350)	-	[26]
Orange Freshness	Not specified	70%	-	30%	-	[79]
Cotton Irrigation	44 years of data	Historical calibration	2023–2024 validation	-	Time-series split	[88]
Maize Seeds	Not specified	-	-	-	10-fold CV	[17]
Tea Shoot Detection	Not specified	50%	-	50%	5x2CV paired t-test	[65]
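
The 5x2CV paired t-test listed for tea shoot detection can be illustrated with a minimal sketch. It assumes the standard formulation of the test (five replications of 2-fold cross-validation, with the statistic compared against a t distribution with 5 degrees of freedom); the reviewed study may have implemented it differently. The function name and the error-rate differences below are hypothetical and serve only as an illustration.

```python
import math

# Two-sided critical value of Student's t for 5 degrees of freedom, alpha = 0.05.
T_CRIT_5DOF_95 = 2.571


def five_by_two_cv_t(diffs):
    """Return the 5x2cv paired t statistic.

    diffs: five (p1, p2) pairs, one per replication, where p1 and p2 are the
    differences in error rate (model A minus model B) on the two folds.
    """
    if len(diffs) != 5:
        raise ValueError("5x2cv needs exactly five replications")
    variances = []
    for p1, p2 in diffs:
        mean = (p1 + p2) / 2.0
        # Per-replication variance estimate from the two fold differences.
        variances.append((p1 - mean) ** 2 + (p2 - mean) ** 2)
    pooled = sum(variances) / 5.0
    if pooled == 0.0:
        raise ValueError("all fold differences are identical; statistic undefined")
    # Numerator: the difference observed on the first fold of the first replication.
    return diffs[0][0] / math.sqrt(pooled)


# Hypothetical error-rate differences, for illustration only.
example_diffs = [(0.02, 0.04), (0.03, 0.01), (0.05, 0.02), (0.01, 0.03), (0.04, 0.02)]
t_stat = five_by_two_cv_t(example_diffs)
significant = abs(t_stat) > T_CRIT_5DOF_95
print(f"t = {t_stat:.3f}, significant at the 5% level: {significant}")
```

Because each replication splits the data 50/50, this design matches the 50%/50% training/testing proportions reported in the table row.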

Table 4. Performance metrics classification and application context.

Metric Category	Metric Name	Formula/Description	Optimal Range	Best For	Limitations	Applications in Review
Regression	R² (Coefficient of Determination)	1 − (SS_res/SS_tot)	0.7–1.0 (good–excellent)	Linear relationships	Non-linear patterns	Green tea, Lead detection
Regression	RMSE	√(Σ(y_pred − y_actual)²/n)	Lower is better	General accuracy	Scale-dependent	Multiple applications
Regression	MAE	Σ|y_pred − y_actual|/n	Lower is better	Robust to outliers	Less sensitive to large errors	Texture prediction
Regression	RPD	SD/RMSE	>2.0 (good), >2.5 (excellent)	Spectroscopic models	Dataset dependent	Hyperspectral studies
Classification	Accuracy	(TP + TN)/(TP + TN + FP + FN)	>90% (excellent)	Balanced datasets	Misleading for imbalanced data	Object detection
Classification	F1-Score	2 × (Precision × Recall)/(Precision + Recall)	>0.8 (good)	Imbalanced datasets	Harmonic mean limitation	Classification tasks
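
The formulas in Table 4 can be written out directly. The following is a minimal sketch in plain Python that mirrors the table's definitions; the function names are illustrative, and the use of the sample standard deviation (n − 1 denominator) in RPD is an assumption, since the table only states SD/RMSE.

```python
import math


def r2_score(y_actual, y_pred):
    """R² = 1 − SS_res / SS_tot."""
    mean_y = sum(y_actual) / len(y_actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(y_actual, y_pred))
    ss_tot = sum((a - mean_y) ** 2 for a in y_actual)
    return 1.0 - ss_res / ss_tot


def rmse(y_actual, y_pred):
    """Root mean square error."""
    n = len(y_actual)
    return math.sqrt(sum((p - a) ** 2 for a, p in zip(y_actual, y_pred)) / n)


def mae(y_actual, y_pred):
    """Mean absolute error."""
    n = len(y_actual)
    return sum(abs(p - a) for a, p in zip(y_actual, y_pred)) / n


def rpd(y_actual, y_pred):
    """Ratio of performance to deviation: SD of reference values / RMSE.

    Assumption: SD is the sample standard deviation (n − 1 denominator).
    """
    n = len(y_actual)
    mean_y = sum(y_actual) / n
    sd = math.sqrt(sum((a - mean_y) ** 2 for a in y_actual) / (n - 1))
    return sd / rmse(y_actual, y_pred)


def accuracy(tp, tn, fp, fn):
    """(TP + TN) / (TP + TN + FP + FN)."""
    return (tp + tn) / (tp + tn + fp + fn)


def f1_score(tp, fp, fn):
    """Harmonic mean of precision and recall."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Toy values for illustration only; they do not come from any reviewed study.
y_true = [1.0, 2.0, 3.0, 4.0]
y_hat = [1.0, 2.0, 3.0, 5.0]
print(r2_score(y_true, y_hat), rmse(y_true, y_hat), mae(y_true, y_hat))
print(rpd(y_true, y_hat), accuracy(40, 50, 5, 5), f1_score(40, 5, 5))
```

Note that RMSE and MAE are reported in the units of the target variable, which is why the table flags RMSE as scale-dependent and why RPD, a unitless ratio, is often preferred for comparing spectroscopic models.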

Table 5. Statistical analysis and optimization techniques matrix.

Technique Category	Method	Primary Purpose	Data Requirements	Computational Complexity	Agricultural Applications	Output Interpretation
Dimensionality Reduction	PCA	Feature extraction, visualization	Continuous variables	Low-Medium	Spectral data analysis	Variance explained
Dimensionality Reduction	LDA	Classification, group separation	Labeled data	Low-Medium	Quality classification	Discriminant functions
Variable Selection	CARS	Wavelength selection	Spectral data	Medium	Hyperspectral imaging	Selected variables
Variable Selection	IRIV	Iterative variable selection	High-dimensional data	Medium-High	Spectroscopy	Variable importance
Optimization	PSO	Parameter optimization	Continuous parameters	Medium	Storage optimization	Optimal parameters
Optimization	CSA	Neural network optimization	ANN parameters	High	Model tuning	Convergence curves
Statistical Testing	ANOVA	Factor significance testing	Experimental design	Low	Quality assessment	p-values, F-statistics

Table 6. Hyperspectral data processing pipeline comparison.

Processing Stage	Technique Options	Purpose	Advantages	Disadvantages	Recommended For
Preprocessing	Savitzky–Golay (SG)	Noise reduction	Preserves peak shape	May over-smooth	All spectral data
Preprocessing	SNV	Scatter correction	Simple, effective	Assumes normal distribution	Reflectance spectra
Preprocessing	MSC	Multiplicative scatter correction	Physical meaning	Requires reference spectrum	Diffuse reflectance
Feature Selection	CARS	Competitive selection	Automatic selection	Computationally intensive	Large spectral datasets
Feature Selection	SPA	Successive projections	Minimizes collinearity	May miss important variables	Regression models
Feature Selection	MC-UVE	Uninformative variable elimination	Monte Carlo stability	Parameter sensitive	Variable screening
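
The two scatter-correction steps in Table 6 can be sketched in a few lines. The example below is a minimal, dependency-free illustration of SNV and MSC under common textbook definitions (SNV standardizes each spectrum by its own mean and standard deviation; MSC regresses each spectrum on a reference and removes the fitted offset and slope). The function names are illustrative, and using the mean spectrum as the default MSC reference is an assumption.

```python
import statistics


def snv(spectrum):
    """Standard normal variate: centre and scale one spectrum by its own statistics."""
    mean = statistics.fmean(spectrum)
    sd = statistics.stdev(spectrum)
    if sd == 0.0:
        raise ValueError("flat spectrum: standard deviation is zero")
    return [(x - mean) / sd for x in spectrum]


def msc(spectra, reference=None):
    """Multiplicative scatter correction against a reference spectrum.

    Assumption: when no reference is given, the mean spectrum of the batch is used.
    """
    if reference is None:
        reference = [statistics.fmean(col) for col in zip(*spectra)]
    ref_mean = statistics.fmean(reference)
    sxx = sum((r - ref_mean) ** 2 for r in reference)
    corrected = []
    for s in spectra:
        s_mean = statistics.fmean(s)
        # Least-squares fit of s ≈ offset + slope * reference.
        sxy = sum((r - ref_mean) * (x - s_mean) for r, x in zip(reference, s))
        slope = sxy / sxx
        offset = s_mean - slope * ref_mean
        corrected.append([(x - offset) / slope for x in s])
    return corrected


# Synthetic spectra for illustration: the second is a scaled and shifted copy of the first.
reference_spectrum = [1.0, 2.0, 4.0, 3.0]
distorted = [2.0 * r + 1.0 for r in reference_spectrum]
print(snv([1.0, 2.0, 3.0]))
print(msc([distorted], reference=reference_spectrum)[0])
```

Savitzky–Golay smoothing is omitted here because it needs a polynomial fit per window; SciPy provides it as `scipy.signal.savgol_filter`.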

Table 7. Cross-validation strategy effectiveness analysis.

CV Method	Dataset Size Requirements	Computational Cost	Bias Level	Variance Level	Best Applications	Limitations
k-fold (k = 5)	Medium-Large (>100)	Medium	Medium	Medium	General purpose	Data dependency
k-fold (k = 10)	Medium-Large (>200)	High	Low	High	Model comparison	Computational cost
Leave-One-Out	Small-Medium (<100)	Very High	Very Low	Very High	Small datasets	Variance issues
Monte Carlo	Any size	Variable	Low	Medium	Robust estimation	Parameter tuning
Time Series Split	Temporal data	Low	Medium	Low	Time-dependent data	Not for i.i.d. data
5x2CV	Medium (>50)	Medium	Low	Medium	Model comparison	Specific to comparisons
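
To make the difference between the strategies in Table 7 concrete, the sketch below generates index splits for shuffled k-fold cross-validation and for an expanding-window time-series split. It is a simplified illustration with hypothetical function names, not the procedure of any particular reviewed study; leave-one-out corresponds to setting k equal to the number of samples.

```python
import random


def k_fold_indices(n_samples, k, seed=0):
    """Yield (train, test) index lists for shuffled k-fold cross-validation."""
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and n_samples")
    idx = list(range(n_samples))
    random.Random(seed).shuffle(idx)
    # Deal shuffled indices into k folds of near-equal size.
    folds = [idx[i::k] for i in range(k)]
    for i in range(k):
        test = sorted(folds[i])
        train = sorted(j for f_i, f in enumerate(folds) if f_i != i for j in f)
        yield train, test


def time_series_splits(n_samples, n_splits):
    """Yield expanding-window splits where training data always precede test data."""
    fold = n_samples // (n_splits + 1)
    if fold == 0:
        raise ValueError("too many splits for the number of samples")
    for i in range(1, n_splits + 1):
        train_end = i * fold
        # The last split absorbs any remainder so that no sample is dropped.
        test_end = train_end + fold if i < n_splits else n_samples
        yield list(range(train_end)), list(range(train_end, test_end))


for train, test in k_fold_indices(10, 5):
    print("k-fold      train:", train, "test:", test)
for train, test in time_series_splits(10, 4):
    print("time series train:", train, "test:", test)
```

The time-series variant never shuffles, so it avoids leaking future observations into training, which is the reason the table recommends it for time-dependent data and not for i.i.d. data.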

Table 8. Model performance benchmark matrix across agricultural applications.

Application	Best Performing Model	R² Range	RMSE Range	Accuracy Range	Key Success Factors	Common Challenges
Green Tea Quality	PLS/ANN ensemble	0.85–0.95	5–15% RMSEP	-	Spectral preprocessing	Matrix effects
Freshness Detection	CNN/SVM hybrid	0.80–0.92	-	85–95%	Image quality	Lighting variation
Heavy Metal Detection	RF/PLS combination	0.75–0.90	10–25%	-	Feature selection	Soil heterogeneity
Yield Prediction	LSTM/RF ensemble	0.70–0.85	15–30%	-	Multi-source data	Weather variability

Table 9. Experimental design quality assessment framework.

Quality Criterion	Excellent (3)	Good (2)	Fair (1)	Poor (0)	Weight	Scoring Guidelines
Sample Size	>1000 samples	500–1000	100–500	<100	15%	Based on statistical power
Data Splitting	Proper CV + holdout	CV or holdout	Simple split	No validation	20%	Generalization assessment
Preprocessing	Multi-step validated	Standard methods	Basic preprocessing	Minimal/none	15%	Data quality improvement
Metric Selection	Multiple appropriate	2–3 relevant	1–2 basic	Inappropriate	20%	Task-metric alignment
Statistical Testing	Comprehensive analysis	Basic testing	Limited testing	No testing	15%	Statistical rigor
Reproducibility	Code+data available	Detailed methods	Partial details	Minimal details	15%	Replication potential
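
The weights in Table 9 sum to 100%, so the six criterion levels can be combined into a single score. The sketch below shows one possible aggregation; the table does not state how the criteria are combined, so the weighted sum and its normalization to a 0–100 scale are assumptions, and the dictionary keys and example levels are hypothetical.

```python
# Criterion weights as listed in Table 9 (they sum to 1.0).
WEIGHTS = {
    "sample_size": 0.15,
    "data_splitting": 0.20,
    "preprocessing": 0.15,
    "metric_selection": 0.20,
    "statistical_testing": 0.15,
    "reproducibility": 0.15,
}
MAX_LEVEL = 3  # Excellent = 3, Good = 2, Fair = 1, Poor = 0


def quality_score(levels):
    """Return a weighted quality score on a 0-100 scale.

    Assumption: criteria are combined as a weighted sum of levels,
    normalized by the maximum level.
    """
    missing = set(WEIGHTS) - set(levels)
    if missing:
        raise ValueError(f"missing criteria: {sorted(missing)}")
    for name in WEIGHTS:
        if levels[name] not in (0, 1, 2, 3):
            raise ValueError(f"{name} must be an integer level from 0 to 3")
    weighted = sum(WEIGHTS[name] * levels[name] for name in WEIGHTS)
    return 100.0 * weighted / MAX_LEVEL


# Hypothetical study, for illustration only.
example_study = {
    "sample_size": 1,          # e.g. 100-500 samples
    "data_splitting": 2,       # CV or holdout
    "preprocessing": 2,        # standard methods
    "metric_selection": 3,     # multiple appropriate metrics
    "statistical_testing": 1,  # limited testing
    "reproducibility": 0,      # minimal details
}
print(f"quality score: {quality_score(example_study):.1f} / 100")
```

Normalizing by the maximum level keeps the score comparable if the scale is later extended, and the explicit check for missing criteria prevents a partially assessed study from receiving a deceptively low or high score.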

Table 10. Performance metrics of traditional ML algorithms in agricultural applications.

Model Category	Specific Models	Common Applications	Accuracy (%)	R²	RMSE	Other Metrics
ANNs	BP-ANN, ELM-ANN, RBF-ANN	Drying kinetics, ET prediction	85–92%	0.75–0.92	6.8–15.4	MAE: 4.2–12.1
SVM/SVR	SVM, SVR, GWO-SVR	Quality assessment, classification	83–94%	0.78–0.90	8.2–13.7	F1: 0.81–0.93
Random Forest	RF	Environmental forecasting, toxin regression	86–93%	0.82–0.93	7.4–11.2	Precision: 0.85–0.94
PLS/PLS-DA	PLS, PLS-DA	Quantitative spectral analysis	80–88%	0.72–0.88	9.5–16.8	Recall: 0.78–0.90
Ensemble Methods	Stacking, Blending, Weighted Avg.	Performance boosting for NEE	88–95%	0.84–0.94	5.8–10.3	RPD: 2.5–3.8
Gradient Boosting	XGBoost, LightGBM, CatBoost	Complex prediction, classification	89–96%	0.85–0.95	5.2–9.6	MAE: 3.8–8.5
Dimensionality Reduction	PCA, LDA	Feature extraction for spectral data	82–90%	0.70–0.86	8.8–17.5	MAPE: 8.2–15.7%

Table 11. Performance metrics of DL algorithms in agricultural applications.

Model Category	Specific Models/Architectures	Common Applications	Accuracy (%)	R²	RMSE	Other Metrics
CNNs (General Classification)	AlexNet, VGG16, ResNet	Crop/seed variety ID	91–98%	0.88–0.96	4.2–9.8	F1: 0.90–0.97
CNNs (Real-Time Detection)	YOLOv3-v8, Faster R-CNN	Tea shoot detection	92–99%	0.89–0.97	3.8–8.5	IoU: 0.75–0.92
CNNs (Spectral)	1D-CNN, 2D-CNN	Toxin detection, food composition	89–96%	0.86–0.95	5.1–10.2	Precision: 0.88–0.96
CNNs (3D/Specialized)	VoxNet, Zero-DCE	3D object recognition	90–97%	0.87–0.96	4.5–9.3	Recall: 0.89–0.98
CNNs (Hybrid)	CNN-LSTM, Transformer-CNN	Time-series + image analysis	93–98%	0.90–0.97	3.5–7.8	MAE: 2.6–6.3
Autoencoders	SAE, SDAE, SWAE	Feature extraction, quality prediction	87–94%	0.84–0.93	6.2–11.5	RPD: 3.0–4.5
RNNs/LSTMs	RNN, LSTM, CNN-LSTM	Time-series modeling	90–96%	0.87–0.95	4.8–9.5	MAPE: 5.3–11.2%
GANs	WGAN, EnlightenGAN	Image enhancement	89–95%	0.85–0.94	5.5–10.8	PSNR: 28–36 dB
Transfer Learning	Fine-tuned Pre-trained Models	Crop/livestock tasks with limited data	93–99%	0.91–0.98	3.2–7.5	F1: 0.92–0.99

© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
