Machine Learning Applications in Agriculture: Current Trends, Challenges, and Future Perspectives

Progress in agricultural productivity and sustainability hinges on strategic investments in technological research. Evolving technologies such as the Internet of Things, sensors, robotics, Artificial Intelligence, Machine Learning, Big Data, and Cloud Computing are propelling the agricultural sector towards the transformative Agriculture 4.0 paradigm. The present systematic literature review employs the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) methodology to explore the usage of Machine Learning in agriculture. The study investigates the foremost applications of Machine Learning, including crop, water, soil, and animal management, revealing its important role in revolutionising traditional agricultural practices. Furthermore, it assesses the substantial impacts and outcomes of Machine Learning adoption and highlights some challenges associated with its integration in agricultural systems. This review not only provides valuable insights into the current landscape of Machine Learning applications in agriculture, but it also outlines promising directions for future research and innovation in this rapidly evolving field.


Introduction
Agriculture 4.0 [1][2][3][4][5], also known as the "Digital Agricultural Revolution" [6], represents a paradigm shift in agriculture, leveraging cutting-edge technologies to optimise various aspects of farming operations. These technologies encompass the Internet of Things (IoT), Artificial Intelligence (AI), Big Data, cloud computing, Decision Support Systems (DSS), advanced sensing technology, and autonomous robots [1,6,7]. Sensors and robotics play a crucial role in collecting essential field data, which is then transmitted to a local or cloud server via IoT technology for storage, processing, and analysis. Big data and AI-based techniques can be used to convert these data into valuable insights. To facilitate user interaction and informed decision making, a DSS equips users with the necessary tools to optimise the agricultural system and undertake appropriate actions.
Machine Learning (ML), a subset of AI, has shown great potential in enhancing various aspects of Agriculture 4.0. It can be defined as a computer program or system that can learn specific tasks without being explicitly programmed to do so [8][9][10]. It is a process that involves the use of a computer to make decisions based on multiple data inputs [8]. In this case, data mean a set of examples. Labeled data is often used for supervised learning tasks (where the model learns from labeled examples), and unlabeled data might be used for unsupervised learning tasks (where the model finds patterns and structures in the data) [9].
ML models benefit from large amounts of data to achieve meaningful accuracy in their tasks. In the context of agriculture, obtaining vast and diverse data can sometimes be challenging, yet it is pivotal for the success of ML models. IoT sensors are instrumental in collecting a diverse range of agricultural data as they can be strategically deployed across fields to capture relevant information regarding, for instance, soil conditions, climate variables, crop health, and livestock metrics [1]. The widespread adoption of IoT technology facilitates continuous and real-time data acquisition, enabling the generation of extensive datasets over time. However, it is essential to consider that the data should be collected with sufficient quality to ensure its representativeness in the specific case study at hand. For instance, in crop management, studying the different stages of the crop is important for developing models that are accurate and applicable to real-world scenarios. Obtaining such representative datasets may take time, but it is a necessary investment for the effectiveness and reliability of ML applications in agriculture. Furthermore, collaborative initiatives and partnerships with farmers, agricultural institutions, and research organisations can contribute to the pooling of data resources.
A general flow for the creation of ML models and their deployment in agriculture is illustrated in Figure 1. The initial phase involves the retrieval of agricultural data from diverse sources, forming the foundational input for subsequent ML processes. These data are then divided into 'training' and 'testing' datasets. The training dataset is used to fit the ML model, while the testing dataset serves as an evaluation mechanism, assessing the model's performance and ensuring its accuracy and reliability. The outcome of these processes is a robust ML model capable of making classifications, predictions, or decisions tailored to specific agricultural contexts. Subsequently, the validated model is ready for deployment across various agricultural domains, including crop (i.e., optimising crop yields and health), water (i.e., ensuring efficient utilisation of irrigation resources), soil (i.e., maintaining soil health and fertility), and animal management (i.e., monitoring and improving livestock health and productivity).
Several prevalent ML algorithms have emerged within the context of Agriculture 4.0. These encompass well-known methods such as Random Forest (RF), Support Vector Machine (SVM), Artificial Neural Network (ANN), and an array of Deep Learning (DL) variations [1]. These algorithms play a key role in reshaping the agricultural domain, taking innovation and efficiency to new heights.
While there has been extensive literature discussing the potential of ML in agriculture, the existing body of work often lacks a systematic and consolidated overview of the applications, impacts, outcomes, and challenges of ML integration in this dynamic field. A review conducted by [9] concluded that 61% of the analysed articles used ML techniques for crop management (22% disease detection, 20% yield prediction, 8% weed detection, 8% crop quality, and 3% species recognition), 19% for livestock management (12% livestock production and 7% animal welfare), 10% for soil management, and 10% for water management. Inspired by this study, the search for a current and comprehensive understanding of the landscape of ML applications in agriculture motivated the undertaking of a Systematic Literature Review (SLR) in 2023.
This effort seeks to elucidate the latest advances, trends, and challenges in this dynamic field, with the aim of contributing valuable insights to the agricultural research community.
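The general flow described above (retrieve data, split it into training and testing sets, fit a model, evaluate it) can be illustrated with a minimal sketch. Everything specific in it is an assumption made for illustration: the data are synthetic, the two features (soil moisture and canopy temperature) and the labels are hypothetical, and a simple k-nearest-neighbours classifier stands in for whichever model a given study would actually use.

```python
import random

def train_test_split(samples, test_fraction=0.25, seed=0):
    """Shuffle the samples and split them into training and testing sets."""
    rng = random.Random(seed)
    shuffled = samples[:]
    rng.shuffle(shuffled)
    n_test = max(1, int(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]

def knn_predict(train, features, k=3):
    """Predict a label by majority vote among the k nearest training samples."""
    by_distance = sorted(
        train,
        key=lambda s: sum((a - b) ** 2 for a, b in zip(s[0], features)),
    )
    votes = [label for _, label in by_distance[:k]]
    return max(set(votes), key=votes.count)

def accuracy(train, test, k=3):
    """Fraction of held-out samples whose predicted label matches the true one."""
    correct = sum(knn_predict(train, x, k) == y for x, y in test)
    return correct / len(test)

def make_synthetic_data(n_per_class=40, seed=1):
    """Hypothetical data: (soil moisture, canopy temperature) -> crop status."""
    rng = random.Random(seed)
    data = []
    for _ in range(n_per_class):
        data.append(((rng.gauss(0.35, 0.03), rng.gauss(24.0, 1.0)), "healthy"))
        data.append(((rng.gauss(0.15, 0.03), rng.gauss(31.0, 1.0)), "stressed"))
    return data

data = make_synthetic_data()
train, test = train_test_split(data)
print(f"train={len(train)} test={len(test)} accuracy={accuracy(train, test):.2f}")
```

Keeping the testing set out of the fitting step is what makes the reported accuracy an estimate of performance on unseen data. A real pipeline would also rescale the features, since here the temperature values dominate the distance.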
The present SLR, conducted in accordance with the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) methodology [11,12], addresses this research gap by providing a meticulous analysis of the state-of-the-art applications of ML in agriculture. By doing so, this review not only contributes to the existing body of knowledge, but it also offers valuable insights to researchers, practitioners, and stakeholders aiming to leverage ML technologies for sustainable and efficient agricultural practices. PRISMA is internationally recognised for its systematic framework, which effectively mitigates bias and increases the reliability of systematic reviews by providing a structured protocol for the identification, selection, and synthesis of studies. Adhering to the PRISMA guidelines was fundamental to maintaining the highest standards of methodological rigor, leading to an overall improvement in the validity and reliability of the results. To help guide this review, the following research questions were formulated:
The present document is organised as follows: Section 2 (Principles and Methods) details the PRISMA methodology, including the search strategy, inclusion and exclusion criteria, data extraction process, and quality assessment of selected studies. Section 3 (Results and Discussion) discusses the results of the SLR. Section 4 (Machine Learning Trends) presents the most used ML models and algorithms of the SLR. Section 5 (Machine Learning in Agriculture) provides an overview of the ML techniques used, as well as specific applications, according to the domains outlined in Section 3. Section 6 (Challenges and Research Opportunities) discusses the challenges associated with implementing ML in agricultural systems. Finally, Section 7 (Conclusions, Limitations, and Future Work) summarises the main findings of the present SLR and how they contribute to the current understanding of ML in agriculture.

Principles and Methods
This SLR adheres to the well-established PRISMA guidelines, which describe how to collect and analyse data from available studies. The PRISMA Statement is a systematic framework encompassing 27 items in the form of a checklist, along with a four-phase flow diagram that serves as an invaluable tool for guiding researchers in the preparation of reviews and meta-analyses [11]. In the present section, each phase of the systematic review process is detailed, aligned with the four fundamental phases outlined in the PRISMA methodology. These include the identification (Section 2.1), screening (Section 2.2), eligibility (Section 2.3), and inclusion (Section 2.4) phases. Lastly, an overview of the PRISMA framework itself (Section 2.5) is provided, offering a holistic understanding of the structured approach adopted in this study.

Identification Phase
The identification phase represents the initial identification of records via various sources. To facilitate this step, a search string was formulated (Table 1). Final string: ("Agricultur*" OR "Farm*") AND ("Machine Learning") AND ("application" OR "implementation" OR "case study" OR "experimental" OR "practical").
The search encompassed records indexed in the specified repositories (Web of Science (WoS) and Scopus) up until the 1st of July, 2023, and the search string was adjusted to the syntax of each digital repository.The wildcard symbol ("*") was incorporated at the end of certain words in the search string, enabling the retrieval of all possible variants of the respective term.The boolean "AND" was employed to connect keywords originating from different groups within the search string, while "OR" was used to link keywords within the same group.
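The way the search string combines its keyword groups can be sketched in a few lines. The keyword groups are taken from the final string above; the helper names (`build_query`, `keyword_pattern`, `matches`) are hypothetical, and the regular-expression matching is only a rough approximation of how a repository such as WoS or Scopus evaluates wildcards and phrases.

```python
import re

# Keyword groups as they appear in the final search string.
GROUPS = [
    ["Agricultur*", "Farm*"],
    ["Machine Learning"],
    ["application", "implementation", "case study", "experimental", "practical"],
]

def build_query(groups):
    """Join keywords with OR inside a group and join the groups with AND."""
    parts = []
    for group in groups:
        quoted = " OR ".join(f'"{kw}"' for kw in group)
        parts.append(f"({quoted})")
    return " AND ".join(parts)

def keyword_pattern(keyword):
    """Translate a keyword into a regex; a trailing * matches any word ending."""
    if keyword.endswith("*"):
        return r"\b" + re.escape(keyword[:-1]) + r"\w*"
    return r"\b" + re.escape(keyword) + r"\b"

def matches(text, groups):
    """True if the text contains at least one keyword from every group."""
    return all(
        any(re.search(keyword_pattern(kw), text, re.IGNORECASE) for kw in group)
        for group in groups
    )

print(build_query(GROUPS))
print(matches("A case study of machine learning for farming", GROUPS))
print(matches("Machine learning for medical imaging applications", GROUPS))
```

In this simplified matcher the second title is rejected because it contains no keyword from the first group, which shows why "AND" between groups narrows the result set while "OR" within a group widens it.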

Group 1: "agricultur*", "farm*"
Group 2: "machine learning"
Group 3: "application", "implementation", "case study", "experimental", "practical"

Additionally, some inclusion criteria were considered (Table 2) to match the aims of this study. In IC1, the digital repositories WoS and Scopus were selected as they are highly valued databases, due to their scientific and technical content, but also because they are closely related to the areas of knowledge associated with the objective of this article. The chosen search period, in IC2, was determined based on the significant progress and contributions to the field of integrating ML in the agricultural sector. Regarding IC3, the restriction to examining article titles, abstracts, and keywords was made with careful deliberation. This focused approach is predicated on the premise that these sections encapsulate succinct yet crucial information regarding the content and relevance of the articles, thereby facilitating the efficient identification of the pertinent literature. In IC4, the emphasis on scientific journal articles stems from a strategic assessment of the academic landscape in the specific domain of interest, as they generally have a greater impact than other types of documents. Lastly, in IC5, the selection of English as the exclusive language criterion arises from its ubiquitous adoption as the lingua franca of global scientific discourse.

Screening Phase
In the screening phase, a systematic process is employed to evaluate the identified records against the predetermined inclusion and exclusion criteria (Table 3). The decision to focus exclusively on Q1 journal articles reflects a commitment to ensure the highest standard of research quality.

Eligibility Phase
The eligible studies are determined via a detailed assessment of the full texts of the remaining records after the initial screening. This phase involves a thorough evaluation of the studies against the pre-defined inclusion and exclusion criteria (Table 4).
Full-text: Must be available

Inclusion Phase
The inclusion phase represents the studies that meet all the inclusion criteria and are included in the SLR. This screening was performed manually by the same author.

PRISMA Overview
Following the PRISMA guidelines, an initial screening process involved searching the repositories WoS and Scopus, resulting in a total of 4580 articles matching the search string from Table 1. Through a rigorous screening and eligibility phase (Tables 3 and 4, respectively), a final selection of 272 articles was identified for in-depth analysis. Figure 2 illustrates the PRISMA flowchart reporting the different phases of the systematic review and the respective results.
Specific criteria were applied at various stages of the review process, including removal of duplicates (SC1), record screening (SC2), journal rank (SC3), document type (SC4), prioritisation of content relevant to agriculture (EC1) and Machine Learning applications (EC2), and inclusion of full-text versions (EC3).
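The staged narrowing of records described in this overview can be pictured as a chain of filters, each logging how many records survive. The sketch below is only an illustration under stated assumptions: the `Record` fields, the toy records, and the use of a DOI to detect duplicates are hypothetical, the Q1 and journal-article rules follow the criteria mentioned earlier in this section, and SC2 is left out because its concrete rule is not spelled out here.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Record:
    doi: str                   # hypothetical identifier used to spot duplicates
    quartile: str              # journal rank, e.g. "Q1"
    doc_type: str              # e.g. "article", "review", "conference"
    about_agriculture: bool
    applies_ml: bool
    full_text_available: bool

def remove_duplicates(records):
    """Keep the first occurrence of each DOI."""
    seen, unique = set(), []
    for record in records:
        if record.doi not in seen:
            seen.add(record.doi)
            unique.append(record)
    return unique

def keep_if(predicate):
    """Build a stage that keeps only the records satisfying the predicate."""
    return lambda records: [r for r in records if predicate(r)]

# SC2 (record screening) is omitted: its rule is not given in this section.
STAGES = [
    ("SC1 remove duplicates", remove_duplicates),
    ("SC3 journal rank", keep_if(lambda r: r.quartile == "Q1")),
    ("SC4 document type", keep_if(lambda r: r.doc_type == "article")),
    ("EC1 agriculture focus", keep_if(lambda r: r.about_agriculture)),
    ("EC2 ML application", keep_if(lambda r: r.applies_ml)),
    ("EC3 full text", keep_if(lambda r: r.full_text_available)),
]

def run_pipeline(records):
    """Apply every stage in order and log how many records survive each one."""
    log = [("identified", len(records))]
    for name, stage in STAGES:
        records = stage(records)
        log.append((name, len(records)))
    return records, log

# Toy records for illustration only; they are not data from the review.
records = [
    Record("10.0/a", "Q1", "article", True, True, True),
    Record("10.0/a", "Q1", "article", True, True, True),   # duplicate entry
    Record("10.0/b", "Q2", "article", True, True, True),   # fails journal rank
    Record("10.0/c", "Q1", "review", True, True, True),    # fails document type
    Record("10.0/d", "Q1", "article", True, False, True),  # fails EC2
    Record("10.0/e", "Q1", "article", True, True, True),
]
included, log = run_pipeline(records)
for name, count in log:
    print(f"{name}: {count}")
```

Logging the count after every stage mirrors what a PRISMA flow diagram reports: the number of records remaining after each criterion, not just the final total.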

Statistical Analysis
The numbers of publications by year and by country are represented in Figures 3 and 4, respectively. According to Figure 3, the number of publications in the WoS and Scopus databases has generally increased over the years, with a significant jump from 2019 to 2020 and a consistent upward trend until 2022.
In Figure 3, the pre-PRISMA analysis of WoS publications showed steady growth, from 274 in 2019 to a peak of 862 in 2022. This trend indicates a growing interest in research on ML in agriculture during this period. Scopus publications exhibited a parallel increase, starting at 408 in 2019 and culminating at 1127 in 2022. This upward trend can be attributed to the significant technological advances that have emerged within the agricultural sector with the widespread acceptance of the Agriculture 4.0 paradigm [1]. However, it is important to note that in 2023, there was a decrease in the number of publications compared to the previous year. This decrease can be attributed to the fact that the search was conducted at the beginning of July 2023, so the data do not capture the full extent of publications for the entire year.
By examining these trends, we can see how research efforts have evolved in tandem with the technological landscape. Technological advances and their adoption in the agricultural sector have played a key role in driving this increase in interest and research activity. Notable advances in IoT, sensors, and robotics have catalysed the adoption of ML in various facets of agriculture. In addition, political changes, economic fluctuations, and global events may have influenced the trajectory of research in this field. Lastly, the application of the PRISMA methodology has resulted in a noticeable reduction in the overall count of publications for each year. While this might initially appear as a loss, it reflects an intentional effort to ensure a higher and more reliable quality of studies. This approach reinforces the credibility and validity of the selected research, ultimately enhancing the robustness of the systematic review.
Furthermore, Figure 4 highlights the top seven countries that have made substantial contributions in terms of published articles. Before PRISMA, the number of publications in the WoS database varies by country, with China having the highest count of 619, followed by the United States of America (USA) with 552 and India with 305. The number of publications in the Scopus database also varies by country, with India having the highest count of 882, followed by China with 654 and the USA with 541. After PRISMA, China and the USA lead in terms of publication output, with 96 and 30 publications, respectively. On the other hand, countries such as India, Australia, the United Kingdom (UK), Germany, and Italy have relatively lower publication counts. These insights into the geographical distribution of research contributions enrich the understanding of the global landscape of ML applications in agriculture, underlining the need for international collaboration and knowledge exchange in this dynamic field.
Regarding the journals, the present SLR included 272 papers from 62 different journals. Figure 5 illustrates the top 10 journals, along with the corresponding counts of publications from each journal. From this analysis, "Remote Sensing" had the highest count with 73 publications, followed by "Computers and Electronics in Agriculture" with 46 publications and "Agricultural Water Management" with 14 publications. The mean impact factor (Clarivate [13]) is 6.22, indicating the average influence of the journals. The range of impact factors spans from a minimum of 2.4 (Journal: "Mathematics") to a maximum of 16.6 (Journal: "Nature Communications"). This range highlights the diversity of influence of the selected journals, which signifies the breadth and depth of academic engagement in this field. These insights not only enrich the understanding of the academic landscape, but also serve as a testimony to the rigor and quality of the research efforts encompassed in this analysis.
Lastly, Figure 6 shows the count of publications in various categories (according to Clarivate [13]). The category "Geosciences" has the highest count, with 82 publications. "Agriculture" is the second-highest category, with 55 publications, followed by "Agronomy" with 23 publications. This categorical distribution of publications provides valuable information on the various domains that intersect with ML applications in agriculture. It highlights the multifaceted nature of research efforts in this field and underlines the key role of interdisciplinary collaboration in the realisation of Agriculture 4.0 and the advancement of agricultural technologies.

Application Domains in Agriculture
The distribution of application domains in agriculture, based on the SLR, is represented in Figure 7. The largest portion, accounting for 74.6%, is dedicated to Crop management. The Water management domain represents 21.7%, followed by the Soil management and Animal management domains, with 16.5% and 12.5%, respectively. The total percentage of each domain represents the proportion of articles that primarily focus on that specific domain. However, certain articles can be multidisciplinary in nature, addressing more than one domain within the agricultural context. Furthermore, five subdomains were outlined, each representing a distinct area of focus within the crop domain: Crop quality (33.8%), Crop mapping/recognition (27.9%), Crop yield (20.6%), Crop disease (8.8%), and Pest/weed detection (1.8%).
The meticulous analysis of crop and animal types provides valuable insights into prevailing trends and focus areas for ML applications in agriculture. The prominence of "Plants" as the most studied crop category underlines the crucial role of ML in optimising various plant-based agricultural practices. Wheat, maize, and rice emerge as the main areas of focus, reflecting the importance of staple crops in agricultural research. In addition, the inclusion of specialty crops, such as vineyards and tea, exemplifies the diversity of agricultural contexts in which ML-based interventions are making substantial contributions. In the animal context, cows dominate the category, followed by poultry and sheep. The prevalence of interdisciplinary approaches in Agriculture 4.0 is evident in the analysis. In addition to traditional crops and livestock, the inclusion of categories such as bees and invasive insects demonstrates the innovative application of ML in various agricultural contexts, from agriculture to pest management.

Machine Learning Trends
The present section provides crucial insights towards answering RQ1 (What are the most used ML algorithms in agriculture?). Various ML algorithms can be used for statistical modeling, data analysis, classification, regression, and dimensionality reduction processes. In the context of this SLR, Figure 8 provides an overview of the ML algorithms most used in the agricultural scope.
From the analysis of Figure 8, RF [14,15] emerges as the most widely used ML algorithm, representing 19.2% of the overall distribution. Its versatility and robustness make it a favored choice for handling complex problems. SVM [16,17] ranks second with 15.9%, as it is known for its effectiveness in both classification and regression tasks. The Gradient Boosted Tree (GBT) [18,19] (8.3%) and Convolutional Neural Network (CNN) [20,21] (7.3%) also demonstrate significant usage and adoption in the agricultural sector. In descending order of frequency, the ML approach categories present in the SLR, as well as their respective algorithms, are:
1. Ensemble Learning: this category has the largest percentage in the SLR with 35.6% of the total distribution. Ensemble Learning [22][23][24] emerges as a key force for improving the performance and generalisation of ML models, making them more robust and reliable. This category includes RF (19.2%, frequency: 127), GBT (8.3%, frequency: 55), Extreme Gradient Boosting (XGBoost) (4.5%, frequency: 30), AdaBoost (0.9%, frequency: 6), Bagging (0.8%, frequency: 5), CatBoost (0.6%, frequency: 4), Stacking (0.3%, frequency: 2), and "not specified" ensemble methods (0.3%, frequency: 2). Of all the algorithms within this category, RF presents the highest frequency. While the Decision Tree (DT) [25] offers a simple and interpretable model, RF leverages the power of multiple DTs to provide robust predictions and classifications that are crucial for the optimisation of various agricultural processes. These processes can include crop yield estimation, disease detection, and land cover classification based on remote sensing data.

2. Artificial Neural Networks: the second category within the scope of this SLR constitutes 24.9% of the overall content and encompasses a range of influential algorithms that fall under the domain of ANN [26,27]. This category includes, among others, DL-not specified (0.6%, frequency: 4), Gated Recurrent Unit (GRU) (0.5%, frequency: 3), Recurrent Neural Network (RNN) (0.3%, frequency: 2), Generative Adversarial Networks (GAN) (0.2%, frequency: 1), and Encoder and Autoencoder (0.2%, frequency: 1). These algorithms are powerful tools capable of learning complex patterns, thereby facilitating accurate predictions and advancing the capabilities of numerous applications. Among these algorithms, the one that stands out the most is CNN, which is specialised for image data analysis [28] and is therefore valuable for tasks like crop disease identification, plant species recognition, and weed detection [1].

3. Support Vector Machine: the third most prominent category, accounting for 15.9% (frequency: 105), is the SVM. This algorithm holds significant popularity and widespread application for tasks encompassing both classification and regression procedures [29]. SVM underscores its significance in guiding informed decisions for bolstering agricultural productivity in the era of Agriculture 4.0 [1]. Through its adeptness in crop mapping, yield estimation, and disease detection, SVM contributes to the ongoing transformation of agriculture into a more precise, efficient, and resilient practice, aligning seamlessly with the evolving demands of a dynamic global food landscape.

4. Dimensionality Reduction: this category represents 6.2% of the total distribution and includes three different algorithms that can aid in dimensionality reduction [30] and feature engineering [31].
Nearest Neighbour: this category exclusively employs the k-Nearest Neighbors (KNN) algorithm (4.5%, frequency: 30). Among the various algorithms in the field of ML, the KNN algorithm stands out as one of the simplest yet extensively employed methods for classification purposes [32,33]. Its adaptive and comprehensible design contributes to its popularity in various classification tasks.
Multi-task Learning: represents 0.3% (frequency: 2) of the total distribution and its objective is to enhance the outcomes of several interconnected learning tasks by utilising valuable insights shared among them [36,37].
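The core idea behind the ensemble category, combining many simple learners by majority vote, can be shown with a small sketch. It implements plain bagging over one-split decision "stumps", which is a simplification and not Random Forest itself (RF additionally grows deeper trees and randomises the features considered at each split). The toy samples, the feature names (a vegetation index and a canopy temperature), and the labels are all hypothetical.

```python
import random
from collections import Counter

def fit_stump(samples):
    """Find the single-feature threshold split with the fewest training errors."""
    best = None
    n_features = len(samples[0][0])
    for f in range(n_features):
        for threshold in sorted({x[f] for x, _ in samples}):
            left = [y for x, y in samples if x[f] <= threshold]
            right = [y for x, y in samples if x[f] > threshold]
            if not left or not right:
                continue
            left_label = Counter(left).most_common(1)[0][0]
            right_label = Counter(right).most_common(1)[0][0]
            errors = (sum(y != left_label for y in left)
                      + sum(y != right_label for y in right))
            if best is None or errors < best[0]:
                best = (errors, f, threshold, left_label, right_label)
    if best is None:  # no valid split: always predict the majority label
        label = Counter(y for _, y in samples).most_common(1)[0][0]
        return (0, float("inf"), label, label)
    return best[1:]

def stump_predict(stump, x):
    """Apply one stump: compare a single feature against its threshold."""
    f, threshold, left_label, right_label = stump
    return left_label if x[f] <= threshold else right_label

def fit_bagged_stumps(samples, n_estimators=25, seed=0):
    """Train each stump on a bootstrap sample (drawn with replacement)."""
    rng = random.Random(seed)
    return [fit_stump([rng.choice(samples) for _ in samples])
            for _ in range(n_estimators)]

def ensemble_predict(stumps, x):
    """Majority vote over the predictions of all stumps."""
    votes = Counter(stump_predict(s, x) for s in stumps)
    return votes.most_common(1)[0][0]

# Hypothetical samples: (vegetation index, canopy temperature) -> plant status.
samples = [
    ((0.82, 22.0), "healthy"), ((0.78, 23.5), "healthy"),
    ((0.75, 24.0), "healthy"), ((0.80, 21.0), "healthy"),
    ((0.45, 29.0), "diseased"), ((0.50, 30.5), "diseased"),
    ((0.40, 28.0), "diseased"), ((0.55, 31.0), "diseased"),
]
forest = fit_bagged_stumps(samples)
print(ensemble_predict(forest, (0.79, 22.5)))
print(ensemble_predict(forest, (0.38, 30.0)))
```

Because every stump sees a slightly different bootstrap sample, individual thresholds vary, and the vote averages out that variation; this is the sense in which ensembles are described above as more robust than a single DT.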

Machine Learning in Agriculture
The current section is dedicated to addressing RQ2 (What are the impacts and outcomes of integrating ML in agriculture?), with a detailed analysis of the distribution of application domains in agriculture (as mentioned in Section 3.2). For each domain (and sub-domain, in the case of crop management), the authors selected five to seven articles based on their relevance, impact, and ability to provide insights that contribute to the overall understanding of the subject. This sample size is considered reasonable as it allows for the inclusion of key findings and trends within each domain without overwhelming the review with an exhaustive list of articles. Section 5.5 summarises the main findings from each domain, providing a brief overview of the impact that ML technologies have had on modernising agricultural practices.

Crop Management Domain
Crop management is associated with several agricultural practices that profoundly influence the growth and yield of cultivated crops. These practices encompass a wide range of activities, starting with the sowing process, extending to the maintenance of crops throughout their growth and development phases, and concluding with the harvest [1]. The optimisation of crop management strategies is essential to increase agricultural productivity, thereby addressing the escalating global demand for food, textile fibers, energy sources, and fundamental raw materials [38].
According to Figure 7, the crop management domain has the largest portion, accounting for 74.6% of the studies. This finding indicates that ML techniques are widely applied in crop management, offering capabilities such as crop mapping and recognition, yield prediction, optimal irrigation scheduling, pest and weed management, and disease detection [1]. Of the 272 articles included in the SLR, 203 are related to crops, of which 92 are related to crop quality, 76 to crop mapping/recognition, 56 to crop yield, 24 to crop disease, and five to pest and weed detection. We delve into each sub-domain to highlight their specific contributions to crop management and their impact on enhancing agricultural practices.
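As a quick arithmetic check, the article counts given above can be converted into shares of the 272 included articles, which reproduces the percentages reported for Figure 7. The sketch below uses only the counts stated in the text; the variable and function names are illustrative.

```python
TOTAL_ARTICLES = 272
CROP_ARTICLES = 203
SUBDOMAIN_COUNTS = {
    "Crop quality": 92,
    "Crop mapping/recognition": 76,
    "Crop yield": 56,
    "Crop disease": 24,
    "Pest/weed detection": 5,
}

def share(count, total=TOTAL_ARTICLES):
    """Percentage of all included articles, rounded to one decimal place."""
    return round(100 * count / total, 1)

print(f"Crop management: {share(CROP_ARTICLES)}%")
for name, count in SUBDOMAIN_COUNTS.items():
    print(f"{name}: {share(count)}%")

# The sub-domain counts add up to more than the number of crop articles.
overlap = sum(SUBDOMAIN_COUNTS.values()) - CROP_ARTICLES
print(f"Sub-domain assignments beyond the crop article count: {overlap}")
```

The sub-domain counts sum to 253, which is 50 more than the 203 crop-related articles. This is consistent with the earlier remark that some articles address more than one area, and it also shows that the sub-domain percentages are expressed relative to all 272 articles and not to the crop subset.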

Crop Quality
Within this study, crop quality refers to the characteristics of crops that determine their value and suitability. Improving crop quality via ML involves monitoring and managing crop growth, nutrient levels, organoleptic characteristics, and other parameters.
Table 5 shows that ML-based techniques can effectively handle complex datasets encompassing a wide range of crop attributes (such as size, appearance, and sensory characteristics). The combination of ML algorithms and real-time data, including images and meteorological information, has driven substantial advancements in the agricultural sector, allowing for more precise evaluations of crop quality based on current conditions and attributes. Furthermore, ML methods demonstrate their adaptability by excelling in the prediction and evaluation of crop quality using non-destructive approaches. This strategy obviates the need for intrusive testing while simultaneously facilitating real-time quality control throughout the supply chain. This shift enhances the efficiency of crop management and distribution, underscoring the potential of ML in optimising agricultural processes.

Ref. | Crop Type | Models Used | Summary
[39] | Lettuce | CNN, DNN | AirSurf platform developed for ultra-scale aerial phenotyping, crop counting, and crop quality assessment. AirSurf-Lettuce achieves high accuracy (>98%) in scoring and categorising iceberg lettuces and provides novel analysis functions for mapping lettuce size distribution to enhance precision agricultural practices.
[40] | Kiwifruit | ANN, SVM, Gaussian Process, Ensemble learning | Non-destructive tactile sensing approach for estimating the stiffness of kiwifruits, achieving accurate ripeness estimation with regression-based ML, showcasing potential applications in real-time quality control and sorting of fruits throughout the supply chain.
[…] This approach can significantly enhance N management strategies in rice cultivation, contributing to sustainable development and food security.
[44] | Green coffee | SVM, RF, XGBoost, CatBoost, PCA | Focuses on distinguishing between special and traditional green coffee beans using an advanced multispectral imaging technique based on reflectance and autofluorescence data and combined with ML techniques. SVM achieves the highest accuracy (0.96) for the test dataset. This approach showcases its potential as a non-destructive and real-time tool for classifying green coffee beans in the food industry.
[45] | Wheat | Multi-task learning | Multi-task learning approach using a real-world agricultural dataset, showing superior accuracy and stability in fertilisation prediction, leading to the development of a precision fertilisation system for intelligent and personalised farm management.

Crop Mapping and Recognition
Crop mapping and recognition refers to the process of identifying and mapping different crop types within agricultural fields. It involves using various data sources (such as satellite imagery, aerial and/or proximal photography, and spectroscopy) to detect and classify different crops and their spatial distribution. With ML techniques, it is possible to create accurate and detailed crop maps and identify the unique characteristics of each crop, which can be valuable for agricultural planning, resource management, and yield estimation.
Drawing insights from Table 6, it becomes evident that the application of ML-based techniques extends its computational capabilities into the domain of crop mapping and recognition, revolutionising how agricultural landscapes are understood and managed. The ability to process intricate data, coupled with real-time insights, enhances the precision and efficiency with which crop types and distributions are identified. Furthermore, by harnessing methodologies such as DL and established ML algorithms, these studies underscore the potential to effectively distinguish specific crop varieties with a commendable level of accuracy.

Ref. Crop Type Models Used Summary
[46] Apricot cultivars DT, KNN, LDA, NB, SVM, BPNN Demonstrates the feasibility of using ML to identify apricot cultivars based on their shape features, suggesting potential for non-destructive automatic identification systems. SVM achieved the best accuracy of 90.7% in the test set for classifying apricot cultivars.
[47] Grapevines SVM, CNN The study demonstrated the feasibility of using spectroscopy, Big Data, and ML to distinguish specific grapevine varieties (Touriga Nacional or Touriga Franca) from a larger group of other varieties.
[48] Various crops RF Compares various classification strategies for vegetation mapping over large-scale areas using Sentinel data within the Google Earth Engine platform and RF algorithms for classification.
[50] Pineapple ANN, SVM, RF, NB, DT, KNN Method involving UAV-captured RGB images, image processing, and ML classifiers to identify pineapple crowns, classify them as fruit or non-fruit, and count them accurately. The process involves pre-processing and segmenting high spatial-resolution aerial images, extracting features based on shape, color, and texture, and optimising classifiers' performance via feature fusion using one-way analysis of variance (ANOVA).
[51] Corn, soybean DL, CNN Innovative within-season emergence (WISE) phenology-normalised DL model for scalable within-season crop mapping using time-series remote sensing data. This approach accommodates spatiotemporal variations in crop phenological dynamics, yielding an over 90% overall accuracy for classifying corn and soybeans at the end of the season, as well as a satisfactory performance (85% overall accuracy) one to four weeks earlier than calendar-based approaches during the growing season.
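
Several of the studies in Table 6 apply classical classifiers such as KNN to hand-crafted features. The following is a minimal sketch of that idea and not a reproduction of any cited study: the two shape features (aspect ratio and roundness), the sample values, and the cultivar labels are all invented for illustration.

```python
import math
from collections import Counter


def knn_predict(train, query, k=3):
    """Classify `query` by majority vote among its k nearest training samples.

    `train` is a list of (feature_vector, label) pairs; distance is Euclidean.
    """
    nearest = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Hypothetical shape features: (aspect ratio, roundness). Illustrative values only.
training_samples = [
    ((1.05, 0.92), "cultivar_A"),
    ((1.08, 0.90), "cultivar_A"),
    ((1.02, 0.94), "cultivar_A"),
    ((1.30, 0.75), "cultivar_B"),
    ((1.28, 0.78), "cultivar_B"),
    ((1.35, 0.72), "cultivar_B"),
]

print(knn_predict(training_samples, (1.07, 0.91), k=3))
```

In practice the features would first be scaled to comparable ranges, since Euclidean distance is sensitive to the units of each feature.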

Crop Yield
Crop yield refers to the quantity of agricultural produce obtained from a specific area of land during a growing season. Ensuring high crop yields is of utmost importance for addressing global food challenges and meeting the demands of a growing population [38]. There has been a growing application of ML methods to estimate crop yield, aiming to facilitate farm planning and resource allocation (such as water, fertilisers, and pesticides), enhance storage management and marketing strategies, and tackle the pressing challenges of food security in the forthcoming years [1].
Reflecting upon the compilation detailed in Table 7, it becomes apparent that ML-based methodologies have the potential to predict crop yields with remarkable accuracy. By integrating diverse data sources like remote sensing imagery, meteorological data, and canopy geometric parameters, these models not only provide insight into crop yield but also highlight the interplay of various factors influencing the agricultural output.
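
Several of the yield studies listed in Table 7 feed vegetation indices (VIs) derived from multispectral imagery into ML models. As one widely used example, the Normalised Difference Vegetation Index is computed as (NIR - Red) / (NIR + Red). The sketch below assumes per-pixel reflectances on a 0 to 1 scale; the sample values are invented, and the reviewed studies may rely on other indices.

```python
def ndvi(nir, red):
    """Normalised Difference Vegetation Index for one pixel."""
    total = nir + red
    if total == 0:
        return 0.0  # convention chosen here to avoid division by zero
    return (nir - red) / total


def mean_ndvi(pixels):
    """Average NDVI over an iterable of (nir, red) reflectance pairs."""
    values = [ndvi(nir, red) for nir, red in pixels]
    return sum(values) / len(values)


# Illustrative reflectances for three pixels of one plot (not real data)
plot_pixels = [(0.50, 0.10), (0.45, 0.15), (0.60, 0.20)]
print(round(mean_ndvi(plot_pixels), 3))
```

A plot-level value such as this mean could then serve as one input feature of a yield model, alongside weather or canopy geometry variables.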

Ref. Crop Type Models Used Summary
[52] Vineyard ANN Combines remote sensing, computer vision, and ML for vineyard yield estimation. By using VIs and vegetated fraction cover obtained from UAV multispectral imagery, along with ANN techniques, the approach provides accurate yield predictions with higher accuracy than traditional methods, supporting decision making in viticulture practices and harvest planning.
[53] Rice BPNN, RNN Proposes an end-to-end model for rice yield prediction using DL fusion to learn deep spatial and temporal features from time-series meteorology and area data. The model achieves accurate predictions for both summer and winter rice yields.
[54] Strawberry RF, MLR, MARS, XGBoost, SVM, ANN The combination of canopy geometric parameters and VIs obtained from UAV imagery proved effective for estimating strawberry dry biomass using ML models. ANN showed the highest accuracy in cross-validation, and red-edge-related VIs were found to be the most influential variables.
[55] Apple tree Ensemble learning, SVM, KNN Develops an automatic processing channel to extract morphological and spectral features from UAV LiDAR and multispectral imagery data. The ensemble learning model outperforms other base learners (SVM and KNN) and provides accurate yield predictions for individual apple trees in the orchard.
RF, SVM, MLR, generalised boosting regression The research explores various VIs, Sentinel-2 bands, and the biophysical parameter LAI retrieved from radiative transfer models (RTM) as input data for the models. RF regression stands out as the most effective model.
[57] Winter wheat Linear regression, Ensemble learning, DT, SVM, Gaussian Process The study employs ML and historical data to predict winter wheat yield and dry matter, with the Gaussian process model achieving the highest accuracy (R² = 0.87 and R² = 0.86, respectively). The results offer valuable insights into site-specific crop management and could aid in formulating water and nitrogen management strategies for global food security.
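
Regression results in Table 7 are reported as R² values. For reference, the coefficient of determination can be computed as 1 - SS_res / SS_tot, where SS_res is the sum of squared residuals and SS_tot the total sum of squares around the observed mean. The yields below are invented for illustration and are not taken from any cited study.

```python
def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    if ss_tot == 0:
        raise ValueError("observed values have zero variance")
    return 1.0 - ss_res / ss_tot


# Illustrative yields in t/ha (hypothetical)
observed_yield = [4.0, 5.0, 6.0, 7.0]
predicted_yield = [4.5, 5.0, 5.5, 7.0]
print(round(r_squared(observed_yield, predicted_yield), 2))
```

An R² close to 1 means the model explains most of the variance in the observed yields; a value near 0 means it does little better than predicting the mean.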

Crop Disease
This subdomain covers the study and management of the diseases that affect agricultural crops, which lead to reduced yields and economic losses for farmers and the agricultural industry as a whole.
ML-based techniques have proven to be key strategies in crop disease management, as highlighted in Table 8. Several techniques are applied to discern disease patterns, anticipate outbreaks, and implement targeted interventions, thereby offering a promising avenue for the detection, diagnosis, and control of crop diseases [1]. Through the fusion of ML models with diverse data sources, such as IoT-generated data and satellite and UAV imagery, these studies showcase the capacity to accurately categorise and identify diseases across various crops, enabling timely and effective responses to mitigate their impact.

Ref. Crop Type Models Used Summary
[59] Tomato YOLO (v3) Employs a machine vision approach for early real-time detection of tomato diseases and pests in natural environments. The outcomes demonstrate an average recognition accuracy of 91.8%. The developed approach has been put into practice within real tomato cultivation settings, demonstrating its effectiveness in handling small objects and leaf occlusion.
[60] Not specified SVM, CNN, KNN, NB IoT-based system that uses sensors and cameras to collect data from plants, which are then analysed via ML models. The system proposes ensemble classification and pattern recognition for a crop monitoring system to identify plant diseases at an early stage.
[61] Sugarcane CNN, YOLO (v5) Detects White Leaf Disease in sugarcane crops using UAV imagery and DL models. The proposed methodology provides technical guidelines for effective crop management and disease monitoring.
[62] Watermelon MLP, DT Uses remote sensing, VIs, and ML for identifying and classifying different severity stages of Downy Mildew disease in watermelon. The highest classification accuracy was achieved via the MLP method.
[63] Rice MLP, SVM, NB, DT, KNN Weather-based rice blast disease-forecasting system that uses an ensemble feature ranking approach to enhance predictive accuracy. By evaluating fifteen weather features, the proposed method identifies the most impactful ones. Among these features, average visibility, rainfall amount, sun exposure hours, maximum wind speed, and rainy days emerge as the most influential in rice blast prediction.
[64] Potatoes ANN Innovative approach to the early detection of Verticillium wilt in potatoes using near-infrared spectroscopy and ANN models. The models accurately predict physiological responses to infection and classify infected plants within just two days after inoculation, even before visible symptoms appear.
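
The ensemble feature ranking mentioned for [63] combines the rankings produced by several methods into one. The exact aggregation scheme of that study is not described here, so the sketch below uses mean-rank (Borda-style) aggregation as an assumed stand-in. The feature names echo the table entry, but the three input rankings are invented.

```python
def aggregate_rankings(rankings):
    """Combine ranked feature lists (best first) by their mean rank position.

    Assumes every ranking contains exactly the same features.
    Ties in mean rank are broken alphabetically.
    """
    features = rankings[0]
    mean_rank = {}
    for feature in features:
        positions = [ranking.index(feature) + 1 for ranking in rankings]
        mean_rank[feature] = sum(positions) / len(positions)
    return sorted(features, key=lambda f: (mean_rank[f], f))


# Hypothetical rankings from three different feature-scoring methods
ranking_a = ["rainfall", "visibility", "wind_speed", "sun_hours"]
ranking_b = ["visibility", "rainfall", "sun_hours", "wind_speed"]
ranking_c = ["rainfall", "sun_hours", "visibility", "wind_speed"]

print(aggregate_rankings([ranking_a, ranking_b, ranking_c]))
```

Aggregating across methods reduces the dependence on any single scoring criterion, which is the usual motivation for ensemble ranking.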

Pest and Weed Detection
Crop pest infestations, caused by weeds, insects, pathogens, and rodents [65], have emerged as factors affecting global agricultural production. This subdomain focuses on the utilisation of advanced technologies, such as sensors, imaging systems, and ML algorithms, to detect and mitigate the presence of unwanted organisms that can negatively impact crop growth and yield.
Table 9 shows that ML techniques can help analyse complex data from various sources (such as satellites, UAVs, or sensors) and identify patterns and anomalies associated with pest and weed presence that may not be easily recognisable to the human eye. ML-powered systems can detect pests and weeds at their early stages, enabling swift intervention before infestations become widespread [1].

Ref. Crop Type Models Used Summary
[66] Vineyard DT with object-based image analysis Innovative approach for mapping Cynodon dactylon (bermudagrass) infestations in vineyard cover crops using an automatic DT-OBIA algorithm combined with UAV imagery. This method is crucial due to the negative impacts of bermudagrass on vineyard productivity.
[67] Wheat SVM with Radial Basis Function Assesses weed impact on wheat biomass using RGB images and proximal sensing techniques. The SVM model discriminates between crop and weeds and generates indicators like weed pressure and local wheat biomass production.
[68] Wheat DNN Detection of Italian ryegrass in wheat fields using UAV imagery (RGB) and DNN, along with an extensive feature selection method to accurately detect ryegrass in wheat and estimate its canopy coverage. Predictive models were developed to relate early-season ryegrass canopy coverage with end-of-season ryegrass biomass and seed yield, as well as wheat biomass and grain yield reduction.
[69] Wheat DL with SVM, KNN, NN Novel approach for classifying weed and wheat in drone-captured images, integrating an optimised voting classifier with NN, SVM, and KNN to classify features extracted using AlexNet via transfer learning.
[70] Corn YOLO (v7) Identifies major pests (corn borer, armyworm, and bollworm) of corn using the YOLOv7 network combined with the Adam optimiser. The approach demonstrates the feasibility of using DL and advanced optimisation techniques for effective crop pests and disease identification, contributing to agricultural modernisation.
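
The voting classifier described for [69] combines the outputs of several base models. As a hedged illustration of the general idea only, the sketch below implements plain (optionally weighted) hard voting; the optimisation of the voting scheme used in the cited study is not reproduced, and the predictions and weights are invented.

```python
from collections import Counter


def majority_vote(predictions, weights=None):
    """Return the label with the largest (optionally weighted) vote.

    `predictions` holds one label per base classifier.
    Ties are broken alphabetically so the result is deterministic.
    """
    if weights is None:
        weights = [1.0] * len(predictions)
    tally = Counter()
    for label, weight in zip(predictions, weights):
        tally[label] += weight
    return max(sorted(tally), key=lambda label: tally[label])


# Hypothetical outputs of three base classifiers (e.g. NN, SVM, KNN) for one image patch
patch_predictions = ["weed", "wheat", "weed"]
print(majority_vote(patch_predictions))
```

Weights allow a more reliable base classifier to count for more, for example `majority_vote(["weed", "wheat", "wheat"], weights=[3, 1, 1])`.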

Water Management Domain
As water resources become increasingly finite and their management more complex, the fusion of cutting-edge technology with robust data analytics holds great promise in promoting more sustainable water management practices. IoT technology, sensor and actuator networks, data analytics, and predictive models have enabled farmers to monitor water quality, soil moisture levels, weather forecasts, and Crop Evapotranspiration (ETc) rates [71].
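
For context, ETc is commonly estimated with the single crop coefficient approach, in which a reference evapotranspiration ET0 is multiplied by a crop coefficient Kc. The sketch below applies that relation and derives a simplified daily irrigation need. Subtracting effective rainfall and clamping at zero is an assumption made here for illustration, and all numbers are invented.

```python
def crop_evapotranspiration(et0_mm, kc):
    """ETc (mm/day) from reference evapotranspiration ET0 and crop coefficient Kc."""
    return kc * et0_mm


def net_irrigation_need(et0_mm, kc, effective_rain_mm):
    """Simplified daily water balance: ETc minus effective rainfall, never negative."""
    deficit = crop_evapotranspiration(et0_mm, kc) - effective_rain_mm
    return max(deficit, 0.0)


# Illustrative values: ET0 = 5 mm/day, Kc = 1.2, 2 mm of effective rainfall
print(net_irrigation_need(et0_mm=5.0, kc=1.2, effective_rain_mm=2.0))
```

ML models for ETc prediction typically aim to estimate ET0 or ETc directly from weather and sensor data when the inputs for a physical model are incomplete.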
As demonstrated in this current study, the water management domain represents 21.7% of the study (Figure 7), highlighting its significance in agricultural applications. Table 10 exemplifies the utilisation of an array of ML algorithms, coupled with remote and proximal sensing techniques as well as innovative IoT technologies, to address diverse water-related challenges encompassing irrigation management, water quality surveillance, and ETc prediction.

The hybrid model predicts real-time and time-series water needs based on various observations. The work is demonstrated using banana cultivation, achieving up to a 31.4% water optimisation for a single banana tree.
[78] Grains, vegetables, fruits, flowers RF, NN, SVM Predicts phosphorus concentrations in shallow groundwater in intensive agricultural regions. SVM achieved the highest accuracy (R² = 0.60). These findings support groundwater phosphorus monitoring, early warning, and pollution management decision making in intensive agricultural regions.

Soil Management Domain
Agricultural land is the extent of land considered suitable for agricultural production, covering both crop cultivation and livestock rearing [1]. By embracing the principles of Agriculture 4.0, the integration of IoT sensors for real-time parameter measurements, AI-driven data analysis techniques, and DSS for informed decision making equips farmers with the tools to effectively oversee their fields in a manner that is both efficient and sustainable [1,79]. ML-based techniques can process vast amounts of soil-related data (such as soil composition, texture, and moisture measurements) and generate insights into optimal irrigation schedules, nutrient management strategies, and soil health assessments.
In the present study, the soil management domain represents 16.5% of the entirety, as illustrated in Figure 7, and refers to the study and management of soil properties, composition, and conditions within agricultural systems. As is clear from Table 11, ML techniques possess the ability to predict soil properties and behaviours, empowering farmers to make well-informed choices pertaining to soil fertility, structure, moisture levels, and nutrient concentrations, all aimed at enhancing crop growth and yield. Additionally, by leveraging computer vision and remote sensing data, ML simplifies the monitoring of both crops and soil conditions. This technological synergy allows for a comprehensive assessment of crop health, growth stages, and potential stressors. Beyond remote sensing, one particularly notable application of ML involves the utilisation of cell phone images, as demonstrated in the study by [80]. This innovative approach showcases the potential of ML to develop efficient proximal soil sensors capable of swiftly and accurately predicting crucial soil properties. By harnessing readily available technology, this advancement not only exemplifies the adaptability and practicality of ML solutions in modern soil management practices, but it also underscores the transformative impact that technology-driven approaches can have on agricultural sustainability.

Ref. Crop Field Models Used Summary
[81] Various soil samples RF, SVM, Logistic Regression Predicts disease occurrence with high accuracy by analysing soil macroecological patterns of Fusarium wilt, a destructive soil-borne plant disease. The research employs an ML approach using bacterial and fungal data sets from diseased and healthy soils across various countries and plant varieties. The results reveal distinct differences in bacterial and fungal communities between healthy and diseased soils.
[82] Canola RF The research utilises an ML approach to determine key predictors of soil nitrous oxide (N2O) emissions, including soil temperature, moisture, and nitrate availability. The results highlight that N2O emissions were influenced by these factors, with emission factors being lower in high yield zones compared to low yield zones.
[80] Maize, soybean DT, RF, Cubist, Gaussian Process, SVM, ANN Estimates soil organic matter (SOM) and soil moisture content (SMC) based on 22 color and texture features extracted from cell phone images. The study demonstrates the potential of using computer vision and ML to create an efficient proximal soil sensor for quick and accurate predictions of soil properties. Gaussian Process and Cubist models performed the best for SMC prediction, while ANN and Cubist showed satisfactory accuracy for SOM prediction.
[83] Vineyard NN regression, KNN, SVM with Linear Kernel, XGBoost, Cubist Explores the potential of using soil protists as bioindicators to assess multiple stresses in agricultural soils. The findings indicate that changes in protist taxa occurrence and diversity metrics are effective predictors of key soil variables, with soil copper concentration, moisture, pH, and basal respiration being particularly well predicted.
[84] Rice CNN A CNN model is developed to predict heavy metal (Cadmium, Lead, Chromium, Arsenic, and Mercury) concentrations in soil-rice system using 17 environmental factors. The model exhibits strong predictive accuracy, especially for Cadmium and Mercury. The study emphasises the model's stability and robustness, particularly for quick predictions during emergencies.
[85] Wheat, maize, peanut RF, NN (regression, radial basis function), BPNN, ELM Introduces a method for farmland surface soil moisture retrieval using feature (extracted from Sentinel-1/2 and Radarsat-2 remote sensing data) optimisation and ML. The RF model exhibited the highest accuracy. The proposed method shows potential for accurate surface soil moisture retrieval and offers insights for future applications in other farmland surface types.
[86] Not specified ANN, KNN, SVM, RF, GBT, XGBoost, MLR, Cubist Estimates soil water, salt contents, and bulk density from time domain reflectometry measurements using various ML algorithms. The research demonstrates that soil particle-size fractions are crucial predictors for all the targeted soil properties. XGBoost is recommended for accurate soil gravimetric water content and bulk density estimation, while GBT is suggested for precise volumetric water content and soil salt content prediction.
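
The soil properties targeted in [86] are physically linked: volumetric water content equals gravimetric water content multiplied by the ratio of soil bulk density to water density. The sketch below encodes this standard relation as a consistency check that could be applied to model outputs. The function name and the sample values are illustrative assumptions, and the water density is taken as 1.0 g/cm³.

```python
WATER_DENSITY_G_CM3 = 1.0  # assumed density of water


def volumetric_water_content(gravimetric_wc, bulk_density_g_cm3):
    """Convert gravimetric water content (g water per g dry soil)
    to volumetric water content (cm3 water per cm3 soil)."""
    return gravimetric_wc * bulk_density_g_cm3 / WATER_DENSITY_G_CM3


# Illustrative sample: 0.20 g/g gravimetric content, bulk density 1.3 g/cm3
print(round(volumetric_water_content(0.20, 1.3), 2))
```

Predictions of the three quantities that strongly violate this relation would suggest that at least one of the models is unreliable for that sample.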

Animal Management Domain
Animal (livestock and aquatic) production is a crucial part of agriculture, not only because it provides food and dairy products, but also because it supplies other high-quality goods, such as wool and leather. Global demand for animal products is expected to increase further due to population growth [38], meaning that agrifood industries must optimise production practices by ensuring the welfare and safety of animals and increasing the capacity to prevent, detect, diagnose, and treat animal diseases. Considering this, there is a growing awareness that animal management can no longer be performed via traditional means and requires the adoption of new digital technologies.
The present SLR shows that the animal management domain comprises 12.5% of the research scope (Figure 7). Table 12 presents a selection of seven articles dedicated to the utilisation of ML techniques in the domain of animal management. Smart animal monitoring systems have been viewed with great interest in the academic community, agrifood industries, and markets. Sensor-based animal wearables, computer vision systems, and other detection devices can capture the status of animals and their environment in real time; these data can then be analysed with the aid of AI-based mechanisms to control and predict animals' health, welfare, production, etc. Livestock monitoring includes information related to animals' behaviour, physiology, clinical status, and performance [87], while in aquaculture, the desired information is more focused on water quality (water temperature, pH, dissolved oxygen content, ammonia, salt, etc.) [88,89].

The proposed framework uses multi-modal data, including 3-D trajectories and infrared imagery, along with a multi-evidence approach to detect invasive insects near beehives. The framework achieves a high classification accuracy of 97.1% for Vespa hornets and honeybees, showing the potential to ensure the safety and smart monitoring of beehives against invasive species.
[95] Fish CNN, YOLO (v5) The study uses a CNN for fish detection in recirculating aquaculture systems. The authors employ the one-stage YOLOv5 model and compare it with a two-stage Faster R-CNN model. The aim is to enhance fish production management via AI assistance.
[96] Broilers KNN, SVM, DT, RF, GBT Identification of aflatoxin-poisoned broilers via wearable accelerometers and ML. Poisoned broilers exhibit distinct behavioural changes, such as reduced time spent on feeding, drinking, walking, and standing, as well as increased sitting behaviour. The study successfully demonstrated that the used ML models can accurately identify poisoned broilers, particularly those with higher aflatoxin concentrations, with GBT showing the best performance.
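
Object detectors such as the YOLO and Faster R-CNN models listed in Table 12 are usually compared by how well predicted bounding boxes overlap annotated ones, measured by Intersection over Union (IoU). Whether and how the cited studies use this metric is not stated here; the sketch simply illustrates the standard computation with invented box coordinates in (x_min, y_min, x_max, y_max) form.

```python
def box_area(box):
    """Area of an axis-aligned box given as (x_min, y_min, x_max, y_max)."""
    return max(0.0, box[2] - box[0]) * max(0.0, box[3] - box[1])


def iou(box_a, box_b):
    """Intersection over Union of two axis-aligned boxes."""
    inter_w = max(0.0, min(box_a[2], box_b[2]) - max(box_a[0], box_b[0]))
    inter_h = max(0.0, min(box_a[3], box_b[3]) - max(box_a[1], box_b[1]))
    intersection = inter_w * inter_h
    union = box_area(box_a) + box_area(box_b) - intersection
    return intersection / union if union > 0 else 0.0


# Hypothetical annotated and predicted boxes for one fish, in pixels
annotated = (0, 0, 2, 2)
predicted = (1, 1, 3, 3)
print(round(iou(annotated, predicted), 3))
```

A detection is commonly counted as correct when its IoU with an annotated box exceeds a chosen threshold, with 0.5 being a frequent choice.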

Main Findings
The study, development, and deployment of technologies stemming from the Agriculture 4.0 paradigm have revealed a multitude of transformative advances in the agricultural sector. By leveraging data-driven insights and advanced computational techniques, ML-based technologies are poised to further revolutionise the agricultural sector, driving efficiency, sustainability, and productivity to new heights [1].

Crop Management
ML techniques have demonstrated remarkable proficiency in evaluating crop quality attributes, enabling precise assessments without invasive testing. They have also revolutionised crop mapping and recognition, enhancing the accuracy of identifying specific crop varieties within agricultural landscapes. Moreover, ML-driven models exhibit exceptional capabilities in predicting crop yields by integrating diverse data sources, offering valuable insights into factors influencing the agricultural output. Finally, ML-powered solutions have emerged as powerful tools for disease, pest, and weed detection. By leveraging satellite imagery and IoT-generated data, these models excel in accurately categorising and identifying diseases, pests, and weeds. This capability enables timely and effective interventions, minimising the impact of outbreaks on crop yield.

Water Management
Through the integration of advanced sensing techniques, coupled with IoT technologies, ML algorithms demonstrate exceptional proficiency in optimising water-related practices. Precision irrigation is a prominent application, where ML models suggest precise schedules based on data processed in real time. In addition, these models excel at vigilantly monitoring water quality, ensuring that crops receive water with an optimal nutrient composition. Furthermore, ML-driven predictions of crop evapotranspiration rates offer valuable information on water requirements, facilitating a more sustainable approach to irrigation practices.

Soil Management
ML techniques have proven valuable in predicting soil properties, allowing farmers, researchers, and stakeholders to make informed decisions regarding soil fertility, moisture levels, and nutrient concentrations. By assimilating data from various sources, ML models provide valuable insights into the dynamic nature of soil behaviour, allowing for proactive adjustments in farming practices to ensure optimal conditions for crop growth and yield. Additionally, via the application of computer vision and remote sensing data, ML simplifies the monitoring of both crops and soil conditions by offering timely information on crop health, growth stages, and potential stressors.

Animal Management
The integration of ML with smart animal monitoring systems represents a significant leap forward in enhancing animal welfare and productivity. This innovative approach harnesses sensor-based wearables, computer vision systems, and other detection devices to capture real-time data on animal status and environmental conditions. ML algorithms, in tandem with these advanced technologies, enable the analysis of the captured data, providing valuable insights into animal health, behaviour, and overall wellbeing. These data can be processed and interpreted to control and predict various aspects of animal management, including health, welfare, and production.

Challenges and Research Opportunities
The present section focuses on answering RQ3 (What are the challenges and future directions associated with integrating ML in agriculture and agricultural systems?). The integration of ML in agriculture, although promising, still presents some challenges. A study by [1] identified various challenges that need to be addressed to enable a successful transition towards the Agriculture 4.0 paradigm. These are stratified into five main levels, namely device, data, network, application, and system. Of these levels, the one that relates most directly to the implementation of ML in agricultural systems is the data level. Table 13 provides an overview of some identified challenges covering a wide range of critical aspects related to the integration of ML in agriculture, along with possible solutions and further research.

Data accessibility
Encompasses the efficient management of data, ensuring they are readily available for use. For example, a delay in accessing data due to storage issues could hinder the real-time capabilities of ML applications.
Optimising data management systems and storage solutions, ensuring both efficiency and security.

Data accuracy
Accurate data are critical for training ML models. Inaccurate data can lead to incorrect predictions or recommendations.
Ensuring that data are accurate, credible, and trustworthy by exploring methods for data validation and quality assurance.

Data completeness
Incomplete data may result in biased or incomplete ML models. For example, missing data points in a crop monitoring dataset may hinder the model's ability to accurately predict crop yield.
Exploring techniques for data imputation/extrapolation to address missing data in agricultural datasets. Investigating methods for optimising models' performance in the presence of incomplete information (e.g., Feature Engineering).

Data consistency
Consistent data ensure that ML models are reliable and reproducible. For example, inconsistent labeling of images in a crop disease detection dataset could lead to incorrect classification.
Exploring data validation and cleaning techniques to ensure consistency in agricultural datasets. Developing techniques that can identify and rectify inconsistencies.

Data context
ML models need to be trained on data that are relevant to the specific agricultural task at hand. For example, using weather data from a different region may not provide accurate predictions for local farming conditions.
Investigating techniques for adapting ML models based on the specific agricultural context. A possible approach could be the use of Transfer Learning as it involves leveraging pre-trained models on similar tasks or domains and fine-tuning them using local data.

Data security and privacy
Agricultural data are often sensitive information that requires compliance with data protection regulations.
Exploring mechanisms that encompass data anonymisation, access control, and compliance with evolving data protection regulations will be crucial in building a foundation of trust for ML-driven agricultural solutions.

Data timeliness
Delayed/outdated data can lead to non-optimal results, impeding the potential benefits derived from ML-driven insights. However, it should be noted that there are scenarios in which historical data can be of significant use as they can offer invaluable insight into long-term trends, cyclical patterns, and the cumulative effects of farming practices.
Exploring methods for real-time data acquisition and processing that can adapt and make decisions based on the most up-to-date data, ensuring timely responses in ML applications. However, depending on the case at hand, a hybrid approach can be used, striking a balance between integrating real-time and historical data. This involves using real-time data for immediate decision making and integrating historical data for long-term strategic planning.

Human-machine collaboration
ML-based systems should enhance, rather than replace, human expertise in agriculture. Designing systems that facilitate seamless collaboration between stakeholders is an emerging area of research.
Designing collaborative decision making frameworks that seamlessly integrate ML insights with human expertise. Developing interfaces that empower users to interact with and guide ML models in agricultural tasks.

Interpretability and explainability
ML-based systems pose a significant challenge in gaining the trust and acceptance of farmers, stakeholders, and the agricultural industry. It is important to understand how models achieve their outputs.
Ensuring that ML models are transparent and that their inner workings are accessible. This means providing information on the features, variables, and algorithms that contribute to a model's results. Techniques such as SHAP values [97] or LIME [98] can be useful to identify which features are most influential in a model's predictions.

Limited literacy
Older workers may have limited digital literacy, which could cause resistance to, or difficulties in, adopting and effectively utilising Agriculture 4.0 technologies.
Investing in training methods (e.g., workshops, courses), knowledge transfer, and skill-building in the context of ML-based technologies. Designing user-friendly interfaces tailored to older workers.

Resource constraints
ML-based systems often necessitate real-time processing and decision making. Remote regions or resource-constrained enterprises may lack the computational resources required for data processing.
Developing lightweight and efficient models that can operate effectively in low-resource scenarios. Investigating techniques for distributed and edge computing.
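
As a concrete illustration of the data imputation mentioned under data completeness in Table 13, the sketch below fills gaps in a sensor time series by linear interpolation between the nearest known readings. This is only one simple option among many, chosen here for clarity; the function name and the soil moisture readings are hypothetical.

```python
def interpolate_missing(series):
    """Fill None entries by linear interpolation between the nearest known values.

    Gaps at the start or end of the series copy the nearest known value.
    """
    known = [i for i, value in enumerate(series) if value is not None]
    if not known:
        raise ValueError("series contains no known values")
    filled = list(series)
    for i, value in enumerate(series):
        if value is not None:
            continue
        left = max((k for k in known if k < i), default=None)
        right = min((k for k in known if k > i), default=None)
        if left is None:
            filled[i] = series[right]
        elif right is None:
            filled[i] = series[left]
        else:
            span = series[right] - series[left]
            filled[i] = series[left] + span * (i - left) / (right - left)
    return filled


# Hypothetical hourly soil moisture readings (%) with dropped samples
readings = [10.0, None, None, 16.0, None]
print(interpolate_missing(readings))
```

Linear interpolation suits slowly varying signals such as soil moisture; for long gaps or abrupt events (for example, irrigation pulses) a model-based imputation would likely be more appropriate.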

Conclusions, Limitations, and Future Work

Conclusions
Our study revealed notable findings when addressing RQ1 (What are the most used ML algorithms in agriculture?). As expected, RF emerged as the most prevalent ML algorithm, constituting 19.2% of the overall distribution. This popularity can be attributed to its versatility and robustness, which render it highly adept at handling intricate agricultural challenges. Following closely, SVM held the second position at 15.9%. Renowned for their efficacy in both classification and regression tasks, SVM have garnered substantial traction within the agricultural domain. GBT and CNN exhibited noteworthy adoption rates of 8.3% and 7.3%, respectively, further highlighting their significance in the agricultural sector.
By answering RQ2 (What are the impacts and outcomes of integrating ML in agriculture?), it was possible to uncover a range of transformative effects arising from the integration of ML applications in the agricultural domain. Namely, we observed a substantial increase in the efficiency of agricultural production attributed to ML-based precision farming techniques, mainly with regard to advances in resource allocation strategies, ensuring that inputs such as water, fertilisers, and pesticides are used judiciously. Consequently, the environmental footprint of agricultural practices has been positively influenced by the incorporation of ML technologies. Through data-driven decision making, farmers can implement sustainable practices that reduce resource consumption and limit environmental impacts. For example, ML-powered precision irrigation systems can adaptively regulate water use based on real-time soil moisture data, promoting water conservation and maintaining optimal soil conditions. In addition, the integration of ML has substantially strengthened the agricultural sector's capabilities in managing diseases, pests, and weeds. ML algorithms have demonstrated remarkable accuracy in the early detection and classification of plant diseases, allowing for timely intervention and mitigation measures. This not only safeguards crop health, but it also mitigates potential yield losses. Overall, the integration of ML in agriculture represents a paradigm shift, propelling the sector towards a more efficient, sustainable, and technological future. The benefits go beyond mere productivity gains, encompassing a holistic transformation of agricultural practices to align with contemporary food safety and environmental management requirements. This bodes well for the resilience and adaptability of the agricultural sector in the face of evolving global challenges.
Given the growing potential of ML integration in agriculture, there are several open issues and promising avenues for future research. By answering RQ3 (What are the challenges and future directions associated with integrating ML into agriculture and agricultural systems?), it was possible to identify and explore some of these challenges and provide mitigation strategies. These include ensuring adaptable ML models, optimising data accessibility, and maintaining data accuracy, completeness, and consistency. Contextualising data usage, addressing security and privacy concerns, and ensuring timely data are also vital. Additionally, promoting human-machine collaboration, enhancing interpretability, and overcoming limited digital literacy among ordinary farmers are essential areas for attention. It is imperative to design ML applications with user-friendly interfaces that require minimal technical expertise. Incorporating intuitive visualisations and simple dashboards can enhance accessibility for farmers with limited literacy. Additionally, outreach programs and training initiatives tailored to the specific needs of agricultural communities can be implemented. Workshops, demonstrations, and educational campaigns can empower farmers with the knowledge and skills to effectively utilise ML-based technologies in their day-to-day practices. Moreover, in resource-constrained environments, developing lightweight models and exploring distributed computing methods are crucial steps toward a successful integration of ML in agriculture. Addressing these challenges will lead to a more effective and widespread implementation of ML technologies in the agricultural sector.
In summary, the integration of ML in agriculture produces substantial benefits, ranging from improved agricultural production and resource allocation to the better detection of diseases and pests and reduced environmental impacts. These advances pave the way for a more sustainable and adaptable agricultural sector, ready to meet the demands of the future. As data-driven approaches continue to flourish, the agricultural landscape is on the brink of a more sophisticated and dynamic future, where technology and tradition converge harmoniously for the betterment of global agriculture.

Limitations
The scope of this review was restricted to Q1 articles, which reduced the overall number of papers included in the analysis. Because of the exclusion criteria applied when retrieving the identified research from the electronic databases, some relevant publications may have been left out of the study. Hence, it is advisable that future researchers consider a broader range of literature sources (for example, the ScienceDirect repository) to make the review more inclusive.

Future Work
Future research into the integration of ML in agriculture should focus on harnessing different data sources (such as satellite/drone imagery, IoT-based sensor data, and weather station information) for a better understanding of agricultural systems. In addition, the integration of ML with robotics and automation presents an opportunity for intelligent, self-learning systems capable of performing complex tasks for farmers or agricultural industries, such as autonomous fruit-picking machines. Future efforts should also focus on creating affordable and scalable ML solutions for regions with limited resources, ensuring that the benefits of the technology reach smallholder farmers and communities in developing areas. Interdisciplinary collaboration between ML experts and professionals from specific fields, such as agronomy or chemistry, can lead to solutions tailored to agricultural challenges. Lastly, it would be interesting to carry out in-depth studies assessing the socio-economic impacts of the widespread adoption of ML in agriculture, including its effects on employment, economic viability, and equity in access to technological resources. These research directions will move the field forward, leading to more sustainable, efficient, and resilient agricultural systems.

Figure 1. General flow for the creation of Machine Learning models and their application in agriculture.

• RQ1: What are the most used ML algorithms in agriculture?
• RQ2: What are the impacts and outcomes of integrating ML in agriculture?
• RQ3: What are the challenges and future directions associated with integrating ML in agriculture and agricultural systems?

Figure 2. The flowchart illustrating the study inclusions and exclusions for the systematic literature review adheres to the PRISMA (Preferred Reporting Items for Systematic Reviews and Meta-Analyses) guidelines.Specific criteria were applied at various stages of the review process, including removal of duplicates (SC1), record screening (SC2), journal rank (SC3), document type (SC4), prioritisation of content relevant to agriculture (EC1) and Machine Learning applications (EC2), and inclusion of full-text versions (EC3).

Figure 3. Distribution of the selected publications by year (before and after PRISMA).

Figure 4. Distribution of the selected publications by country (before and after PRISMA).

Figure 6. Distribution of the top 10 research areas (after PRISMA).

Figure 7. Distribution of application domains in agriculture (after PRISMA): crop, water, soil, and animal management.

Figure 8. Distribution of the most used Machine Learning algorithms in the Systematic Literature Review.

Table 1. Search keywords used to collect records from different digital databases.

Table 2. Inclusion criteria for the Identification phase of the survey.

Table 3. Inclusion criteria for the Screening phase of the survey.

Table 4. Inclusion criteria for the Eligibility phase of the survey.

Table 5. Machine Learning applications in the crop quality sub-domain.
The study uses active canopy sensor data and combines it with environmental and agronomic variables to develop N status diagnosis and recommendation models.

Table 6. Machine Learning applications in the crop mapping and recognition sub-domain.

Table 7. Machine Learning applications in the crop yield sub-domain.

Table 8. Machine Learning applications in the crop diseases sub-domain.
Detects banana plants and their major diseases using satellite and UAV images and ML for classification. The developed model effectively categorised both healthy and diseased plants.

Table 9. Machine Learning applications in the pest and weed detection sub-domain.

Table 10. Machine Learning applications in the water management domain.

Table 11. Machine Learning applications in the soil management domain.

Table 12. Machine Learning applications in the animal management domain.
The results demonstrate that the RF model is effective in detecting the impact of extreme heat conditions on milk yield, with an average relative error of about 18% for single daily yield predictions and 2% for total milk production.
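The gap between the two error figures quoted for the RF milk-yield model can look surprising at first. The sketch below shows one plausible reading, under the assumption that the daily figure averages per-day absolute relative errors while the total figure compares summed predictions with summed observations, so over- and under-predictions partly cancel. The function names and the sample yields are hypothetical and are not taken from the reviewed study.

```python
def mean_relative_error(observed, predicted):
    """Average of the per-sample relative errors |p - o| / o (assumed definition)."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(p - o) / o for o, p in zip(observed, predicted)) / len(observed)


def total_relative_error(observed, predicted):
    """Relative error of the summed prediction against the summed observation."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    return abs(sum(predicted) - sum(observed)) / sum(observed)


# Hypothetical daily milk yields in kg/day, for illustration only.
observed = [30.0, 28.0, 25.0, 27.0]
predicted = [33.0, 25.2, 27.5, 26.5]

daily_error = mean_relative_error(observed, predicted)
total_error = total_relative_error(observed, predicted)

print(f"mean daily relative error: {daily_error:.1%}")
print(f"total production relative error: {total_error:.1%}")
```

With these illustrative numbers the daily errors alternate in sign, so the error on the total is several times smaller than the mean daily error. This is the same qualitative pattern as the reported 18% and 2%, although the study may have defined its metrics differently.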

Table 13 .
Challenges and proposed solutions for integrating Machine Learning in agriculture.