A Comprehensive Review of Big Data Intelligent Decision-Making Models for Smart Farms

Qin, Chang; Zhao, Peiqin; Qian, Ying; Yang, Guijun; Hao, Xingyao; Mei, Xin; Yang, Xiaodong; He, Jin

doi:10.3390/agronomy15122898


by Chang Qin ^1,2, Peiqin Zhao ^2, Ying Qian ^1,2, Guijun Yang ^2, Xingyao Hao ^2, Xin Mei ^1, Xiaodong Yang ^2,* and Jin He ^1,*

^1 School of Resources and Environmental Science, Hubei University, Wuhan 430062, China

^2 Key Laboratory of Quantitative Remote Sensing in Agriculture of Ministry of Agriculture and Rural Affairs, Information Technology Research Center, Beijing Academy of Agriculture and Forestry Sciences, Beijing 100097, China

^* Authors to whom correspondence should be addressed.

Agronomy 2025, 15(12), 2898; https://doi.org/10.3390/agronomy15122898

Submission received: 24 November 2025 / Revised: 13 December 2025 / Accepted: 14 December 2025 / Published: 16 December 2025

(This article belongs to the Section Precision and Digital Agriculture)


Abstract

Big data and artificial intelligence technologies are driving a paradigm shift in smart farming, yet intelligent decision-making faces critical bottlenecks. At the data level, challenges include fragmentation, high acquisition costs, and inadequate secure sharing; at the model level, issues involve regional heterogeneity, weak adaptability, and insufficient explainability. To address these, this paper systematically reviews global research to establish a theoretical framework spanning the entire production cycle. Regarding data governance, trends favor federated systems with unified metadata and layered storage, utilizing technologies like federated learning for secure lifecycle management. For decision-making, approaches are evolving from experience-based to data-driven intelligence. Pre-harvest planning now integrates mechanistic models and transfer learning for suitability and variety optimization. In-season management leverages deep reinforcement learning (DRL) and model predictive control (MPC) for precise regulation of seedlings, water, fertilizer, and pests. Post-harvest evaluation strategies utilize spatio-temporal deep learning architectures (e.g., Transformers or LSTMs) and intelligent optimization algorithms for yield prediction and machinery scheduling. Finally, a staged development pathway is proposed: prioritizing standardized data governance and foundation models in the short term; advancing federated learning and human–machine collaboration in the mid-term; and achieving real-time, ethical edge AI in the long term. This framework supports the transition toward precise, transparent, and sustainable smart agriculture.

Keywords: smart agriculture; agricultural big data; smart farms; data governance; intelligent decision-making

1. Introduction

As the global population continues to rise, the Food and Agriculture Organization of the United Nations (FAO) estimates that food production must increase by approximately 50% by 2050 to meet growing demand [1,2]. However, maintaining this growth rate presents significant challenges due to natural resource limitations, insufficient agricultural investment, and a lack of technological innovation [3]. Consequently, agricultural systems must undergo fundamental transformations to enhance productivity while ensuring the sustainability of food production [4]. To achieve this goal, smart farming systems must be upgraded toward greater intelligence and precision. This includes improving operational efficiency and resource utilization by processing, analyzing, and making decisions based on the vast amounts of data generated by modern agricultural production [5].

The theoretical foundations of agricultural informatization and big data technologies provide essential support for the development of smart farms. Agricultural big data plays a pivotal role across all stages of food production by enabling data integration, analysis, forecasting, and other advanced functions [6]. Data integration allows diverse sources of agricultural data to be consolidated and processed within a unified platform, laying a robust foundation for subsequent analysis and decision-making support [7]. Through intelligent decision-making models, smart farms can extract insights from historical data to forecast weather patterns, market demand, and crop growth trends. The resulting decisions are predictive rather than reactive, balance multiple objectives (yield, cost, and sustainability), and are continuously optimized in real time or near real time using machine learning and optimization algorithms [8]. These capabilities offer critical support for risk management and resource optimization throughout the food production process. Moreover, the advancement of smart farms not only accelerates the modernization of food production and management but also contributes to rural revitalization and the transformation of the agricultural economy [9].

Building upon these advancements, the overall operational logic of smart farms can be more clearly understood through a unified system perspective. As shown in Figure 1, smart farm architecture integrates multi-source data acquisition, data processing and governance, and model-driven decision-making into a coherent workflow that supports the full agricultural production cycle. Presenting this conceptual framework at the outset helps situate the subsequent sections within a consistent structure and clarifies how data and models jointly enable intelligent agricultural decisions. Although substantial progress has been made in sensing technologies, data platforms, and model development, related studies often address these components independently. As a result, the interaction between data governance mechanisms and decision-making models, as well as their integration across different production stages, has not been systematically summarized. These developments highlight the need to revisit existing research from an integrated smart farm perspective.

To ensure a rigorous and systematic analysis, this review employed a structured literature search across major databases including Web of Science, IEEE Xplore, and ScienceDirect, covering the period from January 2000 to January 2025, with a significant focus on post-2018 advancements. The search strategy utilized Boolean combinations of keywords connecting three core dimensions: agricultural scenarios (e.g., “smart farm”, “precision agriculture”), data infrastructure (e.g., “big data storage”, “blockchain”, “federated learning”, “data security”), and intelligent modeling techniques (e.g., “deep learning” [CNN, LSTM], “remote sensing”, “evolutionary algorithms” [ACO, PSO]). We prioritized peer-reviewed studies that explicitly address the integration of multi-source heterogeneous data (meteorological, IoT, and spectral data) into end-to-end decision-making frameworks, excluding purely hardware-centric research to focus on the algorithmic and methodological evolution of agricultural intelligence.

Focusing on the core technological frameworks and recent advancements in big data-driven decision-making models for smart agriculture, this study systematically outlines their theoretical foundations and implementation pathways, providing strategic insights for future development. The manuscript is structured as follows: (1) a definition of smart farms and an analysis of their current development trajectory, with particular emphasis on the synergistic interplay between data and models; (2) governance of agricultural big data in smart farming, including the acquisition, processing, storage, security, and sharing of multi-source datasets such as satellite imagery and unmanned aerial vehicle (UAV) observations; and (3) design of end-to-end intelligent decision-making models tailored to pre-sowing planning, in-season cultivation management, and post-harvest performance evaluation.

2. The Concept and Development Status of Smart Farms

2.1. Concept and Characteristics of Smart Farms

In recent years, smart farms—an integral component of smart agriculture—have emerged as a prominent topic in agricultural research. Before the recent formalization of “smart farms,” the concept evolved through several earlier stages including precision agriculture in the 1990s, digital agriculture in the 2000s, and smart agriculture after 2015. These stages laid the technological foundation—such as GPS-based variability management, multi-source data integration, and IoT-enabled monitoring—for the more advanced smart farm definitions summarized below. Table 1 presents various scholarly definitions of smart farms. From these definitions, several common characteristics can be identified: First, smart farms are heavily reliant on modern information technologies such as the Internet of Things (IoT), big data, and artificial intelligence (AI); second, they emphasize automation and intelligence throughout the production process, aiming to minimize human intervention; and third, they prioritize the collection, analysis, and application of agricultural production data to enable precision management. Accordingly, a smart farm is a modern agricultural system supported by core technologies such as IoT, big data, and AI. It utilizes intelligent equipment and systems to automate the entire agricultural production process, reduce the need for manual labor, and achieve precision management through the real-time collection, analysis, and application of production data.

2.2. Current Status of Smart Farm Development

The emergence of computer technology in the 1950s laid the foundation for transformative changes in agriculture, with developed countries leading efforts to integrate information technology into agricultural production. By the 1980s, the concept of precision agriculture had taken root in countries such as the United States and Japan. Utilizing technologies like Geographic Information Systems (GIS), Global Positioning Systems (GPS), and remote sensing, agricultural production began shifting toward automation and precision, while also accelerating farm scale expansion [14]. In the 21st century, the deep integration of AI and IoT technologies has given rise to a new paradigm of intelligent agriculture. Innovations such as intelligent robots and drones now enable full life-cycle data collection and autonomous operations, forming a sustainable cyclic system characterized by “data-driven decision-making—intelligent equipment implementation—eco-efficiency feedback” [15].

The development of smart farms worldwide exhibits distinct geographical characteristics. In the United States, the advancement of smart agriculture is driven by substantial big data resources and strong capabilities in algorithm development, forming an enterprise-led technology ecosystem. For example, Bayer’s Climate FieldView platform integrates data from farms across 120 countries, supporting decision-making models that cover hundreds of millions of acres of cropland [16], while John Deere’s GreenStar system employs over 300 sensors to enable planting management at the monoculture level. In practice, GreenStar collects GNSS/RTK signals, implement sensor data, and machine logs through the CAN-bus system. These data are processed by variable-rate algorithms that generate operational prescriptions such as seeding density, fertilizer rate, and implement adjustments. In Japan, smart farming initiatives address the dual challenges of fragmented arable land and labor shortages. Kubota’s KSAS system utilizes information and communication technology (ICT) to integrate farm machinery operations, soil moisture data, and weather information, thereby enabling a precision management model. This system also analyzes accumulated big data to guide the implementation of variable-rate fertilizer application and chemical spraying [17]. In contrast, China’s development of smart agriculture follows a unique “application-first, technology-later” trajectory [18]. The core bottlenecks lie in the decentralization of intelligent decision-making models and technological fragmentation. The diversity of model developers results in inconsistent standards and interfaces; algorithm accuracy is constrained by regional heterogeneity; and synergistic capabilities across different production scenarios remain limited. 
To address these issues, China is establishing a unified data governance structure and a modular decision-making model framework through the construction of a national agricultural big data platform. This initiative aims to support cross-regional customization, enable knowledge migration, and overcome systemic barriers. Ultimately, it seeks to provide both theoretical and practical support for building an autonomous and controllable agricultural intelligent decision-making system, facilitating the transformation of smart farms from an “experience-driven” to a “data- and model-driven” paradigm. Although these national pathways differ, they share several common trends: multi-source sensing, cloud–edge collaborative computation, and increasing reliance on autonomous machinery. These developments indicate that smart farms are moving from fragmented digital tools toward integrated, model-driven decision-making systems.

3. Governance of Agricultural Big Data in Smart Farms

3.1. Acquisition & Processing of Agricultural Data

Data acquisition in smart farming relies on multi-source sensing technologies, including satellite platforms, UAVs, and ground-based IoT sensors, combined with advanced data processing and fusion methods to provide high-precision support for precision agriculture. Remote sensing has become a widely adopted tool in the production monitoring of smart farms due to its broad spatial coverage and timely data updates. Passive remote sensing satellites acquire multispectral and hyperspectral imagery to support intelligent decision-making tasks such as crop growth monitoring and yield prediction. This imagery offers rich spectral and textural features but is degraded by cloud cover, precipitation, and fog, so it requires preprocessing (radiometric calibration, atmospheric correction, geometric correction, and cloud masking) to ensure data quality and usability. Active remote sensing platforms, such as Synthetic Aperture Radar (SAR) satellites, enable all-weather monitoring by detecting reflected echoes of emitted microwaves, making them suitable for assessing soil moisture and crop canopy characteristics [19]. Studies have demonstrated that combining SAR with optical imagery enhances the accuracy and granularity of crop classification and monitoring [20,21]. Before such fusion, SAR imagery is typically processed with multi-look averaging or filtering techniques to reduce the speckle noise inherent in coherent radar imaging. Additionally, very high-resolution (VHR) satellites such as SPOT5 and QuickBird are effective for fine-scale classification and in situ monitoring [22]. However, their high cost and limited commercial availability have restricted their widespread application in smart farm production monitoring. 
UAVs serve as a critical complement to satellite-based monitoring due to their high spatial and temporal resolution (sub-meter to centimeter level) and operational flexibility (on-demand flight planning with rapid revisit times) [23,24]. Maimaitijiang assessed the capability of UAV-based multimodal data fusion for soybean yield estimation [25], while Zhu used imagery from multispectral and LiDAR sensors to estimate above-ground biomass (AGB) in maize [26]. Equipped with RGB, multispectral, thermal, and LiDAR sensors, UAVs can generate high-resolution digital elevation models and support comprehensive multi-source data fusion analysis.
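
The speckle-reduction step mentioned above for SAR imagery can be illustrated with a minimal multi-look sketch. It assumes the image is a plain list of rows of intensity values and averages non-overlapping blocks; the function name `multilook`, the look counts, and the toy patch are illustrative assumptions, not taken from the cited studies.

```python
from statistics import mean


def multilook(intensity, looks_az=2, looks_rg=2):
    """Average non-overlapping looks_az x looks_rg blocks of an intensity image.

    `intensity` is a list of equal-length rows of non-negative floats.
    Trailing rows or columns that do not fill a whole block are dropped.
    Averaging trades spatial resolution for lower speckle variance.
    """
    rows = len(intensity) // looks_az
    cols = len(intensity[0]) // looks_rg
    out = []
    for r in range(rows):
        out_row = []
        for c in range(cols):
            block = [
                intensity[r * looks_az + i][c * looks_rg + j]
                for i in range(looks_az)
                for j in range(looks_rg)
            ]
            out_row.append(mean(block))
        out.append(out_row)
    return out


# Toy 4x4 intensity patch (arbitrary units, hypothetical values).
patch = [
    [1.0, 3.0, 2.0, 2.0],
    [3.0, 1.0, 2.0, 2.0],
    [0.0, 4.0, 5.0, 7.0],
    [4.0, 0.0, 7.0, 5.0],
]
smoothed = multilook(patch)  # 2x2 image of block means
```

Operational SAR processors usually apply this step in dedicated toolchains and may follow it with adaptive filters; the sketch only shows the averaging idea.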

Field sensors, including handheld spectrometers, soil moisture sensors, weather cameras, and RGB cameras, have been widely utilized with typical sampling frequencies ranging from minutes to hours, enabling near-real-time ground truth validation and multi-layer data fusion. For instance, Hasan employed a high-resolution RGB camera mounted on a ground-based imaging platform to capture seasonal imagery for estimating wheat yield [27]. Similarly, Zhang ZhenTao analyzed the yield improvement of summer maize by comparing the effects of optimal sowing periods with actual sowing dates, based on data from soil moisture sensors at agrometeorological observation stations [28].

Additionally, meteorological and reanalysis data support disaster early warning and climate adaptation decision-making by integrating atmospheric and land surface parameters. Production management data include structured records such as agricultural machinery scheduling and the use of production inputs, while survey statistics provide supplementary information on the behavior of agricultural enterprises and socialized services. These datasets encompass structured, semi-structured, and unstructured formats, characterized by high spatial and temporal resolution and multimodal integration. For example, hyperspectral remote sensing combined with ERA5 reanalysis data can dynamically refine regional evapotranspiration models [29]. The heterogeneity of large-scale, multi-source data increases the complexity of intelligent analysis, highlighting the need to shift from a “model-driven” to a “data-model synergistic-driven” paradigm to establish a comprehensive, multi-granularity data foundation for precision agriculture.

3.2. Storage & Management of Agricultural Data

Multi-source heterogeneous data in smart farms imposes dual demands on storage architecture, requiring both high-performance access and efficient management. To accommodate various data types and business scenarios, a tiered storage strategy integrated with a multimodal storage engine is essential for optimizing data access efficiency while significantly reducing storage costs, typically achieving 5–20× cost savings compared to single-tier solutions [30]. Time-series data storage, designed for high-frequency sensor data such as soil moisture and weather station readings, employs cloud-native time-series databases to enable millisecond-level write latency, real-time aggregation and analysis, and time-window slicing, with compression algorithms to minimize storage redundancy [31]. Raster data storage, targeting unstructured data such as remote sensing imagery and UAV orthophotos, leverages distributed object storage to build a high-concurrency access interface that supports parallel read/write operations and metadata-tagged retrieval, enabling rapid access to multi-scale raster datasets [32]. Relational data storage uses cloud-native relational databases to manage structured data such as agricultural operation records and production logs. Performance for complex queries is enhanced through index optimization and partitioned table techniques [33]. In-memory storage and caching solutions, such as Redis or Alluxio, accelerate real-time decision-making by caching frequently accessed data, thereby reducing I/O latency [34]. Cold data archiving preserves historical survey statistics and reanalysis datasets at low cost using offline storage combined with error-correcting codes, and supports on-demand loading to computing clusters [35]. Table 2 presents a performance comparison and application scenarios for typical storage technologies. 
By adopting a hybrid storage architecture and intelligent scheduling strategies, smart farms can balance data access efficiency with storage cost-effectiveness, ensuring low-latency, high-reliability data support for upper-level intelligent decision-making models.
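
The time-window slicing and aggregation described above for high-frequency sensor data can be sketched as a tumbling-window average. This is a minimal in-memory illustration, not the behavior of any particular database; the function name `window_mean`, the 10-minute window, and the soil-moisture readings are assumptions made for the example.

```python
from collections import defaultdict


def window_mean(readings, window_s):
    """Group (timestamp_s, value) pairs into fixed windows and average each.

    The dictionary key is the window start time in seconds.
    """
    buckets = defaultdict(list)
    for ts, value in readings:
        buckets[ts - ts % window_s].append(value)
    return {start: sum(vals) / len(vals) for start, vals in sorted(buckets.items())}


# Hypothetical soil-moisture readings: (seconds since start, percent by volume).
readings = [(0, 30.0), (60, 32.0), (120, 31.0), (600, 28.0), (660, 26.0)]
ten_min = window_mean(readings, 600)  # one averaged value per 10-minute window
```

Time-series databases typically perform this kind of downsampling on the server side, which keeps raw high-frequency data out of the decision-making layer.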

To ensure comprehensive data lifecycle management, it is essential to establish a unified metadata catalog and data lineage tracking mechanism. Metadata governance should utilize standardized labels to enable unified retrieval across different storage engines and enhance data interoperability through the association of spatial and temporal attributes. A dynamic hierarchical storage system should automatically migrate data across storage media based on access frequency. For instance, frequently accessed “hot” data used in sowing path planning models should be stored on solid-state drives (SSDs), while less frequently accessed data, such as annual yield reports, can be archived on tape storage, balancing performance and cost-efficiency. In terms of security compliance, encrypted storage and fine-grained access control should be implemented to safeguard farmers’ personal data and commercial information. Additionally, audit logs and data desensitization techniques are necessary to meet regulatory requirements and establish a foundation for the secure and credible sharing of multi-source data.
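
As a rough illustration of the access-frequency-based migration described above, the following sketch assigns a dataset to a storage tier from its recent read count. The SSD and tape tiers come from the text; the intermediate `object_store` tier, the thresholds, and the dataset names are hypothetical choices for the example.

```python
from dataclasses import dataclass


@dataclass
class Dataset:
    name: str
    reads_last_30d: int  # access count over a recent window (assumed metric)


def choose_tier(ds, hot_min=100, warm_min=5):
    """Pick a storage tier from access frequency; thresholds are illustrative."""
    if ds.reads_last_30d >= hot_min:
        return "ssd"
    if ds.reads_last_30d >= warm_min:
        return "object_store"
    return "tape_archive"


catalog = [
    Dataset("sowing_path_inputs", 450),
    Dataset("uav_orthophotos_last_season", 12),
    Dataset("annual_yield_report_2015", 0),
]
placement = {ds.name: choose_tier(ds) for ds in catalog}
```

A production scheduler would also weigh data size, retrieval latency, and migration cost, but the same rule-based core applies.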

3.3. Security & Sharing of Agricultural Data

Multi-source heterogeneous data in smart farms face comprehensive challenges related to confidentiality, integrity, and availability during transmission, storage, and computation. To address these challenges, a layered security strategy integrated with cloud-native security services should be adopted to enhance risk prevention and control efficiency. At the transmission layer, end-to-end encrypted channels are established using the TLS protocol and AES algorithm, while hardware security modules manage key storage and dynamic updates to ensure the confidentiality and real-time availability of high-frequency sensor data during millisecond-level transmission [36]. At the storage and computation layer, homomorphic encryption is applied to sensitive fields in time-series databases and distributed object storage, allowing encrypted data to support aggregation analysis and model inference without decryption. Additionally, operation logs and financial records stored in relational databases are processed within a trusted execution environment to prevent tampering or data leakage [37]. For integrity verification, a unique verification code is generated for each data record, and corresponding fingerprints are uploaded to a blockchain-based distributed ledger. This process, combined with smart contracts and data fingerprinting technologies, enables end-to-end tamper-proof traceability from data collection to storage [38].
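
The integrity-verification idea above (a per-record fingerprint anchored in a tamper-evident ledger) can be sketched with SHA-256 hashes and a simple hash chain. The chain is a deliberately simplified stand-in for a blockchain ledger, with no consensus or smart contracts; the record fields and function names are assumptions for illustration.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder hash preceding the first entry


def fingerprint(record):
    """SHA-256 over a canonical JSON encoding of the record."""
    payload = json.dumps(record, sort_keys=True, separators=(",", ":")).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


def append_entry(ledger, record):
    """Append the record's fingerprint, chained to the previous entry hash."""
    prev = ledger[-1]["entry_hash"] if ledger else GENESIS
    fp = fingerprint(record)
    entry_hash = hashlib.sha256((prev + fp).encode("ascii")).hexdigest()
    ledger.append({"fingerprint": fp, "prev": prev, "entry_hash": entry_hash})


def verify(ledger, records):
    """Recompute every fingerprint and link; False on any mismatch."""
    if len(ledger) != len(records):
        return False
    prev = GENESIS
    for entry, record in zip(ledger, records):
        fp = fingerprint(record)
        expected = hashlib.sha256((prev + fp).encode("ascii")).hexdigest()
        if (entry["fingerprint"], entry["prev"], entry["entry_hash"]) != (fp, prev, expected):
            return False
        prev = expected
    return True


# Hypothetical sensor records.
records = [
    {"sensor": "sm-01", "ts": 1, "vwc": 0.31},
    {"sensor": "sm-01", "ts": 2, "vwc": 0.30},
]
ledger = []
for rec in records:
    append_entry(ledger, rec)

intact = verify(ledger, records)
tampered = [dict(records[0], vwc=0.99)] + records[1:]
tamper_detected = not verify(ledger, tampered)
```

Only fingerprints are stored in the ledger, so the raw records can stay in their original storage engine while remaining verifiable.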

To ensure compliance and privacy, smart farms must establish a unified data sharing governance framework that supports cross-entity, multi-scenario secure collaboration and efficient data reuse. For multi-institutional collaborative training, a federated learning platform can be deployed to enable cross-domain model optimization without exposing raw data, by exchanging encrypted gradients or intermediate model parameters. For role and permission management, attribute-based encryption (ABE) is introduced. Field-level decryption keys are dynamically distributed based on user credentials, ensuring that each role only accesses authorized data. At the shared interface layer, unified data access is implemented via an API gateway. Combined with dynamic access control policies and audit log monitoring, this supports real-time, on-demand data invocation and secure distribution. For historical or infrequently accessed data, differential privacy and data desensitization algorithms are applied prior to storage. These data are then archived using a hierarchical storage strategy and managed through intelligent scheduling, enabling low-cost storage with on-demand loading. This approach balances the value of data utility with privacy protection [39,40].
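
The federated learning step above, in which institutions exchange model parameters rather than raw data, can be illustrated with sample-weighted parameter averaging in the style of FedAvg. This is a minimal sketch: it omits the encryption of updates mentioned in the text, and the client sizes and weight vectors are hypothetical.

```python
def fed_avg(client_updates):
    """Sample-weighted average of client parameter vectors.

    `client_updates` is a list of (num_samples, weights) pairs, where
    `weights` is a list of floats of identical length across clients.
    """
    total = sum(n for n, _ in client_updates)
    dim = len(client_updates[0][1])
    return [sum(n * w[i] for n, w in client_updates) / total for i in range(dim)]


# Two hypothetical farms with locally trained two-parameter models.
updates = [
    (100, [1.0, 2.0]),  # farm A: 100 local samples
    (300, [3.0, 6.0]),  # farm B: 300 local samples
]
global_weights = fed_avg(updates)  # farm B contributes three times the weight of farm A
```

In a full system this aggregation would run on a coordinating server over several rounds, with secure aggregation or encrypted gradients protecting the individual updates.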

4. Intelligent Decision-Making Models for Smart Farms

4.1. Pre-Season Cultivation Planning Decision Models

These pre-season decision models constitute the first closed-loop planning stage of the smart farm end-to-end pipeline. Suitability assessment results directly constrain planting plan optimization, which in turn provides spatial boundaries and target yields for sowing timing and variety recommendation models, ultimately generating a unified, field-specific pre-season decision scheme that is automatically pushed to downstream in-season management systems.

4.1.1. Suitability Assessment

Crop suitability analysis forms the foundation for optimizing land resource utilization in smart farms [41]. Evaluation methods are generally categorized into linear and nonlinear approaches [42]. Linear methods quantify the influence of environmental factors using weighted indicator systems, with weights derived by techniques such as the Analytic Hierarchy Process (AHP) and the Delphi method. These approaches are intuitive and easy to implement but are often affected by subjective bias and inconsistencies in data [43]. To address these limitations, improved techniques such as the Ordered Weighted Averaging (OWA) method and Logic Scoring of Preference (LSP), implemented via GIS, enable nonlinear aggregation of more than 30 indicators while preserving weight differentiation, demonstrating significantly better performance than traditional hierarchical analysis [44]. In contrast, nonlinear methods relax the assumptions of linearity. For instance, fuzzy mathematics models the continuity of indicators using membership functions and can be enhanced with genetic algorithms or the Delphi method to improve weight accuracy [45]. Increasingly, linear and nonlinear methods are being integrated, combining expert knowledge with objective data to enhance the accuracy and interpretability of suitability evaluations, thereby driving the development of more robust and data-informed decision-making frameworks in smart agriculture.
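
To make the difference between a weighted indicator system and Ordered Weighted Averaging (OWA) concrete, the sketch below aggregates three normalized indicator scores both ways. The scores and weights are hypothetical, and the OWA variant shown uses order weights only, whereas GIS implementations often combine criterion weights and order weights.

```python
def weighted_sum(scores, weights):
    """Linear aggregation: each weight is tied to a specific indicator."""
    return sum(s * w for s, w in zip(scores, weights))


def owa(scores, order_weights):
    """OWA: weights are tied to rank positions (largest score first)."""
    ranked = sorted(scores, reverse=True)
    return sum(s * w for s, w in zip(ranked, order_weights))


# Hypothetical normalized scores for soil, temperature, and water availability.
scores = [0.5, 1.0, 0.25]

linear = weighted_sum(scores, [0.5, 0.25, 0.25])  # expert-assigned criterion weights
pessimistic = owa(scores, [0.0, 0.0, 1.0])        # behaves like min: limiting factor dominates
optimistic = owa(scores, [1.0, 0.0, 0.0])         # behaves like max: best factor dominates
```

Shifting the order weights between these extremes lets an analyst tune how strongly a single poor indicator should penalize the overall suitability score.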

Advancements in computer performance and big data technologies have significantly expanded the application of machine learning in crop suitability evaluation. Artificial Neural Networks (ANNs) address complex nonlinear relationships by mimicking the structure of the human brain. Among them, multilayer feed-forward networks are typically trained using backpropagation algorithms, while Kohonen networks perform unsupervised clustering to extract latent patterns from data. Both types of networks have demonstrated strong capabilities in modeling the relationship between environmental factors and crop suitability [46]. The Maximum Entropy (MaxEnt) model, a representative of ecological niche modeling, predicts potential crop suitability zones through probabilistic distribution analysis. It is characterized by high stability and scalability, making it suitable for multi-scale spatial assessments [47]. Random Forest (RF), which does not rely on assumptions of variable normality and is highly robust against multicollinearity, exhibits excellent generalization performance, particularly under small-sample conditions. Its regression accuracy consistently surpasses that of traditional multiple linear regression and single decision tree models, establishing it as a widely adopted approach in geoscientific research [48]. In contrast to multi-indicator methods, crop growth models serve as dynamic nonlinear tools that simulate the entire growth process of crops. By accounting for physiological and environmental interactions in detail, they provide a mechanistic understanding of crop development, offering superior interpretability compared to purely statistical or data-driven methods.

4.1.2. Planting Plan

Planting planning is a critical component of pre-production decision-making in smart farms, with the primary objective of achieving optimal resource allocation and maximizing economic returns through scientific methodologies. Current research predominantly integrates mathematical modeling and intelligent algorithms. Mathematical models establish quantitative relationships among crop planting area, yield, and processing demand, providing a rigorous foundation for decision-making. For instance, Chen Feifei and Bo developed a dynamic tomato planting–processing model based on optimal control theory, effectively addressing the challenge of aligning production capacity with processing demand [49]. In parallel, intelligent algorithms, renowned for their robust global search capabilities, offer distinct advantages in solving multi-objective optimization problems. Dan and Bingbing proposed a multi-objective particle swarm biogeography-based optimization algorithm, which integrates Pareto non-dominated sorting with the particle swarm optimization mechanism [50]. This approach successfully achieved the synergistic optimization of multiple conflicting objectives: minimizing enterprise losses, maximizing farmer incomes, and reducing planting area. While mathematical models offer transparent logic and theoretical clarity, they often depend heavily on the availability of accurate parameters, limiting their flexibility in complex or uncertain environments. Conversely, intelligent algorithms can effectively handle nonlinearities and dynamic conditions, though they demand greater computational resources. Future research should emphasize the deep integration of mathematical models and intelligent algorithms, leveraging advanced technologies such as deep learning to enable dynamic forecasting and adaptive constraint optimization. 
This integration aims to enhance the flexibility and precision of planting strategies, thereby providing more robust decision-making support for smart farm management.
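
The Pareto non-dominated sorting used in multi-objective planting optimization rests on a simple dominance test, sketched below for the three objectives named above. The candidate plans and their values are hypothetical, and farmer income is negated so that all objectives are minimized; this shows only the non-dominated filter, not the particle swarm or biogeography-based search itself.

```python
def dominates(a, b):
    """True if `a` is no worse than `b` on every objective and strictly
    better on at least one (all objectives are minimized)."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def pareto_front(plans):
    """Return the names of plans that no other plan dominates."""
    return [
        name
        for name, objectives in plans.items()
        if not any(dominates(other, objectives) for other in plans.values())
    ]


# Hypothetical plans: (enterprise loss, -farmer income, planting area).
plans = {
    "A": (10, -50, 120),
    "B": (12, -55, 110),
    "C": (11, -48, 125),  # worse than A on all three objectives
    "D": (10, -50, 130),  # ties A on two objectives, worse on area
}
front = pareto_front(plans)
```

Plans A and B remain because each is better than the other on at least one objective; a decision-maker would then choose between them according to local priorities.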

4.1.3. Sowing Plan

Sowing planning aims to determine the optimal sowing time for crops through scientific methodologies, thereby avoiding extreme climatic stresses and enhancing resource utilization efficiency. Recent research has progressively shifted from traditional experience-based approaches to data- and model-driven precision decision-making systems. The primary research methods include field experiments, yield comparison trials, and crop growth model simulations.

The field trial method offers practical guidance for specific regions by establishing different sowing treatment groups and directly observing variations in crop yield and physiological characteristics. For example, in a winter wheat trial conducted in southern Xinjiang, Zhan Changgui and Ting demonstrated that early sowing significantly increased tillering node depth and enhanced cold resistance [51]. However, field trials require multi-year and multi-location replications for validation, and their findings are often constrained by specific environmental conditions and crop varieties, limiting their generalizability. To explore optimal management strategies under the interaction of multiple factors, the yield comparison method integrates variables such as sowing date, crop variety, and fertilizer application into a comprehensive experimental design. Vitantonio-Mazzini and Borrás uncovered the synergistic mechanism between early sowing and precise fertilizer application through experiments combining maize sowing dates with nitrogen and phosphorus regulation, providing valuable insights into environment–management interactions and mechanisms of yield enhancement [52]. Compared with traditional experimental methods, simulation techniques based on crop growth models significantly reduce research costs and expand the scope of spatio-temporal analysis. For instance, Jones and Hoogenboom used the CERES-Wheat model to simulate various sowing scenarios and identified mid to late November as the optimal sowing window for winter wheat [53].

Future advancements should focus on methodological innovation and interdisciplinary integration. On one hand, remote sensing, meteorological forecasting, and IoT data should be integrated and coupled with machine learning algorithms to optimize model parameters, thereby enhancing simulation accuracy and dynamic responsiveness. On the other hand, greater emphasis should be placed on the synergistic analysis of factors such as sowing date, variety selection, planting density, and water–fertilizer management, with the aim of constructing a multi-objective optimization framework. For ecologically sensitive zones—such as alpine and arid regions—the development of region-specific sowing decision models is essential for improving the adaptability and applicability of the technology. For example, Padovan and Martre analyzed the interaction effects among sowing date, environmental conditions, and variety on wheat across four Mediterranean regions [54]. Their study demonstrated that management strategies should be tailored to local conditions: in warm–dry areas, temperature and precipitation are the primary yield-limiting factors, and early sowing of short-cycle varieties is recommended; whereas in cold and wet environments, short-cycle varieties may shorten the flowering period and reduce yield, thus medium- to late-maturing varieties are more suitable.

4.1.4. Variety Recommendation

Variety recommendation aims to maximize planting potential and mitigate production risks by identifying crop varieties that are well-suited to specific environmental conditions and production objectives. Traditional approaches primarily rely on manually constructed indicator systems and statistical models. For example, Zhang Jianjun and Xiaoqun employed fuzzy mathematical methods to develop indicators for assessing the climatic suitability of maize, quantifying key parameters such as cumulative temperature and precipitation [55]. These methods typically involve the use of principal component analysis or analytic hierarchy processes to screen indicators and assign weights. While such approaches are logically sound, they are constrained by subjective judgment; the selection of indicators and the assignment of weights are prone to bias, thereby affecting the objectivity and robustness of the evaluation results.

With the advancement of artificial intelligence, machine learning methods have increasingly become the mainstream approach for variety recommendation. By autonomously learning from multi-source data and uncovering latent associations between environmental conditions and crop varieties, these methods improve the accuracy and objectivity of recommendations. For instance, systems based on Graph Convolutional Networks (GCNs) [56] and Random Forest [57] have been developed to capture complex interaction patterns between crop varieties and environmental variables, which also improves regional adaptability.

Table 3 provides a structured overview of representative methods and application scenarios for each type of pre-season decision-making task. To further synthesize their comparative characteristics, Figure 2 presents a visual summary of key strengths and limitations associated with each task. This figure serves as a complement to the tabulated information by emphasizing model interpretability, data dependency, generalizability, and computational complexity, thereby supporting task-specific model selection and future integration strategies. The comparative scoring in Figure 2 highlights the trade-offs among the four pre-season tasks. For example, while suitability assessment methods offer high interpretability, they often lack generalization capability. Conversely, variety recommendation models based on machine learning demonstrate high accuracy and adaptability, yet remain limited by low transparency. These contrasts reinforce the need for hybrid approaches that balance expert knowledge with data-driven inference.

4.2. Mid-Season Cultivation Management Decision Models

Within the smart farm framework, the three mid-season modules (seedling monitoring → variable fertilization, moisture sensing → irrigation scheduling, pest/disease identification → targeted spraying) operate as an integrated real-time decision pipeline (Figure 3). Data streams from multi-source sensors are fused in a central platform, triggering sequential or parallel execution of the corresponding models, thereby achieving closed-loop, plot-level precise control with minimal human intervention.

4.2.1. Seedling Monitoring & Variable Fertilization Decision

Seedling monitoring focuses on assessing the physiological status of crop growth and development. Traditionally, visual assessment methods based on characteristics such as plant height, leaf size and color, stalk thickness, and grain fullness have been employed. However, these methods are unsuitable for large-scale monitoring because they are labor-intensive and subjective. Remote sensing technology offers a scalable and objective alternative by estimating crop growth indicators through spectral information. Early studies primarily employed single vegetation indices (e.g., NDVI) to assess growth metrics such as leaf area index (LAI) and aboveground biomass (AGB) for crops like rice and winter wheat [58]. Nevertheless, the reliability of single indices is often compromised by vegetation cover variability and soil background interference, limiting their capacity to capture complex growth variations [59]. To enhance monitoring accuracy, recent research has adopted multi-index fusion approaches that integrate complementary spectral features to mitigate environmental noise and improve crop type and growth discrimination [60]. Furthermore, the combined use of climatic variables, growth parameters, and remote sensing data, along with dynamic parameter calibration using multiple regression models, has substantially improved the accuracy and transferability of growth monitoring systems for crops such as soybean and winter wheat [61]. Despite these advances, most existing research remains focused on static analyses of single growth stages. Future efforts should emphasize dynamic monitoring across the entire growing season by integrating time-series remote sensing data with crop growth simulation models to enable continuous tracking and accurate forecasting.

UAV-based remote sensing presents a promising avenue for high-resolution crop monitoring by leveraging the strong correlation between canopy spectral reflectance and physiological indicators. Two primary modeling approaches are used: empirical statistical models and radiative transfer models. Empirical models quantify linear or nonlinear relationships between spectral features and crop traits and are often enhanced with machine learning algorithms such as support vector regression (SVR), RF, and ANN [62,63,64]. While computationally efficient, these models require large training datasets and often suffer from limited generalizability. In contrast, radiative transfer models such as PROSAIL simulate spectral reflectance based on physical and optical principles, offering higher generalization but at the cost of increased complexity in model inversion [65]. Moving forward, integrating the strengths of both empirical and mechanistic models, together with multi-source data fusion, will be essential for advancing seedling monitoring toward a balanced paradigm of accuracy, efficiency, and adaptability.
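To make the point about single indices and soil background interference concrete, the following minimal sketch computes NDVI alongside the Soil-Adjusted Vegetation Index (SAVI), one widely used soil-corrected index. It is an illustration only and is not taken from the cited studies; the reflectance values are made up, and the function names are hypothetical.

```python
def ndvi(nir: float, red: float) -> float:
    """Normalized Difference Vegetation Index from NIR and red reflectance."""
    denom = nir + red
    if denom == 0:
        return 0.0  # no signal in either band; avoid division by zero
    return (nir - red) / denom


def savi(nir: float, red: float, soil_factor: float = 0.5) -> float:
    """Soil-Adjusted Vegetation Index; soil_factor is the L term (0.5 is a common default)."""
    return (1.0 + soil_factor) * (nir - red) / (nir + red + soil_factor)


# Illustrative reflectance values (made up): a sparse canopy over bright soil.
nir, red = 0.30, 0.15
features = {"ndvi": ndvi(nir, red), "savi": savi(nir, red)}
print({name: round(value, 3) for name, value in features.items()})
```

In a multi-index fusion workflow, a feature dictionary of this kind would typically be passed, together with other indices, to a regression or machine learning model that estimates LAI or AGB.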

Variable fertilization decision-making is a core component of precision agriculture, aimed at optimizing nutrient management strategies to enhance crop yield while ensuring environmental sustainability. Current research primarily explores two key approaches: crop growth model-based data assimilation and deep learning techniques. By integrating multi-source data and designing intelligent algorithms, these methods increasingly enable dynamic and precise fertilization decisions. The fusion of crop growth models with data assimilation techniques offers mechanistic support for fertilization decision-making. Data assimilation enhances the accuracy of crop state variable estimation by integrating remote sensing observations with model simulations, thereby guiding fertilizer application strategies. Optimization algorithms such as particle swarm optimization and composite hybrid evolution have demonstrated strong performance in parameter inversion. For instance, Xing and Li found that the composite hybrid evolution algorithm achieves both high accuracy and computational efficiency when assimilating winter wheat biomass, though its applicability is constrained by model structure and crop specificity [66]. Filtering algorithms further improve model adaptability by dynamically updating state variables. Xie and Wang utilized a particle filtering algorithm to assimilate maize LAI and Vegetation Temperature Condition Index (VTCI) data, revealing the influence of water stress on yield and highlighting the potential of data assimilation in fertilization planning [67]. Nevertheless, existing studies tend to emphasize yield forecasting, while research directly addressing fertilization optimization remains limited. Additionally, algorithmic complexity and a lack of real-time performance hinder widespread field deployment. The emergence of deep learning offers a promising pathway for advancing variable-rate fertilization. 
Convolutional Neural Networks (CNNs) and Recurrent Neural Networks (RNNs) can autonomously extract spatiotemporal features from multi-source remote sensing data, significantly improving the accuracy of growth parameter estimation (e.g., LAI, biomass). Yang and Shi proposed a CNN-LSTM network enhanced with an attention mechanism to invert rice growth parameters by fusing satellite and UAV data, outperforming traditional methods and providing reliable inputs for fertilization models [68]. Moreover, Deep Reinforcement Learning (DRL) introduces a framework for optimizing fertilization strategies through a state-action-reward paradigm. Liu applied the Proximal Policy Optimization (PPO) algorithm to nitrogen application decision-making, dynamically generating nitrogen application curves tailored to individual plots with yield maximization and nitrogen use efficiency as objectives, thereby improving both economic and environmental outcomes. Chlingaryan and Sukkarieh provided a comparative overview of supervised, unsupervised, and reinforcement learning algorithms, emphasizing the advantages of deep learning in multi-source agricultural big data analysis and strategic optimization [69]. Despite this potential, deep learning applications in precision fertilization remain at an early stage and face challenges such as algorithm applicability, model generalization, and interpretability. Future research should prioritize foundational theory, critical technologies, and real-world demonstrations. Additionally, reducing the computational burden of deep models through lightweight algorithm design and edge computing integration will be essential for promoting their practical adoption in field environments.
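The filtering-based assimilation described above can be sketched as a bootstrap particle filter that updates an ensemble of LAI states each time a remote sensing retrieval arrives. This is a simplified illustration under stated assumptions, not the implementation used in [67]: the logistic `growth_step` function is a hypothetical stand-in for a crop growth model, and the noise levels and LAI observations are made up.

```python
import math
import random


def growth_step(lai: float, rate: float = 0.08, lai_max: float = 6.0) -> float:
    """Hypothetical logistic LAI growth for one time step (stand-in for a crop model)."""
    return lai + rate * lai * (1.0 - lai / lai_max)


def particle_filter_update(particles, observation, obs_sd=0.3, process_sd=0.1, rng=random):
    """One predict-weight-resample cycle of a bootstrap particle filter."""
    # Predict: advance every particle with the model plus process noise.
    predicted = [max(0.0, growth_step(p) + rng.gauss(0.0, process_sd)) for p in particles]
    # Weight: Gaussian likelihood of the remotely sensed LAI observation.
    weights = [math.exp(-0.5 * ((observation - p) / obs_sd) ** 2) for p in predicted]
    if sum(weights) == 0.0:
        # Observation is far from every particle; keep the model prediction.
        return predicted
    # Resample: draw particles in proportion to their weights.
    return rng.choices(predicted, weights=weights, k=len(predicted))


rng = random.Random(42)
particles = [rng.uniform(0.5, 3.0) for _ in range(500)]  # uncertain initial LAI
for obs in [2.1, 2.3, 2.6]:                               # made-up LAI retrievals
    particles = particle_filter_update(particles, obs, rng=rng)
estimate = sum(particles) / len(particles)
print(round(estimate, 2))
```

The ensemble mean serves as the assimilated LAI estimate. In a fertilization workflow, such an estimate would feed a separate decision rule or optimizer, which is not shown here.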

4.2.2. Moisture Sensing & Efficient Irrigation Strategy Decision

Soil moisture monitoring using remote sensing is a pivotal technology in precision agriculture, as it enables informed decision-making through the retrieval of moisture conditions based on spectral interactions within the plant–soil–atmosphere continuum. The core challenge lies in accurately inverting soil moisture content from remote sensing data. Existing methodologies can be broadly categorized into two types: physical models and statistical machine learning models. Physical models, particularly those based on microwave and thermal infrared remote sensing, offer high-precision soil moisture estimates. However, they require numerous input parameters, such as soil texture, surface roughness, and vegetation cover, which significantly limits their scalability and applicability for continuous spatiotemporal monitoring at regional or global scales [70]. In contrast, statistical and machine learning approaches bypass many of these limitations by exploiting the nonlinear relationships between spectral features and soil moisture. For example, data-driven models that fuse UAV and satellite imagery (e.g., ANN, SVM, GBRT) avoid complex parameterization and achieve high estimation accuracy across varying scales [71,72,73]. For irrigation decision-making, the focus has shifted from static schedules to dynamic control. Fuzzy logic systems [74,75] and Model Predictive Control (MPC) frameworks [76,77] are now widely employed to optimize water usage by integrating real-time soil feedback and meteorological forecasts, significantly reducing resource waste compared to conventional methods.

However, many existing MPC-based studies assume complete infiltration and utilization of rainfall, overlooking the dynamic interplay between rainfall intensity and soil infiltration capacity. This leads to overestimation of effective rainwater use and increased risks of surface runoff or deep percolation, especially in soils with extreme textures such as clay or sand. Future smart irrigation systems should focus on constructing dynamic, multi-objective optimization frameworks that incorporate soil–water–atmosphere interactions, rainfall-runoff modeling, and real-time data assimilation. Such systems will enhance the accuracy and resilience of irrigation strategies, thereby improving overall agricultural water use efficiency and sustainability.
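The effect of the complete-infiltration assumption can be illustrated with a simple bucket water balance in which infiltration is capped by the soil's infiltration capacity. The sketch below is a minimal illustration, assuming uniform rainfall intensity over the event and a single root-zone layer; the function names and all numerical values are hypothetical.

```python
def partition_rainfall(rain_mm: float, duration_h: float, infiltration_mm_per_h: float):
    """Split a rain event into infiltrated water and surface runoff (both in mm)."""
    capacity_mm = infiltration_mm_per_h * duration_h
    infiltrated = min(rain_mm, capacity_mm)
    return infiltrated, rain_mm - infiltrated


def update_root_zone(storage_mm, infiltrated_mm, et_mm, field_capacity_mm):
    """Bucket water balance: returns (new storage, deep percolation), both in mm."""
    storage = storage_mm + infiltrated_mm - et_mm
    percolation = max(0.0, storage - field_capacity_mm)
    storage = min(max(storage, 0.0), field_capacity_mm)
    return storage, percolation


# Made-up event: 40 mm of rain in 2 h on a clay soil that infiltrates about 5 mm/h.
infiltrated, runoff = partition_rainfall(40.0, 2.0, 5.0)
storage, percolation = update_root_zone(60.0, infiltrated, 4.0, 100.0)
print(infiltrated, runoff, storage, percolation)
```

In this example only 10 mm of the 40 mm event enters the root zone, so a controller that credits the full 40 mm would under-irrigate afterwards. With a high infiltration rate and a nearly full root zone (a sandy soil, for instance), the same functions instead route the excess to deep percolation.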

4.2.3. Pest and Disease Monitoring & Precision Application Decision

Statistical modeling represents one of the earliest technical approaches for remote sensing-based monitoring of crop pests and diseases, primarily aimed at establishing quantitative relationships between spectral characteristics and disease severity. Among these, discriminant analysis models, particularly Fisher’s Linear Discriminant Analysis (FLDA), have shown strong performance in classifying wheat diseases. For example, Yuan and Zhang applied FLDA to leaf spectral data and successfully differentiated between wheat yellow rust, powdery mildew, and aphid infestations, achieving an accuracy of 75% [78]. In regression-based approaches, partial least squares regression (PLSR) has become a widely used method for inverting disease severity due to its robustness against multicollinearity and resistance to overfitting. Wang and Jing developed a wheat stripe rust severity inversion model using PLSR, achieving a correlation coefficient of 0.936 between estimated and observed disease severity [79]. However, statistical models are inherently constrained by linear assumptions and their dependency on localized environmental data, limiting their applicability in large-scale field conditions characterized by nonlinear interactions and multifactorial influences.

The advent of traditional machine learning methods has significantly enhanced the intelligence and automation of pest and disease monitoring through supervised and unsupervised learning paradigms. Supervised models such as ANN, SVM, and RF have been applied to disease classification and feature selection, and ANN and SVM in particular have demonstrated high efficacy in disease classification and severity assessment by learning from spectral features [80,81]. More recently, deep learning architectures including PSPNet [82] and attention-based ResNet [83] have further advanced this field. These models automatically extract hierarchical features from imagery, achieving robust recognition performance in complex field conditions without relying on manual feature engineering.

The core of precision application decision-making in smart agriculture lies in the rapid and accurate diagnosis of both disease type and severity using intelligent recognition technologies, which subsequently guide the formulation of targeted application strategies. Current research emphasizes the integration and optimization of traditional image processing techniques with deep learning approaches, facilitating the transition of disease identification from reliance on manual expertise to automated intelligent diagnosis. However, the challenge of accurately distinguishing varying severity levels of the same disease remains a major bottleneck. Traditional image processing methods perform disease identification by manually extracting features—such as color and texture—and designing classifiers. For example, Yandong and He combined SVM with Dempster–Shafer (DS) evidence theory, achieving a recognition accuracy of 93.3% for corn disease [84]. While effective in specific contexts, these methods suffer from complex operations and limited generalization capabilities due to their reliance on handcrafted features. By contrast, deep learning techniques have significantly enhanced the accuracy and efficiency of plant disease recognition, leveraging their capacity for automatic feature extraction and end-to-end learning. CNNs and their variants dominate current research. For instance, Zhang and Qiao employed a triplet-loss dual CNN architecture to achieve over 90% accuracy in maize disease recognition [85]. ResNeXt improves model performance by optimizing network topology through aggregated transformations, avoiding a substantial increase in parameter count. Similarly, Xu Jinghui and Mingye achieved 95.33% accuracy in disease classification using an enhanced VGG16 model [86]. Despite these advancements, most studies focus primarily on differentiating disease types, while the ability to recognize variations in disease severity remains limited. 
This shortfall constrains the precision regulation of input quantities, such as pesticide application, which is essential for fully automated variable-rate operations. Therefore, addressing this gap is critical to achieving end-to-end automation in smart farming disease management systems.

Table 4 summarizes the major data types, algorithmic approaches, and decision targets associated with typical mid-season cultivation tasks. To further elucidate the decision-making process, Figure 3 presents a modular workflow integrating three critical submodules: seedling monitoring coupled with variable fertilization, moisture sensing linked to irrigation scheduling, and pest and disease identification for targeted spraying. Each module exemplifies a data-to-decision pipeline, demonstrating how data-driven models are embedded within real-time agricultural applications. Together, Table 4 and Figure 3 offer a holistic perspective on the operational structure of intelligent mid-season management systems. The modular design also allows for selective deployment of submodules depending on crop type, sensor availability, and farm-scale infrastructure.

4.3. Post-Harvest Benefit Evaluation Models

These post-harvest models close the annual decision-making loop of smart farms. Yield forecasts generated in real time during the late growth stage are directly fed into harvest timing prediction and machinery scheduling modules; the resulting operational data (actual harvest time, machinery paths, and final yield) are then automatically written back to the comprehensive benefit evaluation system, forming a complete data–model–execution–feedback closed loop across the entire production cycle.

4.3.1. Production Forecasts

In the context of global population growth and the increasing frequency of extreme weather events, accurate crop yield prediction has become a critical component of precision agriculture. At present, the main approaches to crop yield prediction can be broadly categorized into two groups: physical simulation models based on crop growth processes, and statistical machine learning models based on historical data. Physical models simulate the physiological and environmental processes of crop development to estimate potential yields. While these models can provide biologically interpretable insights, they require extensive field data for calibration and are computationally demanding, which limits their scalability and practical application over large areas [87]. In contrast, statistical and machine learning models establish predictive relationships between historical yield data and influencing factors such as meteorological variables, remote sensing indices, and soil characteristics. These models do not depend on detailed biophysical parameters, making them more scalable and adaptable for large-scale yield forecasting. For example, Fu and Jiang combined various vegetation indices with techniques such as multiple regression analysis to predict wheat yields, achieving improved prediction accuracy [88].

With the rapid advancement of machine learning technologies, nonlinear models and ensemble learning methods have become increasingly prominent in the field of crop yield prediction. Ensemble methods enhance prediction accuracy and robustness by integrating multiple base models, thereby mitigating the limitations of individual models. For instance, Filippi et al. employed the RF algorithm to estimate yields of wheat, barley, and oilseed rape using MODIS-EVI remote sensing data, and demonstrated that RF models yield higher prediction accuracy compared to traditional techniques [89]. When processing large-scale remote sensing datasets, deep learning techniques exhibit strong feature extraction capabilities. These models automatically learn complex spatial and temporal patterns, significantly improving yield prediction performance. Nevavuori and Narra introduced a CNN-based approach that successfully predicted wheat and barley yields using remotely sensed imagery, outperforming conventional methods in terms of accuracy [90]. Furthermore, the integration of CNN and Long Short-Term Memory (LSTM) networks in spatio-temporal models enhances the fusion of spatial features with time-series information. This architecture improves predictive performance for dynamic agricultural systems. For example, Qiao and He developed a spatial–spectral–temporal neural network combining 3D CNN and LSTM, which effectively integrates spectral, spatial, and temporal information [91]. Their model achieved outstanding performance in predicting yields of winter wheat and maize across three regions in China, demonstrating substantial improvements in the accuracy and generalizability of grain yield forecasts.

To gain a more comprehensive understanding of the factors influencing crop yield, future research should focus on fine-scale prediction, particularly through experimental studies conducted at the farm and pixel levels. While some scholars have begun to explore coupled models for crop growth parameter inversion and yield estimation, this area remains underdeveloped. Coupled models integrate the strengths of deep learning and mechanistic (process-based) models, thereby enhancing the interpretability of deep learning outputs and addressing the limitations of traditional crop growth models or radiative transfer models, which often face challenges in large-scale application due to their reliance on detailed parameters and site-specific calibration. By combining data-driven learning with domain-specific biophysical knowledge, coupled models offer improved scalability and generalization across diverse agricultural environments. Consequently, the integration of mechanistic modeling and deep learning holds significant potential for advancing regional-scale crop growth monitoring and yield estimation, making it a promising direction for future agricultural research and precision farming applications.

4.3.2. Harvest Timing Forecasting & Agricultural Machinery Dispatch

In the context of agricultural intensification and escalating climate change, accurate prediction of crop harvest periods has become a critical technology for optimizing agricultural machinery scheduling and reducing post-harvest losses. Existing prediction methods can be broadly classified into two categories: physical models based on crop physiological mechanisms and data-driven models based on remote sensing and machine learning. Physical models estimate crop maturity by simulating interactions between growth cycles and environmental factors; however, they require extensive field parameter calibration and struggle to adapt to complex terrains and diverse climatic conditions [92]. In contrast, data-driven approaches establish nonlinear relationships between remote sensing spectral features and crop maturity through machine learning algorithms, significantly enhancing prediction efficiency and adaptability. For example, studies integrating UAV-acquired multispectral data with algorithms such as Random Forest [93] and CNN-LSTM architectures [94] have consistently outperformed traditional physical models in terms of accuracy and adaptability. Deep learning also improves cross-regional generalization, as demonstrated by Singh and Duddu, who fused hyperspectral and multispectral data to construct a spectral response model for oilseed rape maturity, highlighting the potential of multi-source remote sensing synergy in harvest prediction [95].

Building on predictive harvest period modeling, the agricultural machinery scheduling module integrates the spatiotemporal distribution of crop maturity with vehicle path optimization algorithms to enable efficient and low-loss harvesting operations. Current research has increasingly transitioned from traditional static, experience-based management to intelligent, dynamic scheduling frameworks. Conventional approaches often result in idle time or machine overload, especially during cross-regional operations, due to poor information flow and coordination inefficiencies [96]. Recent studies focus on modeling the vehicle routing problem (VRP) in agricultural settings and solving it with intelligent optimization algorithms. For example, Ruyue and Shichao applied an improved ant colony optimization (ACO) algorithm to optimize agricultural machinery paths, effectively addressing the “neighborhood” task allocation issue [97]. Liang implemented a multi-objective optimization algorithm, significantly enhancing the solution efficiency for complex scheduling tasks [98]. Despite these advances, current models still face limitations. Many are built on idealized assumptions, neglecting critical real-world constraints such as actual machinery working hours, field task volume, and equipment availability. Moreover, standardization and fusion of multi-source data—essential for effective cross-regional coordination—remain unresolved. Future research should focus on improving the practical applicability of scheduling algorithms under real-world constraints, and enhancing the integration of multi-source remote sensing, meteorological, and machinery data. This is essential for advancing from theoretical optimization to real-time adaptive scheduling. 
In terms of multi-objective collaborative optimization, Li Shichao and Man demonstrated that the combination of NSGA-III and an improved ACO algorithm can achieve zero waiting time for harvesters and minimize transportation distance, providing a valuable reference for large-scale intelligent agricultural logistics [99].

4.3.3. Holistic Performance Assessment

Comprehensive agricultural benefit evaluation serves as a critical tool for assessing the sustainability of agricultural production, aiming to quantitatively measure the synergistic effects of economic, ecological, and social benefits through a multidimensional indicator system. Recent research has evolved from traditional single-benefit assessments to more advanced multi-objective comprehensive evaluations, increasingly integrating quantitative analysis, dynamic simulation models, and multi-criteria decision-making approaches. However, despite these advancements, significant challenges remain—particularly regarding the applicability of evaluation methods across diverse agricultural contexts and the credibility and robustness of the results.

Early research on agricultural benefit evaluation primarily focused on single-dimensional analyses, particularly economic benefit assessments, due to constraints in data availability and methodological maturity. Methods such as the Analytic Hierarchy Process (AHP) and the entropy weight method were widely adopted, and evaluations typically relied on a single method, such as AHP or Life Cycle Assessment (LCA), to quantify specific economic or environmental metrics [100,101,102]. To address the limitations of individual methods, recent studies favor hybrid evaluation frameworks. By integrating subjective expert scoring (e.g., AHP) with objective data analysis (e.g., entropy weight, TOPSIS), these composite models [103,104] effectively balance qualitative and quantitative indicators, enhancing the robustness and comprehensiveness of sustainability assessments.
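The objective half of such a hybrid framework can be sketched as entropy weighting followed by TOPSIS ranking. The example below is a minimal illustration rather than the procedure of [103,104]: it assumes strictly positive indicator values and at least one indicator that varies across alternatives, the three farms and their indicator values are made up, and the combination with subjective AHP weights is not shown.

```python
import math


def entropy_weights(matrix):
    """Entropy weight method: indicators with more dispersion get larger weights."""
    n = len(matrix)
    diversities = []
    for col in zip(*matrix):
        total = sum(col)
        shares = [v / total for v in col]
        entropy = -sum(p * math.log(p) for p in shares if p > 0) / math.log(n)
        diversities.append(1.0 - entropy)
    d_sum = sum(diversities)
    return [d / d_sum for d in diversities]


def topsis(matrix, weights, benefit):
    """TOPSIS closeness scores; benefit[j] is True when larger values are better."""
    norms = [math.sqrt(sum(v * v for v in col)) for col in zip(*matrix)]
    weighted = [[w * v / nrm for v, w, nrm in zip(row, weights, norms)] for row in matrix]
    wcols = list(zip(*weighted))
    ideal = [max(c) if b else min(c) for c, b in zip(wcols, benefit)]
    worst = [min(c) if b else max(c) for c, b in zip(wcols, benefit)]
    scores = []
    for row in weighted:
        d_best = math.dist(row, ideal)    # distance to the positive ideal solution
        d_worst = math.dist(row, worst)   # distance to the negative ideal solution
        scores.append(d_worst / (d_best + d_worst))
    return scores


# Made-up indicators per farm: net income, nitrogen loss, jobs supported.
farms = [[1200.0, 35.0, 12.0], [1500.0, 50.0, 10.0], [900.0, 20.0, 15.0]]
benefit = [True, False, True]  # nitrogen loss is a cost-type indicator
weights = entropy_weights(farms)
scores = topsis(farms, weights, benefit)
print([round(w, 3) for w in weights], [round(s, 3) for s in scores])
```

A closeness score near 1 indicates an alternative close to the ideal on the weighted indicators; an alternative that is best on every indicator receives a score of exactly 1.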

Table 5 provides a categorized summary of data sources, algorithms, and decision targets for post-harvest evaluation. To illustrate the process logic, Figure 4 presents a structured workflow integrating three interdependent modules: (i) production forecasting using remote sensing and spatiotemporal modeling; (ii) harvest timing estimation and machinery dispatch through deep learning and optimization algorithms; and (iii) holistic performance assessment via multi-criteria decision analysis. The diagram highlights the sequential flow from predictive modeling to actionable scheduling and ultimately to sustainability scoring. Together, Table 5 and Figure 4 provide an operational and evaluative framework for supporting post-harvest decision intelligence in smart agriculture. This modular architecture emphasizes inter-module dependency, where yield forecasts inform harvest planning, and execution data feeds back into sustainability evaluation, creating a closed-loop decision-support system.

4.4. Section Summary

In the pre-season phase, decision models focus on site-specific planning tasks such as suitability assessment, sowing schedules, and variety selection. These are supported by a mix of traditional statistical methods, non-linear evaluation, and increasingly machine learning and crop growth simulations, enabling precision layout of agricultural production. During the in-season management stage, real-time monitoring and intelligent control become essential. UAV-based sensing, deep learning, and MPC enable accurate assessment of crop status, dynamic fertilization, irrigation scheduling, and biotic stress surveillance. These techniques highlight the fusion of multi-source data and algorithmic adaptability. Post-harvest models emphasize yield forecasting, harvest timing, machinery scheduling, and performance evaluation. Spatio-temporal neural networks, ensemble learning, and multi-objective optimization algorithms are increasingly used to improve both prediction accuracy and operational efficiency.

Despite these advancements, the transition from theoretical models to field application faces critical bottlenecks. First, cross-regional model transferability remains limited; algorithms trained on specific datasets often suffer significant performance degradation when applied to new environments with differing soil backgrounds and climatic patterns. Second, data scarcity—particularly the lack of high-quality labeled samples for rare pest outbreaks or specific growth stages—continues to constrain the robustness of data-driven models. Finally, real-world constraints, such as limited onboard computing power and unstable network connectivity in rural areas, hinder the deployment of complex deep learning architectures (e.g., 3D-CNNs) on edge devices, highlighting an urgent need for lightweight model design.

As illustrated in the Sankey diagram (Figure 5), many technical methods—especially deep learning, intelligent optimization, and multi-modal sensing—are no longer confined to single phases but have become cross-cutting tools that support a continuous, integrated end-to-end decision-making pipeline from pre-season planning through in-season control to post-harvest evaluation and feedback. Overall, the evolution of these models reflects a clear trend toward integrated, adaptive, and cross-stage frameworks, laying the groundwork for sustainable and scalable smart agriculture systems.

5. Discussion, Conclusions and Outlook

5.1. Discussion

The findings of this review highlight that while technical capabilities in smart farming are advancing rapidly, realizing their full potential requires addressing critical implementation barriers through coordinated stakeholder action. For policymakers, the challenge of data fragmentation underscores an urgent need for national standards on agricultural data formats and sharing protocols. Furthermore, governments should incentivize the deployment of rural digital infrastructure, such as 5G networks and edge computing nodes, which are prerequisites for supporting the real-time decision capabilities envisioned in this study. Simultaneously, technology providers must shift their focus from isolated algorithmic improvements to system interoperability. Adopting the federated governance and layered storage architectures identified in Section 3 is essential for reducing data management costs while ensuring security. For agricultural practitioners, the increasing complexity of models—particularly the “black box” nature of deep learning—poses adoption risks. Consequently, extension services must prioritize enhancing farmers’ digital literacy to facilitate effective collaboration with the hybrid decision-making systems of the future.

Despite the comprehensive framework presented, this study has specific limitations. First, by primarily synthesizing academic literature, the review may underrepresent proprietary, high-performance models developed by commercial agribusinesses, which are often not publicly detailed. Second, as noted in the regional analysis, smart farm models exhibit strong geographical dependencies; thus, the decision frameworks summarized here may require significant parameter recalibration when transferred across different climatic zones or cropping systems. Finally, this study focuses on technical feasibility and architectural logic, meaning a detailed cost–benefit analysis of implementing these high-precision technologies—a critical factor for smallholder adoption—was outside the scope of this review.

5.2. Conclusion

Agricultural development is progressively transitioning toward a resource-integrated, data-driven paradigm characterized by intelligent and modernized practices. At the core of this transformation is the smart farm, whose fundamental objective is to enable precise decision-making throughout the entire agricultural production process. Among the various enabling technologies, intelligent decision-making models based on big data play a pivotal role: as the foundational support for smart farming, they have become essential for both the theoretical advancement and the practical implementation of modern agriculture. This paper provides a systematic overview of the fundamental principles underpinning big data-driven intelligent decision-making models for smart farms. It outlines the technical framework and recent research progress in data governance and model construction. It further analyzes the application scenarios of end-to-end intelligent decision-making, spanning pre-production planning, in-season management, and post-production evaluation, and summarizes the key achievements of the main modeling approaches, including statistical methods, traditional machine learning algorithms, and deep learning techniques. The findings indicate that standardized data governance procedures and multi-source heterogeneous data fusion methodologies have reached a relatively mature stage. Pre-production and in-season decision models increasingly rely on deep learning frameworks integrated with multimodal datasets. In contrast, research on post-production evaluation and regional-scale global optimization remains underdeveloped. Most importantly, these models are evolving from isolated applications into an integrated end-to-end intelligent decision-making pipeline that spans the entire crop lifecycle, with data, models, and execution equipment forming tight closed-loop feedback. Current trends suggest a shift toward the deployment of lightweight algorithms and edge computing technologies, aimed at leveraging full-process multi-source data to support dynamic, real-time decision-making in smart farming systems.

5.3. Outlook

Although current research on big data-driven intelligent decision-making models for smart farms has achieved notable advancements, further progress is essential to enable fully intelligent and large-scale applications across the entire agricultural production cycle. This section puts forward a phased development pathway for advancing intelligent agricultural decision-making, structured across short-, mid-, and long-term objectives.

In the short term, the focus should be on establishing standardized data governance frameworks and developing generalizable agricultural foundation models. Agricultural data collection currently relies on a wide range of heterogeneous sources, such as sensors, UAVs, and weather stations, leading to fragmented, inconsistent datasets and high labeling costs. Therefore, efforts should center on constructing unified data standards, metadata schemas, and shared data directories. Meanwhile, large-scale agricultural foundation models should be developed to integrate multi-source data such as satellite imagery, meteorological information, and agronomic observations. These models can help address regional variability and improve generalization through transfer learning and domain adaptation. By strengthening data infrastructure and modeling capacity, this stage lays the data and model foundation for more intelligent, scalable, end-to-end decision pipelines.
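
As an illustration of what a unified metadata schema could look like in practice, the following is a minimal sketch in Python. The record type `DatasetMetadata`, its field names, the `ALLOWED_SOURCES` set, and the `validate` helper are all hypothetical and are not drawn from any published agricultural data standard; they only show how heterogeneous sources (sensors, UAVs, weather stations) might be described with one shared structure and checked before entering a shared data directory.

```python
from dataclasses import dataclass
from datetime import datetime, timezone

# Hypothetical unified metadata record. Field names are illustrative only
# and do not follow any specific published standard.
@dataclass(frozen=True)
class DatasetMetadata:
    dataset_id: str
    source_type: str      # e.g. "sensor", "uav", "weather_station"
    variable: str         # observed quantity, e.g. "soil_moisture"
    unit: str             # e.g. "m3/m3"
    crs: str              # coordinate reference system, e.g. "EPSG:4326"
    bbox: tuple           # (min_lon, min_lat, max_lon, max_lat)
    start_time: datetime
    end_time: datetime

# Assumed vocabulary of accepted source types.
ALLOWED_SOURCES = {"sensor", "uav", "weather_station", "satellite"}

def validate(meta: DatasetMetadata) -> list:
    """Return a list of problems; an empty list means the record passes."""
    problems = []
    if meta.source_type not in ALLOWED_SOURCES:
        problems.append(f"unknown source_type: {meta.source_type}")
    min_lon, min_lat, max_lon, max_lat = meta.bbox
    if not (-180 <= min_lon <= max_lon <= 180 and -90 <= min_lat <= max_lat <= 90):
        problems.append("bbox out of range or inverted")
    # Check timezone awareness first: comparing naive and aware datetimes raises.
    if meta.start_time.tzinfo is None or meta.end_time.tzinfo is None:
        problems.append("timestamps must be timezone-aware")
    elif meta.start_time > meta.end_time:
        problems.append("start_time is after end_time")
    return problems

record = DatasetMetadata(
    dataset_id="field-07-ndvi",
    source_type="uav",
    variable="ndvi",
    unit="1",
    crs="EPSG:4326",
    bbox=(114.0, 30.0, 114.5, 30.5),
    start_time=datetime(2025, 5, 1, tzinfo=timezone.utc),
    end_time=datetime(2025, 5, 2, tzinfo=timezone.utc),
)
print(validate(record))
```

A real schema would likely build on an existing geospatial metadata standard and carry further fields (licensing, provenance, quality flags); the point here is only that a single validated record type makes datasets from different sources comparable and discoverable.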

In the mid-term, the emphasis shifts to enabling collaborative and adaptive decision-making through federated intelligence and human–machine interaction. Building upon standardized data protocols, federated learning frameworks can support secure and privacy-preserving multi-party model training across different farms or institutions. Blockchain-based auditing mechanisms may also be introduced to ensure traceability and trust in data transactions. In parallel, interpretable hybrid decision-making systems should be established by combining expert knowledge with deep learning through natural language interfaces. These systems can enhance usability and transparency in field operations and prepare the foundation for real-time responsiveness in subsequent stages, enabling secure, collaborative, and human–machine co-driven end-to-end decision-making across farms.
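
The multi-party training idea can be made concrete with a minimal sketch of federated averaging (FedAvg), written here in plain Python. Everything in it is an assumption made for illustration: the two-parameter linear model, the synthetic farm data generated by `make_farm`, and the learning-rate and round settings. It omits the encryption, secure aggregation, and blockchain auditing that a production system would need, and shows only the core pattern in which each farm trains locally and shares model parameters rather than raw data.

```python
import random

def local_update(weights, data, lr=0.5, epochs=5):
    """Gradient descent on one farm's private data for the model y ~ w0 + w1 * x.

    Only the updated parameters are returned; the raw data stays local.
    """
    w0, w1 = weights
    n = len(data)
    for _ in range(epochs):
        g0 = sum((w0 + w1 * x - y) for x, y in data) / n
        g1 = sum((w0 + w1 * x - y) * x for x, y in data) / n
        w0 -= lr * g0
        w1 -= lr * g1
    return (w0, w1)

def federated_average(client_weights, client_sizes):
    """Sample-size-weighted average of client parameters (FedAvg aggregation)."""
    total = sum(client_sizes)
    dims = len(client_weights[0])
    return tuple(
        sum(w[d] * n for w, n in zip(client_weights, client_sizes)) / total
        for d in range(dims)
    )

def train(clients, rounds=200):
    """Alternate local training on every client with server-side averaging."""
    global_w = (0.0, 0.0)
    sizes = [len(data) for data in clients]
    for _ in range(rounds):
        updates = [local_update(global_w, data) for data in clients]
        global_w = federated_average(updates, sizes)
    return global_w

def make_farm(n, seed):
    """Synthetic stand-in for one farm's records: y = 2 + 3x plus small noise."""
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        x = rng.random()
        data.append((x, 2.0 + 3.0 * x + rng.gauss(0, 0.01)))
    return data

farms = [make_farm(20, 1), make_farm(30, 2), make_farm(50, 3)]
print(train(farms))  # parameters should end up close to (2.0, 3.0)
```

Weighting by sample size keeps a farm with few records from dominating the shared model. In practice, farms in different regions hold non-identically distributed data, which is where the regional heterogeneity discussed in this review makes plain averaging less reliable.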

In the long term, intelligent agricultural systems must achieve real-time, high-frequency control and embed sustainability-centered ethical governance. By deploying lightweight models on edge devices connected through 5G infrastructure, decision systems can support millisecond-level responses for tasks such as fertilization and irrigation scheduling. Furthermore, intelligent platforms should incorporate environmental constraints and sustainability objectives, optimizing resource allocation while minimizing ecological footprints. Blockchain technologies can support transparent carbon tracking and low-carbon transitions. In parallel, it is essential to develop ethical governance frameworks to guide model development and deployment in alignment with principles of ecological integrity and social equity. This shift from a “high-yield, high-efficiency” paradigm to one centered on “precision and sustainability” will provide robust and responsible support for future agricultural transformation, ultimately achieving fully autonomous, real-time, sustainable end-to-end intelligent decision-making systems for smart farms.

Author Contributions

C.Q.: Conceptualization, Data curation, Formal analysis, Methodology, Resources, Visualization, Writing—original draft, Writing—review & editing. P.Z.: Conceptualization, Data curation, Methodology, Resources, Writing—review & editing. Y.Q.: Conceptualization, Data curation, Investigation, Methodology, Resources. G.Y.: Conceptualization, Funding acquisition, Methodology, Supervision, Validation, Writing—review & editing. X.H.: Conceptualization, Methodology, Supervision. X.M.: Conceptualization, Funding acquisition, Methodology, Supervision, Validation, Writing—review & editing. X.Y.: Conceptualization, Formal analysis, Funding acquisition, Methodology, Project administration, Supervision, Validation, Writing—review & editing. J.H.: Conceptualization, Formal analysis, Methodology, Project administration, Supervision, Validation, Writing—review & editing. All authors have read and agreed to the published version of the manuscript.

Funding

This study was supported by the National Key Research and Development Program of China (2023YFD2000105).

Data Availability Statement

No new data were created or analyzed in this study.

Conflicts of Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

References

1. World Health Organization (WHO). The State of Food Security and Nutrition in the World 2020: Transforming Food Systems for Affordable Healthy Diets; Food & Agriculture Org.: Rome, Italy, 2020; Volume 2020.
2. Food and Agriculture Organization. The Future of Food and Agriculture: Alternative Pathways to 2050; Food and Agriculture Organization of the United Nations: Rome, Italy, 2018; p. 60.
3. Food and Agriculture Organization (FAO) of the United Nations. The Future of Food and Agriculture: Trends and Challenges; FAO: Rome, Italy, 2017.
4. Çakmakçı, R.; Salık, M.A.; Çakmakçı, S. Assessment and principles of environmentally sustainable food and agriculture systems. Agriculture 2023, 13, 1073.
5. Moysiadis, V.; Sarigiannidis, P.; Vitsas, V.; Khelifi, A. Smart Farming in Europe. Comput. Sci. Rev. 2021, 39, 100345.
6. Li, Q.; Gao, M.-F.; Fang, Y. Research on the construction of the agricultural big data information platform. J. Agric. Big Data 2021, 3, 24–30.
7. Huang, Y.; Chen, Z.-X.; Tao, Y.; Huang, X.-Z.; Gu, X.-F. Agricultural remote sensing big data: Management and applications. J. Integr. Agric. 2018, 17, 1915–1931.
8. Wolfert, S.; Ge, L.; Verdouw, C.; Bogaardt, M.-J. Big data in smart farming–a review. Agric. Syst. 2017, 153, 69–80.
9. Ling, N.-J.; Rao, Y. Design and implementation of a big data platform for cloud server farm smart services. J. Agric. Big Data 2022, 3, 10–19.
10. Zhao, C.-J. Current situations and prospects of smart agriculture. J. S. China Agric. Univ. 2021, 42, 1–7.
11. Nahina, I.; Mamunur, R.M.; Faezeh, P.; Biplob, R.; Steven, M.; Rajan, K. A Review of Applications and Communication Technologies for Internet of Things (IoT) and Unmanned Aerial Vehicle (UAV) Based Sustainable Smart Farming. Sustainability 2021, 13, 1821.
12. Hong, Y.X.; Yang, Y.Z. A new journey of innovation in the development model of modern agriculture–Also on the innovative research of China’s development economics. J. Manag. World 2023, 39, 1–8+53+59.
13. Dutta, M.; Gupta, D.; Tharewal, S.; Goyal, D.; Sandhu, J.K.; Kaur, M.; AlZubi, A.A.; Alanazi, J.M. Internet of Things-Based Smart Precision Farming in Soilless Agriculture: Opportunities and Challenges for Global Food Security. IEEE Access 2025, 13, 34238–34268.
14. Khujamatov, K.E.; Toshtemirov, T.; Lazarev, A.; Raximjonov, Q. IoT and 5G technology in agriculture. In Proceedings of the 2021 International Conference on Information Science and Communications Technologies (ICISCT), Tashkent, Uzbekistan, 3–5 November 2021; pp. 1–6.
15. Cao, B.; Li, H.; Zhao, C.; Li, J. The Path of Smart Agricultural Technology Innovation Leading Development of Agricultural New Quality Productivity. Smart Agric. 2024, 6, 116.
16. Perry, T.S. John Deere’s quest to solve agriculture’s deep-learning problems-[Spectral Lines]. IEEE Spectr. 2020, 57, 4.
17. Iida, S. Precision Agriculture in Rice Farming. In Precision Agriculture: Modelling; Springer: Berlin/Heidelberg, Germany, 2023; pp. 239–250.
18. Xiong, F. Technical architecture and implementation of intelligent system for agriculture domain. Pattern Recognit. Artif. Intell. 2012, 25, 729–736.
19. Tripathi, A.; Tiwari, R.K.; Tiwari, S.P. A deep learning multi-layer perceptron and remote sensing approach for soil health based crop yield estimation. Int. J. Appl. Earth Obs. Geoinf. 2022, 113, 102959.
20. Sun, C.; Bian, Y.; Zhou, T.; Pan, J. Using of multi-source and multi-temporal remote sensing data improves crop-type mapping in the subtropical agriculture region. Sensors 2019, 19, 2401.
21. Bahrami, H.; Homayouni, S.; Safari, A.; Mirzaei, S.; Mahdianpari, M.; Reisi-Gahrouei, O. Deep learning-based estimation of crop biophysical parameters using multi-source and multi-temporal remote sensing observations. Agronomy 2021, 11, 1363.
22. dos Santos, J.A.; Gosselin, P.-H.; Philipp-Foliguet, S.; Torres, R.d.S.; Falcao, A.X. Interactive multiscale classification of high-resolution remote sensing images. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 2013, 6, 2020–2034.
23. Varela, S.; Dhodda, P.R.; Hsu, W.H.; Prasad, P.V.; Assefa, Y.; Peralta, N.R.; Griffin, T.; Sharda, A.; Ferguson, A.; Ciampitti, I.A. Early-season stand count determination in corn via integration of imagery from unmanned aerial systems (UAS) and supervised learning techniques. Remote Sens. 2018, 10, 343.
24. Lu, B.; Dao, P.D.; Liu, J.; He, Y.; Shang, J. Recent advances of hyperspectral imaging technology and applications in agriculture. Remote Sens. 2020, 12, 2659.
25. Maimaitijiang, M.; Sagan, V.; Sidike, P.; Hartling, S.; Esposito, F.; Fritschi, F.B. Soybean yield prediction from UAV using multimodal data fusion and deep learning. Remote Sens. Environ. 2020, 237, 111599.
26. Zhu, W.; Sun, Z.; Peng, J.; Huang, Y.; Li, J.; Zhang, J.; Yang, B.; Liao, X. Estimating maize above-ground biomass using 3D point clouds of multi-source unmanned aerial vehicle data at multi-spatial scales. Remote Sens. 2019, 11, 2678.
27. Hasan, M.M.; Chopin, J.P.; Laga, H.; Miklavcic, S.J. Detection and analysis of wheat spikes using convolutional neural networks. Plant Methods 2018, 14, 100.
28. Zhang, Z.; Yang, X.; Gao, J.; Wang, X.; Bai, F.; Sun, S.; Liu, Z.; Ming, B.; Xie, R.; Wang, K. Analysis of suitable sowing date for summer maize in North China Plain under climate change. Sci. Agric. Sin. 2018, 51, 3258–3274.
29. Fu, J.; Wang, W.; Shao, Q.; Xing, W.; Cao, M.; Wei, J.; Chen, Z.; Nie, W. Improved global evapotranspiration estimates using proportionality hypothesis-based water balance constraints. Remote Sens. Environ. 2022, 279, 113140.
30. Niu, L.; Zhang, X. A study of estimation model for the chlorophyll content of wheat leaf based on hyperspectral imaging. Asian Agric. Res. 2016, 8, 86–90.
31. Xue, D.; Yuan, G.; Bing, A. Research on data storage method of smart grid monitoring system based on TDengine. Electrotech. Appl. 2021, 40, 68–74.
32. Lei, J. MinIO-Based Efficient Database Backup and Synchronization Strategy: Compressed and Encrypted Storage with Cross-Region Real-Time Synchronization. Appl. Sci. Innov. Res. 2025, 9, 136.
33. Salunke, S.V.; Ouda, A. A Performance Benchmark for the PostgreSQL and MySQL Databases. Future Internet 2024, 16, 382.
34. Ma, Y.; Song, J.; Zhang, Z. In-Memory Distributed Mosaicking for Large-Scale Remote Sensing Applications with Geo-Gridded Data Staging on Alluxio. Remote Sens. 2022, 14, 5987.
35. Liu, J.; Jin, S.; Wang, D.; Li, H. An archive-based method for efficiently handling small file problems in HDFS. Concurr. Comput. Pract. Exp. 2024, 36, e8260.
36. Mohammed, Z.K.; Mohammed, M.A.; Abdulkareem, K.H.; Zebari, D.A.; Lakhan, A.; Marhoon, H.A.; Nedoma, J.; Martinek, R. A metaverse framework for IoT-based remote patient monitoring and virtual consultations using AES-256 encryption. Appl. Soft Comput. 2024, 158, 111588.
37. Polato, I.; Ré, R.; Goldman, A.; Kon, F. A comprehensive view of Hadoop research–A systematic literature review. J. Netw. Comput. Appl. 2014, 46, 1–25.
38. Fotso, S.B.N.; Atchoffo, W.N.; Nzeukou, A.C.; Talla Mbé, J.H. Enhanced security in lossless audio encryption using zigzag scrambling, DNA coding, SHA-256, and Hopfield networks: A practical VLC system implementation. Multimed. Tools Appl. 2024, 84, 27091–27125.
39. Nair, A.M.; Santhosh, R. Privacy and Integrity Verification Model with Decentralized ABE and Double Encryption Storage Scheme. Int. J. Adv. Comput. Sci. Appl. (IJACSA) 2023, 14, 465–472.
40. Lin, W.; Shi, F.; Zeng, L.; Li, D.; Xu, Y.; Liu, B. Survey of Federated Learning Open-Source Frameworks. J. Comput. Res. Dev. 2023, 60, 1551–1580.
41. Jia, Y.; Yun, C.; Wu, J. Modeling and implementation of classification rule discovery by ant colony optimisation for spatial land-use suitability assessment. Comput. Environ. Urban Syst. 2010, 35, 308–319.
42. Elaalem, M. A Comparison of Parametric and Fuzzy Multi-Criteria Methods for Evaluating Land Suitability for Olive in Jeffara Plain of Libya. APCBEE Procedia 2013, 5, 405–409.
43. Al-Hanbali, A.; Shibuta, K.; Alsaaideh, B.; Tawara, Y. Analysis of the land suitability for paddy fields in Tanzania using a GIS-based analytical hierarchy process. Geo-Spat. Inf. Sci. 2022, 25, 212–228.
44. Montgomery, B.; Dragićević, S.; Dujmović, J.; Schmidt, M. A GIS-based Logic Scoring of Preference method for evaluation of land capability and suitability for agriculture. Comput. Electron. Agric. 2016, 124, 340–353.
45. Schmitter, P.; Zwart, S.J.; Danvi, A.; Gbaguidi, F. Contributions of lateral flow and groundwater to the spatio-temporal variation of irrigated rice yields and water productivity in a West-African inland valley. Agric. Water Manag. 2015, 152, 286–298.
46. Ahmadi, F.F.; Layegh, N.F. Integration of artificial neural network and geographical information system for intelligent assessment of land suitability for the cultivation of a selected crop. Neural Comput. Appl. 2015, 26, 1311–1320.
47. Bai, D.-F.; Chen, P.-J.; Atzeni, L.; Cering, L.; Li, Q.; Shi, K. Assessment of habitat suitability of the snow leopard (Panthera uncia) in Qomolangma National Nature Reserve based on MaxEnt modeling. Zool. Res. 2018, 39, 373.
48. Dorijan, R.; Mladen, J.; Mateo, G.; Ivan, P.; Oleg, A. Cropland Suitability Assessment Using Satellite-Based Biophysical Vegetation Properties and Machine Learning. Agronomy 2021, 11, 1620.
49. Chen, F.; Jiang, B.; Guo, J. Research on processing tomato planting planning based on optimal control theory. Xinjiang Agric. Sci. 2012, 49, 1949–1954.
50. Luo, D.; Jiang, B. A multi-objective particle swarm-biogeography optimization algorithm for tomato planting planning problem. Comput. Appl. Softw. 2023, 40, 294–299.
51. Zhang, C.; Sun, T.; Wang, Y.; Gao, Z.; Li, T. Effects of sowing date and density on photosynthetic characteristics and yield of winter wheat under drip irrigation. Xinjiang Agric. Sci. 2021, 58, 1971–1980.
52. Vitantonio-Mazzini, L.N.; Borrás, L.; Garibaldi, L.A.; Pérez, D.H.; Gallo, S.; Gambin, B.L. Management options for reducing maize yield gaps in contrasting sowing dates. Field Crops Res. 2020, 251, 107779.
53. Jones, J.W.; Hoogenboom, G.; Porter, C.H.; Boote, K.J.; Batchelor, W.D.; Hunt, L.A.; Wilkens, P.W.; Singh, U.; Gijsman, A.J.; Ritchie, J.T. The DSSAT cropping system model. Eur. J. Agron. 2003, 18, 235–265.
54. Padovan, G.; Martre, P.; Semenov, M.A.; Masoni, A.; Bregaglio, S.; Ventrella, D.; Lorite, I.J.; Santos, C.; Bindi, M.; Ferrise, R.; et al. Understanding effects of genotype × environment × sowing window interactions for durum wheat in the Mediterranean basin. Field Crops Res. 2020, 259, 107969.
55. Zhang, J.; Ma, X.; Xu, Y. Establishment and application of growing climatic suitability indicator of single cropping rice in Anhui Province. Meteorol. Mon. 2013, 39, 88–93.
56. Zhang, Q.; Li, B.; Zhang, Y.; Wang, S. Suitability Evaluation of Crop Variety via Graph Neural Network. Comput. Intell. Neurosci. 2022, 2022, 5614974.
57. Zhang, Q.; Wang, K.; Han, Y.; Liu, Z.; Yang, F.; Wang, S.; Zhao, X.; Zhao, C. A crop variety yield prediction system based on variety yield data compensation. Comput. Electron. Agric. 2022, 203, 107460.
58. Li, C.; Li, H.; Li, J.; Lei, Y.; Li, C.; Manevski, K.; Shen, Y. Using NDVI percentiles to monitor real-time crop growth. Comput. Electron. Agric. 2019, 162, 357–363.
59. Lu, Z.; Luo, M.; Tan, C.; Xu, F.; Liang, S.; Yang, X. Monitoring and evaluation of winter wheat growth based on analysis of vegetation index changes on remote sensing images. J. Triticeae Crops 2020, 10, 1257–1264.
60. Zhang, H.; Li, W.-G.; Zhang, X.-D.; Li, W.; Ma, T.-H.; Han, Z.-Q. Estimation of stem and tiller number of winter wheat in field based on optimization of multiple remote sensing spectral index. J. Triticeae Crops 2023, 43, 391–398.
61. Zhao, Y.; Wang, X.; Guo, Y.; Hou, X.; Dong, L. Winter Wheat Phenology Variation and Its Response to Climate Change in Shandong Province, China. Remote Sens. 2022, 14, 4482.
62. Alvarez-Hess, P.S.; Thomson, A.L.; Karunaratne, S.B.; Douglas, M.L.; Wright, M.M.; Heard, J.W.; Jacobs, J.L.; Morse-McNabb, E.M.; Wales, W.J.; Auldist, M.J. Using multispectral data from an unmanned aerial system to estimate pasture depletion during grazing. Anim. Feed. Sci. Technol. 2021, 275, 114880.
63. Näsi, R.; Viljanen, N.; Kaivosoja, J.; Alhonoja, K.; Hakala, T.; Markelin, L.; Honkavaara, E. Estimating Biomass and Nitrogen Amount of Barley and Grass Using UAV and Aircraft Based Spectral and Photogrammetric 3D Features. Remote Sens. 2018, 10, 1082.
64. Zhao, J.; Ding, Y.; Du, M.; Liu, W.; Zhu, H.; Li, G.; Yang, J. Vegetation coverage inversion of alpine grassland in the source of the Yellow River based on unmanned aerial vehicle and machine learning. Sci. Technol. Eng. 2021, 21, 10209–10214.
65. Jiang, H.; Chai, L.; Jia, K.; Liu, J.; Yang, S.; Zheng, J. Estimation of water content for short vegetation based on PROSAIL model and vegetation water indices. J. Remote Sens. 2021, 25, 1025–1036.
66. Xing, H.; Li, Z.; Xu, X.; Feng, H.; Yang, G.; Chen, Z. Multi-assimilation methods based on AquaCrop model and remote sensing data. Trans. Chin. Soc. Agric. Eng. 2017, 33, 183–192.
67. Xie, Y.; Wang, P.; Wang, L.; Zhang, S.; Li, L.; Liu, J. Estimation of wheat yield based on crop and remote sensing assimilation models. Trans. Chin. Soc. Agric. Eng. 2016, 32, 179–186.
68. Yang, Q.; Shi, L.; Han, J.; Zha, Y.; Zhu, P. Deep convolutional neural networks for rice grain yield estimation at the ripening stage using UAV-based remotely sensed images. Field Crops Res. 2019, 235, 142–153.
69. Chlingaryan, A.; Sukkarieh, S.; Whelan, B. Machine learning approaches for crop yield prediction and nitrogen status estimation in precision agriculture: A review. Comput. Electron. Agric. 2018, 151, 61–69.
70. Ihuoma, S.O.; Madramootoo, C.A. Recent advances in crop water stress detection. Comput. Electron. Agric. 2017, 141, 267–275.
71. Ebrahim, B.; Sidike, P.; Nahian, S.; Devabhaktuni, V.K.; Markus, T. Estimation of root zone soil moisture from ground and remotely sensed soil information with multisensor data fusion and automated machine learning. Remote Sens. Environ. 2021, 260, 112434.
72. Cheng, M.; Jiao, X.; Liu, Y.; Shao, M.; Yu, X.; Bai, Y.; Wang, Z.; Wang, S.; Nuremanguli, T.; Shuaibing, L.; et al. Estimation of soil moisture content under high maize canopy coverage from UAV multimodal data and machine learning. Agric. Water Manag. 2022, 264, 107530.
73. Chen, L.; Xing, M.; He, B.; Wang, J.; Shang, J.; Huang, X.; Xu, M. Estimating Soil Moisture Over Winter Wheat Fields During Growing Season Using Machine-Learning Methods. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 2021, 14, 3706–3718.
74. Kia, P.J.; Far, A.T.; Omid, M.; Alimardani, R.; Naderloo, L. Intelligent control based fuzzy logic for automation of greenhouse irrigation system and evaluation in relation to conventional systems. World Appl. Sci. J. 2009, 6, 16–23.
75. Giusti, E.; Marsili-Libelli, S. A Fuzzy Decision Support System for irrigation and water conservation in agriculture. Environ. Model. Softw. 2015, 63, 73–86.
76. Delgoda, D.; Malano, H.; Saleem, S.K.; Halgamuge, M.N. Irrigation control based on model predictive control (MPC): Formulation of theory and validation using weather forecast data and AQUACROP model. Environ. Model. Softw. 2016, 78, 40–53.
77. Chao, S.; Han, C.W.; Duncan, S.A.; Fengqi, Y. Robust Model Predictive Control of Irrigation Systems With Active Uncertainty Learning and Data Analytics. IEEE Trans. Control. Syst. Technol. 2019, 28, 1493–1504.
78. Yuan, L.; Zhang, J.-C.; Zhao, J.-L.; Huang, W.-J.; Wang, J.-H. Differentiation of yellow rust and powdery mildew in winter wheat and retrieving of disease severity based on leaf level spectral analysis. Spectrosc. Spectr. Anal. 2013, 33, 1608–1614.
79. Wang, J.; Jing, Y.; Huang, W.; Zhang, J.; Zhao, J.; Zhang, Q.; Wang, L. Comparative research on estimating the severity of yellow rust in winter wheat. Spectrosc. Spectr. Anal. 2015, 35, 1649–1653.
80. Yuan, L.; Zhang, J.; Shi, Y.; Nie, C.; Wei, L.; Wang, J. Damage Mapping of Powdery Mildew in Winter Wheat with High-Resolution Satellite Image. Remote Sens. 2014, 6, 3611–3623.
81. Azadbakht, M.; Ashourloo, D.; Aghighi, H.; Radiom, S.; Alimohammadi, A. Wheat leaf rust detection at canopy scale under different LAI levels using machine learning techniques. Comput. Electron. Agric. 2019, 156, 119–128.
82. Qian, P.; Maofang, G.; Pingbo, W.; Jingwen, Y.; Shilei, L. A Deep-Learning-Based Approach for Wheat Yellow Rust Disease Recognition from Unmanned Aerial Vehicle Images. Sensors 2021, 21, 6540.
83. Chao, Q.; Murilo, S.; Jesper, C.W.; Ea, H.R.S.; Merethe, B.; Erik, A.; Junfeng, G. In-field classification of the asymptomatic biotrophic phase of potato late blight based on deep learning and proximal hyperspectral imaging. Comput. Electron. Agric. 2023, 205, 107585.
84. Mao, Y.; Gong, H. Research on corn disease identification based on fusion of multiple features based on SVM and DS evidence theory. J. Chin. Agric. Mech. 2020, 41, 152–157.
85. Zhang, X.; Qiao, Y.; Meng, F.; Fan, C.; Zhang, M. Identification of maize leaf diseases using improved deep convolutional neural networks. IEEE Access 2018, 6, 30370–30377.
86. Xu, J.; Shao, M.; Wang, Y.; Han, W. Maize disease image recognition using matrix neural network based on transfer learning. Trans. Chin. Soc. Agric. Mach. 2020, 51, 230–236+253.
87. Zhang, Z.; Zhang, Z.; Jin, Y.; Chen, B.; Brown, P. California Almond Yield Prediction at the Orchard Level With a Machine Learning Approach. Front. Plant Sci. 2019, 10, 809.
88. Fu, Z.; Jiang, J.; Gao, Y.; Krienke, B.; Wang, M.; Zhong, K.; Cao, Q.; Tian, Y.; Zhu, Y.; Cao, W.; et al. Wheat Growth Monitoring and Yield Estimation based on Multi-Rotor Unmanned Aerial Vehicle. Remote Sens. 2020, 12, 508.
89. Filippi, P.; Jones, E.J.; Wimalathunge, N.S.; Somarathna, P.D.; Pozza, L.E.; Ugbaje, S.U.; Jephcott, T.G.; Paterson, S.E.; Whelan, B.M.; Bishop, T.F.A. An approach to forecast grain crop yield using multi-layered, multi-farm data sets and machine learning. Precis. Agric. 2019, 20, 1015–1029.
90. Nevavuori, P.; Narra, N.; Lipping, T. Crop yield prediction with deep convolutional neural networks. Comput. Electron. Agric. 2019, 163, 104859.
91. Qiao, M.; He, X.; Cheng, X.; Li, P.; Luo, H.; Zhang, L.; Tian, Z. Crop yield prediction from multi-spectral, multi-temporal remotely sensed imagery using recurrent 3D convolutional neural networks. Int. J. Appl. Earth Obs. Geoinf. 2021, 102, 102436.
92. Liu, Y.; Zhao, D.; Zhao, J.; Liu, J.; Zhang, H.; Ma, X.L.; Wang, P.; Wang, D. Influence of the operating parameters of an unmanned pine seeding system on the final seeding uniformity. Ind. Crops Prod. 2023, 205, 117439.
93. Bazrafkan, A.; Navasca, H.; Kim, J.H.; Morales, M.; Johnson, J.P.; Delavarpour, N.; Fareed, N.; Bandillo, N.; Flores, P. Predicting Dry Pea Maturity Using Machine Learning and Advanced Sensor Fusion with Unmanned Aerial Systems (UASs). Remote Sens. 2023, 15, 2758.
94. Han, S.; Liu, J.; Zhou, G.; Jin, Y.; Zhang, M.; Xu, S. InceptionV3-LSTM: A deep learning net for the intelligent prediction of rapeseed harvest time. Agronomy 2022, 12, 3046.
95. Singh, K.D.; Duddu, H.S.; Vail, S.; Parkin, I.; Shirtliffe, S.J. UAV-based hyperspectral imaging technique to estimate canola (Brassica napus L.) seedpods maturity. Can. J. Remote Sens. 2021, 47, 33–47.
96. Ding, C.; Wang, L.; Chen, X.; Yang, H.; Huang, L.; Song, X. A blockchain-based wide-area agricultural machinery resource scheduling system. Appl. Eng. Agric. 2023, 39, 1757.
97. Cao, R.; Li, S.; Ji, Y.; Zhang, Z.; Xu, H.; Zhang, M.; Li, M.; Li, H. Task assignment of multiple agricultural machinery cooperation based on improved ant colony algorithm. Comput. Electron. Agric. 2021, 182, 105993.
98. Liang, Z. Optimization of Agricultural Machinery Task Scheduling Algorithm Based on Multiobjective Optimization. J. Sens. 2022, 2022, 5800332.
99. Li, S.; Zhang, M.; Wang, N.; Cao, R.; Zhang, Z.; Ji, Y.; Li, H.; Wang, H. Intelligent scheduling method for multi-machine cooperative operation based on NSGA-III and improved ant colony algorithm. Comput. Electron. Agric. 2023, 204, 107532.
100. Khan, S.; Abbas, A.; Gabriel, H.; Rana, T.; Robinson, D. Hydrologic and economic evaluation of water-saving options in irrigation systems. Irrig. Drain. J. Int. Comm. Irrig. Drain. 2008, 57, 1–14.
101. Ping, L.; Lisha, F.; Kaiming, J. Evaluation of Economic Benefits of Xiangfeng Peony. Sci. Silvae Sin. 2019, 55, 167–174.
102. Kamali, F.P.; Meuwissen, M.P.; de Boer, I.J.; van Middelaar, C.E.; Moreira, A.; Lansink, A.G.O. Evaluation of the environmental, economic, and social performance of soybean farming systems in southern Brazil. J. Clean. Prod. 2017, 142, 385–394.
103. Wang, G.; Bu, C.; Feng, W. Evaluating Ecological Benefits of “Danzhi” Project Based on TOPSIS Model with Grey Correlation. Bull. Soil Water Conserv. 2019, 39, 189–193.
104. Xu, Y.; Zhang, R. Comprehensive benefit evaluation of edible fungus cultivation in Hebei Province based on factor analysis combined with entropy method. North. Hortic. 2022, 3, 138–144.

Figure 1. Basic Framework for Research on Agricultural Big Data and Intelligent Decision Making in Smart Farms. The architecture integrates data governance with decision models across the production cycle. The purple arrows illustrate the data flow and closed-loop feedback mechanism, where “Standardized Data Support” feeds into decision models, and “Optimization Feedback” connects post-harvest evaluation back to pre-season planning to guide future operations.

Figure 2. Comparative Evaluation of Four Core Tasks in Pre-season Cultivation Planning.

Figure 3. Structured Workflow of Mid-Season Cultivation Management.

Figure 4. Structured Workflow of Post-harvest Benefit Evaluation.

Figure 5. Sankey Diagram of Technical and Modeling Linkages Across the Agricultural Decision-Making Pipeline.

Table 1. The Concept of Smart Farms.

Year	Concept
2021	Zhao conceptualizes the smart farm as a specific practical form of smart agriculture, characterized primarily by unmanned or minimally manned operations [10]. By leveraging agricultural sensors, the Internet of Things (IoT), big data, artificial intelligence (AI), and other advanced technologies, smart farms can achieve fully automated operations throughout the entire agricultural process—from cultivation to harvest.
2021	According to Nahina, smart farms integrate advanced technologies with traditional farming practices to enhance both the quality and quantity of agricultural production while significantly reducing input usage [11].
2023	Hong and Yang describe the smart farm as a highly efficient production system that leverages advanced information technologies to intelligently control all aspects of agricultural production, enabling automation and refined management throughout the process [12].
2025	According to Dutta, smart farms enhance resource efficiency and automate environmental controls by leveraging real-time monitoring, data-driven decision-making, and automation to ensure consistent, high-yield crop production [13].

Table 2. Smart Farm Data Storage Technology Selection and Characterization.

Storage Type	Technology	Core Advantages	Typical Application Scenarios
Time Series Database	TDengine	High-throughput writing, time series compression	Sensor data streams, weather monitoring
Object Storage	MinIO	High-concurrency access, unbounded scalability	Remote sensing imagery, drone orthomaps
Relational Database	MySQL/PostgreSQL	ACID transaction support, complex query optimization	Agricultural machinery scheduling records, production material ledgers
Memory Cache	Redis/Alluxio	Microsecond-level response, real-time data acceleration	Pest and disease warning, irrigation decision model parameters
Offline Storage	HDFS	Low-cost archiving, massive data fault tolerance	Historical survey data, reanalysis datasets

Table 3. Comparative Analysis of Modeling Approaches for Pre-Season Cultivation Planning.

Decision Type	Model Category	Key Techniques	Strengths	Limitations
Suitability Assessment	Linear/Non-linear Evaluation	AHP; Delphi; Fuzzy Mathematics	Intuitive and easy to implement; integrates expert knowledge.	Prone to subjective bias; relies on manual weight allocation.
Suitability Assessment	Machine Learning	ANN; MaxEnt; Random Forest (RF)	Handles complex non-linear relationships; robust for small samples.	Low interpretability (black-box nature); dependent on data quality.
Planting Plan	Mathematical Modeling	Optimal Control Theory; Dynamic Models	Logically clear framework; provides quantitative basis.	Dependent on precise parameters; limited flexibility in uncertain environments.
Planting Plan	Intelligent Optimization	MOPSO; Genetic Algorithms	Strong global search for multi-objective problems; adaptable to dynamic conditions.	High computational cost; requires complex parameter tuning.
Sowing Plan	Field Trials (Empirical)	Field Experiments; Yield Comparison	Provides direct regional guidance; observes real-world interactions.	Time-consuming; high cost; limited generalizability across regions.
Sowing Plan	Mechanistic Simulation	Crop Growth Models (e.g., CERES-Wheat)	Reduces research cost; extends spatio-temporal analysis scope.	Complex structure requires detailed calibration; may miss extreme weather impacts.
Variety Recommendation	Traditional Statistical	PCA; Fuzzy Mathematics	Logically sound framework using statistical principles.	Indicator selection and weighting are prone to subjective bias.
Variety Recommendation	Machine Learning	GCN; Random Forest; Transfer Learning	Data-driven discovery of latent associations; high accuracy.	Low model transparency; needs large datasets or transfer learning support.
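
The manual weight allocation that Table 3 lists as a limitation of AHP-based suitability assessment can be made concrete with a short sketch. The example below derives criterion weights from a pairwise comparison matrix using the row geometric-mean approximation and reports Saaty's consistency ratio. The function name `ahp_weights`, the three criteria (soil, climate, terrain), and the judgment values are hypothetical and are not taken from the reviewed studies.

```python
import math

# Saaty's random consistency index; only matrix orders 1 to 5 are covered here.
RANDOM_INDEX = {1: 0.0, 2: 0.0, 3: 0.58, 4: 0.90, 5: 1.12}


def ahp_weights(pairwise):
    """Return (weights, consistency_ratio) for a reciprocal pairwise matrix."""
    n = len(pairwise)
    # Row geometric means approximate the principal eigenvector.
    geo = [math.prod(row) ** (1.0 / n) for row in pairwise]
    total = sum(geo)
    weights = [g / total for g in geo]
    # Estimate lambda_max as the mean of (A w)_i / w_i over all rows.
    aw = [sum(pairwise[i][j] * weights[j] for j in range(n)) for i in range(n)]
    lambda_max = sum(aw[i] / weights[i] for i in range(n)) / n
    ci = (lambda_max - n) / (n - 1) if n > 2 else 0.0
    ri = RANDOM_INDEX[n]
    cr = ci / ri if ri > 0 else 0.0
    return weights, cr


# Hypothetical expert judgments: soil vs. climate vs. terrain.
pairwise = [
    [1.0, 2.0, 4.0],
    [0.5, 1.0, 2.0],
    [0.25, 0.5, 1.0],
]
weights, cr = ahp_weights(pairwise)
print([round(w, 3) for w in weights], round(cr, 3))
```

A consistency ratio below about 0.1 is the conventional threshold for accepting the judgments. The weights still depend entirely on the expert-supplied matrix, which is the source of the subjective bias noted in the table.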

Table 4. Comparative Analysis of Modeling Approaches for Mid-Season Cultivation Management.

Decision Module	Model Category	Key Techniques	Strengths	Limitations
Seedling Monitoring & Variable Fertilization	Remote Sensing (Empirical)	UAV-based fusion; SVR; Random Forest	High spatial–temporal resolution; efficient computation for specific conditions.	Limited generalizability; heavy reliance on large training datasets.
	Radiative Transfer Models	PROSAIL; Physical Optics Models	High generalization based on physical mechanisms; robust across environments.	High complexity in model inversion; requires multiple difficult-to-obtain parameters.
	Data Assimilation	Crop Growth Models + Particle Swarm Optimization (PSO)	Provides mechanistic support; enhances estimation accuracy of state variables.	Algorithmic complexity is high; lacks real-time performance for field deployment.
	Deep Learning/RL	CNN-LSTM; Deep Reinforcement Learning (DRL)	Autonomously extracts spatiotemporal features; optimizes strategies dynamically.	“Black box” nature (low interpretability); high computational burden.
Moisture Sensing & Irrigation Strategy	Physical Models	Microwave/Thermal Infrared Remote Sensing	High precision; grounded in physical scattering/emission mechanisms.	Requires complex parameterization (roughness, texture); limited scalability.
	Machine Learning	AutoML; ANN; SVM; PLSR	Efficiently handles non-linear spectral relationships; bypasses complex parameter inputs.	Dependent on labeled data; lacks physical interpretability.
	Fuzzy Logic Control	Rule-based Systems	Improves adaptability by integrating environmental factors; intuitive logic.	Relies on expert-defined rules; lacks global optimization capability.
	Model Predictive Control (MPC)	Robust MPC (RMPC); DDRMPC	Enables rolling-horizon optimization; handles uncertainty through feedback correction.	Often overestimates rainfall utilization (ignores infiltration limits); computationally demanding.
Pest/Disease Monitoring & Precision Application	Statistical Models	Fisher’s LDA; PLSR	Robust against multicollinearity (PLSR); established theoretical basis.	Constrained by linear assumptions; low generalizability across regions.
	Traditional Machine Learning	SVM; ANN; Random Forest	Higher accuracy than statistical methods; capable of non-linear classification.	Relies heavily on manual feature engineering; limited automation.
	Deep Learning (Vision)	CNN; ResNet; PSPNet; Transfer Learning	Automatic feature extraction; high precision in pattern recognition.	Requires massive labeled datasets; high hardware costs.
	Intelligent Diagnosis	Triplet-loss CNN; ResNeXt	End-to-end learning for disease type identification.	Limited ability to quantify disease severity levels for variable-rate control.
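
The rolling-horizon optimization and feedback correction attributed to MPC in Table 4 can be sketched with a toy irrigation controller. This is a minimal illustration under strong assumptions, not the RMPC or DDRMPC formulations of the reviewed studies: soil water follows a single-bucket balance in millimetres, irrigation is restricted to a few discrete depths, and the plan is found by brute-force enumeration. All names (`simulate`, `mpc_step`, `water_cost`) and numbers are hypothetical.

```python
from itertools import product


def simulate(theta, irrigation, rain, et, capacity):
    """One-step bucket water balance (mm); water above capacity is lost."""
    return min(capacity, max(0.0, theta + irrigation + rain - et))


def mpc_step(theta, rain_fc, et_fc, target, capacity,
             levels=(0.0, 10.0, 20.0), water_cost=0.05):
    """Choose today's irrigation by enumerating every plan over the horizon."""
    horizon = len(rain_fc)
    best_plan, best_cost = None, float("inf")
    for plan in product(levels, repeat=horizon):
        state, total = theta, 0.0
        for u, rain, et in zip(plan, rain_fc, et_fc):
            state = simulate(state, u, rain, et, capacity)
            # Penalise deviation from the target storage and the water applied.
            total += (state - target) ** 2 + water_cost * u
        if total < best_cost:
            best_plan, best_cost = plan, total
    # Apply only the first move; the plan is recomputed at the next step.
    return best_plan[0]


# Hypothetical three-day forecast with no rain and 5 mm/day evapotranspiration.
today = mpc_step(theta=40.0, rain_fc=[0.0, 0.0, 0.0], et_fc=[5.0, 5.0, 5.0],
                 target=60.0, capacity=100.0)
print(today)
```

Re-running `mpc_step` each day with the measured soil water and an updated forecast provides the feedback correction. The capacity cap in `simulate` is a crude stand-in for the infiltration limits whose omission the table identifies as a cause of overestimated rainfall utilization.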

Table 5. Comparative Analysis of Modeling Approaches for Post-harvest Benefit Evaluation.

Decision Type	Model Category	Key Techniques	Strengths	Limitations
Production Forecasts	Physical Simulation	Crop Growth Models (Process-based)	Provides biologically interpretable insights into crop development.	Requires extensive field data for calibration; high computational complexity limits scalability.
	Statistical/Machine Learning	Random Forest; Multiple Regression	Scalable for large areas; does not require detailed biophysical parameters.	Lacks mechanistic explanation; relies on historical data correlations.
	Deep Spatio-temporal Models	3D CNN + LSTM; Spatial–Spectral–Temporal Nets	Automatically learns complex spatial and temporal patterns; high prediction accuracy.	“Black box” nature; requires massive datasets; potential overfitting without sufficient data.
Harvest Timing & Machinery Dispatch	Harvest Timing: Physical Models	Physiological Mechanism Models	Simulates interactions between growth cycles and environment.	Struggles to adapt to complex terrains and diverse climatic conditions; requires calibration.
	Harvest Timing: Data-driven	CNN-LSTM; Multi-source Remote Sensing Fusion	High prediction efficiency; improves cross-regional generalization and accuracy (>96%).	Dependent on high-quality multimodal data (spectral + meteorological).
	Machinery Dispatch: Intelligent Optimization	Improved Ant Colony (ACO); NSGA-III	Efficiently solves complex vehicle routing problems (VRP); minimizes waiting time and distance.	Often based on idealized assumptions; neglects real-world constraints (e.g., machine availability, varying field conditions).
Holistic Performance Assessment	Single-criterion/Single-method	AHP; Entropy Weighting; LCA	Intuitive and simple to implement for specific metrics (e.g., economic or ecological).	Susceptible to subjective bias (AHP) or data uncertainty; limited generalizability.
Holistic Performance Assessment	Composite/Hybrid Evaluation	AHP + Entropy; TOPSIS + Grey Relational	Balances subjective expert judgment with objective data weighting; enhances robustness.	Methodological complexity is higher; relies on the availability of multi-dimensional indicator data.
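
To illustrate the objective weighting and ranking steps behind the composite evaluations in Table 5, the sketch below pairs entropy weighting with TOPSIS. It is a simplified illustration, not a method reported by any cited study: the grey relational extension is omitted, entropy is computed on raw positive indicator values, and at least one indicator must vary across alternatives. The function names and the three farming schemes with their indicator values are hypothetical.

```python
import math


def entropy_weights(matrix):
    """Entropy-method weights; indicators with more dispersion weigh more."""
    m, n = len(matrix), len(matrix[0])
    raw = []
    for j in range(n):
        col = [row[j] for row in matrix]
        total = sum(col)
        p = [v / total for v in col]
        # Entropy normalised to [0, 1]; terms with p = 0 contribute nothing.
        e = -sum(pi * math.log(pi) for pi in p if pi > 0) / math.log(m)
        raw.append(1.0 - e)
    s = sum(raw)  # assumes at least one indicator varies, so s > 0
    return [w / s for w in raw]


def topsis(matrix, weights, benefit):
    """Relative closeness to the ideal solution (higher is better)."""
    n = len(matrix[0])
    norms = [math.sqrt(sum(row[j] ** 2 for row in matrix)) for j in range(n)]
    v = [[weights[j] * row[j] / norms[j] for j in range(n)] for row in matrix]
    cols = [[r[j] for r in v] for j in range(n)]
    best = [max(c) if b else min(c) for c, b in zip(cols, benefit)]
    worst = [min(c) if b else max(c) for c, b in zip(cols, benefit)]
    scores = []
    for r in v:
        d_best, d_worst = math.dist(r, best), math.dist(r, worst)
        scores.append(d_worst / (d_best + d_worst))  # assumes rows differ
    return scores


# Hypothetical schemes: net income, yield (benefit) and water use (cost).
matrix = [
    [1200.0, 8.0, 300.0],
    [1500.0, 8.5, 420.0],
    [900.0, 7.0, 250.0],
]
benefit = [True, True, False]
weights = entropy_weights(matrix)
scores = topsis(matrix, weights, benefit)
print([round(w, 3) for w in weights], [round(s, 3) for s in scores])
```

In a hybrid scheme of the kind listed in the table, the entropy weights would typically be blended with expert-derived AHP weights before the TOPSIS step. How the two are combined is a modelling choice this sketch leaves open.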

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

Share and Cite

MDPI and ACS Style

Qin, C.; Zhao, P.; Qian, Y.; Yang, G.; Hao, X.; Mei, X.; Yang, X.; He, J. A Comprehensive Review of Big Data Intelligent Decision-Making Models for Smart Farms. Agronomy 2025, 15, 2898. https://doi.org/10.3390/agronomy15122898


Note that from the first issue of 2016, this journal uses article numbers instead of page numbers.
