The results of this systematic review include an analysis of related work, the systematization of relevant findings in a bibliographic matrix, and a bibliometric analysis.
3.1. Related Works
During the evaluation of the retrieved publications, one exclusion criterion was whether the document was a review or survey-type study. Although excluded from the main analysis, some of these publications are considered relevant and are included in this section as background. Literature reviews play a crucial role in consolidating the available knowledge on agricultural yield prediction using artificial intelligence and its link to crop management practices. Numerous studies emphasize the significance of accurate climate forecasts, especially in a changing climate. In addition, the use of technologies such as the Internet of Things (IoT) and artificial intelligence can significantly enhance the estimation of agricultural production. Factors such as climate, soil quality, and satellite imagery are key variables for more accurate predictions. Likewise, the authors emphasize that to achieve reliable results, it is crucial to conduct continuous and up-to-date research [14].
Agriculture plays a central role as a source of livelihood for millions of people worldwide but faces multiple challenges such as water scarcity, supply–demand imbalances and increasing climate instability. In response to these threats, the implementation of smart and data-driven agricultural practices has been proposed as a viable solution. In this context, decreases in crop yields are often associated with the deterioration of irrigation infrastructure, declining soil fertility, and the inefficient use of cultivation techniques [3]. For this reason, the use of machine learning–based approaches, such as Random Forest and Support Vector Regression, is increasingly recommended, as these models demonstrate greater accuracy and robustness than traditional linear regression methods [10, 11]. Another point emphasized in the literature is the relevance of models that evaluate the impact of climate variability on crop yields, which directly influence agricultural production. These models allow for assessing the effectiveness of the control measures implemented and for developing strategies to enhance productivity and crop intensity. In this context, predictive tools for drought, soil quality, or yield estimation based on agroclimatic indices have become increasingly important, as they can significantly contribute to global food security [3]. The adoption of efficient and scalable models supports the transition to climate-smart agriculture and enables the replication of experimental findings in other regions. Several of the studies reviewed provide detailed analyses of machine learning algorithms applied to the modeling of these indices. Areas such as yield prediction, crop monitoring, soil quality assessment, and the estimation of variables like evapotranspiration, precipitation, drought, and pest appearance represent fields where AI techniques have shown remarkable potential [10, 11].
From a climate change perspective, yield prediction represents an essential component of global food security, as it enables proactive responses to potential yield losses caused by environmental stressors. Recent studies have demonstrated that machine learning techniques, particularly Random Forest and Deep Neural Networks, are highly effective when incorporating variables such as climate, soil properties, and fertilizer use [10, 11, 15]. A growing body of literature also highlights that the accuracy of these models depends on the quality and diversity of the selected input features, which determine their predictive reliability across agroecological zones. These predictions, when reliable, provide farmers and policymakers with a solid foundation for making management decisions aimed at preventing production shortages [15]. Machine learning has been widely implemented in smart agriculture, where it is employed to forecast crop yields, monitor field conditions, and evaluate interactions between crop performance and climate variability [15, 16, 17]. Tools integrated with remote sensing and Internet of Things (IoT) devices, such as unmanned aerial vehicles (UAVs) and satellite imagery, enable continuous observation of nutrient levels and soil moisture, improving yield estimation accuracy. In terms of predictive modeling, techniques such as Artificial Neural Networks (ANNs), Convolutional Neural Networks (CNNs), Deep Neural Networks (DNNs), and Random Forest have been identified as the most effective and frequently used [16, 17, 18]. However, one of the main challenges reported in the literature is the limited availability of large, high-quality training datasets, which restricts model generalization and reduces accuracy in real-world agricultural scenarios [17]. Addressing this limitation is essential to strengthen the operational deployment of AI-driven decision-support systems for sustainable agriculture.
These reviews offer a thorough overview of the current knowledge about AI-driven agricultural yield prediction. They serve as a strong foundation for developing new methods and for identifying challenges and emerging opportunities in this area. Additionally, they act as a guide for upcoming research and help advance science in applying AI to the agricultural sector.
3.2. Bibliographic Matrix
The bibliographic review that supports this study was built using the Tree of Science (ToS) tool, which facilitated the identification and selection of the most representative scientific articles in the field of artificial intelligence applied to crop yield prediction. The ToS methodology organizes publications according to their relevance and citation structure into roots (foundational works), trunks (conceptual developments), and leaves (recent innovations), allowing a systematic visualization of how knowledge in the field has evolved [18]. This selection focused on studies incorporating climatic, edaphic, and crop management variables to ensure a comprehensive representation of the domain. Additionally, to carry out the corresponding bibliometric analysis, the specialized tools VOSviewer (version 1.6.20) and Bibliometrix (version 4.2.1) were used, the results of which are presented in later sections of the document. The selected articles were organized into a bibliographic matrix that summarizes the main data, methods, and contributions of each study. Table 3, Table 4 and Table 5 synthesize this selection [18, 19, 20].
Table 3 presents an overview of studies that employ climatic data such as temperature, precipitation, solar radiation, and humidity as primary inputs for crop yield prediction using artificial intelligence (AI) techniques. These models rely on the direct relationships between weather variability and crop productivity and have demonstrated their capacity to capture the temporal dynamics that affect yield formation. The research compiled under this category reveals a consistent trend toward the use of machine learning algorithms (e.g., Random Forest, Gradient Boosting, and Support Vector Machines). Such models have been particularly successful in regions where meteorological data are abundant but information on soil or management practices is limited, enabling the development of robust and scalable predictive frameworks.
The results synthesized in Table 3 indicate that climatic factors alone can provide substantial explanatory power for crop yield prediction, especially when temporal resolution is high and spatial heterogeneity is moderate. Studies employing ensemble and recurrent neural network approaches frequently reported high predictive accuracy (e.g., R² > 0.80, RMSE < 10%) for major cereals such as wheat, maize, and rice. Notably, temperature and precipitation emerge as the most influential drivers, followed by solar radiation and evapotranspiration indices. The application of hybrid models combining regression and tree-based algorithms shows potential to improve generalization across climatic zones; however, several works underline persistent challenges related to the quality, continuity, and local representativeness of weather datasets, which may lead to model overfitting or reduced transferability. In summary, this group of studies confirms that climate-driven AI models form a reliable first step for yield prediction, while also emphasizing the need to incorporate complementary variables (soil fertility, management practices, phenology) to achieve greater predictive robustness in future research.
Table 4 compiles studies that employ remote sensing data obtained from satellite platforms such as Sentinel, MODIS, Landsat, and UAV-based sensors in combination with artificial intelligence algorithms to estimate crop yield. These approaches represent a significant evolution in yield prediction, as they incorporate spatial and spectral information that captures vegetation health, canopy structure, and phenological stages across large geographic areas. The increasing availability of multispectral and radar datasets has enabled the integration of vegetation indices (e.g., NDVI, EVI, SAVI, SIF), soil moisture indicators, and biophysical parameters derived from surface reflectance models, allowing AI-based frameworks to establish high-resolution spatial correlations between crop performance and environmental variability, thereby bridging the gap between field-level observations and regional or national yield forecasting.
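As a brief illustration of how the vegetation indices mentioned above are derived from surface reflectance, the following minimal sketch computes NDVI and SAVI for a handful of pixels. The function names, the sample reflectance values, and the handling of zero-signal pixels are assumptions made for illustration; they are not taken from any of the reviewed studies. The formulas are the standard definitions, NDVI = (NIR - Red) / (NIR + Red) and SAVI = (1 + L)(NIR - Red) / (NIR + Red + L).

```python
def ndvi(nir, red):
    """Normalized Difference Vegetation Index for one pixel."""
    denom = nir + red
    if denom == 0:
        # Assumption: a pixel with no signal in either band is reported as 0.
        return 0.0
    return (nir - red) / denom


def savi(nir, red, soil_factor=0.5):
    """Soil-Adjusted Vegetation Index; soil_factor is the usual L term."""
    return (1 + soil_factor) * (nir - red) / (nir + red + soil_factor)


# Hypothetical (NIR, Red) reflectance pairs for three pixels of one field.
pixels = [(0.50, 0.10), (0.45, 0.12), (0.30, 0.20)]

ndvi_values = [ndvi(nir, red) for nir, red in pixels]
field_mean_ndvi = sum(ndvi_values) / len(ndvi_values)
print(f"mean NDVI over field: {field_mean_ndvi:.3f}")
```

In a yield-prediction setting, per-field summaries such as `field_mean_ndvi`, computed for each acquisition date, would typically become input features for the learning algorithm.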
The studies summarized in Table 4 highlight the transformative role of integrating remote sensing with AI in enhancing predictive accuracy and spatial generalization. Models that combined optical and radar data sources (e.g., Sentinel-1 and Sentinel-2) consistently achieved strong correlations between predicted and observed yields (R² ranging from 0.75 to 0.92), demonstrating the complementary nature of spectral and structural information. Among the most recurrent algorithms, Convolutional Neural Networks (CNNs) and Gradient Boosting methods emerged as particularly effective in handling large-scale, high-dimensional image datasets; CNN-based architectures outperform traditional regressors when temporal image sequences are available, as they can extract spatial features related to crop density, chlorophyll content, and canopy dynamics. Furthermore, the integration of vegetation indices and surface temperature with meteorological variables strengthens the models’ explanatory power by capturing stress conditions associated with droughts, nutrient deficiencies, or phenological delays. However, challenges remain regarding data preprocessing, atmospheric corrections, and image fusion techniques, which can introduce uncertainty and affect model reproducibility. Overall, these studies validate the potential of satellite-integrated AI systems as essential tools for precision agriculture and early yield forecasting, promoting sustainable decision-making at multiple spatial scales.
Table 5 presents studies that implement hybrid modeling approaches, integrating multiple data domains such as climate, soil, remote sensing, and agronomic management with complementary artificial intelligence and simulation techniques. These models combine the strengths of both process-based (mechanistic) and data-driven (machine learning or deep learning) paradigms, enabling more comprehensive representations of agricultural systems and improving prediction accuracy under complex, nonlinear conditions. Hybrid AI frameworks often incorporate outputs from physical crop models (e.g., DSSAT, APSIM, AquaCrop) as additional predictors in machine learning algorithms; others fuse diverse data layers (meteorological, spectral, soil, phenological) to construct highly adaptive predictive systems capable of operating across multiple temporal and spatial scales.
The studies compiled in Table 5 emphasize the growing relevance of hybrid AI frameworks that bridge mechanistic understanding and statistical learning. By integrating physically based models with machine learning algorithms, these approaches achieve higher accuracy and interpretability than standalone methods; for example, models coupling AquaCrop or DSSAT outputs with neural networks or ensemble regressors report improvements of 10–20% in predictive metrics (R²) compared with purely empirical models. Furthermore, deep hybrid architectures, such as combinations of CNN, LSTM, and Gradient Boosting, demonstrate superior capability to capture geotemporal dependencies, especially in heterogeneous agricultural landscapes. These studies underscore the advantages of multi-source data fusion, revealing that the joint use of soil, climate, and management variables reduces uncertainty and enhances transferability to new regions or crop types.
However, hybridization also introduces challenges: model calibration can be computationally intensive, and the harmonization of heterogeneous data formats often requires sophisticated preprocessing pipelines. Nonetheless, the findings collectively point to a paradigm shift toward integrated modeling ecosystems, where AI acts not as a substitute but as a synergistic complement to traditional agronomic simulation. This category of studies establishes a pathway toward explainable and scalable AI systems in agriculture, aligning predictive analytics with sustainability and climate-resilient decision-making.
In addition to the main Table 3, Table 4 and Table 5, Supplementary Table S1 provides a complete record of the 149 references reviewed in this research. It includes full bibliographic information, DOI verification, and detailed notes on the modeling techniques, input variables, and geographical or cultivation contexts analyzed. The purpose of this supplementary dataset is to ensure transparency, reproducibility, and traceability in the review process, enabling other researchers to cross-reference methodologies, datasets, and evaluation metrics across different studies. The structure of Table S1 consolidates the information from the main tables and expands it by including works that, while relevant, do not explicitly present performance metrics (e.g., R² or RMSE) but contribute conceptually to understanding the integration of artificial intelligence in agricultural modeling; this inclusion allows for a more holistic and inclusive mapping of the research landscape, acknowledging both quantitative and methodological contributions.
From an analytical standpoint, the extended database highlights several important patterns: (i) a strong prevalence of machine learning approaches, particularly Random Forest, SVM, and Gradient Boosting, in climate- and soil-based studies; (ii) a growing shift toward deep learning architectures (CNN, LSTM, Transformer-based models) when integrating satellite and multispectral data; (iii) an increasing number of hybrid studies that link physical models (e.g., AquaCrop, DSSAT, APSIM) with AI algorithms, suggesting convergence between mechanistic and data-driven paradigms; and (iv) a persistent challenge across studies, namely the lack of standardized datasets and the limited access to high-quality ground-truth data, which restricts direct model comparison.
In summary, Supplementary Table S1 complements the main analysis by providing a complete and verifiable evidence base, reinforcing the scientific robustness of the review. It serves as a valuable reference for researchers and policymakers seeking to understand how AI has evolved as a driver of innovation and sustainability in agricultural yield prediction. Drawing on this corpus (Supplementary Table S1), we next profile the modeling approaches adopted in the analyzed studies. Machine learning, particularly deep learning, was identified as the predominant computational paradigm for crop yield prediction; among the most widely used models are Artificial Neural Networks (ANNs), Deep Neural Networks (DNNs), and one-dimensional Convolutional Neural Networks (1D-CNNs). Techniques such as Random Forest (RF), Deep Belief Networks (DBNs), Fuzzy Neural Networks (FNNs), and more advanced architectures like Long Short-Term Memory (LSTM) and Geo-Temporal Weighted Neural Networks (GTWNNs) also stand out.
An overview of algorithms by data domain (climate, remote sensing, and hybrid approaches) is provided in Table 3, Table 4 and Table 5. To facilitate direct comparison among the main machine learning algorithms applied to crop yield prediction, Table 6 summarizes representative performance indicators (R², RMSE, and MAE) reported in the analyzed studies. The table groups results by algorithm and highlights the dominant crop types and geographical regions where each method has been applied.
The results in Table 6 show that neural network–based models consistently outperform other algorithms, achieving higher R² and lower RMSE and MAE values. This indicates a strong capacity to capture complex nonlinear interactions among climatic, soil, and management variables, compared with the relatively simpler Random Forest and regression-based approaches.
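For readers comparing the indicators reported across studies, the following minimal sketch shows how R², RMSE, and MAE are computed from observed and predicted yields using their standard definitions. The function names and the sample yield values (in t/ha) are hypothetical and serve only to illustrate the arithmetic.

```python
import math


def r2_score(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1 - ss_res / ss_tot


def rmse(observed, predicted):
    """Root mean squared error, in the same unit as the yields."""
    squared = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def mae(observed, predicted):
    """Mean absolute error, in the same unit as the yields."""
    return sum(abs(o - p) for o, p in zip(observed, predicted)) / len(observed)


# Hypothetical observed and predicted yields (t/ha) for three fields.
observed_yield = [3.0, 5.0, 7.0]
predicted_yield = [2.5, 5.0, 7.5]

print("R2  :", round(r2_score(observed_yield, predicted_yield), 4))
print("RMSE:", round(rmse(observed_yield, predicted_yield), 4))
print("MAE :", round(mae(observed_yield, predicted_yield), 4))
```

Because RMSE and MAE carry the unit of the target variable, values from different crops or regions are only comparable after normalization, whereas R² is unitless but depends on the variance of the observed yields in each dataset.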
The studies analyzed in this systematic review used a variety of data sources, the most recurrent being those associated with climate, soil, and remote sensing observations. Regarding climate data, variables such as temperature, precipitation, and solar irradiance were used, while for the soil component, information from sensors and agricultural databases was employed. Remote sensing tools, such as Landsat 8 and Google Earth Engine, played a fundamental role in providing satellite data with different spatial resolutions that were considered determining variables for prediction quality; additionally, some articles integrated historical crop-yield data and field-sensor measurements. Several studies also considered the temporal and geographical dimension, which improved the accuracy of predictive models by incorporating key spatial and temporal elements into the analysis [46, 47, 48, 49, 50].
Using deep learning models for crop yield prediction has proven effective at learning features directly from input data, overcoming the need to manually design predictors. Models such as LSTM and CNN have shown superior results compared with traditional models, especially for wheat in Germany, although they have limitations in representing extreme temperature and humidity conditions [51]. On the other hand, hybrid models that incorporate climatic, soil, and agricultural management variables have significantly increased prediction accuracy, facilitating strategic decisions for farmers in contexts of high climate variability [52]. The comparison between classical statistical models and machine learning approaches, such as XGBoost and Random Forest, has demonstrated the superior performance of the latter in capturing nonlinear relationships between agricultural and climatic variables [53].
In recent studies, integrating high-resolution satellite imagery and multi-temporal data has enabled accurate estimation of maize yield, even at early cultivation stages, facilitating proactive agricultural management actions [54]. Likewise, explainable artificial intelligence (XAI) has been applied to identify the determinants of yield under climate-change scenarios, improving the understanding of complex agronomic processes [55]. Predictive models based on remote sensing and meteorological data have improved maize-yield prediction by utilizing deep learning techniques and feature selection, resulting in high accuracy in regions such as Iowa and Nebraska [56]; other work shows that integrating multi-source satellite data (Landsat 8 and Sentinel-2) with models such as CatBoost, optimized using Bayesian methods, enables highly accurate prediction of winter-wheat yield up to 40 days in advance [57].
Collaborative approaches, such as federated learning, have also been developed for maize-yield prediction, enabling multiple institutions to train joint models without sharing sensitive data while maintaining the accuracy of the centralized model [58]. In Saudi Arabia, an MLP model optimized with evolutionary algorithms such as the Spider Monkey algorithm has efficiently predicted corn yield with minimal mean-square error [59]. Multistage and multicrop models, such as the multilayer perceptron, have been utilized to evaluate soil suitability for various crops in Canada, resulting in significant reductions in mean absolute error and highlighting the agricultural potential of northern regions in the context of climate change [60]. In variable spatial and temporal contexts, domain-adaptation techniques such as DANN and KLIEP have enabled model transfer across regions with moderate divergence in agroclimatic characteristics while maintaining high accuracy in maize-yield prediction [61]. The combined use of agrometeorological data and remote sensing has achieved outstanding accuracy in estimating tea yield, with automatically optimized deep neural networks outperforming conventional and ensemble models [62]. In the case of rice in Jiangsu province, the combination of optical and radar data has significantly improved yield prediction when integrated into a regression meta-learning framework that is robust to phenological variability [63, 64]. In pre-planting crop-type prediction scenarios, crop-sequence polygon-segmentation methods have handled large data volumes without sacrificing accuracy, achieving superior results across multiple large-scale tests in the United States [64]. Finally, the simulation of water balance in arid conditions using the AquaCrop model has demonstrated its validity in Egypt to adjust irrigation strategies for directly sown rice, achieving high predictive efficiency even under severe water scarcity; although AquaCrop is a process-based model rather than an artificial-intelligence approach, its inclusion is relevant for comparison because it provides valuable insights into water management in arid environments [65]. In addition, the AquaCrop model has been successfully validated to simulate rice yield under different irrigation regimes in arid regions of Egypt. The model showed high accuracy in simulating canopy cover, biomass, evapotranspiration, and soil water balance, highlighting its usefulness as a water management tool in water-scarce regions [66].
Crop yield prediction has reached new levels of accuracy thanks to hybrid approaches that integrate agricultural simulation models, data assimilation techniques, and machine learning. A prominent example is the wheat-yield estimate in the northern plains of China, where incorporating leaf area index and soil moisture into the model yielded a correlation of 0.97 and a mean absolute error of only 1.74%, even with climate forecasts up to 3 months in advance [67]. In the United States, analyzing more than 25 years of field-level data on sweet corn enabled the training of multiple machine learning models; the Random Forest model performed well, with an RMSE of 3.29 Mt/ha, and identified the year of cultivation, geographical location, and seed source as the most influential variables [68]. In a similar vein, another study proposed a prediction system based on meteorological and pesticide records, in which the Gradient Boosting model achieved a coefficient of determination (R²) of 99.99%, surpassing techniques such as K-NN and logistic regression, underscoring the potential of advanced analytics for sustainable agriculture and informed decision-making [69].
Plot-scale prediction has also benefited from the use of multi-temporal imagery captured by UAVs. A 3D convolutional neural network applied to soybean cultivation reached an R² greater than 0.8, demonstrating robustness to variations in lighting and field conditions [70]. In the case of wheat, combining climate data with NDVI and applying feature-selection techniques enabled the identification of critical variables, such as average temperature during the reproductive phase and accumulated precipitation, thereby markedly improving predictive accuracy [33]. Cotton monitoring using drone-captured multispectral imagery showed outstanding performance in predicting biomass and yield with artificial neural networks, achieving an accuracy of over 95% [71]. For rice, a 1D convolutional network with temporal-attention mechanisms achieved greater than 92% accuracy, demonstrating efficacy in capturing complex growth patterns [72].
The integration of optical and radar data, specifically Sentinel-1 and Sentinel-2, with deep neural networks enabled yield estimation with an MAE of less than 0.2 t/ha, exceeding conventional models by more than 30% [73]. Similarly, using the XGBoost algorithm in semi-arid areas for barley-yield prediction showed an R² of 0.88 and stability under extreme weather conditions [74]. In Brazil, a deep learning approach was applied to sugarcane using satellite imagery and historical productivity data, achieving an R² of 0.91 and outperforming linear regression and Gradient Boosted Decision Tree (GBDT) methods [75]. In turn, federated learning was successfully employed for rice prediction, enabling cooperation across regions without compromising data privacy, with results comparable to those of centralized models [76].
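To make the idea of training across regions without pooling raw data more concrete, the sketch below shows one aggregation round of federated averaging (FedAvg), a common aggregation rule in federated learning. It is an assumption that an averaging scheme of this kind is comparable to what the cited studies used; the source does not specify their algorithms. The function name, the parameter vectors, and the sample counts are hypothetical.

```python
def federated_average(client_params, client_sizes):
    """Aggregate locally trained parameter vectors into one global vector.

    Each client's parameters are weighted by its number of local samples,
    so only model parameters and sample counts leave each site, not the data.
    """
    total = sum(client_sizes)
    n_params = len(client_params[0])
    return [
        sum(params[k] * size for params, size in zip(client_params, client_sizes)) / total
        for k in range(n_params)
    ]


# Hypothetical parameters of a tiny linear yield model (slope, intercept)
# trained separately in two regions with different amounts of field data.
region_a = [1.0, 2.0]   # trained on 100 local samples
region_b = [3.0, 4.0]   # trained on 300 local samples

global_params = federated_average([region_a, region_b], [100, 300])
print(global_params)
```

In a full training loop, the aggregated parameters would be sent back to each region, refined on local data, and averaged again over several rounds.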
Potato yield was modeled using Random Forest with agrometeorological and topographic variables, obtaining an MAE of less than 0.5 t/ha; factors such as altitude, slope, and temperature proved decisive [77]. In Argentina, the use of deep neural networks enabled the prediction of soybean yield with an average R² of 0.87, demonstrating good spatial generalization [78]. An LSTM-based architecture applied to wheat adequately captured phenological peaks using NDVI time series, achieving an RMSE of less than 0.3 t/ha [79]. In high-altitude regions, a machine learning system predicted oat yield with 90% accuracy, even with limited historical data [80]. In agricultural transition zones, CNNs applied to multispectral imagery enabled estimation of sorghum yield with an MAE of 0.18 t/ha, which is especially useful in areas with limited access to meteorological sensors [81]. Finally, a system that integrates IoT sensors and deep learning can predict vegetable yield in real-time greenhouse environments, achieving an R² of 0.95 [82].
From the detailed review of the works included, several climatic factors used in agricultural yield-prediction models were identified, with the following standing out: temperature, rain, humidity, solar radiation, accumulated precipitation, wind speed, shortwave radiation, estimated precipitation, and atmospheric pressure. Regarding soil-quality variables, the reviewed studies identified the following characteristics as key factors influencing crop growth and production: soil type classification, detailed soil maps, pH level, fertilization methods, nitrogen content, irrigation application, presence of potassium, zinc concentration, magnesium levels, available sulfur, calcium, and organic carbon. Finally, among the aspects related to agricultural practices included as input variables in the predictive models, the following were identified: the irrigation systems implemented and the fertilization strategies used.
3.4. Multi-Criteria Evaluation of AI Models for Crop Yield Prediction
To strengthen comparative rigor, a Multi-Criteria Decision Making (MCDM) analysis was performed using the TOPSIS method [92], which ranks alternatives by their relative distance to an ideal solution in a multidimensional space. This approach allowed the integration of key performance indicators (reported R², the error metrics RMSE and MAE, and data diversity, measured as the number of sources and variable domains) to classify models according to their relative performance and methodological soundness. The evaluation of crop yield prediction models was expanded through this MCDM approach; Table S1 summarizes the 149 studies analyzed, integrating quantitative indicators such as the coefficient of determination (R²), RMSE, data breadth (number of integrated data sources), and model family (Machine Learning, Deep Learning, or Hybrid).
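The ranking logic of TOPSIS can be sketched as follows, assuming the textbook formulation with vector normalization and Euclidean distances. The criteria weights, the three alternatives, and their indicator values are hypothetical and chosen only to illustrate the procedure; the source does not report the weights or the exact normalization used in the review.

```python
import math


def topsis(matrix, weights, is_benefit):
    """Return the TOPSIS closeness score (0 to 1) of each alternative.

    matrix     : one row per alternative, one column per criterion
    weights    : importance of each criterion (assumed to sum to 1)
    is_benefit : True if higher is better for that criterion, else False
    Assumes no criterion column is all zeros and alternatives are not all identical.
    """
    n_crit = len(weights)
    # Step 1: vector-normalize each column, then apply the criterion weight.
    norms = [math.sqrt(sum(row[j] ** 2 for row in matrix)) for j in range(n_crit)]
    weighted = [[weights[j] * row[j] / norms[j] for j in range(n_crit)] for row in matrix]
    # Step 2: ideal and anti-ideal points, respecting benefit/cost direction.
    columns = list(zip(*weighted))
    ideal = [max(c) if b else min(c) for c, b in zip(columns, is_benefit)]
    anti = [min(c) if b else max(c) for c, b in zip(columns, is_benefit)]
    # Step 3: closeness = distance to anti-ideal / sum of both distances.
    scores = []
    for row in weighted:
        d_ideal = math.sqrt(sum((v - i) ** 2 for v, i in zip(row, ideal)))
        d_anti = math.sqrt(sum((v - a) ** 2 for v, a in zip(row, anti)))
        scores.append(d_anti / (d_ideal + d_anti))
    return scores


# Hypothetical alternatives: columns are R2 (benefit), RMSE (cost), data breadth (benefit).
names = ["Hybrid", "Deep Learning", "Machine Learning"]
indicators = [
    [0.92, 0.30, 4],
    [0.90, 0.40, 3],
    [0.80, 0.60, 1],
]
scores = topsis(indicators, weights=[1 / 3, 1 / 3, 1 / 3], is_benefit=[True, False, True])

for name, score in sorted(zip(names, scores), key=lambda pair: pair[1], reverse=True):
    print(f"{name}: {score:.3f}")
```

An alternative that is best on every criterion coincides with the ideal point and receives a score of 1, while one that is worst on every criterion receives 0; intermediate cases depend on the chosen weights, which is why these should be reported alongside the ranking.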
To visualize the multidimensional relationships among these factors, a bubble map was generated (Figure 10). The horizontal axis represents data breadth (ranging from 1 = single source to 4 = models integrating multiple heterogeneous data sources such as climate, soil, satellite, and management variables); the vertical axis corresponds to predictive accuracy (R²). Bubble size reflects the TOPSIS composite score, which integrates model accuracy, input diversity, and classification quality, while color differentiates the model family (ML, DL, or Hybrid/ML + process-based models). Each bubble represents a study.
The vertical clustering around R² ≈ 0.85–0.95 reveals that most models achieve high predictive accuracy, regardless of the number of data sources. However, the largest bubbles (highest TOPSIS scores) concentrate on the right side (breadth = 3), confirming that models trained with diverse data (weather, soil, satellite, and management) achieve the best overall performance. Hybrid models dominate the upper-right quadrant, reflecting both methodological integration and data richness; deep learning models show moderate dispersion, suggesting sensitivity to data breadth but strong individual performance. In contrast, traditional ML models cluster at lower breadth levels, with competitive yet less robust performance. Overall, the figure demonstrates that data diversity and model hybridization synergistically improve predictive performance and stability, reinforcing the quantitative results of the TOPSIS analysis and revealing a clear upward trend between data integration and model accuracy.
In quantitative terms: (i) Hybrid models combining process-based and deep learning algorithms (e.g., APSIM + DNN, DSSAT + Gradient Boosting, AquaCrop + DL) occupy the upper-right quadrant and indicate superior performance and generalization capacity (R² > 0.90, TOPSIS score > 0.85). (ii) Deep learning architectures (CNN, LSTM, Transformer) also show strong predictive power (average R² ≈ 0.90), particularly when integrating multi-temporal satellite and climatic variables. (iii) Traditional ML models (Random Forest, Decision Tree, SVM) display moderate performance (R² = 0.75–0.85), often limited by narrower data-input ranges (data breadth = 1–2). This quantitative evidence confirms that combining data diversity with model hybridization enhances yield-prediction robustness, supporting the transition toward more adaptive and context-aware modeling frameworks. It also highlights research opportunities in regions where limited climatic or soil data constrain accuracy, underscoring the relevance of open-access geospatial and sensor-based datasets.