1. Introduction
As a key link in the ecosystem carbon cycle, agricultural soil respiration directly mediates the exchange of carbon between the atmospheric and soil carbon pools. Changes in this flux are a core process governing the carbon sink potential and carbon balance of agricultural systems, and they bear directly on the response to global climate change and on sustainable agricultural development [1]. In agricultural ecosystems, soil respiration derives mainly from root respiration and the microbial decomposition of organic matter, and it is regulated by climate, soil, vegetation and human management [2]. Therefore, accurate quantification of temporal and spatial changes in agricultural soil respiration is not only an important basis for understanding the carbon cycle mechanism, but also a key scientific premise for assessing farmland carbon sink function, optimizing management measures, and promoting agricultural carbon neutrality.
With the rapid development of machine learning technology, its application in ecology and environmental science is deepening. Machine learning models can effectively deal with complex nonlinear relationships, integrate multi-source and multi-scale observation data, and significantly improve the simulation and prediction accuracy of key carbon flux processes such as soil respiration [
3]. Under this trend, quantitative research on soil respiration based on machine learning has become a frontier field at the intersection of agriculture and information science, which promotes the transformation of related research from traditional empirical models to data-driven, high-precision and scalable intelligent prediction paradigms [
4], and also provides new methodological support for scientific assessment of soil carbon sinks.
However, there are still some limitations in current research in this field: research topics are relatively concentrated, the geographical distribution is uneven, and the application of methods has not been unified [
5,
6,
7,
8]; most of the research results focus on specific algorithms or regional cases, and there is a lack of systematic explorations of the overall evolution context, knowledge structure and global cooperation pattern of the field from the perspective of bibliometrics. At the same time, machine learning models still face challenges in interpretability, uncertainty quantification and cross-regional extrapolation. It is urgent to clarify the development path from the macro level to promote application breakthroughs in carbon sink assessment and agricultural management.
Bibliometrics, as a method of quantitative analysis of the scientific literature, can effectively reveal the macro knowledge structure, evolution process, core research strength and research frontier of specific research fields [
9,
10]. Unlike systematic reviews and meta-analyses that aim to synthesize specific research conclusions, bibliometrics focuses more on depicting the “knowledge map” of the discipline as a whole. In view of the fact that the application of machine learning to the study of agricultural soil respiration is a cross-cutting field with rapid development and a surge in the literature, the use of bibliometric methods can quickly and systematically elucidate its development context, identify key research topics and cooperation networks, and provide a macro and structured cognitive framework for scholars in this field, pointing out the direction for subsequent in-depth systematic reviews.
Therefore, this study used bibliometric methods to systematically analyze the development trend of the field of “quantitative research on agricultural soil respiration based on machine learning” from 2021 to 2025 (the retrieval interval is 1985–2025, and the literature is concentrated in 2021–2025 after screening). By collecting relevant publications from the core collection of Web of Science and comprehensively using tools such as Biblioshiny, CiteSpace and VOSviewer, this paper reveals the research topics, knowledge structure and evolution logic in this field from the perspectives of publication trends, international cooperation networks, keyword clustering and emergence evolution, and pays special attention to its implications for agricultural soil carbon sink assessment and management decision-making. The purpose of this study is to provide a systematic scientific basis for academic development, scientific research layout, international cooperation and sustainable agricultural decision-making for carbon neutrality in this field. In order to achieve these goals, this study specifically answers the following research questions: (1) What are the annual publication trends and key research forces (countries and institutions) in the field of quantitative research on agricultural soil respiration based on machine learning? (2) What is the knowledge structure of this field? What has been the evolution of this field in the past five years? What are the main research hotspots and frontier topics? (3) What are the implications of these bibliometric results for accurate assessment, intelligent management and policy formulation for agricultural soil carbon sinks in the context of carbon neutrality?
2. Data Sources and Research Methods
2.1. Data Source
The data source is the core collection of Web of Science, and the retrieval formula is TS = ((“machine learning” OR “deep learning” OR “artificial intelligence” OR “AI” OR “random forest” OR “RF” OR “support vector machine” OR “SVM” OR “neural network” OR “NN” OR “CNN” OR “LSTM” OR “XGBoost” OR “LightGBM” OR “gradient boosting” OR “ensemble learning” OR “supervised learning” OR “regression tree” OR “decision tree” OR “GPR” OR “Gaussian process”) AND (“agricultural soil respiration” OR “agroecosystem respiration” OR “soil CO2 flux” OR “soil respiration” OR “Reco” OR “Rh” OR “CO2 efflux” OR “carbon dioxide emission” OR “below-ground carbon flux”) AND (“quantif*” OR “predict*” OR “estimate*” OR “model*” OR “simulate*” OR “mapping”)). The literature search was conducted on 31 December 2025, covering the period from 1 January 1985, to 31 December 2025. Although the retrieval time span is long, the analysis found that the relevant literature in line with the theme has only shown a significant increase since 2021 (see
Figure 1), and only one work was published in 1998, indicating that this research field is an emerging frontier. Therefore, this paper focuses on 2021–2025. A total of 975 records were retrieved, and 966 records remained after manual deduplication.
The Web of Science Core Collection was selected as the sole data source because it is known for its high-quality, peer-reviewed content, and its standardized data format is well suited to bibliometric analysis with tools such as R and CiteSpace. After manual screening to remove duplicate and irrelevant publications, the final collection comprised 966 articles on this research topic.
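The retrieval formula in Section 2.1 has a simple structure: three groups of synonyms, each joined with OR, combined with AND inside a single topic (TS) field. The following is a minimal sketch of how such a query string could be assembled so that the term lists stay editable; the helper names `or_group` and `build_topic_query` are hypothetical, and the term lists are abbreviated for illustration rather than copied in full.

```python
def or_group(terms):
    """Join search terms into one parenthesised OR group, quoting each term."""
    return "(" + " OR ".join(f'"{t}"' for t in terms) + ")"


def build_topic_query(*groups):
    """AND several OR groups together inside a topic (TS) field tag."""
    return "TS = (" + " AND ".join(or_group(g) for g in groups) + ")"


# Abbreviated term lists (illustrative only); the full lists appear in the
# retrieval formula in the text.
methods = ["machine learning", "deep learning", "random forest"]
targets = ["soil respiration", "soil CO2 flux", "CO2 efflux"]
tasks = ["quantif*", "predict*", "model*"]

query = build_topic_query(methods, targets, tasks)
print(query)
```

Keeping the method, target and task terms in separate lists makes it easier to report exactly which synonyms were searched and to rerun the search with one group changed.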
2.2. Literature Screening Criteria
To ensure the relevance and quality of the literature analysis, we applied a strict two-step manual screening process to the initial search results. First, a preliminary judgment was made from the title and abstract; then, the full text of any doubtful record was reviewed to ensure that the included literature was closely focused on the research topic.
In the screening process, we set clear inclusion rules:
- (1) Research must be mainly aimed at agricultural ecosystems, including farmland, arable land, orchards, grasslands and other soil systems that are managed by humans;
- (2) Machine learning methods (such as random forest, support vector machine, neural network, XGBoost, deep learning) must be used as the core technical means for quantification, prediction, simulation, or spatial estimation of total soil respiration (Reco), heterotrophic respiration (Rh), and root respiration;
- (3) The literature needs to involve the spatial and temporal dynamic changes in agricultural soil respiration, environmental driving factors (such as temperature, soil moisture, organic matter content, fertilization, tillage methods) or carbon sink potential assessment;
- (4) Only peer-reviewed English original research articles or review articles are included.
We also clarified the exclusion criteria to effectively filter irrelevant or low-relevance research:
- (1) Studies that focus only on natural ecosystems (such as primary forests, natural grasslands, wetlands, or tundra) and lack an agricultural management background are excluded.
- (2) Works in which machine learning is only used as an auxiliary tool rather than a core quantitative method, or studies that rely mainly on traditional statistical models and process models, are not included.
- (3) Non-English publications, meeting abstracts, editorials, letters, book chapters, technical reports and other informal academic works are excluded.
- (4) Studies that focus mainly on the development of sensor hardware or the construction of purely theoretical model frameworks, without actual application to the quantification of agricultural soil respiration, are also excluded.
- (5) Any records that are clearly repeated or that deviate significantly from the subject of agricultural soil respiration are also not retained.
All authors screened the records independently, with the preliminary screening based on the title and abstract. Any disagreements were resolved through collective discussion and full-text review. This process eliminated 9 duplicate or clearly unrelated publications from the 975 records obtained in the initial search, leaving a data set of 966 publications for the subsequent bibliometric analysis. This screening strategy reduces topic noise and supports the relevance and credibility of the subsequent analyses of keyword clustering, international and institutional cooperation networks, and research frontier evolution, providing a solid foundation for revealing the development of machine learning-based quantitative research on agricultural soil respiration.
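The screening itself was manual, but candidate duplicates can be flagged automatically before human review. The sketch below is one possible approach, not the procedure used in this study: it assumes each record is a dictionary with optional `doi` and `title` fields (hypothetical field names), matches on the DOI when one exists, and otherwise falls back to a normalized title. The DOIs in the toy data are made up.

```python
import re


def record_key(record):
    """Return a matching key: the DOI if present, else a normalised title."""
    doi = (record.get("doi") or "").strip().lower()
    if doi:
        return ("doi", doi)
    # Lower-case the title and collapse punctuation and whitespace
    title = re.sub(r"[^a-z0-9]+", " ", record.get("title", "").lower()).strip()
    return ("title", title)


def deduplicate(records):
    """Keep the first record seen for each key, preserving input order."""
    seen, unique = set(), []
    for rec in records:
        key = record_key(rec)
        if key not in seen:
            seen.add(key)
            unique.append(rec)
    return unique


# Toy records with invented DOIs, for illustration only
records = [
    {"doi": "10.1000/a1", "title": "Soil respiration with random forest"},
    {"doi": "10.1000/A1", "title": "Soil respiration with random forest."},
    {"doi": "", "title": "Deep learning for CO2 efflux"},
    {"doi": None, "title": "Deep Learning for CO2 Efflux!"},
    {"doi": "10.1000/b2", "title": "Mapping soil CO2 flux"},
]
unique = deduplicate(records)
print(len(records), "->", len(unique))
```

Records removed by such a filter would still need to be checked by hand, since two distinct papers can share a title and one paper can carry inconsistent metadata.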
2.3. Research Methods
In this study, the visual analysis software Biblioshiny [9], CiteSpace 6.4.R1 [10] and VOSviewer 1.6.20 [11], together with the plotting software Origin 2024, were used to analyze the 966 English-language articles on machine learning-based quantitative research on agricultural soil respiration. Biblioshiny is an open-source tool written in the R language that is flexible and easy to integrate with other statistical and graphical packages [9]. CiteSpace is citation visualization and analysis software that presents the structure, regularities and distribution of scientific knowledge visually; the resulting graphics are therefore also called “scientific knowledge maps”. To show the intensity of cooperation between countries more intuitively, this study used VOSviewer to extract the national cooperation network data (data files containing country nodes and cooperation links) and then imported the data into Origin to draw a national co-occurrence chord diagram [10]. In the chord diagram, the arc length represents the number of papers published by each country, a connecting line between two countries represents a cooperative link, and the thickness of the line reflects the intensity of cooperation, quantified as the number of papers co-published by the two countries.
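The two quantities drawn in the chord diagram (papers per country for the arc length, co-published papers per country pair for the link thickness) can be derived from per-paper country lists. The following is a minimal sketch under the assumption of full counting, in which a paper counts once for every country on its author list; the input format and function names are hypothetical, and the toy data are invented.

```python
from collections import Counter
from itertools import combinations


def papers_per_country(papers):
    """Arc length: number of papers with at least one author from each country."""
    totals = Counter()
    for countries in papers:
        totals.update(set(countries))
    return totals


def country_link_strengths(papers):
    """Link thickness: number of papers co-published by each unordered country pair."""
    links = Counter()
    for countries in papers:
        for pair in combinations(sorted(set(countries)), 2):
            links[pair] += 1
    return links


# Each inner list holds the author countries of one (invented) paper
papers = [
    ["China", "USA"],
    ["China", "USA", "Germany"],
    ["China"],
    ["India"],
    ["Germany", "China"],
]
totals = papers_per_country(papers)
links = country_link_strengths(papers)
print(totals)
print(links)
```

Sorting each pair before counting ensures that ("China", "USA") and ("USA", "China") accumulate under the same key.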
To identify emerging research frontiers and track the evolution of hot topics over time, CiteSpace’s burst detection algorithm was used. The algorithm is based on Kleinberg’s burst detection method and identifies terms whose frequency increases sharply over a short period. The minimum burst duration was set to 1 year to capture short-lived emerging hotspots. The burst strength reflects how sharply the citation or occurrence frequency surged [9,10,11].
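To make the idea behind Kleinberg-style burst detection concrete, the sketch below implements a two-state automaton for batched counts: in each time slice a term is either at its baseline rate or at an elevated "burst" rate, and the state sequence with the lowest total cost is found by dynamic programming. This is an illustrative sketch, not CiteSpace's implementation; the rate multiplier `s`, the transition weight `gamma` and the toy counts are assumptions, and the term is assumed to occur at least once.

```python
import math


def log_binom(n, k):
    """Natural log of the binomial coefficient C(n, k)."""
    return math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)


def burst_states(r, d, s=2.0, gamma=1.0):
    """Two-state burst automaton for batched counts.

    r[t]: records containing the term in slice t; d[t]: all records in slice t.
    Returns the minimum-cost state per slice (0 = baseline, 1 = burst).
    """
    n = len(r)
    p0 = sum(r) / sum(d)                # baseline rate over the whole period
    p = [p0, min(s * p0, 0.9999)]       # the burst state has an elevated rate
    up_cost = gamma * math.log(n)       # cost of entering the burst state

    def fit(state, t):
        # Negative binomial log-likelihood of observing r[t] hits out of d[t]
        return -(log_binom(d[t], r[t])
                 + r[t] * math.log(p[state])
                 + (d[t] - r[t]) * math.log(1 - p[state]))

    cost = [0.0, math.inf]              # the automaton starts in the baseline state
    back = []
    for t in range(n):
        new_cost, choice = [], []
        for state in (0, 1):
            options = [cost[prev] + (up_cost if prev == 0 and state == 1 else 0.0)
                       for prev in (0, 1)]
            best = 0 if options[0] <= options[1] else 1
            new_cost.append(options[best] + fit(state, t))
            choice.append(best)
        cost = new_cost
        back.append(choice)

    # Trace the cheapest path backwards from the last slice
    state = 0 if cost[0] <= cost[1] else 1
    states = [state]
    for t in range(n - 1, 0, -1):
        state = back[t][state]
        states.append(state)
    return states[::-1]


# Invented yearly counts: the term appears in 2% of records, except two spike years
hits = [2, 2, 2, 20, 20, 2]
totals = [100] * 6
states = burst_states(hits, totals)
print(states)
```

In this toy series the two spike slices are assigned to the burst state because the improvement in fit outweighs the one-off cost of entering that state; a larger `gamma` makes the automaton more reluctant to declare a burst.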
CiteSpace reports several network indicators, including betweenness centrality, which measures the importance of a node in the network: it indicates the degree to which a node acts as an “intermediary” between other nodes in the graph. This study used CiteSpace to draw and interpret the knowledge maps. The specific parameters and procedure were as follows:
Time slice: The analysis period (2021–2025) was divided into one time slice per year.
Selection criteria: In each time slice, nodes were selected with the g-index criterion (k = 25) [10,12].
Network pruning and clustering: In order to improve the clarity of the network map and the readability of the key structures, the “Pathfinder” (Pathfinder network) and “Pruning sliced networks” algorithms are selected to prune the network. Keyword clustering analysis uses the LLR (Log-Likelihood Ratio) algorithm to automatically identify cluster labels [
13,
14].
Clustering quality evaluation: The keyword clustering results were evaluated quantitatively with the modularity value (Q value) and the average silhouette value (S value). Q > 0.3 indicates that the community structure of the network is significant; S > 0.5 indicates that the clustering is reasonable, and S > 0.7 that it is convincing. The clustering results of this study meet these criteria, confirming the validity of the analysis [13,14].
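The modularity value Q used above compares the share of links that fall inside clusters with the share expected if links were placed at random given the node degrees. The sketch below computes Newman's modularity for an undirected, unweighted network; it illustrates the quantity and is not CiteSpace's code, and the small network is invented.

```python
def modularity(edges, community):
    """Newman modularity Q of an undirected, unweighted network.

    edges: list of (u, v) pairs; community: dict mapping node -> cluster label.
    Q = sum over clusters of (internal_edges / m) - (degree_sum / 2m)^2.
    """
    m = len(edges)
    internal = {}     # edges with both endpoints in the same cluster
    degree_sum = {}   # total degree of the nodes in each cluster
    for u, v in edges:
        cu, cv = community[u], community[v]
        degree_sum[cu] = degree_sum.get(cu, 0) + 1
        degree_sum[cv] = degree_sum.get(cv, 0) + 1
        if cu == cv:
            internal[cu] = internal.get(cu, 0) + 1
    return sum(internal.get(c, 0) / m - (degree_sum[c] / (2 * m)) ** 2
               for c in degree_sum)


# Two triangles joined by a single bridge edge (invented toy network)
edges = [("a", "b"), ("b", "c"), ("a", "c"),
         ("d", "e"), ("e", "f"), ("d", "f"),
         ("c", "d")]
community = {"a": 1, "b": 1, "c": 1, "d": 2, "e": 2, "f": 2}
q = modularity(edges, community)
print(round(q, 4))  # 0.3571
```

For this toy network Q = 5/14, which is just above the 0.3 threshold; putting every node in a single cluster gives Q = 0.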
3. Analysis of the Basic Characteristics of the Literature
3.1. Publication and Citation Trends
To reveal the development trend and research stage of the field of “quantitative research on agricultural soil respiration based on machine learning”, this study analyzed the annual number of published articles and the average citation frequency of the related research; the results are shown in Figure 1. The analysis is based on 966 articles retrieved and cleaned from the core collection of Web of Science (as of 31 December 2025). The time distribution and citation characteristics clearly reflect the growth trajectory, academic vitality and influence evolution of this field.
Overall, research activity in this field has grown markedly since 2021 and shows two stages. In the germination and initial period (2021–2022), the annual number of publications was relatively small, indicating that the research direction was still at an early exploratory stage; however, the average citation frequency was high, indicating that the early results were influential and laid a theoretical and methodological foundation for the field. In the rapid growth period (2023–2025), the number of publications jumped and continued to grow, consistent with the wide application of machine learning in agricultural environmental monitoring. This shows that the field has entered an active stage of rapid knowledge accumulation and expanding research teams [15].
With the rapid growth of the number of published papers, the average citation frequency of each paper fell again after 2023, which is in line with the general law of literature growth in emerging fields: the number of newly published papers has surged, and the accumulation of citations needs a certain time lag [
15,
16]. This phenomenon does not mean a decline in the quality of research, but rather shows that the field is in a stage of development with high yield and continuous improvement in academic attention [
17,
18].
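The citation lag described above can be made explicit by dividing the mean citations per article by the number of years each cohort has had to accumulate citations. The sketch below shows one such age-normalized average; counting citable years as `reference_year - publication_year + 1` is an assumption made for this illustration, and the records are invented.

```python
def citation_trend(records, reference_year):
    """Summarise citations by publication year.

    records: list of (publication_year, total_citations) tuples.
    Returns {year: (n_articles, mean_citations_per_article, mean_per_citable_year)}.
    """
    by_year = {}
    for year, cites in records:
        by_year.setdefault(year, []).append(cites)

    summary = {}
    for year, cites in sorted(by_year.items()):
        mean_per_article = sum(cites) / len(cites)
        # Assumption: the publication year itself counts as one citable year
        citable_years = reference_year - year + 1
        summary[year] = (len(cites), mean_per_article,
                         mean_per_article / citable_years)
    return summary


# Invented records: (publication year, total citations)
records = [(2021, 30), (2021, 20), (2023, 6), (2023, 9), (2023, 3), (2025, 1)]
trend = citation_trend(records, reference_year=2025)
for year, row in trend.items():
    print(year, row)
```

Comparing the last two columns shows how much of a drop in raw citation averages for recent years is explained simply by the shorter time available for citations to accumulate.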
3.2. Collaborative Network Analysis
3.2.1. Analysis of International Cooperation Network
Based on the combined results obtained from the R package (Biblioshiny) and VOSviewer 1.6.20 (as shown in Figure 2, Figure 3, Figure 4 and Table 1), this study systematically reveals the international cooperation pattern and structural characteristics in the field of machine learning-based quantitative research on agricultural soil respiration.
Characteristics of national output distribution: From the perspective of the scale of literature output (
Figure 2,
Table 1), China occupies an absolute dominant position in this field, with 467 papers (61.9% of the total literature), far more than other countries, showing China’s high concentration and core influence in this research direction. The United States (83 articles, 8.9%) and India (67 articles, 6.9%) ranked second and third, respectively, together constituting the main research force in this field.
From the perspective of network topology, the international cooperation network extracted and constructed with VOSviewer (Figure 3) shows that global cooperation has a dual-center structure with China and the United States at the core. The two countries cooperate closely with each other and with Germany, Australia, Japan, the Republic of Korea and other countries (the connecting lines in the chord diagram are thick). The proportion of international cooperation papers (MCP %) in Table 1 further confirms how actively countries participate in cooperation, with especially high values for Egypt (73.3%), Saudi Arabia (50%) and Germany (43.8%).
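The MCP % indicator can be illustrated with a short calculation. The sketch below assumes the common bibliometric convention that each paper is attributed to the country of its corresponding author and is classed as a single-country publication (SCP) if all authors share that country, or a multiple-country publication (MCP) otherwise; the input format is hypothetical and the toy data use placeholder country labels rather than values from Table 1.

```python
def cooperation_profile(papers):
    """Compute SCP, MCP and MCP % per corresponding-author country.

    papers: list of (corresponding_country, set_of_all_author_countries).
    """
    stats = {}
    for corresponding, countries in papers:
        row = stats.setdefault(corresponding, {"SCP": 0, "MCP": 0})
        # More than one distinct country on the author list means an MCP
        row["MCP" if len(countries) > 1 else "SCP"] += 1
    for row in stats.values():
        row["MCP %"] = round(100 * row["MCP"] / (row["SCP"] + row["MCP"]), 1)
    return stats


# Placeholder countries "A", "B", "C" (invented data)
papers = [
    ("A", {"A"}),
    ("A", {"A"}),
    ("A", {"A", "B"}),
    ("B", {"B", "C"}),
    ("B", {"B"}),
]
profile = cooperation_profile(papers)
print(profile)
```

A country with few papers can show a very high MCP % from a handful of joint papers, so the percentage is best read alongside the absolute paper counts.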
In the cooperation network, betweenness centrality is an important indicator for identifying hub countries. The geographical clustering results in
Figure 4 show that although Sweden, France, Austria and other European countries have few publications in total (small node weights), they occupy key connecting positions in the network and have high betweenness centrality. This indicates that these countries play an important “bridging” role in promoting academic exchange and cooperation across regions (such as North America, Asia and Europe).
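Betweenness centrality counts how often a node lies on the shortest paths between other pairs of nodes, which is why a country with few papers can still score highly if it connects otherwise separate groups. The sketch below follows Brandes' algorithm for an undirected, unweighted network and returns unnormalized scores; the adjacency-dictionary input and the toy network with placeholder node names are assumptions for illustration, not the network analyzed in this study.

```python
from collections import deque


def betweenness(adj):
    """Unnormalised betweenness centrality (Brandes' algorithm).

    adj: dict mapping node -> list of neighbours; must be symmetric.
    """
    score = {v: 0.0 for v in adj}
    for s in adj:
        # Breadth-first search from s, counting shortest paths (sigma)
        sigma = {v: 0 for v in adj}
        sigma[s] = 1
        dist = {s: 0}
        preds = {v: [] for v in adj}
        order = []
        queue = deque([s])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        # Accumulate path dependencies, farthest nodes first
        delta = {v: 0.0 for v in adj}
        for w in reversed(order):
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != s:
                score[w] += delta[w]
    # Each unordered pair was counted from both endpoints
    return {v: b / 2 for v, b in score.items()}


# Toy network: three peripheral nodes that are linked only through "hub"
network = {
    "hub": ["A", "B", "C"],
    "A": ["hub"],
    "B": ["hub"],
    "C": ["hub"],
}
centrality = betweenness(network)
print(centrality)
```

In the toy network every shortest path between two peripheral nodes passes through the hub, so the hub scores 3 (one for each pair) and the peripheral nodes score 0.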
From the perspective of differences in international cooperation models, we find that there are obvious differences in the cooperation models of different countries. Although China, India and other high-yield countries have a large amount of research, their independent national papers (SCPs) account for a relatively high proportion, indicating that their internal research system is relatively complete. Saudi Arabia, Egypt, Germany and other countries show stronger international cooperation and integration capabilities (MCP % is higher). The Republic of Korea, Japan and other countries play an important role in the connection and transition of the Sino-US cooperation network.
From the perspective of the cooperation pattern, a current cooperation pattern of “Asia-led, Europe and the United States, multi-polar collaboration” has been formed, which may be due to China’s large-scale application needs and policy support in the field of smart agriculture and carbon monitoring, as well as the traditional advantages of European and American countries in machine learning algorithms and interdisciplinary methodologies. This pattern of division of labor and cooperation is conducive to promoting technology integration and knowledge diffusion, but it also suggests the need to further strengthen the deep cooperation between core countries and the network integration of emerging research regions.
3.2.2. Analysis of the Main Research Institutions
The collaboration network among scientific research institutions (
Figure 5) reveals the core research force and collaboration model in the field of “quantitative research on agricultural soil respiration based on machine learning”. The full names and abbreviations of the main institutions are shown in
Table 2. On the whole, this field presents a scientific research pattern of “Chinese institutions leading, Sino-US–European multi-cooperation”.
From the perspective of the institutional output and cooperation network, Chinese scientific research institutions occupy an absolute dominant position in this field. The national scientific research system represented by the Chinese Academy of Sciences (CAS) and its affiliate University of Chinese Academy of Sciences (UCAS) has formed a research cluster that radiates through the whole country and intersects disciplines. At the same time, a number of universities with unique advantages in agriculture, environment and information science have also played a key role, such as China Agricultural University (CAU), Northwest A & F University (NWAFU) and Nanjing University of Information Science and Technology (NUIST). They demonstrate distinct research characteristics in agricultural system modeling, climate data analysis and machine learning method application, reflecting the complete layout of China from basic research to application in this field.
Although American scientific research institutions have a lower total number of publications than China, they play an important role in the international cooperation network [
19]. For example, the Pacific Northwest National Laboratory (PNNL) exhibits long-term accumulation of environmental modeling and carbon cycle research, and its betweenness centrality is high, showing its role as a bridge connecting China, the United States and global cooperation networks [
19,
20]. In addition, universities such as the University of Maryland (UMD) also have significant influence on remote sensing data assimilation and machine learning algorithms, forming a scientific research system promoted by national laboratories and research universities [
21,
22].
The formation of the current institutional cooperation pattern is not only due to China’s major scientific research investment in the field of agricultural informatization and carbon neutrality, but also inseparable from the long-term leading advantages of the United States and other countries in basic machine learning algorithms and environmental remote sensing. In the future, in order to further improve the research quality and global influence of this field, the following suggestions are made: (1) continue to deepen the substantive cooperation between Chinese and American institutions in open source data and repeatable models; (2) strengthen the role of scientific research institutions in Europe and other regions in regional verification and cross-ecological-zone comparative research; (3) encourage more universities with the characteristics of an agricultural and environmental discipline background to integrate into the international cooperation network, and promote the transformation of research results into actual agricultural management.
3.3. Keywords and Hot Frontier Research Analysis
3.3.1. Keyword Analysis
Keyword analysis can effectively reveal the core themes and development trends of a research field. This study used the “keywords” and “Biblioshiny” packages in the R language environment to extract and statistically analyze keywords. From the word frequencies, a keyword cloud map of the high-frequency terms in this field was generated (Figure 6), and the ten most frequent keywords were tabulated (Table 3) [23]. Combined with the keyword clustering timeline (Figure 7), the evolution of different topics can be tracked further.
The keyword cloud map (Figure 6) and the high-frequency vocabulary (Table 3) show that “machine learning” (216 times) is clearly more frequent than any other keyword, indicating that the machine learning method itself is the core research tool and topic in this field. The next most frequent keywords, “temperature” (107 times), “prediction” (98 times), “model” (94 times) and “soil respiration” (72 times), outline the main line of research in this field: using various models (such as “random forest”, 54 times) to predict soil respiration, with a focus on key environmental driving factors such as temperature. In addition, the high frequency of “climate” (47 times) and “china” (40 times) reflects the climate change background of the research and China’s outstanding contribution to this field [24].
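Keyword frequencies of this kind come from a simple count over the keyword lists of all records. The sketch below is a minimal illustration, assuming that keywords are compared after lower-casing and trimming and that a keyword is counted at most once per article; the function name and toy data are hypothetical, and the study itself used R packages for this step.

```python
from collections import Counter


def keyword_frequencies(records, top_n=10):
    """Count in how many records each normalised keyword appears."""
    counts = Counter()
    for keywords in records:
        # A set ensures each keyword is counted once per record
        counts.update({kw.strip().lower() for kw in keywords if kw.strip()})
    return counts.most_common(top_n)


# Invented keyword lists, one per article
records = [
    ["Machine Learning", "Temperature"],
    ["machine learning", "soil respiration"],
    ["Random Forest", "machine learning ", "Temperature"],
]
top_keywords = keyword_frequencies(records, top_n=2)
print(top_keywords)  # [('machine learning', 3), ('temperature', 2)]
```

Normalization matters here: without it, "Machine Learning" and "machine learning " would be counted as different terms. Merging true synonyms (for example, an abbreviation and its full form) would need an explicit mapping, which this sketch does not include.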
The keyword timeline diagram (
Figure 7) further reveals the phased changes in research hotspots. In the initial stage of 2021–2022, the research topics are relatively concentrated, mainly focusing on basic methodologies and core scientific issues such as “machine learning”, “soil respiration” and “prediction”, marking the initial formation of a clear research paradigm in this field. From 2023 to 2024, the diversity of keywords increases clearly, and more specific technical methods and observation methods such as “random forest”, “deep learning”, “remote sensing”, and “carbon flux” emerge, indicating that research is developing in the direction of technology deepening and data diversification. By 2025, topics such as “uncertainty”, “scaling”, and “ecosystem model” begin to be integrated into keyword clustering, reflecting the gradual shift in research focus to uncertainty quantification, scale expansion, and integration with ecosystem models. This indicates that the field is transitioning from method validation to theoretical improvement and application deepening [
25].
The evolution of the keywords shows that this field follows a development logic of “method-driven–technology integration–problem deepening”: at the beginning of the five-year period, machine learning methods were mainly being introduced; in the middle stage, multi-source data and multi-algorithm integration were emphasized; and most recently, more attention has been paid to model reliability, interpretability and connection with ecological theory. This shows that the field is gradually developing from the application of cross-cutting technologies towards a frontier that combines methodological innovation with the ability to solve major scientific problems. In the future, improving the integration of mechanisms into models and enhancing the applicability of multi-scale carbon cycle simulation may become the key to continued breakthroughs in this field [25,26].
3.3.2. Keyword Cluster Analysis
To analyze the knowledge structure and evolution of machine learning-based quantitative research on agricultural soil respiration in depth, this study used CiteSpace to cluster the keywords, producing a visual map (Figure 8) and clustering indicators (Table 4). The average silhouette value (S value) of each cluster is higher than 0.8, and the modularity value (Q value) is greater than 0.3, indicating that the clusters are internally consistent, that the network structure is significant, and that the analysis is credible [27].
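The S value is commonly understood as a silhouette score, which compares, for each item, its average distance to members of its own cluster (a) with its average distance to the nearest other cluster (b), giving (b - a) / max(a, b). The sketch below computes the mean silhouette for a generic distance function; it is an illustration of the measure on invented one-dimensional data and makes no claim about how CiteSpace computes it internally.

```python
def mean_silhouette(points, labels, dist):
    """Average silhouette score over all points.

    points: list of items; labels: cluster label per item;
    dist: function returning the distance between two items.
    """
    scores = []
    for i, p in enumerate(points):
        by_cluster = {}
        for j, q in enumerate(points):
            if i != j:
                by_cluster.setdefault(labels[j], []).append(dist(p, q))
        own = by_cluster.pop(labels[i], [])
        if not own or not by_cluster:
            # Singleton cluster or only one cluster: score taken as 0
            scores.append(0.0)
            continue
        a = sum(own) / len(own)                                  # cohesion
        b = min(sum(d) / len(d) for d in by_cluster.values())    # separation
        scores.append((b - a) / max(a, b))
    return sum(scores) / len(scores)


# Two well-separated groups on a line (invented data)
points = [0.0, 1.0, 10.0, 11.0]
labels = ["x", "x", "y", "y"]
s_value = mean_silhouette(points, labels, lambda p, q: abs(p - q))
print(round(s_value, 3))  # 0.9
```

Values close to 1 mean that items sit much nearer to their own cluster than to any other; the toy example scores about 0.90, above the 0.7 threshold described as convincing.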
The analysis shows that the research topics are mainly clustered into three interrelated clusters. The first cluster is an algorithm and model method cluster with “machine learning”, “random forest”, and “deep learning” as the core, which is rapidly formed and continues to deepen in 2021–2022, reflecting the active exploration and technical verification of the prediction tool itself in the early stage of the field [
28,
29]. The second cluster centers on environmental driving factors and process mechanisms, including keywords such as “temperature”, “soil moisture”, “climate change”, and “carbon flux”. This cluster focuses on the biogeochemical regulation of soil respiration, explores the pathways by which environmental factors such as temperature and water affect soil respiration and its response to climate change, and provides mechanistic support for model construction. The third cluster focuses on remote sensing fusion and regional applications, marked by keywords such as “remote sensing”, “china”, “agriculture”, “uncertainty”, and “scaling”. Since 2023, the importance of this cluster has increased significantly, marking a shift in research focus from site-scale exploration of mechanisms and methods to national- or regional-scale carbon flux simulation, spatial mapping and quantification of model uncertainty. Multi-source spatial information technology such as remote sensing plays a key supporting role in this process and improves the ability of models to extrapolate from “point” to “surface” [
30,
31].
These three clusters are not developed in isolation, but show dynamic characteristics of “method–mechanism–application” collaborative evolution: the algorithm cluster provides tool support for mechanism research, the mechanism cluster deepens the understanding of processes and provides a basis for model optimization, and the application cluster promotes the expansion of research results to the regional scale and the transformation of actual management, which together constitute a complete knowledge ecosystem in this field.
This clustering structure clearly reveals the dynamic evolution path of the domain’s research paradigm. Early studies focused on the introduction of machine learning algorithms and the construction of basic prediction models. In the medium term, the regulation mechanism of environmental factors such as temperature and moisture on soil respiration was deeply analyzed. Recently, there has been a clear trend of multi-scale application integration and model reliability evaluation [
32]. The close co-occurrence of keywords such as “remote sensing” and “uncertainty” in recent clustering is particularly worthy of attention. It indicates that the research paradigm is shifting from a single pursuit of prediction accuracy to a comprehensive analysis framework of “multi-source data fusion–multi-scale modeling–uncertainty quantification”. This reflects that this field is gradually moving towards a mature stage of development driven by technology, which takes into account the theoretical depth, application breadth and result credibility [
33]. The future frontier is expected to focus on the development of mechanism-enhanced machine learning models, the collaborative assimilation of multi-platform observation data, and uncertainty transfer and risk analysis for agricultural carbon management decision-making, so as to provide a more solid scientific basis for the realization of precision agriculture and carbon neutrality goals.
3.3.3. Analysis of Hot Research Topics
The evolutionary trajectory of the research frontier is clearly identified by burst word detection.
Figure 9 shows the top 22 burst keywords and their duration, which intuitively reflects the shift in research focus in this field in different periods. The burst word analysis shows that “machine learning” has the highest burst intensity, and its starting year is 2021, indicating that the introduction of machine learning methods is the initial driving force and core hotspot in this field. It is followed by “soil respiration” and “temperature”, both of which began to emerge in 2021, and the intensity of emergence was significant, highlighting that research was closely combined with the core scientific issues (soil respiration) and key environmental driving factors (temperature) in the initial stage [
30].
From the perspective of the temporal distribution of emergent words, the evolution of research hotspots presents three clear stages:
- (1)
Methodology establishment and basic problem focusing stage (2021–2022): The key words in this stage are “machine learning”, “soil respiration”, “temperature”, “prediction” and “model”. This reflects that the focus of early research in the field was on establishing the applicability of machine learning methods in the quantitative study of soil respiration, and on the construction of basic prediction models and the in-depth exploration of temperature, the most important environmental control factor [
27,
34].
- (2)
Technological diversification and process deepening stage (2022–2023): At this stage, emergent words show the diversification of technical methods and the deepening of process research. Keywords such as “random forest”, “deep learning”, “climate change”, and “carbon flux” begin to emerge. This shows that research has shifted from the general machine learning framework to the in-depth application of specific efficient algorithms (such as random forest and deep learning). At the same time, this stage pays more attention to the carbon flux response in the context of climate change, and the research perspective has expanded from single prediction to mechanism correlation [
35].
- (3)
Data fusion and scale expansion stage (2023–2025): Recent burst keywords are characterized by data fusion and spatial scale expansion. Keywords such as “remote sensing”, “china”, “uncertainty” and “performance” have become new hotspots. This indicates that the research frontier has shifted to integrating multi-source spatial data such as remote sensing to support regional-scale simulation, focusing on case applications in typical regions such as China, and placing strong emphasis on model performance evaluation and the quantification of prediction uncertainty. This shift reflects the evolution of the field from the pursuit of “usability” to the pursuit of “reliability” and “practicality” [
36].
It is worth noting that the simultaneous bursts and continued activity of “remote sensing” and “uncertainty” reveal an important cross-cutting frontier: how to use remote sensing data to improve model prediction at the regional scale while controlling the scale conversion error and growing uncertainty that this introduces. This indicates that the research paradigm in this field is moving from relatively independent “site observation–model prediction” to a new stage of integrated research on “multi-source data fusion–multi-scale modeling–uncertainty quantification”. This change not only advances methodology, but also greatly enhances the capacity of research results to serve agricultural management decision-making and global carbon cycle assessment [
37]. In the future, model optimization based on multi-source data assimilation, the application of artificial intelligence interpretability in soil respiration prediction, and agricultural management scenario simulation for carbon neutrality goals are expected to become hot topics in the field’s continuous development.
Figure 9.
Top 22 keywords with the strongest citation bursts in quantitative research on agricultural soil respiration based on machine learning (created with CiteSpace 6.4.R1). In the burst table, “keywords” lists the burst terms and “year” gives the year in which each first appeared. “Strength” reflects the intensity of the citation burst, while “begin” and “end” mark the years in which the burst started and ended. The light blue segment of each line spans the period from 2021 to the first appearance of the corresponding keyword, the blue segment spans from its first appearance to 2025, and the red segment marks the duration of the burst [
38].
4. Discussion
Using bibliometric methods, this study systematically revealed the development trend, knowledge structure and evolution path of the field of “quantitative research on agricultural soil respiration based on machine learning” from 2021 to 2025. Drawing on 966 related publications in the Web of Science Core Collection, it outlines how this interdisciplinary field moved over the past five years from emergence to rapid development and, gradually, toward maturity and integration, from multiple perspectives such as annual publication trends, national and institutional cooperation networks, keyword frequency and clustering, and research frontier evolution [
39]. The research results fully reflect the profound changes in methodology, scientific issues and application scenarios in this field [
40]. At the beginning of the 5-year period (2021–2022), research centered on the introduction and verification of machine learning methods, with basic prediction of soil respiration and its environmental response as the main topics [
40]. In the mid-term (2023–2024), the research topics rapidly differentiated and deepened to specific algorithm optimization, multi-source data fusion and typical regional applications. Recently (2024–2025), the field has further focused on the quantification of model uncertainty, scale expansion and mechanism coupling with ecological processes, showing a significant shift from “technology-driven” to “problem-oriented” and “system integration” [
40,
41]. This evolution path not only reflects the strong penetration and adaptability of machine learning technology in agricultural environmental science, but also reflects the strong pull exerted on this field by the urgent need to respond to global climate change and to promote the green and low-carbon transformation of agriculture.
4.1. Stage Characteristics and Driving Forces of Domain Development
As shown in the publication trends (
Figure 1), this field presents typical “technology-driven, application-driven” phased growth characteristics. In the initial stage of 2021–2022, the high burst strength of “machine learning” as the core keyword shows that the introduction of machine learning methods as “enabling technologies” was the direct driving force behind the rise of the field [
29,
42]. The research in this period mainly verified the feasibility and superiority of machine learning in soil respiration prediction, which laid a methodological foundation [
43]. In the rapid growth period after 2023, the jump in the number of publications and the diversification of keywords reflect the transformation of research from “method verification” to “problem deepening” and “scenario expansion”. The driving force of this stage comes from two aspects: first, the continuous penetration and maturing of machine learning technology in ecological and environmental science; second, the urgent need for accurate and efficient monitoring and evaluation of the carbon cycle of agroecosystems in the context of global climate change [
42,
43,
44]. By 2025, although the number of research papers had stabilized, the attention on keywords such as “uncertainty” and “performance” had increased, indicating that the field is entering a mature and deepening period that focuses on the quality, reliability and practical application value of results.
4.2. Clustering Structure and Evolution Logic of Research Topics
Keyword clustering and burst analysis (
Figure 8 and
Figure 9) clearly outline the three pillars of the domain’s knowledge structure and its dynamic evolution logic. The algorithm and model methods pillar (represented by # 5 deep learning, # 11 machine learning) has remained the core and most active part of the field, and its use has evolved from general method exploration (such as random forest) to complex model applications (such as deep learning). The pillar of environmental driving factors and process mechanisms (with clusters such as # 3 sensors and # 9 environmental factors at its core) addresses the underlying scientific questions [
45], extending from focusing on a single dominant factor (temperature) to multi-factor (such as water, climate) interaction, deepening the understanding of soil respiration regulation mechanisms [
46,
47]. The data fusion and regional application pillar (marked by clustering of keywords such as # 1 relative humidity, # 5 deep learning, and # 18 ensemble learning, and associated with “remote sensing”, “china”, and other emergent words) represents the frontier direction of field development; that is, by integrating multi-source data such as remote sensing, the site-scale model capabilities are extended to regional and even national scales, with a focus on large agricultural countries such as China [
48].
These three pillars have not developed in isolation, but show a strong co-evolutionary relationship. In the early period, the algorithm pillar provided new tools for process research. In the mid-term, the needs of process research promoted the adaptation of algorithms to specific scenarios (such as ensemble learning for heterogeneous data). Recently, the demand for regional applications has driven the development of data fusion technology and uncertainty quantification methods. This spiral rise of “method–mechanism–application” constitutes the internal logic of the development of this field.
4.3. Advantages and Potential Challenges of Cooperative Network Pattern
The analysis of the cooperation network between countries and institutions reveals a clear pattern of “Asia-led, Sino-US dual-core, and intercontinentally connected”. China’s dominant share of publications reflects its strong scientific research input and output efficiency in responding to the national “double carbon” strategy and promoting the development of smart agriculture [
45]. With their deep accumulation of environmental modeling, remote sensing science and basic algorithms, American institutions play a key “hub” role in the global cooperation network. The advantage of this pattern lies in the ability to quickly integrate the data advantages of large-scale application scenarios in the East and the innovation ability of Western frontier methodology [
17,
45].
However, this pattern also implies challenges. First, excessive country concentration may limit the diversity of research perspectives, data sources, and verification scenarios to a certain extent, and may affect the universality of the model in different agricultural systems around the world (such as tropical plantations and arid pastures) [
49,
50]. Secondly, although there is cooperation between China and the United States, there is still considerable room for improvement in deep and substantive data sharing and model mutual recognition cooperation. Finally, although many large agricultural countries (such as India and Brazil) have certain research outputs, their centrality in the international cooperation network is low, and their localized knowledge and data value have not been fully integrated into the global knowledge system.
4.4. Implications for the Management of Agricultural Soil Carbon Sinks
Soil respiration is a key output process of the soil carbon cycle and exists in dynamic balance with soil carbon sinks. The accumulation of soil carbon sinks depends on the net difference between the organic carbon input by plant photosynthesis and the carbon dioxide released by soil respiration [
51,
52]. Therefore, the intensity of soil respiration directly affects the stability and carbon sink potential of soil carbon pools [
5,
9]. The bibliometric analysis in this study reveals that quantitative research on soil respiration based on machine learning is moving from method verification to mechanism integration and regional application, which provides important implications for the scientific evaluation and intelligent management of agricultural soil carbon sinks.
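The balance described at the start of this subsection can be written schematically as follows. The symbols are introduced here for illustration only and are not taken from the cited studies: $C_{\mathrm{soil}}$ denotes the soil carbon stock, $I_{C}$ the organic carbon input from plant photosynthesis, and $R_{s}$ the carbon released as carbon dioxide by soil respiration. Other carbon loss pathways, such as erosion and leaching, are neglected in this simplified form.

```latex
\frac{\mathrm{d}C_{\mathrm{soil}}}{\mathrm{d}t} = I_{C} - R_{s}
```

Under this simplification, the soil acts as a net sink whenever $I_{C} > R_{s}$, which is why errors in estimating $R_{s}$ carry over directly into carbon sink estimates.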
The high predictive accuracy of machine learning models for soil respiration makes them an effective tool for identifying key environmental drivers and responses to human management. By integrating multi-source data and spatio-temporal simulation, these models can describe the spatio-temporal dynamics of soil respiration in different agricultural systems (such as arid irrigation areas and intensive farmland), and provide a scientific basis for optimizing field management measures [
3,
21,
25,
29]. For example, tillage, irrigation and fertilization strategies can be adjusted according to the respiration-sensitive period predicted by the model, thereby reducing carbon output and enhancing soil carbon sequestration while maintaining agricultural productivity. In addition, this study found that the research frontier is focusing on regional-scale simulation and uncertainty quantification, which means that machine learning methods can provide spatially explicit estimates with confidence intervals for agricultural carbon sink accounting at county and even national scales [
6,
9]. This is of great value for building a credible agricultural carbon sink monitoring–reporting–verification system, supporting carbon trading markets and agricultural carbon subsidy policies.
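As a minimal illustration of how a soil respiration estimate can be reported together with an interval, the following Python sketch bootstraps a simple Q10 temperature response curve. It is not drawn from any of the reviewed studies: the Q10 form, the synthetic observations, the parameter values (reference rate 1.5, Q10 of 2.0, reference temperature 10 °C) and all function names are assumptions made here for illustration, and the simple curve merely stands in for the machine learning models discussed above.

```python
import numpy as np


def fit_q10(temp_c, resp, t_ref=10.0):
    """Fit R = R_ref * Q10 ** ((T - t_ref) / 10) by linear regression on log(R)."""
    x = (np.asarray(temp_c) - t_ref) / 10.0
    y = np.log(np.asarray(resp))
    slope, intercept = np.polyfit(x, y, 1)  # highest-degree coefficient first
    return np.exp(intercept), np.exp(slope)  # (R_ref, Q10)


def predict_q10(temp_c, r_ref, q10, t_ref=10.0):
    """Evaluate the Q10 response curve at the given temperatures."""
    return r_ref * q10 ** ((np.asarray(temp_c) - t_ref) / 10.0)


def bootstrap_interval(temp_c, resp, temp_new, n_boot=500, alpha=0.05, seed=0):
    """Percentile bootstrap interval for the fitted response at temp_new."""
    rng = np.random.default_rng(seed)
    temp_c = np.asarray(temp_c)
    resp = np.asarray(resp)
    temp_new = np.asarray(temp_new)
    n = len(temp_c)
    preds = np.empty((n_boot, len(temp_new)))
    for b in range(n_boot):
        idx = rng.integers(0, n, size=n)  # resample observations with replacement
        r_ref, q10 = fit_q10(temp_c[idx], resp[idx])
        preds[b] = predict_q10(temp_new, r_ref, q10)
    lower = np.percentile(preds, 100 * alpha / 2, axis=0)
    upper = np.percentile(preds, 100 * (1 - alpha / 2), axis=0)
    return lower, np.median(preds, axis=0), upper


# Synthetic observations (hypothetical, for illustration only).
rng = np.random.default_rng(42)
temp_obs = rng.uniform(0, 30, size=200)
noise = np.exp(rng.normal(0, 0.2, size=200))  # multiplicative noise
resp_obs = predict_q10(temp_obs, r_ref=1.5, q10=2.0) * noise

temp_grid = np.array([5.0, 15.0, 25.0])
lower, median, upper = bootstrap_interval(temp_obs, resp_obs, temp_grid)
for t, lo, md, hi in zip(temp_grid, lower, median, upper):
    print(f"T={t:4.1f} C  estimate={md:.2f}  95% interval=({lo:.2f}, {hi:.2f})")
```

The interval produced here reflects only sampling uncertainty in the fitted curve. Uncertainty arising from input data and model structure would need to be treated separately.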
Furthermore, the development of this field toward mechanism integration and data fusion will help to deepen the systematic understanding of the coupling mechanism of “soil respiration–carbon sink management measures”. In the future, research can combine interpretable artificial intelligence with process models, which can not only improve the reliability of prediction, but also reveal the long-term impact pathways of different agricultural management scenarios (such as conservation tillage, organic fertilization, and cropping system adjustment) on soil respiration and carbon sinks, so as to provide decision support for screening efficient carbon sequestration technology systems adapted to regional characteristics and promoting the transition to “climate-smart” agriculture [
9,
37,
53,
54,
55,
56]. Therefore, soil respiration research based on machine learning not only promotes the progress of scientific knowledge of the carbon cycle, but also provides key technical support for accurate assessment, intelligent management and policy formulation of agricultural soil carbon sinks through method innovation and multi-scale application. It is expected to play a central role in achieving agricultural carbon neutrality and sustainable development goals.
4.5. Future and Prospects
Based on the above analysis, in order to promote the sustainable and healthy development of the field of quantitative research on agricultural soil respiration based on machine learning, and further strengthen its supporting role in the assessment and management of agricultural soil carbon sinks, future research should advance simultaneously in the following directions:
- (1)
Deepen the integration of model interpretability with mechanistic understanding to serve carbon sink mechanism discovery and management optimization. Current research still relies largely on “black box” prediction. In the future, methods such as interpretable artificial intelligence, causal inference and physics-informed neural networks should be systematically developed to reveal the complex nonlinear relationships and pathways linking the key environmental factors identified by machine learning models, agricultural management measures and soil respiration, so as to clarify the response mechanism and regulation potential of soil carbon sinks under different management scenarios [
28,
45]. On this basis, the deep coupling of data-driven models with process-based soil biogeochemical models and crop growth models is promoted, and a two-way fusion framework of “mechanism-guided data assimilation” and “data-driven parameter optimization” is constructed to develop a next-generation intelligent model with high prediction accuracy and ecological mechanisms, which significantly improves the long-term prediction and scenario simulation ability of the model for soil carbon sink dynamics under different climate scenarios and management strategies [
53].
- (2)
Construct a multi-source data intelligent assimilation and full-chain uncertainty management system to support the reliability of carbon sink accounting and decision-making risk management and control. In the face of the differences in spatial and temporal resolution and accuracy of multi-source heterogeneous data such as remote sensing, near-earth sensing, Internet of Things, and flux observation, it is necessary to focus on the development of intelligent data assimilation algorithms and quality control processes that adapt to the characteristics of agricultural ecosystems [
39]. At the same time, it is urgent to establish a full-chain system for quantifying, propagating, and attributing uncertainty from input data, model parameters, and structural assumptions through to carbon flux outputs, and to develop an uncertainty management framework that integrates Bayesian methods, ensemble simulation, and sensitivity analysis [
31,
49,
55]. This will provide a reliable risk assessment tool for agricultural carbon sink accounting, carbon trading and adaptive management, and promote the transformation of research results from scientific knowledge to carbon sink risk management and policy decision support.
- (3)
Expand the research network in terms of global diversity and regional representativeness, and enhance the universality and extrapolation ability of carbon sink models. At present, research on the types and geographical distribution of agricultural systems is still uneven. In the future, efforts should be made to build a collaborative observation–modeling network covering the world’s major agro-ecological zones (such as tropical plantations, arid irrigation areas, intensive temperate farmland, alpine pastoral areas) [
31,
56]. Long-term and cross-regional comparative studies are encouraged to systematically evaluate the applicability, stability and transferability of different machine learning models for soil respiration and carbon sink estimation under different climate–soil–management combinations [
56]. Constructing an open and standard regional case base and benchmark data set would promote the establishment of a globally representative model evaluation and calibration system, and enhance the universality and extrapolation reliability of carbon sink estimation models.
- (4)
Build an open and collaborative international research infrastructure and community to promote carbon sink data sharing and mutual recognition of models. Major national research institutions are encouraged to take the lead in jointly building an open and shared multi-source database of agricultural soil respiration and carbon sinks, a benchmark test platform and a model interoperability interface. The goal is to establish a regular international model intercomparison program and to organize joint research on the common challenges in carbon sink estimation. In particular, substantive cooperation in data, methods and talent should be strengthened between core research areas such as China, the United States and Europe, and emerging research areas such as the “Belt and Road” countries and the Global South, in order to jointly develop data standards, model specifications and application guidelines in this field and to build an inclusive, efficient and sustainable ecosystem for international scientific research cooperation.
- (5)
Strengthen research and development of agricultural management scenario simulation and decision support systems for carbon neutrality. Future research should pay more attention to the combination of soil respiration prediction models with agricultural management measures and carbon sink enhancement goals, and develop scenario simulation and optimization tools for carbon neutrality. By integrating multi-source data, mechanism models and machine learning algorithms, an intelligent decision support system integrating “monitoring–simulation–assessment–management” is constructed to realize dynamic assessment and optimization recommendations of soil carbon sink changes under different management measures, which directly serves the implementation of intelligent farmland management, national carbon sink accounting and agricultural carbon neutrality strategy.
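Direction (1) above calls for coupling data-driven models with process-based models. The following Python sketch shows one very simple form of such a “gray box” arrangement: a fixed Q10 temperature response serves as the process baseline, and a least-squares regression on soil moisture learns a correction to its log residuals. Everything in it is an assumption made for illustration, including the Q10 baseline and its parameters, the quadratic moisture effect used to generate the synthetic data, and all function and variable names. Ordinary least squares is used as a deliberately simple stand-in for a machine learning model.

```python
import numpy as np


def process_baseline(temp_c, r_ref=1.5, q10=2.0, t_ref=10.0):
    """Process-based part: Q10 temperature response with fixed parameters."""
    return r_ref * q10 ** ((np.asarray(temp_c) - t_ref) / 10.0)


def design_matrix(moisture):
    """Features for the data-driven correction: intercept, moisture, moisture^2."""
    m = np.asarray(moisture)
    return np.column_stack([np.ones(len(m)), m, m ** 2])


def rmse(pred, obs):
    """Root mean square error between predictions and observations."""
    return float(np.sqrt(np.mean((np.asarray(pred) - np.asarray(obs)) ** 2)))


# Synthetic data (hypothetical): respiration depends on temperature and on a
# moisture optimum that the process baseline does not represent.
rng = np.random.default_rng(0)
n = 300
temp = rng.uniform(0, 30, n)
moist = rng.uniform(0.1, 0.5, n)
moisture_effect = 1.0 - 10.0 * (moist - 0.3) ** 2  # between 0.6 and 1.0
resp = process_baseline(temp) * moisture_effect * np.exp(rng.normal(0, 0.05, n))

train = np.arange(0, 200)
test = np.arange(200, n)

# Data-driven part: regress the log residual of the baseline on moisture features.
log_resid = np.log(resp) - np.log(process_baseline(temp))
coef = np.linalg.lstsq(design_matrix(moist[train]), log_resid[train], rcond=None)[0]

# Hybrid prediction = process baseline multiplied by the learned correction.
baseline_pred = process_baseline(temp[test])
hybrid_pred = baseline_pred * np.exp(design_matrix(moist[test]) @ coef)

rmse_baseline = rmse(baseline_pred, resp[test])
rmse_hybrid = rmse(hybrid_pred, resp[test])
print(f"baseline RMSE: {rmse_baseline:.3f}")
print(f"hybrid RMSE:   {rmse_hybrid:.3f}")
```

Keeping the process baseline explicit means the learned component only has to explain what the mechanism leaves out, which is one reason such hybrid designs are considered easier to interpret than a purely data-driven model.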
This field is at a critical stage of transformation from “technology-driven” to “system integration” and from “method innovation” to “decision support”. With the continuous integration of new technologies such as interpretable AI, multi-modal data fusion, and digital twins, as well as the increasingly urgent demand for accurate monitoring and intelligent management of agricultural systems under global carbon neutrality goals, quantitative research on agricultural soil respiration based on machine learning is expected to achieve systematic breakthroughs in three aspects. First, at the technical application level, an intelligent monitoring–simulation–decision support system with “space–air–ground” integration will take shape, directly serving national-scale carbon sink accounting and intelligent farmland management. Second, at the scientific cognitive level, the understanding of the “respiration–carbon sink management” coupling mechanism of agroecosystems and of climate feedback will deepen, providing a theoretical basis and methodological support for the development of climate-smart agriculture. Third, at the level of the research paradigm, an interdisciplinary, cross-border, open and collaborative global cooperation model will form, building a scientific and technological innovation community for the sustainable development goals. By adhering to problem orientation, deepening technology integration and strengthening open collaboration, this field will play an indispensable core role in realizing the green transformation of agriculture, improving soil carbon sink function and coping with global climate change challenges.
4.6. Limitations of the Study
In this study, the development trend and knowledge structure of the field of quantitative research on agricultural soil respiration based on machine learning were systematically examined by means of bibliometrics, providing a macro-level overview for relevant scholars. However, this study also has some limitations, which should be kept in mind when interpreting the results.
First of all, bibliometric analysis mainly relies on the metadata of the literature (such as keywords, authors, institutions), and reveals the macro characteristics of research hotspots through word frequency co-occurrence and cluster analysis, but it cannot deeply analyze the specific research content, methodological details and empirical conclusions of each publication. For example, the high frequency of keywords can only reflect the surface correlation of research topics, but cannot reveal the specific differences in model selection, parameter optimization and other aspects of different research, and cannot comprehensively evaluate the predictive efficiency of each model. Therefore, the conclusions of this study should be regarded as a summary of the macro trends in the research field, rather than a systematic synthesis of specific scientific issues.
Secondly, the analysis scope of this study is limited to the English literature included in the Web of Science Core Collection, and the search deadline was 31 December 2025. Some non-English research and the latest published but not fully cited research results may be omitted, which to some extent affects the global representativeness and timeliness of the research conclusions. Future research can combine systematic review or meta-analysis methods to conduct more in-depth quantitative synthesis of core scientific issues, expand data sources, and include more regional research results to make up for the above deficiencies.
5. Conclusions
Based on the method of bibliometrics and the technology of scientific knowledge mapping, this study systematically revealed the development trend, knowledge structure and evolution logic of the field of “quantitative research of agricultural soil respiration based on machine learning” from 2021 to 2025. Through a comprehensive analysis of 966 articles in the core collection of Web of Science, this study found that the field showed rapid evolution characteristics, forming a triple-helix development model of “technology-driven, application-driven, and interdisciplinary”. The research topic has gone through a phased deepening process from the establishment of methodology and the diversification of technical paths to multi-source data fusion and spatial scale expansion, and has constructed an organic linkage academic ecosystem of “method–mechanism–application” with machine learning algorithms as the core, environment-driven mechanisms as support and regional application practice as guidance.
From the perspective of the international cooperation pattern, this field has formed a global scientific research collaboration network that is “Asia-led, Sino-US dual-core, and intercontinentally connected”. China dominates the scale of scientific research output, the United States plays a key pivotal role in the international cooperation network, and many European countries show unique connection value in the cross-regional knowledge flow. This pattern reflects the differentiated advantages of global scientific research forces, and also reflects the inevitable trend of collaborative innovation in the international scientific community in the context of climate change.
This study further pointed out, from the perspective of soil carbon sinks, that the quantitative study of soil respiration based on machine learning provides key support for accurate assessment and management decision-making for agricultural soil carbon sinks. Through high-precision prediction, uncertainty quantification and multi-source data fusion, this field is gradually forming an integrated carbon sink intelligent analysis framework of “monitoring–simulation–evaluation–management”, which is expected to directly serve the national carbon neutrality strategy and agricultural green transformation.
At present, the field is in a critical stage of transition from a period of rapid growth to a period of system integration. Future research needs to achieve key breakthroughs in the following dimensions: deepening the integration of model interpretability and ecological mechanisms, and promoting the paradigm shift from “black box prediction” to “gray box modeling”; constructing a multi-source data intelligent assimilation and full-chain uncertainty quantification system to provide a reliable analysis tool for carbon sink accounting; expanding case studies and comparative analysis of global diversified agroecosystems, and enhancing the adaptability and extrapolation ability of the model under different climate–soil–management combinations; and building an open and collaborative international research infrastructure and cooperation network to promote the global unification of data standards and model interfaces related to carbon sinks.
With the continuous empowerment of cutting-edge technologies such as interpretable artificial intelligence, multi-modal data fusion, and digital twins, this field is expected to achieve systematic breakthroughs at three levels: forming an intelligent monitoring–simulation–decision support system with “space–air–ground” integration at the technical application level; deepening the understanding of the carbon–water–nitrogen coupling cycle and climate feedback mechanism of agroecosystems at the scientific cognitive level; and promoting, via the research paradigm, the formation of an interdisciplinary, cross-border, open and collaborative global cooperation model. By continuously deepening method innovation, strengthening problem orientation, and expanding global collaboration, quantitative research on agricultural soil respiration based on machine learning will play an irreplaceable core role in promoting agricultural green transformation, improving soil carbon sink function, and responding to climate change challenges, contributing key scientific wisdom and technical solutions to sustainable development.
Author Contributions
Conceptualization, X.M.; methodology, X.M. and T.C.; validation, T.C. and L.W.; formal analysis, L.W. and T.C.; resources, L.W. and T.C.; data curation, X.M., J.H. and F.Z.; writing—original draft preparation, T.C. and X.M.; writing—review and editing, L.W. and X.M.; visualization, F.Z. and J.H.; supervision, L.W.; project administration, L.W.; funding acquisition, L.W. All authors have read and agreed to the published version of the manuscript.
Funding
This research was funded by “Inner Mongolia Department of Science and Technology 2024 major projects to prevent and control sand demonstration ‘unveiled marshal’ project, grant number 2024JBGS0016”, “Dynamic Change Analysis of Sediment Input into the Yellow River in Inner Mongolia Reach, grant number 202501010401A”, “the National Natural Science Foundation of China, grant number U2243212-2” and “Socio-economic Influencing Factors of Soil Erosion in Huangshui River Basin and Its Control Measures, grant number 23Q061”.
Data Availability Statement
No new data were created or analyzed in this study. Data sharing is not applicable to this article.
Conflicts of Interest
The authors declare no conflicts of interest.
References
- Le Quéré, C.; Andrew, R.M.; Friedlingstein, P.; Sitch, S.; Hauck, J.; Pongratz, J.; Pickers, P.A.; Korsbakken, J.I.; Peters, G.P.; Canadell, J.G.; et al. Global Carbon Budget 2018. Earth Syst. Sci. Data 2018, 10, 2141–2194. [Google Scholar] [CrossRef]
- Beillouin, D.; Corbeels, M.; Demenois, J.; Berre, D.; Boyer, A.; Fallot, A.; Feder, F.; Cardinael, R. A global meta-analysis of soil organic carbon in the Anthropocene. Nat. Commun. 2023, 14, 3700. [Google Scholar] [CrossRef]
- Singh, M.; Kumar, B.; Chattopadhyay, R.; Amarjyothi, K.; Sutar, A.K.; Roy, S.; Rao, S.A.; Nanjundiah, R.S. Machine learning for Earth System Science (ESS): A survey, status and future directions for South Asia. arXiv 2021, arXiv:2112.12966. [Google Scholar]
- Schimel, D.; Pavlick, R.; Fisher, J.B.; Asner, G.P.; Saatchi, S.; Townsend, P.; Miller, C.; Frankenberg, C.; Hibbard, K.; Cox, P. Observing terrestrial ecosystems and the carbon cycle from space. Glob. Change Biol. 2015, 21, 1762–1776. [Google Scholar] [CrossRef]
- Laskowski, R.; Niklińska, M.; Nycz-Wasilec, P.; Wójtowicz, M.; Weiner, J. Variance components of the respiration rate and chemical characteristics of soil organic layers in Niepolomice Forest, Poland. Biogeochemistry 2003, 64, 149–163. [Google Scholar] [CrossRef]
- Ogle, S.M.; Breidt, F.J.; Paustian, K. Agricultural management impacts on soil organic carbon storage under moist and dry climatic conditions of temperate and tropical regions. Biogeochemistry 2005, 72, 87–121. [Google Scholar] [CrossRef]
- Davidson, E.A.; Janssens, I.A. Temperature sensitivity of soil carbon decomposition and feedbacks to climate change. Nature 2006, 440, 165–173. [Google Scholar] [CrossRef]
- Lal, R. Soil carbon sequestration impacts on global climate change and food security. Science 2004, 304, 1623–1627. [Google Scholar] [CrossRef]
- Aria, M.; Cuccurullo, C. bibliometrix: An R-tool for comprehensive science mapping analysis. J. Informetr. 2017, 11, 959–975. [Google Scholar] [CrossRef]
- Chen, C. The CiteSpace Manual; College of Computing and Informatics, Drexel University: Philadelphia, PA, USA, 2014; pp. 1–84. [Google Scholar]
- Van Eck, N.J.; Waltman, L. Software survey: VOSviewer, a computer program for bibliometric mapping. Scientometrics 2010, 84, 523–538. [Google Scholar] [CrossRef]
- Chen, C.; Ibekwe-SanJuan, F.; Hou, J. The Structure and Dynamics of Co-Citation Clusters: A Multiple-Perspective Co-Citation Analysis. J. Am. Soc. Inf. Sci. Technol. 2010, 61, 1386–1409. [Google Scholar] [CrossRef]
- Ma, Y.; Song, T. A Bibliometric Review and Interdisciplinary Analysis of the Brahmaputra River. Water 2024, 16, 3115. [Google Scholar] [CrossRef]
- Chen, C.; Song, M. Visualizing a Field of Research: A Methodology of Systematic Scientometric Reviews. arXiv 2019, arXiv:1902.05888. [Google Scholar] [CrossRef]
- Bornmann, L.; Mutz, R. Growth rates of modern science: A bibliometric analysis based on the number of publications and cited references. J. Assoc. Inf. Sci. Technol. 2015, 66, 2215–2222.
- Ellegaard, O.; Wallin, J.A. The bibliometric analysis of scholarly production: How great is the impact? Scientometrics 2015, 105, 1809–1831.
- Larsen, P.O.; von Ins, M. The rate of growth in scientific publication and the decline in coverage provided by Science Citation Index. Scientometrics 2010, 84, 575–603.
- Fan, M.; Yang, W.; Wu, J.; Zhang, H.; Ye, Z.; Shaukat, M. Soil Organic Carbon Research and Hotspot Analysis Based on Web of Science: A Bibliometric Analysis in CiteSpace. Agriculture 2024, 14, 1774.
- Zhang, X.; Zhao, T.; Xu, H.; Liu, W.; Wang, J.; Chen, X.; Liu, L. GLC_FCS30D: The first global 30 m land-cover dynamics monitoring product with a fine classification system for the period from 1985 to 2022 generated using dense-time-series Landsat imagery and the continuous change-detection method. Earth Syst. Sci. Data 2024, 16, 1353–1381.
- Zhang, D.; Huggins, J.; Li, Q.; Ramachandran, S.; Serbin, S.; Webb, C.; Zuo, Z.; Dietze, M. Mapping the North American Terrestrial Carbon Cycle: A Process-based Reanalysis Using State Data Assimilation (SDA). bioRxiv 2026.
- Cai, H.; Liu, S.; Shi, H.; Zhou, Z.; Jiang, S.; Babovic, V. Toward improved lumped groundwater level predictions at catchment scale: Mutual integration of water balance mechanism and deep learning method. J. Hydrol. 2022, 613, 128495.
- Zhou, Q.; Guan, K.; Wang, S.; Hipple, J.; Chen, Z. From satellite-based phenological metrics to crop planting dates: Deriving field-level planting dates for corn and soybean in the U.S. Midwest. ISPRS J. Photogramm. Remote Sens. 2024, 216, 259–273.
- Donthu, N.; Kumar, S.; Mukherjee, D.; Pandey, N.; Lim, W.M. How to conduct a bibliometric analysis: An overview and guidelines. J. Bus. Res. 2021, 133, 285–296.
- Liu, D.; Che, S.; Zhu, W. Visualizing the Knowledge Domain of Academic Mobility Research from 2010 to 2020: A Bibliometric Analysis Using CiteSpace. SAGE Open 2022, 12, 21582440211068510.
- Arokiasamy, A.R.A.; Tan, R.S.E.; Deng, P.; Krishnasamy, H.N.; Liu, M.; Wu, G.; Wider, W. A bibliometric deep-dive: Uncovering key trends, emerging innovations, and future pathways in sustainable employability research from 2014 to 2024. Discov. Sustain. 2024, 5, 664.
- Huang, N.; Wang, L.; Song, X.-P.; Black, T.A.; Jassal, R.S.; Myneni, R.B.; Wu, C.; Wang, L.; Song, W.; Ji, D.; et al. Spatial and temporal variations in global soil respiration and their relationships with climate and land cover. Sci. Adv. 2020, 6, eabb8508.
- Liu, C.; Yuan, Y.; Zhou, A.; Guo, L.; Zhang, H.; Liu, X. Development trends and research frontiers of preferential flow in soil based on CiteSpace. Water 2022, 14, 3036.
- Yurtsever, M.M.E.; Küçükmanisa, A.; Kilimci, Z.H. A novel transformer-based approach for soil temperature prediction. arXiv 2023, arXiv:2311.11626.
- Jiang, J.; Feng, L.; Hu, J.; Liu, H.; Zhu, C.; Chen, B.; Chen, T. Global soil respiration predictions with associated uncertainties from different spatio-temporal data subsets. Ecol. Inform. 2024, 82, 102777.
- Tamayo-Vera, D.; Wang, X.; Mesbah, M. A Review of Machine Learning Techniques in Agroclimatic Studies. Agriculture 2024, 14, 481.
- Li, T.; Cui, L.; Kuhnert, M.; McLaren, T.I.; Pandey, R.; Liu, H.; Wang, W.; Xu, Z.; Xia, A.; Dalal, R.C.; et al. A comprehensive review of soil organic carbon estimates: Integrating remote sensing and machine learning technologies. J. Soils Sediments 2024, 24, 3556–3571.
- Wadoux, A.M.J.C. Using deep learning for multivariate mapping of soil with quantified uncertainty. Geoderma 2019, 351, 59–70.
- He, G.-F.; Yin, Z.-Y.; Zhang, P. Uncertainty quantification in data-driven modelling with application to soil properties prediction. Acta Geotech. 2025, 20, 843–859.
- Novielli, P.; Magarelli, M.; Romano, D.; Di Bitonto, P.; Stellacci, A.M.; Monaco, A.; Amoroso, N.; Bellotti, R.; Tangaro, S. Leveraging explainable AI to predict soil respiration sensitivity and its drivers for climate change mitigation. Sci. Rep. 2025, 15, 12527.
- Abdulraheem, M.I.; Zhang, W.; Li, S.; Moshayedi, A.J.; Farooque, A.A.; Hu, J. Advancement of Remote Sensing for Soil Measurements and Applications: A Comprehensive Review. Sustainability 2023, 15, 15444.
- Padarian, J.; Minasny, B.; McBratney, A.B. Using deep learning for digital soil mapping. Soil 2019, 5, 79–89.
- Zhang, J.; Liu, J.; Chen, Y.; Feng, X.; Sun, Z. Knowledge Mapping of Machine Learning Approaches Applied in Agricultural Management—A Scientometric Review with CiteSpace. Sustainability 2021, 13, 7662.
- Du, C.; Wu, Y.; Ma, L.; Lei, D.; Yuan, Y.; Ren, X.; Wang, Q.; Jian, J.; Du, X. Bibliometric Analysis of Research on the Effects of Conservation Management on Soil Water Content Using CiteSpace. Water 2024, 16, 3415.
- Ryo, M.; Angelov, B.; Mammola, S.; Kass, J.M.; Benito, B.M.; Hartig, F. Explainable artificial intelligence enhances the ecological interpretability of black-box species distribution models. Ecography 2021, 44, 199–211.
- Hu, J.; Zhou, J.; Zhou, G.; Luo, Y.; Xu, X.; Li, P.; Liang, J. Improving estimations of spatial distribution of soil respiration using the Bayesian maximum entropy algorithm and soil temperature as auxiliary data. PLoS ONE 2016, 11, e0146589.
- Zhao, J.; Lange, H.; Meissner, H. Gap-filling continuously-measured soil respiration data: A highlight of time-series-based methods. Agric. For. Meteorol. 2020, 285–286, 107912.
- Zeng, J.; Zhou, T.; Cao, L.; Yu, Y.; Tan, E.; Zhang, Y.; Wu, X.; Zhang, J.; Zhang, Q.; Qu, Y.; et al. Various responses of global heterotrophic respiration to variations in soil moisture and temperature enhance the positive feedback on atmospheric warming. Commun. Earth Environ. 2025, 6, 475.
- Jiang, P.; Chen, X.; Missik, J.E.C.; Gao, Z.; Liu, H.; Verbeke, B.A. Encoding diel hysteresis and the Birch effect in dryland soil respiration models through knowledge-guided deep learning. Front. Environ. Sci. 2022, 10, 1035540.
- Wang, X.; Yang, Y.; Lv, J.; He, H. Past, present and future of the applications of machine learning in soil science and hydrology. Soil Water Res. 2023, 18, 67–80.
- Ferdous, S.; Ahire, J.; Bergman, R.; Xin, L.; Blanc-Betes, E.; Zhang, Z.; Wang, J. A machine learning model using the snapshot ensemble approach for soil respiration prediction in an experimental Oak Forest. Ecol. Inform. 2025, 85, 102991.
- Grunwald, S. Artificial intelligence and soil carbon modeling demystified: Power, potentials, and perils. Carbon Footpr. 2022, 1, 5.
- Liu, J.; Hu, J.; Liu, H.; Han, K. Global soil respiration estimation based on ecological big data and machine learning model. Sci. Rep. 2024, 14, 11576.
- Win, K.; Sato, T.; Tsuyuki, S. Application of Multi-Source Remote Sensing Data and Machine Learning for Surface Soil Moisture Mapping in Temperate Forests of Central Japan. Information 2024, 15, 485.
- Chen, Z.; Cai, Y.; Pan, C.; Jiang, H.; Jia, Z.; Li, C.; Zhou, G. Spatial Heterogeneity of Soil Respiration and Its Relationship with the Spatial Distribution of the Forest Ecosystem at the Fine Scale. Forests 2025, 16, 678.
- Yu, J.-C.; Liou, W.-T.; Chiang, P.-N. Prolonged Spring Drought Suppressed Soil Respiration in an Asian Subtropical Monsoon Forest. Forests 2025, 16, 1554.
- Quetin, G.R.; Famiglietti, C.A.; Dadap, N.C.; Bloom, A.A.; Bowman, K.W.; Diffenbaugh, N.S.; Liu, J.; Trugman, A.T.; Konings, A.G. Attributing past carbon fluxes to CO2 and climate change: Respiration response to CO2 fertilization shifts regional distribution of the carbon sink. Glob. Biogeochem. Cycles 2023, 37, e2022GB007478.
- Sitch, S.; Friedlingstein, P.; Gruber, N.; Jones, S.D.; Murray-Tortarolo, G.; Ahlström, A.; Doney, S.C.; Graven, H.; Heinze, C.; Huntingford, C.; et al. Recent trends and drivers of regional sources and sinks of carbon dioxide. Biogeosciences 2015, 12, 653–679.
- Lloyd, J.; Taylor, J.A. On the Temperature Dependence of Soil Respiration. Funct. Ecol. 1994, 8, 315–323.
- Sweet, L.; Müller, C.; Anand, M.; Zscheischler, J. Cross-validation strategy impacts the performance and interpretation of machine learning models. Artif. Intell. Earth Syst. (AIES) 2023, 2, e230026.
- Mirjalili, S.; Faris, H.; Aljarah, I. (Eds.) Evolutionary Machine Learning Techniques: Algorithms and Applications; Springer: Singapore, 2019; Volume 1090.
- Elshall, A.S.; Ye, M.; Niu, G.-Y.; Barron-Gafford, G.A. Bayesian inference and predictive performance of soil respiration models in the presence of model discrepancy. Geosci. Model Dev. 2019, 12, 2009–2032.