1. Introduction
Global energy consumption, according to the International Energy Agency, is expected to double by 2050 [1]. Unfortunately, the share of energy derived from fossil fuels remains high at around 80%, not only in certain countries but also in specific economic sectors [2]. In agricultural production specifically, diesel fuel currently dominates, primarily due to its economic efficiency, durability, and reduced emissions of unburned hydrocarbons and carbon dioxide [3,4,5]. Although diesel is being successfully replaced by biodiesel, the fuel of the 21st century, in many countries [6], biodiesel production in the Republic of Serbia has unfortunately not yet reached a competitive level in the market [7]. Modeling of tractor fuel consumption has been explored internationally, confirming the global relevance of such analyses. In Finland, Jokiniemi et al. [8] modeled fuel consumption across different silage-harvesting methods, integrating operational, mechanical, and environmental parameters. Nagar et al. [9] proposed a generalized machine learning framework to predict tractor fuel use under diverse Indian farming conditions, while Yang et al. [10] applied neural networks for estimating fuel consumption in autonomous agricultural systems. These studies demonstrate the broader applicability of AI-based models in optimizing fuel use and reducing emissions in agriculture across varying geographic and technological contexts [8,9,10]. Therefore, the analysis of energy inputs (fuel consumption) and the monitoring of gas emissions in different operational modes of tractors are of exceptional importance.
The diverse operational purposes of tractors necessitate structural, energy, ergonomic, and ecological adaptations to current conditions. A tractor, as a working machine in agriculture, serves multifunctional purposes, and its ability to adapt to operational requirements can significantly contribute to the reduction in exhaust gas emissions resulting from fuel combustion in internal combustion engines [11,12,13,14].
Many studies have addressed the impact of exhaust gases on the environment, with authors stating that vehicle exhaust gases have a significant effect on global warming, contribute to acid rain, and affect air composition [15] and, consequently, the health of humans and other living beings [16,17]. In previous work [18], the authors developed a simplified approach to assess the key factors influencing tractor fuel consumption. To assess tractor exhaust emissions and the factors that increase them more accurately, the factors are classified into three groups [19]. The first group covers the technological equipment of the tractor, i.e., the technical and technological solutions for post-treatment of exhaust gases and the technological level of the engine itself. The second group covers operational factors, such as the working setup of the tractor for a given operation in accordance with the working conditions and the requirements of the implement with which the tractor is aggregated. The third group covers the fuel and lubricants burned in the internal combustion engine, including the type and chemical composition of the fuel.
This paper focuses on the second group, namely the operational factors that influence the quantity and composition of exhaust gases in the agro-technical operation of primary soil tillage with a plow. A tractor working in conjunction with a plow operates under variable loads, so variations in operation are reflected in different emissions of exhaust gases into the atmosphere. An operational mode that is inadequately adjusted to the given operating conditions can therefore be regarded as one of the causes of variation in exhaust gas emissions and is categorized as an operational factor [20,21]. Determining the optimal operating mode of the tractor is also attractive from an economic point of view: substantial expenditures are not needed to reduce exhaust gas emissions, because corrective actions or software solutions can bring about specific benefits.
Various authors have analyzed the fuel consumption of tractors and the environmental impact of exhaust gases for different operations and conditions [22,23]. According to the authors of [24], the differences in the observed parameters range from a few percent to several-fold.
The setting of the tractor’s working mode, especially the engine load, has a major influence on exhaust emissions. For example, it has been found that operating at engine speeds of around 1000 rpm and low torque of around 30% Mmax is not favorable from either an economic or an ecological point of view. At higher torque (>50% Mmax) and medium (1000–2000 rpm) or high (>2000 rpm) engine speeds, the tractor is more efficient and more acceptable from an environmental point of view [25,26].
In a previous paper [27], it was stated that the specific traction force of the tractor during tillage changes with the clay content, the volumetric mass of the soil, the organic matter content of the soil, and the cohesive forces that prevail between the soil aggregates. It was also found that when an increased rate of organic fertilizer was applied for several years, the consumption of diesel fuel on the plot was reduced by 25%, while the standard rate of organic fertilizer saved 14% of diesel fuel compared to the plot that was not fertilized with organic fertilizer. In addition to organic matter, the moisture content of the soil also has a major influence on the specific draught and thus on fuel consumption [28].
In addition to the standard methods for modeling exhaust emissions, artificial intelligence (AI) has been increasingly used for these problems in recent years. The most commonly applied models are artificial neural networks (ANNs), which have proven to be exceptionally good at prediction compared to traditional numerical models. Most of these models are based on the Levenberg–Marquardt algorithm in conjunction with log- and tan-sigmoid transfer functions, which provide the best results [29,30,31,32,33]. In contrast to the most commonly applied models, such as ANNs, Decision Trees (DTs), Random Forest (RF), Support Vector Machines (SVMs), and others [34,35], the XGBoost model was applied in this study to predict specific fuel consumption and CO2 emissions due to the specificity of the data obtained from the experiment (for model descriptions, see Section 2.8).
Therefore, the main objective of this study is to develop and validate predictive models using the XGBoost machine learning algorithm to estimate CO2 emissions and specific fuel consumption of an agricultural tractor during primary soil tillage. The models are based on real operational data collected under field conditions, including soil properties (clay content, organic matter, and moisture), tractor settings (engine load and working regime), and exhaust gas measurements. By applying AI techniques to field data, this research aims to improve understanding of the relationships between soil–tractor interactions and environmental impact and support future development of data-driven decision-support tools in precision agriculture.
2. Materials and Methods
2.1. Structure of the Experiment
The experimental tests were carried out in December 2019 and November 2020 on a sample area of 243 ha in Vojvodina in the municipality of Stara Pazova at the coordinates 44°59′38.0″ N 20°06′10.3″ E, during the agrotechnical time frame for autumn/winter soil tillage in Serbian conditions. The area is located at an altitude of 81–86 m above sea level; this narrow range matters because it means that terrain elevation has only a minimal influence on the traction resistance of the plough. The direction of movement of the tractor on the plot is shown in Figure 1. The length of the tractor’s path during operation in the direction 75° E–256° W was approximately 2000 m. The plot is divided into 80 cultivation zones, of which 33 were effectively utilized during the trials in 2019 and 2020. The soil type on the trial plot is carbonate chernozem.
The control of soil compaction on the plot was performed at the tillage depth of 25 cm with a penetrometer at 50 points, and in the 2019 production year, resistances between 100 and 200 N/cm2 were detected, while in 2020, the values were between 90 and 220 N/cm2.
In each production year, soil moisture was monitored in certain cultivation zones where the tractor was driven in order to rule out a possible influence of moisture on the change in resistance during tillage. Sampling was carried out at 3 points during a tractor pass. Soil moisture during tillage in 2019 was between 24.10 and 27.63%, while in 2020, moisture of 24.20–27.20% was measured according to the gravimetric method.
The average volumetric mass of the sampled soil, determined with 100 cm3 Kopecky cylinders on samples taken from the surface and from the ploughing depth of 25 cm, was 1.42 g/cm3 in 2019 and 1.40 g/cm3 in 2020. During tillage, the tractor moved along the 75° E–256° W direction (vector orientation) on the plot. One working mode of the tractor was maintained from end to end of the plot, and the six working modes were rotated at each subsequent pass, with passes offset by the plough working width of 2.4 m. The tractor moved within the same cultivation zones, with similar soil moisture and similar soil compaction. In this way, the influence of secondary factors was minimized, and the characteristics of the soil could exert their influence on the variation of the observed factors.
The wheel slippage in 2019 was between 9 and 16%, while in 2020 it was 8–14%. A 2017 Fendt 936 tractor (AGCO, Marktoberdorf, Germany) was used for the test. The tractor has a Deutz Fahr TTCD 6-cylinder diesel engine with a displacement of 7750 cm3 and a rated power of 263 kW (358 hp) at 2200 rpm in accordance with the ECE R24 standard. The diesel fuel in the tractor is defined according to the DIN EN 590 standard, as required by the manufacturer. The compression ratio of the engine is 1:18 ± 0.3 and the fuel injection is a Deutz common rail system with an EDC 17 hardware engine control unit from Bosch. The engine meets the Tier IV emissions standard with a 2-stage turbocharger and intercooled charge air, an EGR valve with cooled exhaust gases, a DPF filter, and SCR technology with AdBlue fluid, which is declared in accordance with the DIN 70070 standard and consists of urea and demineralized water in a ratio of 32.5:67.5. The DPF filter was regenerated in 80 working hours in 2019, while the regeneration in 2020 took 130 working hours. All experimental setup data are shown in Table 1.
During tractor operation, the Tractor Management System (TMS) was used to set the transmission ratio with load limit control in combination with Vario technology. In this way, communication between the engine and transmission is direct, and the TMS processor is responsible for the most efficient way of operating each mode.
In addition to the specified factory weight of 10,830 kg, the mass of the tractor in operation has been increased with a ballast weight of 3000 kg, namely, an 1800 kg front ballast weight and 1200 kg (2 × 600 kg) ballast weights in the rear wheels.
Soil cultivation was performed with a 6-furrow reversible plow Kuhn Multi-Master 183, three-point hitched, with a working width of 240 cm and a support wheel that limited the soil cultivation depth to 25 cm.
During plowing, the tractor moved entirely on the unplowed ground (the “on land” variant), using navigation with an accuracy of 2.5 cm (1 inch), thus maintaining a uniform working width of 2.4 m.
2.2. Categorization of Input Variables
To enable the inclusion of diverse agronomic and environmental conditions within the machine learning model, several continuous input variables were transformed into categorical classes based on expert-defined thresholds and statistical distribution. Specifically:
Soil texture was categorized into three ordinal classes (1 = light, 2 = medium, 3 = heavy), based on the percentage of clay content as defined by FAO soil texture classification. Class 1 included soils with clay content < 20%, class 2 represented 20–35%, and class 3 referred to soils with >35% clay.
Humus content was categorized into three classes according to the agronomic interpretation: 1 = low (<2%), 2 = moderate (2–4%), and 3 = high (>4%).
Tractor working regime (engine load and gear combination) was encoded as categorical values (1, 2, and 3), representing predefined field operation setups: (1) low engine RPM and low gear, (2) medium engine RPM and medium gear, and (3) high engine RPM and high gear. These settings were based on typical field operation standards and predefined during the experimental design phase.
This binning approach allowed for improved model generalization while still preserving the agronomic interpretability of categorical inputs.
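The binning described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function and constant names are hypothetical, and the handling of values that fall exactly on a boundary (20% and 35% clay, 2% and 4% humus are assigned to the middle class) follows the ranges quoted in the text rather than any code published by the authors.

```python
# Thresholds quoted in Section 2.2 (percent by mass).
CLAY_LIMITS = (20.0, 35.0)   # <20 light, 20-35 medium, >35 heavy
HUMUS_LIMITS = (2.0, 4.0)    # <2 low, 2-4 moderate, >4 high


def ordinal_class(value, limits):
    """Map a continuous value to class 1, 2 or 3 using (low, high) limits.

    Assumption: both boundary values belong to the middle class.
    """
    low, high = limits
    if value < low:
        return 1
    if value <= high:
        return 2
    return 3


def encode_sample(clay_pct, humus_pct, regime):
    """Return the categorical feature vector for one observation.

    `regime` is assumed to be already coded as an integer class.
    """
    return [
        regime,
        ordinal_class(clay_pct, CLAY_LIMITS),
        ordinal_class(humus_pct, HUMUS_LIMITS),
    ]


print(encode_sample(clay_pct=28.0, humus_pct=3.1, regime=2))
```

Keeping the thresholds in named constants makes it easy to rerun the encoding with other class limits, for example the Arany-number ranges used in the laboratory classification.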
2.3. Remote Method for Determining Management Zones Based on NDVI
Remote sensing based on the Normalized Difference Vegetation Index (NDVI) was used as the starting point for delineating management zones with differing characteristics on the 243 ha experimental plot, using satellite data collected in previous growing seasons. NDVI values served as a proxy for spatial variability in vegetation and indirectly reflected underlying differences in soil characteristics. These zones were used to define different sampling locations for measurements and were also coded as categorical variables during the machine learning model development to account for site-specific variability. Remote sensing is a relatively simple way to collect a large amount of multidimensional data [36,37,38].
The factors that led to variations in the NDVI index of the observed plant species on the experimental plot were drought, healthy and diseased plants, insects, soil compaction, nutrient availability, the quality of performed agricultural operations, variations in the textural composition of the soil within the plot, and many other factors.
In order to eliminate the factors that affect the NDVI index in a single production year and to focus on the factors that are present in the long term on the plot (such as different soil composition), a multi-year analysis of images was performed over a period of seven years using Landsat satellites. Analysis of the NDVI imagery resulted in the formation of 80 management zones of irregular shape, ranging in size from 0.92 ha to 5.9 ha [39].
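For readers unfamiliar with the index, NDVI is computed per pixel from the near-infrared and red reflectance as (NIR − Red)/(NIR + Red). The sketch below shows that calculation and one possible way of combining several years of imagery. The per-pixel mean across years is an assumption made for illustration; the text does not state which aggregation was used, and the function names are hypothetical.

```python
import numpy as np


def ndvi(nir, red):
    """Per-pixel NDVI = (NIR - Red) / (NIR + Red); NaN where NIR + Red == 0."""
    nir = np.asarray(nir, dtype=float)
    red = np.asarray(red, dtype=float)
    total = nir + red
    return np.divide(nir - red, total,
                     out=np.full_like(total, np.nan),
                     where=total != 0)


def multi_year_mean(ndvi_rasters):
    """Average co-registered NDVI rasters pixel by pixel, ignoring NaNs.

    Assumption: a simple mean is used to suppress single-season effects.
    """
    return np.nanmean(np.stack(ndvi_rasters), axis=0)


# Two tiny synthetic "rasters" (reflectance values are made up).
year_a = ndvi(np.array([0.50, 0.40]), np.array([0.10, 0.20]))
year_b = ndvi(np.array([0.45, 0.42]), np.array([0.15, 0.18]))
print(multi_year_mean([year_a, year_b]))
```

Pixels with a persistently low or high multi-year value would then be grouped into zones; the zoning step itself is not sketched here because the text does not describe it.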
2.4. Soil Sampling with an Automatic Soil Sampler
After determining the management zones, soil sampling was performed within each of the 80 zones using an automatic soil sampler mounted on a car and guided by GPS navigation. The number of samples per zone ranged from 5 for a management zone of 0.92 ha up to 30 for a management zone of 5.9 ha. All the samples from one management zone were then combined to form one representative sample, which was analyzed in the laboratory. A total of about 1200 samples were taken. The sampling depth corresponded to the depth at which the plowing operation is performed, i.e., 25 cm [40].
2.5. Soil Testing Methods in the Laboratory
The upper limit of soil plasticity was determined by the Arany method (MSZ-08-0205-2:1978) [41,42], which is based on determining the amount of water (in cm3) that needs to be added, with continuous mixing, to air-dried soil to reach the upper limit of plasticity.
Although the Arany Yarn Number method is not internationally recognized as a standard for soil texture classification based on particle-size distribution (such as USDA or FAO methods), it is widely used in Serbia and parts of Central Europe as a practical and economical proxy for estimating soil plasticity and workability. Its selection in this study was motivated by the goal of future field-level implementation, where cost and simplicity are critical.
To support comparability with global standards, we refer to [43], who established correlations between the Arany index and conventional soil texture classes based on sand, silt, and clay content. This relationship enables the alignment of our soil classification results with globally recognized texture systems. To differentiate the results according to different soil compositions, the classification of the clay content in the soil was carried out as follows: 42–48 Arany number values indicate a low clay level, 49–51 Arany number values indicate a medium clay level, and 52–60 Arany number values indicate a high clay level.
The content of organic carbon, i.e., humus, in the soil was determined by the Hungarian standard methods MSZ 08-0210:1977 and MSZ-08-0452:1980 [44,45]. For the classification of soil according to humus content, a division analogous to that for clay was made: low humus level for 2–3.06%, medium humus level for 3.07–3.2%, and high humus level for 3.21–4.00%.
The instantaneous soil moisture content was determined by the gravimetric method, drying soil samples in an oven at 105 °C to a constant mass.
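The arithmetic behind the gravimetric method, and behind the volumetric mass obtained with the 100 cm3 Kopecky cylinder mentioned in Section 2.1, is short enough to write out. The sketch assumes that moisture is expressed relative to the oven-dry mass, which is the usual convention but is not stated explicitly in the text; the function names and the sample masses are hypothetical.

```python
def gravimetric_moisture_pct(wet_mass_g, dry_mass_g):
    """Water content in % of oven-dry mass (assumed dry-mass basis)."""
    return (wet_mass_g - dry_mass_g) / dry_mass_g * 100.0


def volumetric_mass_g_cm3(dry_mass_g, cylinder_volume_cm3=100.0):
    """Bulk density from the oven-dry mass of an undisturbed core."""
    return dry_mass_g / cylinder_volume_cm3


# Made-up masses chosen to land inside the ranges reported in Section 2.1.
print(gravimetric_moisture_pct(wet_mass_g=177.5, dry_mass_g=142.0))  # 25.0
print(volumetric_mass_g_cm3(dry_mass_g=142.0))                       # 1.42
```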
2.6. Method for Collecting Data from Tractors in Real Time
For the purpose of collecting data in real time from the Fendt 936 tractor, a compatible data logger “FMB 120” manufactured by Teltonika (Teltonika, Vilnius, Lithuania) was used. The device communicated with the operations center using GNSS positioning (GPS, GLONASS, GALILEO, BEIDOU, SBAS, QZSS, DGPS, and AGPS) and GSM mobile technology, which are shown in Table 2 with their transmission and reception frequency ranges.
The working parameters of the tractor obtained via the FMB 120 data logger were the current location of the tractor on the experimental plot with an accuracy of <3 m, passed distance (m), fuel consumption (L
−1), engine load (%), and engine temperature (°C). Signals arrived at the operations center every 60 s. By placing POI polygons with management zones in the operations center, the exact position of the tractor within each zone was determined. Several authors have addressed the topic of real-time data collection from tractors via the CAN bus. For example, in previous work [42], a specific data logger was used to collect data for the analysis of fuel consumption, engine load, engine speed, tractor performance, etc.
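Assigning each logged position to a management zone is a point-in-polygon test. The following minimal sketch uses the standard ray-casting method on planar coordinates; it is not the algorithm of the operations-center software, which the text does not describe. The zone identifiers and coordinates are invented for illustration.

```python
def point_in_polygon(x, y, polygon):
    """Ray casting: count polygon edges crossed by a ray going right from (x, y)."""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Only edges that straddle the horizontal line through y can be crossed.
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def find_zone(x, y, zones):
    """Return the id of the first zone polygon containing the point, else None."""
    for zone_id, polygon in zones.items():
        if point_in_polygon(x, y, polygon):
            return zone_id
    return None


# Hypothetical zones in local metric coordinates.
zones = {
    "Z1": [(0, 0), (2, 0), (2, 2), (0, 2)],
    "Z2": [(2, 0), (4, 0), (4, 2), (2, 2)],
}
print(find_zone(1.0, 1.0, zones), find_zone(3.0, 1.0, zones), find_zone(5.0, 5.0, zones))
```

With a positional accuracy of a few metres, fixes that fall close to a zone border may be assigned to the neighbouring zone, so in practice such records might be discarded or buffered.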
2.7. Method for Measuring Exhaust Gases
The composition of exhaust gases was measured with a portable gas analyzer Testo 350 (Testo GmbH, Lenzkirch, Germany), which meets the requirements of EN50270:2000-01 [46]. The analyzer was configured to monitor the concentration of molecular oxygen O2, nitric oxide NO, nitrogen oxides NOx, nitrogen dioxide NO2, sulphur dioxide SO2, carbon dioxide CO2, and carbon monoxide CO. The time on the gas analyzer was synchronized with the time on the data logger installed in the tractor. This synchronization enabled the mutual correlation of real-time data from the mentioned devices and allowed for comparisons of different operating conditions, performance indicators, and exhaust gases.
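Once the two clocks agree, pairing a gas reading with a logger record reduces to a nearest-timestamp join. The sketch below is one simple way to do this, assuming timestamps in seconds and a matching tolerance of 30 s (half of the 60 s logger interval); both the tolerance and the record layout are assumptions, not details taken from the study.

```python
from bisect import bisect_left


def nearest_index(sorted_times, t):
    """Index of the element of a non-empty sorted list closest to t."""
    pos = bisect_left(sorted_times, t)
    if pos == 0:
        return 0
    if pos == len(sorted_times):
        return len(sorted_times) - 1
    before, after = sorted_times[pos - 1], sorted_times[pos]
    return pos if (after - t) < (t - before) else pos - 1


def align(gas_readings, logger_records, tolerance_s=30):
    """Pair each (t, co2) reading with the nearest (t, engine_load) record.

    `logger_records` must be sorted by time. Readings with no logger
    record within `tolerance_s` seconds are dropped.
    """
    if not logger_records:
        return []
    times = [t for t, _ in logger_records]
    pairs = []
    for t, co2 in gas_readings:
        i = nearest_index(times, t)
        if abs(times[i] - t) <= tolerance_s:
            pairs.append((t, co2, logger_records[i][1]))
    return pairs


logger = [(0, 50), (60, 70), (120, 65)]          # (s, engine load %)
gas = [(5, 12.1), (58, 13.0), (200, 11.0)]       # (s, CO2 %)
print(align(gas, logger))
```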
2.8. Machine Learning Model
Machine learning (ML) models have proven to be extremely effective tools for regression and classification problems. ML models learn, through training, a mapping from input data, which may not show a clear correlation with the target in advance, to output data. Extreme Gradient Boosting (XGBoost) is an ensemble learning technique that iteratively combines weak learners, mainly decision trees, into a strong prediction model, with each new tree correcting the prediction errors of the trees before it. The advantages of the XGBoost model are high flexibility, strong predictive power, high scalability, and high efficiency in model training. These models have proven extremely successful in processing various types of data, in both regression and classification, in part because they add regularization to the loss function that is minimized. A further advantage of XGBoost compared to other methods is that it achieves high model accuracy in less training time. The success of these models lies in the fact that they can be adapted to a wide range of applications: they can be applied directly, integrated with other algorithms, or optimized through their parameters [47,48,49,50,51,52,53,54].
Figure 2 shows the proposed XGBoost model for the analysis of experimentally collected data [55].
The model selection and the final comparison of model efficiency were based on standard statistical measures: the coefficient of determination, the mean squared error, and the root mean squared error [56]:
$$R^2 = 1 - \frac{\sum_{i=1}^{n} (y_i - \hat{y}_i)^2}{\sum_{i=1}^{n} (y_i - \bar{y})^2} \qquad (1)$$
$$\mathrm{MSE} = \frac{1}{n} \sum_{i=1}^{n} (y_i - \hat{y}_i)^2 \qquad (2)$$
$$\mathrm{RMSE} = \sqrt{\mathrm{MSE}} \qquad (3)$$
where
$y_i$: actual (measured) value.
$\hat{y}_i$: predicted value from the model.
$\bar{y}$: mean of the actual values.
$n$: total number of observations.
$R^2$ (coefficient of determination): measures the proportion of the variance in the dependent variable that is predictable from the independent variables.
MSE (mean squared error): the average of the squares of the differences between actual and predicted values.
RMSE (root mean squared error): the square root of MSE, providing an error metric in the same units as the target variable.
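As a check on the definitions, the three measures can be computed directly from paired measured and predicted values. The short function below is a plain-Python sketch using the standard formulas for R2, MSE, and RMSE; the function name and the sample numbers are illustrative only.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return R^2, MSE and RMSE for paired measured and predicted values."""
    n = len(y_true)
    mean_true = sum(y_true) / n
    ss_res = sum((y - p) ** 2 for y, p in zip(y_true, y_pred))   # residual sum of squares
    ss_tot = sum((y - mean_true) ** 2 for y in y_true)           # total sum of squares
    mse = ss_res / n
    return {"r2": 1.0 - ss_res / ss_tot, "mse": mse, "rmse": math.sqrt(mse)}


# Made-up CO2 percentages, not data from the experiment.
print(regression_metrics([10.0, 12.0, 14.0], [11.0, 12.0, 13.0]))
```

For this toy input the residual sum of squares is 2 and the total sum of squares is 8, so R2 is 0.75 and MSE is 2/3.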
All statistical processing was performed in Python using the scikit-learn and XGBoost libraries. The dataset was pre-processed by encoding categorical variables numerically. Model performance was evaluated using standard regression metrics: coefficient of determination (R2), mean squared error (MSE), and root mean squared error (RMSE). In addition, hyperparameter tuning (learning rate and booster type) was performed to optimize model performance and avoid overfitting.
3. Results
The aim of the study was to determine the relationship between the working regime, the composition of the tilled soil (clay and humus content), the specific fuel consumption, and the CO2 content in the exhaust gases. Since the input data are given in the form of categorical values, building each model requires converting the input data into numerical values. Based on all input data, the working regimes are categorized into six categories and the soil composition (humus and clay content) into three categories. Table 3 shows the input data converted into numerical values.
The correlation matrix (Figure 3) was created to determine the dependencies between various input and output parameters. Positive values of the correlation coefficient indicate a positive correlation, while negative values indicate a negative correlation. The strength of the correlation is indicated by the absolute value of the correlation coefficient. From the data shown, it can be seen that none of the input data has a strong correlation with the output data, and this is the reason for choosing the XGBoost regression model. The percentage composition of CO2 in the exhaust gases shows a negative correlation with all input parameters, and it is interesting to note that the percentage of humus shows a negative correlation in relation to the specific fuel consumption and the amount of CO2.
This trend can be explained by the fact that soils with higher humus content tend to have better physical properties, including improved structure, porosity, and water retention capacity. These factors reduce soil compaction and draught resistance during tillage operations, which in turn leads to lower engine load and reduced fuel consumption. As a result, the engine operates more efficiently, producing lower levels of CO2 in the exhaust gases.
The CO2 concentration in exhaust gases is expressed as a percentage (%), while the specific fuel consumption is given in grams per kilowatt-hour (g/kWh). These units are used consistently in all relevant tables and figure captions.
The correlation matrix (Figure 3) reveals a negative relationship between humus content and both CO2 emissions (r = −0.25) and specific fuel consumption (r = −0.10). This is consistent with the interpretation that increased humus content improves soil structure and reduces tillage resistance, thereby decreasing fuel demand and lowering emissions. Additionally, CO2 emissions show a weak negative correlation with working regime and clay content, while specific fuel consumption has a weak positive correlation with CO2 emissions (r = 0.12), suggesting that engine load and combustion dynamics are linked but also influenced by soil variability. The moderate positive correlation between clay and humus (r = 0.32) reflects soil texture tendencies in the tested plots.
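The coefficients quoted above are Pearson correlation coefficients. As a reminder of what each entry of such a matrix represents, a minimal sketch of the calculation follows; the column names and values are invented and do not reproduce Figure 3.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (spread_x * spread_y)


def correlation_matrix(columns):
    """Nested dict of pairwise Pearson coefficients for named columns."""
    names = list(columns)
    return {a: {b: pearson(columns[a], columns[b]) for b in names} for a in names}


# Invented toy data: humus class and CO2 percentage.
data = {
    "humus": [1, 2, 3, 1, 2, 3],
    "co2": [13.1, 12.4, 12.0, 12.9, 12.8, 11.7],
}
matrix = correlation_matrix(data)
print(round(matrix["humus"]["co2"], 2))
```

Because the inputs are ordinal class codes rather than continuous measurements, a rank-based coefficient such as Spearman's could also be considered; the text does not say which variant was used beyond reporting r.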
In order to optimize the XGBoost model for the different outputs for which it needs to be trained, the hyperparameters of the model, booster and learning rate, were varied to avoid overfitting the model. The training score, i.e., the quality of the corresponding model, is determined according to the standard statistical Equations (1) to (3), and the modeling results for the case of determining the percentage CO2 content in the combustion products are shown in Table 4.
Table 4 clearly shows that the last model has the best values of the statistical coefficients: R2 has the highest value and MSE and RMSE have the lowest values. For the selected values, the distribution diagram of the measured and calculated values is shown in Figure 4. Although the model with the gbtree booster and a learning rate of 0.02 shows a 22% lower R2 value, an 11% higher MSE value, and a 0.05% higher RMSE value, Figure 5, which shows the distribution of the measured and calculated values, indicates that this model is also acceptable.
When comparing the predicted and measured values for working modes 1 and 2 (Figure 6), a noticeable variation can be observed in a subset of data points. Specifically, prediction errors greater than 30% occurred in less than 15% of the total cases, indicating that the model maintained a relatively good level of generalization under field conditions. These mismatches are likely due to external factors not directly captured in the input dataset, such as localized variations in soil moisture, ploughing depth, and sensor dynamics during real-time data acquisition.
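A statistic of this kind can be obtained by counting the observations whose relative error exceeds the limit. The sketch below assumes that the prediction error is the absolute difference expressed as a percentage of the measured value; the text does not define the error formula, so this definition, the function name, and the sample values are all assumptions.

```python
def share_above_limit(measured, predicted, limit_pct=30.0):
    """Fraction of cases whose absolute relative error exceeds `limit_pct` percent."""
    errors_pct = [
        abs(p - m) / abs(m) * 100.0
        for m, p in zip(measured, predicted)
    ]
    return sum(e > limit_pct for e in errors_pct) / len(errors_pct)


# Invented values: errors of 0 %, 20 %, 40 % and 40 %.
print(share_above_limit([10.0, 10.0, 10.0, 10.0], [10.0, 12.0, 14.0, 6.0]))  # 0.5
```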
Despite these discrepancies, the model demonstrated adequate robustness and can be reliably used for the prediction of CO2 emissions based on the selected operational and soil parameters. This confirms its potential for application in exploratory analysis and as a component of decision-support systems in precision agriculture.
As shown in Figure 4, Figure 5 and Figure 6, there are several data points where the predicted CO2 values deviate significantly from the observed measurements. These differences are particularly noticeable in the peak ranges, where the actual CO2 values exceed 14%. Such deviations may result from operational conditions not captured by the recorded variables, such as instantaneous changes in plough depth, engine acceleration, or soil heterogeneity. Additionally, delays in sensor response during dynamic field conditions can introduce temporal offsets between measurements and predictions. Despite these outliers, the model follows the general trend and captures the variation structure of the data, as confirmed by the coefficient of determination exceeding 0.80.
The modeling results for specific fuel consumption are listed in Table 5. In contrast to the previous set of data, the statistical indicators are much lower when predicting the specific fuel consumption per hectare traveled. The best model was obtained with the “dart” booster and a learning rate of 0.05. The distribution of the measured and predicted values is shown in Figure 7, where the deviations between the data can be clearly seen.
To better evaluate model performance beyond standard regression metrics, we transformed the continuous output of specific fuel consumption (SFC) into a binary classification problem. A threshold of 350 g/kWh, based on the median of the empirical distribution, was used to divide data into “low” and “high” consumption classes. This allowed us to construct a confusion matrix (Figure 8) that visualizes the model’s classification accuracy.
Despite the limited R2 value obtained from regression (<0.3), the model correctly classified 65% of cases, showing reasonable alignment with observed labels. The matrix reveals a slight bias toward underestimating high-consumption cases, which can be attributed to the absence of key variables (e.g., real-time torque, load, and terrain resistance). Nonetheless, this level of accuracy is sufficient for exploratory or operational screening where a quick estimation of whether consumption is within or above the threshold is valuable. The chosen threshold and binarization were further justified by their prevalence in practical field advisory systems, where such thresholds are often predefined based on fuel economy targets.
In order to evaluate the classification performance of the XGBoost model for specific fuel consumption (SFC), a binary confusion matrix was constructed by transforming the continuous SFC output into two classes. The threshold for classification was defined at 350 g/kWh, based on the median value of the empirical distribution. Values below this threshold were categorized as “low consumption”, and values above it as “high consumption”.
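The transformation from continuous SFC values to a confusion matrix can be sketched as follows. The 350 g/kWh threshold comes from the text; treating a value exactly equal to the threshold as "low", as well as the function names and sample values, are assumptions made for illustration.

```python
THRESHOLD_G_KWH = 350.0  # median-based threshold stated in the text


def to_label(sfc, threshold=THRESHOLD_G_KWH):
    """1 = high consumption (above threshold), 0 = low consumption."""
    return 1 if sfc > threshold else 0


def confusion_counts(measured_sfc, predicted_sfc):
    """Count the four outcomes, with 'high consumption' as the positive class."""
    counts = {"tn": 0, "fp": 0, "fn": 0, "tp": 0}
    for measured, predicted in zip(measured_sfc, predicted_sfc):
        actual, guess = to_label(measured), to_label(predicted)
        if actual == 1 and guess == 1:
            counts["tp"] += 1
        elif actual == 0 and guess == 0:
            counts["tn"] += 1
        elif actual == 0 and guess == 1:
            counts["fp"] += 1   # low consumption predicted as high
        else:
            counts["fn"] += 1   # high consumption predicted as low
    return counts


# Invented SFC values in g/kWh.
print(confusion_counts([300.0, 400.0, 360.0, 340.0], [310.0, 345.0, 380.0, 370.0]))
```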
As shown in Figure 8, the model correctly classified 124 instances as true negatives (low SFC correctly identified) and 55 instances as true positives (high SFC correctly identified). However, 70 false negatives (high SFC incorrectly predicted as low) and 27 false positives (low SFC incorrectly predicted as high) were observed. This performance reveals a tendency of the model to misclassify higher fuel consumption cases, while low-consumption cases are identified more reliably. This misclassification may be attributed to the limited number of features related to engine load and field resistance included in the current dataset. Future improvements could include real-time telemetry data such as torque, fuel injection pressure, or load dynamics to enhance prediction accuracy across consumption classes.
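The summary figures implied by these counts can be recomputed in a few lines. The sketch takes "high consumption" as the positive class and keys the counts by the plain-language descriptions given in the text (for example, high SFC predicted as low), so it does not depend on how the off-diagonal cells are named; the dictionary keys and function name are hypothetical.

```python
# Counts as described in the text for Figure 8.
counts = {
    "low_as_low": 124,    # low SFC correctly identified
    "high_as_high": 55,   # high SFC correctly identified
    "high_as_low": 70,    # high SFC predicted as low
    "low_as_high": 27,    # low SFC predicted as high
}


def summarize(c):
    """Accuracy, sensitivity and specificity with 'high SFC' as the positive class."""
    total = sum(c.values())
    return {
        "accuracy": (c["low_as_low"] + c["high_as_high"]) / total,
        "sensitivity": c["high_as_high"] / (c["high_as_high"] + c["high_as_low"]),
        "specificity": c["low_as_low"] / (c["low_as_low"] + c["low_as_high"]),
    }


summary = summarize(counts)
print({name: round(value, 3) for name, value in summary.items()})
```

Under these assumptions the accuracy is 179/276, or about 0.65, which matches the 65% reported above, while sensitivity for high-consumption cases is 0.44 and specificity is about 0.82.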
This binary classification was used to simplify model evaluation and better visualize the discrimination ability of the model in distinguishing extreme consumption behaviors under field conditions, as recommended in similar studies on ML-based agricultural system modeling.
4. Discussion
This study demonstrates the applicability of the XGBoost algorithm for modeling CO2 emissions and specific fuel consumption (SFC) during soil tillage using real-world experimental data. The obtained R2 value exceeding 0.80 for CO2 emissions confirms the ability of the model to capture complex, nonlinear relationships among operational, mechanical, and soil-related variables. Similar levels of accuracy were reported in related studies; for example, Lim et al. (2024) achieved R2 values of 0.85 and 0.97 for NOx and PM emissions, respectively, during tractor plow tillage using regression-based models [23].
In contrast, the predictive power for SFC was substantially lower (R2 < 0.3), which aligns with findings from previous studies that suggest SFC is influenced by a wider set of dynamic variables, including torque load, real-time engine RPM fluctuations, traction resistance, and topographical variations [45]. The absence of such parameters in our model (particularly torque and dynamic load data) likely limited the predictive performance. Moreover, the categorization of soil texture and working regimes (e.g., as classes instead of continuous values) may have contributed to a loss of sensitivity and granularity in model training.
Although the inclusion of multiple agronomic and machine parameters (e.g., soil moisture, wheel slippage, and engine temperature) enabled a robust predictive model for emissions, further model improvement, especially for SFC, could be achieved through advanced feature engineering. Incorporating polynomial interactions, time-series transformations, or sensor fusion techniques (e.g., CAN-bus data integration) would provide richer input context. Studies in other domains (e.g., wastewater aeration [56] and trajectory prediction for autonomous vehicles [54]) support the potential of such approaches.
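Two of the feature-engineering ideas mentioned above, polynomial interactions and a simple time-series transformation, can be sketched as follows. This is a minimal illustration only: the feature names, the sample values, and the helper functions `polynomial_interactions` and `rolling_mean` are hypothetical and were not part of the study.

```python
from collections import deque
from itertools import combinations_with_replacement


def polynomial_interactions(row):
    """Add every degree-2 product of features (including squares) to one sample."""
    out = dict(row)
    for a, b in combinations_with_replacement(sorted(row), 2):
        out[f"{a}*{b}"] = row[a] * row[b]
    return out


def rolling_mean(values, window):
    """Trailing moving average; early positions average the samples seen so far."""
    buf = deque(maxlen=window)
    result = []
    for v in values:
        buf.append(v)
        result.append(sum(buf) / len(buf))
    return result


# Hypothetical single observation (values chosen for illustration only).
sample = {"soil_moisture": 0.25, "wheel_slip": 0.125, "engine_temp": 80.0}
features = polynomial_interactions(sample)

# Hypothetical engine-speed trace, e.g. as it might be read from CAN-bus telemetry.
rpm_trace = [1500, 1600, 1700, 1800]
smoothed = rolling_mean(rpm_trace, window=2)

print(len(features), features["soil_moisture*wheel_slip"])
print(smoothed)
```

Interaction terms let a model see combined effects such as moisture together with slippage directly, while smoothed or lagged telemetry summarises short-term load dynamics that a single static reading cannot capture.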
The selection of the DART (Dropouts meet Multiple Additive Regression Trees) booster with a learning rate of 0.11 was found optimal in our experiments, which is consistent with prior research highlighting the regularization and generalization benefits of DART in avoiding overfitting on smaller or noisy datasets [48]. DART’s dropout mechanism reduces the dominance of individual trees and encourages diversity in the boosted ensemble, improving generalization in agronomic settings where data may include irregularities or missing values [56].
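A configuration of this kind could be expressed with the XGBoost Python package roughly as sketched below. Only the booster type and the learning rate are taken from the text; the objective, the dropout settings, the number of boosting rounds, and the helper `train_dart` are assumptions made for illustration and should not be read as the settings used in the study.

```python
# Hypothetical configuration sketch; only "booster" and "eta" come from the text.
dart_params = {
    "booster": "dart",                # DART booster, as reported in the study
    "eta": 0.11,                      # learning rate reported in the study
    "objective": "reg:squarederror",  # assumed: squared-error regression objective
    "rate_drop": 0.1,                 # assumed: fraction of previous trees dropped per round
    "skip_drop": 0.5,                 # assumed: probability of skipping dropout in a round
    "sample_type": "uniform",         # dropped trees are selected uniformly
    "normalize_type": "tree",         # new trees get the same weight as each dropped tree
}


def train_dart(X, y, num_boost_round=200):
    """Train a DART model; num_boost_round=200 is an assumed value."""
    import xgboost as xgb  # imported lazily; requires the xgboost package

    dtrain = xgb.DMatrix(X, label=y)
    return xgb.train(dart_params, dtrain, num_boost_round=num_boost_round)


print(dart_params["booster"], dart_params["eta"])
```

The `rate_drop` and `skip_drop` parameters control how aggressively earlier trees are dropped during training, which is the mechanism that limits the dominance of individual trees.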
A significant advantage of the XGBoost method, compared to linear regression, support vector machines (SVMs), or even artificial neural networks (ANNs), lies in its capacity to handle heterogeneously scaled features and missing values without extensive preprocessing [43,45]. Moreover, unlike deep neural networks, XGBoost provides relatively interpretable outputs, particularly when combined with SHAP (Shapley Additive Explanations), which has been effectively used in land-use modeling and environmental science [50]. Although SHAP was not implemented in this study, its application in future work could yield deeper insights into variable importance and model explainability.
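To make the idea behind SHAP concrete, the sketch below computes exact Shapley values for a small toy function by averaging marginal contributions over all feature orderings. It does not use the SHAP library, which relies on far more efficient algorithms for tree ensembles; the toy model, its three inputs, and all values are hypothetical.

```python
from itertools import permutations
from math import factorial


def shapley_values(model, x, baseline):
    """Exact Shapley values: average marginal contribution over all feature orderings.

    Features that have not yet been revealed are held at their baseline value.
    The cost grows factorially with the number of features, so this is only
    practical for very small examples.
    """
    n = len(x)
    phi = [0.0] * n
    for order in permutations(range(n)):
        current = list(baseline)
        previous = model(current)
        for j in order:
            current[j] = x[j]
            updated = model(current)
            phi[j] += updated - previous
            previous = updated
    n_orders = factorial(n)
    return [p / n_orders for p in phi]


def toy_model(features):
    """Hypothetical emission surrogate with an interaction between load and slip."""
    load, slip, depth = features
    return 2.0 * load + 3.0 * depth + load * slip


x = [2.0, 4.0, 1.0]
baseline = [0.0, 0.0, 0.0]
phi = shapley_values(toy_model, x, baseline)
print(phi)
```

The attributions sum to the difference between the prediction at `x` and the prediction at the baseline, and the interaction term is shared equally between the two features involved, which is the property that makes Shapley-based explanations attractive for models with interacting inputs.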
Our findings on the weak individual correlation between input variables and emissions reaffirm the utility of ensemble tree-based models that aggregate weak signals to form strong predictors. However, this complexity raises challenges in direct physical interpretability—a known limitation of tree-based ensemble methods in agricultural research. For practical deployment in field machinery, transparency and traceability remain important, especially for systems intended to support operator decision-making or regulatory compliance.
From a practical standpoint, the high accuracy of the CO2 emission model offers promising implications for real-time applications in precision agriculture. Integrating such models into embedded tractor systems or cloud-based farm management platforms could allow dynamic emission monitoring and feedback-based operational adjustments. Similar approaches have been demonstrated in intelligent irrigation and carbon-aware aeration management in wastewater systems [56].
It is also noteworthy that the model was developed using real operational data from two separate growing seasons, which contributes to its robustness. Nevertheless, the generalizability remains constrained by the geographical and mechanical homogeneity of the dataset. Expanding the dataset to include different machinery types, more soil classes, and variable topographies—combined with the integration of high-frequency telemetry—could further enhance model performance and applicability across diverse agricultural systems.
Lastly, the methodology used in this study contributes to the growing body of literature advocating for machine learning approaches in environmental impact assessment and low-carbon agriculture. As emission regulations for non-road mobile machinery become more stringent globally, the development of reliable, interpretable, and data-efficient predictive models will become increasingly important for compliance, optimization, and sustainability in agricultural mechanization.
5. Conclusions
This study investigated the viability of the XGBoost machine learning algorithm for predicting CO2 emissions and specific fuel consumption (SFC) during soil tillage operations with agricultural tractors. By integrating a wide range of field-collected variables, including soil texture, moisture content, wheel slippage, and engine performance parameters, XGBoost demonstrated strong predictive capability for CO2 emissions (R2 > 0.80), confirming its suitability for modeling complex, nonlinear systems in agricultural mechanization.
The relatively lower performance of the SFC model (R2 < 0.3) highlights the need for a broader set of dynamic input features to capture the multifactorial nature of fuel consumption under variable field conditions. Despite this limitation, the study confirms the relevance of machine learning models in the environmental performance monitoring of agricultural machinery, with particular value for emission estimation, operational optimization, and the development of precision farming technologies.
The choice of the DART booster within the XGBoost framework proved effective in preventing overfitting while maintaining predictive accuracy, indicating its potential for use in data-limited but complex agricultural scenarios. Compared to traditional statistical methods, tree-based ensemble learning models offer greater flexibility, and their interpretability can be improved by pairing them with post hoc explanatory tools such as SHAP.
From a broader perspective, this research contributes to the growing body of work on sustainable agriculture, offering a data-driven approach for supporting regulatory compliance, energy efficiency, and low-carbon transitions in farming. Future research should focus on the expansion of datasets across seasons, topographies, and machinery types, the incorporation of real-time sensor data, and the integration of predictive models into decision-support systems and embedded machinery platforms.
The findings underline the transformative potential of machine learning in advancing environmentally responsible agricultural practices and pave the way for intelligent, emission-aware field operations aligned with global sustainability goals.