A Convolutional Neural Network Model for Soil Temperature Prediction under Ordinary and Hot Weather Conditions: Comparison with a Multilayer Perceptron Model

Farhangmehr, Vahid; Cobo, Juan Hiedra; Mohammadian, Abdolmajid; Payeur, Pierre; Shirkhani, Hamidreza; Imanian, Hanifeh

doi:10.3390/su15107897


by Vahid Farhangmehr ¹,*, Juan Hiedra Cobo ², Abdolmajid Mohammadian ¹, Pierre Payeur ³, Hamidreza Shirkhani ² and Hanifeh Imanian ¹

¹ Department of Civil Engineering, University of Ottawa, Ottawa, ON K1N 6N5, Canada

² National Research Council Canada, Ottawa, ON K1A 0R6, Canada

³ School of Electrical Engineering and Computer Science, University of Ottawa, Ottawa, ON K1N 6N5, Canada

* Author to whom correspondence should be addressed.

Sustainability 2023, 15(10), 7897; https://doi.org/10.3390/su15107897

Submission received: 6 March 2023 / Revised: 28 April 2023 / Accepted: 9 May 2023 / Published: 11 May 2023

(This article belongs to the Special Issue Groundwater, Soil and Sustainability)


Abstract:

Soil temperature is a critical parameter in soil science, agriculture, meteorology, hydrology, and water resources engineering, and its accurate and cost-effective determination and prediction are very important. Machine learning models are widely employed for surface, near-surface, and subsurface soil temperature predictions. The present study employed a properly designed one-dimensional convolutional neural network model to predict the hourly soil temperature at a subsurface depth of 0–7 cm. The annual input dataset for this model included eight hourly climatic features. The performance of this model was assessed using a wide range of evaluation metrics and compared to that of a multilayer perceptron model. A detailed sensitivity analysis was conducted on each feature to determine its importance in predicting the soil temperature. This analysis showed that air temperature had the greatest impact and surface thermal radiation had the least impact on soil temperature prediction. It was concluded that the one-dimensional convolutional model performed better than the multilayer perceptron model in predicting the soil temperature under both normal and hot weather conditions. The findings of this study demonstrated the capability of the model to predict the daily maximum soil temperature.

Keywords:

machine learning; convolutional neural network; multilayer perceptron; soil temperature prediction; time-series regression

1. Introduction

Soil temperature is a critical parameter in soil science and agriculture and directly or indirectly affects the four growth stages of plants, i.e., seed germination, root development, flowering, and reproduction, as well as the physical, chemical, and biochemical processes in soil, including nitrification, transpiration, ventilation, and respiration of soil, and the microbiological activity of microorganisms available in soil [1,2]. It also affects soil properties, such as soil moisture, air, and nutrient content, as well as the capillary transport of water and nutritious solutes through plant foliage [3,4].

Soil temperature is also a critical parameter in hydrology, water resources engineering, meteorology, and geo-environmental engineering [4,5,6,7,8] that affects the rate of evaporation on the soil surface, the thermal energy balance between the atmosphere and the land surface [9], and the rate of decomposition of organic matter and its transformation to carbon dioxide in the atmosphere [10]. Thus, there is no doubt that accurate measurement or prediction of soil temperature on the ground surface and underground is important.

In the literature, two approaches to obtaining soil temperature have been broadly addressed: measurement, either directly by sensors that are suitable in terms of accuracy and reliability [11] or indirectly by satellites, and prediction by numerical models that are appropriate in terms of accuracy, speed, and cost-effectiveness [12]. In the second approach, owing to the climatic nature of soil temperature and its stochastic behavior, two types of models are often used for its prediction: statistical models [13,14] and machine learning models [15].

In the statistical models, the history of soil temperature in a time-series format is applied for prediction. Two stochastic models, autoregressive integrated moving average (ARIMA) and autoregressive moving average (ARMA), have been addressed in several studies [13,14,16]. The inherent need of statistical models for a large amount of time-series data to make reliable long-term predictions has drawn attention to machine learning models, such as artificial neural networks (ANN), gradient boosting (GB), support vector machine (SVM), k-nearest neighbors (KNN), random forests (RF), and linear regression (LR) [17,18].

Abyaneh et al. (2016) predicted the daily soil temperature at six different depths from 5 to 100 cm using an ANN and a co-active neuro-fuzzy inference system (CANFIS) with daily mean air temperature recorded for 14 years as the only input data [19]. Their study considered two different climatic regions, one with a humid climate and the other with a dry climate, to evaluate the performance of ANN and CANFIS. They found that, in humid regions, ANN performed better than CANFIS. They also concluded that the prediction accuracy of both models decreased with increasing depth. Citakoglu (2017) employed an adaptive neuro-fuzzy inference system (ANFIS), ANN, and multiple linear regression (MLR), which is a developed version of LR, to predict monthly soil temperatures at five depths ranging from 5 to 100 cm [20]. The input data in this study were monthly air temperature and precipitation recorded for 20 years. The coefficient of determination (R²), mean absolute error (MAE), and root mean squared error (RMSE) for these models demonstrated better performance for ANFIS than for ANN and MLR. Zhang et al. (2018) suggested a combined model of ensemble empirical mode decomposition (EEMD) and long short-term memory (LSTM) to forecast daily soil temperature with less need for historical data [21]. They evaluated the performance of their suggested model by calculating and examining the mean absolute percentage error (MAPE), mean squared error (MSE), MAE, RMSE, Pearson correlation coefficient (PCC), and Nash–Sutcliffe coefficient of efficiency (NSCE). They compared them with those calculated for a recurrent neural network (RNN), EEMD coupled with RNN, classic LSTM, and two empirical mode decomposition (EMD) models coupled with RNN and LSTM. The comparison showed better performance for the EEMD-LSTM model than for the other five models.
Delbari et al. (2019) used support vector regression (SVR), a modified version of SVM, to predict the daily soil temperature at three depths of 10, 30, and 100 cm under five different climatic conditions: hyper-arid, arid, semi-arid, Mediterranean, and hyper-humid [22]. The feeding input data in their study were daily air temperature, dew point temperature, relative humidity, solar radiation, and atmospheric pressure. A more appropriate performance was reported in their paper for SVR in deeper layers under humid climate conditions compared with MLR. Alizamir et al. (2020) assessed the performance of four machine learning models, namely the extreme learning machine (ELM), classification and regression trees (CART), ANN, and MLR, in accurately predicting monthly soil temperatures at four different depths from 5 to 100 cm [7]. The feeding data for these models were the monthly air temperature, relative humidity, solar radiation, and wind speed. They examined the RMSE, R², and NSCE for these models and concluded that their performance deteriorated as soil depth increased. They found that ELM performs better than the other models. Li et al. (2020) developed a bidirectional LSTM (BiLSTM) network to predict hourly soil temperature at five soil depths in five different US climates [23]. To feed their integrated deep BiLSTM network, they used hourly solar radiation, air temperature, wind speed, dew point temperature, relative humidity, and vapor pressure. To evaluate the performance of their model compared with LSTM, traditional BiLSTM, deep neural network (DNN), RF, SVR, and LR, MAE, MSE, and R² were calculated. An ANFIS model equipped with two well-known optimizers, namely salp swarm and grasshopper, and fed with daily maximum, mean, and minimum air temperatures was proposed by Penghui et al. (2020) to predict the daily soil temperature [24]. 
The performance of the model was examined by comparing the results with those extracted from seven models, including classical ANFIS and six ANFIS models equipped with a grasshopper optimizer, particle swarm optimizer, grey wolf optimizer, genetic optimizer, salp swarm optimizer, and dragonfly optimizer. Shamshirband et al. (2020) hybridized SVM and multilayer perceptron (MLP) and equipped it with a firefly optimizer to predict soil temperature at depths of 5, 10, and 20 cm [25]. The input data in their study were sunshine hours, relative humidity, air temperature, and wind speed. They found that the hybrid model predicted soil temperature better than the classic MLP and SVM models. Seifi et al. (2021) used ANFIS, SVM, and MLP models with salp swarm, particle swarm, firefly, and sunflower optimizers to forecast hourly soil temperatures at depths of 5, 10, and 30 cm in two arid and semi-humid climates [26]. They fed these models with four hourly climatic features: relative humidity, solar radiation, wind speed, and air temperature. After conducting a detailed analysis of the effect of each feature on soil temperature, they found that wind speed was a negligible feature. They concluded that ANFIS with the sunflower optimizer performed better than the other models. Imanian et al. (2022) assessed the performance of 13 machine learning models, including different classic regression models, ANFIS, SVM, KNN, RF, GB, and MLP, for forecasting hourly soil temperatures under ordinary and extreme weather conditions [27]. The input data of these models encompassed eight hourly climatic features. They calculated the RMSE, MAE, MSE, normalized root mean squared error (NRMSE), and R² for these models. They found that MLP performed better than the other models.

Convolutional neural networks (CNN), which are feedforward ANN, are often used for image recognition, classification, clustering, video processing, satellite remote sensing, computer vision, and natural language processing, and the raw input data feeding the network have a three-dimensional or two-dimensional spatial distribution [28]. In these applications, the first unit of the CNN, which encompasses fully connected convolutional, pooling, and padding layers, extracts suitable features from the input data to feed its second unit. The second unit of the CNN can be an LSTM, MLP, RNN, or any other feedforward model. The successful performance of CNN in these applications has recently attracted attention and interest for its use in predicting spatiotemporal soil temperatures. Hao et al. (2021) proposed a model that coupled CNN and EEMD to forecast the soil temperature at three depths from 5 to 30 cm [8]. The input data in their research were the maximum, minimum, mean, and variance of the soil temperature in the form of a time series. To assess the performance of their model, they calculated the MSE, MAE, RMSE, and R² and then compared them with those calculated for the LSTM and EEMD-LSTM models. They found that their model was more efficient than the other two models. Yu et al. (2021), knowing that traditional deep learning (DL) models cannot predict spatiotemporal soil temperatures, proposed a three-dimensional (3D) CNN combined with EEMD [4]. This model used EEMD to decompose spatiotemporal soil temperature into various intrinsic mode functions. They calculated MAE, RMSE, and R² and concluded that the coupling of EEMD and 3D CNN improved the predictions.

Based on the above literature review, CNN has been used less often than other machine learning models for predicting soil temperature, and in the recent attention it has received it has mainly been applied coupled with an advanced feedforward machine learning model to predict soil temperature that had a temporal and spatial distribution. In the coupled model, CNN has been used to extract suitable features from raw spatiotemporal input data. These input data included either a small number of time-series parameters directly related to soil temperature, such as the maximum, minimum, and mean values, or a very limited number of climatic parameters in a time-series format that affected soil temperature. The literature review also indicates that no detailed performance evaluation has been performed on such hybrid CNN models. Furthermore, it was found from the literature review that although the prediction using machine learning models of some meteorological parameters has been investigated by some researchers under extreme climatic conditions, such as moisture convection during heavy precipitation [29], solar radiation at very high land surface temperatures [30], and precipitation volume under extreme rainfall [31], soil temperature prediction in the warmest and coldest ranges of temperature has not been well addressed in the studies reviewed above.

The aim of this study is to examine in detail the capability of a simple and well-architected one-dimensional CNN model without any coupled advanced feedforward machine learning model or additional modern optimizer to predict hourly soil temperatures under normal, hot, and cold weather conditions, along with the daily maximum soil temperature. The hourly values of eight climatic features influencing soil temperature, including air temperature, instantaneous wind gusts, dew point temperature, surface net thermal radiation, surface net solar radiation, surface pressure, total precipitation, and evaporation, were employed as the input data. A detailed sensitivity analysis was performed on each feature to determine its importance in predicting soil temperature. A fairly comprehensive performance evaluation of this CNN model compared with the MLP model was conducted by calculating a wide range of evaluation metrics.

The subsequent sections of this paper are as follows. Section 2 introduces the study area and the hourly climatic data of a target station located in it, and then introduces the CNN model applied in this research. This section also describes the evaluation metrics used to assess the performance of the CNN model in predicting soil temperature. Section 3 first discusses the determination of an appropriate architecture for the CNN model, and then provides a sensitivity analysis of each feature in the input climatic data. It also presents the performance evaluation of the CNN model in predicting hourly soil temperatures under very hot and cold weather conditions, along with the daily maximum soil temperature. Finally, this section compares the performance of the CNN and MLP models by examining the evaluation metrics calculated for both models. The last section of this paper is devoted to conclusions.

2. Methodology

2.1. Study Area and Dataset

The annual input data for the CNN model in the present study included eight hourly climatic features: air temperature at 2 m above the surface (K), total precipitation (m), instantaneous wind gusts at 10 m above the surface (m/s), evaporation (m of water), surface net solar radiation (J/m²), surface net thermal radiation (J/m²), surface pressure (Pa), and dew point temperature at 2 m above the surface (K). Together with the hourly soil temperature (K), collectively called the dataset, these features were freely downloaded for 2021, and for a target station located in the Ottawa area of Canada with geographical coordinates of (45.25° N latitude,75.50° W longitude) from the ERA5 website. ERA5 is the fifth-generation atmospheric reanalysis of the global climate that covers the historical period from 1950 to the present and is conducted by the Copernicus Climate Change Service (C3S) at the ECMWF (https://cds.climate.copernicus.eu/, accessed on 21 November 2022) [32]. The downloaded files were in GRIB format and transformed into CSV format using Panoply, a Java-based package. Finally, all the data were gathered in an Excel file. Figure 1 shows the geographical location of the target station in the study area [33]. It is worth mentioning that Ottawa, the capital of Canada, is located at the confluence of the Ottawa River and the Rideau River in the southern portion of the province of Ontario, and has a semi-continental climate with four distinct seasons: warm and humid summers, extremely cold winters, and temperature-varying springs and falls. The total number of hourly input data was 8784 (366 days × 24 h); thus, the total number of inputs was 70,272 (8784 × 8 features). The CNN model in this study used this information to predict the hourly soil temperature under normal and very warm and cold weather conditions, along with the daily maximum soil temperature, at a subsurface depth of 0–7 cm. 
This prediction was then assessed using the soil temperature available in the dataset and by calculating evaluation metrics. Figure 2 shows hourly time-series data for air temperature, dew point temperature, and wind gusts, selected as three samples of the eight features mentioned above, during 2021. It also shows hourly time-series data for soil temperature during the same year.

2.2. One-Dimensional CNN

In the artificial intelligence (AI) literature, deep learning (DL) is known as a subset of machine learning and includes various ANNs with at least two hidden layers. Previous studies have noted that further deepening an ANN by adding hidden layers does not necessarily lead to acceptable results; more important are a proper architecture design for the layers, an appropriate optimization algorithm to train the network, and a suitable activation function to control the non-linearity of individual neurons in the layers [23,34]. A proper architecture design can avoid data overfitting and vanishing gradients, both of which occasionally occur in the backpropagation algorithm in the network training process. It can also avoid additional computational loads [35]. CNN is a well-known DL model. A one-dimensional (1D) CNN, which can extract suitable features from a raw input time-series dataset and perform regression prediction using the extracted features, consists of two units. The first unit includes input, convolutional, pooling, and output layers. The input layer receives the raw input data that have already been transformed into a suitable tensor form and passes these to the first convolutional layer without performing any operations on them. An appropriate number of equally sized kernels are slid along the input tensor with a convenient equal stride to extract the first feature map from the first convolutional layer. During this sliding, each kernel, which is a vector of weights w_{i j}, is convolutionally multiplied with a limited part of the input tensor x_{i}, after which the products are summed, and finally, a bias b_{j} is added to the sum [36]:

y_{j} = \sum_{i = 1}^{n} w_{i j} * x_{i} + b_{j}

(1)

where ∗ denotes a convolutional operator. To perform this multiplication, both sides must have the same size. Although there are various nonlinear activation functions to activate the product y_{j}, for example, the sigmoid function and hyperbolic tangent function, the rectified linear unit (ReLU) function is commonly employed. This simple and easy-to-use activation function, without the implementation restrictions of other activation functions, removes the negative values of y_{j} by setting them to zero and speeds up the training of the network [36]:

f (y_{j}) = \max (0, y_{j})

(2)

The first feature map f (y_{j}), after passing through a pooling layer, is delivered as the input tensor to the next convolutional layer, which has its own kernels. This process is repeated until the final convolutional layer.

A pooling layer, which commonly comes after each convolution layer, uses a feature map as its input and reduces its size. This size reduction lowers the number of learnable weights in the layers that follow and eliminates the need to deepen the CNN further by increasing the number of convolutional layers, thereby helping to avoid overfitting. Two types of pooling layers are typically applied: the max pooling layer and the average pooling layer. The max pooling layer uses the maximum value of each local cluster of features in the feature map, whereas the average pooling layer uses the average value. The former is commonly employed in modern CNN models because of its improved performance [36].
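The convolution of Equation (1), the ReLU activation of Equation (2), and max pooling can be illustrated with a minimal pure-Python sketch. It is not the authors' code: it uses a single input channel instead of the eight climatic features, no padding, and a toy series and kernel whose values are arbitrary assumptions. As in most deep learning libraries, the kernel is applied without being flipped.

```python
from typing import List


def conv1d(x: List[float], kernel: List[float], bias: float = 0.0,
           stride: int = 1) -> List[float]:
    """Slide one kernel along x without padding, as in Equation (1)."""
    k = len(kernel)
    return [
        sum(w * x[start + i] for i, w in enumerate(kernel)) + bias
        for start in range(0, len(x) - k + 1, stride)
    ]


def relu(y: List[float]) -> List[float]:
    """Equation (2): negative values are set to zero."""
    return [max(0.0, v) for v in y]


def max_pool1d(y: List[float], pool_size: int = 2) -> List[float]:
    """Keep the maximum of each non-overlapping window of pool_size values."""
    return [
        max(y[i:i + pool_size])
        for i in range(0, len(y) - pool_size + 1, pool_size)
    ]


# Toy single-feature series and a hypothetical kernel of size 4 (assumed values).
series = [1.0, 2.0, 3.0, 2.0, 1.0, 0.0, -1.0, -2.0]
kernel = [1.0, 0.0, -1.0, 0.5]

feature_map = relu(conv1d(series, kernel, bias=0.1))
pooled = max_pool1d(feature_map, pool_size=2)
print(feature_map)
print(pooled)
```

With a series of length 8 and a kernel of size 4, the feature map has 5 entries, and a pooling size of 2 reduces it to 2 entries; the last, unpaired entry is dropped in this sketch.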

The last feature map extracted from the last convolutional layer after being flattened or transformed into a vector by the output layer passes through the second unit of the 1D CNN, which is a fully connected layer based on a feedforward machine learning model, to perform regression prediction. The goal in this study is to predict soil temperature. As mentioned in the literature review, some researchers have proposed relatively advanced models for this unit to predict time-regressive soil temperature. The question that comes to mind here is whether the proper design for the first unit of the architecture of the 1D CNN, i.e., the proper determination of its hyperparameters, has the potential to predict the soil temperature with a simple predictive model and a straightforward optimizer in its second unit. This was the question that we sought to answer in this study. For this purpose, two hyperparameters were carefully examined, namely, the number of convolutional layers and the number of kernels, both of which play a key role in avoiding overfitting and providing accurate results. A max pooling layer is inserted between two consecutive convolutional layers. Three points are usually considered when selecting the appropriate kernel and pooling sizes. First, these sizes are chosen based on the size of the input dataset and are not treated as hyperparameters. Second, the convolutional layers close to the input layer tend to have more kernels with a larger size, because the size of the feature maps gradually decreases as they pass through the pooling layers. Third, a large pooling size may cause an unacceptable information loss in the input feature map. 
Based on these points, a kernel size of 4 for the first convolutional layer and a pooling size of 2 for pooling layers were chosen. The activation function of the convolutional layers and sliding stride of the kernels were considered to be ReLU and 1, respectively. The second unit of the 1D CNN model presented in this study was a simple feedforward fully connected 1-layer model with 32 neurons and a ReLU activation function. The loss function and straightforward optimizer applied to train the 1D CNN model by modifying the learnable parameters through a back-propagation algorithm based on the gradient descent technique were MSE and RMSprop with a learning rate of 0.001, respectively [37]. This modification was performed using an iterative process with a maximum iteration number of 5000. To better understand the connection between the two units of the 1D CNN, Figure 3 shows a schematic representation of the last part of the first unit of the 1D CNN, with one convolutional layer and one pooling layer connected to its second unit.
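To see how kernel and pooling sizes shrink the feature maps, the following sketch traces the feature-map length through alternating convolutional and pooling layers using the standard output-length formula (W - K + 2P)/S + 1. The kernel sizes 4, 3, and 2 and the pooling size of 2 are those reported for the three-layer model in Section 3.1. The input window of 24 time steps and the absence of padding are assumptions made only for illustration, since the text does not state them.

```python
from typing import List


def conv_out_len(width: int, kernel: int, stride: int = 1, padding: int = 0) -> int:
    """Output length of a 1D convolution: (W - K + 2P) // S + 1."""
    return (width - kernel + 2 * padding) // stride + 1


def trace_lengths(window: int, kernel_sizes: List[int], pool_size: int = 2) -> List[int]:
    """Feature-map length after each conv layer and each pooling layer in between."""
    lengths = []
    width = window
    for idx, k in enumerate(kernel_sizes):
        width = conv_out_len(width, k)
        lengths.append(width)
        # A pooling layer sits between consecutive conv layers only.
        if idx < len(kernel_sizes) - 1:
            width = width // pool_size
            lengths.append(width)
    return lengths


# Hypothetical input window of 24 hourly steps (assumption, not from the paper).
lengths = trace_lengths(window=24, kernel_sizes=[4, 3, 2], pool_size=2)
print(lengths)
```

Under these assumptions the lengths are 21, 10, 8, 4, and 3, which shows why large pooling sizes or many layers quickly exhaust a short input window.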

For the CNN model, a computer code was written in Python 3.10. Python is an open-source advanced programming language widely used for data analysis and machine learning. PyCharm 2022.1.1, one of Python’s integrated development environments (IDEs), was employed to execute the code on a computer with a 6th Gen Intel Core i7@2.60 GHz processor and 8.00 GB installed RAM.

2.3. Evaluation Metrics

A comprehensive assessment of the performance of the proposed machine learning model for regression prediction is essential. A vast range of evaluation metrics are used to conduct such assessments. The capability of the proposed model can be determined by calculating these metrics from the predicted and observed values in the dataset. In this study, 13 evaluation metrics were used: maximum residual error (MaxE), mean absolute error (MAE), mean squared error (MSE), mean absolute percentage error (MAPE), root mean squared error (RMSE), normalized root mean squared error (NRMSE), coefficient of determination (R²), relative root mean squared error (RRMSE), bias, performance index (PI), variance-accounted-for (VAF), Nash–Sutcliffe coefficient of efficiency (NSCE), and Akaike information criterion (AIC). It is worth noting that R², NRMSE, RRMSE, MAPE, NSCE, VAF, and PI are dimensionless. By indicating the observed value and the mean observed value for soil temperature in the dataset applied in this study as y_{\mathrm{obs}} and \bar{y}_{\mathrm{obs}}, the value and the mean value predicted by the model proposed in this study for soil temperature as y_{\mathrm{pred}} and \bar{y}_{\mathrm{pred}}, the number of data in the dataset as n, the number of input climatic features as k, and the fit line coefficients derived from the linear trend line as a and b, the evaluation metrics are defined as follows:

\mathrm{MaxE} = \max (y_{\mathrm{obs}} - y_{\mathrm{pred}})

(3)

\mathrm{MAE} = \frac{\sum |y_{\mathrm{obs}} - y_{\mathrm{pred}}|}{n}

(4)

\mathrm{MSE} = \frac{\sum (y_{\mathrm{obs}} - y_{\mathrm{pred}})^{2}}{n}

(5)

\mathrm{RMSE} = \sqrt{\frac{\sum (y_{\mathrm{obs}} - y_{\mathrm{pred}})^{2}}{n}}

(6)

\mathrm{NRMSE} = \frac{\mathrm{RMSE}}{\max (y_{\mathrm{obs}}) - \min (y_{\mathrm{obs}})}

(7)

\mathrm{RRMSE} = \frac{\mathrm{RMSE}}{\bar{y}_{\mathrm{obs}}}

(8)

\mathrm{MAPE} = \frac{1}{n} \sum \left| \frac{y_{\mathrm{obs}} - y_{\mathrm{pred}}}{y_{\mathrm{obs}}} \right|

(9)

\mathrm{bias} = \frac{\sum (y_{\mathrm{pred}} - y_{\mathrm{obs}})}{n}

(10)

R^{2} = \left[ \frac{\sum (y_{\mathrm{obs}} - \bar{y}_{\mathrm{obs}}) (y_{\mathrm{pred}} - \bar{y}_{\mathrm{pred}})}{\sqrt{\sum (y_{\mathrm{obs}} - \bar{y}_{\mathrm{obs}})^{2} \sum (y_{\mathrm{pred}} - \bar{y}_{\mathrm{pred}})^{2}}} \right]^{2}

(11)

\mathrm{NSCE} = 1 - \frac{\sum (y_{\mathrm{obs}} - y_{\mathrm{pred}})^{2}}{\sum (y_{\mathrm{obs}} - \bar{y}_{\mathrm{obs}})^{2}}

(12)

\mathrm{VAF} = 1 - \frac{\mathrm{var} (y_{\mathrm{obs}} - y_{\mathrm{pred}})}{\mathrm{var} (y_{\mathrm{obs}})}

(13)

\mathrm{PI} = \frac{\mathrm{RRMSE}}{\sqrt{R^{2}} + 1}

(14)

\mathrm{AIC} = n \times \ln (\mathrm{RSS}) + 2 k, \quad \mathrm{RSS} = \sum (y_{\mathrm{pred}} - (a y_{\mathrm{obs}} + b))^{2}

(15)

Note that lower values for the error metrics, AIC, and PI, a bias closer to zero, and higher values for R², NSCE, and VAF are more desirable.

MAE provides the average absolute magnitude of the errors regardless of their direction by equally weighting all individual differences in the average. MSE provides the average magnitude of errors by assigning a relatively high weight to large errors. Since MSE has a square of the original data’s unit, RMSE with the same unit is a more appropriate metric to compare results with. MAE and RMSE can be used to detect errors’ variation. A larger difference between MAE and RMSE implies a larger variance in individual errors. NRMSE and RRMSE are dimensionless error metrics derived from RMSE that provide a better evaluation of the objective performance of different machine learning models. MAPE is a dimensionless error metric that is insensitive to outliers and provides a general indication of machine learning model performance. Unlike the above-mentioned error metrics, bias indicates the overall direction of the errors without determining the accuracy of the proposed machine learning model. Bias can be a positive number for over-forecasting or a negative number for under-forecasting, or even zero. NSCE is a normalized metric that compares the magnitude of residual variance to the magnitude of actual data variance. VAF is an indicator that indicates the relative variance of errors in a machine learning model. The desirable value of VAF is 1, which means zero estimation error variance. AIC and PI are two prediction error estimators used to assess the capability of proposed machine learning models to perform a targeted task. Lower values of these estimators are more desirable. R² is a statistical metric used to assess the performance of a regression machine learning model. This metric can be expressed as a function of MSE, so that as MSE increases, R² decreases, and its desired value is 1 [38].
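Most of the metrics in Equations (3) to (14) can be computed directly from two equally long sequences. The sketch below is a minimal pure-Python illustration, not the authors' implementation. Two choices are assumptions: MaxE is taken over absolute residuals, which Equation (3) does not show explicitly, and AIC is omitted because it requires the fitted trend-line coefficients a and b. The four sample temperatures are arbitrary values chosen for illustration, not data from the study.

```python
import math
from statistics import fmean, pvariance
from typing import Dict, Sequence


def evaluate(y_obs: Sequence[float], y_pred: Sequence[float]) -> Dict[str, float]:
    """Compute regression metrics following Equations (3)-(14)."""
    n = len(y_obs)
    residuals = [o - p for o, p in zip(y_obs, y_pred)]
    mean_obs, mean_pred = fmean(y_obs), fmean(y_pred)
    ss_obs = sum((o - mean_obs) ** 2 for o in y_obs)
    ss_pred = sum((p - mean_pred) ** 2 for p in y_pred)
    cov = sum((o - mean_obs) * (p - mean_pred) for o, p in zip(y_obs, y_pred))

    mse = sum(r * r for r in residuals) / n
    rmse = math.sqrt(mse)
    rrmse = rmse / mean_obs
    r2 = cov ** 2 / (ss_obs * ss_pred)  # squared Pearson correlation, Eq. (11)

    return {
        "MaxE": max(abs(r) for r in residuals),  # assumption: absolute residual
        "MAE": sum(abs(r) for r in residuals) / n,
        "MSE": mse,
        "RMSE": rmse,
        "NRMSE": rmse / (max(y_obs) - min(y_obs)),
        "RRMSE": rrmse,
        "MAPE": sum(abs(r / o) for r, o in zip(residuals, y_obs)) / n,
        "bias": sum(p - o for o, p in zip(y_obs, y_pred)) / n,
        "R2": r2,
        "NSCE": 1.0 - sum(r * r for r in residuals) / ss_obs,
        "VAF": 1.0 - pvariance(residuals) / pvariance(y_obs),
        "PI": rrmse / (math.sqrt(r2) + 1.0),
    }


# Arbitrary illustrative temperatures in kelvin (not data from the study).
y_obs = [280.0, 285.0, 290.0, 295.0]
y_pred = [281.0, 284.0, 291.0, 294.0]
metrics = evaluate(y_obs, y_pred)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

In this toy case the errors alternate in sign with equal magnitude, so MAE and RMSE coincide and the bias is zero even though no prediction is exact, which illustrates why bias alone does not measure accuracy.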

3. Results and Discussion

3.1. 1D CNN Architecture

Before an appropriate architecture can be designed for a supervised machine learning model that performs regression prediction cost-effectively and efficiently in scientific or engineering applications while providing accurate and reliable results, adequate amounts of time-series input and output data must be allocated to train and validate the model and to test its performance. This allocation must be carried out such that the time-series format of the input and output data is preserved. In the present study, 70% of the input data, including the eight hourly climatic features introduced in Section 2.1, and the output data encompassing the hourly soil temperature were used to train and validate the CNN model, while the remaining 30% were set aside to test its performance in predicting soil temperature. To reduce the sensitivity of the model to the scale of the climatic features in the input data and thereby allow the model to embark on its iterative training with better learnable parameters, leading to more accurate results and faster convergence, the input training and testing data were normalized by the mean and standard deviation of each independent feature. It is essential to meticulously monitor the model during its training and testing phases by calculating the evaluation metrics introduced in Section 2.3 to ensure that the results are acceptable and that overfitting does not occur. Figure 4 illustrates the flowchart of the methodology used in this study.
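The data allocation and normalization described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' code: the split is a simple chronological cut at 70%, and the mean and standard deviation are estimated on the training portion only and then reused for the test portion, a common practice that the text does not confirm. The helper names and the toy rows are hypothetical.

```python
from statistics import fmean, pstdev
from typing import List, Tuple

Rows = List[List[float]]


def chronological_split(rows: Rows, train_fraction: float = 0.7) -> Tuple[Rows, Rows]:
    """Cut the series once so that the time order is preserved in both parts."""
    cut = int(round(len(rows) * train_fraction))
    return rows[:cut], rows[cut:]


def fit_scaler(rows: Rows) -> Tuple[List[float], List[float]]:
    """Per-feature mean and standard deviation; constant columns get std 1."""
    columns = list(zip(*rows))
    means = [fmean(col) for col in columns]
    stds = [pstdev(col) or 1.0 for col in columns]
    return means, stds


def transform(rows: Rows, means: List[float], stds: List[float]) -> Rows:
    """Standardize each feature: (value - mean) / std."""
    return [[(v - m) / s for v, m, s in zip(row, means, stds)] for row in rows]


# Toy data: 10 time steps with two made-up features (assumed values).
rows = [[float(t), 2.0 * t + 1.0] for t in range(10)]

train, test = chronological_split(rows, train_fraction=0.7)
means, stds = fit_scaler(train)  # assumption: statistics from training data only
train_scaled = transform(train, means, stds)
test_scaled = transform(test, means, stds)
print(len(train), len(test))
print(test_scaled[0])
```

Shuffling before the split would break the time-series format that the text requires to be preserved, which is why the cut is made at a single index.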

As mentioned in Section 2.2, the number of convolutional layers and the number of their kernels are two hyperparameters in designing an appropriate architecture for the 1D CNN model presented in this study. To investigate these, two 1D CNN models were used: one with two convolutional layers and one max pooling layer, and one with three convolutional layers and two max pooling layers, with each pooling layer inserted between two successive convolutional layers. For the three-layer model, four different kernel-number configurations were chosen, with kernel sizes of 4, 3, and 2 for the first, second, and third convolutional layers, respectively, and a pooling size of 2 for both the first and second pooling layers. For the two-layer model, five different kernel-number configurations were selected, with kernel sizes of 4 and 3 for the first and second convolutional layers, respectively, and a pooling size of 2 for the pooling layer. These kernel and pooling sizes were acceptable in terms of the size of the dataset used for training and testing the models. The models were executed for all cases and their behavior was monitored by calculating the evaluation metrics. The configuration that yielded the best performance for the 1D CNN model, with the most desirable evaluation metrics, was selected.

Figure 5 and Figure 6 illustrate the calculated values of the evaluation metrics during the training and testing phases of the three-layer 1D CNN model. The four cases in these figures are as follows: case 1 with 32, 16, and 8 kernels; case 2 with 64, 32, and 16 kernels; case 3 with 128, 64, and 32 kernels; and case 4 with 256, 128, and 64 kernels. As can be clearly seen in these figures, case 2 with 64, 32, and 16 kernels in the first, second, and third convolutional layers, respectively, resulted in the best performance with the most acceptable evaluation metrics for the three-layer 1D CNN model in soil temperature prediction in both the training and testing phases. Figure 7 and Figure 8 show the calculated evaluation metrics during the training and testing phases of the two-layer 1D CNN model. The five cases in these figures are as follows: case 1 with 16 and 8 kernels; case 2 with 32 and 16 kernels; case 3 with 64 and 32 kernels; case 4 with 128 and 64 kernels; and case 5 with 256 and 128 kernels. From these figures, it can be concluded that case 5, with 256 and 128 kernels in its first and second convolutional layers, respectively, led to the best performance with the most desirable evaluation metrics for the two-layer 1D CNN model in predicting the soil temperature in both the training and testing phases.

In Table 1, the performances of the two-layer and three-layer 1D CNN models with the best kernel number configurations in predicting the soil temperature in the training and testing phases are compared using the evaluation metrics. It can be seen from this table that the three-layer 1D CNN model with the kernel number configuration of 64, 32, and 16 performed better, with more favorable evaluation metrics, than the two-layer 1D CNN model with the kernel number configuration of 256 and 128. This three-layer 1D CNN model was therefore selected as the well-designed 1D CNN model to be used in the following subsections. It is noteworthy that this selected 1D CNN model is not claimed to be the optimal architecture and configuration for all applications of soil temperature prediction; rather, it offered the best performance among the configurations tested under the experimental conditions considered in this study.
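To give a rough sense of how the two selected configurations differ in size, the following minimal sketch counts the learnable parameters of the convolutional layers only. It uses the standard count for a 1D convolutional layer (kernel size times input channels, plus one bias, per kernel). The function names and the choice of one input channel are illustrative assumptions, not details taken from this study, and the dense layers that follow the convolutional stack are not counted.

```python
def conv1d_params(kernel_size, in_channels, out_channels):
    """Weights plus one bias per kernel for a single 1D convolutional layer."""
    return (kernel_size * in_channels + 1) * out_channels


def total_conv_params(kernel_sizes, kernel_counts, in_channels=1):
    """Sum the learnable parameters over a stack of 1D convolutional layers.

    in_channels=1 is an assumption made for illustration only.
    Max pooling layers add no learnable parameters, so they are ignored.
    """
    total = 0
    channels = in_channels
    for size, count in zip(kernel_sizes, kernel_counts):
        total += conv1d_params(size, channels, count)
        channels = count  # each kernel produces one output channel
    return total


# Best three-layer case: kernel sizes 4, 3, 2 with 64, 32, 16 kernels.
three_layer = total_conv_params([4, 3, 2], [64, 32, 16])
# Best two-layer case: kernel sizes 4, 3 with 256, 128 kernels.
two_layer = total_conv_params([4, 3], [256, 128])
print(three_layer, two_layer)
```

Under these assumptions, the three-layer stack has far fewer convolutional parameters than the two-layer stack, which fits the observation that the larger two-layer configuration did not generalize better in the testing phase.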

3.2. Climatic Features Significance in Soil Temperature Prediction

A main gap in many studies that predict soil temperature using machine learning models is that they choose a limited number of climatic features as input data without performing a sensitivity analysis to determine the significance of each feature in predicting soil temperature. A review of these papers shows that only air temperature was present in all input datasets, whereas other influential climatic features, including wind, surface solar and thermal radiation, evaporation, surface pressure, precipitation, and dew point temperature, were not present collectively in them [12,15,23,39,40,41]. This review also shows that the features present in the input datasets were mostly monthly or daily, and rarely hourly. In the previous subsection, we used the eight hourly time-series features in an input dataset to design an appropriate architecture for the 1D CNN model. In the current subsection, to fill the gap mentioned above, we conduct a detailed sensitivity analysis of each feature to investigate its role in the performance of the well-designed 1D CNN model from Section 3.1. In this analysis, the model was run after removing one of the features, and the error metrics MAE, MSE, RMSE, NRMSE, RRMSE, and MAPE were then calculated. This process was repeated for each feature. The calculated error metrics were then compared with those obtained with all features included to determine the significance of each feature in predicting the hourly soil temperature at a near-surface depth of 0–7 cm. The results of the testing phase for the model are presented in Table 2. This table shows that removing any of the features from the input dataset increased the error metrics, and that removing the air temperature led to the greatest increase. This means that air temperature was the most influential feature for predicting soil temperature. This finding is consistent with other studies that mainly concentrated on air temperature as an input for soil temperature prediction [12,15,23,39,40,41]. It was expected because the air layer whose temperature was recorded in the input dataset was within 2 m of the ground surface and in direct contact with the soil surface, and therefore had the greatest influence on near-surface soil temperature. Table 2 also shows that the remaining features, in descending order of influence on soil temperature prediction, were precipitation, dew point temperature, surface solar radiation, surface pressure, wind gust, evaporation, and surface thermal radiation. Table 2 thus reveals that several of these features have a significant influence that should not be neglected when aiming for higher precision in soil temperature prediction.
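The percentages in parentheses in Table 2 are relative increases with respect to the run with all features included. As a minimal sketch, the ranking described above can be reproduced from the testing-phase MAE values in Table 2; the variable and function names below are illustrative only.

```python
# Testing-phase MAE values copied from Table 2.
BASELINE_MAE = 1.93  # all eight features included
MAE_WITHOUT = {
    "precipitation": 2.52,
    "surface pressure": 2.23,
    "evaporation": 2.05,
    "wind gust": 2.12,
    "dew point temperature": 2.38,
    "surface solar radiation": 2.25,
    "surface thermal radiation": 1.94,
    "air temperature": 2.72,
}


def relative_increase(value, baseline):
    """Percentage increase of an error metric relative to the baseline run."""
    return 100.0 * (value - baseline) / baseline


increase = {
    feature: relative_increase(mae, BASELINE_MAE)
    for feature, mae in MAE_WITHOUT.items()
}

# Features sorted from most to least influential (largest MAE increase first).
ranking = sorted(increase, key=increase.get, reverse=True)
for feature in ranking:
    print(f"{feature:28s} {increase[feature]:6.2f}%")
```

Ranking by MAE gives the same order as stated in the text, with air temperature first and surface thermal radiation last.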

3.3. Performance Evaluation of the 1D CNN Model in Ordinary Weather Conditions

In Table 3, the performances of the well-designed three-layer 1D CNN model from Section 3.1 and a comparable three-layer MLP model with 64, 32, and 16 neurons in the first, second, and third hidden layers, respectively, are compared in predicting the soil temperature under ordinary weather conditions. A computer code was written in this study for the MLP model. An MLP, which is a class of feedforward ANNs, consists of one input layer, one or more hidden (middle) layers, and one output layer. All hidden layers use the same activation function, which is commonly the rectified linear unit (ReLU). Each neuron in each layer has a weighted connection to all the neurons in the next layer. The learnable weights are modified through an iterative process using a back-propagation algorithm [12,15,40,41]. Like the 1D CNN model presented in this work, the MLP model used for comparison employed the MSE loss function and RMSProp with a learning rate of 0.001 in the back-propagation algorithm, the ReLU activation function in its hidden layers, and a maximum iteration number of 5000. Both models were fed the same input and output data. From Table 3, it is clear that the 1D CNN model performed better than the MLP model in soil temperature prediction under ordinary weather conditions in its training and testing phases because it had lower error metrics, including MaxE, MAE, MSE, RMSE, NRMSE, RRMSE, and MAPE, along with lower bias, AIC, and PI, and higher VAF, R², and NSCE. This improved performance is due to the 1D CNN model's inherent advantages [4,8,28,36], as follows: (1) each convolutional layer shares its kernels equally with all feature clusters in its input feature map for the convolution operation, which reduces the number of learnable parameters and thus lowers the memory requirements for running the CNN; (2) in contrast with MLP models, which are highly prone to overfitting because each neuron in one layer is fully connected to all neurons in the next layer, especially in deep MLP models with more layers, a CNN has a safety margin against overfitting because of the local and hierarchical connections of the kernels of each convolutional layer to the features in its input feature map; (3) a CNN is less complex because of this connectivity; and (4) unlike MLP models, which require a complex architecture with massive numbers of neurons in their hidden layers for larger input data and are hence prone to overfitting, a CNN permits the network to be deeper. A notable disadvantage of CNNs is that they typically require large amounts of training data to achieve their best performance.
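The kernel sharing described in advantage (1) can be illustrated with a minimal pure-Python sketch of one 1D convolution followed by ReLU and max pooling. This is a toy forward pass with made-up input values and a hand-picked averaging kernel, not the model code used in this study; all names are hypothetical.

```python
def conv1d_valid(signal, kernel, bias=0.0):
    """Slide one kernel over the signal without padding.

    The same kernel weights are reused at every position, which is the
    weight sharing that keeps the parameter count small. As in most deep
    learning libraries, the kernel is not flipped (cross-correlation).
    """
    k = len(kernel)
    return [
        sum(w * x for w, x in zip(kernel, signal[i:i + k])) + bias
        for i in range(len(signal) - k + 1)
    ]


def relu(signal):
    """Rectified linear unit applied element-wise."""
    return [max(0.0, value) for value in signal]


def max_pool1d(signal, pool_size=2):
    """Non-overlapping max pooling; a trailing incomplete window is dropped."""
    return [
        max(signal[i:i + pool_size])
        for i in range(0, len(signal) - pool_size + 1, pool_size)
    ]


x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]  # toy input sequence
kernel = [0.25, 0.25, 0.25, 0.25]              # kernel size 4, as in the first layer

feature_map = relu(conv1d_valid(x, kernel))
pooled = max_pool1d(feature_map, pool_size=2)
print(feature_map)
print(pooled)
```

The kernel contributes only four weights and one bias regardless of the input length, whereas a fully connected layer mapping the same eight inputs to five outputs would need 8 × 5 weights plus five biases.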

Scatter plots of soil temperature under ordinary weather conditions predicted by the 1D CNN and MLP models based on ERA5 data are depicted in Figure 9. This figure shows a good fit between the observed values and those predicted by the models during their training and testing phases, with the predicted soil temperatures lying close to the identity line. Considering the size of the dataset fed to both models in this study, this means that both models were reliable for soil temperature prediction. The correlation between actual and predicted data in the training phase was 81.88% for the 1D CNN model and 89.85% for the MLP model. The correlation in the testing phase was 89.98% for the 1D CNN model and 89.24% for the MLP model. This figure shows a slightly better distribution around the fit line for the 1D CNN during both the training and testing phases. Figure 10 shows the distribution of the predicted soil temperatures between the prediction bands under ordinary weather conditions. Prediction bands and the confidence region are widely used in the statistical analysis of regression prediction [27]. This figure supports the reliability of the 1D CNN and MLP models, given that the predicted soil temperatures remain between the prediction bands in both the training and testing phases. A slightly better distribution in the training and testing phases is observed for the 1D CNN model in this figure.

The main focus of the present work is to compare the performance of two ANN models, namely CNN and MLP, in predicting soil temperature. Nevertheless, to demonstrate the capability of the well-designed three-layer 1D CNN model from Section 3.1, five error metrics (MaxE, MSE, RMSE, RRMSE, and NRMSE) and three evaluation metrics (R², VAF, and PI) were compared with those of two other machine learning models, namely RF and SVR, in Table 4. The error metrics and PI of the 1D CNN in the testing phase were lower than or equal to those of the SVR and RF models, and its VAF and R² were higher, which indicated its better performance in predicting soil temperature.
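For readers who want to reproduce this kind of comparison, the following minimal sketch computes several of the metrics named above from paired observed and predicted values. It uses common textbook definitions (NSCE taken as the Nash-Sutcliffe efficiency), which are assumed here and may differ in detail from the definitions in Section 2.3; the sample values are invented for illustration and are not data from this study.

```python
import math


def error_metrics(observed, predicted):
    """Compute MaxE, MAE, MSE, RMSE, and NSCE from paired values.

    Textbook definitions are assumed; NSCE is taken as
    1 - sum((obs - pred)^2) / sum((obs - mean(obs))^2).
    """
    errors = [p - o for o, p in zip(observed, predicted)]
    n = len(errors)
    sse = sum(e * e for e in errors)
    mean_obs = sum(observed) / n
    sst = sum((o - mean_obs) ** 2 for o in observed)
    mse = sse / n
    return {
        "MaxE": max(abs(e) for e in errors),
        "MAE": sum(abs(e) for e in errors) / n,
        "MSE": mse,
        "RMSE": math.sqrt(mse),
        "NSCE": 1.0 - sse / sst,
    }


# Invented toy values in kelvin, for illustration only.
observed = [280.0, 282.0, 284.0, 286.0]
predicted = [281.0, 282.0, 283.0, 288.0]
metrics = error_metrics(observed, predicted)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```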

3.4. Performance Evaluation of the 1D CNN Model in Very Hot and Cold Weather Conditions

Ensuring the proper performance of a machine learning model proposed for soil temperature prediction in the hottest and coldest temperature ranges is critical in soil science, agriculture, hydrology, water resources engineering, and geo-environmental engineering so that immediate or planned actions can be taken to prevent undesirable events. Thus, a detailed performance evaluation is conducted in the current subsection for the 1D CNN model. First, it was necessary to extract the climatic features related to soil temperature in the hottest and coldest ranges from the input dataset. The model, which was appropriately designed based on ordinary climatic features in Section 3.1, was then run using these reduced input datasets. Given the lack of a unique definition of the hottest and coldest soil temperature ranges in the literature, we applied a strategy suitable for the geographical area considered in our research to perform this extraction. In this strategy, the soil temperature in the output data was first sorted in descending order. The time-series format of soil temperature and its climatic features was preserved in this sorting. Then, the features related to the upper and lower quartiles of the sorted soil temperatures were separated and considered as features of the hottest and coldest soil temperature ranges, respectively. Both the 1D CNN and MLP models were run individually on the upper quartile (hottest temperature range) with 2196 input data, and then individually on the lower quartile (coldest temperature range) with 2196 input data. The evaluation metrics were calculated for the testing phase and compared based on their outputs. Figure 11 and Figure 12 present this comparison. Figure 11 shows that the 1D CNN model had lower error metrics, including MaxE, MAE, MSE, RMSE, RRMSE, and MAPE; lower bias and PI; and higher R², NSCE, and VAF, compared to the MLP model in very hot weather conditions. This indicates that the 1D CNN model performed better than the MLP model in predicting the soil temperature in the hottest range. The NRMSE and AIC of this model were 7.37% and 16,703.12, respectively, which were lower than those of the MLP model (9.51% and 17,316.34, respectively). This again demonstrates the capability of the 1D CNN model for soil temperature prediction under extremely hot weather conditions. Conversely, Figure 12 shows that the MLP model had lower error metrics, lower bias and PI, and higher R², NSCE, and VAF than the 1D CNN model under cold weather conditions. The NRMSE and AIC of the MLP model were 11.13% and 18,293.29, respectively, which were both lower than those of the 1D CNN model (15.64% and 18,630.33, respectively). The MLP model therefore performed better than the 1D CNN model in predicting soil temperature under very cold weather conditions. However, it is not possible to judge this properly because under such conditions, the ground surface in the Ottawa area is covered by snow and ice. This cover hampers the influence of climatic features on soil temperature, and therefore deteriorates the performance of both models in predicting soil temperature in the coldest temperature range. Adding snow- and ice-related features, such as snow depth, snow layer temperature, snow density, and ice layer temperature, to the eight climatic features considered in this study and then running both the 1D CNN and MLP models with the extended input dataset would allow a better judgement of the performance of both models under very cold weather conditions. To improve the performance evaluation metrics of the 1D CNN model under very hot and cold weather conditions, it is better to use a larger input dataset and output data covering at least four consecutive years.
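The quartile-based extraction described above can be sketched as follows. This is one possible reading of the strategy, in which the hours are ranked by soil temperature and the selected hours are then put back in chronological order so that the time-series format is preserved. The function name and toy values are hypothetical.

```python
def split_extreme_quartiles(soil_temperature):
    """Return time indices of the hottest and coldest quartiles.

    Hours are ranked by soil temperature in descending order; the top and
    bottom quarters are kept and each subset is re-sorted by time index,
    which is an assumed way of preserving the time-series order.
    """
    n = len(soil_temperature)
    quarter = n // 4
    order = sorted(range(n), key=lambda i: soil_temperature[i], reverse=True)
    hottest = sorted(order[:quarter])
    coldest = sorted(order[n - quarter:])
    return hottest, coldest


# Toy hourly soil temperatures (invented values).
temps = [5.0, 1.0, 7.0, 3.0, 8.0, 2.0, 6.0, 4.0]
hot_idx, cold_idx = split_extreme_quartiles(temps)
print(hot_idx, cold_idx)

# The same indices select the matching rows of the climatic features,
# for example: hot_features = [features[i] for i in hot_idx]
```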

3.5. Capability of 1D CNN Model in Predicting the Daily Maximum Soil Temperature

In some applications in soil science, agriculture, hydrology, water resources engineering, and geo-environmental engineering, predicting only the daily maximum soil temperature, rather than the hourly soil temperature, is sufficient for taking planned actions to prevent adverse events. This subsection examines the capability of the 1D CNN model to predict the daily maximum soil temperature. For this purpose, all climatic features related to the daily maximum soil temperatures for 2021 were extracted from the input dataset, maintaining their time-series format. The 1D CNN model proposed in Section 3.1 was executed using the extracted features. Table 5 presents the evaluation metrics calculated for both the training and testing phases of the model. This table shows that the 1D CNN model performed well in its training phase and exhibited fairly acceptable performance in its testing phase. The weaker performance observed in the testing phase was attributed to the smaller amount of data available to train the model. To improve performance, a larger dataset is recommended.
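The extraction of daily maxima can be sketched as below, assuming the hourly series starts at the first hour of a day and has 24 values per day. The function name and toy data are hypothetical; the returned indices would then be used to pick the matching rows of the climatic features.

```python
def daily_maximum_indices(hourly_soil_temperature, hours_per_day=24):
    """Return, for each complete day, the index of the hour with the
    highest soil temperature. Indices are in chronological order."""
    indices = []
    last_start = len(hourly_soil_temperature) - hours_per_day
    for start in range(0, last_start + 1, hours_per_day):
        day = hourly_soil_temperature[start:start + hours_per_day]
        # Position of the maximum within the day (first one if tied).
        offset = max(range(hours_per_day), key=day.__getitem__)
        indices.append(start + offset)
    return indices


# Toy example with 3 "hours" per day to keep the data short.
toy = [1.0, 5.0, 2.0, 7.0, 3.0, 4.0, 0.0, 0.0, 9.0]
toy_max_idx = daily_maximum_indices(toy, hours_per_day=3)
print(toy_max_idx)
```

With 24 hourly values per day, one year of hourly data yields one sample per day, which is why the daily maximum dataset is much smaller than the hourly one.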

4. Conclusions

In this study, the performance of a well-designed, straightforward three-layer 1D CNN model for predicting hourly soil temperatures under ordinary, very hot, and very cold weather conditions was examined and compared with the performance of an MLP model. A broad range of evaluation metrics was used to perform this comparison. In addition, the capability of the 1D CNN model to forecast the daily maximum soil temperature was investigated. Eight hourly climatic features in time-series format, including evaporation, air temperature, dew point temperature, surface solar radiation, surface thermal radiation, surface pressure, precipitation, and wind gust, were collectively considered in the input dataset. A detailed sensitivity analysis was conducted on each feature to determine its significance in predicting the soil temperature. The main findings of the present research are as follows: (1) an appropriate architecture for the 1D CNN model, i.e., appropriately determined hyperparameters, has the potential to eliminate the need for any coupled advanced feedforward machine learning model or modern optimizer to predict soil temperature; (2) air temperature has the greatest effect, and surface thermal radiation has the least effect, on soil temperature prediction; (3) the appropriately designed 1D CNN model demonstrates a better performance in predicting soil temperature under ordinary and hot weather conditions compared to the MLP model; and (4) this 1D CNN model can forecast the daily maximum soil temperature with acceptable precision.

Author Contributions

Conceptualization, J.H.C., A.M., H.S. and P.P.; methodology, V.F., H.I. and A.M.; software, V.F.; validation, V.F. and P.P.; formal analysis, V.F. and P.P.; investigation, V.F.; resources, V.F., H.I. and A.M.; data curation, V.F.; writing—original draft preparation, V.F.; writing—review and editing, J.H.C., P.P., H.S., H.I. and A.M.; visualization, V.F.; supervision, A.M., J.H.C. and H.S.; project administration, A.M., J.H.C. and H.S.; funding acquisition, A.M., H.S. and J.H.C. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by National Research Council Canada through the Artificial Intelligence for Logistics Supercluster Support Program, grant number AI4L-120.

Data Availability Statement

Parts of the data used in this manuscript are available through the corresponding author upon reasonable request.

Conflicts of Interest

The authors declare no conflict of interest.

References

1. Lai, L.; Zhao, X.; Jiang, L.; Wang, Y.; Luo, L.; Zheng, Y.; Chen, X.; Rimmington, G.M. Soil respiration in different agricultural and natural ecosystems in an arid region. PLoS ONE 2012, 7, e48011.
2. Liang, L.; Riveros-Iregui, D.; Emanuel, R.; McGlynn, B. A simple framework to estimate distributed soil temperature from discrete air temperature measurements in data-scarce regions. J. Geophys. Res. Atmos. 2014, 119, 407–417.
3. Onwuka, B.; Mang, B. Effects of soil temperature on some soil properties and plant growth. Adv. Plants Agric. Res. 2018, 8, 34–37.
4. Yu, F.; Hao, H.; Li, Q. An ensemble 3D convolutional neural network for spatiotemporal soil temperature forecasting. Sustainability 2021, 13, 9174.
5. Araghi, A.; Mousavi-Baygi, M.; Adamowski, J.; Martinez, C.; van der Ploeg, M. Forecasting soil temperature based on surface air temperature using a wavelet artificial neural network. Meteorol. Appl. 2017, 24, 603–611.
6. Kisi, O.; Sanikhani, H.; Cobaner, M. Soil temperature modeling at different depths using neuro-fuzzy, neural network, and genetic programming techniques. Theor. Appl. Climatol. 2017, 129, 833–848.
7. Alizamir, M.; Kisi, O.; Ahmed, A.N.; Mert, C.; Fai, C.M.; Kim, S.; Kim, N.W.; El-Shafie, A. Advanced machine learning model for better prediction accuracy of soil temperature at different depths. PLoS ONE 2020, 15, e0231055.
8. Hao, H.; Yu, F.; Li, Q. Soil temperature prediction using convolutional neural network based on ensemble empirical mode decomposition. IEEE Access 2021, 9, 4084–4096.
9. Keshavarzi, A.; Sarmadian, F.; Omran, E.S.E.; Iqbal, M. A neural network model for estimating soil phosphorus using terrain analysis. Egypt. J. Remote Sens. Space Sci. 2015, 18, 127–135.
10. Zhang, Y. Soil temperature in Canada during the twentieth century: Complex responses to atmospheric climate change. J. Geophys. Res. 2005, 110, D03112.
11. Vandoorne, R.; Gräbe, P.J.; Heymann, G. Soil suction and temperature measurements in a heavy haul railway formation. Transp. Geotech. 2021, 31, 100675.
12. Feng, Y.; Cui, N.; Hao, W.; Gao, L.; Gong, D. Estimation of soil temperature from meteorological data using different machine learning models. Geoderma 2019, 338, 67–77.
13. Bonakdari, H.; Moeeni, H.; Ebtehaj, I.; Zeynoddin, M.; Mahoammadian, A.; Gharabaghi, B. New insights into soil temperature time series modeling: Linear or nonlinear? Theor. Appl. Climatol. 2019, 135, 1157–1177.
14. Zeynoddin, M.; Bonakdari, H.; Ebtehaj, I.; Esmaeilbeiki, F.; Gharabaghi, B.; Zare Haghi, D. A reliable linear stochastic daily soil temperature forecast model. Soil Tillage Res. 2019, 189, 73–87.
15. Mehdizadeh, S.; Fathian, F.; Safari, M.J.S.; Khosravi, A. Developing novel hybrid models for estimation of daily soil temperature at various depths. Soil Tillage Res. 2020, 197, 104513.
16. Zeynoddin, M.; Ebtehaj, I.; Bonakdari, H. Development of a linear based stochastic model for daily soil temperature prediction: One step forward to sustainable agriculture. Comput. Electron. Agric. 2020, 176, 105636.
17. Fradkov, A.L. Early history of machine learning. IFAC-PapersOnLine 2020, 53, 1385–1390.
18. Bochenek, B.; Ustrnul, Z. Machine learning in weather prediction and climate analyses-Applications and perspectives. Atmosphere 2022, 13, 180.
19. Abyaneh, H.Z.; Varkeshi, M.B.; Golmohammadi, G.; Mohammadi, K. Soil temperature estimation using an artificial neural network and co-active neuro-fuzzy inference system in two different climates. Arab. J. Geosci. 2016, 9, 377.
20. Citakoglu, H. Comparison of artificial intelligence techniques for prediction of soil temperatures in Turkey. Theor. Appl. Climatol. 2017, 130, 545–556.
21. Zhang, X.; Zhang, Q.; Zhang, G.; Nie, Z.; Gui, Z.; Que, H. A novel hybrid data-driven model for daily land surface temperature forecasting using long short-term memory neural network based on ensemble empirical mode decomposition. Int. J. Environ. Res. Public Health 2018, 15, 1032.
22. Delbari, M.; Sharifazari, S.; Mohammadi, E. Modeling daily soil temperature over diverse climate conditions in Iran-A comparison of multiple linear regression and support vector regression techniques. Theor. Appl. Climatol. 2019, 135, 991–1001.
23. Li, C.; Zhang, Y.; Ren, X. Modeling hourly soil temperature using deep BiLSTM neural network. Algorithms 2020, 13, 173.
24. Penghui, L.; Ewees, A.A.; Beyaztas, B.H.; Qi, C.; Salih, S.Q.; Al-Ansari, N.; Bhagat, S.K.; Yaseen, Z.M.; Singh, V.P. Metaheuristic optimization algorithms hybridized with artificial intelligence model for soil temperature prediction: Novel model. IEEE Access 2020, 8, 51884–51904.
25. Shamshirband, S.; Esmaeilbeiki, F.; Zarehaghi, D.; Neyshabouri, M.; Samadianfard, S.; Ghorbani, M.A.; Mosavi, A.; Nabipour, N.; Chau, K.W. Comparative analysis of hybrid models of firefly optimization algorithm with support vector machines and multilayer perceptron for predicting soil temperature at different depths. Eng. Appl. Comput. Fluid Mech. 2020, 14, 939–953.
26. Seifi, A.; Ehteram, M.; Nayebloei, F.; Soroush, F.; Gharabaghi, B.; Haghighi, A.T. GLUE uncertainty analysis of hybrid models for predicting hourly soil temperature and application wavelet coherence analysis for correlation with meteorological variables. Soft Comput. 2021, 25, 10723–10748.
27. Imanian, H.; Hiedra Cobo, J.; Payeur, P.; Shirkhani, H.; Mohammadian, A. A comprehensive study of artificial intelligence applications for soil temperature prediction in ordinary climate conditions and extremely hot events. Sustainability 2022, 14, 8065.
28. Shomron, G.; Weiser, U. Spatial correlation and value prediction in convolutional neural networks. IEEE Comput. Archit. Lett. 2019, 18, 10–13.
29. O’Gorman, P.A.; Dwyer, J.G. Using machine learning to parameterize moist convection: Potential for modeling of climate, climate change, and extreme events. J. Adv. Model. Earth Syst. 2018, 10, 2548–2563.
30. Huang, L.; Kang, J.; Wan, M.; Fang, L.; Zhang, C.; Zeng, Z. Solar radiation prediction using different machine learning algorithms and implications for extreme climate events. Front. Earth Sci. 2021, 9, 596860.
31. Araújo, A.D.S.; Silva, A.R.; Zárate, L.E. Extreme precipitation prediction based on neural network model-A case study for southeastern Brazil. J. Hydrol. 2022, 606, 127454.
32. Hersbach, H.; Bell, B.; Berrisford, P.; Biavati, G.; Horányi, A.; Muñoz Sabater, J.; Nicolas, J.; Peubey, C.; Radu, R.; Rozum, I.; et al. ERA5 Hourly Data on Single Levels from 1979 to Present. In Copernicus Climate Change Service (C3S) Climate Data Store (CDS); 2018. Available online: https://cds.climate.copernicus.eu/cdsapp#!/dataset/reanalysis-era5-single-levels?tab=overview (accessed on 21 November 2022).
33. Google Maps. Available online: https://www.google.ca/maps/@45.3759264,-75.7182361,11.33z (accessed on 21 November 2022).
34. Wang, X.; Li, W.; Li, Q. A new embedded estimation model for soil temperature prediction. Sci. Program. 2021, 2021, 5881018.
35. Kim, P. MATLAB Deep Learning: With Machine Learning, Neural Networks and Artificial Intelligence; Apress: Seoul, Republic of Korea, 2017; pp. 103–120.
36. Gebrehiwot, A.; Hashemi-Beni, L.; Thompson, G.; Kordjamshidi, P.; Langan, T. Deep convolutional neural network for flood extent mapping using unmanned aerial vehicles data. Sensors 2019, 19, 1486.
37. Xu, D.; Zhang, S.; Zhang, H.; Mandic, D.P. Convergence of the RMSProp deep learning method with penalty for nonconvex optimization. Neural Netw. 2021, 139, 17–23.
38. Imanian, H.; Shirkhani, H.; Mohammadian, A.; Hiedra Cobo, J.; Payeur, P. Spatial interpolation of soil temperature and water content in the land-water interface using artificial intelligence. Water 2023, 15, 473.
39. Tüysüzoglu, G.; Birant, D.; Kiranoglu, V. Soil temperature prediction via self-training: Izmir case. J. Agric. Sci. 2022, 28, 47–62.
40. Bayatvarkeshi, M.; Bhagat, S.K.; Mohammadi, K.; Kisi, O.; Farahani, M.; Hasani, A.; Deo, R.; Yaseen, Z.M. Modeling soil temperature using air temperature features in diverse climatic conditions with complementary machine learning models. Comput. Electron. Agric. 2021, 185, 106158.
41. Samadianfard, S.; Ghorbani, M.A.; Mohammadi, B. Forecasting soil temperature at multiple-depth with a hybrid artificial neural network model coupled-hybrid firefly optimizer algorithm. Inf. Process. Agric. 2018, 5, 465–476.

Figure 1. Location of the target station in the Ottawa area [33].

Figure 2. Hourly time-series data freely downloaded from the website of ERA5 [32] for (a) air temperature, (b) dew point temperature, (c) instantaneous wind gusts, and (d) soil temperature during 2021.

Figure 3. The last part of the first unit of the 1D CNN connected to its second unit.

Figure 4. Methodological flowchart for this study.

Figure 5. Evaluation metrics calculated for the three-layer 1D CNN model in its training phase, (a) MaxE, MAE, MSE, RMSE, and bias, (b) RRMSE, MAPE, and PI, (c) R², NSCE, and VAF.

Figure 6. Evaluation metrics calculated for the three-layer 1D CNN model in its testing phase, (a) MaxE, MAE, MSE, RMSE, and bias, (b) RRMSE, MAPE, and PI, (c) R², NSCE, and VAF.

Figure 7. Evaluation metrics calculated for the two-layer 1D CNN model in its training phase, (a) MaxE, MAE, MSE, RMSE, and bias, (b) RRMSE, MAPE, and PI, (c) R², NSCE, and VAF.

Figure 8. Evaluation metrics calculated for the two-layer 1D CNN model in its testing phase, (a) MaxE, MAE, MSE, RMSE, and bias, (b) RRMSE, MAPE, and PI, (c) R², NSCE, and VAF.

Figure 9. Distribution of the soil temperature under ordinary weather conditions predicted by the 1D CNN model (a,c) and the MLP model (b,d), in the training and testing phases.

Figure 10. Distribution of the soil temperature under ordinary weather conditions predicted by the 1D CNN model (a,c) and the MLP model (b,d), in the training and testing phases.

Figure 11. Evaluation metrics of the 1D CNN model vs. the MLP model in very hot weather conditions, (a) MaxE, MAE, MSE, RMSE, bias, (b) RRMSE, MAPE, PI, (c) R², NSCE, VAF.

Figure 12. Evaluation metrics of the 1D CNN model vs. the MLP model in very cold weather conditions, (a) MaxE, MAE, MSE, RMSE, bias, (b) RRMSE, MAPE, PI, (c) R², NSCE, VAF.

Table 1. Evaluation metrics calculated for the two-layer and three-layer 1D CNN models with the best kernel number configuration, shown in parts (a) and (b).

(a)
Training	MaxE	MAE	MSE	RMSE	NRMSE	RRMSE	MAPE
Three layers, 64, 32, 16 kernels	3.94	1.22	2.54	1.58	3.25%	0.56%	0.45%
Two layers, 256, 128 kernels	4.70	1.27	2.54	1.59	3.27%	0.56%	0.46%
Testing	MaxE	MAE	MSE	RMSE	NRMSE	RRMSE	MAPE
Three layers, 64, 32, 16 kernels	7.35	1.93	6.37	2.52	6.90%	0.91%	0.70%
Two layers, 256, 128 kernels	9.33	2.34	9.28	3.04	8.32%	1.10%	0.84%
(b)
Training	R²	NSCE	VAF	AIC	PI
Three layers, 64, 32, 16 kernels	98.69%	97.91%	98.87%	48368.13	0.28%
Two layers, 256, 128 kernels	96.10%	97.90%	98.14%	51884.45	0.29%
Testing	R²	NSCE	VAF	AIC	PI
Three layers, 64, 32, 16 kernels	90.25%	88.14%	90.42%	24535.60	0.46%
Two layers, 256, 128 kernels	86.54%	82.73%	86.13%	25673.20	0.57%

Table 2. Sensitivity analysis of the climatic features in the input dataset of the selected 1D CNN model.

	MAE	MSE	RMSE	NRMSE (%)	RRMSE (%)	MAPE (%)
All features included	1.93	6.37	2.52	6.90	0.91	0.70
Without precipitation	2.52 (30.57%)	10.08 (58.24%)	3.10 (23.02%)	8.49 (23.04%)	1.12 (23.08%)	0.91 (30.00%)
Without surface pressure	2.23 (15.54%)	8.15 (27.94%)	2.83 (12.30%)	7.73 (12.03%)	1.02 (12.09%)	0.81 (15.71%)
Without evaporation	2.05 (6.22%)	7.11 (11.62%)	2.66 (5.56%)	7.29 (5.65%)	0.96 (5.49%)	0.74 (5.71%)
Without wind gust	2.12 (9.84%)	7.81 (22.61%)	2.77 (9.92%)	7.57 (9.71%)	1.00 (9.89%)	0.77 (10.00%)
Without dewpoint temperature	2.38 (23.32%)	8.88 (39.40%)	2.97 (17.86%)	8.13 (17.83%)	1.07 (17.58%)	0.86 (22.86%)
Without surface solar radiation	2.25 (16.58%)	8.41 (32.03%)	2.89 (14.68%)	7.93 (14.93%)	1.04 (14.29%)	0.81 (15.71%)
Without surface thermal radiation	1.94 (0.52%)	6.53 (2.51%)	2.55 (1.19%)	6.98 (1.16%)	0.92 (1.10%)	0.70 (0.00%)
Without air temperature	2.72 (40.93%)	12.06 (89.32%)	3.45 (36.90%)	9.45 (36.96%)	1.24 (36.26%)	0.98 (40.00%)

Table 3. Three-layer well-designed 1D CNN model versus the three-layer MLP model, shown in parts (a) and (b).

(a)
Training	MaxE	MAE	MSE	RMSE	NRMSE	RRMSE	MAPE
CNN	3.94	1.22	2.54	1.58	3.25%	0.56%	0.45%
MLP	5.29	1.49	3.24	1.79	3.66%	0.63%	0.55%
Testing	MaxE	MAE	MSE	RMSE	NRMSE	RRMSE	MAPE
CNN	7.35	1.93	6.37	2.52	6.90%	0.91%	0.70%
MLP	8.87	2.03	7.49	2.72	7.43%	0.98%	0.73%
(b)
Training	bias	R²	NSCE	VAF	AIC	PI
CNN	0.66	98.69%	97.91%	98.87%	48368.13	0.28%
MLP	1.82	98.69%	97.33%	98.63%	49476.55	0.32%
Testing	bias	R²	NSCE	VAF	AIC	PI
CNN	0.91	90.25%	88.14%	90.42%	24535.60	0.46%
MLP	2.12	89.11%	86.07%	89.48%	24748.12	0.50%

Table 4. Three-layer well-designed 1D CNN model versus the RF and SVR models.

Testing Phase	MaxE	MSE	RMSE	NRMSE	RRMSE	R²	VAF	PI
CNN	7.35	6.37	2.52	6.90%	0.91%	90.25%	90.42%	0.46%
RF	11.63	6.43	2.54	6.93%	0.91%	88.19%	88.78%	0.46%
SVR	12.11	7.47	2.73	7.48%	0.94%	86.28%	88.86%	0.48%

Table 5. Evaluation metrics calculated for the 1D CNN model predicting the daily maximum soil temperature, shown in parts (a) and (b).

(a)
Phase	MaxE	MAE	MSE	RMSE	NRMSE	RRMSE	MAPE
Training	3.62	2.10	6.54	2.48	5.49%	0.87%	0.73%
Testing	5.72	3.58	19.23	4.35	13.31%	1.55%	1.28%
(b)
Phase	bias	R²	NSCE	VAF	AIC	PI
Training	2.48	98.10%	94.79%	98.06%	1658.96	0.43%
Testing	2.69	83.70%	68.94%	83.02%	779.80	0.81%

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

