The findings are structured to evaluate the thermophysical properties and stability of coconut oil, its effectiveness as a heat-storage medium, and its influence on greenhouse root-zone temperatures under real climatic conditions. In addition, the predictive performance of the applied machine learning models is compared through statistical metrics, feature importance, and sensitivity analyses. The results collectively highlight the potential of coconut oil-based PCM systems for improving thermal management and energy efficiency in greenhouse environments.
3.1. Selection Criteria of Coconut Oil
Differential Scanning Calorimetry (DSC) is the most widely used method for determining the melting/freezing points, phase-change heat, and specific heat capacity of PCMs. In this study, DSC measurements were carried out with a Mettler Toledo device at the Cukurova University Central Research Laboratory, using a 10 mg CO sample in a nitrogen atmosphere with a nitrogen flow rate of 40 mL/min and a heating–cooling rate of 1 °C/min. Each sample was heated from −10 °C to 60 °C and then cooled from 60 °C to −10 °C. The thermal conductivity of CO was measured using a HotDisk2500S with a Kapton sensor at an ambient temperature of 25 °C. The differential scanning calorimetry and thermal conductivity results for the coconut oil used in the study are given in
Figure 5 and
Table 4. As expected, the specific enthalpy and thermal conductivity values obtained when assessing the usability of coconut oil as a PCM were similar to those reported in other studies [
36,
37,
38,
39]. CO releases 98.95 J/g of the 117.36 J/g of energy it absorbs during heating, which corresponds to a storage efficiency of 84.3%. Although the thermal conductivity of the sample is low at 0.193 W/m·K, CO has development potential because it is abundant and of natural origin, and its thermal conductivity can be increased with nanoparticles.
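The storage efficiency quoted above is the ratio of released to absorbed energy. The short sketch below reproduces that arithmetic from the two DSC values given in the text; the function and constant names are illustrative only and do not come from the study.

```python
# Values reported in the text (J/g); names are illustrative.
ABSORBED_J_PER_G = 117.36  # energy taken in during heating
RELEASED_J_PER_G = 98.95   # energy given back during cooling


def storage_efficiency(released, absorbed):
    """Return released/absorbed energy as a percentage."""
    if absorbed <= 0:
        raise ValueError("absorbed energy must be positive")
    return 100.0 * released / absorbed


efficiency = storage_efficiency(RELEASED_J_PER_G, ABSORBED_J_PER_G)
print(f"Storage efficiency: {efficiency:.1f}%")  # prints "Storage efficiency: 84.3%"
```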
CO possesses desirable thermal properties, such as a phase-change temperature between 17 and 26 °C [
36,
37], a moderate latent heat of fusion of 102–103 kJ/kg [
38], and a thermal conductivity ranging from 0.161–0.321 W/m·K [
39], in addition to its eco-friendly, bio-based nature.
Coconut oil, which falls under the fatty acid group in the PCM classification, is reported to remain biochemically stable for 200 thermal cycles [
40] or more than two years, which also makes it cost-effective. Overall, studies suggest that coconut oil is suitable as a heat-storage material for thermal applications in buildings [
15,
37,
41,
42,
43]. Thermal stability analysis of CO was performed with different numbers of cycles (100, 200, 300, 400, and 500), and results are given in
Figure 6.
Thermogravimetric analysis produces curves showing the weight loss of a substance as a function of temperature. The weight of the CO sample remains constant until decomposition starts. The mass loss observed afterwards results from the temperature increase in the boiling phase and evaporation during the test. The onset temperature of thermal decomposition of coconut oil was found to be 252 °C in this study. Most of the mass loss (95%) occurred between 200 and 400 °C. Based on its degradation curve, CO falls into the class of medium-volatility oils (200–600 °C). A two-stage thermal degradation is visible in the curve shown in
Figure 7. This is because the chain length of the fatty acids, chain branching, and the degree of unsaturation all affect the thermo-oxidative properties of the fatty ester [
44].
3.3. Statistics and Machine Learning for Experiment
Machine learning was applied to predict the root-zone temperature in order to determine the impact of PCM heat storage in a greenhouse. In machine learning applications, the training and test sets must be evaluated properly to avoid underfitting, overfitting, and multicollinearity. Before the analysis, the input parameters were checked for multicollinearity. The Variance Inflation Factor (VIF) values of the input parameters were Rad (3.962), Vw (1.329), Moist (2.066), Tamb (7.296), Tghot (9.787), and Tpcm (3.260). A VIF lower than 10 is taken to indicate the absence of a multicollinearity problem [
45,
46]; thus, there is no multicollinearity problem in this study. Descriptive statistics of the measured parameters in this experiment are presented in
Table 5. The average root-zone temperature (Tphot) ranged between 3.30 °C and 44.65 °C. The average Tpcm, which was expected to be the parameter with the greatest effect on Tphot, varied between −1.84 °C and 44.65 °C.
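As an illustration of the multicollinearity check described above, the following sketch computes VIF values as the diagonal of the inverse correlation matrix of the predictors, which is one standard way to obtain them. It is not the authors' code: the helper names, the toy columns x1 and x2, and the use of plain Python instead of a statistics package are assumptions made for the example. The VIF values reported in the text are included only to show the threshold comparison.

```python
import math

VIF_THRESHOLD = 10.0  # cut-off used in the text

# VIF values reported in the text for the six input parameters.
REPORTED_VIF = {"Rad": 3.962, "Vw": 1.329, "Moist": 2.066,
                "Tamb": 7.296, "Tghot": 9.787, "Tpcm": 3.260}


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def invert(matrix):
    """Invert a square matrix by Gauss-Jordan elimination with pivoting."""
    n = len(matrix)
    aug = [row[:] + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("correlation matrix is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [a - f * b for a, b in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def vif(columns):
    """Return {name: VIF} for a dict of predictor columns."""
    names = list(columns)
    corr = [[pearson(columns[a], columns[b]) for b in names] for a in names]
    inverse = invert(corr)
    return {name: inverse[i][i] for i, name in enumerate(names)}


# Toy predictors (hypothetical data, correlation 0.8 between them).
demo = vif({"x1": [1, 2, 3, 4, 5], "x2": [2, 1, 4, 3, 5]})
no_problem = all(v < VIF_THRESHOLD for v in REPORTED_VIF.values())
print(demo, no_problem)
```

With two predictors the result reduces to 1 / (1 − r²), which is a convenient sanity check for the helper.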
According to the paired t-test, there was a statistically significant difference between the soil temperatures of the heated pots (Tphot) and the unheated pots (Tpcold) (t = 18.558, df = 2657, p < 0.001). The 95% confidence interval for the mean difference was (1.39, 1.72), and the average soil temperature in the heated pots was 1.55 °C higher than in the unheated pots. This finding indicates that the application of phase-change materials (PCMs) is effective in increasing root-zone temperatures.
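A paired t-test of this kind works on the per-observation differences between the two groups. The sketch below shows the calculation on a few made-up temperature pairs; the data, the function name, and the returned keys are hypothetical. The confidence interval uses a normal critical value, which is a large-sample approximation that is reasonable for a sample of the size reported here (df = 2657) but too narrow for the tiny toy sample; a Student t quantile would be needed for small samples.

```python
import math
from statistics import NormalDist, mean, stdev


def paired_t_test(heated, unheated, confidence=0.95):
    """Paired t statistic and a large-sample CI for the mean difference."""
    if len(heated) != len(unheated) or len(heated) < 2:
        raise ValueError("need two equal-length samples with n >= 2")
    diffs = [h - c for h, c in zip(heated, unheated)]
    n = len(diffs)
    mean_diff = mean(diffs)
    std_err = stdev(diffs) / math.sqrt(n)  # standard error of the mean difference
    # Normal approximation to the t critical value (assumption: large n).
    z = NormalDist().inv_cdf(0.5 + confidence / 2.0)
    return {
        "t": mean_diff / std_err,
        "df": n - 1,
        "mean_diff": mean_diff,
        "ci": (mean_diff - z * std_err, mean_diff + z * std_err),
    }


# Hypothetical root-zone temperatures (°C), not data from the study.
tphot = [12.0, 14.5, 16.0, 13.0]
tpcold = [10.5, 13.0, 14.0, 12.0]
result = paired_t_test(tphot, tpcold)
print(result)
```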
Linear relationships between the measured parameters were determined using the Pearson correlation coefficient (R), and the results are illustrated (
Figure 10). As can be seen from
Figure 10, Tphot has a positive and highly significant correlation (
p < 0.001) with Tpcm (0.96), Tghot (0.82), Rad (0.70), and Tamb (0.67). Rad is positively correlated with Tghot, Tpcm, and Tphot, with R ranging from 0.70 to 0.81, and Moist is negatively correlated with Tamb, Tghot, Tpcm, and Tphot (
p < 0.001).
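The Pearson coefficient and its significance test can be sketched as follows. The code computes R for two series and the t statistic commonly used to test whether R differs from zero, t = R·sqrt((n − 2) / (1 − R²)). The sample values and function names are hypothetical and only serve to exercise the formulas.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient R between two sequences."""
    if len(x) != len(y) or len(x) < 3:
        raise ValueError("need two equal-length samples with n >= 3")
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def correlation_t_statistic(r, n):
    """t statistic (df = n - 2) for testing R = 0; requires |r| < 1."""
    return r * math.sqrt((n - 2) / (1.0 - r * r))


# Hypothetical paired readings, not data from the study.
x = [1, 2, 3, 4, 5]
y = [2, 1, 4, 3, 5]
r = pearson_r(x, y)
t = correlation_t_statistic(r, len(x))
print(f"R = {r:.2f}, t = {t:.3f}")
```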
The predictive power of a model needs to be evaluated against several criteria. The results for six evaluation criteria are therefore summarized in
Table 6. The smallest training RMSE was 0.59 °C for XGBoost, and the smallest testing RMSE was 1.66 °C for SVR. Ref. [
47] reported RMSE values between 2.05 and 3.54 °C, and in [
46] the smallest RMSE found was 1.61. SDR values in this study ranged between 0.07 and 0.23. The minimum SDR was observed for XGBoost in both the training and testing datasets.
MAPE and MAD were smallest for training in XGBoost, at 2.48 and 0.46, respectively. For testing, MAPE was smallest in XGBoost (6.90) and MAD was smallest in SVR (1.26). The R² values show that all three algorithms produced strong models (>95%). XGBoost gave a better result (99%) than the others on the training dataset, while the R² values on the testing dataset were equal for SVR and XGBoost (96%). The results of the evaluation criteria shown in
Table 6 are within the ideal limits. All three algorithms achieved strong results in modeling the root-zone temperature. While the XGBoost algorithm achieved the best results during the training of the dataset, XGBoost and SVR achieved similar results during the evaluation of the test set.
Figure 11 shows the linear regression between the observed and predicted values of the models used in the study for both training and testing sets.
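For readers who want to reproduce the goodness-of-fit criteria named above, the sketch below implements RMSE, MAPE, MAD, R², and SDR for a pair of observed and predicted series. The exact formulas used in the study are not given in this section, so the definitions here are assumptions: MAD is taken as the mean absolute error, R² as 1 − SSE/SST, and SDR as the standard deviation of the errors divided by the standard deviation of the observed values. The sample numbers are hypothetical.

```python
import math
from statistics import mean, stdev


def evaluation_metrics(observed, predicted):
    """Return RMSE, MAPE (%), MAD, R2 and SDR under the stated assumptions."""
    if len(observed) != len(predicted) or len(observed) < 2:
        raise ValueError("need two equal-length samples with n >= 2")
    errors = [o - p for o, p in zip(observed, predicted)]
    sse = sum(e * e for e in errors)
    obs_mean = mean(observed)
    sst = sum((o - obs_mean) ** 2 for o in observed)
    return {
        "RMSE": math.sqrt(sse / len(errors)),
        # MAPE is undefined when an observed value is zero.
        "MAPE": 100.0 * mean(abs(e) / abs(o) for e, o in zip(errors, observed)),
        "MAD": mean(abs(e) for e in errors),
        "R2": 1.0 - sse / sst,
        # Assumed definition: SD of errors over SD of observed values.
        "SDR": stdev(errors) / stdev(observed),
    }


# Hypothetical temperatures (°C), not data from the study.
observed = [10.0, 20.0, 30.0, 40.0]
predicted = [12.0, 18.0, 33.0, 39.0]
metrics = evaluation_metrics(observed, predicted)
print(metrics)
```

Lower RMSE, MAPE, MAD, and SDR and a higher R² indicate a better fit, which is how the comparison among XGBoost, SVR, and MARS is read in the text.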
Heating requirements of greenhouses in the Mediterranean region are lower than in other regions. The annual average energy consumption of a greenhouse in the Mediterranean region for a 90-day heating period is approximately 150 kW. The amount of fuel oil required to meet this demand is 0.055 L/m², and the amount of coal is 0.1 kg/m² [
48]. Ref. [
49] reported that heat storage on the soil surface can meet up to 13–19% of the total daily heat requirement of a greenhouse. Ref. [
50] reported that the air temperature inside the greenhouse should be 12 °C higher than the air temperature outside. With the proposed heat-storage application, both the indoor and root-zone temperatures of the greenhouse were higher than those of the control groups. Root-zone temperatures in the greenhouse compartment with heat storage showed better results than the control group during the night hours. Before noon, root-zone temperatures in the compartment with heat storage were lower because the PCM absorbed and stored heat from the interior, which lowered both the indoor and root-zone temperatures. With the short-term (day-to-night) heat-storage process, indoor temperatures were 1.8 °C, 2.4 °C, and 0.4 °C higher than in the control greenhouse in February, March, and April, respectively. Root-zone temperatures were 1.76 °C, 0.91 °C, and 1.46 °C higher than in the control pots for the same months, respectively.
This study demonstrates the effectiveness of machine learning models (SVR, MARS, XGBoost) in analyzing the effects of climatic factors and PCM on root-zone temperature and in predicting future values. The models combine climatic factors such as temperature, humidity, solar radiation, and wind velocity with internal factors such as the air temperature in the greenhouse and the PCM temperature. These models aim to capture nonlinear relationships between inputs and outputs. With them, the target output can be predicted with high accuracy from existing input data.
Ref. [
51] achieved a low Root Mean Square Error (RMSE) value of 3.7 °C for the test dataset in their study, employing Artificial Neural Networks (ANNs) to predict temperature fluctuations in a high tunnel greenhouse. In another study, ref. [
52] utilized ANNs to estimate greenhouse temperatures, yielding R² values of 0.959 (winter) and 0.955 (summer) within a 95% confidence interval. Ref. [
53] attempted to forecast temperature and humidity levels in a Chinese greenhouse, employing various machine learning approaches. Support Vector Machine (SVM) demonstrated an RMSE of 2.78 °C and an R² of 0.89 for temperature prediction, along with an RMSE of 4.55% and an R² of 0.87 for humidity. Ref. [
54] conducted a comparative analysis of machine learning models including Random Forest, SVM, Multiple Linear Regression (MLP), Long Short-Term Memory (LSTM), and Gated Recurrent Unit (GRU) for predicting minimum greenhouse temperatures. The random forest model exhibited the highest R² (0.87) with the lowest RMSE (4.43 °C), while SVM yielded an R² of 0.87 and an RMSE of 4.52 °C. SVM has been utilized for understanding greenhouse indoor temperatures and the energy-storage performance of phase-change materials in solar collectors. Ref. [
22] utilized the Nonlinear Autoregressive Networks with Exogenous Input (NARX) algorithm to model indoor air temperatures in greenhouses with and without Phase-Change Materials (PCMs). Ref. [
55] applied NARX and Recurrent Neural Network (RNN) algorithms to model indoor temperatures, achieving R² values of 0.9986 and 0.9893, respectively. These findings underscore the efficacy of machine learning techniques in developing heat-storage systems. Ref. [
56] compared ANN and Multivariate Adaptive Regression Splines (MARS) methods for estimating indoor temperatures in greenhouses, noting that MARS provided more detailed results. The MARS data-mining algorithm has been used across various domains, including agricultural studies [
32,
57,
58].
Determining which variables are significant and assessing their impact on model performance is of paramount importance in machine learning. To this end, feature importance analysis was conducted for each model, and the results are presented in
Table 7. Additionally, sensitivity analysis, which evaluates how variations in each variable affect model performance, provides critical insights into model accuracy, robustness, and hyperparameter tuning. The sensitivity metrics are also included in
Table 7.
Upon examination of
Table 7, it was observed that in the XGBoost and SVR models, all six input variables contributed to model performance, whereas in the MARS model, Vw was found to be ineffective and was thus excluded. Across all three models, Tpcm was identified as the most influential factor affecting Tphot. The increase in temperature of the phase-change material (coconut oil) led to a corresponding rise in root-zone temperatures.
Feature importance analysis was employed to evaluate the impact of input variables on the output variable by calculating their gain values. The sum of all variable gains equals one. The influence of Tpcm on Tphot was determined to be 68.8% in XGBoost, 47.5% in SVR, and 43.5% in MARS. While XGBoost and MARS models identified Rad (18.4%) as the second-most significant parameter after Tpcm, the SVR model indicated Tamb (19.6%) as the secondary influential factor.
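The statement that the gains sum to one follows from normalizing each variable's raw gain by the total. A minimal sketch of that normalization is given below; the raw gain numbers are hypothetical placeholders chosen for easy arithmetic and are not the values behind Table 7.

```python
def normalize_gains(raw_gains):
    """Scale raw gain values so that the importances sum to one."""
    total = sum(raw_gains.values())
    if total <= 0:
        raise ValueError("total gain must be positive")
    return {name: gain / total for name, gain in raw_gains.items()}


# Hypothetical raw gains for four inputs (not values from the study).
raw = {"Tpcm": 6.0, "Rad": 2.0, "Tamb": 1.0, "Tghot": 1.0}
importance = normalize_gains(raw)
ranked = sorted(importance, key=importance.get, reverse=True)
print(importance, ranked[0])
```

Expressed this way, a reported importance such as 68.8% for Tpcm in XGBoost is simply that variable's share of the total gain.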
According to the sensitivity analysis results, which demonstrate the effect of variations in each variable on model performance, the Moist and Vw variables influence the model output, albeit with relatively lower importance scores in all algorithms except MARS. Mean dropout loss values indicate that even minor changes in Tpcm would lead to significant differences in model performance across all three algorithms.
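One common way to obtain a mean dropout loss is permutation: the values of a single input are shuffled, the model is re-evaluated, and the resulting loss is averaged over several repeats. The sketch below illustrates this idea under that assumption; the section does not state how the study computed its values, and the toy model, its coefficients, the data, and the function names are all hypothetical.

```python
import math
import random


def rmse(observed, predicted):
    """Root mean square error between two equal-length sequences."""
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted))
                     / len(observed))


def mean_dropout_loss(predict, rows, target, feature, repeats=20, seed=0):
    """Average RMSE after shuffling one feature across the rows."""
    rng = random.Random(seed)
    losses = []
    for _ in range(repeats):
        shuffled = [row[feature] for row in rows]
        rng.shuffle(shuffled)
        permuted = [{**row, feature: value}
                    for row, value in zip(rows, shuffled)]
        losses.append(rmse(target, [predict(row) for row in permuted]))
    return sum(losses) / len(losses)


def toy_model(row):
    """Hypothetical stand-in for a fitted model of Tphot."""
    return 0.9 * row["Tpcm"] + 0.01 * row["Vw"]


# Synthetic inputs; the target is generated by the toy model itself.
rows = [{"Tpcm": 5.0 + 5.0 * i, "Vw": 0.4 * i} for i in range(8)]
target = [toy_model(row) for row in rows]

baseline = rmse(target, [toy_model(row) for row in rows])
loss_tpcm = mean_dropout_loss(toy_model, rows, target, "Tpcm")
loss_vw = mean_dropout_loss(toy_model, rows, target, "Vw")
print(baseline, loss_tpcm, loss_vw)
```

In this toy setup, shuffling Tpcm degrades the predictions far more than shuffling Vw, which mirrors the pattern described in the text, where Tpcm dominates and Vw contributes little.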