Article

Experimental Data-Driven Machine Learning Analysis for Prediction of PCM Charging and Discharging Behavior in Portable Cold Storage Systems

1 Symbiosis Institute of Technology Pune, Symbiosis International (Deemed University), Pune 412115, India
2 Department of Mechanical Engineering, Smt. Kashibai Navale College of Engineering, Pune 411041, India
3 Symbiosis Centre for Nanoscience and Nanotechnology, Symbiosis International (Deemed University), Pune 412115, India
4 School of Engineering, Mathematics and Physics, University of East Anglia, Norwich Research Park, Norwich NR4 7TJ, UK
* Author to whom correspondence should be addressed.
Sustainability 2026, 18(3), 1467; https://doi.org/10.3390/su18031467
Submission received: 12 December 2025 / Revised: 24 January 2026 / Accepted: 29 January 2026 / Published: 2 February 2026

Abstract

Post-harvest loss of perishable products poses a persistent threat to food security, especially in areas lacking adequate cold chain facilities. The issue is directly connected with sustainability objectives, because post-harvest losses are a major source of food wastage, unnecessary energy use, and related greenhouse gas emissions. Cold storage with phase-change material (PCM) is a promising alternative, stabilizing temperatures and improving energy efficiency, but performance analyses to date have relied on experimental testing and computational fluid dynamics (CFD) simulations, which are precise but computationally expensive. To address this drawback, the current work constructs machine learning models to predict the charging and discharging temperature dynamics of PCM cold storage systems. Four regression models, namely Random Forest, Extreme Gradient Boosting (XGBoost), Support Vector Regression (SVR), and K-Nearest Neighbors (KNN), were trained and tested on experimental datasets obtained for varying storage layouts. Model performance was assessed using MSE, MAE, R2, MAPE, and percentage accuracy. Among the evaluated models, Random Forest demonstrated the highest predictive accuracy, achieving coefficients of determination (R2) exceeding 0.98 for both charging and discharging processes, with mean absolute errors below 0.6 °C during charging and 0.3 °C during discharging. KNN was competitive in the discharging process, especially for consistent thermal recovery patterns, and XGBoost delivered consistent layout-level accuracy, whereas SVR showed comparatively lower robustness, particularly for the nonlinear charging dynamics.
This work demonstrates that machine learning is an efficient surrogate for CFD and experiment-only methods and can predict the thermal behavior of PCMs quickly and precisely. Through fast, data-driven forecasting of PCM thermal behavior, the proposed framework supports the development of energy-efficient, low-cost, and sustainable cold storage systems, especially for decentralized and resource-limited agricultural supply chains.

1. Introduction

Post-Harvest Losses (PHLs) remain a major constraint on food security and farm earnings, and recent research indicates that fruits and vegetables have the highest loss rates worldwide, frequently ranging between 28 and 55% across the supply chain [1]. Such losses amplify nutrition and market failures, especially in low- and middle-income regions; the current literature highlights that reducing PHLs is a direct, scalable lever for improving food security and diet quality [2]. Enhancing the first-mile cold chain is consistently identified as a high-impact intervention: when implemented properly around production locations, cold storage has been observed to reduce handling and storage losses by large margins. Field-scale studies have reported that the deployment of cold storage facilities at production sites can reduce post-harvest losses by approximately 10–25%, depending on crop type, storage duration, and climatic conditions, with specific case studies reporting reductions of around 13% [3]. In this context, cold storage using phase-change material (PCM) is a promising route to stable temperatures, enhanced energy efficiency, peak-load shifting, and passive buffering against power intermittency. State-of-the-art reviews on cold thermal energy storage (CTES) emphasize the use of conventional PCMs (e.g., hydrated salts, paraffins, and water/ice) in refrigeration and cold chain applications, reporting advantages in thermoregulation and energy saving [4]. The complementary literature on small-scale and portable cold storage likewise highlights the potential of integrating PCMs to increase autonomy, smooth temperature variations, and improve transport resilience for perishables [5].
Beyond reviews, recent experimental-numerical research on micro/portable cold rooms shows that, with sufficient care in PCM selection, placement, and encapsulation, significant gains in thermal performance can be obtained under realistic operating constraints [6]. Cumulatively, the evidence base motivates a shift toward PCM-enabled cold storage as a viable, scalable technology for minimizing PHLs while containing energy and infrastructure costs, especially in settings where grid reliability and logistics remain restrictive. This rationale is directly relevant to the current work, which aims to provide accurate and rapid prediction of charge/discharge thermal behavior as a precondition for design optimization and real-time control [5,7].
In terms of sustainability, improving the energy efficiency of post-harvest cold storage directly supports several United Nations Sustainable Development Goals, such as SDG 2 (Zero Hunger), SDG 7 (Affordable and Clean Energy), and SDG 12 (Responsible Consumption and Production). PCM-based cold storage systems allow passive temperature control, peak-load shifting, and reduced reliance on continuous mechanical cooling, thereby lowering operational energy requirements and the attendant emissions [6]. Nevertheless, scaling up sustainable deployment requires predictive tools that can optimize PCM configurations and operational strategies without prohibitive computational or experimental costs [8]. The current research meets this need by proposing a data-driven machine learning framework that accelerates thermal-performance prediction and thereby supports the sustainable design of cold-chain infrastructure.
Although experimental studies and computational fluid dynamics (CFD) simulations are the gold standard for characterizing PCM thermal behavior, their application is usually limited by large resource requirements. Experimental campaigns are time-intensive and expensive, requiring accurate instrumentation, extensive calibration, and trials under controlled ambient conditions, especially when many layouts, charging/discharging cycles, or design variations must be evaluated. On the simulation side, CFD run times can extend to hours or even days for a single scenario, hampering iterative design exploration [8].
Moreover, high-fidelity CFD executions limit the rate at which design parameter spaces can be explored, restricting sensitivity analysis, optimization, and real-time adaptation. This issue has inspired research on surrogate and reduced-order modeling. For instance, Bacellar et al. showed that even a spatially reduced-domain CFD model could replicate full-domain behavior with a 4–5 order-of-magnitude acceleration in runtime, thereby facilitating rapid prototyping of PCM systems [9]. A critical gap therefore remains: experimental and CFD techniques, although highly accurate, are unsuitable for rapid prediction, real-time control, or design sweeps in PCM cold storage systems. Filling this gap requires a complementary modeling paradigm that balances fidelity with speed, motivating the use of machine learning as an efficient predictive surrogate.
The objective of the current paper is to design and demonstrate machine learning models that can effectively predict the charging and discharging temperature characteristics of cold storage systems using phase-change materials (PCMs). Although previous research has confirmed the applicability of machine learning as a surrogate for CFD in thermal systems, most existing work is either simulation-based, addresses steady-state behavior, or focuses on generic thermal components such as heat exchangers and batteries. This work assesses the ability of several machine learning algorithms to characterize the nonlinear thermal behavior associated with PCM phase transitions, using experimental datasets gathered across varying storage layouts and operating conditions. Specifically, ensemble techniques (Random Forest and XGBoost), kernel-based regression (SVR), and instance-based learning (KNN) are compared experimentally with respect to their capability to reproduce layout-specific temperature patterns during charging and discharging. The primary innovation of this study lies in developing an experimental-data-driven machine learning surrogate capable of accurately predicting PCM charging and discharging behavior, thereby replacing computationally expensive CFD simulations and enabling real-time, layout-specific thermal predictions for portable cold storage systems. Beyond their setup and modeling effort, CFD simulations of PCM charging and discharging are also costly at run time, whereas machine learning (ML) surrogate models are extremely cheap to evaluate once trained. In the current work, the model inference time for predicting an entire charging or discharging temperature profile was on the order of milliseconds, permitting near real-time prediction.
This is several orders of magnitude faster than running CFD-based simulations. ML-based surrogates are therefore well suited to rapid layout screening, sensitivity analysis, and control-oriented problems that demand repeated prediction. The main value of this work is the creation of a hybrid approach that directly combines experimental data with machine learning models. In contrast to traditional research, where experiments or computational fluid dynamics (CFD) are the only tools used, this approach delivers fast surrogate predictions while retaining physical relevance, since the models are grounded in actual measurement data.

2. Literature Review

Calati et al. [10] conducted a thorough review of PCM-based latent thermal energy storage (LTES) for refrigerated transport and distribution, covering materials, encapsulation, system topologies, and performance trade-offs, and identified finning and heat-exchange designs as the key bottlenecks for charge/discharge rates. Zhang et al. [11] reviewed low-temperature PCMs (−100 to 30 °C), material classes, and enhancement methods (nanoparticles, microencapsulation, and shape stabilization) with direct application to cold storage and CTES device design. Maiorino et al. [12] conducted a state-of-the-art survey of refrigerated transportation, mapping the potential of PCM-based systems to minimize fuel/energy consumption and emissions and to address integration with routing and operational constraints. Riahi et al. [13] experimentally evaluated a vapor-compression system with PCM storage, demonstrating cooling-capacity/COP advantages and showing that PCM add-ons can be integrated into conventional refrigeration loops. Harun-Or-Rashid et al. [14] retrofitted a household refrigerator with PCM at the evaporator/condenser; the findings indicate energy savings and improved thermal regulation, providing a guideline for appliance-level cold storage integration. He et al. [15] conducted experimental research on a direct-contact cold TES, examining melting dynamics and heat-transfer constraints; their research offers insight into charge/discharge management for cold rooms and boxes. Kareem et al.’s [16] experiments on air-multiple-PCM heat exchangers show that an experimentally validated heat exchanger stabilizes supply-air temperatures and that multi-PCM staging can deliver stable cold output under variable loads. Wu et al. [17] conducted a model-validated study of PCM cold plates with non-uniform fins, quantifying how fin layouts alter cold storage/release rates and deriving design rules for plate-based cold chain units. Li et al. [18] offered an overview of phase-change cold-storage materials and novel HVAC/cold-storage systems, combining material selection, mitigation of supercooling/phase separation, and system-configuration design for air-conditioning and cold rooms. Experiments on a portable cold-storage box by Hang et al. [19], varying PCM package numbers and layouts under a dynamic ambient profile, demonstrate that package placement strongly influences cooling time and temperature uniformity, which is directly applicable to last-mile cold chains.
Efatinasab et al. [20] trained machine/deep-learning models for a micro-finned tube heat exchanger, jointly forecasting the heat-transfer coefficient and frictional pressure drop, and demonstrated that ML can screen HX geometries rapidly. Suzuki et al. [21] developed ML surrogate models to substitute for CFD in the development of microchannel thermal devices, accurately predicting heat transfer and pressure loss to enable rapid multi-objective optimization. Zhao et al. [22] surveyed physics-informed neural networks (PINNs) for classical heat-transfer problems, covering formulations, boundary conditions, and the challenges of real-time thermo-fluid prediction. Chen et al. [23] surveyed ML techniques for building energy systems, including preprocessing, model selection, and uncertainty assessment for load forecasting, control, and diagnostics. Ayoola et al. [24] optimized air-to-water heat pumps using collected field data, tuning operating parameters with ML and improving overall seasonal performance under real-home variability. Comparing ANN and LSTM for predicting VRF HVAC power, Hsu et al. [25] found that LSTM outperforms ANN in capturing system dynamics during heat-recovery mode. D’Aquilio et al. [26] designed an ML-based surrogate for CFD room airflow/thermal fields (OpenFOAM) that predicts orders of magnitude faster, supporting control-oriented simulations. Qian et al. [27] proposed data-driven thermal prediction for Li-ion batteries and investigated an ML-based hybrid BTMS (PCM + air) to predict temperature changes and system performance. Tian et al. [28] introduced a physics-informed ML model with transfer learning to predict the real-time thermal behavior of complex components, halving training time while keeping simulation/experiment errors below 14%. Li et al. [29] reviewed ML applications in building energy systems (fault detection, energy prediction, scheduling, and control), highlighting supervised models (SVM/RF) and open gaps in datasets and generalization.
In this respect, the current review addresses PCM cold storage and machine-learning-based thermal prediction frameworks applicable to surrogate modeling, offering a comprehensive overview of the machine learning algorithms that have already appeared in the literature. PCM cold storage research has progressed through experimental and CFD studies, yet these methods are resource-intensive and do not offer rapid prediction or real-time capability. Parallel advances in machine learning have shown great potential in thermal energy applications such as batteries, heat exchangers, and HVAC, but its use in PCM cold storage remains less common, and very few attempts have been made to train predictive models directly on experimental PCM charging and discharging data. This gap is addressed by the current work, which proposes a hybrid framework combining experimental data with ML algorithms to provide accurate, fast, and layout-specific temperature predictions for PCM systems.

3. Methodology

3.1. Dataset Preparation

A surrogate modeling method based on machine learning consumes much less computational energy than repeated CFD simulations and thus represents a sustainable and resource-efficient means of engineering analysis. The machine learning modeling in the current study was based on experimental research carried out on a PCM-integrated portable cold storage system. This approach differs from previous ML-based PCM studies in that it uses an experimentally validated ML surrogate that preserves latent-heat effects, runs at orders-of-magnitude lower cost, and can be used during active design or real-time operation of cold stores. The experimental program produced temperature–time profiles for both charging and discharging cycles with three different layout configurations. Because experimental validation, numerical modeling, and machine learning implementation all drew on the same dataset, a common framework was preserved across the analytical methods, guaranteeing a fair comparison of outcomes and the validity of the results. The charging cycle started at ambient temperature (around 29 °C) and cooled the PCM to sub-zero temperatures (as low as −10 °C). Measurements were taken at 10 min intervals for three layouts, named Layout 1, Layout 2, and Layout 3. During the discharging cycle, the PCM was permitted to release the stored thermal energy, and the corresponding temperature rises were measured under the same layout conditions. The entire dataset therefore gave an overall picture of PCM thermal performance in both modes of operation. Temperature measurements were obtained using calibrated digital temperature sensors placed in close proximity to the PCM encapsulation; the measured temperatures therefore represent PCM-adjacent thermal behavior rather than bulk air temperatures. Sensor accuracy was ±0.5 °C, and calibration was verified prior to experimentation.
This measurement approach ensures that the recorded data reflect the actual thermal response of the PCM during phase changes. Measurement uncertainty was accounted for through sensor calibration and repeated experimental runs. Each charging and discharging cycle was repeated under identical conditions for all layouts, and consistent thermal trends were observed across repetitions, confirming the repeatability and reliability of the experimental dataset. The phase-change material (PCM) employed in the current experimental study is RT4, a commercially available organic PCM specifically designed for low-temperature cold storage at refrigerator conditions. RT4 was chosen for its narrow melting range centered at 4 °C, its high latent heat capacity, and the stability of its thermophysical properties over repeated charging and discharging cycles. Table 1 presents the thermophysical properties of the RT4 PCM; these were applied consistently in the experimental measurements, CFD simulations, and machine learning modeling.

3.1.1. Dataset Structure

The dataset was prepared with time, in minutes, as the independent variable and the layout-specific PCM temperatures as the dependent variables. The variables in the dataset, their types, units, and roles in the machine learning model are shown in Table 2.
This formulation cast the task as a multi-output regression problem in which the models learned to forecast the thermal response of the different PCM layouts concurrently.

3.1.2. Preprocessing Steps

Before implementing machine learning, the raw experimental data underwent several preprocessing steps to prepare it for training and evaluation:
  • Data Cleaning: Inconsistent column labels (e.g., °C, etc.) were harmonized, and duplicate or unrecorded entries were handled appropriately.
  • Categorical Encoding: The operation mode (charging vs. discharging) and the layout identifiers were encoded numerically for use in the ML algorithms.
  • Scaling and Normalization: Continuous variables, such as time and temperature, were normalized to avoid bias from the differing ranges of the input variables.
  • Temporal Consideration: Since the data form a time series, the training–test division was performed chronologically so that future values could not leak into the past.
The models were trained using a multi-output regression strategy, where the temperatures of Layout 1, Layout 2, and Layout 3 are predicted simultaneously within a single model. This approach enables the model to learn inter-layout thermal correlations while preserving layout-specific dynamics.
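The encoding and normalization steps above can be sketched as follows; the record values, variable names, and the min-max scaling choice are illustrative assumptions, not the study's exact pipeline:

```python
import numpy as np

# Toy raw records (values and names are assumptions for illustration):
# operation mode and measurement time for four samples.
modes = np.array(["charging", "charging", "discharging", "discharging"])
times = np.array([0.0, 10.0, 0.0, 10.0])   # minutes

# Categorical encoding: map the operation mode to an integer feature.
mode_codes = {"charging": 0, "discharging": 1}
mode_encoded = np.array([mode_codes[m] for m in modes])

# Min-max normalization of the continuous time feature to [0, 1],
# preventing bias from differing input ranges.
t_min, t_max = times.min(), times.max()
times_scaled = (times - t_min) / (t_max - t_min)

# Assemble the model-ready feature matrix.
X = np.column_stack([times_scaled, mode_encoded])
```

The same pattern extends to the layout identifier and any further categorical inputs.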

3.1.3. Train–Test Split

The dataset of 487 samples was divided into three subsets on a 70-15-15 basis. The Training Set (70%) was used to fit the models, the Validation Set (15%) for hyperparameter tuning and early stopping, and the Testing Set (15%) for final evaluation. This division ensured that the models were trained effectively and tested on unseen temporal segments, mirroring real-world forecasting. The selected hyperparameter values shown in Table 3 represent the optimal trade-off between predictive accuracy and generalization capability, as determined through grid-based tuning on the validation dataset. Ensemble-based models required deeper parameter optimization to manage nonlinear phase-change dynamics, whereas instance-based learning achieved competitive performance with fewer tuning parameters.
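A chronological 70-15-15 split of a 487-sample dataset can be sketched as below; the synthetic temperature profile and variable names are assumptions for illustration only:

```python
import numpy as np

# Synthetic stand-in for the 487-sample dataset: a column of times (min)
# and three layout temperature columns (values are illustrative).
rng = np.random.default_rng(0)
n_samples = 487
time_min = np.arange(n_samples, dtype=float) * 10.0   # 10 min interval
temps = 29.0 - 0.05 * time_min[:, None] + rng.normal(0.0, 0.3, (n_samples, 3))

# Chronological 70-15-15 split: no shuffling, so future samples never
# leak into training (matching the temporal consideration above).
n_train = int(0.70 * n_samples)
n_val = int(0.15 * n_samples)
X_train, y_train = time_min[:n_train], temps[:n_train]
X_val, y_val = time_min[n_train:n_train + n_val], temps[n_train:n_train + n_val]
X_test, y_test = time_min[n_train + n_val:], temps[n_train + n_val:]
```

All validation and test timestamps lie strictly after the training window, which is what prevents temporal leakage.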

3.1.4. Dataset Illustration

Figure 1 below gives the charging and discharging temperature–time profiles of the three layouts to provide a visual representation. The graphs show the overall cooling and heating patterns that the ML models were trained to emulate.

3.1.5. Consistency Across Approaches

The current study is methodologically consistent in that the same data were used for experimental validation, CFD-based numerical simulations, and machine learning predictions. This correspondence ensured that comparisons among the three approaches reflected the capabilities and limitations of each method rather than differences in the data. In this respect, the dataset served as a universal benchmark that strengthened the validity of the findings.

3.2. Model Implementation

Supervised machine learning regression models were used to predict the charging and discharging temperatures of the PCM-based cold storage system. Given the nonlinear thermal dynamics of PCMs, especially during phase transition, algorithms were needed that could capture complex patterns, generalize across experimental layouts, and produce reliable multi-output predictions. The study uses four machine learning algorithms: Random Forest (RF), Extreme Gradient Boosting (XGBoost), Support Vector Regression (SVR), and K-Nearest Neighbors (KNN). These algorithms represent distinct modeling philosophies: ensemble learning (Random Forest and XGBoost), kernel-based regression (SVR), and instance-based learning (KNN). This choice allows a balanced comparison among tree-based, distance-based, and kernel-based predictors, spanning a wide range of complexity, interpretability, and capacity to learn nonlinear models. The forecasting of PCM charging and discharging temperatures was formulated as a supervised regression problem. Suppose the dataset is represented as

$$\mathcal{D} = \{(x_i, y_i)\}_{i=1}^{N}$$

where $x_i \in \mathbb{R}^n$ represents the input features (e.g., time, mode, and layout), $y_i \in \mathbb{R}^m$ denotes the target outputs (layout-specific temperatures), and $N$ is the number of observations. The objective of the models is to learn a function $f: \mathbb{R}^n \to \mathbb{R}^m$ that minimizes the prediction error between $\hat{y}_i = f(x_i)$ and the actual target $y_i$.
  • Random Forest (RF)
Random Forest is an ensemble method built from decision trees, in which weak learners (trees) are combined to create a strong predictor. Bootstrap sampling and random feature selection during node splitting make it less prone to overfitting and enhance generalization. RF excels at identifying nonlinearity in data and is resistant to noise, making it a good choice for modeling the intrinsically nonlinear charging and discharging curves of PCMs. Random Forest makes predictions by aggregating multiple decision trees trained on bootstrap samples; the regression prediction is

$$\hat{y}(x) = \frac{1}{T}\sum_{t=1}^{T} h_t(x)$$

where $T$ is the number of trees and $h_t(x)$ is the prediction of the $t$-th decision tree. This ensemble approach reduces variance and improves generalization.
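As a minimal illustration of this averaging over bootstrap-trained trees, scikit-learn's RandomForestRegressor (which natively supports multi-output targets) can be fit to a synthetic three-layout cooling curve; the data and parameter values here are assumptions, not the study's configuration:

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import r2_score

# Synthetic multi-output task: predict three layout temperatures from
# time (values are illustrative, not the experimental dataset).
rng = np.random.default_rng(1)
t = np.linspace(0.0, 300.0, 120)
X = t.reshape(-1, 1)
y = np.column_stack([29.0 - 0.1 * t + rng.normal(0.0, 0.2, t.size)
                     for _ in range(3)])      # Layouts 1-3

# RandomForestRegressor averages T bootstrap-trained trees and handles
# the multi-output target matrix directly.
rf = RandomForestRegressor(n_estimators=100, random_state=0).fit(X, y)
y_hat = rf.predict(X)
fit_r2 = r2_score(y, y_hat)
```

The multi-output support is what lets a single forest predict all three layout temperatures at once, as in the multi-output formulation above.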
  • Extreme Gradient Boosting (XGBoost)
XGBoost is a fast, high-performance gradient boosting model. Whereas Random Forest builds trees in parallel, XGBoost builds them sequentially, with each new tree correcting the mistakes of the previous ones. The algorithm has built-in regularization that helps avoid overfitting and enables effective learning of complicated patterns. XGBoost was chosen for its high prediction accuracy and its effectiveness on regression tasks with multivariate nonlinear dependencies. XGBoost builds trees one after another, each rectifying the residuals of the preceding trees. The prediction after $K$ trees is

$$\hat{y}_i = \sum_{k=1}^{K} f_k(x_i), \qquad f_k \in \mathcal{F}$$

where $\mathcal{F}$ represents the space of regression trees. The objective function combines the loss function $L$ with a regularization term $\Omega$:

$$\text{Obj} = \sum_{i=1}^{N} L(y_i, \hat{y}_i) + \sum_{k=1}^{K} \Omega(f_k)$$
This ensures accuracy while avoiding overfitting.
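A sequential-boosting sketch is shown below using scikit-learn's GradientBoostingRegressor as a stand-in for XGBoost (the xgboost library's XGBRegressor exposes a similar interface); the synthetic curve and hyperparameter values are illustrative assumptions:

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor

# Synthetic single-layout charging curve (illustrative values).
rng = np.random.default_rng(2)
t = np.linspace(0.0, 300.0, 150).reshape(-1, 1)
y = 29.0 - 0.1 * t.ravel() + rng.normal(0.0, 0.2, t.shape[0])

# Trees are added one after another; each fits the residuals of the
# current ensemble (gradient boosting), mirroring the K-tree sum above.
gbr = GradientBoostingRegressor(
    n_estimators=200,     # K sequential trees
    learning_rate=0.05,   # shrinkage applied to each residual fit
    max_depth=3,          # limits individual tree complexity
).fit(t, y)
y_hat = gbr.predict(t)
fit_mae = np.mean(np.abs(y_hat - y))
```

The learning rate and tree depth act as the regularization knobs that keep the sequential fit from chasing noise.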
  • Support Vector Regression (SVR)
SVR is the regression extension of Support Vector Machines (SVMs). It seeks the best fit within a given tolerance range, called the ε-insensitive tube, and penalizes only predictions that fall outside this range. Through kernel functions (linear, polynomial, and radial basis function), SVR can learn complex nonlinear associations in continuous data. Its strength lies in regression with limited data, enabling it to capture the continuous yet nonlinear evolution of PCM temperature. SVR seeks a function $f(x) = w^T\phi(x) + b$ that approximates the data within an error tolerance $\epsilon$. The optimization problem is formulated as

$$\min_{w, b, \xi, \xi^*} \; \frac{1}{2}\|w\|^2 + C\sum_{i=1}^{N}(\xi_i + \xi_i^*)$$

subject to

$$y_i - w^T\phi(x_i) - b \le \epsilon + \xi_i, \qquad w^T\phi(x_i) + b - y_i \le \epsilon + \xi_i^*$$

where $C$ is the penalty parameter and $\phi(x)$ maps inputs into a higher-dimensional kernel space.
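A minimal RBF-kernel SVR sketch on a synthetic discharging-type curve is given below; feature standardization is included because SVR is scale-sensitive, and all data and parameter values are illustrative assumptions:

```python
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVR

# Synthetic discharging-type warming curve (illustrative values).
rng = np.random.default_rng(3)
t = np.linspace(0.0, 300.0, 150).reshape(-1, 1)
y = -10.0 + 0.08 * t.ravel() + rng.normal(0.0, 0.2, t.shape[0])

# RBF-kernel SVR: only residuals outside the epsilon-insensitive tube
# contribute to the loss; inputs are standardized first since SVR is
# sensitive to feature scale.
svr = make_pipeline(
    StandardScaler(),
    SVR(kernel="rbf", C=10.0, epsilon=0.1),
).fit(t, y)
y_hat = svr.predict(t)
fit_mae = np.mean(np.abs(y_hat - y))
```

Here C plays the penalty role from the optimization above, and epsilon sets the tube half-width.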
  • K-Nearest Neighbors (KNNs)
KNN is a non-parametric model that predicts the output for a given input as the average over its K nearest neighbors in feature space. The similarity principle ensures that data points with similar characteristics yield similar results. Despite its simple design, KNN offers a valuable regression baseline and is highly interpretable, making it a useful comparator for more advanced models. In this work, KNN was used to assess the effectiveness of local similarity measures for estimating PCM temperature responses. KNN regression computes the average output of the K nearest neighbors to obtain a prediction of the target:

$$\hat{y}(x) = \frac{1}{K}\sum_{i \in N_K(x)} y_i$$

where $N_K(x)$ denotes the set of $K$ nearest neighbors of input $x$. A weighted variant assigns higher weights to closer neighbors.
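The unweighted KNN prediction formula above can be implemented directly in a few lines; the toy data and the helper name knn_predict are assumptions for illustration:

```python
import numpy as np

def knn_predict(x_query, X_train, y_train, k=5):
    """Unweighted KNN regression: average the targets of the k nearest
    training points (Euclidean distance), per the formula above."""
    dists = np.linalg.norm(X_train - x_query, axis=1)
    nearest = np.argsort(dists)[:k]          # indices of N_k(x)
    return y_train[nearest].mean(axis=0)     # (1/k) * sum over neighbors

# Toy data: a 1-D time feature and three layout temperatures per sample.
X_train = np.arange(10, dtype=float).reshape(-1, 1)
y_train = np.column_stack([X_train.ravel() * c for c in (1.0, 2.0, 3.0)])

# Query midway between samples 4 and 5: the 2-NN average interpolates.
pred = knn_predict(np.array([4.5]), X_train, y_train, k=2)
```

Replacing the plain mean with a distance-weighted mean yields the weighted variant mentioned above.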
This study did not employ deep learning models, such as LSTM or CNN architectures, since the dataset size was moderate and there were no spatial or long-horizon temporal dependencies to warrant their complexity. The selected models thus offer the best compromise among predictive accuracy, computational cost, and interpretability for experimentally driven PCM thermal prediction.

3.2.1. Training Process

The four algorithms were trained with the experimental data presented in the section above. The training processes particular to each model are summarized below:
  • Random Forest (RF): Bootstrap sampling (sampling with replacement) was used to form several training subsets. Individual trees were trained independently, and their outputs were averaged to produce predictions.
  • XGBoost: A boosting scheme was used, with the trees constructed in sequence. Each tree reduced the loss of the preceding model by minimizing the loss function via gradient descent.
  • SVR: This was trained with the ε-insensitive loss function, which ignores small errors and penalizes larger ones. Kernel transformations (RBF) were applied to capture nonlinear patterns.
  • KNNs: A Euclidean distance metric was used to find the K nearest neighbors of each input. Predictions were computed as the average temperature of these neighbors, with nearer neighbors carrying greater weight in the weighted variant.
All the models were trained on a 70-15-15 split (training, validation, and testing) with the temporal ordering taken into consideration to prevent data leakage, as explained in Section 3.1.3.

3.2.2. Evaluation Metrics

In order to assess the predictive power of the models in detail, a collection of commonly used error and accuracy indicators was used:
  • Mean Squared Error (MSE): The average of the squared differences between the actual and predicted values; it penalizes large errors heavily.
  • Mean Absolute Error (MAE): The average absolute difference between the predicted and actual values, which is interpretable in the same units as the target variable.
  • Coefficient of Determination (R2): The proportion of variance in the dependent variable that is explained by the model. A higher R2 indicates superior predictive power.
  • Mean Absolute Percentage Error (MAPE): The prediction error expressed as a percentage, which is useful for assessing relative performance.
  • Accuracy (%): Computed as 100 − MAPE, this offers an intuitive measure of model correctness.
These measures enabled a quantitative comparison of the models, so that the most effective algorithm could be identified not only in terms of error reduction but also with respect to realistic predictive performance.
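As a concrete illustration, the five indicators can be computed directly with NumPy (a sketch; the function name `regression_metrics` is ours, not from the study, and the MAPE formula assumes non-zero targets):

```python
import numpy as np

def regression_metrics(y_true, y_pred):
    """Return the five indicators defined above; MAPE and Accuracy in percent."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    err = y_true - y_pred
    mse = float(np.mean(err ** 2))
    mae = float(np.mean(np.abs(err)))
    r2 = 1.0 - float(np.sum(err ** 2) / np.sum((y_true - y_true.mean()) ** 2))
    mape = 100.0 * float(np.mean(np.abs(err / y_true)))  # assumes no zero targets
    return {"MSE": mse, "MAE": mae, "R2": r2, "MAPE": mape,
            "Accuracy": 100.0 - mape}                    # Accuracy = 100 - MAPE

m = regression_metrics([10.0, 8.0, 5.0, 4.0], [9.5, 8.5, 5.0, 4.5])
print(m)  # MSE = 0.1875, MAE = 0.375 for this toy example
```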
Figure 2 shows the general ML procedure adopted in this research. The workflow comprised five steps:
  • Dataset Acquisition: PCM charging/discharging cycle data.
  • Data Cleaning, Preprocessing, Scaling, and Feature Engineering.
  • Model Selection and Training: RF, XGBoost, SVR, and KNNs were implemented.
  • Model Validation: Training/validation split and K-fold cross-validation.
  • Comparison and Prediction: Predicted PCM temperature profiles were compared with experimental and numerical data.
The optimal hyperparameters for each machine learning model were determined using a grid-based tuning approach applied to the validation dataset. Model performance was evaluated using MAE and R2 as objective criteria. Parameters such as the number of trees and maximum depth for Random Forest, the learning rate and tree depth for XGBoost, the kernel parameters and penalty factor for SVR, and the neighborhood size for KNNs were systematically optimized to ensure balanced accuracy and generalization. In addition to the temporal train–validation–test split, k-fold cross-validation (k = 5) was performed on the training set to verify robustness and reduce variance in performance estimates.
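The grid tuning with 5-fold cross-validation can be sketched with scikit-learn's `GridSearchCV`; the data are synthetic stand-ins and the grid is deliberately reduced, but the scoring objective (MAE) and k = 5 match the procedure described above:

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import GridSearchCV, KFold

# Synthetic stand-in for the experimental training set (time -> PCM temperature).
rng = np.random.default_rng(1)
X = rng.uniform(0, 300, size=(200, 1))
y = 25.0 * np.exp(-X[:, 0] / 90.0) + 4.0 + rng.normal(0, 0.2, 200)

# Reduced grid over two of the Random Forest parameters listed in Table 3.
param_grid = {"n_estimators": [100, 200], "max_depth": [6, 12]}
search = GridSearchCV(
    RandomForestRegressor(random_state=0),
    param_grid,
    scoring="neg_mean_absolute_error",                   # MAE as objective
    cv=KFold(n_splits=5, shuffle=True, random_state=0),  # k = 5 folds
)
search.fit(X, y)
print("best parameters:", search.best_params_)
print("best CV MAE (degC):", round(-search.best_score_, 3))
```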

3.2.3. Computational Cost and Considerations

Predictive accuracy is crucial, but computational efficiency matters equally for practical deployment and scaling. The machine learning models used in the current system were trained offline on experimentally obtained datasets. The computational capacity required for training was moderate and far less demanding than running repeated CFD simulations across multiple layouts and operating cycles. More importantly, once deployed, the trained ML models perform only forward inference to yield predictions. This inference step consists of basic arithmetic operations and decision-tree traversals (for the ensemble models) or distance computations (for KNNs), resulting in runtimes of milliseconds per prediction on an off-the-shelf CPU. Conversely, CFD prediction of PCM charging or discharging requires solving coupled time-dependent energy and momentum equations over thousands of time steps and is unsuitable for real-time or iterative design. The proposed ML framework therefore offers a favorable trade-off between precision and computational cost, making it a good surrogate for CFD in design optimization and operational prognostics.
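The millisecond-scale inference claim can be checked with a simple timing sketch (synthetic training data; the measured latency is machine-dependent and illustrative only):

```python
import time
import numpy as np
from sklearn.ensemble import RandomForestRegressor

# Offline training phase on synthetic stand-in data.
rng = np.random.default_rng(2)
X = rng.uniform(0, 300, size=(500, 1))
y = 25.0 * np.exp(-X[:, 0] / 90.0) + 4.0
model = RandomForestRegressor(n_estimators=200, random_state=0).fit(X, y)

# Deployed inference phase: time a single forward prediction.
x_query = np.array([[120.0]])            # time query (min)
model.predict(x_query)                   # warm-up call
start = time.perf_counter()
temperature = model.predict(x_query)[0]
elapsed_ms = (time.perf_counter() - start) * 1000.0
print(f"predicted {temperature:.2f} degC in {elapsed_ms:.2f} ms")
```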

4. Results and Discussion

4.1. Charging Case

The machine learning models were tested on the charging cycle of the PCM-based cold storage system with the aim of forecasting PCM temperatures for Layout 1, Layout 2, and Layout 3. The predictive accuracy of the models was assessed using five statistical indices: mean squared error (MSE), mean absolute error (MAE), coefficient of determination (R2), mean absolute percentage error (MAPE), and accuracy. The overall findings are summarized in Table 4, which indicates the relative merits and demerits of the models across the three layouts.
Random Forest performed best overall for Layout 1, with an R2 of 0.9855 and the lowest MSE and MAE among the models. Figure 3a demonstrates this, with the actual vs. predicted values lying close to the 45° reference line and the residuals clustered near zero, indicating a strong model. XGBoost also performed favorably, with an R2 of 0.956 and lower relative errors (MAPE = 17.86%) than Random Forest, although slight deviation was noted at high temperatures, as shown in Figure 3b. SVR, conversely, had difficulties with extreme values, its predictions flattening near 10 °C, which resulted in a reduced R2 of 0.894 and increased error levels, as shown in Figure 3c. KNNs yielded a satisfactory fit to the experimental values (R2 = 0.978), though their residuals showed sensitivity to temperature extremes, as indicated in Figure 3d.
In Layout 2, the models improved further in performance, with Random Forest outperforming the rest. It achieved the highest R2 of 0.988, the lowest MAE of 0.503 °C, and the best overall accuracy of 89.27%, making it the most reliable. Figure 4a confirms this trend, with residuals mostly within ±2 °C. XGBoost followed closely, with R2 = 0.978, but its error distribution widened at high predicted values, as observed in Figure 4b. SVR again underperformed, with residuals of up to +15 °C and an R2 of 0.891 (Figure 4c), while KNNs continued to give consistent predictions, with an R2 of 0.972 and an accuracy of close to 87% (Figure 4d). These findings indicate that Layout 2 was the most predictable layout: it exhibits a smaller thermodynamic range, characterized by reduced temperature gradients and smoother phase-transition behavior, resulting in fewer abrupt nonlinearities. This simplifies the learning space for the machine learning models and enhances generalization accuracy compared to layouts with stronger latent-heat-induced fluctuations.
The Layout 3 outcomes mirrored the earlier findings, although the models showed slightly more variability than in Layout 2. Random Forest was the best, with R2 = 0.988 and generally well-distributed residuals, although the scatter increased at higher temperatures (Figure 5a). XGBoost again achieved a good R2 = 0.978 but showed some underestimation at extreme values, as presented in Figure 5b. SVR again struggled, flattening its predictions at larger actual values, while KNNs provided balanced predictions, with R2 = 0.970 and accuracy = 83.26% (Figure 5c and Figure 5d, respectively). Prediction accuracy was slightly lower for Layout 3, but the ensemble techniques maintained their superiority over the kernel-based and instance-based models.
Overall, the findings of the charging case confirm Random Forest as the strongest and most stable predictor across all layouts. Layout 2 showed notably good performance, emerging as the most predictable layout because its smooth thermal transitions could be captured easily by the models. XGBoost was also reliable and competitive in accuracy but slightly weaker at the extremes than Random Forest. The KNN algorithm performed well within the central temperature range but was less stable at the higher end, whereas SVR lagged well behind, owing to its inability to capture the nonlinearities of the phase-change processes involved. The numerical data in Table 4, together with the visual confirmation from the actual vs. predicted plots and residual distributions, establish Random Forest as the most trustworthy model for predicting PCM charging, with XGBoost close behind and Layout 2 the most predictable configuration.

4.2. Discharging Case

The same four machine learning algorithms, Random Forest, XGBoost, SVR, and KNNs, were also used to analyze the discharging phase of the PCM cold storage system. This stage is defined by the slow release of stored thermal energy as the PCM returns to equilibrium with the surrounding environment after operating at sub-zero temperatures. Such behavior must be modeled accurately in order to evaluate the thermal stability of the system under storage conditions. The performance for every layout is summarized in Table 5, while Figures 6–8 compare experimental readings and ML predictions for Layouts 1 to 3.
The predictions of the four models were compared against the experimental discharging data. The results are shown in Figure 6, Figure 7 and Figure 8, which compare the predicted and actual temperature responses of each layout over the discharging cycle and visually demonstrate the predictive ability of every model. For Layout 1 (Figure 6), both the Random Forest and KNN models demonstrated an almost perfect fit to the experimental data, indicating that both methods were very effective in describing the discharging dynamics. The smooth agreement between predictions and measurements indicates how well these models generalize in this stage. The XGBoost predictions also closely follow the curve, but a slight deviation is observed in the central temperature range, indicating mild underestimation there. SVR obtained a reasonable fit with noticeable flattening in some regions, meaning it was less adaptable to rapid transitions in the temperature recovery curve.
The plots for Layout 2 (Figure 7) reveal that all four models captured the overall discharging trend, yet with varying predictive consistency. The Random Forest predictions were very close to the experimental values across the entire discharge range, adding credence to its strength. KNNs once again fared well, with the predicted points almost coinciding with the actual curve in most regions. XGBoost also provided reasonable results, but its forecasts drifted slightly during periods when the temperature was changing at a higher rate. The profile forecasted by SVR broadly fitted the experimental curve, but small differences accumulated, indicating its lack of flexibility where the discharge curve was less linear.
The Layout 3 plots (Figure 8) support the conclusion that Random Forest offered the closest match to the experimental evidence, its predictions being barely distinguishable from the actual curve. KNNs also fitted excellently, although at some higher temperature ranges the predictions deviated slightly. Overall, XGBoost performed well but tended to underfit towards the high end of the discharging curve. Although the SVR predictions were more or less consistent with the overall trend, more pronounced discrepancies were observed in comparison with the other models, and some flattening effects were also noticeable.
Taken together, the predicted vs. actual plots for all three layouts indicate that the ensemble methods, especially Random Forest, consistently provided the best fit to the experimental data. KNNs also demonstrated admirable performance, particularly in Layouts 1 and 2, affirming their effectiveness in capturing the smoother discharging trends. The XGBoost forecasts were less precise at the extremes but generally accurate, whereas SVR was the least reliable, as indicated by the greater variance between its predicted curves and the actual experimental curves. The visual evidence therefore highlights that the most reliable models in the discharging case were Random Forest and KNNs, with the former prevailing across the layouts. These findings indicate that the novelty of the suggested framework lies not only in its numerical prediction accuracy but also in its capability to reproduce physically consistent PCM thermal behavior with experimentally trained models that require negligible inference times, which cannot otherwise be achieved with standard CFD-based or simulation-trained machine learning methods.
Comparing the predictive performance of the charging and discharging stages shows that the models have complementary strengths depending on the thermal regime. During charging, where cooling was more nonlinear owing to latent-heat effects, ensemble-based models such as Random Forest consistently performed better, especially on Layout 2, whereas XGBoost was somewhat less accurate but more consistent at the extremes. The discharging stage, where the recovery trends became smoother, showed KNNs emerging as one of the strongest competitors to Random Forest, achieving excellent accuracy for Layouts 1 and 2. In both cases, SVR was the least reliable, struggling with nonlinear variations during charging and exhibiting high relative errors during discharging. Taken together, the analysis shows that the Random Forest predictions are the most powerful and generalizable in both modes, while KNNs can provide outstanding localized accuracy under discharging conditions, making hybrid model strategies viable for PCM-based cold storage systems.
Along with overall prediction accuracy, the physical consistency of the machine learning predictions can be inferred from the point distributions in the actual vs. predicted plots. The dense concentration of data points around the 45° reference line near the phase-change temperature of the PCM indicates that the models correctly reproduce the temperature stagnation associated with latent-heat effects. At the phase transition, several experimental data points sit at a similar temperature despite different thermal states, and the predictions concentrate in this region rather than being artificially smoothed. This observation confirms that the models preserve the latent-heat plateau in the thermal response of PCMs, instead of imposing a purely linear temperature evolution. Smaller deviations at high temperatures correspond to sensible-heat regions beyond the phase-change interval and do not affect physical consistency.
In addition to predictive accuracy, the findings highlight the computational benefits of machine learning-based modeling. Although CFD simulations remain indispensable for physics-based insight and validation, they are too expensive for iterative or real-time applications. By contrast, the ML models considered in this paper provided precise forecasting with minimal inference time, enabling rapid assessment of various layouts and operating conditions. Such computational efficiency is especially important in portable and decentralized cold storage systems, which require rapid decision-making and responsive control.

5. Conclusions

This paper introduced a machine learning-based model for forecasting the charging and discharging thermal states of a PCM-integrated cold storage system, using experimental data from various layouts. The findings show that the ensemble techniques, mainly Random Forest, were very accurate in both the charging and discharging stages, with R2 values higher than 0.98 and low error rates. KNNs also proved a competitive choice in the discharging case, providing better localized performance on particular layouts, while XGBoost was reliable with somewhat weaker output. By comparison, SVR, although able to generalize trends, showed lower adaptability to nonlinear fluctuations and higher relative errors. Beyond predictive performance, the proposed machine learning framework offers a decisive benefit in computational efficiency. After training, the models can produce temperature predictions virtually in real time, in stark contrast to CFD simulations, which consume significant computational time and resources. This dramatic saving in inference time enables real-time prediction, rapid design iteration, and scalability across system configurations, lending credence to machine learning as a viable substitute for CFD in analyzing PCM-based cold storage. The main novelty of the work is that an experimentally validated, physically consistent machine learning surrogate can replace computationally intensive CFD simulations to rapidly predict the layout-specific thermal behavior of PCM-based cold storage systems.
The results demonstrate that ensemble-based models, particularly Random Forest, provide robust and generalizable predictions across both charging and discharging regimes, effectively capturing the nonlinear thermal dynamics associated with PCM phase transitions. Instance-based learning using KNNs showed competitive performance during the discharging phase, where smoother thermal gradients dominate, highlighting the influence of the underlying physical regime on model suitability. These findings reinforce the importance of aligning the selection of machine learning models with the thermodynamic characteristics of the operating process. Beyond predictive efficacy, this work supports sustainability by enabling the design and operation of energy-efficient PCM-based cold stores, the reduction of post-harvest food losses, and the minimization of refrigeration energy use. The proposed framework supports scalable and low-cost cold chain systems suited to decentralized agricultural settings, offering practical environmental, economic, and social sustainability benefits.
Future studies should build upon this work to include other PCM materials, geometrical settings, and climate conditions, and should examine deep learning and physics-informed methods to push generalizability further. A particularly promising opportunity is integration with real-time IoT sensor networks and control algorithms, which could enable predictive operation of PCM-based cold storage units in a variety of agricultural and industrial situations. On the whole, this paper demonstrates the transformative potential of machine learning in the rapid implementation of sustainable, energy-efficient cold chain solutions. Although the current research shows that purely data-driven models can forecast PCM charging and discharging behavior accurately and with physical consistency, future research can improve model robustness by integrating physical laws into the learning mechanism. Physics-informed neural networks (PINNs) are a promising solution, as they constrain training by incorporating energy-balance and phase-change considerations into the model, thus enhancing extrapolation beyond the experimental training data. Such hybrid physics-data-driven models would be less data-dependent and thermodynamically consistent. Combining PINNs with live sensor measurements could also enable adaptive, physics-constrained control of PCM-based cold storage apparatus.

Author Contributions

Conceptualization, C.S.; software, C.S.; validation, R.R.Y.; formal analysis, R.R.Y. and A.R.; investigation, R.R.Y.; resources, S.L.; writing—original draft, R.R.Y.; writing—review and editing, C.S. and S.L.; supervision, C.S., A.R. and S.L.; project administration, A.R.; funding acquisition, C.S. and A.R. All authors have read and agreed to the published version of the manuscript.

Funding

The authors thank AICTE for funding the project with sanction numbers F NO. M/954/2024-US-C1/1/57 and F NO. M/954/2024-US-C1/1/28 under the Tomato Grand Challenge scheme.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The original contributions presented in this study are included in the article. Further inquiries can be directed to the corresponding author.

Conflicts of Interest

The authors declare no conflicts of interest.

Figure 1. Temperature profile of PCM for Layouts 1, 2 and 3 for both charging and discharging cases. (a) Charging case; (b) discharging case.
Figure 2. Framework of machine learning implementation used in this study.
Figure 3. Comparison of experimental vs. ML predictions for Layout_1. (a) Random Forest, (b) XGBoost, (c) SVR, and (d) KNNs.
Figure 4. Comparison of experimental vs. ML predictions for Layout_2. (a) Random Forest, (b) XGBoost, (c) SVR, and (d) KNNs.
Figure 5. Comparison of experimental vs. ML predictions for Layout_3. (a) Random Forest, (b) XGBoost, (c) SVR, and (d) KNNs.
Figure 6. Comparison of experimental vs. ML predictions for Layout_1 (discharging). (a) Random Forest, (b) XGBoost, (c) SVR, and (d) KNNs.
Figure 7. Comparison of experimental vs. ML predictions for Layout_2 (discharging). (a) Random Forest, (b) XGBoost, (c) SVR, and (d) KNNs.
Figure 8. Comparison of experimental vs. ML predictions for Layout_3 (discharging). (a) Random Forest, (b) XGBoost, (c) SVR, and (d) KNNs.
Table 1. Properties of RT4 PCM materials.

Property | Unit | Value
Melting Temperature | °C | 4
Density (Solid) | kg/m³ | 880
Density (Liquid) | kg/m³ | 760
Latent Heat of Fusion | kJ/kg | 230
Specific Heat (Solid) | kJ/(kg·K) | 2.1
Specific Heat (Liquid) | kJ/(kg·K) | 2.3
Thermal Conductivity (Solid) | W/(m·K) | 0.2
Thermal Conductivity (Liquid) | W/(m·K) | 0.19
Table 2. Summary of dataset variables used in ML modeling.

Variable Name | Type | Unit | Role in Model
Time | Continuous | min | Independent variable
Layout_1_Temp | Continuous | °C | Dependent variable
Layout_2_Temp | Continuous | °C | Dependent variable
Layout_3_Temp | Continuous | °C | Dependent variable
Mode (Charge/Discharge) | Categorical | — | Categorical input
Table 3. Optimized hyperparameters used for machine learning models.

Model | Hyperparameter | Tuned Value
Random Forest (RF) | Number of trees (n_estimators) | 200
Random Forest (RF) | Maximum tree depth (max_depth) | 12
Random Forest (RF) | Minimum samples per split (min_samples_split) | 5
Random Forest (RF) | Minimum samples per leaf (min_samples_leaf) | 2
Random Forest (RF) | Feature selection (max_features) | sqrt
XGBoost | Number of trees (n_estimators) | 150
XGBoost | Learning rate (learning_rate) | 0.05
XGBoost | Maximum tree depth (max_depth) | 6
XGBoost | Subsample ratio (subsample) | 0.8
XGBoost | Column sampling (colsample_bytree) | 0.8
Support Vector Regression (SVR) | Kernel type | RBF
Support Vector Regression (SVR) | Regularization parameter (C) | 10
Support Vector Regression (SVR) | Kernel width (γ) | 0.1
Support Vector Regression (SVR) | Insensitive loss (ε) | 0.1
K-Nearest Neighbors (KNNs) | Number of neighbors (K) | 5
K-Nearest Neighbors (KNNs) | Distance metric | Euclidean
K-Nearest Neighbors (KNNs) | Weighting scheme | Distance-weighted
Table 4. Performance of machine learning models for charging case.

Target Variable | Model | MSE (Mean) | MAE (Mean) | R2 (Mean) | MAPE (Mean %) | Accuracy (Mean %)
Layout_1_Temp | Random Forest | 1.051905 | 0.582058 | 0.985517 | 24.190 | 75.810
Layout_1_Temp | XGBoost | 3.631906 | 1.129440 | 0.955929 | 17.860 | 82.140
Layout_1_Temp | SVR | 13.274185 | 1.419117 | 0.894465 | 18.038 | 81.962
Layout_1_Temp | KNN | 2.309098 | 0.681024 | 0.977861 | 22.136 | 77.864
Layout_2_Temp | Random Forest | 0.969371 | 0.503068 | 0.988251 | 10.734 | 89.266
Layout_2_Temp | XGBoost | 1.675723 | 0.745464 | 0.977619 | 16.845 | 83.155
Layout_2_Temp | SVR | 13.187575 | 1.394042 | 0.891301 | 15.930 | 84.070
Layout_2_Temp | KNN | 2.873345 | 0.720324 | 0.972480 | 12.480 | 87.520
Layout_3_Temp | Random Forest | 1.183204 | 0.473205 | 0.988136 | 20.445 | 79.555
Layout_3_Temp | XGBoost | 1.937054 | 0.735380 | 0.978071 | 22.139 | 77.861
Layout_3_Temp | SVR | 11.539059 | 1.257540 | 0.900454 | 19.895 | 80.105
Layout_3_Temp | KNN | 3.281776 | 0.697586 | 0.970102 | 16.745 | 83.255
Table 5. Performance of machine learning models for discharging case.

Target Variable | Model | MSE (Mean) | MAE (Mean) | R2 (Mean) | MAPE (Mean %) | Accuracy (Mean %)
Layout_1_Temp | Random Forest | 0.077382 | 0.215990 | 0.992488 | 10.788 | 89.212
Layout_1_Temp | XGBoost | 0.512570 | 0.511339 | 0.965034 | 18.955 | 81.045
Layout_1_Temp | SVR | 0.363858 | 0.418270 | 0.975505 | 15.143 | 84.857
Layout_1_Temp | KNNs | 0.134822 | 0.197855 | 0.992162 | 9.899 | 90.101
Layout_2_Temp | Random Forest | 0.130327 | 0.276456 | 0.984203 | 12.067 | 87.933
Layout_2_Temp | XGBoost | 0.331108 | 0.438961 | 0.975314 | 13.664 | 86.336
Layout_2_Temp | SVR | 0.104303 | 0.267790 | 0.990200 | 36.686 | 63.314
Layout_2_Temp | KNNs | 0.059894 | 0.125264 | 0.995810 | 16.444 | 83.556
Layout_3_Temp | Random Forest | 0.149542 | 0.220490 | 0.990504 | 8.050 | 91.950
Layout_3_Temp | XGBoost | 0.375157 | 0.407409 | 0.976732 | 15.619 | 84.381
Layout_3_Temp | SVR | 0.767699 | 0.564019 | 0.962312 | 24.438 | 75.562
Layout_3_Temp | KNNs | 0.434594 | 0.278795 | 0.981139 | 18.595 | 81.405

Share and Cite

MDPI and ACS Style

Yenare, R.R.; Sonawane, C.; Roy, A.; Landini, S. Experimental Data-Driven Machine Learning Analysis for Prediction of PCM Charging and Discharging Behavior in Portable Cold Storage Systems. Sustainability 2026, 18, 1467. https://doi.org/10.3390/su18031467
