Supervised Machine Learning-Based Prediction of Hydrogen Storage Classes Utilizing Dibenzyltoluene as an Organic Carrier

Ali, Ahsan; Khan, Muhammad Adnan; Choi, Hoimyung

doi:10.3390/molecules29061280


by Ahsan Ali ¹, Muhammad Adnan Khan ²,³,⁴ and Hoimyung Choi ¹,*

¹ Department of Mechanical Engineering, Gachon University, Seongnam 13120, Republic of Korea

² School of Computing, Skyline University College, University City Sharjah, Sharjah 1797, United Arab Emirates

³ Riphah School of Computing & Innovation, Faculty of Computing, Riphah International University Lahore Campus, Lahore 54000, Pakistan

⁴ Department of Software, Faculty of Artificial Intelligence and Software, Gachon University, Seongnam 13120, Republic of Korea

* Author to whom correspondence should be addressed.

Molecules 2024, 29(6), 1280; https://doi.org/10.3390/molecules29061280

Submission received: 2 February 2024 / Revised: 12 March 2024 / Accepted: 12 March 2024 / Published: 13 March 2024

(This article belongs to the Special Issue Synthesis and Applications of Materials in Green Chemistry)


Abstract

Dibenzyltoluene (H0-DBT), a Liquid Organic Hydrogen Carrier (LOHC), presents an attractive solution for hydrogen storage due to its enhanced safety and ability to store hydrogen in a concentrated liquid form. Machine learning can help predict hydrogen storage classes in H0-DBT across diverse experimental conditions. This study classifies hydrogen storage data into three classes (low, medium, and high) based on the hydrogen storage capacity values. We introduce the Hydrogen Storage Prediction with Support Vector Machine (HSP-SVM) model to predict these classes. The performance of the proposed HSP-SVM model was investigated using three validation techniques: 5-Fold Cross Validation (5-FCV), Resubstitution Validation (RV), and Holdout Validation (HV). With the HV approach, the accuracy for the low, medium, and high classes was 98.5%, 97%, and 98.5%, respectively. The overall accuracy of the HV approach reached 97% with a misclassification rate of 3%, whereas 5-FCV and RV had an overall accuracy of 93.9% with a misclassification rate of 6.1%. The results indicate that, of the three, the HV approach gave the most accurate prediction of the hydrogen storage classes.

Keywords:

5-Fold Cross Validation; Holdout Validation; HSP-SVM; Resubstitution Validation; Support Vector Machine

1. Introduction

Renewable energy sources are receiving great attention because energy demand is rising steadily as the global population grows. The global population is expected to reach 10 billion by 2050 [1]. Countries are turning to renewable energy resources as well as fossil fuels to meet their growing energy needs, and energy use is expected to increase sharply in the coming years. Fossil fuel reserves are finite, so finding new energy sources is important. Global warming poses a significant challenge due to the adverse impacts associated with the use of fossil fuels, including oil, coal, and natural gas [2]. The use of fossil fuels for power generation is progressively diminishing in developed nations. However, fossil fuels are difficult to replace immediately because they meet 80% of our everyday energy demands [3]. According to a report by the World Health Organization (WHO), fossil fuel use contributes to climate change, which has negative impacts on human health [4,5]. This makes it even more important to reduce fossil fuel use by adopting renewable energy sources. Renewable energy has a lower environmental effect than traditional energy conversion techniques; it is considered a clean energy source with nearly no carbon emissions [6]. Every human activity has the potential to affect the environment; nonetheless, when considering environmental implications, renewable approaches should be favored over other methods.

Hydrogen has emerged as an efficient form of energy storage that produces zero carbon emissions, making it an environmentally friendly option. Moreover, its energy content (141.7 MJ/kg) is higher than that of fossil fuels (45.8 MJ/kg). On these figures, the gravimetric energy density of hydrogen is roughly three times that of fossil fuels [7]. These characteristics make hydrogen a favorable energy source for the future. However, hydrogen has a low volumetric density, which makes it difficult to store. Commercially used hydrogen storage techniques, such as cryogenic storage and pressurized gas storage, have the disadvantages of requiring large amounts of energy, experiencing boil-off losses, and being difficult to transport [8,9,10].

The Liquid Organic Hydrogen Carrier (LOHC) system is seen as a suitable approach for storing hydrogen in aromatic compounds. This system alleviates the boil-off losses and transport issues. Several LOHC systems have been investigated to find an efficient system. Some of the efficient LOHC systems are the carbazole [11,12,13,14,15,16,17], indole [18,19,20,21,22], and acridine [23] derivatives. The gravimetric hydrogen density of N-ethylcarbazole (NEC) is 5.8 wt.%, which makes it an efficient LOHC system. However, it has the major drawback of solidifying at room temperature. Brückner et al. [24] introduced the Dibenzyltoluene (H0-DBT) and perhydro-Dibenzyltoluene (H18-DBT) pair in 2014. H0-DBT eliminates the solidification concern because it is a liquid at room temperature. In recent years, researchers have focused on H0-DBT due to its high gravimetric storage density of 6.2 wt.% [25,26,27,28,29,30]. The storage is also reversible: the hydrogen is released again in a dehydrogenation reaction. These characteristics indicate that H0-DBT is an efficient candidate for storing hydrogen in a wide range of applications.

Several studies on the hydrogenation of H0-DBT have reported the hydrogen storage capacity attained under various experimental conditions [24,28,31,32]. The hydrogenation reaction is influenced by several key parameters, such as the reaction temperature, the initial pressure, and the ratio of catalyst to H0-DBT. The hydrogen storage capacity of H0-DBT therefore varies with the reaction conditions. Categorizing hydrogen storage in H0-DBT based on storage capacity can help identify the various classes of stored hydrogen. Machine learning algorithms (MLAs) have been employed recently to analyze the available data and make more accurate predictions for hydrogen storage. This approach can help identify the optimal reaction parameters for the hydrogenation of H0-DBT and other LOHCs in a short time and reduce the experimental effort required.

MLAs have recently been applied to several classes of materials, such as electrocatalysts [33,34], perovskite solids [35,36,37,38], thermoelectrics [39,40,41,42], interphase precipitation in micro-alloyed steels [43], carbon-capture materials [44,45], light-emitting transistors [46], and oxides and inorganic materials [47,48,49,50]. Various machine learning models have been applied to predict the adsorption behavior of H₂, CH₄, C₃H₈, and CO₂ in H₂-selective nanocomposite membranes. The results showed that the Committee Machine Intelligent System (CMIS) exhibited the highest accuracy among the models compared, with R² = 0.9997 [51]. Rezakazemi et al. employed the genetic algorithm (GA) and particle swarm optimization (PSO) to enhance the performance of an adaptive neuro-fuzzy inference system (ANFIS), which was used to study the performance of an H₂-selective mixed matrix membrane (MMM) [52]. PSO-ANFIS yielded better predictions than the other two models, with R² = 0.9938 on the test data. In a later work, they applied two intelligent models to predict the diffusion of various gases through nanocomposite membranes and reported that DE-ANFIS (differential evolution-adaptive neuro-fuzzy inference system) predicted the diffusion of gases more accurately, with R² = 0.9981 on the test data [53]. Rahnama et al. predicted the hydrogen storage capacities of metal hydrides using four regression models. The boosted decision tree regression model performed best, with a coefficient of determination of 0.83 [54]. In a second study, Rahnama et al. predicted the optimal material groups of metal hydrides using different classification algorithms. The multiclass neural network outperformed the other three algorithms, with an accuracy of 80% [55].

Among machine learning algorithms, the Support Vector Machine (SVM) can handle datasets with a large number of features and still achieve good classification performance. Its capacity to handle non-linear relationships through kernel methods allows it to capture complex patterns inherent in hydrogen storage behavior, while its optimization objective mitigates overfitting and enhances generalization performance when validated on independent datasets. Furthermore, SVM’s efficiency in high-dimensional feature spaces enables simultaneous analysis of multiple parameters, reflecting the intricacies of hydrogen storage systems. SVM is a supervised learning method that is trained and tested on labeled data and belongs to the family of linear classifiers. Its distinguishing feature is that it maximizes the margin between classes while reducing the empirical classification error, which is why SVMs are known as maximum-margin classifiers. The goal of SVM is to minimize the structural risk [56]. Finding the optimal parameter settings typically requires exhaustive cross-validation, a procedure generally referred to as model selection. A practical problem with model selection is that it is time-consuming. Several variables in the proposed system can affect the results obtained with the SVM algorithm. The parameters considered include the choice of kernel function, the standard deviation of the Gaussian kernel, the relative weights of the slack variables used to counter an uneven class distribution, and the number of training examples [57].

This study proposes a machine learning model that uses SVM to predict hydrogen storage classes, which are defined by hydrogen storage capacity values. The input dataset is divided into three classes, each with its own range of hydrogen storage capacity. Capacities below 1.5 wt.% are considered low class, capacities from 1.5 wt.% to 3 wt.% medium class, and capacities above 3 wt.% high class. For the prediction of hydrogen storage classes, the Hydrogen Storage Prediction using Support Vector Machine (HSP-SVM) model is proposed. The proposed HSP-SVM model was validated using three techniques: 5-Fold Cross Validation (5-FCV), Resubstitution Validation (RV), and Holdout Validation (HV). Various statistical parameters were used to compare these validation approaches, and the optimal validation approach was identified.
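The class thresholds described above can be sketched as a small labeling function. This is an illustrative sketch only: the function name `storage_class` is hypothetical, and assigning the boundary values 1.5 wt.% and 3 wt.% to the medium class is an assumption based on the wording "from 1.5 wt.% to 3 wt.%".

```python
def storage_class(capacity_wt_pct: float) -> str:
    """Map a hydrogen storage capacity (wt.%) to a class label.

    Assumption: the boundary values 1.5 and 3.0 wt.% count as medium.
    """
    if capacity_wt_pct < 0:
        raise ValueError("capacity must be non-negative")
    if capacity_wt_pct < 1.5:
        return "low"
    if capacity_wt_pct <= 3.0:
        return "medium"
    return "high"


# Label a few example capacities (values chosen for illustration only).
labels = [storage_class(v) for v in (0.4, 1.5, 3.0, 3.1, 6.2)]
print(labels)
```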

2. Simulations and Results

The proposed HSP-SVM model was implemented in MATLAB on a dataset of 151,388 samples adopted from a previous study [8]. The model was an SVM with a quadratic kernel function. The kernel scale was set to automatic, and the box constraint level was kept at one. Multiclass classification used the one-vs-one approach, with data standardization enabled. All the input features listed in Table 1 were used in the model. The default misclassification cost matrix was used. The proposed model was evaluated using statistical metrics, including accuracy, Misclassification Rate (MCR), Recall/Sensitivity, True Negative Rate (TNR)/Selectivity, Precision/Positive Predictive Value (PPV), F1 score, False Positive Rate (FPR), False Negative Rate (FNR), False Discovery Rate (FDR), Negative Predictive Value (NPV), and False Omission Rate (FOR).
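To make the two preprocessing and kernel settings concrete, the sketch below shows z-score standardization of one feature column and a quadratic kernel in plain Python. It is not the MATLAB implementation used in the study: the form K(x, z) = (1 + x·z)² with inputs divided by a kernel scale is assumed here as the usual second-order polynomial kernel, and the function names are hypothetical.

```python
from statistics import mean, stdev


def standardize(column):
    """Z-score a feature column: subtract the mean, divide by the sample std."""
    mu, sigma = mean(column), stdev(column)
    if sigma == 0:
        return [0.0 for _ in column]
    return [(v - mu) / sigma for v in column]


def quadratic_kernel(x, z, scale=1.0):
    """Second-order polynomial kernel (1 + x.z)^2 on inputs divided by `scale`.

    Assumption: this is the usual form of a quadratic SVM kernel; the exact
    scaling applied by the authors' software is not stated in the text.
    """
    dot = sum((xi / scale) * (zi / scale) for xi, zi in zip(x, z))
    return (1.0 + dot) ** 2


print(standardize([1.0, 2.0, 3.0]))
print(quadratic_kernel([1.0, 2.0], [3.0, 4.0]))
```

A kernel of this kind lets the classifier draw curved (second-order) boundaries in the four-dimensional input space while the optimization itself stays linear in the kernel values.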

\mathrm{Accuracy} = \frac{\frac{H_{xe}}{T_{xe}} + \frac{H_{xg}}{T_{xg}}}{\frac{H_{xe}}{T_{xe}} + \frac{\sum_{f = 1}^{m} (H_{xf,\, f \neq e})}{T_{xf}} + \frac{H_{xg}}{T_{xg}} + \frac{\sum_{q = 1}^{m} (H_{xq,\, q \neq g})}{T_{xg}}}, \quad \text{where } e, f, g, q = 1, 2, 3, \dots, m

\mathrm{Miss\ rate} = \frac{\frac{\sum_{q = 1}^{m} (H_{xq,\, q \neq g})}{T_{xg}}}{\frac{\sum_{q = 1}^{m} (H_{xq,\, q \neq g})}{T_{xg}} + \frac{H_{xe}}{T_{xe}}}, \quad \text{where } e, g, q = 1, 2, 3, \dots, m

\mathrm{Recall/Sensitivity} = \frac{\frac{H_{xe}}{T_{xe}}}{\frac{H_{xe}}{T_{xe}} + \frac{\sum_{q = 1}^{m} (H_{xq,\, q \neq g})}{T_{xg}}}, \quad \text{where } e, g, q = 1, 2, 3, \dots, m

\mathrm{True\ Negative\ Rate/Selectivity} = \frac{\frac{H_{xg}}{T_{xg}}}{\frac{H_{xg}}{T_{xg}} + \frac{\sum_{f = 1}^{m} (H_{xf,\, f \neq e})}{T_{xf}}}, \quad \text{where } f, g = 1, 2, 3, \dots, m

\mathrm{Precision/Positive\ Predictive\ Value} = \frac{\frac{H_{xe}}{T_{xe}}}{\frac{H_{xe}}{T_{xe}} + \frac{\sum_{f = 1}^{m} (H_{xf,\, f \neq e})}{T_{xf}}}, \quad \text{where } e, f = 1, 2, 3, \dots, m

F_{1}\ \mathrm{Score} = 2 \times \frac{\mathrm{Precision} \times \mathrm{Recall}}{\mathrm{Precision} + \mathrm{Recall}}

\mathrm{False\ Positive\ Rate} = \frac{\frac{\sum_{f = 1}^{m} (H_{xf,\, f \neq e})}{T_{xf}}}{\frac{H_{xg}}{T_{xg}} + \frac{\sum_{f = 1}^{m} (H_{xf,\, f \neq e})}{T_{xf}}}, \quad \text{where } f, g = 1, 2, 3, \dots, m

\mathrm{False\ Discovery\ Rate} = \frac{\frac{\sum_{f = 1}^{m} (H_{xf,\, f \neq e})}{T_{xf}}}{\frac{H_{xe}}{T_{xe}} + \frac{\sum_{f = 1}^{m} (H_{xf,\, f \neq e})}{T_{xf}}}, \quad \text{where } e, f = 1, 2, 3, \dots, m

\mathrm{False\ Omission\ Rate} = \frac{\frac{\sum_{q = 1}^{m} (H_{xq,\, q \neq g})}{T_{xg}}}{\frac{\sum_{q = 1}^{m} (H_{xq,\, q \neq g})}{T_{xg}} + \frac{H_{xg}}{T_{xg}}}, \quad \text{where } g, q = 1, 2, 3, \dots, m

\mathrm{Negative\ Predictive\ Value} = \frac{\frac{H_{xg}}{T_{xg}}}{\frac{H_{xg}}{T_{xg}} + \frac{\sum_{q = 1}^{m} (H_{xq,\, q \neq g})}{T_{xg}}}, \quad \text{where } g, q = 1, 2, 3, \dots, m
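For a single class treated one-vs-rest, these metrics reduce to ratios of four confusion-matrix counts. The sketch below uses the standard definitions in terms of true positives (TP), false positives (FP), false negatives (FN), and true negatives (TN). Reading the H/T terms above as these four counts is an interpretation, and the function name and example counts are hypothetical.

```python
def binary_metrics(tp: int, fp: int, fn: int, tn: int) -> dict:
    """Standard one-vs-rest metrics from confusion-matrix counts.

    Raises ZeroDivisionError if a denominator is zero (e.g. no positives).
    """
    total = tp + fp + fn + tn
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return {
        "accuracy": (tp + tn) / total,
        "mcr": (fp + fn) / total,          # misclassification rate
        "recall": recall,                  # sensitivity, true positive rate
        "specificity": tn / (tn + fp),     # selectivity, true negative rate
        "precision": precision,            # positive predictive value
        "f1": 2 * precision * recall / (precision + recall),
        "fpr": fp / (fp + tn),
        "fnr": fn / (fn + tp),
        "fdr": fp / (fp + tp),
        "npv": tn / (tn + fn),
        "for": fn / (fn + tn),
    }


# Made-up counts, for illustration only.
m = binary_metrics(tp=40, fp=10, fn=5, tn=45)
print({name: round(value, 3) for name, value in m.items()})
```

Several of these quantities are complements of one another (accuracy and MCR, specificity and FPR, recall and FNR, precision and FDR, NPV and FOR), which gives a quick consistency check on any reported table.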

3. Materials and Methods

The proposed Hydrogen Storage Prediction empowered with Support Vector Machine (HSP-SVM) model involves three layers: the data acquisition layer, the preprocessing layer, and the validation layer, as shown in Figure 1. In the data acquisition layer, devices gather data on various parameters. These data are sometimes missing or noisy because of technical issues or device failures; the preprocessing layer addresses this through missing-value handling, moving-average smoothing, and normalization. After preprocessing is completed, the validation layer is activated. This layer is divided into two sub-layers: the application/prediction layer and a performance evaluation layer that calculates various statistical parameters. In the prediction layer, the proposed model uses the SVM algorithm for classification, and three approaches (5-FCV, RV, and HV) are used for model validation. The output layer estimates the accuracy, miss rate, recall, precision, and specificity of the proposed HSP-SVM model, as shown in Figure 1.
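The three preprocessing steps named above can be sketched in a few lines. The text does not give their details, so the choices here are assumptions made for illustration: missing entries are filled with the column mean, smoothing uses a centered moving average with a window of three, and normalization is min-max scaling to [0, 1]. All function names are hypothetical.

```python
def fill_missing(values):
    """Replace None entries with the mean of the observed values."""
    known = [v for v in values if v is not None]
    mean = sum(known) / len(known)
    return [mean if v is None else v for v in values]


def moving_average(values, window=3):
    """Centered moving average; the window shrinks at the two ends."""
    half = window // 2
    smoothed = []
    for i in range(len(values)):
        lo, hi = max(0, i - half), min(len(values), i + half + 1)
        smoothed.append(sum(values[lo:hi]) / (hi - lo))
    return smoothed


def min_max(values):
    """Scale values linearly to the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


# A made-up temperature column with one missing reading.
raw = [150.0, None, 170.0, 180.0, 200.0]
clean = min_max(moving_average(fill_missing(raw)))
print(clean)
```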

In this study, the dataset is adopted from Figures 4 to 8 of the previous study [8]. The key parameters that directly affect the hydrogen storage capacity of H0-DBT are the temperature and pressure. The hydrogen storage capacity increases as the temperature and pressure increase. Moreover, the catalyst plays a vital role in accelerating the hydrogen adsorption rates, so optimizing the catalyst dosage is imperative. Furthermore, the concentration of H0-DBT may affect the hydrogen storage capacity, and its effect on the attained capacity needs to be investigated. Hence, the selection of key parameters was guided by the physical and chemical factors that influence hydrogen storage in H0-DBT, with the aim of providing insight into the mechanisms governing hydrogen adsorption in H0-DBT. The parameters considered as inputs and the targeted output are listed in Table 1.

Table 1. Input/output parameters for the proposed HSP-SVM model.

S. No.	Input/Output Parameters
Input 1	Temperature
Input 2	Pressure
Input 3	H0-DBT Concentration
Input 4	Catalyst Concentration
Output	Hydrogen Storage Classes (Low, Medium, and High)

The SVM algorithm is a type of machine learning model that is often used for classification tasks involving datasets with many features. It is particularly useful when there are more features than data points. To reduce the amount of memory required, SVM uses only a subset of the training data, called support vectors, in its decision-making process. Various types of kernel functions can be used in SVM, including standard kernels and custom kernels defined by the user. Recall that the equation of a line is [58,59]:

a₂ = ba₁ + d

(1)

where b is the slope of the line and d is the intercept. Rearranging gives

ba₁ − a₂ + d = 0

Let \vec{a} = {(a_{1}, a_{2})}^{t} and \vec{c} = (b, - 1); then Equation (1) can be rewritten as

\vec{c} \cdot \vec{a} + d = 0

(2)

The equation for a hyperplane in two dimensions is obtained using vectors. The general equation for a hyperplane in any number of dimensions is shown in Equation (2). This equation and the corresponding functions can be used to define the hyperplane in any number of dimensions.

The direction of a vector \vec{a} = {(a_{1}, a_{2})}^{t} is written as h and is defined as [60]:

h = \left( \frac{a_{1}}{||a||}, \frac{a_{2}}{||a||} \right)

(3)

where, for an n-dimensional vector,

||a|| = \sqrt{a_{1}^{2} + a_{2}^{2} + a_{3}^{2} + \dots + a_{n}^{2}}

Since

\cos (σ) = \frac{a_{1}}{||a||} and \cos (φ) = \frac{a_{2}}{||a||}

Equation (3) can be written as

h = (\cos (σ), \cos (φ))

The dot product of \vec{c} and \vec{a} is

\vec{c} \cdot \vec{a} = ||c|| \, ||a|| \cos (ω)

where ω is the angle between the two vectors. Taking σ and φ here as the angles that \vec{c} and \vec{a} make with the first axis,

ω = σ - φ

\begin{array}{l} \cos (ω) = \cos (σ - φ) \\ = \cos (σ) \cos (φ) + \sin (σ) \sin (φ) \\ = \frac{c_{1}}{||c||} \frac{a_{1}}{||a||} + \frac{c_{2}}{||c||} \frac{a_{2}}{||a||} \\ = \frac{c_{1} a_{1} + c_{2} a_{2}}{||c|| \, ||a||} \end{array}

\vec{c} \cdot \vec{a} = ||c|| \, ||a|| \left[ \frac{c_{1} a_{1} + c_{2} a_{2}}{||c|| \, ||a||} \right] = c_{1} a_{1} + c_{2} a_{2}

In general,

\vec{c} \cdot \vec{a} = \sum_{l = 1}^{n} c_{l} a_{l}

(4)

The dot product of two n-dimensional vectors can be computed using Equation (4).
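The agreement between the component form of the dot product and its geometric form can be checked numerically. The sketch below uses made-up two-dimensional vectors; the helper names are hypothetical.

```python
import math


def dot(c, a):
    """Component-wise dot product: sum of c_l * a_l."""
    return sum(ci * ai for ci, ai in zip(c, a))


def norm(a):
    """Euclidean norm ||a||."""
    return math.sqrt(dot(a, a))


def angle_between(c, a):
    """Angle (radians) between two non-zero vectors."""
    cosine = dot(c, a) / (norm(c) * norm(a))
    # Clamp to [-1, 1] to guard against floating-point round-off.
    return math.acos(max(-1.0, min(1.0, cosine)))


c = (3.0, 4.0)
a = (4.0, 3.0)
omega = angle_between(c, a)

component_form = dot(c, a)
geometric_form = norm(c) * norm(a) * math.cos(omega)
print(component_form, geometric_form)
```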

Let

x = y (c \cdot a + d)

Given a dataset D, this quantity is computed for each training example (a_{l}, y_{l}) [61,62]:

x_{l} = y_{l} (c \cdot a_{l} + d)

The functional margin of a dataset, denoted E, indicates how well the classes in the dataset are separated from each other. It is taken as the smallest value of x_{l} over the dataset, that is, the value for the sample nearest the hyperplane from either class. A large functional margin indicates that the classes are separated effectively, which, in turn, enhances the performance of the model. The functional margin is commonly used to assess the generalization ability of a model: a model with a large functional margin is less likely to overfit the training data [63].

E = \min_{l = 1, \dots, m} x_{l}

The optimal hyperplane is the hyperplane having the largest functional margin. The prime objective is the identification of the optimal hyperplane, which involves determining the optimal values of the vector (c⃗) and scalar d that define the hyperplane.
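As a small numerical illustration, the functional margin of a candidate hyperplane is the smallest value of y(c·a + d) over the dataset. The data points and hyperplanes below are made up, with labels y in {−1, +1}; the function name is hypothetical.

```python
def functional_margin(c, d, samples):
    """Smallest value of y * (c . a + d) over (a, y) pairs, y in {-1, +1}."""
    return min(
        y * (sum(ci * ai for ci, ai in zip(c, a)) + d)
        for a, y in samples
    )


# Toy, linearly separable data: (feature vector, label).
samples = [
    ((1.0, 1.0), -1),
    ((0.0, 1.0), -1),
    ((3.0, 3.0), +1),
    ((2.0, 2.5), +1),
]

# Two candidate hyperplanes c . a + d = 0 that both separate the data.
margin_a = functional_margin((1.0, 1.0), -3.0, samples)
margin_b = functional_margin((1.0, 0.0), -1.5, samples)
print(margin_a, margin_b)
```

Here the first hyperplane has the larger margin. Because the functional margin grows when c and d are multiplied by a common factor, comparisons between hyperplanes are normally made after fixing the scale of c, which is what the constraint y(c·a + d) ≥ 1 in the optimization does.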

The Lagrangian function is given by the following equation [58,59,60]:

δ (c, d, φ) = \frac{1}{2} c \cdot c - \sum_{l = 1}^{m} φ_{l} [y_{l} (c \cdot a_{l} + d) - 1]

Setting its gradients with respect to c and d to zero gives

\nabla_{c} δ (c, d, φ) = c - \sum_{l = 1}^{m} φ_{l} y_{l} a_{l} = 0

(5)

\nabla_{d} δ (c, d, φ) = - \sum_{l = 1}^{m} φ_{l} y_{l} = 0

(6)

From Equations (5) and (6), we obtain:

c = \sum_{l = 1}^{m} φ_{l} y_{l} a_{l} and \sum_{l = 1}^{m} φ_{l} y_{l} = 0

(7)

Substituting Equation (7) into the Lagrangian function δ, we obtain:

δ (φ, d) = \sum_{l = 1}^{m} φ_{l} - \frac{1}{2} \sum_{l = 1}^{m} \sum_{n = 1}^{m} φ_{l} φ_{n} y_{l} y_{n} a_{l} \cdot a_{n}

Thus,

\max_{φ} \sum_{l = 1}^{m} φ_{l} - \frac{1}{2} \sum_{l = 1}^{m} \sum_{n = 1}^{m} φ_{l} φ_{n} y_{l} y_{n} a_{l} \cdot a_{n}

(8)

Subject to

φ_{l} \geq 0, \; l = 1, \dots, m, \quad \sum_{l = 1}^{m} φ_{l} y_{l} = 0.
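The substitution step that leads to the dual objective can be written out term by term. In the sketch below, φ_l are the Lagrange multipliers, y_l the class labels, and a_l the training samples; the two stationarity conditions c = Σ φ_l y_l a_l and Σ φ_l y_l = 0 are inserted into the Lagrangian.

```latex
\begin{aligned}
\frac{1}{2}\, c \cdot c
  &= \frac{1}{2} \sum_{l=1}^{m} \sum_{n=1}^{m} φ_{l} φ_{n} y_{l} y_{n}\, a_{l} \cdot a_{n}, \\
\sum_{l=1}^{m} φ_{l} \left[ y_{l} (c \cdot a_{l} + d) - 1 \right]
  &= \sum_{l=1}^{m} \sum_{n=1}^{m} φ_{l} φ_{n} y_{l} y_{n}\, a_{l} \cdot a_{n}
     + d \sum_{l=1}^{m} φ_{l} y_{l} - \sum_{l=1}^{m} φ_{l} \\
  &= \sum_{l=1}^{m} \sum_{n=1}^{m} φ_{l} φ_{n} y_{l} y_{n}\, a_{l} \cdot a_{n}
     - \sum_{l=1}^{m} φ_{l}.
\end{aligned}
```

Subtracting the second expression from the first leaves Σ φ_l minus one half of the double sum, which is the objective maximized over φ. The offset d drops out because its coefficient Σ φ_l y_l is zero.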

The Karush–Kuhn–Tucker (KKT) conditions extend the method of Lagrange multipliers to problems with inequality constraints. The necessary KKT condition is expressed as [60,63]:

φ_{l} [y_{l} (c \cdot a^{*} + d) - 1] = 0

(9)

where a* denotes an optimal point in the dataset, characterized by a positive value of φ. The value of φ for all other points in the dataset is zero.

So,

y_{l} (c \cdot a^{*} + d) - 1 = 0

(10)

The points in the dataset that are closest to the hyperplane are known as support vectors. The support vectors can be identified using Equation (10).

c - \sum_{l = 1}^{m} φ_{l} y_{l} a_{l} = 0

c = \sum_{l = 1}^{m} φ_{l} y_{l} a_{l}

(11)

To calculate the value of d, we start from:

y_{l} (c \cdot a^{*} + d) - 1 = 0

(12)

Multiplying both sides of Equation (12) by y_{l} gives:

y_{l}^{2} (c \cdot a^{*} + d) - y_{l} = 0

where y_{l}^{2} = 1, so

(c \cdot a^{*} + d) - y_{l} = 0

d = y_{l} - c \cdot a^{*}

(13)

Averaging over the s support vectors then gives:

d = \frac{1}{s} \sum_{l = 1}^{s} (y_{l} - c \cdot a_{l})

(14)
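The averaging step for the offset d can be illustrated with a toy case. The weight vector and support vectors below are made up so that each support vector lies exactly on the margin, y(c·a + d) = 1; the function name is hypothetical.

```python
def offset_from_support_vectors(c, support_vectors):
    """Average of y - c . a over the support vectors (a, y)."""
    residuals = [
        y - sum(ci * ai for ci, ai in zip(c, a))
        for a, y in support_vectors
    ]
    return sum(residuals) / len(residuals)


# Made-up weight vector and support vectors for illustration.
c = (0.5, 0.5)
support_vectors = [((1.0, 1.0), -1), ((3.0, 3.0), +1)]

d = offset_from_support_vectors(c, support_vectors)
print(d)

# Each support vector should then satisfy y * (c . a + d) = 1.
margins = [
    y * (sum(ci * ai for ci, ai in zip(c, a)) + d)
    for a, y in support_vectors
]
print(margins)
```

Averaging over all support vectors, rather than using a single one, makes the estimate of d less sensitive to numerical error in any individual point.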

In Equation (14), s is the number of support vectors; together with c, the resulting d defines the hyperplane used for prediction. In this manuscript, we examined the application of SVM for multiclass classification. To address this problem, the multiclass problem was broken down into several binary classification problems. Specifically, we employed m × (m − 1)/2 classifiers, where m represents the number of classes. With m = 3, this gives three classifiers in a one-vs-one arrangement. The hypothesis function is as follows:

g (c_{i}) = \begin{cases} i & \text{if } c_{i} \cdot a + d > {th}_{i} \\ j & \text{else} \end{cases}

(15)

In the SVM algorithm used in the proposed HSP-SVM model, points above the hyperplane are classified as class i (i = 1 for the low-hydrogen storage class, i = 2 for the medium-hydrogen storage class, and, similarly, i = 3 for the high-hydrogen storage class); otherwise, points are classified as class j. The goal of the SVM algorithm is to find the optimal hyperplane that accurately divides the data into the correct classes. The SVM algorithm works by identifying the hyperplane that provides the largest margin, or distance, between the different classes, which helps to improve the accuracy of the model.
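The one-vs-one scheme can be sketched as a vote among the m(m − 1)/2 pairwise classifiers. Everything specific in this sketch is an assumption for illustration: the pairwise decision functions are linear with a threshold of zero, the single input feature and the hyperplane coefficients are made up, and ties go to the lower class index. The model in this study uses a quadratic kernel rather than linear boundaries.

```python
from collections import Counter
from itertools import combinations


def one_vs_one_predict(x, classes, pairwise):
    """Predict by majority vote over all pairwise (i, j) classifiers.

    pairwise[(i, j)] = (c, d); the classifier votes i if c . x + d > 0, else j.
    """
    votes = Counter()
    for i, j in combinations(classes, 2):
        c, d = pairwise[(i, j)]
        score = sum(ck * xk for ck, xk in zip(c, x)) + d
        votes[i if score > 0 else j] += 1
    # max() returns the first class with the highest vote count.
    return max(classes, key=lambda k: votes[k])


classes = (1, 2, 3)  # 1 = low, 2 = medium, 3 = high
# Hypothetical 1-D hyperplanes: each votes for the lower class below a cut.
pairwise = {
    (1, 2): ((-1.0,), 1.5),
    (1, 3): ((-1.0,), 2.25),
    (2, 3): ((-1.0,), 3.0),
}

predictions = [one_vs_one_predict((v,), classes, pairwise) for v in (1.0, 2.0, 4.0)]
print(predictions)
```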

4. Discussions

4.1. 5-Fold Cross Validation

The 5-Fold Cross Validation (5-FCV) approach was evaluated first. Table 2a–c presents the confusion matrices for the proposed HSP-SVM model with 5-FCV. Table 2a shows that 37,925 low-class samples were classified correctly as low-class storage, 105,935 cases were correctly categorized as not belonging to low-class storage, and 7528 low-class cases were erroneously identified as not belonging to low-class storage. Table 2b shows that 57,787 medium-class samples were correctly identified as medium-class storage, whereas 9301 instances were wrongly predicted as medium storage; 84,300 occurrences of non-medium-class storage were categorized correctly. Table 2c reveals that 46,375 high-class samples were categorized correctly as high-class storage, 1773 high-class samples were wrongly identified as non-high-class storage, and 103,240 samples were correctly classified as not belonging to high-class storage. The classification accuracy of this approach was 93.90%, with an MCR of 6.10%. The lower accuracies of the low and medium classes reduced the overall accuracy of the 5-FCV approach.

4.2. Resubstitution Validation

The Resubstitution Validation (RV) approach was also used to evaluate the performance of the proposed HSP-SVM model for predicting hydrogen storage classes. Table 2a–c shows the confusion matrices for the proposed model. Table 2a shows that 37,925 instances were correctly classified as low-class storage, while 7528 low-class samples were wrongly classified as non-low-class storage. Moreover, 105,935 instances were correctly identified as not belonging to low-class storage. Table 2b shows that 57,787 occurrences of medium-class storage were identified correctly, 9301 samples were wrongly classified as medium-class storage, and 84,300 samples were correctly identified as non-medium-class storage. Table 2c shows that 46,375 instances were identified correctly as high-class storage, while 1773 high-class samples were incorrectly identified as non-high-class storage. Furthermore, 103,240 samples of non-high-class storage were identified correctly. The classification accuracy was 93.90% with an MCR of 6.10%. The lower overall accuracy for RV was due to the high MCR values for the low and medium classes.

4.3. Holdout Validation

The third approach used to evaluate the proposed HSP-SVM model was the Holdout Validation (HV) technique. The performance of this approach was assessed using the confusion matrices shown in Table 3a–c. Table 3a shows that 36,033 instances in the low storage category and 87,107 instances in the non-low storage category were correctly classified, while 1860 low-class instances were wrongly classified as non-low-class storage. Table 3b shows that in the medium storage category, 48,120 instances were correctly classified, and 3754 instances were wrongly predicted as the medium storage category; 73,126 instances were correctly identified as non-medium storage. Table 3c shows that in the high storage category, 37,093 instances were correctly classified, 1894 high-class instances were wrongly identified as non-high storage, and 86,013 instances were correctly predicted as non-high storage. This approach had a classification accuracy of 97.00% and a misclassification rate of 3.00%. The high accuracy of this approach in classifying low storage capacity was a major contributing factor to its overall performance.
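The three validation schemes discussed in Sections 4.1 to 4.3 differ only in which samples are used for training and which for testing. The sketch below generates the index splits for each scheme. The 20% holdout fraction and the random seeds are assumptions, since the text does not state the split that was used, and the function names are hypothetical.

```python
import random


def k_fold_indices(n_samples, k=5, seed=0):
    """Return k (train, test) index splits with disjoint test folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]
    splits = []
    for i, test in enumerate(folds):
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        splits.append((train, test))
    return splits


def holdout_indices(n_samples, test_fraction=0.2, seed=0):
    """Single random split; test_fraction is an assumed value."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    n_test = round(n_samples * test_fraction)
    return indices[n_test:], indices[:n_test]


def resubstitution_indices(n_samples):
    """Train and test on the same samples."""
    indices = list(range(n_samples))
    return indices, indices


splits = k_fold_indices(10, k=5)
train_h, test_h = holdout_indices(10)
train_r, test_r = resubstitution_indices(10)
print(len(splits), len(train_h), len(test_h), train_r == test_r)
```

Resubstitution scores the model on data it has already seen, so it tends to be optimistic. Cross-validation and holdout both score on unseen samples, with cross-validation averaging over several such splits.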

4.4. Receiver Operating Characteristic Curve

The receiver operating characteristic (ROC) curves obtained using 5-FCV, RV, and HV for the low, medium, and high classes are depicted in Figure 2a–c. For the low-class and high-class ROC curves shown in Figure 2a,c, the classifier showed a True Positive Rate (TPR) of 0.83 and 0.96, respectively, with a False Positive Rate (FPR) of 0 in both cases. This indicates that the classifier correctly classified 83% and 96% of the positive samples and all of the negative samples. The area under the ROC curve (AUC) was 1.00, which indicates perfect performance. Figure 2b shows that the TPR and FPR for the medium class were 1.00 and 0.10, respectively. Thus, the classifier identified all positive samples correctly, whereas 90% of the negative samples were correctly classified. Moreover, the AUC was 0.99, which indicates excellent performance and shows that the classifier separated the positive and negative samples effectively. The high TPR and low FPR values show that the classifier achieved a good balance between sensitivity and specificity. Overall, the performance of the classifier was excellent.
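The relationship between a single operating point (TPR, FPR) and the AUC can be illustrated with a short sketch. The scores and labels below are made up, and the AUC is computed through its rank interpretation: the probability that a randomly chosen positive sample scores higher than a randomly chosen negative one, with ties counted as one half. The function names are hypothetical.

```python
def roc_point(scores, labels, threshold):
    """(FPR, TPR) when samples with score >= threshold are called positive."""
    tp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 1)
    fp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 0)
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    return fp / n_neg, tp / n_pos


def auc(scores, labels):
    """Area under the ROC curve via pairwise comparison of scores."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Made-up classifier scores and true labels (1 = positive class).
scores = [0.9, 0.8, 0.7, 0.4, 0.3, 0.1]
labels = [1, 1, 0, 1, 0, 0]

print(roc_point(scores, labels, threshold=0.5))
print(auc(scores, labels))
```

The operating point depends on the chosen threshold, whereas the AUC summarizes the ranking over all thresholds, so a curve can have a high AUC even when the reported operating point has a TPR below one.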

4.5. Comparative Analysis of the 5-FCV, RV, and HV

The 5-FCV, RV, and HV techniques were evaluated using various statistical parameters for all three classes, and the results are presented in Table 4. The HV approach had higher accuracies of 98.5% and 97.0%, compared with 95.0% and 93.8% for the 5-FCV and RV techniques, for the low class and the medium class, respectively. For the high class, the accuracy of 98.8% obtained by the 5-FCV and RV methods and the accuracy of 98.5% obtained by HV were almost the same. The 5-FCV and RV approaches yielded a lower misclassification rate (MCR) for the high class (1.20%) than for the low class (5.00%) and medium class (6.15%). The HV approach, in contrast, yielded a similar MCR for all three classes: 1.50% for the low and high classes and 3.00% for the medium class. The selectivity of all the approaches was the same (100%) for the low and high classes. For the medium class, HV achieved a selectivity of 95.1%, compared with 90.1% for the other two approaches. The recall of all the methods was highest for the medium class (100%); for the low class, the recall of HV (95.1%) was higher than the 83.4% of 5-FCV and RV. All the techniques showed the same precision of 100% for the low and high classes. However, the HV technique was more precise for the medium class, with a precision of 92.8%.

The HV approach yielded higher F1 scores than 5-FCV and RV for the low and medium classes: 97.5% and 96.2%, respectively, compared with 90.9% and 92.5%. For the high class, however, the F1 score achieved using 5-FCV and RV (98.1%) was slightly higher than that of HV (97.5%). The False Positive Rate (FPR) of the HV approach for the medium class (4.90%) was lower than the 9.90% yielded by 5-FCV and RV. The False Discovery Rate (FDR) for the medium class was 13.9% for 5-FCV and RV, whereas HV yielded a lower FDR of 7.2%, the complement of its 92.8% precision. The False Omission Rate (FOR) for the low class was 6.60% for 5-FCV and RV, whereas HV yielded a lower FOR of 2.10%. The Negative Predictive Value (NPV) for the low class was higher for HV (97.90%) than for 5-FCV and RV (93.40%). Overall, these results suggest that the HV approach performed best across the three classes compared with the 5-FCV and RV approaches, as depicted in Figure 3a–c.

Moreover, the overall accuracy of the HV approach (97.00%) was higher than that of the 5-FCV and RV approaches (93.90%). Furthermore, the MCR was 3.00% for the HV approach and 6.10% for the 5-FCV and RV methods, as depicted in Figure 4. Hence, the HV approach was found to be optimal for predicting the hydrogen storage classes using the proposed HSP-SVM model.
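As a check on these overall figures, the misclassification rates can be recomputed from the counts quoted in Sections 4.1 to 4.3. The calculation assumes that each sample is counted once, so that the misclassified samples are the low-class and high-class samples assigned to another class, and that each total is the sum of the three counts reported for a table.

```latex
\begin{aligned}
\mathrm{MCR}_{\mathrm{HV}}
  &= \frac{1860 + 1894}{36{,}033 + 87{,}107 + 1860}
   = \frac{3754}{125{,}000} \approx 3.00\% \\
\mathrm{MCR}_{\text{5-FCV, RV}}
  &= \frac{7528 + 1773}{37{,}925 + 105{,}935 + 7528}
   = \frac{9301}{151{,}388} \approx 6.14\%
\end{aligned}
```

In both cases the numerator equals the number of samples wrongly predicted as medium class (3754 and 9301), which is consistent with all errors being low-class or high-class samples assigned to the medium class. The corresponding overall accuracies are the complements, about 97.0% and 93.9%.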

As shown in Table 5, the proposed HSP-SVM model exhibited good accuracy compared to previous studies. Moreover, the predictive performance of the current model is compared with that of other classification models, namely Levenberg–Marquardt (LM) [64] and Weighted Federated Machine Learning (WFML) [65], in Figure 5. Figure 5 shows that the current model performed better than these previously reported classification models. The accuracy of HSP-SVM was 97%, whereas it was 94.9% and 96.4% for the LM and WFML models, respectively. Moreover, the recall values for HSP-SVM and WFML were quite close, whereas the recall was 87.2% for LM.

5. Conclusions

Using dibenzyltoluene (H0-DBT) as a liquid organic hydrogen carrier presents a promising option for hydrogen storage systems. The HSP-SVM model was developed to predict the hydrogen storage classes for hydrogen stored in H0-DBT, and its performance was validated using three techniques: 5-FCV, RV, and HV. The HV approach showed a higher overall accuracy of 97.0%, whereas it was 93.9% for 5-FCV and RV. Correspondingly, the MCR was 3.00% for HV and 6.10% for both 5-FCV and RV. Furthermore, for the low class, the HV approach yielded an accuracy of 98.50% and a sensitivity of 95.10%, in comparison to 95% accuracy and 83.40% sensitivity for the 5-FCV and RV approaches. Similarly, for the medium class, the accuracy and precision of the HV approach were 97% and 92.80%, respectively, whereas the 5-FCV and RV approaches achieved a lower accuracy of 93.85% and a lower precision of 86.10%. Therefore, HV classified the low-class and medium-class data more accurately than the other two approaches. These results suggest that the HV approach is the optimal validation approach for the proposed HSP-SVM model when predicting hydrogen storage classes in dibenzyltoluene.

Author Contributions

Conceptualization, A.A. and H.C.; methodology, A.A. and M.A.K.; software, A.A. and M.A.K.; validation, A.A., M.A.K. and H.C.; data curation, A.A.; writing—original draft preparation, A.A. and M.A.K.; writing—review and editing, A.A. and H.C.; supervision, H.C.; funding acquisition, H.C. All authors have read and agreed to the published version of the manuscript.

Funding

This work was supported by the National Research Foundation of Korea (NRF) grant funded by the Korea government (Ministry of Science and ICT) (No. 2022R1A2C1093395).

Institutional Review Board Statement

Not Applicable.

Informed Consent Statement

Not Applicable.

Data Availability Statement

Data will be available on request.

Conflicts of Interest

The authors declare no conflicts of interest.

References

1. Available online: http://sdg.iisd.org/news/world-population-to-reach-9-9-billion-by-2050/ (accessed on 14 December 2023).
2. Endo, N.; Goshome, K.; Tetsuhiko, M.; Segawa, Y.; Shimoda, E.; Nozu, T. Thermal management and power saving operations for improved energy efficiency within a renewable hydrogen energy system utilizing metal hydride hydrogen storage. Int. J. Hydrogen Energy 2021, 46, 262–271.
3. Singh, R.; Singh, M.; Gautam, S. Hydrogen economy, energy, and liquid organic carriers for its mobility. Mater. Today Proc. 2021, 46, 5420–5427.
4. World Health Organization. COP24 Special Report: Health and Climate Change; WHO: Geneva, Switzerland, 2018.
5. Franco, M.; Bilal, U.; Diez-Roux, A.V. Preventing non-communicable diseases through structural changes in urban environments. J. Epidemiol. Commun. Health 2015, 69, 509–511.
6. Singh, R.; Altaee, A.; Gautam, S. Nanomaterials in the advancement of hydrogen energy storage. Heliyon 2020, 6, 04487.
7. Yang, J.; Sudik, A.; Wolverton, C.; Siegel, D.J. High capacity hydrogen storage materials: Attributes for automotive applications and techniques for materials discovery. Chem. Soc. Rev. 2010, 39, 656–675.
8. Ali, A.; Kumar, G.U.; Lee, H.J. Parametric study of the hydrogenation of dibenzyltoluene and its dehydrogenation performance as a liquid organic hydrogen carrier. J. Mech. Sci. Technol. 2020, 34, 3069–3077.
9. Kölbig, M.; Weckerle, C.; Linder, M.; Bürger, I. Review on thermal applications for metal hydrides in fuel cell vehicles: Operation modes, recent developments and crucial design aspects. Renew. Sustain. Energy Rev. 2022, 162, 112385–112394.
10. Abohamzeh, E.; Salehi, F.; Sheikholeslami, M.; Abbassi, R.; Khan, F. Review of hydrogen safety during storage, transmission, and applications processes. J. Loss Prev. Process Ind. 2021, 72, 104569–104578.
11. Yang, M.; Han, C.; Ni, G.; Wu, J.; Cheng, H. Temperature controlled three-stage catalytic dehydrogenation and cycle performance of perhydro-9-ethylcarbazole. Int. J. Hydrogen Energy 2012, 37, 12839–12845.
12. Wang, B.; Yan, T.; Chang, T.; Wei, J.; Zhou, Q.; Yang, S.; Fang, T. Palladium supported on reduced graphene oxide as a high-performance catalyst for the dehydrogenation of dodecahydro-N-ethylcarbazole. Carbon 2017, 122, 9–18.
13. Mehranfar, A.; Izadyar, M.; Esmaeili, A.A. Hydrogen storage by N-ethylcarbazol as a new liquid organic hydrogen carrier: A dft study on the mechanism. Int. J. Hydrogen Energy 2015, 40, 5797–5806.
14. Xue, W.; Liu, H.; Zhao, B.; Ge, L.; Yang, S.; Qiu, M.; Li, J.; Han, W.; Chen, X. Single Rh₁Co catalyst enabling reversible hydrogenation and dehydrogenation of N-ethylcarbazole for hydrogen storage. Appl. Catal. B Environ. 2023, 327, 122453.
15. Jiang, Z.; Gong, X.; Wang, B.; Wu, Z.; Fang, T. A experimental study on the dehydrogenation performance of dodecahydro-N-ethylcarbazole on M/TiO₂ catalysts. Int. J. Hydrogen Energy 2019, 44, 2951–2959.
16. Ge, L.; Qiu, M.; Zhu, Y.; Yang, S.; Li, W.; Li, W.; Jiang, Z.; Chen, X. Synergistic catalysis of Ru single-atoms and zeolite boosts high-efficiency hydrogen storage. Appl. Catal. B Environ. 2022, 319, 121958.
17. Dong, Y.; Yang, M.; Zhu, T.; Chen, X.; Cheng, G.; Ke, H.; Cheng, H. Fast dehydrogenation kinetics of perhydro-N-propylcarbazole over a supported Pd catalyst. ACS Appl. Energy Mater. 2018, 1, 4285–4292.
18. Dong, Y.; Yang, M.; Yang, Z.; Ke, H.; Cheng, H. Catalytic hydrogenation and dehydrogenation of N-ethylindole as a new heteroaromatic liquid organic hydrogen carrier. Int. J. Hydrogen Energy 2015, 40, 10918–10922.
19. Dong, Y.; Yang, M.; Zhu, T.; Chen, X.; Li, C.; Ke, H.; Cheng, H. Hydrogenation Kinetics of N-Ethylindole on a Supported Ru Catalyst. Energy Technol. 2018, 6, 558–562.
20. Li, L.; Yang, M.; Dong, Y.; Mei, P.; Cheng, H. Hydrogen storage and release from a new promising liquid organic hydrogen storage carrier: 2-methylindole. Int. J. Hydrogen Energy 2016, 41, 16129–16134.
21. Chen, Z.; Yang, M.; Zhu, T.; Zhang, Z.; Chen, X.; Liu, Z.; Dong, Y.; Cheng, G.; Cheng, H. 7-ethylindole: A new efficient liquid organic hydrogen carrier with fast kinetics. Int. J. Hydrogen Energy 2018, 43, 12688–12696.
22. Yang, M.; Cheng, G.; Xie, D.; Zhu, T.; Dong, Y.; Ke, H.; Cheng, H. Study of hydrogenation and dehydrogenation of 1-methylindole for reversible onboard hydrogen storage application. Int. J. Hydrogen Energy 2018, 43, 8868–8876.
23. Yang, M.; Xing, X.; Zhu, T.; Chen, X.; Dong, Y.; Cheng, H. Fast hydrogenation kinetics of acridine as a candidate of liquid organic hydrogen carrier family with high capacity. J. Energy Chem. 2020, 41, 115–119.
24. Brückner, N.; Obesser, K.; Bösmann, A.; Teichmann, D.; Arlt, W.; Dungs, J.; Wasserscheid, P. Evaluation of Industrially applied heat-transfer fluids as liquid organic hydrogen carrier systems. ChemSusChem 2014, 7, 229–235.
25. Modisha, P.M.; Jordaan, J.H.; Bösmann, A.; Wasserscheid, P.; Bessarabov, D. Analysis of reaction mixtures of perhydro-dibenzyltoluene using two-dimensional gas chromatography and single quadrupole gas chromatography. Int. J. Hydrogen Energy 2018, 43, 5620–5636.
26. Markiewicz, M.; Zhang, Y.Q.; Bösmann, A.; Brückner, N.; Thöming, J.; Wasserscheid, P.; Stolte, S. Environmental and health impact assessment of liquid organic hydrogen carrier systems–challenges and preliminary results. Energy Environ. Sci. 2015, 8, 1035–1045.
27. Heller, A.; Rausch, M.H.; Schulz, P.S.; Wasserscheid, P.; Fröba, A.P. Binary diffusion coefficients of the liquid organic hydrogen carrier system dibenzyltoluene/perhydrodibenzyltoluene. J. Chem. Eng. Data 2016, 61, 504–511.
28. Leinweber, A.; Müller, K. Hydrogenation of the liquid organic hydrogen carrier compound monobenzyl toluene: Reaction pathway and kinetic effects. Energy Technol. 2018, 6, 513–520.
29. Müller, K.; Stark, K.; Emel’yanenko, V.N.; Varfolomeev, M.A.; Zaitsau, D.H.; Shoifet, E.; Schick, C.; Verevkin, S.P.; Arlt, W. Liquid organic hydrogen carriers: Thermophysical and thermochemical studies of benzyl-and dibenzyl-toluene derivatives. Ind. Eng. Chem. Res. 2015, 54, 7967–7976.
30. Rao, P.C.; Yoon, M. Potential liquid-organic hydrogen carrier systems: A review on recent progress. Energies 2020, 13, 6040.
31. Ali, A.; Rohini, A.K.; Noh, Y.S.; Moon, D.J.; Lee, H.J. Hydrogenation of dibenzyltoluene and the catalytic performance of Pt/Al₂O₃ with various Pt loadings for hydrogen production from perhydro-dibenzyltoluene. Int. J. Energy Res. 2022, 46, 6672–6688.
32. Shi, L.; Qi, S.; Qu, J.; Che, T.; Yi, C.; Yang, B. Integration of hydrogenation and dehydrogenation based on dibenzyltoluene as liquid organic hydrogen energy carrier. Int. J. Hydrogen Energy 2019, 44, 5345–5354.
33. Greeley, J.; Jaramillo, T.F.; Bonde, J.; Chorkendorff, I.B.; Nørskov, J.K. Computational high-throughput screening of electrocatalytic materials for hydrogen evolution. Nat. Mater. 2006, 5, 909–913.
34. Hong, W.T.; Welsch, R.E.; Shao-Horn, Y. Descriptors of oxygen-evolution activity for oxides: A statistical evaluation. J. Phys. Chem. C 2016, 120, 78–86.
35. Pilania, G.; Mannodi-Kanakkithodi, A.; Uberuaga, B.P.; Ramprasad, R.; Gubernatis, J.E.; Lookman, T. Machine learning bandgaps of double perovskites. Sci. Rep. 2016, 6, 19375.
36. Pilania, G.; Balachandran, P.V.; Kim, C.; Lookman, T. Finding new perovskite halides via machine learning. Front. Mater. 2016, 19, 23–29.
37. Pilania, G.; Balachandran, P.V.; Gubernatis, J.E.; Lookman, T. Classification of ABO₃ perovskite solids: A machine learning study. Acta Crystallogr. Sect. B Struct. Sci. Cryst. Eng. Mater. 2015, 71, 507–513.
38. Balachandran, P.V.; Broderick, S.R.; Rajan, K. Identifying the ‘inorganic gene’ for high-temperature piezoelectric perovskites through statistical learning. Proc. R. Soc. A Math. Phys. Eng. Sci. 2011, 467, 2271–2290.
39. Sparks, T.D.; Gaultois, M.W.; Oliynyk, A.; Brgoch, J.; Meredig, B. Data mining our way to the next generation of thermoelectrics. Scr. Mater. 2016, 111, 10–15.
40. Yan, J.; Gorai, P.; Ortiz, B.; Miller, S.; Barnett, S.A.; Mason, T.; Stevanović, V.; Toberer, E.S. Material descriptors for predicting thermoelectric performance. Energy Environ. Sci. 2015, 8, 983–994.
41. Seshadri, R.; Sparks, T.D. Perspective: Interactive material property databases through aggregation of literature data. APL Mater. 2016, 4, 053206.
42. Oliynyk, A.O.; Antono, E.; Sparks, T.D.; Ghadbeigi, L.; Gaultois, M.W.; Meredig, B.; Mar, A. High-throughput machine-learning-driven synthesis of full-Heusler compounds. Chem. Mater. 2016, 28, 7324–7331.
43. Rahnama, A.; Clark, S.; Sridhar, S. Machine learning for predicting occurrence of interphase precipitation in HSLA steels. Comput. Mater. Sci. 2018, 154, 169–177.
44. Wilmer, C.E.; Leaf, M.; Lee, C.Y.; Farha, O.K.; Hauser, B.G.; Hupp, J.T.; Snurr, R.Q. Large-scale screening of hypothetical metal–organic frameworks. Nat. Chem. 2012, 4, 83–89.
45. Lin, L.C.; Berger, A.H.; Martin, R.L.; Kim, J.; Swisher, J.A.; Jariwala, K.; Rycroft, C.H.; Bhown, A.S.; Deem, M.W.; Haranczyk, M.; et al. In silico screening of carbon-capture materials. Nat. Mater. 2012, 11, 633–641.
46. Gómez-Bombarelli, R.; Aguilera-Iparraguirre, J.; Hirzel, T.D.; Duvenaud, D.; Maclaurin, D.; Blood-Forsythe, M.A.; Chae, H.S.; Einzinger, M.; Ha, D.G.; Wu, T.; et al. Design of efficient molecular organic light-emitting diodes by a high-throughput virtual screening and experimental approach. Nat. Mater. 2016, 15, 1120–1127.
47. Kim, E.; Huang, K.; Tomala, A.; Matthews, S.; Strubell, E.; Saunders, A.; McCallum, A.; Olivetti, E. Machine-learned and codified synthesis parameters of oxide materials. Sci. Data 2017, 4, 170127.
48. Sumpter, B.G.; Vasudevan, R.K.; Potok, T.; Kalinin, S.V. A bridge for accelerating materials by design. NPJ Comput. Mater. 2015, 1, 15008.
49. Kalinin, S.V.; Sumpter, B.G.; Archibald, R.K. Big–deep–smart data in imaging for guiding materials design. Nat. Mater. 2015, 14, 973–980.
50. Kim, E.; Huang, K.; Jegelka, S.; Olivetti, E. Virtual screening of inorganic materials synthesis parameters with deep learning. NPJ Comput. Mater. 2017, 3, 53.
51. Dashti, A.; Harami, H.R.; Rezakazemi, M. Accurate prediction of solubility of gases within H₂-selective nanocomposite membranes using committee machine intelligent system. Int. J. Hydrogen Energy 2018, 43, 6614–6624.
52. Rezakazemi, M.; Dashti, A.; Asghari, M.; Shirazian, S. H₂-selective mixed matrix membranes modeling using ANFIS, PSO-ANFIS, GA-ANFIS. Int. J. Hydrogen Energy 2017, 42, 15211–15225.
53. Rezakazemi, M.; Azarafza, A.; Dashti, A.; Shirazian, S. Development of hybrid models for prediction of gas permeation through FS/POSS/PDMS nanocomposite membranes. Int. J. Hydrogen Energy 2018, 43, 17283–17294.
54. Rahnama, A.; Zepon, G.; Sridhar, S. Machine learning based prediction of metal hydrides for hydrogen storage, part I: Prediction of hydrogen weight percent. Int. J. Hydrogen Energy 2019, 44, 7337–7344.
55. Rahnama, A.; Zepon, G.; Sridhar, S. Machine learning based prediction of metal hydrides for hydrogen storage, part II: Prediction of material class. Int. J. Hydrogen Energy 2019, 44, 7345–7353.
56. Zhang, Y.; Wang, Y.; Zhou, G.; Jin, J.; Wang, B.; Wang, X.; Cichocki, A. Multi-kernel extreme learning machine for EEG classification in brain-computer interfaces. Expert. Syst. Appl. 2018, 96, 302–310.
57. Jain, V.; Merchant, A.; Roy, S.; Ford, J.B. Developing an emic scale to measure ad-evoked nostalgia in a collectivist emerging market. J. Bus. Res. 2019, 99, 140–156.
58. Rahman, A.U.; Sultan, K.; Naseer, I.; Majeed, R.; Musleh, D.; Gollapalli, M.A.S.; Chabani, S.; Ibrahim, N.; Siddiqui, S.Y.; Khan, M.A. Supervised machine learning-based prediction of COVID-19. Comput. Mater. Contin. 2021, 69, 21–34.
59. Khan, M.A.; Abu-Khadrah, A.; Siddiqui, S.Y.; Ghazal, T.M.; Faiz, T.; Ahmad, M.; Lee, S.W. Support-vector-machine-based adaptive scheduling in mode 4 communication. Comput. Mater. Contin. 2022, 73, 3319–3331.
60. Tahir, A.; Asif, M.; Ahmad, M.B.; Mahmood, T.; Khan, M.A.; Ali, M. Brain Tumor Detection using Decision-Based Fusion Empowered with Fuzzy Logic. Math. Probl. Eng. 2022, 2022, 2710285.
61. Abidi, W.U.H.; Daoud, M.S.; Ihnaini, B.; Khan, M.A.; Alyas, T.; Fatima, A.; Ahmad, M. Real-time shill bidding fraud detection empowered with fussed machine learning. IEEE Access 2021, 9, 113612–113621.
62. Nadeem, M.W.; Goh, H.G.; Ponnusamy, V.; Andonovic, I.; Khan, M.A.; Hussain, M. A fusion-based machine learning approach for the prediction of the onset of diabetes. Healthcare 2021, 9, 1393.
63. Ata, A.; Khan, M.A.; Abbas, S.; Khan, M.S.; Ahmad, G. Adaptive IoT empowered smart road traffic congestion control system using supervised machine learning algorithm. Comput. J. 2021, 64, 1672–1679.
64. Choi, H.; Ali, A.; Khan, M.A.; Abbas, N. Prediction of hydrogen storage in dibenzyltoluene empowered with machine learning. J. Energy Storage 2022, 55, 105844.
65. Ali, A.; Khan, M.A.; Choi, H. Hydrogen Storage Prediction in Dibenzyltoluene as Liquid Organic Hydrogen Carrier Empowered with Weighted Federated Machine Learning. Mathematics 2022, 10, 3846.
66. Thornton, A.W.; Simon, C.M.; Kim, J.; Kwon, O.; Deeg, K.S.; Konstas, K.; Pas, S.J.; Hill, M.R.; Winkler, D.A.; Haranczyk, M.; et al. Materials genome in action: Identifying the performance limits of physical hydrogen storage. Chem. Mater. 2017, 29, 2844–2854.
67. Bucior, B.J.; Bobbitt, N.S.; Islamoglu, T.; Goswami, S.; Gopalan, A.; Yildirim, T.; Farha, O.K.; Bagheri, N.; Snurr, R.Q. Energy-based descriptors to rapidly predict hydrogen storage in metal–organic frameworks. Mol. Syst. Des. Eng. 2019, 4, 162–174.
68. Hastie, T.; Tibshirani, R.; Friedman, J.H. The Elements of Statistical Learning: Data Mining, Inference, and Prediction; Springer: New York, NY, USA, 2009; Volume 2, pp. 1–758.

Figure 1. The proposed HSP-SVM model for hydrogen storage prediction.

Figure 2. Receiver operating characteristic curves using the 5-FCV, RV, and HV approaches for (a) low class, (b) medium class, and (c) high class.

Figure 3. (a–c) Comparison of statistical parameters for the 5-FCV, RV, and HV approaches.

Figure 4. The overall accuracy and MCR of 5-FCV, RV, and HV for hydrogen storage prediction using the proposed HSP-SVM model.

Figure 5. Comparison of the statistical parameters for the proposed HSP-SVM with the LM and WFML models.

Table 2. (a–c). Confusion matrix of the proposed SVM model using 5-Fold Cross Validation and Resubstitution Validation: (a) low class, (b) medium class, (c) high class.

Parameters	Predicted Classes
Parameters	(a) Low Class	(b) Medium Class	(c) High Class
True Positive (TP)	39,725	57,787	46,375
False Negative (FN)	7528	0	1773
False Positive (FP)	0	9301	0
True Negative (TN)	105,935	84,300	103,240

Table 3. (a–c). Confusion matrix of the proposed SVM model using Holdout Validation: (a) low class, (b) medium class, (c) high class.

Parameters	Predicted Classes
Parameters	(a) Low Class	(b) Medium Class	(c) High Class
True Positive (TP)	35,033	48,120	37,093
False Negative (FN)	1860	0	1894
False Positive (FP)	0	3754	0
True Negative (TN)	87,107	73,126	86,013

Table 4. Comparison of the proposed HSP-SVM model using the 5-FCV, RV, and HV approaches in terms of various statistical parameters.

Evaluation Parameters	5-Fold Cross Validation and Resubstitution Validation			Holdout Validation
Evaluation Parameters	Low Class	Medium Class	High Class	Low Class	Medium Class	High Class
Accuracy	95.0%	93.8%	98.8%	98.5%	97.0%	98.5%
Misclassification rate (MCR)	5.00%	6.15%	1.20%	1.50%	3.00%	1.50%
Selectivity	100%	90.1%	100%	100%	95.1%	100%
Recall/Sensitivity	83.4%	100%	96.3%	95.1%	100%	95.1%
Precision	100%	86.1%	100%	100%	92.8%	100%
F₁ Score	90.9%	92.5%	98.1%	97.5%	96.2%	97.5%
False positive rate	0%	9.90%	0%	0%	4.90%	0%
False discovery rate	0%	13.9%	0%	0%	7.20%	0%
False omission rate	6.60%	0%	1.70%	2.10%	0%	2.15%
Negative Predictive Value	93.4%	100%	98.3%	97.9%	100%	97.8%

Table 5. Comparison of the current study with previously published studies.

Studies	Year	Storage System	Model	Accuracy
Thornton et al. [66]	2017	Nanoporous materials	Neural Network	88.0%
Rahnama et al. [54]	2019	Metal hydrides	Boosted decision tree regression	83.0%
Rahnama et al. [55]	2019	Metal hydrides	Multiclass neural network	80.0%
Bucior et al. [67]	2019	Metal organic frameworks	Multilinear regression with LASSO [68]	96.0%
Choi et al. [64]	2022	LOHC	Levenberg–Marquardt	94.9%
Ali et al. [65]	2022	LOHC	HSPS-WFML	96.4%
Ali et al.	Current Study	LOHC	HSP-SVM	97.0%

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
