Review

Machine Learning Applications in Gray, Blue, and Green Hydrogen Production: A Comprehensive Review

1 Department of Petroleum Engineering, Cullen College of Engineering, University of Houston, Houston, TX 77204, USA
2 College of Natural Sciences, The University of Texas at Austin, Austin, TX 78712, USA
3 C&C Reservoirs, Houston, TX 77040, USA
* Author to whom correspondence should be addressed.
Submission received: 31 March 2025 / Revised: 3 May 2025 / Accepted: 15 May 2025 / Published: 17 May 2025

Abstract

Hydrogen is increasingly recognized as a key contributor to a low-carbon energy future, and machine learning (ML) is emerging as a valuable tool to optimize hydrogen production processes. This review presents a comprehensive analysis of ML applications across various hydrogen production pathways, including gray, blue, and green hydrogen, with additional insights into pink, turquoise, white, and black/brown hydrogen. A total of 51 peer-reviewed studies published between 2012 and 2025 were systematically reviewed. Among these, green hydrogen—particularly via water electrolysis and biomass gasification—received the most attention, reflecting its central role in decarbonization strategies. ML algorithms such as artificial neural networks (ANNs), random forest (RF), and gradient boosting regression (GBR) have been widely applied to predict hydrogen yield, optimize operational conditions, reduce emissions, and improve process efficiency. Despite promising results, real-world deployment remains limited due to data sparsity, model integration challenges, and economic barriers. Nonetheless, this review identifies significant opportunities for ML to accelerate innovation across the hydrogen value chain. By highlighting trends, key methodologies, and current gaps, this study offers strategic guidance for future research and development in intelligent hydrogen systems aimed at achieving sustainable and cost-effective energy solutions.

1. Introduction

1.1. Background of Hydrogen Production

Sustainable development and a high quality of life rely on clean, safe, and reliable energy supplies. However, meeting this growing demand—driven by population and economic growth—places increasing pressure on fossil fuels, which remain dominant but contribute significantly to greenhouse gas emissions and resource depletion. These challenges underscore the urgent need to transition to renewable energy sources. According to the IEA (International Energy Agency) report, the share of fossil fuels in the global energy mix has gradually decreased over the last decade, from 82% in 2013 to 80% in 2023. Energy demand has increased by 15% over this period, and 40% of this growth (Figure 1) has been met by clean energy, i.e., renewables in the power and end-use sectors, nuclear energy, and low-emission fuels, including carbon capture, utilization, and storage (CCUS) [1].
Hydrogen is gaining global recognition as a versatile energy carrier, extending beyond its traditional applications. Unlike synthetic carbon-based fuels, it offers the potential to be truly carbon-neutral, or even carbon-negative, throughout its entire life cycle. One of its primary advantages lies in its versatility, as it can be used across multiple sectors, including transportation, industrial heating, power generation, and energy storage [2]. Additionally, hydrogen exhibits a high gravimetric energy density, making it a suitable alternative to conventional fossil fuels, particularly in hard-to-abate industries such as steelmaking, cement production, and long-haul transportation. When produced using renewable energy, for example via electrolysis powered by wind or solar, hydrogen generates no direct carbon emissions, contributing to global decarbonization efforts and supporting net-zero targets [3]. Furthermore, hydrogen can be stored for extended periods and transported via pipelines, liquid carriers, or ammonia conversion, enhancing energy security and grid flexibility by compensating for the intermittency of renewable power sources. In fuel cell applications, hydrogen produces only water and heat as byproducts, making it an environmentally sustainable solution for both stationary and mobile energy systems [4]. Moreover, emerging pathways such as blue and turquoise hydrogen offer transitional solutions by reducing carbon emissions through CCUS or methane pyrolysis [5]. As production, storage, and distribution technologies continue to advance, hydrogen is expected to play a crucial role in sustainable energy systems, industrial transformation, and global efforts to combat climate change.
Global hydrogen production reached 97 Mt in 2023, an increase of almost 2.5% compared to 2022 [6], and is expected to rise to at least 105 Mt by 2030 (Figure 2). Hydrogen is used across a wide range of industries, including steel and fertilizer production. In 2023, China was the largest consumer, accounting for 29% of total usage. North America and the Middle East followed, representing 16% and 14% of global hydrogen consumption, respectively. Other regions, including India (9%), Europe (8%), and the rest of the world (24%), accounted for the remaining share, reflecting the global diversification of hydrogen demand [6].
Hydrogen production can be categorized into several types based on the feedstock and carbon emissions associated with each process, as shown in Figure 3 [7]. Gray hydrogen, the most common form, is produced from natural gas through steam methane reforming (SMR) or autothermal reforming (ATR), releasing significant amounts of CO2 into the atmosphere. Blue hydrogen follows a similar production pathway but incorporates CCUS to reduce emissions, making it a lower-carbon alternative. Black or brown hydrogen, derived from coal gasification, is among the most carbon-intensive forms due to the high emissions generated during the process. In contrast, green hydrogen is produced via water electrolysis using renewable electricity (e.g., wind, solar, hydro), emitting no direct CO2 and producing only oxygen and water as byproducts, making it the most sustainable option. Pink hydrogen, another low-emission alternative, is also generated through electrolysis but powered by nuclear energy, ensuring minimal carbon emissions. Turquoise hydrogen is obtained through methane pyrolysis, which splits methane into solid carbon and hydrogen, resulting in lower emissions than blue hydrogen while offering potential carbon storage in solid form. Lastly, white hydrogen refers to naturally occurring hydrogen found in geological formations, which, if extracted properly, can be a zero-emission resource [8]. The choice of production method significantly influences the viability of hydrogen as a clean energy carrier and its role in decarbonizing industrial processes, transportation, and energy systems. Among these methods, three routes are the main focus of this review: gray, blue, and green hydrogen.
Figure 4 illustrates the carbon intensity of various hydrogen production methods, expressed in kg CO2 equivalent per kg of H2 produced. Green hydrogen pathways—including wind-, hydro-, and solar-powered electrolysis—exhibit the lowest carbon intensities, ranging from 0.4 to 1.6 kg CO2/kg H2, with biomass gasification potentially achieving negative emissions. Blue hydrogen, produced via fossil fuel reforming with carbon capture (e.g., ATR and SMR), shows moderate emissions (2.8–7.0 kg CO2/kg H2), while coal-based routes remain significantly higher at 11.8. Gray, brown, and black hydrogen represent the most carbon-intensive methods, exceeding 9 kg CO2/kg H2. In contrast, nuclear-derived pink hydrogen has low emissions (0.4), and turquoise hydrogen—produced via methane pyrolysis—falls between 1.9 and 4.8. The value for white hydrogen is shown as 0.0 and marked as assumed, reflecting the current lack of standardized data for naturally occurring H2.

1.2. Incorporating Machine Learning with Hydrogen Production

Machine learning (ML) is a branch of artificial intelligence that enables computers to analyze vast datasets, identify patterns, and make data-driven predictions or decisions with minimal human intervention. One of the key benefits of ML is its ability to optimize complex systems, improve efficiency, and reduce operational costs by continuously learning from data and adjusting processes in real time [10,11]. As a result, ML has been widely applied in various industries, including healthcare, finance, manufacturing, and renewable energy [11], to enhance productivity, predictive maintenance, and decision-making. In the field of hydrogen production, ML offers significant advantages by improving process efficiency, reducing energy consumption, and optimizing system performance. For example, Shahin and Simjoo [12] demonstrated the practical capabilities of ChatGPT-4 through four detailed case studies, each addressing a critical aspect of hydrogen energy development. Kwon et al. [13] used ML methods to predict hydrogen demand from 2020 to 2030 with an R² value of 0.9936. An increasing number of papers have been published that focus on applying ML in hydrogen production. Figure 5 shows the most frequently used keywords for ML applications in hydrogen production (based on co-occurrence analysis with “all keywords” as a unit of analysis using the VOSviewer tool, version 1.6.20). ML can be employed in various aspects of hydrogen production. In electrolysis, ML algorithms can be used to enhance reaction efficiency, electrode material selection, and power usage, leading to increased hydrogen yield with minimal energy loss. In SMR and coal gasification, ML can help optimize reaction conditions, reduce carbon emissions, and predict equipment failures, ensuring more sustainable and cost-effective hydrogen production.
Additionally, ML-driven models can facilitate the integration of green hydrogen into the energy grid by forecasting renewable energy availability and dynamically adjusting electrolysis operations. By utilizing advanced data analytics and automation, ML plays a crucial role in advancing hydrogen production technologies, reducing costs, and accelerating the transition to a cleaner energy future.

1.3. Motivation of This Review

To the best of our knowledge, in-depth technical reviews of ML within blue, green, and gray hydrogen production are limited. Most existing reviews focus on the production processes themselves [2,3,4,5,14]. Davies et al. [7] reviewed ML applications for blue hydrogen only. Alagumalai et al. [15] and Sharma et al. [16] summarized the ML applications for biohydrogen only, which is one aspect of green hydrogen. Bassey et al. [17] presented a review of recent ML applications on green hydrogen with a main focus on water electrolysis. Allal et al. [18] provided comprehensive coverage of ML applications in hydrogen energy systems, but their review does not differentiate how these techniques are applied across the various hydrogen classifications (e.g., green, blue, gray), which limits insights into color-specific optimization strategies. With hydrogen being identified as a key energy carrier for achieving a low-carbon economy, it is essential to further develop and optimize novel low-carbon hydrogen technologies and explore how ML can be utilized in their development and deployment. This work provides a comprehensive technical review of the literature on ML within hydrogen production, focusing particularly on blue, green, and gray hydrogen, as well as a general overview of pink, turquoise, black, and white hydrogen. Other hydrogen production methods, such as solar-driven hydrogen production with biomass as a feedstock [19], converting plastic waste into clean hydrogen via gasification [20], and municipal sludge gasification-based hydrogen production [21,22], are not included here. This review aims to offer insights into how ML can address common limitations of conventional process modeling techniques and support its integration into intelligent monitoring and control systems.
This review adopts a narrative and technical synthesis approach to evaluate ML applications in hydrogen production. A structured literature search was conducted using the Scopus database (primary) and Google Scholar (supplementary), covering the period from 2012 to 2025. The following Boolean search query was used: (“machine learning”) AND (“hydrogen production”) AND (“gray” OR “blue” OR “green” OR “electrolysis” OR “gasification”).
A total of 172 articles were initially identified. After the removal of duplicates and screening for relevance, 51 peer-reviewed studies were included based on the following criteria: (1) application of ML to hydrogen production processes, (2) focus on specific hydrogen types (gray, blue, green, etc.), and (3) availability of methodological detail. Review articles, non-ML studies, and inaccessible or off-topic papers were excluded.
The remainder of this paper is organized as follows: Section 2 reviews common ML methods and algorithms relevant to hydrogen production. Section 3 through Section 6 systematically explore the role of ML in gray, blue, and green hydrogen production, as well as in other hydrogen types, including pink, turquoise, white, and black. Section 7 highlights the benefits and limitations of ML techniques in hydrogen production and discusses emerging challenges and opportunities. Finally, Section 8 concludes this review and offers perspectives on future research directions in ML-driven hydrogen energy systems.

2. Overview of Machine Learning

Machine learning (ML) enables computers to learn patterns from data and make decisions or predictions without explicit programming. Unlike traditional rule-based programming, where outcomes depend on predefined instructions, ML models analyze large datasets, identify correlations, and improve their performance over time. This data-driven approach has led to breakthroughs in automation, data analysis, and decision-making across various fields. ML is particularly valuable in complex problem-solving, where traditional methods struggle due to large-scale data, nonlinear relationships, and the need for real-time adaptation.

2.1. Brief History of ML

The roots of ML can be traced back to the mid-20th century, with the development of foundational mathematical theories and early computational models. In 1950, Alan Turing introduced the Turing Test, which laid the groundwork for AI research. One of the earliest ML algorithms, the Perceptron, was developed in 1958 by Frank Rosenblatt, an early milestone in the development of artificial neural networks (ANNs) [23,24]. During the 1980s and 1990s, advances in computational power and algorithmic improvements, such as the backpropagation algorithm, enabled the resurgence of neural networks. From the 2000s onward, the rise of big data and deep learning models, particularly architectures such as convolutional neural networks (CNNs) and recurrent neural networks (RNNs), revolutionized applications in image recognition, natural language processing, and autonomous systems. Today, ML is an essential part of scientific research, industry automation, and technological innovation, with continuous advancements in quantum computing, reinforcement learning, and generative AI models.

2.2. Categories of ML

ML algorithms are broadly categorized into three main types, each distinguished by how the model learns from data [25,26]:
  • Supervised learning: In this approach, models are trained on labeled data, meaning each input is paired with the correct output. The algorithm learns by minimizing errors and improving accuracy through iterative training. Examples include linear regression, decision trees, and neural networks, which are commonly used in applications such as fraud detection, medical diagnosis, and stock price prediction.
  • Unsupervised learning: This method deals with unlabeled data, where the algorithm identifies hidden patterns or structures without explicit output labels. Clustering and association rule learning are common techniques with applications in customer segmentation, anomaly detection, and market analysis. Examples include k-means clustering and principal component analysis (PCA).
  • Reinforcement learning: Unlike the previous categories, reinforcement learning (RL) is based on reward-based learning, where an agent interacts with an environment and learns through trial and error to maximize cumulative rewards. RL is widely used in robotics, gaming (e.g., AlphaGo), and autonomous systems. Algorithms such as Q-learning and deep Q-networks (DQNs) power advanced decision-making systems.
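To make the first two categories concrete, the following is a minimal sketch in plain Python, not taken from any of the reviewed studies. The function names `fit_line` and `kmeans_1d` and the toy numbers are illustrative assumptions: the first function learns from labeled pairs (supervised), while the second groups unlabeled points (unsupervised).

```python
from statistics import mean


def fit_line(xs, ys):
    """Supervised learning: ordinary least squares fit of y = a*x + b
    from labeled (x, y) pairs."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    a = sxy / sxx
    return a, y_bar - a * x_bar


def kmeans_1d(points, k=2, iters=20):
    """Unsupervised learning: Lloyd's k-means on unlabeled 1-D points.
    Returns the cluster centers in ascending order."""
    srt = sorted(points)
    # Deterministic initialization: spread the starting centers over the sorted data.
    centers = [srt[(2 * i + 1) * len(srt) // (2 * k)] for i in range(k)]
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda j: abs(p - centers[j]))
            clusters[nearest].append(p)
        # Keep the old center if a cluster ends up empty.
        centers = [mean(c) if c else centers[j] for j, c in enumerate(clusters)]
    return sorted(centers)


if __name__ == "__main__":
    # Toy, made-up numbers used only to exercise the functions.
    print(fit_line([0, 1, 2, 3, 4], [1, 3, 5, 7, 9]))
    print(kmeans_1d([1.0, 1.2, 0.8, 5.0, 5.2, 4.8]))
```

Reinforcement learning is omitted here because even a small example requires defining an environment and a reward signal.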

2.3. Common ML Algorithms

A wide range of ML algorithms exists, each suited to different types of tasks [11,26,27]. Table 1 summarizes some of the common ML techniques currently in use, along with their typical applications, advantages, and limitations. The success of these techniques across diverse applications has motivated their use in hydrogen production modeling, simulation, and optimization.
ML has demonstrated remarkable success across various domains, such as finance, biology, geosciences, healthcare analytics, materials science, and engineering [28,29,30,31,32], due to its capability to model complex, nonlinear relationships within large datasets. Its ability to uncover hidden patterns and optimize predictive accuracy makes it a powerful tool for addressing intricate problems. Given these advantages, the application of ML in forecasting hydrogen production is expected to gain increasing attention, emerging as a pivotal trend in the pursuit of efficient and sustainable energy solutions.

3. Blue Hydrogen Production and ML Applications

3.1. Blue Hydrogen Production

Blue hydrogen has gained significant attention as a viable alternative for large-scale hydrogen production, particularly in industrial manufacturing, transportation, and power generation sectors. It is a low-carbon hydrogen production method that relies on fossil fuels while incorporating CCUS to reduce greenhouse gas emissions. Blue hydrogen is primarily produced through SMR or ATR of natural gas, as well as coal gasification, with carbon capture systems preventing most of the CO2 emissions from entering the atmosphere. The details of each process are introduced below.

3.1.1. Steam Methane Reforming (SMR)

SMR was first industrially implemented in 1936 at the Billingham site in the United Kingdom, as a result of collaborative efforts and technological advancements in the early 20th century [33]. The process was developed to meet industrial demand for hydrogen, for example in ammonia production for fertilizers, using natural gas and high-temperature steam to produce hydrogen and carbon dioxide. The “blue” variant emerged later, in the 2000s, when growing climate concerns led engineers to pair SMR with CCUS technology.
Nowadays, SMR is the most widely used process for large-scale hydrogen production, particularly in industrial applications. It involves the reaction of methane (CH4), the primary component of natural gas, with steam (H2O) at high temperatures, typically ranging between 700 and 1000 °C [34], in the presence of a nickel-based catalyst. The reaction takes place in a steam reformer, which consists of catalyst-filled tubes heated externally by burning natural gas. Figure 6 shows a simplified process of SMR.
The primary reaction converts methane and steam into hydrogen and carbon monoxide through the following endothermic reaction:
CH4 + H2O→CO + 3H2
Since this reaction requires heat input, it is conducted in high-temperature reactors where the necessary heat is supplied externally. The produced carbon monoxide undergoes a subsequent process called the water–gas shift reaction (WGSR), where it reacts with additional steam to produce more hydrogen and CO2:
CO + H2O→CO2 + H2
This two-step reaction process—methane reforming followed by the water–gas shift reaction—maximizes the hydrogen yield while generating CO2 as a byproduct. In traditional gray hydrogen production, this CO2 is released into the atmosphere, significantly contributing to greenhouse gas emissions. However, in blue hydrogen production, up to 90% of the CO2 emissions are captured, preventing their release and making the hydrogen production process significantly cleaner [35].
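Adding the two reactions gives the overall stoichiometry CH4 + 2H2O→CO2 + 4H2, which sets a lower bound on the CO2 formed per kilogram of hydrogen. The sketch below works through that arithmetic. It assumes complete conversion and counts only the CO2 formed in the reactions themselves; CO2 from burning fuel to heat the reformer and upstream emissions are excluded, so real plant and life-cycle values are higher. The function name and the `capture_fraction` parameter are illustrative.

```python
M_CO2 = 44.01  # molar mass of CO2, g/mol
M_H2 = 2.016   # molar mass of H2, g/mol


def process_co2_per_kg_h2(capture_fraction=0.0):
    """kg of reaction CO2 released per kg of H2 for the overall reaction
    CH4 + 2 H2O -> CO2 + 4 H2 (1 mol CO2 per 4 mol H2).

    capture_fraction is the share of that CO2 captured (0.0 for gray
    hydrogen; 0.9 corresponds to the 'up to 90%' capture quoted above).
    Assumes complete conversion and ignores furnace fuel and upstream CO2.
    """
    if not 0.0 <= capture_fraction <= 1.0:
        raise ValueError("capture_fraction must be between 0 and 1")
    stoichiometric = (1 * M_CO2) / (4 * M_H2)
    return stoichiometric * (1.0 - capture_fraction)


if __name__ == "__main__":
    print(round(process_co2_per_kg_h2(), 2))     # without capture
    print(round(process_co2_per_kg_h2(0.9), 2))  # with 90% capture
```

Under these assumptions the reaction CO2 alone is about 5.5 kg per kg of H2 without capture and about 0.55 kg with 90% capture.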

3.1.2. Autothermal Reforming (ATR)

Autothermal reforming (ATR) is a hydrogen production process that combines elements of SMR and partial oxidation (POX) to convert natural gas (methane) into hydrogen, carbon monoxide, and carbon dioxide. Unlike SMR, which requires an external heat source, ATR is self-sustaining, generating the necessary heat through a controlled oxidation reaction [36]. This makes it particularly suitable for large-scale hydrogen production with CCUS, positioning ATR as a key technology for blue hydrogen development.
ATR technology has evolved alongside other hydrogen and syngas production methods. The concept of POX of hydrocarbons dates back to the early 20th century, with significant industrial developments occurring in the 1950s and 1960s. Early ATR designs were primarily used for syngas (CO + H2) production in the chemical and petrochemical industries, particularly for ammonia and methanol synthesis. In the 1990s and 2000s, the increasing demand for low-carbon hydrogen and the advancement of carbon capture technologies led to ATR being considered as a viable alternative to SMR for blue hydrogen production. Today, ATR is gaining traction in projects where integrated CCUS solutions are prioritized, as it offers a more efficient carbon capture process compared to SMR.
ATR operates in a single high-pressure reactor where methane reacts with oxygen or air and steam (H2O) to produce hydrogen-rich syngas. Figure 7 shows a simplified process of ATR.
The process consists of three main steps:
  1. POX reaction: Methane is partially oxidized using pure oxygen or air, producing carbon monoxide and hydrogen while releasing heat:
CH4 + 1/2O2→CO + 2H2
This reaction is exothermic, meaning it generates heat, allowing the process to be self-sustaining.
  2. Steam reforming reaction: The remaining methane reacts with steam at high temperatures (900–1100 °C) in the presence of a nickel-based catalyst, producing more hydrogen and carbon monoxide:
CH4 + H2O→CO + 3H2
This reaction is endothermic, meaning it absorbs heat, balancing the overall energy requirement of the ATR reactor.
  3. Water–gas shift reaction (WGSR): The carbon monoxide from the first two reactions undergoes a water–gas shift reaction, where it reacts with steam to produce carbon dioxide and additional hydrogen:
CO + H2O→CO2 + H2
The resulting CO2 is captured using CCUS, and the remaining hydrogen is purified for industrial applications.
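The balance between the exothermic and endothermic steps can be illustrated with a rough enthalpy calculation. The sketch below is an idealized estimate, not a reactor model: it assumes textbook standard reaction enthalpies at 298 K (about −36, +206, and −41 kJ/mol for POX, steam reforming, and WGSR, respectively), complete conversion, and no allowance for preheating the feeds or for heat losses. The function names and the `shift_fraction` parameter are illustrative.

```python
# Standard reaction enthalpies at 298 K in kJ per mol (assumed textbook values).
DH_POX = -36.0   # CH4 + 1/2 O2 -> CO + 2 H2   (exothermic)
DH_SMR = 206.0   # CH4 + H2O   -> CO + 3 H2    (endothermic)
DH_WGS = -41.0   # CO + H2O    -> CO2 + H2     (exothermic)


def net_heat_per_mol_ch4(pox_fraction, shift_fraction=1.0):
    """Net reaction enthalpy in kJ per mol of CH4 fed.

    pox_fraction: share of methane converted by partial oxidation; the
    rest is assumed to be steam-reformed. shift_fraction: share of the
    CO (1 mol per mol CH4 in both routes) that is shifted.
    A negative result means the reactions release heat overall.
    """
    if not 0.0 <= pox_fraction <= 1.0:
        raise ValueError("pox_fraction must be between 0 and 1")
    return (pox_fraction * DH_POX
            + (1.0 - pox_fraction) * DH_SMR
            + shift_fraction * DH_WGS)


def thermoneutral_pox_fraction(shift_fraction=1.0):
    """POX share at which the net reaction enthalpy is zero."""
    return (DH_SMR + shift_fraction * DH_WGS) / (DH_SMR - DH_POX)


if __name__ == "__main__":
    x = thermoneutral_pox_fraction()
    print(f"thermoneutral POX share: {x:.3f}, O2/CH4 ratio: {x / 2:.3f}")
```

Because the sketch ignores the energy needed to bring the feeds to reaction temperature, a real reactor would need a larger oxidized share than this idealized thermoneutral point suggests.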
ATR offers several advantages that make it a promising method for blue hydrogen production, particularly in applications where CCUS is integrated. One of its key benefits is its higher carbon capture efficiency, as ATR operates at higher pressures than SMR, allowing for easier and more cost-effective CO2 separation, with capture rates reaching up to 90% [37]. Additionally, ATR is a self-sustaining process, as the POX of methane generates the necessary heat, eliminating the need for external fuel combustion, thereby reducing overall CO2 emissions. The technology is also highly scalable, making it suitable for large-scale hydrogen production in refineries and industrial applications where a consistent hydrogen supply is required. However, despite these advantages, ATR has some limitations that impact its widespread adoption. A major challenge is the requirement for pure oxygen instead of air, which necessitates using energy-intensive air separation units (ASUs), resulting in increased capital and operational costs. Furthermore, ATR systems operate at higher pressures and temperatures, requiring specialized equipment and reactor designs, which adds to the complexity and cost of implementation. Additionally, compared to SMR, ATR is less established in existing hydrogen infrastructure, meaning that industries may face higher initial investments and technical challenges when adopting this technology [38]. Despite these hurdles, ATR remains a viable and efficient pathway for low-carbon hydrogen production, particularly in projects where high CO2 capture rates and large-scale hydrogen output are essential. Table 2 provides a brief comparison of ATR, SMR, and coal gasification.

3.2. ML Application for Blue Hydrogen

In blue hydrogen production, key technical challenges include optimizing carbon capture efficiency, minimizing energy consumption during SMR or ATR, and enhancing the purity and recovery of hydrogen in gas separation processes. ML models have been instrumental in predicting performance under varying operating conditions and optimizing parameters such as purge-to-feed ratios, adsorption pressures, and membrane configurations. ML-driven surrogate models enable the rapid evaluation of complex PSA and SE-SMR systems, reducing computational time and improving design optimization. Moreover, ML helps identify optimal catalyst formulations and operating strategies that enhance methane conversion and CO2 capture, thereby supporting more efficient and economically viable blue hydrogen production.
This section explores the application of ML across various aspects of hydrogen production, from material screening and development to full-scale plant optimization. Through this analysis, we highlight current implementations of ML in blue hydrogen production and identify opportunities for broader adoption. A summary of the relevant studies is presented in Table 3 and discussed in greater detail in the following sections. The table lists the ML algorithms used, input parameters, predicted outputs, and key findings. The majority of the reviewed studies focus on SMR, reflecting its industrial maturity and wide adoption. In contrast, relatively few studies target ATR. This observation is consistent with the work of Howarth and Jacobson [40], who argue that blue hydrogen may not yet offer a truly low-emission alternative to gray hydrogen due to its unresolved technical and environmental challenges.
In terms of algorithm usage, ANNs are by far the most frequently adopted technique, appearing in over 70% of the studies. Variants of ANNs, such as FFBPNN, DNN, ASNN, and hybrid models like ANN-GA or ANN-DE, demonstrate the method’s versatility and adaptability to complex nonlinear systems. While ANNs dominate the reviewed studies, their black-box nature limits interpretability, which is critical in safety-focused industries like hydrogen production. Some studies (e.g., Vo et al. [41], Yu et al. [42]) reported high R² values (>0.99), but these validations were based mainly on a single train–test split, without independent validation data or cross-validation, raising concerns about overfitting. Dataset sizes were generally small (<500 samples), largely due to experimental constraints, limiting model generalizability. For example, the studies by Tong et al. [43] and Streb and Mazzotti [45] optimized PSA processes but did not evaluate robustness across different feed gas compositions or operating ranges. Moreover, while MSE and R² were reported, their practical meaning (e.g., the impact of a 2% error on H2 purity for industrial feasibility) was rarely discussed.
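The validation concern raised above can be illustrated with k-fold cross-validation, which reports R² and MSE on several held-out folds instead of one split. The following is a minimal sketch in plain Python using a simple linear model and synthetic data generated on the spot; it does not reproduce any reviewed study, and all function names and numbers are illustrative assumptions.

```python
import random
from statistics import mean


def fit_line(xs, ys):
    """Ordinary least squares fit of y = a*x + b."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    a = sxy / sxx
    return a, y_bar - a * x_bar


def r2_and_mse(y_true, y_pred):
    """Coefficient of determination and mean squared error."""
    y_bar = mean(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - y_bar) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot, ss_res / len(y_true)


def k_fold_scores(xs, ys, k=5, seed=0):
    """Return one (R2, MSE) pair per fold, each computed on data the
    model did not see during fitting."""
    idx = list(range(len(xs)))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    scores = []
    for held_out in folds:
        held = set(held_out)
        train = [i for i in idx if i not in held]
        a, b = fit_line([xs[i] for i in train], [ys[i] for i in train])
        preds = [a * xs[i] + b for i in held_out]
        scores.append(r2_and_mse([ys[i] for i in held_out], preds))
    return scores


def make_synthetic_data(n=50, seed=1):
    """Synthetic placeholder data: a noisy straight line (not real process data)."""
    rng = random.Random(seed)
    xs = [i / 10 for i in range(n)]
    ys = [2.0 * x + 1.0 + rng.gauss(0.0, 0.2) for x in xs]
    return xs, ys


if __name__ == "__main__":
    xs, ys = make_synthetic_data()
    for fold, (r2, mse) in enumerate(k_fold_scores(xs, ys), start=1):
        print(f"fold {fold}: R2={r2:.4f}  MSE={mse:.4f}")
```

Reporting the spread of the per-fold scores, rather than a single R², gives a more honest picture of how a model trained on a few hundred samples is likely to generalize.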
Common input parameters used across studies fall into three main categories: (1) operating conditions—such as adsorption pressure, steam-to-carbon ratio, temperature, and purge-to-feed ratio; (2) material properties—including molecular descriptors, catalyst composition, and concentrations of amine solvents; (3) process variables—such as feed flow rates, membrane area, and reactor parameters. Outputs typically focus on hydrogen purity, hydrogen recovery/yield, CO2 capture efficiency, energy consumption, and economic metrics like H2 production cost.

4. Gray Hydrogen Production and ML Applications

4.1. Gray Hydrogen Production Process

Gray hydrogen represents the most prevalent form of hydrogen production in the current global energy landscape. It is primarily derived from fossil fuels (most notably natural gas) through processes that emit substantial quantities of CO2 into the atmosphere. Based on the latest data from the IEA [6], global hydrogen demand reached 97 Mt in 2023, with low-emission hydrogen production accounting for less than 1% of this total. More specifically, the large majority of hydrogen still comes from fossil fuels (83%), with 62% from gray hydrogen, followed by 19% from a combination of brown and black hydrogen, 0.7% from blue hydrogen, and only 0.04% from green hydrogen [6]. The rest was produced as a byproduct in the chemical industry. The widespread adoption of gray hydrogen can be attributed to its relatively low production cost and the well-established infrastructure supporting its generation and distribution [35]. However, despite its economic advantages, the environmental consequences of gray hydrogen production pose significant challenges to global decarbonization goals. It is estimated to be responsible for about 2% of global CO2 emissions, representing around 830 Mt of CO2 yearly [52]. As the world shifts toward a more sustainable energy future, there is increasing pressure to decarbonize gray hydrogen production, given its dominant role and high environmental impact.

4.2. ML Application for Gray Hydrogen

Gray hydrogen production, primarily from coal gasification or SMR without carbon capture, faces major challenges such as controlling syngas composition, minimizing carbon emissions, and maintaining catalyst stability. Given the significant share of gray hydrogen in global production and its considerable environmental footprint, improving the efficiency and sustainability of this process is critical for near-term decarbonization. However, optimizing gray hydrogen production is inherently complex due to the interplay of numerous operational parameters, such as feedstock quality, reaction kinetics, temperature, pressure, and catalyst performance. ML applications in gray hydrogen production have focused on modeling and predicting syngas outputs (e.g., H2, CO, CH4, CO2) based on variable feedstock properties and operating conditions. Techniques such as ANNs, GPR, and ensemble models have been used to optimize gasification parameters, maximize hydrogen yield, and reduce undesirable byproducts. In addition, ML facilitates real-time monitoring and anomaly detection in gasifier operations, enhancing system efficiency and minimizing environmental impacts. Table 4 reviews recent advancements in applying ML techniques to gray hydrogen production. Information such as ML models, input variables, target outputs, and their impact on process enhancement is also included.
Similar to the statistical summary of blue hydrogen, ANNs are the most widely used algorithm, appearing in over 60% of the studies. Variants such as ANN-GA, ANN-MLP, DNN, and hybrid models like DNN-PSO reflect the flexibility of neural architectures in capturing the nonlinear dynamics of hydrogen production systems. Other frequently employed algorithms include SVR, DT, GPR, and ensemble methods like RF and GBR. Regarding input features, studies generally use 6–12 variables that can be grouped into three categories: (1) feedstock and fuel properties—e.g., fixed carbon, volatile matter, elemental composition (C, H, O, N, S), moisture, and ash content; (2) reaction and process conditions—e.g., temperature, steam-to-carbon ratio, oxygen/air flow rate, pressure, and gasifier bed temperature; (3) economic or system-level inputs—such as compressor costs, catalyst configurations, and energy inputs. The most commonly predicted output variables include hydrogen yield, CO2 emission rate, syngas composition (H2, CO, CH4, CO2), heating value, and carbon conversion efficiency. Many models also evaluate optimization trade-offs, such as improving H2 yield while minimizing emissions or catalyst degradation.

5. Green Hydrogen Production and ML Applications

Green hydrogen is produced using renewable energy sources with zero direct CO2 emissions. Unlike gray hydrogen, which is generated from fossil fuels and emits significant amounts of CO2, green hydrogen is entirely carbon-free, making it a critical component in the transition toward sustainable energy systems. The fundamental principle behind green hydrogen production lies in the electrolysis of water, which can be achieved through different electrolyzer technologies, each offering distinct advantages in terms of efficiency, operational conditions, and scalability.

5.1. Green Hydrogen Production Process

This section discusses the primary methods of green hydrogen production, including water electrolysis, biomass gasification, photoelectrochemical (PEC) water splitting, and biological hydrogen production.

5.1.1. Water Electrolysis

The primary method for producing green hydrogen is water electrolysis, which splits water into hydrogen and oxygen using electricity. The reaction occurs in an electrolyzer and is given by the following:
2H2O(l)→2H2(g) + O2(g)
Three primary electrolyzer technologies are used for green hydrogen production: alkaline electrolysis (AEL), proton exchange membrane (PEM) electrolysis, and solid oxide electrolysis cells (SOECs). These technologies differ in operating temperature, efficiency, and material requirements. AEL is the most commercially available and widely used technology, benefiting from lower costs due to its reliance on non-precious-metal catalysts. However, it suffers from lower efficiency and slower response times. PEM electrolyzers, in contrast, provide higher efficiency and faster dynamic response, making them suitable for fluctuating renewable energy sources such as wind and solar power. SOECs, operating at high temperatures, offer the highest efficiency by utilizing thermal energy to reduce electrical input requirements, but they face challenges in material degradation and high capital costs [66,67]. Table 5 provides a comparison among different electrolyzers.
Water electrolysis for green hydrogen production relies on electricity generated from renewable sources such as solar photovoltaic (PV) systems, wind turbines, or hydropower plants. These clean energy inputs ensure that the hydrogen produced is free from carbon emissions.
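As a rough illustration of the electrolysis reaction above, Faraday's law links stack current to hydrogen output, since two electrons are transferred per H2 molecule. The sketch below assumes a hypothetical stack (100 cells, 1000 A, 1.8 V per cell, ideal Faradaic efficiency); none of these operating values come from the reviewed studies.

```python
FARADAY_C_PER_MOL = 96485.332   # Faraday constant
M_H2_KG_PER_MOL = 2.016e-3      # molar mass of H2
ELECTRONS_PER_H2 = 2            # 2 H+ + 2 e- -> H2


def h2_rate_kg_per_h(current_a: float, n_cells: int = 1, faradaic_eff: float = 1.0) -> float:
    """Hydrogen mass flow from Faraday's law for cells connected in series."""
    mol_per_s = faradaic_eff * n_cells * current_a / (ELECTRONS_PER_H2 * FARADAY_C_PER_MOL)
    return mol_per_s * M_H2_KG_PER_MOL * 3600.0


def specific_energy_kwh_per_kg(cell_voltage_v: float, faradaic_eff: float = 1.0) -> float:
    """Electrical energy per kg of H2 at a given cell voltage."""
    charge_c_per_kg = ELECTRONS_PER_H2 * FARADAY_C_PER_MOL / M_H2_KG_PER_MOL
    joule_per_kg = cell_voltage_v * charge_c_per_kg / faradaic_eff
    return joule_per_kg / 3.6e6  # J -> kWh


# Hypothetical stack, for illustration only.
rate = h2_rate_kg_per_h(current_a=1000.0, n_cells=100)
energy = specific_energy_kwh_per_kg(cell_voltage_v=1.8)
print(f"{rate:.2f} kg H2/h at {energy:.1f} kWh/kg")
```

At the thermoneutral voltage of about 1.48 V the same function returns roughly 39.4 kWh/kg, which matches the higher heating value of hydrogen and serves as a quick sanity check.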

5.1.2. Biomass Gasification

Beyond water electrolysis, green hydrogen can be produced from biomass-based waste and wet organic materials (such as agricultural waste, forestry, or animal residues) through thermochemical and biochemical processes that convert carbon-rich feedstocks into hydrogen-rich gases [68]. This process is called biomass gasification. It typically occurs at high temperatures exceeding 500 °C and involves reacting organic or fossil-based carbonaceous materials with a controlled amount of O2 and/or steam to produce CO, H2, and CO2 [69]. The key reaction is as follows:
CxHyOz + H2O→H2 + CO2 + CO + CH4
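The reaction above is schematic rather than balanced. One way to bound it is to assume, as an idealization, that all carbon ends up as CO2 after complete reforming and water-gas shift, which gives CxHyOz + (2x − z) H2O → x CO2 + (2x + y/2 − z) H2. The sketch below computes this theoretical upper limit; using the cellulose repeat unit C6H10O5 as a model feedstock is an assumption made here for illustration, and real gasifiers yield considerably less because CO and CH4 remain in the product gas.

```python
ATOMIC_MASS = {"C": 12.011, "H": 1.008, "O": 15.999}  # g/mol


def max_h2_yield(x: float, y: float, z: float) -> tuple[float, float]:
    """Upper-bound H2 yield for CxHyOz under the idealized balance
    CxHyOz + (2x - z) H2O -> x CO2 + (2x + y/2 - z) H2.

    Returns (mol H2 per mol feed, kg H2 per kg dry feed).
    """
    mol_h2 = 2 * x + y / 2 - z
    feed_g_per_mol = x * ATOMIC_MASS["C"] + y * ATOMIC_MASS["H"] + z * ATOMIC_MASS["O"]
    h2_g = mol_h2 * 2 * ATOMIC_MASS["H"]
    return mol_h2, h2_g / feed_g_per_mol


# Cellulose repeat unit as a stand-in for dry, ash-free biomass (assumption).
mol_h2, mass_ratio = max_h2_yield(x=6, y=10, z=5)
print(f"{mol_h2:.0f} mol H2 per mol feed, {mass_ratio:.3f} kg H2 per kg feed")
```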
Combining biomass with other feedstocks is a more recent direction. For example, Pan et al. [70] present an ML-driven framework for optimizing biomass–coal co-gasification, targeting green hydrogen-rich syngas and liquid fuel production.

5.1.3. PEC Water Splitting

PEC water splitting is a method that directly converts solar energy into H2 and O2 using a semiconductor-based system. The PEC system mimics natural photosynthesis but uses artificial photoelectrodes. It has a photoanode (usually an n-type semiconductor) that absorbs light and oxidizes water to generate O2 and a photocathode (usually a p-type semiconductor) or a metal electrode that collects electrons to reduce protons (H⁺) into H2. The success of PEC water splitting is critically dependent on the properties of the photoelectrode materials, particularly their band gap, band edge alignment, light absorption, charge carrier mobility, and chemical stability in aqueous environments [71]. A wide range of semiconductor materials—such as TiO2, BiVO4, Fe2O3, WO3, and Cu2O—have been investigated, with various strategies employed to overcome inherent limitations, including low solar-to-hydrogen efficiency and photoelectrode degradation. ML offers a transformative opportunity for accelerating the discovery and optimization of metal oxide photoelectrodes, enhancing efficiency, and reducing costs in PEC water splitting [72].
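Band gap is one of the simplest descriptors used when screening photoelectrode materials, because it fixes the longest wavelength a semiconductor can absorb (λ ≈ 1239.84 / Eg, with λ in nm and Eg in eV) and must exceed the 1.23 eV thermodynamic minimum for water splitting. The sketch below applies this relation to the oxides listed above. The band gap values are approximate, commonly quoted figures supplied here as assumptions, not data from the reviewed studies, and reported values vary with crystal phase and preparation.

```python
HC_EV_NM = 1239.84              # Planck constant times speed of light, in eV*nm
WATER_SPLITTING_MIN_EV = 1.23   # thermodynamic minimum per electron

# Approximate band gaps in eV (assumed, illustrative values).
APPROX_BAND_GAP_EV = {
    "TiO2": 3.2,
    "BiVO4": 2.4,
    "Fe2O3": 2.1,
    "WO3": 2.7,
    "Cu2O": 2.1,
}


def absorption_edge_nm(band_gap_ev: float) -> float:
    """Longest wavelength the material can absorb."""
    return HC_EV_NM / band_gap_ev


edges = {name: absorption_edge_nm(gap) for name, gap in APPROX_BAND_GAP_EV.items()}
for name, edge in sorted(edges.items(), key=lambda item: item[1]):
    meets_minimum = APPROX_BAND_GAP_EV[name] > WATER_SPLITTING_MIN_EV
    print(f"{name}: absorbs below about {edge:.0f} nm, exceeds 1.23 eV: {meets_minimum}")
```

A wide gap such as that of TiO2 restricts absorption to the ultraviolet, which is one reason narrower-gap oxides attract attention. Band gap alone is not sufficient, since band edge alignment, carrier mobility, and stability also matter, as noted above.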

5.1.4. Biohydrogen Production

Biohydrogen production is a method that utilizes microorganisms to produce hydrogen gas, offering a potentially sustainable and environmentally friendly approach to green hydrogen production. This process utilizes various microorganisms, including green algae, cyanobacteria, photosynthetic bacteria, and dark fermentative bacteria, to produce H2 from organic substrates or water. The common methods for biohydrogen production are microbial electrolysis, photobiological hydrogen production, dark fermentation, and photofermentative hydrogen production [73]. ML applications in biohydrogen production involve modeling and optimizing the process by identifying patterns in complex biological data, predicting process performance, and selecting optimal conditions to maximize hydrogen yield. It enables data-driven control of fermentation, microbial behavior, and feedstock utilization, accelerating the design of cost-effective and scalable biohydrogen systems [15].

5.2. ML Application for Green Hydrogen

ML also plays an increasingly critical role in advancing green hydrogen production. In water electrolysis, ML algorithms help predict hydrogen yield, optimize operational parameters (e.g., current density, temperature, electrode material), and detect system degradation or faults. Beyond general process optimization, ML also addresses technology-specific challenges in different types of electrolyzers. For instance, in PEM electrolysis, ML models have been developed to predict membrane degradation and design operational strategies that extend membrane lifetime, which is essential due to the high cost and sensitivity of PEM membranes. In AEL systems, where catalyst deactivation and scaling are major concerns, ML techniques are used to model catalyst aging and recommend optimal operating conditions to maintain performance. In SOEC technologies, which operate at elevated temperatures, ML assists in predicting material degradation and thermal stress, enabling the development of improved operational protocols to enhance durability.
ML applications are also rapidly expanding in other green hydrogen production routes. In biomass gasification, ML models are employed to predict syngas composition, identify optimal gasification parameters, and classify biomass feedstocks based on hydrogen production potential. In PEC water splitting, ML aids in the discovery and optimization of semiconductor materials by uncovering complex relationships between material properties and photocurrent density. For biohydrogen production, ML models facilitate hydrogen yield prediction from processes like dark fermentation and microbial electrolysis, assist in reactor control, and reveal key influencing variables such as pH, substrate concentration, and microbial community behavior. Table 6 provides selected examples of ML applications in each category of green hydrogen production.
Among the 21 studies surveyed, ANNs and their variants (e.g., MLP, BPNN, RBF) were the most commonly used algorithms (applied in 11 studies), followed closely by ensemble methods such as RF and GBR and by kernel-based SVR. Inputs varied widely across studies but were generally categorized into operational conditions (e.g., temperature, pressure, flow rates), feedstock properties (e.g., elemental composition, moisture, ash content), and environmental or time-series data (e.g., solar irradiance, humidity, timestamps). Most studies targeted hydrogen yield or production rate as primary outputs, with several also predicting associated variables such as syngas composition, CO2 yield, photocurrent density, and gasification efficiency. Time-series models like LSTM and hybrid models (e.g., ANN-GA, LSTM-CNN) demonstrated strong performance in forecasting and dynamic control tasks. Overall, the findings underscore ML’s versatility in capturing the complex, nonlinear relationships inherent to green hydrogen processes and its potential to optimize system performance, guide catalyst/material selection, and reduce experimental costs across diverse production pathways.
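Time-series models such as LSTM are usually trained on fixed-length windows of past observations paired with a future target. The helper below is a minimal sketch of that preprocessing step; the function name `make_windows` and the toy series (which could stand for normalized solar irradiance or hydrogen output) are hypothetical.

```python
def make_windows(series: list[float], n_lags: int, horizon: int = 1):
    """Turn a 1-D series into (inputs, targets) for supervised forecasting.

    Each input is n_lags consecutive values; the target is the value
    `horizon` steps after the end of that window.
    """
    inputs, targets = [], []
    for start in range(len(series) - n_lags - horizon + 1):
        inputs.append(series[start:start + n_lags])
        targets.append(series[start + n_lags + horizon - 1])
    return inputs, targets


# Toy series, illustrative only.
series = [0.0, 0.2, 0.5, 0.9, 0.7, 0.3]
X, y = make_windows(series, n_lags=3)
print(X)  # [[0.0, 0.2, 0.5], [0.2, 0.5, 0.9], [0.5, 0.9, 0.7]]
print(y)  # [0.9, 0.7, 0.3]
```

The resulting pairs can be fed to any regressor; recurrent models consume each window as a sequence, while tree-based models treat the lags as ordinary features.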

6. Other Hydrogen Production Pathways and Their ML Applications

While green, blue, and gray hydrogen represent the dominant and most widely studied production pathways, several other methods—such as pink, turquoise, white, and black/brown hydrogen—have also emerged in recent years. Although these technologies are either still in the early stages of development or currently lack widespread adoption, they offer unique advantages and potential for low- or zero-carbon hydrogen production. For the sake of completeness and to provide a comprehensive overview, this section includes a brief discussion of these alternative methods, along with their associated ML applications.

6.1. Pink Hydrogen Production

Pink hydrogen refers to hydrogen produced through water electrolysis powered by nuclear energy. It emits no direct CO2 during hydrogen generation, making it an attractive option for countries with established nuclear infrastructure. The process is similar to that of green hydrogen, where water is split into hydrogen and oxygen using electricity. However, in the case of pink hydrogen, the electricity is generated from nuclear power plants rather than renewable sources such as wind or solar [52].
Despite its potential as a clean and reliable hydrogen source, pink hydrogen remains in the early stages of commercial deployment. One reason for this is the ongoing debate over the sustainability and public perception of nuclear energy. While some concerns exist regarding radioactive waste and nuclear safety, it is important to note that modern nuclear reactors utilize only small amounts of radioactive fuel and have benefited from significant advances in safety protocols, reactor design, and waste management technologies [8].
Recent developments highlight the growing interest in this hydrogen pathway. For example, the U.S. Department of Energy and Constellation Energy Group have launched the nation’s first pink hydrogen demonstration system at the Nine Mile Point Nuclear Plant in New York (Figure 8). This pilot project produces approximately 560 kg of hydrogen per day using just 1.25 MW of the plant’s 1907 MW nuclear output [93]. Moreover, Constellation has announced plans to scale up commercial hydrogen production by 2026, signaling increased momentum for pink hydrogen deployment.
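The figures quoted for the pilot can be turned into a specific energy consumption with simple arithmetic. The sketch below assumes, for illustration, that the electrolyzer draws the full 1.25 MW continuously for 24 h; the lower heating value of hydrogen (about 33.3 kWh/kg) is a standard reference figure added here, not one reported in the cited source.

```python
ELECTROLYZER_POWER_MW = 1.25   # from the text
PLANT_OUTPUT_MW = 1907.0       # from the text
H2_KG_PER_DAY = 560.0          # from the text
H2_LHV_KWH_PER_KG = 33.3       # standard reference value (assumption here)

# Assumes continuous full-load operation over 24 h.
energy_kwh_per_day = ELECTROLYZER_POWER_MW * 1000.0 * 24.0
kwh_per_kg = energy_kwh_per_day / H2_KG_PER_DAY
share_pct = 100.0 * ELECTROLYZER_POWER_MW / PLANT_OUTPUT_MW
lhv_efficiency = H2_LHV_KWH_PER_KG / kwh_per_kg

print(f"Specific energy: {kwh_per_kg:.1f} kWh/kg")   # 53.6 kWh/kg
print(f"Share of plant output: {share_pct:.3f} %")
print(f"Implied LHV efficiency: {lhv_efficiency:.0%}")
```

Under these assumptions the pilot uses well under 0.1% of the plant's output, which puts the scale of the demonstration in perspective.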
On the academic front, Fernández-Arias et al. [94] conducted a bibliometric analysis of 550 research papers over a 13.6-year period and found that scientific interest in pink hydrogen is steadily increasing, with an annual growth rate of 5.58%. These findings suggest a rising awareness of nuclear-powered hydrogen as a viable decarbonization strategy, especially in countries that already rely on nuclear power for electricity generation. As global efforts to diversify clean hydrogen sources intensify, pink hydrogen could play a complementary role alongside green and blue hydrogen.

6.2. Turquoise Hydrogen Production

Turquoise hydrogen is produced through a process called methane pyrolysis, where natural gas (methane) is thermally decomposed into hydrogen gas and solid carbon in the absence of oxygen:
CH4→C (solid) + 2H2
Unlike SMR, which produces CO2 as a byproduct, turquoise hydrogen avoids direct carbon emissions by generating solid carbon, which can be stored or utilized in various industries (e.g., carbon black, battery materials, construction additives). This makes it a potentially low-carbon or even carbon-neutral pathway, depending on the energy source used for pyrolysis [95].
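The stoichiometry above fixes how much methane is consumed and how much solid carbon must be handled per unit of hydrogen. The following sketch works this out from molar masses, assuming complete conversion with no side products, which is an idealization.

```python
M_C = 12.011   # g/mol
M_H = 1.008    # g/mol
M_CH4 = M_C + 4 * M_H
M_H2 = 2 * M_H

# CH4 -> C (solid) + 2 H2, assuming complete conversion.
kg_ch4_per_kg_h2 = M_CH4 / (2 * M_H2)
kg_carbon_per_kg_h2 = M_C / (2 * M_H2)

print(f"{kg_ch4_per_kg_h2:.2f} kg CH4 consumed per kg H2")
print(f"{kg_carbon_per_kg_h2:.2f} kg solid carbon produced per kg H2")
```

Roughly 3 kg of solid carbon per kg of hydrogen illustrates why a market or storage route for the carbon co-product matters for this pathway.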
Methane pyrolysis processes described in the literature can be classified into three categories: catalytic, thermal, and plasma decomposition. In catalytic pyrolysis using nickel, methane conversion begins at around 500 °C [96]. Without a suitable catalyst, thermal decomposition starts above 700 °C [97]. To achieve technically relevant reaction rates and methane conversion rates, higher temperatures are required, i.e., typically above 800 °C for catalytic processes, over 1000 °C for the thermal processes, and up to 2000 °C when using plasma torches [98]. The need for advanced reactor designs and the immature commercial infrastructure for handling and monetizing solid carbon pose challenges to the widespread application of turquoise hydrogen production. Moreover, catalyst deactivation and reactor fouling due to carbon buildup remain key technical issues.

6.3. White Hydrogen Production

White hydrogen refers to naturally occurring hydrogen found in underground deposits, such as in geological formations, in volcanic systems, or along fault zones. Unlike other types of hydrogen, white hydrogen is not produced through industrial processes—it is naturally formed through geochemical reactions like serpentinization (water reacting with iron-rich rocks) or radiolysis (water molecules split by natural radiation). Because it exists in nature without carbon emissions, white hydrogen is considered a clean and renewable energy source, if it can be extracted economically [8]. However, it remains largely untapped and underexplored, and ongoing research is focused on identifying viable reservoirs, assessing environmental impacts, and developing technologies for efficient extraction.

6.4. Black/Brown Hydrogen Production

Black/brown hydrogen refers to hydrogen produced through the gasification of coal. The key difference between brown and black hydrogen lies in the type of coal used—brown hydrogen is produced from lignite (low-grade, moisture-rich coal), while black hydrogen is derived from bituminous or anthracite coal, which has higher carbon content and energy density [8]. During this process, coal reacts with oxygen and steam at high temperatures to generate syngas (Figure 9). Black hydrogen is often contrasted with gray hydrogen, which is produced from natural gas via SMR. Although both methods emit substantial CO2, gray hydrogen typically has a lower carbon footprint than black hydrogen, as natural gas has a higher hydrogen-to-carbon ratio and burns more cleanly than coal. As such, black hydrogen is considered one of the least environmentally friendly hydrogen production routes in the absence of CCS. In coal gasification, coal is heated in the presence of steam to produce hydrogen and CO, which is then further processed to yield hydrogen and CO2. This method is even more carbon-intensive than SMR, emitting approximately 19 kg of CO2 per kilogram of hydrogen [6].
The concept of coal gasification dates back to the 19th century when it was used to produce town gas for lighting and heating in urban areas. Early gasification plants operated using coal-derived syngas, which contained a mixture of H2, CO, CH4, and other gases. By the 20th century, gasification technology had advanced, enabling the large-scale production of synthetic fuels and chemicals. This advancement was particularly notable during World War II, when Germany utilized coal gasification to produce liquid fuels in the absence of crude oil. In the 1950s and 1960s, further advancements in catalysts and high-pressure gasifiers improved hydrogen production efficiency, making coal gasification attractive for the chemical and fertilizer industries [99]. With the rise of climate concerns in the 21st century, the focus shifted toward low-emission coal utilization. Countries like China, the United States, and Australia continue to invest in clean coal technologies, exploring coal gasification as a means to produce hydrogen while reducing CO2 emissions through capture and storage [38].
Coal gasification presents both opportunities and challenges, particularly in regions with abundant coal reserves. One of its primary advantages is its ability to utilize widely available coal resources, providing an alternative to natural-gas-based hydrogen production [100]. Moreover, coal has the largest reserves of any fossil fuel worldwide, and the method is widely used in China in particular, where it generates a substantial quantity of hydrogen [101]. Additionally, when coupled with CCUS, coal gasification can significantly reduce CO2 emissions, making it a more sustainable approach compared to traditional coal combustion. The process also allows for high hydrogen yields, as syngas can be further processed to maximize hydrogen production through the water–gas shift reaction. However, despite these benefits, coal gasification remains energy-intensive, requiring high temperatures and pressures, which increase operational costs. The process also generates significant amounts of solid waste, such as slag and ash, which require proper disposal and environmental management. Additionally, capital costs are high due to the complexity of gasification reactors and the need for extensive gas purification systems to remove sulfur, nitrogen compounds, and particulates [102]. While CCUS can mitigate carbon dioxide emissions, it further adds to the cost and infrastructure requirements. As a result, while coal gasification remains a viable option, its long-term feasibility depends on advancements in carbon capture technology, regulatory policies, and economic incentives for low-carbon hydrogen production.

6.5. ML Application Summary

ML application for pink, turquoise, white, and black/brown hydrogen faces diverse technical challenges, largely due to the immaturity of these processes. For example, in turquoise hydrogen production via methane pyrolysis, maintaining plasma stability and preventing catalyst deactivation are major concerns; ML models are applied to predict plasma behavior from emission spectra and to optimize catalyst formulations for sustained high methane conversion. In pink hydrogen, ML assists in forecasting hydrogen production costs under varying operational and regulatory scenarios. For white hydrogen, ML models help predict subsurface thermodynamic behavior and phase stability to guide resource development. In black/brown hydrogen from coal gasification, ML supports optimizing gasification parameters to improve syngas quality and reduce pollutant emissions.
Although the application of ML in these emerging hydrogen pathways remains at an early stage, initial studies demonstrate its potential to accelerate experimental discovery, optimize process conditions, improve model accuracy, and enhance feasibility assessments under complex and uncertain conditions. Table 7 highlights selected examples of ML applications across these lesser-explored hydrogen production technologies, underscoring the growing interest in expanding ML’s role throughout the hydrogen value chain.

7. Key Challenges, Opportunities, and Future Work

As ML continues to gain traction in hydrogen production research and applications, it is important to assess both the limitations and opportunities associated with its deployment. This section summarizes the key challenges that hinder the widespread implementation of ML in real-world hydrogen systems, as well as the emerging opportunities that highlight ML’s potential to enhance efficiency, reduce costs, and accelerate innovation across the hydrogen value chain.

7.1. Key Challenges

Despite rapid advancements in hydrogen production and the increasing integration of ML across various hydrogen pathways, several key challenges must be addressed to enable large-scale deployment and economic viability. These challenges span technical, economic, and operational domains, while also presenting opportunities for innovation and strategic growth.
One of the primary challenges is the high cost of hydrogen production, particularly for low-carbon pathways. As of 2025, gray hydrogen remains the least expensive, with production costs ranging from 0.7 to 2.3 USD/kg H2, owing to mature infrastructure and low natural gas prices. Blue hydrogen, which incorporates carbon capture, is moderately more expensive at 1.4–3.2 USD/kg H2, depending on the capture efficiency and technology used (e.g., SMR vs. ATR). Green hydrogen is the most variable and costly, with prices spanning 1.9–8.2 USD/kg H2, influenced by factors such as electrolyzer type, electricity source, and scale [9]. These economic disparities pose a major barrier to market competitiveness, especially in regions where fossil-fuel-based hydrogen remains dominant.
In addition to cost, technical complexity remains a significant issue. Each production method presents unique optimization challenges—from managing catalyst degradation in turquoise and gray hydrogen processes to dealing with intermittent renewable input in green hydrogen electrolysis. Moreover, emerging methods like white and pink hydrogen require further exploration into geological extraction and nuclear integration, respectively, which are currently limited by data availability, infrastructure readiness, and public acceptance.
Deployment of ML in hydrogen production environments introduces several domain-specific challenges. Sensor reliability is a concern due to the high-temperature, high-pressure, and corrosive conditions typical of reformers and electrolysis units, often leading to degraded or missing data that can impair model performance. Additionally, concept drift caused by catalyst aging, membrane fouling, and feedstock variability can reduce accuracy over time, requiring adaptive or online learning methods. Integration with industrial control systems (e.g., SCADA, DCS) also presents hurdles, as these systems are not readily compatible with modern ML tools. To ensure safe and effective deployment, ML models must be interpretable, auditable, and capable of functioning within hybrid frameworks alongside traditional physics-based controls.
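One simple way to notice concept drift of the kind described above is to track a rolling prediction error and compare it with the error observed at commissioning. The class below is a minimal sketch of that idea; the name `DriftMonitor`, the threshold rule, and all numbers are hypothetical, and production systems would typically rely on more formal drift tests.

```python
from collections import deque


class DriftMonitor:
    """Flags drift when the rolling mean absolute error exceeds
    `tolerance` times a baseline error measured on validation data."""

    def __init__(self, baseline_mae: float, window: int = 5, tolerance: float = 1.5):
        self.baseline_mae = baseline_mae
        self.tolerance = tolerance
        self.errors = deque(maxlen=window)  # keeps only the latest errors

    def update(self, predicted: float, measured: float) -> bool:
        self.errors.append(abs(predicted - measured))
        if len(self.errors) < self.errors.maxlen:
            return False  # not enough data to judge yet
        rolling_mae = sum(self.errors) / len(self.errors)
        return rolling_mae > self.tolerance * self.baseline_mae


# Illustrative stream: the model keeps predicting 10.0 while the process shifts.
monitor = DriftMonitor(baseline_mae=1.0, window=3, tolerance=1.5)
stream = [(10.0, 10.5), (10.0, 9.2), (10.0, 11.0), (10.0, 12.5), (10.0, 13.0)]
flags = [monitor.update(pred, meas) for pred, meas in stream]
print(flags)  # [False, False, False, False, True]
```

A raised flag could trigger retraining or a fallback to a physics-based controller, in line with the hybrid frameworks mentioned above.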
ML offers promising solutions to many of these challenges, yet its implementation in industrial settings is still nascent. While models such as ANNs, RF, and XGBoost have shown impressive accuracy in laboratory studies (often with R² > 0.95), real-world deployment is limited by data sparsity, a lack of standardization, and integration challenges with legacy systems. Additionally, model interpretability and trustworthiness are critical issues, especially in safety-critical applications like reactor control or carbon capture optimization.
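The coefficient of determination cited in many of these studies is straightforward to compute, and doing so on data from outside the training conditions shows how quickly a strong laboratory score can degrade. The sketch below uses made-up numbers to contrast the two cases; it is not a reproduction of any surveyed result.

```python
def r_squared(y_true: list[float], y_pred: list[float]) -> float:
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Hypothetical measurements and predictions.
measured = [1.0, 2.0, 3.0, 4.0, 5.0]
lab_predictions = [1.1, 1.9, 3.2, 3.9, 5.0]     # close fit under controlled conditions
field_predictions = [2.0, 2.5, 3.0, 3.5, 4.0]   # compressed response after deployment

r2_lab = r_squared(measured, lab_predictions)
r2_field = r_squared(measured, field_predictions)
print(f"lab: {r2_lab:.3f}, field: {r2_field:.3f}")  # lab: 0.993, field: 0.750
```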
Another challenge identified in this review is the lack of explicit criteria for ML algorithm selection in many surveyed studies. While ANNs dominate much of the literature—likely due to their ability to model complex nonlinear relationships typical of hydrogen production processes—the reasons for selecting particular models are often not clearly stated. In some cases, the prevalence of certain algorithms appears to be influenced by historical trends or researcher familiarity rather than systematic benchmarking. This lack of transparency makes it difficult to assess the true suitability of models for different hydrogen production contexts.
Additionally, certain patterns were observed across different hydrogen production pathways. In blue hydrogen (e.g., SMR and ATR) and gray hydrogen (e.g., coal gasification), the underlying chemical and thermodynamic processes involve complex, highly nonlinear interactions among multiple variables, making ANNs particularly suitable. In green hydrogen production, especially in water electrolysis and biomass gasification, time-dependent factors such as renewable energy variability and biomass heterogeneity further favor the use of flexible, nonlinear models like ANNs and ensemble methods such as RF. For emerging pathways like pink, turquoise, and white hydrogen, where datasets are often small and experimental conditions vary widely, simpler or ensemble models are sometimes preferred to enhance robustness and avoid overfitting.

7.2. Opportunities

On the opportunity side, the application of ML across the hydrogen value chain opens doors for optimization, cost reduction, and accelerated innovation. One major opportunity lies in predictive maintenance and fault detection. Hydrogen production facilities often experience costly downtimes due to equipment degradation (e.g., electrolyzer membrane wear, catalyst deactivation). ML can forecast failures based on sensor trends and historical patterns, enabling proactive interventions and minimizing unplanned shutdowns.
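As a deliberately simple baseline for the predictive-maintenance idea above, a degradation indicator can be fitted with a straight line and extrapolated to a limit. The sketch below uses ordinary least squares rather than a trained ML model, and the cell-voltage readings and 2.00 V limit are hypothetical.

```python
def fit_line(xs: list[float], ys: list[float]) -> tuple[float, float]:
    """Ordinary least-squares slope and intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def hours_until_limit(hours: list[float], voltages: list[float], limit_v: float):
    """Remaining operating hours before the fitted trend reaches limit_v.

    Returns None when the trend is flat or improving.
    """
    slope, intercept = fit_line(hours, voltages)
    if slope <= 0:
        return None
    return (limit_v - intercept) / slope - hours[-1]


# Hypothetical cell-voltage log at constant current density.
hours = [0.0, 1000.0, 2000.0, 3000.0]
voltages = [1.80, 1.81, 1.82, 1.83]
remaining = hours_until_limit(hours, voltages, limit_v=2.00)
print(f"Estimated remaining hours: {remaining:.0f}")
```

Real degradation is rarely linear, so a model of this kind mainly serves as a reference against which more flexible learners can be judged.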
ML also enhances process optimization and real-time control, particularly for systems with fluctuating inputs—such as green hydrogen production linked to variable renewable energy. By dynamically adjusting operating conditions (e.g., voltage, temperature, flow rates), ML algorithms can maximize hydrogen yield, energy efficiency, and system lifetime. In multi-step processes like SMR or gasification, ML can support end-to-end optimization across heat exchangers, reactors, and gas separation units.
In materials discovery and system design, ML accelerates the identification of optimal catalysts, membranes, or sorbents by learning from experimental and simulation data. This reduces the need for exhaustive lab testing and guides researchers toward promising formulations faster. For example, ML-assisted screening has been applied to metal oxide photoelectrodes in PEC water splitting and to novel catalysts in methane pyrolysis.

7.3. Future Work

Future research should prioritize the development of robust and adaptive ML models capable of operating under realistic, time-varying conditions across diverse hydrogen production pathways. For green hydrogen, ML frameworks that dynamically respond to intermittent renewable energy inputs will be critical for optimizing electrolyzer performance and reducing operational variability. In blue and gray hydrogen systems, integrating long-term effects such as catalyst degradation, the low efficiency of carbon capture, and process instability remains a key challenge. Additionally, the lack of standardized, high-quality datasets, particularly for emerging methods such as turquoise, pink, and white hydrogen, continues to hinder generalizable model development. Coupling ML with physics-informed modeling may enhance interpretability and reliability, which are essential for industrial uptake. Moreover, future research should explore the integration of machine learning with techno-economic analysis (TEA) and life-cycle assessment (LCA) frameworks to better quantify the cost impacts of ML-driven optimization. Currently, only a few studies, such as that of Kim et al. (2022) [103] on pink hydrogen, have attempted to estimate hydrogen production costs using ML. Expanding this intersection could enable more accurate predictions of CAPEX, OPEX, and LCOH under optimized operating scenarios, thereby supporting more informed investment and policy decisions. Finally, ensuring model interpretability, uncertainty quantification, and integration with industrial control systems will be vital for bridging the gap between algorithm development and large-scale, real-world deployment.
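To make the link between ML-driven optimization and cost metrics concrete, the levelized cost of hydrogen can be written in its simplest form as annualized CAPEX plus annual OPEX, divided by annual hydrogen output. The sketch below uses the standard capital recovery factor; every input value is hypothetical, and a full TEA would include further terms such as stack replacement and degradation.

```python
def capital_recovery_factor(rate: float, years: int) -> float:
    """Fraction of CAPEX charged per year: i(1+i)^n / ((1+i)^n - 1)."""
    growth = (1.0 + rate) ** years
    return rate * growth / (growth - 1.0)


def lcoh_usd_per_kg(capex_usd: float, opex_usd_per_year: float,
                    h2_kg_per_year: float, rate: float, years: int) -> float:
    """Simplified levelized cost of hydrogen."""
    annualized_capex = capex_usd * capital_recovery_factor(rate, years)
    return (annualized_capex + opex_usd_per_year) / h2_kg_per_year


# Hypothetical plant, for illustration only.
crf = capital_recovery_factor(rate=0.08, years=20)
base = lcoh_usd_per_kg(1_000_000, 200_000, 100_000, rate=0.08, years=20)
# Suppose an optimized operating strategy raised output by 5% at equal cost.
optimized = lcoh_usd_per_kg(1_000_000, 200_000, 105_000, rate=0.08, years=20)
print(f"CRF: {crf:.4f}, LCOH: {base:.2f} -> {optimized:.2f} USD/kg")
```

Framing an ML improvement as a change in one of these inputs is one way to express its value in the same units used for investment and policy decisions.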
In summary, while hydrogen production poses technical and economic challenges, the synergy between hydrogen production and ML presents a powerful opportunity to reshape the global energy landscape. Strategic investment in research, infrastructure, and policy—alongside continued ML innovation—will be critical to unlocking the full potential of a low-carbon hydrogen economy.

8. Conclusions

This review systematically explored the role of ML in hydrogen production, focusing on a wide range of production pathways, including green, blue, and gray, and emerging methods such as pink, turquoise, white, and black/brown hydrogen. This paper highlights how ML techniques have been employed to improve process efficiency, predict hydrogen yield, optimize operational parameters, and reduce environmental impact. The findings underscore the growing intersection between data-driven methods and hydrogen technologies, offering insights into current trends, prevailing challenges, and future directions. The key conclusions are as follows:
  • A total of 51 peer-reviewed papers from 2012 to 2025 were analyzed, covering ML applications across multiple hydrogen production pathways.
  • Green hydrogen received the most ML attention, especially in water electrolysis and biomass gasification, driven by the global shift toward carbon-neutral energy systems.
  • ANNs and their variants (e.g., MLP, BPNN, RBF) were the most frequently used models, applied in over 60% of the studies.
  • Ensemble learning methods like RF, GBR, and XGBoost demonstrated high predictive accuracy and are increasingly used in catalyst screening, syngas modeling, and multi-variable optimization.
  • Time-series models (e.g., LSTM, Bi-LSTM) were effectively employed in forecasting applications, such as renewable-energy-driven electrolysis and biohydrogen production.
  • Common input variables included process parameters (temperature, pressure, flow rates), feedstock properties (elemental composition, ash, moisture), and environmental conditions (solar irradiance, weather data).
  • Key predicted outputs included hydrogen yield, CO2 capture or emission rates, syngas composition, and economic metrics such as production cost.
  • Major challenges include limited real-world deployment, data availability, and a lack of model interpretability, especially in safety-critical systems.
  • Future work should focus on developing robust, generalizable ML models supported by high-quality real-time datasets, with emphasis on industrial integration, techno-economic and life-cycle assessment, and closing current gaps in model validation and interpretability.

Author Contributions

Conceptualization, X.D. and G.Y.; methodology, X.D.; investigation, S.G.; resources, S.G.; writing—original draft preparation, X.D.; writing—review and editing, S.G. and G.Y. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Data Availability Statement

Not applicable.

Conflicts of Interest

The authors declare no conflicts of interest.

Abbreviations

ABR/ADA: AdaBoost regression
AdB: Adaptive boosting regression
AMT: Alternating model tree
ANFIS: Adaptive neuro-fuzzy inference system
ANN: Artificial neural network
ARM: Association rule mining
ASNN: Associative neural network
Bi-LSTM: Bidirectional LSTM
BP: Backpropagation
BR: Bayesian regularization
CNN: Convolutional neural network
DE: Differential evolution
DNN: Deep neural network
DRM: Dry reforming of methane
DT: Decision tree
ELM: Extreme learning machine
ENR: Elastic net regression
ETR: Ensemble tree regression
FFBPNN: Feed-forward backpropagation neural network
GA: Genetic algorithm
GBDT: Gradient boosting decision tree
GBM: Gradient boosting machine
GBR: Gradient boosting regression
GP: Genetic programming
GPR: Gaussian process regression
KNN: K-nearest neighbor
KR: Kernel ridge
LGB: LightGBM
LM: Levenberg–Marquardt
LSTM: Long short-term memory
LSSVM: Least squares support vector machine
LTS: Low temperature shift
MDEA: Methyl diethanolamine
MLP: Multilayer perceptron
MLR-RR: Multi-linear regression with ridge regularization
MOGA: Multi-objective genetic algorithm
MSE: Mean squared error
MTL: Multitask learning
MVR: Multivariate regression
NARX: Nonlinear autoregressive model with exogenous inputs neural network
NNs: Neural networks
NSGA-II: Non-dominated sorting genetic algorithm II
PLS: Partial least squares
PSA: Pressure swing adsorption
PSO: Particle swarm optimization
PZ: Piperazine
QSPR: Quantitative structure–property relationship
RBFNN: Radial basis function neural network
ResNet: Residual convolutional neural network
RF: Random forest
RR: Ridge regression
SCG: Scaled conjugate gradient
SE-SMR: Sorption-enhanced steam methane reforming
SMOreg: Sequential minimal optimization regression
SMR: Steam methane reforming
SMR: Small modular reactor
SVD: Singular value decomposition
SVM: Support vector machine
SVR: Support vector regression
TINN: Thermodynamics-informed neural network

References

  1. IEA World Energy Outlook. 2024. Available online: https://www.iea.org/reports/world-energy-outlook-2024 (accessed on 11 February 2025).
  2. Acar, C.; Dincer, I. Review and Evaluation of Hydrogen Production Options for Better Environment. J. Clean. Prod. 2019, 218, 835–849. [Google Scholar] [CrossRef]
  3. Dawood, F.; Anda, M.; Shafiullah, G.M. Hydrogen Production for Energy: An Overview. Int. J. Hydrogen Energy 2020, 45, 3847–3869. [Google Scholar] [CrossRef]
  4. Nikolaidis, P.; Poullikkas, A. A Comparative Overview of Hydrogen Production Processes. Renew. Sustain. Energy Rev. 2017, 67, 597–611. [Google Scholar] [CrossRef]
  5. Holladay, J.D.; Hu, J.; King, D.L.; Wang, Y. An Overview of Hydrogen Production Technologies. Catal. Today 2009, 139, 244–260. [Google Scholar] [CrossRef]
  6. IEA Global Hydrogen Review. 2024. Available online: https://www.iea.org/reports/global-hydrogen-review-2024 (accessed on 11 February 2025).
  7. George Davies, W.; Babamohammadi, S.; Yang, Y.; Masoudi Soltani, S. The Rise of the Machines: A State-of-the-Art Technical Review on Process Modelling and Machine Learning within Hydrogen Production with Carbon Capture. Gas Sci. Eng. 2023, 118, 205104. [Google Scholar] [CrossRef]
  8. Incer-Valverde, J.; Korayem, A.; Tsatsaronis, G.; Morosuk, T. “Colors” of Hydrogen: Definitions and Carbon Intensity. Energy Convers. Manag. 2023, 291, 117294. [Google Scholar] [CrossRef]
  9. United Nations Economic Commission for Europe (UNECE). Hydrogen: Technology Brief. 2022. Available online: https://unece.org/hydrogen (accessed on 20 February 2025).
  10. Jordan, M.I.; Mitchell, T.M. Machine Learning: Trends, Perspectives, and Prospects. Science 2015, 349, 255–260. [Google Scholar] [CrossRef]
  11. Sarker, I.H. Machine Learning: Algorithms, Real-World Applications and Research Directions. SN Comput. Sci. 2021, 2, 160. [Google Scholar] [CrossRef]
  12. Shahin, M.; Simjoo, M. Potential Applications of Innovative AI-Based Tools in Hydrogen Energy Development: Leveraging Large Language Model Technologies. Int. J. Hydrogen Energy 2025, 102, 918–936. [Google Scholar] [CrossRef]
  13. Kwon, H.; Park, J.; Shin, J.E.; Koo, B. Optimal Investment Strategy Analysis of On-Site Hydrogen Production Based on the Hydrogen Demand Prediction Using Machine Learning. Int. J. Energy Res. 2024, 2024. [Google Scholar] [CrossRef]
  14. Dash, S.K.; Chakraborty, S.; Elangovan, D. A Brief Review of Hydrogen Production Methods and Their Challenges. Energies 2023, 16, 1141. [Google Scholar] [CrossRef]
  15. Alagumalai, A.; Devarajan, B.; Song, H.; Wongwises, S.; Ledesma-Amaro, R.; Mahian, O.; Sheremet, M.; Lichtfouse, E. Machine Learning in Biohydrogen Production: A Review. Biofuel Res. J. 2023, 10, 1844–1858. [Google Scholar] [CrossRef]
  16. Kumar Sharma, A.; Kumar Ghodke, P.; Goyal, N.; Nethaji, S.; Chen, W.-H. Machine Learning Technology in Biohydrogen Production from Agriculture Waste: Recent Advances and Future Perspectives. Bioresour. Technol. 2022, 364, 128076. [Google Scholar] [CrossRef]
  17. Bassey, K.E.; Ibegbulam, C. Machine learning for green hydrogen production. Comput. Sci. IT Res. J. 2023, 4, 368–385. [Google Scholar] [CrossRef]
  18. Allal, Z.; Noura, H.N.; Salman, O.; Vernier, F.; Chahine, K. A Review on Machine Learning Applications in Hydrogen Energy Systems. Int. J. Thermofluids 2025, 26, 101119. [Google Scholar] [CrossRef]
  19. Takeda, S.; Nam, H.; Chapman, A. Low-Carbon Energy Transition with the Sun and Forest: Solar-Driven Hydrogen Production from Biomass. Int. J. Hydrogen Energy 2022, 47, 24651–24668. [Google Scholar] [CrossRef]
  20. Devasahayam, S. Deep Learning Models in Python for Predicting Hydrogen Production: A Comparative Study. Energy 2023, 280, 128088. [Google Scholar] [CrossRef]
  21. Qi, H.; Cui, P.; Liu, Z.; Xu, Z.; Yao, D.; Wang, Y.; Zhu, Z.; Yang, S. Conceptual Design and Comprehensive Analysis for Novel Municipal Sludge Gasification-Based Hydrogen Production via Plasma Gasifier. Energy Convers. Manag. 2021, 245, 114635. [Google Scholar] [CrossRef]
  22. Haq, Z.U.; Ullah, H.; Khan, M.N.A.; Naqvi, S.R.; Ahsan, M. Hydrogen Production Optimization from Sewage Sludge Supercritical Gasification Process Using Machine Learning Methods Integrated with Genetic Algorithm. Chem. Eng. Res. Des. 2022, 184, 614–626. [Google Scholar] [CrossRef]
  23. Kononenko, I. Machine Learning for Medical Diagnosis: History, State of the Art and Perspective. Artif. Intell. Med. 2001, 23, 89–109. [Google Scholar] [CrossRef]
  24. Fradkov, A.L. Early History of Machine Learning. IFAC-Pap. 2020, 53, 1385–1390. [Google Scholar] [CrossRef]
  25. Zhu, X.; Goldberg, A.B. Introduction to Semi-Supervised Learning; Springer International Publishing: Cham, Switzerland, 2009; ISBN 978-3-031-00420-9. [Google Scholar]
  26. Zhou, Z.-H. Machine Learning; Springer Singapore: Singapore, 2021; ISBN 978-981-15-1966-6. [Google Scholar]
  27. Mahesh, B. Machine Learning Algorithms—A Review. Int. J. Sci. Res. (IJSR) 2020, 9, 381–386. [Google Scholar] [CrossRef]
  28. Rundo, F.; Trenta, F.; di Stallo, A.L.; Battiato, S. Machine Learning for Quantitative Finance Applications: A Survey. Appl. Sci. 2019, 9, 5574. [Google Scholar] [CrossRef]
  29. Zitnik, M.; Nguyen, F.; Wang, B.; Leskovec, J.; Goldenberg, A.; Hoffman, M.M. Machine Learning for Integrating Data in Biology and Medicine: Principles, Practice, and Opportunities. Inf. Fusion 2019, 50, 71–91. [Google Scholar] [CrossRef] [PubMed]
  30. Lary, D.J.; Alavi, A.H.; Gandomi, A.H.; Walker, A.L. Machine Learning in Geosciences and Remote Sensing. Geosci. Front. 2016, 7, 3–10. [Google Scholar] [CrossRef]
  31. Morgan, D.; Jacobs, R. Opportunities and Challenges for Machine Learning in Materials Science. Annu. Rev. Mater. Res. 2020, 50, 71–103. [Google Scholar] [CrossRef]
  32. Schweidtmann, A.M.; Esche, E.; Fischer, A.; Kloft, M.; Repke, J.; Sager, S.; Mitsos, A. Machine Learning in Chemical Engineering: A Perspective. Chem. Ing. Tech. 2021, 93, 2029–2039. [Google Scholar] [CrossRef]
  33. Murkin, C.; Brightling, J. Eighty Years of Steam Reforming. Johnson Matthey Technol. Rev. 2016, 60, 263–269. [Google Scholar] [CrossRef]
  34. Simpson, A.; Lutz, A. Exergy Analysis of Hydrogen Production via Steam Methane Reforming. Int. J. Hydrogen Energy 2007, 32, 4811–4820. [Google Scholar] [CrossRef]
  35. Saha, P.; Akash, F.A.; Shovon, S.M.; Monir, M.U.; Ahmed, M.T.; Khan, M.F.H.; Sarkar, S.M.; Islam, M.K.; Hasan, M.M.; Vo, D.-V.N.; et al. Grey, Blue, and Green Hydrogen: A Comprehensive Review of Production Methods and Prospects for Zero-Emission Energy. Int. J. Green Energy 2024, 21, 1383–1397. [Google Scholar] [CrossRef]
  36. Oni, A.O.; Anaya, K.; Giwa, T.; Di Lullo, G.; Kumar, A. Comparative Assessment of Blue Hydrogen from Steam Methane Reforming, Autothermal Reforming, and Natural Gas Decomposition Technologies for Natural Gas-Producing Regions. Energy Convers. Manag. 2022, 254, 115245. [Google Scholar] [CrossRef]
  37. Bauer, C.; Treyer, K.; Antonini, C.; Bergerson, J.; Gazzani, M.; Gencer, E.; Gibbins, J.; Mazzotti, M.; McCoy, S.T.; McKenna, R.; et al. On the Climate Impacts of Blue Hydrogen Production. Sustain. Energy Fuels 2022, 6, 66–75. [Google Scholar] [CrossRef]
  38. Van Cappellen, L.; Croezen, H.; Rooijers, F. Feasibility Study into Blue Hydrogen Technical, Economic & Sustainability Analysis. 2018. Available online: https://www.cedelft.eu/en/publications/2149/ (accessed on 20 March 2025).
  39. AlHumaidan, F.S.; Absi Halabi, M.; Rana, M.S.; Vinoba, M. Blue Hydrogen: Current Status and Future Technologies. Energy Convers. Manag. 2023, 283, 116840. [Google Scholar] [CrossRef]
  40. Howarth, R.W.; Jacobson, M.Z. How Green Is Blue Hydrogen? Energy Sci. Eng. 2021, 9, 1676–1687. [Google Scholar] [CrossRef]
  41. Vo, N.D.; Oh, D.H.; Kang, J.-H.; Oh, M.; Lee, C.-H. Dynamic-Model-Based Artificial Neural Network for H2 Recovery and CO2 Capture from Hydrogen Tail Gas. Appl. Energy 2020, 273, 115263. [Google Scholar] [CrossRef]
  42. Yu, X.; Shen, Y.; Guan, Z.; Zhang, D.; Tang, Z.; Li, W. Multi-Objective Optimization of ANN-Based PSA Model for Hydrogen Purification from Steam-Methane Reforming Gas. Int. J. Hydrogen Energy 2021, 46, 11740–11755. [Google Scholar] [CrossRef]
  43. Tong, L.; Bénard, P.; Zong, Y.; Chahine, R.; Liu, K.; Xiao, J. Artificial Neural Network Based Optimization of a Six-Step Two-Bed Pressure Swing Adsorption System for Hydrogen Purification. Energy AI 2021, 5, 100075. [Google Scholar] [CrossRef]
  44. Streb, A.; Mazzotti, M. Performance Limits of Neural Networks for Optimizing an Adsorption Process for Hydrogen Purification and CO2 Capture. Comput. Chem. Eng. 2022, 166, 107974. [Google Scholar] [CrossRef]
  45. Nkulikiyinka, P.; Wagland, S.T.; Manovic, V.; Clough, P.T. Prediction of Combined Sorbent and Catalyst Materials for SE-SMR, Using QSPR and Multitask Learning. Ind. Eng. Chem. Res. 2022, 61, 9218–9233. [Google Scholar] [CrossRef]
  46. Vo, N.D.; Kang, J.-H.; Oh, D.-H.; Jung, M.Y.; Chung, K.; Lee, C.-H. Sensitivity Analysis and Artificial Neural Network-Based Optimization for Low-Carbon H2 Production via a Sorption-Enhanced Steam Methane Reforming (SESMR) Process Integrated with Separation Process. Int. J. Hydrogen Energy 2022, 47, 820–847. [Google Scholar] [CrossRef]
  47. Oh, H.-T.; Kum, J.; Park, J.; Dat Vo, N.; Kang, J.-H.; Lee, C.-H. Pre-Combustion CO2 Capture Using Amine-Based Absorption Process for Blue H2 Production from Steam Methane Reformer. Energy Convers. Manag. 2022, 262, 115632. [Google Scholar] [CrossRef]
  48. Pizoń, Z.; Kimijima, S.; Brus, G. Enhancing a Deep Learning Model for the Steam Reforming Process Using Data Augmentation Techniques. Energies 2024, 17, 2413. [Google Scholar] [CrossRef]
  49. Wang, Y.; Cui, X.; Peters, D.; Çıtmacı, B.; Alnajdi, A.; Morales-Guio, C.G.; Christofides, P.D. Machine Learning-Based Predictive Control of an Electrically-Heated Steam Methane Reforming Process. Digit. Chem. Eng. 2024, 12, 100173. [Google Scholar] [CrossRef]
  50. Cherif, A.; Lee, J.-S.; Nebbali, R.; Lee, C.-J. Novel Design and Multi-Objective Optimization of Autothermal Steam Methane Reformer to Enhance Hydrogen Production and Thermal Matching. Appl. Therm. Eng. 2022, 217, 119140. [Google Scholar] [CrossRef]
  51. Gul, H.; Arshad, M.Y.; Tahir, M.W. Production of H2 via Sorption Enhanced Auto-Thermal Reforming for Small Scale Applications-A Process Modeling and Machine Learning Study. Int. J. Hydrogen Energy 2023, 48, 12622–12635. [Google Scholar] [CrossRef]
  52. Newborough, M.; Cooley, G. Developments in the Global Hydrogen Market: The Spectrum of Hydrogen Colours. Fuel Cells Bull. 2020, 2020, 16–22. [Google Scholar] [CrossRef]
  53. Chavan, P.D.; Sharma, T.; Mall, B.K.; Rajurkar, B.D.; Tambe, S.S.; Sharma, B.K.; Kulkarni, B.D. Development of Data-Driven Models for Fluidized-Bed Coal Gasification Process. Fuel 2012, 93, 44–51. [Google Scholar] [CrossRef]
  54. Patil-Shinde, V.; Kulkarni, T.; Kulkarni, R.; Chavan, P.D.; Sharma, T.; Sharma, B.K.; Tambe, S.S.; Kulkarni, B.D. Artificial Intelligence-Based Modeling of High Ash Coal Gasification in a Pilot Plant Scale Fluidized Bed Gasifier. Ind. Eng. Chem. Res. 2014, 53, 18678–18689. [Google Scholar] [CrossRef]
  55. Azzam, M.; Aramouni, N.A.K.; Ahmad, M.N.; Awad, M.; Kwapinski, W.; Zeaiter, J. Dynamic Optimization of Dry Reformer under Catalyst Sintering Using Neural Networks. Energy Convers. Manag. 2018, 157, 146–156. [Google Scholar] [CrossRef]
  56. Alsaffar, M.A.; Mageed, A.K.; Abdel Ghany, M.A.R.; Ayodele, B.V.; Mustapa, S.I. Elucidating the Non-Linear Effect of Process Parameters on Hydrogen Production by Catalytic Methane Reforming: An Artificial Intelligence Approach. IOP Conf. Ser. Mater. Sci. Eng. 2020, 991, 012078. [Google Scholar] [CrossRef]
  57. Le, V.T.; Dragoi, E.-N.; Almomani, F.; Vasseghian, Y. Artificial Neural Networks for Predicting Hydrogen Production in Catalytic Dry Reforming: A Systematic Review. Energies 2021, 14, 2894. [Google Scholar] [CrossRef]
  58. Byun, M.; Lee, H.; Choe, C.; Cheon, S.; Lim, H. Machine Learning Based Predictive Model for Methanol Steam Reforming with Technical, Environmental, and Economic Perspectives. Chem. Eng. J. 2021, 426, 131639. [Google Scholar] [CrossRef]
  59. Ayodele, B.V.; Mustapa, S.I.; Kanthasamy, R.; Zwawi, M.; Cheng, C.K. Modeling the Prediction of Hydrogen Production by Co-gasification of Plastic and Rubber Wastes Using Machine Learning Algorithms. Int. J. Energy Res. 2021, 45, 9580–9594. [Google Scholar] [CrossRef]
  60. Ayodele, B.V.; Alsaffar, M.A.; Mustapa, S.I.; Adesina, A.; Kanthasamy, R.; Witoon, T.; Abdullah, S. Process Intensification of Hydrogen Production by Catalytic Steam Methane Reforming: Performance Analysis of Multilayer Perceptron-Artificial Neural Networks and Nonlinear Response Surface Techniques. Process Saf. Environ. Prot. 2021, 156, 315–329. [Google Scholar] [CrossRef]
  61. Hong, S.; Lee, J.; Cho, H.; Kim, M.; Moon, I.; Kim, J. Multi-Objective Optimization of CO2 Emission and Thermal Efficiency for on-Site Steam Methane Reforming Hydrogen Production Process Using Machine Learning. J. Clean. Prod. 2022, 359, 132133. [Google Scholar] [CrossRef]
  62. Chen, W.; Chen, Z.; Hsu, S.; Park, Y.; Juan, J.C. Reactor Design of Methanol Steam Reforming by Evolutionary Computation and Hydrogen Production Maximization by Machine Learning. Int. J. Energy Res. 2022, 46, 20685–20703. [Google Scholar] [CrossRef]
  63. Kim, C.; Won, W.; Kim, J. Early-Stage Evaluation of Catalyst Using Machine Learning Based Modeling and Simulation of Catalytic Systems: Hydrogen Production via Water–Gas Shift over Pt Catalysts. ACS Sustain. Chem. Eng. 2022, 10, 14417–14432. [Google Scholar] [CrossRef]
  64. Liu, S.; Yang, Y.; Yu, L.; Zhu, F.; Cao, Y.; Liu, X.; Yao, A.; Cao, Y. Predicting Gas Production by Supercritical Water Gasification of Coal Using Machine Learning. Fuel 2022, 329, 125478. [Google Scholar] [CrossRef]
  65. Huang, J.; Liang, Z.; Liu, Y. Smart Reforming for Hydrogen Production via Machine Learning. Chem. Eng. Sci. 2025, 304, 120959. [Google Scholar] [CrossRef]
  66. Chi, J.; Yu, H. Water Electrolysis Based on Renewable Energy for Hydrogen Production. Chin. J. Catal. 2018, 39, 390–394. [Google Scholar] [CrossRef]
  67. El-Shafie, M. Hydrogen Production by Water Electrolysis Technologies: A Review. Results Eng. 2023, 20, 101426. [Google Scholar] [CrossRef]
  68. Alamiery, A. Advancements in Materials for Hydrogen Production: A Review of Cutting-Edge Technologies. ChemPhysMater 2023. [Google Scholar] [CrossRef]
  69. Valizadeh, S.; Hakimian, H.; Farooq, A.; Jeon, B.-H.; Chen, W.-H.; Hoon Lee, S.; Jung, S.-C.; Won Seo, M.; Park, Y.-K. Valorization of Biomass through Gasification for Green Hydrogen Generation: A Comprehensive Review. Bioresour. Technol. 2022, 365, 128143. [Google Scholar] [CrossRef]
  70. Pan, J.; Shahbeik, H.; Shafizadeh, A.; Rafiee, S.; Golvirdizadeh, M.; Ghafarian Nia, S.A.; Mobli, H.; Yang, Y.; Zhang, G.; Tabatabaei, M.; et al. Machine Learning Optimization for Enhanced Biomass-Coal Co-Gasification. Renew. Energy 2024, 229, 120772. [Google Scholar] [CrossRef]
  71. Kumar, M.; Meena, B.; Subramanyam, P.; Suryakala, D.; Subrahmanyam, C. Recent Trends in Photoelectrochemical Water Splitting: The Role of Cocatalysts. NPG Asia Mater. 2022, 14, 88. [Google Scholar] [CrossRef]
  72. Mohd Raub, A.A.; Bahru, R.; Mohd Nashruddin, S.N.A.; Yunas, J. Advances of Nanostructured Metal Oxide as Photoanode in Photoelectrochemical (PEC) Water Splitting Application. Heliyon 2024, 10, e39079. [Google Scholar] [CrossRef]
  73. Saifuddin, N.; Priatharsini, P. Developments in Bio-Hydrogen Production from Algae: A Review. Res. J. Appl. Sci. Eng. Technol. 2016, 12, 968–982. [Google Scholar] [CrossRef]
  74. Li, J.; Pan, L.; Suvarna, M.; Wang, X. Machine Learning Aided Supercritical Water Gasification for H2-Rich Syngas Production with Process Optimization and Catalyst Screening. Chem. Eng. J. 2021, 426, 131285. [Google Scholar] [CrossRef]
  75. Sezer, S.; Özveren, U. Investigation of Syngas Exergy Value and Hydrogen Concentration in Syngas from Biomass Gasification in a Bubbling Fluidized Bed Gasifier by Using Machine Learning. Int. J. Hydrogen Energy 2021, 46, 20377–20396. [Google Scholar] [CrossRef]
  76. Saadetnejad, D.; Oral, B.; Can, E.; Yıldırım, R. Machine Learning Analysis of Gas Phase Photocatalytic CO2 Reduction for Hydrogen Production. Int. J. Hydrogen Energy 2022, 47, 19655–19668. [Google Scholar] [CrossRef]
  77. Cheng, G.; Luo, E.; Zhao, Y.; Yang, Y.; Chen, B.; Cai, Y.; Wang, X.; Dong, C. Analysis and Prediction of Green Hydrogen Production Potential by Photovoltaic-Powered Water Electrolysis Using Machine Learning in China. Energy 2023, 284, 129302. [Google Scholar] [CrossRef]
  78. Yang, Q.; Ma, Z.; Bai, L.; Yuan, Q.; Gou, F.; Li, Y.; Du, Z.; Chen, Y.; Liu, X.; Yu, J.; et al. Machine Learning Assisted Prediction for Hydrogen Production of Advanced Photovoltaic Technologies. DeCarbon 2024, 4, 100050. [Google Scholar] [CrossRef]
  79. Babay, M.-A.; Adar, M.; Chebak, A.; Mabrouki, M. Forecasting Green Hydrogen Production: An Assessment of Renewable Energy Systems Using Deep Learning and Statistical Methods. Fuel 2025, 381, 133496. [Google Scholar] [CrossRef]
  80. Salah, A.; Hanel, L.; Beirow, M.; Scheffknecht, G. Modelling SER Biomass Gasification Using Dynamic Neural Networks. In Computer Aided Chemical Engineering; Elsevier: Amsterdam, The Netherlands, 2016; Volume 38, pp. 19–24. [Google Scholar]
  81. Krzywanski, J.; Fan, H.; Feng, Y.; Shaikh, A.R.; Fang, M.; Wang, Q. Genetic Algorithms and Neural Networks in Optimization of Sorbent Enhanced H2 Production in FB and CFB Gasifiers. Energy Convers. Manag. 2018, 171, 1651–1661. [Google Scholar] [CrossRef]
  82. Ozbas, E.E.; Aksu, D.; Ongen, A.; Aydin, M.A.; Ozcan, H.K. Hydrogen Production via Biomass Gasification, and Modeling by Supervised Machine Learning Algorithms. Int. J. Hydrogen Energy 2019, 44, 17260–17268. [Google Scholar] [CrossRef]
  83. Torky, M.; Dahy, G.; Hassanein, A.E. GH2_MobileNet: Deep Learning Approach for Predicting Green Hydrogen Production from Organic Waste Mixtures. Appl. Soft Comput. 2023, 138, 110215. [Google Scholar] [CrossRef]
  84. Gil, M.V.; Jablonka, K.M.; Garcia, S.; Pevida, C.; Smit, B. Biomass to Energy: A Machine Learning Model for Optimum Gasification Pathways. Digit. Discov. 2023, 2, 929–940. [Google Scholar] [CrossRef]
  85. Meena, M.; Kumar, H.; Chaturvedi, N.D.; Kovalev, A.A.; Bolshev, V.; Kovalev, D.A.; Sarangi, P.K.; Chawade, A.; Rajput, M.S.; Vivekanand, V.; et al. Biomass Gasification and Applied Intelligent Retrieval in Modeling. Energies 2023, 16, 6524. [Google Scholar] [CrossRef]
  86. Oral, B.; Can, E.; Yildirim, R. Analysis of Photoelectrochemical Water Splitting Using Machine Learning. Int. J. Hydrogen Energy 2022, 47, 19633–19654. [Google Scholar] [CrossRef]
  87. Tajima, M.; Nagai, Y.; Chen, S.; Pan, Z.; Katayama, K. A Robust Methodology for PEC Performance Analysis of Photoanodes Using Machine Learning and Analytical Data. Analyst 2024, 149, 4193–4207. [Google Scholar] [CrossRef]
  88. Sahu, N.; Azad, C.; Kumar, U. Construction of Hybrid Models Based on Cascade Technique Using Basic Machine Learning Models: An Application as Photocurrent Density Predictor of the Photoelectrode in PEC Cell. Mater. Today Commun. 2024, 41, 110643. [Google Scholar] [CrossRef]
  89. Mishra, S.; Kumar, P.; Dey, S.; Pattanayak, P.; Singh, T. Design of Ternary Metal Oxides for Photoelectrochemical Water Splitting Using Machine Learning Techniques. J. Environ. Chem. Eng. 2025, 13, 115260. [Google Scholar] [CrossRef]
  90. Taheri, E.; Amin, M.M.; Fatehizadeh, A.; Rezakazemi, M.; Aminabhavi, T.M. Artificial Intelligence Modeling to Predict Transmembrane Pressure in Anaerobic Membrane Bioreactor-Sequencing Batch Reactor during Biohydrogen Production. J. Environ. Manag. 2021, 292, 112759. [Google Scholar] [CrossRef]
  91. Hosseinzadeh, A.; Zhou, J.L.; Altaee, A.; Li, D. Machine Learning Modeling and Analysis of Biohydrogen Production from Wastewater by Dark Fermentation Process. Bioresour. Technol. 2022, 343, 126111. [Google Scholar] [CrossRef] [PubMed]
  92. Venkatesh, P.; Chowdhury, M.R.; Rajasekhar, N.; Radhakrishnan, T.K.; Samsudeen, N. Deep Learning Based Modelling and Control of a Microbial Electrolysis Cell for Enhanced Bio Hydrogen Production. Int. J. Hydrogen Energy 2024. [Google Scholar] [CrossRef]
  93. Office of Nuclear Energy. Nine Mile Point Begins Clean Hydrogen Production. Available online: https://www.energy.gov/ne/articles/nine-mile-point-begins-clean-hydrogen-production (accessed on 23 March 2025).
  94. Fernández-Arias, P.; Antón-Sancho, Á.; Lampropoulos, G.; Vergara, D. Emerging Trends and Challenges in Pink Hydrogen Research. Energies 2024, 17, 2291. [Google Scholar] [CrossRef]
  95. Diab, J.; Fulcheri, L.; Hessel, V.; Rohani, V.; Frenklach, M. Why Turquoise Hydrogen Will Be a Game Changer for the Energy Transition. Int. J. Hydrogen Energy 2022, 47, 25831–25848. [Google Scholar] [CrossRef]
  96. Muradov, N.; Veziroğlu, T. From Hydrocarbon to Hydrogen–Carbon to Hydrogen Economy. Int. J. Hydrogen Energy 2005, 30, 225–237. [Google Scholar] [CrossRef]
  97. Steinberg, M. Fossil Fuel Decarbonization Technology for Mitigating Global Warming. Int. J. Hydrogen Energy 1999, 24, 771–777. [Google Scholar] [CrossRef]
  98. Dagle, R.A.; Dagle, V.; Bearden, M.D.; Holladay, J.D.; Krause, T.R.; Ahmed, S. An Overview of Natural Gas Conversion Technologies for Co-Production of Hydrogen and Value-Added Solid Carbon Products; Richland, WA, USA, 2017. [Google Scholar] [CrossRef]
  99. Bhutto, A.W.; Bazmi, A.A.; Zahedi, G. Underground Coal Gasification: From Fundamentals to Applications. Prog. Energy Combust. Sci. 2013, 39, 189–214. [Google Scholar] [CrossRef]
  100. Jiang, L.; Xue, D.; Wei, Z.; Chen, Z.; Mirzayev, M.; Chen, Y.; Chen, S. Coal Decarbonization: A State-of-the-Art Review of Enhanced Hydrogen Production in Underground Coal Gasification. Energy Rev. 2022, 1, 100004. [Google Scholar] [CrossRef]
  101. Schneider, S.; Bajohr, S.; Graf, F.; Kolb, T. State of the Art of Hydrogen Production via Pyrolysis of Natural Gas. ChemBioEng Rev. 2020, 7, 150–158. [Google Scholar] [CrossRef]
  102. Hermesmann, M.; Müller, T.E. Green, Turquoise, Blue, or Grey? Environmentally Friendly Hydrogen Production in Transforming Energy Systems. Prog. Energy Combust. Sci. 2022, 90, 100996. [Google Scholar] [CrossRef]
  103. Kim, J.; Rweyemamu, M.; Purevsuren, B. Machine Learning-Based Approach for Hydrogen Economic Evaluation of Small Modular Reactors. Sci. Technol. Nucl. Install. 2022, 2022, 1–9. [Google Scholar] [CrossRef]
  104. Salimian, A.; Grisan, E. Deep Learning Analysis of Plasma Emissions: A Potential System for Monitoring Methane and Hydrogen in the Pyrolysis Processes. Int. J. Hydrogen Energy 2024, 58, 1030–1043. [Google Scholar] [CrossRef]
  105. Wen, Y.; Wang, S.; Wu, L.; Hondo, E.; Tang, C.; Jiang, J.; Ho, G.W.; Kawi, S.; Wang, C.-H. Exploring the Role of Process Control and Catalyst Design in Methane Catalytic Decomposition: A Machine Learning Perspective. Int. J. Hydrogen Energy 2024, 72, 601–613. [Google Scholar] [CrossRef]
  106. Zhang, T.; Zhang, Y.; Katterbauer, K.; Al Shehri, A.; Sun, S.; Hoteit, I. Deep Learning–Assisted Phase Equilibrium Analysis for Producing Natural Hydrogen. Int. J. Hydrogen Energy 2024, 50, 473–486. [Google Scholar] [CrossRef]
  107. Zhao, Y.; Wang, J.; Yi, Q. Bridging Uncertainty Gaps with Artificial Intelligence-Assisted Syngas Precise Prediction in Coal Gasification. Chem. Eng. Sci. 2025, 301, 120734. [Google Scholar] [CrossRef]
  108. Ceylan, Z.; Ceylan, S. Application of Machine Learning Algorithms to Predict the Performance of Coal Gasification Process. In Applications of Artificial Intelligence in Process Systems Engineering; Elsevier: Amsterdam, The Netherlands, 2021; pp. 165–186. [Google Scholar] [CrossRef]
Figure 1. Global energy mix by scenario to 2050 [1]. STEPS = Stated Policies Scenario; APS = Announced Pledges Scenario; NZE = Net Zero Emissions by 2050 Scenario.
Figure 2. Hydrogen demand by sector and by region, historical and in the Net Zero Emissions by 2050 Scenario, 2019–2030 [6].
Figure 3. Classification of hydrogen production methods.
Figure 4. Carbon intensity for various H2 production methods in 2022 (data obtained from UNECE [9]).
Figure 5. Most frequently used keywords of ML applications in hydrogen production according to the Scopus database (developed by VOSviewer, version 1.6.20).
Figure 6. Simplified process flow diagram of SMR.
Figure 7. Simplified process flow diagram of ATR.
Figure 8. Nine Mile Point Nuclear Station (photo: Constellation Energy) [93].
Figure 9. Simplified process flow diagram of coal gasifier.
Table 1. Selected examples of common ML algorithms.
| Algorithm | Concept | Sample Applications | Advantages | Limitations |
|---|---|---|---|---|
| LR | One of the simplest ML algorithms used for predicting continuous numerical values. It assumes a linear relationship between input features (independent variables) and the target variable (dependent variable). The algorithm fits a straight line that best represents the relationship between the input and output. | Predicting house prices based on size, location, and other features; forecasting sales trends in retail and e-commerce; stock price prediction in financial markets. | Simple and easy to interpret; works well when the relationship between variables is approximately linear. | Fails for nonlinear relationships; sensitive to outliers, which can distort predictions. |
| DT | A tree-like structure used for both classification and regression. It splits data into branches based on conditions, forming a flowchart-like decision model. Each node represents a decision based on a feature, and branches lead to possible outcomes. | Credit risk assessment (loan approvals); medical diagnosis (classifying diseases based on symptoms); customer segmentation (targeted marketing). | Easy to interpret and visualize; handles both numerical and categorical data; works well for small to medium-sized datasets. | Prone to overfitting on complex datasets; highly sensitive to noisy data (small changes in data can lead to different tree structures). |
| RF | An ensemble learning algorithm that builds multiple decision trees and combines their results to make more accurate predictions. It reduces overfitting by averaging multiple trees trained on different subsets of data. | Fraud detection in banking; predicting customer churn in telecom and subscription-based businesses; medical imaging analysis (cancer detection from MRI scans). | Higher accuracy than a single decision tree; handles missing data well and works on large datasets; reduces overfitting by combining multiple trees. | Computationally expensive for large datasets; harder to interpret compared to a single decision tree. |
| SVM | A powerful classification algorithm that works by finding the best decision boundary (hyperplane) for separating different classes. It aims to maximize the margin between data points of different classes. | Text classification (spam email detection); image recognition (face detection); medical diagnostics (classifying tumors as benign or malignant). | Effective for high-dimensional data; works well for small datasets with clear class separation. | Computationally expensive for large datasets; sensitive to noisy data and requires careful feature scaling. |
| K-means | An unsupervised learning algorithm that groups similar data points into k clusters. It minimizes the distance between data points within a cluster and assigns new data to the closest cluster. | Customer segmentation in marketing; anomaly detection (fraudulent transactions); image segmentation in computer vision. | Fast and scalable for large datasets; works well when clusters are clearly defined. | Sensitive to outliers; requires the number of clusters (k) to be predefined. |
| ANN | Inspired by the human brain, consisting of layers of interconnected neurons. These models use backpropagation to adjust weights and improve accuracy. | Speech recognition (Google Assistant, Siri); autonomous driving (object detection in self-driving cars); medical diagnostics (AI-driven X-ray analysis). | Handles complex problems like speech and image recognition; self-learning capabilities from vast amounts of data. | Requires large datasets for training; computationally expensive (needs GPUs). |
| Gradient Boosting (XGBoost, LightGBM, CatBoost) | Combines multiple weak models (decision trees) to create a strong predictive model. It corrects previous mistakes iteratively using gradient descent. | Financial modeling (credit scoring); weather forecasting; medical outcome prediction. | Handles missing data and outliers well. | Computationally expensive for big data; prone to overfitting if not carefully tuned. |
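The LR entry in Table 1 (a straight-line fit that is sensitive to outliers) can be illustrated with a minimal sketch. The helper `fit_line` and all data values below are hypothetical and chosen only for illustration; they are not taken from any of the reviewed studies. The sketch fits ordinary least squares for a single feature in plain Python, then refits after one point is replaced by an outlier.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept (single feature)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of squared deviations of x, and of cross-deviations of x and y
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical noise-free data generated from y = 2x + 1
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]
slope, intercept = fit_line(xs, ys)

# Same data, but the last observation is replaced by an outlier (29 instead of 9)
ys_outlier = [1.0, 3.0, 5.0, 7.0, 29.0]
slope_out, intercept_out = fit_line(xs, ys_outlier)

print(f"clean fit:   slope={slope:.2f}, intercept={intercept:.2f}")
print(f"outlier fit: slope={slope_out:.2f}, intercept={intercept_out:.2f}")
```

On the clean data the fit recovers the generating line (slope 2, intercept 1); with the single outlier the slope triples to 6 and the intercept moves to -3, which is the distortion the Limitations column refers to.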
Table 2. Comparison of ATR and SMR for blue hydrogen production [39].
| Feature | SMR | ATR |
|---|---|---|
| Feedstock | Natural gas (CH4) | Natural gas (CH4) |
| Process complexity | Lower (relies on external heating) | Higher (requires O2 supply) |
| CO2 capture efficiency | Moderate (~75–85% with CCUS) | Higher (~95% with CCUS) |
| Energy requirements | Higher (requires external heat input) | Lower (self-sustaining heat generation) |
| CO2 emissions | Moderate (requires CCUS) | Lower (easier CO2 separation) |
| Industrial maturity | Widely used globally | Emerging but growing |
| Capital costs | Lower (simpler design) | Higher (complex setup) |
Table 3. Research overview of ML within blue hydrogen production.
Table 3. Research overview of ML within blue hydrogen production.
| Category | No. | Reference | Algorithms | Dataset | Inputs | Output(s) | Key Findings |
| --- | --- | --- | --- | --- | --- | --- | --- |
| SMR | 1 | Vo et al. (2020) [41] | Dynamic-model-based ANN, SVD, FFBPNN | 108 (cryogenic unit); 291 (membrane); 35 (PSA) | Membrane area, adsorption time, purge-to-feed ratio | H2 purity, CO2 capture rate, H2 productivity, H2 production cost, energy consumption for CO2 capture | ANN models provided high accuracy (<2% error) and significant computational cost reduction. |
| SMR | 2 | Yu et al. (2021) [42] | ANN-GA | 100 | Adsorption pressure, part of adsorption time, feed flow rate, length of activated carbon layer, ratio of purge to feed | Purity of hydrogen, recovery rate, productivity of the PSA process | The optimized PSA process achieved hydrogen purity above 99% while balancing recovery and productivity. |
| SMR | 3 | Tong et al. (2021) [43] | ANN | 112 | Adsorption pressure, adsorption step time | H2 purity and recovery | ANN can effectively predict and optimize PSA¹-based hydrogen purification. |
| SMR | 4 | Streb and Mazzotti (2022) [44] | ANN | 20,000 | Feed composition (mol fractions of CO2, CO, CH4, N2, Ar, and H2), adsorption time, light purge duration, evacuation pressure, recycle ratio | H2 purity, H2 recovery, CO2 purity, CO2 recovery, CO2 specific energy consumption, productivity | ANN was successfully used for multi-objective constrained optimization: H2 purity ≥ 99–99.97%; CO2 purity ≥ 96%; H2 and CO2 recovery ≥ 90%. |
| SMR | 5 | Nkulikiyinka et al. (2022) [45] | QSPR, MTL, ASNN, DNN, LSSVM | 446 | Molecular descriptors of materials, CaO or Ni concentration, calcination/carbonation temperature and time, synthesis method, BET surface area, steam-to-carbon ratio | Methane conversion, CO2 capture capacity | ASNN with GSFrag descriptors + multitask learning gave the most accurate predictions. |
| SMR | 6 | Vo et al. (2022) [46] | ANN | 402 | Inlet temperature, velocity, steam-to-carbon ratio, purge-to-feed ratio, adsorption pressure | H2 purity and recovery, CO2 capture efficiency, H2 production cost, energy consumption | The ANN-based SE-SMR model reduces simulation time from 2 h to 20 s, achieving 99.99% H2 purity and 90.3% CO2 capture efficiency. |
| SMR | 7 | Oh et al. (2022) [47] | ANN-DE | 480 | MDEA concentration, PZ concentration, flash drum pressure, PZ ion flow rate in lean-amine solvent | Reboiler duty, electricity consumption, total equivalent work | ANNs and DE successfully optimize pre-combustion CO2 capture in SMR-based blue hydrogen production. |
| SMR | 8 | Pizoń et al. (2024) [48] | ANN | 10,475 | Temperature, steam-to-CH4 ratio, N2-to-CH4 ratio, CH4 flow rate, nickel catalyst mass | Concentration of H2, CO, CO2, and CH4 | The ANN model outperformed traditional kinetic models, with MSE = 0.00022. |
| SMR | 9 | Wang et al. (2024) [49] | RNN, LSTM | 100,000 | Electric current, reactor temperature, flow rate of CH4, H2, CO2, and CO | Reactor temperature, flow rate of CH4, H2, CO2, and CO | The LSTM-RNN model can accurately predict reactor dynamics and drive model predictive control for H2 production. |
| ATR | 10 | Cherif et al. (2022) [50] | MOGA | N/A | Catalyst configuration (Ni/Al2O3 or Pt/Al2O3) | H2 yield, maximum wall temperature | The optimized catalyst configuration gave a 46% increase in H2 yield and a 27% increase in CH4 conversion. |
| ATR | 11 | Gul et al. (2023) [51] | LM, BR, SCG | N/A | Concentration of CH4, CO, CO2, H2, H2O, and N2, CaCO3 and CaO (solid phase), reactor temperature | H2 yield, CO2 capture efficiency, H2 purity, CH4 conversion | Sorption-enhanced autothermal reforming (SEATR) process achieved 97% H2 purity (compared to 66% in conventional ATR) and 94% CH4 conversion (compared to 77% in conventional ATR). |

¹ The hydrogen can be produced via coal gasification or SMR.
Table 4. Examples of gray hydrogen production.
| No. | Reference | Algorithms | Dataset | Inputs | Output(s) | Key Findings |
| --- | --- | --- | --- | --- | --- | --- |
| 1 | Chavan et al. (2012) [53] | MVR, ANN | 106 | Fixed carbon, volatile matter, mineral matter/ash content, air feed per kg of coal, steam feed per kg of coal, bed temperature | Gas production rate, heating value of the product gas | ANN models outperformed MVR models. The air feed rate was the most influential factor for both gas production and heating value. |
| 2 | Patil-Shinde et al. (2014) [54] | GP, ANN, PCA | 36 | Fuel ratio, ash content, specific surface area of coal, activation energy of gasification, coal feed rate, gasifier bed temperature, ash discharge rate, air/coal ratio | CO + H2 generation rate, syngas production rate, carbon conversion, heating value of syngas | Both GP and ANN models performed well, with R2 between 0.920 and 0.996. Air/coal ratio, temperature, ash discharge rate, and coal feed rate were the most influential inputs. |
| 3 | Azzam et al. (2018) [55] | ANN-GA | 2000 | Reaction temperature, pressure, catalyst diameter | CH4 conversion, CO2 conversion, H2/CO ratio, molar percentage of solid carbon | ANN and GA provide accurate and efficient optimization; high temperatures favor DRM performance but increase catalyst degradation. |
| 4 | Alsaffar et al. (2020) [56] | ANN-MLP | 30 | Gas hourly space velocity, O2 concentration in the feed, reaction temperature, CH4/CO2 ratio | H2 yield, CH4 conversion | The best-performing ANN architecture was 4-9-2, achieving a sum of squares error of 0.076 and R2 > 0.9. |
| 5 | Le et al. (2021) [57] | ANN-DE | 100 | Hydrocarbon type, catalyst composition, reaction temperature, support material properties, process conditions | Hydrocarbon conversion, H2 yield, catalyst stability | Hydrocarbon type affects H2 yield. The best ANN model had MSE < 0.05 and relative error < 3.36%. |
| 6 | Byun et al. (2021) [58] | SVR, DT, GPR | 10,000 | Number of reactors, temperature, H2 permeance, membrane area, sweep gas flow rate, steam-to-carbon ratio, compressor capital cost, labor cost, natural gas cost, electricity cost | H2 production rate, CO2 emission rate, unit H2 production cost | Reactor count and operating temperature have the strongest influence on hydrogen production. The GPR model outperformed SVR and DT. |
| 7 | Ayodele et al. (2021) [59] | ANN (RBFNN and MLP) | 30 | Gasification temperature, rubber seed shell particle size, high-density polyethylene particle size, amount of plastic in the mixture | H2 production | One-layer MLP showed the best performance with an R2 of 0.990 and the lowest sum of squares error. |
| 8 | Ayodele et al. (2021) [60] | ANN-MLP | 17 | Methane partial pressure, steam partial pressure, reaction temperature | H2 yield and CH4 conversion | ANN with 3–17–15–2 structure provides the best prediction for H2 yield (R2 = 0.997) and CH4 conversion (R2 = 0.996). |
| 9 | Hong et al. (2022) [61] | DNN-PSO | 10,514 for operation, 10,000 for simulation | Natural gas feed flow rate, demineralized water flow rate, air flow rate, natural gas fuel flow rate, PSA recovery rate, system pressure, off-gas pressure, SMR reactor inlet/outlet temperature, LTS reactor inlet temperature, air-to-fuel ratio | H2 production, H2 purity, thermal efficiency, CO2 emission, SMR conversion efficiency | The hybrid DNN model achieves an R2 score of 0.94; higher thermal efficiency comes at the cost of higher CO2 emissions. |
| 10 | Chen et al. (2022) [62] | NNs | 60 | Inlet temperature, steam-to-carbon ratio, Reynolds number | H2 yield, methanol conversion | Steam-to-carbon ratio has the most significant impact on H2 yield; NNs achieve high prediction accuracy. |
| 11 | Kim et al. (2022) [63] | ANN | 419 | A total of 34 input features (catalyst composition, operating conditions, catalyst preparation conditions) | CO conversion | The ANN model predicts one-pass CO conversion with high accuracy (R2 = 0.997). The best-performing catalysts include Pt/Co(10 wt%)/Al2O3, Pt/Co(20 wt%)/Al2O3, and Pt/Ce(5 wt%)/TiO2. |
| 12 | Liu et al. (2022) [64] | GBR, RF, SVR, DT, ANN, ABR | 3536 | Elements (C, H, O, N, S), moisture, ash, volatile, fixed carbon, temperature, concentration ratio, equivalence ratio, residence time | H2, CO, CH4, CO2 gas yields | GBR was the most accurate model. Operating conditions (especially temperature and residence time) contributed 88.55% to gas yield predictions. |
| 13 | Huang et al. (2025) [65] | LR, RR, Lasso, ENR, DT, RF, GBR, ETR, XGBoost, KNN, MLP | 1386 | Temperature, steam-to-carbon ratio, oxygen-to-carbon ratio, pressure | H2 yield, CO2 yield, heat duty | XGBoost outperformed all other models. Temperature was the most influential factor for H2 yield. |
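The Key Findings above compare models by R2, mean squared error (MSE), and sum of squares error. As a reminder of how these standard metrics relate to one another, here is a minimal sketch in plain Python. The function names and the sample numbers are illustrative assumptions; none of the values come from the reviewed studies.

```python
def sse(y_true, y_pred):
    """Sum of squared errors between observed and predicted values."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred))


def mse(y_true, y_pred):
    """Mean squared error: SSE divided by the number of samples."""
    return sse(y_true, y_pred) / len(y_true)


def r2(y_true, y_pred):
    """Coefficient of determination: 1 - SSE / total sum of squares."""
    mean_true = sum(y_true) / len(y_true)
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    if ss_tot == 0:
        raise ValueError("R2 is undefined when all observed values are equal")
    return 1.0 - sse(y_true, y_pred) / ss_tot


if __name__ == "__main__":
    # Made-up H2 yields (arbitrary units) used purely for illustration.
    observed = [10.0, 20.0, 30.0, 40.0]
    predicted = [12.0, 18.0, 33.0, 39.0]
    print("SSE:", sse(observed, predicted))
    print("MSE:", mse(observed, predicted))
    print("R2 :", r2(observed, predicted))
```

Because SSE and MSE depend on the scale and size of the dataset while R2 is normalized by the variance of the observations, values reported by different studies are only directly comparable when the targets and units match.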
Table 5. Comparison of electrolyzer technologies for green hydrogen production.
| Electrolyzer Type | Electrolyte | Temperature (°C) | Efficiency (%) | Advantages | Challenges |
| --- | --- | --- | --- | --- | --- |
| AEL | KOH or NaOH solution | 60–80 | 65–75 | Low-cost, mature technology | Low current density |
| PEM | Solid polymer membrane | 50–80 | 75–80 | Fast response, compact | Uses expensive catalysts (Pt, Ir) |
| SOEC | Ceramic oxide | 700–1000 | 80–85 | High efficiency, uses heat | High degradation, expensive materials |
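To make the efficiency column of Table 5 more tangible, the sketch below converts an efficiency into the electricity needed per kilogram of hydrogen. It rests on an assumption that Table 5 does not confirm: that efficiency is defined as the heating value of the hydrogen produced divided by the electrical energy supplied. The lower heating value (about 33.33 kWh/kg) is used by default and the higher heating value (about 39.39 kWh/kg) can be passed instead. The function names are hypothetical.

```python
# Heating values of hydrogen in kWh per kg (standard physical properties).
H2_LHV_KWH_PER_KG = 33.33
H2_HHV_KWH_PER_KG = 39.39

# Efficiency ranges in percent, copied from Table 5.
EFFICIENCY_RANGES = {"AEL": (65, 75), "PEM": (75, 80), "SOEC": (80, 85)}


def specific_energy(efficiency_pct, heating_value=H2_LHV_KWH_PER_KG):
    """Electricity input (kWh) per kg of H2 at the given efficiency."""
    if not 0 < efficiency_pct <= 100:
        raise ValueError("efficiency_pct must be in (0, 100]")
    return heating_value / (efficiency_pct / 100.0)


def energy_range(electrolyzer, heating_value=H2_LHV_KWH_PER_KG):
    """Return (lowest, highest) kWh per kg H2 over the Table 5 range."""
    low_eff, high_eff = EFFICIENCY_RANGES[electrolyzer]
    return (specific_energy(high_eff, heating_value),
            specific_energy(low_eff, heating_value))


if __name__ == "__main__":
    for name in EFFICIENCY_RANGES:
        low, high = energy_range(name)
        print(f"{name}: {low:.1f} to {high:.1f} kWh per kg H2 (LHV basis)")
```

The heating-value basis matters: the same efficiency figure implies roughly 18% more electricity per kilogram on an HHV basis than on an LHV basis, so the basis should be checked before comparing figures from different sources. For SOEC, part of the energy can be supplied as heat, which this electricity-only sketch does not account for.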
Table 6. Selected examples of green hydrogen production.
| Category | No. | Reference | Algorithms | Dataset | Inputs | Output(s) | Key Findings |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Water electrolysis | 1 | Li et al. (2021) [74] | NNs, RF, SVR | 718 | Feedstock composition, operational conditions (temperature, pressure, reaction time, solid content), catalyst properties | H2 yield, CO2 yield, CH4 yield, CO yield | NNs outperformed RF and SVR in optimizing H2 production from supercritical water gasification. |
| Water electrolysis | 2 | Sezer and Özveren (2021) [75] | ANN-LM | 370,656 | Carbon content, H2 content, O2 content, gasifier temperature, steam flow rate, fuel (biomass) flow rate | H2 mole fraction in syngas, total exergy value of syngas | The ANN model achieved high accuracy (R2 = 0.9999 for training and test data sets). |
| Water electrolysis | 3 | Haq et al. (2022) [22] | GPR, ETR, ANN, SVM, GA | 125 | Proximate analysis of sewage sludge, ultimate analysis of sewage sludge, supercritical water gas operation conditions | H2 yield, CO2 yield, CH4 yield, CO yield | The GPR model is the best for predicting H2 yield; temperature is the most influential factor for H2 production. |
| Water electrolysis | 4 | Saadetnejad et al. (2022) [76] | RF, DT | 549 | Photocatalyst properties (semiconductor material, band gap energy, co-catalyst type and loading) and reaction conditions (temperature, pressure, CO2/H2O molar ratio) | Band gap energy of the photocatalyst, total gas production rate | The best semiconductors for gas-phase CO2 reduction are CeO2, SrTiO3, ZnS, and ZrO2. |
| Water electrolysis | 5 | Cheng et al. (2023) [77] | SVM, Prophet | 9840 | Temperature, atmospheric pressure, relative humidity, cloud cover, precipitation, fixed month index, full timestamp and time-series structure (for Prophet only) | Hydrogen production | ML is effective for regional hydrogen production forecasting, especially when integrated with local climate data. SVM outperformed Prophet. |
| Water electrolysis | 6 | Yang et al. (2024) [78] | ELM, RF, SVM, GA, LSTM, RBF, BPNN | 1095 | Solar irradiance, temperature, sunshine hours | Photovoltaic (PV) power generation, H2 production | LSTM performed best with R2 = 0.8402. HJT PV technology produced the most H2 at the lowest cost. |
| Water electrolysis | 7 | Babay et al. (2025) [79] | SVR, RF, MLP, LSTM-CNN | N/A | Solar irradiance, ambient temperature, photovoltaic (PV) panel temperature, panel type, seasonal data | H2 production | Polycrystalline panels showed higher H2 output than monocrystalline and amorphous silicon. RF gave the best accuracy. |
| Biomass or organic waste | 8 | Salah et al. (2016) [80] | NARX | N/A | Mass flow of fuel and steam into gasifier, fuel mass and air and O2 flow into regenerator, continuous and discontinuous mass flow from generator | Product gas flow rate, temperature and pressure of gasifier, temperature and pressure of regenerator | The model achieved low prediction errors, demonstrated real-time adaptability, and helped find the key operating parameters. |
| Biomass or organic waste | 9 | Krzywanski et al. (2018) [81] | ANN-GA | 25 | Reactor type, CaO/C mole ratio, H2O/C mole ratio, reaction temperature | Volumetric H2 concentration in syngas | The developed [4–3–3–1] ANN-GA model predicted H2 concentrations with high accuracy (relative error below ±8%). |
| Biomass or organic waste | 10 | Ozbas et al. (2019) [82] | LR, KNN, SVR, DT | 2036 | Time, temperature, concentration of CO, CO2, CH4, and O2, higher heating value of syngas | Hydrogen concentration in syngas | LR has the highest accuracy with R2 = 0.99. The highest H2 concentration in syngas reached 35% vol. |
| Biomass or organic waste | 11 | Torky et al. (2023) [83] | MobileNet-CNN, Xception-CNN, DNN, Mask-RCNN | 23,628 | Image data, waste characteristics (material type, waste category, physical properties, environmental conditions), estimated weight parameters (volume, density) | Waste classification (recyclable, organic, or harmful), estimated waste weight (dry or wet), H2 production | MobileNet-CNN achieved 93% accuracy for waste classification and 98% accuracy in distinguishing dry vs. wet organic waste. |
| Biomass or organic waste | 12 | Gil et al. (2023) [84] | GPR | 30 | Process parameters (temperature, steam-to-air ratio, stoichiometric ratio, steam-to-biomass ratio), biomass properties (C%, H%, O%, ash content) | H2 vol%, CO vol%, CH4 vol%, gas yield, combustible gas concentration | The GPR model achieved high predictive accuracy, with R2 values ranging from 0.82 to 0.98 for different gasification parameters. |
| Biomass or organic waste | 13 | Meena et al. (2023) [85] | ANN, SVM, DT, RF, GB | N/A | Process parameters (temperature, equivalence ratio, steam-to-biomass ratio, pressure), biomass properties (C%, H%, O%, ash content, volatile matter), type of gasifying agent, catalyst type, time | H2 vol%, CO vol%, CH4 vol%, CO2 vol%, syngas heating value, syngas yield, tar content, gasification efficiency | RF and GB models showed the highest accuracy, with R2 values exceeding 0.95 for predicting H2 yield and syngas composition. |
| Biomass or organic waste | 14 | Pan et al. (2024) [70] | GBR, RF, DT, KR, GA | 458 | Feedstock composition (C, H2, N2, O2, S, volatile matter, fixed carbon, ash, moisture %), temperature, biomass/coal blending ratio, equivalence ratio, gasifying agent type | Syngas yield, H2, CO2, CH4, and CO2 content, syngas lower heating value | GBR showed the highest accuracy (R2 up to 0.99) for predicting syngas composition and heating value. |
| PEC | 15 | Oral et al. (2022) [86] | ARM, RF, DT | 10,560 | Electrode materials, synthesis methods, doping elements, co-catalyst, second-layer materials, calcination conditions, electrolyte type and pH, irradiation conditions, applied bias voltage | Band gap energy, photocurrent density | ML successfully identified patterns and optimized conditions. RF performed well in predicting band gap energy. ARM and DT helped identify key parameters for enhancing PEC efficiency. |
| PEC | 16 | Tajima et al. (2024) [87] | SVR, GPR, DT | 75 (Fe2O3), 32 (BiVO4), 58 (WO3/BiVO4) | X-ray diffraction, Raman spectroscopy, UV/Vis absorbance, photoelectrochemical impedance spectroscopy | Photocurrent density | GPR achieved the highest prediction accuracy across all tested photoanode materials (hematite, BiVO4, and WO3/BiVO4), with R2 between 0.85 and 0.99, even for small datasets (30–70 samples). |
| PEC | 17 | Sahu et al. (2024) [88] | MLP, ABR, RR, ENR | 2593 | Band gap of photoelectrode material, working electrode area, light intensity, power of light source, pH, filter condition, molarity | Photocurrent density | The hybrid model ABR + MLP performs best with R2 = 0.9686. |
| PEC | 18 | Mishra et al. (2025) [89] | KNN, RF, AdB, GBR, XGBoost | 85 | Material properties (Shannon ionic radius, density, electronegativity, etc.), experimental conditions (light density, applied bias voltage, preparation method) | Band gap energy, photocurrent density | The XGBoost model performed best for both band gap prediction and photocurrent density prediction. |
| Biohydrogen | 19 | Taheri et al. (2021) [90] | ANN, ANFIS | 119 | Organic loading rate, effluent pH, mixed liquor suspended solids, mixed liquor volatile suspended solids | Transmembrane pressure | ANFIS slightly outperformed ANN (R2 = 0.93 vs. R2 = 0.88). |
| Biohydrogen | 20 | Hosseinzadeh et al. (2022) [91] | GBM, SVR, RF, AdaBoost | 210 | Acetate (A), butyrate (B), A/B ratio, ethanol, iron, nickel, pH, biomass proportion, hydraulic retention, chemical oxygen demand | H2 production (yield or rate) during dark fermentation | All four ML models showed high accuracy (R2 > 0.88), with RF having the highest (R2 = 0.902). |
| Biohydrogen | 21 | Venkatesh et al. (2024) [92] | LSTM, Bi-LSTM | 5600 | Applied voltage, sequential input data | Current density (directly correlates to H2 production rate) | Bi-LSTM outperforms LSTM in modeling and controlling biohydrogen production in a microbial electrolysis cell. |
Table 7. Selected examples of pink, turquoise, white, and black/brown hydrogen production.
| Category | No. | Reference | Algorithms | Dataset | Inputs | Output(s) | Key Findings |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Pink hydrogen | 1 | Kim et al. (2022) [103] | CART implemented in Minitab software | N/A | 61 inputs (heat consumption at H2 generation plant, electricity rating of SMRs, heat supplied to plants, operating years, tax rate, inflation, etc.) | H2 production cost (USD/kg) | ML can identify key economic drivers in nuclear H2 production. Heat consumption is the most important factor. |
| Turquoise hydrogen | 2 | Salimian and Grisan (2024) [104] | ResNet-50 | 4975 | Plasma emission spectra (200–1100 nm) reshaped as 32 × 32 tensors | H2 and CH4 concentration | The model performed well in predicting CH4 concentration but was less accurate for low H2 concentrations. |
| Turquoise hydrogen | 3 | Wen et al. (2024) [105] | RF, XGBoost, DT, ADA, GBDT, LGB, KNN, SVR, Lasso, RR, ENR, MLP, MLR | 2733 | wt% of Fe, Ni, Cu, Co, Al2O3, SiO2, TiO2, MgO; calcination temperature, CH4 concentration, gas hourly space velocity, reaction temperature and time | CH4 conversion and H2 yield | RF and XGBoost achieved the highest accuracy with R2 = 0.9999 for CH4 conversion and R2 = 0.9996 for the H2 yield model. |
| White hydrogen | 4 | Zhang et al. (2024) [106] | TINN | 5041 | Thermodynamic properties (critical temperature and pressure, acentric factor, mole fraction), process conditions (temperature, total molar density, pore radius) | Number of equilibrium phases, compositional mole fractions of gas and liquid phase | The TINN model achieves ~20× speedup in phase equilibrium computation compared to traditional iterative flash calculation methods. |
| Black hydrogen | 5 | Zhao et al. (2025) [107] | BP-MLP, SVR, MLR-RR, DT, RF, XGBoost, GPR | 750 | Coal composition (C, H, N, O, S, Cl, volatile matter, fixed carbon, ash content, moisture content), temperature, pressure, steam-to-coal ratio, oxygen-to-coal ratio | Syngas component proportions: H2, CO, CO2, CH4, N2, and others, hydrogen-to-carbon ratio | The BP-MLP showed the best performance. The steam-to-coal ratio, moisture content, and Cl content were the most influential features for H2 prediction. |
| Black hydrogen | 6 | Ceylan and Ceylan (2021) [108] | SMOreg, GPR, Lazy K-Star, Lazy IBk, AMT, RF | 106 | Mineral matter, fixed carbon, volatile matter, air feed, steam feed, bed temperature | Gas yield, heating value | RF performed best with R2 = 0.9998 for gas yield and R2 = 0.9730 for heating value. |
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

MDPI and ACS Style

Du, X.; Gao, S.; Yang, G. Machine Learning Applications in Gray, Blue, and Green Hydrogen Production: A Comprehensive Review. Gases 2025, 5, 9. https://doi.org/10.3390/gases5020009

