# Advanced Machine Learning Methods for Learning from Sparse Data in High-Dimensional Spaces: A Perspective on Uses in the Upstream of Development of Novel Energy Technologies


## Abstract


## 1. Introduction

^{3} atoms. It is therefore enticing to deploy ML methods to replace such calculations or at least to reduce the number of them required. Of course, ML can boost the development not only of the so-called sustainable or “green” energy technologies; in this perspective we also provide an example where it can be used to advance research useful for the future of nuclear energy.

## 2. High Dimensionality and Extremely Low Data Density in the Space of Descriptors

#### 2.1. Examples of Input—Output Mappings Used in ML for Energy Technologies

CO$_{2}$. CO$_{2}$ can be recycled through vegetation, but unfortunately the timeframe from dead vegetation to fossil fuels is too long to rely on such a natural carbon cycle with current and projected levels of energy consumption. There is a quest for what can be called an anthropogenic chemical carbon cycle [46]. In it, one does not rely on burning of fossil fuels but on a set of technologies which either generate electricity directly from the sun with solar cells [47] and from other sustainable sources such as wind [48], or synthesize, using sunlight, CO$_{2}$, and other inputs, fuels which can be utilized in a clean way, such as hydrogen or liquids which can be used with cleaner exhaust in fuel cells or even burnt directly [49,50,51].

#### 2.2. New Technologies and Challenges Require New Simulation Methods–A Large Scope for Machine Learning

**x**), where **x** is a vector of Cartesian coordinates x, y, z. The function is positive definite and integrates to N: $\int \rho \left(\mathit{x}\right)d\mathit{x}=N$. We represent ρ as a sum of N positive definite pieces, each integrating to 1:

**x**), which can take two alternative forms:

^{3} atoms. Large-scale modeling means more realistic modeling and is required to properly account for a range of phenomena which are intrinsically large-scale (e.g., microstructure-driven properties). In OF-DFT, ρ takes the meaning of the electron density, N is the number of electrons, ${\varphi}_{i}$ are single-electron (Kohn–Sham) orbitals [36], T is the kinetic energy, and $\tau \left(\mathit{x}\right)$ is the kinetic energy density (KED) (we neglect spins and partial orbital occupancy without loss of generality). The mapping $T=T\left[\rho \left(\mathit{x}\right)\right]$ is the kinetic energy functional (KEF). Approximate formulas for such an expression ($T\left[f\left(\mathit{x}\right)\right]$) exist, but they are not accurate enough for most applications where ab initio modeling is needed, including (organic and inorganic) semiconductors and transition-metal-containing functional materials of novel energy technologies. In the past several years, substantial progress has been made on this problem with the help of machine learning, in particular with techniques such as neural networks and kernel methods [41,78,79,80,81,82]. We will return to this example later in the context of deep learning. Some uses of ML to improve quantum mechanics-based modeling methods are reviewed in [39].
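To make the quantities ρ, T, and the KED concrete, here is a minimal sketch on a toy system that is not taken from the source: N non-interacting particles in a one-dimensional box of unit length, with analytic orbitals $\varphi_{n}(x)=\sqrt{2}\sin(n\pi x)$. It assumes the standard Kohn–Sham expressions in atomic units, $\rho=\sum_i \varphi_i^2$, $\tau=-\frac{1}{2}\sum_i \varphi_i \varphi_i''$ and $\tau_{+}=\frac{1}{2}\sum_i |\varphi_i'|^2$, and checks numerically that the two kinetic energy densities differ pointwise but integrate to the same T. All function names are hypothetical.

```python
import numpy as np

def trapezoid(y, x):
    """Composite trapezoid rule on a uniform grid."""
    dx = x[1] - x[0]
    return dx * (y.sum() - 0.5 * (y[0] + y[-1]))

def box_orbitals(n_orb, x):
    """Particle-in-a-box orbitals on [0, 1] with first and second derivatives."""
    n = np.arange(1, n_orb + 1)[:, None]   # shape (n_orb, 1)
    k = n * np.pi
    phi = np.sqrt(2.0) * np.sin(k * x)     # shape (n_orb, len(x))
    dphi = np.sqrt(2.0) * k * np.cos(k * x)
    d2phi = -k**2 * phi
    return phi, dphi, d2phi

def density_and_ked(n_orb, x):
    """Density and the two alternative kinetic energy densities (assumed forms)."""
    phi, dphi, d2phi = box_orbitals(n_orb, x)
    rho = (phi**2).sum(axis=0)                # sum of N pieces, each integrating to 1
    tau = -0.5 * (phi * d2phi).sum(axis=0)    # Laplacian form
    tau_plus = 0.5 * (dphi**2).sum(axis=0)    # positive-definite gradient form
    return rho, tau, tau_plus

x = np.linspace(0.0, 1.0, 2001)
N = 3
rho, tau, tau_plus = density_and_ked(N, x)

T_exact = sum(0.5 * (n * np.pi) ** 2 for n in range(1, N + 1))
print("integral of rho      :", trapezoid(rho, x))
print("T from tau           :", trapezoid(tau, x))
print("T from tau_plus      :", trapezoid(tau_plus, x))
print("analytic T           :", T_exact)
```

In this toy case the orbitals are known, so T is trivial to compute; the difficulty that OF-DFT faces is obtaining T from ρ alone, without the orbitals.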

#### 2.3. The Curse of Dimensionality and Why ML Techniques Are Effective

^{D}. With M as small as 10 and D as small as 10, one would need $10^{10}$ data points. This is the curse of dimensionality. Moreover, when one constructs a function with polynomials or Fourier expansions, not only the number of required samples but also the number of terms in the representation grows exponentially. As a result, it is impossible to achieve a good sampling density just by adding more data. For example, already in 20 dimensions, a million data points is equivalent to only about 2 data points per degree of freedom (of an equivalent direct product grid). If we somehow managed to get 10 times more data, the sampling density would only increase to about 2.2 data points per degree of freedom. In practice, one therefore always works with extremely sparse data. That is why 32 dimensions in the example above is in fact very high. One starts feeling this curse of dimensionality from about 6 dimensions and up.
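The numbers quoted above follow from simple arithmetic: a direct product grid with M points per dimension holds $M^{D}$ points, so a dataset of size n corresponds to $n^{1/D}$ points per dimension. A short sketch (the function names are ours, chosen for illustration):

```python
def direct_product_grid_size(points_per_dim, dim):
    """Number of points on a direct product grid with points_per_dim per dimension."""
    return points_per_dim ** dim

def points_per_dimension(n_points, dim):
    """Equivalent number of grid points per dimension for a dataset of n_points."""
    return n_points ** (1.0 / dim)

print(direct_product_grid_size(10, 10))          # 10000000000
print(round(points_per_dimension(1e6, 20), 2))   # about 2.0
print(round(points_per_dimension(1e7, 20), 2))   # about 2.24
```

A tenfold increase in data raises the per-dimension density only by a factor of $10^{1/20}\approx 1.12$, which is why adding data alone does not cure sparsity.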

## 3. Advanced Techniques for Working with Sparse Data

#### 3.1. Brief Introduction to Neural Networks and Gaussian Process Regression

#### 3.1.1. Neural Networks (NN)

**x** but their arguments are scalars dependent on **x** as well as on weights **w** and biases b. This parameterization by **w** and b makes a flexible, non-direct product basis. This representation goes back to the Kolmogorov theorem of 1957 [86]; since then, restrictions on σ have been relaxed in a series of papers [87,88,89,90,91,92,93,94,95]. This expression is a universal approximator even when σ is the same for all n, as long as σ is smooth and nonlinear [96]. In applications, σ is typically the same for all n and is typically a sigmoid-shaped function such as the hyperbolic tangent, $\sigma \left(x\right)=\tanh \left(x\right)=\left({e}^{x}-{e}^{-x}\right)/\left({e}^{x}+{e}^{-x}\right)$, but it does not have to be [96]. The parameters **w** and b and the coefficients c are fitted to reproduce a set of known samples of f, ${f}^{j}=f\left({\mathit{x}}^{j}\right)$, j = 1, …, M. The fit is nonlinear because σ is nonlinear.
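The structure described here can be sketched in a few lines. The code below assumes the common single-hidden-layer form $f(\mathit{x})\approx c_{0}+\sum_{n} c_{n}\,\sigma(\mathit{w}_{n}\cdot \mathit{x}+b_{n})$; the output bias `c0`, the array shapes, and the function names are our assumptions for illustration, not the authors' implementation. Only the forward evaluation is shown; in practice the parameters would be obtained by a nonlinear least-squares fit to the samples.

```python
import numpy as np

def sigma(t):
    """Neuron activation function quoted in the text (equal to tanh)."""
    return (np.exp(t) - np.exp(-t)) / (np.exp(t) + np.exp(-t))

def nn_single_hidden_layer(X, W, b, c, c0=0.0):
    """Evaluate c0 + sum_n c[n] * sigma(W[n] . x + b[n]) for each row x of X.

    X: (M, D) inputs, W: (n_neurons, D) weights,
    b: (n_neurons,) biases, c: (n_neurons,) output coefficients.
    """
    z = X @ W.T + b          # scalar argument of each neuron, shape (M, n_neurons)
    return c0 + sigma(z) @ c

rng = np.random.default_rng(0)
D, n_neurons, M = 6, 10, 5
X = rng.uniform(-1.0, 1.0, size=(M, D))
W = rng.normal(size=(n_neurons, D))
b = rng.normal(size=n_neurons)
c = rng.normal(size=n_neurons)

y = nn_single_hidden_layer(X, W, b, c)
print(y.shape)
print("number of nonlinear parameters:", W.size + b.size)
```

Each basis function depends on all D coordinates through a single scalar, which is what makes the basis non-direct-product: the number of parameters grows linearly with D rather than exponentially.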

#### 3.1.2. Gaussian Process Regression (GPR)

**x**) at certain points in space ${\mathit{x}}^{j}$, what are the expectation values f(**x**) and their variance Δf(**x**) for function values at other points in space **x**?” One assumes that correlation between data can be described with a kernel, a chosen type of the covariance function $k\left({\mathit{x}}_{1},{\mathit{x}}_{2}\right)$. The answer is given by

**f** is a vector of all ${f}^{j}$ values, and the matrices **K** and **K**\* are computed from pairwise covariances among the data:

**f**.

_{ν} is the modified Bessel function of the second kind. At different values of ν this function becomes a Gaussian ($\nu \to \infty $), a simple exponential ($\nu =1/2$), and various other widely used functions (such as Matern3/2 and Matern5/2 for ν = 3/2 and 5/2, respectively). The parameters of the covariance function are the only parameters; they are hyper-parameters, and they are few, as few as one (for an isotropic kernel at fixed ν). This is therefore a non-parametric method, which is an advantage. While hyper-parameters still need to be chosen, the performance is usually about equally good as long as they are in some reasonable range. In Equation (7), the critical parameter is the length parameter l; the prefactor ${\sigma}^{2}$ is fully correlated with δ in Equation (6). Note that Equation (5) is a non-direct product representation.
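A minimal GPR sketch is given below. It uses the textbook expressions for the posterior mean and variance, $\mathbf{K}^{*}\mathbf{K}^{-1}\mathbf{f}$ and $\mathbf{K}^{**}-\mathbf{K}^{*}\mathbf{K}^{-1}{\mathbf{K}^{*}}^{T}$, with a small δ added to the diagonal of **K**, and the closed form of the Matern kernel at ν = 5/2. That these correspond to the equations referenced in the text is our assumption; the function names, the test function, and the training grid are hypothetical.

```python
import numpy as np

def matern52(X1, X2, length=1.0, sigma2=1.0):
    """Isotropic Matern kernel for nu = 5/2 (closed form, no Bessel function needed)."""
    d = np.sqrt(((X1[:, None, :] - X2[None, :, :]) ** 2).sum(axis=-1))
    s = np.sqrt(5.0) * d / length
    return sigma2 * (1.0 + s + s**2 / 3.0) * np.exp(-s)

def gpr_predict(X_train, f_train, X_new, length=1.0, sigma2=1.0, delta=1e-10):
    """Posterior mean and variance at X_new given samples f_train at X_train."""
    K = matern52(X_train, X_train, length, sigma2) + delta * np.eye(len(X_train))
    K_star = matern52(X_new, X_train, length, sigma2)
    K_ss = matern52(X_new, X_new, length, sigma2)
    mean = K_star @ np.linalg.solve(K, f_train)
    cov = K_ss - K_star @ np.linalg.solve(K, K_star.T)
    return mean, np.diag(cov)

def target(X):
    """Toy two-dimensional function used only for this illustration."""
    return np.sin(X[:, 0]) + np.cos(X[:, 1])

g = np.linspace(0.0, 1.0, 5)
X_train = np.array([[a, b] for a in g for b in g])   # 5 x 5 training grid
f_train = target(X_train)
X_new = np.array([[0.4, 0.6], [0.9, 0.1]])

mean, var = gpr_predict(X_train, f_train, X_new)
print("predicted:", mean)
print("exact    :", target(X_new))
print("variance :", var)
```

The only quantities to choose are the kernel hyper-parameters (`length`, `sigma2`) and `delta`; there are no fitted nonlinear parameters, in contrast to the NN. The cost is dominated by the linear solve with the M × M matrix **K**.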

#### 3.1.3. Relative Pros and Cons of GPR vs. NN

#### 3.2. High-Dimensional Model Representation (HDMR)

#### 3.3. Combining HDMR with ML for Learning from Sparse Data

#### 3.3.1. Machine Learning of HDMR Terms

_{d}(D) and can be quite high. An even further extension that we proposed is to represent f(**x**) as a sum of lower-dimensional functions in new coordinates ${\mathit{x}}^{\left(i\right)}$ which are linear combinations of the original coordinates:
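The idea of a sum of lower-dimensional functions in linearly transformed coordinates can be sketched as follows. Each component function $f_{i}$ acts on $\mathit{x}^{(i)}=\mathbf{A}_{i}\mathit{x}$, where $\mathbf{A}_{i}$ is a d × D matrix; when every $\mathbf{A}_{i}$ simply selects d of the original coordinates, one recovers d-dimensional terms over coordinate subsets, of which there are "D choose d". The matrices, component functions, and names below are illustrative assumptions, not the authors' code.

```python
from itertools import combinations
from math import comb
import numpy as np

def coordinate_subsets(D, d):
    """Selection matrices A_i (d x D), one per d-element subset of D coordinates."""
    mats = []
    for idx in combinations(range(D), d):
        A = np.zeros((d, D))
        A[np.arange(d), list(idx)] = 1.0
        mats.append(A)
    return mats

def evaluate_sum(x, A_list, f_list):
    """f(x) ~ sum_i f_i(A_i x), with A_i x playing the role of x^(i)."""
    return sum(f(A @ x) for A, f in zip(A_list, f_list))

# Toy example: D = 4, d = 2, every component function is a product of its two inputs.
A_list = coordinate_subsets(4, 2)
f_list = [lambda u: u[0] * u[1]] * len(A_list)
x = np.array([1.0, 2.0, 3.0, 4.0])
print(len(A_list), evaluate_sum(x, A_list, f_list))   # 6 terms, sum of pairwise products

# Number of d-dimensional terms for a 15-dimensional function.
print(comb(15, 2), comb(15, 3))                        # 105 455

# General linear combinations: any d x D matrix defines new coordinates.
rng = np.random.default_rng(1)
A_general = [rng.normal(size=(2, 4)) for _ in range(3)]
print(evaluate_sum(x, A_general, [lambda u: np.tanh(u).sum()] * 3))
```

With general matrices the number of terms is a free choice rather than a combinatorial count, which is the attraction of this extension when data are sparse.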

#### 3.3.2. RS-HDMR-NN (Random Sampling High-Dimensional Model Representation Neural Network)

#### 3.3.3. RS-HDMR-GPR (Random Sampling High-Dimensional Model Representation Gaussian Process Regression)

UF$_{6}$ gas, enrichment of the $^{235}$UF$_{6}$ fraction, conversion to solid uranium oxide, and then production of nuclear fuel assemblies. A key step in this cycle is uranium enrichment to bring the fraction of uranium-235 from the naturally occurring 0.7% to 3–5% (about 99.3% of naturally occurring U is $^{238}$U) [124]. Not all reactor types require enrichment but most reactors operating in the world do, while other applications (e.g., defense) require much higher enrichment degrees. The enrichment is typically done in the gaseous phase, by enriching UF$_{6}$ gas in centrifuges. This is a costly process: the enrichment cost can account for about 10% of the electricity cost [125]. For many years now, researchers and the industry have been studying laser-driven enrichment whereby one excites isotope-sensitive hyperfine transitions either of uranium or uranium hexafluoride [126,127]. This requires very high laser coherency (on the order of $10^{5}$) and brings with it various issues. If one could instead use vibrational transitions in UF$_{6}$, some of which are isotopomer-selective (such as the mode at around 628 cm$^{-1}$, which differs by about 0.6 cm$^{-1}$ between $^{235}$UF$_{6}$ and $^{238}$UF$_{6}$, implying a necessary coherency on the order of $10^{3}$) [128], one could use cheaper and less coherent IR lasers.

UF$_{6}$ [129]. We computed that UF$_{6}$ can be adsorbed on different graphene derivatives with tunable adsorption energy depending on the derivative, and that in the adsorbed state there exists an isotopomer-specific vibrational mode which can be used to heat and desorb the molecules in an isotopomer-selective way, as illustrated in Figure 6.

UF$_{6}$ would be critical for such a technique, and accurate modeling of vibrational properties and dynamics is critical for the ability to simulate this process [130]. Unfortunately, good, well-resolved vibrational spectra of UF$_{6}$ are not even found in the experimental literature [128,131], and to compute accurate vibrational spectra or vibrational dynamics, one requires a good interatomic potential function (potential energy surface, PES), which for a UF$_{6}$ molecule (7 atoms, hence 3 × 7 − 6 = 15 internal degrees of freedom) is a 15-dimensional function. Accurate PESs for UF$_{6}$ are still unavailable, in particular due to the difficulty of building a 15-dimensional function from sparse data.

UF$_{6}$ as ${E}^{U{F}_{6}}\left(\mathit{x}\right)\approx {\displaystyle \sum}_{i=1}^{{N}_{cf}}{f}_{i}^{GPR}\left({x}_{1}^{\left(i\right)},{x}_{2}^{\left(i\right)},\dots ,{x}_{d}^{\left(i\right)}\right)$ from samples computed with density functional theory [130]. We trained the model on 2000, 3000, and 5000 data points and tested its quality on 50,000 data points, i.e., we used a test set much larger than the training set. The test set errors obtained with different orders d of HDMR and different numbers of training data are summarized in Table 1. More details of the calculations are given in [105].

$N_{cf}$ = 1). Consider now the results obtained with only 2000 training data points. This is a very sparse dataset, with a sampling density of only about 1.7 data points per dimension. What we observe here is that the fit with three-dimensional functions gives a better test set error than the full-dimensional fit. This is because, with a low sampling density, it is impossible to recover the full D-dimensional function [120]. The data points were sampled quasi-randomly [132] in the 15-dimensional space; they do not lie on subdimensional manifolds, but the information needed to recover the full D-dimensional terms is simply not there. Low data density also increases the danger of overfitting. All this together argues for representations with lower-dimensional functions such as those based on HDMR. Ultimately, this has to do with the fact that a finite-size dataset in a D-dimensional space is not a D-dimensional object but has a dimension anywhere between 0 and D [83,84].

#### 3.4. When Are Deep NNs Useful?

**x**) looks for aluminum, magnesium, and silicon crystals:

_{4}” in Figure 8).

_{+}, see Equation (2)) and of some of the density-dependent variables we used [41]. In Figure 9, $p=\frac{{\left|\nabla \rho \right|}^{2}}{4{\left(3{\pi}^{2}\right)}^{2/3}{\rho}^{8/3}}$ is the scaled (to satisfy the so-called exact conditions [136]) squared gradient and $q=\frac{\Delta \rho}{4{\left(3{\pi}^{2}\right)}^{2/3}{\rho}^{5/3}}$ is the scaled Laplacian of the density, TF stands for the Thomas–Fermi KED [137], ${\tau}_{TF}\left(\mathit{r}\right)=\frac{3}{10}{\left(3{\pi}^{2}\right)}^{2/3}{\rho}^{5/3}\left(\mathit{r}\right)$, and vW stands for the von Weizsäcker KED [138], ${\tau}_{vW}\left(\mathit{r}\right)=\frac{1}{8}\frac{{\left|\nabla \rho \left(\mathit{r}\right)\right|}^{2}}{\rho \left(\mathit{r}\right)}$. The KED distribution is very uneven, and the distributions of the density-dependent variables are in some cases extremely uneven. This means that there are vast parts of the space which are extremely sparsely sampled but which are still important for the quality of the model, and here we clearly see the advantage of a deep NN. Data distribution is an issue that still needs to be better addressed in machine learning. Just using weighted fitting is not sufficient, as we also saw in our research [41,78].
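The density-dependent variables defined above are easy to evaluate once ρ, |∇ρ| and Δρ are available. The sketch below does so for the hydrogen 1s density $\rho(r)=e^{-2r}/\pi$ (atomic units), a test case chosen by us because everything is analytic: |∇ρ| = 2ρ and Δρ = 4ρ − 4ρ/r. It also illustrates how unevenly such variables are distributed, with p spanning many orders of magnitude along the radial grid. The function names and the grid are our assumptions.

```python
import numpy as np

C_TF = (3.0 * np.pi**2) ** (2.0 / 3.0)

def tau_tf(rho):
    """Thomas-Fermi kinetic energy density."""
    return 0.3 * C_TF * rho ** (5.0 / 3.0)

def tau_vw(rho, grad_norm):
    """von Weizsaecker kinetic energy density."""
    return 0.125 * grad_norm**2 / rho

def p_scaled(rho, grad_norm):
    """Scaled squared gradient of the density."""
    return grad_norm**2 / (4.0 * C_TF * rho ** (8.0 / 3.0))

def q_scaled(rho, lap):
    """Scaled Laplacian of the density."""
    return lap / (4.0 * C_TF * rho ** (5.0 / 3.0))

def trapezoid(y, x):
    """Composite trapezoid rule on a uniform grid."""
    dx = x[1] - x[0]
    return dx * (y.sum() - 0.5 * (y[0] + y[-1]))

# Hydrogen 1s density on a radial grid (analytic derivatives).
r = np.linspace(1e-3, 30.0, 60001)
rho = np.exp(-2.0 * r) / np.pi
grad_norm = 2.0 * rho
lap = 4.0 * rho - 4.0 * rho / r

p = p_scaled(rho, grad_norm)
q = q_scaled(rho, lap)

# For a single orbital the von Weizsaecker KED is exact; it integrates to T = 1/2 hartree.
T_vw = trapezoid(tau_vw(rho, grad_norm) * 4.0 * np.pi * r**2, r)
print("T from tau_vW:", T_vw)
print("range of p   :", p.min(), p.max())
print("range of q   :", q.min(), q.max())
```

A useful consistency check on the definitions is the identity $\tau_{vW}=\frac{5}{3}\,p\,\tau_{TF}$, which holds for any density.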

## 4. Discussion and Conclusions

## Author Contributions

## Funding

## Institutional Review Board Statement

## Informed Consent Statement

## Conflicts of Interest

## References

- Tong, Q.; Gao, P.; Liu, H.; Xie, Y.; Lv, J.; Wang, Y.; Zhao, J. Combining Machine Learning Potential and Structure Prediction for Accelerated Materials Design and Discovery. J. Phys. Chem. Lett. **2020**, 11, 8710–8720.
- Walters, W.P.; Barzilay, R. Applications of Deep Learning in Molecule Generation and Molecular Property Prediction. Accounts Chem. Res. **2020**, 54, 263–270.
- Ramprasad, R.; Batra, R.; Pilania, G.; Mannodi-Kanakkithodi, A.; Kim, C. Machine learning in materials informatics: Recent applications and prospects. npj Comput. Mater. **2017**, 3, 54.
- Wang, A.Y.-T.; Murdock, R.J.; Kauwe, S.K.; Oliynyk, A.O.; Gurlo, A.; Brgoch, J.; Persson, K.A.; Sparks, T.D. Machine Learning for Materials Scientists: An Introductory Guide toward Best Practices. Chem. Mater. **2020**, 32, 4954–4965.
- Butler, K.T.; Davies, D.W.; Cartwright, H.; Isayev, O.; Walsh, A. Machine learning for molecular and materials science. Nature **2018**, 559, 547–555.
- Moosavi, S.M.; Jablonka, K.M.; Smit, B. The Role of Machine Learning in the Understanding and Design of Materials. J. Am. Chem. Soc. **2020**, 142, 20273–20287.
- del Cueto, M.; Troisi, A. Determining usefulness of machine learning in materials discovery using simulated research landscapes. Phys. Chem. Chem. Phys. **2021**, 23, 14156–14163.
- Kalidindi, S.R. Feature engineering of material structure for AI-based materials knowledge systems. J. Appl. Phys. **2020**, 128, 041103.
- Li, S.; Liu, Y.; Chen, D.; Jiang, Y.; Nie, Z.; Pan, F. Encoding the atomic structure for machine learning in materials science. WIREs Comput. Mol. Sci. **2021**, 12, e1558.
- Im, J.; Lee, S.; Ko, T.-W.; Kim, H.W.; Hyon, Y.; Chang, H. Identifying Pb-free perovskites for solar cells by machine learning. npj Comput. Mater. **2019**, 5, 37.
- Meftahi, N.; Klymenko, M.; Christofferson, A.J.; Bach, U.; Winkler, D.A.; Russo, S.P. Machine learning property prediction for organic photovoltaic devices. npj Comput. Mater. **2020**, 6, 166.
- Sahu, H.; Ma, H. Unraveling Correlations between Molecular Properties and Device Parameters of Organic Solar Cells Using Machine Learning. J. Phys. Chem. Lett. **2019**, 10, 7277–7284.
- Zhuo, Y.; Brgoch, J. Opportunities for Next-Generation Luminescent Materials through Artificial Intelligence. J. Phys. Chem. Lett. **2021**, 12, 764–772.
- Mahmood, A.; Wang, J.-L. Machine learning for high performance organic solar cells: Current scenario and future prospects. Energy Environ. Sci. **2020**, 14, 90–105.
- Li, F.; Peng, X.; Wang, Z.; Zhou, Y.; Wu, Y.; Jiang, M.; Xu, M. Machine Learning (ML)—Assisted Design and Fabrication for Solar Cells. Energy Environ. Mater. **2019**, 2, 280–291.
- Wang, C.-I.; Joanito, I.; Lan, C.-F.; Hsu, C.-P. Artificial neural networks for predicting charge transfer coupling. J. Chem. Phys. **2020**, 153, 214113.
- An, N.G.; Kim, J.Y.; Vak, D. Machine learning-assisted development of organic photovoltaics via high-throughput in situ formulation. Energy Environ. Sci. **2021**, 14, 3438–3446.
- Rodríguez-Martínez, X.; Pascual-San-José, E.; Campoy-Quiles, M. Accelerating organic solar cell material’s discovery: High-throughput screening and big data. Energy Environ. Sci. **2021**, 14, 3301–3322.
- Priya, P.; Aluru, N.R. Accelerated design and discovery of perovskites with high conductivity for energy applications through machine learning. npj Comput. Mater. **2021**, 7, 90.
- Srivastava, M.; Howard, J.M.; Gong, T.; Dias, M.R.S.; Leite, M.S. Machine Learning Roadmap for Perovskite Photovoltaics. J. Phys. Chem. Lett. **2021**, 12, 7866–7877.
- Teunissen, J.L.; Da Pieve, F. Molecular Bond Engineering and Feature Learning for the Design of Hybrid Organic–Inorganic Perovskite Solar Cells with Strong Noncovalent Halogen–Cation Interactions. J. Phys. Chem. C **2021**, 125, 25316–25326.
- Miyake, Y.; Saeki, A. Machine Learning-Assisted Development of Organic Solar Cell Materials: Issues, Analyses, and Outlooks. J. Phys. Chem. Lett. **2021**, 12, 12391–12401.
- Xu, S.; Liang, J.; Yu, Y.; Liu, R.; Xu, Y.; Zhu, X.; Zhao, Y. Machine Learning-Assisted Discovery of High-Voltage Organic Materials for Rechargeable Batteries. J. Phys. Chem. C **2021**, 125, 21352–21358.
- Moses, I.A.; Joshi, R.P.; Ozdemir, B.; Kumar, N.; Eickholt, J.; Barone, V. Machine Learning Screening of Metal-Ion Battery Electrode Materials. ACS Appl. Mater. Interfaces **2021**, 13, 53355–53362.
- Chen, A.; Zhang, X.; Chen, L.; Yao, S.; Zhou, Z. A Machine Learning Model on Simple Features for CO$_{2}$ Reduction Electrocatalysts. J. Phys. Chem. C **2020**, 124, 22471–22478.
- Lamoureux, P.S.; Winther, K.T.; Torres, J.A.G.; Streibel, V.; Zhao, M.; Bajdich, M.; Abild-Pedersen, F.; Bligaard, T. Machine Learning for Computational Heterogeneous Catalysis. ChemCatChem **2019**, 11, 3581–3601.
- Back, S.; Yoon, J.; Tian, N.; Zhong, W.; Tran, K.; Ulissi, Z.W. Convolutional Neural Network of Atomic Surface Structures To Predict Binding Energies for High-Throughput Screening of Catalysts. J. Phys. Chem. Lett. **2019**, 10, 4401–4408.
- Toyao, T.; Maeno, Z.; Takakusagi, S.; Kamachi, T.; Takigawa, I.; Shimizu, K.-I. Machine Learning for Catalysis Informatics: Recent Applications and Prospects. ACS Catal. **2019**, 10, 2260–2297.
- Li, X.; Paier, W.; Paier, J. Machine Learning in Computational Surface Science and Catalysis: Case Studies on Water and Metal–Oxide Interfaces. Front. Chem. **2020**, 8, 601029.
- Pablo-García, S.; García-Muelas, R.; Sabadell-Rendón, A.; López, N. Dimensionality reduction of complex reaction networks in heterogeneous catalysis: From linear-scaling relationships to statistical learning techniques. WIREs Comput. Mol. Sci. **2021**, 11, e1540.
- Li, X.; Chiong, R.; Page, A.J. Group and Period-Based Representations for Improved Machine Learning Prediction of Heterogeneous Alloy Catalysts. J. Phys. Chem. Lett. **2021**, 12, 5156–5162.
- Wu, D.; Zhang, J.; Cheng, M.-J.; Lu, Q.; Zhang, H. Machine Learning Investigation of Supplementary Adsorbate Influence on Copper for Enhanced Electrochemical CO$_{2}$ Reduction Performance. J. Phys. Chem. C **2021**, 125, 15363–15372.
- Palkovits, S. A Primer about Machine Learning in Catalysis—A Tutorial with Code. ChemCatChem **2020**, 12, 3995–4008.
- Giordano, L.; Akkiraju, K.; Jacobs, R.; Vivona, D.; Morgan, D.; Shao-Horn, Y. Electronic Structure-Based Descriptors for Oxide Properties and Functions. Accounts Chem. Res. **2022**, 55, 298–308.
- Hohenberg, P.; Kohn, W. Inhomogeneous Electron Gas. Phys. Rev. **1964**, 136, B864–B871.
- Kohn, W.; Sham, L.J. Self-consistent equations including exchange and correlation effects. Phys. Rev. **1965**, 140, A1133–A1138.
- Rapaport, D.C. The Art of Molecular Dynamics Simulation, 2nd ed.; Cambridge University Press: Cambridge, UK, 2004; ISBN 978-0-521-82568-9.
- Jansen, A.P.J. Kinetic Monte Carlo Algorithms. In An Introduction to Kinetic Monte Carlo Simulations of Surface Reactions; Jansen, A.P.J., Ed.; Lecture Notes in Physics; Springer: Berlin/Heidelberg, Germany, 2012; pp. 37–71. ISBN 978-3-642-29488-4.
- Manzhos, S. Machine learning for the solution of the Schrödinger equation. Mach. Learn. Sci. Technol. **2020**, 1, 013002.
- Behler, J. Perspective: Machine learning potentials for atomistic simulations. J. Chem. Phys. **2016**, 145, 170901.
- Manzhos, S.; Golub, P. Data-driven kinetic energy density fitting for orbital-free DFT: Linear vs Gaussian process regression. J. Chem. Phys. **2020**, 153, 074104.
- Kulik, H.; Hammerschmidt, T.; Schmidt, J.; Botti, S.; Marques, M.A.L.; Boley, M.; Scheffler, M.; Todorović, M.; Rinke, P.; Oses, C.; et al. Roadmap on Machine Learning in Electronic Structure. Electron. Struct. **2022**.
- Duan, C.; Liu, F.; Nandy, A.; Kulik, H.J. Putting Density Functional Theory to the Test in Machine-Learning-Accelerated Materials Discovery. J. Phys. Chem. Lett. **2021**, 12, 4628–4637.
- Friederich, P.; Häse, F.; Proppe, J.; Aspuru-Guzik, A. Machine-learned potentials for next-generation matter simulations. Nat. Mater. **2021**, 20, 750–761.
- Statistical Review of World Energy | Energy Economics | Home. Available online: https://www.bp.com/en/global/corporate/energy-economics/statistical-review-of-world-energy.html (accessed on 7 February 2022).
- Olah, G.A.; Prakash, G.K.S.; Goeppert, A. Anthropogenic Chemical Carbon Cycle for a Sustainable Future. J. Am. Chem. Soc. **2011**, 133, 12881–12898.
- Nayak, P.K.; Mahesh, S.; Snaith, H.J.; Cahen, D. Photovoltaic solar cell technologies: Analysing the state of the art. Nat. Rev. Mater. **2019**, 4, 269–285.
- Herbert, G.J.; Iniyan, S.; Sreevalsan, E.; Rajapandian, S. A review of wind energy technologies. Renew. Sustain. Energy Rev. **2007**, 11, 1117–1145.
- Winter, M.; Brodd, R.J. What Are Batteries, Fuel Cells, and Supercapacitors? Chem. Rev. **2004**, 104, 4245–4270.
- Birdja, Y.Y.; Pérez-Gallent, E.; Figueiredo, M.C.; Göttle, A.J.; Calle-Vallejo, F.; Koper, M.T.M. Advances and challenges in understanding the electrocatalytic conversion of carbon dioxide to fuels. Nat. Energy **2019**, 4, 732–745.
- Detz, R.J.; Reek, J.N.H.; van der Zwaan, B.C.C. The future of solar fuels: When could they become competitive? Energy Environ. Sci. **2018**, 11, 1653–1669.
- Barnhart, C.J.; Benson, S.M. On the importance of reducing the energetic and material demands of electrical energy storage. Energy Environ. Sci. **2013**, 6, 1083–1092.
- Winter, M.; Barnett, B.; Xu, K. Before Li Ion Batteries. Chem. Rev. **2018**, 118, 11433–11456.
- Abram, T.; Ion, S. Generation-IV nuclear power: A review of the state of the science. Energy Policy **2008**, 36, 4323–4330.
- Ho, M.; Obbard, E.; Burr, P.A.; Yeoh, G. A review on the development of nuclear power reactors. Energy Procedia **2019**, 160, 459–466.
- Suman, S. Hybrid nuclear-renewable energy systems: A review. J. Clean. Prod. **2018**, 181, 166–177.
- Shao, M.; Chang, Q.; Dodelet, J.-P.; Chenitz, R. Recent Advances in Electrocatalysts for Oxygen Reduction Reaction. Chem. Rev. **2016**, 116, 3594–3657.
- Jahangiri, H.; Bennett, J.; Mahjoubi, P.; Wilson, K.; Gu, S. A review of advanced catalyst development for Fischer–Tropsch synthesis of hydrocarbons from biomass derived syn-gas. Catal. Sci. Technol. **2014**, 4, 2210–2229.
- Chen, W.-H.; Chen, C.-Y. Water gas shift reaction for hydrogen production and carbon dioxide capture: A review. Appl. Energy **2019**, 258, 114078.
- Chen, L.; Qi, Z.; Zhang, S.; Su, J.; Somorjai, G.A. Catalytic Hydrogen Production from Methane: A Review on Recent Progress and Prospect. Catalysts **2020**, 10, 858.
- Lavoie, J.-M. Review on dry reforming of methane, a potentially more environmentally-friendly approach to the increasing natural gas exploitation. Front. Chem. **2014**, 2, 81.
- Jain, A.; Ong, S.P.; Hautier, G.; Chen, W.; Richards, W.D.; Dacek, S.; Cholia, S.; Gunter, D.; Skinner, D.; Ceder, G.; et al. Commentary: The Materials Project: A materials genome approach to accelerating materials innovation. APL Mater. **2013**, 1, 011002.
- Liu, M.; Kitchin, J.R. SingleNN: Modified Behler–Parrinello Neural Network with Shared Weights for Atomistic Simulations with Transferability. J. Phys. Chem. C **2020**, 124, 17811–17818.
- Behler, J. Constructing high-dimensional neural network potentials: A tutorial review. Int. J. Quantum Chem. **2015**, 115, 1032–1050.
- Na, G.S.; Jang, S.; Lee, Y.-L.; Chang, H. Tuplewise Material Representation Based Machine Learning for Accurate Band Gap Prediction. J. Phys. Chem. A **2020**, 124, 10616–10623.
- Xu, P.; Lu, T.; Ju, L.; Tian, L.; Li, M.; Lu, W. Machine Learning Aided Design of Polymer with Targeted Band Gap Based on DFT Computation. J. Phys. Chem. B **2021**, 125, 601–611.
- Aykol, M.; Herring, P.; Anapolsky, A. Machine learning for continuous innovation in battery technologies. Nat. Rev. Mater. **2020**, 5, 725–727.
- Deringer, V.L. Modelling and understanding battery materials with machine-learning-driven atomistic simulations. J. Phys. Energy **2020**, 2, 041003.
- Thomas, J.K.; Crasta, H.R.; Kausthubha, K.; Gowda, C.; Rao, A. Battery monitoring system using machine learning. J. Energy Storage **2021**, 40, 102741.
- Li, W.; Cui, H.; Nemeth, T.; Jansen, J.; Ünlübayir, C.; Wei, Z.; Zhang, L.; Wang, Z.; Ruan, J.; Dai, H.; et al. Deep reinforcement learning-based energy management of hybrid battery systems in electric vehicles. J. Energy Storage **2021**, 36, 102355.
- Elkamel, M.; Schleider, L.; Pasiliao, E.L.; Diabat, A.; Zheng, Q.P. Long-Term Electricity Demand Prediction via Socioeconomic Factors—A Machine Learning Approach with Florida as a Case Study. Energies **2020**, 13, 3996.
- Krishnadas, G.; Kiprakis, A. A Machine Learning Pipeline for Demand Response Capacity Scheduling. Energies **2020**, 13, 1848.
- Nti, I.K.; Teimeh, M.; Nyarko-Boateng, O.; Adekoya, A.F. Electricity load forecasting: A systematic review. J. Electr. Syst. Inf. Technol. **2020**, 7, 1–19.
- Antonopoulos, I.; Robu, V.; Couraud, B.; Kirli, D.; Norbu, S.; Kiprakis, A.; Flynn, D.; Elizondo-Gonzalez, S.; Wattam, S. Artificial intelligence and machine learning approaches to energy demand-side response: A systematic review. Renew. Sustain. Energy Rev. **2020**, 130, 109899.
- Kim, J.Y.; Lee, J.-W.; Jung, H.S.; Shin, H.; Park, N.-G. High-Efficiency Perovskite Solar Cells. Chem. Rev. **2020**, 120, 7867–7918.
- Pham, H.D.; Xianqiang, L.; Li, W.; Manzhos, S.; Kyaw, A.K.K.; Sonar, P. Organic interfacial materials for perovskite-based optoelectronic devices. Energy Environ. Sci. **2019**, 12, 1177–1209.
- Witt, W.C.; del Rio, B.G.; Dieterich, J.M.; Carter, E.A. Orbital-free density functional theory for materials research. J. Mater. Res. **2018**, 33, 777–795.
- Golub, P.; Manzhos, S. Kinetic energy densities based on the fourth order gradient expansion: Performance in different classes of materials and improvement via machine learning. Phys. Chem. Chem. Phys. **2018**, 21, 378–395.
- Fujinami, M.; Kageyama, R.; Seino, J.; Ikabata, Y.; Nakai, H. Orbital-free density functional theory calculation applying semi-local machine-learned kinetic energy density functional and kinetic potential. Chem. Phys. Lett. **2020**, 748, 137358.
- Seino, J.; Kageyama, R.; Fujinami, M.; Ikabata, Y.; Nakai, H. Semi-local machine-learned kinetic energy density functional demonstrating smooth potential energy curves. Chem. Phys. Lett. **2019**, 734, 136732.
- Snyder, J.C.; Rupp, M.; Hansen, K.; Blooston, L.; Müller, K.-R.; Burke, K. Orbital-free bond breaking via machine learning. J. Chem. Phys. **2013**, 139, 224104.
- Yao, K.; Parkhill, J. Kinetic Energy of Hydrocarbons as a Function of Electron Density and Convolutional Neural Networks. J. Chem. Theory Comput. **2016**, 12, 1139–1147.
- Hausdorff, F. Dimension und äußeres Maß. Math. Ann. **1918**, 79, 157–179.
- Kak, S. Information theory and dimensionality of space. Sci. Rep. **2020**, 10, 20733.
- Bottou, L. Stochastic gradient descent tricks. In Neural Networks: Tricks of the Trade; Springer: Berlin/Heidelberg, Germany, 2012; pp. 421–436. ISBN 9783642352881.
- Kolmogorov, A.N.; Arnol’d, V.; Boltjanskiĭ, V.; Efimov, N.; Èskin, G.; Koteljanskiĭ, D.; Krasovskiĭ, N.; Men’šov, D.; Portnov, I.; Ryškov, S.; et al. On the representation of continuous functions of many variables by superposition of continuous functions of one variable and addition. Am. Math. Soc. Transl. Ser. 2 **1963**, 28, 55–59.
- Sprecher, D.A. A Numerical Implementation of Kolmogorov’s Superpositions II. Neural Netw. **1997**, 10, 447–457.
- Sprecher, D.A. A Numerical Implementation of Kolmogorov’s Superpositions. Neural Netw. **1996**, 9, 765–772.
- Sprecher, D.A.; Draghici, S. Space-filling curves and Kolmogorov superposition-based neural networks. Neural Netw. **2002**, 15, 57–67.
- Nees, M. Approximative versions of Kolmogorov’s superposition theorem, proved constructively. J. Comput. Appl. Math. **1994**, 54, 239–250.
- Katsuura, H.; Sprecher, D.A. Computational aspects of Kolmogorov’s superposition theorem. Neural Netw. **1994**, 7, 455–461.
- Sprecher, D.A. A universal mapping for kolmogorov’s superposition theorem. Neural Netw. **1993**, 6, 1089–1094.
- Kurkova, V. Kolmogorov’s theorem and multilayer neural networks. Neural Netw. **1992**, 5, 501–506.
- Hornik, K. Approximation capabilities of multilayer feedforward networks. Neural Netw. **1991**, 4, 251–257.
- Hornik, K.; Stinchcombe, M.; White, H. Universal approximation of an unknown mapping and its derivatives using multilayer feedforward networks. Neural Netw. **1990**, 3, 551–560.
- Gorban, A. Approximation of continuous functions of several variables by an arbitrary nonlinear continuous function of one variable, linear functions, and their superpositions. Appl. Math. Lett. **1998**, 11, 45–49.
- Manzhos, S.; Carrington, T., Jr. Using neural networks to represent potential surfaces as sums of products. J. Chem. Phys. **2006**, 125, 194105.
- Beck, M.; Jäckle, A.; Worth, G.; Meyer, H.-D. The multiconfiguration time-dependent Hartree (MCTDH) method: A highly efficient algorithm for propagating wavepackets. Phys. Rep. **2000**, 324, 1–105.
- Schmitt, M. On the Complexity of Computing and Learning with Multiplicative Neural Networks. Neural Comput. **2002**, 14, 241–301.
- Rasmussen, C.E.; Williams, C.K.I. Gaussian Processes for Machine Learning; MIT Press: Cambridge, MA, USA, 2006; ISBN 0-262-18253-X.
- Genton, M.G. Classes of Kernels for Machine Learning: A Statistics Perspective. J. Mach. Learn. Res. **2001**, 2, 299–312.
- Smola, A.; Bartlett, P. Sparse Greedy Gaussian Process Regression. In Proceedings of the Advances in Neural Information Processing Systems; MIT Press: Cambridge, MA, USA, 2001; Volume 13.
- Kamath, A.; Vargas-Hernández, R.A.; Krems, R.V.; Carrington, T., Jr.; Manzhos, S. Neural networks vs Gaussian process regression for representing potential energy surfaces: A comparative study of fit quality and vibrational spectrum accuracy. J. Chem. Phys. **2018**, 148, 241702.
- Warner, B.A.; Neal, R.M. Bayesian Learning for Neural Networks (Lecture Notes in Statistical Vol. 118). J. Am. Stat. Assoc. **1997**, 92, 791.
- Boussaidi, M.A.; Ren, O.; Voytsekhovsky, D.; Manzhos, S. Random Sampling High Dimensional Model Representation Gaussian Process Regression (RS-HDMR-GPR) for Multivariate Function Representation: Application to Molecular Potential Energy Surfaces. J. Phys. Chem. A **2020**, 124, 7598–7607.
- Ren, O.; Boussaidi, M.A.; Voytsekhovsky, D.; Ihara, M.; Manzhos, S. Random Sampling High Dimensional Model Representation Gaussian Process Regression (RS-HDMR-GPR) for representing multidimensional functions with machine-learned lower-dimensional terms allowing insight with a general method. Comput. Phys. Commun. **2021**, 271, 108220.
- Manzhos, S.; Ihara, M. Rectangularization of Gaussian Process Regression for Optimization of Hyperparameters. arXiv **2021**, arXiv:2112.02467.
- Li, G.; Rosenthal, C.; Rabitz, H. High Dimensional Model Representations. J. Phys. Chem. A **2001**, 105, 7765–7777.
- Rabitz, H.; Aliş, Ö.F. General foundations of high-dimensional model representations. J. Math. Chem. **1999**, 25, 197–233.
- Alış, F.; Rabitz, H. Efficient Implementation of High Dimensional Model Representations. J. Math. Chem. **2001**, 29, 127–142.
- Fisher, R.A. On the “Probable Error” of a Coefficient of Correlation Deduced from a Small Sample. Metron **1921**, 1, 3–32.
- Sobol′, I.M. Global sensitivity indices for nonlinear mathematical models and their Monte Carlo estimates. Math. Comput. Simul. **2001**, 55, 271–280.
- Li, G.; Hu, J.; Wang, S.-W.; Georgopoulos, P.G.; Schoendorf, A.J.; Rabitz, H. Random Sampling-High Dimensional Model Representation (RS-HDMR) and Orthogonality of Its Different Order Component Functions. J. Phys. Chem. A **2006**, 110, 2474–2485.
- Wang, S.-W.; Georgopoulos, P.G.; Li, G.; Rabitz, H. Random Sampling-High Dimensional Model Representation (RS-HDMR) with Nonuniformly Distributed Variables: Application to an Integrated Multimedia/Multipathway Exposure and Dose Model for Trichloroethylene. J. Phys. Chem. A **2003**, 107, 4707–4716.
- Manzhos, S.; Ihara, M. On the Optimization of Hyperparameters in Gaussian Process Regression with the Help of Low-Order High-Dimensional Model Representation. arXiv **2022**, arXiv:2112.01374.
- Manzhos, S.; Yamashita, K.; Carrington, T. Extracting Functional Dependence from Sparse Data Using Dimensionality Reduction: Application to Potential Energy Surface Construction. In Proceedings of the Coping with Complexity: Model Reduction and Data Analysis; Gorban, A.N., Roose, D., Eds.; Springer: Berlin/Heidelberg, Germany, 2011; pp. 133–149.
- Manzhos, S.; Carrington, T., Jr. Using redundant coordinates to represent potential energy surfaces with lower-dimensional functions. J. Chem. Phys. **2007**, 127, 014103.
- Manzhos, S.; Yamashita, K.; Carrington, T. Fitting sparse multidimensional data with low-dimensional terms. Comput. Phys. Commun. **2009**, 180, 2002–2012.
- Manzhos, S.; Sasaki, E.; Ihara, M. Easy representation of multivariate functions with low-dimensional terms via Gaussian process regression kernel design: Applications to machine learning of potential energy surfaces and kinetic energy densities from sparse data. Mach. Learn. Sci. Technol. **2022**, 3, 01LT02.
- Manzhos, S.; Carrington, T., Jr. A random-sampling high dimensional model representation neural network for building potential energy surfaces. J. Chem. Phys. **2006**, 125, 084109.
- Duvenaud, D.; Nickisch, H.; Rasmussen, C.E. Additive Gaussian Processes. In Advances in Neural Information Processing Systems; Neural Information Processing Systems: San Diego, CA, USA, 2011; pp. 226–234.
- Hinton, G.E.; Salakhutdinov, R.R. Reducing the Dimensionality of Data with Neural Networks. Science **2006**, 313, 504–507.
- Manzhos, S.; Yamashita, K. A model for the dissociative adsorption of N_{2}O on Cu(100) using a continuous potential energy surface. Surf. Sci. **2010**, 604, 555–561.
- Wolfsberg, M.; Van Hook, A.; Paneth, P.; Rebelo, L.P.N. Isotope Effects; Springer: Dordrecht, The Netherlands, 2009.
- Schneider, E.; Carlsen, B.; Tavrides, E.; van der Hoeven, C.; Phathanapirom, U. Measures of the environmental footprint of the front end of the nuclear fuel cycle. Energy Econ. **2013**, 40, 898–910.
- Parvin, P.; Sajad, B.; Silakhori, K.; Hooshvar, M.; Zamanipour, Z. Molecular laser isotope separation versus atomic vapor laser isotope separation. Prog. Nucl. Energy **2004**, 44, 331–345.
- Ronander, E.; Strydom, H.J.; Botha, L.R. High-pressure continuously tunable CO_{2} lasers and molecular laser isotope separation. Pramana **2014**, 82, 49–58.
- McDowell, R.S.; Sherman, R.J.; Asprey, L.B.; Kennedy, R.C. Vibrational spectrum and force field of molybdenum hexafluoride. J. Chem. Phys. **1975**, 62, 3974–3978.
- Koh, Y.W.; Westerman, K.; Manzhos, S. A computational study of adsorption and vibrations of UF6 on graphene derivatives: Conditions for 2D enrichment. Carbon **2015**, 81, 800–806.
- Manzhos, S.; Carrington, T.; Laverdure, L.; Mosey, N. Computing the Anharmonic Vibrational Spectrum of UF6 in 15 Dimensions with an Optimized Basis Set and Rectangular Collocation. J. Phys. Chem. A **2015**, 119, 9557–9567.
- Berezin, A.; Malyugin, S.; Nadezhdinskii, A.; Namestnikov, D.; Ponurovskii, Y.; Stavrovskii, D.; Shapovalov, Y.; Vyazov, I.; Zaslavskii, V.; Selivanov, Y.; et al. UF6 enrichment measurements using TDLS techniques. Spectrochim. Acta Part A Mol. Biomol. Spectrosc. **2007**, 66, 796–802.
- Sobol’, I. On the distribution of points in a cube and the approximate evaluation of integrals. USSR Comput. Math. Math. Phys. **1967**, 7, 86–112.
- Hodges, C.H. Quantum Corrections to the Thomas–Fermi Approximation—The Kirzhnits Method. Can. J. Phys. **1973**, 51, 1428–1437.
- Manzhos, S.; Dawes, R.; Carrington, T. Neural network-based approaches for building high dimensional and quantum dynamics-friendly potential energy surfaces. Int. J. Quantum Chem. **2014**, 115, 1012–1020.
- Manzhos, S.; Carrington, T. Neural Network Potential Energy Surfaces for Small Molecules and Reactions. Chem. Rev. **2020**, 121, 10187–10217.
- Bartlett, R.J.; Ranasinghe, D.S. The power of exact conditions in electronic structure theory. Chem. Phys. Lett. **2016**, 669, 54–70.
- Fermi, E. Eine statistische Methode zur Bestimmung einiger Eigenschaften des Atoms und ihre Anwendung auf die Theorie des periodischen Systems der Elemente. Eur. Phys. J. A **1928**, 48, 73–79.
- Weizsäcker, C.F.V. Zur Theorie der Kernmassen. Eur. Phys. J. A **1935**, 96, 431–458.

**Figure 1.** Left: Schematic of the SQL data structure of Catalysis-Hub.org, used to store reaction energies (reactions table, green) and DFT calculations (ASE database, blue). Since each reaction energy involves several DFT calculations (and the same DFT calculations can potentially be used for several reactions), a many-to-many mapping schema is used to preserve connections between the table rows. Right: Machine learning-enhanced catalyst candidate prediction: bulk and surface structures retrieved from structure databases like materialsproject.org, OQMD, Catalysis-Hub.org, etc., are used for automated slab generation and enumeration of possible adsorption sites. In an iterative process, limited numbers of DFT-calculated adsorption energies and machine-learning-predicted adsorption energies are used to inform microkinetic models to eventually suggest promising catalyst candidates that should be investigated by experiment. Adapted with permission from [26]. Copyright 2019 Wiley-VCH Verlag GmbH.

**Figure 2.** Left: Distribution of types of ML techniques applied in design and fabrication of solar cells. Right: Distribution of applications in design and fabrication of solar cells assisted by ML techniques. Adapted with permission from [15]. Copyright 2019 Wiley-VCH Verlag GmbH.

**Figure 4.** Test set root mean square error (rmse) of the interatomic potential of vinyl bromide as a function of both the number and the dimensionality of terms of Red-RS-HDMR-NN. Reproduced with permission from [117]. Copyright 2010 American Institute of Physics.

**Figure 5.** Test set mean absolute error (mae) of the interatomic potential of N_{2}O over Cu (the system shown on the right) as a function of the dimensionality of a single component function of Red-RS-HDMR-NN. Reproduced with permission from [123]. Copyright 2010 Elsevier.

**Figure 7.** Kinetic energy densities within the unit cells of (**a**) fcc Al, (**b**) hcp Mg, and (**c**) cubic diamond Si crystals.

**Figure 8.** One-dimensional cuts of the kinetic energy densities of bcc Li (top left), hcp Mg (top right), fcc Al (bottom left), and cubic diamond Si (bottom right) along selected directions in the crystal lattice. The target Kohn–Sham kinetic energy density is shown as a black line, results of a single-hidden layer NN fit of the KEDs of all materials simultaneously with a red line (“[80] NN”, where 80 is the number of neurons in the hidden layer), and the results of a four-hidden layer NN fit of the KEDs of all materials simultaneously with a turquoise line (“[20 20 20 20] NN”, where 20 is the number of neurons in each hidden layer). See [78] for details. Adapted with permission from Ref. [78]. Copyright 2019 The Owner Societies.

**Figure 9.** Distributions (histograms) of the kinetic energy densities and density dependent variables in a dataset combining data from Al, Mg, and Si at equilibrium geometry as well as under uniform compression and extension. Adapted with permission from [41]. Copyright 2020 American Institute of Physics.

**Table 1.** Test set root mean square error (rmse) when fitting the potential energy surface of UF_{6} with HDMR-GPR of different orders d for different numbers of training points N_{train}. For comparison, the results with a full 15-dimensional GPR are also shown. The numbers of component functions N_{cf} at each d are also shown.

Order d | N_{cf} | Rmse ^{1} (N_{train} = 5000) | Rmse ^{1} (N_{train} = 3000) | Rmse ^{1} (N_{train} = 2000) |
---|---|---|---|---|
Full-D (d = D) | 1 | 42.2 | 75.4 | 106.7 |
d = 1 | 15 | 234.6 | 236.4 | 237.3 |
d = 2 | 105 | 168.1 | 178.6 | 190.3 |
d = 3 | 455 | 65.6 | 78.0 | 97.4 |

^{1} On the test set of 50,000 points.
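
The N_{cf} column of Table 1 matches the number of ways to choose d coordinates out of D = 15, i.e., the binomial coefficient C(15, d). The following minimal Python sketch reproduces those counts; it assumes that an HDMR of order d uses one component function per distinct subset of exactly d coordinates, which is consistent with the tabulated values but is an assumption here, and the function names are hypothetical, not taken from any published code.

```python
from itertools import combinations
from math import comb


def n_component_functions(n_dims: int, order: int) -> int:
    """Number of distinct subsets of `order` coordinates out of `n_dims`."""
    return comb(n_dims, order)


def component_index_sets(n_dims: int, order: int):
    """Coordinate-index tuples, one per component function of the given order."""
    return list(combinations(range(n_dims), order))


D = 15  # dimensionality of the UF6 potential energy surface in Table 1

# Assumption: one component function per subset of exactly d coordinates.
for d in (1, 2, 3, D):
    print(f"d = {d:2d}: N_cf = {n_component_functions(D, d)}")

# The first few two-dimensional terms couple coordinates (0, 1), (0, 2), (0, 3).
print(component_index_sets(D, 2)[:3])
```

Under this assumption, the number of terms grows quickly with d (15, 105, 455), while each term remains a function of only d variables.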

Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

© 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

## Share and Cite

**MDPI and ACS Style**

Manzhos, S.; Ihara, M.
Advanced Machine Learning Methods for Learning from Sparse Data in High-Dimensional Spaces: A Perspective on Uses in the Upstream of Development of Novel Energy Technologies. *Physchem* **2022**, *2*, 72-95.
https://doi.org/10.3390/physchem2020006
