The Search for Hydrogen Stores on a Large Scale; a Straightforward and Automated Open Database Analysis as a First Sweep for Candidate Materials

The storage of hydrogen is considered as the bottleneck in the implementation of portable fuel cell power generating systems. The necessary experimental studies to discover and develop appropriate storage materials are always time-limited. We discuss herein the approach of an uncomplicated and accessible computationally based analysis of database knowledge towards the identification of promising storage systems. The open access policy of the Crystallography Open Database (COD) invites researchers to grasp the opportunity to formulate targeted analyses of crystalline solids, unfettered by material resources. We apply such an approach to the initial evaluation of potential solid-state hydrogen stores, although the method could potentially be transferred to other material analysis tasks.


Introduction
Hydrogen is probably the most attractive energy carrier for a fossil fuel free future. Although fuel cells are a mature and efficient means to convert energy cleanly, the convenient, yet safe storage of hydrogen as a feedstock poses a major problem [1]. While storing hydrogen in gaseous or liquid form would be advantageous, the many practical difficulties make it necessary to look for alternatives [2,3]. As a compressed gas, hydrogen has a low density, occupying considerable space and demanding technological solutions to the storage of the highly mobile and diffusive hydrogen molecules [4]. As a liquid, hydrogen requires energy to compress, a low temperature to maintain a condensed state, and retains a relatively low volumetric hydrogen density.
One alternative approach is to store hydrogen in or on solids. Such a philosophy can afford storage systems without the extensive safety precautions associated with the pressurised and/or cryogenic storage of a highly flammable and explosive fuel such as hydrogen. One method is to increase the fuel density via the physical adsorption of hydrogen on surfaces or within porous materials [5]. Although the use of such porous materials lowers the gravimetric capacity (the mass of hydrogen stored per mass of the loaded storage system) through the introduction of supplementary weight, the volumetric capacity (mass of hydrogen stored per volume of the loaded storage system) of physical adsorption materials at a given pressure can easily exceed that of the pure gas. Physical adsorption systems, however, carry the main disadvantage that significant gravimetric capacities are only achievable at cryogenic temperatures and/or high pressures. This arises from the low enthalpy associated with physical adsorption, which results in labile storage materials at ambient temperature [6]. Another option, without the need for high pressure or constant cooling, is to store hydrogen in solids through the formation of chemical bonds to other atoms [1]. A number of potential chemical hydrogen storage systems have been studied, including metal hydrides [7], amides [8], borohydrides [9], and ammonia borane-based materials [10].
Experimental studies inevitably focus on a certain select group of candidates and cannot include the whole range of possible storage materials. In the study described herein, we adopted a very different approach. As a simple first approximation, any material containing hydrogen is in principle able to release H2 by one means or another. In practice, of course, the dehydrogenation ability is governed by the thermodynamics (and kinetics) of the hydrogen release process, and many high hydrogen-content solids are incapable of release under the moderate thermal working conditions dictated by a fuel cell. Nevertheless, first and foremost, a potential hydrogen storage material can be considered on the basis of the composition of any given compound. Hence, it was neither our aim in this study to predict and assess hypothetical storage materials nor to nominate prescriptively a shortlist of materials that could meet all the criteria required for a viable commercial system. Rather, we wished to consider a simple and widely accessible way to provide an initial assessment of a large body of established hydrogen-containing materials on the basis of the two most widely considered storage materials criteria: gravimetric and volumetric hydrogen density. The most complete set of information about solids is probably contained in structural databases such as the Inorganic Crystal Structure Database (ICSD, for inorganic structures) [11], the Cambridge Structural Database (CSD, for organic structures) [12], or the Crystallography Open Database (COD) as an open access database of both organic and inorganic structures [13-15]. Although the use of crystallographic databases has the inherent shortfall that only crystalline structures can be considered, we assumed this to be a sound approach, given that: (a) most crystalline materials are reliably characterised and (b) many non-crystalline solids also form crystalline phases, which will be included in the study.

Data Harvesting and Processing
The motivation for the use of the COD as our database of choice was two-fold. First, it contains both inorganic and organic crystal structures and is therefore appropriate for a comprehensive study. Second, in contrast to the other well-referenced structural databases, the COD is fully open access, allowing all information to be retrieved freely at the point of use. The code employed throughout this work was written in the interpreted language Python, which has a proven track record in the solution of chemical problems. This is especially true in crystallography [16] and, of the many examples, probably the most prominent is the EXPGUI interface for GSAS [17].

Data Harvesting Procedure
Data from the COD can be obtained freely from the homepage (www.crystallography.net). For the study herein, a complete set of cif files from the COD (status as of 27 January 2015) was downloaded and then treated locally. The data harvesting process was written as Python 2 code and is heavily based on the use of the Python module PyCifRW, which can be obtained from the International Union of Crystallography (IUCr) homepage (www.iucr.org/resources/cif/software/pycifrw) [18]. The full Python code of the reader routine is supplied in the supporting information. In order to limit the complexity of the task and to allow the use of potentially incomplete data sets, only three pieces of information were retrieved from the cif files: the COD registration number as a unique identifier, the sum formula, and the density of the crystal structure as derived from the diffraction experiment. The diffraction-derived density was favoured over the use of experimental densities simply because only a very small number of datasets contain experimentally determined values. After collecting the information from all cif files in the database, those containing hydrogen (or deuterium) were chosen for further treatment. In the course of the harvesting program execution, the hydrogen gravimetric capacity of the compound in the cif file was calculated from the sum formula, whereas the equivalent volumetric capacity was calculated from the gravimetric capacity and the density. One should consider that this treatment will only yield theoretical capacity values. The experimentally determined gravimetric capacity of a real system will almost certainly be less, as factors such as non-ideal decomposition have to be taken into account. Further, the experimental volumetric capacity will unavoidably be lower than the calculated value given, for example, that the packing density of the material cannot be neglected. Factors such as these are not easily evaluated across a wide sample of materials and it would be no more than speculation to assume values for these non-ideal parameters. Nevertheless, as a measure of relative storage capacity performance, we believe that the concept is valid and we have therefore chosen to use the data as retrieved from the cif files in order to keep the data as consistent and unbiased as possible.
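The capacity calculation described above can be sketched in a few lines. The following is an illustrative reconstruction in Python 3, not the routine from the supporting information: the function names, the abbreviated mass table and the regular expression are assumptions made for this example. It follows the text in treating deuterium with the mass of hydrogen and in deriving the volumetric value from the gravimetric value and the density.

```python
import re

# Standard atomic masses in g/mol for a handful of elements only (assumed
# subset; a real run needs the full periodic table). Deuterium is given
# the mass of hydrogen, as described in the text.
ATOMIC_MASS = {
    "H": 1.008, "D": 1.008, "Li": 6.94, "B": 10.81, "C": 12.011,
    "N": 14.007, "O": 15.999, "Na": 22.990, "Mg": 24.305, "Ca": 40.078,
}

# One formula token: an element symbol followed by an optional count.
TOKEN = re.compile(r"([A-Z][a-z]?)(\d+(?:\.\d+)?)?")


def parse_sum_formula(formula):
    """Turn a cif-style sum formula such as 'B H6 N' into {symbol: count}."""
    counts = {}
    for symbol, number in TOKEN.findall(formula):
        counts[symbol] = counts.get(symbol, 0.0) + float(number or 1)
    return counts


def gravimetric_wt_percent(counts):
    """Mass of H (and D) divided by the formula mass, in wt.%."""
    # Symbols missing from the dictionary contribute zero mass.
    total = sum(ATOMIC_MASS.get(s, 0.0) * n for s, n in counts.items())
    hydrogen = ATOMIC_MASS["H"] * (counts.get("H", 0.0) + counts.get("D", 0.0))
    return 100.0 * hydrogen / total if total else 0.0


def volumetric_g_per_litre(wt_percent, density_g_cm3):
    """Hydrogen mass per litre of crystal, from wt.% and density in g/cm3."""
    return wt_percent / 100.0 * density_g_cm3 * 1000.0


ammonia_borane = gravimetric_wt_percent(parse_sum_formula("B H6 N"))
```

For ammonia borane ("B H6 N") this gives roughly 19.6 wt.%. Because unknown symbols silently receive zero mass, a mistyped formula inflates the result, which is one reason the outlier checks discussed later matter.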

Data Refining Procedure
The output of the data harvesting process was written to two output files, one containing the COD registry numbers and sum formulae, and the other containing the COD registry numbers and both gravimetric and volumetric capacity. The file containing the capacity information can be used directly as delimited text output for plotting or importing into a spreadsheet. The file containing the sum formulae provides a means to abstract chemical information, i.e., regarding the elements present (or not present) in the compound. A second Python program was built to enable this, which allows the refinement of the dataset by specifying a number of different search parameters, such as: elements required to be present, elements required to be excluded, the number of elements, and the minimal and maximal values for both gravimetric and volumetric capacity. In order to make the user interaction uncomplicated, a simple front end was created as a graphical user interface (GUI) using the Tk suite (Figure 1). The GUI can be used to produce personalised result lists for users. The main restriction of this treatment is that the sum formula data contain no information about the connectivity or chemical nature of the compounds. Other specifiers that could supply such information, for instance a structural formula or a SMILES code, proved difficult to use satisfactorily in this study. While the former is not rigorously defined and therefore hardly machine readable, the latter is only contained in a relatively small subset of cif files and therefore would greatly restrict the number of possible hits. If a similar procedure were adopted for materials searching tasks other than the purpose of this study, however, one could imagine the use of alternative selected information from the cif files as required.
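The search parameters listed above map naturally onto a small filter function. The sketch below is a hypothetical Python 3 illustration of such a refinement step, without the Tk front end; the record layout, the function names and the three demonstration records (with placeholder identifiers and rounded, illustrative capacity values rather than COD data) are assumptions.

```python
import re

# One formula token: an element symbol followed by an optional count.
TOKEN = re.compile(r"([A-Z][a-z]?)(\d+(?:\.\d+)?)?")


def elements_in(formula):
    """Return the set of element symbols in a cif-style sum formula."""
    return {symbol for symbol, _ in TOKEN.findall(formula)}


def refine(records, required=(), excluded=(), n_elements=None,
           wt_range=(0.0, 100.0), vol_range=(0.0, float("inf"))):
    """Keep the records that satisfy every search parameter."""
    hits = []
    for rec in records:
        elements = elements_in(rec["formula"])
        if not set(required) <= elements:
            continue  # a required element is missing
        if elements & set(excluded):
            continue  # an excluded element is present
        if n_elements is not None and len(elements) != n_elements:
            continue  # wrong number of distinct elements
        if not wt_range[0] <= rec["wt_percent"] <= wt_range[1]:
            continue
        if not vol_range[0] <= rec["g_per_litre"] <= vol_range[1]:
            continue
        hits.append(rec)
    return hits


# Placeholder records for demonstration only (not taken from the COD).
RECORDS = [
    {"cod_id": "demo-1", "formula": "H Li", "wt_percent": 12.7, "g_per_litre": 99.0},
    {"cod_id": "demo-2", "formula": "B H6 N", "wt_percent": 19.6, "g_per_litre": 150.0},
    {"cod_id": "demo-3", "formula": "C3 H8", "wt_percent": 18.3, "g_per_litre": 140.0},
]

binary_hydrides = refine(
    RECORDS, required={"H"},
    excluded={"B", "C", "N", "O", "Cl", "Br", "I"}, n_elements=2,
)
```

Working on sets of element symbols keeps the filter independent of stoichiometry, which mirrors the restriction noted above: the sum formula says which elements are present, not how they are connected.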

Results and Discussion
The COD contains 306,511 cif files in the version used for this study. Of these, 262,710 cif files contain either hydrogen or deuterium (85.7% of all deposited structures) while providing density values, and these form the basis of this study.

Outlier Treatment
One of the necessary caveats when working with databases is that not all of the entries are likely to be correct. While it is difficult to identify incorrect sum formulae instantaneously (some examples will be examined in the following section), some of the volumetric values retrieved can be quickly recognised as unreasonable. Inconsistencies in the volumetric capacity can be mostly attributed to erroneous density values. For instance, the entries 2300260 and 2300261 (mono-ammoniates of methanol measured at different temperatures) [19] return a volumetric hydrogen capacity of over 160 g·mL⁻¹, which is clearly unrealistically high. In fact, this value arises from the density in the cif file, which is given in kg·m⁻³ while the density for cif entries is defined in g·cm⁻³. Hence, a factor of 1000 is erroneously introduced into the calculation. Some other noteworthy spurious outliers are caused by inaccurate sum formulae (Figure 2). These extend from typographic errors in the number of atoms (such as in "C119.8H1137.2Al4 F6.6 Mo4 N8 O12") [20], through flawed treatments of multiplicities (as in "Cl H12") [21] to mistyped sum formulae (as in "Ct H20 N2 O2 wo23") [22]. It is important to note how the program handles such mis-entered sum formulae: it calculates the molecular mass from the sum formula by using an internal dictionary for the atomic masses. As this lookup is case-sensitive, all non-recognised atom types are effectively excluded, because their atomic mass is set to zero. This process will clearly result in a significant overestimation of the relative hydrogen content. While it is important to be cautious with these outliers, which need special attention, the vast majority of datasets lie in the expected range and are directly usable as input.
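The three kinds of outlier described above (wrong density unit, unrecognised symbols, implausible atom counts) lend themselves to simple automated flags. The Python 3 sketch below is a hypothetical screening aid and was not part of the original programs; the density ceiling of 25 g·cm⁻³ and the limit of four hydrogen atoms per non-hydrogen atom are heuristic assumptions chosen for illustration.

```python
import re

# Element symbols of the periodic table plus D for deuterium.
KNOWN_SYMBOLS = set("""
H D He Li Be B C N O F Ne Na Mg Al Si P S Cl Ar K Ca Sc Ti V Cr Mn Fe Co
Ni Cu Zn Ga Ge As Se Br Kr Rb Sr Y Zr Nb Mo Tc Ru Rh Pd Ag Cd In Sn Sb Te
I Xe Cs Ba La Ce Pr Nd Pm Sm Eu Gd Tb Dy Ho Er Tm Yb Lu Hf Ta W Re Os Ir
Pt Au Hg Tl Pb Bi Po At Rn Fr Ra Ac Th Pa U Np Pu Am Cm Bk Cf Es Fm Md No
Lr Rf Db Sg Bh Hs Mt Ds Rg Cn Nh Fl Mc Lv Ts Og
""".split())

# A well-formed token is letters followed by an optional count.
TOKEN = re.compile(r"^([A-Za-z]+)(\d+(?:\.\d+)?)?$")

MAX_DENSITY_G_CM3 = 25.0     # assumed ceiling; above this the unit is suspect
MAX_H_PER_HEAVY_ATOM = 4.0   # assumed heuristic (methane has exactly 4)


def check_entry(formula, density):
    """Return a list of warnings for one sum formula and density value."""
    flags = []
    hydrogen = 0.0
    heavy = 0.0
    for token in formula.split():
        match = TOKEN.match(token)
        if match is None:
            flags.append("malformed token: " + token)
            continue
        symbol, number = match.groups()
        count = float(number) if number else 1.0
        if symbol not in KNOWN_SYMBOLS:
            flags.append("unknown symbol: " + symbol)
        elif symbol in ("H", "D"):
            hydrogen += count
        else:
            heavy += count
    if heavy and hydrogen / heavy > MAX_H_PER_HEAVY_ATOM:
        flags.append("implausible H count")
    if density > MAX_DENSITY_G_CM3:
        flags.append("density probably not in g/cm3")
    return flags


mistyped = check_entry("Ct H20 N2 O2 wo23", 1.3)
```

Flags of this kind only point to entries that deserve a second look; they do not decide whether a record is wrong, and elemental solid hydrogen is deliberately left unflagged by the ratio test.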

US Department of Energy Targets
One important measure for the efficiency of hydrogen storage materials is the set of targets for hydrogen storage described by the US Department of Energy (DoE). These targets, as defined in the latest documentation, include a system gravimetric hydrogen density of 7.5 wt.% and a system volumetric hydrogen capacity of 70 g·L⁻¹ as ultimate goals and 5.5 wt.% and 40 g·L⁻¹ for gravimetric and volumetric targets, respectively, for 2020 [23]. If one temporarily neglects the overall size and weight of the remainder of the storage system (tank, thermal management, balance of plant), of the 262,710 hydrogen-containing substances in the COD, 32,774 already meet the ultimate DoE capacity targets (12.5% of all hydrogen containing structures), which in principle would place them amongst the most potentially interesting systems to study. In fact, if one concentrates on the 2020 targets, then 95,016 documented crystal structures fulfil the necessary capacity requirements (36.1% of all hydrogen containing structures). In practice, one then has to adjust these values to allow for total system performance.
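Comparing a material against the two sets of DoE figures quoted above is a pair of threshold tests. The short Python 3 sketch below is illustrative only: the names are hypothetical, and it applies the system-level targets directly to material-level values, which is the same simplification made in the text.

```python
# (gravimetric wt.%, volumetric g/L) targets as quoted in the text.
DOE_TARGETS = {
    "ultimate": (7.5, 70.0),
    "2020": (5.5, 40.0),
}


def doe_class(wt_percent, g_per_litre):
    """Return the most demanding target met on both criteria, else None."""
    for name in ("ultimate", "2020"):  # test the stricter target first
        min_wt, min_vol = DOE_TARGETS[name]
        if wt_percent >= min_wt and g_per_litre >= min_vol:
            return name
    return None
```

A material has to pass both thresholds at once, so a dense hydride with a high volumetric but low gravimetric capacity returns `None` here even though it exceeds the volumetric target on its own.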
Of the seemingly numerous potential hydrogen storage materials, only a limited number has been studied in detail experimentally to date. In many cases, of course, this is for good reason since the strength of bonds to hydrogen in organic compounds, for example, precludes serious consideration of thermal hydrogen release. A simple extreme example from an overview of the COD data concerns the solid structures of H2. These are easily identified in the general plot given the unique gravimetric capacity of 1000 g (H2)·kg⁻¹ (i.e., 100 wt.% hydrogen). The variation in the points for H2(s) originates from the many datasets that have been recorded at different temperatures and pressures. Although the elevated pressure data are within the range of the DoE targets, the conditions under which the hydrogen solid structures are measured are extreme (up to 26.5 GPa and/or at extremely low temperatures) and so not remotely achievable practically in a viable system. It is, however, remarkable that the exceptionally low density of solid hydrogen, even at extreme pressures (0.210 g·cm⁻³ at 26.5 GPa) [24], prohibits a high volumetric capacity. Much higher values, in fact, are achievable by a number of other compounds.

Organic Compounds
The COD contains mostly organic compounds, comprising 255,806 structures (97.4% of all hydrogen containing compounds), with (hydrogen-containing) inorganic substances totalling only 6711 entries (2.6% of all hydrogen containing compounds in the database). (In fact, as the sum formula is the only indicator of the chemical nature used for each entry, the identity of organic and inorganic substances is reduced to whether a compound contains carbon or not. It is beyond the current scope of the study to explore bonding properties and therefore an exact definition of C-H containing compounds cannot be employed.) Some organic compounds that are notable in terms of gravimetric and volumetric capacities are highlighted in Figure 3. Again, it is important to check for outliers where the values might not represent true information (e.g., "C6 H132 O12 Sc4 O", where the carbon atom count should be 60, rather than 6; structural formula: [Sc(OCH2C(CH3)3)3]4) [25]. Inevitably, crystallised hydrocarbons like propane and butane possess high gravimetric and volumetric hydrogen capacities [26]. In fact, these two alkanes, for example, are represented by vertical lines of data-points characteristic of a series of structures of the same compound where the density varies as a consequence of the different measurement pressures and/or temperatures. Very similar behaviour can be observed for different polymorphs of the same compound (as in the case of ethylenediamine) and would be expected from different compounds with the same sum formula [27]. Boron containing compounds, such as tetramethylammonium borohydride or 1,3-diisopropyl-4,5-dimethylimidazolium tetramethoxyborate, are amongst the most promising organic hydrogen stores in terms of capacity [28,29]. The dehydrogenation properties of tetramethylammonium borohydride, for example, which contains both protic and hydridic hydrogen, are unstudied, although one might expect the evolution of C-H- and B-H-containing gases on decomposition of the pure compound. Such borohydride and borate materials could be worthy of study when mixed with light metal hydrides in composite systems, however. Finally, the clathrate tetra-n-butyl ammonium bromide (26.4) hydrate (TBAB·26.4H2O) is worthy of comment as an impressive representative of clathrate structures, which allow the high density storage of gases (in this case water) through the formation of host-guest structures stabilised by van der Waals interactions [30].

Inorganic Hydrogen Stores
Although only 2.6% of all hydrogen containing structures deposited in the COD do not contain carbon, the absence of relatively strong C-H bonds renders some of these materials very interesting for hydrogen storage. It is not surprising, therefore, that these are among the materials that are under the most active experimental consideration as stores. Binary hydrides (e.g., LiH/LiD) [31] comprise a major field of investigation and, for the purposes of hydrogen store design, can be crudely divided into two categories: heavy metal hydrides and light metal hydrides. While the light metal hydrides have the advantage of a higher gravimetric capacity, the heavy metal hydrides have the advantage of a higher density conferred by the metallic constituents, enhancing their volumetric capacity (Figure 4). The latter all possess volumetric capacities above the ultimate DoE target and could therefore be interesting in applications where system space is the biggest constraint. Most of the returned inorganic structures also contain oxygen, predominantly present in the COD as hydroxides or hydrates. One striking example is ice, which exhibits a range of different volumetric capacities depending on the polymorph and pressure [32-37]. Oxygen-containing materials might not normally be considered as hydrogen stores due to the often strong bond between hydrogen and oxygen, which makes the formation of water likely, and reversible hydrogen storage enthalpically unfavourable. However, the recent development of modular, "primary" hydrogen storage systems, e.g., based on the combination of MgH2 with Mg(OH)2 [38] and reversible hydride-hydroxide systems, such as NaH + NaOH [39,40], suggests that oxygen-containing hydrogen storage materials are worthy of further exploration.
Another class of compounds amongst the most promising hydrogen stores are ammine complexes, since the ligand (containing 17.6 wt.% of hydrogen in the case of [Na(NH3)5][Na(NH3)3(P3H3)] [41]) is normally bound in high stoichiometric ratios to the central metal atom, making the overall hydrogen capacity very promising. The data analysis clearly shows, however, one class of compounds that is able to bind the highest gravimetric amounts of hydrogen while maintaining impressive volumetric capacity. Boranes, both covalent (e.g., B10H13(SH) [42]) and ionic (e.g., NH4B3H8 [43]), can bind relatively prodigious amounts of hydrogen. This is clearly due to the trivalency of boron coupled with low molecular weight. The two pre-eminent candidates for hydrogen storage, ammonia borane and lithium borohydride, are only two members of a considerably larger family of compounds, which has the potential to outperform other storage materials. In the sections below, we consider two classes of inorganic hydrogen-containing materials in more detail. These case studies provide an insight into the possibilities of structural database analysis as an initial stage in a wider search process for suitable hydrogen storage materials.
[Figure caption; image not reproduced] Selected subsets are emphasised: binary hydrides (magenta crosses), nitrogen-containing compounds (green filled triangles), boron-containing compounds (but excluding oxygen; red filled squares) and oxygen-containing compounds (blue filled circles). Some specific example compounds are labelled in the figure.

Binary Metal Hydrides
Given that the number of binary metal hydrides is necessarily limited by the number of metals forming stable compounds, the number of hits in the COD is fairly limited. Only 28 entries fulfil the requirements (binary compounds without B, C, N, O, Cl, Br or I and with a reported density value; Figure 5). In comparison, the commercial database ICSD contains 327 entries for binary metal hydrides. Perhaps the main reason for this is that the ICSD has existed for far longer than the COD and contains a larger number of historical data. Some very interesting trends and general assumptions can, however, be drawn from the data available.
The clear leader in terms of gravimetric capacity is lithium hydride, a compound that has long been considered as a hydrogen storage material [9,31]. A quite striking observation is the higher apparent volumetric capacity of LiD as compared to LiH. This, of course, is an outcome of the isotope effect on density. While the calculation of the gravimetric capacity is corrected for this (by setting the mass of deuterium as equal to the mass of hydrogen in the calculation), the density is taken directly from the cif file. While this effect has nearly no influence on structures including heavy elements, the density for LiD is markedly higher than for LiH, causing this deviation. The other alkali metal hydrides [31] show much lower hydrogen capacities and of these only NaH could be regarded justifiably as having potential as a realisable hydrogen store.
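The size of this isotope effect can be estimated with a short calculation. The Python 3 sketch below is a hypothetical illustration, not part of the published programs: it assumes that a deuteride and its protiated analogue occupy the same cell volume, so that their densities scale with the formula masses, and it uses standard atomic masses that are not quoted in the text.

```python
M_H = 1.008   # g/mol, hydrogen
M_D = 2.014   # g/mol, deuterium


def protium_density_factor(n_deuterium, other_mass):
    """Factor converting a deuteride density to that of the H analogue.

    Assumes equal cell volumes for both isotopologues, so density is
    proportional to formula mass. `other_mass` is the mass of all
    non-hydrogen atoms in the formula unit, in g/mol.
    """
    return (other_mass + n_deuterium * M_H) / (other_mass + n_deuterium * M_D)


lid_factor = protium_density_factor(1, 6.94)       # LiD, Li = 6.94 g/mol
lad3_factor = protium_density_factor(3, 138.905)   # LaD3, La = 138.905 g/mol
```

Under this assumption the LiD density overstates the LiH-equivalent value by about 13% (factor of roughly 0.89), whereas for a heavy-metal deuteride such as LaD3 the correction is only about 2%, consistent with the observation that the effect is negligible for structures containing heavy elements.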
Unfortunately, the COD only contains the structures of CaH2 [44] and SrH2 [31] as representatives of the alkaline earth metal hydrides and therefore makes a detailed analysis of this group of compounds impossible. Nevertheless, CaH2 has a volumetric capacity that is higher than the DoE target and could be considered as a potential hydrogen store. Undoubtedly, the most studied alkaline-earth metal hydride is magnesium hydride, MgH2 (absent in the COD), which has a higher gravimetric capacity (7.7 wt.%) than CaH2 (and without the toxicity issues of BeH2) and is rightly considered the most promising hydrogen storage material among the alkaline earth elements [7,45].
In light of the atomic mass of the lanthanide atoms, the gravimetric hydrogen capacity of the corresponding hydrides is understandably low [31]. The exceptional density of the hydrogenated materials, however, yields higher volumetric capacities than solid hydrogen at ambient pressures (Figure 2). Due to the very similar chemistry of the f-block elements, the respective hydrides are isostructural and differ in performance primarily through the atomic mass of the component metals. Of the p-block hydrides, the solid structures of germane and stannane [46] exhibit encouragingly high volumetric hydrogen storage capacities, but it should be noted that these structures were determined at 5 K and that both compounds are gaseous at room temperature.
Perhaps the most remarkable hydrogen storage capacities, however, are exhibited by transition metal hydrides [31], although there are relatively few examples in the COD as compared to the ICSD (which by comparison contains 148 entries under binary transition metal hydrides). ScH2 and NbH2 surpass the other binary hydrides in terms of volumetric capacity. Again, this is due to the high density of the reported structures, and NbH2 (6.6 g·cm⁻³) is a striking example [47]. A number of subsequent studies have confirmed the cell parameters and composition of NbH2 [48,49], validating the previously observed density. Density-functional calculations also support the experimental findings [50]. In light of this exceptionally high density, in principle NbH2 represents a very interesting material for further study as a hydrogen store for applications where space is limited. In fact, niobium is one of a small number of body centred cubic structured transition metals that do not hydrogenate readily at room temperature [51], although alloying with first row transition metals, such as Cr, has a profound effect towards lowering the hydriding temperature [52].
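The NbH2 case makes the trade-off between the two criteria concrete. The Python 3 sketch below is a worked example added for illustration: it takes the density of 6.6 g·cm⁻³ quoted above and combines it with standard atomic masses (an assumption, as the text does not list them) to estimate the material-level capacities.

```python
M_H = 1.008     # g/mol, hydrogen
M_NB = 92.906   # g/mol, niobium
DENSITY_NBH2 = 6.6  # g/cm3, value quoted in the text

# Gravimetric capacity: mass fraction of hydrogen in the formula unit.
wt_fraction = 2 * M_H / (M_NB + 2 * M_H)
wt_percent = 100.0 * wt_fraction

# Volumetric capacity: hydrogen mass per litre of crystal.
g_per_litre = wt_fraction * DENSITY_NBH2 * 1000.0

print(f"NbH2: {wt_percent:.1f} wt.%, {g_per_litre:.0f} g/L")
```

This gives about 2.1 wt.% and about 140 g·L⁻¹: far below the gravimetric targets, yet roughly twice the ultimate volumetric target of 70 g·L⁻¹, which is why such a material is only of interest where space rather than weight is the constraint.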

Complex Hydrides Containing Boron
Boron-containing compounds (Figure 6) constitute some of the most promising hydrogen stores. First, the valency of boron accounts for the tendency of such compounds to form with relatively high hydrogen content and second, boron has a low atomic mass and so hydrogen contributes more in wt.% terms than in compounds with heavier atoms. The number of different compounds in the boron-hydrogen system is vast, although the COD is disadvantaged by the significantly reduced number of reported compounds (44 matching the search requirements formulated for Figure 6), compared to the ICSD (497 structures matching the same criteria). The compounds in the boron-hydrogen system can be conveniently divided into ionic and covalent compounds. Each of these groups can be considered as potential hydrogen stores with distinctive characteristics.
In terms of the nominally ionic compounds in the boron-hydrogen system, it is not the much-studied store LiBH4 [53] that has the highest gravimetric hydrogen content, but rather ammonium octahydrotriborate NH4B3H8 [43] due to the hydrogen-containing ammonium cation. Nonetheless, LiBH4 provides a potent combination of gravimetric and volumetric capacity. In fact, the hexagonal high-temperature phase, h-LiBH4, has a considerably higher density than the orthorhombic room-temperature phase, o-LiBH4, to the extent that the former has the highest volumetric capacity of any of the borohydrides in the COD. The higher density of h-LiBH4 is possibly caused by the order-disorder transition of the BH4⁻ anion during the phase transformation, and the hexagonal phase can be stabilised at room temperature through the partial substitution of halide anions [54,55]. The other ionic metal borohydrides Mg(BH4)2 [56] and Zr(BH4)4 [57], as well as the metal octahydrotriborates NaB3H8 [58] and Cr(B3H8)2 [59], suffer from the higher weight of the metals and/or anions, which consequently lowers the gravimetric capacity. However, it is important to note that the use of transition metals in the structures again has a positive influence on the density of the materials, which places Zr(BH4)4 and Cr(B3H8)2 ahead of their s-block metal counterparts in terms of volumetric hydrogen content. By combining a simple gravimetric/volumetric approach with the empirical relationships linking dehydrogenation temperature with metal electronegativity [60], one can begin to shortlist the most promising permutations of metal cations and borane/borohydride anions. Extending anion/ligand complexity arguments further, another ionic compound, but of a rather different type, is [Li(NH3)4]2B6H6·2NH3 [61]. In contrast to a simple metal cation, the complex cation [Li(NH3)4]⁺ contains hydrogen atoms itself through the ammonia molecules ligated to the central Li atom. Through the vehicle of forming a complex cation (and by retaining some solvated molecules of ammonia), the hydrogen content in this compound can be increased dramatically over those containing simple monometallic cations. One approach to increase the hydrogen content still further would be to replace the relatively hydrogen-poor anion by one with a higher hydrogen:boron ratio. Many strong hydrogen storage candidates exist among covalent boron-hydrogen compounds such as NH3BH3 [62], NH3B3H7 [63] and H4N2BH3 [64]. The B-N-H combination is not an arbitrary one but defines a distinctive family of complex molecular species that are stabilised by the dative bond between the strong electron-donating Lewis base, nitrogen, and the strong electron-accepting Lewis acid, boron [65,66]. The strength of this bonding interaction also provides a rationale as to why in ammonia borane, for example, the respective gaseous components, ammonia and borane, are able to exist together as a covalent solid at ambient conditions. Ammonia borane is the archetypal "chemical hydride" where, by exploitation of B-H···H-N interactions (so-called dihydrogen bonds) [67], it is possible to effect hydrogen release at relatively low temperature (ca. 381 K) [68]. Although ammonia borane is the most promising and most studied material in this category, compounds composed of higher boranes or hydrazine, for example, are also potentially interesting. Indeed, all the B-N-H covalent boranes exceed both the gravimetric and volumetric DoE targets. The solid structures of the binary boranes should also be mentioned. Tetraborane (arachno-tetraborane(10)), B4H10 [69], is a prominent example of a borane with both high gravimetric and high volumetric hydrogen capacity. The disadvantage of such compounds, however, is that their melting points and boiling points lie far below and close to ambient conditions, respectively, and they are therefore neither strictly solid-state stores nor especially convenient to employ under typical fuel cell operating conditions. Moreover, such boranes are predominantly highly flammable and have a high toxicity.

Limitations of the Method and Future Implications
Although the open-access database search approach chosen for this work is simple, accessible, and effective, it is necessary to emphasise the unavoidable limitations of the method and possible ways to develop such data analysis in the future. Other than the obvious fact that the approach only serves to evaluate gravimetric and volumetric hydrogen capacity among the many criteria demanded by the DoE for practical storage (and moreover at a materials rather than a systems level), one significant limitation lies with the data themselves. As the results are only captured from a database, they are not internally verifiable and the value of the analysis relies strongly on the quality of the deposited data together with some user experience in recognising obvious outliers. In this respect, drawing from cif-based databases has an advantage in that many helpful tools for the verification of entries already exist, e.g., checkCIF [70]. Automated checks for the integrity of the data can thus readily identify anomalous records. An improvement in the future would therefore be to introduce the capability to include more information from the cif file into the production of the results.
Clearly, however, the use of a crystallographic database limits the available information to structure-derived data. This is principally apparent, perhaps, in that both the gravimetric and volumetric capacity values are derived directly from structural formulae and do not reflect the experimental conditions/limitations involved in extracting and replacing hydrogen. Further, incorporating other DoE requirements such as price, toxicity, decomposition kinetics, or thermodynamics to provide a more comprehensive and informed study would require additional external information and/or the application of other methods. While cost and toxicity, for example, could be tackled by extracting information (where it exists) from other chemical databases, thermodynamic and kinetic values could be obtained by combining a simple database analysis with low-level electronic structure calculations. Such combined data analysis/quantum chemical calculations would nevertheless be time-consuming and probably only feasible for relatively undemanding systems requiring a limited number of calculations.
In the study described herein, the analysis has been largely restricted to hydride and "chemical storage" materials, which already contain bonds to hydrogen. These are the most obvious to examine from a crystallographic database. However, one could also imagine searching for porous materials that retain hydrogen by physisorption and where a structure may or may not contain sorbed hydrogen as deposited. Such an analysis could, for instance, be driven by the differences in density between the deposited structure and a hypothetical closest packing of the atoms in the cell. If the deposited density is significantly smaller than that of the closest packing, there will be some void space in the structure. An excellent example in the field of metal-organic frameworks (MOFs) shows the feasibility of such studies [71]. In that work, a sophisticated algorithm is employed to mine the CSD, reject erroneous entries, identify 3D framework structures, remove solvent molecules, and calculate surface area and porosity. The approach allows the gravimetric and volumetric hydrogen density of approximately 20,000 MOFs to be evaluated, with the one caveat that the calculated figures represent the hypothetical scenario of compounds completely free of solvent and existent as single crystalline monoliths.
Perhaps the most striking limitation of the open-access database search approach presented herein, however, is that it is restricted to pure substances and cannot take account of mixed or "composite" systems, or of those in special states, such as nanoconfined systems. As the method is based on exploiting a crystallographic database, which can only contain the structures of well-defined, single phases, it is impossible to consider "composite" systems, just as it is impossible to account for materials where the hydrogen-containing phase is unknown.
Bearing all the above points in mind, it is evident that structural database analysis can be a very useful tool in the initial stages of the discovery of potential new hydrogen storage systems. Nevertheless, it remains the remit of the attentive chemist or materials scientist to analyse all the necessary information at her/his disposal and to tune systems experimentally to yield practical solutions from a basis of computational results.

Conclusions
The ready availability of archives of existing knowledge in databases creates opportunities for analysis towards the solution of particular problems. Open-access databases provide a means for any researcher to access potentially scientifically valuable data freely at the point of use. Herein, we have demonstrated how the cif files in the COD can be used for the large-scale analysis of potential hydrogen storage materials, in which over 300,000 datasets can be considered. The primary benefit of such analyses is to collate useful knowledge into a serviceable form, which allows the abstraction of purpose-fitting rules or guidelines. In the case of hydrogen storage materials, these collated data can re-emphasise the groups of materials likely to be of importance as potential hydrogen stores (e.g., binary hydrides and boranes) in a quantitative manner based on two key criteria (gravimetric and volumetric hydrogen density). Moreover, the output can be used as a first indicator of new, yet reliably characterised, systems worthy of further exploration. At the next levels of sophistication, one could envisage multiple database screenings to encompass more than two of the essential DoE target criteria, and the combination of searching procedures with other computational tasks, such as the quantum-chemical calculation of the enthalpy of formation (or, for other applications, the calculation of band gaps of solid-state materials).

Figure 1. Screenshot of the graphical user interface (GUI) of the search program (with the default values shown).

Figure 2. Plot of all hydrogen-containing cif entries in the Crystallography Open Database (COD) with respect to their gravimetric and volumetric hydrogen capacity (blue open circles). The dotted line (red) corresponds to the 2020 US Department of Energy (DoE) targets, while the green dashed line corresponds to the ultimate DoE targets. Some prominent outliers are denoted with their sum formula as taken from the respective cif files [20-22].

Figure 3. Plot of all hydrogen-containing organic (green plusses) and inorganic compounds (red crosses) in the COD in terms of volumetric and gravimetric hydrogen density. Some specific examples are labelled in the figure.

Figure 4. Plot of all inorganic hydrogen-containing compounds in the COD (black dots). Selected subsets are emphasised: binary hydrides (magenta crosses), nitrogen-containing compounds (green filled triangles), boron-containing compounds (but excluding oxygen; red filled squares) and oxygen-containing compounds (blue filled circles). Some specific example compounds are labelled in the figure.

Figure 5. Plot of the hydrogen storage capacities of the binary metal hydrides (magenta crosses) included in the COD [31,44,46].

Figure 6. Plot of the compounds in the boron-hydrogen system (excluding C-, O-, P- and S-containing substances; red open squares) emerging from the COD. Some of the most promising individual materials are indicated (for references, see the text).