Computational Surface Modelling of Ices and Minerals of Interstellar Interest—Insights and Perspectives

: The universe is molecularly rich, comprising from the simplest molecule (H 2 ) to complex organic molecules (e.g., CH 3 CHO and NH 2 CHO), some of which of biological relevance (e.g., amino acids). This chemical richness is intimately linked to the different physical phases forming Solar-like planetary systems, in which at each phase, molecules of increasing complexity form. Interestingly, synthesis of some of these compounds only takes place in the presence of interstellar (IS) grains, i.e., solid-state sub-micron sized particles consisting of naked dust of silicates or carbonaceous materials that can be covered by water-dominated ice mantles. Surfaces of IS grains exhibit particular characteristics that allow the occurrence of pivotal chemical reactions, such as the presence of binding/catalytic sites and the capability to dissipate energy excesses through the grain phonons. The present know-how on the physicochemical features of IS grains has been obtained by the fruitful synergy of astronomical observational with astrochemical modelling and laboratory experiments. However, current limitations of these disciplines prevent us from having a full understanding of the IS grain surface chemistry as they cannot provide fundamental atomic-scale of grain surface elementary steps (i.e., adsorption, diffusion, reaction and desorption). This essential information can be obtained by means of simulations based on computational chemistry methods. One capability of these simulations deals with the construction of atom-based structural models mimicking the surfaces of IS grains, the very ﬁrst step to investigate on the grain surface chemistry. This perspective aims to present the current state-of-the-art methods, techniques and strategies available in computational chemistry to model (i.e., construct and simulate) surfaces present in IS grains. Although we focus on water ice mantles and olivinic silicates as IS test case materials to exemplify the modelling procedures, a ﬁnal discussion on the applicability of these approaches to simulate surfaces of other cosmic grain materials (e.g., cometary and meteoritic) is given.


Introduction
The formation of a Solar-like planetary system is the result of the evolution of a primordial interstellar cloud, which takes place through five physical steps: the prestellar, protostellar, protoplanetary disk, planetesimal and planet formation phases [1]. This complex process is associated with an increase in molecular complexity, in which from one phase to the other the chemical composition grows in complexity because molecules formed in the previous steps react to form species that are even more complex [2,3]. This is reflected by radio to near-infrared (IR) observations, which have detected rotational and vibrational transitions of more than 200 different gas-phase molecular species in different interstellar, circumstellar, extragalactic and other astrophysical environments [4]. However, not all the syntheses of these compounds take place through gas-phase reactions, but a characteristics, which we feel very important in the context of this perspective. As mentioned, interstellar bare dust grains consist mainly of silicates and carbonaceous materials. The nature of the later ones has unambiguously not been determined yet. Plausible candidates seem to be polycyclic aromatic hydrocarbons (PAHs), non-hydrogenated and hydrogenated amorphous carbon, silicon carbides, or mixtures of hydrocarbon-based disordered structures [10,[43][44][45]. In contrast, silicates have been clearly observed and identified as major components of interstellar dust [46,47]. Indeed, silicate dust grains emit at mid-IR wavelengths, whose spectral features give information on the chemical composition and structural state. Silicates are a wide class of inorganic materials based on a [SiO 4 ] 4− building block, the negative net charge being compensated by divalent cations, in the interstellar ones by Mg 2+ and Fe 2+ as they appear with the largest cosmic abundances. Astrophysical silicates correspond to the groups of olivines and pyroxenes, with general formula Mg 2x Fe (2−2x) SiO 4 and Mg x Fe (1−x) SiO 3 (x = 0−1), in which Fe 2+ and Mg 2+ replace each other in the structure. They show two broad IR bands at 9.7 (≈1000 cm −1 ) and 18 µm (≈550 cm −1 ), corresponding to the Si-O stretching and O-Si-O bending modes. This is indicative of the predominance of an amorphous nature since the random distribution of bond lengths and angles common in amorphous structural states results in broad bands. In pre-stellar cores, proto-stellar envelopes and protoplanetary midplanes grains are covered in ice mantles. This is due to the freeze-out of gas-phase molecules onto the dust grain surfaces but also mainly by the formation of water molecules in situ (see above). Ice mantles can also be studied by their IR spectroscopic features [48,49], which show stunningly complex profiles as a result of the diverse chemical composition, structural state and grain shape and size. The most dominant component is water, through the O-H stretching 3 µm (≈3300 cm −1 ) band. Additionally, smaller quantities of other volatile species are also present, e.g., CO, CO 2 , NH 3 and CH 3 OH, and hence that ice mantles are usually referred to as "dirty ices". The actual composition and ratios (ca. 5−50% wrt water ice abundance) depend on the environment where they reside. Comparison of experimental IR features with the observational ones indicates that the structural state of the ice mantles matches reasonably well with that of the amorphous solid water [6,7,50]. Moreover, they are also reported to be porous materials [49,51,52], although the porosity degree is uncertain. Cometary grains are composed by rocks, ices and organic components [34], the ratio between the inorganic and the organic fractions being as similar as 55%:45% to that found in the 67P comet [53]. Interestingly, glycine, the simplest amino acid, has been identified in the Stardust Wild 2 [54,55] and Rosetta's 67P [56] comets, with the corresponding profound implications in the exogenous delivery theories. The rocky components comprise a mixture of silicates, oxides and metal sulfides, while the ice composition is similar to that of the interstellar mantles, i.e., H 2 O as the major component with CO, CO 2 and CH 3 OH as the minor components with more abundance. Finally, in relation to meteoritic grains, the most interesting ones from an astrochemistry standpoint are those belonging to the group of carbonaceous chondrites (CCs) because they are considered to be pristine and representative of materials forming the PD. They are mineralogically very rich as ca. 275 class of minerals have been characterized [57]. The most common ones are silicates, aluminosilicates, metal oxides, metal sulfides, carbonates and phosphates. Remarkably, CCs are also very rich as far as soluble organic components are concerned, in which diverse organic molecular species have been identified (e.g., carboxylic acids, hydroxyacids, alcohols, aldehydes and ketones and aromatic and aliphatic compounds), some of them of biological relevance such as amino acids, sugars and nucleobases [58,59].
Spectroscopic observational measurements are traditionally complemented by astrochemical modelling and terrestrial laboratory experiments. However, this combined approach, which has been very fruitful to build-up our current knowledge on the physicochemical properties of IS grains, is currently severally limited at reproducing and characterizing the truly existing grain surface chemistry. Observations allow us to detect, identify and quantify chemical species in different astrophysical environments, but they cannot tell us the ways through which they form, e.g., either in the gas phase or on the grain surfaces, the most favorable reaction channel and the mechanistic steps. Astrochemical modelling is a computational simulation tool mostly dedicated to modelling the abundances of chemical species and their evolution with time with the aim to rationalize the observations. However, some of the input parameters needed for the simulations suffer from uncertainties (especially those involved in gas-grain processes) affecting the accuracy of the simulations [60]. Laboratory experiments are also very useful to rationalize observations, in this case by simulating experimentally chemical reactions of astrophysical interest, which allow us to identify the nature of the products formed and the feasibility of the reactions under the harsh physical conditions of space. However, setting up the physicochemical conditions of the IS medium realistically is technically very difficult, and reproducing the actual composition of grain analogues (e.g., dirty ice mantles) is also tricky [61]. Additionally, information derived from the experiments (e.g., activation and diffusion energies) is dependent on the physico-chemical model adopted to interpret the results, as in the temperature programmed desorption (TPD) experiments [62].
In this interdisciplinary approach, computational chemistry can serve as a complementary discipline that can partly overcome these limitations. Indeed, simulations derived from computational chemistry are capable of representing the elementary physicochemical steps associated with a grain surface reaction (namely, adsorption of the reactants, their diffusion on the surfaces, the chemical reaction and the eventual desorption of the products) realistically and at a molecular level, this way providing unique and valuable atomic-scale information (i.e., structures, energetics and dynamics) of the related phenomena. Within this context, a fundamental aspect is the construction and simulation of realistic atombased structural models for grains. This is indeed a key and mandatory pre-requisite if one wants to investigate the sequence of elementary steps occurring on the grain surfaces in a reliable and accurate way. Although surface modelling is recurrently used for many purposes in surface science and heterogeneous catalysis (normally for the simulation of metal and metal oxide crystalline surfaces), its application in the field of astrochemistry is more limited. Thus, the aim of this perspective is to present different surface modelling techniques grounded on the computational chemistry that are useful to simulate IS grain materials in a realistic way. An earlier report aimed at describing the quantum mechanical treatment of molecule-surface interactions in the context of the biomaterials was published by some of us, which can be used as a complement of this perspective [63].
This manuscript is organized as follows. Section 2 provides a description of the different methods (based on both quantum and classical mechanics) currently available in computational chemistry. Section 3 describes with examples the different state-of-the-art techniques and strategies to construct and model surfaces of interstellar solid-state matter. Finally, Section 4 is devoted to the conclusions alongside some future perspectives in the surface modelling of cosmic grains by means of computational chemistry tools. We would like to mention that the surface modelling techniques presented here use silicates and water-based ices as test cases, because these are the systems in which the authors have more experience. However, the presented techniques can extend to other materials of cosmic interest, as discussed in more detail in this last section as future perspectives.

Computational Chemistry Methods
The main objective of computational chemistry is to solve chemical problems by simulating chemical systems (molecular, biological, materials) in order to provide reliable, accurate and comprehensive information at an atomic level. To this end, there are two main methodological families: those based on quantum chemical methods and those based on molecular mechanics. The former are methods in which the electrons are explicitly accounted for, while in the latter their presence is hidden in the force field. For the sake of understanding this perspective, these methods are briefly described in this section.

Quantum Chemical (QC) Methods
QC methods, also called electronic structure, first-principles or ab initio methods, calculate how electrons and nuclei interact by solving the time-independent electronic Schrödinger equation in the Born-Oppenheimer approximation. QC can be classified into two groups based on either wave function methods or the density functional theory (DFT).

Wave Function-Based Methods
These are methods that, by solving the electron Schrödinger equation, calculate the explicitly correlated electronic wave functions. The simplest wave function-based method is the Hartree-Fock (HF), in which the multielectron wave function is approximated to a single Slater determinant (the mathematical expressions for wave functions in quantum mechanics). This approximation, however, leads to the main pitfall of HF: it neglects the instantaneous electron correlation. That is, it considers that one electron moves in an averaged field due to the remaining electrons, when in reality the motion of the electrons is correlated (the motion of one electron depends on the instantaneous, mutual interaction with the other electrons). The lack of instantaneous electron correlation introduces an error in the HF wave function and energy, seriously affecting the prediction of the kinetic barriers or the description of London dispersion forces.
Despite this inaccuracy, the HF solution can be systematically improved. One way is through the many body perturbation theory, culminated in the Møller-Plesset (MPn) methods [64], or by expressing the wave function as a linear combination of Slater determinants, i.e., the HF one plus those representing single, double, triple, etc., excitations, giving rise to the configuration interaction (CI) methods [65]. The coupled cluster (CC) theory also introduces excited state configurations to the wave function but adopting an exponential expansion (through the cluster operator) on the HF one [66]. The most popular CC derivation is the CCSD(T) method [67], in which single (S) and double (D) excitations are included through the cluster operator and triple (T) excitations by using the perturbation theory. CCSD(T) extensively includes electron correlation and, in combination with largely extended basis sets, is considered the "gold standard" in QC [68].
Despite the improved quality of the post-HF methods, they are considerably computationally expensive so that their applicability is hampered by the size of the systems. CCSD(T) calculations for more than 25−30 atoms are unfeasible so that, in practice, these calculations in surface modelling are mainly used for calibrating more approximate (but cheaper) methods to ensure their suitability in the description of the electronic structure of the simulated surfaces.

Methods Based on the Density Functional Theory (DFT)
The DFT foundations lie on the mathematical formulations developed by Kohn and Sham in 1965 [69]. The central theorem states that the ground state energy of a nondegenerate electronic system is unequivocally defined by the total electron density ρ(r) through a universal and exact mathematical functional (a functional takes a function and defines a single number from it, like in a definite integral)-that is, the energy of a system can be expressed as a functional E of ρ(r), i.e., E[ρ(r)]. This represents an enormous advantage with respect to the wave function-based methods because, in DFT, the ρ(r) for a system of N electrons only depends on 3 spatial coordinates, while the corresponding wave function depends on 3N spatial and N spin variables. Because of this, DFT methods are even cheaper than HF (according to the choice of the functional E[ρ(r)]) while including a significant fraction of the missing instantaneous electron correlation. The disadvantage of DFT is that although the existence of the universal functional relating the electron density to the energy of a system can be mathematically demonstrated, the exact form of E[ρ(r)] is still unknown. The search for functionals connecting these two quantities, known as exchange-correlation functional E XC [ρ(r)], leads to the existence of a wide variety of DFT methods [70]. The simplest approach for E XC [ρ(r)] is the Local Density Approximation (LDA), which implicitly assumes that E XC [ρ(r)] does only depend on the functional expression on the ρ(r) value at the local position r. LDA is improved by the Generalized Gradient Approximation (GGA), in which the E XC [ρ(r)] depends not only on the density ρ(r) but also on the gradient of the density, ∇ρ(r), this way accounting for the spatial varying electron density present in any chemical system. Meta-GGA functionals are a new class of methods in which higher order density gradients or the kinetic energy density are accounted for. Combination of conventional GGA E XC [ρ(r)] with HF-like electron exchange functional E X (HF) gives rise to the DFT hybrid functionals. For these cases, since the weight factor for each component in the definition of the E XC [ρ(r)] cannot be assigned from first-principles, a certain degree of empiricism is used, for instance by fitting the coefficients to experimental data or to post-HF wave-function-based results. Finally, hybrid-meta GGA methods apply a meta-GGA similar concept to conventional hybrid functionals so that these methods depend on the electron density and its gradient, a percentage of exact exchange and the kinetic energy density.
Among the main drawbacks of the DFT methods (specially the oldest ones) is that they do not cope with long-range non-covalent (i.e., dispersion or London) interactions. Although this is not particularly important in the modelling of conventional surfaces (i.e., metals, oxides, etc.), it gains relevance when a proper simulation of the elementary steps taking place at the surfaces (i.e., adsorption, diffusion, reaction) is aimed. A pragmatic solution to account for dispersion interactions in DFT methods was proposed by Grimme, in which an a posteriori correction term based on an atom-atom additive London-type empirical potential (D) is introduced to supplement the DFT energy [85]. Different correction terms have been sequentially proposed and improved (D [86], D2 [87], D3 [88] and D4 [89]). The main advantage of the DFT-D scheme is that it maintains the original accuracy of the pure DFT method for short-range interactions while including the dispersion interactions at a negligible computational cost.

Semiempirical Methods
Despite the enormous cost-effective advantage of the DFT methods with respect to the wave function ones, DFT simulations are also limited by the size of the systems up to hundreds of atoms or even thousands of atoms with high performance computing allowing massively parallel calculations. This is an important aspect in surface modelling, because, in most of the cases, realistic extended systems are inevitably large (e.g., amorphous surfaces), reaching the limits of the DFT applicability.
Semiempirical methods are faster alternative approaches to overcome the size limitations of DFT. These methods are derived from the pure QC methods but with different approximations (e.g., omission of the bielectronic integrals) while including some empirical parameters to mitigate the errors due to the approximations and to account for electron correlation effects. Due to these simplifications and parametrizations, semiempirical methods are faster than their ab initio counterparts, permitting the treatment of large chemical systems, impossible to deal with the pure QC methods. However, accuracy of the semiempirical methods strongly depends on whether the parametrization is suitable for the specific case under study. That is, results can be very wrong if the simulated system does not fit with the training set used to parametrize the method.

Basis Set
Basis sets are a linear combination of basis functions used to build molecular orbitals, which in turn approximately represent the electronic wave function. Thus, the use of basis sets in electronic structure calculations is compulsory. The quality of the calculation and the accuracy of the derived information strongly depend on the completeness of the chosen basis set.
Two main types of basis sets exist: Gaussian-type orbitals (GTOs) and plane waves (PWs). GTOs are localized functions centered on the atoms. They are commonly used in molecular (i.e., finite) calculations because they obey the typical radial-angular decomposition and exhibit the spatial and symmetry properties of atomic orbitals. PWs are periodic functions in nature, i.e., they are not localized functions but uniformly diffuse in space, so that their use requires the adoption of periodic boundary conditions (in which an infinite system is approximated by repeating a unit cell, see below). Because of that, they are widely used in the simulation of periodic solid-state systems.
The main advantage of GTOs with respect to PWs is that the number of basis functions exclusively depends on the number of atoms of the system, while the number of PWs, since they fill the 3D space, depends on the unit cell size. Advantages of PWs with respect to GTOs are that their quality is controlled by a unique parameter (the cut-off kinetic energy) and that they do not suffer from basis set superposition errors (BSSE), which can be dramatically critical for adsorption purposes. BSSE is an artefact when using truncated GTO basis sets, in which there is a system composed of two fragments: one fragment uses the basis set of the other and vice versa, this way leading to an overestimation of the total energy and, therefore, affecting the optimized geometries. BSSE is usually corrected by the counterpoise method [104], with schemes to correct energies (CP) and geometries (gCP). In relation to that, it is worth mentioning the recent development of the HF-3c method [105]. HF-3c is based on HF using the minimal MINIX basis set in which 3 correction (3c) terms are applied (upon inclusion of empirical parameters) to include London dispersions (D3 scheme), to alleviate BSSE (gCP scheme), and to correct for short-range effects caused by basis set incompleteness (i.e., to remove the systematic overestimation of bond lengths for electronegative elements when employing small basis sets).

Classical Molecular Mechanic (MM) and Molecular Dynamic (MD) Simulations
Methods based on classical MM (also called force field methods) ignore the electrons and their motion. A chemical system is described by a "ball and spring" model, in which atoms are represented by balls of different sizes and softness and the bonds by springs of different stiffness. The energy of the system is calculated as a function of the nuclear positions only. To this end, force fields (FFs), a set of interatomic potentials (IPs) encompassing energy functions and parameters, are used to define the molecular mechanics energy measuring the degree of mechanical strain within the system. As electrons are not explicitly accounted for in the classical MM, very large systems (thousands of atoms) can be simulated to predict conformational flexibility and relative stability (e.g., protein folding).
FFs contain energy functions that include bonded and non-bonded terms to represent the intra-(i.e., covalent) and inter-(i.e., non-covalent) molecular forces within the system.
In essence, the various contributions present in the IPs should model: (i) the interaction between pairs of bonded atoms (covalent bond stretching), (ii) the energy required for bending an angle formed by three atoms (angle bending), (iii) the energy change associated with a bond rotation (torsional or dihedral), and iv) the non-bonded pairwise interactions, usually using a Coulomb potential for electrostatic interactions and a Lennard-Jones potential for van der Waals interactions.
The energy functions contain a set of parameters (e.g., equilibrium values for bond lengths, angles, dihedrals and atomic charges) that define the different types of atoms, chemical bonds, angles torsions, non-bonded interactions and other terms. The FF parametrization (i.e., the numerical assignment of the parameters) was historically derived from experimental data, but nowadays has been replaced by QC calculations.
Due to the cheap cost, MM methods are adopted to simulate the dynamics of the nuclear motion within a molecule through molecular dynamic (MD) simulations. In MD, successive configurations of the system are generated by integrating the Newton's laws of motion. The result is a trajectory that specifies how the positions and velocities of the nuclei vary with time, in which at each nuclear position, the molecular mechanics energy and the nuclear forces of the system are evaluated. Thus, MD simulations are a good choice to study the evolution in time-space phase of the atomic positions subject to the internal forces of chemical nature, while the kinetic energy associated with the nuclear motion provides the temperature of the system. Classical MD simulations can reach time-scales of hundreds of picoseconds to microseconds, which allows the ergodicity of the system to be reached.
MD simulations can also be carried out by moving the nuclei within the electronic field generated by the corresponding electronic wave function or, more conveniently, by the electron density usually computed within the DFT. For these cases, the electrons are treated quantum mechanically, while the nuclei, within the Born-Oppenheimer approximation, are treated as classical particles so that their dynamics can be followed by the integration of the Newton equations, like in the MM method. Hence, these simulations are called ab initio molecular dynamics (AIMD). In AIMD simulations, although being computationally more expensive than pure classical MD, the introduction of the electronic structure of the system allows the study of bond breaking/formation as the result of internal exchange of energy. Thanks to the huge improvement of the power of high-performance computers, AIMD can nowadays reach tens to hundreds of picoseconds.
To overcome the high cost and limited time evolution intrinsically linked to AIMD, the relatively new methodologies of machine learning (ML) can be adopted. ML is becoming an explosive field, not only in problems such as image recognition, language translation or to play the GO game but also to tackle physical chemistry problems. The present review cannot expand to include the multitude of new contributions by ML in the present topic due to space constraints and limited authors experience in the ML field. In essence, neural network (NN) approaches can be used to model the potential energy surface for almost any kind of systems through a specific learning based on ab initio data run on a limited set of representative cases. The interested reader can refer to this perspective article [106]. An impressive example of the success of ML in greatly expanding the applicability of AIMD quality calculation up to 100 million atoms at the rate of 1ns/day was recently made possible by the combination of the Deep Potential Molecular Dynamics approach using a highly optimized code (GPU DeePMD-kit) on the Summit supercomputer (see Reference [107]).

Results
In computational chemistry, either periodic or cluster approaches are used as possible paradigms to represent the external faces of the real material. The periodic paradigm starts by cutting out a unit cell from the bulk of the material to build up a 2D model (in principle infinite) of the surface represented by a slab of the material of finite thickness. The surface unit cell is thus repeated periodically. The cluster paradigm is based on using a molecular system (i.e., the cluster) as a structural model mimicking the surfaces. In this section, the application of these two surface modelling strategies and related approaches are described using, as test cases, interstellar ices and silicates.

Crystalline Surfaces
Observation of the interstellar dust (see above) revealed that both the core of the grains (silicates) and the mantles (mainly water ice) are in the amorphous state. This fact renders the computer simulation of these materials particularly challenging as no definite molecular structure is available for amorphous phases. The usual way to model surfaces (not limited to those of IS interest but applicable in a wider context) is to imagine the amorphous material as a repetition of a fictitious unit cell large enough to mimic the absence of local order due to the amorphous nature of the material, which is repeated by the periodic boundary conditions. In this way, the same machinery (programs and algorithms) to simulate crystalline materials is adopted.
A perfect crystalline solid bulk structure can be generated by repeating its unit cell along the three directions (3D) of space, in such a way that an infinite, periodic system is obtained (see Figure 1). The crystallographic unit cell is defined by six independent parameters, i.e., the a, b and c cell vectors, and the α, β and γ intra-cell angles (see Figure 1). The atomic content of the unit cell is repeated periodically by application of the lattice translation vector T = la + mb + nc (where l, m and n are integer numbers spanning the [−∞, +∞] range), this way generating the periodic systems. Examples of the unit cell for silicates (Mg 2 SiO 4 , forsterite) and water ice (P-ice) are shown in Figure 2 (bulk structures).

Results
In computational chemistry, either periodic or cluster approaches are used as possible paradigms to represent the external faces of the real material. The periodic paradigm starts by cutting out a unit cell from the bulk of the material to build up a 2D model (in principle infinite) of the surface represented by a slab of the material of finite thickness. The surface unit cell is thus repeated periodically. The cluster paradigm is based on using a molecular system (i.e., the cluster) as a structural model mimicking the surfaces. In this section, the application of these two surface modelling strategies and related approaches are described using, as test cases, interstellar ices and silicates.

Crystalline Surfaces
Observation of the interstellar dust (see above) revealed that both the core of the grains (silicates) and the mantles (mainly water ice) are in the amorphous state. This fact renders the computer simulation of these materials particularly challenging as no definite molecular structure is available for amorphous phases. The usual way to model surfaces (not limited to those of IS interest but applicable in a wider context) is to imagine the amorphous material as a repetition of a fictitious unit cell large enough to mimic the absence of local order due to the amorphous nature of the material, which is repeated by the periodic boundary conditions. In this way, the same machinery (programs and algorithms) to simulate crystalline materials is adopted. A perfect crystalline solid bulk structure can be generated by repeating its unit cell along the three directions (3D) of space, in such a way that an infinite, periodic system is obtained (see Figure 1). The crystallographic unit cell is defined by six independent parameters, i.e., the a, b and c cell vectors, and the α, β and γ intra-cell angles (see Figure 1). The atomic content of the unit cell is repeated periodically by application of the lattice translation vector T = la + mb + nc (where l, m and n are integer numbers spanning the [−∞, +∞] range), this way generating the periodic systems. Examples of the unit cell for silicates (Mg2SiO4, forsterite) and water ice (P-ice) are shown in Figure 2 (bulk structures).  Crystalline surfaces are systems in which T applies only along two cell parameters (those defining the actual surface), this way forming a 2D-periodic slab model. The remaining direction perpendicular to the surface slab is no longer crystallographic and describes the z-coordinate of each atom in the slab. The number of atomic layers defines the slab thickness of the surface model. These crystalline surfaces of finite thickness results from cutting out a slice from the corresponding crystalline bulk structures along a desired Miller (hkl) plane (so-called slab-cut procedure), while maintaining cell neutrality, the right stoichiometry and zero dipole moment across the slab. Figure 2 exemplifies the slab-cut procedure to model the (010) Mg 2 SiO 4 (A) and the (010) water proton ordered P-ice (B) surfaces form the corresponding bulk systems, as provided in References [108][109][110][111] for Mg 2 SiO 4 surfaces and References [112][113][114][115] for H 2 O P-ice surfaces.
When modelling surfaces adopting the periodic crystalline approach, the thickness of the models should be large enough to avoid structural and energetic artefacts. Low thicknesses can lead to spurious surface geometry reconstructions (e.g., in molecule adsorption) due to an exceedingly high mobility of the surface atoms. Moreover, the surface energy (E S ) of the model should remain constant by adding an increasing number of layers. E S is a quantity representing the penalty suffered by a given slab when it is detached from the crystal bulk. E S is used to establish a ladder of relative stabilities between surfaces derived from the same bulk (e.g., for Mg 2 SiO 4 slab stabilities see References [116,117]). If E S is not constant (normally due to a too low thickness), the electronic structure of the slab is thickness-dependent and accordingly surface properties may vary as a function of the thickness. Crystalline surfaces are systems in which T applies only along two cell parameters (those defining the actual surface), this way forming a 2D-periodic slab model. The remaining direction perpendicular to the surface slab is no longer crystallographic and describes the z-coordinate of each atom in the slab. The number of atomic layers defines the slab thickness of the surface model. These crystalline surfaces of finite thickness results from cutting out a slice from the corresponding crystalline bulk structures along a desired Miller (hkl) plane (so-called slab-cut procedure), while maintaining cell neutrality, the right stoichiometry and zero dipole moment across the slab. Figure 2 exemplifies the slabcut procedure to model the (010) Mg2SiO4 (A) and the (010) water proton ordered P-ice (B) surfaces form the corresponding bulk systems, as provided in References [108][109][110][111] for Mg2SiO4 surfaces and References [112][113][114][115] for H2O P-ice surfaces.
When modelling surfaces adopting the periodic crystalline approach, the thickness of the models should be large enough to avoid structural and energetic artefacts. Low Finally, it is worth mentioning that depending on the basis set functions used to describe the atomic orbitals, the slab surfaces may or may not be considered as a true 2D system. With GTOs (see above, Section 2), due to their localized nature, true 2D-periodic systems can be modelled, i.e., with a true infinite empty space above and below the slab. In contrast, when PWs are employed, surface modelling is forcedly based on pseudo 3D systems, as the PWs extend towards the space. The surface model consists of a 3D replica of the slab along the direction perpendicular to the slab plane, which is used to ensure that enough empty space is left between replicas, to prevent spurious interactions between the replicated slabs.

Amorphous Surfaces
As mentioned above, the phase of most of the IS grains is amorphous. This in terms of computational modelling means that the structures do not follow long-range order, which implies that a 2D surface unit cell should be large enough to represent the disordered structures. This is achieved by adopting a large unit cell, with randomized internal atomic positions to ensure bond length and angle distribution characteristics of amorphous systems. The unit cell amorphous content is usually achieved by starting from a crystalline phase of the system and running MD at high temperature to randomize the nuclei positions which are then cooled down to reach a vitreous/amorphous phase (so-called hot/cool MD procedure).
An example of this procedure is shown in Figure 3A for the amorphization of the crystalline P-ice (010) surface. Alternatively, one can use the 3D periodic crystalline bulk systems as an initial guess structure for the MD simulation. For these cases, one obtains a 3D periodic amorphous bulk system, in which the slab cut procedure defines different amorphous surfaces. An example of this procedure is shown in Figure 3B, for the forsterite (Mg 2 SiO 4 ) surfaces [118].

Free and Embedded Cluster Systems
The cluster approach (CA) consists of using a finite (i.e., a molecule) system to model a surface. At variance with the periodic approach, in the CA, an extended system is reduced into a finite number of atoms representing, in principle, the physico-chemical features of the local region of interest of the surface. In this approach, two methods are possible to prepare the cluster models: the top-down and the bottom-up. The same procedure can be adopted to simulate IS dirty ices (see above, Section 1). The pristine pure amorphous water ice is made further defective, since a certain number of water molecules are substituted by other volatile molecules, such as CO, NH 3 or HCN. A possible strategy to model surfaces of this kind is achieved by: (i) replacing some water molecules by the volatile species in the crystalline structure of pure water ice (please note that this should be done in agreement with the chemical composition present in the IS ices); (ii) running MD simulations to amorphize the dirty ices; and (iii) defining the slab-cuts to obtain the desired surfaces. An example of this sequence is shown in Figure 4 for the generation of a model surface corresponding to a mixture of H 2 O:CO (ca. 75%:25%) [119].

Free and Embedded Cluster Systems
The cluster approach (CA) consists of using a finite (i.e., a molecule) system to model a surface. At variance with the periodic approach, in the CA, an extended system is reduced into a finite number of atoms representing, in principle, the physico-chemical features of the local region of interest of the surface. In this approach, two methods are possible to prepare the cluster models: the top-down and the bottom-up.  Step (i) replacement of H2O molecules of the crystalline water P-ice bulk system by CO molecules (according to the H2O:CO ratio).
Step (ii) running MD simulation at 300 K on the H2O/CO crystalline-like system to amorphize the structure.
Step (iii) slab-cut procedure on the amorphous H2O/CO bulk structure to reach an amorphous H2O/CO slab surface model as a final product. Figure 5A shows the top-down strategy to construct a cluster model for a water P-ice surface. The cluster is generated by taking out a molecular piece from the crystalline periodic 3D bulk structure [120]. However, reconstruction effects, particularly severe at the frontier of the cluster, may cause large geometry distortions of the initial structure, leading to final systems that significantly differ with respect to the original ones [19,26]. To cope with the frontier problem, nuclei at the exterior part of the cluster can be fixed to the pris- Step (i) replacement of H2O molecules of the crystalline water P-ice bulk system by CO molecules (according to the H 2 O:CO ratio).

Top-Down Approach
Step (ii) running MD simulation at 300 K on the H2O/CO crystalline-like system to amorphize the structure.
Step (iii) slab-cut procedure on the amorphous H 2 O/CO bulk structure to reach an amorphous H 2 O/CO slab surface model as a final product. Figure 5A shows the top-down strategy to construct a cluster model for a water P-ice surface. The cluster is generated by taking out a molecular piece from the crystalline periodic 3D bulk structure [120]. However, reconstruction effects, particularly severe at the frontier of the cluster, may cause large geometry distortions of the initial structure, leading to final systems that significantly differ with respect to the original ones [19,26]. To cope with the frontier problem, nuclei at the exterior part of the cluster can be fixed to the pristine bulk material positions to enforce the structural memory of the original system. It is worth mentioning that in this top-down procedure, the type, size and surface features of the cluster are a subjective choice of the modeler. tool is capable to automatically generate both stoichiometric and charge electroneutral NPs for ionic systems, without human post-processing operations, avoiding bias and mistakes. Interestingly, the Wulff-constructed NP structures can be amorphized by the usual hot/cool MD procedure. Figure 5B (amorphous NPs) shows the effect of this step for the forsterite NP [123].

Bottom-up Approach
The bottom-up strategy consists of building the cluster systems by joining atoms, molecular units or even small clusters in such a way that the clusters grow in size upon successive joining. In the case of water, cluster growth takes place by a water-by-water Another less subjective top-down strategy is to be guided by the Wulff construction [121], frequently adopted to design structural nanoparticle (NP) models. In practice, by exploiting the Gibbs-Wulff theorem [122], the final morphology of an NP is dictated by the relative surface energy of the crystal faces defining the NP shape. Figure 5B shows two different in-size Wulff-constructed NP models of forsterite (Mg 2 SiO 4 ) [123]. NPs based on the Wulff-construction procedure have been successfully modelled for metallic systems (e.g., [121,124,125]) as well as for binary ionic systems (e.g., [126][127][128][129]). For these latter systems, however, a rigorous control of the stoichiometry is mandatory, requiring in most of the cases manual manipulation (normally removal/addition of the excess/missing atoms). An automatic procedure is encoded in the BCN-M software (version 1.0) [130]. This tool is capable to automatically generate both stoichiometric and charge electroneutral NPs for ionic systems, without human post-processing operations, avoiding bias and mistakes. Interestingly, the Wulff-constructed NP structures can be amorphized by the usual hot/cool MD procedure. Figure 5B (amorphous NPs) shows the effect of this step for the forsterite NP [123].

Bottom-up Approach
The bottom-up strategy consists of building the cluster systems by joining atoms, molecular units or even small clusters in such a way that the clusters grow in size upon successive joining. In the case of water, cluster growth takes place by a water-by-water addition, this way forming the dimer, trimer, tetramer, etc. (see Figure 6A [131]). Due to the intrinsic relevance of water, the lowest-energy structures of water clusters of increasing size have already unambiguously determined up to sizes of 20 water molecules [132,133]. However, since within this strategy the structures are not derived from the crystal bulks (and accordingly they are not constrained to be bulk-like), an explosive number of possible final candidates emerges. In these situations, searching the global minimum of the potential energy surfaces (PES) is a very challenging task. This can currently be solved by applying global optimization techniques, which are algorithms that allow us to explore the complex potential energy maps to identify the most stable systems.
To achieve that, genetic algorithms [134][135][136][137][138] and Basin-Hoping Monte Carlo (BHMC) algorithms are the most common modelling strategies to seek the global minimum [139][140][141][142]. The BHMC method combines Monte Carlo with local optimization techniques, providing a picture of the energy landscape of the considered molecular clusters. Therefore, application of BHMC allows us to locate different low-energy minima, hopefully including the most stable one. The BHMC algorithm has been successfully used to locate the lowest-energy Mg 2 SiO 4 nanoclusters of different formula units (see Figure 6B) [143]. It has also been applied to study the growth of silicate nanoclusters by successively adding atoms and/or molecular units, this way providing a path associated with the nucleation and condensation of IS silicates [144,145]. Interestingly, execution of BHMC simulations using a DFT approach can be extremely computationally expensive and in practice, it is only affordable for small clusters. Thus, for larger systems, a two-step process is generally used. First, the global optimization is carried out with less expensive methods, normally classical interatomic MM or semi-empirical potentials, and then the manifold minimum-energy structures are subsequently reoptimized at the higher levels of theory to refine the final energy ranking within the original manifold. In this line, the GFN-xTB family of semiempirical methods (see above, Section 2) have demonstrated to be actually cost-effective as far as the modelling of large water clusters is concerned, providing comparable results (structure and energetics) with the CCSD(T) ones [146]. Again, efficiency and improvement in GA and basin-hopping may result in the clever application of ML techniques, as highlighted in a recent work by the Hammer's group [147].
ing size have already unambiguously determined up to sizes of 20 water molecules [132,133]. However, since within this strategy the structures are not derived from the crystal bulks (and accordingly they are not constrained to be bulk-like), an explosive number of possible final candidates emerges. In these situations, searching the global minimum of the potential energy surfaces (PES) is a very challenging task. This can currently be solved by applying global optimization techniques, which are algorithms that allow us to explore the complex potential energy maps to identify the most stable systems. To achieve that, genetic algorithms [134][135][136][137][138] and Basin-Hoping Monte Carlo (BHMC) algorithms are the most common modelling strategies to seek the global minimum [139][140][141][142]. The BHMC method combines Monte Carlo with local optimization techniques, providing a picture of the energy landscape of the considered molecular clusters.

Embedded Clusters
The free cluster approach suffers by the entirely lost memory (both electronic and structural) of the bulk solid from which it has been derived. Atoms and molecules (for the icy models) at the border of the cluster feel a very different environment with respect to the companions at the interior of the cluster. A very effective way to restore, at least partially, the "memory" of the surroundings from which the cluster was cut out is the IMOMM method, originally proposed by F. Maseras and K. Morokuma as "integrated molecular orbital molecular mechanics", in which the interesting zone of the cluster was treated at a higher level (quantum mechanics) while adopting a low level (classical force fields) for the external part [148]. The method then evolved in the ONIOM (Our own N-layered Integrated molecular Orbital and Molecular mechanics), where the low level is no longer restricted to be a classical-based force field but can be any quantum mechanical method. ONIOM has been popularized by its incarnation as an option in the Gaussian program [149]. The method is, therefore, ideal to treat a large portion of the material of interest (for instance, a large chunk of water ice, either crystalline or amorphous) treated at a low level of theory (a large number of force fields for water are available), while the zone of chemical interest for adsorption of iCOM or reactivity can be limited to a smaller portion of the system treated at high quantum mechanical level. The ONIOM method has been recently reviewed [150]. Relevant applications of the ONIOM approach in astrochemistry are the study of adsorption at the icy surfaces, mimicking the exterior of an icy grain using a crystalline slab of water ice. To achieve good performance while maintaining the accuracy high, the authors proposed to adopt as a low level method the sophisticated polarizable force field AMOEBA09 [151], as defined in the Thinker program [152]. The ONIOM(QM:AMOEBA09) approach was first tested for the self-adsorption of water molecules on a crystalline water I h ice slab and then extended to treat radical species (OH, HCO and CH 3 ), adsorption, a particularly difficult task due to the open shell nature of the system [153]. In summary, we believe the ONIOM approach can be quite successful when dealing with the simulation of local chemical phenomena (adsorption, reactivity, etc.) occurring at the grain surfaces. The possibility of using any level of theory for the low-level method through a scripting tool included in the Gaussian program opens up a very interesting perspective when the Grimme's new suite of semiempirical (xTB-GFN0,1,2) [101][102][103] and force field (GFN-FF) [154] methods are adopted, due to their excellent performance in treating non-covalent interactions which dominate the iCOM adsorption features.

Surface Models with Surface Defects
How IS ices grow is still a subject of deep debate. Grains do not grow by a stepwise addition of water molecules: instead, the water molecules are formed in situ by a series of elementary surface reactions involving H and O atom-derived species (e.g., O, O 2 , OH, etc.). Therefore, little is known about the way in which the water heat of formation re-models the underlying ice grain structure. Clearly, the above-described strategy to simulate the amorphous icy grain structures, i.e., starting from a crystalline ice followed by a hot/cool MD procedure, seems far away from the real physicochemical steps leading to ice formation. A different and clever strategy for modelling the accretion of an icy grain by subsequent reactive steps involving H and O atoms has been proposed by Garrod [52]. The simulation envisages diffusive and reactive steps towards the formation of the final H 2 O molecule (and also O 2 , H 2 and H 2 O 2 as a by-products). By tuning the simulation empirical parameters related to the energy of the attached atom and its diffusivity, very large grains have been obtained exhibiting different degrees of internal porosity, revealing sites where H 2 can be trapped. The beauty of the approach lies in its simplicity and full control of the physical conditions. The main limit is in the water-water interactions, completely ignoring the specific hydrogen bond features which, through its complex 3D web of intermolecular bonds, can fully affect the porosity and density of the final grain. The Garrod's group has further contributed to our understanding of the icy grain chemistry and structure in a subsequent series of papers [52,[155][156][157][158][159]. As an alternative to render the icy grain structure, the presence of surface defects of various nature can be introduced in the computer model. In this case, surface defects are imperfections of the regular arrangement of the atomic positions in the external surfaces. These sites exhibit enhanced instability compared with the non-defective analogues and accordingly may be particularly reactive. For water ice, surface defects are usually dangling bonds of undercoordinated water molecules (i.e., surface H or O atoms that can act as H-bond donors or acceptors, respectively), presence of radicals (e.g., OH, generated by UV and/or cosmic rays' irradiation) and cavities or pores of nanometric size in their morphology, as revealed by porosity of the IS ices. For amorphous silicates, surface defects are usually point (i.e., local) defects, and the most common are undercoordinated metal ions (the valences of the metals are not fully satisfied), vacancies (missing atoms) and substitutions (for IS silicates normally Fe 2+ replacing Mg 2+ ) but also oxygen anions may behave as powerful nucleophilic centers with respect to CO 2 and CH 3 CN [160].
Some of these surface defects can already be generated by means of the periodic and cluster surface modelling approaches. This is, for instance, the case of surface dangling bonds (see the crystalline P-ice (010) surface in Figure 2B or the amorphous slab in Figure 3A) or the case of undercoordinated Mg atoms (see the outermost Mg atoms of the crystalline Mg 2 SiO 4 (010) surface of Figure 2A or of the amorphous Mg 2 SiO 4 surfaces of Figure 3B). Relatively simple defects require a straightforward manipulation of the "initial" structures. For instance, the presence of surface OH radicals in ices can be formed by removal of H atoms from one H 2 O molecule belonging to the ice [131], or the presence of Fe 2+ in surface silicates by simple substitution of Mg 2+ for Fe 2+ in Mg 2 SiO 4 -based surfaces [161]. The modelling of surfaces exhibiting specific morphologies is more problematic (cavities, pores) as these defects involve a relatively large portion of the entire surface to be represented. An effective strategy to model surfaces with cavities is the bottom-up one, in which clusters of different sizes and morphologies are joined together to achieve basin-like regions mimicking cavities or pores. Examples of this procedure are shown in Figure 7 for water ice. Additionally, the ice structure resulting from merging smaller clusters can become the unit cell content of a periodic 2D amorphous water ice surface exhibiting a cavity morphology [113]. This last step requires extra care to ensure a proper H-bonding between the content of the unit cell with the neighbor cells.

Summary and Perspectives
The present work addresses the various procedures to construct and model surfaces for systems mimicking solid-state interstellar (IS) grains, either core or mantle, by means of state-of-the-art methods, techniques and strategies of computational chemistry. The work highlights the strategies and problems to be overcome to construct representative structurally atomistic surface models for IS grains suitable for computational

Summary and Perspectives
The present work addresses the various procedures to construct and model surfaces for systems mimicking solid-state interstellar (IS) grains, either core or mantle, by means of state-of-the-art methods, techniques and strategies of computational chemistry.
The work highlights the strategies and problems to be overcome to construct representative structurally atomistic surface models for IS grains suitable for computational investigations of their physico-chemical properties. Indeed, several astrochemically relevant reactions only occur in the presence of the grains, which, alongside the gas-phase-based reactions, feed the rich chemical diversity and complexity present in the Universe. Moreover, such a molecular complexity is linked to the different physical phases involved in the formation of Solar-like planetary systems, in which grain particles (from interstellar, protoplanetary and interplanetary origin) play a central role in the formation of molecular species detected in each phase. However, the specific role in each reaction-type is not fully understood. Obtaining structural models for cosmic grain surfaces is the very first step to subsequently simulate grain surface reactions of different nature. Thus, to simulate these processes accurately, a proper construction and description of the electronic structure of the grain models is fundamental.
As the grain surface modelling is grounded on computational chemistry techniques, brief descriptions of the methods belonging to this realm are provided, in particular those based on quantum chemistry (HF, post-HF, DFT) and classical mechanics (Force Field). Limits and merits of each class of method are briefly illustrated. Molecular dynamics, either using FF or in the AIMD incarnation, is a key tool to amorphize crystalline material through hot/cooling cycles.
Finally, the key materials covering most of the IS core (silicates) and mantle (dirty ice) composition have been described as paradigmatic cases. The following surface modelling items are presented and exemplified: (i) periodic and cluster approaches, (ii) construction of periodic surfaces adopting the slab-cut procedure, (iii) construction of cluster systems via "molecule" extraction from the periodic bulk, (iv) generation of clusters and nanoparticles adopting the Wulff-construction scheme, (v) generation of nanoclusters adopting the global optimization algorithm of Monte Carlo Basin Hoping or ML techniques, (vi) generation of amorphous systems by applying the hot/cool MD procedure on crystalline-like systems also by ML to overcome time evolution bottleneck, (vii) construction of defective surfaces, in particular those presenting point defects and those exhibiting cavities/pores. We are facing the new challenge of ML which is revolutionizing computational chemistry; at the same time, ML still faces some subtle problems, namely, the choice of suitable molecular descriptors (see [162] for discussion on this topic), and the difficult replication of the results outside the laboratory where a specific trained NN has been developed; this latter point is reminiscent of computational chemistry before the comparison of highly engineered, stable and reliable quantum chemistry computer programs, in which research groups relied on in house developed codes. Nevertheless, we are confident that ML will grow steadily in the immediate future, opening the route for the simulation of micrometric sized grains for both the silicate core and the water ice mantle. ML can also be crucial to leverage the high cost related to the prediction of iCOM infrared spectra at the grain surfaces, allowing the comparison with the James Webb (JWST) space telescope observation expected in the immediate future.
As mentioned in the Introduction, these surface modelling strategies are usually adopted in topics of surface science and/or heterogeneous catalysis, especially the periodic approach for crystalline systems. Their applicability, however, contrasts significantly with the few works focused on astrochemistry, in which water ices and silicates are by far the most modelled materials. However, materials belonging to these families have not been fully exploited for astrochemical purposes. For silicates, several structural models are available for olivines, but not for pyroxenes. It is worth bearing in mind that pyroxenes present a lower amount of Mg/Fe cations than olivines, which can affect their adsorptive features, and hence their chemical activity. Thus, a systematic and comprehensive investi-gation of the differences in the physicochemical properties between olivine and pyroxene surfaces is of high relevance, which will firstly require the construction of pyroxene surface models. In relation to ice mantles, the interstellar and cometary ones consist mainly of water (and hence commonly modelled by pure water), but traces of other components are also present. However, surface models of true dirty ices (i.e., H 2 O mixed with traces of CO, CO 2 , NH 3 , CH 3 OH, etc.) are still to be proposed. Models of this kind are very important, because the interaction of these minor species with water ice molecules can lead to their activation, increasing their propensity to react. Such an activation effect is usually investigated by modelling the gas-ice complexes (i.e., the minor species being adsorbed on top positions of pure water ice surfaces). However, the degree of activation can be different if the minor species forms part of the ice network, as different interactions with the surrounding molecules are established. Indeed, these minor iced molecules are attributed to be raw reactive species that trigger chemical reactions leading to an increase in molecular complexity (e.g., formation of iCOMs). Thus, realistic models for dirty ice mantles are highly demanded, the construction of which can be conducted by adopting similar approaches as for water ice.
The reason why water ice and silicates are the most common materials to study grain surface chemical reactions is that they are the most abundant ones in the IS medium. However, other materials are also available and relatively abundant in this and other astrophysical media, which can indeed play roles in building-up the chemistry present in space. This is undeniably the case of the rocky materials belonging to comets and meteorites. These asteroidal bodies present a wide variety of minerals, the most usual ones being silicates, aluminosilicates, metal oxides, metal sulfides and carbonates. On terrestrial Earth, some of these materials exhibit catalytic properties, which can operate efficiently in space. Thus, investigations of the synthesis of organic compounds detected in comets and found in meteorites and the role of the minerals in these processes are of fundamental relevance, not only for the astrochemistry field but also for the prebiotic chemistry one, since these minerals were also present on the primordial Earth's crust [163,164]. Therefore, the modelling of surfaces for materials belonging to these mineral families is mandatory to conduct these investigations. Modelling of these systems can certainly be conducted by means of the strategies presented in this perspective, but in this case with some caution, as these minerals differ significantly to silicates and, especially, to ice mantles-that is, cometary and meteoritic minerals are undoubtedly more complex in terms of their nature (pure ionic or hybrid ionic/covalent systems), composition (more chemical species), structures (different crystalline and amorphous systems) and electronic structure (i.e., open-shell systems such as radical ones or those containing Fe 2+ ).
Therefore, for all the exposed above, there is indeed a fertile and unexplored ground for theoreticians and modelers dedicated to the primordial chemical evolution and origin of life's studies.