Theoretical Investigation of Glycine Micro-Solvated. Energy and NMR Spin Spin Coupling Constants Calculations

Glycine in its neutral form can exist in the gas phase, while its zwitterionic form is more stable in aqueous solution, but how many water molecules are actually necessary to stabilize the zwitterionic structure in the gas phase? Are the intramolecular isotropic spin-spin coupling constants (SSCCs) sensitive enough to detect a change in the environment, or in the conformer observed? These and related questions have been investigated in a computational study at the level of density functional theory, employing the B3LYP functional and the 6-31++G**-J basis set. We found that at least two water molecules explicitly accounted for in the super-molecule structure are necessary to stabilize both conformers of glycine within a water polarizable continuum model. At least half of the SSCCs of both conformers are very stable to changes in the environment, and at least four of them differ significantly between the neutral and zwitterionic conformations.


Introduction
Glycine is the smallest amino acid found in living organisms and is therefore of great interest for theoretical studies, mainly because of the comparatively low computational cost of investigating it. In particular, many of its properties are already known experimentally [1][2][3][4][5][6][7][8][9]. Among the many works on glycine, some aim to elucidate the mechanisms of tautomerization. The proton transfer from neutral glycine, stable in the gas phase, to the zwitterionic conformer, stable in aqueous solution, has an estimated free energy barrier of about 7.3 kcal/mol [1][2][3][5][6][8][10][11][12][13][14]. Other recent works study small glycine:(H$_2$O)$_n$ clusters in order to elucidate their structures and binding energies [15][16][17][18], the stabilization of Z-glycine in water and its NMR signature [19], and the selection of a method for the fast and accurate calculation of $^{1}J(C, H)$ for application in proteins [20].
Császár [21] found the three most stable conformers of neutral glycine and named them Ip, IIp and IIIp; then Godfrey et al. [22] found a slightly different set and named them 1, 3 and 2; later, Sauer et al. [23] and Aikens and Gordon [4] arrived at the same conformers as Császár, naming them A, C and B, and N1, N3 and N2, respectively. More recently, Caputo et al. [24] found the structures proposed by Császár. According to Császár [21] and Ding and Krogh-Jespersen [25], the zwitterion spontaneously tends to structure IIp, 2, C or N3 when it is left in the gas phase, which is obviously the shortest path for the proton. Moreover, the most stable geometry we found when including a polarizable continuum model (PCM) is precisely this last one, denoted N (for neutral) in the present work. This conformer corresponds to the second most stable structure found by Császár, Sauer et al. and Aikens and Gordon. We name the other two structures analyzed in this work Z (for zwitterion) and TS (for transition state); both were found within the PCM model.
Alonso et al. [16] reported that glycine conformer III, which coincides with the neutral conformer chosen for this work, was not observed in the glycine:water complex IIIa in their supersonic jet experiment, and they attributed this to its collisional relaxation to the most stable conformer of glycine (I). However, it is not the purpose of the present work to analyze the collisions of glycine:water.
In this work, we use structure IIp, 2, C or N3 for the neutral glycine, together with its zwitterion, to analyze their stability within the PCM model and with explicit water molecules added until the hydrophilic positions are saturated.
The number of water molecules necessary to reach saturation varies somewhat according to the conditions established for hydrogen bond (HB) formation; see, for instance, Ref. [6]. Here, we only excluded rings of water molecules that do not involve the glycine, so saturation is reached with eight water molecules. The resulting energy scheme allows one to analyze the contributions to the energy of the HBs formed in the super-molecule model. The scheme followed for adding water molecules is shown in the computational details section. Finally, we analyze the NMR indirect spin-spin coupling constants (J) between the intramolecular heavy atoms.
The calculations of SSCCs were performed using density functional theory with the B3LYP functional and the reliable 6-311++G**-J basis set [33]. Basis functions, for both glycine and water molecules, were taken from the Basis Set Exchange [34][35][36].
Calculations include a liquid solvent described by the Polarizable Continuum Model (PCM) approximation [37] in its SMD variant [38], which differs from the default IEF-PCM in that it also includes the non-electrostatic terms of the solvation model of Truhlar and co-workers [39][40][41][42][43]. Vibrational corrections were not included because they are normally too expensive to calculate in standard applications to larger molecules.
The geometries obtained in this work are essentially the same as those obtained by Caputo et al. in Ref. [24] and are represented in Figure 1. The transition state (TS) was located from the N and Z structures with the QST3 option, within the PCM approximation mentioned above.
Analysis of the neutral glycine molecule allows one to identify four hydrophilic sites. They are, respectively, the two oxygens (syn- and anti-periplanar to the C-N bond), named in Table 1 as Ac (for acid) and $O_{T}$ (for trans); the amino group, named Am (for amino); and, finally, the possibility of simultaneous HB formation between Ac and Am, which is named AcAm.
Therefore, subsequent water molecules were added following the order and position indicated in Figure 2 and Table 1. Up to eight water molecules effectively bonded to glycine could be included for the three conformers, that is, without forming rings of water molecules that do not include the glycine. The reported SSCCs are single point calculations in all cases.

Table 1.
H-Bond Sites    N/Z-Structure Insertion Order
Ac              1, 6
Am              2, 4
O T             3, 5, 7
AcAm            8

Figure 1. Optimized geometries of glycine: N (left) neutral, TS (centre) transition state and Z (right) zwitterionic form, using B3LYP/6-31+G(d,p) within PCM with the dielectric constant of water. Red represents oxygen, blue represents nitrogen, black represents carbon and gray represents hydrogen.

The theory of indirect nuclear SSCCs [44] and the different computational methods for calculating them have been described extensively in the literature [45][46][47][48][49][50][51]. However, it is useful to recall that there are four contributions to the SSCC: the Fermi contact (FC) and the spin-dipolar (SD) terms, which come from the interaction of the nuclear magnetic moments with the spin of the electrons, and the diamagnetic spin-orbital (DSO) and the paramagnetic spin-orbital (PSO) terms, which are due to the interaction of the nuclear spins with the orbital angular momentum of the electrons.
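The four contributions named above can be written compactly as a sum. The superscript labels and the nuclear indices K and L are a notational choice made here for illustration, not necessarily the notation of the cited references:

```latex
J_{KL} = J_{KL}^{\mathrm{FC}} + J_{KL}^{\mathrm{SD}} + J_{KL}^{\mathrm{DSO}} + J_{KL}^{\mathrm{PSO}}
```

In the discussion of the results, the FC term is the one reported alongside the total, since it generally determines the behavior of the total coupling.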

Basic Concepts
We have analyzed the hydration of one conformer of glycine that is stable in the gas phase, its zwitterionic counterpart, which is stable in the liquid phase, and the transition state obtained from the former two, all depicted in Figure 1. The hydration of these conformers was performed systematically, adding one explicit water molecule at a time until saturation was reached, and also embedding these complexes in a continuum with the dielectric constant of water through the PCM model, which allowed us to obtain a stable zwitterionic conformer. The sequence for incorporating the explicit water molecules is given in Figure 2 and Table 1.
The zwitterion structure needs a minimum of two water molecules in the supermolecule structure to preserve its stability in the gas phase, but within the PCM approximation the three structures are stable by themselves, that is, without any explicit water molecule. As a reminder, the SMD variant is an IEF-PCM that includes the non-electrostatic terms for Truhlar and coworkers' solvation model. Within the PCM approximation, four water molecules are needed for Z-Gly to become more stable than the whole series of hydrated N-Gly and TS-Gly molecules, see Figure 3.
The energies [in a.u.] for the different super-molecular systems Gly:(H$_2$O)$_n$ (with n = 1 to 8) were obtained as $E_{Gly} + (8 - n) \cdot E_{H_2O}$, where $E_{Gly}$ and $E_{H_2O}$ are the energies of glycine and water, respectively. All of them were obtained within the PCM model and are represented in Figure 3 together with the energies of each conformer, N, TS and Z, labeled with the subindex 0 (zero), i.e., with no explicit water molecule. Clearly it can be seen in Figure 3
Finally, the gap between n = 7 and 8, which is common to the N- and Z-Gly structures, again shows the formation of a stronger H-bond in the zone labeled $O_{T}$ and the additional H-bonds formed between water molecules, (8) from Table 1.
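The bookkeeping behind this energy scheme can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions: the first term of the formula is read as the energy of the hydrated complex with n waters, the function names are hypothetical, and the numerical energies are placeholders, not values from this work.

```python
N_MAX = 8  # number of explicit waters at saturation, as stated in the text


def referenced_energy(e_complex, n, e_water, n_max=N_MAX):
    """Energy of the n-water complex plus (n_max - n) isolated waters, in a.u.

    Assumption: e_complex is the energy of the Gly:(H2O)n super-molecule.
    """
    if not 0 <= n <= n_max:
        raise ValueError("n must lie between 0 and n_max")
    return e_complex + (n_max - n) * e_water


def gaps(series):
    """Energy lowering between consecutive members of a referenced series."""
    return [series[i] - series[i + 1] for i in range(len(series) - 1)]


# Placeholder energies in a.u. (hypothetical, NOT taken from the paper).
E_WATER = -76.0
complex_energies = {0: -284.0, 1: -360.01, 2: -436.03}

series = [
    referenced_energy(complex_energies[n], n, E_WATER)
    for n in sorted(complex_energies)
]
print(series)
print(gaps(series))
```

Putting every system on the same eight-water footing is what makes the energies of different n directly comparable, so that a large entry in `gaps` marks a particularly stabilizing water insertion.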
It is worth mentioning that the addition of a water molecule in the Ac zone, (1 and 6) of Table 1, and in the Am zone, (2 and 4) of Table 1, has respectively a major and a minor effect on the energy decrease of the super-molecular system, while the addition of subsequent water molecules in the $O_{T}$ zone, (3, 5 and 7) of Table 1, has a progressively smaller influence on the energy decrease. The addition of a water molecule in the AmAc zone, (8) of Table 1, produces a large decrease in the energy of the super-molecular system for N- and Z-Gly.
The Z-Gly structure has a slightly different pattern, with at least four gaps (G1 to G4 in the figure). The first two (G1 and G2) coincide with those just seen for the N-Gly and TS-Gly structures, that is, the gap between n = 2 and 3 (G1) and the gap between n = 4 and 5 (G2). There are two extra gaps: one between n = 5 and 6 (G3), which appears with the formation of the H-bond between water molecules in the Ac zone, (6) from Table 1, and one between n = 7 and 8 (G4), which is related to the formation of an H-bond in the AmAc zone, (8) from Table 1, and coincides with the corresponding gap of N-Gly. Thus, taking into account the size of the gaps (Table S1 of the Supplementary Material), one sees that the first water molecule added in the $O_{T}$ zone (3), the second in the Ac zone (6) and the one in the AmAc zone (8) contribute most to lowering the energy of the super-molecular system of Z-Gly. All the others produce a smaller lowering of the super-molecule energy.
Figure 3 also shows that, for the same n, the super-molecular structures Z-Gly:W$_n$ always have lower energies than the other two. From n = 4 onwards, the Z-Gly structures lie below the lowest energies of the N-Gly and TS-Gly structures. Our calculations showed that, with up to eight explicit water molecules, the complex has at least one H-bond between each water and glycine and that, when more explicit water molecules are added beyond this number, new H-bonds are formed solely between water molecules. Hence, the super-molecule structure was limited to eight water molecules [24].
Figure 4 represents the energy difference [in a.u.] between the neutral and zwitterionic conformers and their hydrated structures. In all cases the energy difference, $\Delta E_{N-Z} = E_{N} - E_{Z}$, is positive, which agrees well with the pattern observed in Figure 3, where the energies of the zwitterion are lower than those of its neutral counterparts.
Only for the supermolecule n = 7, Gly:7W, the energy difference, 0.0282 a.u., is slightly smaller than that for n = 6, 0.0277 a.u. This fact, together with the almost imperceptible increase for n = 8, 0.0283 a.u., is directly related to the saturation of the super-molecule systems, so that the difference between N and Z tends to a constant.
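To get a feeling for the size of these differences in more familiar units, the three values quoted above can be converted from atomic units to kcal/mol. The sketch below uses only the numbers given in the text; the conversion factor is the standard rounded value of one hartree, and the variable names are illustrative.

```python
HARTREE_TO_KCAL_MOL = 627.5095  # 1 hartree expressed in kcal/mol (rounded)

# Energy differences E_N - E_Z quoted in the text, in a.u., keyed by n.
delta_e_au = {6: 0.0277, 7: 0.0282, 8: 0.0283}


def to_kcal_mol(energy_au):
    """Convert an energy from atomic units (hartree) to kcal/mol."""
    return energy_au * HARTREE_TO_KCAL_MOL


delta_e_kcal = {n: to_kcal_mol(e) for n, e in delta_e_au.items()}
for n, e in delta_e_kcal.items():
    print(f"n = {n}: {e:.2f} kcal/mol")

# Spread of the difference over the last three members of the series.
spread = max(delta_e_kcal.values()) - min(delta_e_kcal.values())
print(f"spread over n = 6..8: {spread:.2f} kcal/mol")
```

The spread over n = 6 to 8 comes out well below 1 kcal/mol, which is the sense in which the difference between N and Z is described as approximately constant.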

Spin-Spin Coupling Constants
The intramolecular SSCCs between the heavy atoms (that is, other than hydrogen), from one to three bonds apart, have been investigated for the neutral and zwitterionic forms of glycine in an electrostatic embedding within PCM, adding up to eight explicit water molecules to the calculations.
The isotropic values and their contributions are collected in Figure 5a-f and Tables S2-S4 of the Supplementary Material and summarized in Tables 2 and 3. The values for N- and Z-Gly with just the PCM embedding, 0W, and with up to eight explicit water molecules, 8W, are shown in Table 2 (Columns 1 and 2) and compared with the best calculations of Valverde et al. [19] (Columns 3 and 4) and with experiments [52][53][54]. At first glance, two main features can be noted: the better agreement with experiment for $^{1}J(N, C_{N})$ in the present work, and the fact that the values for $^{1}J(O_{Ac}, C_{Os})$ and $^{1}J(O_{T}, C_{Os})$ seem to be exchanged between this work and the work by Valverde et al. [19]. Discarding the possibility of human error, according to private communication, this could arise from differences in the geometries or basis sets, but further investigation would be needed.
The complexity of the variations along the series is clearly shown in Figure 5, where the qualitative general behavior of the intramolecular couplings in glycine can be seen. A summarized pattern can be extracted from Table 3, which lists the differences Gly:4W − Gly:0W and Gly:8W − Gly:4W and the total difference Gly:8W − Gly:0W. The last one is just the sum of the former two. The FC contributions, which in general determine the behavior of the total, are also shown.
The one-bond SSCCs change by as much as ∼3.0 Hz for Z-Gly along the series Gly:nW when explicit water molecules, from n = 0 to 8, are incorporated in the calculations. In particular, $^{1}J(C_{Os}, C_{N})$ and $^{1}J(N, C_{N})$ of Z-Gly, with changes of 2.83 Hz and −1.31 Hz, are more sensitive to the presence of explicit water than their N-Gly counterparts, with −0.77 Hz and 0.42 Hz (see Figure 5a,d). While for Z-Gly $^{1}J(C_{Os}, C_{N})$ increases and $^{1}J(N, C_{N})$ decreases with the incorporation of water molecules, the opposite happens for N-Gly.
The changes in $^{1}J(C_{Os}, C_{N})$ for N-Gly arise from the first round of added water molecules and are reduced a bit by the second round, whereas for Z-Gly the main variation arises from the second round of water molecules and is enlarged a bit by the first round. The changes in $^{1}J(N, C_{N})$ for N- and Z-Gly seem to mirror what occurs in $^{1}J(C_{Os}, C_{N})$.
The other two one-bond SSCCs, $^{1}J(O_{Ac}, C_{Os})$ and $^{1}J(O_{T}, C_{Os})$, vary much less with the incorporation of explicit water molecules. For Z-Gly they change by −0.59 Hz and −1.26 Hz, respectively, while for N-Gly the changes are −0.8570 Hz and −1.4373 Hz.
For N-Gly, the changes observed in $^{1}J(O_{Ac}, C_{Os})$ arise from the first round of added water molecules and are counteracted by the second round by about 46%, whereas for Z-Gly the second round acts in the same direction as the main variation introduced by the first round of added water molecules. Again, in $^{1}J(O_{T}, C_{Os})$, the effect of the first and second rounds of added water molecules for both N- and Z-Gly seems to mirror the previous behavior.
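The "first round" and "second round" language used above corresponds to the differences Gly:4W − Gly:0W and Gly:8W − Gly:4W of Table 3, whose sum is the total change. A minimal sketch of that decomposition follows; the function names and the coupling values are hypothetical placeholders chosen only to illustrate the arithmetic, not data from Tables 2 or 3.

```python
def round_contributions(j_0w, j_4w, j_8w):
    """Split the total change of a coupling (Hz) into the two hydration rounds."""
    first = j_4w - j_0w    # effect of water molecules 1 to 4
    second = j_8w - j_4w   # effect of water molecules 5 to 8
    total = j_8w - j_0w    # equals first + second by construction
    return first, second, total


def counteraction_percent(first, second):
    """Percent of the first-round change undone by the second round.

    Returns 0.0 when both rounds act in the same direction.
    """
    if first == 0 or first * second >= 0:
        return 0.0
    return 100.0 * abs(second) / abs(first)


# Placeholder coupling values in Hz (hypothetical, NOT from the paper).
first, second, total = round_contributions(j_0w=30.0, j_4w=28.0, j_8w=29.0)
print(first, second, total)
print(counteraction_percent(first, second))
```

With these placeholder numbers the second round undoes half of the first-round change; a statement such as "counteracted by about 46%" is the same ratio evaluated on the actual differences.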
Remarkably, these last two SSCCs have, along the series, an almost constant difference between N-Gly and Z-Gly of about 6.0 Hz in favor of Z-Gly for the former coupling constant and about 5.0 Hz in favor of Z-Gly for the latter. This characteristic makes either of the one-bond oxygen-carbon SSCCs an excellent marker for distinguishing between the conformers.
The one-bond carbon-carbon and nitrogen-carbon SSCCs contrast with the previous remark, since they do not seem to show many differences between conformers (Figure 5a,d). Instead, they vary in different directions according to the conformer and the coupling constant; see Table 2.
It is worth noting that, because of the different scales of the ordinate axes in Figure 5a-f, the variations observed for the two- and three-bond SSCCs are not as large as they appear at first glance; instead, they are all rather stable.
The general behavior of all geminal SSCCs is that they are roughly stable between n = 0 and 8 with the addition of explicit water molecules, with only two exceptions, $^{2}J(N, C_{Os})$ and $^{2}J(O_{T}, C_{N})$, which show changes over 1.0 Hz for N-Gly and Z-Gly, respectively; see Table 3. Figure 5b,e also allows one to see that $^{2}J(N, C_{Os})$ increases steadily along the series for N-Gly but is almost unchanged for Z-Gly, by just 0.1 Hz. In the same manner, $^{2}J(O_{T}, O_{Ac})$ decreases steadily for the first round of water molecules and increases a bit for the second round, producing a net change of −0.5 Hz for Z-Gly, while it remains constant for N-Gly. However, the average difference between conformers along the series is roughly constant and is about 4.0 Hz in both cases. Again, this fact would let experiments distinguish between conformers quite reliably.
For the two-bond SSCCs, the second round of added water molecules is more influential than the first round, with only two exceptions, $^{2}J(O_{Ac}, C_{N})$ and $^{2}J(O_{T}, O_{Ac})$, in both cases for Z-Gly.
The net changes from 0W to 8W in the three-bond SSCCs are among the smallest and amount to only a few tenths of a hertz (Table 3). However, Figure 5c,f shows large variations in the intermediate stages of $^{3}J(O_{Ac}, N)$ for both compounds, but mainly for N-Gly. These large variations can be attributed to the geometrical changes in the conformation of the hydrogens in the glycine compounds, N and Z, which affect long-range nuclear interactions more strongly, mainly because of their small values.
The $^{3}J(O_{T}, N)$ coupling constant shows similar trends and magnitudes for both compounds. This makes it very reliable for experiments, but it would be very difficult to use it to distinguish between conformers.
As seen previously, at least two of the four stable SSCCs that show a large difference between N- and Z-Gly contain at least one oxygen: $^{1}J(N, C_{N})$, $^{1}J(O_{Ac}, C_{Os})$, $^{2}J(N, C_{Os})$ and $^{2}J(O_{T}, O_{Ac})$; see Figure 5 and Tables S2-S4 of the Supplementary Material. The $^{3}J(O_{T}, N)$ SSCC is very stable but shows similar magnitudes for both conformers and is therefore unable to discern between them.
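The two criteria used throughout this discussion, stability along the hydration series and a large N/Z difference, can be combined into a simple screening rule. The sketch below is only an illustration: the 4.0 Hz gap follows the figure quoted in the text, while the stability tolerance `tol_hz`, the class and function names, and all coupling values are assumptions introduced here, not quantities from this work.

```python
from dataclasses import dataclass


@dataclass
class Coupling:
    """End-point values (Hz) of one SSCC for the N and Z conformers."""
    label: str
    n_0w: float
    n_8w: float
    z_0w: float
    z_8w: float


def is_stable(c, tol_hz):
    """True if neither conformer changes by more than tol_hz from 0W to 8W."""
    return abs(c.n_8w - c.n_0w) <= tol_hz and abs(c.z_8w - c.z_0w) <= tol_hz


def conformer_gap(c):
    """Mean absolute N/Z difference over the two end points of the series."""
    return (abs(c.z_0w - c.n_0w) + abs(c.z_8w - c.n_8w)) / 2.0


def classify(c, tol_hz=1.5, gap_hz=4.0):
    """Label a coupling; tol_hz is a hypothetical threshold chosen here."""
    if not is_stable(c, tol_hz):
        return "environment-sensitive"
    if conformer_gap(c) > gap_hz:
        return "conformer marker"
    return "stable, non-discriminating"


# Placeholder couplings in Hz (hypothetical, NOT taken from the paper).
examples = [
    Coupling("J_a", 25.0, 24.5, 31.0, 30.4),
    Coupling("J_b", 5.0, 5.2, 5.1, 5.3),
    Coupling("J_c", 50.0, 49.0, 48.0, 51.0),
]
for c in examples:
    print(c.label, classify(c))
```

A coupling that passes both tests is the kind described in the text as a marker; one that is stable but has similar magnitudes in both conformers is reliable to measure yet cannot tell N from Z.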
Thus, it is worth mentioning that oxygen has, in general, received less attention than the other elements of the first and second rows of the periodic table, mainly because of the low natural abundance of its only magnetic nucleus, which is 0.037%, and the fact that it possesses a nuclear spin I = 5/2, which implies a quadrupole moment responsible for the broadening of the NMR lines; see, for instance, Refs. [55][56][57][58][59][60][61][62].

Conclusions
In this work, we have systematically studied the energy and the intramolecular SSCCs of the heavy nuclei (that is, other than hydrogen) for neutral and zwitterionic glycine. The study was carried out within the dielectric constant of water using the PCM model and with explicit water molecules, whose number was increased from zero to eight while all geometrical parameters were allowed to relax.
Within the PCM embedding, the three structures, N, Z and TS, are stable. The zwitterionic structure needs at least two explicit water molecules in the super-molecule structure to preserve its stability in the gas phase and four explicit water molecules to become more stable than the whole series of hydrated neutral molecules.
From six to eight water molecules, the energy difference between N-Gly and Z-Gly becomes approximately constant.
The analysis of the intramolecular SSCCs along the series reveals some very stable coupling constants for both conformers, like the four one-bond coupling constants, 1 N). Three of these coupling constants have very similar magnitudes in both conformers, that is, they neither distinguish between conformers nor indicate changes in the environment. This is the case for the SSCCs $^{1}J(C_{Os}, C_{N})$, $^{1}J(N, C_{N})$ and $^{3}J(O_{T}, N)$.
The other four SSCCs are also very stable along the series; they hardly indicate changes in the environment, but they exhibit quite different magnitudes for the two conformers, over 4.0 Hz, making them excellent markers for experiments.

Supplementary Materials:
The following are available online at https://www.mdpi.com/article/10.3390/sci3040041/s1, Table S1: Energy [in a.u.] for the N-, TS- and Z-Glycine super-molecular systems with 0 to 8 water molecules, Table S2: One-, two- and three-bond spin-spin coupling constants (in Hz) obtained at the B3LYP/6-31G* level of calculation using PCM for both neutral (N) and zwitterionic (Z) conformers as a function of the number of water molecules, Table S3: One-, two- and three-bond spin-spin coupling constants (in Hz) obtained at the B3LYP/6-31G* level of calculation using PCM for both neutral (N) and zwitterionic (Z) conformers as a function of the number of water molecules, Table S4: One-, two- and three-bond spin-spin coupling constants (in Hz) obtained at the B3LYP/6-31G** level of calculation using PCM for both neutral (N) and zwitterionic (Z) conformers as a function of the number of water molecules.