Article

Quelling the Geometry Factor Effect in Quantum Chemical Calculations of 13C NMR Chemical Shifts with the Aid of the pecG-n (n = 1, 2) Basis Sets

by Yuriy Yu. Rusakov, Valentin A. Semenov and Irina L. Rusakova *
A. E. Favorsky Irkutsk Institute of Chemistry, Siberian Branch of the Russian Academy of Sciences, Favorsky St. 1, 664033 Irkutsk, Russia
* Author to whom correspondence should be addressed.
Int. J. Mol. Sci. 2024, 25(19), 10588; https://doi.org/10.3390/ijms251910588
Submission received: 13 September 2024 / Revised: 26 September 2024 / Accepted: 28 September 2024 / Published: 1 October 2024
(This article belongs to the Section Physical Chemistry and Chemical Physics)

Abstract

A root factor for the accuracy of all quantum chemical calculations of nuclear magnetic resonance (NMR) chemical shifts is the quality of the molecular equilibrium geometry used. In turn, this quality depends largely on the basis set employed at the geometry optimization stage. This parameter represents the main subject of the present study, which is a continuation of our recent work, where new pecG-n (n = 1, 2) basis sets for the geometry optimization were introduced. A goal of this study was to compare the performance of our geometry-oriented pecG-n (n = 1, 2) basis sets against the other basis sets in massive calculations of 13C NMR shielding constants/chemical shifts in terms of their efficacy in reducing geometry factor errors. The testing was carried out with both large-sized biologically active natural products and medium-sized compounds with complicated electronic structures. The former were treated using the computation protocol based on the density functional theory (DFT) and considered in the theoretical benchmarking, while the latter were treated using the computational scheme based on the upper-hierarchy coupled cluster (CC) methods and were used in the practical benchmarking involving the comparison with experimental NMR data. Both the theoretical and practical analyses showed that the pecG-1 and pecG-2 basis sets resulted in substantially reduced geometry factor errors in the calculated 13C NMR chemical shifts/shielding constants compared to their commensurate analogs, with the pecG-2 basis set being the best of all the considered basis sets.

1. Introduction

Practically every contemporary chemical research study involves NMR experiments. Due to this, 13C NMR spectrum analysis has now become one of the indispensable physical chemical tools for studying the structure and dynamics of large natural products. However, as is frequently the case for large compounds, the proper assignment of NMR signals is never a simple problem to solve [1,2]. Of utmost importance is precise quantum chemical modeling of the 13C NMR spectra [3]. Indeed, quantum chemical calculations may be of great assistance in resolving the NMR problem, but if and only if they are carried out properly at a sufficient level of electronic structure theory. In this sense, NMR quantum chemical modeling is full of nuances, starting from the quality of the equilibrium geometry, on top of which the NMR calculations are performed, to the computational protocol applied for the calculation of NMR chemical shifts [4,5].
Out of the many computational factors affecting the accuracy of the final values of the 13C NMR shielding constants/chemical shifts, the quality of the equilibrium geometry is one of the most important, despite its seemingly inconspicuous influence. The problem of the equilibrium geometry factor in chemical shift calculations has been recognized for some time [6,7,8,9] and continues to emerge in modern NMR computational studies [10,11,12]. The quality of the equilibrium geometry depends directly on the level of electronic structure theory and on the one-electron basis set used at the geometry optimization stage. As for the level of theory, there is a distinct hierarchy of methods with clearly defined computational scaling factors, degrees of electron correlation recovery, and innate pros and cons [5]. Thus, it is quite clear what to expect from a particular method. For the basis sets, on the contrary, the issue is far more complicated.
There have been only a few works showing a strong dependence of the equilibrium geometry parameters on the basis set used at the geometry optimization stage [13,14,15]. In particular, Helgaker et al. [13] found that bond lengths noticeably contract as the basis set is improved. This must be very important for geometry-dependent molecular properties such as NMR shielding constants. However, there was no systematic investigation of this issue until our recent studies. The first thing we showed was the dependence of the 31P chemical shifts on the basis set used on phosphorus atoms at the geometry optimization stage [16]. The study indicated a considerable variation in the average absolute error of calculated phosphorus chemical shifts compared to experimental data due to the geometry factor effect. The effect was thoroughly explored from the standpoint of the most effective polarization of the phosphorus 3p-shell, and new geometry-oriented pecG-n (n = 1, 2) basis sets for phosphorus atoms were developed on the basis of the property-energy consistent (PEC) method that was proposed earlier by Rusakov and Rusakova [17].
In our next paper [18], we proposed the pecG-n (n = 1, 2) basis sets for hydrogen and the p-elements of the second and third periods. The new basis sets, although rather moderate in size, turned out to be capable of giving equilibrium geometries of very high quality, comparable to those provided by considerably larger energy-optimized basis sets. The pecG-n basis sets were shown to be very efficient in reducing the geometry factor error in different second-order molecular properties, including static polarizabilities, static magnetizabilities, and NMR shielding constants [18]. The calculation of 1H, 13C, 15N, 31P, 19F, and 29Si NMR shielding constants was based on the coupled cluster singles and doubles method (CCSD) [19,20] and the CCSD equilibrium geometries. Given the wide scope of the testing and its high computational demand, only a limited number of rather small molecules were selected for each nucleus. Therefore, what the pecG-n basis sets now need is convincing evidence of their effectiveness for NMR chemical shift calculations, obtained through extensive and rigorous testing carried out at different levels of electronic structure theory for a wide variety of challenging molecules, ranging from compounds with strong electron correlation effects to structurally complex, biologically active natural products. In this paper, we present the first such study performed for 13C NMR shielding constants/chemical shifts, as these are the most utilized in contemporary NMR analyses, specifically in NMR studies of large compounds of biological interest.
This paper is structured as follows. The introduction is followed by the “Results and Discussion” section, which starts with a summary of the PEC method and the pecG-n (n = 1, 2) basis sets in particular, with a strong emphasis on their distinction from standard energy-optimized basis sets. After that, a two-fold analysis of the performance of the pecG-n basis sets is presented: a theoretical study carried out without resorting to experimental data, and a practical study based on the comparison of the calculated data with experimental data. The paper ends with a description of the computational details (“Materials and Methods”), conclusions, and the reference list.

2. Results and Discussion

2.1. The PEC Method and the pecG-n (n = 1, 2) Basis Sets

The property-energy consistent (PEC) method was introduced in our earlier paper [17]. This algorithm consists of the optimization of basis sets in relation to a certain molecular property, provided that the least possible total molecular energy is achieved. Namely, the basis set exponents are randomly generated around the starting basis set via Monte Carlo simulations. The generated arrays are checked for whether they give the property of interest within a desired range. Of all the sets that provide the property in the desired range, only one set is selected, namely, the one that gives the lowest energy. The optimization of a property under an energetic constraint represents a nonlinear problem with multiple solutions, which cannot be solved reliably by means of standard optimization techniques based on directed searches, like numerical Newton-like methods. In this sense, the PEC approach is well suited to such nonlinear optimizations because it performs a random search.
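The selection logic described above can be illustrated with a minimal sketch. It is not the implementation from [17]: the two model functions, the multiplicative sampling of exponents, the tolerance, and all names are hypothetical stand-ins chosen only to make the "filter by property, then pick the lowest energy" step concrete. In a real application, each evaluation would be a quantum chemical calculation.

```python
import math
import random


def toy_energy(exponents):
    # Hypothetical stand-in for the total molecular energy (lower is better).
    targets = (0.9, 0.25)
    return sum((math.log(a) - math.log(t)) ** 2 for a, t in zip(exponents, targets))


def toy_property_error(exponents):
    # Hypothetical stand-in for the deviation of the target property
    # (e.g., a bond length) from its ideal value.
    return abs(exponents[0] - 2.0 * exponents[1] - 0.3)


def pec_search(start, property_error, energy, tolerance,
               n_trials=20000, spread=0.5, seed=0):
    """Random search: accept trial exponent sets whose property lies within
    the tolerance, then return the accepted set with the lowest energy."""
    rng = random.Random(seed)
    best = None
    for _ in range(n_trials):
        # Perturb each exponent by a random multiplicative factor (keeps it positive).
        trial = [a * math.exp(rng.uniform(-spread, spread)) for a in start]
        if property_error(trial) > tolerance:
            continue  # property outside the desired range: reject
        trial_energy = energy(trial)
        if best is None or trial_energy < best[1]:
            best = (trial, trial_energy)
    return best  # None if no trial set satisfied the property condition


best_set, best_energy = pec_search([1.0, 0.3], toy_property_error, toy_energy,
                                   tolerance=0.01)
```

In this toy model the unconstrained energy minimum does not satisfy the property condition, so the search returns a compromise set: the lowest-energy point inside the accepted band.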
The pecG-n (n = 1, 2) basis sets [16,18] were created using the PEC algorithm. The optimization of these basis sets for a particular atom requires the energy-constrained minimization of the target function, representing a molecular energy gradient, relative to the lengths of the selected bonds that involve a particular atom. In fact, it is assumed that we can find a set of exponents which provide bond lengths as close to the ideal equilibrium values as possible, under the condition that they give the lowest molecular energy. The first tests carried out for the geometry-oriented pecG-n basis sets showed the expediency of the idea for a wide variety of molecular properties. For example, theoretical testing performed at the CCSD level of theory revealed that our second-level basis set, pecG-2, provided equilibrium geometries of the same quality as those obtained with the cc-pVQZ [21,22] basis set, resulting in almost the same mean absolute percentage errors (MAPE) for shielding constants as that for the data calculated using the cc-pVQZ geometries [18]. In addition, the pecG-2 basis set is approximately 1.5 times smaller than the cc-pVQZ basis set. Considering the fact that the main limiting factor for calculations of large molecules of biological interest is the basis set used, the computational benefit from using pecG-2 instead of the cc-pVQZ basis set is evident.
It is also worth noting that the optimization of the pecG-n basis sets was completely based on the molecular calculations. Moreover, not one but numerous fitting molecules were considered in the case of each atom, except for hydrogen.
In this paper, we juxtapose our pecG-n basis sets with the two most popular series, namely Dunning’s series and Pople’s series. These represent what we call here standard energy-optimized basis sets. This means that only the energy minimum condition was pursued while optimizing their exponents.
Dunning’s basis sets, (aug)-cc-pVXZ (X = D, T, Q, …) [21,22], are representatives of the basis sets developed based only on atomic calculations. In particular, the cc-pVXZ basis sets for second-period atoms were developed based on the so-called “basic” (sp) primitive set, which was determined from atomic Hartree–Fock (HF) energy minimization. The exponents for the so-called correlation functions were then obtained from the correlated calculations on the atoms, specifically from the configuration interaction single and double excitation (CISD) wave function calculations. The d-functions of cc-pVXZ are too diffuse and, apparently, do not effectively describe the orbital polarization effect.
Pople’s basis sets, K-LMNG [23,24], can be attributed to a more advanced group of basis sets, whose optimization at least partially involved molecular calculations. These can be augmented by various numbers of sets of polarization and diffuse functions. Here, K and L, M, and N are integers that denote the number of Gaussian primitives used to expand the inner-shell atomic orbitals and the inner and outer components of the valence shell functions, respectively. The Gaussian exponents for the inner- and valence-shell functions of the K-LMNG basis sets were obtained by minimizing the unrestricted Hartree–Fock (UHF) energy of the atomic ground state. The polarization sets (e.g., (2d,2p) or (3df,3pd)) were obtained by adding the functions with higher angular quantum numbers, with the exponents being the average of the exponents optimized for typical molecules incorporating a particular element. Thus, Pople’s basis sets expanded with the polarization functions represent the result of mixed atomic and molecular energy-based optimizations.
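To make the K-LMNG notation concrete, the following short sketch splits a Pople-style name into the parts described above. The function name and the returned dictionary keys are our own illustrative choices, and the parser deliberately covers only the simple forms used in this paper (plus the optional "+" diffuse markers); it is not a general basis set name parser.

```python
import re


def parse_pople_name(name):
    """Split a K-LMNG style name into its components (illustrative only)."""
    match = re.fullmatch(r"(\d+)-(\d+)(\+{0,2})G(?:\(([^)]*)\))?", name)
    if match is None:
        raise ValueError(f"not a K-LMNG style name: {name!r}")
    core, valence, diffuse, polarization = match.groups()
    return {
        # K: primitives contracted into each inner-shell function
        "core_primitives": int(core),
        # L, M, (N): primitives in each component of the split valence shell
        "valence_primitives": [int(digit) for digit in valence],
        # number of "+" signs, i.e., sets of diffuse functions
        "diffuse_sets": len(diffuse),
        # polarization sets: first entry for heavy atoms, second for hydrogen
        "polarization": polarization.split(",") if polarization else [],
    }


print(parse_pople_name("6-311G(3df,3pd)"))
print(parse_pople_name("6-31G(2d,2p)"))
```

For 6-311G(3df,3pd), this gives six primitives for the inner shell, a triple split of the valence shell into 3, 1, and 1 primitives, and the polarization sets 3df (heavy atoms) and 3pd (hydrogen).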
Overall, by choosing Dunning’s and Pople’s series, we took into consideration the representatives of two cardinally distinct families of basis sets, namely, those of correlation consistent and polarization consistent basis sets, respectively.
For later convenience, in Table 1, we give the compositions of the basis sets to be used and discussed further in this paper.
As one can see from Table 1, the pecG-1 and pecG-2 basis sets have the same composition and size as the 6-31G(2d,2p) and 6-311G(3df,3pd) basis sets; therefore, the latter can be thought of as direct competitors to our basis sets.

2.2. Theoretical Analysis of the pecG-n (n = 1, 2) Basis Sets

To carry out a theoretical examination of the performance of the pecG-n (n = 1, 2) basis sets against the other basis sets from the perspective of the geometry factor for the accuracy of calculations of 13C NMR shielding constants, we considered a set of ten natural products. These will be referred to as the molecules from set 1. Their structures are presented in Figure 1.
As one can see from Figure 1, set 1 includes rather large, spatially bulky compounds with up to 90 atoms each. These natural products are produced by living organisms such as marine sponges, fungi, and plants. Practically all of them represent very important compounds with potential biological activity. In particular, 12,28-oxaircinal A was isolated from three collections of an Indonesian sponge of the genus Acanthostrongylophora, together with 13 known manzamine alkaloids, which are known to have activity against infectious, tropical parasitic, and Alzheimer’s diseases [25]. Icajine is one of the Strychnos icaja alkaloids, which is detected specifically in the stem, root, and collar bark of S. icaja and commonly possesses specific anti-plasmodial activity [26,27]. Physalin D is found in a fraction from the aerial parts of Physalis angulata, known in Brazil as camapu, which is a branched annual shrub that belongs to the Solanaceae family. Extracts from this plant have been used in traditional folk medicine to treat tumors. Physalin D was found to exhibit inhibitory activity against Mycobacterium tuberculosis [28]. Betulinic acid originates from lupane. It is a pentacyclic triterpene, a group characterized by cytotoxic properties, which can be isolated from plants (e.g., Spirostachys africana) or synthesized [29]. The anticancer properties of betulinic acid and its derivatives have been extensively studied [30,31,32,33]. Anabsinthin is a sesquiterpene lactone that can be extracted from the aerial parts of Artemisia absinthium L., commonly known as wormwood, which is a yellow, flowering, perennial plant that is distributed throughout various parts of Europe and Siberia and is used for its antiparasitic effects, as well as to treat anorexia and indigestion [34].
Itoaic acid, or 2β,11β-dihydroxy-3,4-secofriedelolactone-27-oic acid, is a rare naturally occurring triterpenoid with a 3,4-seco-friedelolactone skeleton and potential anti-inflammatory activity against COX-2, which was isolated from Flacourtiaceae plants [35]. Matopensine is a symmetrical bisindole alkaloid that can be extracted from the roots of Strychnos matopensis and Strychnos kasengaensis, which are plants from eastern Africa. Matopensine-type alkaloids were found to exhibit potent and selective activities against Plasmodium [36]. Strychnobaillonine is an asymmetrical bisindole alkaloid found in the roots of the liana Strychnos icaja, which is mainly used by local populations of Africa as an arrow or ordeal poison and as a treatment for skin diseases and chronic, persistent malaria [37]. Iguesterin was isolated from the root bark of Catha cassinoides [38]; it has cytostatic activity against HeLa cells [39]. Naucleidinal was isolated from the roots of Nauclea orientalis and showed significant cytotoxic activity against both HeLa and KB cell lines [40].
These compounds pose a challenging task for NMR analysis because they have multiple chiral centers and a large number of carbon atoms with close electronic environments; hence, there are many carbon nuclei with almost equivalent chemical shifts. As a result, most of their 13C NMR spectra present a superposition of the individual second- or even higher-order multiplets, forming complex patterns, which are very difficult to analyze. Thus, it is very important to take into account as many factors of accuracy as possible within feasible limits. In this respect, the geometry factor, which can change the calculated values of carbon chemical shifts by a couple of ppm depending on the basis set used at the geometry optimization stage, is very important.
We investigated the performance of the pecG-n basis sets on the carbon shielding constants of natural products from set 1, which in total gave us 292 values and thus very solid statistics. The equilibrium geometries of the compounds of set 1 were obtained at the DFT(M06-2X) [41] level of theory while taking into account solvent effects, using the different basis sets listed in Table 1. The solvent effects were calculated using the IEF-PCM model parametrized for chloroform as the solvent for 12,28-oxaircinal A, icajine, iguesterin, itoaic acid, matopensine, naucleidinal, and strychnobaillonine; acetonitrile as the solvent for anabsinthin; pyridine as the solvent for betulinic acid; and dimethylsulfoxide as the solvent for physalin D.
We used the M06-2X functional because it was proven to give equilibrium geometrical parameters in close proximity to those obtained using high-quality ab initio methods [16,42,43], such as the coupled cluster singles and doubles with non-iterative perturbative triples (CCSD(T)) method [44]. Moreover, should a biological system of set 1 contain significant dispersion interactions, the M06-2X functional could be practically useful, as this functional was found to be highly successful at describing dispersion interactions for neutral molecular systems due to its portion of Grimme’s long-range dispersion corrections with an s6 scaling factor of 0.06 [43].
The equilibrium geometries for the reference values were obtained using the cc-pVQZ basis set. As one can see from Table 1, the quadruple-ζ quality cc-pVQZ basis set is considerably large, with as many as 30 and 55 functions for hydrogen and the second-row atoms, respectively. This makes the equilibrium geometries obtained with the cc-pVQZ basis set a good reference.
The NMR shielding constants were calculated using the GIAO-DFT(B97-2) [45,46] method with the specialized pecS-2 basis set [47,48,49] for all atoms. The pecS-n basis sets were specifically optimized for the 1H, 13C, 15N, 17O, and 31P NMR shielding constant/chemical shift calculations via the PEC algorithm and were presented in our previous papers [47,48]. Overall, the pecS-n (n = 1, 2) basis sets are rather small in size, consisting of only 5/14 and 18/34 functions for hydrogen and the second-row atoms, respectively, and demonstrate a better accuracy compared to the other commensurate shielding-oriented basis sets [49]. Thus, our NMR-oriented basis sets embody a fine balance between size and accuracy, and hence, were our present choice.
For the method, the hybrid B97-2 functional was also chosen deliberately, as this functional or its modifications were found to be among the best and the most popular functionals for predicting isotropic NMR shielding constants [48,50,51,52]. Moreover, the B97-2 functional was used in the optimization of the pecS-n basis sets; therefore, the combination of the B97-2 functional with the second-level pecS-2 basis set must be an appropriate approach for the calculation of NMR chemical shifts using the DFT methodology.
The mean absolute errors (MAEs) were estimated for the 292 carbon shielding constants of the molecules in set 1 that were calculated using the equilibrium geometries obtained using different basis sets against the reference theoretical shielding constants. These are presented in Figure 2.
As one can see from Figure 2, our pecG-1 and pecG-2 basis sets demonstrated noticeably better performance compared to the commensurate Pople-style basis sets, 6-31G(2d,2p) and 6-311G(3df,3pd), respectively. Namely, pecG-1 produced an approximately 1.3 times smaller MAE than that of the 6-31G(2d,2p) basis set, while the pecG-2 basis set turned out to be the best of all, and had an MAE that was two-fold smaller than that of the 6-311G(3df,3pd) basis set.
The cc-pVDZ and 6-311G(d,p) basis sets are commonly used in the geometry optimizations performed nowadays, though, in accordance with our results, these basis sets appear to be the worst ones to use to obtain equilibrium geometries for calculations of carbon shielding constants. Accordingly, taking into account the subtlety with which the entangled 13C NMR spectra of natural products are modeled, we would not recommend using the cc-pVDZ and 6-311G(d,p) basis sets for geometry optimization of such compounds.
The cc-pVTZ basis set is also one of the most popular basis sets for structure optimization; it is considered a rather high-quality basis set for describing molecular electronic structures. As can be seen from Table 1, it is approximately one and a half times larger than the pecG-1 basis set and has only five fewer functions than the pecG-2 and 6-311G(3df,3pd) basis sets. The cc-pVTZ basis set can be seen as having a quality equal to that of the 6-311G(3df,3pd) basis set; ergo, out of the two, the former is preferable due to its smaller number of basis functions. At the same time, the pecG-2 basis set, being only five functions larger, gives a two-fold smaller error than the cc-pVTZ and 6-311G(3df,3pd) basis sets. Selecting between the cc-pVTZ and pecG-2 basis sets is a matter of balance between decreasing the computational costs and increasing the accuracy of the NMR calculations. In any case, whenever one deals with natural products or other complicated systems with highly entangled NMR spectra, high-precision modeling of the 13C NMR spectra is required, and the pecG-2 basis set would apparently be a better choice for the geometry optimization, at the expense of the increased computational costs compared to the cc-pVTZ basis set.
In spite of our original intention to devote this section to theoretical testing, a comparison with the experimental data for the molecules of set 1 was also possible, as their experimental 13C NMR chemical shifts are available [25,28,34,35,37,38,40,53,54,55,56]. Therefore, we evaluated scaled chemical shifts $\tilde{\delta}(\sigma_i, \alpha)$ from the carbon shielding constants $\sigma_i$ using a physically meaningful linear regression model. This means that we applied the least squares method (LSM) with a slope equal exactly to −1:
$$\tilde{\delta}(\sigma_i, \alpha) = -\sigma_i + \alpha \tag{1}$$
In the LSM, the coefficient $\alpha$ is found from the minimization of the sum of squared residuals:
$$S = \sum_{i=1}^{n} \left[ \delta_{\mathrm{exp},i} - \tilde{\delta}(\sigma_i, \alpha) \right]^2 \rightarrow \min \tag{2}$$
to give the following expression:
$$\alpha = \frac{1}{n} \sum_{i=1}^{n} \left( \delta_{\mathrm{exp},i} + \sigma_i \right) \tag{3}$$
Should one consider the parameter α as representing an approximated shielding constant of a standard, the expression (1) becomes closely reminiscent of a well-known simplified IUPAC expression [57,58] for NMR chemical shifts, which is defined as the difference between the nuclear shielding constant of a standard and that of the compound under study. That is why our choice was this form of a linear model. It should be mentioned that we did not resort to a classical evaluation of the chemical shifts via the simplified IUPAC formula, as this brings about a systematic error due to the inaccuracy of the calculations of the carbon shielding constant of the reference compound.
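The fixed-slope fit described above can be written in a few lines. With the slope constrained to −1, the scaled shift is the offset α minus the shielding constant, and the least squares offset is simply the mean of the sums of experimental shifts and calculated shieldings. The numerical values below are invented purely for illustration and are not data from this study; the function names are likewise our own.

```python
def fit_offset(sigma, delta_exp):
    """Least squares offset alpha for the model delta = -sigma + alpha."""
    return sum(d + s for d, s in zip(delta_exp, sigma)) / len(sigma)


def scaled_shifts(sigma, alpha):
    """Scaled chemical shifts from shielding constants (slope fixed at -1)."""
    return [alpha - s for s in sigma]


def mean_absolute_error(calculated, reference):
    return sum(abs(c - r) for c, r in zip(calculated, reference)) / len(calculated)


# Illustrative (made-up) shielding constants and experimental shifts, in ppm.
sigma = [160.0, 60.0, 10.0]
delta_exp = [25.0, 125.0, 176.0]

alpha = fit_offset(sigma, delta_exp)
delta_calc = scaled_shifts(sigma, alpha)
mae = mean_absolute_error(delta_calc, delta_exp)
```

Because only the offset is fitted, the residuals sum to zero by construction, while any error in the slope of the shielding–shift relationship remains visible in the MAE rather than being absorbed by the regression.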
In order to assess the accuracy of the calculated $\tilde{\delta}$ against the experimental measurements, we first calculated the MAE for each compound from set 1 separately, because their NMR data were recorded under different experimental conditions (solvent, temperature, etc.). The resulting MAEs were then averaged to give the mean MAE for all ten compounds. In these calculations, all meaningful conformers were taken into account and Boltzmann averaging was performed.
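The two averaging steps mentioned above can be sketched as follows. This is only an illustration under stated assumptions: the temperature (298.15 K), the use of relative conformer energies in kcal/mol, and all function names are our own choices for the example and are not specified in the text.

```python
import math

GAS_CONSTANT_KCAL = 0.0019872041  # kcal/(mol K)


def boltzmann_weights(relative_energies_kcal, temperature=298.15):
    """Normalized Boltzmann populations from relative conformer energies."""
    factors = [math.exp(-e / (GAS_CONSTANT_KCAL * temperature))
               for e in relative_energies_kcal]
    total = sum(factors)
    return [f / total for f in factors]


def boltzmann_average(values_per_conformer, weights):
    """Population-weighted value for each nucleus across all conformers."""
    n_nuclei = len(values_per_conformer[0])
    return [sum(w * conformer[i] for w, conformer in zip(weights, values_per_conformer))
            for i in range(n_nuclei)]


def mean_of_maes(calculated_by_compound, experimental_by_compound):
    """MAE evaluated per compound first, then averaged over compounds."""
    maes = []
    for name, calculated in calculated_by_compound.items():
        reference = experimental_by_compound[name]
        maes.append(sum(abs(c - r) for c, r in zip(calculated, reference))
                    / len(calculated))
    return sum(maes) / len(maes)
```

Averaging per-compound MAEs gives each compound equal weight regardless of how many carbon atoms it contains, which differs from pooling all shifts into a single MAE.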
What we obtained did not quite fit into the picture that one might have expected. We observed a sporadic behavior of the MAEs of the basis sets used at the geometry optimization stage. That is, all mean MAEs were oscillating around the value of 2.4 ppm within a 0.05 ppm-wide range, without a clear tendency based on the qualities of the considered basis sets. For example, the cc-pVDZ, cc-pVTZ, and cc-pVQZ geometries gave almost the same MAEs for carbon chemical shifts.
In our opinion, the reason for such behavior may lie in many ambiguous factors that were not accounted for. The DFT method can fail to cover all the needed portions of the electron correlation, which might be crucial for some representatives of set 1 to correctly compare with the experimental data. Another complementing reason can be connected with the need for explicit accounting of solute–solvent interactions for some of the molecules, while for others, a simple polarization continuum model is sufficient. In these circumstances, we can see that a blunt comparison of the DFT results with the experiment results without considering many factors of accuracy will make the conclusions about so subtle an effect as the geometry factor pointless.
In this regard, we believe that the primary, and rather insidious, factor hampering a proper comparison of the theoretical values with experimental values is the lack of a high-quality description of the electron correlation effects in the considered systems. In our view, to make the analysis of the performance of our basis sets based on the comparison with experimental data more valid, the use of high-quality ab initio methods such as CCSD or CCSD(T) is of primary importance, while the other factors of accuracy are secondary.
The coupled cluster singles and doubles (CCSD) method is a highly accurate ab initio correlated method that recovers approx. 98.3% of the electron correlation and has a computational scaling factor of $N^6$, with $N$ being the number of basis set functions [59]. The CCSD(T) scheme, which on top of CCSD takes into account the triple excitations within the noniterative perturbative treatment, recovers as much as 99.7% of the electron correlation and has a computational scaling factor of $N^7$ [59]. Ideally, applying these methods to the molecules from test set 1 would have been a great test, but, unfortunately, these methods are prohibitively expensive for systems of such a large size. Therefore, we introduced a set of systems with moderate sizes that possess very intricate electronic structures and represent a challenge to any computational tool. We then carried out an analysis based on the reference experimental data, which is presented in Section 2.3.
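The practical meaning of these scaling factors can be shown with a short worked example. It takes the statement from Section 2.1 that pecG-2 is approximately 1.5 times smaller than cc-pVQZ and converts it into a formal cost ratio. This is a rough estimate only: it assumes pure power-law scaling and ignores prefactors, integral screening, symmetry, and memory or disk limits.

```python
def relative_cost(size_ratio, scaling_power):
    """Formal cost ratio when the number of basis functions grows by size_ratio."""
    return size_ratio ** scaling_power


# Basis set about 1.5 times larger (approximate cc-pVQZ to pecG-2 size ratio).
ccsd_ratio = relative_cost(1.5, 6)      # formal N^6 scaling
ccsd_t_ratio = relative_cost(1.5, 7)    # formal N^7 scaling

print(f"CCSD:    ~{ccsd_ratio:.1f} times more expensive")
print(f"CCSD(T): ~{ccsd_t_ratio:.1f} times more expensive")
```

Under these assumptions, a basis set that is 1.5 times larger costs roughly 11 times more at the CCSD level and roughly 17 times more at the CCSD(T) level, which is why compact geometry-oriented basis sets matter most for coupled cluster calculations.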

2.3. Analysis of the Performance of the pecG-n (n = 1, 2) Basis Sets Based on a Comparison of the Theoretical Data with Experimental Data

To perform the experiment-based test, we chose nine molecules from the DELTA50 set introduced in the paper by Cohen et al. [60]. These nine molecules formed test set 2, which is shown in Figure 3. One can see that set 2 contains a wide variety of molecules with very difficult electronic structures, including, for example, molecules such as cyclopropane and oxetane, whose specific ring brings about a substantial steric strain. Different hybridization states of carbon atoms in bonding and their unique electron environment in each compound result in a wide variety of experimental 13C NMR chemical shifts, ranging from approx. −3 to 200 ppm. All measurements of carbon chemical shifts were carried out in CDCl3 and referenced to TMS at 0.00 ppm by Cohen et al. [60].
We performed calculations of the equilibrium geometries of the molecules of set 2 at the CCSD level of theory using different basis sets. This time, in view of the particularly demanding computational tasks, not all the basis sets from Table 1 were used in the geometry optimizations, i.e., apart from our basis sets, pecG-n, we only included into consideration the basis sets with the same size, the 6-31G(2d,2p) and 6-311G(3df,3pd) basis sets, and one very popular basis set, 6-311G(d,p). All the molecules of set 2 represent highly symmetric rigid structures so there was no need to carry out a conformational analysis. All geometry optimizations were performed taking into account the solvent effects.
The gas phase calculations of the carbon shielding constants of set 2 were carried out using the GIAO-CCSD(T) method and the pecS-2 basis set on all atoms except for fluorine, for which the pcS-2 basis set [61] was used. Because computational codes for coupled cluster calculations of shielding constants that account for solvent effects are unavailable, the solvent corrections to the carbon shielding constants were evaluated at the GIAO-DFT(B97-2)/aug-cc-pV5Z level of theory. Unfortunately, the vibrational effects were not taken into account because of the extremely demanding computational cost of such computations within the coupled cluster method, while the DFT level was deliberately not used for them, as it cannot be regarded as a trustworthy method for such electronically complicated molecules as those included in set 2.
Overall, the computational scheme based on the CCSD(T) method for the property calculations performed on the equilibrium geometries obtained using the CCSD method represents a very accurate approach in terms of the treatment of electron correlation effects; therefore, in our opinion, this would constitute a proper benchmarking for our pecG-n basis sets based on the comparison with experimental data.
The 13C NMR chemical shifts were calculated from the corresponding shielding constants using the LSM with the slope equal to −1, as described in Section 2.2 (see Equations (1)–(3)). The computed 13C NMR chemical shifts and the corresponding experimental data are presented in Table 2 below.
As one can see from Table 2, most of the calculated data are in very good agreement with the experimental data; however, in some exceptional cases, the theoretical values noticeably deviated from the experimental data, specifically, the NMR chemical shifts of C1 of DMAc and fluorobenzene. Apparently, these compounds have extremely complicated electronic structures, with carbon C1 being strongly involved in specific electron interactions. For example, DMAc is a representative of systems where the amino substituent at the C1 atom results in a strong n ⟶ π* interaction, implying that the nitrogen atom donates the density of a lone pair (n) of electrons into the empty antibonding π* orbital of the nearby carbonyl group C1=O, which typically causes a weakening and stretching of the carbon–oxygen bond [62]. At the same time, fluorobenzene is remarkable due to a considerable inductive electron withdrawal effect from the ipso carbon (C1) by fluorine, making the C1–F bond highly polarized and the C1 carbon lacking electrons to such an extent that it actually becomes the most de-shielded one in the system. This perturbation propagates along the ring, resulting in a considerable concentration of negative charge at the ortho position, making the C2 carbon the most shielded one. The electronic structure of fluorobenzene is so complicated that its electron-density difference plot is astoundingly similar to that for the phenyl cation with a nearby negative charge [63]. Evidently, a higher level of coupled cluster theory is needed for such complicated systems at all levels of calculation, not to mention the need to take into account explicit solute–solvent interactions.
In this part of work, however, our interest lay in testing the pecG-n basis sets using a very high level of electronic theory for highly complicated electronic systems. It was not our intention to investigate all factors of accuracy for abnormal cases specifically, even though we obtained a couple of noticeably underestimated values. In fact, these values did not change the tendency for the basis sets even a bit. This can be clearly seen in Figure 4, where we plotted the calculated MAEs from Table 2 together with the MAEs calculated without taking into account a certain “particularly problematic” C1 carbon of DMAc and fluorobenzene.
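The two kinds of MAE can be reproduced from Table 2 with a short sketch. The snippet below uses the pecG-2 column and the experimental column of Table 2; the variable and function names are illustrative assumptions, and the trimmed value is simply what this arithmetic gives, not a number quoted from Figure 4.

```python
# (molecule, carbon label, shift calculated on the pecG-2 geometry, experimental shift), in ppm.
# Values are taken from the pecG-2 and Exp. columns of Table 2.
ROWS = [
    ("Acetaldehyde", "1", 196.97, 199.97), ("Acetaldehyde", "2", 35.02, 30.99),
    ("Acetonitrile", "1", 4.65, 1.91), ("Acetonitrile", "2", 115.93, 116.33),
    ("Cyclopropane", "", -1.57, -3.15),
    ("DMAc", "1", 161.50, 170.66), ("DMAc", "2", 26.10, 21.58),
    ("DMAc", "3", 40.61, 38.05), ("DMAc", "4", 34.60, 35.20),
    ("Fluorobenzene", "1", 155.93, 162.86), ("Fluorobenzene", "2", 111.49, 115.32),
    ("Fluorobenzene", "3", 130.94, 129.96), ("Fluorobenzene", "4", 126.55, 123.98),
    ("Isoxazole", "1", 157.62, 157.64), ("Isoxazole", "2", 105.65, 103.47),
    ("Isoxazole", "3", 147.96, 149.02),
    ("Norbornadiene", "1", 144.97, 143.43), ("Norbornadiene", "2", 48.27, 50.26),
    ("Norbornadiene", "3", 75.50, 75.32),
    ("Oxetane", "1", 73.94, 72.55), ("Oxetane", "2", 22.91, 22.35),
    ("Pyridine", "1", 149.98, 149.74), ("Pyridine", "2", 124.69, 123.78),
    ("Pyridine", "3", 137.11, 136.09),
]

# Carbons left out of the second set of bars in Figure 4.
EXCLUDED = {("DMAc", "1"), ("Fluorobenzene", "1")}


def mean_absolute_error(rows):
    """Mean absolute deviation of calculated shifts from experiment, in ppm."""
    return sum(abs(calc - exp) for _, _, calc, exp in rows) / len(rows)


mae_all = mean_absolute_error(ROWS)
mae_trimmed = mean_absolute_error([r for r in ROWS if (r[0], r[1]) not in EXCLUDED])
print(f"MAE (all 24 carbons): {mae_all:.2f} ppm")
print(f"MAE (without C1 of DMAc and fluorobenzene): {mae_trimmed:.2f} ppm")
```

The first figure agrees with the MAE of 2.25 ppm listed for pecG-2 in Table 2; swapping in another column of the table gives the corresponding values for the other basis sets.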
As Figure 4 shows, the pecG-1 and pecG-2 basis sets gave equilibrium geometries that led to more accurate carbon chemical shifts than the geometries obtained with the direct analogs of our basis sets, the 6-31G(2d,2p) and 6-311G(3df,3pd) basis sets, respectively. Meanwhile, the 6-311G(d,p) basis set behaved as it did at the DFT level, where it was among the worst basis sets for geometry optimization preceding NMR calculations. Thus, based on the presented benchmark calculations, we corroborated the superiority of the pecG-1 and pecG-2 basis sets over their analogs, with the pecG-2 basis set being the best for the geometry optimizations preceding carbon NMR chemical shift calculations. To demonstrate the performance of the pecG-2 basis set, the correlation plot of the theoretical GIAO-CCSD(T)/pecS-2 chemical shifts calculated using the CCSD/pecG-2 equilibrium geometries for test set 2 against the experimental data is shown in Figure 5.

3. Materials and Methods

Optimizations of the structures from test sets 1 and 2 were carried out at the DFT and CCSD levels of theory, respectively, in the Gaussian program [64]. All equilibrium geometries were obtained with solvent effects taken into account. For this purpose, the integral equation formalism of the polarizable continuum model, IEF-PCM [65,66], was used. The IEF-PCM was parametrized in accordance with the solvents specified in the experimental papers. For the specification for each compound, please see the “Results and Discussion” section. All equilibrium geometries, obtained at different levels of electronic structure theory and using different basis sets, are presented in the Supplementary Materials.
All GIAO-DFT calculations of 13C NMR shielding constants, either with or without accounting for the solvent effects, were conducted in the Gaussian program, while all gas phase CCSD(T) calculations of 13C NMR shieldings were carried out in the CFOUR program [67]. All calculated NMR shielding constants are presented in the Supplementary Materials (Tables S1–S15).
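As a rough illustration of how such jobs might be set up, the sketch below assembles Gaussian route lines in Python. It is not the authors' input: the helper name, the choice of chloroform as solvent, and the use of the `Gen` keyword (which tells Gaussian that the basis set is supplied in the input file, as would be needed for basis sets that are not built in) are assumptions made for the example. The method keywords `M062X` and `B972` correspond to the M06-2X and B97-2 functionals named in the Conclusions.

```python
from typing import Optional


def gaussian_route(job: str, method: str, solvent: Optional[str] = None) -> str:
    """Build a Gaussian route line that expects a user-supplied basis set (Gen)."""
    parts = ["#P", f"{method}/Gen", job]
    if solvent is not None:
        # IEF-PCM continuum solvation; the solvent name is an assumption of this sketch.
        parts.append(f"SCRF=(IEFPCM,Solvent={solvent})")
    return " ".join(parts)


# Hypothetical set-1-style jobs: geometry optimization, then GIAO shielding calculation.
opt_route = gaussian_route("Opt", "M062X", solvent="Chloroform")
nmr_route = gaussian_route("NMR=GIAO", "B972", solvent="Chloroform")
gas_phase_route = gaussian_route("NMR=GIAO", "B972")
```

The basis set definitions themselves would follow the molecule specification in the input file; they are omitted here because they depend on the elements present.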

4. Conclusions

We investigated the performance of our recently proposed pecG-n (n = 1, 2) basis sets in geometry optimization, using the example of 13C NMR shielding constant/chemical shift calculations. They improved the final accuracy of the calculated NMR data compared to the other basis sets commonly used at the geometry optimization stage, such as Dunning’s cc-pVXZ (X = D, T) and the Pople-style 6-31G(2d,2p), 6-311G(3df,3pd), and 6-311G(d,p) basis sets.
The theoretical analysis was based on a comparison of the carbon shielding constants calculated at the GIAO-DFT(B97-2)//DFT(M06-2X) level for a wide variety of natural products (set 1) with the reference data obtained using the equilibrium geometries calculated using the cc-pVQZ basis set. Most importantly, this analysis showed that the pecG-1 and pecG-2 basis sets gave substantially better equilibrium geometries than their direct Pople-style analogs with the same size, the 6-31G(2d,2p) and 6-311G(3df,3pd) basis sets, respectively, and, in fact, they produced a substantially smaller geometry factor error in the calculated carbon shielding constants. Out of all considered basis sets, the pecG-2 basis set was found to be the best, providing an MAE of only 0.11 ppm for the shielding constants of the molecules of test set 1.
The practical analysis was based upon a comparison of the calculated 13C NMR chemical shifts with experimental data. This analysis involved a highly demanding computational protocol using the CCSD level of theory for the geometry optimizations and the GIAO-CCSD(T) level for the shielding constant calculations. This analysis was performed on a set of very electronically complicated systems (set 2) and revealed the same pattern that was observed in the theoretical analysis: the pecG-n (n = 1, 2) basis sets showed considerably superior efficacy in quelling the geometry factor error in the 13C NMR chemical shift calculations, with the pecG-2 basis set being the best. The MAE achieved with the pecG-2 basis set for test set 2 compared to the experimental data was found to be 2.25 ppm.
Based on the obtained results, we strongly recommend using the pecG-2 basis set in geometry optimizations for 13C NMR chemical shift calculations whenever high-precision modeling is required, as in the case of large-sized natural products. If, on the other hand, the number of basis functions that can be handled is strongly limited, the pecG-1 basis set is recommended. The pecG-1 and pecG-2 basis sets were designed to provide the smallest geometry factor error in molecular property calculations; being moderate in size, they are therefore well suited to the geometry optimizations performed as part of NMR spectra modeling. In this work, this was corroborated using the example of 13C NMR chemical shift calculations.

Supplementary Materials

The following supporting information can be downloaded at: https://www.mdpi.com/article/10.3390/ijms251910588/s1.

Author Contributions

Conceptualization, Y.Y.R. and I.L.R.; methodology, Y.Y.R. and V.A.S.; software, Y.Y.R. and V.A.S.; validation, Y.Y.R. and I.L.R.; formal analysis, Y.Y.R., I.L.R. and V.A.S.; investigation, Y.Y.R., I.L.R. and V.A.S.; resources, V.A.S.; data curation, Y.Y.R.; writing—original draft preparation, I.L.R.; writing—review and editing, Y.Y.R. and I.L.R.; visualization, I.L.R.; supervision, Y.Y.R.; project administration, Y.Y.R.; funding acquisition, V.A.S. and Y.Y.R. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the Russian Science Foundation, grant number 23-23-00267. https://rscf.ru/project/23-23-00267/ (accessed on 1 September 2024).

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The original contributions presented in the study are included in the article and Supplementary Materials; further inquiries can be directed to the corresponding author.

Acknowledgments

We are grateful to the Irkutsk Supercomputer Centre of SB RAS for providing the computational resources of the cluster “Academician V. M. Matrosov” [68] and to the A. E. Favorsky Irkutsk Institute of Chemistry for the facilities of the Baikal Analytical Center (http://ckp-rf.ru/ckp/3050, accessed on 1 September 2024).

Conflicts of Interest

The authors declare no conflicts of interest.

References

  1. Huang, Z.; Bi, T.; Jiang, H.; Liu, H. Review on NMR as a tool to analyse natural products extract directly: Molecular structure elucidation and biological activity analysis. Phytochem. Anal. 2024, 35, 5–16.
  2. Semenov, V.A.; Krivdin, L.B. Computational NMR of natural products. Russ. Chem. Rev. 2022, 91, RCR5027.
  3. Krivdin, L.B. Computational 1H and 13C NMR in structural and stereochemical studies. Magn. Reson. Chem. 2022, 60, 733–828.
  4. Helgaker, T.; Jaszuński, M.; Ruud, K. Ab Initio Methods for the Calculation of NMR Shielding and Indirect Spin−Spin Coupling Constants. Chem. Rev. 1999, 99, 293–352.
  5. Rusakova, I.L. Quantum Chemical Approaches to the Calculation of NMR Parameters: From Fundamentals to Recent Advances. Magnetochemistry 2022, 8, 50.
  6. Rablen, P.R.; Pearlman, S.A.; Finkbiner, J. A comparison of density functional methods for the estimation of proton chemical shifts with chemical accuracy. J. Phys. Chem. A 1999, 103, 7357–7363.
  7. Toušek, J.; Dostál, J.; Marek, R. Theoretical and experimental NMR chemical shifts of norsanguinarine and norchelerythrine. J. Mol. Struct. 2004, 689, 115–120.
  8. Chesnut, D.B.; Quin, L.D. A study of NMR chemical shielding in 5-coordinate phosphorus compounds (phosphoranes). Tetrahedron 2005, 61, 12343–12349.
  9. Zhang, Y.; Xu, X.; Yan, Y. Systematic investigation on the geometric dependence of the calculated nuclear magnetic shielding constants. J. Comput. Chem. 2008, 29, 1798–1807.
  10. Nguyen, T.T. 1H/13C chemical shift calculations for biaryls: DFT approaches to geometry optimization. R. Soc. Open Sci. 2021, 8, 210954.
  11. Kondrashova, S.A.; Polyancev, F.M.; Ganushevich, Y.S.; Latypov, S.K. DFT approach for predicting 13C NMR shifts of atoms directly coordinated to nickel. Organometallics 2021, 40, 1614–1625.
  12. Wu, H.; Hemmingsen, L.; Sauer, S.P.A. On the geometry dependence of the nuclear magnetic resonance chemical shift of mercury in thiolate complexes: A relativistic density functional theory study. Magn. Reson. Chem. 2024, 62, 648–669.
  13. Helgaker, T.; Gauss, J.; Jørgensen, P.; Olsen, J. The prediction of molecular equilibrium structures by the standard electronic wave functions. J. Chem. Phys. 1997, 106, 6430–6440.
  14. Temelso, B.; Valeev, E.F.; Sherrill, C.D. A Comparison of One-Particle Basis Set Completeness, Higher-Order Electron Correlation, Relativistic Effects, and Adiabatic Corrections for Spectroscopic Constants of BH, CH+, and NH+. J. Phys. Chem. A 2004, 108, 3068–3075.
  15. Heckert, M.; Kállay, M.; Tew, D.P.; Klopper, W.; Gauss, J. Basis-set extrapolation techniques for the accurate calculation of molecular equilibrium geometries using coupled-cluster theory. J. Chem. Phys. 2006, 125, 044108.
  16. Rusakov, Y.Y.; Nikurashina, Y.A.; Rusakova, I.L. On the utmost importance of the geometry factor of accuracy in the quantum chemical calculations of 31P NMR chemical shifts: New efficient pecG-n (n = 1, 2) basis sets for the geometry optimization procedure. J. Chem. Phys. 2024, 160, 084109.
  17. Rusakov, Y.Y.; Rusakova, I.L. An efficient method for generating property-energy consistent basis sets. New pecJ-n (n = 1, 2) basis sets for high-quality calculations of indirect nuclear spin–spin coupling constants involving 1H, 13C, 15N, and 19F nuclei. Phys. Chem. Chem. Phys. 2021, 23, 14925–14939.
  18. Rusakov, Y.Y.; Rusakova, I.L. Getaway from the Geometry Factor Error in the Molecular Property Calculations: Efficient pecG-n (n = 1, 2) Basis Sets for the Geometry Optimization of Molecules Containing Light p Elements. J. Chem. Theory Comput. 2024, 20, 6661–6673.
  19. Bartlett, R.J. Many-Body Perturbation Theory and Coupled Cluster Theory for Electron Correlation in Molecules. Annu. Rev. Phys. Chem. 1981, 32, 359–401.
  20. Scuseria, G.E.; Janssen, C.L.; Schaefer, H.F., III. An efficient reformulation of the closed-shell coupled cluster single and double excitation (CCSD) equations. J. Chem. Phys. 1988, 89, 7382–7387.
  21. Dunning, T.H. Gaussian basis sets for use in correlated molecular calculations. I. The atoms boron through neon and hydrogen. J. Chem. Phys. 1989, 90, 1007–1023.
  22. Woon, D.E.; Dunning, T.H. Gaussian basis sets for use in correlated molecular calculations. III. The atoms aluminum through argon. J. Chem. Phys. 1993, 98, 1358–1371.
  23. Gordon, M.S.; Binkley, J.S.; Pople, J.A.; Pietro, W.J.; Hehre, W.J. Self-consistent molecular-orbital methods. 22. Small split-valence basis sets for second-row elements. J. Am. Chem. Soc. 1982, 104, 2797–2803.
  24. Francl, M.M.; Pietro, W.J.; Hehre, W.J.; Binkley, J.S.; Gordon, M.S.; DeFrees, D.J.; Pople, J.A. Self-consistent molecular orbital methods. XXIII. A polarization-type basis set for second-row elements. J. Chem. Phys. 1982, 77, 3654–3665.
  25. Rao, K.V.; Donia, M.S.; Peng, J.; Garcia-Palomero, E.; Alonso, D.; Martinez, A.; Medina, M.; Franzblau, S.G.; Tekwani, B.L.; Khan, S.I.; et al. Manzamine B and E and Ircinal A Related Alkaloids from an Indonesian Acanthostrongylophora Sponge and Their Activity against Infectious, Tropical Parasitic, and Alzheimer’s Diseases. J. Nat. Prod. 2006, 69, 1034–1040.
  26. Tchinda, A.T.; Tamze, V.; Ngono, A.R.N.; Ayimele, G.A.; Cao, M.; Angenot, L.; Frédérich, M. Alkaloids from the stem bark of Strychnos icaja. Phytochem. Lett. 2012, 5, 108–113.
  27. Frédérich, M.; Choi, Y.H.; Angenot, L.; Harnischfeger, G.; Lefeber, A.W.M.; Verpoorte, R. Metabolomic analysis of Strychnos nux-vomica, Strychnos icaja and Strychnos ignatii extracts by 1H nuclear magnetic resonance spectrometry and multivariate analysis techniques. Phytochemistry 2004, 65, 1993–2001.
  28. Januário, A.H.; Filho, E.R.; Pietro, R.C.L.R.; Kashima, S.; Sato, D.N.; França, S.C. Antimycobacterial physalins from Physalis angulata L. (Solanaceae). Phytother. Res. 2002, 16, 445–448.
  29. Chudzik, M.; Korzonek-Szlacheta, I.; Król, W. Triterpenes as Potentially Cytotoxic Compounds. Molecules 2015, 20, 1610–1625.
  30. Liu, J.H.; Tang, J.; Zhu, Z.F.; Chen, L. Design, synthesis, and anti-tumor activity of novel betulinic acid derivatives. J. Asian Nat. Prod. Res. 2014, 16, 34–42.
  31. Boryczka, S.; Bębenek, E.; Wietrzyk, J.; Kempińska, K.; Jastrzębska, M.; Kusz, J.; Nowak, M. Synthesis, structure and cytotoxic activity of new acetylenic derivatives of betulin. Molecules 2013, 18, 4526–4543.
  32. Urban, M.; Vlk, M.; Dzubak, P.; Hajduch, M.; Sarek, J. Cytotoxic heterocyclic triterpenoids derived from betulin and betulinic acid. Bioorg. Med. Chem. 2012, 20, 3666–3674.
  33. Baratto, L.C.; Porsani, M.V.; Pimentel, I.C.; Pereira Netto, A.B.; Paschke, R.; Oliveira, B.H. Preparation of betulinic acid derivatives by chemical and biotransformation methods and determination of cytotoxicity against selected cancer cell lines. Eur. J. Med. Chem. 2013, 68, 121–131.
  34. Aberham, A.; Cicek, S.S.; Schneider, P.; Stuppner, H. Analysis of Sesquiterpene Lactones, Lignans, and Flavonoids in Wormwood (Artemisia absinthium L.) Using High-Performance Liquid Chromatography (HPLC)-Mass Spectrometry, Reversed Phase HPLC, and HPLC-Solid Phase Extraction-Nuclear Magnetic Resonance. J. Agric. Food Chem. 2010, 58, 10817–10823.
  35. Chai, X.-Y.; Xu, Z.-R.; Bai, C.-C.; Zhou, F.-R.; Tu, P.-F. A new seco-friedelolactone acid from the bark and twigs of Itoa orientalis. Fitoterapia 2009, 80, 408–410.
  36. Frédérich, M.; Jacquier, M.-J.; Thépenier, P.; De Mol, P.; Tits, M.; Philippe, G.; Delaude, C.; Angenot, L.; Zèches-Hanrot, M. Antiplasmodial Activity of Alkaloids from Various Strychnos Species. J. Nat. Prod. 2002, 65, 1381–1386.
  37. Tchinda, A.T.; Jansen, O.; Nyemb, J.-N.; Tits, M.; Dive, G.; Angenot, L.; Frédérich, M. Strychnobaillonine, an Unsymmetrical Bisindole Alkaloid with an Unprecedented Skeleton from Strychnos icaja Roots. J. Nat. Prod. 2014, 77, 1078–1082.
  38. González, A.G.; Francisco, C.G.; Freire, R.; Hernández, R.; Salazar, J.A.; Suárez, E. Iguesterin, a new quinonoid triterpene from Catha cassinoides. Phytochemistry 1975, 14, 1067–1070.
  39. Sneden, A.T. Isoiguesterin, A New Antileukemic Bisnortriterpene from Salacia madagascariensis. J. Nat. Prod. 1981, 44, 503–507.
  40. Sichaem, J.; Worawalai, W.; Tip-pyang, S. Chemical constituents from the roots of Nauclea orientalis. Chem. Natur. Comp. 2012, 48, 827–830.
  41. Zhao, Y.; Truhlar, D.G. The M06 suite of density functionals for main group thermochemistry, thermochemical kinetics, noncovalent interactions, excited states, and transition elements: Two new functionals and systematic testing of four M06-class functionals and 12 other functionals. Theor. Chem. Acc. 2008, 120, 215–241.
  42. Kiohara, V.O.; Carvalho, E.F.V.; Paschoal, C.W.A.; Machado, F.B.C.; Roberto-Neto, O. DFT and CCSD(T) electronic properties and structures of aluminum clusters: Alnx (n = 1–9, x = 0, ±1). Chem. Phys. Lett. 2013, 568–569, 42–48.
  43. Walker, M.; Harvey, A.J.A.; Sen, A.; Dessent, C.E.H. Performance of M06, M06-2X, and M06-HF Density Functionals for Conformationally Flexible Anionic Clusters: M06 Functionals Perform Better than B3LYP for a Model System with Dispersion and Ionic Hydrogen-Bonding Interactions. J. Phys. Chem. A 2013, 117, 12590–12600.
  44. Gauss, J.; Stanton, J.F. Perturbative treatment of triple excitations in coupled-cluster calculations of nuclear magnetic shielding constants. J. Chem. Phys. 1996, 104, 2574–2583.
  45. Becke, A.D. Density-functional thermochemistry. V. Systematic optimization of exchange-correlation functionals. J. Chem. Phys. 1997, 107, 8554–8560.
  46. Wilson, P.J.; Bradley, T.J.; Tozer, D.J. Hybrid exchange-correlation functional determined from thermochemical data and ab initio potentials. J. Chem. Phys. 2001, 115, 9233–9242.
  47. Rusakov, Y.Y.; Rusakova, I.L. New pecS-n (n = 1, 2) basis sets for quantum chemical calculations of the NMR chemical shifts of H, C, N, and O nuclei. J. Chem. Phys. 2022, 156, 244112.
  48. Rusakov, Y.Y.; Rusakova, I.L. New efficient pecS-n (n = 1, 2) basis sets for quantum chemical calculations of 31P NMR chemical shifts. Phys. Chem. Chem. Phys. 2023, 25, 18728–18741.
  49. Rusakov, Y.Y.; Semenov, V.A.; Rusakova, I.L. On the Efficiency of the Density Functional Theory (DFT)-Based Computational Protocol for 1H and 13C Nuclear Magnetic Resonance (NMR) Chemical Shifts of Natural Products: Studying the Accuracy of the pecS-n (n = 1, 2) Basis Sets. Int. J. Mol. Sci. 2023, 24, 14623.
  50. Teale, A.M.; Lutnæs, O.B.; Helgaker, T.; Tozer, D.J.; Gauss, J. Benchmarking density-functional theory calculations of NMR shielding constants and spin–rotation constants using accurate coupled-cluster calculations. J. Chem. Phys. 2013, 138, 024111.
  51. Brzyska, A.; Borowski, P.; Woliński, K. Solvent effects on the nitrogen NMR chemical shifts in 1-methylazoles—A theoretical study. New J. Chem. 2015, 39, 9627–9640.
  52. Hoffmann, F.; Li, D.-W.; Sebastiani, D.; Brüschweiler, R. Improved Quantum Chemical NMR Chemical Shift Prediction of Metabolites in Aqueous Solution Toward the Validation of Unknowns. J. Phys. Chem. A 2017, 121, 3071–3078.
  53. Chirchir, K.D.; Cheplogoi, K.P.; Omolo, O.J.; Langat, K.M. Chemical constituents of Solanum mauense (Solanaceae) and Dovyalis abyssinica (Salicaceae). Int. J. Biol. Chem. Sci. 2018, 12, 999–1007.
  54. Verpoorte, R.; van Beek, T.A.; Riegman, R.L.M.; Hylands, P.J.; Bisset, N.G. Carbon-13 NMR spectroscopy of some Strychnos alkaloids: Part 2. Org. Magn. Reson. 1984, 22, 345–348.
  55. Massiot, G.; Massoussa, B.; Thepenier, P.; Jaquier, M.-J.; le Men-Olivier, L.; Delaude, C. Structure of matopensine, a novel dimeric indole alkaloid from Strychnos species. Heterocycles 1983, 20, 2339–2342.
  56. Massiot, G.; Massoussa, B.; Jacquier, M.-J.; Thépénier, P.; Le Men-Olivier, L.; Delaude, C.; Verpoorte, R. Alkaloids from roots of Strychnos matopensis. Phytochemistry 1988, 27, 3293–3304.
  57. Harris, R.K.; Becker, E.D.; Cabral de Menezes, S.M.; Goodfellow, R.; Granger, P. NMR Nomenclature: Nuclear Spin Properties and Conventions for Chemical Shifts: IUPAC Recommendations 2001. Solid State Nucl. Magn. Reson. 2002, 22, 458–483.
  58. Harris, R.K.; Becker, E.D.; Cabral De Menezes, S.M.; Granger, P.; Hoffman, R.E.; Zilm, K.W. Further Conventions for NMR Shielding and Chemical Shifts (IUPAC Recommendations 2008). Magn. Reson. Chem. 2008, 46, 582–598.
  59. Jensen, F. Introduction to Computational Chemistry; John Wiley & Sons Ltd.: Chichester, UK, 2007.
  60. Cohen, R.D.; Wood, J.S.; Lam, Y.-H.; Buevich, A.V.; Sherer, E.C.; Reibarkh, M.; Williamson, R.T.; Martin, G.E. DELTA50: A Highly Accurate Database of Experimental 1H and 13C NMR Chemical Shifts Applied to DFT Benchmarking. Molecules 2023, 28, 2449.
  61. Jensen, F. Basis Set Convergence of Nuclear Magnetic Shielding Constants Calculated by Density Functional Methods. J. Chem. Theory Comput. 2008, 4, 719–727.
  62. Newberry, R.W.; Raines, R.T. The n→π* Interaction. Acc. Chem. Res. 2017, 50, 1838–1846.
  63. Rosenthal, J.; Schuster, D.I. The Anomalous Reactivity of Fluorobenzene in Electrophilic Aromatic Substitution and Related Phenomena. J. Chem. Educ. 2003, 80, 679.
  64. Frisch, M.J.; Trucks, G.W.; Schlegel, H.B.; Scuseria, G.E.; Robb, M.A.; Cheeseman, J.R.; Scalmani, G.; Barone, V.; Petersson, G.A.; Nakatsuji, H.; et al. Gaussian 09, Revision C.01; Gaussian, Inc.: Wallingford, CT, USA, 2016.
  65. Tomasi, J.; Mennucci, B.; Cancès, E. The IEF version of the PCM solvation method: An overview of a new method addressed to study molecular solutes at the QM ab initio level. J. Mol. Struct. THEOCHEM 1999, 464, 211–226.
  66. Tomasi, J.; Mennucci, B.; Cammi, R. Quantum Mechanical Continuum Solvation Models. Chem. Rev. 2005, 105, 2999–3094.
  67. Stanton, J.F.; Gauss, J.; Cheng, L.; Harding, M.E.; Matthews, D.A.; Szalay, P.G. CFOUR. A Quantum Chemical Program Package. Available online: https://cfour.uni-mainz.de/cfour/ (accessed on 1 September 2024).
  68. Irkutsk Supercomputer Center of SB RAS. Irkutsk: ISDCT SB RAS. Available online: https://hpc.icc.ru (accessed on 8 September 2024).
Figure 1. Compounds used in theoretical analysis (set 1). Blue, red, yellow and gray balls represent nitrogen, oxygen, carbon and hydrogen atoms, respectively.
Figure 2. MAEs for the 13C NMR shielding constants calculated for set 1 using the equilibrium geometries obtained using different basis sets (listed along the abscissa) compared to the corresponding reference theoretical data. The red numbers indicate the sizes of the basis sets for the elements of the second period. In these calculations, only the lowest energy conformers were taken into account.
Figure 3. Compounds used in the analysis based on the comparison with experimental data (set 2).
Figure 4. MAEs for the 13C NMR chemical shifts calculated for test set 2 using equilibrium geometries obtained using different basis sets (listed along the abscissa) against the corresponding experimental data. The second bars show the altered statistical figures evaluated without taking into account the chemical shift of C1 of DMAc and fluorobenzene. The red numbers indicate the sizes of the basis sets for the second period elements.
Figure 5. Correlation plot for the 13C NMR shielding constants of test set 2 calculated at the GIAO-CCSD(T)/pecS-2 level using equilibrium geometries obtained at the CCSD/pecG-2 level against the corresponding experimental data.
Table 1. Compositions of basis sets considered in theoretical and practical analyses.

| Basis Set | Element | Contracted Composition | Number of Contracted Basis Functions |
|---|---|---|---|
| Rusakov’s pecG-n series | | | |
| pecG-1 | H | [2s2p] | 8 |
| | B-F | [3s2p2d] | 19 |
| pecG-2 | H | [3s3p1d] | 17 |
| | B-F | [4s3p3d1f] | 35 |
| Dunning’s cc-pVXZ series | | | |
| cc-pVDZ | H | [2s1p] | 5 |
| | B-F | [3s2p1d] | 14 |
| cc-pVTZ | H | [3s2p1d] | 14 |
| | B-F | [4s3p2d1f] | 30 |
| cc-pVQZ | H | [4s3p2d1f] | 30 |
| | B-F | [5s4p3d2f1g] | 55 |
| Pople-style K-LMNG(x,y) series | | | |
| 6-311G(d,p) | H | [3s1p] | 6 |
| | B-F | [4s3p1d] | 18 |
| 6-31G(2d,2p) | H | [2s2p] | 8 |
| | B-F | [3s2p2d] | 19 |
| 6-311G(3df,3pd) | H | [3s3p1d] | 17 |
| | B-F | [4s3p3d1f] | 35 |
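The last column of Table 1 follows directly from the contracted composition if each shell contributes 2l + 1 spherical-harmonic functions (1 for s, 3 for p, 5 for d, 7 for f, 9 for g). The sketch below shows this bookkeeping; the function name is an illustrative assumption, and the use of spherical rather than Cartesian functions is inferred from the counts in the table.

```python
import re

# Spherical-harmonic components per shell type: 2l + 1.
COMPONENTS = {"s": 1, "p": 3, "d": 5, "f": 7, "g": 9}


def count_contracted_functions(composition: str) -> int:
    """Count contracted basis functions for a composition such as '[4s3p3d1f]'."""
    shells = re.findall(r"(\d+)([spdfg])", composition)
    if not shells:
        raise ValueError(f"no shells found in {composition!r}")
    return sum(int(count) * COMPONENTS[shell] for count, shell in shells)


# Compositions for B-F taken from Table 1.
for name, composition in [
    ("pecG-1", "[3s2p2d]"),
    ("pecG-2", "[4s3p3d1f]"),
    ("cc-pVQZ", "[5s4p3d2f1g]"),
]:
    print(name, composition, count_contracted_functions(composition))
```

This makes explicit that pecG-1 and pecG-2 match 6-31G(2d,2p) and 6-311G(3df,3pd) in size, at 19 and 35 functions per second-period atom.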
Table 2. Calculated and experimental 13C NMR chemical shifts (in ppm) of molecules in set 2.

| Molecule | No. of Carbon Atoms ¹ | 6-311G(d,p) | 6-31G(2d,2p) | 6-311G(3df,3pd) | pecG-1 | pecG-2 | Exp. ² |
|---|---|---|---|---|---|---|---|
| Acetaldehyde | 1 | 196.19 | 197.17 | 196.25 | 196.37 | 196.97 | 199.97 |
| | 2 | 34.92 | 34.70 | 35.09 | 34.90 | 35.02 | 30.99 |
| Acetonitrile | 1 | 4.95 | 4.84 | 4.85 | 4.61 | 4.65 | 1.91 |
| | 2 | 115.07 | 115.73 | 115.57 | 115.91 | 115.93 | 116.33 |
| Cyclopropane | | −1.61 | −1.66 | −1.37 | −1.63 | −1.57 | −3.15 |
| DMAc | 1 | 160.74 | 161.03 | 161.06 | 160.69 | 161.50 | 170.66 |
| | 2 | 25.58 | 25.25 | 26.12 | 25.34 | 26.10 | 21.58 |
| | 3 | 40.35 | 39.75 | 40.60 | 39.86 | 40.61 | 38.05 |
| | 4 | 34.32 | 33.74 | 34.61 | 33.88 | 34.60 | 35.20 |
| Fluorobenzene | 1 | 155.98 | 156.06 | 155.61 | 156.09 | 155.93 | 162.86 |
| | 2 | 111.85 | 111.65 | 111.51 | 111.72 | 111.49 | 115.32 |
| | 3 | 131.15 | 130.89 | 131.09 | 131.00 | 130.94 | 129.96 |
| | 4 | 127.11 | 126.88 | 126.52 | 126.89 | 126.55 | 123.98 |
| Isoxazole | 1 | 158.00 | 158.72 | 157.55 | 158.28 | 157.62 | 157.64 |
| | 2 | 106.04 | 105.96 | 105.78 | 105.95 | 105.65 | 103.47 |
| | 3 | 148.72 | 149.18 | 148.16 | 148.83 | 147.96 | 149.02 |
| Norbornadiene | 1 | 145.04 | 145.27 | 145.17 | 145.05 | 144.97 | 143.43 |
| | 2 | 48.28 | 48.27 | 48.29 | 48.63 | 48.27 | 50.26 |
| | 3 | 74.67 | 74.19 | 75.34 | 75.13 | 75.50 | 75.32 |
| Oxetane | 1 | 73.80 | 74.05 | 73.96 | 73.87 | 73.94 | 72.55 |
| | 2 | 23.41 | 23.09 | 23.46 | 23.41 | 22.91 | 22.35 |
| Pyridine | 1 | 150.18 | 150.07 | 149.91 | 150.29 | 149.98 | 149.74 |
| | 2 | 125.15 | 124.97 | 124.99 | 124.95 | 124.69 | 123.78 |
| | 3 | 137.53 | 137.41 | 137.21 | 137.33 | 137.11 | 136.09 |
| α | | 196.12 | 196.82 | 197.81 | 197.18 | 198.54 | |
| MAE | | 2.43 | 2.34 | 2.39 | 2.31 | 2.25 | |
¹ For the enumeration of atoms, see Figure 3. ² For experimental data, see reference [60].
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
