DELTA50: A Highly Accurate Database of Experimental 1H and 13C NMR Chemical Shifts Applied to DFT Benchmarking

Density functional theory (DFT) benchmark studies of 1H and 13C NMR chemical shifts often yield differing conclusions, likely due to non-optimal test molecules and non-standardized data acquisition. To address this issue, we carefully selected and measured 1H and 13C NMR chemical shifts for 50 structurally diverse small organic molecules containing atoms from only the first two rows of the periodic table. Our NMR dataset, DELTA50, was used to calculate linear scaling factors and to evaluate the accuracy of 73 density functionals, 40 basis sets, 3 solvent models, and 3 gauge-referencing schemes. The best performing DFT methodologies for 1H and 13C NMR chemical shift predictions were WP04/6-311++G(2d,p) and ωB97X-D/def2-SVP, respectively, when combined with the polarizable continuum solvent model (PCM) and gauge-independent atomic orbital (GIAO) method. Geometries should be optimized at the B3LYP-D3/6-311G(d,p) level including the PCM solvent model for the best accuracy. Predictions of 20 organic compounds and natural products from a separate probe set had root-mean-square deviations (RMSD) of 0.07 to 0.19 for 1H and 0.5 to 2.9 for 13C. Maximum deviations were less than 0.5 and 6.5 ppm for 1H and 13C, respectively.

Molecules 2023, 28 When considering curation of a well-behaved test set, several compounds should be eliminated from the test set for the reasons indicated above (i.e., colored structures in Figure 1), such as those that contain row three elements or hydrogen bond donors or exhibit multiple low-lying conformers. Moreover, the experimental NMR data should be fully  [1] and Tantillo et al. [3] to generate linear scaling factors per Equation (1). Bold-faced, colored structures are modeling challenges due to row three atoms (red), H-bond donors (blue), or multiple low-energy conformers (green). (b) Probe set used by Tantillo et al. [3] to examine accuracy of DFT model chemistries for δ H and δ C predictions. (c) Examples of problematic structures for benchmarking: (i) attachment of Cl, a row three heavy atom, to carbon results in a 13.9 ppm δ C overprediction by DFT; and (ii) difficulties with accurately determining conformer energetics, as shown for several different energy calculations both with and without dispersion corrections using the same B3LYP/6-31G(d) geometries, can lead to vastly different Boltzmann populations and chemical shift predictions. Both 1 H and 13 C chemical shifts (δ C in parentheses) were calculated using the DP4+ model (PCM-mPW1PW91/6-31+G**//B3LYP/6-31G*) [82].
An additional plausible reason for the differing conclusions from various DFT chemical shift benchmarking studies is the varying referencing schemes used to convert isotropic shielding tensors into chemical shifts. The simplest method of subtracting the calculated isotropic shielding tensor, σ, for tetramethylsilane (TMS), according to the equation, δ = σ TMS − σ cmpd , is also the least accurate due to poor cancellation of errors, especially when comparing carbon atoms with different hybridization [83]. A better approach that improves error cancellation is to replace TMS with a structurally similar reference compound relative to the molecule of interest [84][85][86][87]. Alternatively, the most frequently applied methodology, which also improves error cancellation over TMS and does not require experimental data for specific reference compounds, is to calculate and apply linear scaling factors for a given computational model according to Equation (1) [1,3]. However, the downside is that such linear scaling factors must be determined from a large training set of experimental data, often as part of a comprehensive benchmarking study.
The goal of this research was to re-measure and verify experimental NMR assignments for a set of compounds, hereafter referred to as DELTA50, that can be used to provide a reliable benchmark for existing and new DFT methodologies as they are developed. In addition, a majority of currently available functionals, basis sets, and solvation models as implemented in Gaussian 16 [88] were tested to determine which is the most appropriate for 1 H and 13 C NMR chemical shift predictions. The DELTA50 set was then used to calculate (per Equation (1)) linear scaling factors for conversion of isotropic shielding tensors to chemical shifts. Finally, the performance of the optimal methods and linear scaling factors were assessed using 20 natural products and small-to medium-sized organic compounds, which represents a probe set.

DELTA50 Compound Curation and Experimental Measurements
DELTA50 ( Figure 2) contains 50 compounds, comprising 114 proton and 143 carbon chemical shifts, which is a subset of the compounds from the test and probe sets of Rablen [1] and Tantillo [3] (Figure 1) that avoids the previously mentioned problem cases. A wide variety of functional groups (nitro, fluoro, nitrile, aldehyde, ketone, ester, alkyne, amide, amine, olefin, aliphatic, aromatic, and ether), ring sizes (three-to six-membered), and bond hybridizations are present, resulting in 1 H and 13 C NMR chemical shifts ranging from 0.25 to 9.80 ppm and −2.9 to 219.4 ppm, respectively.
Proton and carbon NMR spectra were acquired at 298 K for ≤10 mM solutions of each compound dissolved in CDCl 3 with 0.03% TMS as internal reference using a 600 MHz spectrometer. A concentration of 10 mM in a 5 mm NMR tube (~1-2 mg) allowed for collection of 13 C NMR spectra in reasonable overall acquisition times (e.g., approximately 2-4 h on a Bruker 600 MHz AVANCE III spectrometer equipped with a liquid N 2 -cooled broadband Prodigy™ probe) with all resonances detectable (signal-to-noise ratio ≥ 3:1). Proton NMR spectra were acquired using a 2.73 s acquisition time, 20 ppm spectral width, 6.175 ppm transmitter frequency, 16 scans, and a 1 s relaxation delay. Carbon NMR spectra were acquired using a 0.9 s acquisition time, 240 ppm spectral width, 100 ppm transmitter frequency, 3072 to 4096 scans, and a 2 s relaxation delay. Spectral data were processed in MestReNova, version 14.2. using 0.3 and 1.0 Hz exponential line broadenings applied to 1 H and 13 C, respectively. Both proton (δ H ) and carbon (δ C ) chemical shifts were referenced to TMS at 0.00 ppm and recorded to two decimal places. Individual proton and carbon spectra for each molecule are provided in the Supplementary Materials with expansions for individual multiplets or congested spectral regions, and the assigned chemical shifts are shown for each spectrum; these are also shown in Figure 2.  Figure 1. Experimental 1 H chemical shifts are provided for numbered carbon atoms with 13 C chemical shifts listed in parentheses. Numbering corresponds to that used in calculations. Experimental data were acquired on a 600 MHz NMR in CDCl 3 and referenced to TMS at 0.00 ppm for both 1 H and 13 C.
A concentration-dependent study was undertaken to ensure that samples were sufficiently dilute to avoid deleterious aggregation using a representative subset of compounds with diverse functionalities from DELTA50: benzene, pyridazine, tetrahydrofuran (THF), 3-butyn-2-one, and fluorobenzene ( Figure 3). Here, it was found that proton chemical shifts were stable within a variance of ±0.3% at concentrations up to 50 mM. The largest concen-tration effects were observed for pyridazine and 3-butyn-2-one, which were also found to exhibit the largest proton chemical shift deviations from theoretical predictions in Rablen's study [1]. THF did not show any chemical shift changes up to a concentration of 50 mM, as would be expected for an aliphatic compound, while the chemical shifts of benzene and fluorobenzene deviated by less than 0.02%, indicating that π-π interactions were not appreciable at these concentrations. The alkyne proton chemical shift (3.21 ppm) of 3-butyn-2-one proportionally increased up to 0.1% with increasing concentration, while the methyl proton resonance (2.38 ppm) showed no change, indicating slight π interactions. The protons meta to nitrogen of pyridazine (7.49 ppm) showed the largest concentration dependence, and it was also noticed in the pyridazine samples that the peak width and chemical shift of residual water varied considerably (υ 1/2 12 to 23 Hz and δ H 1.54 to 1.80 ppm). This pointed to the presence of DCl in CDCl 3 that could protonate basic nitrogens, especially at low concentrations. Thus, a few solid crystals of anhydrous K 2 CO 3 were added to the CDCl 3 solvent to neutralize DCl, and this resulted in more consistent chemical shifts as well as markedly improved pyridazine peak shape. Based on these results, anhydrous K 2 CO 3 was used for all sample preparations of basic compounds, and a 10 mM concentration was considered acceptable for the compounds in the DELTA50 test set.
ical shifts are shown for each spectrum; these are also shown in Figure 2.
A concentration-dependent study was undertaken to ensure that samples were sufficiently dilute to avoid deleterious aggregation using a representative subset of compounds with diverse functionalities from DELTA50: benzene, pyridazine, tetrahydrofuran (THF), 3-butyn-2-one, and fluorobenzene ( Figure 3). Here, it was found that proton chemical shifts were stable within a variance of ±0.3% at concentrations up to 50 mM. The largest concentration effects were observed for pyridazine and 3-butyn-2-one, which were also found to exhibit the largest proton chemical shift deviations from theoretical predictions in Rablen's study [1]. THF did not show any chemical shift changes up to a concentration of 50 mM, as would be expected for an aliphatic compound, while the chemical shifts of benzene and fluorobenzene deviated by less than 0.02%, indicating that π-π interactions were not appreciable at these concentrations. The alkyne proton chemical shift (3.21 ppm) of 3-butyn-2-one proportionally increased up to 0.1% with increasing concentration, while the methyl proton resonance (2.38 ppm) showed no change, indicating slight π interactions. The protons meta to nitrogen of pyridazine (7.49 ppm) showed the largest concentration dependence, and it was also noticed in the pyridazine samples that the peak width and chemical shift of residual water varied considerably (υ1/2 12 to 23 Hz and δH 1.54 to 1.80 ppm). This pointed to the presence of DCl in CDCl3 that could protonate basic nitrogens, especially at low concentrations. Thus, a few solid crystals of anhydrous K2CO3 were added to the CDCl3 solvent to neutralize DCl, and this resulted in more consistent chemical shifts as well as markedly improved pyridazine peak shape. Based on these results, anhydrous K2CO3 was used for all sample preparations of basic compounds, and a 10 mM concentration was considered acceptable for the compounds in the DELTA50 test set.  In some cases where chemical shift assignments were not sufficiently obvious from chemical shifts, scalar coupling, and integration, additional one-and two-dimensional (NOE, gCOSY, multiplicity-edited gHSQC, and/or gHMBC) spectra were acquired for confirmation. In particular, selective NOEs (300 to 700 ms mixing times) were used to determine the assignment of the methyl peaks in 2-methyl-2-butene, N,N-dimethylacetamide (DMAc), and N,N-dimethylformamide (DMF).
The differences between the previous experimental data and those recorded in the present study are shown as histogram plots ( Figure 4). While 76 proton chemical shifts (67%) were within ±0.05 ppm of the previously reported experimental data, there were 19 outliers (17%) with greater than a ±0.10 ppm deviation. The largest difference of 0.21 ppm was for the α-ether protons of tetrahydropyran. Linear scaling factors calculated at the B3LYP/cc-pVTZ//B3LYP/6-31G(d) level using the previous experimental data were approximately 0.2% different than using the newly acquired data (slope = −1.0429, intercept = 31.6499 versus slope = −1.0450, intercept = 31.6889, respectively), resulting in a maximum 0.04 ppm difference in predicted proton chemical shifts. For δ C , 88 out of 143 measurements (62%) were within ±0.25 ppm of the previously reported experimental data; 8 outliers (6%) were found with greater than a ±0.50 ppm difference. The largest carbon chemical shift difference of 2.45 ppm was for the keto carbon of cyclohexanone. Linear scaling factors, at the B3LYP/cc-pVTZ//B3LYP/6-31G(d), differed by approximately 0.1% (slope = −1.0372, intercept = 181.5955 using the previous experimental data, and slope = −1.0364, intercept = 181.5727 using the newly acquired data). This resulted in differences of up to 0.15 ppm.
The differences between the previous experimental data and those recorded in the present study are shown as histogram plots ( Figure 4). While 76 proton chemical shifts (67%) were within ±0.05 ppm of the previously reported experimental data, there were 19 outliers (17%) with greater than a ±0.10 ppm deviation. The largest difference of 0.21 ppm was for the α-ether protons of tetrahydropyran. Linear scaling factors calculated at the B3LYP/cc-pVTZ//B3LYP/6-31G(d) level using the previous experimental data were approximately 0.2% different than using the newly acquired data (slope = −1.0429, intercept = 31.6499 versus slope = −1.0450, intercept = 31.6889, respectively), resulting in a maximum 0.04 ppm difference in predicted proton chemical shifts. For δC, 88 out of 143 measurements (62%) were within ±0.25 ppm of the previously reported experimental data; 8 outliers (6%) were found with greater than a ±0.50 ppm difference. The largest carbon chemical shift difference of 2.45 ppm was for the keto carbon of cyclohexanone. Linear scaling factors, at the B3LYP/cc-pVTZ//B3LYP/6-31G(d), differed by approximately 0.1% (slope = −1.0372, intercept = 181.5955 using the previous experimental data, and slope = −1.0364, intercept = 181.5727 using the newly acquired data). This resulted in differences of up to 0.15 ppm. To ensure that only a single low-energy conformer comprised at least 98% of the Boltzmann distribution, a conformer search was performed for each molecule in the DELTA50 set using a mixed torsional, low-mode sampling search in MacroModel and the OPLS4 force field [89] as implemented in the Schrödinger software suite, version 2021-1 [90]. In some cases, multiple conformers were found, such as boat and chair forms of cyclohexane, and these were analyzed further by DFT energy calculations using the M06-2X/6-31+G(d,p) model chemistry (including vibrational energy corrections) and the SMD solvent model for chloroform, as implemented in Gaussian 16 [88]. This model chemistry was chosen because the M06-2X functional combined with the SMD solvent model has To ensure that only a single low-energy conformer comprised at least 98% of the Boltzmann distribution, a conformer search was performed for each molecule in the DELTA50 set using a mixed torsional, low-mode sampling search in MacroModel and the OPLS4 force field [89] as implemented in the Schrödinger software suite, version 2021-1 [90]. In some cases, multiple conformers were found, such as boat and chair forms of cyclohexane, and these were analyzed further by DFT energy calculations using the M06-2X/6-31+G(d,p) model chemistry (including vibrational energy corrections) and the SMD solvent model for chloroform, as implemented in Gaussian 16 [88]. This model chemistry was chosen because the M06-2X functional combined with the SMD solvent model has been shown to be particularly effective for prediction of relative energies [91][92][93][94]. Minimum energy conformations were initially verified by lack of imaginary vibrational modes (i.e., the second derivative matrix of energy with respect to displacement was positive definite). Boltzmann probabilities were calculated (see Supplementary Materials for additional details), and the dominant conformer (≥98% probability weighting) was included in the DELTA50 set (as was the case for the chair conformation of cyclohexane).
For each molecule in the DELTA50 set, geometries were optimized at the B3LYP/6-31G(d) level. The Cartesian coordinates and atom numbering for optimized geometries are provided in the Supplementary Materials along with the experimentally measured chemical shifts. A relatively low level of theory was chosen for molecular geometry optimizations because: (1) geometries are reasonably well predicted at this level; (2) geometry optimization is one of the most time-consuming steps in a DFT calculation; and (3) predicted chemical shift dependencies on molecular geometry should be correctable via linear scaling factors [3]. Moreover, the impact of geometry optimization on overall accuracy was further assessed after evaluating the performance of various functionals, basis sets, and solvent models.

DFT Benchmark Study
The performance of 73 density functionals implemented in Gaussian 16 [88] was evaluated for GIAO isotropic shielding predictions using a large, correlation-consistent, triple-zeta basis set, cc-pVTZ, ultrafine integration grid, and the polarizable continuum model (PCM) for solvent effects. Hartree-Fock (HF) theory was also evaluated for comparison purposes. A wide variety of different classes of functionals (Table 2) were included: those based on either the local density approximation (LDAs), generalized gradient approximation (GGAs), meta-GGAs, or hybrids, such as the popular B3LYP functional, that include an empirically derived contribution of HF exchange. In addition, range-separated functionals, which are often denoted with the prefix 'LC' for long-range correction, were also evaluated. Range-separated functionals vary the contributions from the various exchange terms based on pairwise electronic spatial distance to overcome problems from long-range density over-delocalization [95]. Finally, the impact of including an empirical dispersion correction, denoted as either 'D' or 'D3 , was also considered. While most of these functionals were designed for reproduction of electronic energies, two functionals, WP04 and WC04, were specifically parameterized for accurate proton and carbon chemical shift prediction, respectively [52].  Isotropic shielding tensors, σ, were converted to chemical shifts through linear regression analysis in Excel. To evaluate performance of each functional, residuals (deviations) were plotted as a function of chemical shift, and root-mean-square deviations (RMSD) and maximum deviations (MD) were calculated (Table 2). Particular focus was placed on evaluation of systematic errors with respect to hybridization and functional groups.
In general, a systematic improvement in accuracy was seen in going from HF theory to LDAs and then to more advanced functionals, e.g., HF < LDAs < GGAs ≈ meta-GGAs < hybrids. However, range separation showed minimal improvement in accuracy, and in most cases, except ωB97X-D, no difference was observed upon including an empirical dispersion correction. Maximum deviations for proton and carbon chemical shifts were typically from 0.2 to 0.5 ppm and 5 to 8 ppm, respectively. HF, SOGGA11, and several of the Minnesota functionals performed particularly poorly. The best performing functionals in the present study were found to be ωB97X-D for carbon chemical shifts and WP04 for proton chemical shifts. Figure 5 shows an example of carbon chemical shift deviations from experiment for several representative functionals from each class versus Hartree-Fock theory. Deviations were typically smaller for proton and carbon chemical shifts that resonate upfield (i.e., less than approximately 4.5 and 60 ppm for δ H and δ C , respectively), which corresponds to sp 3 -hybridized carbons. The largest deviations for carbon chemical shift predictions were typically carbonyls, olefins, and sp-hybridized carbons. Systematic errors were observed among several functional groups across the board for most density functionals. For instance, mPW1PW91, PBE0, B3LYP, and B97-D overpredicted the more electron-rich carbon in olefins by 2-5 ppm and underpredicted the amide carbonyl in DMF and DMAc by 3-6 ppm. Alkynes and nitriles were overpredicted by several ppm for mPW1PW91 and PBE0, underpredicted by 1-2 ppm for B97-D, or exhibited no noticeable bias for B3LYP.
Following functional evaluation, the basis set size was investigated. In principle, larger, more complete basis sets should have the flexibility to better approximate the electronic density and thereby yield higher accuracy, but this has not always been found to be the case with DFT-based methods [96][97][98]. Importantly, the size of the basis set can dramatically increase computational time. A medium-sized basis set would be most appropriate for larger molecules of greater than 600-700 Da, while the most accurate basis set should be utilized for calculations of small molecules or fragment structures and for critical compounds, such as newly discovered natural products with unusual scaffolds (e.g., homodimericin A [99]). Table 3 shows the errors and computation times for 40 basis sets from double to quadruple zeta, including polarization and diffuse functions on both heavy atoms and hydrogens. While counterintuitive (yet similar to other benchmark studies), smaller basis sets were found to provide more accurate carbon chemical shift predictions [48,51,71], with def2-SVP identified as the best performing basis set when paired with the ωB97X-D functional. Conversely, proton chemical shift predictions typically require larger basis sets, and diffuse functions appear to be productive for reducing errors. The best performing basis set was 6-311++G(2df,p); however, the computational time was found to be more than six times longer for calculation of nitromethane compared to the next smaller Pople-type basis set, 6-311++G(2d,p). Because 6-311++G(2d,p) performed nearly as well, this was chosen as the preferred basis set when paired with WP04 for proton chemical shift predictions.
In general, δ H predictions exhibited substantially fewer systematic errors than δ C . There were still a few notable cases: the aldehyde proton of DMF was often underpredicted by 0.1 to 0.5 ppm, cyclopropane was overpredicted by 0.1 to 0.3 ppm, and the alkyne protons in t-butyl acetylene and 3-butyn-2-one were underpredicted by 0.1 to 0.3 ppm.
Molecules 2023, 28, x FOR PEER REVIEW 11 of 28 carbon in olefins by 2-5 ppm and underpredicted the amide carbonyl in DMF and DMAc by 3-6 ppm. Alkynes and nitriles were overpredicted by several ppm for mPW1PW91 and PBE0, underpredicted by 1-2 ppm for B97-D, or exhibited no noticeable bias for B3LYP. Following functional evaluation, the basis set size was investigated. In principle, larger, more complete basis sets should have the flexibility to better approximate the electronic density and thereby yield higher accuracy, but this has not always been found to be the case with DFT-based methods [96][97][98]. Importantly, the size of the basis set can  Chemical shift calculations require that the Hamiltonian include interaction terms from the external magnetic field vector, which under a finite basis set leads to a dependence on the choice of the vector origin or gauge [100]. While all gauge methods converge to the same limit at increasing basis set size, most chemical shift calculations are typically handled via GIAO [101][102][103][104][105] due to faster convergence; however, Iron [55] found that the CSGT [105][106][107] method provides more accurate results when paired with longrange corrected (LC) functionals (i.e., the CSGT method with LC-TPSS/cc-pVTZ and the COSMO solvation model was recommended). To test if alternative gauge procedures showed improvement in predictability, CSGT and individual gauges for atoms in molecules (IGAIM) [106,107] were compared to GIAO. Because the accuracy of the gauge method is dependent on basis set size, the basis set was also varied from double-to triple-zeta with inclusion of varying amounts of diffuse and polarization functions. Table 4 shows that, as expected, GIAO converged much more quickly than CSGT and IGAIM. For carbon chemical shift predictions, GIAO produced the lowest RMSD error at only a double-zeta basis set, def2-SVP, and predictions became worse at larger basis sets. The CSGT and IGAIM predictions were nearly equivalent and required triple-zeta basis sets augmented with diffuse functions (viz., aug-cc-pVTZ and jul-cc-pVTZ), which were more than an order of magnitude longer in computation time, to yield comparable levels of accuracy to GIAO with def2-SVP. For proton chemical shifts, GIAO was slower to converge compared to carbon, yet GIAO still exhibited significantly improved accuracy at smaller basis set sizes versus CSGT and IGAIM. Based on these results, GIAO should be the preferred method when considering both speed and accuracy. Chemical shifts strongly depend on the molecular geometry and internuclear bond distances. Fortunately, the ground state geometry of most compounds has been shown to be well predicted at relatively low levels of theory using a wide variety of density functionals and even Hartree-Fock theory. At the start of this benchmarking study, geometries were optimized in vacuo at the B3LYP/6-31G(d) level, which is a frequently used methodology for computations of spectroscopic properties as well as energetics. The appropriateness of that choice was investigated by holding the NMR chemical shift calculation method constant [PCM-ωB97X-D/def2-SVP for δ C and PCM-WP04/6-311++G(2d,p) for δ H ] while varying the geometry optimization method. Four different functionals (B3LYP, B3LYP-D3, M06-2X, and ωB97X-D), which are often used for geometry optimizations, were tested when paired with Pople-type basis sets from double-to triple-zeta. In addition, implicit solvation models, PCM or SMD, were applied to the optimizations of B3LYP, B3LYP-D3, and M06-2X because they exhibited the best accuracy for gas phase calculations. Finally, several computationally economical methods, such as HF, PBE, BLYP, and two semi-empirical methods, PM7 and AM1, were also investigated. Data in Table 5 show that the choice of geometry optimization method led to similar accuracy, which validates the choice of using a relatively low level of theory for the bulk of this benchmarking study. B3LYP performed better than M06-2X and ωB97X-D. Inclusion of dispersion correction resulted in a slight improvement in accuracy, at the expense of a negligible increase in computational time. Thus, B3LYP-D3 was used rather than B3LYP, with the additional benefit that the impact of the D3 correction should also enhance the accuracy of the energy prediction for Boltzmann-weighting. Adding an implicit solvation model also resulted in a moderate improvement in accuracy, with the PCM model slightly outperforming SMD. The most accurate predictions for carbon chemical shifts were found when using PCM-B3LYP-D3 with the 6-311G(d,p) basis set, while proton chemical shifts were very slightly less accurate with that basis set compared to 6-31G(d,p).
Finally, the impact of the solvent model on the accuracy of chemical shifts was studied. Three implicit solvation models implemented in Gaussian 16 were tested with the DELTA50 set using either ωB97X-D/def2-SVP for δ C or WP04/6-311++G(2d,p) for δ H . The integral equation formalism (IEF) version of the polarizable continuum model (PCM) is the recommended (default) model in Gaussian 16 [108]. The SMD model of Truhlar et al. [109] is a revised version of the PCM model that was developed specifically for reproducing solvation energies. Finally, the polarizable conductor calculation model, CPCM, [110,111] was also evaluated. Results in Table 6 show a marked improvement with the use of any solvation model but only a slight improvement in accuracy when using PCM versus CPCM or SMD.
Based on the totality of the previous results, the best performing density functional for carbon chemical shift prediction was found to be ωB97X-D when paired with the def2-SVP basis set. For proton chemical shift predictions, the WP04 functional exhibited the lowest error when combined with the 6-311++G(2df,p) basis set, but calculation times were unreasonably long. In contrast, the smaller 6-311++G(2d,p) basis set gave nearly comparable accuracy at a six-fold reduced computational cost. For both proton and carbon chemical shift predictions, the GIAO method was most accurate at these basis set sizes. Molecular geometries should be optimized at the B3LYP-D3/6-311G(d,p) level. Implicit solvent effects from the PCM model should be included at all stages of the calculation. The best performing models were δ H : GIAO-PCM-WP04/6-311++G(2d,p)//PCM-B3LYP-D3/6-311G(d,p) and δ C : GIAO-PCM-ωB97X-D/def2-SVP//PCM-B3LYP-D3/6-311G(d,p). When it is necessary to reduce calculation times for larger molecules, the geometry optimization step can be changed to B3LYP/6-31G(d) with a moderate reduction in accuracy for carbon chemical shift predictions while maintaining the same level of accuracy for protons after changing the basis set to jul-cc-pVDZ for proton NMR calculations. Importantly, dispersion corrections should still be used for electronic energy calculations with this faster method. In cases where dispersive interactions may be critical to the optimized geometry, such as inclusion of explicit solvent and studies of organometallic complexes, then the highaccuracy method should be used. The recommended methods and linear scaling factors are listed in Table 7.    a The "fine" integration grid will also improve calculation speed with a negligible impact on accuracy at this level of theory (note: a larger grid is needed for meta-GGAs [112]). b slope = m, y-intercept = b; to be used in Equation (1).

Probe Set Evaluation
Next, the performance of each method from Table 7 was evaluated for chemical shift predictions of small-to medium-sized complex synthetic organic compounds and natural products. In this fashion, DELTA50 can be considered a training set for generation of linear scaling factors or optimization of new empirical density functionals, while the series of complex structures and natural products represents a probe set. Twenty relatively rigid, well-studied compounds ranging in molecular weights from 96 to 854 g mol −1 with experimental NMR data measured in deuterated chloroform were included in this probe set. These structures and calculation results are listed in Figure 6 and Table 8, respectively. Most of the compounds in the probe set have existing single-crystal X-ray diffraction (SCXRD) data, and/or their structures have been verified via total synthesis. Additionally, for 18 out of 20 compounds, previous DFT chemical shift calculations have been performed. For several cases, multiple high-quality computational NMR studies have been performed (see Supplementary Materials for data from additional reports). This allows for direct comparison to the best performing methods in this study. Also noteworthy is that for most of these compounds, extensive 2D NMR data have been collected, such that the proton and carbon chemical shift assignments are unlikely to be misassigned. A few proton and carbon assignments are still ambiguous, such as the carbons resonating from 40 to 50 ppm of ingenane diterpene 8, and these have been left out of the probe set (further details can be found in the Supplementary Materials).
Several trends are immediately noticeable. As expected, method two generally exhibited the most accurate performance as evident by the lowest RMSD for 14 out of 20 proton chemical shift predictions and 16 out of 20 carbon chemical shift predictions. Moreover, the maximum δ C deviation across all compounds, representing 424 carbon chemical shift predictions, was only 6.5 ppm for a ketone carbon of homodimericin A, which provides a relatively low upper limit upon which putative NMR structure proposals could be called into question following DFT calculations using the prescribed model chemistry in this paper. It should also be noted that in a few cases where existing literature DFT methods were found to outperform method two, such as the δ H predictions of strychnine or the δ C predictions of artemisinin, these were from studies where the DFT procedures were specifically optimized for these classes of compounds rather than being a general-purpose chemical shift model chemistry for organic compounds.
From Figure 6, there also appears to be a spatial relationship between the least accurate predicted proton and carbon chemical shift from method two, which is highlighted by the colored circles. This was unexpected considering that markedly different optimal density functionals and basis set sizes were used for carbon (GIAO-PCM-ωB97X-D/def2-SVP) versus proton [GIAO-PCM-WP04/6-311++G(2d,p)] chemical shift predictions, although they share the same geometry optimization method [PCM-B3LYP-D3/6-311G(d,p)]. In all cases except echinopine B, the least accurately predicted proton is not attached to the least accurately predicted carbon. Rather, while both nuclei are in close spatial proximity, they are separated by approximately 2.5 to 5 Å. This is most pronounced for olefinic carbons that are often overpredicted by approximately 3 to 5 ppm leading to an underprediction, or over-shielding effect, on nearby protons by approximately 0.1 to 0.3 ppm. It is also noteworthy that a few investigators have found that isotropic shielding tensors are particularly sensitive to the molecular geometry [113,114], and this may indicate that future improvements in density functional performance for chemical shift predictions might come indirectly from improvements in the geometry optimization method rather than from the functional used for GIAO shielding tensor calculations.

PEER REVIEW
19 of 28 Figure 6. Probe set structures. Atomic locations of maximum chemical shift deviations compared to experiment for method two calculations from this study are highlighted for δC and δH in magenta and lime green, respectively.
Dissolved solutions of gaseous reagents, cyclopropane and isobutylene (2-methyl propene), were prepared by bubbling them through CDCl 3 and then serially diluting until acceptable NMR spectra were obtained (i.e., chemical shifts were stable upon dilution, indicating no aggregation).
Homodimericin A was provided by Aili Fan (Peking University, Beijing, China). A variety of density functionals, as implemented in Gaussian 16 [88], were evaluated as part of the benchmarking study. The exchange and correlation components of LDAs, GGAs, and meta-GGAs as well as standalone combinations are listed in the Supplementary Materials along with primary references. In addition, the long-range correction method of Hirao and co-workers was also applied to the non-hybrid functionals, BP86, BPW91, N12, TPSS, revTPSS, and M06-L, which were noted in Table 1 with the prefix 'LC'.

Conclusions
Experimental proton and carbon NMR chemical shift data for 50 structurally diverse compounds dissolved in deuterated chloroform were carefully measured in this study, enabling a comprehensive benchmark of DFT methods. Linear scaling factors to convert isotropic shielding tensors to chemical shifts were generated for two recommended methodologies that balance speed and accuracy. These two best performing methods were then evaluated against 20 probe organic compounds and natural products with high accuracy observed particularly when compared to previously reported DFT predictions for proton and carbon chemical shifts of these well-studied structures.
Of particular importance, this chemical shift test set can be used for performance evaluation of newly developed model chemistries as well as future design and optimization of empirical density functionals. Such work will be a focus for future research conducted in our laboratory.