Quantitative Structure–Property Relationship (QSPR) Models for a Local Quantum Descriptor: Investigation of the 4- and 3-Substituted-Cinnamic Acid Esterification

In this work, the theoretical description of the 4- and 3-substituted-cinnamic acid esterification with different electron donating and electron withdrawing groups was performed at the B3LYP and M06-2X levels, as a two-step process: the O-protonation and the nucleophile attack by ethanol. In parallel, an experimental work devoted to the synthesis and characterization of the substituted-cinnamate esters has also been performed. In order to quantify the substituents effects, quantitative structure–property relationship (QSPR) models based on the atomic charges, Fukui functions and the Frontier Effective-for-Reaction Molecular Orbitals (FERMO) energies were investigated. In fact, the Fukui functions, ƒ+C and ƒ−O, indicated poor correlations for each individual step, and in contrast with the general literature, the O-protonation step is affected both by the FERMO energies and the O-charges of the carbonyl group. Since the process was shown to not be totally described by either charge- or frontier-orbitals, it is proposed to be frontier-charge-miscere controlled. Moreover, the observed trend for the experimental reaction yields suggests that the electron withdrawing groups favor the reaction and the same was observed for Step 2, which can thus be pointed out as the determining step.


Introduction
Esters are compounds of natural or synthetic source, and they can be found in several materials, being extensively used in food industries, as constituents of some important flavor compounds. They are also found in honeys, flowers, fruits, and in fermented beverages, such as wine and beer. The cinnamates, ester derivatives of the cinnamic acid, besides acting as flavorings agents, can also be used as antioxidants, antifungal, anti-rheumatic, and even as inhibitions of the protein kinase C, a target for cancer treatment [1][2][3][4]. They are also widely used in the formulations of ultra radiation B, UVB (280-320 nm) and ultra radiation A, UVA (320-400 nm) absorbers [5]. Esters can be obtained by the Fisher esterification method, which is an acylation of alcohols by acid-catalyzed reaction with carboxylic acid. Recently, this method was described as one responsible reaction for formation of methyl formate in interstellar clouds [6]. The reactions are understood by their mechanistic aspects, and it demonstrates the selectivity and reactivity of the process. These can be expressed as charges, highest occupied molecular orbital (HOMO) and unoccupied molecular orbital (LUMO) energies, (HOMO-LUMO gap), Fukui function, and FERMOS (Frontier Effective-for-Reaction Molecular Orbitals) quantum descriptors [7]. Due to the importance of the cinnamates, and considering the interest both in studying the electronic effects arising from different substituents in cinnamoyl moiety [8,9] and in investigating the fundamental aspects that can elucidate the dynamic of chemical reactions [10,11], we report here theoretical-experimental results with the aim of describing a selection of quantum descriptors for the local hardness of carbonyl in the 4-and 3-substituted-cinnamic acid esterification being the substituents effects measured by quantitative structure-property relationship (QSPR). In order to assess such quantum descriptors, theoretical calculations have been performed at the density functional theory (DFT) level for a series of ethyl 4and 3-substituted-cinnamates. The esterification process was theoretically described by two steps, the O-protonation, Step 1, and the nucleophile attack by ethanol, Step 2, as suggested [12]. A general scheme for the two steps is shown in Figure 1. Moreover, the synthesis of the same compounds was conducted in order to provide experimental parameters for comparison with the theoretical results.  Steps for acid-catalyzed 4-or 3-X-cinnamic acid esterification.

Theoretical Background and Computational Details
The effects of the substituents in a chemical reaction have always been of great interest of chemists. A quantitative treatment of such effects has been described by the pioneering work of Hammett in 1937. The author proposed a linear free-energy relationship, represented by the equation below [12][13][14].
log log In Equation (1), KX is the ionization equilibrium constant for a substituted benzoic acid and KH is the ionization equilibrium constant for benzoic acid. The Hammett substituent constants values have been employed for the understanding of several organic reactions and their related mechanisms [14][15][16]. In fact, the understanding of chemical reactions mechanisms from the microscopic point of view has been a great challenge for chemistry researchers. In this context, Lewis proposed that most of the chemical reactions can be described as an acid-base process, being the acids, electron-pair acceptors and bases the electron-pair donors [17]. In 1963, Pearson introduced the hard and soft acid-base (HSAB) concept, which conveniently divided acids and bases into the following categories: hard, soft and borderline. In this approach, hard acids preferentially react with hard bases, whereas soft acids preferentially react with soft bases [18]. Although the HSAB concept has received great attention in the chemical community, explaining not only inorganic but also organic reactions, the lack of a quantitative description for the theory resulted in a lot of criticism [19][20][21]. In 1968, Klopman [22] affirmed that if the |EHOMO − ELUMO| ~ 0, the interaction between orbitals becomes predominant, being this reaction referred as a frontier-controlled process, whereas if |EHOMO − ELUMO| >> 0, the electron is transferred, and this reaction is referred as a charge-controlled process. In 1983, Parr demonstrated that every chemical system can be associated to the so-called electronic-chemical potential, defining also the chemical hardness according to the density functional theory (DFT), thus the HSAB concept reemerged. Using the molecular orbitals (MO) energies, the larger the energy gap between ELUMO − EHOMO, the harder is the species, being the hardness defined from Equation (2) [23,24].
In 2007, Anderson and coworkers, working out all these concepts, suggested that some reactions are in an interface, neither totally charge-nor totally frontier-orbital-controlled [25]. Even with the increasing computational capabilities and all the effort devoted for the comprehension of the chemical reactions, the HOMO-LUMO approach cannot explain certain reactions, mainly those involving ambidentate molecules. The local hardness and softness concept can also be represented by the Fukui Functions (ƒ(r)), which is formally defined as the partial derivative of the chemical potential, μ, with respect to an external potential, ν(r), at a constant number of electrons (N) [26]: On the basis of the discontinuity of the ƒ(r) versus N curve, three types of Fukui functions can be defined: the f + k and f − k functions, which account for nucleophilic and electrophilic attacks, respectively: (5) and the f 0 k function: which accounts for the homolytic attacks. In these equations, qk is the gross charge of atom k in the molecule and N is the number of electrons. It must be highlighted that only Equations (4) and (5) are of interest in this work. Within these definitions, several reactions could be explained [27]. Some researchers have considered that the Fukui function represents a good measurement of the chemical hardness [28], and others have demonstrated that the protonation sites can be estimated from the investigation of this function [29,30]. Melin et al. demonstrated that for hard-hard interaction, the atomic charges were the most appropriate descriptor for the protonation reaction of hydroxylamine and some amino acids, performing even better than the Fukui function [31]. An alternative way to describe local hardness was proposed by Silva et al. who introduced the Frontier Effective-for-Reaction Molecular Orbital (FERMO) concept [32], that despite some contestations [33,34], has been showed to perform satisfactorily for the global understanding of ambidentate species selectivity such as SCN − , NO2 − , CH3OCH2 − and N,N-dimethylsulfoxide (DMSO). Within the FERMO concept, the local hardness is redefined, as [35]: In Equation (7), the FERMO corresponds to a HOMO-X, a given occupied molecular orbital showing the greatest contribution to compose the reactive center [36]. All these cited reactivity indexes can bring thermodynamic as much as kinetics considerations [37].

Synthesis and Characterization
1 H-NMR chemical shifts were determined at room temperature on a Bruker AC (200 or 400 MHz) spectrometer in CDCl3 or DMSO-d6 with TMS as internal standard. The 13 C spectra were determined on a Varian EM 360L (100 MHz) spectrometer. Instrumental conditions were such as follow: relax delay 1 s, pulse 45°, and data process FT size 65536, spectral width 6398.0 Hz. The reaction progress was measured by gas chromatography on a GC-Shimadzu-14B equipped with a medium polar column (Shimadzu CBP5-25m) using H2 as the carrier gas, with a flame ionization detector set at 280 °C, an injector set at 270 °C and a column set to a temperature range from 50 to 250 °C (10 °C/min). Infrared spectra were recorded on a Perkin Elmer FT 16-PC and the most intense or representative bands are reported in cm −1 . Melting points were determined on a Microquímica model APF 301 apparatus and are uncorrected.
A series of 4-or 3-substituted ethyl-cinnamates was prepared by conventional method [38], using 2 mmol of para and meta substituted cinnamic acids refluxed in absolute ethanol (15 mL) and sulfuric acid (0.1 mL) for 30 h. The ester formation was monitored by GC and TLC (hexane:acetate 9:1 as eluent). The products were isolated after washing with aqueous sodium bicarbonate solution and solvent extraction with dichloromethane. After work up and solvent evaporation, most of the pure esters were obtained as an oil or a solid. All compounds were characterized by 1 H and 13 C-NMR spectroscopy, in agreement with the literature [39][40][41]. The spectroscopic and characterization information of all compounds is given as Electronic Supplementary Information.

Computational Methods
In order to assess the local quantum descriptors described above, quantum mechanical calculations have been performed for the same set of ethyl 4-or 3-substituted-cinnamates with different electron donating (EDGs) and electron withdrawing (EWGs) groups. Geometry optimizations have been performed at the B3LYP [42] level adopting the 6-31+G(d,p) basis set. Scan calculations over some dihedral angles have been conducted in order to guarantee that these geometries correspond to the most stable conformers. The characterization as a minimum energy geometry has also been done by calculating and inspection of the vibrational frequencies, at the same level. The B3LYP functional is probably the most popular and worldwide used functional. Its application for assessing reaction mechanisms, however, has been discussed and the M06-2X functional, a meta exchange-correlation functional proposed by Truhlar and coworkers [43], has been proven much more reliable for investigations on reaction mechanisms. Therefore, additional M06-2X/6-31+G(d,p) calculations for geometry optimizations and vibrational frequencies were performed. It must be noticed that, due to the size of the systems, calculations at a more robust theoretical level, as CCSD(T), are unfeasible.
The orbital eigenvalues have been obtained through single point calculations over the B3LYP/6-31+G(d,p) optimized geometries, at the restricted hartree-fock level adopting the 6-311++G(2d,2p) basis set. The more flexible basis set was adopted in an attempt to obtain improved orbital (HOMO, LUMO and the FERMO, HOMO-x1 and HOMO-x2) energy values. Local hardness parameters for the 4-or 3-X-cinnamic acid were calculated according to da Silva et al. [32,35]: In a similar way, the local hardness parameters for the protonated 4-or 3-X-cinnamic acid were calculated as: The global hardness parameters were also calculated. In order to evaluate the ƒ − O and ƒ + C Fukui functions, single point calculations have been performed at B3LYP and M06-2X levels adopting the 6-311++G(2d,2p) basis set, over the optimized geometry, located at corresponding theoretical level along with the 6-31+G(d,p) basis set. In order to measure the charges for the (N + 1) and (N − 1) species, similar single point calculations were performed over the previously optimized geometries, changing the charge number. The atomic charges were obtained from Natural Population Analysis phase of NBO calculations. Using the atomic charges over the oxygen and carbon atoms connected to the carbonyl group, Fukui functions were evaluated according to Equations (4) and (5).
Standard thermochemical properties have been obtained by conventional relations [44], adopting the ideal gas, rigid rotor and harmonic oscillator models. All theoretical calculations have been performed for the isolated systems, using Gaussian program [45].

Statistics Analysis
The models were obtained by linear regression analyses using the Build QSAR program to determine the parameters. Throughout this paper, n is number of data points, r is the correlation coefficient, Sd is the standard error, and F is Fisher value for the statistical significance [46].

Theoretical Calculations
For a complete theoretical description of the esterification reaction, the molecular geometries for reactants, products and intermediates were optimized and, as expected, all vibrational frequencies were determined as real values, characterizing all the geometries as minima. The resulting geometries and other molecular properties are given as Electronic Supplementary Information. Moreover, as stated above, potential energy curves were calculated in order to guarantee that these geometries correspond to the most stable conformation of each molecule and the results obtained for the cinnamic acid, protonated cinnamic acid and ethyl cinnamate are also given as Electronic Supplementary Information. Only those obtained for cinnamic acid, protonated cinnamic acid and ethyl cinnamate were reported, since similar behavior was found for the compounds. As it can be noted from the potential curves, cinnamic acids and protonated cinnamic acids are most stable at the planar conformation. Concerning the ethyl cinnamates, B3LYP predicts stable conformers showing the carbon atoms of the ethyl group at the same plane as the aromatic ring. This situation is avoided at the M06-2X level, and the terminal CH3 group is found with a dihedral angle (CCOC) of nearly 90°. Except for the CH3 group (and H atoms at ethyl CH2 group), all atoms lie at the same plane. The theoretical calculations were performed considering isolated systems, although the reactions were experimentally conducted in the presence of the solvent. Nevertheless, it has been shown that negligible contribution is observed by performing calculations considering a solvent model, in comparison with the gas phase results [48]. The standard enthalpy differences were obtained for each step (ΔH1 and ΔH2, respectively). The Frontier Effective-for-Reaction Molecular Orbitals (FERMOs), shown in Figure 2, have been chosen as the molecular orbitals with the major contributions from the electron density on the carbonyl group atoms, thus, to the reactive center.

Step 1: O-Protonation for Acid-Catalyzed 4-or 3-X-Cinnamic Acid Esterification
The substituent effects in the O-protonation step were observed by the correlations found between Hammett constants (σp) and O-charge (r 2 = 0.96, F = 197, for the B3LYP data and r 2 = 0.96, F = 178, for the M06-2X data) and by correlation between σp and ΔH1 (r 2 = 0.96, F = 184, for the B3LYP data and r 2 = 0.97, F = 231, for the M06-2X data). The atomic charges were, in particular, good local descriptors also showing good correlation with the ΔH1 values: r 2 = 0.99 and 0.98 (B3LYP and M06-2X, with F values 775 and 512, respectively), suggesting that the first step is favored by the EDGs. These data are still corroborated by the good correlation found between σp and ΔH1. In fact, it may be expected that the EDGs increase the O-charge, favoring this reaction step. Despite some authors have showed the Fukui functions are good descriptors for the protonation sites [29,30], the ƒ − O and the local hardness showed poor correlation with the O-protonation enthalpy variation (r 2 = 0.45, F = 6.53 and r 2 = 0.61, F = 12.69, for the B3LYP and M06-2X results, respectively). These results are corroborated with the conclusions of Melin et al. [31]. Moreover, no QSPR-Models could be established for the local hardness, η 1 and η 2 . The FERMOs energies EHOMO-x1 and EHOMO-x2, as well as EHOMO, however, well correlate with the first step enthalpy (ΔH1). The statistical correlation parameters obtained for ΔH1 versus EHOMO were: r 2 = 0.99 and F = 967 (B3LYP) and r 2 = 0.99 and F = 675 (M06-2X). For ΔH1 versus EHOMO-x1, the following statistical correlation parameters have been observed: r 2 = 0.97 and F = 223 (B3LYP) and r 2 = 0.97 and F = 269 (M06-2X), while for ΔH1 versus EHOMO-x2, r 2 = 0.95 and F = 153 (B3LYP) and r 2 = 0.95 and F = 141 (M06-2X) were found. Weak correlation was observed for ΔH1 versus ELUMO. Statistical parameters for the several possible correlations are summarized in Table 3. These relations, obtained at the M06-2X level, can be observed in Figure 3. B3LYP data are here omitted, since no significant changes from the M06-2X data were observed.  Table 1. Calculated properties for the 4-or 3-X-cinnamic acids: Hammett substituent constants (σp), molecular orbital energies (in hartrees), hardness parameters (η, η 1 and η 2 , in hartrees), standard enthalpy variation (first step, in kcal/mol), atomic charges and ƒ − O Fukui function. B3LYP and M06-2X results are both given for comparison.  Table 2. Calculated properties for the protonated 4-or 3-X-cinnamic acids: Hammett substituent constants (σp), molecular orbital energies (in hartrees), hardness parameters (η, η 3 and η 4 , in hartrees), standard enthalpy variation (second step, in kcal/mol), atomic charges and ƒ + C Fukui function. B3LYP and M06-2X results are both given for comparison.   Table 3. Correlations analysis between local properties and enthalpy variation for 4-or 3-X-cinnamic acid protonation (Step 1), obtained from B3LYP and M06-2X results. These results revealed that for Step 1, the frontier orbitals are as important as the charges. The importance of the frontier orbitals for the reaction process control has been revealed by the QSPR-models obtained for EHOMO, EHOMO-x1 and EHOMO-x2, whilst the good correlations with the reaction thermochemical property, ΔH1, suggested that the charges are also good reaction descriptors. In this way, the first step cannot be considered neither totally charge-nor totally frontier-orbital-controlled, and in the lack of an appropriate term, we refer to Step 1 as a frontier-charge-miscere controlled reaction.

Step 2: The Nucleophilic Attack by Ethanol for Acid-Catalysed 4-or 3-X-Cinnamic Acid Esterification
The substituents effects in the Step 2 were also assessed by the possible correlation among the quantum descriptors. It was found that the Hammett constants (σp) correlates well with both C-charge (r 2 = 0.87, F = 55, B3LYP and r 2 = 0.88, F = 59, M06-2X) and ΔH2 (r 2 = 0.96, F = 189, B3LYP and r 2 = 0.97, F = 237, M06-2X). It was also observed, as expected, that the EWGs increase the C-charge, consequently favoring this step, in the contrast to the Step 1. In fact, the C-charges were proved to be good quantum descriptor for the nucleophilic step, showing good correlation with the ΔH2 values (r 2 = 0.93, F = 110, B3LYP and r 2 = 0.94, F = 115, M06-2X). Strong correlations were observed between the occupied molecular orbital energies and ΔH2 values, as observed from the statistical correlation parameters: r 2 = 0.94, F = 123, r 2 = 0.98, F = 491 and r 2 = 1.00, F = 1876 for ΔH1 versus EHOMO, ΔH1 versus EHOMO-x1 and ΔH1 versus EHOMO-x2, respectively (B3LYP data), and r 2 = 0.92, F = 98, r 2 = 0.98, F = 419 and r 2 = 0.99, F = 1035 for ΔH1 versus EHOMO, ΔH1 versus EHOMO-x1 and ΔH1 versus EHOMO-x2, respectively (M06-2X data). Moreover, the lowest unoccupied molecular orbital energy, ELUMO, strongly correlates with ΔH2 values, (r 2 = 0.98, F = 404, B3LYP and r 2 = 0.98, F = 504, M06-2X). Neither the local hardness parameters (η 3 and η 4 ) nor the Fukui function (ƒ + C) were shown to be good descriptors for this reaction step. Statistical parameters are summarized in Table 4 and graph correlations, obtained at the M06-2X level, can be observed in Figure 4. In Step 2, as also noted in Step 1, strong correlations were between the ΔH values and C-charges and between the ΔH values FERMO energies, but not with Fukui Function or hardness parameters, suggesting that this step is also governed by electrostatic interactions, as well as by frontier orbitals. Therefore, the whole process should be better described as a frontier-charge-miscere controlled reaction.

Global Reaction: Theoretical and Experimental Data Compared
As no significant differences between B3LYP and M06-2x results for Steps 1 and 2 were observed, only correlations from B3LYP data will be discussed in this topic. From the data in Table 5, a good correlation between yield (%) and Hammett constants can be observed (entry 1, r 2 = 0.89, F = 64.52), as well as between Hammett constants and global enthalpy (entry 3, r 2 = 0.96, F = 189.96), demonstrating that the yield is favored by withdrawing groups. As expected, correlation between yield and global enthalpy (entry 2, r 2 = 0.88, F = 53.91) can be noted with one outlier, 4-hydroxy-cinnamic acid (9). Same tendency (negative slope) was observed between the correlation of the second step enthalpy and yield (entry 5, r 2 = 0.91, F = 74.68), however, the opposite (positive slope) was observed for correlation between yield and first step enthalpy (entry 4, r 2 = 0.91, F = 74.59). There is a strong correlation between ELUMO and yield (entry 9, r 2 = 0.94, F = 112.08), which is, interestingly, a more expressive correlation than that observed for the C-charge (entry 11, r 2 = 0.80, F = 4.816). Since the correlation with the ELUMO is expected for a nucleophilic substitution, the second step is suggested as the most important for the whole process. Once again, as occurred in Steps 1 and 2, poor correlations utilizing the Fukui function (ƒ − O) and hardness parameters (η, η 1 , η 2 , η 3 , and, η 4 ) were observed (entry 12-18), however, good correlations among orbitals energies (EHOMO-x2, EHOMO-x1 and EHOMO) can be seen in Table 5 (entry 6, r 2 = 0.88, F = 51.93, entry 7, r 2 = 0.91, F = 76.06, entry 8, r 2 = 0.90, F = 64.10, respectively).   Table 5. Correlations analysis among global properties and enthalpies variations or local properties for the protonated 4-or 3-X-cinnamic acid (PCA) or 4-or 3-X-cinnamic acid (CA), obtained from B3LYP.

Conclusions
The effect of different EWGs and EDGs over the carbonyl was investigated by establishing QSPR-models for each reaction step in the esterification mechanism. It was possible to observe that the EDGs favored the first step (O-protonation), whereas the EWGs contributed to the second step (the ethanol attack). The experimental results suggest that the greatest yields were obtained with EWGs; these yields represent the global process, Step 2, following same trend as the experimental results, suggesting that this step is fundamental for the global esterification reaction. The global hardness could not be pointed out as a good descriptor for the 4-and 3-substituted-cinnamic acid esterification. From the QSPR-models, the O-protonation step for the 4-and 3-substituted-cinnamic acids cannot be considered to be controlled, neither by charges nor by the frontier-orbitals. Thus, a frontier-charge-miscere controlled process is suggested. In this context, the O-protonation reaction depends on the Lewis base and not only hydronium ion.