Computational Study of the Electron Spectra of Vapor-Phase Indole and Four Azaindoles

After geometry optimization, the electron spectra of indole and four azaindoles are calculated by density functional theory. Available experimental photoemission and excitation data for indole and 7-azaindole are used to compare with the theoretical values. The results for the other azaindoles are presented as predictions to help the interpretation of experimental spectra when they become available.


Introduction
Indole is a bicyclic aromatic compound consisting of a six-membered benzene ring fused to a five-membered pyrrole ring. Azaindoles are analogous molecules in which the benzene ring is replaced by a pyridine ring. Indole and azaindoles are involved in many biochemical reactions, as evidenced by papers, reviews [1][2][3][4], and books [5,6], in addition to popular internet sources, such as Wikipedia and Encyclopaedia Britannica. While indole and 7-azaindole have been studied by several workers, the other azaindoles have received much less attention. The vibrations of indole were investigated by Collier [9], and by Walden and Wheeler [10]. The experimental rotational constants reported by Suenam et al. [11], Caminati and Bernardo [12], Gruet et al. [13], Nesvadba et al. [14], and Vavra et al. [15] are more useful for the present study because they guide our choice of method for geometry optimization. The published UV absorption spectra [16][17][18][19][20][21][22][23][24][25][26][27][28][29][30][31][32][33][34] often cover different excited states. Serrano-Andres and Roos [22] used results from multi-configuration second-order perturbation theory for complete active space (CASPT2) to compare with the various observations. Similarly, Serrano-Andres and coworkers [33] used CASPT2 to study the excitations of 7-azaindole. We shall bring the comparison more up to date. Finally, the more recent experimental photoelectron spectra of indole [35] will be used to confirm the theoretical methods based on density functional theory (DFT) that we have developed and tested.

Methods
Geometry optimization of indole and the four azaindoles is performed using the Gaussian package [36]. The results are summarized in Table 1. The optimized Cartesian coordinates of the five molecules can be found in the Supplementary Materials.
In our earlier studies, we used the experimentally determined molecular geometry when available; otherwise, the Hartree-Fock method was often used for optimization, and MP2 and CCSD were used occasionally. The typical dependence of calculated core-electron binding energies (CEBEs) on molecular geometry is shown in Appendix A. The molecule formaldehyde is rather rigid, whereas hydrogen peroxide is quite flexible. The results presented in Tables A1 and A2 indicate that the dependence of CEBEs on geometry is quite small in both cases. The results of geometry optimization by B3LYP/6-31G(d) agree with
The methods we use for computation of electron spectra have been presented several times in previous studies. After geometry optimization, we use the Amsterdam Density Functional (ADF) package [37] to calculate the various electron spectra. For vertical ionization energies (VIEs) of valence electrons, we use Method (a) = ∆PBE0(SAOP)/et-pVQZ [38], which means the energy difference calculated with the parameter-free Perdew-Burke-Ernzerhof exchange-correlation functional using the electron density obtained with the exchange-correlation potential ($V_{xc}$) known as statistical averaging of orbital potentials (SAOP). The efficient even-tempered basis set of polarized valence quadruple-zeta (et-pVQZ) Slater-type orbitals [39] is available in the ADF package. This method has been applied to many molecules [40][41][42][43][44][45][46][47][48][49][50]. The results are summarized in Table 2.
The average absolute deviations (AADs) are somewhat arbitrary because they depend on the number of VIEs included. Moreover, some earlier experimental VIEs may be in error by as much as 0.1 eV because of calibration and/or overlapping of bands. In any case, the AADs are less than 0.2 eV for many molecules. Alternative methods giving reliable VIEs are the symmetry-adapted cluster configuration interaction (SAC-CI) method of Nakatsuji [51] and the renormalized partial third-order (P3+) method of Ortiz [52]. Both methods are available in recent versions of the Gaussian package. However, SAC-CI is computationally demanding and does not seem to be preferred by photoelectron spectroscopists. On the other hand, P3+ is limited to outer-valence electrons only, whereas our Method (a) above can handle inner-valence electrons with little or no difficulty.
For reliable prediction of CEBEs, there are several points to consider: (1) The electrons of the core-hole cation are attracted by a shielded nucleus very different from that of the neutral parent, so that the basis set must be flexible enough to account for the difference. The usual basis sets of Gaussian-type orbitals (GTOs) contain a single contraction for the 1s orbital and are not flexible. In our earlier study using contracted GTOs [53], we used exponent scaling factors to solve the problem, in addition to testing the use of correlation-consistent core-valence basis sets. More recently, Bellafont et al. [54][55][56][57][58] used augmented Partridge basis sets of uncontracted GTOs to avoid the difficulty. On the other hand, the problem does not exist when we use Slater-type orbitals (STOs) in the ADF program. More often than not, we use the efficient et-pVQZ basis set [39], which contains double-zeta core basis functions in addition to effectively polarized quadruple-zeta basis functions for valence electrons.
(2) For the prediction of CEBEs, relativistic effects influence the accuracy of the calculated nonrelativistic results. In 1995 [59], we decided to use the formula
$$I_{rel} = I_{nr} + C_{rel}, \quad \text{with} \quad C_{rel} = K\, I_{nr}^{N} \tag{1}$$
to estimate the small relativistic correction to the calculated nonrelativistic CEBEs of C to F. The parameters $K$ and $N$ were obtained by fitting the difference between Pekeris' accurate relativistic and nonrelativistic ionization energies of two-electron ions [60]. For both $C_{rel}$ and $I_{nr}$ in electron volts, $K = 2.198 \times 10^{-7}$ and $N = 2.178$. In 2005, Maruani et al. [61] reported results of Dirac-Fock corrections to the ionization energies of atoms Li to Xe. The allometric fit for the Be to Ne series gave $K = 6.55 \times 10^{-7}$ and $N = 2.0569$. More recently, Bellafont et al. [54][55][56][57][58] also examined relativistic effects by the Dirac-Fock method on B to F atoms, with very different results. Table 3 summarizes all these efforts. The values of $C_{rel}$ from the fit of Maruani et al. [61] are slightly larger than the $C_{rel}$ of our earlier empirical fit [59] and approximately half of those reported by Bellafont et al. [54][55][56][57][58]. It would be ideal if one could apply a relativistic procedure capable of giving accurate results for two-electron ions to the typical molecules B2H6, CH4, NH3, H2O, and HF. Until then, we shall remain consistent and continue to use our $C_{rel}$.
(3) Since the method we developed is based on DFT, there is the question of choice of functional. For CEBEs of B to F [53], we use Method (b), in which PW86 and PW91 are used for the exchange and correlation functionals, respectively. Table 4 compares our results for carbon-containing molecules with those in the recent paper from Illas' laboratory [17] using the TPSS functional. Experimental measurements of CEBEs by X-ray photoelectron spectroscopy require calibration, and reported values may be off by as much as 0.1 eV. As a rule, synchrotron measurements tend to be more reliable. In any case, our AAD of 0.14 eV is remarkably small. The relativistic correction for carbon 1s ionization is 0.05 eV for our method and 0.13 eV for the method using TPSS. Whether or not we add 0.08 eV to our results, our correlation-corrected results for carbon-containing molecules are definitely superior to those using the TPSS functional.
Two other approaches tested by Illas and coworkers do not fare any better: The popular functional B3LYP performs fairly well for CEBEs of N1s but much less well for O1s [55]. Low-order GW approximations did much worse [62].
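The relativistic correction of formula (1) is simple enough to evaluate by hand, but a short sketch makes the comparison between the two parameter sets quoted above easier to reproduce. The code below is only an illustration: the function names and the sample energy of 290 eV are assumptions introduced here, not values taken from the tables of this paper.

```python
# Empirical relativistic correction C_rel = K * I_nr**N (all energies in eV).
# The (K, N) pairs are the ones quoted in the text; everything else is illustrative.
FIT_1995 = (2.198e-7, 2.178)      # fit to Pekeris' two-electron ions [59,60]
FIT_MARUANI = (6.55e-7, 2.0569)   # allometric fit for the Be to Ne series [61]


def c_rel(i_nr_ev, params=FIT_1995):
    """Return the relativistic correction (eV) for a nonrelativistic CEBE (eV)."""
    k, n = params
    return k * i_nr_ev ** n


def i_rel(i_nr_ev, params=FIT_1995):
    """Return the corrected CEBE, I_rel = I_nr + C_rel, in eV."""
    return i_nr_ev + c_rel(i_nr_ev, params)


if __name__ == "__main__":
    sample_c1s = 290.0  # hypothetical nonrelativistic C1s CEBE in eV
    for label, params in (("1995 fit", FIT_1995), ("Maruani fit", FIT_MARUANI)):
        print(f"{label}: C_rel = {c_rel(sample_c1s, params):.3f} eV")
```

For an energy near 290 eV, the first parameter set gives a correction of roughly 0.05 eV, in line with the carbon 1s value quoted above, and the second gives a somewhat larger value.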
For excitation of valence electrons, we use Method (c) = time-dependent density functional theory (TDDFT) with $V_{xc}$ = SAOP.
Basically, TDDFT in the ADF package is a CI calculation with singly excited configurations using DFT ground-state molecular orbitals. For non-relativistic closed-shell molecules, spin and symmetry are conserved. In other words, excited singlets and triplets are computed separately. Excitations to triplet states are omitted when the keyword allowed is included in the input. For highly excited states (approaching Rydberg excitations, for example), the basis set et-pVQZ can be augmented by diffuse functions specially designed for excitation studies [63]. Such an augmented set (called aug-et-pVQZ) was tested on the first 15 excited states of ten closed-shell molecules, and is employed in the present study. The use of TDDFT for visible/UV excitations is less well validated, partly because there are fewer experimental data available for comparison. More often than not, the observed absorption bands are the result of convolution of several close-lying excitations. Some comparisons between TDDFT results and experiment have been reported [63,64].
Two minor extensions are introduced in this study. Firstly, in the early days of X-ray photoelectron spectroscopy (XPS), Gelius [65,66] estimated relative cross-sections with a simple model. Minor refinements were made by Nefedov et al. [67]. The model worked remarkably well in connection with the semiempirical HAM/3 molecular orbital method [68]. The results of using such a model with the HAM/3 method for indole are included in Table 9, to be compared with the experimental XPS of the valence electrons when available.
The second extension is called the shifted meta-Koopmans' theorem, to be used in CEBE calculations. For valence electron ionization of organic and other small molecules, meta-Koopmans' theorem (mKT) means using the negative of the orbital energy from a $V_{xc}$ = SAOP calculation to approximate the ionization energy [69]. However, mKT does not provide reliable absolute CEBEs, although it gives quite reasonable relative CEBEs. When a molecule contains many atoms of the same element (carbon, for example), the core hole may be difficult to localize. In such cases, we can obtain good estimates of the CEBEs by the method of shifted mKT. The shift needed may be obtained in various ways by comparing the mKT value with that from Method (b) outlined above. In this work, we select the simplest choice, namely the shift determined from the lowest CEBE for the element of interest (carbon, for example). In the present study, the shift for indole is found to be 16.67 eV. Alternative choices for the shift are based on the highest CEBE (for carbon, for example), the average of the lowest and highest CEBEs, or the average of all the available CEBEs.
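The shifted mKT procedure described above can be sketched in a few lines. This is a minimal illustration under stated assumptions: the function names, the atom labels, and the numerical values in the demonstration are hypothetical and were chosen only so that the shift reproduces the 16.67 eV quoted for indole. They are not results of this study.

```python
def mkt_shift(mkt, reference, choice="lowest"):
    """Return the shift (eV) that maps mKT estimates onto reference CEBEs.

    mkt:       dict of atom label -> negative of the SAOP core-orbital energy (eV)
    reference: dict of atom label -> CEBE from a more reliable calculation (eV);
               it may cover only some of the atoms in `mkt`
    choice:    "lowest", "highest", "low_high_mean" or "mean"
    """
    # Pair reference and mKT values, ordered by increasing reference CEBE.
    pairs = sorted((reference[a], mkt[a]) for a in reference if a in mkt)
    if not pairs:
        raise ValueError("no atom has both an mKT value and a reference CEBE")
    diffs = [ref - est for ref, est in pairs]
    if choice == "lowest":
        return diffs[0]
    if choice == "highest":
        return diffs[-1]
    if choice == "low_high_mean":
        return 0.5 * (diffs[0] + diffs[-1])
    if choice == "mean":
        return sum(diffs) / len(diffs)
    raise ValueError(f"unknown choice: {choice}")


def shifted_mkt(mkt, reference, choice="lowest"):
    """Apply the chosen uniform shift to every mKT estimate."""
    shift = mkt_shift(mkt, reference, choice)
    return {atom: value + shift for atom, value in mkt.items()}


if __name__ == "__main__":
    # Hypothetical carbon 1s values in eV, for illustration only.
    mkt_values = {"C_a": 273.20, "C_b": 273.95, "C_c": 274.60}
    reference_values = {"C_a": 289.87}  # only the lowest CEBE is available
    for atom, cebe in shifted_mkt(mkt_values, reference_values).items():
        print(f"{atom}: {cebe:.2f} eV")
```

A single uniform shift preserves the relative CEBEs given by mKT, which is the property the method relies on. The alternative choices differ only in which reference values fix the absolute scale.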
Since the non-resonant Kα X-ray emission spectra (XES) are so simple to compute [70], we calculate the predicted XES for the nitrogen cores of indole and the four azaindoles. As there are so many carbon atoms in indole and azaindoles, the X-ray emission spectra are expected to be hopeless to untangle and are therefore not calculated.

Results and Discussion
The results of our calculations can of course be directly compared with experimental measurements of electron spectra. Correlation of CEBEs with Hammett substitution constants [71] has been demonstrated, but is limited to aromatic substitution reactions. In addition, Thomas et al. [72] suggested that CEBEs are related to chemical properties such as electronegativity, acidity, basicity, proton affinities, reactivity, and regioselectivity of reactions. For example, Saethre et al. [73] found a statistical correlation between CEBEs and activation energies and regioselectivity for the Markovnikov addition of the electrophiles HX (X = F, Cl, Br, and I) to the alkenes ethene, propene, and 2-methylpropene. However, comparison with general experimental reactivity is much more difficult. In general, reactivity is affected by a combination of several physical properties of the reactants. For example, it was suggested that the biological activity of non-steroidal anti-inflammatory drugs (NSAIDs) depends on a synergistic collective action of several factors, including the ionization energies of molecular orbitals localized mainly on the group responsible for bridging to the receptor [30].
The result for indole is displayed in Figure 1. Our DFT results for the four lowest excited singlet states agree with the previous CASPT2 results reasonably well. For the higher singlet states, our DFT method appears to be more reliable. The same method and basis set are then used to predict the excited states of 4-azaindole, 5-azaindole, and 6-azaindole. Hopefully, the results, summarized in Table 8, will be useful to experimental chemists. For indole and the four azaindoles, π-type excitations have low intensities.
Table footnotes: c Hassan and Hollas [31]. d Ilich, in Ar matrix [21]. e Sukhodola [32]. f Serrano-Andres and Borin [27]. g Serrano-Andres et al. [33]. h Ten et al. [34]. i This work: TDDFT using $V_{xc}$ = SAOP/aug-et-pVQZ.
The results for the photoelectron spectrum of indole(g) are summarized in Table 9. Plekan et al. [35] used the renormalized partial third-order method (P3+) for the ionization energies of outer-valence electrons and ∆B3LYP for core electrons. Besides including inner-valence electrons, our DFT results appear to be more reliable (except for the lowest VIE) than those of Plekan et al., especially for core electrons.
The same procedures are applied to the azaindoles and the results are shown in Table 10. The iterations for four valence cations fail to converge. For the purpose of computing X-ray emission spectra, the ionization energies of those four non-convergent cases are estimated.
Finally, the Kα X-ray emission spectra for decay of N1s core holes are predicted for indole and the four azaindoles. The results, summarized in Tables 11 and 12, indicate that the ∆E values are fairly similar, with many transitions between 374 and 398 eV. However, the f-values (even though approximate) help to distinguish the spectra from one another. The f-values have been calculated with Kohn-Sham orbitals from $V_{xc}$ = SAOP calculations. In atomic units, the energy difference ∆E enters the formula for the f-value,
$$f = \tfrac{2}{3}\,\Delta E\,|\mu|^{2},$$
where µ is the transition dipole moment. The error in ∆E amounts to about 5% (20 eV in 400 eV) for nitrogen. Therefore, the f-values listed are hardly affected except for the most intense transitions, which have been highlighted with boldface type in Tables 11 and 12.
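The sensitivity of the f-values to an error in ∆E can be checked with a small sketch. It assumes the standard dipole-length expression for the oscillator strength in atomic units, f = (2/3) ∆E |µ|², which is presumably the formula the paragraph above refers to. The function names and the sample transition dipole of 0.1 a.u. are hypothetical.

```python
HARTREE_EV = 27.211386  # electron volts per hartree


def oscillator_strength(delta_e_ev, mu_au):
    """Return f = (2/3) * dE * |mu|^2, with dE given in eV and mu in atomic units."""
    delta_e_au = delta_e_ev / HARTREE_EV  # convert the transition energy to hartree
    return (2.0 / 3.0) * delta_e_au * abs(mu_au) ** 2


def relative_f_error(delta_e_ev, delta_e_error_ev):
    """Relative error in f caused by an error in dE alone (f is linear in dE)."""
    return delta_e_error_ev / delta_e_ev


if __name__ == "__main__":
    mu = 0.1  # hypothetical transition dipole moment in atomic units
    f_low = oscillator_strength(400.0, mu)
    f_high = oscillator_strength(420.0, mu)
    print(f"f(400 eV) = {f_low:.5f}, f(420 eV) = {f_high:.5f}")
    print(f"relative change = {relative_f_error(400.0, 20.0):.0%}")
```

Because f is linear in ∆E, a 5% error in the transition energy translates into a 5% error in f, which only matters in absolute terms for the most intense transitions.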

Summary
In this work, we have computed the various electron spectra of gas-phase indole and four azaindoles. The spectra include UV absorption, valence ionization, core ionization, and X-ray emission. The available experimental data on indole and 7-azaindole allow the comparison of theory with experiment and support the reliability of the predicted spectra. Experimentalists are therefore encouraged to measure the unknown spectra.

Acknowledgments: The author is grateful for the continuing support of Scientific Computing and Modeling (Amsterdam).

Conflicts of Interest:
The authors declare no conflict of interest.

Appendix A
The geometries of both methanal and hydrogen peroxide were optimized by popular methods using various basis sets. The results of such test calculations are presented in Tables A1 and A2. It can be seen that the dependence of the calculated CEBEs on geometry is quite small in both cases.
Table A1. Summary of DFT calculations on H2CO: bond lengths in angstroms, angles in degrees, rotational constants and their average absolute deviation from experiment in MHz, and core-electron binding energies in eV. Bold-face type indicates best agreement with the experiment.