LCAO Electronic Structure of Nucleic Acid Bases and Other Heterocycles and Transfer Integrals in B-DNA, Including Structural Variability

To describe the molecular electronic structure of nucleic acid bases and other heterocycles, we employ the Linear Combination of Atomic Orbitals (LCAO) method, considering the molecular wave function as a linear combination of all valence orbitals, i.e., 2s, 2px, 2py, 2pz orbitals for C, N, and O atoms and the 1s orbital for H atoms. For the diagonal matrix elements (also known as on-site energies), we introduce a novel parameterization. For the non-diagonal matrix elements referring to neighboring atoms, we employ the Slater–Koster two-center interaction transfer integrals. We use Harrison-type expressions with factors slightly modified relative to the original. We compare our LCAO predictions for the ionization and excitation energies of heterocycles with vertical values obtained at the Ionization Potential Equation of Motion Coupled Cluster with Singles and Doubles (IP-EOMCCSD)/aug-cc-pVDZ level of theory and at the Completely Renormalized Equation of Motion Coupled Cluster with Singles, Doubles, and non-iterative Triples (CR-EOMCCSD(T))/aug-cc-pVDZ level of theory, respectively, as well as with available experimental data. Similarly, we calculate the transfer integrals between subsequent base pairs, to be used for a Tight-Binding (TB) wire model description of charge transfer and transport along ideal or deformed B-DNA. Because we take all valence orbitals into account, we are in a position to treat deviations from planar geometry, e.g., DNA structural variability, a task impossible for the Hückel approach, which uses only 2pz orbitals. We show the effects of structural deformations using a 20mer evolved by Molecular Dynamics.


Introduction
The study of the electronic structure of organic heterocyclic molecules has been of interest for the scientific community for decades, especially since the establishment of investigation methods based on quantum mechanics. This includes the electronic structure and properties of nucleic acid oligomers and polymers, DNA and RNA. The sequence of bases, adenine (A), thymine (T) or uracil (U), guanine (G), cytosine (C), is where genetic information is stored and transferred in all living organisms. The understanding of its electronic structure and charge transfer [1] properties is a crucial issue in biology, involved in functions such as damage and repair, carcinogenesis and mutagenesis [2][3][4], mutations and diseases [5][6][7][8] and is also important for novel applications in nanotechnology [9,10].
The last two decades have witnessed a surge of studies of DNA as the basis for molecular wires and molecular electronics devices/circuits, based on self-assembly and specific base hybridization [11][12][13][14][15]. The prospect of using DNA in materials science stems from exploiting its properties of molecular recognition, assembly, and information processing [11] as well as its ability to transfer or transport charge. Among other theoretical and experimental attempts, the electronic structure of single DNA molecules has been resolved by transverse scanning tunneling spectroscopy and assigned to groups of orbitals originating from the molecular entities, i.e., nucleobases, backbone, counterions [12]. The properties and mechanisms of long-range charge transport in DNA and of DNA-mediated charge transfer have been studied for a long time now [13]. Furthermore, currents in the range of 10-100 pA have been measured in G4-DNA over distances in the range of 10-100 nm [14]. Today, DNA plays an increasingly important role in molecular electronics due to its structural and molecular recognition properties [15].
In this work, we calculate the ionization and excitation energies of nucleic acid bases and similar molecules, as well as of assemblies of DNA bases, using a semi-empirical Linear Combination of Atomic Orbitals (LCAO) method that includes all valence orbitals, with a novel parameterization. Additionally, using this approach, we obtain electronic parameters for charge (electron or hole) transfer along DNA, which can be employed to model electron and hole conductivity. We investigate the electronic structure of the four DNA bases A, T, G, C and of the two Watson-Crick H-bonded pairs A-T and G-C. We focus on the HOMO (Highest Occupied Molecular Orbital) and LUMO (Lowest Unoccupied Molecular Orbital) wave functions and energies. With the new LCAO parameterization, we calculate the transfer matrix elements between stacked base pairs, for all possible combinations between them, for both electrons and holes, aiming at parameterizing a Tight-Binding (TB) wire model. We calculate the transfer matrix elements for ideal geometries, namely for planar bases and base pairs separated by approximately 3.4 Å and twisted by approximately 36°, relative to the double helix growth axis. Our results are compared with published experimental and computational (from first principles and simpler TB models) data for the HOMO and LUMO energies. Finally, the deformed base pairs pruned from several snapshots of a 500 ns Molecular Dynamics (MD) trajectory of a 20mer [16] are used to address the effects of structural variability on the electronic structure and charge transfer properties of B-DNA within the LCAO approach.
The rest of this article is organized as follows. In Section 2, we develop the novel LCAO parameterization that includes all valence orbitals for nucleic acid bases (Section 2.1) and base pairs (Section 2.2). This methodology is not limited to these specific molecular systems but can be applied to similar heterocycles. Next, we obtain the TB parameters that are relevant for a wire model description of charge transfer and transport along B-DNA (Section 2.3). We also describe non-ideal bases and base pairs obtained by MD (Section 2.4). In Section 3, we present our results on ionization and excitation energies of various heterocyclic planar molecules, including isolated DNA bases (Section 3.1). The on-site energies of base pairs and the transfer integrals between stacked base pairs are presented in Section 3.2. We study the effects of structural variability on the electronic structure and charge transfer properties of B-DNA in Section 3.3. Finally, Section 4 contains our overall conclusions.

LCAO with All Valence Orbitals for Nucleic Acid Bases or Similar Molecules
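We consider the state $|\beta\rangle$ of a nucleic acid base, or a similar molecule, as a linear combination of all valence orbital states $|\phi_{i\nu}\rangle$, i.e., 2s, 2p$_x$, 2p$_y$, 2p$_z$ for C, N, and O atoms, and 1s for H atoms:
$$|\beta\rangle = \sum_{\nu=1}^{N} \sum_{i=1}^{I} c_{i\nu} |\phi_{i\nu}\rangle. \tag{1}$$
The index $\nu$ runs over all $N$ atoms of the molecule and the index $i$ runs over all $I$ orbital states of each atom. $|\beta\rangle$ obeys the Schrödinger equation
$$H_B |\beta\rangle = E_B |\beta\rangle, \tag{2}$$
where $H_B$ is the Hamiltonian of the base (or other molecule), with eigenvalues $E_{B,k}$ and eigenvectors $|\beta_k\rangle$. Taking the bracket with $\langle\phi_{j\mu}|$, Equation (2) gives the linear system of equations
$$\sum_{\nu=1}^{N} \sum_{i=1}^{I} \left( H_{B,j\mu i\nu} - E_B S_{j\mu i\nu} \right) c_{i\nu} = 0. \tag{3}$$
The Hamiltonian matrix elements $H_{B,j\mu i\nu}$ are given by
$$H_{B,j\mu i\nu} = \langle\phi_{j\mu}| H_B |\phi_{i\nu}\rangle \tag{4}$$
and the overlap matrix elements are
$$S_{j\mu i\nu} = \langle\phi_{j\mu}|\phi_{i\nu}\rangle. \tag{5}$$
Note that we approximate $S_{j\mu i\nu}$ by $\delta_{j\mu i\nu}$. The system of Equation (3) is solved by numerical diagonalization, giving the eigenenergies $E_{B,k}$ and the eigenvectors
$$|\beta_k\rangle = \sum_{\nu=1}^{N} \sum_{i=1}^{I} c_{i\nu k} |\phi_{i\nu}\rangle. \tag{6}$$
To this end we need the values of the Hamiltonian matrix elements $H_{B,j\mu i\nu}$. For the diagonal matrix elements $H_{B,i\nu i\nu}$ (also known as on-site energies) we use a novel parameterization, namely: $E_{H(1s)} = -13.64$ eV for H 1s orbitals, $E_{C(2s)} = -13.18$ eV for C 2s orbitals, $E_{C(2p)} = -6.70$ eV for C 2p orbitals, $E_{N(2s)} = -14.51$ eV for N 2s orbitals, $E_{N(2p)} = -9.55$ eV for N 2p orbitals, $E_{O(2s)} = -15.03$ eV for O 2s orbitals, $E_{O(2p)} = -11.52$ eV for O 2p orbitals. For the non-diagonal matrix elements $H_{B,j\mu i\nu}$ ($\mu \neq \nu$) referring to neighboring atoms, we use the Slater-Koster two-center interaction transfer integrals [17]
$$V_{ss} = V_{ss\sigma}, \tag{7}$$
$$V_{sx} = \xi_1 V_{sp\sigma}, \tag{8}$$
$$V_{xx} = \xi_1^2 V_{pp\sigma} + (1 - \xi_1^2) V_{pp\pi}, \tag{9}$$
$$V_{xy} = \xi_1 \xi_2 \left( V_{pp\sigma} - V_{pp\pi} \right), \tag{10}$$
with $\xi_1$, $\xi_2$ being the direction cosines of the vector $\mathbf{d}$, which points from atom $i$ to atom $j$.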
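The Slater-Koster rules above can be made concrete with a short Python sketch that assembles the block of matrix elements between the (s, p$_x$, p$_y$, p$_z$) orbitals of two neighboring atoms. This is only an illustration under stated assumptions: the function name `sk_block`, the dictionary keys, and the orbital ordering are hypothetical, and the sign convention for the s-p elements (positive direction cosine when the s orbital sits on the atom where $\mathbf{d}$ starts) is one common choice that may differ from the convention used in the text.

```python
import math

def sk_block(d_vec, V):
    """Return the 4x4 Slater-Koster block between orbitals (s, px, py, pz)
    on atom i (rows) and atom j (columns).

    d_vec : vector from atom i to atom j (any length unit)
    V     : dict with keys 'ss_sigma', 'sp_sigma', 'pp_sigma', 'pp_pi'
            (hypothetical key names; values in eV)
    """
    dist = math.sqrt(sum(c * c for c in d_vec))
    xi = [c / dist for c in d_vec]  # direction cosines of d

    block = [[0.0] * 4 for _ in range(4)]
    block[0][0] = V["ss_sigma"]
    for a in range(3):
        # s on atom i with p on atom j, and the reverse (opposite sign,
        # an assumed convention for the odd parity of p orbitals)
        block[0][a + 1] = xi[a] * V["sp_sigma"]
        block[a + 1][0] = -xi[a] * V["sp_sigma"]
        for b in range(3):
            if a == b:
                block[a + 1][b + 1] = (xi[a] ** 2 * V["pp_sigma"]
                                       + (1.0 - xi[a] ** 2) * V["pp_pi"])
            else:
                block[a + 1][b + 1] = xi[a] * xi[b] * (V["pp_sigma"] - V["pp_pi"])
    return block
```

For a hydrogen atom only the s row (or column) of the block is used. For a bond lying along the z axis, the p$_z$-p$_z$ element reduces to $V_{pp\sigma}$ and the p$_x$-p$_x$ element to $V_{pp\pi}$, which is a quick sanity check of the geometry factors.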
Concerning the values of $V_{ss\sigma}$, $V_{sp\sigma}$, $V_{pp\sigma}$, $V_{pp\pi}$, we use the relevant expressions proposed by Harrison [18,19], of the form
$$V_{\chi} = \chi \frac{\hbar^2}{m d^2}, \tag{11}$$
with $m$ being the electron mass and $d$ being the two-center distance. The $\chi$ values that we propose here are: $\chi_{ss\sigma} = -1.32$, $\chi_{sp\sigma} = -1.42$, $\chi_{pp\pi} = -0.73$ (slightly modified relative to the original Harrison constant), $\chi_{pp\sigma} = 2.22$. For each H orbital, the interactions are multiplied by a factor $b = 0.70$ that resulted from the optimization. We arrived at the above parameterization by fitting the LCAO numerical results to the experimental values for the excitation and ionization energies of the nucleic acid bases A, G, T, C, and U. To do so, we used the Nelder-Mead algorithm as implemented in MATLAB. All other non-diagonal matrix elements, referring to non-neighboring atoms, are assumed equal to zero, $H_{B,j\mu i\nu} = 0$. In Tables 1 and 2 we summarize our LCAO parameters. From the numerical diagonalization of the Hamiltonian matrix, one obtains the energy eigenvalues corresponding to the electronic spectrum of molecular orbitals. The occupied and unoccupied orbitals, and thus the HOMO and LUMO, can be found by counting all valence electrons contributed by the atoms of the molecule and arranging them successively in pairs of opposite spin, in accordance with the Pauli principle. The same treatment developed for DNA bases is applicable to other purines, pyrimidines, and similar molecules.
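As a rough illustration of the two steps described in this paragraph, the following Python sketch evaluates a Harrison-type integral and locates the HOMO and LUMO by electron counting. It assumes the standard Harrison form $V = \chi \hbar^2 / (m d^2)$ with $\hbar^2/m \approx 7.62$ eV Å$^2$ and the $\chi$ and $b$ values quoted above. The function names, the `n_hydrogen` argument (one factor $b$ per participating H orbital, which is our reading of the text), and the zero-based indexing are assumptions made for illustration only.

```python
HBAR2_OVER_M = 7.62  # hbar^2 / m_e in eV * Angstrom^2

# chi factors and hydrogen scaling factor quoted in the text
CHI = {"ss_sigma": -1.32, "sp_sigma": -1.42, "pp_sigma": 2.22, "pp_pi": -0.73}
B_HYDROGEN = 0.70

# valence electrons contributed by each atomic species
VALENCE_ELECTRONS = {"H": 1, "C": 4, "N": 5, "O": 6}

def harrison_v(kind, d, n_hydrogen=0):
    """Harrison-type two-center integral (eV) at distance d (Angstrom).

    n_hydrogen is the number of H orbitals taking part in the interaction;
    each one contributes a factor b (assumed reading of the text).
    """
    return (B_HYDROGEN ** n_hydrogen) * CHI[kind] * HBAR2_OVER_M / d ** 2

def homo_lumo_indices(atoms):
    """Zero-based indices of HOMO and LUMO in the ascending eigenvalue list.

    atoms is a list of element symbols; an even number of valence
    electrons (closed shell) is assumed.
    """
    n_electrons = sum(VALENCE_ELECTRONS[a] for a in atoms)
    homo = n_electrons // 2 - 1  # two electrons per orbital
    return homo, homo + 1

# Example: adenine, C5H5N5, has 50 valence electrons
adenine = ["C"] * 5 + ["H"] * 5 + ["N"] * 5
homo_index, lumo_index = homo_lumo_indices(adenine)
```

With eigenvalues sorted in ascending order, the HOMO of adenine is then the 25th level and the LUMO the 26th, out of the 45 levels produced by its 45 valence orbitals.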

LCAO with All Valence Orbitals for B-DNA Base Pairs
Likewise, we obtain the HOMO and LUMO states of a B-DNA base pair or monomer. Let $N_1$, $N_2$ be the numbers of atoms making up the two bases of the base pair. We consider the base pair or monomer state $|\alpha\rangle$ as a linear combination of all valence orbital states $|\phi_{i\nu}\rangle$, i.e., 2s, 2p$_x$, 2p$_y$, 2p$_z$ for C, N and O atoms and 1s for H atoms:
$$|\alpha\rangle = \sum_{\nu=1}^{N_1+N_2} \sum_{i=1}^{I} c_{i\nu} |\phi_{i\nu}\rangle. \tag{12}$$
The indices $\nu$ and $i$ run over the $N_1 + N_2$ atoms of the base pair and the $I$ orbitals of each atom, respectively. $|\alpha\rangle$ obeys the Schrödinger equation
$$H_A |\alpha\rangle = E_A |\alpha\rangle, \tag{13}$$
where $|\alpha\rangle$ and $E_A$ are the eigenvectors and eigenenergies of the monomer or base pair Hamiltonian $H_A$. Taking the bracket with $\langle\phi_{j\mu}|$, Equation (13) gives the linear system of equations
$$\sum_{\nu=1}^{N_1+N_2} \sum_{i=1}^{I} \left( H_{A,j\mu i\nu} - E_A S_{j\mu i\nu} \right) c_{i\nu} = 0. \tag{14}$$
The system of Equation (14) is also solved by numerical diagonalization, giving the eigenenergies $E_{A,k}$ and the eigenvectors
$$|\alpha_k\rangle = \sum_{\nu=1}^{N_1+N_2} \sum_{i=1}^{I} c_{i\nu k} |\phi_{i\nu}\rangle. \tag{15}$$
In this case, the values of the Hamiltonian matrix elements $H_{A,j\mu i\nu}$ are expressed slightly differently. The matrix elements $H_{A,j\mu i\nu}$ with (a) $1 \le \nu \le N_1$ and $1 \le \mu \le N_1$, and (b) $N_1 + 1 \le \nu \le N_1 + N_2$ and $N_1 + 1 \le \mu \le N_1 + N_2$, are expressed in the same way as previously described for molecules. For the remaining matrix elements, we employ the Slater-Koster two-center interaction transfer integrals of Equations (7), (8), (9), (10), but in this case the values of $V_{ss\sigma}$, $V_{sp\sigma}$, $V_{pp\sigma}$, $V_{pp\pi}$ are given by the exponentially decaying expressions of Equation (16), in which $d_0 = 1.35$ Å is a typical covalent bond distance within a base. This difference stems from the fact that Harrison's relations are valid for interatomic distances of the size of covalent bonds. However, the B-DNA bases (A and T, or G and C) are connected by noncovalent hydrogen bonds to form a base pair. Hydrogen bonds are longer than the typical length $d_0$ of the covalent bonds connecting neighboring atoms within a base. Thus, when dealing with interatomic distances of the size of hydrogen bonds and longer, Harrison's expressions of Equation (11) are replaced with the appropriate exponentially decaying expressions of the form of Equation (16) [20][21][22]. From the aforementioned diagonalization of the Hamiltonian matrix, we obtain the energy eigenvalues $E_A$ (including HOMO and LUMO) of the electronic spectrum, as well as the corresponding eigenvectors (coefficients) $c_{i\nu}$ of a base pair.
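To make the switch from the Harrison $1/d^2$ law to an exponential decay more tangible, here is a small Python sketch. It rests on an assumption that is not stated in the text above: the decay constant is taken as $2/d_0$, the value for which the exponential matches both the magnitude and the slope of the Harrison expression at $d = d_0$. The function names are hypothetical and $\hbar^2/m \approx 7.62$ eV Å$^2$ is used.

```python
import math

HBAR2_OVER_M = 7.62  # hbar^2 / m_e in eV * Angstrom^2
D0 = 1.35            # typical covalent bond length within a base (Angstrom)

def harrison_v(chi, d):
    """Harrison-type integral, intended for covalent-bond distances."""
    return chi * HBAR2_OVER_M / d ** 2

def decaying_v(chi, d, d0=D0):
    """Exponentially decaying integral for hydrogen-bond distances and longer.

    ASSUMPTION: decay constant 2/d0, chosen so that value and first
    derivative agree with harrison_v at d = d0.
    """
    return chi * HBAR2_OVER_M / d0 ** 2 * math.exp(-(2.0 / d0) * (d - d0))

# Compare the two laws for a pp-pi interaction at the stacking distance
chi_pp_pi = -0.73
v_harrison = harrison_v(chi_pp_pi, 3.4)
v_decaying = decaying_v(chi_pp_pi, 3.4)
```

Under this assumption the two expressions coincide at $d_0$, while at 3.4 Å the exponential form gives a markedly smaller magnitude than the $1/d^2$ law, reflecting the weaker coupling between atoms that are not covalently bonded.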

Eigenstates
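The HOMO or LUMO state of a DNA segment made up of $N$ monomers can be expressed as
$$|\mathrm{DNA}\rangle = \sum_{\alpha=1}^{N} v_{\alpha} |\alpha\rangle, \tag{17}$$
where $|\alpha\rangle$ is the HOMO or LUMO state of monomer (base pair) $\alpha$ and $v_{\alpha}$ are time-independent quantities. The Hamiltonian, in second quantization notation, in this TB wire model approach, can be written as
$$\hat{H} = \sum_{\alpha=1}^{N} E_{\alpha} |\alpha\rangle\langle\alpha| + \sum_{\alpha=1}^{N} \sum_{\gamma \neq \alpha} t_{\alpha,\gamma} |\alpha\rangle\langle\gamma|, \tag{18}$$
where $E_{\alpha}$ is the HOMO or LUMO on-site energy of monomer $\alpha$, and $t_{\alpha,\gamma}$ is the transfer integral between monomers $\alpha$ and $\gamma$. By substituting Equations (17) and (18) into the time-independent Schrödinger equation
$$\hat{H} |\mathrm{DNA}\rangle = E |\mathrm{DNA}\rangle, \tag{19}$$
we arrive at a system of $N$ coupled equations
$$E v_{\alpha} = E_{\alpha} v_{\alpha} + \sum_{\gamma \neq \alpha} t_{\alpha,\gamma} v_{\gamma}, \qquad \alpha = 1, 2, \ldots, N. \tag{20}$$
Equation (20) is equivalent to the eigenvalue-eigenvector problem
$$H_{\mathrm{DNA}} \mathbf{v} = E \mathbf{v}, \tag{21}$$
where $H_{\mathrm{DNA}}$ is the Hamiltonian matrix of order $N$ composed of the TB parameters (on-site energies and transfer integrals) and $\mathbf{v}$ is the column vector composed of the coefficients $v_j$. The diagonalization of $H_{\mathrm{DNA}}$ yields the HOMO or LUMO eigenenergy spectra (eigenspectra), $\{E_k\}$, $k = 1, 2, \ldots, N$, and the occupation probabilities for each eigenstate, $|v_{jk}|^2$, where $v_{jk}$ is the $j$-th component of the $k$-th eigenvector.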
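The eigenvalue problem described above can be sketched in a few lines of Python with NumPy. The sketch assumes a wire with real, nearest-neighbor transfer integrals only; the function names and the numerical values in the example are hypothetical and are not taken from the tables of this article.

```python
import numpy as np

def wire_hamiltonian(on_site, hopping):
    """Build the N x N tight-binding matrix of a wire.

    on_site : N on-site energies (eV), HOMO or LUMO of each base pair
    hopping : N-1 transfer integrals (eV) between successive base pairs
              (assumed real, nearest neighbours only)
    """
    h = np.diag(np.asarray(on_site, dtype=float))
    for k, t in enumerate(hopping):
        h[k, k + 1] = t
        h[k + 1, k] = t
    return h

def eigen_spectrum(on_site, hopping):
    """Return eigenenergies E_k (ascending) and probabilities |v_jk|^2."""
    energies, vectors = np.linalg.eigh(wire_hamiltonian(on_site, hopping))
    probabilities = np.abs(vectors) ** 2  # row j: monomer, column k: eigenstate
    return energies, probabilities

# Illustrative dimer of two identical monomers (hypothetical numbers)
energies, probabilities = eigen_spectrum([-8.0, -8.0], [0.1])
```

For two identical monomers the two eigenenergies are split symmetrically around the common on-site energy by the transfer integral, and each eigenstate is shared equally between the two sites.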

Coherent Charge Transfer
To describe charge transfer between stacked base pairs of double-stranded DNA, we suppose that an extra inserted electron travels through LUMOs, while an extra inserted hole travels through HOMOs. The time-dependent HOMO or LUMO state of the whole B-DNA segment, $|\mathrm{DNA}(t)\rangle$, is considered as a linear combination of base-pair HOMO or LUMO states $|\alpha\rangle$ with time-dependent coefficients (Equation (22)), where $|\alpha\rangle$ is the HOMO or LUMO state of the $\alpha$-th monomer and the sum extends over all monomers of the B-DNA segment. Substituting Equations (18) and (22) into the time-dependent Schrödinger equation (Equation (23)), we obtain a system of $N$ coupled differential equations (Equation (24)). Equation (24) is equivalent to a first-order matrix differential equation, which can be solved with the eigenvalue method.
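One possible way to carry out the eigenvalue method mentioned above is sketched below in Python with NumPy: the initial state is projected onto the eigenvectors of the TB matrix, each component acquires a phase $e^{-iE_k t/\hbar}$, and the site probabilities follow from the recombined coefficients. The function name, the choice of femtoseconds as time unit, and the example parameters are assumptions for illustration.

```python
import numpy as np

HBAR = 0.6582119569  # reduced Planck constant in eV * fs

def site_probabilities(h, start_site, times_fs):
    """Probability of finding the carrier at each site as a function of time.

    h          : N x N Hermitian tight-binding matrix (eV)
    start_site : zero-based index of the site where the carrier is placed at t = 0
    times_fs   : sequence of times in femtoseconds
    Returns an array of shape (len(times_fs), N).
    """
    energies, vectors = np.linalg.eigh(h)
    c0 = np.zeros(h.shape[0], dtype=complex)
    c0[start_site] = 1.0
    projections = vectors.conj().T @ c0                # initial state in eigenbasis
    phases = np.exp(-1j * np.outer(times_fs, energies) / HBAR)
    coefficients = (phases * projections) @ vectors.T  # back to the site basis
    return np.abs(coefficients) ** 2

# Illustrative dimer of identical monomers (hypothetical parameters)
t_hop = 0.1                                            # eV
h_dimer = np.array([[-8.0, t_hop], [t_hop, -8.0]])
half_period = np.pi * HBAR / (2.0 * t_hop)             # time of full transfer (fs)
probs = site_probabilities(h_dimer, 0, [0.0, half_period])
```

In this example the carrier starts on the first site and, because the two monomers are identical, is found entirely on the second site after the time `half_period`.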

Coherent Charge Transport
To handle coherent charge transport in a TB approach, we also need the TB parameters (on-site energies and transfer integrals) described above. This can be done, e.g., with a transfer matrix approach [23].

TB Parameters for a Wire Model Description
The TB parameters for a wire model description of charge transfer or transport can be obtained as follows. The transfer integral $t_{\lambda,\lambda'}$ between monomers $|\lambda\rangle$ and $|\lambda'\rangle$ can be expressed as a sum, over the atomic orbitals of the two monomers, of the matrix elements $V_{i\nu j\mu}$ weighted by the corresponding LCAO coefficients. The matrix elements $V_{i\nu j\mu}$ are given by the Slater-Koster two-center interaction transfer integrals of Equations (7)-(10), with the values of $V_{ss\sigma}$, $V_{sp\sigma}$, $V_{pp\sigma}$, $V_{pp\pi}$ being of the form of Equation (16). The tight-binding parameters $E_{\lambda}$ and $t_{\lambda,\lambda'}$ computed in this work can be used to treat charge transfer (Section 2.3.2) and transport (Section 2.3.3) along a B-DNA segment. Finally, we obtain the maximum transfer percentage of the carrier from one base pair to another. This is the maximum probability of finding the extra hole or electron at the site where it was not initially placed. The maximum transfer percentage reads
$$p = \frac{4 t^2}{4 t^2 + \Delta^2}, \tag{28}$$
where $t$ is the transfer parameter between the two base pairs and $\Delta$ is the difference between the HOMO or LUMO energies of the two base pairs.
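The two quantities discussed in this paragraph can be illustrated with a minimal Python sketch. It assumes that the transfer integral is the double sum $\sum c^{*}_{\lambda} V c_{\lambda'}$ over the atomic orbitals of the two monomers and that the maximum transfer percentage takes the standard two-level form $p = 4t^2/(4t^2 + \Delta^2)$. The function names and all numbers are hypothetical.

```python
def transfer_integral(c_left, c_right, v):
    """Transfer integral between two monomer states.

    c_left  : LCAO coefficients of the state on the first monomer
    c_right : LCAO coefficients of the state on the second monomer
    v       : v[a][b] is the two-center matrix element (eV) between
              orbital a of the first and orbital b of the second monomer
    """
    return sum(
        c_left[a].conjugate() * v[a][b] * c_right[b]
        for a in range(len(c_left))
        for b in range(len(c_right))
    )

def max_transfer_percentage(t, delta):
    """Maximum probability of finding the carrier on the other base pair.

    t     : transfer integral between the two base pairs (eV)
    delta : difference of their HOMO (or LUMO) energies (eV)
    """
    return 4.0 * abs(t) ** 2 / (4.0 * abs(t) ** 2 + delta ** 2)

# Identical monomers: complete transfer is possible
p_identical = max_transfer_percentage(0.05, 0.0)
# Energy mismatch equal to twice the transfer integral: half at most
p_mismatched = max_transfer_percentage(0.05, 0.1)
```

The example shows the trend used later in the discussion of structural variability: for a fixed transfer integral, a larger energy difference $\Delta$ between the two base pairs lowers $p$.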
Local complementary base-pair parameters are employed in order to define the base pair structure and its variability. The parameters describing the relative translations in all axes, involving two bases of a Watson-Crick pair, are shear (Sx), stretch (Sy), and stagger (Sz), while the corresponding rotations around x, y, and z axes are buckle (κ), propeller twist (π), and opening (σ) [24]. Figure 1 depicts the definitions of these translation and rotation parameters involving two bases of a Watson-Crick pair.   Table 3.

Heterocyclic Planar Molecules including Nucleic Acid Bases
The theoretical scheme described in Section 2 was employed to calculate the HOMO and LUMO eigenenergies for a variety of heterocyclic planar organic molecules. We make the convenient simplifying assumption that the absolute value of the HOMO energy expresses the ionization energy, and the HOMO-LUMO gap expresses the excitation energy (in most cases the first π-π* transition). Below, the ionization energies are of π molecular orbital character and the excitation energies are π-π* transitions, unless otherwise stated. We studied the following groups of molecules: adenine and isomers; guanine and isomers; purine and isomers; thymine, cytosine, uracil, and isomers; pyrimidine and isomers; and other planar heterocyclic molecules. Table 4 lists our LCAO results along with available experimental data and with the vertical ionization and excitation energies obtained at the Ionization Potential Equation of Motion Coupled Cluster with Singles and Doubles (IP-EOMCCSD)/aug-cc-pVDZ level of theory and at the Completely Renormalized Equation of Motion Coupled Cluster with Singles, Doubles, and non-iterative Triples (CR-EOMCCSD(T))/aug-cc-pVDZ level of theory, respectively, Ref. [29]. Table 4 also includes transition oscillator strengths f that we calculated in a simplistic approximation, considering point contributions of the corresponding orbitals; i.e., the transition dipole moment d between the HOMO state |H⟩ and the LUMO state |L⟩ was approximated by placing the contribution of each atomic orbital at the position of its atom. The oscillator strength f then follows from d and the excitation energy E [30]. The results are illustrated in Figures 3-5.
Figure 3. First π ionization energy and first π-π* excitation energy of purines calculated via our LCAO method using all valence orbitals, along with results at the IP-EOMCCSD/aug-cc-pVDZ (vertical ionization energies) and CR-EOMCCSD(T)/aug-cc-pVDZ (vertical excitation energies) level of theory [29], as well as available experimental data. Different isomers are specified in Table 1.
Figure 4. First π ionization energy and first π-π* excitation energy of pyrimidines calculated via our LCAO method using all valence orbitals, along with results at the IP-EOMCCSD/aug-cc-pVDZ (vertical ionization energies) and CR-EOMCCSD(T)/aug-cc-pVDZ (vertical excitation energies) level of theory [29], as well as available experimental data.
Figure 5. First π ionization energy and first π-π* excitation energy of other planar heterocyclic molecules calculated via our LCAO method using all valence orbitals, along with results calculated at the IP-EOMCCSD/aug-cc-pVDZ (vertical ionization energies) and CR-EOMCCSD(T)/aug-cc-pVDZ (vertical excitation energies) level of theory [29], as well as available experimental data.
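The point-contribution estimate of the oscillator strength described in this subsection can be sketched as follows in Python. The sketch assumes that the transition dipole is $\mathbf{d} \approx -e \sum_{\nu} \mathbf{r}_{\nu} \sum_{i} c^{L*}_{i\nu} c^{H}_{i\nu}$, with $\mathbf{r}_{\nu}$ the position of atom $\nu$, and that the oscillator strength is the textbook expression $f = 2 m E |\mathbf{d}|^2 / (3 \hbar^2 e^2)$. Both the notation and the function names are assumptions; the two-orbital example is purely hypothetical.

```python
import math

HBAR2_OVER_M = 7.62  # hbar^2 / m_e in eV * Angstrom^2

def transition_dipole(c_homo, c_lumo, positions, orbital_atom):
    """Point-contribution transition dipole in units of e * Angstrom.

    c_homo, c_lumo : LCAO coefficients of the HOMO and LUMO, one per orbital
    positions      : list of (x, y, z) atomic positions in Angstrom
    orbital_atom   : orbital_atom[k] is the atom index carrying orbital k
    """
    d = [0.0, 0.0, 0.0]
    for k, atom in enumerate(orbital_atom):
        weight = c_lumo[k].conjugate() * c_homo[k]
        for axis in range(3):
            d[axis] -= weight * positions[atom][axis]  # electron charge is -e
    return d

def oscillator_strength(energy_ev, dipole):
    """Dimensionless oscillator strength for a dipole given in e * Angstrom."""
    d_squared = sum(abs(x) ** 2 for x in dipole)
    return 2.0 * energy_ev * d_squared / (3.0 * HBAR2_OVER_M)

# Hypothetical two-orbital example: bonding HOMO, antibonding LUMO
s = 1.0 / math.sqrt(2.0)
positions = [(-0.5, 0.0, 0.0), (0.5, 0.0, 0.0)]
dipole = transition_dipole([s, s], [s, -s], positions, [0, 1])
f_value = oscillator_strength(4.0, dipole)
```

In this toy case the dipole points along the axis joining the two atoms, and a transition between two orbitals with identical weights and signs on every atom would give zero oscillator strength by orthogonality.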
Regarding the ionization energy, the LCAO results are in very good agreement with both the experimental data and the CC results, although there are some deviations. The Root Mean Square Percentage Error (RMSPE) with respect to the experimental values is 3.65%. Differences in tautomer ionization energies are, as expected, small: 0.12 eV for purine tautomers and 0.01 eV for indazole tautomers. For the excitation energies of the π-π* transition, the RMSPE with respect to the experimental values is 6.49%. Both purine and indazole tautomers have a negligible 0.03 eV difference in their excitation energies. Based on the presented data and reported comments about individual bases, we note that the LCAO method used in this work, though not exact, is capable of producing results in good agreement with experimental data, provided that a suitable set of parameters is chosen. This outcome has motivated the use of the same method for all other systems of interest, whose computational results are presented in the remainder of this article. Vertical ionization energies of nucleic acid bases in the gas phase obtained with different electronic structure methods are, generally, in agreement with our results, cf. Reference [51] and references therein.
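For readers who wish to reproduce such an error measure on their own data, a minimal Python sketch of the RMSPE is given below. It assumes the usual definition, $100 \sqrt{\frac{1}{n} \sum_k \left( (x_k^{\mathrm{calc}} - x_k^{\mathrm{exp}}) / x_k^{\mathrm{exp}} \right)^2}$; the function name and the sample numbers are hypothetical and do not come from Table 4.

```python
import math

def rmspe(calculated, experimental):
    """Root Mean Square Percentage Error of calculated vs. experimental values."""
    if len(calculated) != len(experimental) or not calculated:
        raise ValueError("inputs must be non-empty and of equal length")
    terms = [((c - e) / e) ** 2 for c, e in zip(calculated, experimental)]
    return 100.0 * math.sqrt(sum(terms) / len(terms))

# Hypothetical energies in eV, for illustration only
error_percent = rmspe([11.0, 9.0], [10.0, 10.0])
```

Because each deviation is divided by the experimental value, the measure weights all molecules equally regardless of the absolute size of their ionization or excitation energies.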

B-DNA Base Pairs
In this subsection, we present our results for the B-DNA base pairs. In Table 5, we show the HOMO, LUMO, and HOMO-LUMO gap energies of the two B-DNA base pairs (Adenine (A)-Thymine (T) and Guanine (G)-Cytosine (C)), according to the procedure described in Section 2.3 using LCAO with all valence orbitals, along with the corresponding energies found in Ref. [52] using only 2p$_z$ orbitals. At this point, we should state that the bases making up the base pairs are slightly deformed in comparison to their structure when isolated (cf. Section 3.1), so the corresponding HOMO and LUMO energies for these two cases may differ. Thus, Table 5 also contains the HOMO, LUMO, and HOMO-LUMO gap energies of the distorted bases. The HOMO (LUMO) energies are of π (π*) molecular orbital character and the HOMO-LUMO gap energies are π-π* transitions, unless otherwise stated.
Table 5. HOMO ($E_{LCAO,H}$) and LUMO ($E_{LCAO,L}$) eigenenergies of the base pairs A-T and G-C, obtained in this work using LCAO with all valence orbitals, along with the corresponding HOMO-LUMO energy gaps ($E_{LCAO,g}$) in eV (rows 6 and 7). Rows 2-5 contain the calculated HOMO and LUMO energies of each distorted base making up these base pairs. The third, fifth, and seventh columns list the corresponding energies from Reference [52], where only 2p$_z$ orbitals had been used.
The energy values for the bases are slightly different from those in Table 4, as expected. In addition, based on Table 5, one can infer that the HOMO energy of a particular base pair is very close to the higher of the HOMO energies of its two bases, while the LUMO energy of the base pair is closer to the lower of the two LUMO energies.

In Figures 6 and 7 we represent the occupation probabilities of holes and electrons on each atomic orbital of bases and base pairs, calculating the squared coefficients $|c_{i\nu}|^2$ (cf. Equations (1) and (12)) of the corresponding states (HOMO for holes, LUMO for electrons). We observe that our calculated HOMO state for the base pair A-T (G-C) is localized almost totally on Adenine (Guanine), while the corresponding LUMO wave function is localized on Thymine (Cytosine), in accordance with results from ab initio techniques of References [53,54], which locate the HOMO of a base pair on the purine and the LUMO on the pyrimidine. This is due to the higher HOMO energy of Adenine (Guanine) and the lower LUMO energy of Thymine (Cytosine), and to the large values of these differences compared to the transfer integrals (see Table 6). We calculate the first transition character of A, T, A-T, and G to be π-π*, while C and G-C have π-σ* transition character. We obtain the charge transfer parameters between two successive base pairs by calculating the corresponding overlap integrals from Equation (26). We denote by XY two successive base pairs, X-X$_{compl}$ and Y-Y$_{compl}$. The bases X and Y are located on the same strand in the direction 5′-3′, while X$_{compl}$ and Y$_{compl}$, respectively, are their complementary bases on the other strand. In the most common B-DNA conformation, X-X$_{compl}$ and Y-Y$_{compl}$ are separated by approximately 3.4 Å and twisted by approximately 36°. Table 6 summarizes our LCAO results using all valence orbitals for the transfer parameters, for all possible combinations of successive base pairs and close-to-ideal geometrical conformations. The table also contains comparisons with other methods.
Figure 7. Occupation probabilities $|c_{i\nu}|^2$ (cf. Equation (1)) for the HOMO (left) and LUMO (right) states of the G and C bases within a G-C base pair (top), along with the corresponding probabilities (cf. Equation (12)) for the HOMO and LUMO states of the G-C base pair (bottom).
Table 6. Close-to-ideal geometrical conformations. The absolute values of transfer parameters for all possible combinations of successive base pairs. $|t_{LCAO,H}|$ ($|t_{LCAO,L}|$) of the second (fifth) column refer to hole (electron) transfer parameters obtained from our LCAO calculations using all valence orbitals. The third column lists hole transfer parameters of Reference [55], an estimation from various articles found in the literature. The sixth column lists the electron transfer parameters of Reference [52], where only 2p$_z$ orbitals had been used. The fourth and seventh columns list the transfer parameters with the parameterization of Reference [29], where only 2p$_z$ orbitals had been used. All transfer parameters are given in meV.
In Figure 8, we illustrate the absolute values of transfer parameters for all possible combinations of successive base pairs, for holes and for electrons. The figure contains the transfer parameters obtained from our LCAO calculations using all valence orbitals, along with the corresponding parameters found in Ref. [55] (where various estimations from the literature had been taken into account), those from Ref. [29], where only 2p$_z$ orbitals had been used, and the electron transfer parameters from Ref. [52], where only 2p$_z$ orbitals had been used. Peluso et al. [56], based on electrochemical and time-dependent spectroscopic measurements, find for GG a transfer integral ≈ 0.1 eV, which is very close to our results, while, for AA, they report a value ≈ 0.3 eV, which seems large compared both to the parameterization reported here, which takes into account all valence orbitals, and to the parameterization in Reference [55], which takes into account, for holes, the works [52,[57][58][59][60][61].
Figure 8. Absolute values of the transfer parameters for all possible combinations of successive base pairs, for holes and for electrons. We show the transfer parameters obtained from our LCAO calculations using all valence orbitals, as well as the corresponding transfer parameters found in Reference [55] (for holes, estimation from various articles in the literature), in Reference [29] (using only 2p$_z$ orbitals) and in Reference [52] (for electrons, using only 2p$_z$ orbitals).
In Figure 9, we depict the maximum transfer percentage of Equation (28) obtained by our LCAO calculations using all valence orbitals, compared to the values obtained using parameters from Reference [55] for holes (an estimation from various articles in the literature), from Reference [29] for electrons and holes, and from Reference [52] for electrons (where only 2p$_z$ orbitals had been used). For ideal B-DNA geometries and for dimers made of identical monomers, the maximum transfer percentage is 1, while in the case of different monomers, p is smaller than 1, both for holes and for electrons. Both for t and for p, we observe that the current LCAO using all valence orbitals is closer to the results from Reference [55] for holes (where various estimations of different origin from the literature had been taken into account). For electrons, as far as we know, the current LCAO calculation is the only one that goes beyond simple Hückel models, which use only 2p$_z$ orbitals.
Figure 9. Comparison of the maximum transfer percentage p obtained by our LCAO method using all valence orbitals, with the p values extracted from other sources: obtained from parameters found in Reference [55] (for holes, estimation from various articles in the literature), in Reference [29] (using only 2p$_z$ orbitals) and in Reference [52] (for electrons, using only 2p$_z$ orbitals). Left panel for holes, right panel for electrons.

Effects of Structural Variability
In this subsection, we analyze the effects of structural variability on the electronic structure and charge transfer properties of B-DNA using the fragments derived from MD, as detailed in Section 2.4. In Figure 10, we present the absolute values of the parameters ∆ (difference between the HOMO eigenenergies of the two base pairs of each studied dimer) and t (transfer integral between the HOMOs of the two base pairs of each studied dimer), as well as the maximum transfer percentages p as calculated via Equation (28). The values of |t| and p can also be found in Reference [16], in comparison with results obtained by Density Functional Theory (DFT) techniques. From Equation (28) it is expected that ideal dimers (made up of identical ideal monomers) should have a maximum transfer percentage equal to 1. However, Figure 10 shows that not all AA and GG dimers have p = 1. Specifically, the dimers with p considerably different from unity (and ∆ different from zero) are: A11A12_cl2, A12A13_cl1, A12A13_cl2, A13A14_cl2, G15G16_cl1, and G16G17_cl1. This is expected because the studied monomers are not ideal, which means that their constituent bases have relative translations and rotations (Figure 1), as depicted in Figure 2. More specifically, a small p value is related to a large ∆ value, in accordance with Equation (28). Thus, it is expected that the structural parameters (shear, stretch, stagger, buckle, propeller twist, opening) have an appreciable effect on the HOMO (and LUMO) base-pair energy values and consequently on the values of ∆ and p. The contribution of the transfer integrals t to the above discussion is documented in Reference [16].

Conclusions and Outlook
In this work, we computed the tight-binding parameters that are necessary for a wire-model description of longitudinal (axial) charge transfer through B-DNA. We took into account structural variability by carrying out these computations for multiple structures resulting from a classical trajectory.
We initially calculated the lowest ionization and excitation energies of various "ideal" (frozen) heterocyclic organic molecules with a biological function, including the DNA and RNA bases and isomers. We did so employing the LCAO approximation in a new parameterization that accounts for all valence orbitals, i.e., 2s, 2p$_x$, 2p$_y$, 2p$_z$ orbitals for C, N and O atoms and the 1s orbital for H atoms. This LCAO approach is more suitable than the standard LCAO parameterization for investigating non-planar geometries. We predict ionization and excitation energies with RMSPE of 3.65% and 6.49%, respectively, compared to the experimental values. Based on these errors, we infer that the proposed computational strategy is an adequate tool for a quick and relatively accurate estimation of the electronic structure for a variety of organic molecules.
Using the computed energies of the HOMO and LUMO within the proposed LCAO method, we then evaluated the energy levels of DNA base pairs (A-T, G-C) and the transfer integrals between stacked base pairs. Our results are in good agreement with reference data. The obtained transfer integrals can be used in further studies of charge transfer/transport in DNA oligomers and polymers.
Finally, we addressed the impact of structural flexibility (dynamics) on the electronic structure and charge transfer ability of B-DNA. To this end, we applied our LCAO method to 20 AA and GG dimers, extracted from representative structures in a classical MD trajectory of a 20mer evolved for 500 ns. For all these systems, we calculated the parameters ∆ and t, as well as the maximum transfer percentage between the two monomers of a dimer p. We found that the values of ∆ and p are significantly affected by geometrical changes. Nevertheless, in the vast majority of the studied dimers, the maximum transfer percentage is very close to unity.
We suggest that the proposed methodology can be used in a high-throughput manner to characterize dynamical effects on charge transfer in organic polymers constituted of heterocyclic building blocks.
Our simple, cost-effective method is suitable for very fast computations of electronic structure and transfer integrals. It can greatly facilitate charge transfer and transport calculations in sequences of arbitrary geometry taken, e.g., from MD simulations, as long as purines, pyrimidines, and similar molecules are the constituents. Although we took only valence orbitals for carbon, nitrogen, oxygen, and hydrogen into account, this approach can be generalized to include other atomic species and orbitals.