Scaled in Cartesian Coordinates Ab Initio Molecular Force Fields of DNA Bases: Application to Canonical Pairs

The model of Regularized Quantum Mechanical Force Field (RQMFF) was applied to the joint treatment of ab initio and experimental vibrational data of the four primary nucleobases using a new algorithm based on the scaling procedure in Cartesian coordinates. The matrix of scaling factors in Cartesian coordinates for the considered molecules includes diagonal elements for all atoms of the molecule and off-diagonal elements for bonded atoms and for some non-bonded atoms (1–3 and some 1–4 interactions). The choice of the model is based on the results of the second-order perturbation analysis of the Fock matrix for uncoupled interactions using the Natural Bond Orbital (NBO) analysis. The scaling factors obtained within this model as a result of solving the inverse problem (regularized Cartesian scale factors) of adenine, cytosine, guanine, and thymine molecules were used to correct the Hessians of the canonical base pairs: adenine–thymine and cytosine–guanine. The proposed procedure is based on the block structure of the scaling matrix for molecular entities with non-covalent interactions, as in the case of DNA base pairs. It allows avoiding introducing internal coordinates (or coordinates of symmetry, local symmetry, etc.) when scaling the force field of a compound of a complex structure with non-covalent H-bonds.


Introduction. Inverse Problems of Vibrational Spectroscopy
Vibrational spectroscopy is a very important source about the structure of molecules, in particular, in different states of aggregation, including the presence of intra-and intermolecular interactions. Currently, there is a large number of available infrared and Raman spectra, measured with a sufficiently high accuracy, which, as a rule, are supplemented by quantum mechanical calculations at the modern level of theory.
In this article, we consider new possibilities offered by our earlier proposed method for solving the inverse vibrational problem (finding the matrix of force constants of a molecule) by calculating the scale factors directly in Cartesian coordinates [1], namely, the application of this approach to correcting the theoretical frequencies of molecular associates with non-covalent interactions. Generally, the empirical force field is defined as a set of parameters of an individual molecule (or its associates of the finite dimension), which are determined from experimental data on molecular geometry and sets of vibrational frequencies of an individual molecule by solving the so-called inverse vibrational problem. These "small" force fields of separate molecules are widely used in modern computational chemistry as a part of the extended biomolecular force fields for simulating bulky biological systems by MD or MM methods.
The concept of molecular force field arises within both classical and quantum mechanics when a molecule is considered as a mechanical system of nuclei, while all interactions of electrons are included in the effective potential function U(q 1 , . . . , q n ), where q 1 , . . . , q n denote n = 3N-6 generalized coordinates of N atomic nuclei of the molecule. The minimum Molecules 2022, 27, 427 2 of 13 potential function (with respect to nuclei coordinates) defines the equilibrium geometry of a molecule. The second derivatives of the potential with respect to nuclei coordinates in equilibrium: constitute a positive defined force constant matrix F determining all the molecular characteristics related to small vibrations. The vibrational frequencies (obtained from IR and Raman spectra) are the main type of experimental information on molecular vibrations. They are connected with the force constant matrix by the eigenvalue equation: The parameters of the empirical force field are determined by processing the data of experimental infrared and Raman spectra. The so-called inverse vibrational problem of determining the parameters of the molecular force field (matrix of force constants, F) from the given experimental data (vibrational frequencies, isotope frequency shifts, Coriolis constants, centrifugal distortion constants, etc.) is formulated [1] in the form of a nonlinear operator equation in finite-dimensional spaces: where F ∈ Z ∈ R n(n+1)/2 (Z is a set of possible solutions) is the unknown force constant matrix (real and symmetric), Λ ∈ R m represents the set of available experimental data (vibrational frequencies, etc.) determined within δ error level: Λ − Λ δ ≤ δ. A is a nonlinear operator which maps matrix F on Λ, h is an estimate of the operator A uncertainty. The accumulation of data on force constants is necessary for predicting the spectra and other properties of compounds not yet investigated and for the development of physical models in the theory of molecular structure. This mathematical problem of calculating molecular force fields within the general approximation of small vibrations (harmonic model) belongs to the class of mathematical non-linear ill-posed problems [2]. Ill-posedness means that the problem does not satisfy any of the three well-posedness conditions (the existence of a solution, its uniqueness, and stability with respect to perturbations in the input data) [3]. In most cases, the main difficulty in solving such problems is connected with the non-uniqueness of the solution. Indeed, for any molecular structure (except for diatomic molecules), there may exist an infinite number of force field matrices that result in the same set of vibrational frequencies.
At the same time, the increasing possibilities and a rather good accuracy of the modernstate calculations within ab initio and density functional theory approximations have opened up real ways to obtain the force field parameters from high-level computations of structural units, which can be identified with the help of the modern structural methods. Among these units, there are various molecular clusters/associates that are formed due to non-covalent interactions such as H-bonds or stackings. These interactions can be identified from corresponding infrared and Raman spectra, which also contain information about the conformational composition of the compound. The force fields of such molecular systems with non-covalent bonds between separate units should include an intramolecular part responsible for the corresponding vibrations. Quantum-mechanical calculations of molecular clusters allow predicting both the spectral part responsible for intermolecular vibrations (vibrational IR and Raman spectra) of individual fragments and the spectral part responsible for intramolecular vibrations involved in large-amplitude motions associated with conformational changes in biomolecules.
The solving of ill-posed problems can only be realized with the use of stable numerical methods. For solving the inverse vibrational problem, there have been proposed methods based on the theory of regularization of non-linear ill-posed problems [1,2]. The main idea of this theory is as follows: to solve an ill-posed inverse problem, it is necessary to formulate an algorithm and an optimization procedure that provide a unique solution to a mathematical problem. Such a solution can be obtained using a stable numerical algorithm with the inclusion of mathematically formulated additional criteria for choosing a single solution with given properties [1,2]. This approach allows searching for the so-called normal pseudosolution of Equation (2). Such a solution is determined as an optimized matrix of force constants closest in the chosen Euclidean norm to the a priori given matrix F 0 . The solution must satisfy the set of constraints D and reproduce the experimental data Λ δ within a given error level. D is a given set of a priori constraints (supposed to be closed), which describe various types of constraints on the force constant values [3,4].
Within the framework of the regularization theory, such stable solution (F α ) can be obtained as an extreme of the Tikhonov's functional [1][2][3][4]: on the set D where F 0 is some a priori chosen stabilizing matrix. The existence of an extreme F α is proved in [3,5]. To obtain a stable solution, the regularization parameter α should be chosen in accordance with the errors (h, δ) in geometry and experimental frequencies, respectively. The result of minimization is the matrix F α closest to the given matrix F 0 and compatible with the experimental data within the specified error level. In 1994, in collaboration with F. Weinhold, the group of scientists from Moscow State University proposed using stable numerical algorithms based on Tikhonov's regularization method for joint treatment of ab initio and experimental data in molecular force field calculations [6]. It was suggested to "regularize" and stabilize quantum-mechanical force fields by means of finding the so-called normal solution (pseudo-solution) of the inverse vibrational problem. In this model, the stabilizing matrix F 0 is chosen from quantum mechanical calculations, and the resulting solution will be the matrix F α which is the closest in the Euclidean norm to the given ab initio F 0 . The optimized solution is the so-called Regularized Quantum Mechanical Force Field (RQMFF) [1,6]. The force constant matrix F α obtained in this way reproduces the experimental frequencies with given accuracy and is the closest (in the sense of the Euclidian norm) to the specified ab initio matrix F 0 describing the intramolecular interactions. The proposed procedure allows the use of any system of generalized coordinates, including redundant systems of internal or symmetry coordinates [1], which simplifies the transferability of force constants between related molecules. Using regularizing algorithms to refine ab initio force fields, it is possible to obtain solutions to the inverse vibrational problem that retain significant features of the ab initio force constant matrix. In particular, it allows to keep the potential energy distribution (PED) or composition of normal-mode eigenvectors, thereby providing accurate use of information obtained by powerful ab initio methods, transfer, and comparison of force constants in a series of related molecules. The proposed RQMFF approach was successfully used in our joint studies with F. Weinhold [7][8][9][10] carried out for a series of substituted alkanes with the goal of determining the regularities in molecular parameters upon fluorochloro substitution.
Later, this approach to solving the inverse vibrational problem was extended for the very popular Pulay model of scaled force constant matrix (expressed in internal or symmetry/local symmetry coordinates) [11][12][13]. The corresponding regularizing algorithms provide the matrix F α with the following properties: the solution is closest by norm to the QM matrix F 0 , or the scale matrix B is closest to the unit matrix [1,14].

Scaling of Molecular Force Fields in Cartesian Coordinates
Quantum chemical calculations provide the Hessian matrix of the second derivatives of the energy with respect to atomic coordinates. As a rule, the interpretation of theoretical results is carried out in some selected system of internal coordinates related to the geometric parameters of the molecule (bond stretchings, bond angles, dihedral angles, etc.). In the case of large molecular systems, the introduction of a complete system of internal coordinates is the most tedious and time-consuming procedure. Moreover, if we consider the intramolecular part of the force field for macroscopic systems expressed in terms of force constants corresponding to molecular internal coordinates (bond lengths, valence, and dihedral angles), then we encounter a significant problem related to the invariance of individual internal force constants with respect to the chosen set of coordinates: the force constants of the intramolecular force field are not invariant with respect to the choice of internal coordinates.
In the case of large molecular systems, the introduction of a complete system of internal coordinates is the most tedious and time-consuming procedure. Moreover, if we consider the intramolecular part of the force field for macroscopic systems, expressed in terms of the internal coordinates of molecules (bond lengths, bond, and dihedral angles), then we are faced with a significant problem. It consists in the fact that the force constants of intramolecular force fields are not invariant with respect to the choice of internal coordinates.
The simplicity of the scaling procedure has made it extremely popular. It has been shown that the scale factors of many molecular fragments (within a given level of quantummechanical method) are approximately constant over a wide range of similar molecules. Initially, the scaling procedure was suggested [11][12][13] for the force fields defined in an internal or symmetry (local symmetry) coordinate system. A similar approach for scaling molecular force fields in internal/symmetry/local symmetry coordinates was implemented in our software [1, 14,15] developed for solving direct and inverse vibrational problems.
Internal coordinates (including interatomic distances, bond angles, dihedral angles, etc.) have a lot of advantages and are closely related to the assumptions of the classical theory of structure. They are commonly used to characterize molecular geometry and to define energy terms in valence force-field models. These coordinates provide a language that can be easily used in many applications. The force constant matrix expressed in terms of internal coordinates has a clear advantage. However, unfortunately, the total number of force constants in such a matrix is equal to n(n + 1)/2, where n = 3N − 6 (N is the number of atoms in a molecule). For example, in the case of adenine-thymine base pair, the total number of atoms N is equal to 30. To describe this system in internal coordinates, it is necessary to use at least 84 internal coordinates. The number of the force constants (expressed in independent internal coordinates) is 3550. Even if some of the force constants in internal coordinates are chosen equal to zero, the total number of force constants remains quite large.
Regularizing algorithms allow using any system of generalized coordinates, including redundant systems of internal coordinates, which greatly facilitates the transfer and comparison of force constants between related molecules. The theoretical basis and practical aspects of using redundant coordinates in molecular force fields calculations were previously discussed [1].
To avoid the problems arising in defining the set of internal coordinates, especially in the case of large molecules, we have proposed a procedure for scaling ab initio force field matrix in Cartesian coordinates [16]. This problem is not trivial because the scaling matrix for the Hessian in Cartesian coordinates cannot be chosen as diagonal. However, it is possible to formulate certain conditions allowing to find appropriate scale factors, which are discussed below. The model based on scaling in Cartesian coordinates can also be used in symmetry coordinates and significantly reduce the dimension of the mathematical problem [16,17], so it has an obvious advantage in the case of rather bulky molecules, such as a smaller dimension of coordinate space and, accordingly, a smaller number of optimized parameters. One of the main advantages of this approach is that it avoids introducing internal coordinates in the process of scaling and transferring force fields. Numerical details of this procedure and some details of the use of symmetry in this approach, as well as examples of determining the scale factors in Cartesian coordinates of various organic molecules, were presented in our previous publications [16][17][18]. The procedure of scaling the quantum-chemical force matrix F 0 in internal coordinates is defined as: where B is a diagonal matrix composed of scale factors. Let A(F) be an operator that puts into correspondence to a symmetric positively defined force constant matrix F a vector consisting of vibrational frequencies; then the problem of finding the scale factors can be formulated as a non-linear operator equation: The solution of the inverse problem in Cartesian coordinates has specific features that are determined by the constraints imposed on the force constant matrices: namely, the molecular potential energy must be independent of translations or rotations of a molecule as a whole. An explicit form of these constraints was presented in our previous papers [16][17][18]. These constraints lead to a decrease in the matrix rank to 3N − 6, where N is the number of atoms. Obviously, these constraints must be maintained while scaling the Cartesian matrix.
Fitting molecular force fields in Cartesian coordinates reduces the difficulties associated with the choice of internal coordinates in complex molecules and is practically useful in the case of large biological molecules, associates, polymers, etc., including hundreds of atoms, for which only moderately accurate quantum chemistry methods can be applied. To calculate the scale factors in Cartesian coordinates, a special routine of the SPECTRUM2 software package was used [16].
In the case of scaling in Cartesian coordinates, the scaling procedure can also be formulated in the form of Equation (4). However, it was shown [16,17] that the scale factor matrix B cannot be chosen diagonal due to the requirement that the matrix BF 0 B is independent of the translations and rotations of the molecule as a whole. This requirement imposes certain constraints on the elements of matrix B. As a result, the problem of finding the scale factors is formulated as Equations (4) and (5), where B is a symmetric matrix, B ∈ D where D is a set of the mentioned constraints. In addition, set D may include symmetry constraints.
Fitting molecular force fields in Cartesian coordinates reduces the difficulties related to the choice of internal coordinates in complex molecules and is practically useful in the case of large biological molecules, associates, polymers, etc., including hundreds of atoms. Numerical methods for solving an inverse vibrational problem of the form (4)(5) were formulated and applied in [14,15]. The solution to the inverse problem (a set of scale factors) is found by minimizing the functional: on the set B ∈ D with the proper choice of the regularization parameter α. Here, Q h and Λ δ represent approximations of the operator Q and a set of experimental vibrational frequencies Λ, respectively [16][17][18].
In this paper, we present the results of applying these algorithms for fitting the molecular force fields in Cartesian coordinates for primary nucleobases (adenine, cytosine, guanine, and thymine) and subsequent use of these scale factors to correct the force constant matrices of canonical pairs in Cartesian coordinates.

Application of Second-Order Perturbation Theory Analysis to the Fock Matrix in NBO Basis for DNA Bases and Base Pairs at the B3LYP/6-31G* Level of Theory
The model of scaling in Cartesian possesses many advantages in over scaling in other generalized coordinates. This approach requires careful analysis of the scaling matrix structure, especially in cyclic molecules, for which it can be difficult to estimate all-important pairwise interactions of atoms. One of the very attractive possibilities for predicting the possible structure of the scaling matrix in complex cyclic systems is the analysis of pairwise interactions of atoms in the framework of the theory of Natural Bond Orbitals (NBO) proposed in the works of Frank Weinhold [19][20][21]. This theory allows to obtain information on charge transfer or conjugative interactions in molecular systems and can be characterized as a very reliable, sensitive, and one of the most efficient theoretical tools for the analysis of intra-and inter-molecular interactions by using calculated data about the interaction of filled and virtual orbitals.
In this work, quantum mechanical calculations and NBO analysis of considered systems were carried out at the B3LYP/6-31G* level of theory. The 6-31G(d) basis set was chosen as one of the simplest polarized double-zeta basis sets, which is widely used in quantum mechanical calculations of bulky biological molecules. Some results of applying the second-order perturbation theory analysis to NBO basis for DNA base molecules at the B3LYP/6-31G level of theory can be found in Tables 1 and 2, compared to the similar calculations for base pairs. Table 1. Second-order perturbation theory analysis in NBO basis for non-bonded interactions in adenine and thymine molecules at the B3LYP/6-31G* level of theory.

Adenine
Adenine-Thymine The model of scaling in Cartesian possesses many advantages in over scaling in other generalized coordinates. This approach requires careful analysis of the scaling matrix structure, especially in cyclic molecules, for which it can be difficult to estimate all-important pairwise interactions of atoms. One of the very attractive possibilities for predicting the possible structure of the scaling matrix in complex cyclic systems is the analysis of pairwise interactions of atoms in the framework of the theory of Natural Bond Orbitals (NBO) proposed in the works of Frank Weinhold [19][20][21]. This theory allows to obtain information on charge transfer or conjugative interactions in molecular systems and can be characterized as a very reliable, sensitive, and one of the most efficient theoretical tools for the analysis of intra-and inter-molecular interactions by using calculated data about the interaction of filled and virtual orbitals.
In this work, quantum mechanical calculations and NBO analysis of considered systems were carried out at the B3LYP/6-31G* level of theory. The 6-31G(d) basis set was chosen as one of the simplest polarized double-zeta basis sets, which is widely used in quantum mechanical calculations of bulky biological molecules. Some results of applying the second-order perturbation theory analysis to NBO basis for DNA base molecules at the B3LYP/6-31G level of theory can be found in Tables 1 and 2, compared to the similar calculations for base pairs. Table 1. Second-order perturbation theory analysis in NBO basis for non-bonded interactions in adenine and thymine molecules at the B3LYP/6-31G* level of theory.

Adenine
Adenine-Thymine The visualization of optimized molecular structures has been made using Chemcraft (version 1.8) software [22]. The results of NBO analysis demonstrate the presence of significant hyperconjugative interactions between lone pairs of nitrogen and oxygen atoms

Application of Second-Order Perturbation Theory Analysis to the Fock Matrix in NBO Basis for DNA Bases and Base Pairs at the B3LYP/6-31G* Level of Theory
The model of scaling in Cartesian possesses many advantages in over scaling in other generalized coordinates. This approach requires careful analysis of the scaling matrix structure, especially in cyclic molecules, for which it can be difficult to estimate all-important pairwise interactions of atoms. One of the very attractive possibilities for predicting the possible structure of the scaling matrix in complex cyclic systems is the analysis of pairwise interactions of atoms in the framework of the theory of Natural Bond Orbitals (NBO) proposed in the works of Frank Weinhold [19][20][21]. This theory allows to obtain information on charge transfer or conjugative interactions in molecular systems and can be characterized as a very reliable, sensitive, and one of the most efficient theoretical tools for the analysis of intra-and inter-molecular interactions by using calculated data about the interaction of filled and virtual orbitals.
In this work, quantum mechanical calculations and NBO analysis of considered systems were carried out at the B3LYP/6-31G* level of theory. The 6-31G(d) basis set was chosen as one of the simplest polarized double-zeta basis sets, which is widely used in quantum mechanical calculations of bulky biological molecules. Some results of applying the second-order perturbation theory analysis to NBO basis for DNA base molecules at the B3LYP/6-31G level of theory can be found in Tables 1 and 2, compared to the similar calculations for base pairs. Table 1. Second-order perturbation theory analysis in NBO basis for non-bonded interactions in adenine and thymine molecules at the B3LYP/6-31G* level of theory.

Adenine
Adenine-Thymine The visualization of optimized molecular structures has been made using Chemcraft (version 1.8) software [22]. The results of NBO analysis demonstrate the presence of significant hyperconjugative interactions between lone pairs of nitrogen and oxygen atoms The model of scaling in Cartesian possesses many advantages in over scaling in other generalized coordinates. This approach requires careful analysis of the scaling matrix structure, especially in cyclic molecules, for which it can be difficult to estimate all-important pairwise interactions of atoms. One of the very attractive possibilities for predicting the possible structure of the scaling matrix in complex cyclic systems is the analysis of pairwise interactions of atoms in the framework of the theory of Natural Bond Orbitals (NBO) proposed in the works of Frank Weinhold [19][20][21]. This theory allows to obtain information on charge transfer or conjugative interactions in molecular systems and can be characterized as a very reliable, sensitive, and one of the most efficient theoretical tools for the analysis of intra-and inter-molecular interactions by using calculated data about the interaction of filled and virtual orbitals.
In this work, quantum mechanical calculations and NBO analysis of considered systems were carried out at the B3LYP/6-31G* level of theory. The 6-31G(d) basis set was chosen as one of the simplest polarized double-zeta basis sets, which is widely used in quantum mechanical calculations of bulky biological molecules. Some results of applying the second-order perturbation theory analysis to NBO basis for DNA base molecules at the B3LYP/6-31G level of theory can be found in Tables 1 and 2, compared to the similar calculations for base pairs. Table 1. Second-order perturbation theory analysis in NBO basis for non-bonded interactions in adenine and thymine molecules at the B3LYP/6-31G* level of theory.

Adenine
Adenine-Thymine The visualization of optimized molecular structures has been made using Chemcraft (version 1.8) software [22]. The results of NBO analysis demonstrate the presence of significant hyperconjugative interactions between lone pairs of nitrogen and oxygen atoms

Application of Second-Order Perturbation Theory Analysis to the Fock Matrix in NBO Basis for DNA Bases and Base Pairs at the B3LYP/6-31G* Level of Theory
The model of scaling in Cartesian possesses many advantages in over scaling in other generalized coordinates. This approach requires careful analysis of the scaling matrix structure, especially in cyclic molecules, for which it can be difficult to estimate all-important pairwise interactions of atoms. One of the very attractive possibilities for predicting the possible structure of the scaling matrix in complex cyclic systems is the analysis of pairwise interactions of atoms in the framework of the theory of Natural Bond Orbitals (NBO) proposed in the works of Frank Weinhold [19][20][21]. This theory allows to obtain information on charge transfer or conjugative interactions in molecular systems and can be characterized as a very reliable, sensitive, and one of the most efficient theoretical tools for the analysis of intra-and inter-molecular interactions by using calculated data about the interaction of filled and virtual orbitals.
In this work, quantum mechanical calculations and NBO analysis of considered systems were carried out at the B3LYP/6-31G* level of theory. The 6-31G(d) basis set was chosen as one of the simplest polarized double-zeta basis sets, which is widely used in quantum mechanical calculations of bulky biological molecules. Some results of applying the second-order perturbation theory analysis to NBO basis for DNA base molecules at the B3LYP/6-31G level of theory can be found in Tables 1 and 2, compared to the similar calculations for base pairs. The visualization of optimized molecular structures has been made using Chemcraft (version 1.8) software [22]. The results of NBO analysis demonstrate the presence of significant hyperconjugative interactions between lone pairs of nitrogen and oxygen atoms

Application of Second-Order Perturbation Theory Analysis to the Fock Matrix in NBO Basis for DNA Bases and Base Pairs at the B3LYP/6-31G* Level of Theory
The model of scaling in Cartesian possesses many advantages in over scaling in other generalized coordinates. This approach requires careful analysis of the scaling matrix structure, especially in cyclic molecules, for which it can be difficult to estimate all-important pairwise interactions of atoms. One of the very attractive possibilities for predicting the possible structure of the scaling matrix in complex cyclic systems is the analysis of pairwise interactions of atoms in the framework of the theory of Natural Bond Orbitals (NBO) proposed in the works of Frank Weinhold [19][20][21]. This theory allows to obtain information on charge transfer or conjugative interactions in molecular systems and can be characterized as a very reliable, sensitive, and one of the most efficient theoretical tools for the analysis of intra-and inter-molecular interactions by using calculated data about the interaction of filled and virtual orbitals.
In this work, quantum mechanical calculations and NBO analysis of considered systems were carried out at the B3LYP/6-31G* level of theory. The 6-31G(d) basis set was chosen as one of the simplest polarized double-zeta basis sets, which is widely used in quantum mechanical calculations of bulky biological molecules. Some results of applying the second-order perturbation theory analysis to NBO basis for DNA base molecules at the B3LYP/6-31G level of theory can be found in Tables 1 and 2, compared to the similar calculations for base pairs. Table 1. Second-order perturbation theory analysis in NBO basis for non-bonded interactions in adenine and thymine molecules at the B3LYP/6-31G* level of theory.

Adenine
Adenine-Thymine The visualization of optimized molecular structures has been made using Chemcraft (version 1.8) software [22]. The results of NBO analysis demonstrate the presence of significant hyperconjugative interactions between lone pairs of nitrogen and oxygen atoms

Application of Second-Order Perturbation Theory Analysis to the Fock Matrix in NBO Basis for DNA Bases and Base Pairs at the B3LYP/6-31G* Level of Theory
The model of scaling in Cartesian possesses many advantages in over scaling in other generalized coordinates. This approach requires careful analysis of the scaling matrix structure, especially in cyclic molecules, for which it can be difficult to estimate all-important pairwise interactions of atoms. One of the very attractive possibilities for predicting the possible structure of the scaling matrix in complex cyclic systems is the analysis of pairwise interactions of atoms in the framework of the theory of Natural Bond Orbitals (NBO) proposed in the works of Frank Weinhold [19][20][21]. This theory allows to obtain information on charge transfer or conjugative interactions in molecular systems and can be characterized as a very reliable, sensitive, and one of the most efficient theoretical tools for the analysis of intra-and inter-molecular interactions by using calculated data about the interaction of filled and virtual orbitals.
In this work, quantum mechanical calculations and NBO analysis of considered systems were carried out at the B3LYP/6-31G* level of theory. The 6-31G(d) basis set was chosen as one of the simplest polarized double-zeta basis sets, which is widely used in quantum mechanical calculations of bulky biological molecules. Some results of applying the second-order perturbation theory analysis to NBO basis for DNA base molecules at the B3LYP/6-31G level of theory can be found in Tables 1 and 2, compared to the similar calculations for base pairs. Table 1. Second-order perturbation theory analysis in NBO basis for non-bonded interactions in adenine and thymine molecules at the B3LYP/6-31G* level of theory.

Adenine
Adenine-Thymine The visualization of optimized molecular structures has been made using Chemcraft (version 1.8) software [22]. The results of NBO analysis demonstrate the presence of significant hyperconjugative interactions between lone pairs of nitrogen and oxygen atoms The visualization of optimized molecular structures has been made using Chemcraft (version 1.8) software [22]. The results of NBO analysis demonstrate the presence of significant hyperconjugative interactions between lone pairs of nitrogen and oxygen atoms with antibonding orbitals (σ*) of some skeletal bonds both in DNA base molecules and in their pairs. Therefore, it is necessary to include certain cross-sectional terms in the scaling matrix. Table 2. Second-order perturbation theory analysis in NBO basis for non-bonded interactions in guanine and cytosine molecules at the B3LYP/6-31G* level of theory.

Guanine and Cytosine
Guanine-Cytosine Molecules 2022, 27, x FOR PEER REVIEW 7 of 14 with antibonding orbitals (σ*) of some skeletal bonds both in DNA base molecules and in their pairs. Therefore, it is necessary to include certain cross-sectional terms in the scaling matrix.

Computational Details
DFT calculations of four DNA bases (adenine, thymine, cytosine, and guanine) and their pairs (adenine-thymine and cytosine-guanine) were performed using the GAUSS-IAN 09 software (Revision D.01) [23]. Fully optimized geometries, analytical force constants, and harmonic vibrational frequencies of all molecular structures were calculated at the B3LYP/6-31G* level of theory [24][25][26]. Potential surface minima were found by relaxing geometric parameters using standard optimization methods. Inverse scaling problems were solved for all bases, and the resulting sets of scale factors for DNA bases were found as the extremes of functional (6) for each base. The calculation of scale factors was carried out using a special routine of the SPECTRUM software package [1], which was equipped with additional options for applying to Cartesian coordinates. All the considered inverse problems were solved for single parent molecules without the inclusion of experimental data on isotopic species to avoid the incompatibility of inverse problems in the framework of the harmonic model. with antibonding orbitals (σ*) of some skeletal bonds both in DNA base molecules and in their pairs. Therefore, it is necessary to include certain cross-sectional terms in the scaling matrix. Table 2. Second-order perturbation theory analysis in NBO basis for non-bonded interactions in guanine and cytosine molecules at the B3LYP/6-31G* level of theory.

Computational Details
DFT calculations of four DNA bases (adenine, thymine, cytosine, and guanine) and their pairs (adenine-thymine and cytosine-guanine) were performed using the GAUSS-IAN 09 software (Revision D.01) [23]. Fully optimized geometries, analytical force constants, and harmonic vibrational frequencies of all molecular structures were calculated at the B3LYP/6-31G* level of theory [24][25][26]. Potential surface minima were found by relaxing geometric parameters using standard optimization methods. Inverse scaling problems were solved for all bases, and the resulting sets of scale factors for DNA bases were found as the extremes of functional (6) for each base. The calculation of scale factors was carried out using a special routine of the SPECTRUM software package [1], which was equipped with additional options for applying to Cartesian coordinates. All the considered inverse problems were solved for single parent molecules without the inclusion of experimental data on isotopic species to avoid the incompatibility of inverse problems in the framework of the harmonic model.
Molecules 2022, 27, x FOR PEER REVIEW 7 of 14 with antibonding orbitals (σ*) of some skeletal bonds both in DNA base molecules and in their pairs. Therefore, it is necessary to include certain cross-sectional terms in the scaling matrix. Table 2. Second-order perturbation theory analysis in NBO basis for non-bonded interactions in guanine and cytosine molecules at the B3LYP/6-31G* level of theory.

Computational Details
DFT calculations of four DNA bases (adenine, thymine, cytosine, and guanine) and their pairs (adenine-thymine and cytosine-guanine) were performed using the GAUSS-IAN 09 software (Revision D.01) [23]. Fully optimized geometries, analytical force constants, and harmonic vibrational frequencies of all molecular structures were calculated at the B3LYP/6-31G* level of theory [24][25][26]. Potential surface minima were found by relaxing geometric parameters using standard optimization methods. Inverse scaling problems were solved for all bases, and the resulting sets of scale factors for DNA bases were found as the extremes of functional (6) for each base. The calculation of scale factors was carried out using a special routine of the SPECTRUM software package [1], which was equipped with additional options for applying to Cartesian coordinates. All the considered inverse problems were solved for single parent molecules without the inclusion of experimental data on isotopic species to avoid the incompatibility of inverse problems in the framework of the harmonic model. with antibonding orbitals (σ*) of some skeletal bonds both in DNA base molecules and in their pairs. Therefore, it is necessary to include certain cross-sectional terms in the scaling matrix. Table 2. Second-order perturbation theory analysis in NBO basis for non-bonded interactions in guanine and cytosine molecules at the B3LYP/6-31G* level of theory.

Computational Details
DFT calculations of four DNA bases (adenine, thymine, cytosine, and guanine) and their pairs (adenine-thymine and cytosine-guanine) were performed using the GAUSS-IAN 09 software (Revision D.01) [23]. Fully optimized geometries, analytical force constants, and harmonic vibrational frequencies of all molecular structures were calculated at the B3LYP/6-31G* level of theory [24][25][26]. Potential surface minima were found by relaxing geometric parameters using standard optimization methods. Inverse scaling problems were solved for all bases, and the resulting sets of scale factors for DNA bases were found as the extremes of functional (6) for each base. The calculation of scale factors was carried out using a special routine of the SPECTRUM software package [1], which was equipped with additional options for applying to Cartesian coordinates. All the considered inverse problems were solved for single parent molecules without the inclusion of experimental data on isotopic species to avoid the incompatibility of inverse problems in the framework of the harmonic model. with antibonding orbitals (σ*) of some skeletal bonds both in DNA base molecules and in their pairs. Therefore, it is necessary to include certain cross-sectional terms in the scaling matrix.

Computational Details
DFT calculations of four DNA bases (adenine, thymine, cytosine, and guanine) and their pairs (adenine-thymine and cytosine-guanine) were performed using the GAUSS-IAN 09 software (Revision D.01) [23]. Fully optimized geometries, analytical force constants, and harmonic vibrational frequencies of all molecular structures were calculated at the B3LYP/6-31G* level of theory [24][25][26]. Potential surface minima were found by relaxing geometric parameters using standard optimization methods. Inverse scaling problems were solved for all bases, and the resulting sets of scale factors for DNA bases were found as the extremes of functional (6) for each base. The calculation of scale factors was carried out using a special routine of the SPECTRUM software package [1], which was equipped with additional options for applying to Cartesian coordinates. All the considered inverse problems were solved for single parent molecules without the inclusion of experimental data on isotopic species to avoid the incompatibility of inverse problems in the framework of the harmonic model. with antibonding orbitals (σ*) of some skeletal bonds both in DNA base molecules and in their pairs. Therefore, it is necessary to include certain cross-sectional terms in the scaling matrix.

Computational Details
DFT calculations of four DNA bases (adenine, thymine, cytosine, and guanine) and their pairs (adenine-thymine and cytosine-guanine) were performed using the GAUSS-IAN 09 software (Revision D.01) [23]. Fully optimized geometries, analytical force constants, and harmonic vibrational frequencies of all molecular structures were calculated at the B3LYP/6-31G* level of theory [24][25][26]. Potential surface minima were found by relaxing geometric parameters using standard optimization methods. Inverse scaling problems were solved for all bases, and the resulting sets of scale factors for DNA bases were found as the extremes of functional (6) for each base. The calculation of scale factors was carried out using a special routine of the SPECTRUM software package [1], which was equipped with additional options for applying to Cartesian coordinates. All the considered inverse problems were solved for single parent molecules without the inclusion of experimental data on isotopic species to avoid the incompatibility of inverse problems in the framework of the harmonic model. a E2-energy of hyperconjugative interactions.

Computational Details
DFT calculations of four DNA bases (adenine, thymine, cytosine, and guanine) and their pairs (adenine-thymine and cytosine-guanine) were performed using the GAUSSIAN 09 software (Revision D.01) [23]. Fully optimized geometries, analytical force constants, and harmonic vibrational frequencies of all molecular structures were calculated at the B3LYP/6-31G* level of theory [24][25][26]. Potential surface minima were found by relaxing geometric parameters using standard optimization methods. Inverse scaling problems were solved for all bases, and the resulting sets of scale factors for DNA bases were found as the extremes of functional (6) for each base. The calculation of scale factors was carried out using a special routine of the SPECTRUM software package [1], which was equipped with additional options for applying to Cartesian coordinates. All the considered inverse problems were solved for single parent molecules without the inclusion of experimental data on isotopic species to avoid the incompatibility of inverse problems in the framework of the harmonic model.
The choice of these levels of quantum mechanical calculations is explained by their popularity in quantum mechanical calculations of organic molecules and, especially, biological molecules. In this paper, we present results obtained for the B3LYP/6-31G* level since this level of calculations can be applied to larger biologically important molecules.

•
Analysis of various matrix scaling models in Cartesian coordinates based on the results obtained in the framework of the second-order perturbation theory analysis of the Fock matrix in the NBO basis for the DNA bases under consideration. • Solving inverse problems and determining the scale factors sets in Cartesian coordinates for four DNA bases at the B3LYP/6-31G* level of theory. • Solving inverse problems with a variation of possible sets of scale factors. • Finally, scaling the force constant matrices for DNA pairs. Scale matrix for each pair was composed in block-diagonal form. Comparison to available experimental spectra if appropriate.
The calculated scaling matrices in Cartesian coordinates (B3LYP/6-31G*) of DNA bases are presented in Supplementary Materials Tables S1-S4, comparison of the fitted and observed frequencies are presented in Supplementary Materials Tables S5-S8.

Results of Quantum Mechanical Calculations and Fitting Scale Factors for DNA Bases
All four DNA bases were processed similarly to obtain scale factor matrices B. Below, we demonstrate the results of calculations for the adenine molecule obtained at the B3LYP /6-31G* level of theory.
For the complete automation of the procedure, no special constraints on matrix B were introduced, and the regularization parameter was chosen based on the desired approximation of the experimental frequencies within their given error level. The regularizing procedure is organized in a way that allows obtaining a complete and stable set of scale factors even in cases when some fundamental frequencies remain unknown. Figure 1 shows how the discrepancy between the observed and calculated at each stage of optimization frequencies depends on the value of the regularization parameter α during the optimization procedure. Obviously, the discrepancy decreases with decreasing regularization parameter; the dashed line shows the required error level, which in this case corresponds to α = 5.83·10 −4 .  In this paper, we consider two DNA canonic pairs: adenine-thymine and guaninecytosine. Figures 2 and 3 show the pairs and separate DNA molecules.  In Table 3, we present the matrix of scale factors for adenine obtained on the basis of experimental frequencies. Atomic numbering is shown in Figure 2. As it could be expected, most of the off-diagonal matrix B elements are small, and diagonal elements exhibit small deviations from unity. A similar structure of the scale matrix is observed in the case of other DNA bases; matrices are presented in Supplementary Materials-Tables S1-S4.
Vibrational frequencies of adenine calculated at the B3LYP/6-31G* level differ from the experimental one by 64 cm −1 , while rms error in frequencies for the scaled force field is equal to 8.6 cm −1 . For guanine, the B3LYP/6-31G* calculation results in the rms value of frequencies error equal to~70 cm −1 , while the scaled force field reduces this error to 5.7 cm −1 . The experimentally observed vibrational frequencies together with their tentative assignments, as well as the frequencies calculated on B3LYP/6-31G* level and the frequencies obtained after the scaling procedure for all four DNA bases, can be found in Supplementary Materials-Tables S5-S8.

Quantum Mechanical Calculations and Scaling of DNA Bases Canonic Pairs Hessians
Geometries and force fields for both pairs were calculated at the B3LYP/6-31G* level to ensure consistency of the scale factors with the individual DNA base calculations. For the scaling procedure, the scale matrix B for a DNA pair was built from the elements of the individual matrices for the bases, with the corresponding reordering of the matrix elements according to the order of atoms in the corresponding base pair. The off-diagonal elements of the DNA pair scale matrix corresponding to the interaction of atoms of different bases were taken to be zero. The "B3LYP/Scaled" column in Table 4 compares the theoretical B3LYP/6-31G* frequencies of adenine-thymine pare and corrected by the scale factors obtained for the individual molecules in Section 2.3.3. Table 5 presents the results of a similar calculation for the guanine-cytosine pair. These frequencies may be subsequently used in analyzing experimental spectra. Table 5. Guanine-cytosine pair: B3LYP/6-31G* frequencies and frequencies for the scaled force matrix (cm −1 ).

Discussion
The results of using the scaling procedure directly in Cartesian coordinates for correcting the quantum mechanical Hessians of adenine and thymine (Tables 4 and 5) demonstrate a satisfactory agreement between the experimental and fitted frequencies, which is consistent with the results obtained using the conventional Pulay scaling in internal coordinates. This allows us to conclude that the model for correcting the theoretical vibrational frequencies by scaling Cartesian force matrices appears reasonable.
Earlier, very close values of the diagonal scale factors were obtained for similar atoms in indole and pyrrole molecules [17]. The same is true for pairs of bonded atoms (C, H); this shows the good possibilities of transferring scale factors for atoms in a similar environment. The results of our calculations show that the optimized values of the scale factors β ij are the same for all pairs of atoms which are transferred to each other by symmetry operations for a particular molecule. Note that it is also possible to apply the procedure for each symmetry block individually, which gives a somewhat better frequency fit, similar to what is often done for the standard scaling approach.