Targeting G-quadruplexes with Organic Dyes: Chelerythrine–DNA Binding Elucidated by Combining Molecular Modeling and Optical Spectroscopy

The DNA-binding of the natural benzophenanthridine alkaloid chelerythrine (CHE) has been assessed by combining molecular modeling and optical absorption spectroscopy. Specifically, both double-helical (B-DNA) and G-quadruplex sequences—representative of different topologies and possessing biological relevance, such as telomeric or regulatory sequences—have been considered. An original multiscale protocol, making use of molecular dynamics (MD) simulations and quantum mechanics/molecular mechanics (QM/MM) calculations, allowed us to compare the theoretical and experimental circular dichroism spectra of the different DNA topologies, readily providing atomic-level details of the CHE–DNA binding modes. The binding selectivity towards G-quadruplexes is confirmed by both experimental and theoretical determination of the binding free energies. Overall, our mixed computational and experimental approach is able to shed light on the interaction of small molecules with different DNA conformations. In particular, CHE may be seen as the building block of promising drug candidates specifically targeting G-quadruplexes for both antitumoral and antiviral purposes.


Details on Molecular Modeling and Simulations
Chelerythrine: The equilibrium ground state geometry of the chelerythrine molecule has been optimized at the DFT B3LYP/6-31G(d) level of theory. Atomic point charges have been obtained through the RESP procedure REF using HF/6-31G*. Both computations have been performed using Gaussian D.01. Then the bonded and non-bonded parameters for molecular mechanics and molecular dynamics simulations are GAFF parameters REF adapted to chelerythrine using the antechamber utility available in Ambertools. It has to be noted that the force constant describing the dihedral between atoms 1, 2, 3 and 4 as depicted in Figure S11 has been increased (from 0.300 to 3.625 kcal/mol) to obtain an identical population of conformations where the methyl is pointing out of the plane of the molecule either on one side or the other.

Equilibrium Molecular dynamics simulations:
Starting structures of the 14 base pairs long B-DNA double strands heteropolymers of AT and GC have been obtained using the NAB utility available in Ambertools. In the case of G-quadruplexes, parallel, hybrid and antiparallel configurations are from the ProteinDataBank, respectively 1XAV, 2HY9 and 143D. For all the complexes described in the Introduction section, the starting conformations has been created manually using the Avogadro software. Then each system has been placed in a truncated octahedron box of water molecules described in subsequent molecular mechanics and molecular dynamics simulations by the TIP3P parameters. Amber99 force field including bsc1 corrections has been used to describe nucleic acids. In the case of B-DNA/chelerythrine the systems contained approximatively 7650 water molecules and 25 K + counter-ions to neutralize the global charge of the system. For G4/chelerythrine the systems contained: for parallel 4860 water molecules and 18 K + ions (plus the two central K + ions inside the G4 central channel), for hybrid 4800 water molecules and 18 K + ions (plus the two central K + ions within the G4) , for antiparallel 4630 water molecules and 18 K + ions (plus the two central Na + ions within the G4).
The molecular mechanics followed by molecular dynamics simulations procedure we employed was identical for the 10 studied systems (poly(A·T)·poly(A·T)/CHE left, poly(A·T)·poly(A·T)/CHE right, poly(G·C)·poly(G·C)/CHE left, poly(G·C)·poly(G·C)/CHE right, 1XAV/CHE in, 1XAV/CHE edge, 2HY9/CHE in, 2HY9/CHE edge, 143D/CHE in, 143D/CHE edge). The geometry of the ensemble was first minimized using molecular mechanics simulations by 4000 steps of steepest descent and 4000 steps of conjugated gradients algorithms in order to relax close contacts. Then the system is progressively heated until reaching 300K during 200 ps in the NVT ensemble, followed by 400 ps in the NPT ensemble to relax the global density of the water box. Finally production run has been performed for 100 ns in the NPT ensemble. To keep the temperature (300 K) and pressure (1 atm) constant, langevin dynamics and the Montecarlo thermostat have been used. All simulations were performed using periodic boundary conditions and the particle mesh Ewald for electrostatic interactions.

Free energy calculation details:
In practice, the free-energy differences between the different states were evaluated through a series of transformations between non-physical intermediate states (windows) connecting the initial to the final state by means of a coupling parameter l, varying from 0 to 1. All FEP and TI simulations were performed in both decoupling and coupling directions. For FEP simulations, the maximum-likelihood estimator of the free-energy difference was evaluated using data collected in both directions by means of the Bennett Acceptance Ratio (BAR) using the ParseFEP plugin [J. Chem. Theory Comput., 2012, 8, 2606 of VMD. Both ∆ alch site and ∆ alch bulk FEP simulations were performed under the same conditions but using different stratification strategies tailored to ensure proper convergence of the free-energy estimates. Each window consists of 200 ps of equilibration followed by 800 ps of production. For ∆ alch site 60 windows were used for both coupling and uncoupling simulations. Contributions to ∆ rest site were determined through TI simulations stratified over 24 windows in both coupling and uncoupling directions. ∆ .,:,; 3456 and ∆ <,=,> 3456 were calculated through numerical integration of analytical expressions [Proc. Natl. Acad. Sci. U. S. A., 2005,102, 6825, Biophys. J., 2001,81, 1632.
In the case of FEP simulations, a soft-core potential was employed in which the interatomic distance, r, used in the Lennard-Jones potential, was shifted → C + 5 × (1 − ). Furthermore, in order to avoid the so-called ''end-point catastrophe'', electrostatic interactions of vanishing or appearing particles were linearly coupled to the simulation for l, varying from 0 to 0.5. At l values greater than 0.5, electrostatic interactions were fully decoupled from the simulation.

Macromolecular ECD Modeling
We employed the similar methodology as described in our previous works. [ Theor. Chem. Acc., 2015, 134, 36;J. Phys. Chem. B, 2016, 120, 3113;J. Chem. Theory Comp, 2017, 13, 3290] From the previously described molecular dynamics simulation, 100 structures have been extracted in order to represent the conformational space of each system. Then an ensemble of subunits is chosen in order to provide a suitable reproduction of the intrinsic circular dichroism of our system. Also based on our previous work [J. Chem. Theory Comp, 2017, 13, 3290], it appears that the best partitioning of the supramolecular chromophore is by achieved by decoupling it into 4 distinct chunks composed of three stacked nucleobases, as presented in Figure S12. Figure S12. Partitioning of the macromolecular aggregate used to obtain the semiempirical Frenkel Hamiltonian in the case of B-DNA (left) and G4 (right). The red square represent the chunks included in the QM partition. Note that in all the case three stacked nucleobases are simultaneously included in each partition, hence allowing for the excited states delocalization over the whole subsystem.
Each subunit is described by QM/MM, including polarizable embedding, and covalent bonds are treated using the link atom method (hydrogen atom) to compute the first 12 excited states for each subunit at the TD-DFT M062X/6-31G(d,p) level of theory. The computations a locally have been realized using a modified version of the Gaussian09 code coupled with Tinker. From the QM/MM procedures the energies of the excited states as well as their transition dipole moments are obtained, hence allowing the definition of the Frenkel Hamiltonian whose diagonalization will then provide excitonic coupled states' energies and the corresponding circular dichroism rotational strength (intensities). The convolution of the Frenkel excitonic spectrum obtained for each snapshots will allow obtaining the final ECD spectrum. This has been obtained through Gaussian functions of fixed width at half-length (FWHL) of 0.4 eV. Figure S13. Thermodynamic cycle using for the calculation of the binding free energy. The free energy terms are indicated.   Tableau 2: Binding free energy contributions for Chelerythrine/DNA complexes.