Butterfly Effect in Cytarabine: Combined NMR-NQR Experiment, Solid-State Computational Modeling, Quantitative Structure-Property Relationships and Molecular Docking Study

Cytarabine (Ara-C) is a synthetic isomer of cytidine that differs from cytidine and deoxycytidine only in the sugar. The use of arabinose instead of deoxyribose hinders the formation of phosphodiester linkages between pentoses, preventing the DNA chain from elongation and interrupting the DNA synthesis. The minor structural alteration (the inversion of hydroxyl at the 2′ positions of the sugar) leads to change of the biological activity from anti-depressant and DNA/RNA block builder to powerful anti-cancer. Our study aimed to determine the molecular nature of this phenomenon. Three 1H-14N NMR-NQR experimental techniques, followed by solid-state computational modelling (Quantum Theory of Atoms in Molecules, Reduced Density Gradient and 3D Hirshfeld surfaces), Quantitative Structure–Property Relationships, Spackman’s Hirshfeld surfaces and Molecular Docking were used. Multifaceted analysis—combining experiments, computational modeling and molecular docking—provides deep insight into three-dimensional packing at the atomic and molecular levels, but is challenging. A spectrum with nine lines indicating the existence of three chemically inequivalent nitrogen sites in the Ara-C molecule was recorded, and the lines were assigned to them. The influence of the structural alteration on the NQR parameters was modeled in the solid (GGA/RPBE). For the comprehensive description of the nature of these interactions several factors were considered, including relative reactivity and the involvement of heavy atoms in various non-covalent interactions. The binding modes in the solid state and complex with dCK were investigated using the novel approaches: radial plots, heatmaps and root-mean-square deviation of the binding mode. We identified the intramolecular OH···O hydrogen bond as the key factor responsible for forcing the glycone conformation and strengthening NH···O bonds with Gln97, Asp133 and Ara128, and stacking with Phe137. The titular butterfly effect is associated with both the inversion and the presence of this intramolecular hydrogen bond. Our study elucidates the differences in the binding modes of Ara-C and cytidine, which should guide the design of more potent anti-cancer and anti-viral analogues.


Introduction
Recent drug development efforts have focused on the search for therapeutic agents that can effectively treat diseases associated with high global mortality: cancer and viral diseases.The class of compounds that holds significant importance is the nucleoside analogues, synthetic forms of natural nucleosides, nucleotides and bases that have been chemically modified.The naturally occurring nucleosides are composed of an aglycone moiety (a pyrimidine-or purine-derived base) and a glycone moiety (sugar: β-D-ribofuranose in the RNA or 2 -deoxy-β-D-ribofuranose in the DNA), Figure 1.
chemically modified.The naturally occurring nucleosides are composed of an ag moiety (a pyrimidine-or purine-derived base) and a glycone moiety (sugar ribofuranose in the RNA or 2′-deoxy-β-D-ribofuranose in the DNA), Figure 1.Nucleoside analogues with a modified aglycone part constitute a starting po the synthesis of numerous chemotherapeutics, which possess significant binding affi to both DNA and RNA.They act as competitive inhibitors of viral DNA or RNA syn and when incorporated into the growing nucleic acid chain they cause pre termination of replication.After phosphorylation, their activity increases due presence of an additional fragment, which easily interacts with enzymes invol nucleic acid synthesis.Thus, many of them are used as prodrugs.Both mechanism nucleoside analogues effective against viral infections (HIV, Hepatitis B and Herpe and in the treatment of certain types of cancer (especially hematological).Therefor constitute an important class of pharmaceutical active pharmaceutical ingredients widely used in the drugs development.
The isolation of a number of unique nucleosides including spongothymidine ( arabofuranosylthymine) and spongouridine (  Nucleoside analogues with a modified aglycone part constitute a starting point for the synthesis of numerous chemotherapeutics, which possess significant binding affinities to both DNA and RNA.They act as competitive inhibitors of viral DNA or RNA synthesis, and when incorporated into the growing nucleic acid chain they cause premature termination of replication.After phosphorylation, their activity increases due to the presence of an additional fragment, which easily interacts with enzymes involved in nucleic acid synthesis.Thus, many of them are used as prodrugs.Both mechanisms make nucleoside analogues effective against viral infections (HIV, Hepatitis B and Herpesvirus) and in the treatment of certain types of cancer (especially hematological).Therefore, they constitute an important class of pharmaceutical active pharmaceutical ingredients (APIs), widely used in the drugs development.
The isolation of a number of unique nucleosides including spongothymidine (3-β-D-arabofuranosylthymine) and spongouridine (3-β-D-arabofuranosyluracil) from the Caribbean sponge Cryptotethya crypta (now Tectitethya crypta) by Bergmann et al. [1], and the observation that these compounds can act as terminators of the DNA synthesis chain, initiated research on the new and separate class of nucleotides with modified glycone part: arabinosides.The possibility that they have a role as anti-viral, anti-cancer as well as anti-cellular senescence drug has sparked interest in them.In 1959, Walwick, Roberts and Dekkerin [2] synthesized Cytarabine, (Cytosine arabinoside; Cytosine-1-β-D-arabinofuranoside; Ara-C), a synthetic isomer of cytidine, that differs from cytidine and deoxycytidine, only in the sugar (D-arabinose instead of ribose and deoxyribose, respectively), Figure 2.  Nucleoside analogues with a modified aglycone part constitute a starting point for the synthesis of numerous chemotherapeutics, which possess significant binding affinities to both DNA and RNA.They act as competitive inhibitors of viral DNA or RNA synthesis, and when incorporated into the growing nucleic acid chain they cause premature termination of replication.After phosphorylation, their activity increases due to the presence of an additional fragment, which easily interacts with enzymes involved in nucleic acid synthesis.Thus, many of them are used as prodrugs.Both mechanisms make nucleoside analogues effective against viral infections (HIV, Hepatitis B and Herpesvirus) and in the treatment of certain types of cancer (especially hematological).Therefore, they constitute an important class of pharmaceutical active pharmaceutical ingredients (APIs), widely used in the drugs development.
The isolation of a number of unique nucleosides including spongothymidine (3-β-Darabofuranosylthymine) and spongouridine (3-β-D-arabofuranosyluracil) from the Caribbean sponge Cryptotethya crypta (now Tectitethya crypta) by Bergmann et al. [1], and the observation that these compounds can act as terminators of the DNA synthesis chain, initiated research on the new and separate class of nucleotides with modified glycone part: arabinosides.The possibility that they have a role as anti-viral, anti-cancer as well as anticellular senescence drug has sparked interest in them.In 1959, Walwick, Roberts and Dekkerin [2] synthesized Cytarabine, (Cytosine arabinoside; Cytosine-1-β-Darabinofuranoside; Ara-C), a synthetic isomer of cytidine, that differs from cytidine and deoxycytidine, only in the sugar (D-arabinose instead of ribose and deoxyribose, respectively), Figure 2. The constant interest in Ara-C results from its specific mode of action, the mechanism of which remains unclear despite the passage of time.Ara-C is known to be transported into the cell primarily by Human Equilibrative Nucleoside Transporter 1 (hENT-1).Activation of Ara-C in cells occurs as a result of de novo synthesis of 5′-mono-, di-, and triphosphate derivatives throughout the sequential action of nucleoside-phosphate kinases: deoxycytidine kinase (dCK), deoxycytidine monophosphate kinase (dCMK), and nucleoside diphosphate kinase (NDPK) [3].dCK plays an important role in preventing Ara-C from being converted to an inactive metabolite called uracil arabinoside (Uracil-1β-D-arabinofuranoside) by cytidine deaminase, which occurs by phosphorylating Ara-C.After being rapidly converted to arabinonucleoside triphosphate (Ara-C-CTP), it becomes a substrate of DNA polymerases α, β, γ and η [4].Ara-C-CTP hinders DNA polymerases α and η, the enzyme responsible for initiating DNA replication, by outcompeting the natural substrate deoxycytidine triphosphate (dCTN-CTP).Nonetheless, it merely acts as a modest competitive inhibitor of this enzyme.Another factor has been shown to hinder DNA synthesis, namely the incorporation of Ara-C into DNA.The use of arabinose instead of deoxyribose hinders the formation of phosphodiester linkages between pentoses, preventing the DNA chain from elongation.As a result, the DNA synthesis process is interrupted, the cell cycle from G1 to the S phase is blocked and apoptosis occurs.The use of a non-functional nucleoside to inhibit DNA strand extension is a strategy used in other chemotherapeutics of natural origins (e.g., adenine arabinose or azidothymidine), with only Ara-C having strong anti-cancer properties, while the remaining analogues have anti-viral properties.Although Ara-C has some anti-viral activity against Herpesvirus (HHV) [5] and Vaccinia Virus (VV) [6], it is not highly selective and causes serious side effects.
Ara-C is highly effective against hematologic malignancies (e.g., acute lymphocytic leukemia, acute/chronic myeloid leukemia, acute promyelocytic leukemia, anaplastic large cell lymphoma, Burkitt lymphoma, Langerhans cell histiocytosis, leptomeningeal carcinoma and non-Hodgkin's lymphoma) [7,8] and its efficacy against carcinomas is limited to sarcomas (Nakahara-Fukuoka sarcoma, reticulum cell sarcoma, ascites sarcoma-180), adenocarcinomas (adenocarcinoma-755, spontaneous mammary adenocarcinoma) [9,10] and Ehrlich ascites carcinoma [11], among others, by the method of administration and side effects (chronic toxicity to internal organs including cerebrum).Efforts to improve its efficacy against the so-called "civilization diseases", through drug synergism, targets modification or allosteric modulation, are ongoing.Ara-C exhibit remarkable synergistic effect with other anti-cancer drugs including Daunorubicin, Vinblastine and cis-diaminodichloroplatin (II) (CDDP).Recently, it has been discovered that combining Ara-C with daunorubicin or anthracyclines in liposomal formulations leads to the significant positive outcomes of about 60% [12,13] or 80% [14], respectively.Low-dose cytarabine combined with Venetoclax and 5-azacitidine achieved remission The constant interest in Ara-C results from its specific mode of action, the mechanism of which remains unclear despite the passage of time.Ara-C is known to be transported into the cell primarily by Human Equilibrative Nucleoside Transporter 1 (hENT-1).Activation of Ara-C in cells occurs as a result of de novo synthesis of 5 -mono-, di-, and triphosphate derivatives throughout the sequential action of nucleoside-phosphate kinases: deoxycytidine kinase (dCK), deoxycytidine monophosphate kinase (dCMK), and nucleoside diphosphate kinase (NDPK) [3].dCK plays an important role in preventing Ara-C from being converted to an inactive metabolite called uracil arabinoside (Uracil-1-β-Darabinofuranoside) by cytidine deaminase, which occurs by phosphorylating Ara-C.After being rapidly converted to arabinonucleoside triphosphate (Ara-C-CTP), it becomes a substrate of DNA polymerases α, β, γ and η [4].Ara-C-CTP hinders DNA polymerases α and η, the enzyme responsible for initiating DNA replication, by outcompeting the natural substrate deoxycytidine triphosphate (dCTN-CTP).Nonetheless, it merely acts as a modest competitive inhibitor of this enzyme.Another factor has been shown to hinder DNA synthesis, namely the incorporation of Ara-C into DNA.The use of arabinose instead of deoxyribose hinders the formation of phosphodiester linkages between pentoses, preventing the DNA chain from elongation.As a result, the DNA synthesis process is interrupted, the cell cycle from G1 to the S phase is blocked and apoptosis occurs.The use of a non-functional nucleoside to inhibit DNA strand extension is a strategy used in other chemotherapeutics of natural origins (e.g., adenine arabinose or azidothymidine), with only Ara-C having strong anti-cancer properties, while the remaining analogues have anti-viral properties.Although Ara-C has some anti-viral activity against Herpesvirus (HHV) [5] and Vaccinia Virus (VV) [6], it is not highly selective and causes serious side effects.
The new meta-analysis carried out in [17] has confirmed that the addition of cytarabine to the therapeutic regimen of newly diagnosed patients with primary central nervous system lymphoma (PCNSL) patients is highly beneficial.In Europe and the United States, multiple courses of high-dose Ara-C are used as standard consolidation treatment for AML patients who have achieved complete remission [18].This therapeutic protocol results in higher long-term survival especially for patients in the intermediate-risk group (2022 ELN risk stratification) [19].Therefore, Ara-C is one of the few anti-cancer drugs that have been used in nearly unmodified treatment protocols for more than 50 years.Its presence on the World Health Organization's eEML Model List testifies to the importance it holds.
Surprisingly, Ara-C and naturally occurring cytidine, that differ only in the orientation of a single hydroxyl group in the 2 position (as shown in Figure 2a,b) exhibit completely different biological activities.Cytidine is used as a glutamatergic anti-depressant drug to manage neuropsychiatric deficits associated with cerebrovascular diseases by controlling neuronal-glial glutamate cycling.Its active form, 5 -triphosphate CTP, takes part in the synthesis of ribonucleic acid (RNA) as a building block or as an activating group in the synthesis of lipids such as lecithin, cephalin and cardiolipin.Therefore, the small local structural change limited to the glycone moiety has significant consequences for biological activity.
The aim of our study is to explain how much the hydroxyl inversion, optimization or change of a substituent within glycone moiety affects the distribution of electron density the molecule, and thus the ability to form intermolecular bonds, which determines its biological activity.
For this purpose, we have chosen a highly sensitive technique that is able to detect local changes, even in places that are far away from any structural changes.The physical property describing the charge distribution in molecules and their charge related properties is electrostatic potential, which is directly proportional to the magnitude of the charge and inversely proportional to the distance from the charge.However, its variability between individual molecular environments is relatively small and subtle changes are difficult to assess.The set of the second-order derivatives of the electrostatic potential constituting the Electric Field Gradient (EFG) tensor, is very sensitive even to minor changes in the electron density distribution at the nucleus.The substituent effects or molecular motions result in the changes in the electrostatic potential reflected by the EFG tensor and observables-NQR frequencies.Since the components of the EFG scale with 1/r 3 , where r is the distance from the nucleus, they are highly sensitive primarily to the changes in the immediate vicinity of the nucleus, and slightly weaker to the changes relatively distant from the nucleus.By probing the EFG it is possible to gain insight into the electron density distribution throughout the molecule.The experimentally detected or theoretically calculated components of the EFG tensor provide a description of the nearest nuclear vicinity.However, from an experimental point of view, it is not easy task and demand the use of three 1 H-14 N NMR-NQR (Nuclear Magnetic Resonance-Nuclear Quadrupole Resonance) techniques, namely multiple frequency sweeps, Larmor frequency scanning, and the two-frequency irradiation.When supplemented with the solid-state computational modelling (Quantum Theory of Atoms in Molecules [20,21], Reduced Density Gradient [22] and 3D Hirshfeld surfaces [23]), and Quantum Structure-Property Relationship [24] deliver deep insight into three-dimensional packing and reactivity at the atomic and molecular levels.
These methods shed some light on the pattern of intra-and intermolecular interactions in solid Ara-C, cytidine, and some closely related compounds lacking at least one -OH group (2 -deoxycytidine, 3 -deoxycytidine, 3 -deoxy-3 ,4 -didehydrocytidine, 5-azacytidine, Zalcitabine, Decitabine or Gemcitabine).Ara-C and cytidine differ only slightly, so one would expect their interaction patterns to be similar and the differences to be no greater than those observed in the polymorphic forms.However, a slight structural alteration, the inversion of hydroxyl group at the 2 -position of the sugar, leads to change of the biological activity from antidepressant and DNA/RNA block builder (cytidine) to powerful anticancer (Ara-C).We have discovered that above mentioned modification causes significant changes in the electron density distribution and binding mode, clearly visible in the NQR spectrum.The use of molecular docking enabled us to examine and explain the binding of Ara-C and closely related compounds to dCK.We identified the intramolecular OH•••O hydrogen bond as the key factor responsible for forcing the glycone conformation and strengthening NH•••O bonds with Gln97, Asp133 and Ara128, and stacking with Phe137.The title butterfly effect is associated with its presence.
However, our aim was not only to identify differences in ligand binding but also to develop a method for comparing them in the solid state and in protein-ligand complexes.To date, no thorough comparative analysis of binding modes in solid-state and proteinligand complexes upon changes in ligands has been conducted.Therefore, we have developed new methods for quantitative and qualitative comparison of binding modes.Heat maps, that are commonly used to compare amino acid sequences in proteins, can help to visualize and classify differences in binding modes in both the solid state and proteinligand complexes.A new parameter, RMSD_BM (the root mean square distance between binding modes), has been introduced to describe the relative distance between the binding modes in both cases using numerical values.It has been demonstrated that modifications to the glycone moiety have an impact on the distribution of electron density within the aglycone moiety.This, in turn, affects its binding capacity in both the solid state and proteinligand complex.By correlating these findings with the relative ligand reactivity parameter recently introduced by us, it is possible to screen the ligands for their potential biological activity.The glycone moiety plays a crucial role in the ligands studied as it significantly influences the electron density distribution within aglycone moiety.The comparison of the selected ligands, namely Ara-C (anti-leukemic), cytidine (anti-depressant and RNA component), Gemcitabine (anti-cancer), Zalcitabine (anti-retroviral) and Decitabine (antileukemic and anti-virial) performed in this paper illustrates the usefulness of the new methods quite well.The methods proposed for comparing ligand reactivity, solid-state and protein-ligand binding modes show promise for virtual screening ligands and effective search for drug improvements.

1 H-14 N NQDR Spectrum
We used Larmor frequency scanning [25] and multiple frequency sweep [26] as the first 14 N NQR frequency localization techniques.Measurements were performed at room temperature (295 K).
The proton polarization period was set to 30 s and the relaxation period was set to 0.5 s.We used linear frequency sweeps in the range from 1.6 MHz to 3.5 MHz.The duration of a single cycle was set to 10 ms.The average amplitude of the RF magnetic field during the sweep was approximately 2 mT.The Larmor proton frequency scan was performed in 10 kHz steps, repeating it four times and averaging at each step.By this technique, the low-frequency part of the 14 N NQR spectrum of Ara-C was detected.In the next step, by varying the frequency limits of the sweeps, we located the higher 14 N NQR frequencies ν + and ν − .In order to increase the resolution to about ±5 kHz, we used as the final step the two-frequency irradiation technique.The two rf magnetic fields were applied as a series of repetitive pulses with the pulse length of 1 ms.The amplitudes of the two rf magnetic fields were set to 0.2 mT.
The 14 N NQR spectrum and position of the lines are shown in Figure 3, and the frequencies and e 2 qQ/h and η calculated using Equation (3) are collected in Table 1.
So far, only a few compounds structurally similar to Ara-C have been investigated.The few spectra recorded for them are usually incomplete (e.g., cytidine and 5-azacytidine, Table 1).While the detection of frequencies for the -NH 2 group did not pose any particular difficulties, the remaining frequencies are often missing (cytidine and 5-azacytidine [27]), Table 1.Nonetheless, the NQR spectrum of Ara-C differs significantly from that of structurally similar cytidine, Figure 3.   [28,29].* calculated from ν+ − ν− = ν0.Table 1. 14N NQR frequencies, quadrupole coupling constants e 2 qQ/h and the asymmetry parameters η in Ara-C at T = 295 K and related compounds.

Compound
Nitrogen Site The resonance lines in NQR spectrum assigned to -N= and >N-sugar are "clean", i.e., they are normally broadened, with close full width at half maximum (FWHM), implying no structural disorder.The absence of multiplicity implies the presence of equivalent molecules per unit cell, which is generally consistent with the X-ray results [30].
The surroundings of the three nitrogen nuclei in the Ara-C molecule show significant differences due to hybridization and substituent effects, making them chemically inequivalent.It is highly likely that nitrogen signals can be correctly assigned solely based on the NQR spectra, due to the different e 2 Qq/h and η values for each of them.In Ara-C molecule, the N(1) nitrogen frequencies can be readily distinguished from others due to the strong proton-nitrogen interaction, which affects the double resonance spectra.Therefore, they can be assigned to -NH 2 , for which level crossing and solid effects are usually easy observed.However, the -NH 2 resonance lines appear to be doublets despite the lack of disorder in X-ray structure.In most NQR studies, the quadrupole interaction is assumed to be steady, but certain molecular motions can strongly affect the quadrupolar interaction, leading to the EFG tensor fluctuations.The formation of doublets can be the result of molecular motions: hindered rotation of the -NH 2 group around the NC axis (two site jumps) and/or proton-jumps (proton hopping) in hydrogen bonds.The former is usually activated at relatively high temperatures, while the latter occurs even below liquid nitrogen temperatures.The -NH 2 group of Ara-C is involved in two N-H•••O hydrogen bonds, which are bent.The N•••O distances are remarkably short (2.983 and 3.048 Å).The proton transfer scores calculated for these bonds based on geometrical parameters are 7.45 and 8.40, respectively.Thus, the shorter one should be slightly more stable.However, at the higher temperatures, the hydrogen bond involving 3 -oxygen may be weakened due to the large amplitude motion of the sugar ring.This kind of effect was observed in 5-azacytidine and cytidine [27,28] as well as thymidine by 2 H NMR [29].
The set of resonance lines 2.530, 2.315 and 0.215 MHz assigned to nitrogen N(3) are characterized by a low asymmetry parameter.There are two possible nitrogen sites in the Ara-C molecule, -N= and >N-sugar, to which this set of lines can be assigned.However, the nitrogen >N-sugar forms the glycone-N-aglycone bridge and participate in the three single N-C covalent bonds, Figure 1.Even though it is located close to the oxygen atom, whose lone pairs are adjacent to the π system, a relatively low asymmetry parameter is expected.Thus, this set of frequencies appears to correspond to >N-sugar nitrogen site.This is consistent with the fact that frequencies for this kind of nitrogen site can be detected easily using the combined multiple frequency sweeping and dual-frequency irradiation techniques.
Detecting the resonance lines originating from -N= nitrogen is a challenging task.This nitrogen participates only in two NC covalent bonds, which indicates a high value of the asymmetry parameter, similar to that for the -N= nitrogen of cytidine or cytosine.However, the addition of an extra nitrogen atom in 5-azacytidine reduces the asymmetry parameter because this nitrogen shares its lone pair with the π system, Table 1.The NQR frequencies for this nitrogen site can only be detected using the advanced Fast Field Cycling (FFC) relaxometry techniques.
Based on the experimental data, it seems that the two sets of resonance lines yielding in the lowest η should be assigned to the >N-sugar and -NH 2 sites, while the third to the remaining -N=.Tentative assignments to individual sites are provided in Table 1.
Quantum chemical calculations to verify the correctness of the frequency assignment were performed in different variants (single molecule, cluster and crystal).Modeling the NQR spectrum by assuming a single molecule is almost always too imprecise due to the neglect of the surroundings.The only advantage of this approach is that it does not require knowledge of the crystal structure, but at the cost of conformational searches that may not always yield a solid-state conformer.Two other approaches (cluster and solid) require a consistent, high-quality crystalline structure as input, but do not require conformational analysis, except in special cases such as structural or dynamic disorder.The Cambridge Structural Database (CSD) contains only one Ara-C crystalline structure [27].Calculations were performed for this specific structure, having previously optimized the positions of the hydrogen atoms.We have also predicted the NQR spectrum for β-cytidine and 5-azacytidine, and for other compounds for which crystallographic structures are available.The results are listed in Table 2 and visualized in Figure 4.  4 ref.[33], 5 ref.[34], 6 ref. [35].
Calculations performed at the GGA/RBPE level allow obtaining reasonable NQR parameters at a relatively low computational cost.Optimization of the proton positions is first step, especially important for the -NH 2 group and results in a decrease in η and in an increase in |e 2 qQ/h|.Using other GGA functionals results in more scattered frequencies but the same assignment.Cluster calculations performed at the M062X/6-311+G(d,p) level additionally confirmed the correctness of frequency assignment.Moreover, the results of the calculations suggest that the spectrum for 5-azacytidine was previously misinterpreted [27], and the set of frequencies for the -N= site does not correspond to different nitrogen locations, but to a doublet.Furthermore, the NQR parameters indicate that the X-ray structure of 3 -deoxycytidine [33] is very unreliable (one of the inequivalent molecules has a defective geometry).Calculations performed at the GGA/RBPE level allow obtaining reasonable NQR parameters at a relatively low computational cost.Optimization of the proton positions is first step, especially important for the -NH2 group and results in a decrease in η and in an increase in |e 2 qQ/h|.Using other GGA functionals results in more scattered frequencies but the same assignment.Cluster calculations performed at the M062X/6-311+G(d,p) level additionally confirmed the correctness of frequency assignment.Moreover, the results of the calculations suggest that the spectrum for 5-azacytidine was previously misinterpreted [27], and the set of frequencies for the -N= site does not correspond to different nitrogen locations, but to a doublet.Furthermore, the NQR parameters indicate that the X-ray structure of 3′-deoxycytidine [33] is very unreliable (one of the inequivalent molecules has a defective geometry).
In general, NQR data suggest that -NH2 and to a lesser extent -N= form hydrogen bonds in solid state and thus should play a key role in intermolecular bond formation in Ara-C, cytidine and related compounds.
The -NH2 amine group is flexible and has lone pairs localized on the atoms adjacent to the π system.Thus, it is electron donating group, prone to forming strong N-H•••O hydrogen bonds and activating the aromatic ring by increasing the electron density on the ring at the ortho and para positions.The oxygen =O substituted directly into the pyrazine ring at the C(4) position, is both σ electron withdrawing and strongly π electron donating and acts similarly to -NH2.Therefore, both are excellent π-donors and strongly affects -N(3)= and Ry.A comparison of the 14 N NQR parameters modeled in the absence and presence of hydrogen bonds revealed the effect of hydrogen bond.The population of the π orbital, which is normal to the plane containing N-C and N-H bonds is higher than the population of their σ-orbitals.Thus, the Z axis of the EFG tensor at -NH2 is orientated along the π orbital.Any increase in the population of the N-H bond, due to the charge transfer from the donor electron pair in the hydrogen bond results in a decrease in e 2 Qq/h.But the N-H bond population is higher than that of N-C, hence the deviation from the In general, NQR data suggest that -NH 2 and to a lesser extent -N= form hydrogen bonds in solid state and thus should play a key role in intermolecular bond formation in Ara-C, cytidine and related compounds.
The -NH 2 amine group is flexible and has lone pairs localized on the atoms adjacent to the π system.Thus, it is electron donating group, prone to forming strong N-H•••O hydrogen bonds and activating the aromatic ring by increasing the electron density on the ring at the ortho and para positions.The oxygen =O substituted directly into the pyrazine ring at the C(4) position, is both σ electron withdrawing and strongly π electron donating and acts similarly to -NH 2 .Therefore, both are excellent π-donors and strongly affects -N(3)= and Ry.A comparison of the 14 N NQR parameters modeled in the absence and presence of hydrogen bonds revealed the effect of hydrogen bond.The population of the π orbital, which is normal to the plane containing N-C and N-H bonds is higher than the population of their σ-orbitals.Thus, the Z axis of the EFG tensor at -NH 2 is orientated along the π orbital.Any increase in the population of the N-H bond, due to the charge transfer from the donor electron pair in the hydrogen bond results in a decrease in e 2 Qq/h.But the N-H bond population is higher than that of N-C, hence the deviation from the axial symmetry increases, and consequently η increases.The fact that e 2 Qq/h is significantly lower, but η higher, for the amine nitrogen in the solid than in a single molecule supports the aforementioned argument and is consistent with other observations (e.g., imidazole).
The NQR parameters of -N(3)= correspond to the populations of the lone-pair orbital σ LP and the nitrogen orbitals involved in the σ NC and π bonds to carbon.Upon hydrogen bond formation, the EFG tensor at -N(3)= maintains its orientation.However, a decrease in σ LP of -N(3)= leads to higher e 2 Qq/h and η values.The frequencies ν 0 , ν + , and ν − are proportional to the populations of the nitrogen atom bonds and both NQR parameters.As a result, the NQR frequencies have changed slightly.The solid-state shift of the NQR parameters for nitrogen -N(3)= is rather minor, with η well reproduced and e 2 Qq/h is only slightly exaggerated.The latter suggests that -N(3)= is involved in weak intermolecular interactions in the solid state.A comparison of the NQR parameters for Ara-C, cytidine and cytosine shows an increase in e 2 Qq/h and a decrease in η, after adding glycone residue (ribose) or changing it from ribose to arabinose regardless of the nitrogen site.Despite a minor structural modification, the variations in NQR parameters that reflect these changes are readily apparent.However, it is uncertain whether the observed differences are a result of substitution or a different mode of binding.

Binding Motifs
Hydrogen bonds are the most common non-covalent molecular binding motifs in drug structures.Although the Etter rule [36,37] states that the molecule will attempt to form as many hydrogen bonds as possible, in practice some of them may not be feasible (e.g., due to a specific conformation or involvement in the intramolecular interactions).Ara-C containing amine, oxygen and hydroxy moieties is highly prone to form them in solid and in protein-ligand complexes.Even in solid, Ara-C and cytidine structures differ only slightly, thus, their interaction patterns are anticipated to be similar.Differences no greater than those typically observed in polymorphic forms are expected.A comparison with 2 -deoxycytidine, 3 -deoxycytidine and 2 ,3 -dideoxycytidine, which have at least one less -OH group, and the inclusion of different polymorphic forms can provide a reliable reference point.Unfortunately, the 5-azacytidine or Gemcitabine solid-state structures has not yet been resolved.The analysis of 3D Hirshfeld Surfaces (3D HS), Figure 5, provides deep insight into the homo-and hetero-nuclear intermolecular contacts, Table 3. Heatmap, visualizing the percentage contributions to the 3D Hirshfeld surface area calculated for each pair of species listed in Table 3, is shown in Figure 6.
Due to the identical atomic composition the same types of contacts occur in all systems, Table 3, Figure 6.About 50-70% of the percentage contributions comes from HH and OH/HO contacts.The CH/HC and NH/HN have much smaller contributions of about 8-16% and 10-20%, respectively.The remaining homonuclear (CC, NN and OO) or heteronuclear (NO/ON, NC/CN and CO/OC) contributions are negligibly small.Moreover, the dominant contacts are consistent across all studied ligands, with only their percentages varying.However, after sugar substitution, the amount of NH/HN and CH/HC interactions decreases, while OH/HO and HH increases, Figure 7.   (β form), (f) 3′-deoxycytidine, (g) 2′,3′-dideoxycytidine (Zalcitabine), (h) 5-aza-2′-deoxycytidine (Decitabine) and (i) cytosine.
Heatmap, visualizing the percentage contributions to the 3D Hirshfeld surface area calculated for each pair of species listed in Table 3, is shown in Figure Due to the identical atomic composition the same types of contacts occur in all systems, Table 3, Figure 6.About 50-70% of the percentage contributions comes from HH and OH/HO contacts.The CH/HC and NH/HN have much smaller contributions of about 8-16% and 10-20%, respectively.The remaining homonuclear (CC, NN and OO) or heteronuclear (NO/ON, NC/CN and CO/OC) contributions are negligibly small.Moreover, the dominant contacts are consistent across all studied ligands, with only their percentages varying.However, after sugar substitution, the amount of NH/HN and CH/HC interactions decreases, while OH/HO and HH increases, Figure 7.
As can be seen from Figure 7, the effect of the inversion of the hydroxyl group in the 2′ position of the sugar is subtle, and its detection strongly depends on the quality of the structure.The poorer the quality of the structure, the harder it is to notice.Moreover, polymorphic form matters.
The Euclidean distance (ED) and root mean square deviation (RMSD) showing the global differences between the percentages in the 3D Hirshfeld of all contacts for Ara-C and the compounds studied have the lowest values for β-cytidine and 2′-deoxycytidine, and the highest for cytosine, Table 4.
The values of the enrichment ratio, EXY, of the main intermolecular contacts, which reveals privileged (EXY > 1) and disfavored (EXY < 1) contacts between every two atomic species, X and Y, are collected in Table 5.As can be seen from Figure 7, the effect of the inversion of the hydroxyl group in the 2 position of the sugar is subtle, and its detection strongly depends on the quality of the structure.The poorer the quality of the structure, the harder it is to notice.Moreover, polymorphic form matters.
The Euclidean distance (ED) and root mean square deviation (RMSD) showing the global differences between the percentages in the 3D Hirshfeld of all contacts for Ara-C and the compounds studied have the lowest values for β-cytidine and 2 -deoxycytidine, and the highest for cytosine, Table 4.The values of the enrichment ratio, E XY , of the main intermolecular contacts, which reveals privileged (E XY > 1) and disfavored (E XY < 1) contacts between every two atomic species, X and Y, are collected in Table 5. H atoms provide up to 70% of the molecular surface in investigated compounds, whereas the other contributing components C, O, and N generate much smaller but very similar percentages of the molecular surface-from about 6 to 18%, Table 5.
The percentages of the molecular surface of the crystalline structures of the nucleosides studied, are close (the differences do not exceed 3-5%), while those for cytosine differ by as much as 10%.The Euclidean distance (ED), Table 6, showing the difference between the enrichment ratios in Ara-C and studied compounds, takes the lowest values for 2 -deoxycytidine, Zalcitabine and β-cytidine, and the highest for α-cytidine and 3deoxycytidine.Thus, the privileged and disfavored contacts between every two atomic species are in 2 -deoxycytidine and β-cytidine similar to Ara-C.It should be noted that they differ significantly for each pair of polymorphic forms.Detailed inspection shows that Ara-C resembles the β-form of cytidine in terms of the interactions pattern.Among C•••C and H•••C interactions that compete in the crystal structure, only C• • • H contacts are preferred, but their enrichment ratio, E CH , in Ara-C, cytidine and cytosine is smaller than unity, because H are involved in hydrogen bonds.In 2 -deoxycytidine, E CH > 1, which suggests that H participates in hydrogen bonds to a lesser extent.The C•••C contacts are significantly impoverished in all studies structures, E CC ≈ 0 or E CC > 2 (due to division by small number), so their participation in π•••π stacking is electrostatically unfavorable.A very high E CH /E CC in Ara-C and β-cytidine suggests the classification of the packing motifs as close and herringbone type.The H•••H contacts appear with an enrichment ratio slightly smaller than unity (E HH ≈ 0.9) due to their strong competition with the hydrogen acceptors H In general, in Ara-C, α-cytidine and cytosine the favored interaction partners for N, whose contribution to the 3D HS molecular surface is about 7, 13 and 6%, respectively, are C and H species, while O, followed by N are disfavored.In β-cytidine and 2 -deoxycytidine the favored interactions partner for N, whose contribution to the 3D HS molecular surface is about 6%, is only H species, while C, followed by O and N are disfavored.In all structures the favored interaction partners for O are H, which suggests its participation in many hydrogen bonds.
The participation of C, N, and O atoms in the individual intermolecular hydrogen bonds can be identified with the help of the 2D molecular fingerprints calculated based on 3D Hirschfeld surfaces.As follows from Figure 8, they differ significantly, in both shape and size for all the compounds studied.
The 2D molecular fingerprints of cytosine, 2 -deoxycytidine (β form), and β-cytidine most closely resemble those of Ara-C.However, the visual comparison of the total 2D molecular fingerprints is difficult.The characteristic features of the local 2D fingerprint   (a In the 2D FP plots, Figure 10, the O hydrogen bonds are represented by symmetric spikes ("wings").These sharp and small wings are located at de + di~1.75-2.1 Å and cover a relatively small area of 22-31.5% of the total 3D HS.In β-cytidine, these interactions seem stronger than in Ara-C.The 14 N NQR parameters for the corresponding nitrogen atoms in Ara-C and cytidine differ significantly, suggesting a different binding mode in each crystal structure.The differences between the NH-limited 2D fingerprints of Ara-C and cytidine, Figure 10a-c, are significant and much greater than those between their HH-or OH-limited 2D fingerprints.Furthermore, according to the local 2D fingerprints the hydrogen bonds involving nitrogen are relatively weak in Ara-C when compared to α-cytidine, β-cytidine 2′-deoxycytidine β, Zalcitabine or cytosine.
The C•••H/H•••C contacts are of minor importance.They are represented by the most external wide spikes at de + di~2.7-3.0Å and cover 3.3-16.0% of the total 3D HS, Figure 12.Thus, the percentage of the three-dimensional Hirshfeld surface or enrichment ratios seems to indicate only general trends in intermolecular contacts.

Strength of the Interactions
The differences in percentages in the 3D Hirshfeld surface between Ara-C and the nucleotides (2 -deoxycytidine, 3 -deoxycytidine or 2 ,3 -dideoxycytidine) are relatively small compared to those between Ara-C and cytosine.The 3D HS surface, with its normalized contact distance d norm , shape index and curvature mapped over it, provides additional information about the nature of the non-covalent interactions in Ara-C, Figure 9 The energy of these interactions was calculated using a pair model, Table 7.It is reasonable to assume that they will also have a significant impact on protein-ligand interactions.

Differences in Binding Modes in the Solid State
The root-mean-square deviation of the binding modes (RMSD_BM), which shows the difference between the interactions in the solid-state structure of the Ara-C and studied ligand, is listed in Table 8.The total RMSD_BM, which can be interpreted as binding distance, suggests that the binding mode in Ara-C is the most similar to that in 2 -deoxycytidine α, Decitabine and Zalcitabine.A comparison of the RMSD of the interactions in cytidine α and 2deoxycytidine β suggests that they are significantly different from the interactions in Ara-C, which is surprising.Despite the poor quality of the 3 -deoxycytidine structure, the RMSD_BM values do not differ significantly from the others.
However, the RMSD_BM of HH contacts in Ara-C is close to that in Decitabine, 2deoxycytidine, α and β and Zalcitabine.In contrast, the RMSD of OH and NH contacts in Ara-C are similar to those in cytidine β and Decitabine.In turn, the RMSD_BM describing CH contacts in Ara-C is similar to those in Decitabine, cytidine β, Zalcitabine and cytosine.Thus, some contacts are strengthened, others weakened upon inversion or deoxygenation.While this is obvious in relation to glycone, it also applies to aglycone moiety, in particular nitrogen contacts, which would seem to be unaffected by the inversion.This confirms the conclusions drawn from experimental research-the effect of a small structural change in glycone is transferred through the system of bindings to distant sites in aglycone.It is reasonable to assume that such small changes will also have a significant impact on protein-ligand binding modes.

Binding to Deoxycytidine Kinase (dCK) 2.3.1. Molecular Docking Results
Human deoxycytidine kinase (dCK) is a well-known enzyme encoded by the human DCK gene, a potential suicide gene.It catalyzes the first step in the nucleoside salvage pathway, converting natural deoxyribonucleosides to their monophosphate forms [3].dCK also plays a key role in the phosphorylation of numerous nucleoside analog prodrugs routinely used in anti-cancer and anti-viral treatments.The rate-limiting step in the activation of Ara-C is deoxycytidine kinase-dependent phosphorylation.
The crystal structure of dCK in complex with Ara-C, 1P5Z [49] was obtained from the PDB database.Three of the dCK residues (Tyr86, Asp133 and Glu197) involved in the protein-ligand binding are in close proximity to the native ligand (2 Å), fourteen (Phe137, Gln97, Glu53, Phe96, Arg128, Trp58, Ile30, Tyr86, Ile200, Leu82, Val55, Arg104, Met85 and Lys34) are in further proximity of 3 Å and three remaining residues (Arg194, Leu141, and Ala100) in slightly distant proximity of 4 Å.Prior to docking, the native ligand that cocrystallized with the dCK was removed from the structure.The root-mean-square deviation (RMSD) of the best pose relative to the actual ligand redocked in its own binding site was only 0.113 Å.Thus, the quality of the docking process is very high.The same protocol was used to dock other ligands into the rigid dCK structure.In order to compare the docking quality of 2 -deoxycytidine and Gemcitabine, a re-docking task was also performed.They were docked in the original active sites derived from structures 1P60, 1P61 and 1P62 [49] (the RMSD of the best poses did not exceed 0.5 A).The weak point of classical molecular docking is the lack of flexibility of the residues.Therefore, we verified the results of the docking using two different techniques, the flexible ligands and the flexible residues, the latter being applied to the extent possible.The variations in binding modes are found to be meaningless (within the limits of repeatability of results).Molecular dynamics simulations performed using a coarse-grained approach, confirmed the conclusions drawn from flexible molecular docking.Under crystalline conditions (native state) only a few side chains: Asn77, Gly78, Thr64, Glu120, Ser63 and Gln79 show increased flexibility.However, the root-mean-square-fluctuation (RMSF) of a structure, which is the time average of the root mean square deviation (RMSD) of atomic structures did not exceed 1.54 Å.Assuming very high side-chain mobility (completely free small protein chains) Asn77, Leu221, Lys222, His218 and Gly78 have been detected as more flexible, with the RMSF value not exceeding 3.5 Å.The radius of gyration of approximately 16.5 nm changed relatively little, by 1.6 and 3.1%, respectively.It is worth noting that the binding site is stable even when the pocket is empty.The variations in binding modes were found insignificant and irrelevant from the point of view of our research, falling within the range produced by the various docking techniques and do not affect the final conclusions.
The docking results are presented in Table 9.The best docking poses are shown in Figure 13.The total energy is the highest for Ara-C, while the protein-ligand term is the highest for Gemcitabine.Hydrogen bonds and steric interactions (including hydrophobic ones) are the strongest in Ara-C, followed by Gemcitabine, 2′-deoxycytidine and Decitabine.

dCK-ligand Binding Modes
The interaction of the ligands with dCK involves 15 to 18 residues in total (the 14 residues are constant), Table 10.A radar plot demonstrating the differences in binding  The total energy is the highest for Ara-C, while the protein-ligand term is the highest for Gemcitabine.Hydrogen bonds and steric interactions (including hydrophobic ones) are the strongest in Ara-C, followed by Gemcitabine, 2 -deoxycytidine and Decitabine.

dCK-ligand Binding Modes
The interaction of the ligands with dCK involves 15 to 18 residues in total (the 14 residues are constant), Table 10.A radar plot demonstrating the differences in binding modes is shown in Figure 14.Among the ligands, 2′-deoxycytidine and cytidine bind to the largest number of the residues, while 3′-dehydro-3′,4′-dideoxycytidine binds to the smallest.The differences in the binding modes between the real and the redocked ligands are negligible.The largest variation in binding strength is observed for Gln97, Arg128, Asp133 and Ile30.It is worth noting that three residues in the nearest vicinity (Tyr86, Asp133 and Glu197) are not most strongly bound in the complex.The modes of binding of the individual ligands to the protein treated as a whole can be compared in a quantitative manner using, as proposed by us, parameter root-mean-square deviation of binding mode (RMSD_BM), Table 11.Among the ligands, 2 -deoxycytidine and cytidine bind to the largest number of the residues, while 3 -dehydro-3 ,4 -dideoxycytidine binds to the smallest.The differences in the binding modes between the real and the redocked ligands are negligible.The largest variation in binding strength is observed for Gln97, Arg128, Asp133 and Ile30.It is worth noting that three residues in the nearest vicinity (Tyr86, Asp133 and Glu197) are not most strongly bound in the complex.The modes of binding of the individual ligands to the protein treated as a whole can be compared in a quantitative manner using, as proposed by us, parameter root-mean-square deviation of binding mode (RMSD_BM), Table 11.As follows from Table 12, Decitabine, Gemcitabine and 2 deoxycytidine are closest to Ara-C in terms of the complete binding mode (the corresponding RMSD_BM values are the smallest).The RMSD_BM*, RMSD_BM** and RMSD_BM*** describing the differences between the native and redocked ligands or two native ligands are small.
The heatmap shown in Figure 15 visualizes the binding mode in detail and reveals the importance of the individual binding components.As follows from Table 12, Decitabine, Gemcitabine and 2′deoxycytidine are closest to Ara-C in terms of the complete binding mode (the corresponding RMSD_BM values are the smallest).The RMSD_BM*, RMSD_BM** and RMSD_BM*** describing the differences between the native and redocked ligands or two native ligands are small.
The heatmap shown in Figure 15 visualizes the binding mode in detail and reveals the importance of the individual binding components.The heat map of binding strength for each ligand versus residue allows for the identification of the strongest and most stable components of the binding mode.It shows that the binding between Phe137 residue and ligand (Ara-C and its analogues) composed The heat map of binding strength for each ligand versus residue allows for the identification of the strongest and most stable components of the binding mode.It shows that the binding between Phe137 residue and ligand (Ara-C and its analogues) composed of π-π stacking and hydrophobic interactions is the strongest.Moreover, this strong binding is supported by a much weaker interaction between the ligand and Phe96.It is worth noticing that Phe often acts as a "steric gate" preventing the incorporation of nucleoside triphosphate due to steric hindrance with 2 -OH [50].Because the 2 -OH of the glycone in Ara-C points in an opposite direction, it does not generate any steric clashes in dCK.
The ligands that are deprived of the -OH group at the 2 position are Ara-C, 2deoxycytidine, Zalcitabine, Decitabine and Gemcitabine.Surprisingly, with Decitabine as a ligand, the binding to Phe137 is strongly weakened.Therefore, additional nitrogen has a strong modulating effect on the density distribution: instead of NH•••O, weaker NH•••N hydrogen bonds are formed.
The remaining ligands Leu82, Val55, Met85, Ala100, Arg104, Leu141, Arg194 and Ile200 are insensitive to ligand type and play a marginal role in the binding.
The differences in ligand binding strength to individual residues can be easily compared and analyzed by means of the difference heat map, Figure 16.
of π-π stacking and hydrophobic interactions is the strongest.Moreover, this strong binding is supported by a much weaker interaction between the ligand and Phe96.It is worth noticing that Phe often acts as a "steric gate" preventing the incorporation of nucleoside triphosphate due to steric hindrance with 2′-OH [50].Because the 2′-OH of the glycone in Ara-C points in an opposite direction, it does not generate any steric clashes in dCK.
The ligands that are deprived of the -OH group at the 2′ position are Ara-C, 2′deoxycytidine, Zalcitabine, Decitabine and Gemcitabine.Surprisingly, with Decitabine as a ligand, the binding to Phe137 is strongly weakened.Therefore, additional nitrogen has a strong modulating effect on the density distribution: instead of NH•••O, weaker NH•••N hydrogen bonds are formed.
In general, the ligands' ordering according to the decreasing binding strength is as follows: Phe137 The remaining ligands Leu82, Val55, Met85, Ala100, Arg104, Leu141, Arg194 and Ile200 are insensitive to ligand type and play a marginal role in the binding.
The differences in ligand binding strength to individual residues can be easily compared and analyzed by means of the difference heat map, Figure 16.The map shown in Figure 16 shows the deviations of the binding energy with respect to the native ligand treated as a reference.
As follows from Figure 16, the greatest differences between the ligands occur in their binding to the residues: Arg 128, Ile30, Glu197 and Asp133.This finding is in line with Sabini et al. [49] suggestion that the interaction between Arg128 and the hydrogen-bond acceptor at the sugar 2′-arabinosyl position of Ara-C is relevant for the biological activity.The binding of Arg128 to ligands is strongly weakened for 5-azacytidine, 3′deoxycytidine, Zalcitabine, 3′-dehydrio-3′,4′-dideoxycytidine and Decitabine and strong only in Ara-C, Gemcitabine and 2′-deoxycytidine.Moreover, the binding to Ile30 is The map shown in Figure 16 shows the deviations of the binding energy with respect to the native ligand treated as a reference.
As follows from Figure 16, the greatest differences between the ligands occur in their binding to the residues: Arg 128, Ile30, Glu197 and Asp133.This finding is in line with Sabini et al. [49] suggestion that the interaction between Arg128 and the hydrogenbond acceptor at the sugar 2 -arabinosyl position of Ara-C is relevant for the biological activity.The binding of Arg128 to ligands is strongly weakened for 5-azacytidine, 3deoxycytidine, Zalcitabine, 3 -dehydrio-3 ,4 -dideoxycytidine and Decitabine and strong only in Ara-C, Gemcitabine and 2 -deoxycytidine.Moreover, the binding to Ile30 is strongly weakened nearly for the same set of ligands: cytidine, 5-azacytidine, 3 -deocycytidine, Zalcitabine, 3 -dehydrio-3 ,4 -dideoxycytidine and Decitabine.It is also the strongest in Ara-C, Gemcitabine and 2 -deoxycytidine.The binding to Glu197 is strongly enhanced for 2 deoxycytidine, cytidine, 5-azacytidine and Gemcitabine.Therefore, the whole picture is more complicated, and our analysis reveals many more important components of proteinligand bonds than just the binding of the ligand to Arg128.
Identifying the atoms involved in the hydrogen bonds connecting ligands to residues can shed some light on the moieties in the ligands that are most important for effective binding, Figure 17 and Table 12.
strongly weakened nearly for the same set of ligands: cytidine, 5-azacytidine, 3′deocycytidine, Zalcitabine, 3′-dehydrio-3′,4′-dideoxycytidine and Decitabine.It is also the strongest in Ara-C, Gemcitabine and 2′-deoxycytidine.The binding to Glu197 is strongly enhanced for 2′deoxycytidine, cytidine, 5-azacytidine and Gemcitabine.Therefore, the whole picture is more complicated, and our analysis reveals many more important components of protein-ligand bonds than just the binding of the ligand to Arg128.
Identifying the atoms involved in the hydrogen bonds connecting ligands to residues can shed some light on the moieties in the ligands that are most important for effective binding, Figure 17 and Table 12.Table 12.Identification of hydrogen bonds crucial for the ligand-protein bond.

Ara-C 2′-deoxycytidine
Cytidine 5-azacytidine 3′-deoxycytidine Zalcitabine 3′deoxy-3′,4′-didehydrocytidine Gemcitabine Decitabine As follows from Table 12 and Figure 17 C).Therefore, the lack of a hydroxyl group at the 2 position has a similar effect on binding as its inversion or replacement with fluorine.However, dihydroxylation at the 2 and 3 positions of the glycone has no significant effect on the strongly binding components.
An important factor from the point of view of binding is the conformation of the glycone moiety.The angle describing the conformations of the ligands differs by at most about 30 degrees.However, the orientation of the glycone varies significantly for the different polymorphic forms, e.g., cytidine and 2 -deoxycytidine, Table 13.In terms of conformation, Zalcitabine, β-cytidine, and Decitabine are most similar to Ara-C.The conformation of the glycone in Decitabine, 2 -deoxycytidine and Ara-C after docking in the active pocket is almost the same as in the solid state.For the remaining ligands, the differences are quite significant.The OH group at position 2 of the glycone is an important factor in conformational stability.It stiffens the structure by forming a short but bent OH•••O hydrogen bond with the -OH group of CH 2 OH moiety.This bond is unique, occurs only in Ara-C and strongly modifies the electron density distribution in glycone, and consequently, in the entire molecule.
Heat maps, RMSD_BM distances and analysis of the ligand atom contribution to the hydrogen bonds show that the binding modes of Gemcitabine, Decitabine and 2deoxycytidine to dCK are closest to the binding of Ara-C to dCK.Moreover, the heat maps help identify the main components of the binding in solid and protein-ligand complex, which are nitrogen -NH 2 , -N(3)= and -OH.However, predicting the activity from the results of a docking study is time-consuming.

Relative Reactivity of the Ligands
We recently noticed a relationship between relative reactivity and relative ligandprotein binding strength that allows effective screening of SARS-RdRp protease ligands [51].We decided to check whether the recently developed relative reactivity parameters would also prove useful for Ara-C and their analogues binding to deoxycytidine kinase, dCK.The frontier molecular orbitals (FMO) theory is well-known for conceptualizing chemical bonding and reactivity in terms of interactions between orbitals located at the boundary between occupied and unoccupied.Lowest unoccupied molecular orbital (LUMO) accepts electrons, and its energy corresponds to an electron affinity (EA).The highest occupied molecular orbital (HOMO) donates electrons, and its energy is related to ionization potential (IP).Low IP and high EA are associated with high nucleophilic and electrophilic properties, respectively.Thus, the HOMO-LUMO gap is an important measure for the determination of the charge transfer within molecule.Theoretical global indices (absolute electronegativity, χ; absolute hardness, η; electrophilicity index (reactivity), ω; softness, S; electro-donating power, ω-; electro-accepting power ω+; net electrophilicity, ∆ω; and a maximum number of electrons transferred in a chemical reaction, ∆N max ) provide additional information on the ligand's reactivity.Table 14 shows the HOMO and LUMO energies, the HOMO-LUMO gap and global reactivity indices for ligands evaluated at the M062X/6-311G(d,p) level of the theory.
The ligands can be arranged using the decreasing HOMO-LUMO gap as follows: Decitabine > 5-azacytidine > Ara-C > Zalcitabine > 3 -deoxycytidine >2 -deoxycytidine > Gemcitabine Three compounds, Decitabine, 5-azacytidine and Ara-C, which have a higher HOMO-LUMO energy gap, are more stable than the other ligands.Furthermore, their very high value of the absolute hardness but small value of softness suggests a high degree of stability.
The χ parameter describes the tendency to donate/accept electrons, while η measures the ease with which this process occurs.Both are high for 5-azacytidine, and low for Ara-C and Decitabine.The electrophilic power, which measures the capacity of an electrophile to accept the maximal number of electrons in a neighboring reservoir of electron pool, is represented by the descriptor ω (the global electrophilicity index) and falls within the range of 1.683-2.498eV, Table 14 The reactivity of the ligand measured using ω, reveals the following trend: cytidine > 5-azacytidine > 3 -deoxycytidine > Decitabine > Zalcitabine > 2 -deoxycytdine > Gemcitabine > Ara-C.
Thus, the change in sugar moiety leads to a decrease in electrophilic activation from 2.568 to 1.683 (1.768 for native ligand), while additional nitrogen at -N(5) site leads to its decrease to 1.859.The system's tendency to acquire electrons from the environment (evaluated using reactivity) is observed to be very high for Ara-C and Gemcitabine, while very low for cytidine.Thus, Ara-C and Gemcitabine have relatively low substrate selectivity, which means they can inhibit a wide range of proteins.Ara-C has the lowest local electron-donating power, ω + , electron-accepting power, ω − , and overall electrophilicity, ∆ω.A larger ω + value for Ara-C than 2 -deoxycytidine corresponds to its superior ability to accept charge, but small ω − value for Ara-C enhances the ligand's ability to donate electrons.However, cytidine and 5-azacytidine possess an unusually low-lying LUMO level, indicating their susceptibility to molecular reactions with nucleophiles.The low-lying HOMO level for cytidine, Decitabine and 5-azacytidine suggests they are easier than Ara-C in participation in molecular reactions with electrophiles.Two new reactivity descriptors, the so-called relative electron-donating power, R + , and the relative electron-accepting power, R − , were defined recently by us [51].They describe the relative ability of the ligand to accept and donate charges, respectively.
The three ligands Ara-C, Gemcitabine and natural 2 -deoxycytidine are involved in strong hydrogen bonds with dCK residues, while Zalcitabine and 3 -deoxycytidine (located on the border of this area), binds to dCK much weaker.The relative R + and R − with Ara-C as reference show that Gemcitabine, Zalcitabine and Decitabine are closest to it in terms of electron donating/electron withdrawing properties.On the other hand Ara-C, Gemcitabine and Decitabine are closest to 2 -deoxycytidine (treated as reference), while 5-azacytidine and 3 -deoxycytidine are closest to cytidine (treated as reference).

Quantitative Structure-Property Relationships (Binding Affinity, Relative Reactivity and Biological Activity)
The root-mean-square deviation of the binding modes (RMSD_BM) is non-linearly correlated with the relative reactivity power R + and R − , Figure 18 and shows a distinct separation between the studied compounds.
strong hydrogen bonds with dCK residues, while Zalcitabine and 3′-deoxycytidine (located on the border of this area), binds to dCK much weaker.The relative R + and R − with Ara-C as reference show that Gemcitabine, Zalcitabine and Decitabine are closest to it in terms of electron donating/electron withdrawing properties.On the other hand Ara-C, Gemcitabine and Decitabine are closest to 2′-deoxycytidine (treated as reference), while 5-azacytidine and 3′-deoxycytidine are closest to cytidine (treated as reference).

Quantitative Structure-Property Relationships (Binding Affinity, Relative Reactivity and Biological Activity)
The root-mean-square deviation of the binding modes (RMSD_BM) is non-linearly correlated with the relative reactivity power R + and R − , Figure 18 and shows a distinct separation between the studied compounds.Furthermore, the binding affinity is also non-linearly correlated with the relative reactivity R + and R − , Figure 19.
As can be seen from Figures 18 and 19, a small structural change, the inversion of the hydroxyl group at the 2 position of the sugar or removal/replacement of glycone substituents, leads to a significant change in relative reactivity, binding mode and binding affinity.The root-mean-square deviation of the binding modes (RMSD_BM), binding affinity and R + and R − can be treated as combined screening parameters: smaller R values means stronger binding, smaller RMSD_BM and binding affinity means increased anti-leukemic efficiency.Anti-viral ligands (3 -deoxycytidine, Zalcitabine) are blocked at the 3 -hydroxyl group of DNA, which prevents elongation of nascent DNA.The RMSD_BM, R + and R − for them are higher than for anti-leukemic Ara-C, Gemcitabine and Decitabine, Figure 18.The light pink area in Figure 18 indicates ligands with the low and close values of the relative R + and R − parameters, which bind easily to dCK: Ara-C, Gemcitabine, Decitabine (RMSD_BM < 1.4; anti-cancer) and Zalcitabine (RMSD_BM > 1.4; anti-viral).The R + , R − values higher for Gemcitabine than Ara-C suggests that Gemcitabine should be slightly more effective than Ara-C.Indeed, Gemcitabine has broad anti-leukemic activity across different AML subtypes and is more effective than Ara-C, both in vitro and in vivo [52].Our results are also in a good agreement with the preclinical studies, which indicated that Decitabine is a more effective anti-leukemic agent than 5-azacytidine [53], but five times less effective than Ara-C [54].The mechanism of action of 5-azacytidine is different and involves primarily DNA hypomethylation.5-azacytidine and Decitabine are used in combination with Ara-C [55] due to synergistic induction of apoptosis.Moreover, the IC50 determined against HL-60 cell lines for 5-azacytidine or Decitabine is significantly higher than that of Gemcitabine, indicating that these two compounds are less effective than Gemcitabine [56].On the other hand, cytidine does not have any anti-cancer properties, which is supported by distinct R + and R − and high RMSD_BM.The binding affinity, which is the highest for three ligands: Gemcitabine, Ara-C, and natural 2 -deoxycytidine and the lowest for cytidine reflects RMSD_BM well.
across different AML subtypes and is more effective than Ara-C, both in vitro and in vivo [52].Our results are also in a good agreement with the preclinical studies, which indicated that Decitabine is a more effective anti-leukemic agent than 5-azacytidine [53], but five times less effective than Ara-C [54].The mechanism of action of 5-azacytidine is different and involves primarily DNA hypomethylation.5-azacytidine and Decitabine are used in combination with Ara-C [55] due to synergistic induction of apoptosis.Moreover, the IC50 determined against HL-60 cell lines for 5-azacytidine or Decitabine is significantly higher than that of Gemcitabine, indicating that these two compounds are less effective than Gemcitabine [56].On the other hand, cytidine does not have any anti-cancer properties, which is supported by distinct R + and R − and high RMSD_BM.The binding affinity, which is the highest for three ligands: Gemcitabine, Ara-C, and natural 2′-deoxycytidine and the lowest for cytidine reflects RMSD_BM well.Comparing the relative reactivity, binding mode, or binding affinity of ligands with their biological activity is generally challenging due to variations in the scope of action, Table 15, as well as differences in protocols, doses, methods of administration, and timing of measurements used in different cell-line studies.Furthermore, therapeutic efficacy is limited by other factors including the rapid elimination of intracellular CTP forms in target cells.
Despite the different spectrum of activity, all ligands share a common feature-require metabolic activation via phosphorylation catalyzed by dCK.Thus, low levels of dCK enzyme activity [57], mutations in dCK [58] or increased levels of active triphosphate metabolites [59] contribute to variability in their response.dCK has recently been recognized as a promising new target for BRCA2-deficient cancers (an alternative to PARP inhibitors) [59], an agent to reduce the symptoms of encephalitis (inflammation of the brain) [60] and the cause of Gemcitabine resistance in pancreatic cancer patients (due to dCK inactivation) [61].Furthermore, dCK deficiency or weak binding to dCK are the ratelimiting factor for phosphorylation and in a consequence biological activity.The methods proposed for comparing ligand reactivity and protein-ligand binding modes show promise for screening ligands or searching for drug improvements.
In the case of Ara-C, as for favipiravir [63], the T 1H (B P ) significantly exceeds typical values.Such conditions are unfavorable from the experimental point of view because the sample cannot be properly polarised and measurements are very time-consuming.A modification of the classical technique in which the polarization period (B P = 0) is removed from the experiment and the proton magnetization at the beginning of the B R interval is zero is an effective solution.The magnetization of the proton increases towards the equilibrium value according to the formula M = M 0 1 − e − τ T 1H (B R ) .For B R values at which the 1 H proton NMR transition frequency ν H matches one of the 14 N NQR transition frequencies ν Q , the proton system is additionally polarised by the transfer of polarisation from nitrogen NQR levels to proton NMR levels with a characteristic cross-relaxation time T CR .The additional "quadrupole peaks" appear in the CR spectrum when the condition The 1 H-14 N cross-relaxation spectrum of Ara-C was measured by this modified crossrelaxation technique.Although the peaks are well located in the spectrum, the resolution is relatively low due to line broadening due to the Zeeman effect as well as the Zeeman shift with respect to the pure NQR frequencies in the zero field [64].
In the second step, the techniques of multiple frequency sweeps and two-frequency irradiation [25,26] were used.They enable a more precise determination of the frequencies of the NQR triplets ν + , ν − , and ν 0 at the three nitrogen positions in an Ara-C molecule.In the search for the 14 N NQR frequencies at the pyrazine nitrogen positions, we applied multiple frequency sweeps of the rf magnetic field in the frequency range 2.8-4 MHz and performed the ν H scan.

Computational Modeling-Density Functional Theory Single Molecules and Clusters
The calculations for the single molecules and clusters were carried out using Gaussian 16 rev.C01 [65].The quantum chemical calculations required for the QTAIM analysis were carried out within the density functional theory (DFT) approach rooted in the Kohn-Sham [66] theorem, generalized by Levy [67].The hybrid meta exchange-correlation functional, with a double amount of non-local exchange, M062X [68] and an all electron split-valence basis set 6-311+G(d,p) was used.As we have shown previously, M062X provides a reliable electron density distribution in single molecules and cluster systems with non-covalent interactions [68].Providing useful accuracy and favorable algorithmic complexity, this approach is an attractive compromise.The positions of the hydrogen atoms have been optimized.X-ray crystallography typically cannot directly determine the positions of light atoms at standard resolutions.The full description of the clustering technique can be found in our previous papers [63,69,70].

Solid-State
All solid-state quantum chemical calculations were carried out within the CASTEP [75] code.Different nonlocal generalized gradient approximation (GGA) functionals depending on both ρ and dρ/dr, namely RPBE (revised Perdew, Burke, and Ernzerhof) [76], PBE (Perdew, Burke, and Ernzerhof) [77] with the Tkatchenko-Scheffler (DFT-TS) correction for dispersion [78] was probed.The use of the gauge including projector-augmented interactions with its environment.The enrichment ratio, E XY , a descriptor defined as the ratio of the proportion of actual connections in a crystal to the theoretical fraction of random interactions [83] was calculated based on 2D FP.It denotes a proclivity to make or avoid contacts.
The conversion of the files with receptor and ligand structures to the .pdbqtformat was made using MGLTools ver.1.5.7.Molecular docking was performed using automated docking tools AutoDock ver.4.2.6 [91] and AutoDock Vina ver.1.2.3 [92].Before docking the ligands, the native ligand that co-crystallized with dCK and water molecules was removed from the structure.The protonation state of the protein was checked and corrected prior to docking.To assess the docking process's quality, a redocking task was performed.The redocking protocol was effective if the pose's root-mean-square deviation (RMSD) from its conformation in the parental structure did not exceed 3 Å (the RMSD values for the studied ligands did not exceed 0.5 Å).The new ligand was docked into the rigid protein structure using two separate techniques: template docking and docking with defined search space around the active site.The grid box of size 9-15 Å was centered on the active site.After docking, the optimal poses that resulted in protein-ligand complex stabilization with the highest docking score were chosen and studied further.The molecular docking results were validated using a Genetic Evolutionary Method for molecular DOCKing (GEMDOCK) [93].This technique also utilizes a genetic algorithm, but with a distinct evolution operator.The energy of the interactions was described using the sum of the piecewise linear potential (PLP) term (steric, van der Waals and hydrogen bonding interactions), and the Coulomb term (electrostatic interactions).Since the lack of protein flexibility is one of the greatest limitations in molecular docking, some flexibility of selected residues was assumed in GEMDOCK.The molecular dynamics simulations have been performed using coarse grain technique [94,95].The total number of generated models was equal to 50,000, 2% of which were selected and 10 best used in the further comparative analysis.The binding affinity was estimated using the Gehlhaar model [96] with original parameterization and using PRODIGY [94,95].The final 2D and 3D visualizations of the binding modes were made using PoseEdit [97] and VMD [98].The average deviation between the binding modes was calculated using the newly defined quantity: root-mean-square deviation of the binding modes (RMSD_BM).It was calculated as follows: where p i and q i are the binding interactions in each structure and P = {p i } and Q = {q i }.

Grid Heatmaps
A heat maps is a 2-dimensional data visualization technique, where the magnitude of individual values within a dataset is color coded, i.e., represented by a color.It helps capturing the most relevant data.In the biological field, heat maps are used to visually represent patterns in DNA, RNA or gene expression.
We applied grid color coded heat maps to visualize binding modes of the ligands in the solid state and when docked to the active pocket in dCK.In both cases, a red-yellow-blue scheme was used, where dark red indicates strong interactions and dark blue indicates very weak interactions.Additionally, heat maps were used to visualize differences in the binding modes in the solid state and protein-ligand complex (the native ligand was treated as a reference).In this case, the diagram in red-yellow-blue scheme shows the strengthening (red), stability (yellow) and weakening (blue) of the binding strength.

Conclusions
A slight structural alteration, the inversion of hydroxyl group at the 2 -position of the sugar, leads to change of the biological activity from anti-depressant and DNA/RNA block builder (cytidine) to powerful anti-cancer (Ara-C).We have discovered that the abovementioned modification causes significant changes in the electron density distribution and binding mode, clearly visible in the NQR spectrum.
This butterfly effect in Ara-C can be seen in the individual molecules, solid state but also after the docking to the active pocket in dCK.To evaluate it, we proposed the new parameter, the root-mean-standard deviation of the binding modes and heat maps approach.They allow to compare binding modes in crystals and protein-ligand complexes.The root-mean-square deviation of the binding mode RMSD_BM defined by us enables the assessment of global differences in the binding mode in solid state and protein-ligand complex.The heat map of binding strength for each ligand versus residue allows identification of the strongest and most stable components of the binding mode.Differential heat maps make it easier to assess which residues are important for binding efficiency.
The relative reactivity, R + and R − , acts as ligand screening parameters: smaller R values means stronger binding to dCK.Using the abovementioned techniques, we identified the intramolecular OH•••O hydrogen bond as the key factor responsible for forcing the glycone conformation and strengthening NH•••O bonds with Gln97, Asp133, and Ara128, and stacking with Phe137.The title butterfly effect is associated with intramolecular hydrogen bond.Our study elucidates the differences in the binding modes of Ara-C and cytidine, which should guide the design of more potent anti-cancer and anti-viral analogues.
3-β-D-arabofuranosyluracil) fro Caribbean sponge Cryptotethya crypta (now Tectitethya crypta) by Bergmann et al. [ the observation that these compounds can act as terminators of the DNA synthesis initiated research on the new and separate class of nucleotides with modified glycon arabinosides.The possibility that they have a role as anti-viral, anti-cancer as well a cellular senescence drug has sparked interest in them.In 1959, Walwick, Rober Dekkerin [2] synthesized Cytarabine, (Cytosine arabinoside; Cytosinearabinofuranoside; Ara-C), a synthetic isomer of cytidine, that differs from cytidi deoxycytidine, only in the sugar (D-arabinose instead of ribose and deoxy respectively), Figure 2.

Figure 3 .
Figure 3.The experimentally determined 14 N NQR spectrum in Ara-C (top) and positions of the resonance lines in the spectra of Ara-C and reference cytidine.

Figure 3 .
Figure 3.The experimentally determined 14 N NQR spectrum in Ara-C (top) and positions of the resonance lines in the spectra of Ara-C and reference cytidine.

Figure 6 .
Figure 6.Heatmap visualizing the percentage contributions to the 3D Hirshfeld surface area calculated for each pair of species.The red-yellow-blue scheme, with dark red indicating strong interactions and dark blue indicating very weak ones was used.

Figure 6 . 43 Figure 7 .
Figure 6.Heatmap visualizing the percentage contributions to the 3D Hirshfeld surface area calculated for each pair of species.The red-yellow-blue scheme, with dark red indicating strong interactions and dark blue indicating very weak ones was used.als 2024, 17, x FOR PEER REVIEW 13 of 43

Figure 7 .
Figure 7. Difference heat map visualizing relative percentage contributions to the 3D Hirshfeld surface area calculated for each pair of species (Ara-C is a reference).The difference heat map visualizes the differences in the percentages in a red-yellow-blue scheme, where red indicates an increase and blue indicates a decrease.
3) because they are favored in the crystal packing, and, notably, the O=C oxygen atoms form intermolecular hydrogen bonds of the type O-H•••O and N-H•••O.Therefore, the O•••O and N•••O contacts are significantly impoverished.The N•••H/H•••N contacts show enrichment values higher than unity, and, thus, are favored and the existence of the O-H•••N hydrogen bonds is raised.
and H•••H contacts were analyzed in depth.The highest contribution to 2D fingerprint is brought by weak H•••H interactions, which are reflected by the cloud of scattered points with spikes at d e + d i ~2.0-2.3Å, Figure 9.In Ara-C, these interactions are clearly stronger than in β-cytidine.In the 2D FP plots, Figure 10, the O•••H/H•••O contacts describing intramolecular O-H•••O and intermolecular O-H•••O and N-H•••O hydrogen bonds are represented by symmetric spikes ("wings").These sharp and small wings are located at d e + d i ~1.75-2.1 Å and cover a relatively small area of 22-31.5% of the total 3D HS.In β-cytidine, these interactions seem stronger than in Ara-C.The N•••H/H•••N contacts are represented by the external symmetric spikes sharp, and d e + d i ~1.8-2.9Å and cover only about 12-23% of the total 3D HS, Figure 11.They are strong for Decitabine, suggesting the presence of NH•••N interactions in this structure.The 14 N NQR parameters for the corresponding nitrogen atoms in Ara-C and cytidine differ significantly, suggesting a different binding mode in each crystal structure.The differences between the NH-limited 2D fingerprints of Ara-C and cytidine, Figure10a-c, are significant and much greater than those between their HH-or OH-limited 2D fingerprints.Furthermore, according to the local 2D fingerprints the hydrogen bonds involving nitrogen are relatively weak in Ara-C when compared to α-cytidine, β-cytidine 2 -deoxycytidine β, Zalcitabine or cytosine.The C•••H/H•••C contacts are of minor importance.They are represented by the most external wide spikes at d e + d i ~2.7-3.0Å and cover 3.3-16.0% of the total 3D HS, Figure 12.The impact of other contributions is negligible.The C•••O/O•••C contacts provide a tiny contribution of about 4.1% and they are represented by sharp wings, d e + d i ~3.40 Å, placed in the middle area of the entire fingerprint.The O•••N and C•••N/N•••C, contacts bring negligible contribution of 1.2 and 1.0%, respectively.The contribution brought by C•••N/N•••C and C•••C contacts, comes mainly from interlayer π•••π stacking interactions.Thus, the percentage of the three-dimensional Hirshfeld surface or enrichment ratios seems to indicate only general trends in intermolecular contacts.
. They help identify OH•••O, NH•••O and NH•••N bonds involving donors and acceptors of the ligands.
The intramolecular O-H•••O hydrogen bond in sugar moiety of Ara-C linking OH at 2 position of the sugar and OH group of CH 2 OH moiety is very short (R O•••O = 2.650 Å) and nonlinear (<OHO = 155.45• ) occurs only in Ara-C.Intense red areas in the 3D HS near O and H from -OH, represent very low values of d norm = −0.6830a.u., shape index = −0.9965,and curvedness = −1.3652a.u., which confirms existence of this strong bond.It stiffens structure and modifies the electron density within the glycone moiety, and as a consequence, the entire molecule.Therefore, it affects the ability of other atoms of the ligand to form hydrogen bonds.The intermolecular hydrogen bonds are shown in the 3D HS surface by the areas that range in color from deep red to white.Deep red areas in the 3D HS localized near two H atoms (from -NH 2 ) represent low values of d norm = −0.4106and −0.3885 a.u., shape index = −0.9561and −0.9304 a.u., and curvedness = −2.0440and −1.8137 a.u.They confirm the participation of the amine group, NH 2, in strong N-H•••O hydrogen bonds, which are short (R O•••N = 3.048 and 2.983 Å) and nonlinear (<NHO = 163.7 and 145.73 • ).The -N= participates in weak, nearly linear, CH•••N hydrogen bond of 3.562 Å.The small light red area near =N-and H from Ry sugar represent the values of d norm = −0.8983a.u.(shape index = 0.9512 a.u. and curvedness = −2.2851a.u.).The -NRy participates exclusively in covalent bonds.The remaining red areas on 3D HS surface reveal OH•••O intermolecular hydrogen bonds, which are short (R O•••O = 3.003 and 2.720 Å) and non-linear (<OHO = 159.64 and 172.71 • ).The overall interaction energy partitioning suggests that the crystalline packing in Ara-C is primarily governed by electrostatic and repulsive strong N-H•••O, O-H•••O and C-H•••N hydrogen bonds, followed by weaker N-H•••O and C-H•••O, which are mainly dispersive and repulsive (Table 7).In β-Cytidine and 2 -deoxycytidine strong N-H•••O, O-H•••N and O-H•••O hydrogen bonds are electrostatic and repulsive.Intramolecular OH•••O bonds closing the 5-membered ring occur only in Ara-C, while very weak CH•••O (O from =O) bonds closing the non-planar ring occur in all structures except cytosine.The -OH group in Ara-C is oriented toward the cytosine moiety, prompting it to engage in intramolecular hydrogen bond and form an additional heterocyclic quasi-ring, stiffening the entire structure.Therefore, the sugar component is crucial for the formation of the specific interaction pattern in the Ara-C crystal structure.The arrangement of hydroxyl groups, as well as their number, play a key role.All 2D molecular fingerprints indicate that the strongest interactions in each structure involve O and H contacts, specifically OH•••O and NH•••O hydrogen bonds.

Figure 13 .
Figure 13.The best docking poses of the ligand in complexes with dCK protein.The protein backbone is represented as a cartoon, the binding cavity residues are shown as thin sticks and the docked ligands are shown as color sticks (Ara-C in green, 2′-deoxycytidine in yellow, cytidine in cyan, 5-azacytidine in magenta, 3′-deoxycytidine in white, Zalcitabine in blue, Decitabine in dark green and 3′-deoxy,3′,4′-didehydrocytidine in red).The hydrogen bonds linking Ara-C to the dCK residues are shown using dashed green lines.

Figure 13 .Table 9 .
Figure 13.The best docking poses of the ligand in complexes with dCK protein.The protein backbone is represented as a cartoon, the binding cavity residues shown as thin sticks and the docked ligands are shown as color sticks (Ara-C in green, 2 -deoxycytidine in yellow, cytidine in cyan, 5-azacytidine in magenta, 3 -deoxycytidine in white, Zalcitabine in blue, Decitabine in dark green and 3 -deoxy,3 ,4 -didehydrocytidine in red).The hydrogen bonds linking Ara-C to the dCK residues are shown using dashed green lines.

Figure 15 .
Figure 15.The binding strength of the ligands to individual residues (Ara-C* 1P5Z, 2-deoxycytidine* P160, 2-deoxycytidine** P161, Gemcitabine* P162 are listed as reference).The heat map visualizes the binding mode in red-yellow-blue scheme, with dark red indicating strong interactions and dark blue indicating very weak ones.

Figure 15 .
Figure 15.The binding strength of the ligands to individual residues (Ara-C* 1P5Z, 2-deoxycytidine* P160, 2-deoxycytidine** P161, Gemcitabine* P162 are listed as reference).The heat map visualizes the binding mode in red-yellow-blue scheme, with dark red indicating strong interactions and dark blue indicating very weak ones.

Figure 16 .
Figure 16.Difference heat map revealing the differences in the binding strength of ligands to individual residues (Ara-C* 1P5Z, 2-deoxycytidine* P160, 2-deoxycytidine** P161, Gemcitabine* P162 are listed as reference).Thered-yellow-blue scheme, where red indicates bond strengthening and blue indicates bond weakening was used.

Figure 16 .
Figure 16.Difference heat map revealing the differences in the binding strength of ligands to individual residues (Ara-C* 1P5Z, 2-deoxycytidine* P160, 2-deoxycytidine** P161, Gemcitabine* P162 are listed as reference).Thered-yellow-blue scheme, where red indicates bond strengthening and blue indicates bond weakening was used.

Figure 17 .
Figure 17.The binding mode in Ara-C-dCK complex (hydrogen bonds are depicted in blue, hydrophoblic contacts in green and π•••π interactions in cyan).

Figure 17 .
Figure 17.The binding mode in Ara-C-dCK complex (hydrogen bonds are depicted in blue, hydrophoblic contacts in green and π•••π interactions in cyan).
, NH 2 and -N= play a key role, as they bind Gln97 residue using strong NH•••O and NH•••N hydrogen bonds and Asp133 using weaker NH•••O bonds.In addition, OH from CH 2 OH moiety is highly important, binding to Glu53 through OH•••O and Arg128 through NH•••O.However, only 2 deoxycytidine, Gemcitabine and Decitabine bind to Arg128 via hydrogen bonds.Moreover, protein-ligand binding is strongly modulated by multiple OH•••O hydrogen bonds connecting the glycone hydroxyl groups to Tyr86, Glu197 and Arg128 (only in Ara-

Figure 18 .
Figure 18.The correlation between relative binding mode (RMSD_BM) and the relative reactivity R + and R − .

Figure 19 .
Figure 19.The correlation between relative binding affinity and the relative reactivity R+ and R−.

3. 2
.5.Comparison of the Differences in the Binding Modes Root-Mean-Square Deviation of the Binding Mode

Table 2 .
The14N NQR parameters for Ara-C and related compounds calculated theoretically at the GGA/RPBE level with TS DFT-D correction.

Table 3 .
Percentage contributions to the 3D Hirshfeld surface area calculated for each pair of species.

Table 4 .
The Euclidean distance (ED) and root-mean-square deviation (RMSD) calculated between the 3D Hirshfeld percentage contributions for Ara-C and other compounds studied.

Table 4 .
The Euclidean distance (ED) and root-mean-square deviation (RMSD) calculated between the 3D Hirshfeld percentage contributions for Ara-C and other compounds studied.
* The value averaged over available structures.

Table 5 .
Enrichment ratios E XY characterizing the various contacts in Ara-C and related compounds.

Table 6 .
The Euclidean distance (ED) and root-mean-square deviation (RMSD) between the enrichment ratios for Ara-C and related compounds.
* Excess value for CC contacts omitted.

Table 7 .
Key hydrogen bonding interactions in the solid state (E tot -total energy, E e -electrostatic term, E P -polarization term, E d -dispersion and E r -repulsion).

Table 8 .
Differences in the binding modes between Ara-C and the studied compounds in solid state.

Table 10 .
The binding mode of the Ara-C, cytidine and its analogues to dCK (the residues are ordered by decreasing binding energy expressed in kcal/mol).

Table 12 .
Identification of hydrogen bonds crucial for the ligand-protein bond.

Table 11 .
The root-mean-square deviation of binding mode (RMSD_BM).Binding mode distance between Ara-C and the native ligand.** Binding mode distance between the original and redocked ligand.*** Binding mode distance between the redocked ligands of 2′deoxycytidine. *

Table 13 .
The angle describing the conformations of the ligands in the solid state and proteinligand complex.