1. Introduction
Unraveling the excitation dynamics of DNA is essential to understanding its biochemical functions and pathology, such as the influence of UV light on cancer formation [
1]. It has been theorized that UV light induces the formation of excitons in DNA. These would be formed either on the hydrogen-bonded base pair and/or between adjacent bases along the DNA backbone [
2,
3,
4]. It is thought that the appearance of excitons signals the presence of long-lived photoexcited states in DNA [
5,
6]. The slow decay of these excited states increases the chance that the excess energy will dissipate via higher-energy deactivation pathways, possibly resulting in photo-induced biological damage and conditions such as cancer. Although extensive effort has gone into investigations in this area [
3,
7,
8,
9], definitive exciton signatures in DNA have remained elusive and sometimes contested [
9,
10]. In contrast, single nucleobases are devoid of excitons and decay quickly and safely via non-radiative pathways [
2,
10]. Clearly, it is vital to recognize the spectroscopic signatures of excitons and to discern their role in the excited-state lifetimes in DNA.
DNA is composed of repeating units of nucleobases: adenine (A), thymine (T), guanine (G), and cytosine (C).
Figure 1 shows the skeletal structural formulae of adenine and thymine, which compose the DNA strands employed in the present study, and their respective versions with sugar ribose and phosphate groups, as found in a DNA strand. Normally, DNA is composed of two single strands (ssDNA) coiled together to form a double-stranded DNA molecule (dsDNA), as, for example, in B-DNA. The two strands are stabilized by hydrogen bonding between the parallel nucleobases of each ssDNA.
Non-linear spectroscopic tools, such as transient absorption (TA) and two-dimensional photon echo (PE) spectroscopy, can identify exciton transfer pathways, as well as interactions between the DNA nucleobases. However, before performing such involved measurements, we must identify the DNA strands that show the strongest evidence of excitonic behavior. To simplify matters, we focus on detecting interactions between stacked nucleobases in single-stranded DNA (ssDNA), which are predominantly H-aggregate in character and lack the contribution of the Watson–Crick hydrogen bonds present in double-stranded DNA. In this bottom-up approach, we can isolate the contributions of the excitonic behavior along the backbone only. The techniques used in the present study are absorption spectroscopy and circular dichroism (CD) spectroscopy in the UV range of wavelengths (around 260 nm), where nucleobases and ssDNA exhibit strong optical responses [
11,
12,
13,
14]. CD is a technique that is very sensitive to structural changes upon UV-light irradiation [
15]. The transition dipole strength is analyzed and linked to the extent of the spatial delocalization of excitons using a data analysis technique based on dipole moment retrieval by multiple Gaussian fitting, as well as by the sum rules developed by Burin et al. [
16].
Here, we selected two different families of homogeneous ssDNA, that is, single strands of DNA composed of only one type of nucleobase: d(A)
n and d(T)
n where
n is the number of nucleobases in the sequence. Moreover, we also employ the sequence of alternating adenines and thymines, d(AT)
n. These two families are chosen to represent purines (adenosine homo-oligomers) and pyrimidines (thymine homo-oligomers) on the one hand, and their mixture (alternating sequence) on the other. We limit our investigation to these ssDNA molecules to avoid base-pair interactions and simplify the model. Such interactions would complicate the excitonic signature and the spectroscopic signal in general, whereas in single strands the only excitons are those delocalized along the same strand of DNA, due solely to the H-aggregation of nearby nucleobases. The ssDNA is kept relatively short, 2 to 20 bases, to limit the complexity of the system and to allow the absorption and circular dichroism spectra to be simulated with limited computational resources. The dependence of the UV-absorption spectra of ssDNA upon its composition [
17], due to bases interacting dissimilarly with each other, adds another layer of complication, which we started to tackle by studying very specific sequences of alternating adenines and thymines.
In the present work, we scrutinize DNA single strands for evidence and extent of excitonic signatures based upon a set of specific parameters, including length and nucleobase sequence. Our primary tools are absorption and circular dichroism spectroscopies, coupled with data analysis of the transition dipole strength, whose results are linked to the extent of the spatial delocalization of excitons. By comparing ssDNA molecules, we aim to find signatures of excitonic behavior. To our knowledge, this is the first time that short single strands of DNA with a well-defined sequence have been systematically studied and their spectroscopic signatures compared using this data analysis.
3. Results and Discussion
The absorption spectra of the single base deoxyadenosine triphosphate (dATP) and the ssDNA strands ((dA)
n) in the left panel of
Figure 2 show a large band centered near 260 nm and an intense band near 215 nm, as has been reported in the literature [
21,
22,
23]. The valley at around 230 nm is quite high for typical ssDNA absorption spectra, possibly due to the presence of impurities from the synthesis and purification process. The absorption peaks of the (dA)
n strands are blue-shifted by 3 to 4 nm compared with the peak of dATP. However, this blue shift does not increase linearly with the number of adenosines in the chain; moreover, there is no broadening of the absorption peaks with increasing adenosine number, although there is a small shape change at approximately 285 nm. The blue shift is caused by a negative Coulomb coupling between neighboring chromophores in a tail-to-tail H-aggregate formation, as is the case for single-stranded DNA [
24]. The exciton delocalized between the bases along the same strand, evidenced by the existence of a blue shift, was previously proposed by Hu et al. [
2]; nevertheless, the shift does not increase as the number of nucleobases n grows. This implies either that the delocalization occurs only over a few sites, or that competing phenomena are at play, especially at room temperature (e.g., strong coupling to the bath, excimer formation, non-radiative decay, etc.), or a combination of the two.
At the same time, the CD spectra of (dA)
n in the right panel of
Figure 2 have a complicated structure, as previously reported by other groups [
13,
25,
26,
27]. Positive bands are present at 271 nm and at 218 nm, whereas negative bands occur at 250 nm and at 207 nm; between these bands of opposite sign, there are zero crossings at 260 nm and near 215 nm, whereas a positive shoulder appears at 230 nm. Therefore, we can recognize a clear negative Cotton effect, corresponding to the absorption peak at 257 nm. From the perspective of the secondary structure, it has been proven by Brahms et al. [
28] that (dA)
n is a “single-chain stacked-base helix”, based on their CD spectroscopy data. Likewise, our spectra show a broad positive band between 260 nm and 280 nm, preceded by a negative band between 240 nm and 260 nm, which is typical of a β-helix structure. After normalization at 260 nm against the absorption spectrum, when CD spectra of different (dA)
n molecules are compared, the magnitude of the maxima at 270 nm and minima at 250 nm is, respectively, increasing and decreasing, as in
Figure 2. A small deviation for dA
2 is present in the CD spectrum; this discrepancy originates from impurities introduced during the synthesis of the oligomers.
Moving on to the thymine-based molecules, the experimental absorption spectra of single chains of thymidines in
Figure 3 are consistent with other studies in the literature [
15]. The spectrum of single-base dTTP (deoxythymidine triphosphate) exhibits a large peak centered at 268 nm. Relative to the absorption peak of the monomer, the (dT)
n homo-oligomer spectral peaks are blue-shifted to approximately 266 nm, independently of the strand length. Another broad absorption band is present at around 206 nm but is not completely resolved due to the limit of our spectral window; hence, we will not discuss this further due to a lack of complete data. The aforementioned two absorption bands are connected via a deep valley at 235 nm. As was the case for the (dA)
n homo-oligomers, the (dT)
n strands do not show broadening when compared with each other or with the single base dTTP. Despite the lack of peak broadening, and unlike the adenine case, the blue shift for multiple-base ssDNA suggests that there is an exciton delocalized between the bases along the strand and that, overall, the strand behaves as an H-aggregate. Because the shift is constant, this delocalization may occur only over a few sites. While one would expect to see further shifts with an increasing number of nucleobases per strand, the broad nature of the absorption peak prevents us from identifying any extra shifts. Collecting spectra at lower temperatures would likely reveal these extra shifts; such measurements are planned for a follow-up paper.
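To put the blue shift quoted above on an energy scale, the following minimal sketch converts the two peak wavelengths given in the text (268 nm for dTTP and approximately 266 nm for the homo-oligomers) into photon energies and wavenumbers. The function names are illustrative and are not part of the original analysis; the only physics assumed is the standard relation E = hc/λ.

```python
# Sketch: express an absorption peak shift in energy units.
# The wavelengths are the peak positions quoted in the text; the helper
# names are hypothetical and chosen only for this illustration.

HC_EV_NM = 1239.841984  # h*c expressed in eV*nm


def wavelength_to_ev(wavelength_nm):
    """Photon energy in eV for a wavelength given in nm."""
    return HC_EV_NM / wavelength_nm


def wavelength_to_wavenumber(wavelength_nm):
    """Wavenumber in cm^-1 for a wavelength given in nm."""
    return 1.0e7 / wavelength_nm


def blue_shift(monomer_nm, oligomer_nm):
    """Return the shift (oligomer minus monomer) in eV and in cm^-1.

    Positive values correspond to a blue shift (higher energy).
    """
    d_ev = wavelength_to_ev(oligomer_nm) - wavelength_to_ev(monomer_nm)
    d_cm = (wavelength_to_wavenumber(oligomer_nm)
            - wavelength_to_wavenumber(monomer_nm))
    return d_ev, d_cm


shift_ev, shift_cm = blue_shift(268.0, 266.0)
print(f"shift: {shift_ev:.4f} eV, {shift_cm:.0f} cm^-1")
```

A 2 nm shift near 267 nm corresponds to a few tens of meV, which illustrates why such a shift is difficult to resolve under a broad room-temperature absorption band.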
The collected CD spectra are shown in
Figure 3, and are in agreement with spectra from the literature [
25,
27]. The complicated structure of the CD spectra, for homo-oligomer strands composed of two or more thymidines, consists of four peaks of alternating sign; there are two negative peaks at 214 nm and 251 nm and two positive peaks at 223 nm and 278 nm. These are separated by nodal points at 220 nm, 231 nm, and 278 nm. We interpret this as a combination of two negative Cotton effects above and below 230 nm, corresponding to the absorption peaks centered at 266 nm and 208 nm respectively.
Spectral features show an increase in amplitude only, as the number of thymidines in the homo-oligomer strand increases above four. However, there is a 1 nm blue shift of the two red-most peaks between (dT)
2 and (dT)
4 onwards, and our extremely short strand (dT)
2 appears to behave similarly to the short strands (
n = 2, 3) presented in Holm et al. [
25], with respect to this blue shift. The CD spectra indicate a β-helix secondary structure, similar to (dA)
n, and in agreement with the literature [
13].
As the last group of molecules presented in this paper, we move on to describe the absorption and CD spectra of (dAT)
n repeats of DNA, shown in
Figure 4; they are similar to those obtained by other groups [
29]. Here, one notices that the features recall both the spectra of (dA)
n and (dT)
n, as expected. The maximum of the main absorption band of (dAT)
n is at 260 nm, with the lone exception of (dAT)
10 (261 nm). As a comparison, the maximum absorption of (dA)
n is 256.5 nm and of (dT)
n is 266 nm. The absorption maximum of the A-T repeat strands is located close to the average of the constituent homo-oligomers: 261.25 ± 0.5 nm. Therefore, the contribution to the transition dipole strength is distributed equally between the two nucleobases adenine and thymine. One would expect similar behavior in the circular dichroism spectra, but the features are instead more mixed. For instance, the positive band at 280 nm has a completely different shape than the one of (dA)
n, while it is analogous to the band of (dT)
n. Moreover, the shoulder present at approximately 230 nm in (dA)
n spectra is completely absent here and replaced with a (slightly positive) nodal point at 233 nm, a feature present in the spectra of (dT)
n at 231 nm, where instead the nodal point is slightly negative. The difference on the y-scale is due to the influence, although reduced, of the shoulder of (dA)
n.
The (dAT)
n molecules are complementary to one another, meaning that if the same ensemble contains two or more molecules of, for example, (dAT)
4 (any length will work), then two different ssDNA molecules could pair and form short double-stranded oligos. Nevertheless, the melting temperature of (dAT)
n molecules is above room temperature only for n > 6, and therefore, all the (dAT)
n molecules in this study, with the exception of (dAT)
10, can be safely assumed to be single-stranded. (dAT)
10 merits special comment not only for the melting temperature being between 24.6 °C and 37.9 °C, depending on the simulation model [
30,
31,
32], but also for the ability to form hairpins. (dAT)
n, in general, are self-complementary and are known to form hairpins at low concentrations of sodium ions [
33], presenting a partial double-stranded structure. But, again, the shorter (dAT)
n strands in this work are much less likely to form hairpins, since at least 4 nucleobases are required to create a loop [
34], automatically excluding any strand shorter than 10 nucleobases in the present work, so up to (dAT)
6. More specifically, while (dAT)
10 is long enough to create a hairpin, and its hairpin melting temperature is high; therefore, hairpins are likely to exist at room temperature. In fact, using the web-based software Mfold (version 3.5) [
35], the melting temperature of (dAT)
10 hairpins is 69.1 °C (see
Figure A9 in
Appendix A), well above room temperature, while the melting temperature for (dAT)
5 is only 30.3 °C, which is not far from room temperature, causing a lower chance of forming hairpins. The completely different character of the CD spectrum below 220 nm of (dAT)
10 as compared with shorter (dAT)
n strands is indeed due to some or all of the nucleobases forming parallel hydrogen bonds with the respective bases of the complementary oligo [
29]. Therefore, the secondary structure of (dAT)
10 is mostly a hairpin.
As explained in the methods section, in addition to the dipole strength, the absorption spectra can reveal the presence of interactions between stacked bases in the ssDNA, from which excitons arise. When comparing single-nucleobase and multiple-nucleobase spectra, stacked-base interaction signals appear as absorption peak shifts and absorption band narrowing. The results of the fitting routine explained in the methods section are the peak wavelengths and dipole strength µ
2 listed in
Table 1. The molecular weights (MW) and extinction coefficients at 260 nm (ε
260) are given by the manufacturer. All other values are extracted from the experimental absorption and circular dichroism spectra. We fitted the absorption spectra with three Gaussians (four in the case of (dAT)
n), using the methodology described in the methods section. The rationale behind the choice of fitting function and of the number of curves to fit is described in the SI and is also based on literature results [
36,
37], and our own experimental results of 2D photon-echo spectroscopy of nucleobases [
38] and on the same short-stranded DNA sequences as in the present paper (unpublished results). The fitting reveals two Gaussians underneath the 260 nm band, in accordance with the literature for single nucleotides [
36] and DNA single strands, and both are taken into consideration for calculating the transition dipole strengths. We calculated and plotted the transition dipole strengths (µ
2, from
Table 1) for the peaks at 257 nm to 274 nm and at 235 nm to 250 nm as a function of the number of nucleobases present in the strand of all the ssDNA studied; the result is shown in
Figure 5. The three interpolating lines are first-order polynomials, with a coefficient of determination (R
2) of 0.9934 ((dA)
n), 0.9997 ((dT)
n), and 0.9948 ((dAT)
n). From these linear fits, we calculated the dependence of the transition dipole strength on the number of bases. The transition dipole strengths of (dT)
n strands depend on the number of bases in DNA, as 12.3 D
2 per base. This value is a little more than half of the one obtained for (dA)
n, 21.83 D
2, while the transition dipole strength-per-base for (dAT)
n is between the two previous values: 18.64 D
2. We can only speculatively assign the physical meaning behind this behavior to a different dipole moment interaction between neighboring nucleobases. The exact assignments can be performed only with the help of high-level theoretical simulations, which are extremely computationally expensive. It is clear that the transition dipole strength grows linearly with the number of nucleobases, so one could estimate the value of the transition dipole strength for any given ssDNA, knowing the sequence of the nucleobases in the strand and its length.
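The linear dependence described above can be illustrated with a short sketch. It performs an ordinary least-squares fit of dipole strength against strand length and reports the slope (in D² per base), the intercept, and the coefficient of determination. The data points are synthetic placeholders, generated from the (dT)n slope quoted in the text with a hypothetical zero intercept and made-up deviations; they are not the measured values of Table 1, and the function name is illustrative.

```python
# Sketch: first-order polynomial fit of dipole strength vs. strand length.
# All data below are synthetic; only the 12.3 D^2-per-base slope is taken
# from the text. The zero intercept and the deviations are assumptions.


def linear_fit(x, y):
    """Ordinary least-squares line y = slope * x + intercept, plus R^2."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ss_res = sum((yi - (slope * xi + intercept)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    r_squared = 1.0 - ss_res / ss_tot
    return slope, intercept, r_squared


lengths = [2, 4, 6, 8, 10, 20]               # hypothetical strand lengths
SLOPE_DT = 12.3                              # D^2 per base, from the text
deviations = [0.5, -0.4, 0.3, -0.6, 0.2, 0.1]  # made-up scatter, in D^2
mu2 = [SLOPE_DT * n + d for n, d in zip(lengths, deviations)]

slope, intercept, r_squared = linear_fit(lengths, mu2)
predicted_15 = slope * 15 + intercept        # estimate for an unmeasured length

print(f"slope = {slope:.2f} D^2/base, R^2 = {r_squared:.4f}")
print(f"estimated dipole strength for n = 15: {predicted_15:.1f} D^2")
```

Once the slope and intercept of a family are known, the same line gives an estimate for any strand length of that family, which is the use suggested at the end of the paragraph above.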
In CD spectra, evidence for excitons originating from stacked-base interactions is the Cotton effect, a spectral feature also called the butterfly shape. It is marked by a zero-crossing (nodal) point of the CD spectrum that corresponds to an absorption band. The full butterfly shape, made by two bands in the CD spectrum, correlates with two transition dipole moments of opposite signs that arise from excitonic electronic-level splitting. The Cotton effect can be understood as follows: DNA has its nucleobases arranged in a right-handed helix. At shorter wavelengths, near the nodal point, the dipole moments of interacting bases are almost parallel, creating a combined exciton state. Here, the right-hand circularly polarized light is most strongly absorbed. At longer wavelengths, the dipole moments are directed almost opposite to each other; therefore, there is a strong absorption of left-hand (right-hand) circularly polarized light and hence the positive (negative) Cotton effect.
Furthermore, CD spectra can reveal the delocalization extent of excitons in oligomeric chains. If we assume that the basic element of the chain is a dimer, increasing the number of dimers leads to two possible kinds of behavior, as shown in the schematic of
Figure 6. At the bottom, the basic elements ignore each other, so the contribution to the CD signal is just the product of the dimer signal with the number of dimers. At the top of
Figure 6, the elements interact, and the CD signal is composed of the preceding non-interacting dimer signal together with the dimer interaction signal. To separate these two cases, the normalized peak values of the negative and positive bands of a butterfly-shaped feature are subtracted for different ssDNA lengths. In the first case, where excitons are localized solely among nearest neighbors, the relative strength of the CD signal does not increase as the number of bases per DNA strand increases. In the second case, where excitons are delocalized on more than one dimer, we expect an increased value with an increasing number of bases per strand. Qualitatively, as the slope steepness increases, so does the extent of the spatial delocalization.
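The comparison described above can be sketched as follows. For each strand length, the normalized values of the positive and negative peaks of the butterfly-shaped feature are subtracted, and the slope of that difference versus strand length is inspected. The numbers are hypothetical placeholders rather than measured values, the sign convention (positive peak minus negative peak) is one possible choice, and the tolerance used to call a trend flat is an arbitrary assumption.

```python
# Sketch: distinguish nearest-neighbor (localized) from extended
# (delocalized) exciton behavior using normalized CD peak differences.
# All input values are hypothetical; flat_tol is an arbitrary threshold.


def trend_slope(lengths, values):
    """Least-squares slope of values against strand length."""
    n = len(lengths)
    mean_x = sum(lengths) / n
    mean_y = sum(values) / n
    sxx = sum((x - mean_x) ** 2 for x in lengths)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(lengths, values))
    return sxy / sxx


def classify(lengths, cd_max, cd_min, flat_tol=0.01):
    """Return (slope, label) for the positive-minus-negative peak difference."""
    diffs = [hi - lo for hi, lo in zip(cd_max, cd_min)]
    slope = trend_slope(lengths, diffs)
    if slope > flat_tol:
        label = "delocalized beyond nearest neighbors"
    else:
        label = "localized on nearest neighbors"
    return slope, label


lengths = [4, 8, 12, 20]  # hypothetical strand lengths

# Hypothetical family whose normalized peak difference stays constant.
flat_slope, flat_label = classify(
    lengths, [1.00, 1.01, 0.99, 1.00], [-0.80, -0.80, -0.81, -0.80])

# Hypothetical family whose normalized peak difference keeps growing.
grow_slope, grow_label = classify(
    lengths, [0.5, 0.8, 1.1, 1.7], [-0.4, -0.6, -0.8, -1.2])

print(flat_label, "| slope per base:", round(flat_slope, 4))
print(grow_label, "| slope per base:", round(grow_slope, 4))
```

A slope close to zero corresponds to the bottom scenario of Figure 6, whereas a clearly positive slope corresponds to the top scenario; a steeper slope indicates a larger spatial extent of the exciton.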
The normalized CD spectra carry information on the type of interaction, regardless of whether it follows the top or the bottom situations depicted in
Figure 6. The CD peaks, once the spectra are properly normalized, can be extracted and plotted versus the length of the strand (number of nucleobases composing the DNA), as shown in
Figure A3 for (dA)
n,
Figure A4 for (dT)
n, and
Figure A5 for (dAT)
n (all in
Appendix A). The values of these CD peaks (
Table 1) can be processed to extract the exciton interaction strength when related peak pairs are identified. The maxima at 270−278 nm and minima at 250−251 nm in the CD spectra both fall under the same broad absorption peak, and they arise from an exciton coupling. The differences in the absolute values of the related maxima and minima are shown in
Figure 7, plotted versus the number of nucleobases per strand, (dA)
n in red, (dT)
n in blue, and (dAT)
n in green. While the CD peaks of both adenosine and thymidine strands increase in magnitude as the strands increase in length, the trend is not the same: in the case of adenosine, it is constant, while the relative CD strength increases with the number of thymidines per strand and does not saturate. It is known that B-DNA has a helicity of 10.5 bases per turn; therefore, in this study, we covered both the situation where DNA is not fully twisted (n ≤ 10) and where it is (n > 10). In both ranges, the value of the difference between CD peaks is stationary (adenosine) or increases (thymidine); therefore, the trend does not depend on whether the DNA is fully or partially twisted. This suggests that the excitons delocalize more and more as the (dT)
n becomes longer regardless of the completeness of the helix, as opposed to the excitons in (dA)
n that stay localized in a few nucleobases. In other words, for adenosine strands, the closest bases are the main contributors to the base-base interaction strength, and the extent of the excitons is only among the nearest neighbors.
The sum rules for absorption and circular dichroism spectra derived by Burin et al. [
16,
20] can be used to express exciton coupling strength through the integral properties of the experimental spectra, as outlined in the methods section. These have been applied to the spectra of this study, and the resulting values are plotted in
Figure 8. The values of the exciton coupling are similar to the values found for DNA hairpins by Burin et al. [
16]. Considering that hairpins are double-stranded, the lower values of exciton coupling that we find are to be expected. In fact, the values in Burin et al. [
16] for A-poly-T sequences (0.035 eV, 0.036 eV, 0.033 eV for n = 4, 6, 8) are 4% smaller for n = 6 and 20% smaller for n = 8 than our values. This is not surprising, considering that a double-stranded structure is more rigid than a single-stranded one and its nucleobases are closer; therefore, the electronic excitations are farther apart due to the stronger H- and J-shifts. The growing trend of thymine oligomers and the steady trend of adenine oligomers confirm our analysis (
Figure 7); the exceptions are strands of 2 and 4 nucleobases. This might be due to an underestimated difference in the Van Vleck theory between strands too short to complete even half of a chain twist (5 nucleobases in B-type DNA). Using Burin’s method, the (dAT)
n family holds steady, whereas using our method it grows; as previously mentioned, (dAT)
n is a self-annealing repeat that has a high chance to become double-stranded when long enough, while it stays unfolded when short. Consequently, the exciton coupling values for shorter and longer A-T strands represent two different structures (single-stranded and hairpin double-stranded DNA, respectively).
A surprising difference exists between the results of
Figure 7 and
Figure 8 on the one hand and
Figure 5 on the other, whereby the order of the three families is inverted: the values of the dipole strength of adenine homo-oligomers are the highest at any given number of nucleobases, while the value of the exciton coupling is the lowest. Due to the dipolar nature of exciton coupling, the oscillator strength is greater for quantum states with larger transition dipole moments [
39]. Therefore, the exciton coupling should scale with the dipole strength. Consequently, the previously mentioned difference might be simply due to a different geometry of the DNA, meaning that the nucleobases are not disposed in a parallel fashion or with a 36° angle between the nearby bases, as originally hypothesized.
The findings in the present work can be extended to any other sequence of DNA and RNA, provided that the molecules remain single-stranded and have a similar secondary structure. We plan to repeat the same measurements by dissolving the ssDNA molecules in buffers with compositions different from the one used in this work to ensure the presence of only single-stranded DNA molecules in the samples; in particular, we will add DMSO (dimethyl sulfoxide) to the phosphate-based buffer. DMSO is commonly used in biotechnology for the intentional denaturation of DNA, that is, forcing double-stranded DNA to separate and remain single-stranded. The data analysis cannot be immediately replicated, as the presence of ions and other molecules like DMSO modifies both the absorption and circular dichroism spectra, and both the primary and secondary structure of DNA, in a more complicated way than just denaturation of oligomers [
40,
41,
42]. Unfortunately, to the knowledge of the authors, there is no systematic study of the absorption and circular dichroism of short strands of DNA with well-known lengths and sequences where DMSO was employed and the signals characterized.