- freely available
- re-usable

*Int. J. Mol. Sci.*
**2012**,
*13*(6),
7445-7465;
doi:10.3390/ijms13067445

^{1}

^{2}

## Abstract

**:**We describe a statistical method to analyze dual-channel photon arrival trajectories from single molecule spectroscopy model-free to identify break points in the intensity ratio. Photons are binned with a short bin size to calculate the logarithm of the intensity ratio for each bin. Stochastic photon counting noise leads to a near-normal distribution of this logarithm and the standard student t-test is used to find statistically significant changes in this quantity. In stochastic simulations we determine the significance threshold for the t-test’s p-value at a given level of confidence. We test the method’s sensitivity and accuracy indicating that the analysis reliably locates break points with significant changes in the intensity ratio with little or no error in realistic trajectories with large numbers of small change points, while still identifying a large fraction of the frequent break points with small intensity changes. Based on these results we present an approach to estimate confidence intervals for the identified break point locations and recommend a bin size to choose for the analysis. The method proves powerful and reliable in the analysis of simulated and actual data of single molecule reorientation in a glassy matrix.

## 1. Introduction

Single molecule experiments can probe structure and dynamics on a molecular scale, revealing details that in traditional bulk experiments remain hidden behind ensemble averages [1–8]. In particular, the technique allows the observation of individual events of molecular dynamics, yielding distributions of and correlations between different dynamical properties. Unfortunately, as the method is still young [5], few analysis methods are available to harvest the vast pool of information from the original fluorescence intensity or photon arrival trajectories. Single molecule experiments are therefore often analyzed with correlation functions building on the common practice for bulk experiments. Even though differences between individual molecules observed in the sample are preserved, many details embedded in the temporal sequences of events are lost in the correlation procedure, which time-averages over the trajectory. Furthermore, properly averaged autocorrelation functions require a trajectory length of at least 100-times the correlation time [9], while experimental trajectories are often much shorter than this minimum due to the limited photochemical lifetime of the probe molecules.

Recently, methods were introduced that permit the correlation analysis of photon arrival times directly without binning [10,11], which, while increasing the time resolution achievable, still suffer from the time-averaging intrinsic in correlation analysis. Starting with the analysis of quantum dot and single molecule “blinking”, a drop of the total intensity to zero [1,12–19], often associated with excursions to the triplet state of the single molecule [20,21], the advantage of the identification of individual events in the single molecule trajectory became obvious. Numerous methods that are based on model dynamics were subsequently introduced, such as the analysis of hidden Markov chains, for example through Bayesian analysis [22–24], photon counting histograms [25], or maximum likelihood analysis [26,27]. These approaches are appropriate for single molecule dynamics with a well-known number of accessible states, for example in the case of blinking or enzymatic turnovers, which can be approximated as “on” and “off” states, but even this simplification is debated [28].

Several model-independent methods are described in the literature, which are capable of detecting individual intensity change points directly from a single-channel photon arrival trajectory with Bayesian or maximum likelihood approaches [29–31], yielding the times of sudden changes in a piece-wise constant fluorescence intensity trace. These methods do not require the assumption of an underlying mechanism or a limited number of accessible states, and can therefore be applied more general. Typical applications are the identification of enzymatic turnovers [32], motor protein movements [33], or nanoparticle blinking [18,19,34], all leading to large fluctuations in the fluorescence intensity associated with the dynamical event of interest. Those sudden changes in the fluorescence are common in single molecule spectroscopy, caused by jumps between states with different emission characteristics as opposed to the continuous changes seen in bulk.

In addition to the single-channel intensity detection employed in the examples above, ratiometric measurements, that is the simultaneous recording of two intensity channels, is another widely used technique in single molecule spectroscopy [2,35,36]. Examples are the detection of two different polarization directions of the emitted fluorescence, I_{||} and I_{⊥}, to observe the angular reorientation of a single probe molecule reporting on polymer [37–39] or protein dynamics [40] as well as the detection of two different emission wavelengths, either to observe shifts in an emission spectrum [41] or to determine the distance between a pair of single molecules showing Förster resonance energy transfer (sp-FRET) [42–44].

Instead of investigating the intensities directly, a ratiometric analysis focuses on a normalized intensity ratio, for example, in the case of single molecule orientational motion, the reduced linear dichroism [45] (Equation 1).

Here, effects of the photodynamics of the probe molecule, which might change the total intensity, I_{||}+ I_{⊥} without affecting the intensity ratio, I_{||}/I_{⊥}, are eliminated from the monitored quantity, I_{d}, and only changes in the polarization direction of the emission, caused by single molecule reorientation, are recorded. It would be desirable to detect sudden changes in the ratiometric measure similar to their detection in the single-channel analysis methods discussed above. A two-state Markov-chain approach has for example been used to analyze sp-FRET experiments [44,46].

This intensity ratio is central to any of the two-channel experimental methods mentioned above. As an example for its usefulness we will here discuss one application, monitoring the fluorescence polarization direction of the single molecule emission. However, the method described here is general and can easily be adapted to any ratiometric single molecule technique.

To construct a model-free approach for the detection of change points in a ratiometric variable one might imagine analyzing both intensity channels separately using one of the single-channel model-free methods [29–31] and combining the results. However, single probe molecule “blinking” that leaves the ratiometric measure of interest (such as the reduced linear dichroism, I_{d}) undefined during any “dark” periods interferes with this approach. An example illustrating the high frequency and short durations of these blinking events is shown in Figure 1 for the two polarization directions of the fluorescence from a single rhodamine B molecule immobilized in a solid polymer matrix. Short gaps in the continuous stream of photons indicate frequent blinking events with typical durations on the order of a millisecond. Also shown are the results of a statistical analysis routine [29] analyzing the photon arrival times in each detection channel separately. This routine, in combination with a subsequent coincidence analysis [47] would need to identify all of these blinking periods to avoid false positive ratiometric change points, the quantity of interest in the applications mentioned above. As can be seen from Figure 1 this is clearly not the case, justifying the search for a new statistical analysis method dedicated to the identification of ratiometric change points, as presented in this paper.

## 2. Results and Discussion

#### 2.1. Threshold Values for Significance

The threshold value, τ_{1−}_{α}, indicating statistically significant differences between two sections of intensity ratio points with a total length L at various levels of confidence 1−α, is shown in Figure 2. The threshold value τ_{1−}_{α} does depend very weakly on both the average intensity ratio, 〈ρ〉, between the two detection channels, and the average number of photons per bin, 〈N_{photon}〉, and is shown here only for equal intensity in the two channels, 〈ρ〉= 1, and 〈N_{photon}〉= 25 photons per bin. We fit the slow increase of the threshold value τ_{1−}_{α} as a function of the total number of sample points, L, with the empirical fit function:

where A represents an amplitude, s a scaling factor, and λ a power law exponent. Table 1 lists the resulting best-fit parameters for the various levels of confidence, 1−α, for the fits displayed in Figure 2.

If the analysis is performed with the additional safeguard that N_{min} = 10 consecutive points have to surpass the threshold value, τ_{1−}_{α}, the effective probability, α_{eff}, of false positives is further suppressed, for example to α_{eff} = 0.05% in the case of α = 1%, as listed in Table 1. This safeguard might seem overly cautious, but in our particular application we are interested in an accurate identification of large intensity ratio changes, ρ_{i}/ρ_{i}_{+1}, and the waiting times, t_{w}, in between. Break points missed because of this additional safeguard are either characterized by small intensity ratio changes, ρ_{i}/ρ_{i}_{+1}, or a short waiting time, t_{w}, leading up to the break point (see Section 2.3), neither one of which constitutes a shortcoming in the analysis of our single molecule experiments [38].

Trial break points, k′, near the beginning (k′ → 0) or the end (k′ → L) of the investigated sequence of the trajectory generate one section with a very small sample size. The resulting large fluctuations in the mean and standard deviation of this small section lead to an increased probability of false positive identification, as shown in Figure 3 for N = 100,000 simulations for sequences of length L = 100. The increased probability of false positives near one of the limits of the sequence does not depend on the length of the remainder of the sequence, L − k′. As this increased likelihood for the identification of false break points significantly exceeds the ideal, targeted, probability of α, as indicated by the dashed line in Figure 3, we exclude N_{excl} = 10 points at the beginning and the end of the investigated sequence of intensity ratio points from the analysis.

#### 2.2. Accuracy

To illustrate the strength of the analysis method, Figure 4a displays an example with L = 1000 simulated intensity ratio points with 〈N_{photon}〉 = 25 photons per bin. The sequence features a break point in the center, k = L/2, with a change in the intensity ratio of 30%. The corresponding likelihood measure, $\mathcal{L}$(k′), calculated at each trial break point, k′, for this example is displayed in panel (b) of Figure 4. The maximum likelihood value, $\mathcal{L}$_{max}, significantly exceeds the threshold value τ_{1−}_{α} (added as a dashed line for the 1 − α = 99% level of confidence) for this sample length, leading to a clear identification of the simulated break point. We use the location of the maximum likelihood value, $\mathcal{L}$_{max} = L(k̂), as the best estimate, k̂, for the location of the actual break point, k.

#### 2.2.1. Distribution of Location Error

The distribution of the deviation, Δk̂ = k̂− k, of the estimated location for the intensity ratio break point, k̂, from the actual break point, k, is shown in Figure 5 for various changes in the intensity ratio, ρ_{i}/ρ_{i}_{+1}. The probability P(Δ k̂) falls off approximately exponentially from a maximum probability at zero error, Δ k̂ = 0 and only depends on the change in the intensity ratio, ρ_{i}/ρ_{i}_{+1}, at the break point, but not on the length of the sample, L, or on the relative location, k/L, of the break point within the sample. For sequences similar to the example pictured in Figure 4 with ρ_{i}/ρ_{i}_{+1} = 1.30 and 〈N_{photon}〉 = 25 photons per bin, over 20% of all break points are identified without error, Δk̂ = 0. The average deviation of all identified break points is 〈|Δk̂|〉 = 3.7 bins. The average deviation, 〈|Δk̂|〉 is shown in Figure 6 as a function of the relative change in intensity ratio, ρ_{i}/ρ_{i}_{+1}, for a bin size of 〈N_{photon}〉 = 25 photons per bin. Also indicated in Figure 6 is the percentage of correctly identified break point locations, P(Δk̂ = 0).

The selection of the bin size, 〈N_{photon}〉, determines the photon counting and single molecule blinking noise of the binned intensity ratios, the number of intensity ratio points to be analyzed between two break points, and the time resolution of the analysis. The distribution of the error of the estimated location, however, if measured in terms of the absolute deviation in time instead of as a number of bins, proved to be independent of the bin size chosen for the analysis of a given photon sequence. Binning fewer photons per bin might increase the time resolution per bin, but due to the increased photon counting noise per bin will not change the timing error in the estimated break point locations. The only way to reduce the timing error for break points even further is to record the photon trajectory at a higher intensity, thus increasing the number of photons available for analysis for the same number of break points.

#### 2.2.2. Estimation of Location Error

From N = 10,000 random number simulations with one break point in the center we find empirically that the threshold τ′_{1−}_{α}, which defines the confidence interval for k̂, depends strongly on the length, L, of the sequence tested, the average number of photons per bin, 〈N_{photon}〉, and the size of the step at the break point, ρ_{i}/ρ_{i}_{+1}. However, for any combination of these three parameters investigated we find that the threshold value τ′_{1−}_{α} falls within 20% of a common upper bound, if described as a function of the variable $\mathcal{L}$_{max}/L, where $\mathcal{L}$_{max} is the likelihood of the estimated break point for a particular sequence of length, L. We describe this upper bound empirically through a power law that approaches a constant level for small values of $\mathcal{L}$_{max}/L.

where x = $\mathcal{L}$_{max}/L, τ′_{0} is a constant threshold for small values of $\mathcal{L}$_{max}/L, A signifies an amplitude, and λ is a power-law exponent. Table 2 lists the resulting parameters describing this upper bound for various levels of confidence, 1 − α. In tests on simulated photon trajectories of various lengths and step sizes we find that the fraction of the actual break points that lie within the thus estimated confidence interval is close to the expected value 1 − α for most parameter combinations. Only for very short sequences or few photons per bin do we observe the fraction of break points within the confidence interval to drop below 1 − α.

#### 2.3. Sensitivity

We determine the sensitivity of the analysis, that is the probability of false negatives, or break points missed, by testing the method described above on two types of sequences of simulated intensity ratio points. The first set of tests is performed on sequences that include a single break point at location k, the second set of tests uses simple model sequences with multiple break points (N_{jumps} = 200) of identical spacing and constant changes in the intensity ratio (square waves). The probability of false negatives in sequences with a single break point depends strongly on the change in intensity ratio, ρ_{i}/ρ_{i}_{+1}, the average number of photons per bin, 〈N_{photon}〉, and the length of the investigated sequence, L. Figure 7 shows these dependences for a bin size of 〈N_{photon}〉 = 25 photons per bin both as a contour plot and as cuts for a variety of intensity ratio changes. As expected, closely spaced break points with small changes in the intensity ratio are likely missed, but both closely spaced large jumps as well as well-separated small jumps can be detected reliably.

The algorithm employed to search for multiple break points in photon arrival trajectories (Section 4.4) was separately tested on simulated model trajectories with multiple break points. The resulting probability for false negatives is very similar to the diagram depicted in Figure 7, except for a reduced sensitivity for break points in very short sequences (L < 40 points) caused by the additional safeguards added to protect against false positive identifications (Section 4.1).

For a given sequence of photons, smaller bin sizes, 〈N_{photon}〉, lead to higher photon counting noise per bin for the intensity ratio, which, despite producing more bins to analyze for the sequence, reduces the sensitivity significantly (Figure 8). This observation is in contrast to results for statistical single-channel photon trajectory analysis, where the largest information content is revealed using a photon-by-photon approach [10]. On the other hand, large bin sizes can reduce the number of bins per break point below the additional safeguards introduced to guard against false positives (Section 4.1), such that actual break points could be rejected.

#### 2.4. Comparison to Existing Methods

Returning to the challenge of analyzing single molecule photon arrival trajectories containing frequent blinking events (Figure 1) we now compare the results of the method described in this paper to those from the separate identification of intensity break points in each channel in combination with a subsequent coincidence analysis. A maximum likelihood analysis [29] of the photon arrival times in a single detection channel can identify many of the “dark” of “bright” periods, but due to their high frequency and very short duration nevertheless misses a significant fraction randomly in one of the two channels, as illustrated in Figure 1. A subsequent test for coincidences of these breaks points in both channels [47] to distinguish changes in the intensity ratio from changes in the total intensity therefore yields false-positive change points for the ratiometric measure whenever a break point is missed in one of the two channels.

Figure 9 illustrates the result of this coincidence analysis for the photon arrival trajectory shown in Figure 1. As the trajectories were collected for a single molecule in a rigid polymer matrix, no molecular reorientations and therefore no changes in the fluorescence intensity ratio occur in the experiment. As shown, the single-channel analysis leads to a dramatic overestimation of the single molecule orientational dynamics. In comparison, the statistical analysis of the photon frequency, as suggested in this paper with a very short bin width on the order of the average blinking duration, yields the expected result of a constant ratio without any break points.

As an example of the application of the proposed method to a single molecule trajectory with reorientations, Figure 10 illustrates the strength of the method by comparing an experimental trajectory with the corresponding reconstructed single molecule orientational dynamics. The reconstructed trajectory follows the experimental intensity ratio very well, capturing all significant reorientations of the probe molecule. Even more importantly, dynamical quantities such as the correlation function for the orientational motion of the probe molecule are represented equally well with the reconstructed trajectory [38]. In analogy to the discussion in the proceeding sections, we tested the accuracy and sensitivity of the analysis method applied to simulated single molecule trajectories where jump times and amplitudes were known. The simulated trajectories very closely resemble those recorded in single molecule experiments in a glass matrix at the glass transition temperature [38]. The dynamics in a glass is characterized by a very broad distribution of waiting times, spanning several orders of magnitude. In addition, the assumed exponential distribution of jump sizes generates a large number of break points with small changes of the intensity ratio. Even though both of these factors present a challenge to the analysis routine, the overall performance is very satisfactory, as basically all simulated break points with a change in the intensity ratio of ρ_{i}/ρ_{i}_{+1} > 1.25 are identified with 63% of the extracted jump times exhibiting an error in the location of less than 2 bins. This error in the location is well within the intrinsic timing error expected for single molecule experiments for a given recorded photon rate and can be controlled by choosing the excitation intensity appropriately.

## 3. Single Molecule Experiment

In single molecule spectroscopy, a fluorescent probe is embedded in the matrix of interest at a very low concentration [3–5]. We use rhodamine B as a probe molecule in the polymer poly(vinyl acetate) in the vicinity of the glass transition temperature of the matrix. Details of the experiment are published elsewhere [38,48]. Very briefly, a high-NA microscope objective focuses a cw He-Ne laser onto the sample and collects the fluorescence from the probe molecules. After spectral filtering, a dielectric polarization cube splits the emission into two perpendicular polarization directions, which are detected on separate single-photon-counting photodiodes. For later statistical analysis, the arrival timestamps of every detected photon in each channel are continuously recorded.

The recorded fluorescence intensity in both polarization directions, I_{||} and I_{⊥}, as well as the ratio of these two fluorescence intensities, ρ = I_{||}/I_{⊥}, exhibit sudden changes, as shown in Figure 10. While changes in the total fluorescence intensity, I = I_{||} + I_{⊥}, could be caused by the probe molecule’s photodynamics, such as excursions to the triplet state [1,13,32], or fluctuations in the fluorescence lifetime due to changes in the probe environment [49], changes in the ratio of the fluorescence intensity in the two polarization directions, ρ = I_{||}/I_{⊥}, indicate reorientations of the probe molecule.

## 4. Analysis Method

The analysis method described here identifies statistically significant changes in the expectation value of the observed intensity ratio. We bin photons from the original photon arrival trajectory with a short bin width of Δt = 5 ms, which is larger than the lengths of typical blinking periods for rhodamine [12,50], corresponding to an average number of about 〈N_{photon}〉 ~ 25 photons per bin at typical intensities in our experiments. For each bin we calculate the logarithm of the ratio of the intensity in the two detection channels,

Stochastic photon counting noise leads to a near-normal distribution [51,52] of log(ρ) as illustrated in Figure 11 for simulated photon arrival times in two channels with a constant average intensity ratio of 〈ρ〉 = 1 and 〈N_{photon}〉 = 25 photons per bin. At these bin sizes bins without photons are extraordinarily rare and do not require special treatment. The near-normality of the distribution for log(ρ), combined with the numerous statistical tools available for the normal distribution, is the reason why we analyze the logarithm of the intensity ratio (Equation 4) instead of the linear dichroism, I_{d}, (Equation 1) which is traditionally used in fluorescence microscopy.

#### 4.1. False Positives

For normally distributed populations the student t-test is traditionally used to find statistically significant differences between the means of two samples [53]. We determine the actual threshold values indicating significant differences between the means of two samples (that is, between two consecutive sections of binned intensity ratio values) through random number simulations. To this end we simulate N = 100,000 trials of photon arrival time sequences of varying lengths, L = 20 to L = 5000 points, without a break in the intensity ratio, ρ, (testing for false positives). For these simulated photon arrival trajectories we calculate the logarithm of the intensity ratio, log(ρ), for bins with 〈N_{photon}〉 = 25 photons on average.

At every point, k′, with k′ = 0. . . L, in these intensity ratio sequences we determine the Student’s t-test’s p-value using standard statistical procedures [53] to test for statistical differences between the sequences before and after the trial point k′, {0, . . ., k′− 1} and {k′, . . ., L}, as if k′ were an actual break point in the intensity ratio sequence. As the sequences simulated to test for false positives do not contain an actual break point, we save the maximum p-value, p_{max}, found in each of the N trials of a given length, L. We exclude the p-values for the first and last N_{excl} = 10 potential break points, as those show very large fluctuations due to the small size of one of the two sections tested (Section 2.1). For each length, L, we find a threshold value τ_{1−}_{α}, such that for a fraction 1−α of the N trials the maximum p-value, p_{max}, falls below the threshold τ_{1−}_{α}, where 1 − α is the level of confidence.

#### 4.2. Location of Break Points

The simulated example trajectory (with a break point) shown in Figure 4 illustrates the approach used for the analysis of our experimental results. We split a sequence of intensity ratio points into two sections at a trial break point, k′, calculate the p-value, p, for the statistical significance of different means to determine the likelihood, $\mathcal{L}$(k′) = −log(p), for a break at this trial position, k′. We repeat the test for all trial break points, k′ = N_{excl} . . . L − N_{excl} in the sequence (again excluding the first and last N_{excl} = 10 points). We accept the break point, k′_{max}, with the maximum likelihood, $\mathcal{L}$_{max} = $\mathcal{L}$(k′_{max}), as our maximum likelihood estimate, k̂, if $\mathcal{L}$_{max} exceeds the threshold value τ_{1−}_{α} for a sequence of length L at the confidence level 1 − α.

For the analysis of our experimental data we choose a conservative level of confidence of 1−α = 99% that in addition has to be surpassed by at least N_{min} = 10 consecutive trial break points around k̂ to further guard against false positive identifications caused by additional (non-photon counting) noise in the experiment, for example through frequent blinking of the probe molecule. This safeguard limits the shortest detectable distance between break points, t_{w}, to about t_{w,min} ~ N_{min}Δt = 50 ms and raises the detection threshold for break points significantly. The analysis of our experimental data therefore rejects some fraction of the break points with the smallest change in intensity ratio, ρ_{i}/ρ_{i}_{+1}. However, for our particular application these rejections are much less of a concern than the inclusion of just a few false positive break points.

#### 4.3. False Negatives and Error of Location Estimate

In separate stochastic simulations of photon streams that feature one intensity ratio break in the center, k = L/2, as illustrated in Figure 4, we determine the fraction of missed break points (false negatives) as a function of the change in intensity ratio, Δρ/ρ, and the length, L, of the sequence probed. We choose a level of confidence of 1 − α = 99% plus an additional safeguard of a minimum of N_{min} = 10 points above the threshold, τ_{1−}_{α}, to test the routine under the same conditions as in the analysis of our experimental trajectories. We perform N = 10,000 trials for intensity changes of ρ_{i}/ρ_{i}_{+1} = 1.1 to ρ_{i}/ρ_{i}_{+1} = 2.0, with an average number of 〈N_{photon}〉 = 25 photons per bin with equal intensity in both channels when averaged over the entire trajectory. For successfully identified break points we determine the distribution of the error, Δk̂ = k̂ − k, for the estimated location of the break point, k̂, as a function of the length of the sequence, L, and the change in the intensity ratio, ρ_{i}/ρ_{i}_{+1}, at the break point, k.

Furthermore, we determine the difference in the likelihood measure Δ$\mathcal{L}$_{max} = $\mathcal{L}$_{max}−$\mathcal{L}$(k) between the likelihood at the maximum, $\mathcal{L}$_{max} = $\mathcal{L}$(k̂) and the likelihood for a break at the actual break point, $\mathcal{L}$(k). From the distribution of this likelihood difference, Δ$\mathcal{L}$_{max}, we find a threshold value τ′_{1−}_{α} such that for a fraction 1−α of the N trials Δ$\mathcal{L}$_{max} falls below τ′_{1−}_{α} where 1−α signifies the level of confidence for the location error estimate. We use this threshold value τ′_{1−}_{α} to find the confidence interval k^{−} . . . k^{+} for the estimated break point k̂, where k^{−} and k^{+} are the two points to the right and left of k̂, respectively, such that $\mathcal{L}$(k^{−}) = $\mathcal{L}$(k^{+}) = $\mathcal{L}$_{max} − τ′_{1−}_{α} (see Figure 4).

#### 4.4. Algorithm

The application of the t-test as described above can only locate one most probable change point, k̂, in a given sequence of points. To find all change points, {k̂_{i}}, in an experimental trajectory, we systematically test for potential break points in a slowly growing section of the trajectory, starting at the last identified break point, k̂_{i}_{−1}, lengthening the sequence under test by N_{step} = 5 bins at a time. As an additional safeguard against spurious break points we also require that any probable new break point, k̂_{i}, is confirmed N_{repeat} = 2 more times in additional sequences that start with the same previously identified break point, k̂_{i}_{−1}, but are lengthened by an additional N_{step} bins each time. If the break point is reproducible, we subsequently double-check the previously identified change point k̂_{i}_{−1} in the sequence bracketed by the two adjacent break points k̂_{i}_{−2} and (the newly identified) k̂_{i}. If k̂_{i}_{−1} is confirmed as the most likely break point location between k̂_{i}_{−2} and k̂_{i}, the process continues from break point k̂_{i} to search for a new break point k̂_{i}_{+1}. However, if the previously identified most likely break point location, k̂_{i}_{−1}, differs from the new location of the break point between k̂_{i}_{−2} and k̂_{i}, or if break point k̂_{i}_{−1} is no longer statistically significant given the new sequence limit k̂_{i}, break point k̂_{i}_{−1} is modified accordingly (or eliminated all together) and the confirmation check continues backwards until the sequence {k̂_{0} . . . k̂_{i}} is self-consistent. The algorithm is represented graphically in Figure 12. This approach eventually yields a time sequence of n most likely intensity ratio change points, {k̂_{i}}, with i = 0. . . n.

We calculate the corresponding intensity ratios, ρ_{i}, between two identified change points, k̂_{i} and k̂_{i}_{+1}, directly from the number of photons recorded between these two times in each detection channel. To accelerate the calculation we approximate the p-value, p, through the following equation that we determined empirically:

with

where x = log(ρ), while χ̄_{i} and s_{i} are the maximum likelihood estimators for mean and standard deviation, determined for the intensity ratio sequence sections before (i = 1) and after (i = 2) the trial break point, k′. To further improve the speed of the calculation, we pre-calculate cumulative sums for x and x_{2} for the entire tested sequence, {ρ_{0}, . . ., ρ_{L}} to quickly determine averages and standard deviations for the two samples on either side of all possible trial break points k′ from differences between the corresponding two elements of the cumulative sums.

#### 4.5. Simulation of Photon Sequences With Multiple Break Points

To test the performance, sensitivity, and reliability of the analysis routine algorithm, we simulate the following three types of trajectories with multiple break points: (a) square wave intensities with constant waiting times and constant intensity jumps, varying both parameters independently in separate runs; (b) trajectories with constant waiting time but intensity jumps of random amplitude, varying only the constant waiting time in separate simulation runs; as well as (c) photon sequences that are comparable to experimentally recorded trajectories. The simulation of realistic experimental single molecule trajectories is based on a recently proposed model for the dynamics of glasses [38], which allows us to simulate the waiting times between changes in the fluorescence polarization recorded for the single probe molecule. We simulate the changes in the intensity ratio, ρ, at these break points from angular jump trajectories of the single molecule that stem from random walks on a sphere with isotropic exponential jump size distribution. Accounting for the numerical aperture of the microscope objective [54], we subsequently calculate the intensities in the two polarization directions for each orientation, randomly pick photon arrival times with an exponential waiting time distribution, consistent with these intensities in the two detection channels and finally bin the photons as done in the experiment. In these simulations we assume that waiting times and jump sizes are uncorrelated, which is consistent with our experimental results. The purpose of these different simulations is to determine the percentage of identified jumps (sensitivity), the average error in the estimated location of break points (accuracy) of the utilized algorithm (Section 4.4).

## 5. Conclusions

Single molecule spectroscopy is a new and very powerful experimental technique, calling for new analysis methods. The statistical method described in this paper identifies sudden changes in a ratiometric variable, the ratio between two fluorescence intensities, indicating the times of individual dynamical events of the single probe molecule, from the recorded photon arrival times in the two detection channels. This model-free analysis approach provides quantifiable error estimates for the jump times at a chosen level of confidence. Tests on simulated photon arrival trajectories indicate that the analysis method locates all events of significant magnitude with little or no error, and still recovers a large fraction of jumps with small amplitudes. The approach described in this paper is general and can easily be applied to different functional forms of the ratiometric variables to analyze other single molecule experiments, such as sp-FRET or spectral diffusion without assuming an underlying kinetic mechanism or a limited number of accessible states. The method is only sensitive to changes in the ratiometric observable and avoids the interference of the frequent and brief single molecule blinking. It therefore fills a gap in the collection of single molecule analysis methods. Moving away from the analysis of single molecule experiments with time correlation functions to a more detailed statistical description using an event perspective affords a higher level of detail accessible in the experiment, bringing us closer to harnessing the full power of both the single molecule and single photon detection technique.

## Acknowledgements

This work was made possible through NSF grant CHE-0749863

## References

- Plakhotnik, T.; Donley, E.A.; Wild, U.P. Single-molecule spectroscopy. Ann. Rev. Phys. Chem
**1997**, 48, 181–212. [Google Scholar] - Weiss, S. Fluorescence spectroscopy of single biomolecules. Science
**1999**, 283, 1676–1683. [Google Scholar] - Xie, X.S.; Trautman, J.K. Optical studies of single molecules at room temperature. Ann. Rev. Phys. Chem
**1998**, 49, 441–480. [Google Scholar] - Moerner, W.E.; Orrit, M. Illuminating single molecules in condensed matter. Science
**1999**, 283, 1670–1676. [Google Scholar] - Moerner, W.E. A dozen years of single-molecule spectroscopy in physics, chemistry, and biophysics. J. Phys. Chem. B
**2002**, 106, 910–927. [Google Scholar] - Moerner, W.E. New directions in single-molecule imaging and analysis. Proc. Natl. Acad. Sci. USA
**2007**, 104, 12596–12602. [Google Scholar] - Schuler, B. Single-molecule fluorescence spectroscopy of protein folding. ChemPhysChem
**2005**, 6, 1206–1220. [Google Scholar] - Tamarat, P.; Maali, A.; Lounis, B.; Orrit, M. Ten years of single-molecule spectroscopy. J. Phys. Chem. A
**2000**, 104, 1–16. [Google Scholar] - Bingemann, D. Analysis of ‘blinking’ or ‘hopping’ single molecule signals with a limited number of transitions. Chem. Phys. Lett
**2006**, 433, 234–238. [Google Scholar] - Yang, H.; Xie, X.S. Probing single-molecule dynamics photon by photon. J. Chem. Phys
**2002**, 117, 10965–10979. [Google Scholar] - Hinze, G.; Basche, T. Statistical analysis of time resolved single molecule fluorescence data without time binning. J. Chem. Phys
**2010**, 132, 044509:1–044509:6. [Google Scholar] - Ha, T.; Enderle, T.; Chemla, D.S.; Selvin, P.R.; Weiss, S. Quantum jumps of single molecules at room temperature. Chem. Phys. Lett
**1997**, 271, 1–5. [Google Scholar] - Basche, T. Fluorescence intensity fluctuations of single atoms, molecules and nanoparticles. J. Lumin
**1998**, 76–7, 263–269. [Google Scholar] - Ishitobi, H.; Kai, T.; Fujita, K.; Sekkat, Z.; Kawata, S. On fluorescence blinking of single molecules in polymers. Chem. Phys. Lett
**2009**, 468, 234–238. [Google Scholar] - Heilemann, M.; Margeat, E.; Kasper, R.; Sauer, M.; Tinnefeld, P. Carbocyanine dyes as efficient reversible single-molecule optical switch. J. Am. Chem. Soc
**2005**, 127, 3801–3806. [Google Scholar] - Biebricher, A.; Sauer, M.; Tinnefeld, P. Radiative and nonradiative rate fluctuations of single colloidal semiconductor nanocrystals. J. Phys. Chem. B
**2006**, 110, 5174–5178. [Google Scholar] - Nirmal, M.; Dabbousi, B.O.; Bawendi, M.G.; Macklin, J.J.; Trautman, J.K.; Harris, T.D.; Brus, L.E. Fluorescence intermittency in single cadmium selenide nanocrystals. Nature
**1996**, 383, 802–804. [Google Scholar] - Shimizu, K.T.; Neuhauser, R.G.; Leatherdale, C.A.; Empedocles, S.A.; Woo, W.K.; Bawendi, M.G. Blinking statistics in single semiconductor nanocrystal quantum dots. Phys. Rev. B
**2001**, 63, 205316:1–205316:5. [Google Scholar] - Kuno, M.; Fromm, D.P.; Hamann, H.F.; Gallagher, A.; Nesbitt, D.J. Nonexponential “blinking” kinetics of single CdSe quantum dots: A universal power law behavior. J. Chem. Phys
**2000**, 112, 3117–3120. [Google Scholar] - Gensch, T.; Bohmer, M.; Aramendia, P.F. Single molecule blinking and photobleaching separated by wide-field fluorescence microscopy. J. Phys. Chem. A
**2005**, 109, 6652–6658. [Google Scholar] - Haase, M.; Hubner, C.G.; Reuther, E.; Herrmann, A.; Mullen, K.; Basche, T. Exponential and power-law kinetics in single-molecule fluorescence intermittency. J. Phys. Chem. B
**2004**, 108, 10445–10450. [Google Scholar] - Andrec, M.; Levy, R.M.; Talaga, D.S. Direct determination of kinetic rates from single-molecule photon arrival trajectories using hidden markov models. J. Phys. Chem. A
**2003**, 107, 7454–7464. [Google Scholar] - Schenter, G.K.; Lu, H.P.; Xie, X.S. Statistical analyses and theoretical models of single-molecule enzymatic dynamics. J. Phys. Chem. A
**1999**, 103, 10477–10488. [Google Scholar] - Jung, S.; Dickson, R.M. Hidden markov analysis of short single molecule intensity trajectories. J. Phys. Chem. B
**2009**, 113, 13886–13890. [Google Scholar] - Hajdziona, M.; Molski, A. Estimation of single-molecule blinking parameters using photon counting histogram. Chem. Phys. Lett
**2009**, 470, 363–366. [Google Scholar] - Burzykowski, T.; Szubiakowski, J.; Ryden, T. Analysis of photon count data from single-molecule fluorescence experiments. Chem. Phys
**2003**, 288, 291–307. [Google Scholar] - Jager, M.; Kiel, A.; Herten, D.P.; Hamprecht, F.A. Analysis of single-molecule fluorescence spectroscopic data with a markov-modulated poisson process. Chemphyschem
**2009**, 10, 2486–2495. [Google Scholar] - Zhang, K.; Chang, H.; Fu, A.; Alivisatos, A.P.; Yang, H. Continuous distribution of emission states from single CdSe/ZnS quantum dots. Nano Lett
**2006**, 6, 843–847. [Google Scholar] - Watkins, L.P.; Yang, H. Detection of intensity change points in time-resolved single-molecule measurements. J. Phys. Chem. B
**2005**, 109, 617–628. [Google Scholar] - Kalafut, B.; Visscher, K. An objective, model-independent method for detection of non-uniform steps in noisy signals. Comput. Phys. Commun
**2008**, 179, 716–723. [Google Scholar] - Ensign, D.L.; Pande, V.S. Bayesian detection of intensity changes in single molecule and molecular dynamics trajectories. J. Phys. Chem. B
**2010**, 114, 280–292. [Google Scholar] - Lu, H.P.; Xun, L.Y.; Xie, X.S. Single-molecule enzymatic dynamics. Science
**1998**, 282, 1877–1882. [Google Scholar] - Carter, N.J.; Cross, R.A. Mechanics of the kinesin step. Nature
**2005**, 435, 308–312. [Google Scholar] - Empedocles, S.A.; Norris, D.J.; Bawendi, M.G. Photoluminescence spectroscopy of single CdSe nanocrystallite quantum dots. Phys. Rev. Lett
**1996**, 77, 3873–3876. [Google Scholar] - Berezin, M.Y.; Achilefu, S. Fluorescence lifetime measurements and biological imaging. Chem. Rev
**2010**, 110, 2641–2684. [Google Scholar] - Deniz, A.A.; Laurence, T.A.; Dahan, M.; Chemla, D.S.; Schultz, P.G.; Weiss, S. Ratiometric single-molecule studies of freely diffusing biomolecules. Ann. Rev. Phys. Chem
**2001**, 52, 233–253. [Google Scholar] - Rosenberg, S.A.; Quinlan, M.E.; Forkey, J.N.; Goldman, Y.E. Rotational motions of macro-molecules by single-molecule fluorescence microscopy. Acc. Chem. Res
**2005**, 38, 583–593. [Google Scholar] - Bingemann, D.; Allen, R.; Olesen, S. Single molecules reveal the dynamics of heterogeneities in a polymer at the glass transition. J. Chem. Phys
**2011**, 134, 024513:1–024513:9. [Google Scholar] - Adhikari, S.; Selmke, M.; Cichos, F. Temperature dependent single molecule rotational dynamics in PMA. Phys. Chem. Chem. Phys
**2011**, 13, 1849–1856. [Google Scholar] - Hu, D.; Lu, H.P. Single-molecule nanosecond anisotropy dynamics of tethered protein motions. J. Phys. Chem. B
**2002**, 107, 618–626. [Google Scholar] - Prummer, M.; Sick, B.; Renn, A.; Wild, U.P. Multiparameter microscopy and spectroscopy for single-molecule analytics. Anal. Chem
**2004**, 76, 1633–1640. [Google Scholar] - Lu, H.P. Probing single-molecule protein conformational dynamics. Acc. Chem. Res
**2005**, 38, 557–565. [Google Scholar] - Yuan, H.; Xia, T.; Schuler, B.; Orrit, M. Temperature-cycle single-molecule FRET microscopy on polyprolines. Phys. Chem. Chem. Phys
**2011**, 13, 1762–1769. [Google Scholar] - Chung, H.S.; Gopich, I.V.; McHale, K.; Cellmer, T.; Louis, J.M.; Eaton, W.A. Extracting rate coefficients from single-molecule photon trajectories and fret efficiency histograms for a fast-folding protein. J.Phys. Chem. A
**2010**, 115, 3642–3656. [Google Scholar] - Jameson, D.M.; Ross, J.A. Fluorescence polarization/anisotropy in diagnostics and imaging. Chem. Rev
**2010**, 110, 2685–2708. [Google Scholar] - Gopich, I.V.; Szabo, A. Decoding the pattern of photon colors in single-molecule FRET. J. Phys. Chem. B
**2009**, 113, 10965–10973. [Google Scholar] - Xu, C.S.; Kim, H.; Hayden, C.C.; Yang, H. Joint statistical analysis of multichannel time series from single quantum dot-(Cy5)
_{n}constructs. J. Phys. Chem. B**2008**, 112, 5917–5923. [Google Scholar] - Adhikari, A.; Capurso, N.; Bingemann, D. Heterogeneous dynamics and dynamic heterogeneities at the glass transition probed with single molecule spectroscopy. J. Chem. Phys
**2007**, 127, 027732:1–027732:9. [Google Scholar] - Vallee, R.A.L.; Tomczak, N.; Vancso, G.J.; Kuipers, L.; van Hulst, N.F. Fluorescence lifetime fluctuations of single molecules probe local density fluctuations in disordered media: A bulk approach. J. Chem. Phys
**2005**, 122, 114704. [Google Scholar] - Teschke, O.; Dienes, A.; Holtom, G. Measurement of triplet lifetime in a jet stream cw dye laser. Opt. Commun
**1975**, 13, 318–320. [Google Scholar] - Clarke, R.W.; Orte, A.; Klenerman, D. Optimized threshold selection for single-molecule two-color fluorescence coincidence spectroscopy. Anal. Chem
**2007**, 79, 2771–2777. [Google Scholar] - Widengren, J.; Kudryavtsev, V.; Antonik, M.; Berger, S.; Gerken, M.; Seidel, C.A.M. Single-molecule detection and identification of multiple species by multiparameter fluorescence detection. Anal. Chem
**2006**, 78, 2039–2050. [Google Scholar] - Press, W.H.; Teukolsky, S.A.; Vetterling, W.T.; Flannery, B.P. Numerical Recipes in C: The Art of Scientific Computing; Cambridge University Press: Cambridge, UK, 1992. [Google Scholar]
- Fourkas, J.T. Rapid determination of the three-dimensional orientation of single molecules. Opt. Lett
**2001**, 26, 211–213. [Google Scholar]

**Figure 1.**Waiting times between two consecutive photons recorded in two perpendicular polarization directions as a function of photon arrival time for the fluorescence of a single rhodamine B molecule in a solid polymer matrix. “Blinking” leads to frequent gaps (“dark” periods) in the stream of photons with durations on the order of a millisecond. Line: Result of the identification of intensity change points in each detection channel separately with a maximum likelihood method [29].

**Figure 2.**Threshold values, τ

_{1−}

_{α}, for the likelihood measure $\mathcal{L}$(k′) corresponding to statistically significant intensity ratio break points at a level of confidence of 1 − α as indicated. Points: results of N = 100,000 random number simulations with 〈N

_{photon}〉 = 25 photons per bin and 〈ρ〉 = 1. Lines: smooth fits with an empirical function (Equation 2).

**Figure 3.**Probability for false positive break points as a function of the trial location k′ in simulated intensity ratio trajectories without a break. The ideal, uniform, distribution at a level of confidence of 1−α = 99% is indicated by the dotted line. The first and last N

_{excl}= 10 points, whose probability for false positives approaches or exceeds the ideal target value α, are excluded in the analysis.

**Figure 4.**Identification of the most likely break point in a simulated example (

**a**) Logarithm of the intensity ratio, log(ρ), with 〈ρ〉 = 1, for 1000 simulated points (red) with 〈N

_{photon}〉 = 25 photons per bin and a 30% change in the intensity ratio at the center point (thin vertical line). The average intensity is indicated with a thick line (blue); (

**b**) Likelihood measure, $\mathcal{L}$(k′), for a break at all possible trial positions, k′, for the trajectory in panel (

**a**). The threshold value, τ

_{1−}

_{α}, for a level of confidence 1 − α = 99% is indicated with the dashed line. The position, k̂, of the maximum likelihood, $\mathcal{L}$

_{max}= $\mathcal{L}$(k̂), is taken as the best estimate for the break location. Inset: Expanded view of the likelihood for a break around the maximum, illustrating the determination of the confidence interval k

^{−}. . . k

^{+}, around k̂ using the threshold value τ′

_{1−}

_{α}(blue).

**Figure 5.**Distribution of the deviation between actual and estimated location of the break point, Δk̂ = k̂ − k, for various changes in the intensity ratio at the break point, ρ

_{i}/ρ

_{i}

_{+1}, ranging from 1.2 to 2.0 in steps of 0.1, for an average number of 〈N

_{photon}〉 = 25 photons per bin.

**Figure 6.**Solid line: Average error, 〈|Δk̂|〉, for the estimated location, k̂, of the break point, k, as a function of the change in the intensity ratio at the break point ρ

_{i}/ρ

_{i}

_{+1}for 〈N

_{photon}〉 = 25 photons per bin. Dashed line: Percentage of correctly identified break point locations, P(Δ k̂ = 0). We find no statistically significant dependence of the error on the length of the sample or on the position of the break point within the sample.

**Figure 7.**Probability of undetected break points (false negatives), P, as a function of the magnitude of the change in intensity ratio at the break point, ρ

_{i}/ρ

_{i}

_{+1}, and length of the test sequences, L, for sequences containing one break point in the center. 〈N

_{photon}〉 = 25 photons per bin and 1 − α = 99%. Contour lines are spaced in 10% intervals.

**Figure 8.**Fraction of undetected break points (false negatives) for various changes in the intensity ratio, ρ

_{i}/ρ

_{i}

_{+1}, as indicated, in sequences of photons with 2000 photons per break point as a function of the average bin size, 〈N

_{photon}〉, for 1 − α = 99% using the analysis algorithm invoking the additional safeguards discussed in the text.

**Figure 9.**Ratio of the intensity of the two polarization components of the fluorescence from a single molecule embedded in a rigid polymer matrix, binned at 2 ms (red points). Result from the statistical analysis described in this paper (black line) indicating the expected lack of any reorientations of the probe molecule. Results from the separate identification of intensity break points in each polarization channel (blue line), frequently misidentifying blinking events as single molecule reorientations.

**Figure 10.**(

**a**) Intensity of the two polarization components, I

_{||}and I

_{⊥}, of the fluorescence emitted by a single rhodamine B molecule embedded in poly(vinyl acetate) at the glass transition temperature; (

**b**) Ratio of the intensity of the two polarization components, ρ = I

_{||}/I

_{⊥}, (green) and sequence of single molecule angular jumps (black), reconstructed using the analysis method described in this paper.

**Figure 11.**Distribution of the logarithm of the intensity ratio, ρ = I

_{||}/I

_{⊥}, for a simulated photon arrival trajectory, binned with 〈N

_{photon}〉 = 25 photons per bin (red), with fit to a Gaussian distribution (blue).

**Figure 12.**Schematic representation of the algorithm employed to systematically identify a sequence of most likely break points in an experimental trajectory in a self-consistent manner.

**Table 1.**Best-fit parameters for the empirical fit of Equation 2 to the threshold, τ

_{1−}

_{α}, for statistically significant differences in two sections of intensity ratio change points as determined in random number simulations. α: probability of false positive identification, α

_{eff}: probability of false positive identification if the additional safeguard is used that N

_{min}= 10 consecutive points have to surpass the threshold value, τ

_{1−α}.

Nominal Confidence, 1 − α | Effective Confidence, 1 − α_{eff} | Amplitude, A | scale, s | Exponent, λ |
---|---|---|---|---|

90% | 98% | 3.62 | 0.565 | 0.285 |

95% | 99% | 4.15 | 0.567 | 0.249 |

98% | 99.8% | 4.98 | 0.580 | 0.227 |

99% | 99.95% | 5.79 | 0.608 | 0.231 |

99.5% | 99.98% | 6.80 | 0.655 | 0.253 |

**Table 2.**Parameters for the empirical description of the upper bound for the threshold τ′

_{1−}

_{α}, according to Equation 3, to estimate the confidence interval k

^{−}. . . k

^{+}, for the estimated break point location, k̂, at a level of confidence of 1 − α.

confidence, 1 − α | small $\mathcal{L}$_{max}/L constant threshold, τ′_{0} | amplitude, A | exponent, λ |
---|---|---|---|

68% | 1.10 | 27 | 1.15 |

90% | 1.95 | 34 | 1.12 |

95% | 2.65 | 37 | 1.15 |

98% | 3.40 | 41 | 1.15 |

99% | 4.20 | 43 | 1.15 |

99.5% | 5.20 | 45 | 1.18 |

© 2012 by the authors; licensee Molecular Diversity Preservation International, Basel, Switzerland. This article is an open-access article distributed under the terms and conditions of the Creative Commons Attribution license (http://creativecommons.org/licenses/by/3.0/).