In Situ Flow Cytometer Calibration and Single-Molecule Resolution via Quantum Measurement

Fluorescent biomarkers are used to detect target molecules within inhomogeneous populations of cells. When these biomarkers are found in trace amounts it becomes extremely challenging to detect their presence in a flow cytometer. Here, we present a framework to draw a detection baseline for single emitters and enable absolute calibration of a flow cytometer based on quantum measurements. We used single-photon detection and found the second-order autocorrelation function of fluorescent light. We computed the success of rare-event detection for different signal-to-noise ratios (SNR). We showed high-accuracy identification of the events with occurrence rates below 10−5 even at modest SNR levels, enabling early disease diagnostics and post-disease monitoring.


Introduction
Flow cytometry is a commonly used optical technique to measure a wide range of cell properties in a high throughput manner by probing them with a laser as they flow through a flow chamber [1,2]. By introducing fluorescent markers to a sample, a flow cytometer can be used to detect the presence of target molecules and reveal the distribution of cells containing these molecules within a heterogeneous population [3,4]. There are certain situations where the target molecules are presented in trace amounts within the sample. In the case when only one or just a few fluorescent biomarkers are present in the probe volume of a flow cytometer, fluctuations in the laser power, optical loss, detector noise, and other factors make it hard to detect the fluorescent signal [5]. Although detection of single fluorescent molecules in flow cytometers has been previously claimed [6][7][8][9][10], these claims have not been rigorously proven using the quantum-mechanical formalism of the second-order coherence [11], a method that describes intensity correlations of light sources commonly measured using a Hanbury Brown and Twiss interferometer [12]. Previous claims rely on a priori knowledge of sample concentration and statistical analysis of photonbursts, which do not necessarily arise from a single-emitter as large scatterers including bubbles within the flow cell could also generate these bursts. To this end, we present a theoretical model for sensing fluorescent biomarkers at the single-molecule level based on the measurement of the second-order autocorrelation function of their emitted light. This method provides in situ verification of single-molecule sensitivity from first principles. It can unambiguously resolve the number of emitters within the probe volume, even when background noise is present. The advantage of this method is that it creates an independent scale for resolving the number of particles based on laws of quantum mechanics [13][14][15]. This scale is unaffected by loss, and thus, this measurement enables an absolute calibration method for flow cytometers, which otherwise is a challenging task [16].
Using this scale we quantify the ability of our method to identify the presence or absence of a biomarker when its occurrence, P, in a sample population is extremely rare (P < 10 −4 ). This level of sensitivity makes our scheme suitable for the diagnosis of earlystage diseases or the monitoring of disease recurrence. We show that identification of rarely occurring single biomarkers can be performed with a high success rate for achievable signalto-noise ratios (SNR) values. We demonstrate that our method can be used to discriminate between different numbers of biomarkers, also with a high success rate.

Method
A source of light can be characterized using the second-order autocorrelation function. This function relates the photon number fluctuations at time t with those at a different time t + τ. In quantum mechanics, the second-order autocorrelation function is expressed in terms of creation (a † ) and annihilation (a) operators as: When τ = 0, Equation (1) gives wheren is the photon number operator. It turns out that if light is generated by a single emitter, Equation (2) gives g (2) (0) = 0, because a single-photon emitter cannot emit more than one photon at a time. Note that for N single-photon emitters, the autocorrelation function at zero takes the following form: The phenomenon that g (2) (0) < 1 for all N < ∞ is known as antibunching. The dependence of the autocorrelation function g (2) (0) on the number of single-photon emitters is particularly strong for low N. Additionally, note from Equation (3) that g (2) (0) does not depend on optical loss. Therefore, the measurement of g (2) (0) allows quantification of the number of molecules present in the probe volume from first principles of optical quantum measurement.
It has been shown that we can infer the number of single-photon emitters within the collection volume based on the autocorrelation function and that a value of g (2) (0) < 0.5 implies that light comes predominantly from one single-photon emitter [11,13,17]. The relation between the autocorrelation function and the number of emitters has been studied widely and reported in numerous publications [18][19][20][21][22]. Here, we consider realistic experimental scenarios where the biomarker signal is polluted by noise. Typically, this noise is associated with the strong pump light used to excite fluorescence. Thus, it is common to encounter experiments plagued with optical background noise, such as thermal background, Raman scattering, autofluorescence, phosphorescence, scattered and unfiltered pump light, and ambient light leakage, among others. In addition, other effects, such as detector dark counts also contribute to background noise. In a faint coherent light field where P 1 P 2 P 3 . . ., the probability P N ph to observe N ph photons quickly decreases with the number of detected photons. Using this assumption, we can use the approximate expression g (2) (0) ≈ 2P 2 /P 2 1 , where P 1 ≈ N p b + p n and P 2 ≈ p 2 n /2 + N p n p b + N 2 p 2 b /2(1 − 1/N) correspond to the probabilities of detecting one and two photons, respectively. Here, by introducing p n and p b , the probabilities of detecting one photon from a noise source or from a single biomarker, respectively, we account for the presence of noise relative to the biomarker signal [23,24]. Substituting P 1 , P 2 into the approximate expression for g (2) (0), we get: One can then harness this relation as a tool to connect the number of emitters to the observed autocorrelation values in a measurement device from first principles. By comparing the measured g (2) (0) with those for a known number of emitters we can find p b and p n specific to an experimental setup. Note that this approach requires a constant number of emitters within the collection volume throughout the calibration measurement. In the case of flow cytometry, the number of emitters is changing stochastically. We found an expression for an average number of biomarkers in the collection volume instead. For constant concentration, the number of biomarkers is governed by a Poisson distribution. Thus, for an average of N biomarkers in the interrogation volume we find: where the summations can be done analytically: To understand this relation, consider the shape of the g (2) (τ), Figure 1a. In general, any light field whose intensity fluctuates in time exhibits a shoulder of higher values for 0 < τ << t ToF , where t ToF is the characteristic time of intensity fluctuations; specifically, in flow cytometry, t ToF is the time of flight of a particle through the interrogation volume. At τ >> t ToF g (2) (τ) reaches the uncorrelated background. If an assumption is made that light is only emitted by single-photon emitters, g [25]). We point out that the existence of the shoulder cannot be used for concentration estimation except when an additional assumption is made, because the shoulder can be due to any classical fluctuations. To prove the quantum character of the emitted light without extra assumptions, the difference between g (2) (0) and g (2) shoulder should be used. Therefore, for the dynamic measurements, we normalized the autocorrelation function g (2) (0) using the value g (2) shoulder [14], while in the static model, normalization at τ → ∞ is ordinarily used. In Figure 1b we show the behavior of g (2) (0) as a function of the number of emitters for both static (fixed number of emitters) and dynamic (varying but averaged number of emitters) cases for different values of the signal-to-noise ratio (SNR), defined here as p b /p n , i.e., SNR for a single emitter. As expected, the Poisson distribution correction for the dynamic case is most evident at low concentrations, i.e., precisely where g (2) (0) exhibits the most antibunching behavior, and therefore the averaging cannot be ignored. Figure 1b shows that it is still possible to observe the relation between g (2) (0) and the average number of emitters even when the number of emitters changes at random during the measurement. Therefore, correlation-based absolute calibration of flow cytometers is possible. To calibrate, one needs to prepare a sample with biomarkers that are sufficiently diluted but whose exact concentration could be unknown. Then, a series of known dilutions should be applied to change the average number of emitters in the collection volume, and a series of g (2) (0) measurements should be taken. The ratio p b /p n , defined as SNR above, then can be extracted from the fit of the experimental data with Equation (5). Further, the absolute concentration of the biomarkers (i.e., the accurate scaling for horizontal axis in Figure 1b follows from the same fit. In addition, from the coincidence analysis above, the rate of single counts provides information about the relation between p b , p n : P 1 = N p b + p n , where N is the average number of biomarkers per measurement volume. This allows extraction of both p b and p n from the measurement. Then, based on the photon rate of a single emitter and the values of p b , p n , it is possible to create the absolute photonic flux (intensity) scale that is proportional to the number of emitters (biomarkers) in a particular biological sample, i.e., to perform absolute calibration of a flow cytometer.
It is evident that the g (2) (0) values flatten and asymptote towards 1 for a low SNR as seen in Figure 1b. Hence, the accuracy with which we can measure the concentration is limited by the SNR of a system. We point out that the absolute metrological measurements are possible for SNR 0.5, i.e., in the presence of relatively strong noise.

Proposed Experimental Setup
The proposed experimental setup is shown in Figure 2. A dilute sample labeled with fluorescent biomarkers flows through a flow cell where it is interrogated with a laser that excites the emitters. Then, the fluorescent emission is collected and spectrally filtered to remove noise produced by the bright excitation beam. Spectral filtering can be achieved using dichroic mirrors and low-pass filters (LP filters). The fluorescent emission is then split into two spatial channels using a 50:50 beam splitter. In this way, a Hanbury Brown-Twiss measurement can be performed using a pair of single-photon detectors and a time tagger [12]. The second-order autocorrelation function can be obtained by analyzing the coincidence counts between the detectors. Typically, to define a coincidence, delays between the detections on both detectors are measured, and a coincidence occurs if a delay is within a pre-defined interval (time bin), which allows calculation of the autocorrelation function. If excitation with short pulses is used, the predefined interval is typically equal to the inverse of the pulse repetition rate. We estimated the expected number of photons in one measurement trial as λ ≈ M · (N p b + p n ) where M is the number of recorded time bins during the ToF of a biomarker. The expression for λ can be rewritten using the average number of photons expected in one such measurement: λ = N N b + N n . In flow cytometry, the measurement trial duration is related to the velocity of the flow. To find typical parameters for this quantum measurement-limited flow cytometer, we considered standard operational settings. To estimate N n,b , we need to find the ToF of the biological entity with the biomarker through the collection volume. We considered a sample flow rate of v s =10 µL/min and a sheath flow of v sh =100 µL/min, a flow cell with cross-sectional area A = 250 µm × 250 µm, with an excitation beam waist σ ≈ 1 µm. With these parameters the expected time of flight would be t ToF = σA/(v sh + v s ) ≈ 30 µs. As an excitation pump, we assumed a pulsed laser with a repetition rate of R = 80 MHz. We expected a biomarker with a quantum yield Q y ≈ 10% [26]. Given the ToF and the repetition rate, we can estimate the collection efficiency. For an objective with NA = 0.65, the one-half angular aperture is α = 40 • . The typical single-photon detector efficiency is η ≈ 50%. Using these settings and estimated coupling loss η c = 50%, we can expect the average number of detected photons N b ≈ t ToF RQ y ηη c (1 − cos(α))/2 ≈ 8 per trial. On the other hand, the number of background photons is dependent on multiple factors unique to each setup, and it is harder to estimate. Raman scattering is dependent on the buffer solution, and our ability to filter it out depends on the absorption and emission spectra of the biomarker and the filters used. Using this estimation, here we took N b = 8 and changed the value of N n to obtain SNR = {0.5, 1, 2, 4}.

Detection of Single Biomarkers
The capacity to detect a single emitter in flow cytometry can be an extremely advantageous tool for the diagnosis of early-stage diseases or the monitoring of a disease recurrence. Particularly important is detecting single rarely occurring biomarkers, for example, when samples have an occurrence rate (or the ratio of events with a biomarker present to the events without one) on the order of 1 in 1000 or less [27].
The identifying parameter is the number of photons detected during a single measurement trial, i.e., during the ToF of a biomarker through the collection volume. The likelihood that N ph photons are detected during a trial given that the ensemble under measurement contains N emitters is governed by the Poissonian process: L(N ph |N) = λ N ph exp(−λ)/N ph !, where λ is the expected number of photons detected in one trial, and it depends on the flow cytometer setup-see the previous section.
In Figure 3 we present a panel of graphs that show the probabilities L(N ph |N) to detect N ph photons for ensembles of N emitters for different signal and noise levels. Here we were interested in a very low number of photons detected, as is usually the case when the number of biomarkers is low while the flow cytometer throughput is maximized.
From Figure 3 we see that those probability distributions become more distinct for higher SNR (lower noise levels) and can be used to distinguish between different numbers of emitters. It is important to distinguish between those distributions (Bayesian likelihoods) and probabilities of successful detection of a biomarker, particularly when biomarkers are rarely occurring in a sample.
The probability to identify a biomarker can be found using the likelihood and the expected occurrence rate using Bayes formula: where P N is the prior (previously known) probability to encounter an ensemble of N emitters in the sample, and P N ph is the probability to detect N ph photons, which is given by the sum of likelihoods L(N ph |i) for all possible number of emitters i multiplied by the prior probabilities P i : Here we focus on the ultimate goal of quantum-enabled flow cytometry-the distinguishing of volumes with one emitter from those with no emitters assuming a low rate of occurrence. Under this condition, the probability of encountering a trial with more than one emitter can be neglected. We were interested in identifying the fraction of occasions when the positively identified volume indeed contains a single emitter: This success rate allows estimation of the fraction of the correctly identified cells containing a biomarker if the flow cytometer setup is followed by a cell sorter and when threshold N * ph for the number of detected photons is set for each trial. On the other hand, 1 − S(N * ph ) gives the fraction of incorrectly identified cells that do not contain a biomarker. Clearly, for large N * ph , the success rate can be nearly perfect, even in a situation with high background noise. The downside of choosing a high N * ph is the fraction of biomarkers that will not be identified: which, in the case of rare events, may not be achievable (see red curves in Figure 4a-d). Therefore, the probability F to miss a biomarker needs to be considered at the same time as the success rate. We present this statistical analysis for experimental scenarios that have different signal and noise intensities. We kept a constant average signal intensity of eight photons per measurement and varied the noise intensity from 2 to 16, Figure 4.  Dashed gray line is the isoline for a success rate of 50%. Red line: the probability F of erroneously discarding trials containing a biomarker as function of the threshold level N * ph ; dashed red line shows N * ph for which F(N * ph ) = 50%. We assume a fixed signal intensity N b = 8 (photons per trial) and noise intensities N n = {16, 8,4, 2} in (a-d), respectively. Note, that even though the density plot and red curves are shown on the same plot, the horizontal axes have different meaning for F and S.
The graphs presented in Figure 4 show how the success rate varies as the probability of encountering a single emitter changes, depending on the threshold for the number of detected photons. Based on the results of our model, we observed that SNR > 4 is required to achieve a success rate above 50% for the occurrence rate below 10 −4 and F > 50%. We can also see that for lower SNR, the success rate is drastically reduced; for an SNR < 0.5, it becomes virtually impossible to detect events with an occurrence probability lower than 10 −1 .
The success rate could be further improved by collecting and cross-referencing other information. For instance, if the occurrence of a rare single biomarker event is correlated with a different event, such as a scattering burst, or fluorescence activity at a different wavelength, that extra information can be employed. We did not investigate this procedure here, because such cross-referencing is typical for traditional flow cytometry analysis [28].
A similar procedure can be followed to establish the success identification rate and for discerning between zero and N > 1 emitters. In some cases, it may be desirable to distinguish between ensembles with a low and a high number of emitters. For instance, the measurement of the over-expression of the HER-2 gene is used to diagnose breast cancer, whereby the case with N = 2 HER-2 genes need to be distinguished from N > 5 [29,30]. As it can be seen from the Figure 5, discrimination between two and six emitters may be achieved with much lower SNR 0.5 than single emitter detection for the same signal intensity of N b = 8 photons per emitter per trial.

-5
10 -4 10 -3 10 -2 10 -1 1 Figure 5. Color map of the success rate S (i.e., the fraction of time when the volume identified as that containing six emitters indeed contains six emitters as opposed to just two emitters) as a function of the probability of encountering a volume with six emitters in a sample (horizontal axis) and the threshold for number of photons detected (left vertical axis). Dashed gray line is the isoline for success rate of 50%. Red line: the probability F to erroneously discard trials containing six biomarkers as a function of the threshold level N * ph ; dashed red line shows N * ph for which F(N * ph ) = 50%. This example models the identification of overexpressed HER-2 gene with six copies in a cell versus normal two copies (six biomarkers versus two biomarkers). We assumed a fixed signal intensity per biomarker N b = 8 (photons per trial) and noise intensities N n = {128, 64, 32, 16} in (a-d), respectively. Note that even though the density plot and red curves are shown on the same plot, the horizontal axes have different meanings for F and S.

Conclusions
In this work, we discussed a detection scheme and its calibration for flow cytometry based on quantum measurements. In particular, we discuss how the identification of single-photon emitters can be achieved and verified from first principles of quantum optics, the second-order autocorrelation measurement. We show that using these first principles, the in situ properties of signal and noise can be derived. This method establishes a photonic scale that provides number-of-emitters-resolving capability in flow cytometry and therefore the absolute self-calibration of the flow cytometer. The scheme proposed here is robust to photon loss since the second-order correlation function is loss-insensitive.
We also analyzed the performance of a flow cytometer at its limit of discriminating rarely occurring single biomarkers from random photon bursts due to noise and showed that it can be performed with the high success rate for practically achievable SNR values > 4. In addition, by analyzing the problem of discrimination of overexpressed HER-2 gene, we showed that successful cell sorting is possible for much lower SNR > 0.5. The capacity to detect a single emitter in flow cytometry and to distinguish between different number of emitters (biomarkers) can be an extremely advantageous tool for the diagnosis of early-stage diseases or the monitoring of a disease's recurrence. To substantiate the feasibility of our scheme, we conducted fluorescence measurements to estimate an achievable SNR. For this measurement, we used a 400 nm pulsed laser and commercially available biomarkers (Qdot nanocrystals, Life Technologies, Thermo Fisher Scientific, Waltham, MA, USA), at a concentration of approximately one biomarker in the detection volume of a microscope objective with a numerical aperture of 0.9. The manufacturer's provided concentration was used to determine the appropriate dilution. To measure the noise, we used a blank sample of distilled water. By these means, we obtained an SNR of 2.7, which led us to conclude that the scheme proposed in this work is feasible with present-day technology and instrumentation.