Asymptotic Capacity Results on the Discrete-Time Poisson Channel and the Noiseless Binary Channel with Detector Dead Time

This paper studies the discrete-time Poisson channel and the noiseless binary channel where, after recording a 1, the channel output is stuck at 0 for a certain period; this period is called the “dead time.” The communication capacities of these channels are analyzed, with main focus on the regime where the allowed average input power is close to zero, either because the bandwidth is large, or because the available continuous-time input power is low.


Introduction
For some detection systems that record arrivals, for example, a single-photon avalanche diode, after each recorded arrival, there comes a period during which the detector is not able to record any new arrivals. This period is often called the "dead time"; see, e.g., [1][2][3]. There are mainly two types of behavior for the dead time: nonparalyzable (nonextendable) means the detector dead time is a fixed period after each recorded arrival; and paralyzable (extendable) means the dead time happens after each occurred arrival, i.e., an arrival that is not recorded because the channel is already "dead" will restart the dead-time period. The two types of dead time are illustrated in Figure 1. One important example of a communication system where the detector may be affected by dead time is direct-detection optical communication, where the input signal is emitted by a laser or light-emitting diode. In Information Theory, such a system is often modeled as a Poisson channel. Some capacity results on the Poisson channel can be found in [4][5][6][7][8][9][10][11]. In this work we are interested in the communication capacity of the discrete-time Poisson channel with detector dead time, which, to our knowledge, is largely unexplored.
Before attacking the Poisson channel, we first look at the simpler problem of the noiseless binary channel, where the input can be 0 or 1, and where, when the channel is not "dead," the channel output always equals the input. We think of each received 1 as an arrival, and assume that it triggers a dead-time period of d channel uses. This problem has been extensively studied in the literature on run-length limited (RLL) coding [12][13][14][15]; we shall elaborate on this later. The noiseless binary channel with dead time can be thought of as a model for direct-detection optical channel with number states containing zero or one photon as inputs. 1 Furthermore, as we shall see, it serves as a reference for comparison for the Poisson channel.
For both channels, we are primarily interested in the wideband regime, where each channel use corresponds to a short time duration. We believe this regime to be particularly relevant, because each dead-time period would occupy a large number of channel uses. When studying the Poisson channel in the wideband regime, we distinguish between the cases where there is feedback and where there is not. Related to wideband, we also study the low-continuous-time-power regime where bandwidth is moderate. In both regimes, the average input power per channel use is small, but in the low-continuous-time-power regime, the dead-time period only occupies a moderate number of channel uses.
For most of the above cases, we determine the asymptotic capacity; in some cases we also determine the second-order term. In these cases, we show that dead time does not affect the asymptotic capacity. An exception is the Poisson channel in the wideband regime without feedback, for which we prove a lower bound, but we have not found a matching upper bound. In this case, we suspect that dead time does incur a penalty on the dominant term in capacity.
After introducing some notation and definitions, we present first our study on the noiseless binary channel, and then that on the Poisson channel. At the end of the paper, we summarize the results and discuss some future research directions.

Some Notation and Definitions
The channel, whether it is noiseless or Poisson, is characterized by two parameters: β denotes the maximum allowed average input power per channel use, and d denotes the duration of each dead-time period in channel uses. We further denote which corresponds to the maximum allowed average input energy per dead-time period. Capacity is in discrete time and has the unit "nats per channel use." As usual, it is defined as the supremum over all communication rates for which the probability of a decoding error can be made arbitrarily small. Given the above two parameters β and d, we use C NL (β, d) to denote the capacity of the noiseless binary channel, C P (β, d) that of the Poisson channel without feedback, and C P FB (β, d) that of the 1 Number states (Fock states) are pure quantum states that contain deterministic numbers of photons. They are different from coherent states, which are used to model light emitted by lasers. A (nonzero) coherent state is also a pure quantum state, but contains a random number of photons following a Poisson distribution. For more details we refer the reader to [16].
Poisson channel with feedback. Sometimes, to highlight the fact that we hold the parameter η to be fixed, we write η β in place of d, such as in C P (β, η β ). Unless otherwise stated, we use O(·) and o(·) to describe functions of β in the regime where β tends to zero, hence a function described as O( f ) satisfies and a function described as o( f ) satisfies Sometimes the dependence of the function f on β may be implicit, and f may be the constant 1. Throughout this paper, log denotes the natural logarithm, and information is measured in nats.

The Noiseless Binary Channel
We consider a noiseless channel with dead time d, where the input X takes value in {0, 1}. The law of the channel for nonparalyzable dead time is described as For paralyzable dead time, the channel law is given by Without constraints on the input sequence, the capacity of the channels (4) and (5) is well understood in terms of RLL codes. For example, it is given by the logarithm of the largest root to [13] a d+1 − a d − 1.
Alternatively, it can be written as max α∈[0,1] where H b (·) denotes the binary entropy function: Now consider the case where an average-power constraint is imposed on the input sequence. Specifically, for a codebook at blocklength n, we require that the average number of 1s contained in a codeword not exceed nβ. The capacity of this power-constrained channel is again an elementary result. For completeness, we present it as a proposition and provide a proof. Proposition 1. The capacities of the noiseless binary channels (4) and (5) with dead time d subject to the above power constraint are equal, and are given by Proof. See Appendix A.
Before proceeding with the asymptotic analysis, we put the above channel model in a continuous-time perspective. To this end, consider a noiseless photon channel where the receiver employs a single-photon detector with a time resolution of t seconds. More precisely, the time axis is divided into slots each lasting t seconds, and the detector declares each detected photon as belonging to a certain slot; it is not able to provide finer timing information on the photons. The sender sends a sequence of photons into the channel, each at a chosen time. 2 We impose an input-power constraint which says that the transmitter can send at most ρ photons per second. Assume that each dead-time period lasts τ seconds, where τ is an integer multiple of t. We can now think of each t-second slot as one use of a discrete-time noiseless binary channel. The discrete-time input is 1 if and only if the sender sends at least one photon in this slot. (Note that the transmitter's choice of timing within that slot is irrelevant, because the receiver cannot recover this information. Note as well that it is suboptimal for the transmitter to send more than one photon within a slot, because the second photon cannot be detected.) The discrete-time output is 1 if and only if a photon is detected in this slot. The discrete-time input-power constraint is β = ρt, and the discrete-time dead-time duration is d = τ t channel uses. In the following, we study the wideband regime where t is brought down to zero while ρ is held fixed. In this case, β approaches zero proportionally to t, while d grows to infinity proportionally to 1 t ; the product η = dβ = τρ remains unchanged. Then we also study the low-continuous-time-power regime where ρ is brought down to zero while t is held fixed. In this case, d is fixed and finite when β ↓ 0. For clarity, in the rest of this section we shall stay with the discrete-time picture, and only use the parameters d, β, and η.
As a reference for comparison, we recall that the capacity of the noiseless binary channel without dead time, in the regime where the average power β ↓ 0, is given by H b (β) and behaves like β log 1 We note that (10) has the same dominant term as the asymptotic capacity of the Poisson channel (without dead time), but a better second-order term [9,10]; see also (47) ahead.

Wideband Regime
We consider the regime where β ↓ 0 while η = dβ is held fixed. As we shall see, the asymptotic capacity has different expressions when η < 1 and when η ≥ 1.

The Case η < 1
In this case, the following proposition shows that the detector dead-time effect on C NL (β, η β ) is on the second-order term.

Proposition 2.
In the regime where β ↓ 0 and η = dβ is fixed and less than 1, Proof. First, note that, when η < 1, the condition for the maximization in (9) is equivalent to When β is sufficiently small, one could verify that 1 1+dα H b (α) is monotonically increasing in α satisfying (12). Consequently, (for sufficiently small β) the maximum in (9) is achieved by α = β 1−η , and The expression (11) follows by applying Taylor series expansion and rearranging terms.
In this case, dead time incurs a penalty on the first-order term of C NL (β, η β ) as shown by the following proposition.

Proposition 3.
In the regime where β ↓ 0 and η = dβ is fixed and greater than or equal to 1, Proof. We first note that the condition in (9) always holds. Indeed, when η ≥ 1, we have Hence the average-power constraint is inactive. We now derive a lower bound on C NL (β, d). To this end, we bound With the specific choice α = 1 d log d, we obtain and since log d 1+log d → 1 when d → ∞, we arrive at the asymptotic lower bound To get an upper bound on C NL (β, d), we write so we can continue (22) to obtain Let α * be the value of α that achieves the maximum in (24). The derivative with respect to α of the expression to be maximized on the right-hand side of (24) is Since f is monotonically decreasing (which can be verified by computing its derivative), positive when α ↓ 0, and negative at α = 1, it must have a unique root, therefore α * must satisfy Since is positive for all d > e 2 , we have for all d > e 2 . Therefore, for large enough d, where the second inequality follows by (28). Hence, by (24), (31), and the fact that α * achieves the maximum in (24), we get Combining (20) and (32) proves which, in the asymptotic regime of interest, is equivalent to (15).
The converse part of Proposition 3 can also be proven by noting the following. The dead time effectively imposes a constraint on the number of 1s in the output sequence: the proportion of 1s cannot exceed 1 d . Hence the capacity of this channel can be upper-bounded by the noiseless binary channel without dead time, but with an average-power constraint 1 d .
As noted in the proof, when η ≥ 1, the power constraint is inactive, so (33) provides an approximation to (7) when d is large. We plot the capacity (7) and its approximation d log 1 d in Figure 2.

Low-Continuous-Time-Power Regime
In this regime, the effect of a fixed dead time affects capacity starting only on the third-order term: In the regime where β ↓ 0 and d is fixed and finite, Proof. Following the argument used in the proof of Proposition 2, we have that, for small enough β, the maximum in (9) is achieved by Therefore, Using the Taylor series expansion for small a we continue (36) as which is as claimed.

The Poisson Channel
To introduce our model of the discrete-time Poisson channel with dead time, we start with a continuous-time picture. As in the noiseless case, we assume that the receiver employs a single-photon detector with a time resolution of t seconds: the time axis is divided into t-second slots, and each detected photon is declared to be in a certain slot. Further assume that a dead-time period lasts τ seconds, where τ = dt for an integer d. The transmitter modulates a laser signal by a (properly normalized) nonnegative waveform w(·). Let Y be the number of detected photons within a slot [t 0 , t 0 + t). Assume there is no dark current. Then, provided that the channel is not dead at time t 0 , where Note that (42) is different from the channel law of a standard Poisson channel, for which Y can take values that are larger than 1. This is because we assume the dead time τ to be longer than one slot t, and consequently there cannot be more than one recorded photon within one slot.
Thus, formally, the discrete-time Poisson channel with nonparalyzable dead time d has input alphabet R + 0 , output alphabet {0, 1}, and channel law To model paralyzable dead time, we introduce an auxiliary sequenceŶ n , which describes the output of the channel (42) without dead time: We then model the output sequence Y n as follows: X n − −Ŷ n − − Y n form a Markov chain, and Assume that an average-power constraint ρ is imposed on the continuous-time input waveform w(·). By (43), this implies an average-power constraint β = ρt on the discrete-time input x. That is, averaged over the codebook and the blocklength, We next study the capacity of the Poisson channel with nonparalyzable dead time (44) in the regime where β is close to zero. As in the previous section, we distinguish between the wideband scenario and the low-continuous-time-power scenario. In the former, t ↓ 0, so β ↓ 0 and d → ∞ with η = dβ = τρ remaining unchanged; in the latter, ρ ↓ 0 so β ↓ 0 while d remains unchanged. In the former, wideband scenario, we further distinguish between the cases where there is feedback and where there is not. Henceforth we shall stay in the discrete-time picture and no longer refer to the continuous-time picture.
For comparison, we note that, for a discrete-time Poisson channel without dead time and with average-power constraint β, in the regime where β is close to zero, the best known capacity approximation is given by [10] β log 1 (The work [10] considers the standard Poisson channel where the output is not binary. However, the achievability proof in [10] maps the output to a binary random variable, so (47) applies also to the binary-output channel that is the same as our model when d = 0.) In general, no closed-form capacity expression has been found for this channel. Several capacity bounds and asymptotic results have been obtained in [7][8][9][10][11].

Wideband Regime, with Feedback
Consider the regime where β ↓ 0 while η = dβ is held fixed. Assume that immediate noiseless feedback is available, so the encoder learns the realizations of Y 1 , . . . , Y i−1 before producing x i . We first observe that the capacity of the Poisson channel with feedback cannot exceed that of the noiseless binary channel.

Proposition 5.
For any β ∈ R + 0 and d ∈ Z + 0 , the capacity of the channel (44) under constraint (46) with immediate noiseless feedback satisfies Proof. Assume that an encoder-decoder pair for the Poisson channel with feedback is given. When the encoder is applied on the message M = m, the output vector equals y n with a certain probability Pr(Y n = y n |M = m).
Now for the noiseless channel we construct a random encoder as follows: given the message M = m, the codeword is chosen to be y n with the probability given by (49). Clearly, all codewords will pass through the channel without change. Combining this random encoder with the given decoder for the Poisson channel, we then obtain exactly the same error probability for both channels. Further, from (44) it is clear that the average power in Y n is less than or equal to that in X n , therefore our random encoder for the noiseless channel consumes at most as much power as the given encoder for the Poisson channel. Thus, given any code for the Poisson channel with feedback, we can construct a valid code for the noiseless binary channel that has the same performance.
The next proposition shows that, in the wideband regime, the capacity of the Poisson channel with feedback has the same dominant term as that of the noiseless binary channel.

Proposition 6.
For the Poisson channel (44) under constraint (46) with immediate noiseless feedback, in the regime where β ↓ 0 and η = dβ is fixed, capacity has the asymptotic expression (50) Proof. The converse part follows immediately from Propositions 2, 3 and 5. We next prove the achievability part in the case where η < 1. We shall describe a random coding scheme. We first note a small technicality: the average input power of the following scheme is larger than β and of the form β + o(β). Hence, to obtain a codebook that satisfies the average-power constraint, one should replace β in the following by (1 − )β for some positive , and later let approach zero. For simplicity, we shall ignore this technicality henceforth.
Assume that β is sufficiently small. For the first channel use, we choose For future channel uses, based on the feedback, the transmitter determines whether the channel is dead or not. When it is dead, the transmitter chooses X = 0 with probability one; otherwise it picks X according to (51) independently of all past inputs and outputs.

From (51) we obtain
and Since each Y = 1 results in d dead channel uses, recalling dβ = η, we have by (53) that, as n → ∞ and β ↓ 0, the ratio of dead to not-dead channel uses approaches η 1−η in probability, and hence the proportion of not-dead channel uses among all channel uses approaches (1 − η) in probability. This combined with (52) shows that, as claimed earlier, over all channel uses E[X] = β + o(β).
The dead channel uses can be identified by both the transmitter and the receiver, hence they can be discarded. The remaining channel uses form a sequence that is memoryless, and our choice of inputs on these channel uses are independent and identically distributed (IID). We can hence apply Shannon's classic result for discrete memoryless channels [17]: over the channel uses that are not dead, we can achieve all rates up to the mutual information. Note that, for small a, Recalling (53), we then have From (51) we have Hence we have Recalling that the proportion of not-dead channel uses tends to (1 − η) in probability as n → ∞ and β ↓ 0 completes the proof for the case where η < 1.
We now consider the case where η ≥ 1. We use a similar random coding scheme as above, the difference being we replace the distribution (51) by the following: where γ > 0 will be chosen to approach zero later on. Instead of (53), we now have This implies that, as n → ∞ and β ↓ 0, the portion of not-dead channel uses approaches γ η+γ in probability. This further implies that so the average-power constraint is satisfied when β is small enough. By the same argument as in the previous case, over the channel uses that are not dead, we can achieve any rate up to The overall rate is thus given by the above multiplied by γ η+γ , which is Letting γ approach zero completes the proof.

Remark 3.
The coding schemes used in the above proof work for paralyzable dead time as well. When dead time is paralyzable, then in general whether the channel is dead or not cannot be determined from the output sequence. Nevertheless, for the proposed schemes, it can be so determined. Indeed, since the initial state of the channel is known ("not dead"), and since the transmitter always sends 0 when it knows the channel to be dead, one can show by induction that no arrival will occur when the channel is dead, hence dead time will never be restarted in the paralyzable case. This impliesŶ n = Y n with probability one.

Wideband Regime, without Feedback
Again consider the regime where β ↓ 0 while η = dβ is held fixed. Now we assume there is no feedback, so the transmitter cannot know exactly when the channel is dead. In the following we analyze a strategy in which the transmitter sends zero whenever the channel may be dead. The strategy employs pulse-position modulation (PPM) [10,18,19].
We fix a positive integer b and specify its value later. We divide the n channel uses into blocks of length (b + d). Within each block, we send a PPM symbol in the first b channel uses, and send zeros in the last d channel uses. Specifically, the transmitter uniformly picks one among the first b channel uses, and sends In all other channel uses in this block, it sends X = 0. The pulse position in different blocks are chosen independently. Clearly, the power constraint (46) is satisfied. Within a specific block, given the input vector, the output has the following distribution. With probability 1 − e −ξ , Y = 1 at the same position as the input pulse, and Y = 0 elsewhere; and with probability e −ξ , Y = 0 for the entire block. Furthermore, the channel's behavior between different blocks are independent (because the last d channel uses in every block are not used, a dead-time period cannot extend to the next block). We thus obtain n b+d uses of a memoryless b-ary erasure channel, where the erasure probability is e −ξ , and with IID uniform inputs. The capacity of this erasure channel is given by The rate we thus achieve over the Poisson channel is (ignoring the difference between n b+d and n b+d , whose effect vanishes for large n) for some positive , and recall that η = bβ. Then (65) becomes (In the above we ignored the difference between d and d, the effect of which becomes negligible as d → ∞.) Letting ↓ 0 in the above yields the following asymptotic lower bound.

Proposition 7.
For the channel (44) under constraint (46) with no feedback, in the regime where β ↓ 0 and η = dβ is fixed, capacity satisfies In the above scheme, the transmitter sends 0 whenever there is a nonzero probability that the channel is dead, hence no input energy is wasted. The penalty of this approach is that most of the channel uses are wasted because they are treated as "possibly dead." Each "possibly dead" period spans d channel uses, and we must use all the energy budget that is "saved" within this period, which equals η, before the next "possibly dead" period starts. This means our choice of the amplitude of the pulse in PPM must be at least η. For the Poisson channel without dead time, the asymptotically optimal choice for the amplitude of the pulse in PPM should approach zero [10]. That we must choose η in place of 0 accounts for the factor 1−e −η η in (68) as opposed to 1 in (47). One can check that using on-off keying instead of PPM (while still sending zero whenever the channel may be dead) achieves the same lower bound.
An alternative approach to the above would be to "waste" energy instead of channel uses, or to find a trade-off between these two resources. At the time of writing this paper, we have not found a scheme along this direction that provides a better asymptotic lower bound than (68). Neither have we found a nontrivial upper bound; the best asymptotic upper bound that we know is the capacity with feedback (50).
We compare the asymptotic capacity with feedback (50) and the lower bound without feedback (68) in Figure 3.

Low-Continuous-Time-Power Regime
We now consider the regime where the allowed continuous-time input power is low, i.e., where β ↓ 0 while d is held fixed. We shall not consider feedback, because, as we shall see, the effect of dead time on capacity is rather small even when there is no feedback.
We again consider a PPM scheme. The n channel uses are divided into blocks of length (b + d), where we now choose Within each block, the transmitter uniformly picks one among the first b channel uses, where it sends For all the other channel uses in this block, the transmitter sends 0. The pulse position in different blocks are chosen independently of each other. The average-power constraint (46) is satisfied: over all the n channel uses, As in Section 3.2, we obtain n b+d uses of a b-ary erasure channel. For each block, with probability e −ζ , the output sequence contains only zeros; and with probability (1 − e −ζ ), Y = 1 at the same position as the input pulse, and all other outputs in the block are zero. The capacity of this b-ary erasure channel is The rate we can thus achieve over the original channel is (ignoring the effect of the · operation) Comparing this expression with (47) shows that the influence of d is only in the O(β) term. We summarize this observation in the following proposition.

Proposition 8.
For the channel (44) under constraint (46), in the regime where β ↓ 0 and d is fixed, capacity satisfies for all finite d.

Discussion
As a first step toward understanding communication channels with detector dead time, we have studied the noiseless binary channel and the discrete-time Poisson channel with dead time in the asymptotic regime where the allowed average input power approaches zero. Although these channels have memory, the results in this paper were obtained using simple tools; we have not explored information stability [20] or information-spectrum methods [21,22].
In the scenario where bandwidth is fixed and the average continuous-time input power is required to be low, for both channels, dead time has no effect on the first-and second-order terms in capacity. This may not be surprising, as only a vanishing proportion of channel uses are dead. In the scenario where continuous-time input power is fixed and bandwidth grows large, if dead time limits the maximum possible output rate (as compared to the allowed input rate), then it incurs a penalty in the dominant term in capacity for both channels. If dead time does not limit the output rate, then, for the noiseless channel and for the Poisson channel with feedback, it does not affect the dominant term in capacity, even though now a nonvanishing proportion of channel uses would be dead. For the Poisson channel without feedback in this scenario, there is a gap in the dominant term between our best achievability result and the capacity without dead time. Intuitively, this gap is due to the fact that the transmitter must either sacrifice almost all available channel uses (if it sends nothing when the channel "may be dead"), or risk wasting input energy. We conjecture that the dominant term in capacity in this case is indeed smaller than that without dead time, but a nontrivial upper bound remains to be proven.
In our study of the Poisson channel, we have assumed the dark current to be zero. In the presence of a nonzero dark current, detections can occur at the receiver even when no input is provided to the channel, and these detections will trigger the dead time, complicating the problem considerably. We leave this as a topic for future work.
Another potentially interesting problem is the peak-limited continuous-time Poisson channel [4][5][6]. The closed-form capacity expression of this channel is a classic result. However, we are not aware of any capacity results on such a channel with dead time.
Funding: This research received no external funding.
As n grows large, α as defined by (A8) can be arbitrarily close to any real number that satisfies Thus we can combine (A7) and (A9) to obtain Finally, we show that C NL (β, d) = C (β, d) for all β and d. Since C (β, d) is the capacity subject to a stronger constraint than C NL (β, d), we immediately have C NL (β, d) ≥ C (β, d). For the reverse direction, consider a sequence of length-n codebooks at rate R that satisfies the average-power constraint. Take an arbitrary > 0. Form a new sequence of codebooks by collecting all the codewords in the original codebooks that contain at most (1 + )nβ 1s. Clearly, the new codebook satisfies the hard constraint for average power (1 + )β. The size of the new codebook must be at least 1+ · e nR . As n grows large, the influence of the factor 1+ on the rate of the codebook vanishes. We thus obtain, for any > 0, C NL (β, d) ≤ C ((1 + )β, d). (A12) The proof is completed by noting that C (·, d) is a continuous function.