Article

Accuracy Assessment of Nondispersive Optical Perturbative Models through Capacity Analysis

Department of Electrical Engineering, Chalmers University of Technology, 41296 Gothenburg, Sweden
*
Author to whom correspondence should be addressed.
Entropy 2019, 21(8), 760; https://doi.org/10.3390/e21080760
Submission received: 20 May 2019 / Revised: 18 July 2019 / Accepted: 28 July 2019 / Published: 5 August 2019
(This article belongs to the Section Information Theory, Probability and Statistics)

Abstract

A number of simplified models, based on perturbation theory, have been proposed for the fiber-optical channel and have been extensively used in the literature. Although these models are mainly developed for the low-power regime, they are used at moderate or high powers as well. It remains unclear to what extent the capacity of these models is affected by the simplifying assumptions under which they are derived. In this paper, we consider single-channel data transmission based on three continuous-time optical models: (i) a regular perturbative channel, (ii) a logarithmic perturbative channel, and (iii) the stochastic nonlinear Schrödinger (NLS) channel. To obtain analytically tractable discrete-time models, we consider zero-dispersion fibers and a sampling receiver. We investigate the per-sample capacity of these models. Specifically, (i) we establish tight bounds on the capacity of the regular perturbative channel; (ii) we obtain the capacity of the logarithmic perturbative channel; and (iii) we present a novel upper bound on the capacity of the zero-dispersion NLS channel. Our results illustrate that the capacity of these models departs from each other at high powers because these models yield different capacity pre-logs. Since all three models are based on the same physical channel, our results highlight that care must be exercised in using simplified channel models in the high-power regime.

1. Introduction

The vast majority of global Internet traffic is conveyed through fiber-optical networks, which form the backbone of our information society. To cope with the growing data demand, fiber-optical networks have evolved from regenerated direct-detection systems to coherent wavelength-division multiplexing (WDM) ones. Newly emerging bandwidth-hungry services, like Internet-of-Things (IoT) applications and cloud processing, require even higher data rates. Motivated by this ever-growing demand, increasing attention has been devoted in recent years to the analysis of the capacity of the fiber-optical channel.
Finding the capacity of the fiber-optical channel that is governed by the stochastic nonlinear Schrödinger (NLS) equation ([1], Equation (1)), which captures the effects of Kerr nonlinearity, chromatic dispersion, and amplification noise, remains an open problem. An information-theoretic analysis of the NLS channel is cumbersome because of a complicated signal–noise interaction caused by the interplay between the nonlinearity and the dispersion [2]. In general, capacity analyses of optical fibers are performed either by considering simplified channels, or by evaluating mismatched decoding lower bounds [3] via simulations (see [4] and ([5], Sec. I) for excellent literature reviews). Lower bounds based on the mismatch-decoding framework go to zero after reaching a maximum (see, for example, [2,6,7,8,9]). Capacity lower bounds with a similar behavior are also reported in [10]. In [11], it has been shown that the maximum value of a capacity lower bound can be increased by increasing fiber dispersion, which mitigates the effects of nonlinearity. To establish a capacity upper bound, Kramer et al. [12] used the split-step Fourier (SSF) method, which is a standard approach to solve the NLS equation numerically ([13], Sec. 2.4.1), to derive a discrete-time channel model. They proved that the capacity of this discrete-time model is upper-bounded by that of an equivalent additive white Gaussian noise (AWGN) channel. In contrast to the available lower bounds, which fall to zero or saturate at high powers, this upper bound, which is the only one available for a realistic fiber channel model, grows unboundedly.
Since the information-theoretic analysis of the NLS channel is difficult, to approximate capacity, one can resort to simplified models, a number of which have been studied in the literature (see [14] and references therein for a recent review). Two approaches to obtain such models are to use the regular perturbation or the logarithmic perturbation methods. In the former, the effects of nonlinearity are captured by an additive perturbative term [15,16]. This approach yields a discrete-time channel with input–output relation $y_\ell = x_\ell + \Delta x_\ell + n_\ell$ ([14], Equation (5)), where $x_\ell$ and $y_\ell$ are the transmitted and received symbols, respectively; $n_\ell$ is the amplification noise; and $\Delta x_\ell$ is the perturbative nonlinear distortion. This model holds under the simplifying assumption that both the nonlinearity and the signal–noise interaction are weak, which is reasonable only at low power.
Regular perturbative fiber-optical channel models, with or without memory, have been extensively investigated in the literature. In [17], a first-order perturbative model for WDM systems with arbitrary filtering and sampling demodulation, and coherent detection is proposed. The accuracy of the model is assessed by comparing the value of a mismatch-decoding lower bound, which is derived analytically based on the perturbative model, with simulation results over a realistic fiber-optical channel. A good agreement at all power levels is observed. The capacity of a perturbative multiple-access channel is studied in [18]. It is shown that the nonlinear crosstalk between channels does not affect the capacity region when the information from all the channels is optimally used at each detector. However, if joint processing is not possible (it is typically computationally demanding [19]), the channel capacity is limited by the inter-channel distortion.
Another class of simplified models, which are equivalent to the regular perturbative ones up to a first-order linearization, is that of logarithmic perturbative models, where the nonlinear distortion term $\Delta x_\ell$ is modeled as a phase shift. This yields a discrete-time channel with input–output relation $y_\ell = x_\ell e^{j\Delta x_\ell} + n_\ell$ ([14], Equation (7)). In [5], a single-span optical channel model for a two-user WDM transmission system is developed from a coupled NLS equation, neglecting the dispersion effects within the WDM bands. The channel model in [5] resembles the perturbative logarithmic models. The authors study the capacity region of this channel in the high-power regime. It is shown that the capacity pre-log pair (1,1) is achievable, where the capacity pre-log is defined as the limit of $C/\log P$ as $P \to \infty$, where P is the input power and C is the capacity.
Despite the fact that the aforementioned simplified channels are valid only in the low-power regime, these models are often used also in the moderate- and high-power regimes. Currently, it is unclear to what extent the simplifications used to obtain these models influence the capacity at high powers. To find out, we study the capacity of two single-channel memoryless perturbative models, namely, a regular perturbative channel (RPC) and a logarithmic perturbative channel (LPC). To assess the accuracy of these two perturbative models, we also investigate the per-sample capacity of a memoryless NLS channel (MNC).
To enable an information-theoretic analysis of the fiber-optical channel, we adopt two common simplifying assumptions on the channel model. First, the dispersion is set to zero; second, a sampling receiver is used to obtain discrete-time models from the continuous-time channels. These two assumptions were first applied to the NLS equation in [1] to obtain an analytically tractable channel model. This channel model was developed also in [20,21,22] using different methods. In this paper, we refer to this model as the MNC.
In [21], a lower bound on the per-sample capacity of the memoryless NLS channel is derived, which proves that the capacity goes to infinity with power. In [22], the capacity of the same channel is evaluated numerically. Furthermore, it is shown that the capacity pre-log is $1/2$. Approximations of the capacity and of the optimal input distribution in the intermediate power range are derived in [23,24]. These results are extended to a channel with a more realistic receiver than the sampling one in [25]. The only known nonasymptotic upper bound on the capacity of this channel is $\log(1+\mathrm{SNR})$ (bits per channel use) [12], where SNR is the signal-to-noise ratio. This upper bound holds also for the general case of nonzero dispersion.
The novel contributions of this paper are as follows. First, we tightly bound the capacity of the RPC model and prove that its capacity pre-log is 3. Second, the capacity of the LPC is readily shown to be the same as that of an AWGN channel with the same input and noise power. Hence, the capacity pre-log of the LPC is 1. Third, we establish a novel upper bound on the capacity of the MNC (first presented in the conference version of this manuscript [26]). Our upper bound improves the previously known upper bound [12] on the capacity of this channel significantly and, together with the proposed lower bound, allows one to characterize the capacity of the MNC accurately.
Although all three models represent the same physical optical channel, their capacities behave very differently in the high-power regime. This result highlights the profound impact of the simplifying assumptions on the capacity at high powers, and indicates that care should be taken in translating the results obtained based on these models into guidelines for system design.
The rest of this paper is organized as follows. In Section 2, we introduce the three channel models. In Section 3, we present upper and lower bounds on the capacity of these channels and establish the capacity pre-log of the perturbative models. Numerical results are provided in Section 4. We conclude the paper in Section 5. The proofs of all theorems are given in the appendices.
Notation: Random quantities are denoted by boldface letters. We use $\mathcal{CN}(0,\sigma^2)$ to denote the complex zero-mean circularly symmetric Gaussian distribution with variance $\sigma^2$. We write $\Re(x)$, $|x|$, and $\angle x$ to denote the real part, the absolute value, and the phase of a complex number x. All logarithms are in base two. The mutual information between two random variables $\mathbf{x}$ and $\mathbf{y}$ is denoted by $I(\mathbf{x};\mathbf{y})$. The entropy and differential entropy are denoted by $H(\cdot)$ and $h(\cdot)$, respectively. Finally, we use $*$ for the convolution operator.

2. Channel Models

The fiber-optical channel is well-modeled by the NLS equation, which describes the propagation of a complex baseband electromagnetic field through a lossy single-mode fiber as
$$\frac{\partial a}{\partial z} + \frac{\alpha - g}{2}\,a + j\,\frac{\beta_2}{2}\,\frac{\partial^2 a}{\partial t^2} - j\gamma|a|^2 a = n. \tag{1}$$
Here, a = a ( z , t ) is the complex baseband signal at time t and location z. The parameter γ is the nonlinear coefficient, β 2 is the group-velocity dispersion parameter, α is the attenuation constant, g = g ( z ) is the gain profile of the amplifier, and n = n ( z , t ) is the Gaussian amplification noise, which is bandlimited because of the inline channel filters. The third term on the left-hand side of (1) is responsible for the channel memory and the fourth term for the channel nonlinearity.
To compensate for the fiber losses, two types of signal amplification can be deployed, namely, distributed and lumped amplification. The former method compensates for the fiber loss continuously along the fiber, whereas the latter method boosts the signal power by dividing the fiber into several spans and using an optical amplifier at the end of each span. With distributed amplification, which we focus on in this paper, the noise can be described by the autocorrelation function [2]
$$\mathbb{E}\left[n(z,t)\,n^*(z',t')\right] = \alpha\, n_{\mathrm{sp}}\, h\nu\,\delta_{W_N}(t-t')\,\delta(z-z'). \tag{2}$$
Here, $n_{\mathrm{sp}}$ is the spontaneous emission factor, h is Planck's constant, and ν is the optical carrier frequency. In addition, $\delta(\cdot)$ is the Dirac delta function and $\delta_{W_N}(x) = W_N\,\mathrm{sinc}(W_N x)$, where $W_N$ is the noise bandwidth. In this paper, we shall focus on the ideal distributed-amplification case $g(z) = \alpha$.
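As a numerical aside, the noise model (2) fixes the total accumulated noise power used later in (9), and the nonlinear parameter of (10) follows from γ and L. The sketch below evaluates both for illustrative parameter values; the attenuation, carrier wavelength, bandwidth, and span length are assumptions (the paper's Table 1 is not reproduced here), chosen so that the nonlinear parameter matches the η = 6350 W⁻¹ quoted in Section 4.

```python
import math

# Illustrative fiber parameters (assumed values, not necessarily those of Table 1)
alpha = 0.2 / (10 * math.log10(math.e))   # attenuation: 0.2 dB/km converted to 1/km
n_sp = 1.0                                 # spontaneous emission factor
h_planck = 6.62607015e-34                  # Planck's constant [J s]
nu = 299792458.0 / 1550e-9                 # optical carrier frequency at 1550 nm [Hz]
L = 5000.0                                 # fiber length [km]
W_N = 125e9                                # noise bandwidth [Hz]
gamma = 1.27                               # nonlinear coefficient [1/(W km)]

P_N = 2 * alpha * n_sp * h_planck * nu * L * W_N   # total noise power [W], Equation (9)
eta = gamma * L                                     # nonlinear parameter [1/W], Equation (10)

print(P_N, eta)   # eta = 6350 1/W matches the value quoted in Section 4
```

With these assumed values the total noise power comes out on the order of microwatts, i.e., around −21 dBm.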
We use a sampling receiver to go from continuous-time channels to discrete-time ones. A comprehensive description of the sampling receiver and of the induced discrete-time channel is provided in ([22], Section III). Here, we review some crucial elements of this description for completeness. Assume that a signal $a(0,t)$, which is band-limited to $W_0$ hertz, is transmitted through a zero-dispersion NLS channel ((1) with $\beta_2 = 0$) in the time interval $[0,T]$. Because of nonlinearity, the bandwidth of the received signal $a(L,t)$ may be larger than that of $a(0,t)$. To avoid signal distortion by the inline filters, we assume that $W_0$ is set such that $a(z,t)$ is band-limited to $W_N$ hertz for $0 \le z \le L$. Since $W_0 \le W_N$, assuming $W_N T \gg 1$, both the transmitted and the received signal can be represented by $2W_N T$ equispaced samples. The transmitter encodes data into a subset of these samples of cardinality $2W_0 T$, referred to as the principal samples. At the receiver, demodulation is performed by sampling $a(L,t)$ at instances corresponding to the principal samples. This results in $2W_0 T$ parallel independent discrete-time channels that have the same input–output relation.
The sampling receiver has a number of shortcomings [27] and using it should be considered a simplification. The resulting discrete-time model is used extensively in the literature (see, for example, [1,20,21,22,28,29]), since it makes analytical calculation possible. In this paper, we apply the sampling receiver not only to the memoryless NLS channel but also to the memoryless perturbative models.
Next, we review two perturbative channel models that are used in the literature to approximate the solution of the NLS Equation (1). Among the multiple variations of perturbative models available in the literature, we use the ones proposed in [30]. For both perturbative models, first continuous-time dispersive models are introduced, and then memoryless discrete-time channels are developed by assuming that $\beta_2 = 0$ and by using a sampling receiver. Finally, we introduce the MNC model, which is derived from (1) under the two above-mentioned assumptions.
Regular perturbative channel (RPC): Let $a_{\mathrm{li}}(z,t)$ be the solution of the linear noiseless NLS equation ((1) with $n(z,t) = 0$ and $\gamma = 0$). It can be computed as $a_{\mathrm{li}}(z,t) = a(0,t) * h(z,t)$, where $h(z,t) = \mathcal{F}^{-1}\{\exp(j\beta_2\omega^2 z/2)\}$ and $\mathcal{F}^{-1}(\cdot)$ denotes the inverse Fourier transform. In the regular perturbation method, the output of the noiseless NLS channel ((1) with $n = 0$) is approximated as
$$a(L,t) = a_{\mathrm{li}}(L,t) + \Delta a(L,t). \tag{3}$$
Here, L is the fiber length and Δ a ( z , t ) is the nonlinear perturbation term. If now the model is expanded to include amplification noise as an additive noise component, neglecting signal–noise interactions, then the accumulated amplification noise
$$w(L,t) = \int_0^L n(z,t)\,dz \tag{4}$$
can be added to the signal at the receiver to obtain the channel model ([14], Equation (5))
$$a(L,t) = a_{\mathrm{li}}(L,t) + \Delta a(L,t) + w(L,t). \tag{5}$$
The first-order approximation of Δ a ( L , t ) is ([30], Equation (13))
$$\Delta a(L,t) = j\gamma \int_0^L \left[\left|a_{\mathrm{li}}(\zeta,t)\right|^2 a_{\mathrm{li}}(\zeta,t)\right] * h(L-\zeta,t)\,d\zeta, \tag{6}$$
where the convolution is over the time variable. (Using higher-order nonlinear terms improves the accuracy of the regular perturbative channels. However, in this paper, we focus only on the channel model based on the first-order approximation, which is commonly used in the literature.) Neglecting dispersion (i.e., setting $\beta_2 = 0$), we have $h(z,t) = \delta(t)$ and $a_{\mathrm{li}}(\zeta,t) = a(0,t)$. Using this in (6), and then substituting (6) into (5), we obtain
$$a(L,t) = a(0,t) + j\gamma L\,|a(0,t)|^2\,a(0,t) + w(L,t). \tag{7}$$
Finally, by deploying a sampling receiver, we obtain from (7) the discrete-time channel model
$$\mathbf{y} = \mathbf{x} + j\eta|\mathbf{x}|^2\mathbf{x} + \mathbf{n}. \tag{8}$$
Here, $\mathbf{n} \sim \mathcal{CN}(0, P_N)$,
$$P_N = 2\alpha\, n_{\mathrm{sp}}\, h\nu\, L\, W_N \tag{9}$$
is the total noise power, and
$$\eta = \gamma L. \tag{10}$$
We refer to (8) as the RPC.
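For intuition, the memoryless model (8) is straightforward to simulate. The following minimal sketch draws circularly symmetric Gaussian input symbols and passes them through (8); η matches the value used in Section 4, while the signal and noise powers are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)
eta = 6350.0       # eta = gamma * L [1/W], as in Section 4
P_N = 7.4e-6       # total noise power [W]; illustrative assumption
P = 1e-4           # average signal power [W]; illustrative assumption
n_sym = 100_000

# Circularly symmetric complex Gaussian input with E|x|^2 = P, and noise with power P_N
x = np.sqrt(P / 2) * (rng.standard_normal(n_sym) + 1j * rng.standard_normal(n_sym))
n = np.sqrt(P_N / 2) * (rng.standard_normal(n_sym) + 1j * rng.standard_normal(n_sym))

# RPC input-output relation, Equation (8)
y = x + 1j * eta * np.abs(x) ** 2 * x + n

print(np.mean(np.abs(x) ** 2))  # ~ P, i.e., the power constraint is met
```

Note how the additive distortion term grows cubically in the signal amplitude, which is the mechanism behind the artificial power growth discussed in Section 5.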
Logarithmic perturbative channel (LPC): Another method for approximating the solution of the NLS Equation (1) is to use logarithmic perturbation. With this method, the output signal is approximated as ([14], Equation (7))
$$a(L,t) = a(0,t)\exp\bigl(j\,\Delta\theta(L,t)\bigr) + w(L,t), \tag{11}$$
where $w(L,t)$ is the same noise term as in (5), defined in (4). The first-order approximation of $\Delta\theta(L,t)$ is ([30], Equation (19))
$$\Delta\theta(L,t) = \frac{\gamma}{a_{\mathrm{li}}(L,t)} \int_0^L \left[\left|a_{\mathrm{li}}(\zeta,t)\right|^2 a_{\mathrm{li}}(\zeta,t)\right] * h(L-\zeta,t)\,d\zeta. \tag{12}$$
Under the zero-dispersion assumption ($\beta_2 = 0$), we have $h(z,t) = \delta(t)$ and $a_{\mathrm{li}}(\zeta,t) = a(0,t)$. Using this in (12), and then substituting (12) into (11), we obtain
$$a(L,t) = a(0,t)\,e^{j\gamma L|a(0,t)|^2} + w(L,t). \tag{13}$$
Finally, by sampling the output signal, the discrete-time channel
$$\mathbf{y} = \mathbf{x}\,e^{j\eta|\mathbf{x}|^2} + \mathbf{n} \tag{14}$$
is obtained, where $\mathbf{n} \sim \mathcal{CN}(0, P_N)$, $P_N$ is given in (9), and η is defined in (10). We note that the channels (8) and (14) are equal up to a first-order linearization, which is accurate in the low-power regime. Furthermore, one may also obtain the model in (13) by solving (1) for $\beta_2 = 0$, $n = 0$, and $g = \alpha$ and by adding the noise at the receiver.
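The remark that (8) and (14) agree up to a first-order linearization can be illustrated numerically: for a noiseless symbol at the average power level, the relative difference between the two models is tiny at low power and large at moderate power. The power values below are illustrative.

```python
import numpy as np

eta = 6350.0                       # nonlinear parameter [1/W]
rels = []
for P in (1e-6, 1e-3):             # low vs. moderate power [W]; illustrative choices
    x = np.sqrt(P)                 # one noiseless symbol at the average power level
    rpc = x * (1 + 1j * eta * abs(x) ** 2)     # regular perturbation, as in (8)
    lpc = x * np.exp(1j * eta * abs(x) ** 2)   # logarithmic perturbation, as in (14)
    rels.append(abs(rpc - lpc) / abs(lpc))
print(rels)  # first entry tiny, second entry large
```

The discrepancy scales as $(\eta|x|^2)^2/2$, the first neglected term of the exponential series, which is why the two models depart as the power grows.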
Memoryless NLS Channel (MNC): Here, we shall study the underlying NLS channel in (1) under the assumptions that $\beta_2 = 0$ and that a sampling receiver is used to obtain a discrete-time channel. Let $\mathbf{r}_0$ and $\boldsymbol{\theta}_0$ be the amplitude and the phase of a transmitted symbol $\mathbf{x}$, and let $\mathbf{r}$ and $\boldsymbol{\theta}$ be those of the received sample $\mathbf{y}$. The discrete-time channel input–output relation can be described by the conditional probability density function (pdf) ([20], Ch. 5) (see also ([28], Sec. II))
$$f_{\mathbf{r},\boldsymbol{\theta}|\mathbf{r}_0,\boldsymbol{\theta}_0}(r,\theta\,|\,r_0,\theta_0) = \frac{f_{\mathbf{r}|\mathbf{r}_0}(r\,|\,r_0)}{2\pi} + \frac{1}{\pi}\sum_{m=1}^{\infty}\Re\left\{C_m(r)\,e^{jm(\theta-\theta_0)}\right\}. \tag{15}$$
The conditional pdf f r | r 0 ( r | r 0 ) and the Fourier coefficients C m ( r ) in (15) are given by
$$f_{\mathbf{r}|\mathbf{r}_0}(r\,|\,r_0) = \frac{2r}{P_N}\exp\left(-\frac{r^2+r_0^2}{P_N}\right)I_0\!\left(\frac{2r r_0}{P_N}\right), \tag{16}$$
$$C_m(r) = \frac{2r}{\nu_m}\exp\left(-\frac{(r^2+r_0^2)\cos x_m}{\nu_m}\right)I_m\!\left(\frac{2r r_0}{\nu_m}\right). \tag{17}$$
Here, $I_m(\cdot)$ denotes the mth-order modified Bessel function of the first kind, and
$$x_m = \left(\frac{2jm\gamma\, r_0^2\, P_N\, L}{2r_0^2 + P_N}\right)^{1/2}, \tag{18}$$
$$\nu_m = \frac{x_m P_N}{\sin x_m}. \tag{19}$$
The complex square root in (18) is a two-valued function, but both choices give the same values of $\nu_m$ and $C_m(r)$.
In the next section, we study the capacity of the channel models given in (8), (14), and (15). Since all of these models are memoryless, their capacities under a power constraint P are given by
$$C = \sup I(\mathbf{x};\mathbf{y}), \tag{20}$$
where the supremum is over all complex probability distributions of x that satisfy the average-power constraint
$$\mathbb{E}\bigl[|\mathbf{x}|^2\bigr] \le P. \tag{21}$$

3. Analytical Results

In this section, we study the capacity of the RPC, the LPC, and the MNC models. All these models are based on the same fiber-optical channel and share the same set of parameters. Bounds on the capacity of the RPC in (8) are provided in Theorems 1–3. Specifically, in Theorem 1, we establish a closed-form lower bound, which, together with the upper bound provided in Theorem 2, tightly bounds capacity (see Section 4). A different upper bound is provided in Theorem 3. Numerical evidence suggests that this alternative bound is less tight than the one provided in Theorem 2 (see Section 4). However, this alternative bound has a simple analytical form, which makes it easier to characterize it asymptotically. By using the bounds derived in Theorems 1 and 3, we prove that the capacity pre-log of the RPC is 3. In Theorem 4, we derive the capacity of the LPC in (14) and show that it coincides with the capacity of an equivalent AWGN channel. Hence, the capacity pre-log is 1. Theorem 4 is rather trivial and we present it in this section only for completeness. Finally, in Theorem 5, we provide an upper bound on the capacity of the MNC in (15), which improves significantly on the previously known upper bound [12] and, together with a proposed capacity lower bound, yields a tight characterization of capacity (see Section 4).

3.1. Capacity Analysis of the RPC

Theorem 1.
The capacity $C_{\mathrm{RPC}}$ of the RPC in (8) is lower-bounded by
$$C_{\mathrm{RPC}} \ge L_{\mathrm{RPC}}(P) = \max_{\lambda}\,\log\!\left(\frac{\lambda^2+6\eta^2}{\lambda^3 P_N}\,e^{\frac{12\eta^2}{\lambda^2+6\eta^2}} + 1\right), \tag{22}$$
where λ is positive and satisfies the constraint
$$\frac{18\eta^2+\lambda^2}{\lambda\,(6\eta^2+\lambda^2)} \le P. \tag{23}$$
Furthermore, the maximum in (22) is achieved by the unique real solution of the equation
$$P\lambda^3 - \lambda^2 + 6P\eta^2\lambda - 18\eta^2 = 0. \tag{24}$$
Proof. 
See Appendix A. □
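The bound of Theorem 1 is simple to evaluate numerically: solve the cubic (24) for its unique positive root and substitute it into (22). A minimal sketch follows; the values of η and $P_N$ are illustrative, not the paper's exact Table 1 setup.

```python
import numpy as np

def L_RPC(P, eta=6350.0, P_N=7.4e-6):
    """Lower bound (22) evaluated at the unique positive root of the cubic (24).

    eta [1/W] and P_N [W] are illustrative default values (assumptions).
    """
    # Coefficients of P*lam^3 - lam^2 + 6*P*eta^2*lam - 18*eta^2 = 0
    roots = np.roots([P, -1.0, 6 * P * eta ** 2, -18 * eta ** 2])
    # Theorem 1 guarantees a unique real (positive) solution; pick the root
    # closest to the real axis
    lam = min(roots, key=lambda r: abs(r.imag)).real
    return float(np.log2((lam ** 2 + 6 * eta ** 2) / (lam ** 3 * P_N)
                         * np.exp(12 * eta ** 2 / (lam ** 2 + 6 * eta ** 2)) + 1))

print(L_RPC(1e-4), L_RPC(1e-3))  # the bound grows with power
```

The optimal λ decreases roughly as $3/P$ at high power, which is exploited in the pre-log analysis below Theorem 3.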
Theorem 2.
The capacity of the RPC in (8) is upper-bounded by
$$C_{\mathrm{RPC}} \le U_{\mathrm{RPC}}(P) = \min_{\mu>0,\,\lambda>0}\left\{\log\frac{\mu^2+6\eta^2}{\mu^3 e P_N} + \lambda + \max_{s>0}\left\{\mu\,\mathbb{E}\!\left[q(|\mathbf{y}|^2)\,\middle|\,|\mathbf{x}|^2 = s\right]\log e - \lambda\,\frac{s+P_N}{P+P_N}\right\}\right\}. \tag{25}$$
Here, $q(x) = g^{-1}(x)$, where $g(x) = x + \eta^2 x^3$.
Proof. 
See Appendix B. □
Note that, given $|\mathbf{x}|^2 = s$, the random variable $2|\mathbf{y}|^2/P_N$ is conditionally distributed as a noncentral chi-squared random variable with two degrees of freedom and noncentrality parameter $2(s+\eta^2 s^3)/P_N$. This enables numerical computation of $U_{\mathrm{RPC}}(P)$.
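Since g is strictly increasing on $[0,\infty)$, the inverse $q = g^{-1}$ appearing in Theorem 2 can be evaluated by Newton's method. The sketch below is an illustration (the starting point and iteration count are implementation choices, not from the paper); it exploits that g is convex on $[0,\infty)$, so Newton iterations started above the root converge monotonically.

```python
def q(t, eta=6350.0):
    """Numerically invert g(x) = x + eta^2 * x^3 of Theorem 2 for t >= 0."""
    # Both t and (t/eta^2)^(1/3) upper-bound the root; start at the tighter one
    x = min(t, (t / eta ** 2) ** (1.0 / 3.0))
    for _ in range(60):                        # Newton's method; g'(x) >= 1 > 0
        x -= (x + eta ** 2 * x ** 3 - t) / (1.0 + 3.0 * eta ** 2 * x ** 2)
    return x

s = 2e-4                         # trial value of |x|^2 [W]; illustrative
t = s + 6350.0 ** 2 * s ** 3     # g(s)
print(abs(q(t) - s))             # round trip: numerically zero
```

With q available pointwise, the conditional expectation in (25) can be estimated by Monte Carlo or by quadrature against the noncentral chi-squared density mentioned above.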
Theorem 3.
The capacity of the RPC in (8) is upper-bounded by
$$C_{\mathrm{RPC}} \le \tilde{U}_{\mathrm{RPC}}(P) = \min_{\mu>0}\left\{\log\frac{\mu^2+6\eta^2}{\mu^3 e P_N} + \mu\,(P+B)\log e\right\}, \tag{26}$$
where
$$B = P_N + \frac{\sqrt{\pi P_N}}{12^{3/8}\,(\sqrt{3}-1)\,\sqrt{\eta}}. \tag{27}$$
Furthermore, the minimum in (26) is achieved by the unique real solution of the equation
$$(P+B)\,\mu^3 - \mu^2 + 6\eta^2(P+B)\,\mu - 18\eta^2 = 0. \tag{28}$$
Proof. 
See Appendix C. □
Pre-log analysis: By substituting $\mu = 1/P$ into (26), we see that
$$\lim_{P\to\infty}\bigl[C_{\mathrm{RPC}} - 3\log(P)\bigr] \le \log\frac{6\eta^2}{P_N}. \tag{29}$$
Furthermore, since
$$\frac{18\eta^2+\lambda^2}{\lambda\,(6\eta^2+\lambda^2)} \le \frac{18\eta^2+3\lambda^2}{\lambda\,(6\eta^2+\lambda^2)} \tag{30}$$
$$= \frac{3}{\lambda}, \tag{31}$$
we can obtain a valid lower bound on $C_{\mathrm{RPC}}$ by substituting $\lambda = 3/P$ into (22). Doing so, we obtain
$$\lim_{P\to\infty}\bigl[C_{\mathrm{RPC}} - 3\log(P)\bigr] \ge \log\frac{2\eta^2 e^2}{9 P_N}. \tag{32}$$
It follows from (29) and (32) that the capacity pre-log of the RPC is 3.
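This asymptotic characterization is easy to check numerically: evaluating the objective of (22) at λ = 3/P for very large P reproduces the affine asymptote $3\log P + \log\bigl(2\eta^2 e^2/(9P_N)\bigr)$ of (32). The values of η and $P_N$ below are illustrative.

```python
import numpy as np

eta, P_N = 6350.0, 7.4e-6       # illustrative values [1/W], [W]

def lower_bound(P, lam):
    """The objective of (22) at a given (not necessarily optimal) lambda."""
    return np.log2((lam ** 2 + 6 * eta ** 2) / (lam ** 3 * P_N)
                   * np.exp(12 * eta ** 2 / (lam ** 2 + 6 * eta ** 2)) + 1)

P = 1e6                          # very large power, so the asymptote is tight
asymptote = 3 * np.log2(P) + np.log2(2 * eta ** 2 * np.e ** 2 / (9 * P_N))
gap = lower_bound(P, 3.0 / P) - asymptote
print(gap)                       # vanishes as P grows
```

The pre-log of 3 shows up as the slope of the bound versus $\log P$; the constants in (29) and (32) differ only by $\log(27/e^2)$, so the asymptotic characterization is tight to within about two bits.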

3.2. Capacity Analysis of the LPC

Theorem 4.
The capacity of the LPC in (14) is
$$C_{\mathrm{LPC}} = \log\left(1+\frac{P}{P_N}\right). \tag{33}$$
Proof. 
We use the maximum differential entropy lemma ([31], Sec. 2.2) to upper-bound $C_{\mathrm{LPC}}$ by $\log(1+P/P_N)$. Then, we note that this upper bound can be achieved by choosing $\mathbf{x} \sim \mathcal{CN}(0,P)$. Alternatively, the theorem can be readily proved by applying the preprocessing step $\tilde{\mathbf{x}} = \mathbf{x}\exp(j\eta|\mathbf{x}|^2)$ to the input, which transforms the channel into a linear AWGN channel with the same power constraint, whose capacity is $\log(1+P/P_N)$. □
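The invertibility underlying this preprocessing argument rests on $|\tilde{\mathbf{x}}| = |\mathbf{x}|$, so the deterministic phase rotation can be undone exactly. A quick numerical round trip (power level and sample size are illustrative):

```python
import numpy as np

rng = np.random.default_rng(2)
eta = 6350.0                               # nonlinear parameter [1/W]
# Complex Gaussian symbols at an illustrative average power of 1e-4 W
x = np.sqrt(1e-4 / 2) * (rng.standard_normal(1000) + 1j * rng.standard_normal(1000))

x_tilde = x * np.exp(1j * eta * np.abs(x) ** 2)               # preprocessing step
x_back = x_tilde * np.exp(-1j * eta * np.abs(x_tilde) ** 2)   # uses |x_tilde| = |x|

err = np.max(np.abs(x_back - x))
print(err)  # numerically zero
```

Because the map is a bijection that preserves the power constraint, the mutual information, and hence the capacity, is unchanged, which is the content of Theorem 4.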

3.3. Capacity Analysis of the MNC

A novel upper bound on the capacity of the MNC in (15) is presented in the following theorem [26].
Theorem 5.
The capacity of the MNC in (15) is upper-bounded by
$$C_{\mathrm{MNC}} \le U_{\mathrm{MNC}}(P) \tag{34}$$
$$= \min_{\lambda>0,\,\alpha>0}\left\{\alpha\log\frac{P+P_N}{\alpha} + \log\bigl(\pi\,\Gamma(\alpha)\bigr) + \lambda + \max_{r_0>0}\bigl\{g_{\lambda,\alpha}(r_0,P)\bigr\}\right\}, \tag{35}$$
where $\Gamma(\cdot)$ denotes the gamma function and
$$g_{\lambda,\alpha}(r_0,P) = (\alpha\log e - \lambda)\,\frac{r_0^2+P_N}{P+P_N} + (1-2\alpha)\,\mathbb{E}\bigl[\log(\mathbf{r})\,\big|\,\mathbf{r}_0 = r_0\bigr] - h(\mathbf{r}\,|\,\mathbf{r}_0 = r_0) - h(\boldsymbol{\theta}\,|\,\mathbf{r},\mathbf{r}_0 = r_0,\boldsymbol{\theta}_0 = 0). \tag{36}$$
The upper bound $U_{\mathrm{MNC}}(P)$ can be calculated numerically using the expression for the conditional pdf $f_{\mathbf{r},\boldsymbol{\theta}|\mathbf{r}_0,\boldsymbol{\theta}_0}(r,\theta\,|\,r_0,\theta_0)$ given in (15).
Proof. 
See Appendix D. □

4. Numerical Examples

In Figure 1, we evaluate the bounds derived in Section 3 for a fiber-optical channel whose parameters are listed in Table 1. (The channel parameters are the same as in ([22], Table I).) Using (10), we obtain $\eta = 6350~\mathrm{W}^{-1}$.
As can be seen from Figure 1, the capacity of the RPC is tightly bounded between the upper bound $U_{\mathrm{RPC}}(P)$ in (25) and the lower bound $L_{\mathrm{RPC}}(P)$ in (22). Furthermore, one can observe that although the alternative upper bound $\tilde{U}_{\mathrm{RPC}}(P)$ in (26) is loose at low powers, it becomes tight in the moderate- and high-power regimes.
We also plot the upper bound $U_{\mathrm{MNC}}(P)$ on the capacity of the MNC. It can be seen that $U_{\mathrm{MNC}}(P)$ improves substantially on the upper bound given in [12], i.e., the capacity of the corresponding AWGN channel (33) (which coincides with $C_{\mathrm{LPC}}$). As a lower bound on the MNC capacity, we propose the mutual information in (20) with an input $\mathbf{x}$ with uniform phase and an amplitude $\mathbf{r}_0$ following a chi distribution with k degrees of freedom. Specifically, we set
$$f_{\mathbf{r}_0}(r_0) = \frac{2\, r_0^{k-1}}{\Gamma(k/2)}\left(\frac{k}{2P}\right)^{k/2}\exp\left(-\frac{k\, r_0^2}{2P}\right), \tag{37}$$
where $\Gamma(\cdot)$ denotes the gamma function. The parameter k is optimized for each power. (Due to the computational complexity, we only considered k values from 0.5 to 2.5 in steps of 0.5.) We calculated the bound numerically and include it in Figure 1 (referred to as the max–chi lower bound). We also include two lower bounds corresponding to $k=1$ (half-Gaussian amplitude distribution, first presented in [22]) and $k=2$ (Rayleigh-distributed amplitude, or equivalently, a complex Gaussian input $\mathbf{x}$, first presented in [26]). The max–chi lower bound coincides with these two lower bounds at asymptotically low and high powers, and improves slightly on them at intermediate powers (around 0 dBm), similarly to the numerical bound in [32]. Specifically, at asymptotically low powers, $k=2$ (Gaussian lower bound) is optimal. This is expected, since the channel is essentially linear at low powers. At high powers, on the other hand, the optimal k value approaches 1 (half-Gaussian lower bound), which is consistent with [22], where it has been shown that the half-Gaussian amplitude distribution is capacity-achieving for the MNC in the high-power regime. Based on our numerical evaluations, we observed that $k=0.5$ maximizes the max–chi lower bound (among the set of considered values of k) in the power range $18 \le P \le 32$ dBm. Finally, in Figure 1, we plot the lower bound based on the input distribution ([23], Equation (45)). As can be seen, based on our numerical evaluation, this lower bound almost coincides with the max–chi lower bound at low powers ($P < 5$ dBm) and improves on it in the intermediate power range ($5 \le P < 27$ dBm); however, it is suboptimal at high powers ($P \ge 27$ dBm).
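Sampling from the chi-family input (37) is straightforward, since $\mathbf{r}_0^2\,(k/P)$ is chi-square distributed with k degrees of freedom (which is well defined also for fractional k). The sketch below verifies that the average-power constraint (21) is met; the parameter values are illustrative.

```python
import numpy as np

rng = np.random.default_rng(3)
P, k = 1e-3, 1.5                 # average power [W] and chi parameter; illustrative
n = 200_000

# If r0^2 * (k/P) is chi-square with k degrees of freedom, then E[r0^2] = P
r0 = np.sqrt((P / k) * rng.chisquare(k, size=n))
theta0 = rng.uniform(0.0, 2.0 * np.pi, size=n)    # uniform phase
x = r0 * np.exp(1j * theta0)

print(np.mean(np.abs(x) ** 2))   # ~ P
```

Setting k = 1 recovers the half-Gaussian amplitude input of [22] and k = 2 the complex Gaussian input, so a single parameter sweeps between the two special cases discussed above.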
Figure 1 suggests that $C_{\mathrm{MNC}}$ experiences changes in slope at about 0 and 30 dBm (corresponding to the inflection points at about 10 dBm and 20 dBm). To explain this behavior, we evaluate the phase and the amplitude components of the half-Gaussian lower bound. Specifically, we split the mutual information into two parts as
$$I(\mathbf{x};\mathbf{y}) = I(\mathbf{r}_0,\boldsymbol{\theta}_0;\mathbf{r},\boldsymbol{\theta}) \tag{38}$$
$$= I(\mathbf{r}_0,\boldsymbol{\theta}_0;\mathbf{r}) + I(\mathbf{r}_0,\boldsymbol{\theta}_0;\boldsymbol{\theta}\,|\,\mathbf{r}). \tag{39}$$
The first term in (39) is the amplitude component and the second term is the phase component of the mutual information. These two components are evaluated for the half-Gaussian amplitude distribution and plotted in Figure 1. It can be seen from Figure 1 that the amplitude component is monotonically increasing with power, while the phase component goes to zero with power after reaching a maximum. Indeed, at high powers, the phase of the received signal becomes uniformly distributed over $[0,2\pi]$ and independent of the transmitted signal ([33], Lem. 5). By adding these two components, one obtains a capacity lower bound that changes concavity at two points. The reduction of the capacity slope at intermediate powers is consistent with [23,24], where it is shown that the capacity grows according to $\log\log P$ in this regime.
As a final observation, we note that $C_{\mathrm{RPC}}$ diverges from $C_{\mathrm{MNC}}$ at about 15 dBm, whereas $C_{\mathrm{LPC}}$ diverges from $C_{\mathrm{MNC}}$ at about 5 dBm. Since the MNC describes the nondispersive NLS channel more accurately than the other two channels, this result suggests that the perturbative models are grossly inaccurate in the high-power regime.

5. Discussion

The capacities of three optical models, namely, the RPC, the LPC, and the MNC, were investigated. All three models are developed under two assumptions: channel memory is ignored, and a sampling receiver is applied. Furthermore, two of these models, i.e., the RPC and the LPC, are based on perturbation theory and ignore the signal–noise interaction, which makes them accurate only in the low-power regime. By tightly bounding the capacity of the RPC, by characterizing the capacity of the LPC, and by developing a tight upper bound on the capacity of the MNC, we showed that the capacities of these models, for the same underlying physical channel, behave very differently at high powers. Since the MNC is a more accurate channel model than the other two, one may conclude that the perturbative models become grossly inaccurate at high powers in terms of capacity calculation.
Note that the LPC model can be obtained from the MNC by neglecting the signal–noise interaction. Comparing the capacity of these two channels allows us to conclude that the impact of neglecting the signal–noise interaction on capacity is significant. Observe also that the capacity of the RPC model grows quickly with power because of its large capacity pre-log. Such a behavior is caused by the additive model used for the nonlinear distortion, which causes an artificial power increase at high SNR. A more accurate model than the RPC may be obtained by performing a normalization that conserves the signal power. Future work should consider more realistic channel models with nonzero dispersion and with more practical receivers.

Author Contributions

K.K. derived the theorems and proofs, carried out the numerical calculations, and wrote the paper. E.A. formulated the problem and contributed to the analysis. G.D. contributed to the information-theoretic analysis. All authors reviewed and revised the paper.

Funding

This work was supported by the Swedish Research Council (VR) under Grant 2013-5271.

Conflicts of Interest

The authors declare no conflict of interest.

Appendix A. Proof of Theorem 1

The capacity of the regular perturbative channel can be written as
$$C_{\mathrm{RPC}} = \sup I(\mathbf{x};\mathbf{y}), \tag{A1}$$
where the supremum is over all the probability distributions on x that satisfy the power constraint (21). Let
$$\mathbf{w} = \mathbf{x} + j\eta|\mathbf{x}|^2\mathbf{x}. \tag{A2}$$
We have that
$$I(\mathbf{x};\mathbf{y}) = h(\mathbf{y}) - h(\mathbf{y}\,|\,\mathbf{x}) \tag{A3}$$
$$= h(\mathbf{w}+\mathbf{n}) - h(\mathbf{w}+\mathbf{n}\,|\,\mathbf{x}) \tag{A4}$$
$$= h(\mathbf{w}+\mathbf{n}) - h(\mathbf{n}). \tag{A5}$$
Using the entropy power inequality ([31], Sec. 2.2) and the Gaussian entropy formula ([34], Th. 8.4.1), we conclude that
$$h(\mathbf{w}+\mathbf{n}) \ge \log\left(2^{h(\mathbf{w})} + 2^{h(\mathbf{n})}\right) \tag{A6}$$
$$= \log\left(2^{h(\mathbf{w})} + \pi e P_N\right). \tag{A7}$$
Substituting (A7) into (A5), and using again the Gaussian entropy formula ([34], Th. 8.4.1), we obtain
$$I(\mathbf{x};\mathbf{y}) \ge \log\left(2^{h(\mathbf{w})} + \pi e P_N\right) - \log(\pi e P_N). \tag{A8}$$
We take $\mathbf{x}$ circularly symmetric. It follows from (A2) that $\mathbf{w}$ is also circularly symmetric. Using ([35], Equation (320)) to compute $h(\mathbf{w})$, we obtain
$$h(\mathbf{w}) = h(|\mathbf{w}|^2) + \log\pi. \tag{A9}$$
Substituting (A9) into (A8), we get
$$I(\mathbf{x};\mathbf{y}) \ge \log\left(\frac{2^{h(|\mathbf{w}|^2)}}{e P_N} + 1\right). \tag{A10}$$
Next, to evaluate the right-hand side (RHS) of (A10), we choose the following distribution for the squared amplitude $\mathbf{s} = |\mathbf{x}|^2$ of $\mathbf{x}$:
$$f_{\mathbf{s}}(s) = \zeta\left(3\eta^2 s^2 + 1\right)e^{-\lambda s}, \quad s \ge 0. \tag{A11}$$
The parameters λ > 0 and ζ > 0 are chosen so that (A11) is a pdf and so that the power constraint (21) is satisfied. We prove in Appendix A.1 that by choosing these two parameters so that
$$\zeta = \frac{\lambda^3}{\lambda^2 + 6\eta^2} \tag{A12}$$
and so that (24) holds, both constraints are met. In Appendix A.2, we then prove that
$$h(|\mathbf{w}|^2) = -\log\zeta + \zeta\left(\frac{1}{\lambda} + \frac{18\eta^2}{\lambda^3}\right)\log e. \tag{A13}$$
Substituting (A13) and (A12) into (A10), we obtain (22). Although not necessary for the proof, in Appendix A.3, we justify the choice of the pdf in (A11) by showing that it maximizes $h(\mathbf{w})$. (Based on (A5), the distribution that maximizes $h(\mathbf{w})$ achieves capacity at high powers, where $h(\mathbf{w}+\mathbf{n}) \approx h(\mathbf{w})$.)

Appendix A.1. Choosing ζ and λ

We choose the coefficients ζ and λ so that (A11) is a valid pdf and $\mathbb{E}[\mathbf{s}] \le P$. Note that
(A14) 0 f s ( s ) d s = 0 ζ 3 η 2 s 2 + 1 e λ s d s (A15) = ζ λ 2 + 6 η 2 λ 3 .
Therefore, choosing ζ according to (A12) guarantees that f s s integrates to 1. We next compute E s :
(A16) E s = 0 s f s ( s ) d s (A17) = 0 s ζ 3 η 2 s 2 + 1 e λ s d s (A18) = ζ 18 η 2 λ 4 + 1 λ 2 .
Substituting (A12) into (A18), we obtain
E s = 18 η 2 + λ 2 λ λ 2 + 6 η 2 .
We see now that imposing E s P is equivalent to (23). Observe that the RHS of (A19) and the objective function on the RHS of (22) are decreasing functions of λ . Therefore, setting the RHS of (A19) equal to P, which yields (24), maximizes the objective function in (22).
Finally, we prove that (24) has a single positive root. We have
$$
\begin{aligned}
f(\lambda) &= P\lambda^3 - \lambda^2 + 6P\eta^2\lambda - 18\eta^2 && \text{(A20)} \\
&= \left(\lambda^2 + 6\eta^2\right)(P\lambda - 1) - 12\eta^2. && \text{(A21)}
\end{aligned}
$$
Note that $f(\lambda) \to \infty$ as $\lambda \to \infty$ and that the RHS of (A21) is negative when $\lambda < 1/P$. Furthermore, $f(\lambda)$ is monotonically increasing in the interval $[1/P, \infty)$. Indeed, when $\lambda \ge 1/P$,
$$
\begin{aligned}
\frac{d}{d\lambda} f(\lambda) &= 3P\lambda^2 - 2\lambda + 6P\eta^2 && \text{(A22)} \\
&\ge 3\lambda - 2\lambda + 6P\eta^2 && \text{(A23)} \\
&> 0. && \text{(A24)}
\end{aligned}
$$
This yields the desired result.
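These properties are easy to confirm numerically. The following Python sketch (not part of the paper; the values of $P$ and $\eta$ are arbitrary illustrative choices) locates the positive root of (A20) in $[1/P, \infty)$ and verifies that, with $\zeta$ from (A12), the pdf (A11) integrates to one and has mean $P$:

```python
import numpy as np
from scipy.integrate import quad
from scipy.optimize import brentq

# Illustrative values only (assumptions, not taken from the paper).
P, eta = 2.0, 0.5

# f(lambda) in (A20): P*l^3 - l^2 + 6*P*eta^2*l - 18*eta^2.
def f(lam):
    return P * lam**3 - lam**2 + 6 * P * eta**2 * lam - 18 * eta**2

# f is negative at lam = 1/P and increasing on [1/P, inf),
# so the unique positive root lies to the right of 1/P.
lam = brentq(f, 1.0 / P, 1e3)

# With zeta chosen as in (A12), the pdf in (A11) integrates to one,
# and the root of (A20) makes its mean equal to P, cf. (A19).
zeta = lam**3 / (lam**2 + 6 * eta**2)
def f_s(s):
    return zeta * (3 * eta**2 * s**2 + 1) * np.exp(-lam * s)

total, _ = quad(f_s, 0, np.inf)
mean, _ = quad(lambda s: s * f_s(s), 0, np.inf)
```
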

Appendix A.2. Proof of (A13)

To compute the differential entropy of $t = |w|^2$, we first determine the pdf of $t$. By definition,
$$
t = s + \eta^2 s^3. \tag{A25}
$$
Let now $g(x) = x + \eta^2 x^3$. Since
$$
\frac{d}{dx}\, g(x) = 1 + 3\eta^2 x^2, \tag{A26}
$$
we conclude that $g(x)$ is monotonically increasing for $x \ge 0$. Hence, $g(x)$ is one-to-one when $x \ge 0$ and its inverse
$$
q(x) = g^{-1}(x) \tag{A27}
$$
is well defined. Thus, the pdf of $t$ is given by ([36], Ch. 5)
$$
\begin{aligned}
f_t(t) &= \frac{f_s(q(t))}{g'(q(t))} && \text{(A28)} \\
&= \frac{\zeta \left(3\eta^2 q^2(t) + 1\right) e^{-\lambda q(t)}}{3\eta^2 q^2(t) + 1} && \text{(A29)} \\
&= \zeta e^{-\lambda q(t)}, \quad t \ge 0. && \text{(A30)}
\end{aligned}
$$
Here, (A29) holds because of (A11). Using (A30), we can now compute $h(t)$ as
$$
\begin{aligned}
h(t) &= -\int_0^\infty f_t(t) \log f_t(t)\,dt && \text{(A31)} \\
&= -\log\zeta + \lambda\zeta \log e \int_0^\infty q(t)\, e^{-\lambda q(t)}\,dt && \text{(A32)} \\
&= -\log\zeta + \lambda\zeta \log e \int_0^\infty r\, e^{-\lambda r} \left(1 + 3\eta^2 r^2\right) dr && \text{(A33)} \\
&= -\log\zeta + \lambda\zeta \left(\frac{1}{\lambda^2} + \frac{18\eta^2}{\lambda^4}\right) \log e, && \text{(A34)}
\end{aligned}
$$
where in (A33) we used the change of variables $r = q(t)$. This proves (A13).
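As a numerical sanity check (not part of the paper), the chain (A31)–(A34) can be verified by integrating $-f_t \log f_t$ in the variable $r = q(t)$ and comparing with the closed form (A34). The sketch below works in nats ($\log = \ln$, so $\log e = 1$); the values of $\eta$ and $\lambda$ are arbitrary illustrative choices:

```python
import numpy as np
from scipy.integrate import quad

# Illustrative values only.
eta, lam = 0.5, 1.0
zeta = lam**3 / (lam**2 + 6 * eta**2)  # (A12)

# h(t) = -int f_t log f_t dt, evaluated in the variable r = q(t),
# where t = g(r) = r + eta^2 r^3 and dt = (1 + 3 eta^2 r^2) dr.
def integrand(r):
    log_ft = np.log(zeta) - lam * r          # log of (A30), in nats
    return -zeta * np.exp(-lam * r) * log_ft * (1 + 3 * eta**2 * r**2)

h_numeric, _ = quad(integrand, 0, np.inf)

# Closed form (A34) in nats.
h_closed = -np.log(zeta) + lam * zeta * (1 / lam**2 + 18 * eta**2 / lam**4)
```
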

Appendix A.3. $f_s(s)$ Maximizes $h(w)$

We shall prove that the pdf $f_s(s) = \zeta\left(3\eta^2 s^2 + 1\right) e^{-\lambda s}$, $s \ge 0$, maximizes $h(w)$. It follows from (A9) that, to maximize $h(w)$, we need to maximize $h(t)$. We assume that the power constraint is fulfilled with equality, i.e., that
$$
E[s] = \int_0^\infty s f_s(s)\,ds = P. \tag{A35}
$$
Using the change of variables $s = q(t)$, where $q(t)$ was defined in (A27), we obtain
$$
\int_0^\infty q(t)\, f_s(q(t))\, q'(t)\,dt = P. \tag{A36}
$$
Substituting (A28) into (A36) and using that $q'(t) = 1/g'(q(t))$, we obtain
$$
\int_0^\infty q(t)\, f_t(t)\,dt = P. \tag{A37}
$$
It follows now from ([34], Th. 12.1.1) that the pdf that maximizes $h(t)$ is of the form $f_t(t) = e^{\lambda_0 + \lambda_1 q(t)}$, $t \ge 0$, where $\lambda_0$ and $\lambda_1$ need to be chosen so that (A37) is satisfied and $f_t(t)$ integrates to one. Using (A28), we get
$$
\begin{aligned}
f_s(s) &= f_t(g(s))\, g'(s) && \text{(A38)} \\
&= \left(1 + 3\eta^2 s^2\right) e^{\lambda_0 + \lambda_1 s}, \quad s \ge 0. && \text{(A39)}
\end{aligned}
$$
By setting $\zeta = e^{\lambda_0}$ and $\lambda = -\lambda_1$, we obtain (A11).

Appendix B. Proof of Theorem 2

Fix $\lambda \ge 0$. It follows from (20) and (21) that
$$
C_{\mathrm{MNC}}(P) \le \sup \left\{ I(x;y) + \lambda \left(1 - \frac{E\left[|x|^2\right] + P_N}{P + P_N}\right) \right\}, \tag{A40}
$$
where the supremum is over the set of probability distributions that satisfy the power constraint (21). Next, we upper-bound the mutual information I x ; y as
$$
\begin{aligned}
I(x;y) &= h(y) - h(y \mid x) && \text{(A41)} \\
&= h(y) - h(n) && \text{(A42)} \\
&= h(|y|) + h(\angle y \mid |y|) + E[\log |y|] - h(n) && \text{(A43)} \\
&= h\left(|y|^2\right) + h(\angle y \mid |y|) - \log 2 - h(n) && \text{(A44)} \\
&\le h\left(|y|^2\right) + \log \pi - h(n) && \text{(A45)} \\
&= h\left(|y|^2\right) - \log(e P_N), && \text{(A46)}
\end{aligned}
$$
where in (A43) we used ([35], Lemma 6.16) and in (A44) we used ([35], Lemma 6.15). We fix now an arbitrary input pdf $f_x(\cdot)$ that satisfies the power constraint and define the random variables $v = |y|^2$ and $w = x + j\eta |x|^2 x$. Next, we shall obtain an upper bound on $h(v)$ that is valid for all $f_x(\cdot)$. Let
$$
\tilde f_v(v) = \kappa e^{-\mu q(v)}, \quad v \ge 0 \tag{A47}
$$
for some parameters $\kappa > 0$ and $\mu > 0$. The function $q(\cdot)$ is defined in (A27). We next choose $\kappa$ so that $\tilde f_v(v)$ is a valid pdf. To do so, we set $z = q(v)$, which implies that $g(z) = v$ and that
$$
\left(1 + 3\eta^2 z^2\right) dz = dv. \tag{A48}
$$
Therefore, integrating $\tilde f_v(v)$ in (A47), we obtain
$$
\begin{aligned}
\int_0^\infty \kappa e^{-\mu q(v)}\,dv &= \kappa \int_0^\infty e^{-\mu z} \left(1 + 3\eta^2 z^2\right) dz && \text{(A49)} \\
&= \kappa\,\frac{\mu^2 + 6\eta^2}{\mu^3}. && \text{(A50)}
\end{aligned}
$$
We see from (A50) that the choice
$$
\kappa = \frac{\mu^3}{\mu^2 + 6\eta^2} \tag{A51}
$$
makes $\tilde f_v(v)$ a valid pdf. Using the definition of the relative entropy, we have
$$
\begin{aligned}
D\left(f_v \,\|\, \tilde f_v\right) &= \int_{-\infty}^{+\infty} f_v(v) \log \frac{f_v(v)}{\tilde f_v(v)}\,dv && \text{(A52)} \\
&= -h(v) - E\left[\log \tilde f_v(v)\right]. && \text{(A53)}
\end{aligned}
$$
Since the relative entropy is nonnegative ([34], Th. 8.6.1), we obtain
$$
\begin{aligned}
h(v) &\le -E\left[\log \tilde f_v(v)\right] && \text{(A54)} \\
&= -\log\kappa + \mu\, E[q(v)] \log e. && \text{(A55)}
\end{aligned}
$$
Substituting (A46) and (A55) into (A40), we obtain
$$
\begin{aligned}
C_{\mathrm{MNC}}(P) &\le -\log(e P_N) - \log\kappa + \lambda + \sup \left\{ \mu\, E[q(v)] \log e - \lambda\,\frac{E\left[|x|^2\right] + P_N}{P + P_N} \right\} && \text{(A56)} \\
&\le -\log(e P_N) - \log\kappa + \lambda + \max_{s > 0} \left\{ \mu\, E\left[q(v) \,\middle|\, |x|^2 = s\right] \log e - \lambda\,\frac{s + P_N}{P + P_N} \right\}. && \text{(A57)}
\end{aligned}
$$
The final upper bound (25) is obtained by minimizing (A57) over all $\lambda \ge 0$ and $\mu > 0$.
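A quick numerical check (not part of the proof; $\eta$ and $\mu$ are arbitrary illustrative values) confirms that $\kappa$ in (A51) indeed normalizes $\tilde f_v$ in (A47). Since the inverse map $q = g^{-1}$ has no closed form, it is evaluated by root-finding:

```python
import numpy as np
from scipy.integrate import quad
from scipy.optimize import brentq

# Illustrative values only.
eta, mu = 0.5, 1.2
kappa = mu**3 / (mu**2 + 6 * eta**2)  # (A51)

# q(v) = g^{-1}(v) with g(x) = x + eta^2 x^3, cf. (A27); since g(x) >= x,
# the root lies in [0, max(1, v)].
def q(v):
    if v == 0.0:
        return 0.0
    return brentq(lambda x: x + eta**2 * x**3 - v, 0.0, max(1.0, v))

# Truncating the integral at v = 500 leaves a negligible tail here.
total, _ = quad(lambda v: kappa * np.exp(-mu * q(v)), 0, 500, limit=200)
```
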

Appendix C. Proof of Theorem 3

It follows from (A55) and (A46) that
$$
I(x;y) \le -\log\kappa + \mu\, E[q(v)] \log e - \log(e P_N), \tag{A58}
$$
where $v = |y|^2$. Moreover,
$$
\begin{aligned}
E[q(v)] &= E\left[q\left(|w + n|^2\right)\right] && \text{(A59)} \\
&= E\left[q\left(|w|^2 + |n|^2 + 2\,\mathrm{Re}(w n^*)\right)\right]. && \text{(A60)}
\end{aligned}
$$
Next, we analyze the function $q(x)$. We have
$$
\begin{aligned}
q'(x) &= \frac{1}{g'(q(x))} && \text{(A61)} \\
&= \frac{1}{1 + 3\eta^2 q^2(x)}. && \text{(A62)}
\end{aligned}
$$
Furthermore,
$$
q''(x) = -\frac{6\eta^2\, q(x)}{\left(1 + 3\eta^2 q^2(x)\right)^3} \le 0. \tag{A63}
$$
Therefore, $q(x)$ is a nonnegative concave function on $[0, \infty)$. Thus, for all real numbers $x \ge 0$ and $y \ge -x$,
$$
q(x + y) \le q(x) + q'(x)\, y. \tag{A64}
$$
Using (A64) in (A60), with $x = |w|^2 + |n|^2$ and $y = 2\,\mathrm{Re}(w n^*)$, we get
$$
E[q(v)] \le E\left[q\left(|w|^2 + |n|^2\right)\right] + 2\, E\left[q'\left(|w|^2 + |n|^2\right) \mathrm{Re}(w n^*)\right]. \tag{A65}
$$
Using (A64) once more with $x = |w|^2$ and $y = |n|^2$, we obtain
$$
\begin{aligned}
E[q(v)] &\le E\left[q\left(|w|^2\right)\right] + E\left[q'\left(|w|^2\right) |n|^2\right] + 2\, E\left[q'\left(|w|^2 + |n|^2\right) \mathrm{Re}(w n^*)\right] && \text{(A66)} \\
&= E\left[q\left(|w|^2\right)\right] + P_N\, E\left[q'\left(|w|^2\right)\right] + 2\, E\left[q'\left(|w|^2 + |n|^2\right) \mathrm{Re}(w n^*)\right]. && \text{(A67)}
\end{aligned}
$$
We shall now bound each expectation in (A67) separately. Since $|w|^2 = g\left(|x|^2\right)$, we have that
$$
\begin{aligned}
E\left[q\left(|w|^2\right)\right] &= E\left[|x|^2\right] && \text{(A68)} \\
&\le P, && \text{(A69)}
\end{aligned}
$$
where the last inequality follows from (21). It also follows from (A62) that
$$
q'\left(|w|^2\right) \le 1. \tag{A70}
$$
Furthermore, (A62) and (A63) imply that the function $q'(x)$ is positive and decreasing on $[0, \infty)$. Therefore,
$$
\begin{aligned}
E\left[q'\left(|w|^2 + |n|^2\right) \mathrm{Re}(w n^*)\right] &\le E\left[q'\left(|w|^2 + |n|^2\right) \left|\mathrm{Re}(w n^*)\right|\right] && \text{(A71)} \\
&\le E\left[q'\left(|w|^2\right) \left|\mathrm{Re}(w n^*)\right|\right] && \text{(A72)} \\
&\le E\left[q'\left(|w|^2\right) |w| \cdot |n|\right] && \text{(A73)} \\
&= \frac{\sqrt{\pi P_N}}{2}\, E\left[q'\left(|w|^2\right) |w|\right] && \text{(A74)} \\
&\le \frac{\sqrt{\pi P_N}}{2}\, \max_{t \ge 0} \left\{ t\, q'\left(t^2\right) \right\} && \text{(A75)} \\
&= \frac{\sqrt{\pi P_N}}{2}\, \max_{t \ge 0} \frac{t}{1 + 3\eta^2 q^2\left(t^2\right)}, && \text{(A76)}
\end{aligned}
$$
where (A74) holds because $E[|n|] = \sqrt{\pi P_N / 4}$ and the last equality follows from (A62). To calculate the maximum in (A76), we use the change of variables $t^2 = x + \eta^2 x^3$ to obtain
$$
\begin{aligned}
\max_{t \ge 0} \frac{t}{1 + 3\eta^2 q^2\left(t^2\right)} &= \max_{x \ge 0} \frac{\sqrt{x + \eta^2 x^3}}{1 + 3\eta^2 x^2} && \text{(A77)} \\
&= \frac{1}{12^{3/8} \sqrt{\left(\sqrt{3} - 1\right)\eta}}, && \text{(A78)}
\end{aligned}
$$
where the last step follows by some standard algebraic manipulations that involve finding the roots of the derivative of the objective function on the RHS of (A77). Substituting (A78) into (A76), we obtain
$$
E\left[q'\left(|w|^2 + |n|^2\right) \mathrm{Re}(w n^*)\right] \le \frac{\sqrt{\pi P_N}}{2} \cdot \frac{1}{12^{3/8} \sqrt{\left(\sqrt{3} - 1\right)\eta}}. \tag{A79}
$$
Substituting (A69), (A70), and (A79) into (A67), and the result into (A58), we obtain
$$
I(x;y) \le -\log\kappa + \mu \log e \left( P + P_N + \frac{\sqrt{\pi P_N}}{12^{3/8} \sqrt{\left(\sqrt{3} - 1\right)\eta}} \right) - \log(e P_N). \tag{A80}
$$
Finally, we obtain (25) by substituting (A51) into (A80). Since the upper bound (A80) on mutual information holds for every input distribution that satisfies the power constraint, it is also an upper bound on capacity for every μ > 0 . To find the optimal μ , we need to minimize
$$
\log \frac{\mu^2 + 6\eta^2}{\mu^3} + \mu (P + B) \log e = \log \left( \exp\left(\mu (P + B)\right) \frac{\mu^2 + 6\eta^2}{\mu^3} \right), \tag{A81}
$$
where $B$ was defined in (28). Observe now that the function inside the logarithm on the RHS of (A81) goes to infinity both as $\mu \to 0$ and as $\mu \to \infty$. Therefore, since this function is positive, it must have a minimum in the interval $(0, \infty)$. To find this minimum, we set its derivative equal to zero and get (28). Note finally that, since (24) has exactly one positive root, which was proved in Appendix A.1, (28) also has exactly one positive root.
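The closed form (A78) can be confirmed numerically. The sketch below (not from the paper; $\eta$ is an arbitrary illustrative value) maximizes the objective on the RHS of (A77) and compares the result with (A78):

```python
import numpy as np
from scipy.optimize import minimize_scalar

# Illustrative value only.
eta = 0.7

# Objective on the RHS of (A77); maximized by minimizing its negative.
def neg_obj(x):
    return -np.sqrt(x + eta**2 * x**3) / (1 + 3 * eta**2 * x**2)

res = minimize_scalar(neg_obj, bounds=(0.0, 50.0), method="bounded")
numeric_max = -res.fun

# Closed form (A78): 1 / (12^{3/8} * sqrt((sqrt(3) - 1) * eta)).
closed_form = 1.0 / (12**0.375 * np.sqrt((np.sqrt(3) - 1) * eta))
```
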

Appendix D. Proof of Theorem 5

The proof uses similar steps as in ([37], Sec. III-C). We upper-bound the mutual information between $x$ and $y$, expressed in polar coordinates, as
$$
\begin{aligned}
I(x;y) &= I(r_0, \theta_0;\, r, \theta) && \text{(A82)} \\
&= I(r_0, \theta_0;\, r) + I(r_0, \theta_0;\, \theta \mid r) && \text{(A83)} \\
&= h(r) - h(r \mid r_0, \theta_0) + h(\theta \mid r) - h(\theta \mid r, r_0, \theta_0) && \text{(A84)} \\
&\le h\left(r^2\right) - E[\log r] - \log 2 - h(r \mid r_0, \theta_0) + \log(2\pi) - h(\theta \mid r, r_0, \theta_0). && \text{(A85)}
\end{aligned}
$$
In (A85), we used ([35], Equation (317)) and that $h(\theta \mid r) \le \log(2\pi)$. Let now $\tilde f_{r^2}(\cdot)$ denote an arbitrary pdf for $r^2$. Following the same calculations as in (A52)–(A54), we obtain
$$
h\left(r^2\right) \le -E\left[\log \tilde f_{r^2}\left(r^2\right)\right]. \tag{A86}
$$
We shall take $\tilde f_{r^2}(\cdot)$ to be a Gamma distribution with parameters $\alpha > 0$ and $\beta = (P + P_N)/\alpha$, i.e.,
$$
\tilde f_{r^2}(z) = \frac{z^{\alpha - 1} e^{-z/\beta}}{\beta^\alpha\, \Gamma(\alpha)}, \quad z \ge 0. \tag{A87}
$$
Here, Γ ( · ) denotes the Gamma function. Substituting (A87) into (A86), we obtain
$$
\begin{aligned}
-E&\left[\log \tilde f_{r^2}\left(r^2\right)\right] && \text{(A88)} \\
&= -2(\alpha - 1)\, E[\log r] + \alpha\, \frac{E\left[r_0^2\right] + P_N}{P + P_N} \log e + \alpha \log \frac{P + P_N}{\alpha} + \log \Gamma(\alpha). && \text{(A89)}
\end{aligned}
$$
It follows from (15) that the random variables r and θ 0 are conditionally independent given r 0 . Therefore,
$$
h(r \mid r_0, \theta_0) = h(r \mid r_0). \tag{A90}
$$
Next, we study the term $h(\theta \mid r, r_0, \theta_0)$ in (A85). From Bayes' theorem and (15) it follows that, for every $\theta' \in [0, 2\pi)$,
$$
f_{\theta \mid r, r_0, \theta_0}(\theta \mid r, r_0, \theta_0) = f_{\theta \mid r, r_0, \theta_0}(\theta - \theta' \mid r, r_0, \theta_0 - \theta'). \tag{A91}
$$
Therefore,
$$
\begin{aligned}
h(\theta \mid r, r_0, \theta_0) &= \int_0^{2\pi} f_{\theta_0}(\theta_0)\, h(\theta \mid r, r_0, \theta_0 = \theta_0)\, d\theta_0 && \text{(A92)} \\
&= \int_0^{2\pi} f_{\theta_0}(\theta_0)\, h(\theta - \theta_0 \mid r, r_0, \theta_0 = 0)\, d\theta_0 && \text{(A93)} \\
&= \int_0^{2\pi} f_{\theta_0}(\theta_0)\, h(\theta \mid r, r_0, \theta_0 = 0)\, d\theta_0 && \text{(A94)} \\
&= h(\theta \mid r, r_0, \theta_0 = 0). && \text{(A95)}
\end{aligned}
$$
Here, (A94) follows because differential entropy is invariant to translations ([34], Th. 8.6.3). Substituting (A89), (A86), (A90), and (A95) into (A85), we obtain
$$
\begin{aligned}
I(r_0, \theta_0;\, r, \theta) &\le \alpha \log \frac{P + P_N}{\alpha} + \log \Gamma(\alpha) + \log \pi + \alpha\, \frac{E\left[r_0^2\right] + P_N}{P + P_N} \log e \\
&\quad + (1 - 2\alpha)\, E[\log r] - h(r \mid r_0) - h(\theta \mid r, r_0, \theta_0 = 0). && \text{(A96)}
\end{aligned}
$$
Fix $\lambda \ge 0$. We next upper-bound $C_{\mathrm{MNC}}$ using (A96) as
$$
\begin{aligned}
C_{\mathrm{MNC}}(P) &\le \sup \left\{ I(r_0, \theta_0;\, r, \theta) + \lambda \left(1 - \frac{E\left[r_0^2\right] + P_N}{P + P_N}\right) \right\} && \text{(A97)} \\
&\le \alpha \log \frac{P + P_N}{\alpha} + \log \Gamma(\alpha) + \log \pi + \lambda \\
&\quad + \sup \biggl\{ (\alpha \log e - \lambda)\, \frac{E\left[r_0^2\right] + P_N}{P + P_N} + (1 - 2\alpha)\, E[\log r] - h(r \mid r_0) \\
&\qquad - h(\theta \mid r, r_0, \theta_0 = 0) \biggr\}, && \text{(A98)}
\end{aligned}
$$
where the supremum is over the set of input probability distributions that satisfy (21). We complete the proof by noting that the supremum in (A98) is less than or equal to $\max_{r_0 > 0} \left\{ g_{\lambda,\alpha}(r_0, P) \right\}$, where $g_{\lambda,\alpha}(r_0, P)$ is defined in (36).
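The expansion (A88)–(A89) is a per-sample identity for the Gamma pdf (A87), so it can be checked on simulated data. In the sketch below (not part of the proof), $r^2$ is drawn from an arbitrary nonnegative distribution; $\alpha$, $P$, and $P_N$ are illustrative values, and logarithms are natural ($\log e = 1$):

```python
import numpy as np
from scipy.special import gammaln

rng = np.random.default_rng(0)

# Illustrative values only.
alpha, P, P_N = 1.5, 2.0, 0.1
beta = (P + P_N) / alpha

# Draw r^2 from an arbitrary nonnegative distribution; the identity
# below holds sample-by-sample, so the choice does not matter.
r2 = rng.exponential(P + P_N, size=100_000)
r = np.sqrt(r2)

# Left side: -E[log f_tilde_{r^2}(r^2)] with f_tilde the Gamma pdf (A87).
log_pdf = (alpha - 1) * np.log(r2) - r2 / beta - alpha * np.log(beta) - gammaln(alpha)
lhs = -log_pdf.mean()

# Right side: the expansion (A89), with E[r_0^2] + P_N replaced by the
# empirical mean of r^2.
rhs = (-2 * (alpha - 1) * np.log(r).mean()
       + alpha * r2.mean() / (P + P_N)
       + alpha * np.log((P + P_N) / alpha)
       + gammaln(alpha))
```
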

References

  1. Mecozzi, A. Limits to long-haul coherent transmission set by the Kerr nonlinearity and noise of the in-line amplifiers. J. Lightw. Technol. 1994, 12, 1993–2000.
  2. Essiambre, R.J.; Kramer, G.; Winzer, P.J.; Foschini, G.J.; Goebel, B. Capacity Limits of Optical Fiber Networks. J. Lightw. Technol. 2010, 28, 662–701.
  3. Merhav, N.; Kaplan, G.; Lapidoth, A.; Shamai (Shitz), S. On information rates for mismatched decoders. IEEE Trans. Inform. Theory 1994, 40, 1953–1967.
  4. Secondini, M.; Forestieri, E. Scope and limitations of the nonlinear Shannon limit. J. Lightw. Technol. 2017, 35, 893–902.
  5. Ghozlan, H.; Kramer, G. Models and information rates for multiuser optical fiber channels with nonlinearity and dispersion. IEEE Trans. Inform. Theory 2017, 63, 6440–6456.
  6. Secondini, M.; Forestieri, E.; Prati, G. Achievable Information Rate in Nonlinear WDM Fiber-Optic Systems With Arbitrary Modulation Formats and Dispersion Maps. J. Lightw. Technol. 2013, 31, 3839–3852.
  7. Fehenberger, T.; Alvarado, A.; Bayvel, P.; Hanik, N. On achievable rates for long-haul fiber-optic communications. Opt. Express 2015, 23, 9183–9191.
  8. Mitra, P.P.; Stark, J.B. Nonlinear Limits to the Information Capacity of Optical Fibre Communications. Nature 2001, 411, 1027–1030.
  9. Ellis, A.D.; Zhao, J.; Cotter, D. Approaching the Nonlinear Shannon Limit. J. Lightw. Technol. 2010, 28, 423–433.
  10. Dar, R.; Shtaif, M.; Feder, M. New Bounds on the Capacity of the Nonlinear Fiber-Optic Channel. Opt. Lett. 2014, 39, 398–401.
  11. Splett, A.; Kurzke, C.; Petermann, K. Ultimate Transmission Capacity of Amplified Optical Fiber Communication Systems Taking into Account Fiber Nonlinearities. In Proceedings of the European Conference on Optical Communication (ECOC), Montreux, Switzerland, 12–16 September 1993; p. MoC2.4.
  12. Kramer, G.; Yousefi, M.I.; Kschischang, F.R. Upper Bound on the Capacity of a Cascade of Nonlinear and Noisy Channels. In Proceedings of the IEEE Information Theory Workshop (ITW), Rotorua, New Zealand, 29 August–1 September 2015.
  13. Agrawal, G.P. Nonlinear Fiber Optics, 4th ed.; Elsevier: San Diego, CA, USA, 2006.
  14. Agrell, E.; Durisi, G.; Johannisson, P. Information-theory-friendly models for fiber-optic channels: A primer. In Proceedings of the IEEE Information Theory Workshop (ITW), Rotorua, New Zealand, 29 August–1 September 2015.
  15. Peddanarappagari, K.V.; Brandt-Pearce, M. Volterra series transfer function of single-mode fibers. J. Lightw. Technol. 1997, 15, 2232–2241.
  16. Mecozzi, A.; Clausen, C.B.; Shtaif, M. Analysis of Intrachannel Nonlinear Effects in Highly Dispersed Optical Pulse Transmission. IEEE Photon. Technol. Lett. 2000, 12, 392–394.
  17. Mecozzi, A.; Essiambre, R.J. Nonlinear Shannon Limit in Pseudolinear Coherent Systems. J. Lightw. Technol. 2012, 30, 2011–2024.
  18. Taghavi, M.H.; Papen, G.C.; Siegel, P.H. On the Multiuser Capacity of WDM in a Nonlinear Optical Fiber: Coherent Communication. IEEE Trans. Inform. Theory 2006, 52, 5008–5022.
  19. Dar, R.; Feder, M.; Mecozzi, A.; Shtaif, M. Accumulation of nonlinear interference noise in fiber-optic systems. Opt. Express 2014, 22, 14199–14211.
  20. Ho, K.P. Phase-Modulated Optical Communication Systems; Springer: New York, NY, USA, 2005.
  21. Turitsyn, K.S.; Derevyanko, S.A.; Yurkevich, I.; Turitsyn, S.K. Information Capacity of Optical Fiber Channels with Zero Average Dispersion. Phys. Rev. Lett. 2003, 91, 203901-1–203901-4.
  22. Yousefi, M.I.; Kschischang, F.R. On the per-sample capacity of nondispersive optical fibers. IEEE Trans. Inform. Theory 2011, 57, 7522–7541.
  23. Terekhov, I.; Reznichenko, A.; Kharkov, Y.A.; Turitsyn, S. Log-log growth of channel capacity for nondispersive nonlinear optical fiber channel in intermediate power range. Phys. Rev. E 2017, 95, 062133.
  24. Panarin, A.; Reznichenko, A.; Terekhov, I. Next-to-leading-order corrections to capacity for a nondispersive nonlinear optical fiber channel in the intermediate power region. Phys. Rev. E 2017, 95, 012127.
  25. Reznichenko, A.; Chernykh, A.; Smirnov, S.; Terekhov, I. Log-log growth of channel capacity for nondispersive nonlinear optical fiber channel in intermediate power range: Extension of the model. Phys. Rev. E 2019, 99, 012133.
  26. Keykhosravi, K.; Durisi, G.; Agrell, E. A Tighter Upper Bound on the Capacity of the Nondispersive Optical Fiber Channel. In Proceedings of the European Conference on Optical Communication (ECOC), Gothenburg, Sweden, 17–21 September 2017.
  27. Kramer, G. Autocorrelation function for dispersion-free fiber channels with distributed amplification. IEEE Trans. Inform. Theory 2018, 64, 5131–5155.
  28. Lau, A.P.T.; Kahn, J.M. Signal design and detection in the presence of nonlinear phase noise. J. Lightw. Technol. 2007, 25, 3008–3016.
  29. Tavana, M.; Keykhosravi, K.; Aref, V.; Agrell, E. A Low-Complexity Near-Optimal Detector for Multispan Zero-Dispersion Fiber-Optic Channels. In Proceedings of the European Conference on Optical Communication (ECOC), Roma, Italy, 23–27 September 2018.
  30. Forestieri, E.; Secondini, M. Solving the nonlinear Schrödinger equation. In Optical Communication Theory and Techniques; Forestieri, E., Ed.; Springer: Boston, MA, USA, 2005; pp. 3–11.
  31. Gamal, A.E.; Kim, Y.H. Network Information Theory; Cambridge University Press: Cambridge, UK, 2011.
  32. Li, S.; Häger, C.; Garcia, N.; Wymeersch, H. Achievable information rates for nonlinear fiber communication via end-to-end autoencoder learning. In Proceedings of the European Conference on Optical Communication (ECOC), Roma, Italy, 23–27 September 2018.
  33. Yousefi, M.I. The asymptotic capacity of the optical fiber. arXiv 2016, arXiv:1610.06458.
  34. Cover, T.M.; Thomas, J.A. Elements of Information Theory, 2nd ed.; Wiley: Hoboken, NJ, USA, 2006.
  35. Lapidoth, A.; Moser, S.M. Capacity bounds via duality with applications to multiple-antenna systems on flat-fading channels. IEEE Trans. Inform. Theory 2003, 49, 2426–2467.
  36. Papoulis, A.; Pillai, S.U. Probability, Random Variables, and Stochastic Processes, 4th ed.; McGraw-Hill Education: New York, NY, USA, 2002.
  37. Durisi, G. On the capacity of the block-memoryless phase-noise channel. IEEE Commun. Lett. 2012, 16, 1157–1160.
Figure 1. Capacity bounds for the RPC in (8) and the MNC in (15), together with the capacity of the LPC in (14). The amplitude and the phase components of the half-Gaussian lower bound for the MNC are also plotted.
Table 1. Channel parameters.

Parameter          Symbol   Value
Attenuation        α        0.2 dB/km
Nonlinearity       γ        1.27 (W·km)⁻¹
Fiber length       L        5000 km
Maximum bandwidth  W_N      125 GHz
Emission factor    n_sp     1
Photon energy      hν       1.28 × 10⁻¹⁹ J
Noise variance     P_N      −21.3 dBm
