# Harnessing machine learning for fiber-induced nonlinearity mitigation in long-haul coherent optical OFDM


## Abstract


## 1. Introduction

## 2. Drawbacks and Deficiencies of Benchmark Fiber Non-Linearity Compensation Schemes

## 3. Sources of Stochastic Noises

- Advanced modulation formats: These have become a key ingredient in the design of modern optically routed networks, as a signal is modulated in amplitude, frequency and phase, allowing the information-carrying capacity to be doubled. Such signal formats include high-order single-carrier formats (e.g., 16/64-QAM) and multi-carrier modulation schemes (e.g., OFDM) [8], which cope better with ‘linear’ channel distortions. Unfortunately, high-order signal formats are vulnerable to fiber non-linearities, to the point that, when multiple signals are transmitted spectrally close to each other, the resultant non-linear deterministic noise is so ‘dense’ that it appears stochastic [20,21]. In multi-carrier modulation schemes such as CO-OFDM, this phenomenon is more prominent because of the high peak-to-average power ratio (PAPR) and because the subcarriers are spectrally very close to each other, causing inter-carrier interference [8,20,21].
- Optical Amplifiers: In long-range optical communications, multi-span amplification keeps the signal power level high enough, but the excess noise of the amplifiers beats with the incoming signal. This noise originates from quantum-mechanical uncertainties in the number of photons added at each amplifier and is ultimately limited by the Heisenberg uncertainty principle [3,7]. The amplifier excess noise can be interpreted as resulting from unavoidable spontaneous emission into the amplified mode (i.e., ASE). The effect of ASE noise interacting with the fiber non-linearity is called parametric noise amplification (PNA).
- Optical Fibers: Conventional fibers include single-mode fibers (SMFs), which generally exhibit stochastic noise from polarization rotation. The other form of stochastic noise is due to the interplay between linear chromatic dispersion (CD) and Kerr non-linearity when signal–noise interaction is considered.

## 4. Machine Learning for Fiber-Induced Non-Linear Noise Suppression in Coherent Optical Orthogonal Frequency Division Multiplexing (CO-OFDM)

- when closed-form solutions do not exist, and trial and error methods are the only approaches to solving the problem at hand,
- when the application requires real-time performance, and
- when faster convergence rates and smaller errors are required in the optimization of large systems.

$2^{19}-1$ having a period of approximately $2^{19937}-1$ (Mersenne twister) [46]. In contrast to the work reported in [47], which showed that ANNs will most likely overestimate the system performance when short pseudorandom sequences (with lengths of $2^{7}$ and $2^{15}$) are employed, the adopted pseudorandom sequence has a much longer period. Furthermore, the training process is applied to a data set of $2^{19}-1$ that is not repeated and is split into three separate subsets: (i) an actual training set (dependent on the number of iterations, i.e., epochs, for the ANN); (ii) a validation set; and (iii) testing data, using 70%, 10%, and 20%, respectively. The ANN/SVM algorithm is iteratively updated until the error on the validation set converges to a given rate, while different amounts of training data are tested, ranging from 1% up to 70%. As indicated in [20,21,22], the optimal amount of training data, corresponding to the maximum achievable Q-factor, is 10% for both quaternary phase-shift keying (QPSK) and 16-QAM formats, above which there is saturation (i.e., no further Q-factor improvement was observed).
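The 70%/10%/20% partition described above can be sketched as follows. This is a minimal illustration, not the authors' code: the function name `split_indices`, the fixed seed, and the choice to shuffle before splitting are assumptions made for the example.

```python
import numpy as np

def split_indices(n_symbols, fractions=(0.70, 0.10, 0.20), seed=0):
    """Split symbol indices into training, validation and test subsets."""
    rng = np.random.default_rng(seed)      # hypothetical fixed seed for repeatability
    order = rng.permutation(n_symbols)     # assumption: shuffle before splitting
    n_train = int(round(fractions[0] * n_symbols))
    n_val = int(round(fractions[1] * n_symbols))
    train = order[:n_train]
    val = order[n_train:n_train + n_val]
    test = order[n_train + n_val:]         # the remainder is used for testing
    return train, val, test

# Example with 400 symbols (an illustrative size)
train, val, test = split_indices(400)
print(len(train), len(val), len(test))     # 280 40 80
```

The three index sets are disjoint by construction, so no symbol used for training reappears in validation or testing.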

#### 4.1. Artificial Neural Network (ANN)

_{k,i}, for each subcarrier, where the outputs of all subcarriers are summed. In the training stage, the minimum mean-square error (MMSE) algorithm determines the error signal and iteratively updates the weights until the desired error value is reached, thus indicating the optimal match between the sub-network output and the transmitted CO-OFDM symbols. The error signal is given by $E(k) = s(k) - \hat{s}(k)$, where $\hat{s}(k)$ is calculated in terms of a non-linear activation function (NAF), $\phi_{k,i}$, as $\hat{s}(k) = \sum_{i=1}^{M} w_{k,i}\,\phi_{k,i}(s(k))$. The chosen NAF is a differential sigmoid function and is a “split” complex NAF, where two conventional real-valued functions process the I-Q components, in contrast to our proposed approach, which processes the complex data simultaneously, thus accounting for the cross-information between real and imaginary data. The number of ANN neurons in every sub-neural network is equal to the number of points of the constellation. The 2D ANN is based on Riedmiller’s resilient back-propagation (RR-BP) algorithm and performs an approximation to the global minimization achieved by the steepest descent [20]. The training function updates the weights and bias values according to RR-BP, which minimizes the difference between the ANN output and the desired output by splitting the complex OFDM data into two real-valued data collections. The transfer functions for the hidden layer of the ANN are differentiable and similar to the hyperbolic tangent function. For the output layer, the linear function “purelin” was employed [20]. The MMSE in Figure 3a represents the subsystem that implements RR-BP to find the weights that minimize the error $E(n) = \|S(n) - \hat{S}(n)\|^{2}$, where $S(n)$ and $\hat{S}(n)$ are the desired and calculated output vectors, respectively. The weights are updated according to the steps described in Figure 4 by applying gradient descent on the cost function $E(n)$ to reach a minimum. Finally, at the end-output of Figure 3b we introduced slack variables that allow some misclassified symbols but penalize them.
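To make the output expression $\hat{s}(k) = \sum_{i} w_{k,i}\,\phi_{k,i}(s(k))$ concrete, the sketch below evaluates one sub-network with a “split” complex tanh activation. It is only an illustration: the input weights `v`, the biases `b`, and the way $\phi_{k,i}$ is parameterized per neuron are assumptions, since the text does not specify them, and no training (RR-BP) is performed here.

```python
import numpy as np

def split_tanh(z):
    """'Split' complex activation: one real tanh on I and one on Q."""
    z = np.asarray(z, dtype=complex)
    return np.tanh(z.real) + 1j * np.tanh(z.imag)

def subnetwork_output(s, v, b, w):
    """Estimate s_hat(k) = sum_i w[i] * phi_i(s(k)) for one subcarrier.

    s : received complex symbols on the subcarrier, shape (n,)
    v, b : hypothetical hidden-layer input weights and biases, shape (M,)
    w : output weights w_{k,i}, shape (M,)
    """
    s = np.atleast_1d(np.asarray(s, dtype=complex))
    hidden = split_tanh(s[:, None] * v[None, :] + b[None, :])  # shape (n, M)
    return hidden @ w                                          # linear output layer

M = 4                                    # neurons = constellation points (e.g. QPSK)
v = np.ones(M, dtype=complex)            # placeholder (untrained) parameters
b = np.zeros(M, dtype=complex)
w = np.full(M, 0.25, dtype=complex)
s = np.array([1 + 1j, -1 - 1j])
s_hat = subnetwork_output(s, v, b, w)
error = s - s_hat                        # E(k) = s(k) - s_hat(k)
```

With these placeholder parameters every neuron computes the same value, so `s_hat` is simply `tanh(1) * s`; a trained network would have distinct weights per neuron.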

#### 4.2. Support Vector Machine (SVM)

_{k,i}, after which the outputs for different k are summed. The distribution of noisy constellation points is learnt during an initial training process, similarly to the ANN. Once the distribution is learnt, the detector can make decisions for new, unknown observation symbols. A hyperplane is also obtained through approximation of a nonlinear function using a set of kernels (sigmoid function) of the training dataset. SVR maps the data to a high-dimensional feature space, as shown in Figure 5a, using a nonlinear mapping φ, and then linear regression is formulated by introducing the “ε-insensitive” loss function in the following form: $f(x,w) = \sum_{i=1}^{M} w_{k,i}\,\phi_{k,i}(x) + b$, where $f(x,w)$ is the target linear model, $\phi_{k,i}(x)$ denotes a set of nonlinear transformations of the input x, and b is the bias term. The number of vectors in every hidden node is equal to the number of points of the constellation; for example, for 4-QAM it is 4. The “ε-insensitive” loss function can be learnt through the training process by minimizing the error $\psi(w,\xi) = \frac{1}{2}\|w\|^{2} + C\sum_{k}(\xi_{k}^{-} + \xi_{k}^{+})$, where $\xi_{k}^{-}, \xi_{k}^{+}$ are slack variables corresponding to the upper and lower bounds on the output function and C is the penalty parameter. Depending on how much loss is ignored, the latter equation can be approximated by the Lagrange loss function $L(y, f(x,w))$.
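As a small worked example of the quantities above, the sketch below evaluates the ε-insensitive loss and the objective $\psi(w,\xi)$ for real-valued residuals. It assumes that, for a fixed model, the slack sum $\xi_k^{-} + \xi_k^{+}$ takes its smallest feasible value $\max(0, |y_k - f(x_k,w)| - \varepsilon)$; the function names and numbers are illustrative only.

```python
import numpy as np

def eps_insensitive_loss(residuals, eps):
    """Loss is zero inside the eps-tube and grows linearly outside it."""
    return np.maximum(0.0, np.abs(residuals) - eps)

def svr_objective(w, residuals, C, eps):
    """psi(w, xi) = 0.5*||w||^2 + C * sum_k (xi_k^- + xi_k^+)."""
    slack_sum = eps_insensitive_loss(residuals, eps).sum()
    return 0.5 * np.sum(np.abs(w) ** 2) + C * slack_sum

w = np.array([1.0, 2.0])                     # illustrative weights
residuals = np.array([0.05, -0.30, 0.50])    # y_k - f(x_k, w)
loss = eps_insensitive_loss(residuals, eps=0.1)
psi = svr_objective(w, residuals, C=10.0, eps=0.1)
```

Here only the second and third residuals leave the tube, so they alone contribute to the penalty term; a larger C would penalize them more heavily relative to the norm of w.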

_{k,i} (where i is the symbol) by means of a hybrid maximum likelihood and recursive least-square process [25]. In the IRWLS steps, described in Figure 6b, w refers to the weights, y refers to the received symbols (reference sequence), $L_{\epsilon}(e_i)$ is the loss function, C is a penalty regulation parameter, $e_i$ is the penalization term for the ith symbol, and $N_s$ is the total number of subcarriers. Finally, $R_s$ and $R_P$ refer to Sato’s and Godard’s constants, respectively. For the fast version of SVM, a Newton SVM was implemented with the architecture shown in Figure 6c. Newton SVM suppresses the input space features for a nonlinear programming formulation of supervised SVM classifiers. This stand-alone method can handle classification problems in very high-dimensional spaces. The algorithm solves a Newton-based problem formulated via the Lagrangian multipliers of an SVM-based classifier, resulting in an effective iterative scheme [29] consisting of only a few steps. To process a high-order modulation format (and thus constellation mapper) with a large-dimensional input, a fast finite Newton approach was considered. For the classification problem, this approach searches for a unique Lagrangian-based global minimum solution by solving a system of nonlinear equations a finite number of times. The aforementioned Newton-based algorithm steps and related equations are depicted in Figure 6d, which involves an Armijo step size [29]. In the equations in Figure 6d, column vectors are considered unless transposed to a row vector (using a $T$ superscript). Moreover, as depicted in Figure 6c, $\|x\|$ denotes the 2-norm of a vector x, while A is the matrix related to an OFDM received signal incorporating m complex symbols in the n-dimensional real space $R^{n}$, which expresses the modulation order level (i.e., 4 for 16-QPSK).

#### 4.3. Clustering

1. Choose k initial cluster centers (centroids).
2. Compute point-to-cluster-centroid distances of all observations to each centroid.
3. Compute the average of the observations in each cluster to obtain k new centroid locations.
4. Repeat steps 2 through 3 until cluster assignments do not change, or the maximum number of iterations is reached.
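The K-means steps listed above can be sketched in a few lines for complex constellation points. This is a minimal illustration under stated assumptions: the initial centroids are taken to be the ideal QPSK points, each symbol is assigned to its nearest centroid between the distance and averaging steps, and the test data (phase rotation of 0.2 rad, noise level, seed) are invented for the example.

```python
import numpy as np

def kmeans(points, init_centroids, max_iter=100):
    """K-means on complex symbols; distances are Euclidean in the I-Q plane."""
    centroids = np.array(init_centroids, dtype=complex)    # step 1
    labels = np.full(len(points), -1)
    for _ in range(max_iter):
        dist = np.abs(points[:, None] - centroids[None, :])  # step 2
        new_labels = dist.argmin(axis=1)                   # nearest centroid
        if np.array_equal(new_labels, labels):             # assignments stable
            break
        labels = new_labels
        for j in range(len(centroids)):                    # step 3
            members = points[labels == j]
            if members.size:
                centroids[j] = members.mean()
    return centroids, labels

rng = np.random.default_rng(1)                             # hypothetical seed
ideal = np.array([1 + 1j, -1 + 1j, -1 - 1j, 1 - 1j]) / np.sqrt(2)
tx = np.repeat(ideal, 50)
noise = 0.05 * (rng.standard_normal(200) + 1j * rng.standard_normal(200))
rx = tx * np.exp(1j * 0.2) + noise                         # rotated, noisy symbols
centroids, labels = kmeans(rx, ideal)
```

In this toy case the centroids move from the ideal points to the rotated cluster centers, which is the kind of constellation drift a blind clustering equalizer is meant to follow.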

$t_i$ refers to the ith symbol, $c_j$ is the center of the jth cluster, and $\mu_{ij}$ refers to the MD of $t_i$ in the jth cluster. FL is processed in 5 steps:

1. Enter the number of targeted clusters.
2. Initiate the cluster MD, $\mu_{ij}$.
3. Estimate the center of each cluster by $c_j = \sum_{i=1}^{R} \mu_{ij}^{m}\, t_i \,/\, \sum_{i=1}^{R} \mu_{ij}^{m}$.
4. Update $\mu_{ij}$ using $\mu_{ij} = 1 \,/\, \sum_{k=1}^{L} \left( \|t_i - c_j\| / \|t_i - c_k\| \right)^{2/(m-1)}$ and compute $F_m$.
5. Return and perform steps 2–4 until $F_m$ has converged to a desired threshold.
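A compact sketch of this fuzzy clustering loop is given below. It assumes the standard fuzzy c-means forms of the center and membership updates and takes the objective to be $F_m = \sum_i \sum_j \mu_{ij}^{m} \|t_i - c_j\|^2$, which is not spelled out in the surviving text; the initial memberships, seed and noise level are likewise illustrative.

```python
import numpy as np

def fuzzy_c_means(t, mu0, m=2.0, tol=1e-9, max_iter=200):
    """Fuzzy clustering of complex symbols t from initial memberships mu0 (R x L)."""
    mu = np.array(mu0, dtype=float)
    prev_obj = np.inf
    for _ in range(max_iter):
        weights = mu ** m
        # Step 3: membership-weighted cluster centers
        centers = (weights * t[:, None]).sum(axis=0) / weights.sum(axis=0)
        # Step 4: membership update from relative distances
        d = np.abs(t[:, None] - centers[None, :]) + 1e-12   # avoid division by zero
        ratio = (d[:, :, None] / d[:, None, :]) ** (2.0 / (m - 1.0))
        mu = 1.0 / ratio.sum(axis=2)
        obj = np.sum(mu ** m * d ** 2)                       # assumed form of F_m
        # Step 5: stop once F_m no longer changes
        if abs(prev_obj - obj) < tol:
            break
        prev_obj = obj
    return centers, mu

rng = np.random.default_rng(2)                               # hypothetical seed
ideal = np.array([1 + 1j, -1 + 1j, -1 - 1j, 1 - 1j]) / np.sqrt(2)
t = np.repeat(ideal, 50) + 0.05 * (rng.standard_normal(200)
                                   + 1j * rng.standard_normal(200))
hard = np.abs(t[:, None] - ideal[None, :]).argmin(axis=1)
mu0 = np.full((200, 4), 0.1)                                 # soft initial memberships
mu0[np.arange(200), hard] = 0.7
centers, mu = fuzzy_c_means(t, mu0)
```

Unlike K-means, every symbol keeps a membership in every cluster, and each row of `mu` sums to one.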

Let $x_1$ through $x_n$ be a set of complex data (symbols), with no assumptions made about their internal structure, and let S be a function that quantifies the similarity between any 2 symbols, such that $S(x_i, x_j) > S(x_i, x_k)$ if $x_i$ is more similar to $x_j$ than to $x_k$. For this example, the negative squared distance of 2 symbols was used, i.e., for points $x_i$ and $x_k$, $S(x_i, x_k) = -\|x_i - x_k\|^{2}$.

‘**responsibility**, R(i, k)’ and ‘**availability**, A(i, k)’ matrices, where R quantifies how “well-suited” $x_k$ is to serve as the exemplar for $x_i$ compared to other candidate exemplars, while A shows how “appropriate” it would be for $x_i$ to pick $x_k$ as its exemplar, taking into account other points’ preferences. R and A are initialized to zero, being viewed as log-probability tables, and then AP iteratively updates R and A by:
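A minimal sketch of these message-passing updates is given below, assuming the standard affinity-propagation rules: each responsibility is the similarity minus the best competing availability-plus-similarity, and each availability is the clipped sum of positive responsibilities. The damping factor, the median-based preference on the diagonal of S, the iteration count and the test data are assumptions for illustration, not values taken from the text.

```python
import numpy as np

def affinity_propagation(x, damping=0.8, n_iter=300):
    """Return, for each complex symbol in x, the index of its exemplar."""
    n = len(x)
    S = -np.abs(x[:, None] - x[None, :]) ** 2     # negative squared distance
    np.fill_diagonal(S, np.median(S))             # assumed preference value
    R = np.zeros((n, n))
    A = np.zeros((n, n))
    rows = np.arange(n)
    for _ in range(n_iter):
        # Responsibility: R(i,k) = S(i,k) - max_{k' != k} [A(i,k') + S(i,k')]
        AS = A + S
        best = AS.argmax(axis=1)
        first = AS[rows, best]
        AS[rows, best] = -np.inf
        second = AS.max(axis=1)
        R_new = S - first[:, None]
        R_new[rows, best] = S[rows, best] - second
        R = damping * R + (1 - damping) * R_new
        # Availability: A(i,k) = min(0, R(k,k) + sum_{i' not in {i,k}} max(0, R(i',k)))
        Rp = np.maximum(R, 0)
        np.fill_diagonal(Rp, R.diagonal())
        A_new = Rp.sum(axis=0)[None, :] - Rp
        self_avail = A_new.diagonal().copy()      # A(k,k) is not clipped at zero
        A_new = np.minimum(A_new, 0)
        np.fill_diagonal(A_new, self_avail)
        A = damping * A + (1 - damping) * A_new
    return (A + R).argmax(axis=1)

rng = np.random.default_rng(3)                    # hypothetical seed
ideal = np.array([1 + 1j, -1 + 1j, -1 - 1j, 1 - 1j]) / np.sqrt(2)
x = np.repeat(ideal, 10) + 0.05 * (rng.standard_normal(40)
                                   + 1j * rng.standard_normal(40))
exemplar_of = affinity_propagation(x)
n_clusters = len(np.unique(exemplar_of))
```

Note that the number of clusters is not supplied to the routine; it emerges from the preference value, which is why the choice of preference matters in practice.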

## 5. Experimental Setup and Performance of Machine Learning Algorithm in CO-OFDM

Q-factor (= $20\log_{10}\left[\sqrt{2}\,\mathrm{erfc}^{-1}(2\,\mathrm{BER})\right]$) measurements, averaged over 10 recorded traces (~$10^{6}$ bits), by error counting (hard-decision decoding, HDD).
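The conversion from a counted BER to a Q-factor in dB can be sketched as below. The sketch uses the identity $\sqrt{2}\,\mathrm{erfc}^{-1}(2\,\mathrm{BER}) = -\Phi^{-1}(\mathrm{BER})$, where $\Phi^{-1}$ is the inverse standard normal CDF, so that only the Python standard library is needed; the function names and the example error count are illustrative.

```python
from math import log10
from statistics import NormalDist

def ber_from_counts(n_errors, n_bits):
    """Bit error rate obtained by direct error counting."""
    return n_errors / n_bits

def q_factor_db(ber):
    """Q-factor in dB, Q = 20*log10(sqrt(2)*erfcinv(2*BER)), for 0 < BER < 0.5."""
    q_linear = -NormalDist().inv_cdf(ber)   # equals sqrt(2) * erfcinv(2 * ber)
    return 20.0 * log10(q_linear)

ber = ber_from_counts(1000, 10**6)          # e.g. 1000 errors in ~10^6 bits
q_db = q_factor_db(ber)                     # roughly 9.8 dB for BER = 1e-3
```

Because the measurement relies on error counting over about $10^{6}$ bits, BER values much below $10^{-5}$ would rest on very few counted errors.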

## 6. Complexity Analysis

#### 6.1. Complexity Analysis of Digital Back-Propagation (DBP) and Inverse-Volterra Series-Transfer Function (IVSTF)-Based Non-Linear Equalizations (NLEs)

$N_{block} = K \cdot N_{signal}$, since the data have to be oversampled with an oversampling constant K in order to account for the out-of-band non-linear components. When $N_{block}$ is a power of two, the split-radix algorithm is the implementation showing the lowest complexity [13], requiring the number of real-valued floating-point operations (FLOPs) given by (1):

#### 6.1.1. Complexity of NLEs Based on Digital Back-Propagation

$N_{linear}$ and $N_{non-linear}$ are given by $8N_{block}\log_{2}N_{block} - 6N_{block} + 16$ and $18N_{block}$, respectively. On the other hand, the number of DBP steps, assuming uniform step length, is:
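A minimal sketch of this operation count follows. It assumes that the number of steps is the link length divided by the spatial step Δd, and that each step costs one linear stage plus one non-linear stage. It also assumes the standard split-radix FFT count of $4N\log_2 N - 6N + 8$ real operations, which makes $N_{linear}$ equal to two FFTs plus $N_{block}$ complex multiplications of 6 FLOPs each. The values $N_{block} = 2048$ and Δd = 2.5 km are hypothetical choices, not values stated in the text.

```python
from math import log2

def split_radix_fft_flops(n_block):
    """Real-valued FLOPs of a split-radix FFT (n_block must be a power of two)."""
    return 4 * n_block * int(log2(n_block)) - 6 * n_block + 8

def dbp_linear_flops(n_block):
    """One linear stage: FFT + IFFT + n_block complex multiplications (6 FLOPs each)."""
    return 2 * split_radix_fft_flops(n_block) + 6 * n_block

def dbp_nonlinear_flops(n_block):
    """One non-linear stage, as given in the text."""
    return 18 * n_block

def dbp_total_flops(n_block, n_span, l_span_km, step_km):
    """Assumed total: (number of steps) x (linear + non-linear stage cost)."""
    n_steps = round(n_span * l_span_km / step_km)
    return n_steps * (dbp_linear_flops(n_block) + dbp_nonlinear_flops(n_block))

# Hypothetical parameters: N_block = 2048, 100 km spans, 2.5 km spatial step
flops_2000_km = dbp_total_flops(2048, 20, 100, 2.5)
flops_3200_km = dbp_total_flops(2048, 32, 100, 2.5)
```

With these assumed parameters the sketch gives 163852800 and 262164480 FLOPs, which coincide with the DBP entries of Table 3 for 2000 km and 3200 km; the parameter values themselves remain an inference.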

#### 6.1.2. Complexity of NLEs Based on Inverse Volterra Series Transfer Function (IVSTF)

$N_{span}$, and therefore, the total number of FLOPs can be calculated as:

$N_{linear}$ is given by:

$N_{prod}$ is the number of operations for linear equalization, which is given by $6N_{block}$, as it requires $N_{block}$ complex multiplications (6 FLOPs each). The number of FLOPs for the non-linear compensation block, on the other hand, is:

$N_{block}$ data and, consequently, also requires $6N_{block}$. Consequently, the total number of operations for the IVSTF-based NLE is given by:

#### 6.2. Complexity Analysis of ANN and SVM-Based NLEs

($N_{SC}$) and the number of bits coded in each subcarrier ($M$).

#### 6.2.1. Complexity of ANN

#### 6.2.2. Complexity of SVM

$N_w$-order filter operating on data blocks of $N_s$ samples. The complexity of each iteration within the ML-RLS estimation can be calculated step by step. The first step, the computation of $a_i$, requires $8N_w + 3$ operations. The second step, where $w_s$ is calculated using the least-square method, requires $\frac{64}{3}N_s^{3} + 18N_s^{2}$ operations. The updating of the w vector, carried out in the third step, depends on the particular implementation. For the supervised SVM, the number of operations is $4N_w + 6N_s - 1$, whereas for the blind implementations based on Sato’s and Godard’s cost functions, the number of operations is $3N_s^{2} + 2N_s$ and $(3p + 2)N_s^{2} + (p + 2)N_s$ (where p represents the power of the norm), respectively. Steps four and five a priori do not affect the FLOP count, since they do not require any extra arithmetic manipulation. It is important to note that in all the studied cases the computational complexity is $O(N_s^{3})$, with the least-square calculation being the limiting stage. The computational complexity of the cost function, on the other hand, is $O(N_s)$ for non-blind equalization and $O(N_s^{2})$ for both Sato’s and Godard’s cost functions; consequently, blind equalization does not entail a significant increase in computational cost compared to the supervised approach.
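The per-iteration counts quoted above can be collected into a small helper to compare the three variants. This is an illustrative sketch: the function name, the `mode` strings and the example sizes are assumptions, and steps four and five are taken to cost nothing, as stated in the text.

```python
def svm_iteration_flops(n_w, n_s, mode="supervised", p=2):
    """FLOPs of one ML-RLS iteration for a filter of order n_w on n_s samples."""
    step1 = 8 * n_w + 3                          # computation of a_i
    step2 = 64 * n_s**3 / 3 + 18 * n_s**2        # least-square solution for w_s
    if mode == "supervised":
        step3 = 4 * n_w + 6 * n_s - 1
    elif mode == "sato":
        step3 = 3 * n_s**2 + 2 * n_s
    elif mode == "godard":
        step3 = (3 * p + 2) * n_s**2 + (p + 2) * n_s
    else:
        raise ValueError(f"unknown mode: {mode}")
    return step1 + step2 + step3                 # steps 4 and 5 add no FLOPs

# Illustrative sizes only
for mode in ("supervised", "sato", "godard"):
    print(mode, svm_iteration_flops(n_w=2, n_s=3, mode=mode))
```

For any realistic block size the cubic least-square term dominates all three totals, which is why the blind variants add comparatively little cost.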

#### 6.3. Complexity of Clustering Algorithms

#### 6.3.1. K-means

$2^{M}$), d is the dimensionality (in our case, 2), and $i = 2^{\Omega(\sqrt{n})}$ is the number of required iterations.

#### 6.3.2. Affinity Propagation

^{2}), where i, k and n are the numbers of iterations, clusters, and elements, respectively. While the numbers of clusters and elements are trivial to determine, the number of iterations i is difficult to predict because of the complexity of the algorithm and its interplay with the dispersion structure of the data.

#### 6.4. Impact of High-Order Modulation Format Levels on Computational Complexity

## 7. Conclusions

## Funding

## Conflicts of Interest

## References

1. Winzer, P.J. Scaling optical fiber networks: Challenges and solutions. Opt. Photonics News **2015**, 26, 28–35.
2. Cisco Visual Networking Index: Forecast and Methodology, 2014–2019; CISCO: San Jose, CA, USA, 2015.
3. Mitra, P.P.; Stark, J.B. Nonlinear limits to the information capacity of optical fiber communications. Nature **2001**, 411, 1027–1030.
4. Agrawal, G.P. Nonlinear Fiber Optics, 3rd ed.; Academic Press: San Diego, CA, USA, 2001; ISBN 0-12-045143-3.
5. Temprana, E.; Myslivets, E.; Kuo, B.P.; Liu, L.; Ataie, V.; Alic, N.; Radic, S. Overcoming Kerr-induced capacity limit in optical fiber transmission. Science **2015**, 348, 1445–1448.
6. Behrens, C. Mitigation of Nonlinear Impairments for Advance Optical Modulation Formats. Ph.D. Thesis, Department of Electronic and Electrical Engineering, University College London, London, UK, 2012.
7. Ellis, A.D.; McCarthy, M.E.; Al Khateeb, M.A.; Sorokina, M.; Doran, N.J. Performance limits in optical communications due to fiber nonlinearity. Adv. Opt. Photonics **2017**, 9, 429–503.
8. Shieh, W.; Athaudage, C. Coherent optical orthogonal frequency division multiplexing. Electr. Lett. **2006**, 42, 587–589.
9. Morshed, M.; Du, L.B.; Lowery, A.J. Mid-Span Spectral Inversion for Coherent Optical OFDM Systems: Fundamental Limits to Performance. J. Lightw. Technol. **2013**, 31, 58–66.
10. Le, S.T.; McCarthy, M.E.; Mac Suibhne, N.; Al-Khateeb, M.A.; Giacoumidis, E.; Doran, N.; Ellis, A.D.; Turitsyn, S.K. Demonstration of Phase-conjugated Subcarrier Coding for Fiber Nonlinearity Compensation in CO-OFDM Transmission. J. Lightw. Technol. **2015**, 33, 2206–2212.
11. Gao, G.; Zhang, J.; Gu, W. Analytical Evaluation of Practical DBP-Based Intra-Channel Nonlinearity Compensators. Photonics Technol. Lett. **2013**, 25, 717–720.
12. Song, M.; Pincemin, E.; Vgenopoulou, V.; Roudas, I.; Amhoud, E.M.; Jaouën, Y. Transmission performances of 400 Gbps coherent 16-QAM multi-band OFDM adopting nonlinear mitigation techniques. In Proceedings of the 2015 Tyrrhenian International Workshop on Digital Communications TIWDC, Florence, Italy, 22 September 2015; pp. 46–48.
13. Giacoumidis, E.; Aldaya, I.; Jarajreh, M.A.; Tsokanos, A.; Le, S.T.; Farjady, F.; Jaouën, Y.; Ellis, A.D.; Doran, N.J. Volterra-Based Reconfigurable Nonlinear Equalizer for Coherent OFDM. Photonics Technol. Lett. **2014**, 26, 1383–1386.
14. Yu, Y.; Zhao, J. Modified phase-conjugate twin wave schemes for fiber nonlinearity mitigation. Opt. Exp. **2015**, 23, 30399–30413.
15. Yoshida, T.; Sugihara, T.; Ishida, K.; Mizuochi, T. Spectrally-efficient Dual Phase-Conjugate Twin Waves with Orthogonally Multiplexed Quadrature Pulse-shaped Signals. In Proceedings of the Optical Fiber Communication Conference (OFC), San Francisco, CA, USA, 9–13 March 2014.
16. Egmont-Petersen, M.; de Ridder, D.; Handels, H. Image processing with neural networks—A review. Pattern Recognit. **2002**, 35, 2279–2301.
17. Ye, H.; Li, G.Y.; Juang, B.-H. Power of Deep Learning for Channel Estimation and Signal Detection in OFDM Systems. Wirel. Commun. Lett. **2018**, 7, 114–118.
18. Zibar, D.; Wymeersch, H.; Lyubomirsky, I. Machine learning under the spotlight. Nat. Photonics **2017**, 11, 751.
19. Argyris, A.; Bueno, J.; Fischer, I. Photonic machine learning implementation for signal recovery in optical communications. Sci. Rep. **2018**, 8, 8487.
20. Jarajreh, M.A.; Giacoumidis, E.; Aldaya, I.; Le, S.T.; Tsokanos, A.; Ghassemlooy, Z.; Doran, N.J. Artificial Neural Network Nonlinear Equalizer for Coherent Optical OFDM. Photonics Technol. Lett. **2015**, 27, 387–390.
21. Giacoumidis, E.; Le, S.T.; Ghanbarisabagh, M.; McCarthy, M.; Aldaya, I.; Mhatli, S.; Jarajreh, M.A.; Haigh, P.A.; Doran, N.J.; Ellis, A.D.; et al. Fiber Nonlinearity-Induced Penalty Reduction in Coherent Optical OFDM by Artificial Neural Network based Nonlinear Equalization. Opt. Lett. **2015**, 40, 5113–5116.
22. Giacoumidis, E.; Mhatli, S.; Wei, J.; Le, S.T.; Aldaya, I.; Stephens, M.F.; McCarthy, M.E.; Ellis, A.D.; Doran, N.J.; Eggleton, B.J. Intra and inter-channel nonlinearity compensation in WDM coherent optical OFDM using artificial neural network based nonlinear equalization. In Proceedings of the Optical Fiber Communications Conference and Exhibition (OFC), Los Angeles, CA, USA, 19–23 March 2017.
23. Koike-Akino, T.; Millar, D.S.; Parsons, K.; Kojima, K. Nonlinearity Equalization with Multi-Label Deep Learning Scalable to High-Order DP-QAM. In Proceedings of the Signal Processing in Photonic Communications (SPPCom), Zurich, Switzerland, 2–5 July 2018.
24. Kaur, G.; Kaur, G. Performance analysis of Wilcoxon-based machine learning nonlinear equalizers for coherent optical OFDM. Opt. Quant. Electr. **2018**, 50, 256.
25. Kaur, G.; Kaur, G. Application of functional link artificial neural network for mitigating nonlinear effects in coherent optical OFDM. Opt. Quant. Electr. **2017**, 49, 227.
26. Ahmad, S.T.; Kumar, K.P. Radial Basis Function Neural Network Nonlinear Equalizer for 16-QAM Coherent Optical OFDM. Photonics Technol. Lett. **2016**, 28, 2507–2510.
27. Nguyen, T.; Mhatli, S.; Giacoumidis, E.; Van Compernolle, L.; Wuilpart, M.; Mégret, P. Fiber nonlinearity equalizer based on support vector classification for coherent optical OFDM. Photonics J. **2016**, 8, 1–9.
28. Giacoumidis, E.; Mhatli, S.; Nguyen, T.; Le, S.T.; Aldaya, I.; McCarthy, M.E.; Ellis, A.D.; Eggleton, B.J. Comparison of DSP-based nonlinear equalizers for intra-channel nonlinearity compensation in coherent optical OFDM. Opt. Lett. **2016**, 41, 2509–2512.
29. Giacoumidis, E.; Mhatli, S.; Stephens, M.F.; Tsokanos, A.; Wei, J.; McCarthy, M.E.; Doran, N.J.; Ellis, A.D. Reduction of Nonlinear Inter-Subcarrier Intermixing in Coherent Optical OFDM by a Fast Newton-based Support Vector Machine Nonlinear Equalizer. J. Lightw. Technol. **2017**, 35, 2391–2397.
30. Giacoumidis, E.; Le, S.T.; MacCarthy, M.E.; Ellis, A.D.; Eggleton, B.J. Record Intrachannel Nonlinearity Reduction in 40-Gb/s 16QAM Coherent Optical OFDM using Support Vector Machine based Equalization. In Proceedings of the ANZCOP/ACOFT, Adelaide, Australia, 29 November–3 December 2015.
31. Giacoumidis, E.; Mhatli, S.; Le, S.T.; Aldaya, I.; McCarthy, M.E.; Ellis, A.D.; Eggleton, B.J. Nonlinear Blind Equalization for 16-QAM Coherent Optical OFDM using Support Vector Machines. In Proceedings of the ECOC, Düsseldorf, Germany, 18–22 September 2016; p. Th.2.P2.
32. Mhatli, S.; Mrabet, H.; Dayoub, I.; Giacoumidis, E. A novel SVM robust model Based Electrical Equalizer for CO-OFDM Systems. IET Commun. **2017**, 11, 1091–1096.
33. Giacoumidis, E.; Tsokanos, A.; Ghanbarisabagh, M.; Mhatli, S.; Barry, L.P. Unsupervised Support Vector Machines for Nonlinear Blind Equalization in CO-OFDM. Photonics Technol. Lett. **2018**, 30, 1091–1094.
34. Jarajreh, M.A. Compensation of filter cascading effects and non-linearities in flexible multi-carrier-based optical networks using a complex-kernel-based support vector machine. IET Commun. **2018**, 12, 1737–1742.
35. Giacoumidis, E.; Matin, A.; Wei, J.; Doran, N.J.; Barry, L.P.; Wang, X. Blind Nonlinearity Equalization by Machine Learning based Clustering for Single- and Multi-Channel Coherent Optical OFDM. J. Lightw. Technol. **2018**, 36, 721–727.
36. Giacoumidis, E.; Aldaya, I.; Wei, J.L.; Sanchez, C.; Mrabet, H.; Barry, L.P. Affinity propagation clustering for blind nonlinearity compensation in coherent optical OFDM. In Proceedings of the CLEO, San Jose, CA, USA, 13–18 May 2018.
37. Ellis, A.D.; Al Khateeb, M.A.Z.; McCarthy, M.E. Impact of Optical Phase Conjugation on the Nonlinear Shannon Limit. Opt. Exp. **2017**, 35, 792–798.
38. Ellis, A.D.; McCarthy, M.E.; Al-Khateeb, M.A.Z.; Sygletos, S. Capacity limits of systems employing multiple optical phase conjugators. Opt. Exp. **2015**, 23, 20381–20393.
39. Phillips, I.; Tan, M.; Stephens, M.F.; McCarthy, M.; Giacoumidis, E.; Sygletos, S.; Rosa, P.; Fabbri, S.; Le, S.T.; Kanesan, T.; et al. Exceeding the Nonlinear-Shannon Limit using Raman Laser Based Amplification and Optical Phase Conjugation. In Proceedings of the Optical Fiber Communication Conference (OFC), San Francisco, CA, USA, 9–13 March 2014.
40. Sanchez, C.; Mccarthy, M.; Ellis, A.D.; Wright, P.; Lord, A. Optical-phase conjugation nonlinearity compensation in Flexi-Grid optical networks. In Proceedings of the DNCOCO, Budapest, Hungary, 12–14 December 2015; pp. 39–43.
41. Liu, X.; Chraplyvy, A.R.; Winzer, P.J.; Tkach, R.W.; Chandrasekhar, S. Phase-conjugated twin waves for communication beyond the Kerr nonlinearity limit. Nat. Photonics **2013**, 7, 560–568.
42. Le, S.T.; McCarthy, M.E.; Mac Suibhne, N.; Ellis, A.D.; Turitsyn, S.K. Phase-Conjugated Pilots for Fiber Nonlinearity Compensation in CO-OFDM Transmission. J. Lightw. Technol. **2015**, 33, 1308–1314.
43. Czegledi, C.B.; Liga, G.; Lavery, D.; Karlsson, M.; Agrell, E.; Savory, S.J.; Bayvel, P. Digital backpropagation accounting for polarization-mode dispersion. Opt. Exp. **2017**, 25, 1903–1915.
44. Irukulapati, N.V.; Wymeersch, H.; Johannisson, P.; Agrell, E. Stochastic digital backpropagation. Trans. Commun. **2014**, 62, 3956–3968.
45. Vgenopoulou, V.; Erkilinc, M.S.; Killey, R.I.; Jaouën, Y.; Roudas, I.; Tomkos, I. Comparison of Multi-Channel Nonlinear Equalization using Inverse Volterra Series versus Digital Backpropagation in 400 Gb/s Coherent Superchannel. In Proceedings of the 42nd European Conference on Optical Communication (ECOC), Dusseldorf, Germany, 18–22 September 2016.
46. Matsumoto, M.; Nishimura, T. Mersenne Twister: A 623-Dimensionally Equidistributed Uniform Pseudorandom Number Generator. ACM Trans. Model. Comput. Simul. **1998**, 8, 3–30.
47. Eriksson, T.A.; Buelow, H.; Leven, A. Applying Neural Networks in Optical Communication Systems: Possible Pitfalls. Photonics Technol. Lett. **2017**, 29, 2091–2094.
48. Mateo, E.; Zhu, Z.; Li, G. Impact of XPM and FWM on the digital implementation of impairment compensation for WDM transmission using backward propagation. Opt. Exp. **2008**, 16, 16124–16137.

**Figure 1.** Digital back-propagation (DBP) conceptual diagram [11].

**Figure 2.** Inverse-Volterra series-transfer function (IVSTF) block diagram for coherent optical orthogonal frequency division multiplexing (CO-OFDM), where m is the number of spans in a long-haul network and k is the kernel order [13]. (I)FFT: (Inverse) fast Fourier transform.

**Figure 7.** Conceptual dendrogram for fuzzy-logic (FL) [35].

**Figure 8.** Affinity propagation (AP) clustering procedure (e.g., quaternary phase-shift keying (QPSK)) [36].

**Figure 9.** Experimental setup for multi- (upper) and single-channel (lower) CO-OFDM incorporating Volterra-non-linear equalization (V-NLE) and FS-DBP in time domain and the machine learning algorithms in frequency domain (after the FFT in CO-OFDM) [21,22,28,31,33,35,36]. ECL: external cavity laser, AWG: arbitrary waveform generator, AOM: acousto-optic modulator, EDFA: Erbium-doped fiber amplifier, GFF: gain-flattening filter, LO: local oscillator, DFB: distributed feedback laser, ASE: amplified spontaneous emission, PMM: polarization maintaining multiplexer, WSS: wavelength selective switch, BPF: bandpass filter, OSA: optical spectrum analyzer. Inset: received optical spectrum for all channels, highlighting the 5 middle channels.

**Figure 11.** WDM QPSK CO-OFDM results [22,28,29,33,35,36]. (**a**) Q-factor vs. launched optical power (LOP) per channel among clustering and deterministic algorithms. (**b**) Middle subcarriers Q-factor distribution at −5 dBm of LOP (optimum). (**c**) Q-factor vs. LOP for unsupervised SVMs. (**d**) Performance comparison between supervised support vector machine-regression (SVR) and artificial neural network (ANN). (**e**) Performance comparison between Fast-Newton-SVM (F-SVM) and benchmark clustering/deterministic algorithms. (**f**) Received constellation diagrams for affinity propagation (AP) and K-means at −7 dBm of LOP. LE: linear equalization; NLE: nonlinear equalization; V-NLE/IVSTF: inverse Volterra-series transfer function NLE; BNLE: blind NLE; FS-DBP: full-step digital back-propagation; FC/FLC: Fuzzy-logic clustering.

Parameter | Value |
---|---|
Net bit-rate | 18.2 Gb/s (WDM), 40 Gb/s (1-ch.) |
Net bit-rate for ANN | 16.8 Gb/s (WDM), 38 Gb/s (1-ch.) |
Raw bit-rate | 20 Gb/s (WDM), 46 Gb/s (1-ch.) |
Format of modulation | QPSK (WDM), 16-QAM (1-ch.) |
Number of symbols | 400 |
Symbol time duration | 20.48 ns |
Generated subcarriers | 210 |
CP | 2% |
Size of FFT & inverse (I)FFT | 512 |
ANN training overhead | 10% |
ANN training symbol length | 40 symbols |
Local oscillator linewidth | 100 kHz |
OH-LITE fiber attenuation | 18.9–19.5 dB/100 km |
Number of spans | 30 (WDM), 20 (1-ch.) |
Length per span | 100 km |
Center wavelength | 1550.2 nm |

Link parameter symbol | Definition | Signal parameter symbol | Definition |
---|---|---|---|
N_{span} | Number of spans | N_{SC} | Subcarrier number |
L_{span} | Length per span | K | Oversampling factor |
Δd | Spatial step | M | No. bits per subcarrier |

**Table 3.** Computational complexity comparison between full-step DBP, IVSTF and ANN for different modulation format order (M) and transmission distances.

Deterministic Technique | System A (2000 km) | System B (3200 km) |
---|---|---|
DBP | 163852800 (1.6 × 10^{8}) | 262164480 (2.6 × 10^{8}) |
IVSTF | 1151312 (1.2 × 10^{6}) | 1839632 (1.8 × 10^{6}) |
ANN (M = 4) | 5040 (5.0 × 10^{3}) | |
ANN (M = 16) | 100800 (1.0 × 10^{5}) | |
ANN (M = 64) | 1693440 (1.7 × 10^{6}) | |
ANN (M = 128) | 6827520 (6.8 × 10^{6}) | |

© 2018 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

## Share and Cite

**MDPI and ACS Style**

Giacoumidis, E.; Lin, Y.; Wei, J.; Aldaya, I.; Tsokanos, A.; Barry, L.P.
Harnessing machine learning for fiber-induced nonlinearity mitigation in long-haul coherent optical OFDM. *Future Internet* **2019**, *11*, 2.
https://doi.org/10.3390/fi11010002
