Low Complexity DSP for High Speed Optical Access Networking †

Abstract: A novel low-cost and energy-efficient approach for delivering 40 Gb/s signals is proposed for cost-sensitive optical access networks. The proposed design comprises an innovative low-complexity, high-performance digital signal processing (DSP) architecture for four-level pulse amplitude modulation (PAM-4), reuses existing commercial cost-effective 10-G components and eliminates the need for a power-hungry radio frequency (RF) component in the transmitter. Using a multi-functional 17-tap reconfigurable adaptive Volterra-based nonlinear equalizer with noise suppression, a significant improvement in receiver optical power sensitivity is achieved. Results show that over 30 km of single-mode fiber (SMF) a link power budget of 33 dB is feasible at a bit-error-rate (BER) threshold of 10^-3.


Introduction
Today's fastest standardized passive optical network (PON) is the 25 Gb/s per lane Ethernet PON [1]. Driven by bandwidth-hungry applications such as high-definition video services and virtual reality, a higher lane rate PON is under study by the Telecommunication Standardization Sector of the International Telecommunication Union (ITU-T) [2]. An alternative approach is to increase the lane count by ultra-dense wavelength division multiplexing (UWDM) with central office consolidation and extended geographic coverage. On this topic, the Institute of Electrical and Electronics Engineers (IEEE) 802.3cs super PON standard under development supports a 50 km reach and a split ratio larger than 1:64, but requires a wavelength router in the field. As the attractiveness of PONs is very sensitive to cost and power consumption, it is highly desirable that high speed PONs leverage the optical component ecosystem of today's mature 10 Gb/s PONs, a direction in which the research community has made intensive efforts [3–14]. Modulation schemes with higher spectral efficiency and advanced digital signal processing (DSP) must be employed to achieve such targets. Examples include chirp managed non-return-to-zero (NRZ) [3], electrical/optical Duobinary [4–7], four-level pulse amplitude modulation (PAM-4) [3,7–10,13] and PAMs with even more amplitude levels [15], as well as more complex multi-carrier [5,9,11] or multi-band modulation schemes [12].
Notably, PAM-4 has been chosen as the format for high speed Ethernet standards [16] for data center (DC) applications due to its implementation simplicity, and high-volume PAM-4 chips are commercially available. However, at bit rates above 25 Gb/s, implementing PAM-4 with simple linear equalizers [7,8,17,18] to support a transmission distance of 20 km over a single-mode fiber (SMF), as required in access networks, using only 10-G optoelectronics [7] is extremely challenging unless higher bandwidth transmitters or receivers (e.g., 25 G photodiodes [8]) or a dedicated chromatic dispersion compensation module [4] are adopted. It has been shown that reach extension is possible by mitigating system and component nonlinearities using DSP without changing the optical infrastructure [18,19].
Alternatively, several research studies propose coherent-lite technologies [20–23]. Coherent reception enables a much higher transmission capacity as well as longer transmission distances but requires a larger number of devices. The work presented in this paper is limited to simple direct detection (DD) applications using PAM-4 in the C-band. Although DD PAM-4 systems with improved capacity have been demonstrated previously [24,25], they required sophisticated neural network (NN) nonlinear equalization [24] or operation in the O-band, where chromatic dispersion is much smaller [25]. Moreover, none of the previous DD PAM-4 demonstrations considered a simpler nonlinear equalization architecture with noise suppression.
This work proposes a novel 40 Gb/s lane rate PON architecture that combines a high-performance DSP unit with 10-G optoelectronics and, at the same time, eliminates the need for a power-hungry radio frequency (RF) driver in the transmitter. Thanks to the adoption of shared bi-directional optical amplifiers (OAs) at the optical line terminal (OLT) site, as standardized in the 10-G lane PON, the optical launch power can be kept high enough to guarantee a high optical power budget. The technical challenge is that the nonlinearity caused by signal-signal beating upon square-law detection must be tackled and traded off against the carrier-to-signal power ratio (CSPR) of the modulation scheme. Our DSP approach uses a nonlinear adaptive equalizer with zero overhead at the receiver side to tackle this nonlinearity. The nonlinear DSP can be further simplified and flexibly reconfigured, guaranteeing a low-cost and low-power receiver. Such a PON is capable of full C-band tunability with 100 GHz channel spacing, which positions it as an attractive solution for 40 Gb/s lane rate PONs and enables PON links with >1.6 Tb/s link capacity (e.g., by using 48 WDM channels of 40 Gb/s transceivers over the whole C-band). Here, experimental demonstrations were performed using a single-channel 40 Gb/s PAM-4 signal transmitted over 20 km (30 km) of SMF with a link power budget of 38 dB (30.7 dB) at a pre-forward error correction (FEC) bit-error-rate (BER) of 10^-3. In addition, when a simpler KP-4 FEC with a BER threshold of 2.2 × 10^-4 was considered, a link power budget of 33 dB (27 dB) was achieved after 20 km (30 km) of SMF transmission.

Appl. Sci. 2021, 11, 3406

Experimental Setup
Figure 1a illustrates the experimental setup and Figure 1b depicts the detailed architecture of the nonlinear equalizer. In the transmitter, an offline PAM-4 signal is generated after a 2-tap symbol-spaced digital pre-distortion and Nyquist shaping with a roll-off factor of 0.45. The generated signal is loaded into a 16 GHz, 80 GS/s digital-to-analog converter (DAC) with an output swing of approximately 250 mV. Without any driver amplifier, the DAC output directly drives a 10-G tunable transmitter assembly (TTA) consisting of a tunable laser operating at 1543.3 nm and a differentially driven InP Mach-Zehnder intensity modulator (MZM) with a bandwidth of approximately 11 GHz [8]. Following the MZM, a tunable optical band-pass filter (OBPF) with a 3 dB bandwidth of approximately 40 GHz and a 2nd-order super-Gaussian profile mimics an optical multiplexer, and its output is amplified by a booster Erbium-doped fiber amplifier (EDFA). A variable optical attenuator (VOA) at the output of the EDFA optimizes the optical power launched into the fiber link so that the lowest BER is always obtained.
After transmission over an SMF link of up to 30 km, the received optical signal is detected by a combined optical receiver consisting of a VOA, a pre-amplifier EDFA, a 40-GHz bandwidth OBPF that mimics an optical de-multiplexer and is configured identically to the transmitter OBPF, and a 10-G p-i-n photodiode-transimpedance amplifier (PIN-TIA). The PIN-TIA has a bandwidth of roughly 11 GHz and its input power is adjusted and optimized by a preceding VOA. The detected PAM-4 signal is then converted into a digital signal by an analog-to-digital converter (ADC) sampling at 80 GS/s, and the subsequent DSP is conducted offline.
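The offline signal generation described for the transmitter can be sketched as follows. This is a minimal illustration, not the authors' code: the Gray bit-to-level mapping, the filter span and the pre-distortion coefficients are assumptions, since the text gives only the roll-off factor (0.45), the 2-tap symbol-spaced structure and the DAC rate. At 40 Gb/s with 2 bits per symbol the symbol rate is 20 GBaud, so an 80 GS/s DAC corresponds to 4 samples per symbol.

```python
import numpy as np

def raised_cosine(beta, sps, span):
    """Raised-cosine impulse response, `sps` samples per symbol, +/- `span` symbols."""
    t = np.arange(-span * sps, span * sps + 1) / sps   # time in symbol periods
    num = np.sinc(t) * np.cos(np.pi * beta * t)
    den = 1.0 - (2.0 * beta * t) ** 2
    h = np.empty_like(t)
    ok = ~np.isclose(den, 0.0)
    h[ok] = num[ok] / den[ok]
    # Limit of the 0/0 form at t = +/- 1/(2*beta)
    h[~ok] = (np.pi / 4.0) * np.sinc(1.0 / (2.0 * beta))
    return h

def pam4_waveform(bits, sps=4, beta=0.45, span=16, pre_taps=(1.0, 0.0)):
    """Map bits to PAM-4, apply a 2-tap symbol-spaced FIR, then Nyquist shaping.

    pre_taps=(1.0, 0.0) is a placeholder (no pre-distortion); the actual
    coefficients are not stated in the text.
    """
    pairs = np.asarray(bits, dtype=int).reshape(-1, 2)
    levels = np.array([-3.0, -1.0, 3.0, 1.0])          # index = 2*b0 + b1 (Gray order, assumed)
    symbols = levels[2 * pairs[:, 0] + pairs[:, 1]]
    pre = np.convolve(symbols, pre_taps)[: len(symbols)]
    up = np.zeros(len(pre) * sps)
    up[::sps] = pre                                     # zero-stuffing upsampler
    wave = np.convolve(up, raised_cosine(beta, sps, span), mode="same")
    return symbols, wave

rng = np.random.default_rng(0)
symbols, wave = pam4_waveform(rng.integers(0, 2, 512))
```

Because a raised-cosine pulse is zero at every non-zero integer symbol offset, sampling `wave` once per symbol returns the (pre-distorted) symbols without inter-symbol interference; the eye closure discussed later arises from the band-limited components and fiber dispersion, not from the shaping itself.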
The detailed process of the offline receiver DSP is also presented in Figure 1. The clock recovery obtains the optimum sampling phase and a re-sampler then down-samples the signal to 2 samples per symbol. A blind nonlinear feedforward and decision feedback equalizer (FFE/DFE), together with a noise suppression (NS) filter, is then adopted to combat both linear and nonlinear effects from the low-bandwidth components and the fiber link. After symbol-to-bit mapping, the BER prior to forward error correction (FEC) is obtained from a one-to-one comparison between the transmitted and the recovered bits. The adaptive nonlinear equalizer (NLE), shown in Figure 1b, is the core of the receiver DSP. The proposed reconfigurable NLE can combat both linear and nonlinear distortions as well as suppress the signal noise. The DSP is operated in a completely blind manner with zero overhead. As indicated in Figure 1b, the nonlinear equalizer consists of a Volterra FFE, a DFE and an NS filter to recover the symbols.
The coefficients are optimized using the least mean square (LMS) criterion, expressed as Equation (1), where M, N, and H are the FFE, DFE and NS filter orders, and y and y' represent the FFE input and output, respectively. ŷ' is obtained from a hard decision of y'. Only the linear and 2nd-order Volterra kernels are considered to maintain a low level of complexity. a_i (b_j) and a_ij (b_ij) are the linear and nonlinear FFE (DFE) tap coefficients, respectively, and p_h is the NS filter tap coefficient. The idea of the NS filter is to predict the noise of the current FFE output sample from the past H noise samples; the predicted noise is then subtracted from the signal sample.
The 10-G EPON FEC with a 6.56 dB net coding gain and a threshold BER of 10^-3 is assumed here [26]. Considering the superior performance of the proposed NLE, a 5.84% overhead KP-4 RS(544, 514) FEC with a threshold BER of 2.2 × 10^-4, widely adopted in high speed Ethernet links, can also be used [27].
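For orientation, one plausible form of the equalizer relation is written out below from the symbol definitions in the text. It is a hedged reconstruction, not the authors' Equation (1): the index ranges, the sign conventions, the omission of squared terms in the DFE kernel (chosen only because it reproduces the quoted count of 10 DFE taps for N = 4), and the auxiliary symbol z(k) for the noise-suppressed output are all assumptions. The block assumes the amsmath package.

```latex
\begin{align}
y'(k) ={}& \sum_{i=0}^{M-1} a_i\, y(k-i)
         + \sum_{i=0}^{M-1}\sum_{j=i}^{M-1} a_{ij}\, y(k-i)\, y(k-j) \notag\\
       & + \sum_{j=1}^{N} b_j\, \hat{y}'(k-j)
         + \sum_{i=1}^{N-1}\sum_{j=i+1}^{N} b_{ij}\, \hat{y}'(k-i)\, \hat{y}'(k-j), \\
z(k) ={}& y'(k) - \sum_{h=1}^{H} p_h \left[\, y'(k-h) - \hat{y}'(k-h) \,\right].
\end{align}
```

Under these assumptions the FFE has M + M(M+1)/2 coefficients and the DFE has N + N(N-1)/2, while the bracketed term is the past noise sample that the NS filter uses to predict, and then remove, the noise on the current sample.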

Experimental Results at 20 km Transmission
The eye diagrams of the modulator output shown in Figure 2 exhibit strong eye closure and asymmetry, with smaller lower eyes. The degradation of the signal quality caused by the MZM is evident when this eye diagram is compared with that of the DAC output, which is much more open and has more uniform apertures, although a slight eye closure is observed due to the DAC bandwidth limitation. The eye diagrams of the detected signals, also shown in Figure 2, are completely closed regardless of the fiber transmission distance. This is a direct result of the joint effect of limited transceiver bandwidth and fiber dispersion.
Figure 3a identifies an optimum operating point that is far below the linear regime, which is verified by the BER versus modulator bias voltage curve presented in Figure 3b. Considering a small differential driving signal swing of about 500 mV, which is about one ninth of the Vπ of the MZM, a bias closer to the null point brings about a lower CSPR (namely, a higher signal to noise ratio (SNR) for a fixed optical power, because the optical carrier is suppressed) but, at the same time, stronger signal-signal beating interference upon square-law detection. The optimum bias gives the best trade-off between CSPR and beating interference. This leads to a low optical output power, which can be amplified by a booster EDFA. This configuration is critical to reducing the transceiver cost and power because it avoids an RF driver in each channel, whereas the EDFA cost can be shared by all channels.
Figure 4a presents the BER versus received optical power under various receiver DSP configurations for an SMF length of 20 km. With only a linear FFE with an identified optimum tap count of M = 8, the BER improves slowly with increasing optical power but an error floor appears at a BER level of 5 × 10^-5. Adding a DFE with an optimum tap count of N = 4 to the FFE enhances the performance, especially by lowering the error floor. Based on the optimum FFE and DFE tap counts, the 2nd-order nonlinear terms given in Equation (1) are additionally introduced, requiring a total of 44/10 FFE/DFE taps according to Equation (1). Nonlinear equalization markedly improves both the power sensitivity and the error floor. Further improvement is achieved by using an NS filter, and this improvement is more significant at lower BERs. Overall, a total optical power sensitivity improvement of approximately 2 dB (4 dB) is obtained at the 10-G EPON FEC (KP-4 FEC) threshold BER of 1 × 10^-3 (2.2 × 10^-4).
The improvement enabled by the receiver adaptive DSP is attributed, firstly, to the nonlinear equalization, which, as indicated in Figure 4b, effectively mitigates the signal-signal beating interference due to fiber dispersion and square-law detection, leading to approximately a 3 dB reduction in noise power. Secondly, the nonlinear response of the MZM near the null bias point can also be significantly compensated by nonlinear equalization. This is identified by comparing the histograms in Figure 4c of the equalized signals for the linear and nonlinear equalization cases under the same optical power: nonlinear equalization reduces the frequency of amplitudes falling in the threshold regions, leading to fewer errors in the subsequent hard decisions. Moreover, Figure 4c also indicates that the nonlinearly equalized PAM-4 signal has more amplitude space between adjacent levels. The NS filter not only significantly suppresses the noise distributed in the frequency range beyond 10 GHz but also whitens the noise power spectral density.
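The 44/10 FFE/DFE tap counts quoted for the nonlinear configuration can be reproduced by counting the linear terms plus the distinct 2nd-order products of a tap-delay window. The sketch below is illustrative only: the function name is hypothetical, and excluding the squared terms from the DFE kernel is an assumption made solely because it matches the stated count of 10 for N = 4; the text does not spell out the DFE kernel structure.

```python
from itertools import combinations, combinations_with_replacement

import numpy as np

def volterra_regressors(window, include_squares=True):
    """Return linear terms followed by 2nd-order products of a tap-delay window."""
    window = np.asarray(window, dtype=float)
    pick = combinations_with_replacement if include_squares else combinations
    second = [window[i] * window[j] for i, j in pick(range(len(window)), 2)]
    return np.concatenate([window, second])

# FFE: M = 8 linear taps plus all products with i <= j (squares included)
ffe = volterra_regressors(np.ones(8))
# DFE: N = 4 linear taps plus cross products with i < j (assumed, see lead-in)
dfe = volterra_regressors(np.ones(4), include_squares=False)
print(len(ffe), len(dfe))  # 44 10
```

The equalizer output is then the inner product of such a regressor vector with the corresponding coefficient vector, which is why the multiplier count, and hence the power dissipation, scales with these totals.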
It is also shown in Figure 4a that it is possible to significantly reduce the FFE/DFE and NS filter taps, and hence, the filter complexity, without sacrificing the performance of 20 km SMF transmission. This is achieved by disabling the taps whose values are below the average of the corresponding tap category.
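The pruning rule stated above can be sketched as follows. This is an assumption-laden illustration: "value" is interpreted as coefficient magnitude, the category names and tap values are hypothetical, and the function only produces keep/disable masks.

```python
import numpy as np

def prune_masks(taps_by_category):
    """For each tap category, keep taps whose magnitude is at least the category mean."""
    masks = {}
    for name, taps in taps_by_category.items():
        mag = np.abs(np.asarray(taps, dtype=float))
        masks[name] = mag >= mag.mean()
    return masks

# Hypothetical converged coefficients, for illustration only
example = {
    "ffe_linear": [0.9, 0.05, -0.4, 0.01],
    "ns": [0.3, -0.02, 0.01],
}
masks = prune_masks(example)
print({k: v.tolist() for k, v in masks.items()})
```

Applying the rule per category (linear FFE, nonlinear FFE, DFE, NS) keeps a small but important group of taps from being discarded just because another category has larger coefficients overall.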
The pre-amplified receiver could potentially be replaced by an avalanche photodiode (APD) in the downlink system for cost-efficiency reasons, but the corresponding sensitivity penalty will typically rise to approximately 5 dB in this case [7,8,12]. However, if dynamic biasing of the APD is used, it is expected that the sensitivity can be reasonably improved [28].

Simulation Results at 20 km Transmission
Due to the absence of a high-bandwidth APD in our laboratory, we performed simulations using the VPIphotonics Design Suite platform (VPItransmissionMaker) to predict the impact of a realistic APD in the proposed setup. We simulated two cases: the first was identical to the experimental setup, using the pre-amplifier EDFA and a PIN with a transimpedance amplifier (TIA), while the second replaced these components with a single APD + TIA. For simplicity, we used an 8-tap FFE in the DSP unit for comparison with the corresponding experimental measurements with a simplified DSP. A total of 2^20 symbols were used for effective BER Monte-Carlo simulation. The simulation parameters for the PAM-4 with the APD set-up were identical to [8], while the following additional parameters were considered: a single-polarization system was used with a standard SMF (SSMF) whose attenuation, chromatic dispersion, chromatic dispersion slope and Kerr nonlinear index were 0.2 dB/km, 16 ps/(km × nm), 0.08 ps/(km × nm^2) and 2.6 × 10^-20 m^2/W, respectively. For both EDFAs, the noise figure was set to 6 dB. For the PIN and APD, the thermal noise was set to 1 pA/√Hz, the dark current to 10 nA and the responsivity to 0.9 A/W. The avalanche multiplication factor of the APD was 2.5. The bandwidth limitation of the modulator and PIN/APD was simulated using 3rd-order Bessel low-pass filters, while the TIA was modelled with a 4th-order filter. 2nd-order Gaussian OBPFs were also simulated. PAM-4 parameters identical to those of the experiment were used, with a current noise spectral density of 10 nA/√Hz. The laser linewidth was set to 10 MHz and its relative intensity noise (RIN) to −130 dB/Hz at a measured power of 1 mW. The laser average power was set to 10 mW, the side mode separation to 200 GHz and the side mode suppression ratio to 100 dB. The MZM was assumed to be ideal, with an insertion loss of 6 dB and an extinction ratio of 30 dB.
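A quick back-of-the-envelope check of the fiber parameters helps explain why the detected eyes are closed. The sketch below computes the fiber loss, the accumulated dispersion and a first-order pulse spread estimate (dispersion × length × spectral width) for the quoted attenuation and dispersion values. The 20 GHz spectral width (set equal to the 20 GBaud symbol rate), the use of the experimental 1543.3 nm wavelength and the function name are assumptions for illustration; this is not part of the authors' simulation.

```python
C_M_PER_S = 299_792_458.0  # speed of light in vacuum

def fiber_figures(length_km, wavelength_nm=1543.3, atten_db_per_km=0.2,
                  disp_ps_per_nm_km=16.0, spectral_width_ghz=20.0):
    """Return (loss in dB, accumulated dispersion in ps/nm, pulse spread in ps)."""
    loss_db = atten_db_per_km * length_km
    acc_disp_ps_per_nm = disp_ps_per_nm_km * length_km
    lam_m = wavelength_nm * 1e-9
    # Convert spectral width from frequency to wavelength: d_lambda = lambda^2 * d_f / c
    dlam_nm = lam_m ** 2 * (spectral_width_ghz * 1e9) / C_M_PER_S * 1e9
    return loss_db, acc_disp_ps_per_nm, acc_disp_ps_per_nm * dlam_nm

loss_db, acc_disp, spread_ps = fiber_figures(20.0)
symbol_period_ps = 1e12 / 20e9   # 20 GBaud for 40 Gb/s PAM-4
print(loss_db, acc_disp, round(spread_ps, 1), symbol_period_ps)
```

With these assumptions the estimated spread after 20 km is on the order of one symbol period, so dispersion-induced inter-symbol interference, combined with square-law detection, is significant even at access-network distances.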
In Figure 4a, together with the experimental traces from Section 3, the simulated results are depicted for the case of an 8-tap FFE at 20 km of transmission. Figure 4d shows the received histograms for the aforementioned simulated case without an equalizer. Our simulated results indicate the following: (1) an implementation penalty of 0.4 dB is observed at the KP-4 FEC limit when compared to the experimental curve, mainly induced by the electro-optical components; (2) a sensitivity penalty of approximately 4.3 dB is observed when using an APD without receiver pre-amplification, confirming the prediction reported in [7,8,12].

DSP Analysis and Experimental Transmission-Reach Limitations
The simplification of the NLE filter taps is illustrated in Figure 5. Note that for the simplified DSP case, the equalization is run again after the taps below the threshold have been disabled in the complete DSP. On convergence, the surviving taps take different values from their complete DSP counterparts because the interaction between taps has changed. Taking the nonlinear FFE terms as an example, Figure 5 shows that the most significant nonlinear terms are the square terms and the product terms between adjacent symbols, which correspond to the nonlinear FFE coefficients a_i,j for |i − j| ≤ 1. The product terms between symbols further away from each other are negligible and can be eliminated to reduce computational complexity. The penalty of the simplified nonlinear FFE/DFE relative to the full nonlinear FFE/DFE increases slightly as the fiber length increases beyond 20 km. This is because of the increased fiber dispersion and stronger distortion (as shown in Figure 2), and because a small part of the eliminated nonlinear product terms is no longer negligible. For both the FFE and the DFE, the taps with larger weights survive the simplification, whereas for the NS filter the tap significance changes slightly, because the change in the FFE/DFE taps influences the noise correlation. Overall, the total tap count is reduced by approximately 75% (from 44/10/10 to 10/4/3 taps). The simplified DSP makes the receiver comparable, in complexity and power dissipation, to a commercial PAM-4 clock and data recovery (CDR) device with simple equalization.
Figure 5. Feedforward and decision feedback equalizer (FFE/DFE) and noise suppression (NS) filter tap coefficients for the complete and simplified DSP cases with 20 km SMF at an optical power of −21 dBm.
Figure 6a summarizes the optical power sensitivity at the threshold BERs of both of the proposed FECs for various fiber lengths under two receiver DSP configurations: an 8/4 tap FFE/DFE without 2nd-order terms, and a simplified 10/4/3 tap FFE/DFE/NS filter with both linear and 2nd-order terms. Compared with a linear FFE/DFE, 2nd-order Volterra nonlinear equalization with an NS filter considerably enhances the receiver sensitivity, and the gain is more significant at longer fiber distances. The linear equalization fails to support 30 km SMF transmission for both FECs.
Considering a launch power of 12 dBm, Figure 6b presents the achievable link power budget for the DSP configurations in Figure 6a. As references, three different high-value power budget specifications from the 10 G EPON standard are illustrated. The proposed 40 Gb/s PAM-4 PON can achieve a link power budget of 39 dB (33 dB) and 37 dB (27 dB) for 20 km (30 km) SMF transmission by using the 10 G EPON FEC and KP-4 FEC, respectively. The system can support the PR30 and PR40 power budget specifications at distances of 20 km and 25 km, respectively, under all DSP configurations. Only the FFE/DFE/NS configuration together with the 10 G EPON FEC can support PR50 at a 25 km reach. The pre-amplified receiver may be replaced by an APD in the downlink system for cost reasons, in which case the corresponding achievable power budgets will be approximately 7.5 dB lower [7,8,12]. In this case, the PR30 power budget (20 km reach) can still be achieved. When a dynamically biased APD is used, it is expected that the power budget can be reasonably improved [28]. For an APD without a receiver EDFA, the simulations in Figure 4 have revealed that a 20 km reach can be achieved, so that the PR30 power budget can be covered, as depicted in Figure 6b.
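The arithmetic behind the power budget and tap-count figures can be checked with a few lines. The sketch assumes that the link power budget is simply the launch power minus the receiver sensitivity, which is how the 12 dBm launch power is used in the text; the function names are hypothetical.

```python
def implied_sensitivity_dbm(launch_dbm, budget_db):
    """Receiver sensitivity implied by a launch power and a link power budget."""
    return launch_dbm - budget_db

def tap_reduction(full, simplified):
    """Fractional reduction in total tap count between two FFE/DFE/NS configurations."""
    return 1.0 - sum(simplified) / sum(full)

# 39 dB budget at 20 km with the 10 G EPON FEC and a 12 dBm launch power
sens_20km = implied_sensitivity_dbm(12.0, 39.0)
# 44/10/10 taps reduced to 10/4/3 taps
reduction = tap_reduction((44, 10, 10), (10, 4, 3))
print(sens_20km, round(reduction, 3))  # -27.0 0.734
```

The 64-to-17 tap reduction corresponds to roughly three quarters of the taps being disabled, in line with the approximately 75% figure quoted above.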

Conclusions
We have demonstrated a low-complexity and energy-efficient 40 Gb/s PAM-4 PON system using a novel high-performance DSP unit and a 10-G transceiver without requiring a power-hungry RF driver. The DSP consists of a low-complexity, reconfigurable, blind adaptive Volterra-based nonlinear FFE/DFE integrated with an NS filter, which acts as a receiver equalizer to alleviate various linear and nonlinear effects from the components and the fiber. The results revealed that link power budgets of 39 dB (33 dB) and 37 dB (27 dB) are feasible for 20 km (30 km) SMF transmission at the 10 G EPON FEC (BER of 10^-3) and KP-4 FEC thresholds, respectively. A reduction of up to approximately 75% in the DSP complexity is feasible without performance degradation at 20 km. The 17-tap NLE has a complexity that is comparable with a commercial equalized CDR. Simulation results revealed that an APD can potentially be used to avoid pre-amplification, thus enabling greater cost and energy efficiency in modern PONs while still covering the PR30 power budget at 20 km.