ML-Based Identification of Structured Light Schemes under Free Space Jamming Threats for Secure FSO-Based Applications

Copyright: © 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
1 KACST-TIC in Radio Frequency and Photonics for the e-Society, King Saud University, Riyadh 11421, Saudi Arabia; wsaif@ksu.edu.sa (W.S.S.); dsaleh@ksu.edu.sa (S.A.A.)
2 Department of Electrical Engineering, King Saud University, Riyadh 11421, Saudi Arabia
* Correspondence: aragheb@ksu.edu.sa


Introduction
It is anticipated that by 2050 more than two-thirds of the world population will live in urban areas, where advanced technologies should play a significant role in improving the quality of life in future smart cities. Free space optics (FSO) is one promising technology, with an expected market size of $300 million in 2029 [1]. FSO has been extensively considered by various communication sectors, including wireless communication networks, optical interconnects in data centers, underwater communications, and next-generation Internet of Things systems [2][3][4][5]. This is owing to the unique features of FSO technology, such as easy and low-cost installation, high throughput, long reach, and low link latency.
In this regard, different light beam structures have been used in data multiplexing and M-ary pattern coding applications [6,7]. These include the traditional Gaussian beam and complex light structures such as the Laguerre-Gaussian, Hermite-Gaussian, and Bessel-Gaussian (LG, HG, and BG) mode families. Nonetheless, the wireless optical link is subject to propagation challenges that limit the FSO efficiency. These include intrinsic atmospheric conditions and extrinsic human-made risks (i.e., jamming and interception threats). For the former, the effects of atmospheric turbulence, rain, fog, and dust have been comprehensively investigated both theoretically and experimentally [8]. Additionally, traditional digital signal processing (DSP), adaptive optics (AO), and machine learning (ML) methods have been used as alternative ways to mitigate atmospheric effects. For instance, the work in [9,10] studied the mitigation of turbulence-induced crosstalk using DSP for multiplexed LG modes propagating in an emulated weak-turbulence channel. A 15-tap 4 × 4 multi-input multi-output (MIMO) equalizer was used for four multiplexed LG channels, each carrying a 20 Gbps quadrature phase shift keying (QPSK) signal. In addition, the authors in [11] exploited AO components to alleviate atmospheric turbulence on the data-multiplexed LG mode family, where pre- and post-compensation of weak and moderate turbulence was achieved using a closed AO feedback loop.
Besides, ML-based methods have been broadly used as classifiers to identify structured light signals in M-ary pattern coding systems, and as regressors to predict various atmospheric conditions. In this regard, an artificial neural network (ANN) has been utilized to identify 16-ary superposition LG modes transmitted over a strong-turbulence channel on a 3 km link in the city of Vienna [7]. Moreover, a convolutional neural network (CNN) has been used as a classifier and a regressor in [12] to identify 16-ary superposition LG modes and jointly predict the turbulence level, respectively. In [13], the k-nearest neighbour (kNN), support vector machine (SVM), and CNN classifiers were compared for identifying 8-, 16-, and 32-ary structured light beams generated using the LG and HG mode families in a dusty environment. Moreover, a CNN-based regressor was used to predict the visibility range of the communication channel. In [14], a chaotic interleaving step was applied to 16-ary orbital angular momentum (OAM) states to mitigate FSO turbulence after coding the original transmitted bits using low-density parity-check codes and Turbo codes. A CNN-based classifier was used to identify the various OAM states, achieving an accuracy of 99% after optimizing the CNN hyperparameters. In addition to free space conditions, underwater environment effects, such as water bubbles, temperature inhomogeneity, and water turbidity, have been studied using a CNN algorithm on single and superposition LG modes that form 16-ary OAM states [3]. It is worth noting that the immunity of ML-based approaches to environmental (atmospheric or underwater) conditions has stimulated the recent interest in ML-based methods as an alternative to traditional DSP- and/or AO-based receivers [7].
On the other side, the investigation of extrinsic threats such as FSO beam tapping and jamming is still in its infancy. For instance, in [15] the alleviation of FSO tapping has been studied using free space spatial diversity and encoder/decoder techniques. Moreover, the effect of jamming on a standard Gaussian beam is analytically investigated in [16,17]. In [16], the jamming effect on the bit error rate (BER) performance of an FSO system is studied using numerical simulation. In addition, in [17] the FSO channel outage probability is evaluated under jamming, and jamming mitigation is investigated by proposing a multi-input single-output (MISO) FSO system. Moreover, a closed-form expression of the average BER has been obtained in [18] for a 2 × 1 MISO optical space shift keying system under jamming in a turbulent FSO channel. So far, these methods rely on numerical analysis to study and/or mitigate jamming in FSO systems; ML-based techniques have not yet been reported in the literature for extrinsic free space challenges. Table 1 shows the recent progress of ML-based identification for M-ary pattern coding under various channel conditions, where intrinsic channel conditions have received most of the attention.
It is worth noting that, to the best of the authors' knowledge, no work has yet been reported in the literature that investigates the jamming effect on complex structured light beam mode families using ML-based techniques. Therefore, in this paper we experimentally investigate the effect of a jamming signal on the LG, superposition-LG, and HG mode families. As shown in Figure 1, structured light patterns can be used to transfer information bits between two buildings; however, a simple standard Gaussian jammer can be used to corrupt the pattern at the receiver side. In particular, the following is considered in this work:
1. An FSO pattern coding system is experimentally built to generate 8-ary LG, 8-ary superposition-LG (we call it Mux-LG), 16-ary LG and Mux-LG, 16-ary HG, and a 32-ary set formed by all considered modes.
2. The identification accuracy of the different mode families is assessed, in a direct detection FSO system, using a CNN-based classifier under signal-to-jammer ratios (SJRs) ranging from −5 to 3 dB.
3. The direction-of-arrival (DoA) of the jamming signal is determined using CNN- and linear discriminant analysis (LDA)-based classifiers.

The rest of the paper is organized as follows. Section 2 presents the structured representation of the LG and HG mode families. The experimental setup and dataset collection are discussed in Section 3. The CNN- and LDA-based classifiers are introduced in Section 4. The experimental results and their analysis are reported in Section 5. Section 6 discusses the limitations and the practicability of the proposed system. Finally, Section 7 provides concluding remarks.

Mode Basis Background
It has been shown that the amplitude distribution of laser light in free space can be represented in rectangular coordinates by the product of Hermite polynomials $H_n(\cdot)H_m(\cdot)$, where $n$ and $m$ are the polynomial orders, or in cylindrical coordinates by the Laguerre polynomial $L_p^{|\ell|}(\cdot)$, where $\ell$ and $p$ are the azimuthal and radial indices, respectively [19]. In this work, the orthogonal HG and LG modes are used as code words for data transmission in an M-ary pattern coding system, where M is the modulation order of the LG or HG coding system. Figure 2a shows the laboratory-generated LG and HG modes, which are used to build an M-ary coding communication system with M = 8, 16, and 32. In particular, Hermite-Gaussian and Laguerre-Gaussian functions provide an exact representation of the higher-order solutions of the free space paraxial wave equation in rectangular and cylindrical coordinates, respectively [20].
The HG modes $u^{HG}_{mn}$ can be written in rectangular coordinates as $u^{HG}_{mn}(x, y, z) = u_n(x, z) \times u_m(y, z)$ [20], where $u_n(x, z)$ or $u_m(y, z)$ is the solution in the $x$ or $y$ transverse dimension and is given by Equation (1), in which $H_n(\cdot)$ is the Hermite polynomial of order $n$, $\omega_o$ is the beam waist, $k = 2\pi/\lambda$ is the wave number, and $\lambda$ is the operating wavelength. Similarly, the LG modes $u^{LG}_{\ell p}$, characterized by the azimuthal and radial indices, can be expressed in cylindrical coordinates [20] in terms of a normalization constant $A$, the generalized Laguerre polynomial $L_p^{|\ell|}(\cdot)$, and the parameters $R(z)$, $\omega(z)$, and $\psi(z)$, which are the same as in Equation (1).
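The displayed mode equations did not survive in this copy of the text. For orientation only, the block below gives the commonly used textbook forms of the HG and LG modes in terms of $R(z)$, $\omega(z)$, and $\psi(z)$. It is an assumption that these match the paper's Equation (1) and its LG counterpart: the normalization, the sign conventions of the phase terms, and the Rayleigh range symbol $z_R$ are not taken from the source.

```latex
% Assumed standard forms (conventions may differ from the original equations)
\begin{align}
u_n(x,z) &= \left(\frac{2}{\pi}\right)^{1/4}
  \left[\frac{e^{\,j(2n+1)\psi(z)}}{2^{n}\,n!\,\omega(z)}\right]^{1/2}
  H_n\!\left(\frac{\sqrt{2}\,x}{\omega(z)}\right)
  \exp\!\left[-\frac{x^{2}}{\omega^{2}(z)} - j\,\frac{k\,x^{2}}{2R(z)}\right], \\[4pt]
u^{LG}_{\ell p}(r,\phi,z) &= \frac{A}{\omega(z)}
  \left(\frac{\sqrt{2}\,r}{\omega(z)}\right)^{|\ell|}
  L_p^{|\ell|}\!\left(\frac{2r^{2}}{\omega^{2}(z)}\right)
  \exp\!\left[-\frac{r^{2}}{\omega^{2}(z)} - j\,\frac{k\,r^{2}}{2R(z)}\right]
  e^{\,j\ell\phi}\, e^{\,j(2p+|\ell|+1)\psi(z)}, \\[4pt]
\omega(z) &= \omega_o\sqrt{1+\left(z/z_R\right)^{2}}, \qquad
R(z) = z\left[1+\left(z_R/z\right)^{2}\right], \qquad
\psi(z) = \arctan\!\left(z/z_R\right), \qquad
z_R = \frac{\pi\,\omega_o^{2}}{\lambda}.
\end{align}
```

In both families the transverse extent grows with the mode indices, which is consistent with the increasing beam widths reported for higher-order modes in the experimental section.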

Experimental Setup and Dataset
In this work, the jammer effect on two laser mode families is investigated in a direct detection FSO system. Figure 2b shows the experimental setup used in this investigation. The information carrier link is built using a Teraxion continuous wave (CW) laser source operating at 1550 nm (LD1). The laser output power is boosted up to ∼20 dBm using an erbium-doped fiber amplifier (EDFA) before being collimated in free space using a Thorlabs F230FC-1550 collimator. A free space set of a half wave plate (HWP) and a polarizer (P) is used to maximize the intensity and align the polarization of the collimated light, respectively. A programmable Hamamatsu (X13138-08) liquid crystal on silicon spatial light modulator (LCOS-SLM) is utilized to modulate the phase of the horizontally polarized incident Gaussian beam. According to the M-ary order, a computer (PC1) is used to program the LCOS-SLM with different hologram sets to convert the incident Gaussian mode into LG, Mux-LG, and/or HG modes.
On the other side, the jamming link is a standard Gaussian beam generated using another CW laser source (LD2) operating in the C-band, with an output power varying from 8 to 15 dBm to change the signal-to-jammer ratio (SJR). A 15x Thorlabs GBE-15C beam magnifier (BM) is used to adjust the beam diameter of the jamming signal. Both the mode pattern and the jamming signal are transmitted over a 1 m free space link inside the lab. The free space direct detection receiver is constructed using a Thorlabs LB1471-C convex lens of 50 mm focal length and a charge-coupled device (CCD) detector (Ophir-Spiricon LBP2-IR2) that captures the intensity profiles of the different jammed modes and stores them in PC2. Moreover, a beam splitter is used initially to measure the power of the received pattern and jamming signals and to adjust the SJR values.
Figure 3 compares the received beam diameters, statistically measured by the D4σ method (i.e., four times the standard deviation of the beam profile in the x or y direction) [21], for the left-hand jammer and the LG, Mux-LG, and HG mode profiles. The measured beam diameter of the jamming signal is 3 mm (see Figure 3a), whereas the measured beam widths of the LG/Mux-LG modes vary from ∼2.3 mm (LG 01/0±1 ) to 4.2 mm (LG 08/0±8 ), as shown in Figure 3b,c. For HG modes, the beam diameters change from ∼1 mm (HG 00 ) to 4.5 mm (HG 33 ) in the x and y directions.
In order to mimic the random wandering of a jammer, a 3-axis translation stage is used to randomly change the incident position of the jammer. Besides, both left- and right-hand jammer directions-of-arrival (DoAs) are considered in this investigation. A dataset of the jammed patterns was created by recording the received intensity profiles using the CCD detector. The CCD recorded the received profiles at a frame rate of 1 frame/s (which is sufficient to capture the slow jammer wandering) for a duration of 200 s (∼3.3 min).
In addition, the profiles were recorded at different SJRs varying from −5 to 3 dB with a step size of 1 dB. It is worth noting that the choice of the lower (−5 dB) and upper (3 dB) SJR values is owing to the device power limitations and the performance consistency, respectively. This creates a dataset of 14,400, 28,800, and 57,600 frames for 8-ary LG/Mux-LG, 16-ary HG/LG+Mux-LG, and 32-ary LG+Mux-LG+HG modes, respectively (i.e., 200 frames × 9 SJR values × M modes). Figure 4 shows examples of the various intensity profiles of the randomly jammed mode patterns at SJRs of −5, 0, and 3 dB for a left-hand DoA jammer. It can be seen that the jammer attack frequently appears at the right-hand side of the sensor's active area. Moreover, when the jammer's incident angle exceeds the lens's angle of view, the jammer distribution shows a non-circular, distorted shape, as in LG 05 , HG 02 , and HG 21 at 0 dB SJR. The recorded images have been used for training and testing the CNN-based classifier, such that 70% were used in the training phase whereas the remaining 30% were used for testing the classifier. It is worth noting that, to predict the jammer's DoA, another dataset was built that considers the 8-ary LG modes only. This generates a dataset of 28,800 frames for the left- and right-hand jammer DoAs. In the next sections, the obtained accuracies for both M-ary modulation classification and DoA determination are discussed.
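Two quantities in this section lend themselves to a quick numerical check: the D4σ beam width and the dataset sizes. The following is a minimal sketch, not the authors' processing code. The function names, the synthetic Gaussian test beam, and the 0.02 mm pixel pitch are illustrative assumptions, and no background subtraction is applied (a practical D4σ measurement usually needs it).

```python
import numpy as np

def d4sigma(intensity, pixel_pitch):
    """Return the (x, y) D4-sigma widths of a 2-D intensity image.

    D4-sigma is four times the intensity-weighted standard deviation
    of the beam profile along each axis.
    """
    img = np.asarray(intensity, dtype=float)
    total = img.sum()
    y_idx, x_idx = np.indices(img.shape)
    x = x_idx * pixel_pitch
    y = y_idx * pixel_pitch
    # First moments (beam centroid)
    x_mean = (img * x).sum() / total
    y_mean = (img * y).sum() / total
    # Second central moments (beam variance along each axis)
    sigma_x = np.sqrt((img * (x - x_mean) ** 2).sum() / total)
    sigma_y = np.sqrt((img * (y - y_mean) ** 2).sum() / total)
    return 4.0 * sigma_x, 4.0 * sigma_y

def dataset_size(n_modes, frames_per_capture=200, n_sjr=9, n_directions=1):
    """Frames = captures of 200 s at 1 frame/s, per mode, SJR value, and direction."""
    return frames_per_capture * n_sjr * n_modes * n_directions

# Synthetic Gaussian beam (hypothetical): 1/e^2 intensity radius w = 1.5 mm,
# for which the D4-sigma diameter equals 2w = 3 mm.
pitch_mm = 0.02
axis = (np.arange(401) - 200) * pitch_mm
xx, yy = np.meshgrid(axis, axis)
beam = np.exp(-2.0 * (xx ** 2 + yy ** 2) / 1.5 ** 2)

dx, dy = d4sigma(beam, pitch_mm)
print(f"D4sigma: {dx:.3f} mm x {dy:.3f} mm")
print([dataset_size(m) for m in (8, 16, 32)])
print(dataset_size(8, n_directions=2))
```

The SJR grid from −5 to 3 dB in 1 dB steps has 9 values, which is where `n_sjr=9` comes from; the DoA dataset is the 8-ary LG case with two jammer directions.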

Modes Identification and DoA Determination Classifiers
The free space direct-detection method depends on observing the image of the intensity profiles. Therefore, a CNN-based classifier is exploited to identify the images of the different mode patterns, as it can directly process two-dimensional input signals. The CNN, shown in Figure 5a, is used to automatically extract features from the raw image data, which leads to better discrimination between modes and, consequently, better performance in the testing phase. The CNN is constructed using three main types of processing layers: convolutional, pooling, and fully connected layers. In our work, the color images recorded by the CCD were converted to greyscale and resized to 28 × 28 pixels to reduce the processing complexity. The resultant images were processed using the convolutional and pooling layers. The convolutional layer is considered the core building block of the CNN-based classifier. The input mode image is subdivided into different windows, which are convolved with kernel filters to produce the feature maps. The result of the convolutional layer is passed through a nonlinear activation function, which in this work is the Rectified Linear Unit (ReLU). The convolutional layer is then followed by a pooling layer to reduce the size of the feature map. Two blocks of alternating convolutional and pooling layers were considered, where the convolutional layers use 16 and 32 kernels of 3 × 3 pixels, respectively. The output feature maps pass through the pooling layers with a max-pooling of 2 × 2 pixels. This produces 32 feature maps of 7 × 7 pixels after the second pooling layer. Then, these feature maps are flattened to a 1-D vector of length 1568 and applied to a fully connected layer containing 256 neurons, which is sufficient to solve the problem without requiring many resources. Finally, the CNN output layer is a vector of 8, 16, or 32 elements (e.g., 8 for 8-ary LG modes and 16 for 16-ary HG modes).
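The layer sizes quoted above can be reproduced with simple shape bookkeeping. The sketch below is only a consistency check, not the authors' network code. It assumes stride-1 convolutions with zero padding that preserves the spatial size (the only choice that takes 28 × 28 to 7 × 7 after two 2 × 2 poolings), one bias per kernel and per neuron, and it ignores batch-normalization parameters; all function names are hypothetical.

```python
def conv_params(k, c_in, c_out):
    """Weights plus one bias per kernel for a k x k convolution."""
    return k * k * c_in * c_out + c_out

def dense_params(n_in, n_out):
    """Weights plus one bias per neuron for a fully connected layer."""
    return n_in * n_out + n_out

def cnn_summary(n_classes, side=28, kernels=(16, 32), k=3, pool=2, hidden=256):
    """Track feature-map shapes (H, W, C) and trainable parameters."""
    h = w = side
    c = 1  # greyscale input
    shapes = [("input", (h, w, c))]
    params = 0
    for c_out in kernels:
        params += conv_params(k, c, c_out)
        c = c_out
        # Assumption: size-preserving ("same") padding, stride 1
        shapes.append(("conv+ReLU", (h, w, c)))
        h, w = h // pool, w // pool
        shapes.append(("max-pool", (h, w, c)))
    flat = h * w * c
    params += dense_params(flat, hidden) + dense_params(hidden, n_classes)
    return {"shapes": shapes, "flat": flat, "params": params}

summary = cnn_summary(n_classes=8)
for name, shape in summary["shapes"]:
    print(f"{name:10s} {shape}")
print("flattened length:", summary["flat"])
print("trainable parameters (8-ary):", summary["params"])
```

Under these assumptions almost all parameters sit in the 1568-to-256 fully connected layer, so changing the output size between 8, 16, and 32 classes has little effect on the model size.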
The softmax activation function is applied at the output layer. The position of the output vector element with the highest value determines the type of the received mode pattern. Note that the entries of the output vector take values ≤ 1, where each entry represents the probability of the corresponding mode type, and the sum of all output probabilities equals one. The network is trained with the backpropagation algorithm, which updates the network's weights based on the gradient of the loss. The loss function (L) is the cross-entropy [22], computed over the K classes and the N samples from y and t, the predicted and target class (mode) probabilities, respectively. In the training phase, instead of using the entire dataset as input to the CNN, which requires a large memory space, the dataset is divided into batches of 64 images each. Further, to reduce the training time, batch normalization (i.e., subtracting the mean of each batch and dividing by the batch standard deviation) is used. Batch normalization considerably decreases the training time when it is applied to the input of each processing layer in the network, not only to the network input layer. Moreover, the choice of optimizer is an important factor in tuning deep neural networks (DNNs). In this study, the adaptive moment estimation (Adam) optimizer is used with a learning rate of 5 × 10^−3. Adam is easy to implement, computationally efficient, and requires little memory [23]. Because overfitting is a common challenge in training DNN models, L2 regularization of 10^−4 is used to prevent it.
On the other side, to determine the jammer DoA, the CNN-based and LDA classifiers are considered and their performances are compared. The latter is considered owing to its simplicity. The idea of the LDA technique is to project the original dataset of size (N × D) onto a new space of size (N × d), where D >> d, as shown in Figure 5b.
In our development, $N$ represents the number of mode images in the training dataset and $D$ is the number of features (image pixels). The LDA can be performed using the following steps [24]:
(1) Compute the between-class covariance matrix $S_B$ of the training dataset.
(2) Compute the within-class covariance matrix $S_W$ of the training dataset.
(3) Find the transformation matrix $\Phi = S_W^{-1} S_B$ that maximizes $S_B$ and minimizes $S_W$.
The eigenvectors and the corresponding eigenvalues of the transformation matrix $\Phi$ provide the information about the LDA space. The directions of the LDA space are represented by the eigenvectors, while the associated eigenvalues denote their magnitudes. Therefore, every eigenvector indicates one axis of the LDA space, while its eigenvalue represents the robustness of this eigenvector, that is, its potential to distinguish between the various classes by increasing $S_B$ and decreasing $S_W$, which is the goal of LDA. The LDA space is constructed using the eigenvectors that have the largest eigenvalues. It is worth noting that for $K$ classes, the number of non-zero eigenvalues is ≤ $K − 1$ [22]. Finally, the training dataset is projected onto the LDA space. The classifier is then designed to have a decision threshold ($i_{th}$) that separates two adjacent classes. The decision threshold is calculated [22] from $\mu_1$ and $\mu_2$, the training-dataset means of the two adjacent classes in the LDA space. Once the threshold value is calculated, a new sample can easily be classified by first projecting it onto the LDA space and then comparing its value with $i_{th}$. Figure 5b illustrates an example of classification using LDA. It can be observed that there is a significant overlap between the classes along the eigenvector $v_2$ (i.e., the one with the lowest eigenvalue), which leads to classification errors. However, the eigenvector $v_1$ (i.e., the one with the largest eigenvalue) shows a considerable improvement in the separation between classes.
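The LDA procedure described above can be sketched for the two-class case as follows. This is an illustrative implementation under stated assumptions, not the authors' code: the scatter matrices are unnormalized, a pseudo-inverse is used in case $S_W$ is singular (likely for 784-pixel images), the threshold is taken as the midpoint of the two projected class means (the source's threshold expression is not shown here, so this form is assumed), and the function names and synthetic data are hypothetical.

```python
import numpy as np

def fit_lda_1d(X, y):
    """Fit a two-class LDA projection and a midpoint decision threshold."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    classes = np.unique(y)
    n_features = X.shape[1]
    mean_all = X.mean(axis=0)
    S_W = np.zeros((n_features, n_features))
    S_B = np.zeros((n_features, n_features))
    for c in classes:
        Xc = X[y == c]
        mean_c = Xc.mean(axis=0)
        centered = Xc - mean_c
        S_W += centered.T @ centered              # within-class scatter
        diff = (mean_c - mean_all)[:, None]
        S_B += len(Xc) * (diff @ diff.T)          # between-class scatter
    # Transformation matrix; pinv is an assumption to handle singular S_W
    phi = np.linalg.pinv(S_W) @ S_B
    eigvals, eigvecs = np.linalg.eig(phi)
    # Keep the eigenvector with the largest eigenvalue (K = 2 gives one axis)
    v = np.real(eigvecs[:, np.argmax(np.real(eigvals))])
    proj = X @ v
    mu1 = proj[y == classes[0]].mean()
    mu2 = proj[y == classes[1]].mean()
    i_th = 0.5 * (mu1 + mu2)                      # assumed midpoint threshold
    return v, i_th, (mu1, mu2), classes

def predict_lda_1d(X, v, i_th, mus, classes):
    """Project new samples onto v and compare against the threshold."""
    proj = np.asarray(X, dtype=float) @ v
    low, high = (classes[0], classes[1]) if mus[0] < mus[1] else (classes[1], classes[0])
    return np.where(proj < i_th, low, high)

# Synthetic stand-in for "left" (0) and "right" (1) jammer images
rng = np.random.default_rng(0)
X_left = rng.normal(0.0, 1.0, size=(200, 5))
X_right = rng.normal(0.0, 1.0, size=(200, 5)) + np.array([4.0, 0, 0, 0, 0])
X_demo = np.vstack([X_left, X_right])
y_demo = np.array([0] * 200 + [1] * 200)

v, i_th, mus, classes = fit_lda_1d(X_demo, y_demo)
accuracy = (predict_lda_1d(X_demo, v, i_th, mus, classes) == y_demo).mean()
print(f"training accuracy on synthetic data: {accuracy:.3f}")
```

With two classes, $S_B$ has rank one, so only one eigenvalue is non-zero and the data collapse onto a single axis, in line with the ≤ K − 1 bound stated above.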

M-ary Mode Identification
An investigation is conducted to examine the identification accuracy of the CNN-based classifier for each individual mode of the 8-ary LG, 8-ary Mux-LG, and 16-ary HG formats. Figure 6 shows the classification accuracy in percent, evaluated at SJRs ranging from −5 to 3 dB. For the LG modulation set, all modes reach 100% accuracy at an SJR of 0 dB, except LG 01 , which requires 1 dB more to reach 100% accuracy. However, for the 8-ary Mux-LG and 16-ary HG modulation schemes, all modes reach a recognition accuracy of 100% at an SJR of −2 dB. This shows an improvement of 3 dB over the 8-ary LG format. At low SJR (i.e., −5 dB), the mode accuracy reduces to ∼85%, 96%, and 95% for LG 03 , LG 0±6 , and HG 12 , respectively. This can be interpreted from the confusion matrix shown in Figure 7 at −5 dB SJR, where the diagonal values represent the correct classification ratios for each mode, while the remaining matrix values represent the misclassification with the other modes. For instance, LG 02 is correctly classified in 87.9% of cases and is confused with LG 01 , LG 03 , LG 04 , and LG 05 in 4.1%, 6.23%, 1.63%, and 0.13% of cases, respectively. It can be observed that the misclassification ratio is large for neighboring modes and decreases for modes further away, which is expected owing to the similarity of adjacent LG modes. Likewise, mode LG 03 is confused with the neighbouring modes LG 02 and LG 04 . Moreover, Figure 7b,c shows the confusion for the 8-ary Mux-LG and 16-ary HG formats at −5 dB SJR. For Mux-LG modulation, the mode LG 0±6 is confused with LG 0±7 and LG 0±8 . Moreover, for HG modulation, mode HG 12 is confused with HG 13 and HG 22 .
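For readers reproducing this kind of analysis, a row-normalized confusion matrix of the type discussed above can be computed as in the sketch below. It is a generic illustration: the function name and the toy labels are hypothetical, and the only values taken from the text are the LG 02 percentages, used to check that the quoted row accounts for essentially all test samples.

```python
import numpy as np

def confusion_percent(true_labels, pred_labels, n_classes):
    """Row-normalized confusion matrix in percent.

    Entry (i, j) is the share of samples of true class i predicted as class j,
    so the diagonal holds the per-mode identification accuracy.
    """
    counts = np.zeros((n_classes, n_classes))
    for t, p in zip(true_labels, pred_labels):
        counts[t, p] += 1
    row_sums = counts.sum(axis=1, keepdims=True)
    # Avoid division by zero for classes absent from the test set
    return 100.0 * counts / np.where(row_sums == 0, 1, row_sums)

# Toy example with three classes (hypothetical labels)
true_demo = [0, 0, 0, 1, 1, 2]
pred_demo = [0, 0, 1, 1, 1, 0]
cm = confusion_percent(true_demo, pred_demo, n_classes=3)
print(np.round(cm, 2))

# LG 02 row quoted in the text (LG 01 ... LG 05, remaining modes taken as 0)
lg02_row = [4.1, 87.9, 6.23, 1.63, 0.13]
print("LG 02 row total (%):", round(sum(lg02_row), 2))
```

The quoted LG 02 entries add up to 99.99%, so the four listed neighbours explain the misclassifications of that mode up to rounding.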

Jammer DoA Determination
Here, the classification methods are used to identify the direction of arrival (DoA) of the jammer threat. In this experiment, both right-hand and left-hand jammer positions with respect to the transmitter have been considered for the 8-ary LG format only (i.e., the format with the worst results reported in the previous sub-section). The dataset comprises 28,800 images created from the two jammer directions with a capturing duration of 200 s for each mode, and for SJRs ranging from −5 to 3 dB with a 1 dB step (i.e., total dataset size: 200 images × 2 directions × 8 modes × 9 SJR values). Two classifiers have been trained and tested using 70% and 30% of the obtained dataset, respectively. The first classifier uses the CNN; however, the output layer contains two nodes, which denote the jammer direction, either right or left. In the second technique, a simple classifier based on LDA is used. Using this technique, each image of size (28 × 28) is flattened into one vector, resulting in an input dimensionality of D = 784. The LDA is able to reduce the dataset to only one dimension. Then, a threshold value i th is obtained to classify between left and right DoA. Figure 9 shows the classification performance in terms of the recognition accuracy for the LDA- and CNN-based classifiers. The CNN-based method is able to identify the left and right jammer DoA with 100% accuracy for all SJRs. This could be attributed to the clear and frequent incidence of the left-hand jammer on the right side of the received mode (see Figure 4), and vice versa for the right-hand jammer. However, using the LDA-based algorithm (i.e., a low-complexity classifier) reduces the DoA identification accuracy to a maximum of 97% at an SJR of 2 dB. The inset in Figure 9b shows the distribution of the data in the LDA space. It is observed that much of the useful information in the data is preserved in the 1D space, with a clear separation between the two jammer DoAs except for a small overlap region.
Although the LDA classification accuracy is lower than that of the CNN classifier, it gives acceptable results, especially at SJR ≥ 0 dB, where the accuracy exceeds 95%.

Discussion
The rapid growth of the FSO market motivates the development of new methods and technologies that provide robust, reliable, fast, and secure FSO systems. This work focused on the security of a well-researched FSO technology that uses structured light as bit pattern codes. The demonstrated system consists of two main parts. The first is the jammer signal, which is a simple off-the-shelf laser source. The second is the data communication link, which comprises an SLM (at the transmitter side) and a CCD camera (at the receiver side).
The practical implementation of the data communication link is limited by the cost and the switching speed (i.e., data transmission rate) of the structured light beam generator. A complex SLM can be replaced by 3D-printed microscale spiral phase plates that can generate pure structured light beams and are easily integrated with laser sources [25]. This would promote the proliferation of structured light beam generators in future FSO nodes. On the other side, the direct-detection method using a CCD camera and a robust ML algorithm reduces the hardware complexity at the receiver nodes, as it eliminates any modal decomposition process [7]. However, the limiting factors of the CCD camera are its sensitivity and frame capture rate. The former limits the free space distance; the latter limits the communication data rate.

Conclusions
In this paper, we investigated the performance of complex structured light patterns under human-made jamming threats. Two widely used laser mode families have been utilized to generate five modulation schemes: 8-ary LG, 8-ary Mux-LG, 16-ary HG, 16-ary LG and Mux-LG, and 32-ary LG, Mux-LG, and HG. Direct detection-based free space receivers with an ML-based algorithm have been used to identify the attacked modes. This study showed that standard LG modes are highly affected by jamming and are not recommended for data transmission at low SJR (i.e., less than 0 dB). This can be attributed to the structural similarity between the standard Gaussian jammer and the LG profiles, as opposed to the Mux-LG and HG structures. Moreover, the CNN-based algorithm can identify the jammer DoA with 100% accuracy under severe jamming conditions. The work in this paper can be extended to investigate human-made eavesdropping on structured light signals in the presence of atmospheric distortions.