Application of Machine Learning in Electromagnetics: Mini-Review

Abstract: As an integral part of the electromagnetic system, antennas are becoming more advanced and versatile than ever before, making it necessary to adopt new techniques to enhance their performance. Machine Learning (ML), a branch of artificial intelligence, is a method of data analysis that automates analytical model building with minimal human intervention. The potential for ML to solve unpredictable, non-linear, and complex challenges is attracting researchers in the field of electromagnetics (EM), especially in antennas and antenna-based systems. Numerous antenna simulations, syntheses, and radiation pattern recognition tasks, as well as non-linear inverse scattering-based object identifications, now leverage ML techniques. Although the accuracy of ML algorithms depends on the availability of sufficient data and on expert handling of the model and hyperparameters, ML is gradually becoming the preferred option when researchers aim for cost-effective solutions without excessive time consumption. In this context, this paper presents an overview of machine learning and its applications in electromagnetics, including communication, radar, and sensing. It extensively discusses recent research progress in the development and use of intelligent algorithms for antenna design, synthesis and analysis, electromagnetic inverse scattering, synthetic aperture radar target recognition, and fault detection systems. It also discusses the limitations of this emerging field of study. The unique aspect of this work is that it surveys the state of the art and recent advances in ML techniques as applied to EM.


Introduction
Recent developments in the field of electromagnetics (EM) and its diverse applications have attracted a wider community of researchers [1]. Among the different aspects of EM, antenna research has evolved and transformed in totality, from design to end-use application [2][3][4][5][6]. Advanced network systems [7,8], implantable devices [9,10], wearables [11], flexible devices [12,13], textile products [14], and modern control system development [15,16] necessitate futuristic antenna technology and performance requirements. For example, with the advent of the 5G spectrum, antenna design is by far the most challenging part of the implementation, as it is entirely dependent on the end-device form factor. This inevitably pushes antenna design for 5G devices to fit the ever-increasing requirements for greater bandwidth, more frequency bands, and superior interference immunity [17,18]. Furthermore, fault detection in antenna arrays and inverse scattering-based non-linear problems need sophisticated yet cost-effective solutions, where Machine Learning (ML) can provide an edge over other techniques [19,20].

Familiarization with ML
During the past few decades, researchers have aimed at making machines intelligent by giving them the capacity to make decisions or classify objects without being programmed with predefined functions, instead using raw data from the environment. Termed Artificial Intelligence (AI), the field has made breakthroughs in the past few years. Mimicking the brain to solve unknown problems, AI is widely used in the fields of engineering, science, medicine, and technology [26]. ML is a branch of AI that can be used to train algorithms to learn from and act on data without being explicitly programmed [27], and the learned model can be used for future designs. Currently, ML is used to interpret and analyze data more efficiently and effectively than the human brain. Incorporating statistical modeling-based prediction and optimization techniques, ML offers various models and approaches, as different problems need different approximations to provide optimized solutions [28]. Common ML algorithms include artificial neural networks (ANNs), support vector machines (SVMs), k-nearest neighbors (KNNs), random forests, and gradient boosting trees [29][30][31]. From a methodological standpoint, ML can be classified into three main types: supervised, unsupervised, and reinforcement learning [32][33][34].

Supervised Learning
In supervised learning, the algorithm learns from a training dataset (i.e., predefined inputs and known outputs) to make predictions about unforeseen data. A dataset is designed using empirical data from a system in different configurations. This dataset is then divided into training and testing sub-datasets. The training dataset is used to train a model that infers a mathematical relationship between the input and output. A trained model can then be used as a standard function for the system. Figure 1 shows the methodological workflow of supervised learning.
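The workflow just described (collect configuration data, split it into training and testing sub-datasets, fit a model, and reuse the trained model as a stand-in function for the system) can be sketched with scikit-learn. The synthetic dataset and the linear model below are illustrative assumptions, not taken from any of the cited studies.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LinearRegression

# Synthetic "empirical data": inputs X (system configurations) and a
# noisy response y. A real dataset would come from measurements or
# simulations of the system in different configurations.
rng = np.random.default_rng(0)
X = rng.uniform(0, 1, size=(200, 3))
y = X @ np.array([2.0, -1.0, 0.5]) + 0.01 * rng.normal(size=200)

# Divide into training and testing sub-datasets.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0)

# Train a model that infers the input-output relationship.
model = LinearRegression().fit(X_train, y_train)

# The held-out test set estimates how well the trained model
# generalizes; the model can then serve as a surrogate for the system.
print(round(model.score(X_test, y_test), 3))
```

On this deliberately easy problem the test-set R² is close to 1; real antenna datasets are far noisier and smaller, which is why model and hyperparameter choice matter.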
Different algorithms have been developed under supervised learning to solve a variety of problems. Some of the well-known algorithms are described below:
i. K-Nearest Neighbor (KNN): The concept behind the nearest-neighbor algorithm is to find a predefined number of training samples nearest in distance to a new value and predict its label from them. KNN depends on a limited number of surrounding samples rather than on discriminating the class domain, which makes it well suited to datasets whose classes cross or overlap.
ii. Support Vector Machine (SVM): SVMs are paradigms based on statistical learning theory. They incorporate the structural risk minimization principle, which has been shown to outperform the traditional empirical risk minimization principle used by ANNs, giving them greater generalization capability. SVMs can solve classification and regression problems and perform well in high-dimensional feature spaces. They handle nonlinear classification efficiently using the kernel trick, which implicitly transforms the input space into a higher-dimensional feature space.
iii. Gradient Boosting (GB) Tree: The GB tree is an ensemble method that builds one decision tree learner at a time by fitting the gradients of the residuals of the previously constructed tree learners. To build a tree, the method starts from a single node and iteratively adds branches until a stopping criterion is met. For each leaf, branches are added to maximize the loss reduction after the split (the gain function).
iv. Random Forest (RF): The forest operates by forming a multitude of decision trees at training time. It is typically trained using the bagging method, which draws random subsets of the training data to construct each tree, and finally combines the multiple decision trees to obtain more accurate and stable predictions.
v. Naive Bayes (NB): This generally works as a classifier based on Bayes' theorem, under the assumption that the features are conditionally independent of one another given the class; it predicts the probability of each class accordingly.
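As a concrete illustration of the first algorithm in the list, the sketch below trains a KNN classifier on two overlapping synthetic clusters, the kind of crossover data the text notes KNN handles well. The cluster locations and the choice of k = 5 are hypothetical.

```python
import numpy as np
from sklearn.neighbors import KNeighborsClassifier

rng = np.random.default_rng(1)
# Two partially overlapping 2-D clusters: the setting where KNN's
# local nearest-neighbor voting helps more than a global class boundary.
class0 = rng.normal(loc=[0.0, 0.0], scale=1.0, size=(100, 2))
class1 = rng.normal(loc=[2.0, 2.0], scale=1.0, size=(100, 2))
X = np.vstack([class0, class1])
y = np.array([0] * 100 + [1] * 100)

# Each new point's label is predicted from its 5 nearest training samples.
knn = KNeighborsClassifier(n_neighbors=5).fit(X, y)

# Points near either cluster center are labeled by their local neighborhood.
print(knn.predict([[0.1, -0.2], [2.1, 1.9]]))
```

Because the classes overlap, even the ideal classifier makes some errors here; KNN's training-set accuracy reflects the local structure rather than a single separating rule.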

Unsupervised Learning
In unsupervised learning, there are no labeled training data, and the algorithm finds patterns in data without any reference to labeled outcomes. The model itself discovers the important features existing in a raw dataset. In most cases, unsupervised learning is used for classification problems or feature reduction. One of the simplest forms of unsupervised learning is K-means clustering, a method that determines K cluster centers and assigns each sample to its nearest center; the initial centers are typically chosen to be as far from each other as possible. Moreover, Principal Component Analysis (PCA) and Independent Component Analysis (ICA) are used to group data with similar characteristics and separate out unrelated data from a given dataset without any supervision of the system or dataset. Figure 2 illustrates a simple block diagram of unsupervised learning.
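The two techniques named above can be sketched in a few lines: K-means recovering group structure with no labels provided, and PCA reducing the feature dimension. The synthetic 5-D dataset is an illustrative assumption.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.decomposition import PCA

rng = np.random.default_rng(2)
# Unlabeled data: two well-separated groups in 5 dimensions.
group_a = rng.normal(loc=0.0, scale=0.3, size=(50, 5))
group_b = rng.normal(loc=3.0, scale=0.3, size=(50, 5))
X = np.vstack([group_a, group_b])

# K-means discovers the two cluster centers without any labels.
km = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X)

# PCA projects the 5-D data onto its single dominant direction,
# a common feature-reduction step before further processing.
X_1d = PCA(n_components=1).fit_transform(X)

print(len(set(km.labels_)), X_1d.shape)
```

With clusters this well separated, every sample from each group receives the same cluster label; real EM datasets rarely cluster so cleanly, which is where PCA/ICA preprocessing earns its keep.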


Reinforcement Learning
This learning method resembles a human-like trial-and-error approach more than an isolated machine. A model designed by reinforcement learning makes decisions by utilizing negative and positive feedback, known as penalties and rewards, respectively, provided by the environment. These two signals shape a sequence of actions called a policy. An individual action within a policy may be suboptimal, but the cumulative outcome aims for the highest reward for the specific task. Figure 3 is a simple algorithmic view of a reinforcement learning technique.
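The reward/penalty loop described above can be illustrated with a minimal tabular Q-learning sketch. The five-state corridor environment and all hyperparameters below are invented purely for illustration; none of the cited RL studies use this setup.

```python
import numpy as np

# Toy environment: a 5-state corridor. The agent starts at state 0 and
# receives reward +1 only upon reaching the goal state 4.
# Actions: 0 = step left, 1 = step right.
N_STATES, GOAL = 5, 4
alpha, gamma, epsilon = 0.5, 0.9, 0.1   # learning rate, discount, exploration
Q = np.zeros((N_STATES, 2))             # Q-table: value of each (state, action)
rng = np.random.default_rng(0)

for _ in range(500):                    # training episodes
    s = 0
    while s != GOAL:
        # Epsilon-greedy: mostly exploit the best-known action, sometimes explore.
        a = rng.integers(2) if rng.random() < epsilon else int(Q[s].argmax())
        s_next = max(0, s - 1) if a == 0 else min(GOAL, s + 1)
        r = 1.0 if s_next == GOAL else 0.0
        # Q-learning update: nudge Q(s,a) toward reward + discounted best future value.
        Q[s, a] += alpha * (r + gamma * Q[s_next].max() - Q[s, a])
        s = s_next

policy = Q.argmax(axis=1)               # greedy policy after training
print(policy[:4])                       # states 0..3 should all prefer "right"
```

Although any single exploratory step may be "wrong" (a penalty in a richer environment), the learned policy maximizes cumulative reward, exactly the property the text describes.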

Besides the above methods, newer approaches such as Artificial Neural Networks (ANNs) and Deep Learning (DL) algorithms have proven handy for complex problems. ANNs are bio-inspired data-processing algorithms designed to model the way the human brain operates. They are typically structured in layers, i.e., an input layer, one or more hidden layers, and an output layer, each containing many neurons. Recently, ANNs with multiple hidden layers, usually referred to as Deep Neural Networks (DNNs) or Deep Learning (DL), have been introduced to solve EM problems. In particular, DL offers a wide variety of network architectures, such as the multilayer perceptron (MLP), the convolutional neural network (CNN) (with many variants, including the U-Net, an encoder-decoder CNN with skip connections), the recurrent neural network (RNN), and the generative adversarial network (GAN).
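The layered input/hidden/output structure described above can be sketched with a small MLP regressor learning a smooth nonlinear mapping. The target function and network sizes are illustrative assumptions, not drawn from any cited EM study.

```python
import numpy as np
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(3)
# A smooth nonlinear mapping to learn: y = sin(2*pi*x) sampled on [0, 1].
X = rng.uniform(0, 1, size=(400, 1))
y = np.sin(2 * np.pi * X).ravel()

# One input layer, two hidden layers of 32 neurons each, one output:
# the layered structure of an ANN/MLP described above.
mlp = MLPRegressor(hidden_layer_sizes=(32, 32), activation="tanh",
                   solver="lbfgs", max_iter=2000, random_state=0).fit(X, y)

print(round(mlp.score(X, y), 2))   # R^2 of the fit on the training data
```

Deeper stacks of such layers (DNNs), or specialized layer types (convolutional, recurrent), follow the same fit/predict pattern but scale to the image-like and sequential data common in EM problems.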

ML and EM
In the era of advanced communication characterized by high speed and high bandwidth network topologies, the need for EM systems to offer reconfigurability, compactness, directivity, and energy efficiency has become a necessity. Furthermore, the detection and identification of remote objects and faulty components in systems as well as pattern recognition of radiating signals in electrical systems are the latest trends in EM. On the other hand, the optimization of EM systems for superior performance given the constraints has been a recurrent challenge. ML-based approaches have the potential to address such complex and multifaceted challenges. The EM scientific community has leveraged ML for various applications. In the next few subsections, a few of the applications of ML in EM will be discussed.

Design Optimization and Synthesis
Antennas are an integral part of any EM system, whose development starts with design, followed by implementation and testing [35]. Traditionally, full-wave (FW) EM simulation, such as the finite-difference time-domain (FDTD) or finite-element modeling (FEM) methods, is used for antenna design and optimization. These methods require large computational resources and much time [36]. In fact, optimizing antenna arrays may involve many repetitions of EM simulations to fine-tune the geometric and/or material parameters for performance improvement. Thus, applying ML models to the design of compact antennas or antenna arrays with high gain, transmission efficiency, and directivity, including suitable material selection, can enrich the effectiveness of the design. ML has been used in different applications of antenna design and optimization [37][38][39]. Lecci et al. [40] proposed an ML framework that enabled a simulation-based optimization of thinned arrays, considering network-level metrics such as signal-to-interference-plus-noise ratio statistics, based on a Monte Carlo approach. Koziel et al. [41] proposed a multi-objective, optimization-based sequential mode algorithm that provided optimized design characteristics of antennas using a few hundred full-wave EM simulations. Surrogate-modeling-based antenna optimization is one of the most important methods in antenna design [42]. Its purpose is to replace computationally expensive EM simulation with computationally cheap estimation models. Surrogate models are formed using statistical learning techniques [43]. A multistage collaborative ML platform (MS-CoML) was introduced in another study [44], which reported an almost twofold speed-up in antenna modeling without compromising accuracy via single-output Gaussian process regression (SOGPR).
By jointly applying three ML methods, i.e., SOGPR and the symmetric and asymmetric multi-output GPR methods, surrogate antenna models for different design objectives were constructed from a limited number of high-fidelity responses while achieving high prediction accuracy. Three antenna examples, i.e., single-band, broadband, and multiband designs operating in the Q, V, and S/C bands, were chosen to validate the proposed method.
Simulation results showed that the MS-CoML method can greatly reduce the total optimization time without compromising modeling accuracy or performance. An advanced study of a modified parallel surrogate-model-assisted differential evolution algorithm (PSADEA) was reported previously [45], showing 1.8-times-faster optimization than the conventional surrogate modeling method, with higher antenna quality. Another approach, which claimed a 90% time saving in antenna optimization compared to other surrogate models, is the single-fidelity, EM-model-based, training-cost-reduced, surrogate-model-assisted hybrid differential evolution for complex antenna optimization (TR-SADEA) [46]. It is a self-adaptive hybrid surrogate modeling framework that facilitates better performance for complex antenna designs with increased numbers of design variables and specifications. A computational expense reduction of more than 80% was achieved in [47] by implementing a Gaussian process regression (GPR) model trained on high-fidelity data.
A semi-supervised process was proposed in another study [48], where a GP model and an SVM model were concurrently trained using a small number of pre-labeled samples. The system was controllable by selecting the required accuracy, which optimizes design time. It offered high-precision predictive ability compared with conventional supervised-learning-based surrogate modeling using less labeled data and provided quality antenna designs utilizing only 10 to 15 EM simulations. Xue et al. [49] developed a hybrid ML model in which ten different models, acting as base learners, were initially trained using a small set of data. The initial predictions were then fed into a K-nearest neighbors (KNN) model, which produced a final prediction with a mean squared error of 0.00456 for a triangular-shaped, pin-fed patch antenna. ML has also become a reliable assistive technology for increasing the transmission efficiency of antennas used in wireless power transmission [50].
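The surrogate-modeling idea running through the studies above (spend a small budget of expensive full-wave simulations, then answer further design queries from a cheap statistical model) can be sketched with a Gaussian process regressor. The `fullwave_sim` function below is a hypothetical analytic stand-in for a real FDTD/FEM solve, and the single geometric parameter and resonance shape are invented for illustration.

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF

# Hypothetical stand-in for an expensive full-wave simulation: maps one
# geometric parameter (e.g. a patch length in mm) to a resonance-like
# |S11| dip in dB. A real surrogate would be trained on solver output.
def fullwave_sim(length_mm):
    return -20.0 * np.exp(-((length_mm - 30.0) ** 2) / 4.0)

# A small budget of "expensive" simulations forms the training set.
X_train = np.linspace(25, 35, 12).reshape(-1, 1)
y_train = fullwave_sim(X_train).ravel()

# Fit a GP surrogate; subsequent predictions are nearly free, and the
# predictive std quantifies where more simulations would help.
gpr = GaussianProcessRegressor(kernel=RBF(length_scale=2.0)).fit(X_train, y_train)

pred, std = gpr.predict(np.array([[29.5]]), return_std=True)
print(round(float(pred[0]), 1))   # close to fullwave_sim(29.5)
```

An optimizer can now query the surrogate thousands of times per second, reserving true solver calls for verifying the final candidate, which is the source of the speed-ups reported in [44-47].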
Antenna synthesis involves determining a geometrical or physical form from knowledge of the electrical parameters. ML models can be fully used to enhance antenna synthesis efficiency. A Gradient Boosting Tree (GBT) was applied in a previous study [51] to synthesize a phased-array antenna by estimating the phase angle for different amplitudes. To predict the phase shift in a reflectarray antenna, another study [52] showed significant accuracy using a deep CNN-based AlexNet. The model takes the radiation pattern and beam direction as inputs and predicts the phase shift with less than 0.4% prediction error. Another ML approach was introduced in [53], where the coupling coefficient between pantograph arcing and a GSM-R antenna was determined using multiple neural network (NN) regressors. The study showed significantly faster performance (almost three times faster via ML) while preserving the accuracy of the coupling coefficient. An NN provided over 99% accuracy in synthesizing an H-shaped rectangular microstrip antenna (RMSA) [54]. However, studies have shown that Radial Basis Function (RBF) ANNs are best suited to evaluating the resonant frequency of rectangular microstrip patch antennas [55]. Radiation field estimation utilizing the near-field focusing technique, offered by an earlier report [56], showed faster performance of the SVM model with a smaller training dataset. Investigators [57] developed an inverse NN (INN) model that could determine the VSWR, gain, radiation pattern, and radiation efficiency in different planes with greater accuracy using a small training dataset of only 36 samples. The INN has also shown usability in transmit-array antenna synthesis from a given transmission coefficient [58].
Sharma et al. [59] showed how modern ML algorithms such as the least absolute shrinkage and selection operator (lasso), KNN, and ANN improved the design optimization and synthesis of antennas based on a specific bandwidth selection. The performance table and the frequency response curve for that study are shown in Figure 4, which identifies the KNN algorithm as the fastest of all. Again, in a different study [60], a modified KNN achieved the lowest S11 parameter compared to GPR, ANN, and a conjugate gradient (CG) method with a small dataset of 36 samples. A CNN was used earlier [61] to estimate the phase angle from an input two-dimensional radiation pattern. The network was able to accurately calculate the phases needed to synthesize the desired pattern.
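Several of the studies above train on remarkably small design sets (36 samples). A distance-weighted KNN regressor illustrates how such a set can still guide a design search for the deepest S11 dip; the one-parameter geometry and the analytic S11 curve below are hypothetical stand-ins, not data from the cited works.

```python
import numpy as np
from sklearn.neighbors import KNeighborsRegressor

# Hypothetical small design set: 36 designs, one geometric parameter
# each (length in mm), with a simulated |S11| value in dB generated
# here by a simple analytic stand-in for a solver.
lengths = np.linspace(20, 40, 36).reshape(-1, 1)
s11 = -25.0 * np.exp(-((lengths - 31.0) ** 2) / 8.0).ravel()

# Distance-weighted KNN interpolates |S11| between nearby designs.
knn = KNeighborsRegressor(n_neighbors=4, weights="distance").fit(lengths, s11)

# Sweep a dense grid of candidate lengths through the cheap model and
# pick the one with the deepest predicted dip.
grid = np.linspace(20, 40, 400).reshape(-1, 1)
best_length = float(grid[knn.predict(grid).argmin()][0])
print(round(best_length, 1))   # near the true minimum at 31.0 mm
```

Only the winning candidate then needs a confirming full-wave simulation, keeping the total solver budget close to the size of the original 36-sample set.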

An adaptive chaotic particle swarm optimization (PSO) algorithm was used to avoid trapping in a local optimum, which would otherwise cause premature convergence of the antenna synthesis. A prior article [62] provided a DL methodology that determined the amplitude and phase of the antenna elements in response to the input radiation patterns, although it showed reduced accuracy near 0° and 360°.
DL-based optimizers also provide significant improvements in DL models [63]. A combined GP-CNN model was proposed by Zhang et al. [64] as a tool for the rapid optimization of EM problems such as antenna design. PSO was also used to optimize the model parameters, which elevated performance, reduced convergence time, and increased design accuracy. For higher bandwidth and gain, PSO was used earlier [65] to optimize a dielectric resonator antenna. To handle the combination of linear and non-linear relationships between the geometrical parameters and design specifications of antennas, a deep CNN was used for fast and accurate antenna design [66]. In an earlier study [67], modeling multiband antennas using three ML algorithms with increased accuracy and performance was reported. Kazemian et al. [68] described a directional antenna design scheme using a multi-objective GA to reduce the number of antenna side lobes and the side-lobe levels, which eventually aids directivity and security in wireless communication. Rudant et al. [69] proposed a dual-band Global Navigation Satellite System (GNSS) micro-array antenna design optimization method to achieve higher Right-Hand Circular Polarized (RHCP) gain and lower Left-Hand Circular Polarized (LHCP) gain.

Antenna Selection Applications
Antenna selection is key for efficient transmission with enhanced communication and resource utilization. ML has the potential to play significant roles in automating the selection of antennas for application-specific needs. Especially in multiple-input multiple-output (MIMO) technology, several developments utilizing ML for efficient transmit antenna selection are visible. To select a suitable antenna subset for large-scale MIMO systems, a study [70] proposed a dynamic generalized spatial modulation framework with Euclidean-distance-optimized antenna selection (EDAS) and a multilayer perceptron (MLP) model, which enabled higher diversity gain. Another article [71] proposed a CNN-based transmit antenna selection for a non-orthogonal multiple-access MIMO system for 5G applications, where the proposed algorithm was 10,000 times faster than the exhaustive search and 2 times faster than the hyper region proposal network (HRPN), with an 89% validation accuracy. However, SVM, Naive Bayes, and KNN showed increased performance in selecting transmit antennas in untrusted relay networks, conserving standard channel-state information (CSI) secrecy with decreased computational complexity compared to the exhaustive search in MIMO systems (Figure 5) [72,73].
A multilevel CNN was used in a MIMO Internet of Things (IoT) system to select transmit antennas [74]. For modern MIMO communication transmit antenna selection, a learn-to-select (L2S) approach was implemented by Diamantaras et al. [75], where achieving the optimal uniform linear array of antennas was expensive from both the design and material-cost perspectives. A DNN-based approach implemented by Zhong et al. [76] outperformed the conventional norm-based approach for antenna selection in MIMO software-defined radio systems by 53%. Aside from MIMO technology, ML was used previously [77], where a Gaussian Mixture Model was adopted to sort the features of an RF fingerprinting dataset, and an SVM was then used to classify the antenna for the classification and wireless identification of different RF devices. This study showed above 75% accuracy in the case of heavily noise-affected RF signals and better feature extraction performance than conventional algorithms. A prior study [78] proposed a deep-learning-based approach rather than SVM for joint antenna selection, as well as a precoding design algorithm, to select the suitable group of antennas for base stations with an increased system sum rate and quality of service. The reason for selecting a DNN is to improve performance by using more elaborate functions compared to SVM-based hyperplane models.

Antenna Position, Direction, and Radiation Estimation
ML techniques have been significantly applied in estimating antenna position and direction to achieve the maximum gain in transmission and receiver systems. It assists in the detection and control of beam phasing of antennas based on signal patterns, strength, and target location [79,80]. The direction of arrival (DoA) estimation has become a popular application in military-civil research for remote object detection [81][82][83]. The fixed-antenna method for DoA estimation has limitations, hence antenna arrays have been used as an alternative in practice. The selection of suitable antennas for an antenna array and their positioning for multiple-input single-output (MISO) applications using ML [84] is gaining popularity due to its optimized computational time and performance.
Barthelme et al. [81] studied an NN-enabled algorithm to estimate DoA with lower computational complexity. A modified genetic algorithm (GA) was implemented in another study [85] to predict the optimal position of the antenna in an RF-based advanced driving assistance system. A previous article [79] described a rotating, elevated antenna technique assisted by ML to estimate the angle for a better field of view and cost-effective automation. In another study [86], ANN was used to estimate beam alignment and distribution without prior knowledge of user location information. Hong et al. [87] described a directional antenna design scheme using multi-objective GA to reduce the number of antenna side lobes and the lobe level, which eventually aided directivity and security in wireless communication. In a prior report [88], GPR was used for computing the resonant frequency of a square microstrip patch antenna. ML algorithms perform better in both linear and complex problems [89], where different scattering parameters are estimated in the UHF band of an antenna operating at a resonance frequency. An earlier study [90] provided a predictive model using ANN, which can predict the resonant frequency of a flexible microstrip patch antenna considering an inserted airgap. In a past investigation [91], the Support Vector Regressor (SVR) method was introduced to predict the EM response of complex reflect-array antenna elements with complex shapes. SVR enabled reliable, accurate, and fast EM response estimation (15% more time was saved compared with the FW simulation). Furthermore, a recurrent neural network (RNN) was used to suppress harmonics in wireless communication [92]. Deep neural networks (DNN) aid in reduced computational complexity while predicting the optimized direction of arrival (DoA) [93]. The study also showed how a hybrid multilayered network can both detect the optimum antenna from an antenna array and estimate DoA at the same time. 
Figure 6 shows the model architecture for the complex algorithm to estimate DoA reported in that study [93]. Another study [94] proposed a deep learning model to accurately track mobile stations and point to the satellite antenna to obtain an interference-free communication link. To obtain the desired radiation pattern, Lutati and Wolf [95] used a composite ANN approach of a hypernetwork. Recurrent learning was shown to have the scope for advancing the control of antenna such as safe antenna tilting at a remote distance [96].
Reconfigurability in antennas can be achieved through different mechanisms such as electrical, optical, physical, and/or material changes [97][98][99][100][101]. ML has become popular for reconfigurable antennas, specifically frequency-reconfigurable ones [102,103]. For example, an intelligent surrogate-model-assisted differential evolution to synthesize an antenna array with reconfigurable frequency was reported earlier [24]. ML has been shown to be more robust in complex environments than most other signal processing techniques and has offered an increased signal-to-noise ratio and efficient beamforming architectures. Different types of fractal antennas, star antennas, and pattern-reconfigurable antennas now utilize ML algorithms to provide automatic reconfigurability features serving modern applications [104,105].

Remote Object Detection and Recognition
ML has been applied in the analysis and prediction of information from radar signals. Synthetic Aperture Radar (SAR) is a microwave tool for the detection and recognition of targets as well as for the analysis of natural and man-made scenes. It offers all-weather, day-and-night operation with high-resolution imaging capability. Recently, CNNs have been used for SAR target recognition (TR) and can achieve high precision in image recognition and classification. However, the shortage of data still makes SAR-based target recognition difficult. An earlier report [106] utilized a CNN with three types of data augmentation to address common issues in SAR-TR: the translation of a target, the randomness of speckle noise across observations, and the lack of pose images in training data. To overcome the over-fitting problem of CNNs for SAR-TR, Chen et al. [107] proposed a new all-convolutional network (A-ConvNets) based on sparsely connected layers. The experimental results showed that, using data-driven features the model learned automatically from SAR image pixels, A-ConvNets achieves excellent classification accuracy.
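Two of the three augmentation types mentioned above, translation and speckle, are easy to sketch in numpy. The 8x8 chip and the 4-look gamma model for multiplicative speckle are illustrative assumptions, not the exact settings of [106].

```python
import numpy as np

rng = np.random.default_rng(0)

def translate(chip, dx, dy):
    """Translation augmentation: shift the target within the chip, zero-filling the border."""
    out = np.zeros_like(chip)
    h, w = chip.shape
    src = chip[max(-dy, 0):h - max(dy, 0), max(-dx, 0):w - max(dx, 0)]
    out[max(dy, 0):max(dy, 0) + src.shape[0], max(dx, 0):max(dx, 0) + src.shape[1]] = src
    return out

def speckle(chip, looks=4):
    """Speckle augmentation: SAR intensity noise is multiplicative and gamma-distributed."""
    return chip * rng.gamma(shape=looks, scale=1.0 / looks, size=chip.shape)

chip = np.zeros((8, 8))
chip[3:5, 3:5] = 1.0                 # toy 'target' in the image chip
shifted = translate(chip, dx=2, dy=1)
noisy = speckle(chip)
print(shifted.sum(), noisy.shape)
```

Applying many random shifts and speckle draws to each labeled chip multiplies the effective training-set size, which is exactly what data-starved SAR-TR needs.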
Mufti et al. [108] used a transfer learning approach in which a pre-trained CNN, namely AlexNet, served as a feature extractor. Moreover, Pei et al. [109] proposed a multiview deep convolutional neural network (DCNN), an implementation of a deep learning architecture, to recognize targets using multiple viewing angles. The framework effectively learns and extracts classification information from the multiple views and requires only a small number of SAR images to train the network model. Experimental results on the moving and stationary target acquisition and recognition datasets have shown the superiority of the proposed framework. By analyzing multiple images from different viewpoints (Figure 7a,b), the proposed algorithm showed an improved recognition rate using four input views of the targets compared to two or three input views, in the case of an extended operating condition (Figure 7c). CNNs require a massive number of parameters and operations to generate a single inference, making them unsuitable for latency- and energy-constrained applications such as SAR-TR. To reduce the cost of implementing these networks, Dbouk et al. [110] developed a set of compact network architectures, which achieve an overall 984-fold reduction in storage requirements and a 71-fold reduction in computational complexity compared to state-of-the-art CNNs for automatic target recognition. To achieve good performance with a small number of parameters, Huang et al. [111] proposed a lightweight two-stream CNN to extract multilevel features. Experimental results demonstrate that the proposed method achieved higher recognition accuracy than other CNN-based methods. Another method for determining the characteristics of an unknown reflective object is analyzing the signals from EM inverse scattering problems.
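The arithmetic behind compact architectures is worth making concrete. One common trick (depthwise-separable convolution, in the style of MobileNet designs; not necessarily the exact mechanism of [110] or [111]) factorizes a standard convolution, and the parameter savings follow directly:

```python
def conv_params(c_in, c_out, k):
    """Weights in a standard k x k convolution layer (bias omitted)."""
    return c_in * c_out * k * k

def depthwise_separable_params(c_in, c_out, k):
    """Depthwise k x k filter per input channel plus a 1 x 1 pointwise projection."""
    return c_in * k * k + c_in * c_out

std = conv_params(64, 128, 3)
sep = depthwise_separable_params(64, 128, 3)
print(f"standard: {std}, separable: {sep}, reduction: {std / sep:.1f}x")
```

For a 64-to-128-channel 3x3 layer this is 73,728 weights versus 8,768, roughly an 8x reduction per layer; compounded over a deep network, such factorizations are how storage and compute reductions of the magnitudes cited above become possible.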


Inverse Scattering Problem
The electromagnetic inverse scattering problems (ISPs) seek to determine the nature of an unknown object (i.e., its location, shape, and constitutive parameters) from knowledge of the measured scattered fields. Imaging techniques based on ISPs are instrumental in numerous areas, such as remote sensing, biomedical imaging, through-wall imaging, geophysics, and non-destructive evaluation. ISPs are challenging to solve due to their intrinsic nonlinearity. This has led to the development of various EM inverse scattering methods, which can be categorized into two groups: (1) deterministic optimization methods, including the subspace optimization method (SOM) [112], the distorted Born iterative method [113], and contrast source inversion (CSI) [114]; and (2) stochastic methods [115], including genetic algorithms, evolutionary optimization, and PSO algorithms. Compressive-sensing (CS)-based methods are used as effective regularization tools to mitigate ISP-related challenges, especially for high-contrast and large objects, where complexity and computing time are otherwise prohibitive. DNNs have been successfully used to solve inverse problems. Moreover, learning-by-examples (LBE) methods have been proposed to devise various ML approaches for resolving ISPs [116]. Wei et al. [20] developed DL schemes for ISPs that produced good quantitative results by training a U-Net [117], a CNN architecture originally designed for biomedical image segmentation.
To exploit the relationship between DNN architectures and the iterative methods of nonlinear EM inverse scattering, the authors [20] proposed three inversion schemes based on a U-Net CNN.
Li et al. [118] investigated a deep neural network for nonlinear inverse electromagnetic scattering (DeepNIS) based on a cascade of three CNN modules, in which the inputs from the scene were processed by the backpropagation (BP) method and the output was the reconstructed image of the unknown scatterers. Deep-learning schemes can achieve excellent performance, but it is difficult to imbue them with the physical knowledge of electromagnetic inverse scattering. Thus, to bridge the gap between physical knowledge and learning approaches, Wei et al. [119] proposed an induced-current learning method that integrates the benefits of iterative algorithms into a CNN architecture. To solve ISPs with high contrast, a contrast source network (CS-Net) combined with a traditional subspace-based optimization method was investigated [120], and three-stage CNNs were developed [121]. Furthermore, Yao et al. [122] proposed a two-step deep-learning approach that can reproduce high-contrast objects using a cascade of CNNs and a complex-valued deep residual CNN. In another study [123], a gradient learning approach was used to invert transient electromagnetic data. Because phase information is often difficult to measure accurately, inversion methods using only phaseless data (i.e., amplitude-only data) are preferred for engineering applications; Xu et al. [124] proposed three inversion schemes based on U-Net CNNs to solve phaseless ISPs.
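The deterministic baseline that these learning schemes build on can be sketched as a linearized (Born-approximation) inversion with Tikhonov regularization. This is a toy: the forward operator A below is a random complex stand-in for a discretized Green's-function matrix, and the scatterer is a 1-D contrast profile, but the ill-posedness (fewer measurements than unknowns) and the regularized least-squares fix are the real structure of the problem.

```python
import numpy as np

rng = np.random.default_rng(2)

# Linearized ISP under the Born approximation: y = A x + n, where x is the
# (flattened) contrast profile, A a discretized forward operator (here a
# random stand-in), and y the measured scattered field.
n_meas, n_pix = 40, 100          # fewer measurements than unknowns: ill-posed
A = rng.standard_normal((n_meas, n_pix)) + 1j * rng.standard_normal((n_meas, n_pix))

x_true = np.zeros(n_pix)
x_true[40:55] = 1.0              # simple block scatterer
y = A @ x_true + 0.01 * (rng.standard_normal(n_meas) + 1j * rng.standard_normal(n_meas))

def tikhonov(A, y, lam):
    """Regularized least squares: argmin ||Ax - y||^2 + lam ||x||^2."""
    AH = A.conj().T
    return np.linalg.solve(AH @ A + lam * np.eye(A.shape[1]), AH @ y)

x_hat = tikhonov(A, y, lam=1.0).real
err = np.linalg.norm(x_hat - x_true) / np.linalg.norm(x_true)
print(f"relative reconstruction error: {err:.2f}")
```

The sizeable residual error of such linear schemes on under-sampled data is precisely the gap that U-Net-style post-processing of a BP or Tikhonov image, as in [20,118], is trained to close.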

Fault Detection Systems
The failure of elements in an antenna array causes sharp field variations across the array aperture, degrading the radiation pattern of the antenna system. Therefore, identifying defective elements is vital for correcting array failures. The Woodward-Lawson method [125], case-based reasoning systems [126], GA [127], and ML techniques have been used to locate faulty elements in an antenna array [128]. Patnaik et al. [129] used an ANN to locate a maximum of three defective elements in a 16-element array from its distorted radiation pattern; the results were in excellent agreement with simulations. In another study [130], an SVM classifier was used to detect defective elements in a four-element array, using a different SVM for each combination of defective elements. However, this approach was not feasible for moderate or large arrays, as the number of possible combinations grows rapidly. The training and detection of the SVM were conducted on measured radiation data with a signal-to-noise ratio (SNR) in the range of 0-25 dB. In shipborne antennas, faults are arduous to detect due to the harsh working environment and strong background noise, so DL can play an important role by using multiscale analysis with a multi-layered network [131]. EM can also be used to detect faults in power transmission anchor rods; the procedure is noninvasive and nondestructive, as no physical contact is required. However, the acquired signal is complex, and ML techniques must be used to recognize the radiation pattern and thus detect anomalies [132]. The uncertainty of sudden antenna failure in textile applications was characterized using possibility theory with Bayesian optimization in a prior article [133]. A defect in a base station often creates sleeping cells, resulting in network failures that are hard to detect; the Naive Bayes method detected this kind of failure with excellent precision [134].
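The underlying structure of the fault-location task can be shown with a toy that is deliberately simpler than the ANN of [129]: for a 16-element uniform linear array, the far-field pattern is linear in the element excitations, so in the noise-free case the excitations can be recovered from pattern samples by least squares, and failed elements show up as near-zero recovered amplitudes. The half-wavelength spacing and the noise-free measurement are assumptions for clarity; the ML methods above earn their keep when noise and model error break this clean linear picture.

```python
import numpy as np

n_elem, n_samples = 16, 181
theta = np.radians(np.linspace(-90, 90, n_samples))
# Far-field array factor: AF(theta) = sum_n w_n exp(j*pi*n*sin(theta))
A = np.exp(1j * np.pi * np.outer(np.sin(theta), np.arange(n_elem)))

w_true = np.ones(n_elem)
w_true[5] = 0.0                  # elements 5 and 11 have failed
w_true[11] = 0.0
pattern = A @ w_true             # 'measured' distorted pattern (noise-free for clarity)

# Recover excitations by least squares and flag low-amplitude elements.
w_hat, *_ = np.linalg.lstsq(A, pattern, rcond=None)
faulty = np.where(np.abs(w_hat) < 0.5)[0]
print("detected faulty elements:", faulty)
```

With measured data at 0-25 dB SNR, as in [130], this inversion degrades, which is why trained classifiers and regressors are used instead of plain least squares.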

Miscellaneous Applications
ML has been found useful in versatile EM applications. Tang et al. [135] described a dual-antenna approach that optimizes the antenna parameters using a GA to effectively mitigate path loss and increase signal power. A dual-band planar inverted-F antenna (PIFA) for 5G mobile communications was proposed in an earlier study [136] using a hybrid algorithm incorporating a GA and Bayesian convolution. An ANN was used to optimize a dual-band antenna for the desired return loss and operating frequencies [137], which is ideal for 5G communication.
For better configuration of the number of subcarriers and the size of constellation symbols in mm-Wave communications, KNN was used in another study [138] to maximize the system throughput. The system was evaluated using four antenna configurations, with a uniform cylindrical array yielding the highest data rate. A KNN-based power allocation algorithm was proposed in [139] for a distributed antenna system. Thirteen ML algorithms, including GP, SVM, KNN, Naive Bayes, and ANN, were adopted in another study [140] for performance testing of fingerprint-based mobile terminal localization, in which KNN proved more efficient than the other algorithms, producing the least mean squared error.
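Fingerprint-based KNN localization, of the kind benchmarked in [140], fits in a short numpy sketch. The log-distance path-loss model, its exponent of 3, the four access-point positions, and the 1 dB noise level are all illustrative assumptions, not values from the cited study.

```python
import numpy as np

rng = np.random.default_rng(3)

def rss_fingerprint(pos, aps):
    """Toy log-distance RSS (dBm) from each access point; path-loss exponent 3."""
    d = np.linalg.norm(aps - pos, axis=1)
    return -40 - 30 * np.log10(np.maximum(d, 0.1))

aps = np.array([[0.0, 0.0], [10.0, 0.0], [0.0, 10.0], [10.0, 10.0]])

# Offline phase: build a fingerprint database on a grid of reference points.
grid = np.array([[x, y] for x in range(11) for y in range(11)], dtype=float)
db = np.array([rss_fingerprint(p, aps) for p in grid])

def knn_locate(rss, k=3):
    """Online phase: average the k nearest reference points in signal space."""
    idx = np.argsort(np.linalg.norm(db - rss, axis=1))[:k]
    return grid[idx].mean(axis=0)

true_pos = np.array([4.3, 6.8])
measured = rss_fingerprint(true_pos, aps) + rng.normal(0, 1.0, size=4)
est = knn_locate(measured)
print("estimated position:", est)
```

The same nearest-neighbour lookup over a precomputed table is what makes KNN attractive for power allocation as well: the online cost is a distance computation, not a model evaluation.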
Jiang and Schotten [141] studied deep RNN-based approaches, including long short-term memory (LSTM) and the gated recurrent unit, to estimate the channel condition. ML has been utilized in different antenna-related Internet of Things (IoT) applications [142]. The single-input multiple-output (SIMO) system is a good example of a modern IoT application in which an ANN enabled significant improvements in learning modulation-demodulation schemes for multipath channels [143]. To achieve energy efficiency while preserving quality of service in a shared wireless medium, base stations must choose the best subset of antennas to exploit the spatial diversity of multiple users. Deep learning is also used in distributed antenna systems for optimal power allocation to achieve spectral and energy efficiency with reduced computational complexity [144]. ML has also been used in medical applications, such as breast cancer detection by EM radiation [145]. Moreover, ML is used to generate models for evaluating the antenna performance of wireless devices adjacent to the human body, such as radiation patterns, return loss, and impedance [146]. It has been utilized to predict breast cancer by estimating the S-parameters of an ultra-MIMO sensor antenna using PCA [147]. Figure 8 illustrates the versatility of ML in EM applications involving the design, synthesis, recognition, and optimization of various EM-based systems.
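The role PCA plays in S-parameter-based sensing, as in [147], can be sketched with synthetic data. The |S11| sweeps below are a fabricated toy (two latent factors, resonance depth and a small frequency shift, plus noise), so only the mechanics of PCA via SVD are meant to transfer, not any claim about the cited sensor.

```python
import numpy as np

rng = np.random.default_rng(4)

# Toy |S11| sweeps (dB) for 50 measurements over 64 frequency points:
# two latent factors (resonance depth and a small shift) plus noise.
freqs = np.linspace(2.0, 3.0, 64)
depth = rng.uniform(5, 20, size=50)
shift = rng.uniform(-0.02, 0.02, size=50)
S11 = np.array([-d * np.exp(-((freqs - 2.5 - s) / 0.05) ** 2)
                for d, s in zip(depth, shift)])
S11 += rng.normal(0, 0.2, S11.shape)

def pca(X, n_components):
    """PCA via SVD of the mean-centred data matrix."""
    Xc = X - X.mean(axis=0)
    U, sing, Vt = np.linalg.svd(Xc, full_matrices=False)
    scores = U[:, :n_components] * sing[:n_components]
    explained = sing[:n_components] ** 2 / (sing ** 2).sum()
    return scores, explained

scores, explained = pca(S11, n_components=2)
print("explained variance ratios:", explained.round(3))
```

Reducing 64 correlated frequency points to two scores is what makes a downstream classifier on S-parameter data both faster and less prone to overfitting.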


Limitations
As is evident from the prior sections, the advent of ML has revolutionized progress in EM. However, machine-learning-based approaches have limitations. As ML approaches are data-driven, major limitations arise from inadequate data to train the model effectively. Moreover, as ML facilitates predictions, accuracy can decrease in new cases, requiring continuous evaluation and retraining of the model. Even the best-performing CNN is found to be quite inaccurate outside of its known environment [148]. Although surrogate modeling provides great opportunities, it requires sufficient empirical data from experiments and sophisticated tuning of the hyperparameters to achieve the expected results, and it often lacks efficiency for high-dimensional and highly non-linear problems. Although different unsupervised algorithms are used to solve feature reduction and classification problems, almost all require expertise in designing and programming the model of interest. While it is important to provide abundant data during training, as the model's accuracy mostly depends on it [149], additional settings such as the learning rate and batch size are also essential for optimized outcomes [150].
Although PCA and GA have shown significant optimization of ML algorithms using intelligent search-based mechanisms [151,152], further improvement is necessary to achieve better results. Even in antenna arrays, the performance of ML techniques is not always satisfactory due to the increased number of possible antenna combinations. Sometimes, even larger datasets for fault detection result in low accuracy, which makes selecting the appropriate ML algorithm difficult [153]. Table 1 summarizes the pros and cons of different ML algorithms when applied in various aspects of EM. Since the emergence of 5G technology, antenna arrays have received unprecedented attention and will continue to play a dominant role in next-generation wireless communications systems. Small-cell clustering, MIMO channel estimation, bandwidth-sharing automation, and various fault detection tasks in 5G technology are gradually adopting ML [154]. The use of ML improves energy efficiency, allows for more signal-path diversity, and helps to mitigate multi-path fading. Electromagnetic radiation-based detection can be a prominent solution [155,156] in which deep learning can play an important role by analyzing the radiation pattern and facilitating a fast response. ML classifiers have displayed significant accuracy in providing information beyond the line of sight and are now used in the Global Navigation Satellite System (GNSS) to resolve the disputes caused by multipath and non-line-of-sight signal propagation between the earth and satellites [157]. Another study [158] has shown the feasibility of ML in the state recognition of satellite antennas using minimal data. In addition, published literature [159,160] proved the viability of ML algorithms in detecting spoofing attacks in multi-antenna snapshot receivers by tracking the antenna positions by satellite.
ML can also automate the design procedure in different computational EM methods for intelligent CAD and CAE applications [161]. ML is becoming popular in the design of complex antennas for advanced and new applications involving 5G technology and the IoT [162]. To represent path loss in the complex environments of future wireless sensors, ML can contribute significantly via advanced feature selection and modeling tools [163]. Again, determining the signal condition is an important aspect of modern wireless networks, in which ML algorithms can play a significant role [164]. AI and its subset of ML approaches outperform conventional algorithms in the implementation of reconfigurable antenna arrays, with greater robustness in noisy and multipath environments [104], and even in the protection of devices [165].

Table 1. Pros and cons of different ML algorithms applied in EM (excerpt).

| Algorithm | Pros | Cons |
| CNN | — | Not suitable for real-time object detection due to the high latency of handling massive numbers of parameters; less effective for MIMO systems |
| KNN | Maximizes system throughput in mm-Wave communications, ensuring the highest data rate and least mean squared error | Unable to reduce computational complexity |
| PCA | Medical applications (e.g., predicting breast cancer by estimating the S-parameters of an ultra-MIMO sensor antenna) | Complex to generate models for evaluating the antenna performance of wireless devices adjacent to the human body (radiation pattern, return loss, impedance, etc.) |

Conclusions
This paper presents a survey of state-of-the-art investigations into the application of machine-learning techniques in EM. The available literature demonstrates the effectiveness and efficiency of ML methods in solving EM problems. The advantages of ML models include their accuracy and ease of implementation for highly non-linear EM problems. A primary disadvantage of ML techniques is the large number of parameters and hyperparameters to set, which makes the process complex and time-consuming. Nonetheless, it can be envisioned that ML techniques will play a dominant role in next-generation wireless technology by providing faster solutions in the EM domain.