1. Introduction
Electromagnetic compatibility (EMC) testing is essential for ensuring the proper functioning of electronic devices [1,2,3,4], with voltage levels measured strictly according to CISPR 25 specifications. These tests are costly in resources, time, and infrastructure, since they require controlled environments (anechoic chamber) and advanced equipment (LISN, EMI receiver, and antenna) to measure electromagnetic interference. It is therefore important to develop new methods that lower these costs while keeping the results accurate. In [1], the authors show that certain bus configurations (cable length, termination, impedance) can cause artificial resonance peaks during testing, distorting the results. In [2], the authors demonstrate that specific setup details, such as cable positioning, grounding, bench height, or harness connections, can significantly influence the results. The authors of [3] demonstrate that the probe, often neglected in classical models, significantly influences the measurement of conducted emissions. Finally, in [4], the authors perform experimental measurements and analyses to understand how fast switching and internal circuit layout influence EMC disturbances.
Alternative approaches that reduce these costs while maintaining the accuracy of the results are therefore needed. Among these methods, we can cite virtual prototyping, integration of the finite element method in circuit-type simulators, circuit-type modeling of active and passive components, reliability testing of active and passive components, the integration of AI algorithms, and co-simulation. In [5], the authors review applications of the PEEC method (field→network conversion), illustrating the state of the art for integrating field models into circuit simulators (FEM→circuit). In [6], the authors present a recent concrete case of co-simulation (multi-physics FEM + controllers/circuits) and show practical integration in system simulators. In [7], the authors demonstrate the extraction of wideband models for key components (motors, filters, and cables) and assemble these models into a circuit diagram for system-level EMC analyses, which demonstrates circuit modeling of passive and active components for EMC simulation. Finally, in [8], the authors conclude that machine learning and artificial intelligence are essential for analyzing and accelerating the solution of electromagnetic problems (antennas, inverse problems, modeling).
In this context, artificial intelligence (AI) and machine learning (ML) emerge as promising solutions for virtualizing EMC testing. By using machine learning, deep learning, and data processing techniques, it is possible to predict common-mode and differential-mode voltages from input data such as current, voltage, and other electrical parameters. These AI models offer an effective alternative for simulating EMC test conditions, enabling accurate predictions without systematic, costly physical testing [9,10,11,12]. Several papers in the literature address the modeling, design, and simulation of EMC effects using AI. In [9], the authors combine machine learning and collective knowledge graphs to propose an intelligent method for diagnosing and managing EMC. The goal is to train machine learning models to automatically detect the sources of undesired emissions and suggest mitigation techniques by organizing knowledge of EMC issues (disturbance sources, coupling paths, and corrective solutions). This speeds up the design process for EMC compliance and lessens reliance on human expertise. The authors of [10] use machine learning techniques to reconstruct RF sources in systems affected by RF extraction. Even in complex or black-box systems, they demonstrate how to train an ML model on measured data to estimate the location and spectrum of the EMI sources that cause noise. Compared to conventional techniques based on direct measurements, this method enables a quicker and more precise diagnosis of EMI issues. In [11], the authors propose a technique that uses deep neural networks to build macroscopic EMI circuit models of black-box systems. The objective is to develop an approximate circuit model that can predict a device’s EMI behavior without full knowledge of its internal workings. In complex systems without detailed models, this method is helpful for quick simulation, design optimization, and disturbance mitigation. To increase the immunity of inductive proximity sensors to electromagnetic interference (EMI), the study in [12] investigates the use of neural networks (AI/deep learning) to filter fast disturbances (electrical transients). After comparing various noise attenuation architectures (CNN, RNN, and hybrid), the authors propose a model based on GRU layers (RNN) that is optimized for noise reduction (≈70%) while respecting memory constraints for a potential embedded implementation. In [13], the author shows how AI methods can transform computational electromagnetics and electromagnetic compatibility in power systems, revealing significant potential for improvement. He shows how these methods can reduce calculation time and improve simulation precision, which helps optimize electrical device design. In [14], the authors propose using artificial neural networks to design inductors quickly and accurately. To reach this goal, AI-based studies typically rely on a small set of methods, such as RNN, ANN, KNN, RF, and PSO [15,16,17,18,19]. Each method has its own strengths and limitations, so a careful evaluation is required to identify the one that best matches the characteristics of the data and the targeted outcomes. According to the results of this study, KNN works well for EMI prediction and can be used as a practical tool for circuit optimization and EMC compliance.
This paper is organized as follows:
Section 2 describes the experimental setup, presenting the device under test (DUT) and the measurement setup according to CISPR 25.
Section 3 displays the deep learning algorithm.
Section 4 presents the dataset collection.
Section 5 and Section 6 present the results and discussion for the different methods described in Section 3 and the conclusions of our study.
5. Results and Discussion
Before evaluating the performance of the different artificial intelligence methods applied to EMC prediction, it is essential to highlight the importance of evaluation metrics. Among the most commonly used are the RMSE, MAE, and R², which play a key role in assessing the reliability and accuracy of a method. These metrics not only quantify the discrepancy between the real and predicted amplitudes of electromagnetic disturbances but also evaluate the model’s ability to correctly reproduce the observed increases and decreases. Thus, they serve as essential tools for judging the relevance of a predictive model in EMC, where high accuracy is crucial for anticipating risks, optimizing designs, and reducing costly laboratory testing iterations [24,25].
Mean Absolute Error (MAE):
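Mean absolute error is defined as shown in Equation (3):
$$\mathrm{MAE} = \frac{1}{N}\sum_{i=1}^{N}\left|y_i - \hat{y}_i\right| \tag{3}$$
where $y_i$ are the real values, $\hat{y}_i$ are the predicted values, and $N$ is the number of observations.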
MAE represents the average absolute deviation between the model’s predictions and the real values, expressed in the same units as the target variable. A smaller MAE indicates that the predictions are, on average, closer to the real values. Since it treats all errors equally, MAE provides an intuitive measure of the typical prediction error.
Root Mean Square Error (RMSE):
Root mean square error is given by Equation (4), with the same notation as in Equation (3):
$$\mathrm{RMSE} = \sqrt{\frac{1}{N}\sum_{i=1}^{N}\left(y_i - \hat{y}_i\right)^2} \tag{4}$$
This measure penalizes larger errors because the differences are squared before averaging. Therefore, RMSE is particularly sensitive to outliers and is often considered more conservative than MAE. A lower RMSE indicates that the model not only achieves good average accuracy but also avoids producing large deviations from the actual data.
Coefficient of Determination (R²):
The coefficient of determination is defined as shown in Equation (5):
$$R^2 = 1 - \frac{\sum_{i=1}^{N}\left(y_i - \hat{y}_i\right)^2}{\sum_{i=1}^{N}\left(y_i - \bar{y}\right)^2} \tag{5}$$
where $\bar{y}$ is the mean of the real values.
R² measures how much of the variance in the dependent variable can be explained by the model. R² = 1 indicates a perfect fit, while R² = 0 means the model does not improve upon simply predicting the mean. In practice, a higher R² indicates a better goodness of fit and closer agreement between predictions and real values.
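As an illustration of the three metrics defined in Equations (3) to (5), the following minimal Python sketch computes them for a short list of values. It is not the MATLAB® code used in this study; the function names and the four sample amplitudes (in dBµV) are hypothetical and chosen only to make the arithmetic easy to follow.

```python
import math

def mae(y_true, y_pred):
    """Mean absolute error: average of |y_i - y_hat_i|."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

def rmse(y_true, y_pred):
    """Root mean square error: square root of the mean squared difference."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))

def r2(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot

# Hypothetical measured and predicted amplitudes in dBµV (illustration only).
measured = [60.0, 55.0, 50.0, 45.0]
predicted = [58.0, 57.0, 49.0, 41.0]

print(mae(measured, predicted))   # 2.25
print(rmse(measured, predicted))  # 2.5
print(r2(measured, predicted))    # approximately 0.8
```

In this example the RMSE (2.5) is larger than the MAE (2.25) because the single 4 dBµV error is weighted more heavily once squared.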
All simulations and MATLAB® code executions were carried out on a personal computer (PCWIN64) running Microsoft Windows 11 Professional. The system is equipped with a 16-core CPU and a total memory of 33.78 GB, of which 10.34 GB was available during the simulations. This computational configuration provided adequate processing capability and memory resources to ensure stable and efficient training and evaluation of the proposed deep learning method.
5.1. Recurrent Neural Network (RNN)
To predict electromagnetic interference levels in both common mode and differential mode from electrical system signals, the MATLAB® code implements an RNN based on an LSTM architecture. The dataset is loaded first; it contains the input variables (power supply, load current, duty cycle, switching frequency, component characteristics, PCB layout, filter capacitance, and gate resistor) together with the frequency and the CM and DM amplitudes (dBµV) used as outputs. These variables are extracted and standardized using Z-score normalization, which gives all inputs the same scale and improves the training effectiveness and numerical stability of the neural network.
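The Z-score step can be sketched as follows. This is a minimal Python illustration rather than the authors’ MATLAB® code: the function names and the two-column example are hypothetical, and the use of the sample standard deviation (division by N − 1) is an assumption.

```python
import math

def zscore_columns(rows):
    """Standardize each column to zero mean and unit sample standard deviation."""
    n, n_cols = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(n_cols)]
    stds = []
    for j in range(n_cols):
        var = sum((r[j] - means[j]) ** 2 for r in rows) / (n - 1)
        stds.append(math.sqrt(var) or 1.0)  # constant column: avoid dividing by zero
    scaled = [[(r[j] - means[j]) / stds[j] for j in range(n_cols)] for r in rows]
    return scaled, means, stds

def apply_zscore(row, means, stds):
    """Scale a new sample with the statistics learned from the training data."""
    return [(v - m) / s for v, m, s in zip(row, means, stds)]

# Hypothetical training rows: [power supply (V), load current (A)].
rows = [[12.0, 1.0], [24.0, 2.0], [48.0, 3.0]]
scaled, means, stds = zscore_columns(rows)
print(means)                                   # [28.0, 2.0]
print(apply_zscore([28.0, 2.0], means, stds))  # [0.0, 0.0]
```

Keeping `means` and `stds` from the training data allows a test input to be scaled in exactly the same way before prediction.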
For the purpose of sequential training over manageable data chunks, the dataset is then split up into multiple fixed-size subsets, each of which contains 32,769 samples. The eight input features (power supply, load current, switching frequency, duty cycle, component characteristics, PCB layout, filter capacitance, and gate resistor) are used as input sequences (X) for each subset, and the corresponding CM and DM measurements are used as target sequences (Y). In order to conform to the LSTM format, which treats each feature as a distinct input channel and each time step as an example in the sequence, these inputs, as well as outputs, are transposed.
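The subset division and the transposition described above can be sketched in a few lines of Python. The helper names are hypothetical; the total of 1,966,140 rows is the matrix size reported in the RF subsection, and the sketch only shows the index arithmetic, not the authors’ MATLAB® implementation.

```python
def split_into_subsets(n_rows, subset_size):
    """Return (start, stop) index pairs that cover n_rows in fixed-size chunks."""
    return [(start, min(start + subset_size, n_rows))
            for start in range(0, n_rows, subset_size)]

def to_channels(rows):
    """Transpose samples-by-features rows into features-by-time-steps channels."""
    return [list(col) for col in zip(*rows)]

bounds = split_into_subsets(1_966_140, 32_769)
print(len(bounds))   # 60 subsets
print(bounds[0])     # (0, 32769)
print(bounds[-1])    # (1933371, 1966140)

# Three time steps with two features become two channels of three steps each.
print(to_channels([[1, 2], [3, 4], [5, 6]]))  # [[1, 3, 5], [2, 4, 6]]
```

With these sizes the dataset divides evenly, so every subset holds exactly 32,769 samples.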
The LSTM network is then built with MATLAB® deep learning functions. It consists of a sequenceInputLayer that receives the eight-dimensional input data, one lstmLayer with 50 hidden units configured to return sequences, a fullyConnectedLayer with two outputs (the predicted CM and DM values), and a regressionLayer that computes the loss during training from the continuous output values. The network is trained with the Adam optimizer using predetermined training parameters: a learning rate of 0.001, a mini-batch size of 64, and 50 training epochs (these parameters are then varied in Cases 2 and 3). Although this configuration does not use a validation set, training progress can be tracked through periodic updates.
Once training is complete for a given subset, the network is tested using a real input set. After that, the test data is normalized in the same way as the training inputs and reshaped for prediction. The trained LSTM model predicts CM and DM noise amplitude for this new input across the frequency range. These predictions are stored and accumulated over all subsets.
After each prediction, the frequency values are denormalized to recover the true frequency in Hz, and the predicted amplitudes are converted from linear units to decibel microvolts (dBµV). For clarity, the script then plots the predicted CM and DM amplitudes as a function of frequency on a logarithmic frequency axis (using the semilogx function), which covers multiple orders of magnitude. Furthermore, experimental reference data with measured CM and DM levels across frequencies is imported from a text file called real_Vmc_Vmd.txt. This makes it possible to compare predicted and actual values graphically in two figures: one for the common mode and one for the differential mode.
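The two post-processing conversions can be written compactly. The Python sketch below assumes that the linear amplitude is expressed in volts, so that the level in dBµV is 20·log10(V / 1 µV), and that the frequency was standardized with a known mean and standard deviation; the function names and numerical values are hypothetical.

```python
import math

def volts_to_dbuv(v):
    """Convert a positive amplitude in volts to dBµV (reference: 1 µV)."""
    return 20.0 * math.log10(v / 1e-6)

def denormalize(z, mean, std):
    """Invert a Z-score: recover the original value from its standardized form."""
    return z * std + mean

print(round(volts_to_dbuv(1e-3), 6))  # 60.0  (1 mV)
print(round(volts_to_dbuv(1.0), 6))   # 120.0 (1 V)
print(denormalize(2.0, 1e6, 5e5))     # 2000000.0 Hz
```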
Three widely used statistical metrics in machine learning and regression, MAE, RMSE, and R², were calculated to assess the precision of the LSTM model predictions. MAE is the average absolute deviation, in dBµV, between the measured and predicted values for both common- and differential-mode voltages. The lower the MAE, the closer the predictions are to the actual measurements. RMSE indicates the spread of the prediction errors and penalizes larger errors more severely; it is mainly used to reveal large discrepancies between measured and predicted data. The coefficient of determination quantifies the proportion of variance in the experimental data explained by the model. An R² value close to 1 indicates close agreement between predictions and measurements, while a value near 0 shows that the model explains very little of the observed variability.
These metrics, reported separately for common mode and differential mode, therefore provide a comprehensive assessment of the model’s performance: average prediction accuracy (MAE), robustness against large errors (RMSE), and overall goodness of fit to the experimental data (R²).
In the training configuration of the proposed RNN (LSTM), three key hyperparameters were defined to control the learning process: the maximum number of epochs (MaxEpochs), the mini-batch size (MiniBatchSize), and the initial learning rate (InitialLearnRate).
5.1.1. Maximum Number of Epochs
This parameter represents the total number of complete passes through the entire training dataset during the learning process. A sufficiently large number of epochs allows the model to converge, while an excessive number may lead to overfitting.
5.1.2. Mini-Batch Size
Mini-batch size specifies the number of training samples used in each weight-update step. Smaller batches can improve generalization but may increase training noise, whereas larger batches provide more stable updates at the cost of higher memory usage.
5.1.3. Initial Learning Rate
This parameter determines the step size used by the optimizer to adjust the model’s weights. A well-chosen learning rate ensures fast and stable convergence, while values that are too high can cause divergence, and values that are too low can slow down the training. For this reason, this parameter should be chosen carefully.
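The effect of the learning rate can be seen on a toy problem. The Python sketch below deliberately replaces the Adam optimizer used in this study with plain gradient descent on f(w) = w², which is an assumption made only to keep the example short; the three learning rates are illustrative and are not the values of Cases 1 to 3.

```python
def gradient_descent(learning_rate, steps=50, w0=1.0):
    """Minimize f(w) = w**2 with fixed-step gradient descent, starting from w0."""
    w = w0
    for _ in range(steps):
        grad = 2.0 * w               # derivative of w**2
        w -= learning_rate * grad    # each step multiplies w by (1 - 2 * learning_rate)
    return w

for lr in (0.001, 0.1, 1.1):
    print(lr, abs(gradient_descent(lr)))
# 0.001 -> still about 0.90 after 50 steps (too slow)
# 0.1   -> about 1.4e-5 (converges)
# 1.1   -> about 9.1e3 (diverges)
```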
Figure 6, Figure 7 and Figure 8 illustrate the real and predicted values for both (a) common and (b) differential mode voltages for Cases 1, 2, and 3.
Case 1: Epoch Max Number = 50, Mini Batch Size = 64, Initial Learn Rate = 0.001
Case 2: Epoch Max Number = 100, Mini Batch Size = 32, Initial Learn Rate = 0.0001
Case 3: Epoch Max Number = 100, Mini Batch Size = 16, Initial Learn Rate = 0.0001
Table 5 summarizes the prediction performance metrics of the RNN-based model for predicting common and differential mode voltages. The values of MAE, RMSE, and R² are reported to quantitatively evaluate the model’s prediction accuracy and its ability to reproduce the variability of the measured data for the different cases.
The results obtained with the RNN model vary with the case and with the mode (CM or DM). For Case 1, the RNN achieves an MAE of 16.52 dBµV in CM and 16.61 dBµV in DM, with corresponding R² values of 0.61 and 0.43. Cases 2 and 3 show a progressive improvement in CM performance, with the MAE decreasing from 15.48 to 14.69 dBµV and the R² increasing from 0.6789 to 0.7578, so the model explains a larger share of the variance. In DM, the performance changes less, with the MAE between 15.52 dBµV and 14.57 dBµV and R² values ranging from 0.4903 to 0.5550. These results suggest that the RNN model is more effective in predicting CM behavior than DM behavior, particularly in Cases 2 and 3.
5.2. Artificial Neural Networks (ANNs)
To ensure successful model training and evaluation, the dataset is split into three parts: 70% for training, 20% for validation, and 10% for testing. The training set is used to learn patterns from the data and adjust the model’s parameters. The validation set, by testing the model’s performance on unseen data during training, helps to tune hyperparameters and avoid overfitting. Lastly, the test set offers an objective evaluation of the model’s capacity to generalize to entirely new data. This structured data split supports reliable predictions and a sound assessment of model performance.
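A 70/20/10 split of one subset can be sketched as follows. This Python example is illustrative only: the function name, the random shuffling with a fixed seed, and the rounding rule are assumptions and may differ from the data division performed by the MATLAB® toolbox.

```python
import random

def split_indices(n_samples, train_frac=0.7, val_frac=0.2, seed=0):
    """Shuffle sample indices and cut them into training, validation, and test sets."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    n_train = round(train_frac * n_samples)
    n_val = round(val_frac * n_samples)
    train = indices[:n_train]
    val = indices[n_train:n_train + n_val]
    test = indices[n_train + n_val:]   # the remainder, about 10%
    return train, val, test

train, val, test = split_indices(32_769)
print(len(train), len(val), len(test))  # 22938 6554 3277
```

The three index sets are disjoint, so no sample used for training is reused for validation or testing.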
Figure 3 represents the real and predicted values for both common and differential mode voltage.
The ANN model is implemented in MATLAB® code to predict electromagnetic interference levels in both CM and DM components based on a system’s electrical parameters. Data-driven prediction of electromagnetic behavior is made possible by this modeling approach, which allows the mapping of input variables to the corresponding noise levels. These variables include power supply, load current, switching frequency, duty cycle, component characteristics, PCB layout, filter capacitance, and gate resistor.
The process begins by loading the eleven columns of the dataset: the inputs (power supply, load current, switching frequency, duty cycle, component characteristics, PCB layout, filter capacitance, and gate resistor) and the outputs (frequency and the measured CM and DM voltages). To facilitate efficient and scalable training, the dataset, which consists of more than 1.9 million rows, is divided into multiple smaller subsets of 32,769 samples. For each subset, matrix X contains the input features listed above, and matrix Y contains the target outputs (Vcm and Vdm).
The feedforwardnet function in MATLAB® is used to build the ANN. It defines a feedforward network with two hidden layers of 128 and 64 neurons, respectively; these sizes can be increased or decreased depending on the system’s complexity. For supervised training, 70% of the data is used for training, 20% for validation, and 10% for testing. The network is trained separately on each data subset with the train function, after the inputs and outputs are transposed to conform to MATLAB’s expected format (columns as samples).
After training on a subset, the model is assessed using new, fixed input values for power supply, load current, switching frequency, duty cycle, component characteristics, PCB layout, filter capacitance, and gate resistor that differ from the values in the corresponding subset. To generate Vcm and Vdm predictions, these inputs are repeated at all frequencies and fed into the trained network. A comprehensive prediction matrix covering the whole frequency range is created by storing and accumulating these predictions for every subset.
The code then plots the predicted CM and DM values over the frequency range on a logarithmic frequency axis (semilogx function), which is frequently used in electromagnetic compatibility analysis to show variations over several frequency decades.
To validate the model’s performance, real measured data is imported from a text file (real_Vmc_Vmd.txt) containing frequency, Vcm, and Vdm values. These are also converted into dBµV and plotted alongside the model’s predictions. Comparative plots are generated for both CM and DM components, allowing visual assessment of the ANN model’s predictive capability against experimental results. Finally, the performance metrics (R², MAE, and RMSE) are calculated to evaluate the accuracy and reliability of the prediction results.
5.2.1. Case 1: Hidden Layer Size = [8, 4]
Simulation time is 489 s.
Figure 9 shows the real and predicted values for both (a) common- and (b) differential-mode voltages.
5.2.2. Case 2: Hidden Layer Size = [16, 8]
Figure 10 illustrates the real and predicted values for both common (a) and differential (b) mode voltages. The simulation time is 3374 s. Compared with Case 1, the number of neurons in the hidden layers is increased.
5.2.3. Case 3: Hidden Layer Size = [32, 16]
Figure 11 shows the ANN-predicted and real FFT of common (a) and differential (b) modes.
The results obtained across the different configurations presented in Table 6 (CM and DM, Cases 1 to 3) indicate that the ANN model applied to conducted emission prediction shows only moderate performance. The mean absolute error (MAE ≈ 12–14 dBµV) and root mean square error (RMSE ≈ 15–16 dBµV) highlight a significant discrepancy between the measured and predicted emission levels. In terms of explanatory power, the coefficients of determination (R² ≈ 0.42–0.65) reveal that the ANN is able to capture part of the variability of the experimental data, although the fit remains limited and far from optimal.
These findings suggest that, in its current configuration, the ANN is not sufficiently accurate for precise modeling of conducted electromagnetic emission phenomena. Potential improvements could involve enriching the training dataset, incorporating additional explanatory variables (e.g., physical characteristics and test conditions), or adopting more advanced modeling strategies such as deep neural networks, hybrid architectures, or frequency-domain approaches.
5.3. Random Forest (RF)
In this method, the NTrees parameter is varied, as it plays a crucial role in enhancing prediction performance.
Figure 10, Figure 11 and Figure 12 present the predicted and real common and differential mode voltages for the different values of NTrees.
Based on electrical input parameters, the RF regression model is implemented in the MATLAB® code to predict electromagnetic interference in CM and DM signals. Using ensemble learning techniques, this data-driven approach enables the estimation of conducted emission levels across a frequency spectrum. The method’s modular structure allows it to process massive amounts of measurement data and carry out tasks like prediction and visualization.
A dataset containing eight input variables (power supply, load current, switching frequency, duty cycle, component characteristics, PCB layout, filter capacitance, and gate resistor) and, for the output, frequency as well as Vcm and Vdm, must be loaded first. To enable iterative and memory-efficient model training, these data are organized in a large matrix (1,966,140 × 11) and separated into fixed-length subsets of 32,769 samples each. For each subset, the eight input features are combined into a matrix X_ss, and the target outputs (CM and DM voltages) are extracted as Y_ss.
MATLAB’s TreeBagger function is used to create two distinct RF regression models, one for Vcm and one for Vdm, in order to model the prediction task. To support reliable and generalized learning across intricate data distributions, each model is made up of 50, 500, or 5000 decision trees (NTrees = 50, 500, or 5000, depending on the case). The input matrix X_ss and the corresponding target outputs from the current subset are used to train these models.
Once training is complete, the models are used to predict the Vcm and Vdm levels over a given frequency range. A new input matrix is constructed by repeating the fixed input values and combining them with the frequency values of the different subsets.
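The construction of this input matrix can be sketched in Python as follows. The function name, the eight fixed values, and the frequency list are hypothetical and serve only to show how one row per frequency point is formed; the column order is also an assumption.

```python
def build_query_matrix(fixed_inputs, frequencies):
    """Repeat the fixed operating point once per frequency and append the frequency."""
    return [list(fixed_inputs) + [f] for f in frequencies]

# Hypothetical operating point for the eight input variables (illustration only).
fixed_inputs = [24.0, 2.0, 100e3, 0.5, 1.0, 1.0, 100e-9, 10.0]
frequencies = [150e3, 1e6, 10e6, 30e6]   # Hz, illustrative values

query = build_query_matrix(fixed_inputs, frequencies)
print(len(query), len(query[0]))  # 4 rows, 9 columns
print(query[1][-1])               # 1000000.0
```

Each row can then be passed to the trained Vcm and Vdm models to obtain one prediction per frequency.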
A global prediction matrix called toutes_pred is used to store and accumulate the predicted outputs for CM and DM. These values are then converted into dBµV, a format frequently used to express signal amplitude in electromagnetic compatibility analysis.
The non-parametric nature of this RF-based method and its capacity to manage nonlinear relationships between input features and output noise levels make it especially useful. The performance of the model can be evaluated and improved for useful applications in EMC modeling, design, and system diagnostics by contrasting predictions with actual measurements.
Case 1: NTrees = 50
Case 2: NTrees = 500
Figure 13 shows predicted and real FFT of common (a) and differential (b) modes.
The RF results suggest a quasi-linear increase with the number of trees, up to NTrees = 5000.
Table 7 summarizes the regression performance of the RF model for predicting Vcm and Vdm. The table reports the MAE, RMSE, and R², which provide a quantitative evaluation of the model’s accuracy and its ability to capture the variability of the measured data.
Table 7 presents the performance of the RF model across three experimental cases, separated by CM and DM. The MAE values are consistently low in CM, ranging from 5.29 dBµV to 5.33 dBµV, indicating that the predictions closely match the true values. In DM, the MAE is slightly higher, between 6.59 dBµV and 6.89 dBµV, showing that the model is somewhat less accurate for DM. The RMSE follows a similar pattern, with lower errors in CM (≈7 dBµV) compared to DM (≈8 dBµV), confirming better performance in CM. R² values are comparable for both modes, slightly favoring CM (0.65–0.67) over DM (0.60–0.64), which indicates that the RF model explains a significant portion of the data variance, especially in CM. Overall, the RF model demonstrates robust performance, with consistently better results in CM than in DM, making it a reliable choice for these predictions.
5.4. Particle Swarm Optimization (PSO)
Figure 14, Figure 15 and Figure 16 present the predicted and real common and differential mode voltages for the three cases based on the PSO method. The number of particles (numParticles) and the maximum iterations (maxIterations) are adjusted.
PSO and ANN are used in a hybrid approach to forecast electromagnetic interference (EMI) levels, specifically Vcm and Vdm, over a broad frequency range. This method is appropriate for complex nonlinear systems like conducted EMI prediction because it makes use of both the global optimization power of PSO and the learning capabilities of neural networks.
The dataset is large, with over 1.9 million rows and 11 columns that represent the main electrical and electromagnetic interference (EMI) parameters: power supply, load current, switching frequency, duty cycle, component characteristics, PCB layout, filter capacitance, gate resistor, frequency, common-mode voltage, and differential-mode voltage. To handle this volume of data effectively, the dataset is separated into smaller subsets of 32,769 samples, which are processed one after another.
For each subset, the input features (X) are built from the electrical input parameters (power supply, load current, switching frequency, duty cycle, component characteristics, PCB layout, filter capacitance, and gate resistor), and the corresponding target outputs (Y) are composed of the Vcm and Vdm measurements. These input and output matrices are transposed to match the input structure expected by MATLAB’s feedforwardnet function.
The feedforwardnet function is used to define an ANN with a single hidden layer of 64 neurons, configured to take in the eight input features. PSO is used to optimize the model’s weights and biases instead of training the neural network using conventional gradient-based techniques (such as backpropagation). This method uses 100 iterations to explore the weight space with 30 particles. Each particle represents a potential set of network parameters (weights and biases), and its position is iteratively updated based on both its individual best score and the global best score. The objective function that PSO aims to minimize is the MSE between the predicted and actual EMI outputs of the current subset.
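The PSO update described above can be illustrated with a minimal Python sketch. To stay short, it optimizes the two parameters of a linear model instead of the weights of the 64-neuron network, which is a deliberate simplification. The 30 particles and 100 iterations follow the text, whereas the inertia weight, the two acceleration coefficients, the initialization range, and the toy data are assumptions, not the settings of the MATLAB® code.

```python
import random

def pso_minimize(loss, dim, num_particles=30, max_iterations=100,
                 inertia=0.7, c1=1.5, c2=1.5, bound=5.0, seed=0):
    """Minimize loss(params) with a basic global-best particle swarm."""
    rng = random.Random(seed)
    pos = [[rng.uniform(-bound, bound) for _ in range(dim)] for _ in range(num_particles)]
    vel = [[0.0] * dim for _ in range(num_particles)]
    best_pos = [p[:] for p in pos]            # personal best positions
    best_val = [loss(p) for p in pos]         # personal best scores
    g = min(range(num_particles), key=lambda i: best_val[i])
    g_pos, g_val = best_pos[g][:], best_val[g]  # global best
    for _ in range(max_iterations):
        for i in range(num_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                vel[i][d] = (inertia * vel[i][d]
                             + c1 * r1 * (best_pos[i][d] - pos[i][d])
                             + c2 * r2 * (g_pos[d] - pos[i][d]))
                pos[i][d] += vel[i][d]
            val = loss(pos[i])
            if val < best_val[i]:
                best_val[i], best_pos[i] = val, pos[i][:]
                if val < g_val:
                    g_val, g_pos = val, pos[i][:]
    return g_pos, g_val

# Toy regression data generated by y = 2x + 1 (illustration only).
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]

def mse_loss(params):
    """Mean squared error of the linear model y = w * x + b."""
    w, b = params
    return sum((w * x + b - y) ** 2 for x, y in zip(xs, ys)) / len(xs)

best_params, best_loss = pso_minimize(mse_loss, dim=2)
print(best_params, best_loss)
```

For a neural network, `params` would be the flattened vector of all weights and biases, and `mse_loss` would run a forward pass over the current subset.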
Once the PSO process has converged to the best network parameters found, this configuration is used to make predictions on a new data scenario. As is common in EMC/EMI analysis, the predicted Vcm and Vdm are recorded and then converted to dBµV. Several visualizations are created. Using log-log scaling, the first plot shows the predicted CM and DM EMI levels as a function of frequency. For comparison, real measurement data is plotted similarly after being imported from an external file (realVmcVmd.txt).
Additional comparison plots are generated to show the difference in level between real and predicted CM and DM, allowing for visual validation of the model’s accuracy.
In summary, this hybrid ANN–PSO method blends the global optimization power of PSO with the nonlinear modeling capability of neural networks. It works well for EMI prediction over a wide frequency range and under complicated input conditions, which makes it a useful tool for the emission analysis, design, and electromagnetic compatibility (EMC) verification of power electronic systems.
Here, the number of particles and the number of iterations are increased to optimize the prediction.
Figure 17 presents the PSO-predicted and real FFT of common (a) and differential (b) modes.
Table 8 shows the performance of the PSO-based model across three cases for Vcm and Vdm. In Case 1, the CM exhibits a relatively high MAE of 11.02 dBµV but a low RMSE of 5.41 dBµV and an R² of 0.7111, indicating moderate predictive accuracy with low variance in error. In contrast, the DM in Case 1 has a lower MAE of 5.26 dBµV, a slightly higher RMSE of 6.90 dBµV, and a higher R² of 0.7872, suggesting a better overall fit. Cases 2 and 3 show improvements in both modes, with MAE and RMSE generally decreasing and R² increasing, particularly in CM, where R² reaches 0.92727 in Case 3, demonstrating a very strong predictive capability. Overall, the PSO model achieves high accuracy and explains a large portion of the variance, with the strongest performance and the highest R² values in the later cases.
5.5. K-Nearest Neighbors (KNN)
The KNN MATLAB® script first loads the dataset, which has about 1.9 million rows and 11 columns. Each column represents a measured physical variable: power supply, load current, switching frequency, duty cycle, component characteristics, PCB layout, filter capacitance, gate resistor, frequency, common-mode voltage, and differential-mode voltage. To handle this large dataset efficiently, the script divides it into fixed-size subsets of 32,769 points each.
For each subset, the input parameters (power supply, load current, switching frequency, duty cycle, component characteristics, PCB layout, filter capacitance, and gate resistor) are extracted and combined into a matrix X_ss, while the corresponding outputs (common-mode and differential-mode voltages) are stored in Y_ss. A new fixed combination of current, voltage, and gate resistor is then defined, while the frequency varies according to the values in the subset.
The prediction technique is the KNN algorithm with k = 50, 500, and 5000. For every frequency in the current subset, the script calculates the Euclidean distance between the new data point and every point in that subset. The common-mode and differential-mode voltages are then obtained by averaging the values of the k nearest neighbors.
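The prediction rule described above (Euclidean distance followed by an unweighted average over the k nearest neighbors) can be sketched in Python as follows. The function name and the four one-dimensional training points are hypothetical and far smaller than a real subset; the sketch is not the authors’ MATLAB® script.

```python
import math

def knn_predict(X, Y, query, k):
    """Average the outputs of the k training points closest to the query."""
    by_distance = sorted((math.dist(x, query), i) for i, x in enumerate(X))
    nearest = [i for _, i in by_distance[:k]]
    n_outputs = len(Y[0])
    return [sum(Y[i][j] for i in nearest) / len(nearest) for j in range(n_outputs)]

# Hypothetical training data: one input feature, two outputs (Vcm, Vdm).
X = [[0.0], [1.0], [2.0], [10.0]]
Y = [[10.0, 1.0], [20.0, 2.0], [30.0, 3.0], [100.0, 10.0]]

print(knn_predict(X, Y, [1.2], k=2))  # [25.0, 2.5]
print(knn_predict(X, Y, [1.2], k=4))  # [40.0, 4.0]
```

With k = 4 the distant point at x = 10 pulls the average upward, which shows why the choice of k matters.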
The predicted results for each frequency are stored in a global matrix, toutes_pred. Once all predictions are completed, the total simulation time is displayed.
Finally, the script visualizes the predicted values on logarithmic scale plots by converting them to dBµV. In addition, it imports actual experimental data from an external file, converts it to dBµV, and then plots the real and predicted values for both common mode and differential mode voltages graphically. The performance of the KNN approach used to solve this problem is qualitatively evaluated with the aid of these visualizations.
Figure 18 displays the real and predicted values of (a) common and (b) differential mode voltage using the KNN method.
Table 9 presents the performance of the KNN model applied to CM and DM. In CM mode, the model achieves an MAE of 6.85 dBµV and an RMSE of 8.66 dBµV, with an R2 of 0.9478, indicating that the model explains most of the variance in the data. In DM mode, the errors are lower, with an MAE of 5.97 dBµV and an RMSE of 7.35 dBµV, while R2 remains high at 0.9415. These results suggest that KNN provides robust and reliable predictions in both configurations, with a slight advantage in DM mode in terms of error values and comparable explanatory capability in both modes.
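For reference, the three metrics reported here can be computed as in the following Python sketch, assuming their standard definitions and that the real and predicted spectra are already expressed in dBµV and sampled at the same frequencies. The function names are illustrative.

```python
import math
from typing import Sequence


def mae(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Mean absolute error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Root mean square error."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def r2(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_t = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_t) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot
```

Because R2 is normalized by the variance of the real signal, a mode with lower absolute errors can still show a slightly lower R2 if its spectrum varies less.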
Given these results, the KNN algorithm becomes the primary focus of the study. Although its performance is already adequate, there is still room for improvement, especially in accuracy and in its capacity to replicate experimental trends. To further improve its robustness and reliability for conducted emission prediction, we will investigate advanced KNN variants, optimize its parameters (number of neighbors, distance metrics, and weighting strategies), and incorporate it into hybrid approaches in the next phase of our work.
5.6. Improved KNN Version
The improved KNN method differs from the basic version primarily by the introduction of data normalization and weighted averaging in the prediction process. In the improved code, the input features (power supply, load current, switching frequency, duty cycle, component characteristics, PCB layout, filter capacitance, and gate resistor) are normalized by subtracting their mean and dividing by their standard deviation. This step ensures that all features have comparable scales, preventing variables with larger magnitudes from dominating the distance calculations. Additionally, instead of using a simple average of the k nearest neighbors' outputs, the improved version assigns weights inversely proportional to the distances, giving more importance to closer neighbors. This weighted averaging generally leads to more accurate predictions. Furthermore, after prediction, the frequency is converted back to its original scale so that the plots remain physically interpretable. Overall, these enhancements make the improved KNN model more robust and more sensitive to data variations than the basic KNN approach, which uses raw data and unweighted averaging. Once the optimal value of k has been chosen, the proposed method is validated by testing it with different values of the input parameters.
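The two changes described above, z-score normalization and inverse-distance weighting, can be sketched in Python as follows. This is an illustrative sketch, not the authors' code: the function names, the use of the population standard deviation, the guard for constant columns, and the small `eps` added to the distances to avoid division by zero are all assumptions.

```python
import math
from statistics import mean, pstdev
from typing import List, Sequence, Tuple


def zscore_fit(x_train: Sequence[Sequence[float]]) -> Tuple[List[float], List[float]]:
    """Per-feature mean and standard deviation of the training inputs."""
    cols = list(zip(*x_train))
    mu = [mean(c) for c in cols]
    sigma = [pstdev(c) or 1.0 for c in cols]  # assumption: constant columns left unscaled
    return mu, sigma


def zscore_apply(row: Sequence[float], mu: Sequence[float],
                 sigma: Sequence[float]) -> List[float]:
    """Subtract the mean and divide by the standard deviation, feature by feature."""
    return [(v - m) / s for v, m, s in zip(row, mu, sigma)]


def weighted_knn_predict(x_train, y_train, query, k: int, eps: float = 1e-12) -> List[float]:
    """KNN regression on normalized features with inverse-distance weights."""
    mu, sigma = zscore_fit(x_train)
    xn = [zscore_apply(r, mu, sigma) for r in x_train]
    qn = zscore_apply(query, mu, sigma)
    dists = [math.dist(r, qn) for r in xn]
    nearest = sorted(range(len(dists)), key=dists.__getitem__)[:k]
    # Closer neighbors receive larger weights; eps avoids division by zero.
    weights = [1.0 / (dists[i] + eps) for i in nearest]
    total = sum(weights)
    n_out = len(y_train[0])
    return [sum(w * y_train[i][j] for w, i in zip(weights, nearest)) / total
            for j in range(n_out)]
```

In this sketch, a query that coincides with a training point returns essentially that point's outputs, whereas the unweighted average would still blend in the other k − 1 neighbors.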
Table 10 presents the different values of the input parameters. These values were chosen at random.
- (1) Case 1: Figure 19 shows the real and predicted (a) common-mode and (b) differential-mode voltages for different values of the input parameters. Four combinations of input parameters are selected, and the method is applied to predict the EMC performance.
- (2) Case 2: Figure 20 shows the real and predicted (a) common-mode and (b) differential-mode voltages.
- (3) Case 3: Figure 21 shows the real and predicted (a) common-mode and (b) differential-mode voltages.
- (4) Case 4: Figure 22 shows the real and predicted (a) common-mode and (b) differential-mode voltages.
Table 11 presents the performance of the KNN method across the four cases, in both CM and DM. In CM, the MAE ranges from 6.85 dBµV to 7.87 dBµV and the RMSE from 8.66 dBµV to 9.85 dBµV, with very high R2 values between 0.9741 and 0.9778. This demonstrates excellent predictive capability and a strong correlation between the predicted and real CM voltages. In DM mode, the errors are lower, with MAE values ranging from 5.96 dBµV to 6.84 dBµV and RMSE values from 7.34 dBµV to 8.29 dBµV. The determination coefficients R2 also remain very high, between 0.9472 and 0.9515, indicating strong explanatory power. Overall, the KNN method provides robust and reliable predictions in both modes, with a small advantage in DM in terms of error values, while maintaining excellent R2 across all cases.
5.7. Enhanced Comparative Interpretation of AI Models
The comparative analysis of the different AI methods reveals clear differences in performance depending on the algorithm and the emission mode (CM or DM). ANN provides acceptable results with moderate MAE and RMSE values; however, its relatively low R2 indicates limited explanatory capacity, mainly due to its sensitivity to the nonlinear and oscillatory nature of EMI data. RNN shows slight improvements over ANN, particularly in CM mode for Cases 2 and 3, but its recurrent structure appears insufficient to accurately track rapid local variations, leading to higher residual errors.
Random Forest achieves substantially lower error values (MAE between 5.29 dBµV and 6.89 dBµV) and R2 scores around 0.60–0.67, demonstrating good robustness, especially in CM mode. However, RF remains limited when dealing with fine-scale oscillations because the ensemble structure averages out high-frequency variations.
PSO exhibits strong predictive capability, reaching R2 values as high as 0.92 in Cases 2 and 3. Its performance confirms the effectiveness of swarm-based optimization in adjusting model parameters, although its accuracy still depends heavily on initialization and tuning complexity.
KNN outperforms all other methods, achieving consistently high R2 values above 0.94 (up to 0.98 in CM mode) and low MAE/RMSE. This superior performance is mainly due to the characteristics of the EMI dataset: smooth variations, strong local correlations, and oscillatory patterns that KNN captures effectively through instance-based learning. Unlike models requiring extensive hyperparameter tuning, KNN provides accurate predictions while remaining simple to implement, making it particularly well-suited for conducted EMI prediction tasks [26, 27].
6. Conclusions
This study emphasizes the significance of virtual prototyping in EMC analysis by creating equivalent circuit models of actual PCBs through transient SPICE simulations. This method allows designers to forecast conducted emissions early in the process, thereby saving the expenses and delays linked to physical prototyping iterations.
Moreover, the study demonstrates the capability of artificial intelligence to forecast conducted electromagnetic interference, particularly Vcm and Vdm, using measured and simulated waveform datasets.
A detailed comparative analysis of AI models (ANN, RNN, RF, PSO, and KNN) was conducted using the same dataset, training procedures, and evaluation criteria. The findings show that KNN achieves the highest prediction precision, maintaining consistently low MAE and RMSE figures along with R2 values above 0.94 in all cases.
This advantage can be attributed to the nature of the dataset: the spectral responses display localized fluctuations and resonance features that are more effectively modeled by instance-based methods, like KNN, while parametric techniques (ANN and RNN) or ensemble methods (RF) generally oversmooth the oscillatory patterns. PSO also achieves strong results but involves greater computational expense and more intricate tuning demands.
Although these findings are encouraging, some limitations persist. Firstly, the research depends on a blend of measured and simulated data; hence, validating the suggested approach using only measurements would bolster the findings. Secondly, the study concentrates exclusively on conducted emissions; applying the method to radiated emissions or other EMC phenomena is a future direction. Finally, this study could be extended to other DC/DC, DC/AC, and AC/DC converter structures.
Overall, this research provides a solid methodological contribution by combining PCB-level SPICE modeling with AI-based prediction of conducted disturbances, demonstrating that instance-based learning methods such as KNN are especially well-suited to EMC prediction tasks. The results open promising perspectives for accelerating EMC design workflows, reducing prototyping costs, and enabling intelligent EMC-aware design in future power electronic systems.