Article

A Multi-Output Neural Network-Based Hybrid Control Strategy for MMC-HVDC Systems

1 Department of Electrical and Electronic Engineering, University of Hong Kong, Hong Kong, China
2 School of Electric Power Engineering, South China University of Technology, Guangzhou 510641, China
* Author to whom correspondence should be addressed.
Electronics 2025, 14(24), 4803; https://doi.org/10.3390/electronics14244803
Submission received: 25 September 2025 / Revised: 28 November 2025 / Accepted: 29 November 2025 / Published: 6 December 2025

Abstract

The modular multilevel converter (MMC) has become a pivotal technology in high-voltage direct current (HVDC) transmission systems due to its modularity, superior harmonic performance, and enhanced controllability. However, conventional control strategies, including model predictive control (MPC) and sorting-based voltage balancing methods, often suffer from high computational complexity, limited real-time performance, and inadequate handling of transient events. To address these challenges, this paper proposes a novel hybrid control strategy that integrates a multi-output neural network (MONN) with an optimized reduced-switching-frequency (RSF) sorting algorithm. The MONN directly outputs precise submodule switching signals, eliminating the need for traditional sorting processes and significantly reducing switching losses. Meanwhile, the RSF algorithm further minimizes unnecessary switching operations while maintaining voltage balance. Furthermore, to enhance the accuracy of the predicted switching states, we extend the MONN for submodule activation count prediction (ACP) and employ a novel Cardinality-Constrained Post-Inference Projection (CCPIP) to further align the predicted switching states with the activation count. Simulation results under dynamic load conditions demonstrate that the proposed method achieves a 76.1% reduction in switching frequency compared to conventional bubble sort, with high switching-state prediction accuracy (up to 92.01%). This approach offers a computationally efficient, scalable, and adaptive solution for real-time MMC control, enhancing both dynamic response and steady-state stability.

1. Introduction

Driven by the increasing integration of renewable energy sources (e.g., wind and solar power) into power systems and the growing demand for long-distance transmission, the modular multilevel converter (MMC) has garnered significant academic and industrial interest in high-voltage direct current (HVDC) applications [1]. This attention stems primarily from its modular architecture, which offers superior harmonic suppression, enhanced fault ride-through capability, and exceptional controllability [2]. As global power transmission capacity continues to grow rapidly, high total harmonic distortion (THD) and control lag pose significant challenges to the sustainable utilization of power transmission and the stable operation of converter control [3]. In particular, the high time complexity of module selection not only affects the stability of power transmission but also imposes higher demands on the operation and maintenance of the MMC. Therefore, accurate MMC gate signal control, especially precise forecasting of the positions of MMC control signals, is essential for the smooth operation of the MMC.
Control strategies for MMCs primarily fall into (i) linear feedback control [4] (e.g., Proportional–Integral (PI)/Proportional–Integral–Derivative (PID) with modulation), (ii) optimal control (e.g., Model Predictive Control) [5,6], and (iii) intelligent control (e.g., fuzzy logic, neural networks, reinforcement learning). Among the intelligent control methods, those utilizing neural networks (NN) in particular have received growing attention due to their strength in rapid response capabilities, which is essential to enhance system stability, reliability, and power quality in real-time control operations. For instance, one promising approach involves the integration of MPC and NN, in which MPC provides stability and handles constraints through a system model, while NN leverages data-driven learning to handle nonlinearity and uncertainty, enhancing prediction accuracy and adaptability. Such synergistic combinations highlight the ongoing evolution within these methodological categories.
Model Predictive Control (MPC) [5] is an optimization-based control method that utilizes system models to optimize the control inputs over a horizon to minimize a certain objective function while satisfying given constraints on the system’s state. For power inverters, the finite set MPC [7,8] is usually employed, where an appropriate combination of the switches is determined to achieve a reference current. Such MPC controllers, however, face challenges for controlling Modular Multilevel Converters (MMC) due to the large number of switch configurations, which requires high computational complexity for real-time implementation. Due to the nonlinear characteristics and dynamic changes in MMC, MPC requires precise system models and substantial computational resources to determine the optimal control inputs, which may adversely affect the system’s dynamic response [9]. MMCs also face further control challenges on computational demands for more complex structures [10]. On the other hand, artificial intelligence (AI) methods address these limitations by learning the behavior of conventional MPC controllers [7,11,12,13,14]. The Neural Network-based MPC [11] replaces traditional model predictive control (MPC) by training neural networks to emulate MPC behavior using input–output data (e.g., arm voltages, currents). This reduces computational burden by more than 50% while maintaining similar performance, as neural networks directly predict submodule insertion counts without real-time optimization [12]. The Neural Predictor-based MPC (NNMPC-MMC) [13] enhances robustness by cascading neural predictors with finite control-set MPC. The predictor estimates system disturbances and unmodeled dynamics using techniques like generalized ultra-local modeling, eliminating dependency on precise parameters and weighting factors [14]. Event-triggered variants further reduce switching frequency by 20% [7]. 
Auxiliary AI techniques, such as ADALINE algorithms, enable sensorless capacitor voltage estimation using switching states and adaptive linear neurons [15], while 1DCNN-LSTM networks achieve more than 95% accuracy in fault diagnosis by processing raw current signals [16]. These methods collectively improve reliability, enhance dynamic response, mitigate circulating currents, and optimize capacitor voltage balancing under transient conditions [8]. AI-driven control enables MMCs to operate efficiently in high-voltage applications like HVDC transmission and renewable energy integration [17], overcoming traditional limitations in computational complexity and model dependency [18].
This transition underscores the importance of developing effective hybrid models that can simultaneously maintain errors at a low level while addressing the real-time computational demands. Hybrid methods often integrate a simulated system with neural network (NN) algorithms, allowing for better modeling of complex, nonlinear patterns in MMC data. For example, combining circuit models with regression neural networks (RNNs) or Neural Network Predictive Regressors (NNPRs) can handle complex transient events and nonlinear dynamics, leading to enhanced prediction accuracy and system stability [13]. Recent studies have demonstrated the effectiveness of hybrid models in signal prediction of MMC controllers, utilizing approaches that adapt to fluctuations and uncertainties inherent in power transmission. These models not only improve forecast efficiency but also provide probabilistic estimates that are crucial for risk management and operational decision-making in the power transmission context.
Additionally, in the hybrid control models of Modular Multilevel Converters (MMC), the use of sorting algorithms is of great significance for optimizing the generation of control signals and enhancing system performance [19]. Sorting algorithms can handle the complex dynamic behavior within MMC, especially when it comes to managing the switching states of different modules and balancing energy. By employing sorting algorithms, more precise control signal allocation can be achieved, thereby improving the overall efficiency and stability of the system. Common sorting algorithms mainly include the following: (i) traditional full sorting algorithms (e.g., bubble sort), which require continuous monitoring of all submodule voltages and arm currents and determine submodule switching states through complete sorting [20], but suffer from high sensor dependency and substantial computational load; (ii) algorithms with improved strategies to reduce switching frequency, which include threshold-triggered sorting based on voltage limits [21], event-driven sorting activated by output voltage level changes or variations in the number of required submodules [22], and hybrid indicator sorting incorporating both actual voltage values and their deviations from nominal values [23]; (iii) algorithms utilizing estimation techniques to minimize sensor usage, which include capacitor voltage estimation via arm voltage measurements or Kalman filtering [24], the use of the derivative of the sum of capacitor voltages to replace current sensors for direction determination [25], and the use of model predictive control to estimate voltage states [26]. These strategies progressively enhance system performance, suppressing capacitor voltage fluctuations while systematically optimizing switching losses and sensor costs, collectively ensuring efficient and stable operation of MMC systems.
While the abovementioned sorting algorithms in the hybrid control frameworks can maintain capacitor voltage balancing under steady-state conditions, current state-of-the-art (SOTA) neural network approaches mainly focus on activation count prediction (ACP) without providing specific module positions (also known as switching states). Consequently, an extra sorting step is required to locate the set of modules, which reduces the benefit of using the data-driven NN approach. To overcome this limitation, we propose a multi-output neural network (MONN) to directly predict the specific module positions and hence eliminate the need for subsequent sorting. The main contributions of this paper are as follows:
(1)
A hybrid control strategy for Modular Multilevel Converters (MMCs) is proposed, achieving coordinated and precise control of multiple variables such as AC current tracking, submodule capacitor voltage balancing, and circulating current suppression. It significantly reduces steady-state error and computational burden without trading off its dynamic performance.
(2)
A Reduced-Switching-Frequency (RSF) sorting algorithm based on dynamic window adjustment and state memory is designed. It intelligently selects a subset of submodules requiring state transitions, substantially reducing the average switching frequency with minimal compromise in capacitor voltage balancing accuracy. The RSF also helps to generate training data that allow the MONN to learn the above-mentioned reduced switching pattern, leading to better power efficiency.
(3)
A multi-task learning-driven multi-output neural network (MONN) controller is developed. It is a triple-tower architecture made up of two classification towers and a regression tower for switching state prediction (multi-label classification), activation count prediction (multi-class classification), and voltage/current forecasting. While the classification towers are used to learn the decisions, the classifiers alone may be insufficient to guarantee that the NN model learns the dynamics of the MMC. Hence, the regression tower provides an extra layer of insurance by learning complementary information that may be missed by the classifiers. A feature-sharing layer is used to capture the common features across the different tasks, making the co-training of the multiple tasks cohesive.
(4)
Finally, to further enhance the accuracy of switching state prediction (SWP), we propose a novel Cardinality-Constrained Post-Inference Projection (CCPIP) to align the SWP with the predicted activation count using projection over a constrained set. This could help to correct the possible misalignment between the predicted switch state/module position and the activation counts. A mathematical proof is also provided to demonstrate that the optimal constrained set can be achieved using a simple procedure without the need to solve an exhaustive combinatorial problem.
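In its simplest form, a cardinality-constrained projection of this kind reduces to a top-n selection over the predicted switching probabilities: keeping the n largest scores is the optimal projection under a per-module score. The following Python sketch illustrates the idea; the function name and interface are ours for illustration and do not appear in the paper:

```python
import numpy as np

def ccpip_project(switch_probs, n_active):
    """Illustrative cardinality-constrained projection: force exactly
    n_active submodules ON by keeping the n_active highest switching
    probabilities and turning the rest OFF."""
    states = np.zeros(len(switch_probs), dtype=int)
    if n_active > 0:
        top = np.argsort(switch_probs)[-n_active:]  # indices of the largest scores
        states[top] = 1
    return states
```

Sorting n scores (or a partial selection) avoids enumerating all C(N, n) switch combinations, which is the "simple procedure" alluded to above.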
Simulation results show that the proposed approach outperforms other machine learning classifiers [27], such as the support vector machine (SVM), naïve Bayes, XGBoost, and Random Forest, in switching state prediction. While the proposed MONN achieves highly comparable performance in activation count prediction (ACP) to the NNMPC-MMC [13], it can also directly predict the specific module positions without the need for extra sorting. We also analyzed the complexity of the proposed approach: it requires only roughly 3000 parameters, with a compact storage size of 11 KB. An 8-bit integer implementation, for example on an NXP MCX-series MCU with dual Cortex-M33 cores at 150 MHz, would require only 0.022 ms of inference time and thus support real-time operation.
The rest of this paper is organized as follows. Section 2 proposes the methodology, including the MMC control methods and the Neural Network structure. Section 3 presents the simulation details and baseline models. Section 4 discusses the implications of the results. Section 5 concludes the paper with suggestions for future research directions.

2. Methodology

2.1. Task Definition

This study presents a data-driven approach for generating MMC control signals, where the MMC system is simulated in MATLAB (version R2024b). A neural network is trained in Jupyter Notebook (version 7.2.2) on past electrical parameters of the system to predict switching events. The core challenge lies in the accurate prediction of module positioning, which is characterized by rapid state transitions. These transitions are difficult to forecast owing to their transient nature and the exponentially large number of possible switching combinations.
A critical prerequisite of this framework is the simulation-based implementation of an MMC circuit that complies with practical engineering standards, coupled with the selection of input parameters that comprehensively represent the system’s electrical state. This fundamentally supports the core objective of generating precise MMC control signals with minimal latency. For instance, in a single-arm four-module MMC, the neural network is fed eight key feature parameters. The core task is to train a multi-task network on simulation data to achieve rapid and accurate prediction of switching commands. Furthermore, the model is designed with multiple outputs: it generates immediate control signals while simultaneously predicting future current and voltage parameters, creating a foundation for the closed-loop optimization of the control strategy.

2.2. Introduction of MMC Topology

As shown in Figure 1, the Modular Multilevel Converter (MMC) features a fundamental structure consisting of three phases with six arms. Each phase comprises upper and lower arms, with each arm formed by a series connection of sub-modules, an arm inductor, and a resistor. The most commonly used sub-module configuration is the half-bridge structure, which contains two IGBT switching devices and one DC storage capacitor. The operational principle of the converter involves controlling the switching states (turn-on and turn-off) of the devices in each sub-module to either insert or bypass the capacitor units. By precisely regulating the switching sequence of individual sub-modules, a multilevel approximately sinusoidal voltage waveform is generated across each arm. The voltage difference between the upper and lower arms produces the AC-side output voltage, while the charging and discharging processes of the sub-module capacitors are controlled to maintain DC-side voltage stability.
Due to this multi-module switching mechanism, the MMC system exhibits distinctive voltage and current fluctuation characteristics: voltage and current fluctuations are usually characterized by local maxima or minima, that is, voltage or current changes sharply in a short time and quickly reaches the peak or valley. This phenomenon in the Modular Multilevel Converter (MMC) system corresponds to the characteristics of energy transmission and the strategy of module engagement. In order to accurately correlate these electrical characteristics with the control signals of the MMC modules to achieve precise control, this study employs a circuit simulation method based on a hybrid control structure, combined with an optimization sorting method. This technology can achieve a balance between suppressing power loss and maintaining high-precision control. Through this method, the stable operation of the MMC module can be realized, and the power loss in the system operation can be significantly reduced. The control of the MMC model usually faces several key challenges:
(1)
Multivariable coupling: There is a strong coupling relationship among multiple variables, such as AC current tracking, submodule capacitor voltage balancing, and circulating current suppression in the MMC system, which requires coordinated control.
(2)
High-frequency operation: The switching frequency of MMC reaches the kHz level, requiring the controller to make decisions within a ≤10 µs period. Although the Model Predictive Control (MPC) has good performance, its computational burden increases exponentially with the system complexity (especially the number of submodules N).
(3)
Second-order harmonic circulating current: Circulating current is an inherent second-order harmonic component of MMC. It increases the current stress and loss of devices and disrupts the balance of the capacitor voltage.
To overcome these challenges, we developed a hybrid control strategy. This strategy integrates the superior dynamic performance of Model Predictive Control (MPC) with the steady-state accuracy and simplicity of Proportional–Integral (PI) control. By incorporating decoupling techniques in the rotating coordinate system, the strategy achieves precise, stable, and efficient control of all key state variables. Importantly, this hybrid control strategy significantly reduces the steady-state error.
Steady-state error is a key metric for assessing the control performance of a circuit system under steady-state conditions, typically expressed as a percentage. It quantifies the relative difference between the desired reference value (e.g., voltage or current) and the actual measured value once the system reaches equilibrium. Specifically, the steady-state error for voltage or current at time t can be expressed as:
$$\Delta V_{ss}(t) = \frac{V_{ref} - V(t)}{V_{ref}} \times 100\%$$
$$\Delta I_{ss}(t) = \frac{I_{ref} - I(t)}{I_{ref}} \times 100\%$$
where $V_{ref}$ is the reference voltage, $V(t)$ the actual voltage, $I_{ref}$ the reference current, and $I(t)$ the actual current.
The steady-state error provides a normalized measure of the circuit system’s performance, which is crucial for the parameter tuning of the simulation system.
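For illustration, Equations (2) and (3) can be evaluated with a small helper (a Python sketch; the function name is ours):

```python
def steady_state_error(ref, actual):
    """Percentage steady-state error between a reference value and the
    measured steady-state value, as in Equations (2) and (3)."""
    return (ref - actual) / ref * 100.0
```

For example, a measured 396 V against a 400 V reference corresponds to a 1% steady-state voltage error.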

2.2.1. Main Circuit Control Methods

The proposed Multi-Output Neural Network-Based Hybrid Control Strategy for Model Predictive Control (MONN-HCS-MPC) is a learn-to-optimize (L2O) approach that aims to combine machine learning with model predictive control (MPC) of the MMC. As the L2O approach leverages historical observations for learning decisions, a physical model has to be provided for generating such observations; Equations (1)–(11) serve this purpose. More specifically, the Physical MMC Model includes outer-loop PI control (Equations (1)–(5)), inner-loop model predictive control (MPC) (Equations (6) and (7)), and the dedicated decoupling strategy of the MMC (Equations (8)–(11)). These equations are essential for generating the training and testing data for the proposed NN.
More specifically, the design of the control system is based on a hierarchical concept of “outer loop for target setting, inner loop for rapid response, and dedicated loop for disturbance suppression.” The core of this approach is the optimal allocation of controllers according to the physical characteristics and time scales of the control objectives. The overall control architecture is mainly divided into four parts:
DC Voltage Control: The stability of the DC link voltage V d c is a prerequisite for system power balance and stable operation. By measuring the actual DC voltage V d c and comparing it with the reference value V d c * , the error is processed through a PI controller, and the output is the reference value for the active current i d * .
$$i_d^* = K_p^{dc}\left(V_{dc}^* - V_{dc}\right) + K_i^{dc}\int \left(V_{dc}^* - V_{dc}\right) dt$$
where $K_p^{dc}$ and $K_i^{dc}$ are the tuned proportional and integral gains.
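The outer-loop PI law of Equation (1) can be sketched as a small discrete-time controller with rectangular integration (an illustrative Python sketch; the class name and example gains are ours, not the paper's tuned values):

```python
class PIController:
    """Discrete PI controller: out = Kp*e + Ki*sum(e*Ts), matching the
    continuous-time law of Equation (1) under rectangular integration."""
    def __init__(self, kp, ki, ts):
        self.kp, self.ki, self.ts = kp, ki, ts
        self.integral = 0.0

    def step(self, ref, meas):
        error = ref - meas
        self.integral += error * self.ts   # accumulate the integral term
        return self.kp * error + self.ki * self.integral
```

The same structure serves both the DC-voltage loop (Equation (1)) and the energy-balance loop (Equation (3)), differing only in gains and inputs.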
Phase energy balance Control: To maintain the overall stability of the submodule capacitor voltage, the total energy of the upper and lower bridge arms in the three phases is calculated and compared with the rated value. The error is processed through another PI controller to generate an additional component of the circulating current reference signal i c i r * . This works in conjunction with the DC voltage controller to ensure global energy balance within the system.
$$W_j = \frac{1}{2} C_{sm} \sum_{i=1}^{N} \left[ v_{cpj}(i)^2 + v_{cnj}(i)^2 \right] \quad \text{for } j = a, b, c$$
$$\Delta i_{diff,q}^* = \sum_{j=a,b,c} \left[ K_p^{bal}\left(W_j^* - W_j\right) + K_i^{bal}\int \left(W_j^* - W_j\right) dt \right] f_j(\theta)$$
where the subscript $j$ serves as the phase identifier, denoting one of the three phases (a, b, or c). The superscript $*$ denotes the corresponding reference value. The superscript $i$ identifies the $i$-th submodule within a phase. Subscripts $p$ and $n$ designate the upper and lower arms, respectively. $W_j$ represents the total energy stored in the capacitors of all submodules within a specific phase. $C_{sm}$ is the capacitance of an individual submodule capacitor. $N$ is the number of submodules per arm (upper or lower). $v_c$ signifies the capacitor voltage. $\Delta i_{diff,q}^*$ is the compensating signal added to the circulating current reference to balance phase energy. $K_p^{bal}$ and $K_i^{bal}$ are the proportional and integral gains, respectively, of the energy balance controller. $f_j(\theta)$ is a distribution function of the phase angle $\theta$, which allocates the corrective signal across the three phases to ensure proper control action.
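The phase energy of Equation (2) is a straightforward sum over the capacitor voltages of both arms, as the following Python sketch shows (function name ours; illustrative only):

```python
import numpy as np

def phase_energy(c_sm, v_cap_upper, v_cap_lower):
    """Total capacitor energy of one phase, per Equation (2):
    W_j = 0.5 * C_sm * sum(v_p(i)^2 + v_n(i)^2) over the N submodules."""
    v_p = np.asarray(v_cap_upper)
    v_n = np.asarray(v_cap_lower)
    return 0.5 * c_sm * (np.sum(v_p**2) + np.sum(v_n**2))
```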
Model Predictive Current Control: The inner current control is the core of the system, utilizing Finite Control Set MPC (FCS-MPC) to achieve fast dynamic response. Based on the discretized mathematical model of the MMC in the abc coordinate system, the predicted value of the phase-$k$ current, $k \in \{a, b, c\}$, at a future time instant can be expressed as:
$$i_{pred}(t+1) = \frac{T_s}{L}\left[u_s(t) - \frac{N_n V_n - N_p V_p}{2}\right] + i(t)\left(1 - \frac{R T_s}{L}\right)$$
where $i_{pred}(t+1)$ represents the predicted current at time step $t+1$, $T_s$ denotes the sampling period, and $L$ indicates the equivalent inductance calculated as $L = L_s + \frac{L_{arm}}{2}$, with $L_s$ and $L_{arm}$ being the grid-side and arm inductance, respectively. The model incorporates the grid voltage $u_s(t)$ at time step $t$, the number of engaged submodules in the lower and upper arms ($N_n$ and $N_p$), their corresponding average capacitor voltages ($V_n$ and $V_p$), and the actual measured current $i(t)$. Additionally, the equivalent resistance $R$ is defined as $R = R_s + \frac{R_{arm}}{2}$, where $R_s$ and $R_{arm}$ refer to the grid-side and arm resistance.
The cost function is used to evaluate the difference between the predicted current and the reference current:
$$g(t) = \left| i_{ref}(t) - i_{pred}(t+1) \right|$$
where $i_{ref}(t)$ is the reference of the AC current.
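The one-step prediction and cost evaluation can be combined into a simple enumeration over candidate insertion counts. The following single-phase Python sketch is ours, under our own sign conventions for the prediction model and the common assumption that the upper- and lower-arm insertion counts sum to a fixed total; it is not the authors' implementation:

```python
def predict_current(i_t, u_s, n_p, n_n, v_p, v_n, ts, l_eq, r_eq):
    """One-step current prediction from the discretized MMC model
    (sign conventions assumed for illustration)."""
    driving = u_s - (n_n * v_n - n_p * v_p) / 2.0
    return (ts / l_eq) * driving + i_t * (1.0 - r_eq * ts / l_eq)

def best_insertion(i_ref, i_t, u_s, v_p, v_n, n_total, ts, l_eq, r_eq):
    """FCS-MPC style enumeration: pick the upper-arm insertion count N_p
    (with N_n = n_total - N_p) minimizing the cost g = |i_ref - i_pred|."""
    best_np, best_cost = 0, float("inf")
    for n_p in range(n_total + 1):
        i_pred = predict_current(i_t, u_s, n_p, n_total - n_p,
                                 v_p, v_n, ts, l_eq, r_eq)
        cost = abs(i_ref - i_pred)
        if cost < best_cost:
            best_np, best_cost = n_p, cost
    return best_np, best_cost
```

Enumerating N + 1 candidate counts per arm is what keeps FCS-MPC tractable for a single arm; the combinatorics of selecting which specific submodules to insert is the part delegated to sorting (or, in this paper, to the MONN).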
Decoupling control strategy: The core of this decoupling control strategy lies in utilizing a rotating coordinate transformation to convert the circulating components with a frequency twice the base frequency in a three-phase AC system into DC components. This enables the PI (Proportional–Integral) controller to track them accurately and without steady-state error. By introducing a feedforward decoupling term to offset the cross-coupling effect between the d and q axes, independent control is achieved. The control structure is divided into the following parts:
(1)
Coordinate Transformation: First, the three-phase circulating current $i_{circ,abc}$ is transformed into the dq rotating coordinate system (synchronized with the second-harmonic frequency, i.e., $\omega_{circ} = 2\omega$), yielding DC components.
$$\begin{bmatrix} i_{circ,d} \\ i_{circ,q} \end{bmatrix} = \frac{2}{3} \begin{bmatrix} \cos(2\omega t) & \cos\left(2\omega t - \frac{2\pi}{3}\right) & \cos\left(2\omega t + \frac{2\pi}{3}\right) \\ -\sin(2\omega t) & -\sin\left(2\omega t - \frac{2\pi}{3}\right) & -\sin\left(2\omega t + \frac{2\pi}{3}\right) \end{bmatrix} \begin{bmatrix} i_{circ,a} \\ i_{circ,b} \\ i_{circ,c} \end{bmatrix}$$
where, for notational convenience, the time step $(t)$ of the circulating currents is dropped. $i_{circ,a}$, $i_{circ,b}$, $i_{circ,c}$ are the instantaneous values of the circulating current in the three-phase stationary coordinate system (phases a, b, c) (AC quantities). $i_{circ,d}$ and $i_{circ,q}$ represent the direct- and quadrature-axis components of the circulating current in the two-phase rotating coordinate system (DC quantities). $2\omega t$ is the angle of the rotating coordinate system, where $\omega$ is the fundamental angular frequency; the factor $2\omega$ indicates the coordinate system rotates at twice the fundamental frequency, synchronizing with the main component of the circulating current.
(2)
Decoupling Control Law: The output of the PI controller is superimposed with the feedforward decoupling terms $-2\omega L_0 i_{circ,q}$ and $+2\omega L_0 i_{circ,d}$ to generate the compensating voltage.
$$v_{circ,d}^* = \left(K_p + \frac{K_i}{s}\right)\left(i_{circ,d}^* - i_{circ,d}\right) - 2\omega L_0\, i_{circ,q}$$
$$v_{circ,q}^* = \left(K_p + \frac{K_i}{s}\right)\left(i_{circ,q}^* - i_{circ,q}\right) + 2\omega L_0\, i_{circ,d}$$
where $v_{circ,d}^*$ and $v_{circ,q}^*$ represent the calculated reference values for the compensating voltages. The time-step symbols are dropped, as in Equation (8), for the sake of presentation. The parameters $K_p$ and $K_i$ denote the proportional and integral gains of the PI controller, respectively. $i_{circ,d}^*$ and $i_{circ,q}^*$ are the reference values for the d- and q-axis circulating current components, which are typically set to zero. The terms $\mp 2\omega L_0 i_{circ,q/d}$ are feedforward decoupling terms, where $L_0$ is the equivalent inductance. These terms are crucial for canceling the coupling between the d and q axes, thus facilitating independent control of each axis.
(3)
Inverse Transformation: Finally, the decoupled compensation voltage is inversely transformed back into the three-phase coordinate system and superimposed onto the modulation wave:
$$\begin{bmatrix} V_{circ,a}^* \\ V_{circ,b}^* \\ V_{circ,c}^* \end{bmatrix} = \begin{bmatrix} \cos(2\omega t) & -\sin(2\omega t) \\ \cos\left(2\omega t - \frac{2\pi}{3}\right) & -\sin\left(2\omega t - \frac{2\pi}{3}\right) \\ \cos\left(2\omega t + \frac{2\pi}{3}\right) & -\sin\left(2\omega t + \frac{2\pi}{3}\right) \end{bmatrix} \begin{bmatrix} V_{circ,d}^* \\ V_{circ,q}^* \end{bmatrix}$$
where $V_{circ,a}^*$, $V_{circ,b}^*$, $V_{circ,c}^*$ represent the compensating voltage reference values in the three-phase system. These AC quantities are derived following the inverse transformation. The three signals are then superimposed onto the modulation waveform, which is crucial for generating the PWM (Pulse Width Modulation) signals that control the switches.
This method eliminates steady-state errors by converting AC quantities to DC control and achieves complete decoupling through feedforward compensation, enhancing the dynamic performance and steady-state accuracy of circulating current suppression.
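The three steps of the circulating current suppression loop (transformation at twice the fundamental frequency, PI output plus feedforward decoupling, and inverse transformation) can be sketched as follows. This is an illustrative Python sketch under the sign conventions of the equations above; the function names are ours:

```python
import numpy as np

def abc_to_dq_2w(i_abc, theta):
    """Park transform at the doubled angle theta = 2*omega*t, mapping the
    second-harmonic circulating currents to DC quantities (amplitude-invariant,
    hence the 2/3 factor)."""
    angles = np.array([theta, theta - 2*np.pi/3, theta + 2*np.pi/3])
    i_abc = np.asarray(i_abc)
    i_d = (2.0/3.0) * np.cos(angles).dot(i_abc)
    i_q = (2.0/3.0) * (-np.sin(angles)).dot(i_abc)
    return i_d, i_q

def decoupled_voltage_refs(pi_d_out, pi_q_out, i_d, i_q, omega, l0):
    """Add the feedforward decoupling terms (-2*w*L0*iq, +2*w*L0*id)
    to the d- and q-axis PI outputs."""
    return pi_d_out - 2.0*omega*l0*i_q, pi_q_out + 2.0*omega*l0*i_d

def dq_to_abc_2w(v_d, v_q, theta):
    """Inverse transform back to the three-phase frame; the result is
    superimposed on the modulation waveform."""
    angles = np.array([theta, theta - 2*np.pi/3, theta + 2*np.pi/3])
    return np.cos(angles)*v_d - np.sin(angles)*v_q
```

With the reference circulating currents set to zero, the PI controllers drive the dq components to zero and the decoupling terms cancel the cross-axis coupling, so each axis can be tuned independently.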
By employing a hybrid control strategy that decomposes the overall control task, the proposed approach harnesses the dynamic performance of Model Predictive Control (MPC) in handling complex nonlinear systems, while capitalizing on the reliability of Proportional–Integral (PI) controllers in steady-state regulation and specific frequency disturbance rejection. This method helps to offload part of the control from the MPC to other controllers. After the simulation model of the hybrid control strategy operates stably, key electrical parameters are systematically collected, laying a foundation for subsequent performance analysis and neural network training.

2.2.2. Submodule Capacitor Voltage Balancing Strategy

To ensure the stable and reliable operation of the Modular Multilevel Converter (MMC), maintaining the balance of the capacitor voltages of each Submodule (SM) is crucial. This study employs an active voltage balancing control algorithm based on sorting, which not only responds quickly but also reduces unnecessary submodule switching actions, thereby lowering the overall switching frequency and losses.
The core principle of this control strategy is to intelligently select specific submodules to be engaged or disengaged based on the direction of the arm current to achieve natural balancing of the capacitor energy. Its input variables are: the vector V of capacitor voltages of each submodule in the arm, the arm current I indicating the direction of energy exchange, and the number of submodules n to be engaged as determined by the higher-level control. In addition, to improve the smoothness of control, the algorithm also incorporates the switching state from the previous control cycle as an input to achieve a state memory function. The execution process of the algorithm is illustrated in Figure 2.
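The core selection logic can be illustrated with a minimal Python sketch. This is our own simplification: it sorts by capacitor voltage and branches on the arm current direction, while the dynamic-window and state-memory refinements of the full algorithm in Figure 2 are not reproduced here:

```python
import numpy as np

def select_submodules(v_caps, arm_current, n_on):
    """Sorting-based balancing sketch: when the arm current charges the
    capacitors (I > 0), engage the n_on lowest-voltage submodules so they
    charge up; when it discharges them (I < 0), engage the n_on
    highest-voltage ones so they discharge first."""
    order = np.argsort(v_caps)        # indices sorted by ascending voltage
    if arm_current < 0:
        order = order[::-1]           # discharging: highest voltages first
    states = np.zeros(len(v_caps), dtype=int)
    states[order[:n_on]] = 1
    return states
```

In the full RSF algorithm, the previous cycle's switching state additionally constrains which submodules may toggle, trading a small amount of balancing accuracy for a much lower average switching frequency.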

2.3. Architecture of the Proposed Multi-Output Neural Network (MONN)

To enhance the real-time performance of MMC module control, especially under conditions of low switching frequency, we propose a Multi-Output Neural Network (MONN) model for switching state prediction (SWP) and activation count prediction (ACP). This model integrates a multi-task learning neural network with a supervised learning framework to achieve real-time control of power electronics. The structure incorporates hierarchical feature sharing and a hybrid output design to directly and accurately output control signals for MMC, eliminating the need for cumbersome reordering processes.
Figure 3 illustrates that the multi-output network model consists of the following key components: (1) Parameter selection: Electrical parameters are selected as the inputs and outputs of the neural network based on the characteristics of the circuit and the number of modules. $V_{c1}$–$V_{c4}$ are the capacitor voltages. $I_k$, $I_{ku}$, $I_{kl}$, and $I_c$ are the AC-side current, the upper and lower arm currents, and the circulating current measured from the system. $G_{c1}$–$G_{c4}$ are the module control signals. $V_k(t+1)$ and $I_k(t+1)$ are the predicted arm-module average voltage and AC-side current. These electrical parameters directly reflect the operating state of the circuit and the working conditions of the modules. (2) Hierarchical Feature Sharing Mechanism: By synergistically designing multi-level feature extraction and task-specific branches, accurate output of MMC control signals is achieved. This mechanism includes three key levels, forming a pyramidal information processing structure. (3) Hybrid Output Architecture: Integrating discrete switch control with continuous electrical parameter prediction within a unified framework forms an intelligent control system with multi-task collaboration.
The network input is an 8-dimensional feature vector, which is processed through two fully connected hidden layers for feature extraction: the first hidden layer contains 64 neurons and the second 32 neurons, both using the ReLU activation function to enhance the model’s nonlinear modeling capability. To mitigate overfitting, we introduce L2 weight regularization in both hidden layers, with the regularization coefficient set to 0.001. The model includes N = 4 output heads for switching state prediction (SWP) (multi-label sigmoid classification), N + 1 outputs (for predicting n = 0 to N) for activation count prediction (ACP) (multi-class softmax classification), and two regression outputs forecasting the arm average voltage and AC-side current, respectively, using a linear activation function. During training, we adopt a multi-task learning framework and configure specific loss functions and weights for the different tasks: cross-entropy loss for SWP and ACP, and quadratic loss for the regressors. To balance the importance of the tasks, the loss weights for SWP, ACP, and the regression tasks are set to 1.0, 1.0, and 0.5, respectively. The optimizer is Adam, combined with a dynamic learning-rate schedule—the learning rate is halved if the validation loss does not improve for 5 consecutive epochs—to promote training stability and convergence. This design balances feature sharing across multiple tasks against the accuracy of each specific task, with the goal of robust joint prediction performance. Note that the proposed CCPIP is applied only during inference; it does not affect the training process.
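For illustration, the forward pass of this architecture can be sketched in plain NumPy with randomly initialized (untrained) weights. The layer sizes and activations follow the text above; the He-style weight initialization is an assumption for the sketch only.

```python
import numpy as np

rng = np.random.default_rng(0)

def dense(n_in, n_out):
    # He-style random initialization for an illustrative, untrained layer.
    return rng.standard_normal((n_in, n_out)) * np.sqrt(2.0 / n_in), np.zeros(n_out)

# Shared bottom: 8 -> FC(64)-ReLU -> FC(32)-ReLU
W1, b1 = dense(8, 64)
W2, b2 = dense(64, 32)
# Task towers: 4 sigmoid heads (SWP), one 5-way softmax head (ACP, n = 0..4),
# and 2 linear regression heads (arm average voltage, AC-side current).
swp_heads = [dense(32, 1) for _ in range(4)]
Wc, bc = dense(32, 5)
reg_heads = [dense(32, 1) for _ in range(2)]

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def softmax(z):
    e = np.exp(z - z.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def monn_forward(x):
    """Forward pass of the MONN sketch on an 8-dimensional feature vector x."""
    h = np.maximum(0.0, x @ W1 + b1)        # shared layer 1
    h = np.maximum(0.0, h @ W2 + b2)        # shared 32-dim embedding
    p_swp = np.array([sigmoid(h @ W + b)[0] for W, b in swp_heads])  # 4 probabilities
    p_acp = softmax(h @ Wc + bc)            # activation-count distribution
    y_reg = np.array([(h @ W + b)[0] for W, b in reg_heads])         # 2 regressed values
    return p_swp, p_acp, y_reg

p_swp, p_acp, y_reg = monn_forward(rng.standard_normal(8))
```

In the actual experiments (Section 3.3.3) the same topology is implemented and trained in Keras; the sketch here only makes the shared-bottom/tower structure and the three output types concrete.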
Table 1 outlines the detailed parameters of the proposed MONN. As the table shows, the proposed model is lightweight, with a compact size of 11 kB. It should be remarked that the training phase of the proposed multi-output neural network (MONN) is an offline process. Once trained, the MONN requires only about 0.022 ms for inference on, for example, an NXP MCX-series MCU with dual Cortex-M33 cores at 150 MHz (see the detailed calculation in Table 1). Neural networks can be efficiently parallelized using vector operations in computer simulations. In practical applications, the network can be efficiently accelerated through layer-wise and neuron-wise parallelism [28], along with vectorized operations on Single-Instruction-Multiple-Data (SIMD) processors; many embedded processors, such as the ARM Cortex-M family, support SIMD.

2.3.1. Parameter Selection

When designing the neural network controller for modular multilevel converters (MMC), the input parameters include the capacitor voltage of each module ($V_{c,n}(t)$), the AC-side current ($i_k(t)$), the upper and lower arm currents ($i_{ku}(t)$, $i_{kl}(t)$), and the circulating current ($i_c(t)$). This selection comprehensively captures the core state of the system: the voltage of each module provides the energy-storage information necessary for capacitor-voltage balancing control, the AC-side current reflects the system-level power transmission objectives, the upper and lower arm currents carry the internal energy exchange, and the circulating current is directly related to the advanced control objective of internal loss suppression. The output directly generates switching signals to achieve end-to-end optimized control, surpassing the limitations of traditional modulation strategies. Meanwhile, predicting the electrical quantities of the next time step as auxiliary outputs imposes physical constraints on network training, giving the method stronger adaptability and optimization potential than traditional approaches.

2.3.2. Feature-Sharing Mechanism

The hierarchical feature sharing mechanism is a neural network architecture specifically designed for multi-task learning. It captures common features across different tasks while retaining task-specific distinctive features by constructing a shared base feature extractor and multiple parallel task-specific towers. Its core structure consists of a shared base network and several task-specific towers: the shared base comprises multiple fully connected layers with activation functions, responsible for learning common foundational feature representations from raw inputs for all tasks; the task-specific towers are customized small network branches for each learning objective, dedicated to further refining the most critical information for specific tasks from the shared features.
In the neural network model of this study, the input to the shared base consists of 8-dimensional raw electrical data collected in real-time by the MMC system. This data is processed through FC(64)-ReLU to FC(32)-ReLU layers and transformed into a 32-dimensional shared feature embedding vector, where FC stands for fully connected layers. This vector is simultaneously fed into six parallel task-specific towers (four switch status classification towers and two electrical parameter regression towers). Each tower independently processes the shared vector according to its specific task characteristics (such as discrete classification or continuous regression) to ultimately generate predictive outputs.
The update process of the hierarchical feature sharing mechanism is as follows: the shared-layer parameter set $\theta_{shared}$ represents the universal feature representation learned by the model from all tasks, while $\theta_{tower}^{(k)}$ denotes the parameter set of the $k$-th task-specific tower, storing its task-specific decision rules (this study includes six tasks: four switch classification tasks and two regression tasks). The mechanism is updated through a unified optimization process: first, the losses $L^{(k)}$ of the six tasks are weighted and combined into a composite loss $L_{total} = \sum_k w_k L^{(k)}$. Then, the gradients of the shared parameters $\theta_{shared}$ and all task-specific parameters $\theta_{tower}^{(k)}$ are computed simultaneously via a single backpropagation step, and the optimizer updates all parameters based on these gradients. This unified optimization mechanism, based on a composite loss function, captures both the commonalities among tasks (through $\theta_{shared}$) and their specificities (through $\theta_{tower}^{(k)}$). It is particularly well-suited to the composite optimization problem of discrete switching decisions and continuous electrical-parameter tracking in modular multilevel converter (MMC) control.
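A minimal sketch of the composite loss $L_{total} = \sum_k w_k L^{(k)}$, assuming binary cross-entropy for the classification towers and quadratic loss for the regression towers with the 1.0/0.5 weighting given in Section 2.3 (the function names are our own):

```python
import numpy as np

def bce(y, p, eps=1e-12):
    # Binary cross-entropy for a switch-classification tower.
    p = np.clip(p, eps, 1 - eps)
    return float(-(y * np.log(p) + (1 - y) * np.log(1 - p)).mean())

def mse(y, yhat):
    # Quadratic loss for a regression tower.
    return float(((y - yhat) ** 2).mean())

def composite_loss(cls_targets, cls_probs, reg_targets, reg_preds,
                   w_cls=1.0, w_reg=0.5):
    """L_total = sum_k w_k L^(k): weighted sum over all task towers.

    In training, a single backward pass through this scalar updates both
    the shared parameters and every task-specific tower simultaneously.
    """
    loss = sum(w_cls * bce(y, p) for y, p in zip(cls_targets, cls_probs))
    loss += sum(w_reg * mse(y, yh) for y, yh in zip(reg_targets, reg_preds))
    return loss
```

A perfect prediction drives every term to (near) zero, while errors in any tower increase the single scalar that backpropagation differentiates.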

2.3.3. Multi-Output Architecture

The hybrid output architecture is a design that enables the model to simultaneously handle discrete classification and continuous regression tasks, achieving precise coordination between switch control and electrical parameter prediction in MMC systems. This study employs a multi-task learning framework, computing both types of outputs in parallel based on a shared feature set, thereby enhancing the model’s overall control capability. The core of this architecture lies in generating task-specific outputs from the shared features: the input to the task-specific networks is the shared feature embedding vector produced by the hierarchical feature sharing mechanism (Shared-Bottom Network).
In this network model, the hybrid outputs are computed independently by the task-specific towers. The switch-status output $\hat{y}_{switch}$ is calculated using a sigmoid activation function and converted into a binary (0/1) switch state via a 0.5 threshold; the electrical-parameter output $\hat{y}_{electrical}$ is generated through a linear transformation without any nonlinear activation. All outputs are ultimately merged into a hybrid prediction vector.

2.3.4. Proposed Cardinality-Constrained Post-Inference Projection for Switch State Prediction

As mentioned earlier, we propose to jointly predict the switching state (SWP) and the activation count. However, an initial test revealed that the cardinality of the predicted switching state from SWP could misalign with the predicted activation count from the activation count prediction (ACP), and hence SWP showed no improvement. This motivated us to propose a novel Cardinality-Constrained Post-Inference Projection (CCPIP) that aligns SWP with ACP using a projection onto a constrained set. For notational convenience, we drop the subscript and write $y_{switch}$ simply as $y$. Moreover, since the same network architecture is replicated for the upper and lower arms of the MMC, we also drop the arm subscripts p and n for the sake of presentation. The proposed CCPIP reads as follows.
Definition 1.
Proposed Cardinality-Constrained Post-Inference Projection.
Consider the multi-task network with two classification towers with outputs $\hat{y}_{n,t}$ and $\hat{N}_{c,t}$ from the SWP and ACP tasks, respectively. The goal is to find $\hat{y}_{n,t}$ that minimizes the following constrained optimization problem:
$\arg\min_{\hat{y}_t \in \Omega} \sum_{n=1}^{N} L_{CE}(\hat{y}_{n,t}, \hat{p}_{n,t}) \quad \text{s.t.} \quad \Omega = \left\{ \hat{y}_t \in \{0,1\}^N \;\middle|\; \sum_{n=1}^{N} \hat{y}_{n,t} = \hat{N}_{c,t} \right\},$
Cardinality refers to the number of selected binary variables (e.g., SMs/switches).
where
  • $\hat{N}_{c,t}$: activation count prediction (ACP) at the $t$-th time step (obtained from the NN);
  • $\hat{y}_{n,t}$: switching state of the $n$-th switch/submodule (SM) (the unknown to be solved);
  • $\hat{p}_{n,t}$: output sigmoid probability (obtained from the NN);
  • $\hat{y}_t$: a binary vector of dimension $N$, $\hat{y}_t = [\hat{y}_{1,t}, \hat{y}_{2,t}, \ldots, \hat{y}_{N,t}]^T$;
  • $\{0,1\}^N$: the space of all possible combinations of $N$ binary output labels;
  • $L_{CE}(y, p)$: the cross-entropy loss given in (12);
  • $\Omega$: the constraint set containing all binary vectors satisfying the cardinality $\hat{N}_{c,t}$.
The cross-entropy loss is given as
$L_{CE}(y, p) = -\left[ y \log p + (1 - y) \log(1 - p) \right]$
Although Definition 1 may seem like a complicated combinatorial problem, it actually has a straightforward solution as follows:
Conjecture 1.
Solution to Proposed CCPIP.
Let $\hat{N}_t = \sum_{n=1}^{N} \hat{y}_{n,t}$ be the cardinality of the predicted switching states. The solution to the proposed CCPIP in Definition 1 can be divided into three cases:
Case 1: No misalignment—$\hat{N}_t = \hat{N}_{c,t}$.
Solution: The conventional decision rule applies: $\hat{y}_{n,t} = 1$ if $\hat{p}_{n,t} \geq 0.5$, and $\hat{y}_{n,t} = 0$ otherwise.
Case 2: Over-estimation of cardinality—$\hat{N}_t > \hat{N}_{c,t}$.
Solution: Apply the conventional decision rule, then set the $\hat{N}_{diff} = \hat{N}_t - \hat{N}_{c,t}$ positive outputs with the smallest sigmoid probabilities $\hat{p}_{n,t}$ to zero.
Case 3: Under-estimation of cardinality—$\hat{N}_t < \hat{N}_{c,t}$.
Solution: Apply the conventional decision rule, then set the $\hat{N}_{diff} = \hat{N}_{c,t} - \hat{N}_t$ negative outputs (i.e., $\hat{y}_{n,t} = 0$) with the largest sigmoid probabilities $\hat{p}_{n,t}$ to one.
Proof of Case 1.
The unconstrained minimizer of the cross-entropy loss lies within the constraint set. Hence, we can revert to the conventional unconstrained solution, i.e., the predicted label $\hat{y}_{n,t}$ is computed by the following decision rule:
$\hat{y}_{n,t} = 1$ if $\hat{p}_{n,t} \geq 0.5$, and $\hat{y}_{n,t} = 0$ otherwise,
Proof of Case 2.
There will be $\hat{N}_t$ positive outputs (i.e., $\hat{y}_{n,t} = 1$) assigned by the conventional decision rule (13), since the cardinality is $\hat{N}_t$. The unconstrained minimum is attained when all $\hat{N}_t$ sigmoid probabilities satisfy $\hat{p}_{n,t} = 1$. If we sort the probabilities in descending order, $\hat{p}_{n_{(1)},t} > \hat{p}_{n_{(2)},t} > \cdots > \hat{p}_{n_{(N)},t}$, the loss reads
$\sum_{m=1}^{N} L_{CE}(\hat{y}_{n_{(m)},t}, \hat{p}_{n_{(m)},t}) = L_A + L_B$, where
$L_A = \sum_{m=1}^{\hat{N}_t} \left[ -\log \hat{p}_{n_{(m)},t} \right]$, and
$L_B = \sum_{m=\hat{N}_t+1}^{N} \left[ -\log\left(1 - \hat{p}_{n_{(m)},t}\right) \right]$.
$L_A$ and $L_B$ correspond to the losses of the SMs with positive outputs and zero outputs, respectively. Suppose we overshoot by $\hat{N}_{diff} = \hat{N}_t - \hat{N}_{c,t}$. We then have no choice but to project the outputs $\{\hat{y}_{n_{(\hat{N}_{c,t}+1)},t}, \hat{y}_{n_{(\hat{N}_{c,t}+2)},t}, \ldots, \hat{y}_{n_{(\hat{N}_t)},t}\}$ to 0 so as to satisfy the constraint set. The loss can no longer be as small as the unconstrained minimum, and the terms $\hat{p}_{n_{(\hat{N}_{c,t}+1)},t}$ to $\hat{p}_{n_{(\hat{N}_t)},t}$ are reassigned to $L_B$, the complement of $L_A$. Each term of $L_B$ reads $-\log(1 - \hat{p}_{n_{(m)},t})$; in other words, a smaller probability (i.e., an output more likely to be zero) leads to a smaller loss. Hence, setting the $\hat{N}_{diff}$ positive-output SMs with the smallest probabilities to zero guarantees that $L_B$ is minimized. On the other hand, since each term of $L_A$ (i.e., $-\log \hat{p}_{n_{(m)},t}$) is non-negative and increases as $\hat{p}_{n_{(m)},t}$ decreases, removing the $\hat{N}_{diff}$ smallest-probability SMs from $L_A$ also reduces $L_A$.
Proof of Case 3.
Conversely, when the cardinality is under-estimated, $\hat{N}_t < \hat{N}_{c,t}$ and we set $\hat{N}_{diff} = \hat{N}_{c,t} - \hat{N}_t$. We have to project $\hat{N}_{diff}$ outputs, i.e., $\{\hat{y}_{n_{(\hat{N}_t+1)},t}, \hat{y}_{n_{(\hat{N}_t+2)},t}, \ldots, \hat{y}_{n_{(\hat{N}_{c,t})},t}\}$, from 0 to 1. This also means their corresponding sigmoid probabilities $\{\hat{p}_{n_{(\hat{N}_t+1)},t}, \ldots, \hat{p}_{n_{(\hat{N}_{c,t})},t}\}$ are all smaller than 0.5, since they were assigned zero in the first place. When we flip the $\hat{N}_{diff}$ zero-output SMs with the largest probabilities to one, $L_B$ is minimized, because those SMs contribute the largest terms to $L_B$. On the other hand, once flipped to one, $L_A$ is also minimized, as their probabilities are closer to one than those of the other zero-output SMs. This concludes the proof of Conjecture 1. Table 2 shows the implementation of the proposed CCPIP. □
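Conjecture 1 translates directly into a short post-inference routine. The following Python sketch implements the three cases (the function name `ccpip` is our own, not taken from Table 2):

```python
def ccpip(probs, n_target):
    """Cardinality-Constrained Post-Inference Projection (Conjecture 1 sketch).

    probs: per-submodule sigmoid probabilities from the SWP heads.
    n_target: predicted activation count from the ACP head.
    Returns binary switching states whose sum equals n_target.
    """
    # All cases start from the conventional 0.5-threshold decision rule.
    states = [1 if p >= 0.5 else 0 for p in probs]
    n_hat = sum(states)
    if n_hat > n_target:
        # Case 2: over-estimated cardinality. Flip the positive outputs
        # with the smallest probabilities back to zero.
        ones = sorted((i for i, s in enumerate(states) if s == 1),
                      key=lambda i: probs[i])
        for i in ones[: n_hat - n_target]:
            states[i] = 0
    elif n_hat < n_target:
        # Case 3: under-estimated cardinality. Flip the negative outputs
        # with the largest probabilities up to one.
        zeros = sorted((i for i, s in enumerate(states) if s == 0),
                       key=lambda i: probs[i], reverse=True)
        for i in zeros[: n_target - n_hat]:
            states[i] = 1
    return states  # Case 1 (no misalignment) falls through unchanged
```

For instance, with probabilities `[0.9, 0.6, 0.55, 0.2]` and a predicted activation count of 2, the threshold rule yields three positive outputs, and the projection drops the weakest of them (0.55), leaving exactly two submodules switched on.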

3. Experimental Setup

3.1. System Modeling and Configuration

This study constructs a detailed model of the Modular Multilevel Converter (MMC) using the MATLAB/Simulink R2024b simulation platform to verify the effectiveness of the applied hybrid control strategy. The core parameters of the simulation system are determined based on actual engineering practices, combined with system design requirements and theoretical calculations, aiming to ensure the accuracy of the model while considering computational efficiency. Additionally, the parameter design includes considerations for safety margins to ensure safety and industrial safety relevance in practical applications. Table 3 summarizes the key simulation parameters of the system.
Each arm of the MMC system includes 4 submodules (SMs), with a submodule capacitor voltage of 1500 V, a DC bus voltage of 6 kV, and a system rated power of 1.2 MW. The simulation uses a fixed-step discrete solver with a step size of $10^{-6}$ s to accurately capture the switching dynamics of the IGBTs.
To comprehensively evaluate the control performance, the simulation sets up a multi-stage test scenario: 0–0.4 s for system startup and no-load transient process, 0.4–0.42 s for a linear ramp-up of power to +1.2 MW, and at 0.7 s, a −1.2 MW load step is applied to test the dynamic response capability of the controller.
The control system employs a hierarchical hybrid architecture, which ensures basic constraints such as capacitor voltage balance (fluctuation < ±5%) and circulating current suppression (<10% of the rated current), providing a high-quality training data foundation for the neural network.

3.2. Dataset Description

The dataset used in this study comprises historical record data from a simulation model of an MMC rectifier circuit, which features a topology with six bridge arms, each containing four submodules. Electrical parameters—such as submodule voltage, AC current, circulating current, and switching signals—were collected at a high sampling frequency over a duration of 1.5 s, yielding approximately 1.5 million records. Since the initial phase of the simulation involves dynamic adjustments for system control objectives and energy balance, only the longest continuous time series after the system reached steady-state operation was selected, to ensure time-series integrity and avoid introducing bias during model training.
During data preprocessing, all input features and continuous regression targets (voltage and current) were standardized using Standard Scaler, which centers the data by removing the mean and scales it to unit variance. This approach enhances the stability of model training by mitigating the influence of feature scales. Though it is less robust to outliers compared to quantile-based scaling, Standard Scaler was chosen for its computational efficiency and general effectiveness under Gaussian-like distributions. The scaler parameters obtained from the training set were stored and applied consistently during the prediction phase to ensure uniform input distribution for the multi-task learning model. Further details regarding neural network architecture and training configurations are provided in Section 3.3.
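A minimal sketch of this preprocessing step, mirroring the fit-on-train/apply-everywhere behavior of scikit-learn's StandardScaler (the function names here are our own):

```python
import numpy as np

def fit_scaler(X):
    """Compute per-feature mean and standard deviation from training data
    (the fit step: parameters are derived from the training set only)."""
    mean = X.mean(axis=0)
    scale = X.std(axis=0)
    scale = np.where(scale == 0, 1.0, scale)  # guard against constant features
    return mean, scale

def apply_scaler(X, mean, scale):
    """Standardize data with the stored training-set parameters, so the
    prediction phase sees the same input distribution as training."""
    return (X - mean) / scale

# Illustrative training data: [submodule voltage (V), AC current (A)]
train = np.array([[1500.0, 10.0], [1510.0, 12.0], [1490.0, 8.0]])
mean, scale = fit_scaler(train)
z = apply_scaler(train, mean, scale)
```

The stored `mean` and `scale` are later applied unchanged to validation, test, and online inference data, which is exactly the consistency requirement described above.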

3.3. Experimental Settings

3.3.1. Dataset Partitioning

To ensure rigorous model evaluation and generalization capability, this study adopts a systematic data partitioning strategy: 80% of the original dataset is allocated as the training set, with the remaining 20% reserved as the test set. Furthermore, 12.5% of the training set (i.e., 10% of the full dataset) is extracted to form a validation set, resulting in a final distribution of 70% training data, 10% validation data, and 20% test data. This approach guarantees sufficient training data while enabling hyperparameter optimization and early-stopping monitoring through the validation set, with the test set providing an objective assessment of model performance on unseen data.
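The partitioning can be sketched as follows, assuming the sample indices are kept in chronological order to preserve time-series integrity (the function name and the contiguous-block split are assumptions):

```python
def split_dataset(n_samples, f_train=0.7, f_val=0.1):
    """Partition chronologically ordered sample indices into the final
    70/10/20 train/validation/test split described above."""
    n_train = int(f_train * n_samples)
    n_val = int(f_val * n_samples)
    idx = list(range(n_samples))
    return (idx[:n_train],                      # 70% training
            idx[n_train:n_train + n_val],       # 10% validation
            idx[n_train + n_val:])              # 20% test

train_idx, val_idx, test_idx = split_dataset(1000)
```

Keeping the three blocks contiguous (rather than shuffling) avoids leaking temporally adjacent samples between the training and test sets.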

3.3.2. Hyperparameter Configuration

The model employs a multilayer perceptron architecture with the following core hyperparameters: the input layer dimension is 8, corresponding to the number of electrical features; hidden layers are configured with 64 and 32 neurons, both using ReLU activation functions; the output layer consists of six nodes—four with Sigmoid activation for switch status classification tasks and two with linear activation for voltage and current regression tasks. The model uses the Adam optimizer with a batch size of 64. The loss function configuration combines binary cross-entropy for classification tasks and mean squared error for regression tasks, with weighting coefficients emphasizing optimization priority for critical parameters.

3.3.3. Training Details

The model training follows a multi-task learning framework with a maximum of 300 training epochs. Each epoch includes forward propagation to compute six output results, followed by backpropagation and parameter updates based on a weighted loss function (cross-entropy for classification tasks, mean-squared error for regression tasks). After each training cycle, independent performance metrics for subtasks and the overall weighted loss are synchronously evaluated on the validation set to continuously monitor model generalization capability and training stability. The experiments were implemented using Keras and executed on a Windows PC equipped with an Intel Core i5-1135G7 processor, which handles graphical processing through its integrated Intel Iris Xe GPU.
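The learning-rate schedule used during training (halving on a 5-epoch validation-loss plateau, as provided by Keras's ReduceLROnPlateau callback) can be sketched as a standalone class; the class name here is our own:

```python
class HalveOnPlateau:
    """Halve the learning rate when the validation loss fails to improve
    for `patience` consecutive epochs (the schedule described above)."""
    def __init__(self, lr=1e-3, patience=5, factor=0.5):
        self.lr, self.patience, self.factor = lr, patience, factor
        self.best = float("inf")
        self.wait = 0

    def step(self, val_loss):
        # Call once per epoch with the current validation loss.
        if val_loss < self.best:
            self.best = val_loss
            self.wait = 0
        else:
            self.wait += 1
            if self.wait >= self.patience:
                self.lr *= self.factor
                self.wait = 0
        return self.lr
```

For example, one improving epoch followed by five non-improving epochs triggers exactly one halving, from 1e-3 to 5e-4.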

3.4. Evaluation Metrics

To comprehensively evaluate the model’s performance, we have selected the following evaluation metrics. The choice of evaluation metrics is carefully designed to match the dual-task nature of the proposed composite neural network model (which includes both discrete switch state classification and continuous electrical parameter regression) as well as the practical requirements of the Modular Multilevel Converter (MMC) system.

3.4.1. Evaluation Metrics for Switch Signals (Classification Task)

Accuracy, precision, recall, and F1 score are used for evaluation and they are summarized as follows:
Accuracy: The accuracy of each of the four switch signals is calculated separately; after post-processing correction, the accuracy is recalculated; and the overall switch accuracy is then determined. This directly reflects the model’s ability to identify the overall combination of switches.
$Accuracy = \dfrac{TP + TN}{TP + TN + FP + FN},$
where TP (True Positives) denotes the number of positive instances correctly predicted by the model; FP (False Positives) indicates the number of negative instances erroneously predicted as positive; FN (False Negatives) represents the number of positive instances incorrectly predicted as negative; TN (True Negatives) represents the number of negative instances correctly predicted by the model.
Precision, Recall, and F1 Score: Precision and recall quantify the reliability and coverage of positive predictions in the results, addressing class imbalance issues. The F1 score, as the harmonic mean of the two, provides a comprehensive measure of the classifier’s performance.
$Precision = \dfrac{TP}{TP + FP}$
$Recall = \dfrac{TP}{TP + FN}$
$F1 = \dfrac{2 \times Precision \times Recall}{Precision + Recall}$

3.4.2. Evaluation Metrics for Voltage/Current (Regression Tasks)

The mean absolute error (MAE) is used for evaluation:
$MAE = \dfrac{1}{n} \sum_{i=1}^{n} \left| y_i - \hat{y}_i \right|$
where $n$ denotes the total number of data points (samples) in the dataset, $y_i$ represents the actual observed value of the $i$-th data point, and $\hat{y}_i$ represents the value predicted by the model for the $i$-th data point. The mean absolute error (MAE) directly reflects the magnitude of prediction errors for continuous variables such as voltage and current, and it has a clear physical meaning.
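The classification and regression metrics above can be computed directly from the predictions; a minimal Python sketch (function names are our own):

```python
def classification_metrics(y_true, y_pred):
    """Accuracy, precision, recall, and F1 score from binary labels."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    accuracy = (tp + tn) / len(y_true)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return accuracy, precision, recall, f1

def mae(y_true, y_pred):
    """Mean absolute error for the voltage/current regression outputs."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)
```

In the experiments these quantities are evaluated per switch signal and then aggregated into the overall switch accuracy reported in Section 4.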
The aforementioned metrics provide a quantifiable, interpretable, and rigorous basis for assessing the performance of the composite neural network controller across multiple dimensions, including macroscopic accuracy, microscopic error distribution, balance of positive and negative samples, and control deviation of continuous variables. Together, their results form a scientific criterion for determining whether the proposed method meets the high-precision, high-reliability control requirements of the MMC system, laying a solid data foundation for the subsequent analyses.

4. Results and Discussion

4.1. MMC Simulation Performance (Comparison with Benchmark [18])

This paper investigates the performance of the hybrid-control MMC system under load-step conditions through simulation studies and compares it with the reference MPC method. The simulation results show that the system maintains a zero-power control state before 0.4 s. From 0.4 to 0.42 s, a load of 0.6 MW is applied as a ramp, and at 0.7 s a load of −0.6 MW is applied. By comparing with the results of the reference study, the control characteristics of the proposed method during dynamic load changes are analyzed in depth. Figure 4 illustrates the system voltage of the modular multilevel converter (MMC). Figure 5 shows the AC-side current, which, compared to the benchmark method, achieves stable and ripple-free current control under rapid load changes. Figure 6 and Figure 7 present the active and reactive power on the AC side, respectively. The active power accurately tracks the load reference, rising to 0.6 MW and finally stabilizing at −0.6 MW after the load step. Figure 8 depicts the internal unbalanced current of phase A, which remains relatively stable despite load fluctuations. Figure 9 shows the d-q-axis circulating current, which maintains a zero value throughout the dynamic test, highly consistent with the target waveform. Figure 10 illustrates the DC-side current, indicating that the proposed control method, like the reference method, achieves precise, ripple-free regulation. Figure 11 shows the submodule capacitor voltages of the upper arm of phase A, which, although slightly slower in response than the reference method, still maintain a stable and balanced state. This suggests the proposed hybrid control strategy achieves better voltage control and phase-energy steady-state error at the expense of a slightly slower dynamic response.
More advanced pulse-width modulation (PWM) schemes, such as nearest-level modulation PWM (NLM-PWM), or faster insulated-gate bipolar transistors (IGBTs), may help to improve the dynamic response.
In summary, the simulation results fully demonstrate the superior dynamic performance of the proposed control method under load step change conditions. Compared with the reference method, this method can accurately track load changes, achieving rapid and overshoot-free active power regulation (from 1.2 megawatts to −1.2 megawatts), while ensuring the stability of reactive power output. During the load transient process, the system voltage and AC side current remain ripple-free and stable, which suggests the reliability of the proposed algorithm in suppressing high-frequency interference. In addition, the internal unbalanced current and the d-q axis circulating current closely follow their target values, the DC side current has no transient fluctuations, and the submodule capacitor voltage is well balanced. These results comprehensively confirm the significant advantages of the proposed strategy in ensuring the stable operation of MMC, mitigating transient impacts, and improving power quality.
The selection of a 1500 V submodule voltage is based on multiple considerations. International engineering practice has shown that this voltage level has been widely adopted—for instance, Germany’s DolWin3 high-voltage direct current (HVDC) transmission project employs a 1500 V design, which enhances transmission efficiency and system stability. It not only meets high-voltage power transmission requirements but also adapts well to complex grid environments. Simulation results further demonstrate that even when the submodule voltage is increased, the output current and voltage waveforms of the system remain of high quality. The 1500 V design satisfies performance requirements while offering good applicability in engineering applications. Thus, it is not only theoretically sound but also provides significant practical advantages.

4.2. RSF Algorithm Performance

The performance of the reduced-switching-frequency (RSF) algorithm is evaluated using two key indicators: submodule capacitor voltage fluctuation and switching action count. The waveform comparison between the conventional method and the RSF approach visually confirms the resulting system performance enhancement.
Figure 12 compares the performance of the two algorithms in terms of submodule voltage. Compared to the traditional sorting algorithm, the RSF algorithm does not sacrifice accuracy while reducing the switching frequency. The fluctuation of the submodule capacitor voltage only slightly increases from 1.58% to 1.61%.
Figure 13 visually illustrates the switching frequency of system modules under the control of both the traditional sorting algorithm and the RSF sorting algorithm. Based on a dataset containing the same number of samples (400,000), the traditional bubble sort algorithm recorded an average switching frequency of 5.09 kHz. In contrast, the RSF algorithm reduced the average switching frequency to 1.22 kHz. This represents a 76.1% reduction in switching operations. A detailed comparison of the relevant electrical parameter performance is provided in Table 4.
It is worth noting that the reduction in switching frequency does not compromise the electrical performance of the MMC. As shown in Figure 12, the electrical performance of the two methods is nearly identical: the submodule capacitor voltage ripple remains low, while the total harmonic distortion (THD) of the output current remains low and approximately unchanged (8.64% vs. 8.68%), as suggested in Table 4. Furthermore, the algorithm maintains good inter-arm operational consistency: the standard deviation of the switching frequency among the arms increases by a negligible 0.04 kHz (from 0.05 kHz to 0.09 kHz), confirming that the system-level balancing performance of the control system remains unaffected.
This significant reduction originates from the core mechanism of the RSF algorithm: its state-retention mechanism. Unlike traditional methods that re-sort all sub-modules every control cycle (which usually leads to redundant state changes), the RSF algorithm selectively operates only on the subset of modules that require a state transition. It specifically selects from the inactive modules when increasing the number of engaged modules, and from the activated modules when decreasing the number, thereby intelligently managing module states. This approach, combined with the voltage sorting strategy that adapts to the direction of current flow, eliminates unnecessary switching actions.

Effects of Varying Submodule Size per Arm and Choice of Voltage Balancing Approaches

We also studied the effect of varying the number of submodules per arm of the MMC to explore the direct applicability of the proposed approach to larger, more complex MMC systems; the findings are summarized in Table 5, Table 6 and Table 7. From the switching state prediction accuracy shown in Table 8a–d, the proposed approach maintains reasonably high accuracy in switching state prediction (SWP). Moreover, we have included a comparison with the following state-of-the-art approaches:
(1) Conventional (Bubble Sort): Bubble sort is used.
(2) Switching Rotation Control: A fixed circular sequence is used for submodule insertion and bypass.
(3) Quick Sort: It partitions submodules into high- and low-voltage groups using a pivot voltage, prioritizing insertion from the high-voltage group and bypass from the low-voltage group without requiring a full sort.
(4) Model Predictive Control: It forecasts future capacitor voltage trajectories to proactively balance voltages or prevent limit violations using an exhaustive search over each switching combination.
(5) Group Sort: It divides submodules into smaller groups, performs independent sorting within each group, and then employs a cross-group coordination mechanism to finalize the selection of submodules to operate.
(6) Proposed MONN approach with RSF: The proposed RSF is used to generate the training data so that the proposed MONN can learn the optimal switching states. After training, RSF is no longer required, and the MONN directly determines which SMs to switch on/off.
The major difference between the proposed MONN with RSF and the other sorting approaches is the consideration of the previous switching state. As the proposed RSF approach only adds or removes SMs on top of those activated in the previous state, as shown in Figure 2, the number of switching operations is substantially reduced, as shown in Figure 13b. Experimental results demonstrate that the RSF method reduces power dissipation by 73% at N = 4 (from 6234 W to 1661 W). The reduction improves further to 87.7% (from 6358 W to 766 W) when N is increased to 10. Moreover, the proposed approach (with RSF) achieves a lower switching frequency and, hence, lower power loss. The power loss per SM is reduced by roughly 50% (from 1661 W to 766 W) when N increases from 4 to 10 in Table 5. This suggests the potential benefit of the proposed approach in reducing power consumption per SM for larger systems. Details on the calculation of power consumption can be found in Appendix A.

4.3. Multi-Output Neural Network (MONN) Controller Performance

4.3.1. For Classification Tasks

In this section, we evaluate the performance of the proposed MONN in predicting the switching state and activation count. Among the most recent works [7,11,12,13,14], only the Neural-Network-Based Model Predictive Controller for Modular Multilevel Converters (NNMPC-MMC) [13] proposed an NN-based classifier, and it predicts the activation count rather than the actual switching state (on/off) of each submodule (SM)/switch. Consequently, an extra sorting step is required to determine the set of switches to be turned on. To the best of our knowledge, no other non-neural classifiers have been reported for switching state prediction (SWP). This is possibly due to the challenge of handling time-series (continuous voltages and currents) and event (discrete switching state) forecasting simultaneously, which favors neural network classifiers, as they can be designed to support multi-tasking. Hence, we compare the proposed algorithm with NNMPC-MMC and other non-neural classifiers to improve the comprehensiveness of our comparison:
(1)
Neural Network Based Model Predictive Controllers for Modular Multilevel Converters (NNMPC-MMC) [13]: Unlike the proposed method, the classifier of NNMPC-MMC only performs activation count prediction (ACP) (i.e., predicting the total number of switched-on SMs) rather than specifying the exact switching state, i.e., which SMs are on and which are off. Two configurations are considered: (a) a classifier that treats the activation count as discrete categories, and (b) a regressor that relaxes the activation count to a continuous variable.
(2)
Support Vector Machine (SVM): As non-neural classifiers have rarely been studied for MMC switching state prediction, we include the SVM for comparison.
(3)
XGBoost: Similarly, XGBoost is also included for studying the MMC switching state prediction.
(4)
Naïve Bayes classifier: Similarly, the Naïve Bayes classifier is also included for studying the MMC switching state prediction.
For the proposed approach, we evaluate the following settings:
(1)
Proposed MONN-based switching state prediction (MONN-SWP) (Setting A): This is the original setting. It predicts the exact switching state, from which the number of switched-on SMs can also be determined.
(2)
Proposed MONN-based activation count prediction (MONN-ACP) (Setting B): For a fair comparison with NNMPC-MMC [13], we create a separate network for predicting activation counts only. The network complexity is also reduced from the original two hidden layers (with 64 and 32 hidden units, respectively) to one hidden layer with 12 hidden units, following the hidden-layer architecture of [13].
(3)
Proposed MONN-SWP-ACP with Novel Cardinality-Constrained Post-Inference Projection (MONN-SWP-ACP-CCPIP) (Setting C): We propose to jointly predict the switching state (SWP) and the activation count. However, an initial test revealed that the cardinality of the predicted switching state could misalign with the predicted activation count of the activation count prediction (ACP), so the SWP showed no improvement. To this end, we propose a novel Cardinality-Constrained Post-Inference Projection (CCPIP) to align the SWP with the ACP via projection onto a constrained set. Experimental results reported below show that roughly a 5% improvement in switching state prediction can be achieved over the original setting (Setting A). Table 2 shows the implementation of the CCPIP solution outlined in conjecture (1) in Section 2.3.4, where a detailed derivation of the proposed MONN-SWP-ACP-CCPIP can also be found.
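The CCPIP projection of Setting C can be sketched in a few lines of numpy (a simplified sketch of the steps listed in Table 2; the predicted activation count is taken as given, and the variable names are ours):

```python
import numpy as np

def ccpip(p, n_acp, threshold=0.5):
    """Cardinality-Constrained Post-Inference Projection: align the
    thresholded switching-state prediction with the predicted activation
    count by flipping the least-confident bits."""
    y = (p >= threshold).astype(int)
    diff = int(y.sum()) - n_acp
    if diff > 0:
        # False positives: switch off the 'on' SMs with the smallest probabilities.
        on_idx = np.flatnonzero(y == 1)
        drop = on_idx[np.argsort(p[on_idx])[:diff]]
        y[drop] = 0
    elif diff < 0:
        # False negatives: switch on the 'off' SMs with the largest probabilities.
        off_idx = np.flatnonzero(y == 0)
        add = off_idx[np.argsort(p[off_idx])[diff:]]
        y[add] = 1
    return y

# Three SMs exceed the threshold but the ACP tower predicts two:
# the least confident 'on' prediction is projected back to 0.
probs = np.array([0.9, 0.6, 0.55, 0.1])
y_aligned = ccpip(probs, n_acp=2)
```

Flipping the least-confident predictions is exactly the cross-entropy-minimizing projection argued in the footnotes of Table 2.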
The t-test is used to validate the repeated hold-out procedure and to compare against other machine learning classifiers [27] (such as the support vector machine (SVM), XGBoost, etc.) as well as state-of-the-art NN-MPC-related work. We notice that when the number of Monte Carlo repetitions is increased to 10, the variance of the performance metrics (i.e., accuracy, precision, recall, and F1-score) across repetitions is small (e.g., 0.005), the mean differences in performance are sufficiently large, and significant p-values (Bonferroni-corrected for multiple testing) can be obtained. Hence, we set the number of Monte Carlo repetitions to 10.
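The repeated hold-out comparison can be reproduced along these lines (a sketch using scipy.stats; the accuracy arrays below are synthetic placeholders for the per-repetition scores, and the Bonferroni factor `n_comparisons` is illustrative):

```python
import numpy as np
from scipy import stats

def compare_models(scores_a, scores_b, n_comparisons, alpha=0.05):
    """Two-sided paired t-test over Monte Carlo repetitions, judged
    against a Bonferroni-corrected significance threshold."""
    t_stat, p_value = stats.ttest_rel(scores_a, scores_b)
    return p_value, p_value < alpha / n_comparisons

# Hypothetical accuracies over 10 Monte Carlo repetitions (std ~ 0.005).
rng = np.random.default_rng(0)
acc_monn = 0.92 + 0.005 * rng.standard_normal(10)
acc_svm = 0.85 + 0.005 * rng.standard_normal(10)
p, significant = compare_models(acc_monn, acc_svm, n_comparisons=4)
```

With small per-repetition variance and a large mean gap, the corrected test remains significant, mirroring the behavior described above.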
(1)
Comparison of Switching State Prediction (SWP) Performance
We compared the proposed approach against various state-of-the-art algorithms in terms of switching state prediction for the four submodules (SMs)/switches and activation count prediction. Table 8a–d show the switching state prediction performance (mean ± std) of the various algorithms in terms of accuracy, precision, recall, and F1-score for SMs/switches 1 to 4, respectively. The newly added algorithms and settings, highlighted in yellow, include SVM, Naïve Bayes, Random Forest, XGBoost, and Settings B and C of the proposed MONN. Table 8e shows the activation count prediction performance of the proposed approach against NNMPC-MMC [13].
From Table 8a–d, the proposed approaches (Settings A to C) generally outperform the other classifiers by 10–15% in terms of accuracy, precision, recall, and F1-score. This is largely attributed to the dual-tower architecture of the proposed approach, which performs classification (switching state prediction) while the regression tower simultaneously forecasts the arm average voltage and ac-side current. As the classification and regression towers are optimized jointly and share a common set of features, the regression task can enhance learning by capturing complementary information that the classifier alone may miss: the classifier only needs to learn the on/off decision, whereas the regressor must learn to predict the pattern of the arm voltage and ac-side current. The extra information gained through learning to predict the actual voltages and currents likely contributed to the improved switching state prediction performance.
Moreover, to further enhance the proposed approach, we introduced an extra setting (Setting C), which adds an activation count prediction (ACP) task to improve the switching state prediction (SWP) via the proposed CCPIP, yielding roughly a 5% performance improvement.
(2)
Comparison of Activation Count Prediction (ACP) Performance
As the NNMPC-MMC approach only predicts the activation count rather than the actual switching state, we reconfigured the classification tower of the proposed approach for a fair comparison with NNMPC-MMC; the resulting setting is called MONN-ACP (Setting B). We also reduced the complexity of our original network from two hidden layers (with 64 and 32 hidden units, respectively) to one hidden layer with 12 hidden units, following the configuration of NNMPC-MMC [13]. This ensures both networks have the same complexity. For the input and output variables, although the original NNMPC-MMC did not include the capacitor voltages as inputs [13], we provide these extra inputs to NNMPC-MMC so that both approaches share the same set of input variables, ensuring a fair comparison.
Table 8e shows the comparison between NNMPC-MMC and the proposed MONN-ACP (Setting B). The performance of the two algorithms is highly comparable; a two-sided t-test shows no significant difference (p > 0.05) between them. However, it should be noted that NNMPC-MMC only predicts the activation count, and an extra search (say, via conventional sorting) is required to find the actual SMs to be switched on/off. In contrast, the proposed approach directly predicts the switching state.
Moreover, comparing the original setting (Setting A) with MONN-SWP-ACP-CCPIP (Setting C), the proposed approach achieves roughly a further 5% improvement in SWP by correcting the predicted states using the proposed constrained projection (CCPIP) and the obtained ACP.
(3)
Effect of Different Optimizers
To study the effect of the choice of optimizer, we trained the proposed MONN using the ADAM optimizer (regularization parameter 0.001) and the Bayesian Regularization Back Propagation (BRB) optimizer [29]; the corresponding loss curves for the four switches (SW1–SW4) are shown in Figure 14a–d and Figure 15a–d, respectively. The ADAM optimizer supports an adaptive learning rate, but choosing a good regularization parameter is challenging and may require extra hyperparameter tuning.
The BRB optimizer employs a Bayesian framework to derive a closed-form solution for the optimal regularization parameter, and the Levenberg–Marquardt (LM) algorithm [30] to solve the backpropagation numerically. The damping factor, which can be viewed as regularization applied to the original Gauss–Newton problem, adjusts the balance between Gauss–Newton steps (more aggressive) and gradient descent steps (more conservative). Moreover, the damping factor is adjusted automatically within the LM algorithm, making the effective learning rate adaptive.
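A minimal Levenberg–Marquardt iteration on a toy least-squares problem illustrates this damping adaptation (a generic sketch of the LM mechanism described above, not the BRB implementation itself):

```python
import numpy as np

def lm_fit(residual_fn, jac_fn, x0, n_iter=50, lam=1e-2):
    """Levenberg-Marquardt: solve (J^T J + lam*I) dx = -J^T r.
    The damping factor lam blends Gauss-Newton (small lam) with gradient
    descent (large lam) and is adapted from the outcome of each trial step."""
    x = np.asarray(x0, dtype=float)
    cost = 0.5 * np.sum(residual_fn(x) ** 2)
    for _ in range(n_iter):
        r, J = residual_fn(x), jac_fn(x)
        dx = np.linalg.solve(J.T @ J + lam * np.eye(len(x)), -J.T @ r)
        new_cost = 0.5 * np.sum(residual_fn(x + dx) ** 2)
        if new_cost < cost:   # accept: move toward Gauss-Newton
            x, cost, lam = x + dx, new_cost, lam * 0.5
        else:                 # reject: back off toward gradient descent
            lam *= 2.0
    return x

# Fit y = a*x + b to noiseless data; LM should recover a = 2, b = 1.
xs = np.linspace(0.0, 1.0, 20)
ys = 2.0 * xs + 1.0
res = lambda p: p[0] * xs + p[1] - ys
jac = lambda p: np.stack([xs, np.ones_like(xs)], axis=1)
p_hat = lm_fit(res, jac, [0.0, 0.0])
```

Each accepted step halves the damping (more Gauss–Newton), while each rejected step doubles it (more gradient descent), which is the adaptive behavior referred to above.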
From the loss curves in Figure 14 and Figure 15, it can be seen that BRB produces less oscillation in the loss curves, which is attributed to the Bayesian regularization and the adaptive step size of the LM algorithm. However, the overall classification performance of the two optimizers is almost identical, suggesting that the oscillations have little impact on classification performance.
Our multi-task model employed binary cross-entropy loss for the four independent switch prediction tasks. Overall, the training dynamics were highly favorable, evidenced by the closely coupled descent of the training and validation losses. All switches achieved test accuracies exceeding 91%, validating the model’s efficacy.
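For reference, the forward pass of the shared-trunk, three-tower architecture (with the layer sizes listed in Table 1) can be sketched in numpy; the weights below are random placeholders rather than trained values:

```python
import numpy as np

rng = np.random.default_rng(42)

def dense(n_in, n_out):
    # Random placeholder weights and zero biases for illustration only.
    return rng.standard_normal((n_in, n_out)) * 0.1, np.zeros(n_out)

relu = lambda z: np.maximum(z, 0.0)
sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))
softmax = lambda z: np.exp(z - z.max()) / np.exp(z - z.max()).sum()

# Shared trunk (8 inputs -> 64 -> 32) and the three task towers.
W1, b1 = dense(8, 64)
W2, b2 = dense(64, 32)
Wswp, bswp = dense(32, 4)   # per-SM switching-state probabilities (sigmoid)
Wacp, bacp = dense(32, 5)   # activation-count distribution, 0..4 (softmax)
Wreg, breg = dense(32, 2)   # arm average voltage, ac-side current (linear)

def monn_forward(x):
    h = relu(relu(x @ W1 + b1) @ W2 + b2)
    return sigmoid(h @ Wswp + bswp), softmax(h @ Wacp + bacp), h @ Wreg + breg

swp, acp, reg = monn_forward(rng.standard_normal(8))
```

Training couples a binary cross-entropy loss on the sigmoid tower with losses on the other towers over the shared features, which is the multi-task coupling discussed in Section 4.3.1.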

4.3.2. For Regression Tasks

To evaluate the regression performance of the model in multi-task learning, scatter plots comparing predicted versus actual values for both voltage and current were generated in Figure 16. In both scatter plots, the data points are closely distributed on both sides of the reference line (y = x), exhibiting strong linear correlation and aggregation. Moreover, to further investigate the possibility of systematic bias, we employed an approach similar to kCV-B [31], which further performs bootstrapping within the validation set. Together with the 10 Monte Carlo repetitions mentioned in Section 4.3.1, the total number of realizations is 10 × 1000 = 10,000. With the kCV-B approach, we obtain confidence intervals for the system bias. From Figure 16, the 95% confidence interval covers both negative and positive values. We also examined the histograms of the bootstrapped system biases of the arm average voltages and arm currents in Figure 17a,b, respectively; the bootstrapped biases lie within the 95% confidence interval of zero mean. These findings suggest that the model's predictions for both physical quantities are in strong agreement with the true values, with no significant deviation from zero bias.
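The bootstrap confidence interval for the system bias can be computed along these lines (a sketch of the kCV-B-style resampling; the residuals below are synthetic, zero-mean placeholders standing in for the validation-set prediction errors):

```python
import numpy as np

def bootstrap_bias_ci(residuals, n_boot=1000, alpha=0.05, seed=0):
    """Resample prediction residuals with replacement and return the
    (1 - alpha) percentile confidence interval of the mean bias."""
    rng = np.random.default_rng(seed)
    n = len(residuals)
    means = np.array([rng.choice(residuals, size=n, replace=True).mean()
                      for _ in range(n_boot)])
    lo, hi = np.percentile(means, [100 * alpha / 2, 100 * (1 - alpha / 2)])
    return lo, hi

# Synthetic residuals, centered to emulate an unbiased model:
# the 95% CI of the bootstrapped mean bias should straddle zero.
rng = np.random.default_rng(1)
residuals = rng.normal(0.0, 1.0, size=500)
residuals = residuals - residuals.mean()
lo, hi = bootstrap_bias_ci(residuals)
```

A confidence interval covering both negative and positive values, as here, is the "no significant bias" conclusion drawn from Figure 17.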

5. Conclusions

This study proposes an innovative neural network (NN)-based control strategy for modular multilevel converters (MMCs), with a primary focus on achieving precise switch signal control. The multi-output neural network model demonstrates superior performance, particularly when the number of submodules per arm is small. For example, at N = 4, the model achieves an average switching accuracy of 92.01%, highlighting its capability to generate accurate control signals—essential for stable and efficient operation of MMCs.
Among conventional methods, PI control offers a simple structure and high steady-state accuracy but struggles to handle nonlinear dynamics. Model predictive control (MPC), while excelling in dynamic performance and multi-objective optimization, suffers from high computational complexity and a strong dependence on accurate system models. In contrast, the NN-based control strategy adopted in this work uses a data-driven mechanism to combine the advantages of both approaches: it learns steady-state system characteristics through offline training to achieve high-precision control, implicitly constructs a nonlinear dynamic model to facilitate multi-objective optimization and rapid dynamic response, and incorporates online learning capabilities to adapt autonomously to changing operating conditions.
When combined with optimization algorithms, the NN strategy also contributes to reducing energy losses and extending component service life. Although this study is validated under stable operating conditions with a small number of submodules, future research will aim to develop more comprehensive models that incorporate various real-world variables such as unbalanced three-phase grid voltages, harmonic disturbances, grid faults, and internal disturbances. This will enhance the robustness and accuracy of output control signals. Further investigation into the interactions among these variables, along with the integration of real-time data and advanced machine learning techniques, will help optimize converter settings and improve system resilience. Extending this approach to MMCs with more submodules will enhance the efficiency and sustainability of power transmission systems.

Author Contributions

Conceptualization, S.G., H.C.W. and J.Z.; methodology, S.G., H.C.W. and J.Z.; software, S.G.; validation, S.G., S.C.C. and J.Z.; formal analysis, S.G. and H.C.W.; investigation, S.G., H.C.W. and J.Z.; resources, H.C.W. and J.Z.; data curation, S.G.; writing—original draft preparation, S.G.; writing—review and editing, S.G., H.C.W., S.C.C. and J.Z.; visualization, S.G. and H.C.W.; supervision, J.Z.; project administration, J.Z.; funding acquisition, J.Z. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Data Availability Statement

The raw data supporting the conclusions of this article will be made available by the authors on request.

Conflicts of Interest

The authors declare no conflicts of interest.

Appendix A. Power Loss Calculation for IGBT Module (SKM400GB17E4) [32]

Power loss is calculated based on the parameters specified in the datasheet of the SKM400GB17E4 [32]. The total power loss P_total of an IGBT submodule is given by the following equation:
P_total = P_cond + P_sw + P_rec
where P_cond is the conduction loss resulting from the current flowing through the IGBT during its on-state, P_sw is the switching loss incurred during the turn-on and turn-off transitions of the IGBT, and P_rec is the reverse recovery loss associated with the freewheeling diode. These three power loss components can be calculated respectively using the following equations:
P_cond = I_avg · V_ce,sat
P_sw = f_sw · (E_on + E_off)
P_rec = f_sw · E_rec
where I_avg is the average IGBT current, V_ce,sat is the IGBT on-state voltage drop, and f_sw is the switching frequency. E_on, E_off, and E_rec represent the corrected switching and recovery energies under actual operating conditions. They are derived by scaling the datasheet values E_on,db, E_off,db, and E_rec,db, measured at the reference current I_ref and reference voltage V_ref, to the actual peak current I_peak and DC-link voltage V_dc, using the following equations:
E_on = E_on,db · (I_peak / I_ref) · (V_dc / V_ref)
E_off = E_off,db · (I_peak / I_ref) · (V_dc / V_ref)
E_rec = E_rec,db · (I_peak / I_ref) · (V_dc / V_ref)
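These equations translate directly into code (a minimal sketch; the numerical values below are illustrative placeholders, not SKM400GB17E4 datasheet figures):

```python
def igbt_module_loss(i_avg, v_ce_sat, f_sw, i_peak, v_dc,
                     e_on_db, e_off_db, e_rec_db, i_ref, v_ref):
    """Total IGBT submodule loss: conduction + switching + diode reverse
    recovery, with datasheet energies scaled from the reference point
    (i_ref, v_ref) to the actual peak current and DC-link voltage."""
    scale = (i_peak / i_ref) * (v_dc / v_ref)
    p_cond = i_avg * v_ce_sat                    # conduction loss
    p_sw = f_sw * (e_on_db + e_off_db) * scale   # turn-on/turn-off loss
    p_rec = f_sw * e_rec_db * scale              # reverse recovery loss
    return p_cond + p_sw + p_rec

# Illustrative numbers only (currents in A, voltages in V, energies in J).
p_total = igbt_module_loss(i_avg=200.0, v_ce_sat=2.0, f_sw=1200.0,
                           i_peak=400.0, v_dc=900.0,
                           e_on_db=0.1, e_off_db=0.12, e_rec_db=0.05,
                           i_ref=400.0, v_ref=900.0)
```

At the reference operating point the scale factor is one, so the switching terms reduce to the plain datasheet energies times the switching frequency.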

References

  1. Perez, M.A.; Ceballos, S.; Konstantinou, G.; Pou, J.; Aguilera, R.P. Modular Multilevel Converters: Recent Achievements and Challenges. IEEE Open J. Ind. Electron. Soc. 2021, 2, 224–239. [Google Scholar] [CrossRef]
  2. Bergna-Diaz, G.; Suul, J.A.; D’Arco, S. Energy-based state-space representation of modular multilevel converters with a constant equilibrium point in steady-state operation. IEEE Trans. Power Electron. 2018, 33, 4832–4851. [Google Scholar] [CrossRef]
  3. Xiao, Q.; Jin, Y.; Jia, H.; Tang, Y.; Cupertino, A.F.; Mu, Y.; Teodorescu, R.; Blaabjerg, F.; Pou, J. Review of Fault Diagnosis and Fault-Tolerant Control Methods of the Modular Multilevel Converter Under Submodule Failure. IEEE Trans. Power Electron. 2023, 38, 12059–12077. [Google Scholar] [CrossRef]
  4. Moon, J.; Member, S.; Park, J.; Kang, D. A control method of HVDC-modular multilevel converter based on arm current under the unbalanced voltage condition. IEEE Trans. Power Del. 2015, 30, 529–536. [Google Scholar] [CrossRef]
  5. Geyer, T. Model Predictive Control of High Power Converters and Industrial Drives; Wiley: Hoboken, NJ, USA, 2016. [Google Scholar]
  6. Wang, J.; Tang, Y.; Lin, P.; Liu, X.; Pou, J. Deadbeat Predictive Current Control for Modular Multilevel Converters With Enhanced Steady-State Performance and Stability. IEEE Trans. Power Electron. 2020, 35, 6878–6894. [Google Scholar] [CrossRef]
  7. Liu, X.; Qiu, L.; Wu, W.; Ma, J.; Fang, Y.; Peng, Z.; Wang, D. Event-Triggered Neural-Predictor-Based FCS-MPC for MMC. IEEE Trans. Ind. Electron. 2022, 69, 6433–6440. [Google Scholar] [CrossRef]
  8. Liu, X.; Qiu, L.; Wu, W.; Ma, J.; Fang, Y.; Peng, Z.; Wang, D. Neural Predictor-Based Low Switching Frequency FCS- MPC for MMC With Online Weighting Factors Tuning. IEEE Trans. Power Electron. 2022, 37, 4065–4079. [Google Scholar] [CrossRef]
  9. Dekka, A.; Wu, B.; Yaramasu, V.; Zargari, N.R. Integrated model predictive control with reduced switching frequency for modular multilevel converters. IET Electr. Power Appl. 2017, 11, 857–863. [Google Scholar] [CrossRef]
  10. Vatani, M.; Bahrani, B.; Saeedifard, M.; Hovd, M. Indirect Finite Control Set Model Predictive Control of Modular Multilevel Converters. IEEE Trans. Smart Grid 2015, 6, 1520–1529. [Google Scholar] [CrossRef]
  11. Liu, X.; Qiu, L.; Wu, W.; Ma, J.; Fang, Y.; Peng, Z.; Wang, D. Predictor-Based Neural Network Finite-Set Predictive Control for Modular Multilevel Converter. IEEE Trans. Ind. Electron. 2021, 68, 11621–11627. [Google Scholar] [CrossRef]
  12. Gutierrez, B.; Kwak, S.S. Modular Multilevel Converters (MMCs) Controlled by Model Predictive Control with Reduced Calculation Burden. IEEE Trans. Power Electron. 2018, 33, 9176–9187. [Google Scholar] [CrossRef]
  13. Wang, S.; Dragicevic, T.; Gao, Y.; Teodorescu, R. Neural Network Based Model Predictive Controllers for Modular Multilevel Converters. IEEE Trans. Energy Convers. 2021, 36, 1562–1571. [Google Scholar] [CrossRef]
  14. Liu, X.; Qiu, L.; Rodriguez, J.; Wu, W.; Ma, J.; Peng, Z.; Wang, D.; Fang, Y. Neural Predictor-Based Dynamic Surface Predictive Control for Power Converters. IEEE Trans. Ind. Electron. 2022, 70, 1057–1065. [Google Scholar] [CrossRef]
  15. Ding, H.; Wang, Q.; Deng, F.; Cheng, M.; Buja, G. Capacitor Monitoring for Modular Multilevel Converters Based on Intelligent Algorithm. In Proceedings of the 2022 4th Asia Energy and Electrical Engineering Symposium, AEEES 2022, Chengdu, China, 25–28 March 2022; Institute of Electrical and Electronics Engineers Inc.: Piscataway, NJ, USA, 2022; pp. 762–766. [Google Scholar]
  16. Langarica, S.; Pizarro, G.; Poblete, P.M.; Radrigan, F.; Pereda, J.; Rodriguez, J.; Nunez, F. Denoising and Voltage Estimation in Modular Multilevel Converters Using Deep Neural-Networks. IEEE Access 2020, 8, 207973–207981. [Google Scholar] [CrossRef]
  17. Zhang, L.; Zou, Y.; Yu, J.; Qin, J.; Vijay, V.; Karady, G.G.; Shi, D.; America, G.N.; Wang, Z. Modeling, control, and protection of modular multilevel converter-based multi-terminal HVDC systems: A review. CSEE J. Power Energy Syst. 2017, 3, 340–352. [Google Scholar] [CrossRef]
  18. Moon, J.W.; Gwon, J.S.; Park, J.W.; Kang, D.W.; Kim, J.M. Model Predictive Control with a Reduced Number of Considered States in a Modular Multilevel Converter for HVDC System. IEEE Trans. Power Deliv. 2015, 30, 608–617. [Google Scholar] [CrossRef]
  19. Saeedifard, M.; Iravani, R. Dynamic performance of a modular multilevel back-to-back hVdc system. IEEE Trans. Power Del. 2010, 25, 2903–2912. [Google Scholar] [CrossRef]
  20. Deng, F.; Chen, Z. Voltage-balancing method for modular multilevel converters under phase-shifted carrier-based pulsewidth modulation. IEEE Trans. Ind. Electron. 2015, 62, 4158–4169. [Google Scholar] [CrossRef]
  21. Zhang, J.; Liu, J.; Liu, J.; Fang, W.; Hou, J.; Dong, Y. Modified capacitor voltage balancing sorting algorithm for modular multilevel converter. J. Eng. 2019, 2019, 3315–3319. [Google Scholar] [CrossRef]
  22. Samajdar, D.; Bhattacharya, T.; Dey, S. A reduced switching frequency sorting algorithm for modular multilevel converter with circulating current suppression feature. IEEE Trans. Power Electron. 2019, 34, 10480–10491. [Google Scholar] [CrossRef]
  23. Chen, G.; Peng, H.; Zeng, R.; Hu, Y.; Ni, K. A fundamental frequency sorting algorithm for capacitor voltage balance of modular multilevel converter with low-frequency carrier phase shift modulation. IEEE Trans. Emerg. Sel. Top. Power Electron. 2018, 6, 1595–1604. [Google Scholar] [CrossRef]
  24. Abushafa, O.S.M.; Dahidah, M.S.; Gadoue, S.M.; Atkinson, D.J. Submodule voltage estimation scheme in modular multilevel converters with reduced voltage sensors based on kalman filter approach. IEEE Trans. Ind. Electron. 2018, 65, 7025–7035. [Google Scholar] [CrossRef]
  25. Hu, P.; Teodorescu, R.; Wang, S.; Li, S.; Guerrero, J.M. A currentless sorting and selection-based capacitor-voltage-balancing method for modular multilevel converters. IEEE Trans. Power Electron. 2019, 34, 1022–1025. [Google Scholar] [CrossRef]
  26. Ilves, K.; Harnefors, L.; Norrga, S.; Nee, H. Predictive sorting algorithm for modular multilevel converters minimizing the spread in the submodule capacitor voltages. IEEE Trans. Power Electron. 2015, 30, 440–449. [Google Scholar] [CrossRef]
  27. Pedregosa, F.; Varoquaux, G.; Gramfort, A.; Michel, V.; Thirion, B.; Grisel, O.; Duchesnay, É. Scikit-learn: Machine learning in Python. J. Mach. Learn. Res. 2011, 12, 2825–2830. [Google Scholar]
  28. Jia, Z.; Zaharia, M.; Aiken, A. Beyond data and model parallelism for deep neural networks. Proc. Mach. Learn. Syst. 2019, 1, 1–13. [Google Scholar]
  29. Foresee, F.D.; Hagan, M.T. Gauss-Newton approximation to Bayesian learning. In Proceedings of the International Conference on Neural Networks (ICNN’97), Houston, TX, USA, 12 June 1997. [Google Scholar]
  30. Nocedal, J.; Wright, S.J. Numerical Optimization; Springer: New York, NY, USA, 2006. [Google Scholar]
  31. Nurunnabi, A.A.M. kCV-B: Bootstrap with Cross-Validation for Deep Learning Model Development, Assessment and Selection. Int. Arch. Photogramm. Remote Sens. Spat. Inf. Sci. 2022, XLVIII-4/W2-2022, 85–92. [Google Scholar] [CrossRef]
  32. Semikron. IGBT Module SKM400GB17E4 Datasheet. Danfoss, November 2023. Available online: https://assets.danfoss.com/documents/latest/516850/AI498340333607en-000103.pdf (accessed on 17 December 2014).
Figure 1. Structure of an MMC.
Figure 2. Flowchart of the proposed RSF approach.
Figure 3. The proposed multi-task learning-driven multi-output neural network (MONN). 1. Setting A: Proposed MONN with switching state prediction and forecasting (MONN-SWP). 2. Setting B: Proposed MONN with activation count prediction (MONN-ACP). 3. Setting C: It combines settings A and B with a novel Cardinality-Constrained Post-Inference Projection for Switch State Prediction (MONN-SWP-ACP-CCPIP). It is not indicated in figure due to space limitation. Details of the proposed CCPIP can be found in Section 2.3.4. 4. ReLU activation is used for hidden layers. Sigmoid activation is used for the SWP tower. Softmax activation is used for the ACP tower. Linear activation is used for the regression tower.
Figure 4. Grid-side voltage waveform comparison: (a) benchmark, (b) proposed model.
Figure 5. AC-side current waveform comparison: (a) benchmark, (b) proposed model.
Figure 6. AC-side active power: (a) benchmark, (b) proposed model.
Figure 7. AC-side reactive power: (a) benchmark, (b) proposed model.
Figure 8. A-phase inner unbalanced current suppression: (a) benchmark, (b) proposed model.
Figure 9. D-Q axis circulating current suppression: (a) benchmark, (b) proposed model.
Figure 10. DC-link current ripple comparison: (a) benchmark, (b) proposed model.
Figure 11. Submodule capacitor voltage balancing performance: (a) benchmark, (b) proposed model.
Figure 12. SM Capacitor voltages performance: (a) bubble sort, (b) RSF algorithm.
Figure 13. Switching frequency performance: (a) bubble sort, (b) RSF algorithm.
Figure 14. Training and validation loss curves for the four switches using the ADAM optimizer: (a) Switch 1; (b) Switch 2; (c) Switch 3; (d) Switch 4.
Figure 15. Training and validation loss curves for the four switches using the Bayesian Regularization Back Propagation (BRB) optimizer: (a) Switch 1; (b) Switch 2; (c) Switch 3; (d) Switch 4.
Figure 16. Scatter plots for regression tasks. (a) Arm average voltage, (b) arm current.
Figure 17. Bootstrap distributions of system bias for (a) arm average voltage, (b) arm current.
Table 1. Parameters of the proposed multi-output neural network (MONN) 1.
Layer | Number of Parameters | Size (4 bytes/Parameter) | Operations | Cycles for Int8 Microprocessors (MCUs) with LUT
Feature Sharing Layers
Input | 8 | 32 bytes | |
Hidden layer 1 (64 hidden units) | 8 × 64 + 64 = 576 | 2304 bytes | 512 MAC 2, 64 bias add | 576–1152 cycles
Activation (ReLU) 3 | 0 | | 64 ReLU operations | 192–384 cycles
Hidden layer 2 (32 hidden units) | 64 × 32 + 32 = 2080 | 8320 bytes | 2048 MAC, 32 bias add | 2080–4160 cycles
Activation (ReLU) 3 | 0 | | 32 ReLU operations | 96–192 cycles
Switch State Prediction (SWP) Tower
Output (4 units) | 32 × 4 + 4 = 132 | 528 bytes | 128 MAC, 4 bias add | 132–264 cycles
Activation (Sigmoid) 4 | 0 | | 4 sigmoid operations | 12–24 cycles
Activation Count Prediction (ACP) Tower
Output (5 units) | 32 × 5 + 5 = 165 | 660 bytes | 160 MAC, 5 bias add | 165–330 cycles
Activation (Softmax) 5 | 0 | | 5 softmax LUT operations (6–10 cycles per element) | 30–50 cycles
Regression Tower
Output (2 units) | 32 × 2 + 2 = 66 | 264 bytes | 34 MAC, 2 bias add | 36–72 cycles
Activation (Linear) | 0 | | 2 linear operations | 2 cycles
Total | 3027 | 12,108 bytes (~11 KB) | | 3321–6630 cycles (t ≈ 0.022 ms for a 150 MHz MCU 6)
1 Training time: 1 h 35 min for 300 epochs (1.5 million samples) on a PC. 2 MAC (multiply-accumulate operation) and bias add: 1–2 cycles per operation. 3 ReLU: branchless max (1 cycle per element) plus re-quantization (3–6 cycles per element). 4 Sigmoid: 1–2 cycles per look-up per element, assuming 256–1024 LUT entries over a bounded range with integer indexing, plus 2–3 cycles for interpolation. 5 Softmax: assuming 256–1024 LUT entries for e^x plus a reciprocal LUT for the normalization term 1/Σ_j e^{x_j}; 6–10 cycles per element due to comparisons, the e^x LUT, summing, the reciprocal LUT, and normalization. 6 Run-time is estimated by t = cycles / f_clock, where f_clock is the processor clock frequency. For illustration, we have chosen the NXP MCX series with dual Cortex-M33 cores at 150 MHz.
Table 2. Proposed Cardinality-Constrained Post-Inference Projection.
A—Compute the switching state after inference:
ŷ_{n,t} = 1 if p̂_{n,t} ≥ 0.5, and 0 otherwise
B—Compute the cardinality:
N̂_t = Σ_{n=1}^{N} ŷ_{n,t}
C—Project onto the constrained set:
Case 1—No misalignment:
If N̂_t = N̂_{c,t} (i.e., no misalignment), no projection is required. #
Case 2—False positives (cardinality of SWP greater than ACP):
Else if N̂_t > N̂_{c,t}:
  Compute N_diff = N̂_t − N̂_{c,t}.
  Sort the probabilities p̂_{n,t} of the N̂_t switched-on classifiers.
  Set the outputs of the classifiers with the N_diff smallest p̂_{n,t} to 0. *
Case 3—False negatives (cardinality of SWP smaller than ACP):
Else if N̂_t < N̂_{c,t}:
  Compute N_diff = N̂_{c,t} − N̂_t.
  Sort the probabilities p̂_{n,t} of the remaining N − N̂_t classifiers.
  Set the outputs of the classifiers with the N_diff largest p̂_{n,t} to 1. ^
# As there is no misalignment, the unconstrained minimizer (the original solution) of L_CE(ŷ_{n,t}, p̂_{n,t}) already lies within the constraint set, so no projection is required. * When minimizing L_CE(ŷ_{n,t}, p̂_{n,t}), the classifiers with the smallest p̂_{n,t} incur the greatest loss when assigned ŷ_{n,t} = 1, as they are the least likely to output 1; projecting these false positives to zero therefore minimizes the loss while satisfying the cardinality constraint. ^ Conversely, the classifiers with the largest p̂_{n,t} incur the greatest loss when assigned ŷ_{n,t} = 0; projecting these false negatives to one minimizes the loss while satisfying the cardinality constraint.
Table 3. MMC Simulation Parameters.
Items | Values
Number of SMs per Arm | 4
Rated DC Voltage | 6000 V
Nominal SM Capacitance | 10 mF
Nominal SM Capacitor Voltage | 1500 V
Rated Frequency | 50 Hz
Arm Inductance | 2 mH
Sample Frequency | 1 MHz
AC System Voltage | 1500 V
Table 4. Comparison of voltage balancing control strategies.
Voltage Balancing Method | Average Switching Frequency | Submodule Voltage Ripple | Output AC Current THD
Conventional Balancing Method | 5093.75 Hz | 1.58% | 8.64%
RSF Strategy | 1218.75 Hz | 1.61% | 8.68%
Table 5. Switching state prediction accuracy under varying number of submodules (SM) per arm ( N ).
N = 4N = 5N = 6N = 7N = 8N = 9N = 10
Switch 191.05%90.40%90.23%90.33%90.17%89.08%72.69%
Switch 291.91%90.08%90.46%90.79%89.78%90.28%72.67%
Switch 391.22%90.52%90.00%91.09%90.04%89.46%73.38%
Switch 489.73%90.78%89.96%91.16%90.23%88.77%73.23%
Switch 5/90.28%90.44%90.45%90.21%88.03%72.99%
Switch 6//90.00%90.33%89.87%89.10%73.03%
Switch 7///91.00%90.13%89.94%73.52%
Switch 8////90.03%89.54%73.28%
Switch 9/////88.5973.51%
Switch 10//////72.62%
Table 6. Switching frequency and power consumption of the proposed approach under varying number of submodules (SMs) per arm (N).
Method | N = 4 | N = 5 | N = 6 | N = 7 | N = 8 | N = 9 | N = 10
Switching Frequency (Hz)
Conventional | 5094 | 4849 | 3981 | 5207 | 5109 | 5261 | 5199
Proposed (with RSF) | 1219 | 1060 | 539 | 507 | 513 | 526 | 460
Module Power Consumption (W)
Conventional | 6234 | 5945 | 4921 | 6367 | 6252 | 6431 | 6358
Proposed (with RSF) | 1661 | 1474 | 859 | 821 | 828 | 844 | 766
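The 76.1% switching-frequency reduction quoted in the abstract follows directly from the N = 4 column of Table 6 (conventional 5094 Hz versus proposed 1219 Hz):

```python
# Table 6, N = 4: conventional vs. proposed (with RSF) switching frequency
conventional_hz = 5094
proposed_hz = 1219

reduction_pct = (conventional_hz - proposed_hz) / conventional_hz * 100
print(f"{reduction_pct:.1f}%")  # prints "76.1%", matching the abstract
```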
Table 7. Comparison of different MMC voltage balancing approaches.
Method | FLOPs | Number of SMs Engaged in Each Time-Step | Application Scenario
Bubble Sort | 2n(n − 1) | N̂_t | Educational/small-SM demo
Rotation Control | n log₂(n) | N̂_t | Low-frequency, high-power
Quick Sort | 2n ln(n) | N̂_t | Medium-/high-voltage DC
MPC (N = 1) | 2n | N̂_t | High-dynamic, multi-objective
Group Sort | (n/g)² | N̂_t | Multi-core/FPGA implementation
Proposed MONN with RSF | No sorting required after training | Δ = |N̂_t − N̂_{t−1}| | Low-cost MCU/DSP-based MMC
N̂_t: number of estimated SMs required in Equation (6); g: number of groups.
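As a quick way to compare the per-time-step costs in Table 7, the FLOP formulas for the sorting-based rows can be evaluated for a given arm size n (a minimal sketch; the `flops_per_step` helper is ours, not from the paper):

```python
import math

def flops_per_step(n, g=2):
    """Evaluate the per-time-step FLOP formulas from Table 7 for the
    sorting-based methods (n submodules per arm, g sorting groups)."""
    return {
        "bubble_sort": 2 * n * (n - 1),      # quadratic in n
        "rotation_control": n * math.log2(n),
        "quick_sort": 2 * n * math.log(n),
        "group_sort": (n / g) ** 2,
    }

# Example: n = 8 submodules per arm; bubble sort already dominates
counts = flops_per_step(8)
```

The proposed MONN with RSF has no entry here because, once trained, it requires no sorting at all; only the Δ = |N̂_t − N̂_{t−1}| submodules whose states change are switched.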
Table 8. Comparison of various algorithms.
(a) Switching state prediction (mean ± std) for Switch 1 (SW1)
Algorithms | Accuracy | Precision | Recall | F1-score
Other classifiers (from the scikit-learn library [27])
SVM | 0.8468 ± 0.003 | 0.8613 ± 0.003 | 0.8391 ± 0.0044 | 0.8501 ± 0.0029
Naïve Bayes | 0.7487 ± 0.0018 | 0.7559 ± 0.0029 | 0.7601 ± 0.0035 | 0.758 ± 0.0021
Random Forest | 0.8287 ± 0.0132 | 0.7773 ± 0.0157 | 0.7917 ± 0.013 | 0.7844 ± 0.013
XGBoost | 0.8197 ± 0.0107 | 0.7533 ± 0.0139 | 0.7976 ± 0.0154 | 0.7746 ± 0.009
Proposed MONN
MONN-SWP (Setting A) | 0.9104 ± 0.0067 (p < 10^−13.76) ¹ | 0.9145 ± 0.015 (p < 10^−7.13) ¹ | 0.9127 ± 0.0174 (p < 10^−8.3) ¹ | 0.9133 ± 0.0066 (p < 10^−13.76) ¹
MONN-SWP-ACP-CCPIP (Setting C) | 0.96593 ± 0.0158 (p < 10^−12.65) ² | 0.96825 ± 0.0171 (p < 10^−11.24) ² | 0.97165 ± 0.0236 (p < 10^−10.45) ² | 0.96994 ± 0.0144 (p < 10^−13.39) ²
(b) Switching state prediction (mean ± std) for SW2
Algorithms | Accuracy | Precision | Recall | F1-score
Other classifiers (from the scikit-learn library [27])
SVM | 0.862 ± 0.0012 | 0.8619 ± 0.0034 | 0.8637 ± 0.0032 | 0.8628 ± 0.0015
Naïve Bayes | 0.7273 ± 0.0045 | 0.7264 ± 0.0059 | 0.733 ± 0.0048 | 0.7297 ± 0.0045
Random Forest | 0.831 ± 0.0209 | 0.7647 ± 0.0085 | 0.8377 ± 0.0207 | 0.7994 ± 0.0116
XGBoost | 0.8198 ± 0.0172 | 0.7382 ± 0.0204 | 0.8093 ± 0.0227 | 0.7719 ± 0.016
Proposed MONN
MONN-SWP (Setting A) | 0.9163 ± 0.0089 (p < 10^−11.12) ¹ | 0.9112 ± 0.0214 (p < 10^−4.41) ¹ | 0.9243 ± 0.0186 (p < 10^−6.59) ¹ | 0.9173 ± 0.0081 (p < 10^−11.78) ¹
MONN-SWP-ACP-CCPIP (Setting C) | 0.95528 ± 0.0097 (p < 10^−31) ² | 0.96635 ± 0.0182 (p < 10^−10.6) ² | 0.95105 ± 0.0155 (p < 10^−10.44) ² | 0.95864 ± 0.0119 (p < 10^−13.194) ²
(c) Switching state prediction (mean ± std) for SW3
Algorithms | Accuracy | Precision | Recall | F1-score
Other classifiers (from the scikit-learn library [27])
SVM | 0.8441 ± 0.0033 | 0.8103 ± 0.0048 | 0.9092 ± 0.0022 | 0.8569 ± 0.0031
Naïve Bayes | 0.7307 ± 0.0023 | 0.7365 ± 0.0043 | 0.7402 ± 0.0032 | 0.7383 ± 0.0024
Random Forest | 0.8315 ± 0.0211 | 0.7715 ± 0.0153 | 0.872 ± 0.0236 | 0.8184 ± 0.0139
XGBoost | 0.8203 ± 0.0209 | 0.7356 ± 0.0282 | 0.8527 ± 0.0207 | 0.7895 ± 0.0199
Proposed MONN
MONN-SWP (Setting A) | 0.9118 ± 0.0033 (p < 10^−31) ¹ | 0.9275 ± 0.0144 (p < 10^−12.98) ¹ | 0.8988 ± 0.0159 (p > 0.05) ¹ | 0.9127 ± 0.0034 (p < 10^−31) ¹
MONN-SWP-ACP-CCPIP (Setting C) | 0.96039 ± 0.0142 (p < 10^−13.19) ² | 0.95865 ± 0.0249 (p < 10^−10.8) ² | 0.97455 ± 0.0177 (p < 10^−7.4) ² | 0.96654 ± 0.0154 (p < 10^−12.1) ²
(d) Switching state prediction (mean ± std) for SW4
Algorithms | Accuracy | Precision | Recall | F1-score
Other classifiers (from the scikit-learn library [27])
SVM | 0.8539 ± 0.0035 | 0.841 ± 0.006 | 0.8592 ± 0.0043 | 0.85 ± 0.0042
Naïve Bayes | 0.7139 ± 0.0046 | 0.6981 ± 0.0064 | 0.7154 ± 0.0056 | 0.7066 ± 0.005
Random Forest | 0.8321 ± 0.0176 | 0.8372 ± 0.0162 | 0.6693 ± 0.0288 | 0.7436 ± 0.0225
XGBoost | 0.8086 ± 0.0129 | 0.8225 ± 0.0144 | 0.6101 ± 0.0292 | 0.7001 ± 0.0191
Proposed MONN
MONN-SWP (Setting A) | 0.8957 ± 0.0068 (p < 10^−10.37) ¹ | 0.8806 ± 0.0241 (p < 10^−10.37) ¹ | 0.908 ± 0.0246 (p < 10^−3.55) ¹ | 0.8934 ± 0.0064 (p < 10^−10.64) ¹
MONN-SWP-ACP-CCPIP (Setting C) | 0.95087 ± 0.0230 (p < 10^−8.4) ² | 0.95015 ± 0.0272 (p < 10^−8.4) ² | 0.95645 ± 0.0151 (p < 10^−11) ² | 0.95328 ± 0.0153 (p < 10^−11.67) ²
(e) Activation count prediction (mean ± std)
Method | Accuracy | Precision | Recall | F1-score
NNMPC-MMC [13] | 0.9876 ± 0.007 | 0.9914 ± 0.004 | 0.9876 ± 0.007 | 0.9895 ± 0.005
Proposed MONN-ACP (Setting B) | 0.9976 ± 0.007 (p > 0.05) ³ | 0.9984 ± 0.002 (p > 0.05) ³ | 0.9976 ± 0.002 (p > 0.05) ³ | 0.9980 ± 0.002 (p > 0.05) ³
¹ A one-sided t-test is computed to test whether the proposed algorithm (Setting A) is significantly better than the best of SVM, Naïve Bayes, Random Forest, and XGBoost; a Bonferroni adjustment multiplies the p-value by 36 (number of algorithms × number of SMs). ² A one-sided t-test is computed between Setting C (proposed) and the best of SVM, Naïve Bayes, Random Forest, and XGBoost. ³ The p-value cutoff is 0.05, i.e., 10^−1.3.
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

Guo, S.; Wu, H.C.; Chan, S.C.; Zhu, J. A Multi-Output Neural Network-Based Hybrid Control Strategy for MMC-HVDC Systems. Electronics 2025, 14, 4803. https://doi.org/10.3390/electronics14244803