Optimization of a Fuzzy Automatic Voltage Controller Using Real-Time Recurrent Learning

Abstract: The automatic voltage regulator is an important component of power generation systems; therefore, tuning it is fundamental for suitable energy conversion. This article presents the optimization of a fuzzy automatic voltage controller for a generation system using real-time recurrent learning, a technique conventionally used for training recurrent neural networks. The controller is a compact fuzzy system based on Boolean relations, designed to have equivalences with PI, PD, PID, and second-order controllers. To implement the algorithm, the training equations are deduced considering the structure of the second-order compact fuzzy controller. The results show that a closed-loop fuzzy control strategy was successfully implemented using real-time recurrent learning. To optimize the controllers, different weighting values for the error and the control action are used. The results show the behavior of the configurations used and their performance in terms of steady-state error, overshoot, and settling time.


Introduction
The power grid is a non-linear complex system consisting of various interconnected systems or control areas. In general, the distribution network is employed to control various nodes in the supply network with power supply stabilizers, power systems, and Automatic Voltage Regulators (AVRs) [1,2].
It is also noteworthy that the rapid expansion of distributed energy resources along the low-voltage network causes violations of the voltage limits, especially in rural zones with a low electrical load. Therefore, devices for line voltage regulation arise as a low-cost approach to solving this issue for the grid [3].
Following [4], excitation systems for synchronous generators contribute to the effective control of voltage and the stability of the energy system. Even if such an excitation system is adjusted to allow the proper operation of the electric energy scheme over a wide range of operating conditions, it may be necessary to tune it again to improve the system's stability under operating conditions that were not originally considered.
In a generator, the excitation system helps maintain the voltage and control the power flow by means of an automatic voltage regulator. The job of an AVR consists of maintaining the terminal voltage magnitude of a synchronous generator at a specified level; hence, the stability of the AVR may affect the security of the energy system [5]. Furthermore, an AVR can be found in a Direct Current (DC) microgrid system, where it regulates the Alternating Current (AC) power produced by the generators before it is rectified and supplied to the DC grid [6].

Automatic Voltage Regulator Design
Regarding related work on voltage controller design, AVR voltage responses and frequencies for different gains within a specified limit for the output voltage are studied in [7]. That work proposes a Proportional Integral (PI) controller that achieves a response with minimal overshoot and a small settling time.
Reference [2] presents a study of a predictive Proportional Integral Derivative (PID) automatic voltage regulator used in a multi-machine power system. That work compares the proposal with a traditional PID controller, observing that the proposed method may provide better attenuation of the system's oscillations in the power grid under small and large disturbances.
Meanwhile, an AVR based on series voltage compensation with an AC chopper is proposed in [8]. It consists of an AC chopper with Pulse Width Modulation (PWM) and a transformer for the series voltage compensation. The AC chopper provides direct AC-AC energy conversion with no energy storage elements; thus, the size and cost of the AVR are reduced. The proposed AVR can compensate for voltage drops and voltage increases at the power input.

Automatic Voltage Regulator Optimization
Given the relevance of AVR systems in power generation, the literature reports different works on the optimization of the controllers employed with AVRs. In [5], an evolutionary computation approach is presented to establish the optimal gains of a PID controller using evolutionary algorithms. That work shows the relevance of properly adjusting the controller.
Regarding other bio-inspired algorithms, the optimization of a Fractional-Order Proportional-Integral-Derivative (FOPID) controller is performed in [9]; such a controller includes two additional parameters compared to a conventional PID, so its adjustment process is more complex. Therefore, the Salp Swarm Optimization Algorithm (SSA) is used to select the optimized parameters of the FOPID controller to achieve the optimal dynamic response and stability; in addition, a stability analysis is shown using pole-zero stability criteria and Bode diagrams.
A method for the optimal adjustment of a FOPID controller for an automatic voltage regulator is presented in [10]. The method is based on the Yellow Saddle Goatfish Algorithm (YSGA). The performance of the obtained FOPID controller is verified by comparison with various FOPID controllers adjusted by other metaheuristic algorithms.
In addition, a technique to determine the optimal gains of a PID controller for an AVR using the Cuckoo Search (CS) evolutionary algorithm is presented in [11]. The dynamic performance of the proposed controller is evaluated considering transient response characteristics such as the rise time, settling time, overshoot, and steady-state error.
Finally, a hybrid metaheuristic method for the optimal adjustment of four different types of PID controllers for an automatic voltage regulator system is presented in [12]. The method is based on the Simulated Annealing-Manta Ray Foraging Optimization (SA-MRFO) algorithm. The performance of the ideal PID, real PID, fractional-order PID, and PID obtained from second-order derived controllers is verified through comparison with controllers tuned by different algorithms presented in the literature.

Real-Time Recurrent Learning
The Real-Time Recurrent Learning (RTRL) algorithm is usually employed for the identification and control of dynamic systems using recurrent neural networks [13,14]. The algorithm applies the chain rule within the gradient descent method to adjust the parameters of a recurrent neural network. Although it is mostly used for training neural networks, it can also be applied to other optimization problems. The algorithm calculates the derivatives of the states and outputs with respect to all weights as the network processes the sequence (during the forward step), so unfolding the neural network is not necessary [13]. The derivative of the states with respect to the weights at time $n$ is calculated from the states and the derivatives at time $n-1$ and the input at time $n$. In this way, a set of recurrence equations is employed for training the neural network.
In this paper, the RTRL algorithm is used for the optimization of a fuzzy controller in closed loop; however, to present the RTRL theory, the classical application to a Recurrent Network (RN) is considered. An example of a fully connected RN is shown in Figure 1, where $z^{-1}$ represents a memory element (delay). Following [15], for a time step $n$, the parameters of a fully connected recurrent network are:
• $x_k[n]$: signal applied to input node $k$.
Considering the neural connections, the output of a processing node $k$ at time step $n+1$ is $y_k[n+1] = f_k(s_k[n])$, which is calculated using Equation (1).
In Equation (1), $f_k$ is the activation function. On the other hand, taking $T[n]$ as the set of indexes belonging to $U$ for which a target value exists for a processing node $k$ at time step $n$, the total instantaneous squared error at time step $n$ is given by Equation (2).
In Equation (2), the error signal for each node is $e_k[n] = d_k[n] - y_k[n]$. Using the gradient descent method, where $\eta$ is the learning rate, the weights can be updated using Equation (3).
In this way, it is observed that: Calculating the derivative of $y_k[n+1]$ (given in Equation (1)) with respect to $w_{ij}$, it is obtained: In Equation (5), $\delta_{ki}$ is the Kronecker delta and $z_j[n]$ is defined as presented in Equation (6).
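The recurrence described above can be illustrated with a minimal Python sketch. It assumes the standard RTRL form for a fully connected network: the sensitivity of each node output to each weight is propagated forward as the activation derivative times the sum of the recurrent contribution and the Kronecker-delta term. The tanh activation, the weight layout (node outputs first, then external inputs), the factor 1/2 in the squared error, and the names `rtrl_run` and `weight_update` are assumptions made for illustration and are not taken from the article.

```python
import math


def rtrl_run(w, xs, n_nodes):
    """Process the input sequence xs with fixed weights w and return (y, p),
    where p[k][i][j] is the sensitivity dy_k/dw_ij propagated forward in time."""
    n_z = n_nodes + len(xs[0])
    y = [0.0] * n_nodes
    p = [[[0.0] * n_z for _ in range(n_nodes)] for _ in range(n_nodes)]
    for x in xs:
        z = y + list(x)                      # z_j[n]: node outputs, then inputs
        s = [sum(w[k][j] * z[j] for j in range(n_z)) for k in range(n_nodes)]
        y_next = [math.tanh(sk) for sk in s]
        p_next = [[[0.0] * n_z for _ in range(n_nodes)] for _ in range(n_nodes)]
        for k in range(n_nodes):
            fprime = 1.0 - y_next[k] ** 2    # derivative of tanh at s_k[n]
            for i in range(n_nodes):
                for j in range(n_z):
                    rec = sum(w[k][l] * p[l][i][j] for l in range(n_nodes))
                    kron = z[j] if k == i else 0.0   # Kronecker delta term
                    p_next[k][i][j] = fprime * (rec + kron)
        y, p = y_next, p_next
    return y, p


def weight_update(w, p, errors, eta):
    """Gradient step assuming J = 0.5 * sum_k e_k^2 with e_k = d_k - y_k,
    so that dw_ij = eta * sum_k e_k * dy_k/dw_ij."""
    n_nodes = len(errors)
    return [[w[i][j] + eta * sum(errors[k] * p[k][i][j] for k in range(n_nodes))
             for j in range(len(w[i]))] for i in range(len(w))]


# Hypothetical example: 2 processing nodes and 1 external input.
w0 = [[0.1, -0.2, 0.3], [0.4, 0.05, -0.1]]
xs0 = [[1.0], [0.5], [-0.3]]
y_out, sens = rtrl_run(w0, xs0, n_nodes=2)
targets = [0.2, -0.1]
w1 = weight_update(w0, sens, [d - yk for d, yk in zip(targets, y_out)], eta=0.05)
```

Because the sensitivities are carried forward together with the states, no unfolding of the network in time is needed, which is the property the closed-loop controller training relies on.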

Proposal Approach and Document Organization
This article presents the optimization of a fuzzy automatic voltage controller for a generation system employing the real-time recurrent learning algorithm. The controller's optimization is performed in the discrete-time domain. The architecture of this controller consists of a compact fuzzy structure based on Boolean relations. The main purpose of the document is to deduce the equations of the algorithm for the controller's training, as well as to apply it to the AVR system using weighting values for the error and the control action to obtain different behaviors. In this way, for generation systems, real-time recurrent learning is a suitable option for training neuro-fuzzy controllers, which is the central aspect of this article.
The main contributions of this work are described as follows.
• Considering a closed-loop control system, the deduction of the training equations employed to implement the training algorithm using the AVR discrete-time model is presented. Furthermore, the algorithm steps used for controller optimization are presented.
• Fuzzy controller schemes are proposed by making analogies with linear controllers, and the result of the training process is shown.
The document is organized as follows. Section 2 details the model of an automatic voltage regulator. Section 3 displays the architecture of the fuzzy controller (compact fuzzy system based on Boolean relations). Section 4 presents the deduction of the training equations and the algorithm used for parameter adaptation. Next, Section 5 presents the implementation of compact fuzzy controllers: Second Order (SO), Proportional Integral (PI), Proportional Derivative (PD), and Proportional Integral Derivative (PID). Section 6 describes the experimental results; finally, Section 7 presents the conclusions.

Automatic Voltage Regulator
In power systems, the automatic voltage regulator is crucial for achieving energy exchange and regulating the energy delivered to consumers. The AVR allows the power system to supply reliable, high-quality energy, aiming to reduce the steady-state error to zero in an interconnected power system [7]. In the design of an AVR, the main requirements are a quick response, low overshoot, and zero steady-state error for deviations from the reference voltage [2], all in order to maintain the reliability of power system operations. Generally, a real power station has more than one generator connected to a busbar, and each one has an AVR [16,17].
Figure 2 shows the schematic diagram of an AVR system. A simple AVR includes four main components: an amplifier, an exciter, a generator, and a sensor [18,19]. In this arrangement, the controller is what gives the AVR a suitable behavior.
As mentioned, the purpose of designing an AVR is to keep the output voltage at a certain level in a generation system. An electric power grid includes more than one generator connected to the same busbar, and each one has an AVR. Since the purpose of this system is to control the voltage of the power grid to which the generator is connected through a power transformer, the voltage level is constantly measured as a feedback signal using a voltage sensor that may consist of a transformer. After being rectified and filtered, this signal is compared to the reference voltage in the comparator to obtain a voltage error signal. The error signal passes through the controller and is then amplified to feed the exciter, which adjusts the voltage/current of the generator's field winding so that any voltage deviation is compensated [20]. According to [5], to obtain the mathematical model and transfer function of the AVR, the four main components are linearized taking into account their time constants. Following [2,5,21], an AVR can have a representation like the one in Figure 3.

AVR Discrete Time Model
Since the fuzzy controller optimization takes place in discrete time, the respective AVR model is obtained in this domain. For this, the transfer function of the AVR corresponds to: Transforming this transfer function into a discrete-time representation, considering a sampling time of $T_s = 0.05$ s and using the bilinear transformation method, it is obtained that: where $K = 1 \times 10^{-3}$. In general, this transfer function can be written as: Using negative exponents:
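As an illustration of the bilinear (Tustin) transformation mentioned in this section, the following Python sketch substitutes $s = (2/T_s)(z-1)/(z+1)$ into a rational transfer function and returns the coefficients in powers of $z^{-1}$. It is only a sketch: the first-order lag used in the example (gain 10, time constant 0.1 s) is a hypothetical block and does not reproduce the AVR transfer function or the coefficients of the article; only the sampling time $T_s = 0.05$ s is taken from the text. The helper names are likewise illustrative.

```python
def poly_mul(p, q):
    """Multiply two polynomials given as ascending-power coefficient lists."""
    out = [0.0] * (len(p) + len(q) - 1)
    for i, pi in enumerate(p):
        for j, qj in enumerate(q):
            out[i + j] += pi * qj
    return out


def poly_pow(p, n):
    """Raise a polynomial to a non-negative integer power."""
    out = [1.0]
    for _ in range(n):
        out = poly_mul(out, p)
    return out


def bilinear(num_s, den_s, ts):
    """Discretize H(s) = num_s / den_s (coefficients in ascending powers of s,
    with len(num_s) <= len(den_s)) using s = (2/ts) * (z - 1) / (z + 1).
    Returns (b, a) in ascending powers of z^-1, normalized so that a[0] = 1."""
    order = len(den_s) - 1
    c = 2.0 / ts

    def substitute(coeffs):
        total = [0.0] * (order + 1)
        for k, ck in enumerate(coeffs):
            # s^k becomes c^k (z - 1)^k (z + 1)^(order - k) after clearing (z + 1)^order
            term = poly_mul(poly_pow([-1.0, 1.0], k), poly_pow([1.0, 1.0], order - k))
            for m, tm in enumerate(term):
                total[m] += ck * c ** k * tm
        return total

    num_z = substitute(num_s)
    den_z = substitute(den_s)
    lead = den_z[-1]                       # coefficient of z^order
    b = [v / lead for v in reversed(num_z)]
    a = [v / lead for v in reversed(den_z)]
    return b, a


# Hypothetical first-order block 10 / (1 + 0.1 s), sampled at Ts = 0.05 s.
b_lag, a_lag = bilinear([10.0], [1.0, 0.1], ts=0.05)
```

The resulting difference equation is $\sum_m a_m\, y[n-m] = \sum_m b_m\, u[n-m]$. A useful sanity check is that the bilinear transformation preserves the DC gain, since $z = 1$ maps to $s = 0$.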

Fuzzy Controller
According to [22], fuzzy logic has wide applicability in control systems, due to its flexibility for the implementation of control strategies, using knowledge of the system and linguistically proposing control strategies.
The architecture of the fuzzy controller is proposed considering its equivalence with a discrete-time linear controller; in this way, the controller can be optimized using Real-Time Recurrent Learning.
The controller architecture uses the structure of fuzzy logic systems based on Boolean relations. In this regard, reference [23] presents the design methodology, starting from a controller that uses Boolean sets, which then become fuzzy. For the implementation of the fuzzy controller, two possible regions corresponding to negative and positive values in each universe of discourse are considered; therefore, membership functions of the sigmoidal type are used.
The fuzzy controller uses the compact structure presented in [24], which allows an equivalence with a linear controller (in discrete time) by using fuzzy sets to model the actions of the direct inputs and the controller feedback. The general scheme of the control system can be seen in Figure 4. To carry out the design of the fuzzy controller, a second-order linear controller (in discrete time) is considered, where the transfer function is:

The difference equation (in discrete time) of this controller corresponds to: where the respective coefficients $a_i$, $b_j$ are constant. For the fuzzy controller, these constants are replaced by non-linear relations given by fuzzy membership functions, such that: In general terms: For the fuzzy control system, the membership functions shown in Figure 5 are used, where fuzzy sets of sigmoidal type model the negative and positive values of the universe of discourse. In this way, the membership function $\mu_{ij}(x_i)$ is given by Equation (17).
Considering the fuzzy sets of Figure 5 and the general structure of the controller given by Equation (15), the scheme of Figure 6 is obtained, where the fuzzy controller used is shown.
Considering that: In this way, Equation (19) is obtained, where the set of controller parameters is $h_{ij} \in \{v_{ij}, \sigma_{ij}, \gamma_{ij}\}$, with $v_{ij}$ a scalar value and $\sigma_{ij}$, $\gamma_{ij}$ the parameters of the membership function $\mu_{ij}(x_i)$.
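A minimal Python sketch of how a compact fuzzy controller of this kind could be evaluated is given next. It rests on assumptions that the text does not confirm explicitly: the sigmoidal membership function is taken as $\mu_{ij}(x_i) = 1/(1 + e^{-\sigma_{ij}(x_i - \gamma_{ij})})$, each input contributes $f_i(x_i) = \sum_j v_{ij}\,\mu_{ij}(x_i)$ with $j = 1, 2$, and the five inputs are ordered as $e[n], e[n-1], e[n-2], u[n-1], u[n-2]$. All function names and numeric values are hypothetical.

```python
import math


def sigmoid_membership(x, sigma, gamma):
    """Sigmoidal membership function (assumed form); the sign of sigma selects
    whether the set models the positive or the negative region."""
    return 1.0 / (1.0 + math.exp(-sigma * (x - gamma)))


def fuzzy_term(x, sets):
    """Contribution f_i(x_i) of one input; sets is a list of (v, sigma, gamma)."""
    return sum(v * sigmoid_membership(x, s, g) for v, s, g in sets)


def controller_output(x, h):
    """Control action as the sum of the contributions of all inputs.
    x: input values, h: one list of fuzzy sets per input."""
    return sum(fuzzy_term(xi, hi) for xi, hi in zip(x, h))


# Hypothetical initial configuration: one negative and one positive set per input.
sets0 = [(-1.0, -4.0, 0.0), (1.0, 4.0, 0.0)]
h0 = [sets0 for _ in range(5)]
# Assumed input ordering: e[n], e[n-1], e[n-2], u[n-1], u[n-2]
u_now = controller_output((0.3, 0.1, 0.0, -0.2, 0.0), h0)
```

With this symmetric choice of sets, each term reduces to $\tanh(\sigma x / 2)$, that is, a saturated version of a linear gain, which shows in what sense a constant coefficient of the linear controller is replaced by a non-linear relation.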

Fuzzy Controller Training Process
For a practical approach, from the scheme of Figure 3 the respective equivalent system in discrete time shown in Figure 4 is obtained. In this model, the sensor is included as part of the plant, so the sensor output is the variable to control.
To adapt the fuzzy controller parameters $h_{ij}$, the learning rate $\eta$ is used; if it is chosen too large, the algorithm may not converge, while a very small value may cause the algorithm to require more time to converge [25]. Following [25], the training method is based on the direction of maximum descent given by the gradient; therefore, the adaptation (optimization) of $h_{ij}$ is carried out in the following way: where $J$ corresponds to the adjustment function defined using the error signal $e[n] = r[n] - y[n]$, then: In Equation (21), the factor $P$ weights the error signal and $Q$ the control action, so that combinations of these values yield different behaviors as desired.
The derivative of $J$ with respect to the adjustment parameters is:
On the other hand, from the transfer function (12), the corresponding recurrence equation for the plant is:
Considering the controller architecture (Figure 6) and Equation (19) for $f_i$, the dynamics of the controller are given by:
To implement the training Equation (20), the derivative of the plant output $y[n]$ with respect to a controller parameter $h_{ij}$ is:
In the same way, the derivative of the control action $u[n]$ with respect to $h_{ij}$ is:
Meanwhile, the derivative of the error $e[n]$ with respect to a controller parameter $h_{ij}$ is:
To establish the respective derivatives, using Equation (19) it is obtained: where $l = 1, \ldots, 5$, $i = 1, \ldots, 5$, and $j = 1, 2$; therefore, the following cases are presented:
Case 1: When $l \neq i$ and $j$ can be equal to any value $m = 1, 2$, then: The respective derivatives are:
Case 2: When $l = i$ and $j = 1$, it is established that:
Case 3: When $l = i$ and $j = 2$, it is obtained:
To calculate the respective derivatives of Cases 2 and 3, it is first obtained: For the other derivatives: In the case of the parameter $h_{ij} = v_{ij}$, it is determined: Taking the parameter $h_{ij} = \sigma_{ij}$, it is established that: With the parameter $h_{ij} = \gamma_{ij}$, the respective derivative is: For Equations (37)-(39), it is considered that: Furthermore: Considering the above, Equations (37)-(39) can be written as: Taking in a general way a controller parameter $h_{ij}$: In this way, Equation (33), corresponding to the case when $l = i$ and $j = 1$, can be written as: Meanwhile, for Equation (34), when $l = i$ and $j = 2$, it is obtained:
Using the previous equations, the update of the controller parameters is carried out using the general Equation (20), having for each parameter the following equations:
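The partial derivatives of a single fuzzy term with respect to its three parameters can be sketched in Python as follows. The sketch assumes the sigmoidal membership function $\mu(x) = 1/(1 + e^{-\sigma(x - \gamma)})$ and a term of the form $f = v\,\mu(x)$, and it uses the identity that the derivative of the sigmoid with respect to its argument is $\mu(1 - \mu)$. The function names are hypothetical, and the expressions are not claimed to match Equations (37)-(39) term by term.

```python
import math


def mu(x, sigma, gamma):
    """Sigmoidal membership function (assumed form)."""
    return 1.0 / (1.0 + math.exp(-sigma * (x - gamma)))


def term_gradients(x, v, sigma, gamma):
    """Partial derivatives of f = v * mu(x) with respect to v, sigma, gamma, and x."""
    m = mu(x, sigma, gamma)
    dm = m * (1.0 - m)                  # derivative of the sigmoid w.r.t. its argument
    return {
        "v": m,                         # df/dv
        "sigma": v * dm * (x - gamma),  # df/dsigma
        "gamma": -v * dm * sigma,       # df/dgamma
        "x": v * dm * sigma,            # df/dx, used in the chain rule through the loop
    }


# Hypothetical operating point
grads = term_gradients(x=0.4, v=1.5, sigma=3.0, gamma=0.1)
```

The derivative with respect to $x$ is what links the explicit parameter derivatives to the recurrence: when the input of a term is itself a past error or a past control action, its sensitivity to $h_{ij}$ is multiplied by this factor.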

Training Algorithm
Algorithm 1 shows the steps used for controller training. In the first step, the initial parameter settings for the controller are chosen. In the next step, the control system output is calculated using the plant model. Then, the fuzzy controller parameters are adjusted using the equations that involve the dynamics of the control system and the derivatives (Equations (25)-(27) and subsequent equations). It is important to note that the adjusted parameters are stored in an auxiliary variable, since during this step the controller does not use these values. The procedure then returns to the step where the control system is evaluated, with $n = n + 1$, until the simulation time $N_T$ is completed. When the simulation time is completed, the controller parameters are updated with the optimized values, and the procedure returns to the step where the control system response is calculated for a new iteration $k = k + 1$, until the objective function $J_T(k)$ given by Equation (53) is less than a defined $\varepsilon$ or until $k$ is equal to a defined number $K_T$.
It should be noted that this algorithm uses the AVR discrete-time model for controller training; therefore, no additional AVR data are necessary. The controller and the plant model are integrated in a closed control loop, so the control system input is the desired reference value $r[n]$.
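The training loop described in this section can be sketched in Python under several simplifying assumptions, all of them hypothetical: a first-order plant $y[n] = a\,y[n-1] + b\,u[n-1]$ stands in for the AVR discrete-time model, the controller uses only three inputs ($e[n]$, $e[n-1]$, $u[n-1]$) with two sigmoidal sets each, and the cost is taken as $J_T = \sum_n \tfrac{1}{2}(P\,e^2[n] + Q\,u^2[n])$. The parameters stay fixed during each closed-loop run, and the accumulated correction is applied to a copy (the auxiliary variable) at the end of the run.

```python
import math


def mu(x, sigma, gamma):
    """Sigmoidal membership function (assumed form)."""
    return 1.0 / (1.0 + math.exp(-sigma * (x - gamma)))


def simulate(h, a=0.9, b=0.1, r=1.0, n_steps=60, P=1.0, Q=0.1):
    """Closed-loop run with fixed parameters h[i][j] = [v, sigma, gamma].
    Returns the total cost and its gradient, built from forward sensitivities."""
    idx = [(i, j, t) for i in range(3) for j in range(2) for t in range(3)]

    def zeros():
        return {q: 0.0 for q in idx}

    y = e_prev = u_prev = 0.0
    dy, de_prev, du_prev = zeros(), zeros(), zeros()
    cost, grad = 0.0, zeros()
    for _ in range(n_steps):
        dy = {q: a * dy[q] + b * du_prev[q] for q in idx}  # plant sensitivity
        y = a * y + b * u_prev                             # hypothetical plant model
        e = r - y
        de = {q: -dy[q] for q in idx}
        x, dx = (e, e_prev, u_prev), (de, de_prev, du_prev)
        u, du = 0.0, zeros()
        for i in range(3):
            for j in range(2):
                v, s, g = h[i][j]
                m = mu(x[i], s, g)
                slope = v * m * (1.0 - m)
                u += v * m
                for q in idx:                         # chain rule through input x_i
                    du[q] += slope * s * dx[i][q]
                du[(i, j, 0)] += m                    # explicit derivative w.r.t. v
                du[(i, j, 1)] += slope * (x[i] - g)   # explicit derivative w.r.t. sigma
                du[(i, j, 2)] -= slope * s            # explicit derivative w.r.t. gamma
        cost += 0.5 * (P * e * e + Q * u * u)
        for q in idx:
            grad[q] += P * e * de[q] + Q * u * du[q]
        e_prev, u_prev, de_prev, du_prev = e, u, de, du
    return cost, grad


def train(h, eta=2e-5, n_iter=5, tol=1e-6, **kwargs):
    """Repeat closed-loop runs; corrections go to an auxiliary copy that
    replaces the controller parameters only after each run is complete."""
    costs = []
    for _ in range(n_iter):
        cost, grad = simulate(h, **kwargs)
        costs.append(cost)
        if cost < tol:
            break
        aux = [[list(fs) for fs in term] for term in h]
        for (i, j, t), gval in grad.items():
            aux[i][j][t] -= eta * gval
        h = aux
    return h, costs


# Hypothetical initial configuration: negative and positive set for each input.
h_init = [[[-1.0, -4.0, 0.0], [1.0, 4.0, 0.0]] for _ in range(3)]
h_trained, cost_history = train(h_init)
```

Accumulating the gradient over the run and applying it once is equivalent to updating the auxiliary variable at every step while the controller keeps using the old values. The learning rate of $2 \times 10^{-5}$ matches the value reported in the experiments, but the remaining numbers are placeholders.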

Implementation of Compact Fuzzy Controllers
The training method is based on the direction of maximum descent given by the gradient, so its convergence to the optimum point depends on the initial values of the parameters. Considering that the membership functions used represent the actions for positive and negative values, their initial configuration is taken from Figure 5. With this approach, fuzzy systems make it possible to establish an initial configuration that can be optimized later.
For the design of compact fuzzy controllers, linear control strategies are considered as a reference: Following [24], the compact fuzzy controllers are implemented considering Equation (24) associated with Figure 6, having the following expression:

Second Order Controller Structure
The discrete linear controller considered consists of a second-order system with transfer function: The respective difference equation is: Then, for the fuzzy controller, the following structure is considered:

PI Equivalent Structure
In this case, the controller has the form: considering the compact structure it can be expressed as: In this way, the difference equation of the PI controller corresponds to: Then for the PI fuzzy controller the structure is:

PD Equivalent Structure
The PD controller can be represented as: then the expression for this is: The corresponding difference equation is: Then the structure for the PD fuzzy controller is:

PID Equivalent Structure
In the case of the PID controller, it can be described as: Performing the respective operations, it is obtained: In general, it can be written as: The respective difference equation for this controller is: Then, for the PID fuzzy controller, the structure is:

Configurations
Considering the general expression of the compact fuzzy controller given by Equation (54), the equivalences with PI, PD, PID, and second-order linear controllers are obtained as shown in Table 1, according to the respective discrete-time Equations (57), (61), (65), and (70).
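To illustrate how linear PI, PD, and PID controllers lead to difference equations over $e[n]$, $e[n-1]$, $e[n-2]$, and $u[n-1]$, the following Python sketch computes the coefficients of the incremental (velocity) form $u[n] = u[n-1] + b_0 e[n] + b_1 e[n-1] + b_2 e[n-2]$. A backward-Euler discretization of the integral and derivative actions is assumed here; the article may use a different discretization, so these coefficients are not claimed to reproduce Equations (57), (61), (65), or (70). The gains and function names are hypothetical.

```python
def pid_difference_coeffs(kp, ki, kd, ts):
    """Coefficients (b0, b1, b2) of the incremental PID form under backward Euler:
    u[n] = u[n-1] + b0*e[n] + b1*e[n-1] + b2*e[n-2]."""
    b0 = kp + ki * ts + kd / ts
    b1 = -kp - 2.0 * kd / ts
    b2 = kd / ts
    return b0, b1, b2


def run_difference_equation(coeffs, errors):
    """Apply the difference equation to an error sequence, zero initial conditions."""
    b0, b1, b2 = coeffs
    u_prev = e1 = e2 = 0.0
    out = []
    for e in errors:
        u = u_prev + b0 * e + b1 * e1 + b2 * e2
        out.append(u)
        u_prev, e2, e1 = u, e1, e
    return out


# Hypothetical gains; setting kd = 0 gives a PI controller and ki = 0 a PD controller.
pi_coeffs = pid_difference_coeffs(kp=2.0, ki=1.0, kd=0.0, ts=0.05)
pi_response = run_difference_equation(pi_coeffs, [1.0] * 5)
```

For the PI case the coefficient $b_2$ vanishes, so only $e[n]$, $e[n-1]$, and $u[n-1]$ are needed, which is consistent with using a subset of the inputs of the second-order compact structure for the simpler controllers.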

Experimental Results
To carry out the experimental tests, different combinations of $P$ and $Q$ are possible; therefore, one of them is kept fixed, in this case $P = 1$, and $Q$ is varied on a logarithmic scale so that the effect of these combinations can be observed in the results. Furthermore, the learning rate is set to $2 \times 10^{-5}$. The values considered of $P$ and $Q$ are the following: The performance index used corresponds to the one shown in Equation (71), where $r$ is the desired output value, $y$ the response of the control system, $u$ the control signal, $P$ and $Q$ the weighting values of the error and control action, and $N_T$ the total number of data points (total simulation steps).
The process carried out to obtain the experimental results considers the different values of $Q$ and the controllers; thus, the steps are:
3. Perform the controller training using Algorithm 1.
4. Calculate the respective performance index given by Equation (71).
6. Return to step 2 until completing all controllers.
7. Return to step 1 until completing all values of $Q$.
In this way, the results for the values of $Q$ using different compact fuzzy controllers are obtained. It should be noted that, since the structure and the initial configuration of the controllers given by each $\mu_{ij}(x_i)$ are established, only one run of the algorithm is required because the initial configuration is not random. In addition, the reference value is $r[n] = 1$, since the controller seeks to regulate the normalized output voltage.
The optimization process is performed for the different cases of $P$ and $Q$ for each controller, obtaining the results of the objective function shown in Table 2. This table shows that increasing $Q$ also increases the value of the objective function. Considering case PQ3, Figure 10 displays the comparison of the SO, PI, PD, and PID controllers, presenting the output signal $y(t)$ and the control action $u(t)$. In these results, SO has the largest overshoot and the PI controller shows a suitable behavior considering the overshoot and settling time. Finally, Figure 11 shows the response detail of the optimized fuzzy controllers, where the steady-state error and the overshoot are better observed for the SO, PI, PD, and PID controllers.
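The quantities used to compare the controllers (overshoot, settling time, and steady-state error) can be computed from a sampled step response as in the following Python sketch. The definitions are common textbook ones and are assumptions here: overshoot as a percentage of the reference, settling time as the first instant after which the response stays within a 2% band around the reference, and steady-state error taken from the last sample. The sample data are hypothetical and are not results from the article.

```python
def step_metrics(y, ts, ref=1.0, band=0.02):
    """Return (overshoot in percent, settling time, steady-state error)
    for a sampled step response y with sampling time ts."""
    peak = max(y)
    overshoot = max(0.0, (peak - ref) / ref * 100.0)
    steady_state_error = ref - y[-1]
    settle_idx = 0
    for n, yn in enumerate(y):
        if abs(yn - ref) > band * abs(ref):
            settle_idx = n + 1      # response is still outside the band at sample n
    settling_time = settle_idx * ts
    return overshoot, settling_time, steady_state_error


# Hypothetical response sampled at Ts = 0.05 s
y_demo = [0.0, 0.5, 1.2, 1.05, 1.01, 1.0, 1.0]
overshoot, t_settle, e_ss = step_metrics(y_demo, ts=0.05)
```

If the response never enters the band, the returned settling time equals the duration of the record, which should be read as "not settled within the simulation".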

Conclusions
In this article, the process for the optimization of an automatic voltage controller using Real-Time Recurrent Learning was presented, including the deduction of the respective equations used for the implementation of this algorithm.
To carry out the experimental tests, different configurations of the $P$ and $Q$ values were considered; $P$ weights the error signal and $Q$ the control action, so that combinations of these values yield different behaviors.
The results show the response of the different fuzzy controllers considered, which were determined by analogy with linear controllers, that is, based on preliminary knowledge of the controllers; therefore, the controller topologies considered were SO, PI, PD, and PID.
The configurations display different dynamic behaviors according to the values of $P$ and $Q$. The results show that the control action decreases when the value of $Q$ increases.
The record of the objective function $J_T(k)$ in each of the cases considered shows that the optimization process is successful using the deduced equations.
This scheme can be used to tune the controller in real time with an adaptive scheme when there are variations in the plant. In addition, other fuzzy controller configurations could be used in future work.
Institutional Review Board Statement: No tests on individuals (humans) were carried out in this work.

Informed Consent Statement:
The study presented in this work does not involve human beings.

Data Availability Statement:
No external data were required for this work.