Feedforward Control of Piezoelectric Ceramic Actuators Based on PEA-RNN

The multilayer perceptron (MLP) has been demonstrated to implement feedforward control of the piezoelectric actuator (PEA). To further improve the control accuracy of the neural network, reduce the training time, and explore the possibility of online model updating, a novel recurrent neural network named PEA-RNN is established in this paper. PEA-RNN is a three-input, one-output neural network comprising one gated recurrent unit (GRU) layer, seven linear layers, and one residual connection between the linear layers. The experimental results show that the displacement linearity error of the piezoelectric ceramic reaches 8.96 μm under open-loop conditions. With PEA-RNN compensation, the maximum displacement error of the piezoelectric ceramic is reduced to 0.465 μm at an operating frequency of 10 Hz, which proves that PEA-RNN can accurately compensate the dynamic hysteresis nonlinearity of piezoelectric ceramics. At the same time, PEA-RNN requires only 5% of the training epochs of the MLP, and fewer training epochs open the possibility of realizing online model updates in the future.


Introduction
The Shanghai Synchrotron Radiation Facility (SSRF), which operates nearly 5000 h a year, has massive data resources, and this high availability of data brings enormous opportunities for applying artificial intelligence technology in the field of synchrotron radiation. In recent years, SSRF has explored intelligent beamlines based on the differential evolution algorithm and achieved initial success [1]. However, in the online test, the researchers found that, after the introduction of the second crystal pitch axis of the monochromator [2], its poor motion repeatability prevented the excellent individuals from being reliably inherited to the next generation, so the intelligent algorithm failed to converge. SSRF uses many PEAs to achieve micro displacements. However, due to the inherent hysteresis nonlinearity and creep properties, as well as defects in the control algorithm, the motion accuracy is not high enough, and the motion repeatability is poor. Studies have shown that, without considering the error caused by vibration, the uncertainty introduced by piezoelectric ceramic hysteresis is generally 15-20%, and the creep error is 1-5% [3].
For a long time, the hysteresis nonlinearity of piezoelectric ceramics and its compensation technology have been research hotspots. Early classical feedforward control technology used mathematical methods to establish a fitting model of the hysteresis loop and obtained the relationship between the excitation voltage and the actual output displacement by solving its inverse model. Physics-based models and phenomenological models [4] were the main classifications of hysteresis models in the past; the micro-electromechanical models [5] are a typical example.

The rest of this paper is organized as follows: In Section 2, the performance of the experimental platform is evaluated, the control strategy is preliminarily determined, and the related neural network structures are introduced. Next, the training and application of PEA-RNN are illustrated, and the effects of the MLP and the residual connection are verified in Section 3. Finally, the paper is concluded in Section 4.

In this experiment, the digital piezoelectric actuator (E-709.CRG) [16] and the piezo linear precision positioner (P-621.1CD) [17] from Physik Instrumente (PI) are used, as shown in Figure 2. The signal amplifier is integrated into the actuator, and the capacitive sensor is integrated into the positioner. After receiving a command, the actuator sends the voltage to the positioner through the signal amplifier, reads back the current position of the piezoelectric ceramic from the capacitive sensor, and communicates with the computer through USB, as shown in Figure 3.

Primary and Secondary Hysteresis Loops
The hysteresis nonlinearity of piezoelectric ceramics is mainly manifested in the input voltage versus output displacement curves: the voltage-rising curve does not coincide with the voltage-dropping curve. This characteristic is affected by many factors; it is related not only to the current input voltage but also to the input history and the input frequency [18].
The test data were obtained by taking the step signal, the damped sinusoidal signal, and the amplified sinusoidal signal with a working frequency of 10 Hz as the input. The primary hysteresis loop refers to the largest hysteresis loop formed by the positioner over its entire travel, while a secondary hysteresis loop refers to any smaller hysteresis loop.
As shown in Figure 4, the primary hysteresis loop was linearly fitted, and the linear relationship between the output displacement x and the input voltage V was obtained as
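As a sketch, such a linear fit of a measured loop can be reproduced with a least-squares fit. The voltage and displacement values below are synthetic placeholders, not the paper's measurements, and the resulting coefficients are purely illustrative.

```python
import numpy as np

# Hypothetical sketch: fit a line x = a*V + b to voltage/displacement
# pairs, as done for the primary hysteresis loop. The data are synthetic
# placeholders, not the paper's measurements.
V = np.array([0.0, 2.0, 4.0, 6.0, 8.0, 10.0])   # input voltage (V)
x = np.array([0.1, 2.1, 3.9, 6.2, 7.8, 10.1])   # output displacement (um)

a, b = np.polyfit(V, x, deg=1)                  # least-squares linear fit
residual = x - (a * V + b)
linearity_error = np.max(np.abs(residual))      # worst-case deviation from the line
```

The maximum residual of the fit plays the role of the displacement linearity error reported for the open-loop positioner.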

Hysteresis Loop Model
It can be seen from the primary and the secondary hysteresis loops that the current position of the positioner is strongly related to the current input voltage, the last position, and the last input voltage.
The state Φ_t of the piezoelectric ceramic at time t is uniquely represented by the input voltage and the output displacement at the current time:

Φ_t = (V_t, x_t),    (2)

where V_t is the input voltage at time t, and x_t is the displacement at time t. From the diagrams of the primary and the secondary hysteresis loops, it is not difficult to see the relationship between the current state and the historical states; there is an obvious time-series relationship among Φ_1, ..., Φ_t, which can be represented by the conditional probability model

P(Φ_t | Φ_{t−1}, ..., Φ_1).    (3)
As the operating time of the PEA increases, more and more piezoelectric ceramic states need to be recorded and calculated, which directly leads to an increasing control time and a continuously decreasing operating frequency. Therefore, the Markov assumption is introduced: suppose that, in the real situation, the rather long sequence Φ_{t−1}, ..., Φ_1 may not be necessary for determining the current state Φ_t, and an observation sequence Φ_{t−1}, ..., Φ_{t−τ} of some length τ is sufficient to predict the current state. According to the Markov assumption, (3) can be modified as

P(Φ_t | Φ_{t−1}, ..., Φ_{t−τ}).    (4)

To meet the requirement of a high response frequency, a simple first-order Markov model is used, that is, τ = 1, and (4) can be simplified as

P(Φ_t | Φ_{t−1}).    (5)

Bringing (2) into the equation gives

P(V_t, x_t | V_{t−1}, x_{t−1}),

where V_{t−1} is the input voltage at time t − 1, and x_{t−1} is the displacement at time t − 1.
After transformation, (7) is obtained:

V_t = f(x_t, V_{t−1}, x_{t−1}).    (7)

Finally, modifying the parameters gives

V_R = f(x_T, V_{t−1}, x_{t−1}).

That is, for the piezoelectric ceramic positioner, the voltage V_R applied to the positioner is calculated by the function f from the given target position x_T, the last input voltage V_{t−1}, and the last displacement x_{t−1}. Figure 7 is the principle structure diagram of the whole system. The task of this paper is to design a feedforward control model based on a recurrent neural network, perform the compensation calculation on the preset target position, and apply the result to the piezoelectric ceramic through the controller to realize micro and precise displacement. In this model, the state of the last moment is treated as a hidden variable, so a neural network structure that can describe hidden variables is needed, namely the RNN [19].
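The feedforward principle, in which the function f maps the target displacement and the previous voltage/displacement to the drive voltage, can be sketched as a single control step. Here `feedforward_step` and `ideal_gain` are placeholder names for illustration; the real compensator is the trained PEA-RNN.

```python
# Hypothetical sketch of one step of the feedforward loop in Figure 7.
# "model" stands in for PEA-RNN; here a toy stand-in assumes an ideal
# 1 um/V gain with no hysteresis.
def feedforward_step(model, x_target, v_prev, x_prev):
    # V_R = f(x_T, V_{t-1}, x_{t-1})
    return model(x_target, v_prev, x_prev)

ideal_gain = lambda x_t, v_prev, x_prev: x_t / 1.0   # placeholder for f

v = feedforward_step(ideal_gain, 5.0, 0.0, 0.0)      # drive voltage for 5 um
```

In the real system, the returned voltage would be sent to the controller and the resulting displacement read back from the capacitive sensor for the next step.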
As shown in Figure 8, for a single recurrent neuron, the input x_t at time t is processed in three steps to obtain the output o_t (to simplify the description, the input-layer bias and the output-layer bias are ignored). Firstly, the input is multiplied by the input-layer weight w_xh; secondly, the hidden-layer output h_{t−1} at time t − 1 is multiplied by the weight w_hh and added to the first result to obtain h_t; thirdly, h_t is multiplied by the output-layer weight w_hq and passed through the activation function to give the final output o_t. The first step is usually expressed together with the second step:

h_t = w_xh x_t + w_hh h_{t−1}.

The third step can be expressed as

o_t = φ(w_hq h_t),

where φ is the activation function. The hidden-layer state h is continuously transmitted and used "recurrently", which coincides with the time-series characteristic of the piezoelectric ceramic operating data.
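The three-step computation of the single recurrent neuron can be sketched in a few lines; the scalar weights are illustrative, and tanh is assumed as the activation function.

```python
import math

# Sketch of the single recurrent neuron described above (biases omitted,
# as in the text). All weight values are illustrative placeholders.
def rnn_step(x_t, h_prev, w_xh, w_hh, w_hq):
    h_t = w_xh * x_t + w_hh * h_prev   # steps 1 and 2: combine input and history
    o_t = math.tanh(w_hq * h_t)        # step 3: output through the activation
    return o_t, h_t

o, h = rnn_step(x_t=1.0, h_prev=0.0, w_xh=0.5, w_hh=0.3, w_hq=1.0)
```

The returned hidden state `h` would be fed back as `h_prev` at the next time step, which is the "recurrent" reuse described above.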
The structure of the basic recurrent neural unit is simple, but for long-sequence problems, training is likely to fail due to vanishing or exploding gradients. The GRU is a recurrent neural unit with better performance, optimized to address the shortcomings of the basic unit [19].
The calculation flow of the GRU is shown in Figure 9. GRU mainly includes the reset gate and the update gate, which are vectors in the interval (0,1). The opening of the reset gate controls how well the current input combines with the hidden state, while the opening of the update gate controls the retention of the hidden state [20].
For a given time step t, assume the input is a mini-batch X_t ∈ R^{n×d} (number of samples: n, number of inputs: d) and the hidden state of the previous time step is H_{t−1} ∈ R^{n×h} (number of hidden units: h). Then the reset gate R_t ∈ R^{n×h} and the update gate Z_t ∈ R^{n×h} are calculated as follows:

R_t = σ(X_t W_xr + H_{t−1} W_hr + b_r),
Z_t = σ(X_t W_xz + H_{t−1} W_hz + b_z),

where σ represents the sigmoid function, which maps input values to the interval (0,1):

σ(x) = 1 / (1 + e^{−x}).

Next, the reset gate R_t is integrated with the regular hidden-state update mechanism to obtain the candidate hidden state H̃_t ∈ R^{n×h} at time step t:

H̃_t = tanh(X_t W_xh + (R_t ⊙ H_{t−1}) W_hh + b_h),    (14)

where the symbol ⊙ is the Hadamard (element-wise) product operator. The role of the tanh function is to map input values to the interval (−1,1):

tanh(x) = (1 − e^{−2x}) / (1 + e^{−2x}).

The result of this calculation is only a candidate because the operation of the update gate still needs to be combined. The element-wise multiplication of R_t and H_{t−1} in (14) reduces the influence of past states. Whenever the terms in the reset gate R_t approach 1, a normal recurrent neural network is recovered; for all terms in R_t close to 0, the candidate hidden state is the result of an MLP with X_t as the input, so any pre-existing hidden state is reset to its default. Finally, the effect of the update gate Z_t needs to be combined, which determines to what extent the new hidden state H_t keeps the old state H_{t−1} and to what extent the new candidate state H̃_t is used. The update gate Z_t achieves this with an element-wise convex combination of H_{t−1} and H̃_t, which gives the final update formula of the gated recurrent unit:

H_t = Z_t ⊙ H_{t−1} + (1 − Z_t) ⊙ H̃_t.

Whenever the update gate Z_t approaches 1, only the old state is kept; the information from X_t is essentially ignored, effectively skipping time step t in the dependency chain. Conversely, as Z_t approaches 0, the new hidden state H_t approaches the candidate hidden state H̃_t.
These designs can help deal with the vanishing gradient problem in recurrent neural networks and better capture the dependencies of sequences with long time-step distances. For example, if the update gate is close to 1 for all time steps of the entire subsequence, then regardless of the length of the sequence, the old hidden state at the beginning time step will be easily preserved and passed on to the end of the sequence.
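The GRU update equations above can be sketched as a single NumPy step; the mini-batch sizes and randomly drawn weights are illustrative, not trained parameters.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

# Minimal GRU step following the update equations above. Weight shapes
# match X_t in R^{n x d} and H_{t-1} in R^{n x h}; biases included.
def gru_step(X, H_prev, W_xr, W_hr, b_r, W_xz, W_hz, b_z, W_xh, W_hh, b_h):
    R = sigmoid(X @ W_xr + H_prev @ W_hr + b_r)              # reset gate
    Z = sigmoid(X @ W_xz + H_prev @ W_hz + b_z)              # update gate
    H_cand = np.tanh(X @ W_xh + (R * H_prev) @ W_hh + b_h)   # candidate state
    return Z * H_prev + (1.0 - Z) * H_cand                   # convex combination

rng = np.random.default_rng(0)
n, d, h = 2, 3, 4                         # illustrative mini-batch sizes
X, H_prev = rng.normal(size=(n, d)), rng.normal(size=(n, h))
params = [rng.normal(size=s) for s in
          [(d, h), (h, h), (h,), (d, h), (h, h), (h,), (d, h), (h, h), (h,)]]
H = gru_step(X, H_prev, *params)          # new hidden state, shape (n, h)
```

Because both gates lie in (0,1), each element of the new state is a convex mix of the old state and the tanh-bounded candidate, which is exactly the gradient-friendly behavior discussed above.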
In summary, the GRU has the following two salient features [19]:
1. The reset gate helps capture short-term dependencies in the sequence;
2. The update gate helps capture long-term dependencies in the sequence.

Multilayer Perceptron (MLP)
The structure of MLP is shown in Figure 10. This structure can improve the performance of the network and realize dimensional transformation.

Residual Connection
Generally, increasing the number of layers can improve neural network performance, but it correspondingly increases the computing time and the resources occupied by the calculation. At the same time, increasing the number of layers can also lead to network degradation; that is, the weight matrices in the network are no longer sensitive to changes of the input in some dimensions, and the same output is obtained regardless of the input. To optimize the MLP, the number of layers should be compressed as much as possible with no, or very little, loss of performance, so the residual connection is introduced into the MLP in this experiment.
As shown in Figure 11, suppose the original input is x and the ideal mapping to be learned is f(x) (used as the input to the activation function above). The part in the dashed box on the left needs to fit the mapping f(x) directly, while the part in the dashed box on the right only needs to fit the residual mapping f(x) − x. In practice, the residual mapping tends to be easier to optimize. The structure on the right is the basic building block of ResNet [21], the residual block. Using the residual connection can break the symmetry of the neural network, letting the input propagate forward faster between layers, which is more conducive to constructing deep neural networks.
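A minimal sketch of such a residual block, assuming NumPy, two weight layers, and ReLU as the activation function; the layer sizes and weights are illustrative placeholders.

```python
import numpy as np

# Sketch of the residual block in Figure 11: the weight layers fit the
# residual f(x) - x, and the skip connection adds x back before the final
# activation. All sizes and weights are illustrative.
def residual_block(x, W1, W2):
    out = np.maximum(0.0, x @ W1)     # first weight layer + ReLU activation
    out = out @ W2                    # second weight layer
    return np.maximum(0.0, out + x)   # add the skip connection, then activate

rng = np.random.default_rng(1)
x = rng.normal(size=(1, 8))
W1, W2 = rng.normal(size=(8, 8)), np.zeros((8, 8))
y = residual_block(x, W1, W2)   # with W2 = 0 the block reduces to ReLU(x)
```

With the second weight matrix at zero, the block degenerates to the identity (up to the final ReLU), which illustrates why fitting the residual around an identity path is easier than fitting f(x) from scratch.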

Network Training
The training process is shown in Figure 12. Firstly, the weight parameters are initialized, and the dataset is randomly shuffled and divided into a training dataset and a validation dataset at a ratio of 8:2. Then, the training data are taken out in batches; the current actual displacement, the last voltage, and the last displacement are used as the input of the neural network, the initial state is set to all zeros, and the voltage prediction is calculated layer by layer through the forward-propagation algorithm. The prediction is then compared with the current actual voltage to calculate the loss value (mean square error, MSE), and the neural network weight parameters are updated by the optimizer in the backpropagation algorithm. Whenever the loss value on the training dataset decreases, the network is evaluated on the validation dataset; if the loss value is the minimum so far on the validation dataset, the current neural network parameters are saved.
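The shuffle/split/validate/save-best logic described above can be sketched with a toy linear model standing in for PEA-RNN; the data, model, and hyperparameters below are synthetic placeholders, not the paper's.

```python
import numpy as np

# Sketch of the training procedure in Figure 12 with a toy linear model
# standing in for PEA-RNN. Data and hyperparameters are placeholders,
# but the split / validate / save-best flow follows the text.
rng = np.random.default_rng(2)
X = rng.normal(size=(100, 3))            # inputs: (x_t, V_{t-1}, x_{t-1})
y = X @ np.array([1.0, -0.5, 0.25])      # synthetic target voltage

idx = rng.permutation(len(X))            # shuffle the dataset
split = int(0.8 * len(X))                # 8:2 train/validation split
tr, va = idx[:split], idx[split:]

w = np.zeros(3)                          # initialize weight parameters
best_w, best_val = w.copy(), np.inf
for epoch in range(200):
    grad = 2 * X[tr].T @ (X[tr] @ w - y[tr]) / len(tr)   # MSE gradient
    w -= 0.05 * grad                                     # parameter update
    val = np.mean((X[va] @ w - y[va]) ** 2)              # validation MSE
    if val < best_val:                   # save the best parameters so far
        best_val, best_w = val, w.copy()
```

In the paper the model is the GRU-plus-MLP network and the optimizer is Adam, but the bookkeeping around the loop is the same.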

In the experiment, the loss function is the MSE:

MSE = (1/n) Σ_{i=1}^{n} (ŷ_i − y_i)²,

where n represents the number of samples, y_i is the actual value, and ŷ_i is the predicted value. The optimizer is the commonly used Adam [22], which adaptively adjusts the learning rate α so that training converges faster. The parameter update formula is

w_t = w_{t−1} − α_t · m_t / (√v_t + ε).

In the formula, w_t is the weight parameter to be updated; w_{t−1} is the weight parameter at the last moment; α_t is the learning rate; m_t is the first-order moment estimate of the gradient, which adaptively adjusts the speed of the learning-rate change; v_t is the second-order moment estimate of the gradient, which helps prevent the parameters from falling into a local minimum; and ε = 10^{−8} is used to prevent the divisor from being 0.

The total dataset comprises 16,020 sets of piezoelectric ceramic operating data. During the experiment, the batch size was set to 32 and the learning rate to 0.01. After 5000 epochs of training, a model that meets the accuracy requirements was obtained. The decrease of the error on the validation dataset is shown in Figure 13.

As shown in Figure 14 and Table 1, the designed deep neural network includes one recurrent layer (hidden-state dimension 128) and seven linear layers. The ReLU function is used as the activation function between the linear layers, and the second linear layer is connected with the fifth by a residual connection. The input layer has three dimensions (the current actual displacement, the last voltage, and the last displacement), and the output layer is one-dimensional (the input voltage prediction).
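A single Adam update can be sketched as below. The bias-correction terms and the default β values come from the original Adam paper [22] rather than from this text, and the scalar gradient is illustrative.

```python
import math

# One Adam parameter update. beta1/beta2/eps are the defaults from the
# original Adam paper [22]; the gradient and moments here are
# illustrative scalars.
def adam_step(w, grad, m, v, t, alpha=0.01, beta1=0.9, beta2=0.999, eps=1e-8):
    m = beta1 * m + (1 - beta1) * grad        # first-moment estimate
    v = beta2 * v + (1 - beta2) * grad ** 2   # second-moment estimate
    m_hat = m / (1 - beta1 ** t)              # bias correction
    v_hat = v / (1 - beta2 ** t)
    w = w - alpha * m_hat / (math.sqrt(v_hat) + eps)
    return w, m, v

w, m, v = adam_step(w=1.0, grad=0.5, m=0.0, v=0.0, t=1)
```

The √v term normalizes the step by the gradient's magnitude history, which is what makes the effective learning rate adaptive per parameter.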
According to the design, the ideal input voltage V_k required to move to the target displacement x_k at time k is obtained as follows:
Step 1: The target displacement x_k, the output displacement x_{k−1}, and the input voltage V_{k−1} at time k − 1 directly form the input matrix X^(k) ∈ R^{1×3} of the neural network at time k, and the initial hidden-state matrix H_0^(k) ∈ R^{1×128} is set to zero:

X^(k) = [x_k, V_{k−1}, x_{k−1}],  H_0^(k) = 0.

Step 2: The GRU processes the input matrix sequentially and then outputs the corresponding result and the hidden-state matrix containing the trend information.
Step 3: The output result H^(k) of the GRU is passed to the MLP to obtain the voltage prediction V_k = F(H^(k)), where F(·) represents the complex nonlinear computation constructed by the MLP.
After the calculation of the recurrent layer and the MLP, the ideal input voltage V k of the piezoelectric driver is finally obtained.
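The three inference steps can be sketched with placeholder callables standing in for the trained GRU and MLP layers; only the input dimension (3) and the hidden-state size (128) follow the network design, while `toy_gru` and `toy_mlp` are illustrative stand-ins.

```python
import numpy as np

# Sketch of the three inference steps above. "gru" and "mlp" stand in for
# the trained PEA-RNN layers; the toy versions below are placeholders.
def predict_voltage(gru, mlp, x_k, v_prev, x_prev):
    X = np.array([[x_k, v_prev, x_prev]])   # Step 1: X^(k) in R^{1x3}
    H0 = np.zeros((1, 128))                 # zero initial hidden state
    H = gru(X, H0)                          # Step 2: recurrent layer
    return mlp(H)                           # Step 3: V_k = F(H^(k))

# toy stand-ins for illustration only, not the trained network
toy_gru = lambda X, H0: np.tanh(X @ np.ones((3, 128)) * 0.01 + H0)
toy_mlp = lambda H: float(H.sum())
V_k = predict_voltage(toy_gru, toy_mlp, x_k=1.0, v_prev=0.0, x_prev=0.0)
```

In operation this prediction would be computed once per control step, so the cost of one GRU pass plus seven linear layers bounds the achievable response frequency.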

Experimental Test
As shown in Figure 15, the validity of the model was verified on the experimental platform (Figure 2). The target position on the PC was set to follow a linear motion trajectory, a sinusoidal motion trajectory, an amplified sinusoidal motion trajectory, and a damped sinusoidal motion trajectory at a working frequency of 10 Hz. After PEA-RNN feedforward compensation, the prediction was sent to the controller to generate the input voltage, and the actual position of the positioner was then measured. The experimental results are shown in Figures 16-18. When the target is set to the step signal, the target position has a good linear relationship with the actual position, and the maximum error is 0.210 µm. When the target is set to a sinusoidal signal, the maximum error is 0.396 µm; when it is set to a damped sinusoidal signal, 0.230 µm; and when it is set to an amplified sinusoidal signal, 0.465 µm.

Ablation Experiment
To more clearly demonstrate the contribution of each structure in the PEA-RNN to the model, two ablation experiments were performed to evaluate the effect of the MLP and the residual connection, respectively.

The Impact of MLP
This experiment verifies the effect of the MLP on the control accuracy of PEA-RNN. The method adopted is to directly delete the MLP from PEA-RNN, so that the new network has only one GRU layer. After the network is fully trained with the same training strategy as the original, the output error of the piezoelectric ceramic is tested with a sinusoidal input signal. The results are shown in Figure 19: the maximum output error is 0.599 µm, an increase of 51.3% compared with the original network.
After removing the MLP, the new network is an ordinary RNN containing a single GRU layer. The linear fitting ability of the network is reduced, the error distribution is relatively scattered, and the error at the starting point is significant. The MLP at the tail enhances the nonlinear expression ability of PEA-RNN so that the model can describe the hysteresis nonlinearity of piezoelectric ceramics in sufficient detail.

The Impact of Residual Connection
This section evaluates the impact of the residual connection on model performance by removing it alone from PEA-RNN; the experimental results are shown in Figure 20. At the beginning of the piezoelectric ceramic displacement output, the error is excessively large, indicating that after the same number of training epochs the neural network has not completely fitted the original curve and the training error has not dropped to a relative minimum. Some of the error values are around ±1 µm, far worse than the control accuracy of the original network. The role of the residual connection is to speed up the neural network training process; at the same time, it is beneficial for building a deeper network, which can also increase the nonlinear expression ability of the network.

Conclusions
Based on the PyTorch framework, this paper builds a deep neural network model named PEA-RNN, with one recurrent layer and seven linear layers. The ReLU function is used as the activation function between the linear layers, and a residual connection is used between the second and the fifth linear layers. The design principle and training process of the model are given, and the trained model is applied to the feedforward compensation of the positioner.
The test results show that, under the control of PEA-RNN with input signals at a 10 Hz operating frequency, the maximum displacement error is reduced from 8.96 µm to 0.465 µm. Ablation experiments verified the role of each structure in PEA-RNN: the MLP effectively enhances the nonlinear expression ability of the model, and the residual connection not only accelerates the training process but also enhances the nonlinear expression of the model. These results show that the PEA-RNN constructed in this paper can accurately describe the dynamic hysteresis nonlinearity of piezoelectric ceramics, supports a real-time intelligent beamline modulation system, and creates the possibility of online model updating.