IK-SPSA-Based Performance Optimization Strategy for Steam Generator Level Control System of Nuclear Power Plant

Abstract: The steam generator (SG) is a critical component of the steam supply system in a nuclear power plant (NPP), so its level must be controlled well to ensure stable operation of the NPP. However, the dynamic level response is significantly nonlinear (for example, the 'swell and shrink' effect) and time-varying. Because most SG level control systems (SGLCS) are built on Proportional-Integral-Derivative (PID) controllers with fixed parameters, the controller parameters should be optimized to improve SGLCS performance. Traditional parameter tuning methods are generally experience-based, cumbersome, and time-consuming, and they rarely reach the optimal parameters. To address this challenge, this study adopts a knowledge-informed simultaneous perturbation stochastic approximation (IK-SPSA) method based on information from adjacent iteration points to improve the performance of the SGLCS. Unlike traditional controller parameter tuning, the IK-SPSA method optimizes the control system directly from measurements of control performance. The method's efficiency lies in the following aspects. First, with the help of historical information gathered during optimization, the IK-SPSA can dynamically sense the current status of the optimization process. Second, it tunes the iteration step size adaptively according to that status, reducing the optimization cost. Third, its simultaneous perturbation gives it high efficiency when optimizing high-dimensional controller parameters. Fourth, it incorporates an intelligent termination control mechanism that ends the optimization process based on historical iteration information, avoiding unnecessary iterations. The optimization method can improve the stability, safety, and economy of the SGLCS. The simulation results demonstrated the effectiveness and efficiency of the method.


Introduction
The 26th Conference of the Parties (COP 26) to the United Nations Framework Convention on Climate Change (UNFCCC) set securing global net-zero carbon emissions and carbon neutrality by the mid-21st century as one of the four main objectives of the negotiations [1]. Against the background of "carbon peaking" and "carbon neutrality", the safety, efficiency, and cleanness of nuclear energy make it an important means of ensuring power supply, realizing the dual carbon commitment, pursuing clean and low-carbon development, and building a new development pattern of "dual circulation" [2,3].
To ensure the safe and economical operation of nuclear power plants (NPPs), it is necessary to strictly control the level of steam generators (SG) in NPPs. An SG is a critical component of an NPP's steam supply system, transferring heat from the primary loop to the secondary loop.
Because the simultaneous perturbation stochastic approximation (SPSA) method is data-driven, it only needs the measured value of the process quality instead of relying on historical process information [16-19], which avoids the model mismatch problem of model-based optimization methods. Meanwhile, SPSA has a high optimization efficiency due to its simultaneous perturbation, allowing it to overcome the drawbacks of model-free intelligent optimization methods [20,21]. Moreover, the knowledge-informed simultaneous perturbation stochastic approximation (IK-SPSA) proposed by Kong et al. further improves the optimization efficiency by using historical iterative process information. Compared with the SPSA, the IK-SPSA can dynamically sense the current status of the optimization process and tune the iteration step size adaptively according to that status, reducing the optimization cost. Therefore, the IK-SPSA method can effectively increase the efficiency of parameter tuning and control system optimization.
Considering the similarity between parameter tuning of a control system and batch process optimization, the IK-SPSA method, as a data-driven method, may be used to improve the efficiency of parameter tuning of the SGLCS. Thus, in this paper, the IK-SPSA method was revised according to the specific scenario of SGLCS performance optimization. Meanwhile, an iteration termination control strategy was incorporated into the revised IK-SPSA method to avoid unnecessary costs. The background, the detailed methodology, and the verification of the IK-SPSA-based strategy are presented in the following sections.

SG Mechanism
An SG comprises a downcomer, tube bundle, riser, steam-water separator, dryer, etc. The structural diagram of a typical SG is shown in Figure 1. The secondary loop's feedwater is combined with the water separated by the steam-water separator, passes through the downcomer, and then flows upward along the outside of the inverted U-shaped tube bundle. The rising water absorbs the reactor heat carried by the primary loop through the tube bundle and becomes a steam-water mixture. The steam-water mixture enters the steam-water separator and dryer. After the steam-water separation, the steam flows from the top outlet of the SG to the steam turbine to drive the generator. The separated water is mixed with feedwater for recycling [6].
The SG is a highly complex, nonlinear, and time-varying inverse dynamic system. Swell and shrink is one of an SG's dynamic characteristics. The level of the SG is influenced by many factors, of which the most important are the feedwater flow rate and the steam flow rate. It is also affected by the feedwater temperature, the pressure in the SG, the number of bubbles, and other factors, and the level characteristics differ considerably at different power levels.

SGLCS
To keep the SG level within a specific safe range during the operation of the nuclear power unit, the level control of the SG needs to complete two main tasks. The first task is to overcome the influence of the shrink and swell effect caused by steam flow disturbance during the operation of the SG. Based on the mismatch between steam flow and feedwater flow, the SG level should reach the setpoint in the shortest time, so that the SGLCS has good dynamic characteristics. The second task is to keep the feedwater flow and steam flow in relative balance and the level essentially constant during stable operation, so that the control system has good static characteristics [17]. To achieve these tasks, the control diagram of the SGLCS should be designed as shown in Figure 2. Current SGLCSs generally adopt the three-impulse cascade proportional-integral-derivative (PID) control structure, which is shown in Figure 3.

Performance Optimization of SGLCS
Once the structure of the SGLCS is fixed, its control performance is determined mainly by the controller parameters. To optimize control performance, the relevant parameters of each controller need to be adjusted. The control system's parameter optimization structure is shown in Figure 4. Different parameter combinations lead to different SGLCS performance. The parameter combination that optimizes the control performance of the control system is defined as the optimal parameter combination solution. Converting the performance optimization described above into a mathematical statement, the optimization problem can be expressed as follows:

$$\max_{x} \; Perf = f(x) \quad \text{s.t.} \quad x \in \Omega$$

where $Perf$ represents the performance index of the control system, $x$ represents the selected control parameter set, $\Omega$ represents the selected range of the control parameter set, and $f(x)$ represents the relationship between the controller parameters and the corresponding control system performance index.

Data-Driven Overall Strategy for Performance Optimization of SGLCS
According to the batch characteristics of SGLCS performance optimization, a data-driven strategy framework for SG level control performance optimization is proposed, as shown in Figure 5. The optimization framework consists of three parts. The SG object and the control system of the NPP constitute a generalized process object; the data-driven optimization strategy consists of the other two parts, the performance evaluation and the data-driven optimization algorithm.
A brief introduction of each part is as follows:
(1) Generalized process object. In the actual industrial process, the generalized process object is the actual SG level process and its distributed control system (DCS). The data-driven optimization algorithm transfers the iterative controller parameters to the DCS, which modifies the relevant controller parameters and puts them into the control system. Then, the operation data are collected by the DCS and transferred to the performance evaluation system. Without loss of generality, the classical SG level model and its control model are used in this research in place of the actual process.
(2) Performance evaluation. The performance evaluation system receives the water level response data from the process object and stores these data. According to the characteristics of the system, the calculated value of the performance evaluation can be used to evaluate the performance of the control system corresponding to the selected control parameters. In this paper, the control system performance evaluation index uses the integral of time multiplied by the absolute error (ITAE) to evaluate the characteristics of the step response transient curve [22], and it can be expressed as below:

$$ITAE = \int_{0}^{T} t\,|e(t)|\,dt$$

where $t$ is time, $e(t)$ is the deviation of the SG level from its setpoint, and $T$ is the length of the evaluated response. Because the ITAE value changes too much relative to the scale of the parameter changes, which could make the optimization process oscillate too violently, the performance evaluation index ITAE(lg) in this paper is defined as shown in the following formula:

$$ITAE(lg) = \lg(ITAE)$$

(3) Data-driven optimization. The data-driven optimization system receives the process data from the performance evaluation system. Historical data generated in the optimization process are stored in order and used to guide the optimization process, improve the optimization efficiency, and reduce the optimization cost as much as possible. This paper focuses on introducing a new type of SPSA method. The improved SPSA optimization is based on the traditional SPSA optimization method and combined with the water level control characteristics of the SG to realize adaptive adjustment of the optimization step size and intelligent termination of the optimization iteration process.
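The performance evaluation step can be illustrated with a short sketch that computes ITAE from a sampled level response by the trapezoidal rule and then takes its base-10 logarithm. The function names `itae` and `itae_lg`, the sampling grid, and the toy first-order response are assumptions made for illustration; they are not the SG model or the evaluation code used in this paper, and reading ITAE(lg) as the base-10 logarithm of ITAE is likewise an assumption.

```python
import math


def itae(times, levels, setpoint):
    """Trapezoidal approximation of ITAE = integral of t * |e(t)| dt."""
    total = 0.0
    for i in range(1, len(times)):
        dt = times[i] - times[i - 1]
        f_prev = times[i - 1] * abs(setpoint - levels[i - 1])
        f_curr = times[i] * abs(setpoint - levels[i])
        total += 0.5 * (f_prev + f_curr) * dt
    return total


def itae_lg(times, levels, setpoint):
    """Base-10 logarithm of ITAE (assumed reading of the ITAE(lg) index).

    Raises ValueError if ITAE is exactly zero, i.e. for a perfect response.
    """
    return math.log10(itae(times, levels, setpoint))


# Toy step response: first-order approach to a setpoint of 1.0 (illustrative only).
times = [0.1 * i for i in range(501)]
levels = [1.0 - math.exp(-t / 5.0) for t in times]
print(itae(times, levels, 1.0), itae_lg(times, levels, 1.0))
```

Compressing the index with a logarithm keeps the ordering of candidate parameter sets unchanged while reducing the spread of the values the optimizer sees, which is the motivation given above for ITAE(lg).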

IK-SPSA
The SPSA algorithm is a model-free optimization (MFO) methodology suitable for multi-dimensional and noisy environments [19,20]. The flowchart of the SPSA algorithm is shown in Figure 6.
Step 1: Initialization. Determine the parameters to be optimized and their feasible areas, and select the starting point through experience or according to specific rules. Set appropriate optimization method coefficients {a, A, c, α, γ} and iteration termination conditions.
Step 2: Step size calculation. Update the step size $a_k$ at the current iteration batch. The iteration step size can be calculated as below:

$$a_k = \frac{a}{(A + k + 1)^{\alpha}}$$

where $a$, $A$, and $\alpha$ are coefficients of SPSA, $k$ is the current iteration number, and $a_k$ is the iteration step size of SPSA at the kth iteration. The perturbation step size can be calculated as follows:

$$c_k = \frac{c}{(k + 1)^{\gamma}}$$

where $c$ and $\gamma$ are coefficients of SPSA and $c_k$ is the perturbation step size of SPSA at the kth iteration, which decreases as the iterative optimization proceeds.
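As a small numerical illustration of Step 2, the sketch below evaluates the two gain sequences in the commonly used SPSA form $a_k = a/(A+k+1)^{\alpha}$ and $c_k = c/(k+1)^{\gamma}$. The index offset (counting iterations from k = 0) and the coefficient values are assumptions for illustration only; α = 0.602 and γ = 0.101 are values often recommended in the SPSA literature, not values reported in this paper.

```python
def step_size(k, a, A, alpha):
    """Iteration step size a_k = a / (A + k + 1)**alpha (k counted from 0)."""
    return a / (A + k + 1) ** alpha


def perturbation_size(k, c, gamma):
    """Perturbation step size c_k = c / (k + 1)**gamma (k counted from 0)."""
    return c / (k + 1) ** gamma


# Illustrative coefficients (assumed, not taken from the paper).
a, A, c, alpha, gamma = 0.5, 10.0, 0.1, 0.602, 0.101
for k in (0, 1, 10, 100):
    print(k, step_size(k, a, A, alpha), perturbation_size(k, c, gamma))
```

Both sequences decrease monotonically with k, and the stability constant A damps the earliest steps, so the first iterations do not move the controller parameters too aggressively.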
Step 3: Perturbation points generation. Generate an n-dimensional perturbation vector $\Delta_k$ by the Monte Carlo method, each element of which is drawn independently from the Bernoulli ±1 distribution. Assuming that the current iteration point is $X_k$, the positive perturbation point is $X_k + c_k\Delta_k$ and the negative perturbation point is $X_k - c_k\Delta_k$.
Step 4: In the kth iteration, evaluate the loss function at the iteration point, $L(X_k)$, and at the positive and negative perturbation points, $L(X_k \pm c_k\Delta_k)$.
Step 5: Gradient approximation calculation. The gradient approximation is calculated from the positive and negative perturbation points and their corresponding loss function values. The approximate formula of the gradient is as below:

$$G(X_k) = \frac{L(X_k + c_k\Delta_k) - L(X_k - c_k\Delta_k)}{2c_k}\begin{bmatrix}\Delta_{k1}^{-1}\\ \vdots\\ \Delta_{kn}^{-1}\end{bmatrix}$$

Step 6: New iteration point generation. The new iteration point is generated from $G(X_k)$ and $a_k$ according to the following formula:

$$X_{k+1} = X_k - a_k G(X_k)$$

The traditional SPSA algorithm has high efficiency. However, because of its fixed step size mechanism, the step size parameters cannot be adjusted adaptively. If the step size is too small, the optimization process slows down; if the step size is too large, the iteration overshoots the optimal solution and the optimization process oscillates.
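Steps 3 to 6 can be sketched as a single function that draws a Bernoulli ±1 perturbation, evaluates the loss at the two perturbed points, forms the simultaneous perturbation gradient estimate, and takes one descent step. This is a minimal sketch of the standard SPSA iteration; the function name `spsa_step`, the quadratic test loss, and all coefficient values are assumptions for illustration and stand in for the SGLCS performance index.

```python
import random


def spsa_step(x, loss, a_k, c_k, rng):
    """One SPSA iteration: returns the new point and the gradient estimate."""
    # Step 3: Bernoulli +/-1 perturbation vector and the two perturbed points.
    delta = [rng.choice((-1.0, 1.0)) for _ in x]
    x_plus = [xi + c_k * di for xi, di in zip(x, delta)]
    x_minus = [xi - c_k * di for xi, di in zip(x, delta)]
    # Step 4: two loss evaluations, regardless of the dimension of x.
    diff = loss(x_plus) - loss(x_minus)
    # Step 5: simultaneous perturbation gradient approximation.
    grad = [diff / (2.0 * c_k * di) for di in delta]
    # Step 6: new iteration point.
    x_new = [xi - a_k * gi for xi, gi in zip(x, grad)]
    return x_new, grad


def quadratic(x):
    """Toy loss with minimum at (1, 1, ..., 1); a stand-in for ITAE(lg)."""
    return sum((xi - 1.0) ** 2 for xi in x)


rng = random.Random(0)
x = [4.0, -2.0, 0.5]
for k in range(200):
    a_k = 0.2 / (10.0 + k + 1) ** 0.602
    c_k = 0.1 / (k + 1) ** 0.101
    x, _ = spsa_step(x, quadratic, a_k, c_k, rng)
print(x, quadratic(x))
```

Only two loss evaluations are needed per iteration no matter how many controller parameters are tuned, which is the source of the efficiency attributed to simultaneous perturbation in high-dimensional problems.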
To solve the above problems and make the SPSA algorithm more efficient and accurate, two improvements are proposed that combine the data-driven idea with the information of adjacent iteration points in the optimization history. Firstly, the current status of the optimization process is evaluated from historical information, so that the SPSA can adjust its iteration step size according to the status evaluation and thus gains an adaptive ability in the optimization process.
Secondly, adaptive compensation factors integrated into SPSA adjust the next iteration step size according to the current status of the optimization process. Combining these two ideas with the regular patterns found in the historical information yields an improved algorithm, the knowledge-informed simultaneous perturbation stochastic approximation (IK-SPSA). It can adaptively adjust the step size according to the current optimization process status and effectively improve the optimization efficiency of SPSA.
The flowchart of the IK-SPSA algorithm is shown in Figure 7, and the specific implementation steps are as follows:
Steps 1-5 are the same as the steps of SPSA.
To identify the status of the optimization process, a large number of historical iterative optimization process data are compared with typical optimization process trajectories. The analysis identifies five kinds of local optimization process status: rapid local descent, local deceleration descent, local uniform descent, local concave oscillation, and local convex oscillation, as shown in Figure 8.
To identify the status of the optimization process, the information of adjacent historical iterations is used to obtain a judgment factor $\Delta\Gamma_k$, which is used to judge the current optimization process status. In its definition, $\overleftrightarrow{L}_{ajt}$ represents the set of loss function values of three successive iterations and $\delta$ is a weighting factor. The relationship between the judgment factor and the status of the optimization process is as follows:
(1) If the optimization process is currently in a rapid descent stage, then $\Delta\Gamma_k < 0$. The current optimization increment is large, so the step size can be increased appropriately through the compensation factor to accelerate the optimization process.
(2) If the optimization process is in a slow descent stage, then $\Delta\Gamma_k > 0$. The optimization increment is small and the process is close to the local optimum, so the optimization process can be slowed down by reducing the step size appropriately through the compensation factor.
(3) If the optimization process is in the critical status, that is, the constant-speed descent stage, then $\Delta\Gamma_k = 0$ and the current step size is kept without adjustment.
(4) If the optimization process is in a status of local oscillation, it is close to the local optimum. The compensation factor reduces the step size appropriately to slow down the optimization process and ensure its convergence.
Step 7: Dynamic adjustment of step size. The current step size is adjusted according to the judgment factor.

In the step adjustment formula, $d_k$ is the step value adjusted by the IK-SPSA step adjustment mechanism, $a_k$ is the original step value, and $\psi_k$ is the signal that indicates the status of the optimization process and therefore the direction of the step adjustment. $\Delta d_k$ is the step adjustment amplitude and $\xi$ is the step adjustment coefficient, which takes a typical value of 0.5 in this paper. The sign of $\psi_k$ is obtained from the loss function values of adjacent historical iteration points through the sign function sgn(·), and $\Delta d_k$ is obtained from the parameter sets of adjacent iteration points of the optimization process.
Step 8: New iteration point $X_{k+1}$ generation.
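The status-dependent step adjustment described above can be illustrated with a deliberately simplified sketch. The exact definitions of the judgment factor and of the adjustment amplitude are not reproduced in the text here, so the code uses hypothetical stand-ins: the judgment factor is approximated by the difference between the two most recent loss decrements, and the step is scaled by a fixed factor built from ξ = 0.5. The function names and these proxy formulas are assumptions and should not be read as the IK-SPSA definitions.

```python
def judgment_factor(l_prev2, l_prev1, l_curr):
    """Hypothetical proxy for the judgment factor from three successive losses.

    Returns previous decrement minus latest decrement:
      negative -> descent is speeding up (rapid descent),
      positive -> descent is slowing down,
      zero     -> uniform descent.
    Returns None when the losses are not strictly decreasing (local oscillation).
    """
    d1 = l_prev2 - l_prev1
    d2 = l_prev1 - l_curr
    if d1 <= 0 or d2 <= 0:
        return None
    return d1 - d2


def adjust_step(a_k, losses, xi=0.5):
    """Scale the nominal step a_k according to the assumed status rule."""
    if len(losses) < 3:
        return a_k  # not enough history yet: keep the nominal step
    factor = judgment_factor(*losses[-3:])
    if factor is None or factor > 0:
        return a_k * (1.0 - xi)  # oscillation or slow descent: shrink the step
    if factor < 0:
        return a_k * (1.0 + xi)  # rapid descent: enlarge the step
    return a_k  # uniform descent: keep the step


print(adjust_step(1.0, [10.0, 8.0, 5.0]))
print(adjust_step(1.0, [10.0, 8.0, 9.0]))
```

The sketch follows the same qualitative rule as cases (1) to (4): enlarge the step while the descent is accelerating, shrink it when the descent slows or the loss oscillates, and leave it unchanged otherwise.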

Iteration Termination Control
Data-driven MFO requires not only a reasonable iteration step size but also a reasonable mechanism to terminate the iterative optimization process in time [15]. Convergence is an important characteristic and evaluation index of an optimization process. The traditional SPSA terminates the iteration at a fixed maximum number of iterations. To improve the optimization efficiency and avoid spending many iterations for only marginal improvement, it is necessary to analyze the historical iteration information to terminate the iteration process in time. Figure 9 shows the control flowchart of iteration termination.
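The idea of terminating from historical iteration information, rather than from a fixed iteration budget, can be sketched as follows. The sketch keeps a best-so-far sequence, smooths it with a trailing moving average, and stops when the relative improvement over the last few iterations falls below a threshold. The function names, the window length, the patience, and the tolerance are all hypothetical choices for illustration; the smoothing and normalization formulas of this paper are not reproduced.

```python
def best_so_far(history):
    """Running minimum of the loss history (a relatively optimal sequence)."""
    best, out = float("inf"), []
    for value in history:
        best = min(best, value)
        out.append(best)
    return out


def moving_average(seq, window):
    """Trailing moving average; early entries use the samples available."""
    out = []
    for i in range(len(seq)):
        lo = max(0, i - window + 1)
        out.append(sum(seq[lo:i + 1]) / (i + 1 - lo))
    return out


def should_terminate(history, window=3, patience=5, tol=1e-3):
    """Stop when the smoothed best loss improved by less than tol (relative)
    over the last `patience` iterations. All thresholds are assumed values."""
    if len(history) < patience + 1:
        return False
    trend = moving_average(best_so_far(history), window)
    recent = trend[-(patience + 1):]
    improvement = recent[0] - recent[-1]  # non-negative by construction
    scale = abs(recent[-1]) or 1.0  # avoid dividing by zero
    return improvement / scale < tol


print(should_terminate([5.0] * 10))
print(should_terminate([10.0 - i for i in range(10)]))
```

Dividing the improvement by the current performance makes the stopping rule insensitive to the absolute scale of the loss, which mirrors the role of the termination factor as a ratio of performance improvement to current point performance.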

The implementation can be divided into the following steps:
Step 1: Historical iteration sequence updating. The historical iteration function sequence $S_H$ is a growing sequence that stores the function values of all historical iteration points. When a new iteration point is generated, its function value is appended to the sequence.
Step 2: Relative optimality sequence updating. The historical iteration sequence is sorted according to the control performance of all iteration points. A relatively optimal iteration sequence, $S_{RO}$, is formed from the best point found up to each iteration.
Step 3: Smoothed trend sequence updating. To obtain the trend of the iterative process, the relatively optimal sequence is smoothed by the moving average method to obtain the smoothed trend sequence $S_{ST}$, as shown in Equation (13).
where n is the dimension of the parameter and λ is the smoothing coefficient.
Step 4: Smoothing termination sequence updating. The smoothed trend sequence is smoothed further to obtain the termination sequence $S_{TM}$ used for termination control. $S_{TM}$ is a monotonically decreasing sequence based on the trend sequence $S_{ST}$, as in Equation (14).
Step 5: Differential control sequence updating. The differential control sequence, $\Delta S_{TM}$, is formulated as below:
Step 6: Iteration termination factor calculation and normalization. To evaluate the relative progress of optimization, the termination factor $\zeta$ is defined as shown in the following formula:
where $\zeta$ indicates the ratio of the control performance improvement to the current point performance. For consistency across problems, the termination factor is normalized to obtain a general, problem-independent factor. The normalization is shown in Equation (17).
Step 7: Iteration termination. The process becomes a candidate for termination when the termination factor is small enough, that is, $\varsigma(i) < \varsigma_T$, where $\varsigma_T$ is the allowable error. To avoid premature termination caused by accidental factors in the iteration process, this condition is verified repeatedly, so the iteration termination rule is defined as:
where $\kappa_F$ is the repetition coefficient, which can be set by the engineer, and $\kappa$ is a counter of the number of iterations that satisfy the tolerance. When the termination condition is met, the optimization process is terminated and the optimal solution is obtained.
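The steps above can be sketched roughly as follows. Equations (13) to (17) are not reproduced in this text, so every formula in the code is a stand-in assumption: a running minimum for $S_{RO}$, a trailing moving average for $S_{ST}$, a second running minimum for $S_{TM}$, and the raw improvement ratio for the termination factor, with the normalization step omitted. The names `window`, `tol`, and `kappa_f` are hypothetical.

```python
import numpy as np

def should_terminate(history, window=3, tol=1e-3, kappa_f=3):
    """Decide whether to stop, given the loss values of all past iterations."""
    s_h = np.asarray(history, dtype=float)           # Step 1: S_H
    if s_h.size < 2:
        return False
    s_ro = np.minimum.accumulate(s_h)                # Step 2: best value so far
    # Step 3: trailing moving average of the best-so-far sequence (assumed form).
    s_st = np.array([s_ro[max(0, i - window + 1): i + 1].mean()
                     for i in range(s_ro.size)])
    s_tm = np.minimum.accumulate(s_st)               # Step 4: monotone sequence
    d_s_tm = s_tm[:-1] - s_tm[1:]                    # Step 5: improvement per step
    # Step 6: improvement relative to current performance (normalization omitted).
    zeta = d_s_tm / np.maximum(np.abs(s_tm[1:]), 1e-12)
    # Step 7: count the most recent consecutive iterations below the tolerance.
    kappa = 0
    for value in zeta[::-1]:
        if value < tol:
            kappa += 1
        else:
            break
    return kappa >= kappa_f
```

Requiring `kappa_f` consecutive small values, rather than a single one, is what guards against stopping on one accidental flat iteration.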

Simulation Experiment Platform
The SG level change process is time-varying, highly complex, and has an obvious swell and shrink phenomenon. Considering the particularity of nuclear power operation safety, it is difficult to carry out online experiments on actual nuclear power units. In this paper, the piecewise linear mathematical model of SG level proposed by E. Irving is used as the simulation experiment platform [23].
E. Irving's SG level model is widely adopted in research on SG level control, and its transfer function is as follows:
where $Y$ is the level of the SG; $Q_e$ is the feedwater flow and $Q_v$ is the steam flow; $G_1$, $G_2$, and $G_3$ are constant coefficients; $\tau_1$ and $\tau_2$ are the delay times; $T$ is the oscillation period in s; and $s$ is the Laplace operator. E. Irving's model is a piecewise linear model whose parameters change across power segments. The state-space expression of the SG level model is as follows:
where $e(t)$ is the feedwater flow, $v(t)$ is the steam flow, and $y(t)$ is the water level.
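The transfer function itself is not reproduced in this text. For reference, Irving's model is commonly written in the literature in the form below, using the symbols defined above; this is the widely cited form, and the paper's own notation may differ in detail.

```latex
Y(s) = \frac{G_1}{s}\bigl(Q_e(s) - Q_v(s)\bigr)
     - \frac{G_2}{1 + \tau_2 s}\bigl(Q_e(s) - Q_v(s)\bigr)
     + \frac{G_3\, s}{\tau_1^{-2} + 4\pi^2 T^{-2} + 2\tau_1^{-1} s + s^2}\, Q_e(s)
```

The first term is the mass-capacity (integrating) effect, the second is the swell and shrink effect, which acts opposite to the first, and the third is the mechanical oscillation excited by the feedwater flow.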
The feedwater flow step response and the steam flow step response of the simulated E. Irving's model at different power levels are shown in Figures 10 and 11, respectively. Figures 10 and 11 show that the swell and shrink effects weaken as power increases, which is consistent with the actual SG level response. Moreover, when the power reaches 30%, the step response curve for feedwater flow oscillates visibly, which shows that E. Irving's model reflects the false level and mechanical oscillation of the actual SG well. Furthermore, at the same power, the final decrease in SG level caused by an increase in steam flow rate equals the final increase in SG level caused by an equal increase in feedwater flow rate. This agrees with the actual SG, which keeps the level unchanged when the feedwater flow rate and the steam flow rate are equal.

Figure 11. Curve of level response for step flowrate in steam at different powers.

Experimental Setup
The piecewise linear mathematical model of the SG level proposed by E. Irving serves as the simulation experiment platform. As shown in Figure 3, the SG level control system adopts the three-impulse cascade PID control scheme at 5% full power (FP), 30% FP, 50% FP, and 100% FP. The control system includes six controller parameters, those of the primary PID and the auxiliary PID. In this paper, these six controller parameters are selected as the optimization variables and are denoted $X = [X_1, X_2, X_3, X_4, X_5, X_6]^T$; the maximum number of iterations is 40. Because NPPs have special safety requirements, the control parameters should be constrained to the range in which the control system is stable. In practice, the constraint range can be selected based on engineering experience. A detailed description of the optimized variables and their constraints is given in Table 1, and the system parameters are given in Table 2.

Effectiveness Test
IK-SPSA does not change the basic framework of the SPSA method, so its convergence is consistent with that of the SPSA method. As a stochastic search algorithm, its effectiveness can be demonstrated by its design principle and by numerical statistical tests, an approach that is widely used and recognized in academia and industry [19][20][21]. Without loss of generality, the Monte Carlo method is used to randomly select the initial points at the different power levels, and the control system parameters are set to these initial values. The IK-SPSA optimization strategy is then applied in the experiment. The optimization process and results are shown in Figures 12-15.
(1) Trajectories of iteration. As Figure 12 shows, the ITAE index drops significantly as the optimization proceeds, which means that the performance of the control system improves significantly. Moreover, the whole optimization process needs only slightly more than 20 iterations, so IK-SPSA reaches the optimization goal within a limited number of iterations and, through intelligent iteration termination, avoids unnecessary iteration cost. Figure 12 also shows that IK-SPSA finds the optimal value faster than SPSA, which supports the effectiveness of the improved mechanism to a certain extent.
(2) Trajectories of step size change. Figure 13 shows that the improved step size of IK-SPSA decreases overall as the number of iterations increases, which avoids oscillation near the optimal value. The step size also fluctuates, indicating that IK-SPSA judges the current status of the optimization process and adjusts the step size accordingly. IK-SPSA can thus adaptively adjust the step size to find the optimal value.
(3) Trajectories of the optimized parameters. With the IK-SPSA optimization strategy, the control system parameters are improved in parallel as the iterative optimization proceeds, and the change of these key controller parameters is accompanied by continuous improvement of control system performance. The trajectories of the controller parameters are shown in Figure 14. As the optimization advances, the six key controller parameters undergo random parallel perturbation and continuous dynamic change.
(4) Comparison of the SGLCS level before and after optimization. When the iteration termination conditions are met, the optimization method terminates in time and returns an optimal control parameter point. To show the change in control system performance and to demonstrate the effectiveness of the optimization method, Figure 15 plots the transient response of the control system water level at the initial point and at the optimal working parameter point. The dotted line is the water level target setting value. The solid line curve is the water level curve before optimization. The dotted line is the water level curve after optimization. Compared with the control system before optimization, the maximum deviation of the optimized control system at the four power levels is reduced by 2%, 3.2%, 2.1%, and 27%, respectively; the peak time is reduced by 4.7%, 10%, 9.8%, and 34%, respectively; and the transient time is reduced by 50%, 60%, 75%, and 71%, respectively. The response performance of the optimized control system is greatly improved, which also reflects the effectiveness of the IK-SPSA method.
(5) IK-SPSA iterative trajectories at different initial points. To verify the effectiveness of the IK-SPSA method from different initial points, three different initial points were selected in the experiment. The optimized trajectories based on IK-SPSA from these three initial points are shown in Figure 16. All three optimization experiments show similar performance and complete within a few iterations, which also demonstrates the effectiveness of the optimization method from different initial points.
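The ITAE index used as the loss in these tests is the integral of time-weighted absolute error. A minimal sketch of how it could be computed from a sampled level response is given below; the function name, the trapezoidal integration, and the two first-order test responses are assumptions for illustration and are not taken from the paper's simulation.

```python
import numpy as np

def itae(time, level, setpoint):
    """Integral of t * |setpoint - level| dt, using the trapezoidal rule."""
    time = np.asarray(time, dtype=float)
    error = np.abs(setpoint - np.asarray(level, dtype=float))
    integrand = time * error
    # Manual trapezoid sum over consecutive samples.
    return float(np.sum(0.5 * (integrand[1:] + integrand[:-1]) * np.diff(time)))

# Hypothetical responses: a slow and a fast first-order approach to the setpoint.
t = np.linspace(0.0, 10.0, 1001)
slow = 1.0 - np.exp(-t / 2.0)
fast = 1.0 - np.exp(-t / 0.5)
itae_slow = itae(t, slow, 1.0)
itae_fast = itae(t, fast, 1.0)
```

Because the error is weighted by time, deviations that persist late in the transient are penalized more heavily, so a response that settles faster scores lower.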

Efficiency Test
The above tests show that both the traditional SPSA strategy and the improved IK-SPSA strategy are effective in optimizing the SG level control performance. To further compare the performance of SPSA and the improved IK-SPSA, the iterative termination mechanism was introduced into both algorithms, and batch experiments with the same initial point and with different initial points were carried out under otherwise identical working conditions.

Experimental Design
One thousand groups of experimental tests were conducted with the same control parameters as the initial point at 100% FP. The sequential Latin square sampling method was used to design the experiments at different initial points. Each parameter was divided into ten levels according to its value interval, and ten independent sampling points were randomly generated by the Monte Carlo method for a single Latin square sampling. The number of sequential Latin square batches was set to 100; that is, Latin square sampling was repeated 100 times with ten random samples each, giving a total of 1000 sample points. Starting from these sample points, the performance indexes of the optimization methods were collected to evaluate the performance of the two methods.
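The sampling design can be sketched as below, assuming it behaves like Latin hypercube sampling: within one batch, each of the ten levels of every parameter is used exactly once, and the point is placed at random inside its level. The function name and the unit bounds are hypothetical; the actual parameter ranges are those of Table 1.

```python
import numpy as np

def latin_batch(lower, upper, levels, rng):
    """Draw `levels` points so that each parameter uses every level once."""
    lower = np.asarray(lower, dtype=float)
    upper = np.asarray(upper, dtype=float)
    dim = lower.size
    # One random permutation of level indices per parameter.
    level_idx = np.column_stack([rng.permutation(levels) for _ in range(dim)])
    # Random position inside each level, scaled to the unit interval.
    unit = (level_idx + rng.random((levels, dim))) / levels
    return lower + unit * (upper - lower)

rng = np.random.default_rng(1)
lower = np.zeros(6)   # hypothetical bounds for the six controller parameters
upper = np.ones(6)
# 100 sequential batches of ten points each give 1000 initial points.
samples = np.vstack([latin_batch(lower, upper, 10, rng) for _ in range(100)])
```

Stratifying each batch this way spreads the initial points over the whole feasible range even when the batch is small.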

Experimental results show that, with the iterative termination control mechanism added, both IK-SPSA and SPSA have strong adaptive termination control ability, and the number of iterations presents a dynamic distribution. The IK-SPSA optimization process terminates with fewer iterations, so the optimization iteration cost is reduced compared with the traditional SPSA strategy. Figures 19 and 20 show the distribution of the average iteration final value for each batch of the Latin square sampling experiment. Because of the random perturbation characteristics of the SPSA algorithm, the iterative final value presents a dynamic distribution. Overall, the IK-SPSA optimization process reaches a lower iteration final value, and the optimization performance is improved compared with the traditional SPSA strategy.

Efficiency Analysis
The sequential Latin square experimental design provides a trend-trajectory perspective on how optimization performance changes. Figures 21 and 22 show the trajectories of the accumulated average iteration times of IK-SPSA. In the initial batches, the average iteration times may deviate significantly from the statistical values because the number of experimental samples is insufficient. As the Latin hypercube sampling (LHS) batches proceed, the number of samples covered increases gradually, and the cumulative performance indices converge gradually to a stable value. Under this cumulative effect, the average performance over the feasible region exhibits convergence, and the variability in the performance of the methods is averaged out, which reveals the relative performance of the two methods. Over 1000 groups of experiments, with the same initial point and with different initial points, the average iteration times of IK-SPSA are 31 and 27, respectively, and those of SPSA are 32.2 and 29, respectively. Compared with the typical SPSA method, the iteration times of the IK-SPSA strategy are thus reduced by 3.7% and 6.9%, respectively.
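The accumulated average and the quoted reductions can be checked with a few lines. This is an illustrative sketch: the helper names are hypothetical, and only the four averages quoted in the text (31, 27, 32.2, and 29) are taken from the paper.

```python
import numpy as np

def cumulative_mean(values):
    """Running average after each batch, as plotted in an accumulated-average curve."""
    values = np.asarray(values, dtype=float)
    return np.cumsum(values) / np.arange(1, values.size + 1)

def relative_reduction(baseline, improved):
    """Fractional reduction of `improved` relative to `baseline`."""
    return (baseline - improved) / baseline

# Average iteration counts quoted in the text (SPSA first, IK-SPSA second).
same_start = relative_reduction(32.2, 31.0)    # same initial point
varied_start = relative_reduction(29.0, 27.0)  # different initial points
```

Rounded to one decimal place, the two ratios reproduce the 3.7% and 6.9% reductions reported above.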
Compared with the typical SPSA method, the iterative final value of the IK-SPSA strategy is reduced by 2.6% and 1.3%, respectively. The IK-SPSA-based method therefore outperforms the traditional SPSA method and obtains better control parameters.

Conclusions
A revised simultaneous perturbation stochastic approximation (SPSA), IK-SPSA, was proposed in this study to optimize the SGLCS controller parameters via only direct control system performance measurements. The historical information generated during the optimization process was utilized to enhance the optimization efficiency. Hence, two critical mechanisms were constructed to form this novel data-driven optimization strategy. The real time step size tuning mechanism could sense the current optimization status and thus adaptively tune the next iteration step size. The termination control mechanism could evaluate the progress of the optimization from a holistic perspective and thus could terminate the optimization intelligently to avoid unnecessary iteration costs. Moreover, the IK-SPSA strategy was adjusted adaptively according to the characteristics of the performance optimization of the SGLCS. This strategy was verified on a typical SGLCS simulation platform. A series of systemic experiments were designed and conducted. The effectiveness and