Comparison of Deep Learning and Deterministic Algorithms for Control Modeling

Controlling nonlinear dynamics arises in many engineering fields. We present efforts to model control of the forced van der Pol system using physics-informed neural networks (PINN), compared against benchmark methods: idealized nonlinear feed-forward (FF) control, linearized feedback (FB) control, and combined feed-forward plus feedback (C) control. The aim is to implement circular trajectories in the state space of the van der Pol system. A designed benchmark problem is used to test the behavioral differences of the disparate controllers, and the controlled schemes are then investigated for systems with various degrees of nonlinearity. All methods exhibit a short initial fluctuation when started from arbitrary initialization points. The feed-forward control successfully converges to the desired trajectory, and PINN executes good control with higher stochasticity observed for higher-order terms in the phase portraits; in contrast, linearized feedback control and combined feed-forward plus feedback control fail. Varying the trajectory amplitudes revealed that feed-forward, linearized feedback, and combined feed-forward plus feedback control all fail for the unit-amplitude trajectory. Traditional control methods display persistent fluctuations in the higher-order terms. For some nonlinearities, PINN failed to implement the desired trajectory, instead becoming "trapped" in a phase trajectory of small radius, yet idealized nonlinear feed-forward successfully implemented the control. PINN generally exhibits lower relative errors across the targeted trajectories. However, PINN also carries an evidently higher computational burden than traditional control theory methods, with control times at least 30 times longer than the benchmark idealized nonlinear feed-forward control. This manuscript offers a comprehensive comparative study to inform future controller deployment across deterministic and machine learning approaches.


Introduction
As early as (at least) the late 19th century, scientists made efforts to design and implement control systems to deal with instability, oscillation, and various nonlinear and chaotic phenomena [1]. Maxwell studied valve flow governors [2], while more recently Cartwright used the van der Pol equation in seismology to model the two plates in a geological fault [3]. FitzHugh [4] and the authors of [5] used the equation to model action potentials of neurons. Systems exhibiting strong nonlinear behavior are difficult to control. The standard practice of basing controls on a linearization of the system is often rendered ineffective because linearization eliminates the nonlinear features. Machine learning is one seemingly applicable approach, owing to its ability to learn and control nonlinear features.
Meanwhile, physics-informed neural networks (PINNs), originally introduced in 2017 [12], encode differential equations into the loss of the neural network as a soft constraint, enabled by automatic differentiation [13], allowing fast, efficient learning of physical mappings with relatively little labeled data. One well-known application is in fluid mechanics [14,15]. An aspect not well known or studied is the implementation of control signals for nonlinear systems using PINNs, enabled by inserting the control signals and positional constraints into the loss. This approach is known as physics-informed deep operator control (PIDOC) [16]. In particular, as shown in this work, PIDOC can successfully implement control of nonlinear van der Pol systems, yet fails to converge to the desired trajectory when the system's nonlinearity is large.

Deterministic algorithms
In 2017, Cooper et al. [17] illustrated how an idealized nonlinear feed-forward very effectively controlled highly nonlinear van der Pol systems with fixed parameters, while [16] adopted Cooper's method as the benchmark for comparison, as is done in this manuscript. Based on the work presented here on NN-based control and deterministic algorithms, it can be deduced that challenging problems remain open, particularly regarding the control of highly nonlinear systems. The "grandiose objectives" referred to by Sir Lighthill [6] remain unfulfilled, and this insight guides both industry and academia in controller design and system stability analysis.
There have also been attempts at comparing classical PID controllers with neural networks [18], refining PID controllers with neural networks [19,20], or inserting neural networks into traditional controllers in general [21,22,23]. Hagan et al. [21] provide a quick overview of neural networks and explain how they can be used in control systems. Nguyen et al. [22] demonstrated that a neural network can learn of its own accord to control a nonlinear dynamic system, while Antsaklis [23] evaluated whether neural networks can provide better solutions to old control problems, or perhaps solutions to control problems that have so far proved intractable.
Inserting nonlinear approximation by neural networks to refine control and stability is not new; it is a type of "learning control" dating back to the 1980s and 1990s. Notwithstanding, as already noted in [16], building control frameworks solely with neural networks is relatively rare. Acknowledging this deficiency in related work, this manuscript provides a fairly comprehensive analysis of PIDOC [16] as well as the original methods proposed by Cooper et al. [17] on the van der Pol system, a nonlinear representation of oscillating circuits, among other example applications. A benchmark is designed considering both works, along with an analysis of the system behavior. Afterward, the desired trajectories are modified from the benchmark problem to check how the control methods differ, by testing their first- and second-order phase portraits.
In Section 2 of this manuscript we briefly formulate the problem, with a brief introduction to the van der Pol system and control schemes. In Section 3 we introduce the control approaches: physics-informed deep operator control (Section 3.1), comprising deep learning (Section 3.1.1) and physics-informed control (Section 3.1.2); and control theory algorithms (Section 3.2), with linearized feedback control (Section 3.2.1), idealized nonlinear feed-forward control (Section 3.2.2), and combined control (Section 3.2.3); we then briefly describe how we compare the methods in Section 3.3. Section 4 presents results comparing the control schemes: Section 4.1 shows how the methods differ on the benchmark problem; Section 4.2 shows how changing the desired trajectories alters the controlled schemes; Section 4.3 shows how varying the system nonlinearity changes the control results.

Controls
Figure 1: A basic schematic diagram of a control process. The human-desired signal command is input to the system through the controller, illustrated in the red box, which passes the control to the targeted system in a feed-forward/feedback control loop. Note that the "chaos" from the system, in the blue box, is passed to the controller through the sensor. The final controlled dynamics are output to different applications, as illustrated in the left schematic marked "controlled dynamics". See text for a detailed description.
Figure 2: The inherent dynamics of the van der Pol equation [27]. The light green line indicates the limit cycle, the manifestation of the strong nonlinearity of the inherent van der Pol dynamics. The red lines indicate trajectories beginning at various initial points, which all eventually fall onto the inherent limit cycle. The blue arrows indicate the phase field of the inherent van der Pol dynamics, indicating the flow directions.

Problem Formulation
As introduced in Section 1, the main goal is a comparison of control methods. A basic system schematic is illustrated in Figure 1. The command signal is calculated by the controller, which passes the control commands to the system; the system's nonlinear behavior is sensed and fed back using a sensor (not illustrated in the schematic). As the control loop stabilizes, the controlled dynamics are output for real-world applications. This manuscript mainly focuses on the controller (red box in Figure 1); PIDOC and the other control methods are all codified in the controller box.
The van der Pol system was adopted to test the implementation of the control signals, and a phase portrait of the van der Pol system is illustrated in Figure 2, where the system is arbitrarily initialized. Given arbitrary initial points, the trajectory always becomes "entrapped" on a nonlinear track (called a limit cycle), while control methods strive to release the trajectory from this trapped path along the limit cycle and drive it to some desired, commanded behavior. Such a system was first discovered by van der Pol when investigating oscillating circuits [24,25]. van der Pol [24] introduced an oscillatory system with negative damping. Together with van der Mark [25], he also illustrated how to design an electrical system in which alternating currents or potential differences occur with a frequency that is a whole multiple of that of the forcing function. The system takes the form

ẍ(t) − µ(1 − x²(t)) ẋ(t) + x(t) = 0,   (1)

where, in the original circuits formulation, x(t) is the current measured in amperes, as the rate of change of the charge [26], and µ is a scalar parameter indicating the nonlinearity and the strength of the negative damping [16]. Henceforth, x(t) is referred to as position.
To test the proposed methods, control signals are formulated and passed forward to the nonlinear system as commands. The simulated system duplicates the system introduced in [16], where the SciPy routine odeint solves the equations, providing data to feed the training of PIDOC. The van der Pol equation was solved over the time domain t = [0, 50], interpolated with 5000 points. The error control parameters rtol and atol are 10⁻⁶ and 10⁻¹⁰, respectively [28].
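As a concrete illustration, the simulation setup described above can be sketched with SciPy's odeint under the stated solver tolerances. This is a minimal sketch; the function and variable names are ours, not from [16] or [28].

```python
import numpy as np
from scipy.integrate import odeint

def van_der_pol(state, t, mu=1.0, forcing=lambda t: 0.0):
    """Van der Pol dynamics: x'' - mu*(1 - x^2)*x' + x = F(t) (unforced by default)."""
    x, x_dot = state
    x_ddot = mu * (1.0 - x**2) * x_dot - x + forcing(t)
    return [x_dot, x_ddot]

# Time domain t = [0, 50] interpolated with 5000 points, as in the text.
t = np.linspace(0.0, 50.0, 5000)
trajectory = odeint(van_der_pol, [1.0, 0.0], t, rtol=1e-6, atol=1e-10)
```

Starting from an arbitrary initial point such as (1, 0), the solution falls onto the inherent limit cycle shown in Figure 2.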

Methodology and Materials
This section briefly outlines the theoretical foundation of the physics-informed neural network-based algorithm and the alternative based on traditional control theory. The methodology of subsequent numerical experiments used for testing the methods is also introduced.

Deep learning
Physics-informed deep operator control is enabled by the general deep neural network framework, where for the van der Pol system the position is inferred based on the input time domain in accordance with Equation (2).
A supervised machine learning framework is defined using external training data as a formulation minimizing a loss function L, so that the neural network can capture data features through an optimization process; in traditional neural network approaches, L is usually the difference (error) between the neural network predictions and the training data. Let L = L(t, p) denote the loss function, where t is the input time series and p is the parameter vector of the neural network together with the quantities in the I and D blocks. Since no external constraints or bounds are enforced, the optimization problem takes the unconstrained form of Equation (3) [16]:

min_p L(t, p).   (3)

Minimizing L requires reiterating the neural network, a process referred to as "training". The limited-memory Broyden–Fletcher–Goldfarb–Shanno optimization algorithm, a quasi-Newton method (L-BFGS-B in TensorFlow 1.x) [29,30], is adopted. Optimization is carried out over iterations looping from the blue box (neural network) to the purple box (I & D) to the red box (L) displayed in Figure 3. The maximum number of iterations is set to 2 × 10⁵. In the PIDOC formulation, L is calculated from the mean squared errors of the encoded information, as construed in Section 3.1.2.

Physics-informed control
According to [16], the control function is enabled by encoding the control signal into the loss function of the neural network, inspired by the formulation of physics-informed neural networks (PINNs) [12], where the loss function is computed through mean squared errors (MSE), elaborated in Equation (4):

L = MSE_NN + MSE_I + MSE_D,   (4)

where MSE_NN, MSE_I, and MSE_D stand for the neural network generation error, the initial position loss, and the control signal loss, respectively, computed as in Equation (5):

MSE_NN = (1/N) Σᵢ |x_pred(tᵢ) − x_train(tᵢ)|²,
MSE_I = |x⁰_pred − x⁰_D|²,   (5)
MSE_D = (1/N) Σᵢ |x_pred(tᵢ) − x_D(tᵢ)|²,

where x_D denotes the desired trajectory and x⁰_D its initial position; x_pred is the neural network predicted output; x_train is the given training data (from the system simulation); and x⁰_pred denotes the initial position of the neural network predicted output. Detailed formulations are elaborated in [16].
To impose the trigonometric signal, we simply impose the form of x_D in Equation (6):

x_D(t) = Λ sin(t).   (6)
Based on such an x_D, the output phase portrait (ẋ(t) versus x(t)) is expected to be a circular trajectory. To implement different amplitudes Λ of the desired trajectory, we modify Equation (5) to encode the amplitude information into the neural network losses, given the same training data, resulting in Equation (7).
The above equations represent the general formulation of PIDOC. A detailed graphical representation is illustrated in Figure 3 B: the control system (deep blue box) first generates nonlinear data that feeds into the neural network, which forwards its output; the control signals, shown in the deep red box, are encoded into the loss function through automatic differentiation; and the training of the neural network is reiterated until the control signal is fine-tuned for the system output.
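The composite loss structure described above can be sketched as follows, assuming the desired trajectory x_D = Λ sin(t). This is a simplified NumPy illustration: the names are ours, and the actual PIDOC loss also involves derivative terms obtained via automatic differentiation inside the neural network graph.

```python
import numpy as np

def pidoc_loss(x_pred, x_train, t, amplitude=5.0):
    """Composite PIDOC-style loss: NN data error + initial-position error
    + control-signal (desired-trajectory) error, each a (mean) squared error.
    Illustrative sketch only; see [16] for the full formulation."""
    x_desired = amplitude * np.sin(t)                 # x_D = Lambda * sin(t), Eq. (6)
    mse_nn = np.mean((x_pred - x_train) ** 2)         # fit to simulated training data
    mse_init = (x_pred[0] - x_desired[0]) ** 2        # initial-position constraint
    mse_signal = np.mean((x_pred - x_desired) ** 2)   # control-signal constraint
    return mse_nn + mse_init + mse_signal
```

A prediction that exactly matches both the training data and the desired trajectory drives this loss to zero, which is the fixed point the L-BFGS-B training iterates toward.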

Deterministic Control Algorithms
For the alternative application of control theory, the general framework begins with the modification of Equation (1) into the forced form of Equation (8), where the controller gains are calculated through the Riccati equation, yielding a controller known as the linear quadratic regulator (LQR) [17]:

ẍ(t) − µ(1 − x²(t)) ẋ(t) + x(t) = F(t),   (8)
where F(t) is the forcing term exerted on the nonlinear system to apply the control. By modifying F(t), different types of control are implemented; in our approach we adopt nonlinear feed-forward (FF), linearized feedback (FB), and combined (C) controls, elaborated in Sections 3.2.2, 3.2.1, and 3.2.3, respectively.

Linearized feedback control
In control theory, a common first step in control design is to linearize the nonlinear dynamic equations and then design the control based on that linearization. For the van der Pol dynamics, Equation (8) can be linearized and reduced to Equation (9), expressed in state-variable form, from which state-space trajectories are displayed on phase portraits [17].
The infinite-horizon cost function is given by Equation (10),

J = ∫₀^∞ [x(t)ᵀ Q x(t) + u(t)ᵀ R u(t)] dt,   (10)

and the goal is to find the optimal cost-to-go function J*(x) that satisfies the Hamilton–Jacobi–Bellman Equation (11),

0 = min_u [xᵀ Q x + uᵀ R u + (∂J*/∂x)(Ax + Bu)]   ∀x,   (11)

whose minimizing control is the linear state feedback of Equation (12),

u* = −R⁻¹ Bᵀ S x,   (12)

which necessitates the solution S of the algebraic Riccati Equation (13),

0 = S A + Aᵀ S − S B R⁻¹ Bᵀ S + Q,   (13)

where A and B are the matrices of the linearized state-space dynamics used in the linear-quadratic optimization, leading to a feedback controller with linear-quadratic-optimal proportional and derivative gains K_p and K_d. The closed-loop dynamics are established by Equation (14), where the van der Pol forcing function F(t) is a proportional-derivative (PD) controller whose gains used in this manuscript are from [17].
Adopting the linearized feedback control of Cooper et al. [17], the forcing in Equation (8) can thence be expanded in the form

F_FB(t) = K_p (x_d − x) + K_d (ẋ_d − ẋ),

where x_d is the desired trajectory, and K_d and K_p are the derivative and proportional gains, respectively. Similar to our approach in Equation (6), the desired control trajectory writes x_d = Λ sin(t).
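The gain computation in Equations (10)–(13) can be sketched as follows for the van der Pol system linearized about the origin. The weighting matrices Q and R here are illustrative identity choices, not the values used by Cooper et al. [17].

```python
import numpy as np
from scipy.linalg import solve_continuous_are

# Linearized van der Pol about the origin (mu = 1): x'' - mu*x' + x = u
mu = 1.0
A = np.array([[0.0, 1.0], [-1.0, mu]])
B = np.array([[0.0], [1.0]])
Q = np.eye(2)   # state weighting (illustrative choice)
R = np.eye(1)   # control weighting (illustrative choice)

# Solve the algebraic Riccati equation (13), then form the LQR gain of Eq. (12).
S = solve_continuous_are(A, B, Q, R)
K = np.linalg.inv(R) @ B.T @ S   # K = [K_p, K_d]
K_p, K_d = K[0, 0], K[0, 1]

def feedback_force(x, x_dot, x_d, x_d_dot):
    """PD feedback force F_FB = K_p*(x_d - x) + K_d*(x_d' - x')."""
    return K_p * (x_d - x) + K_d * (x_d_dot - x_dot)
```

By LQR theory, the resulting closed-loop matrix A − BK is Hurwitz for the linearized dynamics, even though the open-loop linearization is unstable due to the negative damping.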

Nonlinear feed-forward control
In idealized nonlinear feed-forward control, the forcing term F(t) = F_FF(t) has the form of the original van der Pol dynamics evaluated on the desired trajectory x = x_d:

F_FF(t) = ẍ_d − µ(1 − x_d²) ẋ_d + x_d,

where x_d is the same desired signal as in Equation (14). By implementing x_d in the forcing term, the control is applied to the van der Pol system; this is termed nonlinear feed-forward control since the executed forcing term possesses the form of the idealized nonlinear trajectory.
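For x_d = Λ sin(t), the feed-forward forcing term can be evaluated in closed form, since the derivatives of the desired trajectory are analytic. A minimal sketch (function name is ours):

```python
import numpy as np

def feedforward_force(t, amplitude=5.0, mu=1.0):
    """Idealized nonlinear feed-forward force: the van der Pol left-hand side
    evaluated on the desired trajectory x_d = Lambda * sin(t)."""
    x_d = amplitude * np.sin(t)
    x_d_dot = amplitude * np.cos(t)
    x_d_ddot = -amplitude * np.sin(t)
    return x_d_ddot - mu * (1.0 - x_d**2) * x_d_dot + x_d
```

With this forcing, x(t) = x_d(t) is an exact solution of the forced van der Pol equation, which is why the idealized feed-forward converges when the state reaches the desired orbit.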

Combined control
To apply the idealized nonlinear feed-forward trajectory combined with the linearized feedback, the forcing term of the combined control simply sums the feed-forward and feedback forcing terms, F_C = F_FF + F_FB; F_C is then applied to the van der Pol system in the same manner. The basic framework of the controls is shown in Figure 3 A: the signal command in the deep red box (x_d in our equations) is first input to the automatic trajectory generator, forwarded to the gains, and then forwarded to either the feed-forward control (F_FF) in the lower light blue box, the feedback control (F_FB) in the upper dark blue box, or the combined approach. The control signals are tuned through the light blue tuner box on the right, which controls the forcing term applied to the nonlinear system as indicated in the solid blue box on the right. After the desired control signals are exerted, the output signals are first fed back to the gains as full-state feedback, indicated in the gray box; the final controlled dynamics are output after the workflow executes iteratively.
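Putting the pieces together, the combined control can be sketched as a single closed-loop simulation. The PD gains below are illustrative placeholders, not the linear-quadratic-optimal gains of [17], and the benchmark values Λ = 5, µ = 1 are assumed.

```python
import numpy as np
from scipy.integrate import odeint

mu, Lam = 1.0, 5.0     # benchmark nonlinearity and desired amplitude
K_p, K_d = 1.0, 2.0    # illustrative PD gains, not those of [17]

def combined_vdp(state, t):
    """Van der Pol under combined control: F_C = F_FF + F_FB."""
    x, x_dot = state
    x_d, x_d_dot, x_d_ddot = Lam * np.sin(t), Lam * np.cos(t), -Lam * np.sin(t)
    f_ff = x_d_ddot - mu * (1.0 - x_d**2) * x_d_dot + x_d    # feed-forward term
    f_fb = K_p * (x_d - x) + K_d * (x_d_dot - x_dot)         # feedback term
    x_ddot = mu * (1.0 - x**2) * x_dot - x + (f_ff + f_fb)
    return [x_dot, x_ddot]

t = np.linspace(0.0, 50.0, 5000)
sol = odeint(combined_vdp, [1.0, 0.0], t, rtol=1e-6, atol=1e-10)
```

Setting f_fb (or f_ff) to zero in the sketch recovers the pure feed-forward (or pure linearized feedback) controller, which is how the three deterministic variants compared in this manuscript differ.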

Comparison and Estimation
To conduct a fair and comprehensive comparison of the proposed methods, we consider: (i) systematic analysis of the benchmark problem mentioned in Section 1; (ii) trajectory convergence for different amplitudes of the desired trajectory, signified by Λ in Equation (6); and (iii) systems with different nonlinearities, signified by µ in Equation (1). For the benchmark behavior analysis, considering both the work of Zhai & Sands [16] and Cooper et al. [17], we pick Λ = 5, µ = 1 as a system with low nonlinearity, for which the PIDOC NN has a 6 × 30 structure. The initial point is (1, 0). For systems of different desired amplitudes, Λ is varied over 1, 3, 5, 7, 9. For systems of different nonlinearities, µ is varied over 1, 3, 5, 7, 9, 10. PIDOC was run in Google Colab [31] using Python 3.6 with TensorFlow 1.x [30]. FF, FB, and C were written in MATLAB R2021A and executed with Simulink.

Benchmark analysis
The results of the benchmark analysis are shown in Figure 4, where sub-figures A and B show the first- and second-order phase portraits of the schemes controlled by PIDOC, FF, FB, & C, marked with dashed lines in different colors as elaborated in the caption, compared with the inherent van der Pol dynamics and the desired trajectory marked by black and pink solid lines, respectively. Sub-figures C to D illustrate the time evolution of the zeroth-, first-, and second-order derivatives of the position x(t), with the same color scheme as in A and B. Given the benchmark problem, it can be deduced that all the control theory methods exhibit strong fluctuations at the initial stage of control; FF converges to the desired trajectory successfully, as indicated by the deep blue dashed lines, whereas both FB and C fail. Another interesting point is that all the traditional control algorithms exhibit stronger fluctuations in the higher-order terms at the beginning stage; FF then converges to the trajectory with better control effects than PIDOC, but FB and C display such fluctuations throughout. We offer the following explanation: the errors generated by the linearization of the van der Pol equation accumulate and cause the persistent fluctuations shown by the light blue and red lines in Figure 4. Admittedly, FF implements the control with higher accuracy in the higher-order terms than PIDOC; note, however, that since FF only forwards control signals, it can be considered an open-loop system, and in real-world practice even trivial noise accumulates, leading to the infeasibility of FF.

Trajectory amplitude
The results of the controlled trajectories in the first- and second-order phase portraits are shown in Figures 5 and 6, respectively. It can be discerned from Figure 5 A and B that both PIDOC (symbolized as ΠD in the figures) and FF are able to implement control, with the exception in B1 that FF failed to control the system when Λ = 1. Similar to the benchmark problem, both FB and C failed to implement the controls, showing highly fluctuating behavior in Figure 5 C and D. An interesting phenomenon from D1 to D5 is that with increasing trajectory amplitude we observe better convergence for the combined (C) control; we hence propose that for higher desired trajectory amplitudes, the effect of linearization in the feedback is reduced for the van der Pol system. Figure 6 A reveals the stochastic-approximation nature of PIDOC: the learning-based control executes control signals based on randomized sampling for trajectory convergence. Corresponding to Figure 5, B1 shows the failure of FF control when Λ = 1, whereas B2 to B5 show that the second-order phase portraits display higher fluctuations, as also seen in Figure 6 C and D. Figure 6 B also shows a strongly discretized form of FF control, as illustrated by the sparse points. In both Figures 5 and 6, the control contours of the FB and C controls (sub-figures C and D) show an increased control density on the horizontal edges (the x(t) direction), indicated by the denser points.
The total computational burden of the four methods is shown in Table 1: the PIDOC framework shows an evidently larger computing time than FF, FB, and C; generally, FF executes the fastest control and C exhibits the longest control time among the tested control theory algorithms. We offer the following explanations: (1) the PIDOC framework is based on training the NN, where the approximation of nonlinear data takes far longer than simply implementing the control commands; (2) FF directly implements the control signals, where the elimination of feedback and error adjustment reduces computation time; (3) the combination of feed-forward and feedback requires estimation of the route execution and linearizations, consuming more time. Based on the computation times, we can discern that although PIDOC exhibits more stable control implementations, the drawback is also evident: the considerably longer training time required to implement the control.
Figure 6: The acceleration-position portraits of the different controlled schemes, with the same marking colors as in Figure 5. Note that A(1, 2, ..., 5) to D(1, 2, ..., 5) are the same as in Figure 5: the implementations of ΠD, FF, FB, & C for targeted trajectory amplitudes Λ = 1, 3, ..., 9.

Nonlinear effects
The results of the different controls for systems of different nonlinearities, with a fixed desired trajectory amplitude Λ = 5, are shown in Figure 7. As reported by Zhai & Sands [16], PIDOC fails to implement control for systems of high nonlinearity, becoming "trapped" in a smaller-radius trajectory. The FF control was implemented successfully, with strong fluctuations observed for high nonlinearities from B1 to B5, and a failed implementation when µ = 10, as shown in B6, which can be considered a nonlinearity threshold. Both FB and C again failed to execute the control, as in Figures 5 and 6. Notably, both implemented control theory methods show an evidently higher data density along the horizontal edges, from which we can infer a property of the control theory methods: stronger control imposition near the edges, corresponding to the wave crests and troughs in the time evolution of the position.
Figure 7: The phase portraits of the different controlled schemes, with the same marking colors as in Figure 5. A1 to A6 show the controls by physics-informed deep operator control (ΠD) for van der Pol systems with nonlinearities µ = 1, 3, 5, 7, 9, 10. B1 to B6 show the feed-forward (FF) controls, C1 to C6 the feedback (FB) controls, and D1 to D6 the combined (C) controls, for the same values of µ.
The second-order phase portraits are shown in Figure 8: among the control theory methods, evidently higher nonlinearities are observed for C compared with FF and FB, and a more discrete point distribution indicates larger steps in the control implementation. Observing Figure 8 A, we discern that the system nonlinearity is very high, as indicated by the black solid line compared with the white dashed line of the desired trajectory. Comparing B to D, we observe that for systems of higher nonlinearity, the control displays extremely strong fluctuations at the beginning stage. From this phenomenon we deduce another property of the control theory methods: the control implementation enlarges the nonlinear signals with larger control-discretization steps. To present a more detailed analysis of Figures 7 and 8, Figure 9 provides a zoomed view of the control schemes for both first- and second-order phase portraits. Interestingly, vortex-like structures are observed in the first-order phase portraits for both PIDOC and C along the edges of the circular trajectory. Figure 9 B6 shows in detail how FF fails the control imposition: an oscillation along the circle causes a vertical "split" of the controlled trajectory, a trend already visible in Figure 9 B5. Figure 9 C clarifies a phenomenon already observed and discussed: an increased data density along the edges of the desired control schemes indicates stronger control implementation along the edges. The computational burden shown in Table 2 displays trends similar to Table 1: PIDOC exhibits an evidently higher computation time, attributed to NN training, and C exhibits a higher control time than FF and FB. Another interesting phenomenon is that with increasing system nonlinearity, PIDOC shows a decreasing computation time.
Corresponding to Figures 7, 8, and 9, we propose the following explanation: as the PIDOC-controlled schemes become entrapped in a trajectory of smaller radius, the NN training stops at an earlier stage because the optimizer (L-BFGS-B) "discerns" that further iterations will not keep decreasing the loss, which leads to a lower computation time but lower-quality control. To better quantify the differences in computational burden, Table 3 takes the nonlinear feed-forward control employed by Cooper et al. [17] as a benchmark: PIDOC displays an evidently higher computational burden than FF, from at least 30 times the benchmark time up to more than 100 times.
To quantify the control errors, Table 4 compares the control qualities based on absolute errors. The average absolute relative error of a control signal is computed as the mean over all sampled points of |x − x_D|/|x_D|, where x is the controlled output and x_D the desired trajectory. It can be observed from Table 4 that PIDOC generally exhibits lower control errors than the traditional control methods across the different trajectories. For the different nonlinearities, corresponding to Figure 9, the idealized nonlinear feed-forward control exhibits better control quality.
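The error metric used above can be sketched as follows. This is an illustrative formulation (the exact expression is in the manuscript's Equation for Table 4); the small eps guard at zero-crossings of the desired signal is our own addition to keep the division well defined.

```python
import numpy as np

def mean_abs_relative_error(x_controlled, x_desired, eps=1e-8):
    """Average absolute relative error between controlled and desired signals.
    eps guards against division by zero where the desired signal crosses zero."""
    return np.mean(np.abs(x_controlled - x_desired) / (np.abs(x_desired) + eps))
```

Applied per controller and per trajectory, this yields the scalar entries compared in Table 4.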

Conclusion
This paper tackles the control-modeling problem for the nonlinear dynamics of the van der Pol system by comparing deep learning with traditional deterministic algorithms. The key idea is to elaborate the main differences by conducting a comprehensive comparison and benchmark of the recently proposed physics-informed neural network control against other deterministic algorithms. We first design a benchmark problem to test the system response of the different methods. The desired trajectory and the system nonlinearity are then varied to check the responses of the different controls. The computational burdens of the different methods are also considered.
For the benchmark analysis, the results indicate that all the control theory algorithms exhibit strong fluctuations, which can be interpreted as an enlargement of the inherent nonlinear van der Pol dynamics, with FF successfully implementing the control while the rest fail. This "nonlinearity enlargement" effect is more obvious for higher-order terms. PIDOC exhibits a stochastic nature, attributable to the nature of deep learning inference, as reported by Zhai & Sands [16]. When changing the trajectory amplitudes, an interesting phenomenon is that FF failed to converge to the trajectory when Λ = 1. Also, a higher control-signal implementation density is observed along the horizontal edges of the first-order phase portraits, unveiling that control theory imposition on van der Pol systems executes stronger controls along the signal waves' crests and troughs. An evidently higher computational burden is observed for PIDOC in comparison with the control theory methods. We explain this by the nature of NN learning: the iterative updating of the NN weights and biases takes longer than the direct execution of the control signal. For van der Pol systems with different nonlinearities, FF fails the control when µ = 10, whereas PIDOC also failed to implement controls at high nonlinearities, as the controlled schemes became "trapped" in smaller trajectories. The "nonlinearity enlargement" effect is again observed in the higher-order phase portraits of the control theory algorithms. An interesting vortex-like structure of the controlled schemes, originally reported by Zhai & Sands [16], is also observed for the C controls. The evidently higher computation time for PIDOC is likewise observed across the different trajectories, while PIDOC's computational burden generally decreases for systems of higher nonlinearity.
The proposed comparison can guide the future implementation of deep learning-based controller designs and industrial selections.

DATA AVAILABILITY
All data and code will be made publicly available upon acceptance of the manuscript through https://github.com/hanfengzhai/PIDOC. The Simulink files for the deterministic control methods are available upon reasonable request to the corresponding author; they were originally published by Cooper [17].