A Hybrid Neural Ordinary Differential Equation Based Digital Twin Modeling and Online Diagnosis for an Industrial Cooling Fan

Abstract: Digital twins can reflect the dynamical behavior of the identified system, enabling self-diagnosis and prediction in the digital world to optimize the intelligent manufacturing process. One of the key benefits of digital twins is the ability to provide real-time data analysis during operation, which can be used to monitor the condition of the system and predict failures. This allows manufacturers to resolve a problem before it happens. However, most digital twins are constructed using discrete-time models, which cannot describe the dynamics of the system across different sampling frequencies. In addition, the high computational complexity caused by significant memory storage and large model sizes makes digital twins challenging to use for online diagnosis. To overcome these issues, this paper proposes a novel structure for creating the digital twins of cooling fan systems by combining neural ordinary differential equations with physical dynamical differential equations. Evaluated on simulation data, the proposed structure not only models the system accurately compared to other digital twin methods but also requires fewer parameters and a smaller model size. The proposed approach has also been demonstrated on experimental data, where it is robust to measurement noise and proves to be an effective solution for online diagnosis in the intelligent manufacturing process.


Introduction
Cooling fan systems have been widely used in various industrial fields, such as nuclear power plants [1], automobiles [2], and workstation computers [3]. They are often used to maintain temperature and ensure an optimal operation environment. Due to their low development cost, efficient heat dissipation, and air circulation, a cooling fan system is often regarded as the primary solution for environmental temperature control use [4,5]. Despite playing an important role in the manufacturing process, the rotary mechanism of the cooling fan is highly susceptible to failure, making it one of the top 10 failing components in electronic products [6]. Therefore, it is crucial to monitor the condition of the cooling fan system to prevent drastic changes in the operation environment.
To achieve accurate condition monitoring during the manufacturing process, it is necessary to create a digital twin of the system, which provides a virtual representation of the identified physical entity. Using the data from sensors, digital twins can update their state to mirror the dynamic behavior and the condition of the system in real time. By incorporating other analysis tools, the digital twin model can be used not only for condition monitoring, but also for prognosis, optimization, and control in the digital world [7]. Therefore, digital twins are also regarded as the backbone of Industry 4.0 [8-11], providing the information needed to enhance productivity and to make intelligent manufacturing smarter, more efficient, and more convenient [12].
Digital twins can be created using various methods, including physical models and data-driven approaches. Physical methods require domain knowledge of the identified system. For an industrial cooling fan, aerodynamics, blade geometry, and control theory need to be considered. Several previous studies have analyzed its dynamics [4,5,13]. Since the digital twin is modeled using physical principles, it is more robust and makes anomalies easier to interpret. However, the result may still deviate considerably from the identified system because the physical equation is usually built on ideal conditions. Therefore, the digital twin may not be able to reflect the dynamic behavior in the real world.
In contrast, data-driven methods directly employ feature extraction to find the dynamical pattern in the historical data. Common methods include observer/Kalman filter identification [14,15], dynamic mode decomposition (DMD) [16,17], and sparse identification of nonlinear dynamics (SINDy) [18,19]. The most well-known and commonly used method is the neural network (NN), which provides a structure that can approximate any system dynamics based on the universal approximation theorem [20]. As the NN has evolved, different layers and structures have been developed. One NN type is the recurrent neural network (RNN), which is renowned for modeling sequential data. Since system dynamics signals can also be seen as time-series data, the RNN has been shown to perform well in system modeling [21-23]. However, applying an RNN or any other data-driven method requires a large amount of informative input/output data to avoid unstable model predictions [24,25]. Furthermore, most system dynamics are better described in the continuous-time domain. For these reasons, robust modeling with an RNN leads to excessively large model sizes and memory usage, which may pose a challenge in applying digital twins in real time.
To explore an alternative way to create the digital twins, this paper proposes using neural ordinary differential equations (neural ODEs) to approximate the governing equations of the system. The concept of neural ODEs establishes a viewpoint to connect the deep learning model with differential equations [26], offering a more flexible description and reducing the need for numerous parameters for modeling system dynamics. In addition, the continuous-time structure also allows the combination of pure physical models with neural ODEs, which can make the digital twin more accurate and interpretable [27].
This paper makes significant contributions to the field of digital twin modeling and intelligent manufacturing in terms of the following. (1) The neural ODE-based digital twin can approximate the governing equations of the industrial cooling fan system. By employing neural ODEs, the digital twin can achieve smoother system evolution, reducing abnormal discontinuities, and accommodate data from sensors with varying sampling frequencies for condition monitoring. (2) The proposed digital twin framework demonstrates impressive modeling accuracy while utilizing fewer parameters. This efficiency enables real-time condition monitoring during the manufacturing process, making it a practical and viable solution for industrial applications. (3) By incorporating the model with physically informed terms, the digital twins can be closer to real dynamical behavior and be more robust to unforeseen dynamical patterns compared to pure data-driven methods.
The remainder of the paper is organized as follows: Section 2 introduces the methodology used in this paper. The previous study on neural ODE modeling is briefly explained in Section 3. In Section 4, a comparison study is conducted to assess the fitting performance of the proposed digital twins in comparison to others. Section 5 demonstrates the practical applications of the proposed digital twin, where it effectively monitors the condition of the cooling fan and accurately detects the anomalies. The paper finally concludes in Section 6.

Cooling Fan System Dynamics
An industrial cooling fan system typically comprises three essential components: a driving circuit, a rotary mechanism, and fan blades. Figure 1 illustrates a cooling fan system driven by a DC motor. The driving circuit of the cooling fan system can be represented by an electric circuit with a resistor and an inductor, and its dynamics can be described as

$$L \frac{di}{dt} + R i + K_e \omega = V_{in} \tag{1}$$

where $L$ is the inductance, $R$ is the resistance, $K_e$ is the back emf constant, $V_{in}$ is the input voltage, $i$ is the current, and $\omega$ is the rotor speed. For the motor mechanism, the dynamics can be written as

$$J_m \frac{d\omega}{dt} + B_m \omega + T_L = K_t i \tag{2}$$

where $J_m$ is the motor inertia, $K_t$ is the torque constant, $B_m$ is the viscous constant, and $T_L$ is the external load. Based on the law of energy conservation, where electrical power is equal to mechanical power, it can be expressed as

$$K_e \omega i = K_t i \omega \tag{3}$$

Equation (3) holds only when the constants $K_e$ and $K_t$ have the same value, which can be defined as

$$K_e = K_t = K \tag{4}$$

Substituting (4) into (1) and (2), the governing equation of the cooling fan system can be represented as [13]

$$L \frac{di}{dt} + R i + K \omega = V_{in}, \qquad J_m \frac{d\omega}{dt} + B_m \omega + T_L = K i \tag{5}$$
The dynamics of the cooling fan system can be represented as a 2nd-order differential equation. Given that the response of the electrical dynamics of a cooling fan is much faster than its mechanical dynamics, the transient behavior of the electrical dynamics can be neglected. Therefore, the governing equation in (1) can be rewritten as

$$i = \frac{V_{in} - K \omega}{R} \tag{6}$$

Furthermore, according to aerodynamics, the external force on the fan blade mainly comes from the drag force, which is proportional to the square of the rotor speed. Therefore, the external load can be approximated as

$$T_L = C_d \omega^2 \tag{7}$$

where $C_d$ represents the lumped aerodynamic drag coefficient. The details of the derivation can be seen in [5]. By substituting (6) and (7) into the governing Equation (5), it can be expressed as

$$J_m \frac{d\omega}{dt} + B_m \omega + C_d \omega^2 = \frac{K}{R} \left( V_{in} - K \omega \right) \tag{8}$$

The equivalent coefficients can be defined as

$$J = J_m, \qquad \alpha = B_m + \frac{K^2}{R}, \qquad C_D = C_d, \qquad \tau = \frac{K}{R} V_{in} \tag{9}$$

so that Equation (8) can be expressed as

$$J \frac{d\omega}{dt} + \alpha \omega + C_D \omega^2 = \tau \tag{10}$$

After this rearrangement, the governing Equation (5) takes the form of a 1st-order dynamical differential equation. The symbol $\tau$ denotes the applied torque generated by the pulse-width modulator (PWM), which is proportional to the input voltage of the driving circuit; the equivalent coefficients $J$, $\alpha$, and $C_D$ can be regarded as the lumped moment of inertia, viscous coefficient, and drag coefficient, respectively. With this simplification, the coefficients of the physical model (10) can be derived by measuring the rotor speed and the applied torque, both of which are observable.
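As a minimal sketch of how the simplified model behaves, the following Python snippet integrates the first-order fan equation with a classical RK4 step. It assumes that Equation (10) has the form J dω/dt + αω + C_D ω² = τ. The coefficient values and the helper names (`fan_rhs`, `rk4_step`, `simulate`, `steady_state_speed`) are illustrative assumptions and are not taken from the paper.

```python
import math

def fan_rhs(omega, tau, J, alpha, c_d):
    """Speed derivative for the assumed form J*w' + alpha*w + c_d*w**2 = tau."""
    return (tau - alpha * omega - c_d * omega ** 2) / J

def rk4_step(omega, tau, dt, J, alpha, c_d):
    """One classical RK4 step with the input tau held constant over the step."""
    k1 = fan_rhs(omega, tau, J, alpha, c_d)
    k2 = fan_rhs(omega + 0.5 * dt * k1, tau, J, alpha, c_d)
    k3 = fan_rhs(omega + 0.5 * dt * k2, tau, J, alpha, c_d)
    k4 = fan_rhs(omega + dt * k3, tau, J, alpha, c_d)
    return omega + dt * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0

def simulate(tau_seq, dt, J, alpha, c_d, omega0=0.0):
    """Roll the model forward over a sequence of held torque commands."""
    omegas = [omega0]
    for tau in tau_seq:
        omegas.append(rk4_step(omegas[-1], tau, dt, J, alpha, c_d))
    return omegas

def steady_state_speed(tau, alpha, c_d):
    """Positive root of c_d*w**2 + alpha*w - tau = 0 (constant-speed condition)."""
    return (-alpha + math.sqrt(alpha ** 2 + 4.0 * c_d * tau)) / (2.0 * c_d)

# Made-up coefficients, chosen only to give a readable step response.
J, ALPHA, C_D = 0.01, 0.02, 0.001
trajectory = simulate([1.0] * 2000, dt=0.01, J=J, alpha=ALPHA, c_d=C_D)
```

Under a constant torque the simulated speed settles at the value returned by `steady_state_speed`, which is a quick way to sanity-check identified coefficients against a measured steady-state speed.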

Filtering Operator Method
The simplified analytical Equation (10), referred to as the "physical model", can be used to predict the dynamics of the cooling fan system once its parameters $J$, $\alpha$, and $C_D$ are known. To estimate these parameters, the filtering operator method is used; the details can be found in [13]. Equation (10) is first rewritten as Equation (11). Taking the Laplace transform of (11) gives Equation (13). Next, both sides of (13) are multiplied by the filtering operator $1/(s+\lambda)$, where $\lambda$ is a positive constant, which gives Equation (15). The inverse Laplace transform of (15) can be calculated by integrating the measurements, so the time derivative of the rotor speed does not have to be measured or obtained by numerical differentiation. After integration, the parameters are estimated using the least-squares method, which yields the estimates $\hat{J}$, $\hat{\alpha}$, and $\hat{C}_D$.
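The idea behind the filtering operator can be sketched in a few lines of Python. This is one possible reading of the procedure, not the implementation from [13]: it assumes the model J dω/dt + αω + C_D ω² = τ with zero initial speed, the filter 1/(s + λ), and the identity s/(s + λ) = 1 - λ/(s + λ), which turns the filtered derivative into ω - λω_f. All function names, the test signal, and the coefficient values are hypothetical.

```python
import numpy as np

def simulate_fan(tau, dt, J, alpha, c_d):
    """RK4 simulation of J*w' + alpha*w + c_d*w**2 = tau with w(0) = 0.
    tau[k] is held constant between sample k and sample k+1."""
    def f(w, u):
        return (u - alpha * w - c_d * w ** 2) / J
    w = np.zeros(len(tau) + 1)
    for k, u in enumerate(tau):
        k1 = f(w[k], u)
        k2 = f(w[k] + 0.5 * dt * k1, u)
        k3 = f(w[k] + 0.5 * dt * k2, u)
        k4 = f(w[k] + dt * k3, u)
        w[k + 1] = w[k] + dt * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
    return w

def filter_sampled(x, dt, lam):
    """Response of 1/(s + lam) to the samples x, trapezoidal rule, zero initial state."""
    xf = np.zeros(len(x))
    for k in range(len(x) - 1):
        xf[k + 1] = ((2.0 - lam * dt) * xf[k] + dt * (x[k] + x[k + 1])) / (2.0 + lam * dt)
    return xf

def filter_held(u, dt, lam):
    """Response of 1/(s + lam) to a piecewise-constant input (exact discretization)."""
    a = np.exp(-lam * dt)
    uf = np.zeros(len(u) + 1)
    for k in range(len(u)):
        uf[k + 1] = a * uf[k] + (1.0 - a) / lam * u[k]
    return uf

def estimate_parameters(w, tau, dt, lam):
    """Least-squares fit of [J, alpha, c_d] from filtered signals only."""
    w_f = filter_sampled(w, dt, lam)
    q_f = filter_sampled(w ** 2, dt, lam)
    tau_f = filter_held(tau, dt, lam)
    # Filtered model: J*(w - lam*w_f) + alpha*w_f + c_d*q_f = tau_f
    phi = np.column_stack([w - lam * w_f, w_f, q_f])
    theta, *_ = np.linalg.lstsq(phi, tau_f, rcond=None)
    return theta

# Hypothetical multi-level torque command so that all three terms are excited.
dt, lam = 0.001, 5.0
tau = np.repeat([0.5, 1.0, 1.5, 0.8, 1.2], 2000)
w = simulate_fan(tau, dt, J=0.01, alpha=0.02, c_d=0.001)
J_hat, alpha_hat, c_d_hat = estimate_parameters(w, tau, dt, lam)
```

Because only filtered (integrated) signals enter the regression, measurement noise is not amplified by differentiation, which is the practical motivation for filtering before the least-squares step.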

Recurrent Neural Network
The RNN is a type of NN with a loop structure, which can model patterns with time-dependent behavior and approximate the transient response of the system. One type of RNN, the nonlinear autoregressive network with exogenous inputs (NARX), is used in this paper for the comparison study since it shows outstanding performance in modeling different system dynamics [28-30]. In this paper, the NARX model is constructed and trained with the Neural Net Time Series toolbox in MATLAB. Given the input and output of the system, $x(k) \in \mathbb{R}^p$ and $y(k) \in \mathbb{R}^r$, the structure of NARX is illustrated in Figure 2, which can also be expressed in the following mathematical form

$$y(k) = f\big(y(k-1), \ldots, y(k-d),\; x(k-1), \ldots, x(k-d)\big)$$

where $f$ is the nonlinear mapping realized by the network and the variable $d$ represents the order of the model. The weights and biases of the NARX model are trained using the measured data and updated using the Levenberg-Marquardt algorithm.
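To make the role of the model order d concrete, the sketch below assembles the lagged regressors of a NARX model and evaluates a single-hidden-layer network on them. It does not reproduce the MATLAB toolbox; the tanh activation, the function names, and the toy data are assumptions made for illustration.

```python
import numpy as np

def narx_regressors(x, y, d):
    """Build rows [y(k-1..k-d), x(k-1..k-d)] and targets y(k) for k = d..N-1."""
    x = np.asarray(x, dtype=float).reshape(len(x), -1)
    y = np.asarray(y, dtype=float).reshape(len(y), -1)
    rows, targets = [], []
    for k in range(d, len(y)):
        past_y = y[k - d:k][::-1].ravel()  # most recent sample first
        past_x = x[k - d:k][::-1].ravel()
        rows.append(np.concatenate([past_y, past_x]))
        targets.append(y[k])
    return np.array(rows), np.array(targets)

def narx_predict(phi, W1, b1, W2, b2):
    """One-step-ahead prediction with one hidden layer (tanh is an assumed activation)."""
    return np.tanh(phi @ W1.T + b1) @ W2.T + b2

# Toy signals: one input channel and one output channel.
x = [0.0, 1.0, 2.0, 3.0, 4.0]
y = [10.0, 11.0, 12.0, 13.0, 14.0]
phi, target = narx_regressors(x, y, d=2)

rng = np.random.default_rng(0)
W1, b1 = rng.standard_normal((20, phi.shape[1])), np.zeros(20)
W2, b2 = rng.standard_normal((1, 20)), np.zeros(1)
y_hat = narx_predict(phi, W1, b1, W2, b2)
```

The regressor width grows linearly with d, so a 10th-order model needs a much wider input layer than a 1st-order one, and the lags are tied to the sampling period used when the data were recorded.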

Neural Ordinary Differential Equation
The core concept of a neural ODE is to use a residual neural network (ResNet) architecture to describe continuous-time dynamical behavior. Suppose the state of a system at the $k$th sampling time is denoted as $h(k)$; the dynamics can be described by

$$h(k+1) = h(k) + f\big(h(k), \theta\big) \tag{25}$$

where $f$ is a nonlinear function modeled by a neural network and parameterized by $\theta$. The iterative update equation in (25) can be seen as the Euler discretization of a continuous-time model [31]. The output of the neural network can be regarded as the time derivative of the state when the time step $\Delta t$ approaches zero:

$$\frac{dh(t)}{dt} = f\big(h(t), t, \theta\big) \tag{26}$$

where $t$ represents the continuous time stamp. In a continuous-time structure, the update equation can be written as

$$h(t_1) = h(t_0) + \int_{t_0}^{t_1} f\big(h(t), t, \theta\big)\, dt \tag{27}$$

The integral can be calculated by any ODE solver. In this paper, the 4th-order Runge-Kutta (RK4) method is applied for numerical integration because it is widely used. As the discrete time step becomes smaller and a more precise ODE solver is used for discretization, the neural network model can approximate a differential equation more closely. Modeling fan dynamics using a neural ODE was studied previously in [25]. Instead of depending only on the initial value, as in (27), the state of the cooling fan system is also affected by the forced input. Therefore, the neural ODE for the cooling fan system is defined as

$$\frac{d\omega(t)}{dt} = f\big(\omega(t), \tau(t), \theta\big) = W_2\, \sigma\!\left( W_1 \begin{bmatrix} \omega(t) \\ \tau(t) \end{bmatrix} + b_1 \right) + b_2 \tag{28}$$

where $W_1$, $W_2$ are the weights and $b_1$, $b_2$ are the biases of the neural network, collected in $\theta$, and $\sigma(\cdot)$ denotes the hidden-layer activation function. The neural network used in this paper contains only one hidden layer, as displayed in Figure 3. The inputs of the neural network are the state of the system (the rotor speed) and the applied torque. The parameters can be estimated by applying gradient-based optimization to reduce the value of the loss function.
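A forced neural ODE of this kind can be sketched in NumPy as follows. The snippet only covers the forward pass: a one-hidden-layer network gives the speed derivative, and RK4 integrates it with the torque held constant over each step. The tanh activation, the zero-order hold on the input, the random initialization, and all names are assumptions for illustration.

```python
import numpy as np

def init_params(n_hidden, seed=0, scale=0.1):
    """Small random weights for a network with inputs (omega, tau) and one output."""
    rng = np.random.default_rng(seed)
    return {
        "W1": scale * rng.standard_normal((n_hidden, 2)),
        "b1": np.zeros(n_hidden),
        "W2": scale * rng.standard_normal((1, n_hidden)),
        "b2": np.zeros(1),
    }

def f_theta(omega, tau, p):
    """Network output interpreted as d(omega)/dt."""
    z = np.tanh(p["W1"] @ np.array([omega, tau]) + p["b1"])
    return float((p["W2"] @ z + p["b2"])[0])

def rk4_step(omega, tau, dt, p):
    """One RK4 step; tau is held constant over the step (an assumption)."""
    k1 = f_theta(omega, tau, p)
    k2 = f_theta(omega + 0.5 * dt * k1, tau, p)
    k3 = f_theta(omega + 0.5 * dt * k2, tau, p)
    k4 = f_theta(omega + dt * k3, tau, p)
    return omega + dt * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0

def rollout(omega0, tau_seq, dt, p):
    """Predicted speed trajectory for a sequence of torque samples."""
    omegas = [float(omega0)]
    for tau in tau_seq:
        omegas.append(rk4_step(omegas[-1], tau, dt, p))
    return omegas

params = init_params(n_hidden=20)
speeds = rollout(0.0, [1.0] * 100, dt=0.01, p=params)
```

Since the network represents a derivative rather than a next-sample map, the same parameters can be rolled out with a different `dt`, which is what allows data with different sampling frequencies to be handled by one model.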
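In this paper, the loss function is set as the mean absolute error (MAE) between the measured rotor speed $\omega(t_k)$ and the predicted rotor speed $\hat{\omega}(t_k)$ over $N$ samples:

$$\mathcal{L}(\theta) = \frac{1}{N} \sum_{k=1}^{N} \left| \omega(t_k) - \hat{\omega}(t_k) \right| \tag{29}$$

After calculating the loss, the gradient-based optimization updates the parameters using the following equation

$$\theta_{(i+1)} = \theta_{(i)} - \alpha \frac{\partial \mathcal{L}}{\partial \theta_{(i)}} \tag{30}$$

The subscript $(i)$ represents the iteration number. Here $\alpha$ is the learning rate (not the viscous coefficient of Equation (10)), which is adjusted based on the specific application and the optimization method used. In this paper, ADAM optimization is applied to train the parameters of the neural network.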
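For readers unfamiliar with the optimizer, the sketch below shows the MAE loss and one parameter update of the standard Adam rule. The hyperparameter defaults (β1 = 0.9, β2 = 0.999, ε = 1e-8) are the commonly used ones and are an assumption here, since the paper does not state its settings; the gradient is passed in by the caller rather than computed through the ODE solver.

```python
import numpy as np

def mae(measured, predicted):
    """Mean absolute error between two equally long sequences."""
    measured = np.asarray(measured, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    return float(np.mean(np.abs(measured - predicted)))

class Adam:
    """Standard Adam update with bias-corrected moment estimates."""

    def __init__(self, lr=1e-3, beta1=0.9, beta2=0.999, eps=1e-8):
        self.lr, self.beta1, self.beta2, self.eps = lr, beta1, beta2, eps
        self.m = None  # running mean of the gradient
        self.v = None  # running mean of the squared gradient
        self.t = 0     # number of updates performed so far

    def step(self, theta, grad):
        theta = np.asarray(theta, dtype=float)
        grad = np.asarray(grad, dtype=float)
        if self.m is None:
            self.m = np.zeros_like(theta)
            self.v = np.zeros_like(theta)
        self.t += 1
        self.m = self.beta1 * self.m + (1.0 - self.beta1) * grad
        self.v = self.beta2 * self.v + (1.0 - self.beta2) * grad ** 2
        m_hat = self.m / (1.0 - self.beta1 ** self.t)
        v_hat = self.v / (1.0 - self.beta2 ** self.t)
        return theta - self.lr * m_hat / (np.sqrt(v_hat) + self.eps)

loss = mae([1.0, 2.0, 3.0], [1.0, 3.0, 5.0])
optimizer = Adam(lr=0.1)
theta_new = optimizer.step(np.array([1.0]), np.array([2.0]))
```

Compared with the plain update rule, Adam rescales each step by the running gradient magnitude, so the very first step has a size close to the learning rate regardless of how large the raw gradient is.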

Hybrid Neural Ordinary Differential Equation
The architecture of the neural ODE makes it possible to combine it with a physically based ODE. Therefore, this paper proposes a hybrid neural ODE model to create the digital twin of the system, which contains both physics-informed terms and a neural network in one differential equation and can be represented as

$$\frac{d\omega(t)}{dt} = \frac{1}{\hat{J}} \left( \tau(t) - \hat{\alpha}\, \omega(t) - \hat{C}_D\, \omega(t)^2 \right) + f\big(\omega(t), \tau(t), \theta\big) \tag{31}$$

The differential equation of the hybrid neural ODE model is the combination of Equations (10) and (28). The training process is displayed in Figure 4. The physical parameters $\hat{J}$, $\hat{\alpha}$, and $\hat{C}_D$ are first estimated using the filtering operator method and then kept fixed while gradient-based optimization trains the neural network part of the overall model. The hybrid architecture can be regarded as using physics-informed terms to describe the idealized dynamic behavior and using the neural network to fit the nonlinearity and uncertainties that occur in the real world. This not only increases the fitting accuracy but also enhances the interpretability of the model.
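The hybrid right-hand side can be sketched as the sum of a fixed physical term and a small trainable network. The snippet assumes the additive form dω/dt = (τ - αω - C_D ω²)/J + f(ω, τ, θ); the zero-initialized output layer, the tanh activation, the coefficient values, and the function names are illustrative choices, not the authors' implementation.

```python
import numpy as np

def physics_rhs(omega, tau, J, alpha, c_d):
    """Fixed physics-informed term from the assumed first-order fan model."""
    return (tau - alpha * omega - c_d * omega ** 2) / J

def nn_rhs(omega, tau, p):
    """Trainable correction term with one hidden layer."""
    z = np.tanh(p["W1"] @ np.array([omega, tau]) + p["b1"])
    return float((p["W2"] @ z + p["b2"])[0])

def hybrid_rhs(omega, tau, phys, p):
    """Physical term (parameters frozen) plus neural network correction."""
    return physics_rhs(omega, tau, *phys) + nn_rhs(omega, tau, p)

def rk4_step(omega, tau, dt, phys, p):
    """One RK4 step of the hybrid model with tau held constant."""
    k1 = hybrid_rhs(omega, tau, phys, p)
    k2 = hybrid_rhs(omega + 0.5 * dt * k1, tau, phys, p)
    k3 = hybrid_rhs(omega + 0.5 * dt * k2, tau, phys, p)
    k4 = hybrid_rhs(omega + dt * k3, tau, phys, p)
    return omega + dt * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0

# Made-up physical estimates (J, alpha, C_D); in practice they would come
# from the filtering operator method and stay fixed during training.
PHYS = (0.01, 0.02, 0.001)

rng = np.random.default_rng(0)
nn_params = {
    "W1": 0.1 * rng.standard_normal((10, 2)),
    "b1": np.zeros(10),
    "W2": np.zeros((1, 10)),  # zero output layer: the model starts as pure physics
    "b2": np.zeros(1),
}
next_speed = rk4_step(5.0, 1.0, 0.01, PHYS, nn_params)
```

Starting the correction at zero means training begins from the physical model and only moves away from it where the data require it, which is one way to keep the learned term interpretable as a residual.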

Literature Review
The previous study in [25] has already used a neural ODE to model a motor-driven propeller dynamical system via simulation. In that research, the fitting performance of the neural ODE model is compared with a physical model and a NARX model, which represent physically based and discrete-time data-driven methods, respectively. Furthermore, NARX models of the 1st order and the 10th order are both applied to highlight that the neural ODE model can be closer to the real system dynamics. The system dynamics are simulated using MATLAB/Simulink with Equation (10). The comparison is shown in Figure 7, which indicates that the neural ODE model has better fitting performance and generalization than the others. The previous study therefore showed that a continuous-time structure in the neural network can not only enhance the modeling accuracy but also reduce the model's complexity. To further improve the feasibility and interpretability, this paper combines the neural ODE with physics-informed terms and investigates its fitting performance on real-world data.

Problem Formulation
Physically based digital twins of a cooling fan system and the filtering operator method have been studied in [4,5,13]. The results have shown that the model performs well while using very few parameters. Figure 8 displays the fitting results on experimental data, where it is also evident that the accuracy of the fan speed prediction declines as the fan speed decreases. This indicates that the physics employed in the model is only suitable for high-speed conditions, and that several uncertainties need to be considered when determining the low-speed dynamics. To address these uncertain conditions, data-driven methods are used in this paper to improve the model's accuracy. In this section, different types of neural network-based models are applied and compared with the physically based model. The model that deals best with these uncertainties and provides the best fitting results will be chosen as the digital twin for the cooling fan system.

Structure of Digital Twins
This paper compares four methods for constructing the digital twin of the cooling fan system and evaluates the modeling performance of each to find the best one. The four models are the physical model, the NARX model, the neural ODE model, and the hybrid neural ODE model. The models' structure and number of parameters can be seen in Table 1. All the neural network models include only a single hidden layer. Although increasing the number of hidden layers may improve the fitting results, it also presents challenges for real-time implementation with limited memory storage. In addition, the paper only sets the applied torque and the previous fan speed as the inputs of the digital twins. Since the proposed ODE model is designed to describe the dynamics of the cooling fan, the inputs of the model also have to follow the physical laws governing the identified system. Although adding more input features may enhance the fitting performance, it runs the risk of reducing the model's ability to generalize the fan dynamics. Furthermore, in the practical scenario, fan speed and applied torque are the only measurements available from the experimental device. These are the main reasons why this research selects only these physically meaningful quantities as the inputs of the model. Using directly measurable information not only reduces the model's complexity but also ensures the interpretability of the digital twin. In this study, two sets of data are used: training data are used to estimate the parameters of the digital twins, and testing data are used to examine the fitting result.
Table 1. Structure and number of parameters of the digital twin models. Recoverable structure entries: "1st order with 20 hidden states"; "1st order with 10 hidden states for NN part".

Numerical Simulation
Before the experiment, this study first evaluates the modeling performance of the digital twins using simulation data. The simulation environment and data patterns of the fan dynamics are the same as mentioned in Section 3. To evaluate the fitting performance, three criteria are used: root-mean-square error (RMSE), maximum error ($\mathrm{Error}_{\max}$), and R-squared ($R^2$), which are calculated as follows

$$\mathrm{RMSE} = \sqrt{\frac{1}{N} \sum_{k=1}^{N} \big( \omega_k - \hat{\omega}_k \big)^2} \tag{32}$$

$$\mathrm{Error}_{\max} = \max_{k} \left| \omega_k - \hat{\omega}_k \right| \tag{33}$$

$$R^2 = 1 - \frac{\sum_{k=1}^{N} \big( \omega_k - \hat{\omega}_k \big)^2}{\sum_{k=1}^{N} \big( \omega_k - \bar{\omega} \big)^2} \tag{34}$$

where $\omega_k$ is the reference fan speed, $\hat{\omega}_k$ is the model output, $\bar{\omega}$ is the mean of the reference fan speed, and $N$ is the number of samples. The fitting results can be seen in Figures 9 and 10 and in Table 2. Since the physical model has the same dynamic equation as the simulation, its fitting performance is good, and the estimated parameters $\hat{J}$, $\hat{\alpha}$, and $\hat{C}_D$ are close to the reference parameter values. On the other hand, the pure data-driven models, the NARX model and the neural ODE model, also perform well on training and testing data, but their fitting results do not surpass those of the physical model. Both show similar fitting performance, while the neural ODE uses far fewer parameters. This suggests that the continuous-time model structure has advantages in modeling the system dynamics. Finally, the hybrid neural ODE model uses the same physical parameters $\hat{J}$, $\hat{\alpha}$, and $\hat{C}_D$ while training. Its fitting result shows a slight improvement over the physical model, which means that the fitting performance on simulation data is still dominated by the physics-informed terms.
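The three criteria can be computed with a few lines of NumPy. The sketch uses the usual textbook definitions of RMSE, maximum absolute error, and the coefficient of determination; the function name and the toy numbers are assumptions for illustration.

```python
import numpy as np

def fitting_metrics(reference, predicted):
    """Return (RMSE, maximum absolute error, R-squared) for two sequences."""
    reference = np.asarray(reference, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    residual = reference - predicted
    rmse = float(np.sqrt(np.mean(residual ** 2)))
    error_max = float(np.max(np.abs(residual)))
    ss_res = float(np.sum(residual ** 2))
    ss_tot = float(np.sum((reference - reference.mean()) ** 2))
    r_squared = 1.0 - ss_res / ss_tot
    return rmse, error_max, r_squared

# Toy example: only the last sample is predicted wrongly.
rmse, error_max, r_squared = fitting_metrics([1.0, 2.0, 3.0, 4.0],
                                             [1.0, 2.0, 3.0, 6.0])
```

RMSE summarizes the average error, the maximum error exposes short transients where a model briefly fails, and R-squared relates the residual to the variability of the reference signal, so the three numbers are best read together.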

Experiment Validation
To validate the practical application of the proposed digital twins in a real fan system, this paper collects data from an actual cooling fan system, which can be seen in Figure 11. The fan tray system consists of eight cooling fans with eight knobs for speed control. A Cortex-M4 microprocessor is used to give the input command. Figures 12 and 13 show the training and testing data, respectively. Since the applied torque is assumed to be proportional to the input voltage, the voltage signal is directly used for system identification. The fitting results are shown in Figures 14-16 and in Table 3. From the results shown in Figure 14, it is clear that the fan speed predicted by the physical model has low accuracy on the training data. The complexity of the low-speed fan dynamics makes it challenging to describe them accurately using just a physical model, especially during transients. For the NARX model, although the fitting results on training data are good, the model overfits and performs poorly on testing data, as shown in Figure 16. It can be inferred that the discrete-time model has difficulty capturing the continuous dynamics of the system, making it sensitive to unseen data. In contrast, the neural ODE model achieves better fitting performance than the NARX model with fewer parameters, which again demonstrates the capability of continuous-time neural networks for system identification. In addition, the neural ODE model in Figure 14 performs better than the physical model at low speed. This indicates that the data-driven method is effective in addressing low-speed dynamics. The hybrid neural ODE model, which uses only 10 hidden states, has the best fitting performance among all models. This shows that the inclusion of physics-informed terms can efficiently reduce the complexity of the NN required to fit the data.
From another perspective, it can also be said that the physical model can enhance its fitting accuracy by incorporating a neural ODE structure, which makes it able to describe the nonlinearity of the real dynamics. According to the fitting results on both simulation and experimental data, the hybrid neural ODE model has the best fitting performance and can serve as the digital twin of a healthy cooling fan system.

Anomaly Detection Result
In this section, the proposed hybrid neural ODE digital twin is applied to monitor the condition of the cooling fan system and detect its anomalies. The study uses the fan tray system seen in Figure 11 and simulates two kinds of anomalies, an inlet covered by an object and a disturbance occurring on the rotor, which are shown in Figure 17. The proposed digital twin first evaluates the fan speed response in a healthy condition and then assesses the cooling fan when an anomaly happens. To inspect the status of a cooling fan system through a digital twin, it is essential to establish a reliable mechanism to determine its health status. Since the digital twin represents the response of a healthy cooling fan system, the status can be defined by the error between the digital twin output and the measurement. Figure 18 illustrates the error distribution between the hybrid neural ODE model and the measurement under healthy conditions, which is the error seen in Figures 14b and 15b. The error distribution can be regarded as a normal distribution, in which over 95% of values fall within two standard deviations. Based on this characteristic, the "safe range" that identifies a system under healthy conditions can be established by an upper bound (UB) and a lower bound (LB):

$$\mathrm{UB}(t) = \hat{\omega}(t) + 2\sigma \tag{35}$$

$$\mathrm{LB}(t) = \hat{\omega}(t) - 2\sigma \tag{36}$$

where $\hat{\omega}(t)$ represents the digital twin's output calculated by the hybrid neural ODE model and $\sigma$ is the standard deviation. Calculation shows that the standard deviation of the error distribution is $\sigma = 15.8483$. According to the above definition, a cooling fan system is considered healthy when its fan speed remains within the upper and lower bounds described in Equations (35) and (36). Conversely, if the fan speed leaves the safe range, it indicates an anomaly in the fan system. After establishing the criterion for a healthy condition, the anomaly detection process can be initiated. Figure 19 presents the results of the proposed digital-twin-based anomaly detection.
The green area in the figures represents the safe range defined by the digital twin output and the standard deviation of the error in healthy conditions, while the red area indicates the occurrence of the anomalies. Most of the time, the fan speed remains within the green area, indicating that the fan system is healthy. However, when anomalies occur, the fan speed changes suddenly and leaves the safe range, and the anomalies are detected using the defined safe range criteria. Once the objects are removed, which means the anomalies are addressed, the fan speed quickly returns to the safe range, indicating that the system has returned to its normal condition. Figure 19a,b demonstrate the anomaly conditions when the inlet was covered and when there was a disturbance on the rotor. Figure 19c shows the monitoring result when the speed command was generated manually, reflecting the applicability to unpredictable commands. The experiments demonstrate accurate detection when faults occur and rapid response to anomalies. Moreover, the digital twin has significantly fewer parameters than a conventional NN, facilitating real-time implementation with limited memory storage. These results validate the practical application capabilities of the proposed digital twin.
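The safe-range check described in this section amounts to a simple band test around the digital twin output. The sketch below assumes a band of plus or minus two standard deviations, in line with the two-standard-deviation argument in the text, and uses the reported value σ = 15.8483; the function names and the sample speeds are made up for illustration.

```python
import numpy as np

def safe_range(twin_speed, sigma, k=2.0):
    """Lower and upper bounds around the digital twin output (k*sigma band)."""
    twin_speed = np.asarray(twin_speed, dtype=float)
    return twin_speed - k * sigma, twin_speed + k * sigma

def detect_anomalies(measured_speed, twin_speed, sigma, k=2.0):
    """Boolean flags that are True where the measurement leaves the safe range."""
    lower, upper = safe_range(twin_speed, sigma, k)
    measured_speed = np.asarray(measured_speed, dtype=float)
    return (measured_speed < lower) | (measured_speed > upper)

SIGMA = 15.8483  # standard deviation of the healthy-condition error reported in the text

# Made-up samples: the twin predicts 100 rad/s throughout.
twin = [100.0, 100.0, 100.0, 100.0]
measured = [100.0, 140.0, 60.0, 125.0]
flags = detect_anomalies(measured, twin, SIGMA)
```

In a deployment, `sigma` would be estimated once from healthy-condition residuals, and a flag would typically have to persist over several consecutive samples before an alarm is raised, since roughly 5% of healthy samples fall outside a two-sigma band by construction.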

Conclusions
This paper introduces a novel method to construct a digital twin of a cooling fan system by combining physically based and data-driven models. Thanks to its continuous-time structure, the neural ODE can incorporate the physics-informed terms derived from the governing equation of the cooling fan. A comparison study shows that the proposed hybrid digital twin model achieves outstanding fitting performance while using fewer parameters. Furthermore, the fitting results on experimental data reveal that the neural ODE part of the hybrid model can address the uncertainties and nonlinear behaviors of the real fan dynamics, enhancing both the modeling accuracy and the interpretability. To validate its capability in practical applications, the proposed digital twin is employed to establish an anomaly-detection process. The digital twin's output represents the fan speed under healthy conditions, and the status of a fan system is estimated from the error between this output and the measurements. A fan system is identified as faulty when its fan speed leaves the safe range defined by the digital twin output and the standard deviation of the error between model output and measurement under healthy conditions. The conducted experiments simulate anomalies, and the results demonstrate that the proposed digital-twin-based anomaly detection responds effectively to the faults, thereby validating the feasibility of the approach presented in this paper. The study shows the outstanding modeling performance of a hybrid neural ODE. By establishing a precise relationship between the system's inputs and outputs through continuous evolution, the neural ODE is well suited to the mathematical frameworks used in control theory, such as Lyapunov stability analysis. Therefore, future work will focus on using neural ODEs in control applications, thereby enabling a more intelligent and robust manufacturing process.