1. Introduction
Rotating shafts are elements of engineering systems that play a paramount role in the transmission of power, encompassing speed and torque, from one point to another [1,2]. They are typically designed to endure substantial loads and to operate at high speeds, underscoring the need for precise alignment, balance, and freedom from imperfections. These considerations are significant not only for enhancing the overall system performance, but also for improving its safety and reliability [3,4]. To this end, condition monitoring (CM) of rotating shafts allows the shaft condition and performance to be evaluated continuously and any indications of malfunction or deterioration to be detected [5,6]. Through the application of CM methodologies, potential issues can be promptly identified, and maintenance or repair actions can be triggered to prevent accidents. This approach has been shown to effectively mitigate the risk of unexpected downtime, reinforcing the overall system reliability [7,8]. Furthermore, the widespread availability of low-cost sensors has revolutionized the acquisition of diagnostic signals, such as accelerations, strains, and elastic waves [9]. However, the availability of big data itself adds a new layer of complexity, especially in the realm of signal processing. That is, the rapid growth in the amount of acquired data has introduced the need for (i) improved hardware and software performance, and (ii) tools to deal with confounding factors, including those unrelated to the system health state, such as environmental and operational conditions [10,11].
To tackle these challenges, deep learning has emerged as a pivotal technological advancement in the CM of rotating machines, offering multifaceted contributions [12]. This approach has excelled in automatically extracting intricate patterns and features from raw sensor data, enhancing the precision and reliability of fault detection and anomaly characterization. As an example, the work in [13] applied deep learning to enhance wind turbine CM, addressing the data surge from the increasing number of wind farm units. By combining convolutional neural networks (CNNs) [14] and recurrent neural networks (RNNs) [15], it efficiently extracted features, reduced dimensionality, and provided effective CM, offering both real-time unit state checks and early warning capability, even amidst accidental parameter changes. In [16], a CM model based on CNNs for automatic fault detection in rotating equipment was developed. The model, utilizing data from a single vibration sensor on the motor-drive end bearing, achieved accuracies of
and
when applied to two different databases under controlled ambient conditions. Another example was presented in [
17], where the authors proposed a novel deep learning algorithm for detecting rotor unbalance in industrial machinery. The algorithm, which extracted vibration signatures through the fast Fourier transform (FFT) and the short-time Fourier transform (STFT), combined the depth of ResNet [18] and the feature extraction capability of CNNs. This hybrid approach surpassed the performance of both individual models. The study involved two analyses: binary detection of balanced vs. unbalanced cases and multilevel detection of the degree of unbalance. The work in [19] addressed planetary gearbox fault detection by representing baseline vibration signals using the varying index coefficient autoregression (VICAR) model. The authors proposed a modified VICAR (MVICAR) model to effectively incorporate rotating speed into the representation while maintaining nonlinear modeling capacity. Experimental results demonstrated the superiority of the MVICAR model over autoencoders, expanded VICAR (EVICAR), and linear parameter-varying autoregression models in planetary gearbox fault detection. In [20], a semi-supervised fault diagnosis approach for wind turbines was introduced. The method utilized a deep neural network with adversarial learning and incorporated a metric-guided feature enhancement technique. Despite the limited number of annotated samples, the methodology exhibited superior fault diagnosis accuracy in experiments conducted on a wind turbine fault dataset.
However, in the context of CM, the developed methods have predominantly relied on black-box deep learning algorithms, which lack transparency about how input data are processed and whether the network behavior aligns with the physics of the problem [21]. Existing approaches to address this issue involve either post-training explainability algorithms or more intricate physics-based deep learning models. The former, while shedding light on network behavior, fail to provide evidence of adherence to physical laws [22,23,24,25]. The latter, on the other hand, ensure that predictions align with the physics by incorporating regularization terms representing known physical laws during training. These terms are added to the network loss function, alongside the term that quantifies the disparity between predicted and actual outcomes. This addition guides the neural network towards solutions that not only capture intricate patterns from data but also adhere to the established physical laws, enhancing the reliability and interpretability of physics-informed neural network (PINN) predictions. The regularization terms act as constraints that steer the network learning towards solutions respecting the governing physics throughout the training iterations. Moreover, physics-informed algorithms can provide accurate predictions even in the presence of scarce data, a capability not shared by traditional deep learning methods. Notably, physics-informed algorithms are versatile tools applicable in various contexts, including data-driven solutions for partial differential equations, discovery of physical laws, and parameter estimation [26,27,28]. However, within the CM domain, few contributions have integrated physical knowledge effectively into the training process of deep learning models. In [29], the authors introduced a novel approach for fault detection in gearboxes using long short-term memory (LSTM) neural networks. Given a lack of data from faulty states, the authors proposed a physics-informed hyperparameter selection strategy for LSTM identification, which maximizes the discrepancy between healthy and physics-informed faulty states. Case studies on detecting gear tooth crack and tooth wear demonstrated that the approach outperformed traditional methods based on minimizing the validation mean squared error (VAMSE). The work in [30] presented a physics-informed deep learning method for bearing fault detection that combined a threshold model and a CNN. The approach was validated using data from bearings on an agricultural machine and a laboratory test stand in the Case Western Reserve University Bearing Data Centre. In [31], a method for identifying unbalance faults in rotary systems using physics-guided neural networks (PGNNs) was proposed. The approach involved the use of a standard neural network to localize the nodal position of the experimental fault, followed by a PGNN to quantify the unbalance magnitude and phase angle. By contrast, the work in [32] introduced a novel physics-informed convolutional long short-term memory (LSTM-CNN) network for the detection and localization of rotor unbalance and shaft cracks. In particular, the physics was taken into account through the construction of a neural network model that mimicked a finite element (FE) resolution of the problem.
To the best of the authors’ knowledge, no efforts have yet been made to directly estimate multiple parameters characterizing the health state of a rotating shaft system by leveraging PINNs. In this work, PINNs are utilized to estimate critical health state parameters in a simple but realistic numerical case of an extended Jeffcott rotor model. This model incorporates damping effects and anisotropic supports for a more comprehensive representation. The parameters under consideration include the radial and angular position of the static unbalance caused by the disk on the shaft, the stiffness along the principal axes of elasticity, and the non-rotating damping coefficient. The estimation is exclusively based on the displacement signals from the disk centre. Note that this estimation not only supports the optimization of machinery performance, enhancing efficiency and reliability, but also enables predictive maintenance by identifying potential faults early on. To highlight the effectiveness and precision of the proposed methodology, various scenarios with different constant rotational speeds are examined, and the performance is compared to that of traditional optimization algorithms used for parameter estimation. Furthermore, the analysis accounts for the impact of noisy input data. It is important to note that this work presents a proof of concept, demonstrating the effectiveness of the proposed methodology through simulation experiments in a controlled environment. The transition from simulations to real-world applications remains the next step: subsequent efforts will focus on rigorous experimental validation and testing on more complex systems to enhance the versatility and robustness of the approach.
The main innovation of this work lies in integrating established physical knowledge, describing the fundamental dynamics of rotating shaft systems, into the neural network training process. This incorporation guides the training, enhancing the robustness and reliability of the estimated system health state parameters. Furthermore, the estimation relies exclusively on raw time-domain displacements at the disk centre, minimizing the number of sensors required and simplifying the overall preprocessing steps.
The paper is organised as follows: Section 2 offers a brief overview of the necessary theoretical foundations of PINNs for parameter estimation; Section 3 briefly presents the case study and then shows in detail the implementation and the results of the PINN for the system health state characterization; finally, Section 4 provides some concluding remarks.
2. Methodology
The proposed framework hinges upon the use of PINNs to estimate the unknown parameters characterizing the dynamics of a rotating shaft system. The innovative aspect of this methodology stems from the tailored application of PINNs, addressing the challenges and requirements associated with accurately estimating health parameters in the context of rotating shaft systems. Notably, PINNs are deep learning tools that combine neural networks (NNs) with the system governing equations, and are particularly useful when data are limited or noisy and the underlying physics of the problem is well understood [26].
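Assume that a generic physical system is governed by the $n$-th order ordinary differential equations (ODEs) shown in Equation (1):

$$\mathcal{F}\left(t, \mathbf{x}, \frac{d\mathbf{x}}{dt}, \ldots, \frac{d^{n}\mathbf{x}}{dt^{n}}; \boldsymbol{\theta}\right) = \mathbf{0}, \quad t \in \Omega \tag{1}$$

where $t$ refers to the system independent variable, $\mathbf{x}(t)$ denotes the state vector consisting of $m$ components defined in the domain $\Omega$, and $\boldsymbol{\theta}$ represents the vector made of the $p$ unknown parameters describing the system state. Subsequently, considering the NN universal approximation theorem [33], an NN can be exploited to obtain an approximation $\hat{\mathbf{x}}(t; \mathbf{W}, \mathbf{b})$ of the state vector $\mathbf{x}(t)$, such that $\hat{\mathbf{x}}(t; \mathbf{W}, \mathbf{b}) \approx \mathbf{x}(t)$. More specifically, $\mathbf{W}$ and $\mathbf{b}$ denote the weight and bias matrices of the NN, respectively, and their values are the result of a training process [34], as is the value of the parameter vector $\boldsymbol{\theta}$. Note that, since $\hat{\mathbf{x}}$ is a function, its derivatives with respect to the independent variable $t$ can be computed during the training process through automatic differentiation (AD) [35,36]. Then, a function $\mathbf{f}$ outlining the approximation of Equation (1) can be defined, as reported in Equation (2):

$$\mathbf{f}(t; \mathbf{W}, \mathbf{b}, \boldsymbol{\theta}) := \mathcal{F}\left(t, \hat{\mathbf{x}}, \frac{d\hat{\mathbf{x}}}{dt}, \ldots, \frac{d^{n}\hat{\mathbf{x}}}{dt^{n}}; \boldsymbol{\theta}\right) \tag{2}$$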
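As a minimal sketch of how AD can yield the derivative of a network output with respect to its input, the following Python example applies forward-mode dual numbers to a tiny one-hidden-layer tanh network. The `Dual` class, the network size, and the weight values are hypothetical choices made for illustration only; deep learning frameworks typically rely on reverse-mode AD and obtain higher-order derivatives by applying it repeatedly.

```python
import math


class Dual:
    """Dual number val + der*eps (eps**2 = 0); `der` carries d/dt."""

    def __init__(self, val, der=0.0):
        self.val = val
        self.der = der

    def __add__(self, other):
        other = other if isinstance(other, Dual) else Dual(other)
        return Dual(self.val + other.val, self.der + other.der)

    __radd__ = __add__

    def __mul__(self, other):
        other = other if isinstance(other, Dual) else Dual(other)
        # Product rule: (u*v)' = u'*v + u*v'
        return Dual(self.val * other.val,
                    self.der * other.val + self.val * other.der)

    __rmul__ = __mul__


def tanh(z):
    """tanh on a dual number, using d tanh(z)/dz = 1 - tanh(z)**2."""
    th = math.tanh(z.val)
    return Dual(th, (1.0 - th * th) * z.der)


# Hypothetical fixed weights and biases (illustrative values only).
W1 = [0.5, -1.2, 0.8]
B1 = [0.1, 0.0, -0.3]
W2 = [1.0, 0.7, -0.4]
B2 = 0.2


def network(t):
    """Scalar-input, scalar-output network with one tanh hidden layer."""
    hidden = [tanh(w * t + b) for w, b in zip(W1, B1)]
    out = B2
    for w, h in zip(W2, hidden):
        out = out + w * h
    return out


def x_hat_and_derivative(t):
    """Return the network output and its derivative with respect to t."""
    out = network(Dual(t, 1.0))  # seed dt/dt = 1
    return out.val, out.der


if __name__ == "__main__":
    value, derivative = x_hat_and_derivative(0.3)
    print(value, derivative)
```

Seeding the input with a unit derivative propagates the chain rule through every operation, so the derivative is exact up to floating-point rounding rather than a finite-difference approximation.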
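To enable the neural network to fine-tune the weights $\mathbf{W}$, the biases $\mathbf{b}$, and the unknown parameters $\boldsymbol{\theta}$ in order to (i) fulfil the underlying ODEs describing the system behaviour and (ii) fit the available data (i.e., gathered measurements, in which the state vector $\mathbf{x}$ is known), two different loss functions are considered, as shown in the following Equations (3) and (4):

$$\mathcal{L}_{ODE} = \frac{1}{N}\sum_{i=1}^{N} \left\| \mathbf{f}(t_i; \mathbf{W}, \mathbf{b}, \boldsymbol{\theta}) \right\|^{2} \tag{3}$$

$$\mathcal{L}_{data} = \frac{1}{N}\sum_{i=1}^{N} \left\| \hat{\mathbf{x}}(t_i; \mathbf{W}, \mathbf{b}) - \mathbf{x}(t_i) \right\|^{2} \tag{4}$$

where $\mathcal{L}_{ODE}$ denotes the loss for the fulfilment of the ODEs while $\mathcal{L}_{data}$ is the loss on the observed data; $t_i$ indicates the generic $i$-th element of the vector $\mathbf{t}$, made of $N$ elements inside the domain $\Omega$, at which the ODE residual $\mathbf{f}$ and the NN approximation $\hat{\mathbf{x}}$ of the state vector are evaluated. Specifically, the acquisition times of the measured state vector $\mathbf{x}$ are typically employed as the vector $\mathbf{t}$. These loss functions are subsequently combined to yield the loss term $\mathcal{L}$, as presented in Equation (5):

$$\mathcal{L} = \lambda_{data}\,\mathcal{L}_{data} + \lambda_{ODE}\,\mathcal{L}_{ODE} \tag{5}$$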
where $\lambda_{data}$ and $\lambda_{ODE}$ denote two coefficients employed to assign greater weight either to the contributions derived from the available data or to those related to the system physics. Minimizing $\mathcal{L}$ thus enables the PINN to infer the unknown parameters that define the system dynamics. Notably, the appropriate values for the coefficients $\lambda_{data}$ and $\lambda_{ODE}$ are determined through an iterative trial-and-error process. A scheme showing how the PINN is trained is presented in Figure 1. The input of the PINN is the generic time instant $t_i$, and its output is the corresponding approximation $\hat{\mathbf{x}}(t_i)$ of the components of the measured state vector $\mathbf{x}(t_i)$. In each training iteration, the PINN output is compared with the actual value of the components of the state vector for all the time instants $t_i$ within the vector $\mathbf{t}$, resulting in the loss term $\mathcal{L}_{data}$. Simultaneously, the PINN outputs are differentiated automatically to obtain the various terms of the $n$-th order ODEs described in Equation (1). This process enables the derivation of the residual $\mathbf{f}$ of the physics equation, from which the loss term $\mathcal{L}_{ODE}$ is computed. The two losses are then combined to form the total loss term $\mathcal{L}$, which is the metric to be minimized. Note that the quantities optimized during training are not only the weights $\mathbf{W}$ and biases $\mathbf{b}$ but also, and significantly, the parameter vector $\boldsymbol{\theta}$ describing the system health state.
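The way a weighted sum of a data loss and a physics loss can drive the estimate of an unknown parameter may be illustrated with the toy Python sketch below. It is not the PINN used in this work. As labeled assumptions, the governing equation is the first-order ODE dx/dt + theta*x = 0 rather than a rotor model, the neural network is replaced by a one-parameter surrogate x_hat(t) = exp(-a*t) so that all derivatives and gradients can be written in closed form instead of through AD, and every name and numerical value is hypothetical.

```python
import math

THETA_TRUE = 1.5                           # used only to build synthetic data
TS = [0.1 * i for i in range(21)]          # time instants t_i
XS = [math.exp(-THETA_TRUE * t) for t in TS]   # noise-free "measurements"


def losses(a, theta):
    """Return (data loss, ODE loss) for the surrogate x_hat(t) = exp(-a*t)."""
    n = len(TS)
    l_data = sum((math.exp(-a * t) - x) ** 2 for t, x in zip(TS, XS)) / n
    # Residual of dx/dt + theta*x = 0 evaluated on the surrogate:
    # d x_hat/dt + theta*x_hat = (theta - a) * exp(-a*t)
    l_ode = sum(((theta - a) * math.exp(-a * t)) ** 2 for t in TS) / n
    return l_data, l_ode


def gradients(a, theta, lam_data, lam_ode):
    """Closed-form gradient of the weighted total loss w.r.t. (a, theta)."""
    n = len(TS)
    g_a = 0.0
    g_theta = 0.0
    for t, x in zip(TS, XS):
        e = math.exp(-a * t)
        r = (theta - a) * e                  # ODE residual at time t
        dr_da = -e - (theta - a) * t * e     # derivative of residual w.r.t. a
        g_a += lam_data * 2.0 * (e - x) * (-t * e) + lam_ode * 2.0 * r * dr_da
        g_theta += lam_ode * 2.0 * r * e     # theta enters only the ODE loss
    return g_a / n, g_theta / n


def fit(lam_data=1.0, lam_ode=1.0, lr=0.2, steps=5000):
    """Plain gradient descent on the surrogate parameter and on theta."""
    a, theta = 0.5, 0.3                      # arbitrary initial guesses
    for _ in range(steps):
        g_a, g_theta = gradients(a, theta, lam_data, lam_ode)
        a -= lr * g_a
        theta -= lr * g_theta
    return a, theta


if __name__ == "__main__":
    a_est, theta_est = fit()
    print(a_est, theta_est, losses(a_est, theta_est))
```

In this sketch the unknown parameter receives gradient information only through the physics loss, while the data loss shapes the surrogate; the two weights play the role of the coefficients that balance data and physics contributions.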
4. Conclusions
This paper has introduced a novel approach employing PINNs for estimating unknown parameters characterizing the health state of rotating shaft systems. The investigation has focused on a realistic numerical case study involving an extended Jeffcott rotor model, which incorporates damping effects and anisotropic supports. The parameters considered are the radial and angular position of the static unbalance caused by a shaft-mounted disk, the stiffness values along the principal axes of elasticity, and the non-rotating damping coefficient. The estimation has relied exclusively on displacement signals from the disk centre, and various scenarios with different constant rotational speeds have been examined. Results have shown that the implemented PINN estimates these parameters accurately, with minimal relative errors even in the presence of substantial data noise. Moreover, the comparison with the estimates obtained using traditional optimization methods has revealed that PINNs slightly outperform gradient-based and genetic methods in terms of estimation accuracy, despite the longer processing time. Beyond supporting the optimization of machinery performance and enhancing efficiency and reliability, the proposed estimation method facilitates predictive maintenance through early fault identification.
The simulation experiments outlined in this paper establish a compelling proof of concept, showcasing the effectiveness of our proposed approach within a controlled environment. It is crucial to acknowledge that, while these simulations offer valuable insights, the next step involves experimental verification to ensure the real-world applicability of our methodology. Subsequent efforts will be dedicated to conducting experimental studies on more intricate case scenarios, aiming to provide a robust validation and refinement of our proposed approach.