Article

Fault Diagnosis for Hydropower Units Based on Multi-Sensor Data with Multi-Scale Fusion

1
School of Civil and Hydraulic Engineering, Huazhong University of Science and Technology, Wuhan 430074, China
2
Institute of Water Resources and Hydropower, Huazhong University of Science and Technology, Wuhan 430074, China
*
Author to whom correspondence should be addressed.
Water 2026, 18(8), 915; https://doi.org/10.3390/w18080915
Submission received: 6 March 2026 / Revised: 4 April 2026 / Accepted: 10 April 2026 / Published: 11 April 2026
(This article belongs to the Section Water-Energy Nexus)

Abstract

Accurate fault diagnosis of hydropower units is crucial for ensuring the efficient and complete utilization of hydropower resources. Existing diagnostic methods predominantly consider either single-sensor or single-scale multi-sensor fusion, failing to fully exploit the effective information within monitoring data. Furthermore, they neglect the correlation between different sensors and faults during fusion diagnosis, thereby limiting the diagnostic performance of fusion models. To address this, this paper proposes a multi-sensor data fault diagnosis method based on multi-scale fusion. First, a feature extraction model is constructed to extract shallow-level features from multi-sensor signals across multiple dimensions. Subsequently, an attention-based feature fusion network is designed to extract and fuse multi-depth features, yielding high-quality deep-fused features. Finally, an information-entropy-based decision fusion strategy is established to effectively enhance the model’s diagnostic performance. Experimental validation on the public rotating machinery fault dataset and the hydropower unit fault dataset yielded diagnostic accuracies of 96.42% and 99.28%, respectively, demonstrating the significant effectiveness and robustness of the proposed method.

1. Introduction

Amidst the global pursuit of carbon neutrality and the large-scale integration of renewable energy sources, the operational landscape of hydropower has fundamentally shifted. Once predominantly operated under steady-state base-load conditions, hydroelectric generating units are now required to provide critical flexibility services, involving frequent start–stop cycles, rapid load variations, and frequency regulation to accommodate the intermittency of wind and solar power. This transition exposes units to severe electromechanical dynamics and multi-source coupled stresses, significantly increasing the risk of incipient failures [1,2]. The failure of large rotating machines such as turbines and generators may cause severe economic deficits and jeopardize personnel security [3]. Accordingly, the investigation of fault detection methods for rotary equipment is indispensable for promoting safe operation and cost-effective equipment performance.
Driven by rapid advances in computing hardware and machine intelligence, data-driven diagnostic techniques have attained widespread industrial application. These methods fall into two groups: conventional machine learning pipelines, which require signal-based feature engineering, and deep learning networks capable of hierarchical abstraction. The former usually apply the short-time Fourier transform (STFT) [4], the wavelet transform [5], or similar methods to the original signal to obtain fault features. Computational intelligence models, such as random forest [6] and support vector machine [7], then characterize the signal patterns to classify equipment faults. Such models rest on simple principles, train quickly, and run in real time, and they have proven effective in various fault diagnosis scenarios. However, they rely on signal processing methods and professional experience. When the data distribution is complex, as in actual industrial environments, their classification performance is insufficient, which limits their deployment for identifying malfunctions in rotating machinery. In contrast, deep learning has strong nonlinear modeling and learning ability, so it can extract the information contained in complex data and reduce the interference of human factors. At present, convolutional neural networks (CNNs) [8] and deep autoencoders [9] are widely used. Although some deep architectures have demonstrated promising performance in the condition monitoring of rotating equipment, these methods use single-sensor data for model training and fault identification, which omits some fault information. When the sensor fails or is disturbed by strong noise, diagnostic performance drops sharply, which hinders accurate diagnosis of the equipment fault state.
The utilization of multi-sensor fusion provides a more comprehensive representation of mechanical health states compared to unimodal data analysis [10,11]. By synthesizing information from multiple points of acquisition, these models achieve greater diagnostic sensitivity. The architectural implementation is generally organized into a tripartite hierarchy: raw data fusion, intermediate feature-based synthesis, and final decision-stage aggregation.
The data-level fusion strategy integrates raw inputs from various sensing devices through data splicing, channel cascading, and other operations, which largely retains the fault information contained in the original data, reduces information loss during fusion, and simplifies the model structure. Bai et al. [12] used STFT to transform the original 1D time series signal into a 2D image signal and combined the data of different sensors as different channels into multi-channel images to achieve data fusion. Xie et al. [13] fused multi-sensor information into RGB images with three channels via principal component analysis. Data-level fusion methods are usually simple and easy to implement, but they require highly consistent data and have difficulty processing data from different sensor types.
In the feature-level fusion strategy, the features of different signals are usually cascaded or weighted, which integrates the data of various sensor types into a unified feature set, reducing the data size and improving the model’s efficiency. Cui et al. [14] implemented complex variational mode decomposition in a multivariate context to obtain the time-frequency features of different sensor signals and constructed images to achieve feature fusion. Jiao et al. [15] proposed a data-complementary deep coupled dense convolutional network (DECON) that fuses inbuilt and external sensors of a device to reduce feature loss. Current feature fusion methods treat feature extraction and fusion as separate processes, merely combining the features obtained from each channel, which limits the quality of the resulting fused features.
The decision-level fusion strategy obtains the final diagnosis result by synthesizing the decisions of multiple sensors. Common decision-level fusion strategies include voting rules, weighted average rules, and Dempster-Shafer (DS) evidence theory. These methods comprehensively consider the decision-making information of different sensors and enhance the accuracy and consistency of the diagnostic model. Yang et al. [16] performed decision fusion based on the confidence of the Bayesian probability distribution of each sensor's data to achieve more accurate damage identification. Xu et al. [17] formulated an adaptive fusion framework centered on confidence-based coefficients. This system integrates stator current oscillations and image-derived features by assigning variable importance levels to each, thereby optimizing the model’s sensitivity to reliable fault indicators. The decision-level fusion method can effectively fuse different types of sensor data, but it places high demands on the fusion strategy, and an unreasonable strategy may degrade the result.
To address the constraints inherent in conventional multiple sensor data fusion techniques, this study introduces a method designed for diagnosing faults that integrates multiple sensor data with multi-scale fusion. The significant contributions of this work include the following.
(1)
An attention-based multi-feature fusion method is proposed, and a multi-layer single-center, multi-branch feature fusion network is constructed to perform efficient integration of features across multiple dimensions by leveraging hierarchical feature extraction and attention weighting.
(2)
A decision-level fusion strategy driven by sample information entropy is introduced, which quantifies the significance of each sensor signal under varying fault conditions by computing the entropy of individual sensor samples. The method assigns weights to the outputs of different channels, enhancing the contribution of informative signals, suppressing the influence of less useful signals, and improving the overall reliability of the final decision.
(3)
An anomaly identification architecture that merges multi-sensor data and multi-scale fusion is proposed for rotating machinery, allowing for the aggregation of information from diverse sensor sources through multi-dimensional feature extraction and deep feature fusion. Additionally, a well-designed decision-level fusion strategy is incorporated to bolster the system’s recognition accuracy and universal adaptability across unseen datasets.
The remainder of this paper is organized as follows. Section 2 describes the fundamental methodologies applied in this study. Section 3 outlines the overall framework and details the implementation of the proposed approach. Section 4 presents the experimental configuration and an evaluation of the findings. Finally, Section 5 summarizes the study’s contributions.

2. Preliminary

2.1. Information Entropy (IE)

Information entropy (IE) is employed to quantify the uncertainty or degree of randomness present in information. The higher the uncertainty of a signal, the greater its information entropy, representing richer or more intricate information. On the contrary, smaller information entropy indicates that the signal is more stable and carries a lower amount of information. In multi-channel fusion fault diagnosis, different channel signals contain different fault information and have different degrees of importance; therefore, the weights of different channel signals are assigned using information entropy, which considers the importance of different channel signals. The mathematical definition of information entropy is
$$H(X) = -\sum_{i=1}^{n} P(x_i) \log_b P(x_i)$$
Here, $H(X)$ denotes the entropy associated with the random variable $X$, $n$ indicates the total count of possible outcomes, $P(x_i)$ denotes the probability of the $i$th possibility, and $b$ is the base of the logarithm, usually taken as 2.
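The definition above can be sketched in a few lines of Python. The entropy function follows the formula directly; the helper `histogram_probs` and its equal-width binning with `n_bins` are assumptions made for illustration, since the text does not state how the probabilities $P(x_i)$ are estimated from a sampled signal.

```python
import math
from collections import Counter


def shannon_entropy(probs, base=2.0):
    """H(X) = -sum_i P(x_i) * log_b P(x_i); zero-probability terms contribute 0."""
    return -sum(p * math.log(p, base) for p in probs if p > 0.0)


def histogram_probs(signal, n_bins=8):
    """Estimate P(x_i) with equal-width bins (hypothetical estimator, not from the paper)."""
    lo, hi = min(signal), max(signal)
    if hi == lo:
        return [1.0]  # a constant signal has a single certain outcome
    width = (hi - lo) / n_bins
    # Clamp the maximum value into the last bin.
    counts = Counter(min(int((v - lo) / width), n_bins - 1) for v in signal)
    return [c / len(signal) for c in counts.values()]


uniform = shannon_entropy([0.25, 0.25, 0.25, 0.25])  # four equally likely outcomes
certain = shannon_entropy([1.0])                     # no uncertainty at all
```

With base 2 the result is in bits: the uniform case is maximally uncertain for four outcomes, while the certain case carries no information.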

2.2. Convolutional Neural Network (CNN)

CNN is a deep feedforward neural network that utilizes local connections and parameter sharing, providing strong feature learning ability and extensive use in fault diagnosis applications. CNN models extract features in a layered manner using multiple convolution and pooling operations, followed by classification to obtain the final diagnosis results. CNN models are commonly classified into 1D, 2D, and 3D structures contingent upon the dimensionality of the input data. The convolutional layer is responsible for feature learning and includes several convolution kernels acting as distinct feature extractors. Each kernel contains trainable weights and bias values, and feature extraction is achieved by convolving the kernels with the input data to produce feature maps, which are then forwarded to subsequent layers. The convolution operation can be formulated as
$$s_k^l = f\left(b_k^l + \sum_{k'=1}^{K} W_{k'k}^{l} * X_{k'}^{l-1}\right)$$
In this context, $l$ signifies the layer index within the network hierarchy, $k$ indexes the output feature map at layer $l$, and $K$ characterizes the dimensionality of the input space, i.e., the number of input feature maps. For raw signal processing, the initial input dimension is unity, and the output dimensionality of layer $l$ is determined by the number of convolutional kernels employed at that layer. The operator $*$ denotes the discrete convolution. The symbols $W$ and $b$ represent the weights and biases that the network needs to learn. Furthermore, $f$ represents the activation function; in this research, the ReLU is utilized to introduce sparsity and accelerate convergence. After the convolutional layer extracts features, the feature map is transmitted to the pooling layer. The pooling layer performs statistical aggregation and information filtering on the feature values in a local area, which downsamples the feature map and produces a more compact representation. Thus, both the computational demand and the memory usage of the model are significantly decreased. In the present work, average pooling is used. The average pooling formula is
$$\bar{X}_i = \frac{1}{p} \sum_{n=i}^{i+p-1} X_n$$
where $p$ represents the size of the pooling window and $\bar{X}$ is the feature vector after average pooling.
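The two operations above can be illustrated with a minimal pure-Python sketch. The function names, the toy signal, and the kernel values are hypothetical; the kernel is slid without flipping (cross-correlation), as is common in deep learning libraries, and the `stride` argument is an assumption that lets the window either slide by one sample or downsample.

```python
def conv1d_relu(channels, kernels, bias):
    """s_k = ReLU(b_k + sum_{k'} W_{k'k} * X_{k'}), 'valid' mode, stride 1.

    channels: list of K input sequences of equal length
    kernels:  kernels[k][k_in] is the 1D kernel linking input k_in to output k
    bias:     one bias value per output feature map
    """
    out = []
    for w_k, b_k in zip(kernels, bias):
        width = len(w_k[0])
        n_out = len(channels[0]) - width + 1
        fmap = []
        for t in range(n_out):
            acc = b_k
            for x, w in zip(channels, w_k):
                acc += sum(w[j] * x[t + j] for j in range(width))
            fmap.append(max(0.0, acc))  # ReLU activation
        out.append(fmap)
    return out


def average_pool(x, p, stride=None):
    """Mean over windows of size p; stride defaults to p (non-overlapping windows)."""
    stride = p if stride is None else stride
    return [sum(x[i:i + p]) / p for i in range(0, len(x) - p + 1, stride)]


signal = [[1.0, 2.0, 3.0, 4.0, 5.0, 6.0]]   # K = 1 input channel
kernels = [[[1.0, -1.0]], [[0.5, 0.5]]]     # two kernels, so two output feature maps
features = conv1d_relu(signal, kernels, bias=[0.0, 0.0])
pooled = [average_pool(f, p=2) for f in features]
```

The first kernel computes a negative difference on this rising ramp, so ReLU zeroes its whole feature map, which is the sparsity effect mentioned above. Calling `average_pool(x, p, stride=1)` gives the sliding-window form of the pooling formula.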

2.3. Channel Attention Mechanism (CAM)

CAM focuses on identifying the contribution of individual channel features and enhancing informative representations. The SENet framework implements this mechanism through two components: a squeeze operation that captures global contextual information and an excitation operation that assigns adaptive weights to each channel. In the squeeze phase, pooling reduces each channel feature map to a scalar value, computed for the ith channel as follows:
$$Z_i = \mathrm{AveragePool}(x_i) = \frac{1}{W} \sum_{j=1}^{W} x_{ij}$$
The above formula is mean pooling, where $Z_i$ denotes the compression value of the $i$th channel, $W$ is the length of the channel feature map, and the compression values of all channels form the compression vector $Z$. The relative significance of individual channels is then learned through model training and calculated as follows:
$$\tilde{Z} = \sigma(F(Z))$$
Here, $\sigma(\cdot)$ is the activation function, $F$ represents the training process of the compressed vector, and $\tilde{Z}$ corresponds to the computed channel importance coefficients, also referred to as the attention vector. In the excitation stage, the features of each channel are multiplied by the attention vector, resulting in a refined set of calibrated attributes. This transformation is represented by the following relationship:
$$\tilde{X} = X \cdot \tilde{Z} = [x_1 \tilde{Z}_1, \ldots, x_i \tilde{Z}_i, \ldots, x_C \tilde{Z}_C]$$
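The squeeze and excitation steps can be sketched as follows. This is a simplified illustration, not the network used in the paper: the learned mapping $F$ is replaced by a single hypothetical linear layer (`weights`, `bias`) followed by a sigmoid, whereas SENet itself uses two fully connected layers.

```python
import math


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


def channel_attention(x, weights, bias):
    """x: C channels, each a list of W values.

    weights (C x C) and bias (length C) stand in for the learned mapping F;
    they are hypothetical parameters chosen for illustration.
    """
    # Squeeze: one scalar per channel (mean over the channel length).
    z = [sum(ch) / len(ch) for ch in x]
    # Attention vector: sigma(F(Z)) with a single linear layer as F.
    z_tilde = [
        sigmoid(sum(w * zj for w, zj in zip(row, z)) + b)
        for row, b in zip(weights, bias)
    ]
    # Recalibration: scale every channel by its attention coefficient.
    x_tilde = [[v * a for v in ch] for ch, a in zip(x, z_tilde)]
    return z, z_tilde, x_tilde


x = [[1.0, 3.0], [2.0, 2.0]]
z, z_tilde, x_tilde = channel_attention(
    x, weights=[[1.0, 0.0], [0.0, -1.0]], bias=[0.0, 0.0]
)
```

Both channels have the same mean here, but the opposite-signed weights give the first channel a coefficient above 0.5 and the second one below 0.5, so the first channel is emphasized after recalibration.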

3. Proposed Method

Addressing the limitations inherent in current multi-sensor fusion paradigms, this study introduces a multi-channel, multi-scale diagnostic framework. The procedure begins with the application of the Fourier Transform to project temporal signals into the frequency domain, enabling the extraction of both time- and frequency-domain attributes to achieve a comprehensive multi-dimensional feature space. These signals are then processed through a CNN to distill high-level representations, utilizing an attention mechanism to refine and augment the fused features. Finally, a decision-level synthesis strategy—governed by the information entropy of sensor samples—is employed. By assigning weights based on the diagnostic relevance of each sensor under specific fault conditions, the framework achieves a robust multi-source integration that enhances both classification precision and system reliability. The overall flow chart is provided in Figure 1.

3.1. Multidimensional Feature Extraction

A multi-dimensional approach is implemented herein to maximize the extraction of diagnostic indicators from multi-sensor streams. By transposing time-domain data into the frequency domain via Fourier analysis, the framework establishes a dual-domain input. A CNN architecture then operates on these concurrent data streams, autonomously learning the intricate patterns associated with mechanical degradation from both temporal and spectral perspectives. The flow is shown in Figure 2.
1D CNNs are well-suited for analyzing time-series sensor data, as they can efficiently capture local signal features regardless of their position. During feature extraction, the signal in the time domain is first transformed into a frequency-based representation using FFT, which reduces the data length by half. To ensure effective fusion of features from different domains, an additional pooling operation with a kernel dimension of 2 is performed on the time-domain features to preserve the feature space length.
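The length bookkeeping described above can be checked with a small sketch. It uses a naive DFT from the standard library in place of an FFT (the values are the same), keeps the first half of the bins, and applies a size-2 average pooling to the time-domain side. The CNN branches are omitted here, so pooling is applied to raw time samples purely to illustrate how the two lengths are matched; the test signal and function names are hypothetical.

```python
import cmath
import math


def one_sided_spectrum(x):
    """Amplitude of the first N/2 DFT bins (naive O(N^2) DFT, same values as an FFT)."""
    n = len(x)
    return [
        abs(sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))) / n
        for k in range(n // 2)
    ]


def average_pool(x, p=2):
    """Non-overlapping mean pooling with window size p."""
    return [sum(x[i:i + p]) / p for i in range(0, len(x) - p + 1, p)]


n = 16
time_signal = [math.cos(2.0 * math.pi * 2 * t / n) for t in range(n)]  # 2 cycles per window
spectrum = one_sided_spectrum(time_signal)     # length n / 2
time_pooled = average_pool(time_signal, p=2)   # length n / 2, matches the spectrum
dominant_bin = max(range(len(spectrum)), key=lambda k: spectrum[k])
```

For a 1024-point sample the same steps would give 512 values on each side, which is what allows the time-domain and frequency-domain features to be fused element by element.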

3.2. Attention-Based Feature Fusion Method

To facilitate the seamless integration of high-level features across disparate dimensions, this research introduces an attention-driven multi-channel fusion architecture. This method first fuses shallow features to obtain preliminary fusion features and then further extracts the depth features of each channel. After feature extraction and fusion of different depths, the deep fusion features are obtained. In the fusion process, an attentional module is used to assign significance to various channel features. The specific fusion process of a single channel is shown in Figure 3.
In the first layer of feature fusion, only the time-domain feature $X \in \mathbb{R}^{O_C \times M}$ and the frequency-domain feature $Y \in \mathbb{R}^{O_C \times M}$ are considered, where $O_C$ represents the number of output feature channels and $M$ represents the dimension of the output feature. The feature fusion process first calculates the attention weight as follows:
$$\alpha_i = \mathrm{CAM}([X_i, Y_i]), \quad 0 < i \le O_C$$
where $\alpha_i$ denotes the attention weight of the $i$th channel of each feature, $\mathrm{CAM}(\cdot)$ represents the channel attention mechanism [18], $X_i = \{x_{i1}, x_{i2}, \ldots, x_{iM}\}$ corresponds to the feature vector associated with the $i$th channel of the time-domain feature, and $Y_i = \{y_{i1}, y_{i2}, \ldots, y_{iM}\}$ defines the feature representation of the $i$th channel of the frequency-domain feature.
Then, attention fusion is performed to obtain fusion features:
$$Z_i = \alpha_{i1} \cdot X_i + \alpha_{i2} \cdot Y_i$$
where $Z_i$ represents the feature after the first-layer fusion, $Z \in \mathbb{R}^{O_C \times M}$, and $\alpha_{i1}$ and $\alpha_{i2}$ represent the attention weighting factors corresponding to the time-domain and frequency-domain features, respectively.
In the fusion process of subsequent layers, additional fusion features need to be considered as follows:
$$\alpha_i = \mathrm{CAM}([X_i, Y_i, Z_i]), \quad 0 < i \le O_C$$
$$Z'_i = \alpha_{i1} \cdot X_i + \alpha_{i2} \cdot Y_i + \alpha_{i3} \cdot Z_i$$
where $Z'_i$ represents the feature of the current layer after fusion, and $\alpha_{i1}$, $\alpha_{i2}$, and $\alpha_{i3}$ represent the attention weight parameters assigned to the temporal, spectral, and fusion feature components, respectively.
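The two fusion layers can be sketched as weighted sums over the per-channel feature vectors. In this illustration the learned CAM is replaced by a hypothetical stand-in (`cam_stand_in`) that squeezes each branch to its mean and normalizes with a softmax; the actual weights in the paper come from a trained attention module, and the normalization used there is not stated. The toy features are also assumptions.

```python
import math


def softmax(v):
    m = max(v)
    e = [math.exp(a - m) for a in v]
    s = sum(e)
    return [a / s for a in e]


def cam_stand_in(branches):
    """Hypothetical replacement for the learned CAM: mean of each branch, then softmax."""
    return softmax([sum(b) / len(b) for b in branches])


def attention_fuse(*features):
    """features: two or three O_C x M feature sets (lists of lists).

    Returns the fused O_C x M features and the per-channel attention weights.
    """
    fused, weights = [], []
    for branches in zip(*features):  # i-th channel of every input feature
        alpha = cam_stand_in(branches)
        weights.append(alpha)
        length = len(branches[0])
        fused.append([sum(a * b[m] for a, b in zip(alpha, branches)) for m in range(length)])
    return fused, weights


X = [[1.0, 1.0], [0.0, 2.0]]   # time-domain features, O_C = 2, M = 2
Y = [[1.0, 1.0], [4.0, 0.0]]   # frequency-domain features
Z, alpha_first = attention_fuse(X, Y)         # first layer: two inputs
Z_next, alpha_next = attention_fuse(X, Y, Z)  # later layer: fused feature joins the inputs
```

In the network, `X` and `Y` in the later layer would be the deeper branch features rather than the same arrays; they are reused here only to keep the sketch short.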

3.3. Decision Fusion Based on Information Entropy

Using multiple sensor signals for collaborative fault diagnosis yields more fault characteristics, but the sensitivity of each sensor to faults differs across fault states. This paper therefore introduces a decision-level fusion approach based on information entropy analysis of sensor data. By quantifying the entropy of the model's input samples, the approach distributes weights over the diagnosis results of the individual sensors, which improves the reliability of multi-sensor collaborative fault diagnosis. The fusion methodology is depicted in Figure 4.
The multi-sensor decision fusion index is constructed guided by information entropy. The input samples of each sensor in the fault diagnosis model at the same time are recorded as $X = \{x_1, x_2, \ldots, x_n\}$, where $x_n = [x_{n1}, x_{n2}, \ldots, x_{nm}]$, $n$ denotes the total number of input signal sensors, and $m$ represents the sample length of the input signal. The fusion feature output by the sample through the feature fusion layer is denoted as $F = \{f_1, f_2, \ldots, f_n\}$, where $f_n = [f_{n1}, f_{n2}, \ldots, f_{nk}]$ and $k$ indicates the dimension of a single sensor fusion feature vector, which is the same as the number of fault categories. The feature weight vector of the decision fusion layer is denoted as $\varphi = [\varphi_1, \varphi_2, \ldots, \varphi_n]$, where $\varphi_n$ is the decision fusion weight of the $n$th sensor fusion feature vector. The calculation process is as follows:
$$IE_n = -\sum_{i=1}^{m} P(x_{ni}) \log_b P(x_{ni})$$
$$\varphi = \mathrm{softmax}([IE_1, IE_2, \ldots, IE_n])$$
where $IE_n$ represents the information entropy of the $n$th sensor sample, and $\mathrm{softmax}(\cdot)$ normalizes the sample information entropy weight to between 0 and 1.
Each fusion feature is then assigned its calculated weight, and the aggregate of these weighted features forms the output of the decision fusion layer.
$$O = [O_1, O_2, \ldots, O_k] = \sum_{i=1}^{n} \varphi_i \cdot f_i$$
where $O_k$ is the probability that the model input sample $X$ belongs to the $k$th class of faults.
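The weighting and aggregation steps can be sketched as below. The entropy values and the per-sensor class scores are hypothetical numbers chosen for illustration; in the method itself the entropies are computed from the sensor input samples and the scores come from the feature fusion layers.

```python
import math


def softmax(v):
    m = max(v)
    e = [math.exp(a - m) for a in v]
    s = sum(e)
    return [a / s for a in e]


def decision_fusion(entropies, outputs):
    """entropies: IE_1..IE_n, one per sensor; outputs: f_1..f_n, each of length k.

    Returns the weights phi and the fused output O = sum_i phi_i * f_i.
    """
    phi = softmax(entropies)
    k = len(outputs[0])
    fused = [sum(w * f[c] for w, f in zip(phi, outputs)) for c in range(k)]
    return phi, fused


ie = [3.0, 1.0]                  # hypothetical sample entropies of two sensors (bits)
f = [[0.7, 0.2, 0.1],            # hypothetical class scores from sensor 1
     [0.3, 0.4, 0.3]]            # hypothetical class scores from sensor 2
phi, O = decision_fusion(ie, f)
predicted_class = max(range(len(O)), key=O.__getitem__)
```

Because the softmax weights sum to 1, the fused vector remains a valid probability distribution whenever each per-sensor output is one, and the sensor with the higher sample entropy dominates the final decision.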

4. Experiments

This research assesses the superiority of the fusion method by analyzing its performance across two diverse scenarios: the benchmark Paderborn dataset and practical industrial data from hydropower machinery. The algorithmic implementation was carried out in Python 3.9 and PyTorch 2.8. The experimental trials were executed on a Windows 10-based hardware platform featuring an Intel Core i7-10700F CPU to ensure consistent processing power.

4.1. Datasets

4.1.1. Paderborn Dataset

The study utilizes a benchmark dataset sourced from the Chair of Design and Drive Technology, University of Paderborn (PU), Paderborn, Germany [19]. The physical testbed comprises a modular assembly including a drive motor, torque sensors, a rolling bearing test housing, a flywheel, and a secondary load motor. This configuration facilitates the concurrent collection of electrical current data and mechanical oscillation measurements from bearings across a diverse range of working conditions. Bearing conditions are categorized into three primary health states, healthy (N), inner race defect (IR), and outer race defect (OR), encompassing diverse damage severities and morphologies. For this research, seven distinct data classes were selected, as detailed in Table 1. In this dataset, both vibration and current signals are sampled at 64 kHz. Each sample consists of 1024 data points, and each category includes 250 samples.

4.1.2. Hydropower Unit Fault Dataset

The dataset is collected from the routine operation of a hydropower station in China. The hydropower unit (HU), a common type of rotating machinery, is extensively used in the energy industry, with its structure illustrated in Figure 5. Given the HU’s complex structure and the presence of significant operational noise, the dataset provides a practical scenario for assessing the robustness, reliability, and performance stability of the method in real-world industrial applications. The experimental verification mainly uses the X-direction and Y-direction vibration data of the water guide bearing of the shaft system. These data are acquired by equal-period sampling, and four different states are used, namely, normal working condition, stator core vibration fault, flow channel blockage fault, and mass imbalance fault. Each category includes 50 samples, each containing 1024 data points. In this paper, Gaussian Kernel Density Estimation (KDE) is employed to estimate the probability density distributions of data samples for the different fault types across the two channels of the HU dataset. The resulting plot is shown in Figure 6. Comparing the distribution patterns of the different fault types intuitively demonstrates the complexity of the hydropower fault data.
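For readers unfamiliar with Gaussian KDE, the following is a minimal sketch of the estimator. The bandwidth rule (`normal_reference_bandwidth`, the 1.06 rule of thumb) and the sample values are assumptions for illustration; the bandwidth actually used for Figure 6 is not stated in the text.

```python
import math


def gaussian_kde(samples, bandwidth):
    """Return f(x) = 1 / (n * h) * sum_i K((x - x_i) / h) with a standard normal kernel K."""
    n = len(samples)
    norm = 1.0 / (n * bandwidth * math.sqrt(2.0 * math.pi))

    def density(x):
        return norm * sum(math.exp(-0.5 * ((x - s) / bandwidth) ** 2) for s in samples)

    return density


def normal_reference_bandwidth(samples):
    """Rule-of-thumb bandwidth h = 1.06 * std * n^(-1/5) (an assumed default)."""
    n = len(samples)
    mean = sum(samples) / n
    std = math.sqrt(sum((s - mean) ** 2 for s in samples) / (n - 1))
    return 1.06 * std * n ** (-0.2)


vibration = [-1.2, -0.4, 0.0, 0.3, 0.5, 1.1]  # hypothetical amplitude values
h = normal_reference_bandwidth(vibration)
pdf = gaussian_kde(vibration, h)
curve = [pdf(-3.0 + 0.1 * i) for i in range(61)]  # density evaluated on a grid for plotting
```

Evaluating one such curve per fault type and channel and overlaying them is one way to produce a comparison of the kind described above.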

4.2. Experiments and Result Analysis

This section includes five experiments, which compare and analyze the proposed method from different perspectives and verify it on the Paderborn dataset and the actual operation dataset of hydropower units. The fault diagnosis model used in this section consists of a central CNN and two branch CNNs for each channel, with the two branch CNNs sharing the same structure. The model architecture and specific parameters are detailed in Table 2. The model is trained using the Adam optimizer with an initial learning rate of 0.001 for 100 epochs, and the training-to-test set ratio is 8:2. To suppress the influence of random initialization and data shuffling, each experiment was executed ten times. The results presented herein are the arithmetic means of these repeated tests, providing a more stable performance assessment.
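The evaluation protocol described above (an 8:2 split and ten repeated runs whose results are averaged) can be sketched as follows. Whether the split is stratified by class is not stated in the text, so the per-class split here is an assumption, as are the function names; `run_once` is a hypothetical callable that would train and test the model for a given seed and return its accuracy.

```python
import random
import statistics


def stratified_split(labels, train_ratio=0.8, seed=0):
    """Split sample indices per class so each class keeps the same train/test ratio."""
    rng = random.Random(seed)
    by_class = {}
    for idx, y in enumerate(labels):
        by_class.setdefault(y, []).append(idx)
    train, test = [], []
    for idxs in by_class.values():
        rng.shuffle(idxs)
        cut = round(train_ratio * len(idxs))
        train += idxs[:cut]
        test += idxs[cut:]
    return train, test


def repeat_and_average(run_once, n_runs=10):
    """Run an experiment with seeds 0..n_runs-1 and report mean and standard deviation."""
    scores = [run_once(seed) for seed in range(n_runs)]
    return statistics.mean(scores), statistics.stdev(scores)


labels = [c for c in range(4) for _ in range(50)]  # HU-like layout: 4 classes x 50 samples
train_idx, test_idx = stratified_split(labels)
```

Under this assumed layout, each class contributes 40 training and 10 test samples.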

4.2.1. Advantage Verification of Time & Frequency Domain Feature Fusion

In an effort to quantify the performance gains of the proposed fusion strategy, multiple systematic tests were performed using the X-direction vibration signals from the water guide bearing and the PU dataset. The model’s sensitivity was tested by alternating the input between time-, frequency-domain, and fused signal representations. The experimental outcomes, summarized in Figure 7, confirm that the combined feature space significantly enhances the model’s diagnostic reliability.
The experimental results indicate that, for both datasets, using either time- or frequency-domain signals individually yields diagnostic accuracies above 83%, demonstrating that both domains contain valuable information about the device state and can partially reflect its condition. In the PU dataset, fusing time-domain and frequency-domain signals as input increases the model accuracy to 93.34%, representing improvements of 3.05% and 8.65% compared with using time-domain and frequency-domain signals alone, respectively. For the HU dataset, the fusion approach achieves 96% accuracy, which is 12.25% and 10.00% higher than the accuracies obtained using time-domain and frequency-domain signals individually. These results show that the diagnostic performance varies depending on the type of input signal and dataset, with the combined time-frequency input consistently providing the best results. This improvement is attributed to the more comprehensive utilization of implicit information in the monitoring signals, as well as the integration of complementary characteristics from both time and frequency domains, leading to higher diagnostic precision across the model.

4.2.2. Advantage Verification of Multi-Sensor Data Feature Fusion

In this section, the PU dataset and the HU dataset are used to validate the benefits of integrating diverse sensory inputs within the proposed model. In the PU dataset, current data, vibration data, and their fusion data are used for the experiments. In the HU dataset, the tests use orthogonal vibration measurements from the water guide bearing. Specifically, the framework uses time-series data captured along the X and Y axes, as well as the fusion data of the two. Figure 8 shows the results obtained from the experiments.
The results demonstrate that sensor characteristics, including their type and position, affect the final diagnostic results. In the PU dataset, the difference in diagnostic accuracy between using current data alone and vibration data alone reaches 26.12%, because the current signal contains less bearing fault information, which lowers the accuracy of a diagnostic model based solely on the current signal. The highest diagnostic accuracy of 91.60% is achieved when using fused data, which is 3.97% higher than that of vibration data alone. In the HU dataset, the diagnostic accuracies using the fused data are 1.75% and 4.75% higher than those using the water guide bearing X- and Y-direction data alone, respectively. In summary, the fusion of different channel signals effectively improves the classification reliability of the diagnostic framework. This is because different sensors reflect the fault characteristics differently in each channel signal. Combining features from multiple channel signals helps prevent the loss of fault information and significantly enhances the accuracy of the diagnosis model.

4.2.3. Advantage Verification of the Proposed Decision Fusion Method

In this section, the benefits offered by the proposed decision-fusion approach based on the entropy of sensor sample information are verified. The dataset and model structure are the same as above, and the proposed approach is compared with several commonly used decision fusion methods.
Strategy 1: An average weighted fusion decision for all outputs:
$$Out = \frac{1}{n} \sum_{i=1}^{n} f_i$$
Strategy 2: Decision fusion by using fully connected layers after splicing all outputs [20]:
$$Out = \mathrm{Linear}([f_1, f_2, \ldots, f_n]), \quad n \times k \rightarrow k$$
Strategy 3: Decision fusion based on DS evidence theory for all outputs.
Each of the above strategies is tested separately, and the accuracy of each strategy is recorded. The results are shown in Table 3. Figure 9 displays the classification confusion matrices under the different strategies. The performance metrics show that adopting any of the standard fusion alternatives reduces classification accuracy. Uniform averaging of the multi-channel outputs is slightly more effective than decision fusion with a fully connected layer, but the classification accuracy of the model under both strategies is low. This is because neither strategy considers the importance of each channel signal to the diagnosis result, resulting in poor decision fusion. Comparing the fault diagnosis accuracies and confusion matrices across the fusion strategies shows that the DS evidence-theory-based strategy outperforms the other two simple decision fusion strategies and fuses information from different sensors effectively. However, because DS evidence theory can produce counter-intuitive or unreliable results, or even fail completely, when handling highly conflicting evidence, its diagnostic performance remains inferior to that of the proposed fusion strategy. A core contribution of this work is a decision-fusion mechanism that accounts for the relationship between multi-modal sensors and their sensitivity to specific fault modes. Through an entropy-based assessment of the input data, the model automatically determines the weighting of channel-specific results, so that the final decision is driven by the most informative streams. Such an approach suppresses the influence of noisy or ambiguous data, thereby improving the accuracy of the diagnostic output across heterogeneous operational conditions.
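The conflict problem attributed to DS evidence theory can be reproduced with a short sketch of Dempster's combination rule. The version below is restricted to singleton hypotheses (each mass sits on exactly one fault class), which is a simplifying assumption, and the class names and mass values are hypothetical; they follow the pattern of Zadeh's well-known counterexample rather than any result from the experiments.

```python
def dempster_combine(m1, m2):
    """Dempster's rule for two mass functions defined on the same singleton classes.

    Returns the combined masses and the conflict coefficient K.
    """
    joint = {c: m1[c] * m2.get(c, 0.0) for c in m1}
    agreement = sum(joint.values())
    if agreement == 0.0:
        raise ValueError("totally conflicting evidence: the rule is undefined")
    conflict = 1.0 - agreement  # valid when each mass function sums to 1 over singletons
    return {c: v / agreement for c, v in joint.items()}, conflict


# Two sensors that almost completely disagree (hypothetical masses).
sensor_1 = {"fault_A": 0.99, "fault_B": 0.01, "fault_C": 0.00}
sensor_2 = {"fault_A": 0.00, "fault_B": 0.01, "fault_C": 0.99}
fused, conflict = dempster_combine(sensor_1, sensor_2)
```

Both sensors regard `fault_B` as very unlikely, yet the combined result assigns it all of the belief because it is the only class neither sensor rules out. This is the kind of counter-intuitive outcome under high conflict that the comparison above refers to.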

4.2.4. Fault Diagnosis Based on the PU Dataset

The performance of the introduced methodology is evaluated against multiple benchmark models using the PU experimental data. The control group includes both traditional and cutting-edge information integration strategies. To determine the efficacy of each framework, diagnostic success is quantified through classification accuracy, precision, and the F1-measure. Furthermore, the consistency and reliability of these methods are scrutinized by analyzing the variance in results across multiple trials. The detailed results are presented in Table 4, and Figure 10 illustrates the feature visualization of the decision layer.
E_CNN, D_CNN, MRSFN, and DRCNN achieve fault classification by adaptively extracting the convolutional features of the input signal. These methods have a relatively simple model structure and do not consider the relevant information between the samples; the method of splicing or simple addition is used to achieve feature fusion, so the diagnostic accuracy is low. In contrast, the MsfHGNN, AMMFN, and AMDC_CNN methods show good results in the PU dataset. MsfHGNN can capture the high-order correlation between samples beyond the pairwise relationship. AMMFN can effectively extract complementary information between signals. AMDC_CNN adaptively extracts features through branch networks and has a reasonable feature fusion structure. However, these three models only consider sufficient multi-sensor feature fusion, ignoring the correlations between different sensors and different fault states, resulting in insufficient model accuracy. The indicators of the model proposed in this paper in the state classification of the PU dataset are better than other comparison models, and the deviation is the smallest. This is because the proposed model combines the depth features of time domain and frequency domain signals in multiple sensors more effectively, and considers the importance of different sensor signals. The corresponding weights are given in the decision fusion, which further improves the fault recognition ability of the diagnosis model.

4.2.5. Fault Diagnosis Based on the HU Dataset

This section evaluates the model’s applicability and reliability in a real industrial setting using the HU fault dataset. The results, summarized in Table 5, show that the proposed architecture surpasses every benchmark strategy on every performance indicator. Specifically, it attains a classification accuracy of 99.25%, roughly 3 to 10 percentage points higher than the alternative models (2.75 points above AMMFN and 9.50 points above DRCNN). These results indicate that the model can extract and integrate the deep features of multi-sensor data from actual HU monitoring data and achieve high fault diagnosis accuracy. It is therefore suitable for state identification in intelligent operation and maintenance systems for HUs and has good prospects for broader application.
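The size of the improvement can be checked directly from the mean accuracies in Table 5. The short calculation below copies those means and reports the margin over each comparison model; the variable names are illustrative only.

```python
# Mean accuracies (%) copied from Table 5 (HU dataset).
baseline_accuracy = {
    "AMDC_CNN": 95.75,
    "E_CNN": 95.50,
    "D_CNN": 94.75,
    "AMMFN": 96.50,
    "MsfHGNN": 90.50,
    "MRSFN": 90.75,
    "DRCNN": 89.75,
}
proposed_accuracy = 99.25

# Margin of the proposed model over each baseline, in percentage points.
margins = {name: round(proposed_accuracy - acc, 2)
           for name, acc in baseline_accuracy.items()}
smallest = min(margins.values())
largest = max(margins.values())
print(f"margin range: {smallest:.2f} to {largest:.2f} percentage points")
```

The margins are differences in percentage points between mean accuracies and do not account for the standard deviations listed in the table.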
The features extracted by each model are visualized after dimensionality reduction, and the results are shown in Figure 11. The reduced features of the proposed model separate clearly and accurately by class. In contrast, the reduced features of the comparison models are distributed more chaotically, with significant overlap between different fault classes. This overlap makes it difficult to distinguish samples accurately and reduces the accuracy of the diagnostic model.
The experimental results show that the model presented in this study has obvious advantages in the state recognition of hydropower units. It can extract and fuse the deep features of multiple sensors and multiple dimensions from the monitoring data of rotating machinery in engineering practice and can achieve higher diagnostic accuracy.
For the HU dataset, we visualized the weight vectors produced by the last attention layer of the network for each sensor. The results are shown in Figure 12. The horizontal axis indexes the samples, with every 40 consecutive samples belonging to one category. The vertical axis represents the three channels: 0 denotes fused features, 1 denotes time-domain features, and 2 denotes frequency-domain features. For most samples in the X direction, frequency-domain features receive the largest share of the weights, while fused features receive a smaller share. Similarly, for Y-direction samples in the normal working condition, flow channel blockage fault, and mass imbalance fault categories, frequency-domain features dominate the weights. However, for samples in the stator core vibration fault category, the weights across the three channels are approximately equal. This indicates that, under this specific fault mode, features from different levels contribute similarly to fault discrimination.
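A per-category summary of such weights can be sketched as follows. The block size of 40 and the channel order come from the description above; the weight values themselves are synthetic placeholders, not the weights shown in Figure 12, and the helper names are hypothetical.

```python
from statistics import mean

SAMPLES_PER_CLASS = 40  # every 40 consecutive samples share one category
CHANNELS = ("fused", "time", "frequency")  # channel indices 0, 1, 2

def class_mean_weights(weights, samples_per_class=SAMPLES_PER_CLASS):
    """weights: per-sample [w_fused, w_time, w_freq]; returns one mean vector per class."""
    result = []
    for start in range(0, len(weights), samples_per_class):
        block = weights[start:start + samples_per_class]
        result.append([mean(sample[ch] for sample in block)
                       for ch in range(len(CHANNELS))])
    return result

def dominant_channel(mean_weights):
    """Name of the channel with the largest mean attention weight."""
    best = max(range(len(CHANNELS)), key=lambda ch: mean_weights[ch])
    return CHANNELS[best]

# Synthetic weights for two categories (placeholder values only).
weights = [[0.20, 0.30, 0.50]] * 40 + [[0.33, 0.33, 0.34]] * 40
summary = class_mean_weights(weights)
for index, row in enumerate(summary):
    spread = max(row) - min(row)
    print(index, dominant_channel(row), f"spread={spread:.2f}")
```

A small spread between the largest and smallest mean weight corresponds to the near-equal contribution described for the stator core vibration fault, whereas a large spread marks a category dominated by one channel.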
To assess the efficiency of the proposed method on actual hydropower unit data, this paper reports the number of trainable parameters, the training time per epoch, and the model inference time. The results are presented in Table 6. Although the model has a large number of trainable parameters (23,442), both the training time (1.2727 s) and the forward inference time (0.0442 s) are short. Even with an increase in training data and epochs, the inference duration remains essentially constant. This indicates that the model can adequately meet the real-time requirements of field applications.
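The two kinds of quantity reported here, parameter count and wall-clock time, can be measured with very little code. The sketch below is framework-independent and purely illustrative: `count_parameters`, `time_call`, the tensor shapes, and the stand-in workload are assumptions, not the measurement procedure behind Table 6.

```python
import time

def count_parameters(tensor_shapes):
    """Sum the element counts of every weight and bias tensor shape."""
    total = 0
    for shape in tensor_shapes:
        size = 1
        for dim in shape:
            size *= dim
        total += size
    return total

def time_call(fn, repeats=5):
    """Mean wall-clock seconds of fn() over several runs."""
    start = time.perf_counter()
    for _ in range(repeats):
        fn()
    return (time.perf_counter() - start) / repeats

# Hypothetical 1-D convolution: 16 filters, 1 input channel, kernel size 3, plus bias.
conv_shapes = [(16, 1, 3), (16,)]
num_params = count_parameters(conv_shapes)

# Stand-in for a forward pass; replace with the real inference call.
elapsed = time_call(lambda: sum(i * i for i in range(10_000)))
print(num_params, f"{elapsed:.6f} s")
```

Averaging over several runs reduces the influence of one-off delays such as cache warm-up on the reported time.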

5. Conclusions

A multi-source, multi-scale fusion paradigm is proposed to address the challenges of hydropower unit diagnostics. The methodology employs an FFT-CNN front-end to distill salient frequency and temporal characteristics, followed by an attention-driven fusion layer that ensures comprehensive feature interaction at various structural levels. To further refine the diagnostic output, a decision fusion mechanism based on information entropy is introduced, which weights the contribution of each sensor according to its informational stability. Experimental results from both the laboratory (PU) dataset and hydropower units (HUs) demonstrate that the proposed method achieves a more robust synthesis of multi-sensor data than traditional models. The study confirms that this integrated approach achieves high-precision fault identification, marking a significant advancement in the intelligent operation and maintenance of large-scale mechanical systems.
Currently, this study primarily considers data fusion for a small number of sensors. As the number of sensors increases significantly, the computational complexity of the model will inevitably rise. Therefore, further research is needed on fast fusion algorithms for scenarios with a large number of sensors. In addition, when fusing data from different sensors, the model does not consider the correlations between sensors in physical space, lacking mechanism-guided data fusion. Consequently, it is also necessary to conduct research on fusion mechanism analysis and mechanism-guided, data-driven multi-sensor data fusion strategies.

Author Contributions

Conceptualization, X.X.; methodology, D.Z. and C.L.; software, D.Z. and X.X.; validation, D.Z. and X.X.; formal analysis, D.Z. and X.X.; data curation, D.Z. and X.X.; writing—original draft preparation, D.Z.; writing—review and editing, X.X. and C.L.; supervision, C.L.; funding acquisition, X.X. and C.L. All authors have read and agreed to the published version of the manuscript.

Funding

The authors acknowledge financial support from the National Natural Science Foundation of China (No. U23B20143), the National Natural Science Foundation of China (No. 52279085), the Hubei Provincial Natural Science Foundation of China (No. 2023AFD186), and the China Postdoctoral Science Foundation (No. 2025M773167) for the research, authorship, and publication of this article.

Data Availability Statement

The raw data supporting the conclusions of this article will be made available by the authors on request because they involve the parameters and full characteristic data of the actual hydropower station.

Conflicts of Interest

The authors declare no conflicts of interest.

References

  1. Xu, X.; Deng, J.; Lin, H.; Li, Z.; Wen, H. Lightweight Anomalous Detection of Hydro Turbine Operation Sound Using Fusion Network Enhanced by Load Information. IEEE Trans. Instrum. Meas. 2025, 74, 9600213.
  2. Liu, Y.; Xu, Y.; Liu, J.; Niu, X. A Hydraulic Turbine Fault Diagnosis Method Based on Synchrosqueezed Wavelet Transform and SE-ResNet. Water 2025, 17, 447.
  3. Liu, J.; Xie, F.; Zhang, Q.; Lyu, Q.; Wang, X.; Wu, S. A multisensory time-frequency features fusion method for rotating machinery fault diagnosis under nonstationary case. J. Intell. Manuf. 2024, 35, 3197–3217.
  4. He, C.; Shi, H.; Li, R.; Li, J.; Yu, Z. Interpretable modulated differentiable STFT and physics-informed balanced spectrum metric for freight train wheelset bearing cross-machine transfer fault diagnosis under speed fluctuations. Adv. Eng. Inform. 2024, 62, 102568.
  5. Zhang, D.; Xie, M.; Hamadache, M.; Entezami, M.; Stewart, E. An Adaptive Graph Morlet Wavelet Transform for Railway Wayside Acoustic Detection. J. Sound Vib. 2022, 529, 116965.
  6. Wang, Z.; Zhang, Q.; Xiong, J.; Xiao, M.; Sun, G.; He, J. Fault Diagnosis of a Rolling Bearing Using Wavelet Packet Denoising and Random Forests. IEEE Sens. J. 2017, 17, 5581–5588.
  7. Wang, B.; Qiu, W.; Hu, X.; Wang, W. A rolling bearing fault diagnosis technique based on recurrence quantification analysis and Bayesian optimization SVM. Appl. Soft Comput. 2024, 156, 111506.
  8. Song, B.; Liu, Y.; Fang, J.; Liu, W.; Zhong, M.; Liu, X. An optimized CNN-BiLSTM network for bearing fault diagnosis under multiple working conditions with limited training samples. Neurocomputing 2024, 574, 127284.
  9. Jiang, G.; Xie, P.; He, H.; Yan, J. Wind Turbine Fault Detection Using a Denoising Autoencoder with Temporal Information. IEEE/ASME Trans. Mechatron. 2018, 23, 89–100.
  10. Li, Q.; Qin, L.; Xu, H.; Lin, Q.; Qin, Z.; Chu, F. Transparent information fusion network: An explainable network for multi-source bearing fault diagnosis via self-organized neural-symbolic nodes. Adv. Eng. Inform. 2025, 65, 103156.
  11. Li, X.; Wang, Y.; Yao, J.; Li, M.; Gao, Z. Multi-sensor fusion fault diagnosis method of wind turbine bearing based on adaptive convergent viewable neural networks. Reliab. Eng. Syst. Saf. 2024, 245, 109980.
  12. Bai, R.; Xu, Q.; Meng, Z.; Cao, L.; Xing, K.; Fan, F. Rolling bearing fault diagnosis based on multi-channel convolution neural network and multi-scale clipping fusion data augmentation. Measurement 2021, 184, 109885.
  13. Xie, T.; Huang, X.; Choi, S.K. Intelligent Mechanical Fault Diagnosis Using Multisensor Fusion and Convolution Neural Network. IEEE Trans. Ind. Inform. 2022, 18, 3213–3223.
  14. Cui, X.; Wu, Y.; Zhang, X.; Huang, J.; Wong, P.K.; Li, C. A Novel Fault Diagnosis Method for Rotor-Bearing System Based on Instantaneous Orbit Fusion Feature Image and Deep Convolutional Neural Network. IEEE/ASME Trans. Mechatron. 2023, 28, 1013–1024.
  15. Jiao, J.; Zhao, M.; Lin, J.; Ding, C. Deep Coupled Dense Convolutional Network with Complementary Data for Intelligent Fault Diagnosis. IEEE Trans. Ind. Electron. 2019, 66, 9858–9867.
  16. Yang, X.; Fang, C.; Kundu, P.; Yang, J.; Chronopoulos, D. A decision-level sensor fusion scheme integrating ultrasonic guided wave and vibration measurements for damage identification. Mech. Syst. Signal Process. 2024, 219, 111597.
  17. Xu, Y.; Wang, T.; Diallo, D.; Amirat, Y. A confidence-guided DS fault diagnosis method for tidal stream turbines blade. Ocean Eng. 2024, 311, 118807.
  18. Hu, J.; Shen, L.; Sun, G. Squeeze-and-Excitation Networks. In Proceedings of the 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA, 18–23 June 2018; pp. 7132–7141.
  19. Li, X.; Wan, S.; Liu, S.; Zhang, Y.; Hong, J.; Wang, D. Bearing fault diagnosis method based on attention mechanism and multilayer fusion network. ISA Trans. 2022, 128, 550–564.
  20. Xiao, X.; Li, C.; He, H.; Huang, J.; Yu, T. Rotating machinery fault diagnosis method based on multi-level fusion framework of multi-sensor information. Inf. Fusion 2025, 113, 102621.
  21. Wang, D.; Li, Y.; Jia, L.; Song, Y.; Liu, Y. Novel Three-Stage Feature Fusion Method of Multimodal Data for Bearing Fault Diagnosis. IEEE Trans. Instrum. Meas. 2021, 70, 3514710.
  22. Liu, Y.; Yan, X.; Zhang, C.-a.; Liu, W. An Ensemble Convolutional Neural Networks for Bearing Fault Diagnosis Using Multi-Sensor Data. Sensors 2019, 19, 5300.
  23. Jing, L.; Wang, T.; Zhao, M.; Wang, P. An Adaptive Multi-Sensor Data Fusion Method Based on Deep Convolutional Neural Networks for Fault Diagnosis of Planetary Gearbox. Sensors 2017, 17, 414.
  24. Yan, X.; Shi, Z.; Sun, Z.; Zhang, C.A. Multisensor Fusion on Hypergraph for Fault Diagnosis. IEEE Trans. Ind. Inform. 2024, 20, 10008–10018.
  25. Wang, J.; Fu, P.; Zhang, L.; Gao, R.X.; Zhao, R. Multilevel Information Fusion for Induction Motor Fault Diagnosis. IEEE/ASME Trans. Mechatron. 2019, 24, 2139–2150.
  26. Niu, G.; Liu, E.; Wang, X.; Ziehl, P.; Zhang, B. Enhanced Discriminate Feature Learning Deep Residual CNN for Multitask Bearing Fault Diagnosis with Information Fusion. IEEE Trans. Ind. Inform. 2023, 19, 762–770.
Figure 1. The structure and process of the proposed model.
Figure 2. The process of multidimensional feature extraction.
Figure 3. The steps of the proposed feature fusion method.
Figure 4. The specific steps of the proposed decision fusion.
Figure 5. Schematic diagram of the turbine structure and measuring point distribution [14].
Figure 6. Probability Density Distribution of Hydropower Unit Data. (a) Normal Condition—X Direction; (b) Stator Core Vibration Fault—X Direction; (c) Flow Channel Blockage Fault—X Direction; (d) Flow Channel Blockage Fault—X Direction; (e) Normal Condition—Y Direction; (f) Stator Core Vibration Fault—Y Direction; (g) Flow Channel Blockage Fault—Y Direction; (h) Flow Channel Blockage Fault—Y Direction.
Figure 7. The diagnostic results of different signals as input under two datasets.
Figure 8. The diagnosis results of different channel signal inputs under two datasets. (a) PU dataset. (b) HU dataset.
Figure 9. The confusion matrix of each fusion strategy under two datasets. (a) Proposed Strategy under PU. (b) Strategy 1 under PU. (c) Strategy 2 under PU. (d) Strategy 3 under PU. (e) Proposed Strategy under HU. (f) Strategy 1 under HU. (g) Strategy 2 under HU. (h) Strategy 3 under HU.
Figure 10. Visualization of different model results under the PU dataset. (a) AMDC_CNN. (b) AMMFN. (c) D_CNN. (d) E_CNN. (e) MsfHGNN. (f) MRSFN. (g) DRCNN. (h) Proposed.
Figure 11. Visualization of different model results under the HU dataset. (a) AMDC_CNN. (b) AMMFN. (c) D_CNN. (d) E_CNN. (e) MsfHGNN. (f) MRSFN. (g) DRCNN. (h) Proposed.
Figure 12. Visualization of attention weights in the last layer of the network. (a) Data in the X direction, (b) Data in the Y direction.
Table 1. Detailed settings for the PU dataset.
| Bearing Code | Label | State | Extent of Damage | Damage Method |
|---|---|---|---|---|
| K003 | 0 | N | - | - |
| KA01 | 1 | OR | 1 | EDM |
| KA03 | 2 | OR | 2 | Electric engraver |
| KA05 | 3 | OR | 1 | Electric engraver |
| KI01 | 4 | IR | 1 | EDM |
| KI03 | 5 | IR | 1 | Electric engraver |
| KI07 | 6 | IR | 2 | Electric engraver |
Table 2. Model parameters.
| Network | Network Layer | Parameters | Output Size |
|---|---|---|---|
| Branch Network | Input | - | 1, 1024 |
| | Conv/Maxp/BN/ReLU | C = 16, Ck = 3, Cs = 1, Pk = 3, Pk = 2 | 16, 170 |
| | Conv/Maxp/BN/ReLU | C = 32, Ck = 3, Cs = 1, Pk = 3 | 32, 56 |
| | Conv/Maxp/BN/ReLU | C = 32, Ck = 3, Cs = 1, Pk = 3 | 32, 18 |
| | Conv/Maxp/BN/ReLU | C = 16, Ck = 3, Cs = 1, Pk = 3 | 16, 5 |
| Central Network | Conv/Maxp/BN/ReLU | C = 32, Ck = 3, Cs = 1, Pk = 3 | 32, 56 |
| | Conv/Maxp/BN/ReLU | C = 32, Ck = 3, Cs = 1, Pk = 3 | 32, 18 |
| | Conv/Maxp/BN/ReLU | C = 16, Ck = 3, Cs = 1, Pk = 3 | 16, 5 |
| | Flatten/Dense/ReLU | U = 24 | 24 |
| | Dense/Softmax | U = class_num | class_num |
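The output lengths listed in Table 2 can be cross-checked with the standard length formula for an unpadded 1-D convolution followed by max pooling. The sketch below is one reading that reproduces the listed sizes, under assumptions that the table does not state explicitly: no padding, a pooling stride equal to the pooling kernel, and an effective convolution stride of 2 in the first block (the first row lists "Pk" twice, and the value 2 is interpreted here as a stride).

```python
def conv_pool_length(length, conv_kernel=3, conv_stride=1, pool_kernel=3):
    """Length after an unpadded 1-D convolution and non-overlapping max pooling."""
    after_conv = (length - conv_kernel) // conv_stride + 1
    return (after_conv - pool_kernel) // pool_kernel + 1

def branch_lengths(input_length=1024):
    """Feature-map length after each of the four Conv/Maxp blocks of a branch."""
    # Assumption: stride 2 in the first block only, stride 1 afterwards.
    conv_strides = [2, 1, 1, 1]
    lengths = []
    length = input_length
    for stride in conv_strides:
        length = conv_pool_length(length, conv_stride=stride)
        lengths.append(length)
    return lengths

lengths = branch_lengths()
print(lengths)
```

Under these assumptions the sequence of lengths is 170, 56, 18, and 5, matching the second entry of each output size in the branch network rows. Other stride combinations can yield the same first length, so this check supports the listed sizes without identifying the exact layer configuration.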
Table 3. The Accuracy of Each Strategy under Two Datasets.
| Fusion Decision | Proposed Strategy | Strategy 1 | Strategy 2 | Strategy 3 |
|---|---|---|---|---|
| Accuracy (PU) | **96.94 ± 0.85** | 94.37 ± 1.90 | 93.35 ± 2.19 | 96.34 ± 1.33 |
| Accuracy (HU) | **99.00 ± 1.22** | 94.25 ± 3.17 | 92.75 ± 2.61 | 97.00 ± 2.92 |
Note: Data in bold indicates the best results.
Table 4. The Accuracy of Each Strategy under PU Dataset.
| Method | Accuracy (%) | Precision (%) | F1 Score (%) |
|---|---|---|---|
| AMDC_CNN [21] | 93.91 ± 1.42 | 93.90 ± 0.52 | 93.81 ± 0.34 |
| E_CNN [22] | 93.42 ± 0.53 | 93.24 ± 0.31 | 93.42 ± 0.52 |
| D_CNN [23] | 93.63 ± 0.81 | 93.26 ± 0.76 | 93.36 ± 0.56 |
| AMMFN [19] | 94.62 ± 1.10 | 93.83 ± 0.49 | 93.87 ± 0.98 |
| MsfHGNN [24] | 95.26 ± 0.81 | 95.55 ± 0.72 | 95.25 ± 0.80 |
| MRSFN [25] | 91.09 ± 1.39 | 91.28 ± 1.38 | 91.28 ± 1.39 |
| DRCNN [26] | 85.63 ± 3.35 | 85.82 ± 3.25 | 85.65 ± 3.33 |
| Proposed | **97.48 ± 0.34** | **96.74 ± 0.32** | **96.42 ± 0.29** |
Note: Data in bold indicates the best results.
Table 5. The Accuracy of Each Strategy under HU Dataset.
| Method | Accuracy (%) | Precision (%) | F1 Score (%) |
|---|---|---|---|
| AMDC_CNN | 95.75 ± 1.21 | 95.00 ± 1.29 | 95.00 ± 1.05 |
| E_CNN | 95.50 ± 1.05 | 95.75 ± 1.69 | 95.75 ± 1.21 |
| D_CNN | 94.75 ± 0.79 | 94.75 ± 1.84 | 93.75 ± 1.77 |
| AMMFN | 96.50 ± 1.29 | 97.25 ± 1.84 | 96.50 ± 1.75 |
| MsfHGNN | 90.50 ± 1.87 | 91.15 ± 1.98 | 90.45 ± 1.85 |
| MRSFN | 90.75 ± 2.97 | 91.00 ± 3.16 | 90.70 ± 2.97 |
| DRCNN | 89.75 ± 2.36 | 90.70 ± 2.25 | 89.95 ± 2.33 |
| Proposed | **99.25 ± 1.21** | **99.38 ± 1.00** | **99.28 ± 1.16** |
Note: Data in bold indicates the best results.
Table 6. Real-time performance parameters of the model.
| Number of Trainable Parameters | Train Time (s) | Test Time (s) |
|---|---|---|
| 23,442 | 1.2727 | 0.0442 |
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

MDPI and ACS Style

Zhou, D.; Xiao, X.; Li, C. Fault Diagnosis for Hydropower Units Based on Multi-Sensor Data with Multi-Scale Fusion. Water 2026, 18, 915. https://doi.org/10.3390/w18080915
