1. Introduction
With consumers demanding more aquatic products of higher quality, freshwater fisheries need to accelerate their transformation through scientific and technological innovation [1,2]. The global market share of industrialized recirculating aquaculture systems (IRASs) has steadily increased owing to their controllable environment, ease of automation, and operational intelligence [3,4]. In a typical recirculating aquaculture plant, the water pump keeps the water body circulating, providing a suitable living environment for farmed organisms. Detecting equipment failure efficiently and safeguarding this pump’s long-term stable operation are therefore essential to the plant.
Common faults of water pumps include impeller damage, blockage, motor burnout due to phase loss, and bearing wear and corrosion. In IRASs, however, impeller wear rarely occurs because the water contains no sediment and most debris, such as feces and residual feed, is adequately filtered by the feces collector, micro-filter, and biological filter barrel. Likewise, since the control cabinet is equipped with phase-loss protection, motor burnout due to phase loss is also relatively rare. The most common fault is bearing failure of the water pump, which can take several forms (Table 1).
As seen from Table 1, the most direct manifestation of bearing faults is an altered vibration signal, which is mainly expressed in two ways:
- (1)
Increase in vibration amplitude: Under normal circumstances, the water pump bearing operates stably, with its vibration amplitude remaining at a relatively low and stable level. When the bearing incurs wear and corrosion, its clearance will increase, loosening the tight fit between the raceway and the balls or rollers. During the operation process, additional impact forces and vibrations are generated, substantially increasing the vibration amplitude.
- (2)
Change in vibration frequency: When there is local damage to the bearing, such as one or more cracks on the surface of its balls or pits on the raceway, specific frequency components related to the bearing’s own structure and faults are generated. If the fit between the bearing and the shaft or the bearing seat becomes loose, this will decrease the vibration frequency.
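The two symptoms above can be made concrete with a small numerical sketch. It is not the detection method of this paper: it only shows one plausible way to quantify amplitude (root-mean-square value) and frequency content (strongest bin of a naive discrete Fourier transform). The function names and the synthetic test signal are illustrative assumptions.

```python
import cmath
import math


def rms(x):
    """Root-mean-square amplitude of a vibration record."""
    return math.sqrt(sum(v * v for v in x) / len(x))


def dominant_frequency(x, rate_hz):
    """Frequency (Hz) of the strongest non-DC bin of a naive O(n^2) DFT."""
    n = len(x)
    mean = sum(x) / n
    best_k, best_mag = 0, 0.0
    for k in range(1, n // 2 + 1):
        coeff = sum((x[i] - mean) * cmath.exp(-2j * math.pi * k * i / n)
                    for i in range(n))
        if abs(coeff) > best_mag:
            best_k, best_mag = k, abs(coeff)
    return best_k * rate_hz / n


# Synthetic example (assumed values): a 50 Hz sine sampled at 1 kHz.
signal = [math.sin(2 * math.pi * 50 * i / 1000) for i in range(200)]
print(rms(signal), dominant_frequency(signal, rate_hz=1000))
```

A rising RMS value would correspond to symptom (1), and a shift in the dominant frequency or the appearance of new spectral peaks to symptom (2). For long records a fast Fourier transform would replace the quadratic loop.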
In recent years, much research has emerged on equipment fault detection. Zan et al. [
5] proposed a detection method using the variational mode decomposition (VMD) algorithm for early rolling bearing faults. In order to better extract effective features, Liu et al. [
6] introduced a method of constructing features from the time-domain energy entropy values of modal components, then input those into a support vector machine (SVM) for the identification of rolling bearing faults. In another work, Hou et al. [
7] took advantage of the multi-view manifold component feature extraction (MvMCFE) technique’s superiority in nonlinear feature extraction, to propose a fault detection method that combined it with an optimized SVM. Li et al. [
8] used the particle swarm optimization (PSO) algorithm to optimize the SVM for the detection of motor bearing faults, while Wang et al. [
9] used the random forest algorithm to detect faults in charging pile equipment. Using the K-nearest neighbors (KNN) algorithm as a basis, Yang et al. [
10] proposed a fault diagnosis algorithm and monitoring system for the bearings of hydraulic turbine units. Finally, Lu et al. [
11] combined the KNN algorithm and the Naive Bayes algorithm to effectively diagnose the faults of rolling bearings.
Most of the studies cited above relied on traditional machine learning methods, which can perform well at fault detection. Still, the fault features must be extracted manually, which leaves them exposed to subjective human judgment. Deep learning methods have been explored to avoid this step. Anupam et al. [12] used Complete Ensemble Empirical Mode Decomposition with Adaptive Noise (CEEMDAN) to decompose the data, and then fused the long short-term memory (LSTM) and gated recurrent unit (GRU) neural networks to achieve accurate gear fault identification. Likewise, Jiang et al. [13] suggested an equipment fault detection method based on knowledge graphs and task learning, while Ma et al. [14] used the attention mechanism-based gated recurrent unit (AM-GRU) network to detect the faults of intelligent substation equipment. Hao et al. [15] presented a multi-sensor method combining one-dimensional convolution with LSTM for the successful diagnosis of equipment faults. Wang et al. [16] employed the Empirical Mode Decomposition (EMD) method to preprocess raw Magnetic Flux Leakage (MFL) signals and designed a wire rope defect diagnosis network based on the CNN-transformer. Finally, Cao et al. [17] integrated the core advantages of the LSTM and transformer architectures for real-time prediction and fault detection in engineering system tasks.
The above deep learning algorithms can automatically extract fault features and demonstrate high recognition accuracy. Yet, most of them are based on one-dimensional signals and lack features integrating the time domain with the frequency domain. To that end, Li et al. [
18] adopted a multi-receptive field graph convolutional network: when extracting features, it integrates the information of adjacent nodes and fuses its own parameters, thereby improving the network’s overall performance. Tian et al. [
19] developed a bearing fault diagnosis method that combines a frequency-domain feature extraction with a dual-stream convolutional neural network (CNN). Mo et al. [
20] proposed a lightweight CNN network that fuses one- and two-dimensional features for equipment fault diagnosis; by researching the feature fusion of the time–frequency domain, those authors improved the model’s performance.
To sum up, equipment fault detection methods are steadily improving and approaching maturity. However, their application is still mostly restricted to standard industries, and their potential in the field of IRASs remains virtually unexplored. Moreover, the environment of recirculating aquaculture factories is quite complex, and various factors, such as interference between equipment, water flow, impurities, and corrosion, could impair the fault detection of their water pumps. To tackle these problems, this paper proposes an integrated convolutional neural network, transformer, and bidirectional gated recurrent unit (CNN-transformer-BiGRU) model, whose key characteristics are anti-interference and high precision. This method first uses the continuous wavelet transform to convert one-dimensional vibration signals into time–frequency images. Then, it takes advantage of the CNN’s powerful ability to process images to extract local features in both the time and frequency domains. Next, it uses the encoder of the transformer to improve the model’s hierarchical learning ability for features. Finally, it captures the preceding and following feature information in the vibration signal sequence via the BiGRU, to achieve the accurate detection of water pump faults under complex working conditions and background noise.
2. Materials and Methods
2.1. Network Model Structure
In the field of IRASs, a new network model (CNN-transformer-BiGRU) that is applied to the task of water pump fault detection is thus proposed. Its overall structure is illustrated in
Figure 1.
To detect the water pump’s faults, its vibration signal data were collected via sensors, and then processed by normalization and resampling. To convert these one-dimensional vibration signals into image data that integrate the time and frequency domains, the continuous wavelet transform was used; a CNN then preprocessed the input data and extracted local features; after that, the data were fed into the transformer for global feature modeling. This output served as input into the bidirectional gated recurrent unit (BiGRU) network model for sequence modeling and processing. Finally, the fully connected layer and Softmax function output the classification results.
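The first transformation in this pipeline, from a one-dimensional signal to a time–frequency image, can be sketched as follows. This is a minimal direct-summation continuous wavelet transform in plain Python, not the code used in the study. It assumes that the wavelet name `cmor1-1` given in Section 2.5 denotes a complex Morlet wavelet with bandwidth 1 and center frequency 1, which matches the naming convention of the PyWavelets library. The function names, the scale values, and the test signal are illustrative.

```python
import cmath
import math


def cmor(u, bandwidth=1.0, center=1.0):
    """Complex Morlet wavelet: Gaussian envelope times a complex exponential."""
    envelope = math.exp(-(u * u) / bandwidth) / math.sqrt(math.pi * bandwidth)
    return envelope * cmath.exp(2j * math.pi * center * u)


def cwt_magnitude(signal, scales, dt=1.0):
    """Return |CWT(a, b)| as a len(scales) x len(signal) grid of magnitudes."""
    n = len(signal)
    image = []
    for a in scales:
        # Conjugated, dilated wavelet sampled at every possible lag (k - b).
        kernel = {lag: cmor(lag * dt / a).conjugate()
                  for lag in range(-(n - 1), n)}
        row = []
        for b in range(n):
            acc = sum(signal[k] * kernel[k - b] for k in range(n))
            row.append(abs(acc) * dt / math.sqrt(a))
        image.append(row)
    return image


# Assumed test signal: a cosine with a period of 8 samples.
signal = [math.cos(2 * math.pi * k / 8) for k in range(128)]
image = cwt_magnitude(signal, scales=[4, 8, 16])
```

With a center frequency of 1, a scale of `a` samples responds most strongly to oscillations with a period of about `a` samples, so the row for scale 8 carries the largest magnitudes for this signal. A library implementation based on the fast Fourier transform would be preferable for full-length records, since the direct sum costs O(n²) per scale.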
2.2. Sensor Selection
Through the analysis of potential fault causes, vibration signals of the water pump were monitored and its vibration data under different working conditions were obtained for experimental analysis. To collect these data, we chose the HG-ZD-20B integrated vibration sensor, whose detailed parameters are listed in
Table 2.
The sensor was installed on the circulating water pump in the IRAS test area, using a magnetic suction base to affix it to the pump’s metal shell. The water pump model is 50WQ9-22-2.2, with a rated voltage of 380 V and a rated speed of 3000 r/min. This magnetic suction method is convenient for installing and quickly positioning the sensor at various monitoring points. By comparing different monitoring points, more distinctive signal data could be obtained.
2.3. Data Acquisition
The experimental data used in this paper were collected on-site in the IRAS test area of the Yangdu Base, Zhejiang Academy of Agricultural Sciences. The sensor was connected to the acquisition card, and in turn, the latter was connected to the computer for data collection and storage. HK_USB_DAQ is a high-speed data acquisition card based on the USB bus, with a maximum sampling rate of up to 100 kHz (
Table 3).
We used the Python 3.8.3 programming language to develop the data collection functions. Since the vibration signal data of the water pump need to be continually recorded multiple times, the continuous collection function interface was used. The continuous Analog-to-Digital (AD) collection process is depicted in
Figure 2, with the obtained vibration signal data then written into the disk file.
The vibration signals from the sensors are stored in a computer via a data acquisition card, which involves several steps: device initialization, sampling parameter configuration, cyclic data reading, temporary data buffering, file writing, and device shutdown. Among these, cyclic data reading and temporary buffering are critical for the high-speed signal acquisition of the data acquisition card.
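The six steps listed above can be sketched as a single acquisition loop. The `SimulatedDaq` class below is a hypothetical stand-in that generates a synthetic sine wave; it does not reproduce the HK_USB_DAQ driver interface, whose function names are not given in this paper. The block size of 1024 samples is likewise an assumption, while the 10 kHz rate and the 51,200 samples per record follow Section 2.3.

```python
import csv
import io
import math
from collections import deque


class SimulatedDaq:
    """Hypothetical stand-in for an acquisition card (not the HK_USB_DAQ API)."""

    def __init__(self):
        self.is_open = False
        self.rate_hz = None
        self._n = 0

    def open(self):
        self.is_open = True

    def configure(self, rate_hz):
        self.rate_hz = rate_hz

    def read_block(self, block_size):
        # One block of synthetic "vibration velocity" samples (50 Hz sine).
        start = self._n
        self._n += block_size
        return [math.sin(2 * math.pi * 50 * (start + i) / self.rate_hz)
                for i in range(block_size)]

    def close(self):
        self.is_open = False


def acquire(device, out_file, rate_hz=10_000, block_size=1024,
            total_samples=51_200):
    """Run the acquisition steps in order and return the number of rows written."""
    writer = csv.writer(out_file)
    buffer = deque()
    written = 0
    device.open()                        # device initialization
    try:
        device.configure(rate_hz)        # sampling parameter configuration
        while written < total_samples:   # cyclic data reading
            block = device.read_block(min(block_size, total_samples - written))
            buffer.extend(block)         # temporary data buffering
            while buffer:                # file writing: drain the buffer
                writer.writerow([written, buffer.popleft()])
                written += 1
    finally:
        device.close()                   # device shutdown, even on error
    return written


rows = acquire(SimulatedDaq(), io.StringIO())
```

Reading in blocks and draining a buffer keeps the read call short, which is the reason cyclic reading and buffering matter at high sampling rates. In a real deployment the drain would typically run in a separate thread so that file writes cannot stall the next read.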
Figure 3 shows the waveform of the raw vibration signal data collected, with the sample number along the x-axis and the vibration velocity on the y-axis.
Three classes of water pump vibration signal data were defined in this experiment: normal, slightly faulty, and severely faulty. The vibration signal data of normal water pumps were relatively easy to obtain, while those of faulty water pumps were generated by manually disassembling the pump and modifying its bearings. Specifically, bearings of normal water pumps were removed and subjected to chemical corrosion, and the slightly and severely faulty classes were defined by the duration of that exposure. The water pump vibration signal data read by the acquisition card were stored in an Excel file. In this way, we collected nine sets each of normal, slightly faulty, and severely faulty data (Table 4). The sampling frequency was fixed at 10 kHz, and each set contained 51,200 sampling points, giving a total of 27 × 51,200 = 1,382,400 data points, evenly split among the three classes.
2.4. Data Preprocessing
To improve model performance and accelerate algorithm convergence, the water pump vibration signal data were first normalized using Equation (1).
$$x_i' = \frac{x_i - x_{\min}}{x_{\max} - x_{\min}} \qquad (1)$$
In the above formula, $x_i$ denotes the feature value of the i-th sample in the dataset; $x_{\min}$ and $x_{\max}$ are, respectively, the minimum and maximum value among the samples in the dataset; and $x_i'$ is the normalized feature value, whose range is [0, 1]. To prevent the model from overfitting during training because of the small number of samples, we used the overlapping sampling method to augment the number of samples [21], for which the sample step size was 512, and the overlap rate was set to 0.5.
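The normalization and overlapping sampling can be sketched as below. The paper states a step size of 512 and an overlap rate of 0.5 but not the window length; the sketch assumes a window of 1024 points, so that step = window × (1 − overlap) = 512. That window length and the function names are assumptions.

```python
def min_max_normalize(values):
    """Scale values linearly so the minimum maps to 0 and the maximum to 1."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]  # constant signal: avoid division by zero
    return [(v - lo) / (hi - lo) for v in values]


def overlapping_windows(values, window, overlap):
    """Cut fixed-length windows; neighbours share `overlap` of their length."""
    step = int(window * (1 - overlap))
    if step <= 0:
        raise ValueError("overlap must be below 1")
    return [values[s:s + window]
            for s in range(0, len(values) - window + 1, step)]


# One record of 51,200 points (placeholder values) cut with the assumed window.
record = min_max_normalize([float(i % 100) for i in range(51_200)])
samples = overlapping_windows(record, window=1024, overlap=0.5)
print(len(samples))  # 99 windows under these assumptions
```

Under these assumptions each 51,200-point record yields (51,200 − 1024) / 512 + 1 = 99 samples instead of the 50 that non-overlapping windows would give.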
After resampling, the dataset was split into a training set, validation set, and test set in a ratio of 7:2:1. Next, we formatted the dataset as required by the model, with data saved in a temporary file for subsequent model training.
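A 7:2:1 split can be sketched as follows. The paper does not say whether samples were shuffled or stratified by class before splitting; the seeded shuffle here is an assumption, as is the function name.

```python
import random


def split_dataset(samples, ratios=(7, 2, 1), seed=0):
    """Shuffle with a fixed seed, then cut into train/validation/test parts."""
    items = list(samples)
    random.Random(seed).shuffle(items)
    total = sum(ratios)
    n_train = len(items) * ratios[0] // total
    n_val = len(items) * ratios[1] // total
    train = items[:n_train]
    val = items[n_train:n_train + n_val]
    test = items[n_train + n_val:]  # remainder, so no sample is dropped
    return train, val, test


train, val, test = split_dataset(range(100))
print(len(train), len(val), len(test))  # 70 20 10
```

Integer arithmetic avoids floating-point rounding in the subset sizes. Because overlapping windows share half their points with each neighbour, a split made per record rather than per window may be worth considering to keep near-duplicate samples out of the test set.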
2.5. Model-Building Steps
Here, we elaborate on each module of the model in detail:
- (1)
Conversion of one-dimensional vibration signals into two-dimensional time–frequency diagrams. When a water pump malfunctions, for example, due to bearing corrosion or wear, its vibration signals are shaped by multiple factors including the pump itself and the surrounding aquatic environment. These signals are characterized by both nonlinearity and nonstationarity. Compared with previous analysis methods that independently analyze the time domain or the frequency domain, an analytical method that merges these two domains can extract critical features more effectively [22,23]. The continuous wavelet transform (CWT) can perform local transformations in space and time, and can efficiently extract key information from signals [24,25]. So, the CWT was selected here to convert the water pump vibration data into time–frequency images, using Equation (2).
$$\mathrm{CWT}(a, b) = \frac{1}{\sqrt{a}} \int_{-\infty}^{+\infty} x(t)\, \varphi^{*}\!\left(\frac{t - b}{a}\right) dt \qquad (2)$$
In the above formula for CWT, $x(t)$ is the input data; $a$ is the scale parameter; $b$ is the translation parameter; and $\varphi$ is the wavelet basis function, with $\varphi^{*}$ denoting its complex conjugate. The CWT’s parameters were set as follows: sampling period = 1/12,000, total scale = 128, and wavelet basis function = ‘cmor1-1’.
- (2)
The time–frequency image data are fed into the CNN. They pass through a convolutional activation layer, a maximum pooling layer, a second convolutional activation layer, and a second maximum pooling layer [26,27]; the output then serves as the input for the next module in the model.
- (3)
The output data of the previous module serve as input of this module. The transformer mainly consists of two parts: an encoder and a decoder, which may be used separately according to a given task’s requirements. In this paper, only the encoder part was actually used. It contains a multi-head self-attention mechanism and a feed-forward neural network. Residual connections and layer normalization are used to process the output of each sub-layer to enhance model training and convergence [
28]. Finally, the output result is designated as input for use in the next module.
- (4)
The output data of the previous module act as input for the BiGRU module. Here, the GRU is made bidirectional. The forward GRU processes the data from the start of a sequence to its end, fully capturing its forward information; conversely, the backward GRU processes the data from the end of the sequence to its start, thus capturing its reverse information. Through its bidirectional structure and gating mechanism, the BiGRU can effectively capture the context information of the sequence and mitigate the vanishing gradient problem [29,30,31].
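The bidirectional mechanism of module (4) can be sketched in plain Python. This is a forward-pass illustration with random weights, not the trained network of this paper. The gate equations follow one common GRU convention, h_t = (1 − z) h_{t−1} + z h̃_t; the hidden size, input size, and helper names such as `random_params` are illustrative assumptions.

```python
import math
import random


def _sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


def _affine(W, x, U, h, b):
    """Compute W x + U h + b for small list-based matrices."""
    return [sum(w * xi for w, xi in zip(W[i], x))
            + sum(u * hi for u, hi in zip(U[i], h)) + b[i]
            for i in range(len(b))]


def random_params(input_dim, hidden, seed=0):
    """Random weights for the update (z), reset (r) and candidate (h) paths."""
    rng = random.Random(seed)
    p = {}
    for g in "zrh":
        p["W" + g] = [[rng.uniform(-0.5, 0.5) for _ in range(input_dim)]
                      for _ in range(hidden)]
        p["U" + g] = [[rng.uniform(-0.5, 0.5) for _ in range(hidden)]
                      for _ in range(hidden)]
        p["b" + g] = [0.0] * hidden
    return p


def gru_step(x, h, p):
    z = [_sigmoid(v) for v in _affine(p["Wz"], x, p["Uz"], h, p["bz"])]  # update gate
    r = [_sigmoid(v) for v in _affine(p["Wr"], x, p["Ur"], h, p["br"])]  # reset gate
    rh = [ri * hi for ri, hi in zip(r, h)]
    cand = [math.tanh(v) for v in _affine(p["Wh"], x, p["Uh"], rh, p["bh"])]
    return [(1 - zi) * hi + zi * ci for zi, hi, ci in zip(z, h, cand)]


def gru(seq, p, hidden):
    """Run a GRU over the sequence and return the hidden state at every step."""
    h, outputs = [0.0] * hidden, []
    for x in seq:
        h = gru_step(x, h, p)
        outputs.append(h)
    return outputs


def bigru(seq, p_fwd, p_bwd, hidden):
    fwd = gru(seq, p_fwd, hidden)
    bwd = gru(seq[::-1], p_bwd, hidden)[::-1]  # re-align backward states with time
    return [f + b for f, b in zip(fwd, bwd)]   # concatenate the two directions


seq = [[0.1, -0.2], [0.4, 0.3], [-0.5, 0.9], [0.0, 0.2]]
states = bigru(seq, random_params(2, 3, seed=1), random_params(2, 3, seed=2), hidden=3)
```

Each time step ends up with a vector of twice the hidden size: the first half summarizes everything before that step and the second half everything after it.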
Finally, in classification problems, the fully connected layer is often used as the last layer of the neural network, and the number of its output nodes is usually equal to the number of classification categories. After completing the feature extraction and transformation of each previous layer, the fully connected layer calculates the score of each category according to the input features. Next, the Softmax function converts these scores into a probability distribution. By determining the magnitude of each probability value, the category with the maximum probability value is selected to determine the specific classification of the input sample.
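The final classification step can be sketched as follows. The three class names come from Section 2.3; the score values and function names are made up for illustration.

```python
import math

CLASSES = ("normal", "slightly faulty", "severely faulty")


def softmax(scores):
    """Convert raw scores into probabilities that sum to 1."""
    m = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def classify(scores):
    """Return the most probable class and the full probability distribution."""
    probs = softmax(scores)
    best = max(range(len(probs)), key=probs.__getitem__)
    return CLASSES[best], probs


label, probs = classify([0.2, 2.5, -1.0])  # hypothetical fully connected outputs
print(label)  # slightly faulty
```

Because the Softmax function preserves the ordering of the scores, the predicted class is the same as the index of the largest raw score; the probabilities are mainly useful for training with cross-entropy loss and for reporting confidence.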
2.6. Fault Detection Process
The water pump fault detection process is divided into five phases: data collection, data preprocessing, model selection and construction, model training (iterative training to save the optimal model), and use of the optimal model for fault detection. This entire process is sketched in
Figure 4.
4. Discussion
This comprehensive modeling study confirms that the vibration signals of water pumps can be used to quantitatively evaluate their operating status and the degree of faults they may incur. This paper proposed a method based on a deep learning model to objectively detect and identify such faults, and this model is applied in the field of recirculating aquaculture for the first time. Data from normal operation, minor faults, and serious faults were, respectively, collected to train and test the model, so as to determine the three working condition types of the water pump. The experimental classification results are conveyed in
Figure 9, indicating the new model can effectively identify the water pump’s working condition types. This demonstrates that the water pump fault detection method based on the CNN-transformer-BiGRU model is effective.
To verify the performance of the model, nine models were validated and compared (Table 7). The traditional machine learning algorithms, such as the support vector machine (SVM), random forest (RF), and K-nearest neighbors (KNN), perform poorly at the task at hand, likely because the vibration signal data of water pump faults are affected by an array of complex factors, which makes it difficult to hand-pick the appropriate characteristic information, such as margin factors, pulse factors, and peak-to-peak values, for those algorithms [35,36]. The model proposed in this paper achieved robust performance when applied to the dataset of water pump faults in industrial recirculating aquaculture systems. The results for the test set are consistent with the F1 value curve of the validation set. Compared with the CNN-LSTM, the model in this paper leverages the advantages of the self-attention mechanism and demonstrates better performance. When compared with the Swin-transformer, the model highlights the advantages of the convolutional module in capturing features from time–frequency feature maps. Several network models, including BiLSTM, KAN-GRU, and CNN-LSTM, attain relatively high accuracy rates in the experimental tests, yet each still falls short of the proposed model.
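For reference, the hand-crafted indicators named above can be computed as in the sketch below. Definitions of these indicators vary somewhat across the literature; the formulas here (pulse factor as peak over mean absolute value, margin factor as peak over the squared mean of square-rooted absolute values) are one common choice and are an assumption about what the compared baselines used.

```python
import math


def time_domain_features(x):
    """Peak-to-peak value, pulse factor and margin factor of one signal window."""
    abs_x = [abs(v) for v in x]
    peak = max(abs_x)
    mean_abs = sum(abs_x) / len(x)
    root_amplitude = (sum(math.sqrt(v) for v in abs_x) / len(x)) ** 2
    return {
        "peak_to_peak": max(x) - min(x),
        "pulse_factor": peak / mean_abs,         # undefined for an all-zero window
        "margin_factor": peak / root_amplitude,  # undefined for an all-zero window
    }


print(time_domain_features([0.0, 0.0, 0.0, 4.0]))
```

Both factors grow when a window contains a few sharp impacts on an otherwise quiet signal, which is why they are popular for bearing diagnosis; choosing and tuning such indicators by hand is the step that the deep learning models avoid.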
Importantly, ablation experiments were also conducted in this study, in which individual modules of the model were removed; the results are compared in
Table 8. This clearly shows that removing the CNN led to a significant decline in all indicators of the entire model. The Acc, P, R, and F1 indicators were, respectively, reduced by 8.19%, 7.96%, 8.19%, and 8.07%. From this, we may infer that by converting the complex one-dimensional vibration signals into two-dimensional time–frequency images and then applying the CNN’s powerful processing ability to the latter, the fault characteristic information can be efficiently extracted [
37,
38]. Hence, the absence of this module has the greatest impact on model performance. After removing the transformer or BiGRU module, all indicators more or less decreased; this suggests that the encoder in the transformer improves the model’s hierarchical learning ability for features, and the self-attention mechanism strengthens the weights of the key features of the vibration signals. The BiGRU module is capable of more comprehensively capturing the front and back characteristic information in the vibration signal sequence, further enhancing the overall performance of the model [
39].
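The four indicators used in these comparisons can be computed as sketched below. The averaging scheme is not specified in this section; the sketch assumes macro averaging over the three classes, and the label sequences are invented for illustration.

```python
def classification_report(y_true, y_pred, labels):
    """Accuracy plus macro-averaged precision, recall and F1 score."""
    pairs = list(zip(y_true, y_pred))
    accuracy = sum(t == p for t, p in pairs) / len(pairs)
    precisions, recalls, f1s = [], [], []
    for c in labels:
        tp = sum(t == c and p == c for t, p in pairs)
        fp = sum(t != c and p == c for t, p in pairs)
        fn = sum(t == c and p != c for t, p in pairs)
        prec = tp / (tp + fp) if tp + fp else 0.0
        rec = tp / (tp + fn) if tp + fn else 0.0
        f1 = 2 * prec * rec / (prec + rec) if prec + rec else 0.0
        precisions.append(prec)
        recalls.append(rec)
        f1s.append(f1)
    n = len(labels)
    return {"Acc": accuracy, "P": sum(precisions) / n,
            "R": sum(recalls) / n, "F1": sum(f1s) / n}


# 0 = normal, 1 = slightly faulty, 2 = severely faulty (made-up predictions).
report = classification_report([0, 0, 1, 1, 2, 2], [0, 1, 1, 1, 2, 0], labels=[0, 1, 2])
print(report)
```

When every class has the same number of test samples, macro recall equals accuracy, which would be consistent with Acc and R changing by the same amount in the ablation results.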
In actual aquaculture settings, the characteristic signals indicating water pump faults may include distinctive signal features such as vibration, current, and sound. The research content of this paper is limited to vibration signal data and does not involve experiments on other signal data such as current and sound. The actual aquaculture environment is complex, with issues such as multiple interferences. Therefore, future research should focus on fusing multiple signal features, ideally according to a certain weight ratio, and then use these fused signals in model training. Pursuing this approach will enhance the robustness and universality of the introduced model in real-world environments and applications.
5. Conclusions
This paper set out to solve three core problems in the fault detection of water pumps in IRASs. (1) When the equipment has a minor fault, it can still operate normally, and the external manifestations of that fault are inconspicuous. Early detection of the fault is therefore crucial, so that timely warnings for equipment maintenance can be issued. (2) The vibration signals are complex and nonlinear, being easily influenced by multiple factors, such as background noise, the vibration of other equipment, and water flow. It is thus pivotal to extract the fault characteristics from both the time domain and the frequency domain. (3) Whether the water pump has a minor or serious fault, earlier and later portions of the vibration signal are correlated, so it is necessary to handle long-distance dependencies in the sequence.
After hyperparameter tuning, the model was compared with commonly used fault detection models and subjected to ablation experiments in which individual modules were removed; the results were evaluated with multiple indicators, such as accuracy and precision. The accuracy of the model proposed here is 91.43%, which exceeds that of the comparable models.
As such, this paper’s CNN-transformer-BiGRU model can effectively solve the above three problems and meet the performance requirements of the fault detection task in aquaculture and agriculture at large.
In the future, this model will incorporate signal data such as current and sound to improve multi-sensor data fusion, thereby enhancing the model’s diagnostic performance under complex working conditions.