A Fiber Vibration Signal Recognition Method Based on CNN-CBAM-LSTM

Huang, Jincheng; Mo, Jiaqing; Zhang, Jiangwei; Ma, Xinrong

doi:10.3390/app12178478

by Jincheng Huang, Jiaqing Mo *, Jiangwei Zhang and Xinrong Ma

Key Laboratory of Signal Detection and Processing, College of Information Science and Engineering, Xinjiang University, Urumqi 830017, China

* Author to whom correspondence should be addressed.

Appl. Sci. 2022, 12(17), 8478; https://doi.org/10.3390/app12178478

Submission received: 27 July 2022 / Revised: 21 August 2022 / Accepted: 22 August 2022 / Published: 25 August 2022

Abstract: To address the problem of identifying multiple types of intrusion vibration signals collected by distributed optical fiber vibration sensors, this study investigates the feature extraction and identification of intrusion signals and proposes an optical fiber vibration signal (OFVS) identification method based on deep learning. The external vibration signal is collected by a Sagnac fiber optic interferometer and then denoised by spectral subtraction. Endpoint detection is carried out by combining the short-time logarithmic energy method and the spectral entropy method. Finally, an equal-length signal segment containing valid information is intercepted and preprocessed. The feature-processing method combines the strong feature learning capability of the long short-term memory (LSTM) network with the short-term feature extraction capability of the convolutional neural network (CNN). In addition, to further enhance signal feature identification, a convolutional block attention module (CBAM) is introduced to perform adaptive feature refinement on the signal. In summary, a network model combining CNN, LSTM, and CBAM is proposed to process the signal features, and a multi-layer perceptron (MLP) completes the classification and recognition of multiple types of intrusion signals. The experimental findings indicate that the CNN-CBAM-LSTM method can effectively identify four kinds of OFVS, with an overall average recognition accuracy of 97.9%. Among them, walking and knocking signals are recognized with over 99% accuracy.

Keywords:

optical fiber vibration signal; convolutional neural network; long short-term memory; convolutional block attention module

1. Introduction

Distributed optical fiber perimeter security systems are widely used in security monitoring fields, such as national defense borders, military bases, and border protection, as a result of the advantages of high sensitivity, quick response times, and continuous monitoring [1,2,3,4,5]. Due to the randomness and non-stationarity of vibration signals and the similarity between some intrusion signals and non-intrusion signals, it is easy to make poor judgments in the process of vibration signal identification. Therefore, how to reduce the false alarm rate of the system so that it can effectively extract features and identify intrusion types in a complex environment is a hot issue in current research [6,7,8,9].

Researchers at home and abroad have studied these issues extensively. In terms of signal decomposition, Chen et al. proposed the empirical mode decomposition (EMD) recognition algorithm to decompose vibration signals [10]. In 2015, Jiang et al. used the ensemble empirical mode decomposition (EEMD) identification algorithm to decompose the vibration signal into multiple intrinsic mode functions (IMFs), which effectively excluded the interference of non-human intrusions, although the recognition rate decreased in heavy rain and other adverse weather conditions [11]. Some scholars combined the EMD algorithm with an adaptive wavelet packet method to extract signal features [12] and then used a support vector machine (SVM) optimized by particle swarm optimization for classification [13]; however, the effect of mode aliasing remained. Later, the complementary ensemble empirical mode decomposition (CEEMD) method was proposed [14], which can suppress the residual white noise in EEMD. Artificial neural networks are frequently employed in a variety of industries, for example to monitor product quality in production and to forecast key process factors [15,16,17,18]. In addition, the convolutional neural network (CNN) plays an important role in computer vision, object detection, complex network classification, pattern recognition, and other fields of study [19,20,21,22,23]. Xu et al. combined time-frequency analysis with a CNN to extract features and replaced the softmax layer with a multi-class SVM for signal recognition [24]. At low signal-to-noise ratios, this technique overcomes the drawbacks of manual feature selection. Ruan et al. proposed an adaptive filtering convolutional neural network (AF-CNN), which adaptively filters the original signal using the convolution kernel and achieves a better recognition effect than the conventional approach [25].

At present, the recognition accuracy of optical fiber vibration signals is still low. To improve the stability and accuracy of fiber perimeter security systems in practical applications, this paper addresses the above problems by building on CNN, which is widely used in various fields.

CNN uses multiple convolution kernels to perform convolution processing separately, thereby extracting different types of features. When the data are one-dimensional time series data, the compact adaptive one-dimensional CNN can capture the features well, and the long short-term memory (LSTM) network has the advantage of processing data time series effectively [26]. Although CNN can autonomously learn the features of OFVS, it does not consider the temporal dependencies hidden in the signal, which is the key to forming classification features [27]. To refine the extracted features, combined with the excellent performance of the CBAM attention mechanism in many fields, a method of OFVS recognition based on CNN-CBAM-LSTM is proposed.

2. Convolutional Neural Networks

CNNs convolve the local area of the input signal with the filter kernel and generate output features under the action of the activation function. Since each filter extracts features using the same kernel, the connection between the network layers can be reduced, along with the danger of overfitting. The following is a description of the convolution process:

y_{i + 1, m}(n) = w_{i, m} x_{i}(n) + b_{i, m}

(1)

where x_{i}(n) denotes the n-th local region of the i-th layer; y_{i + 1, m}(n) represents the output of the m-th filter in the (i + 1)-th layer after convolution; and w_{i, m} and b_{i, m} denote the weight matrix and bias of the m-th filter of the i-th layer, respectively.
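As an illustration of Equation (1), the following minimal Python sketch applies one filter to a one-dimensional signal and then a ReLU activation. The function names, the toy signal, and the kernel values are hypothetical and chosen only for demonstration; this is not the authors' implementation.

```python
def conv1d_single_filter(x, w, b, stride=1):
    """Apply one filter: y(n) = w . x(n) + b for every local region x(n)."""
    k = len(w)
    out = []
    for start in range(0, len(x) - k + 1, stride):
        region = x[start:start + k]          # n-th local region of the input
        out.append(sum(wi * xi for wi, xi in zip(w, region)) + b)
    return out


def relu(values):
    """ReLU activation, the nonlinearity listed for the convolution layers."""
    return [max(0.0, v) for v in values]


# Toy input and kernel (illustrative values only).
signal = [1.0, -2.0, 3.0, 0.5, -1.0, 2.0]
features = relu(conv1d_single_filter(signal, w=[0.5, -0.5], b=0.1))
```

Because the same kernel slides over every region, the number of trainable parameters depends on the kernel width and not on the signal length.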

After convolution, nonlinear features are obtained under the action of the activation function, which improves the model’s capacity for expressing features. The pooling operation then reduces the feature space and the number of network parameters. This study uses max pooling, which reduces both the number of parameters and the data dimensionality [28]. Max pooling is described as follows:

p_{i + 1, m}(n) = \max_{(n - 1)H + 1 \leq t \leq nH} \{ q_{i, m}(t) \}

(2)

where p_{i + 1, m}(n) denotes the corresponding value in the (i + 1)-th layer after pooling; q_{i, m}(t) denotes the value of the t-th neuron in the m-th frame of the i-th layer, with (n - 1)H + 1 \leq t \leq nH; and H represents the width of the pooling region.
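Equation (2) can be sketched in a few lines of Python. The helper name `max_pool1d` and the sample values are hypothetical. The optional `stride` argument is an addition for illustration: with the default, the windows do not overlap, as in Equation (2); a smaller stride gives overlapping windows.

```python
def max_pool1d(q, width, stride=None):
    """Max pooling over windows of the given width.

    With stride equal to width (the default) this matches Equation (2):
    the n-th output is the maximum over the n-th block of H = width values.
    """
    stride = width if stride is None else stride
    return [max(q[s:s + width]) for s in range(0, len(q) - width + 1, stride)]


pooled = max_pool1d([1, 3, 2, 5, 4, 0], width=2)                   # blocks of H = 2
overlapping = max_pool1d([1, 3, 2, 5, 4, 0, 6], width=3, stride=2)  # overlapping windows
```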

3. LSTM Structure and Principle

LSTM networks were created to solve the vanishing-gradient problem that recurrent neural networks (RNNs) suffer from on long input sequences. Compared to traditional RNNs, LSTMs introduce memory cells that can decide which states should be kept and which should be forgotten, and can therefore handle long-term dependencies.

An LSTM unit consists mainly of input, output, and forget gates, together with a memory cell. All three gates use sigmoid activation functions. The forget gate regulates whether previously recorded historical information is retained, whereas the input gate and output gate control the information entering and leaving the neuron, respectively, at any given time.

Figure 1 is an expanded view of the LSTM network, where X_{t} stands for the input at time t, and h_{t} represents the state at time t.

The calculation formula of LSTM is expressed as follows:

i_{t} = \sigma(W_{i} \cdot [h_{t - 1}, X_{t}] + b_{i})

(3)

f_{t} = \sigma(W_{f} \cdot [h_{t - 1}, X_{t}] + b_{f})

(4)

o_{t} = \sigma(W_{o} \cdot [h_{t - 1}, X_{t}] + b_{o})

(5)

C_{t}^{\prime} = \tanh(W_{C} \cdot [h_{t - 1}, X_{t}] + b_{C})

(6)

C_{t} = f_{t} * C_{t - 1} + i_{t} * C_{t}^{\prime}

(7)

h_{t} = o_{t} * \tanh(C_{t})

(8)

where W_{i}, W_{f}, W_{C}, and W_{o} are the weight matrices of the input, forget, update, and output gates, respectively, and b_{i}, b_{f}, b_{C}, and b_{o} are their corresponding biases. Together they determine the output h_{t} at time t and the updated cell state C_{t}.
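To make Equations (3)–(8) concrete, the sketch below computes a single LSTM time step in plain Python. The helper names (`affine`, `lstm_step`), the parameter dictionary layout, and the toy sizes (hidden size 2, input size 1, all-zero weights) are assumptions made for illustration only; a real model would use a library implementation such as the one provided by PyTorch.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def affine(W, v, b):
    """Return W . v + b, with W given as a list of rows."""
    return [sum(w * x for w, x in zip(row, v)) + bias for row, bias in zip(W, b)]


def lstm_step(x_t, h_prev, c_prev, p):
    """One LSTM time step following Equations (3)-(8)."""
    v = h_prev + x_t                                                    # [h_{t-1}, X_t]
    i_t = [sigmoid(z) for z in affine(p["W_i"], v, p["b_i"])]           # Eq. (3)
    f_t = [sigmoid(z) for z in affine(p["W_f"], v, p["b_f"])]           # Eq. (4)
    o_t = [sigmoid(z) for z in affine(p["W_o"], v, p["b_o"])]           # Eq. (5)
    c_cand = [math.tanh(z) for z in affine(p["W_C"], v, p["b_C"])]      # Eq. (6)
    c_t = [f * c + i * g for f, c, i, g in zip(f_t, c_prev, i_t, c_cand)]  # Eq. (7)
    h_t = [o * math.tanh(c) for o, c in zip(o_t, c_t)]                  # Eq. (8)
    return h_t, c_t


# Toy parameters (hypothetical): every gate outputs sigmoid(0) = 0.5.
params = {name: [[0.0] * 3 for _ in range(2)] for name in ("W_i", "W_f", "W_o", "W_C")}
params.update({name: [0.0, 0.0] for name in ("b_i", "b_f", "b_o", "b_C")})
h_t, c_t = lstm_step([1.0], [0.0, 0.0], [1.0, -2.0], params)
```

With all-zero weights the forget gate halves the previous cell state and the candidate state contributes nothing, which makes the role of each gate in Equation (7) easy to check by hand.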

4. Attention Mechanism CBAM

CBAM (convolutional block attention module) is a simple attention module for feedforward CNNs [29]. Given any intermediate feature map in a CNN, CBAM infers attention maps sequentially along two independent dimensions, channel and spatial, and then multiplies the attention maps with the input feature map to perform adaptive feature refinement [30,31].

Because CBAM is a lightweight, general-purpose module, it can be easily integrated into a CNN architecture and trained together with it. Figure 2 illustrates the structure of CBAM.

As the figure illustrates, given a feature map F \in R^{C \times H \times W} as the input, CBAM operates in two steps. First, global max pooling and global mean pooling are performed for each channel; the two resulting one-dimensional vectors are passed through a fully connected layer and summed to obtain the channel attention M_{C} \in R^{C \times 1 \times 1}, which is multiplied element-wise with the input to obtain the adjusted feature map F^{\prime}. Second, max pooling and mean pooling are performed on F^{\prime} across the channels at each spatial position, and the two resulting two-dimensional maps are convolved to produce the spatial attention M_{S} \in R^{1 \times H \times W}, which is finally combined with F^{\prime} by element-wise multiplication. The overall process of CBAM generating attention is as follows:

\begin{matrix} F^{\prime} = M_{C}(F) \otimes F \\ F^{\prime\prime} = M_{S}(F^{\prime}) \otimes F^{\prime} \end{matrix}

(9)

The symbol ⊗ denotes element-wise multiplication. Before the multiplication, the attention values are broadcast along the remaining dimensions: the channel attention along the spatial dimensions, and the spatial attention along the channel dimension.
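The two-step refinement in Equation (9) can be sketched in plain Python on a nested-list feature map of shape C × H × W. Several details here are assumptions made for illustration and are not taken from this paper: the shared two-layer perceptron (`W1`, `W2`) and the sigmoid follow the original CBAM design [29], and the spatial convolution is simplified to a 1 × 1 kernel (`w_mean`, `w_max`) so the example stays short. All function names are hypothetical.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def shared_mlp(v, W1, W2):
    """Two-layer perceptron with ReLU, shared by both pooled vectors."""
    hidden = [max(0.0, sum(w * x for w, x in zip(row, v))) for row in W1]
    return [sum(w * h for w, h in zip(row, hidden)) for row in W2]


def channel_attention(F, W1, W2):
    """M_C(F): one weight per channel from global mean and max pooling."""
    flat = [[val for row in ch for val in row] for ch in F]
    mean_vec = [sum(ch) / len(ch) for ch in flat]
    max_vec = [max(ch) for ch in flat]
    summed = zip(shared_mlp(mean_vec, W1, W2), shared_mlp(max_vec, W1, W2))
    return [sigmoid(a + m) for a, m in summed]


def spatial_attention(F, w_mean, w_max, bias=0.0):
    """M_S(F'): one weight per position from channel-wise mean and max.

    Simplification: a 1x1 convolution over the two pooled maps.
    """
    C, H, W = len(F), len(F[0]), len(F[0][0])
    att = []
    for h in range(H):
        row = []
        for w in range(W):
            vals = [F[c][h][w] for c in range(C)]
            row.append(sigmoid(w_mean * sum(vals) / C + w_max * max(vals) + bias))
        att.append(row)
    return att


def cbam(F, W1, W2, w_mean=1.0, w_max=1.0):
    """F'' = M_S(F') * F', with F' = M_C(F) * F (element-wise, broadcast)."""
    C, H, W = len(F), len(F[0]), len(F[0][0])
    m_c = channel_attention(F, W1, W2)
    F1 = [[[m_c[c] * F[c][h][w] for w in range(W)] for h in range(H)] for c in range(C)]
    m_s = spatial_attention(F1, w_mean, w_max)
    return [[[m_s[h][w] * F1[c][h][w] for w in range(W)] for h in range(H)] for c in range(C)]


# Toy feature map with C = 2, H = W = 2 and zero weights (illustrative only).
F = [[[1.0, 2.0], [3.0, 4.0]], [[0.0, -1.0], [2.0, 1.0]]]
refined = cbam(F, W1=[[0.0, 0.0]], W2=[[0.0], [0.0]], w_mean=0.0, w_max=0.0)
```

With zero weights both attention maps equal 0.5 everywhere, so every value is scaled by 0.25; trained weights would instead emphasize informative channels and positions.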

5. Experiment and Analysis

5.1. Signal Acquisition and Preprocessing

The experimental data were provided by the vibration fiber sensing system independently produced by Xinjiang Meite Intelligent Safety Engineering Co., Ltd., Xinjiang Uygur Autonomous Region, China. A distributed fiber optic sensing system based on a Sagnac fiber optic interferometer was used for acquisition, and 1212 OFVSs of the following four types were obtained: flapping (283), knocking (304), walking (300), and running (325). The actual measurement environment is shown in Figure 3.

As can be observed from the figure, the fiber is deployed in two ways, hung on a net and buried, and an S-shaped arrangement is adopted in both cases.

In order to make the research process more intuitive, the overall research process is shown in Figure 4.

The first three parts form the data collection and preparation stage, and the last three parts address the pattern recognition of OFVS.

In traditional pattern recognition, signal preprocessing consists mainly of two parts: noise reduction and endpoint detection. Noise reduction suppresses the signal noise and highlights the characteristics of the pure vibration signal, which helps to improve the accuracy of vibration signal detection. The noise reduction method chosen here is spectral subtraction, which removes the noise power spectrum and outputs the spectrum of the remaining vibration signal, thereby improving the signal-to-noise ratio of the OFVS. Common endpoint detection methods include the short-time logarithmic energy method and the spectral entropy method. The former performs better on signals with larger disturbances, and the latter can better identify low-quality signals. Here, a fusion algorithm of short-time logarithmic energy and spectral entropy is used to detect the endpoints, so that the endpoints of different vibration signals can be identified more accurately by combining the benefits of the two methods.
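The two endpoint-detection features can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' fusion algorithm: the exact fusion rule is not given in the text, so the sketch simply flags the first frame whose logarithmic energy is above one threshold and whose spectral entropy is below another. The frame length, hop size, thresholds, and all function names are hypothetical, and a direct DFT is used to keep the example dependency-free.

```python
import cmath
import math


def frames(signal, frame_len, hop):
    """Split the signal into frames of frame_len samples, hop samples apart."""
    return [signal[s:s + frame_len] for s in range(0, len(signal) - frame_len + 1, hop)]


def log_energy(frame, eps=1e-12):
    """Short-time logarithmic energy of one frame."""
    return math.log10(sum(v * v for v in frame) + eps)


def power_spectrum(frame):
    """One-sided power spectrum via a direct DFT (fine for short frames)."""
    n = len(frame)
    spec = []
    for k in range(n // 2 + 1):
        coeff = sum(v * cmath.exp(-2j * math.pi * k * t / n) for t, v in enumerate(frame))
        spec.append(abs(coeff) ** 2)
    return spec


def spectral_entropy(frame, eps=1e-12):
    """Entropy of the normalized power spectrum; low for narrowband frames."""
    spec = power_spectrum(frame)
    total = sum(spec) + eps
    probs = [p / total for p in spec]
    return -sum(p * math.log(p + eps) for p in probs)


def detect_start(signal, frame_len, hop, energy_thr, entropy_thr):
    """Assumed fusion rule: first frame with high energy AND low entropy.

    Returns the sample index of that frame, or None if no frame qualifies.
    """
    for idx, frame in enumerate(frames(signal, frame_len, hop)):
        if log_energy(frame) > energy_thr and spectral_entropy(frame) < entropy_thr:
            return idx * hop
    return None


# Toy signal: 32 silent samples followed by a sinusoid (illustrative only).
toy = [0.0] * 32 + [math.sin(2 * math.pi * t / 4) for t in range(32)]
start = detect_start(toy, frame_len=16, hop=16, energy_thr=0.0, entropy_thr=0.5)
```

In practice the thresholds would be tuned on recorded background noise, and the spectral subtraction step described above would run before these features are computed.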

The process of signal preprocessing is shown in Figure 5.

First, the collected raw fiber vibration signal data are denoised by spectral subtraction, and then endpoint detection of the OFVS is carried out by combining the short-time logarithmic energy method and the spectral entropy method. If the starting point of an OFVS is detected, 80 K sampling points (the amount of data that the acquisition device collects within 1 s) are intercepted from the starting position. Observation shows that 80 K sampling points contain an entire vibration signal (for walking and running signals, one of the multiple similar short segments they generate is selected as a representative). Endpoint detection is then performed again, repeating the second and third steps. Finally, the intercepted data are normalized to the maximum value. The four fiber vibration signals after preprocessing are shown in Figure 6.
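The interception and normalization steps can be sketched as below. The function name is hypothetical, and "normalized to the maximum value" is interpreted here as division by the largest absolute sample value, which is an assumption; the default length of 80,000 corresponds to the 80 K sampling points mentioned above.

```python
def intercept_and_normalize(signal, start, length=80_000):
    """Cut a fixed-length segment at the detected start and scale it to [-1, 1]."""
    segment = signal[start:start + length]
    if len(segment) < length:
        raise ValueError("not enough samples after the detected start point")
    peak = max(abs(v) for v in segment)
    if peak == 0:
        return list(segment)          # silent segment: nothing to scale
    return [v / peak for v in segment]


# Toy usage with a short length (illustrative values only).
segment = intercept_and_normalize([0.0, 2.0, -4.0, 1.0, 3.0], start=1, length=3)
```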

5.2. Dataset Processing

In model evaluation, a single training run is subject to error and randomness and may not represent the actual performance of the model. Therefore, the hold-out method is used to split the data set, and the experiments are repeated multiple times. The ratio of the training set to the test set is 8:2 (the training set is chosen at random from the data set, and the remaining 20% of the data set forms the test set). Each partition contains all four types of OFVS, with the proportion of each type kept the same. The data set is re-divided at random each time the model is retrained, and the results of the repeated runs are averaged.
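A hold-out split of this kind can be sketched with the standard library alone. The sketch assumes that "the proportion of each type is the same" means a stratified split, in which each class is divided 8:2 separately; the function name and the seed handling are hypothetical. The class counts are the ones reported in Section 5.1.

```python
import random


def stratified_holdout(labels, train_ratio=0.8, seed=None):
    """Split sample indices into train/test sets, class by class."""
    rng = random.Random(seed)
    by_class = {}
    for idx, label in enumerate(labels):
        by_class.setdefault(label, []).append(idx)
    train, test = [], []
    for indices in by_class.values():
        rng.shuffle(indices)                      # new random division per call
        cut = round(train_ratio * len(indices))
        train.extend(indices[:cut])
        test.extend(indices[cut:])
    return train, test


labels = ["flap"] * 283 + ["knock"] * 304 + ["walk"] * 300 + ["run"] * 325
train_idx, test_idx = stratified_holdout(labels, seed=0)
```

Calling the function again with a different seed gives a fresh random division, which mirrors the repeated re-splitting described above.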

5.3. CNN-CBAM-LSTM Model Introduction

The OFVS recognition flow chart based on the CNN-CBAM-LSTM network is shown in Figure 7. The OFVS recognition network model, built on the PyTorch platform, consists primarily of an input layer, convolutional layers, a CBAM layer, LSTM layers, and a classification and recognition layer. The details are shown in Figure 8.

The basic process of the network can be described as follows:

The preprocessed one-dimensional OFVS is input to the first convolution layer (k = 1, s = 1), which adaptively filters the signal [25].
The filtered signal is input to the CBAM module, which weighs the importance of the features along the channel and spatial dimensions and performs adaptive feature refinement on the input features.
The refined features are processed by multiple convolution and pooling layers to continue feature extraction and reduce the data dimensionality.
The features obtained after dimensionality reduction are sent to the LSTM layer, which automatically learns the characteristics of the vibration signal.
The training error is backpropagated using the BP algorithm, and the model parameters are updated layer by layer.
Finally, the identification of OFVSs is completed by classification using the multilayer perceptron (MLP).

The parameters of the CNN-CBAM-LSTM model, obtained by reviewing the data and optimizing the model through repeated experimental testing, are listed in Table 1.

The experiments show that the model performs best with a learning rate of 10⁻⁴ and a mini-batch size of 32. The mini-batch size affects the accuracy of the model: if it is too small, the computing resources cannot be used efficiently, and if it is too large, it consumes a significant amount of memory. The learning rate affects how quickly the model parameters are updated: if it is too small, training slows down, and if it is too large, training tends to oscillate, which harms the accuracy.

5.4. Experimental Results and Analysis

The data set is partitioned using the above method, with a learning rate of 10⁻⁴, a mini-batch size of 32, and 64 iteration cycles. The resulting loss and accuracy curves of the model are shown in Figure 9.

Figure 9a illustrates that as the number of iterations increases, the training loss gradually decreases below 0.05 and the test loss drops below 0.15. As the loss decreases, the accuracy gradually increases. Taken together, the two plots show no overfitting. After the 35th iteration, the network gradually converges, the accuracy stabilizes, and the test accuracy reaches about 97.5%.

Randomly chosen test samples made up 20% of the entire data set, and repeated experiments were conducted to verify the classification performance of the CNN-CBAM-LSTM model. Figure 10 shows the resulting confusion matrix. As the figure illustrates, the recognition accuracy for flapping test samples is 96.43%. The recognition accuracy for running signals is relatively low at 95.59%, with 2.94% and 1.47% of running signals misidentified as flapping and knocking signals, respectively. In addition, the model recognizes knocking and walking signals with an accuracy of more than 99%. It can be observed that this model can effectively classify the four kinds of OFVS.

For a more intuitive understanding of the model’s performance, three indicators, namely precision, recall, and specificity, were calculated, as shown in Table 2. Precision is the proportion of samples predicted as positive that are actually positive, recall is the proportion of actual positive samples that are predicted correctly, and specificity is the proportion of actual negative samples that are predicted as negative. The table shows that the precision and specificity of flapping and knocking signals are comparatively low, meaning that the model is prone to misjudging other signals as these two types; the recall of running signals is comparatively low, meaning that running signals are more easily predicted as other signals; and the prediction of walking signals is the most accurate.
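The three indicators can be derived from a confusion matrix in a one-vs-rest fashion, as in the following sketch. The function name and the small two-class matrix are hypothetical and serve only to show the arithmetic; they are not the values from Figure 10.

```python
def per_class_metrics(cm):
    """Precision, recall, and specificity per class.

    cm[i][j] is the number of samples of true class i predicted as class j.
    """
    n = len(cm)
    total = sum(sum(row) for row in cm)
    metrics = []
    for k in range(n):
        tp = cm[k][k]
        fn = sum(cm[k]) - tp                          # class k predicted as another class
        fp = sum(cm[i][k] for i in range(n)) - tp     # other classes predicted as k
        tn = total - tp - fn - fp
        metrics.append({
            "precision": tp / (tp + fp) if tp + fp else 0.0,
            "recall": tp / (tp + fn) if tp + fn else 0.0,
            "specificity": tn / (tn + fp) if tn + fp else 0.0,
        })
    return metrics


# Hypothetical two-class confusion matrix (illustrative counts only).
example_metrics = per_class_metrics([[8, 2], [1, 9]])
```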

To evaluate the model’s performance by comparison, OFVS recognition was carried out using variational mode decomposition (VMD) [32], AF-CNN, CNN-LSTM, and CNN-CBAM-LSTM. All methods use the same data set, and the input is the original one-dimensional OFVS. The deep learning models use a learning rate of 10⁻⁴, a mini-batch size of 32, and 64 iteration cycles, and the results are averaged over ten runs. The recognition accuracy of OFVSs in each experiment is calculated according to Equation (10).

P_{acc} = \frac{TP}{N} \times 100\%

(10)

In the formula, TP represents the number of samples whose predicted label is the same as the actual label. N is the total number of samples. The experimental results are shown in Table 3.
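Equation (10) amounts to counting matching labels. The short sketch below, with hypothetical function names and toy labels, also shows the averaging over repeated runs described above.

```python
def recognition_accuracy(true_labels, predicted_labels):
    """P_acc = TP / N * 100, where TP counts predictions equal to the true label."""
    if not true_labels or len(true_labels) != len(predicted_labels):
        raise ValueError("label lists must be non-empty and of equal length")
    tp = sum(t == p for t, p in zip(true_labels, predicted_labels))
    return tp / len(true_labels) * 100


def mean_accuracy(run_accuracies):
    """Average the accuracy over repeated experiments."""
    return sum(run_accuracies) / len(run_accuracies)


# Toy labels (illustrative only): three of four predictions are correct.
acc = recognition_accuracy(["walk", "run", "flap", "knock"],
                           ["walk", "run", "flap", "flap"])
```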

As can be observed, the average accuracy of the CNN-CBAM-LSTM model reaches 97.9%, which is higher than that of VMD and the other two models. The recognition rate of the knocking signal is higher than that of the other three methods, while the recognition rate of the running signal is lower than that of the other two deep learning models. The experiments show that the proposed CNN-CBAM-LSTM approach can successfully identify the four signal types.

6. Conclusions

Most traditional pattern recognition algorithms obtain features through modal decomposition of the signal, but these methods are prone to mode aliasing and rely on manual parameter design, which limits them. To address this limitation, a deep learning method based on CNN-CBAM-LSTM is proposed to extract features. The specific work of this paper is as follows:

In terms of data processing, the collected OFVS are denoised using spectral subtraction, and endpoint detection is then carried out by fusing spectral entropy and short-time logarithmic energy to determine the signal’s starting position. An equal-length signal segment of 80 K sampling points is intercepted from the starting point, and the segment is then normalized.
To fully utilize the vibration information in the original signal, the equal-length signal segments are input in full into a convolutional layer with a kernel size of 1. On the one hand, 80 different convolution kernels (equivalent to filters) are used to filter different frequency bands of the signal; on the other hand, multiple convolution kernels generate multiple different feature maps and enhance the information integration between channels. This provides a better basis for the subsequent CNN layers to extract signal features.
To process the feature information finely, this paper introduces the CBAM attention module, which creates attention maps along both the channel and spatial dimensions and multiplies them with the feature map to perform adaptive feature refinement. Channel attention selects, from the many feature maps, those that have an important impact on pattern recognition; spatial attention selects effective feature information from the spatial information of the feature maps. When the two are combined, significant feature information is given a larger weight.
The feature information obtained in the previous step is input into the LSTM module, which extracts the temporal information of the OFVS features and automatically learns the important features that benefit vibration signal pattern recognition. Finally, an MLP completes the classification of the OFVS.

The experiments show that, compared with the conventional VMD method, the CNN-CBAM-LSTM model significantly improves the recognition of the various types of OFVSs. In addition, the model parameters are updated automatically through network iteration, so the model can adapt to different environments and has good generalization ability. Compared with AF-CNN and CNN-LSTM, the model proposed in this paper achieves a higher average recognition accuracy. The accuracy and loss curves also show that the model fits well. In conclusion, the CNN-CBAM-LSTM method is feasible and effective for the identification of OFVS. In the future, as artificial intelligence algorithms continue to develop, better OFVS recognition methods can be studied from the aspects of endpoint detection and recognition algorithms.

Author Contributions

Conceptualization, J.H.; Data curation, J.M.; Investigation, J.Z. and X.M.; Methodology, J.H.; Software, J.H.; Validation, J.M.; Visualization, J.H.; Writing—original draft, J.H.; Writing—review and editing, J.M. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the Natural Science Foundation of Xinjiang Uygur Autonomous Region, under grant number 2019D01C072.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The data presented in this study are available upon request from the corresponding author.

Conflicts of Interest

The authors declare no conflict of interest.

References

1. Zhang, J.; Lou, S.; Liang, S. Study of Pattern Recognition Based on SVM Algorithm for Phi-OTDR Distributed Optical Fiber Disturbance Sensing System. Infrared Laser Eng. 2017, 46, 422003-0422003.
2. Wang, S.; Lou, S.; Liang, S.; Chen, J. Pattern Recognition Method of Fiber Distributed Disturbance Sensing System Based on M-Z Interferometer. Infrared Laser Eng. 2014, 43, 2613–2618.
3. Shang, Y.; Wang, C. Review of Distributed Optical Fiber Sensing Technology. J. Appl. Sci. 2021, 39, 843–857.
4. Pang, S.; Luo, Z.; Wang, Z.; Chang, T.; Dai, G.; Yu, M.; Wu, C.; Cui, H. Interferometric Optical Fiber Water Level Sensing System for Oceanic Monitoring Applications. Acta Photonica Sin. 2019, 48, 0906003.
5. Wang, F.; Liu, Z.; Zhou, X.; Li, S.; Yuan, X.; Zhang, Y.; Shao, L.; Zhang, X. Oil and Gas Pipeline Leakage Recognition Based on Distributed Vibration and Temperature Information Fusion. Results Opt. 2021, 5, 100131.
6. Peng, F.; Wu, H.; Jia, X.-H.; Rao, Y.-J.; Wang, Z.-N.; Peng, Z.-P. Ultra-Long High-Sensitivity Phi-OTDR for High Spatial Resolution Intrusion Detection of Pipelines. Opt. Express 2014, 22, 13804–13810.
7. Oliver, W.D.; Yu, Y.; Lee, J.C.; Berggren, K.K.; Levitov, L.S.; Orlando, T.P. Mach-Zehnder Interferometry in a Strongly Driven Superconducting Qubit. Science 2005, 310, 1653–1657.
8. Liu, K.; Sun, Z.; Jiang, J.; Ma, P.; Wang, S.; Weng, L.; Xu, Z.; Liu, T. A Combined Events Recognition Scheme Using Hybrid Features in Distributed Optical Fiber Vibration Sensing System. IEEE Access 2019, 7, 105609–105616.
9. Bi, F.; Feng, C.; Qu, H.; Zheng, T.; Wang, C. Harmful Intrusion Detection Algorithm of Optical Fiber Pre-Warning System Based on Correlation of Orthogonal Polarization Signals. Photonic Sens. 2017, 7, 226–233.
10. Chen, Z.; Zheng, S. Study of Fault Diagnosis of Gears Using Empirical Mode Decomposition. J. Vib. Eng. 2003, 16, 229–232.
11. Wang, H.; Chen, J.; Dong, G. Feature Extraction of Rolling Bearing’s Early Weak Fault Based on EEMD and Tunable Q-Factor Wavelet Transform. Mech. Syst. Signal Process. 2014, 48, 103–119.
12. Yin, P.; Xiong, X. Empirical Wavelet Transform and Its Application in Fault Feature Extraction of Rolling Bearings. In Proceedings of the 2019 IEEE 8th Data Driven Control and Learning Systems Conference (DDCLS), Dali, China, 24–27 May 2019; pp. 855–860.
13. Wang, J. FBG Intrusion Recognition Algorithm Based on SVM. In Advanced Materials Research; Trans Tech Publ: Bäch, Switzerland, 2012; pp. 1422–1427.
14. Wu, Z.; Huang, N.E. Ensemble Empirical Mode Decomposition: A Noise-Assisted Data Analysis Method. Adv. Adapt. Data Anal. 2009, 1, 1–41.
15. Almonti, D.; Baiocco, G.; Ucciardello, N. Pulp and Paper Characterization by Means of Artificial Neural Networks for Effluent Solid Waste Minimization—a Case Study. J. Process. Control. 2021, 105, 283–291.
16. Baiocco, G.; Almonti, D.; Guarino, S.; Tagliaferri, F.; Tagliaferri, V.; Ucciardello, N. Image-Based System and Artificial Neural Network to Automate a Quality Control System for Cherries Pitting Process. Procedia CIRP 2020, 88, 527–532.
17. Baiocco, G.; Almonti, D.; Genna, S.; Ponticelli, G.S.; Tagliaferri, V.; Ucciardello, N. Neural Network Implementation for the Prediction of Load Curves of a Flat Head Indenter on Hot Aluminum Alloy. Procedia CIRP 2020, 88, 543–548.
18. Soares, L.P.; Grohmann, C.H. Segmentação Automática De Cicatrizes De Deslizamento De Terra Em Imagens De Sensores Remotos Utilizando Aprendizagem Profunda De Máquina (Deep Learning). Master’s Thesis, Instituto de Geociências, Madrid, Spain, 2022.
19. Xie, X.; Cheng, G.; Wang, J.; Yao, X.; Han, J. Oriented R-CNN for Object Detection. In Proceedings of the IEEE/CVF International Conference on Computer Vision, Online, 11–17 October 2021; pp. 3520–3529.
20. Bhatt, D.; Patel, C.; Talsania, H.; Patel, J.; Vaghela, R.; Pandya, S.; Modi, K.; Ghayvat, H. CNN Variants for Computer Vision: History, Architecture, Application, Challenges and Future Scope. Electronics 2021, 10, 2470.
21. Xin, R.; Zhang, J.; Shao, Y. Complex Network Classification with Convolutional Neural Network. Tsinghua Sci. Technol. 2020, 25, 447–457.
22. Kattenborn, T.; Leitloff, J.; Schiefer, F.; Hinz, S. Review on Convolutional Neural Networks (CNN) in Vegetation Remote Sensing. ISPRS J. Photogramm. Remote Sens. 2021, 173, 24–49.
23. Kou, Q.; Xu, H.; Zhu, J.; Zhang, Z.; Xie, Y. Research on Feature Optimization Algorithm of Optical Fiber Sensor Recognition System. In Proceedings of the 2021 IEEE 15th International Conference on Electronic Measurement & Instruments (ICEMI), Nanjing, China, 2–4 November 2021; pp. 358–363.
24. Xu, C.; Guan, J.; Bao, M.; Lu, J.; Ye, W. Pattern Recognition Based on Time-Frequency Analysis and Convolutional Neural Networks for Vibrational Events in Φ-OTDR. Opt. Eng. 2018, 57, 016103.
25. Ruan, S.; Mo, J.; Xu, L.; Zhou, G.; Liu, Y.; Zhang, X. Use AF-CNN for End-to-End Fiber Vibration Signal Recognition. IEEE Access 2021, 9, 6713–6720.
26. Wang, Z.; Lou, S.; Wang, X.; Liang, S.; Sheng, X. Multi-Branch Long Short-Time Memory Convolution Neural Network for Event Identification in Fiber-Optic Distributed Disturbance Sensor Based on Φ-OTDR. Infrared Phys. Technol. 2020, 109, 103414.
27. Muckenhirn, H.; Doss, M.M.; Marcell, S. Towards Directly Modeling Raw Speech Signal for Speaker Verification Using CNNs. In Proceedings of the 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Calgary, AB, Canada, 15–20 April 2018; pp. 4884–4888.
28. Li, S.; Li, R.; Yang, J.; Wu, F.; Rashed, G.I. Combined Prediction of Photovoltaic Power Based on Sparrow Search Algorithm Optimized Convolution Long and Short-Term Memory Hybrid Neural Network. Electronics 2022, 11, 1654.
29. Woo, S.; Park, J.; Lee, J.-Y.; Kweon, I.S. CBAM: Convolutional Block Attention Module. In Proceedings of the European Conference on Computer Vision (ECCV), Munich, Germany, 8–14 September 2018; pp. 3–19.
30. Liang, Z.; Wang, L.; Tao, M.; Xie, J.; Yang, X. Attention Mechanism Based ResNeXt Network for Automatic Modulation Classification. In Proceedings of the 2021 IEEE Globecom Workshops (GC Wkshps), Madrid, Spain, 7–11 December 2021; pp. 1–6.
31. Wang, L.; Yao, W.; Chen, C.; Yang, H. Driving Behavior Recognition Algorithm Combining Attention Mechanism and Lightweight Network. Entropy 2022, 24, 984.
32. Bao, J.; Mo, J.; Xu, L.; Liu, Y.; Lv, X. VMD-Based Vibrating Fiber System Intrusion Signal Recognition. Optik 2020, 205, 163753.

Figure 1. LSTM network structure.

Figure 2. Channel attention and spatial attention in CBAM.

Figure 3. The actual measurement environment: (a) hung nets; (b) buried.

Figure 4. Overall research flow chart.

Figure 5. Signal preprocessing flow chart.

Figure 6. Four kinds of fiber vibration signals after pretreatment: (a) flap signal, (b) knock signal, (c) run signal; (d) walk signal.

Figure 7. CNN-CBAM-LSTM network flow chart.

Figure 8. One-dimensional CNN-CBAM-LSTM network structure.

Figure 9. Loss curve and accuracy curve: (a) loss curve, (b) accuracy curve.

Figure 10. Confusion matrix.

Table 1. Network parameter setting.

Layers	Input Size	Kernel (k)/Stride (s)	Output Size	Activation
Conv1	80,000 × 1	k = 1/s = 1	80,000 × 80	ReLU
CBAM	\
Conv2	80,000 × 80	k = 80/s = 80	1000 × 128	ReLU
Maxpool1	1000 × 128	k = 2/s = 2	500 × 128
Conv3	500 × 128	k = 4/s = 2	249 × 80	ReLU
Maxpool2	249 × 80	k = 3/s = 2	124 × 80
Conv4	124 × 80	k = 4/s = 2	61 × 80	ReLU
Maxpool3	61 × 80	k = 3/s = 2	30 × 80
LSTM1	(400, 800, num_layer = 2)
LSTM2	(800, 256, num_layer = 2)
FC1	(256, 128)
FC2	(128, 4)
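The sequence lengths in Table 1 can be checked with the usual output-length formula for an unpadded one-dimensional convolution or pooling layer, floor((L − k) / s) + 1. The sketch below assumes zero padding and no dilation, which the table does not state explicitly; the helper name is hypothetical.

```python
def out_len(length, k, s):
    """Output length of an unpadded 1-D convolution or pooling layer."""
    return (length - k) // s + 1


# (layer name, kernel size k, stride s) as listed in Table 1.
layers = [
    ("Conv1", 1, 1),
    ("Conv2", 80, 80),
    ("Maxpool1", 2, 2),
    ("Conv3", 4, 2),
    ("Maxpool2", 3, 2),
    ("Conv4", 4, 2),
    ("Maxpool3", 3, 2),
]

length = 80_000                      # samples in one preprocessed OFVS
lengths = []
for name, k, s in layers:
    length = out_len(length, k, s)
    lengths.append(length)
    print(f"{name}: {length}")
```

Under these assumptions the computed lengths agree with the output sizes listed in the table, from 80,000 after Conv1 down to 30 after Maxpool3.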

Table 2. Performance evaluation form.

Signal	Precision	Recall	Specificity
Flap	0.964	0.964	0.989
Knock	0.951	1.0	0.984
Walk	1.0	1.0	1.0
Run	1.0	0.956	1.0

Table 3. Comparison of different models.

Recognition	VMD	AF-CNN	CNN-LSTM	CNN-CBAM-LSTM
Flap	93.3%	94.6%	97.5%	96.4%
Knock	99.3%	93.4%	92.4%	99.9%
Walk	86.9%	100%	100%	99.9%
Run	92.9%	98.5%	98.4%	95.5%
Average	92.2%	96.7%	97.1%	97.9%

Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

© 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
