Research on Multi-Source Simultaneous Recognition Technology Based on Sagnac Fiber Optic Sound Sensing System

Abstract: To solve the problem of recognizing multiple sounds in applications of the Sagnac optical fiber acoustic sensing system, a multi-source synchronous recognition algorithm is proposed. It combines the VMD (variational mode decomposition) algorithm and the MFCC (Mel-frequency cepstral coefficient) algorithm to pre-process the photoacoustic sensing signal and uses a BP neural network to recognize it. The modal analysis and feature extraction theory of the photoacoustic sensing signal based on the VMD and MFCC algorithms is presented. The signal recognition theory analysis and the design of the recognition program were completed based on the BP neural network. Signal acquisition for different sounds and verification experiments of the recognition system were carried out in a laboratory environment using the Sagnac fiber optic sound sensing system. The experimental results show that the proposed recognition algorithm achieves a simultaneous recognition rate better than 96.5% for six types of sounds and that recognition of the optical acoustic signal takes less than 5.3 s. The system is therefore capable of real-time sound detection and recognition, which opens the way to further applications of Sagnac-based optical fiber acoustic sensing systems.


Introduction
In recent years, distributed fiber optic sensing technology has become a research hot spot in the field of sensing because of its good environmental tolerance, immunity to electromagnetic interference, and ease of long-distance, wide-range monitoring. It is widely used in fields such as perimeter security [1] and intrusion detection [2,3].
Compared with other fiber optic sensors, Sagnac interference-based fiber optic sensing technology has been widely explored by researchers because of its higher signal-to-noise ratio, higher sensitivity, and better adaptability to harsh environments [4][5][6]. In research on Sagnac fiber optic sensing systems, real-time online identification of the detected signal is an important direction. Bao et al. used the VMD algorithm to improve the recognition accuracy of a Sagnac fiber optic perimeter security system for intrusion signals [7]. Wang et al. used an ESN (echo state network)-based identification method to accurately identify different types of intrusion signals in fiber optic perimeter security systems [8]. Ren et al. proposed a high-performance railroad perimeter security system based on an online TDM-FFPI (time-division multiplexed fiber-optic Fabry-Perot interferometric) sensor array, with an average recognition rate of 94.5% for four types of intrusion signals [9]. Li et al. proposed a novel, generalized DAS (distributed acoustic sensing) identification framework to be deployed on high-speed railroads for real-time intrusion threat detection, with an 85.6% recognition rate [2]. Chen et al. applied the SMS (single mode-multimode-single mode) fiber optic structure to an intrusion detection system, enabling the effective identification of man-made and natural events along the area perimeter.
The above research provides solutions for intrusion signal recognition in a variety of application scenarios. However, research on multi-target identification for Sagnac fiber optic sensing systems deployed in harsh environments, such as plateau border security, has not been reported. Therefore, in this paper, a multi-target online recognition algorithm is studied for the linear Sagnac fiber optic sound sensing system [10,11].
Photonics 2023, 10, 1003

Sagnac Interference-Based Fiber Optic Acoustic Sensing Principle
As shown in Figure 1, the linear Sagnac fiber optic acoustic sensing system reduces the length of the non-sensing signal transmission fiber by nearly half, which also reduces the interference of environmental noise with the system at the source. In the linear Sagnac interferometric optical path, the light emitted by an SLD (super-luminescent light-emitting diode) enters the optical path through port 1 of the 3 × 3 coupler C1, is split by C1, and is output from port 5 (CCW, counterclockwise) and port 6 (CW, clockwise). The light output from port 6 arrives at the 2 × 1 coupler C2 after passing through the delay fiber ring L1, while the light output from port 5 enters C2 directly. After C2, the two beams propagate independently and arrive at port 1 of the 1 × 2 coupler C3 after passing through the pickup fiber ring L2. Each beam leaves C3 through port 2 or port 3, re-enters C3 through the other of these two ports, exits from port 1, and returns along the original optical path. Finally, the two beams interfere at C1, and the interfering light is output to the PD (photoelectric detector) via port 3 of C1.
When a sound field is applied to the pickup fiber ring L2 of the linear Sagnac fiber optic sound sensing system, the light from both optical paths passes through the pickup twice. The phase sensitivity of the system is therefore given by Equation (1) [10], where Δφ denotes the phase difference between the two coherent beams under the action of the sound field, n denotes the refractive index of the optical fiber, f denotes the frequency of the sound, k is a constant determined by the refractive index of the fiber, the modulus of elasticity of the pickup structure and the elasto-optic coefficient of the fiber, L1 is the length of the delay fiber ring, L2 is the length of the sensing fiber ring, λ is the wavelength of the light, c is the speed of light in a vacuum and P is the sound pressure acting on the pickup structure.
The interferometric light intensity obtained from the detector is given by Equation (2) [10], where Δψ is the non-reciprocal phase shift introduced by the 3 × 3 coupler, with Δψ = 2π/3. Substituting Equation (1) into Equation (2) yields Equation (3), which relates the detected intensity to the sound pressure.
The photodetector receives the interferometric light signal carrying the sound information and converts it into a current signal. A high-speed data acquisition circuit based on an FPGA (field-programmable gate array) chip performs I-V conversion, amplification and filtering, analogue-to-digital conversion and storage of the broadband signal to obtain the digital optical sound sensing signal. Finally, the sound is recognized from the features of this signal.

Optical and Acoustic Signal Recognition Algorithm Design
The block diagram of sound signal recognition for the linear Sagnac fiber optic sound sensing system is shown in Figure 2. The photoacoustic signals collected by the sensing system are preprocessed and passed to the signal recognition system. First, a sufficient number of sound signals are selected as training signals, and their features are extracted using the VMD algorithm [12,13] and MFCC [14]. The extracted features are then fed to a classification model based on a BP neural network, which classifies the signal and outputs the recognition result for the sound under test.
In the signal preprocessing stage, the VMD algorithm is used to decompose the signal [15][16][17]. It minimizes the bandwidth of each mode while keeping the sum of the decomposed modes consistent with the original signal. VMD-based modal decomposition increases the dimensionality of the signal and facilitates the extraction of more signal features. On the one hand, it increases the amount of training data; on the other hand, the dominant feature components can be extracted more readily than from the original signal, which reduces the interference of secondary components with the target classification and improves the classification accuracy.
During signal feature extraction using the VMD algorithm, the IMF (intrinsic mode function) is expressed as follows [15]:

$$I_n(t) = A_n(t)\cos\left(\varphi_n(t)\right) \tag{4}$$

where $\varphi_n(t)$ is the phase of $I_n(t)$, with $\varphi_n'(t) \ge 0$; $A_n(t)$ is the instantaneous amplitude of $I_n(t)$, with $A_n(t) \ge 0$; and $\omega_n(t) = \varphi_n'(t)$ is the instantaneous frequency of $I_n(t)$. $A_n(t)$ and $\omega_n(t)$ change more slowly than the phase $\varphi_n(t)$, so $I_n(t)$ is approximately a harmonic signal of amplitude $A_n(t)$ and frequency $\omega_n(t)$.
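The IMF definition above can be made concrete with a short Python/NumPy sketch. It assumes the standard amplitude- and frequency-modulated form I(t) = A(t)cos(φ(t)) and recovers the instantaneous frequency as the derivative of the phase. The sampling rate, carrier frequency and modulation depths are illustrative assumptions, not values from the experiments.

```python
import numpy as np

def make_imf(amp, phase):
    """Return I(t) = A(t) * cos(phi(t)) for sampled amplitude and phase arrays."""
    return amp * np.cos(phase)

def instantaneous_frequency(phase, fs):
    """Instantaneous frequency in Hz, i.e. (1 / 2*pi) * d(phi)/dt."""
    return np.gradient(phase, 1.0 / fs) / (2.0 * np.pi)

fs = 8000.0                                    # assumed sampling rate (Hz)
t = np.arange(0.0, 1.0, 1.0 / fs)
# Slowly varying, non-negative envelope A(t) (illustrative values)
amp = 1.0 + 0.2 * np.cos(2 * np.pi * 2.0 * t)
# Phase of a 440 Hz carrier whose frequency wobbles as 440 + 5*cos(2*pi*t) Hz
phase = 2 * np.pi * 440.0 * t + 5.0 * np.sin(2 * np.pi * t)

imf = make_imf(amp, phase)
inst_freq = instantaneous_frequency(phase, fs)
```

Because the envelope and the frequency wobble vary at a few hertz while the carrier oscillates at 440 Hz, the component behaves locally like a harmonic signal, which is the property the definition relies on.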
In the process of iteratively solving the variational model, the center frequencies and bandwidths of the IMF components are continuously updated. Based on the frequency-domain characteristics of the signal, its frequency band is adaptively partitioned to obtain multiple narrowband IMF components. The original signal is decomposed into n IMF components via VMD, and the corresponding constrained variational model is as follows:

$$\min_{\{I_n\},\{\omega_n\}} \left\{ \sum_{n} \left\| \partial_t \left[ \left( \delta(t) + \frac{j}{\pi t} \right) * I_n(t) \right] e^{-j\omega_n t} \right\|_2^2 \right\} \quad \text{s.t.} \quad \sum_{n} I_n(t) = x(t) \tag{5}$$

where $\{I_n\} = \{I_1, I_2, \ldots, I_n\}$ are the n IMF components obtained by the VMD method, $\{\omega_n\} = \{\omega_1, \omega_2, \ldots, \omega_n\}$ are the center frequencies of the IMF components, $\delta(t)$ is the unit impulse function, $*$ denotes convolution and $x(t)$ is the original signal.
To transform the constrained variational problem into an unconstrained one, a quadratic penalty factor α and a Lagrange multiplier λ are introduced into Equation (5), giving the augmented Lagrangian:

$$L(\{I_n\},\{\omega_n\},\lambda) = \alpha \sum_{n} \left\| \partial_t \left[ \left( \delta(t) + \frac{j}{\pi t} \right) * I_n(t) \right] e^{-j\omega_n t} \right\|_2^2 + \left\| x(t) - \sum_{n} I_n(t) \right\|_2^2 + \left\langle \lambda(t),\; x(t) - \sum_{n} I_n(t) \right\rangle \tag{6}$$

where $x(t)$ is the original signal. The optimal solution of Equation (6) is obtained using the alternating direction method of multipliers (ADMM), which yields the n narrowband IMF components. According to this principle, the photoacoustic signal is decomposed into n IMF signals via the VMD algorithm. In this paper, the decomposition works best with n = 6, i.e., the VMD algorithm decomposes the signal into six different modes.
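The alternating updates used to solve the variational model can be sketched in a few lines of Python/NumPy. This is a simplified illustration, not the Matlab implementation used in the paper (which is not shown): it works on the one-sided spectrum, omits the mirror extension used by common VMD implementations, and the function name `vmd_sketch` and the default values of `alpha`, `tau`, `tol` and `max_iter` are assumptions.

```python
import numpy as np

def vmd_sketch(signal, n_modes, alpha=2000.0, tau=0.0, tol=1e-7, max_iter=500):
    """Simplified VMD: ADMM-style updates in the frequency domain.

    Returns (modes, omega): real modes of shape (n_modes, T) and their center
    frequencies in cycles per sample, sorted by increasing frequency.
    """
    signal = np.asarray(signal, dtype=float)
    T = signal.size
    freqs = np.fft.fftfreq(T)                    # cycles/sample in [-0.5, 0.5)
    positive = freqs >= 0
    x_hat = np.fft.fft(signal) * positive        # one-sided input spectrum
    u_hat = np.zeros((n_modes, T), dtype=complex)
    omega = 0.5 * np.arange(n_modes) / n_modes   # uniform initial center frequencies
    lam_hat = np.zeros(T, dtype=complex)         # Lagrange multiplier spectrum

    for _ in range(max_iter):
        u_prev = u_hat.copy()
        for k in range(n_modes):
            # Wiener-filter-like update of mode k around its center frequency.
            residual = x_hat - u_hat.sum(axis=0) + u_hat[k]
            u_hat[k] = (residual + lam_hat / 2) / (1 + 2 * alpha * (freqs - omega[k]) ** 2)
            u_hat[k] *= positive
            # New center frequency: power-weighted mean over the spectrum.
            power = np.abs(u_hat[k]) ** 2
            omega[k] = np.sum(freqs * power) / np.sum(power)
        # Dual ascent on the reconstruction constraint (tau = 0 disables it).
        lam_hat = lam_hat + tau * (x_hat - u_hat.sum(axis=0))
        if np.sum(np.abs(u_hat - u_prev) ** 2) / T < tol:
            break

    # Back to the time domain: double the positive bins, keep DC once.
    # (The Nyquist bin of an even-length signal is dropped in this sketch.)
    recon_gain = np.where(freqs > 0, 2.0, 0.0)
    recon_gain[0] = 1.0
    modes = np.real(np.fft.ifft(u_hat * recon_gain, axis=1))
    order = np.argsort(omega)
    return modes[order], omega[order]
```

For example, calling `vmd_sketch(x, 6)` on a sampled sensing signal `x` would return six modes, matching the choice n = 6 above. With `tau = 0` the multiplier stays at zero, so the modes are not forced to sum exactly to the input, a setting often preferred for noisy signals.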
After the modal decomposition using the VMD algorithm, the MFCC algorithm [18,19], which is robust and not easily disturbed by fluctuations in the signal-to-noise ratio, is used to extract the signal features. In MFCC feature extraction, the Mel frequency is introduced to give a linearized description of the non-linear sensitivity to sound. The conversion between the Mel frequency and the actual frequency is as follows [20]:

$$m = 2595 \log_{10}\left(1 + \frac{f}{700}\right) \tag{7}$$

where m is the Mel frequency and f is the actual frequency in Hz. The relationship between the Mel frequency and the actual linear frequency is shown in Figure 3a. As shown in Figure 3b, MFCC sets up a bank of Mel filters from low to high frequencies, spaced from dense to sparse according to the critical bandwidth, and filters the input signal; the output energy of each filter is used as a basic feature of the signal.
The MFCC parameters in the signal feature extraction process are given by Equation (8) [18], where d_t denotes the t-th first-order difference and C_t denotes the t-th standard MFCC parameter; k denotes the time difference of the first-order derivative, usually taken as k = 1 in implementation; and Q denotes the order of the MFCC parameters. The MFCC parameters used in this paper consist of the static MFCC parameters of the photoacoustic signal together with their first-order and second-order differences [21]; the second-order differences are calculated by substituting the first-order results into Equation (8) again.
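The Mel conversion and the difference features can be sketched as follows. The Mel mapping assumes the widely used form m = 2595 log10(1 + f/700). The `delta` function uses one common regression-style difference with edge padding; its normalization (dividing by 2Σk²) is an assumption and may differ from the paper's Equation (8), since some references normalize by the square root of that sum instead. All function names are illustrative.

```python
import numpy as np

def hz_to_mel(f_hz):
    """Map frequency in Hz to the Mel scale."""
    return 2595.0 * np.log10(1.0 + np.asarray(f_hz, dtype=float) / 700.0)

def mel_to_hz(mel):
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (np.asarray(mel, dtype=float) / 2595.0) - 1.0)

def delta(features, width=1):
    """Difference features along time (axis 0) for an array of shape (frames, coeffs).

    d_t = sum_k k * (c_{t+k} - c_{t-k}) / (2 * sum_k k^2), k = 1..width,
    with the first and last frames repeated at the edges (assumed convention).
    """
    features = np.asarray(features, dtype=float)
    n_frames = features.shape[0]
    padded = np.pad(features, ((width, width), (0, 0)), mode="edge")
    numerator = np.zeros_like(features)
    for k in range(1, width + 1):
        ahead = padded[width + k : width + k + n_frames]
        behind = padded[width - k : width - k + n_frames]
        numerator += k * (ahead - behind)
    return numerator / (2.0 * sum(k * k for k in range(1, width + 1)))

def static_delta_delta2(mfcc, width=1):
    """Stack static coefficients with first- and second-order differences."""
    d1 = delta(mfcc, width)
    d2 = delta(d1, width)   # second order: apply the same difference again
    return np.hstack([mfcc, d1, d2])
```

With, say, 13 static coefficients per frame (an assumed number), `static_delta_delta2` yields 39 features per frame, which is the kind of vector that would be passed to the classifier.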
The characteristic parameters of the photoacoustic signal obtained via MFCC are identified using a BP neural network [22]. The topology of a neuron in the BP neural network is given by Equation (9). The Sigmoid function is used as the activation function in the recognition procedure, and its expression is shown in Equation (10):

$$f(x) = \frac{1}{1 + e^{-x}} \tag{10}$$

A three-layer BP neural network, the structure of which is shown in Figure 4, can realize a mapping from any m dimensions to n dimensions. The numbers of nodes in the input and output layers of the BP network are m and n, respectively, and the number of nodes in the hidden layer is l. The following relationship is generally satisfied:

$$l = \sqrt{m + n} + c \tag{11}$$

where c is the regulation parameter, which is taken as 1 to 10 in this paper. The BP neural network modelling process involves two stages: forward information transfer and reverse error transfer [23][24][25].
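As a small worked illustration, the sketch below evaluates the Sigmoid activation and lists candidate hidden-layer sizes. It assumes the common empirical rule l = sqrt(m + n) + c with c from 1 to 10, rounded to the nearest integer. The input size m = 39 (13 MFCC coefficients plus two orders of differences) is a hypothetical value, since the feature dimension is not stated; n = 6 corresponds to the six sound classes.

```python
import math

def sigmoid(x):
    """Numerically stable Sigmoid, f(x) = 1 / (1 + exp(-x))."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)

def hidden_node_candidates(m, n, c_values=range(1, 11)):
    """Candidate hidden-layer sizes from l = sqrt(m + n) + c (assumed rule)."""
    base = math.sqrt(m + n)
    return [round(base + c) for c in c_values]

# Hypothetical sizes: 39 input features, 6 output classes
candidates = hidden_node_candidates(39, 6)
```

In practice one would train a network for each candidate size and keep the one with the best validation accuracy; the rule only narrows the search range.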
Forward information transfer process
In the forward pass, the input pattern is passed from the input layer through the hidden layer to the output layer. Let the output value of the i-th node at layer m be $y_i^m$, its threshold be $\theta_i$, its activation value be $S_i$, the activation function f be the Sigmoid function, and the connection weight between this node and the j-th node at layer m − 1 be $\omega_{ij}$; the relationship between these quantities is given by Equation (12).
The forward pass calculates the output of each network node in turn according to Equation (12).
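The forward pass through a three-layer network can be sketched as below. The sketch assumes that each node computes S_i as the weighted sum of the previous layer's outputs plus its threshold and then applies the Sigmoid; the sign convention for the threshold is an assumption (some texts subtract it), and the layer sizes and random weights are placeholders.

```python
import numpy as np

def sigmoid(x):
    """Element-wise Sigmoid activation."""
    return 1.0 / (1.0 + np.exp(-x))

def forward(x, w_hidden, theta_hidden, w_out, theta_out):
    """Forward pass: input -> hidden -> output.

    w_hidden has shape (l, m) and w_out has shape (n, l), so row i holds the
    weights omega_ij from every node j of the previous layer to node i.
    """
    s_hidden = w_hidden @ x + theta_hidden      # activation values of hidden nodes
    y_hidden = sigmoid(s_hidden)                # hidden-layer outputs
    s_out = w_out @ y_hidden + theta_out        # activation values of output nodes
    y_out = sigmoid(s_out)                      # network outputs
    return y_hidden, y_out

# Placeholder sizes and random parameters (not trained values)
rng = np.random.default_rng(0)
m, l, n = 39, 12, 6
x = rng.normal(size=m)
w_hidden, theta_hidden = rng.normal(scale=0.1, size=(l, m)), np.zeros(l)
w_out, theta_out = rng.normal(scale=0.1, size=(n, l)), np.zeros(n)
y_hidden, y_out = forward(x, w_hidden, theta_hidden, w_out, theta_out)
predicted_class = int(np.argmax(y_out)) + 1     # labels 1..6
```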
Reverse error transfer process
The weights and thresholds of the network are adjusted so that the output value approximates the desired value. The adjustment follows the gradient steepest-descent rule, i.e., the weights and thresholds are changed along the direction in which the squared error decreases most rapidly. The output error function of the BP neural network is given by Equation (13), in which d denotes the output of the output layer and y denotes the expected value.
The adjustment of the weights and thresholds is given by Equation (14), in which η1 is the weight learning rate and η2 is the threshold learning rate. Each node in the BP neural network is adjusted according to Equation (14), and the reverse transfer process is controlled by setting the error tolerance and the number of iterations. The flow chart of photoacoustic signal recognition based on the BP neural network is shown in Figure 5.
During model training, the training samples are input to the initialized BP neural network, and the output value and the error value E are obtained through hidden-layer processing and the output layer. When the output meets the accuracy requirement or the number of iterations reaches the specified limit, the BP neural network model is complete and can be used to recognize and classify the test signals.
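The training procedure described above, with its two stopping conditions, can be sketched as a small gradient-descent loop. The sketch assumes a mean half-squared-error loss and separate learning rates for weights and thresholds, corresponding to η1 and η2. The function name `train_bp`, the default values and the weight initialization are assumptions, not the paper's settings.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def train_bp(inputs, targets, n_hidden, eta_w=0.5, eta_theta=0.5,
             tol=1e-3, max_iter=5000, seed=0):
    """Train a three-layer Sigmoid network by batch gradient descent.

    inputs: (N, m) feature matrix; targets: (N, n) expected outputs (one-hot).
    Stops when the error E falls below tol or after max_iter iterations.
    Returns the parameters and the error history.
    """
    rng = np.random.default_rng(seed)
    n_in, n_out = inputs.shape[1], targets.shape[1]
    w1 = rng.normal(0.0, 0.5, (n_in, n_hidden))
    b1 = np.zeros(n_hidden)
    w2 = rng.normal(0.0, 0.5, (n_hidden, n_out))
    b2 = np.zeros(n_out)
    history = []
    for _ in range(max_iter):
        # Forward information transfer
        hidden = sigmoid(inputs @ w1 + b1)
        outputs = sigmoid(hidden @ w2 + b2)
        error = outputs - targets
        E = 0.5 * np.mean(np.sum(error ** 2, axis=1))
        history.append(E)
        if E < tol:
            break
        # Reverse error transfer: gradients of E with respect to each layer
        delta_out = error * outputs * (1.0 - outputs) / len(inputs)
        delta_hidden = (delta_out @ w2.T) * hidden * (1.0 - hidden)
        # Steepest-descent updates of weights (eta_w) and thresholds (eta_theta)
        w2 -= eta_w * hidden.T @ delta_out
        b2 -= eta_theta * delta_out.sum(axis=0)
        w1 -= eta_w * inputs.T @ delta_hidden
        b1 -= eta_theta * delta_hidden.sum(axis=0)
    return (w1, b1, w2, b2), history
```

A call such as `train_bp(features, one_hot_labels, n_hidden=12)` would return the trained parameters together with the error curve, which can be plotted to check convergence in the same way as the loss curves reported in the experiments.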

Experiment
LabVIEW is used to build a virtual instrument platform for signal acquisition and storage. Feature extraction of the photoacoustic signals based on the VMD and MFCC algorithms is carried out in Matlab 2018b, and the signals are classified and identified by the designed BP neural network algorithm.
Six types of sounds are selected as experimental test signals: small helicopter, Boeing aircraft, Hummer, gale, quadrotor UAV and fixed-wing UAV. Figure 6 shows the photoacoustic sensing signals acquired by the Sagnac-based sensing system for these sounds. For every sound, five groups of signals were collected, each containing 100 signals, so the total number of response signals was 3000. In each experiment, 90% of the signals from each group were randomly selected as training samples and the remaining 10% were used as test samples to verify the accuracy of the recognition algorithm. The training set thus contained 2700 samples and the test set 300 samples. The recognition model was trained, and the accuracy and recognition time of the system were then tested.
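The per-group random split described above can be sketched as follows. Each sample is represented only by a (class, group, index) triple; how the signals themselves are stored is not specified, so this indexing scheme and the function name are assumptions.

```python
import numpy as np

def split_per_group(n_classes=6, n_groups=5, group_size=100,
                    train_fraction=0.9, seed=0):
    """Randomly split every group into training and test samples.

    Returns two lists of (class_label, group_index, sample_index) triples.
    """
    rng = np.random.default_rng(seed)
    n_train = int(round(group_size * train_fraction))
    train, test = [], []
    for label in range(1, n_classes + 1):
        for group in range(n_groups):
            order = rng.permutation(group_size)
            train += [(label, group, int(i)) for i in order[:n_train]]
            test += [(label, group, int(i)) for i in order[n_train:]]
    return train, test

train_set, test_set = split_per_group()
```

Splitting inside each group keeps the class balance identical in both sets: every class contributes 450 training and 50 test samples.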
Figure 6. Response of the system to different sound signals.
As shown in Figure 7a, the training accuracy stabilized and converged to around 93% after 150 rounds. The test accuracy also fluctuated around 93%, with slightly larger fluctuations than the training curve. These results show that the training and test accuracy remained consistent. The loss curves are shown in Figure 7b. After 150 rounds, the training loss became steady, converging to between 0.1 and 0.2, and the test loss also gradually stabilized at about 0.2. The training and test losses were therefore consistent as well.
As shown in Table 1, the BP neural network achieved high recognition rates for the six sounds, with 100% accuracy for the small drones in trials 1 and 2, 100% accuracy for the quadrotor UAV in the third trial, and 100% accuracy for the fixed-wing UAV in the fourth trial. The lowest recognition accuracy, 90.91%, occurred for the Hummer in the fourth trial, while all other tests were above 92%; the average accuracy over the five experiments was 96.50%. As can be seen from Table 2, the average training time of the BP neural network is 42 s and the average recognition time is 5.3 s, which allows real-time monitoring of intrusion disturbances.
The first set of experiments in Table 1 is analyzed in detail, using 1, 2, 3, 4, 5 and 6 as labels for small helicopter, Boeing aircraft, Hummer, gale, quadrotor UAV and fixed-wing UAV, respectively. The 250 randomly selected test samples in this experiment contained 41 samples of category 1, 47 of category 2, 56 of category 3, 34 of category 4, 39 of category 5 and 33 of category 6. The test samples were classified, and the results of the BP neural network recognition of the six sound signals are shown in Figure 8. The recognition results for the 250 test samples contain nine errors, in which the predicted sound did not match the actual sound. To show the misclassification statistics more intuitively, the labels of the misclassified data and their original information are summarized in Table 3.
As shown in Table 3, the nine false identifications in this experiment comprise three for the Hummer, two each for the Boeing aircraft and the quadrotor UAV, and one each for the gale and the fixed-wing UAV, giving an overall false identification rate of 3.6% (9 of 250) for the recognition system. The recognition errors are attributed to the small sample size of the BP neural network input data and to the feature extraction algorithm not yet being optimized.
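The error statistics of the first experiment can be checked with a few lines of arithmetic. The class counts and error counts below are taken from the text; attributing each error to the actual (rather than the predicted) class is an assumption, so the per-class accuracies are indicative only, while the overall rate does not depend on it.

```python
class_names = ["small helicopter", "Boeing aircraft", "Hummer",
               "gale", "quadrotor UAV", "fixed-wing UAV"]
test_counts = [41, 47, 56, 34, 39, 33]   # samples per category in the first experiment
error_counts = [0, 2, 3, 1, 2, 1]        # false identifications (assumed keyed by actual class)

total_samples = sum(test_counts)
total_errors = sum(error_counts)
overall_error_rate = total_errors / total_samples
overall_accuracy = 1.0 - overall_error_rate

# Per-class accuracy under the stated assumption
per_class_accuracy = {
    name: (count - errors) / count
    for name, count, errors in zip(class_names, test_counts, error_counts)
}

print(f"{total_errors} errors in {total_samples} samples: "
      f"error rate {overall_error_rate:.1%}, accuracy {overall_accuracy:.1%}")
```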

Conclusions
Based on the Sagnac optical fiber acoustic sensing system, a feature extraction algorithm combining the VMD and MFCC algorithms with a BP neural network classifier was proposed, and a multi-target recognition system for optical acoustic signals was built. Simultaneous multi-target recognition experiments were completed for six types of sound signals: small helicopter, Boeing aircraft, Hummer, gale, quadrotor UAV and fixed-wing UAV. A total of 3000 sets of data were used in the experiment; 2700 sets were randomly selected as training samples for the neural network, and the remaining 300 sets were used as test samples to verify the recognition accuracy. The experimental results show that the accuracy of the BP neural network algorithm is better than 96.5% for the six-class recognition of the response signals, and the recognition time of the photoacoustic signal is less than 5.3 s. Future work will focus on increasing the number of training samples and optimizing the feature extraction algorithm to further improve the recognition accuracy of the system.