A Multi-Position Approach in a Smart Fiber-Optic Surveillance System for Pipeline Integrity Threat Detection

Abstract: We present a new pipeline integrity surveillance system for long gas pipeline threat detection and classification. The system is based on distributed acoustic sensing with phase-sensitive optical time domain reflectometry (φ-OTDR) and pattern recognition for event classification. The proposal incorporates a multi-position approach in a Gaussian Mixture Model (GMM)-based pattern classification system which operates in a real-field scenario with a thorough experimental procedure. The objective is to exploit the availability of vibration-related data at positions near the one actually producing the main disturbance to improve the robustness of the trained models. The system integrates two classification tasks: (1) machine + activity identification, which identifies the machine that is working over the pipeline along with the activity being carried out, and (2) threat detection, which aims to detect suspicious threats to the pipeline integrity (independently of the activity being carried out). For the machine + activity identification mode, the multi-position approach for model training obtains better performance than the previously presented single-position approach for activities that show consistent behavior and high energy (between 6% and 11% absolute), with an overall increase of 3% absolute in the classification accuracy. For the threat detection mode, the proposed approach achieves an 8% absolute reduction in the false alarm rate with an overall increase of 4.5% absolute in the classification accuracy.


Introduction
Pipelines are used to transport liquid or gas energy sources from the producing factories to the final users. Factory operators must pay special attention to preventing accidents [1,2], which can cause human deaths. Moreover, most incidents around the factories related to natural gas transmission happen when the pipelines are damaged by works carried out in their surroundings. Therefore, developing cost-effective solutions that continuously monitor suspicious activities representing serious threats to the pipeline is a must.
Distributed Acoustic Sensing (DAS) technology can be effectively applied to this task, since it allows continuous interrogation of urban areas by measuring the activity occurring near an optical fiber buried under the ground. Moreover, phase-sensitive Optical Time Domain Reflectometry (φ-OTDR) in optical fibers has been shown to be a suitable DAS technology: conventional φ-OTDR-based sensors can sense several tens of kilometers, with spatial resolutions of a few meters, and have been shown to provide enough sensitivity to allow the detection of simulated signals [3] and people walking over a buried fiber [4]. By adding distributed optical amplification, sensing ranges of 100 km [5] and above 100 km

Related Work
For pipeline surveillance, combining DAS technology in long pipelines with pattern recognition system (PRS) strategies, which aim to recognize the activity being carried out, makes it possible to detect the vibration produced by a dangerous activity so that preventive action can be taken. In [19], three different events (climbing, kicking, and raining) are classified from level crossing rate features extracted from the acoustic signals. In [20], multi-scale wavelet decomposition is used to generate the features for identifying human intrusion under mild, normal, and extreme weather conditions. In [21], three different events (background noise, air movement, and hand perturbation) were classified by employing singular spectrum analysis for feature extraction and neural networks for classification. In [22], energy features based on the fast Fourier transform (FFT) were extracted from the acoustic signals and a support vector machine was employed to classify five different events (background noise, people walking with wind, shaking the fence, people walking without wind, and vibration exciter). In [23], morphological features based on the time and space domains were used within the feature extraction stage, and the relevance vector machine was employed for classifying three different events (people walking, digging, and vehicles). In [24,25], three different events are classified (background noise, human intrusion, and hand clapping in [24], and background noise, digging, and vehicles in [25]) using multi-scale wavelet decomposition and multi-scale wavelet packet decomposition as features and a neural network-based pattern recognition strategy.
The works presented in [26–29] employ the same data as those used in this work and aim to classify eight different classes (big excavator moving along the ground, big excavator hitting the ground, big excavator scrapping the ground, small excavator moving along the ground, small excavator hitting the ground, small excavator scrapping the ground, plate compactor compacting the ground, and pneumatic hammer compacting the ground) in a machine + activity identification task, along with threat detection, whereas in [30], a system that addresses only threat detection from the same data is presented. In [30], the features are based on the FFT and the energy in frequency bands, and the classification is carried out with Gaussian mixture modeling. The approach presented in [26] adds a new machine + activity identification mode and two feature vector normalization schemes to the work presented in [30]. In [27], an augmented feature set (based on neural networks) with respect to the one used in [26] was presented, and system combination was based on the different feature sets. The work presented in [28] incorporates a hidden Markov model (HMM) approach for acoustic signal classification over [26], and the work presented in [29] adds this HMM approach to the system presented in [27].
A review of the target DAS + PRS scenarios can be found in [31], and an exhaustive review of DAS + PRS strategies which includes a thorough methodology regarding database construction can be found in [32].
DAS + PRS technology can also be used to solve other problems, such as fish deaths in a fish farm, addressed in [33] by taking preventive action when unauthorized work is carried out in the fish farm surroundings. PRS technology can also be applied to smart grid infrastructure to address critical issues [34]. Moreover, edge computing is a growing technology in which DAS + PRS can also be implemented [35]. In the smart factory research field, this technology allows a decision to be made for each suspicious activity so that the corresponding action can be taken.
Regarding the PRS technology itself, evolutionary algorithms that aim to mimic the behavior of the biological evolution such as the Chicken Swarm Optimization algorithm and its variants [36] can also be applied.
The use of robots for pipeline surveillance also deserves special attention. In [37], an autonomous vehicle is employed to detect pipeline failures due to corrosion using Markov decision processes. In [38], the robot is employed to detect leakage along the pipeline and also integrates a mechanism to remove the obstacles the robot finds along its path. In [39], an ultrasonic signal-based robotic system was proposed to detect pipeline defects. In [40], a two-leg robotic system is employed to monitor horizontal, vertical, and coupling pipes. In [41], an autonomous robot based on an A* algorithm is used for pipeline inspection. Video processing along with simulated annealing algorithms is used in the robotic system presented in [42] to inspect gas pipelines. Gaussian processes with a covariance function for cylindrical structure representations have been employed in the context of water pipeline inspection in [43]. A review of the use of robots for pipeline inspection can be found in [44].
In [26], we presented a pipeline integrity surveillance system based on:
• A φ-OTDR sensor (named FINDAS) [78] for acoustic signal acquisition.
• Feature extraction that outputs the feature vectors with spectral information for acoustic signal representation.
• Feature vector normalization that employs the high-frequency content of the recorded signals.
• Pattern classification from a Gaussian mixture model (GMM) whose training was carried out from a single position for each recorded signal.
The pattern classification in [26] was based on single-position training (which is therefore our baseline here). For some activities, this may produce poor models due to data scarcity and the strong non-linear effects that are inherent to the fiber sensitivity response. On the other hand, employing more data as training material typically builds more robust models, so that the pattern classification performance can be enhanced. Taking into account the human cost of data acquisition/measurement and the power of DAS, which is able to simultaneously record several positions around the location for which the disturbance is maximum (for example, 400 positions in our system, each separated by 1 m) that capture the vibration of the corresponding activity, we propose a multi-position approach in this work. This consists of an augmented system that uses the data acquired at several positions as the GMM training material, aiming to produce more robust models that enhance the pattern classification performance, and that also provides insights into the actual limitations of the possible performance improvements. Moreover, due to the existence of fading points along the fiber (i.e., fiber-optic positions that present null or very low sensitivity for the given perturbation, no matter whether these positions are close to the actual perturbation position or not), a method for selecting the fiber-optic position/s employed for system training is required. Therefore, the contribution of this work is twofold: (1) building more robust GMMs that may enhance the system performance by employing data recorded at multiple fiber-optic positions and (2) presenting an intelligent multi-position selection method to avoid the fading points in the fiber when the perturbation occurs.
To the best of our knowledge, this is the first work that presents a multi-position approach for detecting threats to the pipeline integrity from different fiber optic positions measured at the same time, and that is evaluated on realistic environment conditions. As in our previous work [26], the pipeline integrity surveillance system integrates two operation modes: machine + activity identification, where both the machine and the activity carried out are identified, and threat detection, where the occurrence of a suspicious threat for the pipeline integrity must be detected.
The rest of the paper is organized as follows: Section 2 presents the DAS + PRS strategy. The experimental procedure is described in Section 3. Experiments and results are presented in Section 4. A discussion is given in Section 5, and Section 6 concludes the work.

DAS + PRS: The Multi-Position Approach for Detecting Threats to the Pipeline Integrity
The system integrates two different modes: machine + activity identification and threat detection. In the first mode, the input acoustic signal is assigned to a certain machine + activity pair. In the threat detection mode, the signal is simply assigned the threat or non-threat class. The first mode is ideal for situations in which both the machine and the activity carried out must be known. The second mode is ideal for cases in which just a threat for the pipeline must be detected.
The system integrates different modules, as shown in Figure 1, which depicts an augmented architecture with respect to the one presented in [26]. The acquisition equipment in Figure 1 is connected to a 45 km optical fiber and consists of a DAS sensor in charge of measuring the suspicious activities to create the so-called acoustic traces (i.e., acoustic signals). The pattern recognition system integrates two different stages: the training and classification stages. These two stages contain the feature extraction and feature normalization modules, which are the same for both stages and aim to obtain the most significant features from the acquired acoustic signals. In the training stage, the normalized feature vectors corresponding to the training data are used to train the acoustic models from the signals previously recorded by the sensor. The GMM training module is in charge of the model building and also integrates the method for fiber-optic position selection. The labeling for each suspicious activity or threat/non-threat, depending on the system mode, is also needed in this stage so that the model for each suspicious activity or threat/non-threat can be effectively trained. The classification stage integrates a pattern classification module that employs the normalized feature vectors corresponding to the testing data and the models previously trained in the training stage to make the corresponding decision (machine + activity pair recognized in the machine + activity identification mode, or threat/non-threat in the threat detection mode) for each testing acoustic signal. All these modules are explained in more detail next.
The DAS system consists of a φ-OTDR-based sensor called FINDAS [78]. The system architecture for the FINDAS sensor is presented in Figure 2. We provide here a short summary of the sensor, and refer readers to [16] for more details on the sensing mechanism and sensor configuration.
The φ-OTDR employs Rayleigh scattering, which is an elastic scattering (without frequency shift) of light, and measures the changes that appear in the fiber state. The FINDAS sensor works with highly coherent optical pulses with a central wavelength near 1550 nm, which are given to the optical fiber. The signal that is backreflected from the fiber-optic is then recorded, so that the interference pattern produced by the Rayleigh backscattering (i.e., the φ-OTDR signal) can be monitored with the same fiber input. Carrying out a flight time mapping of the light inside the fiber, the φ-OTDR signal received at a certain timestamp is related to a certain fiber position. If vibrations are produced at a certain fiber-optic position, the relative positions in the fiber-optic of the Rayleigh scattering centers will change, and the φ-OTDR signal will be accordingly changed, which allows for distributed acoustic sensing [16]. Since the fiber-optic is not installed parallel to the pipeline, and in some positions, there are fiber rolls for maintenance, a fiber-optic calibration between fiber distance and location was carried out along the pipeline.
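The time-of-flight mapping described above can be illustrated with a short calculation. This is only a sketch: the group index value is an assumed typical figure for standard single-mode fiber near 1550 nm, and the function names are hypothetical rather than taken from the FINDAS sensor.

```python
C_VACUUM = 299_792_458.0   # speed of light in vacuum, m/s
GROUP_INDEX = 1.468        # assumption: typical group index of single-mode fiber near 1550 nm


def position_from_delay(delay_s, n_group=GROUP_INDEX):
    """Fiber position (m) whose backscatter arrives delay_s after the pulse launch."""
    # The light travels to the scattering point and back, hence the factor 2.
    return C_VACUUM * delay_s / (2.0 * n_group)


def round_trip_delay(position_m, n_group=GROUP_INDEX):
    """Round-trip time (s) for light reaching position_m and returning."""
    return 2.0 * n_group * position_m / C_VACUUM


delay_45km = round_trip_delay(45_000.0)
print(f"round trip over 45 km: {delay_45km * 1e6:.1f} microseconds")
print(f"highest pulse rate without overlapping returns: {1.0 / delay_45km:.0f} Hz")
```

Under these assumptions, the round trip over the 45 km fiber takes roughly 0.44 ms, so returns from consecutive pulses do not overlap as long as the pulse rate stays below about 2.2 kHz.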
The FINDAS sensor has an (optical) spatial resolution of five meters (readout resolution of one meter) and is able to sense up to 45 km with standard single-mode fiber. The sampling frequency for the signal acquisition was set to $f_s = 1085$ Hz. When the energy of the monitored vibrations for a certain fiber position is higher than a predefined threshold, the acoustic samples on the top of the fiber are recorded by the FINDAS sensor to create a 20 s length acoustic trace. This acoustic trace is then given to the pipeline integrity threat detection system (described next) for classification. The system allows for the detection of multiple activities at different fiber-optic positions simultaneously, by giving a suitable threshold to each fiber position (that can be trained from past activity monitoring periods). This strategy is used in the system to avoid false alarms, so that the classification stage is only activated when there is a perturbation with an energy level above the previously trained threshold. This approach for activity detection leads to a 100% accuracy in the detection of the "no-activity" condition.

Signal Behavior
Both the physical process related to signal measurement and the different properties that each sensed location conveys make the signal features vary considerably.
On the one hand, the vibration propagation along the fiber is heavily influenced by: (1) the distance from the machine that is carrying out a certain activity to the acquisition sensor, (2) the ground characteristics (dry or wet, compact or soft, etc.), and (3) the mechanical coupling of the fiber to the pipeline area. As a result, the acoustic signal recorded for a certain activity differs significantly when the recording location changes. Moreover, the background noise can also have different properties at different locations (due to the nearness of houses, animals, roads, factories, etc.).
On the other hand, the transduction function of a φ-OTDR-based sensor is non-linear, which is particularly important for strong perturbations, as those considered in this work. Moreover, due to the random characteristic of a φ-OTDR signal, certain points that are distributed in a random way along the fiber (the so-called fading points [16]) may have low, or even null sensitivity to fiber-optic vibrations. This can be solved by analyzing several adjacent fiber-optic points to ensure that the recorded acoustic signal actually corresponds to the suspicious activity for a given location. However, the fiber sensitivity can strongly vary in a certain location from one point to another, even when they are close to each other.
Moreover, due to the so-called fiber losses, the optical power received from a certain fiber-optic position at a distance of $d_m$ meters from the beginning of the fiber, denoted as $P(d_m)$ (and hence the amplitude of the measured signals), will present an exponential decay. The coefficient for the fiber attenuation is $\alpha \approx 0.0002$ dB/m at the FINDAS operating laser wavelength (1550 nm). For a certain optical power at the beginning of the fiber, denoted as $P(0)$, the following equation holds:
$$P(d_m) = P(0) \cdot 10^{-2 \cdot d_m \cdot \alpha / 10}.$$
Taking into account the full round trip of the light in the fiber, this generates a 3 dB decay for every 7.5 km. This means that the optical loss effects will be relevant for long distances (e.g., tens of kilometers), as in the scenario considered in this work.
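As a quick numerical check of the attenuation formula above, the following sketch evaluates the round-trip loss at a few distances. The attenuation coefficient is the one quoted in the text; the function names and the chosen distances are only illustrative.

```python
ALPHA_DB_PER_M = 0.0002  # fiber attenuation at 1550 nm, as quoted in the text


def round_trip_loss_db(distance_m, alpha=ALPHA_DB_PER_M):
    """Optical loss in dB for light travelling to distance_m and back."""
    return 2.0 * distance_m * alpha


def received_power(p0, distance_m, alpha=ALPHA_DB_PER_M):
    """P(d_m) = P(0) * 10^(-2 * d_m * alpha / 10)."""
    return p0 * 10.0 ** (-round_trip_loss_db(distance_m, alpha) / 10.0)


for d_km in (7.5, 15.0, 30.0, 45.0):
    d_m = d_km * 1000.0
    print(f"{d_km:5.1f} km: loss = {round_trip_loss_db(d_m):5.1f} dB, "
          f"P/P(0) = {received_power(1.0, d_m):.4f}")
```

At 7.5 km the loss is 3 dB, which matches the figure given above, and at the far end of a 45 km fiber the same formula gives about 18 dB.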
All these issues related to signal acquisition do undoubtedly affect the detection performance, so considering them in the system design is a must.

Feature Extraction
Feature extraction is used to obtain the most meaningful information present in the acoustic signals (i.e., traces) recorded by FINDAS, in the form of so-called feature vectors. Each acoustic trace (with 20 s length) is first divided into 1 s length segments (so-called frames), and a feature vector representing each frame is computed. Each frame overlaps 95% (i.e., 1030 samples for the sampling frequency $f_s = 1085$ Hz) with the previous one, so that consecutive frames, and hence consecutive feature vectors, change smoothly, as shown in our previous work [26]. Then, the FFT is applied to each frame to convert it from the time domain to the frequency domain, since the acoustic traces from the recorded machines and activities present a consistent spectral pattern that is suitable for further classification, as shown in [26]. The number of points of the FFT is set to 8192 as in [26] (therefore each frequency bin comprises 0.066 Hz, with 542.5 Hz being the maximum available frequency in the acoustic signal). Then, the energy values corresponding to 100 frequency bands within a 100 Hz bandwidth are computed from the obtained spectral information (this 100 Hz bandwidth was shown to be the best in preliminary experiments and was used in our previous work [26]). These energy values are stored in a feature vector, which will be denoted as $x_f$.
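The framing and band-energy steps can be sketched as follows. Several points are assumptions rather than details confirmed by the text: the hop of 55 samples is derived from the 1085-sample frame and the 1030-sample overlap, frames are zero-padded to the FFT size, the 100 bands are taken as 1 Hz wide and covering 0 to 100 Hz, and the bin spacing is computed as FS / N_FFT, which may follow a different convention from the 0.066 Hz figure quoted above. A naive DFT replaces the FFT only to keep the example free of external libraries.

```python
import cmath
import math

FS = 1085                   # sampling frequency in Hz (from the text)
FRAME_LEN = FS              # 1 s frame
OVERLAP = 1030              # samples shared by consecutive frames (from the text)
HOP = FRAME_LEN - OVERLAP   # 55 samples; derived here, not stated in the text
N_FFT = 8192                # FFT size (from the text); zero-padding is an assumption
N_BANDS = 100
BAND_HZ = 1.0               # assumption: 100 bands of 1 Hz covering 0-100 Hz


def split_frames(trace):
    """Cut a trace into overlapping 1 s frames."""
    last_start = len(trace) - FRAME_LEN
    return [trace[s:s + FRAME_LEN] for s in range(0, last_start + 1, HOP)]


def band_energies(frame):
    """Sum |X[k]|^2 of a zero-padded DFT into N_BANDS bands (naive DFT)."""
    energies = [0.0] * N_BANDS
    bin_hz = FS / N_FFT
    for k in range(N_FFT // 2 + 1):
        band = int((k * bin_hz) // BAND_HZ)
        if band >= N_BANDS:
            break  # bins above the used bandwidth are not part of x_f
        step = -2j * math.pi * k / N_FFT
        x_k = sum(x * cmath.exp(step * n) for n, x in enumerate(frame))
        energies[band] += abs(x_k) ** 2
    return energies


# Synthetic 1 s frame: a 20.5 Hz tone standing in for a machine vibration.
frame = [math.sin(2.0 * math.pi * 20.5 * n / FS) for n in range(FRAME_LEN)]
features = band_energies(frame)
print("band with the highest energy:", features.index(max(features)))
```

In a real system the inner sum would be replaced by a library FFT; the band bookkeeping would stay the same.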

Feature Vector Normalization
After the raw features have been computed in the feature extraction stage, a sensitivity-based normalization [26] is applied to each feature vector $x_f$: each energy value is divided by the energy above the used bandwidth (i.e., from 100 Hz to 542.5 Hz), as shown in Equation (1):
$$x_{fn}(i) = \frac{x_f(i)}{E_{high}}, \quad i = 1, \ldots, 100, \qquad (1)$$
where $x_{fn}(i)$ denotes position $i$ in the 100-dimensional normalized feature vector, $x_f(i)$ denotes position $i$ in the 100-dimensional raw feature vector, and $E_{high}$ denotes the energy from 100 Hz to 542.5 Hz in a given frame. This feature vector normalization approach deals with the signal degradation related to the distance between the sensed location and the FINDAS sensor. The output of this module is therefore a set of 100-dimensional normalized feature vectors. These new feature vectors $x_{fn}$ are the input to the pattern classification stage.
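A small sketch may help to see why dividing by the high-frequency energy compensates for distance. It assumes an idealized case in which attenuation scales the whole signal by one common gain, which is a simplification given the non-linear sensor response discussed earlier. The toy vectors use 3 bands instead of 100, and all names and values are hypothetical.

```python
def normalize(x_f, e_high):
    """Divide each band energy by the energy above the used bandwidth."""
    if e_high <= 0.0:
        raise ValueError("E_high must be positive")
    return [value / e_high for value in x_f]


# Toy raw features and high-band energy at a location close to the sensor.
x_near, e_high_near = [4.0, 2.0, 1.0], 0.5

# Hypothetical amplitude gain at a farther location; energies scale with gain squared.
gain = 0.25
x_far = [value * gain ** 2 for value in x_near]
e_high_far = e_high_near * gain ** 2

print(normalize(x_near, e_high_near))
print(normalize(x_far, e_high_far))
```

Both calls return the same normalized vector, because the common gain cancels in the ratio.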
The use of GMMs for system modeling is suitable since these models need a reasonably limited amount of training data to obtain a good pattern recognition performance [82]. Moreover, GMMs provide a straightforward method with which a linear combination of Gaussian functions can effectively represent a large class of sample distributions (i.e., the so-called training and testing data).
Therefore, considering a certain GMM named $\lambda$, the corresponding probability that a certain data vector (or simply data) $x_{fn}$ belongs to the class represented by that model $\lambda$ can be obtained, which will be denoted as $p(x_{fn}|\lambda)$.
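The likelihood of a feature vector under a GMM can be sketched as below. Diagonal covariance matrices are an assumption made here for brevity (the text does not state the covariance type), and the computation is done in the log domain to avoid numerical underflow with high-dimensional vectors.

```python
import math


def log_gaussian_diag(x, mean, var):
    """Log density of a Gaussian with diagonal covariance (assumed here)."""
    return sum(-0.5 * (math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v)
               for xi, m, v in zip(x, mean, var))


def gmm_log_likelihood(x, weights, means, variances):
    """log p(x | lambda) for a mixture given by weights, means, and variances."""
    terms = [math.log(w) + log_gaussian_diag(x, m, v)
             for w, m, v in zip(weights, means, variances)]
    peak = max(terms)  # log-sum-exp trick for numerical stability
    return peak + math.log(sum(math.exp(t - peak) for t in terms))


# Toy two-component model over 2-dimensional vectors (the system uses 100 dimensions).
weights = [0.6, 0.4]
means = [[0.0, 0.0], [3.0, 3.0]]
variances = [[1.0, 1.0], [0.5, 0.5]]
print(gmm_log_likelihood([0.2, -0.1], weights, means, variances))
```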

Multi-Position Selection
From all the 400 positions recorded by FINDAS (separated by 1 m each) during signal acquisition around the given sensed location for each timestamp and a certain activity, an intelligent procedure to select the positions that will be used for model training (i.e., multi-position selection) is needed, given that the activity carried out on the top of the fiber will impact a wide fiber-optic segment nearby the actual activity location.
In our previous work [26] we simply selected the position with the highest energy, as this was supposed to correspond to the position with the highest sensitivity, thus avoiding fading points. The energy computation was carried out with the same parameter configuration as that used for feature extraction.
In this work, we present an N-highest energy position approach. Specifically, the N positions with the highest energies, computed with the feature extraction parameters, are chosen for GMM training. The objective here is to augment the amount of training data while keeping a good expected sensitivity.
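The N-highest energy selection can be sketched in a few lines. The function names and the toy data are hypothetical; the energies stand for the per-position energies of one frame, and the vectors for the corresponding normalized feature vectors.

```python
def top_n_positions(energies, n):
    """Indices of the n positions with the highest energy, in fiber order."""
    ranked = sorted(range(len(energies)), key=lambda i: energies[i], reverse=True)
    return sorted(ranked[:n])


def pool_training_vectors(energies, vectors, n):
    """Feature vectors of the n highest-energy positions, used as training data."""
    return [vectors[i] for i in top_n_positions(energies, n)]


# Toy example with 5 positions instead of 400; position 3 behaves like a fading point.
energies = [0.1, 5.0, 3.0, 0.0, 4.0]
vectors = [[0.1], [0.5], [0.3], [0.0], [0.4]]
print(top_n_positions(energies, 1))   # single-position baseline
print(pool_training_vectors(energies, vectors, 2))
```

With N = 1 the selection reduces to the single-position baseline, and low-energy positions such as fading points are never picked while N stays small.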

Training
From a subset of the acoustic traces recorded by FINDAS for a certain class $k$, which comprises the training subset, the training stage consists of model construction by estimating the parameters of each GMM $\lambda_k$. The Expectation-Maximization algorithm [83] is employed to estimate the model parameters for the GMM under a maximum likelihood criterion. It must be noted that this model training is needed just once, so that all the classification processes employ the same set of trained GMMs. In this work, several positions have been used to carry out the GMM training, with the criteria described in the previous subsection. To do so, a GMM for a certain class is trained with a variable number of highest-energy sensed positions. To cover a reasonably wide range of possibilities, we run experiments for N = {1, 2, 4, 7, 11}. It must be noted that the highest-energy sensed positions are computed on a frame-by-frame basis, so that the highest-energy sensed position/s is/are computed every 1 s and the corresponding normalized feature vectors are used for GMM training.
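To make the Expectation-Maximization step concrete, here is a minimal sketch for a one-dimensional GMM. It is a toy illustration under stated assumptions, not the training code of the system: the real feature vectors are 100-dimensional, and the initialization, number of components, iteration count, and variance floor used here are arbitrary choices.

```python
import math


def gauss_pdf(x, mean, var):
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def em_gmm_1d(data, means, variances, weights, n_iter=50, var_floor=1e-6):
    """Maximum likelihood estimation of a 1-D GMM with the EM algorithm."""
    means, variances, weights = list(means), list(variances), list(weights)
    n_comp = len(means)
    for _ in range(n_iter):
        # E-step: responsibility of each component for each data point.
        resp = []
        for x in data:
            joint = [weights[k] * gauss_pdf(x, means[k], variances[k])
                     for k in range(n_comp)]
            total = sum(joint)
            resp.append([j / total for j in joint])
        # M-step: re-estimate weights, means, and variances.
        for k in range(n_comp):
            n_k = sum(r[k] for r in resp)
            means[k] = sum(r[k] * x for r, x in zip(resp, data)) / n_k
            spread = sum(r[k] * (x - means[k]) ** 2 for r, x in zip(resp, data)) / n_k
            variances[k] = max(var_floor, spread)
            weights[k] = n_k / len(data)
    return means, variances, weights


# Toy data pooled from several positions: two well-separated clusters.
data = [-0.5, 0.0, 0.5, 9.5, 10.0, 10.5]
fitted = em_gmm_1d(data, means=[1.0, 9.0], variances=[1.0, 1.0], weights=[0.5, 0.5])
print(fitted)
```

Pooling the vectors of the N highest-energy positions simply enlarges `data`, which is where the multi-position approach enters the training.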

Classification
With the trained models, classification is carried out on a fully independent data subset, which is called the testing subset, to find the class $\hat{c}$ represented by the model with the maximum posterior probability. This is carried out for a given input feature vector $x_{fn}$ from Bayes' rule as follows:
$$\hat{c} = \arg\max_{k} \, p(\lambda_k | x_{fn}) = \arg\max_{k} \, \frac{p(x_{fn} | \lambda_k) \, p(\lambda_k)}{p(x_{fn})} = \arg\max_{k} \, p(x_{fn} | \lambda_k),$$
where a uniform prior probability for each class is assumed (i.e., $p(\lambda_k) = 1/C$, where $C$ is the number of classes; in our case $C = 8$ in the machine + activity identification mode and $C = 2$ in the threat detection mode). Given that we use a supervised strategy for system training, our proposal is able to detect the activities for which we can generate a trained model. Since the same data are used for the machine + activity identification and threat detection modes, all the possible threats produced by the machine + activity pairs in the database can be detected. However, as shown in [84], which is also based on a GMM system, our proposal is expected to work reasonably well even for detecting threats from activities not included in the database, due to the generalization capability of the threat model. Classification is carried out on a frame basis from the highest-energy position frame. Therefore, a classification decision is obtained for every 1 s frame within every 20 s length acoustic signal. Given the configuration parameters in the feature extraction, there are 415 decisions for each 20 s length acoustic signal in total.
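With a uniform prior, the maximum posterior decision reduces to picking the class whose model gives the highest likelihood. The sketch below assumes the per-class log-likelihoods have already been computed; the class labels and score values are invented for illustration.

```python
def classify(log_likelihoods):
    """Return the class whose model gives the highest log p(x | lambda_k).

    With a uniform prior p(lambda_k) = 1/C, this is also the class with
    the maximum posterior probability.
    """
    return max(log_likelihoods, key=log_likelihoods.get)


def classify_frames(frame_scores):
    """One decision per frame, as in the frame-based classification."""
    return [classify(scores) for scores in frame_scores]


# Hypothetical log-likelihoods for three consecutive frames in threat detection mode.
frame_scores = [
    {"threat": -10.2, "non-threat": -12.7},
    {"threat": -11.9, "non-threat": -11.1},
    {"threat": -9.8, "non-threat": -13.0},
]
print(classify_frames(frame_scores))
```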

Database Description
For a fair comparison, the database used in the experiments is the one used in our previous work [26].
The database (i.e., the acoustic traces) was recorded over a real gas transmission pipeline managed by Fluxys Belgium S.A. (a GERG (The European Gas Research Group) partner). This means that our system is tested in a real environment. The FINDAS sensor was located at one end of the fiber which runs in parallel to the pipeline. Six different locations (named from LOC1 through LOC6) were used for acoustic trace recording. These locations aim to cover different pipeline "reference positions" selected at high distances from the FINDAS sensor (see Table 1) to produce feature variability in terms of ground conditions (e.g., concrete, grass, etc.) and evaluate the system in sensing limit conditions. The recordings were also carried out in different days, times, and weather conditions (e.g., sunny day, rainy day, etc.) to produce feature variability in terms of weather conditions (see Table 1). For data recording, before each machine starts its activity in each location, a reference position was manually selected in each location by measuring the fiber optic meter with good sensitivity when a certain machine is carrying out a certain activity (in our case, the plate compactor while compacting the ground was chosen). Once the reference position was chosen, 400 m were recorded (200 m at the left and right sides of that reference position) for each machine + activity pair in each location. Figure 3 shows an example of the furthest LOC6 location recording conditions, and Table 2 presents the machine + activity pairs recorded in each location. Time duration and the threat/non-threat labeling used in the threat detection mode of the system are also presented for each machine + activity pair and location. The decision on the activities to be considered as threats or non-threats (i.e., threat/non-threat labelling) was taken by the GERG project partners, depending on whether they could actually lead to pipeline damage. 
It must be noted that the time duration refers to the time each machine is carrying out the corresponding activity. However, there are 400 fiber-optic positions (i.e., 400 m) being simultaneously recorded. The machines employed for database recording, along with an aerial view of a machine working on the ground, are presented in Figure 4. RP stands for "reference point", referring to the center of the machine operation area.

Evaluation Metrics
The main metric employed for testing the two system modes (i.e., machine + activity identification and threat detection) is the classification accuracy. For the machine + activity identification mode, the accuracy represents the ratio between the number of correctly classified testing frames corresponding to the machine + activity pairs and the total number of testing frames for the machine + activity pairs. A hit (i.e., the machine + activity pair is correctly detected) is produced in the system when both the machine and the activity recognized by the system coincide with those of the ground truth and are within the machine + activity ground-truth time span. For the threat detection mode, a threat is correctly detected in case the system outputs the occurrence of a threat (i.e., the DAS + PRS strategy classifies the testing frame as a threat) within the threat ground-truth time span.
For the machine + activity identification mode, the confusion matrix will also be presented for additional result analyses. This confusion matrix presents the percentage of testing frames for a given class that have been classified as any of the other classes.
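A row-normalized confusion matrix of this kind can be computed as in the following sketch. The function name, class labels, and toy decisions are hypothetical.

```python
def confusion_matrix_percent(true_labels, predicted_labels, classes):
    """Percentage of frames of each true class assigned to each predicted class."""
    counts = {t: {p: 0 for p in classes} for t in classes}
    for t, p in zip(true_labels, predicted_labels):
        counts[t][p] += 1
    matrix = {}
    for t in classes:
        total = sum(counts[t].values())
        matrix[t] = {p: (100.0 * counts[t][p] / total if total else 0.0)
                     for p in classes}
    return matrix


# Toy frame-level decisions for two classes.
true_labels = ["hitting", "hitting", "scrapping", "scrapping"]
predicted = ["hitting", "scrapping", "scrapping", "scrapping"]
for row, cells in confusion_matrix_percent(
        true_labels, predicted, ["hitting", "scrapping"]).items():
    print(row, cells)
```

Each row sums to 100%, so the diagonal entry of a row is the per-class accuracy.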
For the threat detection mode, we also present the Threat Detection Rate (TDR) and the False Alarm Rate (FAR). The TDR represents the percentage of threat testing frames that are classified as a threat, and the FAR corresponds to the percentage of non-threat testing frames that are classified as threats. For the TDR, the higher the better, whereas for the FAR, the lower the better.
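The three threat detection metrics follow directly from the frame-level decisions, as in this sketch. Labels are booleans where True means threat; the function names and the toy data are illustrative only.

```python
def _percent(numerator, denominator):
    return 100.0 * numerator / denominator if denominator else 0.0


def threat_metrics(is_threat, predicted_threat):
    """Accuracy, TDR, and FAR (all in %) from frame-level boolean labels."""
    pairs = list(zip(is_threat, predicted_threat))
    threat_preds = [p for t, p in pairs if t]
    non_threat_preds = [p for t, p in pairs if not t]
    return {
        # Threat frames classified as threats.
        "TDR": _percent(sum(threat_preds), len(threat_preds)),
        # Non-threat frames classified as threats.
        "FAR": _percent(sum(non_threat_preds), len(non_threat_preds)),
        "accuracy": _percent(sum(t == p for t, p in pairs), len(pairs)),
    }


truth = [True, True, False, False, False]
decisions = [True, False, True, False, False]
print(threat_metrics(truth, decisions))
```

Because the accuracy pools both classes, it is dominated by whichever class has more frames, which is worth keeping in mind when the classes are unbalanced.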

Experiments and Results
The recording location plays an important role in system performance, as shown in our previous work [26]. For this reason, to meet the specifications regarding database recording and experiments given in [32], and for a proper comparison with our previous work [26], the experiments were carried out with a leave-one-location-out cross-validation (CV) with six folds, where each fold contains the acoustic traces recorded in a single location. In each fold, five locations were used for GMM training, whereas the remaining location was used for testing. The final testing results are computed as the average of the individual testing results for each fold.
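The fold structure of this cross-validation can be sketched as below. The location names follow the text; the function names are hypothetical, and the training and testing steps are left out.

```python
LOCATIONS = ["LOC1", "LOC2", "LOC3", "LOC4", "LOC5", "LOC6"]


def leave_one_location_out(locations):
    """Yield (training locations, testing location) for each fold."""
    for test_location in locations:
        training = [loc for loc in locations if loc != test_location]
        yield training, test_location


def average_over_folds(fold_results):
    """Final result as the plain average of the per-fold results."""
    return sum(fold_results) / len(fold_results)


for training, testing in leave_one_location_out(LOCATIONS):
    print(f"train on {training}, test on {testing}")
```

Keeping all traces of one location in the same fold ensures that the tested location is never seen during training.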
For the machine + activity identification mode, results are presented in Table 3. They show that increasing the number of positions for GMM training leads to better overall performance. However, if we examine the individual performance per class, the gain is not so clear. It can be seen that for activities with a single mechanical behavior (that will be referred to as flat activities) and high-energy machine + activity pairs (i.e., big excavator + moving, big excavator + hitting, and small excavator + moving pairs), using more data for GMM training leads to more robust models so that the class identification performance is improved. This is due to the fact that flat activities are easier to detect, since they involve a single behavior that remains stable in the fiber across several positions. On the other hand, high-energy activities benefit more from using several positions for model training, since the spread of the energy across the fiber indicates the presence of a certain activity. These classes, which represent 62% of all data, contribute to the best overall performance. However, for activities that include more than one behavior and whose energy is lower (i.e., scrapping, which can include hitting, and compacting, whose energy is lower than that of the moving and hitting activities), increasing the number of positions typically reduces the system performance. The only exceptions are the small excavator + scrapping pair, which could be due to a more consistent behavior than its big excavator counterpart, and the pneumatic hammer + compacting pair, whose energy may extend over fewer meters to indicate the given activity.
Table 3. Results for the machine + activity identification mode by varying the number of positions used in the GMM training, with the best result for each class and the best overall accuracy in bold font. Confidence intervals are shown for a 95% confidence. Green cells show values within the confidence interval for the best performance value (i.e., statistically non-significant differences). The values shown between brackets denote the number of frames that belong to the corresponding class.
Regarding the statistical significance of the results, Table 3 also includes statistical confidence values for a 95% confidence interval, and to ease the statistical significance analysis, we have added a green background color to the cells around the one with the best results with non-significant differences. It can be seen that even though the results are not statistically significant for all the classes when comparing them (the best performance obtained using N = 11 positions is statistically significant with respect to the other cases in 3 out of the 8 classes), the average accuracy when using N = 11 positions is statistically significant compared with all the other cases. Moreover, the use of more than N = 1 position for system training is shown to be significant for 4 out of the 8 classes when compared with the baseline (i.e., N = 1 position).

Machine + Activity Identification
For the threat detection mode, results are presented in Table 4. They show that increasing the number of positions also leads to better overall accuracy. However, this gain comes from the improved FAR combined with the fact that non-threats outnumber threats in the database (77.5% vs. 22.5%). Looking at the TDR, increasing the number of positions for GMM training actually reduces the threat detection capability. The threat model is built from all the threat activities, which include many different behaviors (hitting and scrapping). This results in a more blurred model, so that using more positions for training may generate a less robust model for real threat detection. This is consistent with the fact that only a few machine + activity pairs obtain statistically significant improvements when several positions are used to model them, as shown in Table 3.
Table 4. Results for the threat detection mode when varying the number of positions used in the GMM training, with the best result for each metric in bold font. 'TDR' stands for threat detection rate and 'FAR' for false alarm rate. Confidence intervals are given at the 95% confidence level. Green cells show values within the confidence interval of the best performance value (i.e., statistically non-significant differences). The values shown between brackets denote the number of frames that belong to the corresponding class, where the number below FAR represents the number of non-threats in the database.
Table 4 also includes the results of the statistical significance analysis at the 95% confidence level and employs the same background-color convention as Table 3. In this case, the differences when using N = 11 positions are statistically significant only for the FAR metric but, as in the machine + activity identification mode, the difference in average accuracy is statistically significant.
For the machine + activity identification mode, the confusion matrices for N = 1 and N = 11 positions in the GMM training, which are the configurations that obtain, in general, the best class performance, are presented in Tables 5 and 6, respectively. To avoid clutter, the confusion matrix values below chance (i.e., 1/8 = 12.5%) have been omitted, and cell background colors have been added for better visualization and analysis. Both matrices show similar trends. In general, the diagonal contains the highest value for each class, except for the hitting activity, which is confused with scrapping and moving, and the scrapping activity, which is confused with hitting. Scrapping includes hitting, and the hitting class contains the lowest amount of data (and therefore of training data), which results in a less robust GMM even when more positions are used for training.
Table 5. Confusion matrix of the multi-position approach for detecting threats to the pipeline integrity for N = 1 position-based GMM training for the machine + activity identification mode (which represents the single-position approach chosen as baseline [26]). Each cell contains the classification accuracy. The values shown between brackets denote the number of frames that belong to the real class.
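The presentation used for the confusion matrices (row-wise percentages with below-chance cells left blank) can be sketched as follows. This is a minimal illustration with hypothetical labels and predictions, not the evaluation code behind Tables 5 and 6; the helper names are assumptions.

```python
def confusion_percentages(true_labels, predicted_labels, classes):
    """Row-normalized confusion matrix in percent; rows are the real classes."""
    index = {name: i for i, name in enumerate(classes)}
    counts = [[0] * len(classes) for _ in classes]
    for real, predicted in zip(true_labels, predicted_labels):
        counts[index[real]][index[predicted]] += 1
    matrix = []
    for row in counts:
        total = sum(row)
        matrix.append([100.0 * c / total if total else 0.0 for c in row])
    return matrix


def hide_below_chance(matrix):
    """Replace cells below chance level (100 / number of classes) with None."""
    chance = 100.0 / len(matrix)
    return [[cell if cell >= chance else None for cell in row] for row in matrix]


# Toy data with 3 hypothetical classes (chance = 33.3%); the paper uses 8 (chance = 12.5%).
classes = ["moving", "hitting", "scrapping"]
real = ["moving"] * 4 + ["hitting"] * 4 + ["scrapping"] * 4
pred = ["moving", "moving", "moving", "hitting",
        "hitting", "scrapping", "scrapping", "moving",
        "scrapping", "scrapping", "scrapping", "hitting"]
masked = hide_below_chance(confusion_percentages(real, pred, classes))
for name, row in zip(classes, masked):
    print(name, ["" if cell is None else f"{cell:.1f}" for cell in row])
```

In this toy data the hitting row has its largest value off the diagonal, under scrapping, which mirrors the kind of confusion described above.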

Computational Time Analysis
Since this system aims to monitor possible threats to pipeline integrity online, its response time is a critical issue. To show that the system is able to work in real time, a computational time analysis has been carried out for each of the system modules (feature extraction + normalization and pattern classification). The response time of the feature extraction and normalization stage does not depend on the number of models. In contrast, the response time of the pattern classification stage varies with the number of system models, since the recorded acoustic trace has to be compared with all of them. Experiments were carried out on an Intel Core 2 Quad Q9550 2.83 GHz processor (Intel, Santa Clara, CA, USA) with 4 GB of RAM. The response time of each of the system modules is shown in Table 7. The system works in real time, since the response time is less than one second for both system modes. Each new model added to the system increases the response time of the pattern classification stage by 10 ms. Considering the total response time, after the FINDAS sensor records one suspicious 20 s long acoustic trace, the machine + activity identification mode takes 0.22 s to make a decision, and the threat detection mode takes 0.16 s to make a threat/non-threat decision.
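The reported figures are consistent with a simple linear cost model, sketched below. The model counts (8 for machine + activity identification, 2 for threat detection) and the inferred fixed cost are assumptions derived from the numbers quoted in the text; the measured per-module times are those of Table 7.

```python
PER_MODEL_SECONDS = 0.010  # reported: each added model costs about 10 ms


def decision_time(n_models, fixed_seconds):
    """Total response time: fixed front-end cost plus a per-model classification cost."""
    return fixed_seconds + PER_MODEL_SECONDS * n_models


# Assumed model counts: 8 machine + activity classes, 2 threat/non-threat models.
N_ID_MODELS, N_THREAT_MODELS = 8, 2

# Fixed cost inferred from the reported 0.22 s total of the identification mode.
fixed = 0.22 - PER_MODEL_SECONDS * N_ID_MODELS

print(f"inferred fixed cost: {fixed:.2f} s")
print(f"identification mode: {decision_time(N_ID_MODELS, fixed):.2f} s")
print(f"threat detection mode: {decision_time(N_THREAT_MODELS, fixed):.2f} s")
```

Under these assumptions the six-model difference between the two modes accounts for the 0.06 s gap between the reported decision times, and both remain far below the 20 s trace length.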

Comparison with Other Works
Before describing the comparison results, it is worth mentioning that our purpose in this work is to show that increasing the training data with the multi-position approach improves the system performance with respect to the use of a single position for model training. Therefore, we selected the system in [26] as the baseline, which significantly improved on our first proposal in [30]. We based the current proposal on [26] in order to assess the actual robustness of the improved training strategy with a relatively simple pattern recognition strategy. Following the selected baseline, we proposed several alternative systems [27][28][29], mainly based on more powerful feature extraction and pattern recognition approaches. We therefore also carried out a comparison with these works, which employ the same database.
The comparison results for the machine + activity identification mode are presented in Table 8. The results obtained are worse than those reported in [27,29], but our proposal exhibits a better average accuracy than [28]. This improvement with respect to [28] shows that the use of multiple positions for system training (even with less robust GMM-based acoustic models) is more beneficial than the use of the more robust HMM-based acoustic models in a single-position approach.
Table 8. Machine + activity identification mode result comparison with previous works [26][27][28][29] with the best result in bold font. Green cells show values within the confidence interval of the best performance value (i.e., statistically non-significant differences).
The system comparison for the threat detection mode is shown in Table 9. The multi-position approach obtains the best overall accuracy and the lowest false alarm rate. This confirms our conjecture that the approach presented in this work, which increases the amount of data for model training through the multi-position selection approach, is able to improve the system performance even though a more powerful set of features and classifiers is employed in the single-position approach (used in all the other systems [27][28][29]), at the cost of losing some threat detection capability. Moreover, the multi-position selection approach improves both the TDR and the FAR (and hence the overall accuracy) with respect to [30]. This is due to the effectiveness of the multi-position selection approach and the feature vector normalization stage.
Table 9. Threat detection mode result comparison with previous works [26][27][28][29][30] with the best result in bold font. 'TDR' stands for threat detection rate and 'FAR' for false alarm rate. Green cells show values within the confidence interval of the best performance value (i.e., statistically non-significant differences).

Discussion
Although the results have shown that the multi-position approach presented in this work generally improves the classification accuracy, there are some limitations that should be mentioned:
• For the machine + activity identification mode, not all the machine + activity pairs benefit from the use of multiple positions for GMM training. Results suggest that only the high-energy and flat activities are effectively addressed with them. However, since suspicious activities for pipeline integrity are typically generated by high-energy events (i.e., sudden impacts to the pipeline or heavy machinery), the approach presented in this work is valid for addressing them. Concerning the activities that involve more than one behavior, more research is needed to detect them effectively with the multi-position approach.
• For the threat detection mode, the use of multiple positions for GMM training does increase the accuracy. However, this comes at the cost of missing real threats. Since the FAR is reduced to a greater extent than the TDR (8% FAR vs. 6% TDR), we consider this system mode a valid approach for saving unnecessary work for the system operator while still detecting three out of four real threats.
• As with any supervised machine learning classification system (such as the GMM approach presented in this work and all the other approaches presented in the literature for DAS + PRS), only the activities for which a model has been previously trained will be accurately detected. Although the threat detection mode is able to detect the threats produced by the activities carried out by the machines presented in Table 2, it was shown in [84] that the supervised strategy based on the GMM approach also works reasonably well for detecting threats produced by activities that have not been seen in the training stage, owing to the generalization capability of the threat model.
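The trade-off in the second limitation can be checked with a short calculation. The sketch below assumes that the overall accuracy is computed over all frames, so that it is the mix of TDR and (1 - FAR) weighted by the class proportions reported for the database (22.5% threats, 77.5% non-threats). The function names are illustrative, and the inputs are the rounded changes quoted above rather than the exact values of Table 4.

```python
P_THREAT, P_NON_THREAT = 0.225, 0.775  # class proportions reported for the database


def overall_accuracy(tdr, far):
    """Accuracy as the prior-weighted mix of detected threats and rejected non-threats."""
    return P_THREAT * tdr + P_NON_THREAT * (1.0 - far)


def accuracy_change(delta_tdr, delta_far):
    """Change in accuracy for given absolute changes in TDR and FAR."""
    return P_THREAT * delta_tdr - P_NON_THREAT * delta_far


# Rounded changes quoted in the text: TDR drops by 6 points, FAR drops by 8 points.
change = accuracy_change(delta_tdr=-0.06, delta_far=-0.08)
print(f"approximate accuracy change: {100 * change:+.2f} points")
```

Because non-threats dominate the database, an 8-point FAR reduction outweighs a 6-point TDR loss, which is why the overall accuracy rises even though fewer threats are detected. The estimate from rounded inputs is of the same order as the overall accuracy gain of about 4.5% absolute reported for this mode.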

Conclusions and Future Work
We have presented an augmented system with respect to our previous work in [26], which is able to continuously monitor a 45 km-long pipeline to identify possible threats. The novelty of this work lies in a procedure to select and use different, synchronously recorded fiber-optic positions for model training. The system has been tested in a real-field environment in two different modes: machine + activity identification, which aims to identify the machine and the activity, and threat detection, where only the occurrence of a threat to the pipeline integrity must be detected.
The results of a rigorous experimental procedure suggest that, for activities with consistent behavior and high energy, using multiple positions for model training leads to better performance, whereas for multiple-behavior and low-energy activities, using multiple positions reduces the system performance. Regarding threat detection, we have shown that the multi-position approach is able to both reduce the false alarm rate and increase the overall accuracy significantly.
As future work, we plan to extend this work with deep learning approaches, aiming at a more robust data augmentation strategy and at improved system performance. We also plan to apply the multi-position selection approach presented in this work to our additional systems [27][28][29]. Finally, we will extend our proposal to the biomedical field, for biomedical signal processing.