Changes in the Eruptive Style of Stromboli Volcano before the 2019 Paroxysmal Phase Discovered through SOM Clustering of Seismo-Acoustic Features Compared with Camera Images and GBInSAR Data

: Two paroxysmal explosions occurred at Stromboli on July 3 and August 28, 2019, the first of which caused the death of a young tourist. After the first paroxysm an effusive activity began from the summit vents and affected the NW flank of the island for the entire period between the two paroxysms. We carried out an unsupervised analysis of seismic and infrasonic data of Strombolian explosions over 10 months (15 November 2018–15 September 2019) using a Self-Organizing Map (SOM) neural network to recognize changes in the eruptive patterns of Stromboli that preceded the paroxysms. We used a dataset of 14,289 events. The SOM analysis identified three main clusters that showed different occurrences with time indicating a clear change in Stromboli’s eruptive style before the paroxysm of 3 July 2019. We compared the main clusters with the recordings of the fixed monitoring cameras and with the Ground-Based Interferometric Synthetic Aperture Radar measurements, and found that the clusters are associated with different types of Strombolian explosions and different deformation patterns of the summit area. Our findings provide new insights into Strombolian eruptive mechanisms and new perspectives to improve the monitoring of Stromboli and other open conduit volcanoes. Paroxysmal SOM Clustering of Seismo-Acoustic Features Compared GBInSAR Author Contributions: Conceptualization, F.G., S.C., A.M.E., L.S., L.D., M.M. and G.M.; methodology, F.G., A.M.E., L.L., G.M., F.D.T., A.T., T.N., N.C. and S.C.; formal analysis, F.G., A.M.E., L.L., S.C., T.N., N.C. and G.M.; data curation, W.D.C., M.O., R.P., G.S., F.D.T., T.N., N.C., G.G., T.C. and S.C.; writing—original draft preparation, F.G., A.M.E., S.C., F.D.T. and G.M.; writing—review and editing, F.G., A.M.E., L.L., S.C., F.D.T., T.N., N.C., L.D., A.T., M.M. and G.M.


Introduction
Artificial Neural Networks (ANNs) are applied in a wide range of fields to approach classification, pattern recognition, clustering, regression analysis, and time series prediction problems. In recent years, ANNs have been successfully applied in the field of seismology and volcanology to solve geophysical signal automatic classification and clustering problems [1][2][3][4] and to perform predictive analyses [5,6]. In the field of seismology, many studies demonstrated that ANNs are powerful tools to improve the performances and the robustness of the automatic systems for seismological analyses that allow gaining critical information for people's safety in near real time [7][8][9]. Many applications have also been developed to automatically classify the seismicity of Stromboli [10-13] and other volcanoes [1,14], obtaining performances up to 100% correct classification [7]. Here, we focus on studying the eruptive style of the Stromboli volcano (Italy) before and during the 2019 eruptive crisis [4,[15][16][17] through the neural analysis of seismic and infrasonic signals produced by the explosive activity.
Stromboli is a volcanic island in the Mediterranean Sea characterized by a persistent explosive activity that produces hundreds of moderate explosions per day. Three main vent regions are located in the upper part of the volcano: the North East (NE), the Central (C), and the South West (SW) vent regions ( Figure 1) [18][19][20][21][22]. In recent decades, geophysical and volcanological studies have indicated that the ordinary explosive activity of Stromboli shows a variety of eruptive mechanisms and products, whose signature is distinguishable in the geophysical signals generated by the explosions (e.g., seismic and infrasonic signals). In early studies, an association between the eruptive vent (NE or SW) and waveform of the VLP events produced by the explosions was observed [24] and led for the first time to the definition of two categories of explosions: Type 1, from the NE vent region, and Type 2, from the SW vent region. Subsequently, significant exceptions to the vent-waveform association were highlighted through a previous application of an unsupervised neural network clustering, compared with thermal camera measurements [11]. Two main types of explosions were also recognized in the ordinary activity of Stromboli through thermal camera measurements by Patrick et al. [25]: Type 1, dominated by coarse ballistic particles, and Type 2, characterized by an ash-rich plume, with (Type 2a) or without (Type 2b) ballistic particles. Leduc et al. [26] added a gas-dominated type (Type 0) to those mentioned above. More recently, Simons et al. [27], studying the explosions of the Yasur volcano (Vanuatu), defined a further type (Type 3) of explosion characterized by tephra jetting through a breccia/ash-occluded vent, followed by prolonged ash emission, belonging to the spectrum of Strombolian explosions. The rate, size, and relative occurrence of the different types of explosions characterize the eruptive style of the ordinary Strombolian activity.
Changes in the ordinary Strombolian activity are generally associated with imminent dangerous phenomena such as lava flows, major explosions, or paroxysms (e.g., [4,28,29]). Typically, an increase in Strombolian activity, in terms of the number of explosions per hour and amplitude of seismic signals associated with volcanic tremor and explosions, preceded the lava flow output [4]. They are generally caused by overflows [30] that originate from the eruptive vent regions, or by the opening of new eruptive fissures along the Sciara del Fuoco slope [31][32][33]. Fissure eruptions are also preceded by an intensification of landslides [8,34,35]. Major explosions are sporadic explosive events traditionally defined as explosions larger than the persistent activity, able to injure people visiting the top of the volcano. Recently, Calvari et al. [17] proposed a method based on the "VLP size" parameter of the seismic signal [4] and on the muzzle velocity by the duration of the explosive event to estimate the variable magnitude and intensity of Strombolian explosions and therefore to separate the field of the major explosions from those of paroxysms and ordinary explosions. Although to date major explosions are unpredictable, these events are most likely to occur when variations of the eruptive style happen [15,36,37]. Paroxysms are explosive eruptions that form eruptive columns of some kilometers, eject metric-sized ballistic blocks, and can generate modest pyroclastic flows [16,32,[38][39][40].
The eruptive crisis of 2019 produced two paroxysms that occurred on July 3 and August 28, which formed eruptive columns of about 5 km and were accompanied by pyroclastic flows that traveled more than one kilometer on the sea surface. The first of these two paroxysms caused the death of a young tourist and the injury of some people, in addition to triggering extensive fires that have involved the vegetated areas of the island [23]. The effusive eruption, which began with the paroxysm on 3 July 2019, and lasted about two months, also emplaced a lava flow field on the Sciara del Fuoco slope [41,42]. It is worth pointing out that the seismic parameters that were routinely monitored, such as the seismic amplitude and the hourly occurrence of VLP events (0.05-0.5 Hz), did not show significant changes before the 2019 paroxysms (Figure 2a,b). On the contrary, the "VLP size" associated with the explosive activity and other parameters, such as the peak-to-peak amplitude of VLP events and polarization of the raw seismic signal, showed significant variations about one month before that paroxysm [4]. These medium-term seismic precursors of the paroxysmal activity (Figure 2c-e) have been interpreted as variations in the eruptive style linked to the magma-conduit interaction and to the degassing of the volcanic system, which control the Strombolian explosive mechanism. Monthly histograms from November 2018 to September 2019 of the raw seismic signal amplitude recorded by the STRA station (a); hourly occurrence of VLPs (b); polarization azimuth of the raw signal of the STRA station (c); peak-to-peak amplitude of VLP events (0.05-0.5 Hz) recorded by the STRA station, horizontal component (d); VLP size measured by the STRA station, horizontal component (e). The dark gray bars are relevant for the period before the paroxysm of 3 July 2019, the light gray bars are relevant for the period following the aforementioned paroxysm.
In addition to the geophysical studies conducted on volcanoes, analogue experiments also provided useful information to discriminate the factors that affect the degassing and eruptive style of an open conduit volcano such as Stromboli. Spina et al. [43] investigated the role of conduit surface irregularity and physical properties (e.g., viscosity and gas flux) of an analogue basaltic magma using an experimental setup [44] and produced seismoacoustic measurements. Giudicepietro et al. [45] designed an unsupervised neural network based on a Self-Organizing Map (SOM) that was able to consistently group the artificial seismo-acoustic events generated in similar experimental conditions (conduit roughness, analogue magma viscosity, and gas flux) thanks to an appropriate strategy for extracting the seismo-acoustic features.
The aim of this work is to study the types of explosions recognizable in the persistent activity of Stromboli through unsupervised neural networks applied to seismic and infrasonic signals, which contain the fingerprints of the explosive mechanism. In particular, our target is the period that preceded and included the paroxysmal phases of 2019. For this purpose, we applied a SOM clustering of the seismic and infrasonic features to group events generated in similar conditions and we compared the result of clustering with the images recorded by the Istituto Nazionale di Geofisica e Vulcanologia (INGV) thermal and visible cameras and with the ground displacement measurements obtained by means of Ground-Based Interferometric Synthetic Aperture Radar (GBInSAR) devices, in order to gain insights into the explosive mechanisms and the pre-eruptive dynamics of the paroxysmal activity.

Seismic and Infrasonic Data
We used the data of the STRA seismic-acoustic station ( Figure 1) belonging to the Istituto Nazionale di Geofisica e Vulcanologia (INGV) monitoring network [46][47][48]. The seismic station is equipped with a CMG40T Guralp broadband sensor. The infrasonic sensor is a Chaparral Model 25. The signals of both sensors are acquired by a 24-bit GAIA2 digital recorder [49], optimized for low power consumption, a critical requirement for data acquisition in inaccessible areas. The sampling rate for seismic and infrasonic signals is 50 samples per second. The data are continuously transmitted to the Osservatorio Vesuviano, Osservatorio Etneo and Osservatorio Nazionale Terremoti of INGV, Italy. Figure 3 shows examples of seismic-acoustic recordings of explosive events linked to the persistent explosive activity of Stromboli. The filtered signals (red line) in frequency bands used in this work for the seismic-acoustic feature extraction are superimposed on the raw signals (gray line).  Figure 4 shows the frequency content of the seismic and infrasonic signals of the explosive event considered in Figure 3.
To describe the temporal evolution of the eruptive style of Stromboli before and during the 3 July-30 August 2019 eruptive crisis, we considered a wider time interval that extends from 15 November 2018 to 15 September 2019, and selected 14,289 seismic recordings (each 30 s long) and 14,179 infrasonic recordings (each 30 s long), most of which are pairs of seismo-acoustic signals linked to the same explosive event. We chose the same events related to the "VLP size" time series of Giudicepietro et al. [4], which refers to the VLPs with maximum "size" for each half-hour in the target period. We adopted this criterion for the selection of the seismic-acoustic events because the VLP size was already recognized as a parameter that effectively highlighted variations in the period preceding the eruptive crisis of July August 2019 [4]. We used the seismic data of the east-west component of the STRA station, which is the component with the maximum amplitude of the signals, with the Stromboli seismic wavefield being mainly horizontally polarized (e.g., [4,37]). In addition to the seismic data, we also selected the corresponding infrasonic data recorded by a sensor located in the same site as the STRA seismic station. To analyze the temporal evolution of the eruptive style of Stromboli, we developed a novel preprocessing strategy suitable to extract the seismic-acoustic features representative of the fingerprints of the explosive mechanisms and to overcome the problem of the data window cutting, which cannot be based on a precise picking of seismic phases that are not recognizable in the signals of our interest (VLP events). Then, we applied the SOM algorithm to cluster the seismic-acoustic feature datasets.

Seismic-Acoustic Feature Extraction Methods
An efficient feature extraction method for seismic data typically takes into account the spectral content and the waveform of the events (e.g., [7,8,45]). Actually, these are the characteristics that the analysts visually examine to classify seismograms, for example, to distinguish a local earthquake from a regional one or a teleseism. Often, spectrograms expressed in compressed form and waveform functions calculated on sliding windows [1] are used to analyze events with impulsive onset. However, in this case, the signals of our interest are the VLPs (0.05-0.5 Hz) in which an impulsive onset cannot be recognized. Thus, we designed novel methods for seismic and infrasonic feature extraction, that are independent of the picking of a transient signal onset. The method for cutting data, which allows us to extract event records from the continuous signal, relies on an algorithm designed to detect the VLP event with the largest "size" in half-hour windows [4]. This algorithm allows cutting a 30 s signal interval from each half-hour window, but it does not return signal intervals starting precisely at the picking of a signal onset. For this reason, to encode the VLP event waveforms, we sorted the amplitude features in ascending order and, to encode the seismic signal frequency content, we used the spectrum, and not the spectrogram ( Figure 5). In particular, by using the utilities of ObsPy Toolbox [50], we calculated the spectrum of every selected 30 s signal, then we applied a filter for smoothing the obtained spectrum, using the aforementioned ObsPy Toolbox, and, finally, we downsampled the smoothed spectrum by a factor of 1:3 ( Figure 5d). Moreover, we encoded the VLP event waveforms by filtering the signal in the 0.05-0.5 Hz band, resampling it by a factor 1:16, and sorting the values of the filtered and resampled seismogram in ascending order ( Figure 5c). We added the information of the raw seismic signal waveform using an encoding based on the peak-to-peak amplitude of a 25-sample (0.5 s) sliding window. This waveform parameterization is performed by applying the following equation: where Ai,M and Ai,m are the maximum and the minimum amplitudes in a 25-sample window and N is the total number of windows. Finally, we sorted the values of the raw seismic waveform parameterization vector in descending order (Figure 5b). Figure 5 shows an example of feature extraction for one of the 14,289 seismic events in the dataset. We also extracted the features of the infrasonic signal. To avoid the high-noise component present in the low frequencies of the infrasonic signal (Figure 3b), we highpass filtered the data (>0.5 Hz) and applied an encoding procedure similar to that used for the raw seismic signal waveform (Equation (1)). Then, we sorted the infrasonic feature vector in ascending order to make the encoding independent from the picking of the events.

SOM Method
Once the dataset of the extracted seismic-acoustic features was prepared, we analyzed it with a SOM-based method to highlight clusters of explosive events with common characteristics.
There are different clustering techniques for the analysis of complex datasets, which can be divided mainly into two types: linear ones, such as the Principal Component Analysis (PCA) or the Multidimensional Scaling (MDS); and non-linear ones, such as the Self-Organizing Map (SOM), the Curvilinear Component Analysis (CCA), or the Curvilinear Distance Analysis (CDA). First, it has been proved that the SOM algorithm discriminates better than CCA and PCA, providing a simplified two-dimensional representation of the data and preserving the distinctive information that allows them to be separated [51,52]. Furthermore, we chose to use the SOM algorithm based on the good results obtained with SOM to analyze experimental data, proving its ability to group experimental seismo-acoustic events [45].
The SOM neural networks were designed by Tuevo Kohonen in 1982 [53] and inspired by brain cortex topology. In particular, he took into account the connections between neurons and how a neuron can affect its neighbors; neurons that are close to the active ones strengthen the connections, while those that are further away weaken them. The SOM network uses an unsupervised and competitive learning algorithm; this means that the process is entirely data-driven and the neurons (or nodes) compete with each other to respond to a subset of the input data. Competitive learning increases the specialization of each node in the network. The goal is to discover some hidden structures of the data so that they can be clustered.
The SOM method is used in several applications, in particular in data examination and visualization. As a clustering method, it allows the reduction of a large amount of data by grouping them. However, contrary to the classical clustering methods, being a non-parametric technique, it does not require information on the data distribution. As a projection method, it can display high-dimensional data onto an easily understandable lower-dimensional space (commonly two-dimensional), useful for improving the input pattern interpretation and classification and for finding unexpected structures in the data. Its effectiveness as a visualization technique is given by the fact that the mapping is nonlinear and the resulting map preserves the topological and metric relationships of the data.
The SOM unsupervised algorithm works as follows: before the training, the prototypes are initialized with small random values to demonstrate the strong selforganizing capability of the SOM. First, a randomly extracted input vector of the dataset is presented to the network; then, the winning node (called the best-matching unit) is identified, i.e., the node whose prototype is closest to the input, in terms of Euclidean distance. The weights of the winning node and its topological neighbors are then updated or moved towards the input vector. The updating rule of the prototype vectors uses a decreasing neighborhood function of the distance between two nodes on the map grid. The most commonly used is the Gaussian. This function uses two parameters: the learning rate, which controls the intensity of the attraction of the input vector, and the neighborhood radius, which regulates the number of vectors attracted other than the winning node. Both of these parameters are time-decreasing functions and change their values during training. Then, they remain constant during the convergence phase. Thus, in the beginning, the map provides a first rough representation of the input data distribution while at the end the prototypes are settled to their final values and the final map is shown. A more exhaustive description of the SOM algorithm can be found in [11]. At the end of the iterative process, the final map consists of "ordered" prototype vectors on the grid so that similar inputs fall into topologically neighboring nodes. In this sense, the SOM is a similarity graph.
The architecture of a SOM network has two levels, one of the input nodes and one of the output nodes located on a generally two-dimensional grid. Each input node is connected to all the nodes of the grid; each output node has a vector of weights with the same dimensions as the input vectors ( Figure 6a). Once the feature vectors have been processed, the final configuration of the weights will divide the input elements into the SOM nodes, which represent their clustering. In our work, we used a SOM with a 3 × 4 = 12 local hexagonal grid (Figure 6b), and a global toroid shape displayed as a sheet to have an immediate visual cluster configuration. This means that nodes on top and bottom are neighbors as well, as are the side nodes. Figure 6b also shows the numbering of the map nodes which proceeds from top to bottom and from left to right. Finally, we fixed the SOM parameters according to Kohonen et al. [54] and the SOM toolbox for Matlab (http://www.cis.hut.fi/somtoolbox/, last accessed 4 March 2022).

Thermal and Visible Camera Data
To visually analyze the explosive activity of Stromboli and compare it with the clustering of the seismo-acoustic features, we used the recordings of the INGV monitoring fixed camera network. In particular, we used the visual and thermal images recorded by the SQV/SQT and SPV/SPT cameras (Q and P in Figure 1). These cameras acquire one image every two seconds (SQV and SQT) and two images every second (SPT), both in the visible (V) band and in the thermal (T) band. Sensors in the thermal band are particularly useful because they are not very sensitive to day/night light variations. The two cameras have different distances from the crater area and different viewing angles. This causes a different sensitivity to the detection of the explosions that occur in the three vent regions. Moreover, particular weather conditions characterized by low-lying clouds can affect the visibility of the explosions. Therefore, depending on the case, the analysis was performed by using a specific camera allowing the best view.
Based on the results of the seismo-acoustic data clustering obtained with the SOM method, we identified five days that are representative of the five main nodes, which are grouped into three clusters. Each of these days was characterized by a prevalent explosive type, according to the neural analysis of the seismo-acoustic features (see Section 3.3). Thus, we selected about 180 video recordings of explosions and characterized them on the basis of the eruptive vent, the height and shape of the ejection, and the duration of the explosive process.

GBInSAR Data
GBInSAR measurements allow retrieving ground deformation by exploiting the phase difference between pairs of Synthetic-Aperture Radar (SAR) images acquired by in situ instrumentation. They are based on the same principle as satellite-based SAR observation, with the advantage that GBInSAR has a higher rate of data acquisition and takes images short distances from the target area. At Stromboli, there are two GBInSAR devices installed in the stable area north of the Sciara del Fuoco, in order to monitor ground displacement of the unstable flank and the top of the volcano. The technical characteristics of the instruments, their settings, and the data processing methods have made this technique very important for identifying the periods of inflation/deflation of the shallow magmatic system in Stromboli (e.g., [17,30,35,41,[55][56][57]), regardless of the weather conditions and ash content in the atmosphere. The two instruments, working in the Ku-band (17.0-17.1 mm radar), acquire with a revisiting time of 6-7 min, and then the images are averaged over 30 min in order to increase the signal-to-noise ratio. A resample operation returns images with a pixel size of about 2 m × 2 m along both range and crossrange [58] and, setting a coherence (>0.7; see Antonello et al. [59]) and a power filter (>55 dB; [30,57]), the pixel by pixel staking algorithm allows the reconstruction of the cumulative displacement maps, allowing for the tracing displacement time series of selected points (averaged over 5 × 5 pixels) with a precision in the displacement measurement of 0.5 mm [41,56]. We used GBInSAR data recorded from 15 November 2018 and 15 September 2019. The features of the two GBInSAR devices are reported in Table 1.

Results
Our goal is to analyze the temporal evolution of the Strombolian explosive activity in order to highlight changes in the eruptive style that preceded the paroxysmal phases of 3 July and 28 August 2019.

Seismic-Acoustic Features
We applied the novel procedures for the feature extraction from seismic and infrasonic data, which are described in the "Data and Methods" section. Starting from 30 s seismic signal recordings corresponding to 1500 samples (50 samples per second), we obtained 351-dimensional feature vectors. In particular, we encoded the seismic signal frequency content by downsampling the smoothed spectrum by a factor of 1:3 and considering the spectral features up to 10 Hz. This frequency threshold is suitable to adequately represent the signals of interest for our study [24,34,37]. Then, we encoded the VLP event waveforms by resampling the filtered signal (0.05-0.5 Hz) by a factor of 1:16 and, finally, sorting the VLP waveform features in ascending order. Finally, we extracted the raw seismic signal waveform features using an encoding based on the peak-to-peak amplitude of a 25-sample (0.5 s) sliding window (Equation (1)), sorted in descending order. Therefore, we obtained a vector of the seismic features composed of 200 coefficients for the spectral content encoding, 92 coefficients that encode the VLP waveform, and 59 coefficients for the parameterization of the raw waveform ( Figures 5 and 7).
We also extracted the infrasonic feature vectors by high-pass filtering the signal (>0.5 Hz) and applying an encoding procedure similar to that for the raw seismic signal waveform (Equation (1)). We obtained a 59-dimensional infrasonic feature vector for each infrasound record. Additionally, in this case we sorted the vector of the features in ascending order to make the encoding independent from the picking of the events ( Figure  7).

SOM Analysis
We carried out three clustering experiments through the SOM analysis: (i) using only the seismic features; (ii) using only the infrasonic features; (iii) using both features of seismic and infrasonic data. Figure 8 shows the results of the three experiments. In the SOM maps, each node is shown as a yellow hexagon, whose size indicates the node density, in terms of the number of input samples associated with that node. The Euclidean distance between the prototypes of two neighboring nodes is represented according to grayscale in the spaces between the nodes. Dark gray hexagons interposed between the nodes correspond to a great distance, light gray indicates high similarity between the prototypes of neighboring nodes. We indicated the nodes of the SOM map with progressive numbers from 1 to 12 (the dimensions of the map are 3 × 4). The node numbering criterion is shown in Figure 6b.
The results of the experiment that was based only on the seismic features (Figure 8a) highlight three main clusters: a red cluster, formed by node 10; a blue cluster, formed by nodes 4 and 12, and a green cluster composed of nodes 1, 2, and 5.
The experiment that was based only on the infrasonic features (Figure 8b) provided only two main distinct nodes of the SOM network, in which most of the examples of the dataset are grouped. Thus, only two clusters were identified in this experiment: a red cluster, coinciding with node 2, and a blue cluster, coinciding with node 12.
The results of the experiment that was based both on seismic and infrasonic features (Figure 8c) highlight three main clusters: a red cluster, composed of nodes 4 and 7; a blue cluster, which includes nodes 5, 9, and 10; and a green cluster composed of node 11.
In all the three experiments that we carried out, the results indicate a variation in the relative occurrence of the clusters in the three months preceding the eruptive crisis, which began on 3 July 2019 (Figure 8). In particular, the experiment with only infrasonic data separates two large families of events: one characterized by large-amplitude impulsive infrasonic signals, that is marked as a red cluster in the results of our experiment, and another with infrasonic signals almost indistinguishable from background noise, marked as a blue cluster in the results of our experiment (Figure 8b). The experiment with only the features of the seismic signals identifies a greater variety of types that can be grouped into three main clusters. Finally, the experiment with the seismo-acoustic features used jointly also identifies three main clusters and more clearly emphasizes the variation in the relative occurrence of the clusters before the paroxysm of 3 July 2019 (Figure 8c).
By associating the seismic features with the infrasonic ones, the event parameterization is more complete and the SOM clustering experiment provides more significant information on the temporal evolution of the eruptive style of Stromboli in the target period. For this reason, in the following, we will focus on the results of this experiment by comparing the retrieved clusters with the camera images and GBInSAR deformation data. We will call the three clusters obtained in the seismo-acoustic SOM experiment cluster Red (in total 4539 explosions: 2950 in node 4 and 1589 in node 7), cluster Blue (in total 6332 explosions: 1183 in node 5; 2638 in node 9; and 2511 in node 10) and cluster Green (1682 explosions in node 11) (Figure 8c).

Classification of the Explosions Belonging to Clusters through the Analysis of Camera Images
To link the three seismo-acoustic clusters obtained from the SOM analysis to types of explosions, we selected a subset of seismo-acoustic events representative of Blue, Green, and Red clusters (Table 2), and compared them with the INGV camera recordings ( Figure  1). We analyzed the main nodes of cluster Blue, which are 9 and 10, with node 5 being very similar to node 9. First, we selected the camera images relevant to the days when there was the highest concentration of explosions falling in the main nodes belonging to a specific cluster. These days are 17 February, 16 May, 8 June, 9 July, and 6 August 2019. Table 2 shows the distribution of the seismo-acoustic clusters on the selected days. February 17 was chosen to represent cluster Red, with a prevalence of seismo-acoustic events that fall into node 4 (43 out of 47); 16 May and 8 June were selected to represent cluster Blue, with a prevalence of seismo-acoustic events belonging to nodes 9 and 10, respectively; 9 July was again representative of cluster Red, with a prevalence of events in node 7; and August 6 represented the events of cluster Green, all of which fall into node 11. Thus, there are 182 explosions of interest, equal to the sum of the values in bold underlined in Table 2. Unfortunately, some of these explosions were not properly recorded by the INGV cameras due to poor weather conditions or technical problems. Furthermore, several explosions belonging to cluster Blue are inherently undetectable by cameras due to the types of events that this cluster groups together, namely gas explosions. In particular, thirty of the forty 16 May explosions, belonging to node 9 in our dataset, are not visible in the camera recordings. The same happens for nine of the forty explosions relevant to 8 June, belonging to node 10. In summary, the cameras recorded: 42 explosions falling into node 4 on 17 February; 32 explosions belonging to node 7 on 9 July; 10 explosions belonging to node 9 on 16 May; 31 explosions belonging to node 10 on 8 June; and 20 explosions falling into node 11 on 6 August 2019. The analyzed dataset allowed us to characterize the types of explosions corresponding to the seismo-acoustic clusters. Figure 9 shows some examples of the seismograms associated with the seismoacoustic events that characterize the main SOM nodes. Table 2. Days representative of the seismo-acoustic events of the three clusters. The values in bold and underlined indicate the number of events in our dataset, which, on the day indicated in the first column, falls into the prevailing node (reported in the last column). The column "Detected" gives the number of explosions that have been identified by the monitoring cameras.

Date
Red n.4  Table 2 reports in the column "Detected" the number of explosions recorded by the seismo-acoustic trace, and falling into the corresponding prevailing node (last column) that could be identified by the camera images. Some of them are shown in Figure 10. Actually, the observation of the camera images allowed us to recognize the vent (Figure  10a,f) and the eruptive style of prototypal explosions belonging to the three clusters. In particular, on 17 February, the SQV camera (Q in Figure 1) recorded 43 explosions from node 4, which belongs to cluster Red ( Table 2). All of them were characterized by wellcollimated jets from the N1 vent (Figure 10a), with approximately the same elevation (~200 m), and lasting on average ~5.6 s (three SQV frames). Only two events out of 43 lasted longer (8 s), whereas 12 events lasted less (4 s). These explosions ejected juvenile pyroclastic fragments showing ballistic trajectories. On May 16, the SPT camera (P in Figure 1) recorded 10 of the 40 explosions that fall into node 9 (cluster Blue). Most of the SPT videos recorded on 16 May 2019 were damaged due to technical problems, probably related to the data transmission system, and could not be used, but the few available allowed observations of this activity that cannot be detected from other cameras because of too-low intensity and very short gas plume. The observed explosions were all from the SW2 vent (Figure 10a). They were mild and mostly gas-dominated (Type 0, according to Leduc et al. [26]), displaying slow bowl-shaped gas emissions with no visible ash or incandescent lapilli. The max height reached by these explosions was around 10-20 m and their duration ranged between 11 and 33 s (average 18.7 s).
On June 8, the SPT camera recorded 31 of the 40 explosions that fell into node 10 (cluster Blue). These explosions were Type 0 events occurring from SW1, SW2 (most common), or C1 vents (Figure 10a,c), having a duration between 9 and 72 s (average 22.5 s). Some of them were not visible on the surface (nine events) probably because they were too weak and occurred within the conduit.
On 9 July, the SQT camera recorded 32 powerful explosions belonging to node 7 (cluster Red) generally with well-collimated jet and ballistic ejections from the SW vent region (Figure 10a,e). We could not exactly distinguish the vent because of the inclined view offered by this camera. The duration ranged between 10 s and 28 s (18.6 average).
On 6 August, 26 explosions belonging to node 11 (cluster Green) were recorded. In this period, a lava flow descended along the SW slope of the Sciara del Fuoco (Figure 1). Given the almost continuous explosive activity that accompanied this effusive phase, it was difficult to identify the explosions associated with the seismo-acoustic signals. However, the SQV camera recorded some of these events. They may be related to explosions from multiple vents, generally the South West (SW) and Central (C) vent regions (Figure 1). These explosions were characterized by the ejection of pyroclastic fragments, most of which were incandescent spatter-like, with a wide range of ejection angles that gave the explosion an almost hemispherical shape, and the height reached a maximum of 80 m. Figure 10 shows an example for each of the main nodes belonging to one of the three clusters. In particular, panels (b) to (f) show the images and the seismo-acoustic recordings of the event types for nodes 9 and 10 (cluster Blue), node 11 (cluster Green), and nodes 4 and 7 (cluster Red). For each of them, the date, the node they belong to, the image of one of the cameras used for the analysis, the raw seismic signal, the seismic signal filtered in the VLP band (0.05-0.5 Hz), the high-pass filtered infrasonic signal (>0.5 Hz), and a zoomed-in view of the infrasonic signal are shown from top to bottom. The SPV and SPT cameras are very close to the vents whereas the SQV and SQT cameras are further away from them (Figures 1 and 10a), therefore the explosions that produce a weak signal in the camera recordings (e.g., panels b and c) are visible only from the cameras installed at site P in Figure 1 (SPV and SPT). Figure 10b,c represents two event types of cluster Blue, which is associated with gas explosions, belonging to nodes 9 and 10, respectively. The events of this cluster show VLPs (Figure 9b,c) characterized by prolonged oscillation, especially evident in the events falling into node 9, and peak-to-peak amplitude generally higher than that of the events belonging to the other two clusters (particularly evident in node 10). The infrasonic signal associated with these events is almost indistinguishable from the background noise. Figure 10d shows an example of cluster Green, consisting of only node 11, which groups explosions with ballistic spatters and hemispherical shapes. The raw seismic signal associated with this explosion is modest in amplitude whereas the sustained VLP signal includes numerous oscillations. The infrasonic signal does not show an evident pulse linked to the explosion and is characterized by repeated minor pulses linked to spattering activities. Finally, Figure 10e,f represents two event types of cluster Red, falling on nodes 4 and 7, respectively. These events are characterized by a VLP signal with a distinct amplitude pulse and an infrasonic transient of remarkable amplitude. The raw seismic signal shows a greater contribution of the high-frequency components compared to the other types of events, in part due to the coupling of the infrasonic signal with the ground [60]. These seismo-acoustic events are associated with explosions that produce a well-collimated jet, with ejection of ballistic fragments as described above for the events of cluster Red.

Seismo-Acoustic Clusters and GBInSAR Measurements
We compared the time evolution of seismo-acoustic clusters with the ground deformations (Figure 11a) measured by the GBInSAR device in the summit area of the volcano (Figure 1). The investigated period was characterized by an oscillatory trend of deformations, with displacements towards the sensor (i.e., inflation), and displacements away from it (deflation). We observed an initial period from 15 November 2018 to 5 February 2019 (1 in Figure 11), characterized by high displacements (on average 1.8 mm/day) towards the sensor (inflation), a period of low to null displacement until March 3 (2 in Figure 11), and new inflation of about 5 mm/day toward the sensor, which lasted until March 15, 2019 (3 in Figure 11). After this inflation, there was a period characterized by small fluctuations in displacements, which in any case remained low or null until 19 July 2019 (4 in Figure 11). The following period was characterized by displacement towards the sensor (inflation), with an average rate of 2.8 mm/day and peaks that reached even more than 30 mm/day (9 August 2019 and 10 September 2019). A very striking feature, currently never observed in the GBInSAR data from Stromboli (e.g., [56]), was the increase in the oscillations of the crater terrace, which can be deduced here from the increase in the standard deviation of the daily displacement rate. In particular, the period considered can be divided into three subperiods: (i) from 15 November 2018 to 5 April 2019, with a low standard deviation (on average 24 mm/day); (ii) from 8 May 2019 to 8 July 2019 (period preceded by the absence of data due to technical problems of the instrument), with an increase in the standard deviation (on average 45 mm/day); (iii) from 9 July 2019 to 15 September 2019, which was characterized by high standard deviation values (on average 105 mm/day), testifying the strong oscillations of the crater terrace in the time of acquisition of the GBInSAR data.
By comparing this displacement data with SOM clusters and then with the camera images, we found that the period dominated by the gas explosions of cluster Blue (Figure  11c), which begins in early April (the predominance of cluster Blue is highest in June 2019, as shown by the black rectangle in Figure 11), occurs during a stasis of ground deformation interposed between two phases of inflation of the upper part of the volcanic edifice (Figure 11a). On the contrary, the explosions of clusters Red and Green ( Figure  11b,d), that are dominated by the ejection of coarse juvenile ballistic particles, occur in periods characterized by inflation of the crater area. In particular, cluster Green, erupting large spatter, seems temporally correlated with the phases of more intense inflation of the top of the volcano.

Discussion
In a previous study, Giudicepietro et al. [4] highlighted the precursors of the 2019 paroxysmal phase through the calculation of seismic parameters such as the polarization of the seismic signal, the peak-to-peak amplitude of VLP events, and the VLP size. The comparison of these parameters with the temporal evolution of the seismo-acoustic clusters retrieved with the SOM analysis clearly shows that the anomalies of the seismic parameters are linked to a significant change in the types of explosions before the 2019 paroxysmal phase (Figure 12).
In particular, significant variations have been recognized thanks to the definition of the VLP size parameter, which provides a value representative of the magnitude of the main VLP event for each half an hour. When the continuity of the seismic signal is satisfactory, 48 values per day relating to 48 VLP events are retrieved. The events identified by the VLP size calculation carried out in Giudicepietro et al. [4] have been selected to constitute the dataset analyzed in this work.
The time series of the VLP size in the period 15 November 2018-15 September 2019 shows a remarkable increase before the 3 July 2019 paroxysm. This increase is reflected in the time evolution of the seismo-acoustic clusters ( Figure 13). Actually, about three months before the first paroxysm (3 July 2019), the occurrence of seismo-acoustic events belonging to cluster Blue (gas explosions or Type 0) increased with respect to the occurrence of seismo-acoustic events belonging to clusters Red and Green. This indicates that the gas explosions were predominant in the persistent Stromboli activity for about three months before the 3 July 2019 paroxysm. Furthermore, our findings indicate that Type 0 explosions produce large VLP events whereas they do not generate evident signals in the camera recordings, which in some cases do not record the event at all. We interpret this as the effect of large gas slugs that cause a volumetric variation in the source area of the VLP seismic signals when they rise along the conduit [24]. However, they do not generate ejection of pyroclastic fragments, or hot material, which should be detected by visible and thermal cameras, nor do they generate remarkable infrasonic signals, in frequencies greater than 0.5 Hz (Figure 10). Therefore, Type 0 explosions may not be detected at all by monitoring cameras and infrasonic networks whereas they are always clearly evident in broadband seismic signals. A low-frequency infrasonic signal, e.g., within the frequency range of the band of VLP seismic events (0.05-0.5 Hz), has been observed in some cases, but this component of the infrasonic spectrum has not been considered for the parameterization of the signals because it is generally affected by strong noise due to atmospheric weather conditions. The comparison between the GBInSAR measures and the SOM clustering of the seismo-acoustic features highlights that, counterintuitively, the geodetic precursor of the paroxysm of 3 July 2019 was not a phase of inflation but rather an interruption of the inflation and a trend towards deflation in the last month before the paroxysm (Figure 11). The relationship between the prevailing type of explosions and the ground deformations in the crater area ( Figure 11) indicates the consistency of the clustering obtained with the SOM with physical variations of the state of the volcano. In particular, the prevalence of gas explosions (cluster Blue) during a period of little or no inflation of the crater area is consistent with the fact that the gas is compressible and therefore when it passes through the final part of the conduit it produces less deformation than magma. On the other hand, the temporal correlation between the inflation phases in the crater area with the prevalence of explosions belonging to cluster Red (well-collimated jets of ballistics), and especially to cluster Green (erupting large spatter), is consistent with a condition in which the final part of a shallow conduit is filled with magma. This condition is typical of the effusive phase (3 July-30 August 2019) fed by the SW vent region, during which the occurrence of the Green cluster explosions increased. The abrupt change in the eruptive style that arose when the paroxysm of 3 July 2019 occurred is noteworthy, suddenly determining the transition between an activity characterized by a prevalence of gas explosions with little or no emission of pyroclastic material (explosions of the Blue cluster) to an activity characterized by explosions that eject incandescent ballistic pyroclasts in conjunction with effusive activity (Figures 11 and  13). The explosions that emit incandescent ballistic fragments, which appeared immediately after the paroxysm of 3 July 2019, are distributed in two different clusters that correspond to different characteristics of the explosive mechanism whose fingerprints are recognizable in the seismo-acoustic features. In particular, the explosions of cluster Red are characterized by a well-collimated jet, which reaches a height of more than 200 m above the vent, and by a remarkable infrasonic transient. Those of cluster Green are characterized by the emission of incandescent ballistic spatter with a wide range of ejection angles and do not show an easily recognizable infrasonic transient associated with them. The latter show a hemispherical shape and reach a lower height (around 80 m). Compared to the explosions of cluster Red, this second type of explosion is linked to a greater height of the magma column in the conduit, which is completely filled with magma, as also observed in other volcanoes, for example, Etna [61]. After the paroxysm of 3 July 2019, the Green cluster explosions became frequent and probably occurred at the same SW vents that fed the lava flow. A small but significant variation in the locations of the VLP events reported in Giudicepietro et al. [4] corresponded to the sudden change in the eruptive style (3 July 2019, in Figures 11 and 13). These locations were concentrated in the SW sector of the VLP source volume before the paroxysm of 3 July 2019, and migrated slightly NE after this paroxysm, indicating a resumption of the activity in the NE vent region (see Figure 7 in Giudicepietro et al. [4]).
Information on the final part of the conduits linked to the eruptive vents is contained in the seismo-acoustic features as also highlighted in the analysis of experimental seismoacoustic events in Giudicepietro et al. [45]. Actually, clusters Blue and Red are composed of more than one node, and the subdivision of the seismo-acoustic events into the different nodes typically corresponds to explosions with a similar mechanism emitted from different vent regions, as in the case of nodes 4 (N1 vent) and 7 (SW vents) that form cluster Red (Figure 10a,e,f).
All the three main types of explosions recognized by the SOM analysis generally manifest themselves in the persistent activity of Stromboli, each of which can occur in different vent regions (Figure 1). Therefore, the anomaly that preceded the first paroxysm of 2019 was the clear predominance, within our dataset, of gas explosions (cluster Blue), which reached 96.12% of the total in the last month before the 3 July 2019 paroxysm ( Figure 13). As already specified in the Data and Methods section, our dataset does not include all explosions, which can exceed 400 per day, but only those associated with the largest VLP size of every half-hour, for a maximum of 48 events per day. This selection allowed us to extract the 14,289 and 14,179 most significant seismic and infrasonic recordings, respectively, and to prevent the dataset from reaching dimensions that are not easy to handle for analysis. In the period preceding the paroxysm of 3 July 2019, the predominance of cluster Blue in this dataset indicates a degassing activity that is not accompanied by an effective emission of juvenile material, consistently with the deflation or the absence of inflation in the crater area, therefore indicating a remarkable anomaly in the pattern of the persistent activity of Stromboli. The eruptive style change before the paroxysmal phase, which is clearly recognizable in the temporal evolution of the seismoacoustic clusters found with the SOM analysis ( Figure 11 and 13), is an important finding because it highlights hidden variations in the state of the volcano that may reveal undetected escalation of volcanic plume degassing and/or precursory leakage from deeply stored gas-rich magma (e.g., [62]). Actually, despite Stromboli being a wellmonitored volcano, when the first paroxysm of the 2019 eruptive crisis occurred, it was considered to be in a state of normal activity.
The second paroxysm, which occurred on 28 August 2019, happened 56 days after the start of the effusive activity, which began immediately after the first paroxysm on 3 July 2019. Therefore, this event occurred in a different condition compared with the first one, as also indicated by the temporal evolution of the seismo-acoustic clusters ( Figures  11-13). Considering the models of Stromboli paroxysm triggering proposed in the literature, the first paroxysm (3 July 2019) could be explained by an increased supply of gas and magma from the depths (e.g., [39,62,63]). However, the neural analysis of the eruptive style and its comparison with the deformation of the summit area allowed us to discover that this paroxysm was preceded by a phase of decrease in the feeding of the persistent activity, which is highlighted by the decreased emission of pyroclastic material and by the deflation of the summit area accompanied by a greater release of degassing (Type 0 explosions of the Blue cluster). For this reason, the input of gas and magma from the depths that caused the paroxysm does not seem linked only to increased activity of the deep magma system but also to a deceleration, in the period preceding the paroxysm, of persistent activity, which is partly controlled by the shallow volcanic system. On the contrary, the paroxysm of 28 August 2019 is consistent with a trigger due to the drainage of the highly porphyritic magma which is typically found in the upper part of the conduit, due to the effusive activity that began about two months earlier, which determined the rise of low-porphyritic magma capable of producing paroxysmal eruptions (e.g., [64]). In any case, the GBInSAR measurements indicate that in the medium term a deflation shortly before the event occurred for both paroxysms. Furthermore, in the short term, the strainmeter data show that similar dynamics occurred in both paroxysms, as reported in Giudicepietro et al. [4] and Di Lieto et al. [65].

Conclusions
The SOM analysis of the seismo-acoustic features associated with a set of about 14,200 explosions selected based on the VLP size parameter allowed us to identify three main clusters in the period 15 November 2018-15 September 2019, which contains the paroxysmal phase of July-August 2019.
The comparison of a subset of events with the visible and thermal camera images allowed us to associate distinct explosive types to the three main seismo-acoustic clusters. In particular, the cluster called Red is associated with explosions characterized by wellcollimated vertical jets of ~200 m in height, which eject incandescent ballistic pyroclastic fragments and produce a remarkable infrasonic signal. Cluster Blue is associated with gas explosions with height in the range 10-20 m and with little or no ash and ballistic emission. These bursts may not be detected by the camera recordings and infrasonic signals whereas they are evident in the VLP seismic signals (filtered in the 0.05-0.5 Hz frequency band). Cluster Green groups explosions characterized by the ejection of incandescent spatter-like pyroclastic fragments, with a wide range of ejection angles and hemispherical shape. The explosions of cluster Red are mainly generated in the NE vent region whereas the explosions of clusters Blue and Green are mainly emitted from the central and SW vent regions.
Looking at the time evolution of the three main clusters, we discovered that the eruptive style of Stromboli was affected by significant changes in the three months preceding the 3 July 2019 paroxysm and that the gas explosions (Type 0; Leduc et al. [26]) falling into cluster Blue dominated the persistent Strombolian activity, especially in the last month before this paroxysm, forecasting the ascent of gas-rich magma from a depth [62].
Finally, by comparing the temporal evolution of the clusters with the deformations of the top of the volcano retrieved through GBInSAR measurements, we were able to recognize a relationship between the eruptive style and the inflation/deflation phases of the crater area. Actually, the period dominated by the gas explosions of cluster Blue (early April-late June 2019) was characterized by the absence of significant deformations whereas the effusive phase between the two paroxysms (early July-mid September 2019), dominated by explosions falling into clusters Red and Green, was characterized by inflation of the crater area, especially from July 19 until the end of our target period (15 September 2019). The explosions of clusters Red and Green are both characterized by the emission of incandescent ballistic pyroclasts but with different mechanisms: the explosions of cluster Red produce vertical jets, with a narrow ejection cone, and generate a distinct infrasonic transient associated with them; the explosions of cluster Green eject the ballistic pyroclasts according to a wide range of ejection angles assuming a hemispherical shape. The latter are linked to a high level of magma in the conduit and are often associated with spattering. Among the three main clusters, only the explosions falling in the Red cluster generate clearly recognizable infrasonic transients in the frequency band >0.5 Hz.
This study allowed us to discover variations in the pattern of the persistent activity of Stromboli that preceded the 2019 eruptive crisis and to interpret the geophysical data in terms of variations in the eruptive style and the state of activity of the volcano. The results obtained increase our ability to distinguish the different Strombolian mechanisms and suggest new opportunities for an advancement in the monitoring of Stromboli focused on the forecasting of potentially dangerous eruptive activity variations and early warning for paroxysms.