Recognizing Eruptions of Mount Etna through Machine Learning Using Multiperspective Infrared Images

Abstract: Detecting, locating and characterizing volcanic eruptions at an early stage provides the best means to plan and mitigate against potential hazards. Here, we present an automatic system which is able to recognize and classify the main types of eruptive activity occurring at Mount Etna by exploiting infrared images acquired using thermal cameras installed around the volcano. The system employs a machine learning approach based on a Decision Tree tool and a Bag of Words-based classifier. The Decision Tree provides information on the visibility level of the monitored area, while the Bag of Words-based classifier detects the onset of eruptive activity and recognizes the eruption type as either explosion and/or lava flow or plume degassing/ash. Applied in real time to each image of each of the thermal cameras placed around Etna, the proposed system provides two outputs, namely, visibility level and recognized eruptive activity status. By merging these outcomes, the monitored phenomena can be fully described from different perspectives to acquire more in-depth information in real time and in an automatic way.


Introduction
Mt. Etna is a basaltic volcano characterized by persistent and highly variegated eruptive activity at the summit craters [1,2]. One of the most intriguing features of Etna is the growing number of its summit craters, which has increased from one (the former Central Crater, which had existed for many centuries) to five, i.e., the Northeast Crater (1911), the Voragine (1945), the Bocca Nuova (1968), the Southeast Crater (1971), and the New Southeast Crater (2007), in an interval of just 96 years [3,4]. The eruptions at summit craters can last from a few hours to several months, and evolve by phases of degassing alternating with Strombolian activity and, occasionally, paroxysms and lava fountains to lava overflows [5][6][7][8]. These different eruptive phases may generate a variety of hazards, including lava flows, gas emissions, explosions and tephra fall, which can represent a significant threat to people and property. Thus, the timely identification and reliable characterization of eruptive events is crucial to rapidly forecast the potential impact of hazardous phenomena and to support mitigation actions to reduce risk to people or critical infrastructure [9][10][11][12][13][14][15][16][17][18].
The past few decades have seen a surge in the development of remote sensors, such as ground-based cameras, drones and satellite-based sensors, that have been applied to the study of volcanoes, and especially poorly monitored volcanoes, all over the world [19][20][21][22][23][24][25]. Among them, Thermal InfraRed (TIR) remote sensing measurements of high-temperature volcanic features are widely used to identify renewed volcanic activity, forecast eruptions and assess hazards [19][20][21][22]. Machine learning (ML) has been successfully used in a wide variety of applications [56][57][58][59] thanks to its ability to model the complex behavior of image data.
Here, we propose a multiperspective ML system to automatically recognize and classify the eruptive activity recorded by the five thermal cameras belonging to the INGV network located on the Etna volcano. The proposed system relies on a two-step approach using two different kinds of classifiers: a decision tree (DT) and a Bag of Words (BoW) model. Given that eruptive events may be obscured by thick volcanic clouds, which may severely limit or block observations of activity, the DT-based classifier is first used to determine the visibility level of the monitored scene, and thus the reliability of the classifier outcome. Then, a BoW-based classifier exploiting support vector machines (SVM) is used to recognize the eruptive activity type.
Finally, the considerable variability of eruptive event images (in terms of both shape and scale) and the instability of thermal images (which are susceptible to changeable conditions of weather and sunlight) make it very difficult to extract reliable information from images. In addition, the five cameras provide different and partial views of the summit area; therefore, the automatic classifier produces a single outcome for each thermal camera. However, by combining all the outcomes from all the cameras, the results show that the new classifier can provide global and accurate information on what is happening at the summit craters in real-time, i.e., what kind of eruptive activity is taking place.
We will describe and demonstrate the operation of this automatic system by using a retrospective analysis of the summit eruptive activity that occurred at Etna in 2019.

Thermal Cameras Deployed on Etna
The current INGV monitoring network of Etna comprises five FLIR (Forward Looking InfraRed) thermal cameras, which are installed on the southern, eastern and western flanks of the volcano. Thermal cameras are equipped with uncooled microbolometers that detect emitted radiation in the waveband from 7.5 µm to 13 µm. The radiometric information is converted to calibrated false-color RGB images. The cameras can be set to acquire data at different rates, and this is sometimes manually adjusted, depending on the current level of volcanic activity. The main features of the five INGV thermal cameras, including the location name, distance from the summit (in km), elevation (in m above sea level, a.s.l.) and field of view (FOV, in degrees), are reported in Table 1. It is worth noting that both EBT and ENT are located quite far from the summit area; this results in a lower level of detail, but it offers a wider perspective of the monitored eruptive events compared to the other cameras. In Figure 2, the locations of Etna's summit craters are shown with respect to each thermal camera view. Each color line is associated with the corresponding crater name, whereas the arrows indicate the positions of the corresponding craters that are in the background and out of sight.

Workflow of the Machine Learning Classifier
The proposed machine learning approach relies on three main steps, as illustrated in Figure 3. After a preprocessing step to manipulate the false-color images acquired from each thermal camera, the DT classifier recognizes the visibility conditions of the monitored scene and the BoW-based classifier identifies whether any volcanic activity is taking place, detecting the specific type of volcanic activity among predefined classes. For each event, two attributes are assigned to each frame: visibility level and volcanic activity category.

First, a preprocessing phase is needed to prepare the false-color images as input to the next steps. For each camera, frames are cropped so as to exclude the color bar and the information about the acquisition time and camera name, which are normally embedded in the frame.

To train the two classifiers, two separate training datasets have been prepared. To take into account the image quality and visibility, a first training dataset is split into "clear" and blurry or "fuzzy" images from all cameras for the DT classifier (Table 2). To classify the variety of volcanic activity, a second training dataset is split into "explosive", "effusive", "explosive and effusive", "degassing" and "no activity" categories for the BoW-based classifier (Table 3). This task is crucial and not always straightforward, as we will clarify shortly. Therefore, expert knowledge is needed to prepare the datasets to include a wide enough range of conditions.
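The frame cropping in the preprocessing phase can be pictured with a minimal sketch. It assumes a frame is held as rows of pixel values and that each camera has a fixed crop box; the `CROP_BOXES` table, its values and the `crop_frame` helper are hypothetical and serve only as an illustration, since the actual overlay layout of each camera is not given here.

```python
from typing import List, Tuple

Frame = List[List[int]]  # rows of pixel values; a stand-in for a decoded image


def crop_frame(frame: Frame, box: Tuple[int, int, int, int]) -> Frame:
    """Return the sub-image inside box = (top, bottom, left, right), end-exclusive."""
    top, bottom, left, right = box
    return [row[left:right] for row in frame[top:bottom]]


# Hypothetical per-camera crop boxes; real values depend on where each camera
# draws its color bar, timestamp and name.
CROP_BOXES = {"EMOT": (1, 4, 0, 5)}

# Toy 5x6 frame whose pixel value encodes its (row, column) position.
frame = [[r * 10 + c for c in range(6)] for r in range(5)]
cropped = crop_frame(frame, CROP_BOXES["EMOT"])
print(len(cropped), len(cropped[0]))
```

Keeping one box per camera reflects the fact that the overlays are embedded at camera-specific positions, so the same crop can be reused for every frame of that camera.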
The training datasets should cover as many eruptive events as possible in different atmospheric conditions and at different times of day (day and night, sunset and sunrise). Moreover, they should include some images leading to false alarms in order to train the machine learning techniques to recognize them as belonging to the correct category. For instance, two of the thermal cameras on Etna (EMCT and EBT) are oriented in such a way that either the sun or the moon can appear in the monitored scene. Similarly, depending on the time of the year, some portions of the EMOT images appear brighter, with a hotter reflection that may easily be confused with thermal activity due to limitations in the false-color calibration. These images have been included in the training sets with the "no activity" category.

Table 2. Main visibility categories and scenes output by the DT classifier.
Visibility Category | Monitored Scene
"clear" | Sunny day, clear night, slightly foggy and slightly cloudy scenes
"fuzzy" | Heavily foggy and cloudy scenes

Table 3. Main activity categories and scenes output by the BoW-based classifier.
Activity Category | Monitored Scene
"degassing/lithic ash emission" | Lithic ash emission or gas plume (without magmatic explosions)
"explosive" | Strombolian (with or without plume), lava fountain (with or without plume) and explosive activity in general
"explosive and effusive" | Explosive activity with lava flow and/or incandescent deposits
"effusive" | Effusive activity/incandescent deposits
"no activity" | No activity, low degassing/ash level, cooled lava flows and deposits

Remote Sens. 2020, 12, 970

The DT Classifier
The DT classifier is a supervised machine learning tool consisting of a tree-like graph of possible decisions and consequences related to a set of input attributes. It is made up of three components: nodes where the attributes undergo a testing phase, the branches connecting two consecutive nodes and the leaf nodes being the terminal nodes, predicting the final outcomes. Data is iteratively split into portions on each branch until the terminal node is reached using a binary recursive partitioning. During the training phase, appropriately selected sets of attributes with their corresponding classes are provided to the decision tree to learn the node criteria and build up the decision tree.
In our case, the sets of attributes are the grayscale intensity histograms of the thermal images, which are able to distinguish the visibility conditions between the two classes, i.e., "clear" and "fuzzy", by appropriately setting threshold levels (i.e., the node criteria). Indeed, "clear" frames show a peaked histogram, with at least one peak at the most frequent gray level (background), whereas images with "fuzzy" visibility are characterized by a quasi-uniform histogram. We found that a 10-level gray histogram provides a good representation of the thermal image, allowing the DT classifier to derive appropriate thresholds for each intensity level. Thus, a 1 × 10 vector is built for each image and the corresponding class is assigned (0 if "clear" and 1 if "fuzzy").
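The histogram attribute and the peaked versus quasi-uniform distinction can be sketched as follows. This is only an illustration under stated assumptions: gray levels are taken to be 8-bit, and a single hand-set threshold on the tallest bin (`peak_threshold`, a hypothetical value) stands in for the node criteria that the trained decision tree would learn from data.

```python
from typing import List, Sequence


def gray_histogram(pixels: Sequence[int], bins: int = 10, max_level: int = 256) -> List[float]:
    """Normalized histogram of gray levels (0..max_level-1) over equal-width bins."""
    counts = [0] * bins
    for p in pixels:
        counts[min(p * bins // max_level, bins - 1)] += 1
    total = len(pixels)
    return [c / total for c in counts]


def is_fuzzy(hist: Sequence[float], peak_threshold: float = 0.5) -> int:
    """Single-node stand-in for the tree: 0 ("clear") if one bin dominates, else 1 ("fuzzy")."""
    return 0 if max(hist) >= peak_threshold else 1


# Toy pixel sets (not real frames).
clear_pixels = [20] * 90 + [200] * 10      # uniform background plus a small hot region
fuzzy_pixels = list(range(0, 250, 5)) * 2  # gray levels spread almost evenly

print(is_fuzzy(gray_histogram(clear_pixels)), is_fuzzy(gray_histogram(fuzzy_pixels)))
```

A real tree would test several bins in sequence rather than only the tallest one; the single test here just shows why a 1 × 10 histogram vector is enough to separate the two visibility classes.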

BoW-Based Classifier
A BoW model represents an image by treating it as a text document whose words are the extracted image features. An image modelled using BoW is represented by a histogram of the occurrences of a set of representative visual words which have been suitably identified. The BoW-based classifier was designed to detect the onset of eruptive activity and to recognize the specific type of volcanic activity. It consists of three main steps:
(1) Setting up the image dataset to be used for the training and testing phases. Each image in the training dataset is labeled as belonging to one of the previously defined categories ("explosive", "effusive", "explosive and effusive", "degassing" and "no activity") using expert knowledge.
(2) Representing images by using a bag of visual words. Three steps need to be performed to achieve this task: (i) feature detection, (ii) feature description and (iii) codebook generation. The image features are detected and described from the training dataset by using the Speeded-Up Robust Features (SURF) method [60]. The SURF interest point detector is based on the Hessian matrix and relies on integral images to reduce the computation time. The SURF descriptor is based on a distribution of Haar-wavelet responses within the interest point neighborhood, again exploiting integral images for speed. Once SURF descriptors have been found, the extracted features are grouped, i.e., the codebook is generated by means of an unsupervised clustering method. Here, the image features identified by the SURF extractor are grouped using a k-means clustering method into k mutually exclusive clusters, with each centroid representing a feature, i.e., a visual word. Thus, each image is characterized by an occurrence histogram of the extracted visual words (500 in our case).
(3) Designing the image classifier. An error-correcting output codes (ECOC) framework classifies the BoW histograms (generated in step 2) according to the previously defined categories.
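Step 2 can be illustrated with a small sketch of how an occurrence histogram is built once a codebook exists. It assumes the descriptors and the k-means centroids are already available; the two-dimensional toy vectors below are placeholders for SURF descriptors, and the three-word codebook stands in for the 500-word one.

```python
from typing import List, Sequence

Vector = Sequence[float]


def nearest_word(descriptor: Vector, codebook: Sequence[Vector]) -> int:
    """Index of the codebook centroid closest to the descriptor (squared Euclidean distance)."""
    def sq_dist(a: Vector, b: Vector) -> float:
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return min(range(len(codebook)), key=lambda i: sq_dist(descriptor, codebook[i]))


def bow_histogram(descriptors: Sequence[Vector], codebook: Sequence[Vector]) -> List[int]:
    """Count how many descriptors fall on each visual word."""
    counts = [0] * len(codebook)
    for d in descriptors:
        counts[nearest_word(d, codebook)] += 1
    return counts


# Toy data: three visual words and four descriptors from one image.
codebook = [(0.0, 0.0), (1.0, 1.0), (5.0, 5.0)]
descriptors = [(0.1, 0.0), (0.9, 1.2), (1.1, 0.8), (4.0, 6.0)]
hist = bow_histogram(descriptors, codebook)
print(hist)
```

The resulting vector has one entry per visual word regardless of how many descriptors an image produces, which is what lets images of different content be compared by a fixed-length classifier input.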
The ECOC framework [61] combines the output of several binary SVM classifiers, with each SVM learning to discriminate one-versus-one in a pair of categories. By construction, with n (i.e., 5) categories, the ECOC will operate with s = n(n − 1)/2 SVMs.
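As a quick check of the count s = n(n − 1)/2, the sketch below enumerates the one-versus-one pairs for the five categories and combines the binary decisions by simple majority voting. The voting rule is an assumption made for illustration; an ECOC implementation may instead decode with a coding matrix and a loss function. The toy `winners` mapping is made up and does not come from trained SVMs.

```python
from collections import Counter
from itertools import combinations

CATEGORIES = ["degassing", "explosive", "explosive and effusive", "effusive", "no activity"]

# One binary learner per unordered pair of categories.
pairs = list(combinations(CATEGORIES, 2))


def decode_by_voting(pair_winners):
    """pair_winners maps each (a, b) pair to the category its binary learner picked."""
    votes = Counter(pair_winners.values())
    return votes.most_common(1)[0][0]


# Toy outcome: every learner involving "explosive" picks it,
# the remaining learners pick the first category of their pair.
winners = {p: ("explosive" if "explosive" in p else p[0]) for p in pairs}

print(len(pairs), decode_by_voting(winners))
```

With n = 5 the enumeration yields 10 pairs, matching 5 × 4 / 2.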
Once training is complete, applying the ML algorithm associates two attributes to each image of each thermal camera, thereby providing a partial description of the monitored phenomena in a specific location; a more complete description of the event can be obtained by collecting the outcomes of all thermal cameras.
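One way to picture the collection of per-camera outcomes is the sketch below. The merging rule is an assumption rather than the system's documented logic: any detected activity is reported, while a "no activity" outcome from a camera with "fuzzy" visibility is treated as inconclusive. The function name and the example values are illustrative only.

```python
from typing import Dict, List, Tuple


def merge_outcomes(outcomes: Dict[str, Tuple[str, str]]) -> Dict[str, List[str]]:
    """Combine per-camera (visibility, activity) pairs into one summary."""
    observed = set()
    uncertain = []
    for camera, (visibility, activity) in outcomes.items():
        if activity != "no activity":
            observed.add(activity)    # some camera saw this activity
        elif visibility == "fuzzy":
            uncertain.append(camera)  # nothing seen, but the view was obscured
    return {"observed": sorted(observed), "uncertain_cameras": sorted(uncertain)}


# Illustrative values: one camera with a clear view of an explosion, four obscured cameras.
example = {
    "EMOT": ("clear", "explosive"),
    "EMCT": ("fuzzy", "no activity"),
    "ESR": ("fuzzy", "no activity"),
    "ENT": ("fuzzy", "no activity"),
    "EBT": ("fuzzy", "no activity"),
}
summary = merge_outcomes(example)
print(summary)
```

Separating "nothing detected with a clear view" from "nothing detected with an obscured view" keeps the summary from overstating the absence of activity.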

Results
The training and testing phases of our system were performed using images acquired from all the thermal cameras operating on Etna during volcanic events at the summit craters in 2019. Here, we present the outcomes obtained for an eruption which started between 18th and 19th of July, for the volcanic activity of the 9th of September and for the explosive activity of the 6th of December.

The 18-19th July 2019 Eruptive Event
On 15 July, 2019, the New Southeast Crater (NSEC) of Etna was active with sporadic explosions that continued until 17 July. In the evening of 18 July, Strombolian activity became more intense, culminating with the opening of a new vent on the lower northeastern side of the cone, which produced a small lava flow flowing towards Valle del Leone (Figure 1).
An image of the eruptive activity recorded by the INGV thermal cameras at the same instant and from different perspectives is shown in Figure 4. The EMCT camera (Figure 4a) captured the small lava flow, while both the EMOT and EBT cameras (Figure 4d) clearly showed an explosion whose plume was captured by the ESR camera (Figure 4c). By applying the ML-based system, two outcomes for each thermal camera are produced, as reported in Table 4.

Thermal Camera | Visibility | Volcanic Activity
EMOT | "clear" | "explosive"
EMCT | "clear" | "effusive"
ESR | "clear" | "no activity"
ENT | "clear" | "explosive"
EBT | "clear" | "no activity"

It is worth noting that by looking at the outcomes from all the thermal cameras, it is possible to obtain a comprehensive description of the eruptive activity, which was characterized by a lava flow and Strombolian explosions. The visibility conditions for this specific time frame appear good for all the thermal cameras, thus confirming the reliability of the results.

Looking at this volcanic eruption in a wider time interval, i.e., from 18 to 21 July, 2019 (Figure 5), information on the onset and the evolution of the volcanic activity can be retrieved. The explosive activity starting in the evening of the 18th became more frequent from 20:00 UTC, and completely ended in the evening of 20 July (very few explosions). A lava flow erupted at about 23:20 UTC on 18 July and remained visible through the EMCT camera until 21:00 UTC on 21 July, at which time it started cooling.

Figure 5. Volcanic activity evolution detected by the ML algorithm from the 18th to the 21st July, 2019. "DG", "EX", "EX+EF", "EF" and "NA" stand for degassing, explosive, explosive and effusive, effusive and no activity respectively. The visibility index is given by the average of the visibility indexes for all the cameras.

The September 2019 Eruptive Event
In the first half of September 2019, Etna's volcanic activity was characterized by eruptions from the North East (NEC) and Voragine (VOR) craters. Strombolian activity was observed from the NEC on the 9th of September. In fact, despite the bad atmospheric conditions, the volcanic activity was monitored by the EMOT thermal camera. In Figure 6, the images acquired from each thermal camera are shown. By applying the proposed algorithm, two outcomes for each thermal camera were produced, as reported in Table 5.

The ESR camera has a lower frame rate than the others, i.e., 2 fps; for this case, the ESR image closest in time to the other cameras is shown. The monitored scene shows the Strombolian activity which started in the first half of September and was only just detected because of the low visibility conditions.

Table 5. Algorithm outcomes of the DT and BoW-based classifiers for the eruptive activity at NEC and VOR on 9 September, 2019.
Thermal Camera | Visibility | Volcanic Activity
EMOT | "clear" | "explosion"
EMCT | "fuzzy" | "no activity"
ESR | "fuzzy" | "no activity"
ENT | "fuzzy" | "no activity"
EBT | "fuzzy" | "no activity"

In this case, only the EMOT camera detected that explosive activity was taking place; the other cameras did not detect any activity. However, by looking at the visibility level, we note that only EMOT appears to have a good visibility level, thus yielding a reliable result. On the other hand, the critical visibility levels of the other cameras reveal uncertainty about the activity taking place in the areas they monitor: any kind of volcanic activity may or may not be present. As a consequence, for this instant, the only description that can be provided is that explosive activity was certainly occurring, but due to the critical visibility conditions, some other activities may be hidden.

If we analyze this event over time, we can get an idea of the dynamics of the activity taking place (Figure 7). Due to the low visibility conditions for all the cameras, we consider the interval from 18:26 to 23:38 UTC. We can note that initially, the explosive activity was visible only from EMOT; then, it was detected by EBT in the following hours together with EMCT at 20:38:00 UTC. The sporadic presence of class 1 and class 2 is due to the incandescent deposits which are also visible in Figure 7.

Figure 7. Volcanic activity evolution detected by the ML algorithm from 18:26 to 23:38 UTC of 9 September, 2019. "DG", "EX", "EX+EF", "EF" and "NA" stand for degassing, explosive, explosive and effusive, effusive and no activity respectively. The visibility index is given by the average of the visibility indexes for all the cameras.

The December 2019 Eruptive Event
From the last week of November 2019, a phase of degassing and ash emission started, with an isolated explosion on 1 December. From 6 December, 2019, Strombolian activity started to intensify at the NSEC. In Figure 8, the images acquired from each thermal camera are shown. Both the EMOT and ENT cameras showed the explosive activity taking place; the other cameras did not catch any activity. By applying the proposed algorithm, two outcomes for each thermal camera are produced, as reported in Table 6.

The ESR camera has a lower frame rate than the others, i.e., 2 fps; for this case, the ESR image closest in time to the other cameras is shown. The monitored scene shows the Strombolian activity which started on 6 December, 2019, and was clearly visible from the EMOT and ENT cameras.

Table 6. Algorithm outcomes of the DT and BoW-based classifiers for the eruptive activity at NSEC on 6 December, 2019.
Thermal Camera | Visibility | Volcanic Activity
EMOT | "clear" | "explosion"
EMCT | "clear" | "no activity"
ESR | "fuzzy" | "no activity"
ENT | "fuzzy" | "explosion"
EBT | "clear" | "no activity"

In this case, the explosive activity taking place was detected by both the EMOT and ENT cameras. In the area monitored by the EBT camera, no volcanic activity was detected, as confirmed by the clear conditions output by the DT classifier. Also, in this case, two cameras presented critical visibility, i.e., EMCT and ESR, and thus, nothing can be said about what was really happening in the monitored areas. If we analyze this event over time (Figure 9), we can notice that the explosions became more frequent from the early morning of 7 December (04:10 UTC) to the late morning (12:30 UTC) of the same day. Some incandescent deposits were caught by the ML-based system.

Figure 9. Volcanic activity evolution detected by the ML algorithm from 6 to 7 December, 2019. "DG", "EX", "EX+EF", "EF" and "NA" stand for degassing, explosive, explosive and effusive, effusive and no activity respectively. The visibility index is given by the average of the visibility indexes for all the cameras.

Discussion
The results provided by the ML-based system showed good agreement with the eruptive activity occurring at the summit craters of Etna in July, September and December, 2019, even when the activity was quite low in intensity, and also for the thermal camera located far from the volcano summit (e.g., ENT on 6 December). To quantitatively evaluate the accuracy of results, we computed the associated confusion matrix, which is commonly employed to measure the performance of the classifiers.
A confusion matrix is a contingency table with the same classes (our categories) in the two dimensions (namely "known" and "predicted"), which reports the number of correctly and incorrectly classified observations. Each row of the matrix represents the instances in an actual class, whereas each column represents the instances in a predicted class. For each class, an accuracy index (ACC) is computed as the ratio of the number of correct predictions to the total number of samples in that class. Furthermore, as an overall classifier index, an average accuracy (AVR) index is computed as the mean of the ACCs. ACC indices for the DT classifier and the BoW-based classifier are reported in Tables 7 and 8, respectively (i.e., the diagonal elements). The AVR index for the DT classifier is 0.9964, whereas the AVR index for the BoW-based classifier is 0.9.
The DT classifier showed a very high accuracy level, making it a valuable tool for determining the visibility conditions of each thermal camera. From the presented case studies, it is evident that this information is fundamental to obtaining a reliable description of the monitored volcanic event, especially when the outcome of the BoW-based classifier is "no activity".
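The per-class and average accuracy indices described above can be sketched in a few lines. This is a minimal illustration, assuming that ACC is the row-normalized diagonal of the confusion matrix and that AVR is the unweighted mean of the ACC values; the function names and the counts are hypothetical and are not the values reported in Tables 7 and 8.

```python
import numpy as np

def per_class_accuracy(cm):
    """ACC per class: correct predictions divided by samples of that actual class.

    Rows are actual classes, columns are predicted classes.
    """
    cm = np.asarray(cm, dtype=float)
    row_totals = cm.sum(axis=1)
    if np.any(row_totals == 0):
        raise ValueError("every actual class needs at least one sample")
    return np.diag(cm) / row_totals

def average_accuracy(cm):
    """AVR: unweighted mean of the per-class ACC values."""
    return float(per_class_accuracy(cm).mean())

# Hypothetical counts for a 3-class problem (illustrative only).
cm = np.array([
    [48,  2,  0],   # actual class 0
    [ 5, 40,  5],   # actual class 1
    [ 0, 10, 90],   # actual class 2
])

acc = per_class_accuracy(cm)  # array([0.96, 0.8, 0.9])
avr = average_accuracy(cm)    # mean of the three ACC values
```

Because AVR is an unweighted mean in this sketch, a single weak class pulls the index down regardless of how many samples that class contains.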
It is worth noting that although the visibility measure of the cameras closer to the summit area is reliable, the visibility measure of the farthest cameras, i.e., EBT and ENT, is intrinsically less reliable due to the greater distance. Thus, ambiguous cases (even for human operators) may occur more frequently with the farthest cameras.
The ACC indices for the BoW-based classifier are quite high (from 0.960 to 0.984) for the categories "explosive", "effusive", "explosive and effusive" and "no activity". The lowest value (0.620) is obtained for the category "degassing/ash emission". Indeed, plume image features may appear similar to cloudy image features, as shown by the fraction of "degassing/ash emission" images misclassified as "no activity" (0.380). This lowers the AVR index to 0.900; without this category, it would rise to 0.973.
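The weight of the "degassing/ash emission" class in the average can be checked by hand. The worked arithmetic below assumes that AVR is the unweighted mean of the five ACC values and that 0.973 is the mean of the other four; both are inferences from the reported figures, not formulas given in the text.

$$
\mathrm{AVR} = \frac{1}{5}\sum_{i=1}^{5}\mathrm{ACC}_i \approx \frac{4 \times 0.973 + 0.620}{5} = \frac{4.512}{5} \approx 0.902
$$

Under these assumptions, the result agrees with the reported AVR of about 0.9 to within the rounding of the inputs.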
Other particular cases affecting the accuracy of the classifiers are due to the nature of the data adopted, i.e., false-color RGB images. For example, the sun or moon can appear in the monitored area of some thermal cameras, affecting the results. In most cases, such images are correctly classified as "no activity", but sometimes, when the sun or moon is very close to or partially overlapped by the skyline, the image can be misclassified as belonging to the "explosive" category. Our multiperspective approach notably improves the classification success rate in these ambiguous cases. Being able to use all the thermal cameras at the same time allowed us to make the most of the available information, even in cases of low visibility, as in September 2019, when only one (EMOT) of the five cameras detected the explosive activity of NEC.
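One way to picture how the two outputs of each camera could be combined is sketched below. The merging rule is an assumption made for illustration, since the text does not specify one: a detection from any camera is kept, while a "no activity" result is trusted only when the visibility is "clear". The `CameraOutput` type, the `merge_outputs` function and the example records are hypothetical.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class CameraOutput:
    camera: str      # camera identifier, e.g. "EMOT"
    visibility: str  # DT classifier label, e.g. "clear", "fuzzy", "critical"
    activity: str    # BoW-based classifier label, e.g. "explosion", "no activity"

# Assumption: only these visibility labels make a "no activity" result trustworthy.
RELIABLE_VISIBILITY = {"clear"}

def merge_outputs(outputs):
    """Group cameras into detections, confirmed quiet views and undetermined views."""
    detections = {}
    confirmed_quiet = []
    undetermined = []
    for out in outputs:
        if out.activity != "no activity":
            detections[out.camera] = out.activity
        elif out.visibility in RELIABLE_VISIBILITY:
            confirmed_quiet.append(out.camera)
        else:
            # Poor visibility: absence of a detection says nothing about the scene.
            undetermined.append(out.camera)
    return {
        "detections": detections,
        "confirmed_quiet": confirmed_quiet,
        "undetermined": undetermined,
    }

# Illustrative records only; they are not taken from the paper's tables.
example = [
    CameraOutput("EMOT", "clear", "explosion"),
    CameraOutput("ESR", "critical", "no activity"),
    CameraOutput("EBT", "clear", "no activity"),
]
summary = merge_outputs(example)
```

Keeping the undetermined views separate from the confirmed quiet ones mirrors the point made above: a "no activity" label is informative only when the visibility level supports it.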

Conclusions
We have introduced the first multiperspective ML-based system able to detect the onset and recognize the typology of eruptive activity occurring at the summit craters of the Etna volcano. The system uses different views of the summit area acquired from the thermal cameras of the INGV monitoring network located around the volcano. The machine learning approach is built upon a decision tree to determine the scene visibility level, and a BoW-based classifier with support vector machines to recognize the type of eruptive activity as either explosive, effusive, explosive plus effusive or degassing.
The system was validated using recent eruptions at Etna, i.e., between July and December 2019. The outcomes from all the thermal cameras are considered together to obtain global and reliable information and a deeper knowledge of the monitored phenomena. From this perspective, the degree of visibility of the adopted cameras has proved crucial, allowing us to assess the quality of the partial information coming from each camera during each event. The results also showed that the Bag of Words representation can capture the main features of a volcanic eruption. Including temporal information in the algorithm allowed us to determine how the monitored activity evolved, and to improve the classification in ambiguous cases. For instance, the current version of the classifier includes incandescent deposits in the "effusive" category, since large amounts of incandescent deposits may appear very similar to lava flows. This limitation can be overcome by including temporal analyses of the thermal images, considering that lava takes much longer to cool than pyroclastic deposits.
Besides its importance for volcanic activity monitoring, the proposed tool can be used as a starting point to obtain a more comprehensive characterization of the volcanic activity taking place. Future developments will involve obtaining georeferenced information on the lava emplacement for each camera, automatically determining the main features of the eruptive activity, such as the height of the explosion, and identifying the active craters. Then, following the technique proposed in [49], ground measurements coming from all the thermal cameras will be used to compute the radiant heat flux, and hence, provide an estimation of the effusion rate for the effusive eruptions imaged.
Finally, we believe that the ML procedure developed for use on Mount Etna could be applied to other open-conduit volcanoes, e.g., Stromboli, Kilauea, Yasur, Masaya, or Ontake, provided that they are equipped with one or more fixed thermal cameras.