On Board Volcanic Eruption Detection through CNNs and Satellite Multispectral Imagery

In recent years, the growth of Machine Learning (ML) algorithms has increased the number of studies investigating their applicability in a variety of scenarios. Among these, aerospace is one of the most challenging, due to its demanding physical requirements. In this context, this work presents a feasibility study and a first prototype of an Artificial Intelligence (AI) model to be deployed on board satellites. As a case study, the detection of volcanic eruptions has been investigated as a method to swiftly produce alerts and allow immediate interventions. Two Convolutional Neural Networks (CNNs) have been designed and proposed, showing how to efficiently implement them for identifying eruptions while, at the same time, adapting their complexity to fit on board requirements.


Introduction
Nowadays, Remote Sensing (RS) is experiencing its widest expansion in terms of applicability and use cases, thanks to the extremely large availability of remotely sensed images, mostly satellite-based, which allows scientists from many different research fields to approach RS applications [1][2][3][4]. A notable example is the increasing use of this type of data in Earth science and geology for monitoring parameters which by their nature are, or may become, difficult to measure, or that require considerable time and effort to record with classical instruments.
Although most of the data processing is usually carried out on the ground, in recent years there have been some attempts to bring the computational effort, or at least part of it, on board the satellites [5]. The ultimate frontier of satellite RS is the implementation of AI algorithms on board for scene classification, cloud masking, and hazard detection; the European Space Agency (ESA) pioneered this field with the φ-sat 1 satellite, launched on 3 September 2020 [6][7]. This mission showed how AI models can help recognize excessively cloudy images, avoiding their download to the Ground Stations and therefore reducing the data transmission load [8].
The aim of this study is to investigate the possibility of using satellite images to monitor hazardous events, specifically volcanic eruptions, by means of AI techniques and on board computing resources [9]. The results achieved for the classification of volcanic eruptions could be suitable for future autonomous satellites, such as the next ones from the φ-sat program.
In the literature, many researchers have made use of satellite images, and Synthetic Aperture Radar (SAR) data in particular, to monitor ground movements in the proximity of a volcano's crater just before an eruption. In recent years, however, many scientists have switched from classical to AI techniques, as for example in [10], where the authors investigated the use of two different Deep Neural Networks (DNNs), together with a combined feature vector of linear prediction coefficients and statistical properties, for seismic event classification. Similar approaches can be found using both real [11] and simulated [12][13] Interferometric SAR (InSAR) data.
In our work, Sentinel-2 and Landsat-7 optical data have been considered and, with respect to the current state of the art, the main contributions are the following:
1. to the best of the authors' knowledge, the on board detection of volcanic eruptions through CNN approaches has never been considered in the literature so far;
2. the proposed CNN is discussed with regard to the constraints imposed by the on board implementation, meaning that the network must be optimized and modified to be consistent with the target hardware architectures;
3. the performance of the CNN deployed on the target hardware is analyzed and discussed after the execution of experimental tests.
The rest of the paper is organized as follows. Sec. 2 deals with the description of the volcanic eruptions dataset, while Sec. 3 presents the CNN models explored in this study. In Sec. 4 the on board implementation of the proposed detection approach is discussed. Results and discussion are presented in Sec. 5. Conclusions are given at the end.

Dataset
This work focuses on volcanic eruptions observed through remote sensing data, hence the first necessary step is building a suitable dataset.
Since no ready-to-use data were found to perfectly fit the specific task of this work, a specific dataset was built by using an online catalog of volcanic events including geolocalization information [14]. Satellite images acquired for the place and the date of interest were collected and labeled using the open-access Python tool presented in [15].

The volcanic eruptions catalog
The dataset has been created by selecting the most recent volcanic eruptions reported in the Volcanoes of the World (VOTW) catalog by the Global Volcanism Program of the Smithsonian Institution. This is a catalog of Holocene and Pleistocene volcanoes and of eruptions from the past 10,000 years up to today [14]. An example of the information available in the catalog is reported in Table 1. For the purpose of the dataset creation, the only useful fields are the starting date of the eruption, the geographic coordinates, and the volcano name, so these fields were extracted and stored separately.
The images used to create the dataset have been collected from Landsat-7 and Sentinel-2 products, accessed with Google Earth Engine [16]. Even though the technique described in this work is not limited to these two satellite products, the authors restricted this research to Landsat-7 and Sentinel-2 as they cover the entire period of interest. However, the same approach can be extended to other optical remote sensing products presenting the bands required for this study, i.e. blue, green, red, and the two SWIR (short-wave infrared) bands located approximately at 1650 nm and 2200 nm. The SWIR bands have been considered in order to better locate and inspect the volcanic eruptions. Indeed, volcanic eruptions can be easily located in the RGB wavelengths when they are captured by the satellite camera during the eruptive event, which however is not always the case. After the eruption, the initially red lava gets darker and darker, even though its temperature remains very high. In order to highlight high-temperature soil, temperature-sensitive bands like the infrared ones have been included.
It is worth highlighting that there are some differences between the Sentinel-2 and Landsat-7 products, both in terms of spatial resolution and of wavelength/bandwidth of the bands of interest, as shown in Table 2 and Table 3. These differences are addressed in the next sections.

Data Preparation and Manipulation
Satellite data have been downloaded with the above-cited tool [15], which allows small image patches to be downloaded automatically. For this work, the downloaded patches covered an overall area of 7.5 km².
After downloading the data, some pre-processing procedures have been applied. Firstly, the images have been resized to 512×512 pixels using the bicubic interpolation method of Python OpenCV. This procedure mitigates the difference in spatial resolution between Sentinel-2 and Landsat-7. Secondly, the infrared bands are combined with the RGB bands, in order to visually highlight the volcanic lava, regardless of its color (which is typically red during the eruption and darker a few hours later). Finally, in the experimental phase the proposed algorithm has been adapted and deployed on a Raspberry PI board with a PI camera [17]. Since the PI camera only acquires RGB data, the bands' combination became necessary to simulate RGB-like images. The bands' combination for highlighting the IR spectral response is given by [18]:

R' = α_R (R + SWIR2),  G' = α_G (G + SWIR1),  B' = α_B B

Namely, the red and green bands are mixed with the SWIR2 and SWIR1 bands, respectively, in order to enhance the pixels with high temperature. The multiplicative factor α_x is used to adjust the scale of the image and is set to 2.5. In practice, the infrared bands modify the red and green bands, so that the heat information is highlighted and visible to the human eye. In this way, it was possible to create a quantitatively correct dataset since, during the labeling, the eruptions were easily distinguishable from non-eruptions. The difference between a simple RGB image and the IR-highlighted one is shown in Figure 1.

Figure 1: True RGB color image (left) and IR-highlighted image (right).
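As a minimal sketch, the band combination described above can be implemented with NumPy as follows; the α = 2.5 factor follows the text, while the exact channel pairing (red with SWIR2, green with SWIR1) is an assumption based on the description:

```python
import numpy as np

ALPHA = 2.5  # scale factor alpha_x from the text


def highlight_ir(red, green, blue, swir1, swir2, alpha=ALPHA):
    """Mix the SWIR bands into the red/green channels to expose hot pixels.

    All inputs are 2-D float arrays with reflectance values in [0, 1].
    The pairing (red + SWIR2, green + SWIR1) is an assumption inferred
    from the description in the text.
    """
    r = np.clip(alpha * (red + swir2), 0.0, 1.0)
    g = np.clip(alpha * (green + swir1), 0.0, 1.0)
    b = np.clip(alpha * blue, 0.0, 1.0)
    return np.dstack([r, g, b])  # H x W x 3 RGB-like image
```

Clipping to [0, 1] keeps the result a valid RGB-like image after scaling, so hot pixels saturate toward bright red/green instead of overflowing.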

Dataset expansion
Since the task addressed in this paper is a typical binary classification problem, images have been downloaded to fill both the eruption and the no-eruption classes. To achieve high variability and better results, the no-eruption images have been downloaded by focusing on five subclasses: 1) non-erupting volcanoes, 2) cities, 3) mountains, 4) cloudy images and 5) completely random images. The presence of cloudy images is particularly important, as it makes the CNN learn to distinguish between eruption smoke and clouds. An example of this comparison is shown in Figure 2.
The same pre-processing steps have been applied to the new data, since in the deep learning context a homogeneous dataset is preferable for reducing the sensitivity of the model to variations in the distribution of the input data [19,20]. The final dataset contains 260 images for the eruption class and 1500 for the non-eruption class. Due to the type of event analyzed, the dataset is unbalanced, since an acquisition containing an eruption is a rare event. The problem of the imbalanced dataset is addressed in the next sections. Some samples from the dataset are shown in Figure 3.

Proposed Model
The detection task has been addressed by implementing a binary classifier where the first class is assigned to images with eruptions and the second one covers all the other scenarios. The overall CNN architecture is shown in Figure 4. The proposed CNN can be divided into two sub-networks: the first, convolutional, network is responsible for feature extraction, and the second, fully connected, network is responsible for the classification task [19][20][21].
The first sub-network consists of seven convolutional layers, each one followed by a batch normalization layer, a ReLU activation function and a max pooling layer. Each convolutional layer has a stride value equal to (1,1) and an increasing number of filters, from 16 to 512. Each max pooling layer (with size (2,2) for both kernel and stride) halves the feature map dimension. The second sub-network consists of five fully-connected layers, where each layer is followed by a ReLU activation function and a dropout layer. In this case the number of elements of each layer decreases.
In the proposed architecture the two sub-networks are connected with a global average pooling layer that, compared to a flatten layer, drastically reduces the number of trainable parameters, speeding up the training process.
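A back-of-the-envelope calculation illustrates the saving. With a 512×512 input and seven (2,2) max pooling layers, the final feature map is 4×4 with 512 channels; the pure-Python sketch below (layer sizes taken from the text, the 256-unit first fully connected layer is an assumption for illustration) compares the vector a flatten layer would feed to the classifier against global average pooling:

```python
def feature_map_side(input_side, n_pool, pool=2):
    """Spatial side length after n_pool halving max-pool layers."""
    side = input_side
    for _ in range(n_pool):
        side //= pool
    return side


side = feature_map_side(512, 7)        # 512 -> 4 after seven halvings
channels = 512                         # filters in the last conv layer (from the text)

flatten_size = side * side * channels  # vector size with a flatten layer: 8192
gap_size = channels                    # global average pooling: one value per channel

# Weight count of a hypothetical first FC layer with 256 units (unit count assumed)
fc_units = 256
params_flatten = flatten_size * fc_units
params_gap = gap_size * fc_units
ratio = params_flatten // params_gap   # 16x fewer weights with GAP
```

With these sizes, global average pooling shrinks the first fully connected layer's weight matrix by a factor of 16, which is where the training speed-up mentioned above comes from.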

Image Loader and Data Balancing
Given the nature of the analyzed hazard, the dataset is unbalanced. An unbalanced dataset, in which the number of examples of one class is much greater than that of the other, leads the model to recognize only the dominant class. To address this issue, an external function called Image Loader from the Phi-Lab ai4eo.preprocessing library has been used [22].
This library allows the user to define a much more efficient image loader than the existing Keras one. Furthermore, it is possible to implement a data augmenter that lets the user define further transformations. The most powerful feature of this library is the balancing of the dataset through oversampling: each class is weighted independently with a value depending on its number of samples.
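The exact API of ai4eo.preprocessing is not shown in the text; as a hedged, self-contained sketch of the same idea, inverse-frequency class weights and naive oversampling for the 260/1500 split described above can be written as:

```python
import random
from collections import Counter


def class_weights(labels):
    """Inverse-frequency weight per class, normalized so the dominant class has weight 1."""
    counts = Counter(labels)
    max_count = max(counts.values())
    return {cls: max_count / n for cls, n in counts.items()}


def oversample(samples, labels, seed=0):
    """Duplicate minority-class samples at random until all classes reach the same count.

    A generic stand-in for the library's balancing feature, not its actual implementation.
    """
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())
    out = list(zip(samples, labels))
    for cls, n in counts.items():
        pool = [(s, l) for s, l in zip(samples, labels) if l == cls]
        out.extend(rng.choice(pool) for _ in range(target - n))
    return out
```

With 260 eruption and 1500 non-eruption images, the eruption class receives a weight of roughly 5.8 and is duplicated until both classes contribute 1500 samples per epoch.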

Training
During the training phase, for each epoch the error between the ground truth and the prediction is computed through the loss function and minimized by the optimizer via backpropagation.
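The text does not name the loss function; assuming the standard binary cross-entropy for a two-class problem (an assumption, not stated in the text), the per-epoch error can be sketched in plain Python as:

```python
import math


def binary_cross_entropy(y_true, y_pred, eps=1e-7):
    """Mean binary cross-entropy between 0/1 labels and predicted probabilities.

    eps guards against log(0) for saturated predictions.
    """
    total = 0.0
    for y, p in zip(y_true, y_pred):
        p = min(max(p, eps), 1.0 - eps)
        total += -(y * math.log(p) + (1 - y) * math.log(1 - p))
    return total / len(y_true)
```

An uninformative prediction of 0.5 on a positive sample yields ln 2 ≈ 0.693, which is the value a randomly initialized binary classifier typically starts from.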

Going on board, a first prototype
The proposed models have been developed for carrying out an on board volcanic eruption classification. A prototype has been assembled and mounted on a drone; it was mainly composed of a Raspberry PI, a PI camera and a Movidius Stick for Deep Learning purposes [23][24][25], as shown in Figure 8.
The description of the architecture of the drone system is out of the scope of this work since the drone has only been used for simulation purposes, thus its subsystems are not included in the schematic.
The Raspberry PI, with the Raspbian Operating System (OS), is the on board computer that manages the image acquisition from the PI camera and forwards the data to the Movidius Stick for inference.

Raspberry PI
The Raspberry adopted for this use case is the Raspberry Pi 3 Model B, the earliest model of the third-generation Raspberry Pi.

Implementation on Raspberry and Movidius Stick
In order to run the experiments with the Raspberry PI and the Movidius Stick, two preliminary steps are necessary: 1) the CNN must be converted from the original format (e.g. a Keras model) to the OpenVino format, using the OpenVino library; 2) an appropriate operating system must be installed on the Raspberry (e.g. the Raspbian OS through the NOOBS installer). The implementation process is schematized in Figure 9.
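A hedged sketch of the conversion step follows; the file names and input shape are placeholders, and the exact flags vary across OpenVINO releases (older releases ship the TensorFlow Model Optimizer as mo_tf.py):

```shell
# 1) Convert the trained model (here assumed already frozen to a TensorFlow
#    .pb graph) into the OpenVINO Intermediate Representation (.xml + .bin).
#    Paths and the input shape are placeholders for illustration.
python mo_tf.py \
    --input_model frozen_model.pb \
    --input_shape "[1,512,512,3]" \
    --data_type FP16   # Myriad VPUs such as the Movidius Stick run FP16

# 2) Copy the resulting .xml/.bin pair to the Raspberry Pi; at inference
#    time the MYRIAD device is selected through the OpenVINO-enabled
#    OpenCV/Inference Engine runtime.
```

The FP16 conversion is what makes the model executable on the Movidius coprocessor, at the cost of a small loss in numerical precision.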

OpenVINO library
For deep learning, the current Raspberry PI hardware is inherently resource-constrained. The Movidius Stick allows faster inference through a deep learning coprocessor plugged into the USB socket. In order to transfer the CNN to the Movidius, the network must be optimized using Intel's OpenVINO library for hardware-optimized computer vision.
The OpenVINO toolkit is an Intel distribution and is extremely simple to use: after setting the target processor, the OpenVINO-optimized OpenCV can handle the overall setup [26].
Based on CNNs, the toolkit extends workloads across Intel hardware (including accelerators) and maximizes performance by:
• enabling deep learning inference at the edge;
• supporting heterogeneous execution across computer vision accelerators (e.g. CPU, GPU, Intel Movidius Neural Compute Stick, and FPGA) using a common API;
• speeding up time to market via a library of functions and pre-optimized kernels;
• including optimized calls for OpenCV.
After implementation on Raspberry and Movidius, the models were tested by acquiring images using a drone flying over a print made with Sentinel-2 data of an erupting volcano, as shown in Figure 10. The printed image has been processed with the same settings used for the training and validation dataset.

Results on the PC
The model performance has been computed on the testing dataset. In Figure 11, 18 samples are shown, while in Table 6 the corresponding ground truth and prediction labels are reported. A low value of the predicted label indicates a low probability of eruption, so the image is classified as no eruption; a high value indicates a high probability of eruption, so the image is classified as eruption; a value close to 0.5 indicates an essentially random prediction. Since the problem is a binary classification, a threshold of 0.5 was selected to identify the class: a value lower than 0.5 was rounded to 0 (non-eruption class), while a value higher than 0.5 was rounded to 1 (eruption class).
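The thresholding rule above amounts to a one-line decision function; the behavior at exactly 0.5 is not specified in the text, so the ≥ choice below is an assumption:

```python
def classify(pred, threshold=0.5):
    """Map a predicted eruption probability to a class label.

    Returns 1 (eruption) or 0 (non-eruption). Assigning the boundary
    value 0.5 to the eruption class is an arbitrary tie-breaking choice.
    """
    return 1 if pred >= threshold else 0


# Example predicted labels for four hypothetical test samples
labels = [classify(p) for p in (0.02, 0.48, 0.51, 0.97)]
```

For a safety-critical alerting system, the threshold could also be lowered below 0.5 to trade false alarms for fewer missed eruptions.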
The performances have been computed also for the second model, the one obtained after the pruning process. The same samples shown in Figure 11 have been used to make the comparison with the original model. Table 7 shows the results.

Results on the Movidius
The confusion matrices and the computational speed of the two models are reported in Table 8. It is evident that the prediction performance changed only slightly after the model pruning while, most importantly, the number of images per second that the model and the hardware can handle increased from 1 to 7.

Conclusions
This work aimed to present a first workflow for developing and implementing an AI model suitable for being carried on board Earth Observation satellites. In particular, two detectors of volcanic eruptions have been developed and discussed.
AI on board is a very challenging field of research, in which the first steps have been carried out by ESA to demonstrate how AI on board the satellites can be used for Earth Observation [7]. A prototype and a simulation process were realized, keeping the implementation low-cost.
The experiment and the development chain were completed with commercial, ready-to-use hardware components, and a drone was also employed for simulations and testing. The AI processor had no problem recognizing the eruption in the printed test image. The results are encouraging, showing that even the pruned model can reach good performance in detecting eruptions. Further studies will help to understand possible extensions and improvements.