Multilayer Perceptron Neural Networks Model for Meteosat Second Generation SEVIRI Daytime Cloud Masking

A multilayer perceptron neural network cloud mask for Meteosat Second Generation SEVIRI (Spinning Enhanced Visible and Infrared Imager) images is introduced and evaluated. The model is trained for cloud detection on MSG SEVIRI daytime data. It consists of a multilayer perceptron with one hidden sigmoid layer, trained with the error back-propagation algorithm. The model takes six bands of MSG data (0.6, 0.8, 1.6, 3.9, 6.2 and 10.8 μm) as input and uses 10 hidden nodes. The multilayer perceptron achieves a cloud detection accuracy of 88.96% when trained to map two predefined values that classify cloud and clear sky. The network was further evaluated using sixty MSG images taken on different dates. The network detected not only bright thick clouds but also thin or less bright clouds. The analysis demonstrated the feasibility of using machine learning models for cloud detection in MSG SEVIRI imagery.


Introduction
Remote sensing of cloud properties has largely focused on the applications of the various spectral bands of satellite sensors [1]. In remote sensing, clouds are generally characterized by lower temperature and higher reflectance in comparison with clear sky, which causes a strong reflection of the solar radiation at the top of the atmosphere (TOA) [2]. The magnitude of this effect depends on the type and physical properties of clouds and can vary strongly both in space and time [3]. Usually, a thin cloud has some physical characteristics similar to other atmospheric constituents (e.g., air pollution), while thick clouds block almost all information from the surface or near surface [2]. Accurate information about the physical and radiative properties of clouds (including cloud coverage) is therefore essential in order to determine the role of clouds in the climate system and for retrievals of surface and aerosol properties [4]. For example, an accurate determination of the cloudy and cloud-contaminated pixels can strongly affect the robustness of satellite retrievals of the aerosol optical depth (because cloud contamination can lead to large overestimation of the aerosol optical depth) [3].
During the last fifteen years, cloud analysis techniques have been considerably improved with the advent of geostationary satellite imagery such as Meteosat First Generation (MFG) [1]. The first Meteosat Second Generation (MSG) satellite was successfully launched in August 2002. MSG satellites are planned to ensure an operational service until at least 2018. The visible and infrared bands of MSG's Spinning Enhanced Visible and Infrared Imager (SEVIRI) represent a significant step ahead compared with the preceding Meteosat radiometer [5]. MSG has a higher temporal sampling (quarter hourly) in comparison with the Meteosat radiometer (half hourly). The spectral capabilities (12 spectral bands) of MSG allow an accurate cloud cover analysis (due to its 3.9 µm channel), and the computation of numerous products in cloud-free areas, such as sea and land surface temperatures, snow cover and integrated water vapor content of the atmosphere [5,6].
An initial step towards the estimation of cloud properties from satellite images is the classification of pixels into cloud-free and cloudy classes. In the literature, various methods have been proposed for the detection and the classification of clouds using visible/infrared (VIS/IR) imaging, and most of them rely on a combination of threshold tests applied to different spectral channels [7-12], for example multispectral thresholding models [9,11-14] or simple intensity threshold analysis and the IHS threshold technique [15-17] applied to individual pixels. Other models are spatial coherence methods [18,19] or dynamic cloud clustering based on histogram analysis [20,21].
Bankert (1993) obtained an overall accuracy of 79.8% for classifying 16 × 16 pixel Advanced Very High Resolution Radiometer (AVHRR) sample areas using a probabilistic neural network model [22]. A texture-based neural network classifier using only single-channel visible LANDSAT MSS imagery achieved an overall cloud identification accuracy of 93% in Lee et al. (1990) [23]. Miller and Emery (1997) presented an automated neural network cloud classifier that functions over both land and ocean backgrounds and is designed to discern different types of clouds and different types of precipitating systems from AVHRR imagery [24]. Kox et al. (2014) presented an approach for the detection of cirrus clouds and the retrieval of optical thickness and top altitude based on MSG SEVIRI imagery [25].
However, a detailed analysis of machine learning models for automatic classification of SEVIRI MSG images has not been presented so far in the literature, although machine learning models can be very competitive in terms of accuracy and speed for image classification. Starting from these motivations, the purpose of the present paper is to demonstrate the potential of the machine learning approach for fast, robust, accurate and automated SEVIRI MSG image classification without using any ancillary data.
The cloud mask product (CLM) by EUMETSAT's Meteorological Product Extraction Facility (MPEF), which is based on the narrowband channels of SEVIRI (EUMETSAT, 2007), is used as a benchmark for our comparison as it is distributed through the EUMETCast system together with the level 1.5 SEVIRI images.
This paper is structured as follows: the following section gives a brief overview of the dataset used in our study, including the characteristics of the SEVIRI instrument. The methodology is described in Section 3. Section 4 presents results and discussions, followed by conclusions in Section 5.

MSG-SEVIRI Dataset
The proposed model has been trained and tested on MSG-SEVIRI data collected in different weather conditions. The Meteosat series of satellites are geostationary meteorological satellites operated by EUMETSAT under the Meteosat transition program and the MSG program [5]. As of 2013, MSG-1 (renamed Meteosat-8), MSG-2 (Meteosat-9) and MSG-3 (Meteosat-10) were all operational [5].
The dataset for training and testing the Multilayer Perceptron (MLP) model consists of one MSG image from each of the first five days of each month, beginning with the 9:00 UTC image on the 1st and selecting an image 1 h 30 min later on each subsequent day, so that on the 5th the 15:00 UTC image is examined.
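The sampling schedule described above can be sketched in a few lines of Python. This is only an illustration of the arithmetic; the function name `sampling_times` and the year used in the example are assumptions and do not come from the paper.

```python
from datetime import datetime, timedelta

def sampling_times(year, month, first_hour=9, step_minutes=90, n_days=5):
    """Return one acquisition time per day for the first n_days of a month."""
    start = datetime(year, month, 1, first_hour, 0)
    # Each subsequent slot is one day later and step_minutes later in the day.
    return [start + timedelta(days=d, minutes=d * step_minutes)
            for d in range(n_days)]

# The year is an arbitrary choice for illustration only.
times = sampling_times(2013, 1)
for t in times:
    print(t.strftime("%Y-%m-%d %H:%M UTC"))
```

With the default arguments the fifth slot falls at 15:00 UTC on the 5th, matching the description in the text.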

EUMETSAT Cloud Mask
The cloud mask (CLM) has been distributed in GRIB format on EUMETCast since February 2013. The CLM product is a classified image with four classes (0: clear sky over ocean, 1: clear sky over land, 2: cloudy, 3: no data), which is derived from a combination of multispectral threshold tests (using almost all SEVIRI channels except channel 8, 9.7 μm) [4].
The EUMETSAT cloud mask relies on the physical principle that clouds are generally more reflective and colder than the background. However, the accuracy of the cloud detection, regardless of the method used, depends on the extent of the clouds compared to the spatial resolution of the satellite scanner.
The cloud mask algorithm is based on a multispectral threshold technique applied to each pixel of the image. At the end of the test sequences, a spatial filtering is applied in order to reclassify isolated pixels with a class type different from their neighboring pixels. Finally, pixels with measured reflectances and brightness temperatures which are close to the thresholds are flagged as low confidence [10].
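To make the three steps above concrete (per-pixel threshold tests, reclassification of isolated pixels, low-confidence flagging), the following is a deliberately simplified sketch. It is not the MPEF algorithm: it uses only two tests, and every threshold, margin and function name is a hypothetical placeholder chosen for illustration.

```python
import numpy as np

def threshold_mask(refl, bt, refl_thr=0.3, bt_thr=270.0,
                   refl_margin=0.02, bt_margin=2.0):
    """Toy two-test mask: a pixel is cloudy if it is bright OR cold.
    All numeric values are hypothetical, not operational thresholds."""
    cloudy = (refl > refl_thr) | (bt < bt_thr)
    # Measurements close to a threshold are flagged as low confidence.
    low_conf = (np.abs(refl - refl_thr) < refl_margin) | \
               (np.abs(bt - bt_thr) < bt_margin)
    return cloudy, low_conf

def reclassify_isolated(mask):
    """Flip pixels whose class differs from all 8 of their neighbours."""
    m = mask.astype(int)
    h, w = m.shape
    # Edge padding repeats border values, so border pixels are rarely flipped.
    p = np.pad(m, 1, mode="edge")
    neigh = sum(p[1 + dy:1 + dy + h, 1 + dx:1 + dx + w]
                for dy in (-1, 0, 1) for dx in (-1, 0, 1)
                if (dy, dx) != (0, 0))
    out = m.copy()
    out[(m == 1) & (neigh == 0)] = 0   # lone cloudy pixel -> clear
    out[(m == 0) & (neigh == 8)] = 1   # lone clear pixel -> cloudy
    return out.astype(bool)

refl = np.array([[0.50, 0.10], [0.31, 0.10]])
bt = np.array([[290.0, 250.0], [290.0, 290.0]])
cloudy, low_conf = threshold_mask(refl, bt)
```

An operational scheme applies many more tests with thresholds that depend on surface type, viewing geometry and ancillary data; the sketch only shows how the three stages fit together.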

An Overview of Multilayer Perceptron Neural Network
To apply a binary classification that separates cloudy and clear-sky pixels, an artificial neural network classifier has been used. Artificial neural network (ANN) algorithms classify regions of interest using a methodology that performs functions similar to those of the human brain, such as understanding, learning, solving problems and taking decisions. In its most general form, a neural network is a machine that is designed to model the way in which the brain performs a particular task or function of interest [26]. An ANN architecture consists of three types of layers. The first layer is the input layer and the number of its nodes is determined by the input parameters.
The last layer is the output layer, and the number of its nodes is given by the desired output. The layer(s) between the input and output layers are called the hidden layer(s). In most ANNs, hidden layers use non-linear activation functions for processing the data [27]. The block diagram of Figure 1 shows the model of a neuron, which forms the basis for designing artificial neural networks. The bias b_k has the effect of increasing or lowering the net input of the activation function. A neuron k can be described by the following pair of equations [26]:

$$u_k = \sum_{j=1}^{n} w_{kj} x_j$$

$$y_k = \varphi(u_k + b_k)$$

where x_1, …, x_n are the input signals, w_k1, …, w_kn are the synaptic weights of neuron k, u_k is the linear combiner output due to the input signals, b_k is the bias, φ(·) is the activation function, and y_k is the output signal of the neuron.
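The neuron model above, in which the weighted sum of the inputs forms the linear combiner output u_k and the output is y_k = φ(u_k + b_k), can be sketched as follows. The function name, the example numbers and the choice of tanh as default activation are assumptions made for illustration.

```python
import numpy as np

def neuron_output(x, w, b, phi=np.tanh):
    """Output of a single neuron k for input signals x."""
    u = np.dot(w, x)     # linear combiner output u_k
    return phi(u + b)    # output signal y_k = phi(u_k + b_k)

# Arbitrary example values, not taken from the paper.
x = np.array([0.5, -1.0, 2.0])   # input signals x_1..x_n
w = np.array([0.2, 0.4, 0.1])    # synaptic weights w_k1..w_kn
y = neuron_output(x, w, b=0.1)
```

Here the weighted sum is -0.1, so the bias of 0.1 shifts the net input to approximately zero, which illustrates how b_k raises or lowers the input of the activation function.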

Methods
In this study, a Multilayer Perceptron (MLP) has been used, a topology that has been found to be the best suited for pixel-level classification [26]. The MLP model is a feed-forward artificial neural network classifier. The connections between perceptrons in an MLP are forward, and every perceptron is connected to all the perceptrons in the next layer, except for the output layer, which directly gives the result. In most cases a non-linear activation function is applied to the data and the result is used as input to the next layer, up to the output layer. The MLP uses back-propagation for training the network [28-32].
In our study, the hyperbolic tangent function (the most commonly used activation function in the sigmoid category) has been used as the activation function. The hyperbolic tangent ranges from −1 to +1, so the activation function is anti-symmetric with respect to the origin. Allowing an activation function of the sigmoid type to assume negative values has analytic benefits [26].
In the proposed model, the visible bands at 0.6 and 0.8 μm, the near-infrared band at 1.6 μm, the infrared band at 3.9 μm, the water vapor channel at 6.2 μm and the infrared band at 10.8 μm have been used, because these channels carry the most relevant information for daytime cloud detection [33].
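As an illustration of a feed-forward MLP trained with back-propagation, the following minimal NumPy sketch trains a network with one tanh hidden layer by full-batch gradient descent on a mean-square-error loss. It is not the authors' implementation: the weight initialization, learning rate, number of epochs, tanh output unit and the synthetic two-class data are all assumptions made for the example.

```python
import numpy as np

def init_mlp(n_in, n_hidden, n_out, rng):
    """Small random weights and zero biases (an assumed initialization)."""
    return {"W1": rng.normal(0, 0.5, (n_in, n_hidden)), "b1": np.zeros(n_hidden),
            "W2": rng.normal(0, 0.5, (n_hidden, n_out)), "b2": np.zeros(n_out)}

def forward(p, X):
    H = np.tanh(X @ p["W1"] + p["b1"])   # hidden layer activations
    Y = np.tanh(H @ p["W2"] + p["b2"])   # output layer activations
    return H, Y

def backprop_step(p, X, T, lr=0.2):
    """One gradient-descent update; returns the MSE before the update."""
    H, Y = forward(p, X)
    err = Y - T
    loss = np.mean(err ** 2)
    # Back-propagate the error; d tanh(z)/dz = 1 - tanh(z)^2.
    dZ2 = (2.0 * err / err.size) * (1.0 - Y ** 2)
    dZ1 = (dZ2 @ p["W2"].T) * (1.0 - H ** 2)
    p["W2"] -= lr * (H.T @ dZ2)
    p["b2"] -= lr * dZ2.sum(axis=0)
    p["W1"] -= lr * (X.T @ dZ1)
    p["b1"] -= lr * dZ1.sum(axis=0)
    return loss

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 6))                        # synthetic 6-feature samples
T = np.where(X[:, :1] + X[:, 1:2] > 0, 1.0, -1.0)    # synthetic +1/-1 targets
params = init_mlp(6, 10, 1, rng)
losses = [backprop_step(params, X, T) for _ in range(300)]
```

The targets +1 and −1 stand in for two predefined class values; any pair inside the range of tanh would work in the same way.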
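For reference, the properties of the hyperbolic tangent mentioned above can be written out explicitly. The derivative identity in the last line is a standard result and is one reason the function is convenient for back-propagation; it is added here as background and is not taken from the paper.

$$\varphi(v) = \tanh(v) = \frac{e^{v} - e^{-v}}{e^{v} + e^{-v}}, \qquad -1 < \varphi(v) < 1$$

$$\varphi(-v) = -\varphi(v)$$

$$\varphi'(v) = 1 - \varphi(v)^{2}$$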
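One possible way to turn the six selected bands into per-pixel input vectors for the network is sketched below. The dictionary layout, the function name `stack_features` and the per-band standardization are assumptions; the paper does not state how the inputs were scaled.

```python
import numpy as np

# Central wavelengths (in micrometres) of the six input bands.
BANDS_UM = (0.6, 0.8, 1.6, 3.9, 6.2, 10.8)

def stack_features(band_images):
    """Map {wavelength: 2-D array} to an (n_pixels, 6) feature matrix."""
    cube = np.stack([band_images[b] for b in BANDS_UM], axis=-1)
    X = cube.reshape(-1, len(BANDS_UM)).astype(float)
    # Per-band standardization is an assumed preprocessing step.
    return (X - X.mean(axis=0)) / (X.std(axis=0) + 1e-12)

# Random arrays stand in for real SEVIRI band images.
rng = np.random.default_rng(1)
images = {b: rng.random((3, 4)) for b in BANDS_UM}
features = stack_features(images)
```

Each row of `features` then corresponds to one pixel and each column to one band, in the order given by `BANDS_UM`.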
Visible channels are essential for daytime cloud detection (particularly important for the detection of low clouds along with IR 3.9 μm channel) as well as Rayleigh and aerosol scattering (haze, smoke, dust, pollen, etc.) [3,10,11].The signatures of clouds are similar in the two visible channels (0.6 and 0.8 μm), but the reflection of the sea surface is higher at 0.6 μm while land surfaces appear brighter at 0.8 μm [3].Therefore, the two visible channels 0.6 and 0.8 μm have been used in order to have high contrast between cloud and surface targets.
In order to discriminate between snow, ice and water cloud, the NIR channel (1.6 μm) has been used. The water vapor channel (6.2 μm) has been used for temperature determination of thin clouds [3]. The IR 10.8 μm channel is essential for measuring sea and land surface temperatures and cloud-top temperatures, and also for the detection of cirrus clouds and volcanic ash clouds [34]. The brightness temperature in the IR 10.8 μm channel is more strongly affected by clouds than by any other parameter (radiation cooling in the diurnal cycle, atmospheric absorption variation, change in surface emissivity, radiometer calibration) [34].

Discussion
Cloud boundary determination is a difficult task due to the nature of the atmosphere with the complex interaction between clouds and aerosols and due to the uncertainties of the instrumental measurements used for their detection [10,11,33].
The MLP ANN classifier was assessed for cloud detection in MSG-SEVIRI data. Overall, the data set consists of 500,000 samples (pixels) for training and testing the model, which were extracted from the full disk of MSG SEVIRI data. Sixty percent of the data set was selected for the training set, and the remainder was reserved for the test set (the classification evaluation), which is never used for training the networks. The network training was repeated 15 times, and the network producing the best accuracy was then selected; this implies that the retained model resides within the best 14% of the distribution of all possible models with approximately 90% confidence, since 1 − (1 − 0.14)^15 ≈ 0.90 for 15 independent training runs [35].
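The split into training and test samples described above can be sketched as a random permutation followed by a 60/40 cut. The function name, the fixed seed and the small stand-in arrays are assumptions for illustration; the paper does not describe how the split was drawn.

```python
import numpy as np

def split_train_test(X, y, train_fraction=0.6, seed=0):
    """Randomly split samples into disjoint training and test sets."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(X))
    n_train = int(round(train_fraction * len(X)))
    tr, te = idx[:n_train], idx[n_train:]
    return X[tr], y[tr], X[te], y[te]

# A small stand-in for the 500,000-sample data set.
X_all = np.arange(1000).reshape(-1, 1)
y_all = np.arange(1000) % 2
X_tr, y_tr, X_te, y_te = split_train_test(X_all, y_all)
```

Applied to 500,000 samples, the same call would give 300,000 training and 200,000 test samples, with no sample appearing in both sets.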
The number of hidden nodes was optimized by trial and error from two to twenty. The outputs of the models consist of three classes: cloud, not cloud and outer space. Figure 2 presents the mean square error (MSE) of the testing set as a function of the number of hidden nodes. Considering generalization, time load, and accuracy, the network with 10 nodes has a lower MSE than the others. An accuracy assessment (accuracy is defined here as the percentage of successful MLP cloud detections) has been carried out in order to assess the classifiers more appropriately. From our MSG-SEVIRI dataset, 60 images that had not been used for extracting training or test pixels were randomly selected and classified by the neural network model with 10 nodes in the hidden layer. For each of the 60 images, 688,947 pixels (5% of each image) were randomly selected and then labeled by visual interpretation. The same procedure was used to calculate the accuracy of the MPEF CLM (for each of the 60 images).
Visual inspection shows that the machine learning classifier achieves acceptable detection results under a variety of conditions in comparison with the results of the EUMETSAT cloud mask (CLM) (Figure 3). The accuracy over the whole test dataset classified by the MLP is 88.96%, with a standard deviation of 1.68%; the average commission error (samples committed to the wrong class) and the average omission error (samples omitted from the right class) are 3.88% and 11.04%, respectively. The average accuracy computed for the MPEF CLM is 86.10%, with a standard deviation of 2.47%, and the average commission and omission errors are 7.27% and 13.90%, respectively.
An improvement of 2.86 percentage points in accuracy has been obtained on the dataset classified by the MLP with respect to the same dataset classified by the MPEF CLM model. Moreover, the MLP classifier generated a lower commission error than the MPEF CLM, which can be attributed to the specialization of the MLP model on daytime imagery, whereas the MPEF CLM model is very general (it has been designed for images at all times of day). The results of the accuracy assessment applied to the different models are displayed in Table 1.
It is necessary to identify the situations where the proposed approach generates poor accuracy and to understand why the method failed to work correctly in those cases. The lowest accuracy obtained with the proposed model is 85%. The major sources of uncertainty for our cloud mask are the spatial and temporal variability of the surface reflectance (e.g., due to changes in atmospheric aerosols or vegetation) and undetected thin cirrus clouds with low visible reflectance (which have a signature similar to clear sky).
Due to the construction of the assessment dataset, it is possible that misclassifications by the other ANN models or by the human visual interpretation could be included in the assessment dataset, and this error might influence the results in Table 1. The CALIOP cloud detection can be used (for extracting the training and testing dataset and for the accuracy assessment) to overcome these drawbacks [36].
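The random selection of 5% of the pixels of an image can be sketched as below. The image size of 3712 × 3712 pixels is an assumption that is not stated in the text; it is the size of a SEVIRI full-disk image in the non-HRV channels and is consistent with the quoted figure of 688,947 pixels. The seed and variable names are illustrative.

```python
import numpy as np

# Assumed full-disk image size (non-HRV SEVIRI channels); not given in the text.
n_rows = n_cols = 3712
n_pixels = n_rows * n_cols

# 5% of the image, truncated to a whole number of pixels.
n_sample = int(0.05 * n_pixels)

rng = np.random.default_rng(0)
# Draw pixel indices without replacement so no pixel is labeled twice.
flat_idx = rng.choice(n_pixels, size=n_sample, replace=False)
rows, cols = np.unravel_index(flat_idx, (n_rows, n_cols))
```

The resulting `rows` and `cols` arrays give the image coordinates of the pixels to be labeled and compared against each cloud mask.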
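The accuracy, commission error and omission error of a cloud mask against reference labels can be computed from the confusion counts, as in the sketch below. The definitions used here are one common convention, evaluated for the cloud class; the paper does not spell out its exact formulas or how the per-image values were averaged, so the function and its normalization are assumptions.

```python
import numpy as np

def cloud_mask_scores(pred, truth):
    """Accuracy, commission and omission errors (in %) for the cloud class."""
    pred = np.asarray(pred, dtype=bool)
    truth = np.asarray(truth, dtype=bool)
    tp = np.sum(pred & truth)      # cloudy pixels detected as cloudy
    fp = np.sum(pred & ~truth)     # clear pixels committed to the cloud class
    fn = np.sum(~pred & truth)     # cloudy pixels omitted from the cloud class
    tn = np.sum(~pred & ~truth)    # clear pixels detected as clear
    accuracy = 100.0 * (tp + tn) / pred.size
    commission = 100.0 * fp / (tp + fp) if (tp + fp) else 0.0
    omission = 100.0 * fn / (tp + fn) if (tp + fn) else 0.0
    return accuracy, commission, omission

# Tiny made-up example, only to show the call.
acc, com, om = cloud_mask_scores([1, 1, 0, 0, 1], [1, 0, 0, 1, 1])
```

Repeating this for each evaluated image and taking `np.mean` and `np.std` over the per-image accuracies would give summary statistics of the kind reported in the text.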

Conclusions
In this study, cloud detection in Meteosat Second Generation SEVIRI images was performed by a multiple-layer perceptron neural network trained with the back-propagation algorithm. Nineteen multiple-layer perceptrons were separately trained using six Meteosat Second Generation SEVIRI bands (0.6, 0.8, 1.6, 3.9, 6.2 and 10.8 μm). The network with 10 nodes in the hidden layer was selected as the best topology based on its generalization, time load, and accuracy, with an accuracy of 88.96% for the testing set. An improvement of 2.86 percentage points in accuracy has been obtained on the dataset classified by the multiple-layer perceptron neural network with respect to the same dataset classified by EUMETSAT's Meteorological Product Extraction Facility.
Cloud detection by the proposed approach can be completed in about 20 s on a personal computer with a 2.2 GHz Intel Pentium dual-core processor and 2 GB of RAM, which is much faster than some existing methods in the literature. This might significantly reduce the computational burden when large data sets need to be processed.
The analysis demonstrated the advantage of the multiple-layer perceptron neural network algorithm over the original cloud mask provided by EUMETSAT. The proposed model detected not only bright thick clouds but also thin or less bright clouds. The comparison of the results of the multiple-layer perceptron applied to Meteosat Second Generation SEVIRI images with the cloud mask provided by EUMETSAT's Meteorological Product Extraction Facility demonstrates the feasibility of using multiple-layer perceptron neural networks for Meteosat Second Generation SEVIRI cloud detection.
An important characteristic of the multiple-layer perceptron neural network model is that it does not use ancillary data such as sea surface temperature maps or NWP temperature and humidity profiles. The six Meteosat Second Generation SEVIRI bands are the only information used in the multiple-layer perceptron model. Thus, the use of multiple-layer perceptron neural networks as a cloud detection method is useful in cases where there is not enough ancillary data. Nevertheless, the multiple-layer perceptron may be further improved by increasing the size and diversity of the training and testing sets, and by systematically testing other artificial neural network topologies and training algorithms. The efficiency of the method will also have to be assessed for larger datasets.

Figure 1. Nonlinear model of a neuron.

Figure 2. Training mean square error (MSE) as a function of the number of hidden nodes using Meteosat Second Generation (MSG)-Spinning Enhanced Visible and Infrared Imager (SEVIRI) data.

Figure 3. First column shows the band combination result R = 10.8 μm, G = 1.6 μm, B = 0.6 μm of Meteosat Second Generation SEVIRI images over Europe, multiple-layer perceptron neural network classifier result (second column) and cloud mask products by EUMETSAT's Meteorological Product Extraction Facility (third column). In this figure, black represents cloud-free pixels and white represents cloudy pixels.

Table 1. Minimum, maximum and average values of the accuracies for different models (in %).