Performance of Using Cascade Forward Back Propagation Neural Networks for Estimating Rain Parameters with Rain Drop Size Distribution

The aim of our study is to estimate the parameters M (water content), R (rain rate) and Z (radar reflectivity) from the raindrop size distribution by using the neural network method. Our investigations have been conducted in five African localities: Abidjan (Côte d'Ivoire), Boyele (Congo-Brazzaville), Debuncha (Cameroon), Dakar (Senegal) and Niamey (Niger). First, we have predicted the values of the various parameters in each locality using neural models (LANN) which have been developed with locally obtained disdrometer data. We have shown that each LANN can be used under other latitudes to get satisfactory results. Secondly, we have also constructed a model using as training data a combination of data issued from all five localities. With this last model, called PANN, we could obtain satisfactory estimates for all localities. Lastly, we have distinguished between stratiform and convective rain while building the neural networks. In fact, using simulation data from stratiform rain situations, we have obtained smaller root mean square errors (RMSE) between neural values and disdrometer values than using data issued from convective situations.


Introduction
The most important property of the Earth's atmosphere is the great variability of its parameters [1,2]. Scientists around the world are searching for theoretical and practical means to investigate these variations in order to conceive models which are able to predict the evolution of climate. Taking advantage of technological progress in several domains, the meteorological instruments used for the measurement of these parameters are now very reliable and provide direct, point measurements. Nevertheless, the installation of this equipment in a given region, for a spatio-temporal survey, needs enormous human and financial means. This constitutes a handicap in mastering the spatio-temporal evolution of the atmospheric parameters in these regions, particularly in developing countries like the sub-Saharan ones.
The use of atmospheric remote sensing with the help of radars and satellites can be considered as a good alternative or complement to the ground-based equipment. The characterization of the atmosphere can be realized everywhere, but the measurements are not direct. Because the physical parameter is measured indirectly, data obtained in situ must be taken into account in the calibration process of the instrument [3].
Remote sensing has paved the way for the development of powerful applications in the domain of atmospheric physics in order to understand the complexity of some atmospheric perturbations. However, the most important difficulty encountered lies in the different relationships issued from the calibration. That is the case when measurements are processed on convective systems with meteorological radars. In this example, the aim is to find a relationship between the radar reflectivity Z (dBZ) and the rain rate R (mm·h⁻¹) (Z-R relationship). The observed variability of precipitation is due to the raindrop size distribution, whose instability is attributable to factors such as the speed, collision or agglomeration of rain drops [4,5].
An automatic adjustment of the Z-R relationship dependent on the type of precipitation is extremely difficult to put in place in operational conditions [6]. The diversity of the raindrop size distribution of precipitations and the different phases of the hydrometeors, solid or liquid, have an influence on this relationship. There are a lot of relationships between Z and R [7-12].
The use of only one relationship cannot generally represent the natural variability of precipitations. For example (see Figure 1), for one value of R, one can get several corresponding values of Z and vice versa.
Several studies have been conducted for the modelling of the size of rain drops under different latitudes [13]. Many theoretical distribution functions have been evaluated, but they are not easy to manipulate [6]. The disdrometer measures the rain drop size distribution automatically and continuously.
In this study, we want to adapt the neural network technique to the Rain Drop Size Distribution (RDSD) of precipitations in order to conceive a model which could take into account the distinct types of rain perturbations under different latitudes. The liquid water content (M), the rain rate (R) and the radar reflectivity (Z) are considered as outputs.
The most important property of neural networks is their ability to adapt, which is deduced from samples. This characteristic provides tools which are able to solve highly non-linear relationships [14]. The neural network is an alternative tool for modelling and describing complex relationships between physical or technical parameters [15]. It is considered as an "engine" which delivers, for its entries, an "answer" that is inaccessible or not easily accessible with existing analytical methods. In the following work, we describe briefly the theory of neural networks, presenting the Feed Forward Back Propagation (FFBP) and the Cascade Forward Back Propagation (CFBP) models. We will then situate the five African localities which were subject to our experimental work and where the collection and the classification of the rain drops were realized using the disdrometer. Lastly, we will describe how the different neural network models have been constructed, trained and used for the estimation of the water content, the rain rate and the radar reflectivity. We present the results in the form of curve comparisons and calculated root mean square errors (RMSE).

Basic Principles
An artificial neural network is organized into several layers. One layer contains some neurons which are connected to those of the following layer. Each connection is weighted. A neuron is described by its own activation level, which is responsible for the propagation of the information from the input layer to the output layer. However, to obtain reliable weights, the neural network must first learn from known input and output samples. During the learning process, an error between theoretical and experimental outputs is computed. The weight values are then modified through an error back propagation process which is executed on several sampling data, until the error is as small as possible. After this last step, the neural network can be considered as trained and able to be used in calculating responses to new entries that have never been presented to the network. It is important to emphasize that the learning speed of the neural network depends not only on the architecture but also on the algorithm used.

Model of a Neuron in a Neural Network
The neuron is considered as an elementary cell in which some computational operations are realized (see Figure 2a). It is composed of an integrator (∑) which calculates the weighted sum of the entries. The activation level n of this integrator, with b as bias, is transformed by a transfer function f in order to produce an output a. The matrix representation of the neuron is shown in Figure 2b, with S set equal to unity to signify that there is only one neuron [16]. The R entries of the neuron correspond to the vector $p = [p_1, p_2, \dots, p_R]^T$, while $w = [w_{1,1}, w_{1,2}, \dots, w_{1,R}]^T$ represents the weight vector of the neuron. The output n of the integrator is then described by Equation (1):

$$n = \sum_{j=1}^{R} w_{1,j}\, p_j - b = w^T p - b \qquad (1)$$

When the weighted sum reaches or exceeds the bias b, the argument of the function f becomes zero or positive; otherwise, it is negative [16,17].
The output of the neuron is given by Equation (2):

$$a = f(n) = f(w^T p - b) \qquad (2)$$
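The neuron model described above can be illustrated with a short Python sketch. It assumes the sign convention n = wᵀp − b, which is suggested by the remark that the argument of f is zero or positive once the weighted sum reaches the bias b. The function names `neuron_output` and `hard_limit`, and the numerical values, are hypothetical and chosen only for illustration.

```python
import math


def hard_limit(n):
    """Threshold transfer function: 1 if n >= 0, else 0 (illustrative choice of f)."""
    return 1.0 if n >= 0 else 0.0


def neuron_output(p, w, b, f=math.tanh):
    """Return (n, a) for one neuron with R entries.

    n = w^T p - b is the activation level (assumed sign convention),
    a = f(n) is the neuron output.
    """
    if len(p) != len(w):
        raise ValueError("p and w must both have R elements")
    n = sum(w_j * p_j for w_j, p_j in zip(w, p)) - b
    return n, f(n)


# Illustrative values only (not taken from the study).
p = [1.0, 2.0, 3.0]
w = [0.5, -1.0, 0.5]
n, a = neuron_output(p, w, b=0.5, f=hard_limit)
print(n, a)
```

With these values the weighted sum is 0, so with b = 0.5 the activation level is negative and the threshold function returns 0; lowering the bias to −0.5 makes the argument positive and the output becomes 1.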

Construction of a Neural Network and its Learning Process
To build a neural network, it is sufficient to combine the neural layers. Each layer has its own weight matrix $W^k$, where k designates the index of the layer. Thus, the vectors $b^k$, $n^k$ and $a^k$ are associated with the layer k. To specify the neural network structure, the number of layers and the number of neurons in each layer must be chosen. The learning step is a dynamic and iterative process which consists in modifying the parameters of the network after receiving the inputs from its environment. The learning type is determined by the way the change of the parameters occurs [14]. In almost all neural architectures encountered, the learning results in the synaptic modification of the weights connecting one neuron to the other. If $w_{ij}(t)$ designates the weight connecting the neuron i to its entry j at time t, a change $\Delta w_{ij}(t)$ of the weights can simply be expressed by Equation (3):

$$\Delta w_{ij}(t) = w_{ij}(t+1) - w_{ij}(t) \qquad (3)$$

where $w_{ij}(t+1)$ represents the new value of the weight $w_{ij}$.
A set of well-defined rules that realizes a weight adaptation process is called a learning algorithm of the neural network. There are a lot of rules that can be used. We can enumerate, among others: the method of learning by error correction, the so-called Least Mean Square (LMS) method; the feed forward back propagation in a multilayer neural network; and the method of the back propagation of error sensitivities. In our study, we focus on a method based on a small modification of the feed forward back propagation method.

The Feed Forward Back Propagation Method Used by a Multilayer Neural Network
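Let us now consider the multilayer neural network (composed of M layers) whose synoptic graphical representation is given in Figure 3 for three layers. The equation describing the outputs of the layer k is:

$$a^k = f^k\left(W^k a^{k-1} - b^k\right), \quad k = 1, \dots, M \qquad (4)$$

where $a^0 = p$ is the entry-vector. Given sample-data combinations of entries and outputs $\{(p_q, d_q)\}$, $q = 1, \dots, Q$, where $p_q$ designates an entry-vector and $d_q$ a desired output-vector, we can forward propagate at each instant t an entry-vector p(t) through the neural network in order to get an output-vector a(t). The error e(t) produced by the network calculation for an entry, given the corresponding desired output d(t), is:

$$e(t) = d(t) - a(t) \qquad (5)$$

The performance function F, whose minimization minimizes the mean square error, is defined by the following expression:

$$F(X) = E\left[e^T(t)\, e(t)\right] \qquad (6)$$

where E[ ] and X are respectively the mean and the vector grouping the set of the weights and the biases of the neural network. F is approximated by the instantaneous error:

$$\hat{F}(X) = e^T(t)\, e(t) \qquad (7)$$

The steepest descent gradient method is used to optimize X with the help of the following equations:

$$\Delta w_{ij}^k(t) = -\eta\, \frac{\partial \hat{F}}{\partial w_{ij}^k}, \qquad \Delta b_i^k(t) = -\eta\, \frac{\partial \hat{F}}{\partial b_i^k} \qquad (8)$$

where η designates the learning rate of the neural network. To calculate the partial derivatives of $\hat{F}$, the chain rule for composite functions is used:

$$\frac{\partial \hat{F}}{\partial w_{ij}^k} = \frac{\partial \hat{F}}{\partial n_i^k} \times \frac{\partial n_i^k}{\partial w_{ij}^k}, \qquad \frac{\partial \hat{F}}{\partial b_i^k} = \frac{\partial \hat{F}}{\partial n_i^k} \times \frac{\partial n_i^k}{\partial b_i^k} \qquad (9)$$

The activation levels $n_i^k$ of the layer k depend directly on the weights and biases of this layer, and can be expressed by the following relation:

$$n_i^k = \sum_{j} w_{ij}^k\, a_j^{k-1} - b_i^k \qquad (10)$$

so that the second terms of Equation (9) are $\partial n_i^k / \partial w_{ij}^k = a_j^{k-1}$ and $\partial n_i^k / \partial b_i^k = -1$. Now, for the first terms of Equation (9), we define the sensitivity $s_i^k$ of $\hat{F}$ in relation to the changes of the activation level $n_i^k$ of the neuron i belonging to the layer k by the equation:

$$s_i^k \equiv \frac{\partial \hat{F}}{\partial n_i^k} \qquad (11)$$

The expressions of $\Delta w_{ij}^k(t)$ and $\Delta b_i^k(t)$ become:

$$\Delta w_{ij}^k(t) = -\eta\, s_i^k(t)\, a_j^{k-1}(t), \qquad \Delta b_i^k(t) = \eta\, s_i^k(t) \qquad (12)$$

Equation (12) governs the modification of the weights and biases in the iterative learning steps of the neural network presented in Section 4.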

Localities and Data Used for the Study
Our study is based on RDSD data. These data have been collected under different latitudes in sub-Saharan Africa (see Figure 4). Their main characteristics are grouped in Table 1. The disdrometer used for the data collection is a mechanical one and was developed by Joss and Waldvogel [18,19]. This instrument produces the RDSD minute by minute. The technical description of its functions is given in [18-21].
The principle of data processing of the disdrometer consists in classifying the rain drops according to their diameter over successive one-minute intervals. During the collection period, the one-minute data are used to compute rain parameters such as the total number of rain drops, the distribution of the rain drop size, the rain rate, the radar reflectivity, the liquid water content and the median volume diameter. The control parameters of the RDSD are calculated from different observed drop-diameter moments [10] and change with the particular moments used [22-26]. The parameters M, R and Z exploited in our study can be expressed in a generic form as follows:

$$P = a_p \int_0^{\infty} D^p\, N(D)\, dD \qquad (13)$$

where N(D) is the number of rain drops per unit diameter interval and per unit volume (mm⁻¹·m⁻³), D (mm) is the diameter of the equivalent sphere, and $a_p$ is a coefficient which depends on the type of parameter considered and on the chosen units, as shown in Table 2. Disdrometer observations normally have large errors for small and for large drop sizes [27]. The exponent p in Equation (13) follows from the definition of the parameters; Z and R are respectively defined as the moments of order 6 and 3.67 of the diameter D [28].
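Since the disdrometer delivers N(D) per diameter class, the generic moment form of the rain parameters can be approximated by a sum over classes. The sketch below is only an illustration: the function name `dsd_moment`, the two-class spectrum and the bin widths are hypothetical. The coefficients used in the example are also assumptions, since Table 2 is not reproduced here: a_p = 1 for Z (mm⁶·m⁻³), and a_p = (π/6)·10⁻³ for M (g·m⁻³), which corresponds to a water density of 10⁻³ g·mm⁻³ and a moment of order 3.

```python
import math


def dsd_moment(diameters_mm, n_d, bin_widths_mm, p, a_p):
    """Approximate P = a_p * integral(D^p * N(D) dD) by a sum over diameter classes.

    diameters_mm  : class-centre diameters D (mm)
    n_d           : N(D) for each class (mm^-1 m^-3)
    bin_widths_mm : class widths dD (mm)
    """
    if not (len(diameters_mm) == len(n_d) == len(bin_widths_mm)):
        raise ValueError("all inputs must have one value per diameter class")
    return a_p * sum(
        (d ** p) * n * width
        for d, n, width in zip(diameters_mm, n_d, bin_widths_mm)
    )


# Hypothetical two-class spectrum, for illustration only.
D = [1.0, 2.0]
N = [100.0, 10.0]
dD = [0.5, 0.5]

Z = dsd_moment(D, N, dD, p=6, a_p=1.0)                   # assumed a_p for Z
M = dsd_moment(D, N, dD, p=3, a_p=math.pi / 6 * 1e-3)    # assumed a_p for M
print(Z, M)
```

The rain rate R would use p = 3.67 with a coefficient that depends on the assumed fall-speed law, so it is left out of the example.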

Methodology
The modelling process of a neural network requires specifying the number of entries R as well as the number of neurons in the output layer. Both are defined by the specifications of the problem to be solved. The CFBP (Cascade Forward Back Propagation) model (see Figure 5) is one of the artificial neural network types used for the prediction of new output data [29-31]. Taking into consideration the preceding analyses, we summarize the methodology used in the learning process that we have implemented.
1. Initialize the weights with small random values;
2. For each combination $(p_q, d_q)$ in the learning sample:
• Propagate the entries $p_q$ forward through the neural network layers:
$$a^0 = p_q; \qquad a^k = f^k\left(W^k a^{k-1} - b^k\right), \quad k = 1, \dots, M \qquad (14)$$
• Back propagate the sensitivities through the neural network layers:
$$s^M = -2\, \dot{F}^M(n^M)\left(d_q - a^M\right); \qquad s^k = \dot{F}^k(n^k)\left(W^{k+1}\right)^T s^{k+1}, \quad k = M-1, \dots, 1 \qquad (15)$$
where $\dot{F}^k(n^k)$ denotes the diagonal matrix of the derivatives of $f^k$ evaluated at $n^k$;
• Modify the weights and biases:
$$\Delta W^k = -\eta\, s^k \left(a^{k-1}\right)^T; \qquad \Delta b^k = \eta\, s^k, \quad k = 1, \dots, M \qquad (16)$$
3. If the stopping criteria are reached, then stop; if not, permute the presentation order of the combinations built from the learning database, and begin again at Step 2.
The diagram describing the different steps of the learning process of the neural network is given in Figure 6.
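The three learning steps listed above can be sketched for a network with one hidden layer. This is a minimal illustration under stated assumptions, not the implementation used in the study: it uses NumPy, a tanh hidden layer, a linear output layer, the sign convention n = Wa − b, the instantaneous error eᵀe, zero initial biases, and a fixed number of epochs as the stopping criterion. The names `train_ffbp` and `predict` are hypothetical.

```python
import numpy as np


def train_ffbp(P, D, hidden=20, eta=0.01, epochs=200, seed=0):
    """Sample-by-sample back propagation for a 2-layer network (sketch).

    P : (Q, R) array of entry-vectors, D : (Q, S) array of desired outputs.
    """
    rng = np.random.default_rng(seed)
    R, S = P.shape[1], D.shape[1]
    # Step 1: small random weights (biases start at zero, an assumption).
    W1 = rng.normal(0.0, 0.1, (hidden, R))
    b1 = np.zeros(hidden)
    W2 = rng.normal(0.0, 0.1, (S, hidden))
    b2 = np.zeros(S)
    for _ in range(epochs):
        # Step 3: permute the presentation order at every pass.
        for q in rng.permutation(len(P)):
            # Step 2a: forward propagation.
            a0 = P[q]
            a1 = np.tanh(W1 @ a0 - b1)
            a2 = W2 @ a1 - b2                     # linear output layer
            # Step 2b: back propagation of the sensitivities.
            s2 = -2.0 * (D[q] - a2)               # derivative of linear f is 1
            s1 = (1.0 - a1 ** 2) * (W2.T @ s2)    # derivative of tanh is 1 - a^2
            # Step 2c: weight and bias updates (dn/db = -1, hence the + sign).
            W2 -= eta * np.outer(s2, a1)
            b2 += eta * s2
            W1 -= eta * np.outer(s1, a0)
            b1 += eta * s1
    return W1, b1, W2, b2


def predict(params, P):
    """Forward pass of the trained network for all rows of P."""
    W1, b1, W2, b2 = params
    return np.tanh(P @ W1.T - b1) @ W2.T - b2
```

A call such as `params = train_ffbp(P, D, hidden=20)` followed by `predict(params, P_new)` mirrors the train-then-estimate use described in the text; the learning rate and epoch count shown here are placeholders that would need tuning on real data.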

Description of the Databases and their Use in the Neural Models
The classes of diameters are used as entries of the neural network models in order to estimate the water content, the rain rate and the radar reflectivity factor. Thus, the neural networks are constituted of 25 entries, corresponding to the 25 classes, and of one output. Each sample presented to the entry of the neural network corresponds to the one-minute values calculated by the disdrometer.
On the basis of the retrieval work, we then have two databases that we have named "Base X" and "Base Y". They are each made up of 23,126 samples. Each sample from the first database is a vector of 25 elements, one per diameter class. The neural model is a cascade forward back propagation (CFBP) one, with a hidden layer of 20 neurons (Figure 7a,b). We develop for each locality a model which will be applied to the other localities, before building another model which integrates all five localities. This last model will be applied to all five localities.
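To make the architecture concrete, the forward pass of a cascade-forward network with 25 entries, 20 hidden neurons and one output can be sketched as follows. The sketch assumes the usual cascade-forward layout, in which the output layer receives the hidden-layer output and, in addition, a direct weighted connection from the entries. The tanh hidden layer, the linear output, the sign convention for the biases and the names `cfbp_forward` and `W_skip` are assumptions made for illustration, since Figure 7 is not reproduced here.

```python
import numpy as np


def cfbp_forward(x, W1, b1, W2, W_skip, b2):
    """Forward pass of a one-hidden-layer cascade-forward network (sketch).

    x      : (25,)    entry-vector (one value per diameter class)
    W1, b1 : (20, 25), (20,)  hidden layer
    W2     : (1, 20)  hidden-to-output weights
    W_skip : (1, 25)  direct entry-to-output weights (the cascade connection)
    b2     : (1,)     output bias
    """
    h = np.tanh(W1 @ x - b1)
    return W2 @ h + W_skip @ x - b2


# Random weights and a random entry-vector, for illustration only.
rng = np.random.default_rng(0)
x = rng.random(25)
W1, b1 = rng.normal(0.0, 0.1, (20, 25)), np.zeros(20)
W2, W_skip, b2 = rng.normal(0.0, 0.1, (1, 20)), rng.normal(0.0, 0.1, (1, 25)), np.zeros(1)
y = cfbp_forward(x, W1, b1, W2, W_skip, b2)
print(y.shape)
```

Setting `W_skip` to zero reduces the sketch to an ordinary feed-forward network, which is the only structural difference assumed here between the FFBP and CFBP forms.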

Construction and Use of the Different Neural Networks
The databases are available for the five studied localities, and two approaches are taken into consideration. First, we create for each locality three LANN models corresponding to the three parameters $Y_k$ (k = 1, 2, 3). The first 5000 entries and the first 5000 desired outputs (except for Niamey, with only 4468 sample data) are used for the training of the neural networks, and the following 5000 sample data (from the 5001st to the 10,000th) have served as test data. After the training, we have obtained acceptable performance curves and have stored the LANN, which we designate by the index i to indicate their role of retrieving the parameter $Y_i$. Each LANN i of a given locality is then used to predict the parameters over the other localities. In this case, we have used as LANN inputs the entries obtained in the locality subjected to the predictive process. For the samples from the 5001st to the 10,000th, we estimate the outputs and compare them with the experimental values available in this locality.
Secondly, we build for each parameter ($Y_1$, $Y_2$, $Y_3$) a polyvalent model (PANN): we extract from the database of each locality the first 1200 sample data and constitute a training set composed of 6000 training data combinations. Each PANN i (i indicating its role of retrieving the parameter $Y_i$) is then used to predict the parameters over all localities. In what follows, we use the designations from Table 3 and the notations M, R and Z, respectively, for the variables Y1, Y2 and Y3.
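The two data-selection schemes described above can be sketched with simple slicing. The helper names `lann_split` and `pann_training_set` are hypothetical, and the locality keys follow the A, B, C, D, N designations used in the text. How localities with fewer than 10,000 samples are handled is not specified here, so the sketch simply truncates.

```python
def lann_split(X, Y, n_train=5000, n_test=5000):
    """Return ((X_train, Y_train), (X_test, Y_test)) for one locality.

    The first n_train samples train the LANN; the next n_test samples
    (5001st to 10,000th by default) serve as test data.
    """
    n_train = min(n_train, len(X))
    train = (X[:n_train], Y[:n_train])
    test = (X[n_train:n_train + n_test], Y[n_train:n_train + n_test])
    return train, test


def pann_training_set(localities, n_per_locality=1200,
                      order=("A", "B", "C", "D", "N")):
    """Concatenate the first n_per_locality samples of each locality.

    localities maps a key to a pair (X, Y). Entries and desired outputs
    are appended in the same locality order so that they stay aligned.
    """
    X_train, Y_train = [], []
    for key in order:
        X, Y = localities[key]
        X_train.extend(X[:n_per_locality])
        Y_train.extend(Y[:n_per_locality])
    return X_train, Y_train
```

With five localities and 1200 samples each, `pann_training_set` returns the 6000 training combinations mentioned in the text.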

Capability of a Neural Network to Estimate the Parameters in Various Other Localities
We have estimated the values of the three parameters (M, R and Z) over five localities with a LANN which has been trained only with the experimental RDSD values measured in Debuncha (Cameroon). For example, after using the first 500 training samples of this locality, we predicted in Abidjan (Côte d'Ivoire) 200 values for each parameter, corresponding to the sample data from the 4801st to the 5000th, and have obtained the approximations presented in Figure 8. The dots correspond to the experimental values and the squares to the estimated ones.
The different approximations can be considered as very good because the values measured by the disdrometer are practically the same as those estimated by the LANN.

Estimation of the Values of M, R and Z with the PANN
We have shown that a LANN which has been trained with disdrometer parameters issued only from a given locality, in order to estimate the parameters M, R and Z, was also able to estimate the same parameters in another locality which is very different from the first. To improve this type of estimation, we have created another ANN, which we qualify as polyvalent (PANN), and which takes into account during its development data issued from all the localities where it will be used for prediction. In our concrete case, we have put together the first 1200 sampling data of each locality in order to obtain the training values. The trained PANN has been used to estimate 5000 values of M, R and Z over all five localities, namely those from the 1201st to the 6200th.

Evaluation of the Root Mean Square Errors (RMSE)
We have evaluated the root mean square errors produced by the LANN and the PANN estimations by comparing the estimates to the disdrometer measurements. Equation (17) has been used:

$$RMSE = \sqrt{\frac{1}{N} \sum_{i=1}^{N} E_i^2} \qquad (17)$$

where $E_i$ is the error (observed value minus neural network estimate) and N is the number of observed values. We present in Table 4 the RMSE between values measured by the disdrometer and those estimated by the different LANN.
If the estimates over a given locality are delivered by a LANN trained with data issued from another locality, N is equal to 5000 (except for the estimations realized in Niamey (Niger), where only 4468 values were available). The estimations produced locally by the LANN concern the sample data from the 5001st to the 10,000th. Table 5 gives the two best LANN for the retrieval of the parameters M, R and Z.
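The RMSE between disdrometer values and neural estimates can be computed as in the following sketch. The function name `rmse` and the numerical values are hypothetical; the error of each sample is taken as the observed value minus the network estimate.

```python
import math


def rmse(observed, estimated):
    """Root mean square error between observed values and network estimates."""
    if len(observed) != len(estimated) or len(observed) == 0:
        raise ValueError("need two non-empty sequences of equal length")
    squared = [(o - e) ** 2 for o, e in zip(observed, estimated)]
    return math.sqrt(sum(squared) / len(squared))


# Illustrative values only (not taken from the study).
measured = [1.2, 3.4, 0.8, 5.0]
predicted = [1.0, 3.9, 0.7, 4.6]
print(rmse(measured, predicted))
```

The same function applies unchanged to M, R or Z, provided both sequences are expressed in the same units.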

Table 4. Comparisons between RMSE after use of the different LANN (A, B, C, D and N).
Considering the results obtained in Tables 4 and 5, we observe that the best LANN for the estimation in a locality is logically the one which has been developed with data issued from that same locality, and that the LANN of Debuncha (Cameroon) is in almost all cases able to predict the parameters in other localities.

RMSE between Values Measured by the Disdrometer and Estimates Delivered by the PANN
We have constructed a polyvalent ANN model (PANN) with data issued from the five localities and have chosen the data-gathering order A+B+C+D+N. The training entries and the desired outputs are constituted by the first 1200 sampling data from A, then the first 1200 from B, from C, from D and, lastly, from N. We have calculated and presented in Table 6 the RMSE between values measured by the disdrometer and those estimated by the PANN. According to these results, the use of the PANN yields the best RMSE values (Table 6). However, it is to be noted that, when juxtaposing the training data, the order of the localities must also be respected in the fusion of the desired outputs. The RMSE calculated and presented in Tables 4 and 6 concern rainfall in general, without a distinction between stratiform and convective rains. This has led us to obtain very small RMSE values. Differentiating between these two rain types enables understanding the importance of the contributions brought by each type during the rain period. For this purpose, we have constituted two value classes: the first database corresponds to the values M < 410, R < 10 and Z < 10,810 (stratiform situation), and the second database corresponds to the values M ≥ 410, R ≥ 10 and Z ≥ 10,810 (convective situation). A separate evaluation of the RMSE gives the results presented in Tables 7-10.
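The separate evaluation for the two rain types amounts to splitting the samples at a threshold on the observed value and computing one RMSE per class. The sketch below illustrates this for one parameter. The helper names, the sample values and the use of 10 as the threshold for the rain rate are assumptions for illustration; the text lists the three thresholds without naming the parameter each one applies to, and the assignment is inferred from the M, R, Z ordering used throughout.

```python
import math


def split_by_threshold(observed, estimated, threshold):
    """Split (observed, estimated) pairs into below-threshold and at-or-above-threshold classes."""
    pairs = list(zip(observed, estimated))
    below = [(o, e) for o, e in pairs if o < threshold]       # stratiform class
    above = [(o, e) for o, e in pairs if o >= threshold]      # convective class
    return below, above


def rmse_pairs(pairs):
    """RMSE over a list of (observed, estimated) pairs; NaN if the class is empty."""
    if not pairs:
        return float("nan")
    return math.sqrt(sum((o - e) ** 2 for o, e in pairs) / len(pairs))


# Illustrative rain rates only (not taken from the study).
R_obs = [2.0, 8.0, 15.0, 40.0]
R_est = [2.5, 7.0, 18.0, 36.0]
stratiform, convective = split_by_threshold(R_obs, R_est, threshold=10.0)
print(rmse_pairs(stratiform), rmse_pairs(convective))
```

In this made-up example the convective class has the larger RMSE simply because its values and errors are larger, which is the kind of contrast the separate evaluation is meant to expose.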

• Stratiform Rainfall
Table 7 gives the RMSE produced by the five LANN while estimating the parameters over all five localities. Here, only the stratiform rain is considered.
Table 8 presents the RMSE produced by the PANN in the five localities. Here, only the stratiform situation is considered. Observing the results obtained in Tables 7 and 8, we note that the RMSE, when considering only the stratiform situation, remains very small. This indicates that this rain type has been dominant during the rain period. Comparing Tables 7 and 8, respectively, to Tables 4 and 6, we see that the values remain of the same order of magnitude.

• Convective Rainfall
We have only considered the data issued from the convective rainfall and calculated the RMSE produced by the LANN while estimating the parameters M, R and Z over each locality. Table 9 gives these errors for the five LANN and Table 10 presents them for the PANN. Considering the results from Tables 9 and 10, and comparing them respectively to Tables 7 and 8, we observe that the RMSE concerning the convective situation is higher.

Conclusions
Our study concerns the estimation of water content (M), rain rate (R) and radar reflectivity (Z). For this purpose, we have used RDSD data measured by a disdrometer installed in five sub-Saharan African localities. We have developed and trained for each locality a LANN with the disdrometer data of the same locality and have used the trained LANN to estimate the parameters M, R and Z over all localities. We have also developed a polyvalent ANN model (PANN) whose training was realized with a combination of data issued from all the localities where it will be used for estimating the parameters.
We have, first of all, considered the precipitations in general before taking into account their stratiform and convective structures. To assess the capability of the developed and trained LANN and PANN to estimate the parameters, we have calculated the RMSE produced by them during their use over the distinct localities. Observing the different results and comparisons obtained, we have found that the PANN is very capable of predicting, over all the localities, the values of the parameter for which it has been developed.
The RMSE produced by the PANN, while considering stratiform structures, is generally lower than those induced while considering the convective ones.After our investigation, we can thus consider the LANN and the PANN as models, which are able to predict rainfall values under different latitudes.
In areas of low coverage measurement networks of meteorological parameters, as in our study area (sub-Saharan Africa), the implementation of a PANN in a chain of radar measurements for rainfall could provide better understanding and describe the precipitations independently of their form and locality of occurrence.
The neural network technique is known to be difficult to manipulate. However, the implementation of such a technique in modern data acquisition instruments like rain radars may constitute a very interesting method to be investigated. Putting in place such a technology could permit a simultaneous measurement of the DSD and the computing of rain parameters.
In our future works, we will also conduct a sensitivity study that consists of evaluating the significance of the RMSE between disdrometer measurement, rain gauge observations and NN estimates taking into account the disdrometer errors caused by rain drop size limitation during the measurement process.
Figure 2 (legend). R: number of elements in the input-vector; S = 1: number of neurons in the layer.

Figure 3. An example of synoptic representation of a three-layer neural network in a feed forward back propagation form.

Figure 4. Location of the data collection sites.

Figure 6. Diagram of the learning process of the neural network.
Each sample of Base X is a vector of 25 elements $[X_1, X_2, \dots, X_{25}]^T$. Base Y is a matrix of 23,126 × 3 dimensions whose three columns represent three different vectors, $Y_1$, $Y_2$ and $Y_3$. These three vectors are the water content (M (g·m⁻³)), the rain rate (R (mm·h⁻¹)) and the radar reflectivity (Z (mm⁶·m⁻³)), each retrievable individually from the X database. With each sample j of the first database is then associated a real number $Y_{kj}$ (k = 1, 2, 3; j = 1, 2, …, 23,126). In this work, we have to retrieve these parameters. The different neural networks have as entries the parameters $X_i$ (i = 1, 2, …, 25) and as output the parameter $Y_k$ (k = 1, 2, 3).

Figure 7. ANN models: (a) description of the model; (b) the CFBP model, which was used for our simulations.

Figure 8. Neural prediction in Abidjan (Côte d'Ivoire) of the liquid water content M (a), the rain rate R (b) and the radar reflectivity Z (c) by a LANN constructed with data issued only from Debuncha (Cameroon).
$E_i$: Error (observed value − neural network estimate); and N: number of observed values.
RMSE between Measured Parameters in the Localities and Estimates Delivered by the LANN A, B, C, D and N

Table 1. Database and localities studied in sub-Saharan Africa.
Town (Country) | Collection Period of the RDSD | Total Number of the RDSD (min)
Abidjan (Ivory Coast) | June; September to December: 1986 |

Table 2. Values of the exponent factor p and of the multiplicative factor $a_p$ in relation to the parameter studied ($P = a_p \int_0^{\infty} D^p\, N(D)\, dD$).

Table 3. Model designation for each locality.

Table 5. Choice of the two best LANN for retrieval of the parameters in the different localities (A, B, C, D and N).

Table 6. RMSE between values measured by the disdrometer and those estimated by the PANN.

Table 7. Comparisons between RMSE after use of the different LANN: case of stratiform rains.
Computation of the RMSE after Distinction of Stratiform and Convective Rains

Table 8. RMSE between values measured by the disdrometer and those estimated by the PANN: case of the stratiform rains.

Table 9. Comparisons between RMSE after using the different LANN: case of the convective rainfall.

Table 10. RMSE between values measured by the disdrometer and those estimated by the PANN: case of the convective rainfall.