Fuzzy Linear Regression of Rainfall-Altitude Relationship †

Classical linear regression has been used to measure the relationship between rainfall and altitude at different meteorological stations, in order to evaluate a linear relation. The rainfall values are taken as the dependent variable and the elevation of each station as the independent variable. It has long been known that a statistical relationship exists between annual rainfall and station elevation, which in many cases is linear, as is the one examined in this article. However, classical linear regression makes rigid assumptions about the statistical properties of the model, treating the error terms as random variables, and violation of these assumptions could affect the validity of the classical linear regression. Fuzzy regression assumes ambiguous and imprecise parameters and data, and for this reason it may be more effective than classical regression. In this paper we evaluate the relationship between annual rainfall and station elevation for the meteorological stations of Thessaly, using fuzzy linear regression with trapezoidal membership functions. In this possibilistic model the independent measured elevations are crisp, while the dependent observed rainfall values and the parameters of the model are fuzzy.


Introduction
The purpose of constructing a model in engineering is always to maximize its usefulness. This aim is closely connected with the relationship among three key characteristics of every systems model: complexity, credibility, and uncertainty. Our purpose in systems modelling is to estimate an optimal level of uncertainty for each modelling problem. In 1965 Zadeh [1] introduced the theory of fuzzy sets, in which the fuzziness of a system incorporates all types of uncertainty, in contrast with probability theory, which is capable of representing only one of several distinct types of uncertainty.
In hydrology, rainfall measurement models have been extensively used in the design process of water resource projects, for tasks such as hydrological prediction, spillway design, climatic change studies, and rainfall-runoff correlation. Rainfall measurements in a specific area are commonly displayed in the form of time series, where recorded values can be either continuous or discrete. In many instances, there is a correlation between rainfall data that belong to different stations, comprising measurements of differing range, and the elevations of those stations. Correlation analysis is used to depict the relation between the independent variable (usually the elevation of the meteorological stations) and the dependent variable (the rainfall data of the meteorological stations).
In classical linear regression, the difference between measured values and estimated values is a random variable with normal distribution and is considered to be caused by measurement errors. Upper and lower bounds of the estimated value are calculated, and the probability that the estimated value lies between them represents the estimation confidence. Classical regression is therefore probabilistic. It has many uses, but it becomes problematic if the data set is small, if it is hard to prove that the error distribution is normal, if the relation between dependent and independent variables is fuzzy, or if the assumption of linearity is not appropriate [2].
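As a minimal sketch of the classical approach described above, the following fits a least-squares line and reports a rough band of plus or minus two residual standard deviations around the estimate. The station data, the function names `ols_fit` and `band`, and the choice of `k = 2` are illustrative assumptions, and the band is only an approximation of a proper prediction interval.

```python
from math import sqrt

def ols_fit(xs, ys):
    """Least-squares line y = b0 + b1*x and residual standard deviation."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b1 = sxy / sxx
    b0 = mean_y - b1 * mean_x
    sse = sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(xs, ys))
    sigma = sqrt(sse / (n - 2))  # n - 2: two fitted parameters
    return b0, b1, sigma

def band(b0, b1, sigma, x, k=2.0):
    """Rough lower and upper bounds of the estimate at x (centre +/- k*sigma)."""
    centre = b0 + b1 * x
    return centre - k * sigma, centre + k * sigma

# Hypothetical elevations (m) and annual rainfall (mm), not the paper's data.
elevation = [100.0, 250.0, 400.0, 700.0, 900.0]
rainfall = [600.0, 640.0, 790.0, 880.0, 1100.0]

b0, b1, sigma = ols_fit(elevation, rainfall)
low, high = band(b0, b1, sigma, 500.0)
print(f"rainfall = {b0:.1f} + {b1:.4f} * elevation, sigma = {sigma:.1f}")
print(f"rough band at 500 m: [{low:.1f}, {high:.1f}]")
```

With only a handful of stations, the normality of the residuals cannot really be checked, which is the weakness of the probabilistic reading noted above.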
Nowadays, new regression models based on fuzzy logic have been introduced [3-13]. In fuzzy regression, the difference between measured values and estimated values is attributed to the inherent fuzziness of the system as well as to the fuzziness of input and output data. In contrast with classical regression analysis, fuzzy regression analysis uses fuzzy functions for the regression coefficients and usually falls into one of three cases, according to whether the measured input and output values are crisp or fuzzy.
Trapezoidal membership function models have been used by [14,15,19-24]. Charfeddine [19] extended Tanaka's method to the case of trapezoidal membership functions with crisp measured input and output values. She used a fuzzy function described by four crisp level functions. For two of them she used Tanaka's method [8], whereas for the other two she used classical linear regression, which yields a fitted function $\hat{y}(x)$ and the standard deviation $\sigma$ of the model. Based on this, the latter two level functions become $\hat{y}(x) - \lambda\sigma$ and $\hat{y}(x) + \lambda\sigma$, with $\lambda$ being an adjustment factor. Ganesan and Veeramani [22] described a fuzzy linear programming problem with symmetric trapezoidal fuzzy parameters. They proved fuzzy analogues of some important theorems of linear programming and gave a numerical example leading to a solution of fuzzy linear programming problems. Fung et al. [25] proposed a new model for asymmetric trapezoidal fuzzy parameters. Kumar and Kaur [24] presented a new method, called Mehar's method, suitable for solving fuzzy linear programming problems with trapezoidal fuzzy parameters. They showed that this method is easier to apply to fuzzy linear programming problems than existing methods. Kheirfam and Verdegay [23] extended the dual simplex method to fuzzy linear programming with symmetric trapezoidal fuzzy parameters. They studied the limits within which values may vary while the fuzzy optimal solution remains invariant.
In this article, a possibilistic model (the Tzimopoulos et al. model) is described, in which membership functions are trapezoidal, measured input values are crisp and measured output values are triangular fuzzy numbers [13]. In this model, a rather simple two-step method was used, with one crisp measured input value and one fuzzy measured output value. In step 1, measured input and output values were considered crisp, while parameters were considered fuzzy, and supports were estimated using Tanaka's method. In step 2, the supports estimated in step 1 were taken as the known kernel of the trapezoidal membership functions and, in the inclusion constraints, were moved to the known terms. The supports of the trapezoidal membership functions of the estimate were then calculated. Triangular membership functions were used for the measured output values.
Further, we present an application of the above model to a hydrological problem in the region of Central Thessaly (Greece), using rainfall data and elevations from twenty rainfall measurement stations of the region.

Mathematical Model
According to the kernel inclusion constraint mentioned above, the kernel of each experimental output value lies within the kernel of the corresponding estimated value. According to Shapiro et al. (2009), for the case of crisp output values (only the kernels of the measured values), applying Tanaka's method (Tanaka 1987) yields a range that encircles the kernels of the experimental values. At this stage, only triangular membership functions are applied to the parameters in Tanaka's method, and the possibilistic model used is $\tilde{Y} = \tilde{A}_0 + \tilde{A}_1 x$, with fuzzy intercept $\tilde{A}_0$, fuzzy slope $\tilde{A}_1$ and crisp input $x$. The estimated values in this step are thus determined by solving Tanaka's linear programming problem, and its solution gives the lower and upper bounds of the enclosing range. Since they meet the same constraints, these bounds coincide with the kernel of the trapezoidal membership functions.
Support inclusion is applied next: the support of each measured fuzzy output value must lie within the support of the corresponding estimated value. Based on relations (3a, 3b, 3c) and (4a, 4b), the support inclusion constraints contain the kernel bounds, which are known since they were calculated during step 1. The problem thus turns into a second linear programming problem, in which only the supports of the trapezoidal membership functions remain to be determined.
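One way to write the two linear programming problems described above is sketched here. The notation is an assumption and is not taken from relations (3a)-(4b): $N$ stations with crisp elevations $x_j \ge 0$, measured rainfall kernels $y_j$ with spreads $e_j$, symmetric triangular parameters with centres $r_0, r_1$ and spreads $c_0, c_1$ in step 1, and left and right support spreads $c_0^{\ell}, c_1^{\ell}, c_0^{r}, c_1^{r}$ in step 2. Step 1 is written in the usual Tanaka form for confidence level $h = 0$, which is also an assumption.

```latex
% Step 1: Tanaka-type problem with crisp outputs (assumed notation)
\begin{align}
\min_{r_0, r_1, c_0, c_1}\; & N c_0 + c_1 \sum_{j=1}^{N} x_j \\
\text{s.t.}\; & (r_0 + r_1 x_j) - (c_0 + c_1 x_j) \le y_j, \\
              & (r_0 + r_1 x_j) + (c_0 + c_1 x_j) \ge y_j, \qquad j = 1, \dots, N, \\
              & c_0 \ge 0, \quad c_1 \ge 0 .
\end{align}

% Bounds obtained in step 1, taken as the known kernel in step 2
\begin{align}
y_k^{\ell}(x) &= (r_0 - c_0) + (r_1 - c_1)\, x, \\
y_k^{r}(x)    &= (r_0 + c_0) + (r_1 + c_1)\, x .
\end{align}

% Step 2: support inclusion for the fuzzy outputs (y_j, e_j)
\begin{align}
\min_{c_0^{\ell}, c_1^{\ell}, c_0^{r}, c_1^{r}}\;
  & N c_0^{\ell} + c_1^{\ell} \sum_{j=1}^{N} x_j
    + N c_0^{r} + c_1^{r} \sum_{j=1}^{N} x_j \\
\text{s.t.}\; & y_k^{\ell}(x_j) - (c_0^{\ell} + c_1^{\ell} x_j) \le y_j - e_j, \\
              & y_k^{r}(x_j) + (c_0^{r} + c_1^{r} x_j) \ge y_j + e_j, \qquad j = 1, \dots, N, \\
              & c_0^{\ell},\; c_1^{\ell},\; c_0^{r},\; c_1^{r} \ge 0 .
\end{align}
```

If the coefficients 20 and 10926 that appear in the objectives of the application are read as $N$ and $\sum_j x_j$, the two objectives above reduce to the forms quoted there.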

Generalities
Given the rainfall and elevation data of twenty meteorological stations of central Thessaly (Greece) (Tables 1 and 2), with crisp input data and symmetric fuzzy output data, we assess the two-step model of Tzimopoulos et al. using the fuzzy regression method, with trapezoidal membership functions for the estimate.

Step 1
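We apply Tanaka's model, considering that the output data are crisp. The model is written as follows: $\min\,(20\,c_0 + 10926\,c_1)$, subject to the constraints that the estimated range contains every measured value, where $c_0$ and $c_1$ denote the spreads of the fuzzy intercept and slope. Solving this problem we get the kernel equations:
$y^{\ell} = 529.493 + 0.42706x$, $y^{r} = 800.989 + 0.79956x$ (3b)
Notation: e denotes the spread of the rainfall data, which are symmetrical triangular fuzzy numbers.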
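The step 1 problem can be illustrated on a small scale. The sketch below assumes the standard Tanaka formulation at confidence level h = 0 with symmetric triangular parameters (centres `r0`, `r1`, spreads `c0`, `c1`) and non-negative inputs. It solves the linear program by brute-force enumeration of constraint vertices, a technique chosen here only to keep the example dependency-free, not the method used in the paper, and a dedicated LP solver would be preferable for real data. The station values and the names `solve_linear` and `tanaka_fit` are hypothetical.

```python
from itertools import combinations

def solve_linear(a, b):
    """Solve the square system a z = b by Gauss-Jordan elimination.
    Returns None when the system is (numerically) singular."""
    n = len(b)
    m = [list(row) + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            return None
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [u - f * v for u, v in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]

def tanaka_fit(xs, ys):
    """Minimise N*c0 + c1*sum(x) so that every crisp y_j lies inside
    [r0 + r1*x - (c0 + c1*x), r0 + r1*x + (c0 + c1*x)], with c0, c1 >= 0.
    Unknowns are z = (r0, r1, c0, c1); inputs are assumed non-negative."""
    rows, rhs = [], []
    for x, y in zip(xs, ys):
        rows.append([1.0, x, -1.0, -x]); rhs.append(y)      # lower bound <= y
        rows.append([-1.0, -x, -1.0, -x]); rhs.append(-y)   # upper bound >= y
    rows.append([0.0, 0.0, -1.0, 0.0]); rhs.append(0.0)     # c0 >= 0
    rows.append([0.0, 0.0, 0.0, -1.0]); rhs.append(0.0)     # c1 >= 0
    cost = [0.0, 0.0, float(len(xs)), float(sum(xs))]
    best = None
    # A bounded LP attains its optimum at a vertex, i.e. at a point where
    # four independent constraints hold with equality.
    for idx in combinations(range(len(rows)), 4):
        z = solve_linear([rows[i] for i in idx], [rhs[i] for i in idx])
        if z is None:
            continue
        feasible = all(sum(a * v for a, v in zip(row, z)) <= b + 1e-7
                       for row, b in zip(rows, rhs))
        if feasible:
            value = sum(c * v for c, v in zip(cost, z))
            if best is None or value < best[0]:
                best = (value, z)
    return best

# Hypothetical elevations (m) and annual rainfall (mm), not the paper's data.
elevation = [100.0, 250.0, 400.0, 700.0, 900.0]
rainfall = [600.0, 640.0, 790.0, 880.0, 1100.0]

objective, (r0, r1, c0, c1) = tanaka_fit(elevation, rainfall)
print(f"lower bound: {r0 - c0:.3f} + {r1 - c1:.5f} x")
print(f"upper bound: {r0 + c0:.3f} + {r1 + c1:.5f} x")
```

The two printed lines play the role of the kernel equations above: the lower and upper bounds of the range that contains every measured value.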

Step 2
In step 2, the kernel is considered to be known and the model is written as follows: $\min\,(20\,c_0^{\ell} + 10926\,c_1^{\ell} + 20\,c_0^{r} + 10926\,c_1^{r})$. The calculation results are shown in Figure 3, while the support equations are: $y^{\ell} = 450.069 + 0.3629x$, $y^{r} = 887.100 + 1.20226x$. Figure 4a illustrates the estimated output data and Figure 4b the measured and predicted values at the Megali_Kerasia_Ypex meteorological station, while Figure 3a,b illustrate the predicted fuzzy intercept $\tilde{A}_0$ and fuzzy slope $\tilde{A}_1$, respectively.
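To make the reported equations concrete, the sketch below evaluates the kernel and support bounds at a given elevation and computes the trapezoidal membership value of a rainfall amount. The coefficients are the ones quoted in steps 1 and 2. Reading the first equation of each pair as the left bound and the second as the right bound is an assumption, as are the function names and the restriction to non-negative elevations.

```python
def kernel_bounds(x):
    """Kernel (membership 1) bounds of estimated rainfall at elevation x."""
    return 529.493 + 0.42706 * x, 800.989 + 0.79956 * x

def support_bounds(x):
    """Support (membership > 0) bounds of estimated rainfall at elevation x."""
    return 450.069 + 0.3629 * x, 887.100 + 1.20226 * x

def membership(y, x):
    """Trapezoidal membership of rainfall y at elevation x (x >= 0 assumed)."""
    s_lo, s_hi = support_bounds(x)
    k_lo, k_hi = kernel_bounds(x)
    if y <= s_lo or y >= s_hi:
        return 0.0
    if k_lo <= y <= k_hi:
        return 1.0
    if y < k_lo:
        return (y - s_lo) / (k_lo - s_lo)   # rising left side
    return (s_hi - y) / (s_hi - k_hi)       # falling right side

x = 500.0  # hypothetical elevation in metres
print("support:", support_bounds(x))
print("kernel: ", kernel_bounds(x))
print("membership of 1000 mm:", membership(1000.0, x))
print("membership of  700 mm:", membership(700.0, x))
```

For non-negative elevations the support bounds lie outside the kernel bounds, so the four lines define a valid trapezoid at every station.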
In fuzzy regression analysis, the three cases usually met are: (a) crisp input values and crisp output values (CICO); (b) crisp input values and fuzzy output values (CIFO); and (c) fuzzy input values and fuzzy output values (FIFO). In all three cases, the estimated values are fuzzy. Most of the writers cited above have used triangular fuzzy functions for the formulation of the problem. However, the need to use trapezoidal functions arises for the following reasons [14]: (a) we need to optimize the fuzziness of the model and (b) we need to restrict the experimental data inside the estimated value range. Using trapezoidal membership functions for the estimated values [14-18] allows us to achieve inclusion of output data $\tilde{Y}_j$ with triangular membership functions in estimated values $\hat{Y}_j$ with trapezoidal membership functions at confidence level $h = 1$, for which the kernel of the estimate does not shrink to a point: $[\tilde{Y}_j]_{h=1} \subseteq [\hat{Y}_j]_{h=1}$. In addition, at confidence level $h = 0$ we can achieve the inclusion $[\tilde{Y}_j]_{h=0} \subseteq [\hat{Y}_j]_{h=0}$. Due to the linearity of the membership functions, inclusion at these two levels ensures inclusion at every level of confidence: $[\tilde{Y}_j]_{h} \subseteq [\hat{Y}_j]_{h}$, $\forall h \in [0,1]$. Consequently, trapezoidal fuzzy parameters have a higher capability to model varieties of fuzziness than triangular fuzzy parameters.
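The inclusion argument can be checked numerically. The sketch below computes h-cuts of a symmetric triangular number (centre `m`, spread `e`) and of a trapezoidal number (support `[a, d]`, kernel `[b, c]`), then compares the check at h = 0 and h = 1 with a check over a grid of levels. All numbers and function names are hypothetical illustrations.

```python
def trapezoid_cut(a, b, c, d, h):
    """h-cut of a trapezoidal number with support [a, d] and kernel [b, c]."""
    return a + h * (b - a), d - h * (d - c)

def triangle_cut(m, e, h):
    """h-cut of a symmetric triangular number with centre m and spread e."""
    return m - (1.0 - h) * e, m + (1.0 - h) * e

def included(inner, outer, tol=1e-9):
    """True when the interval inner lies inside the interval outer."""
    return outer[0] - tol <= inner[0] and inner[1] <= outer[1] + tol

def included_at_ends(tri, trap):
    """Inclusion checked only at h = 0 (support) and h = 1 (kernel)."""
    return all(included(triangle_cut(*tri, h), trapezoid_cut(*trap, h))
               for h in (0.0, 1.0))

def included_on_grid(tri, trap, steps=100):
    """Inclusion checked at steps + 1 evenly spaced levels in [0, 1]."""
    return all(included(triangle_cut(*tri, i / steps),
                        trapezoid_cut(*trap, i / steps))
               for i in range(steps + 1))

estimate = (630.0, 740.0, 1200.0, 1490.0)   # hypothetical trapezoid
inside = (900.0, 100.0)                      # hypothetical measured value
outside = (700.0, 60.0)                      # centre falls left of the kernel

for tri in (inside, outside):
    print(tri, included_at_ends(tri, estimate), included_on_grid(tri, estimate))
```

Because both cut bounds are linear in h, the two checks agree: passing at h = 0 and h = 1 is enough, and a failure at either end rules out inclusion at all levels.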

Table 1. Elevation and rainfall data from meteorological stations of Central Thessaly.

Table 2. Elevation and rainfall data from meteorological stations of Central Thessaly.