Dynamic Analysis of Meteorological Parameters in Košice Climatic Station in Slovakia

Evaporation and precipitation are often considered the most important processes in the water cycle. Recent studies have turned to chaotic analysis and short-term prediction for analysing and forecasting the time series of such phenomena. However, even with chaos theory, the accurate forecasting of pan evaporation is not straightforward, as it involves a number of variables whose changes directly or indirectly affect the amount of pan evaporation. In this study, the use of the false nearest neighbour (FNN) method for the chaotic analysis of pan evaporation and related meteorological parameters is discussed. A literature review is presented on chaos theory and its applications in modelling physical systems, together with a review of multivariate analysis and of the presence of chaos in meteorology. A detailed procedure for detecting chaos in a time series using FNN is described. The lag time to be used in the FNN analysis is estimated using the autocorrelation function (ACF) and the average mutual information (AMI), in addition to the time step of the measurement. Thus, FNN is studied with three different lag times. Six meteorological parameters (average temperature, relative humidity, wind speed, sunshine hours, dew point temperature, and pan evaporation) measured at the Kosice observation station in Slovakia over a period of 20 years are considered. The available time series are analysed using the ACF, AMI, and FNN methods, and the results are discussed. Nonlinear behaviour is seen in all of the observed parameters. Pan evaporation, average temperature, and dew point temperature are found to exhibit clear chaotic behaviour, while relative humidity, sunshine hours, and wind speed show stochastic behaviour.


Introduction
Common meteorological parameters measured at most weather stations include maximum, minimum, and average temperature, maximum relative humidity, wind speed, total sunshine hours, dew point temperature, and pan evaporation. Each parameter depends on the others, and the underlying relationships are highly complex and nonlinear and vary in space and time, even though each parameter follows a cyclic pattern, often with a trend. Most of these parameters are influencing variables in the models used to predict evaporation at a given location. Hence, a complete understanding of the dynamics of evaporation can be achieved only with a complete understanding of these sensitive parameters. A time series is a set of observations arranged chronologically at equally spaced time intervals [1]. A time series analysis is carried out to understand the underlying structure of the observed data. The process of predicting future values from previous observations using a specially developed model is called time series forecasting.
Time series analysis is widely applied in fields of research such as climatology, hydrology, surface water studies, and oceanography. The same forecasting method cannot be employed in all of these fields because of the high variability in the nature of the time series encountered. Analysis methods are based primarily on the kind of behavioural pattern exhibited by the time series under consideration. One of the most challenging problems is the availability of verifiable data. In order to predict the nature of a system, data spanning decades may sometimes be necessary. In climate models such as the general circulation model (GCM), runs use data spanning centuries. Such data are not available in all situations, and the analysis often has to make do with whatever records exist. Time series behaviour is conventionally classified into two types: deterministic and stochastic. Hence, the models developed could track only these two types, and for the same reason were limited in their applicability. The nonlinear behaviour that could be inherent in many time series is not considered in this classification. A new behaviour that bridges the two, called chaotic behaviour, was therefore proposed; it states that random input is not the only source of irregularity, and that a nonlinear chaotic system can also produce very irregular data [2]. A process that is a realization of a stochastic phenomenon in nature is called a stochastic (or probabilistic) process, and the corresponding time series is called a stochastic time series. Such a process, starting from a given initial condition, can proceed in more than one way. The analysis of a stationary random process is done using autoregressive or differencing models such as autoregressive integrated moving average (ARIMA) models. Many nonlinear methods are also available for prediction.
Nonlinear analysis methods have since been explored, which led to the introduction and development of the concepts of deterministic dynamics and chaos. The study of nonlinear dynamic systems associated with strange attractors for the description of deterministic 'chaos' has been a growing branch of research over the past few decades [3]. At present, chaos theory is treated most comprehensively by Sivakumar [4]. A time series of haphazard-looking data that follows some special mathematical rule and arises from a deterministic nonlinear system is called a chaotic time series. The system may look unpredictable at first glance, but a definite pattern evolves as the series progresses, which makes prediction possible. Chaos theory can offer a coupled deterministic-stochastic approach, since its underlying concepts of nonlinear interdependence, hidden determinism and order, and sensitivity to initial conditions are highly relevant [5,6].
Chaos has been considered an inherent behaviour of climatic parameters, with weather being defined as a set of atmospheric states of a dynamic, chaotic system showing deterministic variability [7]. The study of chaos can yield information about the number of variables necessary for modelling the system dynamics, and about the possibility of future prediction at a certain level of reliability. Many methods are available for identifying the presence of chaos in a time series, among them the correlation dimension method (CDM), the Lyapunov exponent method, the Kolmogorov entropy method, and the false nearest neighbour (FNN) method. For systems that exhibit deterministic chaos, short-term predictions are possible.
Buizza [8] treated weather as a chaotic system and demonstrated that the application of linear algebra to meteorology can help design new approaches to numerical weather prediction; the same technique was observed to be applicable to any dynamical system of large dimension, however complex. The basic idea proposed was that, for any system, there are only a few important directions in the phase space along which the most important processes occur. A successful prediction of the system's time evolution should sample these directions and describe the system evolution along them. Das [9] analysed 12 years of average daily air temperature records for several cities after removing seasonality. Nonlinear analysis, through calculation of the Lyapunov exponent, was carried out to examine the chaotic nature of the daily average temperature in both the original data and the data after removal of seasonality. The study concluded that temperature was chaotic and strongly influenced by seasonality. Millan et al. [10] performed a nonlinear time series analysis of mean daily temperature and dew point temperature at Babolsar, Iran. They observed positive Lyapunov exponents for both series, thereby providing evidence of their chaotic nature. Farzin et al. [11] studied the monthly evaporation from Urmia Lake for 40 years, from 1967 to 2007, to investigate the presence of chaos. The embedding dimension was calculated using the false nearest neighbour (FNN) algorithm, and the delay time using the average mutual information (AMI) method. A prediction of the evaporation for a further 10 years, from 2007 to 2017, was then made. Guo et al. [12] applied a chaotic forecasting model to four meteorological stations located around the Hexi Corridor area, China, and found that the model (using a weighted local region method) efficiently improved the accuracy of wind speed forecasting.
Univariate analysis often fails to capture the behaviour and dynamics of a system completely, particularly when the process involved is complex. Synder [13] described the advantages of using multivariate analysis in certain hydrological systems, and observed that some of the statistical techniques of multivariate analysis are useful in fitting prediction equations to observational data. Schiff et al. [14] and Quyen et al. [15] independently tried to establish relations between two chaotic time series based on state spaces by using the method of cross-prediction, where the prediction of one variable depends on the dynamics of the other variable in the embedded state space. Cao et al. [16] showed that predictions using multivariate time series can be significantly better than those using a univariate time series. They gave a simple but effective method for determining the embedding dimensions from a multivariate time series, and also showed that synchronization can be brought about between the reconstructed systems and the original systems. Further studies by Sfetsos and Coonick [17] also showed multivariate prediction to be more accurate than univariate prediction. However, the time series used in a multivariate prediction must be related to one another. Porporato and Ridolfi [18] extended the nonlinear prediction of a river flow time series to a multivariate form so as to include information from multiple time series, rather than from discharge alone. They explained both the conceptual basis of the multivariate approach and its application to the forecasting of river flow. Jin et al. [19] used ideas from dynamical systems theory to investigate the joint phase space characteristics of several climatic variables. Han and Wang [20] proposed a method to detect the direct and indirect relations existing among different state spaces before prediction. They extended multivariate prediction by combining neural network theory with principal component analysis (PCA) to model and predict multivariate time series. Dhanya and Kumar [21] assumed that predictability in a chaotic system is limited mainly by its sensitivity to initial conditions and by the ineffectiveness of the model in revealing the system's underlying dynamics. They attempted to improve predictability by quantifying the uncertainties involved, adopting a multivariate nonlinear ensemble prediction method.
Most hydrological and meteorological time series, such as rainfall, evaporation, temperature, sunshine hours, and wind speed, do not follow a smooth curve and have an erratic appearance [22]. Evaporation, as a naturally occurring phenomenon, has a high probability of being chaotic, and the parameters on which evaporation depends are also probable candidates for exhibiting chaotic patterns. A chaotic analysis of evaporation, along with the related meteorological parameters, is a so-far unexplored field that could lead to new understanding and better forecasting of the evaporation process. Although these patterns look complex, according to chaos theory they could have a deterministic nature and a simple cause for their erratic appearance. The chaotic analysis of these time series can help in understanding these processes and can ultimately provide information that is useful in arriving at improved short-term predictions for each parameter. The various parameters may influence evaporation to varying degrees. The number of dominant variables that govern the meteorological process satisfactorily may be arrived at by checking the chaotic nature of each individual process and comparing the processes for similarities and apparent influences in the variations of the output process. Such an analysis has not yet been done for Slovakia.

Study Area and Data
In terms of global climate classification, the territory of Slovakia lies in the northern temperate climatic zone, with the regular alternation of four seasons, variable weather, and a relatively even distribution of rainfall throughout the year [23]. According to the Slovak Hydrometeorological Institute (SHMI), average annual rainfalls of less than 600 mm may occur in Slovakia. In general, the rainfall increases with altitude. The rainiest month is usually June or July, and the least rainfall occurs from January to March [24,25]. The highest daily rainfall was 231.9 mm, which was measured in 1957. In summer, heavy rainstorms occur relatively frequently over the whole country: almost every year, somewhere in Slovakia, the daily rainfall exceeds 100 mm. In winter, much of the precipitation falls in the form of snow, particularly in the middle and high mountain ranges. The average duration of snow cover is less than 40 days in southern Slovakia, and 80 to 120 days in the mountains.
The Slovak Hydrometeorological Institute provided 64 years of daily meteorological readings, from 1 January 1951 to 31 October 2014, from an observation station at Kosice, Slovakia. The coordinates of the station are latitude 48°40′20″ N and longitude 21°13′21″ E. The location of the station on the map of Slovakia is shown in Figure 1. The area has a mild climate, with the temperature falling below 0 °C in winter and rising above 30 °C in summer, and relative humidity reaching 100%. This sample can provide a more extensive understanding of the evaporation process under extreme conditions.
Although the record spans 64 years, evaporation measurement commenced only in May 1994, and even then there are long gaps in the time series. Because of these gaps, no acceptably long interval was available in which all of the parameters were measured continuously. However, the temperature distribution shows that almost all of the missing values of evaporation and sunshine hours fall on days when the temperature was very low. The daily data of the selected parameters (including the filled-in values) were therefore taken for a period of 20 years, from 1 May 1994 to 30 April 2014. The meteorological parameters measured at the station include the following: average temperature, relative humidity, wind speed, sunshine hours, dew point temperature, and pan evaporation. The FNN method is applied to these meteorological parameters to study their behaviour.

Descriptive and Chaotic Analysis
In everyday usage, the term chaos refers to disorder or confusion. In a scientific sense, however, chaos denotes the irregular behaviour of dynamical systems that arises from a strictly deterministic time evolution, without any source of noise or external stochasticity, but with sensitivity to initial conditions [26]. Chaos theory is essentially concerned with how systems change over time. The theory of nonlinear dynamic systems, associated with the concept of strange attractors for the description of deterministic chaos, has drawn the attention of a number of researchers over recent decades. This concept also provides a new technique for time series analysis, because in many instances the time series can be viewed as the output of a dynamic system with a low-dimensional attractor, which can be reconstructed using the time delay embedding method [27].
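The time delay embedding mentioned above can be illustrated with a short Python sketch. It is a minimal illustration under stated assumptions, not the implementation used in the study: the function name `delay_embed` and the parameter names `dim` (embedding dimension) and `lag` (delay time, in samples) are introduced here for convenience.

```python
def delay_embed(series, dim, lag):
    """Build delay vectors (x[i], x[i + lag], ..., x[i + (dim - 1) * lag]).

    `dim` and `lag` are hypothetical parameter names for the embedding
    dimension and the delay time (in samples).
    """
    if dim < 1 or lag < 1:
        raise ValueError("dim and lag must be positive integers")
    n_vectors = len(series) - (dim - 1) * lag
    if n_vectors <= 0:
        raise ValueError("series too short for this dim and lag")
    return [
        tuple(series[i + k * lag] for k in range(dim))
        for i in range(n_vectors)
    ]


# Six observations embedded in three dimensions with a delay of two samples.
vectors = delay_embed([1, 2, 3, 4, 5, 6], dim=3, lag=2)
```

Each tuple is one point of the reconstructed phase space; a longer delay or a higher dimension shortens the list of available points, which is one reason long daily records are helpful.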
The time series of the selected data are analysed in detail to identify their basic governing dynamics. The false nearest neighbours (FNN) algorithm is used to determine the optimal embedding dimension required for recreating, or unfolding, the nonlinear system dynamics. The algorithm considers the geometry of the reconstructed phase space. If the embedding dimension chosen is not high enough, some points whose trajectories appear close to each other in phase space may end up having vastly different outputs; such neighbours are termed false neighbours. They appear to be close only because they are represented in a dimension that is too low to capture their behaviour completely. Furthermore, to account for both linear and nonlinear behaviour in the chaotic analysis, the lag of the lagged series is varied accordingly. The methods employed to select the delay time for the FNN analysis are the autocorrelation function (ACF) and the average mutual information (AMI) method.
The neighbours are classified as true or false by a ratio test that compares the distance between the points in the next higher dimension with the distance in the current dimension. Let each point in the phase space considered be $Y_i$. The $m$-dimensional phase space is searched for its nearest neighbour $Y_j$. The Euclidean distance between $Y_m(i)$ and $Y_m(j)$ is calculated, and the same distance is then calculated in the $(m+1)$th dimension. The ratio of the two distances is taken as $R_i$, which is given in Equation (1):

$$R_i = \frac{\lVert Y_{m+1}(i) - Y_{m+1}(j) \rVert}{\lVert Y_m(i) - Y_m(j) \rVert} \qquad (1)$$

The value is at least one, as the distance in the higher dimension cannot be less than that in the lower dimension. Therefore, in order to distinguish between true and false neighbours, a threshold value $R_t$ is fixed for the ratio, such that for true neighbours the Euclidean distance in the $(m+1)$th dimension remains comparable to that in the $m$th dimension. From the literature, it is found that the threshold can be fixed at around 10 [28].
If the ratio $R_i$ between these two distances is greater than the threshold value $R_t$, the neighbours are considered to be false; otherwise, they are considered to be true [29].
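The ratio test described above can be sketched in plain Python as follows. This is a brute-force illustration rather than the code used in the study: the function name `fnn_percentage`, its parameters, and the sine-wave demonstration series are assumptions made here, and the default threshold of 10 is the value quoted in the text.

```python
import math


def fnn_percentage(series, dim, lag, threshold=10.0):
    """Percentage of false nearest neighbours when moving from dim to dim + 1.

    A neighbour is counted as false when the distance ratio
    d(dim + 1) / d(dim) exceeds `threshold`, as in the ratio test above.
    """
    # Only points that also exist in the (dim + 1)-dimensional space are used.
    n = len(series) - dim * lag
    if n < 2:
        raise ValueError("series too short for this dim and lag")

    def vector(i, m):
        return [series[i + k * lag] for k in range(m)]

    low = [vector(i, dim) for i in range(n)]
    high = [vector(i, dim + 1) for i in range(n)]

    false_count = 0
    tested = 0
    for i in range(n):
        # Brute-force search for the nearest neighbour in dimension dim.
        best_j, best_d = None, math.inf
        for j in range(n):
            if j != i:
                d = math.dist(low[i], low[j])
                if d < best_d:
                    best_j, best_d = j, d
        if best_d == 0.0:
            continue  # identical points: the ratio is undefined, so skip
        tested += 1
        if math.dist(high[i], high[best_j]) / best_d > threshold:
            false_count += 1
    return 100.0 * false_count / tested if tested else 0.0


# Demonstration on a synthetic sine wave (not station data): one dimension
# cannot separate the rising and falling branches, two dimensions can.
wave = [math.sin(0.1 * t) for t in range(500)]
fnn_dim1 = fnn_percentage(wave, dim=1, lag=15)
fnn_dim2 = fnn_percentage(wave, dim=2, lag=15)
```

The neighbour search here is quadratic in the number of points, which is acceptable for a short demonstration; for 20 years of daily data a spatial index such as a k-d tree would be the more practical choice.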
The value of the ACF ranges from +1 to −1, with +1 indicating complete correlation and −1 indicating a completely inverse, or negative, correlation. The function can be used to compare the linear nature of different time series. ACF curves are plots of autocorrelation versus lag. The point where the curve first crosses zero can be taken as a standard against which different time series can be compared. The autocorrelation value at lag $m$ is found from Equation (2):

$$r_m = \frac{\sum_{t=1}^{N-m} (x_t - \bar{x})(x_{t+m} - \bar{x})}{\sum_{t=1}^{N} (x_t - \bar{x})^2} \qquad (2)$$

where $N$ is the number of points; $m$ is the number of lags; $x_t$ is the value of variable $x$ at any time $t$; and $\bar{x}$ is the mean of all of the values of $x$.
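A minimal Python sketch of this calculation is given below. It assumes the standard sample estimator built from the quantities defined above (N, m, x_t, and the series mean); the function names `autocorrelation` and `first_zero_crossing` and the synthetic sine series are introduced here for illustration only.

```python
import math


def autocorrelation(series, lag):
    """Sample autocorrelation at the given lag (standard estimator, assumed)."""
    n = len(series)
    mean = sum(series) / n
    denominator = sum((x - mean) ** 2 for x in series)
    numerator = sum(
        (series[t] - mean) * (series[t + lag] - mean) for t in range(n - lag)
    )
    return numerator / denominator


def first_zero_crossing(series, max_lag):
    """Smallest lag at which the ACF is zero or negative, or None if none."""
    for lag in range(1, max_lag + 1):
        if autocorrelation(series, lag) <= 0.0:
            return lag
    return None


# Synthetic series with a period of 40 samples (not station data); the ACF
# of a sinusoid first reaches zero near a quarter of the period.
series = [math.sin(2 * math.pi * t / 40) for t in range(400)]
acf_lag = first_zero_crossing(series, max_lag=50)
```

The lag returned by `first_zero_crossing` is the quantity that would serve as the ACF-based delay time in the FNN analysis.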
The average mutual information (AMI) measures the information shared between two series, and therefore reflects nonlinear as well as linear dependence. The two series may be independent series, or an original series together with its lagged series. The output is the number of bits of information that is mutually available. AMI curves are plots of average mutual information versus lag. The lag at which the first minimum appears indicates the lagged series that is least interrelated with the original in nonlinear terms; subsequent minima are discarded. The higher the value of the AMI, the more complex the nonlinear relation needed to model the time series. Similarly, high AMI values between two physical processes may mean that there is some nonlinear similarity between the systems in question.
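One common way to estimate the AMI between a series and its lagged copy is a histogram (plug-in) estimate, sketched below in Python. The estimator, the number of bins, and the names `average_mutual_information` and `first_minimum_lag` are assumptions made for illustration; the text does not state which estimator was used in the study.

```python
import math
from collections import Counter


def average_mutual_information(series, lag, bins=16):
    """Histogram estimate (in bits) of the AMI between x[t] and x[t + lag]."""
    lo, hi = min(series), max(series)
    width = (hi - lo) / bins or 1.0  # guard against a constant series

    def bin_of(x):
        return min(int((x - lo) / width), bins - 1)

    pairs = [
        (bin_of(series[t]), bin_of(series[t + lag]))
        for t in range(len(series) - lag)
    ]
    n = len(pairs)
    joint = Counter(pairs)
    p_x = Counter(a for a, _ in pairs)
    p_y = Counter(b for _, b in pairs)
    return sum(
        (count / n) * math.log2(count * n / (p_x[a] * p_y[b]))
        for (a, b), count in joint.items()
    )


def first_minimum_lag(series, max_lag, bins=16):
    """Lag of the first local minimum of the AMI curve, or None if absent."""
    ami = [
        average_mutual_information(series, lag, bins)
        for lag in range(1, max_lag + 1)
    ]
    for k in range(1, len(ami) - 1):
        if ami[k] < ami[k - 1] and ami[k] <= ami[k + 1]:
            return k + 1  # ami[k] belongs to lag k + 1
    return None


# A strictly alternating series shares about one bit with its lagged copy.
alternating = [0, 1] * 50
ami_lag2 = average_mutual_information(alternating, lag=2, bins=2)
```

The histogram estimate is sensitive to the number of bins, so in practice the first-minimum lag would be checked for stability under a few different bin counts.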

Statistical Analysis
A univariate analysis of each parameter needs to be carried out in order to ascertain its variation over time. Daily data provided by SHMI for the 64 years were analysed. The daily average temperatures measured at Kosice ranged from −14.5 °C to 29.75 °C. Relative humidity shows high variation, with maximum and minimum values at the station of 100% and 27.3%, respectively. Wind speed shows high variation, with maximum and minimum values at Kosice of 54 km/h and 0 km/h, respectively. Sunshine hours show high variation, with maximum and minimum values at Kosice of 15.2 h and 0.1 h, respectively. The maximum value of the dew point temperature at Kosice is 29.1 °C. Evaporation values show high variation due to climatic conditions; the maximum value of evaporation measured at Kosice is 12.1 mm/day.
To study the data further, various statistical indices of the individual parameters were calculated. The results of the statistical analysis are shown in Table 1. Temperature and average wind velocity are on the lower side. This could indicate that evaporation may not be a prominent process at this location, which is backed by a low average rate of evaporation. The deviation from the mean also suggests that evaporation may be more strongly correlated with temperature and wind speed than with any other factor. It may be advisable not to derive any particular relations between evaporation and other parameters from the skewness and kurtosis values, as the values for evaporation could be influenced by the interpolated data points. The skewness and kurtosis values neither show any apparent pattern nor suggest that the distribution of evaporation is more similar to those of average temperature and dew point temperature. The uniquely positive values obtained for both indices in wind speed indicate that this parameter may have different inherent dynamics from the rest of the parameters. The high value of kurtosis for evaporation may be explained by long periods of low temperature followed by short periods of comparatively high temperature. This variation could produce peaks in the distribution.
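The indices discussed above can be computed from the central moments of a series, as in the following Python sketch. The function name `describe` and the sample values are hypothetical, and the sketch assumes population moments and excess kurtosis (a normal distribution gives 0); the conventions behind Table 1 are not stated in the text.

```python
import math


def describe(series):
    """Mean, standard deviation, skewness, and excess kurtosis of a series.

    Population (biased) moments are used; excess kurtosis is an assumption.
    """
    n = len(series)
    mean = sum(series) / n
    m2 = sum((x - mean) ** 2 for x in series) / n
    if m2 == 0.0:
        raise ValueError("skewness and kurtosis are undefined for constant data")
    m3 = sum((x - mean) ** 3 for x in series) / n
    m4 = sum((x - mean) ** 4 for x in series) / n
    return {
        "mean": mean,
        "std": math.sqrt(m2),
        "skewness": m3 / m2 ** 1.5,
        "kurtosis": m4 / m2 ** 2 - 3.0,
    }


# Hypothetical values, chosen only to make the arithmetic easy to follow.
stats = describe([1.0, 2.0, 3.0, 4.0, 5.0])
```

A symmetric sample such as this one has zero skewness, and its flat shape gives a negative excess kurtosis; a sharply peaked series with occasional high values would give a positive one.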

Trend Analysis
The trends exhibited by each data set are explored to find possible common trends and anomalies among the selected meteorological parameters at Kosice station. The results of the trend analysis test conducted for Kosice are given in Table 2. P in the trend row indicates a positive trend, and N indicates a negative trend; where there is no trend at all, this is noted. The trend analysis shows that there is no common trend among the meteorological parameters measured at Kosice. The average temperature of the area is on the rise. A negative trend in relative humidity combined with a positive trend in temperature can result in a hot, dry climate in the distant future. The positive trend in evaporation can be attributed to the increasing trend in temperature and the falling trend in relative humidity.
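The text does not name the trend test behind Table 2. Purely as an illustration, the sketch below uses the Mann-Kendall test, a non-parametric test widely applied to hydro-meteorological series; the choice of test, the function name `mann_kendall`, the 5% critical value, and the omission of a tie correction are all assumptions made here. The labels P and N follow the convention of Table 2.

```python
import math


def mann_kendall(series, z_critical=1.96):
    """Mann-Kendall trend test (assumed test; no correction for ties).

    Returns ("P" | "N" | "no trend", z), using a two-sided critical value.
    """
    n = len(series)
    if n < 3:
        raise ValueError("need at least three observations")
    # S counts later-minus-earlier sign agreements over all pairs.
    s = 0
    for i in range(n - 1):
        for j in range(i + 1, n):
            s += (series[j] > series[i]) - (series[j] < series[i])
    variance = n * (n - 1) * (2 * n + 5) / 18.0
    if s > 0:
        z = (s - 1) / math.sqrt(variance)
    elif s < 0:
        z = (s + 1) / math.sqrt(variance)
    else:
        z = 0.0
    if z > z_critical:
        return "P", z
    if z < -z_critical:
        return "N", z
    return "no trend", z


# Hypothetical monotonic series, used only to exercise the three outcomes.
rising_label, rising_z = mann_kendall(list(range(20)))
falling_label, _ = mann_kendall(list(range(20, 0, -1)))
flat_label, flat_z = mann_kendall([5.0] * 20)
```

Because every pair of observations is compared, the run time grows with the square of the series length; daily series are therefore often aggregated to monthly or annual values before such a test is applied.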

Nonlinear Dynamic Analysis of Meteorological Parameters
A more detailed analysis of the parameters is carried out one at a time using descriptive analysis methods. The false nearest neighbour method uses the delay time values obtained from the autocorrelation analysis and the average mutual information test. The tests are selected so that each examines a different aspect of the behaviour of the individual time series. The univariate analysis thus conducted reveals the nature of the time series of the various parameters. The main aim of this study is to determine the chaotic nature of the parameters. Together with the descriptive analysis, this allows a comparative evaluation of the advantage of chaotic analysis over traditional methods for modelling meteorological systems.
All of the results from the ACF and AMI analyses of the parameters considered at Kosice station are given in Table 3. The lag at which the ACF value first crosses zero is taken as the lag of the ACF. The lag at which the AMI reaches its first minimum is taken as the lag of the AMI. These lag values are used as the delay time in the FNN analysis. Some salient points and values emerge from the two analyses. Most of the features to be noted were described along with the analysis results of the individual parameters; given here are the trends and patterns seen in the results that were not outlined previously. Linear modelling will be difficult to carry out for all of the parameters, as the ACF values are invariably high. Nonlinear behaviour is exhibited by all of the time series, and needs to be considered during further analyses.
The chaotic nature of the parameters is checked by taking one time series at a time and finding the percentage of false nearest neighbours. The result is plotted with the embedding dimension on the x axis and the percentage of false neighbours on the y axis. The shape of the plot and the FNN value are both equally important. The study is done in three parts: the FNN method with the delay time set to lag = 1; the FNN method with the delay time set to the ACF lag; and the FNN method with the delay time set to the AMI lag.
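Reading an embedding dimension off such a curve can be sketched as follows. The selection rule (the smallest dimension at which the percentage of false neighbours falls to or below a tolerance), the 1% tolerance, the function name `optimal_embedding_dimension`, and all of the numbers are hypothetical; they illustrate the procedure and are not values from Table 4.

```python
def optimal_embedding_dimension(fnn_curve, tolerance=1.0):
    """Smallest dimension whose FNN percentage is at or below `tolerance`.

    `fnn_curve` maps embedding dimension -> percentage of false neighbours.
    Returns None when the curve never drops that low.
    """
    for dim in sorted(fnn_curve):
        if fnn_curve[dim] <= tolerance:
            return dim
    return None


# Invented FNN curves for the three delay times; NOT results from the study.
curves = {
    "lag = 1": {1: 92.0, 2: 61.5, 3: 30.2, 4: 9.8, 5: 0.7, 6: 0.2},
    "lag = ACF": {1: 95.0, 2: 70.1, 3: 44.0, 4: 21.3, 5: 8.9, 6: 0.9},
    "lag = AMI": {1: 90.4, 2: 55.0, 3: 33.3, 4: 28.0, 5: 26.5, 6: 27.1},
}
chosen = {name: optimal_embedding_dimension(c) for name, c in curves.items()}
```

A curve that levels off well above zero, like the third invented example, yields no dimension under this rule; curves of that shape are commonly read as a sign of stochastic or very high-dimensional behaviour, which is why the shape of the plot matters as much as the FNN value.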
The optimum embedding dimension selected from the FNN analysis with these delay times is shown in Table 4. The optimal embedding dimension is better selected using a delay time of lag = 1 or the AMI lag rather than the ACF lag, since the ACF reflects only linear dependence. The study shows that the dominant dimension of the meteorological variables at Kosice varies between four and seven, with an average of about six. From the analysis conducted, pan evaporation is found to be a chaotic process. Of the other meteorological parameters considered, maximum temperature, minimum temperature, average temperature, and dew point temperature show a chaotic nature, whereas relative humidity, wind speed, and sunshine hours all show a stochastic nature.
Evaporation, when considered as a function of the six variables, should thus exhibit a chaotic nature with some stochastic component. Hence, a stochastic model, however sophisticated, cannot model the process by itself. The chaotic nature needs to be considered for proper modelling of the process. Each meteorological process, when considered individually, can be better analysed using chaos theory.

Conclusions
Meteorological parameters are an important part of hydrological and climatological studies. A clear understanding of these parameters is necessary for analysing their underlying relationships, which in turn is crucial for estimating water requirements in water resources management. Researchers have so far mostly limited their analysis of these parameters to univariate analysis. In the rare cases where multivariate analysis is employed, the presence of chaotic dynamics has not been checked.
In the current study, some important meteorological parameters that are commonly measured and often used in water resources studies, namely average temperature, relative humidity, wind speed, dew point temperature, sunshine hours, and pan evaporation, are analysed to find their linear, nonlinear, and nonlinear dynamic behaviour. The analysis is done for 20 years of daily data collected from a weather station at Kosice, Slovakia.
Statistical analysis of the data was first carried out to find the behaviour of the time series over the long term. It shows that no strict rules can be obtained from regular statistical parameters such as mean, standard deviation, skewness, and kurtosis. These give some insights, but the insights are inconclusive and need rigorous studies covering numerous stations for verification. A trend analysis of the data was then carried out to find the general trend and behaviour of each series, considering each time series individually. It shows that rising trends in temperature, sunshine hours, and wind speed usually result in an increasing trend in the evaporation time series, but this also needs to be verified further. Some insights into the future climatic and meteorological trends in the area may be drawn from the trend analysis for individual stations. The analysis also turned up some unique values for the trends in evaporation and dew point temperature, which again underlines the inconclusive nature of the results and the necessity of further studies. The analysis also gives an overview of the effects of the various parameters on evaporation.
The autocorrelation function (ACF), average mutual information (AMI), and false nearest neighbour (FNN) methods were used for the analysis. The FNN analysis was done with three different lags. The first of these was the standard FNN method, which took the lag as 1. Two further analyses were done, taking the results of the ACF and AMI analyses as the lag. The results of the FNN analysis showed the nonlinear behaviour of the meteorological parameters. They also indicated that the predominant dimension of each meteorological process varied between five and six, i.e., each meteorological process is governed by several independent processes.

Table 1. Statistical analysis of meteorological parameters at Kosice.

Table 2. Results of trend analysis for Kosice station.

Table 3. Autocorrelation function (ACF) and average mutual information (AMI) values for all of the parameters at Kosice station.

Table 4. Results from the false nearest neighbour analyses.