A New Perspective on Formation of Haze-Fog: The Fuzzy Cognitive Map and Its Approaches to Data Mining

Haze-fog has seriously hindered the sustainable development of the ecological environment and caused great harm to the physical and mental health of residents in China. It is therefore important to understand how haze-fog forms in order to support its early warning and prevention. The formation of haze-fog is a fuzzy, nonlinear process, and it is so complex that traditional methods, which largely ignore nonlinear relationships, have difficulty simulating its dynamic evolution. New perspectives on the formation of haze-fog are therefore needed. In this work, previous research on haze-fog formation is first summarized. Second, a new perspective is proposed that applies the fuzzy cognitive map (FCM) to the formation of haze-fog. Third, a data mining method based on the genetic algorithm is used to discover the causality values of an FCM for haze-fog formation. Finally, simulation results are obtained through an experiment using the FCM and its data mining method for the formation of haze-fog. The validity of the approach is assessed using a simple rule and the Kappa values. This research thus not only offers a new idea, modeling the formation of haze-fog with an FCM, but also provides an effective FCM-based method for capturing the nonlinear dynamics of haze-fog formation.


Introduction
Presently, the formation of haze-fog in China, especially in North China, exhibits a complexity that reflects two areas of concern. One is the existence of pollutants. Air pollution is changing from a single type to multiple types. Secondary pollution [1,2], formed by a variety of pollutants reacting with each other, is becoming increasingly prominent. The second area of concern is meteorological conditions, whose changes may affect the formation of haze-fog in different ways. First, global warming makes local "extreme weather" more variable, and disturbance due to El Niño increases the likelihood of haze-fog formation [3]. Second, the Three-North Shelter Forest Project and large-scale wind power stations in southern Inner Mongolia and Zhangbei may have some influence on the wind power of North China [4]. Third, the South-to-North Water Diversion Project has some effect on the humidity of the area [5]. In addition, other factors may also affect haze-fog, such as atmospheric circulation, solar radiation, traffic, industrial emissions, and fires [6][7][8][9][10].
In China, there is a high false alarm rate in haze early warning systems. For example, Beijing and other areas in North China experienced a long haze-fog episode lasting up to 110 h on 27 November 2015, as shown in Figure 1. However, at that time, the heavy air pollution warning had not been issued in Beijing. As an example of a false alarm, the relevant agency issued a heavy haze-fog "red warning" for 12 a.m. on 23 December 2015, but in reality serious haze-fog occurred south of Beijing at 5 p.m. on that day. Such prediction errors affect haze-fog prevention decisions and have a significant impact on city traffic and on the physical and mental health of residents. The main reason for these inaccuracies is that traditional methods lack research on the complex relationships among the different factors in the formation of haze-fog. It is therefore difficult to simulate their nonlinear dynamic evolution using these methods, and inconsistencies arise between the emergency early warning of haze-fog and its actual occurrence.
Sustainability 2017, 9, 352
The formation of haze-fog depends on different factors [11], mainly PM2.5-based fine pollution particles and prevailing weather conditions. This formation can be seen as a complex system. There are complex, iterative, and cognitive relationships among pollutants, weather conditions, pollution particles, and haze-fog. For example, PM2.5 mainly involves two kinds of pollutants. One kind is primary particles emitted in solid form. The other is secondary particles produced by chemical reactions of precursor pollutants such as NOx and SO2. Therefore, the formation of haze-fog can be modeled as a fuzzy cognitive complex system. The fuzzy cognitive map (FCM) [12,13], a knowledge representation of causal relationships (explicit and implicit) between concepts and a method for system modeling, was proposed by Kosko in 1986 and extends the ternary logic relationship to a fuzzy relationship in [−1, 1]. FCM combines the fuzzy information processing ability of fuzzy logic, the causal propagation method of the cognitive map, and the adaptive dynamic characteristics of neural networks. FCM can also be used for sensitivity analysis of factors and for explaining causal relationships. It also supports a feedback mechanism, unlike traditional methods that only draw a static conclusion. The final result may be a fixed point, a limit cycle, or a chaotic attractor, and FCM offers better knowledge expression and reasoning ability than traditional methods. Thus, FCM can be used to express the complex cognitive model for haze-fog formation.
Moreover, it is well established that the evolution of a weather system is a fuzzy and nonlinear dynamic process [14,15]. Haze-fog is a kind of disastrous weather whose formation is a dynamic, nonlinear evolution process. The formation of haze-fog also produces a wealth of data that can be exploited to understand it. The monitoring data (such as the emission concentrations of various pollutants and the meteorological data) provide massive and valuable data samples for research on the formation of haze-fog. The FCM model can be obtained automatically by learning from data [16][17][18][19], and it has been applied in fault diagnosis, medical prediction, pollution management, intelligent analysis of social phenomena, and so on [20][21][22][23][24]. Thus, training on nonlinear and dynamic haze-fog formation time series data to mine the fuzzy cognitive association values is a pressing scientific issue for the formation of haze-fog.

Previous Research on the Formation of Haze-Fog
The formation of haze-fog has been mainly studied from the following three perspectives: physical chemistry mechanism, statistical analysis, and data mining.

Physical Chemistry Mechanism in the Formation of Haze-Fog
A diffusion model is used to simulate the physical chemistry mechanism in the formation of haze-fog. This model is a numerical method that quantitatively simulates the emission, migration, diffusion, and chemical reaction of pollution in time and space. As the core of air quality prediction models, the diffusion model has gone through three generations [25,26].
The first generation mainly includes the Gauss model and the Lagrange trajectory model. For example, California Puff (CALPUFF) is an atmospheric quality evaluation and prediction system for complex terrain [27]. The Gauss diffusion model, as the core of CALPUFF, uses a large number of discrete smoke clusters (puffs) to represent continuous plume dispersion, and a "snapshot" method to evaluate the concentration contribution of a single puff to a receptor point.
To overcome the uniformity of the single grid in first-generation models, the second generation divides the simulated area into multiple grids, each with independent emission and meteorological data. Based on the Eulerian mesh model [28,29], it introduces a more complicated meteorological model and parameters and a detailed nonlinear chemical reaction mechanism, in which chemical transport models (CTM) calculate the concentration of pollutants from the meteorological field and the source list. The transmission process follows the principle of mass conservation of pollutants.
To take all of these atmospheric variables into account, the USEPA (United States Environmental Protection Agency) proposed the third generation, called Models-3 (Third-Generation Air Quality Modeling System)/CMAQ (Community Multi-scale Air Quality). This is an atmospheric chemical transport model based on the "one atmosphere" concept [25,26]. Each chemical mechanism in Models-3/CMAQ contains a limited number of chemical reactions. In addition, the aerosol module in CCTM (the CMAQ chemistry-transport model) is important for the formation of haze-fog, because an increase in aerosol concentration reduces visibility and thus forms haze-fog [2]. Aerosol is the general name for the solid and liquid particles suspended in the air, which are the product of interactions between pollutants and meteorological factors and whose main component is PM2.5.
In short, the physical chemistry approach applies the physical and chemical reactions of pollutants under meteorological conditions within the diffusion model to predict air quality in time and space. These methods excel at simulating the transport and distribution of pollutants in time and space. However, because many complex nonlinear physical and chemical mechanisms are not taken into account, it is difficult for this approach to simulate complex physical and chemical reactions under the influence of a complex meteorological field. As a result, it sometimes fails to predict the formation of haze-fog.

Statistical Analysis of the Formation of Haze-Fog
Statistical analysis is used to identify the factors that influence the formation of haze-fog from haze-fog data collected by satellite or ground monitoring equipment. This is one of the most common methods used to analyze the formation of haze-fog. There are three viewpoints related to this approach.
The first view concerns meteorological conditions. Ding et al. [30], using a comprehensive judgment method to distinguish fog from haze in China, showed that relative humidity is an important influencing factor. Fu et al. [31] obtained the influence of wind velocity and relative humidity on haze-fog in the North China Plain using mean and frequency calculations. Guo et al. [32] indicated that long-lasting fog and haze events often occur under a high-pressure weather system and calm wind conditions. Zhang et al. [33], using a multiple linear regression model based on meteorological factors, showed that the dynamic effect on haze-fog evolution is almost the same as the thermodynamic effect. In addition, Zhang et al. [11], Yang et al. [34], and Yang et al. [35] pointed out that PLAM (Parameter Linking Air-quality and Meteorology) is a "pollution weather condition" index, built from factors such as air pressure, temperature, wind power, relative humidity, stability, and precipitation, which can quantitatively reflect the degree of static and stable weather.
The second view concerns pollutants. Jansen et al. [36] found that gaseous NO2 and SO2 are both main factors in the reduction of visibility, based on hourly concentrations of particulate sodium and mixing ratios measured by the online MARGA ADI 2080 analyzer. Shen et al. [37] found by statistical analysis that nitrate and organic compounds dominate the aerosol composition during severe haze-fog episodes and are related to secondary aerosol formation and air mass origin.
The third view concerns both pollution and meteorological conditions. Zhang et al. [11] analyzed the origins of haze-fog in China, especially in North China, and showed that heavy haze-fog is related to the high concentration of aerosol particles, while continuous haze-fog is related to the calm weather currently prevailing in the country. Zhang et al. [38] focused on the changes of PM1 in the atmosphere and the influence of meteorological conditions during heavy haze-fog at a regional background station in the Yangtze River Delta area of China. Chung et al. [39] observed that, due to the increase in both water supply and emission of air pollution, the typical pattern of historical mist and haze in London is common in Korea today. Sun et al. [40] obtained the influence coefficients of PM2.5 and meteorological conditions on haze-fog under different relative humidity and PM2.5 concentration levels from a multiple linear regression equation of visibility in Beijing.
The main contribution of these methods is the determination of the possible factors affecting the formation of haze-fog. Haze-fog is the result of a large number of pollution particles suspended in the air together with accompanying meteorological conditions. However, quantitative relationships among the factors responsible for the formation of haze-fog have seldom been constructed; the exception is [40], which mainly focused on the influence coefficients in the formation of haze-fog. Moreover, the statistical methods [30][31][32][33][34][35][36][37][38][39][40] are essentially ex post, linear, and static computations rather than nonlinear models of the formation of haze-fog. Ultimately, they cannot simulate the sequential evolution of the formation of haze-fog.

Data Mining Methods for the Formation of Haze-Fog
To simulate nonlinear relationships, data mining has been applied to air quality forecasting. It was first used to predict the concentration of pollutants [41,42], using techniques such as neural networks, SVM (support vector machine), the GM(1,1) grey model, comprehensive forecast models, and GA_ANN (genetic algorithm and artificial neural networks).
It was then extended to the formation of haze-fog. Meng et al. [43] constructed an MRM (multiple regression model) for calculating fog and haze intensity, using a logistic function as the haze-fog intensity function. The influencing parameters of the model are mined by fitting measured haze-fog data to the model in order to forecast haze-fog intensity.
At present, the existing data mining method for predicting haze-fog [43] only mines the influencing weights of meteorological and pollutant indices on haze-fog from the available data, because of the limitation of the multiple regression model. It cannot obtain the influencing weights among the meteorological factors or among the pollutant indices. Nor does it exploit the dynamics of the data for the formation of haze-fog. Therefore, the complex nonlinear dynamic relationships among the factors responsible for the formation of haze-fog need to be explored further.

Proposed Problems
Research on the formation of haze-fog is still at an early stage, especially in China. Traditional methods lack a focus on the complicated nonlinear relationships in haze-fog formation. It is difficult to simulate the nonlinear dynamic evolution processes responsible for the formation of haze-fog with the existing physical chemistry mechanism and statistical analysis; therefore, their prediction and analysis of the formation of haze-fog are sometimes inaccurate.
Due to the complexity of relationships among different factors and the dynamic evolution of the formation of haze-fog, two considerations need to be taken into account. The first is the complex cognitive model for the formation of haze-fog. The second is discovering the degree (fuzzy values) of the cognitive relationships in the nonlinear evolution of the formation of haze-fog. These two points can be explained as follows:
(1) The complex cognitive model for the haze-fog formation
The complex cognitive model should address certain questions about the process of haze-fog formation. For example, what are the factors involved in the formation of haze-fog? What are the relationships among the factors in the cognitive map for haze-fog formation? How do we represent the states (fuzzy values) of factors in fuzzy form? How do the states and the relationships evolve over time?
The FCM structure is similar to a recurrent artificial neural network, where concepts are represented by neurons and causal relationships by weighted links connecting the neurons. Mathematically, the fuzzy cognitive map shown in Figure 2 is a 4-tuple (C, W, A, f), where:
• $C = \{C_1, C_2, \ldots, C_n\}$ is the set of n nodes of a graph, which represents the set of concepts of a system, in general.
• $W: (C_i, C_j) \to w_{ij}$ is a function that maps each pair of concepts $(C_i, C_j)$ to a value in the range −1 to 1, where $w_{ij}$ denotes the weight of the directed edge from $C_i$ to $C_j$ if $i \neq j$, and $w_{ij} = 0$ if $i = j$. Thus, $W = (w_{ij})_{n \times n}$ is a connection matrix. There are three possible types of causal relationships between concepts: $w_{ij} > 0$ indicates positive causality between concepts $C_i$ and $C_j$, that is, an increase (decrease) in the value of $C_i$ leads to an increase (decrease) in the value of $C_j$; $w_{ij} < 0$ indicates negative causality between concepts $C_i$ and $C_j$, that is, a decrease (increase) in the value of $C_i$ leads to an increase (decrease) in the value of $C_j$; and $w_{ij} = 0$ indicates no relationship between $C_i$ and $C_j$.
• $A(t) = \{A_1(t), A_2(t), \ldots, A_n(t)\}$ is the sequence of concept activation degrees at moment t. $A(0)$ is the initial vector, which specifies the initial values of all concept nodes, and $A(t)$ is the state vector at iteration t. The state vector specifies the current values of all concepts (nodes) in a particular iteration. The value of a given node is calculated from the preceding iteration values of the nodes that exert influence on it through cause-effect relationships (the nodes connected to the given node).
• f is a transformation function, which defines the recurrence relationship for $t \geq 0$ between $A(t + 1)$ and $A(t)$: $\forall i \in \{1, 2, \ldots, n\}$, $A_i(t + 1) = f\left(\sum_{j=1}^{n} w_{ji} A_j(t)\right)$, where $A_j(t)$ is the state of cause concept j at iteration t, $A_i(t + 1)$ is the state of effect concept i at iteration t + 1, and $w_{ji}$ is the cause-effect relationship weight from $C_j$ to $C_i$.
Figure 2. A simple fuzzy cognitive map.
The transformation function is used to confine the weighted sum to a certain range, which is usually set to [0, 1]. The three most commonly used transformation functions, the bivalent, trivalent, and logistic functions, are shown in Equations (1)-(3).
There are several simulation scenarios, which depend on the transformation function. With a discrete transformation function (e.g., the bivalent or trivalent function), the simulation either heads to a fixed state vector value, which is called a fixed point, or keeps cycling between a number of fixed state vector values, which is known as a limit cycle. With a continuous transformation function (e.g., the logistic signal function), a fixed point, a limit cycle, or a so-called chaotic attractor may appear.
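The inference rule and the three transformation functions named above can be sketched in a few lines of NumPy. This is an illustration only: the threshold of 0.5 in the trivalent function, the steepness of the logistic function, and the three-concept weight matrix are assumptions chosen for the example, not values taken from this work, and the function names are hypothetical.

```python
import numpy as np

def bivalent(x):
    # Discrete two-state output: 1 where the weighted sum is positive, else 0.
    return np.where(x > 0, 1.0, 0.0)

def trivalent(x, threshold=0.5):
    # Discrete three-state output in {-1, 0, 1}; the 0.5 cut-off is an assumption.
    return np.where(x >= threshold, 1.0, np.where(x <= -threshold, -1.0, 0.0))

def logistic(x, steepness=5.0):
    # Continuous output in (0, 1); the steepness value is an assumption.
    return 1.0 / (1.0 + np.exp(-steepness * x))

def fcm_step(weights, state, transform=logistic):
    # weights[j, i] holds w_ji, the influence of cause C_j on effect C_i,
    # so A_i(t + 1) = f(sum_j w_ji * A_j(t)) becomes a vector-matrix product.
    return transform(state @ weights)

# Hypothetical 3-concept map: C1 -> C2 (+0.6), C2 -> C3 (+0.8), C3 -> C1 (-0.4).
W = np.array([
    [0.0, 0.6, 0.0],
    [0.0, 0.0, 0.8],
    [-0.4, 0.0, 0.0],
])
A0 = np.array([0.5, 0.2, 0.1])   # initial state vector A(0)
A1 = fcm_step(W, A0)             # state vector A(1)
```

Storing $w_{ji}$ at row j, column i keeps the update a single matrix product, which is the matrix-based inference the text refers to. Passing `transform=bivalent` or `transform=trivalent` to `fcm_step` switches to a discrete map.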
The nodes in FCM can represent the concepts in the formation of haze-fog, and the arcs between nodes represent the relationships between concepts. Each node has a time series state space. Each node and each arc carries strong cognitive semantics, so the model expresses problems intuitively. FCM inference is based on matrix operations and therefore takes advantage of the numerical computing power of computers. The state space of FCM is generated by repeatedly applying the evolution function from the initial condition. Thus, the dynamic behavior of haze-fog formation can be simulated through the interaction of the concepts in the cognitive network.
(2) Discovering the fuzzy cognitive relationships in the fuzzy cognitive map for haze-fog formation
The formation of haze-fog generates a large amount of data. The data are complex and sequential. How to discover the fuzzy cognitive relationships from these data resources is a problem that needs to be solved.
The FCM learning algorithm can obtain the association matrix of FCM from data. There are two classes of FCM learning algorithms: Hebbian-based learning and evolution-based learning. The former are Hebbian-based algorithms [17,44,45], mainly including NHL (nonlinear Hebbian learning) and AHL (active Hebbian learning). The latter are learning algorithms based on evolution theory [17,46,47], which include PSO (particle swarm optimization), RCGA (real coded genetic algorithm), etc. Evolutionary learning can obtain the cause-effect relationships of FCM from time series data. Therefore, FCM is well suited to time series data mining, to simulating the nonlinear dynamic evolution process in the formation of haze-fog, and to finding the fuzzy cognitive relationships (values) in the formation of haze-fog. FCM can thus serve as the complex cognitive model for simulating the dynamic behavior of haze-fog formation.
Given the complexity of haze formation, and taking the view of the complex cognitive relationships in the formation of haze-fog, this research focuses on mining the fuzzy cognitive relationships (values) from nonlinear time series data for the formation of haze-fog. The results can also be applied to predicting the intensity of haze-fog and to analyzing the complex relationships in its formation for emergency warning decision support systems.

A New
The pollutants, the meteorological conditions, and haze-fog can be simulated as concept nodes of a fuzzy cognitive map in the formation of haze-fog. The concept nodes $C = \{C_1, C_2, \ldots, C_n\}$ are defined as the set of concepts in the formation of haze-fog.
The concepts in the formation of haze-fog affect and interact with each other, forming an organic and complex system. There are cause-effect relationships among the concepts representing pollutants. For example, the pollutants SO2 and NO2 may cause "secondary pollution" in the form of PM2.5. In the same way, there are cause-effect relationships among the concepts representing meteorological conditions; for example, temperature may influence relative humidity. There are also cause-effect relationships from the concepts representing pollutants and meteorological conditions to the formation of haze-fog. According to the cognitive relationships among them, the relationships of the FCM can be constructed as in Figure 3.
Then the degrees of the cause-effect relationships are described as $W = \{w_{ij} \mid w_{ij} \text{ is the value of the arc } \langle C_i, C_j \rangle \text{ in } [-1, 1]\}$, where $w_{ij}$ is the causality degree from concept $C_i$ to concept $C_j$. A represents the state spaces of the FCM in the formation of haze-fog. The concepts in the formation of haze-fog have state spaces with time series corresponding to the sequence dataset in the formation of haze-fog. Each state is a fuzzy value in [0, 1]. The state of the FCM in the formation of haze-fog at time t is represented as $A(t) = (Polut(t), Meteo(t), Haze(t))$. $Haze(t)$ is the state value of haze-fog at time t. $Polut(t) = [polut_1(t), polut_2(t), \ldots, polut_q(t)]$ is a vector representing the concentrations of pollutants at time t. $Meteo(t) = [meteo_1(t), meteo_2(t), \ldots, meteo_n(t)]$ is the state vector of the meteorological factors at time t.

The Evolution Mechanism of the Fuzzy Cognitive Map to Haze-Fog Formation
There are causal relationships from the concepts representing pollutants and meteorological conditions to the concept of haze-fog. The evolution mechanism of the formation of haze-fog follows the cognitive inference rules of FCM, which can be shown in Equation (4), where $\alpha_i$, $\beta_j$, and $\gamma_k$ are the parameters of haze intensity, pollutants, and meteorological conditions in the formation of haze-fog, respectively, $\theta_l$ is the parameter of other possible influencing factors, f is the evolution function, such as the sigmoid, and $W_p = [w_{p1}, w_{p2}, \ldots, w_{pq}]^T$ and $W_m = [w_{m1}, w_{m2}, \ldots, w_{mn}]^T$ are the relationship vectors of the pollutants and the meteorological conditions for the formation of haze-fog, respectively.
There are causal relationships among the concepts representing pollutants; the inference is shown in Equation (5). Similarly, there are causal relationships among the concepts representing meteorological conditions; the inference is shown in Equation (6), where $S_i$ is the set of nodes associated with $C_i$, $w_{ji}$ is the causal value from $C_j$ to $C_i$, and $\lambda_i$, $\lambda_j$, $\eta_i$, $\eta_j$ are the state parameters in the FCM. The logistic function shown in Equation (3) can be chosen as the transformation function. Once the initial state has been determined, the state values of the concepts in the formation of haze-fog evolve with time in accordance with the nonlinear dynamic inference rules represented by Equations (3)-(6), and may finally end in one of three states: a fixed point, a limit cycle, or a chaotic attractor.
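To make the evolution toward a fixed point concrete, the sketch below iterates the generic FCM rule $A(t + 1) = f(A(t)\,W)$ with a logistic transformation until the state stops changing. It does not reproduce Equations (4)-(6): the five concepts, every weight, the initial state, and the function name `run_until_fixed_point` are hypothetical values chosen purely for illustration.

```python
import numpy as np

def logistic(x, steepness=1.0):
    # Continuous transformation into (0, 1); steepness is an assumed parameter.
    return 1.0 / (1.0 + np.exp(-steepness * x))

def run_until_fixed_point(weights, initial_state, max_steps=200, tol=1e-6):
    """Iterate A(t+1) = f(A(t) W); stop when successive states differ by < tol."""
    state = np.asarray(initial_state, dtype=float)
    for step in range(1, max_steps + 1):
        next_state = logistic(state @ weights)
        if np.max(np.abs(next_state - state)) < tol:
            return next_state, step, True
        state = next_state
    # No fixed point found: the map may be cycling or behaving chaotically.
    return state, max_steps, False

# Hypothetical concepts, ordered as (Polut, Meteo, Haze) in the state vector.
concepts = ["SO2", "NO2", "PM2.5", "humidity", "haze"]
W = np.zeros((5, 5))      # W[j, i] = assumed causal weight from C_j to C_i
W[0, 2] = 0.5             # SO2 -> PM2.5 (secondary pollution)
W[1, 2] = 0.4             # NO2 -> PM2.5
W[3, 2] = 0.3             # humidity -> PM2.5
W[2, 4] = 0.7             # PM2.5 -> haze
W[3, 4] = 0.5             # humidity -> haze

A0 = np.array([0.8, 0.6, 0.3, 0.7, 0.1])   # assumed initial fuzzy states
final_state, steps, converged = run_until_fixed_point(W, A0)
```

Under this plain rule, concepts with no incoming edges (here SO2, NO2, and humidity) settle at f(0) = 0.5 after one step. In practice such input concepts are often clamped to observed values or given a memory term, which may be the role of the state parameters mentioned above.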

The Approach to Data Mining of the Fuzzy Cognitive Map for Haze-Fog Formation
How to discover the degrees of cause-effect is a key issue in fuzzy cognitive mapping for the formation of haze-fog.Massive sequential data is generated in the process of haze-fog formation.It can be considered as a nonlinear evolution of fuzzy cognitive map for the formation of haze-fog, and the degrees of cause-effect relationship need a data mining method of fuzzy cognitive map in order to be discovered.
Taking the genetic algorithm in evolutionary learning as an example, the workflow for time series mining of the fuzzy cognitive map for the formation of haze-fog is shown in Figure 4. The key procedures are the initialization method, the determination of the mining end condition, and the optimization method. The causality values can be mined by these methods. The main steps mine the degrees in the fuzzy cognitive map by simulating the nonlinear changes of the fuzzy cognitive map in haze-fog formation, driven by multidimensional sample data. The genetic algorithm is depicted in Table 1.
Table 1. The genetic algorithm of the fuzzy cognitive map.

Inputs: Sample data from the process of haze-fog formation.
Step 1. Initialize the parameters of the genetic algorithm and the FCM within the known range.
Step 2. Generate the initial population based on the operator.
Step 3. Calculate the fitness function according to the time series data.
Step 4. Evolve the population.
Step 5. Return to Step 3 until the fitness function is maximized (i.e., the mining end condition is met) after finite iterations.
Outputs: The relationship degrees in the fuzzy cognitive map for haze-fog formation.
Each correlation degree in the FCM can be defined as a gene. Assuming that there are N genes, these degrees (weights) can be compiled as a real vector that represents an N-dimensional chromosome, $W = [w_1, \ldots, w_i, \ldots, w_N]$. Within the range of each gene, Logistic mapping is used to create a population of M initial chromosomes $\{W^{(1)}, \ldots, W^{(u)}, \ldots, W^{(M)}\}$.
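The initialization can be sketched as follows, assuming that "Logistic mapping" refers to the chaotic logistic map $x_{k+1} = \mu x_k (1 - x_k)$ with $\mu = 4$, and that every gene ranges over $[-1, 1]$. The function names, the seed value, and the population sizes are illustrative assumptions rather than settings taken from the experiment.

```python
def logistic_map_sequence(seed, length, mu=4.0):
    """Chaotic logistic map x_{k+1} = mu * x_k * (1 - x_k); values stay in [0, 1]."""
    values = []
    x = seed
    for _ in range(length):
        x = mu * x * (1.0 - x)
        values.append(x)
    return values

def init_population(num_chromosomes, num_genes, low=-1.0, high=1.0, seed=0.3141):
    """Build M chromosomes of N genes, each gene rescaled into [low, high]."""
    raw = logistic_map_sequence(seed, num_chromosomes * num_genes)
    scaled = [low + (high - low) * x for x in raw]
    return [
        scaled[u * num_genes:(u + 1) * num_genes]
        for u in range(num_chromosomes)
    ]

# Hypothetical sizes: M = 6 chromosomes, N = 8 genes (weights) each.
population = init_population(num_chromosomes=6, num_genes=8)
```

The seed should avoid the fixed points of the map (0 and 0.75 for $\mu = 4$), since a sequence started there never spreads over the gene range.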
The simulated state at time t can be calculated from the previous state at time t − 1 and the evolution rules of Equations (4)-(6). Assuming that the simulated states of the fuzzy cognitive map at time t are $\hat{A}(t) = \{\hat{A}_1(t), \ldots, \hat{A}_i(t), \ldots, \hat{A}_v(t)\}$ and the actual states (from the measured data) are $A(t) = \{A_1(t), \ldots, A_i(t), \ldots, A_v(t)\}$, the error is given as the difference between the actual value $A(t)$ and the simulated value $\hat{A}(t)$. This error is the basis of the fitness function definition, such as the one given in Equation (7). The higher the fitness is, the better the fit to the time series data. When the fitness function reaches the threshold value, the process ends.
The individual weights are optimized through crossover and mutation in the genetic algorithm. A new individual is generated by the crossover operation on two parent individuals. The mutation operation changes individual components of the real-number code. After crossover and mutation, the best individuals are selected from the new individuals and their parents, and they enter the next generation for the purpose of optimization.
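The loop described above can be sketched in Python as follows. This is an illustration under stated assumptions, not the experiment's implementation: the fitness is assumed to be $1/(1 + \text{mean squared one-step error})$, which may differ from Equation (7); the FCM step is the generic logistic rule; the initial population is drawn uniformly for brevity; and all function names, rates, and sizes are hypothetical.

```python
import math
import random

def logistic(x):
    return 1.0 / (1.0 + math.exp(-x))

def predict_next(state, chromosome):
    """Generic FCM step; chromosome is the n*n weight matrix flattened row-wise."""
    n = len(state)
    return [
        logistic(sum(chromosome[j * n + i] * state[j] for j in range(n)))
        for i in range(n)
    ]

def fitness(chromosome, series):
    """Assumed fitness 1 / (1 + mean squared one-step error); higher is better."""
    total, count = 0.0, 0
    for prev, actual in zip(series, series[1:]):
        simulated = predict_next(prev, chromosome)
        total += sum((a - s) ** 2 for a, s in zip(actual, simulated))
        count += len(actual)
    return 1.0 / (1.0 + total / count)

def crossover(parent_a, parent_b, rng):
    """Single-point crossover of two real-coded parents."""
    point = rng.randrange(1, len(parent_a))
    return parent_a[:point] + parent_b[point:]

def mutate(chromosome, rng, rate=0.1, scale=0.2):
    """Perturb each gene with probability `rate`, clipped to [-1, 1]."""
    mutated = []
    for gene in chromosome:
        if rng.random() < rate:
            gene = max(-1.0, min(1.0, gene + rng.gauss(0.0, scale)))
        mutated.append(gene)
    return mutated

def evolve(series, pop_size=30, generations=200, threshold=0.999, seed=0):
    """Evolve weight chromosomes until the fitness threshold or generation limit."""
    rng = random.Random(seed)
    n = len(series[0])
    population = [
        [rng.uniform(-1.0, 1.0) for _ in range(n * n)] for _ in range(pop_size)
    ]
    for _ in range(generations):
        ranked = sorted(population, key=lambda c: fitness(c, series), reverse=True)
        if fitness(ranked[0], series) >= threshold:
            break  # mining end condition reached
        parents = ranked[: pop_size // 2]  # the better half survives unchanged
        children = [
            mutate(crossover(rng.choice(parents), rng.choice(parents), rng), rng)
            for _ in range(pop_size - len(parents))
        ]
        population = parents + children
    return max(population, key=lambda c: fitness(c, series))
```

Keeping the better half of each generation alongside the new children means the best fitness never decreases from one generation to the next, which matches the selection from "new individuals and their parents" described above.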

Experiments and Results
The meteorological data from NCEP (National Centers for Environmental Prediction) in the USA have been used [48]. The meteorological data cover temperature, pressure, relative humidity, and wind speed. The pollutant and haze-fog data have been obtained from the datacenter of the MEP (Ministry of Environment Protection) of China. The pollutant data cover PM2.5, SO2, NO2, and CO. These daily data have been chosen from 23 October 2014 to 6 January 2015, for Beijing.
According to the study in Section 2.2 and the data characteristics, temperature, pressure, relative humidity, and wind speed are identified as the meteorological concepts of the fuzzy cognitive map for haze-fog formation. According to the study of pollutants and their characteristics summarized in Section 2.2, PM2.5, SO2, NO2, and CO have been identified as the pollutant concepts of the fuzzy cognitive map for haze-fog formation. Haze-fog itself has been identified as a concept of the fuzzy cognitive map, so that its formation can be examined.
Accordingly, one weight matrix has been constructed for meteorological conditions and haze-fog. Another weight matrix has been constructed for pollutants and haze-fog. The concepts included in the FCM are temperature, pressure, relative humidity, wind speed, PM2.5, SO2, NO2, CO, and haze-fog. A genetic algorithm is chosen as the data mining approach for the FCM. The experiment has been run and the values are reported below:
• recombination method: single-point crossover.
Through the experiments, the key influencing factors can be analyzed using the cause-effect weights in the formation of haze-fog. Among the meteorological conditions, wind speed has a stronger influence on the haze-fog in Beijing. Among the pollutants, PM2.5 is the main pollutant influencing the formation of haze-fog in Beijing.
In order to verify the fuzzy cognitive map for the formation of haze-fog, the intensity of haze-fog formation is represented as a fuzzy value in the fuzzy cognitive map, and a simple rule of validity determination is formulated. The fuzzy intensity has been divided into four intervals: clear cases are in [0, 0.25), slight haze in [0.25, 0.50), mild haze in [0.50, 0.75), and severe haze in [0.75, 1]. Since the actual value is an intensity level (such as fog or haze) that corresponds to an interval, while the forecast result is a fuzzy value, a rule is defined for determining the validity of the FCM and its data mining: if the forecast value lies in the interval of the corresponding actual intensity, the number of valid forecasts is increased by one; otherwise, the number of invalid forecasts is increased by one.
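The rule above can be sketched in Python as follows. The interval boundaries and level names follow the text; the function names and the sample forecasts in the usage lines are hypothetical.

```python
# Intensity levels with their half-open intervals [low, high); the last one is closed at 1.
LEVELS = [
    ("clear", 0.00, 0.25),
    ("slight haze", 0.25, 0.50),
    ("mild haze", 0.50, 0.75),
    ("severe haze", 0.75, 1.00),
]

def level_of(value):
    """Map a fuzzy intensity in [0, 1] to the name of its interval."""
    if not 0.0 <= value <= 1.0:
        raise ValueError("fuzzy intensity must lie in [0, 1]")
    for name, low, high in LEVELS:
        if low <= value < high:
            return name
    return LEVELS[-1][0]  # value == 1.0 belongs to the closed last interval

def count_valid(forecasts, actual_levels):
    """Apply the validity rule and return (valid, invalid) counts."""
    valid = invalid = 0
    for value, actual in zip(forecasts, actual_levels):
        if level_of(value) == actual:
            valid += 1
        else:
            invalid += 1
    return valid, invalid

# Hypothetical forecasts compared against observed intensity levels.
valid, invalid = count_valid(
    [0.10, 0.30, 0.80], ["clear", "mild haze", "severe haze"]
)
```

In this made-up sample, the second forecast (0.30) falls in the slight-haze interval while the observed level is mild haze, so it is counted as invalid.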
A valid forecast means that the forecast value lies in the range of the corresponding actual intensity. To validate the approach further over multiple runs, the Kappa Index of Agreement (K) is used to evaluate the outputs, as expressed by Equation (8), where $P_A$ is the observed consistency and $P_e$ is the expected consistency. If all forecast results are valid, K equals 1. The larger the Kappa value, the better the consistency:
$$K = \frac{P_A - P_e}{1 - P_e} \qquad (8)$$
However, the following points still need to be studied further, especially as the formation conditions become increasingly complex:
(1) Quantitative dynamic models need to be further developed for the formation of haze-fog under increasingly complex scenarios.
(2) The relationships among the factor concepts in the formation of haze-fog need to be well recognized and modeled.
(3) The dynamic and nonlinear changes need to be further simulated for forecasting the formation of haze-fog.
Thus, a new view on the formation of haze-fog that addresses the above issues has been proposed. It targets the complex relationships in the formation of haze-fog using fuzzy cognition and data mining. The research scheme has been put forward in two parts. One is the construction of the FCM for the formation of haze-fog, including its evolution mechanism. The other is the data mining approach, which discovers the fuzzy cognitive relationships (degrees) from the dynamic sequential data.
The preliminary experimental results based on the genetic algorithm indicate that nonlinear cognitive modeling of the formation of haze-fog can support better forecasting and analysis of the key influencing factors. Of course, many aspects remain to be improved, such as the involvement of more influencing factors, parameter setting, comparison across several experiments, accuracy, and operational efficiency. Nevertheless, this research has proposed a model and method for simulating the dynamic and complex relationships among the influencing factors and haze-fog and, to some extent, demonstrated theoretical and practical significance for the formation of haze-fog under increasingly complex meteorological conditions and pollutants. This can provide a basis for forecasting and analyzing the formation of haze-fog, and important decision support for emergency early warning and control of haze-fog.

Figure 2. A simple fuzzy cognitive map.
• C = {C1, C2, …, Cn} is the set of n nodes of a graph, which, in general, represents the set of concepts of a system.
• W: (Ci, Cj) → wij is a function that assigns to each of the n × n pairs of concepts (Ci, Cj) a value in the range −1 to 1, with wij denoting the weight of the directed edge from Ci to Cj if i ≠ j, and wij equal to zero if i = j. Thus, W (n × n) = (wij) is a connection matrix.
Perspective: The Fuzzy Cognitive Map for Haze-Fog Formation
3.1. The Construction of the Fuzzy Cognitive Map for Haze-Fog Formation
From the view of cognition, there are cause-and-effect relationships between the pollutants, meteorological conditions, other possible factors, and the formation of haze-fog. The factors and haze-fog can be modeled as concept nodes of a fuzzy cognitive map in the formation of haze-fog. The concept nodes C = {C1, C2, …, Cn} are defined as the set of concepts in the formation of haze-fog.
A represents the state space of the FCM in the formation of haze-fog. The concepts in the formation of haze-fog have state spaces with time series corresponding to the sequential dataset of haze-fog formation. Each state is a fuzzy value in [0, 1]. The state of the FCM at time t consists of three parts: the state value of haze-fog at time t, a vector representing the concentrations of the pollutants at time t, and a state vector of the meteorological factors at time t.

Figure 3. FCM in the formation of haze-fog.


Figure 4.The genetic algorithm procedure of the fuzzy cognitive map for haze-fog formation.
