Dynamic Correlation Analysis Method of Air Pollutants in Spatio-Temporal Analysis

Pollutant analysis and pollution source tracing are critical issues in air quality management, in which correlation analysis is important for modeling pollutant relations. A dynamic correlation analysis method is proposed to meet the real-time requirement of atmospheric management. Firstly, a spatio-temporal analysis framework is designed, in which the processes of data monitoring, correlation calculation, and result presentation are defined. Secondly, the core correlation calculation is improved with adaptive data truncation and grey relational analysis. Thirdly, based on the general framework and the correlation calculation, the complete algorithm is proposed for various analysis tasks in time and space, providing the data basis for ranking pollutant effects and making decisions. Finally, experiments were conducted with data monitored in an industrial park in Hebei Province, China. Different pollutants at multiple monitoring stations were cross-analyzed, and the time-varying correlation degrees obtained with the proposed and contrast methods were compared. The results show that the proposed dynamic correlation analysis can quickly acquire atmospheric pollution information and can help deduce the influence relations of pollutants across multiple locations.


Introduction
With the rapid expansion of society and the economy, pollutants and their sources are emerging as threats to indoor and outdoor air quality, although various measures have been taken to control pollution. In practice, many information systems have been established to monitor pollutant discharge. These systems usually provide real-time monitoring and trend prediction, which give the administrator and decision maker only basic information. The influence relation among pollutants, however, is also important for the management of the environment and public health [1,2]. There is an urgent demand to explore the influence relation of pollutants and the potential pollution sources. This paper focuses on an analysis method for pollutants and sources, which can provide a solution to these emerging issues in air quality management.
The issue of pollutant relation and source tracing belongs to the spatial and temporal analysis of atmospheric variables [3,4]. To study the issue, some researchers have explored fluid mechanics and probability models, such as the Gaussian plume model [5], Gaussian puff model [6], state-space model [7], and hidden Markov model [8]. These models simulate the gas diffusion process from the source to the surrounding area. This category of models is built on mechanism analysis, which relies heavily on professional knowledge of environmental sciences and physics. Besides, for source tracing, the models are difficult to apply in reverse, that is, to find the pollution source from the gas distribution. The other category of analysis methods is the data-driven solution, in which implicit information is extracted from data with statistical and information processing methods, such as spatial-temporal statistics [9–11], functional data analysis [12,13], and correlation analysis [14–16]. Spatial-temporal statistics focus on the statistical parameters of historical data. Functional data analysis builds a regression model from the data features. Correlation analysis focuses on the numerical relationship of variables, expressed as an intuitive and clear correlation degree. The methods above rely on a certain amount of data and output a general condition for a period, so they lack timeliness and dynamic features.
Different deficiencies exist in the methods above, which are introduced in detail in the Related Work section. For atmospheric environment management, there are some practical problems. Firstly, air pollutants change markedly within a season and even within a day. Secondly, pollutant diffusion is affected by production activities in industrial parks, where different factories can lead to diverse gas diffusion patterns. Thirdly, multiple variables interact at a single point, as do multiple positions. This complicated interaction effect is a severe problem in practical analysis. In brief, there is a gap between practical demand and the existing methods. The correlation must be analyzed dynamically in real time. Besides, spatial correlation analysis should be conducted to uncover pollution source information intuitively and rapidly.
To address the problems above, a data-driven dynamic spatio-temporal correlation analysis method is proposed. The method emphasizes the correlation degree of pollutant variables and positions; the process runs dynamically, and the results directly support the analysis of influence relations and source tracing. The method is designed to consider inference across multiple positions in the spatial dimension and dynamic real-time calculation in the temporal dimension. The case experiment is carried out with the monitoring data of an industrial park in Hebei Province, China.
The rest of this paper is organized as follows. Section 2 introduces the related work, including the spatial distribution model and correlation analysis method. In Section 3, the main spatio-temporal framework and method are proposed. Experiments are conducted in Section 4, and the results are discussed in Section 5. Finally, the study of the paper is concluded in Section 6.

Related Work
As mentioned in the Introduction, the main tools to analyze air pollutants in the spatial and temporal dimensions include the gas diffusion model, spatial-temporal statistics, functional data analysis, and correlation analysis method. The basic principle and related studies are presented in this section. They are also analyzed under the management demand of an industrial atmospheric environment.

Gas Spatial Diffusion Model
The spatial distribution is a fundamental feature of atmospheric elements. It plays a vital role in the analysis of pollutant diffusion and its influence on the surroundings. Classical models have been built for gas diffusion analysis, of which the Gaussian plume model [5] and the Gaussian puff model [6], both based on the probability model, are representative. The probability model computes posterior probability statistics of gas diffusion at a specific time point from the prior probability and judges the diffusion parameters by the probability value. Many researchers use the Gaussian model to calculate the concentration distribution of leakage media under different conditions, as well as its variation over time.
In the study and application of the Gaussian model [17–19], some work focuses on gas diffusion with a known emission source. The default coordinate system takes the emission source as the origin, with the wind direction and the directions perpendicular to it as axes. In the Gaussian model with established parameters, only the position in the three directions and the emission time are needed to calculate the gas concentration at the specified position. Other work focuses on gas coverage. Given specified parameters (standard deviation of source strength, etc.) and gas concentration, the approximate gas coverage can be found based on the model.
The main role of the Gaussian model is forward analysis, in which the pollutant diffusion and distribution are obtained from the source information. However, pollution source tracing requires backward inference, which finds the source strength from the gas distribution. In the backward case, the Gaussian model is difficult to reverse because of its many assumed parameters. The reversed model outputs different inference results for the source when some of the parameters are inaccurate. Hence, the diffusion model has a distinct shortcoming for inferring variable influence and tracing sources.
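The forward calculation described above can be illustrated with a minimal sketch. It assumes the textbook steady-state Gaussian plume equation with ground reflection, which the paper itself does not reproduce; the function name `gaussian_plume` and the linear dispersion rates `a_y` and `a_z` are hypothetical choices made only for illustration.

```python
import math

def gaussian_plume(q, u, x, y, z, h, a_y=0.08, a_z=0.06):
    """Steady-state Gaussian plume concentration at (x, y, z).

    q: source strength (g/s); u: wind speed (m/s) along the +x axis;
    h: effective release height (m). The source sits at the origin.
    a_y, a_z: HYPOTHETICAL dispersion rates, assuming sigma = a * x.
    """
    if x <= 0:
        return 0.0  # no concentration upwind of the source
    sigma_y = a_y * x
    sigma_z = a_z * x
    lateral = math.exp(-y ** 2 / (2 * sigma_y ** 2))
    # The second exponential is the ground reflection (image source at -h).
    vertical = (math.exp(-(z - h) ** 2 / (2 * sigma_z ** 2))
                + math.exp(-(z + h) ** 2 / (2 * sigma_z ** 2)))
    return q / (2 * math.pi * u * sigma_y * sigma_z) * lateral * vertical

# Ground-level concentration 500 m downwind on the plume centerline.
centerline = gaussian_plume(q=1.0, u=2.0, x=500.0, y=0.0, z=0.0, h=10.0)
```

The result is linear in the source strength but depends nonlinearly on the wind and dispersion parameters, which gives a sense of why the backward inference of the source is sensitive to inaccurate parameters.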

Spatial-Temporal Statistics and Functional Data Analysis
Spatial and temporal analysis has drawn attention in various geographic information systems, including atmospheric monitoring. The classical methods include spatial-temporal statistics and functional data analysis. Spatial-temporal statistics [9–11] mainly analyze the mutual structure of spatial distribution and the features of time series. The spatial distribution pattern is estimated from the first-order (large-scale samples) structure and the second-order (small-scale or local samples) structure, and unsampled spatial regions are predicted or interpolated from the estimated results. Functional data analysis [12,13] mainly transforms the original discrete data into a functional form, so as to explore the correlation between the data through analysis of the functions.
Scholars have applied spatial-temporal statistics and functional data analysis to environmental issues. In these studies [20–22], statistical parameters are obtained and converted into functions. The functions are fitted to the data trends with least squares, variance analysis, maximum likelihood estimation, etc. The data can then be analyzed through the mapping relation given by the functions.
For spatial-temporal statistics and functional data analysis, there are some difficulties in meeting the real-time analysis demand of our problem. Firstly, the methods mainly analyze a period of time, so the results describe past conditions; running the methods in real time still needs exploration. Secondly, a fundamental condition of the methods is sufficient data from many points over a long period. The statistical results may be unreliable if the available samples are insufficient. Thirdly, accurate regression of a function is difficult because the data are highly nonlinear and nonstationary, and the analysis results depend mainly on how well the function fits the data. Hence, the statistical and functional methods are difficult to apply to concrete problems with dynamic demands.

Correlation Analysis Method
Correlation analysis of atmospheric pollutants is a simple and effective way to determine the influencing factors and trace the pollution source. According to the literature, the mainstream correlation analysis methods include partial correlation [14,23], principal component analysis [15,24], and grey correlation analysis [16,25], which have been applied widely in different fields.
The partial correlation analysis method addresses problems with three or more variables. It analyzes the correlation between two variables with the effect of the remaining variables removed. In partial correlation, the correlation coefficient R or R² is set as the criterion for the correlation degree. Li et al. [23] applied partial correlation analysis to the impact of market elements on the domestic stock market. Porth et al. [26] studied the nutrient resource allocation between plant growth and recuperation based on the partial correlation of gene expressions. Olszewski et al. [27] analyzed the longitudinal correlation between two particles in heavy-ion collisions and extracted the relationship between partial covariance and conditional covariance, which showed the feasibility of the statistical method in physics.
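As a minimal sketch of the idea, the first-order partial correlation of two variables given a third can be computed from the three pairwise Pearson coefficients. The helper names `pearson` and `partial_corr` and the synthetic data are illustrative assumptions and are not taken from the cited studies.

```python
import math
import random

def pearson(a, b):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((p - mean_a) * (q - mean_b) for p, q in zip(a, b))
    var_a = sum((p - mean_a) ** 2 for p in a)
    var_b = sum((q - mean_b) ** 2 for q in b)
    return cov / math.sqrt(var_a * var_b)

def partial_corr(x, y, z):
    """Correlation of x and y with the linear effect of z removed."""
    r_xy, r_xz, r_yz = pearson(x, y), pearson(x, z), pearson(y, z)
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))

# Synthetic example: x and y are both driven by z plus independent noise.
random.seed(0)
z = [random.gauss(0, 1) for _ in range(5000)]
x = [v + 0.5 * random.gauss(0, 1) for v in z]
y = [v + 0.5 * random.gauss(0, 1) for v in z]
plain = pearson(x, y)            # large, because both follow z
partial = partial_corr(x, y, z)  # close to zero once z is controlled for
```

In this constructed case the plain correlation is high only because of the shared driver, and the partial correlation removes that effect.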
Principal component analysis aims at obtaining an independent comprehensive index, namely the principal component, by synthesizing a variety of indicators. The principal component index is expected to retain almost all the information in the initial data. Calce et al. [28] applied principal component analysis to the standard evaluation of osteoarthritis. Lionnie et al. [29] established a biometric pattern recognition system, in which principal component analysis extracts features mathematically and statistically; cross-validation confirmed the validity of the fusion method. Cai et al. [30] proposed a detection and location method for disturbances in the power system, in which principal component analysis was fused with k-nearest neighbor analysis.
Grey system theory has been studied widely in various fields, and the grey relational analysis method is broadly used in assessment systems. Grey relational analysis refers to the quantitative description and comparison of the development and change trend of a system. It determines closeness by judging the geometric shape similarity between the reference sequence and several comparative sequences. Fu et al. [31] studied the relationship between the air quality indexes of Beijing and its surrounding region with the grey convex relation model. Cao et al. [32] tried to determine the main influence factors of the atmospheric corrosion of Q235 carbon steel with a grey relational analysis method. Hashemi et al. [33] built a comprehensive green supplier selection model, in which the analytic network process was used to deal with the interdependencies between the criteria, based on an improvement of traditional grey relational analysis. Malekpoor et al. [34] applied grey relational analysis to sustainable electricity generation planning, in which the evaluation and ranking of systems were determined with grey interval values.
It can be seen that correlation analysis methods perform differently in concrete applications, so an appropriate method should be selected according to the specific demand. The grey correlation analysis method has a simple and reliable structure with a moderate calculation scale, and it does not require a large sample size. It is therefore more suitable for real-time and fast analysis. Besides, most studies use the methods in a static view, in which a constant correlation value is obtained from a period of historical data. In practice, the real-time correlation at different time points needs to be analyzed. Therefore, the correlation analysis method should be improved to work dynamically over time.

Dynamic Spatio-Temporal Correlation Analysis Method
There are some practical demands in air quality management. Firstly, beyond the existing real-time monitoring and future prediction, it is expected to explore and trace the pollution source region. Secondly, based on the review of related work, the data-driven correlation analysis method can help infer the influence variables and the possible source region. Thirdly, the analysis result must be obtained in time and cover multiple dimensions, including the pollutant variables and locations. Therefore, the dynamic spatio-temporal correlation analysis method is designed. The general framework and the basic dynamic correlation method are presented first, and then the spatio-temporal correlation analysis algorithm is given.

Spatio-Temporal Correlation Analysis Framework
Based on the demand analysis of industrial atmospheric management, the correlation analysis should meet three needs: (1) the interaction of multiple pollutant variables should be explored, (2) the influence of different locations should be analyzed, and (3) the analysis should be conducted in real time based on the monitoring system. A comprehensive correlation analysis framework is therefore designed, as shown in Figure 1. The framework in Figure 1 mainly consists of three parts, namely, the data source, the core analysis method, and the result presentation.
For the data source, the atmospheric monitoring system is set as the infrastructure. Taking the air monitoring grid in China as an example, monitoring stations have been established with a grid layout, in which the equipment is placed at the intersection of the rectangular mesh. The monitoring grid is expected to increase the measurement coverage, and each station can reflect the circumjacent air conditions. The monitoring stations provide data in the framework, and the data consist of multiple pollutant variables with a certain frequency.
For the core analysis method, a dynamic correlation method is studied in this paper, which is introduced in Section 3.2. The method can output the correlation between the pollutant variables, as well as the correlation between monitoring points.
For the result presentation, various forms can be selected according to the data types. The pollutant variable correlation is a time series, which can be shown as a curve graph. The correlation of points is a two-dimensional cross-relation with time features. Moreover, the pollutant and point correlations can be integrated; for example, pollutant variable A at point 1 can be analyzed with variable B at point 2. The integrated result can then be queried in an appropriate form.

Dynamic Correlation Calculation
In the spatio-temporal correlation analysis framework, the vital component is the dynamic correlation analysis method. The concrete applications are conducted based on the correlation analysis. For the need of dynamic calculation, the method is studied with information entropy and grey relational analysis.

Adaptive Sliding Window with Information Entropy
In traditional correlation analysis, the result is static and based on all historical data. In the dynamic method, the correlation should be calculated in time at a small time interval. Considering the computing load and speed, the calculation cannot cover all historical data repeatedly. A sliding window is a useful tool to reduce the amount of calculation. However, a fixed-length window may be ineffective: data features may be lost if the window is short, while the computing load may increase if the window is long. Therefore, information entropy is introduced to make the sliding window adaptive.
Information entropy can extract data variation characteristics quantitatively and effectively. The change of time-series data can be mapped to a scalar of data fluctuation based on information entropy. Then, a rational threshold can be set to distinguish the data fluctuation range, and it can guide the sliding window length in the correlation analysis.
In the concrete design, the sliding window length is adjusted according to the time-series features. When the near-term data change smoothly, the sliding window is lengthened to expand the data range and cover more data characteristics. When the data fluctuate severely, the window is shortened, which reduces the correlation analysis range and improves the identification of instantaneous regional characteristics. The adjustment also improves the calculation efficiency by avoiding redundant computing. Following this idea, an adaptive sliding window determination method is proposed based on information entropy [35].
(1) The default window length $L_0$ is given first. The minimum of $L_0$ should be 10, and its maximum should be less than ten percent of the total number of data. At each time point, the previous $L_0$ values are used to measure the time series variation. The mean value of the segment is calculated:
$$m = \frac{1}{L_0} \sum_{i=1}^{L_0} d_i \qquad (1)$$
where $i$ is the time point, $m$ is the mean value of the data segment, and $d_i$ is the $i$-th value in the data segment.
(2) The variation of the time series is measured with the definition of the data fluctuation scalar $z_i$: [Equation (2)]
(3) The data fluctuation scalar is converted into a probability measure $p_i$, which reflects the change degree of a single point relative to the change degree of the whole intercepted data segment, expressed in percentage form:
$$p_i = \frac{z_i}{\sum_{j=1}^{L_0} z_j} \qquad (3)$$
(4) The information entropy is applied to transform the probability measure into the data fluctuation characteristic. Concretely, the changes of each data point are transformed into probabilities, and the information entropy is calculated from the change characteristics carried in the intercepted data. The information entropy $H$ is calculated as follows:
$$H = -\sum_{i=1}^{L_0} p_i \log_2 p_i \qquad (4)$$
The adjustment proportion of the sliding window length is defined as [Equation (5)], where $H_0 = \log_2 L_0$ is the maximum information entropy value in the current data segment, and the new window length $L$ is defined as [Equation (6)], where $s_{\min}$ and $s_{\max}$ are the stability thresholds, with $s_{\min} = \min p_i$ and $s_{\max} = \max p_i$.
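The steps above can be sketched in Python as follows. This is a rough illustration under stated assumptions rather than the authors' implementation: the fluctuation scalar is assumed to be the absolute deviation from the segment mean, and the final length rule (scaling the default length between 0.5 and 1.5 times by the entropy ratio) is a hypothetical stand-in for the adjustment proportion and threshold rule. The function name `adaptive_window` and the parameter `l_min` are also hypothetical.

```python
import math

def adaptive_window(history, l0=24, l_min=10):
    """Return an adjusted window length from the last l0 values of history."""
    segment = history[-l0:]
    n = len(segment)
    if n < 2:
        raise ValueError("need at least two values to measure fluctuation")
    mean = sum(segment) / n                       # segment mean
    z = [abs(d - mean) for d in segment]          # ASSUMED fluctuation scalar
    total = sum(z)
    if total == 0:
        entropy = math.log2(n)                    # flat data: treat as maximally smooth
    else:
        p = [v / total for v in z]                # probability measure
        entropy = -sum(q * math.log2(q) for q in p if q > 0)
    ratio = entropy / math.log2(n)                # relative to the maximum entropy
    # HYPOTHETICAL rule: smooth data (high ratio) lengthen the window,
    # concentrated fluctuation (low ratio) shortens it.
    return max(l_min, round(n * (0.5 + ratio)))

ramp = [float(i) for i in range(100)]             # smooth trend
spike = [1.0] * 99 + [100.0]                      # one abrupt jump at the end
len_ramp = adaptive_window(ramp)
len_spike = adaptive_window(spike)
```

Under these assumptions, a segment whose fluctuation is concentrated in one point yields a lower entropy and therefore a shorter window than a smoothly changing segment.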

Grey Relational Analysis
As introduced in the related work, the grey relational analysis, which is based on grey theory, seeks and defines the quantitative relationship between the factors of a system. It is one of the few methods which can reflect the geometric relationship between the data intuitively. The process of grey relational analysis [16] is introduced briefly here.
(1) Define the object variable y and its potential associated variables x k , k is the serial number of associated variables, and 1 ≤ k ≤ n. The time series values in y and x k are denoted as y(i) and x k (i).
(2) The original data of object variable and associated variables should be normalized to remove the effect of different measurement units.
(3) The object variable y(i) is set as the reference sequence, and a comparison matrix is built by conducting subtraction operation on the reference sequence and the associated variable sequence x k (i).
(4) Calculate the two-level maximum difference $\max_k \max_i |y(i) - x_k(i)|$ and the two-level minimum difference $\min_k \min_i |y(i) - x_k(i)|$ in the matrix.
(5) The item value of each variable corresponding to the reference sequence is obtained, and the correlation coefficient is calculated. Then, the correlation sequence $\xi_k(i)$ can be formed, as in the following formula:
$$\xi_k(i) = \frac{\min_k \min_i |y(i) - x_k(i)| + \rho \max_k \max_i |y(i) - x_k(i)|}{|y(i) - x_k(i)| + \rho \max_k \max_i |y(i) - x_k(i)|} \qquad (7)$$
where $\rho$ is the resolution ratio, $0 < \rho < 1$. The greater the difference between correlation coefficients, the stronger the ability to distinguish, in which the difference is positively related with $\rho$. $\rho$ can be set to about 0.5 according to experience.
(6) According to the correlation sequence in Formula (7), the correlation degree between the object variable and the $k$-th associated variable is calculated:
$$r_k = \frac{1}{L} \sum_{i=1}^{L} \xi_k(i) \qquad (8)$$
where $L$ is the data size in the sliding window determined with the method in Section 3.2.1.
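The six steps can be sketched as a short function. This is a minimal illustration of the standard grey relational procedure, assuming min-max normalization for step (2), since the text does not state which normalization is used; the names `normalize` and `grey_relational_degrees` are hypothetical.

```python
def normalize(seq):
    """Min-max normalization (ASSUMED; the text only says 'normalized')."""
    lo, hi = min(seq), max(seq)
    if hi == lo:
        return [0.0 for _ in seq]
    return [(v - lo) / (hi - lo) for v in seq]

def grey_relational_degrees(y, xs, rho=0.5):
    """Correlation degree between reference y and each sequence in xs."""
    y_n = normalize(y)
    # Step (3): absolute differences between reference and each sequence.
    diffs = [[abs(a - b) for a, b in zip(y_n, normalize(x))] for x in xs]
    # Step (4): two-level minimum and maximum over all sequences and times.
    d_min = min(min(row) for row in diffs)
    d_max = max(max(row) for row in diffs)
    if d_max == 0:
        return [1.0 for _ in xs]  # every sequence coincides with the reference
    degrees = []
    for row in diffs:
        # Step (5): correlation coefficient at each time point.
        xi = [(d_min + rho * d_max) / (d + rho * d_max) for d in row]
        # Step (6): correlation degree is the mean over the window.
        degrees.append(sum(xi) / len(xi))
    return degrees

reference = [1, 2, 3, 4, 5]
same_shape = [2, 4, 6, 8, 10]     # identical shape after normalization
different = [5, 3, 4, 1, 2]
degrees = grey_relational_degrees(reference, [same_shape, different])
```

A sequence with the same geometric shape as the reference obtains the highest degree, while a differently shaped sequence obtains a lower one, which matches the shape-similarity interpretation of the method.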

Dynamic Spatio-Temporal Correlation Algorithm
Based on the correlation analysis framework, two basic tasks should be conducted with the correlation analysis methods, including the correlation of variables in one monitoring point and the correlation of different points. The algorithm is designed in this subsection for the two tasks by organizing the theoretical algorithms in Section 3.2 based on the framework in Section 3.1.
The algorithm consists of two parts: the correlation of pollutant variables at a single point, and the correlation of multiple points. The flow of the dynamic spatio-temporal correlation algorithm is shown in Figure 2. In the algorithm shown in Figure 2, the analyses of points and variables are conducted separately. In the left column, the loop over points obtains the variable correlation information at each point. In the right column, the loop over variables obtains the point correlation information.
Both outer loops contain a time recurrence to calculate the correlation dynamically. In the time recurrence, the $L_0$ data values before the current moment are first used to determine the sliding window. The window length is adjusted according to Equations (1)-(6), and the new length is $L$. Then, the $L$ values before the current moment are used to calculate the correlation degree following the grey relational method in Section 3.2.2. Finally, the results can be presented in different forms, which are shown in the experiment section.
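The two loops and the shared time recurrence might be organized as in the following sketch. The data layout `data[point][variable]`, the function names, and the callables `window_fn` and `degree_fn` (standing in for the window adjustment and the grey relational calculation) are all assumptions made for illustration.

```python
def dynamic_degrees(reference, others, l0, window_fn, degree_fn):
    """Time recurrence: one dict of correlation degrees per time point.

    reference: values of the object series.
    others: dict name -> values, each as long as reference.
    window_fn(history) -> window length; degree_fn(y, xs) -> list of degrees.
    """
    results = []
    for t in range(l0, len(reference) + 1):
        # Determine the window from the l0 values before the current moment.
        length = min(window_fn(reference[t - l0:t]), t)
        y = reference[t - length:t]
        xs = [series[t - length:t] for series in others.values()]
        results.append(dict(zip(others.keys(), degree_fn(y, xs))))
    return results

def variable_correlation(data, target, l0, window_fn, degree_fn):
    """Loop over points: correlate the target variable with the others."""
    return {
        point: dynamic_degrees(
            series[target],
            {k: v for k, v in series.items() if k != target},
            l0, window_fn, degree_fn)
        for point, series in data.items()
    }

def point_correlation(data, target_point, l0, window_fn, degree_fn):
    """Loop over variables: correlate the target point with the others."""
    return {
        variable: dynamic_degrees(
            data[target_point][variable],
            {p: s[variable] for p, s in data.items() if p != target_point},
            l0, window_fn, degree_fn)
        for variable in data[target_point]
    }

# Toy run with placeholder callables (fixed window, constant degree).
toy = {
    "HS": {"PM2.5": list(range(15)), "CO": list(range(15))},
    "No.1": {"PM2.5": list(range(15)), "CO": list(range(15))},
}
fixed = lambda history: len(history)
ones = lambda y, xs: [1.0 for _ in xs]
by_variable = variable_correlation(toy, "PM2.5", 10, fixed, ones)
by_point = point_correlation(toy, "HS", 10, fixed, ones)
```

Passing the window and correlation steps as callables keeps the recurrence independent of the specific method, so the same driver could also run a fixed window or a different correlation measure.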

Dataset and Experiment Setting
The experiment is designed and conducted to verify the proposed correlation analysis method. The monitoring data were collected in an industrial park in Hebei Province, China, as shown in Figure 3.
For the correlation analysis of multiple pollutants, various variables are analyzed at a single monitoring point (HS station). PM2.5 is set as the object variable, and the related variables include PM10, CO, temperature, and humidity. The correlation degree between PM2.5 and each of the other four variables is then calculated. In the experiment, three 10-day periods in different seasons are analyzed, namely, the middle ten days of July 2016, December 2016, and May 2017.
For the correlation analysis of multiple points, the pollutant variable is fixed (PM2.5), and the points are the main analysis object. On the one hand, the relation of any two points is tested. On the other hand, HS station is analyzed with four surrounding points, including No.1 (500 m to the east), No.2 (1000 m to the northeast), No.3 (500 m to the west), and No.4 (1000 m to the southeast). The time periods are the same as in the previous experiment.
For the multidimensional correlation analysis, the correlation degree of different variables at various points is analyzed. Because of the length limit of the paper, a few variables and points are selected from the previous two experiments. The selected relations are shown in Table 1, in which the star mark means the related matrix elements are analyzed. Moreover, the performance of the proposed method is compared with other methods. Firstly, the traditional static correlation analysis is set as the contrast, in which one constant degree is output based on the whole data segment. This method is abbreviated as "static correlation". Because the proposed method consists of the adaptive sliding window and grey relational analysis, the two parts are each replaced with a classical method to form the other contrast methods. Secondly, the sliding window length is fixed, as in the traditional calculation; this second contrast method is grey relational analysis with a fixed sliding window, abbreviated as "FSW-GRA". Thirdly, grey relational analysis is replaced with another correlation method: the classical partial correlation is selected to form the third contrast method, namely partial correlation with adaptive sliding window, abbreviated as "ASW-PC". The proposed method in this paper is abbreviated as "ASW-GRA". The contrast methods are conducted in some of the three experiments above.

Correlation of Multiple Pollutants
In this part of the experiment, the correlation between different variables is analyzed at one monitoring point. Based on the experimental settings, the correlation degrees between PM2.5 and PM10, CO, temperature and humidity are calculated in three periods. The results are shown in Figure 4, in which the three subfigures correspond to the middle ten days of three months in different seasons.
The results in Figure 4 show the change of the influence factors on PM2.5 over time. In each subfigure, the correlation degree between PM2.5 and the other four variables is calculated every hour, and the total number of data points is 240 (10 days). The correlation degrees can be ranked at each time point, and the main influence factor is not fixed across time points. Moreover, the correlation trends differ between seasons. It can be inferred from the results that no single variable should be regarded as the only and most important impact factor in general; the main factor should instead be determined according to the change over time.
Parts of the correlations above are re-analyzed with the contrast methods. Concretely, the correlations of PM2.5-PM10 in July 2016 and PM2.5-temperature in December 2016 are calculated with four methods: "static correlation", grey relational analysis with fixed sliding window ("FSW-GRA"), partial correlation with adaptive sliding window ("ASW-PC"), and the proposed method ("ASW-GRA"). The results are shown in Figure 5. Besides, the deviation between each dynamic method (the latter three) and the static correlation degree is calculated. The deviation is shown in Figure 6.
In Figure 5, the traditional static correlation degree cannot reflect the change over time. In fact, the main influence factor is not fixed, as shown in Figure 4, so the static correlation degree may lead to a wrong judgment of the influence factor. For dynamic performance, a clear distinction between different time points is expected. From this point of view, the fluctuation of our method (ASW-GRA) is larger than that of the others, which means it represents the change more markedly. ASW-GRA and FSW-GRA differ in the sliding window length. There appears to be a delay with the fixed window length, which is evident in Figure 6b. ASW-GRA and ASW-PC differ in the correlation calculation method. The deviation of ASW-GRA is larger than that of ASW-PC, although the two show a similar overall trend. The deviation reflects the difference in discrimination ability between grey relational analysis and partial correlation.
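The two quantities discussed in this part, the main influence factor at each time point and the deviation of a dynamic degree from the static degree, can be illustrated with a short Python sketch. The function names and the input layout (one dictionary of correlation degrees per hour) are assumptions made for illustration and are not taken from the paper.

```python
def dominant_factor_series(grades_over_time):
    """Name of the factor with the highest correlation degree at each time step."""
    return [max(step, key=step.get) for step in grades_over_time]


def switch_points(labels):
    """Indices at which the dominant factor differs from the previous step."""
    return [i for i in range(1, len(labels)) if labels[i] != labels[i - 1]]


def deviation_from_static(dynamic_series, static_degree):
    """Signed difference between each dynamic degree and the single static degree."""
    return [value - static_degree for value in dynamic_series]
```

With hourly degrees for PM10, CO, temperature and humidity as input, `dominant_factor_series` yields the ranking leader per hour, `switch_points` marks the hours at which the leader changes, and `deviation_from_static` produces the kind of series plotted as a deviation curve.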

Correlation of Multiple Points
In the experiment of multiple point correlation, the points are analyzed for the pollutant variable PM2.5. The correlation degree of any two points can be calculated over time, so that a two-dimensional matrix is formed at each time point. For simplicity, some results of the cross-correlation degree of any two points are presented in Figure 7.
In Figure 7, the three-dimensional mesh is drawn for the cross-correlation of any two points, where the right planar graph is the x-y view of the left 3-D mesh. Besides the general presentation of the correlation between any two points, the target point, HS station, is analyzed separately against four points, and four sets of correlation degrees are obtained. The results of the three periods are shown in Figure 8.
For the correlation degree between HS station and the surrounding points, the season has a significant effect. There is a clear distinction in the general trend between the periods. The impact level of the points can be ranked by the correlation degree, which can help to deduce the direction of the pollution source. Besides, the effect of the points may vary over time. For example, in Figure 8c, Point 4 dominates from the 50th to the 60th hour, but Point 2 overtakes it from the 60th to the 70th hour.
Different dynamic contrast methods are analyzed in one period (July 2016) for HS station and Point 1. The results of the contrast methods are shown in Figure 9, of which the subfigures show the direct result and the deviation from the static correlation.
The contrast methods perform similarly to the first experiment (Figures 5 and 6). The deviation values of ASW-GRA fluctuate more sharply than those of the other two methods, which reflects that the proposed method can distinguish the correlation degree at different time points. The dynamic property of our method is confirmed again by the data in this part.
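The two-dimensional matrix of point-to-point correlation degrees formed at each time point can be sketched in Python as follows. This is a hedged illustration only: it uses a pairwise grey relational grade with min-max scaling and rho = 0.5, takes the minimum and maximum differences over the single pair being compared, and uses a fixed window; all of these choices, together with the station names in the usage note, are assumptions and not the paper's settings.

```python
def scale(series):
    """Min-max scale a series to [0, 1]; a constant series maps to all zeros."""
    lo, hi = min(series), max(series)
    if hi == lo:
        return [0.0 for _ in series]
    return [(v - lo) / (hi - lo) for v in series]


def pair_grade(x, y, rho=0.5):
    """Grey relational grade between two equally long series (symmetric in x, y)."""
    diffs = [abs(a - b) for a, b in zip(scale(x), scale(y))]
    d_min, d_max = min(diffs), max(diffs)
    if d_max == 0:
        return 1.0
    return sum((d_min + rho * d_max) / (d + rho * d_max) for d in diffs) / len(diffs)


def cross_matrix(stations, end, window):
    """Matrix of pairwise grades over the samples in [end - window, end).

    `stations` maps a point name to its PM2.5 series; the returned name list
    gives the row and column order of the matrix.
    """
    names = list(stations)
    span = slice(end - window, end)
    matrix = [
        [pair_grade(stations[a][span], stations[b][span]) for b in names]
        for a in names
    ]
    return names, matrix
```

Evaluating `cross_matrix` for every hour gives a stack of matrices, one per time point. Taking the row for one station, for example a hypothetical key `"HS"`, gives the per-point degrees that can be ranked over time.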

Multidimensional Correlation
The previous two experiments were conducted by controlling the analysis objects, either variables or points. In this part, the variables and points are analyzed crosswise. Following the selected elements in Table 1, the correlation degrees between PM2.5 at Point 1 and CO at Point 2, and between SO2 at Point 1 and PM2.5 at Point 2, are calculated in three periods. The results are shown in Figure 10.
The contrast methods are also applied to the elements above (one set of data in one period is selected). The static correlation degree between PM2.5 (Point 1) and CO (Point 2) is 0.332, and the static degree between SO2 (Point 1) and PM2.5 (Point 2) is 0.247. The deviations between the static degree and the dynamic methods are shown in Figure 11.
The third experiment is a cross analysis of different pollutants at various monitoring points. The trend of the correlation degree is similar to that in the previous experiments, including the data change and the performance of the contrast methods. The results can help in analyzing the major influence factor from different positions.

Discussion
Correlation analysis plays an important role in atmospheric pollutant monitoring and source tracing. The central problem considered here is how to identify the main pollution influence factor in real time with directly interpretable results. To provide direct measurement and convenient analysis, a dynamic correlation calculation method is proposed, which has been tested with practical monitoring data from an industrial park in Hebei Province, China.
The method can be evaluated from two aspects. On the one hand, it fulfills the basic function of the traditional correlation analysis, as shown by the dynamic correlation degrees being distributed around the constant line in Figures 5 and 9. On the other hand, the most striking feature of the proposed method is its dynamic performance, which can be seen in the results of the different tests. Unlike the traditional statistical result, the correlation degree varies over time. This means that the impact factor on a certain pollutant variable or monitoring station is not fixed. Therefore, it is essential to obtain a real-time correlation degree to judge the main impact factor for pollution source tracing and control.
For dynamic performance, several similar methods were constructed as contrasts. For a quantitative comparison, the information entropy is introduced to represent the fluctuation degree. The results of the last experiment in Section 4.2.1 are analyzed with information entropy. The entropy is transformed and presented in Table 2, in which a larger value indicates a larger fluctuation degree. It reflects that the proposed method distinguishes the correlation degrees at each time point. The apparent change helps to find the most relevant influence factors over time. This feature of the results is the concrete manifestation of the dynamic property of the proposed method.
Because of the length limit of the paper, only some variables and points are selected and presented. In fact, the proposed method can be applied to the correlation analysis of any two factors of the same type. For example, PM2.5 is analyzed with four variables in Section 4.2.1, but any two of the five variables can be calculated following the proposed algorithm. In general, the proposed method is essentially a correlation method between variables and is not limited to the examples in the experiment. The method has been encapsulated as a program in the information management system of an industrial park in Hebei Province [36]. In the information system, multiple variables can be analyzed following the proposed method, from the view of both pollutants and positions. The dynamic correlation analysis function in the information system has helped administrators to trace the pollution source. Besides, the proposed method can provide decision-making support together with the other system functions of real-time monitoring and trend prediction [37,38].
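As an illustration of how information entropy can quantify the fluctuation of a correlation series, the following Python sketch computes the Shannon entropy of a histogram of the values. It is only one possible reading: the transformation applied to the entropy in this paper is not specified here, and the binning scheme, the shared value range `lo` to `hi`, and the function name are assumptions.

```python
import math
from collections import Counter


def fluctuation_entropy(values, lo, hi, bins=10):
    """Shannon entropy (in bits) of the values binned on the range [lo, hi].

    Using the same lo, hi and bins for every method keeps the entropies
    comparable. A series that stays in one bin gives 0; a series spread
    evenly over all bins gives log2(bins).
    """
    if not values:
        raise ValueError("values must not be empty")
    if hi <= lo:
        raise ValueError("hi must be greater than lo")
    width = (hi - lo) / bins
    # Clamp indices so that values on or outside the range edges stay in range.
    indices = (min(max(int((v - lo) / width), 0), bins - 1) for v in values)
    counts = Counter(indices)
    total = len(values)
    return -sum(c / total * math.log2(c / total) for c in counts.values())
```

Under this reading, a nearly constant correlation series concentrates in a few bins and has low entropy, while a series that visits many levels over time has higher entropy, matching the convention that a larger value means a larger fluctuation degree.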
Because the method processes the monitoring data iteratively in real time, it requires high-performance computing resources. In the future, improvements can be made to reduce the amount of computation, so that the method can be applied widely in small-scale systems and low-performance terminals. Besides, the method analyzes the correlation degree at discrete points. When the continuous distribution of the atmosphere is needed, gas diffusion methods should be explored and integrated with the method.

Conclusions
For the atmospheric management issues of pollutant interaction and source tracing, a dynamic correlation analysis method is proposed. It is designed with a convenient process and direct result measurement. The proposed method extracts the relations among pollutant variables in real time, as well as among spatial factors, and has been tested with practical monitoring data. The method offers effective support for air quality management in the modern information era. It provides a reference framework for analyzing emerging pollutants and sources affecting air quality. The correlation results can help pollution control and sustainable planning. In future work, the method can be applied to the analysis of other variables, such as particulate matter, nitrogen oxides, traffic emission, and consumer products. Besides, the method can be combined with continuous analysis models, which can output fine-grained results of atmospheric diffusion. The improved correlation analysis method will support pollution management with information mining.