Climate Finance: Mapping Air Pollution and Finance Market in Time Series

Abstract: Climate finance is increasingly used to address the challenges of climate change because it controls the funding and resources available to emitting entities and promotes green manufacturing. In this study, we take PM2.5, PM10, SO2, NO2, CO, and O3 as the target pollutants in the atmosphere and use a deep neural network to enhance regression analysis of the relationship between air pollution and the stock prices of targeted manufacturers. We conduct time series analysis based on air pollution and heavy industry manufacturing in China, as the country is facing serious air pollution problems. Our study uses two-dimensional convolutional long short-term memory (ConvLSTM2D) to extract features from air pollution data and enhance time series regression in the financial market. The main contribution of our paper is a feature term that affects the stock price in the financial market, particularly for companies that are highly affected by the local environment. By accounting for this environmental factor, our model predicts stock prices more accurately than traditional time series models. The experimental results suggest a negative linear relationship between air pollution and the stock market, which indicates that air pollution has a negative effect on the financial market. This encourages manufacturers to improve their emission recycling and to invest in green manufacturing; otherwise, a drop in stock price will affect the company's funding process.


Introduction
Growing industrialization brings serious air pollution to emerging economies such as China and India. The Paris Climate Accords were signed by 196 parties around the world in order to control the rapid exacerbation of climate change. The concentration of carbon dioxide in Earth's atmosphere is rising rapidly and exceeds 420 parts per million (ppm) according to NASA data. Containing atmospheric carbon dioxide requires significant action, which benefits humanity as a whole. Climate finance has become a popular topic in recent years, particularly after the Paris Climate Accords were signed (Bodansky 2016). By utilizing the flexibility of the financial market, resources can be reallocated to promote green industry and to offer more funding to traditional manufacturers to improve emission recycling (Buchner et al. 2019; Hong et al. 2020). Air pollution plays an important role in climate finance because it determines the concentration of carbon dioxide in the Earth's atmosphere; carbon dioxide and other pollutants block the interaction between ultraviolet rays and photosynthesis, which exacerbates global warming (Kelp et al. 2018). Emerging economies such as China and India are experiencing rapid industrialization, but polluted air is the price they pay for fast economic growth (Anwar et al. 2021). Containing global warming and decreasing air pollution are the responsibility of those countries and of all human beings (Lelieveld et al. 2015).
Few studies today model stock prices from climate conditions, and few map the relationship between climate conditions and the financial market. Air pollution may seem far removed from the stock market, but the two are likely to become more closely linked in the near future: responsible investors are increasingly concerned about the long-term returns on their investments, so interest in the relationship between air pollution and the financial market is growing (Banga 2019). Traditionally, stock prices depend on economic conditions, the revenues of specific firms, dividend policy (Banerjee et al. 2007), and the prospects of the industry (Mehtab 2020). Air pollution and climate conditions have been treated as less significant evidence in regression analysis. Meanwhile, investors and academic researchers analyze the time series pattern of stock prices using lagged values or past volatility, with mathematical tools such as stochastic calculus (Grigoriu 2013), random processes, ARIMA time series regression (Adebiyi et al. 2014), and the GARCH volatility model (Alberg et al. 2008). However, with globalization and more efficient communication in the market, the value of financial assets increasingly depends on factors from both financial and non-financial areas. The traditional models are not adequate to analyze all of these features. With the development of computer technology and deep neural networks (Pang et al. 2020), more complex models are used to capture the information that affects stock price volatility. Long short-term memory (LSTM) is one of the popular deep neural networks for capturing time series patterns, and it retains long-term information in addition to recent values (Hochreiter and Schmidhuber 1997).
On the other hand, convolutional neural networks (CNNs) are widely used in image analysis because they extract features from the image and capture local dependencies (Albawi et al. 2017). Machine learning helps market investors predict stock prices more accurately than traditional models and better forecast future market volatility. A deep neural network that combines CNN and LSTM, called ConvLSTM2D (Sari et al. 2020), has been widely used in video classification: the convolutional layers capture the information in each frame of the video, and the LSTM provides long-term memory, so the model can remember information from the past few seconds of the video.
The price of financial assets will become more reliant on climate conditions, particularly after the Paris Climate Agreement, and large globalized organizations are pushing for decarbonization. In this paper, we propose a novel structure that combines regression with a neural network to model the stock price together with an environmental factor, in order to provide a more accurate time series model of the stock price and to offer investors a sustainability view of the targeted companies. We model the stock price with a traditional time series model and use the deep neural network ConvLSTM2D to enhance the time series regression that predicts the stock price. Our model includes the autoregressive (AR) model and a feature term we call AP_1, which contains the information extracted by ConvLSTM2D from the air pollution data. The experiments use air pollution data from four major industrialized cities in China: Beijing, Taiyuan, Changchun, and Shijiazhuang. We sample the atmospheric concentrations of PM2.5, PM10, SO2, NO2, CO, and O3 as indicators of air pollution from 1 January 2015 to 31 October 2021. The initial value of AP_1 is the mean of the scaled air pollution data; back-propagation and gradient descent then update AP_1 in the regression model, and the deep neural network is retrained on the updated value. By utilizing the advantages of ConvLSTM2D, the model captures the local and long-term dependencies in air pollution, and the updated stock price regression models all of this information together. The aim of our paper is to provide an accurate model that captures information from air pollution and time series patterns and thereby demonstrates the relationship between the stock price and climate conditions.
The stock price samples in this paper come from several top-capitalization manufacturers in China that have large factories located in the selected cities: Shougang Group, Shenhua Group, and Sany Group from Beijing; Datong Coal Mining, Shanxi Coal International, and Sanxi Coking Co from Taiyuan; Hbis Group and Maanshan Iron & Steel from Shijiazhuang; and FAW Jiefang and FAWAY from Changchun.
The rest of this paper is organized as follows. Section 2 demonstrates the related background of the traditional time series model and the model selection techniques. Section 3 introduces the background of ConvLSTM2D and outlines the methodology of this paper by combining the traditional model with the deep neural network. Section 4 describes an experimental study to demonstrate the relationship between air pollution and stock price. Section 5 concludes the findings and demonstrates the future development of climate finance and green manufacturing.

Time Series Modeling
This section introduces the related background of time series analysis based on Box-Jenkins theory (Box et al. 1976, 2015).

Box-Jenkins's Method
The autoregressive integrated moving average (ARIMA) model is widely used in time series modeling. It combines the lagged values of the series with a linear combination of past regression errors. ARIMA generalizes the autoregressive moving average (ARMA) model by adding a differencing order. The ARIMA model takes the parameters (p, d, q): p is the lag order of the autoregressive (AR) part, q is the lag order of the moving average (MA) part, and d is the order of differencing (Box et al. 1976, 2015). The ARMA(p, q) model is:

$$X_t = c + \sum_{i=1}^{p} \phi_i X_{t-i} + \varepsilon_t + \sum_{j=1}^{q} \theta_j \varepsilon_{t-j} \tag{1}$$

where c is a constant, φ_i and θ_j are the AR and MA coefficients, and ε_t is the error term. Differencing of integer order d is usually applied to make the time series stationary. Using B to denote the backshift operator (Box et al. 2016), the ARIMA(p, d, q) model can be written as:

$$\left(1 - \sum_{i=1}^{p} \phi_i B^{i}\right)(1 - B)^{d} X_t = c + \left(1 + \sum_{j=1}^{q} \theta_j B^{j}\right)\varepsilon_t \tag{2}$$

The ARIMA model gives the user the flexibility to choose the parameters; setting q = 0 and d = 0 reduces ARIMA to the autoregressive model. In the experimental part of this paper, we use the autoregressive model as a linear combination of the lagged values of the current timestamp and the feature term AP_1, given as:

$$X_t = c + \sum_{i=1}^{p} \phi_i X_{t-i} + W_{AP_1} AP_1 + \varepsilon_t \tag{3}$$

where W_{AP_1} is the coefficient of the feature term AP_1. The likelihood function for the autoregressive model assumes that the data are generated by a stationary AR(p) process with zero mean (µ = 0) and Gaussian errors. Suppose there is a sample of N observations y = (y_1, ..., y_N) with c = 0, and denote the parameter vector β = (φ_1, ..., φ_p, σ^2) (Fang et al. 2021a, 2021b). The corresponding unconditional log-likelihood function is:

$$\log f(y \mid \beta) = -\frac{N}{2}\log(2\pi\sigma^{2}) - \frac{1}{2}\log|\Sigma| - \frac{1}{2\sigma^{2}}\, y^{\top}\Sigma^{-1} y \tag{4}$$

where |Σ| is the determinant of Σ and σ^2 Σ is the N × N theoretical autocovariance matrix of y.
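As an illustration of the autoregressive model with an added feature term described above, the following minimal sketch fits the lag coefficients and the feature coefficient by ordinary least squares on synthetic data. The function name `fit_ar_with_feature`, the synthetic series, and the choice of one feature value per timestamp are assumptions made for this example; it is not the authors' implementation.

```python
import numpy as np

def fit_ar_with_feature(x, ap, p):
    """Fit x[t] = c + sum_i phi_i * x[t-i] + w_ap * ap[t] + e[t] by least squares.

    x  : 1-D array of the target series (e.g. a stock price)
    ap : 1-D array with one feature value per timestamp (hypothetical AP_1 values)
    p  : autoregressive lag order
    """
    x = np.asarray(x, dtype=float)
    ap = np.asarray(ap, dtype=float)
    rows = []
    for t in range(p, len(x)):
        lags = x[t - p:t][::-1]  # x[t-1], ..., x[t-p]
        rows.append(np.concatenate(([1.0], lags, [ap[t]])))
    design = np.array(rows)
    coef, *_ = np.linalg.lstsq(design, x[p:], rcond=None)
    return {"c": coef[0], "phi": coef[1:p + 1], "w_ap": coef[-1]}

# Synthetic example: an AR(1) series with a negative feature effect.
rng = np.random.default_rng(0)
n = 500
ap = rng.uniform(0.0, 1.0, n)
x = np.zeros(n)
for t in range(1, n):
    x[t] = 0.5 + 0.6 * x[t - 1] - 2.0 * ap[t] + rng.normal(scale=0.05)

params = fit_ar_with_feature(x, ap, p=1)
print(params)
```

On this synthetic series the estimates should land close to the generating values (0.6 for the lag coefficient and -2.0 for the feature coefficient), which is only a sanity check of the fitting code, not a statement about real market data.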

Model Selection
Model selection techniques determine the order parameters of the model. Popular criteria from information theory include Akaike's Information Criterion (AIC) (Akaike 1974; Sakamoto et al. 1986), the Bayesian Information Criterion (BIC) (Neath and Cavanaugh 2012), and the Hannan-Quinn criterion (HQ) (Bierens 2004), given as:

$$\mathrm{AIC} = -2\log L + 2k, \qquad \mathrm{BIC} = -2\log L + k\log N, \qquad \mathrm{HQ} = -2\log L + 2k\log\log N$$

where N is the number of observations in the time series, L is the likelihood of the data, and k is the number of free parameters.
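The three criteria named above have standard textbook forms that are easy to compute once the maximized log-likelihood is known. The sketch below assumes those standard forms; the helper name `information_criteria` and the example numbers are hypothetical and only illustrate how the penalties compare.

```python
import math

def information_criteria(log_likelihood, k, n):
    """Return AIC, BIC and HQ for a model with k free parameters and n observations."""
    aic = -2.0 * log_likelihood + 2.0 * k
    bic = -2.0 * log_likelihood + k * math.log(n)
    hq = -2.0 * log_likelihood + 2.0 * k * math.log(math.log(n))
    return {"AIC": aic, "BIC": bic, "HQ": hq}

# Hypothetical example: maximized log-likelihood of -100 with 3 parameters, 200 observations.
ic = information_criteria(log_likelihood=-100.0, k=3, n=200)
print(ic)
```

For each candidate order p, the model with the smallest criterion value is selected; with more than a handful of observations BIC penalizes extra parameters more heavily than AIC, with HQ in between.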
In this paper, we select the order of AR(p) with Minimum Message Length (MML), a Bayesian information-theoretic criterion introduced by Wallace and Freeman. It has been shown to work well in time series analysis, including AR (Fitzgibbon et al. 2004), MA (Sak et al. 2005), and ARMA models (Fang et al. 2021b). MML is based on coding theory and assumes that a sender transmits a message to a receiver. The message is encoded in two parts: the first part is the time series model, and the second part is the data given that model. The receiver decodes the message by using the Bayesian prior. MML thus gives a quantitative information-theoretic trade-off between model complexity (the length of the first part of the message) and goodness of fit (the length of the second part of the message), and it selects the parameter with the minimum message length.
The formula of MML is:

$$\mathrm{MessLen}(\beta, y) = -\log\left(\frac{\pi(\beta)\, f(y_1, \ldots, y_N \mid \beta)\, \epsilon^{N}}{\sqrt{|F(\beta)|}}\right) + \frac{k}{2}\left(1 + \log \kappa_k\right) - \log h(k) \tag{5}$$

where π(β) is the Bayesian prior distribution over the parameter set and f(y_1, ..., y_N | β) is the likelihood function from Equation (4). ε is the measurement accuracy of the data, and F(β) is the Fisher information matrix of the parameter vector β, which plays an essential role in MML. h(k) is the prior on the order parameter p. κ_k is the lattice constant, where k is the number of free parameters; it is bounded above by 1/12 and below by 1/(2πe), and it accounts for the expected error in the log-likelihood function of the autoregressive model given in Equation (4). According to Fang et al. (2021a, 2021b), MML works well both in traditional time series modeling and in hybrid models with neural networks.

Methodology
ConvLSTM2D combines the convolutional neural network (CNN) with the LSTM model: it treats the convolutional layer as the value at each timestamp and applies LSTM to the resulting time series. Liu et al. (2017) demonstrated on traffic data that Conv-LSTM is a highly accurate deep learning model for short-term forecasting and outperforms the traditional ARIMA model and LSTM. By combining the advantages of LSTM and CNN, ConvLSTM2D is able to retain long-term information for the convolutional layer, and Xu and Lv (2019) also show that the ConvLSTM structure works well in air pollution prediction. Retta and Kethavath (2021) previously showed that CNN-RNN works well in air pollution forecasting, particularly for PM2.5.

CNN is widely used in image classification because it extracts pixel information with a kernel (filter) moved at a chosen stride; kernels usually have three channels for color (RGB) images. With an appropriately sized kernel, CNN maps the pixels into the next convolutional layer through the matrix multiplication, so the information in the image is projected to the next layer and carried forward (Albawi et al. 2017). Usually more than one kernel is used to scan the images and extract different information. The ReLU activation function (6) (Agarap 2018) and the sigmoid activation function (7) (Yonaba et al. 2010) are popular in classification tasks:

$$\mathrm{ReLU}(x) = \max(0, x) \tag{6}$$

$$\sigma(x) = \frac{1}{1 + e^{-x}} \tag{7}$$

ConvLSTM2D feeds two-dimensional matrix values into the convolutional neural network, flattens the CNN output and continues training with the LSTM model, and in the final step uses a fully connected dense layer for the classification or regression task. Figure 1 shows the architecture of the LSTM, which maintains an information highway from C_{t-1} to C_t so that the model can use memory from past information (Hochreiter and Schmidhuber 1997).
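To make the kernel, stride, and activation steps described above concrete, here is a minimal NumPy sketch of a single-channel 2-D convolution followed by ReLU. The function names, the 4 × 4 input, and the 2 × 2 kernel are illustrative assumptions; deep learning libraries implement the same idea with many kernels, channels, and learned weights.

```python
import numpy as np

def conv2d(image, kernel, stride=1):
    """Valid (no padding) 2-D convolution of a single-channel image with one kernel."""
    ih, iw = image.shape
    kh, kw = kernel.shape
    oh = (ih - kh) // stride + 1
    ow = (iw - kw) // stride + 1
    out = np.empty((oh, ow))
    for r in range(oh):
        for c in range(ow):
            patch = image[r * stride:r * stride + kh, c * stride:c * stride + kw]
            out[r, c] = np.sum(patch * kernel)  # element-wise product, then sum
    return out

def relu(x):
    """Rectified linear unit: max(0, x) applied element-wise."""
    return np.maximum(0.0, x)

image = np.arange(16.0).reshape(4, 4)          # toy 4x4 "image"
kernel = np.array([[-1.0, 0.0], [0.0, 1.0]])   # toy 2x2 difference kernel
feature_map = relu(conv2d(image, kernel, stride=2))
print(feature_map)
```

With a stride of 2 the kernel visits four non-overlapping patches, so the 4 × 4 input is reduced to a 2 × 2 feature map.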
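The LSTM uses a gate mechanism, consisting of an input gate, a forget gate, and an output gate, to determine how much information is carried forward from one layer to the next through element-wise operations on the output of the sigmoid activation function. If the output of the sigmoid activation function is close to 0, a large proportion of the information is forgotten; otherwise, the current layer carries more information into the next layer. The model is trained by back-propagation. The LSTM employs the following operations:

$$\text{Forget gate: } f_t = \sigma(W_f h_{t-1} + U_f X_t + b_f)$$
$$\text{Input gate: } i_t = \sigma(W_i h_{t-1} + U_i X_t + b_i)$$
$$\text{Candidate state: } \tilde{C}_t = \tanh(W_c h_{t-1} + U_c X_t + b_c)$$
$$\text{Cell state: } C_t = f_t \odot C_{t-1} + i_t \odot \tilde{C}_t$$
$$\text{Output gate: } o_t = \sigma(W_o h_{t-1} + U_o X_t + b_o)$$
$$\text{Hidden state: } h_t = o_t \odot \tanh(C_t)$$
$$\text{Output: } \hat{y}_t = V h_t + c$$

where W and U are the parameters applied to the previous hidden state and the current input, respectively, b is the bias term, and V is the parameter of the output.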
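The gate mechanism described above can be sketched as a single LSTM time step in NumPy. This follows the standard LSTM formulation with a forget gate; the dictionary layout of the weights, the dimensions (6 inputs, 4 hidden units), and the random initial values are assumptions for illustration only.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x_t, h_prev, c_prev, params):
    """One LSTM step. W acts on the previous hidden state, U on the current input."""
    W, U, b = params["W"], params["U"], params["b"]
    f = sigmoid(W["f"] @ h_prev + U["f"] @ x_t + b["f"])   # forget gate
    i = sigmoid(W["i"] @ h_prev + U["i"] @ x_t + b["i"])   # input gate
    o = sigmoid(W["o"] @ h_prev + U["o"] @ x_t + b["o"])   # output gate
    g = np.tanh(W["g"] @ h_prev + U["g"] @ x_t + b["g"])   # candidate cell state
    c_t = f * c_prev + i * g        # keep part of the old memory, add new information
    h_t = o * np.tanh(c_t)          # expose part of the memory as the hidden state
    return h_t, c_t

rng = np.random.default_rng(0)
n_in, n_hid = 6, 4  # e.g. six pollutant readings per day, four hidden units (assumed sizes)
params = {
    "W": {g: rng.normal(scale=0.1, size=(n_hid, n_hid)) for g in "fiog"},
    "U": {g: rng.normal(scale=0.1, size=(n_hid, n_in)) for g in "fiog"},
    "b": {g: np.zeros(n_hid) for g in "fiog"},
}

h, c = np.zeros(n_hid), np.zeros(n_hid)
for x_t in rng.uniform(size=(30, n_in)):   # run over a 30-step toy sequence
    h, c = lstm_step(x_t, h, c, params)
print(h)
```

A forget-gate value near 0 wipes the corresponding entry of the previous cell state, while a value near 1 passes it through unchanged, which is the information highway from C_{t-1} to C_t.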
X_t is the input vector of time series values in LSTM; in the ConvLSTM2D model, however, X_t takes the convolutional layer as input. The architecture of ConvLSTM2D is shown in Figure 2. Kelotra and Pandey (2020) suggest that the ConvLSTM structure works well in stock market prediction because it extracts local dependencies together with the time series pattern. Stacking the LSTM layers forms the whole architecture of ConvLSTM2D, which gives the convolutional layers a time series characteristic and creates a model that captures long-term and short-term dependencies, as shown in Figure 3.

We use ConvLSTM2D to capture air pollution information. There are local dependencies among the air pollutants PM2.5, PM10, SO2, NO2, CO, and O3, because manufacturers usually emit continuously, with an admixture of different pollutants. Moreover, the pollutants do not disperse in the atmosphere quickly and continue to affect air quality in the following days. The convolutional neural network captures the local dependencies among those pollutants each day, and its output serves as the input of the LSTM, which captures the long-term and short-term patterns in the time series. To model the relationship between air pollution and stock price, we initialize the feature term AP_1 as the scaled mean of the different pollutants over the days prior to the current stock price X_t. The order p is determined by Minimum Message Length (MML) from Equation (5) in Section 2.2, and ordinary least squares (OLS) linear regression is used to model the stock price X_t. The initial value of AP_1 is then updated by gradient descent over a preset number of epochs, and back-propagation strengthens the linear relationship between the stock prices and the feature term AP_1.
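One possible reading of the initialization described above is sketched below: each pollutant is min-max scaled to the range 0 to 1, the scaled pollutants are averaged per day, and the feature for day t is the mean over the preceding window of days. The 30-day window follows the experiment description; scaling over the whole sample, the equal weighting of pollutants, and the function name `initial_ap1` are assumptions of this sketch rather than details confirmed by the text.

```python
import numpy as np

def initial_ap1(pollution, window=30):
    """Initial feature value for each day t >= window.

    pollution : array of shape (days, pollutants), raw concentrations
    window    : number of prior days averaged for each target day
    """
    pollution = np.asarray(pollution, dtype=float)
    lo = pollution.min(axis=0)
    hi = pollution.max(axis=0)
    span = np.where(hi > lo, hi - lo, 1.0)         # avoid division by zero
    scaled = (pollution - lo) / span               # min-max scaling per pollutant
    daily_mean = scaled.mean(axis=1)               # average over pollutants
    # Mean of the `window` days strictly before day t.
    return np.array([daily_mean[t - window:t].mean()
                     for t in range(window, len(daily_mean))])

# Toy data: 100 days of six pollutant readings.
rng = np.random.default_rng(0)
readings = rng.uniform(0.0, 300.0, size=(100, 6))
ap1_init = initial_ap1(readings, window=30)
print(ap1_init.shape, ap1_init.min(), ap1_init.max())
```

In a forecasting setting the scaling bounds would normally be estimated on the training portion only, so that no information from the test window leaks into the feature.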
The hybrid model of ConvLSTM2D and the autoregressive model provides a time series regression on the lagged values of the stock price and the feature extracted from air pollution, and the relationship is expressed by linear regression. The algorithm and model construction process are shown in Algorithm 1.
Linear regression provides a second training stage for the output of ConvLSTM2D, as shown in Algorithm 2. The learning rate is η = 0.1 and the number of epochs is 10. The gradient of the loss with respect to the feature term AP_1, calculated by the chain rule, is:

$$\frac{\partial\, \mathrm{Loss}}{\partial AP_1} = \frac{\partial\, \mathrm{Loss}}{\partial \hat{y}} \cdot \frac{\partial \hat{y}}{\partial AP_1} = (y - \hat{y}) \cdot W_{AP_1}$$

where W_{AP_1} is the coefficient of the feature term AP_1.
Algorithm 2: Algorithm with updating feature term AP_1 by using gradient descent
Require: number of epochs = 10
while i ≤ epochs do
    1. Predicting the value of feature term AP_1 from trained ConvLSTM2D model
    2. Modeling the linear regression by ordinary least squares
    3. Using gradient descent and hyperparameter of learning rate to update the feature term AP_1
    4. Train ConvLSTM2D model by new feature term AP_1
end while
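The regression and gradient-descent steps of Algorithm 2 can be sketched as follows. This is a simplified illustration under stated assumptions: the loss is taken to be the squared error 0.5 · (ŷ − y)², whose gradient with respect to the feature is (ŷ − y) · W_{AP_1} (the sign convention is an assumption), and the ConvLSTM2D steps 1 and 4 are left out, so the feature values are treated as free per-observation numbers. The function name `update_feature_term` and the synthetic data are hypothetical.

```python
import numpy as np

def update_feature_term(x_lags, ap, y, epochs=10, lr=0.1):
    """Alternate OLS fitting and a gradient step on the feature values.

    x_lags : lagged target values, shape (n,) or (n, p)
    ap     : initial feature values, shape (n,)
    y      : observed target values, shape (n,)
    """
    ap = np.array(ap, dtype=float)
    history = []
    for _ in range(epochs):
        # Step 2: linear regression by ordinary least squares.
        design = np.column_stack([np.ones(len(y)), x_lags, ap])
        coef, *_ = np.linalg.lstsq(design, y, rcond=None)
        y_hat = design @ coef
        history.append(float(np.mean((y - y_hat) ** 2)))
        # Step 3: gradient descent on the feature term.
        # Assumed loss 0.5 * (y_hat - y)^2, so d(loss)/d(ap) = (y_hat - y) * w_ap.
        w_ap = coef[-1]
        ap = ap - lr * (y_hat - y) * w_ap
        # Steps 1 and 4 (predicting from and retraining ConvLSTM2D) are omitted here.
    return ap, coef, history

rng = np.random.default_rng(1)
n = 300
ap0 = rng.uniform(0.0, 1.0, n)
x_lag = rng.normal(size=n)
y = 0.5 + 0.6 * x_lag - 2.0 * ap0 + rng.normal(scale=0.05, size=n)

ap_new, coef, history = update_feature_term(x_lag, ap0, y)
print(history[0], history[-1])
```

In the full procedure the updated feature values would become the training target for the ConvLSTM2D model in the next epoch; here they only show that the in-sample regression error shrinks as the feature is adjusted.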

Data
This section presents experimental studies that model air pollution data and stock prices. We select four major industrialized cities in China, namely Beijing, Taiyuan, Changchun, and Shijiazhuang, and ten leading heavy manufacturers in China with a high amount of capital. The air pollution data were collected from the Ministry of Ecology and Environment of China, and the stock prices were collected from Yahoo Finance. The selected manufacturers have headquarters or main factories located in those cities, so they interact with the local environmental authorities and are affected by the local air pollution. The selected manufacturers are Shougang Group, Shenhua Group, and Sany Group from Beijing; Datong Coal Mining, Shanxi Coal International, and Sanxi Coking Co from Taiyuan; Hbis Group and Maanshan Iron & Steel from Shijiazhuang; and FAW Jiefang and FAWAY from Changchun. We expect that taking air pollution data into account improves the prediction of stock prices for heavy manufacturers.

Experiment
To evaluate the performance, we first divided the dataset so that 95% was used in-sample and the remaining 5% was held out for out-of-sample estimation, which gave a 54-step-ahead forecast window, and we conducted a rolling forecast. We compared our model with ARIMA, implemented with the auto.arima() function of the forecast package for R (Hyndman et al. 2020); this function returns the best ARIMA model according to AIC, subject to maximum values of p and q. Fang et al. (2021a, 2021b) previously showed that MML selects ARIMA models with lower prediction errors than AIC and BIC, and that MML also selects models with lower RMSE in the hybrid ARIMA-LSTM model. Table 1 shows the parameters p selected by MML for our model and the learning rate η used in ConvLSTM2D. Secondly, we compared the model with the historical mean model of the stock price itself. Finally, we conducted a hypothesis test on the relationship between the feature term AP_1, extracted from the prior 30 days of air pollution data, and the target variable. The initial value of AP_1 is obtained with a min-max scaler that maps the data into the range 0 to 1, and back-propagation then updates this value to record the linear dependencies between stock price and air pollution.
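The evaluation protocol described above can be illustrated with the historical mean benchmark and the three error measures reported in the tables (MAE, RMSE, SSE). The sketch assumes a one-step-ahead rolling forecast in which each prediction uses all observations before the target day; the function names and the toy series are hypothetical.

```python
import numpy as np

def error_metrics(actual, predicted):
    """Mean absolute error, root mean squared error, and sum of squared errors."""
    err = np.asarray(actual, dtype=float) - np.asarray(predicted, dtype=float)
    return {
        "MAE": float(np.mean(np.abs(err))),
        "RMSE": float(np.sqrt(np.mean(err ** 2))),
        "SSE": float(np.sum(err ** 2)),
    }

def rolling_historical_mean(series, n_test):
    """One-step-ahead rolling forecast: predict each test point by the mean of its past."""
    series = np.asarray(series, dtype=float)
    start = len(series) - n_test
    preds = np.array([series[:t].mean() for t in range(start, len(series))])
    return preds, series[start:]

# Toy series 1, 2, ..., 10 with the last two points held out.
series = np.arange(1.0, 11.0)
preds, actual = rolling_historical_mean(series, n_test=2)
metrics = error_metrics(actual, preds)
print(metrics)
```

For a 95%/5% split, `n_test` would be set to roughly 5% of the series length, and the same `error_metrics` function can score any competing model on the held-out window.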
Table 2 compares the performance of the different models on the stock prices of the three selected manufacturers located in Beijing; the results suggest a regression relationship between the air pollution of the prior 30 days and the current stock prices in the heavy manufacturing industry. Table 2 suggests that our model has lower prediction errors in terms of MAE, RMSE, and SSE for the selected manufacturers Shougang and Sany. The linear model that includes the feature term AP_1 better explains the long-term dependencies of the stock price, as it has lower errors; combined with the short-term dependencies of the autoregressive model, it predicts the stock price more accurately. We tested the significance of the population linear relationship between the feature term AP_1 and X_t. The null hypothesis is that the population slope is equal to 0, and the alternative hypothesis is that the population slope is not equal to 0, following the econometric theory of Wooldridge (2015). In Table 3, the p-values for the feature term AP_1 for the manufacturers Shougang, Shenhua, and Sany are below the 0.05 significance level. A lower p-value indicates stronger evidence against the null hypothesis H_0, so we reject H_0 in favor of the alternative hypothesis H_1 (Biau et al. 2010). We therefore conclude that there is statistically significant evidence that the independent variable, the feature term AP_1, and the dependent variable, the stock price X_t, are linearly related (Emmert-Streib and Dehmer 2019). The coefficients of the feature term AP_1 are negative for Shenhua and Sany, which indicates a negative linear relationship with the stock price and suggests that air pollution has a negative effect on the financial market.
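The slope test described above can be sketched with ordinary least squares: estimate the coefficient, compute its standard error, and convert the t-statistic to a two-sided p-value. To stay dependency-free, this sketch uses a large-sample normal approximation for the p-value instead of the exact t distribution; that approximation, the function name `slope_test`, and the synthetic data are assumptions of the example.

```python
import math
import numpy as np

def slope_test(design, y, index):
    """Test H0: coefficient `index` equals 0 in the regression y = design @ coef + e."""
    design = np.asarray(design, dtype=float)
    y = np.asarray(y, dtype=float)
    n, k = design.shape
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    resid = y - design @ coef
    sigma2 = resid @ resid / (n - k)                     # residual variance
    cov = sigma2 * np.linalg.inv(design.T @ design)      # coefficient covariance
    se = math.sqrt(cov[index, index])
    t_stat = coef[index] / se
    # Two-sided p-value under a normal approximation (reasonable for large n).
    p_value = math.erfc(abs(t_stat) / math.sqrt(2.0))
    return float(coef[index]), float(t_stat), p_value

# Synthetic example with a negative feature effect.
rng = np.random.default_rng(2)
n = 400
ap = rng.uniform(0.0, 1.0, n)
price = 1.0 - 2.0 * ap + rng.normal(scale=0.5, size=n)
design = np.column_stack([np.ones(n), ap])

slope, t_stat, p_value = slope_test(design, price, index=1)
print(slope, t_stat, p_value)
```

A p-value below 0.05 leads to rejecting the null hypothesis of a zero slope at the 5% level; the sign of the estimated slope then gives the direction of the linear relationship.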
This encourages manufacturers to improve their emission recycling and to invest in green manufacturing, so that a drop in stock price does not affect the company's funding process. Table 4 shows the performance of the different models for the manufacturers Datong Coal Mining, Shanxi Coal International, and Sanxi Coking Co from the city of Taiyuan. The selected companies are the largest coal producers in China and drive economic growth in Shanxi province.

Taiyuan
The results from Table 4 suggest that our model outperforms the ARIMA and historical mean models for the companies of Sanxi, with lower forecasting errors, and that the feature term AP_1 has a significant relationship with the stock price. Coal producers are closely tied to air pollution because of the large amounts of PM2.5, PM10, and CO emitted into the atmosphere during mining (Li and Hu 2017). Taiyuan is the capital of Shanxi province, the leading coal producer in China; the province holds one-third of China's coal deposits (Li and Hu 2017). The stock prices of the coal producers and the feature term AP_1 from air pollution show a positive relationship with the large coefficient shown in Table 5. This is reasonable, as coal mining raises the stock price and increases air pollution at the same time.

Shijiazhuang
Shijiazhuang is one of the main industrialized cities in north China and has suffered from air pollution over the last decade. Table 6 compares the accuracy of the models for the different stock price predictions. The experiment shows that our model outperforms ARIMA and the historical mean model in forecasting. The results suggest that our model has lower MAE, RMSE, and SSE because it projects the air pollution feature into the regression analysis. The results also indicate long-term dependencies between the time series of heavy industry manufacturers and air quality: Table 7 confirms the significance of the linear regression between the feature term AP_1 and the target variable, the stock price, with p-values below the 0.05 benchmark in the hypothesis test.

Changchun
The results from Tables 8 and 9 indicate that the automobile manufacturer FAW Jiefang, one of the largest automobile producers in China, shows a significant linear relationship between air pollution and its stock price. Companies with larger capitalization have a closer tie with air quality, and our model generates a lower error than the others in the 54-step-ahead forecast. Because larger companies are more often engaged in carbon emission trading and are highly affected by the quoted price of carbon emissions (Jiang et al. 2014), their linear relationship with the financial market is more significant. The positive coefficient of the feature term AP_1 suggests that heavy manufacturers have a significant relationship with the exacerbation of air pollution in China, and that the business growth of those manufacturers relies heavily on carbon emissions.

Conclusions and Future Work
In this paper, ConvLSTM2D was proposed for mapping the relationship between air pollution and the stock market. In most cases it yielded more accurate rolling forecasts than the time series model without the feature term AP_1 from air pollution data, which indicates a nonlinear relationship between air pollution and stock price. The experiment suggests that time series models with environmental factors have lower mean squared error and perform better on the other evaluation metrics. The hypothesis tests also suggest a significant relationship between the stock price and the environmental feature term. The local dependencies in recent changes of air pollution are captured by the convolutional neural network and flattened into the LSTM in order to extract the time series pattern. Gradient descent and back-propagation on the feature term AP_1 made our model fit the air pollution and stock price data. The experimental results suggest that manufacturing activity in heavy industry affects the air quality of the selected cities, as reflected in the stock price. This provides a novel modeling technique for discovering the relationship between emissions and the financial market. We believe that this model can be used to promote green investment in the stock market and to control air pollution. In the future, climate factors will have a tighter connection with the financial market, which will increasingly rely on environmental conditions as most globalized companies work towards decarbonization. Modeling environmental conditions together with financial markets such as carbon trading will give investors more insight into the long-run benefits of targeted companies and into portfolio risk management, and the role of the environment deserves more attention in academic research.

Conflicts of Interest:
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.