Estimation of the PM2.5 Pollution Levels in Beijing Based on Nighttime Light Data from the Defense Meteorological Satellite Program-Operational Linescan System

Nighttime light data record the artificial light on the Earth's surface and can be used to estimate the degree of pollution associated with particulate matter with an aerodynamic diameter of less than 2.5 μm (PM2.5) in the ground-level atmosphere. This study proposes a simple method for monitoring PM2.5 concentrations at night by using nighttime light imagery from the Defense Meteorological Satellite Program-Operational Linescan System (DMSP-OLS). This research synthesizes remote sensing and geographic information system techniques and establishes a back propagation neural-network (BP network) model. The BP network model for nighttime light data performed well in estimating the PM2.5 pollution in Beijing. The correlation coefficient between the BP network model predictions and the corrected PM2.5 concentration was 0.975; the root mean square error was 26.26 μg/m³, with a corresponding average PM2.5 concentration of 155.07 μg/m³; and the average accuracy was 0.796. The accuracy of the results primarily depended on the method of selecting regions in the DMSP nighttime light data. This study provides an opportunity to measure the nighttime environment. Furthermore, these results can assist government agencies in determining particulate matter pollution control areas and developing and implementing environmental conservation planning.


Introduction
Aerosols have extensive impacts on our climate and our environment [1], and tropospheric aerosols (also known as particulate matter (PM)), in particular, can cause adverse effects on public health [2]. Epidemiologic studies indicate strong links between the concentrations of PM with aerodynamic diameters of less than 10 μm and less than 2.5 μm (PM10 and PM2.5, respectively) and public morbidity, respiratory-related mortality and cardiovascular diseases [3][4][5][6][7][8]. The concentration of PM has become an important index of air pollution and has gained increasing attention from environmental protection, public health and scientific organizations around the world. Both the European Union (1999) and the United States have set air quality standards that dictate strict limits on PM concentrations in the ambient air.
In recent years, with the rapid development of industrialization and urbanization, PM has become the primary air pollutant in most major cities in China [9]. This pollution not only threatens people's health, but also decreases atmospheric visibility and degrades city scenery [10]. Therefore, the Chinese government has enacted ambient air quality standards [11], which limit the values of PM2.5 concentrations and set air pollution classification rules.
During the daytime, aerosol characteristics, such as aerosol optical thickness (AOT), have been used to monitor PM concentrations. The relationship between AOT and PM concentrations has been thoroughly researched [12][13][14][15][16][17]. However, studies on nighttime PM pollution monitoring are very rare.
Since the 1970s, the Defense Meteorological Satellite Program-Operational Linescan System (DMSP-OLS) sensors have gathered meteorological data, which are archived by the National Oceanic and Atmospheric Administration (NOAA). DMSP imagery has a spatial resolution of 2.7 km and consists of two spectral bands and one thermal band. However, DMSP images are widely known not for their initial purpose, but for their ability to capture nighttime images of light on the Earth's surface. Global population density and economic activity are clearly visible from space using this nighttime light imagery.
The DMSP-OLS sensors have a unique capacity for detecting faint light, and many research topics have been studied using this nighttime light imagery. A major application is the mapping of gross domestic product (GDP) and economic activity on global and regional scales [18][19][20][21][22][23]. Public lighting is a valuable indicator of a country's economic condition and is directly reflected in nighttime light. Moreover, because larger populations need more public lighting, nighttime light also reflects population density [24,25]. Mapping human settlements is also possible because most nighttime light is emitted from human settlements [26][27][28]. Furthermore, a great number of applications involving DMSP-OLS data exist in other areas of study, such as carbon cycling [29], fishing boat mapping [30], energy consumption [31], security evaluation [32] and ecological evaluation [33].
In this study, we explored the relationship between daily DMSP nighttime light images and daily PM2.5 concentrations in Beijing, China. We created a back propagation (BP) neural-network model to estimate PM2.5 concentrations at night. This article is organized as follows: Section 2 describes our study area and data and presents the analysis of the nighttime light responses to PM2.5 concentrations from both the spatial and temporal perspectives; Section 3 verifies the findings from this study and discusses future work; and Section 4 summarizes the discoveries of this study.

Study Area
To monitor PM2.5 concentrations using DMSP nighttime satellite images, we selected Beijing (Figure 1) as the study area for analysis. Beijing is situated in the eastern part of China, to the west of the Bohai Sea, and covers more than 16,410 square kilometers. Beijing is one of the largest metropolitan areas in the world. Rapid urban sprawl over the past 30 years has put tremendous pressure on the local and regional environment [34]. To improve air quality, the Chinese government implemented aggressive air pollution control measures in Beijing and surrounding areas. All on-road vehicles (including both trucks and passenger cars) that failed to meet the European Level IV emissions standards were banned from Beijing's roads. Mandatory restrictions were implemented to reduce the use of government vehicles and personal vehicles by ~20% by allowing these vehicles on the roads only on alternate days, based on license plate numbers. In addition to mobile sources, several heavily polluting industrial sources, such as power plants, were ordered to reduce their operating capacities or to completely shut down. Although great efforts have been made to control the air pollution, the PM levels remain much higher than the national air quality standard.

DMSP-OLS Nighttime Light Data
The DMSP-OLS operates satellites in sun-synchronous orbits with nighttime overpasses in the 8:00-10:00 p.m. range (local time) to map artificial lighting present on the Earth's surface. The OLS is an oscillating scan system, with a swath width of 3000 km and 14 orbits per day; each OLS instrument can generate a complete coverage of nighttime data in a 24-h period. At night, the OLS uses a photomultiplier tube (PMT) to intensify visible-band light and observe faint sources of visible to near-infrared emissions [35].
In this study, we accessed the daily DMSP-OLS nighttime light (NTL) data products, which consist of 85 datasets from 9 October 2013 to 6 February 2014, excluding cloudy days and days with snow on the ground. The data are organized into DMSP-OLS Nighttime Lights Global Composites (Version 4) and can be obtained from the National Geophysical Data Center (NGDC) website [36].

Daily PM2.5 Average Concentrations
We accessed the 120 datasets of the daily air pollution index (API) for Beijing for the period between 9 October 2013 and 6 February 2014. These data were obtained from the Beijing Municipal Environmental Protection Bureau website [37].
When the primary pollutant was fine particles, the daily average PM2.5 concentrations were derived from the API via the method presented in Table 1. The method is based on the Chinese Ambient Air Quality Standard (GB3095-2012) [11,38].

Phase of the Moon and Digitization
Step 1: We determined the phase of the moon between 9 October 2013 and 6 February 2014 based on the Internet Observatory of China website [39]. We then digitized the data using the following method: (1) a full moon day was assigned the value 16; (2) the waxing and waning of the moon were assigned the value 8; (3) the last day of a lunar month and the new moon day were assigned the value 0; and (4) the values for all other days were obtained via linear interpolation.
Step 2: We also determined the times of moonrise and moonset between 9 October 2013 and 6 February 2014. These data were obtained from the CalSky website [40]. If the satellite's overpass time did not fall between moonrise and moonset, that day's value was set to 0.
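The two digitization steps can be sketched as follows. This is a minimal illustration and not the authors' implementation: the function names, the anchor dates and the moonrise and moonset times are hypothetical, and reading "waxing and waning" as the two quarter phases is an assumption.

```python
from datetime import date, datetime

def digitize_phase(day, anchors):
    """Step 1: linearly interpolate the phase value between anchor dates.

    `anchors` maps key dates (new moon = 0, quarter = 8, full moon = 16)
    to their assigned values; it is a hypothetical input structure.
    """
    keys = sorted(anchors)
    if not keys[0] <= day <= keys[-1]:
        raise ValueError("day lies outside the anchor dates")
    for left, right in zip(keys, keys[1:]):
        if left <= day <= right:
            frac = (day - left).days / (right - left).days
            return anchors[left] + frac * (anchors[right] - anchors[left])
    return float(anchors[keys[0]])  # only one anchor was supplied

def moon_value(day, anchors, overpass, moonrise, moonset):
    """Step 2: set the value to 0 when the moon is below the horizon."""
    moon_is_up = moonrise <= overpass <= moonset
    return digitize_phase(day, anchors) if moon_is_up else 0.0

# Illustrative anchors only; the dates and times are assumptions.
anchors = {date(2013, 11, 3): 0, date(2013, 11, 10): 8, date(2013, 11, 17): 16}
value = moon_value(
    date(2013, 11, 5), anchors,
    overpass=datetime(2013, 11, 5, 21, 0),
    moonrise=datetime(2013, 11, 5, 8, 0),
    moonset=datetime(2013, 11, 5, 18, 30),
)
print(value)  # the moon has already set at the overpass time
```

Passing full `datetime` values rather than clock times keeps the comparison valid on nights when the moon sets after midnight.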

LANDSAT-8 OLI-TIRS Data of Beijing
We accessed the 1 September 2013 LANDSAT-8 OLI-TIRS data for Beijing. The data set was provided by GSCloud, Computer Network Information Center, Chinese Academy of Sciences [41].

Beijing Meteorological Data
We compiled the daily ground climate data for Beijing between 1 October 2013 and 1 February 2014 collected by the Chinese international exchange station. The data include temperature, relative humidity (RH), wind speed and barometric pressure, among other data sets. The data were provided by the National Meteorological Information Center of China [42].

Data Pre-Processing Phase

Division of the DMSP-OLS NTL Data into Four Regions
The phase of the moon has a significant influence on the gain settings of the OLS. The gain settings increase as the lunar illumination decreases. Higher gain settings are associated with larger numbers of detected pixels in the nighttime light imagery (Figure 2). During the darkest 10 nights of each lunar cycle, the effects of the along-scan gain and bidirectional reflectance distribution function (BRDF) algorithms are minimized, and the gain reaches its maximum monthly level [43]. To avoid the influence of the moon, we selected the darkest nights of each lunar cycle, that is, the nights whose digitized lunar phase value was 0. Consequently, the 85 groups of available data decreased to 33 groups.
The primary objective of the on-orbit OLS gain control is the generation of consistent imagery of clouds at all scan angles for visual interpretation by Air Force meteorologists. In normal operations, the video digital gain amplifier (VDGA) is modified to track scene illumination predicted from lunar phase and elevation. The resulting base gains are modified every 0.4 ms by an onboard along-scan gain control (ASGC) algorithm [43]. The imagery presented herein was obtained under a computer control mode known as along-track gain control (ATGC), in which the VDGA gain remains at a fixed value throughout an entire scan [44]. Thus, each pixel in the same scanning line has the same magnification.
The spatial distribution of land use in Beijing includes mountainous areas, forest, and farmland to the west of downtown Beijing. We hypothesized that the urban area emits visible light at night, whereas the mountain, forest and farmland do not. Using visual interpretation, we outlined the urban area of Beijing city (area a) using the LANDSAT-8 OLI-TIRS data. In addition to the urban core, three areas were defined according to the western boundary of the urban area: one 5-km-wide strip extending westward from the western boundary, another 5-km-wide strip between 5 and 10 km from the western boundary, and a third 5-km-wide strip extending eastward from the western boundary (represented by areas b, c and d in Figure 3, respectively).
Using ArcGIS spatial analysis tools and the four defined regions, we divided the daily DMSP-OLS NTL data into four regions. We then summed the regional digital number (DN) values, which yielded SUM_DNa, SUM_DNb, SUM_DNc and SUM_DNd for areas a, b, c, and d, respectively.
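The regional summation was carried out in ArcGIS; the same zonal sum can be illustrated with NumPy as a rough sketch. The function name, the tiny DN raster and the boolean masks below are hypothetical. Separate masks are used, rather than a single label raster, because area d lies inside area a and the regions therefore overlap.

```python
import numpy as np

def regional_dn_sums(dn, masks):
    """Sum the DN values of one daily image inside each region mask.

    dn    : 2-D array of digital numbers for one night
    masks : dict mapping a region name ("a".."d") to a boolean array
            of the same shape (True where the pixel is in the region)
    """
    return {f"SUM_DN{name}": int(dn[mask].sum()) for name, mask in masks.items()}

# Toy 2 x 4 raster; columns run from west (left) to east (right).
dn = np.array([[0, 2, 30, 60],
               [1, 3, 28, 55]])
masks = {
    "a": np.array([[False, False, True, True]] * 2),   # urban area
    "b": np.array([[False, True, False, False]] * 2),  # 0-5 km west of the boundary
    "c": np.array([[True, False, False, False]] * 2),  # 5-10 km west of the boundary
    "d": np.array([[False, False, True, False]] * 2),  # 0-5 km east of the boundary
}
print(regional_dn_sums(dn, masks))
```

The four sums for each night form one input vector for the neural-network model.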

PM2.5 Concentrations Data Corrected By Relative Humidity
Among the various meteorological factors, such as temperature, RH, wind speed and barometric pressure, RH has been determined to significantly affect air pollution levels [45].
Tapered-element oscillating microbalances (TEOM) heat the air sample to 50 °C [46], so the particles in the sample are measured at close to their "dry mass", and the measured PM2.5 concentrations are lower than those in ambient air. Therefore, an RH correction should be introduced to reduce the impact of the variation in RH on the PM2.5 concentrations. Hence, we define the "RH_corrected PM2.5 concentration" as the "wet" PM2.5 concentration in the ambient air. This corrected value is the "dry" PM2.5 concentration obtained under relatively dry conditions multiplied by a hygroscopic growth factor. Based on previous studies [16,47,48], the "RH_corrected PM2.5 concentration" can be written as follows:

$$\mathrm{PM2.5}_{\mathrm{RH\_corrected}} = \mathrm{PM2.5}_{\mathrm{dry}} \times f(\mathrm{RH}) \qquad (1)$$

where $f(\mathrm{RH})$ is the hygroscopic growth factor.

Data Normalization
To avoid data saturation, the input variables in this study's model were normalized based on their possible ranges using the following equation:

$$x_{norm} = \frac{x_i - x_{min}}{x_{max} - x_{min}} \qquad (2)$$

where $x_i$, $x_{min}$, $x_{max}$ and $x_{norm}$ are the input variable's actual value, minimum value, maximum value and normalized value, respectively. The neural-network model's output is also an indexed value that responds to the input variable. Using the following equation to de-normalize the indexed output value, we obtain a real-value output:

$$y_{ai} = y_{norm}\,(y_{max} - y_{min}) + y_{min} \qquad (3)$$

where $y_{ai}$, $y_{min}$, $y_{max}$ and $y_{norm}$ are the output variable's real value, minimum value, maximum value, and the indexed output value from the neural-network model, respectively.
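As a small illustration of the normalization and de-normalization steps, the sketch below assumes plain min-max scaling to the interval [0, 1]. The function names and the numeric ranges are hypothetical and are not values reported in this study.

```python
def normalize(x, x_min, x_max):
    """Scale an input value to [0, 1] from its possible range."""
    return (x - x_min) / (x_max - x_min)

def denormalize(y_norm, y_min, y_max):
    """Map an indexed network output in [0, 1] back to a real value."""
    return y_norm * (y_max - y_min) + y_min

# Hypothetical ranges chosen only for illustration.
x_norm = normalize(5.0, 0.0, 10.0)       # 0.5
y_real = denormalize(0.5, 0.0, 300.0)    # 150.0
print(x_norm, y_real)
```

The two functions are inverses of each other for a fixed range, so a value that is normalized and then de-normalized with the same minimum and maximum is recovered unchanged.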

Model Evaluation Phase
Generally, root mean square error (RMSE) and average accuracy have been used to measure the performance of neural-network models [55]. In this study, the parameters were (i) RMSE; (ii) average accuracy (Pav); (iii) R²; (iv) mean bias error (MBE); and (v) index of agreement (IA). The five parameters were computed using Equations (4)-(8), respectively:

$$\mathrm{RMSE} = \sqrt{\frac{1}{N}\sum_{i=1}^{N}\left(y_{ai} - y_{mi}\right)^{2}} \qquad (4)$$

where $y_{ai}$ and $y_{mi}$ are the output variable's real value and the measured variable value, respectively, and $N$ is the sample number. Lower RMSE values indicate better model performance.

$$R^{2} = \frac{\left[\sum_{i=1}^{N}\left(y_{ai} - \bar{y}_{a}\right)\left(y_{mi} - \bar{y}_{m}\right)\right]^{2}}{\sum_{i=1}^{N}\left(y_{ai} - \bar{y}_{a}\right)^{2}\,\sum_{i=1}^{N}\left(y_{mi} - \bar{y}_{m}\right)^{2}} \qquad (6)$$

where $R^{2}$, $\bar{y}_{m}$ and $\bar{y}_{a}$ are the squared correlation coefficient, the average measured variable value and the average actual output variable value, respectively. The R² value represents the correlation between the predicted and measured variables. The predicted and measured variables are assumed to follow a normal distribution. R² values range from 0 to 1. Higher values represent stronger linear relationships between the actual and predicted variables.

$$\mathrm{MBE} = \frac{1}{N}\sum_{i=1}^{N}\left(y_{ai} - y_{mi}\right) \qquad (7)$$

where $y_{ai}$ and $y_{mi}$ are the output variable's real value and the measured variable value, respectively, and $N$ is the sample number. MBE values closer to zero indicate better model performance.

$$\mathrm{IA} = 1 - \frac{\sum_{i=1}^{N}\left(y_{ai} - y_{mi}\right)^{2}}{\sum_{i=1}^{N}\left(\left|y_{ai} - \bar{y}_{m}\right| + \left|y_{mi} - \bar{y}_{m}\right|\right)^{2}} \qquad (8)$$

where $y_{ai}$, $y_{mi}$ and $\bar{y}_{m}$ are the output variable's real value, the measured variable value and the average measured variable value, respectively, and $N$ is the sample number. Values close to 1 indicate better model performance.
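The evaluation parameters can be computed with a few lines of NumPy. The sketch below is illustrative only: it assumes the standard definitions of RMSE, squared Pearson correlation, mean bias error (model minus measurement) and Willmott's index of agreement, and it omits Pav because the per-sample accuracy Pi is not defined in this section. The function name and sample values are hypothetical.

```python
import numpy as np

def evaluate(y_model, y_measured):
    """Return RMSE, R^2, MBE and IA for model outputs versus measurements."""
    ya = np.asarray(y_model, dtype=float)
    ym = np.asarray(y_measured, dtype=float)
    diff = ya - ym
    rmse = np.sqrt(np.mean(diff ** 2))
    r2 = np.corrcoef(ya, ym)[0, 1] ** 2          # squared Pearson correlation
    mbe = np.mean(diff)                          # sign: model minus measurement
    spread = np.abs(ya - ym.mean()) + np.abs(ym - ym.mean())
    ia = 1.0 - np.sum(diff ** 2) / np.sum(spread ** 2)
    return {"RMSE": rmse, "R2": r2, "MBE": mbe, "IA": ia}

# Toy values, not data from this study.
print(evaluate([1.0, 2.0, 3.0, 4.0], [1.0, 2.0, 3.0, 5.0]))
```

A negative MBE, as in the toy example, means that the model underestimates the measurements on average.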

Results and Discussions
The architecture of the BP neural-network was 4-6-1, and the learning algorithm was a gradient descent BP algorithm with a learning rate of 0.2 and a momentum coefficient of 0.2. The activation function was a hyperbolic tangent sigmoid function in the input layer and hidden layer and a logistic sigmoid function in the output layer. In total, 23 groups of data were used for training, five groups were used for verification purposes, and five groups were used for testing.
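To make the 4-6-1 configuration concrete, the following NumPy sketch trains a network with six hyperbolic-tangent hidden neurons and one logistic output neuron by full-batch gradient descent with momentum (learning rate 0.2, momentum 0.2). It is not the authors' implementation: the weight initialization, epoch count, loss function (half the mean squared error) and the synthetic data are assumptions, and for simplicity no activation is applied to the input layer.

```python
import numpy as np

def train_bp_network(X, y, hidden=6, lr=0.2, momentum=0.2, epochs=2000, seed=0):
    """Train a one-hidden-layer network; returns the parameters and loss history."""
    rng = np.random.default_rng(seed)
    params = [
        rng.normal(0.0, 0.5, (X.shape[1], hidden)),  # W1: input -> hidden
        np.zeros(hidden),                            # b1
        rng.normal(0.0, 0.5, (hidden, 1)),           # W2: hidden -> output
        np.zeros(1),                                 # b2
    ]
    velocity = [np.zeros_like(p) for p in params]
    history = []
    for _ in range(epochs):
        W1, b1, W2, b2 = params
        h = np.tanh(X @ W1 + b1)                     # hidden layer (tanh)
        out = 1.0 / (1.0 + np.exp(-(h @ W2 + b2)))   # output layer (logistic)
        err = out - y
        history.append(float(np.mean(err ** 2)))
        # Back-propagate the gradient of half the mean squared error.
        d_out = err * out * (1.0 - out) / len(X)
        d_h = (d_out @ W2.T) * (1.0 - h ** 2)
        grads = [X.T @ d_h, d_h.sum(axis=0), h.T @ d_out, d_out.sum(axis=0)]
        for p, v, g in zip(params, velocity, grads):
            v *= momentum        # keep a fraction of the previous update
            v -= lr * g          # add the new gradient step
            p += v               # update the parameter in place
    return params, history

def predict(params, X):
    W1, b1, W2, b2 = params
    return 1.0 / (1.0 + np.exp(-(np.tanh(X @ W1 + b1) @ W2 + b2)))

# Synthetic stand-in for 23 normalized training samples with 4 inputs each.
rng = np.random.default_rng(1)
X_train = rng.random((23, 4))
y_train = X_train.mean(axis=1, keepdims=True)
params, history = train_bp_network(X_train, y_train)
print(history[0], history[-1])
```

Because the output neuron is logistic, predictions lie in (0, 1) and must be de-normalized before they can be compared with measured concentrations.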
The R² value of the correlation between the BP network model predictions and the RH_corrected PM2.5 concentrations was 0.975. The RMSE was 26.26 μg/m³ and the MBE value was −1.806 μg/m³, with a corresponding average PM2.5 concentration of 155.07 μg/m³. The Pav value was 0.796 and the IA value was 0.988. The predicted PM2.5 concentrations matched the RH_corrected PM2.5 concentrations well, as shown in Figure 5.
Because the temporal difference between the 2013 Landsat 8 OLI-TIRS images and the 2013 nighttime light data is slight, we can compare these data sets to estimate the spatial consistency of urban areas. The lit areas of the city cover a larger area than the urban built-up areas (see Landsat 8 image). In addition, water bodies and urban forests are also illuminated in the DMSP-OLS data. This "blooming" phenomenon was also reported by other researchers [27,56]. The light intensity of the suburbs relative to the urban core contains information on the ground air's extinction coefficient. Furthermore, the extinction coefficient can be used to estimate the particulate matter content in the air. Thus, the estimation of PM2.5 concentrations is feasible using DMSP-OLS data.
DMSP-OLS sensors capture images every day, but the daily imagery is affected by sensor noise and by atmospheric and moonlight variations. The National Oceanic and Atmospheric Administration (NOAA) has attempted to minimize the sensor noise. By using DMSP-OLS data from nights with little or no moonlight, we have minimized the effects of the along-scan gain and bidirectional reflectance distribution function (BRDF) algorithms. Thus, the daily nighttime light data are comparable in terms of the effects of atmospheric variations.
The DMSP-OLS data provide an effective method for estimating extinction coefficient variations in ground air with PM pollution. PM pollution reduces the transmission capability of light, resulting in differences in the spatial distribution of nighttime light data (Figure 6 [57]). We presumed that area d was a light-emitting region and that areas b and c were not. The upward luminous flux of areas b and c was radiated from area d. This process is complex and nonlinear. To solve this problem, the BP neural-network was selected to estimate the degree of PM2.5 pollution. The primary data set of this research is DMSP imagery that provides information on the ground air's extinction coefficient. We proposed an effective method to split DMSP imagery, intentionally highlighting the differences in the spatial distribution of NTL data. Future studies should focus on the quantitative relationship between PM pollution and DMSP-OLS data throughout the lunar cycle. New sources of high spatial resolution nighttime images, such as the NPP-VIIRS Nighttime Light Data [58][59][60], the EROS-B commercial satellite data [61] and aerial photography [62], may allow for better estimation methods and results.
Meteorological conditions, especially wind speed and relative humidity, are a significant driver of local ambient air pollution concentrations. Wind can disperse particulate matter but cannot influence the extinction coefficient. Therefore, relative humidity was used to correct the daily "dry" PM2.5 concentrations. DMSP-OLS data can be obtained only on clear, cloudless days. Other days were not included in this study. Nighttime inversion conditions and upper air meteorological conditions (turbulence, etc.) may impact the light transmission from the ground to the satellite, and we will consider these factors in future studies.

Conclusions
By selecting Beijing, one of the largest and fastest-developing cities in the world, as the study area, this research synthesizes remote sensing (RS) and geographic information system (GIS) techniques and estimates the degree of PM2.5 pollution in the ground-level atmosphere. This study proposed a simple method for monitoring nighttime PM2.5 concentrations.
The innovation of this research lies in the selection of spatial regions in the data set. Based on the land use type to the west of Beijing, the distribution characteristics of the land and the light scattering from urban to suburban areas, four regions were defined from which to extract the nighttime light data, instead of constructing various indices from the entire DMSP-OLS imagery data set. The relative numerical ratio of the data extracted from these four regions reflects the extinction coefficient of the atmosphere, and this extinction coefficient can be used to retrieve the aerosol concentration.
Our method and results can provide guidance for developing and implementing environmental conservation planning. Furthermore, these data can assist government agencies in determining PM pollution control areas, initiating regulation projects, and undertaking nighttime environmental purifying measures.

Figure 1. The geographic location of the study area is indicated by the black square on the Chinese national map (upper). The study area (115.5°-117.5°E, 39.5°-40.5°N) is indicated by the red polygon (lower).

Figure 2. Relationship between the value of the moon phase and the nighttime light imagery. The top 4 days (3 November 2013, 11 November 2013, 14 November 2013 and 17 November 2013) demonstrate that the gain settings gradually increase as the lunar illumination decreases. Higher gain settings are associated with larger numbers of detected pixels in the nighttime light imagery. The lower 4 days (22 November 2013, 26 November 2013, 28 November 2013 and 30 November 2013) demonstrate that the gain settings reach maximum values during the darkest 10 nights of a lunar cycle.

Figure 3. The 4 daily NTL data regions. The purple polygon (a) is the urban boundary of Beijing. The green region (b) extends westward 5 km from the western boundary of the urban area. The blue region (c) extends westward 5 km from the western boundary of area b (i.e., between 5 and 10 km from the urban boundary). The red region (d) extends eastward 5 km from the western boundary of the urban area.
The input variables for the BP neural-network model were SUM_DNa, SUM_DNb, SUM_DNc and SUM_DNd, and the output variable of this model was the RH_corrected PM2.5 concentration. The topological structure of the BP neural-network model in this paper consisted of four input neurons in the input layer, one output neuron in the output layer and one hidden layer with six neurons. The flowchart of the 4-6-1 structure is shown in Figure 4.
Pav and Pi are the average accuracy and the accuracy of the ith result, respectively. Pav provides information on the accuracy of the model using a given dataset. Values close to 1 indicate better model performance.

Figure 6. The sketch map (left) describes the position of a point light source and the surface used to compute the luminous flux. Point O has a point light source with a light intensity of 1.0 unit (I0). The distance from point O to point H is 1076 m (the aerosol scale height of Beijing in winter [57]). A horizontal plane passes through point H and is perpendicular to the Z-axis; this plane is termed the Z = 1076 m Plane. Point P is on the Z = 1076 m Plane. The luminous flux through point P and parallel to the Z-axis is a function of two parameters. One is the distance from point P to point H; the other is the extinction coefficient of the atmosphere. The line chart (right) describes the function for different situations. The X-axis is the distance from point P to point H, and the Y-axis is the upward luminous flux through point P and parallel to the Z-axis. The label e is the extinction coefficient of the atmosphere. Area d was a light-emitting region, and areas b and c were not. The upward luminous flux of areas b and c was radiated from area d.
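The geometry in Figure 6 can be explored numerically with the sketch below. The exact function plotted in the figure is not given in the text, so the formula is an assumption: direct light from the point source is attenuated by the inverse-square law, by the cosine of the angle to the Z-axis, and by Beer-Lambert extinction along the slant path, and scattered light is ignored. The function name and the extinction values are hypothetical.

```python
import math

def upward_flux(r, extinction, intensity=1.0, height=1076.0):
    """Vertical flux per unit area at point P on the plane Z = height.

    r          : horizontal distance from P to H, in metres
    extinction : atmospheric extinction coefficient e, in 1/m (assumed uniform)
    intensity  : light intensity I0 of the point source at O
    """
    d = math.hypot(r, height)                 # slant distance from O to P
    cos_theta = height / d                    # projection onto the Z-axis
    transmittance = math.exp(-extinction * d) # Beer-Lambert attenuation
    return intensity * cos_theta / d ** 2 * transmittance

# Flux at 0, 5 and 10 km from H for two illustrative extinction coefficients.
for e in (1e-4, 5e-4):
    print(e, [upward_flux(r, e) for r in (0.0, 5000.0, 10000.0)])
```

Under these assumptions the flux falls off faster with distance when the extinction coefficient is larger, so the ratio of the flux far from the source to the flux near it carries information on the extinction coefficient.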