Use of Simulated and Observed Meteorology for Air Quality Modeling and Source Ranking for an Industrial Region

The Gaussian-based dispersion model American Meteorological Society/Environmental Protection Agency Regulatory Model (AERMOD) is used to predict concentrations for air quality management in several countries. A study was conducted for Chembur, an industrial area of Mumbai city in India, to assess how well AERMOD predictions driven by observed surface meteorology and by Weather Research and Forecasting (WRF) model output agree with ground-level NOx and PM10 concentrations. The model was run with both meteorological data sets and the emission inventory. When the results were compared, the air quality predictions were better with the WRF output than with the observed meteorological data. This study showed that onsite meteorological data can be generated by WRF, which saves resources and time, and this could be a good option in low-middle income countries (LMIC) where meteorological stations are not available. This study also quantifies the source contributions to ambient air quality in the region. NOx and PM10 emission loads were always highest from industries, but the ambient NOx concentration was highest from vehicular sources and the ambient PM10 concentration was highest from industrial sources. This methodology can help regulatory authorities to develop control strategies for air quality management in LMIC.


Introduction
Urbanization-related issues have become very prominent across the world [1][2][3], especially in developing countries like India, where cities have started facing an acute air pollution problem due to urbanization [4]. Many Indian megacities, such as Delhi, Mumbai, Bangalore, and Kolkata, are witnessing increasing health problems due to a rapid increase in air pollution [5]. The total health cost due to air pollution for Mumbai was about USD 8 billion for the year 2012 [6]. This problem is particularly complex to resolve in urban areas because of diverse emission sources such as vehicles, industries, bakeries, hotels, diesel generating sets and combustion of solid fuels in the domestic sector.
Air quality monitoring networks have been installed at various locations in many cities. However, the installation and operation of a large number of air quality monitoring stations need considerable financial resources from government, which may not be available in low-middle income countries (LMIC). Monitoring data are increasingly used to communicate the existing status of air quality, but they do not contribute to the understanding of sources and meteorological factors. Whereas the observed data represent air quality status for a particular location only, dispersion models can provide information about much larger areas. Further, modeling helps in the determination of concentration plots on spatial and temporal scales and of the contributions from different types of sources of air pollutants [7][8][9][10][11]. A dispersion model can also be used to identify pollution sources with the help of an emission inventory [12,13]. This is very useful in making rational management strategies [7][14][15][16][17][18][19][20]. A dispersion model can determine the contribution of various sources in a region, whereas a receptor model determines source contribution at a particular location [21]. Data required for dispersion models include the emission inventory, geographical data, and meteorological data of the region [22]. Data availability, especially of meteorological data, is an important factor for the assessment of air quality in LMIC because running a meteorological monitoring station requires resources. The use of poor-quality meteorological data in air quality models may have a significant adverse effect on model output quality [23,24]. Meteorological data are generally taken from a nearby meteorological station and used for the study region. The results of an air quality model may still contain significant error despite advanced computer technology, numerical modeling techniques, and state-of-the-art performance evaluations [23][24][25][26][27].
A survey of air quality and meteorological models has also been carried out [28].
The observed meteorological data from a single monitoring station may not give good performance in air quality modeling for an urban industrial region where several emission sources are present at multiple heights and the topography varies. An alternative is to generate onsite meteorological data using a meteorological model, which could be an effective option in LMIC. A study was conducted on the coupling of the American Meteorological Society/Environmental Protection Agency Regulatory Model (AERMOD) with the Weather Research and Forecasting (WRF) model in Pune city (India) for a single pollutant, PM10 [29]. However, that study did not compare the concentrations predicted with WRF-generated meteorology against observed values, and it did not estimate the contribution of various sources in the study region. Short-term air quality forecasting has also been carried out using WRF-forecasted meteorology and AERMOD for five days for the Chembur region [30]. In these studies, the AERMOD requirement of horizontally homogeneous hourly surface and upper-air meteorological data was fulfilled from the WRF model. The main objective of this study was to generate onsite meteorological data at mesoscale using the WRF model and to compare the results with observed meteorological data. Both data sets were then used in air quality modeling to evaluate whether coarse-resolution WRF output is a feasible option in LMIC. The study was also extended to rank the sources by their contribution to emissions and to ambient concentrations of NOx and PM10. This will be useful for air quality management of the urban area for regulation purposes [31] in LMIC.

Study Area
The study area, Chembur, represents an industrial site of Mumbai city in India, located at 19.05° N and 72.89° E. This area covers the M East and M West wards of the Municipal Corporation of Greater Mumbai, one of the financial centers of India, as shown in Figure 1a. Chembur has a population of 1.2 million. It measures 6.5 km east-to-west and 8.45 km north-to-south, as shown in Figure 1c. This region has a marine alluvium type of soil and North-South running basalt hills to its South [32]. The topographical features are shown in Figure 1d in the Universal Transverse Mercator coordinate system. The elevation is maximum at the central part of the study area and minimum along its eastern boundary. The elevation ranges from 1 to 200 m. The elevation just above the location of Rashtriya Chemicals and Fertilizers Limited (RCFL) is 100 to 200 m. Major industries in this area are Bharat Petrochemical Corporation Limited (BPCL), Hindustan Petroleum Refinery Corporation Limited (HPCL), Tata Thermal Power Corporation Limited (TPCL) and RCFL. Containers and heavy-duty vehicles from this area use the Port Trust Road, Mahul Road and Ramakrishna Chemburkar Marg (R C Marg). Road conditions are poor due to the continuous movement of heavy vehicles. The north boundary of the study area has a residential zone comprising Chheda Nagar (between points 2 and 3) and Shramjivi Nagar (the area to the left of point 2), and the south boundary is adjacent to the Tata Thermal Power Plant. The west boundary lies by RCFL and Mahul, and the east boundary is aligned with Shahyadri Nagar and Prayag Nagar. Around twenty years ago, Chembur was one of the most polluted regions in Mumbai. With sustained effort and pressure from authorities and industries to implement a series of control measures, the region has witnessed an improvement in air quality.
In the last two decades, the characteristics of the region have improved due to the closure of many industries, but residential development and vehicular density have increased [33]. Chembur still needs appropriate air quality studies for developing management strategies, as its ambient air quality is poor when compared with the National Ambient Air Quality Standard 2009 of the Central Pollution Control Board (CPCB), New Delhi (India). CPCB has published a document giving a Comprehensive Environmental Pollution Index (CEPI) score for various industrial regions in the country. Chembur has a CEPI score of 69.19 in this report [34]. This score shows that the region should be rated as a severely polluted area. Hence, the region requires a better understanding of air quality processes so that the effectiveness of the action plan can be realized.

Methodology and Data
The schematic data flow of the study is shown in Figure 2. AERMOD requires an emission inventory and nine hourly meteorological parameters (wind speed, wind direction, rainfall, temperature, humidity, pressure, ceiling height, global horizontal radiation and cloud cover) as input data. These meteorological parameters were generated from the WRF model for the year. The same parameters were also observed at RCFL for the same time period, and the two data sets were compared. Concentrations were predicted using an air quality model (AERMOD) with the observed meteorological data and with the WRF-generated data. The meteorological parameters were arranged in the columns of a spreadsheet, with one row per hourly time step. This spreadsheet was processed in AERMET, which is a pre-processor of AERMOD. Terrain data at 90 m resolution from the Shuttle Radar Topography Mission (SRTM) were used in AERMAP, which is also a pre-processor of AERMOD. Then, the AERMOD model was used to predict concentrations of NOx and PM10, as shown in Figure 2. Comparisons were also carried out for both models, WRF and AERMOD. The meteorological model, the parameterization setup, the dispersion model, the emission load, and the observations are presented section-wise.
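The spreadsheet layout described above (one column per parameter, one row per hour) can be illustrated with a minimal sketch. The column names, the placeholder year, and the zero values are assumptions made for illustration only; the paper does not give the exact headers or file format passed to AERMET.

```python
import csv
import io
from datetime import datetime, timedelta

# Hypothetical column names for the nine parameters listed in the text.
COLUMNS = [
    "wind_speed", "wind_direction", "rainfall", "temperature", "humidity",
    "pressure", "ceiling_height", "global_horizontal_radiation", "cloud_cover",
]

def hourly_rows(start, hours, values_for_hour):
    """Yield one row per hour: a timestamp followed by the nine parameters."""
    for h in range(hours):
        stamp = start + timedelta(hours=h)
        values = values_for_hour(stamp)
        yield [stamp.strftime("%Y-%m-%d %H:00")] + [values[c] for c in COLUMNS]

def write_table(rows, handle):
    """Write a header line and the hourly rows as comma-separated text."""
    writer = csv.writer(handle)
    writer.writerow(["timestamp"] + COLUMNS)
    writer.writerows(rows)

def dummy_values(stamp):
    """Placeholder values; real ones would come from WRF or the RCFL station."""
    return {c: 0.0 for c in COLUMNS}

buffer = io.StringIO()
# 2001 is a placeholder year, not the study year.
write_table(hourly_rows(datetime(2001, 1, 1), 24, dummy_values), buffer)
lines = buffer.getvalue().splitlines()
```

Keeping the WRF-derived and observed tables in the same layout would make it straightforward to swap one for the other in the pre-processing step.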

Meteorological Model
The mesoscale model used in this study is the Advanced Research WRF model, version 3.2 [35]. This model is designed to serve both atmospheric research and operational forecasting needs [36]. It is a limited-area, non-hydrostatic primitive equation model with multiple options for various physical parameterization schemes. This version employs Arakawa C-grid staggering for the horizontal grid and a fully compressible system of equations. A terrain-following hydrostatic pressure coordinate with vertical grid stretching is used in the vertical. The time-split integration uses a third-order Runge-Kutta scheme with a smaller time step for acoustic and gravity wave modes. NCEP FNL (Final) Operational Global Analysis data, which are prepared operationally every six hours on 1.0° × 1.0° grids, were used as input for WRF. This product is from the Global Data Assimilation System (GDAS), which continuously collects observational data from the Global Telecommunications System (GTS) and other sources for analyses [37]. The WRF physical options used in this study consist of the WRF model Single Moment 6-class simple ice scheme for microphysics, the Kain-Fritsch scheme for the cumulus convection parameterization, and the Yonsei University planetary boundary layer scheme. The rapid radiative transfer model and the Dudhia scheme are used for longwave and shortwave radiation, respectively, and the Noah land surface model has been selected. All these parameterizations constitute a well-tested suite of schemes over the Indian region [38][39][40]. The model domain extends from 71° E to 81° E zonally and from 11° N to 21° N meridionally, consisting of 100 by 100 grid points with 25 km grid spacing, as shown in Figure 1b. The model was run from 1st January to 31st December of the year. The model has 28 vertical levels with the top of the model at 10 hPa. Topography as well as snow cover information were obtained from the United States Geological Survey.
The meteorological parameters were extracted from the WRF model at ground level. The WRF model was run at 25 km resolution, which provides a time series of meteorological parameters for a specific period at a particular location. In this study, nine hourly meteorological parameters (cloud cover, temperature, pressure, relative humidity, wind direction, wind speed, ceiling height, rainfall, and global horizontal radiation) were simulated for the year using WRF. WRF gives output in the Network Common Data Form (NetCDF), which can be read by the post-processing software GrADS v2.2. The output from WRF was fed to GrADS 2.2 to generate digital hourly meteorological data, which were arranged in an Excel spreadsheet with the hourly data for each meteorological parameter in a separate column. AERMET accepts data in an Excel spreadsheet or in other formats; here, the spreadsheet prepared from the GrADS 2.2 output was imported into AERMET, the pre-processor of AERMOD.
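Wind speed and direction are usually not stored directly in WRF output; the 10 m wind is written as an eastward and a northward component (U10 and V10 in standard WRF output). The paper does not describe how the two quantities were derived, so the following is only a sketch of the conventional conversion, assuming the components are already earth-relative (no map-projection rotation is applied here).

```python
import math

def wind_from_components(u, v):
    """Return (speed, direction) from eastward u and northward v components.

    Direction follows the meteorological convention: the bearing the wind
    blows FROM, in degrees clockwise from north (0 = northerly, 270 = westerly).
    """
    speed = math.hypot(u, v)
    # atan2(-u, -v) points back along the wind vector, towards its origin.
    direction = math.degrees(math.atan2(-u, -v)) % 360.0
    return speed, direction
```

For example, a wind blowing towards the east (positive `u`, zero `v`) is reported as a westerly with a direction of 270 degrees.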

Dispersion Model
A dispersion model uses emission inventory, geographical and meteorological data to predict the concentration at receptor points in the study region. The format of the input data varies between models. There are many specific models for vehicular and industrial sources as well as for a variety of sources [19][41][42][43]. The Industrial Source Complex model (ISC3), developed by the United States Environmental Protection Agency, is a steady-state Gaussian plume model which can be used to assess pollutant concentrations from a wide variety of sources associated with an industrial source complex [44]. ISC3 operates in both long-term (ISCLT3) and short-term (ISCST3) modes. The ISCST3 model is the regulatory model in India and has been used in many case studies [45]. It was later updated to AERMOD, which added advanced algorithms to give more accurate results [46]. The air quality model used here, AERMOD, has been applied to evaluate the dispersion of several pollutants, including PM10, HCN, SO2, SF6, and VOCs, and is widely recommended by regulatory authorities [47][48][49][50][51].
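To make the term "steady-state Gaussian plume" concrete, the sketch below evaluates the textbook plume equation for a continuous point source with reflection at the ground. It is not AERMOD's formulation, which is more elaborate; the function name, argument names, and the requirement that the caller supplies the dispersion parameters are assumptions made for illustration.

```python
import math

def gaussian_plume(q, u, sigma_y, sigma_z, y, z, h):
    """Textbook Gaussian plume concentration with ground reflection.

    q: emission rate (g/s); u: wind speed at release height (m/s)
    sigma_y, sigma_z: lateral and vertical dispersion parameters (m),
        evaluated by the caller at the receptor's downwind distance
    y: crosswind offset (m); z: receptor height (m)
    h: effective release height (m)
    Returns the concentration in g/m3.
    """
    if u <= 0 or sigma_y <= 0 or sigma_z <= 0:
        raise ValueError("u, sigma_y and sigma_z must be positive")
    lateral = math.exp(-y ** 2 / (2 * sigma_y ** 2))
    # The second exponential is the image source that models ground reflection.
    vertical = (math.exp(-(z - h) ** 2 / (2 * sigma_z ** 2))
                + math.exp(-(z + h) ** 2 / (2 * sigma_z ** 2)))
    return q / (2 * math.pi * u * sigma_y * sigma_z) * lateral * vertical
```

The inverse dependence on wind speed is one reason why the choice between observed and simulated meteorology can change predicted concentrations substantially.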
The study area (as given in Figure 1c) was defined in AERMOD, emission locations were digitized with reference to their real positions on the earth's surface, and emission quantities were entered from the estimated emission inventory. Therefore, there is no resolution concept for the emission inventory in this study. The meteorological data output from the WRF model was processed in AERMET and its output was fed into AERMOD. AERMAP, a pre-processor of AERMOD, calculates a representative terrain-influence height, also referred to as the terrain height scale, at each receptor. Uniform Cartesian gridded receptors were specified in addition to discrete receptors, and all receptors were at 2 m height. The anemometer height was 10 m and the surface roughness length was 1 m in this model run. Building downwash was not considered. AERMOD calculates the concentration for each hour using hourly meteorological data, for each pollutant separately.
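A uniform Cartesian receptor grid of the kind described above can be sketched as follows. The 500 m spacing and the local origin are hypothetical, since the paper does not state the grid spacing; only the 6.5 km by 8.45 km extent and the 2 m receptor height come from the text.

```python
def receptor_grid(x0, y0, width_m, height_m, spacing_m, z_m=2.0):
    """Return (x, y, z) receptor tuples on a uniform Cartesian grid."""
    if spacing_m <= 0:
        raise ValueError("spacing_m must be positive")
    nx = int(width_m // spacing_m) + 1   # points along the east-west axis
    ny = int(height_m // spacing_m) + 1  # points along the north-south axis
    return [(x0 + i * spacing_m, y0 + j * spacing_m, z_m)
            for j in range(ny) for i in range(nx)]

# Study-area extent from the text; origin and spacing are illustrative only.
grid = receptor_grid(0.0, 0.0, 6500.0, 8450.0, 500.0)
```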

Emission and Concentration Data
Emission loads were computed for point sources (specifically 36 stacks of BPCL, 30 stacks of HPCL, 4 stacks of Tata Power and 17 stacks of RCFL), line sources (the 6 roads of Chembur), and area sources (e.g., bakeries, hotels and restaurants, crematoria and the domestic sector). The area sources were taken from a previous study (CSIR-NEERI) [52] for the M East and M West wards, in which the area source emission load was computed. These sources cover the region of Chembur (M East and M West wards) where the domestic sector is present. Industrial emission data were collected from the industries, and the vehicular emission inventory was prepared from field survey data for the study period. The vehicular emission rate was estimated using the actual number of vehicles per unit time, emission factors and vehicle kilometers travelled [53]. The percentage contribution of each source type was estimated after the emission inventory of the region had been compiled. PM10 and NOx emissions are mainly caused by industries (94% and 64%, respectively). The remaining 36% of NOx emission is contributed by vehicles (18%), domestic sources (17%) and others (1%). The emission factors of vehicles are available for particulate matter, and these were taken as PM10. Figure 3 depicts the emission loads of NOx and PM10 from the various types of sources in the study area. The observed concentration data for NOx and PM10 were collected at industrial sites, i.e., HPCL and BPCL, at a height of about 3.5-4.0 m. Continuous (hourly) ambient air quality monitoring of NOx and PM10 was done at these sites using Teledyne instruments.
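The vehicular emission estimate described above multiplies vehicle counts, emission factors and distance travelled, and the source shares follow from the summed loads. The sketch below illustrates that arithmetic; the function names, vehicle classes and all numeric inputs in the usage lines are hypothetical and are not taken from the study's inventory.

```python
def line_source_emission(counts_per_hour, emission_factors_g_per_km, road_length_km):
    """Return the emission rate (g/h) of one road segment.

    counts_per_hour: vehicles per hour, keyed by vehicle class
    emission_factors_g_per_km: emission factor per class (g/km per vehicle)
    road_length_km: distance each vehicle travels on the segment (km)
    """
    total = 0.0
    for vehicle_class, count in counts_per_hour.items():
        total += count * emission_factors_g_per_km[vehicle_class] * road_length_km
    return total

def percentage_shares(loads):
    """Return each source type's share (%) of the total emission load."""
    total = sum(loads.values())
    if total <= 0:
        raise ValueError("total load must be positive")
    return {name: 100.0 * load / total for name, load in loads.items()}

# Illustrative inputs only.
example_rate = line_source_emission({"car": 100, "truck": 10},
                                    {"car": 0.5, "truck": 5.0}, 2.0)
example_shares = percentage_shares({"industry": 3.0, "vehicles": 1.0})
```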

Results and Discussion
The results of WRF were used as inputs for the estimation of concentrations by AERMOD. NCEP FNL (Final) Operational Global Analysis data were used as input for WRF, which produced 30 meteorological parameters at the required time resolution for the study period. AERMET (the pre-processor of AERMOD) requires nine meteorological parameters, and these were extracted from the 30 parameters for air quality modeling. The contributions of the various sources to ambient concentration were then estimated.

Validation of Wind and Temperature Time Series
WRF generated the nine hourly meteorological parameters that were used in air quality modeling, and this meteorological output was validated. Of the WRF output time series, temperature and wind are the most significant for air pollution. Therefore, a point validation of temperature was conducted by comparing the hourly WRF temperature for the year with the temperature observed at the RCFL industry, Chembur. A fair estimate of the dispersion of pollutants in the atmosphere is possible based on the frequency distribution of wind direction and wind speed. Wind transports pollutants from various sources, causing turbulent mixing and diluting pollution. Boundary layer cumulus clouds vent pollution into the free troposphere, and temperature and humidity levels in the boundary layer affect chemical reactions and the rates at which many dangerous compounds are formed.

Validation of Wind
Wind data were derived from the WRF model at a height of 10 m above the ground for the whole year, and wind roses for Chembur were plotted from (a) the WRF model and (b) the RCFL observed data for the year (Figure 5). In the WRF output, the wind blew most persistently from the west and south-west, while in the RCFL observed data the wind blew from the west and north-west for most of the time. The observed wind data had more calm periods than the data simulated by WRF: the observed wind rose had 62% calm conditions, whereas the simulated wind rose had 4% calm conditions over the period. The root mean square error and mean bias error between predicted and observed wind speed were 4.05 and 3.29, respectively. The root mean square error and mean bias error between predicted and observed temperature were 3.69 and −1.5, respectively. The observed wind data were collected at the RCFL industry at a height of 6 m and represent the microscale domain. Chembur has considerable variation in topography, as shown in Figure 1d. The topography changes within a few meters, and this diverts the wind. Consequently, this variation in topography may cause the mismatch between the WRF and observed wind roses.
Modeling was performed with both the WRF output and the RCFL observed meteorological data. The performance of WRF for wind is poor owing to surface inhomogeneity. However, for the purpose of dispersion modeling, we consider WRF a good representation of the mesoscale flow. The wind observed at RCFL is not representative of the entire Chembur region for dispersion modeling. This may be because the largest emissions in this region are from industrial sources, which release at an elevated height. Hence, for these sources, the mesoscale meteorology generated by the WRF model may be more appropriate than the observed microscale meteorology.
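The root mean square error, mean bias error and calm percentage quoted in this section can be computed from paired hourly series as in the following sketch. The sign convention (predicted minus observed) and the calm threshold are assumptions, since the paper does not define either; the function names are illustrative.

```python
import math

def _paired(predicted, observed):
    """Validate and pair two equally long, non-empty series."""
    if len(predicted) != len(observed) or not predicted:
        raise ValueError("series must be non-empty and of equal length")
    return list(zip(predicted, observed))

def rmse(predicted, observed):
    """Root mean square error between predicted and observed values."""
    pairs = _paired(predicted, observed)
    return math.sqrt(sum((p - o) ** 2 for p, o in pairs) / len(pairs))

def mbe(predicted, observed):
    """Mean bias error; positive when the model over-predicts (assumed convention)."""
    pairs = _paired(predicted, observed)
    return sum(p - o for p, o in pairs) / len(pairs)

def calm_fraction(speeds, threshold):
    """Fraction of hours with wind speed below a chosen calm threshold."""
    if not speeds:
        raise ValueError("speeds must be non-empty")
    return sum(1 for s in speeds if s < threshold) / len(speeds)
```

Under this convention a positive bias in wind speed would indicate that the simulated winds are, on average, stronger than those observed at the station.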

Validation of NOx and PM10
The annual PM10 and NOx concentration contour plots for all sources in the study domain are shown in Figure 6a,b, respectively. A comparison of the concentrations simulated using WRF output with the observed concentrations of PM10 and NOx is given in Table 1; the model compared well with observations for this study area. The root mean square error and mean bias error between predicted and observed concentrations for NOx were 1.76 and 0.063, respectively, while they were 0.41 and 0.83 for PM10, respectively. The standard deviations for NOx at BPCL and HPCL were 33.6 and 30.2, respectively, and the standard deviations for PM10 at BPCL and HPCL were 16.4 and 12.4, respectively. The values obtained through air quality modeling were closer to the observed concentrations with the mesoscale meteorology than with the surface-level meteorology. The model results using observed surface meteorology were high. Modeled values were in good agreement with the observed values at both locations for NOx, but for PM10, simulated concentrations were lower than the observed concentration at HPCL. This may be due to vehicular congestion and resuspended particles. The model performed well with mesoscale meteorology when all the sources and the entire region were considered. As Chembur has immense variation in topography, land use and geographical structure, as shown in Figure 1d, microscale meteorology varies with these factors. Mesoscale meteorology was therefore used for the other air quality analyses. The contours were plotted using the model for NOx and PM10 concentrations based on one-hour average values for one year for all sources in the study area. The same analysis was repeated separately for industrial sources only and for vehicular emission sources only. The maximum concentration of PM10 was 71.8 µg/m3. This concentration was found near the Chembur Gaothan area, where vehicular congestion was more intense.
Around the BPCL and HPCL area, the PM10 concentration was around 50 µg/m3. The minimum concentration of PM10 was less than 42 µg/m3, and this concentration was found near the southern boundary of the study region, which is represented by a deep violet color. The PM10 modeling was carried out using 30 µg/m3 as the background concentration. This value was estimated as the one at which modeled concentrations matched observed concentrations. This background concentration also includes resuspended particulate matter (RSPM), which is induced by vehicular congestion and other factors. In PM10 modeling, this background concentration can be taken as a correction factor. The maximum concentration of NOx was 53 µg/m3 near the Ghatkopar-Mankhurd link road. This may be due to vehicular congestion in the Deonar Village, BPCL and HPCL area. The annual minimum concentration was less than 10 µg/m3 along the eastern, western and southern boundaries of the study area, which is shown in a deep violet color. Emission load does not represent the rank-wise contribution of sources to the ambient concentration of the region. Hence, modeling was carried out separately for industrial sources, vehicular sources, and light duty diesel vehicles (LDDVs) to determine the relative source-wise contribution to ambient air quality, as a basis for future control strategies and environmental management.

Results of Industrial Sources
For the modeling of industrial emission sources, four industries (BPCL, HPCL, RCFL and TPCL) in Chembur were considered. NOx and PM10 emissions were modeled for the year to find the dominant source in the study domain.

Contribution of NOx and PM10 Concentration by Industries
Industries accounted for 64% of the NOx and 94% of the PM10 emission load in Chembur (Figure 3). However, they contribute less to the ambient concentration in the study area. The southern part of the study area is dominated by industrial sources, and due to meteorology, the maximum concentration of NOx is 6.2 µg/m3, seen at HPCL. The NOx concentration is 4.8 µg/m3 in the southern part of BPCL. The maximum concentration of PM10 is 35 µg/m3 at BPCL and 33 to 34 µg/m3 at HPCL and RCFL. Table 2 compares the simulated concentrations of NOx and PM10 for industrial sources only with the simulated ambient concentrations of NOx and PM10 for this study area.

Results of Line Sources
In line source modeling, six roads in the Chembur area were considered. NOx and PM10 emissions were modeled for the year to find the dominant sources in the study domain. Vehicular emission varies with the time of day, such as the morning peak, the evening peak, off-peak hours and lean hours.

NOx and PM10 Concentration Contribution by Vehicles
NOx and PM10 emission loads from vehicles in Chembur are 17% and 3%, respectively (Figure 3). Nevertheless, vehicles contribute considerably to ambient concentration because they are ground-level emission sources. The northern part of the study area is dominated by vehicular sources and a high density of vehicles. The maximum concentration of NOx was 43 µg/m3 at Chheda Nagar and 40.1 µg/m3 at Chembur Naka. In the southern part of the study area, the effect of vehicular pollution is inconsequential (only 2-5 µg/m3). At Chheda Nagar, the ambient concentration from all the sources was 54 µg/m3, while the concentration from vehicular sources was 41 µg/m3. Chheda Nagar is in the northern region of the study area, and this part is less affected by industrial emissions. The northern part of the study domain is highly populated compared to the southern part, which has a smaller contribution from vehicles. At Chheda Nagar, the maximum concentration of PM10 from vehicles was 37.84 µg/m3 and from the other sources was 70.8 µg/m3.
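The Chheda Nagar figures above allow the vehicular share of the ambient concentration to be worked out directly. The sketch below does this for the 41 µg/m3 and 54 µg/m3 values, which are taken here to refer to NOx based on the surrounding sentences; the function name is illustrative.

```python
def share_percent(source_concentration, total_concentration):
    """Return a source's share (%) of the total ambient concentration."""
    if total_concentration <= 0:
        raise ValueError("total concentration must be positive")
    return 100.0 * source_concentration / total_concentration

# Values quoted in the text for Chheda Nagar (assumed to be NOx, in ug/m3).
vehicular_share = share_percent(41.0, 54.0)
```

On these numbers vehicles account for roughly three quarters of the modeled concentration at this receptor, far above their share of the emission load.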

NOx and PM10 Concentration Modeling by Diesel Car and LDDV
In this analysis, the northern and western corners are dominated by diesel cars and light duty diesel vehicles (LDDVs), while the other areas are almost free from NOx pollution. The maximum concentration of NOx was 12.5 µg/m3, found at Chheda Nagar. The entire area of Shramjivi Nagar showed a NOx concentration of 7-9 µg/m3 from diesel cars and LDDVs. The contribution of all vehicles to the ambient concentration was 43.1 µg/m3. Thus, it can be concluded that diesel cars and LDDVs contribute about one fourth of the line source contribution. In PM10 concentration modeling, the northern and western corners are also affected by diesel cars and LDDVs. With the 30 µg/m3 background concentration included, the maximum PM10 concentration from diesel cars and LDDVs was 32.8 µg/m3 at Everard Nagar (Point 1 area) and 32.5 µg/m3 at Chembur Naka. Thus, it can be concluded that diesel cars and LDDVs contribute around 17% of the line source concentration and 25% of the total concentration.
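Because the PM10 values in this section include the 30 µg/m3 background while the NOx values do not, it can help to separate the background before comparing source contributions. The following sketch shows that bookkeeping for the figures quoted above; the helper name is hypothetical, and the calculation is an illustration rather than the procedure used in the study.

```python
def net_of_background(concentration, background):
    """Return the concentration attributable to modeled sources only."""
    return max(concentration - background, 0.0)

# NOx (ug/m3): diesel cars and LDDVs versus all vehicles, from the text.
diesel_nox = 12.5
all_vehicle_nox = 43.1
diesel_fraction_of_vehicles = diesel_nox / all_vehicle_nox

# PM10 (ug/m3): reported maximum includes the 30 ug/m3 background.
diesel_pm10_net = net_of_background(32.8, 30.0)
```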

Summary and Conclusions
The aim of the study was to generate onsite meteorological data for use in air quality modeling in order to assess the feasibility of coarse-resolution WRF output in LMIC. WRF generated onsite meteorological data, which were fed into AERMET, the pre-processor of the dispersion model AERMOD. AERMOD calculated concentrations of NOx and PM10 using the compiled emission inventory for all the sources of the study area, including industries and vehicles. The air quality modeling results showed that, in this particular case, the mesoscale meteorological data from WRF output performed better than the observed meteorological data. WRF output could therefore be a good option that may better represent the meteorology for the purpose of dispersion modeling. This may be because industrial sources make a significant contribution in the region. The results were used to identify the critical areas and the relative contribution of various sources to ambient air quality. The use of the WRF model is very economical in resources and time.
The ambient concentration load of the study area is shown in Figure 7 for NOx and PM10. For ambient concentration, vehicular emissions dominate in the region because they are ground-level sources. Based on the study, the following conclusions can be drawn. Vehicles accounted for 3% of the total PM10 emission load and 17% of the total NOx emission load. Industrial sources contributed 64% and 94% of the NOx and PM10 loads, respectively. The domestic sector contributed significantly to NOx emission, with 18% of the total emission load. The NOx emission load of industries was 64% of the total emission load, but industries contributed only 25-30% of the NOx concentration in the ambient air. Industries contributed 94% of the total PM10 emission load in the study domain, but only 57% of the ambient PM10 level. The NOx emission contribution from vehicles was 17% of the total emission, but in ambient air quality it contributed 26% of the total because it is a ground-level source. The vehicular PM10 emission contribution was 3% of the total emission load, but it contributed 25% of the total ambient PM10 concentration. For the ambient concentration of NOx, diesel cars and LDDVs contributed one fourth of the line source contribution in this study domain. Micro-meteorology may vary a lot due to the topography of the region. Topographic features could be one of the limitations of the use of WRF output meteorology for air quality modeling because WRF may not capture the high-rise buildings in the region. In AERMOD, only one meteorological profile can be used, but meteorology may not be uniform over the entire region. It has been found many times that observed meteorological parameters carry some measurement errors [54][55][56].
These results will be very useful for forecasting and for the implementation of control and management strategies to improve air quality, and, as future scope, for performing a risk analysis of the different types of sources in the region. This work can also be useful to regulatory authorities in developing a framework for air quality management in LMIC. The limitations of this research were that (a) observed meteorological data were available at only one location, whereas observations at other locations would allow further comparison, and (b) suspended dust was not included, although it can be estimated and incorporated in the emission inventory. Testing various physics options in the WRF model setup and simulating the microscale meteorology for comparison with observed meteorology are possible topics for future research. Health benefit analysis can also be done by estimating population exposure from air quality [57].
Author Contributions: Conceptualization, methodology development, software operation and writing, A.K.; review, data procurement and guidance, A.K.D. and R.S.P. All authors have read and agreed to the published version of the manuscript.

Conflicts of Interest:
The authors declare no conflict of interest.