Functional Kriging for Spatiotemporal Modeling of Nitrogen Dioxide in a Middle Eastern Megacity

: Long-term hour-speciﬁc air pollution exposure estimates have rarely been of interest in epidemiological research. However, this can be relevant for studies that aim to estimate the residential exposure for the hours that subjects mostly spend time there, or for those hours that they may work in another location. Here, we developed a model by spatially predicting the long-term diurnal curves of nitrogen dioxide (NO 2 ) in Tehran, Iran, one of the most polluted and populated megacities in the Middle East. We used the statistical framework of functional data analysis (FDA) including ordinary kriging for functional data (OKFD) and functional analysis of variance (fANOVA) for modeling. The long-term NO 2 diurnal curves had two distinct maxima and minima. The absolute minimum value of the city average was 40.6 ppb (around 4:00 p.m.) and the absolute maximum value was 52.0 ppb (around 10:00 p.m.). The OKFD showed the concentrations, the diurnal maximum/minimum values, and their corresponding occurring times varied across the city. The fANOVA highlighted that the effect of population density on the NO 2 concentrations is not constant and depends on time within the diurnal period. The provided estimation of long-term hour-speciﬁc maps can inform future epidemiological studies to use the long-term mean for speciﬁc hour(s) of the day. Moreover, the demonstrated FDA framework can be used as a set of ﬂexible statistical methods.


Introduction
Air pollution is one of the main health and environmental problems of the new civilized world at the global level [1,2]. It is revealed that air pollution has a serious toxicological effect on human health [3,4]. Transportation, power plants, and natural processes have led to humans and ecosystems faced with serious air quality issues [4,5]. Many acute and chronic impacts on human health and even death may be caused by exposure to high levels of air pollutants [6,7]. The World Health Organization (WHO) reported around 4.2 million yearly deaths due to exposure to air pollution worldwide [8].
Nitrogen oxides are a group of highly reactive gases and important air pollutants that affect acids, the formation of photochemical oxidants, and the concentration of hydroxyl radicals [9]. Nitrogen dioxide is one of the nitrogen oxides that are primarily released into the air from combustion [3]. Its main emission sources are power plants, cars, trucks, and buses [10,11]. The exposure to its high concentration can cause cardiovascular diseases and irritation in the human respiratory system, particularly asthma and respiratory symptoms such as difficulty breathing, coughing, and wheezing [12,13].
In different studies, the concentration of air pollution has been assessed by measuring or predicting at monitoring stations using various modeling methods [5][6][7][8][9][10][11][12]. The accurate prediction of air pollution improves the protection strategy of the affected population and helps manage the air quality control system effectively. Spatial interpolation tools such as kriging methods have been widely used to estimate the spatial distribution of pollutants and predict airborne pollutant concentrations using geo-statistical analysis in previous studies [1,8,10,14]. These methods estimate the air pollutant concentrations at unmonitored locations by considering the information on the geographic positions of the recorded data points and the correlation between the measurement points [14]. Various geo-statistical methods have been applied in the literature such as regression kriging for ozone and NO 2 in Japan [15]; mapping occurring of high concentrations of air pollution with indicator kriging for PM 10 and CO in Tehran [1]; modeling the spatial distribution of two-year averages of several air pollutants using ordinary kriging, universal kriging, and co-kriging with Gaussian semi-variogram in Tehran [16]; and the spatio-temporal modeling of daily fine particulate matter in the megacity of Tehran [17]. However, all of the aforementioned geo-statistical studies are for modeling scalar observations and not curves or functions that result from successive measurements of data [18,19]. Functional data analysis (FDA), which was introduced to model the so-called functional data, generally made of curves indexed over the time domain, has been used to model and predict this type of data by smoothing methods since the beginning of the 1990s. This technique has been applied in different fields of spatial statistics such as air quality studies. Goulard and Voltz proposed a functional kriging approach for predicting curves at unmeasured locations. Ignaccolo carried out kriging with external drift to predict the curves of particulate matter concentration in the Piemonte region (Italy) where they employed a multivariate functional principal component analysis for air quality zoning [20,21]. Furthermore, Wang et al. used a functional spatio-temporal model to study the intra-day variation of the diurnal ozone pollutant in Beijing, China [22].
The Middle Eastern megacity of Tehran is the capital of Iran and has serious air pollution due to urbanization, high traffic, and emissions from industrial production activities [17,23,24]. The average monthly, seasonal, and annual concentrations of NO 2 in Tehran have been spatially modeled and reported in previous studies [23,25]. However, a spatiotemporal model for the long-term NO 2 concentrations that capture the intra-day fluctuation has not been developed yet.
To date, the majority of epidemiological studies on the long-term health effects of air pollution, has overlooked the intra-day variation of long-term air pollution. More commonly, studies have used exposure estimates for the annual averages of pollutants or daily averages over one or several years [2,[26][27][28][29][30][31][32][33][34]. However, people spend time in different locations, mostly at home and work. Thus, it might be better to limit the longterm exposure estimates for each geographical location to the corresponding time that people mostly spent there. Furthermore, the intra-day changes in pollutant concentrations may vary between different geo-locations in the study area, which means that the spatial distribution of exposure can change during the day [22]. Hence, models that overlook the intra-day variation of long-term air pollutants in epidemiological studies could potentially result in exposure misclassification.
Here, we developed a flexible statistical model that considers the intra-day changes of long-term nitrogen dioxide in the megacity of Tehran. The developed model could benefit future epidemiological studies in this highly populated and polluted Middle Eastern megacity. We benefited from the statistical framework of functional data analysis that enables flexibility in modeling the intra-day variation. Furthermore, due to the possible effect of population density on the NO 2 concentrations [35], we also studied its effect on the intra-day variation of long-term NO 2 concentrations.

Study Area
Tehran is the capital of Iran with a populated area of about 613 km 2 and a residential population of about 8.2 million and a daytime population of more than 10 million due to diurnal migration. There are towering mountains in its north and in the central desert of the country in its south. Tehran's weather is typically sunny with an annual daily temperature of 18.5 • C that increases to around 40 • C in July and decreases to around −10 • C in January [23,24,36,37].

Data
We obtained hourly NO 2 concentrations (in ppb) for one calendar year from March 2014 to March 2015 from 33 air quality monitoring stations administered by the Tehran Air Quality Control Company (AQCC) and the Iranian Department of Environment (DOE). At the monitoring stations, chemiluminescent analyzers (model AC 32 M of Environment SA, France) were used to measure the NO 2 concentrations. The long-term hourly means of NO 2 concentrations were calculated by averaging the data over the calendar year for each hour. The data of 26 monitoring stations were used for model development and the extra seven stations were used for model validations that are indicated by star signs and square signs in Figure 1, respectively. Moreover, the geographical coordinates of the monitoring stations (longitude and latitude) in decimal degrees were obtained from the same two governmental agencies.

Study Area
Tehran is the capital of Iran with a populated area of about 613 km and a residential population of about 8.2 million and a daytime population of more than 10 million due to diurnal migration. There are towering mountains in its north and in the central desert of the country in its south. Tehran's weather is typically sunny with an annual daily temperature of 18.5 °C that increases to around 40 °C in July and decreases to around −10 °C in January [23,24,36,37].

Data
We obtained hourly NO concentrations (in ppb) for one calendar year from March 2014 to March 2015 from 33 air quality monitoring stations administered by the Tehran Air Quality Control Company (AQCC) and the Iranian Department of Environment (DOE). At the monitoring stations, chemiluminescent analyzers (model AC 32 M of Environment SA, France) were used to measure the NO concentrations. The long-term hourly means of NO concentrations were calculated by averaging the data over the calendar year for each hour. The data of 26 monitoring stations were used for model development and the extra seven stations were used for model validations that are indicated by star signs and square signs in Figure 1, respectively. Moreover, the geographical coordinates of the monitoring stations (longitude and latitude) in decimal degrees were obtained from the same two governmental agencies. The population density variable was categorized into three groups: less than 10,000 persons per square kilometers (low population density), between 10,000 and 20,000 persons per square kilometers (moderate population density), and more than 20,000 persons per square kilometers (high population density). In Figure 1, the green monitoring stations are located in areas with low population densities (seven monitoring stations), yellows are in medium population densities (13 monitoring stations), and reds are located in high population densities (13 monitoring stations). Tehran regions with low population densities are those in -predicted maps inserted in results section with labels of 1, 9, 21, 22; regions with moderate population densities are those with labels of 2 to 6, 12, 16, 18, 19; and regions with high population densities are those with labels of 7, 8, 10, 11, 13 to 15, 17, and 20.

Kriging for Functional Data
Let χ (t) be a curve or a functional random variable that is observed at location s defined over the interval of T as its time domain (e.g., the NO diurnal curve). The population density variable was categorized into three groups: less than 10,000 persons per square kilometers (low population density), between 10,000 and 20,000 persons per square kilometers (moderate population density), and more than 20,000 persons per square kilometers (high population density). In Figure 1, the green monitoring stations are located in areas with low population densities (seven monitoring stations), yellows are in medium population densities (13 monitoring stations), and reds are located in high population densities (13 monitoring stations). Tehran regions with low population densities are those in -predicted maps inserted in results section with labels of 1, 9, 21, 22; regions with moderate population densities are those with labels of 2 to 6, 12, 16, 18, 19; and regions with high population densities are those with labels of 7, 8, 10, 11, 13 to 15, 17, and 20.

Kriging for Functional Data
Let χ s (t) be a curve or a functional random variable that is observed at location s defined over the interval of T as its time domain (e.g., the NO 2 diurnal curve).
Assume that we observe a sample of curves χ si (t) defined for t ∈ T, s i ∈ D , i = 1, . . . , n. In order to predict a curve at an unmonitored site, s 0 , initially, the curves are pre-processed by fitting a set of spline basis functions. After that, the spatial dependence among curves was estimated using the trace-variogram function. Finally, the parameters for performing prediction by ordinary kriging at unmonitored locations were estimated by solving a linear system based on the estimated trace-variogram. This approach defines the best linear unbiased predictor (BLUP) for χ s0 (t) given bŷ where the λ i coefficients satify E χ s 0 (t) − χ s 0 (t) = 0 and minimize the below equation Hence, the optimization problem of the OKFD method was as follows

Expressing Functional Data Using Basis Functions Set
We assumed that each observed functional data can be expressed in terms of a set of basis functions, β 1 (t), β 2 (t), . . . , β K (t), by where a i = (a i1 , . . . , a iK ) and β(t) = (β 1 (t), β 2 (t), . . . , β K (t)). We used the splines as the underlying set of basis functions to estimate functional data (i.e., diurnal curves of NO 2 concentrations) from the discrete annual means of hourly NO 2 concentrations in each station [21,38].

Estimating the Trace-Variogram
In order to account for auto-correlations in geo-spatial modeling, the trace-variogram was used. The definition of the trace-variogram function for the modeling of spatial functional data is An adaptation of the classical moment method (MoM) gives the following estimator for this quantity:γ where |N(h)| is the number of distinct elements in N(h). Once the trace-variogram was estimated for a sequence of distance values of h by the moment method, we fitted parametric

Choosing the Optimum Number of Basis Functions
The number of basis functions to express the functional data was chosen by crossvalidation. Consider the case that the functional data χ s i (t), i = 1, . . . , n, observed at time points of t 1 , . . . , t M , and we wish to approximate them through explanation using a basis functions set. A simple way of establishing an appropriate K, the number of basis functions, can be by calculating the cross-validation SSE in a classical nonparametric sense.
Let χ (j) s i t j be the estimated functional data at t j by using some basis functions set when the datum χ s i t j has been temporarily suppressed from the sample. Then for each K, the cross-validation SSE is calculated by The strategy was to minimize the NPCV(K) statistic with respect to the K to find the optimum number of basis functions [38,39].

Goodness-of-Fit Criteria
To assess the spatial prediction capability of the developed model, we considered four assessment metrics that were the root mean square error (RMSE), the correlation coefficient between the measured values and the corresponding predicted values, the normalized mean bias factor (NMBF), and the weighted normalized mean square error of the normalized ratios (WNNR) [39].

Functional Analysis of Variance
We assigned monitoring stations into three groups based on their region's population density (high, moderate, and low) and compared the mean diurnal curves of NO 2 between these three areas using a functional analysis of variance (FANOVA) [40].

Software and Packages
The R statistical environment version 4.1.2 [41] that was created by R core team at Vienna, Austria at 2021 was used for data analysis. The "geofd" R package was used for geo-spatial functional modeling and prediction [42]. The "fda" R package was used for general functional data analysis methods including smoothing by expansion using a set of basis functions, and the functional analysis of variance [43].

Results
The smoothed diurnal curves of NO 2 in 26 model-training stations are presented in Figure 2. They show that the long-term yearly average of NO 2 concentrations varied between 13.28 to 91.46 ppb during a diurnal time period. Moreover, it highlights the heterogeneity in the values and shapes of the NO 2 diurnal curves.
According to Figure 2, the curves changed abruptly at hours of about 5, 10, 16, 22, thus, we placed four knots at these time points in the smoothing stage. On the other hand, we chose an order of four splines (a cubic spline basis system), so that the number of basis functions was 8. Descriptive statistics (mean and standard deviation curves) of the smoothed NO 2 diurnal curves are presented in Figure 3.

Results
The smoothed diurnal curves of NO in 26 model-training stations are presented in Figure 2. They show that the long-term yearly average of NO concentrations varied be tween 13.28 to 91.46 ppb during a diurnal time period. Moreover, it highlights the heter ogeneity in the values and shapes of the NO diurnal curves.  we chose an order of four splines (a cubic spline basis system), so that the number of basis functions was 8. Descriptive statistics (mean and standard deviation curves) of the smoothed NO diurnal curves are presented in Figure 3. Based on the results presented in Figure 3a, one can see that there were two minima and two maxima in the NO mean diurnal curve, which was calculated across all monitoring stations, during. These maxima occurred in the morning between 9:43 a.m. and  (Figure 3a). However, the absolute minimum value occurred in the early morning (around 5 a.m., which coincided with the local minimum of the mean curve) and its local minimum occurred in the afternoon (around 4 p.m., which coincided with the absolute minimum of the mean curve). Hence, one could expect a more homogenous situation between NO Based on the results presented in Figure 3a, one can see that there were two minima and two maxima in the NO 2 mean diurnal curve, which was calculated across all monitoring stations, during. These maxima occurred in the morning between 9:43 a.m. and  Figure 3a). However, the absolute minimum value occurred in the early morning (around 5 a.m., which coincided with the local minimum of the mean curve) and its local minimum occurred in the afternoon (around 4 p.m., which coincided with the absolute minimum of the mean curve). Hence, one could expect a more homogenous situation between NO 2 diurnal curves in different areas of Tehran in the early mornings.
The results of the trace-variogram fitted models are shown in Table 1 Based on Table 1, the value of spatial structure criteria was Sill (Sill+Nug) = 12,943.076 (12,943.076+8184.138) = 0.612, so there was a notable spatial auto-correlation between the data (the criteria were more than 0.5). Table 2 shows the goodness-of-fit metrics in seven validation-monitoring stations. Predictions were generally good with a slight overestimation in three out of seven monitoring stations (see NMBF). The correlation coefficient between the predicted and the observed data in the validation stations were in the range of 0.614 to 0.903 with an average of 0.781.  Figure 4 presents the prediction of the NO 2 diurnal curve at two validation-monitoring stations with the worse and best prediction based on the correlation coefficient between the observed and predicted values. It shows that the predicted values were consistent with the observed data, and the behavior of the predicted curves was similar to the corresponding smoothed curves. Figure 5 displays the results of FANOVA for comparing the mean of diurnal NO 2 concentration in areas with low, medium, and high population densities. We can see that the population density effect was more complicated than a constant increase or decrease in the mean value during the time. Only areas with low population density had a negative mean difference with respect to the grand mean NO 2 diurnal curve. Actually, the low populated areas were less polluted about 8 ppb in the early morning (between 6 a.m. and 7 a.m.), 2 ppb in the afternoon around 4 p.m., and 6 ppb from 9 p.m. to 4 a.m. The areas with a moderate population density generally were more polluted (about 1 to 1.5 ppb) with respect to the Tehran averages and had approximately the same value from 9 a.m. to 9 p.m., but was about 3.5 ppb more polluted in the early morning (between 5 a.m. and 6 a.m.). The areas with high population density were at most times more polluted with respect to the Tehran averages but had higher NO 2 mean values, especially in the morning between 6 a.m. and 10 a.m. and had lower values in the afternoon around 4 p.m.

District 16
−0.066 3.45 0.0080 0.877 Figure 4 presents the prediction of the NO diurnal curve at two validation-monitoring stations with the worse and best prediction based on the correlation coefficient between the observed and predicted values. It shows that the predicted values were consistent with the observed data, and the behavior of the predicted curves was similar to the corresponding smoothed curves.   Table 2. (b) The validation station of district 10 that had the best prediction based on the RMSE metric in Table 2; bold curves show the prediction of the OKFD model for that station and the red dots show the corresponding observed hourly data. Figure 5 displays the results of FANOVA for comparing the mean of diurnal NO concentration in areas with low, medium, and high population densities. We can see that the population density effect was more complicated than a constant increase or decrease in the mean value during the time. Only areas with low population density had a negative mean difference with respect to the grand mean NO diurnal curve. Actually, the low populated areas were less polluted about 8 ppb in the early morning (between 6 a.m. and 7 a.m.), 2 ppb in the afternoon around 4 p.m., and 6 ppb from 9 p.m. to 4 a.m. The areas with a moderate population density generally were more polluted (about 1 to 1.5 ppb) with respect to the Tehran averages and had approximately the same value from 9 a.m. to 9 p.m., but was about 3.5 ppb more polluted in the early morning (between 5 a.m. and 6 a.m.). The areas with high population density were at most times more polluted with respect to the Tehran averages but had higher NO mean values, especially in the morning between 6 a.m. and 10 a.m. and had lower values in the afternoon around 4 p.m.  Table 2. (b) The validation station of district 10 that had the best prediction based on the RMSE metric in Table 2; bold curves show the prediction of the OKFD model for that station and the red dots show the corresponding observed hourly data.  Table 2. (b) The validation station of district 10 that had the best prediction based on the RMSE metric in Table 2; bold curves show the prediction of the OKFD model for that station and the red dots show the corresponding observed hourly data. Figure 5 displays the results of FANOVA for comparing the mean of diurnal NO concentration in areas with low, medium, and high population densities. We can see that the population density effect was more complicated than a constant increase or decrease in the mean value during the time. Only areas with low population density had a negative mean difference with respect to the grand mean NO diurnal curve. Actually, the low populated areas were less polluted about 8 ppb in the early morning (between 6 a.m. and 7 a.m.), 2 ppb in the afternoon around 4 p.m., and 6 ppb from 9 p.m. to 4 a.m. The areas with a moderate population density generally were more polluted (about 1 to 1.5 ppb) with respect to the Tehran averages and had approximately the same value from 9 a.m. to 9 p.m., but was about 3.5 ppb more polluted in the early morning (between 5 a.m. and 6 a.m.). The areas with high population density were at most times more polluted with respect to the Tehran averages but had higher NO mean values, especially in the morning between 6 a.m. and 10 a.m. and had lower values in the afternoon around 4 p.m. To achieve better interpretability, the predictions of mean NO diurnal curves in areas with low, moderate, and high population densities are depicted in Figure 6. According to this figure, the shapes of the NO diurnal curves were different at most times in areas with different population densities. The values of the NO concentrations in areas with low population densities were notably lower than areas with moderate and high population densities with the exception of times around 4 p.m. The NO concentrations were slightly lower in areas with moderate population density with respect to areas with high population density, but with the exception of times between 1:25 p.m. and 6:24 p.m. The absolute minimum of the mean NO diurnal curve in areas with a low population density To achieve better interpretability, the predictions of mean NO 2 diurnal curves in areas with low, moderate, and high population densities are depicted in Figure 6. According to this figure, the shapes of the NO 2 diurnal curves were different at most times in areas with different population densities. The values of the NO 2 concentrations in areas with low population densities were notably lower than areas with moderate and high population densities with the exception of times around 4 p.m. The NO 2 concentrations were slightly lower in areas with moderate population density with respect to areas with high population density, but with the exception of times between 1:25 p.m. and 6:24 p.m. The absolute minimum of the mean NO 2 diurnal curve in areas with a low population density was around 5 a.m., which was a bit later than the time of the local minimums of the mean NO 2 diurnal curves in the moderate and high population densities.
Atmosphere 2022, 13, x FOR PEER REVIEW 10 of 15 was around 5 a.m., which was a bit later than the time of the local minimums of the mean NO diurnal curves in the moderate and high population densities. The predicted maps of NO concentrations at the starting time point of each hour, from 1 to 24, are presented in Figure 7. According to the predicted maps, the central and east parts of Tehran (i.e., districts 7, 8, 10, 11,13,14,15, and 20 had the highest long-term concentration of NO (upper 45 ppb) for almost all days). The north and northwest districts of Tehran in general have the lowest concentrations, which were almost lower than 34 ppb. Interestingly, this matrix of the predicted maps revealed the existence of different patterns of the NO changes during the day. For example, the north and northwest parts had their cleanest conditions approximately the hours from 4 a.m. to 6 a.m., while it seems that the rest of Tehran had less polluted conditions from midday to 5 p.m. Figure 6. The prediction of the mean NO 2 diurnal curves in low population density areas (green curve), moderate population density areas (blue curve), and high population density areas (red curve).
The predicted maps of NO 2 concentrations at the starting time point of each hour, from 1 to 24, are presented in Figure 7. According to the predicted maps, the central and east parts of Tehran (i.e., districts 7, 8, 10, 11,13,14,15, and 20 had the highest long-term concentration of NO 2 (upper 45 ppb) for almost all days). The north and northwest districts of Tehran in general have the lowest concentrations, which were almost lower than 34 ppb. Interestingly, this matrix of the predicted maps revealed the existence of different patterns of the NO 2 changes during the day. For example, the north and northwest parts had their cleanest conditions approximately the hours from 4 a.m. to 6 a.m., while it seems that the rest of Tehran had less polluted conditions from midday to 5 p.m.

Discussion and Conclusions
In this study, we presented the results of a functional kriging model for the long-term NO 2 diurnal curves in the Middle Eastern megacity of Tehran. Moreover, we investigated the effect of population density on the shape and values of the NO 2 diurnal curves using a functional analysis of variance.
The long-term mean NO 2 concentrations in Tehran were typically above about 40 ppb (~75 µg/m 3 ), which would be about 7 times higher than the 2021 World Health Organization's recommended guidelines [44,45]. We found two peaks defining the maxima concentrations in the mean NO 2 diurnal curve: the local maximum occurred at 10:02 a.m. and the absolute maximum occurred at 10:12 p.m. Since the most important source of NO 2 emissions is road transport, the maxima concentrations of NO 2 may have occurred due to heavier traffic congestions at these special times of the day. In particular, most truck transport operates late at night in order to distribute goods from storehouses to small stores in the city. Therefore, the major increase in NO 2 concentrations late at night can be attributed to local regulations that allow for trucks and heavy vehicles to commute only between 10 p.m. and 6 a.m. Moreover, two minima were observed on 4:59 a.m. (local minimum) and 3:55 p.m. (absolute minimum). This trend in Tehran's hourly NO 2 concentrations were also reported in other studies, for example, Masoudi reported descriptive statistics for the NO 2 hourly means of four monitoring stations in Tehran from 2009 to 2010 [36]. Taheri also reported that black carbon and PM 2.5 in Tehran had a similar trend [46]. In accordance with our study, bimodal diurnal NO 2 curves with the absolute minimum value in the afternoon and the absolute maximum in the night were reported by Cichowicz in central Poland, and by Moreno in Madrid, Spain [47,48].
Furthermore, we showed that the population density was positively associated with the NO 2 concentration. The prediction of NO 2 diurnal mean curves in areas with low population density was notably lower than that in areas with medium or high population densities. There was one exception where NO 2 concentrations remained the same at all population densities and that the time point coincided with the time of the overall minimum value. In line with our study, Zoest et al. showed a strong relationship between the NO 2 concentrations and population density in the city of Eindhoven, The Netherlands where they included population density as a covariate in a spatio-temporal regression kriging model [35]. Kaplan et al. also confirmed this relationship by investigating the correlation between the NO 2 concentrations and population densities [49]. We studied the relationship of the NO 2 concentrations and population densities over a diurnal time. The observed differences between the shapes and values of the predicted mean curves may be due to changes in the intensity of traffic over diurnal time and can also be associated with a difference in the working time of industrial activities in each area. In line with this result, as expected, the estimated NO 2 concentrations were higher at the central and eastern parts of Tehran that have a high population density.
To conclude, we presented here a model for long-term hourly NO 2 concentrations in the megacity of Tehran. Our results could benefit epidemiological studies of long-term exposure to nitrogen dioxides in the Middle Eastern megacity of Tehran. Examples of such epidemiological studies are the effect of long-term exposure to air pollutants on the incidence of lung cancer, the air pollutants' long-term effects on low birth weight, and the effect of exposure to long-term multiple pollutants on the incidence of leukemia [50][51][52]. Our long-term exposure model provided not just an updated estimate for the long-term exposure to nitrogen dioxide, but it also provides for the long-term means across all times of the day. Future epidemiological studies of the effect of long-term air pollutants in Tehran could assign the exposure to each person based on a collection of locations such as the participants' houses, workplaces, schools, etc., and the corresponding time that they spent at each location. This could be important because we revealed in our analysis that the trend in the changes in the concentrations of the pollutants was not consistent in all regions of Tehran. For example, regions 21 and 22 in the west of Tehran had less polluted conditions