The COVID-19 Mortality Rate Is Associated with Illiteracy, Age, and Air Pollution in Urban Neighborhoods: A Spatiotemporal Cross-Sectional Analysis

There are different area-based factors affecting the COVID-19 mortality rate in urban areas. This research aims to examine COVID-19 mortality rates and their geographical association with various socioeconomic and ecological determinants in 350 of Tehran’s neighborhoods as a big city. All deaths related to COVID-19 are included from December 2019 to July 2021. Spatial techniques, such as Kulldorff’s SatScan, geographically weighted regression (GWR), and multi-scale GWR (MGWR), were used to investigate the spatially varying correlations between COVID-19 mortality rates and predictors, including air pollutant factors, socioeconomic status, built environment factors, and public transportation infrastructure. The city’s downtown and northern areas were found to be significantly clustered in terms of spatial and temporal high-risk areas for COVID-19 mortality. The MGWR regression model outperformed the OLS and GWR regression models with an adjusted R2 of 0.67. Furthermore, the mortality rate was found to be associated with air quality (e.g., NO2, PM10, and O3); as air pollution increased, so did mortality. Additionally, the aging and illiteracy rates of urban neighborhoods were positively associated with COVID-19 mortality rates. Our approach in this study could be implemented to study potential associations of area-based factors with other emerging infectious diseases worldwide.


Introduction
COVID-19 infected hundreds of millions of people and killed over 6.6 million by December 2022, and it continues to impact communities worldwide [1,2]. Since the beginning of the outbreak, Iran has had the highest number of total cases in the eastern Mediterranean region (more than 7 million, 9.4% of the total population). Additionally, the country experienced the highest total COVID-19-associated deaths (144,673; CFR 1.91%) [3,4]. For example, during the first 20 months of the outbreak of COVID-19 in Tehran, the capital and the most populated city of Iran, 60,111 infections and 7034 deaths were recorded [5]. portation, and built environment variables. We believe that a comprehensive examination, including more possible components, would better reveal each factor's role, since the predictors compete. Therefore, our study has compared three different regression models, including OLS, GWR, and MGWR, to model the association of COVID-19 mortality rates with the explanatory factors. Finally, our study included 20 months of data, including four peaks of COVID-19 in Tehran, which was not covered by previous studies. Thereby, our study poses questions to obtain the objectives of this research. (1) What are the spatial, temporal, and spatiotemporal patterns of the COVID-19 mortality rates in Tehran, Iran? (2) What risk factors explain the spatially varying COVID-19 mortality rates at the neighbourhood level in Tehran?

Study Area
Tehran is located at latitude 35 • 68 92 N and longitude 51 • 38 90 E. It is divided into 22 districts and 350 neighborhoods and covers an area of 615 km 2 [18] (Figure 1). The population is over 9 million, according to the most recent census in 2016 (4,522,000 females and 4,534,000 males). The average population density in the neighbourhoods is 21,503 people per km 2 , with a standard deviation (SD) of 12,785 [31].

Data
COVID-19 mortality data were obtained from the Iranian Ministry of Health and Medical Education's Hospital Information System (HIS) [36]. We excluded data that were incomplete or inaccurate (about 50 records). Our analysis was based on data from 7,043 people who died due to COVID-19. This dataset spans a period of twenty months, covering December 2019 to July 2021 and includes the date of death, age, gender, hospitalization location, and date and home address [36,37].
This study's administrative geographical data include the most recent municipal division data (city, region, and neighbourhood boundaries). The built environment and land statistics were obtained from the Tehran city municipality [36]. Spatial analytic methods were used to produce additional spatial data, such as spatial density indicators for our study variables (Supplementary file 1). The study variables were aggregated at the neighbourhood level using ArcGIS Pro 3.0.2 software (ESRI, Redlands, CA, USA).
The Tehran Air Quality Control Company web portal was used to obtain air quality index (AQI) data from 2016 to 2021 [32]. The public transport data were obtained from the municipality of Tehran directly and updated with Open Street Map (OSM) datasets (https://www.openstreetmap.org/ accessed on 23 January 2023) using the QGIS platform (https://www.qgis.org/ accessed on 23 January 2023). In addition, the Iranian Statistical Centre provided the socioeconomic characteristics of neighborhoods (including unemploy- Tehran's neighbourhoods' environmental, economic, and social characteristics are highly diverse and dissimilar [31]. Air pollution is caused by a high population density, the dominance of personal mobility, the entry of dust from the South, and the presence of polluting businesses in the West and Southwest (27 km 2 (3.7%) of the city's total area is devoted to industrial purposes) [12,[32][33][34]. Tehran is one of the world's most polluted cities [33], with levels of O 3 , NO 2 , and particular matter of ≤2.5-micron size (PM 2.5 ) remaining high throughout the year but reaching the highest levels in the fall and winter [34,35]. Tehran's annual average air temperature is about 15-18 • C [32].

Data
COVID-19 mortality data were obtained from the Iranian Ministry of Health and Medical Education's Hospital Information System (HIS) [36]. We excluded data that were incomplete or inaccurate (about 50 records). Our analysis was based on data from 7,043 people who died due to COVID-19. This dataset spans a period of twenty months, covering December 2019 to July 2021 and includes the date of death, age, gender, hospitalization location, and date and home address [36,37].
This study's administrative geographical data include the most recent municipal division data (city, region, and neighbourhood boundaries). The built environment and land statistics were obtained from the Tehran city municipality [36]. Spatial analytic methods were used to produce additional spatial data, such as spatial density indicators for our study variables (Supplementary File S1). The study variables were aggregated at the neighbourhood level using ArcGIS Pro 3.0.2 software (ESRI, Redlands, CA, USA).
The Tehran Air Quality Control Company web portal was used to obtain air quality index (AQI) data from 2016 to 2021 [32]. The public transport data were obtained from the municipality of Tehran directly and updated with Open Street Map (OSM) datasets (https: //www.openstreetmap.org/ accessed on 23 January 2023) using the QGIS platform (https: //www.qgis.org/ accessed on 23 January 2023). In addition, the Iranian Statistical Centre provided the socioeconomic characteristics of neighborhoods (including unemployment rates) of 2016 [38].

Explanatory Variables
The explanatory variables used in this study are listed in Table 1. Figure 2 shows how these variables are distributed spatially across the study area.
(1) Built environment, land use, and urban facilities: previous research has found a relation between the density of various land uses and the COVID-19 epidemic [38,39]. The goal of this study, on the other hand, was to determine the spatial association between land use categories and COVID-19 mortality rates. As a result, the density of various uses, such as banks, restaurants, and high-rise residences, was investigated. (2) AQI: the literature has shown a strong, positive relationship between air pollutants and COVID-19 transmission and mortality in several geographic regions [40][41][42].
The major air contaminants considered as initial independent variables in this study are NO, NO 2 , O 3 , CO, and particulate matter PM 2.5 and PM 10 (NO and NO 2 are collectively referred to as NO x ). Some previous studies have used the effects of these pollutants on respiratory diseases individually or in combination with each other [23,[43][44][45][46][47][48][49][50]. In the present study, both were used, and the most significant variables were included in the final model. Exposure to air pollution is an essential risk factor for many of the chronic diseases that cause people to be more likely to become seriously ill, require intensive care and mechanical ventilation, and die from COVID-19 [51]. In this study, the average of 5 years (2016 to 2021) for each pollutant was derived from the pollutant-related data of air pollution monitoring stations. Then, using the inverse distance weighted (IDW) interpolation technique, data were calculated in a GIS with pixels of 1 × 1 km in size. The 5-year average of pollution was then calculated for each neighbourhood separately using zonal statistical methods. (3) Public Transportation: public transportation contributes to the geographic spread of COVID-19 [47]. Few studies have examined the correlation between these factors and mortality rates. To analyze the association at the neighbourhood level, we considered variables such as distance from fuel stations, metro and bus rapid transit (BRT) stations, and the spatial density of main roadways. (4) Socioeconomic features: recent research has shown that socioeconomic variables impact COVID-19 transmission and mortality in various settings in developing countries, such as Iran [52]. In this study, we looked at the spatial correlations between six such variables (population density, illiteracy, unemployment, older age, having a rented home, and being an immigrant) and neighbourhood-level COVID-19 mortality. Large numbers of industrial units in a region has a significant effect on the number of active COVID-19 cases in that region [53,54].
Land use for social services (V 3 ) Spatial density of social services units per km 2 Social service centres may be a place of the spread of infectious diseases as they attract many people at times of epidemics [12].
Banking (V 4 ) Spatial density of banks per km 2 Some studies have shown that banks and automated teller machines (ATMs) are important for the spread of COVID-19 [55].
Health service (V 5 ) Spatial density of health service centres (including special hospitals for COVID-19 patients, public clinics, and laboratories) per km 2 For example, hospitals for COVID-19 patients, public clinics, and laboratories naturally have an extremely strong association with the rate of COVID-19 infection [12].

Deteriorated buildings (V 6 )
The ratio of areas with deteriorated and old buildings to the total area of each neighbourhood in km 2 * 100 (%) Low-income and impoverished people generally reside in worn and inadequately constructed settings, where the houses are deteriorated leading to a relatively great danger of disease outbreak [12,56].
High-rise buildings (V 7 ) Spatial density of residential high-rise buildings per km 2 Overcrowding, dense space, and health conditions in high-rise buildings can increase the risk of COVID-19 outbreaks, which can affect people in different age groups and individuals who are suffering from underlying diseases [57].
Distance from the city business district (CBD) (V 8 ) Distance to the CBD in km Previous studies show that the COVID-19 transmission decreases with the distance from the city centre [47].
Presence of restaurants (V 9 ) Spatial density of restaurants per km 2 Controlling transmission in restaurants is an important component of public health measures for COVID-19 [58]. Where the urban road network creates the most intersections, individual and collective contacts increase. In the long run, this increases the spread of the disease in the surrounding areas [68].

Fuel stations (V 22 ) Distance to the fuel (petrol and gas) stations in meters
As with other location based public facilities, fuel stops may increase the transmission and spread of the COVID-19 virus in nearby areas [55].
There is a significant relationship between population density, overcrowding, and the spread of COVID-19 virus [64,69].
Health literacy enables people to understand the reasons behind medical recommendations and to become aware of the possible outcomes of their actions. Instead, higher levels of adult's illiteracy rates can be seen as a social risk factor for rising COVID-19 related deaths [70].
Areas with a higher unemployment rate positively associated with COVID-19 high mortality rates [71].
Older age groups experienced higher COVID-19 mortality rates. Subsequently, areas with a large proportion of elderly people face a high risk of infection [72,73].
Rate of rented homes in % (V 27 ) Total number of rented houses/total number of all types of housing units * 100 (%) Mostly recent studies have examined the correlation between COVID-19 outbreaks and poor housing condition [74,75].
Rate of Immigrants in % (V 28 ) Ratio of immigrants in the total population = immigrant population/total population * 100 (%).
Areas with higher rates of immigration appear to have been more affected by COVID-19 [76,77].  Table 1).

Geographical Smoothing of Mortality Rates
Intrinsic variance instability in estimating death due to population variation in spatial units has attracted widespread attention in disease mapping [81,82]. The spatial variance of the COVID-19 mortality rate per spatial unit requires spatial smoothing. To address this issue, we used empirical Bayesian smoothed (EBS) [82] technique in GeoDa software (Center for Geospatial Analysis and Computation, Tempe, AZ, USA) to create smoothed EB rates (per 100,000 people) to reduce random fluctuations due to population size by calculating risk as the total weighted crude rate for each neighbourhood [69,83].

Spatio-Temporal Analysis
Before conducting a local spatial cluster analysis of COVID-19 deaths, Global Moran's I [66] was used to investigate the global spatial distribution pattern of COVID-19 mortality rates in the study area. Kulldorff's scan statistics approach [84] was then used to identify significant temporal, spatial, and spatio-temporal clusters of COVID-19 fatalities. The relative risk (RR) and the log-likelihood ratio (LLR) were computed. Monte Carlo simulations, first introduced by Dwass, [83] were utilized to calculate a p-value. Monte Carlo is one of the constituent methods for interpreting scan statistics results [85]. The  Table 1).

Geographical Smoothing of Mortality Rates
Intrinsic variance instability in estimating death due to population variation in spatial units has attracted widespread attention in disease mapping [81,82]. The spatial variance of the COVID-19 mortality rate per spatial unit requires spatial smoothing. To address this issue, we used empirical Bayesian smoothed (EBS) [82] technique in GeoDa software (Center for Geospatial Analysis and Computation, Tempe, AZ, USA) to create smoothed EB rates (per 100,000 people) to reduce random fluctuations due to population size by calculating risk as the total weighted crude rate for each neighbourhood [69,83].

Spatio-Temporal Analysis
Before conducting a local spatial cluster analysis of COVID-19 deaths, Global Moran's I [66] was used to investigate the global spatial distribution pattern of COVID-19 mortality rates in the study area. Kulldorff's scan statistics approach [84] was then used to identify significant temporal, spatial, and spatio-temporal clusters of COVID-19 fatalities. The relative risk (RR) and the log-likelihood ratio (LLR) were computed. Monte Carlo simulations, first introduced by Dwass, [83] were utilized to calculate a p-value. Monte Carlo is one of the constituent methods for interpreting scan statistics results [85]. The maximum window size for spatiotemporal analysis was set at 50% for the study area and time using SatScan software (M Kulldorff and Information Management Services Inc, Cambridge, MA, USA). A circular shape was chosen to scan and detect all significant spatio-temporal clusters [86]. A Poisson model was used for COVID-19 mortality calculations [85]. For spatio-temporal and purely temporal models, the time aggregation period was set at one month [83]. No geographical overlap was defined as the criteria for reporting secondary clusters. The Supplementary Files S2-S4 explain the purely temporal, purely spatial, and spatiotemporal results of the scan statistics methodology.
The Monte Carlo testing was set to calculate test statistics for each random replication at the p = 0.05 level. Irrespective of the number of Monte Carlo replications chosen, the hypothesis is unbiased, resulting in a correct significance level that is neither conservative nor liberal or an estimate. In the standard Monte Carlo method with 999 random replicas, the lowest p-value the testing can report is 1/(999 + 1) = 0.001, which was set for calculating p-values in all Spatio-temporal analyses. In this study, a Monte Carlo test with 1,000 iterations was applied to evaluate the spatial variability of each surface of parameter estimates produced by the multiscale geographically weighted regression (MGWR) model [87]. All the spatial results were mapped using QGIS, V.3.26, a free and open-source GIS package [88].

Linear and Geographically Multivariate Data Analysis
Pearson's correlation coefficient and R 2 correlation were used to explore the global and linear correlation between COVID-19 mortality rates (per 100,000 population) and exploratory variables. Mirrored scatter plot matrix was used to visualize the bivariate relationships between combinations of variables. Each scatter plot in the matrix visualizes the relationship between a pair of variables [89]. According to Pearson's coefficient and R 2 values, five variables (including V7, V23, V25, V27, and V28) were removed from the list of explanatory variables (Supplementary Files S5 and S6 for details). Mirrored scatter plot matrix shows the global relationships between COVID-19 mortality rates and all selected explanatory variables in Tehran based on Pearson's correlation test (Supplementary File S6). The methodology flowchart of this research is shown in Figure 3.
In the following step, exploratory regression was used to model linear relationships and select the most important variables for the OLS analysis while considering multicollinearity. While exploratory regression is similar to stepwise regression, rather than only looking for models with high adjusted R 2 values, exploratory regression looks for models that meet all of the requirements and assumptions of the OLS model. The variance inflation factor (VIF) indicates multicollinearity, and values between 5 and 10 are recommended, with 10 as the maximum VIF level [86,90]. The max VIF was set at 7.5, as ESRI recommended [91,92], and the model was executed eight times to select the final significant variables. Based on these exploratory regression results, the variables V11 (PM 10 ), V13 (NO 2 ), V16 (O 3 ), V24 (illiteracy rate), and V26 (proportion of elderly) as variables with the highest significance (≥60%), which explain the highest adjusted R 2 (≥0.5), were included in our OLS model (Supplementary File S7). Based on these values, an OLS multivariate regression model was examined to explore the spatial autocorrelation of residuals (see Supplement File S8 for details). The technique assumes a stationary and constant relationship over space [92]. Using Moran's I, we looked for the presence of OLS residuals and spatial autocorrelation within the study area. The value of Moran's I varies between +1 and −1 [64], where a value close to 1 represents strong spatial autocorrelation [93]. We then employed the GWR model (Supplementary File S9 for details), an extension of the basic OLS standard regression, reflecting the variables' distribution's spatial heterogeneity [94]. The model can estimate and generate a set of local parameters, including adjusted R 2 , the corrected Akaike's information criterion (AIC c ), local coefficients, and residuals for each spatial unit to examine the spatial variation of the relationship between response and explanatory variables [95].   In the GWR model, the regression parameters of a particular location carry out local regression estimates based on sub-sample information for adjacent areas. The parameters of the estimated variables are adjusted as the spatial position changes [95]. GWR provides advantages to regression modelling and accounts for spatial variations, considering the spatial scale constant over time and space. However, in many cases, a fixed spatial scale is not valid where phenomena involve numerous spatial processes with various spatial scales [96]. Accordingly, MGWR is used to explore the relation between the response variable and exploratory variables, whichever vary spatially at different scales (see Supplementary File S10 for details). Compared to GWR, the MGWR model has several advantages. Particularly, it can accurately depict spatial heterogeneity, diminish collinearity, and lessen the bias in the parameter estimates [97]. Each predictor variable has its bandwidth in an MGWR model, which allows the scale of non-stationary relation to vary for each response to the predictor variable relation, as described in equation (Equation (1)), assuming that there are n observations, for observation i∈{1, 2, . . . , n} at location (u i , v i ), and y i is the response variable, while bwj in β bwj indicates the bandwidth used for the calibration of the jth conditional relationship [98]. The open-source MGWR application (Python) was used to run the GWR and MGWR models [99]. The Gaussian model was used to run both models and locations (identified by identification numbers), coordinates variables (x and y), four independent variables (Supplementary Files S9 and S10), and the EBS mortality rate as the dependent variable were introduced to the models. To select optimal bandwidths in both models for comparison purposes, the adaptive bi-square spatial kernel method [100] was used, and the Golden Section mode [100] was applied as a weighting scheme for calibrating both models. The AIC c was used as an optimization criterion in the calibration of the GWR and MGWR models. In addition, local VIFs [100] were applied to evaluate probable multicollinearity among explanatory variables at different spatial units. It was also possible to test the statistical significance of each surface of parameter estimates produced by GWR and MGWR via random sampling. In this study, a Monte Carlo test with 1000 iterations [101] was applied to evaluate the spatial variability of each surface of parameter estimates produced by the MGWR model. A pseudo-p-value <0.05 indicates that the observed spatial variability of a coefficient surface is significant at 95% CI (i.e., non-random) [100]. This study used Pseudo-t statistics and explanatory variable standardized coefficients (Beta) to map the spatially varying relationships between COVID-19 mortality rates and the explanatory variables. Bivariate choropleth maps, which show the quantitative relation between two variables in a feature layer [100], were used to represent and compare the MGWR model parameter estimates and original values of the explanatory variables (Fig.10). This mapping method is useful for finding the local patterns and variations of two parameters in a single map [92]. The ArcGIS Pro 3.0.2 package (ESRI, Redlands, CA, USA, 2022) was used to visualize our final model results. Figure 1 shows the spatial distribution of Tehran's COVID-19 EBS mortality rates (per 100,000 population) in Tehran. Among the 350 neighbourhoods investigated, the lowest EBS rate was about 11, and the highest was about 350 (mean = 85 and SD = 50). This map shows that most high-rate neighbourhoods are located in the central and southern parts of the study area.

COVID-19 Mortality Rates
During the 20-month study period, 60,111 individuals reportedly contracted COVID-19 in the study region. In addition, data analysis reveals that COVID-19 caused the deaths of 7043 people (an average of 78 per 100,000 population). The monthly distribution of the mortality rate per 100,000 population by gender is depicted in Figure 4a. This graph depicts the variation in mortality rates between the two sexes over twenty months. In certain months (e.g., February 2019, September and October 2020 and March 2021), the monthly mortality rate for both groups reached as high as 8 per 100,000 inhabitants. Importantly, male mortality rates (4.57 per 100,000 men on average) were higher than female rates (3.27 per 100,000 women on average). Figure 4b displays the COVID-19 mortality rate by gender and age group. Men still account for 58.9% (91 deaths per 100,000 men) of the total COVID-19-related deaths, whereas women account for 41.1% (64 deaths per 100,000 women). However, the mortality rate was 68% higher among those aged ≥65 years, while those under 25 years had the lowest COVID-19 mortality rates in Tehran, with less than one death per 100,000 population.

Retrospective Purely Temporal Analysis
The purely temporal pattern of COVID-19 deaths is shown in Figure 5. Four peaks of COVID-19 deaths over the study period were observed. COVID-19 deaths peaked in September and October 2020 and from March to April 2021. In addition, statistically signifi-

Retrospective Purely Temporal Analysis
The purely temporal pattern of COVID-19 deaths is shown in Figure 5. Four peaks of COVID-19 deaths over the study period were observed. COVID-19 deaths peaked in September and October 2020 and from March to April 2021. In addition, statistically significant clusters (LLR value = 523.4, p < 0.05) were formed during the months of February 2020 to November 2020.

Global and Local Purely Spatial Analysis
The results of Moran's I (I = 0.143 and p < 0.05, z-score = 4) show that the COVID-19 mortality rates (from 2019 to 2021) were spatially clustered in the study area. This result allowed us to enter local-scale spatial analyses more confidently. Based on purely spatial statistical analysis, seven significant spatial clusters (p < 0.05) were identified, which are given in Table 2 and Figure 6. According to these results, with a highest LLR value of 242.88 and a RR value of 1.85, cluster number one has been identified as the most likely in the study area. This cluster formed in the centre of the study area and consisted of 71 neighbourhoods with a population just above 1.5 million (number of cases = 1956 and number of expected cases = 1210). As illustrated in Table 2

Global and Local Purely Spatial Analysis
The results of Moran's I (I = 0.143 and p < 0.05, z-score = 4) show that the COVID-19 mortality rates (from 2019 to 2021) were spatially clustered in the study area. This result allowed us to enter local-scale spatial analyses more confidently. Based on purely spatial statistical analysis, seven significant spatial clusters (p < 0.05) were identified, which are given in Table 2 and Figure 6. According to these results, with a highest LLR value of 242.88 and a RR value of 1.85, cluster number one has been identified as the most likely in the study area. This cluster formed in the centre of the study area and consisted of 71 neighbourhoods with a population just above 1.5 million (number of cases = 1956 and number of expected cases = 1210). As illustrated in Table 2

Retrospective Space-Time Analysis
As Figure 7 shows in more detail, we identified only one significant space-time cluster (LLR = 467.31 and RR = 2.29, p < 0.05). This cluster consists of 75 neighbourhoods with a population of about 3,567,000. In addition, this cluster formed from February to October 2020 (nine months), which corresponds to the temporal clustering pattern ( Figure 5). This cluster extends from the centre to the south and southeast areas of the city.

Retrospective Space-Time Analysis
As Figure 7 shows in more detail, we identified only one significant space-time cluster (LLR = 467.31 and RR = 2.29, p < 0.05). This cluster consists of 75 neighbourhoods with a population of about 3,567,000. In addition, this cluster formed from February to October 2020 (nine months), which corresponds to the temporal clustering pattern ( Figure 5). This cluster extends from the centre to the south and southeast areas of the city.

OLS Model Results
The results of the OLS model show that, with an AICc value of 3,455.82, and R 2 value of 0.57 and an adjusted R 2 value of 0.56, our model accounts for about 57 and 56% of the COVID-19 mortality rates in the study area. Further-

OLS Model Results
The results of the OLS model show that, with an AIC c value of 3,455.82, and R 2 value of 0.57 and an adjusted R 2 value of 0.56, our model accounts for about 57 and 56% of the COVID-19 mortality rates in the study area. Furthermore, the results demonstrate that the probability and robust probability coefficients of our independent variables (selected by exploratory regression) were also statistically significant (p < 0.01) in the OLS model (Supplementary File S7). When Moran's I was applied to ensure that the OLS model residual values really were spatially random, the result showed that the autocorrelation was significant, and residuals were spatially clustered (Moran's I = 0.160. z-score = 6.5, p < 0.05). However, as follows, GWR and MGWR can solve the problems of non-stationarity and problems with heteroscedasticity. In addition, these models enabled us to improve the reliability of the predictions [92].

GWR Model Results
Diagnostic indicators of the GWR model showed that, with the AIC c value of about 661, the R 2 value of 0.67, and the adjusted R 2 value of 0.65, our GWR model accounts for about 67 and 65% of the COVID-19 mortality rates within the study area ( Table 3). The statistics for the GWR model are summarized in Table 4, which demonstrates that the GWR model predicts changes in local coefficients at the local scale but within the same bandwidth. The MGWR model can address this weakness of the GWR model by allowing various bandwidths ( Table 4). The column "Mean" in Table 4 shows the general coefficients of both GWR and MGWR models. For instance, NO 2 had the strongest association with COVID-19 mortality rates in the whole city, regardless of any specific geographical area.

MGWR Model Results
The diagnostic metrics of the MGWR model are provided in Table 5. The results of the MGWR model showed that, with the AIC c values of about 661 and the R 2 value of 0.69, an Adj R 2 value of 0.66 in our model accounts for about 69 and 66% of the COVID-19 mortality rates within the study area (Table 5). In addition, as indicated in Table 4, the MGWR model improved the predictions of the estimate of the explanatory variable's coefficient in the local linear regression model by varying the bandwidth of each explanatory variable [92].

Model Performance (Validation)
The summary of the model's diagnostics shows that the adjusted R 2 value increased from 0.56 (OLS) and 0.64 (GWR) to 0.66 (MGWR). Moreover, selecting the most parsimonious model, AIC c and the residual sum of squares (RSS) with the lowest values are preferred [92]. As Table 6 shows, AIC c decreased from 3455.81 (OLS) and 660.788 (GWR) to 641.051 (MGWR). The MGWR model produced a better AIC c (641.051), which indicates that the MGWR model provides significantly better goodness-of-fit than OLS and GWR models when assessing the relationship between the explanatory variable and the COVID-19 mortality rates. Furthermore, to compare the difference in the goodness of fit of the GWR and MGWR models for different spatial units, the local R 2 values were mapped for the GWR and MGWR models (Figure 8). These maps indicate that the GWR and MGWR models accurately explain the local relationship between COVID-19 mortality rates and dependent variables all over the city (local R 2 > 0.6). Two models were best fitted to the city's western, south-western, and north-western parts ( Figure 8A,B). However, as shown in Figure 8, the highest local R 2 values in the MGWR model (mean of R 2 = 0.68) covered more neighbourhoods in the city. Therefore, this study identified the MGWR model as the best model to visualize and interpret the spatial associations between COVID-19 mortality rates and dependent variables.
the GWR and MGWR models accurately explain the local relationship be-tween COVID-19 mortality rates and dependent variables all over the city (local R 2 > 0.6). Two models were best fitted to the city's western, south-western, and north-western parts ( Figure 8A,B). However, as shown in Figure 8C and 8D, the highest local R 2 values in the MGWR model (mean of R 2 = 0.68) covered more neighbourhoods in the city. Therefore, this study identified the MGWR model as the best model to visualize and interpret the spatial associations between COVID-19 mortality rates and dependent variables.  To examine whether the residuals of the MGWR model as the best-fitted model were autocorrelated, Anselin Moran's I statistic was used. The results showed no significant autocorrelation in the MGWR residuals (I = −0.031, z-score = −1.12, p > 0.05), which is a random pattern confirming their independence. While the difference between the results of the GWR and MGWR models was not remarkable, it is reasonable that we represent the results of the best model. After selecting the best-fitted model, the estimated coefficients for each explanatory variable were mapped to display their effects on COVID-19 mortality rates across Tehran. Figure 9 shows the spatial distribution map of the pseudo t-values of significant positive local estimates based on the MGWR model for five significant covariates. The PM 10 variable (min = 2.6, mean = 2.85, max = 3.1, and p < 0.05) explains the most significant correlation with mortality rates in the western parts of the city (Figure 9-V11). The NO 2 variable (min = 4.8, mean = 5.9, max = 6.5, and p < 0.05) explains the significant correlation with COVID-19 mortality rates in the western half of the city (Figure 9-V13). The relationship between O 3 and COVID-19 mortality rates was significant (min = 3.9, mean = 5.6, max = 6.9, and p < 0.05) in the central neighbourhoods, which stretch toward the southern parts of the city (Figure 9-V16). As the map shows, the most associated illiteracy rate (%) (min = 0.226, mean = 2, max = 3.55, and p < 0.05) was observed in the south-eastern to the central and northern parts of the city (Figure 9-V24). The relationship between the age rate and COVID-19 mortality rate was (min = 6.2, mean = 6.5, max = 6.7, and p < 0.05) positively significant with high values in and around the central parts of the city (Figure 9-V26).    Figure 10 shows the spatial distribution bivariate maps of the local spatially varying Beta coefficients (standardized parameter estimates) and the values of the explanatory variables aggregated at the neighbourhood scale. These bivariate maps show that the correlation between each variable and the dependent variable was not constant across the city. Figure 10-V11 shows the estimated coefficients for the PM 10 (min = 0.103, mean = 0.115, max = 0.13, and p < 0.05). According to this map, the Beta values are high in 43.4% of neighbourhoods located in the north-eastern and eastern parts of the city, which indicates a strong association between PM 10 and COVID-19 mortality rates. Although the PM 10 values were higher in the city's northern parts, the coefficients are low in these areas. Figure 10-V13 shows the coefficients for the NO 2 variable (min = 0.31, mean = 0.37, max = 0.41, and p < 0.05). Beta values of NO 2 were higher than the mean in 88.6% of the neighbourhoods, with most of them in the eastern parts of the city. In the neighbourhoods located in the south-eastern and central neighbourhoods, the higher values of NO 2 and Beta increased in parallel. However, the northern and eastern neighbourhoods of the city represent different spatial patterns. While NO 2 values were higher in these areas, the corresponding Beta values were lower, which indicate weak correlations. Figure 10-V16 shows the coefficients for the O 3 variable (min = 0.236, mean = 0.33, max = 0.42, and p < 0.05). Beta values of O 3 are higher than the mean in 28.2% of the neighborhoods, most of which are in the central parts of the city. Central neighborhoods had the higher values of O 3 and Beta increased in parallel. However, the northern and northwestern neighbourhoods of the city represent different spatial patterns. While O 3 values were higher in these areas, the corresponding Beta values were lower, which indicates weak correlations in these neighbourhoods. Figure 10-V24, shows the coefficients for the illiteracy rate (%) variable (min = 0.013, mean = 0.125, max = 0.24, and p < 0.05). Beta values of the illiteracy rates (%) were higher than the mean in 51.7% of the neighbourhoods, with most of them in the central parts toward the northern parts of the city. In the neighbourhoods located in the south-eastern neighbourhoods, the higher percentages of illiteracy rates and Beta increased in parallel. However, eastern neighbourhoods of the city represent different spatial patterns, too. While illiteracy rate values are higher in these areas, Beta values for the illiteracy rates (%) were lower, which shows the weak correlations in these neighbourhoods. Figure 10-V26 shows the coefficients for the aging rate (%) variable (min = 0.365, mean = 0.374, max = 0.38, and p < 0.05). Beta values of the age rate were higher than the mean in 51.4% of the neighbourhoods in the central part of the city. In the neighbourhoods located in the central part of the city, the higher age groups and Beta increased in parallel. However, the northern and north-eastern neighbourhoods of the city represent different patterns. While ages are higher in these areas, Beta values for this variable were lower, which shows the weak correlations in these neighbourhoods. In fact, age had a strong correlation with COVID-19 mortality rates in the central neighbourhoods.

Discussion
We identified one purely temporal cluster of COVID-19 mortality rates in the research area between February and November of 2020 ( Figure 5). Similar temporal groupings during that era have been observed in previous studies, for example, in Brazil [102]. Iran has previously dealt with a variety of diseases, but COVID-19 stunned the system with its severity, rapid spread, and pathological consequences [103]. COVID-19 vaccination did not begin until late February 2021. As a result, many people died during the following three disease peaks [104]. We discovered seven significant spatial mortality clusters, the most significant of which were Cluster 1 in the city's center and Cluster 3 in the north. In addition, there was a significant central space-time cluster that expanded in the south and southeast (Figure 7). This cluster was formed from February 2020 to October 2020, which is extremely comparable to the COVID-19 purely temporal cluster in the research region ( Figure 5). As a result of this variability and clustering, COVID-19 mortality clusters might have been influenced by the particular characteristics of the various metropolitan districts that were our second aim in this study.
Previous research indicates that environmental problems, such as air pollution concentration in Tehran, can increase COVID-19 mortality rates in big cities [36]. This is in accordance with our findings. However, the MGWR model (the best-fitting model) showed that the COVID-19 mortality rates associated with explanatory variables varied substantially across the study area. Most of the factories (e.g., industrial sand and cement factories) are in the west. Therefore, PM 10 levels are higher in the west. This higher PM 10 levels are also strongly associated with COVID-19 mortality in the west. However, there are some areas in the northern part of the city in which the amount of PM 10 is not very high but is highly associated with COVID-10 mortality. Lowering the PM 10 in these areas might also help reduce the COVID-19 mortality rate. Although the lockdowns reduced the PM 10 levels in Tehran between 20 and 30 percent [105], this short-term drop did not diminish the citywide long-term detrimental consequences of PM 10 . Indeed, the industrial plants surrounding the city increased their production of detergents and hygiene products during the COVID-19 pandemic, which account for the continued city pollution [106]. In addition, the predominant wind direction in Tehran is from the west and south, which may carry PM 10 and other pollutants from their sources (e.g., industrial sources, construction sites, landfills, and desert areas in the south of the city) [107]. In addition, as shown in Figure 1, South Tehran borders the country's central deserts from where the wind carries dust and air pollutants. At the same time, the northern part of the city is surrounded by mountains, which prevent any further discharge of pollution.
In our investigation, NO 2 was most strongly related to COVID-19 mortality rates in the eastern half of the city (Figure 9), where the concentration of this gas also is greater than in the rest of the city ( Figure 10). Previous research has validated the association between NO 2 and COVID-19 mortality rates [108,109]. Over 75% of Tehran's residents use non-standard fuel-powered automobiles, which account for 40% of the city's air pollution, including NO 2 [105,109,110]. Except for the northern portion of the city, bus terminals are another source of NO 2 across the city [106]. The long-term exposure of Tehran's residents to air pollution has resulted in a considerable increase in the mortality rate from respiratory disorders, according to earlier research [105] and we can now see how this has exacerbated the COVID-19 situation. Several research results indicate that during the COVID-19 pandemic, public transit usage did not drop. Instead, the use of private cars grew leading to a rise in pollution levels [63,107]. For example, the Tehran Air Quality Control Company (AQCC) confirmed that from March 2021 to November 2021, Tehran was one of the most polluted cities in Iran [32].
In accordance with earlier studies that revealed the association between ozone and high COVID-19 mortality rates [33,45,108], our study demonstrated that this gas is positively associated with COVID-19 mortality rates, which we particularly noted in the central and southern parts of the city (Figure 9). In these parts of Tehran, the annual temperature is high (Figure 2), and private vehicles are commonly used due to inadequate public transportation and traffic congestion [111], which contributes to the levels of ozone, a secondary pollutant associated with high levels of other climatic factors, such as temperature and nitrogen oxides [50,112]. In addition, another study conducted in Tehran revealed that the O 3 concentration did not decrease during the lockdowns, and this gas is recognized as one of the risk factors that raises the likelihood of COVID-19 mortality [12].
In addition to the air quality indices, our data revealed a varied positive association between illiteracy and COVID-19 mortality rates. Previous research has proven the positive correlation between socioeconomic statuses, such as illiteracy, and COVID-19 mortality [105]. While we found a correlation between COVID-19 mortality and illiteracy, which was exceptionally high in the south-western and the central regions of the city (Figure 2), there was an even higher correlation in the centre and the south-eastern neighbourhoods extending into additional northern locations ( Figure 10). Two studies conducted in the United States found that a lower level of literacy led to less awareness about the risk of COVID-19 [113,114]. As a result, preventive measures are less prevalent in those populations [113]. Other research has demonstrated that COVID-19-related mortality rates are twice as high in disadvantaged areas of major cities in developing countries [114].
Men were more likely to die from COVID-19 than women [115][116][117][118]. This general gender distribution was also seen in Tehran, with about a 60 to 40% difference between males and women, respectively, throughout the twenty-month research period. According to most research, men's COVID-19 mortality rates are affected mainly by smoking and alcohol, which affect the lungs. In addition, recent studies have shown that women have a better immune system against infections than men [119]. In addition, numerous studies demonstrate that oestrogen is a protective factor against numerous viral infections [120,121], and this hormone has been shown to suppress SARS-CoV replication by modulating cell metabolism [113]. Our findings also reveal that approximately 68% of the total COVID-19 deaths occurred in the elderly (Figure 4). Numerous prior studies indicate that older age is one of the risk factors for increased COVID-19 mortality rates [116,[122][123][124]. Our findings also indicate that older age is substantially linked to COVID-19 mortality rates (Figure 9). This was particularly seen in the centre of the city where people are generally of an older age ( Figure 10). Our survey reveals that 8% of the city's population is currently ≥65 years, with the majority residing in the city's northern regions (Figures 2 and 10). Many older adults cannot take care of their health properly and are also subject to the harmful air quality in central Tehran. The elderly also suffer from a higher rate of chronic diseases (diabetes, blood pressure, lung problems, etc.), which are associated with a higher risk of death due to COVID-19 [118].

Policy Implications
The following long-term urban health policy proposals can help to reduce mortality rates associated with pandemics such as COVID-19.
(1) While achieving sustainable urban growth without air pollutants may be difficult, the level of urban pollution must be reduced to prevent additional respiratory illnesses and COVID-19-related deaths. A network of environmentally friendly transportation should be developed. The number of private automobiles that use fossil fuels might be reduced and replaced with electric vehicles.
(2) The old polluting industries that surround Tehran must be upgraded with cuttingedge technology and, if possible, partially replaced by "green industry." The green industry is an emerging concept that aligns industrial development with global sustainable development toward a green economy [125].
(3) It is critical to prevent the spread of respiratory disorders induced by PM 2.5 and PM 10 pollution in southern and western Tehran. Additionally, certain health interventions should be implemented in areas with a high probability of PM 10 concentration. The use of facemasks, for example, is an excellent short-term solution.
(4) The elderly are more likely to suffer from COVID-19 symptoms. Therefore, shortand long-term health strategies for protecting the elderly and providing health care in high-risk areas deserve special consideration. In addition, a better urban infrastructure, particularly in the health and medical sectors, would reduce the deaths caused by pandemics like COVID-19.
(5) It is critical to reduce illiteracy and raise public awareness as it is associated with the number of deaths caused by epidemics like COVID-19. This would be critical for Tehran's southern half, where illiteracy is prevalent.

Limitations and Futures Research Strategy
This study has a few limitations. The COVID-19 mortality data were obtained from the Ministry of Health's Health Information System (HIS). COVID-19 disease was initially difficult to recognize, and its specifics were not documented in hospitals. As a result, some of the first patients may have yet to be fully documented. Furthermore, due to a lack of health care during the peak of the COVID-19 epidemic, many patients may have died outside of hospitals and were not recorded in the Ministry of Health's HIS system. Another limitation is the need for more local household income data. We attempted to avoid this issue by examining other socioeconomic indicators, such as the unemployment rate. Additionally, due to sanctions and economic difficulties, most of Tehran's power plants have used low-quality fuel oil for decades [121]. The use of this oil might release other air pollutants that we have not considered in this study since monitoring stations have never measured other potential air pollutants in Tehran. Instead, we looked at other significant air pollution indicators. Additionally, it is worth mentioning that we had not controlled for the number of days to calculate the monthly mortality rates to create Figure 4.
Furthermore, the deceased's geographic coordinates were not recorded. As a result, aggregated statistics were used at the neighborhood level. Future research could employ GIS techniques to predict COVID-19 mortality rates in Tehran and develop a mortality risk map based on neighborhood characteristics. In addition, more research is needed to determine how COVID-19 mortality rates in metropolitan areas are associated with determinants, such as underlying conditions, employment type, income level, place of work address, and other information about the deceased's health status.

Conclusions
A spatiotemporal pattern of the COVID-19 mortality rate in Tehran was investigated, with neighborhoods having higher mortality rates at different times. COVID-19 mortality rates were found to be significantly related to the Air Quality Index, age, and illiteracy rates, and these associations varied across Tehran neighborhoods. The MGWR model showed the best performance among the different regression models used to identify factors associated with COVID-19 mortality rates. Our approach could be useful for future COVID-19 mortality rate modeling and other infectious diseases. Finally, the findings have implications for COVID-19 mortality rate reduction initiatives in long-term urban health plans that promote healthy and resilient cities.

Institutional Review Board Statement:
We used aggregated data at the neighborhood level. All experiments were performed according to relevant guidelines and regulations. No individual data was used and due to the retrospective nature of the study, no consent is needed from the patients. The study was conducted in accordance with the Mohaghegh Ardabili University and approved by the Research Ethics Committee of Shahid Beheshti University, Tehran, Iran (#IR.SBU. RETECH.REC.1399.076) for studies involving humans.
Informed Consent Statement: Informed consent was not obtained from the patients due to the nature of the study and it was also confirmed by the ethical committee of Shahid Beheshti University in Tehran, Iran.
Data Availability Statement: All the data used in this study are available via Supplementary File S1.