Accurate Estimation of Air Pollution in Outdoor Routes for Citizens and Decision Making

Abstract: There is clear evidence of the effects of air pollution on health. In this paper, we present an innovative application designed to assess Air Quality (AQ) exposure based on the World Health Organization's AQ Guidelines, analysing pollutants and their concentrations independently. Our aim is to provide this information to citizens based on their health profile (medical history or requirements) before and during outdoor trips of their choice, both walking and cycling, empowering them to proactively make informed personal decisions about their route choices and identifying potentially unhealthy travel environments. For this purpose, we have access to official data from AQ monitoring stations that are updated periodically every 10 min. Then, by using spatial interpolation techniques (with Ordinary Kriging), we estimate each pollutant over a grid superimposed on the city map. Once the pollutants have been mapped on each route, they are analysed in order to consider the different alternatives for deciding and planning changes in speed or trajectory. We evaluated the application in the city of Valencia (Spain) as a use case under different scenarios, and showed the results to assess exposure to pollution on the routes of citizens.


Introduction
According to the World Health Organization (WHO), approximately 90% of the global population is exposed to air containing elevated pollutant concentrations. Recent assessments underscore a distressing fatality count of 7 million individuals annually due to the effects of outdoor and indoor air pollution [1]. People die and suffer many illnesses from exposure to poor air quality (AQ), due to pollutants such as particulate matter (PM2.5 and PM10), carbon monoxide (CO), ozone (O3), nitrogen dioxide (NO2), and sulfur dioxide (SO2), to name a few, mainly due to the burning of fossil fuels. Moreover, of significant concern for public health, 96% of the urban population is exposed to pollution levels surpassing the WHO AQ Guideline (AQG) [2], and in many large cities by more than five times.
It is crucial to emphasise that exposure to elevated levels of fine particulate matter and nitrogen dioxide surpassing the WHO's recommendations resulted in an estimated 238,000 and 49,000 premature deaths, respectively, in 2020 [3]. These pollutants have been associated with asthma, heart disease, and stroke. Additionally, chronic exposure to fine particulate matter accounted for 275,000 premature deaths in Europe in 2020, while chronic nitrogen dioxide exposure was responsible for 64,000 deaths, and acute ozone exposure contributed to 28,000 deaths [4].
There is a growing recognition and active involvement of local governments and authorities in addressing the concerns surrounding air pollution. They are implementing control and supervision plans, striving to mitigate the harmful effects. This global public health challenge has sparked an accelerated political interest, reflecting a heightened commitment to tackle the issue [1,5]. The notable increase in the number of cities monitoring air pollution data indicates a growing emphasis on assessing and monitoring air quality.
In Europe, air pollution emissions have declined in the last two decades. Despite this positive trend, air pollution remains the most important environmental health risk in this region. The Directive 2008/50/EC of 21 May 2008 on ambient AQ and cleaner air for Europe is one of these AQ measures. According to this directive, the number of sampling points in each zone or agglomeration should be at least one sampling point per 2 million inhabitants or one sampling point per 50,000 km2, where the latter criterion results in a higher number of sampling points, but not less than one sampling point per zone or agglomeration. For instance, Valencia city (Spain) is an example where these networks are deployed, with a set of AQ official monitoring stations for polluting gases (the network of stations of the Generalitat Valenciana [6]) and the stations deployed in the city of Valencia incorporated into open data, also known as Valencia minute by minute [7]. Figure 1 shows some of these AQ monitoring stations in different cities, such as Burjassot city (SP), Glasgow city (UK), and Valencia city (SP).
In this context, our goal is to inform and assist citizens in their daily trips by providing an accurate distribution of pollution along the entire determined or specified route, showing the pollution levels by taking into account his/her profile (health profile or medical history), and to warn about the risks based on the WHO AQG. With this information, the citizen may plan and decide on an alternative route and proactively make informed personal decisions about his/her route choices. For this, we have developed an application based on the information collected from the different official AQ monitoring stations, applying spatial interpolation to estimate the pollutant concentration over the city map.
The rest of the paper is structured as follows. In Section 2, we show the related work with regard to pollution exposure and outdoor mobility. In Section 3, we analyse the main aspects of Air Quality (AQ) and the official AQ monitoring surveillance network. In Section 4, we explain the mapping process of pollution over a grid on the city map using interpolation techniques in order to integrate the pollution information obtained with the route planner together with the geographic information. In Section 5, we present our results covering different scenarios. Finally, in Section 6, we summarise the main conclusions and future work.

Related Work
Citizens' route choice decisions are often about minimising travel time, although in many situations, the risk of illness causes citizens to apply other criteria, such as reducing exposure to air pollution, particularly when the citizen suffers from respiratory diseases or allergies. In a study by [8], it was found that cyclists could opt for alternative routes with approximately 20% less exposure to pollution (given by nitrogen and carbon oxides) compared to shorter routes in a Danish city. Similarly, in [9] it is demonstrated that pedestrians in California could reduce PM2.5 exposure by around 40% by choosing appropriate routes. Additionally, in [10] a web application for cyclists is shown which can help to find alternative routes with 4% less pollution. Furthermore, numerous previous studies examining exposure at the route level, such as [11], have consistently demonstrated that different routes within cities exhibit different levels of air pollution.
From a different point of view, regarding the trade-off between emissions and exposure to pollution, in [12], the feasibility of utilising emission and exposure metrics as benchmarks for assessing real-world policy is considered. The authors propose, based on the type of vehicle used, air pollution tolls given by time, taking into account the emission costs. However, the specific trade-off between the impact of exposure as a factor in route choice has not been extensively evaluated. In this line, in [13], an application is presented that does not systematically reduce travel distance or travel time when CO2 emissions are considered as a vehicle cost factor.
In [14], the authors present a population-based assessment, carried out in Helsinki (Finland), of multiple environmental exposures for active commuting, allowing urban-scale exposure analyses to inform exposure levels. It highlights that cyclists have more noise exposure and pollution, which exceed healthy thresholds. In [15], the authors show a summary of different studies focusing on main air pollutants and their impact on cyclists. In [16], the Green Paths software (v.1) is detailed: an open-source routing method (based on a shortest path algorithm) and exposure assessment tool for a route planner for Helsinki, which takes into account exposure metrics given by air quality, noise, and greenery. This software uses OpenStreetMap, both with walking and cycling street network maps, with pre-calculated exposure cost attributes assigned to the network edges.
In addition, it is worth mentioning several commercial platforms that offer AQ information along a route, such as AirNow [17], BreezoMeter [18], and PlumeLabs [19]. Their use can be beneficial although they are not open-source. AirNow provides only AQ at the source and destination, and is constrained to North America and limited to specific roads, considering only O3 and PM as pollutants. BreezoMeter shows the AQ index (AQI) only at the monitoring spots and is limited to specific roads. Finally, PlumeLabs analyses the AQ at the user's location. As we can see, the information provided by these platforms is constrained to the source or destination, as well as the locations with monitoring systems, and does not cover the whole route; it is therefore simpler than what our approach reports.
In summary, we have seen that there are different initiatives to improve citizens' mobility and assess air pollution in order to assist them in their daily trips, but from a different point of view and using less accurate air pollution estimation techniques, which are neither focused on official AQ monitoring measurements nor assess the citizen's profile (medical history). We have not found alternatives that can provide a detailed and accurate air pollution estimation along a specified route, taking into account this profile and the WHO AQG. This is our goal and the research gap we aim to fill.

Air Quality Issues and Its Monitoring Network
There is clear evidence of the health effects of air pollution and the growing interest of governments and local authorities in tackling this problem. Next, the arguments and key elements behind this are presented.

Air Pollutants and Recommendations
Air pollution comes from several major sources, including inefficient energy consumption in households, industries, and the agricultural and transport sectors, as well as coal-fired power plants. In addition, in some areas, air pollution is exacerbated by sand and desert dust, waste incineration, and deforestation. Moreover, natural elements such as geographical features, meteorological conditions, and seasonal variations can affect air quality. Consequently, these pollutants have adverse effects on our health.
Among the pollutants of greatest health concern are PM2.5, PM10, CO, O3, NO2, and SO2. These pollutants can lead to health problems resulting from both short-term and long-term exposure. It is important to note that certain pollutants do not have specific thresholds below which no adverse effects occur. Therefore, even low levels of exposure to these pollutants can cause adverse health effects.
The AQG by WHO [2] serves as a comprehensive global reference for defining thresholds and recommended limits for crucial air pollutants that pose health risks. These guidelines are developed using a rigorous and transparent methodology based on evidence-based decision-making. Alongside the guideline values, these guidelines also present interim targets, progressively moving from higher to lower concentrations, with the final target as the ultimate goal. A detailed overview of this guideline and the corresponding levels is shown in Table 1.

Air Quality Index
Based on the previous information and in order to simplify its understanding, there are different definitions of the Air Quality Index (AQI) depending on the authorities and standards that govern them, although in the end they share approximately the same criteria. In general, the AQI is a scale of air pollution calculated from AQ data over a designated time period as an indication of how clean the air is. The European AQI (E-AQI) is calculated for the five main pollutants regulated in European legislation, that is, O3, NO2, SO2, PM2.5, and PM10.
The E-AQI has six different levels, each designated by a number and a colour, from 1 (very good) to 6 (extremely poor): 1 (very good, dark green); 2 (good, green); 3 (medium, light orange); 4 (poor, orange); 5 (very poor, dark orange); and 6 (extremely poor, red). An example is shown in Figure 2b for the 11 AQ official monitoring stations in Valencia city (Spain) [7], with their location shown in Figure 2a. As we can see in Figure 2b, the city center, the main avenues, and the harbour are the most polluted areas, as well as the ring road around the city, while the zones far from these areas are still unpolluted. Notice that this snapshot corresponds to Wednesday, 19 July 2023 at 10:00. A range is given for each pollutant. The maximum allowed per pollutant (in µg/m3) at each index level is depicted in Table 2. For each pollutant, the index is calculated separately according to the concentrations; the higher the concentrations, the higher the index. The overall hourly E-AQI is simply defined as the highest value of the five individual pollutant indexes computed for the same hour, and the overall daily E-AQI is the highest value in a day.
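The rule described above (one index per pollutant, overall index equal to the worst of them) can be sketched as follows. This is a minimal illustration, not the application's code: the upper bounds in `EAQI_UPPER_BOUNDS` are assumed values in µg/m3 used only for the example, and Table 2 remains the authoritative source; the function names are hypothetical.

```python
# Assumed upper bounds (in µg/m3) for E-AQI levels 1..6, for illustration
# only; Table 2 of the paper is the authoritative source of these limits.
EAQI_UPPER_BOUNDS = {
    "PM2.5": [10, 20, 25, 50, 75, 800],
    "PM10": [20, 40, 50, 100, 150, 1200],
    "NO2": [40, 90, 120, 230, 340, 1000],
    "O3": [50, 100, 130, 240, 380, 800],
    "SO2": [100, 200, 350, 500, 750, 1250],
}


def pollutant_index(pollutant, concentration):
    """Return the E-AQI level (1..6) for one pollutant concentration."""
    for level, upper in enumerate(EAQI_UPPER_BOUNDS[pollutant], start=1):
        if concentration <= upper:
            return level
    return 6  # above the last bound: keep the worst level


def overall_index(concentrations):
    """Overall E-AQI: the highest of the individual pollutant indexes."""
    return max(pollutant_index(p, c) for p, c in concentrations.items())
```

For example, `overall_index({"PM2.5": 12, "NO2": 30, "O3": 60})` evaluates each pollutant separately and returns the worst level among them.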

Accessing AQ Monitoring Data
According to Directive 2008/50/EC, in the cities, there are a number of AQ official stations monitoring polluting gases, as shown in Figure 1. This information is usually open and available. In practice, we have different options to access the AQ dataset.
In particular, in the case of Valencia city [6], there are two options, both offered as open data. The first option is to download the data directly using a specific URL for each station [7]. Within this option, we get flat file formats (CSV, JSON, and Excel) or geographic file formats (GeoJSON, Shapefile, and KML). In the second option, we can use a specific API to obtain these data. We chose this option because it allows us to automate the retrieval process using the Opendatasoft API [20]. In other infrastructures, we have similar alternatives since these data are usually publicly available.
Thus, the dataset from Valencia city is made up of 11 records from these 11 different AQ monitoring stations, as shown in Figure 2 and mentioned in Section 3.2, from different zones of Valencia. For each one, we get the following pollutants: NO2, CO, SO2, O3, PM10, and PM2.5. Notice that this information is updated in the system every 10 min according to ISO 11771:2010, ISO 37122:2019, and European Directive 2008/50/EC.

Merging Pollution Data over a Grid on the City Map
The street network within a city can be seen as a bidirectional graph, where nodes correspond to intersections and edges represent streets, paths, or road segments. In addition, since the AQ sampling remains constrained to the specific locations where AQ monitoring nodes are installed, a spatial interpolation technique becomes necessary. This technique provides pollution estimates at every point of the grid over the city map, enabling the analysis of different pathways for routes.
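To make the graph view concrete, the sketch below runs a shortest-path search over a tiny hand-made street graph. It is only an illustration under stated assumptions: the node names and edge lengths in `STREETS` are invented, and Dijkstra's algorithm stands in for whatever routing the street-network library performs in the actual application.

```python
import heapq


def shortest_path(graph, origin, destination):
    """Dijkstra over a dict graph {node: {neighbour: length_in_metres}}."""
    best = {origin: 0.0}
    previous = {}
    queue = [(0.0, origin)]
    while queue:
        dist, node = heapq.heappop(queue)
        if node == destination:
            break
        if dist > best.get(node, float("inf")):
            continue  # stale queue entry
        for neighbour, length in graph[node].items():
            candidate = dist + length
            if candidate < best.get(neighbour, float("inf")):
                best[neighbour] = candidate
                previous[neighbour] = node
                heapq.heappush(queue, (candidate, neighbour))
    if destination not in best:
        return None, float("inf")
    # Rebuild the route by walking back from the destination.
    path = [destination]
    while path[-1] != origin:
        path.append(previous[path[-1]])
    return path[::-1], best[destination]


# Hypothetical bidirectional street graph: intersections A..D, lengths in m.
STREETS = {
    "A": {"B": 100.0, "C": 250.0},
    "B": {"A": 100.0, "C": 120.0, "D": 300.0},
    "C": {"A": 250.0, "B": 120.0, "D": 90.0},
    "D": {"B": 300.0, "C": 90.0},
}
```

Because the edge weight is just a number, the same search could in principle use an exposure-based cost instead of length, which is the direction mentioned as future work.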

Kriging for Spatial Interpolation
Kriging is a spatial statistical technique that allows the analysis of geolocated information and is based on spatial autocorrelation. It entails an interpolation approach rooted in Gaussian processes, guided by prior covariances derived from previous measurements. This technique is designed to estimate a value at a given point within a random field [21] by performing a weighted average of known values from the neighbouring points. A fundamental principle underlying Kriging is that, under appropriate prior assumptions, it offers the optimal linear unbiased prediction, relying on covariance assumptions [22].
In Kriging, the value at a location $x_0$ is a realization $z(x_0)$ of the random variable $Z(x_0)$, estimated from N measurements of the correlated random variables $Z(x_1), Z(x_2), \ldots, Z(x_N)$. By assuming stationarity in the random function based on the homogeneity of samples, the correlation between two random variables depends only on their separation, denoted as distance h (or lag), and is independent of their location. Thus, the covariance function for the response at two different points $x_i$, $x_j$ with $h = |x_i - x_j|$ is given by Equation (1) for some function c:
$$\mathrm{cov}[Z(x_i), Z(x_j)] = c(h) \qquad (1)$$
It allows us to define the variogram and the co-variogram functions as:
$$\gamma(h) = \frac{1}{2|N(h)|} \sum_{(i,j) \in N(h)} \left( z(x_i) - z(x_j) \right)^2, \qquad c(h) = \frac{1}{|N(h)|} \sum_{(i,j) \in N(h)} \left( z(x_i) - m(h) \right) \left( z(x_j) - m(h) \right) \qquad (2)$$
where N(h) denotes the pairs of observations i, j separated by h and m(h) is defined as:
$$m(h) = \frac{1}{2|N(h)|} \sum_{(i,j) \in N(h)} \left( z(x_i) + z(x_j) \right) \qquad (3)$$
Then, the estimated value is calculated by linear estimation from the observed values and weights as follows:
$$\hat{z}(x_0) = \sum_{i=1}^{N} w_i \, z(x_i) \qquad (4)$$
To calculate these weights $w_i$ in Equation (4), we must satisfy two criteria: global unbiasedness and minimal variance of estimation. The first criterion implies that the mean of the estimations must be equal to the mean of the real values. The second criterion determines that the squared deviations of the estimations must be a minimum. Both criteria are applied and solved using Lagrange multipliers.
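The two criteria lead to a small linear system: the variogram values between stations, bordered by the constraint that the weights sum to one and its Lagrange multiplier. The sketch below solves that system for Ordinary Kriging with the standard library only. It is an illustration under assumptions: the exponential variogram and its `sill` and `rng` parameters are hypothetical stand-ins for the fitted model, coordinates are taken as planar metres, and the function names are not from the paper.

```python
import math


def variogram(h, sill=1.0, rng=1000.0):
    """Exponential variogram without nugget (hypothetical parameters)."""
    return sill * (1.0 - math.exp(-h / rng))


def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def ok_weights(points, target):
    """Ordinary Kriging weights for `target` given station coordinates."""
    n = len(points)
    # Variogram between stations, bordered by the unbiasedness constraint
    # (weights sum to 1) and the column of its Lagrange multiplier.
    a = [[variogram(math.dist(p, q)) for q in points] + [1.0] for p in points]
    a.append([1.0] * n + [0.0])
    b = [variogram(math.dist(p, target)) for p in points] + [1.0]
    return solve(a, b)[:n]  # drop the Lagrange multiplier


def ok_estimate(points, values, target):
    """Linear estimator: weighted sum of the observed values."""
    return sum(w * v for w, v in zip(ok_weights(points, target), values))
```

With no nugget, the estimate at a station location reproduces the value measured there, which is a quick sanity check on the system.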
Notice that there are different methods to calculate these weights, giving different types of Kriging techniques. Of these, the most common are Simple Kriging (SK) and Ordinary Kriging (OK). In OK, the global mean is unknown, while in SK, the global mean is known. In our case, for the estimation of air pollutants, we use OK.
Furthermore, from the set of samples, we need to calculate and model the empirical variogram. The objective of variogram modelling is to determine the autocorrelation structure of the underlying stochastic process, given by the nugget effect, sill, and range parameters, as depicted in Figure 3. The nugget effect is the value of γ(h) at h = 0 and represents the measurement error. The sill is the value of γ(h) when h → ∞, representing the variance of the random field, and the range is the distance at which data are no longer autocorrelated [22]. As a first approach, the variogram is estimated based on the measured values taken in different locations, and is later replaced by an adjusted model (such as spherical, exponential, or Gaussian, to name a few). In our case, we have used the Matérn model [23], shown in Equation (5), because it has more versatility. In this model, σ² is the variance parameter representing the sill, which is the limit of the variogram as distance h → ∞; ν is the smoothness parameter, which determines the rate of decay of the variogram; ρ is the range parameter, which represents the distance at which the variogram reaches the sill value; Γ(ν) is the gamma function; and K_ν is the modified Bessel function of order ν. In this model, if ν = 0.5, the model becomes exponential. If ν > 0.5, it allows discontinuities in the variogram at the origin, and if ν → ∞, it becomes a Gaussian model [24].
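For reference, one common parameterisation of the Matérn variogram in terms of the sill σ², the smoothness ν, and the range ρ is sketched below. This is an assumption for illustration: several scalings of ρ are in use, and the exact form adopted from [23] may differ.

```latex
\gamma(h) = \sigma^2 \left[ 1 - \frac{2^{1-\nu}}{\Gamma(\nu)}
  \left( \sqrt{2\nu}\,\frac{h}{\rho} \right)^{\nu}
  K_\nu\!\left( \sqrt{2\nu}\,\frac{h}{\rho} \right) \right]
```

Under this scaling, setting ν = 0.5 and using $K_{1/2}(x) = \sqrt{\pi/(2x)}\,e^{-x}$ gives $\gamma(h) = \sigma^2\left[1 - e^{-h/\rho}\right]$, which is the exponential model.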

Mapping of Pollution over the Grid on the City
Once we estimate with OK the value of each pollutant over the area under test (area of the user mobility), we have to map a value for each pollutant to each point over the grid of the city.
For this, we use a grid of 0.0001 decimal degrees, which corresponds to roughly 11 m. In practice, the grid is traversed with an index defined as a combination of the longitude and latitude coordinates. We use OpenStreetMap (OSM) [25] to obtain all the details and information for the route estimation, accessed through OSMnx [26], a Python package.
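One simple way to build an index from the longitude and latitude is to snap each coordinate to the nearest multiple of the grid resolution and use the resulting pair of integers as a dictionary key. The sketch below assumes that scheme; `cell_key`, `store`, and `lookup` are hypothetical helpers, and the coordinates and concentration in the usage note are made-up values.

```python
CELL_DEG = 0.0001  # grid resolution in decimal degrees (from the text)


def cell_key(lon, lat, cell=CELL_DEG):
    """Snap a coordinate to the nearest grid node and return its index."""
    return (round(lon / cell), round(lat / cell))


def store(grid, lon, lat, pollutant, value):
    """Save the interpolated value of one pollutant at a grid node."""
    grid.setdefault(cell_key(lon, lat), {})[pollutant] = value


def lookup(grid, lon, lat, pollutant):
    """Return the value at the grid node nearest to the coordinate."""
    return grid.get(cell_key(lon, lat), {}).get(pollutant)
```

For instance, after `store(grid, -0.37631, 39.46992, "NO2", 21.0)`, a route point at `(-0.37634, 39.46989)` falls on the same grid node, so `lookup` returns the stored value.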

Impact of the Different Pollutants Based on User Profiles
Up to this point we have seen in a generalised way the interpolation and mapping process of the different pollutants.However, in practice, the application is focused on the user profile.
Figure 4 depicts the flow chart and methodology used in our proposal. Initially, the user asks for a specific route, selecting the mode: walking or cycling. Based on his/her profile (or medical history), we retrieve information about pollutants that should be avoided by the user. The user is provided with relevant and critical information, empowering him/her to proactively make informed personal decisions about route choices and identify potentially unhealthy travel environments. Since the user's profile is confidential, to preserve privacy we anonymise any personal information, but we have access to metadata that indicate, based on his/her medical history, which pollutants are considered critical or relevant. For instance, there are some common profiles, such as people with asthmatic problems, pregnant women, and people with chronic obstructive pulmonary diseases with respiratory infections. In the case of asthma, the pollutants are usually considered in this order: PM2.5, O3, SO2, and NO2 [27,28]. In the case of pregnant women, we include all pollutants in this order: NO2, CO, PM2.5, PM10, and O3 [29-31]. Finally, with chronic obstructive pulmonary diseases with respiratory infections, we mainly focus on PM2.5, PM10, and NO2.
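The profile metadata described above can be pictured as a mapping from an anonymous profile label to an ordered list of pollutants. The sketch below is only an illustration of that idea: the labels (`"asthma"`, `"pregnancy"`, `"copd"`) and the helper name are hypothetical, while the pollutant orders are the ones listed in the text.

```python
# Pollutants to watch per profile, in the priority order given in the text.
# The profile labels themselves are hypothetical identifiers.
PROFILE_POLLUTANTS = {
    "asthma": ["PM2.5", "O3", "SO2", "NO2"],
    "pregnancy": ["NO2", "CO", "PM2.5", "PM10", "O3"],
    "copd": ["PM2.5", "PM10", "NO2"],
}


def critical_readings(profile, readings):
    """Return (pollutant, value) pairs relevant to a profile, by priority."""
    return [
        (pollutant, readings[pollutant])
        for pollutant in PROFILE_POLLUTANTS[profile]
        if pollutant in readings
    ]
```

Only the label and the pollutant list are needed at this step, so no personal data has to be passed along with the route request.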

Results
In this section, we show our results using three different types of tests in order to evaluate the pollution distribution over a selected route. In particular, we will analyse: (a) evolution over 24 h in 2 h steps, (b) impact taking into account different transport modes, and (c) behaviour of the different pollutants. We must highlight that our goal is to check how this system performs in order to estimate pollution in different scenarios on different routes, and not to analyse how the pollution itself behaves. For simplicity, the routes have been selected using the shortest path criterion between an origin and a destination with different modes as indicated, whether walking or cycling. Over these routes, we assess the pollution information that is provided to the user. We evaluate the application in the city of Valencia (Spain), as described in Sections 3.2 and 3.3.
The colouring scale of the maps shown in this section to identify the pollutant concentration is based on the maximum acceptable value indicated by the WHO as follows: the green colour is assigned if the value is below 50% of the maximum acceptable value, yellow if it is between 50 and 75%, orange if it is between 75 and 100%, and finally red if it is above the maximum acceptable value.
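The colouring rule can be written as a small function of the concentration and the maximum acceptable value. This is a minimal sketch: the function name is hypothetical, the limit is passed in by the caller rather than hard-coded, and the handling of values exactly on a boundary (50%, 75%, 100%) is an assumption, since the text does not specify it.

```python
def colour_for(concentration, max_acceptable):
    """Map a concentration to a map colour relative to the WHO limit."""
    ratio = concentration / max_acceptable
    if ratio < 0.5:
        return "green"
    if ratio < 0.75:
        return "yellow"
    if ratio <= 1.0:
        return "orange"
    return "red"  # above the maximum acceptable value
```

For a limit of 25 µg/m3, for example, 10 µg/m3 maps to green and 30 µg/m3 maps to red.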

Evolution over 24 h in 2 h Steps
In this test, we evaluate the behaviour of PM2.5 and NO2 over 24 h in 2 h steps. In this case, we have selected a long walking route and have analysed how these pollutants change over time. This test was carried out during a working day in summer (Wednesday, 19 July 2023). The selected route goes from the south-west to the north-east of the city, from one end of town to the other.
As we can see in Figure 5, at 8:00 NO2 is near the WHO's limit and PM2.5 is above it. This is due to intensive traffic caused by people travelling to their jobs. In particular, we can see this effect on the outskirts and at the harbour, as well as on main avenues. These hours are peak hours. After that, as seen in Figure 6, pollution starts decreasing at noon until 16:00, which shows the lowest value during the day, when people start going home after work. Until this time, pollution is evenly distributed and decreases more in the zones close to the sea, shown on the left side in the figures, due to the breeze. In Figure 7 at 18:00, pollution increases again but in a slower, more homogeneous, and more staggered way, although PM2.5 remains stable. In Figure 8 at 22:00, pollution starts decreasing since road traffic decreases too. In Table 3, we summarise the average concentration of NO2 and PM2.5 during this working day. Notice that this pollutant distribution also depends on the orography and weather conditions, but the representations are accurate since they are based on valid and official measurements taken from the AQ surveillance monitoring stations shown in Sections 3.2 and 3.3. Since PM2.5 is above the WHO's limit at 8:00 (Figure 5), a user with respiratory or cardiovascular problems related to the lungs or high blood pressure [5] should decide to change the route (avoiding in particular the city center) or delay it by at least two hours, because by that time these levels would have dropped, as depicted in Table 3.

Impact Taking into Account Different Transport Modes
In this test, we analyse how the shortest route changes depending on the transport mode selected (by car, by bicycle, or on foot) and how pollution affects that route. With this, we can see how different modes of transport can influence the pollution exposure and the route choice. This analysis was carried out on Sunday, 16 July 2023 at noon. For this scenario, we use SO2 concentration as an indicator of pollution.
As we can see in Figure 9, the pollution is lower compared with a working day. We can highlight that, depending on the route modality, there are differences in the pollution exposure. In particular, in the cycling mode, since we are constrained to specific types of routes, pollution fluctuates more compared with walking and driving. Between walking and driving, walking is more uniform; driving shows more fluctuations due to the number of junctions and the higher road type diversity. In this case, the time slot used for the analysis is safe in terms of air pollution. Thus, the user can choose any kind of transport mode. It is worth mentioning, as shown in Figure 9c,d, that cycling is a better alternative than walking.

Behaviour of the Different Pollutants
Finally, in this test, we carry out an independent analysis for each pollutant, compared at the same time over the same route. These results are shown in Figure 10. We can see the differences in the pollution levels from different pollutants and how these may affect the route selection.
From Figure 10, we see that PM indicators are the worst ones, with the highest levels compared with the other pollutants. This scenario would be a problem in case of respiratory and cardiovascular disorders [5], and the user should plan and decide on an alternative, either delaying his/her trip or changing the route, avoiding the city center. From the correlation between the pollutants detailed in Table 4, we can see that NO2 has a correlation of 0.75 with PM10 and 0.85 with PM2.5. The main reason behind this is fossil fuel combustion, as explained in [32]. On the other hand, we see the inverse relationship between O3 and the other pollutants, with the exception of SO2. This is because O3 is a secondary pollutant which is formed in the atmosphere from the reaction of sunlight with the other pollutants (primary pollutants). Thus, when the primary pollutants grow, O3 grows later by reducing the primary ones.
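A correlation such as the one reported in Table 4 can be computed from two series of concentrations sampled at the same points. The sketch below assumes the Pearson coefficient, which the text does not state explicitly, and uses a hypothetical helper name; it does not reproduce the paper's data.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length series."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)
```

A value close to 1 means the two pollutants rise and fall together, while a negative value corresponds to the inverse relationship described for O3.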

Comparison with Related Work
In Table 5, we present a comprehensive comparison between our proposal and the current state-of-the-art with a similar approach. Specifically, we conducted a detailed analysis of various alternative approaches mainly discussed in Section 2. For each alternative, we examined its advantages and disadvantages (pros and cons), accompanied by relevant comments and insights. Finally, as we can see compared to the related work, our proposal provides detailed and accurate air pollution estimation along a specified or given route, taking into account the user's profile (or medical history) and the WHO AQ Guideline.

Conclusions and Future Work
Air pollution stands as a significant issue contributing to a diverse array of illnesses, with severe impact on some types of cancer and cardio-respiratory diseases.
In this paper, an integrated framework is proposed to assist citizens before and during outdoor trips of their choice, by analysing the exposure to air pollutants in order to minimise the risk of diseases, based on the citizen's profile (medical history), in exchange for a marginal increase in travel distance. Our proposal is based on the World Health Organization's Air Quality (AQ) Guidelines, analysing pollutants and their concentrations independently with the aim to empower citizens to proactively make informed personal decisions about their route choices (by changing speed or trajectory), and identify potentially unhealthy travel environments by showing the pollution levels along the whole journey.
The pollution measurement for each pollutant is calculated based on the whole trajectory of the outdoor route using spatial interpolation of these pollutants (with Ordinary Kriging) and a mapping process of these pollutants over a grid superimposed over the city map. This application is based on open-source tools and can integrate any pollutant available at the AQ monitoring stations.
We have evaluated the application in the city of Valencia as a use case under different scenarios and we have shown the results to assess the exposure to pollution on citizens' routes. Also, in case of respiratory problems, we have included a decision analysis taking into account the information provided by the application. It is seen that, in practice, it recommends against crowded traffic junctions to reduce the exposure of the pedestrian or rider. As could be expected, pollution is highly correlated with road traffic. In addition, we have seen the evolution of the different pollutants in a day and their interactions, showing that when the primary pollutants grow (NO2 and SO2), O3 grows later by reducing the primary ones through sunlight reactions.
As future work, we are working on two different lines. We are developing low-cost AQ monitoring nodes that will allow us to use the application in environments with few official stations. In addition, we are using these pollution indicators as a new metric (as an alternative to distance) to search for the least polluted route over the city map.

Figure 3. Example of a generic theoretical variogram, showing the nugget effect, sill, and range.

Figure 4. Flowchart of the citizen application to check air pollution on his/her outdoor routes.

Figure 9. Pollution impact on different modes of transport (16 July 2023, Sunday at noon).

Table 2. European Air Quality Index (E-AQI) level and maximum allowed per pollutant (in µg/m3) per index level.

Table 3. Evolution of the average concentration of NO2 and PM2.5 (in µg/m3) during a working day, Wednesday, 19 July 2023.

Table 4. Correlation between AQ pollutants and their concentrations.

Table 5. Comparison with the state-of-the-art: pros and cons.