Using Task Farming to Optimise a Street-Scale Resolution Air Quality Model of the West Midlands (UK)

Abstract: High resolution air quality models combining emissions, chemical processes, dispersion and dynamical treatments are necessary to develop effective policies for clean air in urban environments, but can have high computational demand. We demonstrate the application of task farming to reduce runtime for ADMS-Urban, a quasi-Gaussian plume air dispersion model. The model represents the full range of source types (point, road and grid sources) occurring in an urban area at high resolution. Here, we implement and evaluate the option to automatically split up a large model domain into smaller sub-regions, each of which can then be executed concurrently on multiple cores of an HPC or across a PC network, a technique known as task farming. The approach has been tested for a large model domain covering the West Midlands, UK (902 km²), as part of modelling work in the WM-Air (West Midlands Air Quality Improvement Programme) project. Overall, the model performs well when compared with measurement data. Air quality maps for annual/subset averages and percentiles are generated. For this air quality modelling application of task farming, the optimisation process has reduced weeks of model execution time to approximately 35 h for a single model configuration of annual calculations.


Introduction
Air pollution has become the biggest environmental risk for public health both globally and locally [1][2][3][4]. Air pollution can cause adverse health effects, e.g., diseases associated with respiratory, circulatory, nervous, digestive and urinary systems [5]. In 2016, the World Health Organisation (WHO) estimated that about 4.2 million premature deaths per year were attributable to ambient air pollution [6,7] and that about 91% of the world's population lived in areas with air pollution levels higher than WHO guidelines [8]. The mortality burden associated with ambient air pollution is about 28,000-36,000 deaths per year in the UK [9]. The availability of air quality information is of vital importance to improve the understanding of the associated health effects [10,11], and to develop effective and equitable air pollution control policies.
Air quality measurements can provide direct information about the levels of air pollutants in the atmosphere. The UK Automatic Urban and Rural Network (AURN) [12] is the largest automatic air quality monitoring network across the UK. The quality-assured stationary sites in AURN can normally provide continuous measurements of air pollution concentrations at high temporal resolution (e.g., hourly air quality data), but with coarse spatial resolution due to the limited number of sites [12], and at significant capital and operational cost. Owing to the advanced development of Internet of Things, low-cost sensors [13] are also increasingly used for air quality measurements, as indicative measures. These techniques can enable the dense network of air quality monitoring required for building smart cities. Other monitoring approaches, such as mobile measurements using bicycles [14,15] and vehicles [16], generate air quality information at both high temporal and spatial resolutions within relatively small domains, while satellite measurements can provide a globally consistent air quality monitoring service at a coarse spatial resolution [17]. However, these measurement approaches are unable to provide the high-resolution spatial and temporal air pollutant concentration data required for some detailed population exposure calculations, or to evaluate potential policy options.
To complement the information obtainable from air quality monitoring services, the use of air quality modelling has rapidly increased over recent decades. These tools play a key role in environmental science because of their capability to quantify the deterministic relationships between emission sources, dispersion, mixing, concentrations, advection and deposition over different distance and time scales [18]. Their use has been promoted by the 2008 European Directive on Ambient Air Quality and Cleaner Air for Europe that explicitly encourages the adoption of modelling for air quality management such as forecasting and emission reduction plans [19]. Air quality models use mathematical equations to simulate physical and chemical processes affecting air pollution in the atmosphere using different approaches depending on the degree of meteorological and chemical detail required for a given application [20,21]. Dispersion, transport and chemical processes are modelled from local to regional scales using different types of models. Local-scale models can represent explicit source properties, such as geometry and efflux conditions, incorporating a simplified chemical scheme and using representative meteorological and emission data [22,23]. Regional-scale models use diffusion equations (e.g., Eulerian models) [24,25] or instantaneous flow approaches (e.g., Lagrangian models) [26,27] to simulate full chemistry and physical mechanisms acting in the atmosphere, accounting for the interaction of the emissions, homogeneously mixed on each grid, with meteorology. Other models adopt a simpler and less data-demanding approach to estimate air pollutant concentrations using a statistical or empirical approach [28][29][30][31]. The simplification of these models is achieved by ignoring the time-varying processes affecting air pollutant concentrations connected with variations in emissions, processing and meteorological conditions. 
Models that represent physical and chemical processes are the most suitable for air quality assessment planning for a number of reasons including: the capability to simulate at different spatial scales (from hemispheric simulations to regional and local scale) [32] and temporal scales (from short time period or event analysis to annual and inter-annual simulations) and the possibility to conduct several types of analysis, from the dispersion processes of inert and/or trace pollutants at a particular ground-level receptor influenced by an emission source, to simulations of the full chemistry acting in the atmosphere on pollutants from all available emission sources, and their interactions with meteorology. This also allows the models to be used to assess the likely changes in air pollutant concentrations resulting from differing scenarios of emissions reductions and/or forecasts of future climatic conditions [33].
Different mesoscale meteorological and chemistry-transport models (CTMs) (e.g., WRF [34], CMAQ [35] or WRF-Chem [36]) are commonly used worldwide by governments and researchers to study air pollution exposure [37,38], plan emissions reductions [39] and create scenarios [40] to reduce air pollution in urban areas. These systems require extensive computational time and resources in comparison to statistical and empirical models. This is due to the necessity to account for atmospheric dynamics and complex chemical dispersion and deposition processes of potentially thousands of primary and secondary pollutants over urban areas at different spatial resolution (from a few hundreds of meters to several km) [28]. Parallel HPC (High Performance Computer) clusters are generally used to supply the computational resources for this type of dispersion modelling. HPC clusters are able to run calculations from simulations of three-dimensional domains with different cell dimensions and sizes in parallel with a reasonable computational time [41].
In contrast, for most applications, local-scale dispersion models execute sequentially in terms of spatial and temporal calculations, leading to extended runtimes for the simulation of large urban areas.
Computer models representing physical or chemical processes such as pollutant dispersion are commonly initially written in a sequential form, as this is simple to develop and it is easily portable across different types of computational architecture. However, the computational burden associated with modelling complex atmospheric processes often requires runtime optimisation, such as code parallelisation whereby calculations are distributed over multiple cores on HPC clusters [42]. Sequential code can be converted into parallel code using parallelisation algorithms such as OpenMP [43], Parallel Virtual Machine (PVM) [44] and Message Passing Interface (MPI) [45], but a simpler approach, not requiring changes to code architecture, is to use task farming. Task farming involves running the same, possibly sequential, code on multiple processors using differing model configuration parameters and data inputs [46]. The application of task farming to modelling of physical processes with differing configurations relating to different spatial areas is sometimes known as spatial parallelisation. For some applications, task farming may be wasteful in terms of computational resources compared to full code parallelisation; for instance, the same code will be executed separately on each core and, for spatial parallelisation, there may be a need for additional calculations at the edge of each computational sub-domain. However, spatial parallelisation is relatively simple to implement from a code development perspective and can lead to runtime optimisations that broadly scale with the number of processors available. A task farming approach has previously been applied to the AERMOD Gaussian plume dispersion model with preliminary testing [47], alongside a qualitative assessment of the possibility of code parallelisation, which concluded that it would require significant development effort.
ADMS-Urban [48,49] is a quasi-Gaussian plume air dispersion model that represents the structure of the atmospheric boundary layer using two governing parameters: the boundary layer depth and the Monin-Obukhov length. It uses a physics-based approach, so it requires a range of data inputs (meteorological, emissions and long-range pollutant transport data). ADMS-Urban explicitly represents the full range of source types occurring in an urban area at high resolution (industry, transport and diffuse sources); the model is able to account for the influence of complex urban morphology (building density, street canyons [50,51]) on dispersion and generates street-scale resolution maps that highlight both pollution hotspots and areas of better air quality. The model has been used to quantify urban air pollution levels in many cities worldwide [49,52], but model run times can be extensive when run sequentially, with city-scale calculations taking weeks to execute on standard Windows PCs. Multiple model runs can be required in order to assess different policy or emissions scenarios, and to perform sensitivity analyses, so improving model run times is a key requirement for enabling the analysis of a broad range of scenarios. This paper presents the results of a novel approach to running ADMS-Urban, where task farming has been used to spatially parallelise the model configuration, and each run component has been executed on the HPC cluster (BlueBEAR) at the University of Birmingham. The approach has been tested for a large model domain covering the West Midlands (WM), UK (902 km²), as part of modelling work in the WM-Air (the West Midlands Air Quality Improvement Programme) project [53]. WM-Air is a five-year impact-focussed programme to support the improvement of air quality and associated health, environmental and economic benefits in the West Midlands.
Section 2 describes the methodology of task farming in the ADMS-Urban model and presents the modelling configuration for the WM case study. Section 3 reports the model evaluation from the receptor run and several types of air quality maps from the contour run. Section 4 discusses the results and Section 5 gives a summary.

ADMS-Urban Model
ADMS-Urban can model pollution sources with explicit point, line, area or volume geometry, using quasi-Gaussian plume dispersion expressions, with skewed vertical profiles used in convective conditions [54]. Road sources are modelled as a special case of line sources, where traffic-induced turbulence effects are included based on user-defined emission rates and/or traffic flow and speed data [55]. Point sources are modelled as elevated sources for large industry sources and stack parameters (e.g., stack height and diameter, efflux temperature and exit velocity) are needed. A regular grid of volume sources with uniform source depth is also used to represent total emissions of both the explicit sources and other sources where less detailed source characteristics are available, such as domestic heating or minor industrial processes.
Dispersion calculations for each explicit source and a single volume source (forming one cell of the uniform regular grid) are initially carried out along an "internal grid" of calculation locations following the downwind plume centreline, with along-wind interpolation, lateral and vertical profile factors used to obtain concentrations at the required output locations. The internal calculation grid resolution is finest at the source location and increases in geometric sequence with increasing distance from the source, as plume properties are expected to vary more slowly further from the source. The single volume source grid cell dispersion patterns are spatially translated to the location of each cell, scaled by the individual cell emissions and applied to the final output locations.
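The geometric spacing of the internal calculation grid can be illustrated with a short sketch. The first spacing (1 m), the growth ratio (1.5) and the maximum distance (1 km) are illustrative assumptions only; they are not values taken from ADMS-Urban.

```python
def geometric_grid(first_step_m, ratio, max_distance_m):
    """Return downwind distances (m) whose spacing grows by a constant ratio."""
    if first_step_m <= 0 or ratio <= 1:
        raise ValueError("need first_step_m > 0 and ratio > 1")
    distances = [0.0]          # the grid starts at the source location
    step = first_step_m
    while distances[-1] + step <= max_distance_m:
        distances.append(distances[-1] + step)
        step *= ratio          # coarser spacing further from the source
    return distances


# Hypothetical settings, chosen only to show the pattern.
grid = geometric_grid(first_step_m=1.0, ratio=1.5, max_distance_m=1000.0)
```

With these assumed settings, most calculation points lie close to the source, where plume properties change fastest, and only a few points are needed to cover the far field.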
For pollutants which are considered inert on local scales, concentrations from all included sources are summed at each output location to form the total output concentration. For pollutants where local chemistry processes are significant, such as NOx and NO2, a concentration-weighted average of dispersion time is also calculated at each output location and used in an implementation of the Generic Reaction Set (GRS) chemistry scheme [56,57].

Run-Time Optimisation Using Task Farming
ADMS-Urban is a serial program designed to run on a single processor. However, the latest version of the model includes the ability to split up a large modelling region into smaller sub-regions, each of which can then be executed concurrently on multiple cores of an HPC or across a PC network, a technique known as task farming.
Only those output points that fall within a given sub-region are included in that sub-region run. Conversely, since the concentration at any output point can be affected by any upwind source within the modelling region, it is important that all source emissions are included in each sub-region run. A study of agricultural non-point source dispersion modelling, for sources with a maximum horizontal dimension of 60-80 m, showed that the exact geometry of neutrally buoyant non-point source types made little difference to predicted downwind concentrations beyond approximately 100 m [58]. Large efficiency gains can therefore be achieved, for appropriate source types, by only explicitly modelling those sources that fall within the sub-region (plus an additional "buffer" zone), while the emissions from more distant sources can be modelled via the (computationally much cheaper) grid source. This also provides justification for regional-scale chemical transport models typically only requiring gridded input emissions.
In ADMS-Urban, road sources are modelled as neutrally buoyant sources (with an initial mixing depth to account for vertical spread in the wake of vehicles) and can therefore be spatially truncated to sub-regions in this way. Figure 1 shows an example of which road sources are explicitly modelled for a particular sub-region run. Run times can be optimised by ensuring that a similar number of explicitly modelled sources and associated output points are included within each sub-region; hence, smaller sub-regions are used in areas with a higher density of explicit road sources. Conversely, point sources with high pollutant emission rates are always modelled explicitly due to their non-negligible buoyancy and elevated source height, which affect their dispersion over a long distance. Point sources are generally selected for explicit modelling if they have annual average emissions greater than 1 g/s of a pollutant of interest, or are subject to specific national regulation ("Part A" sources). The number of point sources of this type is often much smaller than the number of modelled road sources and so the run-time cost of including all point sources in each sub-region run, with extents normally of the order of 1 km, is comparatively small.
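The selection of explicitly modelled road sources for one sub-region can be sketched as follows. This is a minimal illustration, not the ADMS-Urban implementation: each road is reduced to a single representative point (real road sources are polylines), and the function name, the example coordinates and the road identifiers are hypothetical. The 750 m buffer is the value used in the case study.

```python
def split_sources(roads, subregion, buffer_m):
    """Partition road sources for one sub-region run.

    roads     : list of (road_id, x, y), one representative point per road
                (a simplifying assumption)
    subregion : (xmin, ymin, xmax, ymax) in metres
    Returns (explicit_ids, gridded_ids): roads inside the sub-region plus
    buffer are modelled explicitly, the rest only via the grid source.
    """
    xmin, ymin, xmax, ymax = subregion
    explicit, gridded = [], []
    for road_id, x, y in roads:
        inside = (xmin - buffer_m <= x <= xmax + buffer_m
                  and ymin - buffer_m <= y <= ymax + buffer_m)
        (explicit if inside else gridded).append(road_id)
    return explicit, gridded


# Hypothetical roads around a 2 km x 2 km sub-region.
roads = [("A", 1000.0, 1000.0), ("B", 2500.0, 1000.0), ("C", 3000.0, 1000.0)]
explicit, gridded = split_sources(roads, (0.0, 0.0, 2000.0, 2000.0), buffer_m=750.0)
```

Road "B" lies outside the sub-region but inside the buffer, so it is still modelled explicitly; road "C" is represented only through the gridded emissions.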

Case Study
The West Midlands Combined Authority (WMCA) in the UK covers seven constituent local authorities (Birmingham, Coventry, Dudley, Sandwell, Solihull, Walsall and Wolverhampton). Geographically, the West Midlands (WM) is an area of around 902 km² roughly centred on Birmingham. Air quality modelling is an important tool for the investigation of air quality within the WM region and for the assessment of the impact of specific intervention scenarios on air quality within the region.

Emissions
Emission sources in the model included explicit point sources, explicit road sources and 1 km × 1 km horizontal resolution grid sources for the baseline year of 2016 (shown in Figure 2). The EMIT Atmospheric Emissions Inventory Toolkit (developed by CERC) was used to pre-process the emission data before import into the ADMS-Urban model. Table 1 shows an overview of total emissions for different source types over the WM computational domain.

Point Sources
Point source emission rates were taken from the UK National Atmospheric Emissions Inventory (NAEI) [59], which collected detailed emission data from large individual sources. Other smaller emission sources in the industrial and commercial sector were included as grid sources (Section Grid Sources). Large industrial point sources were considered explicitly as elevated point sources in the dispersion model. The emission inventory for these point sources combined the NAEI 2016 data (for emission rates) and Birmingham City Council (BCC) Airviro [60] model data (for stack parameters, e.g., stack height and diameter, efflux temperature and exit velocity). Representative typical stack characteristics by sector were used for the point sources where the stack characteristics were not known. The emission rates from the point sources were given for a wide range of pollutants; those of interest are NOx as NO2, PM10, PM2.5, Non-Methane Volatile Organic Compounds (NMVOC) and SO2. The locations of the point sources were given in the British National Grid Coordinate System (OSGB) and have been converted to the modelling coordinate system, a Lambert Conformal Conic (LCC) projection. This modelling coordinate system is consistent with that of the regional Community Multiscale Air Quality (CMAQ) modelling system, in preparation for the development of a coupled CMAQ and ADMS-Urban system under WM-Air.

Road Sources
Road sources in the current baseline model combined the traffic maps from the Transport for West Midlands (TfWM) PRISM model [61] and BCC's SATURN model [62]. The SATURN model has more road links within the forthcoming Clean Air Zone of Birmingham [63]. The traffic map covers major roads, e.g., motorways and "A" roads. Minor roads not represented by the current traffic map are modelled as grid sources. The traffic data for AM peak, PM peak and inter-peak time periods have been combined and converted into Annual Average Daily Traffic (AADT). The traffic flows were categorised into heavy and light vehicles. These traffic model output data were evaluated against TfWM's traffic count data. The modelled light vehicle flows agreed well with the traffic counts, while the modelled heavy vehicle flows were consistently underestimated, so an adjustment was made. Bus timetable data from Remix [64] were also processed and included in the model input. Representative fleet composition data (Euro classification for each sub-type of heavy and light vehicles) were taken from ANPR data in a recent Birmingham Clean Air Zone (CAZ) document [62] and have been incorporated into the EMIT calculations. The UK NAEI 2014 road traffic emission factors, with real-world adjustments following the approach described in Hood et al. 2018 [49], were used for the calculation of emission rates.

Grid Sources
Grid sources for 2016 were defined at 1 km × 1 km resolution with a typical depth of 10 m. The base gridded emissions were downloaded from the NAEI website [59] in the OSGB coordinate system, and have been converted to the LCC modelling coordinates. The pollutants of interest are NOx as NO2, NMVOC, PM10, PM2.5 and SO2. The SNAP07 (road transport) gridded emissions have been reduced by subtracting the emission contribution from the explicit major road sources. To do this, EMIT aggregates the explicit major road emissions onto the same 1 km × 1 km grid; the residual emission for the SNAP07 sector can then be derived and modelled as SNAP07_minor road.
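The derivation of the residual minor-road emissions can be sketched as below. This is an illustration of the subtraction described above rather than EMIT's actual procedure: each explicit road is assigned to a single cell by a representative point, and negative residuals are clipped to zero. Both choices, along with the function name and the example numbers, are assumptions.

```python
from collections import defaultdict


def residual_grid(snap07_grid, road_emissions, cell_m=1000.0):
    """Subtract aggregated explicit road emissions from gridded SNAP07 totals.

    snap07_grid    : {(i, j): total road transport emission in that cell}
    road_emissions : list of (x, y, emission) for explicit roads, with one
                     representative point per road (assumption)
    """
    explicit = defaultdict(float)
    for x, y, emission in road_emissions:
        cell = (int(x // cell_m), int(y // cell_m))
        explicit[cell] += emission
    # Clipping at zero is an assumption made here to avoid negative emissions.
    return {cell: max(total - explicit.get(cell, 0.0), 0.0)
            for cell, total in snap07_grid.items()}


# Hypothetical two-cell example (arbitrary emission units).
snap07 = {(0, 0): 10.0, (1, 0): 5.0}
explicit_roads = [(500.0, 500.0, 4.0), (900.0, 100.0, 2.0), (1500.0, 500.0, 7.0)]
minor_road = residual_grid(snap07, explicit_roads)
```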

Time Varying Factors
Time-varying factors from the EMEP model [65,66] were available for each hour of the day by SNAP sector and pollutant. An emissions inventory covering the area of interest was available, with total emission for each sector. These emission rates were used to calculate a combined set of weighted average monthly emission factors for each pollutant, which were applied to the total gridded emission rates. Separate time-varying factors were applied to particulate and gaseous gridded emissions, reflecting different balances between sectors and source types for these pollutants. In addition to the gridded emission rates, time-varying factors have also been applied to explicit road sources. The monthly factors used for explicit road source emissions were taken from Copernicus Atmosphere Monitoring Service (CAMS) regional emissions v3.1 [67]. Diurnal profiles for road traffic have been calculated using 24-h flow and speed data from automatic traffic count sites (data downloaded from TfWM), typically available for 1 week per site. The roads of interest were isolated, and the light and heavy vehicle hourly flows and speeds were processed through an Emissions Factor Toolkit (EFT, version 9.0) [68] spreadsheet to calculate hourly emission rates of the pollutant of interest. The emission rates were then normalised by the average emission rate on the road, to give a time-varying profile for the road. The roads were classified into medium or high flow and average time-varying profiles were calculated for each type. The diurnal profile for medium roads was also applied to the grid source, representing both the significant contribution of minor roads to the residual gridded emissions and the representation of emissions from roads outside the current sub-region and buffer zones in the gridded emissions.
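The normalisation of hourly emission rates into a diurnal profile, and the averaging of profiles across roads of the same class, can be sketched as follows. The function names and the example emission rates are hypothetical; only the normalisation by the mean rate follows the description above.

```python
def normalise_profile(hourly_rates):
    """Divide each hourly emission rate by the mean rate, so the profile averages 1."""
    mean_rate = sum(hourly_rates) / len(hourly_rates)
    if mean_rate <= 0:
        raise ValueError("mean emission rate must be positive")
    return [rate / mean_rate for rate in hourly_rates]


def average_profiles(profiles):
    """Hour-by-hour mean of several normalised profiles (one per road)."""
    n = len(profiles)
    return [sum(hour_values) / n for hour_values in zip(*profiles)]


# Hypothetical 24-hour emission rates (g/km/s) for two roads in one flow class.
road_1 = [1.0] * 6 + [3.0] * 12 + [1.0] * 6   # daytime peak
road_2 = [2.0] * 24                           # flat
profile_1 = normalise_profile(road_1)
profile_2 = normalise_profile(road_2)
class_profile = average_profiles([profile_1, profile_2])
```

Because every profile averages to one, applying it to an annual average emission rate redistributes emissions through the day without changing the total.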

Background Data
Background concentration files were created using historic observation data from a variety of rural background sites surrounding the West Midlands modelling area, available from the Department for Environment, Food and Rural Affairs (Defra) UK-Air website [12]. Data were limited in the West Midlands area, so a suitable background file was created using different sites for different pollutants; for PM10 and PM2.5, Chilbolton was used (with large periods of missing data filled using data from Sheffield Devonshire Green). The direction of each monitoring site from the centre of the modelling region, and the wind direction sectors appropriate for each site, were calculated. The monitored wind direction for each hour was used to identify upwind monitoring data for that hour. Chilbolton was used for particulate background concentrations because appropriate background monitoring sites for PM were scarce around the West Midlands area. The monitored Chilbolton concentration was multiplied by the ratio of the annual average concentration at a rural area bordering the West Midlands to that at Chilbolton, based on Defra's background concentration maps [69].
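The two steps described above, choosing the upwind site from the hourly wind direction and scaling a remote site's concentration by a ratio of annual means, can be sketched as follows. The site names, sector limits and concentrations are hypothetical placeholders; the actual sites and sectors used in the study are not reproduced here.

```python
def in_sector(wind_dir_deg, start_deg, end_deg):
    """True if the wind direction lies in the sector running clockwise
    from start_deg to end_deg (handles sectors that wrap through north)."""
    offset = (wind_dir_deg - start_deg) % 360.0
    width = (end_deg - start_deg) % 360.0
    return offset <= width


def upwind_site(wind_dir_deg, sectors):
    """Return the first site whose sector contains the wind direction, else None."""
    for site, (start_deg, end_deg) in sectors.items():
        if in_sector(wind_dir_deg, start_deg, end_deg):
            return site
    return None


def scale_background(conc, annual_mean_target, annual_mean_site):
    """Scale a monitored concentration by the ratio of annual means
    (target rural area / monitoring site)."""
    return conc * annual_mean_target / annual_mean_site


# Hypothetical sectors: wind FROM 315-45 degrees uses a northern site, etc.
sectors = {"north_site": (315.0, 45.0), "south_site": (135.0, 225.0)}
site_for_hour = upwind_site(10.0, sectors)
scaled = scale_background(10.0, annual_mean_target=12.0, annual_mean_site=8.0)
```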

Meteorological Data
For the West Midlands, an appropriate synoptic meteorological measurement site is located at Birmingham Elmdon, within Birmingham Airport, with data obtained from Met Office MIDAS in the CEDA Archive [70]. "UK Hourly weather data", "UK Mean Wind" and "UK Hourly rainfall data" have been combined to create the met data format required by the model. The generated met file included hourly data for wind direction, wind speed (converted from knots to m/s), total cloud fraction, air temperature, relative humidity and precipitation.

Advanced Canyon and Urban Canopy Files
The data required to carry out the advanced canyon [51] and urban canopy [50] calculations are (1) a road network shapefile and (2) a buildings shapefile, including a height field. The building data have been obtained from the Digimap database [71] via the University. The ADMS-Urban software package includes ArcGIS tools [72], which have been used to calculate an Advanced Canyon file. The building height and canyon width along each road link were derived. The gridded urban canopy parameters have also been calculated for use in representing urban wind flow variations. These enable the ADMS-Urban model to account for the street canyon effect for road emissions and spatially varying urban canopy flow for all source types.

Spatial Splitting
The task farming approach was achieved by spatially splitting the domain within the ADMS model. The overall rectangular output grid domain extent covering WM was first divided into 468 smaller sub-domains (forming a grid of 26 by 18 sub-domains, shown in Figure 2), each with a size of 2 km × 2 km. For some 2 km × 2 km sub-domains with denser road links in city centre areas, which also included increased numbers of output points to fully resolve the near-road concentrations, a further spatial splitting into 1 km × 1 km or 500 m × 500 m sub-domains was adopted in order to reduce the overall computation time. The total number of sub-domains was 540 (although no output points have been specified beyond 1 km outside the WM boundaries). The maximum number of road sources in a single sub-domain was 598 and the maximum number of output points in a sub-domain was 4725. A buffer zone [73] of 750 m for road sources (to exclude explicit road sources unlikely to contribute significantly to modelled concentrations in the sub-domain) was used for each sub-domain.
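The two-level splitting can be sketched as a regular tiling followed by recursive quartering of dense tiles. The 26 by 18 grid of 2 km tiles and the 500 m minimum size follow the text; the splitting threshold (500 sources), the source-counting function and the location of the dense area are hypothetical, so the sketch does not reproduce the 540 sub-domains of the actual configuration.

```python
def make_tiles(xmin, ymin, nx, ny, size_m):
    """Regular grid of square tiles, each stored as (x, y, size)."""
    return [(xmin + i * size_m, ymin + j * size_m, size_m)
            for j in range(ny) for i in range(nx)]


def refine(tile, count_sources, max_sources, min_size_m):
    """Quarter a tile recursively while it holds too many sources."""
    x, y, size = tile
    if size <= min_size_m or count_sources(tile) <= max_sources:
        return [tile]
    half = size / 2
    pieces = []
    for dx in (0.0, half):
        for dy in (0.0, half):
            pieces.extend(refine((x + dx, y + dy, half),
                                 count_sources, max_sources, min_size_m))
    return pieces


def dense_near_centre(tile):
    """Hypothetical source count: 1000 sources if the tile contains one
    'city centre' point, otherwise none."""
    x, y, size = tile
    return 1000 if (x <= 25300.0 < x + size and y <= 17300.0 < y + size) else 0


base_tiles = make_tiles(0.0, 0.0, nx=26, ny=18, size_m=2000.0)
sub_domains = [piece for tile in base_tiles
               for piece in refine(tile, dense_near_centre,
                                   max_sources=500, min_size_m=500.0)]
```

In this toy case only the tile containing the dense point is refined, first into four 1 km tiles and then, for the quarter that still holds the point, into four 500 m tiles.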

Receptor Run: Model Evaluation
For the purpose of model evaluation, the model was first run in a "Receptor" Mode (a run with output for a limited number of specified receptors) for 32 air quality measurement sites within the WM over the whole year of 2016, with measured concentration data obtained from local authorities and Defra's AURN [12] (shown in Figure 3, mostly with available hourly air quality measurements). These sites were of three types: 1 airport site, 19 roadside sites and 12 urban background sites. In order to reduce the model computational time, the source exclusion option [73] was used so that road sources far away from the specified receptors, and therefore unlikely to contribute significantly to modelled concentrations at those receptors, were not explicitly modelled; the specified exclusion distance was 750 m. The Receptor run was conducted on a Windows PC and took about 12 h of computation time to produce hourly output of five air pollutants (NOx, NO2, O3, PM10 and PM2.5) across a whole year for all 32 receptors. The Model Evaluation Toolkit [74] was used to evaluate the model by comparing it with the measured air quality data using statistical and graphical methods. Figure 4 shows the evaluation of modelled annual NOx, NO2 and O3 against observations using scatter plots divided by site type. Overall, the model performed well in terms of NOx and NO2 for all site types. The good fits for O3 further suggested good performance of the model chemistry. Table 2 shows the statistics (see definitions in [49]). Figure 5 shows the evaluation of modelled annual average PM10 and PM2.5 against observations using scatter plots divided by site type; note that there were no PM2.5 measurements at the single airport site. PM10 had a very good fit for the airport and urban background sites. PM10 tended to be slightly over-predicted at roadside sites, possibly related to uncertainties in traffic non-exhaust emissions and background data.
The model had good predictions for the small number of sites with available PM2.5 measurement data. For PM10 and PM2.5, Fb ranges between (−0.13, 0.12), slightly wider than (−0.07, 0.09) in Hood et al. (2018) [49]. Fac2 indicates that more than 74% of modelled PM10 and PM2.5 concentrations are within a factor of 2 of observations. NMSE varies between (0.43, 0.53) for PM10
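For readers unfamiliar with the statistics quoted above, the sketch below computes fractional bias (Fb), normalised mean square error (NMSE) and the fraction of predictions within a factor of two of observations (Fac2) using commonly used definitions. The sign convention for Fb (model minus observed) and the handling of non-positive observations are assumptions; the definitions actually used are those given in [49]. The example values are invented.

```python
def mean(values):
    return sum(values) / len(values)


def fractional_bias(model, obs):
    """Fb = (mean(M) - mean(O)) / (0.5 * (mean(M) + mean(O))); positive = over-prediction."""
    m, o = mean(model), mean(obs)
    return (m - o) / (0.5 * (m + o))


def nmse(model, obs):
    """NMSE = mean((M - O)^2) / (mean(M) * mean(O))."""
    squared_errors = [(m - o) ** 2 for m, o in zip(model, obs)]
    return mean(squared_errors) / (mean(model) * mean(obs))


def fac2(model, obs):
    """Fraction of pairs with 0.5 <= M/O <= 2 (pairs with O <= 0 are skipped)."""
    pairs = [(m, o) for m, o in zip(model, obs) if o > 0]
    within = sum(1 for m, o in pairs if 0.5 <= m / o <= 2.0)
    return within / len(pairs)


# Invented hourly concentrations (µg/m3) for illustration only.
modelled = [10.0, 20.0, 30.0, 80.0]
observed = [10.0, 25.0, 30.0, 35.0]
fb_value = fractional_bias(modelled, observed)
nmse_value = nmse(modelled, observed)
fac2_value = fac2(modelled, observed)
```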

Contour Run: Air Quality Maps
For the generation of air quality maps, the model was then run in a "Contour" Mode (with the splitting option activated) to include output points covering the whole WM (and extending up to 1 km outside the WM boundary). An array job with 540 cores, each for a single sub-domain as shown in Figure 2, was submitted to the HPC at the University of Birmingham using the Linux version of the ADMS-Urban model. The overall elapsed time for the run (determined by the slowest of the 540 cores) for the typical whole year 2016 baseline case is about 35 h, with a median core run time of about 5 h and a minimum run time of 16 s for sub-domains without any emission sources. The total computational time (summing over all 540 cores) is about 169 days. Figure 6 shows a comparison of the elapsed (clock) time of the slowest core and the total computational time using task farming (estimated from a typical one-day simulation, owing to the substantial computational time required for a single-core simulation). From 1 core to 4 cores, the typical elapsed time reduces by about 80%. Beyond 156 cores, the typical elapsed time reduces more gradually. The total computational time profile decreases more slowly than the profile for the elapsed time of the slowest core, which is due to the increase in the buffer zone calculations for larger numbers of cores (especially for 540 cores). The choice of the number of cores can depend on the local HPC service, such as limitations in the number of available cores and the maximum allowed time for a single core (walltime). The output for each sub-domain was in NetCDF file format, and the outputs have been combined and interpolated using the CombineCOF and AddInterpIGP utilities developed by CERC. The re-combination and interpolation time was about 1 h. The recombined and interpolated outputs for the hourly output of the whole year over the WM region contained ~0.61 million and ~1.26 million output locations, and had file sizes of about 120 GB and 247 GB, respectively.
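The relationship between elapsed time and total computational time in a task farm can be sketched as follows. When each sub-domain has its own core, the elapsed time is simply the longest task; with fewer cores than tasks, some assignment of tasks to cores is needed, and the greedy longest-first rule used here is an assumption for illustration, not the scheduler used on the HPC. The sketch also holds the set of tasks fixed, whereas in practice a different number of cores implies a different spatial split. The task durations are invented; the final line reproduces the ratio implied by the reported figures (169 core-days against 35 h elapsed).

```python
import heapq


def farm_times(task_hours, n_cores):
    """Return (elapsed_hours, total_core_hours) for independent tasks
    assigned longest-first to the currently least-loaded core."""
    loads = [0.0] * n_cores
    heapq.heapify(loads)
    for hours in sorted(task_hours, reverse=True):
        lightest = heapq.heappop(loads)      # core that finishes earliest so far
        heapq.heappush(loads, lightest + hours)
    return max(loads), sum(task_hours)


# Invented sub-domain run times (h): one slow city-centre tile dominates.
tasks = [35.0, 5.0, 5.0, 5.0, 1.0, 1.0]
elapsed_6, total = farm_times(tasks, n_cores=6)   # one core per task
elapsed_2, _ = farm_times(tasks, n_cores=2)
elapsed_1, _ = farm_times(tasks, n_cores=1)

# Ratio of reported total core time to reported elapsed time.
effective_speedup = 169 * 24 / 35
```

In this toy case two cores already achieve the minimum elapsed time, because the slowest task sets a floor that extra cores cannot lower; only finer splitting of that sub-domain would reduce it. The ratio of about 116 should be read with care, since the total core time includes the extra buffer-zone calculations introduced by splitting.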
The final NetCDF output was then processed to derive annual/subset averages and other statistical output (e.g., percentiles) using the "Process comprehensive output" tool in the model. This process took a couple of hours, dependent on the number of output air pollutants. The final contour plots over WM at a specified resolution (e.g., 10 m × 10 m) were created via GIS tools, in particular using interpolation in Surfer and display in ArcGIS. Figure 7 presents a map of the annual average concentration of NO2, which is a key air quality challenge for roadside locations in the UK, at 10 m × 10 m horizontal resolution for the baseline year of 2016. Other pollutants such as NOx, O3, PM10 and PM2.5 are shown in Figure A1. The legend of Figure 7 indicates colour scales with annual mean NO2 concentrations higher than the UK objective value of 40 µg m⁻³ [75] shown in orange and red. There were relatively higher concentrations of NO2 near motorways and major roads in city centre areas, mostly due to the higher traffic-related emissions. Away from major roads and in rural areas, NO2 concentrations were generally lower.

Projected Air Quality Map for Health
For the purposes of health-related research, including assessment of personal exposure and exploration of relationships between air pollution levels and socio-demographic characteristics typically available on different spatial scales, the 10 m × 10 m horizontal resolution annual air quality map may need to be further aggregated into other polygon layers, e.g., Lower Layer Super Output Areas (LSOA) and ward levels (averages over these layers). Figure 8 shows examples of projected annual air quality maps for NO2 averaged over the LSOA layer and the ward layer. There were clear patterns of higher concentration in city centre areas and lower concentration in rural areas. The projected air quality maps for other pollutants such as NOx, O3, PM10 and PM2.5 are shown in Figure A2 (in the LSOA layer) and Figure A3 (in the ward layer). These can then be linked to population and health data for the assessment of the health impacts of air pollution. Note that the spatial averaging process leads to a narrower range of concentrations across the whole area, as both the lowest concentrations in the most rural areas and the highest concentrations adjacent to road sources are no longer fully represented; this reduction in dynamic range is significant and dependent on spatial resolution.
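The aggregation step, and the narrowing of the concentration range that it causes, can be illustrated with a minimal sketch. It assumes that each grid cell has already been assigned to a polygon (for example by a GIS point-in-polygon operation); the zone labels and concentrations are invented.

```python
from collections import defaultdict


def zonal_mean(values, zone_ids):
    """Average gridded values over the zone (e.g. LSOA or ward) each cell belongs to."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for value, zone in zip(values, zone_ids):
        sums[zone] += value
        counts[zone] += 1
    return {zone: sums[zone] / counts[zone] for zone in sums}


# Invented annual mean NO2 (µg/m3) at five grid cells in two zones;
# zone "A" contains one roadside cell with a high concentration.
cell_values = [10.0, 20.0, 60.0, 30.0, 34.0]
cell_zones = ["A", "A", "A", "B", "B"]
zone_means = zonal_mean(cell_values, cell_zones)

cell_range = max(cell_values) - min(cell_values)
zone_range = max(zone_means.values()) - min(zone_means.values())
```

Here the cell values span 50 µg/m3 while the zone means differ by only 2 µg/m3, which is the loss of dynamic range noted above.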

Percentile Air Quality Map
In addition to annual average air quality targets, percentiles are a useful indicator of exceedance of the air quality objective value. For NO 2 , the 99.8th percentile of the 1 h mean [75] is normally used, which represents the 18th highest concentration in the hourly series over a whole calendar year. The UK objective for 1 h mean NO 2 concentration (where public exposure occurs) is "200 µg m −3 not to be exceeded more than 18 times a year" [75]. Figure 9 shows the 99.8th percentile of the 1 h mean NO 2 concentration. Exceedance of the air quality objective value (200 µg m −3 ) for the 1 h NO 2 concentration was found mostly along motorways and major roads linking to motorways, areas in which public exposure may be limited.
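The correspondence between the 99.8th percentile and the 18th highest hourly value can be checked with a short calculation. The Python sketch below assumes a nearest-rank percentile definition, which may differ from the definition used by the model's post-processing tool, and uses synthetic placeholder series rather than model output. The function names are illustrative.

```python
import math


def nearest_rank_percentile(values, pct):
    """Percentile by the nearest-rank method (an assumption of this sketch)."""
    ordered = sorted(values)
    rank = math.ceil(pct / 100.0 * len(ordered))  # 1-based rank from the lowest value
    return ordered[max(rank, 1) - 1]


def count_exceedances(values, limit=200.0):
    """Number of hourly values strictly above the limit (ug m-3)."""
    return sum(1 for value in values if value > limit)


# Synthetic hourly series with distinct values; placeholders, not concentrations.
hours_standard_year = [float(i) for i in range(1, 8761)]  # 8760 h
hours_leap_year = [float(i) for i in range(1, 8785)]      # 8784 h (e.g., 2016)

p_standard = nearest_rank_percentile(hours_standard_year, 99.8)
p_leap = nearest_rank_percentile(hours_leap_year, 99.8)

# 18th highest value in each series, for comparison.
eighteenth_standard = sorted(hours_standard_year, reverse=True)[17]
eighteenth_leap = sorted(hours_leap_year, reverse=True)[17]
print(p_standard == eighteenth_standard, p_leap == eighteenth_leap)
```

Under this definition, 0.998 × 8760 = 8742.48 rounds up to rank 8743 from the lowest value, which is the 18th highest of 8760 hourly values; the same holds for the 8784 h of a leap year.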

Air Quality Maps over Temporal Subsets
Post-processing tools provide the flexibility to obtain concentration averages over temporal subsets, which may be useful when mapped for health/exposure studies. Figure 10 shows air quality maps of NO 2 over selected temporal subsets, i.e., AM (7 a.m.-9 a.m.) weekday, IP (inter-peak) (9 a.m.-3 p.m.) weekday, PM (3 p.m.-7 p.m.) weekday, and IP (9 a.m.-3 p.m.) weekend. For AM weekday and PM weekday, the influence of major road emissions is clearly visible over the whole WM region, due to the region-wide increased traffic activity during these peak periods. For IP weekday and IP weekend, the major roads make a smaller contribution to concentrations over the WM region, compared with peak periods on weekdays. The NO 2 concentration for IP weekday shows higher overall concentration levels and a clearer pattern of influence from road emissions than that for IP weekend.
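The subset averaging described above can be sketched as follows. This is an illustrative Python example, not the model's post-processing tool: the hour windows follow the subset definitions in the text, while the half-open hour intervals, the treatment of Monday to Friday as weekdays, the function name and the placeholder values are assumptions of the sketch.

```python
from datetime import datetime, timedelta

# (start hour, end hour, weekday flag) for each subset named in the text.
# Half-open intervals [start, end) are an assumption of this sketch.
SUBSETS = {
    "AM weekday": (7, 9, True),
    "IP weekday": (9, 15, True),
    "PM weekday": (15, 19, True),
    "IP weekend": (9, 15, False),
}


def subset_mean(times, values, start_hour, end_hour, weekday):
    """Mean of the values whose timestamp falls in the given hours and day type."""
    selected = [
        value
        for time, value in zip(times, values)
        if start_hour <= time.hour < end_hour and (time.weekday() < 5) == weekday
    ]
    return sum(selected) / len(selected) if selected else None


# One synthetic week of hourly timestamps starting Monday 4 January 2016.
start = datetime(2016, 1, 4)
times = [start + timedelta(hours=h) for h in range(7 * 24)]
# Placeholder values (the hour of day), not concentrations from the study.
values = [float(t.hour) for t in times]

subset_means = {
    name: subset_mean(times, values, *spec) for name, spec in SUBSETS.items()
}
print(subset_means)
```

Applying the same mask to every receptor point in an hourly output series would give one averaged field per subset, which could then be mapped in the same way as the annual average.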

Discussion
The air quality sites used to evaluate the model performance included three representative types (i.e., airport, roadside and urban background sites). Overall, the model performed well for all pollutants. For airport and urban background sites, the model results reproduce the measured values well since these sites are less influenced by local emissions and complex building geometry. For roadside sites, the concentrations of air pollutants are more influenced by local emissions (e.g., traffic NO x ) and street canyon geometry, which may be reflected in higher uncertainty for some of the roadside sites. NO x and NO 2 concentration levels are most closely related to local emissions and can be well predicted by the current model. PM concentrations are more related to the regional background, which may have some uncertainty. In order to reduce the model uncertainty, the model has been set up with the best available emissions, meteorology, building data, source locations and monitor locations for the WM region. Based on model best practice and the model evaluation, the model configuration is satisfactory for the wider WM contour run. It is of note that, for PM 10 and PM 2.5 , the background concentrations (constrained to observations) are greater than the increment associated with emissions in the model domain at all sites.
The contour run was performed using the task farming capability (via spatial splitting) within the ADMS-Urban model. For this air quality modelling application in a large urban area, the optimisation process has reduced weeks of model execution time to only ~35 h and the model can generate high horizontal resolution "street-scale" air quality maps over WM. There are relatively higher annual concentrations of NO 2 in city centre areas (e.g., Birmingham), mainly due to the higher local traffic emissions. As NO 2 concentrations are closely related to local traffic emissions, the control of traffic in city centre areas would have a substantial effect on reducing proximate NO 2 levels. A Birmingham CAZ is proposed to be in place from June 2021 to reduce air pollution levels within Birmingham city centre [63].
The high horizontal resolution air quality maps generated from the model output can be further aggregated into other health-related layers (such as LSOA and ward layers) to study the relationship between air quality and health data. Apart from the annual averages, the model output and post-processing flexibility enable the calculation of other statistics, such as percentiles. Exceedances of the air quality objective value for the 99.8th percentile of the 1 h mean NO 2 concentration were generally only found for motorways and major roads leading to motorways, which is unsurprising given the intensive traffic activity. The reduction in traffic speed limits for some motorways (e.g., trials on speed limits on M6 and M5 motorways near Birmingham reduced from 70 mph to 60 mph by Highways England [76]) may thus help to reduce the NO 2 exceedance, although exposure at such locations may be limited.
Air quality maps over temporal subsets can also be derived for health/exposure studies. As expected, AM weekday and PM weekday have clear patterns of region-wide traffic, and NO 2 concentrations over these periods are much higher than the annual averages. The influences of traffic over IP weekday and IP weekend are less significant compared with AM weekday and PM weekday. These findings can be useful for exposure and health studies over working periods. Reis et al. [77] also highlighted the importance of workday population mobility on the exposure to air pollutants.

Conclusions
A WM-Air ADMS-Urban baseline model configuration has been developed and model predictions for NO x , NO 2 , O 3 , PM 10 and PM 2.5 have been evaluated using measurement data. Overall, the model performed well and run times are manageable using the task farming approach. A regional (e.g., CMAQ) modelling system can provide spatially varying regional background predictions, which can be coupled with the ADMS-Urban model in future, but may also have its own uncertainties in terms of model configuration, compared with observational constraints. The post-processing flexibility enables the creation of air quality maps for annual/subset averages and other statistical output (e.g., percentiles). The model outputs can be useful for the study of health impacts of air pollutants.
Future work will draw upon the demonstrated efficient execution of multiple air quality modelling scenarios on HPC. It is important to ensure that the model configuration includes model inputs that are sufficiently detailed to allow different scenarios to be represented. There is a range of possible air quality modelling scenarios: local and national, short-term and long-term, transport related and non-transport related. The combination of detailed inventory and spatially defined emissions, high resolution dispersion simulation, efficient parallelisation and flexible post-processing will allow the exploration of multiple scenarios. These may investigate concentration responses to interventions such as Clean Air Zones, the influence of solid fuel combustion, agricultural emissions, air quality-climate interactions and the relationships between air pollution exposure and population distribution. These in turn enable optimisation of the combined benefits and equity of possible future air quality targets based on limit values and exposure reduction.