Application of Density Plots and Time Series Modelling to the Analysis of Nitrogen Dioxides Measured by Low-Cost and Reference Sensors in Urban Areas

: Temporal variability of NO 2 concentrations measured by 28 Envirowatch E-MOTEs, 13 AQMesh pods, and eight reference sensors (ﬁve run by Shefﬁeld City Council and three run by the Department for Environment, Food and Rural Affairs (DEFRA)) was analysed at different time scales (e.g., annual, weekly and diurnal cycles). Density plots and time variation plots were used to compare the distributions and temporal variability of NO 2 concentrations. Long-term trends, both adjusted and non-adjusted, showed signiﬁcant reductions in NO 2 concentrations. At the Tinsley site, the non-adjusted trend was − 0.94 ( − 1.12, − 0.78) µ gm − 3 /year, whereas the adjusted trend was − 0.95 ( − 1.04, − 0.86) µ gm − 3 /year. At Devonshire Green, the non-adjusted trend was − 1.21 ( − 1.91, − 0.41) µ gm − 3 /year and the adjusted trend was − 1.26 ( − 1.57, − 0.83) µ gm − 3 /year. Furthermore, NO 2 concentrations were analysed employing univariate linear and nonlinear time series models and their performance was compared with a more advanced time series model using two exogenous variables (NO and O 3 ). For this purpose, time series data of NO, O 3 and NO 2 were obtained from a reference site in Shefﬁeld, which were more accurate than the measurements from low-cost sensors and, therefore, more suitable for training and testing the model. In this article, the three main steps used for model development are discussed: (i) model speciﬁcation for choosing appropriate values for p , d and q , (ii) model ﬁtting (parameters estimation), and (iii) model diagnostic (testing the goodness of ﬁt). The linear auto-regressive integrated moving average (ARIMA) performed better than the nonlinear counterpart; however, its performance in predicting NO 2 concentration was inferior to ARIMA with exogenous variables (ARIMAX). Using cross-validation ARIMAX demonstrated strong association with the measured concentrations, with a correlation coefﬁcient of 0.84 and RMSE of 9.90. ARIMAX can be used as an early warning tool for predicting potential pollution episodes in order to be proactive in adopting precautionary measures.


Introduction
Air pollution is one of the most serious environmental threats to health, killing 6.4 million people in 2015 worldwide both in developed and less-wealthy nations [1]. Out of these, 2.8 million deaths were caused by indoor air pollution and 4.2 million deaths by outdoor air pollution. Air pollution is causing various health problems including respiratory problems, cardiovascular diseases, lung cancer and asthma [2]. Walters and Ayres [3] have also reported that air pollution, especially particulate matter and nitrogen dioxide (NO 2 ) pollution, may cause premature deaths and hospital admissions for conditions such as cardiovascular problems, allergic reactions and lung cancer. It is reported that exposure to air pollution is particularly harmful for children, people with existing health problems and the elderly [4][5][6]. Evidence suggests that the negative impacts of air pollution are dependent on the levels of air pollutants and length of exposure, higher levels and longer exposure resulting in more severe adverse effects [3,6]. Furthermore, air pollution may reduce visibility, damage historical buildings and monuments, affect vegetation and reduce crop yield and quality [4][5][6][7].
Air quality modelling is carried out for several purposes, including air quality prediction, quantifying the impacts of air pollution, modelling the impacts of various factors on air pollution, modelling pollution processes and transport, running and testing emission scenarios, modelling the dispersion of air pollutants in the atmosphere, quantifying the emissions of air pollutants from various emission sources, determining long-term trends in air pollutant concentrations and producing high-resolution spatiotemporal maps of air pollution [6][7][8][9][10][11][12][13][14][15]. In this study, the main purposes were to compare the performance of different time series models and to develop a time series model for predicting future NO 2 concentrations in Sheffield, UK. Time series models are one of the popular tools for predicting the future by understanding the past changes in air pollution concentrations [16].
Time series modelling is a useful tool for data analysis and has several benefits, which include data cleaning, data understanding and forecasting [17]. Time series modelling can help us filter out the noise and reveal the true signal in a dataset. Once a dataset is cleaned and the time series is divided into its different components, it helps us understand the true nature of the dataset. Finally, like other modelling approaches, time series modelling helps us predict future levels with the help of present and past levels of the time series (here NO 2 concentrations) [17]. To build a time series model the time series data must be stationary. For a time series to be stationary, the mean, variance and covariance of the time series should not be a function of time [18]. If the time series is not stationary, it should be stationarised first before fitting a model to it [18]. To stationarise a time series, it should be detrended and deseasonalised using differencing and power transformations [16]. A time series model can be linear or nonlinear depending on the relationship between current and past observations [16]. The Autoregressive (AR) and moving average (MA) models are the two widely used linear time-series models [19]. The AR and MA models can be combined to form the autoregressive moving average (ARMA) and autoregressive integrated moving average (ARIMA) models [20]. To model and predict a seasonal time series, the seasonal autoregressive integrated moving average (SARIMA) model is used, which is a variation of ARIMA [19]. ARIMA along with its various variations are also known as the Box-Jenkins models because they are based on the Box-Jenkins principle [19]. Classification of time series models is shown in Figure 1 [21][22][23].
Although air pollutant levels have decreased in the UK and Europe during the last decade or so, some air pollutant levels still exceed air quality standards in many urban areas [24] and 620 air quality management areas (AQMAs) have been declared across the UK [25]. Out of these areas, five cities (Birmingham, Leeds, Southampton, Derby and Nottingham) will not achieve the targets until 2025 and London will not comply until 2030 if additional control measures are not taken [24]. Most of the exceedances are due to the high levels of NO 2 and PM 10 [26], emphasising that these air pollutants are a serious problem in urban areas. Further actions are required to cut emissions and understand the main drivers of high levels of NO 2 and PM 10 in urban areas. Air pollution modelling is a part of these actions.
In first part of this article, NO 2 concentrations measured by different grades of sensors are compared graphically, whereas, in the second part, a comparison of time series modelling approaches is made using measured time series data of NO 2 collected at an urban monitoring site in Sheffield. A comparison is made between linear and nonlinear time series models and the better performing model is then compared with the ARIMAX model, which, in addition to persistence, uses exogenous variables. The aim is to assess the suitability of time series models for NO 2 prediction and choose the best time series model for air quality modelling in the urban environment. VARX, vector autoregressive with exogenous variables; VARMA, vector autoregressive moving average; NARX, nonlinear autoregressive with exogenous variables; NARMAX, nonlinear autoregressive moving average with exogenous variables; SETAR, self-exciting threshold autoregressive; NNNAR, neural network nonlinear autoregressive.

A Brief Description of the Monitoring Network
In this study, NO 2 concentrations (µg/m 3 ) from a network of low-cost sensors (LCSs) and reference sensors were analysed in Sheffield, United Kingdom. A network of LCSs was made by the Urban Flows Observatory, University of Sheffield, UK, consisting of 28 Envirowatch E-MOTEs and 13 AQMesh pods. These are all urban traffic sites. In addition, five reference air quality monitoring stations are operated by Sheffield City Council (SCC) and three are operated by the UK Department for Environment, Food and Rural Affairs (DEFRA), which are part of the Automatic Urban and Rural Network (AURN), the largest air quality monitoring network in the UK. The locations of the air quality monitoring stations (AQMSs) are shown in Figure 2 and their names, coordinates and annual mean NO 2 concentrations are given in Table 1. Data from these sensors were analysed from August 2019 to September 2020.  LCSs are compact, portable and use less power when compared to reference instruments. LCSs range in price from a couple of thousand to several thousand pounds (for a relatively sophisticated multi-pollutant and meteorological sensor with communication capabilities). Reference sensors are expensive, both to purchase and maintain, and bulky, but are the most accurate units, recommended for use by EU and UK government bodies for air quality (AQ) monitoring and complying with standards such as MCERTS in the UK. A single reference unit costs in the region of 20,000 pounds to monitor a single gaseous or particle pollutant. LCSs and reference sensors employ different techniques for air pollutant measurement, which include optical particle counters, light scattering, metal oxide semiconductor sensors, electrochemical sensors, nondispersive infrared sensors, ultraviolet fluorescence, chemiluminescence, infrared photometry and photo-ionisation detection sensors. For more detail, see [27,28]. The LCSs used in this project were either Envirowatch E-MOTEs [29] or AQMesh pods [30]. The Envirowatch E-MOTEs are deployed in a local mesh in a cluster, providing data via ZigBee, within a certain area for high-resolution monitoring, no more than 100 m from each other, with a gateway providing an uplink capability. AQMesh sensors are independent and can be deployed at both high and low spatial resolutions. Both Envirowatch E-MOTEs and AQMesh pods are electrochemical sensors, whereas the sensors used by DEFRA and SCC are reference sensors which use chemiluminescent analysers for NO 2 and an ultraviolet (UV) absorption analyser for O 3 measurements. The reference sensors are more reliable and accurate than the electrochemical sensors. The lowest detection limit for reference sensors and electrochemical sensor is <2 µg/m 3 . Electrochemical sensors are smaller and cheaper to purchase and maintain. Envirowatch E-MOTEs in a cluster communicate with a gateway by means of the Zigbee protocol within a specific area for high-resolution monitoring. The use of this protocol allows the individual units to communicate with each other and pass data from sensors that are not in range or without line-of-sight of the gateway. Using GPRS, the gateway device communicates the collected data over an internet connection to a cloud server operated by Envirowatch. Each AQMesh pod independently sends data to a cloud server using GPRS. For the reference sensors, the data logger is connected to the central management and coordination unit (CMCU) central computer, which collects the data using a GPRS mobile phone connection or wireless broadband.

Comparing NO 2 Measured at Different Sites
In this study, we compared NO 2 concentrations measured at different AQMSs using LCSs and reference sensors from August 2019 to September 2020. For inter-site comparison, density plots and time variation plots were used, which were developed in R programming language [31] using the "openair" package [32]. Density plots, also known as kernel density estimation (KDE) plots, showed the distribution of NO 2 concentrations and are smoothed versions of histograms. The peaks of density plots show where values of the variables are concentrated. Furthermore, time variation plots were developed to show the temporal variability in NO 2 concentrations over different temporal scales; especially, they depict the diurnal, weekly and annual cycles of NO 2 concentrations.

Long-Term Temporal Trend Analysis
Quantification of temporal trends in air pollutant concentrations serves to assess the effects of emission control strategies over a given period of time. In this study, the temporal trend of NO 2 concentrations was determined over the last 20 years (2000-2019) at the Tinsley AQMS, which is part of the UK AURN. This is an urban background site located at the Sheffield Tinsley Community Centre, approximately 200 metres east of the M1 motorway. The temporal trend was also determined at the Devonshire Green AQMS from 2014 to 2019, an urban background site, installed in Devonshire Green Park in Sheffield within a self-contained air-conditioned unit, surrounded mainly by open land and vegetation. To calculate long-term trend in NO 2 concentrations, here we employed the TheilSen function of the "openair"' package [32]. The TheilSen function is a non-parametric approach and uses bootstrap simulations. This technique estimates all the regression parameters through bootstrap resampling. The technique is not affected by outliers as it is based on the median (not mean) and can be applied to a non-normal distribution.

Time Series Model
To carry out time series modelling in this study, NO 2 , NO and O 3 data were used from the Devonshire Green AQMS in Sheffield. The site is part of the AURN network and has been monitoring various air pollutants, including NO 2 , NO, NOx and O 3 , since 2013. Figure 3 shows the monthly average of pollutants, which exhibits a general trend of higher NO, NO 2 and NOx concentrations in winter and lower concentrations in summer. In contrast, O 3 concentrations were higher in spring and summer and lower in winter. In winter the atmosphere is relatively static and the atmospheric boundary layer is shallower, which is a hindrance in pollutant dispersions; whereas, in summer, the atmosphere is more turbulent and both horizontal and vertical dispersion processes are active in dispersing locally emitted pollutants. This is probably the main reason that the concentrations of NO, NO 2 and NOx were lower in summer and higher in winter. On the other hand, ground level O 3 is a secondary pollutant and is produced by the photochemical reactions of NOx and volatile organic compounds (VOCs) in the presence of solar radiation. In summer, high temperature and solar radiation lead to higher levels of O 3 production than in winter, when the weather in the UK is cold, not conducive to O 3 formation. O 3 concentration was highest in May, when in addition to the weather conditions, O 3 precursors were abundant, in contrast to June and July when the precursors were relatively lower [33]. Time series data mainly have three building blocks: seasonality, trend and residuals [34]. Deconstruction (decomposition) of a time series is helpful in understanding the behaviour of a time series. The seasonality (seasonal component) of the time series refers to the fluctuations in the levels of pollutants related to seasonal factors. Seasonality is always of a fixed and known period, normally twelve months. The trend component is the overall long-term pattern of the time series and indicates if a pollutant concentration is increasing or decreasing over time. The residual or error component of the time series is the remainder part that cannot be attributed to seasonality and trend components. The three components of the time series are shown below in Equation (1): In Equation (1), a time series (Y) is divided into three components, which are the trend component (T c ), seasonal component (S c ) and residual or remainder component (R c ). For dividing the time series into three components, seasonal-trend decomposition based on loess (STL) was used, which was initially proposed by Cleveland et al. [35]. The process of extracting these components from the time series is referred to as decomposition. The three components of the NO 2 time series are shown in Figure 4. In this study, air pollution data were analysed using four time series models: ARIMA, ARIMAX, the self-exciting threshold autoregressive (SETAR) and neural network nonlinear autoregressive (NNNAR). For more details on these models, see [36][37][38]. Time series model development and application consist of three main steps [39]: model specification; -model fitting (parameters estimation); and model diagnostic (testing the goodness of fit of the fitted model).

Model Specification
Model specification means choosing appropriate values for p, d and q for a given time series, where p is the lag orders (i.e., the number of lagged values that NO 2 is regressed on), referred to as the autoregressive component, d is the degree of differencing (integration) and q is the order of moving average. Differencing a series involves simply subtracting its current and previous values' d times. Often, differencing is used to stabilize a series when the stationarity assumption is not met. For model specification (to specify the values of p, d and q), firstly, the auto-correlation function (ACF) and partial auto-correlation function (PACF) plots of the NO 2 time series were used. The ACF plot (Figure 5a) displays correlation between a series and its lags (previous values) and helps determine the order of differencing (d). Furthermore, the ACF plot can help in determining the order of the MA component. An MA component represents the error of the model as a combination of previous error terms [39]. The order q determines the number of terms to be included in the model. The ACF plot of the NO 2 series (Figure 5a) shows the properties of a nonstationary series, as the ACF fails to die out rapidly with increasing lags. All ACF values are significant (significantly greater than zero) and the only pattern is perhaps a linear decrease with increasing time lag. This shows that NO 2 series is nonstationary. Before a time-series model (e.g., the ARIMA model) is applied, the time-series must be stationarised, which means the variance, mean and auto-covariance of the time series need to be time invariant. The ACF plots of the NO 2 series differenced once and twice are presented in Figure 5b,c, respectively. After differencing once, the pattern emerged much more clearly, suggesting that the ARIMA model with difference 1 (d = 1) was probably a suitable model. The PACF plot (Figure 5d) displays the correlation between a variable and its lags that is not explained by previous lags. PACF plots are useful when determining the order p of the AR component, which specifies the number of lags used in the model; for example, AR (2) or ARIMA (2,0,0) showed that the order of AR was 2. The PACF plot showed that an AR (1) model should be considered (Figure 5d). The plot of differenced NO 2 is shown in Figure 5e, wherein the oscillating pattern around the zero shows that the series was stationary after the series was differenced once. This suggested that differencing of order 1 terms is sufficient and should be included in the model. Sometimes, if the series is not stationarised after differencing once, we need to difference it again. In addition to the ACF and PACF plots, we also considered model selection criteria based on the value of the Akaike's information criterion (AIC). The time series model with a lower value for the AIC showed a better fit. Based on the AIC values (Table 2), ARIMA (1,1,1) was selected. Differencing, autoregressive and moving average components make up an ARIMA model. ARIMA (1,1,1) showed that the model incorporated differencing of degree 1 and used an AR term of first lag and an MA of order 1. Although some of the other models (Table 2) had slightly lower AIC values, the difference was not significant and the model was more complicated, which is against the principle of parsimony, meaning that the selected model should use the lowest number of parameters providing adequate representation of the time series. When the ARIMA (1,1,1) model was applied to the square root and log of the NO 2 time series in contrast to observed NO 2 concentrations, the AIC values dropped to 3230.6 and 1117.5, respectively, which are much lower than the values presented in Table 2. Therefore, log-NO 2 was adopted in this study instead.

Model Fitting
Model fitting is basically parameter estimation. This study used the maximum likelihood estimation approach, which produces more accurate estimation of parameters in comparison to least square estimation [39]. The main advantage of the maximum likelihood estimation is that it uses all of the information in the data rather than just using the mean and variance of the data, as is the case with least square estimation [39]. Using daily NO 2 concentrations (2015 to 2019), the dataset was divided into training (70%) and testing (30%) datasets, which were selected randomly. Time series linear models were fitted (trained) with the TSA-package [40] and forecast-package [41] in R programming language [31]. The neural network nonlinear autoregressive and self-exciting threshold autoregressive (SETAR) nonlinear models were implemented in R-package "tsDyn" [38]. The model performance was assessed using both training and testing datasets. The specified models were run to estimate the model parameters.

Model Diagnostic and Forecasting
Residuals of a model are defined as the difference between the actual (observed) and predicted (modelled) values, as shown in Equation (2): Residual analysis of the model was performed by graphical presentation using a plot of standardised residuals, a quantile-quantile (Q-Q) plot and a histogram of the residuals. For an adequate model, a plot of standardised residuals shows a rectangular scatter around a zero horizontal level with no trends, whereas a Q-Q plot and histogram of the residuals show the normality of the error terms.
The model performance was assessed by comparing predicted with observed NO 2 concentrations both graphically, using time plots and scatter plots, and using several statistical metrics, including correlation coefficients (r), root mean square error (RMSE), factor of two (FAC2), mean biased error (MBE) and mean absolute error (MAE) [42,43]. The Pearson correlation coefficient measures the strength of the linear relationship between the measured and modelled concentrations. FAC2 is the fraction of predicted concentrations within a factor of two of the measured concentrations. RMSE, MAE and MBE provide an estimation of the error of the model prediction. However, RMSE and MAE do not define the direction of the error, as they provide absolute error, whereas MBE, in addition to the size, defines the direction of the error; i.e., negative MBE indicates under-prediction whereas positive MBE indicates over-prediction of a model.
The fitted model was used to predicted NO 2 concentrations, which were crossvalidated with the testing dataset. If the predicted value (forecast) is denoted byŶt (l), where l is the lead time for forecast and time t is the forecast origin, then forecasting one time-unit into the future (one step ahead) (Ŷt (1)) can be achieved as in Equation (3) where µ is the intercept and φ is the estimated parameter of the AR component. Equation (3) shows that, to forecast the next step, a proportion φ of the current deviation from the process mean is added to the process mean. To consider a general lead time l, we replace time t by t + l in Equation (3) and take the conditional expectation of both sides, which produces Equation (4) [39]:Ŷ Equation (4) is recursive in the lead time l and shows how the forecast for any lead time l can be built up from the forecast for a shorter lead time l by starting with the initial forecast as given by Equation (3). The forecastŶ t (2) is then obtained fromŶ t (2) (2) and so on. Equation (4) is also known as the "difference equation form" of the forecast [39]. An explicit expression for the forecast in terms of the observed history of the time series for Equation (4) can also be given in the form of Equation (5) [39].Ŷ Equation (5) shows that the current deviation from the mean is discounted by a factor φ l , whose magnitude decreases with increasing lead time l. The discounted deviation is then added to the process mean to produce the lead l forecast.
It should be noted that, in forecast, if an exogenous variable is used, then the number of steps ahead are ignored and the number of forecast periods is set to the number of rows of exogenous variables in the new data, which here is equal to the number of rows of the testing dataset.

Density Plots
NO 2 concentrations (µg/m 3 ) measured by different grades of sensors at different monitoring stations in Sheffield were compared employing densities plots. Density plots offer an easy way to compare the distribution of NO 2 concentrations measured at different monitoring stations. Figure 6 shows NO 2 concentrations measured by Envirowatch E-MOTEs. The performance of Envirowatch E-MOTEs was assessed by Munir et al. [44] by comparing their concentrations with a nearby reference sensor in Sheffield. The sensors were divided into four sub-groups based on the mean NO 2 concentrations. In first group of sensors (Figure 6a), the range of NO 2 concentrations was 0 to 50 µg/m 3 and the highest density (mode) occurred at 10-15 µg/m 3 . This represents the group of sensors with the lowest concentrations, which are deployed outside the city centre at the University of Sheffield campus. The second group (Figure 6b The distributions of NO 2 concentrations measured by different AQMesh pods are compared using density plots in Figure 7. Most of the AQMesh pods were deployed in the outskirts of the city (Figure 2). The distribution is mostly skewed right with heavy right tails. Only three AQMesh pods exceeded AQ standards, which were NO 2 _206 (Abbeydale Rd, 43.67 µg/m 3 ), NO 2 _2003 (Brightside Lane, 41.34 µg/m 3 ) and NO 2 _2005 (Savile Street, 46.91 µg/m 3 ). The measurements seem realistic as these sensors are deployed on very busy roadsides.   Density plots of NO 2 concentrations measured by AURN sites and SSC sites are shown in Figure 8a The concentrations measured at both AURN and SCC sites were well below the AQ standards. The lowest concentrations were recorded at the King Ecgbert site, which is located in a background location well outside the city centre. In this section, we consider the distribution and annual mean of NO 2 concentrations measured at different AQMSs of different grades. Density plots are a useful tool for understanding the distribution of NO 2 concentrations and comparing the distributions of different monitoring sites. However, density plots provide no information on the temporal variability of NO 2 concentrations, which are analysed in the coming section.

Time Variation Plots
It is important to know how NO 2 concentrations vary at different time scales. Time variation plots show variations in NO 2 concentrations on diurnal, weekly and annual cycles. Such information is useful in understanding the emission sources of air pollutants. In this section, temporal variabilities of NO 2 concentrations are analysed by employing time variation plots, using data from AURN, SSC, Envirowatch E-MOTEs and AQMesh pods. NO 2 concentrations measured by AURN and SCC sites are analysed first to set a benchmark of temporal variation for the LCS.
Barnsley is a roadside (urban traffic) AQMS, whereas Tinsley and Devonshire Green are urban background sites. Therefore, NO 2 concentrations measured at Barnsley site (NO 2 _brn) were higher than at the other two sites (Figure 9). Figure 9 shows that, on diurnal cycles, NO 2 concentrations were higher in the busy morning hours (07:00-09:00 h) and busy afternoon hours (17:00-19:00 h) and lower at night and midday. Weekly cycles showed that NO 2 concentrations were significantly lower on weekends (Sunday being the lowest) than on weekdays. Annual cycles of NO 2 showed higher concentrations in colder months (e.g., January, February, November and December) than the warmer months (e.g., June, July and August). All three AURN sites demonstrated the same temporal trends on diurnal, weekly and annual cycles. Diurnal and weekly cycles of NO 2 concentrations are controlled by road traffic flow, whereas annual cycles in addition to emissions are controlled by meteorological parameters. In winter, the temperature is low and the atmosphere is stagnant, which hinders the dispersion of air pollutants emitted locally. In contrast, in summer the atmosphere is more turbulent, encouraging both vertical and horizontal dispersion of pollutants.  Figure 10 shows the temporal variability of NO 2 concentrations using data from five SSC sites: NO 2 _fv (Firvale), NO 2 _ke (King Ecgbert), NO 2 _lf (Lowfield), NO 2 _tins (Tinsley) and NO 2 _wic (Wicker). SCC AQMSs demonstrated a temporal trend similar to AURN sites. In addition to some minor differences between different sites, NO 2 concentrations were significantly lower at the King Ecgbert site, which is a background site located outside the city centre in a quite location. To demonstrate temporal variability in NO 2 concentrations (Figure 11) measured by Envirowatch E-MOTEs, four E-MOTEs were selected: NO 2 _904 (Railway Station), NO 2 _902 (Harmer Lane/Sheaf Street), NO 2 _731 (Howard Street) and NO 2 _738 (Brook Hill, near Firth Court). These four sites were selected because NO 2 _904 presented an unusual time profile, whereas NO 2 _902, NO 2 _738 and NO 2 _731 showed a typical time variations similar to many other sensors. The diurnal, weekly and annual cycles followed almost similar pattern as those shown by AURN and SSC sites. However, there were some minor differences between various E-MOTEs installed at different locations within the city. Also, differences in winter vs. summer and weekend vs. weekday concentrations were much prominent in AURN sites than in E-MOTEs. NO 2 _904 demonstrated higher concentrations at all times, even at nights and weekend, as the taxi stand next to the train station remains busy at all times waiting for train passengers (probably with vehicles idle while waiting). Figure 12 shows the diurnal, weekly and annual cycles of NO 2 concentrations measured by four AQMesh pods. The four AQMesh sites were: NO 2 _1999 (Malin Bridge School), NO 2 _2001 (London Rd), NO 2 _2003 (Brightside Lane) and NO 2 _204 (Hunter's Bar School). Here the diurnal and weekly cycles showed similar trends; however, the annual cycles were different from each other and from the ones shown by AURN and SSC AQMSs. The NO 2 _2001 annual cycle was particularly different as it showed higher concentrations in summer months than in winter months. However, weekly cycles presented normal trends, as expected and shown by AURN sites.

Long-Term Trends in NO 2 Concentrations
The temporal trend in NO 2 concentrations was determined using the TheilSen function for the last 20 years (2000-2019) at Tinsley and for the last six years (2014-2019) at the Devonshire Green AQMS. Both of these sites are part of the AURN. Trends were expressed in µgm −3 /year. Both non-adjusted (non-deseasonalised) and adjusted (deseasonalised) trends were calculated. Sheffield Tinsley demonstrated negative trends during the study period. Both adjusted and non-adjusted trends were negative and highly significant. The non-adjusted trend was −0.94 (−1.12, −0.78) µgm −3 /year, whereas the adjusted trend was −0.95 (−1.04, −0.86) µgm −3 /year ( Figure 13). The temporal trend at the Devonshire Green site was also negative and highly significant. The non-adjusted trend was −1.21 (−1.91, −0.41) µgm −3 /year and the adjusted trend was −1.26 (−1.57, −0.83) µgm −3 /year ( Figure 14).  At both AQMSs, NO 2 levels decreased during the study period. The reduction in air pollution levels could have been caused by reductions in pollutant emissions or changes in climatic conditions. However, when the trends were adjusted for the effect of changes in climatic conditions, the trends were still negative and slightly greater than the non-adjusted trends (Figures 13 and 14). This probably proves that reduction in NO 2 levels were due to reductions in emissions, showing that, despite the fact that the number of vehicles on roads has gone up, the amount of exhaust emissions has decreased. The reduction in exhaust emissions was probably caused by the stringent emission policies of the UK government. Although generally pollution levels have decreased, the reduction is not uniform spatially and temporally. For example, the trend at the Tinsley site was lower (−0.94 µgm −3 /year) than at the Devonshire Green site (−1.21 µgm −3 /year). However, it should be noted that at Devonshire Green the trend was determined for a shorter period. When a shorter period was also considered for Tinsley, i.e., from 2014 to 2019, the trend was −1.65 µgm −3 /year. This probably shows that reductions in pollutant concentrations have been greater more recently. However, the reductions are not enough and NO 2 levels still exceed air quality standards in several parts of the city. Therefore, more actions are required to further improve air quality in Sheffield to comply with air quality guidelines.

Time Series Modelling Comparison of Linear and Nonlinear Time Series
The performances of two univariate nonlinear persistent models, NNNAR and SE-TAR, were compared to that of the linear ARIMA model. Several statistical metrics were calculated to assess the performances of these models (Table 3). Comparing their performances in Table 3, it can be observed that the ARIMA model performed better than the two nonlinear models. Correlation coefficients (r-values) for ARIMA, NNNAR and SETAR were 0.59, 0.45 and 0.44 and RMSE values were 8.61, 10.45 and 10.56, respectively, which show that there is probably no need for the use of a more complicated nonlinear model for air quality prediction, as ARIMA in this particular example performed better than the nonlinear counterparts. The nonlinear models are not discussed further. For more details on SETAR and NNNAR, readers are referred to Fırat [36] and Waheeb et al. [37]. Here, we first discuss the univariate ARIMA model and then compare its performance with ARIMAX. The estimated parameters of the fitted ARIMA (1,1,1) model are shown in Table 4. The estimated parameters were used to predict NO 2 concentrations. A comparison of predicted and observed NO 2 concentrations for the testing (held-out) dataset is shown in Figure 15. As expected, the ARIMA model performed better when model performance was assessed based on the training dataset with r-value 0.68, MBE 0.07 and FAC2 0.93 than when the model performance was assessed using testing dataset having r-value 0.59, MBE −0.26 and FAC2 0.91. Residual analysis was performed for the model. A plot of standardised residuals showed a rectangular scatter around a zero horizontal level with no trends, which showed the adequacy of the model. A quantile-quantile plot of the residuals showed that the data points followed the straight line closely, which led us to accept the normality of the error term. This was also confirmed by the histogram of the residuals, which appeared closed to normal (Figure not shown for brevity).

Outputs of ARIMAX Model
In addition to the autoregressive, moving average and differencing components used by univariate ARIMA, the more advanced ARIMAX model incorporates external variables, also known as exogenous regressors, into time series modelling. The exogenous variables incorporated into the ARIMAX model should be strongly correlated with the modelled variable. Here, in the first instance we used a single exogenous variable (NO concentrations) for modelling NO 2 concentrations and assessed how it improved the model performance compared to the ARIMA model developed in the previous section, which did not use any external variable. In the next step, NO and O 3 were used as exogenous variables and their effect was assessed on the model goodness of fit. Ideally, meteorological parameters such as temperature, relative humidity and wind speed should have been used in the model as well. However, due to data availability problems, they were not used as regressors in the model.
The estimated parameters of the ARIMAX (1,1,1) model are shown in Table 5 Table 7. The values of different statistical metrics calculated for assessing the model performance are provided in Table 8, where r-values were 0.90 and 0.84 and RMSE values were 11.75 and 9.90 for the training and testing datasets, respectively. Graphical presentation showed a strong association between predicted and observed concentrations (Figure not shown for brevity). This showed that the exogenous variables helped improve the model performance significantly compared to univariate persistent models. The above analysis showed that traditional time-series persistent models, e.g., ARMA or ARIMA, are useful tools for air pollution analysis and prediction; however they only depend on the behaviour of the past data without taking into account the effect of other pollutants and meteorological parameters that interact with the modelled pollutant. Therefore, these approaches fail to predict future levels accurately. The more advanced versions of these models, like ARIMAX, are able to analyse the effect of environmental factors and other pollutants and can result in better prediction by reducing the model error and strengthening correlations between modelled and observed concentrations.
The association of the NO 2 concentration with NO and O 3 concentrations is shown in the form of a scatter plot in Figure 16. The chemistry of NO 2 with O 3 and NO is shown in Equations (6) and (7). NO + O 3 → NO 2 +O 2 (6) Equations (6) and (7) define the chemistry of NO, NO 2 and O 3 . In Equation (6), one molecule of O 3 is consumed and one molecule of NO 2 is produced. In contrast, in Equation (7), one molecule of NO 2 is consumed and one molecule of O 3 is produced, so no net chemistry occurs. NO 2 is positively correlated with NO (r = +0.75) and NOx (r = +0.89) and negatively correlated with O 3 (r = −0.68). The negative association between NO 2 and O 3 is well known [13,45,46]. It shows that NO 2 concentration is strongly correlated with O 3 and other NOx species. Therefore, adding O 3 and NO as exogenous regressors in the model can explain a significant proportion of NO 2 and improves the model performance.
Catalano et al. [47] developed an ARIMAX model and compared its performance to an artificial neural network (ANN). They modelled NO 2 concentrations at Marylebone Road, London, using several exogenous variables: traffic volume, wind speed, wind direction, temperature and lagged NO 2 concentrations. According to Catalano et al. [47], the ARIMAX model performed better than an ANN. The mean ratio of the predicted to the measured concentrations was 0.89 for the ARIAMX model and 0.86 for the ANN. Both models had the same correlation coefficient of 0.91; however, mean absolute percentage error (MAPE) values were 18.62 and 16.53 for the ARIMAX mode and the ANN, respectively. This shows that the prediction of the ARIMAX model is comparable with a neural network and can be used for air pollution forecast in urban areas to provide timely warning of high air pollution levels to the public.

Conclusions
We have a network of AQMSs in Sheffield consisting of low-cost and reference sensors. Here, NO 2 concentrations measured by Envirowatch E-MOTEs, AQMesh pods and AURN and SCC AQMSs were analysed to characterise the temporal variability of NO 2 concentrations. Density plots are a useful tool for comparing and analysing the distributions of NO 2 concentrations measured by various sensors. Time variation plots were employed to characterise and compare the temporal variability of NO 2 concentrations measured at different monitoring sites. Time variation plots visualise how NO 2 concentrations vary during different time periods (e.g., diurnal, weekly and annual cycles) and help us understand their emission sources. Long-term data for NO 2 concentrations showed negative trends at both the Sheffield Tinsley and Devonshire Green sites, indicating that pollutant emissions decreased because of stringent emission policies. However, the reductions in pollution varied both spatially and temporally. Moreover, further smart interventions are required to cut emissions and improve air quality to comply with air quality guidelines. NO 2 concentrations were modelled from the Devonshire Green AQMS in Sheffield using univariate linear and nonlinear and multivariate time series models with exogenous variables. The addition of exogenous variables to the ARIMAX model significantly improved the model performance. Model specification, fitting and diagnostics were discussed. Model performance was assessed by calculating several statistical metrics, including r, RMSE, MBE, FAC2 and MAE. The ARIMA model showed better performance than nonlinear persistent models, showing that linear models were not only easy to apply and interpret but also showed better performance than the more complicated models such as SETAR and NNNAR. The best model fit was achieved when NO 2 concentration was modelled using ARIMAX with two exogenous variables (NO and O 3 ). These variables were strongly correlated with NO 2 . NO was positively correlated, whereas O 3 was negatively correlated with NO 2 .
Time series models with exogenous regressors can be successfully used to predict future pollution concentrations, providing early warning for the public so that timely precautionary measures can be adopted. The main weakness of the study was that we did not provide full details of the low-cost sensors calibration. Ideally, several of these low-cost sensors should have been collocated with reference sensors for detailed comparison and calibration. However, due to several practical problems (e.g., planning permission), this was not possible and the only calibration carried out was described in Munir et al. [44]. They compared the measurements of Envirowatch E-MOTEs with the measurements of an AURN site (Sheffield Devonshire Green). Future work will include installation of lowcost sensors, collocated with reference sensors for over a year, and a comparison of their measurements during different time periods. This study provides a detailed methodology for time series modelling, which can be used as an early warning tool for air pollution episodes, and compares different linear and nonlinear modelling approaches.

Data Availability Statement:
The data presented in this study are openly available in reference number [44,48].