Bayesian Proxy Modelling for Estimating Black Carbon Concentrations using White-Box and Black-Box Models

Abstract: Black carbon (BC) is an important component of particulate matter (PM) in urban environments. BC is typically emitted from gas and diesel engines, coal-fired power plants, and other sources that burn fossil fuel. In contrast to PM, BC measurements are not always available on a large scale due to the operational cost and complexity of the instrumentation. Therefore, it is advantageous to develop a mathematical model for estimating the quantity of BC in the air, termed a BC proxy, to enable wider spatial mapping of air pollution. This article presents the development of BC proxies based on a Bayesian framework using measurements of PM concentrations and size distributions from 10 to 10,000 nm from a recent mobile air pollution study across several areas of Jordan. Bayesian methods using informative priors can naturally prevent over-fitting in the modelling process, and they generate a confidence interval around the prediction, so the estimated BC concentration can be directly quantified and assessed. In particular, two types of models are developed based on their transparency and interpretability, referred to as white-box and black-box models. The proposed methods are tested on extensive data sets obtained from the measurement campaign in Jordan. In this study, black-box models perform slightly better due to their model complexity. Nevertheless, the results demonstrate that the performance of the two model types does not differ significantly. In practice, white-box models are more convenient to deploy, their methods are well understood by scientists, and the models can be used to better understand key relationships.


Motivation
Seven million people die each year due to the adverse health effects of air pollution, with 4.2 million deaths attributed to exposure to ambient (outdoor) air pollution [1]. Approximately 91% of the world's population lives in areas where air pollution exceeds guideline limits established by the World Health Organization (WHO) [2]. Several health-relevant ambient air pollutants include carbon monoxide (CO), ozone (O 3 ), nitrogen oxides (NO and NO 2 ), and sulfur dioxide (SO 2 ). Physics- and expert-based modelling approaches depend on experts to determine how the uncertainties of different variables can be adequately accounted for [20]. Recently, more practitioners have resorted to data-driven approaches, such as neural networks, as alternatives to physics- and expert-based methods [24]. Typically, data-driven methods do not require an in-depth understanding of air pollutant dynamics and other explanatory variables. However, the use of expert or known information may guide the inclusion of key variables and provide information about the nature of the expected relationship (e.g., linear or non-linear). Such methods can take advantage of the growth in low-cost air pollution sensing networks and computing technologies [25].
BC in particular has been investigated extensively, as described in [26,27]. BC modelling is also well developed and mostly consists of physics-based models. Examples include the global transport model [28], the regional climate-atmospheric chemistry model [29], land use regression models [30], and the BC mixing state model [31]. Recently, BC has been modelled using statistical distributions [32] and a linear mixed-effect model [33]. However, these models do not act as a proxy where BC concentrations can be estimated from other measured variables.
While data-driven methods have been used extensively to estimate other air pollutants, a major issue is the lack of proper interpretation of the results due to the use of non-transparent models (e.g., black-box (BB) models). Uncertainty analysis is also often neglected when the model outcomes are point-based estimates [24]. To overcome such issues, we develop a white-box (WB) model to predict BC mass concentrations using PM and PN data. The developed model for the BC proxy is then integrated with a Bayesian framework to address the over-fitting and uncertainty quantification issues that typically arise in a modelling process. A Bayesian BB model is also developed for comparison.

Case Study: Jordan Air Pollution Measurement Campaign
The proposed BC proxy methodology is tested on air quality data sets obtained from a mobile measurement campaign performed at several locations in Jordan, including the two most populated urban areas in the country: Amman and Zarqa. This campaign took place from 29 May to 4 June 2014. The sampling interval was 30 s for the aethalometer and 10 s for the remaining instruments used in the campaign. The data were then further pre-processed for analysis and proxy development by averaging to one data point per minute. The campaign map and other detailed information about the measurement campaign are described by Hussein et al. [34].
The measurements represent one of the most comprehensive mobile campaigns involving PM number concentrations and size distributions down to the UFP regime in urban areas in the Middle East and North Africa (MENA) region [35]. Understanding air pollutant sources in the area is a challenge: Amman is known as the economic and political center of the country, whereas Zarqa is one of the industrial centers. Furthermore, air pollution in the southern part of Jordan is mainly affected by dust particles due to sand resuspension from desert areas [35]. Urban air pollution in Amman and Zarqa originates from a vast range of sources, including emissions from traffic and industrial activities, local-scale household activities involving the burning of biomass (e.g., heating in the winter), and natural sources (e.g., dust resuspension). Airborne dust (super-micron PM >1 µm) is a major problem, not only in Jordan, but throughout the MENA region [36]. Given that the majority of air pollutant sources in the area are anthropogenic, BC is likely to be an important pollutant that contributes meaningfully to total PM concentrations.
The mobile campaign included the measurement of size-fractionated PM concentrations (10 nm-10 µm) and BC mass concentrations. Table 1 lists the associated aerosol instrumentation and measured variables. Using several portable instruments that cover a wide size range with different cutoff diameters makes it possible to derive particle number and mass concentrations in several size fractions. In this study, we focus on submicron (0.01 µm-1 µm) particle number concentrations in three particle diameter ranges: 10 nm-20 nm, 20 nm-300 nm, and 0.3 µm-1 µm. The PM x concentrations were obtained from the DustTrak, which recorded PM 1 , PM 2.5 , and PM 10 . BC mass concentrations were measured with a portable aethalometer (microAeth AE51), which reports the BC concentration based on changes in light attenuation, at a wavelength of 880 nm, of particles collected on a disposable filter. The filter was replaced each day prior to the measurements. The sample flow rate was 0.1 L/min and a 2.5 µm size-selective inlet was used. We set the time resolution to 30 s. The BC data were post-processed to remove any spurious spikes in the concentration (e.g., >1000 µg/m 3 ) that were associated with sudden vibration of the instrument. This type of monitor (i.e., the microAeth AE51) has been tested against a reference-type instrument (aethalometer AE31) and also for real-time performance in field measurements. According to Cheng and Lin [37], negative BC levels may appear with the AE51 at low actual BC levels or at high time resolution. Negative values can be eliminated very effectively by adopting the optimized noise-reduction averaging (ONA) algorithm.
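As a concrete illustration of this preprocessing step, the sketch below masks vibration spikes and averages the 30 s aethalometer record to one-minute resolution. The spike threshold and the two time resolutions are taken from the text; the function name and pandas-based interface are our own assumptions, and the ONA algorithm mentioned above is not implemented here.

```python
import pandas as pd

def clean_bc_series(bc_values, timestamps, spike_threshold=1000.0):
    """Mask spurious spikes (e.g., >1000 ug/m3, associated with sudden
    instrument vibration) and average the 30 s record to 1 min resolution."""
    s = pd.Series(bc_values, index=pd.to_datetime(timestamps))
    s = s.where(s <= spike_threshold)   # spikes become NaN and are skipped below
    return s.resample("1min").mean()    # one data point per minute
```

Masking (rather than dropping) spiked samples keeps the time index regular, so the one-minute averages are computed from whatever valid 30 s samples remain in each minute.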

Methods: Bayesian Modelling
Although physics-based models are often considered WB models [38], statistical models that can explain how they behave, how they produce predictions, and what the influencing variables are can also be categorized as WB models. This type of modelling is also known as WB machine learning [39,40]. Examples include linear and logistic regression, decision trees, and generalized additive models [41]. This section describes the details of the Bayesian methods for estimating BC concentrations in the form of WB and BB models.

Feature Analysis
The relationship between BC mass concentrations and the size-fractionated PM mass and number concentrations derived from the instrumentation in Table 1 is displayed in Figure 1. The use of all PM size fractions in the BC proxy may be redundant, since some variables may have similar contributions and others may not have meaningful associations with BC. Therefore, it is important to analyse the PM and PN concentration features. Three types of correlation analyses are performed: Pearson (r p ), Spearman (r s ), and mutual information (MI). The Pearson correlation coefficient is effective for evaluating the linear relationship between two continuous variables [42], whereas the Spearman correlation coefficient is computed from the ranked values of each variable rather than the raw data [43]. MI is applied to capture non-linear dependencies between these variables that the correlation coefficients may miss [44,45]. The relationship between these variables and BC mass concentrations is shown in each subplot of Figure 1. From the correlation analysis, it can be seen that the correlations between the individual PM variables and BC mass concentrations do not differ significantly. In this case, the variables PM 2.5 and PN 0.3−0.025 (shown as red data points in Figure 1) are selected as the input features of the BC proxy based on the physical characteristics of BC: the size distribution of BC is known to lie in the range of 25-300 nm, and that fraction contributes significantly to PM 2.5 [9]. The use of the PM 2.5 variable is advantageous because it is typically included in routine air quality monitoring around the world [46] and is increasingly used as part of low-cost air quality sensing networks [47].
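A minimal sketch of this three-way feature screening, assuming the candidate PM/PN columns are available as a NumPy matrix (the helper name, the synthetic column names, and the use of scikit-learn's MI estimator are our own assumptions, not the paper's code):

```python
import numpy as np
from scipy.stats import pearsonr, spearmanr
from sklearn.feature_selection import mutual_info_regression

def screen_features(X, y, names):
    """Score each candidate column against BC using Pearson r,
    Spearman r, and mutual information."""
    mi = mutual_info_regression(X, y, random_state=0)
    scores = {}
    for j, name in enumerate(names):
        scores[name] = {
            "pearson": pearsonr(X[:, j], y)[0],
            "spearman": spearmanr(X[:, j], y)[0],
            "mi": mi[j],
        }
    return scores
```

A feature with near-zero values on all three scores carries little information about BC; a feature with low Pearson but high MI suggests a non-linear association, which is exactly the case the MI analysis is meant to catch.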

Bayesian Model: White Box
From the previous subsection, it is known that BC contributes to both PM 2.5 and PN 0.3−0.025 . From Figure 1, it is also known that, on a logarithmic scale, the relationship between BC mass concentration and PM 2.5 is linear, whereas the relationship with PN 0.3−0.025 is non-linear. The proposed BC proxy structure (Equation (1)) captures both behaviours, and its mathematical description can be simplified (Equation (2)) to

y = f (X 1 , X 2 ; β) + ε,

where y, X 1 , and X 2 are log 10 [BC], log 10 [PM 2.5 ], and log 10 [PN 0.3−0.025 ], respectively, and f is linear in X 1 and non-linear in X 2 . The notation ε is a random error term that follows a Gaussian distribution, ε ∼ N (0, σ 2 ), where σ 2 is the noise variance. The model coefficients are collected in β = {β 1 , β 2 , β 3 , β 4 }. The proxy based on Equation (2) can be considered a WB model since the relationship between the inputs and the output is visible and transparent.

The aim in Bayesian modelling is not only to find single "best" values of the model coefficients β, but to explicitly account for the uncertainty of the coefficient estimates through the posterior distributions of the model coefficients. Bayes' rule (Equation (3)) [48] can be written as

p(θ | D) = p(D | θ) p(θ) / p(D),

where θ denotes the model parameters (here β and σ 2 ) and D the observed data. The following sub-subsections discuss the setup of the prior distribution and likelihood function, and then describe the Bayesian inference used to obtain the posterior and predictive distributions.

Prior Distribution
The proxy coefficients are assigned a Gaussian prior, p(β) = N (µ 0 , σ 0 2 ), whereas the noise variance is treated as a random variable following an inverse Gamma distribution, p(σ 2 ) = IG(a, b).
Informative priors are established by applying nonlinear regression to the training data. First, µ 0 is initialized using the estimated coefficients, β̂, and the standard deviation, σ 0 , is chosen to be twice the mean value, µ 0 . Second, the parameters a and b are estimated from the squared residuals between the nonlinear regression estimate and the real BC mass concentration data. This provides the residual mean, µ r , and its corresponding variance, σ r 2 . Using the properties of the inverse Gamma distribution, the parameter a can be estimated as a = (µ r /σ r ) 2 + 2, whereas the parameter b can be computed as b = µ r ((µ r /σ r ) 2 + 1), as proposed in Zaidan et al. [49,50]. These prior parameters are then used as starting values for the chosen initialization method when running the Markov Chain Monte Carlo (MCMC) algorithm.
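The moment-matching step above can be sketched directly (the function name is ours; the formulas for a and b are exactly those given in the text):

```python
import numpy as np

def inverse_gamma_hyperparameters(squared_residuals):
    """Moment-match an inverse-Gamma prior IG(a, b) for the noise variance
    to the mean (mu_r) and std (sigma_r) of the squared residuals."""
    mu_r = np.mean(squared_residuals)
    sigma_r = np.std(squared_residuals)
    ratio = (mu_r / sigma_r) ** 2
    a = ratio + 2
    b = mu_r * (ratio + 1)
    return a, b
```

As a sanity check, these formulas invert the inverse-Gamma moments: the prior mean b/(a − 1) recovers µ r and the prior variance b 2 /((a − 1) 2 (a − 2)) recovers σ r 2 .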

Likelihood Function
The likelihood for this model is the conditional probability of the observed data y given the inputs (X 1 and X 2 ) and the model parameters (β, σ 2 ). The likelihood also follows a Gaussian distribution and can be written as

p(y | X 1 , X 2 , β, σ 2 ) = N (y | f (X 1 , X 2 ; β), σ 2 ).

Posterior and Predictive Distributions
Using the likelihood function and the prior distribution, the posterior distribution can be computed via Bayes' theorem, shown in Equation (3), to give

p(β, σ 2 | y, X 1 , X 2 ) ∝ p(y | X 1 , X 2 , β, σ 2 ) p(β) p(σ 2 ).

Because the model is non-linear, exact inference is intractable. Hence, to estimate the posterior distributions, we resort to a sampling method, the No-U-Turn Sampler (NUTS). NUTS is an MCMC algorithm that closely resembles Hamiltonian Monte Carlo [51]. To initialize the NUTS sampler, Automatic Differentiation Variational Inference (ADVI) [52] is used first; instead of sampling the posterior, the parameters of a tractable distribution are fitted to match the posterior [53,54].
Once the posterior distributions have been estimated using the NUTS algorithm, the predictive distribution can be computed by generating data from the model using the posterior draws of the parameters. The implementation is done using PyMC3 [55].
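The inference above relies on PyMC3's ADVI-initialised NUTS sampler. As a dependency-free illustration of MCMC posterior sampling for a proxy that is non-linear in its coefficients, the sketch below uses a plain random-walk Metropolis sampler on an assumed model form y = β1 + β2·X1 + β3·exp(β4·X2) + ε. The exponential term is our stand-in for the unspecified non-linearity in Equation (2), and the weak N(0, 10²) priors replace the informative priors described above; none of this reproduces the paper's exact model or sampler.

```python
import numpy as np

def log_posterior(theta, x1, x2, y):
    """Unnormalised log posterior: Gaussian likelihood for the assumed
    non-linear proxy plus weak N(0, 10^2) priors on all parameters."""
    b1, b2, b3, b4, log_sigma = theta
    sigma = np.exp(log_sigma)
    mu = b1 + b2 * x1 + b3 * np.exp(b4 * x2)     # assumed proxy form
    log_lik = -0.5 * np.sum((y - mu) ** 2) / sigma ** 2 - len(y) * log_sigma
    log_prior = -0.5 * np.sum(theta ** 2) / 100.0
    return log_lik + log_prior

def metropolis(x1, x2, y, theta0, n_iter=8000, step=0.01, seed=0):
    """Random-walk Metropolis over theta = (b1, b2, b3, b4, log_sigma)."""
    rng = np.random.default_rng(seed)
    theta = np.asarray(theta0, dtype=float)
    lp = log_posterior(theta, x1, x2, y)
    samples = np.empty((n_iter, theta.size))
    for i in range(n_iter):
        proposal = theta + rng.normal(scale=step, size=theta.size)
        lp_prop = log_posterior(proposal, x1, x2, y)
        if np.log(rng.uniform()) < lp_prop - lp:  # accept/reject step
            theta, lp = proposal, lp_prop
        samples[i] = theta
    return samples
```

In keeping with the initialisation strategy in the text, the chain would be started from a preliminary nonlinear regression fit; posterior means and spreads are then read off the (post burn-in) samples.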

Bayesian Model: Black Box
Neural networks can be considered BB models since they provide little explanatory insight into the relative influence of the independent variables in the prediction process [56]. From a statistical perspective, neural networks are robust approximators for estimating real-valued (prediction) and discrete-valued (classification) target functions, because they can mimic the non-linearity of the functions and their learning methods are well developed [57]. Neural networks and their variants have been used in a large number of applications [58], including air pollution research [24]. To enable a fair comparison with the Bayesian WB approach, a Bayesian method is implemented here in a neural network, known as a Bayesian neural network (BNN) [45].
A neural network, f (X, β), can be viewed as a probabilistic model that follows a Gaussian distribution, given by

p(y | X, β, γ) = N (y | f (X, β), γ −1 ),

where X, β, and γ are the inputs, the neural network parameters (i.e., weights), and the precision of the Gaussian distribution, respectively. Equation (6) is also known as the likelihood function. In a Bayesian framework, a prior distribution needs to be assigned; in this case, the prior follows a Gaussian distribution with mean zero and precision α, given by

p(β | α) = N (β | 0, α −1 I).

Using the prior distribution and the likelihood function, the posterior distribution for the BNN can be computed via Bayes' theorem, shown in Equation (3), to give

p(β | y, X, α, γ) ∝ p(y | X, β, γ) p(β | α).

The inclusion of the prior distribution leads to regularization, which counters over-fitting [59,60]. Furthermore, the BNN provides a degree of belief in the estimated output, which can be used to assess the quality of the predictions. However, due to the non-linear dependence of f (X, β) on β, the posterior distribution calculation is intractable.
The first solution for estimating BNN posteriors was a Laplace approximation [61,62]. Solutions computing more accurate posterior distributions have since been developed, including variational inference [63], sampling-based variational inference [64], and expectation propagation [65]. Here, we adopt the recent solution based on automatic differentiation variational inference (ADVI) proposed by Kucukelbir et al. [52] and Blundell et al. [66]. This approach optimises the weights by minimising a compression cost, known as the variational free energy or the expected lower bound on the marginal likelihood.
The structural details of the BNN can be found in Hagan et al. [59], whereas the posterior-based Bayesian optimisation can be found in Blundell et al. [66]. As with the Bayesian implementation of the WB model, the BNN implementation also uses PyMC3 [55].

Results
This section discusses the BC proxy modelling process and explains the performance of the BC proxies.

Modelling Process
As discussed in Section 3.1, the PM 2.5 and PN 0.3−0.025 data sets are used as inputs to the BC proxy, whereas the BC data are assigned as the proxy's output. K-fold cross-validation is used to select the training and testing data. The cross-validation is repeated many times with different randomization in each repetition; this method is known as repeated k-fold cross-validation [67].

For the WB model, the proposed proxy structure, shown in Equation (1), is first established. The model coefficients β and the noise variance (σ 2 ) are then estimated using NUTS inference, as explained in Section 3.2. Multi-process sampling is performed on two processor cores simultaneously (i.e., 2 chains in 2 jobs). Figure 2 shows an example of the posterior distributions of our parameters and the individual samples drawn. These include all model parameters, namely β and σ. The colors blue and orange show the two samplings performed in parallel. The sampling chains for the individual parameters, shown in the subplots on the right-hand side, converge well and are stationary (e.g., no large drifts or other odd patterns). As mentioned previously, the predictive distributions can then be obtained by generating data from the models using the estimated posterior distributions.

For the BB model, the data need to be normalized first. In this way, the weight-input product is guaranteed to be small when the network weights are initialised to small random values, and the magnitudes of the weights have a consistent meaning [59]. The next step is to set up the many possible BNN structures, including different numbers of hidden layers, different activation functions on the hidden layers, such as the rectified linear unit (ReLU), sigmoid, and hyperbolic tangent (tanh) functions [68], and different numbers of neurons in each layer.
We start the training from the simplest BNN structure; the results are then validated using the testing data and the performance is recorded. The training and validation processes are performed iteratively, increasing the complexity of the BNN structure to find the best one. For the BB model, it is not practical to run an MCMC sampler, such as NUTS, because the sampling becomes very slow as the model is scaled up to deeper architectures with more layers and/or neurons. Instead, the ADVI variational inference algorithm is used, as mentioned in Section 3.3. ADVI is based on a mean-field approximation, so correlations in the posterior are neglected, with the advantage of being computationally much faster and scaling well to higher dimensions. A "brute-force" search is applied to find the best BNN structure using the performance metrics that are also used for model evaluation, as described in Section 4.2. The optimal BNN structure is found to be a single hidden layer of 100 neurons with a ReLU activation function.

Figure 3 shows the procedure for the WB and BB model developments. The database is established using the measurement campaign data, as explained in Section 2. The data are then divided into training and testing sets using repeated k-fold cross-validation. The parameter k is chosen between 2 and 10, and each process is repeated 50 times with different randomisation. The key differences between the WB and BB developments are the normalisation applied to the BB data sets and the search for the optimal BB structure. The approaches utilised are computationally demanding; therefore, these modelling processes are performed on a supercomputer cluster provided by CSC-IT Center for Science Ltd. [69].
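The data-splitting and BB-normalisation steps can be sketched with scikit-learn's RepeatedKFold. The fold counts below are illustrative (the paper varies k from 2 to 10 with 50 repetitions), and the toy feature matrix is our own; the key point is that normalisation statistics are computed from the training fold only, so no information leaks into the test fold.

```python
import numpy as np
from sklearn.model_selection import RepeatedKFold

X = np.arange(40, dtype=float).reshape(-1, 2)   # toy (PM2.5, PN) feature matrix
rkf = RepeatedKFold(n_splits=5, n_repeats=3, random_state=0)

n_partitions = 0
for train_idx, test_idx in rkf.split(X):
    # Normalise using training-fold statistics only (needed for the BB model).
    mu, sd = X[train_idx].mean(axis=0), X[train_idx].std(axis=0)
    X_train = (X[train_idx] - mu) / sd
    X_test = (X[test_idx] - mu) / sd
    n_partitions += 1

print(n_partitions)  # 5 folds x 3 repeats = 15 train/test partitions
```

Each of the 15 partitions yields one train/test evaluation; repeating with fresh randomisation averages out the luck of any single split.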

Performance Analysis
This subsection discusses the performance of the developed BC proxies. Figure 4 shows a fraction of the time-series results of the BC proxies tested on data measured in the Amman city center on 29 May 2014 from 21:44 to 23:42. The blue, red, and green lines represent the real BC measurement and the estimated BC quantities via the BB and WB proxies, respectively. The light red and light green areas are the 2σ and 3σ predicted uncertainties of the respective estimates. Even though the BC proxies do not fit the BC measurement data points perfectly, both BC estimates track the pattern of the BC measurement well. Furthermore, the 2σ uncertainty estimates cover nearly the entire region of the BC measurement. Nevertheless, the time series only displays a snapshot of a particular area at a particular time; it does not represent the overall performance of the proxy development results.

Three additional metrics are used to describe the overall quality of the BC proxy performance. The first metric is the mean absolute error (MAE). It has a simple interpretation as the average absolute difference between the predicted proxy values (ŷ) and the real measurement data points (y). The second metric is the root mean squared error (RMSE), which is also known as the standard deviation of the residuals (prediction errors). The third metric is the coefficient of determination, denoted by R 2 . It provides a measure of how well the observed outcomes are replicated by the proxy, based on the proportion of the total variation of outcomes explained by the proxy. A summary of these three performance metrics is shown in Table 2.

Table 2. The performance metrics used for evaluation of the developed BC proxies. The real measurement value, the mean of the measurement data points, and the predicted proxy value are symbolized by y, ȳ, and ŷ, respectively.
The notations i and n are the data point index and the total number of values predicted by the proxies, respectively.

Performance Metric | Formulation
MAE | (1/n) Σ i |y i − ŷ i |
RMSE | √[(1/n) Σ i (y i − ŷ i ) 2 ]
R 2 | 1 − [Σ i (y i − ŷ i ) 2 ] / [Σ i (y i − ȳ) 2 ]

The proposed BC proxies are implemented on the data obtained from the measurements performed across urban areas (i.e., Amman and Zarqa) and throughout Jordan (including urban areas). Repeated k-fold cross-validation is used to establish the training and test data sets. Using the performance metrics shown in Table 2, the performance of the proxies is then evaluated and the results are presented in Table 3. Lower values of MAE and RMSE indicate better proxy performance, whereas higher values of R 2 indicate better performance. In both cases and across different areas in Jordan, all performance metrics indicate that the BB model outperforms the WB model. However, the differences in the performance metric values are relatively small, indicating that the performance of the WB and BB models is similar. In this case, considering the complexity of setting up the BB model, as well as its non-transparency, it is worthwhile to utilise the developed WB model for estimating BC concentrations.

Table 3. Mean absolute error (MAE), root mean squared error (RMSE), and coefficient of determination (R 2 ) of the proposed BC proxies evaluated on BC measured in urban areas across Jordan.

The remaining analysis focuses on the WB model applied to the Jordan data. In addition to the three metrics, we present the WB model results as a regression plot and an error histogram (Figure 5). The regression plot displays the relationship between the proxy outputs (y-axis) and the real measurement data (x-axis) (Figure 5a). The error histogram shows the distribution of the differences between the estimated proxy values and the real measurement values (Figure 5b).
From both figures, it can be observed that the results are adequate: most estimated BC values lie close to the ground-truth line in the regression plot (with R 2 = 0.77), whereas the error histogram shows that the highest probability (i.e., the histogram peak) lies at zero. The latter means that the differences between the BC estimates and the BC measurements are mostly close to zero. These figures indicate that the developed WB proxy is adequate for field deployment under measurement conditions similar to those in urban areas of Jordan.
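The three metrics in Table 2 translate directly into code (a minimal sketch; the function name is ours):

```python
import numpy as np

def proxy_metrics(y, y_hat):
    """MAE, RMSE, and R^2 as defined in Table 2."""
    y, y_hat = np.asarray(y, float), np.asarray(y_hat, float)
    mae = np.mean(np.abs(y - y_hat))
    rmse = np.sqrt(np.mean((y - y_hat) ** 2))
    ss_res = np.sum((y - y_hat) ** 2)          # residual sum of squares
    ss_tot = np.sum((y - np.mean(y)) ** 2)     # total variation around the mean
    r2 = 1.0 - ss_res / ss_tot
    return mae, rmse, r2
```

Perfect predictions give (0, 0, 1); a proxy no better than predicting the mean of y gives R 2 = 0.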

Discussion
This section discusses the advantages of the developed proxy in terms of uncertainty analysis, its potential to be embedded in low-cost PM sensors, and the bias-variance trade-off in model choice.
Since the uncertainty associated with each prediction can be estimated directly from the (Bayesian) predictive distributions, we can quantify how many estimated BC data points fall within each degree of model uncertainty (standard deviation, symbolized as σ). Figure 6 shows a bar chart of the percentage of estimated BC data points that lie within σ to 3σ. About 81% and 96% of the estimated BC data points lie within one standard deviation (σ) for the WB and BB models, respectively. The remaining BB estimates (4%) lie within 2σ, whereas 16% and 3% of the WB estimates lie within 2σ and 3σ, respectively. In this case, the BB model is slightly better than the WB model due to its flexibility, which allows it to mimic the real BC measurements more closely. Nevertheless, both methods estimate all BC measurements to within 3σ.

It is also beneficial to embed the developed proxies into low-cost PM sensors. Low-cost sensor devices have become increasingly popular for filling the data gaps between air quality networks [70]. However, BC measurement with low-cost sensors is often absent due to the complexity of the measurement technique [71]. This motivates the integration of the developed proxy for estimating BC mass concentrations using inputs from low-cost PM sensors. In addition to trace gases, PM 2.5 is a standard measurement of many low-cost air quality sensors [47,72,73]. However, no low-cost technologies are yet available to monitor UFPs, such as PN 0.3−0.025 [74]. Therefore, the number of proxy inputs needs to be reduced to one by taking only PM 2.5 measurements. With the input PN 0.3−0.025 excluded, the model becomes

y = β 1 + β 2 X 1 + ε.

The above formulation simplifies the BC proxy to a linear regression model, which can also be treated through a Bayesian formulation.
The use of a linear model leads to an exact Bayesian inference computation, as described in Zaidan et al. [75]. Having an exact Bayesian solution also makes the proxy easy to embed in low-cost PM sensors. However, when applied to this case study, the performance degrades, as shown in Table 4. Nevertheless, low-cost PM sensors are typically not intended for scientific research; instead, they are often used to provide approximate air pollution information to the public. Therefore, the proxy embedded in low-cost PM sensors should be sufficient for scaling up air pollution information. The combination of air pollutant proxies with low-cost sensors and "real" instruments is promising because proxies are not affected directly by physical sensor failures and environmental conditions. However, proxy accuracy might drift over time, either because new measured data were not incorporated during proxy development (i.e., extrapolation) or because the low-cost sensors/instruments physically degrade. This phenomenon can be detected through uncertainty quantification from the proxy's Bayesian predictive distribution, and the proxy can then be re-trained and updated accordingly.
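A sketch of the exact posterior for the single-input linear proxy, assuming a conjugate Gaussian prior β ∼ N(0, τ²I) and a known noise variance σ²; the hyperparameter values, defaults, and function name are illustrative assumptions, not values from the paper:

```python
import numpy as np

def bayes_linear_posterior(x1, y, sigma2=0.01, tau2=100.0):
    """Closed-form posterior N(m, S) for beta = (b1, b2) in
    y = b1 + b2*x1 + eps, with prior beta ~ N(0, tau2*I) and
    known noise variance sigma2."""
    Phi = np.column_stack([np.ones_like(x1), x1])    # design matrix [1, x1]
    S_inv = Phi.T @ Phi / sigma2 + np.eye(2) / tau2  # posterior precision
    S = np.linalg.inv(S_inv)
    m = S @ (Phi.T @ y) / sigma2                     # posterior mean
    return m, S
```

Because no sampling is needed, this closed form is cheap enough to run on the microcontroller of a low-cost sensor; the predictive variance for a new input φ* is φ* S φ*ᵀ + σ², which supports the drift detection discussed above.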
In terms of the choice of proxy approach, consideration of WB or BB models also needs to take into account the relative predictive performance in terms of bias and variability (the bias-variance trade-off) [76]. An estimator is biased if the estimate is consistently higher or lower than the true value. Although WB models are easier to understand and fast to train, they are generally less flexible, which may result in estimates far from the true values (potential for high bias). On the other hand, although BB models can capture unexplained complexity and provide accurate estimates (low bias), the additional complexity may result in high variance [77]; that is, they can be very sensitive to small fluctuations in the training set. This phenomenon is often referred to as over-fitting. In this situation, estimates can also change considerably if different training data are used (leading to a lack of reproducibility). The main objective of any supervised machine learning algorithm is to obtain low bias and low variance, although there is often a trade-off involved. In our approaches, the variance associated with the estimates and predictions is minimised in the Bayesian models using (informative) prior information, which is often referred to as a form of parameter regularisation. In complex settings, and in particular with noisy data, there is an increasing need to impose some form of regularisation on the model. As the results show, the inclusion of this information does not come at a significant cost in terms of the parameter estimates, as the complex BB model provides performance similar to the WB model. Therefore, the Bayesian WB models are able to provide good performance with respect to both known issues (bias and variance).

Conclusions
This paper presents the development of BC proxies based on a Bayesian framework using WB and BB models. A considerable advantage of the Bayesian methods presented is the prevention of over-fitting and an explicit understanding of the uncertainty surrounding the BC mean prediction. The results demonstrate that the performance of the WB and BB proxies does not differ significantly. Both methods are evaluated on data obtained from a mobile air pollution measurement campaign in Jordan and give adequate results. Reasonable coefficients of determination (R 2 ) are achieved: 0.77 and 0.78 for the WB and BB proxies, respectively. The two methods estimate the BC mass concentration to within one standard deviation of the predictive distributions at 81% and 96% of data points for the WB and BB models, respectively, and all estimates lie within 3σ of the predictive distribution. Since both types of proxies provide similar performance, the WB proxy is more convenient to deploy in practice, as its model structure can be built relatively easily using known relationships in the data or expert information. The BB proxy also requires the data to be normalised, which is not the case for the WB model; the normalisation scaling factors may be difficult to determine when applied to new testing data.
Nevertheless, the proposed method may not always be deployable in practice, because PN measurements are not performed at every air pollution measurement station. As demonstrated in the discussion, the developed proxy based on the single input PM 2.5 shows reduced performance. A first future effort extending this work is to carry out more experiments in the same and/or other regions with additional measured variables. The inclusion of measurements of trace gases, radiation, and meteorological variables is expected to improve the robustness of the BC proxies, and other variables associated with BC emissions can then be investigated. For example, the proposed proxies can be improved by accommodating more input variables which are typically measured at most official air pollution stations and/or via low-cost sensors, such as NO x , which is known to correlate well with BC [78]. Furthermore, to improve the proxy performance, more advanced models will be developed that include temporal and/or spatial information for the BC mass concentration. For example, AutoRegressive (AR)-type models representing WB approaches, such as AutoRegressive with eXogenous inputs (ARX) models, can be developed further within Bayesian frameworks. This type of model is also analogous to dynamic neural networks, such as recurrent neural networks, representing BB approaches. Finally, to scale up the usage of the proposed proxy, PM 2.5 and other variables will be measured with low-cost sensors alongside the existing instruments for validation. The developed proxy will then be deployed taking input from low-cost sensor measurements. In this way, low-cost estimation of BC mass concentrations can be realised.

Abbreviations
The following abbreviations are used in this manuscript: