From a Low-Cost Air Quality Sensor Network to Decision Support Services: Steps towards Data Calibration and Service Development

Air pollution is a widespread problem due to its impact on both humans and the environment. Providing decision makers with artificial intelligence based solutions requires monitoring the ambient air quality accurately and in a timely manner, as AI models depend heavily on the underlying data used to justify their predictions. Unfortunately, in urban contexts air quality is hyper-local, varying from street to street, which makes it difficult to monitor with high-end sensors alone: the number of sensors needed for such local measurements makes the cost prohibitive. In addition, the development of pollution dispersion models is challenging. The deployment of a low-cost sensor network allows a denser coverage of a region, but at the cost of noisier sensing. This paper describes the development and deployment of a low-cost sensor network, discussing its challenges and applications; the work is strongly motivated by talks with the local municipality and by the exploration of new technologies to improve air quality related services. However, before data from these sources can be used, calibration procedures are needed to ensure an adequate level of data quality. We describe our steps towards developing calibration models and how they benefit the applications identified as important in the talks with the municipality.


Introduction
Good air quality in urban areas is essential for human well-being. Since 2008, when the European Union released the Ambient Air Quality Directive 2008/50/EC, which establishes health-based standards and objectives for pollutants present in the air, the assessment of outdoor air quality has been a focus for municipalities. Assessing and forecasting air quality is a crucial part of the strategies deployed to reduce air pollution and avoid poor air quality for citizens [1]. This highlights the importance of accurate air quality monitoring, which is typically achieved through the deployment of air quality sensor networks. Moreover, the air quality an individual experiences depends on local phenomena such as weather, wind, the layout of the city, and pollution sources, which motivates deploying a network of sensors across a city to collect data and use it for prediction models [2][3][4]. There is a balance to be struck when deploying such networks between the coverage of a region and deployment costs. A bulky, industrial sensor type is more accurate but more expensive, both in deployment and in maintenance, while lower-cost sensors reduce deployment costs by a large margin but deliver poorer data quality. However, the research community is actively working on methods, such as automatic calibration, to narrow this quality gap.

Air Quality Data Pipeline
The Norwegian University of Science and Technology (NTNU), Telenor and the Information Technologies Institute, Centre for Research and Technology Hellas (ITI-CERTH) have collaborated with the municipality of Trondheim (Norway) since August 2018 on the exploration of options for improved air quality services based on new technologies within the context of artificial intelligence (AI) and the Internet of Things (IoT). This collaboration is referred to as the AI4IoT (https://research.idi.ntnu.no/ai4eu/, accessed on 3 May 2021) pilot on air quality monitoring and ran within the framework of the AI4EU project, whose objective is to build a European AI on-demand platform. The pilot focused on technologies for data capture, advanced analyses and visualizations, and has addressed the different parts of a value chain for a complete solution (see Figure 1). The services and data provided for this pilot will be part of the European AI on-demand platform to showcase its use. Figure 1 provides an overview of how data are collected and processed in the air quality pilot. The pipeline consists of three components. First, the low-cost sensor network collects data and regularly sends them via an IoT gateway to an air quality server (AQ Server), which stores and adjusts the data based on the lab calibration parameters. Second, the AQ Server is the data source for various AI4EU platform services. Those services can be standalone or linked in a pipeline; they access data from the AQ Server and can then process them to create prediction or classification models and their visualizations. In this paper, we focus on supervised learning, but the pipeline is general enough to accommodate any other type of model. Third, these components can be accessed by decision-making tools that provide air quality information to citizens or decision makers.
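As a rough illustration of this flow, the ingestion and service stages can be sketched as below. The function names and the linear form of the lab calibration are our own assumptions for illustration, not the actual AI4IoT interfaces.

```python
# Minimal sketch of the pipeline: sensor payloads arrive via the IoT
# gateway, the AQ Server applies per-device lab calibration parameters,
# and a downstream service consumes the adjusted records.
# All names and the linear calibration form are illustrative assumptions.

def adjust_with_lab_calibration(raw_value, gain=1.0, offset=0.0):
    """Apply per-device lab calibration parameters (assumed linear)."""
    return gain * raw_value + offset

def aq_server_ingest(payloads, calibration):
    """Store adjusted measurements keyed by device id."""
    store = {}
    for device_id, raw in payloads:
        gain, offset = calibration.get(device_id, (1.0, 0.0))
        store.setdefault(device_id, []).append(
            adjust_with_lab_calibration(raw, gain, offset))
    return store

def service_mean(store, device_id):
    """A downstream service reading from the store, e.g. per-device mean."""
    values = store[device_id]
    return sum(values) / len(values)

payloads = [("lc-01", 10.0), ("lc-01", 14.0), ("lc-02", 7.0)]
calibration = {"lc-01": (1.1, 0.5)}   # hypothetical per-device parameters
store = aq_server_ingest(payloads, calibration)
```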

Application Domain: Air Quality Pilot in Trondheim
The municipality of Trondheim (Norway) has experienced periods of poor air quality, often as a result of high particle dust levels stemming from traffic. Air quality standards are defined in directives issued by the European Union on ambient air quality and cleaner air for Europe (2008/50/EC and 2004/107/EC). The specified air quality standards are based on existing research on the health effects of exposure to pollution components. Norway has agreed to follow these regulations, which for PM10 means that the threshold level of 50 µg/m³ (average over 24 h) should not be exceeded more than 30 times a year. In 2013, the Norwegian Environmental Agency ordered Trondheim municipality, Trøndelag county and the Norwegian Road Authorities (in their roles as pollution authorities and road owners) to collaborate on how to improve the situation, in the follow-up of Norway having been taken to court over high pollution levels (https://www.eftasurv.int/newsroom/updates/internal-market-norway-be-brought-court-over-air-pollution, accessed on 3 May 2021). Actions were taken in terms of regulations (e.g., related to traffic), but the municipality wants to further extend these measures with the use of intelligent decision support systems. As new research suggests that there are indications of health effects at lower exposure levels, the Norwegian health and local authorities have set even more ambitious targets. According to the annual report on air quality in the city of Trondheim [9]: "The air quality in Trondheim is satisfactory most of the year, but the particulate matter fraction PM10 is a problem from October to May due to the use of studded tires on ice-free roads. Particulate matter from transport to and from construction sites, landfills and quarries is also a problem locally. Nitrogen dioxide (NO2) from diesel engines is also a problem during winter. This problem is not related to peak values, but to annual mean concentration. Finally, during cold winters, burning of oil and wood for household heating can lead to high concentrations of PM2.5."
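The PM10 rule above can be checked mechanically once 24 h means are available. A minimal sketch, using a toy year of daily means rather than real measurements:

```python
# Count exceedance days against the limit described above: the PM10
# 24 h mean of 50 µg/m³ should not be exceeded more than 30 times a year.
PM10_DAILY_LIMIT = 50.0        # µg/m³, 24 h average
MAX_YEARLY_EXCEEDANCES = 30

def exceedance_days(daily_means):
    """Number of days whose 24 h mean PM10 exceeds the limit."""
    return sum(1 for v in daily_means if v > PM10_DAILY_LIMIT)

def limit_breached(daily_means):
    """True if the yearly exceedance budget is used up."""
    return exceedance_days(daily_means) > MAX_YEARLY_EXCEEDANCES

# Toy year: mostly clean air with a few winter dust episodes.
daily = [20.0] * 360 + [80.0, 65.0, 55.0, 90.0, 49.9]
```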
A use case is related to street cleaning, as the road owners in the area (Trondheim municipality, Trøndelag county and the Norwegian Road Authorities) are responsible for avoiding unhealthy levels of particle dust. Their available cleaning actions include washing, brushing and spraying, carried out by special vehicles. In the past, a regular schedule for cleaning was applied, but this has been changed to avoid unnecessary actions; currently, cleaning is based on a subjective anticipation of needs, i.e., on foreseeing high pollution levels. This problem is challenging and has so far been handled by operators of the services using their experience to assess the current situation and trends and "having a quick look at the weather forecast". Recently, the contract for cleaning actions has been put out to tender, and thus the experience of operators might vary as new contracts are made.
The use of early warning systems for public health is cited as an important use case for pollution measurements [10]. However, this is only one example of the wide range of important applications, beyond compliance with the national and international regulations that countries must follow, that arise from pollution-sensing networks. Asthmatic people living in crowded cities, parents deciding in which neighborhood to raise their children, or policy makers in public authorities are examples of important target groups that need to consider air quality levels in their decision making, either in their daily lives or in strategic planning. For instance, asthmatic people might want to be notified if high pollution levels are predicted for the next day, parents might decide where to buy a house depending on the average pollution at a location, and policy makers might have to decide whether to restrict traffic in certain areas or periods. Therefore, there are many groups that benefit from the data outputs of air quality sensor networks.
In the concrete case of our pilot, discussions with Trondheim municipality resulted in the development of several scenarios in which air quality data can be used by the municipality itself, but which also allow third parties to create novel services. One starting point for the municipality was the need to have good quality data, with the final objective of making it available to the general public. Furthermore, within the wide range of possible applications that use air quality data, the municipality showed a particular interest in scenarios involving: (i) the provision of warnings for increasing pollution levels to operators of cleaning services; (ii) decision-making support in the form of visualizations of measured and forecasted air quality levels for decision makers at the municipality; (iii) monitoring and forecasting services delivered through mobile apps for citizens, to plan their activities (short term) and to incentivize them to choose green transportation options; (iv) running what-if analyses to gain knowledge on the effects that regulating traffic patterns will have on pollution levels. In Section 4 we provide more details regarding the first option.

Low-Cost Sensor Network in Trondheim
The new network consists of a total of 25 new sensors, with 23 already deployed and 2 still to be placed. The already deployed sensors have been placed in strategic places in Trondheim, mostly schools and kindergartens. This choice was made by the municipality considering three factors (more info at https://sites.google.com/trondheim.kommune.no/kunnskapsdeling/luftkvalitet-engelsk, accessed on 3 May 2021): first, these sites are owned by the municipality and, therefore, there are fewer logistic restrictions on the installation of sensors; second, the users of such places are groups of particular risk (for instance, children); third, their relative location and the lack of data for a given area. Future deployment plans include mobile sensors, to be positioned on top of urban buses. Of all the low-cost sensors, two are co-located (in the context of this paper, we define co-location as the placement of low-cost sensor devices and industrial sensors at the Elgeseter and Torget stations as close to each other as practical considerations allow; the distance between air intakes is less than 2 m and, in the case of sensors at high-traffic sites, both face the main road) with reference air quality stations (Elgeseter and Torget), which are part of a national network of industrial sensors, in order to obtain ground truth data for analysis. In the Trondheim area there are five industrial sensors, owned either by the municipality or the road administration, and the Norwegian Institute for Air Research classifies them as either traffic or background stations. Under this classification, Torget is a background station while Elgeseter and all other sensors are traffic stations. Figure 2 shows the locations of the sensors in the network, highlighting the locations where low-cost sensors are co-located with reference sensors, while Figure 3a shows a photograph of the setup at Elgeseter, with a low-cost sensor mounted on a reference sensor.
All sensors were installed with appropriate casing, as shown in Figure 4. The sensor board was designed with the ability to detect chemical pollutants (nitric oxide (NO), nitrogen dioxide (NO2) and ozone (O3)) and particulate matter (PM1, PM2.5 and PM10). For gas sensing, the following Alphasense sensors are used: NO-A4 for NO [11], NO2-A43F for NO2 [12] and OX-A431 for O3 [13], with an Alphasense 3-sensor AFE support-circuit board [14], while PM is measured with an Alphasense OPC-N3 [15]. Each board includes an Amphenol ChipCap 2 temperature and humidity sensor [16] and an OriginGPS Hornet ORG1510-MK05 GPS module [17]. In this paper we deal with PM measurements; therefore, we detail in Table 1 the technical specification of the Alphasense OPC-N3 sensor, while we refer to the respective datasheets for all other components. The global cost of deployment for this low-cost network is estimated to be around NOK (Norwegian kroner) 10,000-20,000 (∼EUR 1000-2000) for components and installation, per device. For the industrial reference sensors the estimated cost is around NOK 400,000-800,000 (∼EUR 40,000-80,000) per device, making them around 40 times more expensive than the low-cost sensors. The operating infrastructure cost is similar for both kinds of sensors. Our low-cost sensor network communicates through NB-IoT, with a yearly subscription costing NOK 100 (∼EUR 10) per device. Each sensor communicates independently with the server, and 4G and 5G networks have been carefully designed to handle massive numbers of IoT devices communicating over the NB-IoT protocol. It should also be noted that the operational costs for industrial reference sensors are substantial, as these sensors are subject to strict routines of maintenance and calibration to ensure high quality data.
A general challenge with chemical sensors is stability, since their performance and sensitivity degrade over time, which in turn limits the operating time before re-calibration is required [18]; calibration procedures based on machine learning methods are, however, also possible [19]. Nonetheless, in this paper we focus on particulate matter (PM) detection, as it is the pollutant most within the control of the local municipality. The OPC-N3 sensor measures particle counts in 24 bins from 0.35 to 40 µm by illuminating one particle at a time with a laser and measuring the intensity of the scattered light. The amount of scattered light is a function of the particle size, which is calibrated using a proprietary algorithm from Alphasense. A new setting (increasing the time the fan in the devices is on for each measurement) might result in a somewhat better correlation with the reference sensors (at Torget and Elgeseter). This setting has recently been applied to the rest of the sensors as well. There are interesting lessons to be learned from this case regarding challenges encountered when putting IoT sensors to work in harsh conditions; among others, covers used to prevent humidity and salt from reaching the electronics also make it harder to obtain the needed airflow. This can be solved by spending more power (longer fan-on time), which in turn poses a new problem if the device does not have a fixed power connection. Another example of challenges in harsh conditions is the need to put the sensors out of reach of people and thereby in sub-optimal locations for the measurement of air quality (Figure 3b), as there can be huge variations from street level to high up on a building wall.
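For intuition only, the generic optical-particle-counter principle of turning per-bin counts into a mass concentration can be sketched as below. This assumes spherical particles of a fixed density and uses made-up bin edges; it is emphatically not Alphasense's proprietary algorithm.

```python
import math

# Illustrative only: sum per-bin mass as count × volume of a sphere with
# the bin's mean diameter × an assumed particle density. Bin edges and
# counts below are made up; the OPC-N3 uses a proprietary conversion.
DENSITY = 1.65e3          # kg/m³, assumed particle density

def bin_mass_ugm3(counts_per_m3, bin_edges_um):
    """Mass concentration (µg/m³) from per-bin particle counts."""
    total = 0.0
    for count, (lo, hi) in zip(counts_per_m3, bin_edges_um):
        d = 0.5 * (lo + hi) * 1e-6           # mean diameter in metres
        volume = math.pi / 6.0 * d ** 3      # sphere volume, m³
        total += count * volume * DENSITY    # running mass, kg/m³
    return total * 1e9                       # kg/m³ → µg/m³

edges = [(0.35, 0.46), (0.46, 0.66), (0.66, 1.0)]   # hypothetical bins, µm
counts = [2.0e7, 1.0e7, 5.0e6]                      # particles per m³
pm_estimate = bin_mass_ugm3(counts, edges)
```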
The deployment of the low-cost sensor network was finalized in late June 2020, with data collection starting at that point. Initial analysis of data incoming from the reference locations showed that the deployed sensors underestimated the pollution levels, with only a weak correlation with the reference sensors. The OPC-N3 sensor has some tunable parameters, the most important for data quality being the sampling interval. After deployment, several configurations were tested on one sensor, with the final configuration being set to a sampling interval of 1 min. The whole network was set to the same parameter configuration on 12 November 2020. Therefore, throughout this paper we limit our analysis to data received in the period between 15 November 2020 and 23 February 2021, with data being sent continuously 24 h per day, seven days per week. We selected this cutoff date to prepare a fixed dataset for the several components of this paper, but the network is continuously up and running.
Previous studies on the quality of Alphasense low-cost sensors, both in laboratory conditions [20] and outdoors [21,22], reported that they tend to overestimate the real PM concentration, with a larger number of outliers under high humidity conditions [21,22]. Furthermore, the influence of meteorological conditions was shown to be important for different types of low-cost sensors [23,24]. The measurements from our network, by contrast, showed a large underestimation of pollution levels. Table 2 shows the summary statistics for the pollutant datasets, both from low-cost sensors and their respective references, together with correlations between low-cost and reference sensors. Figure 5 shows the analysis on our data, comparing PM2.5 and PM10 measurements from a low-cost sensor and a reference sensor, colored according to the relative humidity. In accordance with the cited literature, we observe that low-cost sensor measurements are affected by external factors, such as humidity and possibly other meteorological factors (temperature, air pressure, etc.). This, together with a low correlation between low-cost and reference sensors, led us to investigate automatic calibration models that take external features into account.

Datasets
Due to the deployment process previously described, we had low-cost sensor data available from 15 November 2020 to 23 February 2021. The sensors are connected via Narrowband IoT to an air quality server backend, which provides data to end users as a SQLite database. For the moment, these data are not yet publicly available. As for the reference sensors, data are fetched through a public API (https://api.nilu.no/, accessed on 3 May 2021) of the Norwegian Institute for Air Research (NILU). For a fair comparison, data from low-cost sensors are aggregated into hourly averages, to be in the same format as the available data for the reference sensors.
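The hourly aggregation step can be sketched with pandas; the column names used here ("timestamp", "pm25", "pm10") are illustrative placeholders, not the actual database schema.

```python
import pandas as pd

# Aggregate raw low-cost readings (1 min sampling interval) to hourly
# means so they match the hourly format of the NILU reference data.
raw = pd.DataFrame({
    "timestamp": pd.date_range("2020-11-15 10:00", periods=120, freq="min"),
    "pm25": [5.0] * 60 + [7.0] * 60,    # toy values: one level per hour
    "pm10": [12.0] * 60 + [16.0] * 60,
})

hourly = (raw.set_index("timestamp")
             .resample("1h")
             .mean())
```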
As additional inputs we use weather and traffic data. As mentioned, field evaluation studies have shown that the performance of low-cost sensors is influenced by weather, with humidity mentioned most often. We have two sources of weather data. First, the Norwegian Meteorological Institute has a public API (https://frost.met.no/index.html, accessed on 3 May 2021) with historical and real-time information about weather measurements from different stations across the country. Several stations are available in Trondheim but, unfortunately, most lack measurements for some of the most important weather elements. Therefore, we chose to use data from a single station (Voll), which is the most complete, as representative of the weather conditions; the station is located less than 5 km from the reference locations. Data from this source consist of temperature, relative humidity, precipitation, air pressure, wind speed and wind direction. Additionally, the OPC-N3 sensor itself provides measurements of temperature and humidity, which might be much more useful as local measurements. These measurements are, however, taken inside the box where the sensors and boards are housed, which means that their magnitudes are not directly comparable with temperatures measured outside. Nonetheless, we observed a good correlation between these measurements and the ones from the Meteorological Institute (0.96 for both low-cost sensors), indicating that we might rely on them for remotely located sensors. However, the rate of faulty measurements is also higher for these onboard readings; the temperature reading in particular can have a faulty-measurement rate of up to 25%. We will elaborate on this and test in practice what difference it makes when presenting results for the automatic calibration models.
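The sanity check described above, masking implausible onboard readings and then correlating the remainder with official station data, can be sketched as follows; the plausibility thresholds are our own illustrative choices.

```python
import math

# Drop obviously faulty onboard readings, then compute the Pearson
# correlation of the surviving pairs against the official station.
def drop_faulty(values, lo=-40.0, hi=60.0):
    """Mask readings outside a plausible range (None marks a faulty value)."""
    return [v if v is not None and lo <= v <= hi else None for v in values]

def pearson(xs, ys):
    """Pearson correlation over pairs where both values are present."""
    pairs = [(x, y) for x, y in zip(xs, ys) if x is not None and y is not None]
    n = len(pairs)
    mx = sum(x for x, _ in pairs) / n
    my = sum(y for _, y in pairs) / n
    cov = sum((x - mx) * (y - my) for x, y in pairs)
    sx = math.sqrt(sum((x - mx) ** 2 for x, _ in pairs))
    sy = math.sqrt(sum((y - my) ** 2 for _, y in pairs))
    return cov / (sx * sy)

onboard = [3.1, 2.8, 199.0, 1.5, None, 0.9]   # 199.0 is a faulty spike
official = [2.0, 1.9, 1.6, 0.8, 0.7, 0.2]
r = pearson(drop_faulty(onboard), official)
```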
For traffic, the Norwegian Public Roads Administration operates a variety of induction-loop sensors located at relevant entry points of the city of Trondheim. Counts of vehicles passing each sensor are reported on an hourly basis. These data are available through a public API (https://www.vegvesen.no/trafikkdata/start/, accessed on 3 May 2021), and we include in our dataset traffic counts from the three traffic sensors closest to the reference sensors.
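Assembling these sources into a single hourly feature table then amounts to a join on timestamps; the column names below are placeholders, not the real API field names.

```python
import pandas as pd

# Join pollutant, weather and traffic series on a shared hourly index
# to form one row per hour, ready for model training.
idx = pd.date_range("2020-11-15 00:00", periods=4, freq="h")
pollutants = pd.DataFrame({"pm10_lowcost": [10.0, 12.0, 15.0, 11.0]}, index=idx)
weather = pd.DataFrame({"humidity": [80, 85, 90, 70],
                        "temperature": [-2.0, -1.5, -1.0, -0.5]}, index=idx)
traffic = pd.DataFrame({"vehicle_count": [120, 340, 560, 410]}, index=idx)

dataset = pollutants.join([weather, traffic])   # one row per hour
```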

Method
Training models to correct low-cost sensor data against a reference has proven to produce good results [25]. To test the accuracy of such approaches in our setup, we trained a random forest regressor with different numbers of input features, in both locations with a reference sensor. Random forests have shown good results in the calibration of other types of low-cost sensors [25,26], and we thus selected them as the method for the calibration procedure. In this approach, the input is the measurement of the low-cost sensor plus a number of additional context features, and the target is the measurement from the reference sensor. We used the Elgeseter sensor for training and initial testing, and then also tested the trained calibration model with data from the Torget sensor. For all sensors we evaluated the performance with the root mean squared error (RMSE) and the coefficient of determination (R²).
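A minimal sketch of this regression setup, with synthetic data standing in for our sensor measurements (the real pipeline uses the features described in the Datasets section):

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_squared_error, r2_score

# Synthetic stand-in: the low-cost reading underestimates the reference
# with a humidity-dependent error, mimicking the behavior we observed.
rng = np.random.default_rng(0)
n = 500
reference = rng.uniform(5, 60, n)                 # target: reference PM
humidity = rng.uniform(30, 95, n)
temperature = rng.uniform(-10, 15, n)
lowcost = 0.5 * reference - 0.05 * humidity + rng.normal(0, 1, n)

# Input = low-cost reading plus context features; target = reference.
X = np.column_stack([lowcost, humidity, temperature])
X_train, X_test = X[:400], X[400:]
y_train, y_test = reference[:400], reference[400:]

model = RandomForestRegressor(n_estimators=100, random_state=0)
model.fit(X_train, y_train)
pred = model.predict(X_test)
rmse = mean_squared_error(y_test, pred) ** 0.5
r2 = r2_score(y_test, pred)
```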
Additionally, we tested a different approach that attempts to classify the current real pollution level instead of predicting the exact value. This approach is based on the observation that, often, one might not be interested in the exact value but rather in the air quality level in the surroundings of a sensor. For instance, this could be used as a means to build target datasets for warning systems or to provide more structured information to end users, through map visualizations in mobile apps. Therefore, we trained a random forest classifier with the same setup as described for the regressor. As performance metrics we evaluated recall, precision and the area under the receiver operating characteristic curve (AUC).
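The classification variant differs only in discretizing the reference measurement into levels; a sketch with synthetic data and a single illustrative threshold (the actual level boundaries are not ours to define here):

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import precision_score, recall_score, roc_auc_score

# Same synthetic setup as the regressor, but the target is the reference
# level discretized into two classes (above/below a threshold).
rng = np.random.default_rng(1)
n = 600
reference = rng.uniform(5, 60, n)
humidity = rng.uniform(30, 95, n)
lowcost = 0.5 * reference - 0.05 * humidity + rng.normal(0, 1, n)

labels = (reference > 30).astype(int)     # 1 = "high pollution" level
X = np.column_stack([lowcost, humidity])
X_train, X_test = X[:480], X[480:]
y_train, y_test = labels[:480], labels[480:]

clf = RandomForestClassifier(n_estimators=100, random_state=0)
clf.fit(X_train, y_train)
pred = clf.predict(X_test)
auc = roc_auc_score(y_test, clf.predict_proba(X_test)[:, 1])
```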
Finally, we calibrated the whole dataset for both sensors (low-cost sensors at Torget and Elgeseter) and recalculated the correlation against the reference sensors. In this case, we used the model tested with data from the Elgeseter sensor and used the same model to calibrate data from the Torget sensor. Table 3a shows the results obtained for sensor calibration with regression. We note that the calibration model tended to overfit when using only pollutant data as input, with a good training score but poor testing scores. This can be explained by the limited amount of data, combined with pollutant data alone not being enough for the algorithm to find calibration patterns. On the other hand, test scores improved as we included more input features; in particular, using weather data and all available measures for the different particulate matter sizes seemed to make most of the difference. Regarding weather features, we tested different combinations as input to the model, using both meteorological data from the Meteorological Institute and data obtained by the OPC-N3 sensor, which measures local temperature and humidity. As mentioned, using weather features significantly improved the model, with better results when using official weather data than when using OPC-N3 weather data. Although official weather data are measured at a different location in the city, they are more accurate than the OPC-N3 measurements. Moreover, the high rate of faulty measurements for the OPC-N3 temperature strongly influenced the training process, especially in the presence of limited training data. In turn, Table 3b presents results for the calibration procedure with a classification model, i.e., with the goal of assigning air quality levels to the low-cost sensor data. In this case too, the influence of weather features on the model performance is noticeable, although using all pollutant measures did not seem to have as much influence as in the regression model.
Moreover, classification of PM2.5 levels resulted in better performance than for PM10. In particular, the model for PM10 seems to have had difficulties in maintaining the balance between false positives and false negatives, as we can see by looking at the recall and precision values; only when combining PM10 and weather inputs are both metrics above 0.5.

Results
Generally, the calibration procedure produced better results for PM2.5 than for PM10. For the regression model this can be partially explained by the fact that PM2.5 values are lower than PM10 values, therefore yielding a lower RMSE. However, even the classification model, whose performance should not depend on the magnitude of the time series, obtained better results with PM2.5. This result is in line with the observation that the initial correlation between the low-cost and reference sensors was worse for PM10 than for PM2.5, which indicates worse data quality for this measurement. Additionally, evaluation studies showed PM10 measurements with OPC sensors to have a higher variance and to be more affected by outliers.
After analyzing the performance of the calibration models, we proceeded to calibrate the whole dataset with the trained model. Table 4 shows an updated analysis of the data statistics, now with the calibrated low-cost sensor dataset. The calibration model used is the one with PM1, PM2.5, PM10 and weather as inputs. The results show the effectiveness of the calibration procedure, with a significant increase in the correlation scores. It is particularly interesting to note that the correlation for the Torget sensor also significantly improved, even though the model was trained on data from Elgeseter and then transferred. Moreover, the effect of external features, such as humidity, was reduced by the calibration procedure, as seen in Figure 6, which replicates the analysis of Figure 5 using calibrated data. The magnitudes of the measurements from the low-cost sensors are now much more in line with the references.

Method
With this method, we tested the feasibility of extrapolating pollutant values to different locations in the Trondheim area from the available low-cost sensor data. We trained a random forest regression model that takes as input the pollutant values of each low-cost sensor and predicts PM2.5 and PM10 concentrations for a target location. The available industrial sensors provide us with accurate measurements of pollutants for five different geographical locations, which are used as ground truth data. The network of sensors used in our model consists of 23 low-cost sensors and 5 reference sensors, whose geographical locations can be seen in Figure 2. Apart from the main pollutants, the low-cost sensors also measure humidity and temperature levels, allowing us to model cross-sensitivities between measurements and weather conditions. To account for the relative distance between low-cost sensors and target sensors, we calculated the Haversine distance of every low-cost sensor from the target and added it as a feature in the dataset.
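The Haversine distance feature is computed directly from sensor coordinates; the example coordinates below are approximate central-Trondheim points, for illustration only.

```python
import math

# Haversine great-circle distance, used as the relative-distance feature
# between each low-cost sensor and the target location.
def haversine_km(lat1, lon1, lat2, lon2):
    """Distance in kilometres between two (lat, lon) points in degrees."""
    r = 6371.0                                   # mean Earth radius, km
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dp = math.radians(lat2 - lat1)
    dl = math.radians(lon2 - lon1)
    a = (math.sin(dp / 2) ** 2
         + math.cos(p1) * math.cos(p2) * math.sin(dl / 2) ** 2)
    return 2 * r * math.asin(math.sqrt(a))

# Approximate central-Trondheim coordinates (illustrative, not sensor sites).
d = haversine_km(63.4305, 10.3951, 63.4167, 10.4036)
```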
Meteorological and traffic data for the area were available but were not included in the dataset for this testing scenario. Our motivation was to test whether the network of low-cost sensors is sufficient to achieve good results without the use of external data sources. Since meteorological stations and traffic sensors are limited to a few specific locations, we would have had to limit our testing to those areas in order to avoid a bias in favor of low-cost sensors located next to them, compared to low-cost sensors that are further away. Moreover, the calibration process of the low-cost sensors can help minimize the interference of unknown variables with our measurements.
Various regression algorithms, such as the Support Vector Regressor, XGBoost and the Random Forest Regressor, were evaluated during the testing phase, with the Random Forest Regressor returning slightly better results. An additional advantage of the random forest algorithm is its ability to rank the most important features of the dataset, which in turn assists in the interpretation of the predictions, the detection of high-impact areas of pollution, and the comparison of cross-correlations between air pollutants. The average value of the three closest low-cost sensors and the value of the single closest low-cost sensor are used as performance baselines for our model predictions. A visual estimation of the proximity of adjacent low-cost sensors to the reference sensors is provided in Figure 7, which shows the adjacent low-cost sensors used in the baseline models for every reference sensor (red: reference sensors; green: two of the three closest adjacent low-cost sensors used in the average-value baseline model; yellow: the third and closest adjacent low-cost sensor, used in both the average-value and closest-sensor baseline models; red/yellow: in the Torget and Elgeseter areas there is a low-cost sensor next to the reference sensor, so a dual coloring marks both).
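The two baselines can be sketched directly from (distance, value) pairs; the data here are illustrative.

```python
# Distance-based baselines: the reading of the closest low-cost sensor,
# and the mean of the three closest, as described above.
def closest_baseline(sensors):
    """Value of the sensor with the smallest distance to the target."""
    return min(sensors, key=lambda s: s[0])[1]

def mean_of_k_closest(sensors, k=3):
    """Mean value of the k sensors closest to the target."""
    nearest = sorted(sensors, key=lambda s: s[0])[:k]
    return sum(v for _, v in nearest) / len(nearest)

# Toy (distance_km, pm_value) pairs for four low-cost sensors.
sensors = [(0.4, 18.0), (1.2, 22.0), (2.5, 30.0), (5.0, 55.0)]
```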

Results
The test set results of our model are shown in Table 5. For both pollutants, the regression model achieved a low root mean squared error (RMSE) and a high coefficient of determination (R²) compared to the baseline values. In particular, the negative R² value of the closest-sensor baseline highlights the extreme fluctuations of pollutant concentration due to external factors, once distance is accounted for. By contrast, the model achieved R² values higher than 0.6 for both pollutants, capturing the cross-correlations among the features sufficiently well.
There are certain limitations of the model that have to be taken into account. Only four months of low-cost sensor data were available since the last calibration in November, so they do not contain enough information about seasonality and yearly trends. Moreover, the five reference sensors cover a small part of the Trondheim area, offering limited ground truth data, which in turn affects the generalization of the model. Additional online time of the low-cost sensors and further improvements in their calibration will allow a better model to be trained in the future.
Against the background described in Section 2, the environmental unit in Trondheim municipality supported the idea of exploring the possible development of a warning system for increased particle dust levels based on historical data including pollutants, weather and traffic. The main users of the system would be the operators of the cleaning services, but people at the environmental unit of the municipality might also want to use the system for follow-up of contracts in their role as pollution authorities. The initial requirements for the system (identified in collaboration with the municipality) were to:
1. Offer warnings for situations with high levels of particle dust, 24 h ahead of time;
2. Describe the quality of the warnings to the users in an understandable format;
3. Provide an indication of which features the warnings have been based on.
We address this problem as a classification problem with two classes based on the characteristics of a state, i.e., selected features describing the current situation:
• Class 1: states that are followed by high pollution levels (i.e., above a given threshold) for at least one of the following 24 hours;
• Class 2: states that are not followed by high pollution levels.
Then, if a state is identified as Class 1, a warning will be issued to the operators of the street cleaning service.
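Constructing the two-class target from an hourly PM series can be sketched as follows; the 24 h horizon follows the definition above, while the threshold value here is a placeholder.

```python
# Build the two-class target: a state (hour) is Class 1 if PM exceeds
# the threshold in at least one of the following 24 hours.
def warning_labels(hourly_pm, threshold=50.0, horizon=24):
    labels = []
    for i in range(len(hourly_pm) - horizon):
        window = hourly_pm[i + 1 : i + 1 + horizon]   # next `horizon` hours
        labels.append(1 if any(v > threshold for v in window) else 0)
    return labels

# Toy series: flat background with one pollution spike at hour 30.
pm = [10.0] * 48
pm[30] = 80.0
labels = warning_labels(pm, threshold=50.0, horizon=24)
```

All states whose 24 h lookahead window contains the spike become Class 1; the rest remain Class 2.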
Calibrated data might be beneficial for the success of the application. First, as previously mentioned, low-cost sensor data correlate better with a reference sensor after calibration. Moreover, if no reference sensors are available close to the low-cost sensors, then calibrating their data and developing warning systems based on them might be the only available option. Therefore, we tested a warning system in a location with both a low-cost and a reference sensor, comparing two different models that receive input data from each sensor separately. As performance metrics, we used recall, precision and the area under the receiver operating characteristic curve (AUC). Table 6 shows the results for the test set. Again, we had limited data available due to the deployment process of the low-cost sensor network, but they provided insight into the feasibility of using low-cost sensor data for this kind of warning system in comparison with reference data. We note that the behavior of the models was similar for both sources of pollution data. The warning model for PM2.5 seemed to be able to find more positive Class 1 samples than the one for PM10. Given the limited period of data available and the unbalanced nature of such datasets, further tests and data might be needed before deciding on a deployment of such models, but we believe that this indicates that using low-cost sensor data for warning systems at different locations in the city is feasible.

Visualizations
We used a number of visual analytics techniques to present the low-cost sensor data to the viewer, to allow for a deeper understanding of the data and the possible discovery of patterns. We have focused on presenting the spatio-temporal distribution of all types of collected measurements and the pollution associations between geographical areas.
In order to show the spatio-temporal distribution of the collected measurements, we used a map view, locating all the sensors using their geographical coordinates, as can be seen in the left panel of Figure 8. To show as much information as possible on the map, we used star-like glyphs to display the information collected by each sensor at a specific time instance. Glyphs in general are compact representations that visualize multi-dimensional data by mapping multiple dimensions onto multiple characteristics of a shape, such as a polygon, an arrow, etc. [27]. In this paper, we used star-like glyphs, a form of polygon for representing several attributes at once [28]. The distance of each star tip to the center of the star is proportional to the value of a specific type of measurement. In the example shown in Figure 8, three types of measurement are shown, namely PM10, PM2.5 and NO2. Each type of measurement corresponds to a specific angle around the center of the star glyph, as shown in Figure 9. The overall shape of the glyph is therefore characteristic of the vector of measurements collected by each sensor, allowing comparisons among sensors and geographical areas. The specific values of each measurement can be viewed by hovering over each glyph. The color of the glyphs shows the level of pollution (green: low; yellow: medium; orange: high; purple: very high), relative to the extent of the overall measurements and adhering to the color code used in the official air pollution forecasting service offered by NEA. The timeline and slider at the bottom of the map allow the user to explore different time instances. Moving the slider back and forth, the user can view how the shapes of the glyphs change with time. This type of interaction allows temporal patterns or peculiarities to be detected by the user, along with the spatial patterns on the map.
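The glyph construction described above can be sketched as follows: each measurement type is assigned a fixed angle around the glyph center, and the tip distance is the value normalized by an assumed per-pollutant maximum. The function name, the example reading and the maxima are illustrative assumptions, not values from the paper.

```python
import math

def star_glyph_tips(values, max_values, radius=1.0):
    """Map a vector of measurements to star-tip (x, y) coordinates.
    Each measurement type gets a fixed angle around the glyph center;
    the tip distance is proportional to the normalized value.
    (Hypothetical sketch of the glyph geometry described in the text.)"""
    n = len(values)
    tips = []
    for i, (v, vmax) in enumerate(zip(values, max_values)):
        angle = 2 * math.pi * i / n      # fixed angle per measurement type
        r = radius * v / vmax            # tip distance proportional to value
        tips.append((r * math.cos(angle), r * math.sin(angle)))
    return tips

# Hypothetical sensor reading (PM10, PM2.5, NO2 in µg/m³) with assumed maxima.
tips = star_glyph_tips([40.0, 15.0, 30.0], [100.0, 50.0, 60.0])
```

Connecting consecutive tips (and closing the polygon) yields the star shape drawn on the map; identical readings at two sensors produce identical shapes, which is what makes glyphs comparable across locations.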
Temporal patterns such as diurnal fluctuations in pollutants between day and night, as well as "waves" of pollution moving from south to north, can be observed using this kind of interaction.
Associations between the sensors are visualized using a graph-based view, as shown in the right panel of Figure 8. Graphs have proven well suited in the literature for visualizing associations between entities [29]. Here, each node of the graph corresponds to a sensor, and the graph is a k-nearest-neighbor graph among the sensors: an edge is placed between two sensors if they are similar with respect to their history of measurements over a specified time window (indicated by a green shade in the timeline view). This type of visualization provides another map of the sensors, this time in the feature space rather than the geographic space. As the history of measurements changes with time, so do the associations between nodes and the overall structure of the graph. This can be seen interactively by using the same slider used for the map view. The user can further select groups of nodes on the graph to highlight the corresponding points on the map, as shown in Figure 8.
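A minimal sketch of such a k-nearest-neighbor graph: each sensor's measurement history over the time window is treated as a vector, and each sensor is linked to the k sensors whose histories are closest in Euclidean distance. The function name and the toy histories are our own assumptions; the paper does not specify the distance measure, so Euclidean distance is used here as a plausible choice.

```python
def knn_graph(histories, k=2):
    """Build an undirected k-nearest-neighbor graph over sensors, where each
    sensor's node is connected to the k sensors with the closest measurement
    history (Euclidean distance over the time window)."""
    def dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5
    edges = set()
    for i, hist in enumerate(histories):
        neighbors = sorted((j for j in range(len(histories)) if j != i),
                           key=lambda j: dist(hist, histories[j]))
        for j in neighbors[:k]:
            edges.add((min(i, j), max(i, j)))  # store edges undirected
    return edges

# Hypothetical windows of hourly PM10 values for four sensors: two urban pairs
# with clearly different pollution regimes.
histories = [[10, 12, 11], [11, 13, 12], [40, 42, 41], [41, 40, 43]]
edges = knn_graph(histories, k=1)
```

In this toy example, the two low-pollution sensors link to each other and so do the two high-pollution ones, regardless of where they sit on the geographic map; this is exactly the feature-space grouping the graph view exposes.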
The combined map and graph views can potentially reveal interesting patterns. An example is shown in Figure 10. Figure 10a depicts a particular snapshot in time, where the sizes and shapes of the sensor glyphs indicate low levels of pollution. The graph view for this instance reveals some groups of similar sensors in the feature space. These groups have been formed by taking into account not only the measurements of this instance but also measurements of previous time instances within a short time window. In this example, the user has selected the rightmost group of graph nodes, and the corresponding sensors are highlighted in red in the map view. Most of these sensors are also located near each other on the map, showing that feature similarity is related to geographic proximity. However, the group also contains a sensor far to the south, which is similar to the others only in the feature space, i.e., in its temporal behavior. Figure 10b shows a snapshot taken a few hours later. The previously selected sensors behave similarly, measuring high pollution concentrations, in contrast to the non-selected ones, the majority of which stay at low levels. This change in behavior also appears in the far southern sensor, despite its distance from the rest of the group. This example shows that sensor clustering in the feature space can reveal associations between sensors that go beyond geographic proximity and that can be exploited for understanding pollution patterns and making predictions.

Conclusions
This paper discusses the challenges in developing innovative applications and services to improve air quality monitoring and decision support for the municipality of Trondheim, Norway. Air pollution is a highly local phenomenon, but monitoring it can be quite expensive, with costs including, for example, air quality sensors, calibration procedures and measurement communication. To address this challenge, a network of low-cost sensors was deployed throughout the city to complement the (much smaller) network of industrial sensors that was previously available. Low-cost sensors drastically reduce the cost of deploying such networks, but at the expense of noisier and often faulty measurements; additional processing is therefore often needed to ensure good data quality further down the service pipeline.
In our deployment, we found the behavior of the low-cost sensors to differ from that reported in publicly available field studies. Contrary to those, our sensors tended to underestimate pollution levels when compared against industrial reference sensors at places where both sensor types are co-located. However, we showed how machine learning based procedures can calibrate the data coming from low-cost sensors, yielding a large increase in post-calibration correlation against the references and capturing the effects of external features, such as humidity and other meteorological influences, on the sensors' output.
Calibrating pollution data is not an end in itself, but a means to produce meaningful inputs to services that can be used by the municipality, and citizens in general, for decision-making support. We illustrated two applications that benefit from the calibration procedure and showed how they can fulfill the municipality's desire for real-time services: a warning system predicting pollution peaks in the next 24 h, and visualization techniques exposing spatial and temporal patterns in the measurements.
In the future, we plan to continue working with the municipality to develop new services and functionalities of interest to decision makers and the general public. In particular, initial steps were taken towards the development of a mobile app that uses data from the sensor network, combined with some of the visualization techniques presented here, to raise awareness among citizens and promote healthier lifestyles. Additionally, work is ongoing towards a digital twin of the city that incorporates a traffic simulation based on real data, along with the inclusion of several other emission sources, to allow what-if analyses, i.e., testing the effect of the municipality's decisions by simulating future scenarios in a digital copy of the city, with traffic and pollution patterns modeled on real measurements [30]. All services are to be deployed through the European AI on-demand platform, currently under development in the AI4EU project. In relation to sensor calibration, we would like to test the automatic transfer of calibration models through the whole sensor network. In this paper, we showed that transferring calibration models between sensors is feasible, but evaluating such a procedure requires a reference sensor as a target. One possibility is to co-locate all sensors with references, gather data, calibrate them and then deploy them to their final locations, but a more automatic solution would ease and speed up the deployment process. Additionally, sensor placement methods [31] can be investigated to optimize the number of sensors needed to cover the area and the locations where such sensors should be deployed.
Acknowledgments: The authors would like to thank Hans Jørgen Grimstad, Bjørn Borud and the team at lab5e (https://lab5e.com, accessed on 3 May 2021) for the design of the sensor node, structural components and enclosure, and for the development of the firmware and the backend server that receives real-time data from the sensor network. Leendert Wienhofen, Thea Berg Lauvsnes and the team at Trondheim municipality are also acknowledged for their stamina in working to improve air quality services based on new technologies.

Conflicts of Interest:
The authors declare no conflict of interest.