Particulate matter (PM) is one of the major airborne pollutants in urban environments and among the most problematic in terms of its negative effects on human health [1]. The effects of PM on human health, which have been widely studied over the last twenty years, include asthma, lung cancer and cardiovascular issues [2]. Generally, the severity of the health effects of PM is related to particle size. For instance, PM up to 10 micrometers (μm) in diameter (PM10) can penetrate into the bronchi. PM up to 2.5 μm (PM2.5, fine particles) can penetrate the lungs, while ultrafine particles (PM0.1) are able to pass through the lung tissue and enter the circulatory system [4]. The International Agency for Research on Cancer (IARC) concluded in 2013 that PM is carcinogenic to humans [6]. According to the European Environment Agency (EEA), 428,000 premature deaths in 41 European countries in 2014 were caused by PM2.5 in the air [7].
Traditionally, like most other pollutants, PM concentration is measured at fixed air quality monitoring stations using accurate but expensive instrumentation. In the European Union, the density of such monitoring networks is determined by the EU Air Quality Directive 2008/50/EC, which defines the minimum number of fixed monitoring stations for each target pollutant based on the air pollution levels, population and coverage area [8]. However, due to the substantial cost of setting up and maintaining such stations, the number of monitoring sites tends to be quite small in most areas. While the resulting networks fulfil the regulatory needs, the number of stations is generally insufficient for providing detailed information about the spatial distribution of pollutants, identifying pollution hotspots, or providing comprehensive personalized air quality information to citizens at locations not covered by the network. Although pollutant dispersion models can address this issue to some extent, they often exhibit bias [9]. Integrating the observations from a dense network of low-cost sensors with model information through techniques such as data fusion and data assimilation can provide spatially continuous concentration fields with significantly reduced bias [11]. This adds value to the sensor observations by spatially interpolating between monitoring locations and at the same time adds value to the model by constraining it with actual observations. As such, the advantages of both datasets are combined in a mathematically objective fashion, and the resulting up-to-date concentration fields make it possible to provide more relevant personalized information about air quality and exposure to the public.
The recent advancements in the field of low-cost micro-sensors and information and communication technology (ICT) offer an opportunity to realise this objective of providing up-to-date and useful air quality information by complementing the official outdoor air quality monitoring networks and improving the spatial and temporal resolution of air quality data [13].
Currently, several categories of low-cost micro-sensors for PM measurements are available, e.g., Sharp GP2Y1010 [17], Shinyei PPD42NS [18], Plantower PMS1003 [19], Nova SDS011 [21], AirBeam [22], Alphasense Optical Particle Counter (OPC-N2) [23], and Wuhan Cubic PM3007 [24]. These sensors are all based on optical light scattering, using a laser and applying Mie theory to the scattered light to determine the particle size [25]. These PM sensors are compact and light, have low energy consumption, operate at a high sampling frequency, and cost from tens to hundreds of Euros each [26]. Their size, cost and ease of use make them promising candidates for outdoor deployment, and they have already been used by non-profit organisations and citizen scientists [27]. However, it is crucial that the sensors’ accuracy, precision and reliability are assessed in a comprehensive and repeatable manner under real-world conditions before they are deployed in large numbers [27].
So far, only a limited number of studies have evaluated this new generation of low-cost sensors for PM2.5 monitoring, and their performance under various environmental conditions and at different time scales is still not well understood [28]. Genikomsakis et al. (2018) [29] field-tested a low-cost portable system for monitoring PM2.5 concentrations in Thessaloniki (Greece) during 6–8 March 2017, using a Nova SDS011 PM sensor with a calibrated instrument as the basis of comparison. Their results showed that the SDS011 maintained a high level of accuracy (R2 values ranging from 0.93 to 0.95) despite the errors introduced by the conditions of the mobile test run. Badura et al. (2018) [30] assessed the performance of Nova SDS011, ZH03A (Winsen), PMS7003 (Plantower), and OPC-N2 (Alphasense) sensors against a TEOM 1400a analyser for almost half a year, from 21 August 2017 to 19 February 2018, in Wrocław (Poland). They found a strong linear relationship between the TEOM and the sensors for 1 min, 15 min, and 1-hour averaged data for the PMS7003 sensors (R2 ≈ 0.83–0.89), for the SDS011 units (R2 ≈ 0.79–0.86), and for one unit of ZH03A (R2 ≈ 0.74–0.81). R2 values for daily averages were 0.91–0.93 for PMS7003, 0.87–0.90 for SDS011, and 0.89 for ZH03A. The performance characteristics of the available low-cost sensors should be well known before they are deployed for sensor-based management of air pollution [10].
This study is one of the major tasks within the EU H2020 project hackAIR (www.hackair.eu), which is building a collective, awareness-raising platform for outdoor air quality with pilots in local communities in Norway and Germany. In this study, PM2.5 measurements from a set of three low-cost PM sensor units (SDS011) were compared against tapered element oscillating microbalance (TEOM) observations made at an air quality monitoring station in Oslo, Norway. The TEOM device is a well-characterised instrument that is commonly used in air quality monitoring. We assessed the performance of the three units over a four-month period in winter and spring 2018.
3. Results and Discussion
3.1. Sensors and Reference Instrument Operational Data Coverage
Figure 4 presents the hourly PM2.5 measurements from the three SDS011 sensors and the official reference monitoring station over the nearly four-month period. Data gaps for the reference monitoring station (i.e., from 31 December 2017 to 3 March 2018) were related to errors of the official reference-equivalent instrument (e.g., power outages and maintenance activities) (Figure 4a). Data gaps for S2 and S3 (i.e., 3 March 2018 for S3, and 30 March 2018 to 31 March 2018 for both S2 and S3) were due to platform errors (Figure 4c,d). In general, the three tested SDS011 PM sensors operated stably throughout the nearly four-month study, and no obvious sensor errors were observed (Figure 4e). Episodes of elevated PM2.5 concentrations were observed on New Year's Eve, between 23:00 on 31 December 2017 and 01:00 on 1 January 2018. These are clearly connected to particle emissions from fireworks (Figure 4).
All measurements were conducted under varying meteorological conditions. Figure 5 illustrates that the PM2.5 data from the three sensors follow similar patterns over the nearly four-month period, indicating that they respond similarly to varying environmental conditions. Qualitatively, no significant drift of the signal was observed for any of the three sensor systems over the study period. The hourly PM2.5 concentrations ranged from 0.4 μg/m3 to 127.5 μg/m3. The sensor systems were exposed to air temperatures (T) of −14.0 to +11.4 °C and relative humidity (RH) of 15.4 to 99.5% (Figure 5). The operation of the three tested SDS011 sensors was stable throughout the almost four-month study period, and no obvious errors in terms of data availability or failures of electronic parts were observed under these meteorological conditions.
3.2. Linearity of the Response and Accuracy
Data from the three SDS011 sensor systems were compared with data from the official air quality monitoring station over a nearly four-month period (11 December 2017–31 March 2018) (Figure 6, Table 2). The results show that the PM sensors responded consistently with the measurements of the reference monitoring station. The three sensors demonstrated a substantial degree of correlation with the official reference instrument, with R2 values of 0.71 (S1), 0.68 (S2), and 0.55 (S3). This result is consistent with the similar study carried out in Wrocław, Poland, by Badura et al. (2018) [30]. As can be seen in Table 2, the slope of all three regression models is slightly below 1, indicating a general underestimation of the PM2.5 mass by all three units, particularly at higher pollution levels. Furthermore, the mean error is generally below 2 μg/m3 and the root-mean-square error (RMSE) is less than 6 μg/m3 for all three units. Sensor system S3 shows the worst overall performance of the three units.
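The regression statistics discussed above (slope, R2, mean error and RMSE) can be reproduced from paired hourly values with a few lines of code. The following is a minimal sketch rather than the processing used in the study: the function name `evaluate_sensor` is hypothetical, and it assumes that the sensor reading is regressed on the reference reading, so that a slope below 1 corresponds to underestimation by the sensor.

```python
import math


def evaluate_sensor(sensor, reference):
    """Compare paired hourly PM2.5 values (ug/m3) from a sensor and a reference.

    Assumption: ordinary least squares with the reference as x and the
    sensor as y, i.e. sensor = intercept + slope * reference.
    """
    n = len(sensor)
    if n != len(reference) or n < 2:
        raise ValueError("need two equally long series with at least two pairs")

    mean_x = sum(reference) / n
    mean_y = sum(sensor) / n
    sxx = sum((x - mean_x) ** 2 for x in reference)
    syy = sum((y - mean_y) ** 2 for y in sensor)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(reference, sensor))

    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r2 = sxy ** 2 / (sxx * syy)  # squared Pearson correlation

    # Error is defined as sensor minus reference.
    errors = [y - x for x, y in zip(reference, sensor)]
    mean_error = sum(errors) / n
    rmse = math.sqrt(sum(e ** 2 for e in errors) / n)

    return {
        "slope": slope,
        "intercept": intercept,
        "r2": r2,
        "mean_error": mean_error,
        "rmse": rmse,
    }
```

With a sensor that reads exactly 90% of the reference, this returns a slope of 0.9, an R2 of 1 and a negative mean error, which is the pattern of underestimation described for Table 2.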
The three sensors demonstrated satisfactory to comparatively high data accuracy for the long-term mean concentration, with values of 98.16%, 86.82% and 80.76%, respectively (Table 3). Averaged over the three sensors, the long-term data accuracy was 88.58%.
3.3. Intersensor Variability
The inter-sensor variability over the almost four-month study period (11 December 2017–31 March 2018) was analysed. The three sensors provide quite similar results and do not vary substantially (Figure 4, Table 4, Figure 7), with inter-model variability of around 9.64%.
Figure 7 visualizes the inter-sensor variability for the three SDS011 sensors as a scatterplot matrix, with inter-sensor correlations exhibiting R values higher than 0.97.
3.4. Influence of Relative Humidity and Air Temperature
Sensors were exposed to T in the range of −14.0 to +11.4 °C and RH in the range of about 15.4 to 99.5%. These parameters were measured by the DHT22 sensor located beside the SDS011 sensor, within the same sensor casing. Therefore, the measurements of T and RH are independent of the PM sensor and are not affected by its data availability or electronics.
Most low-cost sensors for air quality, such as the Alphasense OPC-N2 [23], Plantower PMS7003 [19], and Nova SDS011 [21], are to some extent influenced by ambient environmental conditions. Therefore, we explored the observed PM2.5 error as a function of T and RH (Figure 8).
Figure 8 shows how the PM2.5 sensor error (calculated as the hourly mean sensor observation minus the hourly mean observation from the TEOM instrument) varies with T and RH. While all of the raw hourly data are shown, the overall patterns can be most easily observed from the red line, which represents a Loess fit [40] to the raw data.
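The error definition used here (sensor minus TEOM) lends itself to a simple summary by humidity class. The sketch below is an illustration only: it replaces the Loess fit with a much coarser binned mean, and the function name `mean_error_by_rh_bin`, the record layout and the 10% bin width are assumptions made for this example.

```python
from collections import defaultdict


def mean_error_by_rh_bin(records, bin_width=10.0):
    """Average the PM2.5 error (sensor minus reference) per RH bin.

    records: iterable of (sensor_pm25, reference_pm25, rh_percent) tuples.
    Returns a dict mapping the lower edge of each RH bin to the mean error.
    Note: this is a binned mean, not a Loess fit.
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for sensor, reference, rh in records:
        # Lower bin edge; a reading of exactly 100 % falls into the top bin.
        lower = min(int(rh // bin_width) * bin_width, 100.0 - bin_width)
        sums[lower] += sensor - reference
        counts[lower] += 1
    return {lower: sums[lower] / counts[lower] for lower in sorted(sums)}
```

A positive value in the 90 to 100% bin alongside values near zero in the drier bins would correspond to the humidity-related overestimation described in this section.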
As for the dependence of the PM2.5 sensor error on air T, all three units show similar patterns. For relatively low T, below −5 °C, the errors were either slightly negative on average (S1 and S3) or close to zero (S2). For T around 0 °C, all three units show slightly positive errors between 0 μg/m3 and 5 μg/m3. For higher T, the average PM2.5 error of all three units decreases again. There is no obvious physical reason for this pattern, and we think that the peak at around 0 °C is instead related to high RH values at these temperatures (see also Figure 9).
In terms of the impact of RH on the PM2.5 sensor errors, we can initially observe that there is a similar behaviour for all three units. The errors tend to be quite stable between −5 μg/m3 and 0 μg/m3 for RH levels less than approximately 80%. However, for RH values between 80% and 100% we can see a substantial increase in PM2.5 error for all three units. At close to 100% RH, all three units show positive PM2.5 errors of 10 μg/m3 to 15 μg/m3 on average. While the RH values that occurred during our study period ranged from a low of 15% to nearly 100%, the highest frequency of RH values in this study was found in two clusters of around 80% and 90%, respectively.
In order to better disentangle the effects of T and RH, Figure 9 shows the PM2.5 error of each sensor as a function of both RH and air T at the same time. The most obvious pattern is the substantial cluster of positive errors for high RH (>90%) at T above 0 °C. This pattern exists for all three units, although there are some slight differences in the magnitude of the errors. We can further observe the largest negative errors for low T (<−5 °C) and RH between approximately 40% and 80%. The errors tend to decrease again slightly for even lower T (<−13 °C), but the range of RH is very small in this case and the overall number of samples is very low, making it difficult to draw further conclusions. The effect of RH and T on sensors’ data quality needs to be taken into consideration when using low-cost particle sensors [41].
There could be several reasons for the effect of RH on the sensor performance. The most obvious reason is that the low-cost sensor has no system for drying the particles before they enter the optical chamber, which means that aerosol particles as well as fog droplets are counted. This leads to a positive artefact compared to the TEOM. The second reason is particle growth by water vapour condensation. Depending on the chemical composition of the particles, water vapour can condense onto the particle and particles grow by condensation. This growth in particle diameter is reflected by the radius to the power of three in the particle mass and would also lead to a positive artefact compared to the TEOM, as the TEOM measures dry particles. The third reason is the change of optical properties of particles measured if water condensation occurs onto the particle. A critical parameter when calculating the particle density distribution is the refractive index of the particles. Water vapour condensation changes the imaginary component in the Mie equation. This is the extinction coefficient of the material, defined as the reduction of transmission of optical radiation caused by absorption and scattering of light, leading to a wrong estimation of the size and therefore the mass reported by the instrument.
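The cubic dependence of mass on particle size mentioned above can be made explicit with a short worked relation. This is a sketch under stated assumptions: the particle is treated as a sphere, the sensor is assumed to apply the same fixed density ρ to wet and dry particles, and the growth factor g (wet radius divided by dry radius) of 1.3 is a purely illustrative value, not one measured in this study.

```latex
m = \rho \, \frac{4}{3} \pi r^{3}
\quad\Longrightarrow\quad
\frac{m_{\text{wet}}}{m_{\text{dry}}}
  = \left( \frac{g \, r_{\text{dry}}}{r_{\text{dry}}} \right)^{3}
  = g^{3},
\qquad
g = 1.3 \;\Rightarrow\; g^{3} \approx 2.2
```

Under these assumptions, a 30% increase in radius would roughly double the mass attributed to the particle, which illustrates why even moderate condensational growth can produce a large positive artefact relative to the TEOM.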
Based on these observations of high RH negatively affecting the sensors’ response, we retained only the sensor data recorded at RH below 80% and plotted them against the officially measured concentrations of PM2.5. The results indicate that the three sensors showed an increased degree of correlation with the official reference instrument (Figure 6), with R2 values increasing from 0.71 to 0.80 (S1), 0.68 to 0.79 (S2), and 0.55 to 0.71 (S3). However, the slope of the regression lines is slightly lower than before for all three units (Figure 6).
Furthermore, we retained only the sensor data recorded at RH below 70%, in line with the manufacturer-specified RH operating range (max. RH 70%), and plotted them against the officially measured concentrations of PM2.5. The results showed that the three sensors demonstrated a decreased degree of correlation with the official reference instrument, with R2 values decreasing from 0.71 to 0.65 (S1), 0.68 to 0.57 (S2), and 0.55 to 0.47 (S3). The reason for this decreased correlation is not entirely clear, but it may be related to the fact that nearly 60% of the data were filtered out. Since this filtering also removes most observations at concentrations greater than 20 μg/m3, the available range of concentrations is reduced significantly, which could be responsible for the decrease in correlation.
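The two filtering experiments can be expressed as a threshold filter that also reports how much data is lost and how the reference concentration range shrinks. This is a minimal sketch with hypothetical function names (`filter_by_rh`, `reference_range`) and an assumed record layout of (sensor PM2.5, reference PM2.5, RH).

```python
def filter_by_rh(records, rh_max):
    """Keep only records with RH strictly below rh_max (in percent).

    records: list of (sensor_pm25, reference_pm25, rh_percent) tuples.
    Returns the retained records and the fraction of records removed.
    """
    kept = [rec for rec in records if rec[2] < rh_max]
    removed_fraction = 1.0 - len(kept) / len(records) if records else 0.0
    return kept, removed_fraction


def reference_range(records):
    """Return (min, max) of the reference PM2.5 values in the records."""
    refs = [rec[1] for rec in records]
    return min(refs), max(refs)
```

Comparing `reference_range` before and after filtering makes the narrowing of the concentration range visible. A narrower range tends to lower R2 even when the scatter around the regression line is unchanged, which is one possible explanation for the weaker correlation at the 70% threshold.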
3.5. Correction for Temperature and Humidity Effects
It is to some extent possible to statistically correct for the effects of RH and T that were shown in the previous section, although such corrections tend to be specific to the location at which the co-location is carried out and cannot easily be transferred to other locations with different conditions. We demonstrate an example of such a correction procedure here using simple multilinear regression (MLR) [37] and a random forest (RF) model [38].
Figure 12 shows the improvement in sensor accuracy that can be achieved when RH and T are accounted for as part of the calibration. The left column of Figure 12 shows the original out-of-sensor data for all three sensors, whereas the middle and right columns show the data after calibration using an MLR and an RF model, respectively. Even a simple linear regression improves the accuracy with respect to the reference data somewhat, although the increases in R2 are relatively modest. However, using the same dataset with an RF model increases the correlation significantly, explaining roughly 10% more of the variability for sensors S1 and S2 and even 20% more in the case of sensor S3, with R2 values increasing from 0.71 to 0.80, from 0.68 to 0.79, and from 0.55 to 0.76, respectively (Figure 12).
It should be noted that this correction for the effect of air T and RH is only valid for the particular location at which the model was trained. As such, the model is dependent on both the specific characteristics of the environmental parameters at this site but also of the characteristics of the PM that occurs at this site (e.g., particle type, size, etc.). Applying such a correction model at a different location that has either different environmental conditions or different particle characteristics is likely to result in inferior performance.
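To make the MLR variant of such a correction concrete, the sketch below fits the reference PM2.5 as a linear function of the raw sensor PM2.5, RH and T using the normal equations. It is an illustration under assumptions, not the calibration code of the study: the choice of predictors, the function names `fit_mlr` and `predict`, and the pure-Python solver are all hypothetical, and the random forest model is not reproduced here.

```python
def fit_mlr(rows, targets):
    """Least-squares fit of targets = b0 + b1*x1 + ... + bk*xk.

    rows: list of predictor tuples, assumed here to be (sensor_pm25, rh, t).
    targets: list of reference PM2.5 values.
    Solves the normal equations (X^T X) b = X^T y by Gauss-Jordan elimination.
    """
    X = [[1.0] + [float(v) for v in row] for row in rows]
    k = len(X[0])

    # Augmented matrix [X^T X | X^T y].
    A = []
    for i in range(k):
        lhs = [sum(x[i] * x[j] for x in X) for j in range(k)]
        rhs = sum(x[i] * y for x, y in zip(X, targets))
        A.append(lhs + [rhs])

    for col in range(k):
        # Partial pivoting for numerical stability.
        pivot = max(range(col, k), key=lambda r: abs(A[r][col]))
        if abs(A[pivot][col]) < 1e-12:
            raise ValueError("predictors are collinear; cannot fit the model")
        A[col], A[pivot] = A[pivot], A[col]
        pivot_value = A[col][col]
        A[col] = [v / pivot_value for v in A[col]]
        for r in range(k):
            if r != col:
                factor = A[r][col]
                A[r] = [vr - factor * vc for vr, vc in zip(A[r], A[col])]

    return [A[i][k] for i in range(k)]


def predict(coeffs, row):
    """Apply fitted coefficients to one predictor tuple."""
    return coeffs[0] + sum(c * v for c, v in zip(coeffs[1:], row))
```

Consistent with the caveat above, coefficients obtained this way describe only the co-location site and period on which they were fitted, so they would need to be re-estimated for a site with different humidity, temperature or particle characteristics.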
Scatter plots of the relative expanded uncertainty as a function of the PM2.5 concentration measured at the air quality monitoring station were produced following the methodology described by Spinelle et al. (2015) [43]. Based on these plots, two of the three units (S1 and S2) reach the data quality objective (DQO) of 50% as defined in the European Air Quality Directive [8]. Both sensors reach relative expanded uncertainties [44] below 50% at concentrations of approximately 20 μg/m3. S3, however, does not meet the DQO.
The comparison of three SDS011 sensors with data from an official reference air quality monitoring station demonstrated that the SDS011 sensors generally follow the PM2.5 variability. Linear regression indicated a good correlation between the two datasets, with R2 values of 0.71 (S1), 0.68 (S2) and 0.55 (S3), over an almost four-month period in challenging Norwegian winter conditions with frequently high relative humidity. The inter-sensor variability analysis showed that the three sensors provided quite similar results and did not vary substantially from each other, with inter-model variability of around 9.64% and inter-sensor correlation (R) values higher than 0.97. RH and, to a very small extent, T affect the SDS011 performance. In particular, high RH values (over 80%) cause significant overestimates of the true PM2.5 mass. While the sensors provide generally reasonable estimates of PM2.5 mass out of the box, our results also indicate that a field calibration under representative environmental conditions is highly beneficial for improving the accuracy of the measurements. This study was limited to a relatively small number of months with limited variation in environmental conditions. To cover a wider range of meteorological conditions and to test the long-term stability of the sensors, we are working on a follow-up study that will evaluate the performance of the sensors using a time series of at least one year and sensors located in a wide variety of environmental conditions and pollution regimes.
Despite the reasonably good performance of these sensors, it should be noted that the potential for misuse is nonetheless high, especially when they are used outside of a research environment for citizen science applications and personal air quality monitoring, where the users might not have the required knowledge to adequately judge the uncertainty of the sensors. In such cases, the deployment of these sensors will almost certainly not be confined to environments with RH < 80%, and there is no specific notification to the users that the readings are unreliable when the RH is > 80%, although the manufacturer provides a recommended RH operating range. In addition, the relative uncertainties can be quite high for hourly values, and users should be aware of this limitation and exercise caution in interpreting such measurements. Nevertheless, considering their very low cost and the overall performance assessment results, we conclude that the SDS011 has significant potential for implementing a dense monitoring network when the environmental conditions exhibit, on average, relatively low relative humidity (RH < 80%). When the sensors are used under environmental conditions that often exhibit high relative humidity, appropriate automated filtering or correction routines need to be established to remove problematic observations from the datasets or, at minimum, to provide the users with clear indications of the estimated observation uncertainty. If these conditions are met, we conclude that networks of SDS011 sensors could in future, for example, complement the regulatory outdoor air quality monitoring networks and improve the spatial and temporal resolution of PM2.5 data, opening up various applications for the research community and regulatory agencies and helping to raise public awareness.