Assessment and Improvement of Two Low-Cost Particulate Matter Sensor Systems by Using Spatial Interpolation Data from Air Quality Monitoring Stations

Abstract: Two low-cost fine particulate matter (PM 2.5) sensor systems have been established by the government and the community in Taiwan. Each system combines hundreds of PM 2.5 sensors through an Internet of Things architecture. Because these sensors have not been calibrated, their performance has been questioned. In this study, spatial interpolation data from air quality monitoring stations (AQMSs) were used to quantify the performance of the two sensor systems. The linearity, sensitivity, offset, precision, accuracy, and bias of the two sensor systems were estimated. The results indicate that the linearity of the government's sensor system was higher than that of the community sensor system, whereas its sensitivity was lower. The relative standard deviation, relative error, offset, and bias of the community sensor system were higher than those of the government sensor system. In addition, the government sensor system reproduced the spatial interpolation results of the AQMS data better than the community sensor system did. The precision and accuracy of the two sensor systems were poor during a period of low PM 2.5 concentrations. A working platform for improvement, consisting of a monitoring operation loop and an automatic correction loop, is proposed. The monitoring operation loop comprises five modules, namely outlier detection, temporal anomaly analysis, spatial anomaly analysis, spatiotemporal anomaly analysis, and trajectory analysis modules. The automatic correction loop contains a spatial interpolation module, a sensor performance detection module, and a correction module. The proposed working platform can enhance the performance of low-cost sensor systems, especially as alert systems for reportable events.

Low-cost sensors set up by the government are used as monitoring equipment to deter illegal air pollution in industrial areas.
However, there has been a lack of evaluation in monitoring accuracy and methods for improvement. This study suggests an approach to assess the performance of low-cost sensor systems, which was systematically analyzed in terms of detecting ambient PM 2.5 distributions by using the spatial interpolation data of an AQMS system. The results indicated that the government’s SAQ-200 sensor system outperformed the community AirBox sensor system on most parameters. However, the SAQ-200 sensor system has a lower sensitivity than the AirBox sensor system. The variations in the PM 2.5 concentrations detected in the six two-month periods had the following sequence: AirBox sensors > AQMSs > SAQ-200 sensors. After its installation, the SAQ-200 sensor system was corrected using the ARRF method. The AirBox sensors were not corrected after installation. Both systems require a more complete correction method to provide accurate observations. To improve the performance of a low-cost IoT sensor system, a working platform consisting of a monitoring operation loop and an automatic correction loop was proposed. This working platform provides additional functions, such as those related to identifying instrument faults and pollution, calling for repairs, determining the scope of pollution, issuing alerts, tracing possible pollution sources, and performing automatic correction. The results of this study can indicate the steps to be taken to obtain accurate pollution detection results.


Introduction
The equipment and operation costs of an air quality monitoring station (AQMS) are high. Therefore, a small number of AQMSs are established in a city or region. Low-cost sensors have been widely used in new monitoring systems [1][2][3]. These systems can supplement existing traditional air quality monitoring networks in a region to increase the density of observed data for the spatiotemporal analysis of air pollution [4,5]. In many countries, a large number of low-cost sensors have been installed in cities to compensate for the lack of air quality stations [6]. These sensors are connected using an Internet of Things (IoT) architecture to form wireless sensor networks. Moltchanov et al. [7] demonstrated the capability of a low-cost sensor network in analyzing spatiotemporal concentration variations. Such a network is suitable for assessing the exposure to urban air pollution. Kanabkaew et al. [8] analyzed the air quality of Mae Sot City, Thailand by using monitoring data from a low-cost IoT PM 2.5 sensor system. Gao et al. [4] used a distributed network of low-cost sensors to measure the spatiotemporal variations of fine particulate matter (PM 2.5 ) in Xi'an, China. Lung et al. [9] used low-cost sensors to assess the spatiotemporal variations in PM 2.5 at the street level in a mountain community.
However, a disadvantage of low-cost sensors is that the data they record are of poor quality because the sensors are not appropriately managed and calibrated [10]. Zheng et al. [11] indicated that low-cost particulate matter sensors may not be ideal for monitoring low concentrations of particulate matter. Concerns regarding data quality hinder the widespread adoption of low-cost sensor technology [12]. The 1st EuNetAir Air Quality Joint Intercomparison Exercise was organized in Aveiro (Portugal) in October 2014 to evaluate and assess different microsensor systems. The results of this exercise indicated that the overall performance of low-cost sensors depends on their characteristics and on the platform adopted [13]. Numerous methods have been proposed for improving the performance of low-cost sensors. Holstius et al. [14] used 1-h and 24-h PM 2.5 data from a PM 2.5 β-attenuation monitor to field-calibrate low-cost sensors. Their results indicated that, after linear corrections, the low-cost sensor data explained 60% and 72% of the variance in the 1-h and 24-h reference data, respectively. Lin et al. [15] proposed a multisensor space-time data fusion framework for analyzing the data from 1176 low-cost sensors. Their results indicated that the framework yields reasonable and superior estimates of spatiotemporal PM 2.5 concentrations. Manikonda et al. [16] indicated that four low-cost PM 2.5 sensors (Speck, Dylos, TSI AirAssure, and UB AirSense) have adequate precision for monitoring air quality in an indoor environment after they have been field-calibrated using well-characterized reference instruments. Sayahi et al. [17] revealed that low-cost Plantower particulate matter sensors (PMSs) exhibited good correlations with reference monitors in the winter season. However, some intra-sensor variability and drift occurred in one sensor.
The low-cost PM 2.5 sensor system of the Taiwanese government, which is based on an IoT framework, was designed to measure the ambient air quality in industrial areas. Because monitoring fugitive emissions from factories is one of its aims, more sensors are installed in industrial areas than in community areas. Moreover, owing to concerns about ambient air quality, people have installed low-cost PM 2.5 sensors outside their homes and joined the civilian monitoring IoT of the sensor manufacturer. People often state that the PM 2.5 values measured by the government's low-cost PM 2.5 sensor system and the AQMSs are lower than those measured by their own sensors. Thus, people frequently suspect that the ambient PM 2.5 concentrations announced by the government are too low. However, the manufacturer of the sensors used by the public often does not indicate the calibration and quality control procedures that must be performed to maintain the sensors' performance [18]. Both low-cost sensors employ the light-scattering method to measure PM 2.5 mass concentrations based on confidential proprietary algorithms. One of them, for example, uses a beam with a central wavelength of about 640 nm and a full width at half maximum spanning about 600−680 nm as the light source for scattering by PM 2.5. The particles in the air sample are illuminated by the beam in the detector chamber of the sensor, and the resulting scattered light is measured by a receiving photodiode detector. The scattered light signals are then converted to mass concentration data. However, meteorological factors, such as relative humidity (RH) and temperature, affect the data, and this effect was likely not accounted for in the manufacturer calibrations [11,19]. The credibility of data from low-cost sensors has been controversial. Williams [20] indicated that if the data reveal patterns that seem plausible based on other experience, then the data themselves can be accepted.
One approach to improving credibility is proxy calibration, which uses a few well-maintained, high-quality instruments [21] or a suitably configured network [22] as a basis of reliable information. Weissert et al. [23] used a random forest model to describe the effect of land-use features on local-scale air quality and identified the impact of land use through deviations from the model. This study discusses and compares the operational status of the two low-cost sensor systems from the standpoint of professional commentators and proposes an approach for assessing their performance in detecting ambient PM 2.5, in order to give managers in government and non-government departments a basis for the maintenance and management of the systems. In this approach, the data obtained from different AQMSs were considered as true values of PM 2.5 because the sensors in these AQMSs are subjected to routine calibration checks. The linearity, sensitivity, offset, precision, accuracy, and bias of the two low-cost sensor systems were estimated. The obtained results indicated the accuracy of the investigated low-cost sensor systems. Finally, a working platform consisting of a monitoring operation loop and an automatic correction loop is proposed for the government's low-cost sensor system to improve its performance, especially for air quality monitoring in industrial areas.

Study Area and Three Monitoring Systems
Taichung City is a special municipality located in central Taiwan. At the end of 2019, approximately 2.815 million people lived in Taichung City, which is the second most populous city in Taiwan ( Figure 1). Most people live in the urban area of 492 km 2 , and only a few people live in the mountainous area of 1722 km 2 . The city center is a densely populated area. Taichung was selected as the target city of this study, in which the adaptability of two low-cost PM 2.5 sensor systems was evaluated. Figure 1 illustrates the locations of 17 AQMSs, 506 low-cost Taiwan Environmental Protection Administration (EPA) sensors (Model: SAQ-200, EnSense Co., Ltd, Taichung, Taiwan), and 420 private low-cost sensors (Model: AirBox, EDIMAX Co., Ltd, Taichung, Taiwan) in Taichung City, where most SAQ-200 sensors are located in industrial parks. The AQMSs belong to the EPA (five stations), Taichung Environmental Protection Bureau (seven stations), and Taiwan Power Company (five stations). The AQMS data for 2019 were used to evaluate the performance of the SAQ-200 and AirBox systems. The monitoring data of the SAQ-200 and AirBox sensor systems were obtained from Civil IoT Taiwan (https://ci.taiwan.gov.tw/dsp/environmental.aspx (accessed on 1 February 2021)) and EDIMAX Technology Co., Ltd. (https://www.edimax. com/edimax/post/post/data/edimax/tw/edigreen_data_release_policy/ (accessed on 1 February 2021)), respectively. More than 8 million data points were obtained from the 506 SAQ-200 sensors, 420 AirBox sensors, and 17 AQMSs after year-round hourly observations. For the purpose of this study and better comprehension of the sampler's performance, we used hourly data in one representative week for every two months. The first week (from Monday to Sunday) of each even month was selected as the representative week for every two months. If rainfall occurred in the first week of the even month, the next week without rain was considered to be the representative week.

Performance Analysis of Two Low-Cost Sensor Systems
The two low-cost sensors have very simple structures in order to keep the price low and allow many units to be deployed. Their sampling ports face downward to keep out rain, but they have no defogging facilities to avoid humidity artifacts. As mentioned in the Introduction, meteorological factors such as RH and temperature affect their data. Each of the two low-cost sensor systems serves its own purpose; however, their calibration and quality control were insufficient to maintain performance. In this study, the data obtained from different AQMSs were considered as true values of PM 2.5 because the sensors in these AQMSs are subjected to routine calibration checks. These calibrated values were used to obtain the spatial interpolation data for Taichung City by the inverse distance weighting (IDW) interpolation method. Figure 2 shows the suggested flowchart for the performance analysis of a low-cost sensor system, which includes six steps: data reading, spatial interpolation, linear regression, performance detection, statistical analysis, and outlier detection. In this study, the equation for IDW was as follows:

$$C_0 = \frac{\sum_{i=1}^{n} C_i \, d_i^{-p}}{\sum_{i=1}^{n} d_i^{-p}} \tag{1}$$

where C_0 is the PM 2.5 concentration estimated at the prediction location, C_i is the PM 2.5 concentration at measured point i, d_i is the distance between measured point i and the prediction location, n is the number of measured points (neighbors), and p is the power value (in this study, p = 2).
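The IDW step can be illustrated with a minimal Python sketch. It assumes planar station coordinates in kilometers and uses every station as a neighbor; the function name `idw_estimate`, the coordinate layout, and the sample values are hypothetical and are not taken from the study.

```python
import math

def idw_estimate(stations, target, p=2.0):
    """Inverse-distance-weighted PM2.5 estimate at `target`.

    stations: list of (x_km, y_km, concentration) tuples (hypothetical layout).
    target:   (x_km, y_km) position of a low-cost sensor.
    p:        power value; the study uses p = 2.
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for x, y, conc in stations:
        d = math.hypot(x - target[0], y - target[1])
        if d == 0.0:
            # The target coincides with a station, so its value is returned as is.
            return conc
        w = d ** -p
        weighted_sum += w * conc
        weight_total += w
    return weighted_sum / weight_total

# Illustrative values only (not data from the study).
aqms = [(0.0, 0.0, 20.0), (10.0, 0.0, 30.0), (0.0, 10.0, 40.0)]
estimate = idw_estimate(aqms, (5.0, 0.0))
```

Because the weights fall off with the square of the distance, the two nearest stations dominate the estimate in this example, and the result always lies between the smallest and largest station values.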
The linear regression method was used to model the relationship between the data from spatial interpolation and from each sensor system. The R 2 , slope, intercept, relative standard deviation (E σ ), and relative error (E r ) values of the two low-cost sensor systems were calculated to determine their linearity, sensitivity, offset, precision, and accuracy, respectively. The R 2 , slope, and intercept values were obtained from the results of linear regression between the monitoring data of the low-cost sensor systems and the spatial interpolation data of the AQMSs at the same position. The relative standard deviation (E σ ) and relative error (E r ) were calculated using the following equations.
where x I,avg and x t,avg are the average of the spatial interpolation data from the AQMSs and the average of the monitoring data from the low-cost sensor system, respectively, during a representative week in a two-month period. The terms σ I and σ t represent the standard deviations of the AQMS and low-cost sensor system data, respectively. The mean absolute normalized gross error (MANGE) between the monitoring data of the low-cost sensor systems and the spatial interpolation data of the AQMSs at the same position was estimated to evaluate the bias of the two systems. The MANGE is calculated as follows:

$$\mathrm{MANGE} = \frac{1}{n\,m} \sum_{i=1}^{n} \sum_{j=1}^{m} \frac{\left| C_{\mathrm{sensor},i,j} - C_{\mathrm{AQMS},i,j} \right|}{C_{\mathrm{AQMS},i,j}} \tag{4}$$

where n and m are the number of low-cost sensors and the number of measurements performed during the evaluation period, respectively. The terms C AQMS,i,j and C sensor,i,j represent the PM 2.5 concentrations obtained from the spatial interpolation data of the AQMSs and from the low-cost sensor system, respectively, for site i during interval j.

The boxplot outlier detection method was used to identify data anomalies in a cluster. This method is a straightforward but effective technique for visualizing outliers. It uses the interquartile range (IQR) to measure the dispersion and variability of the data. The observed data set is divided into four intervals (Q1−Q4) according to the values of the data, and a boxplot is then plotted from the observed data. The boxplot represents the minimum, maximum, first-quartile (Q1), median (M), and third-quartile (Q3) values. In general, the fence of outliers is the range between the higher control limit (HCL) and the lower control limit (LCL). The equations for calculating the fence are as follows:

$$\mathrm{IQR} = Q_3 - Q_1 \tag{5}$$

$$\mathrm{HCL} = Q_3 + 1.5\,\mathrm{IQR} \tag{6}$$

$$\mathrm{LCL} = Q_1 - 1.5\,\mathrm{IQR} \tag{7}$$

Equations (2)−(4) do not apply under bad weather conditions, such as rain, snow, and dark clouds (high humidity), because of the heavy scattering these conditions cause in the detection chamber of the sensor. When the light source of a sensor is severely degraded, the resulting fault condition is identified through Equations (5)−(7).
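The bias and outlier steps can be sketched in Python as follows. The sketch assumes that the MANGE is expressed as a fraction and that the boxplot fence uses the conventional 1.5 × IQR rule with the "inclusive" quartile convention of the standard library; the function names and sample values are hypothetical.

```python
import statistics

def mange(sensor, aqms):
    """Mean absolute normalized gross error, returned as a fraction.

    sensor, aqms: nested lists indexed [site i][interval j]; `aqms` holds the
    interpolated reference value at each sensor position.
    """
    total = 0.0
    count = 0
    for sensor_row, aqms_row in zip(sensor, aqms):
        for s, a in zip(sensor_row, aqms_row):
            total += abs(s - a) / a
            count += 1
    return total / count

def boxplot_fence(values, k=1.5):
    """Return (lower, upper) control limits; needs at least two values."""
    q1, _median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr

def find_outliers(values, k=1.5):
    """Values falling outside the boxplot fence."""
    lower, upper = boxplot_fence(values, k)
    return [v for v in values if v < lower or v > upper]

# Illustrative values only (not data from the study).
bias = mange([[12.0, 18.0], [33.0, 36.0]], [[10.0, 20.0], [30.0, 40.0]])
cluster = [10.0, 12.0, 11.0, 13.0, 12.0, 11.0, 60.0]
flagged = find_outliers(cluster)
```

A smaller `k` narrows the fence and flags more observations, which is the adjustment discussed later for high-variability clusters.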
Quantum Geographic Information System (QGIS) [24] was used to map the PM 2.5 concentration distributions according to the observation data obtained from the AirBox, SAQ-200, and AQMS systems.

Monitoring Data Analysis of AQMSs
All the ambient particulate mass monitors in the AQMS are fitted according to the US Environmental Protection Agency PM 2.5 Federal Equivalent Method, which regulates the equipment architecture, analysis principle, and calibration method of ambient PM 2.5 monitoring. This method is suitable for the measurement of PM 10 and PM 2.5 and complies with the quality control and quality assurance guidelines of the US EPA Method 201A [25]. Therefore, the monitoring data best reflects the regional PM 2.5 concentrations. Figure 3 displays the box-whisker plots of the monthly average PM 2.5 concentrations measured at 17 AQMSs in Taichung City in 2019. A clear variation could be observed in the monthly average PM 2.5 concentrations between the 17 AQMSs, especially during the transition from the northeast monsoon to the southwest monsoon (April and May). Thus, the ambient PM 2.5 concentration distribution was uneven in Taichung. This result was obtained because the PM 2.5 concentration was affected by the industry and population distributions. Taiwan's monsoon can be classified as the southwest monsoon, which occurs from late April to late September, or northeast monsoon, which occurs from late September to late April. In general, a high ambient PM 2.5 concentration occurs in the northeast monsoon period. In the southwest monsoon period, the weather is hot and the mixed layer height is high. Therefore, the PM 2.5 concentrations are low.

Performance Analysis of the Two Low-Cost Sensor Systems
As mentioned above, hourly data from one representative week in every two months were used to assess the performance of the two low-cost sensor systems. The PM 2.5 data obtained from the AQMSs are considered the most accurate representations of ambient PM 2.5 concentrations. Therefore, in this study, the spatial interpolation PM 2.5 data of the AQMSs were considered as the real PM 2.5 concentrations at the positions of the low-cost sensors. Figures 4 and 5 display plots of the PM 2.5 concentrations obtained from the AirBox and SAQ-200 sensors, respectively, versus the spatial interpolation PM 2.5 values obtained from the AQMSs at the same position during six two-month periods in 2019. The results indicated that the concentration of PM 2.5 in the northeast monsoon season (periods 1, 2, 5, and 6) was higher than that in the southwest monsoon season (periods 3 and 4). Moreover, the dispersion of the data obtained from the AirBox sensors (Figure 4) was higher than that obtained from the SAQ-200 sensors (Figure 5) during each two-month period. A likely reason is that the government's SAQ-200 sensors are believed to be better managed and maintained than the community AirBox sensors. In addition, the SAQ-200 sensors are located near industrial sites (high concentrations of PM 2.5 and, thus, likely better performance), which may also explain the low dispersion of their data. The data from the SAQ-200 sensor system were corrected using the average relative response factor (ARRF) method [26], which is a calibration specification set by the Taiwan EPA for counties and cities. The ARRF is the ratio of the average monitoring data from 10 SAQ-200 sensors to the monitoring data from an AQMS PM 2.5 monitor over a one-month period. The AirBox system is loosely organized because its members follow no rigorous organizational norms; it therefore lacks the coordinated management and operation of the government-owned SAQ-200 system.
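The ARRF definition given above can be sketched in Python. The ratio follows the text; how the factor is then applied is not stated in the source, so dividing a raw reading by the ARRF is an assumption of this sketch, as are the function names and the sample values.

```python
from statistics import fmean

def arrf(sensor_series, reference_series):
    """Average relative response factor over one period.

    sensor_series:    list of per-sensor reading lists (e.g., 10 sensors).
    reference_series: readings of the AQMS PM2.5 monitor for the same period.
    Returns mean sensor reading divided by mean reference reading.
    """
    pooled = [value for series in sensor_series for value in series]
    return fmean(pooled) / fmean(reference_series)

def apply_arrf(raw_value, factor):
    # Assumption: a raw reading is corrected by dividing it by the ARRF.
    return raw_value / factor

# Illustrative values only (two sensors instead of ten, two readings each).
factor = arrf([[24.0, 36.0], [30.0, 30.0]], [20.0, 30.0])
corrected = apply_arrf(36.0, factor)
```

Because a single factor is shared by the whole group of sensors, this kind of correction adjusts the average response but cannot remove differences between individual sensors.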
The supplier of the AirBox sensor is not responsible for the regular maintenance of the sensors it sells because of their low price. The AirBox sensors were not calibrated after installation. Therefore, the actual accuracy of the AirBox sensor system is unknown. In this study, linear regression analysis was conducted on the data to determine the R 2, slope, and intercept values for each two-month period, which represent the linearity, sensitivity, and offset of the two low-cost PM 2.5 sensor systems, respectively. The variations in the linearity, sensitivity, and offset of the two low-cost PM 2.5 sensor systems in 2019 are displayed in Figure 6a-c. Generally, the PM 2.5 concentrations are higher during the northeast monsoon season and lower during the southwest monsoon season. The linearities and sensitivities of the SAQ-200 and AirBox sensor systems were higher during the northeast monsoon season than during the southwest monsoon season, which means that both decrease as the PM 2.5 concentration decreases. The seasonal trends in linearity were similar for the AirBox and SAQ-200 sensor systems. The opposite trend was observed for the PM 2.5 concentrations measured by both sensor systems. The offset of the SAQ-200 sensor system was low during the northeast monsoon and high during the southwest monsoon. This trend was opposite to those observed for linearity and sensitivity. The offset of the AirBox sensor system did not vary systematically between the northeast and southwest monsoon seasons, which differed markedly from the behavior of the SAQ-200 sensor system. The offset results revealed high instability in the AirBox sensor system that did not depend on PM 2.5 concentrations. This instability can be attributed to the poor management and maintenance of the AirBox sensor system: people did not systematically maintain and calibrate their sensors.
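The three regression parameters discussed above can be obtained with an ordinary least-squares fit, sketched here in Python. The sketch assumes that the interpolated AQMS value is the independent variable and the sensor reading is the dependent variable; the function name and the sample values are hypothetical.

```python
def linear_fit(x, y):
    """Least-squares fit y = slope * x + intercept, returning (slope, intercept, R^2).

    x: interpolated AQMS values (reference); y: low-cost sensor readings.
    Both inputs must vary, otherwise the fit is undefined (division by zero).
    """
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx                     # sensitivity
    intercept = mean_y - slope * mean_x   # offset
    r_squared = sxy * sxy / (sxx * syy)   # linearity
    return slope, intercept, r_squared

# Illustrative values only: the sensor reads exactly 0.5 * reference + 3.
reference = [10.0, 20.0, 30.0, 40.0]
sensor = [8.0, 13.0, 18.0, 23.0]
slope, intercept, r_squared = linear_fit(reference, sensor)
```

In this example a slope of 0.5 means the sensor responds to only half of any change in the reference concentration, and the intercept of 3 is the reading it would report at a reference value of zero.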
The variations in the relative standard deviation, relative error, and MANGE of the two low-cost PM 2.5 sensor systems during periods 1−6 in 2019 are displayed in Figure 7a-c, respectively. The relative standard deviation, relative error, and MANGE values were used to represent the precision, accuracy, and bias of the two low-cost PM 2.5 sensor systems, respectively. The results indicate that the relative standard deviation, relative error, and MANGE of the AirBox PM 2.5 sensors were considerably higher than those of the SAQ-200 PM 2.5 sensors during each two-month period in 2019. Therefore, the SAQ-200 PM 2.5 sensor system had higher precision and accuracy as well as a lower bias than the AirBox PM 2.5 sensor system. This result indicates that good management and maintenance are essential for a low-cost PM 2.5 sensor system.

Spatial Distribution Analysis
Observations from low-cost sensors have considerable potential to provide high-resolution mapping of regional air quality [27], especially in regions lacking AQMSs [28]. The hourly observed data of the three monitoring systems at 12:00 p.m. (noon) on Thursday in the representative week of each two-month period in 2019 were used to map the PM 2.5 concentration distributions with QGIS 3.12. A notable difference was observed between the spatial interpolation results of the PM 2.5 data obtained from the two low-cost sensor systems and the AQMS system (Figure 8). To facilitate the comparison of the spatial analysis results of the three monitoring systems, this study did not consider their differences in the number of instruments and installation positions. The PM 2.5 concentrations detected by the three monitoring systems in the six two-month periods had the following sequence: AirBox sensors > AQMSs > SAQ-200 sensors. This result indicates that the data obtained from the AirBox sensors had high bias, whereas the data obtained from the SAQ-200 sensors had low bias. The two low-cost PM 2.5 sensor systems could indicate some locations with high PM 2.5 concentrations. The large hotspot measured by the AQMSs in November is located at the boundary between Districts #13 and #15, but no SAQ-200 sensors are located in these two districts (see the dots in each subfigure of Figure 8b). Thus, this large hotspot is not picked up by the SAQ-200 network. The differences between the subfigures in Figure 8 reflect not only the different spatial distributions and interpolation of the systems but also their different numbers of instruments. However, the high spatial uniformity of the AirBox network cannot ensure high representativeness for air quality because the network lacks good calibration and management. The SAQ-200 sensor system was calibrated using the average relative response factor (ARRF) method, which compares one-month average data from an AQMS PM 2.5 monitor with those from 10 off-site SAQ-200 sensors.
The AirBox sensors were not calibrated after installation. Therefore, an improved correction method is required for the two low-cost PM 2.5 sensor systems to obtain PM 2.5 information with high spatial and temporal resolution. The sensors or monitors of the three air quality monitoring systems are not evenly distributed in Taichung City, so the PM 2.5 concentration distributions modelled with QGIS 3.12 all contain some deviations. In particular, six AQMSs are located in the coastal area (Figure 1), and this high density of AQMSs is likely to cause a relatively large or small PM 2.5 concentration gradient from the coast to inland (Figure 8c). Therefore, it is recommended that low-cost sensors be arranged on an equally spaced grid in the future.

Outlier Detection in the Observed Data
To illustrate the outlier phenomenon in the observed data, the target area was divided into 10 × 12 uniform grids as an example, which had a resolution of 3.8 km × 3.8 km and covered an area of 38.0 km × 45.6 km (Figure 9). Because the agreement in the outlier detection varies with the area and location of the grid, the proper grid size must be selected for use. The sensors in each grid were considered to be in a cluster. If no sensor was present in the grid, no cluster was considered to exist. The AirBox and SAQ-200 sensor systems had 50 and 39 clusters, respectively. Figures 10 and 11 display the box-whisker plots and outliers (red or blue dots) of the observed data for the clusters of the AirBox and SAQ-200 sensor systems, respectively, at 12:00 p.m. on Thursday of the representative week in the six two-month periods of 2019. The variability of the clusters was significantly higher for the AirBox sensors than for the SAQ-200 sensors. Therefore, the IQR of the clusters in the AirBox sensor system was significantly higher than that of the clusters in the SAQ-200 sensor system. Table 1 presents the outlier detection results for the data obtained from the two low-cost PM 2.5 sensor systems at 12:00 p.m. on Thursday of the representative week in each two-month period of 2019. The results indicate that the outlier rates of the SAQ-200 sensor system were significantly higher than those of the AirBox sensor system during periods 3-6. This result was obtained because the fence ranges of the SAQ-200 sensor system were lower than those of the AirBox sensor system. The fence ranges of the AirBox sensor system were 2.1-4.9 times higher than those of the SAQ-200 sensor system. The highest difference in the fence ranges of the two systems was observed in period 3 (May-June). The number of outliers is considerably affected by the fence of the outlier detection. Thus, the number of outliers is a meaningless factor when the variability of the observed data becomes too large. 
The basis for identifying the observed outlier result of sensors as faults or measured pollution is presented in Section 4.1.3.

Suggestions for Improvements
An improved correction method is required for the two low-cost PM 2.5 sensor systems to obtain PM 2.5 information with high spatial and temporal resolution. A working platform consisting of a monitoring operation loop and an automatic correction loop is proposed, which is an approach of rolling automatic correction to ensure the availability of data from a low-cost sensor system. The flowchart of the proposed working platform is displayed in Figure 12. This working platform involves setting up the operation and is, therefore, suitable for new or existing sensor systems. The monitoring data of a low-cost sensor system and AQMS system are continuously read by the monitoring operation loop and an automatic correction loop, respectively. The monitoring operation loop includes an outlier detection module (ODM), a temporal anomaly analysis module (TAAM), a spatial anomaly analysis module (SAAM), a spatiotemporal anomaly analysis module (STAAM), and a trajectory analysis module (TRAJM). The automatic correction loop contains a spatial interpolation module (SIM), a sensor performance detection module (SPDM), and a correction module (CM). The two loops periodically output the performance reports, according to the set defaults. The descriptions of the module operations are as follows.

Cluster Attributes
The IoT architecture of the SAQ-200 sensor system is used to reinforce the air quality monitoring requirements of AQMSs, especially in industrial areas. Therefore, most of the SAQ-200 sensors are densely arranged in the industrial zone (Figure 9b). By contrast, community AirBox sensors are evenly distributed in urban areas (Figure 9a). Since the size and shape of industrial parks are very different, dividing clusters by industrial parks is inappropriate. Therefore, the gridding method is appropriate for cluster attributes. For example, the division of clusters in Taichung is suggested to be similar to that in Figure 9. However, the area should be divided into 19 × 23 uniform grids, which have a resolution of 2 km × 2 km and cover an area of 38 km × 46 km. The sensors in each grid are considered to be in a cluster, and each cluster is given a two-dimensional coordinate.
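The gridding described above can be sketched in Python. The sketch assumes that sensor positions are already expressed in kilometers east and north of the southwest corner of the study area; that convention, the function names, and the sample sensors are hypothetical.

```python
from collections import defaultdict

def cluster_index(x_km, y_km, cell_km=2.0, n_cols=19, n_rows=23):
    """Two-dimensional cluster coordinate (col, row) of a position.

    Returns None when the position lies outside the gridded area.
    """
    col = int(x_km // cell_km)
    row = int(y_km // cell_km)
    if 0 <= col < n_cols and 0 <= row < n_rows:
        return (col, row)
    return None

def group_by_cluster(sensors):
    """Map each occupied grid cell to the IDs of the sensors inside it.

    sensors: iterable of (sensor_id, x_km, y_km). Cells without sensors do
    not appear in the result, so they form no cluster.
    """
    clusters = defaultdict(list)
    for sensor_id, x_km, y_km in sensors:
        index = cluster_index(x_km, y_km)
        if index is not None:
            clusters[index].append(sensor_id)
    return dict(clusters)

# Illustrative positions only.
clusters = group_by_cluster([("A", 0.5, 0.5), ("B", 1.9, 1.0), ("C", 37.9, 45.9)])
```

Keeping the cell size as a parameter makes it straightforward to compare the 2 km grid suggested here with the 3.8 km grid used in the earlier example.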

ODM (Outlier Detection Module)
An ODM detects the outlier sensor and its locations. For a normal distribution dataset, the range fence of outliers is ±2.698 standard deviations (σ), which is equal to the value obtained using Equations (5)−(7). High-variability clusters face major fence problems. Therefore, a strict fence should be selected. Because 87% and 95% of the population values lie within ranges of ±1.5σ and ±2σ, respectively, a fence range of ±1.5σ to ±2σ is recommended for outlier detection. In the ODM, the outlier detection process continues running through a circulation mechanism. However, when an outlier is found, the system enters the next module, namely the TAAM, through a parallel process. The circulation mechanism ensures the continuous sequential analysis of the outliers of the observations in each cluster.
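The figures quoted above can be checked numerically for a standard normal distribution. The sketch below assumes the conventional 1.5 × IQR boxplot fence; the variable and function names are hypothetical.

```python
from statistics import NormalDist

std_normal = NormalDist()  # mean 0, standard deviation 1

# Quartiles of the standard normal distribution, in units of sigma.
q3 = std_normal.inv_cdf(0.75)
q1 = -q3
iqr = q3 - q1

# Upper boxplot fence in units of sigma (the lower fence is its negative).
upper_fence = q3 + 1.5 * iqr

def coverage(k):
    """Fraction of a normal population lying within +/- k sigma."""
    return std_normal.cdf(k) - std_normal.cdf(-k)

within_1_5 = coverage(1.5)
within_2_0 = coverage(2.0)
within_fence = coverage(upper_fence)
```

The computed fence is about 2.698σ, and the coverage is about 86.6% for ±1.5σ and about 95.4% for ±2σ, in line with the rounded values given in the text. More than 99% of a normal population lies inside the standard boxplot fence, which is why a stricter fence flags noticeably more observations.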

TAAM (Temporal Anomaly Analysis Module)
A TAAM is used to classify an outlier observed by a sensor as either a fault or measured pollution. This module sequentially determines the data variation of each sensor in a certain time segment. The time segment is the retrospective period covering the previous 10 (a selectable value) observations. According to the variability of the observations, the value detected by an outlier sensor can be attributed to instrument failure or pollution (Figure 13). Therefore, a σ range of 1.0 ≤ σ ≤ 3.0 is recommended as a basis for determining air pollution occurrence. When σ is greater than 3, the sensor may be malfunctioning, and a maintenance worker is automatically called out through IoT technology for a repair.
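One possible reading of this decision rule is sketched below in Python. The text does not state how σ is normalized, so the sketch assumes that σ is the deviation of the latest reading from the mean of the retrospective window, expressed in units of the standard deviation of that window. This interpretation, the function name, the handling of a flat history, and the sample values are all assumptions rather than the authors' implementation.

```python
from statistics import fmean, pstdev

def classify_outlier(history, latest, low=1.0, high=3.0):
    """Classify `latest` against the retrospective window `history`.

    history: previous observations of one sensor (e.g., the last 10).
    Returns "normal", "pollution", or "fault".
    """
    if len(history) < 2:
        raise ValueError("need at least two past observations")
    mean = fmean(history)
    spread = pstdev(history)
    if spread == 0.0:
        # Assumption: with a perfectly flat history, any change is treated as a fault.
        return "normal" if latest == mean else "fault"
    score = abs(latest - mean) / spread
    if score < low:
        return "normal"
    if score <= high:
        return "pollution"
    return "fault"

# Illustrative values only.
window = [20.0, 22.0, 21.0, 19.0, 20.0, 21.0, 22.0, 20.0, 19.0, 21.0]
labels = [classify_outlier(window, v) for v in (21.0, 23.0, 60.0)]
```

Under this reading the window length and both thresholds are tunable, which matches the description of the 10-observation window as a selectable value.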

SAAM (Spatial Anomaly Analysis Module)
After identifying the pollution event through the TAAM, spatial abnormality analysis must be performed to confirm the pollution impact range. A SAAM is used to determine the range of impact of a pollution event. The module continuously analyzes the outliers of the observations in each cluster in sequence. A σ range of 1.0 ≤ σ ≤ 3.0 is used for confirming the number of polluted clusters and their locations. When the value of σ is within the previously mentioned range, the process enters the next module and the alert system is activated. The alert system then provides an automatic anomaly notification to environmental protection officials.
To illustrate the efficacy of the SAAM, a fire accident at a waste tire treatment plant was used as an example. The fire lasted from 13:29 on 7 May 2019 to 04:25 on 9 May 2019 at the outdoor stacking site, which held more than 40,000 waste tires. A fire plume enveloped the Taichung urban area for 36 h. The observations of the two low-cost sensor systems indicated the status of the fire plume. The intensity of the AirBox sensor system response was higher than that of the SAQ-200 sensor system. Figure 14 displays the results of spatial anomaly analysis of the average PM 2.5 concentrations of clusters at 10:00 on 8 May 2019. The fire plume flowed through a sector of Taichung City at an angle of 25°. In this sector, the ground concentration of PM 2.5 increased with an increase in downwind distance. Conditions at that time indicated a large pollution impact because the fire plume descended strongly toward the ground. The monitoring results show that both sensor systems can reflect the occurrence of fire accidents. Thus, the SAAM is effective.

STAAM (Spatiotemporal Anomaly Analysis Module)
The TAAM is used to determine whether outliers correspond to a pollution event, and the SAAM is used to determine the pollution impact range. A STAAM is used to judge whether a pollution incident has terminated. Therefore, the STAAM executes the TAAM and SAAM until all outliers disappear. In the factory fire accident, a fire plume enveloped Taichung City for 36 h. This incident was recorded in its entirety by the two low-cost sensor systems. The entire process was recorded by the TAAM into a pollution event file that can be accessed by pollution managers.

TRAJM (Trajectory Analysis Module)
The TRAJM uses a backward trajectory to simulate the transmission path of a contaminated air mass for determining the possible sources of pollution. When a pollutant drifts passively with the wind, the trajectory is the integration of the pollutant position vector in space and time. In the backward trajectory calculation, the position of the pollutant source is obtained from the average velocity at the polluted position P and the estimated backward position P', as follows:

First quadrant: θ = 90°, x' = x + a, y' = y + b
Second quadrant: θ = 180°
Third quadrant: θ = 270°
Fourth quadrant: θ = 360°

where

$$a = \frac{W_s(P, t) + W_s(P', t - \Delta t)}{2}\,\Delta t \csc\theta \qquad (13)$$

and where $W_s$ and $W_d$ are the wind speed (km h$^{-1}$) and wind direction (°), respectively, and $\Delta t$ is the integration time of a step (h). When the path of a backward trajectory is determined, the possible sources of pollution on the path can be traced.
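
The idea of stepping backward with the average of the wind at P and at the estimated backward position P' can be sketched as follows. This is not a transcription of the quadrant equations above: it is a generic formulation that assumes the meteorological convention (wind direction is the direction the wind blows from, in degrees clockwise from north), a flat x-east/y-north grid in km, and a hypothetical callable `wind_at(x, y, t)` that returns the wind speed and direction.

```python
import math

def wind_components(speed, direction_deg):
    """Convert speed (km/h) and 'from' direction (deg) to (u, v) in km/h."""
    rad = math.radians(direction_deg)
    return -speed * math.sin(rad), -speed * math.cos(rad)

def backward_trajectory(x, y, t, wind_at, dt, steps):
    """Trace an air parcel backward in time from (x, y) at time t (hours).

    wind_at: hypothetical lookup, wind_at(x, y, t) -> (speed, direction).
    Returns the list of positions, starting with the polluted position P.
    """
    path = [(x, y)]
    for _ in range(steps):
        u1, v1 = wind_components(*wind_at(x, y, t))
        # First guess of the backward position P' using the wind at P only.
        xg, yg = x - u1 * dt, y - v1 * dt
        u2, v2 = wind_components(*wind_at(xg, yg, t - dt))
        # Final step uses the average of the velocities at P and P'.
        x -= 0.5 * (u1 + u2) * dt
        y -= 0.5 * (v1 + v2) * dt
        t -= dt
        path.append((x, y))
    return path

# Assumed uniform westerly wind of 10 km/h (blowing from 270 degrees).
def uniform_wind(x, y, t):
    return 10.0, 270.0

path = backward_trajectory(0.0, 0.0, 12.0, uniform_wind, dt=1.0, steps=3)
print(path[-1])  # the parcel is traced back toward the west of the receptor
```

With a real wind field, `wind_at` would interpolate station or model winds in space and time; candidate sources are then the facilities lying near the returned path.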

SIM (Spatial Interpolation Module)
A SIM is used to obtain the regional baseline pollutant concentrations as standard values for the calibration of low-cost sensors. The calibration is performed for each sensor rather than each cluster. The standard calibration value for each sensor is acquired from the results obtained through the spatial interpolation of the hourly monitoring data of AQMSs. Therefore, each sensor is calibrated on site by the SIM. Two of the most frequently used spatial interpolation methods are IDW, a deterministic method, and Kriging, a geostatistical method. Only a small difference exists between the results obtained for the spatial distribution of particulate matter by using the IDW and Kriging methods [29].
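
As an illustration of how a reference value could be obtained at a sensor location, the following is a minimal IDW sketch. The station coordinates, the readings, and the power parameter of 2 are assumptions for the example and are not settings reported in this study.

```python
import math

def idw(stations, target, power=2.0):
    """Inverse distance weighted estimate at `target`.

    stations: list of (x, y, value) tuples for the AQMSs.
    target: (x, y) of the low-cost sensor, in the same planar units.
    """
    num = 0.0
    den = 0.0
    for sx, sy, value in stations:
        d = math.hypot(sx - target[0], sy - target[1])
        if d == 0:
            return value  # sensor co-located with a station
        w = 1.0 / d ** power  # nearer stations receive larger weights
        num += w * value
        den += w
    return num / den

# Hypothetical stations (km coordinates, hourly PM2.5 in µg/m³).
aqms = [(0.0, 0.0, 10.0), (2.0, 0.0, 30.0)]
print(idw(aqms, (1.0, 0.0)))  # midway between the two stations
print(idw(aqms, (0.5, 0.0)))  # closer to the first station
```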

SPDM (Sensor Performance Detection Module)
An SPDM is used to estimate the bias of a low-cost sensor and correct its cloud correction formula. The SPDM can be effectively and directly used to conduct performance determination and calibration. The linearity, sensitivity, accuracy, precision, offset, and bias of a sensor are computed for a weekly or monthly period (a selectable period) by using the SPDM. The computation process is similar to that described in Section 3.2. Here, however, the PM 2.5 concentrations detected by the low-cost sensors and the spatial interpolation results for the PM 2.5 values obtained from AQMSs are used to calculate the performance parameters of the low-cost sensors. The E_σ, E_r, and MANGE values of each low-cost sensor in a system are examined to determine whether the sensor meets the relevant norms. If the sensor meets the norms, a daily or weekly routine loop is initiated. If the sensor does not meet the norms, the process enters the next module.
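
A rough sketch of such a performance check is shown below. Because the definitions from Section 3.2 are not reproduced here, the mapping is an assumption: linearity is taken as R², sensitivity as the regression slope, offset as the intercept, bias as the mean difference, and MANGE as the mean absolute normalized gross error. The function name and the sample data are hypothetical.

```python
from statistics import mean

def sensor_performance(sensor, reference):
    """Compare sensor readings with interpolated reference values.

    sensor, reference: equal-length sequences of paired PM2.5 values.
    Returns a dict of assumed performance parameters (see the lead-in).
    """
    mx, my = mean(reference), mean(sensor)
    sxx = sum((x - mx) ** 2 for x in reference)
    syy = sum((y - my) ** 2 for y in sensor)
    sxy = sum((x - mx) * (y - my) for x, y in zip(reference, sensor))
    slope = sxy / sxx                  # assumed "sensitivity"
    intercept = my - slope * mx        # assumed "offset"
    r2 = sxy ** 2 / (sxx * syy)        # assumed "linearity"
    bias = mean(y - x for x, y in zip(reference, sensor))
    # Normalized error is undefined for a zero reference, so skip such pairs.
    mange = mean(abs(y - x) / x for x, y in zip(reference, sensor) if x > 0)
    return {"slope": slope, "intercept": intercept, "r2": r2,
            "bias": bias, "mange": mange}

# Hypothetical pairs: the sensor reads twice the reference plus 5 µg/m³.
ref = [10.0, 20.0, 30.0, 40.0]
obs = [25.0, 45.0, 65.0, 85.0]
result = sensor_performance(obs, ref)
print(result)
```

A sensor whose parameters fall outside the chosen norms would then be handed to the correction step.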

CM (Correction Module)
A CM is used to calculate the calibration factor (CF) of each sensor that does not meet the norms. First, linear regression is performed between the sensor observations and the corresponding spatial interpolation values over the check-up period to estimate the R², intercept (A_0), and slope (A_1) values. The R², A_0, and A_1 values are then used to calculate the CF of each sensor. The formula for calculating the CF of sensor i is as follows:

Conclusions
The low-cost sensors set up by non-government groups are intended to let people know the air quality around their homes rather than the value at an air quality monitoring station a few kilometers away. Low-cost sensors set up by the government are used as monitoring equipment to deter illegal air pollution in industrial areas. However, evaluations of their monitoring accuracy and methods for improvement have been lacking. This study suggests an approach to assess the performance of low-cost sensor systems, which was systematically analyzed in terms of detecting ambient PM 2.5 distributions by using the spatial interpolation data of an AQMS system. The results indicated that the government's SAQ-200 sensor system outperformed the community AirBox sensor system on most parameters. However, the SAQ-200 sensor system has a lower sensitivity than the AirBox sensor system. The variations in the PM 2.5 concentrations detected in the six two-month periods had the following sequence: AirBox sensors > AQMSs > SAQ-200 sensors. After its installation, the SAQ-200 sensor system was corrected using the ARRF method. The AirBox sensors were not corrected after installation. Both systems require a more complete correction method to provide accurate observations. To improve the performance of a low-cost IoT sensor system, a working platform consisting of a monitoring operation loop and an automatic correction loop was proposed. This working platform provides additional functions, such as identifying instrument faults and pollution, calling for repairs, determining the scope of pollution, issuing alerts, tracing possible pollution sources, and performing automatic correction. The results of this study indicate the steps to be taken to obtain accurate pollution detection results.