Calibration of Low-Cost NO2 Sensors through Environmental Factor Correction

Low-cost air quality sensors (LCSs) have become more widespread owing to their affordability and increased capabilities; however, before they can supplement traditional air quality networks, their performance needs to be validated. This study focused on NO2 measurements from eight Clarity Node-S sensors and used various environmental factors to calibrate the LCSs. To validate the calibration performance, we calculated the root-mean-square error (RMSE), mean absolute error (MAE), R2, and slope relative to reference measurements. Raw results from six of these sensors were comparable to those reported for other NO2 LCSs; however, two of the evaluated LCSs had RMSE values ~20 ppb higher than the other six. Applying a sensor-specific calibration that corrects for relative humidity, temperature, and ozone mitigated this discrepancy. In addition, this calibration improved the RMSE, MAE, R2, and slope of all eight LCSs compared to the raw data. It should be noted that relatively stable environmental conditions over the LCS deployment period benefited calibration performance over time. These results demonstrate the importance of developing LCS calibration models for individual sensors that consider pertinent environmental factors.


Introduction
Although annual ambient NO2 concentrations have been decreasing in many locations in North America, many East Asian countries are seeing increases in the annual mean NO2 concentration [1]. Studies have shown that exposure to NO2 can lead to increased preterm birth and infant mortality, increased asthma symptoms, allergic rhinitis, and chronic obstructive pulmonary diseases [2,3]. The U.S. Environmental Protection Agency's (EPA) current National Ambient Air Quality Standard (NAAQS) for 1-h NO2 is 100 ppbV, while the World Health Organization's 2005 1-h NO2 guideline value is 200 µg/m3 [4,5]. Even in areas that meet health-based standards to protect against exposure to NO2, there is still a need to measure ambient NO2 levels, as photochemical reactions of NO2 can result in ground-level ozone and fine particle formation [4]. Anthropogenic NO2 is mainly formed through high-temperature combustion processes from both mobile and stationary sources; the EPA's 2017 National Emission Inventory reports 52% of NO2 emissions from mobile sources such as cars, trucks, and planes, and 32% from stationary sources such as power plants and cement kilns [6,7].
The main purpose of air monitoring networks is to observe the community's exposure to air pollutants. In Maricopa County, Arizona, regulatory monitoring includes quantification of ambient levels of NO2, CO, PM10 (particulate matter with an aerodynamic diameter

Table 1. Results summary of NO2 LCSs tested by the South Coast Air Quality Management District [19].

While SCAQMD only tests LCSs as-is, a variety of published studies have tested calibration techniques on similar sensors. Han et al. [20] conducted a 12-month field evaluation of four Alphasense NO2-B43F sensors and tested several linear regression and neural network calibration methods that accounted for temperature and relative humidity. They concluded that the neural network significantly improved the NO2 data compared to the other methods. However, one potential problem in this study is that one set of sensors was used to train the calibration models and the other set to test sensor performance, which raises questions about how performance varies between individual LCSs deploying the same sensing technology. Suriano et al. [21] designed and tested a field evaluation system for LCSs, including two Alphasense NO2-B43F sensors, and used linear regression and multivariate linear regression models to calibrate these sensors, considering the interfering effects of relative humidity, temperature, and ozone. Sahu et al. [22] also evaluated Alphasense NO2-B43F sensors and recommended incorporating temperature and relative humidity into calibration models and maximizing the diversity of data used in training; they also concluded that using a local non-parametric calibration method with a learned metric maximized sensor performance. Another study, Masey et al. [23], tested Aeroqual S5000 NO2 sensors and used a linear regression calibration model that included ozone but not temperature and relative humidity. That study also analyzed the impact of varying training datasets on calibration performance and concluded that the best performance came from combining data from intermittent periods during the deployment into one training dataset. Lin et al. [24] further studied the impact of ozone on Aeroqual S5000 NO2 sensors, found that the sensor bias was significantly correlated with nearby ozone measurements, and saw improved sensor performance after correcting for ozone.
Despite these prior studies, questions remain about the intercomparability of multiple LCSs measuring NO2 using the same sensing approach. In this work, we evaluated the performance of eight NO2 LCSs of the same make and model compared to an FRM instrument over a 4-month period. The number of LCSs used allowed us to better characterize their performance as a cohesive network and identify any outliers. We examined the impact of environmental factors such as temperature, relative humidity, and ozone on the LCSs and developed calibration methods to correct for these potential influences. Additionally, we studied the impact of training data volume on the performance of the calibration, in terms of accuracy and precision among the LCSs.

Low-Cost Sensors and Reference Monitoring
The collocation study was conducted in Maricopa County, Arizona, USA for 4 months, from October 2020 to February 2021. One of the largest urban areas in the Southwestern United States, Maricopa County has a growing population of 4.4 million residents and experiences elevated levels of air pollution, exceeding the EPA's 8-h O3 standard of 70 ppb [25]. Over the study period, the meteorological conditions varied widely, with a temperature range of 2-34 °C and a relative humidity range of 6-97%. Eight LCSs, Clarity Node-S models (Clarity Movement Co., Berkeley, CA, USA), were used in this study. These devices are capable of measuring PM2.5 and NO2 while also containing sensors to track the temperature and relative humidity inside the device. The NO2 measurement is made with an Alphasense NO2-A43F electrochemical cell (Alphasense Ltd., Great Notley, UK) that has an ozone filter at the front end to limit interfering species, as the electrochemical cell has a 100% cross-sensitivity with ozone [26]. These LCSs are solar powered and take and report measurements every 15 min, uploading these data to the Clarity cloud using a cellular connection, where the data are then averaged into hourly measurements.
These LCSs were collocated with an NO2 FRM instrument at Maricopa County Air Quality Department's (MCAQD) West Phoenix monitoring site (AQS ID: 04-013-0019). This air monitoring site operates at a neighborhood scale (radius of 0.5-4 km) while covering a population of over one million individuals [8]. The LCSs were mounted on a fence 2 m above the ground and 10 m away from the FRM inlet as shown in Figure 1. The NO2 FRM instrument at this site is a Thermo Scientific 42iQ NO-NO2-NOx Analyzer (ThermoFisher Scientific, Franklin, MA, USA), while the O3 FEM instrument is a Teledyne-API T400 (Teledyne API, San Diego, CA, USA). Data from these instruments were obtained from MCAQD at a 1-h resolution.

Calibration Models
Three data series are compared in this study: the raw performance of the sensors based solely on NO2 sensor response; a calibration model applied by the manufacturer to correct bias and account for sensor response to changing temperature and relative humidity; and a calibration model that starts with the manufacturer calibration and further accounts for sensor response to changing ozone concentrations. The training and testing time periods of these calibration models were varied in an attempt to optimize their performance.

Raw Calibration
Raw NO2 concentrations are calculated by Clarity using the potentiostat voltages proportional to the working and auxiliary electrodes (vGas and vAux, respectively) measured from an Alphasense NO2-A43F sensor using Equation (1), where asSensitivity is a sensor-specific value provided by the manufacturer that allows for the conversion from nA to ppb. The 10^6/499 quantity is specific to the implementation of the potentiostatic circuit and is calculated by solving the circuit (in particular, the current measuring circuit).
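Equation (1) itself is not reproduced in this text. As a rough illustration only, the sketch below assumes the conversion takes the form of a voltage difference scaled by the 10^6/499 circuit factor and divided by asSensitivity. That functional form, the function name raw_no2_ppb, and the input units are assumptions rather than the manufacturer's documented implementation.

```python
# Hypothetical reading of the raw conversion described above.
# ASSUMPTION: Equation (1) has the form
#   NO2_raw = (vGas - vAux) * (10^6 / 499) / asSensitivity
# Only the 10^6/499 factor and the variable names come from the text.

CIRCUIT_FACTOR = 1e6 / 499  # circuit-specific factor quoted in the text


def raw_no2_ppb(v_gas: float, v_aux: float, as_sensitivity: float) -> float:
    """Convert working/auxiliary electrode voltages to a raw NO2 value (ppb)."""
    if as_sensitivity == 0:
        raise ValueError("asSensitivity must be non-zero")
    current_na = (v_gas - v_aux) * CIRCUIT_FACTOR  # assumed to yield nA
    return current_na / as_sensitivity  # sensitivity assumed to be in nA per ppb
```

Under this assumed form, equal working and auxiliary voltages give a raw value of zero, and a sensor-specific asSensitivity simply rescales the difference signal.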

Clarity Baseline Calibration Model
The Clarity baseline calibration model uses the electrochemical cell's working and auxiliary voltage along with the sensor's internal temperature and relative humidity readings in a multivariate regression model. This model is fit to collocation data for each sensor, which corrects for shifts in the baseline caused by the current temperature and relative humidity changes in the recent past. The output of this calibration model was then compared against the NO2 measurements from the FRM instrument.
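The exact terms and coefficients of the Clarity model are not given here. To illustrate the general idea of a per-sensor multivariate regression, the following sketch fits an ordinary least-squares model with an intercept to four predictors (working voltage, auxiliary voltage, temperature, relative humidity) against reference NO2. The function names and the choice of plain linear terms are assumptions.

```python
# Generic ordinary least squares via the normal equations (pure Python).
# ASSUMPTION: plain linear terms only; the manufacturer's actual model
# structure is not specified in the text.


def fit_linear(features, target):
    """Return [intercept, b1, ..., bk] minimizing squared error."""
    rows = [[1.0] + list(f) for f in features]
    k = len(rows[0])
    # Build the augmented normal-equation matrix [X^T X | X^T y].
    aug = [
        [sum(r[i] * r[j] for r in rows) for j in range(k)]
        + [sum(r[i] * y for r, y in zip(rows, target))]
        for i in range(k)
    ]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("singular design matrix")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(k):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    return [aug[i][k] / aug[i][i] for i in range(k)]


def predict(coefs, feature_row):
    """Apply fitted coefficients to one row of predictors."""
    return coefs[0] + sum(c * x for c, x in zip(coefs[1:], feature_row))
```

Fitting one coefficient set per sensor, rather than one shared set, mirrors the sensor-specific approach described in this study.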

Ozone Correction Calibration Model
Initial review of the data showed a high correlation between the LCS NO2 measurements and FEM O3 measurements, as shown in Table S1. To elucidate the impact of ozone on NO2 measurements, a second calibration model, Equation (2), was applied on top of the Clarity baseline calibration to also incorporate LCS response to changing ozone concentrations. We proceeded with this approach because O3 FEM measurements were available at the West Phoenix collocation site and, in future field deployments, these LCSs will also be collocated with O3 FEM instruments. Figure 2 demonstrates that ozone only has a noticeable effect on the LCS NO2 measurements above a certain threshold value, a, which is reflected in Equation (2). Lab studies performed on the Alphasense NO2-B43F, which has a larger ozone filter than the NO2-A43F model, have shown an O3 cross-sensitivity of 6.6% and warned that this cross-sensitivity could increase over time [27]. This same study found that the NO2-B43F signal was linearly dependent on relative humidity with hysteresis and that six of the LCSs had a direct relationship between temperature and reported NO2 concentrations, but that two sensors showed an inverse relationship between sensor values and temperature. The study was unable to explain this discrepancy, which may indicate that LCSs built from the same components can behave differently in field deployments.

The Clarity baseline calibration model used in this study can interpret low NO2 concentrations as zero or negative concentrations; however, when viewed by consumers, the negative concentrations are clipped. The ozone correction calibration model applies a second correction to force all data above zero, as detailed in Equations (3)-(5).
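Equations (2)-(5) are not reproduced in this text, so the sketch below is only one hypothetical way to express the two ideas described: an ozone term that acts only above a threshold a, and a floor that keeps corrected values from falling below a baseline. The piecewise-linear form, the coefficient b, the least-squares fit for b, and the max-based floor are all assumptions, not the study's actual equations.

```python
# Hypothetical thresholded ozone correction applied to calibrated NO2.
# ASSUMPTION: correction = b * max(O3 - a, 0); the study's Equation (2)
# may differ in form and sign convention.


def fit_ozone_slope(no2_cal, frm, o3, a):
    """Least-squares slope b of (calibrated - reference) on ozone excess above a."""
    excess = [max(o - a, 0.0) for o in o3]
    resid = [c - f for c, f in zip(no2_cal, frm)]
    denom = sum(e * e for e in excess)
    if denom == 0:
        return 0.0  # no data above the threshold, so no correction
    return sum(e * r for e, r in zip(excess, resid)) / denom


def ozone_correct(no2_cal, o3, a, b):
    """Subtract the ozone term only when O3 exceeds the threshold a."""
    return no2_cal - b * max(o3 - a, 0.0)


def apply_floor(no2, baseline=0.0):
    """Stand-in for Equations (3)-(5): keep values at or above a baseline."""
    return max(no2, baseline)
```

For example, with a = 40 ppb and b = 0.5, a calibrated reading of 25 ppb at 60 ppb ozone would be adjusted to 15 ppb, while readings taken below 40 ppb ozone pass through unchanged.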

Calibration Evaluation Methods
From the LCS NO2 data, several parameters, including root-mean-square error (RMSE), mean absolute error (MAE), R2, standard deviation, and slope, were calculated and used to evaluate the performance of each calibration compared to the FRM NO2 measurements. RMSE and MAE were used to quantify the accuracy of the LCSs, R2 was used to quantify the fit of the data relative to FRM measurements, and standard deviation was used to determine precision among the LCSs. Additionally, slope and the Pearson correlation coefficient were used to quantify the impact of environmental factors such as relative humidity, temperature, and ozone on the LCSs. RMSE was calculated using the procedure documented in the EPA's Performance Testing Protocols, Metrics, and Target Values for O3 Air Sensors; MAE was calculated using Equation (6), where y_i is the LCS measurement and x_i is the FRM measurement; and standard deviation was calculated using Equation (7) [17].

Toxics 2021, 9, 281

Raw LCS NO2 Measurements
Figure 3 shows the time series plots for the West Phoenix NO2 FRM instrument and the raw data from LCS #2, while Figure 4 is a scatter plot of LCS #2 raw NO2 vs. West Phoenix NO2 FRM with linear regression parameters. In Figure 3, it is apparent that the raw LCS data under-measure NO2 compared to the FRM and regularly report negative values. Figure 4 indicates that the overall responsiveness of the LCSs reflects changing NO2 concentrations, as the slope is close to 1; however, the variability, as demonstrated by the data scatter, shows LCS shortcomings. Statistics for the collocation period from all eight LCSs are shown in Table 2. By testing eight LCSs, we demonstrated that sensors of the same make and model do not always behave the same, as LCS #5 and #11 had substantially larger RMSE values and lower R2 values than the other LCSs. With these two outlier LCSs included, the standard deviation between all LCSs for the whole period was 8.8 ppb; excluding them lowered the standard deviation to 3.2 ppb. Results from the other six LCSs fell in the range of previously studied LCSs shown in Table 1; for example, Sensor 2 had an MAE of 8.7 ppb and an R2 of 0.5974, which were comparable to the AQMesh (V5.1) LCS.

As shown in numerous other studies [20][21][22][23][24], the data from LCSs can be improved through calibrations that account for relative humidity, temperature, and ozone, among other factors. Table S1 provides evidence that the raw output from these sensors is more strongly affected by these environmental conditions than the FRM measurements are: LCS #2 had stronger Pearson correlations with temperature, relative humidity, and ozone (T: −0.66, RH: 0.52, O3: −0.89) than the FRM data (T: −0.32, RH: 0.14, O3: −0.81). This effect is demonstrated in Figure 5, as the average absolute biases of the LCSs were higher when the relative humidity was greater than 35% and ozone was greater than 40 ppb. Additionally, Figure 5 and Table S1 further highlight the difference between LCSs #5 and #11 and the rest of the sensors, as they typically had absolute biases 20 ppb higher and slopes double those of the other sensors.
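The accuracy and fit metrics described under Calibration Evaluation Methods can be sketched as follows. MAE follows the definition given for Equation (6), with y_i the LCS value and x_i the FRM value. The RMSE is written in its standard form, and slope and R2 are taken from a simple linear regression of LCS on FRM; treating these as matching the EPA procedure and the paper's regression is an assumption, and the function names are illustrative.

```python
import math

# Paired-sample evaluation metrics: lcs holds y_i, frm holds x_i.


def rmse(lcs, frm):
    """Root-mean-square error between sensor and reference values."""
    return math.sqrt(sum((y - x) ** 2 for y, x in zip(lcs, frm)) / len(frm))


def mae(lcs, frm):
    """Mean absolute error: (1/n) * sum(|y_i - x_i|)."""
    return sum(abs(y - x) for y, x in zip(lcs, frm)) / len(frm)


def slope_and_r2(lcs, frm):
    """Slope and R2 of the least-squares line of LCS (y) against FRM (x)."""
    n = len(frm)
    mean_x = sum(frm) / n
    mean_y = sum(lcs) / n
    sxx = sum((x - mean_x) ** 2 for x in frm)
    syy = sum((y - mean_y) ** 2 for y in lcs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(frm, lcs))
    return sxy / sxx, (sxy * sxy) / (sxx * syy)
```

Because RMSE squares each error before averaging, it penalizes occasional large deviations more heavily than MAE, which is one reason both are reported side by side.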

Clarity LCS NO2 Calibration
To correct for varying responses to environmental conditions, Clarity applies a calibration to the raw NO2 concentration that accounts for the impact of temperature and relative humidity on sensor response. To develop their calibration, Clarity recommends collocating their LCSs with an FRM/FEM instrument for at least 2 weeks. This initial calibration was trained using data from a 15-day period (26 October 2020-9 November 2020; LCS #2, n = 361). The time series for the LCS #2 calibrated data is shown in Figure 6, with the scatter plot in Figure 7. After the calibration was applied, the LCS data at times underestimated and at times overestimated NO2, and the number of negative values decreased (n = 226) compared to the raw data (n = 366). Summarized statistical results of all sensors for the whole deployment period are in Table 3, while Figure 8 directly compares the raw data to the Clarity calibrated data. Figure 8A shows that the calibration reduced the RMSE from an average of 15 ppb to 9 ppb, and Figure S1 shows a reduction in MAE from 12 ppb to 7 ppb for all LCSs. Additionally, the average standard deviation over the whole deployment period was lowered from 8.8 ppb to 5.5 ppb. However, the scatter in the data was not uniformly improved, as LCS #5 and #11 had lower R2 values (0.35, 0.34) with the Clarity calibration compared to the raw data (0.36, 0.40) (Figure 8B). In terms of slope, the Clarity calibration improved LCS #5 (2.1 to 0.76) and #11 (1.8 to 0.73), with minimal improvements seen in the other sensors (Figure 8C). Additionally, as seen in Figures 6 and 7, the calibration did not completely remove negative or zero values of NO2, leading to larger errors.

In addition to comparing raw and Clarity calibrated data across the whole period, we wanted to compare data excluding the calibration training period and evaluate how the calibration performed over time. By excluding data from the calibration training period during the first 15 days of deployment, we were able to demonstrate the predictive ability of the calibration model. When this analysis was performed (Figure S2), we saw the same trends as when the whole period was evaluated: specifically, a uniform decrease in RMSE and MAE and minimal changes in R2 and slope, indicating that the calibration did improve the data during the test period. To evaluate the calibration over time, the RMSE and R2 values for the LCSs were calculated for 2-week periods, as shown in Figure 9 for both the raw and Clarity calibrated data, to produce a temporal trend. This analysis showed no statistical change in sensor performance over time. This lack of sensor drift can be explained by looking at the environmental parameters experienced by the LCSs over their deployment period. Figure S3 shows that temperature, relative humidity, NO2, and O3 did not substantially differ between the calibration period and the final 15 days of deployment (29 January 2021-12 February 2021), indicating that the training data spanned the range of environmental conditions experienced during the study.
Figure 9. Average RMSE (A) and average R2 (B) across the 8 sensors, evaluated biweekly for the whole deployment period for raw NO2 and the Clarity 15-day calibration.
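The biweekly evaluation described above amounts to grouping paired hourly values into consecutive 14-day windows and computing a metric per window. A minimal sketch, assuming records arrive as (timestamp, LCS value, FRM value) tuples and that windows are counted from a chosen start time, could look like this; the record layout and function name are hypothetical.

```python
import math
from datetime import datetime, timedelta

# ASSUMPTION: records are (timestamp, lcs_ppb, frm_ppb) tuples and windows
# are consecutive 14-day blocks counted from `start`.


def windowed_rmse(records, start, window_days=14):
    """Return {window_index: RMSE} for consecutive fixed-length windows."""
    squared_errors = {}
    for timestamp, lcs, frm in records:
        index = (timestamp - start).days // window_days
        squared_errors.setdefault(index, []).append((lcs - frm) ** 2)
    return {
        index: math.sqrt(sum(values) / len(values))
        for index, values in sorted(squared_errors.items())
    }
```

Plotting the resulting values against the window index gives the kind of temporal trend used to look for sensor drift.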

Impact of Volume of Training Data on Calibration Performance
To investigate possible approaches to improve the calibration process, we tested whether a larger training data volume improved the LCS measurements. For this, we doubled the volume of data the Clarity model was trained on to include data collected between 26 October 2020 and 23 November 2020 (LCS #2, n = 721). Figure 10 shows the RMSE (A) and R2 (B) values of the test period for the Clarity calibrated data using 15- and 30-day training periods. By using ~30 days for calibration, the RMSE decreased for all sensors during the test period, there was less scatter in the data except for LCS #7, and the standard deviation decreased to 4.3 ppb. Table 4 and Table S2 summarize the results of the 30-day Clarity calibration for the whole period, including how the data responded to environmental conditions. The 30-day Clarity calibrated data (Table S2) better accounted for environmental biases in the LCSs relative to the NO2 FRM measurement compared to the raw data previously shown in Table S1. For example, LCS #2 with the 30-day Clarity calibration had correlations with temperature, relative humidity, and ozone (T: −0.30, RH: 0.03, O3: −0.73) similar to those of the FRM data (T: −0.32, RH: 0.14, O3: −0.81). The 30-day Clarity calibrated data also had higher correlation coefficients with the FRM data (0.66 to 0.89) compared to the raw data (0.60 to 0.83). However, the 30-day Clarity calibration only adjusted for temperature and relative humidity, motivating further investigation of whether there was still room for improvement by accounting for other covariates in the model.
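Comparing the 15-day and ~30-day training periods comes down to moving the cutoff between training and test data. The sketch below shows one way to do that split by calendar date, assuming each record starts with a timestamp; the helper name and record layout are illustrative and not part of the Clarity workflow.

```python
from datetime import date, datetime

# ASSUMPTION: each record is a tuple whose first element is a datetime.


def split_by_training_end(records, train_end):
    """Records dated on or before train_end go to training, the rest to test."""
    train, test = [], []
    for record in records:
        if record[0].date() <= train_end:
            train.append(record)
        else:
            test.append(record)
    return train, test
```

Calling the helper with date(2020, 11, 9) and then with date(2020, 11, 23) reproduces the two training windows reported above, so the same evaluation code can be rerun on each test set.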

Ozone Correction for LCS Calibration
The Clarity calibration does not account for the effect of ozone on NO2 measurements because the LCS does not have a built-in O3 sensor, even though the electrochemical sensor used to monitor for NO2 is known to also respond to O3 [26]. For this study, the LCSs were collocated at an air quality monitoring site that also had an O3 FEM instrument. In review of the Clarity calibrated data, an initial observation was that elevated ambient levels of ozone impacted LCS NO2 performance, motivating an ozone correction to data collected during high ambient ozone levels. This effect is illustrated in Figure 2. In addition to correcting for the influence of ozone, our supplemental calibration was also designed to eliminate negative values by establishing a baseline. Figures 11 and 12 show the results of these corrections applied to the 30-day Clarity calibrated data on LCS #2. With this correction, the standard deviation between LCSs fell to 3.6 ppb, the RMSE decreased for all LCSs, and the R2 increased for all sensors, as shown in Table 5. Compared to the results of a similar multivariate linear regression study [21], our data showed a higher average R2 (0.80 vs. 0.41) and an average slope and intercept closer to ideal (0.989 and −0.3 vs. 0.664 and 3.0), but a higher MAE (4.6 ppb vs. 3.0 ppb).

In addition to improving sensor performance, this calibration also increased the LCS correlation to the FRM NO2 measurements (0.79 to 0.92) and optimized their response to environmental conditions compared to the 30-day Clarity calibration, as shown in Table S3. Furthermore, Figure 2 demonstrates that the impact of high ozone on the LCS NO2 measurements was lessened, as the LCS NO2/FRM NO2 values at ozone concentrations greater than 20 ppb were reduced. Figure 13 illustrates how the impact of environmental conditions on the sensor's accuracy was reduced, as the absolute sensor biases decreased compared to Figure 5. Looking at the calibration's performance over time in Figure 14, no temporal trends were observed; however, the ozone corrected data consistently had an RMSE that was lower and an R2 value that was higher than the raw and Clarity calibrated data.

Conclusions
LCSs will continue to become more widespread as sensing technologies advance and the need for high-spatial-resolution air quality data grows. However, with this increased usage comes the necessity of ensuring that LCSs generate high-quality data through an appropriate approach to calibration and evaluation. This study showed that environmental factors such as relative humidity, temperature, and ozone can affect the raw measurements of these LCSs, and that accounting for these variables in the LCS calibration improves the accuracy and precision of the data. Additionally, this study demonstrated that increasing the calibration training data volume results in improved calibration performance. It is important to note that relatively steady environmental conditions throughout the deployment period benefited the performance of the LCSs; in rapidly changing environments, such as those with extreme swings in temperature and relative humidity, one would anticipate more variable calibration performance and the need for more regular calibration adjustments. This study also showed the importance of developing a calibration model unique to each LCS, as not every LCS of a given brand and model responds equally to changing environmental conditions. This conclusion is based on the fact that two of the eight tested sensors showed anomalous behavior, which also demonstrates the importance of the scale of the study.

Supplementary Materials:
The following are available online at https://www.mdpi.com/article/10.3390/toxics9110281/s1, Figure S1: MAE values for raw and Clarity 15-day calibrated LCS data, Figure S2: RMSE, MAE, R2, and slope comparisons for raw and Clarity 15-day calibrated data with training data excluded, Figure S3: Environmental factor comparison for the 15-day calibration period and the final 15 days of LCS deployment, Table S1: Slope and Pearson correlation coefficients of West Phoenix FRM NO2 and raw LCS data, Table S2: Slope and Pearson correlation coefficients of West Phoenix FRM NO2 and Clarity 30-day calibrated LCS data, Table S3: Slope and Pearson correlation coefficients of West Phoenix FRM NO2 and ozone-corrected 30-day calibrated LCS data.

Data Availability Statement:
Data supporting the reported results will be provided upon request.