Comparing Building and Neighborhood-Scale Variability of CO2 and O3 to Inform Deployment Considerations for Low-Cost Sensor System Use

The increased use of low-cost air quality sensor systems, particularly by communities, calls for the further development of best practices to ensure these systems collect usable data. One area identified as requiring more attention is that of deployment logistics, that is, how to select deployment sites and how to strategically place sensors at these sites. Given that sensors are often placed at homes and businesses, ideal placement is not always possible. Considerations such as convenience, access, aesthetics, and safety are also important. To explore this issue, we placed multiple sensor systems at an existing field site, allowing us to examine both neighborhood-level and building-level variability during a concurrent period for CO2 (a primary pollutant) and O3 (a secondary pollutant). In line with previous studies, we found that local and transported emissions, as well as thermal differences in sensor systems, drive variability, particularly for high-time resolution data. While this level of variability is unlikely to affect data on larger averaging scales, it could impact analysis if the user is interested in high-time resolution or in examining local sources. However, with thoughtful placement and thorough documentation, high-time resolution data at the neighborhood level has the potential to provide entirely new information on local air quality trends and emissions.


Introduction and Background
As research into, and the use of, low-cost air quality sensor systems continues to expand, there is great potential for this technology to support community-level investigations. Furthermore, given the nature of these sensor systems, such investigations provide data with increased resolution on both temporal and spatial scales. Ideally, such sensor systems offer greater insight into personal exposure [1], small-scale variability [2], and local emission sources or potential 'hot spots' [3]. One of the barriers to widespread sensor use has been concerns over data quality and reliability. There is a growing body of research demonstrating the ability of sensors to quantify pollutants at levels relevant to ambient investigations [4][5][6][7]. However, other issues have received less attention, for example, strategies for siting low-cost sensors. Sensor deployment and siting considerations are particularly important because while it is sometimes possible to re-analyze or re-quantify sensor data as new techniques become available, it is rarely possible to re-collect data, as the environmental conditions and emissions impacting a site are dynamic in nature. Careful consideration prior to deployment, and thorough documentation of the sensor placement, are therefore essential.

Deployment Overview (Sensor Systems, Siting, and Timeline)
The sensor systems utilized for this study, called Y-Pods (Hannigan Lab at CU Boulder, Boulder, CO, USA), contain several gas-phase and environmental sensors. This analysis utilizes data from the SGX (Corcelles-Cormondreche, Switzerland, formerly e2v) metal oxide semiconductor O3 sensors (model MiCS-2611) and ELT non-dispersive infrared CO2 sensors (model S-300) as well as data from environmental sensors (i.e., temperature and relative humidity). These sensor systems, or similar ones (e.g., the U-Pod, predecessor to the Y-Pod) operating the same sensors, have been used in prior sensor quantification and spatial variability studies [2,[15][16][17][18]. Figure 1 includes a photo of the interior of a Y-Pod and an example of two deployed Y-Pods. The Y-Pods, and all previous iterations, include a fan to drive active air flow resulting in multiple air exchanges per minute. The observations presented here would likely need to be re-evaluated for a system relying on passive flow. More information on signal processing and sensor quantification is available below in Section 2.2.
As previously mentioned, this study was integrated into a larger field deployment in Los Angeles allowing us to leverage one of the existing study sites and ongoing sensor calibration efforts. The study area is primarily high density residential with schools and some businesses nearby. In addition to local traffic and businesses (such as restaurants) other emission sources include two major highways to the North and East of the sampling area. The diagram in Figure 1 illustrates where the Y-Pods (B2, B3, B4, and B5) were added to the building site (main sensor system-B1). Note, the placements vary with respect to elevation and proximity to obstructions. Two Y-Pods were placed on the front of the building on a fire escape, two and three stories off the ground, and 6-12" from the side of the building. The fire escapes at the front and back of the building are both constructed of metal and allow for free airflow through and around the structures. The main Y-Pod was elevated on the roof, on top of a structure housing the stairs, close to the front of the building, and with no obstructions on any sides. The fourth and fifth Y-Pods were placed at the back of the building on another fire escape, one at the roof-level and the other three stories off the ground, again 6-12" from the side of the building. The back of the building is obstructed by a narrow alley that does not allow through-traffic; the lack of access to representative air flow makes the placement of B5 the least "ideal".
Figure 1 also illustrates the location of several other neighborhood sites from which data was used in this analysis (N1, N2, and N3). These sensor systems were deployed on a relatively small scale, with the furthest distance between any two neighborhood sites being less than 1000 ft. It is important to note that the placement of N1, N2, and N3 at their respective sites also introduces some added variability, as these placements differed site to site.
The Y-Pod placement for N1 was most similar to B3-on a large second story balcony, on the side of a building open to the road. The Y-Pod placement for N2 was also most similar to B3-at the front of the building, on the street side, but set back by a small yard/driveway and lower in elevation (~10 ft off the ground). The Y-Pod placement for N3 was most similar to B1, placed on the roof of a multi-family residence.
This study relies on comparing co-located sensor data with spatially deployed sensor data, therefore we limited the data utilized to match the lengths of our co-located datasets meaning approximately three weeks of data were included in the analysis. Figure 2 shows the timeline of long-term sensor use, including time periods of co-location and periods of field deployment. The co-location of all sensor systems prior to the field deployment was used to understand neighborhood variability; this co-located time period is referred to as Week 0. The Week 0 co-location occurred in a different part of Los Angeles at a regulatory monitoring site; this site is described in greater detail below in Section 2.2.2. For the first week of the building-scale variability study, the building Y-Pods (B2, B3, B4, and B5) were co-located with B1-this is referred to as Week 1. During this period the neighborhood Y-Pods (N1, N2, and N3) were already deployed to their field sites. Immediately following the first week of the field deployment the sensor systems were separated to their respective locations on the building and this is referred to as Week 2. The data from Week 2 was designated as the deployed dataset for both the neighborhood sites and the building sites.


Signal Processing and Sensor Quantification
Sensor signals were saved to a text file on a micro-SD card on the Y-Pod every 6-25 s, depending on the programming. As some of the metal oxide sensors used here require a warm-up period, the first half hour of data after a pod has been powered off for half an hour or more was removed. Minute-medians were computed; using medians instead of averages removes any single extreme points likely the result of electronic noise. For both the CO2 and O3 sensors, voltage values were recorded to the SD card as ADC values. These voltages were used as is for the CO2 sensor, but for the O3 sensor they were converted to a normalized resistance prior to analysis [2,4,19]. Note, all of the datasets for Weeks 0, 1, and 2 are complete with the exception of the O3 data from Y-Pod N3, on which the O3 sensor appears to have malfunctioned. Thus, this data has been excluded from the analysis.
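To make the pre-processing concrete, the warm-up removal and minute-median steps can be sketched as follows (a minimal Python example on synthetic data; the 10 s sampling rate, column name, and noise spike are hypothetical):

```python
import numpy as np
import pandas as pd

# Hypothetical raw Y-Pod log: one sample every 10 s (the actual rate varies, 6-25 s).
idx = pd.date_range("2017-07-01 00:00", periods=360, freq="10s")  # one hour
rng = np.random.default_rng(0)
raw = pd.DataFrame({"co2_mv": 800.0 + rng.normal(0.0, 2.0, len(idx))}, index=idx)
raw.iloc[250, 0] = 5000.0  # a single extreme point, e.g., electronic noise

# Discard the first 30 min after power-on (metal oxide sensor warm-up).
warmed = raw[raw.index >= raw.index[0] + pd.Timedelta(minutes=30)]

# Minute medians rather than minute means: a lone spike cannot shift a median.
minute_median = warmed["co2_mv"].resample("1min").median()
```

Using the median within each minute means a single noise spike cannot shift the reported value, whereas a minute mean would be pulled upward by it.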
Sensor signals were converted to concentrations using field calibration, which involves: (1) co-location with high-quality reference instruments; (2) the development of a calibration model using the air quality sensor signals, environmental sensor signals, and trusted reference data as well as a technique such as multiple linear regression; and (3) the evaluation of that model and its application to testing or validation data. Ideally, the sensors are co-located before and after the field deployment to better facilitate corrections for drift. It is common to incorporate environmental parameters into these calibration models as low-cost sensors are often cross-sensitive to temperature, humidity, and sometimes other pollutants [6]. This method of sensor quantification has been used by our research group as well as others [2,20,21] and with techniques such as linear regression, multiple linear regression, and machine learning [5,15]. Details of the calibration employed here are presented below.
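The three-step calibration workflow described above can be sketched with ordinary least squares on synthetic data (a minimal illustration only, not the study's actual model or coefficients; the sensitivities, noise levels, and split sizes are invented):

```python
import numpy as np

rng = np.random.default_rng(1)
n = 2000
temp = rng.uniform(10, 35, n)      # deg C
ah = rng.uniform(5, 15, n)         # absolute humidity, g/m^3
t = np.linspace(0.0, 1.0, n)       # elapsed deployment time (captures drift)
ref = 400 + 50 * rng.random(n)     # synthetic "reference instrument" CO2, ppm

# Step 1 stand-in: a synthetic sensor voltage that responds to CO2 plus
# temperature/humidity cross-sensitivities and a slow drift term.
v = 0.002 * ref + 0.001 * temp - 0.0005 * ah + 0.01 * t + rng.normal(0, 1e-4, n)

# Step 2: multiple linear regression, C ~ 1 + V + Temp + AH + t.
X = np.column_stack([np.ones(n), v, temp, ah, t])
train, test = slice(0, 1400), slice(1400, n)   # training vs. testing split
beta, *_ = np.linalg.lstsq(X[train], ref[train], rcond=None)

# Step 3: evaluate the model on the held-out testing data.
pred = X[test] @ beta
rmse = float(np.sqrt(np.mean((pred - ref[test]) ** 2)))
```

Because the synthetic voltage has a sensitivity of 0.002 V/ppm, the regression recovers a coefficient near 500 ppm/V on the voltage term, and the held-out RMSE stays near the injected noise level.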

Quantification of CO2 Sensors
For CO2 sensor quantification, the Y-Pods were twice co-located with a LI-840A (Licor, Lincoln, NE, USA) placed at a regulatory monitoring site near downtown Los Angeles. The Licor LI-840A has an expected uncertainty of <1% of the reading as stated by the manufacturer, and the instrument is calibrated using a zero and two-point span calibration with gas standards. The Licor used in this study was calibrated prior to a deployment during the previous summer and was stored between these deployments. As a result of the time lag, we expect drift to have impacted the CO2 reference data. However, as we are interested in sensor to sensor comparisons and the sensor data is baseline shifted (as described below), this drift is of minimal concern. These two co-locations with the Licor were 8 weeks apart and included 17 days total, 12 of which were used for calibration model training and 5 of which were used for model testing. In this instance more of the co-location data was designated for training in order to increase the robustness of the model and expand the environmental conditions for which the model was trained. The model used, Equation (1), included predictors for temperature (Temp), absolute humidity (AH), time (t), and the sensor signal or voltage (V) and solves for the CO2 concentration (C):

C = p1 + p2·V + p3·Temp + p4·AH + p5·t (1)

Due to logistics and a lack of available reference data, both calibration co-locations occurred prior to the building-scale variability study (Figure 2). For this reason, further signal processing was necessary. Given that the CO2 calibration model is extrapolating in time, additional drift was expected. Accordingly, the CO2 data was converted using the calibration model, then this data was baseline corrected (to remove drift), and finally the 10th percentile value from each Y-Pod was normalized to 400 ppm. We selected 400 ppm as it is the approximate atmospheric background concentration of CO2 [22].
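The final baseline-correction and normalization steps can be illustrated as follows (a sketch on synthetic data; the trailing-percentile baseline is one plausible implementation, as the exact baseline-correction method is not specified here, and all magnitudes are invented):

```python
import numpy as np

rng = np.random.default_rng(2)
hours = np.arange(24 * 7)  # one week of hourly values (synthetic)
diurnal = 420 + 30 * np.clip(np.sin(2 * np.pi * hours / 24), 0, None)
drift = 0.2 * hours                    # slow upward sensor drift, ppm per hour
co2 = diurnal + drift + rng.normal(0, 3, hours.size)

# Baseline-correct to remove drift: subtract a trailing lower envelope
# (5th percentile over the previous 24 h) from each point.
window = 24
baseline = np.array([np.percentile(co2[max(0, i - window):i + 1], 5)
                     for i in range(co2.size)])
detrended = co2 - baseline

# Normalize so the 10th-percentile value sits at the ~400 ppm background.
co2_norm = detrended - np.percentile(detrended, 10) + 400.0
```

After the shift, the 10th percentile of each pod's series sits at 400 ppm by construction, so pods can be compared on a common baseline even when absolute calibration has drifted.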
In light of the goals of this case study-comparing relative differences across co-located versus deployed sensors-this additional processing was deemed reasonable. Furthermore, the results illustrate the high correlation and agreement between co-located sensors post-processing, as would be expected and as is also present in the calibration data (Appendix A).

Quantification of the O3 Sensors
For O3 sensor quantification, the Y-Pods were co-located with API/Teledyne 400 instruments (San Diego, CA, USA) at two different regulatory monitoring sites. The first site was in Los Angeles in a mixed-use area with some nearby housing and industry. The second site was outside of Los Angeles in Shafter, a rural Californian community. These two co-locations occurred prior to and following the building-scale field deployment and therefore no additional signal processing was necessary. The model used, Equation (2), included predictors for temperature (Temp), absolute humidity (AH), time (t), and the normalized sensor resistance (R/R0), as well as an interaction term between temperature and the sensor signal, and solves for the O3 concentration (C):

C = p1 + p2·(R/R0) + p3·Temp + p4·AH + p5·t + p6·Temp·(R/R0) (2)

The interaction term is intended to address not only changes in baseline driven by temperature but also changes in the magnitude of sensor response driven by temperature. This model has been demonstrated to perform well for this sensor in previous studies [2,16].

Table 1 below provides the performance statistics from the generation and validation of the calibration models. The complete statistics for individual Y-Pods as well as time series data are available in Appendix A. For both CO2 and O3, there is relative consistency across the training and testing datasets. Additionally, the RMSE for the O3 sensor was consistent with the uncertainty typically cited for both this same sensor and other metal oxide O3 sensors [2,16,23]. A previous study using the CO2 sensor in a portable sensor system found a RMSE ranging from approximately 9-16 ppm depending on the calibration model selected [4].

Neighborhood-Scale Variability
Comparing Week 0 (co-located) to Week 2 (deployed to field sites), there is increased variability in both the CO2 and O3 data. For CO2, this variability is most extreme in the comparison between B1 and N3, which was also the site furthest away from B1 and closest to the highways. For this pair of sensors, the correlation decreases from 0.96 to 0.89 and the spread in the absolute differences as well as the median absolute difference increases (see Figure 3). This is not the case for the comparisons of B1 to N1/N2 where there is only a very small decrease in correlation. Examining the time series plots (available in Appendix B) reveals differences in the variability seen in Week 0 versus Week 2. For Week 0 the variability seems primarily driven by offsets in which one Pod is biased low or high for a period, whereas for Week 2, the variability seems driven by differences in trends between the sites, typically in the form of short-term enhancements. These enhancements present in the Week 2 data are likely sources or plumes impacting the sites unevenly.
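The comparison metrics used here (Pearson correlation and absolute differences between paired minute medians) can be reproduced on synthetic data; the plume sizes and rates below are invented for illustration of why short local enhancements depress correlation between a deployed pair:

```python
import numpy as np

rng = np.random.default_rng(3)
n = 7 * 24 * 60  # one week of minute medians
minutes = np.arange(n)

# Shared regional CO2 signal seen by every site (synthetic diurnal pattern).
shared = 420 + 40 * np.abs(np.sin(2 * np.pi * minutes / (24 * 60)))
b1 = shared + rng.normal(0, 5, n)

# Co-located case: a second pod sampling the same air.
n3_colocated = shared + rng.normal(0, 5, n)
# Deployed case: the same pod, now also seeing short local plumes.
n3_deployed = shared + rng.normal(0, 5, n)
for s in rng.integers(0, n - 30, size=40):
    n3_deployed[s:s + 30] += rng.uniform(20, 80)  # 30 min enhancement

r_colocated = float(np.corrcoef(b1, n3_colocated)[0, 1])
r_deployed = float(np.corrcoef(b1, n3_deployed)[0, 1])
abs_diff = np.abs(b1 - n3_deployed)
median_abs_diff = float(np.median(abs_diff))
```

The plumes add variance at one site that the other never sees, so the deployed correlation falls below the co-located one while the spread of absolute differences grows, mirroring the B1-N3 behavior described above.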
For O3, spatial variability across field sites was much more apparent. Although there was little change in the correlation coefficient, there was an increase in the spread in both the scatterplot and the boxplot (Figure 4). For Week 0, nearly all the absolute differences between B1 and N1/N2 were below the expected uncertainty (RMSE = 5.28 ppb). For Week 2, after the Y-Pods were spatially deployed the spread increased to well above the RMSE (see Figure 4). The time series plots (Appendix B) confirmed that this increased variability was primarily driven by short-term dips in O3 likely caused by localized destruction occurring in a NOx plume. While it is possible that the differences in increased variability between the sensor types were in part due to CO2 being a primary pollutant (thus less well-mixed) and O3 a secondary pollutant (generally more well-mixed), it is also worth noting that the CO2 sensor has a lower signal/noise than the O3 sensor in this application.
The boxplots show the absolute differences between B1 and each of the neighborhood pods, with the whiskers at the 5th and 95th percentiles, respectively. The ozone sensor for N3 malfunctioned and the data was not included.

Building-Scale Variability
Somewhat surprisingly, spatial variability was also observed at the building-level for both sensor types when comparing Week 1 (co-located at the building) and Week 2 (deployed). For CO2, there was a decrease in the correlations on the same scale as occurred across some of the neighborhood sites (Figure 5). For O3, again there are no significant changes to the statistics, but there is an increase in spread (Figure 6), similar to Figure 4. The time series (Appendix B) showed the events driving these differences were short-term in nature and appeared to be driven by local emissions or transported plumes. This influence of nearby emissions events was observed by Miskell and colleagues as well [10].

Hourly-averaged data was added to both Figures 5 and 6 to determine whether this spatial variability impacted data on more typical temporal reporting scales. Similar to Miskell and colleagues, the variability does not seem to impact the hourly O3 data [10]. However, given the decreased correlation coefficients (particularly for sites B2 and B5), it appears there was some variability still present in the hourly-averaged CO2 data.
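The effect of averaging scale can be illustrated on synthetic data: short venting-type events that depress minute-scale correlation between two building sites are largely smoothed out at the hourly scale (the site behavior and event sizes below are hypothetical):

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(4)
idx = pd.date_range("2017-07-08", periods=7 * 24 * 60, freq="1min")
shared = 420 + 40 * np.abs(np.sin(2 * np.pi * np.arange(idx.size) / (24 * 60)))

b1 = pd.Series(shared + rng.normal(0, 5, idx.size), index=idx)
b5 = pd.Series(shared + rng.normal(0, 5, idx.size), index=idx)
# Short venting-type events seen only at B5 (hypothetical on-building source).
for s in rng.integers(0, idx.size - 20, size=60):
    b5.iloc[s:s + 20] += rng.uniform(30, 100)

# Correlation between the sites at minute vs. hourly reporting scales.
r_minute = float(b1.corr(b5))
r_hourly = float(b1.resample("1h").mean().corr(b5.resample("1h").mean()))
```

Because a 20-minute event contributes only a fraction of any hour's mean, the hourly correlation recovers much of what the local events erased at the minute scale.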
For both pollutants, the most dramatic differences were between sites B1 and B4/B5, the two sites at the back of the building. Speaking with community partners from the project we determined that the building has both a natural gas hot water heater and natural gas dryers toward the back of the building where there are also pipes that appear to be venting these emissions. Sources on the building would seem to explain the large magnitude of the observed variability. By comparison, for the sites B2 and B3, which were on the front of the building above the road, there were occasional increasing spikes for CO2 and decreasing spikes for O3 that are smaller in magnitude. The range of responses observed in the sensors, along with this contextual information affirms that multiple pollutant sources were impacting the building in an uneven manner.
Providing further evidence for multiple sources, Figure 7 includes the absolute differences between Y-Pod B1 and B5 for CO2 (in blue) and O3 (in red). There are periods where the differences between CO2 and O3 were well-correlated, indicating a shared source. Following this period were instances where the differences were primarily visible in one pollutant or the other. This lack of correlation likely indicates two separate sources, one with relatively more CO2 and another with more NO. Furthermore, there were many instances where these differences between the two building sites were well above the RMSE values. In Figure A6 (Appendix B) the spatial differences have been plotted in such a way as to highlight the temporal aspect of both the increases in CO2 and decreases in O3 at the B5 site. The correlated differences in CO2 and O3 occur primarily in the evening hours, while the uncorrelated periods result in enhancements during early morning and daytime hours. These temporal patterns also point to separate sources influencing the sensor data.

In addition to nearby emission events, Miskell and colleagues observed that direct sunlight causes thermal variations in the instruments, in turn causing variability [10]. We compared the internal temperatures in the Y-Pods to determine whether this could be a source of variability in our study as well. Figure 8 depicts the variability in light of temperature differences. Again, B1 was placed on a roof with no nearby obstructions, meaning that it was exposed to more direct sun than B5, which was placed on a fire escape in an alley. In Figure 8, the internal temperature differences between B1 and B5 of less than three degrees Celsius were plotted separately from differences greater than three degrees Celsius. The line of best fit for the group with larger temperature differences (in yellow) illustrates a consistent bias in the data at low and high concentrations. This bias is visible in the time series as well: the B1 values are consistently greater than the B5 values when the temperature difference is above three degrees. Conversely, B1 and B5 are better matched in terms of long-term trends for smaller temperature differences.

Although the calibration model does incorporate corrections for temperature effects, the model would be unable to account for the small differences driven by direct sunlight exposure, as this would be difficult to control during co-location. The corrections incorporated into the calibration model are intended to deal with less acute temperature effects (e.g., diurnal patterns).

Figure 8. Two plots illustrating the effect of temperature differences between the pods. The scatter plot (left) depicts B1 vs. B5, separating points where the temperature difference between the two pods is less than and greater than three degrees Celsius. The time series (right) shows two days of data from B1 and B5, where the B1 data also has an overlay of temperature differences between the pods.
Siting choices and additional shading for the sensor systems could reduce this variability. Although some of the variability between building-sites can be attributed to thermal differences, it is important to recall that this variability is displayed as a bias rather than the larger spread associated with the variability driven by nearby emissions. Therefore, this variability would be unlikely to affect any conclusions about spatial differences due to sources in the same way the short-term enhancements would when examining high temporal resolution data.

Impact of Siting Choices on Neighborhood Variability Analysis
In agreement with the findings of Miskell and colleagues, we observed that both local emission plumes and temperature differences caused by exposure to direct sunlight can drive intra-site variability [10]. Also, as in the previous study, this spatial variability does not impact O3 concentrations on typical reporting scales (hourly or eight-hour averages, for example). However, the same is not necessarily true for CO2, suggesting it may be valuable to further investigate this aspect of variability for primary pollutants. The spatial variability observed here becomes especially important for communities interested in high-time resolution data, which may be used to assess exposure and/or understand the impact of local emission sources within a neighborhood. When high temporal and spatial resolution is of interest, incorrect placement could result in the inappropriate attribution of sensor responses or in failing to record emissions that are present. Figure 9 includes several days of data demonstrating the large magnitude of differences that can be observed across a single site.
To further explore the impact of the building-scale variations on the community-scale spatial differences, Figures 10 and 11 depict the average of the neighborhood sites with one building site selected and assumed to be representative for that location. The shading on the plot indicates the standard deviation for each mean. For the first case, in blue, Y-Pod B5 was selected as the building-site Pod; for the second case, in red for minute-median and green for hourly averaged data, B1 was selected. Similar to the previous comparison, there are minimal differences between the hourly O3 datasets and only a few instances in the hourly CO2 data where the mean of the B5 dataset differs beyond the standard deviation of the B1 dataset. However, examining the minute-median data for either pollutant, one might draw different conclusions regarding the neighborhood variability depending on which building site was selected. For example, one might anticipate more variability with B5 selected, or fewer local sources capable of scavenging O3 with B1 selected. If examining the maximum daily CO2 concentrations, the results for several days would differ. Regardless of which building site is selected, the diurnal trends are consistent, potentially providing an indication of regional trends. Also, for the minute data, this difference between the datasets is more extreme for the CO2 data, possibly because CO2 is a primary pollutant and less well-mixed in the atmosphere.
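The averaging-scale effect discussed above can be demonstrated with a short sketch: two co-located series that disagree sharply at minute resolution (because one catches short plumes) largely converge once averaged to hourly values. All data and names here are synthetic assumptions for illustration, not the study's measurements.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(1)
idx = pd.date_range("2020-06-01", periods=2 * 24 * 60, freq="min")

# A smooth shared regional signal (e.g., a CO2 diurnal cycle, in ppm).
regional = 420 + 10 * np.sin(np.arange(len(idx)) * 2 * np.pi / (24 * 60))

# "B1" sees occasional short, strong plumes; "B5" does not.
plumes = (rng.random(len(idx)) < 0.01) * rng.uniform(20, 80, len(idx))
b1 = pd.Series(regional + plumes + rng.normal(0, 2, len(idx)), index=idx)
b5 = pd.Series(regional + rng.normal(0, 2, len(idx)), index=idx)

# Largest disagreement at minute resolution vs. after hourly averaging.
minute_gap = (b1 - b5).abs().max()
hourly_gap = (b1.resample("60min").mean()
              - b5.resample("60min").mean()).abs().max()
```

Here `hourly_gap` is far smaller than `minute_gap`, mirroring the finding that site choice matters for minute-median data but much less for hourly averages.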

Generalizability of Building-Scale Spatial Variability & Potential Recommendations
There are a few aspects of this study that limit generalizability: we used short periods of data, we only examined the variability around one building in Los Angeles (variability might look different around a different structure or in a different city), and the two sensor types we used rely on different operating principles. Given these limitations, recommendations can still be made based on this analysis. As the following recommendations are intended for individuals or groups interested in conducting sensor studies, more general "best practice" recommendations have been included as well. While some of these are more general, specifically the first and fourth ones, the results of the study nonetheless affirm their value. Furthermore, these suggestions complement the US EPA's existing recommendations for planning a study and siting sensors [11]. These recommendations are especially relevant to studies involving high-time resolution data on a neighborhood or source-scale:

• Compare Sensors: Co-locating sensors in the field will support a better understanding of inter-sensor variability prior to their deployment, which will aid in attributing new variability introduced by the deployment of sensor systems to separate sites. These relative comparisons can also be valuable if there are problems with the calibration.

• Placement and Distribution: To study a particular emission source, place sensors upwind and downwind of the site of interest, at varying distances. Some of the sensors should have a line of sight to the emission source. Consider factors such as typical wind directions and potential obstructions, which may impact the transport of emissions. These placements should also minimize added variability when possible. For example, shading all sensor systems, placing them on the same sides of buildings, or placing them exclusively on rooftops could reduce the variability and biases that result from occasional direct sunlight.

• Supplementary Sensor Data: Consider using multiple systems or sensor types. The ability of sensors to capture variability on small spatial scales could be leveraged to aid in source identification by placing multiple sensor systems at a site, with the objective of capturing local emissions with some systems and targeting exclusively regional trends with others. Leveraging data from multiple sensor types could also shed light on sources and emissions by studying the correlations or temporal patterns of data from sensors intended to measure different target pollutants.

• Document Deployment: Document your deployment in writing and with photos (take photos of the sensor systems from different angles and photos from the sensors of what they "see"). Learning about nearby activities could provide contextual information that aids in data interpretation and reduces the misinterpretation of sensor data.
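The first recommendation, a field co-location before deployment, can be summarized quantitatively: compute each sensor's typical offset from the fleet while all units sample the same air. The sketch below uses synthetic data and hypothetical pod names; it is one simple way to characterize inter-sensor bias, not the study's calibration procedure.

```python
import numpy as np
import pandas as pd

def colocation_offsets(readings):
    """Median offset of each sensor from the fleet median.

    readings: DataFrame with one column per sensor system, rows aligned
        in time while all units are co-located.
    Returns a per-sensor Series (same units as the data). After the pods
    are deployed to separate sites, these baseline offsets help separate
    persistent inter-sensor bias from genuine spatial differences.
    """
    fleet_median = readings.median(axis=1)
    return readings.sub(fleet_median, axis=0).median()

# Hypothetical co-location: four pods sampling the same air,
# with pod "B1" carrying a small systematic offset (~5 ppm).
rng = np.random.default_rng(2)
truth = rng.uniform(410, 440, 300)
readings = pd.DataFrame({
    "B1": truth + 5 + rng.normal(0, 1, 300),
    "B5": truth + rng.normal(0, 1, 300),
    "N1": truth + rng.normal(0, 1, 300),
    "N2": truth + rng.normal(0, 1, 300),
})
offsets = colocation_offsets(readings)
```

A fleet median is used as the reference (rather than any single unit) so that one misbehaving sensor does not distort the comparison for the others.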

Conclusions
This deployment demonstrated how the variability in CO2 and O3, measured using low-cost sensors, across a single sampling site can be comparable to the variability across several sites in a neighborhood. However, this spatial variability occurs primarily in high-time resolution (<1 h) data, as it appears to be driven by nearby emission plumes and occasional thermal differences. As Miskell and colleagues reported, these differences do not persist at typical reporting scales [10], but if a researcher or community is interested in high-temporal resolution data, then this variability could become significant. This variability might also be more important to consider for studies taking place on smaller spatial scales, such as the neighborhood scale of this study, rather than larger regional scales.
While minute-level data is not currently utilized for regulatory purposes, it can provide powerful preliminary and supplementary information when it comes to understanding the activities and experiences in a community and at local scales. Furthermore, the presence of building-level variability does not exclude sensors from being used in air quality investigations, but rather affirms their ability to detect these differences in trends. Through attention to siting and thorough planning and documentation, there is the potential for the collection of an entirely new type of data that could, for example, inform detailed investigations into the impact of a single source on a neighborhood, track the transport of emissions through an area, or clarify the acute effects of brief, high-concentration exposures. These potential applications suggest that this new type of data, made possible by sensors, could eventually support improved public health.
Supplementary Materials: Processed sensor data is available by request; please contact the corresponding author. To discuss the availability of raw sensor data and associated code for processing, please also contact the corresponding author.

Acknowledgments: We would like to thank project partners: Esperanza Community Housing, Sandy Navarro, partners at the University of Southern California and Occidental College, and all community member participants. We would also like to thank our regulatory partners for assistance with sensor co-locations, including access to reference monitoring stations and associated data: South Coast Air Quality Management District, San Joaquin Valley Air Pollution Control District, and California Air Resources Board (note regarding reference air quality data: this data has not passed through the normal review process and is therefore not QA'd and is unofficial data).

Conflicts of Interest:
The authors declare no conflict of interest.