A Mobile Air Pollution Monitoring Data Set

Air pollution was observed in Hamilton, Ontario, Canada using monitors installed in a mobile platform from November 2005 up to November 2016. The dataset is an aggregation of several project specific monitoring days, which attempted to quantify air pollution spatial variation under varying conditions or in specific regions. Pollutants observed included carbon monoxide, nitric oxide, nitrogen dioxide, total nitrogen oxides, ground-level ozone, particulate matter concentrations for size cuts of 10 μm, 2.5 μm and 1 μm, and sulfur dioxide. Observations were collected over 114 days, which occurred in varying seasons and months. During sampling, the mobile platform travelled at an average speed of 27 km/h. The samples were collected as one-minute integrated samples and are prepared as line-segments, which include an offset for instrument response time. Sampling occurred on major freeways, highways, arterial and residential roads. This dataset is shared in hopes of supporting research on how to best utilize air pollution observations obtained with mobile air pollution platforms, which is a growing technique in the field of urban air pollution monitoring. We conclude with limitations in the data capture technique and recommendations for future mobile monitoring studies. Dataset: Data are provided as a supplemental document. Dataset License: CC-BY-SA


Summary
Air pollution is well established to have negative effects on human health [1–3]. By the 1970s, it was apparent that measuring pollution in an urban setting required more than one monitoring unit to capture the spatial variation [4,5]. Monitoring programs began to use multiple monitors within a city; however, this quickly becomes expensive as the number of pollutants and locations observed increases. Another approach that researchers have applied to assess the spatial variability of air pollutants is mobile air pollution monitoring, where monitoring equipment is mounted on a mobile platform, such as a motor vehicle. Two general approaches have emerged. The first uses lightweight, lower-cost sensors that are small and portable and may be attached to an individual or a bike [6]. The second installs reference-grade air monitoring equipment in a mobile platform (e.g., a commercial van) [7]. By reference grade, we refer to equipment that a national or regional environmental agency has approved for use in its monitoring programs (e.g., EPA Reference Equivalent Methods). Mounting monitors on a mobile platform allows for observations in various locations.
Mobile data collection results in a unique dataset that is spatially and temporally discontinuous. Unlike a fixed-location monitor, which measures change across time at one site, a mobile platform may not revisit a location with any regularity. Repetition may be staggered, which introduces periods of unobserved data. Researchers have begun to address the challenge of determining a long-term mean from incomplete samples, either by adjusting the mobile data based on the value at a fixed-location monitor [8] or by incorporating both mobile and fixed-location data into the air pollution model [9]. Attempts have also been made to understand the sampling saturation necessary in mobile campaigns [10–12]. We believe that investigating the challenges of mobile collection is timely in air pollution research.
This dataset contains some of the earliest examples of mobile air pollution monitoring [7]. The data have supported many research outputs, including the identification of elevated air pollution concentrations during localized temperature inversion episodes in the City of Hamilton [13], land use regression modelling of SO2 [14], comparison of air quality change between two time periods [15], examination of the effectiveness of road dust mitigation approaches [16], real-time air pollution mapping of multiple pollutants [9], and estimation of air pollution exposure by mode choice of students [17].
Our motivation for providing these data to the research community is that we cannot address all the challenges that surround the use of mobile air pollution data ourselves. We hope that, by sharing this rich dataset, others can use it for simulation studies of mobile sampling campaigns, to compare their results with another geographic region, or to supplement their own studies. The dataset consists of eleven years of mobile air pollution monitoring conducted in Hamilton, Ontario, Canada, between November 2005 and November 2016. The observations were collected during 114 days of air pollution monitoring.

Data Description
The data are provided as a single plain-text comma-separated values (CSV) file. The file contains 46,080 rows (air pollution observations) and 16 columns (attributes) for each observation. The data were collected as one-minute integrated samples and are prepared as line segments, i.e., the start and end points of the one-minute integrated sample. The column headers and a description of each attribute are included in Table 1. Missing data are indicated by a value of NA.
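As a minimal sketch of how such a file might be read, the following Python example parses a small in-memory stand-in for the CSV and converts the NA marker to a missing value. Only the four POINT_* column names appear in this article; the "CO" column name, the coordinate values, and the helper names `parse_value` and `read_observations` are assumptions made for illustration, and the real headers should be taken from Table 1.

```python
import csv
import io

# Tiny in-memory stand-in for the supplemental CSV file. Only the four
# POINT_* column names come from the article; "CO" and all values are made up.
SAMPLE = """POINT_X_START,POINT_Y_START,POINT_X_END,POINT_Y_END,CO
-79.870,43.250,-79.865,43.252,0.31
-79.865,43.252,-79.860,43.254,NA
"""


def parse_value(text):
    """Return None for the NA marker, a float for numeric text, else the text."""
    if text == "NA":
        return None
    try:
        return float(text)
    except ValueError:
        return text


def read_observations(handle):
    """Read rows from an open CSV handle into dicts, mapping NA to None."""
    reader = csv.DictReader(handle)
    return [{key: parse_value(value) for key, value in row.items()}
            for row in reader]


rows = read_observations(io.StringIO(SAMPLE))
# Keep only the observations where the (hypothetical) CO column is present.
co_values = [row["CO"] for row in rows if row["CO"] is not None]
print(len(rows), co_values)
```

To read the actual file, the `io.StringIO(SAMPLE)` argument would be replaced by an open file handle, e.g. `open(path, newline="")`, where `path` points to the supplemental document.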

Methods
The study area is Hamilton, Ontario, Canada, whose air quality has been the subject of many studies [7,13,18–21]. Hamilton developed around an industrial core that has traditionally focused on steel production; in recent years, these operations have slowed. The population of Hamilton was 504,560 in 2006, increasing to 519,955 in 2011 and to 536,930 in 2016 [22], an increase of 6.4% over the period of study. The Niagara Escarpment transects Hamilton, separating the region into an upper and a lower city. Temperature inversions have been demonstrated to elevate air pollution concentrations in the lower city [13]. The city is intersected by two freeways and two other major highways, which are sources of urban pollution, and is located on the western tip of Lake Ontario. In Figure 1, we present the land use, transportation network, the escarpment that separates the upper and lower city, and the boundary of the City of Hamilton.
Data 2019, 4, 2
The mobile pollution data were collected between November 2005 and November 2016, a span of eleven years. In Figure 2, we present the periods when observations were conducted and which pollutants were measured. During the data collection there was no standard location revisit approach (i.e., a location may have been sampled only once). The data were collected for specific studies, but they have all been combined into this larger dataset. During collection, a GPS unit obtained locational information that was paired with the air pollution observations. The average speed of the mobile platform was 27 km/h; the intent was to drive as slowly as possible, but during highway data collection this was limited to speeds that would not impede traffic flow. The operation of the mobile platform, including instrument calibration, was overseen by technicians from the Ontario Ministry of the Environment, Conservation and Parks. The monitoring instrumentation is presented in Table 2; after October 31, 2011, a few of the instruments were replaced. Monitoring instruments were installed in a commercial van with the air intake centered on the vehicle roof. In Figure 3, we present images of the mobile platform, including the installation of monitors, the air intake, and the sampling manifold.
We report one-minute integrated samples, which are the average concentration obtained over one minute. Because the vehicle is usually moving during the one-minute observation, the samples are not point samples but are better represented as line segments. The line segment length varies with the speed of the vehicle over the minute. The air pollution instruments report in near-real time, but there is a lag from when the air sample enters the instrument until the measurement is completed. This lag period is presented in Table 2 for each instrument and ranges from 6 s up to 240 s. The lag period was corrected by offsetting the GPS coordinates based on the length of each instrument's response time. An additional lag of 20 s was added to account for the air intake time, which was estimated from a puff test at the air intake. However, we recognize that this value may be over- or underestimated, as the sampling system received modifications during the period of study. The potential locational error is greatest for the longest segments, which correspond to faster speeds. For example, the CO instruments have a lag of 60 s (plus the 20 s system lag); for an air pollution measurement completed at 10:00:00, we would have selected the line segment end coordinates (POINT_X_END & POINT_Y_END) obtained from the GPS at 9:58:40 (80 s earlier). The line segment start location (POINT_X_START & POINT_Y_START) would have been the GPS coordinates from 9:57:40, which is when the observation began. In Table 3, we present summary statistics for the observation data.
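The lag correction described above can be illustrated with a short Python sketch. It reproduces the CO example (60 s instrument lag plus 20 s intake lag) and also computes the nominal segment length at the reported average speed. The function names and the calendar date are hypothetical, and a constant speed over the minute is assumed for the length calculation; this is not the processing code used to prepare the dataset.

```python
from datetime import datetime, timedelta

INTAKE_LAG_S = 20      # air intake lag estimated from the puff test (from the text)
SAMPLE_LENGTH_S = 60   # one-minute integrated sample (from the text)


def segment_gps_times(measurement_done, instrument_lag_s):
    """Return (start_time, end_time) of the GPS fixes for one observation.

    The end of the segment is the measurement completion time minus the
    instrument lag and the intake lag; the start is one sample length earlier.
    """
    total_lag = timedelta(seconds=instrument_lag_s + INTAKE_LAG_S)
    end_time = measurement_done - total_lag
    start_time = end_time - timedelta(seconds=SAMPLE_LENGTH_S)
    return start_time, end_time


def nominal_segment_length_m(speed_kmh, seconds=SAMPLE_LENGTH_S):
    """Distance in metres travelled at a constant speed over the sample."""
    return speed_kmh * 1000.0 / 3600.0 * seconds


# CO example from the text; the calendar date is arbitrary (hypothetical).
done = datetime(2011, 10, 31, 10, 0, 0)
start, end = segment_gps_times(done, instrument_lag_s=60)
print(start.time(), end.time())        # 09:57:40 09:58:40
print(nominal_segment_length_m(27))    # 450.0
```

At the average speed of 27 km/h, a one-minute segment is about 450 m long, so an error of a few seconds in the assumed lag shifts the segment by tens of metres, and by more at highway speeds.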

User Notes
Other open data we have found useful include the Government of Canada's Historical Climate Data (URL: http://climate.weather.gc.ca/), which includes hourly meteorological observations within and adjacent to the study area. Depending on the station, the data may include temperature (°C), dew point temperature (°C), relative humidity (%), wind direction, wind speed (km/h), visibility (km), station pressure (kPa), humidex, wind chill (°C), and a weather description. Fixed-location air pollution data can be obtained from the Ontario Ministry of the Environment, Conservation and Parks' Air Quality Ontario webpage (URL: http://www.airqualityontario.com/), which includes three monitors in Hamilton and one adjacent monitor in Burlington. The pollutants observed at each monitor may include O3 (ppb), PM2.5 (µg/m³), NO (ppb), NO2 (ppb), NOX (ppb), SO2 (ppb), and CO (ppm), which are reported at hourly intervals. Additional pollutants beyond criteria air pollutants, such as VOCs and trace elements, can be obtained from Environment and Climate Change Canada's National Air Pollution Surveillance Program (NAPS) (URL: http://maps-cartes.ec.gc.ca/rnspa-naps/data.aspx) at hourly, daily, monthly, or irregular intervals. Industrial point source emission data are available from the Government of Canada's National Pollutant Release Inventory (URL: https://www.canada.ca/en/services/environment/pollution-waste-management/national-pollutant-release-inventory.html), which tracks over 320 pollutants at 7,000 facilities in Canada.

Limitations and Recommendations
Reflecting on this data collection, we would be remiss if we did not address the many challenges we encountered in the use of these data. The first is the challenge of capturing a spatiotemporal phenomenon with a spatiotemporal platform. Because air pollution releases move throughout the region, capturing short spatiotemporal events is problematic. For example, if elevated concentrations are observed, do they represent an infrequent release passing through the airshed, or are they representative of general conditions? This may be improved with a regular monitoring schedule, such as that employed by Apte et al. [12]. The second is understanding the limitations of the equipment. In this approach we have assumed that instrument response time is equal to a lag period, which is not completely the case but is our best approximation in rapidly changing conditions. The lag period is affected by the flow in the air intake, the length of the sample tubing, and the operation mode of the instrument, among other factors.
Based on our experience in working with these data, we recommend two data collection approaches. (1) If the objective is to develop a surface of ambient air pollution concentrations, our recommendation is a drive-and-park approach. With this approach, the researcher selects sample locations based on their sampling criteria, then drives to each site and samples for a period (for example, 15 min) while operating from battery power. The length of time will be determined by the research objective and local temporal variability, e.g., understanding a day's spatial variation will require shorter parking periods than understanding long-term conditions. Sampling while the vehicle is parked also provides an opportunity to collect additional parameters, such as wind speed and wind direction. (2) If the goal is to obtain in-traffic concentrations, then we recommend that researchers drive the road segment multiple times over their period of interest. The multiple passes can be averaged to provide a more representative estimate of local in-traffic pollution. For example, if we were interested in measuring freeway concentrations, we would drive a loop of the freeway segments between two interchanges. It may be helpful to capture video while driving, which could be processed using machine learning techniques to identify potential high-emitting vehicles [23].
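The averaging of repeated passes in recommendation (2) could be sketched as follows in Python. The segment identifiers, the concentration values, and the function name `average_by_segment` are hypothetical; a simple arithmetic mean is assumed here, and other summaries (e.g., a median) may be preferable where the passes contain short-lived peaks.

```python
from collections import defaultdict
from statistics import mean


def average_by_segment(passes):
    """Average concentration per road segment across repeated passes.

    `passes` is an iterable of (segment_id, concentration) pairs, one pair
    per pass over a segment. Missing concentrations (None) are skipped.
    """
    grouped = defaultdict(list)
    for segment_id, concentration in passes:
        if concentration is not None:
            grouped[segment_id].append(concentration)
    return {segment_id: mean(values) for segment_id, values in grouped.items()}


# Hypothetical loop of two freeway segments driven several times.
passes = [
    ("A", 10.0), ("B", 8.0),
    ("A", 14.0), ("B", 12.0),
    ("A", None),               # a pass with a missing observation
]
print(average_by_segment(passes))  # {'A': 12.0, 'B': 10.0}
```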
Prior to data collection, we recommend conducting a test for mobile vehicle self-pollution, which is especially important for the drive-and-park approach. The self-pollution may occur from driving to the location, or if the vehicle must be running to supply power for the monitoring units. We suggest setting up the monitors in a "clean" environment with the same air intake height, and then driving the mobile vehicle beside the unit and recording the change from background conditions. If the vehicle is turned off, the time it takes to return to background conditions should be calculated, which can be used to remove observations from the time series. If the vehicle is running during collection, the change observed relative to background conditions could be subtracted from the data.
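One way to turn the self-pollution test into numbers is sketched below in Python. The tolerance band, the example readings, and the function names are assumptions for illustration only. The sketch treats the return to background as the first reading after which all later readings stay within the tolerance, and treats the running-engine effect as a constant offset, both of which are simplifications.

```python
from statistics import mean


def recovery_time_s(series, background, tolerance):
    """Time after engine-off at which readings settle back to background.

    `series` is a list of (seconds_since_engine_off, concentration) pairs in
    time order. Returns the first time from which every later reading stays
    within `tolerance` of `background`, or None if that never happens.
    """
    recovered_at = None
    for seconds, value in series:
        if abs(value - background) <= tolerance:
            if recovered_at is None:
                recovered_at = seconds
        else:
            recovered_at = None  # left the band again, so restart the search
    return recovered_at


def self_pollution_offset(running_values, background):
    """Mean elevation above background while the engine is running."""
    return mean(running_values) - background


# Hypothetical test readings (arbitrary concentration units).
engine_off = [(0, 30.0), (60, 22.0), (120, 12.0), (180, 10.5), (240, 10.2)]
print(recovery_time_s(engine_off, background=10.0, tolerance=1.0))  # 180

offset = self_pollution_offset([12.0, 13.0, 11.0], background=10.0)
corrected = [value - offset for value in [15.0, 12.5]]
print(offset, corrected)  # 2.0 [13.0, 10.5]
```

In the drive-and-park case, observations from the first `recovery_time_s` seconds after parking would be dropped; in the running-engine case, the offset would be subtracted from each observation.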

Figure 1. Panel A (top) presents the extent of the City of Hamilton with the land use, transportation network, and the escarpment. Panel B (bottom) overlays the air pollution line segment start points on the land use layer.


Figure 2. Air pollution observations by pollutant and time. Black lines indicate monitoring periods. The red line indicates the change in instrumentation.


Figure 3. Mobile monitoring unit. Panel A: Rack-mounted monitors. Panel B: Air intake highlighted by red rectangle. Panel C: Sampling manifold connected to air intake.

Table 1. Air pollution file attributes.

Table 2. Air pollution monitor attributes.


Table 3. Air pollution observation summary statistics.