The United States Environmental Protection Agency (EPA) has implemented a Bayesian spatial data fusion model called the Downscaler (DS) model to generate daily air quality surfaces for PM2.5 across the contiguous U.S. Previous implementations of DS relied on monitoring data from EPA’s Air Quality System (AQS) network, which is largely concentrated in urban areas. In this work, we introduce to the DS modeling framework an additional PM2.5 input dataset from the Interagency Monitoring of Protected Visual Environments (IMPROVE) network, whose sites are located mainly in remote areas. In the western U.S., where IMPROVE sites are relatively dense (compared to the eastern U.S.), including IMPROVE PM2.5 data in the DS model runs reduces predicted annual averages and 98th percentile concentrations by as much as 1.0 and 4 μg m⁻³, respectively. Some urban areas in the western U.S., such as Denver, Colorado, show moderate increases in the predicted annual average concentrations, which sharpens the gradient between urban and remote areas. Comparison of observed and DS-predicted concentrations for the grid cells containing IMPROVE and AQS sites reveals consistent improvement at the IMPROVE sites but some degradation at the AQS sites. Cross-validation results for common site-days withheld in both simulations show a slight reduction in the mean bias but a slight increase in the mean square error when the IMPROVE data are included. These results indicate that the output of the DS model (and presumably other Bayesian data fusion models) is sensitive to the addition of geographically distinct input data, and that applications of such models should consider the prediction domain (national or urban focused) when deciding whether to include new input data.
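The two cross-validation statistics mentioned above can move in opposite directions, which may seem counterintuitive. The following is a minimal sketch of how that can happen, assuming mean bias is defined as the average of predicted minus observed concentrations and mean square error as the average squared difference. The function names and all numbers are hypothetical illustrations and are not taken from the study.

```python
import numpy as np

def mean_bias(pred, obs):
    """Average of (predicted - observed); sign convention is an assumption."""
    pred, obs = np.asarray(pred, dtype=float), np.asarray(obs, dtype=float)
    return float(np.mean(pred - obs))

def mean_square_error(pred, obs):
    """Average of squared (predicted - observed) differences."""
    pred, obs = np.asarray(pred, dtype=float), np.asarray(obs, dtype=float)
    return float(np.mean((pred - obs) ** 2))

# Made-up PM2.5 values (ug m^-3) for four withheld site-days; not real data.
obs = np.array([8.0, 12.0, 5.0, 20.0])
pred_aqs_only = np.array([9.0, 13.0, 6.0, 21.0])      # uniformly 1 too high
pred_with_improve = np.array([8.0, 14.0, 3.0, 20.0])  # errors cancel on average

for label, pred in [("AQS only", pred_aqs_only),
                    ("AQS + IMPROVE", pred_with_improve)]:
    print(label, mean_bias(pred, obs), mean_square_error(pred, obs))
```

In this toy case the second set of predictions has a smaller mean bias because its errors cancel, yet a larger mean square error because the individual errors are bigger. That is the same qualitative pattern the cross-validation comparison reports.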
This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.