Consistency and Discrepancy between Visibility and PM2.5 Measurements: Potential Application of Visibility Observation to Air Quality Study

High-quality measurements of air quality are the highest priority for understanding widespread air pollution. Visibility has been widely suggested as a good alternative measure to PM2.5 concentration. In this study, the similarities and differences between visibility and PM2.5 measurements in China are examined, and the results reveal the potential application of visibility observations to the study of air quality. Based on quality-controlled PM2.5 and visibility data from 2016 to 2018, the nonparametric Spearman correlation coefficients (ρ) between stations for PM2.5 and for the visibility-derived surface extinction coefficient (bext) decrease as the station distance (R) increases. Some relatively low ρ values (<0.4) occur in regions characterized by the lowest (background) levels of PM2.5 and bext, for example, the Tibetan and Yungui Plateaus. The relatively lower ρ for bext compared to PM2.5 is probably caused by the predefined maximum threshold of visibility measurements (generally 30 km). A significant correlation between PM2.5 and bext is found at most stations, and relatively larger ρ values are evident in eastern China (Northeast China excluded) and in winter (the national median ρ is 0.67). Abrupt changes in the specific mass extinction efficiency (αext) imply a potentially large influence of the replacement or recalibration of visibility sensors on visibility measurements. The bext data are therefore corrected by comparison to reference measurements at adjacent stations, which yields three-year quality-assured visibility and bext datasets.


Introduction
As a byproduct of rapid and energy-intensive economic development, a huge amount of precursors and particles are emitted into the atmosphere. Anthropogenic emissions, working with natural emissions such as dust and biomass burning, lead to a thick layer of extensive haze covering thousands or tens of thousands of acres. Large areas of haze occur all over the world but are more often observed in developing countries. For example, a dense blanket of polluted air often occurs in eastern China, especially over the North China Plain in winter and in central eastern China in the crop harvest season [1][2][3][4][5]. Similarly, a thick brownish-gray haze often covers much of the Ganges Plain in the pre-monsoon season [6].
Measurements are the highest priority for understanding the formation and maintenance of widespread haze events and their extensive impacts. Ground measurement is the usual approach to obtaining accurate measurements of air quality indexes, among which PM2.5 concentration (particulate matter with aerodynamic diameter < 2.5 µm) is the key component. PM2.5 fine particles are critical because of their extensive impacts. PM2.5 is the major factor leading to a reduction in visibility in the absence of precipitation. Visibility has decreased over the past half-century in many regions of the world owing to the increase in aerosol emissions into the atmosphere [7][8][9]. An increase in PM2.5 also led to a reduction in the solar energy reaching the ground across the world from 1960 to the 1980s.

PM2.5 concentrations when aerosol hygroscopic growth is carefully considered. A good linear relation between PM2.5 and the visibility-derived extinction coefficient (bext) is observed based on short-term measurements [33,34]. However, analysis of the consistency and discrepancy between PM2.5 and visibility measurements may show a few important issues which have not been revealed by previous univariate analyses of measurement uncertainties. This tells us that caution should be taken in applying visibility measurements in the study of air quality. The rest of this paper is structured as follows. Section 2 introduces the data and method. Section 3 presents major results; and the discussion and conclusions are summarized in Section 4.

Data and Method
Hourly PM2.5 and PM10 concentrations (μg m−3) are available at ~1600 stations, each with at least one year of measurements during the period from 2016 to 2018. The spatial distribution of the stations is shown in Figure 1, which clearly shows that the stations are clustered in urban regions; for example, 12 national stations are located mainly in the urban and suburban areas of Beijing. Conversely, the visibility stations are much more evenly distributed, especially in eastern China (Figure 1). Hourly visibility data (km) are available from the National Meteorological Information Center (NMIC) of CMA at 2395 stations for the period from 2016 to 2018. The extinction coefficient (bext, Mm−1) is derived by dividing 2.996 by the visibility, with the corresponding unit conversion, according to the instrument manual [35]. Hourly relative humidity (RH) measurements are also available from the NMIC to allow a humidity correction of bext. The data quality assurance procedure of Wu et al. [36] is adopted in this paper to provide quality control for the PM2.5 and bext data. In general, measurements that deviate greatly from observations at adjacent times or in neighboring areas, or that have a very low temporal variance, are classified as outliers. Regular calibration of instruments and consistency between PM2.5 and PM10 are checked for quality assurance of the PM2.5 data. For the bext data, the consistency between 10 min and 1 min measurements is checked [31].
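As a minimal sketch of the conversion described above, the following Python function applies the relation between visibility and extinction with the constant 2.996 quoted in the text. The function name and the handling of missing values are illustrative assumptions and are not taken from the instrument manual [35]; the factor of 1000 only converts km−1 to Mm−1.

```python
import math


def visibility_to_bext(vis_km):
    """Convert visibility (km) to an extinction coefficient (Mm^-1).

    Hypothetical helper: bext [km^-1] = 2.996 / V [km], and
    1 km^-1 = 1000 Mm^-1. Missing or non-positive visibility
    yields NaN instead of raising an error.
    """
    if vis_km is None or not math.isfinite(vis_km) or vis_km <= 0:
        return math.nan
    return 2.996 / vis_km * 1000.0
```

With this relation, a visibility of 30 km corresponds to a bext of about 100 Mm−1, which matches the value quoted for the upper reporting limit of the sensors.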
In order to examine the relationship between PM2.5 and bext, PM2.5 measurements at stations no further than 25 km from a visibility station are averaged to match the bext data. This collocation procedure balances two requirements: matching PM2.5 and bext at stations as close to each other as possible, in order to minimize the potential effect of spatial variation, and collocating substantially more bext and PM2.5 stations for statistical analysis. Simultaneous measurements of PM2.5 and bext are available at 502 stations (Figure 1b), which are used in the following analysis.
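The collocation step can be sketched as follows. This is an illustration under stated assumptions rather than the procedure actually used: the station records are hypothetical dictionaries, distances are great-circle distances on a spherical Earth, and missing hourly values are represented by None.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius, spherical approximation


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points given in degrees."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def collocate_pm25(vis_station, pm_stations, max_km=25.0):
    """Average hourly PM2.5 over all PM stations within max_km of a visibility station.

    vis_station: dict with 'lat' and 'lon' (hypothetical layout).
    pm_stations: list of dicts with 'lat', 'lon', and 'pm25', a list of
                 hourly values of equal length (None marks a missing hour).
    Returns a list of hourly means (None where no nearby value exists),
    or None if no PM station lies within max_km.
    """
    near = [
        s for s in pm_stations
        if haversine_km(vis_station["lat"], vis_station["lon"], s["lat"], s["lon"]) <= max_km
    ]
    if not near:
        return None
    means = []
    for hour in range(len(near[0]["pm25"])):
        vals = [s["pm25"][hour] for s in near if s["pm25"][hour] is not None]
        means.append(sum(vals) / len(vals) if vals else None)
    return means
```

A smaller `max_km` reduces the effect of spatial variation but leaves fewer collocated stations, which is the trade-off described above.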
Spatial Spearman correlation coefficients (ρ) of PM2.5 concentration and of ambient bext between stations are calculated separately, and the variation of ρ with the distance between stations (R) is studied. A few abnormally low ρ values are found, especially for bext data in some provinces, which implies potentially larger random errors in visibility, and hence bext, measurements than in PM2.5 measurements.
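For illustration, the Spearman coefficient between the hourly series of two stations can be computed as the Pearson correlation of their ranks. The sketch below uses only the Python standard library; the function names, the use of None for missing hours, and the minimum of three paired values are assumptions made for this example.

```python
def average_ranks(values):
    """Ranks starting at 1; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman correlation over hours where both series are present."""
    pairs = [(a, b) for a, b in zip(x, y) if a is not None and b is not None]
    if len(pairs) < 3:
        return None
    rx = average_ranks([p[0] for p in pairs])
    ry = average_ranks([p[1] for p in pairs])
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        return None  # a constant series has no defined rank correlation
    return cov / (var_x * var_y) ** 0.5
```

Applying `spearman_rho` to every pair of stations and pairing each result with the great-circle distance between the two stations gives the (R, ρ) points whose dependence on distance is studied.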
Linear regression between hourly PM2.5 and bext for RH < 40% is performed to check their consistency; the slope represents the specific mass extinction efficiency (αext). αext can also be calculated directly by dividing bext by PM2.5. At some stations, αext shows a few abrupt changes, which are detected by a simple but effective approach developed by Killick et al. [37]. This implies a potentially large influence of the replacement or recalibration of visibility sensors. The visibility or bext data of the 502 collocated stations are checked and corrected by comparison to the reference PM2.5 measurements, and the visibility at these stations is thereafter set as the benchmark. For the remaining 1893 visibility stations, detection and correction rely on comparison with visibility measurements at nearby stations. When there is a benchmark station nearby (<30 km), the visibility measurement of the benchmark station is set as the baseline. Otherwise, the regional average visibility of all meteorological stations within 100 km is set as the reference series. After all corrections are completed, a dataset consisting of three years' worth of visibility and bext data is finally produced.

Results
Figure 2 shows the scatterplot of ρ versus R. Spearman ρ values between two stations are calculated for PM2.5 and bext because neither quantity is normally distributed. Not surprisingly, ρ decreases as R increases. The decay of ρ with R may be represented by an exponential equation [38]:
$$\rho = \rho_0 \exp\left[-\left(R/R_0\right)^{\gamma}\right]$$
where the coefficient R0 indicates the distance at which ρ decreases by a factor of e, representing the horizontal scale of the correlation, and ρ0, the zero intercept, represents ρ at a station distance of zero. The appropriate value for γ, the parameter determining how ρ decays with R, is not obvious. A range of possible values is considered, which leads to substantially large variations in R0 but not in ρ0. As suggested by Liu et al. [39], ρ0 provides information on the uncertainty of the measurements, σ²(e), relative to the variance of the measurements, σ²(M). The relative uncertainty is 15% for PM2.5 and 20% for bext, which are close to the expected measurement uncertainties.
An interesting feature of Figure 2 is that some relatively low ρ values (<0.4) are observed for R < 50 km, not only for PM2.5 but also for bext. A further check shows that these low ρ values occur in regions characterized by the lowest (background) levels of PM2.5 and bext, for example, in the Tibetan and Yungui Plateaus. This is clearly shown in Figure 3, which presents the mean ρ values of PM2.5 and bext between stations with R < 50 km. Relatively small ρ values are also found in Northeast China, where PM2.5 and bext exceed those in the Tibetan and Yungui Plateaus.
The ρ values between stations for visibility, and therefore bext, are smaller than those for PM2.5, as is evident from Figure 2 and the fitted equations. The most likely explanation is that the maximum visibility is generally set to 30 km (about 100 Mm−1 for bext). Given that αext is about 5 m2 g−1 in eastern [33] and southwest China [34], a bext of 100 Mm−1 corresponds to roughly 20 μg m−3 of PM2.5, so the visibility threshold prevents visibility measurements from resolving PM2.5 variations from a few to 20-30 μg m−3. In other words, visibility data cannot reflect subtle variations in the background levels of PM2.5 and bext as a result of the predefined maximum threshold. Therefore, it is recommended that raw bext data and visibility data be provided, which would be critical for the application of visibility data to air quality studies or, more specifically, to the estimation of PM2.5 from visibility data.
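The fit of the decay of ρ with R can be sketched as follows, assuming that the decay takes the stretched-exponential form ρ = ρ0 exp[−(R/R0)^γ] with γ held fixed. Taking logarithms makes the problem linear in R^γ, so an ordinary least-squares line gives ρ0 and R0. The log-linear fit and the function name are choices made for this illustration; the fitting procedure actually used is not stated in the text.

```python
import math


def fit_decay(distances_km, rhos, gamma=1.0):
    """Fit ln(rho) = ln(rho0) - (R / R0)**gamma by least squares for a fixed gamma.

    Only pairs with rho > 0 are used, because the logarithm is undefined
    otherwise. Returns (rho0, R0_km).
    """
    pts = [(r ** gamma, math.log(c)) for r, c in zip(distances_km, rhos) if c > 0]
    n = len(pts)
    mean_x = sum(x for x, _ in pts) / n
    mean_y = sum(y for _, y in pts) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in pts)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in pts)
    slope = sxy / sxx
    if slope >= 0:
        raise ValueError("rho does not decay with distance; R0 is undefined")
    intercept = mean_y - slope * mean_x
    rho0 = math.exp(intercept)             # correlation extrapolated to R = 0
    r0 = (-1.0 / slope) ** (1.0 / gamma)   # e-folding distance
    return rho0, r0
```

Repeating the fit for several values of `gamma` shows how sensitive R0 is to that choice, whereas ρ0 is determined mainly by the points at small R.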
Figure 4 presents the seasonal spatial distribution of ρ between PM2.5 and bext under conditions with RH < 40%, under which the temporal variation of PM2.5 is expected to resemble that of bext because hygroscopic growth is marginal. Because summer (rainy season) match points between PM2.5 and bext are very limited in eastern China if an RH threshold of 40% is used, an RH threshold of 60% is used instead to produce sufficient match points for a robust statistical analysis. In order to minimize the potential effect of hygroscopic growth, the ambient light scattering enhancement (fRH) of PM2.5 is accounted for by using an empirical equation.
In this equation, κ is set to 0.096 according to reference [29]. Consistent with our expectation, a significant correlation between PM2.5 and bext is found at most stations. The percentages of stations with significant correlation are 91%, 89%, 88%, and 93% from spring to winter. Relatively larger ρ values are evident in eastern China (Northeast China excluded), for example, in the Beijing-Tianjin-Hebei (BTH) region, the Yangtze Delta Region (YDR), and the Pearl Delta Region (PDR). This is partly because of the wider range of PM2.5 and bext variation, which leads to larger differences in the ranks of individual elements and thereby a larger ρ. In regions with low aerosol loading, for example, the Tibetan Autonomous Region, Qinghai, and Yunnan Province, relatively smaller ρ values are observed. This is likely because the measurement uncertainties prevent the very subtle variations in both quantities from being detected. Poor correlations are also evident at most stations in northeastern China, which cannot be explained by the same cause since PM2.5 and bext in this region generally exceed those in the Tibetan Plateau. Visibility sensors are much more accurate for smaller visibility (<10-15 km) than for larger visibility; this should be kept in mind when visibility data are used to estimate PM2.5, especially in regions dominated by PM2.5 concentrations of tens of μg m−3.
Seasonally, relatively larger ρ values occur in winter (the national median ρ is 0.67) and smaller ρ values are observed in summer (the national median ρ is 0.54). This is likely because the variability in PM2.5 and bext in summer is smaller than that in winter. Furthermore, the temporal variation of aerosol chemical components, size distribution, and, therefore, the aerosol hygroscopic growth, may partly contribute to the smaller ρ values in summer, although the hygroscopic growth is considered in the analysis.
A temporal variation in the relationship between PM2.5 and bext is evident at some stations on closer inspection of the scatter plot of PM2.5 and bext at each station. Figure 5 presents an example at Yangzhou, Jiangsu Province, which shows a dramatically different relationship between PM2.5 and bext in 2016 relative to that in 2017 and 2018. This results in a poor correlation between the two quantities. The substantial change in the PM2.5-bext relationship may not be due to temporal changes in aerosol composition, which mainly affect the slope of the regression (i.e., αext). Instead, a dramatic change in the intercept (from ~−47 to ~265 Mm−1 in winter) is observed, which implies some unusual change in the observations of one quantity but not the other.
It is interesting to note that PM2.5 observations at three adjacent stations (within 5 km) are close to each other and show a similar pattern over the three years of the study. Conversely, bext in 2016 is strikingly different from that in 2017 and 2018 (Figure 6). The relatively larger bext cannot be supported by contemporaneous PM2.5 observations, which show a consistent and stable variation at these three stations over the three years. The abnormally high bext in 2016 is very likely owing to the calibration or replacement of the visibility sensor, although this needs to be checked against the metadata.
Abrupt changes in visibility, and thereby bext, are also evident from another perspective. Figure 7 presents the time series of the ratio of station bext to the regional mean bext in Tianjin, a municipality near Beijing. A sudden drop in bext occurred at stations 04, 07, 10, and 12 at the beginning of 2018, which is not reflected in the PM2.5 measurements (Figure 8). Since the stations are located in a very small region, this abrupt and inconsistent change in bext is very likely associated with the recalibration or replacement of the sensors.
Given that PM2.5 is stable but visibility, and hence bext, drifts with time, the time series of αext should show abrupt change points. Therefore, a simple but effective method developed by Killick et al. [37] is used to detect those change points. This method can detect abrupt changes in the mean, variance, and trend of a time series; only change points in the mean of αext are detected here. Figure 9 presents an example of this analysis. The value of αext, i.e., the ratio of bext to PM2.5 under RH < 40%, shows two change points (Figure 9d), which are also clearly visible in the scatter plot of PM2.5 and bext (Figure 9a). Since the αext values during the first two periods are abnormally higher than expected, indicating abnormal measurements of bext (Figure 9b), the bext measurements are corrected by taking the measurements during the third period as the benchmark. Linear regression between bext and PM2.5 is performed for each of the three periods, leading to three linear equations. The questionable bext values during the first two periods are then corrected by using the following equations.
where i is the first or second period, and Ai and Bi represent the intercept and slope of the linear equation for that period. The intercept and slope of the linear equation for the third period, the reference period, are represented by Ar and Br, respectively. The corrected result is shown in Figure 9b, with a much better correlation between bext and PM2.5 (0.87) than before correction (0.57) (Figure 9a). As shown in Figure 10, the time series of the ratio of corrected bext at individual stations to the regional mean in Tianjin are much more homogeneous than the series in Figure 7.
The method above is also suitable for the correction of visibility. Based on this method, the visibility or bext data of the 502 collocated stations are checked and corrected by comparison to the reference PM2.5 measurements, and the visibility at these stations is therefore set as the benchmark. The corrections for the remaining 1893 visibility stations rely on nearby visibility measurements. If there is a benchmark station nearby (<30 km), the visibility adjustment at the target station refers to the benchmark station. Otherwise, the regional average visibility of all meteorological stations within 100 km in diameter is set as the reference series. After all corrections are completed, a dataset consisting of three years' worth of visibility and bext data is finally produced.
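The detection and correction steps can be illustrated with the following sketch, which rests on two loudly stated assumptions. First, `best_mean_change_point` is a brute-force search for a single change in the mean; it is a simplified stand-in for, not an implementation of, the method of Killick et al. [37], which handles multiple change points. Second, the mapping in `correct_bext` is an assumed form that is consistent with the definitions of Ai, Bi, Ar, and Br given above: a period-i value is projected back to the PM2.5 level implied by the period-i regression line and then mapped onto the reference line. It is not claimed to be the exact equation used in the correction.

```python
def best_mean_change_point(series, min_size=2):
    """Index k that best splits series into series[:k] and series[k:].

    The split minimizes the total within-segment sum of squared deviations
    from the segment means. Single change point only; quadratic cost.
    """
    n = len(series)
    best_k, best_cost = None, float("inf")
    for k in range(min_size, n - min_size + 1):
        left, right = series[:k], series[k:]
        mean_l = sum(left) / len(left)
        mean_r = sum(right) / len(right)
        cost = sum((v - mean_l) ** 2 for v in left) + sum((v - mean_r) ** 2 for v in right)
        if cost < best_cost:
            best_k, best_cost = k, cost
    return best_k


def linear_fit(x, y):
    """Ordinary least squares for y = A + B * x; returns (A, B)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    slope = (sum((a - mean_x) * (c - mean_y) for a, c in zip(x, y))
             / sum((a - mean_x) ** 2 for a in x))
    return mean_y - slope * mean_x, slope


def correct_bext(bext, a_i, b_i, a_r, b_r):
    """Map a period-i bext value onto the reference-period regression line.

    Assumed form: (bext - a_i) / b_i is the PM2.5 level implied by the
    period-i line; a_r + b_r * PM2.5 is the reference-period bext.
    """
    return a_r + b_r * (bext - a_i) / b_i
```

In use, the αext series of a station would be split at the detected change points, `linear_fit` would be applied to bext against PM2.5 within each period, and `correct_bext` would then be applied to the values of the questionable periods.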

Discussion and Conclusions
An analysis of the similarities and differences between visibility and PM2.5 measurements has implications for the potential application of visibility observations to the study of air quality. Based on quality-controlled PM2.5 and visibility data from 2016 to 2018, the nonparametric Spearman ρ value between two stations for PM2.5 and bext decreases as the distance between the stations, R, increases. The decay of ρ with R can be represented by an exponential equation. The relative uncertainty is 15% for PM2.5 measurements and 20% for bext. Some relatively low ρ values (<0.4) observed for R < 50 km occur in regions characterized by the lowest (background) levels of PM2.5 and bext, for example, the Tibetan and Yungui Plateaus. The smaller ρ between stations for bext than for PM2.5 is probably caused by the predefined maximum threshold of visibility measurements (generally 30 km) and may be an obstacle to the application of visibility measurements in the study of air quality.
A significant correlation between PM2.5 and bext is found at most stations. The percentages of stations with significant correlation are 91%, 89%, 88%, and 93% from spring to winter. Relatively larger ρ values are evident in eastern China (Northeast China excluded) and in winter (the national median ρ is 0.67). A temporal variation in the relationship between PM2.5 and bext is evident at some stations, mainly caused by abrupt changes in bext that are detected by an efficient approach developed by Killick et al. [37]. This implies a potentially large influence of the replacement or recalibration of visibility sensors. Therefore, the bext data are corrected by comparison to reference measurements at adjacent stations. A dataset of three years' worth of visibility and bext data, which provides a complementary network for the study of air quality, is finally produced.
To the best of our knowledge, this is the first thorough check of the quality of automatic visibility measurements in China. Human observations of visibility in China have been used by researchers to retrieve PM2.5 concentration, which shows the great potential of visibility measurements in the study of air quality. A major development in visibility measurements in China has been the use of visibility sensors to replace human observation. Instrument-based visibility measurements are objective rather than subjective and provide high temporal resolution. However, instrumental measurements suffer from a number of issues that require a thorough evaluation of the obtained visibility data. Data quality is of high priority for the further application of these valuable data in air quality studies. In this study, we carefully evaluate the visibility measurements by collocating them with PM2.5 measurements. It is clearly shown that visibility measurements are associated with notable uncertainty. We develop a simple but effective method to correct the visibility measurements. In particular, the systematic differences discussed above have been corrected at some stations. The quality of the visibility data is greatly improved, which paves the way for the use of visibility data in air quality studies.