A Cluster Analysis of Constant Ambient Air Monitoring Data from the Kanto Region of Japan

This study demonstrates an application of cluster analysis to constant ambient air monitoring data of four pollutants in the Kanto region: NOx, photochemical oxidant (Ox), suspended particulate matter, and non-methane hydrocarbons. Constant ambient air monitoring can provide important information about the surrounding atmospheric pollution. However, at the same time, ambient air monitoring can place a significant financial burden on some autonomous communities. Thus, it has been necessary to reduce both the number of monitoring stations and the number of chemicals monitored. To achieve this, it is necessary to identify those monitoring stations and pollutants that are least significant, while minimizing the loss of data quality and mitigating the effects on the determination of any spatial and temporal trends of the pollutants. Through employing cluster analysis, it was established that the ambient monitoring stations in the Kanto region could be clustered topologically for NOx and Ox into eight groups. From the results of this analysis, it was possible to identify the similarities in site characteristics and pollutant behaviors.


Introduction
Constant ambient air monitoring can provide important information about surrounding atmospheric pollution. In Japan, constant ambient air monitoring of six priority substances (SO2, carbon monoxide (CO), NOx, photochemical oxidant (Ox), suspended particulate matter (SPM), and non-methane hydrocarbons (NMHC)) is conducted by local prefectural governments under the Air Pollution Control Act. These constant ambient air monitoring data are very useful for analyzing the current situation and trends of pollution within an area, and many studies in Japan have used these data [1][2][3]. However, at the same time, ambient air monitoring places a significant financial burden on the autonomous communities. Thus, it is necessary to identify less significant monitoring stations and pollutants, while minimizing the loss of data quality and mitigating the effects on the determination of any spatial and temporal trends of the pollutants. There have been some trials to re-examine the efficacy of constant ambient air monitoring stations, for example, in the cities of Shizuoka and Funabashi and in Hiroshima Prefecture in Japan. However, no reliable guidelines currently exist regarding the optimal method by which this could be achieved.
In this study, we applied cluster analysis to constant ambient air monitoring data from the Kanto region of Japan, based on the expectation that similarities in site characteristics and pollutant behaviors could be identified, and that monitoring stations could be grouped topologically.
Cluster analysis [4] and principal components analysis have been commonly used in air pollution studies. In particular, intensive cluster analyses for ozone (O3) and particulate matter (PM) pollution have been conducted. Lavecchia et al. [5] applied a cluster analysis to Italian ozone monitoring network data and discussed the similarities between the data of each monitoring station. Gramsch et al. [6] applied the same approach to O3 and PM10 concentrations and demonstrated that these two pollutants had similar cluster patterns, suggesting that these pollutants' concentrations were controlled by meteorological and topographical factors. Lu et al. [7] applied three different cluster analyses to PM10 pollution in Taiwan. Giri et al. [8] applied a hierarchical cluster analysis to seasonal PM10 data in Kathmandu, Nepal, and demonstrated that monsoon rainfall had only a limited effect on decreasing PM10 concentrations. Cluster analysis has also been applied to other pollutants, such as nitrogen oxides (NOx) [9,10], carbon dioxide (CO2) [11], sulfur dioxide (SO2) [9], pollen [12], and Pb [13].
We applied cluster analysis to constant ambient air monitoring data obtained in 1996 and 2006 in the Kanto region, which includes the capital region of Japan (Figure 1). By employing cluster analysis, ambient monitoring stations could be clustered topologically for NOx and Ox. Based on the results of the analysis, suggestions for reducing both the number of monitoring stations and the number of chemicals monitored are possible.

Air Monitoring Data
This study focused on the Kanto region in Japan, including the seven prefectures of Tokyo, Gunma, Tochigi, Ibaraki, Chiba, Saitama, and Kanagawa. The Kanto region has approximately 500 ambient air monitoring stations, which account for one quarter of all the monitoring stations in Japan. The constant ambient air monitoring data obtained by 476 monitoring stations during the fiscal years of 1996 and 2006 were used in this study. The monitoring data were kindly provided by the National Institute for Environmental Studies of Japan, and the data for the Kanagawa Prefecture were obtained from the prefecture's website. The priority pollutants in this study were NOx, Ox, NMHC, and SPM. SO2 and CO were excluded because their concentrations in the area of interest were very low and did not show large spatial differences. There are two types of ambient air monitoring stations in Japan: general environmental air monitoring stations and vehicle emission monitoring stations [14]. The latter are arranged near large roadways to monitor air pollution caused by vehicle emissions. Monitoring data from both types of station were used for the cluster analysis.

Cluster Analysis
SPSS® Statistics 17.0 (SPSS Inc., Chicago, IL, U.S.) was used for the cluster analysis. A data matrix was prepared for each pollutant using data from the air monitoring stations. In each matrix, the element in the ith row and jth column is the ith measurement in the year from the jth air monitoring station. Missing data in the jth column were replaced with the annual average value of the jth monitoring station. The percentages of missing data ranged from 1.42% to 7.84%. The number of clusters was fixed at eight by considering both the total number of air monitoring stations and the number of prefectures in the Kanto region. The squared Euclidean distance and the Ward method were adopted for the cluster analysis.
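The procedure described above can be illustrated with a minimal sketch. It is not the SPSS routine used in the study: the function names `fill_missing` and `ward_cluster_stations` are hypothetical, missing values are assumed to be encoded as `None`, and Ward's method is written out in plain Python (merging, at each step, the two clusters whose union gives the smallest increase in within-cluster squared Euclidean distance) so that each step is visible.

```python
def fill_missing(matrix):
    """Replace missing values (None) in each column with that column's mean.

    Rows are measurements in the year, columns are monitoring stations.
    """
    n_rows, n_cols = len(matrix), len(matrix[0])
    filled = [row[:] for row in matrix]
    for j in range(n_cols):
        observed = [matrix[i][j] for i in range(n_rows) if matrix[i][j] is not None]
        station_mean = sum(observed) / len(observed)
        for i in range(n_rows):
            if filled[i][j] is None:
                filled[i][j] = station_mean
    return filled


def ward_cluster_stations(matrix, n_clusters=8):
    """Group stations (columns) into n_clusters groups with Ward's method.

    Returns one label per station, numbered from 1 (numbering is arbitrary).
    """
    filled = fill_missing(matrix)
    n_cols = len(filled[0])
    # Each cluster is (member station indices, centroid time series).
    clusters = [([j], [row[j] for row in filled]) for j in range(n_cols)]

    while len(clusters) > n_clusters:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                members_a, centroid_a = clusters[a]
                members_b, centroid_b = clusters[b]
                n_a, n_b = len(members_a), len(members_b)
                sq_dist = sum((x - y) ** 2 for x, y in zip(centroid_a, centroid_b))
                # Ward criterion: increase in within-cluster sum of squares.
                cost = n_a * n_b / (n_a + n_b) * sq_dist
                if best is None or cost < best[0]:
                    best = (cost, a, b)
        _, a, b = best
        members_a, centroid_a = clusters[a]
        members_b, centroid_b = clusters[b]
        n_a, n_b = len(members_a), len(members_b)
        merged = (
            members_a + members_b,
            [(n_a * x + n_b * y) / (n_a + n_b) for x, y in zip(centroid_a, centroid_b)],
        )
        clusters = [c for i, c in enumerate(clusters) if i not in (a, b)] + [merged]

    labels = [0] * n_cols
    for label, (members, _) in enumerate(clusters, start=1):
        for j in members:
            labels[j] = label
    return labels
```

The pairwise search is cubic in the number of stations, which is acceptable for illustration; for a network of several hundred stations a library implementation of Ward linkage would be the more practical choice.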

Concentration Contour Maps
Concentration contour maps were used to clarify the characteristics of each cluster. A concentration contour map is a contour of pollutant concentrations in which the monitoring month is presented as the abscissa and the monitoring time as the ordinate. Tarasova et al. [15] and Zvyagintsev et al. [16] have used this method for air pollution analysis. Using these maps, the seasonal and diurnal variations of pollutant concentrations can be examined visually.
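As a rough illustration of the data behind such a map, the sketch below averages hourly measurements into a grid of hour of day (rows, the ordinate) by month (columns, the abscissa). The function name `contour_grid` and the `(month, hour, value)` record layout are assumptions made for this example, not part of the original analysis; drawing the contours from the grid is left to any plotting tool.

```python
def contour_grid(records):
    """Average concentrations into a 24 x 12 grid: grid[hour][month - 1].

    records: iterable of (month, hour, value) with month in 1..12,
    hour in 0..23, and value a concentration or None if missing.
    Cells without any valid measurement are returned as None.
    """
    sums = [[0.0] * 12 for _ in range(24)]
    counts = [[0] * 12 for _ in range(24)]
    for month, hour, value in records:
        if value is None:
            continue  # skip missing measurements
        sums[hour][month - 1] += value
        counts[hour][month - 1] += 1
    return [
        [sums[h][m] / counts[h][m] if counts[h][m] else None for m in range(12)]
        for h in range(24)
    ]
```

Reading along a row of the grid shows the seasonal variation at a fixed hour, and reading down a column shows the diurnal variation within a month.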

General Environmental Air Monitoring Stations
For NOx, the concentration data from the general environmental air monitoring stations and vehicle emission monitoring stations were treated separately. The general environmental air monitoring stations in the Kanto region were clustered into eight groups for NOx monitoring in both fiscal years of 1996 and 2006 (Figure 2). In the legend of Figure 2, the annual average NOx concentrations for each cluster are shown. The annual average NOx concentration in the Kanto region in 1996 was 39.4 ppb, which decreased to 26.7 ppb in 2006. The annual average values were calculated by averaging all measured NOx values in the year. This concentration decrease was most likely because of the Automobile NOx PM Control Law of Japan established in 2006. By application of cluster analysis to the NOx data from the general environmental air monitoring stations, these monitoring stations were clustered territorially. Furthermore, the territorial grouping of the monitoring stations was retained over the 10-year interval. For example, cluster 7 in 1996, which shows the highest averaged NOx concentration (64.5 ppb), also appears as cluster 7 in 2006 in the same region, also with the highest NOx concentration (42.0 ppb). Clusters 4 and 5 in 1996 appear as clusters 5 and 6 in 2006 in the same region. These results show the possibility for the reasonable elimination of NOx monitoring in the general environmental air monitoring stations in such clusters, because it can be said that the air pollution by NOx in these regions is similar and that the pollution situation has not changed during the 10-year interval. Clusters with the highest average NOx concentrations, such as clusters 6 and 7 in 1996, were located in the Tokyo, Kanagawa, and Chiba prefectures, where traffic volumes were highest. Figure 3 presents NOx concentration contour maps for these two clusters.
While the absolute concentrations are different for each cluster, all eight clusters show similar seasonal and diurnal concentration variations with two diurnal peaks at around 08:00 and 20:00, and a seasonal peak during the winter. The NOx concentration contour maps in 1996 show a similar trend to those in 2006 (data not shown). The diurnal variation in NOx concentrations can be explained partially by variations in traffic volume. However, the traffic volume on national roads in the Kanto region begins to increase during the early morning at around 04:00 and remains almost constant from 07:00 to 18:00 [17]. Therefore, variations in traffic volume cannot fully explain the decreasing NOx concentrations during the daytime. Other factors, such as NOx elimination by convection or photochemical reactions during the daytime, should also be considered in explaining the diurnal variations in NOx concentrations. The seasonal variation in NOx concentrations can be explained by the development of ground-based inversion layers during the winter, which effectively trap NOx near the ground surface. Furthermore, ultraviolet light is weaker during the winter, which increases the lifetime of NOx because of the low concentrations of hydroxyl radicals.

Ox
Figure 5 presents the clustering results for air monitoring stations using Ox concentrations. The Ministry of the Environment of Japan defines Ox as oxidizing substances, such as ozone and peroxyacetyl nitrate, that are produced by photochemical reactions (limited to those that can liberate iodine from a neutral potassium iodide solution, excluding NO2). Similar to the case for NOx measured by the general environmental air monitoring stations, air monitoring stations in the Kanto region were clustered well territorially for Ox. Despite some elimination and consolidation, the territorial grouping was preserved after 10 years. These results show the possibility for the reasonable elimination of Ox monitoring in the air monitoring stations.
Variations in the annual average Ox concentration in each area were small, and it is speculated that the clusters were grouped by concentration variations or other factors (e.g., time variation trends), rather than by absolute concentrations. The annual average Ox concentrations in the Kanto region in 1996 and 2006 were 23.6 and 25.1 ppb, respectively. In cluster 3, in 2006, Ox concentrations higher than the environmental standard (60 ppb) were observed more frequently, despite the cluster having an annual average concentration that could be considered normal. Cluster 3 was located in Saitama Prefecture, but also encompassed parts of the Tochigi and Ibaraki prefectures.
Figure 9 shows the cluster analysis results of air monitoring stations using SPM concentrations. In the central urban area, some overlaps of clusters were observed. Overall, the air monitoring stations in the Kanto region were clustered into eight territorial groups for both years. Some differences between the clustering results for the two years were observed. Clusters 2 and 3 in 1996, in the north section of Chiba, merged into one cluster (cluster 2) in 2006. Cluster 5 in 1996 split into three groups: clusters 3, 4, and 6. These results indicate that the air pollution situation changed in the Kanto region during the decade. In addition, it can be said that it is difficult to eliminate monitoring stations for SPM using the cluster method. The annual average SPM concentration in the Kanto region in 1996 was 45.5 µg/m3, which decreased to 29.3 µg/m3 in 2006. This large decrease in concentration was mainly caused by the implementation of regulation according to the Automobile NOx PM Control Law of Japan. The banning of small incinerators to prevent dioxin emissions also contributed to the decrease in SPM concentrations. The annual average SPM concentrations were highest in clusters 2, 5, and 6. The percentages of vehicle emission monitoring stations in these clusters were high.
Similar to NOx, if the monitoring data from the vehicle emission monitoring stations were treated separately, the SPM pollution characteristics in the Kanto region would be clearer. Figure 10 presents the SPM concentration contour maps for clusters 4 and 5 in 2006. In 2006, SPM pollution during the summer was prominent in all eight clusters. The SPM concentration contour map for clusters 4 and 5 showed the typical concentration variation trend observed in the Kanto region: a concentration peak in the morning from June to July (the rainy season) and a concentration peak in the evening in December. High concentrations of SPM during the summer can be explained by accelerated photochemical reactions. Kaneyasu et al. [18] reported that high SPM concentrations during the rainy season in Japan could also be caused by meteorological factors. In 2006, cluster 4, located in the northern area of Saitama to the center of Gunma, showed specific SPM concentration variations, i.e., concentrations increased in the evening during the summer. In this area, southeasterly winds are dominant during the summer, suggesting that winds transported NOx and NMHC that were emitted in the urban area during the day. Photochemical reactions occurring during their transport would contribute to the elevated SPM concentrations in cluster 4 in the evening.

Conclusions
Ambient air monitoring stations in the Kanto region were clustered into eight groups using constant air monitoring concentrations of NOx, Ox, NMHC, and SPM. The air pollution characteristics of the clusters were analyzed using concentration contour maps. It was confirmed that the ambient monitoring stations could be clustered topologically for NOx and Ox using cluster analysis. If the ambient air monitoring stations can be reasonably grouped, then a method for reducing the number of both monitoring stations and chemicals monitored should be possible. Such a method should be simple, versatile, and mechanical; thus, we suggest that the number of monitoring stations in the Kanto region could be reduced by adopting the following three simple criteria: (1) retain the monitoring station (or chemical) if similarities exist between its monitored data and the averaged monitored data of the cluster to which it belongs; (2) retain the monitoring station (or chemical) if the monitored data show higher concentrations; and (3) retain the monitoring station (or chemical) if the monitored concentration levels exhibit an increasing trend. For the first criterion, Euclidean distances between each element of monitored data and the average of the monitored data in the topological group matrix were calculated, and only the top 5%-15% of monitoring stations with the smallest Euclidean distances were retained. For the second criterion, the top 5%-15% of monitoring stations with the highest annual averaged concentrations in 1996 or 2006 were retained. For the third criterion, the top 5%-15% of monitoring stations with the highest ratio of annual averaged concentrations in 2006 to 1996 were retained. The retention ratio for each criterion was varied within the range of 5%-15%. When the retention ratio was set at 10%, over 30% of monitoring stations could be removed by adopting the above criteria. In our next paper, we will describe this suggested method in greater detail.
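The three criteria can be sketched as follows. This is only one possible reading of the description above, and several details are assumptions made for this example rather than the authors' stated procedure: stations are ranked over the whole region rather than within each cluster, the number retained per criterion is the rounded product of the retention ratio and the station count (at least one), "highest in 1996 or 2006" is taken as the larger of the two annual averages, and a station is retained if it satisfies any one criterion. The names `top_fraction` and `stations_to_retain` are hypothetical.

```python
import math


def top_fraction(scores, ratio, largest=True):
    """Return the stations holding the top `ratio` share of scores."""
    n_keep = max(1, round(ratio * len(scores)))
    ranked = sorted(scores, key=scores.get, reverse=largest)
    return set(ranked[:n_keep])


def stations_to_retain(series, labels, mean_1996, mean_2006, ratio=0.10):
    """Apply the three retention criteria and return the retained stations.

    series:    station -> list of measurements (equal length, no gaps)
    labels:    station -> cluster label
    mean_1996: station -> annual average concentration in 1996
    mean_2006: station -> annual average concentration in 2006
    """
    # Criterion 1: Euclidean distance from each station to its cluster average.
    distance = {}
    for cluster in set(labels.values()):
        members = [s for s in series if labels[s] == cluster]
        n_obs = len(series[members[0]])
        centroid = [
            sum(series[s][i] for s in members) / len(members) for i in range(n_obs)
        ]
        for s in members:
            distance[s] = math.sqrt(
                sum((x - c) ** 2 for x, c in zip(series[s], centroid))
            )
    representative = top_fraction(distance, ratio, largest=False)

    # Criterion 2: highest annual average in either year (assumed: the larger).
    peak = {s: max(mean_1996[s], mean_2006[s]) for s in series}
    high = top_fraction(peak, ratio)

    # Criterion 3: highest ratio of the 2006 to the 1996 annual average.
    growth = {s: mean_2006[s] / mean_1996[s] for s in series}
    rising = top_fraction(growth, ratio)

    return representative | high | rising
```

Because the three sets may overlap, the number of retained stations can be smaller than three times the per-criterion count.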