An Interactive Web Mapping Visualization of Urban Air Quality Monitoring Data of China

In recent years, major cities in China have suffered from hazy weather, which has drawn great attention from the public, government managers and researchers in different areas. Many studies have been conducted on urban air quality to reveal different aspects of the air quality problem in China. This paper focuses on the visualization of big air quality monitoring data for all major cities on a nationwide scale. To visualize this dataset intuitively, this study develops two novel visualization tools: a multi-granularity time series visualization tool (timezoom.js) and a dynamic symbol declutter map mashup layer for thematic mapping (symadaptive.js). With these two tools, we develop an interactive web map visualization application of urban air quality data for all major cities in China. This application reveals significant air pollution patterns at the nationwide scale. These results give us clues for further studies on air pollutant characteristics, forecasting and control in China. As the tools are designed for general visualization of geo-referenced time series data, they can be applied to other environmental monitoring data (temperature, precipitation, etc.) with some configuration.


Introduction
In recent years, frequent occurrences of hazy weather in big cities in China have drawn great attention to urban air quality issues among the public, government managers and academic researchers. Discussions about the air quality of big cities (such as Beijing, Wuhan, etc.) frequently appear in social media. Since 2012, the government has adopted two new air quality monitoring technical standards [1,2] and has built a real-time air quality reporting platform and an air quality forecasting system. Because air pollution has detrimental effects on human health, vegetation, crops, etc., it has great political, societal and economic impacts [3][4][5]. Therefore, with the increasing availability of urban air quality data and the great environmental challenges we are facing, many studies have been conducted to explore new approaches for understanding this big environmental monitoring dataset.
Environmental monitoring data can be described as multivariate time series observations generated by geo-located monitoring stations. For our research topic, urban air quality monitoring data consist of many air pollutant concentration values (such as fine particles, carbon monoxide, sulfur dioxide, nitrogen oxides, ozone, etc.), which are reported hourly from monitoring stations fixed at specific positions in a city. These geo-referenced time series data are an important study subject in the areas of geovisualization and environmental science.
Multi-dimensional data visualization that considers spatial distributions, temporal granularities and multivariate thematic attributes has been an interesting question in geovisualization, as well as in environmental science. In geovisualization, many studies have focused on systematic theories of multivariate spatio-temporal data visualization [6]. Multiple static map series, animated maps, space-time cubes and self-organizing maps (SOM) are used for spatio-temporal data mapping [7,8]. In parallel, basic charts, such as parallel coordinate plots (PCP), are used to show multivariate thematic attributes. To achieve effective visualization of multivariate spatio-temporal data, combinations of these methods are used, such as the visualization system for space-time and multivariate patterns (VIS-STAMP) [8], which proposed a systematic way of joining those visualization strategies interactively in multiple visual views. However, these visualization systems require expertise and are therefore limited to domain experts. Easy-to-use visualization applications are thus needed to reach a wider range of users.
Visualization of environmental data is important in the area of environmental science [9]. Conventional Geographic Information System (GIS) technologies and statistical map making skills are widely used to manage, analyze and visualize environmental data [10]. With the development of web technologies, open source GIS standards and web mapping tools have been adopted in the analysis and presentation of environmental data, not only for experts and managers, but also for the general public [11][12][13]. These studies mainly focus on the architecture of environmental data visualization systems, and their visualization techniques are limited to conventional methods. Therefore, more effective visualization tools need to be developed for the increasingly accumulated multi-dimensional environmental monitoring data. Currently, as urban air quality is an urgent societal problem for many policy-makers and a study topic for scholars, air quality data are gaining more attention.
Many studies have been conducted on mapping urban air pollution, mainly in two directions. First, because the distribution of monitoring stations is rarely homogeneous, spatial interpolation is used for pollution exposure maps [14], concentration maps of air pollution [15] and mapping spatio-temporal trends of air pollution [16]. Second, conventional charts and modern visualization tools are also being used for air quality data exploration [17] and visual analytics [18,19]. Our research follows the second direction and aims at providing a web mapping application for visualizing a whole year of air quality monitoring data of all cities in China with one dynamic and interactive map. This visualization application gives users the freedom to configure the view and explore this multi-dimensional dataset from different angles. The major contributions of this paper are in three aspects: (1) we develop two novel web client tools for interactive visualization of spatio-temporal data; (2) we propose a mashup strategy for web mapping by combining and extending the functions of different visualization tools, which can be generalized to visualize other kinds of spatio-temporal data, such as temperature, precipitation, etc.; (3) we implement an online interactive urban air quality data visualization application that helps to explore and analyze a big air quality dataset and that puts forward clues for further studies on air pollution.

Method and Data
Mapping spatio-temporal datasets means mapping changes of geographical features over time and space [7]. Because the information load is huge, the interaction and dynamics [20] of spatio-temporal visualization should be well designed. Multiple levels of spatial and temporal detail are important considerations in spatio-temporal visualization. Most of the methods proposed in the literature for visualizing spatio-temporal data from a multi-scale perspective have focused on either the spatial or the temporal aspect, rather than integrating both views over multiple scales. Qiang [21] and Van de Weghe [22] presented a continuous spatio-temporal model for space-time analysis. However, in practical visualization, space and time are recognized or recorded in discrete intervals, as in the tiled map service and the periodic timekeeping system. The tiled map uses a quad tree to represent the multi-scale feature of space, while the timekeeping system uses the year, season, month, day and hour structure to record time-based events. In this section, we propose a space and time zooming method that conforms to the tiled map service and the timekeeping system. This method also conforms to the "overview first, zoom and filter, then details on demand" process [23].
Mapping the time component onto an axis in 2D or 3D space is a conventional method for time series visualization. The space-time cube [24] maps time to a vertical axis in 3D, resulting in a 3D trajectory for time series datasets. The storygraph [25] maps the x and y coordinates of space to two vertical axes and time to the horizontal axis, which can show trends in time series datasets. This paper provides a cartographic method to encode the time component of spatio-temporal data. The method separates the time component from the map space and uses a glyph map symbol to encode and interact with time, which gives more freedom for time representation and interaction. Furthermore, all symbols on the map are controlled by a dynamic map layer that displays appropriate symbols at different map zoom levels and view extents.
Environmental monitoring data, such as air quality monitoring data, have multidimensional features, which can be structured as data cubes with a hierarchical structure [26]. Figures 1 and 2 illustrate the hierarchies of time and geographical information, respectively. In this paper, the part of the time structure inside the dashed polygon is used for the visualization, as in the instance on the right side of Figure 1. For the geographical part, this paper focuses on the air quality of city points, each of which has a semantic importance illustrated in the table on the right side of Figure 2. The importance level of the city points influences their weight in dynamic map symbol selection when zooming and panning the map. In the following subsections, the technologies and framework of the mapping application and the design and development of two JavaScript visualization tools, timezoom.js and symadaptive.js, are discussed. The two visualization components are mainly based on the general purpose web mapping library leaflet.js and the data visualization library D3.js. Then, the data and their processing for this study are presented.
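The discrete, quad-tree view of space used by tiled map services can be sketched as follows. This is a minimal illustration that assumes the common Web Mercator tiling scheme used by OpenStreetMap-style tile services; the function names are hypothetical and are not part of the tools described in this paper.

```python
import math


def tile_for(lon_deg, lat_deg, zoom):
    """Return the (x, y) index of the tile containing a point at a zoom level.

    Assumes the Web Mercator tiling scheme, in which zoom level z
    divides the world into 2**z by 2**z tiles.
    """
    n = 2 ** zoom  # number of tiles along each axis at this zoom level
    x = int((lon_deg + 180.0) / 360.0 * n)
    lat = math.radians(lat_deg)
    y = int((1.0 - math.asinh(math.tan(lat)) / math.pi) / 2.0 * n)
    # Clamp so that points on the eastern or southern edge stay in the grid.
    return min(x, n - 1), min(y, n - 1)


def parent_tile(x, y):
    """Quad-tree parent: the tile one zoom level up that covers (x, y)."""
    return x // 2, y // 2


def child_tiles(x, y):
    """Quad-tree children: the four tiles one zoom level down."""
    return [(2 * x + dx, 2 * y + dy) for dy in (0, 1) for dx in (0, 1)]
```

Each zoom step splits a tile into four children, which plays the same role for space that the year, season, month and day levels play for time.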

Air Quality Mapping Technologies and Framework
As mentioned above, environmental monitoring data, such as urban air quality data, are geo-referenced, and a map is very suitable for visualizing this dataset. To map such a big dataset in one dynamic web map application, we resort to some popular visualization tools from the GIS and information visualization areas. With the boom in web mapping tools, cartographers have more and more choices for their mapping work. Nevertheless, it is also a challenge for cartographers to know the characteristics of all of these tools and to maintain their own mapping frameworks. Roth [27] mentions this problem and provides a comprehensive analysis of current web mapping technologies, which gives us some advice on coping with the continued evolution of these technologies. In practice, cartographers need only three categories of tools for web mapping work: application frameworks, mapping libraries and visualization libraries. Application frameworks work as the engine that drives the whole mapping application; mapping libraries work as the map engine that glues together all kinds of map services and geospatial data services; visualization libraries work as the symbolization engine that visualizes the non-spatial information within geographic entities. This is a kind of software mashup framework, in which the application engine, the map engine and the symbol engine can be selected by map makers according to their visualization topics and development habits. Among all of these mapping technologies, this study adopts web.py [28], leaflet.js [29] and D3.js [30] as the tool framework for the air mapping application. web.py works as the web application framework to drive the whole mapping application; leaflet.js works as the map engine to glue together map services and data services; D3.js acts as the map symbolization engine to render the time series air quality data of multiple indexes. These three technologies work together in the air quality mapping application (Figure 3). This is a kind of technology mashup [31] working with the mapping mashup application. Though this study focuses on air quality monitoring data visualization, the framework and the visualization tools developed here can be applied to the visualization of other similar spatio-temporal environmental monitoring data, such as meteorological monitoring data (temperature, precipitation, humidity, etc.).

Time Series Map Symbol Encoding: timezoom.js
Time series data visualization mostly follows one of two approaches: linear or cyclic [32][33][34]. Linear approaches treat time as a continuous concept represented by a timeline. Cyclic approaches [35] regard time as a periodic concept, as in a timekeeping system. The timekeeping model is cyclic over the year, season, month and day, which originates from the astronomical observations of our ancestors. This paper organizes the time series air quality data with the hierarchical timekeeping system, and multiple temporal granularities are manipulated through map symbol interactions. At each node of this hierarchy, the value of each air quality index is calculated as the mean value of its lower-level nodes. This hierarchical time structure is formulated in Formula (1). Figure 4 shows the timezoom.js symbol structure.
In Formula (1): $Time_H$ is the time hierarchy structure to be calculated; $Time_V$ is the array of time series values, such as the array of daily values for the 365 days of one year, $\{t_1: v_1, t_2: v_2, \ldots, t_{365}: v_{365}\}$; $Time_S$ is the time system used for hierarchy construction, such as {Year, Season, Month, Day}, which can be decided by users and extended to the hour, minute and second levels, as shown in Figure 1; $Operation$ is the operation for aggregating values from a lower time level to a higher time level, which can be any statistical method for aggregation, such as the average, median, quantile, etc.
The timezoom.js symbol is a radial tree map visualization tool based on D3.js. The root node of the hierarchy structure is at the center, with the leaf nodes on the circumference. The values of all nodes are encoded with a defined color scheme. The initial state of this symbol shows the whole hierarchy. When any sector is clicked, the symbol zooms to that sector and folds up the other sectors; for example, when the Q2 sector of the symbol is clicked, the other sectors are folded up, and the outer circle of the symbol zooms to the days in Q2 only (Figure 4).
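The hierarchy construction described above can be sketched in a few lines. This is a minimal sketch under stated assumptions, not the timezoom.js implementation: the function and field names are hypothetical, seasons are taken to be calendar quarters (Q1 to Q4), invalid days are assumed to have been dropped beforehand, and the default aggregation operation is the mean of the lower-level nodes.

```python
from collections import defaultdict
from datetime import date
from statistics import mean


def make_node(name, children=None, value=None, operation=mean):
    """Create a hierarchy node; a parent's value aggregates its children."""
    children = children or []
    if children:
        value = operation([child["value"] for child in children])
    return {"name": name, "value": value, "children": children}


def build_time_hierarchy(daily_values, operation=mean):
    """Nest {date: value} records as Year -> Season (quarter) -> Month -> Day."""
    grouped = defaultdict(lambda: defaultdict(lambda: defaultdict(list)))
    for day in sorted(daily_values):
        quarter = (day.month - 1) // 3 + 1  # Q4 covers October to December
        leaf = make_node(day.isoformat(), value=daily_values[day])
        grouped[day.year][quarter][day.month].append(leaf)

    years = []
    for year, quarters in grouped.items():
        quarter_nodes = []
        for quarter, months in quarters.items():
            month_nodes = [
                make_node(f"M{month:02d}", days, operation=operation)
                for month, days in months.items()
            ]
            quarter_nodes.append(
                make_node(f"Q{quarter}", month_nodes, operation=operation)
            )
        years.append(make_node(str(year), quarter_nodes, operation=operation))
    return years


# Illustrative values only, not taken from the dataset.
sample = {
    date(2015, 1, 1): 10,
    date(2015, 1, 2): 20,
    date(2015, 2, 1): 60,
    date(2015, 4, 1): 100,
}
year_2015 = build_time_hierarchy(sample)[0]
```

Passing another function as `operation` (for example `statistics.median`) changes the aggregation at every level, which mirrors the role of the Operation term in Formula (1).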

Adaptive Map Symbol Control: symadaptive.js
Zooming control means adaptively selecting appropriate symbols at different zoom scales and view extents. Yang [36] proposes a strategy for multi-scale visualization of massive point data, which selects points with a heavy algorithm on the server side and performs light-weight symbol displacement on the client side. Korpi [37] evaluated point symbol clutter reduction methods for map mashups. Following these works, in this paper, map symbol clutter reduction is achieved by symbol conflict detection with Rbush.js [38], which is a tool library based on the spatial index algorithm R-tree [39]. At each map zoom level, the symbols of the points in the current map view extent are tested for conflicts. When a symbol conflicts with other symbols, the point that has the lower semantic weight is dropped (Figure 5). The semantic level of each city is decided by the level of its administrative division; see Figure 2. In this paper, a mechanism of semantic importance-driven map symbol selection is proposed. In practice, users can have their own semantic level fields in their data, and the symadaptive.js layer tool can help to dynamically select appropriate symbols according to the assigned level weight field.
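The selection rule can be illustrated with a small sketch. It is not the symadaptive.js code: a brute-force pairwise overlap test stands in for the R-tree lookup, symbols are treated as square boxes in screen coordinates, and a larger `weight` is assumed to mean a more important city. All names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Symbol:
    name: str
    x: float       # screen x of the symbol centre, in pixels
    y: float       # screen y of the symbol centre, in pixels
    size: float    # side length of the square symbol box, in pixels
    weight: int    # semantic importance; larger means more important (assumed)


def boxes_overlap(a, b):
    """True when the square boxes of two symbols intersect."""
    reach = (a.size + b.size) / 2
    return abs(a.x - b.x) < reach and abs(a.y - b.y) < reach


def select_symbols(symbols):
    """Keep the most important symbols that do not conflict with each other.

    Symbols are visited from high to low weight, so a symbol is dropped
    only when it conflicts with an already kept, at least as important one.
    """
    kept = []
    for symbol in sorted(symbols, key=lambda s: -s.weight):
        if not any(boxes_overlap(symbol, other) for other in kept):
            kept.append(symbol)
    return kept


# The situation of Figure 5: C overlaps both A and B.
a = Symbol("A", 0, 0, 10, weight=1)
b = Symbol("B", 30, 0, 10, weight=1)
c_high = Symbol("C", 15, 0, 30, weight=3)
c_low = Symbol("C", 15, 0, 30, weight=0)
```

With `c_high`, only C survives; with `c_low`, A and B are kept and C is ignored. Rerunning the selection after every zoom or pan, with positions recomputed for the new view, gives the adaptive behavior described above.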

Data Sources and Processing
The Air Quality Index (AQI) is a unitless number used to indicate how polluted the air is. It is adopted by the newly published national environmental protection standard Technical Regulation on Ambient Air Quality Index (on trial) in China. Under this standard, different air pollutants have their own concentration level ranges, and the index value of each pollutant can be calculated by Formulas (2) and (3). The AQI is assigned the maximum of the individual index values of all of the reported pollutants (Table 1). There are six ranges of AQI values, and each range is assigned a descriptor and a visual color code, as shown in Table 2.

$$IAQI_P = \frac{IAQI_{Hi} - IAQI_{Lo}}{BP_{Hi} - BP_{Lo}} \left(C_P - BP_{Lo}\right) + IAQI_{Lo} \qquad (2)$$

$$AQI = \max\{IAQI_1, IAQI_2, IAQI_3, \ldots, IAQI_n\} \qquad (3)$$

where: $IAQI_P$ is the individual Air Quality Index of pollutant P, $C_P$ is the concentration of pollutant P, $BP_{Hi}$ is the concentration breakpoint of pollutant P that is ≥ $C_P$, $BP_{Lo}$ is the concentration breakpoint of pollutant P that is ≤ $C_P$, $IAQI_{Hi}$ is the index breakpoint (Table 3) corresponding to $BP_{Hi}$, and $IAQI_{Lo}$ is the index breakpoint (Table 3) corresponding to $BP_{Lo}$.

We collect the hourly reported air quality data from the national real-time air quality reporting system. By 2017, there were 367 cities with 1497 monitoring sites in China reporting their Air Quality Index hourly, as shown in Figure 6. The daily air quality of cities is the object of this study. Therefore, the hourly reported data are aggregated into daily data according to the standard [2]. First, the hourly value of a city is calculated as the mean pollutant concentration of all of its monitoring stations. Then, the daily value is calculated as the mean of the 24 hourly concentration values if at least 16 h of valid values are available in that day; otherwise, the daily value is invalid. The final data are daily records of all cities with the 24-h average concentrations of 5 pollutants (SO2, NO2, CO, PM10, PM2.5) and the 24-h maximum concentrations of 2 pollutants (O3, O3-8H). All of the concentration values are converted to an Individual Air Quality Index (IAQI) value ranging from 0 to 500 according to Table 3 and Formulas (2) and (3). Thus, we can get the Air Quality Index (AQI) value and the primary pollutant of each city for each day.
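The processing steps above (the daily validity rule, the piecewise linear conversion of a concentration to an individual index, and the maximum over pollutants) can be sketched as follows. This is an illustration under assumptions, not the code used in this study: the function names are hypothetical, and the 24-h breakpoint lists for PM2.5 and PM10 are quoted from memory of the national standard, so they should be checked against Table 3 before any real use.

```python
from bisect import bisect_left

# Index breakpoints shared by all pollutants (0 to 500).
IAQI_LEVELS = [0, 50, 100, 150, 200, 300, 400, 500]
# 24-h concentration breakpoints in ug/m3 (assumed; verify against Table 3).
PM25_24H = [0, 35, 75, 115, 150, 250, 350, 500]
PM10_24H = [0, 50, 150, 250, 350, 420, 500, 600]


def daily_mean(hourly_values, min_valid_hours=16):
    """Mean of the valid hourly values, or None when too few are valid."""
    valid = [v for v in hourly_values if v is not None]
    if len(valid) < min_valid_hours:
        return None
    return sum(valid) / len(valid)


def iaqi(concentration, breakpoints, levels=IAQI_LEVELS):
    """Piecewise linear interpolation between the enclosing breakpoints."""
    if concentration is None:
        return None
    if concentration <= breakpoints[0]:
        return float(levels[0])
    if concentration >= breakpoints[-1]:
        return float(levels[-1])
    hi = bisect_left(breakpoints, concentration)  # first breakpoint >= C_P
    lo = hi - 1
    bp_lo, bp_hi = breakpoints[lo], breakpoints[hi]
    i_lo, i_hi = levels[lo], levels[hi]
    return (i_hi - i_lo) * (concentration - bp_lo) / (bp_hi - bp_lo) + i_lo


def aqi(individual_indexes):
    """Return (AQI, pollutant with the largest individual index)."""
    valid = {p: v for p, v in individual_indexes.items() if v is not None}
    if not valid:
        return None, None
    primary = max(valid, key=valid.get)
    return valid[primary], primary


# Illustrative concentrations only, not taken from the dataset.
indexes = {"PM2.5": iaqi(55, PM25_24H), "PM10": iaqi(100, PM10_24H)}
```

In this sketch, days that fail the 16-h rule propagate as `None` and are skipped when the maximum is taken, which is one possible way to handle invalid values.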

Results and Discussion
Based on the discussion above, this study designs the air mapping application in terms of the navigation of space, time and theme [40]. The whole application is built on the framework illustrated in Figure 3, with the two JavaScript visualization tools developed in this study: timezoom.js and symadaptive.js.

Spatial Navigation
The air quality data of all cities are loaded by symadaptive.js, and the time series data are used to build the timezoom.js symbol for each city point. These geo-located glyph symbols are overlaid on an OpenStreetMap tiled map background. The display of all symbols is under the control of symadaptive.js. Meanwhile, all of the symbols can be hovered over and clicked to display detailed air quality information about the selected city point. The weight of each city is decided by its administrative level: central city of a region, provincial capital, district capital and county capital (see Figure 2). Figure 7 shows the interaction effects of zooming the map to different scales.

Temporal Navigation
The temporal component of the air quality data is presented by the timezoom.js component. The timezoom.js symbol is driven by the date hierarchy structure of the air quality data. All symbols are connected through events. Therefore, when one map symbol is clicked at a specific time section, all symbols on the map zoom and change their appearance simultaneously to show the air quality data for the chosen time section. For easier and more efficient querying, a bigger time wheel is provided for temporal navigation on the map side panel. Figure 8 shows the interactions with the time series data through the timezoom.js symbol.
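The event connection between symbols can be pictured as a simple broadcast: a click on any one symbol is forwarded to every registered symbol. The sketch below is only a schematic of that idea, with hypothetical class and method names; the actual application wires this up through browser events.

```python
class TimeSymbol:
    """Stand-in for one city's time symbol; tracks the focused time section."""

    def __init__(self, city):
        self.city = city
        self.focus = "Year"  # initial state shows the whole hierarchy

    def zoom_to(self, section):
        self.focus = section


class LinkedSymbols:
    """Forwards a click on any symbol to all registered symbols."""

    def __init__(self):
        self._symbols = []

    def add(self, symbol):
        self._symbols.append(symbol)

    def click(self, section):
        # A click on one symbol zooms every symbol to the same section.
        for symbol in self._symbols:
            symbol.zoom_to(section)


group = LinkedSymbols()
cities = [TimeSymbol("Beijing"), TimeSymbol("Wuhan")]
for symbol in cities:
    group.add(symbol)
group.click("Q4")
```

The side-panel time wheel can use the same `click` entry point, so that the panel and the map symbols stay synchronized.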

Thematic Navigation
Air quality is summarized overall by the AQI, which is derived from several important air quality sub-indexes: PM10, PM2.5, SO2, NO2, O3_8h, O3 and CO. We design a side panel on the map for thematic attribute selection. When the thematic attribute changes, the map symbols of all points change accordingly. Thus, it is easy to switch the map view between the overall air quality map and the individual pollutant concentration maps.
To sum up, this study presents an urban air quality mapping application based on the two tools developed here: symadaptive.js and timezoom.js. The air quality map starts by visualizing the data of the central cities in the six major parts of China (see Figure 9).

Results and Analysis
In the air quality application, the concentrations of all of the pollutants are illustrated with graded color hues under an equal interval classification of all values of the year. The AQI map is rendered with the commonly used standard color scheme (Table 2). With the thematic navigation radio button, we can render the air maps of the different pollutants. Figure 10 presents a series of maps of the AQI value and the concentrations of seven important pollutants at a nationwide scale, and the dataset can be switched between two years, 2014 and 2015 (Figure 9). This series of maps clearly shows several significant patterns. These findings provide some clues for in-depth research on air pollutant causality and on the relationships among air pollutants from a spatio-temporal view.
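Equal interval classification splits the range between the minimum and maximum value of the year into classes of the same width, and each class is then drawn with one color grade. The sketch below illustrates this; the number of classes and the function names are assumptions, as they are not specified here.

```python
from bisect import bisect_right


def equal_interval_breaks(values, n_classes):
    """Inner class boundaries that split [min, max] into equal-width classes."""
    low, high = min(values), max(values)
    step = (high - low) / n_classes
    return [low + step * i for i in range(1, n_classes)]


def classify(value, breaks):
    """Class index from 0 (lowest) to len(breaks) (highest).

    A value equal to a boundary falls into the upper class.
    """
    return bisect_right(breaks, value)


# Illustrative concentrations only, not taken from the dataset.
breaks = equal_interval_breaks([0, 12, 47, 100], n_classes=5)
```

The class index can then be used to look up a color in a graded palette with one entry per class.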

Nationwide Air Quality Condition of China
The interactive mapping application gives us an overview of the air quality condition of China at a nationwide scale throughout a whole year (Figure 10a). First, most of the cities in China are more likely to have good air conditions on summer days than on winter days. The exceptions are in the northwest part of China, where several cities have very poor air conditions throughout the year. This situation should be given more attention in further research. Second, the air quality maps at different time granularities show a clear ribbon pattern along the coastline and in the southwest part of China, where cities have better air conditions than in other parts of China; meanwhile, the cities in the north part of China have the worst air conditions. There are two possible reasons for such patterns. On the one hand, the coastal cities have better air circulation conditions for air purification than the hinterland areas of China. On the other hand, the cities in the southwest are less developed than the hinterland; thus, their pollutant emissions are lower. Nevertheless, these are hypotheses that need further study. Furthermore, the patterns can provide auxiliary information for policy-makers to adopt different measures in different cities.

Spatio-Temporal Pattern of Air Pollutants
With the thematic navigation radio button, one can obtain the air maps of the different pollutants. The series of maps in Figure 10 clearly shows several significant patterns. First, we can readily infer that PM10 and PM2.5 are the most likely contributors to the AQI value, because their patterns in space and time are similar to that of the AQI (see Figure 10b,c, PM10 and PM2.5 concentration maps). Indeed, the public in China cares most about the PM10 and PM2.5 indexes, and they are the main contributors to smog conditions. Second, NO2 is mainly caused by automobile exhaust and is therefore more likely to be worse in big cities, as shown in the NO2 concentration maps (Figure 10e). Third, from the map of SO2, one can find that the concentration of SO2 is more likely to be serious in the winter months in the north of China (see the SO2 concentration maps in Figure 10d). This situation can be connected to the burning of coal for heating in winter in the cities in the north of China, which emits a great amount of SO2. Fourth, the value of CO is clearly seldom serious enough to impact public health, because CO is a deadly poisonous gas that is strictly controlled in China (see the CO concentration maps in Figure 10h).
Unlike other pollutants, which are emitted directly into the air by specific sources, ozone (O3) is created by sunlight acting on NOx and VOC in the air. Thus, the index value of O3 is higher on sunny days throughout the year or in areas that have a longer sunlight duration and a stronger sunlight intensity. In the concentration maps of O3 (Figure 10f,g), we can find that most cities in China have higher O3 values in Seasons 2 and 3, when sunshine is stronger and the daytime hours are longer. In Lhasa City, Tibet, the O3 value is high across the year because of its high elevation and thin air, which result in a high sunlight intensity. Some cities in the south of China and along the coastline show different O3 value patterns (see the O3 concentration maps in Figure 10f,g). This kind of situation may be caused by unstable weather conditions in the south of China; for example, in summer there are many rainy days on which the sunlight intensity is mild.
All of the findings mentioned above are hypotheses. Using these clues, we can design further studies on these topics and draw more reliable conclusions. Moreover, these illustrations can help environmental regulators and the general public to form an intuitive picture of the air condition of China.

Conclusions
The research presented here describes a novel combination of modern mapping technologies, with which this study develops an online mapping application of the air quality of China. In this study, an open web platform is used to collect the time series air quality data consistently. Then, data visualization tools (D3.js) and web mapping tools (leaflet.js, rbush.js) are combined to produce an interactive mapping application for spatio-temporal data. From the application, we can get a whole view of the air quality condition of China at a nationwide scale and over a one-year time span at multiple spatio-temporal granularities. This interactive map application clearly presents several significant findings about air quality in China, which provide good assistance for visual air quality analysis and clues for in-depth studies on air pollution.
The lessons we learned from this study are in three aspects. First, there are more and more open data about our living environment, into which we can delve to find important results for making our living environment better. Second, as cartographers, we should make full use of new technologies for data visualization and web mapping. A good combination of these tools can give us greater power for environmental data visualization. Third, the visual form of data is more expressive than the raw data table, and it gives deeper insights into the data analysis. In other words, nowadays, with more and more open data, fine and flexible visualization tools and the crowd wisdom of the public, we can have a clear vision of the environment around us.
Future work is to enhance the efficiency of the timezoom.js symbol and to extend it to the hour granularity for a more detailed time level. Moreover, as time goes on and air quality data accumulate, a year selection mechanism should be added to the timezoom.js symbol, and the comparison function should be enhanced. At the same time, some air quality research problems can be defined from the previous discussion of the findings from the visualization, and our future work will collect evidence to test these hypotheses.

Figure 1. Hierarchical structure of the time dimension. The dashed polygon part is handled in this paper. An instance of this structure is on the right side.

Figure 2. Hierarchical structure of the geographical dimension. In this paper, we focus on the city level, and the importance weight is in the table on the right side.

Figure 3. The air quality monitoring application framework. This paper proposes a mashup strategy to make use of multiple visualization and mapping technologies for web mapping applications.

Figure 4. Multi-granularity time zooming interaction (timezoom.js): The inner circle shows the data value of the year; Sectors Q1, Q2, Q3 and Q4 indicate the four seasonal values; the sectors January-December indicate the monthly data of the year; and each month sector is surrounded by daily sectors. When a sector is clicked, the timezoom.js symbol zooms to that time sector for a detailed view of its value distribution.

Figure 5. Multi-scale space zooming control (symadaptive.js): When the symbol C is added to the map, C will conflict with A and B. If C has a higher importance level than A and B, then C will be kept on the map, and A and B will be removed. Otherwise, if A or B has a higher importance level than C, C will be ignored.

Figure 8. Temporal navigation. The left circle symbol shows the structure of the time symbol, and the right one is the map symbol that will be displayed on the map. The toggle buttons below are the time filter for focusing on the time granularities of interest. (a) shows the full-year data in a time symbol, and (b) shows the symbol zoomed to Quarter 4 (October, November and December). (c,d) show the symbols filtered by the month and day values.

Figure 9. Initial view of the online air quality mapping application.

Table 1. Reported air index and pollutants and their descriptions.

Table 2. Air Quality Index level divisions, descriptors and colors.

Table 3. Individual Air Quality Index and corresponding pollutants' concentration limits.