Maps are useful for conveying information to both inexperienced and advanced users. Many types of maps are designed to present data, but the underlying maps often come with other challenges, such as how the areas are segmented. Fairbairn et al. suggest scale, level of detail, and multivariate data as common challenges for the representation of geo-spatial data [1]. Ward et al. state, “A problem of choropleth maps is that the most interesting values are often concentrated in densely populated areas with small and barely visible polygons, and less interesting values are spread out over sparsely populated areas with large and visually dominating polygons”. The challenge of perception (C1: size perceivability) is a fundamental one associated with digital maps. Even when this is rectified for a univariate map, few solutions offer opportunities to convey multivariate, high-dimensional data. For example, geo-spatial designs (choropleths, cartograms, symbol maps, etc.) only depict univariate or, occasionally, bivariate data. This is a challenge for conveying multivariate geospatial data (C2: multivariate geospatial data). One possibility is to use glyphs to support multivariate visualization. However, even if we can present multivariate geospatial data using glyphs, we still run into challenges. If we plot glyphs in their geospatial context, we risk overlap and over-plotting. In other words, if we place a multivariate glyph at the center of each unit area on a map, the glyphs will either overlap in many cases or be too small to perceive, especially in densely populated areas (see Figure 1). This is the challenge of occlusion (C3: occlusion). Ellis and Dix state, “a glyph representing multiple attributes may need simplifying when reduced in size, resulting in a loss of data”, suggesting that reducing the size of a scalable multivariate glyph can be problematic (C1: size perceivability). Another option to address C3 (occlusion) is to employ structure-driven glyph placement guided by a Cartesian grid. However, this common solution de-couples the glyphs from the original geospatial areas they intend to represent. This is the challenge of geo-spatial glyph placement (C4: glyph placement). In order to address all four challenges, C1–C4, we introduce scale-aware maps, a process of presenting geo-spatial multivariate data based on a desired screen space that enables dynamic modification of the level of detail shown using both zooming functions and custom scale options. We integrate this with glyphs to present multivariate data in a geo-spatial context, to enable interactive exploration, and to facilitate easier comprehension of area context using both smooth transitions and uncertainty indicators. We refer to our work as using glyphs, as opposed to symbols, guided by the definition from Borgo et al., who define a glyph as “... an independent visual object that depicts attributes of a data record” [4]. Our contributions include:
A multivariate map with scalable glyph rendering and presentation, in the form of scale-aware maps (C1: size perceivability, C2: multivariate geospatial data, C4: glyph placement).
Dynamic hierarchical glyphs that support zooming and user-controlled level of detail (C2: multivariate geospatial data, C3: occlusion, C4: glyph placement).
Interactive filters to improve analysis and exploration of multivariate data and comparison of geo-spatial areas (C2: multivariate geospatial data).
In order to do so, we develop solutions that address the four major challenges, C1–C4.
2. Related Work
McNabb and Laramee provide an SoS (survey of surveys) for information visualization and visual analytics [6]. The paper includes a section of glyph-focused survey papers, as well as geospatial surveys. Borgo et al. present a survey of glyph design criteria [4]. Fuchs et al. provide a systematic review of experimental studies on data glyphs [7]. Ward presents a taxonomy of different glyph placement strategies (discussed further in the glyph placement section) [8]. We find three survey papers on cartograms [9]. We do not consider univariate cartograms within the scope of our work as they distort the boundary geometry of the geo-spatial data, which we avoid in our process. We do not consider magic lenses in our related work [12]. We make this decision because magic lenses are manually manipulated, are typically not coupled to geospace, do not necessitate a placement algorithm, and have border transitions that are not necessarily smooth [13]. Although we discuss focus+context in the paper, we focus our related work on the topic of maps and glyph placement. We recommend Cockburn et al. for a discussion of focus+context [14]. We do not consider labels as related work as they do not necessarily apply to multivariate data, and labels do not have to follow a cohesive hierarchical structure [15]. However, the work here could likely be adapted specifically for labeled maps.
Janicke et al. use a circle packing method to reduce the complexity of point-based data at multiple levels of detail [16]. The user can zoom in and out of the map while the point distribution is aggregated to present clear, visible point frequencies. This differs from our work as the data is not coupled to geospatial areas. We also focus on multivariate data, which is not featured in their work. Rohrdantz et al. present a multivariate map depicting different new trends across the world using the geospatial map and data-proportional glyphs [17]. They use geo-tagging to aggregate their data and do not present any techniques to avoid occlusion. This differs from our work, which aims to present glyphs as concisely as possible and handles multiple levels of detail with dynamic zooming. Jo et al. present a model for reducing complexity in presenting multiclass data on maps by using aggregative techniques [18]. Their work emphasizes techniques to increase perceivability without manipulating the underlying geospatial context, whilst ours focuses on increasing perceivability with existing techniques through geospatial unification. Guo creates a technique known as regionalization with dynamically constrained agglomerative clustering and partitioning (REDCAP) [19]. Rather than focusing on scale, the algorithm focuses on agglomerating clusters, and Guo discusses six variations of the methodology. Although the technique can calculate a varying number of regions, Guo does not discuss how the number of regions is selected, which differs from our work, which allows the hierarchy to be restructured dynamically. The method uses region-based plotting, does not use glyphs, does not consider multivariate data, and does not support dynamic zooming.
Related Work with a focus on Cartograms:
Dorling visualizes local urban changes across Great Britain [20]. The paper uses multivariate options to review industry distribution, owner-occupied housing, as well as a set of attributes plotted using Chernoff faces as an equal area representation. Slingsby et al. capture the geo-spatial context and transform their results into a grid, which is then represented by a treemap where the hierarchy is based on temporal data [21]. Slingsby et al. present a rectangular cartogram showing the postcodes in Great Britain, where postcode district and unit postcodes form the hierarchy [22]. Cartograms distort geo-space, which we avoid using our procedure. Tong et al. develop Cartographic Treemaps to explore multivariate medical data provided by Public Health England [23]. This is extended to time-varying data [24]. Beecham et al. visualize trends to explain the UK’s vote to leave the European Union [25]. They use a juxtaposed view to present equal area cartograms for different variants. Nusrat et al. produce a cartogram that presents bivariate data using a ring encoding, where the color presents the leading statistic and the ring thickness presents the value of the leading statistic [26]. This differs from our work by emphasizing bivariate design, whilst we provide options for up to nine dimensions to be represented clearly. Our method also supports interactive levels of detail with dynamic zooming.
Related Work with a focus on Multivariate Maps:
Multivariate maps have been used in cartography for over 100 years. For example, Minard depicts a multivariate map using pie charts to present cow consumption across France [27]. The pie charts are placed manually. Kahrl et al. present a range of imagery focused on California’s water supplies, including irrigation applied to crops in the form of dense pixel displays across geo-spatial points [28]. Approaches to add more dimensions to choropleths include bivariate color maps [29]. Brewer and Campbell present point symbols for representing quantitative data on maps, including bivariate options [31]. Although their paper does not focus on glyph placement, their examples place symbols on a region’s centroid and exhibit minor occlusion. The work of Andrienko and Andrienko [32] contains a range of examples of multivariate maps using glyphs for thematic maps, including temporal glyphs and multivariate pie glyphs for forest data. They present glyph placement in two ways: region-centroid symbol placement for US states, and a Cartesian grid to represent forest data over Europe [32]. They discuss the importance of the link between identifying a symbol and the geo-space it represents on the map (C4: glyph placement). Slocum et al. provide a chapter on multivariate maps, describing techniques to consider when displaying bivariate, trivariate, and multivariate data [33]. Bertin’s Semiology of Graphics is a foundational work on cartography. The work covers many different maps, including multivariate maps of up to six variates, using grid-based and coordinate-based placement schemes [34]. Elmer reviews symbol considerations for bivariate thematic maps but does not support more than two variates [35]. Our algorithm supports an arbitrary number of variates depending on the glyph design. Kresse and Danko present geographic techniques from basic principles to applications [36]. They present a table of visual variables to represent data, applied to a given map and symbols. Tsorlini et al. present a taxonomy of thematic cartography symbols, including multivariate options [37]. The symbols are presented as a hierarchy, focusing on the number of attributes and their arrangement. The focus of their work is neither glyph placement nor dynamic level-of-detail.
Related Work with a focus on Glyph Placement:
Ward and Lipchak create a software tool for cyclical, temporal multivariate data. Glyphs are placed in an ordered grid structure to enable easy comparison between similar months or entire years [2]. They also use a radial structure. Our work differs by focusing on glyph placement coupled to geo-spatial areas. Ward presents a taxonomy of different glyph placement strategies [8]. The taxonomy introduces glyph designs that can be used, and 15 glyph placement strategies together with a flow chart of how the glyph placement is driven (data-driven or structure-driven). Our placement strategy is considered geo-spatially data-driven. As the modifications are made before the placement process, it falls into original→derived→data-driven. Ward’s taxonomy is expanded by a subsection in a further survey by Ward [38]. Ropinski and Preim present a taxonomy of usage guidelines for glyph-based medical visualization [39]. As opposed to Ward’s placement taxonomy, they suggest glyphs should be placed based on physical characteristics or anatomical features. Borgo et al. provide a section on glyph placement which extends both of the previous taxonomies by suggesting user-driven placement [4]. Chung et al. discuss glyph sorting strategies and present horizontal axis bins, applying them to sport-event analysis glyphs [40]. Our work differs by guiding our glyph placement strategy based on a 2D geospatial context. As evidenced by Table 1, the algorithm we present offers a novel combination of glyph placement, multivariate data, level of detail, dynamic zooming, and smooth transitions.
This section provides the pre-processing steps used to create the scale-aware maps, the run-time process for transitioning between glyph densities, and the options we provide to enhance the exploration of the data. The pre-processing steps are based on previous work by McNabb et al. [41]. The purpose of the pre-processing step is to build a map whose areas are always perceivable: unit areas that are too small [42] are unified until they reach a minimum area threshold set by the user. The area-based hierarchy construction is a recursive algorithm broken down into three sub-routines. In these three steps, we select the optimal neighbor for merging, we identify the shared boundary between the given area and its neighbor, and we unify them to create a new area, which is then inserted back into the list of areas sorted by size. A flow chart of the procedure is found in Figure 2. Once the pre-processing steps are completed, we move to our run-time implementation. At run time, we identify optimally-sized areas, render any transitions between the previously rendered and the newly selected areas, compute the glyph visual mapping of the data, and update the glyph properties, before rendering the glyphs. From here, we provide five options to transform the view: zooming or scaling to dynamically modify the multivariate glyphs; attribute filtering to modify the glyph properties and mapping; modification of hidden area indicators to customize glyph design; revision of glyph attributes to customize the glyph design; and modification of the underlying hierarchy, which returns the algorithm to the pre-processing procedures. We discuss the steps in detail in the sections that follow.
We use a recursive procedure to create a hierarchical area-based data structure. An area hierarchy is created for each contiguous region, where each area is merged with its closest neighbor identified using a customizable distance metric [41]. We start with a merge candidate list filled with the sorted unit-areas (for one contiguous region). There are three main sub-routines: (a) neighbor selection, (b) creating the parent area, and (c) updating the merge candidate list. If only a single unit-area remains in the merge candidate list, no further merges can be processed and the procedure terminates. (a) To select an appropriate neighbor to join, we evaluate a general and flexible distance metric for amalgamation between neighboring areas. We consider the neighbor with the smallest distance to be the optimal selection. McNabb et al. discuss why these attributes are important for neighbor selection [41]. The measure consists of four constituents: smallest area (a), Euclidean distance between centroids (d), univariate data value variance, and shared boundary resolution. (b) We search and identify each common vertex between neighboring areas to identify the shared boundary, and unify the two areas to create the parent area. (c) We update the sorted area list by removing the two merged areas and inserting the newly created parent, which may be used as a new merge candidate. This is repeated until only one area remains in each contiguous region.
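The merge loop described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation from [41]: the `Area` class, the `distance` function (an unweighted combination of the four constituents), and the `shared` dictionary of shared boundary lengths are all hypothetical, and the smallest area is found with `min` rather than a maintained sorted list.

```python
import math
import statistics
from dataclasses import dataclass, field

@dataclass
class Area:
    name: str
    size: float                      # area in map units
    centroid: tuple                  # (x, y)
    value: float                     # univariate data value
    neighbors: set = field(default_factory=set)  # names of adjacent areas
    children: tuple = ()             # (left, right) for merged areas

def distance(a, b, shared_len):
    """Hypothetical metric (smaller is better); equal weights are an assumption."""
    d = math.dist(a.centroid, b.centroid)
    variance = statistics.pvariance([a.value, b.value])
    return min(a.size, b.size) + d + variance - shared_len

def build_hierarchy(areas, shared):
    """areas: name -> Area for one contiguous region.
    shared: frozenset({name1, name2}) -> shared boundary length."""
    active, shared = dict(areas), dict(shared)
    while len(active) > 1:
        cand = min(active.values(), key=lambda a: a.size)   # smallest area first
        # (a) neighbor selection: smallest distance wins
        best = min(
            (active[n] for n in cand.neighbors),
            key=lambda n: distance(cand, n, shared[frozenset((cand.name, n.name))]),
        )
        # (b) create the parent area (values summed here; one of several options)
        total = cand.size + best.size
        parent = Area(
            name=f"({cand.name}+{best.name})",
            size=total,
            centroid=tuple((cand.size * c + best.size * b) / total
                           for c, b in zip(cand.centroid, best.centroid)),
            value=cand.value + best.value,
            neighbors=(cand.neighbors | best.neighbors) - {cand.name, best.name},
            children=(cand, best),
        )
        # (c) update the merge candidate list and adjacency information
        del active[cand.name], active[best.name]
        for n in parent.neighbors:
            active[n].neighbors -= {cand.name, best.name}
            active[n].neighbors.add(parent.name)
            shared[frozenset((parent.name, n))] = (
                shared.get(frozenset((cand.name, n)), 0.0)
                + shared.get(frozenset((best.name, n)), 0.0))
        active[parent.name] = parent
    return next(iter(active.values()))
```

The returned root is a binary tree whose leaves are the original unit areas, which is the structure the run-time traversal relies on.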
Value calculation for unified areas:
The Modifiable Areal Unit Problem (MAUP) [43] is an important aspect to consider when discussing the modification of boundaries or values. We address this by providing the user options to customize the calculation of aggregated univariate data values as well as the customizable distance metric used to evaluate area merge candidates. The data is linked to the unit areas during the initial loading of the shape files. Before the area tree is built, the user can select the type of value amalgamation. This enables the user to choose options of sums, frequencies, and value averages. When amalgamating values using sums, the value can be calculated using aggregation. Qualitative values are calculated using frequencies. For a detailed description of parent value calculation, see McNabb et al. [41].
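As an illustration of the three amalgamation options, the sketch below combines two child records into a parent value. The record layout (`value`, `units`) and the weighting of averages by the number of unit areas are assumptions made for this example; the calculation actually used is described in [41].

```python
from collections import Counter

def amalgamate(left, right, mode):
    """Combine two child records into a parent value.

    Each record is a dict with 'value' and 'units' (number of unit areas it
    covers). Both the layout and the unit-count weighting are assumptions.
    """
    if mode == "sum":
        return left["value"] + right["value"]
    if mode == "average":
        n = left["units"] + right["units"]
        return (left["value"] * left["units"] + right["value"] * right["units"]) / n
    if mode == "frequency":
        # Qualitative data: 'value' is a Counter of category occurrences.
        return Counter(left["value"]) + Counter(right["value"])
    raise ValueError(f"unknown amalgamation mode: {mode}")
```

Weighting the average by unit count keeps a parent's value equal to the mean over all of its leaves, regardless of the order in which merges happened.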
4.2. Geospatial Glyph Placement
In order to enable multivariate maps, a number of technical challenges must be addressed, including: (1) a glyph-placement strategy, (2) a hierarchical glyph design, (3) dynamic level-of-detail support, (4) smooth transitions between child and parent glyphs, (5) multivariate filtering and selection, and (6) customizable interactive user options. Furthermore, the hierarchical glyph design must support the encoding of aggregation error.
We select visible areas and glyphs based on a minimum area scale requirement (a percentage of screen space), m. When the map is rendered, the tree is traversed using a depth-first search (DFS) to identify which areas are rendered. If an area is larger than m, we test two criteria: whether the area is a leaf node, or whether either the left or right child is smaller than m. If either of these is true, we render the area. For each area displayed, we create a glyph positioned at the area’s centroid. The glyph reflects the given area’s multivariate data values (based on the user’s selection). We set the default size of the glyph to 2.5% of the screen space, a heuristic we derive from McNabb et al.’s previous user study on perception of scale on choropleth maps [42]. As the zoom level of the map changes, different areas may meet m and therefore be presented, creating a dynamic presentation of glyphs. This addresses T1: Overview and T3: Glyph Placement, by providing a clear overview of the map with no occlusion and a clearly encoded geo-spatial context. As the size of the glyph changes, so does the perceived ideal map structure. The user can manually find their own preference using sliders or use naive estimated glyph placement.
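The traversal described above can be sketched as a short recursive function. The `Node` class and its `screen_fraction` field are hypothetical names for this example, and the sketch assumes a binary hierarchy in which every internal node has both children.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Node:
    name: str
    screen_fraction: float           # share of screen space at the current zoom
    left: Optional["Node"] = None
    right: Optional["Node"] = None

def select_visible(node, m, out=None):
    """Depth-first search returning the names of the areas to render."""
    out = [] if out is None else out
    if node is None or node.screen_fraction < m:
        return out                   # too small: an ancestor represents it
    is_leaf = node.left is None and node.right is None
    if is_leaf or node.left.screen_fraction < m or node.right.screen_fraction < m:
        out.append(node.name)        # render this area and place its glyph
    else:
        select_visible(node.left, m, out)
        select_visible(node.right, m, out)
    return out
```

Raising m moves the selection towards the root (fewer, larger areas), while lowering it moves the selection towards the leaves. Note that in this sketch a root smaller than m yields an empty selection; how that case is handled is not stated in the text.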
4.3. Glyph Selection
We provide the user with four common glyph design options to represent the data (see Table 2). We chose these four typical options due to their common occurrence in geo-spatial literature [32]. However, the principles we describe can be applied to any multi-dimensional glyph. The user can switch between each glyph design at any point once the hierarchical data structure has been built. These glyph options are:
Pie Chart:
Pie charts are an easily recognizable and practical design, making them a suitable option to present multivariate data. Pie charts are primarily used to present distribution per geospatial area, where the angle of each segment is proportional to the value of its data dimension (see Table 2).
Polar Area Chart:
Originally published by Nightingale [44], a polar area chart is another radial plot but with equal segment angles. The radius of each slice corresponds to the value of each dimension, which facilitates comparison between geo-spatial areas. The polar area chart is also known as the wheel, coxcomb, or wing chart (see Table 2).
Bar Chart:
The bar chart is one of the most recognizable visual designs. Values are mapped to bar heights, with bars aligned to the horizontal axis for easy value comparison (see Table 2).
Star Glyph:
Originally presented by Siegel et al. [45], a star glyph presents values using lines originating from the same point, at equal angles. The endpoints connect to form a unique polygon based on the length of each line (see Table 2).
This addresses the requirements for T2: Multivariate Maps. We choose four standard glyph designs as a proof-of-concept. Glyph placement, not glyph design, is the focus of this paper. The principles we present can be extended to any multivariate glyph.
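To make the visual mappings of the radial glyphs concrete, the sketch below computes segment angles for a pie chart, slice radii for a polar area chart, and vertices for a star glyph. The function names and the linear scaling against a maximum value are assumptions for illustration; the polar area radii follow the description above (radius proportional to value).

```python
import math

def pie_angles(values):
    """Segment angles in radians, proportional to each value."""
    total = sum(values)
    return [2 * math.pi * v / total for v in values]

def polar_area_radii(values, max_radius):
    """Equal-angle slices; the radius of each slice scales linearly with its value."""
    vmax = max(values)
    return [max_radius * v / vmax for v in values]

def star_vertices(values, max_radius, vmax):
    """Endpoints of equally spaced spokes; joining them gives the star polygon."""
    step = 2 * math.pi / len(values)
    return [(max_radius * v / vmax * math.cos(i * step),
             max_radius * v / vmax * math.sin(i * step))
            for i, v in enumerate(values)]
```

All three functions take the same list of per-dimension values, which is what allows the user to switch glyph designs without rebuilding the hierarchy.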
4.4. Adjusting Level-of-Detail with Glyph Density
Adjusting glyph density can be handled in two different ways. First, we give the user a slider which depicts m, a minimum area requirement. The parameter m represents a percentage of screen space. This is used as the primary variable for the depth-first search (DFS). We also allow the user to interactively zoom in or out of the map. This changes the visible extents of the map, modifying the screen space covered by each area. These options enable the rendering of perceivable glyphs, meeting the requirements for T4—Level-of-detail.
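The interaction between zooming and m can be illustrated with a small calculation. Under a uniform zoom factor, on-screen area grows with the square of the factor, so the share of screen space an area covers can be estimated as below. The function and parameter names are hypothetical, and clipping against the visible extent is ignored.

```python
def screen_fraction(area_px_at_zoom1, zoom, viewport_w, viewport_h):
    """Share of the viewport covered by an area at a given uniform zoom factor.

    area_px_at_zoom1 is the on-screen area in square pixels at zoom 1.
    Clipping of partially visible areas is ignored in this sketch.
    """
    return area_px_at_zoom1 * zoom ** 2 / (viewport_w * viewport_h)

# An area of 10,000 px^2 in a 1000 x 800 viewport:
m = 0.025
at_zoom_1 = screen_fraction(10_000, 1.0, 1000, 800)   # 0.0125, below m
at_zoom_2 = screen_fraction(10_000, 2.0, 1000, 800)   # 0.05, meets m
```

In this example the area is hidden behind an ancestor at zoom 1 and becomes eligible for its own glyph at zoom 2, without m changing.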
4.5. Smooth Merging and Splitting Transitions
In order to increase the smoothness of user interaction and changes to glyph size when zooming or manipulating m, we apply smooth transitions to child glyph merging and parent glyph splitting. When the user reduces the number of glyphs by either zooming out of the map or increasing the minimum scale, child glyphs translate towards the origin of their parent in the hierarchy while their opacity is reduced until they are no longer visible. The parent increases in opacity until it is fully opaque, creating a smooth transition. When adding new glyphs (zooming in or reducing the minimum scale), the new child glyphs translate away from their parent and increase in opacity to provide a similar effect. Using this technique, we fulfill the requirement for T6: Smooth Interaction. See Figure 3. This dynamic animation is best viewed in the accompanying video.
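A merge transition of this kind can be sketched as a function of a progress value t. Linear interpolation of position and opacity is an assumption here, since the easing actually used is not stated; the function name and return layout are hypothetical.

```python
def merge_transition(child_pos, parent_pos, t):
    """State of a child-to-parent merge at progress t in [0, 1].

    t = 0: the child glyph is fully shown at its own centroid.
    t = 1: the child has reached the parent and faded out.
    Splitting is the same animation played from t = 1 back to t = 0.
    """
    t = min(max(t, 0.0), 1.0)        # clamp progress
    x = child_pos[0] + (parent_pos[0] - child_pos[0]) * t
    y = child_pos[1] + (parent_pos[1] - child_pos[1]) * t
    return {"child_pos": (x, y), "child_opacity": 1.0 - t, "parent_opacity": t}
```

Driving t from an animation timer gives the merge effect, and reversing it gives the split effect with no extra code.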
4.6. Dynamic Average Glyph Legend
We provide a dynamic average glyph legend to present how the multivariate data dimensions of the glyph are encoded. The advantage this provides is that each individual glyph on the map can be compared to the average shown in the legend. Each variate is given a label, which provides context to the user about what is presented. The data used to present the glyph is made meaningful by visualizing the average value of each dimension. In Figure 4, we can see that there seem to be some extreme values for the 80+ and 20–29 range, causing the average per area to be quite small overall.
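The values behind such a legend glyph can be sketched as a per-dimension mean. Whether the average is taken over the currently rendered glyphs or over all unit areas is not stated above; this example assumes the rendered glyphs, and the function name is hypothetical.

```python
def average_glyph(glyph_values):
    """Per-dimension mean over a list of equally long value lists.

    glyph_values holds one list of dimension values per rendered glyph
    (an assumption; the source does not specify the population used).
    """
    n = len(glyph_values)
    return [sum(column) / n for column in zip(*glyph_values)]
```

The resulting list can be passed to the same glyph-drawing code as any map glyph, so the legend always matches the selected design.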
4.7. Attribute Filtering
Our first filter option is to re-calculate the glyph design with only the toggled dimensions. Each data dimension can be toggled using a check-box incorporated to represent data variates in the glyph design. This allows the user to focus on or emphasize data dimensions that may reveal trends. We support user filtering using focus+context rendering. We provide a gray-scale option which removes the color from context data dimensions, enabling easier comparison. This supports the requirements we set forth in T5: Filtering. See Figure 5.
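The two filter behaviours can be sketched as follows. The function names and the `mode` values are hypothetical, and the gray-scale conversion uses the common Rec. 601 luma weights as an assumption, since the conversion actually used is not specified.

```python
def to_gray(rgb):
    """Convert an (r, g, b) color to gray using Rec. 601 luma weights."""
    r, g, b = rgb
    y = round(0.299 * r + 0.587 * g + 0.114 * b)
    return (y, y, y)

def apply_filter(values, colors, enabled, mode="recalculate"):
    """Return (value, color) pairs for the dimensions of one glyph.

    mode='recalculate': drop untoggled dimensions so the glyph is rebuilt.
    mode='context': keep every dimension but gray out the untoggled ones.
    """
    if mode == "recalculate":
        return [(v, c) for v, c, on in zip(values, colors, enabled) if on]
    return [(v, c if on else to_gray(c)) for v, c, on in zip(values, colors, enabled)]
```

In the re-calculation mode a pie chart's angles change because the total changes, whereas in the context mode the geometry is preserved and only the color encoding differs.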
4.8. Unit Area Density Indicators
We present unit area density indicators that provide a visual cue indicating how hidden unit areas are distributed, and encourage the user to explore the visualization through multiple levels of detail. When two child glyphs merge to form a parent, the child glyphs are then hidden. Our glyph design maps the number of merges to a range of different visual indicators that generally surround the glyph. See Table 2. We offer four options:
Outline: The number of unit areas is mapped to the thickness of an outline around each glyph. The thickness of the outline grows as more areas fall underneath a glyph.
Size: Rather than provide an outline, the glyph’s overall size increases as the glyph represents more unit areas. This works especially well with pie charts, which emulate a proportional map.
Size+Outline: Size+Outline uses a combination of the two previous options.
Shadows: Rather than an outline with a constant opacity, we enable the user to choose a gradient, resulting in less occlusion in the representation.
These unit area density indicators are inspired by the work of Chung et al. [40], where the indicator was effective but used to represent another data dimension (as opposed to the density of a map). We also give the user an option to represent the indicator mapped to color. The color represents the scale the glyph encodes, as opposed to other visible encodings. This enables the user to gain an understanding of how manipulation of glyph density can affect the map if a transition is made. See Table 2. This addresses our requirements of T4: Level-of-detail.
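The four indicator options can be sketched as a mapping from the number of unit areas under a glyph to outline thickness and radius. The linear mapping, the default pixel sizes, and all names below are assumptions made for this illustration.

```python
def density_indicator(unit_count, max_count, option,
                      base_radius=20.0, max_outline=8.0, max_growth=0.5):
    """Map the number of unit areas under a glyph to indicator properties.

    option is one of 'outline', 'size', 'size+outline', 'shadows'.
    A glyph covering a single unit area gets no indicator.
    """
    share = 0.0 if max_count <= 1 else (unit_count - 1) / (max_count - 1)
    has_outline = option in ("outline", "size+outline", "shadows")
    has_growth = option in ("size", "size+outline")
    return {
        "radius": base_radius * (1 + max_growth * share) if has_growth else base_radius,
        "outline": max_outline * share if has_outline else 0.0,
        "gradient": option == "shadows",   # fade the outline instead of a solid stroke
    }
```

Normalizing against the largest count currently on screen keeps the indicator readable at every level of detail, at the cost of the same thickness meaning different counts at different zoom levels.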
4.9. Interactive User Options
We provide additional user options to support T5—Filtering. We present a range of user options including value range filters, advanced focus+context rendering options, estimated glyph placement, and context administrative areas.
Data Range Options: We provide data range filtering to enable customized local and global design options for dimension encoding. On a local range, the user can shift the value range to present the data dimensions based on the values found in the leaf nodes (the original dataset), or clamp the ranges to those that are currently being rendered to enable a more accurate data range for comparing data dimensions. We also support global range options by enabling the user to depict each variate based on its own range or by creating a range based on the highest and lowest value of all mapped dimensions.
Advanced Filters: We include two advanced filters to render focus+context for the user. For numerical values, the user can present focus+context based on values higher or lower than the average value per data dimension.
We provide the user with a variety of color maps, selected from published research, including ColorBrewer [46] (refer to Table 2) and Colorgorical [47] (refer to Figure 5).
Glyph Scaling: We allow the user to scale the current size of the glyph. This enables the user to explore a ratio between the minimum scale and the size of glyphs that suits their own data.
Naive estimated glyph placement: Using the current size of the glyph, we help the user estimate, at the press of a button, the minimum screen space necessary to remove occlusion. This makes it easier to obtain a starting point for deciding on the design of the map they would like to use. This can also be linked to the glyph scaling to allow for automatic re-placement when scaling the glyph.
Context Administrative Areas:
We can provide additional context behind the areas by rendering every leaf area in a context view, which is shown in Figure 5.
Details on demand:
We allow the user to obtain precise insight into the multivariate data by providing a textual representation of the values associated with a glyph by hovering over any glyph. We also include the number of areas depicted to give better context to the underlying data. See Figure 6.
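Returning to the data range options described earlier in this section, the choice between leaf-based and rendered-only ranges, and between per-dimension and shared ranges, can be sketched as below. The function name, the parameter values, and the row layout (one list of dimension values per area) are assumptions for illustration.

```python
def dimension_ranges(leaf_rows, rendered_rows, local="leaf", scope="per_dimension"):
    """Return one (low, high) encoding range per data dimension.

    local='leaf': ranges come from the original unit areas.
    local='rendered': ranges are clamped to the areas currently drawn.
    scope='per_dimension': each dimension keeps its own range.
    scope='shared': every dimension uses the overall lowest and highest value.
    """
    rows = leaf_rows if local == "leaf" else rendered_rows
    ranges = [(min(col), max(col)) for col in zip(*rows)]
    if scope == "shared":
        low = min(r[0] for r in ranges)
        high = max(r[1] for r in ranges)
        ranges = [(low, high)] * len(ranges)
    return ranges
```

A shared range makes dimensions directly comparable within a glyph, while per-dimension ranges make better use of the available glyph size when dimensions differ in magnitude.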