MAP-Vis: A Distributed Spatio-Temporal Big Data Visualization Framework Based on a Multi-Dimensional Aggregation Pyramid Model

Abstract: During the exploration and visualization of big spatio-temporal data, massive volume poses a number of challenges to the achievement of interactive visualization, including large memory consumption, high rendering delay, and poor visual effects. Research has shown that the development of distributed computing frameworks provides a feasible solution for big spatio-temporal data management and visualization. Accordingly, to address these challenges, this paper adopts a proprietary pre-processing visualization scheme and designs and implements a highly scalable distributed visual analysis framework, especially targeted at massive point-type datasets. Firstly, we propose a generic multi-dimensional aggregation pyramid (MAP) model based on two well-known graphics concepts, namely the Spatio-temporal Cube and the 2D Tile Pyramid. The proposed MAP model can support the simultaneous hierarchical aggregation of time, space, and attributes, as well as the later transformation of the derived aggregates into discrete key-value pairs for scalable storage and efficient retrieval. Using the generated MAP datasets, we develop an open-source distributed visualization framework (MAP-Vis). In MAP-Vis, a high-performance Spark cluster is used as a parallel preprocessing platform, while distributed HBase is used as the massive storage for the generated MAP data. The client of MAP-Vis provides a variety of correlated visualization views, including heat map, time series, and attribute histogram. Four open datasets, with record numbers ranging from the millions to the tens of billions, are chosen for system demonstration and performance evaluation. The experimental results demonstrate that MAP-Vis can achieve millisecond-level query response and support efficient interactive visualization under different queries on the space, time, and attribute dimensions.


Introduction
As data collection methods have matured and diversified (e.g., personal smart devices, intelligent vehicles, the Internet of Things (IoT)), data sources have become increasingly rich, and the amount of collected data has accumulated explosively. These massive amounts of streaming-generated data are usually arranged in the structure (S, T, A_1, ..., A_n); here, S describes the spatial location of the record, T specifies the time moment, and A_1, ..., A_n are n attributes attached to the target object.
Exploration and visualization of these accumulated, spatio-temporally referenced data can help users to identify inherent patterns, infer correlations and causal relationships, and support further analysis. Previous surveys on the use of visualization for data mining have discussed some standard techniques for abstracting large-scale datasets, including dimension reduction [13,14], subsetting (e.g., random sampling, filtering, level of detail), segmentation (e.g., cluster analysis), and aggregation. These data reduction methods can also be categorized in terms of whether they are online or offline [15]: online aggregation means the aggregates are calculated in real time, while the offline mode refers to preprocessing or precomputation being conducted in advance.
Among the above-mentioned approaches, the sampling approach involves selecting a subset of the raw dataset, while filtering is conducted with a user-defined spatial/temporal/attribute search. These two methods control which data entities of the raw dataset are visualized, thereby helping users to focus on the information of interest. However, the selected subset may still be too large to visualize effectively; moreover, it may not be representative and may cause important structures or outliers to be missed.
Data aggregation refers to a statistical grouping of original data items into a hierarchical tree structure of data aggregates by counting the number of data items that fall within each predefined bin. For example, histograms and heatmaps are typical 1D and 2D binned aggregations [16]. Different types of input values have different bin definitions, for example, adjacent intervals for numerical variables, each value for categorical variables, and day-week-month for temporal variables. Other advantages of the binning technique include its rapid computation speed and its ability to be easily computed in parallel, since the binning process is independent for each data item.
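For illustration, the following minimal Python/NumPy sketch shows 1D and 2D binned aggregation on hypothetical point records; the coordinates, the hour-of-day attribute, and the bin counts are illustrative and not taken from the paper's datasets.

```python
import numpy as np

# Hypothetical sample: longitude, latitude and an hour-of-day attribute for 10,000 records.
rng = np.random.default_rng(0)
lon = rng.uniform(-74.05, -73.90, 10_000)
lat = rng.uniform(40.60, 40.85, 10_000)
hour = rng.integers(0, 24, 10_000)

# 1D binned aggregation (histogram): count records per hour-of-day bin.
hour_counts, _ = np.histogram(hour, bins=np.arange(25))

# 2D binned aggregation (heat map): count records per spatial cell on a 256 x 256 grid.
heatmap, lon_edges, lat_edges = np.histogram2d(lon, lat, bins=256)

# Each record falls into exactly one bin, so counts can be computed independently
# per record and merged afterwards, which is why binning parallelizes well.
print(hour_counts.sum(), heatmap.sum())   # both equal the number of records
```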
In fact, the aggregation process relies on the widely used concept of the "data cube", first introduced by Gray et al. [17]. The data cube is designed for the hierarchical aggregation of multidimensional data structures and supports rich operations such as roll-up, drill-down, slicing, dicing, and pivoting (rotation). It can also provide a multidimensional view of the original data and allows the user to quickly calculate the data aggregates [2,9,18-22]. However, for high-dimensional cubes, most of the cube cells are empty; this results in very high storage redundancy and excessive consumption of internal/external storage.
To solve this storage redundancy problem, a number of scholars have proposed a variety of extended structures to reduce the memory usage among the sparse data cubes; these include Dwarf [23], imMens [24], Nanocubes [18], Hashedcubes [20], Gaussian Cubes [21], NeuralCubes [25], etc. Dwarf compresses data cubes using prefix and suffix redundancy to reduce the consumption of memory. imMens divides the high-dimensional data cube into multiple sub-cubes, thereby reducing total memory consumption, and decreases the query time delay through the use of GPU parallel processing rendering. Nanocubes extends the Dwarf concept by proposing a memory-based tree structure, which aggregates the spatial, time, and attribute dimensions at different levels and supports quick multi-dimensional space-time interactive queries. Hashedcubes is a fast, easy-to-implement and memory-efficient data structure that can answer queries from interactive visualization tools when exploring and analyzing large, multidimensional datasets. Moreover, Gaussian Cubes and NeuralCubes utilize other components (multivariate Gaussian and neural networks) in addition to the traditional data cube structure to support advanced data analysis with data visualization.
However, although these compact structures have improved visualization efficiency, the data scalability problem remains inadequately resolved. For example, imMens only supports a maximum of four dimensions and cannot be freely extended to further dimensions; moreover, Nanocubes is a tightly coupled structure that can only reside in the internal memory of a single computer. With the advent of cloud computing, it is of great importance to leverage distributed architectures and exploit horizontal scaling so as to obtain better visualization efficiency.

Parallel Implementations for Big Data Visualization
Along with data reduction, researchers have also recently proposed parallel implementations to address scalability issues; these approaches rely on many-core GPUs, distributed clusters, or hybrid architectures.
Compared with multi-core processors, many-core GPU processors exhibit a higher degree of explicit parallelism and higher data throughput and have been widely used to accelerate index generation, offline preprocessing, and rendering during visualization. Doraiswamy et al. proposed a novel indexing scheme that makes use of modern GPUs to efficiently support spatio-temporal queries over massive point datasets, and that also achieves interactive, sub-second response times for queries over 1.1 billion tweets [26]. Moreover, Perrot et al. designed a distributed visualization framework with Hadoop/Spark that leverages GPU to compute data aggregates with kernel density estimation [27]. The open-source project MapD [28] uses GPU to accelerate the processing of complex, real-time spatio-temporal data and can process billions of rows of data in milliseconds. Finally, imMens [24] also uses the GPU-based WebGL to conduct data processing and rendering.
Distributed clusters with commodity hardware are also very commonly used in big data visualization applications. For example, VisReduce is built on a distributed NoSQL database and implements both online aggregation and data compression [29]. Moreover, the tile-based visual analytics system (TBVA) performs the offline preprocessing of big data through a Spark cluster, generating the tile data of different spatial levels in advance and calculating the statistical values of the attribute dimensions of each tile to achieve an interactive visualization effect [30]. GeoMesa/GeoWave uses NoSQL databases to store spatio-temporal data in a distributed manner and can further use MapReduce or Spark to accelerate data processing and visualization [31,32].
Most of these implementations concentrate on using parallel/distributed platforms for data processing and online rendering, while very few of them consider using these resources to develop a unified solution for big data visualization, from quick pre-processing to scalable storage, and further to efficient online query/filtering.

The Hierarchical Multi-Dimensional Aggregation Pyramid (MAP) Model
By reviewing the characteristics of the target datasets, the generic multi-dimensional aggregation pyramid (MAP) model is built on two well-known graphics concepts, namely, the Spatio-temporal Cube and the 2D Tile Pyramid. The former is extended with an attribute dimension into the Space-Time-Attribute cube and provides the building blocks for the proposed MAP model, while the latter implements the idea of 2D spatial aggregation and serves as a useful template for the further simultaneous aggregation of space, time, and attributes.

Space-Time-Attribute Cube
The massive datasets are transformed and aggregated along three dimensions-namely, S (Space), T (Time), and A (Attributes)-so that the Cartesian product S × T × A can metaphorically define the abstract concept of the space-time-attribute cube, as illustrated in Figure 1. In fact, in real datasets, the S × T × A cube is still a multidimensional (nD) structure; this is because the spatial coordinates contain longitude and latitude, and the objects have several different attributes.
As building blocks for future aggregation, the S × T × A cube requires discrete and finite units in all three dimensions. However, the spatial and temporal domains are continuous; thus, space and time should first be discretized via partitioning into suitable regions and time intervals, respectively. Here, space discretization is accomplished using a regular rectangular grid (e.g., GeoHash). In the temporal dimension, the data is divided and reorganized by a predefined granularity (e.g., day, week, or month); in the attribute dimension, moreover, the categorical values are aggregated using a hierarchical tree structure.
The S × T × A cube supports two types of aggregation: cube aggregation and face aggregation. As illustrated in Figure 1, cube aggregation can be viewed as a contraction of eight cubes into a cube of coarser granularity. Cube aggregation resembles a common concept in computer graphics, namely Level of Details (LoD), in which complex objects are represented with varying levels of granularity so as to achieve a trade-off between the model fidelity and visualization performance. Cube aggregation provides the required function for hierarchical pyramid construction. Face aggregation can be metaphorically seen as a projection of the cube onto one of its faces along one dimension (see Figure 2). For example, for a given attribute, we can study the variation of the aggregate measures over space and time by means of this face aggregation process. Face aggregation will be used in client visualization due to the 2D display environment.
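A minimal NumPy sketch of the two aggregation types follows, assuming a small hypothetical count cube indexed by discretized (time, y, x): cube aggregation merges every 2 × 2 × 2 block of cells into one coarser cell, while face aggregation projects the cube onto one of its faces by summing along one dimension.

```python
import numpy as np

# Hypothetical count cube with discretized time, y and x indices (cell values are record counts).
cube = np.random.default_rng(1).integers(0, 5, size=(8, 8, 8))

def cube_aggregate(c):
    """Cube aggregation: contract every 2x2x2 block of cells into one coarser cell (sum of counts)."""
    t, y, x = c.shape
    return c.reshape(t // 2, 2, y // 2, 2, x // 2, 2).sum(axis=(1, 3, 5))

def face_aggregate(c, axis):
    """Face aggregation: project the cube onto one of its faces by summing along one dimension."""
    return c.sum(axis=axis)

coarser = cube_aggregate(cube)          # (8, 8, 8) -> (4, 4, 4): one level up the pyramid
space_map = face_aggregate(cube, 0)     # collapse time: a (y, x) heat map
time_series = cube.sum(axis=(1, 2))     # collapse space: counts per time step
print(coarser.shape, space_map.shape, time_series.shape)
```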

Definition of the 2D Tile Pyramid Model
As shown in Figure 3, the tile map pyramid model is widely used for 2D map display [10,33,34]. This is a typical LoD example, supported by hierarchical spatial aggregation, in which the resolution becomes increasingly coarse from the bottom to the top. Each layer of the pyramid contains a collection of square tiles, while each tile is composed of pixels (usually 256 × 256).

Concepts:
(1) Pixel
The Pixel, which is the smallest building block of the 2D pyramid, can be represented by Equation (1); here, l refers to the level of the pyramid; x, y represent the spatial coordinates of this pixel; and p represents the attribute value of the pixel (or an aggregated statistical value, e.g., average/sum/min/max).

(2) Tile
A Tile is a group of spatially adjacent pixels, usually a 256 × 256 square grid. The tile can be represented by Equation (2); here, l is the level of the pyramid; X, Y denote the tile's column/row number within this level; and the last term represents the set of pixels within this n × n tile.
A collection of M × N tiles is grouped into one layer, after which all L layers are stacked to form a multi-scale spatial pyramid, as shown in Equation (3).

Operations:
(1) Extract
The Extract operation converts one or more raw data records into a pixel at the bottom level, as shown in Equation (4); here, l_n refers to the bottom level of the pyramid; lon, lat represent the geographical position of the original records; and w is the pixel resolution of the output tile.
(2) Group
The Group operation establishes the direct relationship between the pixels and the target tile. As shown by Equation (5), it groups the related pixels to enable the tile to be derived.
(3) Aggregate
The above-mentioned Extract and Group operations are combined to produce tiles at the bottom level of the pyramid. Unlike these two operations, which function at the bottom level, the Aggregate operation is conducted level by level from the bottom to the top to calculate all the remaining tiles in the whole pyramid. The Aggregate operation is shown in Equation (6).
(4) Key-value Pair (KVP)
For each tile in the pyramid, a key can be generated by using a space-filling curve (e.g., the Z-curve), and the set of pixels of the tile is packaged as the value, as shown in Equation (7). The key-value conversion is conducted for later persistent storage.
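As a rough sketch of the Extract and KVP operations, the code below maps a record's longitude/latitude to a tile and pixel at the bottom level and builds a Z-order key. It assumes Web Mercator tiling with 256 × 256 tiles, whereas the paper's spatial grid may instead be GeoHash-based; the function names and layout are illustrative, not the authors' implementation.

```python
import math

TILE_SIZE = 256  # pixels per tile side, as in the paper

def extract(lon, lat, level):
    """Extract: map one raw record (lon, lat) to its tile (X, Y) and pixel (x, y) at a given level.
    Assumes Web Mercator tiling; the paper's actual grid scheme may differ (e.g., GeoHash)."""
    n = 2 ** level * TILE_SIZE                      # world width/height in pixels at this level
    px = (lon + 180.0) / 360.0 * n
    py = (1.0 - math.log(math.tan(math.radians(lat)) +
                         1.0 / math.cos(math.radians(lat))) / math.pi) / 2.0 * n
    X, x = divmod(int(px), TILE_SIZE)
    Y, y = divmod(int(py), TILE_SIZE)
    return (level, X, Y), (x, y)

def z_order_key(level, X, Y):
    """KVP: build a key by interleaving the bits of the tile column/row (Z-curve)."""
    key = 0
    for i in range(level):
        key |= ((X >> i) & 1) << (2 * i + 1)
        key |= ((Y >> i) & 1) << (2 * i)
    return (level << 48) | key                      # prefix the level so keys sort level-first

tile, pixel = extract(-73.98, 40.75, level=12)      # hypothetical record near Manhattan
print(tile, pixel, z_order_key(*tile))
```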

Multi-Dimensional Aggregation Pyramid Model
As illustrated in Figure 4, the S × T × A cube and the 2D tile pyramid are combined to define the multi-dimensional aggregation pyramid (MAP) model; this enables the hierarchical aggregation to be achieved not only on the spatial dimension but also on the temporal and attribute dimensions.

Similarly, the new concepts involved in the proposed multi-dimensional aggregation pyramid are as follows:

(1) Flattened Attribute Aggregation Tree (faa_tree)

The aggregation of the different attributes of the spatio-temporal objects results in a tree-shaped structure, the faa_tree, after which the derived tree structure is flattened using a breadth-first traversal to simplify later access and also preserve level locality. As illustrated by Figure 5, there are four taxi records in the sample table, each of which has three attributes, namely, guest, car color, and driver gender. The faa_tree of these four records, as shown on the right side, is derived through hierarchical aggregation; subsequently, the faa_tree is flattened into a fixed-length one-dimensional (1D) array (as expressed by Equation (8)), in which the total count is arranged in the first position. Compared to the original tree structure, the fixed-length 1D array is much more convenient for subsequent storage and processing (a minimal sketch of this flattening is given after the operation list below).

faa_tree = {a_all, a_1, a_2, ..., a_n}    (8)

(2) Spatio-temporal Pixel (st-pixel)

We define the smallest S × T × A cube as the Spatio-temporal Pixel (st-pixel), as shown in Figure 6. Due to the existence of the faa_tree, the st-pixel is actually a 4D unit structure; it can be located by both the spatial coordinates (longitude x, latitude y) and the temporal position t, and is also labeled with a cell from the faa_tree array. As shown in Equation (9), t represents the transformed coordinate in the predefined temporal granularity, and a_j is one cell in the faa_tree, while l, x, y are the same as in the 2D pixel.

(3) Spatio-Temporal Tile (st-tile)

The S × T × A cubes in a lower level can be aggregated level by level into the cubes in the higher level. Since the st-pixel is a 4D unit structure, it is inconvenient for 2D map display in the client. A similar 2D concept, the Spatio-Temporal Tile (st-tile), is therefore defined in Figure 7. Like a 2D tile composed of pixels, an st-tile contains a collection of st-pixels that share the same time tag and attribute value. The tile size is still 256 × 256. In Equation (10), l also refers to the level in the pyramid; X and Y refer to the row and column of the tile; t_k represents one position in the discretized temporal intervals; and a_j is still one cell in the faa_tree.

st-tile = ([l, X, Y], t_k, a_j)  (t_k ∈ T, a_j ∈ faa_tree)    (10)

In a similar way to the above, the basic operations involved in constructing the proposed multi-dimensional aggregation pyramid are listed as follows:

(1) Extract

Both the geographic location information and the temporal information are transformed into the st-tile's coordinates, while all attribute information is used to generate and serialize the st-tile's faa_tree, as shown in Equation (11).

(2) Group

The Group operation is conducted in two steps, as shown in Equation (12): firstly, the mapping relationship between st-pixel and st-tile is established so that it can be determined which st-tile each st-pixel should belong to; secondly, all st-pixels are reduced into one st-tile, and a new faa_tree is calculated from all the st-pixels.

Group(st-pixel([l_n, x, y, t])) = st-tile([l, X, Y], t_k, a_j)    (12)

(3) Aggregate

As also illustrated in Figure 7, by integrating both the cube aggregation and face aggregation functions of the S × T × A cube, we can define the aggregate operation for the MAP construction. Like the aggregate operation of the 2D tile pyramid, it is conducted level by level from the bottom to the top, enabling all st-tiles in the pyramid to be obtained. The aggregate operation is shown in Equation (13).
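The following sketch illustrates the faa_tree concept on hypothetical taxi records with the three attributes named in Figure 5 (guest, car color, driver gender). The category values and the exact tree layout are assumptions, since Figure 5 is not reproduced here, but the flattening follows Equation (8): the total count is placed first and the remaining cells are produced by a breadth-first traversal over the attribute levels.

```python
from collections import OrderedDict

# Hypothetical taxi records with the three attributes named in Figure 5
# (the actual category values used in the paper's example are not reproduced here).
records = [
    {"guest": "with",    "car_color": "yellow", "gender": "male"},
    {"guest": "with",    "car_color": "green",  "gender": "female"},
    {"guest": "without", "car_color": "yellow", "gender": "male"},
    {"guest": "with",    "car_color": "yellow", "gender": "male"},
]

# Fixed category order so every faa_tree has the same length and cell meaning.
SCHEMA = OrderedDict([
    ("guest",     ["with", "without"]),
    ("car_color", ["yellow", "green"]),
    ("gender",    ["male", "female"]),
])

def build_faa_tree(recs):
    """Aggregate attribute counts level by level and flatten breadth-first:
    the total count comes first, followed by one cell per (attribute, category)."""
    flat = [len(recs)]                                   # a_all: total record count
    for attr, categories in SCHEMA.items():              # breadth-first over the attribute levels
        for cat in categories:
            flat.append(sum(1 for r in recs if r[attr] == cat))
    return flat

print(build_faa_tree(records))   # [4, 3, 1, 3, 1, 3, 1] for the sample records above
```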

The MAP-Vis Framework
Following the proposed MAP model, we implement a prototype system for spatiotemporal data visualization, which we refer to as MAP-Vis. As shown in Figure 8, the MAP-Vis system adopts the B/S architecture and can be divided into three components; from the bottom to the top, these are high-performance clusters, middleware, and visualization clients.

The underlying Linux cluster is mainly responsible for the generation and storage of the multi-dimensional pyramid from the raw dataset. The middleware parses the query requests from the client through a data access interface, then filters out the results from the database for client visualization purposes. The client is used to visualize the heat map, time series, and attribute histogram in different views.

The Generation of the MAP Pyramid
During the construction of the MAP pyramid, the input raw dataset is processed and hierarchically aggregated from bottom to top. First, the bottom layer is calculated, after which the upper layers are iteratively derived layer by layer. The pseudocode of the hierarchical aggregation process is presented in Algorithm 1; moreover, the complete preprocessing steps are explained as follows:
Step 1: Define the required input parameters, including the maximum pyramid level l_n and the time granularity t_b. Via the Extract operation defined in Section 3.2, the spatial and temporal coordinates are extracted from the original records along with the faa_tree. As a result, the discrete st-pixels are obtained.
Step 2: With the help of Spark's map/reduce functions, the st-pixels within the same st-tile, the same time interval t_k, and the same faa_tree cell a_j are grouped into one st-tile using the Group operation.
Step 3: The st-tile level l, the space-filling curve code of the tile k, the time t, and the faa_tree cell are joined to generate a Key, after which the st-tile is assigned as a Value to form a key-value pair with the KVP operation introduced in Section 3.2. Finally, the bottom level l_n of the MAP pyramid is obtained.
Step 4: After Step 3 is complete, the iterative aggregation is conducted continuously to obtain levels l_{n-1}, l_{n-2}, ..., 1 layer by layer and construct the multi-dimensional aggregation pyramid using the Aggregate operation.
Since all the steps are pure loops, they can be implemented with Spark's map/reduce functions and executed on a distributed Linux cluster.
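The paper's Algorithm 1 is not reproduced here; as a rough illustration, the following PySpark-style sketch walks through Steps 1, 2 and 4 under simplifying assumptions: an equirectangular grid instead of the paper's actual spatial indexing, day granularity, and a single categorical attribute in place of the full faa_tree. All helper names (to_pixel, to_parent, etc.) are illustrative and not the authors' implementation, which runs on a Spark cluster and may well be written in Scala/Java.

```python
from datetime import datetime, timezone
from pyspark import SparkContext

TILE_SIZE = 256
L_BOTTOM = 10                      # maximum pyramid level l_n (assumed value)

def to_time_bin(ts):
    """Discretize a UNIX timestamp at day granularity t_b (assumed granularity)."""
    return datetime.fromtimestamp(ts, tz=timezone.utc).strftime("%Y-%m-%d")

def to_pixel(lon, lat, level):
    """Map lon/lat to (tile X, pixel x), (tile Y, pixel y) on a simple equirectangular grid;
    the real system may use a different grid (e.g., Web Mercator tiles or GeoHash cells)."""
    n = 2 ** level * TILE_SIZE
    px = min(int((lon + 180.0) / 360.0 * n), n - 1)
    py = min(int((90.0 - lat) / 180.0 * n), n - 1)
    return divmod(px, TILE_SIZE), divmod(py, TILE_SIZE)

def extract(record):
    """Step 1 (Extract): one raw record -> one bottom-level st-pixel with count 1."""
    lon, lat, ts, attr = record
    (X, x), (Y, y) = to_pixel(lon, lat, L_BOTTOM)
    return ((L_BOTTOM, X, Y, to_time_bin(ts), attr), (x, y)), 1

def merge_pixels(p, q):
    """Merge two pixel dicts {(x, y): count} by summing counts."""
    out = dict(p)
    for k, c in q.items():
        out[k] = out.get(k, 0) + c
    return out

def to_parent(kv):
    """Step 4 helper: push one st-tile one level up (coarser) in the pyramid."""
    (l, X, Y, t, a), pix = kv
    merged = {}
    for (x, y), c in pix.items():
        gx, gy = (X * TILE_SIZE + x) // 2, (Y * TILE_SIZE + y) // 2   # coarser global pixel coords
        key = (gx % TILE_SIZE, gy % TILE_SIZE)
        merged[key] = merged.get(key, 0) + c
    return (l - 1, X // 2, Y // 2, t, a), merged

if __name__ == "__main__":
    sc = SparkContext(appName="map-pyramid-sketch")
    # Hypothetical in-memory sample; a real run would read the raw dataset from HDFS.
    raw = sc.parallelize([(-73.98, 40.75, 1400000000, "yellow"),
                          (-73.97, 40.74, 1400003600, "green")])

    # Steps 1-2: Extract st-pixels, then Group pixels of the same st-tile into one tile.
    st_pixels = raw.map(extract).reduceByKey(lambda a, b: a + b)
    tiles = (st_pixels.map(lambda kv: (kv[0][0], {kv[0][1]: kv[1]}))
                      .reduceByKey(merge_pixels))

    # Step 4: Aggregate level by level from the bottom to the top.
    # Step 3 (building the rowkey and bulk-loading into HBase) is omitted in this sketch.
    pyramid = {L_BOTTOM: tiles}
    for level in range(L_BOTTOM - 1, 0, -1):
        tiles = tiles.map(to_parent).reduceByKey(merge_pixels)
        pyramid[level] = tiles

    print(pyramid[1].first())
```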

Distributed Storage of the MAP Pyramid in HBase
In order to improve the storage scalability of MAP-Vis, we chose HBase (a distributed, column-oriented non-relational database) as the underlying database. Unlike traditional relational databases, data in HBase are organized as a sorted list of key-value pairs. Each row stores one object, identified by a designed unique rowkey, while each column belongs to a particular column family. At the physical level, the data are sorted by rowkey, column name, and timestamp; thus, the design of the rowkey and column families in HBase is vital to the efficiency of the multidimensional query.
The number of pyramid levels in the spatial dimension is usually large (e.g., 12 levels with GeoHash), resulting in an explosive increase in the number of decomposed spatial cells. For time-series data, moreover, the number of records along the temporal dimension is also very large; in contrast, the attached attributes are usually of limited size and the faa_tree is short. Accordingly, the spatial and temporal information are placed together in the row dimension, while the faa_tree information is placed in the column families.
The spatial coordinates of each record are first encoded into an sfc_code using the Z-curve, one of the most widely used space-filling curves. As shown in Table 1, the level l, the sfc_code, and the time interval t are then concatenated to form the rowkey of each row, and two column families (Heatmap and Sum) are defined. Each column of the Heatmap column family is set according to a faa_tree node and used to store the st-tiles, while each column of the Sum column family is used to store the total value of all st-pixels at time interval t. A rowkey design of this kind exhibits high spatio-temporal aggregation and ensures that objects that are close in spatio-temporal distance are stored close together on disk. This storage schema can also fully leverage the column-oriented advantages of HBase and scales well with the dataset size.
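The exact byte layout of the rowkey is not given in the text above; the following sketch shows one way the rowkey of Table 1 (level, sfc_code, time interval) might be composed with fixed-width big-endian fields so that HBase's lexicographic rowkey ordering keeps spatio-temporally close tiles adjacent. The field widths and the time encoding (a day index) are assumptions for illustration only.

```python
import struct

def z_order(X, Y, level):
    """Interleave the bits of the tile column/row to obtain the Z-curve code."""
    code = 0
    for i in range(level):
        code |= ((X >> i) & 1) << (2 * i + 1)
        code |= ((Y >> i) & 1) << (2 * i)
    return code

def make_rowkey(level, X, Y, time_bin):
    """Rowkey = level | sfc_code | time interval, each fixed-width and big-endian so that
    byte-wise lexicographic order matches the intended level-space-time ordering.
    Field widths are illustrative, not the paper's exact layout."""
    return (struct.pack(">B", level)
            + struct.pack(">Q", z_order(X, Y, level))
            + struct.pack(">I", time_bin))

# Column families as in Table 1: "Heatmap" holds one column per faa_tree node (the st-tile),
# while "Sum" holds the per-time-interval totals.
rowkey = make_rowkey(12, 1205, 1539, 17532)        # hypothetical tile and day index
print(rowkey.hex())
```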

Multidimensional Query for Interactive Visualization
Generally, HBase supports two methods of retrieving data: Get and Scan. The Get operation is performed with a given rowkey to fetch a single row from the HBase table, while the Scan operation filters the entire table with a rowkey range defined by a start-key and an end-key. Our multidimensional query is implemented with HBase Scan operations. The pseudocode of the multidimensional query is illustrated in Algorithm 2, and the details are explained below:
(1) The interactive visualization operations produce a collection of requests, which are sent to the server separately in an asynchronous mode. When the server receives a query request, the middleware parses the request parameters into space/time/attribute filters.
(2) For the spatial filters, the query geometry is recursively divided into multi-level SFC cells, and the translated spatial ranges are calculated using our recursive approximation algorithm, published in [35].
(3) The temporal and attribute filters are discretized and appended to the translated spatial ranges to derive the final scan ranges according to the rowkey schema (a sketch of this step is given at the end of this section).
(4) The scan ranges are sent to HBase for Scan operations, filtering out false-positive data from the initial scan results and aggregating the final query results for client rendering.
In order to avoid excessive I/O communication between servers and clients, the mechanism of the HBase CoProcessor is used to implement the above-mentioned multidimensional query algorithm, as illustrated in Figure 9. HBase's coprocessors, which are modeled after Google's BigTable coprocessors, resemble the stored procedures in relational databases and can be invoked as a service by the client at any time. When invoked, the coprocessor program is executed remotely at the target HBase regions, and the execution results can be processed before being returned to the client. Here, the HBase CoProcessor mainly accelerates the exclusion of false-positive points and the post-processing before visualization in Step (4) above.
Figure 10 depicts the visualization interface of the proposed MAP-Vis framework, which includes three correlated views: a map view, a timeline view, and an attribute histogram view. In the map view, the brightness indicates the spatial distribution of the spatial objects, i.e., a brighter area contains denser objects, while a darker area means relatively sparse ones. The MAP-Vis framework is implemented using several open-source projects and libraries: Leaflet and OpenStreetMap are used for the background map display, while the timeline and attribute histogram use the d3 library to support user interaction.
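Returning to Step (3) above, the sketch below shows how the translated spatial SFC ranges might be combined with a discretized temporal filter into HBase scan start/stop keys, reusing the illustrative rowkey layout from the previous section. The actual range-translation algorithm is the recursive approximation published in [35] and is not reproduced here, and no real HBase client call is shown.

```python
import struct

def scan_ranges(level, sfc_ranges, day_start, day_end):
    """Combine spatial SFC ranges (output of the range-translation step) with the
    discretized temporal filter into (start_key, stop_key) pairs for HBase Scan.
    Assumes the illustrative rowkey layout level | sfc_code | day used above."""
    prefix = struct.pack(">B", level)
    ranges = []
    for sfc_lo, sfc_hi in sfc_ranges:
        start = prefix + struct.pack(">Q", sfc_lo) + struct.pack(">I", day_start)
        stop = prefix + struct.pack(">Q", sfc_hi) + struct.pack(">I", day_end + 1)
        # Rows with an intermediate sfc_code but an out-of-range day still fall inside
        # this scan; they are the false positives filtered out afterwards (Step 4).
        ranges.append((start, stop))
    return ranges

# Hypothetical query: two SFC ranges at level 12, day indices 17532-17561 (about one month).
for start, stop in scan_ranges(12, [(0x3A0000, 0x3A0FFF), (0x3B2000, 0x3B20FF)], 17532, 17561):
    print(start.hex(), "->", stop.hex())
```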


Experimental Environment and Datasets
The MAP-Vis framework is deployed on a nine-node Linux cluster. Each node in this cluster is equipped with two six-core Intel Xeon E5-2620 2 GHz CPUs and 32 GB of memory and is connected via 1 Gb Ethernet. The operating system is 64-bit CentOS 6.2; the required HBase 1.2 and Spark 1.6 are also installed.
To determine the efficiency and performance of the proposed MAP-Vis framework, we selected four datasets ranging in size from four million to over one billion records. Each of these datasets includes geospatial, temporal, and domain-specific attribute dimensions. A summary of all datasets, including the number of records, the original data size, the timespan, and the data size in the database, is presented in Table 2. As shown in Table 2, the size of the generated MAP data for NYC Taxi is about half that of the raw dataset, while for the other three datasets it increases by about three to 80 times. The size difference can be attributed to the sparsity of the input datasets: denser, more compact data become smaller after aggregation, while sparser data become much larger owing to blank filling.

Validation of the MAP Model's Efficiency
The client users' interaction operations on the map (zoom in/out and pan) generate a sequence of spatial/temporal/attribute filters to retrieve the st-tiles from the underlying HBase database. To simulate the real interaction process, 16 filters with different spatial and temporal sizes were defined in this experiment, combining four spatial rectangles (illustrated in Figure 11), three temporal spans (day/month/year), and all or a single attribute.

The average database response time is recorded in Table 3. From the table, it can be seen that as the size of the spatio-temporal extent increases, the response time remains at around 10 ms and does not show a significant increase. This indicates that the MAP model scales well with different spatio-temporal query sizes and can guarantee a millisecond-level response. Therefore, the pre-aggregation model guarantees a constant query time for the different spatial-temporal filters generated by any interaction event (zoom in/out or pan).

Figure 11. The different spatial extents for queries displayed on OpenStreetMap.

Experiments on Data Preprocessing Capability
During the data preprocessing, the number of worker nodes in the Spark cluster is varied to measure the change in pre-processing time (excluding the time required to import the results into the HBase database). This experiment uses a total of 54 GB of NYC Taxi data spanning about 30 months (from January 2014 to June 2016). All the observed preprocessing times are shown in Figure 12. From the figure, it can be seen that increasing the number of Spark worker nodes leads to an obvious reduction in pre-processing time, from 360 min to 160 min, while the efficiency is improved by about 1.3 times. Therefore, with more computing nodes, the Spark cluster has more worker nodes available to share the pre-processing tasks, thereby significantly improving the pre-processing efficiency. However, once the number of Spark nodes exceeds eight, the speedup increases very slowly. This phenomenon coincides with the well-known Amdahl's law and shows that some pre-processing operations cannot be parallelized between nodes, e.g., I/O communication.
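As a rough illustration of the Amdahl's-law argument, the snippet below estimates the non-parallelizable fraction of the preprocessing, assuming (purely for illustration, since the exact node counts belong to Figure 12) that the 360 min figure corresponds to one worker and the 160 min figure to eight workers.

```python
def amdahl_speedup(serial_fraction, n):
    """Amdahl's law: overall speedup with n workers when a fraction of the work is serial."""
    return 1.0 / (serial_fraction + (1.0 - serial_fraction) / n)

# Assumption for illustration: 360 min on 1 worker vs. 160 min on 8 workers.
observed = 360 / 160                     # ~2.25x observed speedup
n = 8
# Solve 1/S = f + (1 - f)/n for the serial fraction f.
f = (1 / observed - 1 / n) / (1 - 1 / n)
print(f"estimated serial fraction: {f:.2f}")              # ~0.37 (e.g., I/O communication)
print(f"speedup ceiling with many nodes: {1 / f:.1f}x")   # why extra nodes help less and less
```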


Conclusions and Future Work
Owing to the massive volume, high number of dimensions, and spatial-temporal correlations involved, spatio-temporal big data poses many visualization challenges, including large memory consumption, high rendering delay, and poor visual effects. Accordingly, in order to solve these problems, this paper proposes a novel spatio-temporal big data organization model, namely, the multi-dimensional aggregation pyramid (MAP), which integrates the merits of both the tile pyramid model and the key-value pair model. The MAP model supports quick aggregation on the space, time, and attribute dimensions, as well as efficient distributed storage with key-value conversion. Based on the proposed MAP model, an open-source visualization framework, MAP-Vis, is implemented on a Linux cluster. The MAP-Vis system uses Spark as a pre-processing tool and HBase as a distributed storage platform. Several key features, namely, distributed storage and distributed computation, enable our solution to scale to large datasets.
MAP-Vis realizes millisecond-level multidimensional querying and achieves smooth interactive visualization. Experimental results validate the efficiency of both the MAP model and the MAP-Vis framework, which together provide high scalability in processing capability and online visualization.
Future work will explore the following aspects: (1) Our research currently concentrates on massive point data; polyline or polygon types of spatio-temporal big data (e.g., vehicle trajectories) are not yet well addressed, so we will extend the proposed framework to accommodate these more complex data types. (2) Other acceleration mechanisms, e.g., GPUs, will be considered for quicker pre-processing; in addition, parallel generation of query ranges will also be implemented to improve the query efficiency. (3) Direct GPU-based rendering methods such as WebGL will be further used to increase the data visualization speed.