A Parallel N-Dimensional Space-Filling Curve Library and Its Application in Massive Point Cloud Management

Because of their locality preservation properties, Space-Filling Curves (SFC) have been widely used in massive point dataset management. However, the completeness, universality, and scalability of current SFC implementations are still not well resolved. To address this problem, a generic n-dimensional (nD) SFC library is proposed and validated in massive multiscale nD points management. The library supports two well-known types of SFCs (Morton and Hilbert) with an object-oriented design, and provides common interfaces for encoding, decoding, and nD box query. Parallel implementation permits effective exploitation of underlying multicore resources. During massive point cloud management, all xyz points are attached an additional random level of detail (LOD) value l. A unique 4D SFC key is generated from each xyzl with this library, and then only the keys are stored as flat records in an Oracle Index Organized Table (IOT). The key-only schema benefits both data compression and multiscale clustering. Experiments show that the proposed nD SFC library provides complete functions and robust scalability for massive points management. When loading 23 billion Light Detection and Ranging (LiDAR) points into an Oracle database, the parallel mode takes about 10 h and the loading speed is estimated four times faster than sequential loading. Furthermore, 4D queries using the Hilbert keys take about 1~5 s and scale well with the dataset size.


Introduction
Space-Filling Curves (SFC) map a compact interval to a multidimensional space by passing through every point of the space.They exhibit good locality preservation properties that make them useful for partitioning or reordering data and computations [1,2].Therefore, SFCs have been widely used in a number of applications, including parallel computing [3,4], file storage [5], database indexing [6][7][8], and image retrieval [9,10].SFCs have been also proven a useful solution for massive points management [11].It first maps spatial data objects into one-dimensional (1D) values and then indexes those values using a one-dimensional indexing technique, typically the B-tree.SFC-based indexing requires only mapping functions and incurs no additional efforts in current databases compared with conventional spatial index structures.
Although many classical SFC generation methods have been put forward-recursive, byte-oriented, and table-driven-these methods mainly focused on the efficient generation of n-dimensional (nD) SFC values.Very little existing work focus on efficient query support with SFC values [12,13], so there are no complete nD SFC libraries available both for mapping and query.Furthermore, when faced with millions of input points, serial generation of space-filling curves will hinder scalability.In order to improve scalability, a parallel SFC library is needed for practical use.
To address the above-mentioned problems of completeness, universality, and scalability, a generic nD SFC library is proposed and validated in massive multiscale nD point management (open source available from http://sfclib.github.io/).Designed with object-oriented programming (OOP), it provides abstract objects for SFC encoding/decoding, pipeline bulk loading, and nD box queries.The library currently supports two well-known types of SFC-Morton and Hilbert curves-but could be easily extended to other types.During implementation, the SFC library exploits the parallelism of multicore CPU processors to accelerate all the SFC functions.
The application of this nD SFC library to massive point cloud management was carried out on a bulk loading and geometry query.During bulk loading, a random level of detail (LOD) value l is calculated for each point, following a data pyramid distribution in a streaming mode.This LOD value l is treated as the 4th dimension added to the xyz coordinates.A unique SFC key is also generated with the proposed SFC library for point clustering.Then, only the generated keys are stored as flat records in an Oracle Index Organized (IOT) table without repeating the original values for x, y, z, and l as they can be completely obtained from the key value (by the decode function).This key-only storage schema is much more compact and achieves better data compression.The nD range query is also conducted with the help of these unique SFC keys.
Loading and query experiments show that the proposed nD SFC library is very efficient, exhibiting robust scalability over massive point datasets.For example, when loading 23 billion LiDAR (Light Detection and Ranging) points into Oracle, the parallel mode takes only 10 h, and the loading speed is an estimated four times faster than sequential loading.Further, 4D queries with Hilbert keys take about 1~5 s and scale well the input data size.
The rest of the paper is organized as follows.A general description of the Space-Filling Curve and a review of current nD indexing/clustering methods are presented in Section 2. Section 3 explains the fundamentals of our proposed generic nD SFC library and the application of our SFC library to massive multiscale point management.Section 4 provides the results of load and query performance evaluation and Section 5 discusses the obtained results.Section 6 presents conclusions and future research directions.

Space-Filling Curves and Current Implementations
In mathematics, a space-filling curve is a continuous bijection from the hypercube in nD space to a 1D line segment, i.e., C : R n → R [14].The nD hypercube is of the order m if it has a uniform side length 2m.Analogously, the curve C also has an order m and its length equals to the total number of 2 n*m cells, shown in Figure 1.
Point-to-point mapping (encoding/decoding) and box-to-segment mapping (querying) functions are both needed for normal SFC applications.For point-to-point mapping functions, three types of classical SFC generation methods have been put forward: recursive, byte-oriented, and table-driven.Because of self-similarity, recursive generation of space-filling curves in lower dimensional spaces has been extensively studied [14,15].Butz's byte-oriented algorithm uses several bit operations such as Shifting and Exclusive OR, and can be theoretically applied to any-dimension numbers [7,16].Table-driven algorithms define a look-up table first to quickly generate multidimensional SFCs on the fly [17].However, the structure of the look-up table will be quite complicated when extended to higher dimensions, e.g., 4D+.
For box-to-segment mapping functions, few works have been carried out; those that have can be categorized as iterative and recursive.The iterative algorithms find target intervals of curve values in an iterative way by repeatedly calling two functions: one function to compute the next SFC value inside the window and another function to compute the next SFC value outside the window query [12].Wu presented an efficient algorithm for 2D Hilbert curves by recursively decomposing the query window [13].The latter type is much more efficient than the former one, but it is sequential and limited to the 2D case; we should extent it to nD support and parallel mode.
evaluation and Section 5 discusses the obtained results.Section 6 presents conclusions and future research directions.

Space-Filling Curves and Current Implementations
In mathematics, a space-filling curve is a continuous bijection from the hypercube in nD space to a 1D line segment, i.e., : n C R R → [14].The nD hypercube is of the order m if it has a uniform side length 2m.Analogously, the curve C also has an order m and its length equals to the total number of 2 n*m cells, shown in Figure 1.

Massive LiDAR Point Cloud Management
With the development of spatial acquisition technologies such as airborne or terrestrial laser scanning, point clouds with millions, billions, or even trillions of points are now generated [18].Each point in these point datasets contains not only 3D coordinates but also other attributes, such as intensity, number of returns, scan direction, scan angle, and RGB values.These massive points can therefore be understood and treated as typical multidimensional data, as each point record is an n-tuple.Storage, indexing, and query on these massive multidimensional points poses a big challenge for researchers [19].
Traditional file-based management solutions store point data in specific formats (e.g., ASPRS LAS), but data isolation, data redundancy, and application dependency in such data formats are major drawbacks.The Database Management System (DBMS)-based solutions can be categorized into two types [11]: block and flat-table models.In the block model, the point datasets are partitioned into regular tiles, and each tile is stored as a binary large object (BLOB).Some common relational databases, e.g., Oracle and PostgreSQL, even provide intrinsic abstract objects and SQL extensions for this type of storage model.The open-source library PDAL (Point Data Abstraction Library) can facilitate the manipulation of blocked points in these databases.In the flat-table model, points are directly stored in a database table, one row per point, resulting in tables with many rows [20].Three columns in a table store X/Y/Z spatial coordinates, while other columns accommodate additional attributes.This flat-table model is easy to implement and very flexible for query and manipulation.However, there are no efficient indexing/clustering methods available for high-dimensional datasets in the currently used relational databases.

nD Spatial Indexing Methods
Existing nD spatial indices can be classified into two categories: explicit dynamic trees and implicit fixed trees.
An explicit dynamic tree for nD cases maintains a dynamic balanced tree and adaptively adjusts the index structures according to input features to produce a better query performance [21,22].However, this adjustment will degrade index generation performance, especially when faced with concurrent insertions.This category includes the R-tree and its variants, illustrated in Figure 2.An implicit fixed tree for nD cases relies on predefined space partitioning, such as grid-based methods [23] and Space-Filling Curves, as illustrated in Figure 3.For example, Geohash [24], as a Zorder curve, recursively defines an implicit quadtree over the worldwide longitude-latitude rectangle and divides this geographic rectangle into a hierarchical structure.Then, GeoHash uses a 1D Base32 string to represent a 2D rectangle for a given quadtree node.GeoHash is widely implemented in many geographic information systems (e.g., PostGIS), and is also used as a spatial indexing method in many NoSQL databases (e.g., MongoDB, HBase) [25].Indices based on implicit fixed trees have benefits over explicit dynamic tree methods in two respects.Firstly, the 1D indexing methods, e.g., B-tree, are very mature and are supported in all commercial DBMSs.Thus, this type of mapping-based index can be easily integrated into any existing DBMS (SQL or NoSQL, even without spatial support).No additional work is required to modify the index structure, concurrency controls, or query execution modules in the underlying DBMS.Secondly, an implicit fixed tree does not need to build a whole division tree in practical use.When calculating indexing keys, it only needs the coordinates of each point without involving other neighboring points.It is much more scalable for managing large-volume point datasets.

The Design of the nD SFC Library
The open-source generic nD SFC library was designed in consideration of object-oriented programming and implemented with C++ template features.This allows library codes to be structured in an efficient way to enhance readability, maintainability, and extensibility.The components of this SFC library are illustrated in Figure 4.It contains general data structures (e.g., An implicit fixed tree for nD cases relies on predefined space partitioning, such as grid-based methods [23] and Space-Filling Curves, as illustrated in Figure 3.For example, Geohash [24], as a Z-order curve, recursively defines an implicit quadtree over the worldwide longitude-latitude rectangle and divides this geographic rectangle into a hierarchical structure.Then, GeoHash uses a 1D Base32 string to represent a 2D rectangle for a given quadtree node.GeoHash is widely implemented in many geographic information systems (e.g., PostGIS), and is also used as a spatial indexing method in many NoSQL databases (e.g., MongoDB, HBase) [25].An implicit fixed tree for nD cases relies on predefined space partitioning, such as grid-based methods [23] and Space-Filling Curves, as illustrated in Figure 3.For example, Geohash [24], as a Zorder curve, recursively defines an implicit quadtree over the worldwide longitude-latitude rectangle and divides this geographic rectangle into a hierarchical structure.Then, GeoHash uses a 1D Base32 string to represent a 2D rectangle for a given quadtree node.GeoHash is widely implemented in many geographic information systems (e.g., PostGIS), and is also used as a spatial indexing method in many NoSQL databases (e.g., MongoDB, HBase) [25].Indices based on implicit fixed trees have benefits over explicit dynamic tree methods in two respects.Firstly, the 1D indexing methods, e.g., B-tree, are very mature and are supported in all commercial DBMSs.Thus, this type of mapping-based index can be easily integrated into any existing DBMS (SQL or NoSQL, even without spatial support).No additional work is required to modify the index structure, concurrency controls, or query execution modules in the underlying DBMS.Secondly, an implicit fixed tree does not need to build a whole division tree in practical use.When calculating indexing keys, it only needs the coordinates of each point without involving other neighboring points.It is much more scalable for managing large-volume point datasets.

The Design of the nD SFC Library
The open-source generic nD SFC library was designed in consideration of object-oriented programming and implemented with C++ template features.This allows library codes to be structured in an efficient way to enhance readability, maintainability, and extensibility.The components of this SFC library are illustrated in Figure 4.It contains general data structures (e.g., Indices based on implicit fixed trees have benefits over explicit dynamic tree methods in two respects.Firstly, the 1D indexing methods, e.g., B-tree, are very mature and are supported in all commercial DBMSs.Thus, this type of mapping-based index can be easily integrated into any existing DBMS (SQL or NoSQL, even without spatial support).No additional work is required to modify the index structure, concurrency controls, or query execution modules in the underlying DBMS.Secondly, an implicit fixed tree does not need to build a whole division tree in practical use.When calculating indexing keys, it only needs the coordinates of each point without involving other neighboring points.It is much more scalable for managing large-volume point datasets.

The Design of the nD SFC Library
The open-source generic nD SFC library was designed in consideration of object-oriented programming and implemented with C++ template features.This allows library codes to be structured in an efficient way to enhance readability, maintainability, and extensibility.The components of this SFC library are illustrated in Figure 4.It contains general data structures (e.g., Point and Rectangle), core SFC-related classes (e.g., CoordTransform, SFCConv, OutputSchema, RangeGen, and SFCPipeline), and other auxiliary objects (e.g., RandLOD).

ISPRS Int. J. Geo-Inf. 2018 5 of 19
Point and Rectangle), core SFC-related classes (e.g., CoordTransform, SFCConv, OutputSchema, RangeGen, and SFCPipeline), and other auxiliary objects (e.g., RandLOD).The Point class is used to represent the input points for SFC encoding/decoding, while the Rectangle class supports nD range queries with SFC keys.Both classes are generic and easily extended to any dimensions.
The CoordTransform class converts the coordinates between geographic space and SFC space, i.e., from float type to integer type.Two transformation modes are supported here: translation and scaling.During coordinate transformation, users first define the translation distances and the scaling coefficients, which can be different for each dimension.
The abstract SFCConv class can be inherited to implement different SFC curves and provides the interface for SFC encoding/decoding.This class converts the input SFC space coordinates into a long bit sequence, and then the bit sequence will be encoded into the target code type, e.g., 256-bit number or hash string by the OutputSchema class.The OutputSchema class also supports different string schemas, e.g., Base32 or Base64.The RangeGen class provides the required range query interfaces through which the users input an nD box and a set of 1D SFC ranges are outputted.
Currently, the SFC Library implements two types of SFC curves: Morton/Z-order and Hilbert.The Morton curve only interleaves the input SFC coordinates, so its encoding and decoding is trivial.Butz and Lawder's byte-oriented methods are used for efficient Hilbert encoding/decoding [7,14,16].The details of Hilbert encoding/decoding are presented in Appendix A.

The nD Box Query with SFC Keys
The objective of an nD box query can be stated as follows: Given an R n input box B defined by min and max coordinates in every dimension, the query operation Q(B) returns those points Sk which are fully contained by the nD box B. Because all points are indexed with 1D SFC keys, the box query is equivalent to translating nD input box B into a collection of 1D SFC key ranges K and filtering target points using these derived key ranges.The Point class is used to represent the input points for SFC encoding/decoding, while the Rectangle class supports nD range queries with SFC keys.Both classes are generic and easily extended to any dimensions.
The CoordTransform class converts the coordinates between geographic space and SFC space, i.e., from float type to integer type.Two transformation modes are supported here: translation and scaling.During coordinate transformation, users first define the translation distances and the scaling coefficients, which can be different for each dimension.
The abstract SFCConv class can be inherited to implement different SFC curves and provides the interface for SFC encoding/decoding.This class converts the input SFC space coordinates into a long bit sequence, and then the bit sequence will be encoded into the target code type, e.g., 256-bit number or hash string by the OutputSchema class.The OutputSchema class also supports different string schemas, e.g., Base32 or Base64.The RangeGen class provides the required range query interfaces through which the users input an nD box and a set of 1D SFC ranges are outputted.
Currently, the SFC Library implements two types of SFC curves: Morton/Z-order and Hilbert.The Morton curve only interleaves the input SFC coordinates, so its encoding and decoding is trivial.Butz and Lawder's byte-oriented methods are used for efficient Hilbert encoding/decoding [7,14,16].The details of Hilbert encoding/decoding are presented in Appendix A.

The nD Box Query with SFC Keys
The objective of an nD box query can be stated as follows: Given an R n input box B defined by min and max coordinates in every dimension, the query operation Q(B) returns those points S k which are fully contained by the nD box B. Because all points are indexed with 1D SFC keys, the box query is equivalent to translating nD input box B into a collection of 1D SFC key ranges K and filtering target points using these derived key ranges.
The space-filling curve can be treated as a space partitioning method.The target nD space is divided into a collection of grid cells recursively until the level of space partitioning reaches m.Each cell is labeled by a unique number which defines the position of this cell in the space-filling curve.The space partitioning process by a space-filling curve can be represented by a complete 2 n -ary tree (i.e., for 2D is a quadtree; for 3D is an octree).Every tree node in this implicit 2 n -ary tree covers an nD box and can be translated to an SFC range.This process is illustrated in Figure 5.
The space-filling curve can be treated as a space partitioning method.The target nD space is divided into a collection of grid cells recursively until the level of space partitioning reaches m.Each cell is labeled by a unique number which defines the position of this cell in the space-filling curve.The space partitioning process by a space-filling curve can be represented by a complete 2 n -ary tree (i.e., for 2D is a quadtree; for 3D is an octree).Every tree node in this implicit 2 n -ary tree covers an nD box and can be translated to an SFC range.This process is illustrated in Figure 5.A recursive range generation method is proposed following this feature.The core idea of this recursive method is to recursively approximate the input nD box with different levels of tree nodes.So, the original objective is further transformed and restated as finding a collection of tree nodes whose combination is equal to the input nD box.The recursive range generation algorithm is explained as follows and also illustrated in Figure 6.
1. Start from the root tree node; 2. Get all 2 n child nodes; 3. Fetch one child node and check the spatial relationship between the input box and this child node:  If it equals to this child, stop here and output this child node;  If it is contained by this child, go down from this child node directly and repeat 2;   The tree nodes thus obtained will be later translated to 1D ranges to fetch the required points.For higher-resolution SFCs, the number of 1D ranges usually exceeds thousands or even millions.A recursive range generation method is proposed following this feature.The core idea of this recursive method is to recursively approximate the input nD box with different levels of tree nodes.So, the original objective is further transformed and restated as finding a collection of tree nodes whose combination is equal to the input nD box.The recursive range generation algorithm is explained as follows and also illustrated in Figure 6.

1.
Start from the root tree node; 2. Get all 2 n child nodes; 3.
Fetch one child node and check the spatial relationship between the input box and this child node: If it equals to this child, stop here and output this child node; If it is contained by this child, go down from this child node directly and repeat 2; If it is intersected by this child, bisect the input box along the middle line in each dimension and obtain new query boxes; If no overlap, repeat 3 and check other child nodes.

4.
Repeat 2 and 3 with intersected children nodes and new query boxes until the leaf level; 5.
Translate the obtained nodes into SFC ranges, merge continuous intervals, and return the derived ranges.
The tree nodes thus obtained will be later translated to 1D ranges to fetch the required points.For higher-resolution SFCs, the number of 1D ranges usually exceeds thousands or even millions.Due to the sparsity of points in SFC cells, most 1D ranges will not filter any point, while too many ranges take a long time to load and filter.Therefore, additional tree traversal depth control and adjacent range merge are conducted before returning ranges.Because tree traversal is done in a breadth-first manner, when the traversal goes down to a new level, the number of currently obtained nodes is checked as to whether it is bigger than K*N (N is the number of returned ranges; K is the extra coefficient).If it exceeds K*N, the traversal is terminated.Thus, Step 5 of the recursive range generation will be extended as: 1.
Get all raw ranges from each tree node (number ≥ K*N); 2.
Sort all the gap distances from raw ranges; 3.
Get the nth gap distance (the default value is 1); 4.
Merge all the gaps which are less than/equal to the nth gap distance.
1. Start from the root tree node; 2. Get all 2 n child nodes; 3. Fetch one child node and check the spatial relationship between the input box and this child node:  If it equals to this child, stop here and output this child node;  If it is contained by this child, go down from this child node directly and repeat 2;  The tree nodes thus obtained will be later translated to 1D ranges to fetch the required points.For higher-resolution SFCs, the number of 1D ranges usually exceeds thousands or even millions.

The Parallelization of the Generic nD SFC Library
Applications of this generic nD SFC library usually face massive datasets; therefore, the available parallelism of multicore processors is exploited to accelerate practical use.Profiling demonstrates that the two most compute-intensive steps are SFC key encoding when bulk loading, and SFC range generation for nD box queries.
For bulk SFC encoding, a raw point dataset is usually two or three orders of magnitude larger than the memory of standard computing hardware.The points cannot be loaded into main memory to generate the SFC keys one by one.Pipelining is adopted to fetch data sequentially into the memory to support thread-level parallelism.Due to the sparsity of points in SFC cells, most 1D ranges will not filter any point, while too many ranges take a long time to load and filter.Therefore, additional tree traversal depth control and adjacent range merge are conducted before returning ranges.Because tree traversal is done in a breadth-first manner, when the traversal goes down to a new level, the number of currently obtained nodes is checked as to whether it is bigger than K*N (N is the number of returned ranges; K is the extra coefficient).If it exceeds K*N, the traversal is terminated.Thus, Step 5 of the recursive range generation will be extended as: 1. Get all raw ranges from each tree node (number ≥ K*N); 2. Sort all the gap distances from raw ranges; 3. Get the nth gap distance (the default value is 1); 4. Merge all the gaps which are less than/equal to the nth gap distance.

The Parallelization of the Generic nD SFC Library
Applications of this generic nD SFC library usually face massive datasets; therefore, the available parallelism of multicore processors is exploited to accelerate practical use.Profiling demonstrates that the two most compute-intensive steps are SFC key encoding when bulk loading, and SFC range generation for nD box queries.
For bulk SFC encoding, a raw point dataset is usually two or three orders of magnitude larger than the memory of standard computing hardware.The points cannot be loaded into main memory to generate the SFC keys one by one.Pipelining is adopted to fetch data sequentially into the memory to support thread-level parallelism.During the translation from the nD box into discrete 1D ranges for nD range query, a collection of discrete tree nodes is obtained; each tree node is further translated into an SFC line segment.These discrete tree nodes are then dispatched to concurrent threads and each thread will take the burden of one part of tree nodes, as illustrated in Figure 8.During the translation from the nD box into discrete 1D ranges for nD range query, a collection of discrete tree nodes is obtained; each tree node is further translated into an SFC line segment.These discrete tree nodes are then dispatched to concurrent threads and each thread will take the burden of one part of tree nodes, as illustrated in Figure 8.During the translation from the nD box into discrete 1D ranges for nD range query, a collection of discrete tree nodes is obtained; each tree node is further translated into an SFC line segment.These discrete tree nodes are then dispatched to concurrent threads and each thread will take the burden of one part of tree nodes, as illustrated in Figure 8.

Random LOD Generation
During visualization, due to massive data size, multiple levels of detail are usually created to allow users to select only a certain percentage of points in a query region.Traditionally, a collection of discrete LODs is derived in an iterative way from the bottom level to the top level; this is called Uniform Sampling [28].
In Uniform Sampling, the original datasets are treated as the bottom level, and the points in one level i + 1 are sampled by a uniform random selection to the upper level i.The factor between two adjacent levels can be 2 n , i.e., one in every 2 n +1 points of level i + 1 will be randomly selected into the level i.The drawback of this method is that it's order-dependent and therefore does not perform well in a parallel mode.
To avoid point counting, an alternative streaming method is proposed based on probabilities of an ideal pyramid distribution among all levels.Assume that there are a total of 32 levels, and that the lowest level is l = 31 (most detailed, highest number of points) and the highest level is l = 0 (least detailed, fewest points); then the number of points in an ideal pyramid distribution for each level will be N(0) = 2 n*0 = 1; N(1) = 2 n ; . . . . . .; N(l − 1) = 2 n*l−n . ( The total l probability intervals are derived as A uniform random variable X ⊂ U(0, ) is then defined.The X will generate a random value x for each point in the stream, and the generated x will then be checked for the probability interval it falls in.This will be used as the LOD level number for this point.The assignment process is illustrated in Figure 9.

Random LOD Generation
During visualization, due to massive data size, multiple levels of detail are usually created to allow users to select only a certain percentage of points in a query region.Traditionally, a collection of discrete LODs is derived in an iterative way from the bottom level to the top level; this is called Uniform Sampling [28].
In Uniform Sampling, the original datasets are treated as the bottom level, and the points in one level i + 1 are sampled by a uniform random selection to the upper level i.The factor between two adjacent levels can be 2 n , i.e., one in every 2 n +1 points of level i + 1 will be randomly selected into the level i.The drawback of this method is that it's order-dependent and therefore does not perform well in a parallel mode.
To avoid point counting, an alternative streaming method is proposed based on probabilities of an ideal pyramid distribution among all levels.Assume that there are a total of 32 levels, and that the lowest level is l = 31 (most detailed, highest number of points) and the highest level is l = 0 (least detailed, fewest points); then the number of points in an ideal pyramid distribution for each level will be N(0) = 2 n*0 = 1; N(1) = 2 n ; ……; N(l − 1) = 2 n*l−n . ( The total l probability intervals are derived as A uniform random variable  is then defined.The X will generate a random value x for each point in the stream, and the generated x will then be checked for the probability interval it falls in.This will be used as the LOD level number for this point.The assignment process is illustrated in Figure 9.A total of 32 levels should be enough for any dataset.For a given dataset, 32 levels may be too much and, using the random distribution as described, most of the top levels will be empty.However, it may happen in rare cases that a point is assigned to such a top level, and in such a case the point is reassigned to a lower level.A total of 32 levels should be enough for any dataset.For a given dataset, 32 levels may be too much and, using the random distribution as described, most of the top levels will be empty.However, it may happen in rare cases that a point is assigned to such a top level, and in such a case the point is reassigned to a lower level.

Loading and Indexing with nD SFC Keys
Since all the input point clouds are available as LAS/LAZ files, the loading uses LAStools to convert LAS/LAZ files to ASCII text records.The text records are then inserted into a temporary Oracle heap table with the sqlldr tool and later clustered into an Oracle IOT table using the SFC keys.The IOT table is contrary to a normal heap table, which requires an additional index to be created.This decreases the storage requirements and the data are better clustered.
Bulk point loading is conducted in an existing Python benchmark framework [19] using a new auxiliary tool, named SFCGen.The SFCGen tool is based on the proposed C++ SFC Library.It generates a random LOD level value for each point and calculates a 4D Hilbert key for XYZ coordinates plus LOD level value, i.e., x/y/z/l.The whole loading operation is listed as follows (details are found in Appendix B): 1.
Create the required temporary Oracle heap table; 2.
Load the points into the temporary Oracle table with the defined pipeline command; 3.
Select all temporary points into the Oracle IOT Table.

Points Query with nD SFC Keys
The nD range query is also conducted in the Python benchmark framework.A new auxiliary tool, named SFCQuery, generates the required SFC key ranges from the input nD box.The SFCQuery tool supports command pipe output if no external output file is specified.So, it is integrated together with Oracle sqlldr to load the generated SFC ranges into a temporary IOT query table.A full nD quey is conducted in two steps (details are listed in Appendix C): 1.
1st SFC Join Filter: to fetch the coarse set with a nested-loop join query between points IOT table and SFC ranges IOT table; 2.
2nd Refinement: to obtain the final exact query results by a point-in-2D-polygon check plus other (n − 2)-dimensional comparisons.

Design and Configuration
To evaluate the effectiveness and performance of our proposed management method with the new SFC library, two groups of evaluation experiments on massive LiDAR point cloud management were carried out in an Oracle database.One group was a loading performance comparison between different SFC keys.The other was an nD range query combined with different types of input geometry.
In Table 1 we detail the datasets for the scalability test; all of them are subsets of The Netherlands AHN2 dataset, ranging from 20 million to 23 billion.The sample density is 6~10 points per square meter.The experimental datasets have different extents and sizes, and each dataset extends the area covered by a previous smaller dataset, illustrated in Figure 10.
The hardware configurations for all tests were same as the former benchmark [19].For the tests, a HP DL380p Gen8 server with the following details were used: 2-way 8-core Intel Xeon processors, E5-2690 at 2.9 GHz, 128 GB of main memory, and a RHEL 6 operating system.The directly attached storage consists of a 400 GB SSD, 5 TB SAS 15 K rpm in RAID 5 configuration (internal), and 2 × 41 TB SATA 7200 rpm in RAID 5 configuration.The version of Oracle Database is 12c Enterprise Edition Release 12.1.0.1.0-64bit.

Evaluation of Bulk Points Loading with SFC Encoding
The first step in this subsection is checking how the chunk size of the input point stream affects parallel encoding efficiency.The optimum chunk size established by experiments was used to minimize loading time in the second step.

Parallel LOD Computation and SFC Encoding
This step monitored how long the SFCGen Tool will take for random LOD calculation and 4D SFC key generation (x/y/z/l) with different chunk sizes on the 20 M dataset.It was done in both serial and parallel modes.All the results are listed in the Table 2.
The results in Table 2 show that as the chunk size increases from 5000 to 100,000, the percentage of reduced processing time changes from 17.4 to 29.4%, and the speedup increases from 1.21 to 1.42.This means that the CPU parallelism is better exploited as the chunk size increases.However, although Input/Output overlap in parallel pipelines, the sequential Output still absorbs a large proportion of the total loading time (over 70%).Thus, from this aspect, the speedup is reasonable.The maximum speedup was about 1.45 (141/97 = 1.45).The bulk loading time recorded the time starting from when the LAS files were read until all points were successfully stored in the Oracle IOT table.Suggested by Table 2, the chunk size was set to 100,000, and both the sequential and the parallel encoding modes were used in this test (the 2201 M and 23,090 M data are too big and only the parallel mode was used).The coefficients for coordinate transformation were (69,000, 440,000, −100, 0; 100, 100, 100, 1).All the bulk loading results are listed in Table 3.Although Hilbert encoding is a little more expensive than Morton encoding, with the help of parallel loading, the differences were significantly shortened (e.g., about 30 s in the 23,090 M dataset).Illustrated in Table 3, the IOT clustering with Hilbert keys costs significantly less time than clustering with Morton keys (e.g., about 3700 s less for a 23,090 M dataset).This can be attributed to the better clustering capabilities of the Hilbert curve.

Evaluation of SFC Range Generation and nD Range Query
The first step in this subsection is a check of different range generation configurations on the efficiency of future nD box queries.From the derived optimum configuration, real-case nD box queries were conducted with support from the SFC keys.

Parallel SFC Range Generation
The range generation time monitored how long the SFCQuery Tool spent generating the needed 1D ranges from an nD box as defined by users.The range generation times were obtained with different configurations (N and K).An example parallel range generation command for 4D box input is as follows.
SFCQuery -i 85300/85450/446300/446450/-2.0/-1.5/8/9-p 1 -s 1 -e 0 -t ct.txt -n 1000 -k 8 -o range.txt-v In this command, the total input cell number of a defined 4D box is the product of the side length of a box and the scale coefficient (as specified in Section 4.2.2) for each dimension, i.e., (∆D1 * ∆D2 * ∆D3 * ∆D4) * (S1 * S2 * S3 * S4).The returned cell count is the sum of returned range intervals.Usually the returned cell count is much bigger than the total input cell number (shown in Table 4).As shown in Figure 11, the parallel mode decreases generation time considerably as compared to the serial mode, while the speedup increases with a bigger N*K.At the same time, as indicated by the extra ratio between returned cell count and input cell count, the bigger N*K will go down deeper in the 2 n -ary tree traversal and make the returned SFC ranges more precisely approximate the shape of the input 4D box.If given a specific N*K, more returned ranges (i.e., bigger N) are better for later queries.As shown in Figure 11, the parallel mode decreases generation time considerably as compared to the serial mode, while the speedup increases with a bigger N*K.At the same time, as indicated by the extra ratio between returned cell count and input cell count, the bigger N*K will go down deeper in the 2 n -ary tree traversal and make the returned SFC ranges more precisely approximate the shape of the input 4D box.If given a specific N*K, more returned ranges (i.e., bigger N) are better for later queries.On the 2201 M dataset, a set of real 4D box queries was conducted.As shown in Table 5 and Figure 12, a bigger N*K can filter out more points during the 1st join step and a smaller number of points on the left side will make the 2nd refinement much quicker.However, a bigger N*K makes the 1st join step take a much longer time.Therefore, in this test a combination of N = 10,000 and K = 4 gets the smallest query time.On the 2201 M dataset, a set of real 4D box queries was conducted.As shown in Table 5 and Figure 12, a bigger N*K can filter out more points during the 1st join step and a smaller number of points on the left side will make the 2nd refinement much quicker.However, a bigger N*K makes the 1st join step take a much longer time.Therefore, in this test a combination of N = 10,000 and K = 4 gets the smallest query time.

The 4D SFC Query in Real Cases
Three geometries were defined for real-case nD SFC queries: Rectangle, Polygon, and Line buffer (shown in Figure 13).The outer rectangle in Figure 13
Queries with SFC keys are more scalable over the data size than a pure B-tree filter.2.
Queries with Hilbert keys are quicker than those with Morton keys (about 0.1 s~0.4 s less time).From same N*K, the derived Hilbert ranges can filter out more points than Morton ranges.

3.
The 1st join filter time is much less than the 2nd refinement time.The 2nd refinement is proportional to the left point count.If the 1st join filter uses more ranges to filter more unrelated points and obtain less-coarse results, then the 2nd refinement will take less time.Three geometries were defined for real-case nD SFC queries: Rectangle, Polygon, and Line buffer (shown in Figure 13).The outer rectangle in Figure 13 is the bounding box of the 20 M dataset.Different SFC types were also compared (Morton vs Hilbert) on both query time and returned points counts.Suggested by Figure 13, N and K were set to 10,000 and 4, respectively.Each 4D query filter is listed as follows: (4D_R: 2D_R + −2.0 ≤ VAL_D3 ≤ -1.From Figure 14, Tables 6 and 7, we can conclude that: 1. Queries with SFC keys are more scalable over the data size than a pure B-tree filter.2. Queries with Hilbert keys are quicker than those with Morton keys (about 0.1 s~0.4 s less time).
From same N*K, the derived Hilbert ranges can filter out more points than Morton ranges.3. The 1st join filter time is much less than the 2nd refinement time.The 2nd refinement is proportional to the left point count.If the 1st join filter uses more ranges to filter more unrelated points and obtain less-coarse results, then the 2nd refinement will take less time.

The 2D SFC Query in Real Cases
For 2D real-case queries in this 4D organization scheme, two solutions are usually implemented.The first neglects the NULL dimensions and uses the corresponding full dimension range in the SFC space (i.e., 0~2 m).The second uses basic information of actual data about the NULL dimensions and inputs a purposed range that includes the full corresponding range of point datasets (min_value~max_value).Two different solution commands are listed as follows.
SFCQuery -i 85300/85450/446300/446450--Solution 1 SFCQuery -i 85300/85450/446300/446450/-5/62/0/10--Solution 2 The results obtained are listed in Table 8.As shown in Table 8, we can conclude that for 2D query cases, if we cannot specify exact parameters for the NULL dimensions, we can input a reasonable range (min_value~max_value) from information of actual datasets.This is much more efficient than the use of the full dimension range.

Discussion
The proposed SFC library can provide complete support for massive points management during the loading, indexing, and later querying.Due to its OOP design, the encapsulated functions of this SFC library can be easily integrated with other frameworks, e.g., [19].With the enablement of multithreading, the capability of SFC encoding, decoding, and range generation in the library is greatly accelerated and scales well with input data size.
During the points loading, Hilbert encoding costs a little more time than Morton encoding, but the IOT clustering with Hilbert keys needs much less time than that with Morton keys.Furthermore, the gap of encoding time between Hilbert and Morton is greatly narrowed by parallel implementation.Therefore, the total loading time with Hilbert keys is quicker for massive point datasets.Generally, queries with SFC keys are more scalable over the data size than pure B-tree indexing.Queries with Hilbert keys are quicker than those with Morton keys due to better locality preservation of the Hilbert curve.Therefore, this proves Hilbert keys very suitable for massive points management.
For SFC querying, the total query time depends on the filter capability of generated 1D ranges.The better the quality of the 1D ranges, the much fewer the left points after the 1st join filter will be.The quality of 1D ranges depends on the levels of the 2 n -ary tree traversal, shown in Section 3.1.2.However, deep traversal still cannot decrease the total query time.Too many 1D ranges will cause the join operations to be extremely expensive.Thus, there exists a balanced level during tree traversal.Due to uneven distribution of input points, some of the generated ranges may filter out no data.Thus, another way to improve the quality of 1D ranges can be done with a statistical histogram of data distribution.If the area represented by one derived range contains no data, this ranges should be excluded from the range collection.

Conclusions and Future Work
Since currently no complete and scalable nD Space-Filling Curve library is available for massive point indexing and query retrieval, we propose and implement a generic nD SFC library and apply it to manage massive multiscale LiDAR point clouds.For efficient code reuse, the SFC Library is designed in consideration of object-oriented programming and encapsulates the SFC-related functions, encoding/decoding/query, into abstract classes.It also exploits multicore processor parallelism to accelerate these functions.When integrating this library with massive 4D point cloud management, experimental results show that the proposed nD SFC library provides complete functions for multiscale points management and exhibits robust scalability over massive datasets.
In the near future, the following issues will be investigated: 1.
Real nD geometry query should be supported for future LOD visualization instead of extruded-prism-shape query in 4D cases (2D polygon + 2D attribute filters).

2.
Because SFC keys already contain input coordinate information, coordinate storage in databases might result in redundancy.Key-only storage and decoding in the database will be exploited for higher efficiency.3.
We will explore true vario-scale/continuous LODs implemented in this proposed method, i.e., importance value per point, instead of by a discrete number of levels.4.
Our proposed solution can be integrated onto NoSQL databases, e.g., HBase, to support distributed data management with larger datasets (>10 trillion points).

5.
More applications for different point dataset management, e.g., massive trajectory data collected from moving objects, and dynamic true vario-scale data (5D).

Figure 1 .
Figure 1.The illustration of 2D Hilbert curves with different orders.Figure 1.The illustration of 2D Hilbert curves with different orders.

Figure 1 .
Figure 1.The illustration of 2D Hilbert curves with different orders.Figure 1.The illustration of 2D Hilbert curves with different orders.

Figure 2 .
Figure 2. The illustration of dynamic 2D R-tree.

Figure 3 .
Figure 3.A 2D implicit fixed tree labeled with Z-order keys.

Figure 2 .
Figure 2. The illustration of dynamic 2D R-tree.

Figure 2 .
Figure 2. The illustration of dynamic 2D R-tree.

Figure 3 .
Figure 3.A 2D implicit fixed tree labeled with Z-order keys.

Figure 3 .
Figure 3.A 2D implicit fixed tree labeled with Z-order keys.

Figure 5 .
Figure 5.The relationship between the quadtree node and 1D Hilbert key range.
If it is intersected by this child, bisect the input box along the middle line in each dimension and obtain new query boxes;  If no overlap, repeat 3 and check other child nodes.

4 .
Repeat 2 and 3 with intersected children nodes and new query boxes until the leaf level; 5. Translate the obtained nodes into SFC ranges, merge continuous intervals, and return the derived ranges.

Figure 6 .
Figure 6.Illustration of the recursive 1D range generation process.

Figure 5 .
Figure 5.The relationship between the quadtree node and 1D Hilbert key range.
If it is intersected by this child, bisect the input box along the middle line in each dimension and obtain new query boxes;  If no overlap, repeat 3 and check other child nodes.4. Repeat 2 and 3 with intersected children nodes and new query boxes until the leaf level; 5. Translate the obtained nodes into SFC ranges, merge continuous intervals, and return the derived ranges.

Figure 6 .
Figure 6.Illustration of the recursive 1D range generation process.

Figure 6 .
Figure 6.Illustration of the recursive 1D range generation process.
Figure 7 illustrates our pipeline design for bulk SFC encoding, in which parallelism is designed at the pipeline level.Each pipeline includes three continuous stages: INPUT, SFC_ENCODE, and OUTPUT.The chunk size of the INPUT stage determines how many points are read and encoded in each pipeline.Another feature of this pipelining is overlapped I/O, i.e., when one thread is blocked by the I/O operation, others threads can continue the encoding work.All bulk SFC encoding pipelines are implemented with the Intel Threading Building Blocks (Intel TBB) [26,27].ISPRS Int.J. Geo-Inf.2018 7 of 19 Figure 7 illustrates our pipeline design for bulk SFC encoding, in which parallelism is designed at the pipeline level.Each pipeline includes three continuous stages: INPUT, SFC_ENCODE, and OUTPUT.The chunk size of the INPUT stage determines how many points are read and encoded in each pipeline.Another feature of this pipelining is overlapped I/O, i.e., when one thread is blocked by the I/O operation, others threads can continue the encoding work.All bulk SFC encoding pipelines are implemented with the Intel Threading Building Blocks (Intel TBB)[26,27].

Figure 7 .
Figure 7.The parallel pipeline for bulk SFC encoding.

Figure 7 .
Figure 7.The parallel pipeline for bulk SFC encoding.

Figure 7 .
Figure 7.The parallel pipeline for bulk SFC encoding.

Figure 8 .
Figure 8.The parallel translation from discrete tree nodes to SFC key ranges.Figure 8.The parallel translation from discrete tree nodes to SFC key ranges.

Figure 8 .
Figure 8.The parallel translation from discrete tree nodes to SFC key ranges.Figure 8.The parallel translation from discrete tree nodes to SFC key ranges.

3. 2 .
The Application of the Proposed SFC Library in Massive Multiscale Points Management

Figure 9 .
Figure 9.The illustration of streaming random uniform sampling.

Figure 9 .
Figure 9.The illustration of streaming random uniform sampling.

Figure 11 .
Figure 11.Parallel SFC range generation for the example 4D query box.

Figure 11 .
Figure 11.Parallel SFC range generation for the example 4D query box.

Figure 12 .
Figure 12.The query time of two steps with different returned ranges.

Figure 12 .
Figure 12.The query time of two steps with different returned ranges.

Figure 12 .
Figure 12.The query time of two steps with different returned ranges.

Figure 13 .
Figure 13.Different 2D geometries used in the 4D query cases

Figure 13 .
Figure 13.Different 2D geometries used in the 4D query cases

Figure 14 .
Figure 14.Total query time over different data sizes in the 4D query cases.

Table 1 .
The basic information for the experimental datasets.
Figure 10.The different experimental datasets displayed on OpenStreetMap.

Table 3 .
Total loading time for different data sizes (sec.).

Table 4 .
The returned range information with different N and K values for this 4D box input (sec.).

Table 5 .
4D range query on 2201 M with different returned Hilbert ranges (sec.).

Table 5 .
4D range query on 2201 M with different returned Hilbert ranges (sec.).

Table 6 .
Query time on different data sizes (sec.).

Table 7 .
A comparison of 4D_R query time between Hilbert, Morton, and B-tree (sec.)indices.
Figure 14.Total query time over different data sizes in the 4D query cases.

Table 7 .
A comparison of 4D_R query time between Hilbert, Morton, and B-tree (sec.)indices.

Table 8 .
2D Rectangle Query time on 20 M points with two solutions (sec.).