In recent years, with the development of sensor technology, the volume of spatial data has grown exponentially. However, these data are often unevenly distributed, and traditional indexing methods cannot predict the overall data distribution when data are continuously inserted into the database. This makes them inefficient for indexing large-scale, unevenly distributed spatial data. This paper proposes a hybrid indexing method based on the grid-indexing and R-tree methods, called R-MLGTI (R-Multi-Level Grid–Tree Index). The method first divides the two-dimensional space using the Z-curve to form multiple sub-grid regions. When incrementally inserting data, R-MLGTI calculates the grid encoding of the data and computes a density metric for the corresponding grid G, which quantifies how sparse or dense the data within that grid region are. All data in sparse grids are indexed by R-trees associated with grid encodings. In dense grid areas, a finer-grained space-filling curve is recursively applied for further spatial division. This process forms multiple sub-grids until the data within all sub-grids become sparse, at which point the original data are re-indexed according to the sparse grids. Finally, this paper presents a prototype system of the in-memory R-MLGTI and conducts benchmark tests for incremental data import and range queries. The incremental data insertion performance of R-MLGTI is lower than that of the grid-indexing and R-tree methods; however, on various unevenly distributed simulated datasets, range queries over different query regions in R-MLGTI are on average about 6.49% faster than with the grid-indexing method and about 51.78% faster than with the R-tree method. On a real dataset, Landsat 7 EMT, which contains 2,585,203 records, queries over various query ranges are on average 61.39% faster than with the grid-indexing method and 17.01% faster than with the R-tree method. Experiments show that R-MLGTI performs better than the traditional R-tree and grid-indexing methods for query requests over large-scale, unevenly distributed spatial data.
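The insertion scheme described above can be illustrated with a minimal Python sketch. It is not the authors' implementation, and several choices are assumptions made only for illustration: the density metric is taken to be a simple point count compared against a hypothetical `capacity` threshold (the abstract does not define the metric), a plain list stands in for the per-grid R-tree, and the names `morton_encode`, `Cell`, `capacity`, and `max_depth` are hypothetical.

```python
from dataclasses import dataclass, field


def morton_encode(col: int, row: int, bits: int) -> int:
    """Interleave the low `bits` bits of col and row into a Z-order code."""
    code = 0
    for i in range(bits):
        code |= ((col >> i) & 1) << (2 * i)
        code |= ((row >> i) & 1) << (2 * i + 1)
    return code


@dataclass
class Cell:
    x0: float
    y0: float
    size: float
    bits: int = 1        # each split yields 2^bits x 2^bits sub-grids
    capacity: int = 4    # assumed density threshold (point count)
    depth: int = 0
    max_depth: int = 8   # guards against endless splitting on duplicates
    points: list = field(default_factory=list)    # stand-in for an R-tree
    children: dict = field(default_factory=dict)  # Z-order code -> Cell

    def _child(self, x: float, y: float) -> "Cell":
        n = 1 << self.bits
        step = self.size / n
        col = min(int((x - self.x0) / step), n - 1)
        row = min(int((y - self.y0) / step), n - 1)
        code = morton_encode(col, row, self.bits)
        if code not in self.children:
            self.children[code] = Cell(
                self.x0 + col * step, self.y0 + row * step, step,
                self.bits, self.capacity,
                depth=self.depth + 1, max_depth=self.max_depth)
        return self.children[code]

    def insert(self, x: float, y: float) -> None:
        if self.children:  # already subdivided: route to the sub-grid
            self._child(x, y).insert(x, y)
            return
        self.points.append((x, y))
        # Dense cell: subdivide and re-index its points into the sub-grids.
        if len(self.points) > self.capacity and self.depth < self.max_depth:
            pts, self.points = self.points, []
            for px, py in pts:
                self._child(px, py).insert(px, py)

    def query(self, qx0: float, qy0: float, qx1: float, qy1: float) -> list:
        # Skip cells whose extent does not overlap the query rectangle.
        if (qx1 < self.x0 or qx0 > self.x0 + self.size or
                qy1 < self.y0 or qy0 > self.y0 + self.size):
            return []
        hits = [(x, y) for x, y in self.points
                if qx0 <= x <= qx1 and qy0 <= y <= qy1]
        for child in self.children.values():
            hits.extend(child.query(qx0, qy0, qx1, qy1))
        return hits
```

In this sketch, a clustered region keeps splitting into finer sub-grids while isolated points stay in coarse cells, so a range query only descends into the cells that overlap the query rectangle. A real implementation would replace the per-cell list with an R-tree keyed by the grid encoding, as the abstract describes.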