In recent years, positioning sensors have become ubiquitous, and the amount of trajectory data has grown tremendously. Efficiently storing and querying massive trajectory data is a major challenge. Among the typical operations on trajectories, the similarity query is an important yet complicated operator, useful in navigation systems, transportation optimization, and so on. However, most existing studies have addressed the problem on a centralized system, and a single machine can hardly satisfy the storage and processing requirements of massive data. A distributed framework for similarity queries over massive trajectory data is therefore urgently needed. In this research, we propose DFTHR (distributed framework based on HBase and Redis) to support similarity queries using the Hausdorff distance. DFTHR uses a segment-based data model with a number of optimizations for storing, indexing, and pruning to ensure efficient querying. Furthermore, it adopts a bulk-based method to reduce the cost of adjusting partitions, so that incremental datasets can be supported efficiently. Additionally, DFTHR introduces a co-location-based distributed strategy and a node-locality-based parallel query algorithm to reduce inter-worker overhead. Experiments show that DFTHR significantly outperforms other schemes.
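For readers unfamiliar with the similarity measure named above, the following is a minimal sketch of the discrete Hausdorff distance between two trajectories. It assumes each trajectory is a list of sampled planar points compared with Euclidean distance; it is a brute-force, single-machine illustration and not DFTHR's segment-based, distributed implementation. The function names `directed_hausdorff` and `hausdorff` are hypothetical.

```python
from math import hypot
from typing import Sequence, Tuple

Point = Tuple[float, float]  # assumed planar (x, y) coordinates


def directed_hausdorff(a: Sequence[Point], b: Sequence[Point]) -> float:
    """Largest distance from any point of `a` to its nearest point in `b`."""
    return max(
        min(hypot(p[0] - q[0], p[1] - q[1]) for q in b)
        for p in a
    )


def hausdorff(a: Sequence[Point], b: Sequence[Point]) -> float:
    """Symmetric Hausdorff distance: the larger of the two directed distances."""
    return max(directed_hausdorff(a, b), directed_hausdorff(b, a))


# Two small illustrative trajectories (made-up sample points).
t1 = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
t2 = [(0.0, 1.0), (1.0, 1.0), (2.0, 3.0)]

print(hausdorff(t1, t2))  # 3.0, driven by the outlying point (2.0, 3.0)
```

The brute-force version compares every pair of points, so its cost grows with the product of the trajectory lengths; this is the kind of cost that indexing and pruning are meant to avoid at scale.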
This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.