Point of interest (POI) matching identifies POI pairs that refer to the same real-world entity and is a core problem in geospatial data integration. To address the low accuracy of geospatial entity matching based on a single feature attribute, this study proposes a method that combines Dempster–Shafer (D–S) evidence theory with a multiattribute matching strategy. During POI data preprocessing, the method calculates the spatial, name, address, and category similarities between POI pairs from different geospatial datasets using the multiattribute matching strategy. The similarity results for these four feature attributes are used as independent evidence to construct the basic probability distribution. Multiattribute models are then constructed separately using the improved combination rule of D–S evidence theory, and a series of decision thresholds is set to give the final entity matching results. We tested the method on a dataset containing Baidu POIs and Gaode POIs from Beijing. The results showed that (1) the multiattribute matching model based on the improved D–S evidence theory performed well in terms of precision, recall, and F1 when matching entities across datasets; (2) among all models, the model combining the spatial, name, and category (SNC) attributes performed best in POI entity matching; and (3) the method effectively addressed the low precision of entity matching based on a single feature attribute.
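The evidence-fusion step described above can be illustrated with a minimal sketch. It uses the classic Dempster combination rule over the frame {match, nonmatch}, not the improved rule proposed in this study, whose details are not given here. The function names, the `reliability` weight used to turn a similarity score into masses, and the `threshold` value are all hypothetical choices made for illustration only.

```python
from functools import reduce


def similarity_to_mass(similarity, reliability):
    """Hypothetical mapping from a similarity in [0, 1] to a mass function.

    The share (1 - reliability) is left on "unknown", i.e. the whole frame.
    """
    return {
        "match": reliability * similarity,
        "nonmatch": reliability * (1.0 - similarity),
        "unknown": 1.0 - reliability,
    }


def dempster_combine(m1, m2):
    """Classic Dempster rule for two mass functions on {match, nonmatch}."""
    # Conflict: mass where one source says match and the other says nonmatch.
    conflict = m1["match"] * m2["nonmatch"] + m1["nonmatch"] * m2["match"]
    if conflict >= 1.0:
        raise ValueError("sources are in total conflict; rule is undefined")
    norm = 1.0 - conflict
    match = (m1["match"] * m2["match"]
             + m1["match"] * m2["unknown"]
             + m1["unknown"] * m2["match"]) / norm
    nonmatch = (m1["nonmatch"] * m2["nonmatch"]
                + m1["nonmatch"] * m2["unknown"]
                + m1["unknown"] * m2["nonmatch"]) / norm
    unknown = (m1["unknown"] * m2["unknown"]) / norm
    return {"match": match, "nonmatch": nonmatch, "unknown": unknown}


def match_decision(similarities, reliability=0.8, threshold=0.7):
    """Fuse per-attribute similarities and apply an assumed decision threshold."""
    masses = [similarity_to_mass(s, reliability) for s in similarities.values()]
    combined = reduce(dempster_combine, masses)
    return combined, combined["match"] >= threshold


if __name__ == "__main__":
    # Illustrative similarity scores for one candidate POI pair (made-up values).
    pair = {"spatial": 0.9, "name": 0.8, "category": 1.0}
    combined, is_match = match_decision(pair)
    print(combined, is_match)
```

Keeping an explicit "unknown" mass lets a weak or unreliable attribute contribute little to the decision instead of voting against a match, which is the main reason to prefer evidence fusion over a plain weighted average of similarities.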
This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.