Spatial Proximity Relations-Driven Semantic Representation for Geospatial Entity Categories
Abstract
1. Introduction
- We propose a semantic representation approach for geographical entity categories. The approach does not require the assistance of domain experts; it uses structural deep network embedding (SDNE) to capture the local and global characteristics of geographical entity categories, yielding an unsupervised representation of those categories.
- We use the proximity relationship to build a spatial proximity graph of geographic entities, from which we derive an adjacency matrix of all entity categories in the experimental area. We then calculate the adjacency strength so that, among the neighboring entities, those most relevant to the core entity contribute more.
- The semantic representations (feature embeddings) of geographic entity categories obtained from the model in this paper can be effectively applied to downstream tasks such as entity category similarity calculation and similar region extraction, which in turn support applications such as similar scene recommendation and commercial site selection.
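The second contribution can be illustrated with a minimal sketch, under stated assumptions. The category names are taken from the sample node table in this article, but the proximity edges are hypothetical, and the row normalization used for `adjacency_strength` is only one plausible weighting, not necessarily the adjacency strength defined in the paper.

```python
from collections import defaultdict

# Entity nodes: node number -> category (Fclass), following the sample node table.
node_category = {
    1: "department-store",
    2: "restaurant",
    3: "mobile-phone-shop",
    4: "restaurant",
    5: "bank",
    6: "building",
}

# Hypothetical undirected proximity edges between entity nodes.
proximity_edges = [(1, 2), (1, 3), (2, 4), (4, 5), (5, 6), (3, 6)]


def category_adjacency(node_category, edges):
    """Count how often entities of two categories are spatial neighbors."""
    counts = defaultdict(lambda: defaultdict(int))
    for u, v in edges:
        cu, cv = node_category[u], node_category[v]
        counts[cu][cv] += 1
        if cu != cv:
            # Keep the matrix symmetric; same-category edges are counted once.
            counts[cv][cu] += 1
    return {cat: dict(row) for cat, row in counts.items()}


def adjacency_strength(counts):
    """Row-normalize counts so each category's weights sum to 1 (assumed weighting)."""
    strength = {}
    for cat, row in counts.items():
        total = sum(row.values())
        strength[cat] = {other: n / total for other, n in row.items()}
    return strength


counts = category_adjacency(node_category, proximity_edges)
strength = adjacency_strength(counts)
print(strength["department-store"])
```

Under this weighting, a category that co-occurs more often with the core category receives a larger share of that category's row, which is the intuition behind strengthening highly relevant neighbors.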
2. Related Work
2.1. Semantic Similarity Calculation
2.2. Representation Learning
- Statistical learning-based models analyze data to induce regularities that cannot be observed directly. The concept of word embedding dates back to the vector space model [32] proposed in 1975. Embedding methods are now an active research area. Word2vec is an embedding method from natural language processing that primarily comprises the CBOW and Skip-Gram models. Zhuang et al. [33] used Word2vec to encode context into a semantic space and proposed a context-aware interaction recognition framework. Word2vec also serves as the foundation for extended embedding methods such as Item2vec, which is based on item correlation and trains the model on the historical behavior sequences generated by purchasing or browsing activity [34]. These statistical learning methods generalize well and can be applied to text data from a variety of fields.
- A knowledge graph-based approach can greatly improve the effectiveness of knowledge acquisition, fusion, and reasoning [35]. Translation-based models, such as TransE [36], TransH [37], TransR [38], and TransD [39], are commonly used to learn knowledge graph representations. The idea behind these models is to treat the relation in a knowledge graph triple as a translation from the head entity to the tail entity.
- The graph representation approach is more effective at analyzing the relationships between network nodes, and its output can be used as the input for machine learning algorithms [40]. These models fall into three categories: matrix decomposition-based models, random walk-based models, and graph neural network-based models. Traditional node embedding methods rely on matrix decomposition, while the latter two categories are becoming increasingly popular. DeepWalk is an early graph embedding model that gained wide adoption. It generates node sequences from the network using random walks and trains the node embeddings with Word2vec [14]. Typical graph neural networks include the graph convolutional network (GCN) [41] and the graph attention network (GAT) [42].
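The random walk step described for DeepWalk can be sketched as follows. This is an illustrative sketch only: the toy graph, the function name `random_walks`, and its parameters are assumptions, and the subsequent Skip-Gram training on the generated sequences is not shown.

```python
import random


def random_walks(adjacency, walk_length, walks_per_node, seed=0):
    """Generate truncated uniform random walks starting from every node."""
    rng = random.Random(seed)
    walks = []
    for _ in range(walks_per_node):
        nodes = list(adjacency)
        rng.shuffle(nodes)  # visit start nodes in a random order on each pass
        for start in nodes:
            walk = [start]
            while len(walk) < walk_length:
                neighbors = adjacency[walk[-1]]
                if not neighbors:
                    break  # dead end: stop this walk early
                walk.append(rng.choice(neighbors))
            walks.append(walk)
    return walks


# Hypothetical toy graph given as adjacency lists.
graph = {
    "a": ["b", "c"],
    "b": ["a", "c"],
    "c": ["a", "b", "d"],
    "d": ["c"],
}

walks = random_walks(graph, walk_length=5, walks_per_node=2)
print(len(walks), "walks, e.g.", walks[0])
```

Each walk plays the role of a sentence and each node the role of a word, which is why a Word2vec-style model can be trained on the output.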
2.3. Geographical Entity Representation Learning
3. Method
3.1. Framework
3.2. Construction of Spatial Proximity Graphs of Geographical Entities
3.2.1. Structure of the Spatial Proximity Graph
- (1) Definition of Nodes
- (2) Definition of Edges
3.2.2. Method for Determining Neighborhoods
3.2.3. Method for Generating Spatial Proximity Graph
3.3. Semantic Representation Model Construction
3.3.1. Neighborhood Matrix Construction
3.3.2. Entity Category Semantic Representation Model
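For orientation, the objective of the original SDNE model (Wang et al. [17]) is reproduced below. It is the published SDNE formulation, not necessarily the exact variant used in this paper, which may differ in how the adjacency strength enters the input matrix. Here $x_i$ is the $i$-th row of the adjacency matrix, $\hat{x}_i$ its reconstruction, $y_i$ the embedding of node $i$, $s_{ij}$ the edge weight, and $W^{(k)}$, $\hat{W}^{(k)}$ the encoder and decoder weights.

```latex
\begin{aligned}
\mathcal{L}_{2nd} &= \sum_{i=1}^{n} \left\lVert (\hat{x}_i - x_i) \odot b_i \right\rVert_2^2,
\qquad b_{ij} =
\begin{cases}
1, & s_{ij} = 0,\\
\beta > 1, & \text{otherwise},
\end{cases} \\
\mathcal{L}_{1st} &= \sum_{i,j=1}^{n} s_{ij} \left\lVert y_i - y_j \right\rVert_2^2, \\
\mathcal{L}_{reg} &= \frac{1}{2} \sum_{k=1}^{K} \left( \lVert W^{(k)} \rVert_F^2 + \lVert \hat{W}^{(k)} \rVert_F^2 \right), \\
\mathcal{L}_{mix} &= \mathcal{L}_{2nd} + \alpha\, \mathcal{L}_{1st} + \nu\, \mathcal{L}_{reg}.
\end{aligned}
```

The second-order term preserves global structure by reconstructing each node's neighborhood, while the first-order term preserves local structure by pulling directly linked nodes together in the embedding space.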
4. Experiment
4.1. Datasets and Preprocessing
4.2. Experiment Setup
4.2.1. Evaluation Tasks
- (1) Category pair similarity comparison task: After the representation learning model is trained, similar geospatial entity categories are mapped to similar feature embeddings. The similarity between feature embeddings expresses the semantic similarity and correlation between entity categories as a numerical value [51,52]. Given the feature embeddings of two geographical entity classes M and N, the similarity between the classes is evaluated as the cosine similarity between their embeddings, and the model’s efficacy is verified by comparing these values with the similarity values produced by the standard model.
- (2) Regional similarity task: We apply the category feature embeddings to the embedding representation of regions to test the feasibility of the representation learning approach for downstream applications.
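The two evaluation tasks can be illustrated with a small sketch. The cosine similarity follows the description above. The region pooling in `region_embedding` (a count-weighted mean of category embeddings) is an assumption made for illustration, not the aggregation defined in the paper, and the two-dimensional embeddings are toy values.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two embedding vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_u * norm_v)


def region_embedding(category_counts, embeddings):
    """Count-weighted mean of category embeddings (assumed pooling scheme)."""
    total = sum(category_counts.values())
    dim = len(next(iter(embeddings.values())))
    pooled = [0.0] * dim
    for category, count in category_counts.items():
        for i, value in enumerate(embeddings[category]):
            pooled[i] += count * value / total
    return pooled


# Toy two-dimensional embeddings; real embeddings would come from the trained model.
embeddings = {"school": [1.0, 0.0], "theater": [0.0, 1.0], "park": [1.0, 1.0]}

# Task (1): similarity between two categories.
sim = cosine_similarity(embeddings["school"], embeddings["park"])

# Task (2): similarity between two regions described by their category counts.
region_a = region_embedding({"school": 2, "park": 2}, embeddings)
region_b = region_embedding({"theater": 1, "park": 1}, embeddings)
region_sim = cosine_similarity(region_a, region_b)
print(round(sim, 4), round(region_sim, 4))
```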
4.2.2. Verification Program Design
- 1. Category pair similarity comparison task
- 2. Regional similarity task
4.2.3. Parameter Settings
5. Results and Discussion
5.1. Category Pair Similarity Comparison Task
5.1.1. Analysis of Results
- Some category pairs have slightly different ontological properties but are spatially close, e.g., “dike–gate”. They share similar neighborhood environments, so model training yields similar feature embeddings. This is reflected in their cosine similarity, whereas the standard model rates their similarity lower (0.3073 versus 0.5601 from our model). In such cases, the results of the proposed model capture both the correlation and the similarity between the categories. The traditional standard model cannot directly reflect this, which explains the difference between the two sets of results.
- Some category pairs are not spatially adjacent but have similar neighborhood environments, so the proposed model yields a higher similarity score. The standard model, which compares ontological properties, gives a lower similarity because some important properties differ, such as the “function” property of the water plant and sewage treatment plant categories (0.3774 versus 0.7346 from our model).
5.1.2. Error Analysis
5.1.3. Statistical Analysis
- The first part of the data: The test showed a statistically significant difference (W = 608, p < 0.001), but the effect size was extremely small (r = 0.030), indicating that the difference in this interval is negligible;
- The second part of the data: The test did not reach the significance level (W = 1959, p = 0.386), and the difference was relatively small.
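The relation between a test statistic and its effect size can be sketched as follows, assuming that W denotes the Wilcoxon signed-rank statistic and that r is computed as |Z| / sqrt(N); neither assumption is confirmed by the text above. The sketch uses the normal approximation without tie or continuity correction, takes N as the number of non-zero paired differences, and runs on made-up paired values.

```python
import math


def wilcoxon_effect_size(x, y):
    """Return (W, Z, r) for paired samples using the normal approximation."""
    diffs = [a - b for a, b in zip(x, y) if a != b]  # zero differences are dropped
    n = len(diffs)
    if n == 0:
        raise ValueError("all paired differences are zero")

    # Rank absolute differences, assigning average ranks to ties.
    order = sorted(range(n), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        while j + 1 < n and abs(diffs[order[j + 1]]) == abs(diffs[order[i]]):
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        i = j + 1

    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    w_minus = sum(r for r, d in zip(ranks, diffs) if d < 0)
    w = min(w_plus, w_minus)

    mean = n * (n + 1) / 4
    sd = math.sqrt(n * (n + 1) * (2 * n + 1) / 24)
    z = (w - mean) / sd
    r = abs(z) / math.sqrt(n)
    return w, z, r


# Made-up paired scores, for illustration only.
ours = [10, 12, 15, 20, 25]
standard = [9, 14, 12, 24, 20]
w, z, r = wilcoxon_effect_size(ours, standard)
print(w, round(z, 4), round(r, 4))
```

Because r scales the standardized statistic by the sample size, a difference can be statistically significant yet have a small r, which is the situation described for the first part of the data.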
5.2. Regional Similarity Task
5.3. Different Neighborhood Range Widths’ Effects on Embedding Representation
6. Conclusions
Author Contributions
Funding
Data Availability Statement
Conflicts of Interest
References
- 1. Duan, X. Research on Ontology Alignment Method and Optimization Algorithm Based on Deep Learning; Chongqing Normal University: Chongqing, China, 2020.
- 2. Zhao, Y.; Sun, Q.; Liu, X.; Cheng, M.; Yu, T.; Li, Y. Geographical entity-oriented semantic similarity measurement method and its application in road matching. Geomat. Inf. Sci. Wuhan Univ. 2020, 45, 728–735.
- 3. Ling, Z.; Li, R.; Wu, H.; Li, J.; Gui, Z. Semantic-driven construction of geographic entity association network and knowledge service. Acta Geod. Cartogr. Sin. 2023, 52, 478.
- 4. Li, W.; Zhang, Y.; Pan, L. Ontology concept update method based on semantic similarity. Comput. Appl. Softw. 2018, 35, 15–20.
- 5. Wang, L.; Zhang, F.; Du, Z.; Chen, Y.; Zhang, C.; Liu, R. A hybrid semantic similarity measurement for geospatial entities. Microprocess. Microsyst. 2021, 80, 103526.
- 6. Zhao, H.; Zhu, Y.; Yang, H.; Luo, K. The semantic relevancy computation model on essential features of geospatial data. Geogr. Res. 2016, 35, 58–70.
- 7. Zhu, G.; Iglesias, C.A. Computing semantic similarity of concepts in knowledge graphs. IEEE Trans. Knowl. Data Eng. 2016, 29, 72–85.
- 8. Tan, Y.; Gao, L.; Li, L.; Cheng, P.; Wang, H.; Li, X.; Chen, C. A dynamic weighted model for semantic similarity measurement between geographic feature categories. Acta Geod. Cartogr. Sin. 2023, 52, 843.
- 9. Li, L.; Zhu, H.H.; Wang, H.; Li, D.R. Semantic analyses of the fundamental geographic information based on formal ontology—Exemplifying hydrological category. Acta Geod. Cartogr. Sin. 2008, 37, 230–235.
- 10. Li, H.; Zhai, L.; Zhu, H. Semantic similarities calculative modeling for geospatial entity classes based on ontology. Sci. Surv. Mapp. 2009, 34, 12–14.
- 11. Tan, Y.; Li, L.; Wang, W.; Yu, Z.; Zhang, Z.; Mao, K.; Xu, Y. Semantic similarity measurement model between fundamental geographic information concepts based on ontological property. Acta Geod. Cartogr. Sin. 2013, 42, 782.
- 12. Zhang, D.; Yin, J.; Zhu, X.; Zhang, C. Network representation learning: A survey. IEEE Trans. Big Data 2018, 6, 3–28.
- 13. Bengio, Y.; Courville, A.; Vincent, P. Representation learning: A review and new perspectives. IEEE Trans. Pattern Anal. Mach. Intell. 2013, 35, 1798–1828.
- 14. Perozzi, B.; Al-Rfou, R.; Skiena, S. Deepwalk: Online learning of social representations. In Proceedings of the 20th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, New York, NY, USA, 24–27 August 2014; pp. 701–710.
- 15. Narayanan, A.; Chandramohan, M.; Venkatesan, R.; Chen, L.; Liu, Y.; Jaiswal, S. graph2vec: Learning distributed representations of graphs. arXiv 2017, arXiv:1707.05005.
- 16. Grover, A.; Leskovec, J. node2vec: Scalable feature learning for networks. In Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, New York, NY, USA, 13–17 August 2016; pp. 855–864.
- 17. Wang, D.; Cui, P.; Zhu, W. Structural deep network embedding. In Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, San Francisco, CA, USA, 13–17 August 2016; pp. 1225–1234.
- 18. Tang, J.; Qu, M.; Wang, M.; Zhang, M.; Yan, J.; Mei, Q. Line: Large-scale information network embedding. In Proceedings of the 24th International Conference on World Wide Web, Florence, Italy, 18–22 May 2015; pp. 1067–1077.
- 19. Cao, S.; Lu, W.; Xu, Q. Grarep: Learning graph representations with global structural information. In Proceedings of the 24th ACM International on Conference on Information and Knowledge Management, Melbourne, Australia, 18–23 October 2015; pp. 891–900.
- 20. Belkin, M.; Niyogi, P. Laplacian eigenmaps for dimensionality reduction and data representation. Neural Comput. 2003, 15, 1373–1396.
- 21. Jones, C.B.; Ware, M.J. Nearest neighbor search for linear and polygonal objects with constrained triangulations. In Proceedings of the 8th International Symposium on Spatial Data Handling, Vancouver, BC, Canada, 11–15 July 1998; pp. 13–21.
- 22. Yan, C.; Liu, X.; Li, A. The representation and application of spatial neighborhood. J. Spatio-Temporal Inf. 2018, 25, 18–22.
- 23. Zhu, A.; Lv, G.; Zhou, C.; Qin, C. Geographic similarity: Third law of geography? J. Geo-Inf. Sci. 2020, 22, 673–679.
- 24. Song, Y. Geographically optimal similarity. Math. Geosci. 2023, 55, 295–320.
- 25. Zhu, A.; Turner, M. How is the Third Law of Geography different? Ann. GIS 2022, 28, 57–67.
- 26. Zhu, A.X.; Lu, G.; Liu, J.; Qin, C.-Z.; Zhou, C. Spatial prediction based on third law of geography. Ann. GIS 2018, 24, 225–240.
- 27. Jiao, Z.; Tao, R. Geographical Gaussian Process Regression: A Spatial Machine-Learning Model Based on Spatial Similarity. Geogr. Anal. 2025.
- 28. Dhyani, D.; Ng, W.K.; Bhowmick, S.S. A survey of web metrics. ACM Comput. Surv. (CSUR) 2002, 34, 469–503.
- 29. Hamming, R.W. Error detecting and error correcting codes. Bell Syst. Tech. J. 1950, 29, 147–160.
- 30. Zhao, Y. Research on Key Technology of Semantic Consistency Processing for Multi-Sources Vector Data; Information Engineering University: Zhengzhou, China, 2021.
- 31. Xu, Z.; Zhu, Y.; Song, J.; Sun, K.; Wang, S. Word embedding-based method for entity category alignment of geographic knowledge base. Inf. Sci. 2021, 23, 1372–1381.
- 32. Salton, G.; Wong, A.; Yang, C. A vector space model for automatic indexing. Commun. ACM 1975, 18, 613–620.
- 33. Zhuang, B.; Liu, L.; Shen, C.; Reid, I. Towards context-aware interaction recognition for visual relationship detection. In Proceedings of the 2017 IEEE International Conference on Computer Vision (ICCV), Venice, Italy, 22–29 October 2017; pp. 589–598.
- 34. Barkan, O.; Koenigstein, N. Item2vec: Neural item embedding for collaborative filtering. In Proceedings of the 2016 IEEE 26th International Workshop on Machine Learning for Signal Processing (MLSP), Vietri sul Mare, Italy, 13–16 September 2016; pp. 1–6.
- 35. Wu, S.; Roberts, K.; Datta, S.; Du, J.; Ji, Z.; Si, Y.; Soni, S.; Wang, Q.; Wei, Q.; Xiang, Y.; et al. Deep learning in clinical natural language processing: A methodical review. J. Am. Med. Inform. Assoc. 2020, 27, 457–470.
- 36. Bordes, A.; Usunier, N.; Garcia-Duran, A.; Weston, J.; Yakhnenko, O. Translating embeddings for modeling multi-relational data. In Proceedings of the 27th International Conference on Neural Information Processing Systems, Lake Tahoe, NV, USA, 5–10 December 2013; p. 26.
- 37. Wang, Z.; Zhang, J.; Feng, J.; Chen, Z. Knowledge graph embedding by translating on hyperplanes. In Proceedings of the Twenty-Eighth AAAI Conference on Artificial Intelligence, Québec City, QC, Canada, 27–31 July 2014.
- 38. Lin, Y.; Liu, Z.; Sun, M.; Liu, Y.; Zhu, X. Learning entity and relation embeddings for knowledge graph completion. In Proceedings of the Twenty-Ninth AAAI Conference on Artificial Intelligence, Austin, TX, USA, 25–30 January 2015.
- 39. Ji, G.; He, S.; Xu, L.; Liu, K.; Zhao, J. Knowledge graph embedding via dynamic mapping matrix. In Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing (Volume 1: Long Papers); Association for Computational Linguistics: Beijing, China, 2015; pp. 687–696.
- 40. Tu, C.; Yang, C.; Liu, Z.; Sun, M. Network representation learning: An overview. Sci. Sin. Informationis 1998, 43, 1681.
- 41. Kipf, T.N.; Welling, M. Semi-supervised classification with graph convolutional networks. arXiv 2016, arXiv:1609.02907.
- 42. Veličković, P.; Cucurull, G.; Casanova, A.; Romero, A.; Liò, P.; Bengio, Y. Graph attention networks. arXiv 2017, arXiv:1710.10903.
- 43. Zhang, C.; Zhang, K.; Yuan, Q.; Peng, H.; Zheng, Y.; Hanratty, T.; Wang, S.; Han, J. Regions, periods, activities: Uncovering urban dynamics via cross-modal representation learning. In Proceedings of the 26th International Conference on World Wide Web, Perth, Australia, 3–7 April 2017; pp. 361–370.
- 44. Zhao, P.; Han, J.; Sun, Y. P-rank: A comprehensive structural similarity measure over information networks. In Proceedings of the 18th ACM Conference on Information and Knowledge Management, Hong Kong, China, 2–6 November 2009; pp. 553–562.
- 45. Fu, Y.; Wang, P.; Du, J.; Wu, L.; Li, X. Efficient region embedding with multi-view spatial networks: A perspective of locality-constrained spatial autocorrelations. In Proceedings of the AAAI Conference on Artificial Intelligence, Honolulu, HI, USA, 27 January–1 February 2019; pp. 906–913.
- 46. Liu, K.; Gao, S.; Qiu, P.; Liu, X.; Yan, B.; Lu, F. Road2vec: Measuring traffic interactions in urban road system from massive travel routes. ISPRS Int. J. Geo-Inf. 2017, 6, 321.
- 47. Jia, H.; Chen, M.; Huang, W.; Zhao, K.; Gong, Y. Learning hierarchy-enhanced POI category representations using disentangled mobility sequences. In Proceedings of the Thirty-Third International Joint Conference on Artificial Intelligence, Jeju, Republic of Korea, 3–9 August 2024; pp. 2090–2098.
- 48. Li, Y.; Chen, T.; Luo, Y.; Yin, H.; Huang, Z. Discovering collaborative signals for next POI recommendation with iterative Seq2Graph augmentation. arXiv 2021, arXiv:2106.15814.
- 49. Wang, M.X.; Lee, W.C.; Fu, T.Y. On representation learning for road networks. ACM Trans. Intell. Syst. Technol. (TIST) 2020, 12, 1–27.
- 50. Wang, H. Application study on basic geographic elements in big data environment. Geomat. Spat. Inf. Technol. 2015, 38, 191–193.
- 51. Budanitsky, A.; Hirst, G. Evaluating wordnet-based measures of lexical semantic relatedness. Comput. Linguist. 2006, 32, 13–47.
- 52. Resnik, P. Using information content to evaluate semantic similarity in a taxonomy. arXiv 1995, arXiv:cmp-lg/9511007.
Node Number | Osm_id | Code | Fclass |
---|---|---|---|
1 | 855690625 | 2505 | department-store |
2 | 8526320982 | 2301 | restaurant |
3 | 3661152326 | 2525 | mobile-phone-shop |
4 | 4637452293 | 2301 | restaurant |
5 | 4637452294 | 2601 | bank |
6 | 855690289 | 1500 | building |
Geographical Feature Category Pair | Similarity from Our Model | Standard Model Similarity | Difference (Standard − Ours) |
---|---|---|---|
school–theater (cinema) | 0.4832 | 0.4833 | 0.0001 |
gymnasium–golf course | 0.7448 | 0.7458 | 0.0010 |
supermarket–hotel (restaurant) | 0.6871 | 0.6839 | −0.0032 |
swimming pool–open-air stadium | 0.7304 | 0.7458 | 0.0154 |
expressway–light rail | 0.4416 | 0.4953 | 0.0537 |
train station–airport | 0.5083 | 0.4128 | −0.0955 |
latrine–guesthouse (restaurant) | 0.5682 | 0.4085 | −0.1597 |
park–theater (cinema) | 0.5352 | 0.2975 | −0.2377 |
dike–gate | 0.5601 | 0.3073 | −0.2528 |
water plant–sewage treatment plant | 0.7346 | 0.3774 | −0.3572 |
Standard Deviation | Maximum | Minimum | Median |
---|---|---|---|
0.0940 | 0.3637 | 0 | 0.1092 |
Data | W | p | r |
---|---|---|---|
all | 6519 | <0.001 | 0.151 |
first part | 608 | <0.001 | 0.030 |
second part | 1959 | 0.386 | - |
Buffer (m) | 10 | 25 | 50 | 100 | 200 |
---|---|---|---|---|---|
Value | 0.5097 | 0.6469 | 0.7487 | 0.5536 | 0.5280 |
© 2025 by the authors. Published by MDPI on behalf of the International Society for Photogrammetry and Remote Sensing. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Tan, Y.; Wang, H.; Cai, R.; Gao, L.; Yu, Z.; Li, X. Spatial Proximity Relations-Driven Semantic Representation for Geospatial Entity Categories. ISPRS Int. J. Geo-Inf. 2025, 14, 233. https://doi.org/10.3390/ijgi14060233