Intelligent Retrieval Method for Geospatial Data Aimed at Data Trading
Abstract
1. Introduction
2. Materials and Methods
2.1. Geospatial Data Source Analysis
2.1.1. Industry Websites
2.1.2. Domain-Specific Materials
2.1.3. Online Encyclopedias
2.2. Knowledge Graph Construction for Data Trading
- The Foundation Module contains the Basic Information class, which represents the fundamental information of a data resource, including its name and ID.
- The Intrinsic Content Module comprises the Temporal Extent class, the Spatial Extent class, and the Content Feature class. These classes express the characteristics of the data resource along the temporal, spatial, and thematic dimensions. The Temporal Extent class denotes the valid time interval of the geographical phenomenon or event recorded by the data resource. The Spatial Extent class describes the geographical area covered by the data resource and is subdivided into Physical Geographical Object classes (e.g., Tibetan Plateau and East Lake) and Human Geographical Object classes (e.g., Wuhan City and Guangdong–Hong Kong–Macao Greater Bay Area). A single data resource can be associated with one or more spatial extents. The Content Feature class characterizes the thematic content recorded by the data.
- The Form Module consists of the Topic Category class, Data Format class, Spatial Reference class, Spatial Accuracy class, and Temporal Accuracy class. In this study, temporal accuracy is subdivided into Temporal Resolution (e.g., yearly and daily) and Temporal Scale (e.g., annual average and daily average), and spatial accuracy is subdivided into Spatial Resolution, Scale, and Spatial Scale (e.g., provincial level and city level). Spatial Reference is divided into the Geographic Coordinate System and the Projected Coordinate System.
- The Provenance Module includes the Data Publisher class, Processing Method class, Source Description class, Quality Description class, Observation Platform class, and Observation Instrument class. This module describes information related to the production process of the data resource. Users often weigh this information heavily when choosing data; for instance, they may prefer authoritative data published by national agencies when choosing administrative division data, or seek data resources processed from raw data, such as local statistical yearbooks, when searching for economic development data for a specific location.
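
The four modules listed above can be pictured as a nested record. The following is a minimal sketch only: the class and field names are hypothetical Python renderings of the ontology classes named in the text, not the authors' schema, and the example values are illustrative.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class Foundation:
    """Foundation Module: basic information (name and ID)."""
    resource_id: str
    name: str


@dataclass
class IntrinsicContent:
    """Intrinsic Content Module: temporal, spatial, and thematic content."""
    temporal_extent: Optional[Tuple[str, str]] = None  # (start, end) as ISO dates
    spatial_extents: List[str] = field(default_factory=list)  # one or more extents
    content_features: List[str] = field(default_factory=list)


@dataclass
class Form:
    """Form Module: how the data are organised and referenced."""
    topic_category: Optional[str] = None
    data_format: Optional[str] = None
    spatial_reference: Optional[str] = None
    spatial_accuracy: Optional[str] = None
    temporal_accuracy: Optional[str] = None


@dataclass
class Provenance:
    """Provenance Module: how the data were produced."""
    publisher: Optional[str] = None
    processing_method: Optional[str] = None
    source_description: Optional[str] = None
    quality_description: Optional[str] = None
    observation_platform: Optional[str] = None
    observation_instrument: Optional[str] = None


@dataclass
class DataResource:
    """One data resource described by all four modules."""
    foundation: Foundation
    intrinsic: IntrinsicContent = field(default_factory=IntrinsicContent)
    form: Form = field(default_factory=Form)
    provenance: Provenance = field(default_factory=Provenance)


# Illustrative instance; the ID and dates are made up for the example.
example = DataResource(
    foundation=Foundation("demo-001", "Example NPP raster dataset"),
    intrinsic=IntrinsicContent(
        temporal_extent=("2001-01-01", "2022-12-31"),
        spatial_extents=["Poyang Lake Basin"],
        content_features=["Vegetation", "Net Primary Productivity"],
    ),
    form=Form(spatial_reference="WGS84"),
    provenance=Provenance(observation_instrument="MODIS"),
)
```

Keeping `spatial_extents` as a list reflects the statement that one data resource can be associated with several spatial extents.
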
- (1) For structured and semi-structured metadata in geospatial data, which typically include attributes such as Spatial Resolution, Data Source, and Data Theme, this study used rule-based mapping and direct transformation to match metadata fields with the corresponding relations in the ontology model and convert them into entities in the knowledge graph.
- (2) For unstructured metadata in geospatial data, where certain attributes are embedded in free text such as data abstracts and descriptions, this study used a combined BERT + GlobalPointer model to perform Named Entity Recognition (NER) on that text. The BERT model encoded the original text and the GlobalPointer network decoded it, outputting the typed entities identified in the text to supplement the knowledge graph.
- (3) Following entity recognition, the next step was to establish spatial relationships between the identified entities. This study used unstructured geographical description texts, sourced primarily from industry websites, domain-specific materials, and online encyclopedias, as the main corpus for extracting spatial topological relations in the form of Subject–Predicate–Object (SPO) triplets [20].
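
Step (1) above, rule-based mapping of structured metadata, can be sketched as a lookup table from metadata field to ontology relation. This is an illustration under assumptions: the relation names (`hasSpatialResolution`, `hasSource`, `hasTheme`, `instanceOf`) and the rule table are hypothetical, and the paper does not specify its rules at this level of detail.

```python
from typing import Dict, List, Tuple

Triple = Tuple[str, str, str]

# Hypothetical rules: metadata field -> (relation name, ontology class).
FIELD_RULES: Dict[str, Tuple[str, str]] = {
    "Spatial Resolution": ("hasSpatialResolution", "Spatial Accuracy"),
    "Data Source": ("hasSource", "Source Description"),
    "Data Theme": ("hasTheme", "Topic Category"),
}


def metadata_to_triples(resource_id: str, metadata: Dict[str, object]) -> List[Triple]:
    """Convert mapped metadata fields into Subject-Predicate-Object triples.

    Fields without a rule are skipped; list values yield one entity per item.
    """
    triples: List[Triple] = []
    for field_name, value in metadata.items():
        rule = FIELD_RULES.get(field_name)
        if rule is None:
            continue  # no mapping rule for this field
        relation, ontology_class = rule
        values = value if isinstance(value, list) else [value]
        for item in values:
            entity = str(item)
            triples.append((resource_id, relation, entity))
            triples.append((entity, "instanceOf", ontology_class))
    return triples


record = {
    "Spatial Resolution": "500 m",
    "Data Theme": ["Vegetation", "Land Surface"],
    "Contact Person": "Data Sharing Working Group",  # unmapped, ignored here
}
triples = metadata_to_triples("res-1", record)
```

Entities found by NER in step (2) and relations extracted in step (3) could be appended to the same triple list before loading it into a graph store.
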
2.3. Intelligent Retrieval of Geospatial Data Based on Trading Value
2.3.1. User Requirement Extraction Based on an LLM
2.3.2. Requirement–Entity Mapping in the Knowledge Graph
2.3.3. GDR Based on Multi-Hop Reasoning
- (1) Each requirement point in the original requirement lists for “Spatial Extent,” “Data Theme,” and “Observation System” is mapped to the corresponding entities in the geospatial data knowledge graph.
- (2) A multi-hop query is run on the knowledge graph. It starts from the entities corresponding to the original requirements and traverses along specified relationship types. All entities and relationships discovered by this query are merged to form the associated requirements, from which a subgraph of the knowledge graph is produced.
- (3) Based on this subgraph, the “Basic Information” entities and all their directly connected entities in the geospatial data knowledge graph are queried. This step retrieves the sets of geospatial data that correspond to the expanded “Spatial Extent,” “Data Theme,” and “Observation System” requirements.
- (4) The resulting geospatial datasets are then filtered flexibly against the “Data Format,” “Temporal Extent,” “Spatial Accuracy,” and “Temporal Accuracy” requirements. Flexible filtering improves retrieval coverage and yields a collection of geospatial data that may satisfy the various requirements defined by users.
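
The four steps above can be sketched on a toy graph held in plain Python dictionaries. Everything here is assumed for illustration: the entities, the relation names (`contains`, `narrower`, `hasSpatialExtent`, `hasTheme`), the hop limit, and the reading of "flexible filtering" as keeping datasets whose format is unknown instead of discarding them.

```python
from collections import deque
from typing import Dict, List, Optional, Set, Tuple

# Toy knowledge graph: entity -> [(relation, neighbour)].
GRAPH: Dict[str, List[Tuple[str, str]]] = {
    "Guangdong Province": [("contains", "Pearl River Delta")],
    "Land Use": [("narrower", "Construction Land"), ("narrower", "Woodland")],
    "ds-1": [("hasSpatialExtent", "Guangdong Province"), ("hasTheme", "Land Use")],
    "ds-2": [("hasSpatialExtent", "Pearl River Delta"), ("hasTheme", "Land Use")],
    "ds-3": [("hasSpatialExtent", "Guangdong Province"), ("hasTheme", "Woodland")],
    "ds-4": [("hasSpatialExtent", "Henan Province"), ("hasTheme", "Land Use")],
}
DATASET_FORMAT: Dict[str, str] = {"ds-1": "GeoTIFF", "ds-2": "Shapefile", "ds-3": "GeoTIFF"}


def expand(seeds: Set[str], allowed: Set[str], max_hops: int) -> Set[str]:
    """Step 2: breadth-first expansion along the allowed relation types."""
    seen = set(seeds)
    queue = deque((seed, 0) for seed in seeds)
    while queue:
        entity, depth = queue.popleft()
        if depth == max_hops:
            continue
        for relation, neighbour in GRAPH.get(entity, []):
            if relation in allowed and neighbour not in seen:
                seen.add(neighbour)
                queue.append((neighbour, depth + 1))
    return seen


def candidate_datasets(spatial: Set[str], themes: Set[str]) -> List[str]:
    """Step 3: datasets linked to an expanded extent and an expanded theme."""
    hits = []
    for entity, edges in GRAPH.items():
        extents = {n for r, n in edges if r == "hasSpatialExtent"}
        topics = {n for r, n in edges if r == "hasTheme"}
        if extents & spatial and topics & themes:
            hits.append(entity)
    return sorted(hits)


def flexible_filter(datasets: List[str], wanted_format: Optional[str]) -> List[str]:
    """Step 4: drop only datasets whose known format contradicts the request."""
    if wanted_format is None:
        return datasets
    return [d for d in datasets if DATASET_FORMAT.get(d, wanted_format) == wanted_format]


spatial = expand({"Guangdong Province"}, {"contains"}, max_hops=2)
themes = expand({"Land Use"}, {"narrower"}, max_hops=2)
candidates = candidate_datasets(spatial, themes)
filtered = flexible_filter(candidates, "GeoTIFF")
```

In this toy run the expansion lets `ds-2` (Pearl River Delta) and `ds-3` (Woodland) enter the candidate set even though neither matches the original requirement literally.
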
2.3.4. Retrieval Result Ranking Based on Trading Value
3. Results
3.1. Data Preprocessing
3.2. Construction of the Geospatial Data Knowledge Graph
3.3. Geospatial Data Retrieval
4. Discussion
5. Conclusions
Author Contributions
Funding
Data Availability Statement
Acknowledgments
Conflicts of Interest
References
- Zhao, X.; Chen, H.; Xing, Z.; Miao, C. Brain-Inspired Search Engine Assistant Based on Knowledge Graph. IEEE Trans. Neural Netw. Learn. Syst. 2021, 34, 4386–4400.
- Wang, Y.; Lipka, N.; Rossi, R.A.; Siu, A.; Zhang, R.; Derr, T. Knowledge Graph Prompting for Multi-Document Question Answering. In Proceedings of the AAAI Conference on Artificial Intelligence, Vancouver, BC, Canada, 2–9 February 2024; Volume 38, pp. 19206–19214.
- Fan, X.; Ji, Y.; Hui, B. A Dynamic Preference Recommendation Model Based on Spatiotemporal Knowledge Graphs. Complex Intell. Syst. 2025, 11, 46.
- Neumaier, S. Semantic Enrichment of Open Data on the Web—Or: How to Build an Open Data Knowledge Graph. Ph.D. Thesis, Technische Universität Wien, Vienna, Austria, 2019.
- Färber, M.; Lamprecht, D. The Dataset Knowledge Graph: Creating a Linked Open Data Source for Datasets. Quant. Sci. Stud. 2021, 2, 1324–1355.
- Lee, J.; Park, J. An Approach to Constructing a Knowledge Graph Based on Korean Open-Government Data. Appl. Sci. 2019, 9, 4095.
- Xiao, X.; Wang, P.; Ge, Y.; Luo, J.; Chen, H.; He, Y.; Lin, H. GeoKG-HSA: A Framework for Habitat Suitability Assessment with Geospatial Knowledge Graphs. Int. J. Appl. Earth Obs. Geoinform. 2025, 144, 104921.
- Liu, J.; Liu, H.; Chen, X.; Guo, X.; Zhao, Q.; Li, J.; Kang, L.; Liu, J. A Heterogeneous Geospatial Data Retrieval Method Using Knowledge Graph. Sustainability 2021, 13, 2005.
- Aghaei, S.; Angele, K.; Huaman, E.; Bushati, G.; Schiestl, M.; Fensel, A. Interactive Search on the Web: The Story So Far. Information 2022, 13, 324.
- Miao, L.; Liu, C.; Fan, L.; Kwan, M.-P. An OGC Web Service Geospatial Data Semantic Similarity Model for Improving Geospatial Service Discovery. Open Geosci. 2021, 13, 245–261.
- Li, W.; Goodchild, M.F.; Raskin, R. Towards Geospatial Semantic Search: Exploiting Latent Semantic Relations in Geospatial Data. Int. J. Digit. Earth 2014, 7, 17–37.
- Ning, H.; Li, Z.; Akinboyewa, T.; Lessani, M.N. An Autonomous GIS Agent Framework for Geospatial Data Retrieval. Int. J. Digit. Earth 2025, 18, 2458688.
- Wiegand, N.; García, C. A Task-Based Ontology Approach to Automate Geospatial Data Retrieval. Trans. GIS 2007, 11, 355–376.
- Hou, Z.; Zhu, Y.; Gao, X.; Luo, K.; Wang, D.; Sun, K. A Chinese Geological Time Scale Ontology for Geodata Discovery. In Proceedings of the 23rd International Conference on Geoinformatics (GEOINFORMATICS 2015), Wuhan, China, 18–20 June 2015; IEEE: Piscataway, NJ, USA, 2015; pp. 1–5.
- Sun, K.; Zhu, Y.; Pan, P.; Hou, Z.; Wang, D.; Li, W.; Song, J. Geospatial Data Ontology: The Semantic Foundation of Geospatial Data Integration and Sharing. Big Earth Data 2019, 3, 269–296.
- Li, H.; Yue, P.; Wu, H.; Teng, B.; Zhao, Y.; Liu, C. A Question-Answering Framework for Geospatial Data Retrieval Enhanced by a Knowledge Graph and Large Language Models. Int. J. Digit. Earth 2025, 18, 2510566.
- Stančin, I.; Jović, A. An Overview and Comparison of Free Python Libraries for Data Mining and Big Data Analysis. In Proceedings of the 42nd International Convention on Information and Communication Technology, Electronics and Microelectronics (MIPRO), Opatija, Croatia, 20–24 May 2019; IEEE: Piscataway, NJ, USA, 2019; pp. 977–982.
- Sun, K.; Zhu, Y.Q.; Pan, P.; Luo, K.; Wang, D.X.; Hou, Z.W. Morphology-Ontology of Geospatial Data and Its Application in Data Discovery. In Proceedings of the 23rd International Conference on Geoinformatics (GEOINFORMATICS 2015), Wuhan, China, 18–20 June 2015; IEEE: Piscataway, NJ, USA, 2015; pp. 1–6.
- Bogdanović, M.; Stanimirović, A.; Stoimenov, L. Methodology for Geospatial Data Source Discovery in Ontology-Driven Geo-Information Integration Architectures. J. Web Semant. 2015, 32, 1–15.
- Wang, Y.; Yu, B.; Zhang, Y.; Liu, T.; Zhu, H.; Sun, L. TPLinker: Single-Stage Joint Extraction of Entities and Relations Through Token Pair Linking. arXiv 2020, arXiv:2010.13415.
- Joulin, A.; Grave, E.; Bojanowski, P.; Douze, M.; Jégou, H.; Mikolov, T. Fasttext.zip: Compressing Text Classification Models. arXiv 2016, arXiv:1612.03651.
- Jiang, S.; Hagelien, T.F.; Natvig, M.; Li, J. Ontology-Based Semantic Search for Open Government Data. In Proceedings of the 2019 IEEE 13th International Conference on Semantic Computing (ICSC), Newport Beach, CA, USA, 3–5 February 2019; IEEE: Piscataway, NJ, USA, 2019; pp. 7–15.
- Goel, A.; Gueta, A.; Gilon, O.; Liu, C.; Erell, S.; Nguyen, L.H.; Hao, X.; Jaber, B.; Reddy, S.; Kartha, R.; et al. LLMs Accelerate Annotation for Medical Information Extraction. In Proceedings of the Machine Learning for Health (ML4H), Baltimore, MD, USA, 12–13 December 2023; PMLR: Cambridge, MA, USA, 2023; pp. 82–100. Available online: https://proceedings.mlr.press/v225/goel23a.html (accessed on 20 April 2025).
- Rong, J.; Chen, H.; Chen, T.; Yu, X.; Liu, Y. Retrieval-Enhanced Visual Prompt Learning for Few-Shot Classification. arXiv 2023, arXiv:2306.02243.
- Liu, A.; Feng, B.; Xue, B.; Wang, B.; Wu, B.; Lu, C.; Zhao, C.; Deng, C.; Zhang, C.; Ruan, C.; et al. DeepSeek-V3 Technical Report. arXiv 2024, arXiv:2412.19437.
- Zhao, W.X.; Zhou, K.; Li, J.; Tang, T.; Wang, X.; Hou, Y.; Min, Y.; Zhang, B.; Zhang, J.; Dong, Z.; et al. A Survey of Large Language Models. arXiv 2023, arXiv:2303.18223.
- Wei, J.; Wang, X.; Schuurmans, D.; Bosma, M.; Ichter, B.; Xia, F.; Chi, E.; Le, Q.; Zhou, D. Chain-of-Thought Prompting Elicits Reasoning in Large Language Models. Adv. Neural Inf. Process. Syst. 2022, 35, 24824–24837.
- Reinanda, R.; Meij, E.; de Rijke, M. Knowledge Graphs: An Information Retrieval Perspective. Found. Trends Inf. Retr. 2020, 14, 289–444.
- Ristad, E.S.; Yianilos, P.N. Learning String-Edit Distance. IEEE Trans. Pattern Anal. Mach. Intell. 1998, 20, 522–532.
- Lv, Y.; Lu, B. A Normalized Levenshtein Distance Metric. IEEE Trans. Pattern Anal. Mach. Intell. 2007, 29, 1091–1095.
- Lan, Y.; Jiang, J. Query Graph Generation for Answering Multi-Hop Complex Questions from Knowledge Bases. In Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics (ACL 2020), Seattle, WA, USA, 5–10 July 2020; Association for Computational Linguistics: Stroudsburg, PA, USA, 2020; pp. 3456–3466.
- Savona, M. The Value of Data: Towards a Framework to Redistribute It; Working Paper No. 2019-05; SPRU—Science and Technology Policy Research; University of Sussex: Brighton, UK, 2019; Available online: https://iris.luiss.it/handle/11385/198247 (accessed on 10 May 2025).
- Attard, J.; Brennan, R. A Semantic Data Value Vocabulary Supporting Data Value Assessment and Measurement Integration. Data Policy 2018, 1, 133–144.
- Yan, C.; Lu, G.; Liu, Y.; Deng, X. A Modified PSO Algorithm with Exponential Decay Weight. In Proceedings of the 2017 13th International Conference on Natural Computation, Fuzzy Systems and Knowledge Discovery (ICNC-FSKD), Guilin, China, 29–31 July 2017; IEEE: Piscataway, NJ, USA, 2017; pp. 239–242.
- Pan, H.; Zhang, Q.; Dragut, E.; Caragea, C.; Jan Latecki, L. DMDD: A Large-Scale Dataset for Dataset Mentions Detection. Trans. Assoc. Comput. Linguist. 2023, 11, 1132–1146.
- Wang, C.; Qiu, M.; Shi, C.; Zhang, T.; Liu, T.; Li, L.; Wang, J.; Wang, M.; Huang, J.; Lin, W. EasyNLP: A Comprehensive and Easy-to-Use Toolkit for Natural Language Processing. arXiv 2022, arXiv:2205.00258.
- Sejuti, Z.A.; Islam, M.S. A Hybrid CNN–KNN Approach for Identification of COVID-19 with 5-Fold Cross Validation. Sens. Int. 2023, 4, 100229.
- Chicco, D.; Jurman, G. The Advantages of the Matthews Correlation Coefficient (MCC) Over F1 Score and Accuracy in Binary Classification Evaluation. BMC Genom. 2020, 21, 6.









| Prefix | Examples |
|---|---|
| You are a professional assistant for assessing requirements related to geospatial data products. Please carefully analyze the input of the user and extract the following geospatial data requirement elements in JSON format: 1. Spatial_Extent: Geographical area descriptions, such as administrative division names, geographical entity names, or other descriptions referring to specific real-world spatial extents. 2. Temporal_Extent: The temporal coverage of the data. 3. Data_Format: The required file format(s). 4. Spatial_Accuracy: Descriptions of the data’s spatial resolution, scale, or spatial measurement scale. 5. Temporal_Accuracy: Descriptions of the temporal resolution or temporal measurement scale of the data. 6. Data_Theme: The involved geospatial data content. 7. Observation_System: The name of the sensor or the platform carrying the sensor. | Requirement: “I need monthly MODIS vegetation index data for the Tibetan Plateau region and Fuzhou City from the past three years, with a spatial resolution finer than 1 km, preferably in GeoTIFF or JPEG format.” Output: { “Spatial_Extent”: [“Tibetan Plateau region”, “Fuzhou_City”], “Temporal_Extent”: [“past three years”], “Data_Format”: [“GeoTIFF”, “JPEG”], “Spatial_Accuracy”: [“<1000 m”], “Temporal_Accuracy”: [“monthly”], “Data_Theme”: [“vegetation index”], “Observation_System”: [“MODIS”] } |
| | Requirement: “Please help me find air temperature data for Shanghai recorded at ten-day intervals from June to September 2019.” Output: { “Spatial_Extent”: [“Shanghai”], “Temporal_Extent”: [“June–September 2019”], “Data_Format”: [], “Spatial_Accuracy”: [], “Temporal_Accuracy”: [“ten-day”], “Data_Theme”: [“air temperature”], “Observation_System”: [] } |
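
The prompt above asks the model for a JSON object with seven fixed keys. A small validation step on the returned string might look like the sketch below. It is not part of the described method: the function name `parse_requirements` is hypothetical, and it assumes the model returns standard JSON with straight double quotes (the typographic quotes shown in the table would not parse).

```python
import json
from typing import Dict, List

# The seven requirement elements named in the prompt.
REQUIREMENT_KEYS = [
    "Spatial_Extent", "Temporal_Extent", "Data_Format", "Spatial_Accuracy",
    "Temporal_Accuracy", "Data_Theme", "Observation_System",
]


def parse_requirements(llm_output: str) -> Dict[str, List[str]]:
    """Parse the model's answer and normalise it to the seven expected keys.

    Missing keys become empty lists; a bare string is wrapped in a list.
    """
    data = json.loads(llm_output)
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")
    result: Dict[str, List[str]] = {}
    for key in REQUIREMENT_KEYS:
        value = data.get(key, [])
        if isinstance(value, str):
            value = [value]
        if not isinstance(value, list):
            raise ValueError(f"{key} must be a list or a string")
        result[key] = [str(item) for item in value]
    return result


raw = '{"Spatial_Extent": ["Shanghai"], "Data_Theme": "air temperature"}'
parsed = parse_requirements(raw)
```

Normalising every answer to the same seven list-valued keys keeps the later requirement-to-entity mapping free of special cases for absent or scalar fields.
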
| Prefix | Example 1 | Example 2 |
|---|---|---|
| You are a professional assistant for geospatial data product requirement analysis. Please carefully assess the input data requirement defined by the user and the included temporal extent description(s). Modify these temporal extent description(s) according to the following requirements and output in the specified format. Requirement: The output must include the year, month, and day information for the start date and the end date. Output Format: { “temporalRanges”: [ {“start_date”: “yyyy-MM-dd”, “end_date”: “yyyy-MM-dd”} ] } | User Data Requirement: I need rainfall data for Wuhan from last March. Temporal Extent Requirement: last March. Reasoning: 1. “Last March” is a relative temporal description that does not meet the specified output requirements and needs to be modified. 2. The current year is {current_year}, such that “last March” corresponds to March 2024. The specific daily temporal extent should be from 1 March 2024 to 31 March 2024. 3. According to the format requirements, the output should be {start_date: “2024-03-01”, end_date: “2024-03-31”}. Output: {“temporalRanges”: [ {“start_date”: “2024-03-01”, “end_date”: “2024-03-31”} ]} | User Data Requirement: I need rainfall distribution data for the recent major torrential rain event in Zhengzhou. Temporal Extent Requirement: Recent. Reasoning: 1. “Recent” does not contain specific year, month, or day information, which does not meet the output specification requirements for temporal extent descriptions and needs to be corrected. 2. “Recent” refers to a time period close to the current date, using the current date as a baseline. The current date needs to be obtained; currently, it is 23 July 2021. By default, one week is calculated forward and backward. Therefore, the required temporal extent is from 16 July 2021 to 30 July 2021. 3. According to the format requirements, the output should be {start_date: “2021-07-16”, end_date: “2021-07-30”}. Output: {“temporalRanges”: [ {“start_date”: “2021-07-16”, “end_date”: “2021-07-30”} ]} |
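
The two reasoning examples in the prompt above follow simple date arithmetic, which can be reproduced deterministically, for instance to sanity-check a model's answer. The sketch below is only an illustration of that arithmetic: the function names are hypothetical, and reading "last March" as March of the previous calendar year follows Example 1.

```python
import calendar
from datetime import date, timedelta
from typing import Dict


def last_year_month(month: int, today: date) -> Dict[str, str]:
    """Full extent of the given month in the year before `today` (Example 1)."""
    year = today.year - 1
    last_day = calendar.monthrange(year, month)[1]  # number of days in that month
    return {
        "start_date": date(year, month, 1).isoformat(),
        "end_date": date(year, month, last_day).isoformat(),
    }


def recent_window(today: date, days: int = 7) -> Dict[str, str]:
    """Window of `days` before and after `today` (Example 2 uses one week)."""
    return {
        "start_date": (today - timedelta(days=days)).isoformat(),
        "end_date": (today + timedelta(days=days)).isoformat(),
    }


example_1 = last_year_month(3, today=date(2025, 5, 1))
example_2 = recent_window(today=date(2021, 7, 23))
```

With a current date in 2025, `example_1` covers 2024-03-01 to 2024-03-31, and `example_2` covers 2021-07-16 to 2021-07-30, matching the outputs given in the two prompt examples.
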
| Metadata | Value | Ontology Class/ Data Property |
|---|---|---|
| Data Name | Poyang Lake Basin 500 m Vegetation Net Primary Productivity Raster Dataset (2001–2022) | Foundation Module/Data Name |
| Unique Identifier | 162253959959758 | Foundation Module/Unique Identifier |
| Data Abstract | This dataset covers the Poyang Lake Basin for the period of 2001–2022… | Foundation Module/Data Abstract |
| Spatial Location | “Poyang Lake Basin” | Physiographic Object/Anthropographic Object |
| Temporal Coverage | [“2000-12-31”, “2021-12-31”] | Temporal Extent/Start Time and End Time |
| Thematic Classification | “Land Surface,” “Ecological Science Data,” “Vegetation” | Primary, Secondary, Tertiary Theme |
| Keywords | “Vegetation,” “Net Primary Productivity,” “NPP” | Content Features |
| Data Source | Data sourced from MODIS, tile grid… | Data Source |
| Data Generation/ Processing Method | Using the MRT tool to mosaic MOD17A3HGF NPP data… | Processing Method |
| Spatial Projection | WGS84 Coordinate System | Geographic Coordinate System/Projected Coordinate System |
| Data Quality Statement | This dataset maintains quality consistent with the source dataset MODIS MOD17A3HGF.v006 | Quality Description |
| Contact Person | Data Sharing Working Group | Data Distributor/Contact Person |
| Phone Number | 025-******068 | Data Distributor/Contact Phone |
| Email Address | wdc@******.ac.cn | Data Distributor/Contact Email |
| Organization | Nanjing Institute of ****, Chinese Academy of Sciences | Data Distributor/Publishing Institution |
| Address | *****, Nanjing City | Data Distributor/Contact Address |
| Thesaurus Term | Ontology Class | Broader | Narrower |
|---|---|---|---|
| Solid Precipitation | Content Features | [“Atmospheric Precipitation”] | [“Hail”, “Snow”] |
| Jianghan Plain | Physiographic Object | / | / |
| Wuhan City | Anthropographic Object | / | / |
| Satellite | Hosted Sensor |
|---|---|
| GF-1 | PMS |
| GF-1 | WFV |
| GF-10 | PMS-2 |
| GF-11 | PMS-2 |
| Himawari-9 | AHI |
| Himawari-9 | DCS |
| ··· | ··· |
| Training Parameters | GlobalPointer | GPLinker |
|---|---|---|
| batch_size | 16 | 32 |
| optimizer | AdamW | AdamW |
| learning_rate | 2 × 10⁻⁵ | 1 × 10⁻⁵ |
| epoch | 20 | 20 |
| Evaluation Metrics | GlobalPointer | GPLinker |
|---|---|---|
| Precision (P) | 0.7051 | 0.7896 |
| Recall (R) | 0.7219 | 0.8272 |
| F1 | 0.7134 | 0.8079 |
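
In the table above, the last value in each model column is consistent with the harmonic mean of the two values above it, that is, with F1 computed from precision and recall. The short check below illustrates the formula F1 = 2PR / (P + R); the function name is hypothetical, and small differences in the fourth decimal are expected because the published precision and recall are themselves rounded.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall; 0.0 when both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Values taken from the table: (precision, recall, reported F1).
reported = {
    "GlobalPointer": (0.7051, 0.7219, 0.7134),
    "GPLinker": (0.7896, 0.8272, 0.8079),
}

# Absolute difference between recomputed and reported F1 for each model.
differences = {
    model: abs(f1_score(p, r) - f1)
    for model, (p, r, f1) in reported.items()
}
```
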
| Knowledge Graph Module | Entity Count | Relation Count |
|---|---|---|
| Foundation Module | 5000 | 0 |
| Intrinsic Content Module | 9602 | 43,983 |
| Form Module | 539 | 18,243 |
| Provenance Module | 16,530 | 18,312 |
| Total | 31,671 | 80,538 |
| Number | Source | Retrieval Query | Query Type | Requirement Points |
|---|---|---|---|---|
| 1 | Simulation | Land use data for Guangdong Province with precision not less than 1 km. | Precise | Spatial Extent, Spatial Accuracy, Data Theme |
| 2 | Simulation | Monthly vegetation index data for Henan Province. | Precise | Spatial Extent, Temporal Accuracy, Data Theme |
| 3 | Simulation | Vector road data for Sichuan Province after 2010. | Precise | Spatial Extent, Temporal Extent, Data Format, Data Theme |
| 4 | Simulation | Cyanobacteria bloom distribution data for the Taihu Lake region processed from MODIS imagery. | Precise | Observation System, Spatial Extent, Data Theme |
| 5 | Simulation | Daily air quality data for the Yangtze River Delta region. | Precise | Spatial Extent, Temporal Accuracy, Data Theme |
| 6 | Simulation | Amount of rainfall in Zhengzhou during the 7.20 torrential rain period. | Ambiguous | Spatial Extent, Temporal Extent, Data Theme |
| 7 | Simulation | Impervious surface data at county level for cities in northern Hubei. | Ambiguous | Spatial Extent, Spatial Accuracy, Data Theme |
| 8 | Simulation | Economic data for Shanghai in Excel format. | Ambiguous | Spatial Extent, Data Format, Data Theme |
| 9 | Simulation | Surface elevation data recorded in China in recent years. | Ambiguous | Spatial Extent, Temporal Extent, Data Theme |
| 10 | Simulation | Daily air quality records for the Beijing–Tianjin–Hebei region in 2016. | Ambiguous | Spatial Extent, Temporal Extent, Temporal Accuracy, Data Theme |
| 11 | Urban Planning Researcher | Find land cover change data for the Yangtze River Delta urban agglomeration in 2022 with 30 m resolution. | Precise | Spatial Extent, Temporal Extent, Spatial Accuracy, Data Theme |
| 12 | Oceanography Ph.D. Candidate | I need sea surface temperature data for Zhejiang Province, preferably in NetCDF format. | Precise | Spatial Extent, Data Format, Data Theme |
| 13 | Regional Economic Analyst | Help me look up night-time light remote sensing imagery for the Guangdong–Hong Kong–Macao Greater Bay Area from the past five years. | Precise | Spatial Extent, Temporal Extent, Data Theme |
| 14 | Municipal Design Engineer | Download the latest 1:2000 scale topographic map within the 6th Ring Road of Beijing. | Precise | Spatial Extent, Spatial Accuracy, Data Theme |
| 15 | Water Resources Science Researcher | Please provide hyperspectral remote sensing data for the Yellow River Basin around the flood season in 2021. | Precise | Spatial Extent, Temporal Extent, Data Theme, Spatial Accuracy |
| 16 | Ecological Environment Bureau Staff | I want some data on water quality monitoring points in the Yangtze River Basin, preferably recent data. | Ambiguous | Spatial Extent, Temporal Extent, Data Theme |
| 17 | Geography Department Undergraduate | Is there any data that can show the urban heat island effect for major cities in our country? | Ambiguous | Spatial Extent, Data Theme |
| 18 | Agricultural Company Technician | Look for data on soil pH of cultivated land in Northeast China. Older data is acceptable. | Ambiguous | Spatial Extent, Data Theme, Temporal Accuracy |
| 19 | Recent University Graduate | I need some global temperature data for my graduation thesis. | Ambiguous | Spatial Extent, Data Theme |
| 20 | Logistics Company Staff | Our company project requires a spatial distribution map of major ports in Europe. | Ambiguous | Spatial Extent, Data Theme |
| 21 | Energy Company Staff | I need to assess the site for a wind power project. Please help me find long-term data on wind speed and wind direction for Changsha City. | Ambiguous | Spatial Extent, Data Theme, Temporal Extent |
| 22 | Urban Planning Researcher | I’m doing urban park planning and need some reference data on green space distribution and population heat maps, preferably for larger cities. | Ambiguous | Spatial Extent, Data Theme |
| 23 | Science Video Producer | Where can I find very intuitive meteorological data? For example, how the wind blows or how clouds move. | Ambiguous | Data Theme, Data Format |
| 24 | Tourism Bureau Staff | Our Pingjiang County wants to develop tourism. Are there any ready-made maps of attraction distribution or data for tourism resource assessment we can reference? | Ambiguous | Spatial Extent, Data Theme |
| Data Resource | Original Requirements: (Anthropographic Object: Guangdong Province), (Spatial Resolution: 1000 m, ≥), (Content Features: Land Use) | | | DTVE | Ranking |
|---|---|---|---|---|---|
| Land Use Data of Guangdong Province at 1 km Resolution (2015) | 1 | 1 | 1 | 1 | 1 |
| 1 km Land Use Data of Guangdong Province (2010) | 1 | 1 | 1 | 1 | 1 |
| Land Use Data of Guangdong Province at 1 km Resolution (2005) | 1 | 1 | 1 | 1 | 1 |
| Land Use Data of Guangdong Province at 1 km Resolution (2013) | 1 | 1 | 1 | 1 | 1 |
| Land Use Construction Land Type Data of Guangdong Province at 1 km Raster Resolution (1995) | 1 | 1 | | 0.6065 | 5 |
| Land Use Data of the Pearl River Delta Urban Agglomeration at 1 km Resolution (2010) | 1 | 1 | | 0.6065 | 5 |
| Woodland Type Data of Guangdong Province at 1 km Raster Resolution (2000) | 1 | | | 0.6065 | 5 |
| Unused Land Type Data of China at 1 km Raster Resolution (2000) | 1 | | | 0.3679 | 8 |
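
The DTVE values in the table above (1, 0.6065, and 0.3679) coincide to four decimals with exp(0), exp(−0.5), and exp(−1). The sketch below reproduces them with an exponential decay. This is an observation about the numbers and not the paper's stated formula: the decay `rate` of 0.5 and the meaning of `penalty_steps` (for example, how many requirements are met only through associated entities) are assumptions made for illustration.

```python
import math


def decay_score(penalty_steps: int, rate: float = 0.5) -> float:
    """Exponentially decayed score; `rate` and `penalty_steps` are assumed."""
    if penalty_steps < 0:
        raise ValueError("penalty_steps must be non-negative")
    return math.exp(-rate * penalty_steps)


# Scores for zero, one, and two penalty steps, rounded as in the table.
scores = [round(decay_score(k), 4) for k in range(3)]
```

Under this assumption a resource that satisfies every original requirement directly keeps a score of 1, and each additional penalty step multiplies the score by exp(−0.5).
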
| Data Resource | Keyword Frequency | Ranking |
|---|---|---|
| Land Use Data of Guangdong Province at 1 km Resolution (2005) | 15 | 1 |
| 1:1 Million Scale Land Use Dataset of Guangdong Province (2005) | 14 | 2 |
| 1 km Gridded Land Use Data of Guangdong Province (2015) | 12 | 3 |
| 1 km Gridded Land Use Data of Guangdong Province (2010) | 9 | 4 |
| 1 km Gridded Land Use Data of Guangdong Province (2013) | 8 | 5 |
| Land Use Data of Henan Province at 100 m Resolution (2005) | 8 | 5 |
| Construction Land Type Data of Guangdong Province at 1 km Raster Resolution (1995) | 7 | 7 |
| Land Use Type Dataset of Chengdu-Chongqing Urban Agglomeration at 30 m Resolution (1986) | 7 | 7 |
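
The Ranking column in the table above follows standard competition ranking: tied keyword frequencies share a rank and the next rank is skipped (1, 2, 3, 4, 5, 5, 7, 7). A minimal sketch of that rule is shown below; the function name and the shortened resource labels are hypothetical.

```python
from typing import List, Tuple


def competition_rank(scored: List[Tuple[str, float]]) -> List[Tuple[str, float, int]]:
    """Rank items by descending score; ties share a rank, later ranks skip."""
    ordered = sorted(scored, key=lambda item: item[1], reverse=True)
    ranked: List[Tuple[str, float, int]] = []
    for position, (name, score) in enumerate(ordered, start=1):
        if ranked and score == ranked[-1][1]:
            rank = ranked[-1][2]  # same score as the previous item: share its rank
        else:
            rank = position  # otherwise the rank is the 1-based position
        ranked.append((name, score, rank))
    return ranked


# Keyword frequencies from the table, with placeholder labels.
frequencies = [("r1", 15), ("r2", 14), ("r3", 12), ("r4", 9),
               ("r5", 8), ("r6", 8), ("r7", 7), ("r8", 7)]
ranks = [rank for _, _, rank in competition_rank(frequencies)]
```
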
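| Query Number | Keyword Retrieval Method: P | Keyword Retrieval Method: R | Keyword Retrieval Method: F1 | Intelligent Retrieval Method: P | Intelligent Retrieval Method: R | Intelligent Retrieval Method: F1 |
|---|---|---|---|---|---|---|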
| 1 | 0.6592 | 0.7124 | 0.6848 | 0.8121 | 0.8324 | 0.8221 |
| 2 | 0.7037 | 0.6821 | 0.6928 | 0.8403 | 0.8175 | 0.8287 |
| 3 | 0.6311 | 0.6237 | 0.6274 | 0.7957 | 0.8076 | 0.8016 |
| 4 | 0.7317 | 0.7168 | 0.7241 | 0.8083 | 0.7814 | 0.7946 |
| 5 | 0.6793 | 0.6831 | 0.6812 | 0.8619 | 0.8438 | 0.8528 |
| 6 | 0.3226 | 0.3169 | 0.3197 | 0.7315 | 0.7193 | 0.7253 |
| 7 | 0.3577 | 0.3392 | 0.3483 | 0.8367 | 0.8251 | 0.8309 |
| 8 | 0.4172 | 0.4263 | 0.4217 | 0.7714 | 0.7621 | 0.7667 |
| 9 | 0.4073 | 0.3982 | 0.4027 | 0.7152 | 0.7291 | 0.7221 |
| 10 | 0.3381 | 0.3517 | 0.3448 | 0.8173 | 0.8067 | 0.8120 |
| 11 | 0.7014 | 0.6931 | 0.7129 | 0.7924 | 0.8416 | 0.8071 |
| 12 | 0.6289 | 0.7248 | 0.6782 | 0.8267 | 0.8123 | 0.8488 |
| 13 | 0.6352 | 0.6650 | 0.6495 | 0.8035 | 0.8617 | 0.7816 |
| 14 | 0.6597 | 0.6865 | 0.6996 | 0.8541 | 0.7890 | 0.8372 |
| 15 | 0.6223 | 0.6344 | 0.6218 | 0.7738 | 0.7949 | 0.8185 |
| 16 | 0.3349 | 0.3426 | 0.3755 | 0.7892 | 0.7488 | 0.7878 |
| 17 | 0.3518 | 0.4201 | 0.3628 | 0.7601 | 0.8186 | 0.7719 |
| 18 | 0.4123 | 0.3979 | 0.3967 | 0.7556 | 0.7764 | 0.7946 |
| 19 | 0.3375 | 0.3815 | 0.3512 | 0.7389 | 0.7485 | 0.7835 |
| 20 | 0.3952 | 0.3540 | 0.3581 | 0.8052 | 0.7847 | 0.7925 |
| 21 | 0.3614 | 0.4182 | 0.4246 | 0.8093 | 0.8169 | 0.8236 |
| 22 | 0.4267 | 0.3291 | 0.3814 | 0.8224 | 0.7556 | 0.7914 |
| 23 | 0.3130 | 0.33386 | 0.3039 | 0.7545 | 0.7303 | 0.7578 |
| 24 | 0.4098 | 0.4063 | 0.3712 | 0.8101 | 0.8123 | 0.7907 |
| Comprehensive Evaluation | 0.4933 | 0.5016 | 0.4973 | 0.7953 | 0.7924 | 0.7977 |
| Comparison of Geospatial Data Resource Management Systems | PANGAEA | ESDS | National Earth System Science Data Center | GDR (This Study) |
|---|---|---|---|---|
| Data Feature Dimensions | 9 | 12 | 16 | 20 |
| Conditional Filtering | √ | √ | √ | √ |
| Keyword-Based Retrieval | √ | √ | √ | √ |
| Semantic Retrieval | × | × | × | √ |
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2026 by the authors. Published by MDPI on behalf of the International Society for Photogrammetry and Remote Sensing. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license.
Share and Cite
Bo, J.; Li, W.; Liu, R.; Duan, M.; Ding, X.; Hu, C. Intelligent Retrieval Method for Geospatial Data Aimed at Data Trading. ISPRS Int. J. Geo-Inf. 2026, 15, 26. https://doi.org/10.3390/ijgi15010026

