Article

Geo-MRC: Dynamic Boundary Inference in Machine Reading Comprehension for Nested Geographic Named Entity Recognition

1 Faculty of Geomatics, Lanzhou Jiaotong University, Lanzhou 730070, China
2 National-Local Joint Engineering Research Center of Technologies and Applications for National Geographic State Monitoring, Lanzhou 730070, China
3 Key Laboratory of Science and Technology in Surveying & Mapping, Lanzhou 730070, China
* Author to whom correspondence should be addressed.
ISPRS Int. J. Geo-Inf. 2025, 14(11), 431; https://doi.org/10.3390/ijgi14110431
Submission received: 9 September 2025 / Revised: 26 October 2025 / Accepted: 1 November 2025 / Published: 2 November 2025

Abstract

Geographic Named Entity Recognition (Geo-NER) is a crucial task for extracting geography-related entities from unstructured text, and it plays an essential role in geographic information extraction and spatial semantic understanding. Traditional approaches typically treat Geo-NER as a sequence labeling problem, where each token is assigned a single label. However, this formulation struggles to handle nested entities effectively. To overcome this limitation, we propose Geo-MRC, an improved model based on a Machine Reading Comprehension (MRC) framework that reformulates Geo-NER as a question-answering task. The model identifies entities by predicting their start positions, end positions, and lengths, enabling precise detection of overlapping and nested entities. Specifically, it constructs a unified input sequence by concatenating a type-specific question (e.g., “What are the location names in the text?”) with the context. This sequence is encoded using BERT, followed by feature extraction and fusion through Gated Recurrent Units (GRU) and multi-scale 1D convolutions, which improve the model’s sensitivity to both multi-level semantics and local contextual information. Finally, a feed-forward neural network (FFN) predicts whether each token corresponds to the start or end of an entity and estimates the span length, allowing for dynamic inference of entity boundaries. Experimental results on multiple public datasets demonstrate that Geo-MRC consistently outperforms strong baselines, with particularly significant gains on datasets containing nested entities.

1. Introduction

Geo-NER is a fundamental task in information extraction that aims to identify and classify geography-related entities, such as place names, administrative regions, organizations, and, in certain contexts, person names, from unstructured text. Formally, Geo-NER aims to detect the textual boundaries of geographic entities and assign them semantic categories (for example, LOC, ORG, or PER), thereby transforming unstructured geographic information into structured representations that can be effectively integrated into Geographic Information Systems (GIS) and other spatial applications [1,2]. With the rapid development of the internet and mobile communication technologies, massive volumes of textual data (such as news articles, social media posts, and travel logs) contain abundant geographic information [3,4]. This information is critical for developing intelligent GIS, enhancing disaster response capabilities, and advancing location-aware services [5,6,7,8]. However, Geo-NER poses distinctive challenges that go beyond those in conventional NER tasks. First, geographic entities exhibit semantic ambiguity and polysemy, as the same name may refer to different spatial levels or even non-spatial concepts (e.g., “New York” as a city or a state) [9]. Second, Geo-NER often involves hierarchical and nested structures, such as “Shanghai-Pudong-Lujiazui”, where multiple spatial entities overlap or are embedded within one another [10]. Third, contextual dependence is strong: the same term (e.g., “Washington”) may denote a person or a place depending on surrounding text [11]. Fourth, geographic boundaries and entity granularity can vary across datasets and applications, introducing additional complexity. Therefore, developing accurate and efficient Geo-NER methods is essential for both geographic information extraction and downstream spatial data applications.
Over the past few decades, research on Geo-NER has progressed from rule-based and statistical models to neural and pre-trained language model approaches. Early rule-based and statistical methods effectively utilized handcrafted gazetteers and pattern matching to identify geographic entities [12], but their reliance on predefined rules limited generalization across domains. The introduction of neural architectures such as Bi-LSTM-CRF and attention-based models [13], as well as pre-trained language models like BERT, greatly enhanced contextual understanding and semantic representation [14,15]. Nevertheless, most of these models still formulate Geo-NER as a sequence labeling task [16], which makes it challenging to capture the complex and nested structures commonly observed in geographic texts. Despite the success of span-based and MRC-based frameworks [17,18], two key challenges remain unsolved. First, existing models typically utilize only the top-layer representations from pre-trained language models, which capture high-level semantics but lose fine-grained lexical and syntactic cues essential for recognizing long or context-sensitive entities. Second, conventional sequence labeling and MRC methods struggle to detect nested entities that share partial boundaries. For example, in “北京清华大学的科研成果… (Research results from Tsinghua University in Beijing)”, there are two ORG entities, “北京清华大学 (Tsinghua University in Beijing)” and “清华大学 (Tsinghua University)”, that share the same ending part, but existing models often identify only one. Similarly, in “…在上海浦东新区 (…in Shanghai’s Pudong New Area)”, two LOC entities, “上海 (Shanghai)” and “上海浦东新区 (Shanghai’s Pudong New Area)”, share the same beginning part, yet traditional models cannot recognize both simultaneously.
To address these challenges, we propose Geo-MRC, an MRC framework for Geo-NER that integrates multi-layer BERT representations, GRU-based semantic fusion, and multi-scale 1D convolutions to capture both global semantics and local contextual cues. Building on these enriched representations, we introduce a dynamic boundary inference mechanism that jointly predicts start positions, end positions, and span lengths, enabling complete entity span reconstruction from any two predictions without predefined decoding rules. This design effectively resolves nested and overlapping entities, including those sharing boundary tokens. The main contributions of this work are as follows:
(1)
We present Geo-MRC, a novel MRC-based framework that reformulates Geo-NER as a question-answering task, enabling robust recognition of nested and overlapping geographic entities.
(2)
We design a multi-layer feature extraction strategy that fuses BERT representations, GRU-based semantic fusion, and multi-scale 1D convolutions to capture both multi-level semantics and local contextual patterns.
(3)
We propose a dynamic boundary inference mechanism that jointly predicts start positions, end positions, and span lengths, allowing flexible and accurate entity span reconstruction.
(4)
We create manually augmented nested versions of four public Chinese and English datasets and conduct comprehensive experiments, achieving consistent improvements over strong MRC-based baselines on both mixed and nested benchmarks.
The remainder of this paper is organized as follows: Section 2 reviews related work. Section 3 details the proposed methodology. Section 4 presents the experimental setup, results, and discussions. Finally, Section 5 concludes the study and outlines future directions.

2. Related Work

Geo-NER has evolved considerably, spanning from early rule-based and statistical approaches to deep learning and pre-trained language model methods. Meanwhile, to address the challenges of nested and overlapping entities, MRC-based frameworks have emerged as an alternative paradigm for NER. In this section, we first review the development of Geo-NER methods and then discuss the progress of MRC-based NER approaches.

2.1. Geo-NER Methods

The research on Geo-NER has evolved from rule-driven to data-driven approaches, broadly categorized into three types: rule-based and statistical methods, traditional deep learning-based methods, and pre-trained language model-based methods.
  • Rule-based and statistical methods: Rule-based approaches rely on manually curated dictionaries and pattern-matching techniques to identify geographic entities [19,20]. For example, location names may be recognized through common suffixes, while organization names may be detected using domain-specific keywords [21]. These methods perform well in domain-specific scenarios but suffer from poor generalization due to their reliance on handcrafted rules. Moreover, because they depend heavily on surface forms, such methods are unable to capture contextual variations or nested relationships that are common in geographic text. Statistical methods, such as those based on Hidden Markov Models (HMM) and Conditional Random Fields (CRF) [22,23], enhance recognition accuracy by incorporating contextual features and the probabilistic distribution of labeled sequences. However, their feature engineering process is labor-intensive and insufficient for modeling complex spatial hierarchies or overlapping geographic entities, which are often encountered in real-world Geo-NER tasks.
  • Traditional deep learning-based methods: These methods have become the mainstream approach in Geo-NER in recent years. A notable example is the work by Lample et al. [24], who utilized Bidirectional Long Short-Term Memory networks (Bi-LSTM) to capture contextual dependencies, in conjunction with a Conditional Random Field (CRF) layer to ensure global label consistency. This architecture achieved significant success in general NER tasks. Within the Geo-NER domain, such methods have been further enhanced by integrating geographic gazetteers, attention mechanisms, external knowledge bases, and multi-task learning strategies to improve the identification of geographic entities [25,26,27,28]. Compared with traditional statistical models, deep learning-based approaches reduce the dependence on handcrafted features and demonstrate stronger capabilities in capturing context-sensitive geographic information. However, they still treat Geo-NER as a sequence labeling problem, where each token receives a single label. This design makes it difficult to detect nested and boundary-sharing entities such as “北京清华大学 (Tsinghua University in Beijing)” or “上海浦东新区 (Shanghai’s Pudong New Area),” where multiple spatial units overlap. In addition, as the volume of textual data continues to grow, traditional neural architectures begin to show limitations in handling large-scale corpora due to constrained modeling capacity and computational inefficiencies.
  • Pre-trained language model-based methods: These approaches leverage large-scale corpus pre-training to generate rich semantic representations, thereby significantly improving the accuracy of geographic entity recognition. For example, models such as BERT, ALBERT, ERNIE, BART, and XLNet have been integrated into traditional deep learning architectures to enhance contextual understanding in Geo-NER tasks [9,29,30,31,32]. In addition, some studies reformulate Geo-NER as a sequence-to-sequence task using encoder–decoder frameworks, enabling better modeling of complex dependencies and facilitating the recognition of flexible, nested, or ambiguous geographic expressions [33,34]. To further adapt to the unique characteristics of geographic texts, domain-specific pre-trained models such as GeoBERT [35] and CnGeoPLM [36] have been developed, demonstrating improved performance in Geo-NER. Compared to traditional deep learning methods, pre-trained language models excel at capturing long-distance dependencies, reduce reliance on large-scale annotated data, and demonstrate greater robustness and accuracy when handling complex geographical entities. However, despite these advances, most of these models still follow the sequence labeling paradigm and cannot effectively handle the multi-level semantic hierarchies or spatially overlapping entities inherent in geographic data. Therefore, new frameworks are needed to model the nested, boundary-sharing, and hierarchically structured characteristics of geographic entities more explicitly.

2.2. MRC-Based NER Methods

To address the limitations of traditional sequence labeling approaches in recognizing nested entities, MRC-based methods for NER have attracted increasing attention. These methods reformulate the NER task as a question-answering problem, where entity recognition is achieved by asking entity-type-specific questions. Specifically, an MRC-based method concatenates a question with the input text and then encodes the combined sequence using a pre-trained language model. The model predicts whether each token represents the start or end of an entity span, thereby determining the entity boundaries [14]. This approach overcomes the constraint of assigning a single label to each token in sequence labeling, allowing multiple roles for the same token and effectively supporting the recognition of nested entities.
This method has been widely applied in various domains and has demonstrated notable performance improvements. For instance, in the financial domain, they have been used to extract company names and financial product entities [37]; in the medical domain, to identify disease names, drug names, and symptoms from clinical records and biomedical literature [38]; in the legal domain, to detect victims, stolen items, and tools of crime in legal documents [39]; and in materials science, to extract material names, chemical compounds, and associated properties from scientific texts [40]. In general domains, MRC methods, combined with models such as BERT and LSTM, have been applied to entity extraction from social media and news data, significantly enhancing the ability to handle nested entities and high-noise text [41]. Compared with traditional approaches, the MRC framework reframes entity recognition as a context-aware inference task, enhancing model flexibility and partially mitigating annotation conflicts inherent in nested entity scenarios.
However, most existing MRC-based models rely solely on the final-layer output of pre-trained language models, without fully exploiting the rich semantic information distributed across multiple layers. This limitation weakens the ability of the models to recognize long or complex entities that require deeper contextual understanding. Furthermore, the lack of a dynamic boundary inference mechanism results in a rigid reliance on fixed start and end predictions, reducing the model’s flexibility in handling nested entities with overlapping or shared boundaries. These limitations highlight the need for further improvements to the MRC framework, which has consequently become a central focus of ongoing research.

3. Method

The MRC formalism treats named entity recognition as a span extraction task, where for each entity type, the model predicts the start and end positions of the answer span to a predefined question (e.g., “Which locations are mentioned in this text?”). While this formulation supports nested entities across types, it assumes that each token can belong to only one entity span per query and that start and end positions are predicted independently. Such assumptions are often violated in geographic texts, where entities frequently share prefixes, suffixes, or hierarchical relations. These limitations motivate Geo-MRC, an MRC-based framework that reformulates Geo-NER as a question-answering task and extends the standard MRC formulation with joint boundary inference and multi-layer semantic fusion to handle nested and overlapping geographic entity structures. As illustrated in Figure 1, the framework consists of three main components: (1) input representation construction, (2) multi-layer contextual feature extraction integrating BERT representations, GRU-based fusion, and multi-scale 1D convolutions, and (3) dynamic boundary inference that jointly predicts start positions, end positions, and span lengths to reconstruct complete entity spans.

3.1. Input Representation Construction

Geo-MRC reformulates the Geo-NER task as a machine reading comprehension problem, where entity recognition is achieved by posing predefined questions for each entity type. This formulation removes the restriction of assigning a single label to each token and naturally supports the recognition of nested and overlapping entities. To adapt the input for this framework, the model constructs a unified sequence by concatenating the question and the original text.
For the Geo-NER task, we design three question templates corresponding to the three geographic entity categories: location names, organization names, and person names, as shown in Table 1. This design allows the model to perform multiple predictions by asking different questions, enabling a single token to participate in multiple entity spans.
Given an input text sequence $T = [t_1, t_2, \ldots, t_m]$ (where $t_i$ denotes the i-th token in the text, and m is the text length) and a question sequence $Q = [q_1, q_2, \ldots, q_n]$ (where $q_i$ denotes the i-th token in the question, and n is the question length), the model constructs a unified input sequence by concatenating the question and the text as follows:
Input = [CLS] || Q || [SEP] || T || [SEP]
where [CLS] denotes the special classification token, [SEP] denotes the separator token between the question and the text, || denotes sequence concatenation, and the total length of the sequence is n + m + 3.
After constructing the unified sequence, we tokenize it with the BERT-base WordPiece tokenizer to ensure vocabulary consistency [42]. This produces three inputs to BERT: token IDs, which are vocabulary indices used for embedding lookup; segment IDs, where 0 indicates the question and 1 indicates the text so that the model can distinguish the two segments as in pre-training; and an attention mask, where 1 marks real tokens and 0 marks padding to prevent padding from influencing attention. These tokenized representations are then fed into BERT for multi-layer contextual feature extraction.
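As a concrete illustration of this pipeline, the following minimal sketch builds the unified question–context input with the Hugging Face transformers tokenizer; the model name, example sentences, and maximum length are illustrative assumptions rather than the paper's exact configuration.

```python
from transformers import BertTokenizer

# A stand-in for the BERT-base WordPiece tokenizer described in Section 3.1;
# the model name, sentences, and max_length are illustrative assumptions.
tokenizer = BertTokenizer.from_pretrained("bert-base-cased")

question = "What are the location names in the text?"
context = "Research results from Tsinghua University in Beijing were announced."

# Passing a sentence pair builds [CLS] Q [SEP] T [SEP] (length n + m + 3) and
# returns the three inputs described above: token IDs (vocabulary indices),
# segment IDs (token_type_ids: 0 for the question, 1 for the text), and the
# attention mask (1 for real tokens, 0 for padding).
encoded = tokenizer(
    question,
    context,
    padding="max_length",
    truncation=True,
    max_length=128,
    return_tensors="pt",
)
input_ids = encoded["input_ids"]
token_type_ids = encoded["token_type_ids"]
attention_mask = encoded["attention_mask"]
```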

3.2. Multi-Layer Contextual Feature Extraction

Building on the tokenized output from Section 3.1, we employ the BERT-base model to extract contextual representations from multiple layers. In particular, we first obtain hidden states from all 12 Transformer layers (Section 3.2.1), then integrate the upper six layers through a GRU-based semantic fusion mechanism (Section 3.2.2), and finally enhance the fused features using multi-scale 1D convolutions combined with element-wise feature integration (Section 3.2.3).

3.2.1. Multi-Layer BERT Representations

The tokenized sequence, represented by token IDs, segment IDs, and attention masks (see Section 3.1), is fed into the pre-trained BERT-base encoder (hidden dimension d = 768 and a total of L = 12 layers) to generate hierarchical contextual representations. Inside BERT, the input is first mapped into continuous vectors through token, position, and segment embeddings, and then processed by a stack of Transformer blocks. Let $H^{(l)} \in \mathbb{R}^{(n+m+3) \times 768}$ denote the output of the l-th Transformer layer, with $H^{(0)}$ as the embedding output. The forward computation is given by:
$$H^{(l)} = \mathrm{Transformer}_l\left(H^{(l-1)}\right), \quad l = 1, 2, \ldots, 12$$
To fully leverage the multi-layer semantic representations of BERT, we extract the outputs from its last six layers: $\{H^{(7)}, H^{(8)}, \ldots, H^{(12)}\}$. This selection is based on the following considerations: First, layers 7 to 12 progressively enhance semantic representation while preserving certain syntactic features, allowing for a better balance between semantic and structural information. Second, compared to utilizing all 12 layers, the last six layers reduce the computational cost of subsequent fusion while maintaining performance. Finally, BERT has demonstrated strong generalization capabilities across various NLP tasks [43,44,45], making it a suitable backbone for Geo-NER in this study.
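The layer extraction itself is straightforward with a standard BERT implementation. The sketch below, again assuming the Hugging Face transformers API, requests all hidden states and stacks layers 7 through 12; the example sentence is illustrative.

```python
import torch
from transformers import BertModel, BertTokenizer

tokenizer = BertTokenizer.from_pretrained("bert-base-cased")
bert = BertModel.from_pretrained("bert-base-cased", output_hidden_states=True)

encoded = tokenizer("What are the location names in the text?",
                    "Heavy rain flooded parts of Shanghai.",
                    return_tensors="pt")

with torch.no_grad():
    outputs = bert(**encoded)

# outputs.hidden_states is a tuple of 13 tensors: the embedding output H(0)
# followed by the 12 Transformer layer outputs H(1)..H(12), each of shape
# (batch, n + m + 3, 768).
last_six = torch.stack(outputs.hidden_states[7:13], dim=0)  # H(7)..H(12)
print(last_six.shape)  # (6, batch, seq_len, 768)
```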

3.2.2. GRU-Based Semantic Fusion

To address the limitation of MRC models that rely solely on the final BERT layer, we employ a GRU-based module to integrate the last six layers of BERT representations. As a lightweight recurrent architecture, the GRU introduces update and reset gates that control information flow and capture dependencies across layers, enabling more effective semantic fusion [46,47]. The overall fusion process and the internal structure of the GRU unit are depicted in Figure 2.
In this study, the outputs $\{H^{(7)}, H^{(8)}, \ldots, H^{(12)}\}$ are treated as a sequential input of length six and processed by a GRU in a step-wise manner. At each time step t, the GRU receives the BERT output $H^{(t+6)}$ (e.g., t = 1 for $H^{(7)}$, t = 2 for $H^{(8)}$, etc.) together with the previous hidden state $h_{t-1}$, and computes the current hidden state $h_t$:
$$h_t = \mathrm{GRU}\left(H^{(t+6)}, h_{t-1}\right), \quad t = 1, 2, \ldots, 6$$
where $h_0$ denotes the initial hidden state. After the final time step, the GRU produces $h_6$, which integrates semantic information from all six BERT layers. A linear transformation is then applied to refine this representation:
$$H_{GRU} = W h_6 + b$$
where $H_{GRU} \in \mathbb{R}^{(n+m+3) \times 768}$ denotes the fused representation, and W and b denote the learnable weight matrix and bias term, respectively.
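A minimal sketch of this fusion step is given below: each token position contributes a length-6 sequence of layer representations, which a single-layer GRU consumes before the linear refinement. The module decomposition and hyperparameters are our assumptions, not the authors' released code.

```python
import torch
import torch.nn as nn

class GRULayerFusion(nn.Module):
    """Fuses the last six BERT layer outputs at each token position."""
    def __init__(self, hidden_dim: int = 768):
        super().__init__()
        self.gru = nn.GRU(hidden_dim, hidden_dim, batch_first=True)
        self.proj = nn.Linear(hidden_dim, hidden_dim)  # H_GRU = W h_6 + b

    def forward(self, last_six: torch.Tensor) -> torch.Tensor:
        # last_six: (6, batch, seq_len, 768); each token position yields a
        # length-6 "sequence" of layer representations H(7)..H(12).
        n_layers, batch, seq_len, dim = last_six.shape
        x = last_six.permute(1, 2, 0, 3).reshape(batch * seq_len, n_layers, dim)
        _, h_n = self.gru(x)          # final hidden state h_6: (1, batch*seq_len, dim)
        fused = self.proj(h_n.squeeze(0))
        return fused.view(batch, seq_len, dim)  # H_GRU: (batch, n+m+3, 768)

fusion = GRULayerFusion()
h_gru = fusion(torch.randn(6, 2, 32, 768))  # toy input
print(h_gru.shape)                          # torch.Size([2, 32, 768])
```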

3.2.3. Multi-Scale 1D Convolutions

To better recognize entities of varying lengths, we apply multi-scale 1D convolutional operations. These convolutions capture local contextual features across different receptive fields, thereby strengthening the model’s ability to detect geographic entity boundaries. As shown in Figure 3, convolutional kernels with different widths are applied to the GRU-fused representations to extract features over multiple contextual spans.
Specifically, the model applies three 1D convolution operations with kernel sizes of 3, 5, and 7 to the GRU-based representations. The operation is defined as:
$$C_k = \mathrm{ReLU}\left(\mathrm{Conv1D}_k(H_{GRU}) + b_k\right), \quad k \in \{3, 5, 7\}$$
where $\mathrm{Conv1D}_k(\cdot)$ denotes the 1D convolution operation with a kernel width of k, $\mathrm{ReLU}(\cdot)$ denotes the activation function, and $C_k \in \mathbb{R}^{(n+m+3) \times d}$ denotes the output feature corresponding to kernel size k, with length-preserving padding keeping the sequence length at n + m + 3. The outputs from different convolution kernels are concatenated and projected back to the original feature dimension:
$$C = [C_3; C_5; C_7]$$
$$H_{conv} = \tanh(W C + b)$$
where $[;]$ denotes the concatenation operation, and $C \in \mathbb{R}^{(n+m+3) \times 3d}$ denotes the concatenated feature representation. The resulting contextual features $H_{conv} \in \mathbb{R}^{(n+m+3) \times d}$ enhance the model’s ability to capture local semantic patterns across multiple scales.
To further integrate semantic and local contextual information, the fused GRU output and convolutional features are combined via element-wise addition:
$$H = H_{GRU} \oplus H_{conv}$$
where $\oplus$ denotes element-wise addition, and $H \in \mathbb{R}^{(n+m+3) \times d}$ denotes the output representation of the feature extraction layer. This representation is then fed into the dynamic boundary inference module to jointly predict start positions, end positions, and span lengths.
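The sketch below illustrates one way to realize this module in PyTorch, using padding to keep the sequence length fixed so that the concatenation, projection, and element-wise addition line up dimensionally; the padding scheme and module structure are assumptions.

```python
import torch
import torch.nn as nn

class MultiScaleConv(nn.Module):
    """Multi-scale 1D convolutions over the GRU-fused representation."""
    def __init__(self, dim: int = 768, kernels=(3, 5, 7)):
        super().__init__()
        # padding = k // 2 keeps the sequence length at n + m + 3
        self.convs = nn.ModuleList(
            nn.Conv1d(dim, dim, kernel_size=k, padding=k // 2) for k in kernels
        )
        self.proj = nn.Linear(dim * len(kernels), dim)  # project 3d back to d

    def forward(self, h_gru: torch.Tensor) -> torch.Tensor:
        x = h_gru.transpose(1, 2)                             # (batch, dim, seq)
        feats = [torch.relu(conv(x)) for conv in self.convs]  # C_3, C_5, C_7
        c = torch.cat(feats, dim=1).transpose(1, 2)           # (batch, seq, 3*dim)
        h_conv = torch.tanh(self.proj(c))                     # H_conv
        return h_gru + h_conv                                 # H = H_GRU ⊕ H_conv

layer = MultiScaleConv()
h = layer(torch.randn(2, 32, 768))
print(h.shape)  # torch.Size([2, 32, 768])
```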

3.3. Dynamic Boundary Inference

In geographic texts, location and organization entities frequently appear in nested or boundary-sharing structures. This is because many organizations, such as universities, companies, or research centers, include geographic identifiers within their names. Similarly, location names themselves often exhibit hierarchical composition, where smaller regions are embedded within larger ones. These linguistic patterns reflect real-world spatial hierarchies and are much more common in Geo-NER than in general NER, making them a defining feature of this domain.
To address this domain-specific phenomenon, Geo-MRC introduces a dynamic boundary inference module that effectively detects entities with shared or nested boundaries. Its objective is to accurately determine the boundaries of geographic entities based on the fused contextual representation H obtained in Section 3.2. Instead of assigning token-level labels independently, this module formulates boundary detection as a joint prediction problem, simultaneously modeling start positions, end positions, and span lengths.
To focus on the text portion, we extract the sub-representation Htext from H, excluding the question tokens:
$$H_{text} = H[n+2 : n+2+m, :] \in \mathbb{R}^{m \times d}$$
where n and m denote the lengths of the question and text, respectively. The sequence $H_{text}$ is then projected into a predictive space through an FFN:
$$H_{pred} = \mathrm{FFN}(H_{text})$$
On top of Hpred, we employ three classifiers, each implemented as an FFN, to jointly perform:
  • Start and end position prediction. Token-level probabilities $P_{start}, P_{end} \in \mathbb{R}^{m}$ are obtained via sigmoid functions, indicating whether each token is the beginning or end of an entity.
  • Length prediction. For each token, span length distributions $L_{start}, L_{end} \in \mathbb{R}^{m \times l}$ are predicted, where l is the maximum entity length, enabling flexible boundary construction. In this work, the length predictor operates on token-level representations derived from the BERT tokenizer. Specifically, for Chinese datasets, we use Chinese-BERT-wwm-ext, and for English datasets, we employ BERT-base-cased, both adopting WordPiece-based tokenization. Consequently, the predicted length refers to the number of WordPiece sub-tokens that form each entity span rather than full words or characters.
  • Boundary inference. Based on these outputs, entity spans are dynamically determined. If a token at position $pos_i$ is predicted as a start, its span is $[pos_i, pos_i + L_{start}[i]]$; conversely, if it is predicted as an end, the span is $[pos_i - L_{end}[i], pos_i]$. Duplicate spans are removed, and the entity type is determined by the predefined question.
To further illustrate the working mechanism of the dynamic boundary inference module, consider the question template for organization names shown in Table 1: “What are the organization names in the text?” For instance, in the sentence “北京清华大学的科研成果… (Research results from Tsinghua University in Beijing …)”, three relevant entities can be identified: LOC: 北京 (Beijing), ORG: 清华大学 (Tsinghua University), and ORG: 北京清华大学 (Tsinghua University in Beijing). During prediction, the model jointly estimates the start, end, and length of each potential entity span. Given the above question, it predicts a start token at “北 (Bei)” with a span length of six tokens corresponding to “北京清华大学” and another start token at “清 (Qing)” with a span length of four tokens corresponding to “清华大学”. This start + length reasoning enables the model to recover nested entities that share the same ending token, while conversely, reasoning from the end + length perspective allows recovery of entities that share the same starting token. In this bidirectional manner, Geo-MRC flexibly reconstructs overlapping spans without predefined decoding rules, effectively capturing the complex nested and overlapping structures that frequently occur in geographic texts.
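The following sketch illustrates the decoding logic just described: spans are reconstructed from (start, length) and (end, length) predictions and deduplicated. The probability threshold, tensor shapes, and argmax length selection are illustrative assumptions; the span offsets follow the paper's $[pos_i, pos_i + L_{start}[i]]$ convention.

```python
import torch

def decode_spans(p_start, p_end, l_start, l_end, threshold: float = 0.5):
    """p_start, p_end: (m,) boundary probabilities;
    l_start, l_end: (m, l) span-length distributions over sub-token counts."""
    spans = set()
    m = p_start.size(0)
    for i in range(m):
        if p_start[i] > threshold:             # token i opens an entity
            length = int(l_start[i].argmax())  # predicted span length
            spans.add((i, min(i + length, m - 1)))
        if p_end[i] > threshold:               # token i closes an entity
            length = int(l_end[i].argmax())
            spans.add((max(i - length, 0), i)) # the set removes duplicates
    return sorted(spans)
```

Because a start prediction and an end prediction each carry their own length estimate, two entities that share an ending token (or a starting token) are recovered independently rather than competing for a single boundary label.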

4. Experiments

This section presents a series of experiments to evaluate the performance of the proposed Geo-MRC model on the Geo-NER task. The evaluation includes experimental setup, comparative experiments, ablation studies, and a discussion of the results.

4.1. Experimental Setup

4.1.1. Datasets

This study utilizes four publicly available benchmark datasets for named entity recognition: two Chinese datasets (RenMinRiBao and MSRA) and two English datasets (CoNLL-2003 and OntoNotes 5.0). These datasets were originally designed for flat NER and do not include annotations for nested entities. To facilitate a comprehensive evaluation of Geo-MRC’s performance on complex entity structures, we manually augmented the original annotations with nested entities. The annotation process was conducted by three graduate students with backgrounds in geographic information science and natural language processing. Each dataset was independently annotated by two annotators, and all ambiguous cases were discussed collectively to ensure labeling consistency. The annotation followed the general principles of OntoNotes 5.0, with modifications to better capture nested geographic entities. Specifically, entity inclusion and boundary decisions were based on syntactic completeness and semantic independence. For example, in the English dataset, the phrase “New York Stock Exchange” was annotated to include both the full entity and the nested sub-entity “New York”. Similarly, in the Chinese dataset, the phrase “北京清华大学 (Tsinghua University in Beijing)” was annotated to include the nested entities “北京 (Beijing)” and “清华大学 (Tsinghua University)” alongside the full span.
To further simulate real-world linguistic complexity, we introduced compound nested cases involving long spans or overlapping components. For example, the phrase “Microsoft Research Asia in Beijing” was annotated to include the nested entities “Microsoft Research Asia”, “Microsoft” and “Beijing”. Similarly, the phrase “John F. Kennedy International Airport” was annotated to include “John F. Kennedy” as a nested entity. All annotated data were double-checked and reviewed collaboratively to ensure correctness. Although no formal inter-annotator agreement metric was calculated, disagreements were resolved through discussion until consensus was reached, resulting in highly consistent annotations. After augmentation, nested entities accounted for approximately 16% of all annotated entities across the datasets, as illustrated in Table 2, which provides representative examples of augmented annotations. To provide a more systematic description, we categorize them into three representative types:
(1)
Hierarchical nesting, where entities form spatial or administrative hierarchies, such as “上海浦东新区 (Shanghai’s Pudong New Area)”, reflecting part–whole relationships that require multi-level semantic understanding. This type of nesting is closely related to the first problem identified in this study, as existing models that rely only on top-layer features often fail to capture such hierarchical semantics.
(2)
Boundary sharing, where two or more entities share the same start or end positions, such as “北京清华大学 (Tsinghua University in Beijing)” containing both “清华大学 (Tsinghua University)” and “北京 (Beijing)”. This form of overlap corresponds directly to the second problem addressed in this work. Conventional sequence labeling and MRC-based methods typically detect only one of the entities because they assume that each token can belong to a single span, making them ineffective for boundary-sharing cases.
(3)
Cross-level semantic overlap, where geographic and organizational meanings coexist within the same span, such as “兰州交通大学 (Lanzhou Jiaotong University)”, representing both a location and an institution. Although this phenomenon is not the focus of the present study, it reflects the inherent semantic ambiguity of geographic entities. The MRC framework can naturally handle such cases by designing different questions for different entity types, allowing the model to extract multiple category labels for the same text span.
These categories provide a formal characterization of the nesting patterns commonly observed in geographic texts and establish a clearer foundation for evaluating the model’s ability to handle different structural complexities.
Finally, all manually annotated datasets were converted into the MRC format used for training Geo-MRC, where each annotated instance was represented as a question–context pair. In this structure, the question explicitly specifies the target entity type (e.g., location, organization, or person), and the model learns to predict the corresponding answer span for each pair. This conversion directly enables Geo-MRC to perform type-specific span extraction through supervised learning. Table 3 illustrates concrete examples of how the augmented annotations were transformed into the MRC input format, demonstrating the pairing of entity-specific questions with their corresponding text contexts.
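As an illustration of the converted format, the sketch below shows one possible question–context instance for the running example sentence; the field names and character offsets are hypothetical and not taken from the released data.

```python
# One converted instance for "北京清华大学的科研成果…" under the organization
# question; field names and character offsets are hypothetical illustrations.
mrc_example = {
    "question": "What are the organization names in the text?",
    "context": "北京清华大学的科研成果…",
    "entity_type": "ORG",
    # gold spans as (start, end) character offsets: the full entity
    # 北京清华大学 and the nested 清华大学 that shares its ending boundary
    "answer_spans": [(0, 5), (2, 5)],
}
```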

4.1.2. Evaluation Metrics

This study employs Precision, Recall, F1 Score, and Micro-F1 as the evaluation metrics. A predicted entity is considered correct only if its boundaries and entity type exactly match the ground truth. Precision, Recall, and F1 Score are calculated for each entity type to assess category-specific performance, while Micro-F1 is used to provide an overall performance measure by micro-averaging results across all entity types.
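For reference, these metrics follow their standard definitions. With TP, FP, and FN denoting the counts of exactly matched, spurious, and missed entity spans, respectively:

$$\mathrm{Precision} = \frac{TP}{TP + FP}, \qquad \mathrm{Recall} = \frac{TP}{TP + FN}, \qquad F_1 = \frac{2 \cdot \mathrm{Precision} \cdot \mathrm{Recall}}{\mathrm{Precision} + \mathrm{Recall}}$$

Micro-F1 applies the same formulas to TP, FP, and FN pooled across all entity types.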

4.1.3. Parameter Settings

The Geo-MRC model is implemented using the PyTorch 2.1.0 framework, with BERT-base adopted as the pre-trained encoder. The main parameter settings of the model are summarized in Table 4.

4.2. Comparative Experiments

This study addresses two core research questions. RQ1: Can multi-layer semantic fusion improve recognition of long or context-sensitive geographic entities that require both global and local cues, compared with models that rely on a single top layer of a pretrained encoder? RQ2: Can joint boundary inference with start, end, and length recover boundary-sharing nested entities that conventional sequence labeling and typical MRC decoders often miss? Accordingly, we compare Geo-MRC with four representative MRC-based NER models, namely MRC-I2DP [48], GFMRC [49], NER-to-MRC [50], and MRC-CAP [51]. These baselines are widely used in previous studies and provide complementary contrasts aligned with RQ1 and RQ2. All models are trained and evaluated under identical settings to ensure a fair comparison.

4.2.1. Performance on Mixed Datasets

Table 5 presents the comparative results on four Chinese and English mixed datasets, which include both non-nested and manually augmented nested entities. As shown in the results, Geo-MRC consistently outperforms all baseline models across all datasets, achieving the highest Micro-F1 scores. This demonstrates the model’s robustness and effectiveness in handling a diverse range of entity structures in Geo-NER tasks, particularly in scenarios that combine simple and complex entities within a unified framework.
Specifically, on the Chinese datasets RenMinRiBao and MSRA, Geo-MRC achieves Micro-F1 scores of 88.40% and 87.94%, respectively, surpassing the strongest baselines by 5.75% and 7.34%. At the category level, Geo-MRC performs particularly well on PER and LOC entities, with F1 scores exceeding 89%. For ORG entities, although the model’s F1 score (79.76%) falls slightly below that of the best baseline MRC-CAP (82.02%) on RenMinRiBao, the margin is relatively small, indicating that Geo-MRC remains competitive. Moreover, Geo-MRC demonstrates consistent performance across both datasets, with only a 0.46% variation in Micro-F1, suggesting strong stability and generalizability. This consistency also reflects a similar structural composition and difficulty level between the RenMinRiBao and MSRA datasets.
On the English datasets CoNLL-2003 and OntoNotes 5.0, Geo-MRC achieves Micro-F1 scores of 79.27% and 72.24%, respectively, surpassing the best-performing baselines by 2.69% and 2.20%. At the level of individual entity categories, Geo-MRC obtains the highest F1 scores for all three types: PER, LOC, and ORG. However, the overall performance on OntoNotes 5.0 is noticeably lower than that on CoNLL-2003. The Micro-F1 score drops by nearly 7%, and the F1 scores for LOC and ORG entities are 50.88% and 67.13%, which are the lowest among all evaluated datasets. This performance decline is mainly due to the inherent complexity of the OntoNotes 5.0 dataset, including ambiguous entity boundaries and an imbalanced distribution of entity types, which increase the difficulty of accurate recognition. Despite these challenges, Geo-MRC still outperforms all baseline models across each entity category, demonstrating its robustness and adaptability in complex English entity recognition scenarios.
To evaluate training efficiency, we measured the average time per epoch of Geo-MRC across four datasets. The model requires approximately 2′53″ on RenMinRiBao, 6′27″ on MSRA, 1′57″ on CoNLL-2003, and 5′33″ on OntoNotes 5.0. Under identical hardware and parameter settings, Geo-MRC consistently achieves the fastest training speed among all compared MRC-based methods. This efficiency gain mainly stems from its start–end–length inference strategy, which eliminates the quadratic complexity of 2-D span modeling and avoids the heavy multi-head co-prediction structure used in some existing approaches.
Overall, Geo-MRC achieves leading or near-optimal performance across all datasets and entity categories, demonstrating its stability and adaptability in mixed-type entity recognition tasks. However, the performance of all models varies significantly across datasets and entity types. For example, the F1 scores for PER entities are consistently higher than those for LOC and ORG entities across all datasets. Additionally, Micro-F1 scores on Chinese datasets are generally higher than those on English datasets. This reflects the significant impact of factors such as varying data distributions, linguistic characteristics, and entity category differences on model performance. In addition, Geo-MRC maintains superior computational efficiency, achieving faster convergence and lower training time compared with all baselines, which further validates its practicality in large-scale Geo-NER tasks.

4.2.2. Performance on Nested Datasets

To further evaluate the performance of each model in handling nested entities, Table 6 presents the comparison results on four Chinese and English nested entity datasets. As shown in the results, Geo-MRC consistently achieves the highest Micro-F1 scores across all datasets, demonstrating its strong advantage in the task of nested entity recognition.
Specifically, on the Chinese datasets RenMinRiBao and MSRA, Geo-MRC achieves Micro-F1 scores of 51.66% and 56.57%, outperforming the best baseline by 4.96% and 4.82%, respectively. At the category level, Geo-MRC obtains the highest F1 scores across all three entity types: PER, LOC, and ORG. Notably, the improvement in the LOC category is the most significant; for instance, on the MSRA dataset, the average F1 score for LOC increases by 7.78%, making it the largest gain among the three categories. In addition, all models perform better on the MSRA nested dataset than on RenMinRiBao in terms of both Micro-F1 and per-category F1 scores. For example, the Micro-F1 score of Geo-MRC on MSRA is nearly 5% higher than that on RenMinRiBao. These results suggest that differences in task difficulty and sample distribution across the Chinese nested datasets have a significant impact on model performance.
On the English datasets CoNLL-2003 and OntoNotes 5.0, Geo-MRC achieves Micro-F1 scores of 44.69% and 39.75%, respectively, outperforming the best baselines by 8.93% and 3.67%. At the category level, Geo-MRC also achieves the highest F1 scores across all three categories: PER, LOC, and ORG. Notably, on the CoNLL-2003 dataset, the most significant improvement is observed in the PER category, with an average F1 score increase of 8.27% compared to the baselines. In contrast, on the OntoNotes 5.0 dataset, the ORG category shows the greatest improvement, with an average F1 score increase of 5.72%. In addition, Geo-MRC demonstrates better overall recognition performance on CoNLL-2003 than on OntoNotes 5.0, with a nearly 5% difference in Micro-F1, and consistently higher F1 scores across all entity categories. These results suggest that differences in entity structure, dataset scale, and annotation style across English nested datasets have a considerable impact on model performance.
Overall, Geo-MRC achieves the highest Micro-F1 scores across all nested entity datasets, demonstrating its stability and robustness in nested structure recognition. However, the model’s performance still varies between Chinese and English datasets. Specifically, the overall recognition results on Chinese nested datasets are generally better than those on English datasets. For example, Geo-MRC achieves a Micro-F1 score of 56.57% on MSRA, which is significantly higher than the 39.75% obtained on OntoNotes 5.0. Additionally, the LOC category exhibits the most notable improvements in Chinese datasets, while the English datasets show relatively greater gains in the PER and ORG categories. These discrepancies reflect differences in linguistic structure and nesting complexity between Chinese and English nested entities, presenting distinct challenges for model generalization and boundary prediction.

4.3. Ablation Studies

To further investigate the contribution of each component in Geo-MRC, we conducted three ablation experiments by removing the GRU module, the CNN module, and the length prediction module, respectively. The experiments were performed on all four Chinese and English datasets, covering both mixed and nested datasets. The Micro-F1 results for each variant are presented in Table 7.
In terms of overall performance, the full Geo-MRC model consistently achieves the highest Micro-F1 scores across all datasets in both mixed and nested entity recognition tasks, validating the generality and effectiveness of the proposed architecture across different languages and data types. Among the three modules, the length prediction module contributes most significantly. Specifically, in the mixed datasets, removing the length prediction module leads to a drop in Micro-F1 of 8.80%, 9.00%, 5.53%, and 3.13% on RenMinRiBao, MSRA, CoNLL-2003, and OntoNotes 5.0, respectively. In the nested datasets, the corresponding declines are 5.87%, 6.01%, 8.85%, and 4.76%. These results highlight the critical role of the length prediction mechanism in accurately inferring entity boundaries within complex nested structures.
The GRU module demonstrates significant effectiveness in enhancing contextual modeling, particularly on English datasets. Specifically, in the mixed datasets, removing the GRU results in performance drops of 3.71% and 0.55% for CoNLL-2003 and OntoNotes 5.0, respectively. The declines are even more pronounced in the nested datasets, reaching 5.31% and 4.59%. These findings indicate that the GRU effectively captures long-range dependencies, which is especially beneficial for identifying long-span or syntactically complex English entities. Moreover, on Chinese datasets, where contextual structures are relatively compact, the GRU still provides consistent improvements. For instance, its removal leads to Micro-F1 drops of 3.24% and 1.53% on the RenMinRiBao and MSRA mixed datasets, and 1.14% and 2.43% on the corresponding nested datasets.
The CNN module is primarily responsible for extracting local contextual features, enhancing the model’s ability to perceive phrase boundaries. In the mixed datasets, removing the CNN module leads to performance drops of 2.10%, 0.81%, 4.38%, and 1.46% on RenMinRiBao, MSRA, CoNLL-2003, and OntoNotes 5.0, respectively. On the nested datasets, the corresponding declines are 1.07%, 1.42%, 3.53%, and 2.61%. These results suggest that, while the CNN module has a relatively limited impact compared to the GRU and length prediction modules, it still plays a significant role in improving the model’s fine-grained perception of entity boundaries. This is particularly evident in the English datasets, where the CNN helps mitigate the challenges posed by boundary ambiguity in nested entity recognition, by improving local context representation.
Overall, the components of Geo-MRC exhibit complementary strengths across different types of entity recognition tasks. Specifically, the length prediction mechanism is crucial for modeling nested structures, the GRU module enhances the ability to capture long-range dependencies, and the CNN module provides additional support for extracting local semantic features. The synergy among these three components significantly improves the performance of Geo-MRC in complex Geo-NER.

4.4. Results Analysis and Discussion

This section analyzes Geo-MRC’s performance from multiple perspectives based on the aforementioned experimental results and discusses its potential challenges across different languages, data types, and entity structures.
(1) Analysis of consistency and discrepancy between mixed and nested datasets
As shown in Table 5 and Table 6, Geo-MRC consistently achieves the highest Micro-F1 scores on both mixed and nested datasets, demonstrating its robustness and adaptability in both general entity recognition and complex nested scenarios. In particular, on the Chinese mixed datasets RenMinRiBao and MSRA, Geo-MRC achieves Micro-F1 scores of 88.40% and 87.94%, respectively, and maintains strong performance on the corresponding nested datasets with scores of 51.66% and 56.57%. These results highlight the model’s robustness in Chinese datasets, where linguistic structures are more regular and naming patterns are more stable. In contrast, English datasets often contain more complex nested structures. For example, OntoNotes 5.0 includes a large number of overlapping entities and cross-phrase nesting patterns. Although Geo-MRC’s Micro-F1 score on such tasks is relatively lower (only 39.75% on nested data), it still significantly outperforms other baselines. This highlights Geo-MRC’s cross-linguistic transferability and capability to model complex entity structures.
(2) Analysis of performance differences across entity types
In terms of F1 scores across individual entity categories, Geo-MRC generally performs better on PER and LOC entities than on ORG entities, regardless of whether the dataset is mixed or nested. For example, on the Chinese mixed datasets RenMinRiBao and MSRA, the F1 score for PER exceeds 95%, and that for LOC remains consistently above 89%. In contrast, the F1 score for ORG is comparatively lower, reaching only 79.76% on RenMinRiBao. This performance gap is primarily due to the higher structural complexity and variability of organization names, which are more prone to nesting and boundary ambiguity in text. Interestingly, on the English OntoNotes 5.0 mixed dataset, the recognition performance for ORG slightly surpasses that for LOC. This can be attributed to the presence of more non-standard place names and noisy annotations in the dataset. These findings further illustrate that differences in structural complexity, contextual distribution, and corpus noise across entity types present varying challenges to boundary modeling, which in turn lead to inconsistent performance across categories under different language settings.
(3) Analysis of nested structure recognition capability
Geo-MRC demonstrates overall stable performance on the nested datasets, which confirms its effectiveness in handling complex entity boundary issues. Unlike traditional MRC-based models that rely solely on start and end position tagging, Geo-MRC introduces a length prediction mechanism, forming a dynamic boundary inference strategy that integrates start position, end position, and entity length. This design enables the model to better identify nested entities with shared boundaries, thereby significantly enhancing recognition performance. For example, on the Chinese MSRA nested dataset, Geo-MRC achieves a Micro-F1 score of 56.57%, representing an improvement of 4.82% over the best baseline model, NER-to-MRC. Furthermore, Geo-MRC also attains the highest Micro-F1 scores on the nested datasets of RenMinRiBao, CoNLL-2003, and OntoNotes 5.0. These results indicate that the length prediction mechanism not only improves the precision of boundary detection but also enhances the model’s adaptability to complex nested entity structures.
(4) Analysis of module design and performance contribution
The ablation experiments reveal that the performance advantage of Geo-MRC stems from the synergistic interaction of its components. Among them, the length prediction module contributes the most significantly. Removing this module results in an average Micro-F1 decrease of 6.62% on the mixed datasets and 6.37% on the nested datasets, indicating its critical role in ensuring boundary completeness and continuity. The GRU module plays a vital role in modeling long-range dependencies and capturing global contextual semantics, especially in English datasets. For instance, on the nested datasets of CoNLL-2003 and OntoNotes 5.0, removing GRU leads to Micro-F1 drops of 5.31% and 4.59%, respectively. The CNN module enhances the model’s ability to capture local semantic patterns. Although its overall impact is slightly smaller, it provides effective support in cases involving short entities, adjacent boundaries, and multiple entities in close proximity, thereby improving both boundary precision and robustness.
(5) Analysis of Chinese-English corpus adaptability
Geo-MRC generally performs better on Chinese datasets compared to English datasets, particularly in nested entity recognition tasks. For instance, the Micro-F1 score on the Chinese MSRA nested dataset reaches 56.57%, which is significantly higher than the 39.75% achieved on the English OntoNotes 5.0 dataset. This discrepancy is primarily attributed to structural differences between the two languages. Chinese entities tend to be more compact and have clearer boundaries, whereas English entities often involve multi-word phrases, prepositional modifiers, and complex syntactic structures, which increase the difficulty of boundary identification. Moreover, English datasets generally exhibit more label inconsistency, semantic ambiguity, and deeper nesting levels, all of which pose additional challenges for accurate recognition. These findings suggest that while Geo-MRC demonstrates promising cross-lingual generalization, further optimization is still needed in aspects such as syntactic alignment, contextual representation, and language-specific adaptation to enhance its robustness and generalizability across different linguistic settings.
(6) Insights from comparative experiments
The comparative experiments provide broader insights into the design of MRC-based frameworks for Geo-NER. The results suggest that models relying solely on independent boundary tagging tend to produce incomplete or inconsistent recognition of overlapping entities, while approaches incorporating joint boundary reasoning achieve more coherent span prediction. Furthermore, the experiments demonstrate that leveraging multi-layer semantic representations allows the model to capture both global and local cues without depending on external lexical resources. This indicates that hierarchical semantic fusion is more effective for representing context-dependent geographic entities. Finally, the overall comparison highlights that lightweight inference strategies can maintain high accuracy while substantially improving computational efficiency. These findings collectively validate the design philosophy of Geo-MRC and provide useful implications for developing more general and scalable Geo-NER models.
In summary, Geo-MRC demonstrates consistent and significant performance advantages across multilingual and structurally diverse datasets, with particularly strong results in nested entity recognition tasks. The structural innovations introduced by the model, especially the length prediction module, substantially enhance its ability to resolve complex entity boundaries. However, its performance on English datasets with more intricate structures still leaves room for improvement. Future work may focus on enhancing contextual modeling of entities and integrating syntactic knowledge in order to further improve the model’s robustness and adaptability in challenging linguistic scenarios.

5. Conclusions

This paper proposes an improved model named Geo-MRC within the MRC framework to address the structural limitations and representational constraints of traditional sequence labeling methods in handling nested entities in Geo-NER tasks. Geo-MRC reformulates the Geo-NER task as a question-answering prediction task and introduces a dynamic boundary inference strategy that jointly predicts the start position, end position, and entity length. This approach significantly enhances the model’s ability to recognize overlapping and nested entities. Moreover, by integrating the semantic representation power of BERT, the contextual fusion capability of GRU, and the local feature extraction strength of multi-scale 1D convolutions, Geo-MRC achieves multi-level semantic enhancement and boundary awareness throughout its overall architecture. Experimental results on multiple Chinese and English datasets show that Geo-MRC achieves consistent and significant improvements over existing MRC-based baselines, particularly in recognizing nested entities. Ablation studies further confirm that the length prediction module plays a key role in boundary modeling, while the GRU and CNN components provide complementary contextual and local feature representations.
Building on the current research, future work can be explored in the following directions.
(1)
Enhanced cross-lingual modeling capabilities: Given the challenges in English datasets such as multi-word nested entities and syntactic inconsistencies, future research could incorporate multilingual pretrained models (e.g., Aya 23 [52], Gemma 3 [53]) or language-adaptive fine-tuning strategies to improve the model’s generalization across diverse linguistic environments.
(2)
Integration of structured external knowledge: Introducing structured resources such as geographic knowledge graphs or gazetteers into the training process may enhance the model’s ability to recognize unseen entities and resolve boundary ambiguities more effectively.
In summary, Geo-MRC presents a novel and generalizable framework for geographic named entity recognition. Its dynamic boundary inference mechanism enables flexible reconstruction of nested and overlapping entities, while the integration of multi-layer semantic fusion and convolution-based boundary modeling ensures robust and accurate performance across languages and datasets. The proposed model establishes a solid foundation for advancing intelligent geographic information extraction and spatial semantic understanding.

Author Contributions

Data curation, Ping Du and Xuan Hao; Methodology, Yuting Zhang and Pengpeng Li; Project administration, Pengpeng Li; Resources, Jingzhong Li and Pengpeng Li; Writing—original draft, Yuting Zhang and Pengpeng Li; Writing—review and editing, Jingzhong Li and Tao Liu; All authors have read and agreed to the published version of the manuscript.

Funding

The research was supported by Young Talent Project of Gansu Provincial Organization Department (Individual Project) (2025QNGR14), Key Talent Project of Gansu Provincial Organization Department (2025RCXM012), Gansu Youth Science and Technology Fund (24JRRA275), the Joint Innovation Fund Project of Lanzhou Jiaotong University and Corresponding Supporting University (LH 2024019), the National Natural Science Foundation of China (42271454), Major Project of Gansu Provincial Joint Research Fund (24JRRA848), Lanzhou City Science and Technology Plan Project (2024-3-68).

Data Availability Statement

The corresponding author will provide the data supporting the findings of this study upon reasonable request.

Acknowledgments

The authors would like to thank the anonymous reviewers and the editor for their constructive comments and suggestions for this paper.

Conflicts of Interest

The authors declare no conflicts of interest.

References

1. Nozza, D.; Manchanda, P.; Fersini, E.; Palmonari, M.; Messina, E. LearningToAdapt with word embeddings: Domain adaptation of named entity recognition systems. Inf. Process. Manag. 2021, 58, 102537.
2. Berragan, C.; Singleton, A.; Calafiore, A.; Morley, J. Transformer based named entity recognition for place name extraction from unstructured text. Int. J. Geogr. Inf. Sci. 2023, 37, 747–766.
3. Li, P.; Liu, J.; Luo, A.; Wang, Y.; Zhu, J.; Xu, S. Deep learning method for Chinese multisource point of interest matching. Comput. Environ. Urban Syst. 2022, 96, 101821.
4. Weckmüller, D.; Dunkel, A.; Burghardt, D. Embedding-Based Multilingual Semantic Search for Geo-Textual Data in Urban Studies. J. Geovis. Spat. Anal. 2025, 9, 31.
5. Kryvasheyeu, Y.; Chen, H.; Obradovich, N.; Moro, E.; Van Hentenryck, P.; Fowler, J.; Cebrian, M. Rapid assessment of disaster damage using social media activity. Sci. Adv. 2016, 2, e1500779.
6. Tadakaluru, A. Context Optimized and Spatial Aware Dummy Locations Generation Framework for Location Privacy. J. Geovis. Spat. Anal. 2022, 6, 27.
7. Li, Y.; Peng, L.; Sang, Y.; Gao, H. The characteristics and functionalities of citizen-led disaster response through social media: A case study of the #HenanFloodsRelief on Sina Weibo. Int. J. Disaster Risk Reduct. 2024, 106, 104419.
8. Marasinghe, R.; Yigitcanlar, T.; Mayere, S.; Washington, T.; Limb, M. Towards responsible urban geospatial AI: Insights from the white and grey literatures. J. Geovis. Spat. Anal. 2024, 8, 24.
9. Tao, L.; Xie, Z.; Xu, D.; Ma, K.; Qiu, Q.; Pan, S.; Huang, B. Geographic named entity recognition by employing natural language processing and an improved BERT model. ISPRS Int. J. Geo-Inf. 2022, 11, 598.
10. Li, P.; Zhu, Q.; Liu, J.; Liu, T.; Du, P.; Liu, S.; Zhang, Y. A Multi-Semantic Feature Fusion Method for Complex Address Matching of Chinese Addresses. ISPRS Int. J. Geo-Inf. 2025, 14, 227.
11. Gritta, M.; Pilehvar, M.T.; Collier, N. A pragmatic guide to geoparsing evaluation. Lang. Resour. Eval. 2020, 54, 683–712.
12. Nadeau, D.; Sekine, S. A survey of named entity recognition and classification. Lingvist. Investig. 2007, 30, 3–26.
13. Luo, L.; Yang, Z.; Yang, P.; Zhang, Y.; Wang, L.; Lin, H.; Wang, J. An attention-based BiLSTM-CRF approach to document-level chemical named entity recognition. Bioinformatics 2018, 34, 1381–1388.
14. Li, J.; Sun, A.; Han, J.; Li, C. A survey on deep learning for named entity recognition. IEEE Trans. Knowl. Data Eng. 2022, 34, 50–70.
15. Wang, Y.; Sun, Y.; Ma, Z.; Gao, L.; Xu, Y.; Sun, T. Application of pre-training models in named entity recognition. In Proceedings of the 2020 12th International Conference on Intelligent Human-Machine Systems and Cybernetics, Hangzhou, China, 22–23 August 2020; IEEE: Piscataway, NJ, USA, 2020; Volume 1, pp. 23–26.
16. Li, X.; Feng, J.; Meng, Y.; Han, Q.; Wu, F.; Li, J. A unified MRC framework for named entity recognition. arXiv 2020, arXiv:1910.11476.
17. Yu, J.; Ji, B.; Li, S.; Ma, J.; Liu, H.; Xu, H. S-NER: A Concise and Efficient Span-Based Model for Named Entity Recognition. Sensors 2022, 22, 2852.
18. Bao, X.; Tian, M.; Zha, Z.; Qin, B. MPMRC-MNER: Unified MRC Framework for MNER. In Proceedings of the 32nd ACM International Conference on Information and Knowledge Management, Birmingham, UK, 21–25 October 2023; pp. 47–56.
19. Leidner, J.L. Toponym resolution in text: Annotation, evaluation and applications of spatial grounding. In ACM SIGIR Forum; ACM: New York, NY, USA, 2007; Volume 41, pp. 124–126.
20. Shaalan, K.; Raza, H. NERA: Named entity recognition for Arabic. J. Am. Soc. Inf. Sci. Technol. 2009, 60, 1652–1663.
21. Shen, J.; Li, F.; Xu, F.; Uszkoreit, H. Recognition of Chinese organization names and abbreviations. J. Chin. Inf. Process. 2007, 21, 17–21.
22. Lafferty, J.; McCallum, A.; Pereira, F.C.N. Conditional random fields: Probabilistic models for segmenting and labeling sequence data. In Proceedings of the Eighteenth International Conference on Machine Learning (ICML 2001), Williamstown, MA, USA, 28 June–1 July 2001; pp. 282–289.
23. Morwal, S.; Jahan, N.; Chopra, D. Named entity recognition using Hidden Markov Model (HMM). Int. J. Nat. Lang. Comput. 2012, 1, 15–23.
24. Lample, G.; Ballesteros, M.; Subramanian, S.; Kawakami, K.; Dyer, C. Neural architectures for named entity recognition. arXiv 2016, arXiv:1603.01360.
25. Qiu, Q.; Xie, Z.; Wu, L.; Tao, L.; Li, W. BiLSTM-CRF for geological named entity recognition from the geoscience literature. Earth Sci. Inform. 2019, 12, 565–579.
26. Song, C.H.; Lawrie, D.; Finin, T.; Mayfield, J. Improving neural named entity recognition with gazetteers. arXiv 2020, arXiv:2003.03072.
27. Hu, X.; Al-Olimat, H.S.; Kersten, J.; Wiegmann, M.; Klan, F.; Sun, Y.; Fan, H. GazPNE: Annotation-free deep learning for place name extraction from microblogs leveraging gazetteer and synthetic data by rules. Int. J. Geogr. Inf. Sci. 2022, 36, 310–337.
28. Tang, X.; Huang, Y.; Xia, M.; Long, C. A multi-task BERT-BiLSTM-AM-CRF strategy for Chinese named entity recognition. Neural Process. Lett. 2023, 55, 1209–1229.
29. Wang, Y.; Sun, Y.; Ma, Z.; Gao, L.; Xu, Y. An ERNIE-based joint model for Chinese named entity recognition. Appl. Sci. 2020, 10, 5711.
30. Cui, L.; Wu, Y.; Liu, J.; Yang, S.; Zhang, Y. Template-based named entity recognition using BART. arXiv 2021, arXiv:2106.01760.
31. Yan, R.; Jiang, X.; Dang, D. Named entity recognition by using XLNet-BiLSTM-CRF. Neural Process. Lett. 2021, 53, 3339–3356.
32. Zhang, W.; Meng, J.; Wan, J.; Zhang, C.; Zhang, J.; Wang, Y.; Xu, L.; Li, F. ChineseCTRE: A model for geographical named entity recognition and correction based on deep neural networks and the BERT model. ISPRS Int. J. Geo-Inf. 2023, 12, 394.
33. Nayak, T.; Ng, H.T. Effective modeling of encoder-decoder architecture for joint entity and relation extraction. In Proceedings of the Thirty-Fourth AAAI Conference on Artificial Intelligence, New York, NY, USA, 7–12 February 2020; pp. 8528–8535.
34. Yan, H.; Gui, T.; Dai, J.; Guo, Q.; Zhang, Z.; Qiu, X. A unified generative framework for various NER subtasks. arXiv 2021, arXiv:2106.01223.
35. Liu, H.; Qiu, Q.; Wu, L.; Li, W.; Wang, B.; Zhou, Y. Few-shot learning for name entity recognition in geological text based on GeoBERT. Earth Sci. Inform. 2022, 15, 979–991.
36. Ma, K.; Zheng, S.; Tian, M.; Qiu, Q.; Tan, Y.; Hu, X.; Li, H.; Xie, Z. CnGeoPLM: Contextual knowledge selection and embedding with pretrained language representation model for the geoscience domain. Earth Sci. Inform. 2023, 16, 3629–3646.
37. Zhang, Y.; Zhang, H. FinBERT-MRC: Financial named entity recognition using BERT under the machine reading comprehension paradigm. Neural Process. Lett. 2023, 55, 7393–7413.
38. Sun, C.; Yang, Z.; Wang, L.; Zhang, Y.; Lin, H.; Wang, J. Biomedical named entity recognition using BERT in the machine reading comprehension framework. J. Biomed. Inform. 2021, 118, 103799.
39. Zhang, H.; Guo, J.; Wang, Y.; Zhang, Z.; Zhao, H. Judicial nested named entity recognition method with MRC framework. Int. J. Cogn. Comput. Eng. 2023, 4, 118–126.
40. Huang, Z.; He, L.; Yang, Y.; Li, A.; Zhang, Z.; Wu, S.; Wang, Y.; He, Y.; Liu, X. Application of machine reading comprehension techniques for named entity recognition in materials science. J. Cheminform. 2024, 16, 76.
41. Shrimal, A.; Jain, A.; Mehta, K.; Yenigalla, P. NER-MQMRC: Formulating named entity recognition as multi question machine reading comprehension. arXiv 2022, arXiv:2205.05904.
42. Devlin, J.; Chang, M.W.; Lee, K.; Toutanova, K. BERT: Pre-training of deep bidirectional transformers for language understanding. arXiv 2018, arXiv:1810.04805.
43. Gao, Y.; Xiong, Y.; Wang, S.; Wang, H. GeoBERT: Pre-training geospatial representation learning on point-of-interest. Appl. Sci. 2022, 12, 12942.
44. Li, Z.; Kim, J.; Chiang, Y.Y.; Chen, M. SpaBERT: A pretrained language model from geographic data for geo-entity representation. In Findings of the Association for Computational Linguistics: EMNLP 2022, Abu Dhabi, United Arab Emirates, 7–11 December 2022; pp. 2757–2769.
45. Gardazi, N.M.; Daud, A.; Malik, M.K.; Bukhari, A.; Alsahfi, T.; Alshemaimri, B. BERT applications in natural language processing: A review. Artif. Intell. Rev. 2025, 58, 166.
46. Cho, K.; Van Merriënboer, B.; Gulcehre, C.; Bahdanau, D.; Bougares, F.; Schwenk, H.; Bengio, Y. Learning phrase representations using RNN encoder-decoder for statistical machine translation. arXiv 2014, arXiv:1406.1078.
47. Li, P.; Luo, A.; Liu, J.; Wang, Y.; Zhu, J.; Deng, Y.; Zhang, J. Bidirectional gated recurrent unit neural network for Chinese address element segmentation. ISPRS Int. J. Geo-Inf. 2020, 9, 635.
48. Jiang, X.; He, K.; He, J.; Yan, G. A new entity extraction method based on machine reading comprehension. arXiv 2021, arXiv:2108.06444.
49. Fei, Y.; Xu, X. GFMRC: A machine reading comprehension model for named entity recognition. Pattern Recognit. Lett. 2023, 172, 97–105.
50. Zhang, Y.; Wang, J.; Zhu, X.; Sakai, T.; Yamana, H. NER-to-MRC: Named-entity recognition completely solving as machine reading comprehension. arXiv 2023, arXiv:2305.03970.
51. Du, X.; Zhao, H.; Xing, D.; Jia, Y.; Zan, H. MRC-based Nested Medical NER with Co-prediction and Adaptive Pre-training. arXiv 2024, arXiv:2403.15800.
52. Aryabumi, V.; Dang, J.; Talupuru, D.; Dash, S.; Cairuz, D.; Lin, H.; Venkitesh, B.; Smith, M.; Campos, J.A.; Tan, Y.C.; et al. Aya 23: Open weight releases to further multilingual progress. arXiv 2024, arXiv:2405.15032.
53. Gemma Team; Kamath, A.; Ferret, J.; Pathak, S.; Vieillard, N.; Merhej, R.; Perrin, S.; Matejovicova, T.; Ramé, A.; Rivière, M.; et al. Gemma 3 technical report. arXiv 2025, arXiv:2503.19786.
Figure 1. The overall architecture of the proposed Geo-MRC model.
Figure 2. GRU-based semantic fusion of BERT outputs and internal unit structure.
Figure 3. Multi-scale 1D convolutional feature extraction.
Table 1. Question templates for Geo-NER.

| # | Entity Type | Question |
|---|---|---|
| 1 | Location name | What are the location names in the text? |
| 2 | Organization name | What are the organization names in the text? |
| 3 | Person name | What are the person names in the text? |
Table 2. Examples of nested entity augmentation across datasets.

| # | Example Sentence | Augmented Nested Annotation | Type of Nesting |
|---|---|---|---|
| 1 | … on the New York Stock Exchange. | ORG: New York Stock Exchange; LOC: New York | Boundary sharing |
| 2 | John F. Kennedy International Airport closed… | LOC: John F. Kennedy International Airport; PER: John F. Kennedy | Cross-level semantic overlap |
| 3 | “北京清华大学的科研成果…” (“Research results from Tsinghua University in Beijing…”) | ORG: 清华大学 (Tsinghua University); LOC: 北京 (Beijing); ORG: 北京清华大学 (Tsinghua University in Beijing) | Boundary sharing |
| 4 | “…在上海浦东新区” (“…in Shanghai’s Pudong New Area”) | LOC: 上海浦东新区 (Shanghai’s Pudong New Area); LOC: 上海 (Shanghai); LOC: 浦东新区 (Pudong New Area) | Hierarchical nesting; Boundary sharing |

Note: In the Example Sentence column, the bolded words indicate the target entities. ORG, LOC, and PER denote organization, location, and person entities, respectively. For Chinese examples, the English equivalents are provided in parentheses.
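Because one-tag-per-token schemes cannot express the annotations above, a span-based representation is the natural encoding: each entity is a typed (start, end) pair, and nested or boundary-sharing spans simply coexist. A small illustration of this representation for example 1 of Table 2 (our own sketch, using character offsets):

```python
# Representing the nested annotations of Table 2 (example 1) as typed spans.
# Offsets are character indices into the sentence; illustrative only.
from typing import NamedTuple

class Span(NamedTuple):
    start: int   # inclusive character offset
    end: int     # exclusive character offset
    label: str   # entity type

sentence = "... on the New York Stock Exchange."
spans = [
    Span(sentence.find("New York Stock Exchange"),
         sentence.find("New York Stock Exchange") + len("New York Stock Exchange"),
         "ORG"),
    Span(sentence.find("New York"),
         sentence.find("New York") + len("New York"),
         "LOC"),
]
# Both spans begin at the same offset (boundary sharing): a BIO tagger could
# keep only one of them, while a span-based scheme keeps both.
for s in spans:
    print(s.label, sentence[s.start:s.end])
```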
Table 3. Examples of dataset instances converted into the MRC input format.

| Example Sentence | Entity Type (Question) | Input Pair (MRC Format) | Expected Answer |
|---|---|---|---|
| … on the New York Stock Exchange. | Organization name | Q: What are the organization names in the text? T: … on the New York Stock Exchange. | New York Stock Exchange |
| … on the New York Stock Exchange. | Location name | Q: What are the location names in the text? T: … on the New York Stock Exchange. | New York |
| … on the New York Stock Exchange. | Person name | Q: What are the person names in the text? T: … on the New York Stock Exchange. | None |

Note: “Q” denotes the question specifying the target entity type, and “T” denotes the corresponding text. The Expected Answer column lists the gold entity span for the given question; if no entity of that type exists in the text, the answer is marked as None.
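The conversion in Table 3 can be sketched in a few lines: each sentence is paired with one question per entity type (Table 1) and packed as a standard BERT sentence pair. The tokenizer checkpoint and truncation settings below are our assumptions, not the paper's preprocessing code.

```python
# Expanding one sentence into per-type MRC inputs (question + context),
# following Tables 1 and 3. Illustrative only; checkpoint name is assumed.
from transformers import BertTokenizerFast

QUESTIONS = {
    "LOC": "What are the location names in the text?",
    "ORG": "What are the organization names in the text?",
    "PER": "What are the person names in the text?",
}

tokenizer = BertTokenizerFast.from_pretrained("bert-base-cased")
text = "... on the New York Stock Exchange."

for label, question in QUESTIONS.items():
    # The pair is packed as [CLS] question [SEP] context [SEP]; only the
    # context side is truncated so the question always survives intact.
    enc = tokenizer(question, text, max_length=128,
                    truncation="only_second", padding="max_length")
    print(label, len(enc["input_ids"]))  # one fixed-length input per type
```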
Table 4. Main parameter settings of the Geo-MRC model.

| Category | Parameter | Value |
|---|---|---|
| Input Representation Construction | Max Sentence Length (Tokens) | 128 |
| | Max Entity Length (Tokens) | 15 |
| Multi-layer Contextual Feature Extraction | GRU Hidden Units | 256 |
| | GRU Layers | 1 |
| | Convolution Kernel Sizes | 3, 5, 7 |
| | Dropout Rate | 0.1 |
| Dynamic Boundary Inference | FFN Layers | 2 |
| | FFN Hidden Units | 768 |
| Training Settings | Learning Rate | 2 × 10⁻⁵ |
| | Batch Size | 256 |
| | Training Epochs | 20 |
| | Dropout Rate | 0.1 |
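Restated as a configuration object, Table 4 might look as follows; the field names are ours and illustrative only, not identifiers from the authors' code.

```python
# Table 4 as a compact configuration object (a sketch, names are ours).
from dataclasses import dataclass

@dataclass
class GeoMRCConfig:
    max_seq_len: int = 128           # max sentence length (tokens)
    max_entity_len: int = 15         # max entity length (tokens)
    gru_hidden: int = 256            # GRU hidden units
    gru_layers: int = 1
    kernel_sizes: tuple = (3, 5, 7)  # multi-scale 1D convolution kernels
    dropout: float = 0.1
    ffn_layers: int = 2
    ffn_hidden: int = 768
    learning_rate: float = 2e-5
    batch_size: int = 256
    epochs: int = 20
```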
Table 5. Performance of Geo-MRC and baseline models on mixed datasets.

| Dataset (Mixed: Non-Nested + Nested) | Method | F1 (PER) | F1 (LOC) | F1 (ORG) | Micro-F1 |
|---|---|---|---|---|---|
| RenMinRiBao | MRC-I2DP | 92.13% | 78.99% | 80.67% | 82.49% |
| RenMinRiBao | GFMRC | 91.42% | 78.05% | 80.11% | 81.80% |
| RenMinRiBao | NER-to-MRC | 92.28% | 78.85% | 81.29% | 82.65% |
| RenMinRiBao | MRC-CAP | 91.52% | 78.83% | 82.02% | 82.60% |
| RenMinRiBao | Geo-MRC | 95.84% | 90.87% | 79.76% | 88.40% |
| MSRA | MRC-I2DP | 91.70% | 77.63% | 76.14% | 80.35% |
| MSRA | GFMRC | 92.01% | 78.47% | 75.58% | 80.60% |
| MSRA | NER-to-MRC | 91.53% | 78.02% | 76.21% | 80.43% |
| MSRA | MRC-CAP | 91.50% | 77.79% | 75.47% | 80.12% |
| MSRA | Geo-MRC | 95.66% | 89.91% | 79.03% | 87.94% |
| CoNLL-2003 | MRC-I2DP | 83.36% | 76.79% | 66.85% | 74.85% |
| CoNLL-2003 | GFMRC | 85.66% | 78.63% | 67.42% | 76.58% |
| CoNLL-2003 | NER-to-MRC | 76.68% | 80.03% | 67.88% | 74.45% |
| CoNLL-2003 | MRC-CAP | 87.18% | 78.12% | 67.80% | 77.07% |
| CoNLL-2003 | Geo-MRC | 87.30% | 82.30% | 68.97% | 79.27% |
| OntoNotes 5.0 | MRC-I2DP | 83.28% | 43.87% | 63.14% | 68.35% |
| OntoNotes 5.0 | GFMRC | 84.41% | 45.65% | 64.54% | 69.67% |
| OntoNotes 5.0 | NER-to-MRC | 84.85% | 47.07% | 64.25% | 70.04% |
| OntoNotes 5.0 | MRC-CAP | 83.50% | 48.44% | 63.97% | 69.37% |
| OntoNotes 5.0 | Geo-MRC | 84.99% | 50.88% | 67.13% | 72.24% |

Note: The best results are highlighted in bold, and the second-best results are underlined. The same formatting convention applies to the following tables.
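The Micro-F1 column aggregates true positives, false positives, and false negatives across PER, LOC, and ORG before computing F1, so frequent types weigh more than rare ones; this is why Geo-MRC's micro-F1 can lead even where one per-type F1 (e.g., ORG on RenMinRiBao) does not. A minimal sketch of the metric (our illustration, with hypothetical counts, not the paper's evaluation script):

```python
# Micro-F1 over per-type span counts; illustrative, counts are hypothetical.
def micro_f1(counts):
    """counts: dict mapping entity type -> (tp, fp, fn) span counts."""
    tp = sum(c[0] for c in counts.values())
    fp = sum(c[1] for c in counts.values())
    fn = sum(c[2] for c in counts.values())
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return (2 * precision * recall / (precision + recall)
            if precision + recall else 0.0)

# Hypothetical counts for PER / LOC / ORG:
print(micro_f1({"PER": (90, 8, 10), "LOC": (80, 15, 20), "ORG": (70, 20, 25)}))
```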
Table 6. Performance of Geo-MRC and baseline models on nested datasets.

| Dataset (Nested) | Method | F1 (PER) | F1 (LOC) | F1 (ORG) | Micro-F1 |
|---|---|---|---|---|---|
| RenMinRiBao | MRC-I2DP | 67.55% | 51.32% | 36.47% | 46.70% |
| RenMinRiBao | GFMRC | 64.60% | 50.60% | 37.36% | 46.36% |
| RenMinRiBao | NER-to-MRC | 64.71% | 50.00% | 38.38% | 46.46% |
| RenMinRiBao | MRC-CAP | 67.53% | 52.39% | 34.64% | 46.63% |
| RenMinRiBao | Geo-MRC | 69.57% | 57.41% | 42.11% | 51.66% |
| MSRA | MRC-I2DP | 70.49% | 52.84% | 40.07% | 49.74% |
| MSRA | GFMRC | 69.65% | 53.74% | 42.00% | 50.69% |
| MSRA | NER-to-MRC | 69.98% | 55.16% | 42.49% | 51.75% |
| MSRA | MRC-CAP | 69.04% | 53.16% | 37.41% | 48.57% |
| MSRA | Geo-MRC | 71.11% | 61.50% | 47.45% | 56.57% |
| CoNLL-2003 | MRC-I2DP | 42.63% | 41.15% | 28.25% | 35.62% |
| CoNLL-2003 | GFMRC | 44.44% | 39.67% | 27.18% | 34.75% |
| CoNLL-2003 | NER-to-MRC | 47.19% | 38.06% | 29.92% | 35.76% |
| CoNLL-2003 | MRC-CAP | 46.76% | 37.76% | 27.25% | 34.54% |
| CoNLL-2003 | Geo-MRC | 53.52% | 42.97% | 35.17% | 44.69% |
| OntoNotes 5.0 | MRC-I2DP | 41.87% | 34.77% | 26.33% | 32.44% |
| OntoNotes 5.0 | GFMRC | 47.75% | 37.14% | 29.49% | 36.03% |
| OntoNotes 5.0 | NER-to-MRC | 49.39% | 36.64% | 29.01% | 36.08% |
| OntoNotes 5.0 | MRC-CAP | 48.43% | 36.11% | 26.07% | 34.51% |
| OntoNotes 5.0 | Geo-MRC | 51.34% | 38.98% | 33.44% | 39.75% |

Note: The best results are highlighted in bold, and the second-best results are underlined.
Table 7. Micro-F1 scores of Geo-MRC and its ablation variants.

| Dataset | Method | Micro-F1 (Mixed) | Micro-F1 (Nested) |
|---|---|---|---|
| RenMinRiBao | Geo-MRC w/o GRU | 85.16% | 50.52% |
| RenMinRiBao | Geo-MRC w/o CNN | 86.30% | 50.59% |
| RenMinRiBao | Geo-MRC w/o length | 79.60% | 45.79% |
| RenMinRiBao | Geo-MRC | 88.40% | 51.66% |
| MSRA | Geo-MRC w/o GRU | 86.41% | 54.14% |
| MSRA | Geo-MRC w/o CNN | 87.13% | 55.15% |
| MSRA | Geo-MRC w/o length | 78.94% | 50.56% |
| MSRA | Geo-MRC | 87.94% | 56.57% |
| CoNLL-2003 | Geo-MRC w/o GRU | 75.56% | 39.38% |
| CoNLL-2003 | Geo-MRC w/o CNN | 74.89% | 41.16% |
| CoNLL-2003 | Geo-MRC w/o length | 73.74% | 35.84% |
| CoNLL-2003 | Geo-MRC | 79.27% | 44.69% |
| OntoNotes 5.0 | Geo-MRC w/o GRU | 71.69% | 35.16% |
| OntoNotes 5.0 | Geo-MRC w/o CNN | 70.78% | 37.14% |
| OntoNotes 5.0 | Geo-MRC w/o length | 69.11% | 34.99% |
| OntoNotes 5.0 | Geo-MRC | 72.24% | 39.75% |

Note: “w/o” denotes “without”, indicating the ablated version of the model without the corresponding module. The best results are highlighted in bold, and the second-best results are underlined.