Applied Sciences
  • Article
  • Open Access

30 May 2025

Named Entity Recognition Based on Multi-Class Label Prompt Selection and Core Entity Replacement

School of Information and Electrical Engineering, Hebei University of Engineering, Handan 056038, China
* Author to whom correspondence should be addressed.

Abstract

At present, researchers are showing a marked interest in the topic of few-shot named entity recognition (NER). Previous studies have demonstrated that prompt-based learning methods can effectively improve the performance of few-shot NER models and can reduce the need for annotated data. However, the contextual information of the relationship between core entities and a given prompt may not have been considered in these studies; moreover, research in this field continues to suffer from the negative impact of a limited amount of annotated data. A multi-class label prompt selection and core entity replacement-based named entity recognition (MPSCER-NER) model is proposed in this study. A multi-class label prompt selection strategy is presented, which can assist in the task of sentence–word representation. A long-distance dependency is formed between the sentence and the multi-class label prompt. A core entity replacement strategy is presented, which can enrich the word vectors of training data. In addition, a weighted random algorithm is used to retrieve the core entities that are to be replaced from the multi-class label prompt. The experimental results show that, when implemented on the CoNLL-2003, OntoNotes 5.0, OntoNotes 4.0, and BC5CDR datasets under 5-Way k-Shot (k = 5, 10), the MPSCER-NER model achieves minimum F1-score improvements of 1.32%, 2.14%, 1.05%, 1.32%, 0.84%, 1.46%, 1.43%, and 1.11% across these dataset and shot settings over the strongest of the baselines NNshot, StructShot, MatchingCNN, ProtoBERT, DNER, and SRNER.

1. Introduction

Human–computer dialogue systems have been a hot topic in the research domain of natural language comprehension [1]. Dialogue state tracking, as one of the key tasks of human–computer dialogue systems, helps the system to accurately understand a user’s words and execute their commands. To accomplish this goal, access to structured information provided by NER is required for dialogue state tracking [2].
NER tasks identify and classify entities in text into predefined categories, such as the names of people, places, and organizations. Structured information is formed to provide data support for downstream tasks [3]. As the field of natural language processing has evolved, named entities in downstream tasks have been refined. The naming rules for these entities vary widely, resulting in significant labor costs for the annotation of data. Therefore, the development of methods that can be used to accurately identify named entities in scenarios with limited data has become an urgent requirement in this field.
In recent years, pretrained large language models (LLMs) have gradually emerged. LLMs have also been used to implement NER tasks in few-shot scenarios [4]. Prompt learning is introduced into the LLM to recognize named entities and reduce the need for a large amount of labeled data. Good prompts can prevent the need for many annotated examples, and can thus contribute to improved performance [5]. Additionally, data augmentation methods can also increase sample diversity, enhancing performance in the context of few-shot NER. These methods play a crucial role in the further development of named entity recognition. The main contributions of this study are as follows:
  • A model is designed to address the NER task in few-shot scenarios. A multi-class label prompt selection strategy is designed to select an annotated instance with a clear sentence structure for demonstration. The entity context information between the sentence and the multi-class label prompts is enhanced to improve the accuracy of core entity recognition. The optimization effect of multi-class label prompt demonstrations on word vector representations for entities in target sentences is empirically validated. Experiments with low-density core entity demonstrations empirically show that prompts with clearer sentence structures can effectively enhance the accuracy of core entity recognition.
  • A core entity replacement strategy is designed to increase the diversity of input word vectors during training. A weighted random algorithm is employed to retrieve the core entities that are to be replaced in the prompt. The core entities selected in the multi-class label prompt are updated during each training epoch, and the vector of each token in the training data is updated accordingly. The core entity replacement method dynamically updates word vector labels in demonstration prompts. A novel approach to enrich input data in few-shot learning scenarios is thus proposed.
  • Experiments on the CoNLL-2003, OntoNotes 5.0, OntoNotes 4.0, and BC5CDR datasets demonstrated the superiority of our model in few-shot NER.
The rest of the study is organized as follows: Section 2 reviews the literature related to the research conducted in the present study. Section 3 details the structure of the MPSCER-NER (multi-class label prompt selecting with core entity replacement for named entity recognition) model, and the working principle of the model is described in detail. Section 4 presents the results and analysis of the modeling experiments. Section 5 summarizes the conclusions of the study and presents future research directions.

3. The MPSCER-NER Model

The goal of the present study was to address the problems that arise when the context window of a demonstration-based few-shot NER model is limited to specific entities, and to resolve the problems that arise when the NER model learns word vectors carrying little information. A new model, MPSCER-NER, for multi-class label prompt selection with core entity replacement is proposed. Multi-class label prompts are retrieved as candidates from the few-shot instance dataset. Among the candidates, we identify instances with low core entity density to serve as the demonstration. The sentence to be identified, together with the multi-class label prompt demonstration, is input into BERT to obtain better token representations. A weighted random algorithm is used to select a core entity from the demonstration, and the entity is replaced with its corresponding label. The training data word vectors change after the demonstration update; the new word vectors help BERT to recognize similar entities in different situations. BiLSTM handles the token representations of the multi-class label prompt demonstration. CRF processes the information from BiLSTM to resolve mismatches between labels and finally outputs the entity labels. The MPSCER-NER framework is shown in Figure 1.
Figure 1. The MPSCER-NER framework.

3.1. Multi-Class Label Prompt Selecting

The task of NER can be described as follows: Given a sentence S consisting of n words, where each word is denoted as x_i, the individual entities in the text are identified and annotated according to a pre-specified set of label types Y = {PER, LOC, ORG, O, …}; each word is mapped to a label in this set according to a computed probability. A core entity is an entity whose label is not O.
Prior knowledge in the model can be effectively leveraged to enhance NER performance through prompt learning. Sentences from the instance dataset are selected as demonstration prompts to enable the model to accurately understand the entity information of the downstream task. In a demonstration sentence, the core entity is the key information of the sentence that effectively represents the features of its corresponding label. A contextual link between the entities is formed using the sentence that is to be recognized. Sentences that have more than one type of entity in the instance set are called multi-class label prompts. Using a multi-class label prompt as a demonstration can help models in learning information about various types of entities. The MPSCER-NER model learns the distribution of the input text and the overall format of the sequence from the multi-class label prompts. The process of multi-class label prompt selection is shown in Figure 2.
Figure 2. Multi-class label prompt selection.
In Figure 2, PER, LOC, and ORG denote the names of persons, places, and organizations, respectively. MISC represents the other core entity types. In the sentence ‘Sarah visited the NVIDIA in California’, ‘Sarah’ is PER, ‘NVIDIA’ is ORG, and ‘California’ is LOC, so the sentence contains three core entity types. Similarly, in the sentence ‘Obama returned to Washington’, ‘Obama’ is PER and ‘Washington’ is LOC. In the sentence ‘Sophia took her dog Bella for a walk’, ‘Sophia’ is PER and ‘Bella’ is MISC. The three sentences therefore contain 3, 2, and 2 distinct label types, respectively. The sentence ‘Sarah visited the NVIDIA in California’, which has the most distinct kinds of core entity, is selected as a candidate.
Among the candidates, the core entity density of each sentence is calculated. In a sentence with low core entity density, the core entities are highlighted and the sentence structure is clear, providing clear contextual information about entity boundaries and helping the LLM to identify core entities in both the annotated and the unlabeled corpus. The process of low-density core entity prompt selection is shown in Figure 3.
Figure 3. Low-density core entity prompt selection.
In Figure 3, there are five core entities in the sentence: PER (Sarah, John), ORG (Apple Inc., Google), and LOC (London). There are 12 non-core words, so the calculation yields a core entity density of 0.29. The core entity density is computed sequentially for the different candidate sentences in order to identify the sentence with the lowest core entity density.
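The selection logic described above can be summarized in a short sketch, assuming each candidate sentence is represented as a list of (token, label) pairs with ‘O’ marking non-core tokens; the helper names are illustrative, not the authors' released code.

```python
from typing import List, Tuple

Tagged = List[Tuple[str, str]]  # (token, label) pairs; "O" marks non-core tokens


def label_type_count(sentence: Tagged) -> int:
    """Number of distinct core entity label types in a candidate sentence."""
    return len({label for _, label in sentence if label != "O"})


def core_entity_density(sentence: Tagged) -> float:
    """Ratio of core entity tokens to all tokens (0.29 in the Figure 3 example)."""
    core = sum(1 for _, label in sentence if label != "O")
    return core / len(sentence) if sentence else 0.0


def select_multi_class_label_prompt(instances: List[Tagged]) -> Tagged:
    """Keep the sentences with the most label types, then take the lowest-density one."""
    max_types = max(label_type_count(s) for s in instances)
    candidates = [s for s in instances if label_type_count(s) == max_types]
    return min(candidates, key=core_entity_density)
```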
Let {p̂_1, p_2, …, p_m} represent the multi-class label prompt; here, p_j represents the jth word in the prompt, p̂_j marks a core entity word, and y_j represents the label corresponding to the jth word in the prompt. Let {x_1, x_2, …, x_n} represent the input corpus, with x_i the ith word in the corpus. The input corpus is combined with the multi-class label prompt and merged into the format {x_1, x_2, …, x_n [SEP] p̂_1, p_2, …, p_m}. A demonstration of the multi-class label prompt process is shown in Figure 4.
Figure 4. Multi-class label prompt demonstration.
As shown in Figure 4, in the NER method without demonstration, the core entity is only associated with the other words in the sentence that is to be recognized; only the linkage of entities within that sentence is constructed. In the NER method with the multi-class label prompt demonstration, the word vector of ‘Peter’ is additionally related to the word vector of ‘Sarah’ in the multi-class label prompt. The word vector of ‘Peter’ is influenced by the demonstration entity and moves closer to its corresponding label.
Multi-class label prompts are selected for the demonstrations of the sentences that are to be recognized. The entity contextual links between the sentences are constructed to obtain better token representation. The entity word vectors in the sentence to be recognized are close to their corresponding labels.
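One way to realize this concatenation with a standard BERT tokenizer is sketched below; the Hugging Face transformers API is an assumption on our part, since the paper only states that BERT-base-cased is used as the encoder.

```python
from transformers import BertTokenizerFast

tokenizer = BertTokenizerFast.from_pretrained("bert-base-cased")

sentence = "Peter got drunk alone at the bar"
prompt = "Sarah visited the NVIDIA in California"

# Passing the prompt as the second segment produces
# [CLS] sentence [SEP] prompt [SEP], so self-attention can link entities in the
# sentence to core entities in the multi-class label prompt demonstration.
encoding = tokenizer(sentence, prompt, return_tensors="pt")
print(tokenizer.decode(encoding["input_ids"][0]))
```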

3.2. Core Entity Replacement

To enable the language model to learn information about entities of the same category, entity labels are explicitly added to the demonstration. A core entity in the demonstration sentence is dynamically selected, and the entity is replaced with its corresponding label. A dynamic demonstration implies different entity word vectors, which exposes the model to word vectors in different scenarios and enhances its ability to recognize entities of the same type. An example of the core entity replacement process is shown in Figure 5.
Figure 5. An example of the core entity replacement process.
In Figure 5, the training sentence is ‘Peter got drunk alone at the bar’. The multi-class label prompt is ‘Sarah visited the NVIDIA in California’. Here, ‘NVIDIA’ is replaced in the prompt with its corresponding label, ‘ORG’. At the next training stage, ‘California’—selected by the weighted random algorithm—will be replaced with its corresponding label, ‘LOC’. Thus, the multi-class label prompt has been updated, and the word vectors in the training data have changed. The details of the core entity replacement method are illustrated in Figure 6.
Figure 6. The details of the core entity replacement method.
As shown in Figure 6, the word embeddings of the training data are updated with changes in the multi-class label prompts. The model is exposed to more word vectors, which helps to improve the ability of the MPSCER-NER model to recognize entities.
There are multiple kinds of core entities in the multi-class label prompt. For the MPSCER-NER model to adequately learn named entity vector information in different situations, the weighted random algorithm is used to ensure that each core entity can be replaced, achieving balanced learning of all entity labels. To prevent completely random label selection, each core entity is assigned a weight that adjusts its selection probability during training so that the most suitable word is selected. The new weight of the selected word, W_update, is calculated as shown in the following equation:
W_{update} = W_{origin} \cdot a \cdot r
where W_origin is the weight of the selected core entity before updating, a is a penalty factor specified in advance, and r is a random variable. After the weight has been updated, balanced replacement across the entity labels is ensured.
After identifying the selected core entity, the entity is replaced by its corresponding label. The original prompt {p̂_1, p_2, …, p_m} is transformed into {y_1, p_2, …, p_m}, and the prompt for the demonstration is obtained. The training data can therefore be formatted as {[CLS] X_input [SEP] y_1, p_2, …, p_m [SEP]}.
For example, the sentence selected using the multi-class label prompt selection method is ‘Yesterday, Sarah from Apple Inc. met John in London and attended the conference at Google headquarters’. At each round of training, the updated prompts with the training data are input into the model. The candidates for replacement from the core entities are shown in Figure 7.
Figure 7. Candidates for replacement from the core entities.
In Figure 7, X_input represents the input training data, Candidate represents the core entities to be replaced in the demonstration, Labels represents the core entity labels in the multi-class label prompt, and Prompt represents the multi-class label prompt. In each round of training, one core entity is replaced with its corresponding label. For example, ‘Sarah’, an entity labeled ‘PER’, is replaced to obtain a new sentence: ‘Yesterday, PER from Apple Inc. met John in London and attended the conference at Google headquarters’. The updated multi-class label prompt with the training corpus is input into the model.
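A minimal sketch of the selection-and-replacement step is given below. The reading of the weight update as the product W_origin · a · r follows the equation above, and the helper names and data structures are illustrative assumptions rather than the authors' implementation.

```python
import random


def pick_core_entity(weights: dict) -> str:
    """Weighted random choice among the candidate core entities."""
    entities = list(weights)
    return random.choices(entities, weights=[weights[e] for e in entities], k=1)[0]


def update_weight(weights: dict, chosen: str, a: float = 0.5) -> None:
    """Penalize the chosen entity (W_update = W_origin * a * r) so that other
    entities become more likely to be selected in later epochs."""
    weights[chosen] *= a * random.random()


def replace_core_entity(prompt_tokens: list, entity_labels: dict, chosen: str) -> list:
    """Swap the chosen core entity for its label in the demonstration prompt."""
    return [entity_labels[tok] if tok == chosen else tok for tok in prompt_tokens]


# Example with the Figure 5 prompt: "NVIDIA" may be replaced by its label "ORG".
prompt = "Sarah visited the NVIDIA in California".split()
labels = {"Sarah": "PER", "NVIDIA": "ORG", "California": "LOC"}
weights = {entity: 1.0 for entity in labels}
chosen = pick_core_entity(weights)
update_weight(weights, chosen)
print(replace_core_entity(prompt, labels, chosen))
```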
In the MPSCER-NER model, each word is labeled with a specific entity category. The classification loss function is used to measure the difference between the predicted label and the actual label. The classification loss is then minimized so that the model output moves closer to the probability distribution of the true labels, making the model more accurate in predicting the named entity category of each word. The classification loss is given by the following equation:
Loss = -\frac{1}{N} \sum_{n=0}^{N-1} \sum_{c=0}^{C-1} y_{n,c} \log p_{n,c}
where N denotes the number of samples; y_{n,c} indicates whether the word at position n truly belongs to category c, taking the value 1 if its true category equals c and 0 otherwise; and p_{n,c} denotes the predicted probability that the word at position n belongs to category c.
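In PyTorch, this token-level objective corresponds to the standard cross-entropy loss; the tensor shapes in the sketch below are invented for illustration.

```python
import torch
import torch.nn.functional as F

num_tokens, num_classes = 8, 5                   # e.g. 8 tokens and the labels PER, LOC, ORG, MISC, O
logits = torch.randn(num_tokens, num_classes)    # per-token scores from the classifier
targets = torch.randint(0, num_classes, (num_tokens,))  # gold class index per token

# F.cross_entropy applies log-softmax internally and averages over tokens,
# matching Loss = -(1/N) * sum_n sum_c y_{n,c} * log p_{n,c} with one-hot y.
loss = F.cross_entropy(logits, targets)
```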
The MPSCER-NER model selects core entities in the multi-class label prompts as candidates during training. The weighted random algorithm is used to determine the next word to be replaced; the word vectors of the input corpus change with each demonstration update, which enhances the MPSCER-NER model's ability to conduct entity segmentation.

3.3. BiLSTM-CRF Classification

After acquiring word vectors with the contextual information of multi-class label prompts, BiLSTM-CRF is used to process semantic information and output label sequences. Multi-class label prompts increase the length of input sentences. The sequence context information can be effectively captured by BiLSTM. The information processed using BiLSTM is passed on to the CRF. An example of how BiLSTM-CRF processes word embeddings in MPSCER-NER is shown in Figure 8.
Figure 8. An example of how BiLSTM-CRF processes word embeddings in MPSCER-NER.
In Figure 8, the acquired word vectors affected by the multi-class label prompts are processed by BiLSTM. Finally, the word vectors with entity context information are output to CRF.
The generated word vectors still suffer from the problem of mismatched labels: the classifier can only discern the most likely label for a word vector, not whether the resulting label sequence makes logical sense. Dependencies between labels can be established by the CRF, and the output is adjusted globally based on these dependencies. This makes the NER model more compatible with the actual labeling constraints and also improves the robustness of the model with respect to named entity boundaries.
The observation sequences and state sequences of the CRF interact with each other through feature functions defined on the input sequence, which can capture different features depending on the needs of the modeling task. Therefore, the CRF is chosen to resolve mismatches between adjacent labels. The label probability inferred by the CRF is given by the following equations:
P(y \mid p) = \frac{1}{Z(x)} \exp\left( \sum_{i,k} \lambda_k t_k(y_{i-1}, y_i, p, i) + \sum_{i,l} u_l s_l(y_i, p, i) \right)
Z(x) = \sum_{y} \exp\left( \sum_{i,k} \lambda_k t_k(y_{i-1}, y_i, p, i) + \sum_{i,l} u_l s_l(y_i, p, i) \right)
where Z(x) is the normalization factor; λ_k and u_l are the weight values corresponding to the feature functions; p is the input sequence and y is the corresponding label sequence; t_k is the kth transition feature function; s_l is the lth state feature function; and y_i is the predicted label of the ith word.
The function t_k(·) depends on the labels at the current and previous positions, while s_l(·) depends only on the label at the current position. The transition feature function t_k(·) and the state feature function s_l(·) are calculated as shown in the following equations:
t_k(y_{i-1}, y_i, x, i) = \begin{cases} 1, & y_{i-1} = B,\ y_i = I \\ 0, & \text{otherwise} \end{cases}
s_l(y_i, x, i) = \begin{cases} 1, & y_i = B \\ 0, & \text{otherwise} \end{cases}
where y i 1 is the word label of the previous position, and y i is the word label of the current position. The letter B denotes the beginning of the label (Begin), and I denotes the position inside the label (Inside). We obtain 1 for B and I, and we obtain 0 in all the other cases.
After BiLSTM-CRF processing, the contextual information in the sentence is clearer and the illogical label pairings are reduced. The labels of the identified entities are closer to their real labels.
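For illustration, a compact BiLSTM-CRF head over the prompt-aware BERT embeddings could be written as follows; the pytorch-crf package (torchcrf) is an assumed choice, as the paper does not name the CRF implementation it uses.

```python
import torch.nn as nn
from torchcrf import CRF  # pytorch-crf package; an assumed choice, not named in the paper


class BiLSTMCRFHead(nn.Module):
    """BiLSTM-CRF classification head over prompt-aware BERT token embeddings."""

    def __init__(self, num_tags: int, bert_dim: int = 768, hidden_dim: int = 256, dropout: float = 0.5):
        super().__init__()
        self.lstm = nn.LSTM(bert_dim, hidden_dim, batch_first=True, bidirectional=True)
        self.dropout = nn.Dropout(dropout)
        self.emissions = nn.Linear(2 * hidden_dim, num_tags)
        self.crf = CRF(num_tags, batch_first=True)

    def forward(self, bert_output, tags=None, mask=None):
        feats, _ = self.lstm(bert_output)                # capture sequence context
        scores = self.emissions(self.dropout(feats))     # per-token tag scores
        if tags is not None:
            # training: negative log-likelihood of the gold tag sequence
            return -self.crf(scores, tags, mask=mask, reduction="mean")
        # inference: globally consistent best label sequence
        return self.crf.decode(scores, mask=mask)
```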
In summary, the MPSCER-NER model training processes are described as follows:
The MPSCER-NER model selects multi-class label prompts as demonstrations. The distribution of the text and the overall input format of the sequence are learned through prompt learning. The weighted random algorithm is used to replace core entities, helping the MPSCER-NER model to identify entities in different situations. Finally, the word vectors are input into BiLSTM-CRF to deal with mismatches between adjacent labels in the output sequence. The training algorithm of the MPSCER-NER model is shown in Algorithm 1.
Algorithm 1 MPSCER-NER model training
Require: Training dataset D; prompt dataset P; batch size B_s; epochs E_p; learning rate L_r; dropout D_r; θ, the initial MPSCER-NER model parameters.
Ensure: The trained MPSCER-NER model (θ̃).
 1: Θ ← ∅
 2: for p_t in P do
 3:     if CoreEntityLabel(p_t) > Label_num then
 4:         Clear Θ
 5:         Θ ← {p_t}
 6:         Label_num ← CoreEntityLabel(p_t)
 7:     end if
 8:     if CoreEntityLabel(p_t) = Label_num then
 9:         Θ ← Θ ∪ {p_t}
10:     end if
11: end for
12: X̃ ← None
13: δ ← 0
14: for c in Θ do
15:     if CoreEntityDense(c) < δ then
16:         X̃ ← c
17:     end if
18: end for
19: Ω ← CoreEntityRetrive(X̃)
20: for n in E_p do
21:     Ω ← CoreEntityRetrive(X̃)
22:     X̃ ← CoreEntityReplace(X̃, y_i)
23:     Loss(θ) = -(1/N) Σ_{n=0}^{N-1} Σ_{c=0}^{C-1} y_{n,c} log p(θ)
24:     θ̃ ← θ − η ∇Loss(θ)
25: end for
26: return the MPSCER-NER model (θ̃)
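The algorithm can also be rendered as an ordinary PyTorch training loop. The sketch below assumes the demonstration prompt has already been selected (Section 3.1) and that entity_labels maps each core entity token to its label; the model(sentences, demo, tags) interface is hypothetical, standing in for the BERT + BiLSTM-CRF pipeline.

```python
import random
import torch


def train_mpscer(model, train_loader, prompt_tokens, entity_labels, epochs=50, lr=2e-5, a=0.5):
    """Sketch of Algorithm 1: in every epoch one core entity of the demonstration
    prompt is replaced by its label, so the encoded training inputs keep changing."""
    optimizer = torch.optim.AdamW(model.parameters(), lr=lr)
    weights = {entity: 1.0 for entity in entity_labels}             # replacement weights
    for _ in range(epochs):
        chosen = random.choices(list(weights), weights=list(weights.values()), k=1)[0]
        weights[chosen] *= a * random.random()                      # W_update = W_origin * a * r
        demo = [entity_labels[chosen] if tok == chosen else tok for tok in prompt_tokens]
        for sentences, tags in train_loader:
            loss = model(sentences, demo, tags)                     # hypothetical model interface
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()
    return model
```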

4. Experimental Results and Analysis

4.1. Datasets and Experimental Settings

In this study, BERT-base-cased is used as the encoder, AdamW is used as an optimizer, the PyTorch 1.7 library is used for implementing the code, and an Nvidia GeForce RTX2080Ti GPU is used for training.
The CoNLL-2003 [23] dataset is a news corpus containing English and German texts; it is used to train and evaluate NER systems and is widely used to evaluate the ability of natural language processing models to recognize named entities, such as the names of people, places, and organizations. The OntoNotes 5.0 [24] dataset contains textual data from a wide range of sources, such as news, conversations, web, and radio; it covers a large number of linguistic tasks such as named entity recognition, semantic role annotation, and denotational disambiguation. OntoNotes 4.0 [25] is a dataset covering Chinese–English named entity annotation. BC5CDR [26] is a dataset for named entity recognition tasks in the biomedical field. These datasets are often used to validate the performances of various NER models. The dataset statistics are shown in Table 1.
Table 1. Dataset statistics.
Note the following hyper-parameters: ‘Learning rate’ denotes the learning rate, which is set to 2 × 10⁻⁵; ‘Batch size’ denotes the batch size, which is set to 64; ‘Epoch’ denotes the number of training rounds, which is set to 50; and ‘Dropout’ denotes the dropout rate, which is set to 0.5. Training stops if the F1-score does not improve after Maxnoincre epochs. The experimental hyper-parameter settings are shown in Table 2.
Table 2. Experimental hyper-parameter settings.
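A minimal setup mirroring these reported settings might look as follows; only the values listed above are taken from the paper, and the Maxnoincre early-stopping logic is omitted.

```python
import torch
from transformers import BertModel

config = {
    "encoder": "bert-base-cased",   # encoder named in Section 4.1
    "learning_rate": 2e-5,          # values from Table 2
    "batch_size": 64,
    "epochs": 50,
    "dropout": 0.5,
}

encoder = BertModel.from_pretrained(config["encoder"])
optimizer = torch.optim.AdamW(encoder.parameters(), lr=config["learning_rate"])
```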

4.2. Evaluation Indicators

In this experiment, Precision, Recall, and F1 are used to evaluate the performances of the MPSCER-NER model and the comparison models.
Precision is used to measure the accuracy of the MPSCER-NER model in cases where the prediction is a positive case. Its value ranges from 0 to 1; higher values indicate that the model is more accurate in predicting positive cases. Precision is calculated as shown in the following equation:
Precision = \frac{TP}{TP + FP}
where TP denotes the number of texts where the actual label is true and the entity category result is also true, and FP denotes the number of entities whose actual label is false but which are identified as true.
Recall is used to measure the proportion of positive examples that can be captured by the MPSCER-NER model. Higher values indicate higher coverage of positive examples by the model. Recall is calculated as shown in the following equation:
Recall = \frac{TP}{TP + FN}
where FN denotes the number of entities whose actual label is true but whose entity identification result is false.
The F1-score is the harmonic mean of Precision and Recall, combining the performance of both. The F1-score is calculated as shown in the following equation:
F1 = \frac{2 \cdot Precision \cdot Recall}{Precision + Recall}
Overall, Precision measures how accurate the model's positive predictions are, Recall measures the model's coverage of the true positive examples, and F1 is a composite metric that strikes a balance between Precision and Recall.
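For reference, the three metrics can be computed directly from the raw counts; the counts in the example call are invented for illustration.

```python
def precision_recall_f1(tp: int, fp: int, fn: int):
    """Precision, recall, and F1 from true-positive, false-positive, and false-negative counts."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1


# 42 correctly recognized entities, 8 spurious predictions, 10 missed entities
print(precision_recall_f1(42, 8, 10))   # (0.84, 0.807..., 0.823...)
```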

4.3. Effectiveness on K-Shot

The number of shots is a critical parameter for further improving model performance. Within certain limits, increasing the number of shots improves the stability of convergence but may limit the generalization ability of the model. To verify the effect of the number of shots on the MPSCER-NER model, the effect of k-shot (k = 25, 50, 100) on the Precision, Recall, and F1 of the MPSCER-NER model on the CoNLL-2003 dataset is shown in Figure 9.
Figure 9. The effect of k-shot on the Precision, Recall, and F1 of the MPSCER-NER model.
As can be seen from the figure, the Precision, Recall, and F1 of the model gradually increase as the number of shots grows, and are maximized when the number of shots is 100. With more samples, the model learns more entity feature information and is exposed to different contexts and different forms of domain knowledge, which enhances its ability to recognize entity classes.

4.4. Confusion Matrices

As shown in Table 3, Table 4 and Table 5, the use of the multi-class label prompt selection method resulted in an increase in the TP + TN values for certain categories. We speculate that this outcome is related to the types of core entities presented in the multi-class label demonstration. The multi-class label prompt optimizes the vector representations of specific categories of entities in the sentences being recognized. In the case of solely using the core entity replacement method, the number of TP + TN values for multiple categories increased, which is due to the fact that each prompt update enriched the input word vectors and enhanced the model’s ability to identify entities. In the MPSCER-NER model, the recognition accuracy of TP + TN has effectively improved; however, the number of correctly identified MISC entities has not seen significant enhancement. We speculate that this is due to the presence of multiple different fine-grained labels within the MISC category.
Table 3. Confusion matrix on CoNLL-2003 (multi-class label prompt selection).
Table 4. Confusion matrix on CoNLL-2003 (core entity replacement).
Table 5. Confusion matrix on CoNLL-2003 (MPSCER-NER).

4.5. Ablation Studies

The MPSCER-NER model is mainly divided into modules for multi-class label prompt selection and core entity replacement. Different combinations of the stages are used in the CoNLL-2003 dataset. The MPSCER-NER model is experimentally analyzed using the ablation method. The ablation experiments verified the validity of the multi-class label prompt selection and core entity replacement methods. The approach for selecting the ablation experiment modules is shown in Table 6.
Table 6. MPSCER-NER model ablation experiment module selection.
As can be seen from Table 6, Module 1 denotes the strategy comprising multi-class label prompt demonstration only. Module 2 denotes the strategy comprising multi-class label prompt demonstration with low-density core entity selection. Module 3 denotes the strategy comprising core entity replacement with a weighted random algorithm. Module 4 denotes the strategy comprising multi-class label prompt demonstration with core entity replacing. Module 5 is the model presented in this study, MPSCER-NER.
From the results achieved using Module 1 and Module 2 in Table 7, it can be seen that the multi-class label prompt with low core entity density improves performance in the 25-shot, 50-shot, and 100-shot settings compared to the normal multi-class label prompts. Multi-class label prompts with low-density core entities have a clearer sentence structure, so the entity context information between the sentence to be recognized and the demonstration sentence is clearer. The results of the core entity replacement module (Module 3) show that the Recall of entity recognition is effectively improved: as the demonstration prompt changes, the word vectors in the training sentences also change, so the model is exposed to more word vectors and becomes more capable of recognizing entities. In the 25-shot, 50-shot, and 100-shot settings, the MPSCER-NER model performs best in terms of Precision, Recall, and F1-score, while core entity replacement applied to normal multi-class label prompts shows a sub-optimal performance. These results show that the multi-class label prompt selection and core entity replacement methods can effectively improve the model's ability to recognize entities and that their effects mutually reinforce each other.
Table 7. Results of ablation experiments with the MPSCER-NER model.

4.6. Baselines

To validate the performance of the MPSCER-NER model proposed in this study, comparative experiments were conducted on CoNLL-2003, Ontonotes 5.0, and Ontonotes 4.0 with benchmark models, including the following:
  • NNshot and StructShot [27] are simple few-shot NER methods based on nearest neighbor learning and structured inference. A supervised NER model trained on the source domain is used for feature extraction; a nearest neighbor classifier then learns in the feature space, and structured decoding captures the dependencies between entity labels.
  • MatchingCNN [28] is a network that maps a small labeled support set and an unlabeled example to its label. It calculates the similarity between query instances and support instances, adapting to the recognition of new class types.
  • ProtoBERT [29] uses a token-level prototypical network that represents each class by averaging token representations with the same label; then, the label of each token in the query set is decided by its nearest class prototype.
  • DemonstrationNER [11] is a prompt learning NER method based on demonstration. The sentences marked in the dataset are selected as prompts to be input into the BERT model. The authors presented a demonstration of the relationship between the entities and the labels after the example sentences were constructed. This process helps the model to learn the contextual information from the task demonstration, contextualizing the task before the input, and enabling the model to recognize more entities through a good demonstration.
  • SR-Demonstration [30] is an NER method that was proposed for marking the relevance of demonstrations; it removes useless information from demonstration prompts, creates a relevance vocabulary consisting of tokens that appear in the annotated datasets, samples the tokens from the relevance vocabulary to replace the tokens in the demonstration, and calculates the most suitable demonstration sentence length required to achieve a demonstration of NER.
As can be seen from Table 8, the MPSCER-NER model performs best in the 25-shot and 50-shot scenarios on the CoNLL-2003, OntoNotes 5.0, and OntoNotes 4.0 datasets. Compared with DemonstrationNER, the strongest baseline on CoNLL-2003 among NNshot, StructShot, MatchingCNN, ProtoBERT, DemonstrationNER, and SR-Demonstration, the MPSCER-NER model improved the F1-score in the 25-shot and 50-shot scenarios by 1.32% and 2.14%, respectively. Compared to SR-Demonstration, which performed best on OntoNotes 4.0 and OntoNotes 5.0, MPSCER-NER improved the F1-score in the 25-shot and 50-shot scenarios by 1.05%, 1.32%, 0.84%, and 1.46%. Compared to SR-Demonstration, which also performed best on BC5CDR, MPSCER-NER improved the F1-score in the 25-shot and 50-shot scenarios by 1.43% and 1.11%, respectively. The MPSCER-NER model identifies multi-class label prompts in the set of examples and uses them as demonstrations. Entity context links between the multi-class label prompts and the sentences to be recognized can thus be formed, and the word vector representations in the sentences are optimized to move closer to their corresponding labels. During training, the core entities in the multi-class label prompts are replaced with their corresponding labels; the word vectors in the sentences are thereby altered, improving the model's ability to recognize named entities in different situations.
Table 8. Performance of the model with different datasets: F1-score.

5. Conclusions

Intelligent dialogue systems are gradually being integrated into people's lives. Dialogue state tracking is a fundamental task that intelligent dialogue systems must perform, and NER provides entity information that helps dialogue state tracking models to accurately understand users' words. In this study, a multi-class label prompt selection and core entity replacement-based named entity recognition (MPSCER-NER) model was proposed. In the multi-class label prompt selection phase, multi-class label prompts from the instance dataset are selected as candidates; the candidate prompts with low core entity density are used as the demonstration and are input into the model together with the sentences that are to be recognized. Contextual links between entities are established, and the word vector representation is optimized to improve the model's ability to identify positive samples. In the core entity replacement phase, the weighted random algorithm is used to select the core entities that will be replaced; the word vectors of the training data are thereby enriched, and the ability of the model to recognize entities in different scenarios is enhanced. In the entity classification stage, BiLSTM is used to process word vectors carrying information from the multi-class label prompts, and CRF is used to deal with the problem of mismatches between adjacent labels. The experimental results show that, on CoNLL-2003, OntoNotes 5.0, OntoNotes 4.0, and BC5CDR under 5-Way k-Shot (k = 5, 10), the MPSCER-NER model achieves minimum F1-score improvements of 1.32%, 2.14%, 1.05%, 1.32%, 0.84%, 1.46%, 1.43%, and 1.11% across these dataset and shot settings over the strongest of the baselines NNshot, StructShot, MatchingCNN, ProtoBERT, DNER, and SRNER. These results demonstrate the superior NER accuracy of the MPSCER-NER model. The multi-class label prompt selection method can effectively improve the accuracy of positive sample recognition, and the core entity replacement method can effectively improve the recall rate. These results are promising for the recognition of proprietary named entities. However, the methods cannot achieve the expected effect in a zero-shot scenario; methods for addressing this problem must be researched further.

6. Limitations

In the experimental results, we can see that the model’s accuracy is limited; this can be attributed to the following factors: (1) Limitations in annotated sample size—in few-shot settings, 5-way–5-shot and 5-way–10-shot annotated datasets were used to train models (excluding BC5CDR); here, the limited nature of the training data means that it is difficult to cover the full diversity of named entity expressions and to account for the complexity of contextual environments. Additionally, the small number of annotated samples leads to unstable parameter updates, causing the optimization process to fall into local optima. (2) Limitations in model parameter size—the BERT-base model, comprising 12 attention layers and approximately 110 million parameters, was used for named entity recognition; the relatively smaller parameter size of this model led to a limitation in the achievable F1-score. (3) The granularity of the data—among the CoNLL03, OntoNotes 5.0, OntoNotes 4.0, and BC5CDR datasets, the CoNLL03 dataset achieved the highest accuracy, which is primarily because it is a clean and general-purpose dataset; in OntoNotes 5.0 and OntoNotes 4.0, named entities were similarly divided into five categories (i.e., PER, LOC, ORG, MISC, and O); however, within the MISC category, the experimental data exhibit finer granularity and a larger number of subtypes, resulting in lower F1-scores on the OntoNotes datasets compared to on the CoNLL03 dataset; BC5CDR, as a biomedical domain dataset, contains highly specialized terminology, and compared to the general-domain pretraining corpus used by BERT, its specialized nature leads to lower accuracy relative to general-domain datasets.

7. Future Work

In this model, annotated data are required, providing prompts for establishing the connections between the training data and the multi-class label prompts for the entities. As a result, the model cannot independently perform the zero-shot named entity recognition (NER) task. Domain adaptation and zero-shot data generation are identified as key approaches for enabling zero-shot NER.

7.1. Domain Transfer

(1) The annotated source domain dataset is used as the training set, where multi-class label templates are selected as demonstration prompts. In this approach, the BERT model serves solely as a fixed feature extractor: all of its parameters are frozen and are not updated during training. Only the linear classification layer is fine-tuned, so that the model can learn the mapping from the BERT-derived representations to the specific label templates. (2) Subsequently, the constructed label templates can be utilized in a prompt-based or text-matching manner to predict entities in the target domain. This framework ultimately enables the implementation of zero-shot named entity recognition.
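The frozen-encoder setup described in step (1) could be realized as in the sketch below; the label set and the single linear classification layer are illustrative assumptions.

```python
import torch.nn as nn
from transformers import BertModel

bert = BertModel.from_pretrained("bert-base-cased")
for param in bert.parameters():
    param.requires_grad = False        # BERT acts as a fixed feature extractor

num_labels = 5                         # e.g. PER, LOC, ORG, MISC, O (assumed label set)
classifier = nn.Linear(bert.config.hidden_size, num_labels)  # only this layer is fine-tuned
```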

7.2. Zero-Shot Data Generation

(1) Annotated data are selected to train the generative model; a text-generation task can be constructed based on these annotations, transforming the original named entity recognition (NER) task into a text-to-text generation task. This allows the model to generate corresponding zero-shot NER data, which can subsequently be used to train an NER model. (2) The generated data are then converted into the standard NER BIO tagging format, and these standardized data are used to train the target-domain NER model.
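Step (2) amounts to converting generated entity annotations into BIO tags. The sketch below assumes a (start, end, label) span format with inclusive token indices, which is our own illustrative convention.

```python
def spans_to_bio(tokens, spans):
    """Convert (start, end, label) token spans, inclusive indices, into BIO tags."""
    tags = ["O"] * len(tokens)
    for start, end, label in spans:
        tags[start] = f"B-{label}"
        for i in range(start + 1, end + 1):
            tags[i] = f"I-{label}"
    return list(zip(tokens, tags))


tokens = "Sarah visited the NVIDIA in California".split()
print(spans_to_bio(tokens, [(0, 0, "PER"), (3, 3, "ORG"), (5, 5, "LOC")]))
```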

Author Contributions

D.W. and Y.C. wrote the main manuscript and M.Y. conducted the experiments. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by Research Projects of the Nature Science Foundation of Hebei Province grant number F2020402003.

Institutional Review Board Statement

Not applicable.

Data Availability Statement

The datasets used in this study were derived from public resources and made available within the article. We have published our code at https://github.com/CMSLDL/MPSCERmodel (accessed on 25 May 2025).

Acknowledgments

The authors look forward to the insightful comments and suggestions of the anonymous reviewers and editors, which will go a long way towards improving the quality of this study.

Conflicts of Interest

The authors declare that they have no conflicts of interest.

References

  1. Jiang, M.; Chen, H. Label-Guided Data Augmentation for Chinese Named Entity Recognition. Appl. Sci. 2025, 15, 2521. [Google Scholar] [CrossRef]
  2. Jehangir, B.; Radhakrishnan, S.; Agarwal, R. A survey on Named Entity Recognition—datasets, tools, and methodologies. Nat. Lang. Process. J. 2023, 3, 100017. [Google Scholar] [CrossRef]
  3. Gong, F.; Tong, S.; Du, C.; Wan, Z.; Qiu, S. Named Entity Recognition in the Field of Small Sample Electric Submersible Pump Based on FLAT. Appl. Sci. 2025, 15, 2359. [Google Scholar] [CrossRef]
  4. Hu, Y.; Chen, Q.; Du, J.; Peng, X.; Keloth, V.K.; Zuo, X.; Zhou, Y.; Li, Z.; Jiang, X.; Lu, Z.; et al. Improving large language models for clinical named entity recognition via prompt engineering. J. Am. Med. Inform. Assoc. 2024, 31, 1812–1820. [Google Scholar] [CrossRef]
  5. Chen, Y.; Zheng, Y.; Yang, Z. Prompt-Based Metric Learning for Few-Shot NER. In Proceedings of the Findings of the Association for Computational Linguistics: ACL 2023, Toronto, ON, Canada, 9–14 July 2023; pp. 7199–7212. [Google Scholar]
  6. Petroni, F.; Rocktäschel, T.; Riedel, S.; Lewis, P.; Bakhtin, A.; Wu, Y.; Miller, A. Language Models as Knowledge Bases? In Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP), Hong Kong, China, 3–7 November 2019; pp. 2463–2473. [Google Scholar]
  7. Ding, N.; Chen, Y.; Han, X.; Xu, G.; Wang, X.; Xie, P.; Zheng, H.; Liu, Z.; Li, J.; Kim, H.G. Prompt-learning for Fine-grained Entity Typing. In Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2022, Abu Dhabi, United Arab Emirates, 7–11 December 2022; pp. 6888–6901. [Google Scholar]
  8. He, K.; Mao, R.; Huang, Y.; Gong, T.; Li, C.; Cambria, E. Template-free prompting for few-shot named entity recognition via semantic-enhanced contrastive learning. IEEE Trans. Neural Netw. Learn. Syst. 2023, 35, 18357–18369. [Google Scholar] [CrossRef] [PubMed]
  9. Hu, S.; Ding, N.; Wang, H.; Liu, Z.; Wang, J.; Li, J.; Wu, W.; Sun, M. Knowledgeable Prompt-tuning: Incorporating Knowledge into Prompt Verbalizer for Text Classification. In Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics, Dublin, Ireland, 22–27 May 2022; Volume 1, pp. 2225–2240. [Google Scholar]
  10. Gao, T.; Fisch, A.; Chen, D. Making Pre-trained Language Models Better Few-shot Learners. In Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing, Virtual, 5–6 August 2021; Volume 1, pp. 3816–3830. [Google Scholar]
  11. Lee, D.H.; Kadakia, A.; Tan, K.; Agarwal, M.; Feng, X.; Shibuya, T.; Mitani, R.; Sekiya, T.; Pujara, J.; Ren, X. Good Examples Make A Faster Learner: Simple Demonstration-based Learning for Low-resource NER. In Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics, Dublin, Ireland, 22–27 May 2022; Volume 1, pp. 2687–2700. [Google Scholar]
  12. Dong, G.; Wang, Z.; Zhao, J.; Zhao, G.; Guo, D.; Fu, D.; Hui, T.; Zeng, C.; He, K.; Li, X.; et al. A multi-task semantic decomposition framework with task-specific pre-training for few-shot ner. In Proceedings of the 32nd ACM International Conference on Information and Knowledge Management, Birmingham, UK, 21–25 October 2023; pp. 430–440. [Google Scholar]
  13. Huang, Y.; He, K.; Wang, Y.; Zhang, X.; Gong, T.; Mao, R.; Li, C. Copner: Contrastive learning with prompt guiding for few-shot named entity recognition. In Proceedings of the 29th International Conference on Computational Linguistics, Gyeongju, Republic of Korea, 12–17 October 2022; pp. 2515–2527. [Google Scholar]
  14. Su, L.; Chen, J.; Peng, Y.; Sun, C. Based Learning for Few-Shot Biomedical Named Entity Recognition Under Machine Reading Comprehension. J. Biomed. Inform. 2024, 159, 104739. [Google Scholar] [CrossRef] [PubMed]
  15. Wang, T.; Chen, J.; Ma, L. Chinese Named Entity Recognition by Fusing Dictionary Information and Sentence Semantics. Comput. Mod. 2024, 3, 24–28. [Google Scholar]
  16. Lu, X.; Sun, L.; Ling, C.; Tong, Z.; Liu, J.; Tang, Q. Named entity recognition of Chinese electronic medical records incorporating pinyin and lexical features. J. Chin. Mini-Micro Comput. Syst. 2025. [Google Scholar] [CrossRef]
  17. Mengge, X.; Yu, B.; Zhang, Z.; Liu, T.; Zhang, Y.; Wang, B. Coarse-to-Fine Pre-training for Named Entity Recognition. In Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP), Virtual, 16–20 November 2020; pp. 6345–6354. [Google Scholar]
  18. Chen, J.; Liu, Q.; Lin, H.; Han, X.; Sun, L. Few-shot Named Entity Recognition with Self-describing Networks. In Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics, Dublin, Ireland, 22–27 May 2022; Volume 1, pp. 5711–5722. [Google Scholar]
  19. Bartolini, I.; Moscato, V.; Postiglione, M.; Sperlì, G.; Vignali, A. Data augmentation via context similarity: An application to biomedical Named Entity Recognition. Inf. Syst. 2023, 119, 102291. [Google Scholar] [CrossRef]
  20. Liu, W.; Cui, X. Improving named entity recognition for social media with data augmentation. Appl. Sci. 2023, 13, 5360. [Google Scholar] [CrossRef]
  21. Zhou, R.; Li, X.; He, R.; Bing, L.; Cambria, E.; Si, L.; Miao, C. MELM: Data Augmentation with Masked Entity Language Modeling for Low-Resource NER. In Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics, Dublin, Ireland, 22–27 May 2022; Volume 1, pp. 2251–2262. [Google Scholar]
  22. Ghosh, S.; Tyagi, U.; Kumar, S.; Manocha, D. Bioaug: Conditional generation based data augmentation for low-resource biomedical ner. In Proceedings of the 46th International ACM SIGIR Conference on Research and Development in Information Retrieval, Taipei, Taiwan, 23–27 July 2023; pp. 1853–1858. [Google Scholar]
  23. Chang, J.; Han, X. Character-to-word representation and global contextual representation for named entity recognition. Neural Process. Lett. 2023, 55, 8551–8567. [Google Scholar] [CrossRef]
  24. Fang, J.; Wang, X.; Meng, Z.; Xie, P.; Huang, F.; Jiang, Y. MANNER: A variational memory-augmented model for cross domain few-shot named entity recognition. In Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics, Toronto, ON, Canada, 9–14 July 2023; Volume 1, pp. 4261–4276. [Google Scholar]
  25. Sajun, A.R.; Zualkernan, I.; Sankalpa, D. A Historical Survey of Advances in Transformer Architectures. Appl. Sci. 2024, 14, 4316. [Google Scholar] [CrossRef]
  26. Li, J.; Sun, Y.; Johnson, R.J.; Sciaky, D.; Wei, C.H.; Leaman, R.; Davis, A.P.; Mattingly, C.J.; Wiegers, T.C.; Lu, Z. BioCreative V CDR task corpus: A resource for chemical disease relation extraction. Database 2016, 2016, baw068. [Google Scholar] [CrossRef] [PubMed]
  27. Yang, Y.; Katiyar, A. Simple and Effective Few-Shot Named Entity Recognition with Structured Nearest Neighbor Learning. In Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP), Virtual, 16–20 November 2020; pp. 6365–6375. [Google Scholar]
  28. Vinyals, O.; Blundell, C.; Lillicrap, T.; Wierstra, D. Matching Networks for One Shot Learning. In Proceedings of the 30th International Conference on Neural Information Processing Systems, Barcelona, Spain, 5–10 December 2016; pp. 3630–3638. [Google Scholar]
  29. Fritzler, A.; Logacheva, V.; Kretov, M. Few-shot classification in named entity recognition task. In Proceedings of the ACM Symposium on Applied Computing, Limassol, Cyprus, 8–12 April 2019; pp. 993–1000. [Google Scholar]
  30. Zhang, H.; Zhang, Y.; Zhang, R.; Yang, D. Robustness of Demonstration-based Learning Under Limited Data Scenario. In Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing, Abu Dhabi, United Arab Emirates, 7–11 December 2022; pp. 1769–1782. [Google Scholar]
