CLFF-NER: A Cross-Lingual Feature Fusion Model for Named Entity Recognition in the Traditional Chinese Festival Culture Domain
Abstract
1. Introduction
2. Related Work
3. Method

3.1. MKN Module

3.2. Transformer Module

3.3. GNN Module

3.4. Decode Module
4. Experiments
4.1. Dataset
4.2. Evaluation Metrics
4.3. Baselines
4.4. Experimental Environment
4.5. Result
5. Conclusions
5.1. Result Analysis
5.2. Error Analysis
5.3. Summary and Future Work
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
References
1. Geng, R.; Chen, Y.; Huang, R.; Qin, Y.; Zheng, Q. Planarized sentence representation for nested named entity recognition. Inf. Process. Manag. 2023, 60, 103352.
2. Qiu, Q.; Tian, M.; Huang, Z.; Xie, Z.; Ma, K.; Tao, L.; Xu, D. Chinese engineering geological named entity recognition by fusing multi-features and data enhancement using deep learning. Expert Syst. Appl. 2024, 238, 121925.
3. Yu, Y.T.; Wang, Z.B.; Wei, W.; Zhang, R.H.; Mao, X.L.; Feng, S.S.; Wang, F.; He, Z.Y.; Jiang, S. Exploiting global contextual information for document-level named entity recognition. Knowl.-Based Syst. 2024, 284, 111266.
4. Devlin, J.; Chang, M.W.; Lee, K.; Toutanova, K. BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding. arXiv 2018.
5. Lan, Z.; Chen, M.; Goodman, S.; Gimpel, K.; Sharma, P.; Soricut, R. ALBERT: A lite BERT for self-supervised learning of language representations. arXiv 2020, arXiv:1909.11942.
6. Yang, Z.; Dai, Z.; Yang, Y.; Carbonell, J.; Salakhutdinov, R.R.; Le, Q.V. XLNet: Generalized autoregressive pretraining for language understanding. arXiv 2019, arXiv:1906.08237.
7. Hu, S.; Zhang, H.; Hu, X.; Du, J. Chinese Named Entity Recognition based on BERT-CRF Model. In Proceedings of the 2022 IEEE/ACIS 22nd International Conference on Computer and Information Science (ICIS), Zhuhai, China, 26–28 June 2022; pp. 105–108.
8. Shen, Y.; Tan, M.; Sordoni, A.; Courville, A. Ordered neurons: Integrating tree structures into recurrent neural networks. arXiv 2018, arXiv:1810.09536.
9. Liu, Y.; Wei, S.; Huang, H.; Lai, Q.; Li, M.; Guan, L. Naming entity recognition of citrus pests and diseases based on the BERT-BiLSTM-CRF model. Expert Syst. Appl. 2023, 234, 121103.
10. Xu, M.; Hou, F.; Liu, J.; Zhang, M.; Shi, L.; Kou, F.; Guo, L.; Yu, P.S.; Hu, X. Multimodal named entity recognition in the era of large pre-trained models: A comprehensive survey. Inf. Fusion 2026, 127, 103767.
11. Kong, B.; Liu, S.; Jia, L.; Liang, Y.; Han, D.; Zhang, X. MINIGE-MNER: A multi-stage interaction network inspired by gene editing for multimodal named entity recognition. Neural Netw. 2026, 194, 108106.
12. Xu, M.; Peng, K.; Liu, J.; Zhang, Q.; Song, L.; Li, Y. Multimodal Named Entity Recognition based on topic prompt and multi-curriculum denoising. Inf. Fusion 2025, 124, 103405.
13. Zhang, H.; Lyu, L.; Chang, W.; Zhao, Y.; Peng, X. A Chinese medical named entity recognition method considering length diversity of entities. Eng. Appl. Artif. Intell. 2025, 150, 110649.
14. Li, J.; Sun, A.; Han, J.; Li, C. A Survey on Deep Learning for Named Entity Recognition. IEEE Trans. Knowl. Data Eng. 2022, 34, 50–70.
15. Francis, S.; Van Landeghem, J.; Moens, M.-F. Transfer Learning for Named Entity Recognition in Financial and Biomedical Documents. Information 2019, 10, 248.
16. Salinas Alvarado, J.; Verspoor, K.; Baldwin, T. Domain Adaption of Named Entity Recognition to Support Credit Risk Assessment. In Proceedings of Australasian Language Technology Association Workshop; Hachey, B., Webster, K., Eds.; ACL: Parramatta, Australia, 2015; Available online: https://aclanthology.org/U15-1010/ (accessed on 14 June 2024).
17. Song, B.; Li, F.; Liu, Y.; Zeng, X. Deep learning methods for biomedical named entity recognition: A survey and qualitative comparison. Brief Bioinform. 2021, 22, bbab282.
18. Loukas, L.; Fergadiotis, M.; Chalkidis, I.; Spyropoulou, E.; Malakasiotis, P.; Androutsopoulos, I.; Paliouras, G. FiNER: Financial Numeric Entity Recognition for XBRL Tagging. arXiv 2022, arXiv:2203.06482.
19. Wang, N.; Yang, H.; Wang, C.D. FinGPT: Instruction Tuning Benchmark for Open-Source Large Language Models in Financial Datasets. arXiv 2023, arXiv:2310.04793.
20. Ogrinc, M.; Koroušić Seljak, B.; Eftimov, T. Zero-shot evaluation of ChatGPT for food named-entity recognition and linking. Front. Nutr. 2024, 11, 1429259.
21. Krstev, I.; Mishkovski, I.; Mirchev, M.; Golubova, B.; Gramatikov, S. Extracting Entities and Relations in Analyst Stock Ratings News. In ICT Innovations 2023. Learning: Humans, Theory, Machines, and Data; Mihova, M., Jovanov, M., Eds.; Communications in Computer and Information Science; Springer: Cham, Switzerland, 2024; Volume 1991.
22. Hu, Y.; Chen, Y.; Xu, Y. A shape composition method for named entity recognition. Neural Netw. 2025, 187, 107389.
23. Katz, D.M.; Hartung, D.; Gerlach, L.; Jana, A.; Bommarito, M.J., II. Natural language processing in the legal domain. arXiv 2023, arXiv:2302.12039.
24. Lee, K.; Wei, C.H.; Lu, Z. Recent advances of automated methods for searching and extracting genomic variant information from biomedical literature. Brief Bioinform. 2021, 22, bbaa142.
25. Li, I.; Pan, J.; Goldwasser, J.; Verma, N.; Wong, W.P.; Nuzumlalı, M.Y.; Rosand, B.; Li, Y.; Zhang, M.; Chang, D.; et al. Neural natural language processing for unstructured data in electronic health records: A review. Comput. Sci. Rev. 2022, 46, 100511.
26. Collobert, R.; Weston, J.; Bottou, L.; Karlen, M.; Kavukcuoglu, K.; Kuksa, P.P. Natural language processing (almost) from scratch. J. Mach. Learn. Res. 2011, 12, 2493–2537.
27. Hammerton, J.A. Named entity recognition with long short-term memory. In Proceedings of the Seventh Conference on Natural Language Learning, CoNLL 2003, Edmonton, AB, Canada, 31 May–1 June 2003; Daelemans, W., Osborne, M., Eds.; ACL: Edmonton, AB, Canada, 2003; pp. 172–175. Available online: https://aclanthology.org/W03-0426/ (accessed on 14 June 2024).
28. Lample, G.; Ballesteros, M.; Subramanian, S.; Kawakami, K.; Dyer, C. Neural architectures for named entity recognition. arXiv 2016.
29. Vaswani, A.; Shazeer, N.; Parmar, N.; Uszkoreit, J.; Jones, L.; Gomez, A.N.; Kaiser, L.; Polosukhin, I. Attention Is All You Need. arXiv 2023, arXiv:1706.03762.
30. Abilio, R.; Coelho, G.P.; da Silva, A.E.A. Evaluating Named Entity Recognition: A comparative analysis of mono- and multilingual transformer models on a novel Brazilian corporate earnings call transcripts dataset. Appl. Soft Comput. 2024, 166, 112158.
31. Huang, Z.; Xu, W.; Yu, K. Bidirectional LSTM-CRF Models for Sequence Tagging. arXiv 2015, arXiv:1508.01991.
32. Mao, Q.; Li, J.; Meng, K. Improving Chinese Named Entity Recognition by Search Engine Augmentation. arXiv 2022, arXiv:2210.12662.
33. Gu, R.; Wang, T.; Deng, J.; Cheng, L. Improving Chinese Named Entity Recognition by Interactive Fusion of Contextual Representation and Glyph Representation. Appl. Sci. 2023, 13, 4299.
34. Hu, W.; He, L.; Ma, H.; Wang, K.; Xiao, J. KGNER: Improving Chinese Named Entity Recognition by BERT Infused with the Knowledge Graph. Appl. Sci. 2022, 12, 7702.
35. Fang, Q.; Li, Y.; Feng, H.; Ruan, Y. Chinese Named Entity Recognition Model Based on Multi-Task Learning. Appl. Sci. 2023, 13, 4770.
36. Sun, S.; Deng, M.; Yu, X.; Zhao, L. HREB-CRF: Hierarchical Reduced-bias EMA for Chinese Named Entity Recognition. arXiv 2025, arXiv:2503.01217.
37. Li, H.; Cheng, M.; Yang, Z.; Yang, L.; Chua, Y. Named entity recognition for Chinese based on global pointer and adversarial training. Sci. Rep. 2023, 13, 3242.



| Entity | Description | Count |
|---|---|---|
| PER | People’s names, including full names, nicknames, etc. | 855 |
| DATE | Date, time, etc. | 2446 |
| ORG | Organizations, institutions, etc., such as the Department of Culture of Anhui Province. | 600 |
| LOC | Place names, such as the village of Tanga. | 3030 |
| FESTIVAL | Festivals, such as the Dragon Boat Festival and Chinese New Year. | 2076 |
| NATION | Ethnic groups, such as the Hmong and Zhuang. | 1167 |

| Dataset | Train | Test |
|---|---|---|
| Sentences | 3829 | 425 |

| Parameter | Value | Parameter | Value |
|---|---|---|---|
| Train epochs | 50 | Learning rate | 1 × 10⁻⁵ |
| Warmup | 0.01 | mBERT learning rate | 5 × 10⁻⁵ |
| Batch size | 4 | Classifier learning rate | 5 × 10⁻⁵ |
| Dropout | 0.5 | CRF learning rate | 5 × 10⁻⁵ |
| Optimizer | AdamW | Embedding size | 768 |
| Weight decay | 0.01 | Attention heads | 8 |
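
The hyperparameters above, together with the 3829 training sentences reported for the dataset, determine the length of a training run. The following Python sketch works through that arithmetic. It assumes that "Warmup 0.01" is a fraction of the total optimizer steps, that the last partial batch is kept, and that no gradient accumulation is used; none of these is stated in the tables. The function names are illustrative, and holding the rate constant after warmup is likewise an assumption, since the post-warmup schedule is not given.

```python
import math

# Values taken from the tables: 3829 training sentences, batch size 4,
# 50 epochs, warmup 0.01, base learning rate 1e-5.
TRAIN_SENTENCES = 3829
BATCH_SIZE = 4
EPOCHS = 50
WARMUP = 0.01   # ASSUMPTION: a fraction of total steps, not a step count
BASE_LR = 1e-5


def schedule_sizes(n_examples, batch_size, epochs, warmup_ratio):
    """Return (steps_per_epoch, total_steps, warmup_steps).

    ASSUMPTION: the final partial batch is kept and there is one
    optimizer step per batch.
    """
    steps_per_epoch = math.ceil(n_examples / batch_size)
    total_steps = steps_per_epoch * epochs
    warmup_steps = round(total_steps * warmup_ratio)
    return steps_per_epoch, total_steps, warmup_steps


def warmup_lr(step, warmup_steps, base_lr):
    """Linear warmup up to base_lr, then constant.

    ASSUMPTION: the behaviour after warmup is not specified in the
    source; a constant rate is used here only for illustration.
    """
    if warmup_steps > 0 and step < warmup_steps:
        return base_lr * (step + 1) / warmup_steps
    return base_lr


steps_per_epoch, total_steps, warmup_steps = schedule_sizes(
    TRAIN_SENTENCES, BATCH_SIZE, EPOCHS, WARMUP
)
print("steps per epoch:", steps_per_epoch)
print("total steps:", total_steps)
print("warmup steps:", warmup_steps)
print("learning rate at step 0:", warmup_lr(0, warmup_steps, BASE_LR))
```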

| Dataset | Model | P (%) | R (%) | F1 (%) |
|---|---|---|---|---|
| CTFCDataSet | BERT-CRF | 85.85 | 88.44 | 87.13 |
| CTFCDataSet | BERT-Softmax | 85.98 | 88.64 | 87.29 |
| CTFCDataSet | BERT-BiLSTM-CRF | 88.56 | 90.28 | 89.41 |
| CTFCDataSet | BiLSTM-CRF | 82.91 | 77.00 | 79.84 |
| CTFCDataSet | Ours (CLFF-NER) | 89.45 | 90.01 | 89.73 |
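
The F1 column in the table above can be cross-checked against the P and R columns with the usual harmonic-mean definition, F1 = 2PR / (P + R). The sketch below assumes that this is the definition used (the metric formulas themselves are not reproduced here); the dictionary and function names are illustrative. Because P and R are already rounded to two decimals, a recomputed value may differ from the reported one in the last digit.

```python
def f1(precision, recall):
    """Harmonic mean of precision and recall (both in percent)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (P, R, reported F1) copied from the CTFCDataSet results table.
REPORTED = {
    "BERT-CRF": (85.85, 88.44, 87.13),
    "BERT-Softmax": (85.98, 88.64, 87.29),
    "BERT-BiLSTM-CRF": (88.56, 90.28, 89.41),
    "BiLSTM-CRF": (82.91, 77.00, 79.84),
    "CLFF-NER": (89.45, 90.01, 89.73),
}

for name, (p, r, reported) in REPORTED.items():
    recomputed = f1(p, r)
    # Differences of about 0.01 are expected from rounding of P and R.
    print(f"{name:16s} reported={reported:.2f} "
          f"recomputed={recomputed:.4f} diff={recomputed - reported:+.4f}")
```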

| Model | P (%) | R (%) | F1 (%) |
|---|---|---|---|
| ISEA-NER [32] | 71.21 | 70.50 | 70.85 |
| IFCG-NER [33] | 70.23 | 71.17 | 70.70 |
| KGNER [34] | - | - | 71.90 |
| MT-Learning [35] | 74.90 | 72.70 | 73.80 |
| Ours (CLFF-NER) | 72.68 | 72.84 | 72.76 |

| Model | P (%) | R (%) | F1 (%) |
|---|---|---|---|
| ISEA-NER [32] | 96.18 | 96.48 | 96.33 |
| KGNER [34] | - | - | 96.40 |
| HREB-CRF [36] | 95.53 | 96.61 | 96.07 |
| GPAT-NER [37] | 96.38 | 96.63 | 96.48 |
| Ours (CLFF-NER) | 95.97 | 96.93 | 96.44 |

| Label | P (%) | R (%) | F1 (%) |
|---|---|---|---|
| PER | 96.34 | 91.86 | 94.05 |
| DATE | 87.60 | 93.99 | 90.68 |
| ORG | 81.13 | 86.00 | 83.50 |
| LOC | 86.81 | 88.65 | 87.72 |
| FESTIVAL | 86.47 | 88.02 | 87.24 |
| NATION | 93.75 | 97.22 | 95.45 |

| Model | P (%) | R (%) | F1 (%) |
|---|---|---|---|
| w/o GNN module | 87.42 | 88.43 | 87.92 |
| w/o Transformer module | 87.70 | 90.01 | 88.84 |
| w/o MKN module | 86.73 | 89.38 | 88.04 |
| w/o English module | 87.19 | 88.01 | 87.60 |
| Ours (CLFF-NER) | 89.45 | 90.01 | 89.73 |
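
Each row of the ablation table above removes one component from the full model, so the contribution of a component can be read as the F1 lost when it is removed. The sketch below does that subtraction. The F1 values are copied from the table; the variable names are illustrative, and reading a single-run F1 difference as a component's contribution is a simplifying assumption (no variance across runs is reported here).

```python
FULL_MODEL_F1 = 89.73

# F1 (%) after removing the named component, from the ablation table.
ABLATED_F1 = {
    "GNN module": 87.92,
    "Transformer module": 88.84,
    "MKN module": 88.04,
    "English module": 87.60,
}

# F1 drop in percentage points, rounded to the table's two decimals.
f1_drop = {
    name: round(FULL_MODEL_F1 - score, 2)
    for name, score in ABLATED_F1.items()
}

# List components from the largest to the smallest drop.
for name, drop in sorted(f1_drop.items(), key=lambda kv: kv[1], reverse=True):
    print(f"without {name:20s} F1 drops by {drop:.2f} points")
```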

| Character | Gold Label | Predicted Label |
|---|---|---|
| 节 | O | O |
| 日 | O | O |
| 期 | O | O |
| 间 | O | O |
| 云 | O | B-LOC |
| 南 | O | I-LOC |
| 民 | O | I-LOC |
| 族 | O | I-LOC |
| 村 | O | I-LOC |
| 组 | O | O |
| 织 | O | O |
| 了 | O | O |
| 多 | O | O |
| 种 | O | O |
| 体 | O | O |
| 验 | O | O |
| 活 | O | O |
| 动 | O | O |
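
The character-level tags in the table above become easier to interpret once they are grouped into entity spans. The sketch below decodes BIO tags into spans and compares gold and predicted spans for this sentence. It assumes the standard BIO convention (a span starts at `B-X` and continues through consecutive `I-X` tags) and exact-match comparison at the span level; the evaluation script used for the reported results is not shown, so `bio_to_spans` is an illustrative helper rather than the authors' code.

```python
def bio_to_spans(chars, tags):
    """Group BIO tags into (start, end_exclusive, type, text) spans.

    An I-X tag that does not continue an open span of type X is ignored
    (strict decoding).
    """
    spans = []
    start, etype = None, None
    # A trailing "O" sentinel closes any span that runs to the end.
    for i, tag in enumerate(list(tags) + ["O"]):
        continues = tag.startswith("I-") and etype == tag[2:]
        if start is not None and not continues:
            spans.append((start, i, etype, "".join(chars[start:i])))
            start, etype = None, None
        if tag.startswith("B-"):
            start, etype = i, tag[2:]
    return spans


# The sentence and labels from the error-analysis table.
chars = list("节日期间云南民族村组织了多种体验活动")
gold = ["O"] * 18
pred = ["O"] * 4 + ["B-LOC"] + ["I-LOC"] * 4 + ["O"] * 9

gold_spans = set(bio_to_spans(chars, gold))
pred_spans = set(bio_to_spans(chars, pred))

print("gold spans:     ", sorted(gold_spans))
print("predicted spans:", sorted(pred_spans))
print("false positives:", sorted(pred_spans - gold_spans))
print("false negatives:", sorted(gold_spans - pred_spans))
```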

| Influencing Factor | Description |
|---|---|
| Place-name vocabulary triggers | "Yunnan" (云南) is usually labeled LOC in the training data, which pulls the entire phrase into a LOC span. |
| Character-level sequential pattern | Chinese NER predicts one tag per character, so consecutive place-name characters can easily extend the predicted boundaries. |
| Contextual guidance | Sentence structures that describe organizational activities may lead the model to infer LOC. |
| Differences in annotation rules | "Yunnan Nationalities Village" (云南民族村) is not labeled LOC in the gold annotation, but the model cannot distinguish between standard place names and generic locations. |

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Yang, S.; He, K.; Li, W.; He, Y. CLFF-NER: A Cross-Lingual Feature Fusion Model for Named Entity Recognition in the Traditional Chinese Festival Culture Domain. Informatics 2025, 12, 136. https://doi.org/10.3390/informatics12040136

