Research on a Joint Extraction Method of Track Circuit Entities and Relations Integrating Global Pointer and Tensor Learning
Abstract
1. Introduction
- (1) Unlike existing models that use only the top-layer output of the pre-trained encoder, this work applies a multi-layer dilate gated convolutional neural network (MDGCNN) to the outputs of all 12 RoBERTa-wwm encoder layers. The 12 levels of semantic information are adaptively weighted and fused, enriching the feature representation and improving the recognition accuracy of complex entities and relations (a minimal sketch of this fusion is given after this list).
- (2) Existing methods often neglect the deep correlations among multiple relations, especially when entities overlap. This study therefore applies Tucker decomposition to learn and reconstruct a core tensor, subject and object factor matrices, and a relation weight matrix, yielding a high-dimensional relation tensor. This improves the extraction of overlapping relations and strengthens the modeling of semantic correlations between relation types (see the second sketch after this list).
- (3) The model adopts the Efficient Global Pointer and introduces rotary position embedding (RoPE) together with a multiplicative attention mechanism to locate entity start and end positions precisely. Weight sharing reduces the number of parameters while maintaining high recognition performance and lowering time complexity.
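The layer-weighted fusion in contribution (1) can be made concrete with a short sketch. The PyTorch code below is a minimal, illustrative implementation: it softmax-normalizes one learnable weight per RoBERTa-wwm hidden layer, fuses the 12 hidden states, and refines the result with a stack of dilate gated convolutions. The convolution width of 128 and the dilation rates (1, 2, 5, 1) follow the parameter table in Section 4.1; the class and variable names and all other details are assumptions for illustration, not the authors' exact implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class DilatedGatedConv1d(nn.Module):
    """One dilate gated convolution block: a gated residual update
    h = g * conv(h) + (1 - g) * h, as used in DGCNN-style encoders."""
    def __init__(self, dim, kernel_size=3, dilation=1):
        super().__init__()
        pad = (kernel_size - 1) // 2 * dilation
        # Two output channels per feature: one for the value, one for the gate.
        self.conv = nn.Conv1d(dim, dim * 2, kernel_size, padding=pad, dilation=dilation)

    def forward(self, x):                       # x: (batch, seq_len, dim)
        h = self.conv(x.transpose(1, 2)).transpose(1, 2)
        value, gate = h.chunk(2, dim=-1)
        gate = torch.sigmoid(gate)
        return gate * value + (1.0 - gate) * x  # gated residual connection

class MultiLevelFusionEncoder(nn.Module):
    """Adaptively weights the 12 RoBERTa-wwm hidden layers and refines the
    fused representation with a stack of dilate gated convolutions."""
    def __init__(self, hidden_dim=768, conv_dim=128, dilation_rates=(1, 2, 5, 1)):
        super().__init__()
        self.layer_weights = nn.Parameter(torch.zeros(12))   # one weight per encoder layer
        self.proj = nn.Linear(hidden_dim, conv_dim)
        self.blocks = nn.ModuleList(
            [DilatedGatedConv1d(conv_dim, kernel_size=3, dilation=d) for d in dilation_rates]
        )

    def forward(self, hidden_states):
        # hidden_states: list of 12 tensors, each (batch, seq_len, hidden_dim), e.g. from
        # RobertaModel(..., output_hidden_states=True).hidden_states[1:] (embeddings dropped).
        stacked = torch.stack(list(hidden_states), dim=0)      # (12, B, L, H)
        weights = F.softmax(self.layer_weights, dim=0)         # adaptive layer weights
        fused = (weights.view(-1, 1, 1, 1) * stacked).sum(0)   # (B, L, H)
        h = self.proj(fused)
        for block in self.blocks:
            h = block(h)
        return h                                               # (B, L, conv_dim)
```

In practice the 12 hidden states would come from a Chinese RoBERTa-wwm checkpoint with hidden-state output enabled, though the paper's exact feature extractor may differ in detail.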
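Contribution (2) admits a similarly small sketch. Assuming the subject and object factor matrices come from two MLPs over the word features (MLPs/MLPo in the configuration table) and that the core tensor and relation weight matrix are learnable parameters, the word-pair relation tensor can be reconstructed with a single tensor contraction; the 64-dimensional core and all names here are illustrative choices, not the paper's exact formulation.

```python
import torch
import torch.nn as nn

class RelationTensorLearning(nn.Module):
    """Reconstructs an N x R x N word-pair relation tensor from a learnable Tucker
    core, a relation weight matrix, and subject/object factor matrices."""
    def __init__(self, feat_dim, num_relations, core_dim=64):
        super().__init__()
        # Learnable core tensor G and relation weight matrix W (one row per relation type).
        self.core = nn.Parameter(torch.randn(core_dim, core_dim, core_dim) * 0.02)
        self.rel_weight = nn.Parameter(torch.randn(num_relations, core_dim) * 0.02)
        # MLPs producing the subject and object factor matrices from word features.
        self.mlp_s = nn.Linear(feat_dim, core_dim)
        self.mlp_o = nn.Linear(feat_dim, core_dim)

    def forward(self, h):                     # h: (batch, N, feat_dim) word features
        hs = self.mlp_s(h)                    # subject factor matrix (B, N, core_dim)
        ho = self.mlp_o(h)                    # object factor matrix  (B, N, core_dim)
        # Mode products: core x_1 hs x_2 W_r x_3 ho  ->  (B, N, R, N)
        scores = torch.einsum('ijk,bni,rj,bmk->bnrm',
                              self.core, hs, self.rel_weight, ho)
        return torch.sigmoid(scores)          # predicted binary word relation tensor
```

During training, the sigmoid scores would typically be matched against the gold binary word relation tensor with a binary cross-entropy or focal-style loss, though the exact objective used in the paper is not reproduced here.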
2. Materials and Methods
2.1. Knowledge Extraction
2.2. Knowledge Extraction in the Railway Domain
3. Method
3.1. Multi-Level Semantic Fusion Encoder
3.1.1. Hidden Layer Feature Extractor
3.1.2. Multi-Level Semantic Fusion Process
3.2. Relation Tensor Learning Module
3.3. Efficient Global Pointer Module
3.3.1. Global Pointer
3.3.2. Parameter Reduction
3.4. Training Strategies
3.5. Inference Strategy
4. Experimentation and Analysis
4.1. Experimental Environment and Parameter Configuration
4.2. Experimental Evaluation Metrics
4.3. Presentation of Experimental Data
4.4. Comparison Experiment of Joint Entity–Relation Extraction Using Public Datasets
- (1) CopyRe [15]: This model adopts a seq2seq architecture and incorporates a copy mechanism to address the extraction of long-tail relationships;
- (2) MultiRe [45]: This model combines a seq2seq framework with reinforcement learning to extract relational triplets;
- (3) CopyMTL [16]: This model further improves CopyRe; its decoder uses an attention-fused LSTM, and a fully connected layer produces the output;
- (4) GraphRel [46]: This model uses a relation-weighted graph convolutional network (GCN) to model the interactions between named entities and their relations;
- (5) CasRel [17]: This sequence labeling model extracts relational triplets through a cascading framework: it first predicts the subject and then, conditioned on the predicted subject, predicts the related relations and objects;
- (6) TPLinker [19]: This model introduces a handshake tagging paradigm that divides token-pair links into three types, effectively addressing exposure bias in relation extraction;
- (7) TLRel [34]: This model encodes the input text with a BiLSTM; its entity recognition module uses a CRF to output valid tag sequences, while its relation extraction module employs Tucker decomposition for tensor learning.
4.5. Experimental Analysis Based on Track Circuit Dataset
4.5.1. Experimental Analysis of the Upstream Encoding Module
4.5.2. Experimental Analysis of the Relation Tensor Learning Module
4.5.3. Experimental Analysis of the Efficient Global Pointer Module
4.6. Case Study
5. Conclusions
- (1) The multi-level semantic fusion encoder integrates a whole-word masking strategy with multi-level semantic fusion. Compared with the baseline models, it better captures complex semantic structures and contextual information and suppresses noise in the text, improving the overall performance of the model.
- (2) The tensor learning module employs Tucker decomposition to capture the semantic associations between different relation types. Compared with traditional relation extraction methods, its three-dimensional tensor representation better handles the complex relationships among multiple entity pairs within a sentence, while significantly reducing computational complexity and improving training efficiency on large datasets.
- (3) The Efficient Global Pointer module identifies entities through a span-based global normalization mechanism. Unlike traditional sequence labeling models such as the CRF, it requires no recursive computation of the normalizing denominator (partition function), which greatly improves computational efficiency. It also separates entity span extraction from type classification so that parameters can be shared, and the introduction of RoPE further reduces the computational burden while maintaining recognition accuracy (a minimal sketch follows this list).
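To make point (3) concrete, the sketch below follows the Efficient Global Pointer pattern: one shared projection produces query/key vectors for every token, RoPE makes the multiplicative (dot-product) attention between token pairs position-aware, and a lightweight per-type bias replaces the per-type projections of the original Global Pointer, which is where the parameter saving comes from. The inner dimension of 64 matches the configuration table; the function and class names and the exact bias arrangement are assumptions rather than the authors' code.

```python
import torch
import torch.nn as nn

def rotary_position_embedding(x):
    """Apply RoPE to the last dimension of x: (batch, seq_len, dim), dim even."""
    b, n, d = x.shape
    pos = torch.arange(n, dtype=torch.float, device=x.device)[:, None]          # (n, 1)
    freq = torch.pow(10000.0, -torch.arange(0, d, 2, dtype=torch.float,
                                            device=x.device) / d)               # (d/2,)
    angles = pos * freq                                                          # (n, d/2)
    sin = torch.repeat_interleave(torch.sin(angles), 2, dim=-1)[None]            # (1, n, d)
    cos = torch.repeat_interleave(torch.cos(angles), 2, dim=-1)[None]
    # Pairwise rotation: (x0, x1) -> (x0*cos - x1*sin, x1*cos + x0*sin)
    x_rot = torch.stack([-x[..., 1::2], x[..., ::2]], dim=-1).reshape(b, n, d)
    return x * cos + x_rot * sin

class EfficientGlobalPointer(nn.Module):
    """Span scorer: a shared q/k projection scores every (start, end) token pair with
    multiplicative attention, and a light per-type bias replaces per-type projections."""
    def __init__(self, hidden_dim, num_types, inner_dim=64):
        super().__init__()
        self.inner_dim = inner_dim
        self.qk_proj = nn.Linear(hidden_dim, inner_dim * 2)    # shared across entity types
        self.type_bias = nn.Linear(inner_dim * 2, num_types * 2)

    def forward(self, h):                                      # h: (batch, seq_len, hidden_dim)
        qk = self.qk_proj(h)                                   # (B, L, 2*inner_dim)
        q, k = qk[..., ::2], qk[..., 1::2]                     # start / end representations
        q, k = rotary_position_embedding(q), rotary_position_embedding(k)
        # Multiplicative attention over all start-end token pairs.
        logits = torch.einsum('bmd,bnd->bmn', q, k) / self.inner_dim ** 0.5   # (B, L, L)
        bias = self.type_bias(qk).transpose(1, 2) / 2          # (B, 2*num_types, L)
        # Add per-type start-position and end-position biases.
        logits = logits[:, None] + bias[:, ::2, None, :] + bias[:, 1::2, :, None]
        return logits                                          # (B, num_types, L, L)
```

Decoding would then keep every (start, end) pair whose score for a given type exceeds zero; the same pair-scoring pattern extends naturally to scoring subject–object pairs per relation type.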
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
References
- Hou, T.; Zheng, Q.; Yao, X.; Chen, G.; Wang, X. Fine-grained Fault Cause Analysis Method for Track Circuit Based on Text Mining. J. China Railw. Soc. 2022, 4, 73–81. [Google Scholar] [CrossRef]
- Xu, K.; Zheng, H.; Tu, Y.; Wu, S. Fault diagnosis of track circuit based on improved sparrow search algorithm and Q-Learning optimization for ensemble learning. J. Railw. Sci. Eng. 2023, 20, 4426–4437. [Google Scholar] [CrossRef]
- Guo, W.; Yu, Z.; Chui, H.-C.; Chen, X. Development of DMPS-EMAT for Long-Distance Monitoring of Broken Rail. Sensors 2023, 23, 5583. [Google Scholar] [CrossRef]
- Alvarenga, T.A.; Cerqueira, A.S.; Filho, L.M.A.; Nobrega, R.A.; Honorio, L.M.; Veloso, H. Identification and Localization of Track Circuit False Occupancy Failures Based on Frequency Domain Reflectometry. Sensors 2020, 20, 7259. [Google Scholar] [CrossRef]
- Lin, H.; Lu, R.; Xu, L. Automatic classification method of railway signal fault based on text mining. J. Yunnan Univ. Nat. Sci. Ed. 2022, 44, 281–289. [Google Scholar] [CrossRef]
- Feng, J.; Wei, D.; Su, D.; Hang, T.; Lu, J. Survey of Document-level Entity Relation Extraction Methods. Comput. Sci. 2022, 49, 224–242. [Google Scholar] [CrossRef]
- Yang, Y.; Wu, Z.; Yang, Y.; Lian, S.; Guo, F.; Wang, Z. A Survey of Information Extraction Based on Deep Learning. Appl. Sci. 2022, 12, 9691. [Google Scholar] [CrossRef]
- Zhang, Y.-S.; Liu, S.-K.; Liu, Y.; Ren, L.; Xin, Y.-H. Joint Extraction of Entities and Relations Based on Deep Learning: A Survey. Acta Electron. Sin. 2023, 51, 1093–1116. [Google Scholar] [CrossRef]
- Sun, W.; Liu, S.; Liu, Y.; Kong, L.; Jian, Z. Information Extraction Network Based on Multi-Granularity Attention and Multi-Scale Self-Learning. Sensors 2023, 23, 4250. [Google Scholar] [CrossRef] [PubMed]
- Zhang, S.; Wang, X.; Chen, Z.; Wang, L.; Xu, D.; Jia, Y. Survey of Supervised Joint Entity Relation Extraction Methods. J. Front. Comput. Sci. Technol. 2022, 16, 713–733. [Google Scholar] [CrossRef]
- E, H.; Zhang, W.; Xiao, S.; Cheng, R.; Hu, Y.; Zhou, X.; Niu, P. Survey of Entity Relationship Extraction Based on Deep Learning. J. Softw. 2019, 30, 1793–1818. [Google Scholar] [CrossRef]
- Li, Q.; Ji, H. Incremental joint extraction of entity mentions and relations. In Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics, Baltimore, MD, USA, 23–24 June 2014; pp. 402–412. [Google Scholar]
- Yu, X.; Lam, W. Jointly identifying entities and extracting relations in encyclopedia text via a graphical model approach. In Proceedings of the 23rd International Conference on Computational Linguistics, Beijing, China, 23–27 August 2010; pp. 1399–1407. [Google Scholar]
- Ren, X.; Wu, Z.; He, W.; Qu, M.; Voss, C.R.; Ji, H.; Abdelzaher, T.R.; Han, J. CoType: Joint Extraction of Typed Entities and Relations with Knowledge Bases. In Proceedings of the 26th International Conference on World Wide Web, Geneva, Switzerland, 3–7 April 2017; pp. 1015–1024. [Google Scholar] [CrossRef]
- Zeng, X.; Zeng, D.; He, S.; Liu, K.; Zhao, J. Extracting Relational Facts by an End-to-End Neural Model with Copy Mechanism. In Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics, Melbourne, Australia, 15–20 July 2018; pp. 506–514. [Google Scholar] [CrossRef]
- Zeng, D.; Zhang, H.; Liu, Q. CopyMTL: Copy Mechanism for Joint Extraction of Entities and Relations with Multi-Task Learning. In Proceedings of the AAAI Conference on Artificial Intelligence, New York, NY, USA, 7–12 February 2020; pp. 9507–9514. [Google Scholar] [CrossRef]
- Wei, Z.; Su, J.; Wang, Y.; Tian, Y.; Chang, Y. A Novel Cascade Binary Tagging Framework for Relational Triple Extraction. In Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, Online, 5–10 July 2020; pp. 1476–1488. [Google Scholar] [CrossRef]
- Zheng, H.; Wen, R.; Chen, X.; Yang, Y.; Zhang, Z.; Zhang, N.; Qin, B.; Xu, M.; Zheng, Y. PRGC: Potential Relation and Global Correspondence Based Joint Relational Triple Extraction. In Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing, Online, 1–6 August 2021; pp. 6225–6235. [Google Scholar] [CrossRef]
- Wang, Y.; Yu, B.; Zhang, Y.; Liu, T.; Zhu, H.; Sun, L. TPLinker: Single-stage Joint Extraction of Entities and Relations Through Token Pair Linking. In Proceedings of the 28th International Conference on Computational Linguistics, Barcelona, Spain (Online), 8–13 December 2020; pp. 1572–1582. [Google Scholar] [CrossRef]
- Sui, D.; Zeng, X.; Chen, Y.; Liu, K.; Zhao, J. Joint Entity and Relation Extraction with Set Prediction Networks. IEEE Trans. Neural Netw. Learn. Syst. 2024, 35, 12784–12795. [Google Scholar] [CrossRef] [PubMed]
- Shang, Y.-M.; Huang, H.; Mao, X. OneRel: Joint Entity and Relation Extraction with One Module in One Step. In Proceedings of the AAAI Conference on Artificial Intelligence, Online, 22 February–1 March 2022; pp. 11285–11293. [Google Scholar] [CrossRef]
- Li, X.; Shi, T.; Li, P.; Dai, M.; Zhang, X. Research on Knowledge Extraction Method for High-speed Railway Signal Equipment Fault Based on Text. J. China Railw. Soc. 2021, 43, 91–100. [Google Scholar] [CrossRef]
- Li, X.; Chen, Y.; Qiu, S.; Lu, R.; Cai, C.; Shi, Y. Establishment and Analysis Method of Risk Knowledge Graph of Railway Engineering Construction in Complex Areas. J. China Railw. Soc. 2024, 1–15. Available online: http://kns.cnki.net/kcms/detail/11.2104.u.20240619.1705.002.html (accessed on 21 June 2024).
- Lin, H.; Bai, W.; Zhao, Z.; Hu, N.; Li, D.; Lu, R. Construction and Application of Knowledge Graph for Troubleshooting of High-speed Railway Turnout Equipment. J. China Railw. Soc. 2024, 46, 73–80. [Google Scholar] [CrossRef]
- Lin, H.; Bai, W.; Zhao, Z.; Hu, N.; Li, D.; Lu, R. Knowledge extraction method for operation and maintenance texts of high-speed railway turnout. J. Rail. Sci. Eng. 2024, 21, 2569–2580. [Google Scholar] [CrossRef]
- Jawahar, G.; Sagot, B.; Seddah, D. What does BERT learn about the structure of language? In Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, Florence, Italy, 28 July–2 August 2019; pp. 3651–3657. [Google Scholar]
- Cui, Y.; Che, W.; Liu, T.; Qin, B.; Yang, Z. Pre-training with whole word masking for Chinese BERT. IEEE/ACM Trans. Audio Speech Lang. Process. 2021, 29, 3504–3514. [Google Scholar] [CrossRef]
- Kan, Z.; Qiao, L.; Yang, S.; Liu, F.; Huang, F. Event arguments extraction via dilate gated convolutional neural network with enhanced local features. IEEE Access 2020, 8, 123483–123491. [Google Scholar] [CrossRef]
- Gastaldi, X. Shake-Shake regularization. arXiv 2017, arXiv:1705.07485. [Google Scholar]
- Yang, J.; Zhao, H. Deepening hidden representations from pre-trained language models. arXiv 2019, arXiv:1911.01940. [Google Scholar]
- Zhang, M.; Zhang, Y.; Fu, G. End-to-End Neural Relation Extraction with Global Optimization. In Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, Copenhagen, Denmark, 7–11 September 2017; pp. 1730–1740. [Google Scholar] [CrossRef]
- Sun, C.; Gong, Y.; Wu, Y.; Gong, M.; Jiang, D.; Lan, M.; Sun, S.; Duan, N. Joint Type Inference on Entities and Relations via Graph Convolutional Networks. In Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, Florence, Italy, 28 July–2 August 2019; pp. 1361–1370. [Google Scholar] [CrossRef]
- Wang, Y.; Sun, C.; Wu, Y.; Zhou, H.; Li, L.; Yan, H. UniRE: A Unified Label Space for Entity Relation Extraction. In Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing, Online, 1–6 August 2021; pp. 220–231. [Google Scholar]
- Wang, Z.; Nie, H.; Zheng, W.; Wang, Y.; Li, X. A novel tensor learning model for joint relational triplet extraction. IEEE Trans. Cybern. 2023, 54, 2483–2494. [Google Scholar] [CrossRef] [PubMed]
- Balazevic, I.; Allen, C.; Hospedales, T.M. TuckER: Tensor Factorization for Knowledge Graph Completion. arXiv 2019, arXiv:1901.09590. [Google Scholar]
- Su, J.; Murtadha, A.; Pan, S.; Hou, J.; Sun, J.; Huang, W.; Wen, B.; Liu, Y. Global Pointer: Novel Efficient Span-based Approach for Named Entity Recognition. arXiv 2022, arXiv:2208.03054. [Google Scholar]
- Zhai, Z.; Fan, R.; Huang, J.; Xiong, N.; Zhang, L.; Wan, J.; Zhang, L. A Named Entity Recognition Method based on Knowledge Distillation and Efficient GlobalPointer for Chinese Medical Texts. IEEE Access 2024, 12, 83563–83574. [Google Scholar] [CrossRef]
- Cao, K.; Chen, S.; Yang, C.; Luo, L.; Ren, Z. Revealing the coupled evolution process of construction risks in mega hydropower engineering through textual semantics. Adv. Eng. Inform. 2024, 16, 102713. [Google Scholar] [CrossRef]
- Liang, J.; He, Q.; Zhang, D.; Fan, S. Extraction of Joint Entity and Relationships with Soft Pruning and GlobalPointer. Appl. Sci. 2022, 12, 6361. [Google Scholar] [CrossRef]
- Su, J.; Murtadha, A.; Lu, Y.; Pan, S.; Wen, B.; Liu, Y. RoFormer: Enhanced transformer with rotary position embedding. Neurocomputing 2024, 568, 127063. [Google Scholar] [CrossRef]
- Lin, T.-Y.; Goyal, P.; Girshick, R.; He, K.; Dollar, P. Focal loss for dense object detection. IEEE Trans. Pattern Anal. Mach. Intell. 2017, 99, 2999–3007. [Google Scholar] [CrossRef]
- Riedel, S.; Yao, L.; McCallum, A. Modeling relations and their mentions without labeled text. In Proceedings of the 2010 European Conference on Machine Learning and Knowledge Discovery in Databases: Part III, Barcelona, Spain, 20–24 September 2010; pp. 148–163. [Google Scholar] [CrossRef]
- Gardent, C.; Shimorina, A.; Narayan, S.; Perez-Beltrachini, L. Creating Training Corpora for NLG Micro-Planners. In Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics, Vancouver, BC, Canada, 30 July–4 August 2017; pp. 179–188. [Google Scholar] [CrossRef]
- Li, S.; He, W.; Shi, Y.; Jiang, W.; Liang, H.; Jiang, Y.; Zhang, Y.; Lyu, Y.; Zhu, Y. DuIE: A large-scale Chinese dataset for information extraction. In Proceedings of the Natural Language Processing and Chinese Computing: 8th CCF International Conference, Dunhuang, China, 9–14 October 2019; pp. 791–800. [Google Scholar] [CrossRef]
- Zeng, X.; He, S.; Zeng, D.; Liu, K.; Liu, S.; Zhao, J. Learning the Extraction Order of Multiple Relational Facts in a Sentence with Reinforcement Learning. In Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP), Hong Kong, China, 3–7 November 2019; pp. 367–377. [Google Scholar] [CrossRef]
- Fu, T.-J.; Li, P.-H.; Ma, W.-Y. GraphRel: Modeling Text as Relational Graphs for Joint Entity and Relation Extraction. In Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, Florence, Italy, 28 July–2 August 2019; pp. 1409–1418. [Google Scholar] [CrossRef]
- Yu, F.; Koltun, V. Multi-scale context aggregation by dilated convolutions. arXiv 2015, arXiv:1511.07122. [Google Scholar]
| Symbol | Definition |
|---|---|
| | Gold binary word relation tensor |
| | Predicted binary word relation tensor |
| | Core relation tensor |
| / | Subject and object entity feature matrices |
| | Relation weight matrix |
| | Word feature dimension |
| | Number of words |
| | Number of relation types |
| | Activation function |
| | Tensor n-mode product |
| | Feature vector of upstream output |
| Conv1D1/Conv1D2 | 1D convolution layers |
| | Bias term |
| | Start positions of each entity class |
| | End positions of each entity class |
| | Score matrix |
| Parameter Name | Parameter Value |
|---|---|
| DilatedGatedConv1d_dim | 128 |
| MDGCNN convolution kernel size | 3 |
| MDGCNN dilation rate | (1, 2, 5, 1) |
| Dropout | 0.1 |
| Upstream model learning rate | 2 × 10⁻⁵ |
| Downstream model learning rate | 1 × 10⁻⁴ |
| Efficient_Global_pointer_inner_dim | 64 |
| MLPs_dim | 256 |
| MLPo_dim | 256 |
| | 0.9 |
| | 0.5 |
| Batch_size | 24 |
| Epoch | 50 |
| Max_seq_len | 256 |
Entity A | Relation | Entity B | Example of Extracted Triplet Result |
---|---|---|---|
Fault Phenomenon | occurs at | Fault Location | <Track circuit red band, occurs at, Fuqiang Station> |
Fault Phenomenon | is caused by | Fault Cause | <Residual green band, is caused by, vehicle wheel slippage> |
Fault Cause | leads to | Fault Phenomenon | <Breaker failure, leads to, track circuit red band> |
Fault Cause | is attributed to | Responsible Party | <Rail Break, attributed to, Workers> |
Fault Cause | adopts | Repair Measures | <Wear plate defect, adopts, wear plate replacement> |
Fault Cause | defines | Fault Nature | <Capacitor failure, defines, poor inspection> |
Repair Measures | results in | Repair Result | <Enabled axle counting device, results in, submitted for use> |
Models | NYT P (%) | NYT R (%) | NYT F1 (%) | WebNLG P (%) | WebNLG R (%) | WebNLG F1 (%) | DuIE P (%) | DuIE R (%) | DuIE F1 (%) |
---|---|---|---|---|---|---|---|---|---|
CopyRe | 61.0 | 56.6 | 58.7 | 37.7 | 36.4 | 37.1 | 39.7 | 38.9 | 39.3 |
MultiRe | 77.9 | 67.2 | 72.1 | 63.3 | 59.9 | 61.6 | 49.1 | 47.7 | 48.4 |
CopyMTL | 75.7 | 68.7 | 72.0 | 58.0 | 54.9 | 56.4 | 44.8 | 43.2 | 44.0 |
GraphRel | 63.9 | 60.0 | 61.9 | 44.7 | 41.1 | 42.9 | 37.8 | 39.6 | 38.7 |
CasRelBERT | 89.7 | 89.5 | 89.6 | 93.4 | 90.1 | 91.8 | 72.3 | 74.3 | 73.3 |
TPLinkerBERT | 91.3 | 92.5 | 91.9 | 91.8 | 92.0 | 91.9 | 73.4 | 75.2 | 74.3 |
TLRel | 88.5 | 85.2 | 86.8 | 91.8 | 92.7 | 92.2 | 76.1 | 78.7 | 77.4 |
Ours | 90.93 | 93.3 | 92.1 | 92.9 | 92.5 | 92.7 | 77.5 | 78.9 | 78.2 |
Parameter Settings | P (%) | R (%) | F1 (%) |
---|---|---|---|
MDGCNN(1,1,2,1) | 90.78 | 91.89 | 91.33 |
MDGCNN(1,2,4,1) | 90.27 | 91.52 | 90.89 |
MDGCNN(1,2,5,1) | 90.31 | 92.53 | 91.41 |
MDGCNN(1,1,2) | 90.80 | 91.74 | 91.27 |
MDGCNN(1,2,4) | 89.71 | 91.63 | 90.66 |
MDGCNN(1,2,5) | 91.08 | 91.68 | 91.38 |
Models | P (%) | R (%) | F1 (%) |
---|---|---|---|
CRF | 90.99 | 88.56 | 89.76 |
Global Pointer (+RoPE) | 90.36 | 91.50 | 90.92 |
Efficient Global Pointer | 90.81 | 91.75 | 91.28 |
Efficient Global Pointer (+RoPE) | 90.31 | 92.53 | 91.41 |