Knowledge Graph Construction and Representation Method for Potato Diseases and Pests
Abstract
1. Introduction
2. Materials and Methods
2.1. Process of Construction and Representation of the Knowledge Graph in This Paper
2.2. Design of Entity and Relationship Joint Extraction Model
2.3. Knowledge Representation Algorithms
3. Discussion
3.1. Experimental Environment
3.2. Construction of Potato Disease and Pest Data Corpus
3.3. Comparative Experiment and Analysis of Sequence Annotation Method
3.4. Experiment and Analysis of Knowledge Extraction Model Performance
3.4.1. Analysis of the Relationship between Sample Size and Model Performance
3.4.2. Comparison of Time Efficiency between ALBert and Bert Models
3.4.3. Model Performance Comparison Experiment
3.4.4. Knowledge Storage
4. Results
4.1. Experimental Dataset
4.2. Evaluation Metrics
4.3. Experimental Environment and Parameter Indicators
4.4. Experiment and Results Analysis
5. Conclusions
- (1) Dataset construction: web crawling and digitization techniques were used to obtain a large amount of semi-structured and unstructured data on potato pests and diseases from specialist agricultural websites and books. The collected data were cleaned and de-duplicated to form a potato pest and disease corpus, PotatoRE;
- (2) Based on the PotatoRE corpus, the entity types and relationships of potato pests and diseases were defined under the guidance of crop protection experts. Three annotation methods, BIO, BMES and BIOES, were compared on the PotatoRE corpus. The best-performing method, BMES, was selected to annotate the corpus and form a high-quality dataset consisting of six entity types, eight relationships and four attribute types, totalling 8971 entity samples, for subsequent experiments;
- (3) An ALBert-BiLSTM-Self_Att-CRF model was developed to extract entities and relationships from the corpus and construct a domain knowledge graph of potato diseases and pests. To verify the efficiency and accuracy of the proposed model, it was compared experimentally with eight other SOTA models on the dataset constructed in this paper. The results show that the proposed model performed well in terms of accuracy, recall, and F1 values. Compared with the ALBert-BiLSTM-CRF and ALBert-BiGRU-CRF models, its accuracy improved by 2.92% and 3.12%, respectively. Compared with the Bert-BiLSTM-CRF and Bert-BiGRU-CRF models, the proposed model not only improved the accuracy, recall, and F1 values, but also reduced training time and improved the efficiency of entity and relationship extraction. In addition, the robustness of the proposed model was further verified by comparing it with other mainstream models on the People’s Daily dataset;
- (4) The performance of the knowledge representation models TransE, TransH, TransR, and TransD was compared experimentally on the constructed knowledge graph. The best-performing model, TransH, was then used to represent the knowledge graph in this paper, which lays the foundation for knowledge inference and fusion and enhances the application value of the graph.
Author Contributions
Funding
Data Availability Statement
Conflicts of Interest
| Entity Type | Quantity | Relationship Type | Quantity | Attribute Type | Quantity |
|---|---|---|---|---|---|
| Disease | 579 | Pest damage | 540 | Onset cycle | 579 |
| Prevention and control method | 1153 | Pest control | 500 | Prevention and control | 981 |
| Symptom | 1472 | Disease control | 1012 | Onset condition | 876 |
| Etiology | 2452 | Disease damage | 987 | Route of transmission | 1028 |
| Onset location | 2966 | Another name | 1042 | | |
| Distribution area | 200 | Distribution area | 463 | | |
| | | Disease department | 333 | | |
| | | Pest department | 324 | | |
| Total number of entities | 8971 | Total number of relationships | 5238 | Total number of attributes | 3464 |
Annotation Method | P (%) | R (%) | F1 (%) |
---|---|---|---|
BIO | 78.21 | 74.23 | 75.62 |
BMES | 78.17 | 77.64 | 77.91 |
BIOES | 71.31 | 71.48 | 71.24 |
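The three annotation schemes compared above differ only in how they mark span boundaries. As an illustration, the sketch below converts a BIO tag sequence to BIOES and then to BMES. It assumes the variant of BMES common in entity recognition, in which an `O` tag is kept for non-entity tokens; the function names and the `DIS` label are hypothetical and are not taken from the PotatoRE corpus.

```python
from typing import List


def bio_to_bioes(tags: List[str]) -> List[str]:
    """Convert BIO tags to BIOES: one-token spans become S-, span ends become E-."""
    out = []
    for i, tag in enumerate(tags):
        if tag == "O":
            out.append(tag)
            continue
        prefix, label = tag.split("-", 1)
        nxt = tags[i + 1] if i + 1 < len(tags) else "O"
        continues = nxt == f"I-{label}"  # does the same span continue on the next token?
        if prefix == "B":
            out.append(f"B-{label}" if continues else f"S-{label}")
        else:  # prefix == "I"
            out.append(f"I-{label}" if continues else f"E-{label}")
    return out


def bioes_to_bmes(tags: List[str]) -> List[str]:
    """Rename I- (inside) to M- (middle); B-, E-, S- and O are kept unchanged."""
    return ["M-" + t[2:] if t.startswith("I-") else t for t in tags]


# Hypothetical example: a three-token disease mention, a gap, a one-token mention.
bio = ["B-DIS", "I-DIS", "I-DIS", "O", "B-DIS"]
bioes = bio_to_bioes(bio)
bmes = bioes_to_bmes(bioes)
print(bioes)  # ['B-DIS', 'I-DIS', 'E-DIS', 'O', 'S-DIS']
print(bmes)   # ['B-DIS', 'M-DIS', 'E-DIS', 'O', 'S-DIS']
```

Under this reading, BMES and BIOES carry the same boundary information and differ only in the name of the inside tag, so the gap between their scores in the table presumably reflects other details of the annotation setup that are not recoverable from the table alone.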
Model | Training Time |
---|---|
ALBert-BiLSTM-CRF | 8 h |
Bert-BiLSTM-CRF | 32 h |
ALBert-BiGRU-CRF | 10 h |
Bert-BiGRU-CRF | 36 h |
Model | Accuracy | Recall | F1 |
---|---|---|---|
BiLSTM-CRF | 0.7496 | 0.6697 | 0.7231 |
BiGRU-CRF | 0.6654 | 0.6523 | 0.6542 |
Word2vec-BiLSTM-CRF | 0.7417 | 0.7264 | 0.7791 |
Bert-CRF | 0.7763 | 0.8169 | 0.7623 |
Bert-BiLSTM-CRF | 0.7807 | 0.8042 | 0.7925 |
Bert-BiGRU-CRF | 0.7821 | 0.7947 | 0.7887 |
ALBert-BiLSTM-CRF | 0.8027 | 0.8081 | 0.7911 |
ALBert-BiGRU-CRF | 0.8012 | 0.8137 | 0.8052 |
GlobalPointer | 0.8164 | 0.8126 | 0.8081 |
ALBert-BiLSTM-Self_Att-CRF | 0.8262 | 0.7879 | 0.8166 |
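The Conclusions report accuracy gains of 2.92% and 3.12% over the ALBert-BiLSTM-CRF and ALBert-BiGRU-CRF baselines. These figures appear to be relative gains rather than percentage-point differences. The short check below recomputes both quantities from the Accuracy column above, under that assumption; the helper name `relative_gain` is hypothetical.

```python
def relative_gain(new: float, base: float) -> float:
    """Relative improvement of `new` over `base`, in percent."""
    return (new - base) / base * 100.0


proposed = 0.8262  # ALBert-BiLSTM-Self_Att-CRF, Accuracy column
baselines = {"ALBert-BiLSTM-CRF": 0.8027, "ALBert-BiGRU-CRF": 0.8012}

gains = {name: relative_gain(proposed, acc) for name, acc in baselines.items()}
for name, gain in gains.items():
    points = (proposed - baselines[name]) * 100.0
    print(f"{name}: {gain:.2f}% relative, {points:.2f} points absolute")
```

The relative gains come out at roughly 2.9% and 3.1%, in line with the reported values, whereas the absolute differences are about 2.4 and 2.5 percentage points.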
Model | People’s Daily P (%) | People’s Daily R (%) | People’s Daily F1 (%) | Chinese Medical P (%) | Chinese Medical R (%) | Chinese Medical F1 (%) |
---|---|---|---|---|---|---|
Word2vec-BiLSTM-CRF | 82.06 | 84.31 | 83.63 | 76.26 | 70.22 | 72.94 |
GlobalPointer | 89.19 | 87.32 | 88.61 | 79.31 | 78.02 | 80.66 |
ALBert-BiLSTM-Self_Att-CRF | 94.84 | 90.19 | 91.03 | 82.16 | 84.58 | 83.35 |
Model | Hit@10 | Hit@3 | MRR |
---|---|---|---|
TransE | 0.503 | 0.492 | 0.416 |
TransH | 0.524 | 0.516 | 0.437 |
TransR | 0.470 | 0.457 | 0.389 |
TransD | 0.461 | 0.448 | 0.366 |
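Hit@k and MRR in the table above are rank-based link-prediction metrics. The following sketch shows how they are usually computed from the rank that a model assigns to the correct entity for each test triple. The rank list is a hypothetical example, not data from the paper, and the function names are illustrative.

```python
from typing import Iterable


def hits_at_k(ranks: Iterable[int], k: int) -> float:
    """Fraction of test triples whose correct entity is ranked in the top k."""
    ranks = list(ranks)
    return sum(r <= k for r in ranks) / len(ranks)


def mean_reciprocal_rank(ranks: Iterable[int]) -> float:
    """Average of 1/rank over all test triples (ranks start at 1)."""
    ranks = list(ranks)
    return sum(1.0 / r for r in ranks) / len(ranks)


# Hypothetical ranks of the correct entity for five test triples.
ranks = [1, 2, 4, 10, 25]

hits3 = hits_at_k(ranks, 3)        # 2 of 5 ranks are <= 3
hits10 = hits_at_k(ranks, 10)      # 4 of 5 ranks are <= 10
mrr = mean_reciprocal_rank(ranks)  # (1 + 0.5 + 0.25 + 0.1 + 0.04) / 5
print(hits3, hits10, round(mrr, 3))
```

Hit@10 can never be lower than Hit@3 for the same model, and all three metrics reward placing the correct entity near the top of the ranking, so higher values are better throughout the table.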
© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Yang, W.; Yang, S.; Wang, G.; Liu, Y.; Lu, J.; Yuan, W. Knowledge Graph Construction and Representation Method for Potato Diseases and Pests. Agronomy 2024, 14, 90. https://doi.org/10.3390/agronomy14010090