Preliminary Study on the Knowledge Graph Construction of Chinese Ancient History and Culture

Abstract: With the continuous improvement of living standards, rapid economic growth, and rapid advances in information science and technology, the domestic public has paid increasing attention to ancient Chinese history and culture. Information technology has been shown to promote the spread and development of historical culture, and it is becoming a necessary means of promoting traditional Chinese culture. This paper builds a knowledge graph of ancient Chinese history and culture so that the public can understand this knowledge more quickly and accurately. The construction process is as follows. First, crawler technology is used to obtain text and table data related to ancient history and culture from Baidu Encyclopedia (similar to Wikipedia) and from web pages related to ancient Chinese history and culture. The crawler extracts the semi-structured data in the Baidu Encyclopedia information box (InfoBox) to directly construct the triples required for the knowledge graph, and crawls the introductory text of Baidu Encyclopedia entries and specialized historical and cultural websites (History Chunqiu, On History) to obtain unstructured text from which entities and relationships are extracted. Second, entity recognition and relationship extraction are performed on the unstructured text. Entity recognition uses the Bidirectional Long Short-Term Memory-Convolutional Neural Network-Conditional Random Field (BiLSTM-CNN-CRF) model, and the relationships between entities are extracted with the open-source tool DeepKE (an information extraction toolkit with language recognition ability developed by Zhejiang University).
After the entities and their relationships are obtained, they are supplemented with the triple data constructed from the existing knowledge base and from the semi-structured data in the Baidu Encyclopedia information box. Subsequently, ontology construction and quality evaluation of the whole constructed graph are performed to form the final knowledge graph of ancient Chinese history and culture.


Introduction
As of June 2019, the number of Internet users in China had reached 854 million, an increase of 25.98 million over the end of 2018, according to the 44th Statistical Report on the Development of the Internet in China issued by the China Internet Network Information Center (CNNIC) [1]. By the same date, the total number of websites in China had reached 5.18 million. Among them, the number of history- and culture-related websites has increased dramatically; in particular, websites related to ancient Chinese history and culture have grown at a remarkable rate.


This paper proposes a model for constructing a knowledge graph of ancient Chinese history and culture. It describes in detail how the knowledge graph is constructed and the main steps in the construction process, in particular entity recognition. Finally, a visual display of the constructed knowledge graph is provided so that the public can better understand ancient history and culture.
The rest of the paper is organized as follows. Section 2 introduces related work on knowledge graph construction and named entity recognition. Section 3 introduces the overall scheme for constructing our ancient Chinese history and culture knowledge graph. Section 4 introduces knowledge extraction technology, mainly entity extraction. Section 5 gives the experimental results and analysis. Section 6 presents the visualization system for the ancient Chinese history and culture knowledge graph. Section 7 summarizes the paper and discusses future work.

Knowledge Graph Construction
Since the concept of the knowledge graph was proposed in 2012, many researchers have worked to build large-scale, high-quality, general-purpose knowledge bases such as the aforementioned YAGO, DBpedia, Wikidata, CN-DBpedia, zhishi.me, and Ownthink. However, general-domain knowledge graphs often fail to cover professional-domain knowledge adequately. To make knowledge graphs better suited to specialized fields, the construction of vertical-domain knowledge graphs has become a research hotspot.
In the field of ancient history and culture, researchers have developed various historical and cultural ontologies and knowledge bases to achieve knowledge management and knowledge sharing. Laura Pandolfo, in STOLE: A Reference Ontology for Historical Research Documents, proposed a reference ontology for building a historical ontology library for Italian public administration [14]. Martin Doerr constructed an ontology-based knowledge base in the field of cultural heritage in Ontologies for Cultural Heritage [15]. Zhou et al. constructed a historical figure knowledge graph in a big data environment and visualized the obtained data through visualization technology [54].
In other areas, the Gene Ontology Consortium constructed the Gene Ontology [16], which can be used to describe genes and gene products in any organism. Qiu Minghu et al. introduced the construction of a recipe ontology in detail, mainly including a daily recipe and food composition database [17]. Ruan et al. established an open knowledge base of traditional Chinese medicine symptoms that can be used in clinical decision support systems, including concepts of diseases and drugs and the relationships between symptoms and these entities [18]. The Computer Knowledge Engineering Lab of Tsinghua University constructed a bilingual knowledge graph of film and television, which mainly integrates LinkedIMDB, Baidu Encyclopedia, Douban (a website that offers recommendations, reviews, and price comparisons for books, movies, and music records), and other data sources [53]. Vrije Universiteit Amsterdam constructed a breast cancer knowledge graph, which mainly integrates breast cancer related knowledge [19]. The China Academy of Chinese Medical Sciences constructed a medical knowledge graph, which mainly includes a knowledge graph of TCM medical records and a knowledge graph of TCM characteristic diagnosis technology [12].
These constructed knowledge graphs have made great contributions to applied research in this direction. However, due to the complexity and variety of data in ancient Chinese history, it is different from data in other fields with relatively uniform formats. Data integration in the field of history and culture is relatively difficult. Accordingly, our research goal is to establish a large-scale, high-quality knowledge graph of ancient Chinese history and culture, laying the foundation for research in the field of history and culture.
Presently, although there are many knowledge graphs in vertical fields, most of them are constructed by ontology fusion and database integration. The cultural heritage knowledge graph constructed by Martin Doerr is based on a relational database and a hierarchical database system. The data sources of the STOLE knowledge graph are mainly historical texts and literature, including periodicals and newspapers of the time. Zhou Yi et al. constructed a knowledge graph of historical figures using data from Google, Baidu, Sogou (a Chinese search engine), and other websites; however, the domain it covers is relatively narrow, containing only historical figures.
First of all, the data sources of this paper differ from those of the cultural heritage knowledge graph, the historical knowledge graph (STOLE) for Italian public administration, and the historical figure knowledge graph. The knowledge graphs mentioned above are all based on ontology fusion or database integration.
The knowledge graph constructed in this paper instead extracts and integrates knowledge from various Internet data sources (including structured, semi-structured, and unstructured data) [58]. Secondly, the content of the existing history- and culture-related knowledge graphs is relatively simple. In contrast, our model can automatically extract knowledge from unstructured text, which makes it possible to integrate knowledge from more open and diverse data sources [57]. In addition, from the perspective of knowledge coverage, we have introduced historical and cultural knowledge, including entities such as dynasties, figures, and wars and the relationships between them, into the ancient Chinese history and culture knowledge graph to make it more comprehensive. Finally, most current knowledge graphs in the field of history and culture are in English, which may not suit Chinese reading habits. We have therefore established a Chinese-language knowledge graph of ancient Chinese history and culture to help the public better understand it.

Named Entity Recognition
With the rapid development of Internet technology, people began to pay attention to entity extraction in the vertical domain while performing entity extraction on general domain data.
However, text data in the vertical domain has its own characteristics, and its own characteristics need to be considered when performing entity extraction [20].
Work on named entity recognition is mainly divided into rule-based methods, statistical machine learning methods, and neural network methods. Common statistical machine learning models include the Hidden Markov Model (HMM) [21], the Maximum Entropy Model [22], the Support Vector Machine (SVM) [23], and the Conditional Random Field (CRF) [24]. However, these methods require manual feature engineering, need large numbers of manually labeled samples for model training, and their performance is limited [25].
Neural network methods usually treat named entity recognition as a sequence labeling task and recognize entities by building a sequence labeling model over the text. In 2011, Collobert et al. used Convolutional Neural Networks (CNNs) for feature extraction and achieved good recognition results by fusing additional features [26]. In 2015, Huang et al. proposed the Bidirectional Long Short-Term Memory-Conditional Random Field (BiLSTM-CRF) model to improve performance [27]. Santos et al. proposed using a character-level CNN to enhance the CNN-CRF model [28]. In 2016, Lample et al. used two BiLSTMs to learn word-level and character-level features, respectively [29]. In 2017, Strubell et al. proposed iterated dilated convolutional networks (IDCNN-CRF) for named entity recognition to extract sequence information and accelerate training [30]. In 2018, Feng et al. proposed a named entity recognition method based on the BiLSTM network structure [31]. Maimatiayifu et al. proposed a BiLSTM-CNN-CRF model tailored to the characteristics of the Uyghur language [32]. Li Lishuang et al. applied the CNN-BiLSTM-CRF model to a biomedical corpus and obtained the highest F1 value at that time [33].
In this paper, the BiLSTM-CNN-CRF deep neural network model is used for entity recognition in ancient Chinese history and culture texts. In this process, the Continuous Bag-of-Words model (CBOW) is used to train word vectors, a convolutional neural network extracts character representation vectors from sentences, the character vectors and word vectors are concatenated, and the concatenated result is fed into a Bidirectional Long Short-Term Memory (BiLSTM) network. Finally, the CRF selects the best annotation sequence according to the characteristics of the text to obtain the final recognized entities.

Knowledge Graph Construction Process
Many researchers have divided the process of constructing a knowledge graph into several parts. Yang Siluo et al. divided the construction process into eight parts: sample data collection, sample data cleaning, knowledge unit selection, unit relationship construction, data standardization, sample data simplification, knowledge visualization, and results interpretation [34]. Katy Börner et al. divided it into six steps: extract data, define analysis units, select methods, calculate similarity, build knowledge units, and analyze results [35]. Although these construction processes differ slightly, they all mention the most important parts of knowledge graph construction: data acquisition, information extraction, knowledge fusion, and graph construction. Figure 1 shows a flowchart of the construction of our ancient Chinese history and culture knowledge graph.

Data Acquisition
According to the form of data storage, data sources can be divided into three categories: structured data, semi-structured data, and unstructured data. All three types are used in constructing the knowledge graph of ancient Chinese history and culture in this paper: structured data (such as linked data and databases), semi-structured data (such as tables and lists in web pages), and unstructured data (plain text). The structured data come from general-domain encyclopedia knowledge graphs on the Internet; the general-domain knowledge graphs used in this paper are Ownthink and CN-DBpedia. The semi-structured data mainly come from Baidu Encyclopedia [36] and HDWiki [37]; taking zhishi.me as an example, Baidu Encyclopedia and the Hudong (Interactive) Encyclopedia have been used as data sources from which large amounts of knowledge are extracted to build a knowledge graph [55]. Semi-structured data are obtained through wrappers, whose generation methods fall into three categories: the manual method, the wrapper induction method, and the automatic extraction method. In this paper, we mainly used the manual method to analyze the pages and construct the wrapper's information extraction rules, and thereby obtained the semi-structured data in the web pages. For the knowledge graph of ancient Chinese history and culture, entities and related knowledge are extracted from the semi-structured data (such as the InfoBox) in Baidu Encyclopedia, as shown in Figure 2. The unstructured data come from web pages related to ancient Chinese history on the Internet and from the introduction sections of Baidu Encyclopedia entries; we used web crawler technology to obtain a large amount of historical web page text and then processed the acquired text [38].
The semi-structured and unstructured data used in the construction of the ancient Chinese history and culture knowledge graph mainly come from the Internet, and they are acquired primarily with web crawler technology [39]. This paper implements a crawler based on the Python Scrapy framework [56] to obtain the network data. Taking the Baidu Encyclopedia entry for Nan Liang as an example, data acquisition is divided into three parts. The first is the Introduction part, which is unstructured text data; after extraction is completed, named entity recognition and relationship extraction are performed on it. The second is the Image part, which mainly obtains picture information; the obtained pictures are used in the knowledge attribute query module of the knowledge graph system. The third is the InfoBox part, which is semi-structured data and is mainly used to construct triples.
The process for obtaining semi-structured data is as follows. Starting from a given initial web page (such as "中国历史朝代 (means: Chinese historical dynasties), link: https://baike.baidu.com/item/中国历史朝代/4056123"), a method similar to breadth-first traversal is used to crawl the clickable pages in each web page, and the obtained pages are saved as HTML files. Subsequently, an XPath selector extracts the contents of the InfoBox from the saved pages: the contents of the "basicInfo-item name" and "basicInfo-item value" fields in the InfoBox are saved to a txt file. The unstructured text of the Introduction section is saved to a txt file in the same way.
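As a simplified sketch of this extraction step, the following uses Python's standard-library ElementTree on an XML-clean stand-in for a saved page. Real Baidu Baike pages are messier HTML and would need a tolerant parser such as lxml; the markup snippet and sample values here are illustrative, with only the "basicInfo-item name" / "basicInfo-item value" class names taken from the text.

```python
import xml.etree.ElementTree as ET

# A simplified stand-in for a saved InfoBox fragment (illustrative values).
SAMPLE_PAGE = """
<div class="basic-info">
  <dt class="basicInfo-item name">中文名</dt>
  <dd class="basicInfo-item value">南梁</dd>
  <dt class="basicInfo-item name">都城</dt>
  <dd class="basicInfo-item value">建康</dd>
</div>
"""

def extract_infobox(page_xml):
    """Pair each 'name' cell with its 'value' cell and join with '$',
    mirroring the name$value lines described in the text."""
    root = ET.fromstring(page_xml)
    names = [n.text.strip()
             for n in root.findall(".//dt[@class='basicInfo-item name']")]
    values = [v.text.strip()
              for v in root.findall(".//dd[@class='basicInfo-item value']")]
    return [f"{n}${v}" for n, v in zip(names, values)]

print(extract_infobox(SAMPLE_PAGE))  # ['中文名$南梁', '都城$建康']
```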
For unstructured text data, we view the web page format, write the corresponding crawling rules, scrape the desired data directly, and then save it to txt text for subsequent entity recognition and relationship extraction.

Knowledge Extraction
Knowledge extraction is the first step in the construction of knowledge graphs, which extracts structured information, such as entities, relationships, and entity attributes from semi-structured and unstructured data automatically [40]. The core technology involves entity extraction, relationship extraction, and attribute extraction.
1. Named entity recognition: also known as entity extraction, this refers to the automatic identification of named entities from text content, and it is the most critical part of information extraction. The quality of entity extraction directly affects the subsequent relationship extraction and knowledge fusion. A deep learning algorithm is used for named entity recognition in this paper. Several methods are compared, with the conditional random field as the benchmark: the BiLSTM, BiLSTM-CRF, and BiLSTM-CNN-CRF methods are compared on a custom dataset. Finally, the BiLSTM-CNN-CRF model is used to extract the entities in the text. The named entity recognition experiments are described in detail in the fourth part of the paper.

2. Relationship extraction: after the entities are obtained, the relationships between them must be extracted from the relevant corpus; the initially isolated entities are then connected through these relationships to form a knowledge network. This paper uses the Chinese relation extraction tool DeepKE, developed by Zhejiang University, to extract the relationships between entities.
3. Attribute extraction: attribute extraction aggregates information about the same entity from various data sources to complete the profile of the entity's attributes. The attribute extraction in this paper mainly draws on the semi-structured data in the Baidu Encyclopedia information box. This part of the knowledge extraction is implemented in Python, using an XPath selector to extract the contents of the InfoBox from the saved web pages: the contents of "basicInfo-item name" and "basicInfo-item value" are saved to a txt file, with the $ symbol separating name from value. The information extracted from each web page is saved to its own txt file; the files extracted from multiple pages are then combined, and the result is finally stored as triples. Figure 3 shows the results of this processing. The left side of the figure shows the information extracted from the InfoBox, mainly the "basicInfo-item name" and "basicInfo-item value" contents separated by the $ symbol. The right side shows the triple data produced from the extracted InfoBox information, in the form (entity1-relationship-entity2), with the $ symbol dividing entities and relationships.
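The merge from "name$value" lines into triple strings can be sketched as follows; this is a minimal illustration, and the helper name and sample values are ours rather than the paper's code.

```python
def lines_to_triples(entity, infobox_lines):
    """Convert the '$'-separated "name$value" lines saved for one page
    into "entity1$relation$entity2" triple strings."""
    triples = []
    for line in infobox_lines:
        name, sep, value = line.partition("$")
        if sep and name and value:          # skip malformed lines
            triples.append(f"{entity}${name}${value}")
    return triples

print(lines_to_triples("南梁", ["都城$建康", "中文名$南梁"]))
# ['南梁$都城$建康', '南梁$中文名$南梁']
```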

Knowledge Fusion
After knowledge extraction is completed, entities, relationships, and entity attributes have been extracted from the unstructured and semi-structured data. However, these results may contain errors, and erroneous data must be eliminated to ensure the quality of the knowledge [41]. Knowledge fusion includes two parts: entity alignment and attribute value filling. Entity alignment determines whether multiple extracted entities point to the same entity in the real world [42]. Our approach obtains knowledge from the third-party knowledge base Ownthink as input and uses it to supplement entity attribute values.

Graph Construction
In this section, the processed structured data are stored in a graph database; the database holding the triple data is the Neo4j graph database (https://neo4j.com/). The stored data include entities, entity relationships, entity attributes, and entity attribute values [41]. Cypher statements are used to store the triples in Neo4j; the knowledge graph is created with statements of the form "MATCH (e:History), (cc:Attribute) WHERE e.Name = '%s' AND cc.Name = '%s' CREATE (e)-[r:%s {relation: '%s'}]->(cc) RETURN r".
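A sketch of this storage step using the official Neo4j Python driver is shown below. The connection details are placeholders, and, as a deliberate variation on the paper's %s template, node names and the relation property are passed as Cypher parameters here; only the relationship type is interpolated, since Cypher does not allow relationship types to be parameterized.

```python
def build_cypher(rel_type):
    """Reproduce the paper's Cypher template for storing one triple.
    $head, $tail, and $rel are query parameters; the relationship
    type is escaped with backticks and interpolated directly."""
    return (
        "MATCH (e:History), (cc:Attribute) "
        "WHERE e.Name = $head AND cc.Name = $tail "
        f"CREATE (e)-[r:`{rel_type}` {{relation: $rel}}]->(cc) RETURN r"
    )

def store_triple(driver, head, rel, tail):
    """Run the CREATE statement for one (head, rel, tail) triple."""
    with driver.session() as session:
        session.run(build_cypher(rel), head=head, rel=rel, tail=tail)

# Example usage (requires the `neo4j` driver and a running instance):
# from neo4j import GraphDatabase
# driver = GraphDatabase.driver("bolt://localhost:7687",
#                               auth=("neo4j", "password"))
# store_triple(driver, "南梁", "都城", "建康")
```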
The following Figure 4 shows the storage transformation process. Two principles are followed in constructing the knowledge graph: 1. the type of each node is given by its identifier; if the identifier is "dynasty", the node created has type "dynasty"; 2. in the entity-relation table, triples are stored in the format "Entity1-Relation-Entity2" or "Entity1-Attribute-Attribute Value".

Overall Framework of the Model
Figure 5 shows the BiLSTM-CNN-CRF model framework, which consists of three parts: a CNN module, a BiLSTM module, and a CRF module. The CNN module involves three steps: training the word vectors of the dataset, extracting the character vectors of each sentence with the CNN, and applying convolution and max pooling to obtain the character-level features of each word.
For the BiLSTM module, the character vectors and word vectors are concatenated, and the concatenated vectors are fed into the BiLSTM neural network model for entity recognition.
For the CRF module, the output of the BiLSTM model is decoded to obtain an optimal tag sequence.

Feature Representation
Popular text representations often use the bag-of-words model because it is simple to construct and reduces the complexity of vector computation [43]. However, the model has many shortcomings: when the sample data are large, the feature dimension of the text becomes very high, which can lead to a dimensionality explosion, and the word vector matrix is very sparse, making overfitting likely. To solve this problem, Mikolov et al. [44] introduced the word embedding model, an effective method for learning high-quality word vector representations from large amounts of unstructured text that captures important syntactic and semantic information.
Currently, Word2vec [45] is the most widely used word vector training tool in the field of natural language processing. It mainly includes two training methods: the Continuous Bag-of-Words model (CBOW) and Skip-gram. CBOW uses the context of a word as input to predict the current word, while Skip-gram uses the current word to predict the context around it. In this paper, the CBOW method is used to train word vectors on the corpus acquired from the network, as shown in Figure 6. The optimization objective of the CBOW model during training is to maximize the average log probability of each word given its context:

L = (1/T) Σ_{t=1}^{T} log p(w_t | w_{t-c}, ..., w_{t-1}, w_{t+1}, ..., w_{t+c})

where T is the length of the training corpus and c is the size of the context window.
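To make the CBOW objective concrete, the following toy sketch performs gradient steps for a single (context, center-word) example with NumPy: the context embeddings are averaged, a softmax predicts the center word, and both embedding matrices are updated. The vocabulary, dimensions, and learning rate are illustrative only; real training (e.g., with Word2vec) also uses tricks such as negative sampling.

```python
import numpy as np

rng = np.random.default_rng(0)
V, D = 6, 8                         # toy vocabulary size, embedding dimension
W_in = rng.normal(0, 0.1, (V, D))   # input (context) embeddings
W_out = rng.normal(0, 0.1, (V, D))  # output (center-word) embeddings

def cbow_step(context_ids, center_id, lr=0.5):
    """One CBOW training step; returns the cross-entropy loss."""
    global W_in, W_out
    h = W_in[context_ids].mean(axis=0)      # averaged context vector
    scores = W_out @ h                      # one score per vocabulary word
    probs = np.exp(scores - scores.max())
    probs /= probs.sum()                    # softmax
    loss = -np.log(probs[center_id])
    d_scores = probs.copy()                 # softmax cross-entropy gradient
    d_scores[center_id] -= 1.0
    d_h = W_out.T @ d_scores
    W_out -= lr * np.outer(d_scores, h)
    W_in[context_ids] -= lr * d_h / len(context_ids)
    return loss

context, center = [0, 1, 3, 4], 2           # predict word 2 from its window
losses = [cbow_step(context, center) for _ in range(50)]
print(losses[0] > losses[-1])               # loss decreases as the model fits
```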

CNN Model
The convolutional layer in a convolutional neural network can extract local feature information from text data, and the most representative parts of these local features can be further extracted as a feature vector through max pooling. Chiu and Nichols [46] used a CNN to extract character-level features and achieved good results in the general domain. Therefore, this paper uses a CNN to extract the character-level features of words in ancient Chinese historical texts and improves the model's performance by combining word vectors with character-level features. Figure 7 shows the CNN structure, which mainly includes a convolution layer, a pooling layer, and the character vectors.
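The character-level convolution and max-pooling step can be sketched with NumPy as follows. The filter bank here is random and the sizes are illustrative; in the real model the filters are learned during training, but the sketch shows why every word yields a fixed-size character feature regardless of its length.

```python
import numpy as np

rng = np.random.default_rng(1)
char_dim, n_filters, k = 4, 3, 2            # illustrative sizes; k = window width
filters = rng.normal(size=(n_filters, k * char_dim))

def char_cnn_features(char_vecs):
    """char_vecs: (n_chars, char_dim) matrix of character vectors.
    Slide a width-k window over the characters, apply the filter bank
    with ReLU (convolution), then max-pool over positions."""
    n = char_vecs.shape[0]
    windows = np.stack([char_vecs[i:i + k].ravel()
                        for i in range(n - k + 1)])
    conv = np.maximum(windows @ filters.T, 0)   # convolution + ReLU
    return conv.max(axis=0)                     # max pooling over positions

word = rng.normal(size=(5, char_dim))           # a 5-character word
print(char_cnn_features(word).shape)            # (3,) — one value per filter
```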

BiLSTM Model
The Recurrent Neural Network (RNN) is an important class of neural network structure. In theory, an RNN can dynamically capture sequential information. In practice, however, it is prone to problems such as vanishing and exploding gradients. Long Short-Term Memory [47] solves this kind of problem well. For named entity recognition, the positions of the entities to be identified vary, and so does the importance of context information in the text. This paper therefore uses the Bidirectional Long Short-Term Memory structure for model training in order to make better use of contextual information.

LSTM Unit
LSTMs are variants of RNNs designed to cope with the gradient vanishing problem. Basically, an LSTM unit is composed of three multiplicative gates that control the proportions of information to forget and to pass on to the next time step: the input gate, the forget gate, and the output gate. Figure 8 gives the basic structure of an LSTM unit. Formally, the formulas to update an LSTM unit at time t are:

i_t = σ(W_i x_t + U_i h_{t-1} + b_i)
f_t = σ(W_f x_t + U_f h_{t-1} + b_f)
c̃_t = tanh(W_c x_t + U_c h_{t-1} + b_c)
c_t = f_t ⊙ c_{t-1} + i_t ⊙ c̃_t
o_t = σ(W_o x_t + U_o h_{t-1} + b_o)
h_t = o_t ⊙ tanh(c_t)

where σ is the element-wise sigmoid function and ⊙ is the element-wise product. x_t is the input vector (e.g., a word embedding) at time t, and h_t is the hidden state (also called output) vector storing all of the useful information at (and before) time t. W_i, W_f, W_c, W_o denote the weight matrices of the different gates for the input x_t; U_i, U_f, U_c, U_o are the weight matrices for the hidden state h_{t-1}; and b_i, b_f, b_c, b_o denote the bias vectors.

BiLSTM Unit
After applying the LSTM, a hidden-state sequence [h_1, h_2, ..., h_n] with the same length as the sentence is obtained. The BiLSTM network used in this paper computes hidden states in both the forward and backward directions at each time t, producing the forward sequence [h→_1, h→_2, ..., h→_n] and the backward sequence [h←_1, h←_2, ..., h←_n]. The final hidden-state sequence is generated by concatenating the two, that is, h_t = [h→_t : h←_t]. Figure 9 illustrates the basic structure of a BiLSTM unit.

CRF Model
Conditional random field is a probabilistic undirected graph model [48], and it is also a common algorithm in a sequence tagging task, which can be used for entity class tagging. In this paper, CRF layer is regarded as the last layer of neural network structure, and the output of BiLSTM module is processed in order to obtain the optimal global label sequence.
For a given text, let X = (x_1, x_2, ..., x_n) be the input sentence and y = (y_1, y_2, ..., y_n) the output tag sequence. The score of a tag sequence is:

s(X, y) = Σ_{i=0}^{n} A_{y_i, y_{i+1}} + Σ_{i=1}^{n} P_{i, y_i}

where A is the transition score matrix, A_{i,j} is the score of transitioning from label i to label j, and P_{i, y_i} is the score the network assigns to tag y_i at position i. To generate the probability distribution over output sequences y, all possible sequence paths are normalized, as shown in formula (9):

p(y | X) = exp(s(X, y)) / Σ_{ỹ ∈ Y_X} exp(s(X, ỹ))

During training, the logarithmic probability of the correct tag sequence is maximized, as shown in formula (10):

log p(y | X) = s(X, y) − log Σ_{ỹ ∈ Y_X} exp(s(X, ỹ))

where Y_X is the set of all possible tag sequences for the input sentence X.
In the final decoding, the sequence with the highest predicted total score is selected as the optimal sequence, as shown in formula (11):

y* = argmax_{ỹ ∈ Y_X} s(X, ỹ)
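Formula (11) is typically computed with the Viterbi algorithm. The following NumPy sketch decodes a tiny example with assumed emission and transition scores (start/stop transitions are omitted for simplicity); the score matrices are invented for illustration.

```python
import numpy as np

def viterbi(P, A):
    """P: (n_words, n_tags) emission scores (e.g., from the BiLSTM);
    A: (n_tags, n_tags) transition scores, A[i, j] = score of tag i -> j.
    Returns the highest-scoring tag sequence."""
    n, T = P.shape
    score = P[0].copy()                  # best score ending in each tag
    back = np.zeros((n, T), dtype=int)   # backpointers
    for t in range(1, n):
        total = score[:, None] + A + P[t][None, :]   # (prev_tag, tag)
        back[t] = total.argmax(axis=0)
        score = total.max(axis=0)
    tags = [int(score.argmax())]
    for t in range(n - 1, 0, -1):        # follow backpointers
        tags.append(int(back[t][tags[-1]]))
    return tags[::-1]

# Tiny example: 3 words, 2 tags; the -5 transition discourages tag 1 -> 1,
# so the greedy per-position choice [0, 1, 1] is overruled.
P = np.array([[2.0, 0.0], [0.0, 1.0], [0.0, 1.5]])
A = np.array([[0.0, 0.0], [0.0, -5.0]])
print(viterbi(P, A))  # [0, 0, 1]
```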

Data Preparation
Because the datasets published on the Internet lack relevant data about ancient Chinese history and culture, the data used in this paper were obtained from the Internet through a crawler. The acquired corpus was then segmented and stopwords were removed, after which the corpus was annotated with entities such as person names, place names, times, dynasties, wars, and systems.
Here, we used Baidu Encyclopedia to generate the dataset. Baidu Encyclopedia is similar to Wikipedia: it describes each entity with text, tables, and pictures as an introduction, and, similar to the triples provided by public knowledge graphs (such as CN-DBpedia and Ownthink), represents entity-relationship-entity information [49]. Therefore, by crawling Baidu Encyclopedia we can obtain both the triple information and the unstructured text data of an entity at the same time.
The specific process is as follows. First, we crawled hundreds of thousands of pages from Baidu Encyclopedia; these pages contain the relevant InfoBox data and text data about each entity. After filtering, each crawled page can be regarded as an introduction to one entity. Structured information about the entity is provided in the InfoBox, for example its Chinese name, nickname, national leader, and time period. After processing, we obtain triple information about the entity and form triples. Subsequently, a deep learning algorithm is applied to the unstructured text in each page to extract entities and relationships and, finally, to construct triples.

Data Annotations
At present, the main annotation schemes for supervised learning include BIO, BIEO, and BMESO. The BMESO tagging scheme is used for the self-built dataset in this paper so that the named entities to be recognized can be clearly represented in the corpus. According to the research of Roth [50], Dai [51], and Lample [29], BMESO performs better than BIO because it clearly delimits entity boundaries.
For each entity, the first character is marked as "B-(entity name)", middle characters as "M-(entity name)", the final character as "E-(entity name)", a single-character entity as "S-(entity name)", and non-entities as "O". Table 1 shows the BMESO labeling strategy. Table 2 shows an example of entity annotation for a Chinese ancient historical text using the BMESO strategy: the example is "李渊于晋阳起兵" (means: Li Yuan starts his army in Jinyang), where Li Yuan is a person's name and Jinyang is a place name.
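The tagging rule above can be captured in a small helper; the entity-type abbreviations PER and LOC used here are our own illustrative labels, not the paper's tag set.

```python
def bmeso_tags(tokens, entity_type):
    """BMESO tags for one entity span: S- for a single character,
    otherwise B- first, M- in the middle, E- last."""
    n = len(tokens)
    if n == 1:
        return [f"S-{entity_type}"]
    return ([f"B-{entity_type}"]
            + [f"M-{entity_type}"] * (n - 2)
            + [f"E-{entity_type}"])

# Annotate "李渊于晋阳起兵" character by character, as in Table 2:
# 李渊 (person), 于 (non-entity), 晋阳 (place), 起兵 (non-entity).
tags = (bmeso_tags("李渊", "PER") + ["O"]
        + bmeso_tags("晋阳", "LOC") + ["O", "O"])
print(list(zip("李渊于晋阳起兵", tags)))
```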


Evaluation Metrics
The standard evaluation metrics of precision (P), recall (R), and F1-score (F1) are used to evaluate our experiments.
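For entity-level evaluation, a predicted entity usually counts as correct only if both its span and its type match the gold annotation. A minimal sketch with illustrative data (the entity tuples below are examples we made up, not results from the paper):

```python
def prf1(true_entities, pred_entities):
    """Entity-level precision, recall, and F1 over (span, type) tuples."""
    true_set, pred_set = set(true_entities), set(pred_entities)
    correct = len(true_set & pred_set)
    p = correct / len(pred_set) if pred_set else 0.0
    r = correct / len(true_set) if true_set else 0.0
    f1 = 2 * p * r / (p + r) if p + r else 0.0
    return p, r, f1

gold = [("李渊", "PER"), ("晋阳", "LOC"), ("唐", "DYN")]
pred = [("李渊", "PER"), ("晋阳", "PER")]   # one span right but wrong type
p, r, f1 = prf1(gold, pred)
print(round(p, 2), round(r, 2), round(f1, 2))  # 0.5 0.33 0.4
```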

Experimental Environment
The environment for all experiments is shown in Table 3:

Parameter Setting
The model proposed in this paper is built with the TensorFlow framework, a deep learning framework developed by the Google team and currently the most widely used among all such frameworks. The experimental parameters are set as shown in Table 4:

Experimental Results and Analysis
To verify the recognition effectiveness of the method in this paper, CRF, BiLSTM, BiLSTM-CRF, and BiLSTM-CNN-CRF were tested on the custom dataset.
Experiment 1: Conditional random field (baseline). The CRF model is used as the benchmark to explore its performance on the self-built history dataset. The experiment uses the CRF++ toolkit [52], an open-source conditional random field tool; version 0.58 of CRF++ is used, and entity recognition is realized by the built model.
During the experiment of CRF model, it is found that the model cannot correctly identify the names of people, places, and titles that never appear in the corpus. After analysis, the possible reasons are determined as follows: 1) the ancient surname is different from the modern surname, which might lead to a poor recognition effect; 2) most of the ancient place names have been changed, leading to mismatches; and, 3) at present, the data of people's names and place names marked in the language database is relatively small, leading to the model unable to correctly identify some entities.
The precision of the CRF model is 75.45%, the recall is 72.38%, and the F1 value is 73.89%.
Experiment 2: LSTM vs. BiLSTM. Experiment 2 mainly verifies the effectiveness of the bidirectional LSTM network structure; to this end, experiments on the LSTM model and the BiLSTM model were performed. The LSTM model captures long-distance dependencies well, but a problem remains when modeling sentences with an LSTM: it cannot encode information from back to front. The bidirectional LSTM combines a forward LSTM with a backward LSTM, so that both front-to-back and back-to-front information can be encoded during modeling, yielding better models. From the test results presented in Table 5, entity recognition with the BiLSTM network model is better than with the LSTM structure, mainly because the BiLSTM can make fuller use of context information. The histogram comparing the LSTM and BiLSTM results is shown in Figure 10(a) below; it can be clearly seen that BiLSTM outperforms LSTM.

Experiment 3: BiLSTM vs. BiLSTM-CRF. Experiment 3 mainly verifies the performance of BiLSTM on the custom dataset after adding a CRF layer for decoding. The CRF layer can add constraints on the final predicted labels to ensure their validity; these constraints are learned automatically by the CRF layer from the training data. The CRF layer decodes the output of the BiLSTM model into the optimal label sequence with the highest probability, so that the final prediction is better than that of the BiLSTM model without CRF. It can be seen from Table 6 that BiLSTM recognition with the CRF layer outperforms decoding without it, and the entity recognition performance is improved.

Experiment 4: BiLSTM-CRF vs. BiLSTM-CNN-CRF. From Table 7, it can be seen that the entity recognition effect with the CNN module added is better than that without it. From the experimental results, it can be concluded that the CNN module is helpful for entity recognition. Figure 11 shows the comparison histogram of the two methods. The above four experiments verified the effectiveness of the CRF module, the CNN module, and the BiLSTM network model.
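The CRF layer's constrained decoding in Experiment 3 can be sketched with Viterbi search over per-token emission scores. Here the labels and transition scores are hand-set for illustration (a real CRF learns them from data); the point is that a forbidden transition such as O → I-PER steers the decoder away from a greedy but invalid tag sequence:

```python
# Hedged sketch of CRF-style Viterbi decoding over BiLSTM emission scores.
# Labels and transition scores are illustrative, not the paper's trained model.
LABELS = ["O", "B-PER", "I-PER"]
NEG_INF = -1e9
# transitions[(p, l)]: score of moving from label p to label l (default 0.0).
transitions = {
    ("O", "I-PER"): NEG_INF,   # constraint: I-PER may not follow O
    ("B-PER", "I-PER"): 1.0,
    ("I-PER", "I-PER"): 1.0,
}

def viterbi(emissions):
    """emissions: list of {label: score} per token; returns the best label path."""
    score = {l: emissions[0][l] for l in LABELS}
    backptrs = []
    for t in range(1, len(emissions)):
        new_score, ptr = {}, {}
        for lab in LABELS:
            prev = max(LABELS, key=lambda p: score[p] + transitions.get((p, lab), 0.0))
            new_score[lab] = (score[prev] + transitions.get((prev, lab), 0.0)
                              + emissions[t][lab])
            ptr[lab] = prev
        score = new_score
        backptrs.append(ptr)
    last = max(LABELS, key=lambda l: score[l])
    path = [last]
    for ptr in reversed(backptrs):  # follow back-pointers to recover the path
        path.append(ptr[path[-1]])
    return path[::-1]

emissions = [{"O": 2.0, "B-PER": 1.0, "I-PER": 0.0},
             {"O": 0.5, "B-PER": 0.0, "I-PER": 2.0}]
print([max(e, key=e.get) for e in emissions])  # greedy: ['O', 'I-PER'] (invalid)
print(viterbi(emissions))                      # CRF:    ['B-PER', 'I-PER']
```

Greedy per-token argmax would emit the invalid sequence O, I-PER; the transition penalty makes the decoder prefer the consistent B-PER, I-PER instead, which is exactly why adding the CRF layer improves on raw BiLSTM output.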
BiLSTM-CRF is the leading traditional named entity recognition framework. In this model, the bidirectional long short-term memory network solves the problem of dependencies between long-distance named entities in text, while the conditional random field decodes the generated sequence, further improving the recognition ability of the framework. Its overall effect is better than that of LSTM and BiLSTM.
BiLSTM-CNN-CRF is based on BiLSTM-CRF with a CNN layer added. The CNN layer extracts character-level features from the text data; these features are concatenated with the trained Word2vec vectors and input to the bidirectional long short-term memory network. After training, the output is passed to the CRF layer, which selects the optimal label sequence. Figure 12 shows the comparison histogram of the test results of the different methods. Based on the analysis of the above experimental results, the BiLSTM-CNN-CRF model used in this paper achieves good results on the named entity recognition task in the field of Chinese ancient history and culture.
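The feature concatenation step can be sketched as follows, with tiny 2-dimensional character vectors and a sliding-window sum plus max pooling standing in for the real convolution (dimensions and vectors here are toy assumptions):

```python
# Hedged sketch: character-level features (CNN-style sliding window + max
# pooling) concatenated with a word's Word2vec vector before the BiLSTM.
# Character vectors are toy 2-dimensional values for illustration only.
def char_features(word, char_vecs, window=2):
    """Max-over-time pooling of sliding-window (convolution-like) sums."""
    vecs = [char_vecs.get(c, [0.0, 0.0]) for c in word]
    windows = [vecs[i:i + window] for i in range(len(vecs) - window + 1)] or [vecs]
    return [max(sum(v[d] for v in w) for w in windows) for d in range(2)]

def token_representation(word, word_vec, char_vecs):
    """Concatenate the word vector with the pooled character features."""
    return word_vec + char_features(word, char_vecs)

char_vecs = {"李": [1.0, 0.0], "白": [0.0, 1.0], "生": [0.5, 0.5]}
print(token_representation("李白生", [0.2, 0.8], char_vecs))
```

In the real model both parts are learned dense vectors, but the shape of the pipeline is the same: per-character convolution, pooling to a fixed-size vector, then concatenation with the word embedding.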

Visual Display of Knowledge Graph
The above experiments yield the triple data of the Chinese ancient historical and cultural knowledge graph. In this paper, the acquired entities and relationships are stored in the Neo4j graph database, with about 15,000 entities and 30 types of relationships in total. Using Echarts (similar to D3.js) in combination with the Flask framework, a knowledge graph system based on ancient Chinese history and culture was developed, and the acquired knowledge is visually displayed in various forms, such as text, pictures, and force-directed diagrams. The system mainly includes the following two modules: 1. Query module of the knowledge graph of ancient Chinese history and culture. A dynasty is used as an entity to expand and display historical figures of the same dynasty, the start and end time of the dynasty, and the names of its emperors. The knowledge graph supports zooming in, zooming out, and moving. When an entity is clicked, the entities related to it are highlighted, while unrelated entities are displayed in grayscale, as shown in Figure 13.
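Loading the triples into Neo4j amounts to issuing one MERGE statement per (head, relation, tail) triple. A minimal sketch of building such a parameterised Cypher statement follows; the node label `Entity` and property `name` are illustrative assumptions, not the paper's actual schema:

```python
# Hedged sketch: turning a (head, relation, tail) triple into a parameterised
# Cypher MERGE statement for Neo4j. Label "Entity" and property "name" are
# illustrative; a real schema would distinguish persons, places, dynasties, etc.
def triple_to_cypher(head, relation, tail):
    query = (
        "MERGE (h:Entity {name: $head}) "
        "MERGE (t:Entity {name: $tail}) "
        f"MERGE (h)-[:`{relation}`]->(t)"   # backticks allow Chinese relation names
    )
    return query, {"head": head, "tail": tail}

query, params = triple_to_cypher("李世民", "父亲", "李渊")
print(query)
print(params)
```

Using MERGE rather than CREATE keeps repeated imports idempotent: an entity or relationship is created only if it does not already exist.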

Figure 13. Chinese ancient history and culture knowledge graph query module. The blue circle is entity 1 and the red circle is entity 2; the text on the arrow indicates the relationship between them.
2. Entity attribute knowledge query module. This module is divided into two parts: the first part displays the relevant attribute knowledge of the searched entity using a force-directed diagram; the second part shows the pictures of the entity and some introductory information from the encyclopedia, as shown in Figure 14.
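On the web side, an Echarts force-layout graph series expects a nodes/links structure; a sketch of converting triples into that structure (to be returned as JSON by a Flask endpoint in the actual system) might look like this, with the example triples being hypothetical:

```python
# Hedged sketch: converting triples into the nodes/links dict an Echarts
# "graph" (force layout) series consumes. Example triples are hypothetical.
def triples_to_echarts(triples):
    names = []
    for h, _, t in triples:          # collect entities in first-seen order
        for n in (h, t):
            if n not in names:
                names.append(n)
    idx = {n: i for i, n in enumerate(names)}
    return {
        "nodes": [{"name": n} for n in names],
        "links": [{"source": idx[h], "target": idx[t], "value": r}
                  for h, r, t in triples],
    }

graph = triples_to_echarts([("李渊", "儿子", "李世民"), ("李世民", "朝代", "唐朝")])
print(graph)
```

A Flask route would serialise this dict with `jsonify` and the page would assign it to the graph series' `data` and `links` options, giving the force-directed view described above.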

Conclusions
This paper put forward a construction process for a knowledge graph of Chinese ancient history and culture. Firstly, data acquisition is introduced, covering structured, semi-structured, and unstructured data. Then, the extraction of knowledge from unstructured text is described, and a BiLSTM-CNN-CRF neural network model is proposed for entity extraction. According to the experimental results, BiLSTM-CNN-CRF extracts entities from unstructured text better. Finally, the knowledge graph triples are constructed from the entities and relationships and stored in the Neo4j database, and the graph is visualized using Echarts and web programming.
In future work, we will try to use the BERT model to extract entities and relationships from unstructured text and further improve the effect of named entity recognition. At the same time, because the relationship extraction currently relies on the open-source tool DeepKE, future work will focus on relationship extraction between entities in order to improve its accuracy.
The construction of the ancient Chinese historical knowledge graph has only just begun. In the future, we will work hard to build a large-scale and high-quality ancient Chinese historical knowledge graph.