Exploring the Importance of Entities in Semantic Ranking

In recent years, entity-based ranking models have led to notable advances in information retrieval research. Compared with traditional retrieval models, entity-based representation enables a better understanding of queries and documents. However, existing entity-based models neglect the importance of entities within a document. This paper explores the effect of that importance. Specifically, a dataset analysis is conducted which verifies the correlation between the importance of entities in a document and document ranking. Then, two entity-based models, a toy model and the Explicit Semantic Ranking model (ESR), are enhanced by considering the importance of entities. In contrast to the existing models, the enhanced models weight entities according to their importance. Experimental results show that the enhanced toy model and ESR outperform the two baselines by as much as 4.57% and 2.74% on NDCG@20, respectively. Further experiments reveal that the strength of the enhanced models is more evident on long queries and on queries where ESR fails, confirming the effectiveness of taking the importance of entities into account.


Introduction
Information retrieval (IR), which helps users find the information they need, has long been an active research area. In traditional IR, words are fundamental to the representation of queries and documents [1][2][3]. Early research (such as the vector space model [4] and BM25 [5]) typically built bag-of-words representations for texts, and the relevance between query and document was usually based on handcrafted features. More recently, many studies have adopted text embeddings and neural networks, and these neural information retrieval models can extract IR features from training data automatically (such as DSSM [2] and DRMM [3]). However, in IR models that represent texts with words, the information obtained from surface forms is insufficient to rank documents. With the richer semantic information that knowledge graphs provide for document ranking, entity-based representation has become a direction worth exploring. As shown in Figure 1, in entity-based ranking models [6][7][8], queries and documents are represented by entities with the help of entity linking [9,10]. Compared with previous IR models, entity-based ranking models incorporate external knowledge into the document-ranking task, since entities and their semantics in the knowledge graph enable a better understanding of queries and documents.

[Figure 1 example: a document about Bill Gates ("Bill Gates is the principal founder of Microsoft, who grew up in Seattle…") and its linked entities.]

Although much research has been carried out on entity-based models, little has investigated the importance of entities in documents. The importance of an entity in a document depends on how much it contributes to the semantics of that document. For example, in a document about 'Bill Gates', the entities 'Microsoft' and 'Seattle', which are related to 'Bill Gates', should be more important than 'Facebook'. In traditional IR models, words have always been weighted according to their importance, and proper term weighting can greatly improve the accuracy of retrieval systems. Intuitively, entity-based models should also consider the importance of entities. Nevertheless, most existing entity-based models treat all entities in a document equally. Thus, this paper explores the effect of the importance of entities in a document.
This paper investigates two aspects. First, data analysis is carried out on the ESR dataset, an academic dataset used for entity-based models. TextRank and term frequency-inverse document frequency (TF-IDF) are each applied to measure the importance of entities. The results of the analysis reveal that the importance of entities in a document is indeed related to document ranking. Second, based on the analysis, two entity-based models (the toy model and ESR) are optimized by considering the importance of entities. This work assigns high weights to the important entities in documents and low weights to the less important ones. Extensive experiments on the ESR dataset confirm the effectiveness of considering the importance of entities. With TextRank or TF-IDF adopted, both entity-based models perform better. The enhanced toy model achieves a 4.57% increase in NDCG@20 over the toy model. The enhanced ESR achieves a 2.79% increase in NDCG@20 over ESR, and even using only the high-ranking entities achieves a 0.88% increase. In addition, the effectiveness of the enhanced ESR models is more obvious on hard and long queries.
In summary, the main contributions of this paper are as follows:
1. The dataset analysis reveals that the importance of entities in a document should be considered in the IR task.
2. The toy model and ESR are enhanced by considering the importance of entities in documents.
3. Extensive experiments reveal that the enhanced models achieve better performance than the original models on all evaluation metrics, especially on long queries and on queries where ESR fails.
This paper is structured as follows: Section 2 reviews previous work related to entity-based IR and entity ranking. Section 3 introduces entity ranking methods and analyzes the ESR dataset. The approaches for using entity ranking in the toy model and ESR are described in Section 4. Section 5 presents the experimental settings and experimental results. Finally, conclusions and directions for future studies are offered in Section 6.

Traditional Information Retrieval Methods
Early IR models are mostly based on exact matching of terms between queries and documents [11]. The matching score can be computed by the vector space model [4] or by more complex methods such as BM25 [5]. These traditional models mainly use bag-of-words representations for both queries and documents, which limits them to retrieving documents that are syntactically relevant rather than semantically relevant [12].
In recent years, as deep learning and word embedding [13] have been successfully applied in IR, neural ranking models have attracted growing attention. By mapping texts to embeddings, the semantic matches between queries and documents can be computed in the representation space [14]. Current neural IR models can be categorized into two groups: representation-based and interaction-based models [1]. The representation-based models first build representations for the query and documents and then conduct matching in the embedding space. DSSM [2] and CDSSM [15] obtain low-dimensional vector representations using deep neural networks and handle the out-of-vocabulary (OOV) problem by hashing letter tri-grams to a low-dimensional vector.
The interaction-based models first obtain the word-level interactions from query-document pairs and then use deep neural networks to learn matching patterns. For example, ARC-II [16] and MatchPyramid [17] employ a CNN to learn hierarchical matching patterns over the local interactions. DRMM [3] summarizes the word-level similarities with a histogram, which is then fed into a learning-to-rank method. In this paper, two interaction-based models are adopted.
Moreover, there are other ways to improve the performance of IR. Query expansion methods, such as pseudo relevance feedback (PRF) [18][19][20], evaluate the given query and expand it to match additional documents. Content-based and collaborative filtering methods [21,22] improve the delivery of relevant content to end users. Personalization along with relevance feedback techniques can also be employed to increase users' overall satisfaction and quality of experience [23,24].

Entity-Based Ranking Methods
Recently, several large-scale knowledge graphs such as DBpedia [25] and Freebase [26] have emerged with the rapid development of information extraction methods [27,28]. A knowledge graph is a multi-relational graph composed of entities as nodes and relations as edges [29]. Knowledge graphs contain rich human knowledge about entities and have become important resources for supporting other tasks.
To overcome defects of traditional IR models such as vocabulary mismatch [30], various entity-based models [6,31,32] have been proposed with the aid of entity linking. Bag-of-entities representations use entities to represent texts and obtain significant improvement over traditional bag-of-words representations [7]. Explicit Semantic Ranking (ESR) [6] uses knowledge graphs and embeddings to take advantage of the semantics. Another way to use knowledge graphs in document ranking is the word-entity duet framework, which represents queries and documents using both entities and words. Because earlier entity-linking tools were not designed specifically for document ranking, JointSem [33] combines the entity-linking task and the entity-based ranking task in an end-to-end model.
Many entity-based models use entity embeddings to correlate queries with documents. Over the past five years, several representation learning methods, such as TransE [34] and TransR [35], have embedded knowledge graphs into a continuous vector space while preserving certain information of the graph. Semantic correlations between entities can be measured efficiently through representation learning. For entities that are semantically related but do not match exactly, embeddings provide a soft match signal that expresses their similarity. Alongside the exact match of query entities, the soft match can also provide signals for information retrieval [6,36].

Entity Ranking
The traditional entity ranking task focuses on ranking entities in response to a natural-language query [37,38], rather than ranking entities directly from documents [39]. In this paper, however, entity ranking is based on the importance of entities in the document, without additional data sources. Similarly, NERank [39] ranks entities without user queries, but it is based on document collections instead of a single document. Ranking entities within one document can be applied in other tasks, especially entity-based ranking models. Two keyword extraction baselines can be used to rank entities: TextRank and TF-IDF [39]. TF-IDF uses TF and IDF to evaluate how important an entity is to a document in a corpus. TextRank [40] is a graph-based ranking model for text processing, which has already been applied successfully in many natural language processing tasks.

Data Analysis
Intuitively, the importance of entities in the document is a key factor in the retrieval task, yet most models neglect it. An analysis of the results of entity-based models shows that some mistakes are caused by this factor. To explore the correlation between retrieval results and the importance of entities in the document, data analysis is performed on the ESR dataset (http://boston.lti.cs.cmu.edu/appendices/WWW2016/), an academic search dataset used for entity-based ranking. This section describes the ESR dataset, the entity ranking methods used for measuring the importance of entities, and the detailed analysis.

Dataset
The dataset was generated from Semantic Scholar, an academic search engine from the Allen Institute. It contains 100 queries sampled from Semantic Scholar's query logs and 8541 related papers selected as candidate documents. The academic queries are mainly about computer science (such as 'question answering' and 'deep reinforcement learning'). As in the TREC Web Track, the academic papers in the dataset carry 5 levels of relevance labels: Navigational (4), Exactly Right (3), Relevant (2), Related (1) and Off-Topic (0).

Entity Ranking
Before the analysis, the entities in each document must be ordered. Similar to the keyword extraction task, which extracts important words from documents, the entity ranking task ranks entities according to their importance in the document. Thus, keyword extraction methods can be employed to rank entities in documents. This paper adopts two frequently used baselines, TextRank and TF-IDF, to rank entities in documents. Because they are unsupervised, TextRank and TF-IDF are well suited to the entity ranking task in this work. Two entity ranking methods are applied to avoid the bias of a single method.
TextRank ranks entities based on the semantic graph extracted from the document. In the graph, document entities are vertices, and co-occurrence is used to draw edges between vertices: two entities are connected if their corresponding lexical units co-occur within a window of $N$ words. The score associated with a vertex can be considered as the importance of the corresponding entity:

$$ S_i = (1 - \alpha) + \alpha \sum_{e_j \in C(e_i)} \frac{S_j}{|C(e_j)|} $$

where $S_i$ is the score of entity $e_i$, $C(e_i)$ is the set of entities that are linked to $e_i$, and $\alpha$ is a damping factor. TF-IDF uses the product of entity frequency $tf_i$ and inverse document frequency $idf_i$ to show the importance of the entity $e_i$:

$$ tfidf_i = tf_i \times idf_i $$

The IDF $idf_i$ of entity $e_i$ is computed from the Open Research Corpus [41]:

$$ idf_i = \log \frac{N}{n_{d_i}} $$

where $N$ represents the total number of documents in the corpus, and $n_{d_i}$ is the number of documents in which the entity $e_i$ appears.
With the help of TextRank or TF-IDF, entities can be sorted in descending order of importance, which gives the entity order used in the analysis.
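The two ranking methods described above can be sketched in a few lines of Python. This is a minimal illustration rather than the implementation used in the experiments: the co-occurrence window is counted in entity mentions instead of words, `alpha=0.85` is the conventional TextRank default rather than a value given in the paper, and the mention list and document frequencies are hypothetical.

```python
import math
from collections import defaultdict


def textrank_scores(entities, window=5, alpha=0.85, iterations=50):
    """Score entities with unweighted TextRank over a co-occurrence graph.

    `entities` lists the document's entity mentions in reading order.
    Assumption: the window counts entity mentions, not words.
    """
    neighbours = defaultdict(set)
    for i, e in enumerate(entities):
        for other in entities[i + 1 : i + window]:
            if other != e:
                neighbours[e].add(other)
                neighbours[other].add(e)

    scores = {e: 1.0 for e in set(entities)}
    for _ in range(iterations):
        # S_i = (1 - alpha) + alpha * sum over neighbours j of S_j / |C(e_j)|
        scores = {
            e: (1 - alpha)
            + alpha * sum(scores[n] / len(neighbours[n]) for n in neighbours[e])
            for e in scores
        }
    return scores


def tfidf_scores(entities, doc_freq, num_docs):
    """Score entities by tf_i * idf_i, with idf_i = log(N / n_i)."""
    tf = defaultdict(int)
    for e in entities:
        tf[e] += 1
    return {e: tf[e] * math.log(num_docs / doc_freq[e]) for e in tf}


def rank_entities(scores):
    """Return entities sorted from most to least important."""
    return sorted(scores, key=scores.get, reverse=True)


# Hypothetical mentions and corpus statistics, for illustration only.
mentions = ["Bill Gates", "Microsoft", "Bill Gates", "Seattle", "Bill Gates", "Facebook"]
tr_ranking = rank_entities(textrank_scores(mentions, window=2))

doc_freq = {"Bill Gates": 10, "Microsoft": 100, "Seattle": 50, "Facebook": 200}
tfidf_ranking = rank_entities(tfidf_scores(mentions, doc_freq, num_docs=1000))
```

In this toy document both methods place 'Bill Gates' first, because it is the hub of the co-occurrence graph and also the most frequent, rarest entity.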

Analysis Result
After ranking entities, this section explores whether the importance of entities is related to document retrieval. The similarities between document entities and the query, computed from embeddings (see Section 4 for details), are the essential features of many entity-based models. For document entities, these similarities represent the semantic relatedness to the query and are therefore worth analyzing. Accordingly, the correlation between the similarities and the order of entities is analyzed for each relevance level in the dataset.
In Figure 2, the X-axis represents the percentage of sorted entities, and the Y-axis represents the average similarity between document entities and the corresponding queries. For the average similarity at each percentage and each relevance level, the two entity ranking methods give similar results: the important entities are always more similar to the query, and, for the same percentage, the more relevant the documents are, the higher the similarity is. The evidence in Figure 2 indicates that the importance of entities should be considered in the retrieval task.

Proposed Approach
This section presents the basic models (the toy model and ESR) and the enhanced ones. Following the analysis in Section 3, the enhanced models integrate the importance of entities into the basic models. Figure 3 presents the framework of the proposed approach. This section first defines the toy model and introduces ESR. Then, the toy model and ESR are each combined with the order of entities, which is obtained by TextRank or TF-IDF as described in Section 3.2.

Basic Entity-Based Models
The toy model and ESR are adopted as the basic models, which rank candidate documents $D = \{d_1, \ldots, d_n\}$ in response to a given query $q$. The toy model is a simplified method defined in this paper, and ESR is the state-of-the-art method on the ESR dataset. These models are introduced first, before they are integrated with entity importance.

Basic Toy Model
The toy model defined in this paper is a simple interaction-based ranking model that makes it possible to focus on entity importance. Queries and documents are represented by the entities contained in them. The toy model uses the average similarity between query entities and document entities as the ranking score of the document.

Entity-Based Representation
In the toy model, the representations of query $q$ and document $d$ are generated by entity linking. The toy model then obtains the representation of the query $q$ (or the document $d$) as a vector $\vec{E}_q$ (or $\vec{E}_d$), where each dimension refers to an entity $e$ in $q$ (or $d$) and the weight is the frequency with which this entity appears in the query or document.

The Generation of Query-Document Entity Translation Matrix
To characterize the similarity between the query and the document, the query-document entity translation matrix $M_{n \times m}$ is generated from the representations of $q$ and $d$, where $n$ is the size of $\vec{E}_d$ and $m$ is the size of $\vec{E}_q$. Each element in the matrix is the cosine similarity between the embeddings of entity $e_q$ in query $q$ and entity $e_d$ in document $d$:

$$ s(e_q, e_d) = \cos\big(V(e_q), V(e_d)\big) = \frac{V(e_q) \cdot V(e_d)}{\lVert V(e_q) \rVert \, \lVert V(e_d) \rVert} $$

where $V(e_q)$ and $V(e_d)$ denote the embeddings of entities $e_q$ and $e_d$, respectively. In the matrix, each row corresponds to an entity in document $d$ and each column corresponds to an entity in query $q$.
Unlike the exact matching in traditional IR, the query-document entity translation matrix provides a soft match between a query entity and a document entity whenever the cosine similarity remains less than 1.

Document Ranking
The relevance score of document $d$ with respect to query $q$ is computed as the average similarity in the query-document entity translation matrix $M_{n \times m}$. As a formula, the score of document $d$ is

$$ f(q, d) = \frac{1}{n \, m} \sum_{i=1}^{n} \sum_{j=1}^{m} s_{ij} $$

where $s_{ij}$ is the element in $M_{n \times m}$ and $f(q, d)$ is the relevance score between document $d$ and query $q$.
Although the toy model is simple, it is very practical in some cases.
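As a rough illustration of the toy model, the sketch below builds the translation matrix from entity embeddings and averages its entries. It uses plain Python lists in place of real TransE embeddings; the two-dimensional vectors and all function names are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two embedding vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def translation_matrix(doc_vecs, query_vecs):
    """n x m matrix: rows are document entities, columns are query entities."""
    return [[cosine(d, q) for q in query_vecs] for d in doc_vecs]


def toy_score(matrix):
    """Relevance score: the average of all entries of the matrix."""
    cells = [s for row in matrix for s in row]
    return sum(cells) / len(cells) if cells else 0.0


# Hypothetical 2-d embeddings: one query entity, two document entities.
query_vecs = [[1.0, 0.0]]
doc_vecs = [[1.0, 0.0], [0.0, 1.0]]

M = translation_matrix(doc_vecs, query_vecs)
score = toy_score(M)
```

Here the first document entity matches the query entity exactly and the second is orthogonal to it, so the matrix is `[[1.0], [0.0]]` and the document score is 0.5.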

Basic Explicit Semantic Ranking (ESR)
Like other entity-based retrieval models [7,31,42], ESR extracts interaction information between document entities and query entities to generate ranking features. First, ESR generates the entity-based representation and the query-document entity translation matrix $M_{n \times m}$ as described in Section 4.1.1. Then, ESR uses max-pooling and bin-pooling steps to generate ranking features for the learning-to-rank model.

Max-Pooling
After the query-document entity translation matrix is generated as in Section 4.1.1, the score of each entity in document $d$ is calculated from the matrix.
Query-level max-pooling is executed on the translation matrix $M_{n \times m}$, aiming to find the query entity most similar to each document entity. For each row in $M_{n \times m}$, max-pooling takes the maximum element $r_i$ in this row:

$$ r_i = \max(s_{i1}, \ldots, s_{im}) $$

where $s_{i1}, \ldots, s_{im}$ are the elements in one row. The maximum element for the document entity $e_d$ can be considered as the score of this entity. The vector $R_n = (r_1, \ldots, r_n)$ created by max-pooling reflects the similarity between the document entities and query $q$.

Learning to Rank with Bins
To transform the variable-length vector $R_n$ into a fixed-length matching histogram, ESR adopts a bin-pooling step which generates different features according to the similarity. The features are used in the learning-to-rank model to compute the final score of document $d$ for query $q$.
Bin-pooling represents document $d$ as a fixed-length vector by grouping entities according to their values:

$$ B_k(q, d) = \big|\{\, i : st_k \le r_i < ed_k \,\}\big| $$

where $r_i$ are the elements of the vector $R_n$ generated by max-pooling, $[st_k, ed_k)$ is the range of the $k$-th bin, and $B_k(q, d)$ represents the number of document entities whose scores are in the corresponding bin. ESR discretizes the range of the document entity scores into a set of ordered bins and treats exact matching as a separate bin, so that both exact matching and soft matching can be used by ESR.
A learning-to-rank model [43] uses the bin scores as features to produce the ranking function:

$$ f(q, d) = W^{\top} B(q, d) $$

where $B(q, d)$ is the vector of bin scores, $W$ is the set of parameters to be learned in ESR, and $f(q, d)$ represents the final score of the query-document pair $(q, d)$.
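The max-pooling and bin-pooling steps can be sketched as follows. This is an illustrative reading of the description above, not the ESR code: the bin boundaries are the ones listed later in the experimental setup, bins hold raw counts as the text states, and the weight vector in `linear_score` is a made-up stand-in for parameters that a learning-to-rank model such as RankSVM would learn.

```python
def max_pool(matrix):
    """r_i = max_j s_ij: best-matching query entity for each document entity."""
    return [max(row) for row in matrix]


# Soft-match ranges; the exact-match bin [1, 1] is handled separately.
SOFT_BINS = [(0.75, 1.0), (0.50, 0.75), (0.25, 0.50), (0.0, 0.25)]


def bin_pool(scores):
    """Count entities per bin: exact match first, then each soft-match range."""
    counts = [0] * (1 + len(SOFT_BINS))
    for r in scores:
        if r >= 1.0:
            counts[0] += 1
            continue
        for k, (start, end) in enumerate(SOFT_BINS, start=1):
            if start <= r < end:
                counts[k] += 1
                break
    # Negative scores match no bin and are therefore discarded.
    return counts


def linear_score(features, weights):
    """f(q, d) = W^T B(q, d) for an already-learned weight vector."""
    return sum(w * b for w, b in zip(weights, features))


# Hypothetical 4 x 2 translation matrix (4 document entities, 2 query entities).
M = [[1.0, 0.2], [0.6, 0.8], [0.1, 0.3], [-0.4, -0.1]]
R = max_pool(M)
B = bin_pool(R)
score = linear_score(B, [1.0, 0.5, 0.3, 0.2, 0.1])  # placeholder weights
```

With real embeddings, an exact match may evaluate to slightly less than 1.0 because of floating-point rounding, so a practical version would probably detect exact matches by entity identity rather than by score.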

Enhanced Entity-based Model by Considering the Importance of Entities
The basic entity-based models rely mainly on the similarity between entities and do not use the importance of entities in the document. This issue is addressed by assigning proper weights to entities. Building on Section 4.1, the entities in each document are divided into three categories according to the order calculated by TextRank or TF-IDF: the first third of the ranked entities are important for the document; the middle third are related to the document; and the last third are more likely to be irrelevant to the document. These categories are assigned the weights $\omega_1$, $\omega_2$ and $\omega_3$ respectively, with the constraint $\omega_1 \ge \omega_2 \ge \omega_3 \ge 0$. The following describes how entity importance is used in each model.
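The three-way split can be written as a small helper. The sketch below is one possible reading: the default weight values are placeholders (the text only requires $\omega_1 \ge \omega_2 \ge \omega_3 \ge 0$), and the way a remainder is distributed when the number of entities is not divisible by three is an assumption.

```python
def tier_weights(ranked_entities, w1=1.0, w2=0.5, w3=0.0):
    """Map each entity to a weight based on its third of the importance ranking.

    `ranked_entities` must already be sorted from most to least important.
    """
    if not (w1 >= w2 >= w3 >= 0):
        raise ValueError("weights must satisfy w1 >= w2 >= w3 >= 0")
    n = len(ranked_entities)
    weights = {}
    for pos, entity in enumerate(ranked_entities):
        if 3 * pos < n:          # first third: important entities
            weights[entity] = w1
        elif 3 * pos < 2 * n:    # middle third: related entities
            weights[entity] = w2
        else:                    # last third: likely irrelevant entities
            weights[entity] = w3
    return weights


# Hypothetical ranking of six entities.
example = tier_weights(["a", "b", "c", "d", "e", "f"])
```

For six ranked entities this assigns the first two to $\omega_1$, the next two to $\omega_2$, and the last two to $\omega_3$.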

Enhanced Toy Model by Considering the Importance of Entities
As Figure 4 illustrates, the enhanced toy model combines both the similarity and the importance of entities to recalculate the scores of entities. Important entities should contribute more to the ranking, while noise from irrelevant entities should be reduced. Multiplying the similarities by the corresponding weights makes the scores of different entities more reasonable:

$$ v_{ij} = \omega_l \cdot s_{ij} $$

where $s_{ij}$ is the element in the query-document entity translation matrix $M_{n \times m}$, $\omega_l$ is the weight determined by the order of entity $e_i$ in the document, and $v_{ij}$ is the optimized score representing the correlation between document entity $i$ and query entity $j$. The final ranking score is then computed from the optimized scores:

$$ f(q, d) = \frac{1}{n \, m} \sum_{i=1}^{n} \sum_{j=1}^{m} v_{ij} $$

Because the toy model is so simple, any improvement in it directly reflects the effectiveness of entity importance.

Enhanced ESR by Considering the Importance of Entities
As described in Section 4.1.2, the vector $R_n = (r_1, \ldots, r_n)$ created by max-pooling shows the similarity between the document entities and query $q$. The score of each document entity is then optimized by the following function:

$$ v_i = \omega_l \cdot r_i $$

where $r_i$ is the similarity corresponding to the $i$-th row in $M_{n \times m}$, $\omega_l$ depends on the order of this entity, and $v_i$ is the score of the entity in the document with respect to query $q$. The vector $V_n = (v_1, \ldots, v_n)$ created from $R_n$ and TextRank (or TF-IDF) reflects the relevance between document entities and query $q$.
Bin-pooling is then applied to $V_n$, which yields more informative ranking features than basic ESR because the importance of entities is taken into account. The learning-to-rank model applied to these ranking features produces the final ranking of the documents.
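Both enhancements amount to scaling each document entity's row of the translation matrix by its importance weight before aggregation. The sketch below assumes the per-row weights have already been derived from the TextRank or TF-IDF ordering; the matrix values and the weights `1.0, 0.5, 0.0` are hypothetical.

```python
def enhanced_toy_score(matrix, row_weights):
    """Average of v_ij = w_l * s_ij, where w_l is the weight of row i's entity."""
    cells = [w * s for row, w in zip(matrix, row_weights) for s in row]
    return sum(cells) / len(cells) if cells else 0.0


def enhanced_entity_scores(matrix, row_weights):
    """v_i = w_l * r_i, where r_i is the row-wise maximum from max-pooling."""
    return [w * max(row) for row, w in zip(matrix, row_weights)]


# Hypothetical 3 x 2 translation matrix; rows are ordered by entity importance.
M = [[0.9, 0.1], [0.4, 0.2], [0.8, 0.6]]
weights = [1.0, 0.5, 0.0]

toy = enhanced_toy_score(M, weights)
V = enhanced_entity_scores(M, weights)
```

The third entity is similar to the query (0.8) but ranked as unimportant, so its contribution is suppressed: `V` is approximately `[0.9, 0.2, 0.0]`, and in the enhanced ESR these values would be passed to bin-pooling in place of the raw max-pooled scores.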

Results and Discussion
The experiments investigate the effectiveness of entity importance. This section provides the evaluation results and discusses the impact of the importance of entities. Both TextRank and TF-IDF are tested as entity ranking methods for the entity-based models. The performance in different scenarios is then discussed further. Finally, to verify the effectiveness of entity ranking, only the top third of the ranked entities is used to optimize the performance of ESR.

Experimental Setup
To evaluate the proposed models, a set of experiments is conducted using the ESR dataset [6] described in Section 3. The entities contained in queries and documents are identified by Tagme [44], an entity-linking system. The experiments concentrate on the entities from the document's title and abstract, which contain adequate information about a document.
The baselines of these experiments are tf.idf-F, the toy model and ESR. Tf.idf-F is a widely used retrieval method; the toy model and ESR are the entity-based retrieval models introduced in Section 4.1. In the ESR model, the score from Semantic Scholar's production system is not used, because ESR and the enhanced models can be applied not only in Semantic Scholar but also in other retrieval systems. The entity embeddings in ESR and the toy model are generated from the Wikidata (https://www.wikidata.org) representation via TransE [34] with the help of OpenKE (http://openke.thunlp.org/index/about). A popular approach, RankSVM [45], serves as the learning-to-rank model in the experiments.
In the experiments, the co-occurrence window size of TextRank is set to five words. As in ESR, negative entity scores are discarded because they are not informative, so the effective scores of entities lie in the range [0, 1]. The interval [0, 1] is discretized into five bins including an exact matching bin: [1, 1], [0.75, 1), [0.50, 0.75), [0.25, 0.50) and [0, 0.25). Ten features are generated for RankSVM by bin-pooling on the title and the abstract. Both the baselines and the models that integrate the importance of entities are trained and tested using ten-fold cross validation, and the 'c' of RankSVM is selected from the set {0.0001, 0.0005, 0.001, 0.005, 0.01, 0.05, 0.1, 0.5, 1}.

Evaluation and Analysis
In the following, the enhanced toy model and the enhanced ESR model with TextRank and TF-IDF are evaluated. All methods are evaluated by normalized discounted cumulative gain at rank 20 (NDCG@20), which is the main evaluation metric in the TREC Web Track, and additionally by NDCG@1, NDCG@5, NDCG@10, P@3 and P@10.
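For reference, the metrics named above can be computed as in the sketch below. The paper does not state which NDCG variant is used, so the exponential gain $2^{rel} - 1$ with a $\log_2$ discount is an assumption, as is treating any label of 1 or higher as relevant for P@k. The ideal ranking is taken over the labels of the ranked list itself, which assumes the list contains all judged candidates for the query.

```python
import math


def dcg_at_k(labels, k):
    """Discounted cumulative gain over the top-k relevance labels."""
    return sum(
        (2 ** rel - 1) / math.log2(rank + 2)
        for rank, rel in enumerate(labels[:k])
    )


def ndcg_at_k(labels, k):
    """DCG normalized by the DCG of the ideal (label-sorted) ordering."""
    ideal = dcg_at_k(sorted(labels, reverse=True), k)
    return dcg_at_k(labels, k) / ideal if ideal > 0 else 0.0


def precision_at_k(labels, k, min_rel=1):
    """Fraction of the top-k documents whose label is at least `min_rel`."""
    return sum(1 for rel in labels[:k] if rel >= min_rel) / k


# Labels of the documents in ranked order, on the 0-4 scale of the dataset.
perfect = ndcg_at_k([3, 2, 1, 0], 20)
swapped = ndcg_at_k([0, 1], 2)
p3 = precision_at_k([3, 0, 1], 3)
```

A ranking that already lists documents in decreasing label order scores 1.0, and any ranking that places a less relevant document above a more relevant one scores strictly less.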

Results of Enhanced Toy Model
First, the enhanced toy model is compared with the baselines. Table 1 presents the experimental results obtained from tf.idf-F, the toy model and the enhanced models. Although the toy model is very simple, it performs better than tf.idf-F on NDCG@10 and NDCG@20, which supports the effectiveness of the entity-based model.
Table 1 shows that both the toy model optimized by TextRank and the toy model optimized by TF-IDF improve on all evaluation metrics. The enhanced models even outperform the basic model by over 10% on NDCG@1 and NDCG@5. W/T/L is the number of queries that a model improves (win), does not change (tie), and hurts (loss), compared with the toy model on NDCG@20. The W/T/L values in Table 1 show that both enhanced toy models win more cases than they lose. To further evaluate the enhanced models, P@3 and P@10 are provided to measure how many related documents are among the top 3 and top 10 positions. In current IR systems, queries have many related documents, but few users will read all of them. Users usually pay more attention to the top documents returned by IR systems (for example, the top 10 documents on the first search results page), so the earlier positions are more important for user satisfaction. The data in Table 2 show that both enhanced models also outperform the toy model on P@3 and P@10, which suggests that the enhanced toy models could improve user satisfaction.
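The W/T/L counts can be obtained by comparing per-query NDCG@20 values of an enhanced model with those of its baseline. The following is a small sketch with hypothetical query identifiers and scores; the tolerance used to decide a tie is an assumption.

```python
def win_tie_loss(enhanced, baseline, tol=1e-9):
    """Count queries the enhanced model improves, leaves unchanged, or hurts.

    Both arguments map a query id to that model's NDCG@20 on the query.
    """
    wins = ties = losses = 0
    for query, base in baseline.items():
        diff = enhanced[query] - base
        if diff > tol:
            wins += 1
        elif diff < -tol:
            losses += 1
        else:
            ties += 1
    return wins, ties, losses


# Hypothetical per-query NDCG@20 values.
baseline = {"q1": 0.40, "q2": 0.30, "q3": 0.25}
enhanced = {"q1": 0.50, "q2": 0.30, "q3": 0.20}
wtl = win_tie_loss(enhanced, baseline)
```

In this example the enhanced model wins on `q1`, ties on `q2`, and loses on `q3`.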
Table 2. Retrieval performance of toy model and enhanced toy models on P@3 and P@10. The best results in each metric are marked bold.

Results of Enhanced ESR
As shown in Table 3, the results of ESR are significantly higher than those of the vector space model, especially for NDCG@20, although the scores from Semantic Scholar's production system are not employed. These results further verify the effectiveness of the entity-based model and of soft matching, as analyzed in the ESR paper [6].
Table 3. Retrieval performance of tf.idf-F, basic ESR and enhanced ESR on NDCG@K. The performance of enhanced models is compared with ESR in percentages. The number of queries a model improves (Win), does not change (Tie) or hurts (Loss), compared with ESR, is provided in the W/T/L column. The best results in each metric are marked bold.

[Table 3 columns: Method, NDCG@1, NDCG@5, NDCG@10, NDCG@20, W/T/L]

Although ESR is the state-of-the-art model on this dataset, the ESR model optimized by TextRank still performs better than ESR on all evaluation metrics. On NDCG@1, the improvement is even higher than 9%. Compared with ESR, the enhanced model using TextRank improves NDCG@20 on 46 queries, leaves 23 queries unchanged and hurts 31 queries. The ESR model optimized by TF-IDF also improves on all evaluation metrics. As Table 4 shows, the ESR model optimized by TextRank (or TF-IDF) is also stronger on P@3 and P@10. These experiments provide strong evidence that considering the importance of entities can improve the performance of entity-based retrieval models. The consistent results on both the toy model and the ESR model make it unlikely that the gains are accidental. In the experiments, TF-IDF performs worse than TextRank as an entity ranking method, which may explain why the enhanced model using TF-IDF gains less.
Compared with the basic models, the additional implementation cost of the enhanced models lies mainly in computing the importance of entities in documents. The execution times of TextRank and TF-IDF were measured on a 2.8 GHz Intel Core i7 CPU with 8 GB of RAM, using 1000 randomly chosen documents, and each experiment was repeated 10 times. TF-IDF takes only 0.202 s to process 1000 documents, and the average execution time of TextRank on one document is 0.121 s. Moreover, the importance of entities can be computed off-line, so the additional cost does not affect the efficiency of retrieval. Therefore, the enhanced models obtain their improvement at little additional cost.

Performance on Different Scenarios
This section analyzes the influence of the importance of entities in two different scenarios: multiple difficulty degrees and multiple query length degrees. The experiments compare the strong baseline, ESR, with its enhanced models in each of these scenarios.

Multiple Difficulty Degrees
This part explores the performance of the enhanced models on queries of different difficulty. Figure 6 compares the performance of the enhanced models with ESR on individual queries. Each point in Figure 6 corresponds to a query. The value on the X-axis is ESR's NDCG@20 on the query, and the value on the Y-axis is the enhanced model's relative NDCG@20 (percentage) compared with ESR. Points above the 100% line represent the queries improved by the enhanced model. The scatter plots show that the enhanced models primarily improve the queries that are hard for ESR (NDCG@20 <= 0.4): for the ESR model optimized by TextRank, the performance improvement on these queries is more than 6.9%; for the ESR model optimized by TF-IDF, the gain in NDCG@20 is about 6.3%. On the queries that are easy for ESR (NDCG@20 > 0.4), the improvement of the enhanced models is relatively small. Moreover, many of the queries that the enhanced model improves obtain a large improvement from considering the importance of entities, whereas most of the queries that the enhanced model hurts incur only a small performance penalty.

Multiple Query Length Degrees
This section further evaluates the effect of the importance of entities on short queries (one entity) and long queries (more than one entity). The results on short queries and long queries are shown in Figure 7. Both enhanced models have more win cases on long queries, where they can pay more attention to the important entities. These two experiments show that the enhanced models mainly improve the performance on longer queries and on queries where ESR fails.

Impact of Document Entities with Different Order
To further confirm the effectiveness of the importance of entities, this section explores the impact of document entities at different positions in the ranking. As shown in Figure 8, and as in the method of Section 4.2, the document entities are divided into three groups according to their importance: important entities, related entities, and irrelevant entities. ESR is tested with different groups of entities to analyze the contribution of each group.
First, only the important entities are used in the ESR model. In other words, the bin features of important entities are multiplied by a coefficient $\omega_1 = 1$, and the bin features of the other entities are multiplied by the coefficients $\omega_2 = 0$ and $\omega_3 = 0$. As shown in Table 5, the model with only the important entities also outperforms ESR on all evaluation metrics. This indicates that the first third of the ranked entities contains enough information for IR and that important entities play a crucial role in the ranking task. The ranking also excludes some interference, such as wrongly linked entities.
Then the important entities and the related entities are both adopted in the ESR model, which means that $\omega_3$ is 0 while $\omega_1$ and $\omega_2$ are hyper-parameters to be selected. Table 5 shows that the impact of related entities is limited even though the coefficient $\omega_2$ is optimized. Finally, all the document entities are used in the ESR model, with $\omega_1$, $\omega_2$ and $\omega_3$ as hyper-parameters to be selected. The third group of entities sometimes consists of wrongly linked entities. The results show that the irrelevant entities are ineffective for ESR and can even be harmful. This experiment shows that the more important the entities are, the more impact they have. It also confirms that the importance of entities should be considered in the ESR model.

Conclusions
This paper points out that the importance of entities in the document is a crucial factor in IR, while many existing entity-based models have not taken it into account. Thus, this paper explores the impact of the importance of entities from two aspects. First, data analysis is carried out on the ESR dataset and reveals the correlation between document ranking and the importance of entities. Then this paper optimizes the entity-based models by considering the importance of entities. TextRank and TF-IDF are used to measure the importance of document entities, and the enhanced toy models and ESR models are proposed to make use of that importance. Experiments with two IR models and two entity ranking methods provide evidence for the effectiveness of the approach.
Extensive experiments on the ESR dataset reveal the effectiveness of the enhanced models. Statistically significant improvements over the basic models are observed on all evaluation metrics, especially for the toy model optimized by TextRank, which obtains a 24.22% increase on NDCG@1 and a 12% increase on P@3. In addition, assigning different weights to entities according to their importance greatly improves the effectiveness on long queries and on queries where ESR fails. Furthermore, the research has shown that the important entities in the document are essential for IR. While the enhanced models improve on all evaluation metrics, the additional implementation cost is quite low. We hope that our study on the importance of entities will inspire research into more in-depth entity-based retrieval models in the future.

Figure 1. The framework of entity-based model.

Figure 2. The relevance between the order of entities and similarities of document entities: (a) Entities are sorted by TextRank; (b) Entities are sorted by TF-IDF.

Figure 3. Framework of the proposed approach.

Figure 4. An overview of enhanced toy model.

Figure 5. An illustration of how the ESR is optimized by considering the importance of entities.

Figure 6. The enhanced model's relative NDCG@20 compared with ESR on individual queries. Each point corresponds to a query in the ESR dataset. The X-axis marks the ESR's NDCG@20 of the query, and the value on the Y-axis is the enhanced model's relative NDCG@20 (percentage) compared with ESR.

Figure 7. The W/T/L on short queries (only one entity) and long queries (more than one entity). The X-axes mark two query levels, and Y-axes are the W/T/L values.

Figure 8. The architecture of the analysis model.

Table 1. Retrieval performance of tf.idf-F, toy model and enhanced toy model on NDCG@K. The performance of proposed models is compared with toy model in percentages. The number of queries a model improves (Win), does not change (Tie) or hurts (Loss), compared with toy model, is provided in the W/T/L column. The best results in each metric are marked bold.

Table 4. Retrieval performance of ESR and enhanced ESR models on P@3 and P@10. The best results in each metric are marked bold.

Table 5. Performance of ESR with different parts of entities. The best results in each metric are marked bold.