Key Concept Identification: A Comprehensive Analysis of Frequency and Topical Graph-Based Approaches
Abstract
1. Introduction
- Providing a brief survey of various kinds of keyphrase extraction methods along with the necessary details and limitations of different approaches.
- Identifying the factors that can contribute to precision and recall errors in frequency and topical graph-based keyphrase extraction approaches, through performance analysis.
- Identifying the three major sources of errors in the selected approaches by conducting quantitative error source analysis.
2. Related Work
3. Selected Unsupervised Methods
3.1. Common Extraction Steps
3.2. Description of Selected Unsupervised Methods
3.2.1. TF-IDF
3.2.2. KP-Miner
3.2.3. TopicRank
4. Comparative Analysis
4.1. Experimental Setup
4.2. Performance Analysis
4.2.1. Performance Measures
- Precision measures the probability that a phrase selected as a key concept by an algorithm is actually a key concept. It is the proportion of correctly identified key concepts among all retrieved phrases. In keyphrase extraction one is usually interested in retrieving the top K concepts, so we use Precision at K (P@K).
- Recall measures the probability that a true key concept is correctly retrieved by the algorithm. It is the proportion of correctly identified key concepts among all the gold-standard key concepts.
- F-measure There is a tradeoff between precision and recall: if the goal is to extract all key concepts, recall may reach 100% while P@K tends toward 0%; conversely, if the goal is that every extracted phrase be a genuine key concept, P@K may reach 100% while the chance of extracting all keyphrases approaches 0%. Therefore, another measure, the F-measure, is widely used in information extraction; it is maximized when precision and recall are balanced, so a high F-measure implies reasonably high values of both [53,54,55]. The F-measure is the harmonic mean of precision and recall: F = 2 × P × R / (P + R).
- Average Precision (AP) Precision, recall, and F-measure are single-value metrics computed over the whole list of retrieved concepts. However, since keyphrase extraction algorithms return a ranked list of key concepts, it is desirable to take into account the order in which the key concepts are extracted. We therefore also use Average Precision, a preferred measure for evaluating key concept extraction algorithms that produce rankings. AP is defined as the area under the precision-recall curve: a single-figure quality measure across recall levels, obtained by averaging the precision computed after each retrieved key concept in the ranked list that matches the gold standard [1,56]. In our case, AP = (1/|G|) ∑_k P(k) · rel(k), where |G| is the number of gold-standard key concepts, P(k) is the precision at rank k, and rel(k) equals 1 if the concept at rank k matches the gold standard and 0 otherwise.
- Average Multiword Phrases As noted by Nakagawa and Mori [1,57], about 85% of keyphrases are multiword. We are therefore interested in analyzing performance in terms of the multiword phrases extracted by each system; to the best of our knowledge, this is the first attempt to compare keyphrase algorithms on this metric. To compute Average Multiword Phrases, we count the average number of multiword key concepts that match the gold standard.
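The measures above can be sketched in a few lines of Python. This is a minimal illustration with hypothetical helper names, assuming phrases are matched by exact string comparison against the gold standard, which simplifies the actual evaluation protocol:

```python
def precision_recall_f1(retrieved, gold, k):
    """P@K, recall, and F-measure for a ranked list of extracted phrases."""
    top_k = retrieved[:k]
    hits = sum(1 for p in top_k if p in gold)
    precision = hits / k
    recall = hits / len(gold)
    f1 = 0.0 if precision + recall == 0 else 2 * precision * recall / (precision + recall)
    return precision, recall, f1

def average_precision(retrieved, gold):
    """Average of the precision values taken at each rank where the
    retrieved phrase matches the gold standard, normalized by |gold|."""
    hits, total = 0, 0.0
    for rank, p in enumerate(retrieved, start=1):
        if p in gold:
            hits += 1
            total += hits / rank
    return total / len(gold) if gold else 0.0

def multiword_matches(retrieved, gold):
    """Count of multiword gold matches in one retrieved list;
    averaging over documents gives the Average Multiword Phrases metric."""
    return sum(1 for p in retrieved if p in gold and len(p.split()) > 1)
```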
4.2.2. Individual Performance
- TF-IDF A common observation for most key concept extraction methods is that as K, the number of key concepts predicted by each system, increases, recall increases while precision decreases. The precision-recall curves (Figure 2) show that TF-IDF is consistent with this intuition. The overall performance of TF-IDF on the SemEval-2010 benchmark dataset is low compared to KP-Miner but comparable to TopicRank, with slightly higher values, as shown in Figure 2a. The curve in Figure 3a also indicates that the average number of multiword concepts extracted by the system remains stable at a low 1.25. On the Quranic and 500N-KPCrowd datasets, in contrast, the precision-recall curve of TF-IDF largely overlaps with that of KP-Miner (Figure 2b,c); however, the average number of multiword key concepts extracted is still no more than 1.25. The low performance may stem from the fact that the tf-idf model can miss multiword concepts. This makes sense because the tf (term frequency) factor dominates the idf (inverse document frequency) factor: tf measures how frequent a word is within a document and nothing affects this value, whereas idf measures how rare a word is across the documents in the corpus and depends on the number of documents, N, and on the document frequency. Thus, idf is effective only when there are many documents in the corpus and the document frequency of a word or phrase is low. Based on this argument, despite the fact that 85% of keyphrases are normally multiword, single terms will gain more weight than multiword phrases, because single-word terms have been found to occur more frequently than multiword phrases [8]. This weakness of the tf-idf-based, data-driven approach may therefore result in missing important multiword key concepts and, in turn, lower performance.
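To see concretely why tf dominates idf in a small corpus, consider a toy tf-idf scorer (a hypothetical sketch, not the evaluated implementation; raw substring counts stand in for proper tokenized matching). When both a single term and a multiword phrase occur in only one document, their idf values coincide, and the more frequent single term wins:

```python
import math

def tfidf_scores(doc_text, corpus, candidates):
    """Toy tf-idf scorer for candidate phrases (illustration only)."""
    n_docs = len(corpus)
    scores = {}
    for phrase in candidates:
        tf = doc_text.count(phrase)                      # frequency inside this document
        df = sum(1 for doc in corpus if phrase in doc)   # documents containing the phrase
        idf = math.log(n_docs / (1 + df))                # similar for both candidates here
        scores[phrase] = tf * idf
    return scores

corpus = [
    "extraction methods compare extraction quality of concept extraction",
    "graph ranking of topics",
    "topic models for documents",
]
scores = tfidf_scores(corpus[0], corpus, ["extraction", "concept extraction"])
# "extraction" (tf = 3) outscores "concept extraction" (tf = 1),
# since both phrases share the same idf in this tiny corpus.
```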
- KP-Miner The KP-Miner precision-recall curves (Figure 2) show a progression similar to that of TF-IDF: precision falls as recall rises. The overall performance of KP-Miner on the SemEval-2010 dataset is better than both TF-IDF and TopicRank; for all variations of the top K key concepts, the highest scores are achieved by KP-Miner (Figure 2a). We may attribute this to the fact that KP-Miner gives more weight to multiword concepts, as can be seen in Figure 3a. KP-Miner is based on the tf-idf model and, as discussed earlier, idf, which measures the rareness of a phrase, is effective only when there are many documents in the corpus and the document frequency of a word or phrase is low. Because multiword phrases are less frequent and rarer across the document corpus, on the SemEval-2010 dataset, where the number of documents is higher than in the Quranic dataset, multiword concepts may receive an effective score. Investigating the other factors that contribute to the higher number of multiword keyphrases extracted by KP-Miner, we found that the authors of KP-Miner assume that compound keyphrases do not occur frequently compared to single words within a document set. Based on this assumption, the document frequency for multiword key concepts is set to 1, which yields the maximum idf value and thus gives the maximum score to multiword key concepts. We speculate that KP-Miner is thereby biased towards multiword key concepts. The performance of KP-Miner on the Quranic and 500N-KPCrowd datasets supports this argument, because in those cases the idf values of both methods are close to each other, for both single and multiword concepts, resulting in the somewhat overlapping patterns with TF-IDF (Figure 2b,c).
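The document-frequency assumption described above can be sketched as follows. This is a simplified illustration, not KP-Miner's actual implementation; the log base and the helper name are assumptions:

```python
import math

def kpminer_style_idf(phrase, corpus):
    """Sketch of the assumption described above: multiword candidates
    are treated as if they occur in a single document, which maximizes
    their idf (log base 2 is an assumption of this sketch)."""
    n_docs = len(corpus)
    if len(phrase.split()) > 1:
        df = 1  # assumed: compound phrases are rare across the collection
    else:
        df = sum(1 for doc in corpus if phrase in doc) or 1
    return math.log2(n_docs / df)

corpus = [
    "concept extraction from text",
    "text mining and extraction",
    "automatic extraction of topics",
    "graph based extraction",
]
# "extraction" appears in every document, so its idf collapses to 0,
# while the multiword candidate gets the maximum idf for this corpus.
```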
- TopicRank This method exhibits different patterns. While on the SemEval-2010 dataset the performance of TopicRank in terms of precision-recall is close to TF-IDF and lower than KP-Miner, on the Quranic dataset its results show unstable behavior (Figure 2b,c): first precision does not fall as recall rises, then it suddenly falls while recall remains stable at 5.88, after which a gradual increase in precision can be seen. Digging deeper into why TopicRank performs poorly and behaves erratically on the SemEval-2010 and Quranic datasets, we found that the main cause lies in the way topics are generated and weighted. In the first step of identifying candidate concepts, TopicRank relies on noun phrases. However, noun phrases may contain overly common, general, or noisy terms [1]. Moreover, not all key concepts are noun phrases; verb phrases may also contain important key concepts. For example, in the keyphrase “extracting concepts”, “extracting” is a verb of type VBG (gerund), not an NN (noun), yet it is potentially similar to “concept extraction”. Similarly, in the key concept “distributed computing”, the word “distributed” is tagged as a verb of type VBN. Relying only on noun phrases is therefore not enough for key concept extraction and may result in missing many valuable key concepts. In the next step of clustering candidate phrases, we found that the similarity between candidates is not computed semantically but checked lexically, with a minimum overlap threshold of 25%. This may generate topics that group candidates which are lexically similar but semantically opposite. For instance, “supervised machine learning” and “unsupervised machine learning” are lexically similar but semantically opposite concepts. The effect becomes apparent in the subsequent steps of building the graph from the topics and ranking them.
Key concepts may thus be assigned to the wrong topics, their co-occurrence weights may be attached to the wrong edges in the graph, and wrong topics may consequently become correlated and ultimately gain higher weights. Comparing TopicRank with TF-IDF and KP-Miner, we therefore conclude that TopicRank's co-occurrence-based relatedness weighting scheme is less certain than the frequency-based weighting of TF-IDF and KP-Miner, and the same uncertainty shows in TopicRank's unstable results. However, on the 500N-KPCrowd dataset it outperforms its competitors in terms of the precision-recall curve (Figure 2c). The reason could be that the average number of words per document in 500N-KPCrowd is very low compared to the other datasets, in which case the lexical similarity may be fruitful and improve precision. Similarly, a gradual increase in performance across all three datasets can be seen in terms of Average Multiword Phrases (Figure 3); this can be attributed to the fact that TopicRank does not depend on the frequency-based tf-idf model, which is hard to optimize for multiword phrases.
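The lexical clustering step discussed above can be illustrated with a small sketch. TopicRank's actual check operates on stemmed words and the exact overlap normalization may differ; this hypothetical version shows how two semantically opposite phrases clear a 25% word-overlap threshold:

```python
def lexical_overlap(phrase_a, phrase_b):
    """Fraction of shared words between two candidate phrases
    (stemming omitted here for brevity)."""
    words_a, words_b = set(phrase_a.split()), set(phrase_b.split())
    return len(words_a & words_b) / min(len(words_a), len(words_b))

def same_topic(phrase_a, phrase_b, threshold=0.25):
    """Cluster two candidates into one topic when they share at least
    25% of their words, as in the setup described above."""
    return lexical_overlap(phrase_a, phrase_b) >= threshold

# "supervised machine learning" vs "unsupervised machine learning":
# two of three words overlap (~67%), so the opposites land in one topic.
```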
4.2.3. Overall Performance
4.3. Error Source Analysis
5. Conclusions
Author Contributions
Conflicts of Interest
Appendix A. Summary of Different Keyphrase Extraction Methods
Source | Category | Approach Used | Techniques Used | Remarks | Limitations |
---|---|---|---|---|---|
KEA [14] | Supervised | Statistical and structural-based | Term Frequency, Phrase Position | Language independent. Relying only on statistical information may result in missing important multiword phrases. | Requires a manually annotated, quality training set. The training process makes these methods domain dependent. |
GenEx [10] | Supervised | Statistical and structural-based | Term Frequency, Phrase Position | ||
[15] | Supervised | Statistical and linguistic-based | Lexical features e.g., collection frequency, part-of-speech tags, Bagging technique | ||
KEA++ [33] | Supervised | Statistical and linguistic based | NLP techniques, Using Thesaurus | ||
[34] | Supervised | Statistical and linguistic-based | Distribution information of candidate phrase | Extension of KEA. Language dependent, may require domain knowledge and expertise in language. Glossaries or auxiliary structures are useful however, they require extensive human efforts in definition of terms and terminology standardization. | |
[16] | Supervised | Statistical and linguistic based | Integration of Wikipedia | ||
[32] | Supervised | Statistical and linguistic-based | Structural features e.g., presence of a phrase in specific section. Lexical features e.g., presence of phrase in Wordnet or Wikipedia. Bagged decision tree | ||
[31] | Supervised | Statistical and linguistic based | Statistical and linguistic features e.g., tf-idf, BM25, POS | ||
[30] | Supervised | Statistical and linguistic based | Features based on citation network information along with traditional features | ||
[35] | Un-Supervised | Statistical-based | tf-idf (term frequency-inverse document frequency). Topic proportions | Target process is the ranking of candidate phrases. Language independent | Relying only on statistical information may result in missing important multiword key concepts due to higher weights for single terms. Semantics-free extraction |
[38] | Un-Supervised | Statistical-based | tf-idf (term frequency-inverse document frequency). Topic proportions | ||
[27] | Un-Supervised | Statistical-based | tf-idf (term frequency-inverse document frequency). Topic proportions | ||
[37] | Un-Supervised | Statistical-based | tf-idf (term frequency-inverse document frequency). Topic proportions | ||
[8] | Un-Supervised | Statistical-based | Tf-idf, boosting factor | ||
[40] | Un-Supervised | Linguistic or syntactical information-based | Considers Part-of speech tags other than noun and adjectives | Language dependent, may require domain knowledge and expertise in language. Glossaries or auxiliary structures require extensive human efforts in definition of terms and terminology standardization. | |
[39] | Un-Supervised | Linguistic or syntactical information-based | Creates a database containing semantically related keyphrases | ||
CFinder [1] | Un-Supervised | Statistical, syntactical and structural information-based | Statistical and structural information. Domain-specific knowledge | ||
Topic-biased PageRank [47] | Un-Supervised | Topical clustering-based | Topic models | ||
[46] | Un-Supervised | Topical clustering-based | Topic models. Decomposing documents into multiple topics | Extension of topic-biased PageRank | |
TopicRank [21] | Un-Supervised | Topical clustering-based | Clustering techniques to group candidate phrases into topics | ||
TextRank [22] | Un-Supervised | Graph-based ranking | PageRank algorithm | Adjacent words are used to build the graph | Prefers single words as nodes of the graph, thus may miss important multiword keyphrases. Does not guarantee covering all topics. |
SingleRank [23] | Un-Supervised | Graph-based ranking | Co-occurrence window of variable size w ≥ 2. lexically-similar neighboring documents | Extension of TextRank | |
ExpandRank [24] | Un-Supervised | Graph-based ranking | Co-occurrence window of variable size w ≥ 2. Lexically-similar neighboring documents | |
[42] | Un-Supervised | Graph-based ranking | Citation network information | Extension of ExpandRank | |
[43] | Un-Supervised | Graph-based ranking | Centrality measures e.g., node degree, closeness, and clustering coefficient | ||
[44] | Un-Supervised | Graph-based ranking | WordNet information | WordNet is used to find semantic relationship between words | |
SGRank [41] | Un-Supervised | Graph-based ranking | Statistical Heuristics e.g., Tf-Idf, First position of a keyphrase in a document | ||
[45] | Un-Supervised | Graph-based ranking | Word embedding vectors | Finds semantic relatedness between words in a graph |
References
- Kang, Y.B.; Haghighi, P.D.; Burstein, F. CFinder: An intelligent key concept finder from text for ontology development. Expert Syst. Appl. 2014, 41, 4494–4504. [Google Scholar] [CrossRef]
- Noy, N.F.; McGuinness, D.L. Ontology Development 101: A Guide to Creating Your First Ontology; Stanford University: Stanford, CA, USA, 2001. [Google Scholar]
- Cimiano, P.; Völker, J. A framework for ontology learning and data-driven change discovery. In Proceedings of the 10th International Conference on Applications of Natural Language to Information Systems (NLDB), Lecture Notes in Computer Science, Alicante, Spain, 15–17 June 2005; Springer: Berlin, Germany, 2005; Volume 3513, pp. 227–238. [Google Scholar]
- Jiang, X.; Tan, A.H. CRCTOL: A semantic-based domain ontology learning system. J. Assoc. Inf. Sci. Technol. 2010, 61, 150–168. [Google Scholar] [CrossRef]
- Li, Q.; Wu, Y.F.B. Identifying important concepts from medical documents. J. Biomed. Inform. 2006, 39, 668–679. [Google Scholar] [CrossRef] [PubMed]
- Tonelli, S.; Rospocher, M.; Pianta, E.; Serafini, L. Boosting collaborative ontology building with key-concept extraction. In Proceedings of the 2011 Fifth IEEE International Conference on Semantic Computing (ICSC), Palo Alto, CA, USA, 18–21 September 2011; pp. 316–319. [Google Scholar]
- Aman, M.; Bin Md Said, A.; Kadir, S.J.A.; Baharudin, B. A Review of Studies on Ontology Development for Islamic Knowledge Domain. J. Theor. Appl. Inf. Technol. 2017, 95, 3303–3311. [Google Scholar]
- El-Beltagy, S.R.; Rafea, A. KP-Miner: A keyphrase extraction system for English and Arabic documents. Inf. Syst. 2009, 34, 132–144. [Google Scholar] [CrossRef]
- Englmeier, K.; Murtagh, F.; Mothe, J. Domain ontology: automatically extracting and structuring community language from texts. In Proceedings of the International Conference Applied Computing (IADIS), Salamanca, Spain, 18–20 February 2007; pp. 59–66. [Google Scholar]
- Turney, P.D. Learning algorithms for keyphrase extraction. Inf. Retr. 2000, 2, 303–336. [Google Scholar] [CrossRef]
- Hulth, A.; Megyesi, B.B. A study on automatically extracted keywords in text categorization. In Proceedings of the 21st International Conference on Computational Linguistics and the 44th Annual Meeting of the Association for Computational Linguistics, Sydney, Australia, 17–18 July 2006; Association for Computational Linguistics: Stroudsburg, PA, USA, 2006; pp. 537–544. [Google Scholar]
- Hasan, K.S.; Ng, V. Conundrums in unsupervised keyphrase extraction: making sense of the state-of-the-art. In Proceedings of the 23rd International Conference on Computational Linguistics: Posters, Beijing, China, 23–27 August 2010; Association for Computational Linguistics: Stroudsburg, PA, USA, 2010; pp. 365–373. [Google Scholar]
- Saidul, H.K.; Vincent, N. Automatic Keyphrase Extraction: A Survey of the State of the Art. In Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics, Baltimore, MD, USA, 22–27 June 2014; Volume 1, pp. 1262–1273. [Google Scholar]
- Frank, E.; Paynter, G.W.; Witten, I.H.; Gutwin, C.; Nevill-Manning, C.G. Domain-specific keyphrase extraction. In Proceedings of the 16th International Joint Conference on Artificial Intelligence (IJCAI 99), Stockholm, Sweden, 31 July–6 August 1999; Morgan Kaufmann Publishers Inc.: San Francisco, CA, USA, 1999; Volume 2, pp. 668–673. [Google Scholar]
- Hulth, A. Improved automatic keyword extraction given more linguistic knowledge. In Proceedings of the 2003 Conference on Empirical Methods in Natural Language Processing, Sapporo, Japan, 11–12 July 2003; Association for Computational Linguistics: Stroudsburg, PA, USA, 2003; pp. 216–223. [Google Scholar]
- Medelyan, O.; Frank, E.; Witten, I.H. Human-competitive tagging using automatic keyphrase extraction. In Proceedings of the 2009 Conference on Empirical Methods in Natural Language Processing, Singapore, 6–7 August 2009; Association for Computational Linguistics: Stroudsburg, PA, USA, 2009; Volume 3, pp. 1318–1327. [Google Scholar]
- Turney, P.D. Coherent keyphrase extraction via web mining. arXiv, 2003; arXiv:cs/0308033. [Google Scholar]
- Nam, K.S.; Olena, M.; Min-Yen, K.; Timothy, B. Semeval-2010 task 5: Automatic keyphrase extraction from scientific articles. In Proceedings of the 5th International Workshop on Semantic Evaluation, Los Angeles, CA, USA, 15–16 July 2010; Association for Computational Linguistics: Stroudsburg, PA, USA, 2010; pp. 21–26. [Google Scholar]
- Kim, S.N.; Medelyan, O.; Kan, M.Y.; Baldwin, T. Automatic keyphrase extraction from scientific articles. Lang. Resour. Eval. 2013, 47, 723–742. [Google Scholar] [CrossRef]
- Tomokiyo, T.; Hurst, M. A language model approach to keyphrase extraction. In Proceedings of the ACL 2003 Workshop on Multiword Expressions: Analysis, Acquisition and Treatment, Sapporo, Japan, 12 July 2003; Association for Computational Linguistics: Stroudsburg, PA, USA, 2003; Volume 18, pp. 33–40. [Google Scholar]
- Bougouin, A.; Boudin, F.; Daille, B. Topicrank: Graph-based topic ranking for keyphrase extraction. In Proceedings of the International Joint Conference on Natural Language Processing (IJCNLP), Nagoya, Japan, 14–19 October 2013; pp. 543–551. [Google Scholar]
- Mihalcea, R.; Tarau, P. TextRank: Bringing Order into Text. In Proceedings of the 2004 Conference on Empirical Methods in Natural Language Processing, Barcelona, Spain, 25–26 July 2004; Volume 4, pp. 404–411. [Google Scholar]
- Wan, X.; Xiao, J. Single Document Keyphrase Extraction Using Neighborhood Knowledge. In Proceedings of the Twenty-Third AAAI Conference on Artificial Intelligence, Chicago, IL, USA, 13–17 July 2008; Volume 8, pp. 855–860. [Google Scholar]
- Xiaojun, W.; Jianguo, X. CollabRank: towards a collaborative approach to single-document keyphrase extraction. In Proceedings of the 22nd International Conference on Computational Linguistics, Manchester, UK, 18–22 August 2008; Association for Computational Linguistics: Stroudsburg, PA, USA, 2008; Volume 1, pp. 969–976. [Google Scholar]
- Wan, X.; Yang, J.; Xiao, J. Towards an iterative reinforcement approach for simultaneous document summarization and keyword extraction. In Proceedings of the 45th Annual Meeting of the Association of Computational Linguistics, Prague, Czech Republic, 25–27 June 2007; Volume 7, pp. 552–559. [Google Scholar]
- Zha, H. Generic summarization and keyphrase extraction using mutual reinforcement principle and sentence clustering. In Proceedings of the 25th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, Tampere, Finland, 11–15 August 2002; ACM: New York, NY, USA, 2002; pp. 113–120. [Google Scholar]
- Liu, Z.; Li, P.; Zheng, Y.; Sun, M. Clustering to find exemplar terms for keyphrase extraction. In Proceedings of the 2009 Conference on Empirical Methods in Natural Language Processing Volume 1-Volume 1, Singapore, 6–7 August 2009; Association for Computational Linguistics, 2009; pp. 257–266. [Google Scholar]
- Matsuo, Y.; Ishizuka, M. Keyword extraction from a single document using word co-occurrence statistical information. Int. J. Artif. Intell. Tools 2004, 13, 157–169. [Google Scholar] [CrossRef]
- Boudin, F. Pke: An open source python-based keyphrase extraction toolkit. In Proceedings of the COLING 2016, the 26th International Conference on Computational Linguistics: System Demonstrations, Osaka, Japan, 11–16 December 2016; pp. 69–73. [Google Scholar]
- Caragea, C.; Bulgarov, F.A.; Godea, A.; Gollapalli, S.D. Citation-Enhanced Keyphrase Extraction from Research Papers: A Supervised Approach. In Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP), Doha, Qatar, 25–29 October 2014; Volume 14, pp. 1435–1446. [Google Scholar]
- Chuang, J.; Manning, C.D.; Heer, J. “Without the Clutter of Unimportant Words”: Descriptive keyphrases for text visualization. ACM Trans. Comput. Hum. Interact. 2012, 19, 19. [Google Scholar] [CrossRef]
- Lopez, P.; Romary, L. HUMB: Automatic key term extraction from scientific articles in GROBID. In Proceedings of the 5th International Workshop on Semantic Evaluation, Los Angeles, CA, USA, 15–16 July 2010; Association for Computational Linguistics: Stroudsburg, PA, USA, 2010; pp. 248–251. [Google Scholar]
- Medelyan, O.; Witten, I.H. Thesaurus based automatic keyphrase indexing. In Proceedings of the 6th ACM/IEEE-CS Joint Conference on Digital Libraries, Chapel Hill, NC, USA, 11–15 June 2006; ACM: New York, NY, USA, 2006; pp. 296–297. [Google Scholar]
- Nguyen, T.; Kan, M.Y. Keyphrase extraction in scientific publications. In Asian Digital Libraries. Looking Back 10 Years and Forging New Frontiers; Springer: Berlin, Germany, 2007; pp. 317–326. [Google Scholar]
- Barker, K.; Cornacchia, N. Using noun phrase heads to extract document keyphrases. In Proceedings of the Canadian Society for Computational Studies of Intelligence, Montéal, QC, Canada, 14–17 May 2000; pp. 40–52. [Google Scholar]
- Florescu, C.; Caragea, C. PositionRank: An Unsupervised Approach to Keyphrase Extraction from Scholarly Documents. In Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), Vancouver, BC, Canada, 30 July–4 August 2017; Volume 1, pp. 1105–1115. [Google Scholar]
- Liu, F.; Pennell, D.; Liu, F.; Liu, Y. Unsupervised approaches for automatic keyword extraction using meeting transcripts. In Proceedings of the Human Language Technologies: The 2009 Annual Conference of the North American Chapter of the Association for Computational Linguistics, Boulder, CO, USA, 1–3 June 2009; Association for Computational Linguistics: Stroudsburg, PA, USA, 2009; pp. 620–628. [Google Scholar]
- Zhang, Y.; Milios, E.; Zincir-Heywood, N. A comparative study on key phrase extraction methods in automatic web site summarization. J. Dig. Inf. Manag. 2007, 5, 323. [Google Scholar]
- Adar, E.; Datta, S. Building a scientific concept hierarchy database (schbase). In Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing (Volume 1: Long Papers), Beijing, China, 27–31 July 2015; pp. 606–615. [Google Scholar]
- Le, T.T.N.; Le Nguyen, M.; Shimazu, A. Unsupervised Keyphrase Extraction: Introducing New Kinds of Words to Keyphrases. In Proceedings of the Australasian Joint Conference on Artificial Intelligence, Hobart, TAS, Australia, 5–8 December 2016; Springer: Berlin, Germany, 2016; pp. 665–671. [Google Scholar]
- Danesh, S.; Sumner, T.; Martin, J.H. SGRank: Combining Statistical and Graphical Methods to Improve the State of the Art in Unsupervised Keyphrase Extraction. In Proceedings of the Fourth Joint Conference on Lexical and Computational Semantics, Denver, CO, USA, 4–5 June 2015; pp. 117–126. [Google Scholar]
- Gollapalli, S.D.; Caragea, C. Extracting Keyphrases from Research Papers Using Citation Networks. In Proceedings of the Twenty-Eighth AAAI Conference on Artificial Intelligence, Québec City, QC, Canada, 27–31 July 2014; pp. 1629–1635. [Google Scholar]
- Lahiri, S.; Choudhury, S.R.; Caragea, C. Keyword and keyphrase extraction using centrality measures on collocation networks. arXiv, 2014; arXiv:1401.6571. [Google Scholar]
- Martinez-Romo, J.; Araujo, L.; Duque Fernandez, A. SemGraph: Extracting keyphrases following a novel semantic graph-based approach. J. Assoc. Inf. Sci. Technol. 2016, 67, 71–82. [Google Scholar] [CrossRef]
- Wang, R.; Liu, W.; McDonald, C. Corpus-independent generic keyphrase extraction using word embedding vectors. In Proceedings of the Software Engineering Research Conference, Las Vegas, NV, USA, 21–24 July 2014; Volume 39. [Google Scholar]
- Liu, Z.; Huang, W.; Zheng, Y.; Sun, M. Automatic keyphrase extraction via topic decomposition. In Proceedings of the 2010 Conference on Empirical Methods in Natural Language Processing, Honolulu, HI, USA, 25–27 October 2008; Association for Computational Linguistics: Stroudsburg, PA, USA, 2010; pp. 366–376. [Google Scholar]
- Haveliwala, T.H. Topic-sensitive pagerank: A context-sensitive ranking algorithm for web search. IEEE Trans. Knowl. Data Eng. 2003, 15, 784–796. [Google Scholar] [CrossRef]
- Luhn, H.P. A statistical approach to mechanized encoding and searching of literary information. IBM J. Res. Dev. 1957, 1, 309–317. [Google Scholar] [CrossRef]
- Sparck Jones, K. A statistical interpretation of term specificity and its application in retrieval. J. Doc. 1972, 28, 11–21. [Google Scholar] [CrossRef]
- Quran English Translation. 2016. Available online: http://tanzil.net/trans/ (accessed on 21 March 2017).
- Ouda, K. QuranAnalysis: A Semantic Search and Intelligence System for the Quran. Master’s Thesis, University of Leeds, Leeds, UK, 2015. [Google Scholar]
- Marujo, L.; Gershman, A.; Carbonell, J.; Frederking, R.; Neto, J.P. Supervised topical key phrase extraction of news stories using crowdsourcing, light filtering and co-reference normalization. arXiv, 2013; arXiv:1306.4886. [Google Scholar]
- Rijsbergen, C.J.V. Information Retrieval, 2nd ed.; Butterworth-Heinemann: Newton, MA, USA, 1979. [Google Scholar]
- Lewis, D.D. Evaluating and optimizing autonomous text classification systems. In Proceedings of the 18th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, Seattle, WA, USA, 9–13 July 1995; ACM: New York, NY, USA, 1995; pp. 246–254. [Google Scholar]
- Turney, P.D. Extraction of keyphrases from text: Evaluation of four algorithms. arXiv, 2002; arXiv:cs/0212014. [Google Scholar]
- Turpin, A.; Scholer, F. User performance versus precision measures for simple search tasks. In Proceedings of the 29th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, Seattle, WA, USA, 6–10 August 2006; ACM: New York, NY, USA, 2006; pp. 11–18. [Google Scholar]
- Nakagawa, H.; Mori, T. A simple but powerful automatic term extraction method. In Proceedings of the COLING-02 on COMPUTERM 2002: Second International Workshop on Computational Terminology- Volume 14; Association for Computational Linguistics: Stroudsburg, PA, USA, 2002; pp. 1–7. [Google Scholar]
- Wang, X.; McCallum, A.; Wei, X. Topical n-grams: Phrase and topic discovery, with an application to information retrieval. In Proceedings of the Seventh IEEE International Conference on Data Mining, Omaha, NE, USA, 28–31 October 2007; pp. 697–702. [Google Scholar]
Dataset | Domain | Total Number of Docs/Chapters | Avg. Number of Words per Doc/Chapter | Avg. Gold Standard Key Concepts per Doc/Chapter |
---|---|---|---|---|
SemEval-2010 | Scientific papers | 244 | 8021.0 | 15.18 |
Quranic | Religious Book | 114 | 1469.8 | 28.25 |
500N-KPCrowd | News Stories | 500 | 432.73 | 39.9 |
Dataset | KP-Miner (Parameters) | TF-IDF (Parameter) | TopicRank (Parameters) | ||||||
---|---|---|---|---|---|---|---|---|---|
N | Lasf | Cutoff Constant | Sigma (σ) | Alpha (α) | N | N | Similarity Threshold | Clustering Linkage | |
SemEval-2010 | 12 | 3 | 800 | 3 | 2.3 | 14 | 20 | 25% | Average linkage |
Quranic | 14 | 4 | 1000 | 3 | 2.3 | 16 | 18 | 25% | Average linkage |
500N-KPCrowd | 18 | 3 | 500 | 3.0 | 2.3 | 16 | 22 | 25% | Average linkage |
Parameter | Values Range |
---|---|
Sigma (σ) | 2.8, 3.0, 3.2, 3.4, 3.6, 3.8 |
Alpha (α) | 2.2, 2.3, 2.4 |
Lasf | 2–5 |
Cutoff | 400, 600, 800, 1000, 1200, 1400 |
Clustering Linkage | Single, Complete, Average |
N | <25 |
Dataset | TF-IDF | KP-Miner | TopicRank | |||||||
---|---|---|---|---|---|---|---|---|---|---|
N | Precision | Recall | F Score | Precision | Recall | F Score | Precision | Recall | F Score | |
SemEval-2010 | 2 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 |
4 | 31.25 | 7.14 | 11.62 | 37.50 | 8.57 | 13.95 | 31.25 | 7.14 | 11.62 | |
6 | 25.00 | 8.57 | 12.76 | 33.33 | 11.43 | 17.02 | 29.17 | 10.00 | 14.89 | |
8 | 21.88 | 10.00 | 13.73 | 31.25 | 14.29 | 19.61 | 25.00 | 11.43 | 15.69 | |
10 | 25.00 | 14.29 | 18.19 | 30.00 | 17.14 | 21.82 | 22.50 | 12.86 | 16.37 | |
12 | 25.00 | 17.14 | 20.34 | 33.33 | 22.86 | 27.12 | 22.92 | 15.71 | 18.64 | |
14 | 23.21 | 18.57 | 20.63 | 30.36 | 24.29 | 26.99 | 23.21 | 18.57 | 20.63 | |
16 | 20.31 | 18.57 | 19.40 | 28.13 | 25.71 | 26.87 | 21.88 | 20.00 | 20.90 | |
18 | 19.44 | 20.00 | 19.72 | 26.39 | 27.14 | 26.76 | 20.83 | 21.43 | 21.13 | |
20 | 17.50 | 20.00 | 18.67 | 25.00 | 28.57 | 26.67 | 20.00 | 22.86 | 21.33 | |
Quranic | 2 | 100.00 | 3.92 | 7.55 | 100.00 | 3.92 | 7.55 | 50.00 | 1.96 | 3.77 |
4 | 75.00 | 5.88 | 10.91 | 75.00 | 5.88 | 10.91 | 50.00 | 3.92 | 7.27 | |
6 | 50.00 | 5.88 | 10.53 | 50.00 | 5.88 | 10.53 | 50.00 | 5.88 | 10.53 | |
8 | 37.50 | 5.88 | 10.17 | 37.50 | 5.88 | 10.17 | 37.50 | 5.88 | 10.17 | |
10 | 40.00 | 7.84 | 13.11 | 40.00 | 7.84 | 13.11 | 30.00 | 5.88 | 9.84 | |
12 | 41.67 | 9.80 | 15.87 | 41.67 | 9.80 | 15.87 | 25.00 | 5.88 | 9.52 | |
14 | 35.71 | 9.80 | 15.38 | 42.86 | 11.76 | 18.46 | 28.57 | 7.84 | 12.31 | |
16 | 37.50 | 11.76 | 17.91 | 37.50 | 11.76 | 17.91 | 37.50 | 11.76 | 17.91 | |
18 | 33.33 | 11.76 | 17.39 | 33.33 | 11.76 | 17.39 | 38.89 | 13.73 | 20.29 | |
20 | 30.00 | 11.76 | 16.90 | 30.00 | 11.76 | 16.90 | 35.00 | 13.73 | 19.72 | |
500N-KPCrowd | 2 | 37.50 | 3.41 | 6.25 | 37.50 | 3.41 | 6.25 | 50.00 | 4.55 | 8.33 |
4 | 37.50 | 6.82 | 11.54 | 43.75 | 7.95 | 13.46 | 50.00 | 9.09 | 15.38 | |
6 | 33.33 | 9.09 | 14.29 | 33.33 | 9.09 | 14.29 | 45.83 | 12.50 | 19.64 | |
8 | 31.25 | 11.36 | 16.67 | 34.38 | 12.50 | 18.33 | 40.63 | 14.77 | 21.67 | |
10 | 30.00 | 13.64 | 18.75 | 30.00 | 13.64 | 18.75 | 35.00 | 15.91 | 21.88 | |
12 | 29.17 | 15.91 | 20.59 | 29.17 | 15.91 | 20.59 | 29.17 | 15.91 | 20.59 | |
14 | 26.79 | 17.05 | 20.83 | 30.36 | 13.64 | 18.82 | 26.79 | 17.05 | 20.83 | |
16 | 29.69 | 21.59 | 25.00 | 28.13 | 20.45 | 23.68 | 25.00 | 18.18 | 21.05 | |
18 | 26.39 | 21.59 | 23.75 | 27.78 | 22.73 | 25.00 | 27.78 | 22.73 | 25.00 | |
20 | 25.00 | 22.73 | 23.81 | 26.25 | 23.86 | 25.00 | 26.25 | 23.86 | 25.00 |
Method | AP (%) SemEval-2010 | AP (%) Quranic | AP (%) 500N-KPCrowd |
---|---|---|---|
KP-Miner | 30.59 | 60 | 32.06 |
TopicRank | 24.08 | 42.49 | 35.34 |
TF-IDF | 24.4 | 58.83 | 30.41 |
Method | F-Measure (%) SemEval-2010 | F-Measure (%) Quranic | F-Measure (%) 500N-KPCrowd |
---|---|---|---|
KP-Miner | 27.12 | 18.46 | 25 |
TopicRank | 21.33 | 20.29 | 26.14 |
TF-IDF | 20.63 | 17.91 | 25 |
Algorithm | Total False Positives | Error Source | 95% Confidence Interval (%) | Type |
---|---|---|---|---|
TF-IDF | 1175 | Frequency errors | Precision errors | |
Syntactical errors | ||||
KP-Miner | 1110 | Frequency errors | ||
Syntactical errors | ||||
Topic Rank | 1135 | Semantical errors | Recall errors |
© 2018 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).
Aman, M.; Bin Md Said, A.; Jadid Abdul Kadir, S.; Ullah, I. Key Concept Identification: A Comprehensive Analysis of Frequency and Topical Graph-Based Approaches. Information 2018, 9, 128. https://doi.org/10.3390/info9050128