Tensor-Based Semantically-Aware Topic Clustering of Biomedical Documents
Abstract: Biomedicine is a pillar of the collective, scientific effort of human self-discovery, as well as a major source of humanistic data codified primarily in biomedical documents. Despite their rigid structure, maintaining and updating a sizeable collection of such documents is an overwhelmingly complex task, mandating efficient information retrieval through the integration of clustering schemes. These schemes should work natively with inherently multidimensional data and higher-order interdependencies. Additionally, past experience indicates that clustering should be semantically enhanced. Tensor algebra is the key to extending the current term-document model to more dimensions. This article proposes an alternative keyword-term-document strategy whose algorithmic cornerstones are third-order tensors and MeSH ontological functions. The strategy is based on the scientometric observation that keywords typically possess more expressive power than ordinary text terms. It has been compared against a baseline using two different biomedical datasets: the TREC (Text REtrieval Conference) genomics benchmark and a large custom set of cognitive science articles from PubMed.
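To make the keyword-term-document idea concrete, the following is a minimal sketch of how a third-order count tensor might be assembled from a toy collection. It is not the article's actual construction: the documents, the `build_tensor` helper, and the raw-count weighting are all illustrative assumptions, and no MeSH ontological functions are applied.

```python
from collections import Counter

def build_tensor(docs):
    """Hypothetical helper: build a keyword x term x document count tensor.

    tensor[k][t][d] holds the number of times term t occurs in document d
    if d is tagged with keyword k, and 0 otherwise (assumed weighting).
    """
    keywords = sorted({k for doc in docs for k in doc["keywords"]})
    terms = sorted({t for doc in docs for t in doc["text"].lower().split()})
    k_idx = {k: i for i, k in enumerate(keywords)}
    t_idx = {t: i for i, t in enumerate(terms)}

    # Nested lists stand in for a dense third-order tensor.
    tensor = [[[0] * len(docs) for _ in terms] for _ in keywords]
    for d, doc in enumerate(docs):
        counts = Counter(doc["text"].lower().split())
        for k in doc["keywords"]:
            for t, c in counts.items():
                tensor[k_idx[k]][t_idx[t]][d] = c
    return tensor, keywords, terms

# Toy data, invented for illustration only.
docs = [
    {"keywords": ["genomics"], "text": "gene expression gene"},
    {"keywords": ["genomics", "memory"], "text": "memory gene"},
]
tensor, keywords, terms = build_tensor(docs)

# Fixing the keyword index recovers an ordinary term-document matrix.
genomics_slice = tensor[keywords.index("genomics")]
```

Each keyword slice is a conventional term-document matrix restricted to the documents carrying that keyword, which is one way to read the claim that the tensor extends the term-document model by a dimension.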
Drakopoulos, G.; Kanavos, A.; Karydis, I.; Sioutas, S.; G. Vrahatis, A. Tensor-Based Semantically-Aware Topic Clustering of Biomedical Documents. Computation 2017, 5, 34.
Drakopoulos G, Kanavos A, Karydis I, Sioutas S, G. Vrahatis A. Tensor-Based Semantically-Aware Topic Clustering of Biomedical Documents. Computation. 2017; 5(3):34.
Drakopoulos, Georgios; Kanavos, Andreas; Karydis, Ioannis; Sioutas, Spyros; G. Vrahatis, Aristidis. 2017. "Tensor-Based Semantically-Aware Topic Clustering of Biomedical Documents." Computation 5, no. 3: 34.
Note that from the first issue of 2016, MDPI journals use article numbers instead of page numbers.