
Natural Language Processing and Data Mining

A special issue of Entropy (ISSN 1099-4300). This special issue belongs to the section "Signal and Data Analysis".

Deadline for manuscript submissions: closed (30 November 2024) | Viewed by 5423

Special Issue Editor


Dr. Manling Li
Guest Editor
Computer Science Department, Northwestern University, Evanston, IL 60208, USA
Interests: natural language processing; multimedia information extraction

Special Issue Information

Dear Colleagues,

Natural language processing (NLP) and data mining are two rapidly advancing, synergistic fields in which concepts from entropy, information theory, and related areas find broad application. Entropy is calling for original research submissions for a Special Issue highlighting recent innovations and advances. We invite research covering novel techniques, studies, methodologies, and technologies that integrate NLP and data mining theories, models, and algorithms. Potential topics include, but are not limited to, the following:

  • using NLP techniques to extract and structure data from unstructured text for mining;
  • enhancing the discovery of knowledge and patterns from text using data mining;
  • multimodal data mining leveraging linguistic cues and rules;
  • studies evaluating the effectiveness of different NLP and data mining integration approaches;
  • broader applications empowered by both capabilities, such as sentiment analysis, recommendation systems, question answering, and decision-making systems.

Both theoretical contributions and empirical studies on real-world datasets are within scope. The purpose of this Special Issue is to highlight high-quality, impactful research that advances NLP and data mining capabilities through the application of entropy, information theory, and related concepts.

Dr. Manling Li
Guest Editor

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to the website. Once registered, proceed to the submission form. Manuscripts can be submitted until the deadline. All submissions that pass pre-check are peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the Special Issue website. Research articles, review articles, and short communications are invited. For planned papers, a title and short abstract (about 100 words) can be sent to the Editorial Office for announcement on this website.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. Entropy is an international peer-reviewed open access monthly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 2600 CHF (Swiss Francs). Submitted papers should be well formatted and use good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Keywords

  • natural language processing
  • data mining
  • information theory

Benefits of Publishing in a Special Issue

  • Ease of navigation: Grouping papers by topic helps scholars navigate broad-scope journals more efficiently.
  • Greater discoverability: Special Issues support the reach and impact of scientific research. Articles in Special Issues are more discoverable and cited more frequently.
  • Expansion of research network: Special Issues facilitate connections among authors, fostering scientific collaborations.
  • External promotion: Articles in Special Issues are often promoted through the journal's social media, increasing their visibility.
  • e-Book format: Special Issues with more than 10 articles can be published as dedicated e-books, ensuring wide and rapid dissemination.

Further information on MDPI's Special Issue policies is available on the MDPI website.

Published Papers (4 papers)

Research

14 pages, 605 KiB  
Article
A Hierarchical Multi-Task Learning Framework for Semantic Annotation in Tabular Data
by Jie Wu and Mengshu Hou
Entropy 2024, 26(8), 664; https://doi.org/10.3390/e26080664 - 4 Aug 2024
Viewed by 1143
Abstract
To optimize the utilization and analysis of tables, it is essential to recognize and understand their semantics comprehensively. This requirement is especially critical given that many tables lack explicit annotations, necessitating the identification of column types and inter-column relationships. Such identification can significantly augment data quality, streamline data integration, and support data analysis and mining. Current table annotation models often address each subtask independently, which may result in the neglect of constraints and contextual information, causing relational ambiguities and inference errors. To address this issue, we propose a unified multi-task learning framework capable of concurrently handling multiple tasks within a single model, including column named entity recognition, column type identification, and inter-column relationship detection. By integrating these tasks, the framework exploits their interrelations, facilitating the exchange of shallow features and the sharing of representations. Their cooperation enables each task to leverage insights from the others, thereby improving the performance of individual subtasks and enhancing the model's overall generalization capabilities. Notably, our model is designed to employ only the internal information of tabular data, avoiding reliance on external context or knowledge graphs. This design ensures robust performance even with limited input information. Extensive experiments demonstrate the superior performance of our model across various tasks, validating the effectiveness of the unified multi-task learning framework in the recognition and comprehension of table semantics.
(This article belongs to the Special Issue Natural Language Processing and Data Mining)
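
A rough sketch may help make the shared-encoder, multi-head design described above concrete. The PyTorch module below is purely illustrative: the encoder depth, hidden size, label counts, and pairwise relation scoring are all assumptions, not the authors' implementation.

    import torch
    import torch.nn as nn

    class MultiTaskTableAnnotator(nn.Module):
        """Illustrative shared-encoder model with three task-specific heads."""

        def __init__(self, hidden=256, n_ner_tags=9, n_col_types=78, n_relations=108):
            super().__init__()
            # Shared encoder: all three tasks read the same column
            # representations, so shallow features are exchanged implicitly.
            self.encoder = nn.TransformerEncoder(
                nn.TransformerEncoderLayer(d_model=hidden, nhead=8, batch_first=True),
                num_layers=2,
            )
            self.ner_head = nn.Linear(hidden, n_ner_tags)       # column named entities
            self.type_head = nn.Linear(hidden, n_col_types)     # column type identification
            self.rel_head = nn.Linear(2 * hidden, n_relations)  # inter-column relations

        def forward(self, col_embeddings):
            # col_embeddings: (batch, n_cols, hidden), one vector per column,
            # built from the table's internal content only (no external KG).
            h = self.encoder(col_embeddings)
            ner_logits = self.ner_head(h)
            type_logits = self.type_head(h)
            # Pair every column with every other column for relation detection.
            n = h.size(1)
            pairs = torch.cat(
                [h.unsqueeze(2).expand(-1, -1, n, -1),
                 h.unsqueeze(1).expand(-1, n, -1, -1)], dim=-1)
            rel_logits = self.rel_head(pairs)
            return ner_logits, type_logits, rel_logits

Training such a model would typically sum the three task losses, which is how the tasks come to share representations and constrain one another.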

15 pages, 3929 KiB  
Article
Coreference Resolution Based on High-Dimensional Multi-Scale Information
by Yu Wang, Zenghui Ding, Tao Wang, Shu Xu, Xianjun Yang and Yining Sun
Entropy 2024, 26(6), 529; https://doi.org/10.3390/e26060529 - 19 Jun 2024
Cited by 1 | Viewed by 816
Abstract
Coreference resolution is a key task in natural language processing. It is difficult to evaluate the similarity of long-span texts, which makes text-level encoding challenging. This paper first compares how commonly used methods for improving a model's global information collection affect BERT encoding performance. Based on this, a multi-scale context information module is designed to improve the applicability of the BERT encoding model under different text spans. In addition, linear separability is improved through dimension expansion. Finally, cross-entropy loss is used as the loss function. After the module designed in this article was added to BERT and SpanBERT, F1 increased by 0.5% and 0.2%, respectively.
(This article belongs to the Special Issue Natural Language Processing and Data Mining)
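
One way to picture the multi-scale context idea is a bank of convolutions with different receptive fields over the BERT token states, followed by a dimension-expanding projection. The sketch below is an assumption-laden illustration (the kernel sizes, expansion factor, and module name are invented for exposition), not the paper's architecture.

    import torch
    import torch.nn as nn

    class MultiScaleContext(nn.Module):
        """Illustrative multi-scale context module over BERT token states."""

        def __init__(self, hidden=768, scales=(3, 7, 15), expand=2):
            super().__init__()
            # One convolution per scale: small kernels capture local mention
            # context, larger kernels capture long-span context.
            self.convs = nn.ModuleList(
                nn.Conv1d(hidden, hidden, k, padding=k // 2) for k in scales
            )
            # Dimension expansion: projecting into a higher-dimensional space
            # can make mention pairs easier to separate linearly.
            self.expand = nn.Linear(hidden * (len(scales) + 1), hidden * expand)

        def forward(self, token_states):
            # token_states: (batch, seq_len, hidden) from a BERT-style encoder.
            x = token_states.transpose(1, 2)              # (batch, hidden, seq_len)
            feats = [token_states] + [c(x).transpose(1, 2) for c in self.convs]
            return self.expand(torch.cat(feats, dim=-1))  # (batch, seq_len, hidden*expand)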

13 pages, 382 KiB  
Article
DiffFSRE: Diffusion-Enhanced Prototypical Network for Few-Shot Relation Extraction
by Yang Chen and Bowen Shi
Entropy 2024, 26(5), 352; https://doi.org/10.3390/e26050352 - 23 Apr 2024
Viewed by 1424
Abstract
Supervised learning methods excel in traditional relation extraction tasks. However, the quality and scale of the training data heavily influence their performance. Few-shot relation extraction, whose objective is to learn and extract semantic relationships between entities from only a limited number of annotated samples, is gradually becoming a research hotspot. In recent years, numerous studies have employed prototypical networks for few-shot relation extraction. However, these methods often suffer from overfitting of the relation classes, making it challenging to generalize effectively to new relationships. Therefore, this paper seeks to utilize a diffusion model for data augmentation to address the overfitting issue of prototypical networks. We propose a diffusion-model-enhanced prototypical network framework. Specifically, we design and train a controllable conditional relation generation diffusion model on the relation extraction dataset, which can generate the corresponding instance representation according to the relation description. Building upon the trained diffusion model, we further present a pseudo-sample-enhanced prototypical network, which is able to provide more accurate representations for prototype classes, thereby alleviating overfitting and generalizing better to unseen relation classes. Additionally, we introduce a pseudo-sample-aware attention mechanism to enhance the model's adaptability to pseudo-sample data through a cross-entropy loss, further improving the model's performance. A series of experiments are conducted to demonstrate our method's effectiveness. The results indicate that our proposed approach significantly outperforms existing methods, particularly in low-resource one-shot environments. Further ablation analyses underscore the necessity of each module in the model. To the best of our knowledge, this is the first work to employ a diffusion model to enhance the prototypical network through data augmentation in few-shot relation extraction.
(This article belongs to the Special Issue Natural Language Processing and Data Mining)
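
The core prototypical-network step, augmented with diffusion-generated pseudo-samples, can be sketched in a few lines. Everything below is a hypothetical rendering: the attention form, tensor shapes, and the idea of simply concatenating pseudo-samples into the support set are assumptions standing in for the paper's pseudo-sample-aware attention mechanism.

    import torch
    import torch.nn.functional as F

    def pseudo_enhanced_prototypes(support, pseudo, queries):
        """
        support: (n_way, k_shot, d) embeddings of real support instances
        pseudo:  (n_way, m, d)      embeddings sampled from a (hypothetical)
                                    relation-conditioned diffusion model
        queries: (n_query, d)       query instance embeddings
        Returns logits over the n_way relation classes.
        """
        # Pool real and pseudo instances; a simple dot-product attention over
        # instances stands in for the pseudo-sample-aware attention mechanism.
        instances = torch.cat([support, pseudo], dim=1)       # (n_way, k+m, d)
        centroid = instances.mean(dim=1, keepdim=True)        # (n_way, 1, d)
        attn = F.softmax((instances * centroid).sum(-1), -1)  # (n_way, k+m)
        prototypes = (attn.unsqueeze(-1) * instances).sum(1)  # (n_way, d)
        # Negative squared Euclidean distance serves as the class logits,
        # to which a cross-entropy loss can be applied during training.
        return -torch.cdist(queries, prototypes) ** 2         # (n_query, n_way)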

14 pages, 652 KiB  
Article
Enhanced Heterogeneous Graph Attention Network with a Novel Multilabel Focal Loss for Document-Level Relation Extraction
by Yang Chen and Bowen Shi
Entropy 2024, 26(3), 210; https://doi.org/10.3390/e26030210 - 28 Feb 2024
Cited by 2 | Viewed by 1459
Abstract
Recent years have seen a rise in interest in document-level relation extraction, which is defined as extracting all relations between entities in multiple sentences of a document. Typically, there are multiple mentions corresponding to a single entity in this context. Previous research predominantly employed a holistic representation for each entity to predict relations, but this approach often overlooks valuable information contained in fine-grained entity mentions. We contend that relation prediction and inference should be grounded in specific entity mentions rather than abstract entity concepts. To address this, our paper proposes a two-stage mention-level framework based on an enhanced heterogeneous graph attention network for document-level relation extraction. Our framework employs two different strategies to model intra-sentential and inter-sentential relations between fine-grained entity mentions, yielding local mention representations for intra-sentential relation prediction and global mention representations for inter-sentential relation prediction. For inter-sentential relation prediction and inference, we propose an enhanced heterogeneous graph attention network to better model the long-distance semantic relationships and design an entity-coreference path-based inference strategy to conduct relation inference. Moreover, we introduce a novel cross-entropy-based multilabel focal loss function to address the class imbalance problem and multilabel prediction simultaneously. Comprehensive experiments have been conducted to verify the effectiveness of our framework. Experimental results show that our approach significantly outperforms the existing methods.
(This article belongs to the Special Issue Natural Language Processing and Data Mining)
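
The multilabel focal loss the abstract mentions can be sketched by combining per-relation binary cross-entropy with the standard focal down-weighting term. The version below follows the generic focal-loss recipe (the gamma and alpha values are assumptions), not necessarily the paper's exact formulation.

    import torch
    import torch.nn.functional as F

    def multilabel_focal_loss(logits, targets, gamma=2.0, alpha=0.25):
        """
        logits:  (batch, n_relations) raw scores, one sigmoid per relation
        targets: (batch, n_relations) multi-hot gold labels (float)
        """
        # Per-label binary cross-entropy: each relation is an independent
        # yes/no decision, which handles multilabel prediction directly.
        bce = F.binary_cross_entropy_with_logits(logits, targets, reduction="none")
        p = torch.sigmoid(logits)
        p_t = targets * p + (1 - targets) * (1 - p)             # prob. of the gold label
        alpha_t = targets * alpha + (1 - targets) * (1 - alpha)
        # The focal term (1 - p_t)^gamma down-weights easy, well-classified
        # labels, mitigating the class imbalance in document-level RE.
        return (alpha_t * (1 - p_t) ** gamma * bce).mean()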
