
Natural Language Processing and Data Mining

A special issue of Entropy (ISSN 1099-4300). This special issue belongs to the section "Signal and Data Analysis".

Deadline for manuscript submissions: closed (30 November 2024) | Viewed by 5423

Special Issue Editor


Dr. Manling Li
Guest Editor
Computer Science Department, Northwestern University, Evanston, IL 60208, USA
Interests: natural language processing; multimedia information extraction

Special Issue Information

Dear Colleagues,

Natural language processing (NLP) and data mining are two rapidly advancing, synergistic fields in which concepts from entropy, information theory, and related areas find broad application. Entropy is calling for original research submissions for a Special Issue highlighting recent innovations and advances. We invite research covering novel techniques, studies, methodologies, and technologies that integrate NLP and data mining theories, models, and algorithms. Potential topics include, but are not limited to, the following:

  • using NLP techniques to extract and structure data from unstructured text for mining;
  • enhancing the discovery of knowledge and patterns from text using data mining;
  • multimodal data mining leveraging linguistic cues and rules;
  • studies evaluating the effectiveness of different NLP and data mining integration approaches;
  • broader applications empowered by both capabilities, such as sentiment analysis, recommendation systems, question answering, and decision-making systems.

Both theoretical contributions and empirical studies on real-world datasets are within scope. The purpose of this Special Issue is to highlight high-quality, impactful research that advances NLP and data mining capabilities through the application of entropy, information theory, and related concepts.

Dr. Manling Li
Guest Editor

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to the website. Once registered, proceed to the submission form. Manuscripts can be submitted until the deadline. All submissions that pass pre-check are peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the Special Issue website. Research articles, review articles, and short communications are invited. For planned papers, a title and short abstract (about 100 words) can be sent to the Editorial Office for announcement on this website.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. Entropy is an international peer-reviewed open access monthly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 2600 CHF (Swiss Francs). Submitted papers should be well formatted and use good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Keywords

  • natural language processing
  • data mining
  • information theory

Benefits of Publishing in a Special Issue

  • Ease of navigation: Grouping papers by topic helps scholars navigate broad-scope journals more efficiently.
  • Greater discoverability: Special Issues support the reach and impact of scientific research. Articles in Special Issues are more discoverable and cited more frequently.
  • Expansion of research network: Special Issues facilitate connections among authors, fostering scientific collaborations.
  • External promotion: Articles in Special Issues are often promoted through the journal's social media, increasing their visibility.
  • e-Book format: Special Issues with more than 10 articles can be published as dedicated e-books, ensuring wide and rapid dissemination.

Further information on MDPI's Special Issue policies is available on the MDPI website.

Published Papers (4 papers)

Research

14 pages, 605 KiB  
Article
A Hierarchical Multi-Task Learning Framework for Semantic Annotation in Tabular Data
by Jie Wu and Mengshu Hou
Entropy 2024, 26(8), 664; https://doi.org/10.3390/e26080664 - 4 Aug 2024
Viewed by 1143
Abstract
To optimize the utilization and analysis of tables, it is essential to recognize and understand their semantics comprehensively. This requirement is especially critical given that many tables lack explicit annotations, necessitating the identification of column types and inter-column relationships. Such identification can significantly augment data quality, streamline data integration, and support data analysis and mining. Current table annotation models often address each subtask independently, which may result in the neglect of constraints and contextual information, causing relational ambiguities and inference errors. To address this issue, we propose a unified multi-task learning framework capable of concurrently handling multiple tasks within a single model, including column named entity recognition, column type identification, and inter-column relationship detection. By integrating these tasks, the framework exploits their interrelations, facilitating the exchange of shallow features and the sharing of representations. Their cooperation enables each task to leverage insights from the others, thereby improving the performance of individual subtasks and enhancing the model's overall generalization capabilities. Notably, our model is designed to employ only the internal information of tabular data, avoiding reliance on external context or knowledge graphs. This design ensures robust performance even with limited input information. Extensive experiments demonstrate the superior performance of our model across various tasks, validating the effectiveness of the unified multi-task learning framework in the recognition and comprehension of table semantics.
(This article belongs to the Special Issue Natural Language Processing and Data Mining)
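
A rough sketch may help make the shared-encoder, multi-head design described above concrete. The PyTorch module below is purely illustrative: the encoder depth, hidden size, label counts, and pairwise relation scoring are all assumptions, not the authors' implementation.

    import torch
    import torch.nn as nn

    class MultiTaskTableAnnotator(nn.Module):
        """Illustrative shared-encoder model with three task-specific heads."""

        def __init__(self, hidden=256, n_ner_tags=9, n_col_types=78, n_relations=108):
            super().__init__()
            # Shared encoder: all three tasks read the same column
            # representations, so shallow features are exchanged implicitly.
            self.encoder = nn.TransformerEncoder(
                nn.TransformerEncoderLayer(d_model=hidden, nhead=8, batch_first=True),
                num_layers=2,
            )
            self.ner_head = nn.Linear(hidden, n_ner_tags)       # column named entities
            self.type_head = nn.Linear(hidden, n_col_types)     # column type identification
            self.rel_head = nn.Linear(2 * hidden, n_relations)  # inter-column relations

        def forward(self, col_embeddings):
            # col_embeddings: (batch, n_cols, hidden), one vector per column,
            # built from the table's internal content only (no external KG).
            h = self.encoder(col_embeddings)
            ner_logits = self.ner_head(h)
            type_logits = self.type_head(h)
            # Pair every column with every other column for relation detection.
            n = h.size(1)
            pairs = torch.cat(
                [h.unsqueeze(2).expand(-1, -1, n, -1),
                 h.unsqueeze(1).expand(-1, n, -1, -1)], dim=-1)
            rel_logits = self.rel_head(pairs)
            return ner_logits, type_logits, rel_logits

Training such a model would typically sum the three task losses, which is how the tasks come to share representations and constrain one another.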

15 pages, 3929 KiB  
Article
Coreference Resolution Based on High-Dimensional Multi-Scale Information
by Yu Wang, Zenghui Ding, Tao Wang, Shu Xu, Xianjun Yang and Yining Sun
Entropy 2024, 26(6), 529; https://doi.org/10.3390/e26060529 - 19 Jun 2024
Cited by 1 | Viewed by 816
Abstract
Coreference resolution is a key task in natural language processing. It is difficult to evaluate the similarity of long-span texts, which makes text-level encoding challenging. This paper first compares how commonly used methods for improving a model's global information collection affect BERT encoding performance. Based on this, a multi-scale context information module is designed to improve the applicability of the BERT encoding model under different text spans. In addition, linear separability is improved through dimension expansion. Finally, cross-entropy loss is used as the loss function. After the module designed in this article was added to BERT and SpanBERT, F1 increased by 0.5% and 0.2%, respectively.
(This article belongs to the Special Issue Natural Language Processing and Data Mining)
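
One way to picture the multi-scale context idea is a bank of convolutions with different receptive fields over the BERT token states, followed by a dimension-expanding projection. The sketch below is an assumption-laden illustration (the kernel sizes, expansion factor, and module name are invented for exposition), not the paper's architecture.

    import torch
    import torch.nn as nn

    class MultiScaleContext(nn.Module):
        """Illustrative multi-scale context module over BERT token states."""

        def __init__(self, hidden=768, scales=(3, 7, 15), expand=2):
            super().__init__()
            # One convolution per scale: small kernels capture local mention
            # context, larger kernels capture long-span context.
            self.convs = nn.ModuleList(
                nn.Conv1d(hidden, hidden, k, padding=k // 2) for k in scales
            )
            # Dimension expansion: projecting into a higher-dimensional space
            # can make mention pairs easier to separate linearly.
            self.expand = nn.Linear(hidden * (len(scales) + 1), hidden * expand)

        def forward(self, token_states):
            # token_states: (batch, seq_len, hidden) from a BERT-style encoder.
            x = token_states.transpose(1, 2)              # (batch, hidden, seq_len)
            feats = [token_states] + [c(x).transpose(1, 2) for c in self.convs]
            return self.expand(torch.cat(feats, dim=-1))  # (batch, seq_len, hidden*expand)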

13 pages, 382 KiB  
Article
DiffFSRE: Diffusion-Enhanced Prototypical Network for Few-Shot Relation Extraction
by Yang Chen and Bowen Shi
Entropy 2024, 26(5), 352; https://doi.org/10.3390/e26050352 - 23 Apr 2024
Viewed by 1424
Abstract
Supervised learning methods excel in traditional relation extraction tasks. However, the quality and scale of the training data heavily influence their performance. Few-shot relation extraction, whose objective is to learn and extract semantic relationships between entities from only a limited number of annotated samples, is gradually becoming a research hotspot. In recent years, numerous studies have employed prototypical networks for few-shot relation extraction. However, these methods often suffer from overfitting of the relation classes, making it challenging to generalize effectively to new relationships. Therefore, this paper seeks to utilize a diffusion model for data augmentation to address the overfitting issue of prototypical networks. We propose a diffusion-model-enhanced prototypical network framework. Specifically, we design and train a controllable conditional relation generation diffusion model on the relation extraction dataset, which can generate the corresponding instance representation according to the relation description. Building upon the trained diffusion model, we further present a pseudo-sample-enhanced prototypical network, which is able to provide more accurate representations for prototype classes, thereby alleviating overfitting and generalizing better to unseen relation classes. Additionally, we introduce a pseudo-sample-aware attention mechanism to enhance the model's adaptability to pseudo-sample data through a cross-entropy loss, further improving the model's performance. A series of experiments are conducted to demonstrate our method's effectiveness. The results indicate that our proposed approach significantly outperforms existing methods, particularly in low-resource one-shot environments. Further ablation analyses underscore the necessity of each module in the model. To the best of our knowledge, this is the first work to employ a diffusion model to enhance the prototypical network through data augmentation in few-shot relation extraction.
(This article belongs to the Special Issue Natural Language Processing and Data Mining)
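
The core prototypical-network step, augmented with diffusion-generated pseudo-samples, can be sketched in a few lines. Everything below is a hypothetical rendering: the attention form, tensor shapes, and the idea of simply concatenating pseudo-samples into the support set are assumptions standing in for the paper's pseudo-sample-aware attention mechanism.

    import torch
    import torch.nn.functional as F

    def pseudo_enhanced_prototypes(support, pseudo, queries):
        """
        support: (n_way, k_shot, d) embeddings of real support instances
        pseudo:  (n_way, m, d)      embeddings sampled from a (hypothetical)
                                    relation-conditioned diffusion model
        queries: (n_query, d)       query instance embeddings
        Returns logits over the n_way relation classes.
        """
        # Pool real and pseudo instances; a simple dot-product attention over
        # instances stands in for the pseudo-sample-aware attention mechanism.
        instances = torch.cat([support, pseudo], dim=1)       # (n_way, k+m, d)
        centroid = instances.mean(dim=1, keepdim=True)        # (n_way, 1, d)
        attn = F.softmax((instances * centroid).sum(-1), -1)  # (n_way, k+m)
        prototypes = (attn.unsqueeze(-1) * instances).sum(1)  # (n_way, d)
        # Negative squared Euclidean distance serves as the class logits,
        # to which a cross-entropy loss can be applied during training.
        return -torch.cdist(queries, prototypes) ** 2         # (n_query, n_way)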

14 pages, 652 KiB  
Article
Enhanced Heterogeneous Graph Attention Network with a Novel Multilabel Focal Loss for Document-Level Relation Extraction
by Yang Chen and Bowen Shi
Entropy 2024, 26(3), 210; https://doi.org/10.3390/e26030210 - 28 Feb 2024
Cited by 2 | Viewed by 1459
Abstract
Recent years have seen a rise in interest in document-level relation extraction, which is defined as extracting all relations between entities in multiple sentences of a document. Typically, there are multiple mentions corresponding to a single entity in this context. Previous research predominantly employed a holistic representation for each entity to predict relations, but this approach often overlooks valuable information contained in fine-grained entity mentions. We contend that relation prediction and inference should be grounded in specific entity mentions rather than abstract entity concepts. To address this, our paper proposes a two-stage mention-level framework based on an enhanced heterogeneous graph attention network for document-level relation extraction. Our framework employs two different strategies to model intra-sentential and inter-sentential relations between fine-grained entity mentions, yielding local mention representations for intra-sentential relation prediction and global mention representations for inter-sentential relation prediction. For inter-sentential relation prediction and inference, we propose an enhanced heterogeneous graph attention network to better model the long-distance semantic relationships and design an entity-coreference path-based inference strategy to conduct relation inference. Moreover, we introduce a novel cross-entropy-based multilabel focal loss function to address the class imbalance problem and multilabel prediction simultaneously. Comprehensive experiments have been conducted to verify the effectiveness of our framework. Experimental results show that our approach significantly outperforms the existing methods.
(This article belongs to the Special Issue Natural Language Processing and Data Mining)
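
The multilabel focal loss the abstract mentions can be sketched by combining per-relation binary cross-entropy with the standard focal down-weighting term. The version below follows the generic focal-loss recipe (the gamma and alpha values are assumptions), not necessarily the paper's exact formulation.

    import torch
    import torch.nn.functional as F

    def multilabel_focal_loss(logits, targets, gamma=2.0, alpha=0.25):
        """
        logits:  (batch, n_relations) raw scores, one sigmoid per relation
        targets: (batch, n_relations) multi-hot gold labels (float)
        """
        # Per-label binary cross-entropy: each relation is an independent
        # yes/no decision, which handles multilabel prediction directly.
        bce = F.binary_cross_entropy_with_logits(logits, targets, reduction="none")
        p = torch.sigmoid(logits)
        p_t = targets * p + (1 - targets) * (1 - p)             # prob. of the gold label
        alpha_t = targets * alpha + (1 - targets) * (1 - alpha)
        # The focal term (1 - p_t)^gamma down-weights easy, well-classified
        # labels, mitigating the class imbalance in document-level RE.
        return (alpha_t * (1 - p_t) ** gamma * bce).mean()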
