Data Mining in Natural Language Processing: Latest Advances and Prospects

A special issue of Electronics (ISSN 2079-9292). This special issue belongs to the section "Artificial Intelligence".

Deadline for manuscript submissions: 15 June 2026 | Viewed by 1434

Special Issue Editor


Guest Editor
John Jay College, The City University of New York, New York, NY 10019, USA
Interests: artificial intelligence; data science; machine learning; human behavior analysis

Special Issue Information

Dear Colleagues,

The Special Issue of Electronics, “Data Mining in Natural Language Processing: Latest Advances and Prospects”, explores the latest advances in data mining applied to natural language processing and their broader implications. We invite contributions that propose new real-world databases; advance the current state of the art; introduce new methodologies and innovative approaches for processing and analyzing complex data; and extract valuable information from high-dimensional, noisy, and multimodal data integrating different modalities, including text, images, signals, and speech.

Articles are expected to address the challenges of data mining and data architectures, advance the field of natural language processing, and discuss the ethical considerations of data mining approaches applied to natural language processing.

Original research articles are welcome. Topics of interest include, but are not limited to, the following:

  • Advances in text mining
  • Interpretability and analysis of models in NLP
  • Sentiment analysis and opinion mining
  • Text summarization and text classification
  • Information retrieval
  • Social media analysis
  • Multimodal natural language processing
  • Generative models and large language models
  • Multimodal large language models
  • Advanced language models
  • Ethics in data mining

Dr. Fatma Najar
Guest Editor

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to the website. Once you are registered, proceed to the submission form. Manuscripts can be submitted until the deadline. All submissions that pass pre-check are peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the special issue website. Research articles, review articles, and short communications are invited. For planned papers, a title and short abstract (about 250 words) can be sent to the Editorial Office for assessment.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. Electronics is an international peer-reviewed open access semimonthly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 2400 CHF (Swiss Francs). Submitted papers should be well formatted and use good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Keywords

  • data mining
  • natural language processing
  • sentiment analysis
  • text mining
  • text classification
  • language models
  • multimodality

Benefits of Publishing in a Special Issue

  • Ease of navigation: Grouping papers by topic helps scholars navigate broad scope journals more efficiently.
  • Greater discoverability: Special Issues support the reach and impact of scientific research. Articles in Special Issues are more discoverable and cited more frequently.
  • Expansion of research network: Special Issues facilitate connections among authors, fostering scientific collaborations.
  • External promotion: Articles in Special Issues are often promoted through the journal's social media, increasing their visibility.
  • Reprint: MDPI Books provides the opportunity to republish successful Special Issues in book format, both online and in print.

Further information on MDPI's Special Issue policies can be found here.

Published Papers (2 papers)


Research

33 pages, 3307 KB  
Article
Comparing Single-Agent and Multi-Agent Strategies in LLM-Based Title-Abstract Screening
by Irina Radeva, Teodora Noncheva, Lyubka Doukovska and Ivan Popchev
Electronics 2026, 15(8), 1661; https://doi.org/10.3390/electronics15081661 - 15 Apr 2026
Viewed by 685
Abstract
Title-abstract screening remains labour-intensive, especially in interdisciplinary domains where shared terminology increases misclassification risk. This study compared five LLM coordination strategies—single-agent baseline, majority voting, recall-focused ensemble, confidence-weighted aggregation, and two-stage filtering—using four 4-bit quantised open-source models (Mistral 7B, LLaMA 3.1 8B, Granite 3.3 8B, Qwen 2.5 7B) in zero-shot and few-shot configurations. The evaluation was conducted on a Gold Standard of 200 papers from a corpus of 2036 records on blockchain-based e-voting. The best-performing configuration—a single-agent strategy with Qwen 2.5 7B in few-shot mode—achieved recall of 100%, precision of 70.4%, F1 of 82.6%, and a 43.4% reduction in manual screening effort, outperforming all multi-agent alternatives. Confidence-weighted aggregation produced results identical to majority voting, indicating that self-reported confidence from 7–8B parameter models did not add discriminative value. All screening decisions were logged on a private blockchain with timestamped anchoring for reproducibility. These results suggest that, for domain-specific screening tasks, careful model selection outweighs multi-agent coordination overhead, and that few-shot prompting with a well-matched model can achieve human-level recall with substantially reduced manual effort.
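The equivalence the abstract reports between confidence-weighted aggregation and majority voting can be illustrated with a small sketch. This is a hypothetical illustration, not the authors' implementation: the vote and confidence values are invented, and it simply shows why near-uniform self-reported confidences make the weighted rule collapse into plain majority voting.

```python
# Hypothetical sketch of two of the coordination strategies named in the
# abstract. Agent votes are booleans (True = include the paper).

def majority_vote(votes):
    """Include the paper if at least half of the agents vote to include."""
    return sum(votes) * 2 >= len(votes)

def confidence_weighted(votes, confidences):
    """Weight each agent's vote by its self-reported confidence."""
    include = sum(c for v, c in zip(votes, confidences) if v)
    exclude = sum(c for v, c in zip(votes, confidences) if not v)
    return include >= exclude

votes = [True, True, False, False]
# With near-uniform confidences, the weighted rule reduces to counting
# votes, so the two strategies cannot disagree.
uniform = [0.8, 0.8, 0.8, 0.8]
assert majority_vote(votes) == confidence_weighted(votes, uniform)
```

If the confidences carried real signal (e.g., one agent reporting 0.95 against three reporting 0.5), the two rules could diverge; the paper's finding is that 7–8B models did not produce such informative confidences.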

Review

47 pages, 5474 KB  
Review
Bias in Large Language Models: Origin, Evaluation, and Mitigation
by Yufei Guo, Muzhe Guo, Juntao Su, Zhou Yang, Mengqiu Zhu, Hongfei Li, Mengyang Qiu and Shuo Shuo Liu
Electronics 2026, 15(9), 1824; https://doi.org/10.3390/electronics15091824 - 24 Apr 2026
Viewed by 425
Abstract
Large language models (LLMs) have revolutionized natural language processing, but their susceptibility to biases poses significant challenges. This comprehensive review examines the landscape of bias in LLMs, from its origins to current mitigation strategies. We categorize biases as intrinsic and extrinsic, analyzing their manifestations in various natural language processing (NLP) tasks. The review critically assesses a range of bias evaluation methods, including data-level, model-level, and output-level approaches, providing researchers with a robust toolkit for bias detection. We further explore mitigation strategies, categorizing them into pre-model, intra-model, and post-model techniques, highlighting their effectiveness and limitations. Ethical and legal implications of biased LLMs are discussed, emphasizing potential harms in real-world applications such as healthcare and criminal justice. By synthesizing current knowledge on bias in LLMs, this review contributes to the ongoing effort to develop fair and responsible artificial intelligence (AI) systems. Our work serves as a comprehensive resource for researchers and practitioners working towards understanding, evaluating, and mitigating bias in LLMs, fostering the development of more equitable AI technologies.
