Natural Language Processing in the Era of Artificial Intelligence

A special issue of Applied Sciences (ISSN 2076-3417). This special issue belongs to the section "Computing and Artificial Intelligence".

Deadline for manuscript submissions: 20 September 2025 | Viewed by 2570

Special Issue Editors


Dr. Daniela Gîfu
Guest Editor
Institute of Computer Science, Romanian Academy, Iasi Branch, 700011 Iasi, Romania
Interests: natural language processing; computational linguistics; web of linked data; content analysis; social media and health information; applied and computational statistics; integrated health informatics system; assisted decision systems; research ethics

Dr. Kevin Cohen
Guest Editor
Computational Bioscience Program, Department of Pharmacology, University of Colorado School of Medicine, Aurora, CO 80045, USA
Interests: spinal cord injury and regeneration; analysis of the speech of suicidal individuals; temporality in health records; information extraction from epilepsy clinic notes

Special Issue Information

Dear Colleagues,

In an era when massive amounts of data have become available, researchers across various domains increasingly require the expertise of language engineers to process large quantities of literature, data, and records. Whether in healthcare, finance, education, social sciences, or any other field, linking the contents of these documents to each other, as well as to specialized ontologies, can enable access to and discovery of structured information, fostering significant advancements in natural language processing and research.

This Special Issue aims to gather innovative approaches to exploiting data with semantic web technologies and linked data, bringing together practitioners, researchers, and scholars to share examples, use cases, theories, and analyses across different fields. Its main objective is to establish an internationally recognized forum for scientific research, with an emphasis on crowdsourcing, the semantic web, knowledge integration, and data linking.

Dr. Daniela Gîfu
Dr. Kevin Cohen
Guest Editors

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to this website. Once you are registered, go to the submission form. Manuscripts can be submitted until the deadline. All submissions that pass pre-check are peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the special issue website. Research articles, review articles, and short communications are invited. For planned papers, a title and short abstract (about 100 words) can be sent to the Editorial Office for announcement on this website.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. Applied Sciences is an international peer-reviewed open access semimonthly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 2400 CHF (Swiss Francs). Submitted papers should be well formatted and use good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Keywords

  • natural language processing/text mining
  • data science/applied mathematics
  • knowledge integration
  • semantic web technologies
  • open linked data
  • crowdsourcing

Benefits of Publishing in a Special Issue

  • Ease of navigation: Grouping papers by topic helps scholars navigate broad scope journals more efficiently.
  • Greater discoverability: Special Issues support the reach and impact of scientific research. Articles in Special Issues are more discoverable and cited more frequently.
  • Expansion of research network: Special Issues facilitate connections among authors, fostering scientific collaborations.
  • External promotion: Articles in Special Issues are often promoted through the journal's social media, increasing their visibility.
  • e-Book format: Special Issues with more than 10 articles can be published as dedicated e-books, ensuring wide and rapid dissemination.

Further information is available in MDPI's Special Issue policies.

Published Papers (2 papers)

Research

20 pages, 4707 KiB  
Article
Entropy-Optimized Dynamic Text Segmentation and RAG-Enhanced LLMs for Construction Engineering Knowledge Base
by Haiyuan Wang, Deli Zhang, Jianmin Li, Zelong Feng and Feng Zhang
Appl. Sci. 2025, 15(6), 3134; https://doi.org/10.3390/app15063134 - 13 Mar 2025
Viewed by 658
Abstract
In the field of construction engineering, there exists a dynamic evolution of extensive technical standards and specifications (e.g., GB/T and ISO series) that permeate the entire lifecycle of design, construction, and operation–maintenance. These standards require continuous version iteration to adapt to technological innovations. Engineers require specialized knowledge bases to assist in understanding and updating these standards. The advancement of large language models (LLMs) and Retrieval-Augmented Generation (RAG) technologies provides robust technical support for constructing domain-specific knowledge bases. This study developed and tested a vertical domain knowledge base construction scheme based on RAG architecture and LLMs, comprising three critical components: entropy-optimized dynamic text segmentation (EDTS), vector correlation-based chunk ranking, and iterative optimization of prompt engineering. This study employs an EDTS method to ensure information clarity and predictability within limited chunk lengths, followed by selecting 10 relevant chunks to form prompts for input into LLMs, thereby enabling efficient retrieval of vertical domain knowledge. Experimental validation using Qwen-series LLMs with a test set of 101 expert-verified questions from Chinese construction industry standard demonstrates that the overall test accuracy reaches 76%. The comparative experiments across model scales (1.5B, 3B, 7B, 14B, 32B, and 72B) quantitatively reveal the relationship between model size, answer accuracy, and execution time, providing decision-making guidance for computational resource-accuracy tradeoffs in engineering practice.
(This article belongs to the Special Issue Natural Language Processing in the Era of Artificial Intelligence)
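
The retrieval step mentioned in the abstract above (ranking chunks by vector correlation and keeping the 10 most relevant to form a prompt) might look roughly like the sketch below. It is illustrative only and is not the authors' implementation: word counts stand in for a learned embedding model, cosine similarity stands in for "vector correlation", the entropy-based segmentation (EDTS) is not reproduced, and the names `embed`, `cosine`, `top_k_chunks`, and `build_prompt` are hypothetical.

```python
import math
import re
from collections import Counter


def embed(text: str) -> Counter:
    """Toy embedding: lowercase word counts (assumed stand-in for a real model)."""
    return Counter(re.findall(r"\w+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def top_k_chunks(query: str, chunks: list[str], k: int = 10) -> list[str]:
    """Return up to k chunks, ordered from most to least similar to the query."""
    q = embed(query)
    ranked = sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)
    return ranked[:k]


def build_prompt(query: str, chunks: list[str], k: int = 10) -> str:
    """Join the selected chunks and the question into a single prompt string."""
    context = "\n\n".join(top_k_chunks(query, chunks, k))
    return f"Context:\n{context}\n\nQuestion: {query}"
```

In a real system the toy `embed` function would be replaced by an embedding model, and the chunks would come from the segmentation step rather than from a hand-written list.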

13 pages, 280 KiB  
Article
Under-Represented Speech Dataset from Open Data: Case Study on the Romanian Language
by Vasile Păiș, Verginica Barbu Mititelu, Elena Irimia, Radu Ion and Dan Tufiș
Appl. Sci. 2024, 14(19), 9043; https://doi.org/10.3390/app14199043 - 7 Oct 2024
Viewed by 1145
Abstract
This paper introduces the USPDATRO dataset. This is a speech dataset, in the Romanian language, constructed from open data, focusing on under-represented voice types (children, young and old people, and female voices). The paper covers the methodology behind the dataset construction, specific details regarding the dataset, and evaluation of existing Romanian Automatic Speech Recognition (ASR) systems, with different architectures. Results indicate that more under-represented speech content is needed in the training of ASR systems. Our approach can be extended to other low-resourced languages, as long as open data are available.
(This article belongs to the Special Issue Natural Language Processing in the Era of Artificial Intelligence)