Special Issue "Natural Language Processing and Applications: Challenges and Perspectives"

A special issue of Information (ISSN 2078-2489). This special issue belongs to the section "Artificial Intelligence".

Deadline for manuscript submissions: 30 June 2022 | Viewed by 6678

Special Issue Editor

Prof. Dr. Diego Reforgiato Recupero
Guest Editor
Department of Mathematics and Computer Science, University of Cagliari, Via Ospedale 72, 09042 Cagliari, Italy
Interests: big data; social network analysis; semantic web; natural language processing; deep learning

Special Issue Information

Dear Colleagues,

The 2nd International Conference on Natural Language Processing and Applications (NLPA 2021) will be held in Copenhagen, Denmark, from 24 to 25 April 2021. NLPA 2021 will provide an excellent international forum for sharing knowledge and results concerning the theory, methodology, and applications of natural language computing. This Special Issue intends to contain a selection of the best papers presented at NLPA 2021, carefully revised and extended. Paper acceptance for NLPA 2021 will be based on quality, relevance to the conference theme, and originality.

The authors of a number of selected high-quality full papers will be invited after the conference to submit revised and extended versions of their originally accepted conference papers to this Special Issue of Information, published by MDPI in open access. The selection of these best papers will be based on their ratings in the conference review process, quality of presentation during the conference, and expected impact on the research community. For each submission to this Special Issue, at least 50% of the content should be new material, e.g., in the form of technical extensions, more in-depth evaluations, or additional use cases, and there should be a change of title, abstract, and keywords. These extended submissions will undergo a peer-review process according to the journal’s rules. At least two technical committee members will act as reviewers for each extended article submitted to this Special Issue; if needed, additional external reviewers will be invited to guarantee a high-quality reviewing process.

Prof. Dr. Diego Reforgiato Recupero
Guest Editor

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to this website. Once you are registered, proceed to the submission form. Manuscripts can be submitted until the deadline. All submissions that pass pre-check are peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the special issue website. Research articles, review articles, and short communications are invited. For planned papers, a title and short abstract (about 100 words) can be sent to the Editorial Office for announcement on this website.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. Information is an international peer-reviewed open access monthly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 1400 CHF (Swiss Francs). Submitted papers should be well formatted and use good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Keywords

  • Phonology and morphology
  • Chunking/shallow parsing
  • Parsing/grammatical formalisms
  • Semantic processing
  • Lexical semantics
  • Ontology
  • Linguistic resources
  • Statistical and knowledge-based methods
  • POS tagging
  • Discourse
  • Paraphrasing/entailment/generation
  • Machine translation
  • Information retrieval
  • Text mining
  • Information extraction
  • Question answering
  • Dialog systems
  • Spoken language processing
  • Speech recognition and synthesis
  • Computational linguistics and NLP
  • Information retrieval and AI
  • Semantics and NLP

Published Papers (7 papers)

Research

Article
Robust Complaint Processing in Portuguese
Information 2021, 12(12), 525; https://doi.org/10.3390/info12120525 - 17 Dec 2021
Viewed by 659
Abstract
The Natural Language Processing (NLP) community has witnessed huge improvements in recent years. However, most achievements are evaluated on benchmarked curated corpora, with little attention devoted to user-generated content and less-resourced languages. Although recent approaches target the development of multi-lingual tools and models, they still underperform in languages such as Portuguese, for which linguistic resources do not abound. This paper exposes a set of challenges encountered when dealing with a real-world complex NLP problem, based on user-generated complaint data in Portuguese. This case study meets the needs of a country-wide governmental institution responsible for food safety and economic surveillance, and its responsibilities in handling a high number of citizen complaints. Beyond looking at the problem from an exclusively academic point of view, we adopt application-level concerns when analyzing the progress obtained through different techniques, including the need to obtain explainable decision support. We discuss modeling choices and provide useful insights for researchers working on similar problems or data.

Article
A Comparative Study of Arabic Part of Speech Taggers Using Literary Text Samples from Saudi Novels
Information 2021, 12(12), 523; https://doi.org/10.3390/info12120523 - 15 Dec 2021
Viewed by 751
Abstract
Part of Speech (POS) tagging is one of the most common techniques used in natural language processing (NLP) applications and corpus linguistics. Various POS tagging tools have been developed for Arabic. These taggers differ in several aspects, such as their modeling techniques, tag sets, and training and testing data. In this paper we conduct a comparative study of five Arabic POS taggers, namely Stanford Arabic, CAMeL Tools, Farasa, MADAMIRA and Arabic Linguistic Pipeline (ALP), and examine their performance using text samples from Saudi novels. The testing data has been extracted from different novels that represent different types of narrations. The main result we have obtained indicates that the ALP tagger performs better than the others in this particular case, and that Adjective is the most frequently mistagged POS type as compared to Noun and Verb.

Article
Developing Core Technologies for Resource-Scarce Nguni Languages
Information 2021, 12(12), 520; https://doi.org/10.3390/info12120520 - 14 Dec 2021
Cited by 1 | Viewed by 557
Abstract
The creation of linguistic resources is crucial to the continued growth of research and development efforts in the field of natural language processing, especially for resource-scarce languages. In this paper, we describe the curation and annotation of corpora and the development of multiple linguistic technologies for four official South African languages, namely isiNdebele, Siswati, isiXhosa, and isiZulu. Development efforts included sourcing parallel data for these languages and annotating each on token, orthographic, morphological, and morphosyntactic levels. These sets were in turn used to create and evaluate three core technologies, viz. a lemmatizer, a part-of-speech tagger, and a morphological analyzer for each of the languages. We report on the quality of these technologies, which improve on rule-based technologies previously developed as part of a similar initiative in 2013. These resources are made publicly accessible through a local resource agency with the intention of fostering further development of both resources and technologies that may benefit the NLP industry in South Africa.

Article
A Knowledge-Based Sense Disambiguation Method to Semantically Enhanced NL Question for Restricted Domain
Information 2021, 12(11), 452; https://doi.org/10.3390/info12110452 - 31 Oct 2021
Viewed by 566
Abstract
Within the space of question answering (QA) systems, the most critical module to improve overall performance is question analysis processing. Extracting the lexical semantics of a Natural Language (NL) question presents challenges at syntactic and semantic levels for most QA systems. This is due to the difference between the words posed by a user and the terms presently stored in the knowledge bases. Many studies have achieved encouraging results in lexical semantic resolution on the topic of word sense disambiguation (WSD), and several other works consider these challenges in the context of QA applications. Additionally, few scholars have examined the role of WSD in returning potential answers corresponding to particular questions. However, natural language processing (NLP) still faces several challenges in determining the precise meaning of various ambiguities. Therefore, the motivation of this work is to propose a novel knowledge-based sense disambiguation (KSD) method for resolving the problem of lexical ambiguity associated with questions posed in QA systems. The major contribution is the proposed innovative method, which incorporates multiple knowledge sources. This includes the question’s metadata (date/GPS), context knowledge, and domain ontology into a shallow NLP. The proposed KSD method is developed into a unique tool for a mobile QA application that aims to determine the intended meaning of questions expressed by pilgrims. The experimental results reveal that our method obtained comparable and better accuracy performance than the baselines in the context of the pilgrimage domain.

Article
Optimizing Small BERTs Trained for German NER
Information 2021, 12(11), 443; https://doi.org/10.3390/info12110443 - 25 Oct 2021
Viewed by 593
Abstract
Currently, the most widespread neural network architecture for training language models is the so-called BERT, which led to improvements in various Natural Language Processing (NLP) tasks. In general, the larger the number of parameters in a BERT model, the better the results obtained in these NLP tasks. Unfortunately, the memory consumption and the training duration drastically increase with the size of these models. In this article, we investigate various training techniques of smaller BERT models: we combine different methods from other BERT variants, such as ALBERT, RoBERTa, and relative positional encoding. In addition, we propose two new fine-tuning modifications leading to better performance: Class-Start-End tagging and a modified form of Linear Chain Conditional Random Fields. Furthermore, we introduce Whole-Word Attention, which reduces BERT's memory usage and leads to a small increase in performance compared to classical Multi-Head-Attention. We evaluate these techniques on five public German Named Entity Recognition (NER) tasks, of which two are introduced by this article.

Article
Multi-Task Learning for Sentiment Analysis with Hard-Sharing and Task Recognition Mechanisms
Information 2021, 12(5), 207; https://doi.org/10.3390/info12050207 - 12 May 2021
Cited by 3 | Viewed by 928
Abstract
In the era of big data, multi-task learning has become one of the crucial technologies for sentiment analysis and classification. Most of the existing multi-task learning models for sentiment analysis are developed based on the soft-sharing mechanism, which has less interference between different tasks than the hard-sharing mechanism. However, there are also fewer essential features that the model can extract with the soft-sharing method, resulting in unsatisfactory classification performance. In this paper, we propose a multi-task learning framework based on a hard-sharing mechanism for sentiment analysis in various fields. The hard-sharing mechanism is achieved by a shared layer to build the interrelationship among multiple tasks. Then, we design a task recognition mechanism to reduce the interference of the hard-shared feature space and also to enhance the correlation between multiple tasks. Experiments on two real-world sentiment classification datasets show that our approach achieves the best results and improves the classification accuracy over the existing methods significantly. The task recognition training process enables a unique representation of the features of different tasks in the shared feature space, providing a new solution for reducing interference in the shared feature space for sentiment analysis.

Review

Review
Ontology-Based Approach to Semantically Enhanced Question Answering for Closed Domain: A Review
Information 2021, 12(5), 200; https://doi.org/10.3390/info12050200 - 01 May 2021
Cited by 2 | Viewed by 1092
Abstract
For many users of natural language processing (NLP), it can be challenging to obtain concise, accurate and precise answers to a question. Systems such as question answering (QA) enable users to ask questions and receive feedback in the form of quick answers to questions posed in natural language, rather than in the form of lists of documents delivered by search engines. This task is challenging and involves complex semantic annotation and knowledge representation. This study reviews the literature on ontology-based methods that semantically enhance QA for a closed domain, covering the relevant studies published between 2000 and 2020. The review reports that 83 of the 124 papers considered acknowledge the QA approach, and recommend its development and evaluation using different methods. These methods are evaluated according to accuracy, precision, and recall. An ontological approach to semantically enhancing QA is found to be adopted in a limited way, as many of the studies reviewed concentrated instead on NLP and information retrieval (IR) processing. While the majority of the studies reviewed focus on open domains, this study investigates the closed domain.