Special Issue "Natural Language Processing: Emerging Neural Approaches and Applications"

A special issue of Applied Sciences (ISSN 2076-3417). This special issue belongs to the section "Computing and Artificial Intelligence".

Deadline for manuscript submissions: 31 July 2020.

Special Issue Editors

Dr. Massimo Esposito
Guest Editor
Institute for High Performance Computing and Networking - National Research Council of Italy (ICAR-CNR), Naples, Italy
Interests: artificial intelligence; natural language processing; decision support systems; cognitive systems; knowledge-based technologies
Dr. Giovanni Luca Masala

Guest Editor
School of Computing, Mathematics & Digital Technology - Manchester Metropolitan University, Manchester, UK
Interests: artificial intelligence; natural language processing; cognitive systems; machine vision; cloud computing
Dr. Aniello Minutolo

Guest Editor
Institute for High Performance Computing and Networking - National Research Council of Italy (ICAR-CNR), Naples, Italy
Interests: knowledge-based technologies; natural language processing; decision support systems; dialog systems and chatbots
Dr. Marco Pota

Guest Editor
Institute for High Performance Computing and Networking - National Research Council of Italy (ICAR-CNR), Naples, Italy
Interests: fuzzy modeling; system interpretability; classification; natural language processing; deep neural networks

Special Issue Information

Dear Colleagues,

In recent years, Artificial Intelligence has led to impressive achievements on a variety of complex cognitive tasks, matching or even beating humans. In the field of natural language processing (NLP), the use of deep learning models over the last five years has allowed AI to surpass human performance on many important tasks, such as machine translation and machine reading comprehension, and to achieve considerable improvements in other real-world NLP applications, such as image captioning, visual question answering, conversational systems, search and information retrieval, sentiment analysis, and recommender systems.

Despite the remarkable success of deep learning in different NLP tasks, significant challenges remain that keep natural language development and understanding among the least understood human capabilities from a cognitive perspective. Indeed, current deep learning methods have been scaled up and improved, but, as a side effect, their complexity has grown, taking the form of empirical engineering solutions. Moreover, they have become extremely data-hungry and are not applicable to languages with scarce or no datasets. Furthermore, they are not able to explain their outputs, which is relevant for using and improving natural language systems. In summary, current deep learning systems do not provide a human-like computational model of cognition that is able to acquire, comprehend, and generate natural language, as well as ground it and perform common-sense reasoning on physical concepts, objects, and events of the external world.

This Special Issue is intended to provide an overview of the research being carried out in the area of natural language processing to address these open issues, with a particular focus both on emerging approaches for language learning, understanding, production, and grounding, interactively or autonomously from data, in cognitive and neural systems, and on their potential or real applications in different domains. To this end, the Special Issue aims to gather researchers with broad expertise in various fields, including natural language processing, cognitive science and psychology, Artificial Intelligence and neural networks, and computational modeling and neuroscience, to discuss their cutting-edge work as well as perspectives on future directions in this exciting field. Original contributions are sought, covering the whole range of theoretical and practical aspects, technologies, and systems in this research area.

The topics of interest for this Special Issue include but are not limited to:

  • Natural language understanding, generation, and grounding;
  • Multilingual and cross-lingual distributional representations and universal language models;
  • Conversational systems/interfaces and question answering;
  • Sentiment analysis, emotion detection, and opinion mining;
  • Document analysis, information extraction, and text mining;
  • Machine translation;
  • Search and information retrieval;
  • Common-sense reasoning;
  • Computer/human interactive learning;
  • Neuroscience-inspired cognitive architectures;
  • Trustworthy and explainable artificial intelligence;
  • Cognitive and social robotics;
  • Applications in science, engineering, medicine, healthcare, finance, business, law, education, transportation, retailing, telecommunication, and multimedia.

Dr. Massimo Esposito
Dr. Giovanni Luca Masala
Dr. Aniello Minutolo
Dr. Marco Pota
Guest Editors

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to the website and then proceeding to the submission form. Manuscripts can be submitted until the deadline. All papers will be peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the special issue website. Research articles, review articles, and short communications are invited. For planned papers, a title and short abstract (about 100 words) can be sent to the Editorial Office for announcement on this website.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. Applied Sciences is an international peer-reviewed open access semimonthly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 1800 CHF (Swiss Francs). Submitted papers should be well formatted and use good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Keywords

  • Natural language processing
  • Text analytics
  • Interactive and reinforcement learning
  • Machine/deep learning
  • Transfer learning
  • Cognitive systems

Published Papers (19 papers)


Research


Open Access Article
Text Normalization Using Encoder–Decoder Networks Based on the Causal Feature Extractor
Appl. Sci. 2020, 10(13), 4551; https://doi.org/10.3390/app10134551 - 30 Jun 2020
Abstract
The encoder–decoder architecture is a well-established, effective, and widely used approach in many natural language processing (NLP) tasks, among other domains. It consists of two closely collaborating components: an encoder that transforms the input into an intermediate form, and a decoder that produces the output. This paper proposes a new method for the encoder, named the Causal Feature Extractor (CFE), based on three main ideas: causal convolutions, dilations, and bidirectionality. We apply this method to text normalization, a ubiquitous problem that appears as the first step of many text-to-speech (TTS) systems. Given a text with symbols, the problem consists of writing the text exactly as it should be read by the TTS system. We use an attention-based encoder–decoder architecture with a fine-grained character-level approach rather than the usual word-level one. The proposed CFE is compared to other common encoders, such as convolutional neural networks (CNNs) and long short-term memory (LSTM) networks. Experimental results show the feasibility of the CFE, which achieves better results in terms of accuracy, number of parameters, convergence time, and the attention matrices produced by the attention mechanism. The obtained accuracy ranges from 83.5% to 96.8% of correctly normalized sentences, depending on the dataset. Moreover, the proposed method is generic and can be applied to different types of input, such as text, audio, and images.
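The causal, dilated convolutions underlying the CFE can be illustrated with a minimal NumPy sketch (the function name and the toy kernel are ours, not from the paper): the output at position t only sees inputs at or before t, and dilation widens the receptive field without adding parameters.

```python
import numpy as np

def causal_dilated_conv1d(x, kernel, dilation=1):
    """Causal 1-D convolution: the output at position t depends only on
    inputs at positions <= t, spaced `dilation` steps apart."""
    k = len(kernel)
    pad = (k - 1) * dilation          # left-pad so no future input is used
    xp = np.concatenate([np.zeros(pad), x])
    out = np.empty(len(x))
    for t in range(len(x)):
        taps = xp[t : t + pad + 1 : dilation]   # k taps ending at position t
        out[t] = np.dot(taps, kernel)
    return out

signal = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
identity = np.array([0.0, 1.0])       # kernel that just copies the current input
print(causal_dilated_conv1d(signal, identity, dilation=2))  # -> same as signal
```

Stacking such layers with growing dilation (1, 2, 4, ...) is what gives this family of encoders a large receptive field at low cost; running the layer once in each direction yields the bidirectional variant.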
Open Access Article
A Polarity Capturing Sphere for Word to Vector Representation
Appl. Sci. 2020, 10(12), 4386; https://doi.org/10.3390/app10124386 - 26 Jun 2020
Abstract
Embedding words from a dictionary as vectors in a space has become an active research field, due to its many uses in several natural language processing applications. Distances between the vectors should reflect the relatedness of the corresponding words. The problem with existing word embedding methods is that they often fail to distinguish between synonymous, antonymous, and unrelated word pairs. Meanwhile, polarity detection is crucial for applications such as sentiment analysis. In this work, we propose an embedding approach designed to capture polarity. The approach embeds the word vectors on a sphere, whereby the dot product between any two vectors represents their similarity. Vectors corresponding to synonymous words lie close to each other on the sphere, while a word and its antonym lie at opposite poles. The vectors are computed with a simple relaxation algorithm. The proposed word embedding successfully distinguishes between synonyms, antonyms, and unrelated word pairs. It achieves results that are better than those of some state-of-the-art techniques and competes well with the others.
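The sphere-embedding idea can be sketched with a toy relaxation (the update rule, learning rate, and word indices below are illustrative assumptions, not the paper's exact algorithm): synonym pairs are pulled toward a dot product of +1, antonym pairs toward -1, and every vector is reprojected onto the unit sphere after each update.

```python
import numpy as np

def relax_on_sphere(n_words, pairs, dim=3, steps=200, lr=0.1, seed=0):
    """Nudge unit vectors so each (i, j, target) pair approaches the
    target dot product (+1 synonyms, -1 antonyms), reprojecting onto
    the unit sphere after every update."""
    rng = np.random.default_rng(seed)
    v = rng.normal(size=(n_words, dim))
    v /= np.linalg.norm(v, axis=1, keepdims=True)
    for _ in range(steps):
        for i, j, target in pairs:
            err = target - v[i] @ v[j]    # residual similarity
            v[i] += lr * err * v[j]
            v[j] += lr * err * v[i]
            v[i] /= np.linalg.norm(v[i])  # project back onto the sphere
            v[j] /= np.linalg.norm(v[j])
    return v

# indices: 0 "happy", 1 "glad" (synonyms); 2 "sad" (antonym of both)
vecs = relax_on_sphere(3, [(0, 1, 1.0), (0, 2, -1.0), (1, 2, -1.0)])
print(vecs[0] @ vecs[1], vecs[0] @ vecs[2])
```

After relaxation the synonym pair ends up nearly aligned and the antonym pairs nearly antipodal, which is exactly the polarity structure the abstract describes.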

Open Access Article
Improving Sentence Retrieval Using Sequence Similarity
Appl. Sci. 2020, 10(12), 4316; https://doi.org/10.3390/app10124316 - 23 Jun 2020
Abstract
Sentence retrieval is an information retrieval technique that aims to find sentences corresponding to an information need. It is used for tasks like question answering (QA) and novelty detection. Since it resembles document retrieval but with a smaller unit of retrieval, methods for document retrieval are also used for sentence retrieval, such as term frequency–inverse document frequency (TF-IDF), BM25, and language modeling-based methods. The effect of partial word matching on sentence retrieval is an issue that has not been analyzed, and we believe there is substantial potential for improving sentence retrieval methods by considering it. We adapted TF-ISF, BM25, and language modeling-based methods to test partial matching of terms by combining sentence retrieval with sequence similarity, which allows matching of words that are similar but not identical. All tests were conducted using data from the novelty tracks of the Text Retrieval Conference (TREC). The scope of this paper was to find out whether such an approach is generally beneficial to sentence retrieval; we did not examine in depth how partial matching helps or hinders the finding of relevant sentences.
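The partial-matching idea can be sketched with standard-library sequence similarity (the simplified TF-ISF weighting, the 0.8 similarity threshold, and the toy corpus are our assumptions, not the paper's exact setup): a query term is allowed to match the most similar sentence term rather than only an identical one.

```python
import math
from difflib import SequenceMatcher

def tf_isf(sentences):
    """Inverse sentence frequency: TF-IDF with the sentence as the
    retrieval unit."""
    n = len(sentences)
    isf = {}
    for w in {w for s in sentences for w in s.split()}:
        sf = sum(1 for s in sentences if w in s.split())
        isf[w] = math.log(n / sf)
    return isf

def soft_score(query, sentence, isf, threshold=0.8):
    """Each query term matches its most similar sentence term (exact or
    partial); matches above the threshold contribute the term's ISF
    weight scaled by the similarity ratio."""
    score = 0.0
    words = sentence.split()
    for q in query.split():
        best = max(SequenceMatcher(None, q, w).ratio() for w in words)
        if best >= threshold:
            score += best * isf.get(q, 0.0)
    return score

docs = ["the colors of autumn", "a colour palette", "winter sports"]
weights = tf_isf(docs)
# "colors" also partially matches "colour", so the second sentence scores > 0
ranked = sorted(docs, key=lambda s: -soft_score("colors", s, weights))
print(ranked)
```

With exact matching only, the spelling variant "colour" would contribute nothing; the sequence-similarity ratio lets it count at a slight discount, which is the core of the adaptation described above.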

Open Access Article
Paraphrase Identification with Lexical, Syntactic and Sentential Encodings
Appl. Sci. 2020, 10(12), 4144; https://doi.org/10.3390/app10124144 - 16 Jun 2020
Abstract
Paraphrase identification has been one of the major topics in Natural Language Processing (NLP). However, how to interpret a diversity of contexts, such as lexical and semantic information within a sentence, as relevant features is still an open problem. This paper addresses the problem and presents an approach for leveraging contextual features with a neural-based learning model. Our Lexical, Syntactic, and Sentential Encodings (LSSE) learning model incorporates Relational Graph Convolutional Networks (R-GCNs) to make use of different features from local contexts: word encoding, position encoding, and full dependency structures. By utilizing the hidden states obtained by the R-GCNs as well as lexical and sentential encodings from Bidirectional Encoder Representations from Transformers (BERT), our model learns the contextual similarity between sentences effectively. Experimental results on two benchmark datasets, the Microsoft Research Paraphrase Corpus (MRPC) and Quora Question Pairs (QQP), show improvements over the baseline BERT sentential encodings model of 1.7% F1-score on MRPC and 1.0% F1-score on QQP. Moreover, we verified that the combination of position encoding and syntactic features contributes to the performance improvement.

Open Access Article
A Rule-Based Approach to Embedding Techniques for Text Document Classification
Appl. Sci. 2020, 10(11), 4009; https://doi.org/10.3390/app10114009 - 10 Jun 2020
Abstract
With the growth of online information and the sudden expansion in the number of electronic documents provided on websites and in electronic libraries, categorizing text documents has become difficult. The purpose of this study is to classify documents using a rule-based approach combined with the document-to-vector (doc2vec) embedding technique. An experiment was performed on two datasets, Reuters-21578 and 20 Newsgroups, to classify the top ten categories of each by using a document-to-vector rule-based method (D2vecRule). The method provided good classification results according to F-measure and implementation time metrics. In conclusion, we observed that our D2vecRule algorithm compared favorably with other algorithms, such as JRip, OneR, and ZeroR, applied to the same Reuters-21578 dataset.

Open Access Article
Dual Pointer Network for Fast Extraction of Multiple Relations in a Sentence
Appl. Sci. 2020, 10(11), 3851; https://doi.org/10.3390/app10113851 - 01 Jun 2020
Abstract
Relation extraction is a type of information extraction task that recognizes semantic relationships between entities in a sentence. Many previous studies have focused on extracting only one semantic relation between two entities in a single sentence. However, multiple entities in a sentence are associated through various relations. To address this issue, we proposed a relation extraction model based on a dual pointer network with a multi-head attention mechanism. The proposed model finds n-to-1 subject–object relations using a forward object decoder. Then, it finds 1-to-n subject–object relations using a backward subject decoder. Our experiments confirmed that the proposed model outperformed previous models, with an F1-score of 80.8% for the ACE (automatic content extraction) 2005 corpus and an F1-score of 78.3% for the NYT (New York Times) corpus.

Open Access Article
A Hybrid Adversarial Attack for Different Application Scenarios
Appl. Sci. 2020, 10(10), 3559; https://doi.org/10.3390/app10103559 - 21 May 2020
Abstract
Adversarial attacks against natural language have been a hot topic in the field of artificial intelligence security in recent years. The aim is to study methods for generating adversarial examples in order to better address the vulnerability and security of deep learning systems. Depending on whether the attacker knows the deep learning model's structure, adversarial attacks are divided into black-box and white-box attacks. In this paper, we propose a hybrid adversarial attack for different application scenarios. First, we propose a novel black-box method for generating adversarial examples to trick a word-level sentiment classifier, based on a differential evolution (DE) algorithm that generates semantically and syntactically similar adversarial examples. Compared with existing genetic-algorithm-based adversarial attacks, our algorithm achieves a higher attack success rate while maintaining a lower word replacement rate: at the 10% word substitution threshold, we increased the attack success rate from 58.5% to 63%. Second, when the model architecture and parameters are known, we propose a white-box attack with gradient-based perturbation against the same sentiment classifier. In this attack, we use a metric combining Euclidean distance and cosine distance to find the most semantically and syntactically similar substitution, and we introduce a coefficient of variation (CV) factor to control the dispersion of the modified words in the adversarial examples; more dispersed modifications increase human imperceptibility and text readability. Compared with the existing global attack, our attack increases the attack success rate (from 75.8% to 85.8%) and makes the modification positions in the generated examples more dispersed. Finally, these two attack methods cover different application scenarios: whether or not we know the internal structure and parameters of the model, we can generate good adversarial examples.
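The combined Euclidean-plus-cosine substitution metric described above can be sketched as follows (the equal mixing weight and the toy embedding vectors are assumptions for illustration, not values from the paper):

```python
import numpy as np

def combined_distance(u, v, alpha=0.5):
    """Blend Euclidean and cosine distance; the 0.5 mixing weight is an
    illustrative assumption."""
    euclidean = np.linalg.norm(u - v)
    cosine = 1.0 - (u @ v) / (np.linalg.norm(u) * np.linalg.norm(v))
    return alpha * euclidean + (1.0 - alpha) * cosine

def best_substitute(target, candidates):
    """Index of the candidate embedding closest to `target` under the
    combined metric."""
    return min(range(len(candidates)),
               key=lambda i: combined_distance(target, candidates[i]))

target = np.array([1.0, 0.0])
candidates = [np.array([0.9, 0.1]),   # near-synonym direction
              np.array([0.0, 1.0])]   # unrelated direction
print(best_substitute(target, candidates))  # -> 0
```

Blending the two distances penalizes substitutes that are close in angle but far in magnitude (or vice versa), which is the stated motivation for combining them when choosing word replacements.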

Open Access Article
Source Code Assessment and Classification Based on Estimated Error Probability Using Attentive LSTM Language Model and Its Application in Programming Education
Appl. Sci. 2020, 10(8), 2973; https://doi.org/10.3390/app10082973 - 24 Apr 2020
Abstract
The rate of software development has increased dramatically. Conventional compilers cannot assess and detect all source code errors; in particular, it is difficult for them to detect logic errors, so software may contain errors that negatively affect end-users. A method that utilizes artificial intelligence to assess and detect errors and to classify source code as correct (error-free) or incorrect is thus required. Here, we propose a sequential language model that uses an attention-mechanism-based long short-term memory (LSTM) neural network to assess and classify source code based on the estimated error probability. The attention mechanism enhances the accuracy of the proposed language model for error assessment and classification. We trained the proposed model using correct source code and then evaluated its performance. The experimental results show that the proposed model has logic and syntax error detection accuracies of 92.2% and 94.8%, respectively, outperforming state-of-the-art models. We also applied the proposed model to the classification of source code with logic and syntax errors, where the average precision, recall, and F-measure values are much better than those of benchmark models. Combining the attention mechanism with the LSTM strengthens the results of error assessment and detection as well as source code classification. Finally, the proposed model can be effective in programming education and software engineering by improving code writing, debugging, error correction, and reasoning.

Open Access Article
Cooperative Multi-Agent Reinforcement Learning with Conversation Knowledge for Dialogue Management
Appl. Sci. 2020, 10(8), 2740; https://doi.org/10.3390/app10082740 - 15 Apr 2020
Abstract
Dialogue management plays a vital role in task-oriented dialogue systems and has become an active area of research in recent years. Despite the promising results brought by deep reinforcement learning, most studies additionally need to develop a manual user simulator. To avoid the time-consuming development of a simulator policy, we propose a multi-agent dialogue model in which an end-to-end dialogue manager and a user simulator are optimized simultaneously. Different from prior work, we optimize the two agents from scratch and apply a reward shaping technique based on adjacency pair constraints from conversation analysis to speed up learning and to avoid deviation from normal human-human conversation. In addition, we generalize the one-to-one learning strategy to a one-to-many learning strategy, where a dialogue manager can be concurrently optimized with various user simulators, to improve the performance of the trained dialogue manager. The experimental results show that one-to-one agents trained with adjacency pair constraints converge faster and avoid deviation. In a cross-model evaluation with human users, the dialogue manager trained with the one-to-many strategy achieves the best performance.

Open Access Article
A Hybrid Deep Learning Model for Protein–Protein Interactions Extraction from Biomedical Literature
Appl. Sci. 2020, 10(8), 2690; https://doi.org/10.3390/app10082690 - 13 Apr 2020
Abstract
The exponentially increasing size of biomedical literature and the limited ability of manual curators to discover protein–protein interactions (PPIs) in text have led to delays in keeping PPI databases updated with current findings. The state-of-the-art text mining methods for PPI extraction are primarily based on deep learning (DL) models, and the performance of a DL-based method is mainly affected by the architecture of the DL model and the feature embedding method. In this study, we compared different DL model architectures, including convolutional neural networks (CNN), long short-term memory (LSTM), and hybrid models, and proposed a hybrid architecture of a bidirectional LSTM+CNN model for PPI extraction. Pretrained word embeddings and shortest dependency path (SDP) embeddings are fed into a two-channel embedding model, such that the model can capture long-distance contextual information as well as local features and structural information effectively. The experimental results showed that the proposed model is superior to the non-hybrid DL models, and that the hybrid CNN+bidirectional LSTM model works well for PPI extraction. The visualization and comparison of the hidden features learned by different DL models further confirmed the effectiveness of the proposed model.

Open Access Article
Medical Instructed Real-Time Assistant for Patient with Glaucoma and Diabetic Conditions
Appl. Sci. 2020, 10(7), 2216; https://doi.org/10.3390/app10072216 - 25 Mar 2020
Abstract
Virtual assistants are involved in the daily activities of humans, such as managing calendars, making appointments, and providing wake-up calls. They provide a conversational service to customers around the clock and make their daily life manageable. With this emerging trend, many well-known companies have launched their own virtual assistants that manage the daily routine activities of customers. In the healthcare sector, virtual medical assistants also provide a list of relevant diseases linked to a specific symptom. Due to low accuracy and uncertainty, these generated recommendations are untrusted and may lead to hypochondriasis. In this study, we propose a Medical Instructed Real-time Assistant (MIRA) that listens to the user's chief complaint and predicts a specific disease. Instead of being informed about the medical condition, the user is referred to a nearby appropriate medical specialist. We designed an architecture for MIRA that addresses the limitations of existing virtual medical assistants, such as weak authentication, inability to understand multiple intent statements about a specific medical condition, and uncertain diagnosis recommendations. To implement the designed architecture, we collected the chief complaints along with the dialogue corpora of real patients. We manually validated these data under the supervision of medical specialists and then used them for natural language understanding, disease identification, and appropriate response generation. For the prototype version of MIRA, we considered the cases of glaucoma (an eye disease) and diabetes (an autoimmune disease) only. The performance of MIRA was evaluated in terms of accuracy (89%), precision (90%), sensitivity (89.8%), specificity (94.9%), and F-measure (89.8%). Task completion was calculated using Cohen's kappa (k = 0.848), which categorizes MIRA as 'Almost Perfect'. Furthermore, the voice-based authentication identifies the user effectively and protects against masquerading attacks. At the same time, the user experience shows relatively good results in all aspects based on the User Experience Questionnaire (UEQ) benchmark data. The experimental results show that MIRA efficiently predicts a disease based on chief complaints and supports the user in decision making.

Open Access Article
Assessment of Word-Level Neural Language Models for Sentence Completion
Appl. Sci. 2020, 10(4), 1340; https://doi.org/10.3390/app10041340 - 16 Feb 2020
Abstract
The task of sentence completion, which aims to infer the missing text of a given sentence, was carried out to assess the reading comprehension level of machines as well as humans. In this work, we conducted a comprehensive study of various approaches to sentence completion based on neural language models, which have advanced in recent years. First, we revisited the recurrent neural network language model (RNN LM), achieving highly competitive results with an appropriate network structure and hyper-parameters. This paper presents a bidirectional version of the RNN LM, which surpassed the previous best results on the Microsoft Research (MSR) Sentence Completion Challenge and the Scholastic Aptitude Test (SAT) sentence completion questions. In parallel with directly applying the RNN LM to sentence completion, we also employed a supervised learning framework that fine-tunes a large pre-trained transformer-based LM with a few sentence-completion examples. By fine-tuning a pre-trained BERT model, this work established state-of-the-art results on the MSR and SAT sets. Furthermore, we performed similar experiments on newly collected cloze-style questions in the Korean language. The experimental results reveal that simply applying multilingual BERT models to the Korean dataset was not satisfactory, which leaves room for further research.

Open Access Article
Reliable Classification of FAQs with Spelling Errors Using an Encoder-Decoder Neural Network in Korean
Appl. Sci. 2019, 9(22), 4758; https://doi.org/10.3390/app9224758 - 07 Nov 2019
Cited by 1
Abstract
To resolve lexical disagreement problems between queries and frequently asked questions (FAQs), we propose a reliable sentence classification model based on an encoder-decoder neural network. The proposed model uses three types of word embeddings: fixed word embeddings for representing domain-independent meanings of words, fine-tuned word embeddings for representing domain-specific meanings of words, and character-level word embeddings for bridging lexical gaps caused by spelling errors. It also uses class embeddings to represent the domain knowledge associated with each category. In experiments with an FAQ dataset about online banking, the proposed embedding methods contributed to improved sentence classification performance. In addition, the proposed model showed better performance (an accuracy of 0.810 in the classification of 411 categories) than the comparison model.
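One common way to realize character-level representations that bridge spelling errors is character n-gram counts, sketched below (this is a generic illustration, not the paper's exact embedding): a misspelling shares most of its n-grams with the correct word, so their similarity stays high while unrelated words stay dissimilar.

```python
import numpy as np
from collections import Counter

def char_ngrams(word, n=3):
    """Counts of character n-grams, with boundary markers."""
    padded = f"<{word}>"
    return Counter(padded[i:i + n] for i in range(len(padded) - n + 1))

def char_similarity(a, b, n=3):
    """Cosine similarity between character n-gram count vectors; a
    spelling error changes only a few n-grams, so similarity stays high."""
    ca, cb = char_ngrams(a, n), char_ngrams(b, n)
    keys = sorted(set(ca) | set(cb))
    va = np.array([ca[k] for k in keys], dtype=float)
    vb = np.array([cb[k] for k in keys], dtype=float)
    return float(va @ vb / (np.linalg.norm(va) * np.linalg.norm(vb)))

# a misspelling stays close to the correct word; an unrelated word does not
print(char_similarity("account", "acount"), char_similarity("account", "balance"))
```

A learned character-level embedding plays the same role in the model above: queries with typos still land near the FAQ category their correctly spelled counterparts belong to.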

Open Access Article
A Text Abstraction Summary Model Based on BERT Word Embedding and Reinforcement Learning
Appl. Sci. 2019, 9(21), 4701; https://doi.org/10.3390/app9214701 - 04 Nov 2019
Cited by 1
Abstract
As a core task of natural language processing and information retrieval, automatic text summarization is widely applied in many fields. Two approaches to text summarization currently exist: abstractive and extractive. On this basis, we propose a novel hybrid extractive-abstractive model that combines BERT (Bidirectional Encoder Representations from Transformers) word embedding with reinforcement learning. Firstly, we convert the human-written abstractive summaries into ground-truth labels. Secondly, we use BERT word embeddings as the text representation and pre-train the two sub-models separately. Finally, the extraction network and the abstraction network are bridged by reinforcement learning. To verify the performance of the model, we compare it with currently popular automatic text summarization models on the CNN/Daily Mail dataset, using the ROUGE (Recall-Oriented Understudy for Gisting Evaluation) metrics as the evaluation method. Extensive experimental results show that the accuracy of the model improves noticeably.
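The ROUGE metrics used for evaluation reduce to n-gram overlap counts; a minimal ROUGE-1 sketch is below. This is an illustrative simplification: real evaluations use the official toolkit with stemming, sentence splitting, and multiple references.

```python
from collections import Counter

def rouge_1(candidate: str, reference: str) -> dict:
    """Unigram-overlap ROUGE-1 between a generated and a reference summary."""
    cand = Counter(candidate.lower().split())
    ref = Counter(reference.lower().split())
    overlap = sum((cand & ref).values())        # clipped unigram matches
    recall = overlap / max(sum(ref.values()), 1)
    precision = overlap / max(sum(cand.values()), 1)
    f1 = 2 * precision * recall / max(precision + recall, 1e-9)
    return {"recall": recall, "precision": precision, "f1": f1}

scores = rouge_1("police killed the gunman", "the gunman was shot by police")
```

Here 3 of the 6 reference unigrams appear in the candidate, giving a recall of 0.5 and a precision of 0.75.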

Open AccessArticle
Multi-Turn Chatbot Based on Query-Context Attentions and Dual Wasserstein Generative Adversarial Networks
Appl. Sci. 2019, 9(18), 3908; https://doi.org/10.3390/app9183908 - 18 Sep 2019
Cited by 2
Abstract
To generate proper responses to user queries, multi-turn chatbot models should selectively consider dialogue histories. However, previous chatbot models have simply concatenated or averaged the vector representations of all previous utterances without considering their contextual importance. To mitigate this problem, we propose a multi-turn chatbot model in which previous utterances participate in response generation with different weights. The proposed model calculates the contextual importance of previous utterances using an attention mechanism. In addition, we propose a training method that uses two types of Wasserstein generative adversarial networks to improve the quality of responses. In experiments with the DailyDialog dataset, the proposed model outperformed previous state-of-the-art models on various performance measures.
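The contrast the abstract draws, weighting utterances by relevance instead of averaging them, can be sketched as plain dot-product attention over utterance vectors. The vectors and dimensions below are toy values, not the paper's encoder outputs.

```python
import numpy as np

def attend(query: np.ndarray, history: np.ndarray):
    """Weight previous utterances by relevance to the current query.

    query:   (d,) vector for the current user query
    history: (n, d) matrix, one row per previous utterance
    Returns a context vector (weighted average, not a plain mean) and weights.
    """
    scores = history @ query                     # dot-product relevance
    weights = np.exp(scores - scores.max())      # softmax, numerically stable
    weights /= weights.sum()
    return weights @ history, weights

# Toy history of 4 utterance vectors; the query resembles utterance 2.
history = np.array([[1.0, 0.0, 0.0],
                    [0.0, 1.0, 0.0],
                    [0.0, 0.0, 1.0],
                    [0.5, 0.5, 0.0]])
query = np.array([0.1, 0.1, 1.0])
context, weights = attend(query, history)
```

The attention weights concentrate on utterance 2, so the context vector leans toward the most relevant turn rather than smearing all turns equally.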

Open AccessArticle
A Text-Generated Method to Joint Extraction of Entities and Relations
Appl. Sci. 2019, 9(18), 3795; https://doi.org/10.3390/app9183795 - 10 Sep 2019
Abstract
Entity-relation extraction is a basic task in natural language processing, and recently, deep-learning methods, especially the Long Short-Term Memory (LSTM) network, have achieved remarkable performance. However, most existing entity-relation extraction methods cannot solve the overlapped multi-relation extraction problem, in which one or two entities are shared among multiple relational triples contained in a sentence. In this paper, we propose a text-generation method to solve this overlapping problem in entity-relation extraction. With this method, (1) the entities and their corresponding relations are jointly generated as target texts without any additional feature engineering; and (2) the model directly generates the relational triples using a unified decoding process, and entities can be repeatedly presented in multiple triples to solve the overlapped-relation problem. We conduct experiments on two public datasets, NYT10 and NYT11. The experimental results show that our proposed method outperforms existing work and achieves the best results.
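The key idea, that generation lets an entity recur in several triples, can be illustrated by decoding a generated target text. The "head | relation | tail" segments joined by ";" are an assumed format for illustration, not the paper's exact target-text scheme.

```python
def decode_triples(generated: str):
    """Parse a generated target text into (head, relation, tail) triples.

    Because triples are decoded from free text, the same entity may appear
    in any number of them, which side-steps the overlapped-relation problem
    that tagging-based extractors struggle with.
    """
    triples = []
    for segment in generated.split(";"):
        parts = [p.strip() for p in segment.split("|")]
        if len(parts) == 3 and all(parts):
            triples.append(tuple(parts))
    return triples

out = decode_triples(
    "Barack Obama | born_in | Hawaii ; Barack Obama | president_of | United States"
)
```

Both decoded triples share the head entity, the overlapping case described in the abstract.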

Open AccessArticle
Information Extraction from Electronic Medical Records Using Multitask Recurrent Neural Network with Contextual Word Embedding
Appl. Sci. 2019, 9(18), 3658; https://doi.org/10.3390/app9183658 - 04 Sep 2019
Cited by 1
Abstract
Clinical named entity recognition is an essential task for analyzing large-scale electronic medical records efficiently. Traditional rule-based solutions need considerable human effort to build rules and dictionaries, while machine learning-based solutions need laborious feature engineering. Currently, deep learning solutions such as Long Short-Term Memory with Conditional Random Field (LSTM–CRF) achieve considerable performance on many datasets. In this paper, we developed a multitask attention-based bidirectional LSTM–CRF (Att-biLSTM–CRF) model with pretrained Embeddings from Language Models (ELMo) to achieve better performance. In the multitask system, an additional task, entity discovery, was designed to enhance the model's perception of unknown entities. Experiments were conducted on the 2010 Informatics for Integrating Biology & the Bedside/Veterans Affairs (I2B2/VA) dataset. Experimental results show that our model outperforms the state-of-the-art solution as both a single model and an ensemble model. Our work proposes an approach to improve recall in the clinical named entity recognition task based on the multitask mechanism.
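One plausible reading of the auxiliary entity discovery task is that it asks only "is this token part of some entity?", discarding the entity type; the sketch below derives such binary targets from BIO tags. This is an illustrative assumption about the multitask setup, not the paper's exact formulation.

```python
def entity_discovery_labels(bio_tags):
    """Derive binary 'entity discovery' targets from typed BIO tags.

    The main task predicts the full BIO tag (type included); the auxiliary
    task only has to detect entity spans, which can sharpen the model's
    perception of entities it has never seen with that type.
    """
    return [0 if tag == "O" else 1 for tag in bio_tags]

# Example with I2B2-style entity types (problem, treatment).
bio = ["O", "B-problem", "I-problem", "O", "B-treatment"]
aux = entity_discovery_labels(bio)
```

Both tasks then share the encoder, with separate output layers per task, which is the usual multitask arrangement.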

Review


Open AccessReview
A Review of Text Corpus-Based Tourism Big Data Mining
Appl. Sci. 2019, 9(16), 3300; https://doi.org/10.3390/app9163300 - 12 Aug 2019
Cited by 5
Abstract
With the massive growth of the Internet, text data has become one of the main formats of tourism big data. As an effective means of expressing tourists' opinions, such data gives text mining great potential to inspire innovations for tourism practitioners. In the past decade, a variety of text mining techniques have been proposed and applied to tourism analysis to develop tourism value analysis models, build tourism recommendation systems, create tourist profiles, and make policies for supervising tourism markets. The success of these techniques has been further boosted by progress in natural language processing (NLP), machine learning, and deep learning. Acknowledging the complexity of this diverse set of techniques and tourism text data sources, this work attempts to provide a detailed and up-to-date review of text mining techniques that have been, or have the potential to be, applied to modern tourism big data analysis. We summarize and discuss different text representation strategies; text-based NLP techniques for topic extraction, text classification, sentiment analysis, and text clustering in the context of tourism text mining; and their applications in tourist profiling, destination image analysis, market demand, etc. Our work also provides guidelines for constructing new tourism big data applications and outlines promising research areas in this field for the coming years.
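Among the text representation strategies such a review covers, bag-of-words TF-IDF is the classical baseline; a minimal sketch on toy tourism reviews is below. It omits the smoothing variants found in libraries such as scikit-learn and is for illustration only.

```python
import math
from collections import Counter

def tf_idf(docs):
    """Bag-of-words TF-IDF vectors, one dict of term weights per document."""
    tokenized = [doc.lower().split() for doc in docs]
    n = len(tokenized)
    # Document frequency: number of documents containing each term.
    df = Counter(term for doc in tokenized for term in set(doc))
    vectors = []
    for doc in tokenized:
        tf = Counter(doc)
        vectors.append({t: (c / len(doc)) * math.log(n / df[t])
                        for t, c in tf.items()})
    return vectors

reviews = ["great beach great food", "crowded beach", "quiet museum"]
vecs = tf_idf(reviews)
```

Terms frequent in one review but rare across the corpus ("great", "museum") get high weights, while terms shared by several reviews ("beach") are down-weighted.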

Other


Open AccessLetter
Evolutionary Neural Architecture Search (NAS) Using Chromosome Non-Disjunction for Korean Grammaticality Tasks
Appl. Sci. 2020, 10(10), 3457; https://doi.org/10.3390/app10103457 - 17 May 2020
Abstract
In this paper, we apply the neural architecture search (NAS) method to Korean grammaticality judgment tasks. Since the word order of a language is the final result of complex syntactic operations, a successful neural architecture search on linguistic data suggests that NAS can automate language model design. Although NAS applications to language have been suggested in the literature, we add a novel dataset that contains Korean-specific linguistic operations, which add great complexity to the patterns. The results of the experiment suggest that NAS provides a suitable architecture for the language. Interestingly, NAS suggested an unprecedented structure that would not have been designed manually. The final topology of the architecture is a topic of our future research.
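An evolutionary NAS loop with a non-disjunction-style operator can be sketched as a toy search over layer-width "chromosomes". Everything here is a crude stand-in: the paper evolves full network topologies and evaluates fitness by training on the grammaticality data, whereas this sketch mimics gene gain/loss with list duplication/deletion and uses an arbitrary stand-in objective.

```python
import random

random.seed(0)

def mutate(chromosome):
    """Mutate a list of layer widths; duplication/deletion crudely mimics
    chromosome non-disjunction (gene gain and loss)."""
    child = chromosome[:]
    op = random.choice(["tweak", "duplicate", "delete"])
    i = random.randrange(len(child))
    if op == "tweak":
        child[i] = max(1, child[i] + random.choice([-8, 8]))
    elif op == "duplicate":          # non-disjunction-like gene gain
        child.insert(i, child[i])
    elif len(child) > 1:             # non-disjunction-like gene loss
        del child[i]
    return child

def fitness(chromosome):
    # Stand-in objective: prefer ~3 layers summing to ~96 units.
    # A real NAS would train the candidate network and score its accuracy.
    return -abs(sum(chromosome) - 96) - 4 * abs(len(chromosome) - 3)

population = [[32, 32] for _ in range(8)]
for _ in range(200):                 # elitism: keep top 4, mutate them
    population.sort(key=fitness, reverse=True)
    population = population[:4] + [mutate(c) for c in population[:4]]
best = max(population, key=fitness)
```

Because the top candidates are always retained, the best fitness is monotone non-decreasing over generations.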
