Special Issue "Artificial Intelligence—Methodology, Systems, and Applications"

A special issue of Information (ISSN 2078-2489). This special issue belongs to the section "Artificial Intelligence".

Deadline for manuscript submissions: closed (15 January 2019)

Special Issue Editors

Guest Editor
Assoc. Prof. Gennady Agre

Institute of Information and Communication Technologies at the Bulgarian Academy of Sciences, Bulgaria
Interests: machine learning; data mining; deep learning; semantic web services; e-learning
Guest Editor
Prof. Josef Van Genabith

German Research Center for Artificial Intelligence (DFKI), Saarland University, Germany
Interests: language technology; machine translation; parsing; generation; computer-assisted language learning (CALL); morphology

Special Issue Information

Dear Colleagues,

This Special Issue will present extended versions of selected papers from the 18th International Conference AIMSA 2018—Artificial Intelligence: Methodology, Systems, and Applications. Initiated in 1984, the biennial AIMSA conference is a premier forum for exchanging information and research results on AI theory and principles, along with applications of intelligent system technology. The conference traditionally brings together academic and industrial researchers from all areas of AI to share their ideas and experiences and to learn about contemporary AI research. As its name indicates, the conference is dedicated to Artificial Intelligence in its entirety. For AIMSA 2018, however, we would like to place an emphasis on deep learning, as it has been used successfully in many applications and is currently considered one of the most cutting-edge machine learning and AI techniques. Authors of invited papers should be aware that the final submitted manuscript must provide a minimum of 50% new content and contain no more than 30% text copied from the proceedings paper.

Assoc. Prof. Gennady Agre
Prof. Josef van Genabith
Guest Editors

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to this website. Once you are registered, click here to go to the submission form. Manuscripts can be submitted until the deadline. All papers will be peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the special issue website. Research articles, review articles, and short communications are invited. For planned papers, a title and short abstract (about 100 words) can be sent to the Editorial Office for announcement on this website.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. Information is an international peer-reviewed open access monthly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. Submitted papers should be well formatted and use good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Keywords

  • Deep learning
  • Machine learning
  • Data mining
  • Natural language processing
  • Search
  • Planning
  • Knowledge representation
  • Multi-agent systems
  • Robotics
  • Image processing

Published Papers (9 papers)


Research

Open Access Article: Machine Learning Models for Error Detection in Metagenomics and Polyploid Sequencing Data
Information 2019, 10(3), 110; https://doi.org/10.3390/info10030110
Received: 27 January 2019 / Revised: 6 March 2019 / Accepted: 6 March 2019 / Published: 11 March 2019
Abstract
Metagenomics studies, as well as genomics studies of polyploid species such as wheat, deal with the analysis of high-variation data. Such data contain sequences from similar, but distinct, genetic chains. This fact presents an obstacle to analysis and research. In particular, the detection of instrumentation errors during the digitalization of the sequences may be hindered, as they can be indistinguishable from the real biological variation inside the digital data. This can prevent the determination of the correct sequences and, at the same time, make variant studies significantly more difficult. This paper details a collection of machine learning (ML)-based models used to distinguish a real variant from an erroneous one. The focus is on using these models directly, but experiments are also done in combination with other predictors that isolate a pool of error candidates.
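The paper's actual models are not reproduced in the abstract; as a purely illustrative sketch, the kind of input such an error-versus-variant classifier might consume can be summarized as per-position base statistics from aligned reads, with very rare second alleles treated as candidate instrumentation errors. All names and the 0.2 cutoff below are hypothetical assumptions.

```python
# Illustrative sketch only (not the paper's models): featurize one pileup
# column (the bases aligned at one genomic position) and flag positions
# whose minor-allele frequency is suspiciously low as error candidates.
from collections import Counter

def column_features(bases):
    """Summarize one pileup column of aligned bases."""
    counts = Counter(bases)
    depth = len(bases)
    (_, major), *rest = counts.most_common()
    minor = rest[0][1] if rest else 0
    return {
        "depth": depth,
        "major_freq": major / depth,
        "minor_freq": minor / depth,
        "n_alleles": len(counts),
    }

def is_error_candidate(bases, min_minor_freq=0.2):
    """Naive stand-in for a trained classifier: a very rare second allele
    is more likely a sequencing error than a real variant."""
    feats = column_features(bases)
    return 0 < feats["minor_freq"] < min_minor_freq
```

In the paper this thresholding role is played by learned models; the sketch only shows the shape of the problem (features in, error/variant decision out).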
(This article belongs to the Special Issue Artificial Intelligence—Methodology, Systems, and Applications)

Open Access Article: Detecting Emotions in English and Arabic Tweets
Information 2019, 10(3), 98; https://doi.org/10.3390/info10030098
Received: 21 January 2019 / Revised: 18 February 2019 / Accepted: 28 February 2019 / Published: 6 March 2019
Abstract
Assigning sentiment labels to documents is, at first sight, a standard multi-label classification task. Many approaches have been used for this task, but the current state-of-the-art solutions use deep neural networks (DNNs). As such, it seems likely that these powerful general-purpose algorithms will provide an effective approach. We describe an alternative approach that uses probabilities to construct a weighted lexicon of sentiment terms, then modifies the lexicon and calculates optimal thresholds for each class. We show that this approach outperforms the use of DNNs and other standard algorithms. We believe that DNNs are not a panacea, and that paying attention to the nature of the data that you are trying to learn from can be more important than trying out ever more powerful general-purpose machine learning algorithms.
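A minimal sketch of the general idea (not the authors' exact method, whose lexicon modification and threshold optimization steps the abstract does not detail): estimate P(label | term) from labeled documents to build a weighted lexicon, score a document by summing term weights per class, and accept every class whose score clears a per-class threshold. The names and thresholds are illustrative assumptions.

```python
# Weighted-lexicon multi-label sentiment sketch (illustrative assumptions).
from collections import Counter, defaultdict

def build_lexicon(docs):
    """docs: list of (tokens, labels) pairs -> {term: {label: weight}},
    where weight approximates P(label | term appears in the document)."""
    term_label = defaultdict(Counter)
    term_total = Counter()
    for tokens, labels in docs:
        for tok in set(tokens):
            term_total[tok] += 1
            for lab in labels:
                term_label[tok][lab] += 1
    return {t: {lab: c / term_total[t] for lab, c in labs.items()}
            for t, labs in term_label.items()}

def predict(tokens, lexicon, thresholds):
    """Accept every label whose summed weight clears its threshold."""
    scores = Counter()
    for tok in tokens:
        for lab, w in lexicon.get(tok, {}).items():
            scores[lab] += w
    return {lab for lab, s in scores.items() if s >= thresholds.get(lab, 1.0)}
```

The per-class thresholds would be tuned on held-out data; here they are simply passed in.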

Open Access Article: Word Sense Disambiguation Studio: A Flexible System for WSD Feature Extraction
Information 2019, 10(3), 97; https://doi.org/10.3390/info10030097
Received: 27 January 2019 / Revised: 25 February 2019 / Accepted: 27 February 2019 / Published: 5 March 2019
Abstract
The paper presents a flexible system for extracting features and creating training and test examples for solving the all-words word sense disambiguation (WSD) task. The system allows integrating word and sense embeddings as part of an example description. The system possesses two unique features distinguishing it from similar WSD systems: the ability to construct a special compressed representation for word embeddings, and the ability to construct training and test sets of examples with different data granularity. The first feature allows the generation of data sets with quite small dimensionality, which can be used for training highly accurate classifiers of different types. The second feature allows generating sets of examples that can be used for training classifiers specialized in disambiguating a concrete word, words belonging to the same part-of-speech (POS) category, or all open-class words. Intensive experimentation has shown that classifiers trained on examples created by the system outperform the standard baselines for measuring the behaviour of all-words WSD classifiers.

Open Access Article: A Multilingual and Multidomain Study on Dialog Act Recognition Using Character-Level Tokenization
Information 2019, 10(3), 94; https://doi.org/10.3390/info10030094
Received: 21 January 2019 / Revised: 25 February 2019 / Accepted: 26 February 2019 / Published: 3 March 2019
Abstract
Automatic dialog act recognition is an important step for dialog systems, since it reveals the intention behind the words uttered by its conversational partners. Although most approaches to the task use word-level tokenization, there is information at the sub-word level that is related to the function of the words and, consequently, to their intention. Thus, in this study, we explored the use of character-level tokenization to capture that information. We used multiple character windows of different sizes to capture morphological aspects, such as affixes and lemmas, as well as inter-word information. Furthermore, we assessed the importance of punctuation and capitalization for the task. To broaden the conclusions of our study, we performed experiments on dialogs in three languages—English, Spanish, and German—which have different morphological characteristics. The dialogs cover multiple domains and are annotated with both domain-dependent and domain-independent dialog act labels. The achieved results show not only that the character-level approach leads to similar or better performance than state-of-the-art word-level approaches on the task, but also that the two approaches capture complementary information. Thus, the best results are achieved by combining tokenization at both levels.
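As a hedged sketch of the tokenization idea (not the paper's exact configuration), the multiple character windows described above can be read as character n-grams of several sizes over a padded utterance, so that word-boundary (inter-word) context is also captured. The padding symbol and default window sizes are illustrative assumptions.

```python
# Character-window extraction sketch (illustrative, not the paper's setup).
def char_windows(utterance, sizes=(2, 3)):
    """Return character n-grams for each window size, with '#' padding so
    context at the utterance boundaries is preserved."""
    padded = f"#{utterance}#"
    grams = {}
    for n in sizes:
        grams[n] = [padded[i:i + n] for i in range(len(padded) - n + 1)]
    return grams
```

Note how the bigrams of "a b" include " b" and "b#", which straddle a word boundary and the utterance end: this is the kind of inter-word information a word-level tokenizer discards.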

Open Access Article: Application of Machine Learning Models for Survival Prognosis in Breast Cancer Studies
Information 2019, 10(3), 93; https://doi.org/10.3390/info10030093
Received: 21 January 2019 / Revised: 23 February 2019 / Accepted: 27 February 2019 / Published: 3 March 2019
Abstract
The application of machine learning models for the prediction and prognosis of disease development has become an integral part of cancer studies aimed at improving the subsequent therapy and management of patients. The accurate prediction of survival time in breast cancer on the basis of clinical data, using machine learning models, is the main objective of the presented study. The paper discusses an approach in which the main factor used to predict survival time is an originally developed tumor-integrated clinical feature, which combines tumor stage, tumor size, and age at diagnosis. Two datasets from corresponding breast cancer studies are united by applying a data integration approach based on horizontal and vertical integration, using appropriate document-oriented and graph databases, which show good performance and no data losses. Aside from data normalization and classification, the applied machine learning methods provide promising results in terms of the accuracy of survival-time prediction. The analysis of our experiments shows an advantage for linear Support Vector Regression, Lasso regression, Kernel Ridge regression, K-neighborhood regression, and Decision Tree regression: these models achieve the most accurate survival prognosis results. Cross-validation for accuracy demonstrates the best performance of the same models on the studied breast cancer data. To support the proposed approach, a Python-based workflow has been developed, and plans for its further improvement are discussed at the end of the paper.
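The abstract names the inputs of the tumor-integrated clinical feature (tumor stage, tumor size, and age at diagnosis) but not the combination rule. As a purely hypothetical sketch, one simple way to fuse such variables into a single predictor is to normalize each to [0, 1] and take a weighted sum; the weights and normalization constants below are assumptions, not the paper's values.

```python
# Hypothetical sketch of a combined clinical feature (not the paper's rule).
def tumor_integrated_feature(stage, size_mm, age, w=(0.5, 0.3, 0.2)):
    """Combine three clinical variables into a single score in [0, 1]."""
    stage_n = (stage - 1) / 3          # stages 1-4 mapped to [0, 1]
    size_n = min(size_mm, 100) / 100   # tumor size in mm, capped at 100
    age_n = min(age, 100) / 100        # age in years, capped at 100
    return w[0] * stage_n + w[1] * size_n + w[2] * age_n
```

Such a scalar feature could then be fed to any of the regressors listed in the abstract (SVR, Lasso, Kernel Ridge, and so on) as one input column.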

Open Access Article: Machine Reading Comprehension for Answer Re-Ranking in Customer Support Chatbots
Information 2019, 10(3), 82; https://doi.org/10.3390/info10030082
Received: 21 January 2019 / Revised: 16 February 2019 / Accepted: 19 February 2019 / Published: 26 February 2019
Abstract
Recent advances in deep neural networks, language modeling, and language generation have introduced new ideas to the field of conversational agents. As a result, deep neural models such as sequence-to-sequence, memory networks, and the Transformer have become key ingredients of state-of-the-art dialog systems. While those models are able to generate meaningful responses even in unseen situations, they need large amounts of training data to build a reliable model. Thus, most real-world systems have used traditional approaches based on information retrieval (IR) and even hand-crafted rules, due to their robustness and effectiveness, especially for narrowly focused conversations. Here, we present a method that adapts a deep neural architecture from the domain of machine reading comprehension to re-rank the suggested answers from different models, using the question as a context. We train our model using negative sampling based on question–answer pairs from the Twitter Customer Support Dataset. The experimental results show that our re-ranking framework can improve performance, in terms of both word overlap and semantics, for individual models as well as for model combinations.
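The negative-sampling step mentioned above can be sketched as follows (an illustrative assumption about the data preparation, not the authors' pipeline): each question keeps its true answer as a positive example and is paired with answers drawn from other questions as negatives, yielding labeled (question, answer) pairs for the re-ranker.

```python
# Negative-sampling sketch for re-ranker training data (illustrative).
import random

def make_training_pairs(qa_pairs, n_negatives=2, seed=0):
    """qa_pairs: list of (question, answer) -> list of (q, a, label),
    label 1 for the true answer, 0 for sampled negatives."""
    rng = random.Random(seed)
    answers = [a for _, a in qa_pairs]
    examples = []
    for q, a in qa_pairs:
        examples.append((q, a, 1))
        negatives = [x for x in answers if x != a]
        for neg in rng.sample(negatives, min(n_negatives, len(negatives))):
            examples.append((q, neg, 0))
    return examples
```

A neural re-ranker would then be trained to score the label-1 pairs above the label-0 pairs for the same question.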

Open Access Feature Paper: Evolution, Robustness and Generality of a Team of Simple Agents with Asymmetric Morphology in Predator-Prey Pursuit Problem
Information 2019, 10(2), 72; https://doi.org/10.3390/info10020072
Received: 21 January 2019 / Revised: 12 February 2019 / Accepted: 17 February 2019 / Published: 20 February 2019
Abstract
One of the most desired features of autonomous robotic systems is their ability to accomplish complex tasks with a minimum amount of sensory information. Often, however, the limited amount of information (simplicity of sensors) must be compensated for by more precise and complex control. An optimal tradeoff between the simplicity of sensors and control would result in robots featuring better robustness, higher production throughput, lower production costs, reduced energy consumption, and the potential to be implemented at very small scales. In our work we focus on a society of very simple robots (modeled as agents in a multi-agent system) that feature an "extreme simplicity" of both sensors and control. The agents have a single line-of-sight sensor, two wheels in a differential drive configuration as effectors, and a controller that does not involve any computing, but rather a direct mapping of the currently perceived environmental state into a pair of velocities for the two wheels. We applied genetic algorithms to evolve a mapping that results in effective behavior of the team of predator agents towards the goal of capturing the prey in the predator-prey pursuit problem (PPPP), and demonstrated that simple agents featuring the canonical (straightforward) sensory morphology could hardly solve the PPPP. To enhance the performance of the evolved system of predator agents, we propose an asymmetric morphology featuring an angular offset of the sensor relative to the longitudinal axis. The experimental results show that this change brings a considerable improvement in both the efficiency of evolution and the effectiveness of the evolved capturing behavior of the agents. Finally, we verified that some of the best-evolved behaviors of predators with a sensor offset of 20° are both (i) general, in that they successfully resolve most of the additionally introduced, unforeseen initial situations, and (ii) robust to perception noise, in that they show a limited degradation in the number of successfully solved initial situations.
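The "no computing" controller described above amounts to a lookup table from perceived states to wheel-velocity pairs, and that table is exactly what the genetic algorithm evolves. A minimal sketch follows; the state names and velocity values are hypothetical placeholders, not the evolved values from the paper.

```python
# Direct state-to-velocity mapping controller (hypothetical placeholder
# states and velocities, sketching the idea rather than the evolved genome).
def make_controller(mapping, default=(0.0, 0.0)):
    """mapping: {perceived_state: (v_left, v_right)} -> controller function."""
    def controller(perceived_state):
        return mapping.get(perceived_state, default)
    return controller

# Example "genome": the table a genetic algorithm could evolve.
genome = {
    "prey_ahead": (1.0, 1.0),   # drive straight at the prey
    "prey_left": (0.2, 1.0),    # slow the left wheel to turn left
    "prey_right": (1.0, 0.2),   # slow the right wheel to turn right
    "nothing": (0.6, 1.0),      # wander with a gentle turn
}
controller = make_controller(genome)
```

Evolving such a table means mutating and recombining the velocity entries and selecting genomes whose predator teams capture the prey most reliably.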

Open Access Article: Automatic Acquisition of Annotated Training Corpora for Test-Code Generation
Information 2019, 10(2), 66; https://doi.org/10.3390/info10020066
Received: 21 January 2019 / Revised: 9 February 2019 / Accepted: 13 February 2019 / Published: 17 February 2019
Abstract
Open software repositories make large amounts of source code publicly available. Potentially, this source code could be used as training data to develop new, machine learning-based programming tools. For many applications, however, raw code scraped from online repositories does not constitute an adequate training dataset. Building on the recent and rapid improvements in machine translation (MT), one potentially very interesting application is code generation from natural language descriptions. One of the bottlenecks in developing these MT-inspired systems is the acquisition of the parallel text-code corpora required for training code-generative models. This paper addresses the problem of automatically synthesizing parallel text-code corpora in the software testing domain. Our approach is based on the observation that self-documentation through descriptive method names is widely adopted in test automation, in particular for unit testing. Therefore, we propose synthesizing parallel corpora comprising parsed test function names, serving as code descriptions, aligned with the corresponding function bodies. We present the results of applying one of the state-of-the-art MT methods to such a generated dataset. Our experiments show that a neural MT model trained on our dataset can generate syntactically correct and semantically relevant short Java functions from quasi-natural language descriptions of functionality.
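The corpus-synthesis idea rests on parsing descriptive test method names into quasi-natural-language descriptions. A small sketch of that parsing step follows; the naming conventions handled (camelCase and snake_case with a leading "test" prefix) are common in unit testing but are assumptions here, not the paper's exact parser.

```python
# Sketch: turn a descriptive unit-test method name into a quasi-natural
# language description that can be aligned with the test body.
import re

def name_to_description(method_name):
    """e.g. 'testUserLoginFails' -> 'user login fails'."""
    name = re.sub(r"^test_?", "", method_name)
    # Split camelCase runs (keeping acronyms together) and snake_case parts.
    words = re.findall(r"[A-Z]?[a-z0-9]+|[A-Z]+(?![a-z])", name)
    return " ".join(w.lower() for w in words if w)
```

Each (description, function body) pair produced this way becomes one parallel sentence pair for the MT model.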

Open Access Article: MOLI: Smart Conversation Agent for Mobile Customer Service
Information 2019, 10(2), 63; https://doi.org/10.3390/info10020063
Received: 8 January 2019 / Revised: 30 January 2019 / Accepted: 12 February 2019 / Published: 15 February 2019
Abstract
Human agents in technical customer support provide users with instructional answers to solve tasks that would otherwise require a lot of time, money, energy, and physical effort. Developing a dialogue system in this domain is challenging due to the broad variety of user questions. Moreover, user questions are noisy (containing, for example, spelling mistakes), redundant, and expressed in varied natural language. In this work, we introduce a conversational system, MOLI (the name of our dialogue system), that solves customer questions by providing instructional answers from a knowledge base. Our approach combines models for question-type and intent-category classification with slot filling and a back-end knowledge base for filtering and ranking answers, and uses a dialog framework to actively query the user for missing information. For answer ranking, we find that sequential matching networks and neural multi-perspective sentence similarity networks clearly outperform baseline models, achieving a 43% error reduction. The end-to-end P@1 (precision at top 1) of MOLI was 0.69, and customer satisfaction was 0.73.
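A P@1 figure like the one quoted above is conventionally computed as the fraction of questions for which the top-ranked answer is a correct one. The sketch below shows that standard computation on illustrative data; it is not taken from the MOLI evaluation code.

```python
# Standard P@1 (precision at top 1) computation over ranked answers.
def precision_at_1(ranked_answers, gold):
    """ranked_answers: {question: [answers, best first]};
    gold: {question: set of acceptable answers}."""
    hits = sum(1 for q, ranked in ranked_answers.items()
               if ranked and ranked[0] in gold[q])
    return hits / len(ranked_answers)
```

With this metric, MOLI's reported end-to-end P@1 of 0.69 means the top-ranked answer was correct for 69% of the evaluated questions.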

Information EISSN 2078-2489. Published by MDPI AG, Basel, Switzerland.