Knowledge and Data Engineering

A special issue of Applied Sciences (ISSN 2076-3417). This special issue belongs to the section "Computing and Artificial Intelligence".

Deadline for manuscript submissions: 20 August 2024 | Viewed by 5906

Special Issue Editors


Dr. Irina Trubitsyna
Guest Editor
Department of Informatics, Modeling, Electronics and System Engineering, University of Calabria, 87036 Rende, Italy
Interests: knowledge representation; logic programming; argumentation; incomplete and inconsistent databases

Dr. Reza Shahbazian
Guest Editor
Department of Mechanical, Energy and Management Engineering, University of Calabria, 87036 Rende, Italy
Interests: big data processing; optimization; machine learning; data imputation

Special Issue Information

Dear Colleagues,

As advances in technology and data generation provide new opportunities and challenges, the field of knowledge and data engineering continues to evolve and expand. The incorporation of artificial intelligence (AI) and machine learning (ML) algorithms into knowledge and data engineering has resulted in some exciting new developments.

The goal of this Special Issue is to highlight the most recent research and developments in knowledge and data engineering, with a focus on the integration of AI and ML algorithms, as well as their practical and theoretical advancements that can aid in the resolution of real-world problems in the field.

This Special Issue's scope includes, but is not limited to, knowledge representation and reasoning, knowledge graph construction and analysis, data and knowledge integration, knowledge discovery and data mining, data privacy and security, semantic web and linked data, natural language processing, recommendation systems, decision support systems, and big data processing and analysis.

This Special Issue welcomes both theoretical and practical papers and will gather a diverse range of perspectives from academia and industry, providing a platform for the exchange of ideas and insights on the future of knowledge and data engineering and of AI/ML.

Dr. Irina Trubitsyna
Dr. Reza Shahbazian
Guest Editors

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to this website. Once registered, go to the submission form. Manuscripts can be submitted until the deadline. All submissions that pass pre-check are peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the special issue website. Research articles, review articles, as well as short communications are invited. For planned papers, a title and short abstract (about 100 words) can be sent to the Editorial Office for announcement on this website.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. Applied Sciences is an international peer-reviewed open access semimonthly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 2400 CHF (Swiss Francs). Submitted papers should be well formatted and use good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Keywords

  • artificial intelligence
  • data integration
  • knowledge discovery
  • knowledge representation
  • knowledge graphs
  • decision support
  • data privacy
  • semantic web
  • natural language processing

Published Papers (6 papers)

Research

21 pages, 594 KiB  
Article
Analyzing Data Reduction Techniques: An Experimental Perspective
by Vítor Fernandes, Gonçalo Carvalho, Vasco Pereira and Jorge Bernardino
Appl. Sci. 2024, 14(8), 3436; https://doi.org/10.3390/app14083436 - 18 Apr 2024
Viewed by 245
Abstract
The exponential growth in data generation has become a ubiquitous phenomenon in today’s rapidly growing digital technology. Technological advances and the number of connected devices are the main drivers of this expansion. However, the exponential growth of data presents challenges across different architectures, particularly in terms of inefficient energy consumption, suboptimal bandwidth utilization, and the rapid increase in data stored in cloud environments. Therefore, data reduction techniques are crucial to reduce the amount of data transferred and stored. This paper provides a comprehensive review of various data reduction techniques and introduces a taxonomy to classify these methods based on the type of data loss. The experiments conducted in this study include distinct data types, assessing the performance and applicability of these techniques across different datasets. Full article
(This article belongs to the Special Issue Knowledge and Data Engineering)
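
As a side note for readers new to the topic, the minimal Python sketch below contrasts one lossless and one lossy reduction technique; the synthetic payload, zlib compression, and uniform sampling are illustrative assumptions, not the techniques or datasets evaluated in the paper.

    import zlib

    # Synthetic, highly redundant payload standing in for sensor data.
    data = b"sensor_reading=23.5;" * 500

    # Lossless reduction: compression; the original bytes are fully recoverable.
    compressed = zlib.compress(data, level=9)
    assert zlib.decompress(compressed) == data
    print(f"lossless compression: {len(data)} -> {len(compressed)} bytes")

    # Lossy reduction: keep every 10th byte; far smaller, but information is discarded.
    sampled = data[::10]
    print(f"lossy sampling (10%): {len(data)} -> {len(sampled)} bytes")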

24 pages, 1032 KiB  
Article
Recommendation Algorithm Based on Survival Action Rules
by Marek Hermansa, Marek Sikora, Beata Sikora and Łukasz Wróbel
Appl. Sci. 2024, 14(7), 2939; https://doi.org/10.3390/app14072939 - 30 Mar 2024
Viewed by 426
Abstract
Survival analysis is widely used in fields such as medical research and reliability engineering to analyze data where not all subjects experience the event of interest by the end of the study. It requires dedicated methods capable of handling censored cases. This paper extends the collection of techniques applicable to censored data by introducing a novel algorithm for interpretable recommendations based on a set of survival action rules. Each action rule contains recommendations for changing the values of attributes describing examples. As a result of applying the action rules, an example is moved from a group characterized by a survival curve to another group with a significantly different survival rate. In practice, an example can be covered by several induced rules. To decide which attribute values should be changed, we propose a recommendation algorithm that analyzes all actions suggested by the rules covering the example. The efficiency of the algorithm has been evaluated on several benchmark datasets. We also present a qualitative analysis of the generated recommendations through a case study. The results indicate that the proposed method produces high-quality recommendations and leads to a significant change in the estimated survival time. Full article
(This article belongs to the Special Issue Knowledge and Data Engineering)
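
For intuition only, the following Python sketch shows how an example covered by several action rules might have its recommended attribute changes aggregated; the rule format and the survival-gain scoring are simplifying assumptions, not the authors' algorithm.

    from dataclasses import dataclass

    @dataclass
    class ActionRule:
        premise: dict          # attribute values the rule requires in order to cover an example
        actions: dict          # attribute -> recommended new value
        survival_gain: float   # estimated improvement in expected survival time (assumed score)

        def covers(self, example: dict) -> bool:
            return all(example.get(attr) == value for attr, value in self.premise.items())

    def recommend(example: dict, rules: list) -> dict:
        """Aggregate the actions of all covering rules; keep the best-scoring change per attribute."""
        votes = {}
        for rule in rules:
            if rule.covers(example):
                for attr, new_value in rule.actions.items():
                    votes[(attr, new_value)] = votes.get((attr, new_value), 0.0) + rule.survival_gain
        best, score = {}, {}
        for (attr, value), s in votes.items():
            if s > score.get(attr, float("-inf")):
                best[attr], score[attr] = value, s
        return best

    rules = [
        ActionRule({"treatment": "A"}, {"treatment": "B"}, survival_gain=4.0),
        ActionRule({"smoker": "yes"}, {"smoker": "no"}, survival_gain=7.5),
    ]
    print(recommend({"treatment": "A", "smoker": "yes"}, rules))  # {'treatment': 'B', 'smoker': 'no'}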

27 pages, 6880 KiB  
Article
Using Level-Based Multiple Reasoning in a Web-Based Intelligent System for the Diagnosis of Farmed Fish Diseases
by Konstantinos Kovas, Ioannis Hatzilygeroudis, Konstantinos Dimitropoulos, Georgios Spiliopoulos, Konstantinos Poulos, Evi Abatzidou, Theofanis Aravanis, Aristeidis Ilias, Grigorios Kanlis and John A. Theodorou
Appl. Sci. 2023, 13(24), 13059; https://doi.org/10.3390/app132413059 - 07 Dec 2023
Viewed by 937
Abstract
Farmed fish disease diagnosis is an important problem in the fish farming industry, affecting the quality of production and causing financial losses. In this paper, we present a web-based intelligent system that tackles the problem of fish disease diagnosis. To this end, it uses multiple knowledge representation and reasoning methods: rule-based, case-based, weight-based, and voting. Knowledge, which concerns the diagnosis of sea bass diseases, was acquired from experts in the field and represented in the form of decision trees. The diagnostic process is performed in two stages: a general one and a specialized one. In the general stage, a level-based diagnosis is performed, where environmental parameters, external signs, and internal signs are successively examined, and the three most probable diseases are identified. In the specialized stage, which is optional, a specialized expert system is used for each of the resulting diseases, where additional parameters concerning laboratory tests (microbiological, microscopic, molecular, and chemical) are considered. The general stage is the most useful, given that it can be performed on-site in real-time, whereas the specialized one requires time-consuming lab tests. The system also provides explanations for its decisions. Evaluation of the general-stage diagnostic process showed a top-3 accuracy of 78.79% on expert test cases and 94% on an artificial dataset. Full article
(This article belongs to the Special Issue Knowledge and Data Engineering)
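
To illustrate the flavour of level-based, weight-based reasoning, the Python sketch below scores candidate diseases from weighted signs observed at the environmental, external, and internal levels; the diseases, signs, and weights are hypothetical and do not come from the system's knowledge base.

    # Hypothetical stand-in for expert knowledge: per-disease sign weights at each level.
    KNOWLEDGE = {
        "vibriosis": {
            "environment": {"high_temperature": 0.4},
            "external":    {"skin_ulcers": 0.8},
            "internal":    {"enlarged_spleen": 0.6},
        },
        "pasteurellosis": {
            "environment": {"high_temperature": 0.5},
            "external":    {"lethargy": 0.3},
            "internal":    {"white_nodules": 0.9},
        },
    }

    # Later levels (internal signs) carry more weight than earlier ones (environment).
    LEVEL_WEIGHTS = {"environment": 1.0, "external": 2.0, "internal": 3.0}

    def diagnose(observations: dict, top_k: int = 3) -> list:
        """Score every disease by its weighted matching signs and return the top-k candidates."""
        scores = {}
        for disease, levels in KNOWLEDGE.items():
            total = 0.0
            for level, signs in levels.items():
                for sign, weight in signs.items():
                    if sign in observations.get(level, set()):
                        total += LEVEL_WEIGHTS[level] * weight
            scores[disease] = total
        return sorted(scores.items(), key=lambda item: item[1], reverse=True)[:top_k]

    observed = {"environment": {"high_temperature"}, "external": {"skin_ulcers"}}
    print(diagnose(observed))  # vibriosis should rank first on these observations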

24 pages, 4084 KiB  
Article
SKATEBOARD: Semantic Knowledge Advanced Tool for Extraction, Browsing, Organisation, Annotation, Retrieval, and Discovery
by Eleonora Bernasconi, Davide Di Pierro, Domenico Redavid and Stefano Ferilli
Appl. Sci. 2023, 13(21), 11782; https://doi.org/10.3390/app132111782 - 27 Oct 2023
Cited by 1 | Viewed by 1033
Abstract
This paper introduces Semantic Knowledge Advanced Tool for Extraction Browsing Organisation Annotation Retrieval and Discovery (SKATEBOARD), a tool designed to facilitate knowledge exploration through the application of semantic technologies. In the current era, characterised by abundant information, the demand for advanced solutions that streamline Knowledge Extraction, management, and visualisation has grown substantially. Graph-based representations have emerged as a robust approach for uncovering intricate data relationships, complementing the capabilities offered by AI models. Acknowledging the transparency and user control challenges faced by AI-driven solutions, SKATEBOARD offers a comprehensive framework encompassing Knowledge Extraction, ontology development, management, and interactive exploration. By adhering to Linked Data principles and adopting graph-based exploration, SKATEBOARD provides users with a clear view of data relationships and dependencies. Furthermore, it integrates recommendation systems and reasoning capabilities to augment the knowledge discovery process, thus introducing a serendipity effect generated by exploration of the SKATEBOARD interface. This paper elucidates SKATEBOARD’s functionalities while emphasising its user-centric design. After reviewing related research, we provide an overview of the SKATEBOARD pipeline, demonstrating its capacity to bridge RDF and LPG representations. Subsequent sections delve into Knowledge Extraction and exploration, culminating in the evaluation of the tool. SKATEBOARD empowers users to make informed decisions and uncover valuable insights within their data domains, with the added dimension of serendipitous discoveries facilitated by its interface exploration capabilities. Full article
(This article belongs to the Special Issue Knowledge and Data Engineering)
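
As a rough illustration of bridging RDF and LPG views of the same facts, the Python sketch below projects a few RDF triples onto a property-graph-style structure; rdflib and the example triples are assumptions made for illustration, not SKATEBOARD's actual stack.

    from rdflib import Graph, Literal, Namespace

    EX = Namespace("http://example.org/")
    rdf = Graph()
    rdf.add((EX.Dante, EX.wrote, EX.DivineComedy))
    rdf.add((EX.DivineComedy, EX.title, Literal("Divina Commedia")))

    # Project the triples onto a simple LPG view: nodes with properties plus typed edges.
    nodes, edges = {}, []
    for subject, predicate, obj in rdf:
        nodes.setdefault(str(subject), {})
        if isinstance(obj, Literal):
            nodes[str(subject)][str(predicate)] = str(obj)          # literals become node properties
        else:
            nodes.setdefault(str(obj), {})
            edges.append((str(subject), str(predicate), str(obj)))  # IRIs become relationships

    print(nodes)
    print(edges)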

18 pages, 2572 KiB  
Article
A Domain-Oriented Entity Alignment Approach Based on Filtering Multi-Type Graph Neural Networks
by Yaoli Xu, Jinjun Zhong, Suzhi Zhang, Chenglin Li, Pu Li, Yanbu Guo, Yuhua Li, Hui Liang and Yazhou Zhang
Appl. Sci. 2023, 13(16), 9237; https://doi.org/10.3390/app13169237 - 14 Aug 2023
Viewed by 936
Abstract
Owing to the heterogeneity and incomplete information present in various domain knowledge graphs, the alignment of distinct source entities that represent an identical real-world entity becomes imperative. Existing methods focus on cross-lingual knowledge graph alignment, and assume that the entities of knowledge graphs in the same language are unique. However, due to the ambiguity of language, heterogeneous knowledge graphs in the same language are often duplicated, and relationship triples are far fewer than in cross-lingual knowledge graphs. Moreover, existing methods rarely exclude noisy entities in the process of alignment. These issues make it impossible for existing methods to deal effectively with the entity alignment of domain knowledge graphs. In order to address these issues, we propose a novel entity alignment approach based on domain-oriented embedded representation (DomainEA). Firstly, a filtering mechanism employs the language model to extract the semantic features of entities and to exclude noisy entities for each entity. Secondly, a Structural Aggregator (SA) incorporates multiple hidden layers to generate high-order neighborhood-aware embeddings of entities that have few relationship connections. An Attribute Aggregator (AA) introduces self-attention to dynamically calculate weights that represent the importance of the attribute values of the entities. Finally, the approach calculates a transformation matrix to map the embeddings of distinct domain knowledge graphs onto a unified space, and matches entities via the joint embeddings of the SA and AA. Compared to six state-of-the-art methods, our experimental results on multiple food datasets show the following: (i) Our approach achieves an average improvement of 6.9% on MRR. (ii) The size of the dataset has a subtle influence on our approach; there is a positive correlation between the expansion of the dataset size and an improvement in most of the metrics. (iii) We can achieve a significant improvement in the level of recall by employing a filtering mechanism that is limited to the top-100 nearest entities as the candidate pairs. Full article
(This article belongs to the Special Issue Knowledge and Data Engineering)
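
The candidate-filtering step can be pictured with the short Python sketch below, which keeps the top-k nearest target entities for each source entity by cosine similarity over embeddings; the random vectors stand in for language-model features, and the code is an illustration rather than the DomainEA implementation.

    import numpy as np

    def top_k_candidates(src_emb: np.ndarray, tgt_emb: np.ndarray, k: int = 100) -> np.ndarray:
        """For each source entity, return the indices of its k most similar target entities."""
        src = src_emb / np.linalg.norm(src_emb, axis=1, keepdims=True)
        tgt = tgt_emb / np.linalg.norm(tgt_emb, axis=1, keepdims=True)
        similarity = src @ tgt.T                       # cosine similarity matrix
        k = min(k, tgt_emb.shape[0])
        return np.argsort(-similarity, axis=1)[:, :k]  # best-first candidate indices

    # Random vectors stand in for language-model embeddings of entity names and attributes.
    rng = np.random.default_rng(0)
    source, target = rng.normal(size=(5, 32)), rng.normal(size=(200, 32))
    print(top_k_candidates(source, target, k=3))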

17 pages, 2481 KiB  
Article
Detection of Software Security Weaknesses Using Cross-Language Source Code Representation (CLaSCoRe)
by Sergiu Zaharia, Traian Rebedea and Stefan Trausan-Matu
Appl. Sci. 2023, 13(13), 7871; https://doi.org/10.3390/app13137871 - 04 Jul 2023
Viewed by 1291
Abstract
The research presented in the paper aims at increasing the capacity to identify security weaknesses in programming languages that are less supported by specialized security analysis tools, based on the knowledge gathered from securing the popular ones, for which security experts, scanners, and labeled datasets are, in general, available. This goal is vital in reducing the overall exposure of software applications. We propose a solution to expand the capabilities of security gaps detection to downstream languages, influenced by their more popular “ancestors” from the programming languages’ evolutionary tree, using language keyword tokenization and clustering based on word embedding techniques. We show that after training a machine learning algorithm on C, C++, and Java applications developed by a community of programmers with similar behavior of writing code, we can detect, with acceptable accuracy, similar vulnerabilities in C# source code written by the same community. To achieve this, we propose a core cross-language representation of source code, optimized for security weaknesses classifiers, named CLaSCoRe. Using this method, we can achieve zero-shot vulnerability detection—in our case, without using any training data with C# source code. Full article
(This article belongs to the Special Issue Knowledge and Data Engineering)
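
To convey the cross-language idea, the Python sketch below maps language-specific keywords to shared clusters, represents snippets as cluster histograms, and classifies an unseen C# snippet from C/C++/Java examples; the clusters, snippets, and nearest-neighbour classifier are toy assumptions, not CLaSCoRe itself.

    import re
    from collections import Counter

    # Assumed keyword clusters shared across languages: a stand-in for the
    # word-embedding-based clustering described in the paper.
    CLUSTERS = {
        "alloc":  {"malloc", "calloc", "new"},
        "copy":   {"strcpy", "memcpy", "Copy", "arraycopy"},
        "loop":   {"for", "while", "foreach"},
        "branch": {"if", "else", "switch"},
    }

    def featurize(code: str) -> tuple:
        """Represent a snippet as a histogram over the shared keyword clusters."""
        tokens = re.findall(r"[A-Za-z_]\w*", code)
        counts = Counter()
        for token in tokens:
            for cluster, members in CLUSTERS.items():
                if token in members:
                    counts[cluster] += 1
        return tuple(counts[c] for c in CLUSTERS)

    # Toy training snippets in C/C++/Java (1 = weakness present, 0 = safe).
    train = [
        ("char d[8]; strcpy(d, src);", 1),
        ("for (int i = 0; i < n; i++) sum += a[i];", 0),
        ("int *p = new int[8]; memcpy(p, src, 64);", 1),
        ("if (x > 0) { y = x; }", 0),
    ]

    def predict(code: str) -> int:
        """1-nearest-neighbour over cluster histograms, standing in for the trained classifier."""
        features = featurize(code)
        distance = lambda a, b: sum((x - y) ** 2 for x, y in zip(a, b))
        _, label = min((distance(features, featurize(c)), lbl) for c, lbl in train)
        return label

    # Zero-shot use on C#: the shared cluster vocabulary also covers Array.Copy.
    print(predict("Array.Copy(src, dst, 64);"))  # expected: 1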
