Search Results (16)

Search Parameters:
Keywords = cypher language

47 pages, 3137 KB  
Article
DietQA: A Comprehensive Framework for Personalized Multi-Diet Recipe Retrieval Using Knowledge Graphs, Retrieval-Augmented Generation, and Large Language Models
by Ioannis Tsampos and Emmanouil Marakakis
Computers 2025, 14(10), 412; https://doi.org/10.3390/computers14100412 - 29 Sep 2025
Viewed by 513
Abstract
Recipes available on the web often lack nutritional transparency and clear indicators of dietary suitability. While searching by title is straightforward, exploring recipes that meet combined dietary needs, nutritional goals, and ingredient-level preferences remains challenging. Most existing recipe search systems do not effectively support flexible multi-dietary reasoning in combination with user preferences and restrictions. For example, users may seek gluten-free and dairy-free dinners with suitable substitutions, or compound goals such as vegan and low-fat desserts. Recent systematic reviews report that most food recommender systems are content-based and often non-personalized, with limited support for dietary restrictions, ingredient-level exclusions, and multi-criteria nutrition goals. This paper introduces DietQA, an end-to-end, language-adaptable chatbot system that integrates a Knowledge Graph (KG), Retrieval-Augmented Generation (RAG), and a Large Language Model (LLM) to support personalized, dietary-aware recipe search and question answering. DietQA crawls Greek-language recipe websites to extract structured information such as titles, ingredients, and quantities. Nutritional values are calculated using validated food composition databases, and dietary tags are inferred automatically based on ingredient composition. All information is stored in a Neo4j-based knowledge graph, enabling flexible querying via Cypher. Users interact with the system through a friendly natural-language chatbot interface, where they can express preferences for ingredients, nutrients, dishes, and diets, and filter recipes based on multiple factors such as ingredient availability, exclusions, and nutritional goals. DietQA supports multi-diet recipe search by retrieving both compliant recipes and those adaptable via ingredient substitutions, explaining how each result aligns with user preferences and constraints.
An LLM extracts intents and entities from user queries to support rule-based Cypher retrieval, while the RAG pipeline generates contextualized responses using the user query and preferences, retrieved recipes, statistical summaries, and substitution logic. The system integrates real-time updates of recipe and nutritional data, supporting up-to-date, relevant, and personalized recommendations. It is designed for language-adaptable deployment and has been developed and evaluated using Greek-language content. DietQA provides a scalable framework for transparent and adaptive dietary recommendation systems powered by conversational AI. Full article
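The flexible, multi-constraint Cypher filtering described in this abstract can be sketched as a query builder. This is a minimal illustration under assumed names: the labels and properties (`Recipe`, `dietTags`, `kcal`, `HAS_INGREDIENT`) are hypothetical, not DietQA's actual schema.

```python
# Sketch: assemble a parameterized Cypher query combining diet tags,
# ingredient exclusions, and a nutritional cap. Schema names are invented.

def build_recipe_query(diets, excluded_ingredients=None, max_kcal=None):
    """Assemble a Cypher query string and its parameter map."""
    where = ["ALL(d IN $diets WHERE d IN r.dietTags)"]
    params = {"diets": diets}
    if excluded_ingredients:
        # Pattern comprehension collects ingredient names of the recipe.
        where.append("NONE(i IN [(r)-[:HAS_INGREDIENT]->(x) | x.name] "
                     "WHERE i IN $excluded)")
        params["excluded"] = excluded_ingredients
    if max_kcal is not None:
        where.append("r.kcal <= $maxKcal")
        params["maxKcal"] = max_kcal
    query = ("MATCH (r:Recipe)\nWHERE " + " AND ".join(where) +
             "\nRETURN r.title AS title, r.kcal AS kcal")
    return query, params

query, params = build_recipe_query(["vegan", "gluten-free"], ["soy"], max_kcal=500)
```

Parameterizing values (rather than interpolating them into the query text) is the idiomatic way to pass user preferences to Neo4j.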

24 pages, 3121 KB  
Article
SG-RAG MOT: SubGraph Retrieval Augmented Generation with Merging and Ordering Triplets for Knowledge Graph Multi-Hop Question Answering
by Ahmmad O. M. Saleh, Gokhan Tur and Yucel Saygin
Mach. Learn. Knowl. Extr. 2025, 7(3), 74; https://doi.org/10.3390/make7030074 - 1 Aug 2025
Viewed by 1489
Abstract
Large language models (LLMs) often tend to hallucinate, especially in domain-specific tasks and tasks that require reasoning. Previously, we introduced SubGraph Retrieval Augmented Generation (SG-RAG) as a novel Graph RAG method for multi-hop question answering. SG-RAG leverages Cypher queries to search a given knowledge graph and retrieve the subgraph necessary to answer the question. The results from our previous work showed the higher performance of our method compared to the traditional Retrieval Augmented Generation (RAG). In this work, we further enhanced SG-RAG by proposing an additional step called Merging and Ordering Triplets (MOT). The new MOT step seeks to decrease the redundancy in the retrieved triplets by applying hierarchical merging to the retrieved subgraphs. Moreover, it provides an ordering among the triplets using the Breadth-First Search (BFS) traversal algorithm. We conducted experiments on the MetaQA benchmark, which was proposed for multi-hop question-answering in the movies domain. Our experiments showed that SG-RAG MOT provided more accurate answers than Chain-of-Thought and Graph Chain-of-Thought. We also found that merging (up to a certain point) highly overlapping subgraphs and defining an order among the triplets helped the LLM to generate more precise answers. Full article
(This article belongs to the Special Issue Knowledge Graphs and Large Language Models)
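The MOT step can be sketched in a few lines: pool the de-duplicated triplets from overlapping subgraphs, then order them with a breadth-first traversal from the question's topic entity. The movie triplets below are invented for illustration, and the paper's merging is hierarchical rather than this simple de-duplication.

```python
# Sketch of Merging and Ordering Triplets (MOT): merge subgraphs by
# de-duplicating triplets, then BFS-order them from a start entity.
from collections import deque

def merge_and_order(subgraphs, start):
    triplets, seen = [], set()
    for sg in subgraphs:                 # merge: drop redundant triplets
        for t in sg:
            if t not in seen:
                seen.add(t)
                triplets.append(t)
    adjacency = {}
    for s, r, o in triplets:
        adjacency.setdefault(s, []).append((s, r, o))
    ordered, visited, queue = [], {start}, deque([start])
    while queue:                         # order: BFS from the topic entity
        node = queue.popleft()
        for s, r, o in adjacency.get(node, []):
            ordered.append((s, r, o))
            if o not in visited:
                visited.add(o)
                queue.append(o)
    return ordered

sg1 = [("Inception", "directed_by", "Nolan"), ("Nolan", "born_in", "London")]
sg2 = [("Inception", "directed_by", "Nolan")]
ordered = merge_and_order([sg1, sg2], "Inception")
```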

19 pages, 460 KB  
Article
Refining Text2Cypher on Small Language Model with Reinforcement Learning Leveraging Semantic Information
by Quoc-Bao-Huy Tran, Aagha Abdul Waheed, Syed Mudasir and Sun-Tae Chung
Appl. Sci. 2025, 15(15), 8206; https://doi.org/10.3390/app15158206 - 23 Jul 2025
Viewed by 947
Abstract
Text2Cypher is a text-to-text task that converts natural language questions into Cypher queries. Recent research by Neo4j on Text2Cypher demonstrates that fine-tuning a baseline language model (a pretrained and instruction-tuned generative model) using a comprehensive Text2Cypher dataset can effectively enhance query generation performance. However, the improvement is still insufficient for effectively learning the syntax and semantics of complex natural texts, particularly when applied to unseen Cypher schema structures across diverse domains during training. To address this challenge, we propose a novel refinement training method based on baseline language models, employing reinforcement learning with Group Relative Policy Optimization (GRPO). This method leverages extracted semantic information, such as key-value properties and triple relationships from input texts during the training process. Experimental results of the proposed refinement training method applied to a small-scale baseline language model (SLM) like Qwen2.5-3B-Instruct demonstrate that it achieves competitive execution accuracy scores on unseen schemas across various domains. Furthermore, the proposed method significantly outperforms most baseline LMs with larger parameter sizes in terms of Google-BLEU and execution accuracy scores over Neo4j’s comprehensive Text2Cypher dataset, with the exception of colossal LLMs such as GPT-4o, GPT-4o-mini, and Gemini. Full article
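At the heart of GRPO is a group-relative advantage: each sampled candidate in a group is scored, and its advantage is its reward normalized by the group's mean and standard deviation. The sketch below shows only that normalization step; the full method also involves clipped policy ratios and a KL penalty, which are omitted here.

```python
# Sketch of GRPO's group-relative advantage: normalize each candidate's
# reward by the group mean and (population) standard deviation.
import statistics

def group_relative_advantages(rewards):
    mean = statistics.mean(rewards)
    std = statistics.pstdev(rewards) or 1.0   # avoid division by zero
    return [(r - mean) / std for r in rewards]

# Four sampled Cypher candidates with toy rewards (e.g., execution accuracy).
adv = group_relative_advantages([1.0, 0.0, 0.5, 0.5])
```

Candidates above the group mean get positive advantages and are reinforced; those below get negative ones.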

24 pages, 3832 KB  
Article
Stitching History into Semantics: LLM-Supported Knowledge Graph Engineering for 19th-Century Greek Bookbinding
by Dimitrios Doumanas, Efthalia Ntalouka, Costas Vassilakis, Manolis Wallace and Konstantinos Kotis
Mach. Learn. Knowl. Extr. 2025, 7(3), 59; https://doi.org/10.3390/make7030059 - 24 Jun 2025
Viewed by 1279
Abstract
Preserving cultural heritage can be efficiently supported by structured and semantic representation of historical artifacts. Bookbinding, a critical aspect of book history, provides valuable insights into past craftsmanship, material use, and conservation practices. However, existing bibliographic records often lack the depth needed to analyze bookbinding techniques, provenance, and preservation status. This paper presents a proof-of-concept system that explores how Large Language Models (LLMs) can support knowledge graph engineering within the context of 19th-century Greek bookbinding (1830–1900), and as a result, generate a domain-specific ontology and a knowledge graph. Our ontology encapsulates materials, binding techniques, artistic styles, and conservation history, integrating metadata standards like MARC and Dublin Core to ensure interoperability with existing library and archival systems. To validate its effectiveness, we construct a Neo4j knowledge graph, based on the generated ontology and utilize Cypher Queries—including LLM-generated queries—to extract insights about bookbinding practices and trends. This study also explores how semantic reasoning over the knowledge graph can identify historical binding patterns, assess book conservation needs, and infer relationships between bookbinding workshops. Unlike previous bibliographic ontologies, our approach provides a comprehensive, semantically rich representation of bookbinding history, methods and techniques, supporting scholars, conservators, and cultural heritage institutions. By demonstrating how LLMs can assist in ontology/KG creation and query generation, we introduce and evaluate a semi-automated pipeline as a methodological demonstration for studying historical bookbinding, contributing to digital humanities, book conservation, and cultural informatics. Finally, the proposed approach can be used in other domains, thus, being generally applicable in knowledge engineering. Full article
(This article belongs to the Special Issue Knowledge Graphs and Large Language Models)
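The kind of Cypher lookup the paper runs over its Neo4j graph can be illustrated with a toy triple store. The labels and relation names (`Binding`, `USES_MATERIAL`, `PRODUCED_BY`) are hypothetical stand-ins, not the paper's ontology.

```python
# Illustrative sketch only: a toy triple store and an equivalent Cypher
# query for "which bindings use a given material". Names are invented.
TRIPLES = [
    ("binding1", "USES_MATERIAL", "leather"),
    ("binding1", "PRODUCED_BY", "workshopA"),
    ("binding2", "USES_MATERIAL", "cloth"),
    ("binding2", "PRODUCED_BY", "workshopA"),
]

CYPHER = ("MATCH (b:Binding)-[:USES_MATERIAL]->"
          "(m:Material {name: $material}) RETURN b")

def bindings_using(material):
    return sorted(s for s, p, o in TRIPLES
                  if p == "USES_MATERIAL" and o == material)

result = bindings_using("leather")
```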

33 pages, 2131 KB  
Article
Domain- and Language-Adaptable Natural Language Interface for Property Graphs
by Ioannis Tsampos and Emmanouil Marakakis
Computers 2025, 14(5), 183; https://doi.org/10.3390/computers14050183 - 9 May 2025
Viewed by 1301
Abstract
Despite the growing adoption of Property Graph Databases, like Neo4j, interacting with them remains difficult for non-technical users due to the reliance on formal query languages. Natural Language Interfaces (NLIs) address this by translating natural language (NL) into Cypher. However, existing solutions are typically limited to high-resource languages; are difficult to adapt to evolving domains with limited annotated data; and often depend on Machine Learning (ML) approaches, including Large Language Models (LLMs), that demand substantial computational resources and advanced expertise for training and maintenance. We address these limitations by introducing a novel dependency-based, training-free, schema-agnostic Natural Language Interface (NLI) that converts NL queries into Cypher for querying Property Graphs. Our system employs a modular pipeline integrating entity and relationship extraction, Named Entity Recognition (NER), semantic mapping, triple creation via syntactic dependencies, and validation against an automatically extracted Schema Graph. The distinctive feature of this approach is the reduction in candidate entity pairs using syntactic analysis and schema validation, eliminating the need for candidate query generation and ranking. The schema-agnostic design enables adaptation across domains and languages. Our system supports single- and multi-hop queries, conjunctions, comparisons, aggregations, and complex questions through an explainable process. Evaluations on real-world queries demonstrate reliable translation results. Full article
(This article belongs to the Special Issue Natural Language Processing (NLP) and Large Language Modelling)
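The pruning idea this abstract describes (validate candidate triples against the extracted Schema Graph instead of ranking candidate queries) can be sketched directly. The schema entries below are an invented movie example, not the paper's data.

```python
# Sketch: keep only candidate (subject label, relation, object label)
# triples that the Schema Graph allows. Example schema is invented.
SCHEMA = {
    ("Person", "ACTED_IN", "Movie"),
    ("Person", "DIRECTED", "Movie"),
    ("Movie", "IN_GENRE", "Genre"),
}

def validate_triples(candidates, schema=SCHEMA):
    """Drop candidates whose label/relation combination is not in the schema."""
    return [t for t in candidates if t in schema]

kept = validate_triples([
    ("Person", "ACTED_IN", "Movie"),
    ("Movie", "ACTED_IN", "Person"),   # wrong direction: pruned
])
```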

19 pages, 18858 KB  
Article
PIDQA—Question Answering on Piping and Instrumentation Diagrams
by Mohit Gupta, Chialing Wei, Thomas Czerniawski and Ricardo Eiris
Mach. Learn. Knowl. Extr. 2025, 7(2), 39; https://doi.org/10.3390/make7020039 - 21 Apr 2025
Viewed by 3821
Abstract
This paper introduces a novel framework enabling natural language question answering on Piping and Instrumentation Diagrams (P&IDs), addressing a critical gap between engineering design documentation and intuitive information retrieval. Our approach transforms static P&IDs into queryable knowledge bases through a three-stage pipeline. First, we recognize entities in a P&ID image and organize their relationships to form a base entity graph. Second, this entity graph is converted into a Labeled Property Graph (LPG), enriched with semantic attributes for nodes and edges. Third, a Large Language Model (LLM)-based information retrieval system translates a user query into a graph query language (Cypher) and retrieves the answer by executing it on LPG. For our experiments, we augmented a publicly available P&ID image dataset with our novel PIDQA dataset, which comprises 64,000 question–answer pairs spanning four categories: (I) simple counting, (II) spatial counting, (III) spatial connections, and (IV) value-based questions. Our experiments (using gpt-3.5-turbo) demonstrate that grounding the LLM with dynamic few-shot sampling robustly elevates accuracy by 10.6–43.5% over schema contextualization alone, even under high lexical diversity conditions (e.g., paraphrasing, ambiguity). By reducing barriers in retrieving P&ID data, this work advances human–AI collaboration for industrial workflows in design validation and safety audits. Full article
(This article belongs to the Section Visualization)
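Dynamic few-shot sampling, which the experiments show grounds the LLM effectively, amounts to retrieving the exemplar question/Cypher pairs most similar to the incoming query. The sketch below uses plain token-overlap Jaccard similarity as the retrieval score; the paper's actual sampling method and exemplars are not specified here, so everything below is illustrative.

```python
# Sketch of dynamic few-shot sampling: rank exemplar QA pairs by
# token-overlap Jaccard similarity to the user query, keep the top k.
def jaccard(a, b):
    sa, sb = set(a.lower().split()), set(b.lower().split())
    return len(sa & sb) / len(sa | sb)

def select_few_shots(query, exemplars, k=2):
    return sorted(exemplars, key=lambda ex: jaccard(query, ex["q"]),
                  reverse=True)[:k]

EXEMPLARS = [  # invented P&ID-style exemplars
    {"q": "how many valves are there",
     "cypher": "MATCH (v:Valve) RETURN count(v)"},
    {"q": "how many pumps are there",
     "cypher": "MATCH (p:Pump) RETURN count(p)"},
    {"q": "what is connected to tank T1",
     "cypher": "MATCH (t:Tank {id:'T1'})--(x) RETURN x"},
]
shots = select_few_shots("how many valves are in the diagram", EXEMPLARS, k=1)
```

The selected pairs are then placed in the LLM prompt alongside the schema context.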

28 pages, 10564 KB  
Article
Aging-Friendly Design Research: Knowledge Graph Construction for Elderly Advantage Applications
by Xiaoying Li, Xingda Wang and Guangran Li
Appl. Sci. 2025, 15(5), 2848; https://doi.org/10.3390/app15052848 - 6 Mar 2025
Cited by 1 | Viewed by 1645
Abstract
In the field of aging design, obtaining elderly advantage data is a challenge. In this study, we developed a visualization tool using knowledge graph technology to assist designers in studying elderly advantages, promoting their application in design practice. First, brainstorming sessions and workshops were held to analyze the challenges of applying elderly advantages in design. Based on these challenges, the concept and functional design of an elderly advantages knowledge graph were proposed. Next, the elderly advantages knowledge graph was constructed by following these steps: (1) The KJ-AHP method was used to process raw data, making them structured and quantitative. (2) The ontology of the knowledge graph was reverse-engineered based on the functional requirements of the graph, allowing the construction of the knowledge graph model layer. (3) The processed data were applied to the knowledge graph ontology through AHP-ontology mapping rules, allowing the knowledge content construction. (4) The programming language Cypher was used for the functional verification of the elderly advantages knowledge graph, and a satisfaction survey was conducted through questionnaires to assess the verification process. The elderly advantages knowledge graph constructed in this study initially fulfilled the expected functions and was met with high satisfaction. The application of knowledge graph technology provides a new reference for advantage mining in the design field. Based on the innovative combination of KJ-AHP and knowledge graph technology, this study enhances the structuring and quantification of graph data, significantly facilitating designers’ understanding of data structures, clarifying data relationships, and expanding design thinking. Full article
(This article belongs to the Special Issue Knowledge Graphs: State-of-the-Art and Applications)
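The AHP half of the KJ-AHP step derives priority weights from a pairwise comparison matrix. A standard approximation, sketched below, averages the normalized columns; the paper's exact computation and criteria are not given here, so the two-criterion example is invented.

```python
# Sketch of AHP priority weights: normalize each column of the pairwise
# comparison matrix, then average across rows.
def ahp_weights(matrix):
    n = len(matrix)
    col_sums = [sum(row[j] for row in matrix) for j in range(n)]
    normalized = [[matrix[i][j] / col_sums[j] for j in range(n)]
                  for i in range(n)]
    return [sum(normalized[i]) / n for i in range(n)]

# Example: criterion A judged 3x as important as criterion B.
w = ahp_weights([[1, 3],
                 [1 / 3, 1]])
```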

21 pages, 2494 KB  
Article
A DeBERTa-Based Semantic Conversion Model for Spatiotemporal Questions in Natural Language
by Wenjuan Lu, Dongping Ming, Xi Mao, Jizhou Wang, Zhanjie Zhao and Yao Cheng
Appl. Sci. 2025, 15(3), 1073; https://doi.org/10.3390/app15031073 - 22 Jan 2025
Cited by 3 | Viewed by 1414
Abstract
To address current issues in natural language spatiotemporal queries, including insufficient question semantic understanding, incomplete semantic information extraction, and inaccurate intent recognition, this paper proposes NL2Cypher, a DeBERTa (Decoding-enhanced BERT with disentangled attention)-based natural language spatiotemporal question semantic conversion model. The model first performs semantic encoding on natural language spatiotemporal questions, extracts pre-trained features based on the DeBERTa model, inputs feature vector sequences into BiGRU (Bidirectional Gated Recurrent Unit) to learn text features, and finally obtains globally optimal label sequences through a CRF (Conditional Random Field) layer. Then, based on the encoding results, it performs classification and semantic parsing of spatiotemporal questions to achieve question intent recognition and conversion to the Cypher query language. The experimental results show that the proposed DeBERTa-based conversion model NL2Cypher can accurately achieve semantic information extraction and intent understanding in both simple and compound queries when using a Chinese corpus, reaching an F1 score of 92.69%, with significant accuracy improvement compared to other models. The conversion accuracy from spatiotemporal questions to query language reaches 88% on the training set and 92% on the test set. The proposed model can quickly and accurately query spatiotemporal data using natural language questions. The research results provide new tools and perspectives for subsequent knowledge graph construction and intelligent question answering, effectively promoting the development of geographic information towards intelligent services. Full article
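The CRF layer's contribution is that it decodes a globally optimal label sequence rather than picking each token's label independently. A minimal Viterbi decode over toy emission and transition scores (not the trained model's) illustrates the mechanism:

```python
# Sketch of Viterbi decoding as used by a CRF layer: combine per-token
# emission scores with label-transition scores, track back-pointers.
def viterbi(emissions, transitions, labels):
    best = {l: emissions[0][l] for l in labels}
    backpointers = []
    for em in emissions[1:]:
        nxt, ptr = {}, {}
        for cur in labels:
            prev = max(labels, key=lambda p: best[p] + transitions[(p, cur)])
            nxt[cur] = best[prev] + transitions[(prev, cur)] + em[cur]
            ptr[cur] = prev
        best = nxt
        backpointers.append(ptr)
    last = max(labels, key=lambda l: best[l])
    path = [last]
    for ptr in reversed(backpointers):
        path.append(ptr[path[-1]])
    return list(reversed(path))

labels = ["B", "I", "O"]
transitions = {(p, c): 0 for p in labels for c in labels}
transitions[("B", "I")] = 1    # reward B -> I
transitions[("O", "I")] = -5   # forbid I after O
emissions = [{"B": 2, "I": 0, "O": 0}, {"B": 0, "I": 1, "O": 1}]
path = viterbi(emissions, transitions, labels)
```

The transition penalty steers the second token to `I` only because the first was `B`, which is exactly the kind of label-consistency constraint a CRF enforces.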

17 pages, 81622 KB  
Article
A Hierarchical Spatiotemporal Data Model Based on Knowledge Graphs for Representation and Modeling of Dynamic Landslide Scenes
by Juan Li, Jin Zhang, Li Wang and Ao Zhao
Sustainability 2024, 16(23), 10271; https://doi.org/10.3390/su162310271 - 23 Nov 2024
Viewed by 1295
Abstract
Representing and modeling dynamic landslide scenes is essential for gaining a comprehensive understanding of them and managing them effectively. Existing models, which focus on a single scale, make it difficult to fully express the complex, multi-scale spatiotemporal processes within landslide scenes. To address these issues, we proposed a hierarchical spatiotemporal data model, named HSDM, to enhance the representation of geographic scenes. Specifically, we introduced a spatiotemporal object model that integrates both the structural and process information of objects. Furthermore, we extended the process definition to capture complex spatiotemporal processes. We sorted out the relationships used in HSDM and defined four types of spatiotemporal correlation relations to represent the connections between spatiotemporal objects. Meanwhile, we constructed a three-level graph model of geographic scenes based on these concepts and relationships. Finally, we achieved representation and modeling of a dynamic landslide scene in Heifangtai using HSDM and implemented complex querying and reasoning with Neo4j’s Cypher language. The experimental results demonstrate our model’s capabilities in modeling and reasoning about complex multi-scale information and spatiotemporal processes within landslide scenes. Our work contributes to landslide knowledge representation, inventory, and dynamic simulation. Full article
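A three-level scene graph like HSDM's can be queried by walking containment relations from the scene level down to individual processes, as a Cypher variable-length path would. The node and relation names below are invented for illustration, not HSDM's vocabulary.

```python
# Sketch: walk CONTAINS relations from a scene node to collect all
# nested objects and processes (names are illustrative).
EDGES = [
    ("HeifangtaiScene", "CONTAINS", "LandslideObject1"),
    ("LandslideObject1", "CONTAINS", "SlidingProcess"),
    ("LandslideObject1", "CONTAINS", "DeformationProcess"),
]

def contained(root, edges=EDGES):
    out, stack = [], [root]
    while stack:
        node = stack.pop()
        for s, p, o in edges:
            if s == node and p == "CONTAINS":
                out.append(o)
                stack.append(o)
    return out

members = contained("HeifangtaiScene")
```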

26 pages, 1559 KB  
Article
Real-Time Text-to-Cypher Query Generation with Large Language Models for Graph Databases
by Markus Hornsteiner, Michael Kreussel, Christoph Steindl, Fabian Ebner, Philip Empl and Stefan Schönig
Future Internet 2024, 16(12), 438; https://doi.org/10.3390/fi16120438 - 22 Nov 2024
Cited by 9 | Viewed by 5155
Abstract
Based on their ability to efficiently and intuitively represent real-world relationships and structures, graph databases are gaining increasing popularity. In this context, this paper proposes an innovative integration of a Large Language Model into NoSQL databases and Knowledge Graphs to bridge the gap in the field of Text-to-Cypher queries, focusing on Neo4j. Using the Design Science Research Methodology, we developed a Natural Language Interface which can receive user queries in real time, convert them into Cypher Query Language (CQL), and perform targeted queries, allowing users to choose from different graph databases. In addition, the user interaction is expanded by an additional chat function based on the chat history, as well as an error correction module, which elevates the precision of the generated Cypher statements. Our findings show that the chatbot is able to accurately and efficiently solve the tasks of database selection, chat history referencing, and CQL query generation. The developed system therefore makes an important contribution to enhanced interaction with graph databases, and provides a basis for the integration of further and multiple database technologies and LLMs, due to its modular pipeline architecture. Full article
(This article belongs to the Special Issue Generative Artificial Intelligence in Smart Societies)
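The error-correction module's control flow can be sketched as a retry loop: execute the generated CQL, and on a database error feed the error message back for a corrected attempt. The `generate` and `execute` callables below are hypothetical stubs standing in for the LLM call and the database driver, not the paper's implementation.

```python
# Sketch of an error-correction loop for generated Cypher: retry with
# the database's error message as extra context for the generator.
def run_with_correction(question, generate, execute, max_retries=2):
    error = None
    for _ in range(max_retries + 1):
        cql = generate(question, error)
        try:
            return execute(cql)
        except Exception as e:   # a real driver raises a Cypher syntax error
            error = str(e)
    raise RuntimeError(f"gave up after {max_retries} retries: {error}")

# Stubs: first attempt is malformed, the corrected attempt succeeds.
def fake_generate(question, error):
    return "MATCH (n) RETURN n" if error else "MATCH n RETURN"

def fake_execute(cql):
    if cql == "MATCH (n) RETURN n":
        return ["ok"]
    raise ValueError("Invalid input")

result = run_with_correction("list nodes", fake_generate, fake_execute)
```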

18 pages, 4198 KB  
Article
Two-Stage Optimization Model Based on Neo4j-Dueling Deep Q Network
by Tie Chen, Pingping Yang, Hongxin Li, Jiaqi Gao and Yimin Yuan
Energies 2024, 17(19), 4998; https://doi.org/10.3390/en17194998 - 8 Oct 2024
Cited by 1 | Viewed by 1392
Abstract
To alleviate the power flow congestion in active distribution networks (ADNs), this paper proposes a two-stage load transfer optimization model based on Neo4j-Dueling DQN. First, the Neo4j graph model was established as the training environment for Dueling DQN. Meanwhile, the power supply paths from the congestion point to the power source point were obtained using the Cypher language built into Neo4j, forming a load transfer space that served as the action space. Secondly, based on various constraints in the load transfer process, a reward and penalty function was formulated to establish the Dueling DQN training model. Finally, according to the ε-greedy action selection strategy, actions were selected from the action space and interacted with the Neo4j environment, resulting in the optimal load transfer operation sequence. In this paper, Python was used as the programming language, the TensorFlow open-source software library was used to build the deep reinforcement learning network, and the Py2neo toolkit was used to link the Python platform with Neo4j. We conducted experiments on a real 79-node system, using three power flow congestion scenarios for validation. Under the three power flow congestion scenarios, the time required to obtain the results was 2.87 s, 4.37 s and 3.45 s, respectively. For scenario 1 before and after load transfer, the line loss, voltage deviation and line load rate were reduced by about 56.0%, 76.0% and 55.7%, respectively. For scenario 2 before and after load transfer, the line loss, voltage deviation and line load rate were reduced by 41.7%, 72.9% and 56.7%, respectively. For scenario 3 before and after load transfer, the line loss, voltage deviation and line load rate were reduced by 13.6%, 47.1% and 37.7%, respectively.
The experimental results show that the trained model can quickly and accurately derive the optimal load transfer operation sequence under different power flow congestion conditions, thereby validating the effectiveness of the proposed model. Full article
(This article belongs to the Section F1: Electrical Power System)
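The "dueling" part of Dueling DQN refers to how Q-values are assembled: a shared state value V(s) is combined with mean-centred action advantages A(s, a). The toy action names below are invented to fit the load-transfer setting; the real model computes V and A with neural network heads.

```python
# Sketch of the dueling aggregation: Q(s, a) = V(s) + A(s, a) - mean(A).
def dueling_q(value, advantages):
    mean_adv = sum(advantages.values()) / len(advantages)
    return {a: value + adv - mean_adv for a, adv in advantages.items()}

# Toy state value and advantages for two hypothetical transfer actions.
q = dueling_q(10.0, {"close_switch_1": 1.0, "close_switch_2": -1.0})
```

Subtracting the mean advantage makes the V/A decomposition identifiable, which stabilizes training.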

17 pages, 3507 KB  
Article
Robust Text-to-Cypher Using Combination of BERT, GraphSAGE, and Transformer (CoBGT) Model
by Quoc-Bao-Huy Tran, Aagha Abdul Waheed and Sun-Tae Chung
Appl. Sci. 2024, 14(17), 7881; https://doi.org/10.3390/app14177881 - 4 Sep 2024
Cited by 8 | Viewed by 3593
Abstract
Graph databases have become essential for managing and analyzing complex data relationships, with Neo4j emerging as a leading player in this domain. Neo4j, a high-performance NoSQL graph database, excels in efficiently handling connected data, offering powerful querying capabilities through its Cypher query language. However, due to Cypher’s complexities, making it more accessible for nonexpert users requires translating natural language queries into Cypher. Thus, in this paper, we propose a text-to-Cypher model to effectively translate natural language queries into Cypher. In our proposed model, we combine several methods to enable nonexpert users to interact with graph databases using the English language. Our approach includes three modules: key-value extraction, relation–properties prediction, and Cypher query generation. For key-value extraction and relation–properties prediction, we leverage BERT and GraphSAGE to extract features from natural language. Finally, we use a Transformer model to generate the Cypher query from these features. Additionally, due to the lack of text-to-Cypher datasets, we introduced a new dataset that contains English questions querying information within a graph database, paired with corresponding Cypher query ground truths. This dataset aids future model learning, validation, and comparison on the text-to-Cypher task. Through experiments and evaluations, we demonstrate that our model achieves high accuracy and efficiency compared with well-known seq2seq models such as T5 and GPT-2, achieving an 87.1% exact match score on the dataset. Full article
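An exact match score like the 87.1% reported here is simply the fraction of predicted queries identical to the ground truth; the sketch below normalizes whitespace first, though the paper's exact matching criterion may differ. The example queries are invented.

```python
# Sketch of an exact-match metric for generated Cypher queries.
def exact_match(predictions, references):
    norm = lambda s: " ".join(s.split())   # collapse whitespace
    hits = sum(norm(p) == norm(r) for p, r in zip(predictions, references))
    return hits / len(references)

score = exact_match(
    ["MATCH (n:Person) RETURN n", "MATCH (m:Movie)  RETURN m",
     "MATCH (x) RETURN x"],
    ["MATCH (n:Person) RETURN n", "MATCH (m:Movie) RETURN m",
     "MATCH (y) RETURN y"],
)
```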

63 pages, 6195 KB  
Article
Matching and Rewriting Rules in Object-Oriented Databases
by Giacomo Bergami, Oliver Robert Fox and Graham Morgan
Mathematics 2024, 12(17), 2677; https://doi.org/10.3390/math12172677 - 28 Aug 2024
Cited by 2 | Viewed by 1653
Abstract
Graph query languages such as Cypher are widely adopted to match and retrieve data in a graph representation, due to their ability to retrieve and transform information. Even though the most natural way to match and transform information is through rewriting rules, those are scarcely or partially adopted in graph query languages. This inability has a major impact on how the information is subsequently structured, as it might then appear more natural to impose major constraints over the data representation to fix the way the information should be represented. On the other hand, recent works are starting to move in the opposite direction, as the provision of a truly general semistructured model (GSM) makes it possible both to represent all the available data formats (Network-Based, Relational, and Semistructured) and to support a holistic query language expressing all major queries in such languages. In this paper, we show that the usage of GSM enables the definition of a general rewriting mechanism which can be expressed in current graph query languages only at the cost of adhering the query to the specificity of the underlying data representation. We formalise the proposed query language in terms of declarative graph rewriting mechanisms described as a set of production rules L → R, both restricting the characterisation of L and extending it to support structural graph nesting operations, useful for aggregating similar information around an entry-point of interest. We further achieve our declarative requirements by determining the order in which the data should be rewritten and multiple rules should be applied, while ensuring the application of such updates on the GSM database is persisted in subsequent rewriting calls. We discuss how GSM, by fully supporting index-based data representation, allows for a better physical model implementation leveraging the benefits of columnar database storage.
Preliminary benchmarks show the scalability of this proposed implementation in comparison with state-of-the-art implementations. Full article
(This article belongs to the Section E1: Mathematics and Computer Science)
26 pages, 4703 KB  
Article
A Novel Approach for the Analysis of Ship Pollution Accidents Using Knowledge Graph
by Junlin Hu, Weixiang Zhou, Pengjun Zheng and Guiyun Liu
Sustainability 2024, 16(13), 5296; https://doi.org/10.3390/su16135296 - 21 Jun 2024
Cited by 5 | Viewed by 2050
Abstract
Ship pollution accidents can cause serious harm to marine ecosystems and economic development. This study proposes a ship pollution accident analysis method based on a knowledge graph to address the difficulty of presenting complex accident information clearly. Drawing on information from 411 ship pollution accidents along the coast of China, Word2vec word vectors and the BERT–BiLSTM–CRF and BiLSTM–CRF models were applied to extract entities and relations, and the Neo4j graph database was used to store and visualize the knowledge graph data. Case information retrieval and cause correlation of ship pollution accidents were then analyzed through the knowledge graph. The method established 3928 valid entities and 5793 valid relationships, with extraction accuracies of 79.45% for entities and 82.47% for relationships. In addition, through visualization and Cypher queries, the logical relationships between accidents and their causes can be clearly understood and relevant information quickly retrieved. Using a centrality algorithm, the degree of influence among accident causes can be analyzed and targeted measures proposed for the relevant causes, which helps improve accident prevention and emergency response capabilities and strengthens marine environmental protection. Full article
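The cause-analysis step can be sketched in miniature: degree centrality over (accident, CAUSED_BY, cause) triples ranks the most influential causes. The triples and labels below are invented for illustration; the study itself runs such analyses over Neo4j.

```python
from collections import Counter

# Invented (accident, relation, cause) triples, as they might be extracted into
# the knowledge graph; the equivalent Cypher would resemble:
#   MATCH (a:Accident)-[:CAUSED_BY]->(c:Cause) RETURN c.name, count(a)
triples = [
    ("Spill-A", "CAUSED_BY", "collision"),
    ("Spill-B", "CAUSED_BY", "collision"),
    ("Spill-B", "CAUSED_BY", "equipment failure"),
    ("Spill-C", "CAUSED_BY", "human error"),
]

# Degree centrality of a cause = number of accident links it participates in.
centrality = Counter(cause for _accident, _rel, cause in triples)
top_cause, degree = centrality.most_common(1)[0]
print(top_cause, degree)  # collision 2
```

Ranking causes by degree in this way is what lets targeted prevention measures be prioritised toward the most frequently implicated causes.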
35 pages, 17068 KB  
Article
Instantiation and Implementation of HEAD Metamodel in an Industrial Environment: Non-IoT and IoT Case Studies
by Nadine Kashmar, Mehdi Adda, Hussein Ibrahim, Jean-François Morin and Tony Ducheman
Electronics 2023, 12(15), 3216; https://doi.org/10.3390/electronics12153216 - 25 Jul 2023
Cited by 1 | Viewed by 1910
Abstract
Access to resources can take many forms: digital access via an onsite network, an external site, a website, etc., or physical access to labs, machines, information repositories, etc. Whether access to resources is digital or physical, it must be allowed, denied, revoked, or disabled using robust and coherent access control (AC) models. What makes the process of AC more complicated is the emergence of digital transformation technologies and pervasive systems such as the Internet of Things (IoT) and Industry 4.0 systems, especially with the growing demand for transparency in users' interaction with various applications and services. Controlling access and ensuring security and cybersecurity in IoT and Industry 4.0 environments is a challenging task, owing to the increasing distribution of resources and the massive presence of cyber-threats and cyber-attacks. To ensure the security and privacy of users in industry sectors, we need an advanced AC metamodel that defines all the components and attributes required to derive various instances of AC models and to keep pace with the new and growing AC requirements brought by continuous technology upgrades. Because of several limitations of existing metamodels and their inability to meet current AC requirements, we have developed the Hierarchical, Extensible, Advanced, Dynamic (HEAD) AC metamodel, whose features overcome those limitations. In this paper, the HEAD metamodel is employed to specify the AC policies needed for two case studies inspired by the computing environment of Institut Technologique de Maintenance Industrielle (ITMI), Sept-Îles, QC, Canada: the first covers ITMI's local (non-IoT) environment and the second ITMI's IoT environment.
For each case study, the required AC model is derived using the domain-specific language (DSL) of the HEAD metamodel; Xtend notation (an expressive dialect of Java) is then used to generate the Java code that represents the concrete instance of the derived AC model. At the system level, to obtain the needed AC rules, Cypher statements are generated and injected into the Neo4j database to represent the Next Generation Access Control (NGAC) policy as a graph. The NGAC framework is used as the enforcement point for the rules generated in each case study. The results show that the HEAD metamodel can be adapted to and integrated into various local and distributed environments: it can serve as a unified framework, meet current AC requirements, and follow policy upgrades. To demonstrate that the HEAD metamodel can be implemented on other platforms, we also implemented an administrator panel using VB.NET and SQL. Full article
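The final code-generation step, turning derived AC rules into Cypher statements for a Neo4j policy graph, can be sketched as follows. The rule tuples, node labels, and relationship type are assumptions chosen for illustration; the actual NGAC graph schema used by the authors may differ.

```python
def rule_to_cypher(user, operation, resource):
    """Emit one hypothetical policy assignment as a Cypher CREATE statement."""
    return (f"CREATE (:User {{name: '{user}'}})"
            f"-[:HAS {{op: '{operation}'}}]->"
            f"(:Resource {{name: '{resource}'}})")

# Derived AC rules as (user, operation, resource) tuples.
rules = [("alice", "read", "lab-3"), ("bob", "write", "repo-1")]
statements = [rule_to_cypher(*rule) for rule in rules]
print(statements[0])
```

Injecting each generated statement into Neo4j would materialise the policy as a graph, which an enforcement point can then query to decide individual access requests.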
(This article belongs to the Special Issue Digital Security and Privacy Protection: Trends and Applications)