MDPI - Publisher of Open Access Journals

14 pages, 3707 KiB

Open Access Article

Testing Pretrained Large Language Models to Set Up a Knowledge Hub of Heterogeneous Multisource Environmental Documents

by Paolo Tagliolato Acquaviva d’Aragona, Gloria Bordogna, Lorenza Babbini, Alessandro Lotti, Annalisa Minelli, Martina Zilioli and Alessandro Oggioni

Appl. Sci. 2025, 15(10), 5415; https://doi.org/10.3390/app15105415 - 12 May 2025

Viewed by 432

Abstract

This contribution outlines the design of a Knowledge Hub of heterogeneous documents related to the UNEP/MAP Barcelona Convention system. The Knowledge Hub is intended to serve as a resource to assist public authorities and users with different backgrounds and needs in accessing information efficiently; users should be able either to formulate natural language queries or to navigate an automatically generated knowledge graph to find relevant documents. The ad hoc retrieval task and the Knowledge Hub creation are defined based on state-of-the-art Large Language Models (LLMs). Specifically, this contribution focuses on a user-evaluation experiment that tested publicly available pretrained foundation LLMs for retrieving a subset of documents with varying lengths and topics.

(This article belongs to the Section Computing and Artificial Intelligence)

21 pages, 526 KiB

Open Access Article

Collaborative Caching for Implementing a Location-Privacy Aware LBS on a MANET

by Rudyard Fuster, Patricio Galdames and Claudio Gutierréz-Soto

Appl. Sci. 2024, 14(22), 10480; https://doi.org/10.3390/app142210480 - 14 Nov 2024

Viewed by 973

Abstract

This paper addresses the challenge of preserving user privacy in location-based services (LBSs) by proposing a novel, complementary approach to existing privacy-preserving techniques such as k-anonymity and l-diversity. Our approach implements collaborative caching strategies within a mobile ad hoc network (MANET), exploiting the geographic nature of location-based queries (LBQs) to reduce data exposure to untrusted LBS servers. Unlike existing approaches that rely on centralized servers or stationary infrastructure, our solution facilitates direct data exchange between users’ devices, providing an additional layer of privacy protection. We introduce a new privacy entropy-based metric called accumulated privacy loss (APL) to quantify the privacy loss incurred when accessing either the LBS or our proposed system. Our approach implements a two-tier caching strategy: local caching maintained by each user and neighbor caching based on proximity. This strategy not only reduces the number of queries to the LBS server but also significantly enhances user privacy by minimizing the exposure of location data to centralized entities. Empirical results demonstrate that while our collaborative caching system incurs some communication costs, it significantly mitigates redundant data among user caches and reduces the need to access potentially privacy-compromising LBS servers. Our findings show a 40% reduction in LBS queries, a 64% decrease in data redundancy within cells, and a 31% reduction in accumulated privacy loss compared to baseline methods. In addition, we analyze the impact of data obsolescence on cache performance and privacy loss, proposing mechanisms for maintaining the relevance and accuracy of cached data. This work contributes to the field of privacy-preserving LBSs by providing a decentralized, user-centric approach that improves both cache redundancy and privacy protection, particularly in scenarios where central infrastructure is unreachable or untrusted.

(This article belongs to the Special Issue New Advances in Computer Security and Cybersecurity)

50 pages, 3004 KiB

Open Access Review

Hazard Susceptibility Mapping with Machine and Deep Learning: A Literature Review

by Angelly de Jesus Pugliese Viloria, Andrea Folini, Daniela Carrion and Maria Antonia Brovelli

Remote Sens. 2024, 16(18), 3374; https://doi.org/10.3390/rs16183374 - 11 Sep 2024

Cited by 5 | Viewed by 4438

Abstract

With the increase in climate-change-related hazardous events alongside population concentration in urban centres, it is important to provide resilient cities with tools for understanding and eventually preparing for such events. Machine learning (ML) and deep learning (DL) techniques have increasingly been employed to model susceptibility of hazardous events. This study consists of a systematic review of the ML/DL techniques applied to model the susceptibility of air pollution, urban heat islands, floods, and landslides, with the aim of providing a comprehensive source of reference both for techniques and modelling approaches. A total of 1454 articles published between 2020 and 2023 were systematically selected from the Scopus and Web of Science search engines based on search queries and selection criteria. ML/DL techniques were extracted from the selected articles and categorised using ad hoc classification. Consequently, a general approach for modelling the susceptibility of hazardous events was consolidated, covering the data preprocessing, feature selection, modelling, model interpretation, and susceptibility map validation, along with examples of related global/continental data. The most frequently employed techniques across various hazards include random forest, artificial neural networks, and support vector machines. This review also provides, per hazard, the definition, data requirements, and insights into the ML/DL techniques used, including examples of both state-of-the-art and novel modelling approaches.

(This article belongs to the Special Issue Women’s Special Issue Series: Remote Sensing 2023-2025)

19 pages, 2813 KiB

Open Access Article

Cluster-Based Vehicle-to-Everything Model with a Shared Cache

by Andrei Vladyko, Gleb Tambovtsev, Elena Podgornaya, Samia Allaoua Chelloug, Reem Alkanhel and Pavel Plotnikov

Mathematics 2023, 11(13), 3017; https://doi.org/10.3390/math11133017 - 7 Jul 2023

Cited by 7 | Viewed by 1913

Abstract

This paper presents an analysis of the effectiveness of the element interaction model in a vehicular ad hoc network (VANET). An analysis of the mathematical model and its numerical solution for the system of boundary device interactions in the traditional configuration of roadside unit (RSU) placement using single- and dual-channel connection between on-board units (OBUs) and RSUs is given. In addition, the model efficiency is improved using a clustering approach. The efficiency evaluation is based on calculating the percentage of unprocessed requests generated by OBUs during their mobility, the average power consumption and the magnitude of the delay in transmitting and processing the generated requests in the OBU–RSU system. The traditional and cluster models are compared. The results obtained in this paper show that each of the proposed models can be effectively implemented in mobile nodes and will significantly reduce the overall expected query processing time to improve the organization and algorithmic support of VANET. Along with this, it is shown that the developed approach allows for efficient power consumption when combining RSUs into clusters with a shared cache. The novelty of solving the problems is due to the lack of a comprehensive model that allows the distribution and prediction of the parameters and resources of the system for different computational tasks, which is essential when implementing and using V2X technology to solve the problems of complex management of VANET elements.

(This article belongs to the Special Issue Network Planning and Internet of Things: Mathematical Modeling, Connectivity Problems, and Applications)

6 pages, 511 KiB

Open Access Proceeding Paper

Research on Big Data Ad Hoc Query Technology Based on an Accident Insurance Campaign

by Yung-Cheng Liao and Mei-Su Chen

Eng. Proc. 2023, 38(1), 8; https://doi.org/10.3390/engproc2023038008 - 19 Jun 2023

Viewed by 1045

Abstract

Many insurance companies in Taiwan have built databases for ad hoc query software that combines customer relationship management and marketing campaign management. An ad hoc query is a non-routine and specific query performed in real time to filter specific customer information from big data. Ad hoc query can retrieve customer information more quickly and conveniently than filtering target customer lists using a mainframe or online analytical processing (OLAP). In this study, the strengths and weaknesses of ad hoc query, OLAP, and general query using a mainframe are analyzed. The results indicate that ad hoc query has the advantage of flexibility for users’ specific needs. Ad hoc query also poses obstacles and challenges for users, who must learn its system fields and how to write programs. It is concluded that the design between individual assured suggestions and a convenient operation process is critical for raising the response rate. Additionally, technology that precisely filters target customers is the key success factor for an accident insurance campaign.

(This article belongs to the Proceedings of The 3rd IEEE International Conference on Electronic Communications, Internet of Things and Big Data 2023)

19 pages, 2412 KiB

Open Access Article

IMF-PR: An Improved Morton-Filter-Based Pseudonym-Revocation Scheme in VANETs

by Cong Zhao, Jiayu Qi, Tianhan Gao and Xinyang Deng

Sensors 2023, 23(8), 4066; https://doi.org/10.3390/s23084066 - 18 Apr 2023

Cited by 1 | Viewed by 1536

Abstract

Vehicle ad hoc networks (VANETs) are special wireless networks which help vehicles to obtain continuous and stable communication. Pseudonym revocation, as a vital security mechanism, is able to protect legal vehicles in VANETs. However, existing pseudonym-revocation schemes suffer from the issues of low certificate revocation list (CRL) generation and update efficiency, along with high CRL storage and transmission costs. In order to solve the above issues, this paper proposes an improved Morton-filter-based pseudonym-revocation scheme for VANETs (IMF-PR). IMF-PR establishes a new distributed CRL management mechanism to maintain a low CRL distribution transmission delay. In addition, IMF-PR improves the Morton filter to optimize the CRL management mechanism so as to improve CRL generation and update efficiency and reduce the CRL storage overhead. Moreover, CRLs in IMF-PR store illegal vehicle information based on an improved Morton filter data structure to improve the compression ratio and the query efficiency. Performance analysis and simulation experiments showed that IMF-PR can effectively reduce storage by increasing the compression gain and reducing transmission delay. In addition, IMF-PR can also greatly improve the lookup and update throughput on CRLs.

(This article belongs to the Section Sensor Networks)

23 pages, 32221 KiB

Open Access Article

Learned Text Representation for Amharic Information Retrieval and Natural Language Processing

by Tilahun Yeshambel, Josiane Mothe and Yaregal Assabie

Information 2023, 14(3), 195; https://doi.org/10.3390/info14030195 - 20 Mar 2023

Cited by 12 | Viewed by 6068

Abstract

Over the past few years, word embeddings and bidirectional encoder representations from transformers (BERT) models have brought better solutions to learning text representations for natural language processing (NLP) and other tasks. Many NLP applications rely on pre-trained text representations, leading to the development of a number of neural network language models for various languages. However, this is not the case for Amharic, which is known to be a morphologically complex and under-resourced language. Usable pre-trained models for automatic Amharic text processing are not available. This paper presents an investigation on the essence of learned text representation for information retrieval and NLP tasks using word embeddings and BERT language models. We explored the most commonly used methods for word embeddings, including word2vec, GloVe, and fastText, as well as the BERT model. We investigated the performance of query expansion using word embeddings. We also analyzed the use of a pre-trained Amharic BERT model for masked language modeling, next sentence prediction, and text classification tasks. Amharic ad hoc information retrieval test collections that contain word-based, stem-based, and root-based text representations were used for evaluation. We conducted a detailed empirical analysis on the usability of word embeddings and BERT models on word-based, stem-based, and root-based corpora. Experimental results show that word-based query expansion and language modeling perform better than stem-based and root-based text representations, and fastText outperforms other word embeddings on the word-based corpus.

(This article belongs to the Special Issue Novel Methods and Applications in Natural Language Processing)

19 pages, 3782 KiB

Open Access Article

A Quick Prototype for Assessing OpenIE Knowledge Graph-Based Question-Answering Systems

by Giuseppina Di Paolo, Diego Rincon-Yanez and Sabrina Senatore

Information 2023, 14(3), 186; https://doi.org/10.3390/info14030186 - 16 Mar 2023

Cited by 7 | Viewed by 3860

Abstract

Due to the rapid growth of knowledge graphs (KG) as representational learning methods in recent years, question-answering approaches have received increasing attention from academia and industry. Question-answering systems use knowledge graphs to organize, navigate, search and connect knowledge entities. Managing such systems requires a thorough understanding of the underlying graph-oriented structures and, at the same time, an appropriate query language, such as SPARQL, to access relevant data. Natural language interfaces are needed to enable non-technical users to query ever more complex data. The paper proposes a question-answering approach to support end users in querying graph-oriented knowledge bases. The system pipeline is composed of two main modules: one is dedicated to translating a natural language query submitted by the user into a triple of the form <subject, predicate, object>, while the second module implements knowledge graph embedding (KGE) models, exploiting the previous module triple and retrieving the answer to the question. Our framework delivers a fast OpenIE-based knowledge extraction system and a graph-based answer prediction model for question-answering tasks. The system was designed by leveraging existing tools to accomplish a simple prototype for fast experimentation, especially across different knowledge domains, with the added benefit of reducing development time and costs. The experimental results confirm the effectiveness of the proposed system, which provides promising performance, as assessed at the module level. In particular, in some cases, the system outperforms the literature. Finally, a use case example shows the KG generated by user questions in a graphical interface provided by an ad-hoc designed web application.

(This article belongs to the Special Issue Knowledge Graph Technology and Its Applications)

29 pages, 5821 KiB

Open Access Article

A Continuous Region-Based Skyline Computation for a Group of Mobile Users

by Ghoncheh Babanejad Dehaki, Hamidah Ibrahim, Ali A. Alwan, Fatimah Sidi, Nur Izura Udzir and Ma'aruf Mohammed Lawal

Symmetry 2022, 14(10), 2003; https://doi.org/10.3390/sym14102003 - 24 Sep 2022

Cited by 1 | Viewed by 1555

Abstract

Skyline queries, which are based on the concept of Pareto dominance, filter the objects from a potentially large multi-dimensional collection of objects by keeping the best, most favoured objects in satisfying the user's preferences. With today's advancement of technology, ad hoc meetings or impromptu gatherings involving a group of people are becoming more and more common. Intuitively, deciding on an optimal meeting point is not a straightforward task, especially when conflicting criteria are involved and the number of criteria to be considered is vast. Moreover, a point that is near a user might not meet all the various users' preferences, while a point that meets most of the users' preferences might be located far away from these users. The task becomes more complicated when these users are on the move. In this paper, we present the Region-based Skyline for a Group of Mobile Users (RSGMU) method, which aims to resolve the problem of continuously finding the optimal meeting points, herein called skyline objects, for a group of users while they are on the move. RSGMU assumes a centroid-based movement where users are assumed to be moving towards a centroid that is identified based on the current locations of each user in the group. Meanwhile, to limit the searching space in identifying the objects of interest, a search region is constructed. However, changes in the users' locations cause the search region of the group to be reconstructed. Unlike the existing methods that require users to frequently report their latest locations, RSGMU utilises a dynamic motion formula, which abides by the laws of classical physics that are fundamentally symmetrical with respect to time, in order to predict the locations of the users at a specified time interval. As a result, the skyline objects are continuously updated, and the ideal meeting points can be decided upon ahead of time. Hence, the users' locations as well as the spatial and non-spatial attributes of the objects are used as the skyline evaluation criteria. Meanwhile, to avoid re-computation of skylines at each time interval, the objects of interest within a Single Minimum Bounding Rectangle that is formed based on the current search region are organized in a Kd-tree data structure. Several experiments have been conducted and the results show that our proposed method outperforms the previous work with respect to CPU time.
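As an illustration of the Pareto dominance concept this abstract builds on, the following is a minimal sketch of a naive skyline filter. It is not the RSGMU method: it omits the search region, motion prediction, and Kd-tree described above, and it assumes that smaller values are preferred in every dimension. The names `dominates` and `skyline` and the sample (distance, price) pairs are hypothetical.

```python
from typing import List, Tuple

Point = Tuple[float, ...]


def dominates(a: Point, b: Point) -> bool:
    """Return True if a Pareto-dominates b (assumption: smaller is better).

    a dominates b when a is no worse than b in every dimension and
    strictly better in at least one.
    """
    no_worse = all(x <= y for x, y in zip(a, b))
    strictly_better = any(x < y for x, y in zip(a, b))
    return no_worse and strictly_better


def skyline(points: List[Point]) -> List[Point]:
    """Naive O(n^2) skyline: keep every point that no other point dominates."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


# Hypothetical candidate meeting points as (distance_km, price) pairs.
candidates = [(1.0, 9.0), (2.0, 5.0), (3.0, 6.0), (4.0, 2.0)]
result = skyline(candidates)
print(result)
```

Here (3.0, 6.0) is dropped because (2.0, 5.0) is both closer and cheaper; the remaining three points each win on at least one criterion, so none dominates another.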

(This article belongs to the Special Issue Information Technology and Its Applications 2021)

15 pages, 378 KiB

Open Access Article

An Approach to Aid Decision-Making by Solving Complex Optimization Problems Using SQL Queries

by Jose Torres-Jimenez, Nelson Rangel-Valdez, Miguel De-la-Torre and Himer Avila-George

Appl. Sci. 2022, 12(9), 4569; https://doi.org/10.3390/app12094569 - 30 Apr 2022

Cited by 2 | Viewed by 2853

Abstract

In combinatorial optimization, the more complex a problem is, the more challenging it becomes, usually causing most research to focus on creating solvers for larger cases. However, real-life situations also contain small-sized instances that deserve a researcher’s attention. For example, within a web development context, a developer might face small combinatorial optimization cases that fall into the following situations: (1) the development of an ad hoc specialized strategy is not justified; (2) the developer could lack the time, or skills, to create the solution; (3) the efficiency of naive brute force strategies might be compromised due to the programming paradigm used. Similar situations in this context, combined with a recent increasing interest in optimization information from databases, open a research area to develop easy-to-implement strategies that compete with those naive approaches and do not require specialized knowledge. Therefore, this work revisits Structured Query Language (SQL) approaches and proposes new methods to tackle combinatorial optimization problems such as the Portfolio Selection Problem, Maximum Clique Problem, and Graph Coloring Problem. The performance of the resulting queries is compared against naive approaches, and their potential to extend to other optimization problems is studied. The presented examples demonstrate the simplicity and versatility of using a SQL approach to solve small optimization problem instances.

(This article belongs to the Special Issue Novel Approaches and Technologies for Software Engineering and IT Management)

16 pages, 1547 KiB

Open Access Article

Multi-Layer Contextual Passage Term Embedding for Ad-Hoc Retrieval

by Weihong Cai, Zijun Hu, Yalan Luo, Daoyuan Liang, Yifan Feng and Jiaxin Chen

Information 2022, 13(5), 221; https://doi.org/10.3390/info13050221 - 25 Apr 2022

Viewed by 2713

Abstract

Nowadays, pre-trained language models such as Bidirectional Encoder Representations from Transformer (BERT) are becoming a basic building block in Information Retrieval tasks. Nevertheless, there are several limitations when applying BERT to the query-document matching task: (1) relevance assessments are applicable at the document-level, and the tokens of documents often exceed the maximum input length of BERT; (2) applying BERT to long documents leads to a great consumption of memory and run time, owing to the computational cost of the interactions between tokens. This paper explores a novel multi-layer contextual passage architecture that leverages text summarization extraction to generate passage-level evidence for the pre-selected document passage, thus bringing new possibilities for the long document relevance task. Experiments were conducted on two standard ad-hoc retrieval collections with different characteristics: the Text Retrieval Conference (TREC) 2004 Robust Track (Robust04) and ClueWeb09. Experimental results show that our approach can significantly outperform the strong baselines and that, even compared with models based on the same BERT, the precision of our methods is on par with that of state-of-the-art neural ranking models.

26 pages, 2768 KiB

Open Access Article

VloGraph: A Virtual Knowledge Graph Framework for Distributed Security Log Analysis

by Kabul Kurniawan, Andreas Ekelhart, Elmar Kiesling, Dietmar Winkler, Gerald Quirchmayr and A Min Tjoa

Mach. Learn. Knowl. Extr. 2022, 4(2), 371-396; https://doi.org/10.3390/make4020016 - 11 Apr 2022

Cited by 7 | Viewed by 6691

Abstract

The integration of heterogeneous and weakly linked log data poses a major challenge in many log-analytic applications. Knowledge graphs (KGs) can facilitate such integration by providing a versatile representation that can interlink objects of interest and enrich log events with background knowledge. Furthermore, graph-pattern based query languages, such as SPARQL, can support rich log analyses by leveraging semantic relationships between objects in heterogeneous log streams. Constructing, materializing, and maintaining centralized log knowledge graphs, however, poses significant challenges. To tackle this issue, we propose VloGraph, a distributed and virtualized alternative to centralized log knowledge graph construction. The proposed approach does not involve any a priori parsing, aggregation, and processing of log data, but dynamically constructs a virtual log KG from heterogeneous raw log sources across multiple hosts. To explore the feasibility of this approach, we developed a prototype and demonstrate its applicability to three scenarios. Furthermore, we evaluate the approach in various experimental settings with multiple heterogeneous log sources and machines; the encouraging results from this evaluation suggest that the approach can enable efficient graph-based ad-hoc log analyses in federated settings.

(This article belongs to the Special Issue Selected Papers from CD-MAKE 2021 and ARES 2021)

17 pages, 2076 KiB

Open Access Article

Changes in Air Pollution-Related Behaviour Measured by Google Trends Search Volume Index in Response to Reported Air Quality in Poland

by Wojciech Nazar and Katarzyna Plata-Nazar

Int. J. Environ. Res. Public Health 2021, 18(21), 11709; https://doi.org/10.3390/ijerph182111709 - 8 Nov 2021

Cited by 9 | Viewed by 5799

Abstract

Decreased air quality is connected to an increase in daily mortality rates. Thus, people’s behavioural response to sometimes elevated air pollution levels is vital. We aimed to analyse spatial and seasonal changes in air pollution-related information-seeking behaviour in response to nationwide reported air quality in Poland. Google Trends Search Volume Index data was used to investigate Poles’ interest in air pollution-related keywords. PM10 and PM2.5 concentrations measured across Poland between 2016 and 2019 as well as locations of monitoring stations were collected from the Chief Inspectorate of Environmental Protection databases. Pearson Product-Moment Correlation Coefficients were used to measure the strength of spatial and seasonal relationships between reported air pollution levels and the popularity of search queries. The highest PM10 and PM2.5 concentrations were observed in southern voivodeships and during the winter season. Similar trends were observed for Poles’ interest in air pollution-related keywords. Greater interest in air quality data in Poland strongly correlates with both higher regional and higher seasonal air pollution levels. It appears that Poles are socially aware of this issue and that their intensification of the information-seeking behaviour seems to indicate a relevant ad hoc response to variable threat severity levels.
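For readers unfamiliar with the statistic named in this abstract, the sketch below computes a Pearson product-moment correlation coefficient between two series of equal length. It illustrates the general formula only, not the authors' analysis pipeline; the function name `pearson_r` and the monthly values are hypothetical and are not data from the paper.

```python
import math
from typing import Sequence


def pearson_r(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Pearson correlation: covariance divided by the product of the spreads."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two series of equal length with at least 2 values")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of co-deviations and sums of squared deviations from the means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical monthly values: a pollutant concentration and a search index.
pollution = [60.0, 55.0, 40.0, 25.0, 20.0, 30.0]
search_index = [90.0, 80.0, 50.0, 20.0, 15.0, 35.0]
r = pearson_r(pollution, search_index)
print(round(r, 3))
```

A value of r close to +1 would indicate that search interest rises and falls together with the reported concentration, which is the kind of relationship the abstract describes.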

17 pages, 427 KiB

Open Access Article

Topic Models Ensembles for AD-HOC Information Retrieval

by Pablo Ormeño, Marcelo Mendoza and Carlos Valle

Information 2021, 12(9), 360; https://doi.org/10.3390/info12090360 - 1 Sep 2021

Cited by 4 | Viewed by 3834

Abstract

Ad hoc information retrieval (ad hoc IR) is a challenging task consisting of ranking text documents for bag-of-words (BOW) queries. Classic approaches based on query and document text vectors use term-weighting functions to rank the documents. One limitation of these methods is their inability to work with polysemic concepts. In addition, these methods introduce fake orthogonalities between semantically related words. To address these limitations, model-based IR approaches based on topics have been explored. Specifically, topic models based on Latent Dirichlet Allocation (LDA) allow building representations of text documents in the latent space of topics, better modeling polysemy and avoiding the generation of orthogonal representations between related terms. We extend LDA-based IR strategies using different ensemble strategies. Model selection obeys the ensemble learning paradigm, for which we test two successful approaches widely used in supervised learning. We study Boosting and Bagging techniques for topic models, using each model as a weak IR expert. Then, we merge the ranking lists obtained from each model using a simple but effective top-k list fusion approach. We show that our proposal strengthens the results in precision and recall, outperforming classic IR models and strong baselines based on topic models.

17 pages, 3709 KiB

Open Access Article

An Effective Data Sharing Scheme Based on Blockchain in Vehicular Social Networks

by Yanji Jiang, Xueli Shen and Sifa Zheng

Electronics 2021, 10(2), 114; https://doi.org/10.3390/electronics10020114 - 7 Jan 2021

Cited by 18 | Viewed by 3490

Abstract

Vehicular social networks (VSNs) are vehicular ad hoc networks (VANETs) that integrate social networks. Compared with traditional VANETs, VSNs are more suitable to serve a group of vehicles with common interests. In VSNs, vehicles can upload the necessary data to the cloud service provider (CSP) and other vehicles can query the data they are interested in through the CSP, which enables VSNs to provide more user-friendly services. However, due to the wireless network communication environment, the data sent by the vehicle can easily be monitored. Adversaries are able to violate the privacy of the vehicle based on the collected data, thereby threatening the security of the entire network. In addition, if a vehicle shares malicious or false data with other vehicles, it is easy to mislead drivers and even cause serious traffic accidents. This paper proposes an effective data sharing scheme based on blockchain in VSNs. By integrating an identity-based signature mechanism and a pseudonym generation mechanism, we first propose an anonymous authentication mechanism as the basis for establishing trust relationships before data transmission between entities in VSNs. Then, a data sharing scheme based on blockchain is described, in which the signature mechanism and the consensus mechanism guarantee the security and traceability of data. The results of the performance analysis and the simulation experiment indicate that VAB can achieve a favourable performance compared with existing schemes.

(This article belongs to the Section Networks)

Search Results (27)
