Information, Volume 14, Issue 5 (May 2023) – 47 articles

Cover Story (view full-size image): This study presents a novel image-based insect trap that overcomes the limitations of traditional camera insect-traps. Unlike existing traps, this device does not rely on manual image annotation for pest counting and incorporates self-disposal of captured insects, making it suitable for long-term deployment. The trap integrates an imaging sensor with Raspberry Pi microcontroller units and embedded deep learning algorithms. It utilizes a pheromone-based funnel trap to count agricultural pests. Additionally, the device receives instructions from a server for configuration, and a servomotor automatically rotates the trap's bottom to dispose of dehydrated captured insects. This design eliminates the challenge of overlap and occlusion caused by decaying insects during extended operation. View this paper
  • Issues are regarded as officially published after their release is announced to the table of contents alert mailing list.
  • PDF is the official format for papers published in both HTML and PDF forms. To view the papers in PDF format, click on the "PDF Full-text" link, and use the free Adobe Reader to open them.
13 pages, 535 KiB  
Article
Reinforcement Learning-Based Hybrid Multi-Objective Optimization Algorithm Design
by Herbert Palm and Lorin Arndt
Information 2023, 14(5), 299; https://doi.org/10.3390/info14050299 - 22 May 2023
Cited by 1 | Viewed by 1500
Abstract
The multi-objective optimization (MOO) of complex systems remains a challenging task in engineering domains. The methodological approach of applying MOO algorithms to simulation-enabled models has established itself as a standard. Despite increases in computational power, the effectiveness and efficiency of such algorithms, i.e., their ability to identify as many Pareto-optimal solutions as possible with as few simulation samples as possible, play a decisive role. However, the question of which class of MOO algorithms is most effective or efficient with respect to which class of problems has not yet been resolved. To tackle this performance problem, hybrid optimization algorithms that combine multiple elementary search strategies have been proposed. Despite their potential, no systematic approach for selecting and combining elementary Pareto search strategies has yet been suggested. In this paper, we propose an approach for designing hybrid MOO algorithms that uses reinforcement learning (RL) techniques to train an intelligent agent for dynamically selecting and combining elementary MOO search strategies. We present both the fundamental RL-Based Hybrid MOO (RLhybMOO) methodology and an exemplary implementation applied to mathematical test functions. The results indicate a significant performance gain of intelligent agents over elementary and static hybrid search strategies, highlighting their ability to effectively and efficiently select algorithms. Full article
(This article belongs to the Special Issue Intelligent Agent and Multi-Agent System)
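The abstract above measures search quality by how many Pareto-optimal solutions are found per simulation sample. As a minimal sketch of what "Pareto-optimal" means for a set of evaluated samples, the following filter keeps only non-dominated points. It assumes every objective is minimized; the function names and the sample data are hypothetical and are not taken from the paper.

```python
from typing import List, Sequence


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """Return True if a is no worse than b in every objective and
    strictly better in at least one (all objectives minimized)."""
    no_worse = all(x <= y for x, y in zip(a, b))
    strictly_better = any(x < y for x, y in zip(a, b))
    return no_worse and strictly_better


def pareto_front(points: List[Sequence[float]]) -> List[Sequence[float]]:
    """Keep the points that no other point dominates (O(n^2) scan)."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


# Hypothetical two-objective samples, e.g. (cost, error).
samples = [(1.0, 5.0), (2.0, 3.0), (3.0, 4.0), (4.0, 1.0), (2.5, 3.5)]
front = pareto_front(samples)
print(front)
```

Here (3.0, 4.0) and (2.5, 3.5) are both dominated by (2.0, 3.0), so three points remain. The quadratic scan is adequate for small sample sets; larger sets would normally use a sorting-based method.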

11 pages, 1428 KiB  
Article
Chinese–Vietnamese Pseudo-Parallel Sentences Extraction Based on Image Information Fusion
by Yonghua Wen, Junjun Guo, Zhiqiang Yu and Zhengtao Yu
Information 2023, 14(5), 298; https://doi.org/10.3390/info14050298 - 21 May 2023
Viewed by 1357
Abstract
Parallel sentences play a crucial role in various NLP tasks, particularly for cross-lingual tasks such as machine translation. However, due to the time-consuming and laborious nature of manual construction, many low-resource languages still suffer from a lack of large-scale parallel data. The objective of pseudo-parallel sentence extraction is to automatically identify sentence pairs in different languages that convey similar meanings. Earlier methods heavily relied on parallel data, which is unsuitable for low-resource scenarios. The current mainstream research direction is to use transfer learning or unsupervised learning based on cross-lingual word embeddings and multilingual pre-trained models; however, these methods are ineffective for languages with substantial differences. To address this issue, we propose a sentence extraction method that leverages image information fusion to extract Chinese–Vietnamese pseudo-parallel sentences from collections of bilingual texts. Our method first employs an adaptive image and text feature fusion strategy to efficiently extract the bilingual parallel sentence pair, and then, a multimodal fusion method is presented to balance the information between the image and text modalities. The experiments on multiple benchmarks show that our method achieves promising results compared to a competitive baseline by infusing additional external image information. Full article

18 pages, 5731 KiB  
Article
An Improved Method of Heart Rate Extraction Algorithm Based on Photoplethysmography for Sports Bracelet
by Binbin Ren, Zhaoyuxuan Wang, Kainan Ma, Yiheng Zhou and Ming Liu
Information 2023, 14(5), 297; https://doi.org/10.3390/info14050297 - 19 May 2023
Cited by 1 | Viewed by 2127
Abstract
Heart rate measurement employing photoplethysmography (PPG) is a prevalent technique for wearable devices. However, the acquired PPG signal is often contaminated with motion artifacts, which need to be accurately removed. In cases where the PPG and accelerometer (ACC) spectra overlap at the actual heart rate, traditional discrete Fourier transform (DFT) algorithms fail to compute the heart rate accurately. This study proposed an enhanced heart rate extraction algorithm based on PPG to address the issue of PPG and ACC spectral overlap. The spectral overlap is assessed according to the morphological characteristics of both the PPG and ACC spectra. Upon detecting an overlap, the singular spectrum analysis (SSA) algorithm is employed to calculate the heart rate at the given time. The SSA algorithm effectively resolves the issue of spectral overlap by removing motion artifacts through the elimination of ACC-related time series in the PPG signal. Experimental results reveal that the accuracy of the proposed algorithm surpasses that of the traditional DFT method by 19.01%. The proposed method makes up for the deficiency posed by artifact and heart rate signal overlap in conventional algorithms and significantly improves heart rate extraction accuracy. Full article
(This article belongs to the Special Issue Human Activity Recognition and Biomedical Signal Processing)
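For orientation, the conventional DFT baseline that the abstract compares against can be sketched as picking the strongest spectral peak inside a plausible heart-rate band. This is only an illustration of that baseline, not the paper's SSA-based method. The band limits (0.7 to 3.5 Hz), the sampling rate, and the synthetic signal are all assumptions made for the example.

```python
import cmath
import math


def dominant_frequency_hz(signal, fs, f_lo=0.7, f_hi=3.5):
    """Return the frequency (Hz) of the largest DFT magnitude within
    [f_lo, f_hi]. Uses a naive DFT so the sketch needs no third-party
    libraries; the band limits are assumed values."""
    n = len(signal)
    mean = sum(signal) / n
    centered = [s - mean for s in signal]  # remove the DC offset
    best_f, best_mag = 0.0, -1.0
    for k in range(1, n // 2 + 1):
        f = k * fs / n
        if not (f_lo <= f <= f_hi):
            continue
        coeff = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                    for i, x in enumerate(centered))
        if abs(coeff) > best_mag:
            best_f, best_mag = f, abs(coeff)
    return best_f


# Synthetic 8 s signal at 25 Hz: a 1.5 Hz "pulse" plus a weaker
# 2.5 Hz component standing in for a motion artifact.
fs = 25.0
signal = [math.sin(2 * math.pi * 1.5 * i / fs)
          + 0.4 * math.sin(2 * math.pi * 2.5 * i / fs)
          for i in range(200)]
bpm = 60.0 * dominant_frequency_hz(signal, fs)
print(bpm)
```

With this synthetic input the pulse component is the stronger one, so the peak pick returns 1.5 Hz (90 beats per minute). If the artifact sat at the same frequency as the pulse, a peak pick like this could not separate them, which is the spectral-overlap case the abstract addresses.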

16 pages, 3998 KiB  
Article
Lightweight Implicit Blur Kernel Estimation Network for Blind Image Super-Resolution
by Asif Hussain Khan, Christian Micheloni and Niki Martinel
Information 2023, 14(5), 296; https://doi.org/10.3390/info14050296 - 18 May 2023
Viewed by 2749
Abstract
Blind image super-resolution (Blind-SR) is the process of leveraging a low-resolution (LR) image, with unknown degradation, to generate its high-resolution (HR) version. Most existing Blind-SR techniques use a degradation estimator network to explicitly estimate the blur kernel and guide the SR network, which requires the supervision of ground-truth (GT) kernels. To remove this dependence, it is necessary to design an implicit estimator network that can extract a discriminative blur kernel representation without relying on the supervision of GT blur kernels. We design a lightweight approach for Blind-SR that estimates the blur kernel and restores the HR image based on a deep convolutional neural network (CNN) and a deep super-resolution residual convolutional generative adversarial network. Since the blur kernel for blind image SR is unknown, following the image formation model of the blind super-resolution problem, we first introduce a neural network-based model to estimate the blur kernel. This is achieved by (i) a Super Resolver that, from a low-resolution input, generates the corresponding SR image; and (ii) an Estimator Network that generates the blur kernel from the input datum. The outputs of both models are used in a novel loss formulation. The proposed network is end-to-end trainable. The proposed methodology is substantiated by both quantitative and qualitative experiments. Results on benchmarks demonstrate that our computationally efficient approach (12× fewer parameters than the state-of-the-art models) performs favorably with respect to existing approaches and can be used on devices with limited computational capabilities. Full article
(This article belongs to the Special Issue Computer Vision, Pattern Recognition and Machine Learning in Italy)

19 pages, 1488 KiB  
Review
Blockchain and Machine Learning: A Critical Review on Security
by Hamed Taherdoost
Information 2023, 14(5), 295; https://doi.org/10.3390/info14050295 - 17 May 2023
Cited by 8 | Viewed by 7038
Abstract
Blockchain is the foundation of all cryptocurrencies, while machine learning (ML) is one of the most popular technologies with a wide range of possibilities. Blockchain may be improved and made more effective by using ML. Even though blockchain technology uses encryption to safeguard data, it is not completely reliable. Various elements, including the particular use case, the type of data, and legal constraints can determine whether it is suitable for keeping private and sensitive data. While there may be benefits, it is important to take into account possible hazards and abide by privacy and security laws. The blockchain itself is secure, but additional applications and layers are not. In terms of security, ML can aid in the development of blockchain applications. Therefore, a critical investigation is required to better understand the function of ML and blockchain in enhancing security. This study examines the current situation, evaluates the articles it contains, and presents an overview of the security issues. Despite their existing limitations, the papers included from 2012 to 2022 highlighted the importance of ML’s impact on blockchain security. ML and blockchain can enhance security, but challenges remain; advances such as federated learning and zero-knowledge proofs are important, and future research should focus on privacy and integration with other technologies. Full article
(This article belongs to the Special Issue Machine Learning for the Blockchain)

16 pages, 14957 KiB  
Article
Virtual and Augmented Experience in Virtual Learning Tours
by Fotios Bosmos, Alexandros T. Tzallas, Markos G. Tsipouras, Evripidis Glavas and Nikolaos Giannakeas
Information 2023, 14(5), 294; https://doi.org/10.3390/info14050294 - 16 May 2023
Viewed by 1430
Abstract
The aim of this work is to highlight the possibilities of using VR applications in the informal learning process. This is attempted through the development of virtual reality cultural applications for historical monuments. For this purpose, the theoretical framework of virtual and augmented reality techniques is presented, and a virtual environment of the historical bridge of Arta, in Greece, is developed as a showcase. The bridge model is created with 3D software and then imported into a virtual world environment using the Unity engine. The main objective of the research is the technical and empirical evaluation of the VR application by specialists, in comparison with the real environment of the monument. Accordingly, the use of the application in the learning process is evaluated by high school students. Using the conclusions of the evaluation, the environment will be enriched with multimedia elements and the application will be evaluated by secondary school students as a learning experience and process, using electroencephalography (EEG). The recording and analysis of research results can be generalized and lead to safe conclusions for the use of similar applications in the field of culture and learning. Full article

14 pages, 2115 KiB  
Article
Semi-Supervised Model for Aspect Sentiment Detection
by Zohreh Madhoushi, Abdul Razak Hamdan and Suhaila Zainudin
Information 2023, 14(5), 293; https://doi.org/10.3390/info14050293 - 16 May 2023
Cited by 1 | Viewed by 1138
Abstract
Advancements in text representation have produced many deep language models (LMs), such as Word2Vec and recurrent-based LMs. However, few works focus on detecting implicit sentiments with a small amount of labelled data, because there are many different review areas. Deep learning techniques are suitable to automate the representation learning process. Hence, we proposed a semi-supervised aspect-based sentiment analysis (ABSA) model for online reviews to predict explicit and implicit sentiment in three domains (laptop, restaurant, and hotel). The datasets of this study, S1 and S2, were obtained from a standard SemEval online competition and Amazon review datasets. The proposed models outperform the previous baseline models regarding the F1-score of aspect category detection and accuracy of sentiment detection. This study finds more relevant aspects and accurate sentiment for ABSA by developing more stable and robust models. The accuracy of sentiment detection is 84.87% in the restaurant domain on the first dataset. For the second dataset, the proposed method achieved 84.43% in the laptop domain, 85.21% in the restaurant domain, and 85.57% in the hotel domain. The novelty lies in the proposed semi-supervised model for aspect sentiment detection with embedded aspect, inspired by the encoder–decoder architecture in the neural machine translation (NMT) model. Full article

32 pages, 1146 KiB  
Article
Online Task Scheduling of Big Data Applications in the Cloud Environment
by Laila Bouhouch, Mostapha Zbakh and Claude Tadonki
Information 2023, 14(5), 292; https://doi.org/10.3390/info14050292 - 15 May 2023
Cited by 2 | Viewed by 1475
Abstract
The development of big data has generated data-intensive tasks that are usually time-consuming, with a high demand on cloud data centers for hosting big data applications. It becomes necessary to consider both data and task management to find the optimal resource allocation scheme, which is a challenging research issue. In this paper, we address the problem of online task scheduling combined with data migration and replication in order to reduce the overall response time as well as ensure that the available resources are efficiently used. We introduce a new scheduling technique, named Online Task Scheduling algorithm based on Data Migration and Data Replication (OTS-DMDR). The main objective is to efficiently assign online incoming tasks to the available servers while considering the access time of the required datasets and their replicas, the execution time of the task in different machines, and the computational power of each machine. The core idea is to achieve better data locality by performing an effective data migration while handling replicas. As a result, the overall response time of the online tasks is reduced, and the throughput is improved with enhanced machine resource utilization. To validate the performance of the proposed scheduling method, we run in-depth simulations with various scenarios and the results show that our proposed strategy performs better than the other existing approaches. In fact, it reduces the response time by 78% when compared to the First Come First Served scheduler (FCFS), by 58% compared to the Delay Scheduling, and by 46% compared to the technique of Li et al. Consequently, the present OTS-DMDR method is very effective and convenient for the problem of online task scheduling. Full article
(This article belongs to the Special Issue Internet of Things and Cloud-Fog-Edge Computing)
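The response-time comparisons in this abstract are made against baselines such as First Come First Served (FCFS). The sketch below shows how FCFS response times can be computed for a single server. It is a toy illustration of the baseline only, not the OTS-DMDR algorithm, and it ignores data access, migration, and replication. The task list and function name are hypothetical.

```python
def fcfs_response_times(tasks):
    """tasks: list of (arrival_time, service_time) tuples.
    Returns the response time (finish minus arrival) of each task,
    served in arrival order on one server."""
    clock = 0.0
    responses = []
    for arrival, service in sorted(tasks, key=lambda t: t[0]):
        start = max(clock, arrival)  # wait until the server is free
        clock = start + service      # server is busy until the task ends
        responses.append(clock - arrival)
    return responses


# Hypothetical workload: three tasks arriving one time unit apart.
tasks = [(0.0, 4.0), (1.0, 2.0), (2.0, 1.0)]
times = fcfs_response_times(tasks)
mean_response = sum(times) / len(times)
print(times, mean_response)
```

In this workload the long first task delays the two short ones behind it, which is the kind of queueing effect that a scheduler aware of execution times and data location tries to reduce.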

22 pages, 1505 KiB  
Article
Identifying Critical Indicators in the Evaluation of Third-Party Reverse Logistics Provider Using Best–Worst Method
by Changlu Zhang, Liqian Tang and Jian Zhang
Information 2023, 14(5), 291; https://doi.org/10.3390/info14050291 - 14 May 2023
Cited by 1 | Viewed by 1326
Abstract
Evaluation and selection of a third-party reverse logistics provider (3PRLP) is an important tool for enterprises to improve the level of reverse logistics management. The identification of critical indicators plays a crucial role in the evaluation process. Firstly, on the basis of fully considering the characteristics of 3PRLP evaluation and selection, we summarized 27 evaluation indicators from 5 dimensions—overall operation level, management service level, information technology level, social and ecological benefits, and strategic alliance. Secondly, we adopted the Delphi method to determine the formal evaluation index system, and the best–worst method (BWM) to calculate the weight of each indicator. We determined the critical evaluation indicators on the basis of the weights. Finally, based on our results, corresponding countermeasures and implications regarding the process of 3PRLP evaluation were put forward. The research results show that the critical indicators include transportation allocation capacity, network coverage, service price level, service response speed, recovery efficiency, and service flexibility level. Full article
(This article belongs to the Special Issue New Applications in Multiple Criteria Decision Analysis)

22 pages, 2608 KiB  
Article
Simulated Autonomous Driving Using Reinforcement Learning: A Comparative Study on Unity’s ML-Agents Framework
by Yusef Savid, Reza Mahmoudi, Rytis Maskeliūnas and Robertas Damaševičius
Information 2023, 14(5), 290; https://doi.org/10.3390/info14050290 - 14 May 2023
Cited by 3 | Viewed by 3798
Abstract
Advancements in artificial intelligence are leading researchers to find use cases that were not as straightforward to solve in the past. The use case of simulated autonomous driving has been known as a notoriously difficult task to automate, but advancements in the field of reinforcement learning have made it possible to reach satisfactory results. In this paper, we explore the use of the Unity ML-Agents toolkit to train intelligent agents to navigate a racing track in a simulated environment using RL algorithms. The paper compares the performance of several different RL algorithms and configurations on the task of training kart agents to traverse a racing track, and identifies the most effective approach for training them to navigate the track and avoid obstacles on it. The best results, a value loss of 0.0013 and a cumulative reward of 0.761, were yielded using the Proximal Policy Optimization algorithm. After successfully choosing a model and algorithm that can traverse the track with ease, different objects were added to the track and another model (which used behavioral cloning as a pre-training option) was trained to avoid such obstacles. The aforementioned model resulted in a value loss of 0.001 and a cumulative reward of 0.068, proving that behavioral cloning can help achieve satisfactory results where the in-game agents are able to avoid obstacles more efficiently and complete the track with human-like performance, allowing for the deployment of intelligent agents in racing simulators. Full article
(This article belongs to the Special Issue Feature Papers in Information in 2023)

14 pages, 2169 KiB  
Article
Hemodynamic and Electrophysiological Biomarkers of Interpersonal Tuning during Interoceptive Synchronization
by Michela Balconi and Laura Angioletti
Information 2023, 14(5), 289; https://doi.org/10.3390/info14050289 - 13 May 2023
Cited by 1 | Viewed by 1432
Abstract
This research explored the influence of interoception and social frame on the coherence of inter-brain electrophysiological (EEG) and hemodynamic (collected by functional Near Infrared Spectroscopy, fNIRS) functional connectivity during a motor synchronization task. Fourteen dyads executed a motor synchronization task in the presence and absence of interoceptive focus. Moreover, the motor task was socially or non-socially framed by enhancing the shared intentionality. During the experiment, delta, theta, alpha, and beta frequency bands, and oxygenated and de-oxygenated hemoglobin (O2Hb and HHb) were collected through an EEG-fNIRS hyperscanning paradigm. Inter-brain coherence indices were computed for the two neurophysiological signals and then they were correlated to explore the reciprocal coherence of the EEG-fNIRS functional connectivity in the dyads. Findings showed significantly higher correlational values between delta and O2Hb, theta and O2Hb, and alpha and O2Hb for the left hemisphere in the focus compared to the no focus condition and to the right hemisphere (both during focus and no focus condition). Additionally, greater correlational values between delta and O2Hb, and theta and O2Hb were observed in the left hemisphere for the focus condition when the task was socially framed compared to non-socially framed. This study showed that the focus on the breath and shared intentionality coherently activate the same left frontal areas in dyads performing a joint motor task. Full article
(This article belongs to the Special Issue Feature Papers in Information in 2023)

20 pages, 429 KiB  
Article
Intent Classification by the Use of Automatically Generated Knowledge Graphs
by Mihael Arcan, Sampritha Manjunath, Cécile Robin, Ghanshyam Verma, Devishree Pillai, Simon Sarkar, Sourav Dutta, Haytham Assem, John P. McCrae and Paul Buitelaar
Information 2023, 14(5), 288; https://doi.org/10.3390/info14050288 - 12 May 2023
Viewed by 2753
Abstract
Intent classification is an essential task for goal-oriented dialogue systems for automatically identifying customers’ goals. Although intent classification performs well in general settings, domain-specific user goals can still present a challenge for this task. To address this challenge, we automatically generate knowledge graphs for targeted data sets to capture domain-specific knowledge and leverage embeddings trained on these knowledge graphs for the intent classification task. As existing knowledge graphs might not be suitable for a targeted domain of interest, our automatic generation of knowledge graphs can extract the semantic information of any domain, which can be incorporated within the classification process. We compare our results with state-of-the-art pre-trained sentence embeddings and our evaluation of three data sets shows improvement in the intent classification task in terms of precision. Full article
(This article belongs to the Special Issue Knowledge Graph Technology and its Applications II)

14 pages, 1306 KiB  
Article
Engagement with Optional Formative Feedback in a Portfolio-Based Digital Design Module
by Eirini Kalaitzopoulou, Paul Matthews, Stylianos Mystakidis and Athanasios Christopoulos
Information 2023, 14(5), 287; https://doi.org/10.3390/info14050287 - 12 May 2023
Cited by 1 | Viewed by 1941
Abstract
Design skills are considered important in software engineering, and formative feedback may facilitate the learning process and help students master those skills. However, little is known about student usage of and reaction to the feedback and its impact on learning and assessment outcomes. This study explores the effects of optional formative assessment feedback on learners’ performance and engagement by considering LMS interactions, student demographics, personality types, and motivation sources. Forty-five postgraduate students completed an enrolment questionnaire addressing the Big Five personality dimensions, the Situational Motivation Scale and background data. The main methods included monitoring LMS engagement over 10 weeks of teaching and analysing assessment marks to develop student profiles and assess the influence of formative feedback on engagement and performance. The main findings revealed that while formative feedback helped improve marks on portfolio tasks, it did not lead to higher performance overall compared to students who did not receive it. Students seeking feedback engaged more actively with the LMS assessments. Feedback-seeking behaviour was associated with gender, intrinsic motivation, conscientiousness, and extrinsic motivation, although not all associations were significant. The study’s main contributions are in highlighting the impact of formative feedback on performance in linked assessments and in starting to reveal the complex relationship between feedback-seeking behaviour and student characteristics. Full article
(This article belongs to the Special Issue Information Technologies in Education, Research and Innovation)

16 pages, 760 KiB  
Article
Using Genetic Algorithms to Improve Airport Pavement Structural Condition Assessment: Code Development and Case Study
by Alessia Donato and David Carfì
Information 2023, 14(5), 286; https://doi.org/10.3390/info14050286 - 11 May 2023
Viewed by 1189
Abstract
In this paper, we propose a new method of optimization based on genetic algorithms using the MATLAB toolbox “Global Optimization”. The algorithm finds the layer moduli of a flexible pavement through the measurement of pavement surface deflections under assigned load conditions. First, the algorithm for the forward calculation is validated, then the algorithm for the back-calculation is proposed, and the results are compared, in the case of airport pavements, with other software using different back-calculation techniques. The quality of the procedure and the way of managing the algorithm operator is demonstrated by means of positive feedback obtained from the comparison of the results of ELMOD and BackGenetic3D. Moreover, the findings of the analysis prove that, in such an optimization procedure by GA, the best solution is always reached with a low number of generations, generally fewer than 10, which reduces the calculation time and allows a population large enough that the initial population contains, with good probability, solutions close to the real ones. The code is made available in such a way that the reader can easily apply it to other flexible pavements in the case of fully bonded layers (both for roads and airports). In particular, interested readers can easily modify the algorithm parameters (population number, stop criteria, probability of mutation, cross-over, and reproduction) and the type of fitness function to minimize, together with the geometric and load characteristics (number and thickness of the layers and the range of modulus variation). The possibility to change the algorithm parameters and the fitness function allows for exploring different scenarios in order to find the best solution in terms of fitness values. It is also possible to intervene in the time of calculation by managing the algorithm’s stopping criteria. Full article
(This article belongs to the Special Issue Machine Intelligence in Interdisciplinary Areas)

28 pages, 1207 KiB  
Review
A Comprehensive Review of the Novel Weighting Methods for Multi-Criteria Decision-Making
by Büşra Ayan, Seda Abacıoğlu and Marcio Pereira Basilio
Information 2023, 14(5), 285; https://doi.org/10.3390/info14050285 - 11 May 2023
Cited by 39 | Viewed by 9216
Abstract
In multi-criteria decision-making (MCDM) problems, the selection of a weighting method plays a critical role. Researchers from diverse fields have consistently employed MCDM techniques, utilizing both traditional and novel methods to enhance the discipline. Acknowledging the significance of staying abreast of such methodological developments, this study endeavors to contribute to the field through a comprehensive review of several novel weighting-based methods: CILOS, IDOCRIW, FUCOM, LBWA, SAPEVO-M, and MEREC. Each method is examined in terms of its characteristics and steps, drawing on publications extracted from the Web of Science (WoS) and Scopus databases. Through bibliometric and content analyses, this study examines the trends, research components (sources, authors, countries, and affiliations), application areas, fuzzy implementations, hybrid studies (use of other weighting and/or ranking methods), and application tools for these methods. The findings of this review offer an insightful portrayal of the applications of each novel weighting method, thereby contributing valuable knowledge for researchers and practitioners within the field of MCDM.

18 pages, 4115 KiB  
Article
Intelligence Amplification-Based Smart Health Record Chain for Enterprise Management System
by S. Velliangiri, P. Karthikeyan, Vinayakumar Ravi, Meshari Almeshari and Yasser Alzamil
Information 2023, 14(5), 284; https://doi.org/10.3390/info14050284 - 11 May 2023
Viewed by 1514
Abstract
Medical service providers generate many healthcare records containing sensitive and private information about a patient’s health. The patient can allow healthcare service providers to generate healthcare data, which can be stored with those providers. If the patient later wants to share the healthcare records of one healthcare service provider with another, the records can be exchanged quickly using our approach. The challenges faced by healthcare service providers are healthcare record sharing, tampering, and insurance fraud. We have developed a Health Record Chain for sharing medical data using a modified SHA-512 algorithm. In our evaluation, the method outperforms traditional methods in terms of storage cost and total time consumption for health record sharing. The proposed model takes 130 ms to share 100,000 records, 32 ms faster than traditional methods. It also resists various security attacks, as verified by an automated security protocol verification tool.
(This article belongs to the Special Issue Trends in Electronics and Health Informatics)

17 pages, 3858 KiB  
Article
Uyghur–Kazakh–Kirghiz Text Keyword Extraction Based on Morpheme Segmentation
by Sardar Parhat, Mutallip Sattar, Askar Hamdulla and Abdurahman Kadir
Information 2023, 14(5), 283; https://doi.org/10.3390/info14050283 - 10 May 2023
Viewed by 1159
Abstract
In this study, based on a morpheme segmentation framework, we researched a text keyword extraction method for the Uyghur, Kazakh and Kirghiz languages, which have similar grammatical and lexical structures. In these languages, affixes and a stem are joined together to form a word. A stem is a word particle with a notional meaning, while the affixes perform grammatical functions. Because of these derivative properties, the vocabularies of these languages are huge. Therefore, pre-processing is a necessary step in NLP tasks for Uyghur, Kazakh and Kirghiz. Morpheme segmentation enabled us to remove the suffixes as auxiliary units while retaining the meaningful stem, and it reduced the dimension of the feature space in the keyword extraction task for Uyghur, Kazakh and Kirghiz texts. We transformed the morpheme segmentation task into the problem of labeling morpheme sequences, and we used a Bi-LSTM network to obtain the position feature information of character sequences bidirectionally. We applied CRF to effectively learn the information of the preceding and following label sequences to build a highly accurate Bi-LSTM_CRF morpheme segmentation model, and we prepared morpheme-based experimental text sets by using this model. Subsequently, after training the stem embedding vectors using the Doc2vec algorithm, we used the similarity of the stem vectors to modify the TextRank algorithm, and then we performed a text keyword extraction experiment. In this experiment, the highest F1 scores of 43.8%, 44% and 43.9% were obtained for the three datasets. The experimental results show that the morpheme-based approach provides much better results than the word-based approach, which indicates that stem vector similarity weighting is an efficient method for the text keyword extraction task and demonstrates the efficiency of morpheme sequences for morphologically derivative languages.

16 pages, 1253 KiB  
Article
A Double-Stage 3D U-Net for On-Cloud Brain Extraction and Multi-Structure Segmentation from 7T MR Volumes
by Selene Tomassini, Haidar Anbar, Agnese Sbrollini, MHD Jafar Mortada, Laura Burattini and Micaela Morettini
Information 2023, 14(5), 282; https://doi.org/10.3390/info14050282 - 10 May 2023
Cited by 2 | Viewed by 1883
Abstract
The brain is the organ most studied using Magnetic Resonance (MR). The emergence of 7T scanners has increased MR imaging resolution to a sub-millimeter level. However, there is a lack of automatic segmentation techniques for 7T MR volumes. This research aims to develop a novel deep learning-based algorithm for on-cloud brain extraction and multi-structure segmentation from unenhanced 7T MR volumes. To this end, a double-stage 3D U-Net was implemented in a cloud service, directing its first stage to the automatic extraction of the brain and its second stage to the automatic segmentation of the grey matter, basal ganglia, white matter, ventricles, cerebellum, and brain stem. Training was performed on 90% of the Glasgow database (10% of which served for validation) and testing on the remaining 10%. A mean test Dice Similarity Coefficient (DSC) of 96.33% was achieved for the brain class. Mean test DSCs of 90.24%, 87.55%, 93.82%, 85.77%, 91.53%, and 89.95% were achieved for the brain structure classes, respectively. Therefore, the proposed double-stage 3D U-Net is effective in brain extraction and multi-structure segmentation from 7T MR volumes without any preprocessing or training data augmentation strategy while ensuring its machine-independent reproducibility.
(This article belongs to the Special Issue Artificial Intelligence and Big Data Applications)
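The Dice Similarity Coefficient reported in this abstract measures the overlap between a predicted segmentation mask and a reference mask. As a minimal sketch, assuming binary masks flattened to sequences of 0/1 voxel labels (the function name `dice_coefficient` and the toy masks are illustrative and do not come from the paper), it could be computed as follows:

```python
from typing import Sequence


def dice_coefficient(pred: Sequence[int], truth: Sequence[int]) -> float:
    """Return DSC = 2|A ∩ B| / (|A| + |B|) for two binary masks."""
    if len(pred) != len(truth):
        raise ValueError("masks must have the same number of voxels")
    # Voxels labelled as foreground in both masks.
    intersection = sum(1 for p, t in zip(pred, truth) if p and t)
    # Total foreground voxels across the two masks.
    total = sum(1 for p in pred if p) + sum(1 for t in truth if t)
    # Convention assumed here: two empty masks agree perfectly.
    return 1.0 if total == 0 else 2.0 * intersection / total


# Toy example: 3 voxels overlap, 4 are predicted, 4 are in the reference.
pred = [1, 1, 1, 1, 0, 0]
truth = [0, 1, 1, 1, 1, 0]
print(dice_coefficient(pred, truth))  # 2*3 / (4+4) = 0.75
```

A multi-class result such as the per-structure DSCs above would typically apply this to one binary mask per class; that per-class loop is omitted here.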

18 pages, 1069 KiB  
Article
A Blockchain-Based Efficient and Verifiable Attribute-Based Proxy Re-Encryption Cloud Sharing Scheme
by Tao Feng, Dewei Wang and Renbin Gong
Information 2023, 14(5), 281; https://doi.org/10.3390/info14050281 - 09 May 2023
Cited by 2 | Viewed by 1781
Abstract
When choosing a third-party cloud storage platform, the confidentiality of data should be the primary concern. To address the issue of one-to-many access control during data sharing, it is important to encrypt data with an access policy that enables fine-grained access. The attribute-based encryption scheme can be used for this purpose. Additionally, attribute-based proxy re-encryption (ABPRE) can generate a secret key using the delegatee’s secret key and access policy to re-encrypt the ciphertext, allowing for one-to-many data sharing. However, this scheme still has some flaws, such as low efficiency, inability to update access rules, and private data leakage. To address these issues, we propose a scheme that combines attribute-based encryption (ABE) and identity-based encryption (IBE) to achieve efficient data sharing and data correctness verification. We also integrate this scheme with blockchain technology to ensure tamper-proof and regulated data storage, addressing issues such as data tampering and lack of supervision on third-party servers. Finally, we analyzed the security of our scheme and evaluated its communication and computation overhead. Our results showed that our scheme is more efficient than other schemes and is secure against chosen plaintext attacks with verifiable properties.
(This article belongs to the Special Issue Advances in Computing, Communication & Security)

18 pages, 3320 KiB  
Article
Blockchain-Based Automated Market Makers for a Decentralized Stock Exchange
by Radhakrishna Dodmane, Raghunandan K. R., Krishnaraj Rao N. S., Bhavya Kallapu, Surendra Shetty, Muhammad Aslam and Syeda Fizzah Jilani
Information 2023, 14(5), 280; https://doi.org/10.3390/info14050280 - 09 May 2023
Cited by 3 | Viewed by 1734
Abstract
Advances in communication speeds have enabled the centralized financial market to be faster and more complex than ever. Order execution has become far faster than in the early days of electronic markets. Though the transaction speed has increased, the underlying architecture and models behind the markets have remained the same. These models come with their own disadvantages, which are usually faced by non-institutional or small traders. Bigger players, such as financial institutions, benefit from factors such as information asymmetry and access to better infrastructure, which give them an advantage in terms of the speed of execution. This makes the centralized stock market an uneven playing field. This paper discusses these limitations of centralized financial markets. The authors propose the use of blockchain technology and the data highway protocol to create a decentralized stock exchange that can potentially eliminate these disadvantages. The data highway protocol is used to generate new blocks with a flexible finality condition that allows the consensus mechanism to configure security thresholds more freely. The proposed framework is compared with existing frameworks to confirm its effectiveness and identify areas that require improvement. The evaluation of the proposed approach showed that the improved highway protocol boosted the transaction rate compared to the other two mechanisms (PoS and PoW). Specifically, the transaction rate of the proposed model was found to be 2.2 times higher than that of PoS and 12 times higher than that of the PoW consensus model.

14 pages, 1049 KiB  
Article
The Psychometric Function for Focusing Attention on Pitch
by Adam Reeves
Information 2023, 14(5), 279; https://doi.org/10.3390/info14050279 - 09 May 2023
Viewed by 1211
Abstract
What is the effect of focusing auditory attention on an upcoming signal tone? Weak signal tones, 40 ms in duration, were presented in 50 dB continuous white noise and were either uncued or cued 82 ms beforehand by a 12 dB SL cue tone of the same frequency and duration as the signal. Signal frequency was either constant for a block of trials or was randomly one of 11 frequencies from 632 to 3140 Hz. Slopes of psychometric functions for detection in single-interval (Yes/No) trials were obtained from three listeners by varying the signal level over a 1–9 dB range. Plots of log(d’) against signal dB were fit by linear functions. Slopes were similar whether signal frequency was constant or varied, as found by D. Green. Slopes for uncued tones were 14% to 20% steeper than predicted by signal energy (i.e., 0.10), as also found previously, whereas slopes for cued tones followed signal energy corrected for an 8 dB sensory threshold. That pre-cues help attention focus rapidly on signal frequency and permit listeners to act as near-ideal detectors of signal energy, which they do not do otherwise, supports a key hypothesis of Grossberg’s ART model that attention guided by conscious awareness can optimize perception.
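The predicted slope of 0.10 follows from the decibel definition if detectability is proportional to signal energy. A short derivation under that assumption, taking log to mean base 10 and writing $E_0$ for an arbitrary reference energy and $k$ for a proportionality constant (both symbols are introduced here for illustration only):

```latex
\begin{align}
L &= 10 \log_{10}\!\left(\frac{E}{E_0}\right) && \text{(signal level in dB)} \\
d' &= k\,E = k\,E_0\,10^{L/10} && \text{(assumed: $d'$ proportional to energy)} \\
\log_{10} d' &= \frac{L}{10} + \log_{10}(k E_0)
\end{align}
```

A plot of $\log_{10} d'$ against signal level in dB therefore has slope $1/10 = 0.10$. On this reading, slopes 14% to 20% above the prediction would correspond to roughly 0.114 to 0.120.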

13 pages, 1991 KiB  
Article
Deep Learning Pet Identification Using Face and Body
by Elham Azizi and Loutfouz Zaman
Information 2023, 14(5), 278; https://doi.org/10.3390/info14050278 - 08 May 2023
Cited by 1 | Viewed by 3384
Abstract
According to the American Humane Association, millions of cats and dogs are lost yearly. Only a few thousand of them are found and returned home. In this work, we use deep learning to help expedite the procedure of finding lost cats and dogs, for which a new dataset is collected. We applied transfer learning methods on different convolutional neural networks for species classification and animal identification. The framework consists of seven sequential layers: data preprocessing, species classification, face and body detection with landmark detection techniques, face alignment, identification, animal soft biometrics, and recommendation. We achieved an accuracy of 98.18% on species classification. In the face identification layer, 80% accuracy was achieved. Body identification resulted in 81% accuracy. When using body identification in addition to face identification, the accuracy increased to 86.5%, with a 100% chance that the animal would be among our top 10 recommended matches. By incorporating animals’ soft biometric information, the system can identify animals with 92% confidence.

17 pages, 3232 KiB  
Systematic Review
A Systematic Literature Review on Adaptive Supports in Serious Games for Programming
by Pavlos Toukiloglou and Stelios Xinogalos
Information 2023, 14(5), 277; https://doi.org/10.3390/info14050277 - 08 May 2023
Cited by 3 | Viewed by 1760
Abstract
This paper reviews the research on adaptive serious games for programming regarding the implementation of their support systems. Serious games are designed to educate players in an entertaining and engaging manner. A key element in terms of meeting their educational goals is the presentation of the learning content through a support system. Recent developments in artificial intelligence, data analysis, and computing have made it possible to develop support systems that adapt to players’ individual characteristics. A systematic literature review is necessary to evaluate the efficiency of adaptive supports and examine the implementation approaches. This review identified 18 papers reporting evidence about the efficiency of the provided support and methods of development. A variety of techniques for presenting educational content was found, with text being the preferred type. Researchers employed data-driven approaches, such as Bayesian networks and questionnaires, to model student knowledge levels and behavior, with fuzzy logic being utilized most frequently. The efficiency of the supports, when compared with non-adaptive or traditional methods of teaching, was mostly positive, although this is not a decisive conclusion. Some papers did not provide empirical evidence or concluded no difference in efficiency. The limited number of articles in the field, together with the lack of a standard evaluation methodology, leads to the conclusion that further work needs to be carried out in the area.
(This article belongs to the Special Issue Game Informatics)

23 pages, 6366 KiB  
Article
Oriented Crossover in Genetic Algorithms for Computer Networks Optimization
by Furkan Rabee and Zahir M. Hussain
Information 2023, 14(5), 276; https://doi.org/10.3390/info14050276 - 05 May 2023
Viewed by 2183
Abstract
Optimization using genetic algorithms (GA) is a well-known strategy in several scientific disciplines. Crossover is an essential operator of the genetic algorithm, and developing sustainable forms of this operator has been an active area of research. In this work, a new crossover operator is proposed. This operator depends on giving an elicited description for the chromosome with a new structure for the alleles of the parents. It is suggested that each allele has two attitudes that contrast with each other and together complement the allele. Thus, in the case where one attitude is good, the other should be bad. This is suitable for many systems that contain desired parameters and undesired parameters. The proposed crossover would improve the desired attitudes and dampen the undesired attitudes. The proposed crossover can be achieved in two stages. The first stage is a mating method for both attitudes in one parent to improve one attitude at the expense of the other. The second stage comes after the first improvement stage and mates different parents. Hence, two concurrent steps for improvement would be applied. Simulation experiments for the system show improvement in the fitness function. The proposed crossover could be helpful in different fields, especially to optimize routing algorithms and network protocols, an application that has been tested as a case study in this work.
(This article belongs to the Special Issue Intelligent Information Processing for Sensors and IoT Communications)

20 pages, 4722 KiB  
Article
Enhancing Organizational Data Security on Employee-Connected Devices Using BYOD Policy
by Manal Rajeh AlShalaan and Suliman Mohamed Fati
Information 2023, 14(5), 275; https://doi.org/10.3390/info14050275 - 05 May 2023
Viewed by 1846
Abstract
To address a business need, most organizations allow employees to use their own devices to enhance productivity and job satisfaction. For this purpose, the Bring Your Own Device (BYOD) policy provides employees with controllable access to organizational data through their personal devices. Although the BYOD practice offers plenty of advantages, it also opens the door to a variety of security risks. This study investigates these security risks and proposes a complementary encryption approach with a digital signature that uses symmetric and asymmetric algorithms, depending on the organization’s digital certificate, to secure sensitive information stored in employees’ devices within the framework of BYOD policies. The method uses the Advanced Encryption Standard (AES), Blowfish, RSA and ElGamal with a digital signature to achieve strong encryption and address critical security considerations such as user authentication, confidentiality and data integrity. The proposed encryption approach offers a robust and effective cryptographic solution for securing sensitive information in organizational settings that involve BYOD policies. The study includes experimental results demonstrating the proposed approach’s efficiency and performance, with reasonable encryption and decryption times for different key and file sizes. The results of the study revealed that AES and Blowfish have the best execution time. AES has a good balance of security and performance. RSA performs better than ElGamal in encryption and signature verification, while RSA is slower than ElGamal in decryption. The study also provides a comparative analysis with previous studies of the four encryption algorithms, highlighting the strengths and weaknesses of each approach.
(This article belongs to the Special Issue Advances in Cybersecurity and Reliability)

22 pages, 6616 KiB  
Article
Continuous User Authentication on Multiple Smart Devices
by Yajie Wang, Xiaomei Zhang and Haomin Hu
Information 2023, 14(5), 274; https://doi.org/10.3390/info14050274 - 05 May 2023
Cited by 1 | Viewed by 2108
Abstract
Recent developments in the mobile and intelligence industry have led to an explosion in the use of multiple smart devices such as smartphones, tablets, smart bracelets, etc. To achieve lasting security after initial authentication, many studies have been conducted to apply user authentication through behavioral biometrics. However, few of them consider continuous user authentication on multiple smart devices. In this paper, we investigate user authentication from a new perspective: continuous authentication on multiple devices, that is, continuously authenticating users after both initial access to one device and transfer to other devices. In contrast to previous studies, we propose a continuous user authentication method that exploits behavioral biometric identification on multiple smart devices. In this study, we consider the sensor data captured by accelerometer and gyroscope sensors on both smartphones and tablets. Furthermore, multi-device behavioral biometric data are utilized as the input of our optimized neural network model, which combines a convolutional neural network (CNN) and a long short-term memory (LSTM) network. In particular, we construct two-dimensional domain images to characterize the underlying features of sensor signals between different devices and then input them into our network for classification. In order to strengthen the effectiveness and efficiency of authentication on multiple devices, we introduce an adaptive confidence-based strategy by taking historical user authentication results into account. This paper evaluates the performance of our multi-device continuous user authentication mechanism under different scenarios, and extensive empirical results demonstrate its feasibility and efficiency. Using the mechanism, we achieved mean accuracies of 99.8% and 99.2% for smartphones and tablets, respectively, in approximately 2.3 s, which shows that it authenticates users accurately and quickly.
(This article belongs to the Special Issue Advances in Computing, Communication & Security)

13 pages, 6998 KiB  
Article
Quadrilateral Mesh Generation Method Based on Convolutional Neural Network
by Yuxiang Zhou, Xiang Cai, Qingfeng Zhao, Zhoufang Xiao and Gang Xu
Information 2023, 14(5), 273; https://doi.org/10.3390/info14050273 - 04 May 2023
Cited by 1 | Viewed by 1566
Abstract
The frame field distributed inside the model region characterizes the singular structure features inside the model. These singular structures can be used to decompose the model region into multiple quadrilateral structures, thereby generating a block-structured quadrilateral mesh. For the generation of block-structured quadrilateral meshes for two-dimensional geometric models, a convolutional neural network model is proposed to identify the singular structure inside the model contained in the frame field. By training the network model with a large number of model region decomposition data obtained in advance, the model can identify the vectors of the frame field in the region located in the segmentation field. Then, the segmentation streamline is constructed from the annotation. Based on this, the geometric region is decomposed into several small regions, which are then discretized with quadrilateral mesh elements. Finally, two geometric models are used to verify that the convolutional neural network model proposed in this study can effectively identify the singular structure inside the model to realize the model region decomposition and block-structured mesh generation.

13 pages, 395 KiB  
Article
Improving Semantic Information Retrieval Using Multinomial Naive Bayes Classifier and Bayesian Networks
by Wiem Chebil, Mohammad Wedyan, Moutaz Alazab, Ryan Alturki and Omar Elshaweesh
Information 2023, 14(5), 272; https://doi.org/10.3390/info14050272 - 03 May 2023
Cited by 3 | Viewed by 1860
Abstract
This research proposes a new approach to improve information retrieval systems based on a multinomial naive Bayes classifier (MNBC), Bayesian networks (BNs), and a multi-terminology which includes the MeSH thesaurus (Medical Subject Headings) and SNOMED CT (Systematized Nomenclature of Medicine Clinical Terms). Our approach, which is entitled improving semantic information retrieval (IMSIR), extracts and disambiguates concepts and retrieves documents. Relevant concepts of ambiguous terms were selected using probability measures and biomedical terminologies. Concepts are also extracted using an MNBC. The UMLS (Unified Medical Language System) thesaurus was then used to filter and rank concepts. Finally, we exploited a Bayesian network to match documents and queries using a conceptual representation. Our main contribution in this paper is to combine a supervised method (MNBC) and an unsupervised method (BN) to extract concepts from documents and queries. We also propose filtering the extracted concepts in order to keep relevant ones. In experiments on two corpora, the OHSUMED corpus and the Clinical Trial (CT) corpus, IMSIR outperformed the baseline: the P@50 improvement rate was +36.5% over the baseline when the CT corpus was used.
(This article belongs to the Special Issue Artificial Intelligence and Big Data Applications)
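P@50 in this abstract is precision at rank 50, the fraction of the top 50 retrieved documents that are relevant. A minimal sketch of the metric, assuming documents are identified by string IDs and relevance judgments are given as a set (the function name `precision_at_k` and the toy data are illustrative, not taken from the paper):

```python
from typing import Iterable, Sequence


def precision_at_k(ranked_ids: Sequence[str], relevant_ids: Iterable[str], k: int) -> float:
    """Fraction of the top-k ranked documents that are relevant."""
    if k <= 0:
        raise ValueError("k must be a positive integer")
    relevant = set(relevant_ids)
    # Count relevant documents among the first k results.
    hits = sum(1 for doc_id in ranked_ids[:k] if doc_id in relevant)
    # Assumed convention: divide by k even if fewer than k documents
    # were returned, so missing results count as non-relevant.
    return hits / k


# Toy example: two of the top four results are relevant.
ranking = ["d3", "d1", "d7", "d2", "d9"]
relevant = {"d1", "d2", "d4"}
print(precision_at_k(ranking, relevant, k=4))  # 2 / 4 = 0.5
```

Setting `k=50` gives the P@50 figure used in the comparison; how the reported improvement rate is derived from the two P@50 values is not specified in the abstract and is not modelled here.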

20 pages, 652 KiB  
Article
Quantifying the Dissimilarity of Texts
by Benjamin Shade and Eduardo G. Altmann
Information 2023, 14(5), 271; https://doi.org/10.3390/info14050271 - 02 May 2023
Viewed by 2113
Abstract
Quantifying the dissimilarity of two texts is an important aspect of a number of natural language processing tasks, including semantic information retrieval, topic classification, and document clustering. In this paper, we compared the properties and performance of different dissimilarity measures D using three different representations of texts—vocabularies, word frequency distributions, and vector embeddings—and three simple tasks—clustering texts by author, subject, and time period. Using the Project Gutenberg database, we found that the generalised Jensen–Shannon divergence applied to word frequencies performed strongly across all tasks, that D’s based on vector embedding representations led to stronger performance for smaller texts, and that the optimal choice of approach was ultimately task-dependent. We also investigated, both analytically and numerically, the behaviour of the different D’s when the two texts varied in length by a factor h. We demonstrated that the (natural) estimator of the Jaccard distance between vocabularies was inconsistent and computed explicitly the h-dependency of the bias of the estimator of the generalised Jensen–Shannon divergence applied to word frequencies. We also found numerically that the Jensen–Shannon divergence and embedding-based approaches were robust to changes in h, while the Jaccard distance was not.
(This article belongs to the Special Issue Novel Methods and Applications in Natural Language Processing)
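Two of the dissimilarity measures named in this abstract can be illustrated on toy inputs. The sketch below assumes whitespace tokenization and implements the standard Jensen–Shannon divergence (equal weights, Shannon entropy in bits), not the generalised variant studied in the paper; all function names are illustrative.

```python
import math
from collections import Counter
from typing import Dict, Sequence


def jaccard_distance(tokens_a: Sequence[str], tokens_b: Sequence[str]) -> float:
    """1 - |A ∩ B| / |A ∪ B| computed on the two vocabularies."""
    vocab_a, vocab_b = set(tokens_a), set(tokens_b)
    union = vocab_a | vocab_b
    if not union:
        return 0.0  # assumed convention: two empty texts are identical
    return 1.0 - len(vocab_a & vocab_b) / len(union)


def word_frequencies(tokens: Sequence[str]) -> Dict[str, float]:
    """Normalized word frequency distribution of a non-empty text."""
    if not tokens:
        raise ValueError("text must contain at least one token")
    counts = Counter(tokens)
    total = sum(counts.values())
    return {word: count / total for word, count in counts.items()}


def entropy(dist: Dict[str, float]) -> float:
    """Shannon entropy in bits."""
    return -sum(p * math.log2(p) for p in dist.values() if p > 0)


def jensen_shannon_divergence(tokens_a: Sequence[str], tokens_b: Sequence[str]) -> float:
    """H(mixture) minus the mean entropy of the two distributions."""
    p, q = word_frequencies(tokens_a), word_frequencies(tokens_b)
    mixture = {w: 0.5 * p.get(w, 0.0) + 0.5 * q.get(w, 0.0) for w in set(p) | set(q)}
    return entropy(mixture) - 0.5 * entropy(p) - 0.5 * entropy(q)


text_a = "the cat sat on the mat".split()
text_b = "the dog sat on the log".split()
print(jaccard_distance(text_a, text_b))           # 3 shared of 7 words: 1 - 3/7
print(jensen_shannon_divergence(text_a, text_b))  # between 0 and 1 bit
```

The Jaccard distance only looks at which words occur, while the divergence also uses how often they occur, which is one way to see why the two measures can behave differently as text length changes.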

16 pages, 1165 KiB  
Article
Enhancing Traceability Link Recovery with Fine-Grained Query Expansion Analysis
by Tao Peng, Kun She, Yimin Shen, Xiangliang Xu and Yue Yu
Information 2023, 14(5), 270; https://doi.org/10.3390/info14050270 - 02 May 2023
Cited by 1 | Viewed by 1341
Abstract
Requirement traceability links are an essential part of requirement management software and are a basic prerequisite for software artifact changes. The manual establishment of requirement traceability links is time-consuming. When faced with large projects, requirement managers spend a lot of time establishing relationships among numerous requirements and code artifacts. However, existing techniques for automatic requirement traceability link recovery are limited by the semantic disparity between natural language and programming language, which makes many methods less accurate. In this paper, we propose a fine-grained requirement-code traceability link recovery approach based on query expansion, which analyzes the semantic similarity between requirements and code from a fine-grained perspective, and uses a query expansion technique to establish valid links that deviate from the query, so as to further improve the accuracy of traceability link recovery. Experiments showed that the approach proposed in this paper outperforms state-of-the-art unsupervised traceability link recovery methods, not only demonstrating the advantages of fine-grained structure analysis for word embedding-based traceability link recovery, but also improving the accuracy of establishing requirement traceability links. The experimental results demonstrate the superiority of our approach.
(This article belongs to the Topic Software Engineering and Applications)