Editor’s Choice Articles

Editor’s Choice articles are based on recommendations by the scientific editors of MDPI journals from around the world. Editors select a small number of articles recently published in the journal that they believe will be particularly interesting to readers, or important in the respective research area. The aim is to provide a snapshot of some of the most exciting work published in the various research areas of the journal.

26 pages, 352 KiB  
Review
Combining Machine Learning and Edge Computing: Opportunities, Challenges, Platforms, Frameworks, and Use Cases
by Piotr Grzesik and Dariusz Mrozek
Electronics 2024, 13(3), 640; https://doi.org/10.3390/electronics13030640 - 3 Feb 2024
Viewed by 2521
Abstract
In recent years, we have been observing the rapid growth and adoption of IoT-based systems, enhancing multiple areas of our lives. Concurrently, the utilization of machine learning techniques has surged, often for similar use cases as those seen in IoT systems. In this survey, we aim to focus on the combination of machine learning and the edge computing paradigm. The presented research commences with the topic of edge computing, its benefits, such as reduced data transmission, improved scalability, and reduced latency, as well as the challenges associated with this computing paradigm, like energy consumption, constrained devices, security, and device fleet management. It then presents the motivations behind the combination of machine learning and edge computing, such as the availability of more powerful edge devices, improving data privacy, reducing latency, or lowering reliance on centralized services. Then, it describes several edge computing platforms, with a focus on their capability to enable edge intelligence workflows. It also reviews the currently available edge intelligence frameworks and libraries, such as TensorFlow Lite or PyTorch Mobile. Afterward, the paper focuses on the existing use cases for edge intelligence in areas like industrial applications, healthcare applications, smart cities, environmental monitoring, or autonomous vehicles. Full article
(This article belongs to the Special Issue Towards Efficient and Reliable AI at the Edge)
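Frameworks such as TensorFlow Lite, mentioned in the survey, fit models onto constrained edge devices largely through quantization. As a rough illustration of the idea only (a sketch, not the TensorFlow Lite implementation), affine 8-bit quantization maps floats to small integers plus a scale and zero point:

```python
def quantize(values, num_bits=8):
    """Affine quantization of a list of floats to unsigned integers.

    Returns (quantized_values, scale, zero_point) such that
    value ≈ scale * (q - zero_point).
    """
    qmax = 2 ** num_bits - 1
    lo, hi = min(values), max(values)
    scale = (hi - lo) / qmax if hi > lo else 1.0
    zero_point = round(-lo / scale)
    quantized = [round(v / scale) + zero_point for v in values]
    return quantized, scale, zero_point

def dequantize(quantized, scale, zero_point):
    """Recover approximate float values from the quantized integers."""
    return [scale * (q - zero_point) for q in quantized]

# Hypothetical weight values, for illustration only.
weights = [-1.0, -0.25, 0.0, 0.5, 1.5]
q, s, z = quantize(weights)
restored = dequantize(q, s, z)
```

Storing 8-bit integers instead of 32-bit floats cuts model size by roughly 4x, at the cost of a reconstruction error bounded by about half the scale, which is why this trade-off is central to on-device inference.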

17 pages, 2087 KiB  
Article
Multi-Channel Graph Convolutional Networks for Graphs with Inconsistent Structures and Features
by Xinglong Chang, Jianrong Wang, Rui Wang, Tao Wang, Yingkui Wang and Weihao Li
Electronics 2024, 13(3), 607; https://doi.org/10.3390/electronics13030607 - 1 Feb 2024
Viewed by 1046
Abstract
Graph convolutional networks (GCNs) have attracted increasing attention in various fields due to their significant capacity to process graph-structured data. Typically, the GCN model and its variants heavily rely on the transmission of node features across the graph structure, which implicitly assumes that the graph structure and node features are consistent, i.e., they carry related information. However, in many real-world networks, node features may unexpectedly mismatch with the structural information. Existing GCNs fail to generalize to inconsistent scenarios and are even outperformed by models that ignore the graph structure or node features. To address this problem, we investigate how to extract representations from both the graph structure and node features. Consequently, we propose the multi-channel graph convolutional network (MCGCN) for graphs with inconsistent structures and features. Specifically, the MCGCN encodes the graph structure and node features using two specific convolution channels to extract two separate specific representations. Additionally, two joint convolution channels are constructed to extract the common information shared by the graph structure and node features. Finally, an attention mechanism is utilized to adaptively learn the importance weights of these channels under the guidance of the node classification task. In this way, our model can handle both consistent and inconsistent scenarios. Extensive experiments on both synthetic and real-world datasets for node classification and recommendation tasks show that our methods, MCGCN-A and MCGCN-I, achieve the best performance on seven out of eight datasets and the second-best performance on the remaining dataset. For simpler graph structures or tasks where the overhead of multiple convolution channels is not justified, traditional single-channel GCN models might be more efficient. Full article
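The final fusion step described in the abstract, attention-weighted combination of per-channel representations, can be sketched in plain Python (shapes and scores are invented for illustration; in the paper the scores are learned under the node classification objective):

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of raw scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def fuse_channels(channel_embeddings, scores):
    """Combine per-channel node embeddings with attention weights.

    channel_embeddings: one vector per channel, all of equal length
    scores: one raw importance score per channel
    """
    weights = softmax(scores)
    dim = len(channel_embeddings[0])
    fused = [0.0] * dim
    for w, emb in zip(weights, channel_embeddings):
        for i in range(dim):
            fused[i] += w * emb[i]
    return fused, weights

# Four channels: structure-specific, feature-specific, and two joint ones.
channels = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5], [0.2, 0.8]]
fused, weights = fuse_channels(channels, scores=[2.0, 0.5, 1.0, 0.1])
```

Because the weights are normalized per node, the model can lean on the structure channel for consistent graphs and shift toward the feature channel when structure and features disagree.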

15 pages, 2049 KiB  
Article
A Multimodal Late Fusion Framework for Physiological Sensor and Audio-Signal-Based Stress Detection: An Experimental Study and Public Dataset
by Vasileios-Rafail Xefteris, Monica Dominguez, Jens Grivolla, Athina Tsanousa, Francesco Zaffanela, Martina Monego, Spyridon Symeonidis, Sotiris Diplaris, Leo Wanner, Stefanos Vrochidis and Ioannis Kompatsiaris
Electronics 2023, 12(23), 4871; https://doi.org/10.3390/electronics12234871 - 2 Dec 2023
Cited by 1 | Viewed by 1573
Abstract
Stress can be considered a mental/physiological reaction in conditions of high discomfort and challenging situations. Stress levels can be reflected in both the physiological responses and the speech signals of a person. Therefore, the study of the fusion of the two modalities is of great interest. For this purpose, public datasets are necessary so that different proposed solutions can be compared. In this work, a publicly available multimodal dataset for stress detection is introduced, including physiological signals and speech cues. The physiological signals include electrocardiograph (ECG), respiration (RSP), and inertial measurement unit (IMU) sensors equipped in a smart vest. A data collection protocol was introduced to record physiological and audio data based on alternations between well-known stressors and relaxation moments. Five subjects participated in the data collection, where both their physiological and audio signals were recorded using the developed smart vest and an audio recording application. In addition, an analysis of the data and a decision-level fusion scheme are proposed. The analysis of the physiological signals includes extensive feature extraction along with various fusion and feature selection methods. The audio analysis comprises state-of-the-art feature extraction fed to a classifier to predict stress levels. Results from the analysis of the audio and physiological signals are fused at the decision level for the final stress level detection, utilizing a machine learning algorithm. The whole framework was also tested in a real-life pilot scenario of disaster management, where users acted as first responders while their stress was monitored in real time. Full article
(This article belongs to the Special Issue Future Trends of Artificial Intelligence (AI) and Big Data)
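Decision-level (late) fusion, as described in the abstract, combines the outputs of per-modality classifiers rather than their features. A minimal sketch with fixed weights (the paper trains a machine learning algorithm for this step; the class labels and weights below are hypothetical):

```python
def late_fusion(prob_physio, prob_audio, w_physio=0.6, w_audio=0.4):
    """Fuse per-class probabilities from two modality classifiers
    by weighted averaging, then pick the most probable class."""
    assert abs(w_physio + w_audio - 1.0) < 1e-9
    fused = [w_physio * p + w_audio * a
             for p, a in zip(prob_physio, prob_audio)]
    return fused.index(max(fused))  # predicted stress level

# Hypothetical 3-level stress outputs (low, medium, high):
physio = [0.2, 0.5, 0.3]
audio = [0.1, 0.2, 0.7]
label = late_fusion(physio, audio)
```

The advantage of fusing at the decision level is that each modality's pipeline (sensor features vs. audio features) can be developed and retrained independently.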

14 pages, 1680 KiB  
Article
AI to Train AI: Using ChatGPT to Improve the Accuracy of a Therapeutic Dialogue System
by Karolina Gabor-Siatkowska, Marcin Sowański, Rafał Rzatkiewicz, Izabela Stefaniak, Marek Kozłowski and Artur Janicki
Electronics 2023, 12(22), 4694; https://doi.org/10.3390/electronics12224694 - 18 Nov 2023
Cited by 1 | Viewed by 1755
Abstract
In this work, we present the use of one artificial intelligence (AI) application (ChatGPT) to train another AI-based application. The latter is a dialogue system named Terabot, which was used in the therapy of psychiatric patients. Our study was motivated by the fact that, for such a domain-specific system, it was difficult to acquire large real-life data samples to increase the training database: this would require recruiting more patients, which is both time-consuming and costly. To address this gap, we employed a neural large language model, ChatGPT version 3.5, to generate data solely for training our dialogue system. During initial experiments, we identified the intents that were most often misrecognized. Next, we fed ChatGPT a series of prompts that triggered the language model to generate numerous additional training entries, e.g., alternatives to the phrases that had been collected during initial experiments with healthy users. This way, we enlarged the training dataset by 112%. In our case study, for testing, we used 2802 speech recordings originating from 32 psychiatric patients. As an evaluation metric, we used the accuracy of intent recognition. The speech samples were converted into text using automatic speech recognition (ASR). The analysis showed that the patients’ speech challenged the ASR module significantly, resulting in deteriorated speech recognition and, consequently, low accuracy of intent recognition. However, thanks to the augmentation of the training data with ChatGPT-generated data, the intent recognition accuracy increased by a relative 13%, reaching 86% overall. We also emulated the case of an error-free ASR and showed the impact of ASR misrecognitions on the intent recognition accuracy. Our study showcased the potential of using generative language models to develop other AI-based tools, such as dialogue systems. Full article
(This article belongs to the Special Issue Application of Machine Learning and Intelligent Systems)
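A quick sanity check on the reported numbers, assuming the 13% relative gain is measured against the unaugmented baseline: a relative 13% increase that reaches 86% implies a pre-augmentation accuracy of roughly 76%.

```python
# Implied baseline intent recognition accuracy, assuming the
# reported 13% gain is relative to the pre-augmentation model.
final_accuracy = 0.86
relative_gain = 0.13
baseline = final_accuracy / (1 + relative_gain)  # roughly 0.76
```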

33 pages, 1227 KiB  
Review
A Systematic Literature Review on Artificial Intelligence and Explainable Artificial Intelligence for Visual Quality Assurance in Manufacturing
by Rudolf Hoffmann and Christoph Reich
Electronics 2023, 12(22), 4572; https://doi.org/10.3390/electronics12224572 - 8 Nov 2023
Viewed by 3674
Abstract
Quality assurance (QA) plays a crucial role in manufacturing to ensure that products meet their specifications. However, manual QA processes are costly and time-consuming, thereby making artificial intelligence (AI) an attractive solution for automation and expert support. In particular, convolutional neural networks (CNNs) have gained a lot of interest in visual inspection. Alongside AI methods, explainable artificial intelligence (XAI) systems, which achieve transparency and interpretability by providing insights into the decision-making process of the AI, are interesting methods for achieving quality inspections in manufacturing processes. In this study, we conducted a systematic literature review (SLR) to explore AI and XAI approaches for visual QA (VQA) in manufacturing. Our objective was to assess the current state of the art and identify research gaps in this context. Our findings revealed that AI-based systems predominantly focus on visual quality control (VQC) for defect detection. Research addressing VQA practices, like process optimization, predictive maintenance, or root cause analysis, is rarer. Least often cited are papers that utilize XAI methods. In conclusion, this survey emphasizes the importance and potential of AI and XAI in VQA across various industries. By integrating XAI, organizations can enhance model transparency, interpretability, and trust in AI systems. Overall, leveraging AI and XAI improves VQA practices and decision-making in industries. Full article
(This article belongs to the Special Issue Convolutional Neural Networks and Vision Applications - Volume III)

16 pages, 5443 KiB  
Article
Design of High-Gain and Low-Mutual-Coupling Multiple-Input–Multiple-Output Antennas Based on PRS for 28 GHz Applications
by Jinkyu Jung, Wahaj Abbas Awan, Domin Choi, Jaemin Lee, Niamat Hussain and Nam Kim
Electronics 2023, 12(20), 4286; https://doi.org/10.3390/electronics12204286 - 16 Oct 2023
Cited by 7 | Viewed by 1562
Abstract
In this paper, a high-gain and low-mutual-coupling four-port Multiple Input Multiple Output (MIMO) antenna based on a Partially Reflective Surface (PRS) for 28 GHz applications is proposed. The antenna radiator is a circular-shaped patch with a circular slot and a pair of vias to secure a wide bandwidth ranging from 24.29 GHz to 28.45 GHz (15.77%). The targeted band has been allocated for several countries such as Korea, Europe, the United States, China, and Japan. The optimized antenna offers a peak gain of 8.77 dBi at 24.29 GHz with a gain of 6.78 dBi. A novel PRS is designed and loaded on the antenna for broadband and high-gain characteristics. With the PRS, the antenna offers a wide bandwidth from 23.67 GHz to 29 GHz (21%), and the gain is improved up to 11.4 dBi, showing an overall increase of about 3 dBi. A 2 × 2 MIMO system is designed using the single-element antenna, which offers a bandwidth of 23.5 to 29 GHz (20%), and a maximum gain of 11.4 dBi. The MIMO antenna also exhibits a low mutual coupling of −35 dB along with a low Envelope Correlation Coefficient and Channel Capacity Loss, making it a suitable candidate for future compact-sized mmWave MIMO systems. Full article

14 pages, 840 KiB  
Article
Reconfigurable Intelligent Surface-Assisted Millimeter Wave Networks: Cell Association and Coverage Analysis
by Donglai Zhao, Gang Wang, Jinlong Wang and Zhiquan Zhou
Electronics 2023, 12(20), 4270; https://doi.org/10.3390/electronics12204270 - 16 Oct 2023
Cited by 2 | Viewed by 1018
Abstract
Reconfigurable intelligent surface (RIS) is emerging as a promising technology to achieve coverage enhancement. This paper develops a tractable analytical framework based on stochastic geometry for performance analysis of RIS-assisted millimeter wave networks. Based on the framework, a two-step cell association criterion is proposed, and the analytical expressions of the user association probability and the coverage probability in general scenarios are derived. In addition, the closed-form expressions of the two performance metrics in special cases are also provided. The simulation results verify the accuracy of the theoretically derived analytical expressions, and reveal the superiority of deploying RISs in millimeter wave networks and the effectiveness of the proposed cell association scheme to improve coverage. Furthermore, the effects of the RIS parameters and the BS density on coverage performance are also investigated. Full article
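Coverage probability of the kind analyzed here can also be estimated numerically. The following is a toy Monte Carlo sketch under drastic simplifications (pathloss-only propagation, no RIS, nearest-BS association, all parameters invented for illustration); it is not the paper's stochastic-geometry framework, only a feel for what "coverage probability" measures:

```python
import math
import random

def _poisson(lam):
    """Knuth's Poisson sampler; adequate for small lambda."""
    l, k, p = math.exp(-lam), 0, 1.0
    while True:
        p *= random.random()
        if p <= l:
            return k
        k += 1

def coverage_probability(bs_density, snr_threshold_db, trials=2000,
                         region=10.0, alpha=3.0, noise=0.01, tx_power=1.0):
    """Estimate P[SNR > threshold] for a user at the origin, with base
    stations drawn from a Poisson point process of the given density."""
    random.seed(0)
    threshold = 10 ** (snr_threshold_db / 10)
    area = (2 * region) ** 2
    covered = 0
    for _ in range(trials):
        n = _poisson(bs_density * area)
        if n == 0:
            continue
        # Nearest base station association (squared distance).
        d2 = min((random.uniform(-region, region)) ** 2 +
                 (random.uniform(-region, region)) ** 2 for _ in range(n))
        snr = tx_power * max(d2, 1e-6) ** (-alpha / 2) / noise
        if snr > threshold:
            covered += 1
    return covered / trials

p = coverage_probability(bs_density=0.05, snr_threshold_db=0.0)
```

Analytical expressions like those derived in the paper replace such simulations with closed forms, which is what makes the framework "tractable".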

6 pages, 359 KiB  
Editorial
Wearable Electronic Systems Based on Smart Wireless Sensors for Multimodal Physiological Monitoring in Health Applications: Challenges, Opportunities, and Future Directions
by Cristiano De Marchis, Giovanni Crupi, Nicola Donato and Sergio Baldari
Electronics 2023, 12(20), 4284; https://doi.org/10.3390/electronics12204284 - 16 Oct 2023
Viewed by 1071
Abstract
Driven by the fast-expanding market, wearable technologies have rapidly evolved [...] Full article
(This article belongs to the Section Microwave and Wireless Communications)

31 pages, 5489 KiB  
Article
Explicit Representation of Mechanical Functions for Maintenance Decision Support
by Mengchu Song, Ilmar F. Santos, Xinxin Zhang, Jing Wu and Morten Lind
Electronics 2023, 12(20), 4267; https://doi.org/10.3390/electronics12204267 - 15 Oct 2023
Cited by 1 | Viewed by 1140
Abstract
Artificial intelligence (AI) has been increasingly applied to condition-based maintenance (CBM), a knowledge-based method taking advantage of human expertise and other system knowledge that can serve as an alternative in cases in which machine learning is inapplicable due to a lack of training data. Functional information is seen as the most fundamental and important knowledge in maintenance decision making. This paper first proposes a mechanical functional modeling approach based on a functional modeling and reasoning methodology called multilevel flow modeling (MFM). The approach actually bridges the modeling gap between the mechanical level and the process level, which potentially extends the existing capability of MFM in rule-based diagnostics and prognostics from operation support to maintenance support. Based on this extension, a framework of optimized CBM is proposed, which can be used to diagnose potential mechanical failures from condition monitoring data and predict their future impacts in a qualitative way. More importantly, the framework uses MFM-based reliability-centered maintenance (RCM) to determine the importance of a detected potential failure, which can ensure the cost-effectiveness of CBM by adapting the maintenance requirements to specific operational contexts. This ability cannot be offered by existing CBM methods. An application to a mechanical test apparatus and hypothetical coupling with a process plant are used to demonstrate the proposed framework. Full article

26 pages, 2948 KiB  
Article
Real-Time AI-Driven Fall Detection Method for Occupational Health and Safety
by Anastasiya Danilenka, Piotr Sowiński, Kajetan Rachwał, Karolina Bogacka, Anna Dąbrowska, Monika Kobus, Krzysztof Baszczyński, Małgorzata Okrasa, Witold Olczak, Piotr Dymarski, Ignacio Lacalle, Maria Ganzha and Marcin Paprzycki
Electronics 2023, 12(20), 4257; https://doi.org/10.3390/electronics12204257 - 14 Oct 2023
Cited by 3 | Viewed by 1901
Abstract
Fall accidents in industrial and construction environments require an immediate reaction to provide first aid. Shortening the time between the fall and the relevant personnel being notified can significantly improve the safety and health of workers. Therefore, in this work, an IoT system for real-time fall detection is proposed, using the ASSIST-IoT reference architecture. Empowered with a machine learning model, the system can detect fall accidents and swiftly notify the occupational health and safety manager. To train the model, a novel multimodal fall detection dataset was collected from ten human participants and an anthropomorphic dummy, covering multiple types of fall, including falls from a height. The dataset includes absolute location and acceleration measurements from several IoT devices. Furthermore, a lightweight long short-term memory model is proposed for fall detection, capable of operating in an IoT environment with limited network bandwidth and hardware resources. The accuracy and F1-score of the model on the collected dataset were shown to exceed 0.95 and 0.9, respectively. The collected multimodal dataset was published under an open license to facilitate future research on fall detection methods in occupational health and safety. Full article
(This article belongs to the Special Issue Artificial Intelligence Empowered Internet of Things)
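The F1-score reported above is the standard harmonic mean of precision and recall; as a reminder of the computation (the counts below are hypothetical, not from the paper's evaluation):

```python
def f1_score(tp, fp, fn):
    """F1 for the positive (fall) class from confusion-matrix counts:
    harmonic mean of precision and recall."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

# Hypothetical counts over a stream of detection windows:
score = f1_score(tp=90, fp=10, fn=8)
```

F1 is preferred over plain accuracy here because falls are rare events: a model that never predicts "fall" can score high accuracy while being useless.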

12 pages, 7248 KiB  
Article
Optimal Camera Placement to Generate 3D Reconstruction of a Mixed-Reality Human in Real Environments
by Juhwan Kim and Dongsik Jo
Electronics 2023, 12(20), 4244; https://doi.org/10.3390/electronics12204244 - 13 Oct 2023
Viewed by 1472
Abstract
Virtual reality and augmented reality are increasingly used for immersive engagement by utilizing information from real environments. In particular, three-dimensional model data, which are the basis for creating virtual places, can be developed manually using commercial modeling toolkits, but with the advancement of sensing technology, computer vision techniques can also be used to create virtual environments. Specifically, a 3D reconstruction approach can generate a single 3D model from image information captured from various viewpoints in real environments using several cameras (multi-cameras). The goal is to generate a 3D model with excellent precision. However, rules for choosing the optimal number of cameras and their settings when capturing real environments (e.g., actual people) with several cameras in unconventional positions are lacking. In this study, we propose an optimal camera placement strategy for acquiring high-quality 3D human data using multiple irregularly placed cameras in real environments. Our results show that installation costs can be lowered by arranging a minimum number of cameras in an arbitrary space, and that automated virtual human creation with high accuracy can be conducted using the optimal irregular camera locations. Full article
(This article belongs to the Special Issue Perception and Interaction in Mixed, Augmented, and Virtual Reality)
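One common way to frame "minimum number of cameras covering a space" is as a set cover problem solved greedily. The sketch below is my illustration of that framing, not the authors' algorithm; the camera names and coverage sets are invented:

```python
def greedy_camera_selection(candidates, targets):
    """Pick a small set of candidate cameras whose fields of view
    together cover all target points (greedy set cover).

    candidates: dict mapping camera id -> set of covered target ids
    targets: set of target ids that must all be covered
    """
    uncovered = set(targets)
    chosen = []
    while uncovered:
        # Greedily take the camera covering the most uncovered targets.
        best = max(candidates, key=lambda c: len(candidates[c] & uncovered))
        gained = candidates[best] & uncovered
        if not gained:
            raise ValueError("remaining targets cannot be covered")
        chosen.append(best)
        uncovered -= gained
    return chosen

# Hypothetical candidate poses and the sample points each one sees:
cams = {
    "cam_a": {1, 2, 3},
    "cam_b": {3, 4},
    "cam_c": {4, 5, 6},
}
placement = greedy_camera_selection(cams, targets={1, 2, 3, 4, 5, 6})
```

Greedy set cover is not optimal in general, but it carries a well-known logarithmic approximation guarantee, which is why it is a common baseline for placement problems.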

16 pages, 4373 KiB  
Article
Computer Vision Algorithms for 3D Object Recognition and Orientation: A Bibliometric Study
by Youssef Yahia, Júlio Castro Lopes and Rui Pedro Lopes
Electronics 2023, 12(20), 4218; https://doi.org/10.3390/electronics12204218 - 12 Oct 2023
Cited by 1 | Viewed by 1379
Abstract
This paper consists of a bibliometric study that covers the topic of 3D object detection from 2022 until the present day. It employs various analysis approaches that shed light on the leading authors, affiliations, and countries within this research domain alongside the main themes of interest related to it. The findings revealed that China is the leading country in this domain given the fact that it is responsible for most of the scientific literature as well as being a host for the most productive universities and authors in terms of the number of publications. China is also responsible for initiating a significant number of collaborations with various nations around the world. The most basic theme related to this field is deep learning, along with autonomous driving, point cloud, robotics, and LiDAR. The work also includes an in-depth review that underlines some of the latest frameworks that took on various challenges regarding this topic, the improvement of object detection from point clouds, and training end-to-end fusion methods using both camera and LiDAR sensors, to name a few. Full article
(This article belongs to the Special Issue Applications of Deep Learning Techniques)

15 pages, 5638 KiB  
Article
Underwater Biomimetic Covert Acoustic Communications Mimicking Multiple Dolphin Whistles
by Yongcheol Kim, Hojun Lee, Seunghwan Seol, Bonggyu Park and Jaehak Chung
Electronics 2023, 12(19), 3999; https://doi.org/10.3390/electronics12193999 - 22 Sep 2023
Cited by 1 | Viewed by 731
Abstract
This paper presents an underwater biomimetic covert acoustic communication system that achieves high covertness and a high data rate by mimicking dolphin group whistles. The proposed method combines time–frequency shift keying modulation with continuously varying carrier frequency modulation, which mitigates the interference between overlapping whistles while maintaining a high data rate. The data rate and bit error rate (BER) performance of the proposed method were compared with conventional underwater covert communication over an additive white Gaussian noise channel, a modeled underwater channel, and in practical ocean experiments. For the covertness test, the similarity of the proposed multiple whistles to real dolphin group whistles was assessed using a mean opinion score test. As a result, the proposed method demonstrated a higher data rate, better BER performance, and high covertness, with its whistles closely resembling real dolphin group whistles. Full article
(This article belongs to the Special Issue New Advances in Underwater Communication Systems)
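The core of time–frequency shift keying is that each symbol is carried by *where* a whistle's energy appears, in time and in frequency, rather than by its amplitude or phase. A schematic bit-to-symbol mapping (all parameters are hypothetical; the paper's scheme additionally varies the carrier continuously):

```python
def tfsk_modulate(bits, time_slots=4, freq_slots=4,
                  slot_ms=25.0, base_hz=8000.0, step_hz=500.0):
    """Map each 4-bit group to a (time offset, frequency offset) pair:
    which slot of a whistle block carries energy, and at which
    frequency offset from the base whistle contour."""
    assert time_slots * freq_slots == 16  # 4 bits per symbol
    symbols = []
    for i in range(0, len(bits), 4):
        group = bits[i:i + 4]
        value = int("".join(map(str, group)), 2)
        t_idx, f_idx = divmod(value, freq_slots)
        symbols.append((t_idx * slot_ms, base_hz + f_idx * step_hz))
    return symbols

symbols = tfsk_modulate([1, 0, 1, 1, 0, 0, 1, 0])
```

Encoding information in time/frequency positions lets the signal keep the spectral shape of a natural whistle, which is what makes the transmission covert.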

18 pages, 12864 KiB  
Article
A CMA-Based Electronically Reconfigurable Dual-Mode and Dual-Band Antenna
by Nicholas E. Russo, Constantinos L. Zekios and Stavros V. Georgakopoulos
Electronics 2023, 12(18), 3915; https://doi.org/10.3390/electronics12183915 - 17 Sep 2023
Viewed by 788
Abstract
In this work, an electronically reconfigurable dual-band dual-mode microstrip ring antenna with high isolation is proposed. Using characteristic mode analysis (CMA), the physical characteristics of the ring antenna are revealed, and two modes are appropriately chosen for operation in two sub-6 GHz “legacy” bands. Due to the inherent orthogonality of the characteristic modes, measured isolation larger than 37 dB was achieved in both bands without requiring complicated decoupling approaches. An integrated electronically reconfigurable matching network (comprising PIN diodes and varactors) was designed to switch between the two modes of operation. The simulated and measured results were in excellent agreement, showing a peak gain of 4.7 dB for both modes and radiation efficiency values of 44.3% and 64%, respectively. Using CMA to gain physical insights into the radiative orthogonal modes of under-researched and non-conventional antennas (e.g., antennas of arbitrary shapes) opens the door to developing highly compact radiators, which enable next-generation communication systems. Full article
(This article belongs to the Special Issue Recent Advances in Antenna Arrays and Millimeter-Wave Components)

15 pages, 4402 KiB  
Article
DSW-YOLOv8n: A New Underwater Target Detection Algorithm Based on Improved YOLOv8n
by Qiang Liu, Wei Huang, Xiaoqiu Duan, Jianghao Wei, Tao Hu, Jie Yu and Jiahuan Huang
Electronics 2023, 12(18), 3892; https://doi.org/10.3390/electronics12183892 - 15 Sep 2023
Cited by 4 | Viewed by 1921
Abstract
Underwater target detection is widely used in various applications such as underwater search and rescue, underwater environment monitoring, and marine resource surveying. However, the complex underwater environment, including factors such as light changes and background noise, poses a significant challenge to target detection. We propose an improved underwater target detection algorithm based on YOLOv8n to overcome these problems. Our algorithm focuses on three aspects. Firstly, we replace the original C2f module with Deformable ConvNets v2 to enhance the network's adaptability to the target region in the feature map and extract the target region's features more accurately. Secondly, we introduce SimAM, a parameter-free attention mechanism that can infer and assign three-dimensional attention weights without adding network parameters. Lastly, we optimize the loss function by replacing the CIoU loss function with the Wise-IoU loss function. We name the new algorithm DSW-YOLOv8n, an acronym of Deformable ConvNets v2, SimAM, and Wise-IoU applied to YOLOv8n. To conduct our experiments, we created our own underwater target detection dataset, and we also utilized the Pascal VOC dataset to evaluate our approach. On underwater target detection, the mAP@0.5 and mAP@0.5:0.95 of the original YOLOv8n algorithm were 88.6% and 51.8%, respectively, while DSW-YOLOv8n reaches 91.8% and 55.9%. On the Pascal VOC dataset, the original YOLOv8n achieved 62.2% and 45.9% mAP@0.5 and mAP@0.5:0.95, respectively, while DSW-YOLOv8n achieved 65.7% and 48.3%. The number of model parameters is reduced by about 6%. These experimental results demonstrate the effectiveness of our method. Full article
(This article belongs to the Special Issue Advances and Applications of Computer Vision in Electronics)
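Both the CIoU and Wise-IoU losses mentioned above build on plain intersection over union between predicted and ground-truth boxes. A minimal sketch of that underlying quantity (the Wise-IoU weighting itself is omitted):

```python
def iou(box_a, box_b):
    """Intersection over union of two axis-aligned (x1, y1, x2, y2) boxes."""
    ix1 = max(box_a[0], box_b[0])
    iy1 = max(box_a[1], box_b[1])
    ix2 = min(box_a[2], box_b[2])
    iy2 = min(box_a[3], box_b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

# Two unit-offset 2x2 boxes overlap in a 1x1 region: IoU = 1/7.
overlap = iou((0, 0, 2, 2), (1, 1, 3, 3))
```

The mAP@0.5 and mAP@0.5:0.95 metrics in the abstract count a detection as correct when this IoU exceeds 0.5, or averaged over thresholds from 0.5 to 0.95, respectively.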

21 pages, 7872 KiB  
Article
YOLO-Drone: An Optimized YOLOv8 Network for Tiny UAV Object Detection
by Xianxu Zhai, Zhihua Huang, Tao Li, Hanzheng Liu and Siyuan Wang
Electronics 2023, 12(17), 3664; https://doi.org/10.3390/electronics12173664 - 30 Aug 2023
Cited by 13 | Viewed by 10697
Abstract
With the widespread use of UAVs in commercial and industrial applications, UAV detection is receiving increasing attention in areas such as public safety. As a result, object detection techniques for UAVs are also developing rapidly. However, the small size of drones, complex airspace backgrounds, and changing light conditions still pose significant challenges for research in this area. To address these problems, this paper proposes a tiny UAV detection method based on an optimized YOLOv8. First, in the detection head component, a high-resolution detection head is added to improve the model's ability to detect small targets, while the large-target detection head and redundant network layers are removed to effectively reduce the number of network parameters and improve UAV detection speed; second, in the feature extraction stage, SPD-Conv is used instead of Conv to extract multi-scale features, reducing the loss of fine-grained information and enhancing the model's feature extraction capability for small targets. Finally, the GAM attention mechanism is introduced in the neck to enhance the model's fusion of target features and improve its overall performance in detecting UAVs. Relative to the baseline model, our method improves performance by 11.9%, 15.2%, and 9% in terms of P (precision), R (recall), and mAP (mean average precision), respectively. Meanwhile, it reduces the number of parameters and model size by 59.9% and 57.9%, respectively. In addition, our method demonstrates clear advantages in comparison experiments and in experiments on a self-built dataset, and it is more suitable for engineering deployment and the practical applications of UAV object detection systems. Full article
(This article belongs to the Special Issue Advances in Computer Vision and Deep Learning and Its Applications)
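The space-to-depth step that SPD-Conv builds on can be sketched independently of any framework: it losslessly folds spatial resolution into channels, so a subsequent non-strided convolution sees fine-grained detail instead of discarding it. A NumPy sketch (not the authors' implementation):

```python
import numpy as np

def space_to_depth(x, scale=2):
    """Rearrange (N, C, H, W) -> (N, C*scale^2, H/scale, W/scale), losslessly."""
    n, c, h, w = x.shape
    x = x.reshape(n, c, h // scale, scale, w // scale, scale)
    x = x.transpose(0, 3, 5, 1, 2, 4)  # move the scale x scale sub-grid into channels
    return x.reshape(n, c * scale * scale, h // scale, w // scale)
```

Unlike strided convolution or pooling, every input value survives the downsampling, which is the property SPD-Conv exploits for small targets.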

19 pages, 4263 KiB  
Article
Integration of Wearables and Wireless Technologies to Improve the Interaction between Disabled Vulnerable Road Users and Self-Driving Cars
by Antonio Guerrero-Ibañez, Ismael Amezcua-Valdovinos and Juan Contreras-Castillo
Electronics 2023, 12(17), 3587; https://doi.org/10.3390/electronics12173587 - 25 Aug 2023
Cited by 2 | Viewed by 1678
Abstract
The auto industry is accelerating, and self-driving cars are becoming a reality. However, the acceptance of such cars will depend on their social and environmental integration into a road traffic ecosystem comprising vehicles, motorcycles, bicycles, and pedestrians. One of the most vulnerable groups within the road ecosystem is pedestrians. Assistive technology focuses on ensuring functional independence for people with disabilities. However, little effort has been devoted to exploring possible interaction mechanisms between pedestrians with disabilities and self-driving cars. This paper analyzes how self-driving cars and disabled pedestrians should interact in a traffic ecosystem supported by wearable devices, so that pedestrians feel safer and more comfortable. We define the concept of an Assistive Self-driving Car (ASC). We describe a set of procedures to identify people with disabilities using an IEEE 802.11p-based device, and a group of messages to express the intentions of disabled pedestrians to self-driving cars. This interaction provides disabled pedestrians with increased safety and confidence in performing tasks such as crossing the street. Finally, we discuss strategies for alerting disabled pedestrians to potential hazards within the road ecosystem. Full article

16 pages, 5254 KiB  
Article
A Deep Learning Framework for Adaptive Beamforming in Massive MIMO Millimeter Wave 5G Multicellular Networks
by Spyros Lavdas, Panagiotis K. Gkonis, Efthalia Tsaknaki, Lambros Sarakis, Panagiotis Trakadas and Konstantinos Papadopoulos
Electronics 2023, 12(17), 3555; https://doi.org/10.3390/electronics12173555 - 23 Aug 2023
Cited by 1 | Viewed by 1274
Abstract
The goal of this paper is the performance evaluation of a deep learning approach when deployed in fifth-generation (5G) millimeter wave (mmWave) multicellular networks. To this end, the optimum beamforming configuration is defined by two neural networks (NNs) that are properly trained, according to mean square error (MSE) minimization. The first network takes as input the requested spectral efficiency (SE) per active sector, while the second takes as input the corresponding energy efficiency (EE). Hence, channel and power variations can now be taken into consideration during adaptive beamforming. The performance of the proposed approach is evaluated with the help of a developed system-level simulator via extensive Monte Carlo simulations. According to the presented results, machine learning (ML)-adaptive beamforming can significantly improve EE compared to the standard non-ML framework. Although this improvement comes at the cost of increased blocking probability (BP) and more radiating elements (REs) for high data rate services, the corresponding increase ratios are significantly smaller than the EE improvement ratio. In particular, considering 21.6 Mbps per active user and ML adaptive beamforming, the EE can reach up to 5.3 Mbps/W, significantly improved compared to the non-ML case (0.9 Mbps/W). In this context, BP does not exceed 2.6%, which is slightly worse than the 1.7% of the standard non-ML case. Moreover, approximately 20% additional REs are required with respect to the non-ML framework. Full article
(This article belongs to the Special Issue Recent Advances in Antenna Arrays and Millimeter-Wave Components)

16 pages, 13408 KiB  
Article
A 220 GHz to 325 GHz Grounded Coplanar Waveguide Based Periodic Leaky-Wave Beam-Steering Antenna in Indium Phosphide Process
by Akanksha Bhutani, Marius Kretschmann, Joel Dittmer, Peng Lu, Andreas Stöhr and Thomas Zwick
Electronics 2023, 12(16), 3482; https://doi.org/10.3390/electronics12163482 - 17 Aug 2023
Cited by 2 | Viewed by 1732
Abstract
This paper presents a novel periodic grounded coplanar waveguide (GCPW) leaky-wave antenna implemented in an Indium Phosphide (InP) process. The antenna is designed to operate in the 220 GHz–325 GHz frequency range, with the goal of integrating it with an InP uni-traveling-carrier photodiode to realize a wireless transmitter module. Future wireless communication systems must deliver a high data rate to multiple users in different locations. Therefore, wireless transmitters need to have a broadband nature, high gain, and beam-steering capability. Leaky-wave antennas offer a simple and cost-effective way to achieve beam-steering by sweeping frequency in the THz range. In this paper, the first periodic GCPW leaky-wave antenna in the 220 GHz–325 GHz frequency range is demonstrated. The antenna design is based on a novel GCPW leaky-wave unit cell (UC) that incorporates mirrored L-slots in the lateral ground planes. These mirrored L-slots effectively mitigate the open stopband phenomenon of a periodic leaky-wave antenna. The leakage rate, phase constant, and Bloch impedance of the novel GCPW leaky-wave UC are analyzed using Floquet's theory. After optimizing the UC, a periodic GCPW leaky-wave antenna is constructed by cascading 16 UCs. Electromagnetic simulation results of the leaky-wave antenna are compared with an ideal model derived from a single UC. The two design approaches show excellent agreement in terms of their reflection coefficient and beam-steering range. Therefore, the ideal model presented in this paper demonstrates, for the first time, a rapid method for developing periodic leaky-wave antennas. To validate the simulation results, probe-based antenna measurements are conducted, showing close agreement in terms of the reflection coefficient, peak antenna gain, beam-steering angle, and far-field radiation patterns. The periodic GCPW leaky-wave antenna presented in this paper exhibits a high gain of up to 13.5 dBi and a wide beam-steering range from 60° to 35° over the 220 GHz–325 GHz frequency range. Full article
(This article belongs to the Special Issue Advanced Antenna Technologies for B5G and 6G Applications)

15 pages, 2160 KiB  
Article
Safe and Trustful AI for Closed-Loop Control Systems
by Julius Schöning and Hans-Jürgen Pfisterer
Electronics 2023, 12(16), 3489; https://doi.org/10.3390/electronics12163489 - 17 Aug 2023
Cited by 2 | Viewed by 1808
Abstract
In modern times, closed-loop control systems (CLCSs) play a prominent role in a wide application range, from production machinery via automated vehicles to robots. CLCSs actively manipulate the actual values of a process to match predetermined setpoints, typically in real time and with remarkable precision. However, the development, modeling, tuning, and optimization of CLCSs barely exploit the potential of artificial intelligence (AI). This paper explores novel opportunities and research directions in CLCS engineering, presenting potential designs and methodologies incorporating AI. Combining these opportunities and directions makes it evident that employing AI in developing and implementing CLCSs is indeed feasible. Integrating AI into CLCS development, or directly within CLCSs, can lead to a significant improvement in stakeholder confidence. Integrating AI in CLCSs raises the question: how can AI in CLCSs be trusted so that its promising capabilities can be used safely? AI in CLCSs is often not trusted because of its opaque nature, caused by an extensive set of parameters that defies complete testing. Consequently, developers working on AI-based CLCSs must be able to rate the impact of the trainable parameters on the system accurately. By following this path, this paper highlights two key aspects as essential research directions towards safe AI-based CLCSs: (I) the identification and elimination of unproductive layers in artificial neural networks (ANNs) for reducing the number of trainable parameters without influencing the overall outcome, and (II) the utilization of the solution space of an ANN to define the safety-critical scenarios of an AI-based CLCS. Full article
(This article belongs to the Special Issue Advances in Artificial Intelligence Engineering)
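Research direction (I), rating the impact of individual layers on the overall outcome, can be illustrated with a toy experiment: remove one hidden layer at a time from a small MLP and measure how much the output moves. A layer whose removal barely changes the output is a candidate for elimination. This is only a sketch of the idea, with all sizes and names invented here, not a method from the paper:

```python
import numpy as np

rng = np.random.default_rng(1)

def mlp_forward(x, weights, skip=None):
    """Simple ReLU MLP; `skip` replaces one hidden layer with the identity."""
    for i, W in enumerate(weights):
        if i == skip:
            continue
        x = np.maximum(x @ W, 0)
    return x

# Rate each layer's impact as the output change when it is removed.
weights = [rng.normal(size=(8, 8)) * 0.5 for _ in range(4)]
x = rng.normal(size=(16, 8))
base = mlp_forward(x, weights)
impact = [np.linalg.norm(mlp_forward(x, weights, skip=i) - base)
          for i in range(len(weights))]
```

Layers with near-zero impact scores would be the "unproductive" ones whose removal shrinks the trainable-parameter count without influencing the result.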

26 pages, 1389 KiB  
Article
MAGNETO and DeepInsight: Extended Image Translation with Semantic Relationships for Classifying Attack Data with Machine Learning Models
by Aeryn Dunmore, Adam Dunning, Julian Jang-Jaccard, Fariza Sabrina and Jin Kwak
Electronics 2023, 12(16), 3463; https://doi.org/10.3390/electronics12163463 - 15 Aug 2023
Cited by 2 | Viewed by 1253
Abstract
The translation of traffic flow data into images for the purposes of classification in machine learning tasks has been extensively explored in recent years. However, the method of translation has a significant impact on the success of such attempts. In 2019, a method called DeepInsight was developed to translate genetic information into images. It was then adopted in 2021 for the purpose of translating network traffic into images, allowing the retention of semantic data about the relationships between features, in a model called MAGNETO. In this paper, we explore and extend this research, using the MAGNETO algorithm on three new intrusion detection datasets—CICDDoS2019, 5G-NIDD, and BOT-IoT—and also extend this method to multiclass classification, first with a One-versus-Rest model and then with a full multiclass classification task, introducing multiple new classifiers for comparison against the CNNs implemented by the original MAGNETO model. We have also undertaken comparative experiments on the original MAGNETO datasets, CICIDS17, KDD99, and UNSW-NB15, as well as a comparison with other state-of-the-art models using the NSL-KDD dataset. The results show that the MAGNETO algorithm and the DeepInsight translation method, without the use of data augmentation, offer a significant boost to accuracy when classifying network traffic data. Our research also shows the effectiveness of Decision Tree and Random Forest classifiers on this type of data. Further research into the potential for real-time execution is needed to explore the possibilities for extending this method of translation into real-world scenarios. Full article
(This article belongs to the Special Issue Application Research Using AI, IoT, HCI, and Big Data Technologies)
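The core DeepInsight idea, embedding each feature at a 2-D pixel location so that semantically related features land near each other, can be sketched with a PCA-style embedding. DeepInsight itself uses t-SNE or kernel PCA and averages pixel collisions; everything below is an illustrative simplification, not the MAGNETO code:

```python
import numpy as np

def feature_to_image(X, side=8):
    """Turn tabular samples (n_samples, n_features) into side x side images.
    Each FEATURE gets a pixel location from a 2-D embedding of its profile
    across samples; each sample's values then fill those pixels."""
    Xc = X - X.mean(axis=0)
    F = Xc.T                              # one point per feature
    F = F - F.mean(axis=0)
    _, _, Vt = np.linalg.svd(F, full_matrices=False)
    coords = F @ Vt[:2].T                 # (n_features, 2) embedding
    lo, hi = coords.min(axis=0), coords.max(axis=0)
    span = np.where(hi - lo == 0, 1, hi - lo)
    px = ((coords - lo) / span * (side - 1)).round().astype(int)
    imgs = np.zeros((X.shape[0], side, side))
    for j, (r, c) in enumerate(px):
        imgs[:, r, c] = X[:, j]           # collisions overwrite; DeepInsight averages
    return imgs
```

The resulting images can then be fed to an ordinary CNN, which is how the semantic relationships between features become exploitable by convolutions.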

23 pages, 4527 KiB  
Article
Self-Regulated Learning and Active Feedback of MOOC Learners Supported by the Intervention Strategy of a Learning Analytics System
by Ruth Cobos
Electronics 2023, 12(15), 3368; https://doi.org/10.3390/electronics12153368 - 7 Aug 2023
Cited by 1 | Viewed by 1719
Abstract
MOOCs offer great learning opportunities, but they also present several challenges for learners that hinder them from successfully completing MOOCs. To address these challenges, edX-LIMS (System for Learning Intervention and its Monitoring for edX MOOCs) was developed. It is a learning analytics system that supports an intervention strategy (based on learners’ interactions with the MOOC) to provide feedback to learners through web-based Learner Dashboards. Additionally, edX-LIMS provides a web-based Instructor Dashboard for instructors to monitor their learners. In this article, an enhanced version of the aforementioned system called edX-LIMS+ is presented. This upgrade introduces new services that enhance both the learners’ and instructors’ dashboards with a particular focus on self-regulated learning. Moreover, the system detects learners’ problems to guide them and assist instructors in better monitoring learners and providing necessary support. The results obtained from the use of this new version (through learners’ interactions and opinions about their dashboards) demonstrate that the feedback provided has been significantly improved, offering more valuable information to learners and enhancing their perception of both the dashboard and the intervention strategy supported by the system. Additionally, the majority of learners agreed with their detected problems, thereby enabling instructors to enhance interventions and support learners’ learning processes. Full article

18 pages, 513 KiB  
Article
Cascading and Ensemble Techniques in Deep Learning
by I. de Zarzà, J. de Curtò, Enrique Hernández-Orallo and Carlos T. Calafate
Electronics 2023, 12(15), 3354; https://doi.org/10.3390/electronics12153354 - 5 Aug 2023
Cited by 3 | Viewed by 2884
Abstract
In this study, we explore the integration of cascading and ensemble techniques in Deep Learning (DL) to improve prediction accuracy on diabetes data. The primary approach involves creating multiple Neural Networks (NNs), each predicting the outcome independently, and then feeding these initial predictions into another set of NNs. Our exploration starts with an initial preliminary study and extends to various ensemble techniques, including bagging, stacking, and finally cascading. The cascading ensemble involves training a second layer of models on the predictions of the first. This cascading structure, combined with ensemble voting for the final prediction, aims to exploit the strengths of multiple models while mitigating their individual weaknesses. Our results demonstrate a significant improvement in prediction accuracy, providing a compelling case for the potential utility of these techniques in healthcare applications, specifically for the prediction of diabetes, where we achieve a compelling model accuracy of 91.5% on the test set of a particularly challenging dataset and compare thoroughly against many other methodologies. Full article
(This article belongs to the Special Issue Artificial Intelligence Technologies and Applications)
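The cascading-plus-voting structure described above can be sketched with weak stand-in learners (decision stumps here; the paper uses NNs). A first layer of models is trained on bootstrap samples, a second layer is trained on the first layer's predictions, and a vote produces the final output. All function names and the synthetic data are invented for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)

def _bootstrap(X, y):
    idx = rng.integers(0, len(X), len(X))
    return X[idx], y[idx]

def train_stump(X, y):
    """Weak base learner: best single-feature threshold classifier."""
    best, best_err = (0, 0.0, 1), 1.0
    for f in range(X.shape[1]):
        for t in np.percentile(X[:, f], [25, 50, 75]):
            for sign in (1, -1):
                pred = (X[:, f] > t).astype(int) if sign == 1 else (X[:, f] <= t).astype(int)
                err = np.mean(pred != y)
                if err < best_err:
                    best_err, best = err, (f, t, sign)
    f, t, sign = best
    return lambda Z: (Z[:, f] > t).astype(int) if sign == 1 else (Z[:, f] <= t).astype(int)

def train_cascade(X, y, n_first=5, n_second=3):
    """Layer 1: independent models on bootstrap samples.
    Layer 2: models trained on layer-1 predictions; final output is a vote."""
    layer1 = [train_stump(*_bootstrap(X, y)) for _ in range(n_first)]
    P = np.column_stack([m(X) for m in layer1]).astype(float)
    layer2 = [train_stump(*_bootstrap(P, y)) for _ in range(n_second)]

    def predict(Z):
        Pz = np.column_stack([m(Z) for m in layer1]).astype(float)
        votes = np.column_stack([m(Pz) for m in layer2])
        return (votes.mean(axis=1) > 0.5).astype(int)   # ensemble vote
    return predict
```

Swapping the stumps for NNs recovers the architecture the abstract describes: second-layer models see only the first layer's outputs, so they learn to correct its systematic mistakes.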

29 pages, 6436 KiB  
Article
Fish Monitoring from Low-Contrast Underwater Images
by Nikos Petrellis, Georgios Keramidas, Christos P. Antonopoulos and Nikolaos Voros
Electronics 2023, 12(15), 3338; https://doi.org/10.3390/electronics12153338 - 4 Aug 2023
Cited by 2 | Viewed by 1577
Abstract
A toolset supporting fish detection, orientation, tracking, and especially morphological feature estimation with high speed and accuracy is presented in this paper. It can be exploited in fish farms to automate everyday procedures, including size measurement and optimal harvest time estimation, fish health assessment, quantification of feeding needs, etc. It can also be used in an open sea environment to monitor fish size, behavior, and the population of various species. An efficient deep learning technique for fish detection is employed and adapted, while methods for fish tracking are also proposed. The fish orientation is classified in order to apply a shape alignment technique that is based on the Ensemble of Regression Trees machine learning method. Shape alignment allows the estimation of fish dimensions (length, height) and the localization of fish body parts of particular interest, such as the eyes and gills. The proposed method can estimate the position of 18 landmarks with an accuracy of about 95% from low-contrast underwater images where the fish can hardly be distinguished from the background. Hardware and software acceleration techniques have been applied to the shape alignment process, reducing the frame processing latency to less than 0.5 µs on a general-purpose computer and less than 16 ms on an embedded platform. As a case study, the developed system has been trained and tested with several Mediterranean fish species in the seabream category. A large public dataset with low-resolution underwater videos and images has also been developed to test the proposed system under worst-case conditions. Full article

18 pages, 8039 KiB  
Article
A Thorough Evaluation of GaN HEMT Degradation under Realistic Power Amplifier Operation
by Gianni Bosi, Antonio Raffo, Valeria Vadalà, Rocco Giofrè, Giovanni Crupi and Giorgio Vannini
Electronics 2023, 12(13), 2939; https://doi.org/10.3390/electronics12132939 - 4 Jul 2023
Cited by 1 | Viewed by 1233
Abstract
In this paper, we experimentally investigate the degradation effects observed in 0.15-µm GaN HEMT devices operating under realistic power amplifier conditions. These conditions are applied to the devices under test (DUTs) by exploiting a low-frequency load-pull characterization technique that provides information consistent with RF operation, with the advantage of revealing electrical quantities not directly detectable at high frequency. Quantities such as the resistive gate current play a fundamental role in the analysis of technology reliability. The experiments are carried out on DUTs of the same periphery considering two different power amplifier operations: a saturated class-AB condition, which emphasizes the degradation effects produced by high temperatures due to power dissipation, and a class-E condition, which enhances the effects of high electric fields. The experiments are carried out at 30 °C and 100 °C, and the results are compared to evaluate how a specific RF condition impacts device degradation. Such a comparison has, to the authors' knowledge, never been carried out before and represents the main novelty of the present study. Full article

15 pages, 1940 KiB  
Article
Predicting High-Frequency Stock Movement with Differential Transformer Neural Network
by Shijie Lai, Mingxian Wang, Shengjie Zhao and Gonzalo R. Arce
Electronics 2023, 12(13), 2943; https://doi.org/10.3390/electronics12132943 - 4 Jul 2023
Cited by 2 | Viewed by 3116
Abstract
Predicting stock prices has long been the holy grail for providing guidance to investors. Extracting effective information from Limit Order Books (LOBs) is a key point in high-frequency trading based on stock-movement forecasting. LOBs offer many details, but at the same time, they are very noisy. This paper proposes a differential transformer neural network model, dubbed DTNN, to predict stock movement according to LOB data. The model utilizes a temporal attention-augmented bilinear layer (TABL) and a temporal convolutional network (TCN) to denoise the data. In addition, a prediction transformer module captures the dependency between time series. A differential layer is proposed and incorporated into the model to extract information from the messy and chaotic high-frequency LOB time series. This layer can identify the fine distinction between adjacent slices in the series. We evaluate the proposed model on several datasets. On the open LOB benchmark FI-2010, our model outperforms other comparative state-of-the-art methods in accuracy and F1 score. In the experiments using actual stock data, our model also shows great stock-movement forecasting capability and generalization performance. Full article
(This article belongs to the Section Artificial Intelligence)
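The differential layer's core idea, making the fine distinction between adjacent time slices explicit, can be illustrated with simple first-order differencing. The paper's layer is learned and sits inside a transformer; this NumPy sketch only shows the adjacent-slice differencing it builds on:

```python
import numpy as np

def differential_layer(x):
    """Append first-order differences between adjacent time slices.
    x: (batch, time, features) -> (batch, time, 2*features)."""
    dx = np.diff(x, axis=1)                                      # slice-to-slice change
    dx = np.concatenate([np.zeros_like(x[:, :1]), dx], axis=1)   # pad the first step
    return np.concatenate([x, dx], axis=-1)                      # keep levels + changes
```

Feeding both the raw levels and their deltas downstream lets later layers attend to small movements that would otherwise be drowned in the noisy absolute values of an LOB series.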

20 pages, 921 KiB  
Article
Engaging Learners in Educational Robotics: Uncovering Students’ Expectations for an Ideal Robotic Platform
by Georgios Kyprianou, Alexandra Karousou, Nikolaos Makris, Ilias Sarafis, Angelos Amanatiadis and Savvas A. Chatzichristofis
Electronics 2023, 12(13), 2865; https://doi.org/10.3390/electronics12132865 - 28 Jun 2023
Cited by 2 | Viewed by 1708
Abstract
Extensive research has been conducted on educational robotics (ER) platforms to explore their usage across different educational levels and assess their effectiveness in achieving desired learning outcomes. However, the existing literature falls short in addressing learners' specific preferences and characteristics regarding these platforms. To address this gap, it is crucial to encourage learners' active participation in the design process of robotic platforms. By incorporating their valuable feedback and preferences and providing them with platforms that align with their interests, we can create a motivating environment that leads to increased engagement in science, technology, engineering, and mathematics (STEM) courses and improved learning outcomes. Furthermore, this approach fosters a sense of absorption and full engagement among peers as they collaborate on assigned activities. To bridge the existing research gap, our study investigated current trends in the morphology of educational robotics platforms. We surveyed students from multiple schools in Greece who had no prior exposure to robotic platforms. Our study aimed to understand students' expectations of an ideal robotic companion, examining the desired characteristics, modes of interaction, and socialization that students anticipate from such a companion. By uncovering these attributes and standards, we aim to inform the development of an optimal model that effectively fulfills students' educational aspirations while keeping them motivated and engaged. Full article
(This article belongs to the Special Issue Recent Advances in Educational Robotics, Volume II)

22 pages, 671 KiB  
Article
LLM-Informed Multi-Armed Bandit Strategies for Non-Stationary Environments
by J. de Curtò, I. de Zarzà, Gemma Roig, Juan Carlos Cano, Pietro Manzoni and Carlos T. Calafate
Electronics 2023, 12(13), 2814; https://doi.org/10.3390/electronics12132814 - 25 Jun 2023
Cited by 6 | Viewed by 4200
Abstract
In this paper, we introduce an innovative approach to handling the multi-armed bandit (MAB) problem in non-stationary environments, harnessing the predictive power of large language models (LLMs). With the realization that traditional bandit strategies, including epsilon-greedy and upper confidence bound (UCB), may struggle in the face of dynamic changes, we propose a strategy informed by LLMs that offers dynamic guidance on exploration versus exploitation, contingent on the current state of the bandits. We bring forward a new non-stationary bandit model with fluctuating reward distributions and illustrate how LLMs can be employed to guide the choice of bandit amid this variability. Experimental outcomes illustrate the potential of our LLM-informed strategy, demonstrating its adaptability to the fluctuating nature of the bandit problem, while maintaining competitive performance against conventional strategies. This study provides key insights into the capabilities of LLMs in enhancing decision-making processes in dynamic and uncertain scenarios. Full article
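For context, a standard non-stationary baseline of the kind the LLM-informed strategy is compared against is epsilon-greedy with a constant step size, which keeps value estimates recency-weighted so they can track drifting rewards. A minimal sketch (the reward model, names, and parameter values are invented for illustration; an LLM-informed variant would adjust `eps` from a summary of recent history instead of keeping it fixed):

```python
import random

random.seed(0)

def eps_greedy(reward_fn, n_arms=3, steps=2000, eps=0.1, step=0.1):
    """Epsilon-greedy bandit with a constant step size for non-stationarity."""
    q = [0.0] * n_arms
    total = 0.0
    for t in range(steps):
        if random.random() < eps:
            a = random.randrange(n_arms)         # explore
        else:
            a = max(range(n_arms), key=lambda i: q[i])  # exploit
        r = reward_fn(a, t)
        q[a] += step * (r - q[a])                # recency-weighted estimate tracks drift
        total += r
    return total / steps

def drifting_reward(arm, t):
    """Non-stationary setting: the best arm switches halfway through."""
    best = 0 if t < 1000 else 2
    return (1.0 if arm == best else 0.2) + random.gauss(0, 0.05)
```

With a sample-average update (step size 1/n) the agent would cling to the pre-switch arm; the constant step size is what lets it recover after the reward distribution shifts.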

12 pages, 1986 KiB  
Communication
Enhancing Object Detection in Self-Driving Cars Using a Hybrid Approach
by Sajjad Ahmad Khan, Hyun Jun Lee and Huhnkuk Lim
Electronics 2023, 12(13), 2768; https://doi.org/10.3390/electronics12132768 - 21 Jun 2023
Cited by 4 | Viewed by 5210
Abstract
Recent advancements in artificial intelligence (AI) have greatly improved the object detection capabilities of autonomous vehicles, especially using convolutional neural networks (CNNs). However, achieving high levels of accuracy and speed simultaneously in vehicular environments remains a challenge. Therefore, this paper proposes a hybrid approach that incorporates the features of two state-of-the-art object detection models: You Only Look Once (YOLO) and Faster Region CNN (Faster R-CNN). The proposed hybrid approach combines the detection and boundary box selection capabilities of YOLO with the region of interest (RoI) pooling from Faster R-CNN, resulting in improved segmentation and classification accuracy. Furthermore, we skip the Region Proposal Network (RPN) from the Faster R-CNN architecture to optimize processing time. The hybrid model is trained on a local dataset of 10,000 labeled traffic images collected during driving scenarios, further enhancing its accuracy. The results demonstrate that our proposed hybrid approach outperforms existing state-of-the-art models, providing both high accuracy and practical real-time object detection for autonomous vehicles. It is observed that the proposed hybrid model achieves a significant increase in accuracy, with improvements ranging from 5 to 7 percent compared to the standalone YOLO models. The findings of this research have practical implications for the integration of AI technologies in autonomous driving systems. Full article
(This article belongs to the Section Artificial Intelligence)
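The RoI pooling stage borrowed from Faster R-CNN maps an arbitrary-sized region of the feature map onto a fixed grid by max-pooling within bins, so detections of any size yield a constant-length descriptor for the classifier. A NumPy sketch of the classic operation (not the authors' hybrid model; single-channel-last layout assumed for simplicity):

```python
import numpy as np

def roi_pool(fmap, roi, out=2):
    """Max-pool a region of interest to a fixed out x out grid.
    fmap: (H, W, C); roi: (r0, c0, r1, c1), an inclusive feature-map box."""
    r0, c0, r1, c1 = roi
    region = fmap[r0:r1 + 1, c0:c1 + 1]
    h, w, c = region.shape
    rs = np.linspace(0, h, out + 1).astype(int)   # row bin edges
    cs = np.linspace(0, w, out + 1).astype(int)   # column bin edges
    pooled = np.zeros((out, out, c))
    for i in range(out):
        for j in range(out):
            pooled[i, j] = region[rs[i]:rs[i + 1], cs[j]:cs[j + 1]].max(axis=(0, 1))
    return pooled
```

In the hybrid scheme described above, the boxes come from YOLO rather than a Region Proposal Network, which is why the RPN stage can be skipped.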

18 pages, 15432 KiB  
Article
An AR Application for the Efficient Construction of Water Pipes Buried Underground
by Koki Inoue, Shuichiro Ogake, Kazuma Kobayashi, Toyoaki Tomura, Satoshi Mitsui, Toshifumi Satake and Naoki Igo
Electronics 2023, 12(12), 2634; https://doi.org/10.3390/electronics12122634 - 12 Jun 2023
Viewed by 1200
Abstract
Unlike other civil engineering works, water pipe works require excavation before construction because the pipes are buried underground. The proposed AR application displays buried objects in three dimensions when users hold a device such as a smartphone over the ground, using the smartphone's camera images. The system also registers newly buried objects when they are installed or updated. The target of this project is water pipes, the most familiar of all buried structures. The system provides the following functions: "registration and display of new water pipe information" and "acquisition and display of current location coordinate information." By applying plane detection to data acquired from the smartphone's camera, the system can easily register and display a water pipe model horizontal to the ground. The system does not require a reference marker because it uses GPS and plane detection. In the future, the system will support the visualization and registration not only of water pipes but also of other underground infrastructure, and it will aid the rapid restoration of infrastructure after a large-scale disaster through the realization of a buried-object 3D MAP platform. Full article

19 pages, 1251 KiB  
Review
Immersive Virtual Reality Enabled Interventions for Autism Spectrum Disorder: A Systematic Review and Meta-Analysis
by Chen Li, Meike Belter, Jing Liu and Heide Lukosch
Electronics 2023, 12(11), 2497; https://doi.org/10.3390/electronics12112497 - 1 Jun 2023
Cited by 3 | Viewed by 3273
Abstract
Autism spectrum disorder (ASD) is characterized by persistent deficits in social communication and interaction, which can have significant impacts on daily life, education, and work. Limited performance in learning and working, as well as exclusion from social activities, are common challenges faced by individuals with ASD. Virtual reality (VR) technology has emerged as a promising medium for delivering interventions for ASD. To address five major research questions and understand the latest trends and challenges in this area, a systematic review of 21 journal articles published between 1 January 2010 and 31 December 2022 was conducted using the PRISMA approach. A meta-analysis of 15 articles was further conducted to assess interventional effectiveness. The results showed that most studies focused on social and affective skill training and relied on existing theories and practices with limited adaptations for VR. Furthermore, the enabling technologies’ affordances for the interventional needs of individuals with ASD were not thoroughly investigated. We suggest that future studies should propose and design interventions with solid theoretical foundations, explore more interventional areas besides social and affective skill training, and employ more rigorous experimental designs to investigate the effectiveness of VR-enabled ASD interventions. Full article
(This article belongs to the Topic Technology-Mediated Agile Blended Learning)
(This article belongs to the Section Artificial Intelligence)

14 pages, 2173 KiB  
Article
Improving Norwegian Translation of Bicycle Terminology Using Custom Named-Entity Recognition and Neural Machine Translation
by Daniel Hellebust and Isah A. Lawal
Electronics 2023, 12(10), 2334; https://doi.org/10.3390/electronics12102334 - 22 May 2023
Viewed by 1748
Abstract
The Norwegian business-to-business (B2B) market for bicycles consists mainly of international brands, such as Shimano, Trek, Cannondale, and Specialized. The product descriptions for these brands are usually in English and require local translation. However, these descriptions include bicycle-specific terminology that is challenging for online translators such as Google Translate. For this reason, local companies outsource translation or translate product descriptions manually, which is cumbersome. Focusing on the Norwegian B2B bicycle industry, this paper explores transfer learning to improve the machine translation of bicycle-specific terminology, alongside generic text, from English to Norwegian. Firstly, we trained a custom Named-Entity Recognition (NER) model to identify cycling-specific terminology and then adapted a MarianMT neural machine translation model for the translation process. Due to the lack of publicly available bicycle-terminology datasets to train the proposed models, we created our own dataset by collecting a corpus of cycling-related texts. We evaluated the performance of our proposed model and compared it with that of Google Translate. Our model outperformed Google Translate on the test set, with an average SacreBLEU score of 45.099 against 36.615 for Google Translate. We also created a web application where the user can input English text containing bicycle terminology, and it returns the detected cycling-specific words in addition to a Norwegian translation. Full article
(This article belongs to the Special Issue Application of Machine Learning and Intelligent Systems)
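One simple way to combine an NER model with a generic translator, in the spirit of the pipeline described, is to mask detected domain terms before translation and restore curated translations afterwards. The glossary below is a hypothetical stand-in for the paper's NER output, and its Norwegian entries are illustrative placeholders, not authoritative translations:

```python
import re

# Hypothetical bicycle-term glossary; in the paper, a trained NER
# model would detect these spans automatically.
GLOSSARY = {"derailleur": "girskifter", "bottom bracket": "kranklager"}

def protect_terms(text, glossary):
    """Replace known domain terms with numbered placeholder tokens so
    a generic translator cannot mangle them."""
    mapping = {}
    # Longest terms first, so "bottom bracket" wins over any subterm.
    for i, term in enumerate(sorted(glossary, key=len, reverse=True)):
        token = f"\u27e6{i}\u27e7"  # e.g. ⟦0⟧
        pattern = re.compile(re.escape(term), re.IGNORECASE)
        if pattern.search(text):
            text = pattern.sub(token, text)
            mapping[token] = glossary[term]
    return text, mapping

def restore_terms(text, mapping):
    """Swap the placeholders back for the curated translations after
    the masked sentence has been machine-translated."""
    for token, translation in mapping.items():
        text = text.replace(token, translation)
    return text
```

The masked sentence would be sent through the MarianMT model in between the two calls; the placeholders pass through translation unchanged.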

14 pages, 1644 KiB  
Article
Deep Learning-Based Context-Aware Recommender System Considering Change in Preference
by Soo-Yeon Jeong and Young-Kuk Kim
Electronics 2023, 12(10), 2337; https://doi.org/10.3390/electronics12102337 - 22 May 2023
Cited by 2 | Viewed by 1634
Abstract
In order to predict and recommend what users want, users' information is required, and more information is needed to improve the performance of a recommender system. As IoT devices and smartphones have made it possible to know a user's context, context-aware recommender systems have emerged that predict preferences by considering contextual information such as time, weather, and location. However, a user's preferences are not always the same in a given context: users may follow trends or make different choices due to changes in their personal environment. Therefore, in this paper, we propose a context-aware recommender system that considers changes in users' preferences over time. The proposed method uses Matrix Factorization with a preference transition matrix to capture and reflect these changes. To evaluate the proposed method, we compared its performance with a traditional recommender system, a context-aware recommender system, and a dynamic recommender system, and confirmed that it outperforms the existing methods. Full article
(This article belongs to the Special Issue Application Research Using AI, IoT, HCI, and Big Data Technologies)
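A minimal sketch of matrix factorization with a transition matrix that drifts user preferences between time steps; the drift model `T` and its application here are assumptions for illustration, not the paper's exact formulation:

```python
import numpy as np

rng = np.random.default_rng(0)
n_users, n_items, k = 4, 6, 3

U = rng.normal(size=(n_users, k))   # user latent factors (learned)
V = rng.normal(size=(n_items, k))   # item latent factors (learned)
# Hypothetical per-period preference transition matrix: mostly keeps
# the current taste vector, with a small learned drift component.
T = 0.9 * np.eye(k) + 0.1 * rng.normal(size=(k, k)) / k

def predict(user, item, steps):
    """Predicted rating after `steps` time periods of preference drift:
    the user's factor vector is advanced by T before the usual
    matrix-factorization dot product with the item factors."""
    u_t = np.linalg.matrix_power(T, steps) @ U[user]
    return float(u_t @ V[item])
```

With `steps = 0` this reduces to plain matrix factorization, which is the sense in which the transition matrix extends the static model.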

17 pages, 1888 KiB  
Article
Bibliometric Analysis of Automated Assessment in Programming Education: A Deeper Insight into Feedback
by José Carlos Paiva, Álvaro Figueira and José Paulo Leal
Electronics 2023, 12(10), 2254; https://doi.org/10.3390/electronics12102254 - 15 May 2023
Viewed by 1253
Abstract
Learning to program requires diligent practice and creates room for discovery, trial and error, debugging, and concept mapping. Learners must walk this long road themselves, supported by appropriate and timely feedback. Providing such feedback in programming exercises is not a humanly feasible task at scale. Therefore, the early and steadily growing interest of computer science educators in the automated assessment of programming exercises is not surprising. The automated assessment of programming assignments has been an active area of research for more than half a century, and interest in it continues to grow as it adapts to new developments in computer science and the resulting changes in educational requirements. It is therefore of paramount importance to understand the work that has been performed, who has performed it, its evolution over time, the relationships between publications, its hot topics, and its open problems. This paper presents a bibliometric study of the field, with a particular focus on automatic feedback generation, using literature data from the Web of Science Core Collection. It includes a descriptive analysis using various bibliometric measures and data visualizations of authors, affiliations, citations, and topics. In addition, we performed a complementary analysis focusing only on the subset of publications on the specific topic of automatic feedback generation. The results are highlighted and discussed. Full article

20 pages, 1319 KiB  
Article
Deep-Learning-Driven Techniques for Real-Time Multimodal Health and Physical Data Synthesis
by Muhammad Salman Haleem, Audrey Ekuban, Alessio Antonini, Silvio Pagliara, Leandro Pecchia and Carlo Allocca
Electronics 2023, 12(9), 1989; https://doi.org/10.3390/electronics12091989 - 25 Apr 2023
Cited by 3 | Viewed by 1862
Abstract
With the advent of Artificial Intelligence for healthcare, data synthesis methods offer crucial benefits: they facilitate the fast development of AI models while protecting data subjects and bypassing the complexity of data sharing and processing agreements. Existing technologies focus on synthesising real-time physiological and physical records sampled at regular time intervals. Real health data are, however, characterised by irregularities and multimodal variables that are still hard to reproduce while preserving correlations across time and across dimensions. This paper presents two novel techniques for the synthetic generation of real-time multimodal electronic health and physical records: (a) the Temporally Correlated Multimodal Generative Adversarial Network and (b) the Document Sequence Generator. The paper illustrates the need for and use of these techniques through a real use case, the H2020 GATEKEEPER project on AI for healthcare. Furthermore, it presents an evaluation of each technique and a discussion of their comparability and of the potential applications of synthetic data at different stages of the software development life-cycle. Full article

20 pages, 10058 KiB  
Article
Design of Vessel Data Lakehouse with Big Data and AI Analysis Technology for Vessel Monitoring System
by Sun Park, Chan-Su Yang and JongWon Kim
Electronics 2023, 12(8), 1943; https://doi.org/10.3390/electronics12081943 - 20 Apr 2023
Cited by 4 | Viewed by 2110
Abstract
The amount of data in the maritime domain is rapidly increasing due to the growing number of devices that can collect marine information, such as sensors, buoys, ships, and satellites. Maritime data are growing at an unprecedented rate, with terabytes of marine data collected every month and petabytes already publicly available. Heterogeneous marine data collected through various devices can be used in fields such as environmental protection, defect prediction, transportation route optimization, and energy efficiency. However, the high heterogeneity of such marine big data makes vessel-related data difficult to manage, and applications built on these data sources remain underdeveloped and fragmented. In this paper, we propose the Vessel Data Lakehouse architecture, consisting of a Vessel Data Lake layer that handles marine big data, a Vessel Data Warehouse layer that supports marine big data processing and AI, and a Vessel Application Services layer that supports marine application services. The proposed Vessel Data Lakehouse can efficiently manage heterogeneous vessel-related data: various types of heterogeneous data can be integrated and managed at low cost by structuring them with an open-source big data framework, and the vessel big data stored in the Data Lakehouse can be directly utilized by various vessel analysis services. We also present an actual use case of a vessel analysis service in the Vessel Data Lakehouse using AIS data from the Busan area. Full article

30 pages, 12366 KiB  
Article
A Freehand 3D Ultrasound Reconstruction Method Based on Deep Learning
by Xin Chen, Houjin Chen, Yahui Peng, Liu Liu and Chang Huang
Electronics 2023, 12(7), 1527; https://doi.org/10.3390/electronics12071527 - 23 Mar 2023
Cited by 2 | Viewed by 2908
Abstract
In the medical field, 3D ultrasound reconstruction can visualize the internal structure of patients, which is very important for doctors to carry out correct analyses and diagnoses. Medical 3D ultrasound images are widely used in clinical diagnosis because they display the characteristics and spatial location of the target more intuitively. The traditional way to obtain 3D ultrasonic images is to use a 3D ultrasonic probe directly. Although freehand 3D ultrasound reconstruction is still in the research stage, much recent work has addressed freehand reconstruction methods based on wireless ultrasonic probes. In this paper, a wireless linear array probe is used to build a freehand acousto-optic positioning 3D ultrasonic imaging system. A B-scan (brightness scan) produces a 2D cross-section of the scanned anatomy, such as the eye and its orbit. This system is used to collect and construct multiple 2D B-scan datasets for experiments. Based on the experimental results, a freehand 3D ultrasonic reconstruction method based on deep learning is proposed, called sequence prediction reconstruction based on acousto-optic localization (SPRAO). SPRAO is not yet ready for clinical use. Compared with reconstruction using a 3D ultrasound probe, SPRAO offers both a controllable scanning area and a low cost, and it addresses several problems in existing algorithms. Firstly, a 60 frames per second (FPS) B-scan sequence can be synthesized from a 12 FPS wireless ultrasonic probe through 2–3 acquisitions, which both lowers the required output frame rate of the probe and increases the permissible moving speed of the wireless probe.
Secondly, SPRAO analyzes the B-scans through speckle decorrelation to calibrate the acousto-optic auxiliary positioning information, whereas other algorithms offer no solution to the cumulative error of an external auxiliary positioning device. Finally, long short-term memory (LSTM) is used to predict the spatial position and attitude of B-scans, and the calculation of pose deviation and speckle decorrelation is integrated into a 3D convolutional neural network (3DCNN), preparing for real-time 3D reconstruction with accurate spatial poses of B-scans. At the end of this paper, SPRAO is compared with linear motion, IMU, speckle decorrelation, CNN, and other methods. The experimental results show that the spatial pose deviation of B-scans output by SPRAO is the lowest among these methods. Full article
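The speckle decorrelation idea used for calibration can be sketched as follows: correlate neighbouring B-scan patches, then map the correlation to an elevational distance. The linear decorrelation model below is a toy stand-in (real decorrelation curves are probe-specific and roughly Gaussian, and must be calibrated per transducer):

```python
import numpy as np

def correlation(a, b):
    """Normalised cross-correlation of two same-sized B-scan patches."""
    a = a - a.mean()
    b = b - b.mean()
    return float((a * b).sum() / np.sqrt((a * a).sum() * (b * b).sum()))

def estimated_distance(rho, d_full=1.0):
    """Toy decorrelation model: correlation falls linearly from 1 to 0
    over the elevational speckle width d_full, so lower correlation
    between neighbouring scans implies a larger elevational step."""
    rho = max(0.0, min(1.0, rho))
    return (1.0 - rho) * d_full
```

In SPRAO this distance estimate is used to correct drift in the acousto-optic positioning rather than as the sole pose source.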

21 pages, 7652 KiB  
Article
An Image Object Detection Model Based on Mixed Attention Mechanism Optimized YOLOv5
by Guangming Sun, Shuo Wang and Jiangjian Xie
Electronics 2023, 12(7), 1515; https://doi.org/10.3390/electronics12071515 - 23 Mar 2023
Cited by 4 | Viewed by 1834
Abstract
Object detection in complex environments is one of the more difficult problems in computer vision, drawing on key technologies such as pattern recognition, artificial intelligence, and digital image processing. Because the environment can be complex, changeable, highly varied, and easily confused with the target, the target is readily affected by factors such as insufficient light, partial occlusion, and background interference, making multi-target detection extremely difficult and reducing algorithm robustness. How to make full use of the rich spatial information and deep texture information in an image to accurately identify the target type and location is an urgent problem. Deep neural networks provide an effective way to extract and fully utilize image features. To address these problems, this paper proposes an object detection model based on a mixed attention mechanism optimization of YOLOv5 (MAO-YOLOv5). The proposed method fuses local and global features in an image to enrich the expressive ability of the feature map and to detect objects with large size differences more effectively. An attention mechanism is then applied to the feature map to weight each channel, enhancing key features, removing redundant features, and improving the ability of the feature network to distinguish target objects from the background. The results show that the proposed network model has higher precision and a faster running speed and performs better in object-detection tasks. Full article
(This article belongs to the Special Issue Advanced Technologies of Artificial Intelligence in Signal Processing)
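Channel weighting of the kind described (squeeze-and-excitation style) can be sketched in NumPy. The single learned matrix `w` here is a simplification of the usual two-layer excitation block, and the shapes are illustrative rather than the paper's:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def channel_attention(feature_map, w):
    """Reweight feature-map channels: global-average-pool each channel
    ("squeeze"), map through a learned matrix `w`, squash to (0, 1),
    and rescale the channels ("excitation").

    feature_map: array of shape (C, H, W); w: array of shape (C, C).
    """
    squeeze = feature_map.mean(axis=(1, 2))   # (C,) channel descriptors
    weights = sigmoid(w @ squeeze)            # (C,) per-channel gates
    return feature_map * weights[:, None, None]
```

Channels whose descriptors the learned gate scores highly are kept near full strength, while redundant channels are suppressed, which is the "enhance key features, remove redundant features" behaviour described above.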

19 pages, 2296 KiB  
Article
Integration of Farm Financial Accounting and Farm Management Information Systems for Better Sustainability Reporting
by Krijn Poppe, Hans Vrolijk and Ivor Bosloper
Electronics 2023, 12(6), 1485; https://doi.org/10.3390/electronics12061485 - 21 Mar 2023
Cited by 3 | Viewed by 4549
Abstract
Farmers face an increasing administrative burden as agricultural policies and certification systems of trade partners ask for more sustainability reporting. Several indicator frameworks have been developed to measure sustainability, but they often lack empirical operationalization and are not always measured at the farm level. The research gap we address in this paper is the empirical link between the data needs for sustainability reporting and the developments in data management at the farm level. Family farms do not collect much data for internal management, but external demand for sustainability data can partly be fulfilled by reorganizing data management in the farm office. The Farm Financial Accounts (FFAs) and Farm Management Information Systems (FMISs) are the main data sources in the farm office. They originate from the same source of note-taking by farmers but became separated when formalized and computerized. Nearly all European farms have a bank account and must keep financial accounts (e.g., for Value-Added Tax or income tax) that can be audited. Financial accounts are not designed for environmental accounting or calculating sustainability metrics but provide a wealth of information to make assessments on these subjects. FMISs are much less frequently used but collect more technical and fine-grained data at crop or enterprise level for different fields. FMISs are also strong in integrating sensor and satellite data. Integrating data availability and workflows of FFAs and FMISs makes sustainability reporting less cumbersome regarding data entry and adds valuable data to environmental accounts. This paper applies a design science approach to design an artifact, a dashboard for sustainability reporting based on the integration of information flows from farm financial accounting systems and farm management information systems. 
The design developed in this paper illustrates that if invoices were digitized, most of the data gathering needed for external sustainability reporting would happen automatically when the invoice is paid by bank transfer. Data on the use of inputs and production could be added with procedures similar to those in current FMISs, but with less data entry, fewer risks of discrepancies in outcomes, and possibilities for cross-checking the results. Full article

15 pages, 685 KiB  
Article
Knowledge-Guided Prompt Learning for Few-Shot Text Classification
by Liangguo Wang, Ruoyu Chen and Li Li
Electronics 2023, 12(6), 1486; https://doi.org/10.3390/electronics12061486 - 21 Mar 2023
Cited by 1 | Viewed by 3184
Abstract
Recently, prompt-based learning has shown impressive performance on various natural language processing tasks in few-shot scenarios. Previous work on knowledge probing showed that the success of prompt learning stems from the implicit knowledge stored in pre-trained language models. However, how this implicit knowledge helps solve downstream tasks remains unclear. In this work, we propose a knowledge-guided prompt learning method that can reveal relevant knowledge for text classification. Specifically, we designed a knowledge prompting template and two multi-task frameworks. The experiments demonstrate the superiority of combining knowledge and prompt learning in few-shot text classification. Full article
(This article belongs to the Special Issue Natural Language Processing and Information Retrieval)
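The mechanics of prompt-based classification can be sketched minimally: wrap the input in a cloze template and map label words ("verbalizers") to classes via the language model's scores at the mask position. The template and label words below are illustrative, not the paper's knowledge prompting template:

```python
# A cloze template turns classification into masked-word prediction.
TEMPLATE = "{text} This topic is about [MASK]."

# Verbalizer: each class label is tied to a word the language model
# can predict at the [MASK] position.
VERBALIZER = {"sports": "sports", "politics": "politics"}

def build_prompt(text):
    """Wrap the raw input in the cloze template."""
    return TEMPLATE.format(text=text)

def classify(mask_word_scores):
    """Pick the class whose verbalizer token the language model scored
    highest at the [MASK] position. `mask_word_scores` maps candidate
    words to scores (here supplied directly; in practice they come
    from a pre-trained masked language model)."""
    return max(
        VERBALIZER,
        key=lambda label: mask_word_scores.get(VERBALIZER[label], float("-inf")),
    )
```

A knowledge-guided variant, as proposed in the paper, enriches the template with retrieved knowledge before querying the model; the classification step itself is unchanged.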

13 pages, 2089 KiB  
Article
Serious Games and Soft Skills in Higher Education: A Case Study of the Design of Compete!
by Nadia McGowan, Aída López-Serrano and Daniel Burgos
Electronics 2023, 12(6), 1432; https://doi.org/10.3390/electronics12061432 - 17 Mar 2023
Cited by 4 | Viewed by 3114
Abstract
This article describes the serious game Compete!, developed within the European Erasmus+ framework, that aims to teach soft skills to higher education students in order to increase their employability. Despite the increasing relevance of soft skills for successful entry into the labour market, these are often overlooked in higher education. A participatory learning methodology based on a gamification tool has been used for this purpose. The game presents a series of scenarios describing social sustainability problems that require the application of soft skills identified as key competencies in a field study across different European countries. These competencies are creative problem-solving, effective communication, stress management, and teamwork. On completion of each game scenario and the game itself, students receive an evaluation of both their soft skills and the strategic and operational decisions they have made. In the evaluation of these decisions, both the economic and sustainability aspects of the decision are assessed. The teacher can then address the competencies and sustainability issues using the different game scenarios, thus creating higher motivation and deeper understanding amongst the students. This hybrid learning methodology incorporates digital tools for the cross-curricular teaching and learning of sustainability and soft skills. In conclusion, this article describes a possible method of incorporating soft skills in higher education; this complements students’ technical knowledge while helping to achieve Sustainable Development Goals. Full article

14 pages, 25964 KiB  
Article
Deep-Learning-Based Scalp Image Analysis Using Limited Data
by Minjeong Kim, Yujung Gil, Yuyeon Kim and Jihie Kim
Electronics 2023, 12(6), 1380; https://doi.org/10.3390/electronics12061380 - 14 Mar 2023
Cited by 3 | Viewed by 3764
Abstract
The World Health Organization and Korea National Health Insurance report that the number of alopecia patients increases every year, and approximately 70 percent of adults suffer from scalp problems. Although alopecia is a genetic condition, it is difficult to diagnose at an early stage. Although deep-learning-based approaches have been effective for medical image analyses, it is challenging to build deep learning models for alopecia detection and analysis because creating an alopecia image dataset is difficult. In this paper, we present an approach for generating a model specialized for alopecia analysis that achieves high accuracy by applying data preprocessing, data augmentation, and an ensemble of deep learning models that have been effective for medical image analyses. We use an alopecia image dataset containing 526 good, 13,156 mild, 3742 moderate, and 825 severe alopecia images. The dataset was further augmented by applying normalization, geometry-based augmentation (rotation, vertical flip, horizontal flip, crop, and affine transformation), and PCA augmentation. We compare the performance of single deep learning models using ResNet, ResNeXt, DenseNet, and XceptionNet with ensembles of these models. The best result was achieved by combining DenseNet, XceptionNet, and ResNet, with an accuracy of 95.75% and an F1 score of 87.05. Full article
(This article belongs to the Section Artificial Intelligence)
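Model ensembles of the kind described are commonly combined by soft voting: average each model's class-probability vector and take the argmax. A minimal sketch, assuming each model outputs a softmax vector (the combination rule is a standard one; the paper may weight members differently):

```python
import numpy as np

def ensemble_predict(prob_list):
    """Soft-voting ensemble: average the per-model class-probability
    vectors and return the winning class index plus the averaged
    probabilities."""
    avg = np.mean(prob_list, axis=0)
    return int(np.argmax(avg)), avg
```

With three members (e.g. DenseNet, XceptionNet, and ResNet outputs), `prob_list` would simply hold three softmax vectors per image.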

19 pages, 8478 KiB  
Article
FE-GAN: Fast and Efficient Underwater Image Enhancement Model Based on Conditional GAN
by Jie Han, Jian Zhou, Lin Wang, Yu Wang and Zhongjun Ding
Electronics 2023, 12(5), 1227; https://doi.org/10.3390/electronics12051227 - 4 Mar 2023
Cited by 4 | Viewed by 1906
Abstract
The processing of underwater images can vastly ease the tasks of underwater robots and promote the development of ocean exploration. This paper proposes a fast and efficient underwater image enhancement model based on a conditional GAN with good generalization ability, using aggregation strategies and concatenation operations to take full advantage of limited hierarchical features. A sequential network avoids frequently visiting additional nodes, which speeds up inference and reduces memory consumption. Through a structural re-parameterization approach, we design a dual residual block (DRB) and accordingly construct a hierarchical attention encoder (HAE), which extracts sufficient feature and texture information from different levels of an image, with an 11.52% improvement in GFLOPs. Extensive experiments were carried out on real and artificially synthesized benchmark underwater image datasets, with qualitative and quantitative comparisons against state-of-the-art methods. The results show that our model produces better images and has good generalization ability and real-time performance, which is conducive to the practical application of underwater robot tasks. Full article
(This article belongs to the Section Artificial Intelligence)

15 pages, 557 KiB  
Article
WCC-JC 2.0: A Web-Crawled and Manually Aligned Parallel Corpus for Japanese-Chinese Neural Machine Translation
by Jinyi Zhang, Ye Tian, Jiannan Mao, Mei Han, Feng Wen, Cong Guo, Zhonghui Gao and Tadahiro Matsumoto
Electronics 2023, 12(5), 1140; https://doi.org/10.3390/electronics12051140 - 26 Feb 2023
Cited by 5 | Viewed by 1934
Abstract
Movie and TV subtitles are frequently employed in natural language processing (NLP) applications, but there are limited Japanese-Chinese bilingual corpora accessible as a dataset to train neural machine translation (NMT) models. In our previous study, we effectively constructed a corpus of a considerable size containing bilingual text data in both Japanese and Chinese by collecting subtitle text data from websites that host movies and television series. The unsatisfactory translation performance of the initial corpus, Web-Crawled Corpus of Japanese and Chinese (WCC-JC 1.0), was predominantly caused by the limited number of sentence pairs. To address this shortcoming, we thoroughly analyzed the issues associated with the construction of WCC-JC 1.0 and constructed the WCC-JC 2.0 corpus by first collecting subtitle data from movie and TV series websites. Then, we manually aligned a large number of high-quality sentence pairs. Our efforts resulted in a new corpus that includes about 1.4 million sentence pairs, an 87% increase compared with WCC-JC 1.0. As a result, WCC-JC 2.0 is now among the largest publicly available Japanese-Chinese bilingual corpora in the world. To assess the performance of WCC-JC 2.0, we calculated the BLEU scores relative to other comparative corpora and performed manual evaluations of the translation results generated by translation models trained on WCC-JC 2.0. We provide WCC-JC 2.0 as a free download for research purposes only. Full article
(This article belongs to the Special Issue Natural Language Processing and Information Retrieval)
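The BLEU scores used above to compare corpora combine clipped n-gram precisions with a brevity penalty. A compact, unsmoothed sentence-level sketch (the paper's evaluation would use a standard implementation with smoothing and corpus-level statistics):

```python
import math
from collections import Counter

def ngrams(tokens, n):
    """Multiset of the n-grams in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

def bleu(hypothesis, reference, max_n=4):
    """Sentence-level BLEU: geometric mean of clipped n-gram
    precisions (n = 1..max_n) times a brevity penalty. No smoothing,
    so any missing n-gram order yields a score of 0."""
    hyp, ref = hypothesis.split(), reference.split()
    precisions = []
    for n in range(1, max_n + 1):
        h, r = ngrams(hyp, n), ngrams(ref, n)
        overlap = sum((h & r).values())       # clipped n-gram matches
        precisions.append(overlap / max(1, sum(h.values())))
    if min(precisions) == 0:
        return 0.0
    log_avg = sum(math.log(p) for p in precisions) / max_n
    bp = 1.0 if len(hyp) >= len(ref) else math.exp(1 - len(ref) / len(hyp))
    return bp * math.exp(log_avg)
```

A perfect hypothesis scores 1.0; shorter or divergent hypotheses are penalised by the precision terms and the brevity penalty.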

14 pages, 5425 KiB  
Article
Swin-UperNet: A Semantic Segmentation Model for Mangroves and Spartina alterniflora Loisel Based on UperNet
by Zhenhua Wang, Jing Li, Zhilian Tan, Xiangfeng Liu and Mingjie Li
Electronics 2023, 12(5), 1111; https://doi.org/10.3390/electronics12051111 - 24 Feb 2023
Cited by 2 | Viewed by 2726
Abstract
As an ecosystem in transition from land to sea, mangroves play a vital role in wind and wave protection and biodiversity maintenance. However, the invasion of Spartina alterniflora Loisel seriously damages the mangrove wetland ecosystem. To protect mangroves scientifically and dynamically, a semantic segmentation model for mangroves and Spartina alterniflora Loisel, named Swin-UperNet, was proposed based on UperNet. In the proposed Swin-UperNet model, a data concatenation module was proposed to make full use of the multispectral information in remote sensing images, the backbone network was replaced with a Swin Transformer to improve the feature extraction capability, and a boundary optimization module was designed to refine the rough segmentation results. Additionally, a linear combination of cross-entropy loss and Lovász-Softmax loss was taken as the loss function of Swin-UperNet, which addresses the problem of unbalanced sample distribution. Taking GF-1 and GF-6 images as the experimental data, the performance of the Swin-UperNet model was compared against that of other segmentation models, including PSPNet, PSANet, DeepLabv3, DANet, FCN, OCRNet, and DeepLabv3+, in terms of pixel accuracy (PA), mean intersection over union (mIoU), and frames per second (FPS). The results showed that the Swin-UperNet model achieved the best PA of 98.87% and mIoU of 90.0%, with higher efficiency than most of the compared models. In conclusion, Swin-UperNet is an efficient and accurate model for the simultaneous segmentation of mangroves and Spartina alterniflora Loisel, which will provide a scientific basis for Spartina alterniflora Loisel monitoring and for mangrove resource conservation and management. Full article
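The combined loss described in the abstract can be sketched as follows. This is a minimal NumPy illustration of a linear combination of cross-entropy and the Lovász-Softmax loss (Berman et al., 2018) on flattened per-pixel predictions; the mixing weight `alpha` is an assumed parameter, not the paper's configuration.

```python
import numpy as np

def lovasz_grad(gt_sorted):
    """Gradient of the Lovász extension of the Jaccard loss
    with respect to errors sorted in decreasing order."""
    gts = gt_sorted.sum()
    intersection = gts - np.cumsum(gt_sorted)
    union = gts + np.cumsum(1.0 - gt_sorted)
    jaccard = 1.0 - intersection / union
    jaccard[1:] = jaccard[1:] - jaccard[:-1]
    return jaccard

def lovasz_softmax_flat(probs, labels, n_classes):
    """Lovász-Softmax loss on flattened predictions; probs has shape [P, C]."""
    losses = []
    for c in range(n_classes):
        fg = (labels == c).astype(np.float64)   # binary mask for class c
        errors = np.abs(fg - probs[:, c])       # per-pixel prediction errors
        order = np.argsort(-errors)             # sort errors descending
        losses.append(np.dot(errors[order], lovasz_grad(fg[order])))
    return np.mean(losses)

def cross_entropy_flat(probs, labels):
    """Mean negative log-likelihood of the true class."""
    return -np.mean(np.log(probs[np.arange(len(labels)), labels] + 1e-12))

def combined_loss(probs, labels, n_classes, alpha=0.5):
    """Linear combination of cross-entropy and Lovász-Softmax;
    alpha is an illustrative weight, not the paper's value."""
    return (alpha * cross_entropy_flat(probs, labels)
            + (1 - alpha) * lovasz_softmax_flat(probs, labels, n_classes))
```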
(This article belongs to the Special Issue Applications of Deep Neural Network for Smart City)

16 pages, 417 KiB  
Article
Fuzzy Rough Nearest Neighbour Methods for Aspect-Based Sentiment Analysis
by Olha Kaminska, Chris Cornelis and Veronique Hoste
Electronics 2023, 12(5), 1088; https://doi.org/10.3390/electronics12051088 - 22 Feb 2023
Cited by 5 | Viewed by 1553
Abstract
Fine-grained sentiment analysis, known as Aspect-Based Sentiment Analysis (ABSA), establishes the polarity of a section of text concerning a particular aspect. Aspect, sentiment, and emotion categorisation are the three steps that make up an ABSA configuration, which we investigated for a dataset of English reviews. In this work, owing to the fuzzy nature of textual data, we investigated machine learning methods based on fuzzy rough sets, which we believe are more interpretable than complex state-of-the-art models. The novelty of this paper is a pipeline that incorporates all three of these steps and applies Fuzzy-Rough Nearest Neighbour classification techniques and their extension based on ordered weighted average operators (FRNN-OWA), combined with text embeddings based on transformers. After some improvements to the pipeline's stages, such as using two separate models for emotion detection, we obtain correct results for the majority of test instances (up to 81.4%) on all three classification tasks. We consider three different options for the pipeline. In two of them, the three classification tasks are performed consecutively, with the data reduced at each step to retain only correct predictions, while the third option performs each step independently. The third option allows us to examine the prediction results after each step and spot certain patterns; we used it for an error analysis that identifies, for each test instance, the neighbouring training samples and demonstrates that our methods can extract useful patterns from the data. Finally, we compare our results with another paper that performed the same ABSA classification for the Dutch version of the dataset and conclude that our results are in line with theirs or even slightly better. Full article
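A minimal sketch of FRNN-OWA classification is shown below, assuming cosine similarity over embedding vectors and additive OWA weights; the exact similarity relation and weighting scheme used in the paper may differ. Each class score averages an OWA-softened upper approximation (similarity to same-class neighbours) and lower approximation (dissimilarity to other-class neighbours).

```python
import numpy as np

def additive_owa_weights(k):
    """Additive OWA weights 2(k+1-i)/(k(k+1)), i = 1..k,
    emphasising the largest sorted values (a soft maximum)."""
    return np.array([2.0 * (k + 1 - i) / (k * (k + 1)) for i in range(1, k + 1)])

def soft_max(values, k):
    """OWA-weighted soft maximum over the k largest values."""
    return np.dot(additive_owa_weights(k), np.sort(values)[::-1][:k])

def soft_min(values, k):
    """OWA soft minimum, dual of the soft maximum."""
    return 1.0 - soft_max(1.0 - values, k)

def frnn_owa_predict(train_X, train_y, x, k=3):
    """FRNN-OWA sketch with crisp classes: the class score is the mean
    of the OWA lower and upper approximation memberships."""
    sims = train_X @ x / (np.linalg.norm(train_X, axis=1) * np.linalg.norm(x))
    scores = {}
    for c in np.unique(train_y):
        in_c = sims[train_y == c]
        out_c = sims[train_y != c]
        upper = soft_max(in_c, min(k, len(in_c)))
        lower = soft_min(1.0 - out_c, min(k, len(out_c)))
        scores[c] = (lower + upper) / 2.0
    return max(scores, key=scores.get)
```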
(This article belongs to the Special Issue AI for Text Understanding)

17 pages, 781 KiB  
Article
Distilling Monolingual Models from Large Multilingual Transformers
by Pranaydeep Singh, Orphée De Clercq and Els Lefever
Electronics 2023, 12(4), 1022; https://doi.org/10.3390/electronics12041022 - 18 Feb 2023
Cited by 2 | Viewed by 2297
Abstract
Although language modeling has been trending upwards steadily, models available for low-resourced languages are limited to large multilingual models such as mBERT and XLM-RoBERTa, which come with significant overheads for deployment vis-à-vis their model size, inference speeds, etc. We attempt to tackle this problem by proposing a novel methodology to apply knowledge distillation techniques to filter language-specific information from a large multilingual model into a small, fast monolingual model that can often outperform the teacher model. We demonstrate the viability of this methodology on two downstream tasks each for six languages. We further dive into the possible modifications to the basic setup for low-resourced languages by exploring ideas to tune the final vocabulary of the distilled models. Lastly, we perform a detailed ablation study to understand the different components of the setup better and find out what works best for the two under-resourced languages, Swahili and Slovene. Full article
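For context, the generic knowledge-distillation objective (Hinton-style soft targets blended with hard-label cross-entropy) can be sketched as below. The paper's monolingual-distillation setup additionally filters language-specific information and tunes the distilled vocabulary, which is not shown here; the temperature and mixing weight are illustrative defaults.

```python
import numpy as np

def softmax(z, T=1.0):
    """Temperature-scaled softmax over the last axis."""
    z = z / T
    e = np.exp(z - z.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def distillation_loss(student_logits, teacher_logits, labels, T=2.0, alpha=0.5):
    """Soft-target KL divergence (scaled by T^2, as is conventional)
    blended with hard-label cross-entropy on the student."""
    p_t = softmax(teacher_logits, T)
    p_s = softmax(student_logits, T)
    kl = np.mean(np.sum(p_t * (np.log(p_t + 1e-12) - np.log(p_s + 1e-12)), axis=-1))
    hard = softmax(student_logits)
    ce = -np.mean(np.log(hard[np.arange(len(labels)), labels] + 1e-12))
    return alpha * (T ** 2) * kl + (1 - alpha) * ce
```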
(This article belongs to the Special Issue AI for Text Understanding)

17 pages, 3166 KiB  
Article
A Framework for Understanding Unstructured Financial Documents Using RPA and Multimodal Approach
by Seongkuk Cho, Jihoon Moon, Junhyeok Bae, Jiwon Kang and Sangwook Lee
Electronics 2023, 12(4), 939; https://doi.org/10.3390/electronics12040939 - 13 Feb 2023
Cited by 1 | Viewed by 2658
Abstract
Financial business processes worldwide depend heavily on manual labor and paper documents, which makes them tedious and time-consuming. To solve this problem, traditional robotic process automation (RPA) has recently evolved into a hyper-automation solution that combines computer vision (CV) and natural language processing (NLP) methods. These solutions are capable of image analysis tasks such as key information extraction and document classification. However, they perform less well on text-rich document images and require large amounts of training data to process multilingual documents. This study proposes an intelligent document processing framework based on a multimodal approach, which combines a pre-trained deep learning model with the traditional RPA used in banks to automate business processes from real-world financial document images. The proposed framework can perform classification and key information extraction with a small amount of training data and can analyze multilingual documents. To evaluate the effectiveness of the proposed framework, extensive experiments were conducted on Korean financial document images. The experimental results show the superiority of the multimodal approach for understanding financial documents and demonstrate that adequate labeling can improve performance by up to about 15%. Full article
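A common building block of such multimodal document models is late fusion, where each modality's feature vector is normalised and concatenated before a downstream classifier. The sketch below is a generic illustration of this idea; the function and its inputs are hypothetical stand-ins, not the framework's actual components.

```python
import numpy as np

def fuse_modalities(text_emb, image_emb, layout_feats):
    """Late-fusion sketch: L2-normalise each modality so no single one
    dominates by scale, then concatenate into one feature vector."""
    def l2(v):
        n = np.linalg.norm(v)
        return v / n if n > 0 else v
    return np.concatenate([l2(text_emb), l2(image_emb), l2(layout_feats)])
```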
(This article belongs to the Special Issue Applied AI-Based Platform Technology and Application, Volume II)

20 pages, 729 KiB  
Article
A Benchmark for Dutch End-to-End Cross-Document Event Coreference Resolution
by Loic De Langhe, Thierry Desot, Orphée De Clercq and Veronique Hoste
Electronics 2023, 12(4), 850; https://doi.org/10.3390/electronics12040850 - 8 Feb 2023
Cited by 3 | Viewed by 1114
Abstract
In this paper, we present a benchmark result for end-to-end cross-document event coreference resolution in Dutch. First, the state of the art for this task in other languages is introduced, along with currently existing resources and commonly used evaluation metrics. We then build on recently published work to fully explore end-to-end event coreference resolution in the Dutch language domain for the first time. For this purpose, two well-performing transformer-based algorithms, for the detection and the coreference resolution of Dutch textual events respectively, are combined in a pipeline architecture and compared to baseline scores relying on feature-based methods. The results are promising and comparable to similar studies in higher-resourced languages; however, they also reveal that much work remains to be done in this specific NLP domain. To gain more insight, an in-depth analysis of the two pipeline components is carried out to highlight and overcome possible shortcomings of the current approach and to provide suggestions for future work. Full article
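A standard feature-based baseline in cross-document event coreference is lemma matching, which places event mentions with identical head lemmas in the same cluster. The sketch below implements that baseline only; it is not the paper's transformer pipeline, and the tuple layout of the mention records is an assumption for illustration.

```python
def lemma_match_clusters(mentions):
    """Cross-document lemma-match baseline: cluster event mentions
    whose head lemmas are identical. Each mention is a
    (doc_id, mention_id, head_lemma) tuple (hypothetical format)."""
    clusters = {}
    for doc_id, mention_id, head_lemma in mentions:
        clusters.setdefault(head_lemma, []).append((doc_id, mention_id))
    return list(clusters.values())
```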
(This article belongs to the Special Issue AI for Text Understanding)

11 pages, 3028 KiB  
Article
Merchant Recommender System Using Credit Card Payment Data
by Suyoun Yoo and Jaekwang Kim
Electronics 2023, 12(4), 811; https://doi.org/10.3390/electronics12040811 - 6 Feb 2023
Cited by 2 | Viewed by 2131
Abstract
As the domestic credit card market steadily grows, the marketing methods that credit card companies use to secure customers are also changing. Understanding individual preferences and payment patterns has become essential, and sophisticated personalized marketing methods have been developed to properly understand customers' interests and meet their needs. Building on this, a personalized system that recommends products or stores suited to each customer attracts customers more effectively. However, existing neural-network-based general recommendation frameworks cannot reflect the key domain information of credit card payment data when applied directly to merchant recommendation. This study proposes a model specialized for merchant recommendation that reflects the domain information of credit card payment data. Customers' gender and age information were added to the training data, and the industry category and region information of the member stores were reconstructed to be learned together with the interaction data. A personalized recommendation system was realized by combining historical card payment data with customer and merchant information to recommend member stores that customers are highly likely to use in the future. The proposed model (NMF_CSI) showed a performance improvement of 3% based on HR@10 and 5% based on NDCG@10 compared to previous models. In addition, customer coverage was expanded so that the recommendation model can be applied not only to customers who actively use credit cards but also to customers with little usage data. Full article
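The two reported metrics can be computed as below for a leave-one-out evaluation with a single held-out merchant per customer, the common setup for HR@k and NDCG@k in recommender benchmarks; whether the paper uses exactly this protocol is an assumption.

```python
import math

def hr_at_k(ranked, target, k=10):
    """HR@k: 1 if the held-out merchant appears in the top-k list."""
    return 1.0 if target in ranked[:k] else 0.0

def ndcg_at_k(ranked, target, k=10):
    """NDCG@k with a single relevant item: the ideal DCG is 1, so the
    score is 1/log2(position + 2) for a hit at 0-based position, else 0."""
    topk = ranked[:k]
    if target in topk:
        return 1.0 / math.log2(topk.index(target) + 2)
    return 0.0
```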
(This article belongs to the Special Issue Application of Machine Learning and Intelligent Systems)