Advanced Machine Learning, Pattern Recognition, and Deep Learning Technologies: Methodologies and Applications

A special issue of Electronics (ISSN 2079-9292). This special issue belongs to the section "Computer Science & Engineering".

Deadline for manuscript submissions: closed (28 February 2025) | Viewed by 42486

Special Issue Editors


Guest Editor
School of Computer Science, Guangdong University of Technology, Guangzhou 510006, China
Interests: machine learning; biometrics; data mining; image processing

Guest Editor
School of Cyber Science and Technology, Sun Yat-sen University, Shenzhen 518107, China
Interests: anomaly detection; multimedia analysis; object detection; image/video compression; deep learning

Guest Editor
Department of Computer and Information Science, University of Macau, Macau, China
Interests: biometrics; pattern recognition; image processing; medical image analysis

Special Issue Information

Dear Colleagues,

In recent years, machine learning, pattern recognition, and deep learning techniques have been successfully applied across science and engineering. For example, biometric recognition based on the palmprint, face, or iris provides personal authentication for airport security, banking, and online payment; information retrieval helps us find the content we are interested in on the Internet; and image processing technology improves the quality of our photos. In particular, deep learning has demonstrated powerful capabilities in extracting discriminant patterns and making accurate predictions from large-scale databases. The performance of machine learning, pattern recognition, and deep learning algorithms relies heavily on model design, mathematical interpretation, and optimization, and a good fusion of theories and models is crucial to success in the above applications. The aim of this Special Issue is to highlight recent advances in machine learning, pattern recognition, and deep learning methodologies and theories. Papers with interesting or significant new applications of the above methods are also welcome. The topics of interest include, but are not limited to, the following:

  1. Advanced machine intelligence methods and applications;
  2. Advanced pattern analysis methods and applications;
  3. Deep-learning-based methods and applications;
  4. Biometric recognition algorithms and applications;
  5. Multi-view/-modal learning and fusion;
  6. Data mining and analysis;
  7. Hashing learning-based methods and applications;
  8. Dimensionality reduction and discriminant representation;
  9. Subspace learning and clustering;
  10. Graph learning-based methods and applications;
  11. Image super-resolution/enhancing/restoration;
  12. Advanced models in computer vision, such as object tracking and detection;
  13. Sparse representation and application.

Dr. Shuping Zhao
Dr. Jie Wen
Dr. Chao Huang
Dr. Bob Zhang
Guest Editors

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to this website. Once you are registered, go to the submission form. Manuscripts can be submitted until the deadline. All submissions that pass pre-check are peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the special issue website. Research articles, review articles, and short communications are invited. For planned papers, a title and short abstract (about 100 words) can be sent to the Editorial Office for announcement on this website.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. Electronics is an international peer-reviewed open access semimonthly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 2400 CHF (Swiss Francs). Submitted papers should be well formatted and use good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Keywords

  • artificial intelligence
  • machine learning
  • pattern recognition
  • deep learning
  • mathematical optimization

Benefits of Publishing in a Special Issue

  • Ease of navigation: Grouping papers by topic helps scholars navigate broad scope journals more efficiently.
  • Greater discoverability: Special Issues support the reach and impact of scientific research. Articles in Special Issues are more discoverable and cited more frequently.
  • Expansion of research network: Special Issues facilitate connections among authors, fostering scientific collaborations.
  • External promotion: Articles in Special Issues are often promoted through the journal's social media, increasing their visibility.
  • e-Book format: Special Issues with more than 10 articles can be published as dedicated e-books, ensuring wide and rapid dissemination.

Further information on MDPI's Special Issue policies can be found here.


Published Papers (18 papers)


Research

16 pages, 1502 KiB  
Article
A Modification Method for Domain Shift in the Hidden Semi-Markov Model and Its Application
by Yunosuke Shimada, Takashi Kusaka, Takayuki Mukaeda, Yui Endo, Mitsunori Tada, Natsuki Miyata and Takayuki Tanaka
Electronics 2025, 14(8), 1579; https://doi.org/10.3390/electronics14081579 - 13 Apr 2025
Viewed by 158
Abstract
In human behavior recognition using machine learning, model performance degrades when the training data and operational data follow different distributions, a phenomenon known as domain shift. This study proposes a method for domain adaptation in the hidden semi-Markov model (HSMM) by modifying only the emission probability distributions. Assuming that the state transition probabilities remain unchanged, the method updates the emission probabilities based on the posterior distribution of the target domain. This approach enables domain adaptation with minimal computational cost without requiring model retraining. The effectiveness of the proposed method was evaluated on synthetic time-series data from different domains and actual care work data, achieving recognition performance comparable to that of models retrained for each domain. These findings suggest that the proposed method applies to various time-series data analysis tasks requiring domain adaptation. Full article
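
The emission-only adaptation idea can be illustrated with a small sketch. The snippet below is a simplified EM-style update assuming Gaussian emissions and using state priors in place of the full HSMM forward-backward posterior; it is an illustrative sketch, not the authors' implementation.

```python
import numpy as np
from scipy.stats import multivariate_normal

def adapt_emissions(X_target, means, covs, state_priors, n_iter=10):
    """Re-estimate only the Gaussian emission parameters on target-domain data;
    transition/duration parameters of the (semi-)Markov model stay frozen."""
    K = len(means)
    for _ in range(n_iter):
        # Simplified E-step: responsibilities from priors x emission likelihoods,
        # standing in for the full HSMM forward-backward posterior.
        logp = np.stack([multivariate_normal.logpdf(X_target, means[k], covs[k])
                         for k in range(K)], axis=1) + np.log(state_priors)
        logp -= logp.max(axis=1, keepdims=True)
        gamma = np.exp(logp)
        gamma /= gamma.sum(axis=1, keepdims=True)
        # M-step: weighted re-estimation of emission means and covariances only.
        for k in range(K):
            w = gamma[:, k][:, None]
            means[k] = (w * X_target).sum(axis=0) / w.sum()
            diff = X_target - means[k]
            covs[k] = (w * diff).T @ diff / w.sum() + 1e-6 * np.eye(X_target.shape[1])
    return means, covs

# Toy call: two states in 2-D, target-domain data shifted by 0.5.
rng = np.random.default_rng(0)
means = [np.zeros(2), np.ones(2) * 3]
covs = [np.eye(2), np.eye(2)]
X_tgt = np.vstack([rng.normal(size=(100, 2)) + 0.5, rng.normal(size=(100, 2)) + 3.5])
means, covs = adapt_emissions(X_tgt, means, covs, state_priors=np.array([0.5, 0.5]))
print(np.round(means[0], 2), np.round(means[1], 2))
```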

18 pages, 1052 KiB  
Article
A Sparse Representation Classification Framework for Person Identification and Verification Using Neurophysiological Signals
by Vangelis P. Oikonomou
Electronics 2025, 14(6), 1108; https://doi.org/10.3390/electronics14061108 - 11 Mar 2025
Viewed by 467
Abstract
Brain biometrics has received increasing attention from the scientific community due to its unique properties in comparison to traditional biometric methods. Many studies have shown that EEG features are distinct among individuals. SSVEP signals, generated by stationary localized sources and distributed sources in the parietal and occipital regions of the brain, serve as a reliable basis for biometrics. In this study, we present a novel approach that leverages the spatial patterns of brain responses elicited by visual stimulation at specific frequencies. Specifically, we propose integrating common spatial patterns with Sparse Representation Classification (SRC) frameworks for person identification and verification. The use of common spatial patterns enables the design of personalized spatial filters, which play a crucial role in constructing the dictionary used by SRC frameworks. We conducted extensive evaluations of the proposed method, comparing it with several traditional approaches using two SSVEP datasets. Our analysis also explored a broad range of flickering frequencies in the SSVEP experiments. The results from these datasets demonstrated the effectiveness of our approach for person identification and verification, achieving an average correct recognition rate above 90% across various visual stimulus frequencies and short durations of electrophysiological signals. Full article

15 pages, 1132 KiB  
Article
Optimizing Multi-View CNN for CAD Mechanical Model Classification: An Evaluation of Pruning and Quantization Techniques
by Victor Pinto, Verusca Severo and Francisco Madeiro
Electronics 2025, 14(5), 1013; https://doi.org/10.3390/electronics14051013 - 3 Mar 2025
Viewed by 591
Abstract
In the realm of product design and development, efficient retrieval and reuse of 3D CAD models are vital for optimizing workflows and minimizing redundant efforts. Manual labeling of CAD models, while traditional, is labor-intensive and prone to inconsistency, highlighting the need for automated classification systems. Multi-view convolutional neural networks (MVCNNs) offer an automated solution by leveraging 2D projections to represent 3D objects, balancing high classification accuracy with computational efficiency. Despite their effectiveness, the computational demands of MVCNNs pose challenges in large-scale CAD applications. This study investigates the use of optimization strategies, namely pruning and quantization, in the context of MVCNNs applied to the classification of 3D CAD mechanical models. By using different pruning and quantization strategies, we evaluate trade-offs between classification accuracy, execution time, and memory usage. In our evaluation of pruning and quantization techniques, 8-bit quantization reduced the memory used by the model from 83.78 MB to 21.01 MB, with accuracy only slightly decreasing from 93.83% to 93.59%. When applying 25% structured pruning, the model’s memory usage was reduced to 47.16 MB, execution time decreased from 133 to 97 s, and accuracy decreased to 92.14%. A combined approach of 25% pruning and 8-bit quantization achieved even better resource efficiency, with memory usage at 11.86 MB, execution time at 99 s, and accuracy at 92.06%. This combination of pruning and quantization leads to efficient MVCNN model optimization, balancing resource usage and classification performance, which is especially relevant in large-scale applications. Full article
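
A minimal sketch of the two optimization techniques, using PyTorch's built-in structured pruning and dynamic int8 quantization on a generic backbone; the paper's exact MVCNN architecture and quantization scheme may differ.

```python
import torch
import torch.nn as nn
import torch.nn.utils.prune as prune
from torchvision.models import vgg16

# Illustrative only: 25% structured pruning of conv filters plus dynamic int8
# quantization of the classifier's Linear layers.
model = vgg16(weights=None)

for module in model.modules():
    if isinstance(module, nn.Conv2d):
        # Remove whole output filters ranked by L1 norm.
        prune.ln_structured(module, name="weight", amount=0.25, n=1, dim=0)
        prune.remove(module, "weight")              # make the pruning permanent

# 8-bit dynamic quantization of the fully connected layers.
model_q = torch.quantization.quantize_dynamic(model, {nn.Linear}, dtype=torch.qint8)

with torch.no_grad():
    views = torch.randn(12, 3, 224, 224)            # 12 rendered views of one CAD model
    logits = model_q(views).max(dim=0).values       # simple max view-pooling over logits
print(logits.shape)
```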

23 pages, 520 KiB  
Article
Investigation of Text-Independent Speaker Verification by Support Vector Machine-Based Machine Learning Approaches
by Odin Kohler and Masudul Imtiaz
Electronics 2025, 14(5), 963; https://doi.org/10.3390/electronics14050963 - 28 Feb 2025
Viewed by 491
Abstract
Speaker verification is a common problem with numerous biometric security applications. Speaker verification comes in two different forms: text-independent and text-dependent. Each of these forms can be implemented via many different machine learning and deep learning techniques. From our research, we found that there is significantly less work implementing text-independent speaker verification using machine learning techniques than there is using deep learning techniques. Because of this gap, we were motivated to build our own SVM and CNN models for text-independent speaker verification and compare them to other systems using SVMs or deep learning techniques. We limited ourselves to SVMs because they are commonly used for speech recognition and have achieved very high accuracies. The main motivation behind this was two-fold. The first reason is to demonstrate that SVMs can and have been successfully used for text-independent speaker verification at a level comparable to deep learning techniques; the second reason is to make work using SVMs for text-independent speaker verification more accessible so it can be expanded upon easily. The analysis and comparison conducted in this paper will demonstrate how SVMs achieve results comparable to deep learning techniques and allow future researchers to more easily find SVMs used for text-independent speaker verification and derive a sense of what is being implemented in the field. Full article
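
A minimal sketch of SVM-based verification on fixed-length utterance embeddings (random placeholders stand in for real MFCC-style features); it illustrates only the target-vs-impostor decision, not the authors' full system.

```python
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Toy data: 5 speakers, 30 utterance embeddings each (40-dim), with a crude
# per-speaker offset standing in for real speaker characteristics.
rng = np.random.default_rng(1)
n_speakers, n_utts, dim = 5, 30, 40
X = rng.normal(size=(n_speakers * n_utts, dim)) + np.repeat(np.arange(n_speakers), n_utts)[:, None]
y = np.repeat(np.arange(n_speakers), n_utts)

claimed = 2                                            # identity the test speaker claims
verifier = make_pipeline(StandardScaler(), SVC(kernel="rbf", probability=True))
verifier.fit(X, (y == claimed).astype(int))            # target vs. impostor model

test_utt = X[claimed * n_utts + 3] + 0.1 * rng.normal(size=dim)
score = verifier.predict_proba(test_utt[None, :])[0, 1]
print("accept" if score > 0.5 else "reject", round(score, 3))
```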

29 pages, 5038 KiB  
Article
An Evolutionary Deep Learning Framework for Accurate Remaining Capacity Prediction in Lithium-Ion Batteries
by Yang Liu, Liangyu Han, Yuzhu Wang, Jinqi Zhu, Bo Zhang and Jia Guo
Electronics 2025, 14(2), 400; https://doi.org/10.3390/electronics14020400 - 20 Jan 2025
Viewed by 888
Abstract
Accurate remaining capacity prediction (RCP) of lithium-ion batteries (LIBs) is crucial for ensuring their safety, reliability, and performance, particularly amidst the growing energy crisis and environmental concerns. However, the complex aging processes of LIBs significantly hinder accurate RCP, as traditional prediction methods struggle to effectively capture nonlinear degradation patterns and long-term dependencies. To tackle these challenges, we introduce an innovative framework that combines evolutionary learning with deep learning for RCP. This framework integrates Temporal Convolutional Networks (TCNs), Bidirectional Gated Recurrent Units (BiGRUs), and an attention mechanism to extract comprehensive time-series features and improve prediction accuracy. Additionally, we introduce a hybrid optimization algorithm that combines the Sparrow Search Algorithm (SSA) with Bayesian Optimization (BO) to enhance the performance of the model. The experimental results validate the superiority of our framework, demonstrating its capability to achieve significantly improved prediction accuracy compared to existing methods. This study provides researchers in battery management systems, electric vehicles, and renewable energy storage with a reliable tool for optimizing lithium-ion battery performance, enhancing system reliability, and addressing the challenges of the new energy industry. Full article
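
A compact, illustrative PyTorch model combining dilated temporal convolutions, a bidirectional GRU, and attention pooling, in the spirit of the framework described above; the layer sizes are arbitrary and the SSA/BO hyperparameter search is omitted.

```python
import torch
import torch.nn as nn

class CapacityPredictor(nn.Module):
    """Simplified TCN -> BiGRU -> attention regressor for remaining capacity."""
    def __init__(self, n_features=4, hidden=32):
        super().__init__()
        # Dilated temporal convolutions stand in for the TCN blocks.
        self.tcn = nn.Sequential(
            nn.Conv1d(n_features, hidden, kernel_size=3, padding="same", dilation=1),
            nn.ReLU(),
            nn.Conv1d(hidden, hidden, kernel_size=3, padding="same", dilation=2),
            nn.ReLU())
        self.bigru = nn.GRU(hidden, hidden, batch_first=True, bidirectional=True)
        self.attn = nn.Linear(2 * hidden, 1)    # additive attention scores over time
        self.head = nn.Linear(2 * hidden, 1)    # remaining-capacity estimate

    def forward(self, x):                       # x: (batch, cycles, features)
        h = self.tcn(x.transpose(1, 2)).transpose(1, 2)
        h, _ = self.bigru(h)
        w = torch.softmax(self.attn(h), dim=1)  # attention weights over cycles
        return self.head((w * h).sum(dim=1)).squeeze(-1)

model = CapacityPredictor()
cycles = torch.randn(8, 50, 4)                  # toy batch: 8 cells, 50 cycles, 4 features
print(model(cycles).shape)                      # torch.Size([8])
```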

20 pages, 7824 KiB  
Article
Research on a Feature Point Detection Algorithm for Weld Images Based on Deep Learning
by Shaopeng Kang, Hongbin Qiang, Jing Yang, Kailei Liu, Wenbin Qian, Wenpeng Li and Yanfei Pan
Electronics 2024, 13(20), 4117; https://doi.org/10.3390/electronics13204117 - 18 Oct 2024
Viewed by 1487
Abstract
Laser vision seam tracking enhances robotic welding by enabling external information acquisition, thus improving the overall intelligence of the welding process. However, camera images captured during welding often suffer from distortion due to strong noises, including arcs, splashes, and smoke, which adversely affect the accuracy and robustness of feature point detection. To mitigate these issues, we propose a feature point extraction algorithm tailored for weld images, utilizing an improved Deeplabv3+ semantic segmentation network combined with EfficientDet. By replacing Deeplabv3+’s backbone with MobileNetV2, we enhance prediction efficiency. The DenseASPP structure and attention mechanism are implemented to focus on laser stripe edge extraction, resulting in cleaner laser stripe images and minimizing noise interference. Subsequently, EfficientDet extracts feature point positions from these cleaned images. Experimental results demonstrate that, across four typical weld types, the average feature point extraction error is maintained below 1 pixel, with over 99% of errors falling below 3 pixels, indicating both high detection accuracy and reliability. Full article
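
The feature-point extraction step after segmentation can be illustrated with a simple post-processing sketch: take the column-wise centroid of a binary stripe mask and pick the point of maximum curvature. This is a toy stand-in, not the EfficientDet-based extractor used in the paper.

```python
import numpy as np

def stripe_feature_point(mask):
    """Given a binary laser-stripe mask (e.g., from a segmentation network),
    compute the column-wise centerline and return its highest-curvature point."""
    h, w = mask.shape
    cols, centers = [], []
    for c in range(w):
        rows = np.flatnonzero(mask[:, c])
        if rows.size:
            cols.append(c)
            centers.append(rows.mean())
    cols, centers = np.asarray(cols), np.asarray(centers)
    # Second difference of the centerline as a crude curvature measure.
    curvature = np.abs(np.gradient(np.gradient(centers)))
    i = int(curvature.argmax())
    return cols[i], centers[i]

# Synthetic V-groove stripe: two line segments meeting at column 60, row 40.
mask = np.zeros((120, 120), dtype=bool)
for c in range(120):
    mask[40 + abs(c - 60), c] = True
print(stripe_feature_point(mask))   # -> (60, 40.0)
```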

26 pages, 5646 KiB  
Article
Multi-Feature Extraction and Selection Method to Diagnose Burn Depth from Burn Images
by Xizhe Zhang, Qi Zhang, Peixian Li, Jie You, Jingzhang Sun and Jianhang Zhou
Electronics 2024, 13(18), 3665; https://doi.org/10.3390/electronics13183665 - 14 Sep 2024
Cited by 1 | Viewed by 1182
Abstract
Burn wound depth is a significant determinant of patient treatment. Typically, the evaluation of burn depth relies heavily on the clinical experience of doctors. Even experienced surgeons may not achieve high accuracy and speed in diagnosing burn depth. Thus, intelligent burn depth classification is useful and valuable. Here, an intelligent classification method for burn depth based on machine learning techniques is proposed. In particular, this method involves extracting color, texture, and depth features from images, and sequentially cascading these features. Then, an iterative selection method based on random forest feature importance measure is applied. The selected features are input into the random forest classifier to evaluate this proposed method using the standard burn dataset. This method classifies burn images, achieving an accuracy of 91.76% when classified into two categories and 80.74% when classified into three categories. The comprehensive experimental results indicate that this proposed method is capable of learning effective features from limited data samples and identifying burn depth effectively. Full article
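
A small sketch of iterative selection driven by random forest feature importance, on synthetic features standing in for the cascaded color/texture/depth descriptors.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

# Synthetic stand-in for cascaded color/texture/depth features.
X, y = make_classification(n_samples=300, n_features=60, n_informative=12, random_state=0)
selected = np.arange(X.shape[1])
best_score, best_subset = 0.0, selected

while selected.size > 10:
    rf = RandomForestClassifier(n_estimators=200, random_state=0).fit(X[:, selected], y)
    score = cross_val_score(rf, X[:, selected], y, cv=5).mean()
    if score >= best_score:
        best_score, best_subset = score, selected
    # Drop the least important 10% of the remaining features and iterate.
    order = np.argsort(rf.feature_importances_)
    selected = selected[order[int(0.1 * selected.size):]]

print(len(best_subset), round(best_score, 3))
```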

27 pages, 4478 KiB  
Article
Predicting Economic Trends and Stock Market Prices with Deep Learning and Advanced Machine Learning Techniques
by Victor Chang, Qianwen Ariel Xu, Anyamele Chidozie and Hai Wang
Electronics 2024, 13(17), 3396; https://doi.org/10.3390/electronics13173396 - 26 Aug 2024
Cited by 6 | Viewed by 15844
Abstract
The volatile and non-linear nature of stock market data, particularly in the post-pandemic era, poses significant challenges for accurate financial forecasting. To address these challenges, this research develops advanced deep learning and machine learning algorithms to predict financial trends, quantify risks, and forecast stock prices, focusing on the technology sector. Our study seeks to answer the following question: “Which deep learning and supervised machine learning algorithms are the most accurate and efficient in predicting economic trends and stock market prices, and under what conditions do they perform best?” We focus on two advanced recurrent neural network (RNN) models, long short-term memory (LSTM) and Gated Recurrent Unit (GRU), to evaluate their efficiency in predicting technology industry stock prices. Additionally, we integrate statistical methods such as autoregressive integrated moving average (ARIMA) and Facebook Prophet and machine learning algorithms like Extreme Gradient Boosting (XGBoost) to enhance the robustness of our predictions. Unlike classical statistical algorithms, LSTM and GRU models can identify and retain important data sequences, enabling more accurate predictions. Our experimental results show that the GRU model outperforms the LSTM model in terms of prediction accuracy and training time across multiple metrics such as RMSE and MAE. This study offers crucial insights into the predictive capabilities of deep learning models and advanced machine learning techniques for financial forecasting, highlighting the potential of GRU and XGBoost for more accurate and efficient stock price prediction in the technology sector. Full article
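
A minimal GRU regressor on sliding windows of synthetic prices, illustrating the recurrent forecasting setup; the LSTM, ARIMA, Prophet, and XGBoost baselines of the study are not reproduced here.

```python
import torch
import torch.nn as nn

class GRUForecaster(nn.Module):
    """Predict the next value from a fixed-length window of past prices."""
    def __init__(self, n_features=1, hidden=32):
        super().__init__()
        self.gru = nn.GRU(n_features, hidden, batch_first=True)
        self.head = nn.Linear(hidden, 1)

    def forward(self, x):                        # x: (batch, window, features)
        _, h = self.gru(x)                       # h: (1, batch, hidden)
        return self.head(h[-1]).squeeze(-1)

# Toy random-walk "prices" turned into 30-step windows and next-step targets.
prices = torch.cumsum(0.01 * torch.randn(1000), dim=0)
windows = torch.stack([prices[i:i + 30] for i in range(969)]).unsqueeze(-1)
targets = prices[30:999]

model = GRUForecaster()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.MSELoss()
for epoch in range(5):                           # a few full-batch steps for illustration
    opt.zero_grad()
    loss = loss_fn(model(windows), targets)
    loss.backward()
    opt.step()
print(float(loss))
```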

21 pages, 10977 KiB  
Article
Lightweight Progressive Fusion Calibration Network for Rotated Object Detection in Remote Sensing Images
by Jing Liu, Donglin Jing, Yanyan Cao, Ying Wang, Chaoping Guo, Peijun Shi and Haijing Zhang
Electronics 2024, 13(16), 3172; https://doi.org/10.3390/electronics13163172 - 11 Aug 2024
Cited by 2 | Viewed by 1368
Abstract
Rotated object detection is a crucial task in aerial image analysis. To address challenges such as multi-directional object rotation, complex backgrounds with occlusions, and the trade-off between speed and accuracy in remote sensing images, this paper introduces a lightweight progressive fusion calibration network for rotated object detection (LPFC-RDet). The network comprises three main modules: the Retentive Meet Transformers (RMT) feature extraction block, the Progressive Fusion Calibration module (PFC), and the Shared Group Convolution Lightweight detection head (SGCL). The RMT feature extraction block integrates a retentive mechanism with global context modeling to learn rotation-insensitive features. The PFC module employs pixel-level, local-level, and global-level weights to calibrate features, enhancing feature extraction from occluded objects while suppressing background interference. The SGCL detection head uses decoupled detection tasks and shared group convolution layers to achieve parameter sharing and feature interaction, improving accuracy while maintaining a lightweight structure. Experimental results demonstrate that our method surpasses state-of-the-art detectors on three widely used remote sensing object datasets: HRSC2016, UCAS_AOD, and DOTA. Full article

14 pages, 2607 KiB  
Article
Enhanced Text Classification with Label-Aware Graph Convolutional Networks
by Ming-Yen Lin, Hsuan-Chun Liu and Sue-Chen Hsush
Electronics 2024, 13(15), 2944; https://doi.org/10.3390/electronics13152944 - 25 Jul 2024
Viewed by 898
Abstract
Text classification is an important research field in text mining and natural language processing, gaining momentum with the growth of social networks. Despite the accuracy advancements made by deep learning models, existing graph neural network-based methods often overlook the implicit class information within texts. To address this gap, we propose a graph neural network model named LaGCN to improve classification accuracy. LaGCN utilizes the latent class information in texts, treating it as explicit class labels. It refines the graph convolution process by adding label-aware nodes to capture document–word, word–word, and word–class correlations for text classification. Comparing LaGCN with leading-edge models like HDGCN and BERT, our experiments on Ohsumed, Movie Review, 20 Newsgroups, and R8 datasets demonstrate its superiority. LaGCN outperformed existing methods, showing average accuracy improvements of 19.47%, 10%, 4.67%, and 0.4%, respectively. This advancement underscores the importance of integrating class information into graph neural networks, setting a new benchmark for text classification tasks. Full article
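
The label-aware propagation step can be sketched as a plain two-layer GCN over a graph whose nodes include documents, words, and class nodes. The adjacency below is random, standing in for the TF-IDF, PMI, and word-class weights used by LaGCN.

```python
import torch
import torch.nn as nn

# Toy heterogeneous graph: document, word, and label-aware class nodes.
n_docs, n_words, n_classes, dim = 8, 20, 3, 16
n = n_docs + n_words + n_classes

A = torch.rand(n, n)
A = (A + A.T) / 2 + torch.eye(n)                      # symmetric adjacency with self-loops
d = A.sum(dim=1)
A_hat = A / torch.sqrt(d[:, None] * d[None, :])       # symmetric normalization

class GCN(nn.Module):
    def __init__(self, dim, n_classes):
        super().__init__()
        self.w1 = nn.Linear(dim, dim)
        self.w2 = nn.Linear(dim, n_classes)

    def forward(self, A_hat, X):
        h = torch.relu(self.w1(A_hat @ X))             # first graph convolution
        return self.w2(A_hat @ h)                      # second graph convolution

X = torch.randn(n, dim)                                # node features (identity in text GCNs)
logits = GCN(dim, n_classes)(A_hat, X)[:n_docs]        # read predictions at document nodes
print(logits.shape)                                    # torch.Size([8, 3])
```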

15 pages, 1293 KiB  
Article
An Improved Lightweight YOLOv5s-Based Method for Detecting Electric Bicycles in Elevators
by Ziyuan Zhang, Xianyu Yang and Chengyu Wu
Electronics 2024, 13(13), 2660; https://doi.org/10.3390/electronics13132660 - 7 Jul 2024
Cited by 1 | Viewed by 1230
Abstract
The increase in fire accidents caused by the indoor charging of electric bicycles (EBs) has raised public concern. Monitoring EBs in elevators is challenging, and current object detection methods based on YOLOv5 variants face problems with computational load and detection rate. To address this issue, this paper presents an improved lightweight method based on YOLOv5s to detect EBs in elevators. The method introduces the MobileNetV2 module to make the model lightweight. By introducing the CBAM attention mechanism and the Bidirectional Feature Pyramid Network (BiFPN) into the YOLOv5s neck network, the detection precision is improved. To verify that the model can be deployed at the edge of an elevator, it is deployed on a Raspberry Pi 4B embedded development board connected to a buzzer for application verification. The experimental results demonstrate that the model's parameters are reduced by 58.4%, the computational complexity is reduced by 50.6%, the detection precision reaches 95.9%, and real-time detection of EBs in elevators is achieved. Full article
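
As an illustration of one of the components mentioned above, here is a standard CBAM (channel followed by spatial attention) block in PyTorch; the exact variant integrated into the YOLOv5s neck may differ.

```python
import torch
import torch.nn as nn

class CBAM(nn.Module):
    """Convolutional Block Attention Module: channel attention then spatial attention."""
    def __init__(self, channels, reduction=16, kernel_size=7):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(channels, channels // reduction), nn.ReLU(),
            nn.Linear(channels // reduction, channels))
        self.spatial = nn.Conv2d(2, 1, kernel_size, padding=kernel_size // 2)

    def forward(self, x):
        b, c, _, _ = x.shape
        avg = self.mlp(x.mean(dim=(2, 3)))                       # average-pooled descriptor
        mx = self.mlp(x.amax(dim=(2, 3)))                        # max-pooled descriptor
        x = x * torch.sigmoid(avg + mx).view(b, c, 1, 1)         # channel attention
        s = torch.cat([x.mean(dim=1, keepdim=True),
                       x.amax(dim=1, keepdim=True)], dim=1)
        return x * torch.sigmoid(self.spatial(s))                # spatial attention

feat = torch.randn(2, 64, 40, 40)
print(CBAM(64)(feat).shape)    # torch.Size([2, 64, 40, 40])
```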

16 pages, 6121 KiB  
Article
Prediction of Machine-Generated Financial Tweets Using Advanced Bidirectional Encoder Representations from Transformers
by Muhammad Asad Arshed, Ștefan Cristian Gherghina, Dur-E-Zahra and Mahnoor Manzoor
Electronics 2024, 13(11), 2222; https://doi.org/10.3390/electronics13112222 - 6 Jun 2024
Viewed by 1483
Abstract
With the rise of Large Language Models (LLMs), distinguishing between genuine and AI-generated content, particularly in finance, has become challenging. Previous studies have focused on binary identification of ChatGPT-generated content, overlooking other AI tools used for text regeneration. This study addresses this gap by examining various AI-regenerated content types in the finance domain. Objective: The study aims to differentiate between human-generated financial content and AI-regenerated content, specifically focusing on ChatGPT, QuillBot, and SpinBot. It constructs a dataset comprising real text and AI-regenerated text for this purpose. Contribution: This research contributes to the field by providing a dataset that includes various types of AI-regenerated financial content. It also evaluates the performance of different models, particularly highlighting the effectiveness of the Bidirectional Encoder Representations from Transformers (BERT) Base Cased model in distinguishing between these content types. Methods: The dataset is meticulously preprocessed to ensure quality and reliability. Various models, including BERT Base Cased, are fine-tuned and compared with traditional machine learning models using TF-IDF and Word2Vec approaches. Results: The BERT Base Cased model outperforms other models, achieving an accuracy, precision, recall, and F1 score of 0.73, 0.73, 0.73, and 0.72, respectively, in distinguishing between real and AI-regenerated financial content. Conclusions: This study demonstrates the effectiveness of the BERT base model in differentiating between human-generated financial content and AI-regenerated content. It highlights the importance of considering various AI tools in identifying synthetic content, particularly in the finance domain in Pakistan. Full article
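
A minimal fine-tuning sketch for a bert-base-cased sequence classifier on placeholder texts; the dataset, hyperparameters, and comparison models of the study are not reproduced, and the transformers package (with downloaded weights) is assumed.

```python
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

tok = AutoTokenizer.from_pretrained("bert-base-cased")
model = AutoModelForSequenceClassification.from_pretrained("bert-base-cased", num_labels=2)

# Placeholder examples, not the paper's financial-tweet dataset.
texts = ["Quarterly revenue beat analyst expectations.",            # human-written (label 0)
         "The stock market witnessed remarkable dynamism today."]    # AI-regenerated (label 1)
labels = torch.tensor([0, 1])
batch = tok(texts, padding=True, truncation=True, max_length=64, return_tensors="pt")

optim = torch.optim.AdamW(model.parameters(), lr=2e-5)
model.train()
for _ in range(3):                                   # a few steps on the toy batch
    out = model(**batch, labels=labels)
    out.loss.backward()
    optim.step()
    optim.zero_grad()
print(out.logits.softmax(dim=-1))
```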

22 pages, 1238 KiB  
Article
A Novel Source Code Representation Approach Based on Multi-Head Attention
by Lei Xiao, Hao Zhong, Jianjian Liu, Kaiyu Zhang, Qizhen Xu and Le Chang
Electronics 2024, 13(11), 2111; https://doi.org/10.3390/electronics13112111 - 29 May 2024
Viewed by 1371
Abstract
Code classification and code clone detection are crucial for understanding and maintaining large software systems. Although deep learning surpasses traditional techniques in capturing the features of source code, existing models suffer from low processing power and high complexity. We propose a novel source code representation method based on the multi-head attention mechanism (SCRMHA). SCRMHA captures the vector representation of entire code segments, enabling it to focus on different positions of the input sequence, capture richer semantic information, and simultaneously process different aspects and relationships of the sequence. Moreover, it can calculate multiple attention heads in parallel, speeding up the computational process. We evaluate SCRMHA on both the standard dataset and an actual industrial dataset, and analyze the differences between these two datasets. Experiment results in code classification and clone detection tasks show that SCRMHA consumes less time and reduces complexity by about one-third compared with traditional source code feature representation methods. The results demonstrate that SCRMHA reduces the computational complexity and time consumption of the model while maintaining accuracy. Full article
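
A sketch of multi-head attention over token embeddings of a code segment using torch.nn.MultiheadAttention with a learned summary query; the vocabulary, dimensions, and pooling choice are illustrative assumptions, not SCRMHA's exact design.

```python
import torch
import torch.nn as nn

vocab, dim, heads = 5000, 128, 8
embed = nn.Embedding(vocab, dim)
mha = nn.MultiheadAttention(dim, heads, batch_first=True)
query = nn.Parameter(torch.randn(1, 1, dim))          # learned summary query (CLS-like)

tokens = torch.randint(0, vocab, (4, 60))             # 4 code segments, 60 tokens each
x = embed(tokens)
summary, attn = mha(query.expand(4, -1, -1), x, x)    # attend from the query to all tokens
code_vec = summary.squeeze(1)                         # (4, 128) segment representations
print(code_vec.shape, attn.shape)
```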

17 pages, 1211 KiB  
Article
FireXplainNet: Optimizing Convolution Block Architecture for Enhanced Wildfire Detection and Interpretability
by Muneeb A. Khan and Heemin Park
Electronics 2024, 13(10), 1881; https://doi.org/10.3390/electronics13101881 - 11 May 2024
Cited by 5 | Viewed by 1733
Abstract
The early detection of wildfires is a crucial challenge in environmental monitoring, pivotal for effective disaster management and ecological conservation. Traditional detection methods often fail to detect fires accurately and in a timely manner, resulting in significant adverse consequences. This paper presents FireXplainNet, a Convolutional Neural Network (CNN)-based model, designed specifically to address these limitations through enhanced efficiency and precision in wildfire detection. We optimized data input via specialized preprocessing techniques, significantly improving detection accuracy on both the Wildfire Image and FLAME datasets. A distinctive feature of our approach is the integration of Local Interpretable Model-agnostic Explanations (LIME), which facilitates a deeper understanding of and trust in the model’s predictive capabilities. Additionally, we have delved into optimizing pretrained models through transfer learning, enriching our analysis and offering insights into the comparative effectiveness of FireXplainNet. The model achieved an accuracy of 87.32% on the FLAME dataset and 98.70% on the Wildfire Image dataset, with inference times of 0.221 and 0.168 milliseconds, respectively. These performance metrics are critical for the application of real-time fire detection systems, underscoring the potential of FireXplainNet in environmental monitoring and disaster management strategies. Full article
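
A sketch of the LIME step on a stand-in CNN and a random image, assuming the lime and scikit-image packages are installed; FireXplainNet itself is replaced by a tiny placeholder model.

```python
import numpy as np
import torch
from lime import lime_image

# Tiny placeholder classifier standing in for a trained fire/no-fire CNN.
model = torch.nn.Sequential(
    torch.nn.Conv2d(3, 8, 3, padding=1), torch.nn.ReLU(),
    torch.nn.AdaptiveAvgPool2d(1), torch.nn.Flatten(), torch.nn.Linear(8, 2))
model.eval()

def predict_fn(images):
    """LIME passes a batch of HxWx3 arrays; return class probabilities."""
    x = torch.tensor(np.stack(images), dtype=torch.float32).permute(0, 3, 1, 2) / 255.0
    with torch.no_grad():
        return torch.softmax(model(x), dim=1).numpy()

image = np.random.randint(0, 255, (224, 224, 3), dtype=np.uint8)
explainer = lime_image.LimeImageExplainer()
explanation = explainer.explain_instance(image, predict_fn, top_labels=1,
                                         hide_color=0, num_samples=200)
_, mask = explanation.get_image_and_mask(explanation.top_labels[0],
                                         positive_only=True, num_features=5)
print(mask.shape)     # superpixels most responsible for the predicted class
```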

15 pages, 2456 KiB  
Article
Deep Reinforcement Learning with Godot Game Engine
by Mahesh Ranaweera and Qusay H. Mahmoud
Electronics 2024, 13(5), 985; https://doi.org/10.3390/electronics13050985 - 5 Mar 2024
Viewed by 4713
Abstract
This paper introduces a Python framework for developing Deep Reinforcement Learning (DRL) in an open-source Godot game engine to tackle sim-to-real research. A framework was designed to communicate and interface with the Godot game engine to perform the DRL. With the Godot game engine, users will be able to set up their environment while defining the constraints, motion, interactive objects, and actions to be performed. The framework interfaces with the Godot game engine to perform defined actions. It can be further extended to perform domain randomization and enhance overall learning by increasing the complexity of the environment. Unlike other proprietary physics or game engines, Godot provides extensive developmental freedom under an open-source licence. By incorporating Godot’s built-in powerful node-based environment system, flexible user interface, and the proposed Python framework, developers can extend its features to develop deep learning applications. Research performed on Sim2Real using this framework has provided great insight into the factors that affect the gap in reality. It also demonstrated the effectiveness of this framework in Sim2Real applications and research. Full article

19 pages, 9458 KiB  
Article
Seismic Event Detection in the Copahue Volcano Based on Machine Learning: Towards an On-the-Edge Implementation
by Yair Mauad Sosa, Romina Soledad Molina, Silvana Spagnotto, Iván Melchor, Alejandro Nuñez Manquez, Maria Liz Crespo, Giovanni Ramponi and Ricardo Petrino
Electronics 2024, 13(3), 622; https://doi.org/10.3390/electronics13030622 - 2 Feb 2024
Cited by 1 | Viewed by 1723
Abstract
This study focused on seismic event detection in a volcano using machine learning by leveraging the advantages of software/hardware co-design for a system on a chip (SoC) based on field-programmable gate array (FPGA) devices. A case study was conducted on the Copahue Volcano, an active stratovolcano located on the border between Argentina and Chile. Volcanic seismic event processing and detection were integrated into a PYNQ-based implementation by using a low-end SoC-FPGA device. We also provide insights into integrating an SoC-FPGA into the acquisition node, which can be valuable in scenarios where stations are deployed solely for data collection and holds the potential for the development of an early alert system. Full article

23 pages, 1650 KiB  
Article
A Heterogeneous Inference Framework for a Deep Neural Network
by Rafael Gadea-Gironés, José Luís Rocabado-Rocha, Jorge Fe and Jose M. Monzo
Electronics 2024, 13(2), 348; https://doi.org/10.3390/electronics13020348 - 14 Jan 2024
Cited by 1 | Viewed by 1872
Abstract
Artificial intelligence (AI) is one of the most promising technologies based on machine learning algorithms. In this paper, we propose a workflow for the implementation of deep neural networks. This workflow attempts to combine the flexibility of high-level synthesis (HLS)-based networks with the architectural control features of hardware description language (HDL)-based flows. The architecture consists of a convolutional neural network, SqueezeNet v1.1, and a hard processor system (HPS) that coexists with acceleration hardware to be designed. This methodology allows us to compare solutions based solely on software (PyTorch 1.13.1) and propose heterogeneous inference solutions, taking advantage of the best options within the software and hardware flow. The proposed workflow is implemented on a low-cost field programmable gate array system-on-chip (FPGA SoC) platform, specifically the DE10-Nano development board. We have provided systolic architectural solutions written in OpenCL that are highly flexible and easily tunable to take full advantage of the resources of programmable devices and achieve superior energy efficiencies working with 32-bit floating point. From a verification point of view, the proposed method is effective, since the reference models in all tests, both for the individual layers and the complete network, have been readily available using packages well known in the development, training, and inference of deep networks. Full article

25 pages, 17393 KiB  
Article
Enhancing Human Activity Recognition with LoRa Wireless RF Signal Preprocessing and Deep Learning
by Mingxing Nie, Liwei Zou, Hao Cui, Xinhui Zhou and Yaping Wan
Electronics 2024, 13(2), 264; https://doi.org/10.3390/electronics13020264 - 6 Jan 2024
Cited by 3 | Viewed by 2683
Abstract
This paper introduces a novel approach for enhancing human activity recognition through the integration of LoRa wireless RF signal preprocessing and deep learning. We tackle the challenge of extracting features from intricate LoRa signals by scrutinizing the unique propagation process of linearly modulated LoRa signals—a critical aspect for effective feature extraction. Our preprocessing technique involves converting complex-valued data into real numbers, utilizing Short-Time Fourier Transform (STFT) to generate spectrograms, and incorporating differential signal processing (DSP) techniques to augment activity recognition accuracy. Additionally, we employ frequency-to-image conversion for the purpose of intuitive interpretation. In comprehensive experiments covering activity classification, identity recognition, room identification, and presence detection, our carefully selected deep learning models exhibit outstanding accuracy. Notably, ConvNext attains 96.7% accuracy in activity classification, 97.9% in identity recognition, and 97.3% in room identification. The Vision TF model excels with 98.5% accuracy in presence detection. Through leveraging LoRa signal characteristics and sophisticated preprocessing techniques, our transformative approach significantly enhances feature extraction, ensuring heightened accuracy and reliability in human activity recognition. Full article
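
The preprocessing chain can be illustrated with a synthetic chirp: convert complex IQ samples to a real sequence, compute an STFT magnitude spectrogram, and take a simple differential along time. The sampling rate, bandwidth, and differential step below are illustrative choices, not the paper's exact parameters.

```python
import numpy as np
from scipy.signal import stft

fs, bw, t_sym = 1_000_000, 125_000, 0.008                 # assumed sample rate, bandwidth, duration
t = np.arange(0, t_sym, 1 / fs)
iq = np.exp(1j * np.pi * (bw / t_sym) * t ** 2)           # baseband LoRa-like up-chirp (complex IQ)

real_signal = np.real(iq)                                 # complex samples -> real sequence
f, tt, Z = stft(real_signal, fs=fs, nperseg=256, noverlap=192)
spectrogram = np.abs(Z)                                   # magnitude spectrogram "image"
diff_spec = np.diff(spectrogram, axis=1)                  # simple differential along time
print(spectrogram.shape, diff_spec.shape)
```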
