
Application, Optimization and Architecture of Deep Learning Neural Network

A special issue of Applied Sciences (ISSN 2076-3417). This special issue belongs to the section "Computing and Artificial Intelligence".

Deadline for manuscript submissions: 20 September 2025 | Viewed by 47092

Special Issue Editor


Dr. Haigang Gong
Guest Editor
School of Computer Science and Engineering, University of Electronic Science & Technology of China, Chengdu 610054, China
Interests: cloud computing; big data; deep learning; IoT; wireless networks

Special Issue Information

Dear Colleagues,

With the surge in images and videos that need to be processed, deep learning neural networks have become a popular and essential tool for solving various problems, such as classification, detection, and regression. New deep architectures have led to advances that require fewer computational resources and deliver more reliable results.

Therefore, in this Special Issue, we aim to present novel deep learning algorithms and their applications in various research fields. Submitted articles can cover various topics, including but not limited to deep learning, artificial intelligence, and the processing of large datasets from medical instruments, ground-based datasets, scientific experiments, and many other sources.

Dr. Haigang Gong
Guest Editor

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to this website. Once you are registered, go to the submission form. Manuscripts can be submitted until the deadline. All submissions that pass pre-check are peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the special issue website. Research articles, review articles, and short communications are invited. For planned papers, a title and short abstract (about 100 words) can be sent to the Editorial Office for announcement on this website.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. Applied Sciences is an international peer-reviewed open access semimonthly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 2400 CHF (Swiss Francs). Submitted papers should be well formatted and use good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Keywords

  • deep learning
  • artificial intelligence
  • reinforcement learning
  • image detection and segmentation
  • forecasting using deep networks
  • predictions using deep networks
  • applications of deep learning neural networks

Benefits of Publishing in a Special Issue

  • Ease of navigation: Grouping papers by topic helps scholars navigate broad scope journals more efficiently.
  • Greater discoverability: Special Issues support the reach and impact of scientific research. Articles in Special Issues are more discoverable and cited more frequently.
  • Expansion of research network: Special Issues facilitate connections among authors, fostering scientific collaborations.
  • External promotion: Articles in Special Issues are often promoted through the journal's social media, increasing their visibility.
  • Reprint: MDPI Books provides the opportunity to republish successful Special Issues in book format, both online and in print.

Further information on MDPI's Special Issue policies can be found here.

Published Papers (20 papers)


Research

17 pages, 5002 KiB  
Article
Research on Blueberry Maturity Detection Based on Receptive Field Attention Convolution and Adaptive Spatial Feature Fusion
by Bingqiang Huang, Zongyi Xie, Hanno Homann, Zhengshun Fei, Xinjian Xiang, Yongping Zheng, Guolong Zhang and Siqi Sun
Appl. Sci. 2025, 15(11), 6356; https://doi.org/10.3390/app15116356 - 5 Jun 2025
Viewed by 106
Abstract
Detecting small objects in complex outdoor conditions remains challenging. This paper proposes an improved version of YOLOv8n for the detection of blueberries in challenging outdoor scenarios, addressing feature extraction, small-target detection, and multi-scale feature fusion. Specifically, the C2F-RFAConv module is introduced to enhance spatial receptive-field learning, a P2-level detection layer is added for small and distant targets, and features are fused by a four-head adaptive spatial feature fusion detection head (Detect-FASFF). Additionally, the Focaler-CIoU loss is chosen to mitigate sample imbalance, accelerate convergence, and improve overall model performance. Experiments on our blueberry maturity dataset show that the proposed model outperforms YOLOv8n, achieving 2.8% higher precision, 4% higher recall, and a 4.5% increase in mAP@0.5, with an FPS of 80. It achieves 89.1%, 91.0%, and 85.5% AP for ripe, semi-ripe, and unripe blueberries, demonstrating robustness under varying lighting, occlusion, and distance conditions. Compared to other lightweight networks, the model offers superior accuracy and efficiency. Future work will focus on model compression for real-world deployment. Full article
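As a rough illustration of the Focaler-style IoU reweighting mentioned in the abstract, the sketch below linearly remaps the IoU into an interval [d, u] before forming the regression loss; the box format, the interval bounds, and the omission of the CIoU centre-distance and aspect-ratio terms are assumptions made for brevity, not details taken from the paper.

```python
import torch

def box_iou(box1, box2, eps=1e-7):
    """IoU for axis-aligned boxes in (x1, y1, x2, y2) format; both tensors are (N, 4)."""
    x1 = torch.max(box1[:, 0], box2[:, 0])
    y1 = torch.max(box1[:, 1], box2[:, 1])
    x2 = torch.min(box1[:, 2], box2[:, 2])
    y2 = torch.min(box1[:, 3], box2[:, 3])
    inter = (x2 - x1).clamp(0) * (y2 - y1).clamp(0)
    area1 = (box1[:, 2] - box1[:, 0]) * (box1[:, 3] - box1[:, 1])
    area2 = (box2[:, 2] - box2[:, 0]) * (box2[:, 3] - box2[:, 1])
    return inter / (area1 + area2 - inter + eps)

def focaler_iou_loss(pred, target, d=0.0, u=0.95):
    """Focaler-style loss: the raw IoU is linearly remapped onto [d, u] so that
    easy or hard samples can be emphasized before forming the 1 - IoU loss."""
    iou = box_iou(pred, target)
    iou_focaler = ((iou - d) / (u - d)).clamp(0.0, 1.0)
    return (1.0 - iou_focaler).mean()
```

In practice this remapped IoU would replace the plain IoU term inside the full CIoU-style regression loss of the detector.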

25 pages, 2652 KiB  
Article
YOLO-AFR: An Improved YOLOv12-Based Model for Accurate and Real-Time Dangerous Driving Behavior Detection
by Tianchen Ge, Bo Ning and Yiwu Xie
Appl. Sci. 2025, 15(11), 6090; https://doi.org/10.3390/app15116090 - 28 May 2025
Viewed by 408
Abstract
Accurate detection of dangerous driving behaviors is crucial for improving the safety of intelligent transportation systems. However, existing methods often struggle with limited feature extraction capabilities and insufficient attention to multiscale and contextual information. To overcome these limitations, we propose YOLO-AFR (YOLO with Adaptive Feature Refinement) for dangerous driving behavior detection. YOLO-AFR builds upon the YOLOv12 architecture and introduces three key innovations: (1) the redesign of the original A2C2f module by introducing a Feature-Refinement Feedback Network (FRFN), resulting in a new A2C2f-FRFN structure that adaptively refines multiscale features, (2) the integration of self-calibrated convolution (SC-Conv) modules in the backbone to enhance multiscale contextual modeling, and (3) the employment of a SEAM-based detection head to improve global contextual awareness and prediction accuracy. These three modules combine to form a Calibration-Refinement Loop, which progressively reduces redundancy and enhances discriminative features layer by layer. We evaluate YOLO-AFR on two public driver behavior datasets, YawDD-E and SfdDD. Experimental results show that YOLO-AFR significantly outperforms the baseline YOLOv12 model, achieving improvements of 1.3% and 1.8% in mAP@0.5, and 2.6% and 12.3% in mAP@0.5:0.95 on the YawDD-E and SfdDD datasets, respectively, demonstrating its superior performance in complex driving scenarios while maintaining high inference speed. Full article

19 pages, 8115 KiB  
Article
Research on Seamless Fabric Defect Detection Based on Improved YOLOv8n
by Qin Sun, Bernd Noche, Zongyi Xie and Bingqiang Huang
Appl. Sci. 2025, 15(5), 2728; https://doi.org/10.3390/app15052728 - 4 Mar 2025
Viewed by 791
Abstract
This paper proposes an improved YOLOv8n model for seamless fabric defect detection to address the current issues of defect inspection in factories. The improvement first introduces the SPPF_LSKA module, which not only optimizes the extraction of multi-scale features but also enhances the adaptability of the model in detecting defects of different sizes by improving the feature fusion mechanism, enabling efficient recognition of both large and small defects. Secondly, the CARAFE upsampling method adaptively learns the relationships between pixels, reducing information loss and improving the reconstruction quality of feature maps, which is crucial for capturing the complex textures and subtle defects of seamless fabrics. In addition, a small-object detection layer is added, which particularly improves detection accuracy for small defects and overcomes the limitations of traditional models on high-density fabrics. Finally, integrating OREPA technology reduces computational complexity and redundant computation and accelerates the training process by optimizing the model structure. The experimental results show that the precision, recall, and mAP@0.5 of the model on the seamless fabric defect dataset improved by 7.3%, 8.5%, and 5.1%, respectively, compared to the baseline model YOLOv8n. Future research aims to further explore the application of the model in practical scenarios and complete the actual deployment of the seamless fabric defect detection system. Full article

29 pages, 4731 KiB  
Article
AYOLO: Development of a Real-Time Object Detection Model for the Detection of Secretly Cultivated Plants
by Ali Yılmaz, Yüksel Yurtay and Nilüfer Yurtay
Appl. Sci. 2025, 15(5), 2718; https://doi.org/10.3390/app15052718 - 4 Mar 2025
Viewed by 1003
Abstract
AYOLO introduces a novel fusion architecture that integrates unsupervised learning techniques with Vision Transformers, using the YOLO series models as its foundation. This enables the effective utilization of rich, unlabeled data and establishes a new pretraining methodology tailored to YOLO architectures. On a custom dataset comprising 80 images of poppy plants, AYOLO achieved an Average Precision (AP) of 38.7% while maintaining a real-time inference speed of 239 FPS (frames per second) on a Tesla K80 GPU, with its feature fusion combining spatial and semantic information across scales. This performance surpasses the previous state-of-the-art YOLOv6-3.0 by +2.2% AP while retaining comparable speed. AYOLO exemplifies the potential of integrating advanced information fusion techniques with supervised pretraining, significantly enhancing precision and efficiency for object detection models optimized for small, specialized datasets. Full article

18 pages, 2403 KiB  
Article
Random Forest-Based Stability Prediction Modeling of Closed Wall for Goaf
by Yong Yang, Kepeng Hou, Huafen Sun, Linning Guo and Yalei Zhe
Appl. Sci. 2025, 15(5), 2300; https://doi.org/10.3390/app15052300 - 21 Feb 2025
Cited by 1 | Viewed by 387
Abstract
To effectively mitigate the hazards posed by the blast waves of rock mass caving on closed walls during the mining process, a stability prediction method based on a random forest (RF) algorithm is proposed, which is designed to automatically identify key parameters. A machine learning model is developed using the algorithm, and its performance is evaluated through accuracy, precision, recall, and F1-score metrics. The probabilistic model of the objective function is constructed using the grid search hyperparameter optimization method, allowing for the selection of the most favorable hyperparameters for evaluation. The initial prediction accuracy of the RF algorithm model is 94.6%, indicating a strong predictive capability. Further adjustments to the base classifier, maximum depth, minimum number of leaves, and minimum number of samples enhance the model’s performance, resulting in an improved prediction accuracy of 95.9%. Finally, the optimized model is applied to predict the stability of the closed walls in the actual project, and the results are consistent with the on-site situation. This demonstrates that the random forest-based stability prediction model effectively forecasts the stability of closed walls in the actual project. Full article
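The grid-search-tuned random forest workflow described above can be reproduced in outline with scikit-learn; the hyperparameter ranges and the synthetic data below are placeholders, not the study's actual geotechnical features or tuned values.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.metrics import accuracy_score, precision_score, recall_score, f1_score

# Placeholder data standing in for the closed-wall stability samples.
X, y = make_classification(n_samples=500, n_features=8, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)

# Grid search over the hyperparameters named in the abstract.
param_grid = {
    "n_estimators": [100, 200],       # number of base classifiers
    "max_depth": [None, 10, 20],      # maximum depth
    "min_samples_leaf": [1, 2, 4],    # minimum number of leaf samples
    "min_samples_split": [2, 5],      # minimum number of samples to split
}
search = GridSearchCV(RandomForestClassifier(random_state=0), param_grid, cv=5)
search.fit(X_train, y_train)

# Evaluate with the four metrics used in the paper.
y_pred = search.best_estimator_.predict(X_test)
print("accuracy ", accuracy_score(y_test, y_pred))
print("precision", precision_score(y_test, y_pred))
print("recall   ", recall_score(y_test, y_pred))
print("f1       ", f1_score(y_test, y_pred))
```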

20 pages, 2384 KiB  
Article
A Cross-Level Iterative Subtraction Network for Camouflaged Object Detection
by Tongtong Hu, Chao Zhang, Xin Lyu, Xiaowen Sun, Shangjing Chen, Tao Zeng and Jiale Chen
Appl. Sci. 2024, 14(17), 8063; https://doi.org/10.3390/app14178063 - 9 Sep 2024
Viewed by 993
Abstract
Camouflaged object detection (COD) is a challenging task, aimed at segmenting objects that are similar in color and texture to their background. Sufficient multi-scale feature fusion is crucial for accurately segmenting object regions. However, most methods usually focus on information compensation, overlooking the difference between features, which is important for distinguishing the object from the background. To this end, we propose the cross-level iterative subtraction network (CISNet), which integrates information from cross-layer features and enhances details through iteration mechanisms. CISNet involves a cross-level iterative structure (CIS) for feature complementarity, where texture information is used to enrich high-level features and semantic information is used to enhance low-level features. In particular, we present a multi-scale strip convolution subtraction (MSCSub) module within CIS to extract difference information between cross-level features and fuse multi-scale features, which improves the feature representation and guides accurate segmentation. Furthermore, an enhanced guided attention (EGA) module is presented to refine features by deeply mining local context information and capturing a broader range of relationships between different feature maps in a top-down manner. Extensive experiments conducted on four benchmark datasets demonstrate that our model outperforms the state-of-the-art COD models in all evaluation metrics. Full article

17 pages, 9324 KiB  
Article
Integrating Spatio-Temporal Graph Convolutional Networks with Convolutional Neural Networks for Predicting Short-Term Traffic Speed in Urban Road Networks
by Seung Bae Jeon and Myeong-Hun Jeong
Appl. Sci. 2024, 14(14), 6102; https://doi.org/10.3390/app14146102 - 12 Jul 2024
Cited by 3 | Viewed by 1983
Abstract
The rapid expansion of large urban areas underscores the critical importance of road infrastructure. An accurate understanding of traffic flow on road networks is essential for enhancing civil services and reducing fuel consumption. However, traffic flow is influenced by a complex array of factors and perpetually changing conditions, making comprehensive prediction of road network behavior challenging. Recent research has leveraged deep learning techniques to identify and forecast traffic flow and road network conditions, enhancing prediction accuracy by extracting key features from diverse factors. In this study, we performed short-term traffic speed predictions for road networks using data from Mobileye sensors mounted on taxis in Daegu City, Republic of Korea. These sensors capture the road network flow environment and the driver’s intentions. Utilizing these data, we integrated convolutional neural networks (CNNs) with spatio-temporal graph convolutional networks (STGCNs). Our experimental results demonstrated that the combined STGCN and CNN model outperformed the standalone STGCN and CNN models. The findings of this study contribute to the advancement of short-term traffic speed prediction models, thereby improving road network flow management. Full article

16 pages, 3987 KiB  
Article
Deep Learning Realizes Photoacoustic Imaging Artifact Removal
by Ruonan He, Yi Chen, Yufei Jiang, Yuyang Lei, Shengxian Yan, Jing Zhang and Hui Cao
Appl. Sci. 2024, 14(12), 5161; https://doi.org/10.3390/app14125161 - 13 Jun 2024
Viewed by 1706
Abstract
Photoacoustic imaging integrates the strengths of optics and ultrasound, offering high resolution, depth penetration, and multimodal imaging capabilities. Practical constraints of instrumentation and geometry limit the number of available acoustic sensors and their "view" of the imaging target, which results in image reconstruction artifacts that degrade image quality. To address this problem, YOLOv8-Pix2Pix is proposed as a hybrid artifact-removal algorithm, which is advantageous in comprehensively eliminating various types of artifacts and effectively restoring image details compared to existing algorithms. The proposed algorithm demonstrates superior performance in artifact removal and segmentation of photoacoustic images of brain tumors. To further expand its application fields and align with actual clinical needs, an experimental photoacoustic detection system is designed and used for verification in this paper. The experimental results show that the processed images surpass the unprocessed images in the reconstruction metrics PSNR and SSIM, and segmentation performance is also significantly improved, providing an effective solution for the further development of photoacoustic imaging technology. Full article
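For readers who want a concrete handle on the reconstruction metrics reported above, a minimal sketch using scikit-image is shown below; the random arrays merely stand in for a reference image and a network-processed photoacoustic image.

```python
import numpy as np
from skimage.metrics import peak_signal_noise_ratio, structural_similarity

# Placeholder arrays standing in for a reference reconstruction and a
# network-processed photoacoustic image (pixel values in [0, 1]).
rng = np.random.default_rng(0)
reference = rng.random((256, 256))
processed = np.clip(reference + 0.05 * rng.standard_normal((256, 256)), 0.0, 1.0)

psnr = peak_signal_noise_ratio(reference, processed, data_range=1.0)
ssim = structural_similarity(reference, processed, data_range=1.0)
print(f"PSNR = {psnr:.2f} dB, SSIM = {ssim:.4f}")
```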

13 pages, 2153 KiB  
Article
A Lightweight Method for Graph Neural Networks Based on Knowledge Distillation and Graph Contrastive Learning
by Yong Wang and Shuqun Yang
Appl. Sci. 2024, 14(11), 4805; https://doi.org/10.3390/app14114805 - 2 Jun 2024
Cited by 2 | Viewed by 1710
Abstract
Graph neural networks (GNNs) are crucial tools for processing non-Euclidean data. However, due to scalability issues caused by the dependency and topology of graph data, deploying GNNs in practical applications is challenging. Some methods aim to address this issue by transferring GNN knowledge to MLPs through knowledge distillation. However, distilled MLPs cannot directly capture graph structure information and rely only on node features, resulting in poor performance and sensitivity to noise. To solve this problem, we propose a lightweight optimization method for GNNs that combines graph contrastive learning and variable-temperature knowledge distillation. First, we use graph contrastive learning to capture graph structural representations, enriching the input information for the MLP. Then, we transfer GNN knowledge to the MLP using variable temperature knowledge distillation. Additionally, we enhance both node content and structural features before inputting them into the MLP, thus improving its performance and stability. Extensive experiments on seven datasets show that the proposed KDGCL model outperforms baseline models in both transductive and inductive settings; in particular, the KDGCL model achieves an average improvement of 1.63% in transductive settings and 0.8% in inductive settings when compared to baseline models. Furthermore, KDGCL maintains parameter efficiency and inference speed, making it competitive in terms of performance. Full article
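The variable-temperature knowledge distillation step can be sketched as a standard softened-KL objective whose temperature changes over training; the linear schedule and the weighting between soft and hard terms below are assumptions for illustration rather than the exact KDGCL formulation.

```python
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels, T, alpha=0.5):
    """Soft-target KL term (scaled by T^2, as is standard) plus a hard-label term."""
    soft = F.kl_div(
        F.log_softmax(student_logits / T, dim=-1),
        F.softmax(teacher_logits / T, dim=-1),
        reduction="batchmean",
    ) * (T * T)
    hard = F.cross_entropy(student_logits, labels)
    return alpha * soft + (1.0 - alpha) * hard

def temperature(epoch, num_epochs, t_start=4.0, t_end=1.0):
    """A simple linear temperature schedule standing in for the paper's
    variable-temperature strategy (the exact schedule is an assumption)."""
    return t_start + (t_end - t_start) * epoch / max(num_epochs - 1, 1)
```

Here the teacher would be the trained GNN and the student the MLP that receives the contrastively enriched node features.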

12 pages, 1385 KiB  
Article
Heterogeneous Graph-Convolution-Network-Based Short-Text Classification
by Jiwei Hua, Debing Sun, Yanxiang Hu, Jiayu Wang, Shuquan Feng and Zhaoyang Wang
Appl. Sci. 2024, 14(6), 2279; https://doi.org/10.3390/app14062279 - 8 Mar 2024
Cited by 11 | Viewed by 1821
Abstract
With the development of online interactive media platforms, a large amount of short text has appeared on the internet. Determining how to classify these short texts efficiently and accurately is of great significance. Graph neural networks can capture information dependencies across the entire short-text corpus, thereby enhancing feature expression and improving classification accuracy. However, existing works have overlooked the role of entities in these short texts. In this paper, we propose a heterogeneous graph-convolution-network-based short-text classification (SHGCN) method that integrates heterogeneous graph convolutional networks of texts, entities, and words. Firstly, the model constructs a graph network of the text and extracts entity nodes and word nodes. Secondly, the relationships among graph nodes in the heterogeneous graphs are determined by the mutual information between words, the relationship between documents and words, and the confidence between words and entities. Then, word features are represented through the word graph, combined with their BERT embeddings, and strengthened through a BiLSTM. Finally, the enhanced word features are combined with the document graph representation features to predict the document categories. To verify the performance of the model, experiments were conducted on the public datasets AGNews, R52, and MR. The classification accuracy of SHGCN reached 88.38%, 93.87%, and 82.87%, respectively, which is superior to that of some existing advanced classification methods. Full article

14 pages, 4362 KiB  
Article
Unraveling Convolution Neural Networks: A Topological Exploration of Kernel Evolution
by Lei Yang, Mengxue Xu and Yunan He
Appl. Sci. 2024, 14(5), 2197; https://doi.org/10.3390/app14052197 - 6 Mar 2024
Viewed by 1685
Abstract
Convolutional Neural Networks (CNNs) have become essential in deep learning applications, especially in computer vision, yet their complex internal mechanisms pose significant challenges to interpretability, crucial for ethical applications. Addressing this, our paper explores CNNs by examining their topological changes throughout the learning process, specifically employing persistent homology, a core method within Topological Data Analysis (TDA), to observe the dynamic evolution of their structure. This approach allows us to identify consistent patterns in the topological features of CNN kernels, particularly through shifts in Betti curves, which is a key concept in TDA. Our analysis of these Betti curves, initially focusing on the zeroth and first Betti numbers (respectively referred to as Betti-0 and Betti-1, which denote the number of connected components and loops), reveals insights into the learning dynamics of CNNs and potentially indicates the effectiveness of the learning process. We also discover notable differences in topological structures when CNNs are trained on grayscale versus color datasets, indicating the need for more extensive parameter space adjustments in color image processing. This study not only enhances the understanding of the intricate workings of CNNs but also contributes to bridging the gap between their complex operations and practical, interpretable applications. Full article

13 pages, 3072 KiB  
Article
A Lithology Recognition Network Based on Attention and Feature Brownian Distance Covariance
by Dake Zheng, Shudong Liu, Yidan Chen and Boyu Gu
Appl. Sci. 2024, 14(4), 1501; https://doi.org/10.3390/app14041501 - 12 Feb 2024
Cited by 3 | Viewed by 1263
Abstract
In the context of mountain tunnel mining through the drilling and blasting method, the recognition of lithology from palm face images is crucial for the comprehensive analysis of geological conditions and the prevention of geological risks. However, the complexity of the background in the acquired palm face images, coupled with an insufficient data sample size, poses challenges. While the incorporation of deep learning technology has enhanced lithology recognition accuracy, issues persist, including inadequate feature extraction and suboptimal recognition accuracy. To address these challenges, this paper proposes a lithology recognition network integrating attention mechanisms and a feature Brownian distance covariance approach. Drawing inspiration from the Brownian distance covariance concept, a feature Brownian distance covariance module is devised to enhance the network's attention to rock sample features and improve classification accuracy. Furthermore, an enhanced lightweight Convolutional Block Attention Module is introduced, with upgrades to the multilayer perceptron in the channel attention module. These improvements emphasize attention to lithological features while mitigating interference from background information. The proposed method was evaluated on a dataset comprising field-collected images of a tunnel rock face. The results illustrate a significant enhancement in the improved model's ability to recognize rock images, as evidenced by improvements across all objective evaluation metrics. The achieved accuracy rate of 97.60% surpasses that of current mainstream lithology recognition neural networks. Full article
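The channel-attention part of the Convolutional Block Attention Module that the paper upgrades can be sketched as follows; this is the standard CBAM form with a shared reduction MLP and does not reproduce the paper's specific multilayer-perceptron improvements.

```python
import torch
import torch.nn as nn

class ChannelAttention(nn.Module):
    """CBAM-style channel attention: global average- and max-pooled descriptors
    are passed through a shared MLP, summed, and squashed into per-channel weights."""
    def __init__(self, channels: int, reduction: int = 16):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(channels, channels // reduction),
            nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, c, _, _ = x.shape
        avg = self.mlp(x.mean(dim=(2, 3)))              # (B, C) from average pooling
        mx = self.mlp(x.amax(dim=(2, 3)))               # (B, C) from max pooling
        scale = torch.sigmoid(avg + mx).view(b, c, 1, 1)
        return x * scale
```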

13 pages, 2928 KiB  
Article
Biogas Production Prediction Based on Feature Selection and Ensemble Learning
by Shurong Peng, Lijuan Guo, Yuanshu Li, Haoyu Huang, Jiayi Peng and Xiaoxu Liu
Appl. Sci. 2024, 14(2), 901; https://doi.org/10.3390/app14020901 - 20 Jan 2024
Viewed by 1663
Abstract
The allocation of biogas between power generation and heat supply in traditional kitchen waste power generation systems is often unreasonable; for this reason, a biogas production prediction method based on feature selection and heterogeneous ensemble learning is proposed. Firstly, the working principle of the kitchen-waste-based biogas generation system is analyzed, the relationship between system features and biogas production is mined, and the important features are extracted. Secondly, the prediction performance of different individual learner models is comprehensively analyzed, and the training set is partitioned with K-fold cross-validation to reduce the risk of overfitting. Finally, different primary learners and meta learners are selected according to prediction error and diversity indices, and the learners are fused to construct a two-layer stacking ensemble learning model. The experimental results show that the proposed method achieves higher accuracy in predicting biogas production, providing supporting data for the economic planning of kitchen waste power generation systems. Full article
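A two-layer stacking ensemble with K-fold out-of-fold training, as described above, can be assembled directly in scikit-learn; the particular primary learners, meta learner, and synthetic data below are placeholders rather than the configuration selected in the study.

```python
from sklearn.datasets import make_regression
from sklearn.ensemble import StackingRegressor, RandomForestRegressor, GradientBoostingRegressor
from sklearn.linear_model import Ridge
from sklearn.model_selection import KFold, cross_val_score

# Placeholder features standing in for the selected biogas-system variables.
X, y = make_regression(n_samples=400, n_features=6, noise=0.1, random_state=0)

# Heterogeneous primary learners feeding a meta learner; the stacker trains the
# meta learner on out-of-fold predictions from 5-fold CV, which limits overfitting.
stack = StackingRegressor(
    estimators=[
        ("rf", RandomForestRegressor(n_estimators=100, random_state=0)),
        ("gbr", GradientBoostingRegressor(random_state=0)),
    ],
    final_estimator=Ridge(),
    cv=KFold(n_splits=5, shuffle=True, random_state=0),
)
scores = cross_val_score(stack, X, y, cv=5, scoring="r2")
print("mean R^2:", scores.mean())
```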

24 pages, 5037 KiB  
Article
Soft Generative Adversarial Network: Combating Mode Collapse in Generative Adversarial Network Training via Dynamic Borderline Softening Mechanism
by Wei Li and Yongchuan Tang
Appl. Sci. 2024, 14(2), 579; https://doi.org/10.3390/app14020579 - 9 Jan 2024
Cited by 4 | Viewed by 2174
Abstract
In this paper, we propose the Soft Generative Adversarial Network (SoftGAN), a strategy that utilizes a dynamic borderline softening mechanism to train Generative Adversarial Networks. This mechanism aims to solve the mode collapse problem and enhance the training stability of the generated outputs. Within the SoftGAN, the objective of the discriminator is to learn a fuzzy concept of real data with a soft borderline between real and generated data. This objective is achieved by balancing the principles of maximum concept coverage and maximum expected entropy of fuzzy concepts. During the early training stage of the SoftGAN, the principle of maximum expected entropy of fuzzy concepts guides the learning process due to the significant divergence between the generated and real data. However, in the final stage of training, the principle of maximum concept coverage dominates as the divergence between the two distributions decreases. The dynamic borderline softening mechanism of the SoftGAN can be likened to a student (the generator) striving to create realistic images, with a tutor (the discriminator) dynamically guiding the student in the right direction and motivating effective learning: the tutor sets encouragement or requirements appropriate to the student's ability at each stage, prompting steady improvement. Our approach offers both theoretical and practical benefits for improving GAN training. We empirically demonstrate the superiority of the SoftGAN approach in addressing mode collapse and generating high-quality outputs compared to existing approaches. Full article

16 pages, 8580 KiB  
Article
Enhanced YOLOv8 with BiFPN-SimAM for Precise Defect Detection in Miniature Capacitors
by Ning Li, Tianrun Ye, Zhihua Zhou, Chunming Gao and Ping Zhang
Appl. Sci. 2024, 14(1), 429; https://doi.org/10.3390/app14010429 - 3 Jan 2024
Cited by 16 | Viewed by 6681
Abstract
In the domain of automatic visual inspection for miniature capacitor quality control, accurately detecting defects presents a formidable challenge. This challenge stems primarily from the small size and limited sample availability of defective micro-capacitors, which leads to issues such as reduced detection accuracy and increased false-negative rates in existing inspection methods. To address these challenges, this paper proposes an enhanced 'you only look once' version 8 (YOLOv8) architecture specifically tailored to the intricate task of micro-capacitor defect inspection. At the heart of this methodology is the merging of the bidirectional feature pyramid network (BiFPN) architecture with the simplified attention module (SimAM), which greatly improves the model's capacity to recognize fine features and enriches its feature representation. Furthermore, the model's capacity for generalization was significantly improved by the addition of the weighted intersection over union (WISE-IOU) loss function. A micro-capacitor surface defect (MCSD) dataset comprising 1358 images representing four distinct types of micro-capacitor defects was constructed. The experimental results showed that our approach achieved 95.8% mean average precision (mAP) at a threshold of 0.5, a notable 9.5% enhancement over the original YOLOv8 architecture, underscoring the effectiveness of our approach for the automatic visual inspection of miniature capacitors. Full article
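SimAM is a published, parameter-free attention module with a closed-form energy, so it can be sketched in a few lines of PyTorch; the epsilon value below is a conventional default, and the point at which the module is inserted into the YOLOv8/BiFPN pipeline is not shown.

```python
import torch
import torch.nn as nn

class SimAM(nn.Module):
    """Parameter-free SimAM attention: a per-position energy is computed from the
    spatial mean/variance of each channel and used to reweight activations."""
    def __init__(self, eps: float = 1e-4):
        super().__init__()
        self.eps = eps

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, c, h, w = x.shape
        n = h * w - 1
        d = (x - x.mean(dim=(2, 3), keepdim=True)).pow(2)   # squared deviation
        v = d.sum(dim=(2, 3), keepdim=True) / n              # channel variance
        e_inv = d / (4 * (v + self.eps)) + 0.5                # inverse energy
        return x * torch.sigmoid(e_inv)
```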

24 pages, 4332 KiB  
Article
Multi-Path Routing Algorithm Based on Deep Reinforcement Learning for SDN
by Yi Zhang, Lanxin Qiu, Yangzhou Xu, Xinjia Wang, Shengjie Wang, Agyemang Paul and Zhefu Wu
Appl. Sci. 2023, 13(22), 12520; https://doi.org/10.3390/app132212520 - 20 Nov 2023
Cited by 4 | Viewed by 3739
Abstract
Software-Defined Networking (SDN) enhances network control but faces Distributed Denial of Service (DDoS) attacks due to centralized control and flow-table constraints in network devices. To overcome this limitation, we introduce a multi-path routing algorithm for SDN called Trust-Based Proximal Policy Optimization (TBPPO). TBPPO incorporates a Kullback–Leibler (KL) divergence trust value and a node diversity mechanism as the security assessment criteria, aiming to mitigate issues such as network fluctuations, low robustness, and congestion, with a particular emphasis on countering DDoS attacks. To avoid routing loops, unlike the conventional 'next hop' routing decision methodology, we implemented an enhanced Depth-First Search (DFS) approach that pre-computes path sets from which the best path is selected. To optimize routing efficiency, we introduced an improved Proximal Policy Optimization (PPO) algorithm based on deep reinforcement learning, focused on optimizing multi-path routing while considering security, network delay, and variations in multi-path delays. TBPPO outperforms traditional methods in the Germany-50 evaluation, reducing average delay by 20%, cutting delay variation by 50%, and leading in trust value by 0.5, improving security and routing efficiency in SDN. TBPPO provides a practical and effective solution to enhance SDN security and routing efficiency. Full article
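The KL-divergence-based trust value can be illustrated with a small NumPy sketch that compares a node's observed traffic distribution against its expected baseline; the mapping from divergence to a trust score and the example histograms are assumptions for illustration only, not the paper's exact definition.

```python
import numpy as np

def kl_divergence(p, q, eps=1e-12):
    """D_KL(P || Q) for discrete distributions, with smoothing to avoid log(0)."""
    p = np.asarray(p, dtype=float) + eps
    q = np.asarray(q, dtype=float) + eps
    p /= p.sum()
    q /= q.sum()
    return float(np.sum(p * np.log(p / q)))

# Trust drops as a node's observed traffic histogram drifts from its baseline.
baseline = [0.25, 0.25, 0.25, 0.25]   # expected share of traffic per flow class
observed = [0.70, 0.10, 0.10, 0.10]   # skewed profile, e.g. under a DDoS flood
trust = 1.0 / (1.0 + kl_divergence(observed, baseline))
print(f"trust value: {trust:.3f}")
```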

12 pages, 393 KiB  
Article
Pipelined Stochastic Gradient Descent with Taylor Expansion
by Bongwon Jang, Inchul Yoo and Dongsuk Yook
Appl. Sci. 2023, 13(21), 11730; https://doi.org/10.3390/app132111730 - 26 Oct 2023
Cited by 2 | Viewed by 1617
Abstract
Stochastic gradient descent (SGD) is an optimization method typically used in deep learning to train deep neural network (DNN) models. In recent studies for DNN training, pipeline parallelism, a type of model parallelism, is proposed to accelerate SGD training. However, since SGD is inherently sequential, naively implemented pipeline parallelism introduces the weight inconsistency and the delayed gradient problems, resulting in reduced training efficiency. In this study, we propose a novel method called TaylorPipe to alleviate these problems. The proposed method generates multiple model replicas to solve the weight inconsistency problem, and adopts a Taylor expansion-based gradient prediction algorithm to mitigate the delayed gradient problem. We verified the efficiency of the proposed method using the VGG-16 and the ResNet-34 on the CIFAR-10 and CIFAR-100 datasets. The experimental results show that not only the training time is reduced by up to 2.7 times but also the accuracy of TaylorPipe is comparable with that of SGD. Full article
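One common way to realize Taylor-expansion-based gradient prediction for delayed gradients is a first-order correction with a cheap diagonal curvature surrogate, sketched below; this is an illustrative approximation under stated assumptions and may differ from the exact predictor used in TaylorPipe.

```python
import numpy as np

def predict_gradient(stale_grad, stale_weights, current_weights, lam=0.04):
    """First-order compensation of a delayed gradient: the Hessian-vector term of
    the Taylor expansion is approximated with the elementwise product g * g
    (a common cheap surrogate; lam is an assumed scaling factor)."""
    return stale_grad + lam * stale_grad * stale_grad * (current_weights - stale_weights)

# Toy update showing where the prediction slots into pipelined SGD.
lr = 0.1
w_stale = np.zeros(4)                                 # weights when the gradient was computed
g_stale = np.array([0.5, -0.2, 0.1, 0.3])             # gradient computed on stale weights
w_now = w_stale - lr * np.array([0.4, -0.1, 0.2, 0.2])  # weights have since moved on
g_pred = predict_gradient(g_stale, w_stale, w_now)    # predicted gradient at current weights
w_next = w_now - lr * g_pred
print(w_next)
```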

19 pages, 4711 KiB  
Article
Prediction of Wind Power with Machine Learning Models
by Ömer Ali Karaman
Appl. Sci. 2023, 13(20), 11455; https://doi.org/10.3390/app132011455 - 19 Oct 2023
Cited by 40 | Viewed by 11039
Abstract
Wind power is a vital power grid component, and wind power forecasting represents a challenging task. In this study, a series of multiobjective predictive models were created utilising a range of cutting-edge machine learning (ML) methodologies, namely, artificial neural networks (ANNs), recurrent neural networks (RNNs), convolutional neural networks (CNNs), and long short-term memory (LSTM) networks. Two independent data sets were combined and used to predict wind power. The first data set contained internal values such as wind speed (m/s), wind direction (°), theoretical power (kW), and active power (kW). The second data set contained external meteorological values that can affect the wind power forecast. The k-nearest neighbours (kNN) algorithm completed the missing data in the data set. The results showed that the LSTM, RNN, CNN, and ANN algorithms were powerful in forecasting wind power. The performance of these models was further evaluated using statistical indicators of performance deviation, including the coefficient of determination (R2), root mean square error (RMSE), mean absolute error (MAE), and mean square error (MSE), to effectively demonstrate the efficacy of the employed methodology. When these metrics are examined, the ANN, RNN, CNN, and LSTM methods all forecast wind power effectively; however, the LSTM model is the most successful, estimating wind power with an R2 value of 0.9574, an MAE of 0.0209, an MSE of 0.0038, and an RMSE of 0.0614. Full article
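The preprocessing and evaluation pipeline described above (kNN imputation of missing values followed by R2/MAE/MSE/RMSE scoring) can be outlined with scikit-learn; the toy arrays below are placeholders for the turbine and meteorological records, not the study's data.

```python
import numpy as np
from sklearn.impute import KNNImputer
from sklearn.metrics import r2_score, mean_absolute_error, mean_squared_error

# Toy matrix: wind speed (m/s), wind direction (deg), theoretical power (kW),
# with a missing entry filled by k-nearest neighbours as in the abstract.
X = np.array([
    [5.1, 180.0, 320.0],
    [6.0, 175.0, np.nan],
    [5.8, 182.0, 400.0],
    [7.2, 178.0, 610.0],
])
X_filled = KNNImputer(n_neighbors=2).fit_transform(X)

# Deviation metrics used to compare the ANN/RNN/CNN/LSTM forecasts.
y_true = np.array([0.42, 0.55, 0.61, 0.48])   # normalised actual power
y_pred = np.array([0.40, 0.57, 0.59, 0.50])   # normalised forecast
mse = mean_squared_error(y_true, y_pred)
print("R2:", r2_score(y_true, y_pred), "MAE:", mean_absolute_error(y_true, y_pred),
      "MSE:", mse, "RMSE:", np.sqrt(mse))
```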

18 pages, 12281 KiB  
Article
Lane Line Type Recognition Based on Improved YOLOv5
by Boyu Liu, Hao Wang, Yongqiang Wang, Congling Zhou and Lei Cai
Appl. Sci. 2023, 13(18), 10537; https://doi.org/10.3390/app131810537 - 21 Sep 2023
Cited by 3 | Viewed by 2806
Abstract
The recognition of lane line type plays an important role in the perception of advanced driver assistance systems (ADAS). In actual driving on roads, the variety of lane line types and complex road conditions present significant challenges to ADAS. To address this problem, this paper proposes an improved YOLOv5 method for recognising lane line types. This method can accurately and quickly identify the types of lane lines and shows good recognition results in harsh environments. The main strategy of this method includes the following steps: first, the FasterNet lightweight network is introduced into all the concentrated-comprehensive convolution (C3) modules in the network to accelerate inference and reduce the number of parameters. Then, the efficient channel attention (ECA) mechanism is integrated into the backbone network to extract image feature information and improve the model's detection accuracy. Finally, the SIoU loss function is used to replace the original generalised intersection over union (GIoU) loss function to further enhance the robustness of the model. In experiments, the improved YOLOv5s algorithm achieves 95.1% mAP@0.5 at 95.2 FPS, which satisfies the accuracy and real-time requirements of ADAS. The model has only 6M parameters and a size of 11.7 MB, so it can easily be embedded into ADAS without requiring huge computing power. Meanwhile, the improved algorithms increase the accuracy and speed of the YOLOv5m, YOLOv5l, and YOLOv5x models to different degrees, and the appropriate model can be selected according to the actual situation. This plays a practical role in improving the safety of ADAS. Full article
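The efficient channel attention (ECA) block added to the backbone is a published module that replaces the SE-style fully connected bottleneck with a 1-D convolution over channel descriptors; a minimal PyTorch sketch is given below, with the kernel size left as a tunable assumption rather than the paper's setting.

```python
import torch
import torch.nn as nn

class ECA(nn.Module):
    """Efficient channel attention: a 1-D convolution across the pooled channel
    descriptor produces per-channel gating weights without dimensionality reduction."""
    def __init__(self, kernel_size: int = 3):
        super().__init__()
        self.conv = nn.Conv1d(1, 1, kernel_size, padding=kernel_size // 2, bias=False)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, c, _, _ = x.shape
        y = x.mean(dim=(2, 3))                       # (B, C) global average pooling
        y = self.conv(y.unsqueeze(1)).squeeze(1)     # 1-D conv across channels
        return x * torch.sigmoid(y).view(b, c, 1, 1)
```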

14 pages, 1028 KiB  
Article
CDF-LS: Contrastive Network for Emphasizing Feature Differences with Fusing Long- and Short-Term Interest Features
by Kejian Liu, Wei Wang, Rongju Wang, Xuran Cui, Liying Zhang, Xianzhi Yuan and Xianyong Li
Appl. Sci. 2023, 13(13), 7627; https://doi.org/10.3390/app13137627 - 28 Jun 2023
Cited by 1 | Viewed by 1366
Abstract
Modelling both long- and short-term user interests from historical data is crucial for generating accurate recommendations. However, unifying these metrics across multiple application domains can be challenging, and existing approaches often rely on complex, intertwined models which can be difficult to interpret. To address this issue, we propose a lightweight, plug-and-play interest enhancement module that fuses interest vectors from two independent models. After analyzing the dataset, we identify deviations in the recommendation performance of long- and short-term interest models. To compensate for these differences, we use feature enhancement and loss correction during training. In the fusion process, we explicitly split long-term interest features with longer duration into multiple local features. We then use a shared attention mechanism to fuse multiple local features with short-term interest features to obtain interaction features. To correct for bias between models, we introduce a comparison learning task that monitors the similarity between local features, short-term features, and interaction features. This adaptively reduces the distance between similar features. Our proposed module combines and compares multiple independent long-term and short-term interest models on multiple domain datasets. As a result, it not only accelerates the convergence of the models but also achieves outstanding performance in challenging recommendation scenarios. Full article
