Search Results (64)

Search Parameters:
Keywords = shortcut connection

23 pages, 9482 KB  
Article
A Hybrid End-to-End Dual Path Convolutional Residual LSTM Model for Battery SOH Estimation
by Azadeh Gholaminejad, Arta Mohammad-Alikhani and Babak Nahid-Mobarakeh
Batteries 2025, 11(12), 449; https://doi.org/10.3390/batteries11120449 - 6 Dec 2025
Viewed by 459
Abstract
Accurate estimation of battery state of health is essential for ensuring safety, supporting fault diagnosis, and optimizing the lifetime of electric vehicles. This study proposes a compact dual-path architecture that combines Convolutional Neural Networks with Convolutional Long Short-Term Memory (ConvLSTM) units to jointly extract spatial and temporal degradation features from charge-cycle voltage and current measurements. Residual and inter-path connections enhance gradient flow and feature fusion, while a three-channel preprocessing strategy aligns cycle lengths and isolates padded regions, improving learning stability. Operating end-to-end, the model eliminates the need for handcrafted features and does not rely on discharge data or temperature measurements, enabling practical deployment in minimally instrumented environments. The model is evaluated on the NASA battery aging dataset under two scenarios: Same-Battery Evaluation and Leave-One-Battery-Out Cross-Battery Generalization. It achieves average RMSE values of 1.26% and 2.14%, converging within 816 and 395 epochs, respectively. An ablation study demonstrates that the dual-path design, ConvLSTM units, residual shortcuts, inter-path exchange, and preprocessing pipeline each contribute to accuracy, stability, and reduced training cost. With only 4913 parameters, the architecture remains robust to variations in initial capacity, cutoff voltage, and degradation behavior. Edge deployment on an NVIDIA Jetson AGX Orin confirms real-time feasibility, achieving 2.24 ms latency, 8.24 MB memory usage, and 12.9 W active power, supporting use in resource-constrained battery management systems.

26 pages, 946 KB  
Article
Optimizing IoMT Security: Performance Trade-Offs Between Neural Network Architectural Design, Dimensionality Reduction, and Class Imbalance Handling
by Heyfa Ammar and Asma Cherif
IoT 2025, 6(4), 74; https://doi.org/10.3390/iot6040074 - 29 Nov 2025
Viewed by 377
Abstract
The proliferation of Internet of Medical Things (IoMT) devices in healthcare requires robust intrusion detection systems to protect sensitive data and ensure patient safety. While existing neural network-based Intrusion Detection Systems have shown considerable effectiveness, significant challenges persist, particularly class imbalance and high data dimensionality. Although various approaches have been proposed to mitigate these issues, their actual impact on detection accuracy remains insufficiently explored. This study investigates advanced Artificial Neural Network (ANN) architectures and preprocessing strategies for intrusion detection in IoMT environments, addressing the challenges of feature dimensionality and class imbalance. Leveraging WUSTL-EHMS-2020, a dataset designed specifically for IoMT cybersecurity research, this work systematically examines the performance of multiple neural network designs. We implement and evaluate five distinct ANN architectures: the Standard Feedforward Network, the Enhanced Channel ANN, Dual-Branch Addition and Concatenation ANNs, and the Shortcut Connection ANN. To mitigate class imbalance, we compare three balancing approaches: the Synthetic Minority Over-sampling Technique (SMOTE), Hybrid Over-Under Sampling, and the Weighted Cross-Entropy Loss Function. SMOTE-based models achieved average AUC scores ranging from 0.8491 to 0.8766. Hybrid sampling strategies improved performance, with AUC increasing to 0.8750. The weighted cross-entropy loss function demonstrated the most consistent performance. The most significant finding comes from the Dual-Branch ANN with addition operations and a weighted loss function, which achieved 0.9403 Accuracy, 0.8786 AUC, a 0.8716 F1-Score, 0.8650 Precision, and 0.8786 Recall. Compared to the related work’s baseline, it increases the F1-Score by 8.45% and improves AUC and Recall by 18.67%, highlighting the model’s superiority in identifying potential security threats and minimizing false negatives.
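The weighted cross-entropy loss named in this abstract can be illustrated with a minimal sketch. The inverse-frequency weighting scheme, the function names, and the toy labels and probabilities below are assumptions made for illustration; the abstract does not state how the authors chose their class weights.

```python
import math
from collections import Counter


def inverse_frequency_weights(labels):
    """Assumed weighting scheme: weight_c = N / (num_classes * count_c)."""
    counts = Counter(labels)
    n, k = len(labels), len(counts)
    return {c: n / (k * counts[c]) for c in counts}


def weighted_cross_entropy(probs, labels, weights):
    """Weighted mean of -w[y] * log(p[y]) over all samples.

    probs  : list of per-sample class-probability lists
    labels : list of integer class indices
    weights: dict mapping class index -> weight
    """
    total = sum(weights[y] * -math.log(p[y]) for p, y in zip(probs, labels))
    norm = sum(weights[y] for y in labels)
    return total / norm


# Hypothetical toy data: class 1 (attack traffic) is the minority class.
labels = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1]
probs = [[0.9, 0.1]] * 8 + [[0.6, 0.4]] * 2

weights = inverse_frequency_weights(labels)
weighted = weighted_cross_entropy(probs, labels, weights)
unweighted = weighted_cross_entropy(probs, labels, {0: 1.0, 1: 1.0})
```

With these toy values the poorly predicted minority samples dominate the weighted loss, so `weighted` is larger than `unweighted`; that is the effect a weighted loss relies on to counter class imbalance.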

18 pages, 2452 KB  
Article
Enhanced FISH Image Classification via CBAM-PPM-Optimized ResNet50 for Precision Cytogenetic Diagnosis
by Zhiling Li, Wenjia Li, Yang Zhou and Liu Wang
Sensors 2025, 25(22), 6951; https://doi.org/10.3390/s25226951 - 13 Nov 2025
Viewed by 528
Abstract
To address the low efficiency and high subjectivity of manual interpretation in fluorescence in situ hybridization (FISH) tissue and cell images, this study proposes an intelligent FISH image classification model based on an improved ResNet50 architecture. By analyzing the characteristics of multi-channel fluorescence signals and the bottlenecks of clinical interpretation, a Convolutional Block Attention Module (CBAM) is introduced to enhance the representation of salient fluorescence features through dual channel–spatial attention mechanisms. A Pyramid Pooling Module (PPM) is integrated to fuse multi-scale contextual information, improving the detection accuracy of small targets such as microdeletions. Furthermore, the shortcut connections in residual blocks are optimized to reduce feature loss. To mitigate the limitation of insufficient annotated samples, transfer learning is employed, combined with a focal loss function to enhance classification performance under class-imbalanced conditions. Experiments conducted on a clinical dataset of 12,000 FISH images demonstrate that the proposed model achieves an overall classification accuracy of 92.4%, representing a 9.9% improvement over the original ResNet50. The recall rate for complex categories (e.g., translocation and fusion) exceeds 90.7%, with an inference time of 22.3 ms per sample, meeting the real-time requirements of clinical diagnosis. These results provide an efficient and practical solution for the automated intelligent interpretation of FISH images, offering significant potential for precision-assisted diagnosis of tumors and genetic disorders.
(This article belongs to the Section Biomedical Sensors)
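The focal loss mentioned in this abstract down-weights well-classified samples so that hard, often minority-class, samples dominate training. The sketch below uses the standard per-sample form, -alpha * (1 - p_t)^gamma * log(p_t); the values gamma = 2 and alpha = 1 and the example probabilities are assumptions, not settings reported by the authors.

```python
import math


def cross_entropy(p_true):
    """Plain cross-entropy for one sample, given the probability of the true class."""
    return -math.log(p_true)


def focal_loss(p_true, gamma=2.0, alpha=1.0):
    """Focal loss for one sample: -alpha * (1 - p_t)**gamma * log(p_t)."""
    return -alpha * (1.0 - p_true) ** gamma * math.log(p_true)


# Hypothetical predicted probabilities of the true class.
easy, hard = 0.95, 0.30

# The modulating factor (1 - p_t)**gamma scales the cross-entropy term.
ratio_easy = focal_loss(easy) / cross_entropy(easy)  # (1 - 0.95)**2
ratio_hard = focal_loss(hard) / cross_entropy(hard)  # (1 - 0.30)**2
```

With gamma set to 0 the modulating factor is 1 and the focal loss reduces to ordinary cross-entropy.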

16 pages, 3810 KB  
Article
Array-Patterned Anisotropic Conductive Films for High Precision Circuit Interconnection
by Changxiang Hao, Junde Chen, Yonghao Chen, Ge Cao, Xing Cheng and Yanqing Tian
Materials 2025, 18(21), 4927; https://doi.org/10.3390/ma18214927 - 28 Oct 2025
Viewed by 734
Abstract
Anisotropic conductive films (ACFs) are widely used for circuit interconnection because they are easy to use, bond at low temperature, offer higher precision than soldering, and are eco-friendly. However, current ACFs are generally prepared by randomly distributing conductive particles in a suitable resin. ACFs prepared in this way risk causing short circuits when used for high-precision bonding (<100 μm). To alleviate this problem, we designed and prepared a new kind of ACF with conductive particles well aligned in the adhesive film, named array-patterned ACFs (A-ACFs). A template with 12 μm periodic microcavities was prepared and used to load 5.4 μm silver-coated polystyrene particles. Through a series of process optimizations, including the number of particle-filling cycles and the pressure and temperature used to transfer the particles into the polyurethane (PU) adhesive, well-aligned particles with a spacing of 6.6 μm in the PU film were obtained. The A-ACFs were used to bond two flexible printed circuits (FPCs), with spacings of 200 μm (FPC-200) and 40 μm (FPC-40). The bonding conditions, including temperature and pressure, for the FPC-200 connections were investigated in detail. The connecting resistance, insulation resistance, peeling force, and particle morphologies between the bonded FPCs were investigated. The reliability of the two bonded FPCs was tested at 85 °C and 85% relative humidity. The results show that the new A-ACFs are suitable for high-precision circuit bonding and achieve better accuracy than traditional ACFs (T-ACFs). Thus, this study may offer new insight into the design of A-ACFs, which have great potential for applications in high-precision devices.
(This article belongs to the Special Issue Reinforced Polymer Composites with Natural and Nano Fillers)

25 pages, 1426 KB  
Article
Advanced Probabilistic Roadmap Path Planning with Adaptive Sampling and Smoothing
by Mateusz Ambrożkiewicz, Bartłomiej Bonar, Tomasz Buratowski and Piotr Małka
Electronics 2025, 14(19), 3804; https://doi.org/10.3390/electronics14193804 - 25 Sep 2025
Cited by 1 | Viewed by 1182
Abstract
Probabilistic roadmap (PRM) methods are widely used for robot navigation in both 2D and 3D environments; however, a major drawback is that the raw paths tend to be jagged. Executing a trajectory along such paths can lead to significant overshoots and tight turns, making it difficult to achieve a near-optimal solution under motion constraints. This paper presents an enhanced PRM-based path planning approach designed to improve path quality and computational efficiency. The method integrates advanced sampling strategies, adaptive neighbor selection with spatial data structures, and multi-stage path post-processing. In particular, shortcut smoothing and polynomial fitting are used to generate smoother trajectories suitable for motion-constrained robots. The proposed hybrid sampling scheme biases sample generation toward critical regions (near obstacles, in narrow passages, and between the start and goal) to improve graph connectivity in challenging areas. An adaptive k-d tree-based connection strategy then efficiently builds a roadmap using variable connection radii guided by PRM* theory. Once a path is found using an any-angle graph search, post-processing is applied to refine it. Unnecessary waypoints are removed via line-of-sight shortcuts, and the final trajectory is smoothed using a fitted polynomial curve. The resulting paths are shorter and exhibit gentler turns, making them more feasible for execution. In simulated complex scenarios, including narrow corridors and cluttered environments, the advanced PRM achieved a 100% success rate where standard PRM frequently failed. It also reduced calculation time to 30% of that of conventional methods and reduced the peak turning angle by up to 50%. The approach supports dynamic re-planning: when the environment changes, the roadmap is efficiently updated rather than rebuilt from scratch. Furthermore, the use of an adaptive k-d tree structure and incremental roadmap updates leads to an order-of-magnitude speedup in the connection phase. These improvements significantly increase the planner’s path quality, runtime performance, and reliability. Quantitative results are provided to substantiate the performance gains of the proposed method.
(This article belongs to the Special Issue Artificial Intelligence in Vision Modelling)
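The line-of-sight shortcutting step described in this abstract can be sketched as follows. The greedy "jump to the farthest visible waypoint" strategy, the circular obstacle model, and all names and coordinates are assumptions for illustration; the abstract does not specify the authors' exact procedure.

```python
import math


def segment_hits_circle(a, b, center, radius):
    """True if segment a-b passes within `radius` of `center`."""
    ax, ay = a
    bx, by = b
    cx, cy = center
    dx, dy = bx - ax, by - ay
    length_sq = dx * dx + dy * dy
    if length_sq == 0.0:
        t = 0.0
    else:
        # Project the center onto the segment and clamp to its endpoints.
        t = max(0.0, min(1.0, ((cx - ax) * dx + (cy - ay) * dy) / length_sq))
    px, py = ax + t * dx, ay + t * dy
    return math.hypot(cx - px, cy - py) < radius


def line_of_sight(a, b, obstacles):
    """True if the straight segment a-b avoids every (center, radius) obstacle."""
    return not any(segment_hits_circle(a, b, c, r) for c, r in obstacles)


def shortcut_path(path, obstacles):
    """Greedy pruning: from each kept waypoint, jump to the farthest visible one."""
    pruned = [path[0]]
    i = 0
    while i < len(path) - 1:
        j = len(path) - 1
        # Consecutive waypoints (j == i + 1) are assumed to be connected already.
        while j > i + 1 and not line_of_sight(path[i], path[j], obstacles):
            j -= 1
        pruned.append(path[j])
        i = j
    return pruned


def path_length(path):
    return sum(math.hypot(q[0] - p[0], q[1] - p[1]) for p, q in zip(path, path[1:]))


# Hypothetical scene: one circular obstacle and a jagged path that goes over it.
obstacles = [((2.0, 0.0), 1.0)]
raw_path = [(0.0, 0.0), (1.0, 1.5), (2.0, 1.5), (3.0, 1.5), (4.0, 0.0)]
pruned = shortcut_path(raw_path, obstacles)
```

In this toy scene the five raw waypoints are pruned to three, and the path becomes shorter. A polynomial fit, as the abstract describes, would be applied afterwards to round off the remaining corner.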

18 pages, 3577 KB  
Article
WT-ResNet: A Non-Destructive Method for Determining the Nitrogen, Phosphorus, and Potassium Content of Sugarcane Leaves Based on Leaf Image
by Cuimin Sun, Junyang Dou, Biao He, Yuxiang Cai and Chengwu Zou
Agriculture 2025, 15(16), 1752; https://doi.org/10.3390/agriculture15161752 - 15 Aug 2025
Viewed by 881
Abstract
Traditional nutritional diagnosis of the nitrogen, phosphorus, and potassium content of sugarcane leaves is inefficient, costly, and destructive. Non-destructive nutritional diagnosis of sugarcane leaves based on traditional machine learning and deep learning suffers from poor generalization and low accuracy. To address these issues, this study proposes a novel convolutional neural network called WT-ResNet. This model incorporates wavelet transform into the residual network structure, enabling effective feature extraction from sugarcane leaf images and facilitating the regression prediction of nitrogen, phosphorus, and potassium content in the leaves. By employing a cascade of decomposition and reconstruction, the wavelet transform extracts multi-scale features, capturing the different frequency components of the images. Through the use of shortcut connections, residual structures facilitate the learning of identity mappings within the model. The results show that, by analyzing sugarcane leaf images, the model achieves R² values of 0.9420 for nitrogen content prediction, 0.9084 for phosphorus content prediction, and 0.8235 for potassium content prediction. Prediction accuracy reaches 88.24% for nitrogen within a 0.5 tolerance, 58.82% for phosphorus within a 0.1 tolerance, and 70.59% for potassium within a 0.5 tolerance. Compared to other algorithms, WT-ResNet demonstrates higher accuracy. This study aims to provide algorithms for non-destructive sugarcane nutritional diagnosis and technical support for precise sugarcane fertilization.
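The statement that shortcut connections make identity mappings easy to learn can be seen in a minimal sketch of a residual block, y = x + F(x). Dense layers stand in for convolutions here, the activation usually applied after the addition is omitted, and all names and values are illustrative assumptions rather than the WT-ResNet implementation.

```python
def linear(x, weights, bias):
    """Dense layer: y_i = sum_j weights[i][j] * x[j] + bias[i]."""
    return [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weights, bias)]


def relu(x):
    return [max(0.0, v) for v in x]


def residual_block(x, w1, b1, w2, b2):
    """y = x + F(x), where F is linear -> ReLU -> linear."""
    fx = linear(relu(linear(x, w1, b1)), w2, b2)
    return [xi + fi for xi, fi in zip(x, fx)]


x = [1.0, -2.0, 0.5]
zero_w = [[0.0] * 3 for _ in range(3)]
zero_b = [0.0] * 3
eye = [[1.0 if i == j else 0.0 for j in range(3)] for i in range(3)]

# With all residual weights at zero, F(x) = 0 and the block is the identity.
y_identity = residual_block(x, zero_w, zero_b, zero_w, zero_b)

# With identity weights, F(x) = relu(x) and the block adds it to the input.
y_learned = residual_block(x, eye, zero_b, eye, zero_b)
```

Because the block only has to drive F toward zero to pass its input through unchanged, stacking many such blocks does not force the network to relearn the identity at every layer.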

37 pages, 2776 KB  
Article
Design of Identical Strictly and Rearrangeably Nonblocking Folded Clos Networks with Equally Sized Square Crossbars
by Yamin Li
Computers 2025, 14(7), 293; https://doi.org/10.3390/computers14070293 - 20 Jul 2025
Viewed by 1168
Abstract
Clos networks and their folded versions, fat trees, are widely adopted in interconnection network designs for data centers and supercomputers. There are two main types of Clos networks: strictly nonblocking Clos networks and rearrangeably nonblocking Clos networks. Strictly nonblocking Clos networks can connect an idle input to an idle output without interfering with existing connections. Rearrangeably nonblocking Clos networks can connect an idle input to an idle output with rearrangements of existing connections. Traditional strictly nonblocking Clos networks have two drawbacks. One drawback is the use of crossbars with different numbers of input and output ports, whereas the currently available switches are square crossbars with the same number of input and output ports. Another drawback is that every connection goes through a fixed number of stages, increasing the length of the communication path. A drawback of traditional fat trees is that the root stage uses differently sized crossbar switches than the other stages. To solve these problems, this paper proposes an Identical Strictly NonBlocking folded Clos (ISNBC) network that uses equally sized square crossbars for all switches. Correspondingly, this paper also proposes an Identical Rearrangeably NonBlocking folded Clos (IRNBC) network. Both ISNBC and IRNBC networks can have any number of stages, can use equally sized square crossbars with no unused switch ports, and can utilize shortcut connections to reduce communication path lengths. Moreover, both ISNBC and IRNBC networks have a lower switch crosspoint cost ratio relative to a single crossbar than their corresponding traditional Clos networks. Specifically, ISNBC networks use 46.43% to 87.71% of the crosspoints of traditional strictly nonblocking folded Clos networks, and IRNBC networks use 53.85% to 60.00% of the crosspoints of traditional rearrangeably nonblocking folded Clos networks.

18 pages, 9529 KB  
Article
Adaptive Temporal Action Localization in Video
by Zhiyu Xu, Zhuqiang Lu, Yong Ding, Liwei Tian and Suping Liu
Electronics 2025, 14(13), 2645; https://doi.org/10.3390/electronics14132645 - 30 Jun 2025
Viewed by 2581
Abstract
Temporal action localization aims to identify the boundaries of the action of interest in a video. Most existing methods take a two-stage approach: first, identify a set of action proposals; then, based on this set, determine the accurate temporal locations of the action of interest. However, the diversely distributed semantics of a video over time have not been well considered, which could compromise the localization performance, especially for ubiquitous short actions or events (e.g., a fall in healthcare and a traffic violation in surveillance). To address this problem, we propose a novel deep learning architecture, namely an adaptive template-guided self-attention network, to characterize the proposals adaptively with their relevant frames. An input video is segmented into temporal frames, within which the spatio-temporal patterns are formulated by a global–local Transformer-based encoder. Each frame is associated with a number of proposals of different lengths as their starting frame. Learnable templates for proposals of different lengths are introduced, and each template guides the sampling for proposals with a specific length. It formulates the probabilities for a proposal to form the representation of certain spatio-temporal patterns from its relevant temporal frames. Therefore, the semantics of a proposal can be formulated in an adaptive manner, and a feature map of all proposals can be appropriately characterized. To estimate the IoU of these proposals with ground truth actions, a two-level scheme is introduced. A shortcut connection is also utilized to refine the predictions by using the convolutions of the feature map from coarse to fine. Comprehensive experiments on two benchmark datasets demonstrate the state-of-the-art performance of our proposed method: 32.6% mAP@IoU 0.7 on THUMOS-14 and 9.35% mAP@IoU 0.95 on ActivityNet-1.3.
(This article belongs to the Special Issue Applications of Artificial Intelligence in Image and Video Processing)
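The IoU thresholds quoted in this abstract (0.7 and 0.95) refer to the temporal overlap between a proposal and a ground-truth action. A minimal sketch of that measure follows; the (start, end) interval representation in seconds and the example values are assumptions for illustration.

```python
def temporal_iou(proposal, ground_truth):
    """Intersection over union of two time intervals given as (start, end)."""
    (s1, e1), (s2, e2) = proposal, ground_truth
    intersection = max(0.0, min(e1, e2) - max(s1, s2))
    union = (e1 - s1) + (e2 - s2) - intersection
    return intersection / union if union > 0 else 0.0


def matches_at_threshold(proposal, ground_truth, threshold):
    """A proposal counts as correct if its IoU reaches the threshold."""
    return temporal_iou(proposal, ground_truth) >= threshold


# Hypothetical proposal and ground-truth action, in seconds.
proposal = (2.0, 6.0)
ground_truth = (3.0, 7.0)
iou = temporal_iou(proposal, ground_truth)  # 3 s overlap over a 5 s union
```

This proposal would count as a match at a threshold of 0.5 but not at 0.7, which is why mAP figures drop sharply as the IoU threshold rises.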

17 pages, 2010 KB  
Article
Gaze Estimation Network Based on Multi-Head Attention, Fusion, and Interaction
by Changli Li, Fangfang Li, Kao Zhang, Nenglun Chen and Zhigeng Pan
Sensors 2025, 25(6), 1893; https://doi.org/10.3390/s25061893 - 18 Mar 2025
Cited by 1 | Viewed by 2823
Abstract
Gaze is an externally observable indicator of human visual attention, and thus, recording the gaze position can help to solve many problems. Existing gaze estimation models typically utilize separate neural network branches to process data streams from both eyes and the face, failing to fully exploit their feature correlations. This study presents a gaze estimation network that integrates multi-head attention mechanisms, fusion, and interaction strategies to fuse facial features with eye features, as well as features from both eyes, separately. Specifically, multi-head attention and channel attention are used to fuse features from both eyes, and a face and eye interaction module is designed to highlight the most important facial features guided by the eye features; in addition, the maximum pooling in the channel attention of the Convolutional Block Attention Module (CBAM) is replaced with minimum pooling, and a shortcut connection is added to enhance the network’s attention to eye region details. Comparative experiments on three public datasets (Gaze360, MPIIFaceGaze, and EYEDIAP) validate the superiority of the proposed method.
(This article belongs to the Section Intelligent Sensors)

16 pages, 2334 KB  
Article
A Multi-Input Residual Network for Non-Destructive Prediction of Wood Mechanical Properties
by Jingchao Ma, Zhufang Kuang, Yixuan Fang and Jiahui Huang
Forests 2025, 16(2), 355; https://doi.org/10.3390/f16020355 - 16 Feb 2025
Cited by 1 | Viewed by 1400
Abstract
Modulus of elasticity (MOE) and modulus of rupture (MOR) are crucial indicators for assessing the application value of wood. However, traditional physical testing methods for the mechanical properties of wood are typically destructive, costly, and time-consuming. To efficiently assess these properties, this study proposes a multi-input residual network (MIRN) model, which integrates microscopic images of wood with physical density data and leverages deep learning technology for rapid and accurate predictions. By using larger convolution kernels to enhance the receptive field, the model captures fine microstructural features in the images. Batch normalization layers were removed from the ResNet architecture to reduce the number of parameters and improve training stability. Shortcut connections were utilized to enable deeper network architectures and address the vanishing gradient problem. Two types of residual blocks, convolutional block and identity block, were defined based on input dimensional changes. The MIRN method, based on multi-input residual networks, is proposed for non-destructive testing of wood mechanical properties. The experimental results show that MIRN outperforms convolutional neural networks (CNNs) and ResNet-50 in predicting MOE and MOR, with an R² of 0.95 for MOE and RMSE reduced to 46.88, as well as an R² of 0.85 for MOR and an RMSE of 0.44. Thus, this method offers an efficient and cost-effective tool for wood processing and quality control.
(This article belongs to the Section Wood Science and Forest Products)

14 pages, 4638 KB  
Article
LightVSR: A Lightweight Video Super-Resolution Model with Multi-Scale Feature Aggregation
by Guanglun Huang, Nachuan Li, Jianming Liu, Minghe Zhang, Li Zhang and Jun Li
Appl. Sci. 2025, 15(3), 1506; https://doi.org/10.3390/app15031506 - 1 Feb 2025
Cited by 1 | Viewed by 4150
Abstract
Video super-resolution aims to generate high-resolution video sequences with realistic details from existing low-resolution video sequences. However, most existing video super-resolution models require substantial computational power and are not suitable for resource-constrained devices such as smartphones and tablets. In this paper, we propose a lightweight video super-resolution (LightVSR) model that employs a novel feature aggregation module to enhance video quality by efficiently reconstructing high-resolution frames from compressed low-resolution inputs. LightVSR integrates several novel mechanisms, including head-tail convolution, cross-layer shortcut connections, and multi-input attention, to enhance computational efficiency while guaranteeing video super-resolution performance. Extensive experiments show that LightVSR achieves a frame rate of 28.57 FPS and a PSNR of 39.25 dB on the UDM10 dataset and 36.91 dB on the Vimeo-90k dataset, validating its efficiency and effectiveness.

27 pages, 12110 KB  
Article
Exploring the Impact of Additive Shortcuts in Neural Networks via Information Bottleneck-like Dynamics: From ResNet to Transformer
by Zhaoyan Lyu and Miguel R. D. Rodrigues
Entropy 2024, 26(11), 974; https://doi.org/10.3390/e26110974 - 14 Nov 2024
Cited by 2 | Viewed by 1801
Abstract
Deep learning has made significant strides, driving advances in areas like computer vision, natural language processing, and autonomous systems. In this paper, we investigate the role of additive shortcut connections in models such as ResNet, Vision Transformers (ViTs), and MLP-Mixers, where they are essential for enabling efficient information flow and mitigating optimization challenges such as vanishing gradients. In particular, building on our recent information bottleneck approach, we analyze how additive shortcuts influence the fitting and compression phases of training, which are crucial for generalization. We leverage Z-X and Z-Y measures as practical alternatives to mutual information for observing these dynamics in high-dimensional spaces. Our empirical results demonstrate that models with identity shortcuts (ISs) often skip the initial fitting phase and move directly into the compression phase, while non-identity shortcut (NIS) models follow the conventional two-phase process. Furthermore, we explore how IS models are still able to compress effectively, maintaining their generalization capacity despite bypassing the early fitting stages. These findings offer new insights into the dynamics of shortcut connections in neural networks, contributing to the optimization of modern deep learning architectures.
(This article belongs to the Section Information Theory, Probability and Statistics)

15 pages, 4276 KB  
Article
Spectrum Sensing Method Based on STFT-RADN in Cognitive Radio Networks
by Anyi Wang, Tao Zhu and Qifeng Meng
Sensors 2024, 24(17), 5792; https://doi.org/10.3390/s24175792 - 6 Sep 2024
Cited by 6 | Viewed by 2221
Abstract
To address the common issues in traditional convolutional neural network (CNN)-based spectrum sensing algorithms in cognitive radio networks (CRNs), including inadequate signal feature representation, inefficient utilization of feature map information, and limited feature extraction capabilities due to shallow network structures, this paper proposes a spectrum sensing algorithm based on a short-time Fourier transform (STFT) and residual attention dense network (RADN). Specifically, the RADN model improves the basic residual block and introduces the convolutional block attention module (CBAM), combining residual connections and dense connections to form a powerful deep feature extraction structure known as residual in dense (RID). This significantly enhances the network’s feature extraction capabilities. By performing STFT on the received signals and normalizing them, the signals are converted into time–frequency spectrograms as network inputs, better capturing signal features. The RADN is trained to extract abstract features from the time–frequency images, and the trained RADN serves as the final classifier for spectrum sensing. Experimental results demonstrate that the STFT-RADN spectrum sensing method significantly improves performance under low signal-to-noise ratio (SNR) conditions compared to traditional deep-learning-based methods. This method not only adapts to various modulation schemes but also exhibits high detection probability and strong robustness.
(This article belongs to the Special Issue Sensors for Enabling Wireless Spectrum Access)

12 pages, 2987 KB  
Article
A Lightweight Crop Pest Detection Method Based on Improved RTMDet
by Wanqing Wang and Haoyue Fu
Information 2024, 15(9), 519; https://doi.org/10.3390/info15090519 - 26 Aug 2024
Viewed by 1992
Abstract
To address the issues of low detection accuracy and large model parameters in crop pest detection in natural scenes, this study improves the deep learning object detection model and proposes a lightweight and accurate method, RTMDet++, for crop pest detection. First, the real-time object detection network RTMDet is utilized to design the pest detection model. Then, the backbone and neck structures are pruned to reduce the number of parameters and computation. Subsequently, a shortcut connection module is added to each of the classification and regression branches to enhance their feature learning capability, thereby improving accuracy. Experimental results show that, compared to the original model RTMDet, the improved model RTMDet++ reduces the number of parameters by 15.5% and the computation by 25.0%, and improves the mean average precision by 0.3% on the crop pest dataset IP102. The improved model RTMDet++ achieves a mAP of 94.1%, a precision of 92.5%, and a recall of 92.7% with 4.117M parameters and 3.130G computations, outperforming other object detection methods. The proposed model RTMDet++ achieves higher performance with fewer parameters and computations, which can be applied to crop pest detection in practice and aids in pest control research.

18 pages, 8861 KB  
Article
Foreign Object Detection Network for Transmission Lines from Unmanned Aerial Vehicle Images
by Bingshu Wang, Changping Li, Wenbin Zou and Qianqian Zheng
Drones 2024, 8(8), 361; https://doi.org/10.3390/drones8080361 - 30 Jul 2024
Cited by 13 | Viewed by 3198
Abstract
Foreign objects such as balloons and nests often lead to widespread power outages by coming into contact with transmission lines. The manual detection of these is labor-intensive work. Automatic foreign object detection on transmission lines is a crucial task for power safety and is becoming the mainstream method, but the lack of datasets is a restriction. In this paper, we propose an advanced model termed YOLOv8 Network with Bidirectional Feature Pyramid Network (YOLOv8_BiFPN) to detect foreign objects on power transmission lines. Firstly, we add a weighted cross-scale connection structure to the detection head of the YOLOv8 network. The structure is bidirectional. It provides interaction between low-level and high-level features, and allows information to spread across feature maps of different scales. Secondly, in comparison to the traditional concatenation and shortcut operations, our method integrates information between different scale features through weighted settings. Moreover, we created a dataset of Foreign Object detection on Transmission Lines from a Drone-view (FOTL_Drone). It consists of 1495 annotated images with six types of foreign object. To our knowledge, FOTL_Drone stands out as the most comprehensive dataset in the field of foreign object detection on transmission lines, which encompasses a wide array of geographic features and diverse types of foreign object. Experimental results showcase that YOLOv8_BiFPN achieves an average precision of 90.2% and an mAP@.50 of 0.896 across various categories of foreign objects, surpassing other models.
