Search Results (6)

Search Parameters:
Keywords = mixup global attention

16 pages, 1572 KiB  
Article
A Lightweight Semantic Segmentation Model for Underwater Images Based on DeepLabv3+
by Chongjing Xiao, Zhiyu Zhou and Yanjun Hu
J. Imaging 2025, 11(5), 162; https://doi.org/10.3390/jimaging11050162 - 19 May 2025
Cited by 1 | Viewed by 708
Abstract
Underwater object image processing is a crucial technology for marine environmental exploration. The complexity of marine environments typically results in underwater object images exhibiting color deviation, imbalanced contrast, and blurring. Existing semantic segmentation methods for underwater objects either suffer from low segmentation accuracy or fail to meet the lightweight requirements of underwater hardware. To address these challenges, this study proposes a lightweight semantic segmentation model based on DeepLabv3+. The framework employs MobileOne-S0 as the lightweight backbone for feature extraction, integrates the Simple, Parameter-Free Attention Module (SimAM) into deep feature layers, replaces global average pooling in the Atrous Spatial Pyramid Pooling (ASPP) module with strip pooling, and adopts a content-guided attention (CGA)-based mixup fusion scheme to combine high-level and low-level features effectively while minimizing parameter redundancy. Experimental results demonstrate that the proposed model achieves a mean Intersection over Union (mIoU) of 71.18% on the DUT-USEG dataset, with parameters and computational complexity reduced to 6.628 M and 39.612 G FLOPs, respectively. These advancements significantly enhance segmentation accuracy while maintaining model efficiency, making the model highly suitable for resource-constrained underwater applications.
(This article belongs to the Section Image and Video Processing)
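The SimAM module this abstract relies on is parameter-free and compact enough to sketch directly. Below is a minimal PyTorch rendering of the published SimAM energy formulation, not code from this paper; the `eps` stabilizer value is an assumption.

```python
import torch
import torch.nn as nn

class SimAM(nn.Module):
    """Simple, Parameter-Free Attention Module (SimAM) sketch.

    Weights each activation by a closed-form energy term measuring its
    deviation from the per-channel spatial mean; no learnable parameters.
    """
    def __init__(self, eps: float = 1e-4):
        super().__init__()
        self.eps = eps  # numerical stabilizer (assumed value)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (B, C, H, W); n = number of spatial positions minus one
        _, _, h, w = x.shape
        n = h * w - 1
        # squared deviation from the per-channel spatial mean
        d = (x - x.mean(dim=(2, 3), keepdim=True)).pow(2)
        # per-channel variance estimate
        v = d.sum(dim=(2, 3), keepdim=True) / n
        # inverse energy: larger for more distinctive activations
        e_inv = d / (4 * (v + self.eps)) + 0.5
        return x * torch.sigmoid(e_inv)
```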

22 pages, 6129 KiB  
Article
A Novel Machine Vision-Based Collision Risk Warning Method for Unsignalized Intersections on Arterial Roads
by Zhongbin Luo, Yanqiu Bi, Qing Ye, Yong Li and Shaofei Wang
Electronics 2025, 14(6), 1098; https://doi.org/10.3390/electronics14061098 - 11 Mar 2025
Cited by 1 | Viewed by 880
Abstract
To address the critical need for collision risk warning at unsignalized intersections, this study proposes an advanced predictive system combining YOLOv8 for object detection, Deep SORT for tracking, and Bi-LSTM networks for trajectory prediction. To adapt YOLOv8 to complex intersection scenarios, several architectural enhancements were incorporated. The RepLayer module replaced the original C2f module in the backbone, integrating large-kernel depthwise separable convolution to better capture contextual information in cluttered environments. The GIoU loss function was introduced to improve bounding box regression accuracy, mitigating missed and incorrect detections caused by occlusion and overlapping objects. Furthermore, a Global Attention Mechanism (GAM) was implemented in the neck network to better learn both location and semantic information, while the ReContext gradient composition feature pyramid replaced the traditional FPN, enabling more effective multi-scale object detection. Additionally, the CSPNet structure in the neck was substituted with Res-CSP, enhancing feature fusion flexibility and improving detection performance in complex traffic conditions. For tracking, the Deep SORT algorithm was optimized with enhanced appearance feature extraction, reducing identity switches caused by occlusions and ensuring stable tracking of vehicles, pedestrians, and non-motorized vehicles. The Bi-LSTM model was employed for trajectory prediction, capturing long-range dependencies to provide accurate forecasts of future positions. Collision risk was quantified using the predictive collision risk area (PCRA) method, which categorizes risk into three levels (danger, warning, and caution) based on predicted overlaps in trajectories. The dataset used for training consisted of 30,000 images annotated with bounding boxes around vehicles, pedestrians, and non-motorized vehicles. Data augmentation techniques such as Mosaic, Random_perspective, Mixup, HSV adjustments, Flipud, and Fliplr were applied to enrich the dataset and improve model robustness. In real-world testing, the system was deployed as part of the G310 highway safety project, where it achieved a mean Average Precision (mAP) of over 90% for object detection. Over a one-month period, 120 warning events involving vehicles, pedestrians, and non-motorized vehicles were recorded. Manual verification of the warnings indicated a prediction accuracy of 97%, demonstrating the system's reliability in identifying potential collisions and issuing timely warnings. This approach represents a significant advancement in safety at unsignalized intersections in urban traffic environments.
(This article belongs to the Special Issue Computer Vision and Image Processing in Machine Learning)
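The GIoU loss mentioned in this abstract has a standard closed form: IoU minus the normalized empty area of the smallest enclosing box. The following is a minimal PyTorch sketch of that published definition, assuming boxes in (x1, y1, x2, y2) format; it is not the authors' training code.

```python
import torch

def giou_loss(pred: torch.Tensor, target: torch.Tensor, eps: float = 1e-7) -> torch.Tensor:
    """GIoU loss for matched box pairs, each of shape (N, 4) in (x1, y1, x2, y2)."""
    # intersection rectangle
    x1 = torch.max(pred[:, 0], target[:, 0])
    y1 = torch.max(pred[:, 1], target[:, 1])
    x2 = torch.min(pred[:, 2], target[:, 2])
    y2 = torch.min(pred[:, 3], target[:, 3])
    inter = (x2 - x1).clamp(min=0) * (y2 - y1).clamp(min=0)

    area_p = (pred[:, 2] - pred[:, 0]) * (pred[:, 3] - pred[:, 1])
    area_t = (target[:, 2] - target[:, 0]) * (target[:, 3] - target[:, 1])
    union = area_p + area_t - inter
    iou = inter / (union + eps)

    # smallest box enclosing both
    cx1 = torch.min(pred[:, 0], target[:, 0])
    cy1 = torch.min(pred[:, 1], target[:, 1])
    cx2 = torch.max(pred[:, 2], target[:, 2])
    cy2 = torch.max(pred[:, 3], target[:, 3])
    c_area = (cx2 - cx1) * (cy2 - cy1)

    giou = iou - (c_area - union) / (c_area + eps)
    return (1 - giou).mean()
```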

30 pages, 7517 KiB  
Article
MixCFormer: A CNN–Transformer Hybrid with Mixup Augmentation for Enhanced Finger Vein Attack Detection
by Zhaodi Wang, Shuqiang Yang, Huafeng Qin, Yike Liu and Junqiang Wang
Electronics 2025, 14(2), 362; https://doi.org/10.3390/electronics14020362 - 17 Jan 2025
Cited by 2 | Viewed by 1261
Abstract
Finger vein recognition has gained significant attention for its importance in enhancing security, safeguarding privacy, and ensuring reliable liveness detection. As a foundation of vein recognition systems, vein detection faces challenges including low feature extraction efficiency, limited robustness, and a heavy reliance on real-world data. Additionally, environmental variability and advances in spoofing technologies further exacerbate data privacy and security concerns. To address these challenges, this paper proposes MixCFormer, a hybrid CNN–transformer architecture that incorporates Mixup data augmentation to improve the accuracy of finger vein liveness detection and reduce dependency on large-scale real datasets. First, the MixCFormer model applies baseline drift elimination, morphological filtering, and Butterworth filtering to minimize the impact of background noise and illumination variations, enhancing the clarity and recognizability of vein features. Next, finger vein video data are transformed into feature sequences, optimizing feature extraction and matching efficiency, capturing dynamic time-series information, and improving discrimination between live and forged samples. Furthermore, Mixup data augmentation expands sample diversity and decreases dependency on extensive real datasets, enhancing the model's ability to recognize forged samples across diverse attack scenarios. Finally, the hybrid CNN–transformer architecture leverages both local and global feature extraction to capture vein feature correlations and dependencies, while residual connections improve feature propagation and stabilize feature representations for liveness detection. Rigorous experimental evaluations demonstrate that MixCFormer achieves a detection accuracy of 99.51% on finger vein datasets, significantly outperforming existing methods.
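Mixup, the augmentation at the heart of MixCFormer's training recipe, is simple enough to sketch generically. The snippet below is a batch-level mixup in PyTorch, assuming a Beta(α, α) mixing coefficient with α = 0.2; it illustrates the technique rather than this paper's exact pipeline.

```python
import torch

def mixup_batch(x: torch.Tensor, y: torch.Tensor, alpha: float = 0.2):
    """Blend each sample with a randomly paired one from the same batch.

    Returns the mixed inputs, both label sets, and the mixing weight.
    """
    # draw the mixing coefficient from Beta(alpha, alpha)
    lam = torch.distributions.Beta(alpha, alpha).sample().item()
    idx = torch.randperm(x.size(0))
    x_mixed = lam * x + (1 - lam) * x[idx]
    return x_mixed, y, y[idx], lam
```

The training loss is then combined in the same proportion, e.g. `lam * criterion(pred, y_a) + (1 - lam) * criterion(pred, y_b)`.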

20 pages, 4061 KiB  
Article
A Lightweight Crop Pest Classification Method Based on Improved MobileNet-V2 Model
by Hongxing Peng, Huiming Xu, Guanjia Shen, Huanai Liu, Xianlu Guan and Minhui Li
Agronomy 2024, 14(6), 1334; https://doi.org/10.3390/agronomy14061334 - 20 Jun 2024
Cited by 9 | Viewed by 1786
Abstract
This paper proposes PestNet, a lightweight method for classifying crop pests that improves upon MobileNet-V2 to address the high model complexity and low classification accuracy common in pest classification research. Firstly, the training phase employs the AdamW optimizer and mixup data augmentation to enhance the model's convergence and generalization capabilities. Secondly, the Adaptive Spatial Group-Wise Enhanced (ASGE) attention mechanism is introduced and integrated into the inverted residual blocks of the MobileNet-V2 model, boosting the model's ability to extract both local and global pest information. Additionally, a dual-branch feature fusion module is developed using convolutional kernels of varying sizes to enhance classification performance for pests of different scales under real-world conditions. Lastly, the model's activation function and overall architecture are optimized to reduce complexity. Experimental results on a proprietary pest dataset show that PestNet achieves a classification accuracy of 87.62% and an F1 score of 86.90%, improvements of 4.20 and 5.86 percentage points, respectively, over the baseline model. Moreover, PestNet's parameter count and floating-point operations are reduced by 14.10% and 37.50%, respectively, compared to the baseline. Compared with ResNet-50, MobileNet V3-Large, and EfficientNet-B1, PestNet requires fewer parameters and floating-point operations while delivering higher pest classification accuracy.
(This article belongs to the Section Precision and Digital Agriculture)
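The dual-branch fusion idea described above, parallel convolutions with different kernel sizes whose outputs are concatenated, can be sketched in a few lines. This is an illustrative PyTorch module under assumed kernel sizes of 3 and 5, not the PestNet implementation.

```python
import torch
import torch.nn as nn

class DualBranchFusion(nn.Module):
    """Two parallel conv branches with different receptive fields,
    concatenated to mix fine and coarse pest features (illustrative)."""
    def __init__(self, c_in: int, c_out: int):
        super().__init__()
        # small kernel: fine detail for small pests
        self.fine = nn.Sequential(
            nn.Conv2d(c_in, c_out // 2, kernel_size=3, padding=1),
            nn.BatchNorm2d(c_out // 2),
            nn.ReLU(inplace=True),
        )
        # larger kernel: broader context for large pests
        self.coarse = nn.Sequential(
            nn.Conv2d(c_in, c_out // 2, kernel_size=5, padding=2),
            nn.BatchNorm2d(c_out // 2),
            nn.ReLU(inplace=True),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return torch.cat([self.fine(x), self.coarse(x)], dim=1)
```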

16 pages, 4472 KiB  
Article
BA-YOLO for Object Detection in Satellite Remote Sensing Images
by Kuilin Wang and Zhenze Liu
Appl. Sci. 2023, 13(24), 13122; https://doi.org/10.3390/app132413122 - 9 Dec 2023
Cited by 8 | Viewed by 3346
Abstract
In recent years, there has been significant progress in object detection on natural images. However, satellite remote sensing images remain challenging due to large scale variations and complex background interference, and directly applying conventional object detection models yields unsatisfactory results. To address these challenges, this paper introduces BA-YOLO, an improved version of the YOLOv8 object detection model incorporating several notable enhancements. Firstly, to fuse a larger number of features more effectively, we introduced the design concept of the higher-performing Bi-directional Feature Pyramid Network (BiFPN). Secondly, to retain sufficient global contextual information, we integrated a module that combines multi-head self-attention and convolutional networks. Finally, we employed data augmentation techniques such as Mixup, Cutout, Mosaic, and multi-scale training to enhance the model's accuracy and robustness. Experimental results on the DOTA dataset demonstrate that BA-YOLO outperforms state-of-the-art detectors, achieving a mean average precision (mAP) of 0.722.
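The BiFPN concept BA-YOLO borrows centers on fast normalized fusion: each input feature map gets a learnable non-negative weight, normalized by the sum of weights. A minimal sketch of that fusion step follows, assuming the inputs have already been resized to a common shape; this is not BA-YOLO's actual neck code.

```python
import torch
import torch.nn as nn

class FastNormalizedFusion(nn.Module):
    """BiFPN-style fusion: out = sum(w_i * x_i) / (sum(w_i) + eps),
    with each w_i kept non-negative via ReLU."""
    def __init__(self, n_inputs: int, eps: float = 1e-4):
        super().__init__()
        self.weights = nn.Parameter(torch.ones(n_inputs))
        self.eps = eps

    def forward(self, xs):
        # xs: list of feature maps with identical shapes
        w = torch.relu(self.weights)
        w = w / (w.sum() + self.eps)
        return sum(wi * x for wi, x in zip(w, xs))
```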

16 pages, 1671 KiB  
Article
MAFormer: A New Method for Radar Reflectivity Reconstructing Using Satellite Data
by Kuoyin Wang, Yan Huang, Tingzhao Yu, Yu Chen, Zhimin Li and Qiuming Kuang
Atmosphere 2023, 14(12), 1723; https://doi.org/10.3390/atmos14121723 - 23 Nov 2023
Cited by 2 | Viewed by 1736
Abstract
Radar reflectivity plays a crucial role in detecting heavy rainfall and is an important tool for meteorological analysis. However, the coverage of a single radar is limited, leading to the use of satellite data as a complementary source, and bridging the gap between radar and satellite data has become a growing research focus. In this paper, we present MAFormer, a novel model for reconstructing radar reflectivity from satellite data within the Transformer framework. MAFormer consists of two modules, the Axial Local Attention Module and the Mixup Global Attention Module, which extract local saliency and global similarity, respectively. Quantitative and qualitative experiments demonstrate the effectiveness of the proposed method: compared to state-of-the-art deep learning techniques, MAFormer improves the Heidke skill score by 0.01 to 0.05 and reduces false alarm rates by approximately 0.016 to 0.04, highlighting its accuracy and reliability.
(This article belongs to the Section Atmospheric Techniques, Instruments, and Modeling)
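The abstract does not specify the internals of the Axial Local Attention and Mixup Global Attention modules, but the axial-attention idea they build on (attending along image rows and then columns, keeping cost linear in each axis) can be sketched generically. Below is an illustrative PyTorch version under that assumption; it is not the MAFormer modules themselves.

```python
import torch
import torch.nn as nn

class AxialAttention(nn.Module):
    """Self-attention applied along rows, then columns, of a feature map.
    A generic sketch of the axial idea, not MAFormer's exact modules."""
    def __init__(self, dim: int, heads: int = 4):
        super().__init__()
        self.row_attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.col_attn = nn.MultiheadAttention(dim, heads, batch_first=True)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (B, H, W, C), channels-last for easy axis reshaping
        b, h, w, c = x.shape
        # attend along each row (sequence length W)
        rows = x.reshape(b * h, w, c)
        rows, _ = self.row_attn(rows, rows, rows)
        x = rows.reshape(b, h, w, c)
        # attend along each column (sequence length H)
        cols = x.permute(0, 2, 1, 3).reshape(b * w, h, c)
        cols, _ = self.col_attn(cols, cols, cols)
        return cols.reshape(b, w, h, c).permute(0, 2, 1, 3)
```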
