Search Results (415)

Search Parameters:
Keywords = learning-based features detectors

18 pages, 4195 KB  
Article
WeldSimAM and EnNWD Co-Optimization: Enhancing Lightweight YOLOv11 for Multi-Scale Weld Defect Detection
by Wenquan Huang, Qing Cheng and Jing Zhu
Technologies 2026, 14(3), 140; https://doi.org/10.3390/technologies14030140 - 26 Feb 2026
Viewed by 171
Abstract
In the context of Industry 4.0, reliable automatic inspection of weld surface defects is critical for structural safety, yet current deep learning-based detectors struggle with the extreme scale variation and anisotropic shapes characteristic of weld flaws such as pores, cracks, and lack of fusion. Existing YOLO-family models, although effective on general-purpose datasets, often fail to robustly localize tiny defects and long, slender discontinuities while remaining lightweight enough for industrial edge deployment. A critical research gap lies in the lack of task-specific optimization for weld defects: standard attention mechanisms are isotropic and cannot capture linear defect continuity, while existing loss functions ignore the scale disparity between tiny pores (area < 100 pixels²) and large incomplete-fusion defects (area > 5000 pixels²), leading to unstable regression. Here, we propose a dual-optimized lightweight YOLOv11 framework tailored for weld defect detection that addresses both feature representation and bounding-box regression. First, we introduce WeldSimAM, an enhanced attention module that augments parameter-free SimAM with directional (horizontal/vertical) and channel-wise enhancement to better capture the directional texture of linear weld defects. Second, we develop an Enhanced Normalized Wasserstein Distance (EnNWD) loss, which incorporates scale-disparity penalties and relative-area-based weighting to mitigate sample imbalance and improve regression accuracy for tiny and large-aspect-ratio targets. Validated via 10-fold cross-validation on three datasets (one self-built and two public), the method achieves 99.48% mAP@0.5 and 73.29% mAP@0.5:0.95, outperforming YOLOv11 by 0.13 and 3.76 percentage points (p < 0.01, two-tailed t-test), with a 5.21 MB model size and 132 FPS on an NVIDIA RTX 4090. It also surpasses non-YOLO SOTA methods (e.g., EfficientDet-Lite3) by 3.8–5.5 percentage points in mAP@0.5 (p < 0.05), offering a practical real-time solution for industrial inspection.
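WeldSimAM builds on parameter-free SimAM, which scores each activation by an energy function of its squared deviation from the mean and maps that energy through a sigmoid. A minimal 1D sketch of that base mechanism (the paper's directional and channel-wise enhancements are not reproduced; `lam` is an assumed regularization constant):

```python
import math

def simam_weights(x, lam=1e-4):
    """Parameter-free SimAM-style attention for a 1D feature vector.

    Each activation's energy grows with its squared deviation from the
    mean; a sigmoid maps energy to a (0, 1) attention weight, which
    rescales the activation.
    """
    n = len(x)
    mu = sum(x) / n
    var = sum((v - mu) ** 2 for v in x) / n
    out = []
    for v in x:
        energy = (v - mu) ** 2 / (4.0 * (var + lam)) + 0.5
        weight = 1.0 / (1.0 + math.exp(-energy))  # sigmoid
        out.append(v * weight)
    return out
```

Activations that deviate strongly from the local mean (e.g., a defect response against smooth weld texture) receive larger weights, which is the intuition the module exploits.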
(This article belongs to the Section Manufacturing Technology)

11 pages, 1220 KB  
Proceeding Paper
Enhanced GNSS Threat Detection: On-Edge Statistical Approach with Crowdsourced Measurements and Fuzzy Logic Decision-Making
by Eustachio Roberto Matera, Olivier Lagrange and Maxime Olivier
Eng. Proc. 2026, 126(1), 18; https://doi.org/10.3390/engproc2026126018 - 24 Feb 2026
Viewed by 168
Abstract
Global Navigation Satellite Systems are vulnerable to jamming and spoofing threats, compromising several critical applications. Existing detection methods based on hardware solutions (antenna array, spectrogram) are low-latency and accurate but require expensive hardware, while machine learning solutions are the most effective but require extensive training and lack adaptability. This work proposes an edge-based, statistical threat detector using crowdsourced GNSS data and fuzzy logic to integrate multiple anomaly indicators. A key feature is a C/N0-based crowdsourcing metric. Experiments show detection precision up to 88% for jamming and 97% for spoofing, with false positive rates around 1–2% and an average detection time of 10 s.
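The fuzzy-logic decision stage fuses several anomaly indicators without hard thresholds. A toy Mamdani-style sketch with assumed triangular memberships and an assumed two-rule base (the paper's actual rule base, indicator set, and C/N0 crowdsourcing metric are not specified here):

```python
def tri(x, a, b, c):
    """Triangular membership function peaking at b over the support (a, c)."""
    if x <= a or x >= c:
        return 0.0
    return (x - a) / (b - a) if x <= b else (c - x) / (c - b)

def threat_degree(cn0_drop, agc_anomaly):
    """Firing strengths of two illustrative fuzzy rules.

    Rule 1: IF C/N0 drop is high AND AGC anomaly is high THEN threat is high.
    Rule 2: IF C/N0 drop is high OR AGC anomaly is high THEN threat is medium.
    Membership ranges below are assumed, not taken from the paper.
    """
    high_cn0 = tri(cn0_drop, 5.0, 15.0, 25.0)    # dB-Hz drop
    high_agc = tri(agc_anomaly, 0.2, 0.6, 1.0)   # normalized AGC deviation
    rule_high = min(high_cn0, high_agc)          # fuzzy AND -> min
    rule_medium = max(high_cn0, high_agc)        # fuzzy OR  -> max
    return rule_high, rule_medium
```

Graded memberships let a moderate C/N0 drop combined with a moderate AGC anomaly still raise the threat degree, which a pair of crisp thresholds would miss.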
(This article belongs to the Proceedings of European Navigation Conference 2025)

18 pages, 12952 KB  
Article
Synthetic Melanoma Image Generation and Evaluation Using Generative Adversarial Networks
by Pei-Yu Lin, Yidan Shen, Neville Mathew, Renjie Hu, Siyu Huang, Courtney M. Queen, Cameron E. West, Ana Ciurea and George Zouridakis
Bioengineering 2026, 13(2), 245; https://doi.org/10.3390/bioengineering13020245 - 20 Feb 2026
Viewed by 369
Abstract
Melanoma is the most lethal form of skin cancer, and early detection is critical for improving patient outcomes. Although dermoscopy combined with deep learning has advanced automated skin-lesion analysis, progress is hindered by limited access to large, well-annotated datasets and by severe class imbalance, where melanoma images are substantially underrepresented. To address these challenges, we present the first systematic benchmarking study comparing four GAN architectures—DCGAN, StyleGAN2, and two StyleGAN3 variants (T and R)—for high-resolution (512×512) melanoma-specific synthesis. We train and optimize all models on two expert-annotated benchmarks (ISIC 2018 and ISIC 2020) under unified preprocessing and hyperparameter exploration, with particular attention to R1 regularization tuning. Image quality is assessed through a multi-faceted protocol combining distribution-level metrics (FID), sample-level representativeness (FMD), qualitative dermoscopic inspection, downstream classification with a frozen EfficientNet-based melanoma detector, and independent evaluation by two board-certified dermatologists. StyleGAN2 achieves the best balance of quantitative performance and perceptual quality, attaining FID scores of 24.8 (ISIC 2018) and 7.96 (ISIC 2020) at γ=0.8. The frozen classifier recognizes 83% of StyleGAN2-generated images as melanoma, while dermatologists distinguish synthetic from real images at only 66.5% accuracy (chance = 50%), with low inter-rater agreement (κ=0.17). In a controlled augmentation experiment, adding synthetic melanoma images to address class imbalance improved melanoma detection AUC from 0.925 to 0.945 on a held-out real-image test set. These findings demonstrate that StyleGAN2-generated melanoma images preserve diagnostically relevant features and can provide a measurable benefit for mitigating class imbalance in melanoma-focused machine learning pipelines.
(This article belongs to the Special Issue AI and Data Science in Bioengineering: Innovations and Applications)

22 pages, 21660 KB  
Article
YOSDet: A YOLO-Based Oriented Ship Detector in SAR Imagery
by Chushi Yu, Oh-Soon Shin and Yoan Shin
Remote Sens. 2026, 18(4), 645; https://doi.org/10.3390/rs18040645 - 19 Feb 2026
Viewed by 193
Abstract
Synthetic aperture radar (SAR) serves as a prominent remote sensing (RS) technology, permitting continuous maritime surveillance regardless of weather or time. Although deep learning-based detectors have achieved promising results in SAR imagery, the majority of current algorithms rely on axis-aligned bounding boxes, which are insufficient for accurately representing arbitrarily oriented ships, especially under speckle noise, complex coastal clutter, and real-time deployment constraints. To address this limitation, we propose a YOLO-based oriented ship detector (YOSDet). Specifically, a dynamic aggregation module (DAM) is incorporated into the backbone to enhance feature representation against non-stationary backscattering. An objective-guided detection head (OGDH) is developed to decouple classification and localization, complemented by a localization quality estimator (LQE) to calibrate classification confidence by mitigating the impact of scattering center shifts. Comparative evaluations conducted on three public SAR ship detection benchmarks validate the effectiveness of YOSDet. The proposed model outperforms existing detectors, achieving mAP scores of 96.8%, 88.5%, and 67.3% on the SSDD+, HRSID, and SRSDD-v1.0 datasets, respectively. Furthermore, the consistency of our approach in both nearshore and offshore environments is confirmed through rigorous quantitative and qualitative assessments.
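An oriented detector regresses a rotated box (cx, cy, w, h, θ) instead of an axis-aligned one, which is what lets long, slender ships be enclosed tightly. A small sketch of recovering the four corner points from that parameterization (representation only; YOSDet's detection head and losses are not reproduced):

```python
import math

def obb_corners(cx, cy, w, h, theta):
    """Corner points of an oriented bounding box.

    The four half-extent offsets are rotated by theta (radians) and
    translated to the center; for theta = 0 the order is top-left,
    top-right, bottom-right, bottom-left.
    """
    c, s = math.cos(theta), math.sin(theta)
    hw, hh = w / 2.0, h / 2.0
    corners = []
    for dx, dy in [(-hw, -hh), (hw, -hh), (hw, hh), (-hw, hh)]:
        corners.append((cx + dx * c - dy * s, cy + dx * s + dy * c))
    return corners
```

Because the rotation is rigid, side lengths are preserved for any θ, so the same (w, h) describes the ship regardless of its heading.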

29 pages, 3365 KB  
Article
A Hybrid Automatic Model for Circle Detection in X-Ray Imagery: A Case Study on Hip Prosthesis Wear
by Mehmet Öztürk and Yahia Adwan
Bioengineering 2026, 13(2), 235; https://doi.org/10.3390/bioengineering13020235 - 17 Feb 2026
Viewed by 678
Abstract
This study presents a fully automatic hybrid framework for circle detection and geometric feature extraction from anteroposterior (AP) X-ray images. Detecting circular structures in X-ray imagery is challenging due to low contrast, noise, and metal-induced artifacts, which often limit the robustness of purely learning-based or purely geometric approaches. To address these challenges, a hybrid deep learning and computer vision pipeline is proposed that combines data-driven region localization with robust geometric fitting. A YOLOv5-based detector is first employed to identify a compact region of interest (ROI) containing circular components. Within this ROI, edge-based processing using Canny detection is applied, followed by an Edge-Snap refinement stage and robust RANSAC-based circle fitting with a Hough-transform fallback to ensure anatomically plausible circle estimation. The resulting circle centers and radii provide stable geometric parameters that can be consistently extracted across images with varying contrast, noise levels, and prosthesis appearances. The applicability of the proposed framework is demonstrated through a case study on hip prosthesis wear analysis, where the automatically detected circle parameters are used to compute medial, superior, and resultant displacement components using established two-dimensional radiographic formulations. Experimental evaluation on AP hip radiographs shows that the YOLOv5 detector achieves high ROI localization performance (mAP@0.5 = 0.971) and that the hybrid pipeline produces consistent circle parameters across longitudinal image sequences. Overall, the proposed method provides an end-to-end automatic solution for robust circle detection in X-ray imagery, with hip prosthesis wear presented solely as a case study without clinical or diagnostic claims.
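RANSAC circle fitting repeatedly fits a circle through three sampled edge points and keeps the hypothesis supported by the most inliers. A compact sketch with assumed iteration and tolerance defaults (the Edge-Snap refinement and the Hough fallback are omitted):

```python
import math, random

def circle_from_3pts(p1, p2, p3):
    """Circumcircle (cx, cy, r) through three non-collinear points."""
    (x1, y1), (x2, y2), (x3, y3) = p1, p2, p3
    d = 2.0 * (x1 * (y2 - y3) + x2 * (y3 - y1) + x3 * (y1 - y2))
    if abs(d) < 1e-12:
        return None  # collinear sample, no unique circle
    a1, a2, a3 = x1**2 + y1**2, x2**2 + y2**2, x3**2 + y3**2
    cx = (a1 * (y2 - y3) + a2 * (y3 - y1) + a3 * (y1 - y2)) / d
    cy = (a1 * (x3 - x2) + a2 * (x1 - x3) + a3 * (x2 - x1)) / d
    return cx, cy, math.hypot(x1 - cx, y1 - cy)

def ransac_circle(points, iters=200, tol=1.0, seed=0):
    """Fit a circle to noisy edge points.

    Each iteration hypothesizes a circle from 3 random points and counts
    inliers whose radial residual |dist - r| is within tol pixels; the
    best-supported hypothesis wins.
    """
    rng = random.Random(seed)
    best, best_inliers = None, -1
    for _ in range(iters):
        cand = circle_from_3pts(*rng.sample(points, 3))
        if cand is None:
            continue
        cx, cy, r = cand
        inliers = sum(1 for (x, y) in points
                      if abs(math.hypot(x - cx, y - cy) - r) <= tol)
        if inliers > best_inliers:
            best, best_inliers = cand, inliers
    return best
```

The inlier count makes the estimate robust: stray edge pixels from artifacts simply fail the residual test instead of biasing a least-squares fit.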
(This article belongs to the Section Biosignal Processing)

25 pages, 3611 KB  
Article
Automatic Estimation of Football Possession via Improved YOLOv8 Detection and DBSCAN-Based Team Classification
by Rong Guo, Yucheng Zeng, Rong Deng, Yawen Lei, Yonglin Che, Lin Yu, Jianpeng Zhang, Xiaobin Xu, Zhaoxiang Ma, Jiajin Zhang and Jianke Yang
Sensors 2026, 26(4), 1252; https://doi.org/10.3390/s26041252 - 14 Feb 2026
Viewed by 309
Abstract
Recent developments in computer vision have significantly enhanced the automation and objectivity of sports analytics. This paper proposes a novel deep learning-based framework for estimating football possession directly from broadcast video, eliminating the reliance on manual annotations or event-based data that are often labor-intensive, subjective, and temporally coarse. The framework incorporates two structurally improved object detection models: YOLOv8-P2S3A for football detection and YOLOv8-HWD3A for player detection. These models demonstrate superior accuracy compared to baseline detectors, achieving 79.4% and 71.1% validation average precision, respectively, while maintaining low computational latency. Team identification is accomplished through unsupervised DBSCAN clustering on jersey color features, enabling robust and label-free team assignment across diverse match scenarios. Object trajectories are maintained via the Norfair multi-object tracking algorithm, and a temporally aware refinement module ensures accurate estimation of ball possession durations. Extensive experiments were conducted on a dataset comprising 20 full-match video clips. The proposed system achieved a root mean square error (RMSE) of 4.87 in possession estimation, outperforming all evaluated baselines, including YOLOv10n (RMSE: 5.12) and YOLOv11 (RMSE: 5.17), with a substantial improvement over YOLOv6n (RMSE: 12.73). These results substantiate the effectiveness of the proposed framework in enhancing the precision, efficiency, and automation of football analytics, offering practical value for coaches, analysts, and sports scientists in professional settings.
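DBSCAN suits team assignment because the number of clusters need not be fixed in advance and outlier colors (referee, goalkeeper) fall out as noise. A minimal pure-Python sketch of the algorithm over jersey-color feature vectors (the eps and min_pts values in the test are assumptions, not the paper's settings):

```python
def dbscan(points, eps, min_pts):
    """Minimal DBSCAN: returns one label per point (-1 = noise).

    Core points have at least min_pts neighbors within eps; clusters grow
    by expanding the neighborhoods of density-reachable core points.
    """
    def dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

    n = len(points)
    labels = [None] * n
    cluster = -1
    for i in range(n):
        if labels[i] is not None:
            continue
        neigh = [j for j in range(n) if dist(points[i], points[j]) <= eps]
        if len(neigh) < min_pts:
            labels[i] = -1  # noise for now; may become a border point later
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in neigh if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster  # border point: claimed, not expanded
            if labels[j] is not None:
                continue
            labels[j] = cluster
            neigh_j = [k for k in range(n) if dist(points[j], points[k]) <= eps]
            if len(neigh_j) >= min_pts:
                queue.extend(neigh_j)  # j is a core point, keep growing
    return labels
```

With two dominant jersey colors, the two dense clusters map to the two teams; anything density-unreachable is ignored rather than forced into a team.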

19 pages, 1004 KB  
Article
Early Anomaly Detection in Maritime Refrigerated Containers Using a Hybrid Digital Twin and Deep Learning Framework
by Marko Vukšić, Jasmin Ćelić, Dario Ogrizović and Ana Perić Hadžić
Appl. Sci. 2026, 16(4), 1887; https://doi.org/10.3390/app16041887 - 13 Feb 2026
Viewed by 203
Abstract
Maritime refrigerated containers operate under harsh and highly variable conditions, where gradual equipment degradation can lead to temperature excursions, cargo losses, and operational disruptions. In current practice, monitoring relies largely on threshold-based temperature alarms, which are reactive and provide limited insight into early abnormal behaviour. This study proposes a hybrid framework for early anomaly detection in maritime refrigerated containers that combines a lightweight physics-based digital twin with a deep learning anomaly detector trained exclusively on fault-free operation. The approach is designed for shipboard constraints and uses only controller-level signals augmented by locally derived features, enabling low-complexity edge execution. The digital twin produces physically interpretable temperature residuals, while a convolutional autoencoder learns normal multivariate operating patterns and flags deviations via reconstruction error. Both indicators are integrated using conservative persistence gating to suppress short-lived transients typical of maritime operation. The framework is evaluated in a simulation environment calibrated to representative reefer thermal dynamics under variable ambient conditions and progressive fault injection across gradual and abrupt fault categories. Results indicate earlier and operationally credible detection compared to conventional alarms, supporting practical predictive maintenance in maritime cold-chain logistics.
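Conservative persistence gating reduces to a simple rule: only a sustained run of anomalous samples raises an alarm, so short-lived transients (door openings, defrost cycles) are suppressed. A sketch with an assumed persistence window k:

```python
def persistence_gate(flags, k=3):
    """Gate a per-sample anomaly flag sequence.

    An alarm is raised only once k consecutive samples are anomalous;
    any non-anomalous sample resets the run counter. Returns a boolean
    alarm sequence of the same length as the input.
    """
    run, alarms = 0, []
    for anomalous in flags:
        run = run + 1 if anomalous else 0
        alarms.append(run >= k)
    return alarms
```

The trade-off is a detection delay of k - 1 samples in exchange for a sharply lower transient false-alarm rate, a reasonable bargain for slowly developing refrigeration faults.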
(This article belongs to the Special Issue AI Applications in the Maritime Sector)

21 pages, 3073 KB  
Article
SARDet-MIM: Enhancing SAR Target Detection via a Structural and Scattering Masked Autoencoder
by Peiling Zhou, Ben Niu, Lijia Huang, Qiantong Wang, Yongchao Zhao, Guangyao Zhou and Yuxin Hu
Remote Sens. 2026, 18(4), 580; https://doi.org/10.3390/rs18040580 - 13 Feb 2026
Viewed by 183
Abstract
The performance of deep learning approaches for Synthetic Aperture Radar (SAR) target detection is often limited by the scarcity of annotated data. While Self-Supervised Learning (SSL) has emerged as a powerful paradigm to mitigate data dependence, its potential in SAR target detection remains largely underexplored. In this study, we propose SARDet-MIM, a comprehensive framework based on Masked Image Modeling (MIM), to enhance SAR target detection. The approach consists of two stages. In the self-supervised pre-training stage, we propose an innovative Structural and Scattering Masked Autoencoder (SSMAE) method for SAR imagery. Unlike conventional MIM methods, which typically reconstruct raw pixels, SSMAE employs a physics-aware reconstruction target comprising multi-scale gradient and SAR-Harris features. This strategy explicitly guides the network to capture discriminative structural contexts and intrinsic scattering features that benefit SAR target detection. For downstream detection, we construct a Maximally Pre-trained Detector (MPD), which integrally transfers the pre-trained ViT encoder–decoder architecture to the detection network to fully exploit pre-trained representations. Extensive experiments on three SAR target detection datasets demonstrate that SARDet-MIM consistently outperforms competing methods.
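Masked image modeling begins by hiding a large random subset of image patches; the decoder then reconstructs a target at the hidden positions (here, the paper reconstructs gradient and SAR-Harris features rather than raw pixels). A sketch of the masking step with an assumed 75% mask ratio:

```python
import random

def sample_mask(num_patches, mask_ratio=0.75, seed=0):
    """Choose which patch indices to mask for masked image modeling.

    Returns (masked, visible) index lists: the encoder sees only the
    visible patches, and the reconstruction loss is applied only at the
    masked positions.
    """
    rng = random.Random(seed)
    num_masked = int(num_patches * mask_ratio)
    masked = set(rng.sample(range(num_patches), num_masked))
    visible = [i for i in range(num_patches) if i not in masked]
    return sorted(masked), visible
```

A high ratio is what makes the pretext task hard enough to force the encoder to learn structural context rather than interpolate from nearby pixels.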

28 pages, 66640 KB  
Article
SSABNet: Spatial-Semantic Aggregation and Balancing Network for Small-Target Detection in UAV Remote Sensing Images
by Hongxing Zhang, Zhonghong Ou, Siyuan Yao, Shigeng Wang, Yang Guo and Meina Song
Remote Sens. 2026, 18(4), 550; https://doi.org/10.3390/rs18040550 - 9 Feb 2026
Viewed by 285
Abstract
The precise localization of small objects in UAV-captured remote sensing imagery remains a formidable challenge due to their limited spatial support, coarse resolution, and severe background clutter. These factors often cause weak target cues to be progressively overwhelmed during deep feature extraction. Existing deep learning-based detectors typically suffer from two fundamental limitations: the irreversible loss of fine-grained spatial details during hierarchical feature fusion and the scale-insensitive optimization of conventional loss functions, which inadequately emphasize hard-to-detect small targets. To address these issues, we propose a novel Spatial-Semantic Aggregation and Balancing Network (SSABNet) tailored for UAV-based small-target detection. First, a Spatial-Semantic Aggregation (SSA) module is introduced to establish a high-fidelity restoration pathway that recovers fine-grained texture and boundary information from shallow layers. By employing content-aware operators, SSA effectively reconciles the structural discrepancy between spatial details and semantic abstractions, enabling precise cross-scale feature fusion while suppressing aliasing artifacts. Second, we design a Scale-Aware Balancing Loss (SABL) to mitigate the gradient instability and vanishing-gradient issues commonly encountered when optimizing non-overlapping small targets. SABL adopts a scale-dependent modulation mechanism that smoothly transitions from Wasserstein distance for distributional alignment of small objects to Euclidean distance for geometric refinement of larger targets, thereby ensuring stable and balanced optimization across object scales. Extensive experiments on the VisDrone benchmark demonstrate that SSABNet outperforms state-of-the-art detectors, achieving gains of 1.3% in overall AP and 2.5% in APs. Further evaluation on the UAVDT dataset confirms its strong generalization capability, yielding improvements of 0.5% in AP and 16.9% in APs. These results validate the effectiveness of jointly addressing feature representation and scale-aware optimization for UAV small-target detection.
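A scale-dependent modulation of this kind can be sketched for 1D boxes: a sigmoid in the target scale blends a Gaussian Wasserstein term (dominant for small objects, where it stays informative even without overlap) with a Euclidean term (dominant for large ones). The transition parameters s0 and tau below are assumptions, as is the simple box-to-Gaussian mapping:

```python
import math

def w2_gaussian(mu1, s1, mu2, s2):
    """2-Wasserstein distance between 1D Gaussians N(mu, s^2)."""
    return math.hypot(mu1 - mu2, s1 - s2)

def scale_aware_loss(pred, target, s0=32.0, tau=8.0):
    """Blend distributional and geometric regression terms by scale.

    pred/target are (center, width) pairs; the target width serves as the
    object scale. alpha ~ 0 for tiny targets (Wasserstein dominates) and
    alpha ~ 1 for large ones (Euclidean dominates).
    """
    (c_p, w_p), (c_t, w_t) = pred, target
    alpha = 1.0 / (1.0 + math.exp(-(w_t - s0) / tau))
    wass = w2_gaussian(c_p, w_p / 2.0, c_t, w_t / 2.0)
    eucl = math.hypot(c_p - c_t, w_p - w_t)
    return (1.0 - alpha) * wass + alpha * eucl
```

Unlike IoU-style losses, the Wasserstein term keeps a nonzero gradient when a tiny prediction does not overlap its target at all, which is the instability SABL targets.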
(This article belongs to the Special Issue Deep Learning-Based Small-Target Detection in Remote Sensing)

28 pages, 922 KB  
Article
MAESTRO: A Multi-Scale Ensemble Framework with GAN-Based Data Refinement for Robust Malicious Tor Traffic Detection
by Jinbu Geng, Yu Xie, Jun Li, Xuewen Yu and Lei He
Mathematics 2026, 14(3), 551; https://doi.org/10.3390/math14030551 - 3 Feb 2026
Viewed by 365
Abstract
Malicious Tor traffic data contains deep domain-specific knowledge, which makes labeling challenging, and the lack of labeled data degrades the accuracy of learning-based detectors. Real-world deployments also exhibit severe class imbalance, where malicious traffic constitutes a small minority of network flows, which further reduces detection performance. In addition, Tor’s fixed 512-byte cell architecture removes packet-size diversity that many encrypted-traffic methods rely on, making feature extraction difficult. This paper proposes an efficient three-stage framework, MAESTRO v1.0, for malicious Tor traffic detection. In Stage 1, MAESTRO extracts multi-scale behavioral signatures by fusing temporal, positional, and directional embeddings at cell, direction, and flow granularities to mitigate feature homogeneity; it then compresses these representations with an autoencoder into compact latent features. In Stage 2, MAESTRO introduces an ensemble-based quality quantification method that combines five complementary anomaly detection models to produce robust discriminability scores for adaptive sample weighting, helping the classifier to emphasize high-quality samples. MAESTRO also trains three specialized GANs per minority class and applies strict five-model ensemble validation to synthesize diverse high-fidelity samples, addressing extreme class imbalance. We evaluate MAESTRO under systematic imbalance settings, ranging from the natural distribution to an extreme 1% malicious ratio. On the CCS’22 Tor malware dataset, MAESTRO achieves 92.38% accuracy, 64.79% recall, and 73.70% F1-score under the natural distribution, improving F1-score by up to 15.53% compared with state-of-the-art baselines. Under the 1% malicious setting, MAESTRO maintains 21.1% recall, which is 14.1 percentage points higher than the best baseline, while conventional methods drop below 10%.
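Combining five anomaly detectors requires putting their scores on a common scale first. One standard way to do that, shown as a sketch (the paper's exact quality-quantification method may differ), is rank averaging:

```python
def rank_average(score_lists):
    """Combine anomaly scores from several detectors via average rank.

    Each detector's scores are converted to ranks (0 = least anomalous),
    normalized to [0, 1/m] for m detectors, and summed, so heterogeneous
    score scales become comparable. Higher combined value = more
    anomalous by consensus.
    """
    n = len(score_lists[0])
    m = len(score_lists)
    combined = [0.0] * n
    for scores in score_lists:
        order = sorted(range(n), key=lambda i: scores[i])
        for rank, i in enumerate(order):
            combined[i] += rank / (m * max(n - 1, 1))
    return combined
```

Rank-based fusion is robust to one detector emitting scores orders of magnitude larger than the others, which would dominate a plain average.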
(This article belongs to the Special Issue New Advances in Network Security and Data Privacy)

39 pages, 3530 KB  
Article
AI-Based Embedded Framework for Cyber-Attack Detection Through Signal Processing and Anomaly Analysis
by Sebastian-Alexandru Drǎguşin, Robert-Nicolae Boştinaru, Nicu Bizon and Gabriel-Vasile Iana
Appl. Sci. 2026, 16(3), 1416; https://doi.org/10.3390/app16031416 - 30 Jan 2026
Viewed by 410
Abstract
This paper proposes an applied framework for cyberattack and anomaly detection in resource-constrained embedded/IoT environments by combining signal-processing feature construction with supervised and unsupervised AI (Artificial Intelligence) models. The workflow covers dataset preparation and normalization, correlation-driven feature analysis, and compact representations via PCA (Principal Component Analysis), followed by classification and anomaly scoring. In addition to the original UNSW-NB15 (University of New South Wales—Network-Based Dataset 2015) traffic features, Fourier-domain descriptors, wavelet-domain descriptors, and Kalman-based smoothing/innovation features are considered to improve robustness under variability and measurement noise. Detection performance is assessed using classical and ensemble learning methods (SVM (Support Vector Machines), RF (Random Forest), XGBoost (Extreme Gradient Boosting), LightGBM (Light Gradient Boosting Machine)), unsupervised baselines (K-Means and DBSCAN (Density-Based Spatial Clustering of Applications with Noise)), and DL (Deep-Learning) anomaly detectors based on Autoencoder reconstruction and GAN (Generative Adversarial Network)-based scoring. Experimental results on UNSW-NB15 indicate that ensemble-based models provide the strongest overall detection performance, while the signal-processing augmentation and PCA-based compactness support efficient deployment in embedded contexts. The findings confirm that integrating lightweight signal processing with AI-driven models enables effective and adaptable identification of malicious network traffic, supporting deployment-oriented embedded cybersecurity and motivating future real-time validation on edge hardware.
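Kalman-based smoothing/innovation features can be illustrated with a scalar random-walk filter: the innovation (measurement minus prediction) stays small for traffic consistent with recent dynamics and spikes on abrupt changes, making it a cheap anomaly feature. The noise variances q and r below are assumed values:

```python
def kalman_innovations(zs, q=1e-3, r=0.5):
    """Scalar Kalman filtering of a noisy signal.

    Uses a random-walk process model. Returns (estimates, innovations);
    the innovation at each step is the measurement minus the predicted
    state, computed before the update.
    """
    x, p = zs[0], 1.0            # initial state estimate and covariance
    est, innov = [x], [0.0]
    for z in zs[1:]:
        p = p + q                # predict: covariance grows by process noise
        k = p / (p + r)          # Kalman gain
        innov.append(z - x)      # innovation w.r.t. the prediction
        x = x + k * (z - x)      # update state toward the measurement
        p = (1.0 - k) * p        # update covariance
        est.append(x)
    return est, innov
```

Feeding the innovation sequence (rather than the raw signal) to a classifier whitens slow trends and highlights exactly the abrupt deviations an attack tends to produce.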
(This article belongs to the Section Electrical, Electronics and Communications Engineering)

24 pages, 1253 KB  
Article
Re-Evaluating Android Malware Detection: Tabular Features, Vision Models, and Ensembles
by Prajwal Hosahalli Dayananda and Zesheng Chen
Electronics 2026, 15(3), 544; https://doi.org/10.3390/electronics15030544 - 27 Jan 2026
Viewed by 434
Abstract
Static, machine learning-based malware detection is widely used in Android security products, where even small increases in false-positive rates can impose significant burdens on analysts and cause unacceptable disruptions for end users. Both tabular features and image-based representations have been explored for Android malware detection. However, existing public benchmark datasets do not provide paired tabular and image representations for the same samples, limiting direct comparisons between tabular models and vision-based models. This work investigates whether carefully engineered, domain-specific tabular features can match or surpass the performance of state-of-the-art deep vision models under strict false-positive-rate constraints, and whether ensemble approaches justify their additional complexity. To enable this analysis, we construct a large corpus of Android applications with paired static representations and evaluate six popular machine learning models on the exact same samples: two tabular models using EMBER features, two tabular models using extended EMBER features, and two vision-based models using malware images. Our results show that a LightGBM model trained on extended EMBER features outperforms all other evaluated models, as well as a state-of-the-art approach trained on a much larger dataset. Furthermore, we develop an ensemble model combining both tabular and vision-based detectors, which yields a modest performance improvement but at the cost of substantial additional computational and engineering overhead.
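Evaluating under a strict false-positive-rate constraint means calibrating the decision threshold on benign data rather than maximizing accuracy. A sketch of that calibration step (quantile-based thresholding is a common convention, not necessarily this paper's exact procedure):

```python
def threshold_at_fpr(benign_scores, target_fpr):
    """Pick a detection threshold from benign calibration scores.

    Chooses the threshold so that at most target_fpr of benign samples
    score strictly above it (higher score = more malicious-looking).
    """
    ranked = sorted(benign_scores, reverse=True)
    allowed = int(len(ranked) * target_fpr)   # benign samples we may flag
    return ranked[allowed] if allowed < len(ranked) else ranked[-1]
```

Models are then compared by their detection rate at this fixed operating point, which reflects deployment reality better than a single accuracy or AUC number.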
(This article belongs to the Special Issue Feature Papers in Networks: 2025–2026 Edition)

27 pages, 49730 KB  
Article
AMSRDet: An Adaptive Multi-Scale UAV Infrared-Visible Remote Sensing Vehicle Detection Network
by Zekai Yan and Yuheng Li
Sensors 2026, 26(3), 817; https://doi.org/10.3390/s26030817 - 26 Jan 2026
Viewed by 339
Abstract
Unmanned Aerial Vehicle (UAV) platforms enable flexible and cost-effective vehicle detection for intelligent transportation systems, yet small-scale vehicles in complex aerial scenes pose substantial challenges from extreme scale variations, environmental interference, and single-sensor limitations. We present AMSRDet (Adaptive Multi-Scale Remote Sensing Detector), an adaptive multi-scale detection network fusing infrared (IR) and visible (RGB) modalities for robust UAV-based vehicle detection. Our framework comprises four novel components: (1) a MobileMamba-based dual-stream encoder extracting complementary features via Selective State-Space 2D (SS2D) blocks with linear complexity O(HWC), achieving 2.1× efficiency improvement over standard Transformers; (2) a Cross-Modal Global Fusion (CMGF) module capturing global dependencies through spatial-channel attention while suppressing modality-specific noise via adaptive gating; (3) a Scale-Coordinate Attention Fusion (SCAF) module integrating multi-scale features via coordinate attention and learned scale-aware weighting, improving small object detection by 2.5 percentage points; and (4) a Separable Dynamic Decoder generating scale-adaptive predictions through content-aware dynamic convolution, reducing computational cost by 48.9% compared to standard DETR decoders. On the DroneVehicle dataset, AMSRDet achieves 45.8% mAP@0.5:0.95 (81.2% mAP@0.5) at 68.3 Frames Per Second (FPS) with 28.6 million (M) parameters and 47.2 Giga Floating Point Operations (GFLOPs), outperforming twenty state-of-the-art detectors including YOLOv12 (+0.7% mAP), DEIM (+0.8% mAP), and Mamba-YOLO (+1.5% mAP). Cross-dataset evaluation on Camera-vehicle yields 52.3% mAP without fine-tuning, demonstrating strong generalization across viewpoints and scenarios.
(This article belongs to the Special Issue AI and Smart Sensors for Intelligent Transportation Systems)
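The adaptive gating described for the CMGF module can be illustrated in miniature. The following NumPy sketch shows one plausible reading of gated cross-modal fusion, where channel statistics from both streams drive a sigmoid gate that blends RGB and IR features; the function names, the global-average-pooling gate input, and the convex-combination form are assumptions for illustration, not the paper's implementation.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def adaptive_gated_fusion(rgb_feat, ir_feat, w_gate, b_gate):
    """Fuse RGB and IR feature maps with a learned per-channel gate.

    rgb_feat, ir_feat: (C, H, W) feature maps from the two streams.
    w_gate, b_gate:    parameters mapping concatenated per-channel
                       statistics (2C,) to per-channel gate weights (C,).
    Returns a (C, H, W) fused map: g * rgb + (1 - g) * ir.
    """
    # Global average pooling summarises each modality per channel.
    stats = np.concatenate([rgb_feat.mean(axis=(1, 2)),
                            ir_feat.mean(axis=(1, 2))])   # (2C,)
    g = sigmoid(w_gate @ stats + b_gate)                  # (C,) in (0, 1)
    g = g[:, None, None]                                  # broadcast over H, W
    return g * rgb_feat + (1.0 - g) * ir_feat

# Toy example: C=4 channels, 8x8 spatial maps.
rng = np.random.default_rng(0)
rgb = rng.standard_normal((4, 8, 8))
ir = rng.standard_normal((4, 8, 8))
w = rng.standard_normal((4, 8)) * 0.1
b = np.zeros(4)
fused = adaptive_gated_fusion(rgb, ir, w, b)
print(fused.shape)  # (4, 8, 8)
```

Because the gate is bounded in (0, 1), each fused channel is a convex combination of the two modalities, which is one simple way to suppress modality-specific noise while keeping both signals available.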

21 pages, 1284 KB  
Article
Probabilistic Indoor 3D Object Detection from RGB-D via Gaussian Distribution Estimation
by Hyeong-Geun Kim
Mathematics 2026, 14(3), 421; https://doi.org/10.3390/math14030421 - 26 Jan 2026
Viewed by 277
Abstract
Conventional object detectors represent each object by a deterministic bounding box, regressing its center and size from RGB images. However, such discrete parameterization ignores the inherent uncertainty in object appearance and geometric projection, which can be more naturally modeled as a probabilistic density field. Recent works have introduced Gaussian-based formulations that treat objects as distributions rather than boxes, yet they remain limited to 2D images or require late fusion between image and depth modalities. In this paper, we propose a unified Gaussian-based framework for direct 3D object detection from RGB-D inputs. Our method is built upon a vision transformer backbone to effectively capture global context. Instead of separately embedding RGB and depth features or refining depth within region proposals, our method takes a full four-channel RGB-D tensor and predicts the mean and covariance of a 3D Gaussian distribution for each object in a single forward pass. We extend a pretrained vision transformer to accept four-channel inputs by augmenting the patch embedding layer while preserving ImageNet-learned representations. This formulation allows the detector to represent both object location and geometric uncertainty in 3D space. By optimizing divergence metrics such as the Kullback–Leibler or Bhattacharyya distances between predicted and target distributions, the network learns a physically consistent probabilistic representation of objects. Experimental results on the SUN RGB-D benchmark demonstrate that our approach achieves competitive performance compared to state-of-the-art point-cloud-based methods while offering uncertainty-aware and geometrically interpretable 3D detections. Full article
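The divergence objective named in the abstract has a closed form for Gaussians. As a minimal NumPy sketch (the function name is illustrative, not from the paper), the KL divergence between a target and a predicted k-dimensional Gaussian is:

```python
import numpy as np

def kl_gaussian(mu0, cov0, mu1, cov1):
    """Closed-form KL( N(mu0, cov0) || N(mu1, cov1) ) for k-dim Gaussians."""
    k = mu0.shape[0]
    cov1_inv = np.linalg.inv(cov1)
    diff = mu1 - mu0
    return 0.5 * (np.trace(cov1_inv @ cov0)        # covariance mismatch
                  + diff @ cov1_inv @ diff          # center offset
                  - k
                  + np.log(np.linalg.det(cov1) / np.linalg.det(cov0)))

# Identical distributions have zero divergence.
mu = np.array([1.0, 2.0, 0.5])
cov = np.diag([0.3, 0.2, 0.4])
print(round(kl_gaussian(mu, cov, mu, cov), 6))      # prints 0.0
# Shifting the predicted center strictly increases the loss.
print(kl_gaussian(mu, cov, mu + 0.1, cov) > 0)      # prints True
```

Optimizing such a divergence penalizes both center error and covariance (i.e., geometric-uncertainty) mismatch in a single differentiable term, which is what lets the detector learn location and uncertainty jointly.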

20 pages, 1567 KB  
Article
Deformable Pyramid Sparse Transformer for Semi-Supervised Driver Distraction Detection
by Qiang Zhao, Zhichao Yu, Jiahui Yu, Simon James Fong, Yuchu Lin, Rui Wang and Weiwei Lin
Sensors 2026, 26(3), 803; https://doi.org/10.3390/s26030803 - 25 Jan 2026
Viewed by 337
Abstract
Ensuring sustained driver attention is critical for intelligent transportation safety systems; however, the performance of data-driven driver distraction detection models is often limited by the high cost of large-scale manual annotation. To address this challenge, this paper proposes an adaptive semi-supervised driver distraction detection framework based on teacher–student learning and deformable pyramid feature fusion. The framework leverages a limited amount of labeled data together with abundant unlabeled samples to achieve robust and scalable distraction detection. An adaptive pseudo-label optimization strategy is introduced, incorporating category-aware pseudo-label thresholding, delayed pseudo-label scheduling, and a confidence-weighted pseudo-label loss to dynamically balance pseudo-label quality and training stability. To enhance fine-grained perception of subtle driver behaviors, a Deformable Pyramid Sparse Transformer (DPST) module is integrated into a lightweight YOLOv11 detector, enabling precise multi-scale feature alignment and efficient cross-scale semantic fusion. Furthermore, a teacher-guided feature consistency distillation mechanism is employed to promote semantic alignment between teacher and student models at the feature level, mitigating the adverse effects of noisy pseudo-labels. Extensive experiments conducted on the Roboflow Distracted Driving Dataset demonstrate that the proposed method outperforms representative fully supervised baselines in terms of mAP@0.5 and mAP@0.5:0.95 while maintaining a balanced trade-off between precision and recall. These results indicate that the proposed framework provides an effective and practical solution for real-world driver monitoring systems under limited annotation conditions. Full article
(This article belongs to the Section Vehicular Sensing)
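The pseudo-label strategy above combines two ideas that can be sketched compactly: category-aware thresholding drops low-confidence teacher predictions per class, and the surviving boxes are weighted by teacher confidence. The following NumPy sketch is a minimal reading under assumed interfaces (the function name, array shapes, and normalization are illustrative, not the paper's code):

```python
import numpy as np

def pseudo_label_loss(conf, labels, losses, class_thresh):
    """Confidence-weighted pseudo-label loss with per-class thresholds.

    conf:         (N,) teacher confidences for N pseudo-labeled boxes.
    labels:       (N,) predicted class indices.
    losses:       (N,) per-box student detection losses.
    class_thresh: (K,) confidence threshold for each of K classes.

    Boxes below their class threshold are dropped; the rest are
    weighted by teacher confidence so noisier labels contribute less.
    """
    keep = conf >= class_thresh[labels]          # category-aware filtering
    if not keep.any():
        return 0.0
    w = conf[keep]                               # confidence weighting
    return float((w * losses[keep]).sum() / w.sum())

conf = np.array([0.9, 0.4, 0.8])
labels = np.array([0, 1, 1])
losses = np.array([1.0, 5.0, 2.0])
thresh = np.array([0.5, 0.6])                    # per-class thresholds
# The 0.4-confidence box of class 1 falls below its 0.6 threshold
# and is excluded; the remaining losses are confidence-averaged.
print(pseudo_label_loss(conf, labels, losses, thresh))
```

Delayed pseudo-label scheduling, also mentioned in the abstract, would simply gate when this loss term is enabled during training (e.g., after a warm-up phase on labeled data).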
