Search Results (415)

Search Parameters:
Keywords = learning-based features detectors

18 pages, 4195 KB  
Article
WeldSimAM and EnNWD Co-Optimization: Enhancing Lightweight YOLOv11 for Multi-Scale Weld Defect Detection
by Wenquan Huang, Qing Cheng and Jing Zhu
Technologies 2026, 14(3), 140; https://doi.org/10.3390/technologies14030140 - 26 Feb 2026
Viewed by 171
Abstract
In the context of Industry 4.0, reliable automatic inspection of weld surface defects is critical for structural safety, yet current deep learning-based detectors struggle with the extreme scale variation and anisotropic shapes characteristic of weld flaws such as pores, cracks, and lack of fusion. Existing YOLO-family models, although effective on general-purpose datasets, often fail to robustly localize tiny defects and long, slender discontinuities while remaining lightweight enough for industrial edge deployment. A critical research gap lies in the lack of task-specific optimization for weld defects: standard attention mechanisms are isotropic and cannot capture linear defect continuity, while existing loss functions ignore the scale disparity between tiny pores (area < 100 pixels²) and large incomplete-fusion defects (area > 5000 pixels²), leading to unstable regression. Here, we propose a dual-optimized lightweight YOLOv11 framework tailored for weld defect detection that addresses both feature representation and bounding-box regression. First, we introduce WeldSimAM, an enhanced attention module that augments parameter-free SimAM with directional (horizontal/vertical) and channel-wise enhancement to better capture the directional texture of linear weld defects. Second, we develop an Enhanced Normalized Wasserstein Distance (EnNWD) loss, which incorporates scale-disparity penalties and relative-area-based weighting to mitigate sample imbalance and improve regression accuracy for tiny and large-aspect-ratio targets. Validated via 10-fold cross-validation on three datasets (one self-built and two public), the method achieves 99.48% mAP@0.5 and 73.29% mAP@0.5:0.95, outperforming YOLOv11 by 0.13 and 3.76 percentage points (p < 0.01, two-tailed t-test), with a 5.21 MB model size and 132 FPS on an NVIDIA RTX 4090. It also surpasses non-YOLO SOTA methods (e.g., EfficientDet-Lite3) by 3.8–5.5 percentage points in mAP@0.5 (p < 0.05), offering a practical real-time solution for industrial inspection.
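WeldSimAM builds on parameter-free SimAM, which scores each activation by an energy function of its squared deviation from the mean and maps that energy through a sigmoid. A minimal 1D sketch of that base mechanism (the paper's directional and channel-wise enhancements are not reproduced; `lam` is an assumed regularization constant):

```python
import math

def simam_weights(x, lam=1e-4):
    """Parameter-free SimAM-style attention for a 1D feature vector.

    Each activation's energy grows with its squared deviation from the
    mean; a sigmoid maps energy to a (0, 1) attention weight, which
    rescales the activation.
    """
    n = len(x)
    mu = sum(x) / n
    var = sum((v - mu) ** 2 for v in x) / n
    out = []
    for v in x:
        energy = (v - mu) ** 2 / (4.0 * (var + lam)) + 0.5
        weight = 1.0 / (1.0 + math.exp(-energy))  # sigmoid
        out.append(v * weight)
    return out
```

Activations that deviate strongly from the local mean (e.g., a defect response against smooth weld texture) receive larger weights, which is the intuition the module exploits.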
(This article belongs to the Section Manufacturing Technology)

11 pages, 1220 KB  
Proceeding Paper
Enhanced GNSS Threat Detection: On-Edge Statistical Approach with Crowdsourced Measurements and Fuzzy Logic Decision-Making
by Eustachio Roberto Matera, Olivier Lagrange and Maxime Olivier
Eng. Proc. 2026, 126(1), 18; https://doi.org/10.3390/engproc2026126018 - 24 Feb 2026
Viewed by 168
Abstract
Global Navigation Satellite Systems are vulnerable to jamming and spoofing threats, compromising several critical applications. Existing detection methods based on hardware solutions (antenna array, spectrogram) are low-latency and accurate but require expensive hardware, while machine learning solutions are the most effective but require extensive training and lack adaptability. This work proposes an edge-based, statistical threat detector using crowdsourced GNSS data and fuzzy logic to integrate multiple anomaly indicators. A key feature is a C/N0-based crowdsourcing metric. Experiments show detection precision up to 88% for jamming and 97% for spoofing, with false positive rates around 1–2% and an average detection time of 10 s.
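The fuzzy-logic decision stage fuses several anomaly indicators without hard thresholds. A toy Mamdani-style sketch with assumed triangular memberships and an assumed two-rule base (the paper's actual rule base, indicator set, and C/N0 crowdsourcing metric are not specified here):

```python
def tri(x, a, b, c):
    """Triangular membership function peaking at b over the support (a, c)."""
    if x <= a or x >= c:
        return 0.0
    return (x - a) / (b - a) if x <= b else (c - x) / (c - b)

def threat_degree(cn0_drop, agc_anomaly):
    """Firing strengths of two illustrative fuzzy rules.

    Rule 1: IF C/N0 drop is high AND AGC anomaly is high THEN threat is high.
    Rule 2: IF C/N0 drop is high OR AGC anomaly is high THEN threat is medium.
    Membership ranges below are assumed, not taken from the paper.
    """
    high_cn0 = tri(cn0_drop, 5.0, 15.0, 25.0)    # dB-Hz drop
    high_agc = tri(agc_anomaly, 0.2, 0.6, 1.0)   # normalized AGC deviation
    rule_high = min(high_cn0, high_agc)          # fuzzy AND -> min
    rule_medium = max(high_cn0, high_agc)        # fuzzy OR  -> max
    return rule_high, rule_medium
```

Graded memberships let a moderate C/N0 drop combined with a moderate AGC anomaly still raise the threat degree, which a pair of crisp thresholds would miss.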
(This article belongs to the Proceedings of European Navigation Conference 2025)

18 pages, 12952 KB  
Article
Synthetic Melanoma Image Generation and Evaluation Using Generative Adversarial Networks
by Pei-Yu Lin, Yidan Shen, Neville Mathew, Renjie Hu, Siyu Huang, Courtney M. Queen, Cameron E. West, Ana Ciurea and George Zouridakis
Bioengineering 2026, 13(2), 245; https://doi.org/10.3390/bioengineering13020245 - 20 Feb 2026
Viewed by 369
Abstract
Melanoma is the most lethal form of skin cancer, and early detection is critical for improving patient outcomes. Although dermoscopy combined with deep learning has advanced automated skin-lesion analysis, progress is hindered by limited access to large, well-annotated datasets and by severe class imbalance, where melanoma images are substantially underrepresented. To address these challenges, we present the first systematic benchmarking study comparing four GAN architectures—DCGAN, StyleGAN2, and two StyleGAN3 variants (T and R)—for high-resolution (512×512) melanoma-specific synthesis. We train and optimize all models on two expert-annotated benchmarks (ISIC 2018 and ISIC 2020) under unified preprocessing and hyperparameter exploration, with particular attention to R1 regularization tuning. Image quality is assessed through a multi-faceted protocol combining distribution-level metrics (FID), sample-level representativeness (FMD), qualitative dermoscopic inspection, downstream classification with a frozen EfficientNet-based melanoma detector, and independent evaluation by two board-certified dermatologists. StyleGAN2 achieves the best balance of quantitative performance and perceptual quality, attaining FID scores of 24.8 (ISIC 2018) and 7.96 (ISIC 2020) at γ=0.8. The frozen classifier recognizes 83% of StyleGAN2-generated images as melanoma, while dermatologists distinguish synthetic from real images at only 66.5% accuracy (chance = 50%), with low inter-rater agreement (κ=0.17). In a controlled augmentation experiment, adding synthetic melanoma images to address class imbalance improved melanoma detection AUC from 0.925 to 0.945 on a held-out real-image test set. These findings demonstrate that StyleGAN2-generated melanoma images preserve diagnostically relevant features and can provide a measurable benefit for mitigating class imbalance in melanoma-focused machine learning pipelines.
(This article belongs to the Special Issue AI and Data Science in Bioengineering: Innovations and Applications)

22 pages, 21660 KB  
Article
YOSDet: A YOLO-Based Oriented Ship Detector in SAR Imagery
by Chushi Yu, Oh-Soon Shin and Yoan Shin
Remote Sens. 2026, 18(4), 645; https://doi.org/10.3390/rs18040645 - 19 Feb 2026
Viewed by 193
Abstract
Synthetic aperture radar (SAR) serves as a prominent remote sensing (RS) technology, permitting continuous maritime surveillance regardless of weather or time. Although deep learning-based detectors have achieved promising results in SAR imagery, the majority of current algorithms rely on axis-aligned bounding boxes, which are insufficient for accurately representing arbitrarily oriented ships, especially under speckle noise, complex coastal clutter, and real-time deployment constraints. To address this limitation, we propose a YOLO-based oriented ship detector (YOSDet). Specifically, a dynamic aggregation module (DAM) is incorporated into the backbone to enhance feature representation against non-stationary backscattering. An objective-guided detection head (OGDH) is developed to decouple classification and localization, complemented by a localization quality estimator (LQE) to calibrate classification confidence by mitigating the impact of scattering center shifts. Comparative evaluations conducted on three public SAR ship detection benchmarks validate the effectiveness of YOSDet. The proposed model outperforms existing detectors, achieving mAP scores of 96.8%, 88.5%, and 67.3% on the SSDD+, HRSID, and SRSDD-v1.0 datasets, respectively. Furthermore, the consistency of our approach in both nearshore and offshore environments is confirmed through rigorous quantitative and qualitative assessments.
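An oriented detector regresses a rotated box (cx, cy, w, h, θ) instead of an axis-aligned one, which is what lets long, slender ships be enclosed tightly. A small sketch of recovering the four corner points from that parameterization (representation only; YOSDet's detection head and losses are not reproduced):

```python
import math

def obb_corners(cx, cy, w, h, theta):
    """Corner points of an oriented bounding box.

    The four half-extent offsets are rotated by theta (radians) and
    translated to the center; for theta = 0 the order is top-left,
    top-right, bottom-right, bottom-left.
    """
    c, s = math.cos(theta), math.sin(theta)
    hw, hh = w / 2.0, h / 2.0
    corners = []
    for dx, dy in [(-hw, -hh), (hw, -hh), (hw, hh), (-hw, hh)]:
        corners.append((cx + dx * c - dy * s, cy + dx * s + dy * c))
    return corners
```

Because the rotation is rigid, side lengths are preserved for any θ, so the same (w, h) describes the ship regardless of its heading.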

29 pages, 3365 KB  
Article
A Hybrid Automatic Model for Circle Detection in X-Ray Imagery: A Case Study on Hip Prosthesis Wear
by Mehmet Öztürk and Yahia Adwan
Bioengineering 2026, 13(2), 235; https://doi.org/10.3390/bioengineering13020235 - 17 Feb 2026
Viewed by 678
Abstract
This study presents a fully automatic hybrid framework for circle detection and geometric feature extraction from anteroposterior (AP) X-ray images. Detecting circular structures in X-ray imagery is challenging due to low contrast, noise, and metal-induced artifacts, which often limit the robustness of purely learning-based or purely geometric approaches. To address these challenges, a hybrid deep learning and computer vision pipeline is proposed that combines data-driven region localization with robust geometric fitting. A YOLOv5-based detector is first employed to identify a compact region of interest (ROI) containing circular components. Within this ROI, edge-based processing using Canny detection is applied, followed by an Edge-Snap refinement stage and robust RANSAC-based circle fitting with a Hough-transform fallback to ensure anatomically plausible circle estimation. The resulting circle centers and radii provide stable geometric parameters that can be consistently extracted across images with varying contrast, noise levels, and prosthesis appearances. The applicability of the proposed framework is demonstrated through a case study on hip prosthesis wear analysis, where the automatically detected circle parameters are used to compute medial, superior, and resultant displacement components using established two-dimensional radiographic formulations. Experimental evaluation on AP hip radiographs shows that the YOLOv5 detector achieves high ROI localization performance (mAP@0.5 = 0.971) and that the hybrid pipeline produces consistent circle parameters across longitudinal image sequences. Overall, the proposed method provides an end-to-end automatic solution for robust circle detection in X-ray imagery, with hip prosthesis wear presented solely as a case study without clinical or diagnostic claims.
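RANSAC circle fitting repeatedly fits a circle through three sampled edge points and keeps the hypothesis supported by the most inliers. A compact sketch with assumed iteration and tolerance defaults (the Edge-Snap refinement and the Hough fallback are omitted):

```python
import math, random

def circle_from_3pts(p1, p2, p3):
    """Circumcircle (cx, cy, r) through three non-collinear points."""
    (x1, y1), (x2, y2), (x3, y3) = p1, p2, p3
    d = 2.0 * (x1 * (y2 - y3) + x2 * (y3 - y1) + x3 * (y1 - y2))
    if abs(d) < 1e-12:
        return None  # collinear sample, no unique circle
    a1, a2, a3 = x1**2 + y1**2, x2**2 + y2**2, x3**2 + y3**2
    cx = (a1 * (y2 - y3) + a2 * (y3 - y1) + a3 * (y1 - y2)) / d
    cy = (a1 * (x3 - x2) + a2 * (x1 - x3) + a3 * (x2 - x1)) / d
    return cx, cy, math.hypot(x1 - cx, y1 - cy)

def ransac_circle(points, iters=200, tol=1.0, seed=0):
    """Fit a circle to noisy edge points.

    Each iteration hypothesizes a circle from 3 random points and counts
    inliers whose radial residual |dist - r| is within tol pixels; the
    best-supported hypothesis wins.
    """
    rng = random.Random(seed)
    best, best_inliers = None, -1
    for _ in range(iters):
        cand = circle_from_3pts(*rng.sample(points, 3))
        if cand is None:
            continue
        cx, cy, r = cand
        inliers = sum(1 for (x, y) in points
                      if abs(math.hypot(x - cx, y - cy) - r) <= tol)
        if inliers > best_inliers:
            best, best_inliers = cand, inliers
    return best
```

The inlier count makes the estimate robust: stray edge pixels from artifacts simply fail the residual test instead of biasing a least-squares fit.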
(This article belongs to the Section Biosignal Processing)

25 pages, 3611 KB  
Article
Automatic Estimation of Football Possession via Improved YOLOv8 Detection and DBSCAN-Based Team Classification
by Rong Guo, Yucheng Zeng, Rong Deng, Yawen Lei, Yonglin Che, Lin Yu, Jianpeng Zhang, Xiaobin Xu, Zhaoxiang Ma, Jiajin Zhang and Jianke Yang
Sensors 2026, 26(4), 1252; https://doi.org/10.3390/s26041252 - 14 Feb 2026
Viewed by 309
Abstract
Recent developments in computer vision have significantly enhanced the automation and objectivity of sports analytics. This paper proposes a novel deep learning-based framework for estimating football possession directly from broadcast video, eliminating the reliance on manual annotations or event-based data that are often labor-intensive, subjective, and temporally coarse. The framework incorporates two structurally improved object detection models: YOLOv8-P2S3A for football detection and YOLOv8-HWD3A for player detection. These models demonstrate superior accuracy compared to baseline detectors, achieving 79.4% and 71.1% validation average precision, respectively, while maintaining low computational latency. Team identification is accomplished through unsupervised DBSCAN clustering on jersey color features, enabling robust and label-free team assignment across diverse match scenarios. Object trajectories are maintained via the Norfair multi-object tracking algorithm, and a temporally aware refinement module ensures accurate estimation of ball possession durations. Extensive experiments were conducted on a dataset comprising 20 full-match video clips. The proposed system achieved a root mean square error (RMSE) of 4.87 in possession estimation, outperforming all evaluated baselines, including YOLOv10n (RMSE: 5.12) and YOLOv11 (RMSE: 5.17), with a substantial improvement over YOLOv6n (RMSE: 12.73). These results substantiate the effectiveness of the proposed framework in enhancing the precision, efficiency, and automation of football analytics, offering practical value for coaches, analysts, and sports scientists in professional settings.
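DBSCAN suits team assignment because the number of clusters need not be fixed in advance and outlier colors (referee, goalkeeper) fall out as noise. A minimal pure-Python sketch of the algorithm over jersey-color feature vectors (the eps and min_pts values in the test are assumptions, not the paper's settings):

```python
def dbscan(points, eps, min_pts):
    """Minimal DBSCAN: returns one label per point (-1 = noise).

    Core points have at least min_pts neighbors within eps; clusters grow
    by expanding the neighborhoods of density-reachable core points.
    """
    def dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

    n = len(points)
    labels = [None] * n
    cluster = -1
    for i in range(n):
        if labels[i] is not None:
            continue
        neigh = [j for j in range(n) if dist(points[i], points[j]) <= eps]
        if len(neigh) < min_pts:
            labels[i] = -1  # noise for now; may become a border point later
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in neigh if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster  # border point: claimed, not expanded
            if labels[j] is not None:
                continue
            labels[j] = cluster
            neigh_j = [k for k in range(n) if dist(points[j], points[k]) <= eps]
            if len(neigh_j) >= min_pts:
                queue.extend(neigh_j)  # j is a core point, keep growing
    return labels
```

With two dominant jersey colors, the two dense clusters map to the two teams; anything density-unreachable is ignored rather than forced into a team.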

19 pages, 1004 KB  
Article
Early Anomaly Detection in Maritime Refrigerated Containers Using a Hybrid Digital Twin and Deep Learning Framework
by Marko Vukšić, Jasmin Ćelić, Dario Ogrizović and Ana Perić Hadžić
Appl. Sci. 2026, 16(4), 1887; https://doi.org/10.3390/app16041887 - 13 Feb 2026
Viewed by 203
Abstract
Maritime refrigerated containers operate under harsh and highly variable conditions, where gradual equipment degradation can lead to temperature excursions, cargo losses, and operational disruptions. In current practice, monitoring relies largely on threshold-based temperature alarms, which are reactive and provide limited insight into early abnormal behaviour. This study proposes a hybrid framework for early anomaly detection in maritime refrigerated containers that combines a lightweight physics-based digital twin with a deep learning anomaly detector trained exclusively on fault-free operation. The approach is designed for shipboard constraints and uses only controller-level signals augmented by locally derived features, enabling low-complexity edge execution. The digital twin produces physically interpretable temperature residuals, while a convolutional autoencoder learns normal multivariate operating patterns and flags deviations via reconstruction error. Both indicators are integrated using conservative persistence gating to suppress short-lived transients typical of maritime operation. The framework is evaluated in a simulation environment calibrated to representative reefer thermal dynamics under variable ambient conditions and progressive fault injection across gradual and abrupt fault categories. Results indicate earlier and operationally credible detection compared to conventional alarms, supporting practical predictive maintenance in maritime cold-chain logistics.
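Conservative persistence gating reduces to a simple rule: only a sustained run of anomalous samples raises an alarm, so short-lived transients (door openings, defrost cycles) are suppressed. A sketch with an assumed persistence window k:

```python
def persistence_gate(flags, k=3):
    """Gate a per-sample anomaly flag sequence.

    An alarm is raised only once k consecutive samples are anomalous;
    any non-anomalous sample resets the run counter. Returns a boolean
    alarm sequence of the same length as the input.
    """
    run, alarms = 0, []
    for anomalous in flags:
        run = run + 1 if anomalous else 0
        alarms.append(run >= k)
    return alarms
```

The trade-off is a detection delay of k - 1 samples in exchange for a sharply lower transient false-alarm rate, a reasonable bargain for slowly developing refrigeration faults.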
(This article belongs to the Special Issue AI Applications in the Maritime Sector)

21 pages, 3073 KB  
Article
SARDet-MIM: Enhancing SAR Target Detection via a Structural and Scattering Masked Autoencoder
by Peiling Zhou, Ben Niu, Lijia Huang, Qiantong Wang, Yongchao Zhao, Guangyao Zhou and Yuxin Hu
Remote Sens. 2026, 18(4), 580; https://doi.org/10.3390/rs18040580 - 13 Feb 2026
Viewed by 183
Abstract
The performance of deep learning approaches for Synthetic Aperture Radar (SAR) target detection is often limited by the scarcity of annotated data. While Self-Supervised Learning (SSL) has emerged as a powerful paradigm to mitigate data dependence, its potential in SAR target detection remains largely underexplored. In this study, we propose SARDet-MIM, a comprehensive framework based on Masked Image Modeling (MIM), to enhance SAR target detection. The approach consists of two stages. In the self-supervised pre-training stage, we propose an innovative Structural and Scattering Masked Autoencoder (SSMAE) method for SAR imagery. Unlike conventional MIM methods, which typically reconstruct raw pixels, SSMAE employs a physics-aware reconstruction target comprising multi-scale gradient and SAR-Harris features. This strategy explicitly guides the network to capture discriminative structural contexts and intrinsic scattering features that benefit SAR target detection. For downstream detection, we construct a Maximally Pre-trained Detector (MPD), which integrally transfers the pre-trained ViT encoder–decoder architecture to the detection network to fully exploit pre-trained representations. Extensive experiments on three SAR target detection datasets demonstrate that SARDet-MIM consistently outperforms competing methods.
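Masked image modeling begins by hiding a large random subset of image patches; the decoder then reconstructs a target at the hidden positions (here, the paper reconstructs gradient and SAR-Harris features rather than raw pixels). A sketch of the masking step with an assumed 75% mask ratio:

```python
import random

def sample_mask(num_patches, mask_ratio=0.75, seed=0):
    """Choose which patch indices to mask for masked image modeling.

    Returns (masked, visible) index lists: the encoder sees only the
    visible patches, and the reconstruction loss is applied only at the
    masked positions.
    """
    rng = random.Random(seed)
    num_masked = int(num_patches * mask_ratio)
    masked = set(rng.sample(range(num_patches), num_masked))
    visible = [i for i in range(num_patches) if i not in masked]
    return sorted(masked), visible
```

A high ratio is what makes the pretext task hard enough to force the encoder to learn structural context rather than interpolate from nearby pixels.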

28 pages, 66640 KB  
Article
SSABNet: Spatial-Semantic Aggregation and Balancing Network for Small-Target Detection in UAV Remote Sensing Images
by Hongxing Zhang, Zhonghong Ou, Siyuan Yao, Shigeng Wang, Yang Guo and Meina Song
Remote Sens. 2026, 18(4), 550; https://doi.org/10.3390/rs18040550 - 9 Feb 2026
Viewed by 285
Abstract
The precise localization of small objects in UAV-captured remote sensing imagery remains a formidable challenge due to their limited spatial support, coarse resolution, and severe background clutter. These factors often cause weak target cues to be progressively overwhelmed during deep feature extraction. Existing deep learning-based detectors typically suffer from two fundamental limitations: the irreversible loss of fine-grained spatial details during hierarchical feature fusion and the scale-insensitive optimization of conventional loss functions, which inadequately emphasize hard-to-detect small targets. To address these issues, we propose a novel Spatial-Semantic Aggregation and Balancing Network (SSABNet) tailored for UAV-based small-target detection. First, a Spatial-Semantic Aggregation (SSA) module is introduced to establish a high-fidelity restoration pathway that recovers fine-grained texture and boundary information from shallow layers. By employing content-aware operators, SSA effectively reconciles the structural discrepancy between spatial details and semantic abstractions, enabling precise cross-scale feature fusion while suppressing aliasing artifacts. Second, we design a Scale-Aware Balancing Loss (SABL) to mitigate the gradient instability and vanishing-gradient issues commonly encountered when optimizing non-overlapping small targets. SABL adopts a scale-dependent modulation mechanism that smoothly transitions from Wasserstein distance for distributional alignment of small objects to Euclidean distance for geometric refinement of larger targets, thereby ensuring stable and balanced optimization across object scales. Extensive experiments on the VisDrone benchmark demonstrate that SSABNet outperforms state-of-the-art detectors, achieving gains of 1.3% in overall AP and 2.5% in APs. Further evaluation on the UAVDT dataset confirms its strong generalization capability, yielding improvements of 0.5% in AP and 16.9% in APs. These results validate the effectiveness of jointly addressing feature representation and scale-aware optimization for UAV small-target detection.
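A scale-dependent modulation of this kind can be sketched for 1D boxes: a sigmoid in the target scale blends a Gaussian Wasserstein term (dominant for small objects, where it stays informative even without overlap) with a Euclidean term (dominant for large ones). The transition parameters s0 and tau below are assumptions, as is the simple box-to-Gaussian mapping:

```python
import math

def w2_gaussian(mu1, s1, mu2, s2):
    """2-Wasserstein distance between 1D Gaussians N(mu, s^2)."""
    return math.hypot(mu1 - mu2, s1 - s2)

def scale_aware_loss(pred, target, s0=32.0, tau=8.0):
    """Blend distributional and geometric regression terms by scale.

    pred/target are (center, width) pairs; the target width serves as the
    object scale. alpha ~ 0 for tiny targets (Wasserstein dominates) and
    alpha ~ 1 for large ones (Euclidean dominates).
    """
    (c_p, w_p), (c_t, w_t) = pred, target
    alpha = 1.0 / (1.0 + math.exp(-(w_t - s0) / tau))
    wass = w2_gaussian(c_p, w_p / 2.0, c_t, w_t / 2.0)
    eucl = math.hypot(c_p - c_t, w_p - w_t)
    return (1.0 - alpha) * wass + alpha * eucl
```

Unlike IoU-style losses, the Wasserstein term keeps a nonzero gradient when a tiny prediction does not overlap its target at all, which is the instability SABL targets.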
(This article belongs to the Special Issue Deep Learning-Based Small-Target Detection in Remote Sensing)

28 pages, 922 KB  
Article
MAESTRO: A Multi-Scale Ensemble Framework with GAN-Based Data Refinement for Robust Malicious Tor Traffic Detection
by Jinbu Geng, Yu Xie, Jun Li, Xuewen Yu and Lei He
Mathematics 2026, 14(3), 551; https://doi.org/10.3390/math14030551 - 3 Feb 2026
Viewed by 365
Abstract
Malicious Tor traffic data contains deep domain-specific knowledge, which makes labeling challenging, and the lack of labeled data degrades the accuracy of learning-based detectors. Real-world deployments also exhibit severe class imbalance, where malicious traffic constitutes a small minority of network flows, which further reduces detection performance. In addition, Tor’s fixed 512-byte cell architecture removes packet-size diversity that many encrypted-traffic methods rely on, making feature extraction difficult. This paper proposes an efficient three-stage framework, MAESTRO v1.0, for malicious Tor traffic detection. In Stage 1, MAESTRO extracts multi-scale behavioral signatures by fusing temporal, positional, and directional embeddings at cell, direction, and flow granularities to mitigate feature homogeneity; it then compresses these representations with an autoencoder into compact latent features. In Stage 2, MAESTRO introduces an ensemble-based quality quantification method that combines five complementary anomaly detection models to produce robust discriminability scores for adaptive sample weighting, helping the classifier to emphasize high-quality samples. MAESTRO also trains three specialized GANs per minority class and applies strict five-model ensemble validation to synthesize diverse high-fidelity samples, addressing extreme class imbalance. We evaluate MAESTRO under systematic imbalance settings, ranging from the natural distribution to an extreme 1% malicious ratio. On the CCS’22 Tor malware dataset, MAESTRO achieves 92.38% accuracy, 64.79% recall, and 73.70% F1-score under the natural distribution, improving F1-score by up to 15.53% compared with state-of-the-art baselines. Under the 1% malicious setting, MAESTRO maintains 21.1% recall, which is 14.1 percentage points higher than the best baseline, while conventional methods drop below 10%.
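Combining five anomaly detectors requires putting their scores on a common scale first. One standard way to do that, shown as a sketch (the paper's exact quality-quantification method may differ), is rank averaging:

```python
def rank_average(score_lists):
    """Combine anomaly scores from several detectors via average rank.

    Each detector's scores are converted to ranks (0 = least anomalous),
    normalized to [0, 1/m] for m detectors, and summed, so heterogeneous
    score scales become comparable. Higher combined value = more
    anomalous by consensus.
    """
    n = len(score_lists[0])
    m = len(score_lists)
    combined = [0.0] * n
    for scores in score_lists:
        order = sorted(range(n), key=lambda i: scores[i])
        for rank, i in enumerate(order):
            combined[i] += rank / (m * max(n - 1, 1))
    return combined
```

Rank-based fusion is robust to one detector emitting scores orders of magnitude larger than the others, which would dominate a plain average.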
(This article belongs to the Special Issue New Advances in Network Security and Data Privacy)

39 pages, 3530 KB  
Article
AI-Based Embedded Framework for Cyber-Attack Detection Through Signal Processing and Anomaly Analysis
by Sebastian-Alexandru Drǎguşin, Robert-Nicolae Boştinaru, Nicu Bizon and Gabriel-Vasile Iana
Appl. Sci. 2026, 16(3), 1416; https://doi.org/10.3390/app16031416 - 30 Jan 2026
Viewed by 410
Abstract
This paper proposes an applied framework for cyberattack and anomaly detection in resource-constrained embedded/IoT environments by combining signal-processing feature construction with supervised and unsupervised AI (Artificial Intelligence) models. The workflow covers dataset preparation and normalization, correlation-driven feature analysis, and compact representations via PCA (Principal Component Analysis), followed by classification and anomaly scoring. In addition to the original UNSW-NB15 (University of New South Wales—Network-Based Dataset 2015) traffic features, Fourier-domain descriptors, wavelet-domain descriptors, and Kalman-based smoothing/innovation features are considered to improve robustness under variability and measurement noise. Detection performance is assessed using classical and ensemble learning methods (SVM (Support Vector Machines), RF (Random Forest), XGBoost (Extreme Gradient Boosting), LightGBM (Light Gradient Boosting Machine)), unsupervised baselines (K-Means and DBSCAN (Density-Based Spatial Clustering of Applications with Noise)), and DL (Deep-Learning) anomaly detectors based on Autoencoder reconstruction and GAN (Generative Adversarial Network)-based scoring. Experimental results on UNSW-NB15 indicate that ensemble-based models provide the strongest overall detection performance, while the signal-processing augmentation and PCA-based compactness support efficient deployment in embedded contexts. The findings confirm that integrating lightweight signal processing with AI-driven models enables effective and adaptable identification of malicious network traffic, supporting deployment-oriented embedded cybersecurity and motivating future real-time validation on edge hardware.
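Kalman-based smoothing/innovation features can be illustrated with a scalar random-walk filter: the innovation (measurement minus prediction) stays small for traffic consistent with recent dynamics and spikes on abrupt changes, making it a cheap anomaly feature. The noise variances q and r below are assumed values:

```python
def kalman_innovations(zs, q=1e-3, r=0.5):
    """Scalar Kalman filtering of a noisy signal.

    Uses a random-walk process model. Returns (estimates, innovations);
    the innovation at each step is the measurement minus the predicted
    state, computed before the update.
    """
    x, p = zs[0], 1.0            # initial state estimate and covariance
    est, innov = [x], [0.0]
    for z in zs[1:]:
        p = p + q                # predict: covariance grows by process noise
        k = p / (p + r)          # Kalman gain
        innov.append(z - x)      # innovation w.r.t. the prediction
        x = x + k * (z - x)      # update state toward the measurement
        p = (1.0 - k) * p        # update covariance
        est.append(x)
    return est, innov
```

Feeding the innovation sequence (rather than the raw signal) to a classifier whitens slow trends and highlights exactly the abrupt deviations an attack tends to produce.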
(This article belongs to the Section Electrical, Electronics and Communications Engineering)

24 pages, 1253 KB  
Article
Re-Evaluating Android Malware Detection: Tabular Features, Vision Models, and Ensembles
by Prajwal Hosahalli Dayananda and Zesheng Chen
Electronics 2026, 15(3), 544; https://doi.org/10.3390/electronics15030544 - 27 Jan 2026
Viewed by 434
Abstract
Static, machine learning-based malware detection is widely used in Android security products, where even small increases in false-positive rates can impose significant burdens on analysts and cause unacceptable disruptions for end users. Both tabular features and image-based representations have been explored for Android malware detection. However, existing public benchmark datasets do not provide paired tabular and image representations for the same samples, limiting direct comparisons between tabular models and vision-based models. This work investigates whether carefully engineered, domain-specific tabular features can match or surpass the performance of state-of-the-art deep vision models under strict false-positive-rate constraints, and whether ensemble approaches justify their additional complexity. To enable this analysis, we construct a large corpus of Android applications with paired static representations and evaluate six popular machine learning models on the exact same samples: two tabular models using EMBER features, two tabular models using extended EMBER features, and two vision-based models using malware images. Our results show that a LightGBM model trained on extended EMBER features outperforms all other evaluated models, as well as a state-of-the-art approach trained on a much larger dataset. Furthermore, we develop an ensemble model combining both tabular and vision-based detectors, which yields a modest performance improvement but at the cost of substantial additional computational and engineering overhead.
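Evaluating under a strict false-positive-rate constraint means calibrating the decision threshold on benign data rather than maximizing accuracy. A sketch of that calibration step (quantile-based thresholding is a common convention, not necessarily this paper's exact procedure):

```python
def threshold_at_fpr(benign_scores, target_fpr):
    """Pick a detection threshold from benign calibration scores.

    Chooses the threshold so that at most target_fpr of benign samples
    score strictly above it (higher score = more malicious-looking).
    """
    ranked = sorted(benign_scores, reverse=True)
    allowed = int(len(ranked) * target_fpr)   # benign samples we may flag
    return ranked[allowed] if allowed < len(ranked) else ranked[-1]
```

Models are then compared by their detection rate at this fixed operating point, which reflects deployment reality better than a single accuracy or AUC number.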
(This article belongs to the Special Issue Feature Papers in Networks: 2025–2026 Edition)

27 pages, 49730 KB  
Article
AMSRDet: An Adaptive Multi-Scale UAV Infrared-Visible Remote Sensing Vehicle Detection Network
by Zekai Yan and Yuheng Li
Sensors 2026, 26(3), 817; https://doi.org/10.3390/s26030817 - 26 Jan 2026
Viewed by 339
Abstract
Unmanned Aerial Vehicle (UAV) platforms enable flexible and cost-effective vehicle detection for intelligent transportation systems, yet small-scale vehicles in complex aerial scenes pose substantial challenges from extreme scale variations, environmental interference, and single-sensor limitations. We present AMSRDet (Adaptive Multi-Scale Remote Sensing Detector), an adaptive multi-scale detection network fusing infrared (IR) and visible (RGB) modalities for robust UAV-based vehicle detection. Our framework comprises four novel components: (1) a MobileMamba-based dual-stream encoder extracting complementary features via Selective State-Space 2D (SS2D) blocks with linear complexity O(HWC), achieving 2.1× efficiency improvement over standard Transformers; (2) a Cross-Modal Global Fusion (CMGF) module capturing global dependencies through spatial-channel attention while suppressing modality-specific noise via adaptive gating; (3) a Scale-Coordinate Attention Fusion (SCAF) module integrating multi-scale features via coordinate attention and learned scale-aware weighting, improving small object detection by 2.5 percentage points; and (4) a Separable Dynamic Decoder generating scale-adaptive predictions through content-aware dynamic convolution, reducing computational cost by 48.9% compared to standard DETR decoders. On the DroneVehicle dataset, AMSRDet achieves 45.8% mAP@0.5:0.95 (81.2% mAP@0.5) at 68.3 Frames Per Second (FPS) with 28.6 million (M) parameters and 47.2 Giga Floating Point Operations (GFLOPs), outperforming twenty state-of-the-art detectors including YOLOv12 (+0.7% mAP), DEIM (+0.8% mAP), and Mamba-YOLO (+1.5% mAP). Cross-dataset evaluation on Camera-vehicle yields 52.3% mAP without fine-tuning, demonstrating strong generalization across viewpoints and scenarios.
(This article belongs to the Special Issue AI and Smart Sensors for Intelligent Transportation Systems)
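The adaptive gating described for the CMGF module can be illustrated in miniature. The following NumPy sketch shows one plausible reading of gated cross-modal fusion, where channel statistics from both streams drive a sigmoid gate that blends RGB and IR features; the function names, the global-average-pooling gate input, and the convex-combination form are assumptions for illustration, not the paper's implementation.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def adaptive_gated_fusion(rgb_feat, ir_feat, w_gate, b_gate):
    """Fuse RGB and IR feature maps with a learned per-channel gate.

    rgb_feat, ir_feat: (C, H, W) feature maps from the two streams.
    w_gate, b_gate:    parameters mapping concatenated per-channel
                       statistics (2C,) to per-channel gate weights (C,).
    Returns a (C, H, W) fused map: g * rgb + (1 - g) * ir.
    """
    # Global average pooling summarises each modality per channel.
    stats = np.concatenate([rgb_feat.mean(axis=(1, 2)),
                            ir_feat.mean(axis=(1, 2))])   # (2C,)
    g = sigmoid(w_gate @ stats + b_gate)                  # (C,) in (0, 1)
    g = g[:, None, None]                                  # broadcast over H, W
    return g * rgb_feat + (1.0 - g) * ir_feat

# Toy example: C=4 channels, 8x8 spatial maps.
rng = np.random.default_rng(0)
rgb = rng.standard_normal((4, 8, 8))
ir = rng.standard_normal((4, 8, 8))
w = rng.standard_normal((4, 8)) * 0.1
b = np.zeros(4)
fused = adaptive_gated_fusion(rgb, ir, w, b)
print(fused.shape)  # (4, 8, 8)
```

Because the gate is bounded in (0, 1), each fused channel is a convex combination of the two modalities, which is one simple way to suppress modality-specific noise while keeping both signals available.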

21 pages, 1284 KB  
Article
Probabilistic Indoor 3D Object Detection from RGB-D via Gaussian Distribution Estimation
by Hyeong-Geun Kim
Mathematics 2026, 14(3), 421; https://doi.org/10.3390/math14030421 - 26 Jan 2026
Viewed by 277
Abstract
Conventional object detectors represent each object by a deterministic bounding box, regressing its center and size from RGB images. However, such discrete parameterization ignores the inherent uncertainty in object appearance and geometric projection, which can be more naturally modeled as a probabilistic density field. Recent works have introduced Gaussian-based formulations that treat objects as distributions rather than boxes, yet they remain limited to 2D images or require late fusion between image and depth modalities. In this paper, we propose a unified Gaussian-based framework for direct 3D object detection from RGB-D inputs. Our method is built upon a vision transformer backbone to effectively capture global context. Instead of separately embedding RGB and depth features or refining depth within region proposals, our method takes a full four-channel RGB-D tensor and predicts the mean and covariance of a 3D Gaussian distribution for each object in a single forward pass. We extend a pretrained vision transformer to accept four-channel inputs by augmenting the patch embedding layer while preserving ImageNet-learned representations. This formulation allows the detector to represent both object location and geometric uncertainty in 3D space. By optimizing divergence metrics such as the Kullback–Leibler or Bhattacharyya distances between predicted and target distributions, the network learns a physically consistent probabilistic representation of objects. Experimental results on the SUN RGB-D benchmark demonstrate that our approach achieves competitive performance compared to state-of-the-art point-cloud-based methods while offering uncertainty-aware and geometrically interpretable 3D detections. Full article
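The divergence objective named in the abstract has a closed form for Gaussians. As a minimal NumPy sketch (the function name is illustrative, not from the paper), the KL divergence between a target and a predicted k-dimensional Gaussian is:

```python
import numpy as np

def kl_gaussian(mu0, cov0, mu1, cov1):
    """Closed-form KL( N(mu0, cov0) || N(mu1, cov1) ) for k-dim Gaussians."""
    k = mu0.shape[0]
    cov1_inv = np.linalg.inv(cov1)
    diff = mu1 - mu0
    return 0.5 * (np.trace(cov1_inv @ cov0)        # covariance mismatch
                  + diff @ cov1_inv @ diff          # center offset
                  - k
                  + np.log(np.linalg.det(cov1) / np.linalg.det(cov0)))

# Identical distributions have zero divergence.
mu = np.array([1.0, 2.0, 0.5])
cov = np.diag([0.3, 0.2, 0.4])
print(round(kl_gaussian(mu, cov, mu, cov), 6))      # prints 0.0
# Shifting the predicted center strictly increases the loss.
print(kl_gaussian(mu, cov, mu + 0.1, cov) > 0)      # prints True
```

Optimizing such a divergence penalizes both center error and covariance (i.e., geometric-uncertainty) mismatch in a single differentiable term, which is what lets the detector learn location and uncertainty jointly.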

20 pages, 1567 KB  
Article
Deformable Pyramid Sparse Transformer for Semi-Supervised Driver Distraction Detection
by Qiang Zhao, Zhichao Yu, Jiahui Yu, Simon James Fong, Yuchu Lin, Rui Wang and Weiwei Lin
Sensors 2026, 26(3), 803; https://doi.org/10.3390/s26030803 - 25 Jan 2026
Viewed by 337
Abstract
Ensuring sustained driver attention is critical for intelligent transportation safety systems; however, the performance of data-driven driver distraction detection models is often limited by the high cost of large-scale manual annotation. To address this challenge, this paper proposes an adaptive semi-supervised driver distraction detection framework based on teacher–student learning and deformable pyramid feature fusion. The framework leverages a limited amount of labeled data together with abundant unlabeled samples to achieve robust and scalable distraction detection. An adaptive pseudo-label optimization strategy is introduced, incorporating category-aware pseudo-label thresholding, delayed pseudo-label scheduling, and a confidence-weighted pseudo-label loss to dynamically balance pseudo-label quality and training stability. To enhance fine-grained perception of subtle driver behaviors, a Deformable Pyramid Sparse Transformer (DPST) module is integrated into a lightweight YOLOv11 detector, enabling precise multi-scale feature alignment and efficient cross-scale semantic fusion. Furthermore, a teacher-guided feature consistency distillation mechanism is employed to promote semantic alignment between teacher and student models at the feature level, mitigating the adverse effects of noisy pseudo-labels. Extensive experiments conducted on the Roboflow Distracted Driving Dataset demonstrate that the proposed method outperforms representative fully supervised baselines in terms of mAP@0.5 and mAP@0.5:0.95 while maintaining a balanced trade-off between precision and recall. These results indicate that the proposed framework provides an effective and practical solution for real-world driver monitoring systems under limited annotation conditions. Full article
(This article belongs to the Section Vehicular Sensing)
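The pseudo-label strategy above combines two ideas that can be sketched compactly: category-aware thresholding drops low-confidence teacher predictions per class, and the surviving boxes are weighted by teacher confidence. The following NumPy sketch is a minimal reading under assumed interfaces (the function name, array shapes, and normalization are illustrative, not the paper's code):

```python
import numpy as np

def pseudo_label_loss(conf, labels, losses, class_thresh):
    """Confidence-weighted pseudo-label loss with per-class thresholds.

    conf:         (N,) teacher confidences for N pseudo-labeled boxes.
    labels:       (N,) predicted class indices.
    losses:       (N,) per-box student detection losses.
    class_thresh: (K,) confidence threshold for each of K classes.

    Boxes below their class threshold are dropped; the rest are
    weighted by teacher confidence so noisier labels contribute less.
    """
    keep = conf >= class_thresh[labels]          # category-aware filtering
    if not keep.any():
        return 0.0
    w = conf[keep]                               # confidence weighting
    return float((w * losses[keep]).sum() / w.sum())

conf = np.array([0.9, 0.4, 0.8])
labels = np.array([0, 1, 1])
losses = np.array([1.0, 5.0, 2.0])
thresh = np.array([0.5, 0.6])                    # per-class thresholds
# The 0.4-confidence box of class 1 falls below its 0.6 threshold
# and is excluded; the remaining losses are confidence-averaged.
print(pseudo_label_loss(conf, labels, losses, thresh))
```

Delayed pseudo-label scheduling, also mentioned in the abstract, would simply gate when this loss term is enabled during training (e.g., after a warm-up phase on labeled data).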
