MDPI - Publisher of Open Access Journals

19 pages, 5143 KB

Open Access Review

by Zhiling Liu, Qiufeng Yan and Qingyu Liu

Micromachines 2026, 17(4), 400; https://doi.org/10.3390/mi17040400 (registering DOI) - 25 Mar 2026

Linear ultrasonic motors (LUSMs) occupy an important position in the field of high-precision actuation due to their advantages of simple structure, high control accuracy and direct linear motion generation. This review first classifies LUSMs according to wave modes into traveling wave linear ultrasonic motors (TWLUSMs) and standing wave linear ultrasonic motors (SWLUSMs). Among them, TWLUSMs include the straight beam type and the annular beam type, while SWLUSMs consist of the single-foot type and the multi-foot type. In addition, the working principles of TWLUSMs and SWLUSMs are elaborated. The structural characteristics and performance parameters of different types of ultrasonic motors (USMs) are sorted out, and the analysis shows that SWLUSMs are significantly superior to TWLUSMs in terms of output speed and output force. This review summarizes the application status of LUSMs in fields such as biomedicine, deep-sea exploration, aerospace and precision manufacturing, and finally outlines the development trends of LUSMs from the aspects of miniaturization and lightweighting, extreme environment adaptability and intelligent upgrade. This review provides a comprehensive reference for the structural design, performance improvement and application expansion of LUSMs. Full article

32 pages, 7914 KB

Open Access Article

UAV Target Detection and Tracking Integrating a Dynamic Brain–Computer Interface

by Jun Wang, Zanyang Li, Lirong Yan, Muhammad Imtiaz, Hang Li, Muhammad Usman Shoukat, Jianatihan Jinsihan, Benjun Feng, Yi Yang, Fuwu Yan, Shumo He and Yibo Wu

Drones 2026, 10(3), 222; https://doi.org/10.3390/drones10030222 - 21 Mar 2026

Viewed by 224

Abstract

To address the inherent limitations in the robustness of fully autonomous unmanned aerial vehicle (UAV) visual perception and the high cognitive workload associated with manual control, this paper proposes a human-in-the-loop brain–computer interface (BCI) control framework. The system integrates steady-state visual evoked potential (SSVEP) with deep learning techniques to create a spatio-temporally dynamic interaction paradigm, enabling real-time alignment between visual targets and frequency stimuli. At the perception level, an enhanced YOLOv11 network incorporating partial convolution (PConv) and shape intersection over union (Shape-IoU) loss is developed and coupled with the DeepSort multi-object tracking algorithm. This configuration ensures high-speed execution on edge computing platforms while maintaining stable stimulus coverage over dynamic targets, thus providing a robust visual induction environment for EEG decoding. At the neural decoding level, an enhanced task-discriminant component analysis (TDCA-V) algorithm is introduced to improve signal detection stability within non-stationary flight conditions. Experimental results demonstrate that within the predefined fixation task window, the system achieves 100% success in maintaining target identity (ID). The BCI system achieved an average command recognition accuracy of 91.48% within a 1.0 s time window, with the TDCA-V algorithm significantly outperforming traditional spatial filtering methods in dynamic scenarios. These findings demonstrate the system’s effectiveness in decoupling human cognitive intent from machine execution, providing a robust solution for human–machine collaborative control. Full article

(This article belongs to the Section Artificial Intelligence in Drones (AID))

28 pages, 3863 KB

Open Access Article

DeepSORT-OCR: Design and Application Research of a Maritime Ship Target Tracking Algorithm Incorporating Hull Number Features

by Jing Ma, Xihang Su, Kehui Xu, Hongliang Yin, Zhihong Xiao, Jiale Wang and Peng Liu

Mathematics 2026, 14(6), 1062; https://doi.org/10.3390/math14061062 - 20 Mar 2026

Viewed by 141

Abstract

Maritime ship target tracking plays an important role in applications such as maritime patrol and maritime surveillance. However, complex sea conditions, similar target appearances, and long-distance imaging often lead to target identity confusion and unstable trajectories. To address these issues, in this paper, a ship multi-object tracking algorithm, DeepSORT-OCR, that integrates hull number semantic features is proposed. Based on the YOLO detection framework and the DeepSORT tracking architecture, a CBAM-ResNet network is introduced to enhance the representation of ship appearance features. An Inner-SIoU metric is adopted to improve the geometric matching of slender ship targets, while an LSTM-Adaptive Kalman Filter is employed to model the nonlinear motion patterns of ships and improve trajectory prediction stability. In addition, a Hull Number Feature Extraction module is designed in order to recognize ship hull numbers using OCR and match them with a hull number database. The extracted hull number semantic features are dynamically fused with visual appearance features to strengthen identity constraints during target association. The experimental results show that the proposed method achieves an MOTA of 66.53% on the MOT16 dataset, representing an improvement of 5.13% over DeepSORT. On the self-constructed maritime ship dataset, the method achieves an MOTA of 70.89% and an MOTP of 80.84%. Furthermore, on the hull-number subset, the MOTA further increases to 77.18%, an improvement of 7.31% compared with DeepSORT, while the number of ID switches is significantly reduced. In addition, experiments conducted on pure real data, pure synthetic data, and cross-domain evaluation settings demonstrate the stability and strong generalization capability of the proposed algorithm under different data distributions. The proposed method effectively improves the stability and identity consistency of ship multi-object tracking in complex maritime environments. Full article

(This article belongs to the Special Issue Control Theory for Multi-Agent Systems: Recent Advances and Applications)

22 pages, 3299 KB

Open Access Article

DualStream-RTNet: A Multimodal Deep Learning Framework for Grape Cultivar Classification and Soluble Solid Content Prediction

by Zhiguo Liu, Yufei Song, Aoran Liu, Xi Meng, Chang Liu, Shanshan Li, Xiangqing Wang and Guifa Teng

Foods 2026, 15(6), 1095; https://doi.org/10.3390/foods15061095 - 20 Mar 2026

Viewed by 166

Abstract

Accurate and non-destructive evaluation of grape quality is crucial for intelligent viticulture, yet most existing approaches address cultivar classification and soluble solid content (SSC) prediction as independent tasks based on single-modality data, limiting robustness and practical applicability. This study proposes DualStream-RTNet, a unified multimodal deep learning framework that simultaneously performs grape cultivar classification and SSC prediction by integrating RGB-HSV fused images and PCA-compressed hyperspectral spectra. The dual-stream architecture enables the complementary learning of external chromatic–textural cues and internal physicochemical information, while a Transformer-enhanced fusion module strengthens global representation and cross-modal correlation. A dataset of 864 berries from five grape cultivars was used to validate the model. DualStream-RTNet achieved 93.64% classification accuracy, outperforming ResNet18 and other CNN baselines, and produced more compact and consistent confusion-matrix patterns. For SSC prediction, it consistently yielded the highest performance across cultivars, with R²p values up to 0.9693 and RMSE as low as 0.2567, surpassing the PLSR, SVR, LSTM, and Transformer regression models. These results demonstrate the superiority of the proposed framework in capturing both visual and spectral characteristics. DualStream-RTNet provides an efficient and scalable solution for comprehensive grape quality assessment, offering strong potential for real-time sorting, precision grading, and smart agricultural applications.

(This article belongs to the Section Food Engineering and Technology)

25 pages, 45697 KB

Open Access Article

Research on a Real-Time Warning System for Unsafe Behaviors in Hydraulic Construction Based on DeepSORT and Improved YOLOv5s

by Yongqiang Liu, Haibin Wu and Haomin Li

Appl. Sci. 2026, 16(6), 2960; https://doi.org/10.3390/app16062960 - 19 Mar 2026

Viewed by 96

Abstract

The construction environment of hydraulic engineering is complex, while traditional safety monitoring methods suffer from low efficiency and delayed response. Although static recognition models based on improved YOLOv5s have enhanced detection accuracy, they still cannot assess behavioral persistence and struggle to achieve proactive early warning. To address this, this study integrates the improved YOLOv5s with the DeepSORT algorithm to construct an integrated real-time “detection–tracking–warning” system. The system utilizes DeepSORT to achieve stable personnel tracking in complex scenarios and triggers dynamic warnings based on spatiotemporal behavioral logic. A desktop prototype system was developed using PyQt5/PySide6. Experimental results show that the system achieves a Multiple Object Tracking Accuracy (MOTA) of 86.2% in multi-object occlusion scenarios; the accuracy of unsafe behavior warning exceeds 95%, with an average delay of less than 1.5 s. This research accomplishes a transition from passive recognition to proactive warning, providing an intelligent solution for safety management in hydraulic construction under normal illumination and visibility conditions. Full article

17 pages, 9213 KB

Open Access Article

Improved Point Cloud Representation via a Learnable Sort–Mix–Attend Mechanism

by Yuyan Zhang, Xi Wang, Zhang Yi and Lei Xu

Sensors 2026, 26(6), 1888; https://doi.org/10.3390/s26061888 - 17 Mar 2026

Viewed by 174

Abstract

Recent years have seen remarkable progress in deep learning on 3D point clouds, with hierarchical architectures becoming standard. Most work has focused on developing increasingly complex operators, such as self-attention, while enhancing the representational capacity of efficient point-wise MLP-based backbones has received less attention. We address this issue by proposing a differentiable module that learns to impose a task-driven canonical structure on local point sets. Our proposed SMA (Sort–Mix–Attend) layer dynamically serializes a neighborhood by generating a geometric basis and using a differentiable sorting mechanism. This enables an efficient MLP-based network to model rich feature interactions, adaptively modulating features prior to the final symmetric aggregation function. We demonstrate that SMA effectively enhances standard backbones for 3D classification and segmentation. Specifically, integrating SMA into PointNeXt-S achieves an Overall Accuracy (OA) of 88.3% on the challenging ScanObjectNN dataset, an improvement of 0.6% over the baseline. Furthermore, it boosts the classic PointNet++ architecture by a significant 5.2% in OA. We also introduce a highly efficient SMA-Tiny variant that achieves 86.0% OA with only 0.3 M parameters, proving the structural superiority, computational cost-effectiveness, and practical significance of our method for real-world 3D perception tasks. Full article

(This article belongs to the Special Issue Intelligent Point Cloud Processing, Sensing and Understanding—Third Edition)

22 pages, 1990 KB

Open Access Feature Paper Article

Linking Cucumber Surface Color to Internal Hydration Level Using Deep Learning for Freshness Classification

by Amin Taheri-Garavand, Theodora Makraki, Omidali Akbarpour, Aggeliki Sakellariou, Georgios Tsaniklidis and Dimitrios Fanourakis

Horticulturae 2026, 12(3), 357; https://doi.org/10.3390/horticulturae12030357 - 14 Mar 2026

Viewed by 189

Abstract

Postharvest dehydration is a major determinant of cucumber freshness and marketability, yet early reductions in internal water status are difficult to detect using conventional quality assessment methods. This study presents a non-destructive, physiology-informed deep learning approach that links cucumber surface color and texture patterns to internal hydration level for automated freshness classification. A time-resolved dataset comprising 4160 RGB images of cucumber fruits was paired with gravimetrically determined relative water content (RWC), used as an objective indicator of internal hydration status. Based on RWC, fruits were classified into four freshness categories: Very Fresh (≥98%), Moderately Fresh (95–98%), Low Freshness (90–95%), and Spoiled (<90%). A custom convolutional neural network (CNN) was trained using standardized RGB images and evaluated on an independent test set. The model achieved an overall classification accuracy of 91.35% and a Cohen’s Kappa coefficient of 0.875, indicating strong agreement between predicted and actual freshness classes. Classification performance was highest for the extreme freshness states, with F1-scores exceeding 0.94 for Very Fresh and Spoiled fruits, while intermediate classes showed greater overlap, reflecting the gradual nature of postharvest water loss. Model interpretability analyses revealed that the CNN consistently focused on physiologically meaningful surface color and texture features associated with dehydration. Overall, these findings highlight the potential of physiology-informed deep learning to advance non-destructive freshness assessment in cucumbers, offering a realistic pathway toward hydration-based sorting, improved shelf-life management, and intelligent quality monitoring in modern postharvest supply chains. Full article
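The four RWC-based freshness categories stated in the abstract can be written as a simple threshold rule. The sketch below is only an illustration of that labeling step, not the authors' code: the function name `freshness_class` is hypothetical, and because the abstract lists overlapping range endpoints (95–98% and 90–95%), assigning exactly 95% to "Moderately Fresh" is an assumption.

```python
def freshness_class(rwc_percent: float) -> str:
    """Map relative water content (RWC, in %) to a freshness category.

    Thresholds follow the abstract: >=98, 95-98, 90-95, <90.
    """
    if rwc_percent >= 98.0:
        return "Very Fresh"
    if rwc_percent >= 95.0:  # assumption: exactly 95% falls in the upper bin
        return "Moderately Fresh"
    if rwc_percent >= 90.0:  # the abstract defines Spoiled as strictly below 90%
        return "Low Freshness"
    return "Spoiled"
```

Labels produced this way would serve as the ground-truth classes that an image classifier is trained to predict from RGB inputs.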

(This article belongs to the Special Issue Application of Computer Vision Technology in Postharvest Processing of Fruits and Vegetables)

27 pages, 2940 KB

Open Access Article

A Unified Framework for Vehicle Detection, Tracking, and Counting Across Ground and Aerial Views Using Knowledge Distillation with YOLOv10-S

by Md Rezaul Karim Khan and Naphtali Rishe

Remote Sens. 2026, 18(5), 842; https://doi.org/10.3390/rs18050842 - 9 Mar 2026

Viewed by 363

Abstract

Accurate and reliable vehicle detection, tracking, and counting across different surveillance platforms are fundamental requirements for developing smart Traffic Management Systems (TMS) and promoting sustainable urban mobility. Recent advances in both ground-level surveillance and remote sensing using deep learning have opened new opportunities for extracting detailed vehicular information from high-resolution aerial and surveillance video data. This research aims to present a unified, real-time vehicle analysis framework that integrates lightweight deep learning–based detection, robust multi-object tracking, and trajectory-driven counting within a single modular pipeline. The proposed framework employs YOLOv10-S, a "You Only Look Once" detector, as the detection backbone and enhances its robustness through supervision-level knowledge distillation without introducing any architectural modifications. Temporal consistency is enforced using an observation-centric multi-object tracking algorithm (OC-SORT), enabling stable identity preservation under camera motion and dense traffic conditions. Vehicle counting is performed using a trajectory-based virtual gate strategy, reducing duplicate counts and improving counting reliability. Comprehensive experiments conducted on the UA-DETRAC and VisDrone benchmarks show that the proposed framework effectively balances detection performance, tracking robustness, counting accuracy, and real-time efficiency in both ground-based and aerial surveillance settings. Furthermore, cross-dataset evaluations under direct train–test transfer highlight the inherent challenges of domain shift while showing that knowledge distillation consistently improves robustness in detection, tracking identity consistency, and vehicle counting. Overall, this framework enables effective real-world traffic monitoring by adopting a scalable and practical system design, where reliability is prioritized over architectural complexity.
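A trajectory-based virtual gate can be illustrated with a few lines of geometry. The sketch below is a generic version of the idea rather than the paper's implementation: the names `gate_side` and `count_gate_crossings` are hypothetical, tracks are assumed to be lists of centroid points keyed by track ID, and the gate is treated as an infinite line through two points for brevity. Counting each track ID at most once is what suppresses duplicate counts.

```python
from typing import Dict, List, Tuple

Point = Tuple[float, float]


def gate_side(p: Point, a: Point, b: Point) -> float:
    """Signed cross product: positive on one side of the line a->b, negative on the other."""
    return (b[0] - a[0]) * (p[1] - a[1]) - (b[1] - a[1]) * (p[0] - a[0])


def count_gate_crossings(trajectories: Dict[int, List[Point]], a: Point, b: Point) -> int:
    """Count tracks whose centroid moves from one side of the gate to the other."""
    counted = set()
    for track_id, points in trajectories.items():
        last_side = 0.0  # 0.0 means no off-gate point has been seen yet
        for p in points:
            side = gate_side(p, a, b)
            if side == 0.0:
                continue  # point lies exactly on the gate; wait for the next one
            if last_side != 0.0 and (side > 0) != (last_side > 0):
                counted.add(track_id)  # each ID is counted once only
                break
            last_side = side
    return len(counted)
```

A production version would normally restrict the test to the gate segment and may count the two crossing directions separately.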

(This article belongs to the Section Urban Remote Sensing)

20 pages, 4810 KB

Open Access Article

Unauthorized Expressway Parking Detection Based on Spatiotemporal Analysis of Vehicle–Structure Distances Using UAV Aerial Images

by Xiaolong Gong, Haiqing Liu, Yuehao Wang, Yaxin Wei and Guoran Shi

Vehicles 2026, 8(3), 49; https://doi.org/10.3390/vehicles8030049 - 6 Mar 2026

Viewed by 312

Abstract

Owing to their high-altitude vantage point and maneuverability, unmanned aerial vehicles (UAVs) have emerged as an effective technical solution for real-time parking detection in expressway scenarios. Using UAV cruise-perspective images, this paper proposes an unauthorized parking detection method that analyzes the time-series variations in the relative distances between moving vehicles and static structures used as references. First, vehicle and static structure targets are recognized and tracked by DeepSort, and a Vehicle–Structure (V-S) distance matrix is constructed to describe their frame-wise relative positions in the pixel coordinate system. Then, to eliminate the radial scale errors caused by perspective distortion, a scale factor (SF) index is introduced to correct the original V-S matrix and provide a more accurate spatiotemporal representation. Finally, the stationarity of the distance series in the V-S matrix is tested using the Augmented Dickey–Fuller (ADF) test, and a parking detection method is proposed by introducing the parking support ratio (PSR) to establish a multi-structure joint decision scheme. Experimental results show that the corrected V-S matrix can faithfully describe the spatial positional relationship between road vehicles and static structures. With the optimal PSR threshold ψ₀ and time window T, the proposed method achieves better overall parking-detection performance in terms of accuracy, precision, recall, and F1-score than a traditional speed threshold approach.

(This article belongs to the Special Issue Air Vehicle Operations: Opportunities, Challenges and Future Trends)

27 pages, 3381 KB

Open Access Article

Fusion of Stereo Matching and Spatiotemporal Interaction Analysis: A Detection Method for Excavator-Related Struck-By Hazards in Construction Sites

by Yifan Zhu, Hainan Chen, Rui Pan, Mengqi Yuan, Pan Zhang and Wen Wang

Buildings 2026, 16(5), 1002; https://doi.org/10.3390/buildings16051002 - 4 Mar 2026

Viewed by 253

Abstract

In the construction industry, struck-by accidents involving heavy equipment such as crawler excavators are a leading cause of worker fatalities and injuries. Existing vision-based hazard detection methods are limited by approximate evaluations, reliance on specific references, and neglect of spatial relationships between equipment and workers, making them inadequate for complex dynamic construction environments. This study aims to address these limitations by proposing a precise and adaptable struck-by hazard detection method. The method integrates four core modules: object tracking via the YOLOv5-DeepSORT model to detect workers, excavators, and their key components; activity recognition to identify the operational states of excavators (working or static) and the roles of workers (driver or field worker); proximity estimation based on stereo vision using the BGNet model and camera calibration to calculate 3D spatial distances; and safety identification to assess worker safety status in real time. Validated in three virtual construction scenarios (flat ground, rugged terrain, and slope), the method achieved safety status identification accuracies of 92.71%, 90.04%, and 94.25%, respectively. The results demonstrate its robustness in adapting to diverse construction environments and accurately capturing equipment–worker spatial interactions. This research expands the application scope of hazard monitoring in complex settings, enhances safety identification efficiency, and provides a reliable technical solution for improving construction site safety management.

(This article belongs to the Special Issue Recent Advances in Intelligent Infrastructure and Construction Engineering)

41 pages, 19770 KB

Open Access Article

Vision-Based Dual-Mode Collision Risk-Warning for Aircraft Apron Monitoring

by Emre Can Bingol, Hamed Al-Raweshidy and Konstantinos Banitsas

Drones 2026, 10(3), 173; https://doi.org/10.3390/drones10030173 - 2 Mar 2026

Viewed by 398

Abstract

Ground incidents on airport aprons can cause substantial operational disruption and economic loss, while conventional surveillance (e.g., Surface Movement Radar (SMR), Closed-Circuit Television (CCTV)) often lacks the resolution and proactive decision support required for close-proximity operations. This study proposes a UAV-deployable, camera-agnostic Computer Vision (CV) framework for collision-risk warning from elevated viewpoints. An optimised YOLOv8-Seg backbone performs multi-class aircraft segmentation (airplane, wing, nose, tail, and fuselage) and is integrated with four MOT algorithms under identical evaluation settings. For quantitative tracker benchmarking, DeepSORT provides the strongest overall performance on the airplane-only MOTChallenge-format ground truth (MOTA 92.77%, recall 93.27%). To mitigate the scarcity of annotated apron-incident data, a labelled 997-frame MOT dataset is created via an MSFS simulation-based reenactment inspired by the 2018 Asiana–Turkish Airlines wing-to-tail event at Istanbul Ataturk Airport. The framework further introduces a dual-module warning mechanism that can operate independently: (i) a reactive module using image-plane proximity derived from segmentation masks, and (ii) a proactive module that predicts short-horizon conflicts via trajectory extrapolation and IoU-based future overlap analysis. The approach is evaluated on multiple simulated incident scenarios and assessed on a real apron video from Hong Kong International Airport; additionally, laboratory-scale UAV experiments using diecast aircraft models provide end-to-end feasibility evidence on unmanned-platform imagery. Overall, the results indicate timely warnings and practical feasibility for low-overhead UAV-enabled apron monitoring. Full article
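The proactive module is described as combining trajectory extrapolation with IoU-based future overlap analysis. A minimal sketch of that combination is given below under stated assumptions: boxes are axis-aligned `(x1, y1, x2, y2)` tuples, motion is extrapolated with a constant-velocity model from the last two observations (the abstract does not specify the motion model), and the names `iou`, `extrapolate`, `first_conflict_step` and the parameter `iou_threshold` are hypothetical.

```python
from typing import Optional, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2) in pixels


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def extrapolate(prev: Box, curr: Box, steps: int) -> Box:
    """Constant-velocity prediction: repeat the last per-frame displacement `steps` times."""
    return tuple(c + steps * (c - p) for p, c in zip(prev, curr))  # type: ignore[return-value]


def first_conflict_step(a_prev: Box, a_curr: Box, b_prev: Box, b_curr: Box,
                        horizon: int, iou_threshold: float = 0.0) -> Optional[int]:
    """Return the first future step at which predicted boxes overlap, or None."""
    for k in range(1, horizon + 1):
        if iou(extrapolate(a_prev, a_curr, k), extrapolate(b_prev, b_curr, k)) > iou_threshold:
            return k
    return None
```

With segmentation masks available, the same loop could compare predicted masks instead of boxes; boxes are used here only to keep the example short.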

15 pages, 8737 KB

Open Access Article

Sedimentological and Geological Mapping of the Shallow Platform and Deep Basin of Lake Faro (Cape Peloro Coastal Lagoon, Italy): New Insights into Modern Sediments and Holocene Beachrocks

by Roberta Somma, Mohammadali Ghanadzadeh Yazdi and Salvatore Giacobbe

Quaternary 2026, 9(2), 19; https://doi.org/10.3390/quat9020019 - 28 Feb 2026

Viewed by 309

Abstract

Lake Faro (Cape Peloro coastal lagoon, NE Sicily, Italy) is a distinctive Mediterranean coastal lake characterized by the coexistence of a shallow platform and a steep-sided deep basin within a very limited area. This study provides a sedimentological and geological characterization of the present-day lake floor based on grain-size, petrographic, statistical, and GIS-based analyses, with the aim of clarifying the relationship between basin morphology and modern depositional processes. The lake floor is subdivided into two main bathymetric domains. The shallow platform (<10 m water depth) is dominated by modern coarse-grained, very poorly sorted sediments, including gravel and very coarse- to medium-grained sand, deposited under high-energy, low-confinement conditions comparable to beach and open-lagoon environments. In contrast, the deep basin (>10 m water depth) is characterized by modern finer, organic-rich sediments with extremely poor sorting, reflecting lower-energy and more confined depositional conditions. A key new finding is the identification of upper Holocene beachrocks beneath the modern unconsolidated sediments of the shallow platform, which likely exert a significant morpho-structural control on platform development. Overall, the results highlight the strong influence of bathymetry on sediment distribution in coastal lake systems and provide a reference framework for comparable Mediterranean lagoon environments. Full article

18 pages, 7252 KB

Open Access Article

Frequency-Based Deep Occlusion Awareness Instance Segmentation

by Yasin Güzel, Zafer Aydın and Muhammed Fatih Talu

Mathematics 2026, 14(5), 792; https://doi.org/10.3390/math14050792 - 26 Feb 2026

Viewed by 280

Abstract

One major challenge faced by deep learning-based methods that detect target objects in the form of bounding boxes is object occlusion. High degrees of occlusion significantly diminish the accuracy of instance segmentation. Nonetheless, complex-valued Fourier descriptors can robustly represent object boundaries using minimal information. In this study, the impact of integrating Fourier descriptors—renowned for their strong representational capacity—with deep network models (UNet) that exhibit high generalization performance on instance segmentation accuracy was investigated. Within the scope of the research, nine network models were designed based on different strategies for utilizing frequency components. These variants fall into four strategy families: (i) UNet-style spectrum regression on fixed low-frequency windows (FUNet), (ii) magnitude-guided frequency selection/ROI construction (FUNet–Thr, FUNet–BBox), (iii) sequence models over tokenized FFT coefficients (BiLSTM Patch/Sorted), and (iv) encoder-only spectrum predictors with different depth/capacity (EncoderFFT1/2). To fairly evaluate the models’ performance in segmenting objects subjected to disruptive factors (e.g., occlusion, blurring, noise), a specialized synthetic dataset was prepared. The task is formulated as single-target (single-instance), single-class segmentation. This dataset, automatically generated according to initial parameter values, contains images of objects moving at various speeds within a single frame. Among these models, the one termed FUNet, which relies on partial matching of central frequency components, achieved the highest segmentation accuracy despite the disruptive effects. Under the challenging Dataset 8 setting, the proposed FUNet achieved the highest overlap-based performance (Dice = 0.9329, IoU = 0.8842) among Attention U-Net, U-Net, and FourierNet, with statistically significant gains confirmed by paired per-image tests. Full article

22 pages, 1503 KB

Open Access Article

Multi-Objective Collaborative Optimization for Low-Carbon Cold-Chain Routing with Dynamic Demand

by Qiaoying Hu, Xiangxin Liu and Xiaoyun Jiang

Mathematics 2026, 14(5), 753; https://doi.org/10.3390/math14050753 - 24 Feb 2026

Viewed by 285

Abstract

To address the challenges of high energy consumption, substantial carbon emissions, and dynamic customer demand in cold-chain logistics, this paper investigates the balance between sustainable development and operational efficiency for low-carbon distribution. We construct a Multi-Objective Low-Carbon Cold-Chain Vehicle Routing Problem with Dynamic Demand (MO-LC-CCDVRP) model that jointly minimizes the comprehensive cost (vehicle dispatch, transportation adjustments, carbon emissions, and refrigeration) while maximizing customer satisfaction. To solve this model efficiently, we propose a novel deep reinforcement learning-enhanced Non-Dominated Sorting Genetic Algorithm II (DRL-NSGA-II). By using DRL to adaptively control the genetic operators, this algorithm significantly enhances both the convergence speed and distribution quality of the Pareto front. The solution process occurs in two stages: first, high-quality initial routes are generated from static information; then, upon dynamic information updates, rapid replanning is performed for unserved customers. Numerical experiments using adapted Solomon benchmark instances demonstrate the superiority of the proposed algorithm. Furthermore, a dynamic distribution case study confirms the model’s effectiveness, and a sensitivity analysis elucidates the complex impact of carbon pricing on total cost, customer satisfaction, and carbon emissions.
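
The notion of a Pareto front for the two objectives named above (lower cost, higher customer satisfaction) can be sketched as a plain non-dominated filter. This is only the dominance test that NSGA-II-style methods build on, not the DRL-NSGA-II algorithm itself; the function names and candidate values are invented for illustration.

```python
def dominates(a, b):
    """Return True if solution a dominates solution b.

    Each solution is a (cost, satisfaction) pair: cost is minimized and
    satisfaction is maximized (an assumed encoding for this sketch).
    """
    no_worse = a[0] <= b[0] and a[1] >= b[1]
    strictly_better = a[0] < b[0] or a[1] > b[1]
    return no_worse and strictly_better

def pareto_front(solutions):
    """Keep only the solutions that no other solution dominates."""
    return [s for s in solutions
            if not any(dominates(other, s) for other in solutions)]

# Made-up candidate routing plans: (total cost, customer satisfaction).
candidates = [(100.0, 0.90), (120.0, 0.95), (110.0, 0.85), (100.0, 0.80)]
front = pareto_front(candidates)
```

Here the plans costing 110 and the second plan costing 100 drop out because the first plan is at least as cheap and more satisfying; the remaining two represent a genuine trade-off.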

(This article belongs to the Section E1: Mathematics and Computer Science)

25 pages, 21968 KB

Open AccessArticle

A Study on Bus Passenger Boarding and Alighting Detection and Recognition Based on Video Images and YOLO Algorithm

by Wei Xu, Yushan Zhao, Xiaodong Du, Haoyang Ji and Lei Xing

Sensors 2026, 26(5), 1418; https://doi.org/10.3390/s26051418 - 24 Feb 2026

Viewed by 387

Abstract

Public transportation is central to easing urban traffic congestion, reducing pollution and making smart city transportation more intelligent. Its refined operation relies heavily on accurate, real-time passenger origin–destination (OD) data. However, traditional manual surveys are costly and have low sampling rates, while smart card big data lacks alighting information and contains deviations, so it fails to reflect real travel behavior and has become a bottleneck for intelligent public transportation development. To address this, this paper proposes a method for detecting and recognizing bus passenger boarding and alighting based on video images and the YOLO algorithm. To overcome traditional YOLO’s shortcomings in on-vehicle scenarios (insufficient feature extraction, inefficient feature fusion, slow convergence), the baseline YOLOv8n is improved for the high density, heavy occlusion and variable target scales of bus scenes: (1) the DAC2f structure (deformable attention + C2f) captures occluded passengers’ core features and suppresses background interference; (2) SWD-PAN enables bidirectional cross-scale feature interaction to adapt to scale differences; and (3) WIoUv3 balances sample weights for small targets and passengers in non-standard postures. Experiments show that precision, recall and mAP increase by 3.68%, 5.12% and 6.26%, respectively, while meeting real-time requirements. The improved YOLOv8 is integrated with DeepSORT to enhance tracking stability. Tests show that MOTA reaches 31.24% (2.6% higher than YOLOv8n, 16.4% higher than YOLO-X) and MOTP reaches 88.06%, solving trajectory breakage and ID switching. This addresses the pain points of traditional OD data collection and provides technical support for the refined management of intelligent public transportation and for smart city transportation optimization.
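
For readers unfamiliar with the tracking metric quoted above, MOTA under the standard CLEAR MOT definition is one minus the ratio of all tracking errors (missed targets, false positives, identity switches) to the number of ground-truth objects, summed over frames. The sketch below assumes that definition; the counts in the example are made up and are not taken from the paper.

```python
def mota(false_negatives, false_positives, id_switches, num_gt):
    """Multiple Object Tracking Accuracy (CLEAR MOT definition).

    All arguments are totals accumulated over every frame of a sequence.
    """
    if num_gt <= 0:
        raise ValueError("num_gt must be positive")
    errors = false_negatives + false_positives + id_switches
    return 1.0 - errors / num_gt

# Hypothetical totals: 50 misses, 15 false alarms, 5 ID switches, 100 GT boxes.
score = mota(false_negatives=50, false_positives=15, id_switches=5, num_gt=100)
```

Because misses count against the score as heavily as false alarms, dense and heavily occluded scenes can yield a low MOTA even when localization precision (MOTP) is high.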

(This article belongs to the Collection Computer Vision Based Smart Sensing)


Search Results (528)