Search Results (412)

Search Parameters:
Keywords = traffic scene

20 pages, 7016 KiB  
Article
Design, Analysis and Control of Tracked Mobile Robot with Passive Suspension on Rugged Terrain
by Junfeng Gao, Yi Li, Jingfu Jin, Zhicheng Jia and Chao Wei
Actuators 2025, 14(8), 389; https://doi.org/10.3390/act14080389 - 6 Aug 2025
Abstract
As tracked mobile robots are increasingly applied in detection and rescue, improving their stability and trafficability has become a research focus. To improve the driving ability and trafficability of tracked mobile robots on rugged terrain, this paper proposes a new type of tracked mobile robot using passive suspension. By adding a connecting-rod differential mechanism between the left and right track mechanisms, the contact stability between the track and the terrain is enhanced. The kinematic model and attitude relationship of the suspension are established and analyzed, and the rationality of the passive suspension scheme is verified by dynamic simulation. The simulation results show that the tracked robot with passive suspension exhibits good obstacle-surmounting performance but suffers from heading deflection. Therefore, a track drive-speed compensation control based on the driving state is proposed, which effectively solves the problems of slip and heading deflection. Field tests of a robot prototype verify the effectiveness of the suspension scheme and control system, providing a useful reference for the design and performance improvement of tracked mobile robots in complex field scenes.
(This article belongs to the Section Actuators for Robotics)
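The heading-deflection compensation described in the abstract can be sketched as a simple differential speed law over the two tracks. This is a minimal illustration, not the paper's controller: the proportional gain `k_p`, the clamp `v_max`, and the sign convention for `yaw_error` are all assumptions made for the example.

```python
def compensate_track_speeds(v_cmd, yaw_error, k_p=0.5, v_max=1.0):
    """Differential track-speed compensation for heading deflection.

    v_cmd:     commanded forward track speed (m/s)
    yaw_error: heading deviation from the desired course (rad);
               positive is assumed to mean drifting left, so the
               left track speeds up and the right track slows down
    k_p:       proportional gain (hypothetical tuning value)
    """
    delta = k_p * yaw_error
    v_left = min(v_max, max(-v_max, v_cmd + delta))
    v_right = min(v_max, max(-v_max, v_cmd - delta))
    return v_left, v_right

# Straight driving with no deflection: both tracks stay equal.
print(compensate_track_speeds(0.8, 0.0))  # (0.8, 0.8)
# Drifting left: the left track speeds up relative to the right.
print(compensate_track_speeds(0.8, 0.2))
```

A real controller would add integral/derivative terms and slip estimation, but the differential structure is the core idea.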

17 pages, 3439 KiB  
Article
Delay Prediction Through Multi-Channel Traffic and Weather Scene Image: A Deep Learning-Based Method
by Ligang Yuan, Linghua Kong and Haiyan Chen
Appl. Sci. 2025, 15(15), 8604; https://doi.org/10.3390/app15158604 - 3 Aug 2025
Viewed by 180
Abstract
Accurate prediction of airport delays under convective weather conditions is essential for effective traffic coordination and improving overall airport efficiency. Traditional methods mainly rely on numerical weather and traffic indicators, but they often fail to capture the spatial distribution of traffic flows within the terminal area. To address this limitation, we propose a novel image-based representation named Multi-Channel Traffic and Weather Scene Image (MTWSI), which maps both meteorological and traffic information onto a two-dimensional airspace grid, thereby preserving spatial relationships. Based on the MTWSI, we develop a delay prediction model named ADLCNN. This model first uses a convolutional neural network to extract deep spatial features from the scene images and then classifies each sample into a delay level. Using real operational data from Guangzhou Baiyun Airport, this paper shows that ADLCNN achieves significantly higher prediction accuracy compared to traditional machine learning methods. The results confirm that MTWSI provides a more accurate representation of real traffic conditions under convective weather.
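The MTWSI idea of rasterizing heterogeneous observations onto a shared airspace grid can be sketched as follows. The grid size, spatial extent, and channel layout (channel 0 for weather intensity, channel 1 for traffic counts) are illustrative assumptions, not the paper's actual encoding.

```python
import numpy as np

def build_scene_image(obs, grid=8, extent=80.0, channels=2):
    """Rasterize point observations onto a multi-channel airspace grid.

    obs: iterable of (x_km, y_km, channel, value) tuples, assumed to lie
         inside [0, extent) x [0, extent); e.g. channel 0 = radar echo
         intensity, channel 1 = traffic count (hypothetical layout).
    Returns a (channels, grid, grid) array that preserves the spatial
    relationships the abstract argues scalar indicators lose.
    """
    img = np.zeros((channels, grid, grid))
    cell = extent / grid
    for x, y, c, v in obs:
        i = min(grid - 1, int(y // cell))  # row index from y
        j = min(grid - 1, int(x // cell))  # column index from x
        img[c, i, j] += v                  # accumulate values per cell
    return img

obs = [(5.0, 5.0, 0, 1.0), (75.0, 75.0, 1, 3.0), (6.0, 4.0, 0, 0.5)]
img = build_scene_image(obs)
print(img.shape)     # (2, 8, 8)
print(img[0, 0, 0])  # 1.5 -> both weather points fall in cell (0, 0)
```

A CNN classifier such as the paper's ADLCNN would then consume this `(channels, grid, grid)` tensor directly.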

21 pages, 5817 KiB  
Article
UN15: An Urban Noise Dataset Coupled with Time–Frequency Attention for Environmental Sound Classification
by Yu Shen, Ge Cao, Huan-Yu Dong, Bo Dong and Chang-Myung Lee
Appl. Sci. 2025, 15(15), 8413; https://doi.org/10.3390/app15158413 - 29 Jul 2025
Viewed by 168
Abstract
With the increasing severity of urban noise pollution, its detrimental impact on public health has garnered growing attention. However, accurate identification and classification of noise sources in complex urban acoustic environments remain major technical challenges for achieving refined noise management. To address this issue, this study presents two key contributions. First, we construct a new urban noise classification dataset, namely the urban noise 15-category dataset (UN15), which consists of 1620 audio clips from 15 representative categories, including traffic, construction, crowd activity, and commercial noise, recorded from diverse real-world urban scenes. Second, we propose a novel deep neural network architecture based on a residual network and integrated with a time–frequency attention mechanism, referred to as residual network with temporal–frequency attention (ResNet-TF). Extensive experiments conducted on the UN15 dataset demonstrate that ResNet-TF outperforms several mainstream baseline models in both classification accuracy and robustness. These results not only verify the effectiveness of the proposed attention mechanism but also establish the UN15 dataset as a valuable benchmark for future research in urban noise classification.
(This article belongs to the Section Acoustics and Vibrations)
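Time–frequency attention of the kind ResNet-TF integrates can be illustrated on a spectrogram with two separable weight vectors, one over frequency bands and one over time frames. This is a minimal energy-based stand-in; the paper's attention weights are learned, not derived from mean energy as here.

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def time_freq_attention(spec):
    """Reweight a (freq, time) spectrogram with separate frequency and
    time attention vectors derived from mean band/frame energy — a
    hand-rolled sketch of the learned attention in a ResNet-TF-style
    model, not its actual mechanism."""
    f_att = softmax(spec.mean(axis=1))  # one weight per frequency band
    t_att = softmax(spec.mean(axis=0))  # one weight per time frame
    return spec * f_att[:, None] * t_att[None, :]

spec = np.zeros((4, 5))
spec[2, :] = 1.0  # a tonal component concentrated in band 2
out = time_freq_attention(spec)
# The energetic band keeps the largest share of the output energy.
assert out[2].sum() > out[0].sum()
```

In the real architecture these weights come from small learned branches and are applied between residual blocks.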

19 pages, 1196 KiB  
Article
The Effects of Landmark Salience on Drivers’ Spatial Cognition and Takeover Performance in Autonomous Driving Scenarios
by Xianyun Liu, Yongdong Zhou and Yunhong Zhang
Behav. Sci. 2025, 15(7), 966; https://doi.org/10.3390/bs15070966 - 16 Jul 2025
Viewed by 244
Abstract
With the increasing prevalence of autonomous vehicles (AVs), drivers’ spatial cognition and takeover performance have become critical to traffic safety. This study investigates the effects of landmark salience—specifically visual and structural salience—on drivers’ spatial cognition and takeover behavior in autonomous driving scenarios. Two simulator-based experiments were conducted. Experiment 1 examined the impact of landmark salience on spatial cognition tasks, including route re-cruise, scene recognition, and sequence recognition. Experiment 2 assessed its effects on takeover performance. The results indicate that salient landmarks generally enhance spatial cognition, while visual and structural salience differ in scope and function. Landmarks with high visual salience not only improved drivers’ accuracy in intersection decisions but also significantly reduced takeover reaction time. In contrast, structurally salient landmarks had a more pronounced effect on memory-based tasks, such as scene and sequence recognition, but a limited influence on dynamic decision-making tasks like takeover response. These findings underscore the differentiated roles of visual and structural landmark features and highlight the importance of visually salient landmarks in supporting both navigation and timely takeover during autonomous driving. The results provide practical insights for urban road design, advocating the strategic placement of visually prominent landmarks at key decision points to enhance both navigational efficiency and traffic safety.
(This article belongs to the Section Cognition)

17 pages, 5189 KiB  
Article
YOLO-Extreme: Obstacle Detection for Visually Impaired Navigation Under Foggy Weather
by Wei Wang, Bin Jing, Xiaoru Yu, Wei Zhang, Shengyu Wang, Ziqi Tang and Liping Yang
Sensors 2025, 25(14), 4338; https://doi.org/10.3390/s25144338 - 11 Jul 2025
Viewed by 561
Abstract
Visually impaired individuals face significant challenges in navigating safely and independently, particularly under adverse weather conditions such as fog. To address this issue, we propose YOLO-Extreme, an enhanced object detection framework based on YOLOv12, specifically designed for robust navigation assistance in foggy environments. The proposed architecture incorporates three novel modules: the Dual-Branch Bottleneck Block (DBB) for capturing both local spatial and global semantic features, the Multi-Dimensional Collaborative Attention Module (MCAM) for joint spatial-channel attention modeling to enhance salient obstacle features and reduce background interference in foggy conditions, and the Channel-Selective Fusion Block (CSFB) for robust multi-scale feature integration. Comprehensive experiments conducted on the Real-world Task-driven Traffic Scene (RTTS) foggy dataset demonstrate that YOLO-Extreme achieves state-of-the-art detection accuracy and maintains high inference speed, outperforming existing dehazing-and-detect and mainstream object detection methods. To further verify the generalization capability of the proposed framework, we also performed cross-dataset experiments on the Foggy Cityscapes dataset, where YOLO-Extreme consistently demonstrated superior detection performance across diverse foggy urban scenes. The proposed framework significantly improves the reliability and safety of assistive navigation for visually impaired individuals under challenging weather conditions, offering practical value for real-world deployment.
(This article belongs to the Section Navigation and Positioning)
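Channel-selective fusion of the kind the CSFB performs can be sketched as a per-channel gate choosing between two feature maps. The gate here is derived from global average pooling with a fixed sigmoid rather than learned weights, so treat it as a structural illustration only.

```python
import numpy as np

def channel_selective_fusion(feat_a, feat_b):
    """Fuse two (C, H, W) feature maps via a per-channel selection gate —
    a minimal sketch of a CSFB-style block (gate computed from global
    average pooling instead of learned parameters)."""
    gap_a = feat_a.mean(axis=(1, 2))               # (C,) channel descriptors
    gap_b = feat_b.mean(axis=(1, 2))
    gate = 1.0 / (1.0 + np.exp(-(gap_a - gap_b)))  # sigmoid preference for a
    return gate[:, None, None] * feat_a + (1 - gate)[:, None, None] * feat_b

a = np.ones((3, 4, 4))   # e.g. fine-scale features
b = np.zeros((3, 4, 4))  # e.g. coarse-scale features
fused = channel_selective_fusion(a, b)
```

Because the gate is channel-wise, each channel can independently prefer whichever scale carries more signal for it.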

16 pages, 1610 KiB  
Article
Cascaded Dual-Inpainting Network for Scene Text
by Chunmei Liu
Appl. Sci. 2025, 15(14), 7742; https://doi.org/10.3390/app15147742 - 10 Jul 2025
Viewed by 207
Abstract
Scene text inpainting is a significant research challenge in visual text processing, with critical applications spanning incomplete traffic sign comprehension, degraded container-code recognition, occluded vehicle license plate processing, and other incomplete scene text processing systems. In this paper, a cascaded dual-inpainting network for scene text (CDINST) is proposed. The architecture integrates two scene text inpainting models to reconstruct the text foreground: the Structure Generation Module (SGM) and Structure Reconstruction Module (SRM). The SGM primarily performs preliminary foreground text reconstruction and extracts text structures. Building upon the SGM’s guidance, the SRM subsequently enhances the foreground structure reconstruction through structure-guided refinement. The experimental results demonstrate compelling performance on the benchmark dataset, showcasing both the effectiveness of the proposed dual-inpainting network and its accuracy in incomplete scene text recognition. The proposed network achieves an average recognition accuracy improvement of 11.94% compared to baseline methods for incomplete scene text recognition tasks.

21 pages, 15478 KiB  
Review
Small Object Detection in Traffic Scenes for Mobile Robots: Challenges, Strategies, and Future Directions
by Zhe Wei, Yurong Zou, Haibo Xu and Sen Wang
Electronics 2025, 14(13), 2614; https://doi.org/10.3390/electronics14132614 - 28 Jun 2025
Viewed by 563
Abstract
Small object detection in traffic scenes presents unique challenges for mobile robots operating under constrained computational resources and highly dynamic environments. Unlike general object detection, small targets often suffer from low resolution, weak semantic cues, and frequent occlusion, especially in complex outdoor scenarios. This study systematically analyses the challenges, technical advances, and deployment strategies for small object detection tailored to mobile robotic platforms. We categorise existing approaches into three main strategies: feature enhancement (e.g., multi-scale fusion, attention mechanisms), network architecture optimisation (e.g., lightweight backbones, anchor-free heads), and data-driven techniques (e.g., augmentation, simulation, transfer learning). Furthermore, we examine deployment techniques on embedded devices such as Jetson Nano and Raspberry Pi, and we highlight multi-modal sensor fusion using Light Detection and Ranging (LiDAR), cameras, and Inertial Measurement Units (IMUs) for enhanced environmental perception. A comparative study of public datasets and evaluation metrics is provided to identify current limitations in real-world benchmarking. Finally, we discuss future directions, including robust detection under extreme conditions and human-in-the-loop incremental learning frameworks. This research aims to offer a comprehensive technical reference for researchers and practitioners developing small object detection systems for real-world robotic applications.
(This article belongs to the Special Issue New Trends in Computer Vision and Image Processing)

36 pages, 122050 KiB  
Article
GAML-YOLO: A Precise Detection Algorithm for Extracting Key Features from Complex Environments
by Lihu Pan, Zhiyang Xue and Kaiqiang Zhang
Electronics 2025, 14(13), 2523; https://doi.org/10.3390/electronics14132523 - 21 Jun 2025
Viewed by 445
Abstract
This study addresses three major challenges in non-motorized vehicle rider helmet detection: multi-spectral interference between helmet and hair color (HSV spatial similarity > 0.82), target occlusion in high-density traffic flows (peak density reaching 11.7 vehicles/frame), and perception degradation under complex weather conditions (such as overcast, fog, and strong light interference). To tackle these issues, we developed the GAML-YOLO detection algorithm. The algorithm enhances feature representation by constructing a Feature-Enhanced Neck Network (FENN) that integrates global and local features. It employs the Global Mamba Architecture Enhancement (GMET) to reduce parameter size while strengthening global context capture, and incorporates Multi-Scale Spatial Pyramid Pooling (MSPP) with multi-scale feature extraction to improve the model’s robustness. An enhanced channel attention mechanism with self-attention (ECAM) strengthens local feature extraction and stabilizes deep feature learning through partial convolution and residual learning, yielding a 13.04% improvement in detection precision under occlusion scenarios. The model’s convergence speed and localization precision are further optimized using a modified Enhanced Precision-IoU loss function (EP-IoU). Experimental results demonstrate that GAML-YOLO outperforms existing algorithms on the self-constructed HelmetVision dataset and on public datasets. In extreme scenarios, the false detection rate is reduced by 17.3% and detection precision in occluded scenes is improved by 13.6%, providing an effective technical solution for intelligent traffic surveillance.
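IoU-based bounding-box losses such as the EP-IoU mentioned above start from plain intersection-over-union and add penalty terms. The sketch below shows standard IoU plus a DIoU-style normalized center-distance penalty; the paper's exact EP-IoU formulation is not reproduced here, so the penalty term is a generic stand-in.

```python
def iou(box_a, box_b):
    """IoU of two axis-aligned boxes given as (x1, y1, x2, y2)."""
    x1 = max(box_a[0], box_b[0]); y1 = max(box_a[1], box_b[1])
    x2 = min(box_a[2], box_b[2]); y2 = min(box_a[3], box_b[3])
    inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    return inter / (area_a + area_b - inter)

def center_penalty_iou_loss(pred, target):
    """IoU loss plus a normalized center-distance penalty — a generic
    DIoU-style sketch, not the paper's EP-IoU."""
    cx_p, cy_p = (pred[0] + pred[2]) / 2, (pred[1] + pred[3]) / 2
    cx_t, cy_t = (target[0] + target[2]) / 2, (target[1] + target[3]) / 2
    # Diagonal of the smallest enclosing box normalizes the distance.
    ex1, ey1 = min(pred[0], target[0]), min(pred[1], target[1])
    ex2, ey2 = max(pred[2], target[2]), max(pred[3], target[3])
    diag2 = (ex2 - ex1) ** 2 + (ey2 - ey1) ** 2
    dist2 = (cx_p - cx_t) ** 2 + (cy_p - cy_t) ** 2
    return 1.0 - iou(pred, target) + dist2 / diag2
```

A perfectly matched prediction gives loss 0; misaligned centers are penalized even when the boxes still overlap, which speeds up convergence of the regression branch.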

24 pages, 6003 KiB  
Article
ADSAP: An Adaptive Speed-Aware Trajectory Prediction Framework with Adversarial Knowledge Transfer
by Cheng Da, Yongsheng Qian, Junwei Zeng, Xuting Wei and Futao Zhang
Electronics 2025, 14(12), 2448; https://doi.org/10.3390/electronics14122448 - 16 Jun 2025
Viewed by 382
Abstract
Accurate trajectory prediction of surrounding vehicles is a fundamental challenge in autonomous driving, requiring sophisticated modeling of complex vehicle interactions, traffic dynamics, and contextual dependencies. This paper introduces Adaptive Speed-Aware Prediction (ADSAP), a novel trajectory prediction framework that advances the state of the art through innovative mechanisms for adaptive attention modulation and knowledge transfer. At its core, ADSAP employs an adaptive deformable speed-aware pooling mechanism that dynamically adjusts the model’s attention distribution and receptive field based on instantaneous vehicle states and interaction patterns. This adaptive architecture enables fine-grained modeling of diverse traffic scenarios, from sparse highway conditions to dense urban environments. The framework incorporates a sophisticated speed-aware multi-scale feature aggregation module that systematically combines spatial and temporal information across multiple scales, facilitating comprehensive scene understanding and robust trajectory prediction. To bridge the gap between model complexity and computational efficiency, we propose an adversarial knowledge distillation approach that effectively transfers learned representations and decision-making strategies from a high-capacity teacher model to a lightweight student model. This novel distillation mechanism preserves prediction accuracy while significantly reducing computational overhead, making the framework suitable for real-world deployment. Extensive empirical evaluation on the large-scale NGSIM and highD naturalistic driving datasets demonstrates ADSAP’s superior performance. The ADSAP framework achieves an 18.7% reduction in average displacement error and a 22.4% improvement in final displacement error compared to state-of-the-art methods while maintaining consistent performance across varying traffic densities (0.05–0.85 vehicles/meter) and speed ranges (0–35 m/s). Moreover, ADSAP exhibits robust generalization capabilities across different driving scenarios and weather conditions, with the lightweight student model achieving 95% of the teacher model’s accuracy while offering a 3.2× reduction in inference time. Comprehensive experimental results supported by detailed ablation studies and statistical analyses validate ADSAP’s effectiveness in addressing the trajectory prediction challenge. Our framework provides a novel perspective on integrating adaptive attention mechanisms with efficient knowledge transfer, contributing to the development of more reliable and intelligent autonomous driving systems. Significant improvements in prediction accuracy, computational efficiency, and generalization capability demonstrate ADSAP’s potential to advance autonomous driving technology.
(This article belongs to the Special Issue Advances in AI Engineering: Exploring Machine Learning Applications)
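The teacher-to-student knowledge transfer described above is built on distillation. The sketch below shows plain temperature-scaled KL distillation between teacher and student logits; the paper's adversarial variant adds a discriminator on top of this idea, which is omitted here.

```python
import numpy as np

def softmax(z, T=1.0):
    """Numerically stable softmax with temperature T."""
    e = np.exp(z / T - np.max(z / T))
    return e / e.sum()

def distillation_loss(student_logits, teacher_logits, T=2.0):
    """Temperature-scaled KL divergence pushing the student's output
    distribution toward the teacher's. The T*T factor keeps gradient
    magnitudes comparable across temperatures (standard KD practice)."""
    p_t = softmax(teacher_logits, T)
    p_s = softmax(student_logits, T)
    return float(np.sum(p_t * (np.log(p_t) - np.log(p_s)))) * T * T
```

During training this term is weighted against the student's ordinary task loss, so the lightweight model learns both from ground truth and from the teacher's softened predictions.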

18 pages, 6877 KiB  
Article
Machine Learning-Enhanced 3D GIS Urban Noise Mapping with Multi-Modal Factors
by Jianping Pan, Yuzhe He, Wei Ma, Shengwang An, Lu Li, Dan Huang and Dunxin Jia
ISPRS Int. J. Geo-Inf. 2025, 14(6), 223; https://doi.org/10.3390/ijgi14060223 - 4 Jun 2025
Viewed by 862
Abstract
Geographic Information System (GIS)-based noise management is crucial in urban environments as it provides precise spatial analysis, helping to identify noise hotspots and optimize noise control measures. By integrating noise propagation models with GIS technology, dynamic simulation and visualization of noise distribution can be achieved, offering scientific support for urban planning and noise management. Most existing noise prediction models fail to fully account for three-dimensional (3D) spatial information and a wide range of environmental factors. As a result, there are often discrepancies between the actual noise measurements at monitoring points and the predicted values generated by these models. Furthermore, there is a lack of a system that can effectively integrate noise data with three-dimensional scenes for simulation. This paper proposes a new method to simulate urban noise propagation, aiming to achieve more accurate noise prediction and visualization in a three-dimensional environment. First, we computed the preliminary noise propagation based on a traffic noise model. Next, machine learning techniques were applied to analyze the relationship between noise discrepancies and multi-modal factors, thereby improving the accuracy of environmental noise level estimation. Based on this, we developed an urban noise simulation system. The system integrates functions such as noise simulation, traffic simulation, and weather changes, enabling accurate noise visualization within a three-dimensional virtual environment. Experimental results demonstrate that this method enhances the accuracy of urban noise prediction and visualization, providing users with a more comprehensive understanding of the spatial distribution of urban noise. Full article
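The "learn the discrepancy" step — fitting a model of the gap between physics-based predictions and measurements against multi-modal factors — can be sketched with ordinary least squares. The factor choice (building density) and the linear form are illustrative assumptions; the paper uses richer machine learning models.

```python
import numpy as np

def residual_corrected_noise(base_pred, factors, measured):
    """Fit a linear model of (measured - physics model) residuals on
    multi-modal factors, then return corrected predictions — a
    least-squares sketch of learning the model/measurement discrepancy.

    base_pred: (N,) physics-model noise levels in dB
    factors:   (N, K) environmental factor matrix
    measured:  (N,) observed levels at monitoring points
    """
    X = np.column_stack([np.ones(len(factors)), factors])  # add intercept
    resid = measured - base_pred
    coef, *_ = np.linalg.lstsq(X, resid, rcond=None)
    return base_pred + X @ coef

base = np.array([60.0, 62.0, 64.0, 66.0])          # dB from traffic model
factors = np.array([[0.1], [0.2], [0.3], [0.4]])   # e.g. building density
measured = base + 2.0 + 10.0 * factors[:, 0]       # systematic discrepancy
corrected = residual_corrected_noise(base, factors, measured)
assert np.allclose(corrected, measured)
```

When the discrepancy really is a function of the chosen factors, the corrected prediction recovers the measurements; in practice the residual model is fit on monitoring stations and applied grid-wide.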

18 pages, 3976 KiB  
Proceeding Paper
Survey on Comprehensive Visual Perception Technology for Future Air–Ground Intelligent Transportation Vehicles in All Scenarios
by Guixin Ren, Fei Chen, Shichun Yang, Fan Zhou and Bin Xu
Eng. Proc. 2024, 80(1), 50; https://doi.org/10.3390/engproc2024080050 - 30 May 2025
Viewed by 460
Abstract
As an essential part of the low-altitude economy, low-altitude carriers are an important cornerstone of its development and a strategically significant new industry. However, existing two-dimensional vehicle autonomous driving perception schemes struggle to meet the needs of key all-scene perception technologies, such as global high-precision map construction for low-altitude vehicles in three-dimensional space, perception and identification of local environmental traffic participants, and extraction of key visual information under extreme conditions. It is therefore urgent to explore the development and verification of all-scene universal sensing technology for low-altitude intelligent vehicles. This paper surveys the literature on vision-based urban rail transit and general perception technology in low-altitude flight environments and summarizes the research status and innovations in five areas: environment perception based on visual SLAM, environment perception based on BEV, environment perception based on image enhancement, performance optimization of perception algorithms using cloud computing, and rapid deployment of perception algorithms using edge nodes. Future optimization directions for this topic are also proposed.
(This article belongs to the Proceedings of 2nd International Conference on Green Aviation (ICGA 2024))

19 pages, 3016 KiB  
Article
Attention-Based LiDAR–Camera Fusion for 3D Object Detection in Autonomous Driving
by Zhibo Wang, Xiaoci Huang and Zhihao Hu
World Electr. Veh. J. 2025, 16(6), 306; https://doi.org/10.3390/wevj16060306 - 29 May 2025
Viewed by 1892
Abstract
In multi-vehicle traffic scenarios, achieving accurate environmental perception and motion trajectory tracking through LiDAR–camera fusion is critical for downstream vehicle planning and control tasks. To address the challenges of cross-modal feature interaction in LiDAR–image fusion and the low recognition efficiency/positioning accuracy of traffic participants in dense traffic flows, this study proposes an attention-based 3D object detection network integrating point cloud and image features. The algorithm adaptively fuses LiDAR geometric features and camera semantic features through channel-wise attention weighting, enhancing multi-modal feature representation by dynamically prioritizing informative channels. A center point detection architecture is further employed to regress 3D bounding boxes in bird’s-eye-view space, effectively resolving orientation ambiguities caused by sparse point distributions. Experimental validation on the nuScenes dataset demonstrates the model’s robustness in complex scenarios, achieving a mean Average Precision (mAP) of 64.5% and a 12.2% improvement over baseline methods. Real-vehicle deployment further confirms the fusion module’s effectiveness in enhancing detection stability under dynamic traffic conditions.
(This article belongs to the Special Issue Electric Vehicle Autonomous Driving Based on Image Recognition)
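The channel-wise attention weighting over fused LiDAR and camera features can be sketched squeeze-and-excitation style: concatenate the two modalities' channels, derive per-channel descriptors, and reweight. The descriptor and gating here use global average pooling and a bare sigmoid instead of the learned excitation layers a real network would have.

```python
import numpy as np

def channel_attention_fusion(lidar_feat, cam_feat):
    """SE-style channel attention over concatenated LiDAR geometric and
    camera semantic feature maps, both (C, H, W) — a minimal sketch of
    'dynamically prioritizing informative channels' (no learned MLP)."""
    fused = np.concatenate([lidar_feat, cam_feat], axis=0)  # (2C, H, W)
    desc = fused.mean(axis=(1, 2))                          # squeeze: (2C,)
    weights = 1.0 / (1.0 + np.exp(-desc))                   # excite (sigmoid)
    return fused * weights[:, None, None]                   # rescale channels

lidar = np.ones((2, 3, 3))   # strong geometric response
cam = np.zeros((2, 3, 3))    # weak semantic response
out = channel_attention_fusion(lidar, cam)
```

Channels with stronger global activation receive larger weights, so whichever modality is more informative for a given channel dominates the fused representation.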

17 pages, 3055 KiB  
Article
Characterization of Driver Dynamic Visual Perception Under Different Road Linearity Conditions
by Zhenxiang Hao, Jianping Hu, Jin Ran, Xiaohui Sun, Yuhang Zheng and Chengzhang Li
Appl. Sci. 2025, 15(11), 6076; https://doi.org/10.3390/app15116076 - 28 May 2025
Viewed by 385
Abstract
Drivers’ visual characteristics have an important impact on traffic safety, but existing studies are mostly limited to single-scene analyses and lack a systematic examination of the dynamic changes in drivers’ eye-tracking characteristics across different road sections. In this study, 23 drivers were recruited to wear the aSee Glasses eye-tracking device, and driving tests were conducted on four typical road sections: straight, turning, climbing, and downhill. Average fixation duration, pupil diameter, and saccade amplitude were collected; one-way analysis of variance (ANOVA) was used to explore differences between road sections; and mathematical models of the changes in visual characteristics over time were fitted to the data using Origin 2021 software. The results show that road section had significant effects on drivers’ visual tasks: the longest average fixation duration occurred on the straight section, the largest pupil diameter on the curved section, and the highest saccade amplitude on the downhill section, reflecting the influence of driving task complexity on cognitive load. The fitted models further reveal how the eye-tracking indicators change over time, providing a quantitative basis for modeling driving behavior and visual tasks. This study provides a theoretical basis and practical reference for the optimal design of advanced driver assistance systems, traffic safety management, and road planning.
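The one-way ANOVA used to compare eye-tracking metrics across road sections reduces to an F statistic: between-group mean square over within-group mean square. A minimal sketch (the study itself used Origin; the grouping below is invented toy data):

```python
import numpy as np

def one_way_anova_F(groups):
    """F statistic for a one-way ANOVA across groups (e.g. fixation
    durations grouped by road section): between-group mean square
    divided by within-group mean square."""
    all_data = np.concatenate(groups)
    grand = all_data.mean()
    k = len(groups)              # number of groups
    n = len(all_data)            # total observations
    ss_between = sum(len(g) * (np.mean(g) - grand) ** 2 for g in groups)
    ss_within = sum(((np.asarray(g) - np.mean(g)) ** 2).sum() for g in groups)
    return (ss_between / (k - 1)) / (ss_within / (n - k))

# Two well-separated groups give a large F (strong section effect).
F = one_way_anova_F([[1.0, 2.0, 3.0], [11.0, 12.0, 13.0]])
```

The F value is then compared against the F distribution with (k-1, n-k) degrees of freedom to obtain the p-value reported in such studies.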

30 pages, 4437 KiB  
Article
Smart Maritime Transportation-Oriented Ship-Speed Prediction Modeling Using Generative Adversarial Networks and Long Short-Term Memory
by Xinqiang Chen, Peishi Wu, Yajie Zhang, Xiaomeng Wang, Jiangfeng Xian and Han Zhang
J. Mar. Sci. Eng. 2025, 13(6), 1045; https://doi.org/10.3390/jmse13061045 - 26 May 2025
Viewed by 719
Abstract
Ship-speed prediction is an emerging research area in marine traffic safety and related fields, where it occupies an important position. At present, time-series forecasting techniques perform poorly for ship-speed prediction: errors accumulate in long-term forecasting, and such methods are limited in combining ship-speed information with multi-feature data input. To overcome these difficulties and further improve the accuracy of ship-speed prediction, this research proposes a new deep learning framework that predicts ship speed by combining GANs (Generative Adversarial Networks) and LSTM (Long Short-Term Memory). First, the algorithm uses an LSTM network as the generator and exploits the LSTM to mine spatiotemporal correlations between nodes. Second, the complementary characteristics of the generative and discriminative networks are used to eliminate the cumulative error of a single neural network in long-term prediction and to improve the network’s accuracy in ship-speed determination. Finally, the proposed Generator–LSTM model is used for ship-speed prediction and compared with other models on identical AIS (automatic identification system) ship-speed data from the same scene. The findings indicate that the model achieves high accuracy on typical error metrics, meaning it can reliably predict ship speed. The results of the study will help maritime traffic participants take better precautions to prevent collisions and improve maritime traffic safety.
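The generator at the heart of such a GAN+LSTM framework is an ordinary LSTM. One cell step can be written out explicitly; the gate ordering (i, f, o, g) and the stacked weight layout are conventions chosen for this sketch, and the GAN discriminator is omitted.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_step(x, h, c, W, U, b):
    """One LSTM cell step (the recurrent core a generator network would
    iterate over a ship-speed sequence).

    x: (D,) input at this time step     h, c: (H,) previous hidden/cell state
    W: (4H, D) input weights            U: (4H, H) recurrent weights
    b: (4H,) bias; gates stacked in the order i, f, o, g (assumed layout)
    """
    z = W @ x + U @ h + b
    H = h.shape[0]
    i, f, o = sigmoid(z[:H]), sigmoid(z[H:2 * H]), sigmoid(z[2 * H:3 * H])
    g = np.tanh(z[3 * H:])
    c_new = f * c + i * g              # gated cell-state update
    h_new = o * np.tanh(c_new)         # gated hidden output
    return h_new, c_new
```

In the adversarial setup, sequences produced by iterating this cell are scored by a discriminator, and that score is what counteracts the error accumulation of a standalone recurrent predictor.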

23 pages, 5784 KiB  
Article
RT-DETR-EVD: An Emergency Vehicle Detection Method Based on Improved RT-DETR
by Jun Hu, Jiahao Zheng, Wenwei Wan, Yongqi Zhou and Zhikai Huang
Sensors 2025, 25(11), 3327; https://doi.org/10.3390/s25113327 - 26 May 2025
Cited by 1 | Viewed by 1267
Abstract
With the rapid acceleration of urbanization and the increasing volume of road traffic, emergency vehicles frequently encounter congestion when performing urgent tasks. Failure to yield in a timely manner can result in the loss of critical rescue time. Therefore, this study aims to develop a lightweight and high-precision RT-DETR-EVD emergency vehicle detection model to enhance urban emergency response capabilities. The proposed model replaces ResNet with a lightweight CSPDarknet backbone and integrates an innovative hybrid C2f-MogaBlock architecture. A multi-order gated aggregation mechanism is introduced to dynamically fuse multi-scale features, improving spatial-channel feature representation while reducing the number of parameters. Additionally, an Attention-based Intra-scale Feature Interaction Dynamic Position Bias (AIDPB) module is designed, replacing fixed positional encoding with learnable dynamic position bias (DPB), improving feature discrimination in complex scenarios. The experimental results demonstrate that the improved RT-DETR-EVD model achieves superior performance in emergency vehicle detection under the same training conditions. Specifically, compared to the baseline RT-DETR-r18 model, RT-DETR-EVD reduces parameter count to 14.5 M (a 27.1% reduction), lowers floating-point operations (FLOPs) to 49.5 G (a 13.2% reduction), and improves precision by 0.5%. Additionally, recall and mean average precision (mAP50%) increase by 0.6%, reaching an mAP50% of 88.3%. The proposed RT-DETR-EVD model achieves a breakthrough balance between accuracy, efficiency, and scene adaptability. Its unique lightweight design enhances detection accuracy while significantly reducing model size and accelerating inference. This model provides an efficient and reliable solution for smart city emergency response systems, demonstrating strong deployment potential in real-world engineering applications.
(This article belongs to the Section Vehicular Sensing)
