Cooperative Perception for Modern Transportation

A special issue of Drones (ISSN 2504-446X). This special issue belongs to the section "Innovative Urban Mobility".

Deadline for manuscript submissions: 28 February 2026

Special Issue Editors


Dr. Jinsheng Xiao
Guest Editor
School of Electronic Information, Wuhan University, Wuhan 430072, China
Interests: video and image processing; computer vision; artificial intelligence; swarm intelligence

Dr. Jian Zhou
Guest Editor
State Key Laboratory of Information Engineering in Surveying, Mapping and Remote Sensing, Wuhan University, Wuhan 430205, China
Interests: artificial intelligence; machine learning; classification; pattern recognition; image processing; LiDAR; GNSS; autonomous systems

Dr. Sheng Bao
Guest Editor
Department of Geography, The University of Hong Kong, Hong Kong SAR, China
Interests: mobile mapping; multi-sensor integration; robotics and autonomous systems; smart city

Dr. Hailong Shi
Guest Editor
Institute of Microelectronics, Chinese Academy of Sciences, Beijing 100029, China
Interests: artificial intelligence; brain-inspired computing; information retrieval

Special Issue Information

Dear Colleagues,

Unmanned aerial vehicles (UAVs) and unmanned ground vehicles (UGVs) are becoming increasingly widespread in transportation, and collaborative perception technology will play a key role in this process. Collaborative perception uses cooperation among distributed sensors and infrastructure to jointly perceive the environment. It can improve perception accuracy and coverage, reduce perception cost, and support applications such as intelligent transportation and intelligent security. Common sensing modalities in collaborative perception include radar, cameras, LiDAR, ultrasonic sensors, and inertial measurement units; the choice of modality depends mainly on sensing requirements, cost, and environmental conditions. Because the data in collaborative perception are typically multi-source and heterogeneous, fusion processing is required to improve perception accuracy and robustness.
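
To make the fusion step concrete, the following is a minimal sketch of covariance-weighted (information-form) fusion of two independent estimates of the same vehicle's position, e.g., one from a roadside LiDAR and one from a UAV camera. The sensor values and noise covariances are illustrative assumptions, not taken from any particular system.

```python
import numpy as np

def fuse_estimates(x1, P1, x2, P2):
    """Fuse two independent Gaussian estimates (x, P) by
    inverse-covariance (information-form) weighting."""
    P1_inv, P2_inv = np.linalg.inv(P1), np.linalg.inv(P2)
    P_fused = np.linalg.inv(P1_inv + P2_inv)           # combined covariance
    x_fused = P_fused @ (P1_inv @ x1 + P2_inv @ x2)    # precision-weighted mean
    return x_fused, P_fused

# Illustrative 2D position estimates of the same vehicle:
x_lidar = np.array([10.2, 4.9]); P_lidar = np.diag([0.04, 0.04])  # precise
x_cam   = np.array([10.8, 5.3]); P_cam   = np.diag([0.50, 0.50])  # noisy
x, P = fuse_estimates(x_lidar, P_lidar, x_cam, P_cam)
print(x, np.diag(P))  # fused estimate lies close to the LiDAR reading
```

The fused covariance is smaller than either input covariance, which is exactly the accuracy gain that motivates combining heterogeneous sensors.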

This Special Issue aims to explore modeling theories and methods for UAVs and self-driving vehicles (SDVs) in intelligent transportation systems. The application of drones in modern transportation systems will bring both opportunities and challenges. By combining the high-altitude monitoring and rapid-response capabilities of drones with the big-data analysis and real-time monitoring technology of intelligent transportation systems, the efficiency of traffic management can be greatly improved. The same capabilities also apply to public safety and logistics distribution, improving overall societal benefit and quality of life. Collaborative perception processes and integrates data collected by multiple cooperating sensors to produce more accurate and complete perceptual results, addressing the two main limitations of single-agent perception: long-range occlusion and sparse data. It typically draws on data processing, distributed computing, and artificial intelligence, which greatly improve the ability of drones to integrate information with the sensors already deployed in an intelligent transportation system.
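
As a concrete illustration of how cooperation resolves occlusion, the sketch below performs a simple late fusion: each agent's detections are transformed into a shared world frame using its pose, and near-duplicate detections are merged by a distance threshold. The poses, coordinates, and threshold are illustrative assumptions; real systems might instead fuse at the feature level or use calibrated extrinsics.

```python
import numpy as np

def to_world(detections, pose):
    """Transform 2D detections from an agent's local frame into a shared
    world frame, given the agent pose (x, y, heading in radians)."""
    x, y, theta = pose
    R = np.array([[np.cos(theta), -np.sin(theta)],
                  [np.sin(theta),  np.cos(theta)]])
    return detections @ R.T + np.array([x, y])

def merge(dets_a, dets_b, radius=1.0):
    """Late fusion: keep all of A's detections and add B's detections
    that are farther than `radius` meters from every kept one."""
    merged = list(dets_a)
    for d in dets_b:
        if all(np.linalg.norm(d - m) > radius for m in merged):
            merged.append(d)
    return np.array(merged)

# The UAV sees a pedestrian that is occluded from the ground vehicle.
veh_dets = to_world(np.array([[5.0, 0.0]]), pose=(0.0, 0.0, 0.0))
uav_dets = to_world(np.array([[2.0, 1.0], [5.1, 0.1]]), pose=(0.0, 0.0, 0.0))
print(merge(veh_dets, uav_dets))  # three raw detections -> two fused objects
```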

Topics include, but are not limited to, the following:

  • Fusion perception with visual, infrared, laser, and other sensors;
  • Intelligent video surveillance in modern transportation systems;
  • Remote sensing and data analysis for modern transportation;
  • Localization and high-definition maps for transportation;
  • Deep reinforcement learning for multi-unmanned traffic systems;
  • Applications of artificial intelligence for unmanned aerial vehicles (UAVs) and self-driving vehicles (SDVs);
  • Modeling, simulation, and dynamic analysis of collaboration systems for UAVs and SDVs;
  • UAV and SDV decision making in complex urban traffic environments;
  • Parameter identification and state estimation for UAVs and SDVs;
  • Trajectory prediction and its application to intelligent transportation systems;
  • Design of new sensors and novel estimation and data fusion algorithms for UAVs and SDVs;
  • Dynamic vision sensors and their applications in intelligent transportation;
  • Spiking neural networks and neuromorphic computing.

Dr. Jinsheng Xiao
Dr. Jian Zhou
Dr. Sheng Bao
Dr. Hailong Shi
Guest Editors

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to this website. Once you are registered, click here to go to the submission form. Manuscripts can be submitted until the deadline. All submissions that pass pre-check are peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the special issue website. Research articles, review articles, and short communications are invited. For planned papers, a title and short abstract (about 100 words) can be sent to the Editorial Office for announcement on this website.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. Drones is an international peer-reviewed open access monthly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 2600 CHF (Swiss Francs). Submitted papers should be well formatted and use good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Benefits of Publishing in a Special Issue

  • Ease of navigation: Grouping papers by topic helps scholars navigate broad scope journals more efficiently.
  • Greater discoverability: Special Issues support the reach and impact of scientific research. Articles in Special Issues are more discoverable and cited more frequently.
  • Expansion of research network: Special Issues facilitate connections among authors, fostering scientific collaborations.
  • External promotion: Articles in Special Issues are often promoted through the journal's social media, increasing their visibility.
  • Reprint: MDPI Books provides the opportunity to republish successful Special Issues in book format, both online and in print.

Further information on MDPI's Special Issue policies can be found here.

Published Papers (3 papers)

Research

16 pages, 1823 KiB  
Article
Collaborative Target Tracking Algorithm for Multi-Agent Based on MAPPO and BCTD
by Yuebin Zhou, Yunling Yue, Bolun Yan, Linkun Li, Jinsheng Xiao and Yuan Yao
Drones 2025, 9(8), 521; https://doi.org/10.3390/drones9080521 - 24 Jul 2025
Abstract
Target tracking is a representative task in multi-agent reinforcement learning (MARL), where agents must collaborate effectively in environments with dense obstacles, evasive targets, and high-dimensional observations—conditions that often lead to local optima and training inefficiencies. To address these challenges, this paper proposes a collaborative tracking algorithm for UAVs that integrates behavior cloning with temporal difference (BCTD) and multi-agent proximal policy optimization (MAPPO). Expert trajectories are generated using the artificial potential field (APF), followed by policy pre-training via behavior cloning and TD-based value optimization. MAPPO is then employed for dynamic fine-tuning, enhancing robustness and coordination. Experiments in a simulated environment show that the proposed MAPPO+BCTD framework outperforms MAPPO, QMIX, and MADDPG in success rate, convergence speed, and tracking efficiency. The proposed method effectively alleviates the local optimization problem of APF and the training inefficiency problem of RL, offering a scalable and reliable solution for dynamic multi-agent coordination.
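
The paper's exact losses are not reproduced here, but the general pre-training pattern it describes (behavior cloning on APF-generated expert actions plus a TD(0) regression target for the value network, ahead of PPO-style fine-tuning) can be sketched as follows. Network sizes, the loss weighting, and the dummy batch are illustrative assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

obs_dim, n_actions, gamma = 16, 5, 0.99
policy = nn.Sequential(nn.Linear(obs_dim, 64), nn.Tanh(), nn.Linear(64, n_actions))
value  = nn.Sequential(nn.Linear(obs_dim, 64), nn.Tanh(), nn.Linear(64, 1))
opt = torch.optim.Adam(list(policy.parameters()) + list(value.parameters()), lr=3e-4)

def pretrain_step(obs, expert_act, rew, next_obs, done):
    """One pre-training step: behavior cloning toward expert actions,
    plus TD(0) regression for the value network."""
    bc_loss = F.cross_entropy(policy(obs), expert_act)        # imitate the expert
    with torch.no_grad():
        td_target = rew + gamma * (1 - done) * value(next_obs).squeeze(-1)
    td_loss = F.mse_loss(value(obs).squeeze(-1), td_target)   # fit the TD target
    loss = bc_loss + 0.5 * td_loss                            # assumed weighting
    opt.zero_grad(); loss.backward(); opt.step()
    return loss.item()

# Dummy batch standing in for APF-generated expert trajectories:
B = 32
pretrain_step(torch.randn(B, obs_dim), torch.randint(0, n_actions, (B,)),
              torch.randn(B), torch.randn(B, obs_dim), torch.zeros(B))
```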

23 pages, 13739 KiB  
Article
Traffic Accident Rescue Action Recognition Method Based on Real-Time UAV Video
by Bo Yang, Jianan Lu, Tao Liu, Bixing Zhang, Chen Geng, Yan Tian and Siyu Zhang
Drones 2025, 9(8), 519; https://doi.org/10.3390/drones9080519 - 24 Jul 2025
Abstract
Low-altitude drones, which are unimpeded by traffic congestion or urban terrain, have become a critical asset in emergency rescue missions. To address the current lack of emergency rescue data, UAV aerial videos were collected to create an experimental dataset for action classification and localization annotation. A total of 5082 keyframes were labeled with 1–5 targets each, and 14,412 instances of data were prepared (including flight altitude and camera angles) for action classification and position annotation. To mitigate the challenges posed by high-resolution drone footage with excessive redundant information, we propose the SlowFast-Traffic (SF-T) framework, a spatio-temporal sequence-based algorithm for recognizing traffic accident rescue actions. For more efficient extraction of target–background correlation features, we introduce the Actor-Centric Relation Network (ACRN) module, which employs temporal max pooling to enhance the time-dimensional features of static backgrounds, significantly reducing redundancy-induced interference. Additionally, smaller ROI feature map outputs are adopted to boost computational speed. To tackle class imbalance in incident samples, we integrate a Class-Balanced Focal Loss (CB-Focal Loss) function, effectively resolving rare-action recognition in specific rescue scenarios. We replace the original Faster R-CNN with YOLOX-s to improve the target detection rate. On our proposed dataset, the SF-T model achieves a mean average precision (mAP) of 83.9%, which is 8.5% higher than that of the standard SlowFast architecture while maintaining a processing speed of 34.9 tasks/s. Both accuracy-related metrics and computational efficiency are substantially improved. The proposed method demonstrates strong robustness and real-time analysis capabilities for modern traffic rescue action recognition.
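
Class-Balanced Focal Loss is commonly formed by reweighting the focal loss with the inverse "effective number of samples" per class (in the style of Cui et al., 2019). The sketch below follows that recipe; beta, gamma, and the class counts are illustrative assumptions rather than the settings used in the paper.

```python
import torch
import torch.nn.functional as F

def cb_focal_loss(logits, targets, samples_per_class, beta=0.9999, gamma=2.0):
    """Focal loss reweighted by the inverse 'effective number' of samples,
    so that rare classes contribute more to the gradient."""
    counts = torch.as_tensor(samples_per_class, dtype=torch.float32)
    effective_num = (1.0 - beta ** counts) / (1.0 - beta)
    weights = 1.0 / effective_num
    weights = weights / weights.sum() * len(counts)        # normalize the weights

    log_p = F.log_softmax(logits, dim=-1)
    log_pt = log_p.gather(1, targets[:, None]).squeeze(1)  # log-prob of true class
    focal = (1.0 - log_pt.exp()) ** gamma * (-log_pt)      # focal modulation
    return (weights[targets] * focal).mean()

# Illustrative: three action classes with one rare rescue action (class 2).
logits = torch.randn(8, 3)
targets = torch.randint(0, 3, (8,))
loss = cb_focal_loss(logits, targets, samples_per_class=[9000, 800, 50])
```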

17 pages, 1557 KiB  
Article
MultiDistiller: Efficient Multimodal 3D Detection via Knowledge Distillation for Drones and Autonomous Vehicles
by Binghui Yang, Tao Tao, Wenfei Wu, Yongjun Zhang, Xiuyuan Meng and Jianfeng Yang
Drones 2025, 9(5), 322; https://doi.org/10.3390/drones9050322 - 22 Apr 2025
Abstract
Real-time 3D object detection is a cornerstone for the safe operation of drones and autonomous vehicles (AVs)—drones must avoid millimeter-scale power lines in cluttered airspace, while AVs require instantaneous recognition of pedestrians and vehicles in dynamic urban environments. Although significant progress has been made in detection methods based on point clouds, cameras, and multimodal fusion, the computational complexity of existing high-precision models struggles to meet the real-time requirements of vehicular edge devices. Additionally, during the model lightweighting process, issues such as multimodal feature coupling failure and the imbalance between classification and localization performance often arise. To address these challenges, this paper proposes a knowledge distillation framework for multimodal 3D object detection, incorporating attention guidance, rank-aware learning, and interactive feature supervision to achieve efficient model compression and performance optimization. Specifically: To enhance the student model’s ability to focus on key channel and spatial features, we introduce attention-guided feature distillation, leveraging a bird’s-eye view foreground mask and a dual-attention mechanism. To mitigate the degradation of classification performance when transitioning from two-stage to single-stage detectors, we propose ranking-aware category distillation by modeling anchor-level distribution. To address the insufficient cross-modal feature extraction capability, we enhance the student network’s image features using the teacher network’s point cloud spatial priors, thereby constructing a LiDAR-image cross-modal feature alignment mechanism. Experimental results demonstrate the effectiveness of the proposed approach in multimodal 3D object detection. On the KITTI dataset, our method improves network performance by 4.89% even after reducing the number of channels by half.
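
The attention-guided feature distillation described above can be illustrated by a generic foreground-masked distillation loss: BEV cells flagged as foreground receive a larger weight when matching student features to teacher features. This is a sketch under stated assumptions (the 1x1 projection of the half-width student features, the random mask, and the weighting are all illustrative), not the paper's actual loss.

```python
import torch

def masked_feature_distillation(f_student, f_teacher, fg_mask, alpha=2.0):
    """Distill teacher BEV features into the student, weighting foreground
    cells `alpha` times more than background. Features: (B, C, H, W);
    mask: (B, 1, H, W) with 1 on foreground cells."""
    weight = 1.0 + (alpha - 1.0) * fg_mask               # 1 on bg, alpha on fg
    per_cell = (f_student - f_teacher).pow(2).mean(dim=1, keepdim=True)
    return (weight * per_cell).mean()

# Teacher features and a half-channel student projected back to the
# teacher's width by a 1x1 conv (a common trick, assumed here):
B, C, H, W = 2, 64, 32, 32
teacher = torch.randn(B, C, H, W)
student = torch.nn.Conv2d(C // 2, C, kernel_size=1)(torch.randn(B, C // 2, H, W))
loss = masked_feature_distillation(student, teacher,
                                   fg_mask=(torch.rand(B, 1, H, W) > 0.9).float())
```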
