Vehicle–Road–Cloud Collaborative Perception: Resource and Intelligence Optimization
Abstract
1. Introduction
- We construct an adaptive computing framework based on a slimmable network, whose intelligent elasticity matches the computational resources spent to the demands of each perception task.
- We propose an accuracy-aware collaborative mechanism that uses deep supervised learning to establish a nonlinear mapping among perception accuracy, network structure, and data density.
- We integrate mobile edge computing (MEC) and use computation offloading to flexibly move computing tasks off the vehicle, improving overall computing efficiency.
- We design and solve a multi-objective optimization problem that jointly considers the collaborative mechanism, resource allocation, intelligent elasticity, and computation offloading, minimizing overall resource consumption while satisfying accuracy and delay constraints. Unlike traditional V2X approaches that treat the AI model as a static black box and optimize only transmission resources, our method introduces “intelligent elasticity”: by jointly optimizing the collaborative mechanism, resource allocation, and the internal structure of the slimmable network, we achieve a superior Pareto frontier between perception accuracy and latency, adapting to dynamic constraints that fixed-model approaches cannot handle.
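Taken together, these contributions amount to picking, per node, the cheapest network width that still meets the accuracy bound, and offloading to MEC when local compute would break the delay bound. The following is a minimal sketch of that idea only; all widths, accuracy values, compute costs, and function names are illustrative assumptions, not the paper's actual model:

```python
# Minimal sketch of accuracy-aware elasticity with optional MEC offloading.
# All numbers and names are illustrative placeholders, not the paper's model.

# Assumed learned mapping: slimmable width ratio -> perception accuracy.
ACCURACY_BY_WIDTH = {0.25: 0.82, 0.50: 0.90, 0.75: 0.94, 1.00: 0.96}


def gflops(width: float, base: float = 1.0) -> float:
    """Assumed compute cost: roughly quadratic in network width."""
    return base * width ** 2


def choose_config(acc_min, delay_max_s, local_ghz, mec_ghz, uplink_s):
    """Pick the smallest width meeting the accuracy bound; run locally if
    the delay bound allows, otherwise offload to MEC (~1 FLOP/cycle)."""
    for width in sorted(ACCURACY_BY_WIDTH):
        if ACCURACY_BY_WIDTH[width] < acc_min:
            continue  # this width cannot reach the required accuracy
        local_delay = gflops(width) / local_ghz
        mec_delay = uplink_s + gflops(width) / mec_ghz
        if local_delay <= delay_max_s:
            return width, "local", local_delay
        if mec_delay <= delay_max_s:
            return width, "mec", mec_delay
    return None  # no feasible width/placement under these constraints


cfg = choose_config(acc_min=0.90, delay_max_s=0.02,
                    local_ghz=10.0, mec_ghz=100.0, uplink_s=0.005)
print(cfg)  # with these placeholders: half-width model, offloaded to MEC
```

The actual paper solves this jointly across nodes with bandwidth allocation and collaborative selection as additional variables; the sketch only shows the per-node width/placement trade-off.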
2. Related Work
2.1. Collaborative Perception
2.2. Resource Allocation
2.3. Slimmable Network
3. System Model
3.1. Vehicle-Road-Cloud Collaborative Perception Scenario
3.2. Perception Data
3.3. Architecture
3.4. Computation Model
3.5. Communication Model
3.6. Problem Formulation
4. Method
4.1. Inner-Layer Module
4.2. Outer-Layer Module
5. Experiments
5.1. Environmental Settings
5.2. Experimental Result
6. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
| Symbol | Definition |
|---|---|
| | Set of CAVs and set of assisting nodes (CAVs + RSU) |
| G | Set of objects to be perceived within the ROI |
| | Width ratio of the slimmable network for node q |
| | Communication bandwidth allocation proportion for node q |
| | Computation offloading decision variable (1 if offloaded to MEC) |
| | Collaborative selection decision variable |
| | BEV features of object g generated by node q |
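For illustration, the per-node decision variables listed above can be grouped into a single configuration record; the field names below are placeholders, not the paper's notation:

```python
from dataclasses import dataclass


# Illustrative grouping of the per-node decision variables from the
# symbol table; field names are placeholders, not the paper's notation.
@dataclass
class NodeDecision:
    width_ratio: float      # slimmable-network width ratio
    bandwidth_share: float  # proportion of communication bandwidth
    offload_to_mec: bool    # computation offloading decision variable
    collaborate: bool       # collaborative selection decision variable


decision = NodeDecision(width_ratio=0.5, bandwidth_share=0.2,
                        offload_to_mec=True, collaborate=True)
print(decision)
```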
| Parameter | Value |
|---|---|
| Bandwidth | 20 MHz |
| Average transfer rate (R) | 85.8 Mbps |
| Distance from the vehicle to the RSU | 5 m |
| Distance between the RSU and the MEC server | 5 m |
| Interference power | W |
| Transmit power (P) | 1 W |
| Path-loss exponent | 3.4 |
| Channel attenuation factor | 1 |
| Available compute resources for CAVs and RSU | 10 GHz |
| Allocable computing resources for MEC | 100 GHz |
| Data size per observation point | 96 bits |
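As a rough consistency sketch, an achievable rate under a standard log-distance path-loss Shannon model can be computed from these parameters. The noise-plus-interference power below is an assumed placeholder (its magnitude is missing from the table), so the result need not match the table's average transfer rate, which also reflects bandwidth sharing among nodes:

```python
import math

# Shannon-rate sketch using the parameter table's values where available.
# sigma2 is an assumed placeholder; the table's interference magnitude
# was lost in extraction.
B = 20e6        # bandwidth, Hz (20 MHz)
P = 1.0         # transmit power, W
h = 1.0         # channel attenuation factor
alpha = 3.4     # path-loss exponent
d = 5.0         # vehicle-to-RSU distance, m
sigma2 = 1e-9   # noise-plus-interference power, W (assumed)

snr = P * h * d ** (-alpha) / sigma2     # received SNR after path loss
rate_bps = B * math.log2(1.0 + snr)      # achievable rate, bit/s
print(f"{rate_bps / 1e6:.1f} Mbps")      # roughly 440 Mbps here
```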
| Perception Mechanism | Perception Accuracy (0) | Perception Accuracy (1) | Perception Accuracy (2) |
|---|---|---|---|
| Single-RSU Perception | 0.65 | 0.73 | 0.74 |
| Cooperative Perception | 0.96 | 0.92 | 0.95 |
| Method | Collaboration | Network Width | Computation Offloading |
|---|---|---|---|
| C4I-JO (Proposed) | Selective | Selective | Yes |
| SCFW (Baseline) | Selective | Full | Yes |
| ACSW (Baseline) | All | Selective | Yes |
| ACFW (Baseline) | All | Full | Yes |
| SCSW (Baseline) | Selective | Selective | No |
| Method | Accuracy (0) | Accuracy (1) | Accuracy (2) | Delay (ms) |
|---|---|---|---|---|
| C4I-JO (Proposed) | 0.95 | 0.93 | 0.96 | 6 |
| SCFW (Baseline) | 0.94 | 0.95 | 0.95 | 13 |
| ACSW (Baseline) | 0.95 | 0.96 | 0.94 | 14 |
| ACFW (Baseline) | 0.99 | 0.98 | 0.98 | 25 |
| SCSW (Baseline) | 0.95 | 0.92 | 0.95 | 11 |
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Xin, L.; Zhou, G.; Yu, Z.; Zhu, H.; Feng, X.; Yuan, Q.; Li, J. Vehicle–Road–Cloud Collaborative Perception: Resource and Intelligence Optimization. Appl. Sci. 2025, 15, 12613. https://doi.org/10.3390/app152312613
