AIDCON: An Aerial Image Dataset and Benchmark for Construction Machinery
Abstract
1. Introduction
2. Literature Review
2.1. General-Purpose Datasets
2.2. Construction-Specific Datasets
2.3. Applications in Construction
3. Materials and Methods
3.1. Dataset Development
3.1.1. Image Collection
3.1.2. Privacy Protection
3.1.3. Image Segmentation
3.1.4. Clustering
3.2. Performance Evaluation
4. Results and Discussion
5. Conclusions
Author Contributions
Funding
Data Availability Statement
Conflicts of Interest
References
- Tamin, M.A.; Darwin, N.; Majid, Z.; Mohd Ariff, M.F.; Idris, K.M.; Manan Samad, A. Volume Estimation of Stockpile Using Unmanned Aerial Vehicle. In Proceedings of the 9th IEEE International Conference on Control System, Computing and Engineering, ICCSCE 2019, Penang, Malaysia, 29 November–1 December 2019; pp. 49–54. [Google Scholar]
- Chen, C.; Zhu, Z.; Hammad, A. Automated Excavators Activity Recognition and Productivity Analysis from Construction Site Surveillance Videos. Autom. Constr. 2020, 110, 103045. [Google Scholar] [CrossRef]
- Zhang, S.; Zhang, L. Construction Site Safety Monitoring and Excavator Activity Analysis System. Constr. Robot. 2022, 6, 151–161. [Google Scholar] [CrossRef]
- Rezazadeh Azar, E.; McCabe, B. Part Based Model and Spatial-Temporal Reasoning to Recognize Hydraulic Excavators in Construction Images and Videos. Autom. Constr. 2012, 24, 194–202. [Google Scholar] [CrossRef]
- Zou, Z.; Chen, K.; Shi, Z.; Guo, Y.; Ye, J. Object Detection in 20 Years: A Survey. Proc. IEEE 2023, 111, 257–276. [Google Scholar] [CrossRef]
- Lin, T.Y.; Maire, M.; Belongie, S.; Hays, J.; Perona, P.; Ramanan, D.; Dollár, P.; Zitnick, C.L. Microsoft COCO: Common Objects in Context. In Proceedings of the Computer Vision–ECCV 2014: 13th European Conference, Zurich, Switzerland, 6–12 September 2014; Proceedings, Part V 13. pp. 740–755. [Google Scholar]
- Everingham, M.; Van Gool, L.; Williams, C.K.I.; Winn, J.; Zisserman, A. The Pascal Visual Object Classes (VOC) Challenge. Int. J. Comput. Vis. 2010, 88, 303–338. [Google Scholar] [CrossRef]
- Roberts, D.; Golparvar-Fard, M. End-to-End Vision-Based Detection, Tracking and Activity Analysis of Earthmoving Equipment Filmed at Ground Level. Autom. Constr. 2019, 105, 102811. [Google Scholar] [CrossRef]
- Xiao, B.; Kang, S.-C. Vision-Based Method Integrating Deep Learning Detection for Tracking Multiple Construction Machines. J. Comput. Civ. Eng. 2021, 35, 04020071. [Google Scholar] [CrossRef]
- Kim, D.; Liu, M.; Lee, S.; Kamat, V.R. Remote Proximity Monitoring between Mobile Construction Resources Using Camera-Mounted UAVs. Autom. Constr. 2019, 99, 168–182. [Google Scholar] [CrossRef]
- Fang, Q.; Li, H.; Luo, X.; Ding, L.; Luo, H.; Rose, T.M.; An, W. Detecting Non-Hardhat-Use by a Deep Learning Method from Far-Field Surveillance Videos. Autom. Constr. 2018, 85, 1–9. [Google Scholar] [CrossRef]
- Kim, S.; Irizarry, J.; Bastos Costa, D. Potential Factors Influencing the Performance of Unmanned Aerial System (UAS) Integrated Safety Control for Construction Worksites. In Proceedings of the Construction Research Congress 2016, San Juan, Puerto Rico, 31 May–2 June 2016; pp. 2039–2049. [Google Scholar]
- Liu, P.; Chen, A.Y.; Huang, Y.N.; Han, J.Y.; Lai, J.S.; Kang, S.C.; Wu, T.H.; Wen, M.C.; Tsai, M.H. A Review of Rotorcraft Unmanned Aerial Vehicle (UAV) Developments and Applications in Civil Engineering. Smart Struct. Syst. 2014, 13, 1065–1094. [Google Scholar] [CrossRef]
- Akinsemoyin, A.; Awolusi, I.; Chakraborty, D.; Al-Bayati, A.J.; Akanmu, A. Unmanned Aerial Systems and Deep Learning for Safety and Health Activity Monitoring on Construction Sites. Sensors 2023, 23, 6690. [Google Scholar] [CrossRef]
- Duan, R.; Deng, H.; Tian, M.; Deng, Y.; Lin, J. SODA: A Large-Scale Open Site Object Detection Dataset for Deep Learning in Construction. Autom. Constr. 2022, 142, 104499. [Google Scholar] [CrossRef]
- He, K.; Gkioxari, G.; Dollár, P.; Girshick, R. Mask R-CNN. IEEE Trans. Pattern Anal. Mach. Intell. 2020, 42, 386–397. [Google Scholar] [CrossRef]
- Cai, Z.; Vasconcelos, N. Cascade R-CNN: High Quality Object Detection and Instance Segmentation. IEEE Trans. Pattern Anal. Mach. Intell. 2021, 43, 1483–1498. [Google Scholar] [CrossRef]
- Huang, Z.; Huang, L.; Gong, Y.; Huang, C.; Wang, X. Mask Scoring R-CNN. In Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition, Long Beach, CA, USA, 16–20 June 2019; Volume 2019, pp. 6402–6411. [Google Scholar]
- Chen, K.; Pang, J.; Wang, J.; Xiong, Y.; Li, X.; Sun, S.; Feng, W.; Liu, Z.; Shi, J.; Ouyang, W.; et al. Hybrid Task Cascade for Instance Segmentation. In Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition, Long Beach, CA, USA, 16–20 June 2019; Volume 2019, pp. 4969–4978. [Google Scholar] [CrossRef]
- Kirillov, A.; Wu, Y.; He, K.; Girshick, R. PointRend: Image Segmentation as Rendering. In Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition, Seattle, WA, USA, 14–19 June 2020; pp. 9799–9808. [Google Scholar]
- Fei-Fei, L.; Fergus, R.; Perona, P. One-Shot Learning of Object Categories. IEEE Trans. Pattern Anal. Mach. Intell. 2006, 28, 594–611. [Google Scholar] [CrossRef] [PubMed]
- Griffin, G.; Holub, A.; Perona, P. Caltech-256 Object Category Dataset. Available online: http://www.vision.caltech.edu/datasets/ (accessed on 19 August 2024).
- Krizhevsky, A. Learning Multiple Layers of Features from Tiny Images. Master’s Thesis, Department of Computer Science, University of Toronto, Toronto, ON, Canada, 2009. [Google Scholar]
- Russakovsky, O.; Deng, J.; Su, H.; Krause, J.; Satheesh, S.; Ma, S.; Huang, Z.; Karpathy, A.; Khosla, A.; Bernstein, M.; et al. Imagenet Large Scale Visual Recognition Challenge. Int. J. Comput. Vis. 2015, 115, 211–252. [Google Scholar] [CrossRef]
- Kuznetsova, A.; Rom, H.; Alldrin, N.; Uijlings, J.; Krasin, I.; Pont-Tuset, J.; Kamali, S.; Popov, S.; Malloci, M.; Kolesnikov, A.; et al. The Open Images Dataset V4: Unified Image Classification, Object Detection, and Visual Relationship Detection at Scale. Int. J. Comput. Vis. 2020, 128, 1956–1981. [Google Scholar] [CrossRef]
- Gupta, A.; Dollar, P.; Girshick, R. LVIS: A Dataset for Large Vocabulary Instance Segmentation. In Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition, Long Beach, CA, USA, 16–20 June 2019; Volume 2019, pp. 5351–5359. [Google Scholar]
- Tajeen, H.; Zhu, Z. Image Dataset Development for Measuring Construction Equipment Recognition Performance. Autom. Constr. 2014, 48, 1–10. [Google Scholar] [CrossRef]
- Kim, H.; Kim, H.; Hong, Y.W.; Byun, H. Detecting Construction Equipment Using a Region-Based Fully Convolutional Network and Transfer Learning. J. Comput. Civ. Eng. 2018, 32, 04017082. [Google Scholar] [CrossRef]
- Xiao, B.; Kang, S.-C. Development of an Image Data Set of Construction Machines for Deep Learning Object Detection. J. Comput. Civ. Eng. 2021, 35, 1–18. [Google Scholar] [CrossRef]
- Xuehui, A.; Li, Z.; Zuguang, L.; Chengzhi, W.; Pengfei, L.; Zhiwei, L. Dataset and Benchmark for Detecting Moving Objects in Construction Sites. Autom. Constr. 2021, 122, 103482. [Google Scholar] [CrossRef]
- Del Savio, A.; Luna, A.; Cárdenas-Salas, D.; Vergara, M.; Urday, G. Dataset of Manually Classified Images Obtained from a Construction Site. Data Brief 2022, 42, 108042. [Google Scholar] [CrossRef] [PubMed]
- Yan, X.; Zhang, H.; Wu, Y.; Lin, C.; Liu, S. Construction Instance Segmentation (CIS) Dataset for Deep Learning-Based Computer Vision. Autom. Constr. 2023, 156, 105083. [Google Scholar] [CrossRef]
- Soltani, M.M.; Zhu, Z.; Hammad, A. Automated Annotation for Visual Recognition of Construction Resources Using Synthetic Images. Autom. Constr. 2016, 62, 14–23. [Google Scholar] [CrossRef]
- Barrera-Animas, A.Y.; Davila Delgado, J.M. Generating Real-World-like Labelled Synthetic Datasets for Construction Site Applications. Autom. Constr. 2023, 151, 104850. [Google Scholar] [CrossRef]
- Bang, S.; Baek, F.; Park, S.; Kim, W.; Kim, H. Image Augmentation to Improve Construction Resource Detection Using Generative Adversarial Networks, Cut-and-Paste, and Image Transformation Techniques. Autom. Constr. 2020, 115, 103198. [Google Scholar] [CrossRef]
- Hwang, J.; Kim, J.; Chi, S.; Seo, J.O. Development of Training Image Database Using Web Crawling for Vision-Based Site Monitoring. Autom. Constr. 2022, 135, 104141. [Google Scholar] [CrossRef]
- Hwang, J.; Kim, J.; Chi, S. Site-Optimized Training Image Database Development Using Web-Crawled and Synthetic Images. Autom. Constr. 2023, 151, 104886. [Google Scholar] [CrossRef]
- Memarzadeh, M.; Golparvar-Fard, M.; Niebles, J.C. Automated 2D Detection of Construction Equipment and Workers from Site Video Streams Using Histograms of Oriented Gradients and Colors. Autom. Constr. 2013, 32, 24–37. [Google Scholar] [CrossRef]
- Fang, W.; Ding, L.; Zhong, B.; Love, P.E.D.; Luo, H. Automated Detection of Workers and Heavy Equipment on Construction Sites: A Convolutional Neural Network Approach. Adv. Eng. Inform. 2018, 37, 139–149. [Google Scholar] [CrossRef]
- Xiang, X.; Lv, N.; Guo, X.; Wang, S.; El Saddik, A. Engineering Vehicles Detection Based on Modified Faster R-CNN for Power Grid Surveillance. Sensors 2018, 18, 2258. [Google Scholar] [CrossRef] [PubMed]
- Lin, Z.-H.; Chen, A.Y.; Hsieh, S.-H. Temporal Image Analytics for Abnormal Construction Activity Identification. Autom. Constr. 2021, 124, 103572. [Google Scholar] [CrossRef]
- Golparvar-Fard, M.; Heydarian, A.; Niebles, J.C. Vision-Based Action Recognition of Earthmoving Equipment Using Spatio-Temporal Features and Support Vector Machine Classifiers. Adv. Eng. Inform. 2013, 27, 652–663. [Google Scholar] [CrossRef]
- Zhu, Z.; Ren, X.; Chen, Z. Integrated Detection and Tracking of Workforce and Equipment from Construction Jobsite Videos. Autom. Constr. 2017, 81, 161–171. [Google Scholar] [CrossRef]
- Luo, X.; Li, H.; Cao, D.; Dai, F.; Seo, J.; Lee, S. Recognizing Diverse Construction Activities in Site Images via Relevance Networks of Construction-Related Objects Detected by Convolutional Neural Networks. J. Comput. Civ. Eng. 2018, 32, 1–16. [Google Scholar] [CrossRef]
- Gong, J.; Caldas, C.H. An Object Recognition, Tracking, and Contextual Reasoning-Based Video Interpretation Method for Rapid Productivity Analysis of Construction Operations. Autom. Constr. 2011, 20, 1211–1226. [Google Scholar] [CrossRef]
- Kim, J.; Chi, S. Action Recognition of Earthmoving Excavators Based on Sequential Pattern Analysis of Visual Features and Operation Cycles. Autom. Constr. 2019, 104, 255–264. [Google Scholar] [CrossRef]
- Kim, H.; Bang, S.; Jeong, H.; Ham, Y.; Kim, H. Analyzing Context and Productivity of Tunnel Earthmoving Processes Using Imaging and Simulation. Autom. Constr. 2018, 92, 188–198. [Google Scholar] [CrossRef]
- Soltani, M.M.; Zhu, Z.; Hammad, A. Skeleton Estimation of Excavator by Detecting Its Parts. Autom. Constr. 2017, 82, 1–15. [Google Scholar] [CrossRef]
- Mahmood, B.; Han, S.; Seo, J. Implementation Experiments on Convolutional Neural Network Training Using Synthetic Images for 3D Pose Estimation of an Excavator on Real Images. Autom. Constr. 2022, 133, 103996. [Google Scholar] [CrossRef]
- Chi, S.; Caldas, C.H. Automated Object Identification Using Optical Video Cameras on Construction Sites. Comput. Civ. Infrastruct. Eng. 2011, 26, 368–380. [Google Scholar] [CrossRef]
- Rezazadeh Azar, E.; McCabe, B. Automated Visual Recognition of Dump Trucks in Construction Videos. J. Comput. Civ. Eng. 2012, 26, 769–781. [Google Scholar] [CrossRef]
- Rezazadeh Azar, E.; Dickinson, S.; McCabe, B. Server-Customer Interaction Tracker: Computer Vision–Based System to Estimate Dirt-Loading Cycles. J. Constr. Eng. Manag. 2013, 139, 785–794. [Google Scholar] [CrossRef]
- Kim, J.; Chi, S.; Seo, J. Interaction Analysis for Vision-Based Activity Identification of Earthmoving Excavators and Dump Trucks. Autom. Constr. 2018, 87, 297–308. [Google Scholar] [CrossRef]
- Kim, J.; Hwang, J.; Chi, S.; Seo, J.O. Towards Database-Free Vision-Based Monitoring on Construction Sites: A Deep Active Learning Approach. Autom. Constr. 2020, 120, 103376. [Google Scholar] [CrossRef]
- Kim, J.; Chi, S. A Few-Shot Learning Approach for Database-Free Vision-Based Monitoring on Construction Sites. Autom. Constr. 2021, 124, 103566. [Google Scholar] [CrossRef]
- Arabi, S.; Haghighat, A.; Sharma, A. A Deep-Learning-Based Computer Vision Solution for Construction Vehicle Detection. Comput. Civ. Infrastruct. Eng. 2020, 35, 753–767. [Google Scholar] [CrossRef]
- Guo, Y.; Xu, Y.; Li, S. Dense Construction Vehicle Detection Based on Orientation-Aware Feature Fusion Convolutional Neural Network. Autom. Constr. 2020, 112, 103124. [Google Scholar] [CrossRef]
- Meng, L.; Peng, Z.; Zhou, J.; Zhang, J.; Lu, Z.; Baumann, A.; Du, Y. Real-Time Detection of Ground Objects Based on Unmanned Aerial Vehicle Remote Sensing with Deep Learning: Application in Excavator Detection for Pipeline Safety. Remote Sens. 2020, 12, 182. [Google Scholar] [CrossRef]
- Bang, S.; Hong, Y.; Kim, H. Proactive Proximity Monitoring with Instance Segmentation and Unmanned Aerial Vehicle-Acquired Video-Frame Prediction. Comput. Civ. Infrastruct. Eng. 2021, 36, 800–816. [Google Scholar] [CrossRef]
- DJI Camera Drones. Available online: https://www.dji.com/global/products/camera-drones (accessed on 19 August 2024).
- Yuneec Drones. Available online: https://yuneec.online/drones/ (accessed on 19 August 2024).
- CVAT: Powerful and Efficient Computer Vision Annotation Tool. Available online: https://github.com/opencv/cvat (accessed on 19 August 2022).
- Reimers, N.; Gurevych, I. Sentence-BERT: Sentence Embeddings Using Siamese BERT-Networks. In Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP), Hong Kong, China, 3–7 November 2019; pp. 3982–3992. [Google Scholar]
- Soviany, P.; Ionescu, R.T. Optimizing the Trade-off between Single-Stage and Two-Stage Deep Object Detectors Using Image Difficulty Prediction. In Proceedings of the 2018 20th International Symposium on Symbolic and Numeric Algorithms for Scientific Computing (SYNASC), Timisoara, Romania, 20–23 September 2018; pp. 209–214. [Google Scholar]
- Carranza-García, M.; Torres-Mateo, J.; Lara-Benítez, P.; García-Gutiérrez, J. On the Performance of One-Stage and Two-Stage Object Detectors in Autonomous Vehicles Using Camera Data. Remote Sens. 2021, 13, 89. [Google Scholar] [CrossRef]
- MMDetection Contributors. OpenMMLab Detection Toolbox and Benchmark. Available online: https://github.com/open-mmlab/mmdetection (accessed on 19 August 2024).
| | Tajeen et al. [27] | AIM [28] | ACID [29] | MOCS [30] | Del Savio et al. [31] | CIS [32] | AIDCON |
|---|---|---|---|---|---|---|---|
| Year | 2014 | 2018 | 2021 | 2021 | 2022 | 2023 | 2024 |
| No. of Machinery Categories | 5 | 5 | 10 | 11 | 7 | 7 | 8 |
| No. of Images | 2000 | 2920 | 10,000 | 41,668 | 1046 | 50,000 | 2155 |
| Instances per Image | 1 | 1 | 1.58 | 5.34 | N/A | 2.08 | 4.34 |
| Ratio of Aerial Images | 0 | N/A | 0.50% | N/A | 0 | N/A | 100% |
| Image Source | On-site | ImageNet | On-site + Web | On-site | On-site | On-site + Web | On-site |
| Devices | Digital cameras | N/A | Cell phone cameras, UAVs, on-site cameras | Smartphones, UAVs, digital cameras | Static cameras | Smartphones, UAVs, digital and security cameras | UAVs |
| Type of Annotation | Bounding Box | Bounding Box | Bounding Box | Pixel-wise | Bounding Box | Pixel-wise | Pixel-wise |
| Clustering Strategy | No | No | No | No | No | No | Yes |
| UAV Model | P4 RTK | Mavic Pro | Mavic 2 Pro | Yuneec H520 E90 |
|---|---|---|---|---|
| Sensor | 1″ CMOS | 1/2.3″ CMOS | 1″ CMOS | 1″ CMOS |
| FOV | 84° | 78.8° | 77° | 91° |
| Resolution (H × V) | 5472 × 3648 | 4000 × 3000 | 4000 × 3000 | 5472 × 3648 |
| Flight Time (min) | 30 | 27 | 31 | 30 |
| Weight (g) | 1391 | 734 | 907 | 1945 |
| Transmission Range (km) | 7 | 7 | 10 | 7 |
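The UAV specifications above govern how much ground each image covers. As a rough illustration (not from the paper), the footprint and ground sampling distance (GSD) can be estimated from the FOV, pixel resolution, and flight altitude; note this sketch treats the listed FOV as applying along the image width, whereas manufacturers often quote diagonal FOV, so the numbers are approximations:

```python
import math

def ground_footprint_and_gsd(fov_deg, res_px, altitude_m):
    """Estimate ground footprint (m) and GSD (cm/px) along one image axis.

    Approximation: fov_deg is assumed to be the field of view along that
    axis; quoted drone FOVs are often diagonal, so results are indicative.
    """
    footprint_m = 2 * altitude_m * math.tan(math.radians(fov_deg) / 2)
    gsd_cm = footprint_m / res_px * 100
    return footprint_m, gsd_cm

# Example with P4 RTK-like specs (84° FOV, 5472 px image width) at 50 m
fp, gsd = ground_footprint_and_gsd(84, 5472, 50)
print(f"footprint ≈ {fp:.1f} m, GSD ≈ {gsd:.2f} cm/px")
```

At lower altitudes the footprint shrinks and the GSD improves proportionally, which is why machine instances in aerial datasets vary so widely in pixel size.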
| Images | Annotations | Categories |
|---|---|---|
| … { "id": 697, "width": 5472, "height": 3648, "file_name": "images05.jpg", "hasCategories": [1, 1, 7, 3, 2, 5, 4, 5], "clusterID": 5 } … | … { "id": 3874, "image_id": 697, "category_id": 7, "segmentation": [[493.05, …, 2128.57]], "bbox": [87.72, 1949.66, 412.43, 869.26] } … | … { "id": 2, "name": "excavator" }, { "id": 3, "name": "backhoe_loader" }, { "id": 4, "name": "wheel_loader" }, { "id": 5, "name": "compactor" } … |
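Because the annotations follow this COCO-style layout, simple statistics such as instances per image fall out of a few dictionary lookups. The sketch below uses a hypothetical miniature dataset whose field names mirror the table above (the values and file name are made up for illustration):

```python
from collections import defaultdict

# Hypothetical miniature of a COCO-style AIDCON annotation file;
# field names follow the table above, values are illustrative only.
sample = {
    "images": [
        {"id": 697, "width": 5472, "height": 3648,
         "file_name": "example.jpg", "clusterID": 5},
    ],
    "annotations": [
        {"id": 3874, "image_id": 697, "category_id": 2,
         "bbox": [87.72, 1949.66, 412.43, 869.26]},
        {"id": 3875, "image_id": 697, "category_id": 5,
         "bbox": [500.0, 120.0, 300.0, 250.0]},
    ],
    "categories": [
        {"id": 2, "name": "excavator"},
        {"id": 5, "name": "compactor"},
    ],
}

def instances_per_image(dataset):
    """Map each image id to its number of annotated machine instances."""
    counts = defaultdict(int)
    for ann in dataset["annotations"]:
        counts[ann["image_id"]] += 1
    return dict(counts)

def category_names(dataset, image_id):
    """Resolve the category names of all instances on one image."""
    id2name = {c["id"]: c["name"] for c in dataset["categories"]}
    return [id2name[a["category_id"]]
            for a in dataset["annotations"] if a["image_id"] == image_id]

print(instances_per_image(sample))   # {697: 2}
print(category_names(sample, 697))   # ['excavator', 'compactor']
```

For the real file, replace `sample` with `json.load(open(path))`; the same two helpers then reproduce per-image instance counts of the kind reported in the dataset comparison.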
| Algorithm | mAP | mAP50 | mAP75 | mAPm | mAPl |
|---|---|---|---|---|---|
| Hybrid Task Cascade | 67.7 | 92.4 | 81.5 | 47.1 | 69.0 |
| Cascade Mask R-CNN | 66.2 | 91.0 | 81.4 | 52.4 | 67.6 |
| Mask Scoring R-CNN | 66.1 | 88.4 | 80.0 | 36.6 | 68.0 |
| PointRend | 68.2 | 92.6 | 83.5 | 49.7 | 69.6 |
| Mask R-CNN | 66.6 | 91.6 | 80.4 | 46.8 | 67.8 |
| Algorithm | mAP | mAP50 | mAP75 | mAPm | mAPl |
|---|---|---|---|---|---|
| Hybrid Task Cascade | 71.9 | 93.7 | 86.4 | 60.4 | 72.5 |
| Cascade Mask R-CNN | 70.7 | 93.4 | 85.0 | 58.9 | 71.4 |
| Mask Scoring R-CNN | 72.9 | 93.7 | 88.3 | 41.9 | 73.8 |
| PointRend | 72.9 | 94.2 | 88.4 | 49.2 | 73.6 |
| Mask R-CNN | 71.4 | 93.9 | 86.6 | 57.1 | 72.3 |
| Algorithm | Dump Truck | Excavator | Backhoe Loader | Wheel Loader | Compactor | Dozer | Grader | Car | Other |
|---|---|---|---|---|---|---|---|---|---|
| PointRend (clustered) | 97.1 | 97.5 | 92.2 | 92.8 | 91.5 | 95.9 | 92.3 | 96.5 | 77.7 |
| PointRend (unclustered) | 97.3 | 97.8 | 97.9 | 96.6 | 92.6 | 86.5 | 100 | 95 | 84.2 |
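The mAP figures in the tables above follow COCO-style evaluation (averaging over IoU thresholds, with mAP50 and mAP75 at fixed thresholds). As a simplified sketch of what an AP number at a single IoU threshold means, the code below greedily matches predicted boxes to ground truth by descending confidence and integrates the resulting precision-recall curve; it is my own illustration with step integration, not the benchmark's evaluation pipeline, which uses COCO's interpolated variant over segmentation masks:

```python
def iou(a, b):
    """IoU of two boxes in [x, y, w, h] format."""
    ax2, ay2 = a[0] + a[2], a[1] + a[3]
    bx2, by2 = b[0] + b[2], b[1] + b[3]
    iw = max(0.0, min(ax2, bx2) - max(a[0], b[0]))
    ih = max(0.0, min(ay2, by2) - max(a[1], b[1]))
    inter = iw * ih
    union = a[2] * a[3] + b[2] * b[3] - inter
    return inter / union if union > 0 else 0.0

def average_precision(preds, gts, iou_thr=0.5):
    """AP for one category: preds is a list of (score, box), gts a list
    of boxes. Greedy matching by descending score; AP is the area under
    the (stepwise) precision-recall curve."""
    preds = sorted(preds, key=lambda p: -p[0])
    matched = [False] * len(gts)
    tp = fp = 0
    ap, prev_recall = 0.0, 0.0
    for score, box in preds:
        best, best_i = 0.0, -1
        for i, g in enumerate(gts):
            if not matched[i]:
                o = iou(box, g)
                if o > best:
                    best, best_i = o, i
        if best >= iou_thr:          # true positive: claims this GT box
            matched[best_i] = True
            tp += 1
        else:                        # false positive: no unmatched GT fits
            fp += 1
        precision = tp / (tp + fp)
        recall = tp / len(gts)
        ap += precision * (recall - prev_recall)
        prev_recall = recall
    return ap
```

For example, one correct detection plus one stray detection over two ground-truth boxes yields AP = 0.5: recall tops out at 0.5 and precision at that recall is 1.0.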
© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Share and Cite
Ersoz, A.B.; Pekcan, O.; Akbas, E. AIDCON: An Aerial Image Dataset and Benchmark for Construction Machinery. Remote Sens. 2024, 16, 3295. https://doi.org/10.3390/rs16173295