Framework for LLM-Enabled Construction Robot Task Planning: Knowledge Base Preparation and Robot–LLM Dialogue for Interior Wall Painting
Abstract
1. Introduction
2. Literature Review
2.1. Current State of Construction Robot Task Planning Using BIM
2.2. Utilization of LLMs in Construction Industry
2.3. Utilization of LLMs for Robot Task Planning and Control
2.4. Point of Departure
3. Objective and Scope
4. Automated Construction Robot Task Planning via LLM–Robot Communication
4.1. Framework for LLM-Enhanced Construction Robot Task Planning
4.2. Painting Robot Software Prototype with BIM–Robot–ChatGPT-4 Integration
4.2.1. Contextual Data Preparation in BIM Interface
- Extraction of building element data: Using PyRevit, the interface extracts conventional building element data and formats it into JSON files. Our approach extracts only the building element data needed for the painting robot’s operations as defined in the human-language task specification. Consequently, the scope of building data extraction in this study is limited to properties of room elements mentioned in the painting task specification, such as room names and center coordinates. The data to be extracted can vary significantly depending on how the task specifications are written.
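The extraction script itself must run inside Revit through pyRevit, so the sketch below shows only the downstream shaping step: turning collected room names and center points into the JSON structure used by the robot (field names follow Appendix C; `format_room_records` and the input record shape are illustrative assumptions, not the paper’s actual data model).

```python
import json

def format_room_records(rooms):
    """Shape raw room data (as a pyRevit script might collect it) into the
    JSON structure consumed by the painting robot. Field names mirror the
    Appendix C example; the input shape is a hypothetical intermediate."""
    records = []
    for room in rooms:
        records.append({
            "Area": room["name"],
            "Room_Location": {
                "X": round(room["center"][0], 2),
                "Y": round(room["center"][1], 2),
                "Z": 0.0,
                "Orientation_Radians": room.get("orientation", 0.0),
            },
        })
    return json.dumps(records, indent=2)

# One living-room record with a center point taken from the BIM model.
raw = [{"name": "Living_room", "center": (8.13, 6.39), "orientation": 1.5708}]
print(format_room_records(raw))
```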
- Modeling of objects related to robot task performance: Robot task planning requires explicit information that human workers do not typically need. For instance, the task specification for a painting robot includes actions such as “navigate to the painting location”, which requires precise painting locations to be defined within the building model. Therefore, this study created custom Revit family models specifically to define painting locations. This approach allows construction superintendents to provide precise data for the robot’s painting operations directly, by modeling these robot-related objects in the Revit building model. PyRevit is then used to extract and format these data into JSON files. As with building elements, which family models must be created depends on the type of robot task and how the task specification is written. For example, for a drywall installation robot, a task specification stating “pick up drywall board from the nearest material storage” requires a family model for material storage locations.
- Creation of robot task specifications: The interface enables users to select detailed specifications for robot tasks written in natural human language. These specifications describe sequences of required actions (such as navigating to a room, scanning the room, and moving to painting locations) and the robot skills needed to perform them. Using human language for task specification allows users to describe operational details without in-depth technical knowledge of robotics. Nonetheless, users still need a basic understanding of the robot’s capabilities so that required actions are matched with available robot skills; for instance, if the robot cannot navigate to a destination, the specification should not include such an action.
- Configuration of robot profiles: Through the interface, users can configure robot profiles that specify each robot’s properties and available skills. For example, the profile for the “Husky_painter” includes skills such as navigation, speed setting, painting, and scanning. This profile informs ChatGPT-4 of the robot’s capabilities so that it generates only executable instructions aligned with the robot’s actual functions.
- Defining robot–ChatGPT interaction guidelines: The interface also allows users to create guidelines for the interaction between the robot and ChatGPT-4. These guidelines describe ChatGPT’s expected roles, the types of responses to be generated, and prompt–response example sets. As shown in Appendix A, this study enables ChatGPT-4 to produce responses both in natural language for the human supervisor and in JSON format for the painting robot. We also defined three response types: a type 1 response triggers a new action, a type 2 response confirms the status of an ongoing action, and a type 3 response changes the work plan based on superintendent input.
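On the robot side, the three response types can be distinguished mechanically from the JSON fields shown in Appendix A. The following classifier is an illustrative sketch (the `classify_response` helper is not the paper’s code):

```python
import json

# Distinguish the three response types defined in the interaction guideline
# by inspecting the JSON fields from Appendix A. Illustrative helper only.
def classify_response(raw_json):
    response = json.loads(raw_json)
    status = response.get("status")
    if status is None and "actions" in response:
        return 1  # type 1: next action(s) for the robot to execute
    if status == "busy":
        return 2  # type 2: robot is mid-action, nothing new to do
    if status in ("replanning", "adjusting motion parameter"):
        return 3  # type 3: work plan changed by superintendent input
    raise ValueError("unrecognized response shape")

type1 = '{"actions": [{"description": "go", "function_called": "navigateToLocation", "arguments": []}]}'
type2 = '{"status": "busy", "message": "No additional action required at the moment."}'
type3 = '{"status": "replanning", "message": "Skipping locations 1-3."}'
print(classify_response(type1), classify_response(type2), classify_response(type3))
```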
4.2.2. ROS-ChatGPT Server
4.2.3. Husky Painter Skills
- navigateTo (float x, float y, float orientation): The “navigateTo” skill directs the robot to move to a specified location with a given orientation. The sequence begins with the robot submitting a prompt to the LLM, requesting permission to initiate navigation with the message “start navigating to destination.” Upon receiving a type 1 confirmation response, the robot executes navigation using the Adaptive Monte Carlo Localization (AMCL) [66], which provides accurate location tracking and orientation in the environment. Once the robot reaches the destination, it submits another prompt, “completed navigation,” to indicate task completion, and awaits a type 2 response from the LLM, confirming successful execution.
- paintWall (int strokes): The “paintWall” skill enables the robot to perform wall painting actions. The task starts with a prompt to the LLM, “start wall painting,” requesting confirmation to proceed. Once a type 1 response is received, the robot uses the MoveIt library [67] to control its UR5 arm, performing the specified number of paint strokes at the target location. After completing the painting task, the robot sends another prompt, “completed wall painting,” to the LLM to signal completion, and awaits a type 2 confirmation response indicating successful execution.
- scanRoom (): The “scanRoom” skill allows the robot to perform a 360-degree scan of the room to gather environmental data. The process begins by sending the prompt “start room scanning” to the LLM, which, upon receiving a type 1 response, authorizes the robot to proceed. The robot then initiates a 360-degree rotation using AMCL to scan its surroundings. After completing the scan, it submits a prompt, “completed room scanning,” and waits for a type 2 response from the LLM to confirm task completion.
- setNavigationSpeed (int speed): This skill adjusts the robot’s navigation speed according to situational needs. The robot starts by sending a prompt, “setting navigation speed now,” to the LLM. Upon receiving a type 1 confirmation, it adjusts the internal speed parameter for navigation. After completing the speed adjustment, the robot sends a final prompt, “completed setting navigation speed,” to confirm with the LLM, receiving a type 2 response to indicate successful completion.
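All four skills share the same prompt/confirm cycle: announce the action, wait for a type 1 go-ahead, execute, report completion, and wait for a type 2 acknowledgement. A minimal sketch of that cycle follows, with a mocked LLM standing in for the ROS-ChatGPT server (`mock_llm` and `run_skill` are illustrative names, not the paper’s implementation):

```python
# Shared prompt/confirm cycle behind every skill: announce the action, wait
# for a type 1 go-ahead, execute, report completion, and wait for a type 2
# acknowledgement. mock_llm stands in for the ROS-ChatGPT server.
def mock_llm(prompt):
    if prompt.startswith("start"):
        return {"type": 1}                # authorize the announced action
    return {"type": 2, "status": "busy"}  # acknowledge reported completion

def run_skill(llm, action_name, execute):
    go = llm("start " + action_name)
    if go["type"] != 1:
        raise RuntimeError("action not authorized")
    execute()  # e.g. AMCL navigation or MoveIt arm strokes on the real robot
    ack = llm("completed " + action_name)
    return ack["type"] == 2

log = []
ok = run_skill(mock_llm, "navigating to destination",
               lambda: log.append("AMCL drive"))
print(ok, log)
```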
4.2.4. Skill Execution with Recursive Robot–ChatGPT-4 Communication
5. Case Study
5.1. Case Study Setup and Data Preparation
5.2. Painting Task Planning via Robot–ChatGPT Dialogue
5.3. Task Planning via Robot–ChatGPT Dialogue and Superintendent Input
6. Conclusions and Discussion
7. Limitations and Future Directions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
Appendix A. Robot–ChatGPT-4 Interaction Guideline
# Introduction of roles
I am a painting robot. I will execute actions according to your (ChatGPT’s) instructions to complete all the planned works.

# Notes for Robot–ChatGPT Interaction
1. I (robot) will submit prompts indicating the initiation and completion of actions. You (ChatGPT) should provide responses. A superintendent may occasionally provide additional instructions to make changes to the work plan. ChatGPT can provide three types of responses in JSON format following the structure shown in the response examples below.
2. Type 1: When I complete an action, provide a JSON response for the required next action.
3. Type 2: While I am performing one or more actions, provide a JSON response indicating I am busy (status should be "busy"). An example can be found below.
4. Type 3: When a superintendent submits a prompt with situation-specific instructions, provide a JSON response indicating I am replanning or adjusting a motion parameter (status can be "replanning" or "adjusting motion parameter"). Provide details about the changed work plan in the description. An example can be found below.

# Response examples
Response example 1 (type 1):
{ "actions": [ { "description": "provide comprehensive description, include contextual details", "function_called": "navigateToLocation", "arguments": [ {"name": "x", "value": 3}, {"name": "y", "value": 3}, {"name": "orientation", "value": 3.14} ] } ] }

Response example 2 (type 2):
{ "status": "busy", "message": "No additional action required at the moment.", "actions_in_progress": [ { "description": "The robot is currently navigating to the specified location.", "function_called": "navigateToLocation", "arguments": [ {"name": "x", "value": 3}, {"name": "y", "value": 3}, {"name": "orientation", "value": 3.14} ] } ] }

Response example 3 (type 3):
{ "status": "replanning", "message": "No additional action required at the moment.", "actions_in_progress": [ { "description": "Changed work plan based on the input from superintendent.", "function_called": "navigateToLocation", "arguments": [ {"name": "x", "value": 3}, {"name": "y", "value": 3}, {"name": "orientation", "value": 3.14} ] } ] }
Appendix B. Robotic Painting Task Specification
# Name: Task Specification for Wall Painting
# Specification ID: WP1
# Task Description: The robot will paint designated interior or exterior walls according to the construction project’s plan.
# Sequence of Actions in Task:
Action 1. Navigate to the Room: The robot first moves to the central position of each specified room.
Action 2. Scan Room: Turn 360 degrees to update the map for navigation.
Action 3. Navigate to the Painting Location: Move to a painting location.
Action 4. Perform Wall Painting: Carry out the painting operation, applying paint with the designated stroke time for consistency.
# Notes for Task Execution: Actions 1 through 4 cannot be performed at the same time. For example, navigation must be completed before wall painting can start.
# Skill Utilization:
1. Navigate to the Room: the navigateToLocation skill is used, with the room center coordinate as required input.
2. Scan Room: the scanRoom skill is used without any input.
3. Navigate to the Painting Location: the navigateToLocation skill is used, with the painting location coordinate as required input.
4. Perform Wall Painting: the paintWall skill is used, with the stroke time as required input.
Appendix C. Painting Locations
{
  "Area": "Living_room",
  "Room_Location": { "X": 8.13, "Y": 6.39, "Z": 0.0, "Orientation_Radians": 1.5708 },
  "Painting_Location": [
    { "Sequence": 1, "X": 5.20, "Y": 5.38, "Z": 0, "Orientation": 3.14, "stroke_time": 2 },
    { "Sequence": 2, "X": 5.20, "Y": 7.30, "Z": 0, "Orientation": 3.14, "stroke_time": 2 },
    { "Sequence": 3, "X": 6.46, "Y": 7.32, "Z": 0, "Orientation": 1.5707, "stroke_time": 2 },
    { "Sequence": 4, "X": 10.70, "Y": 7.32, "Z": 0, "Orientation": 1.5707, "stroke_time": 2 },
    { "Sequence": 5, "X": 12.29, "Y": 5.88, "Z": 0, "Orientation": 0, "stroke_time": 2 },
    { "Sequence": 6, "X": 12.30, "Y": 5.2, "Z": 0, "Orientation": -1.5707, "stroke_time": 2 },
    { "Sequence": 7, "X": 10.86, "Y": 5.2, "Z": 0, "Orientation": -1.5707, "stroke_time": 2 },
    { "Sequence": 8, "X": 7.86, "Y": 5.20, "Z": 0, "Orientation": -1.5707, "stroke_time": 2 },
    { "Sequence": 9, "X": 5.42, "Y": 5.20, "Z": 0, "Orientation": -1.5707, "stroke_time": 2 }
  ]
}
Appendix D. Robot Profile
{ "Robot_ID": "11011", "Name": "Husky_painter", "Task": [ "Wall_painting", "Ceiling_painting" ], "Skill": [ "navigateToLocation", "paintWall", "scanRoom", "setNavigationSpeed" ] }
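The profile can also serve as a robot-side safeguard: any instructed action whose skill is not listed in the profile can be filtered out before execution. The sketch below uses a hypothetical `executable_actions` helper (not part of the paper’s implementation) against the Husky_painter profile above:

```python
import json

# Husky_painter profile from Appendix D.
PROFILE = json.loads('{"Robot_ID": "11011", "Name": "Husky_painter",'
                     ' "Task": ["Wall_painting", "Ceiling_painting"],'
                     ' "Skill": ["navigateToLocation", "paintWall",'
                     ' "scanRoom", "setNavigationSpeed"]}')

# Hypothetical robot-side safeguard: keep only instructed actions whose
# skill appears in the profile, so only executable actions are attempted.
def executable_actions(response, profile):
    allowed = set(profile["Skill"])
    return [a for a in response.get("actions", [])
            if a["function_called"] in allowed]

resp = {"actions": [
    {"function_called": "navigateToLocation", "arguments": []},
    {"function_called": "weldBeam", "arguments": []},  # not a Husky_painter skill
]}
print([a["function_called"] for a in executable_actions(resp, PROFILE)])
```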
References
- Associated Builders and Contractors. 2024 Construction Workforce Shortage Tops Half a Million. Available online: https://www.abc.org/News-Media/News-Releases/abc-2024-construction-workforce-shortage-tops-half-a-million (accessed on 13 March 2024).
- CPWR. Fatal and Nonfatal Injuries in Construction. Available online: https://www.cpwr.com/research/data-center/data-dashboards/fatal-and-nonfatal-injuries-in-construction/ (accessed on 13 March 2024).
- Teicholz, P. Labor-productivity declines in the construction industry: Causes and remedies (another look). AECbytes Viewp. 2013, 67, 15.
- Bock, T. Construction robotics. Auton. Robot. 2007, 22, 201–209.
- Jones, K. Robots are Coming to the Construction Site. Available online: https://www.constructconnect.com/blog/construction-robotics (accessed on 13 March 2024).
- Meschini, S.; Iturralde, K.; Linner, T.; Bock, T. Novel applications offered by integration of robotic tools in BIM-based design workflow for automation in construction processes. In Proceedings of the CIB*IAARC W119 CIC 2016 Workshop, Munich, Germany, 31 August 2016; Available online: https://mediatum.ub.tum.de/1484218 (accessed on 13 March 2024).
- Kim, S.; Peavy, M.; Huang, P.; Kim, K. Development of BIM-integrated construction robot task planning and simulation system. Autom. Constr. 2021, 127, 103720.
- Correa, F. Robot-oriented design for production in the context of building information modeling. In Proceedings of the International Symposium on Automation and Robotics in Construction, Auburn, AL, USA, 18–21 July 2016.
- Eastman, C.; Teicholz, P.; Sacks, R.; Liston, K. BIM Handbook, 2nd ed.; John Wiley & Sons: Hoboken, NJ, USA, 2008.
- Sulankivi, K.; Kähkönen, K.; Mäkelä, T.; Kiviniemi, M. 4D-BIM for construction safety planning. In Proceedings of the W099-Special Track 18th CIB World Building Congress, Manchester, UK, 10–13 May 2010; pp. 117–128. Available online: https://www.researchgate.net/profile/Kristiina-Sulankivi/publication/228640694_4D-BIM_for_Construction_Safety_Planning/links/02bfe50ff850a0c2cd000000/4D-BIM-for-Construction-Safety-Planning.pdf (accessed on 23 April 2016).
- Koo, B.; Fischer, M. Feasibility study of 4D CAD in commercial construction. J. Constr. Eng. Manag. 2000, 126, 251–260.
- Vaugn, F. 3D & 4D CAD modeling on commercial design-build projects. In Computing in Civil Engineering; ASCE: Reston, VA, USA, 1996.
- Zhang, S.; Teizer, J.; Lee, J.; Eastman, C.; Venugopal, M. Building information modeling (BIM) and safety: Automatic safety checking of construction models and schedules. Autom. Constr. 2013, 29, 183–195.
- Park, J.; Kim, K.; Cho, Y. Framework of automated construction-safety monitoring using cloud-enabled BIM and BLE mobile tracking sensors. J. Constr. Eng. Manag. 2017, 143, 05016019.
- Kim, K.; Teizer, J. Automatic design and planning of scaffolding systems using building information modeling. Adv. Eng. Inform. 2014, 28, 66–80.
- Han, K.; Golparvar-Fard, M. Appearance-based material classification for monitoring of operation-level construction progress using 4D BIM and site photologs. Autom. Constr. 2015, 53, 44–57.
- ROS.org. Powering the World’s Robots. Available online: https://www.ros.org/ (accessed on 28 July 2021).
- ROS-Industrial. ROS-Industrial. Available online: https://rosindustrial.org/ (accessed on 4 March 2021).
- Everett, J.; Slocum, A. Automation and robotics opportunities: Construction versus manufacturing. J. Constr. Eng. Manag. 1994, 120, 443–452.
- Zhu, A.; Pauwels, P.; Torta, E.; Zhang, H.; De Vries, B. Data linking and interaction between BIM and robotic operating system (ROS) for flexible construction planning. Autom. Constr. 2024, 163, 105426.
- Lakin, R.; Kim, K.; Huang, P. ROS-based robot simulation for repetitive labor-intensive construction tasks. In Proceedings of the 18th IEEE International Conference on Industrial Informatics (INDIN 2020), Warwick, UK, 20–23 July 2020.
- Lundeen, K.; Kamat, V.; Menassa, C.; McGee, W. Autonomous motion planning and task execution in geometrically adaptive robotized construction work. Autom. Constr. 2019, 100, 24–45.
- Follini, C.; Magnago, V.; Freitag, K.; Terzer, M.; Marcher, C.; Riedl, M.; Giusti, A.; Matt, D. BIM-integrated collaborative robotics for application in building construction and maintenance. Robotics 2020, 10, 2.
- Ding, L.; Jiang, W.; Zhou, Y.; Zhou, C.; Liu, S. BIM-based task-level planning for robotic brick assembly through image-based 3D modeling. Adv. Eng. Inform. 2020, 43, 100993.
- Wong Chong, O.; Zhang, J.; Voyles, R.; Min, B. BIM-based simulation of construction robotics in the assembly process of wood frames. Autom. Constr. 2022, 137, 104194.
- Zhu, A.; Pauwels, P.; De Vries, B. Smart component-oriented method of construction robot coordination for prefabricated housing. Autom. Constr. 2021, 129, 103778.
- Oyediran, H.; Turner, W.; Kim, K.; Barrows, M. Integration of 4D BIM and robot task planning: Creation and flow of construction-related information for action-level simulation of indoor wall frame installation. arXiv 2024, arXiv:2402.03602.
- OpenAI. GPT-4 technical report. arXiv 2023, arXiv:2303.08774v6.
- Huang, W.; Xia, F.; Xiao, T.; Chan, H.; Liang, J.; Florence, P.; Zeng, A.; Tompson, J.; Mordatch, I.; Chebotar, Y.; et al. Inner monologue: Embodied reasoning through planning with language models. Proc. Mach. Learn. Res. 2023.
- Sun, H.; Zhuang, Y.; Kong, L.; Dai, B.; Zhang, C. Adaplanner: Adaptive planning from feedback with language models. Adv. Neural Inf. Process. Syst. 2023, 36, 58202–58245.
- Liang, J.; Huang, W.; Xia, F.; Xu, P.; Hausman, K.; Ichter, B.; Florence, P.; Zeng, A. Code as policies: Language model programs for embodied control. In Proceedings of the IEEE International Conference on Robotics and Automation, London, UK, 29 May–2 June 2023.
- Quigley, M.; Conley, K.; Gerkey, B.; Faust, J.; Foote, T.; Leibs, J.; Wheeler, R.; Ng, A. ROS: An open-source robot operating system. In Proceedings of the ICRA Workshop on Open Source Software, Kobe, Japan, 12–17 May 2009; Volume 3, p. 5.
- Kim, K.; Peavy, M. BIM-based semantic building world modeling for robot task planning and execution in built environments. Autom. Constr. 2022, 138, 104247.
- ROS.org. urdf. Available online: http://wiki.ros.org/urdf (accessed on 28 November 2021).
- Yin, H.; Lin, Z.; Yeoh, J. Semantic localization on BIM-generated maps using a 3D LiDAR sensor. Autom. Constr. 2023, 146, 104641.
- Ghimire, P.; Kim, K.; Acharya, M. Opportunities and challenges of generative AI in construction industry: Focusing on adoption of text-based models. Buildings 2024, 14, 220.
- Rane, N. Role of ChatGPT and similar generative artificial intelligence (AI) in construction industry. SSRN Electron. J. 2023.
- Saka, A.; Taiwo, R.; Saka, N.; Salami, B.; Ajayi, S.; Akande, K.; Kazemi, H. GPT models in construction industry: Opportunities, limitations, and a use case validation. Dev. Built Environ. 2024, 17, 100300.
- Taiwo, R.; Bello, I.; Abdulai, S.; Yussif, A.; Salami, B.; Saka, A.; Zayed, T. Generative AI in the construction industry: A state-of-the-art analysis. arXiv 2024, arXiv:2402.09939.
- Zheng, J.; Fischer, M. Dynamic prompt-based virtual assistant framework for BIM information search. Autom. Constr. 2023, 155, 105067.
- Prieto, S.; Mengiste, E.; García de Soto, B. Investigating the use of ChatGPT for the scheduling of construction projects. Buildings 2023, 13, 857.
- Mohamed Hassan, H.; Marengo, E.; Nutt, W. A BERT-based model for question answering on construction incident reports. In Lecture Notes in Computer Science; Springer: Cham, Switzerland, 2022; pp. 216–231.
- Chang, Y.; Wang, X.; Wang, J.; Wu, Y.; Yang, L.; Zhu, K.; Chen, H.; Yi, X.; Wang, C.; Wang, Y. A survey on evaluation of large language models. ACM Trans. Intell. Syst. Technol. 2024, 15, 1–45.
- Zhang, C.; Chen, J.; Li, J.; Peng, Y.; Mao, Z. Large language models for human–robot interaction: A review. Biomim. Intell. Robot. 2023, 3, 100131.
- Ding, Y.; Zhang, X.; Amiri, S.; Cao, N.; Yang, H.; Kaminski, A.; Esselink, C.; Zhang, S. Integrating action knowledge and LLMs for task planning and situation handling in open worlds. Auton. Robot. 2023, 47, 729–754.
- Micheli, V.; Fleuret, F. Language models are few-shot butlers. In Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing, Virtual, 7–11 November 2021; pp. 9312–9323.
- Li, S.; Puig, X.; Paxton, C.; Du, Y.; Wang, C.; Fan, L.; Akyürek, E.; Anandkumar, A.; Andreas, J.; Mordatch, I. Pre-trained language models for interactive decision-making. Adv. Neural Inf. Process. Syst. 2022, 35, 31199–31212.
- Huang, W.; Abbeel, P.; Pathak, D.; Mordatch, I. Language models as zero-shot planners: Extracting actionable knowledge for embodied agents. In Proceedings of the 39th International Conference on Machine Learning (ICML 2022), Baltimore, MD, USA, 17–23 July 2022.
- Brown, T.; Mann, B.; Ryder, N.; Subbiah, M.; Kaplan, J.; Dhariwal, P.; Neelakantan, A.; Shyam, P.; Sastry, G.; Askell, A. Language models are few-shot learners. Adv. Neural Inf. Process. Syst. 2020, 33, 1877–1901.
- Dong, Q.; Li, L.; Dai, D.; Zheng, C.; Wu, Z.; Chang, B.; Xu, J.; Sun, X.; Sui, Z. A survey on in-context learning. arXiv 2022, arXiv:2301.00234.
- Yao, S.; Zhao, J.; Yu, D.; Du, N.; Shafran, I.; Narasimhan, K.; Cao, Y. React: Synergizing reasoning and acting in language models. arXiv 2022, arXiv:2210.03629.
- Ding, Y.; Zhang, X.; Paxton, C.; Zhang, S. Task and motion planning with large language models for object rearrangement. In Proceedings of the IEEE International Conference on Intelligent Robots and Systems, Detroit, MI, USA, 1–5 October 2023.
- Chen, Y.; Arkin, J.; Zhang, Y.; Roy, N.; Fan, C. Autotamp: Autoregressive task and motion planning with LLMs as translators and checkers. arXiv 2023, arXiv:2306.06531.
- Wang, S.; Han, M.; Jiao, Z.; Zhang, Z.; Wu, Y.; Zhu, S.; Liu, H. LLM^3: Large language model-based task and motion planning with motion failure reasoning. arXiv 2024, arXiv:2403.11552.
- Ahn, M.; Brohan, A.; Brown, N.; Chebotar, Y.; Cortes, O.; David, B.; Finn, C.; Fu, C.; Gopalakrishnan, K.; Hausman, K. Do as I can, not as I say: Grounding language in robotic affordances. Proc. Mach. Learn. Res. 2023.
- Bhat, V.; Kaypak, A.; Krishnamurthy, P.; Karri, R.; Khorrami, F. Grounding LLMs for robot task planning using closed-loop state feedback. arXiv 2024, arXiv:2402.08546.
- Singh, I.; Blukis, V.; Mousavian, A.; Goyal, A.; Xu, D.; Tremblay, J.; Fox, D.; Thomason, J.; Garg, A. Progprompt: Generating situated robot task plans using large language models. In Proceedings of the IEEE International Conference on Robotics and Automation, London, UK, 29 May–2 June 2023.
- Song, C.; Wu, J.; Washington, C.; Sadler, B.; Chao, W.; Su, Y. LLM-planner: Few-shot grounded planning for embodied agents with large language models. In Proceedings of the IEEE/CVF International Conference on Computer Vision, Paris, France, 2–6 October 2023; pp. 2998–3009.
- Driess, D.; Xia, F.; Sajjadi, M.; Lynch, C.; Chowdhery, A.; Ichter, B.; Wahid, A.; Tompson, J.; Vuong, Q.; Yu, T.; et al. PaLM-E: An embodied multimodal language model. Proc. Mach. Learn. Res. 2023.
- You, H.; Ye, Y.; Zhou, T.; Zhu, Q.; Du, J. Robot-enabled construction assembly with automated sequence planning based on ChatGPT: RoboGPT. Buildings 2023, 13, 1772.
- Luo, H.; Wu, J.; Liu, J.; Antwi-Afari, M. Large language model-based code generation for the control of construction assembly robots: A hierarchical generation approach. Dev. Built Environ. 2024, 19, 100488.
- Park, S.; Wang, X.; Menassa, C.; Kamat, V.; Chai, J. Natural language instructions for intuitive human interaction with robotic assistants in field construction work. Autom. Constr. 2024, 161, 105345.
- Gazebo. Robot Simulation Made Easy. Available online: http://gazebosim.org/ (accessed on 28 November 2021).
- Iran-Nejad, E. pyRevit. Available online: https://pyrevitlabs.notion.site/pyRevit-bd907d6292ed4ce997c46e84b6ef67a0 (accessed on 8 May 2024).
- OpenAI. Text Generation Models. Available online: https://platform.openai.com/docs/guides/text-generation (accessed on 8 May 2024).
- ROS.org. amcl-ROS Wiki. Available online: http://wiki.ros.org/amcl (accessed on 28 November 2021).
- Chitta, S. Moveit!: An introduction. Stud. Comput. Intell. 2016, 625, 3–27.
- Yang, H. GitHub-pgm_map_creator: Create pgm Map from Gazebo World File for ROS Localization. Available online: https://github.com/hyfan1116/pgm_map_creator (accessed on 28 November 2021).
- ROS.org. rviz: Package Summary. Available online: http://wiki.ros.org/rviz (accessed on 28 November 2021).
# | Prompt | Response | Response Time (s)
---|---|---|---
1 | Robot: “Now, I start working. What is my first task?” | Type 1 response to navigate to center coordinates | 3.8635
2 | Robot: “Start navigating to room center” | Type 2: “No additional action required.” | 4.9501
3 | Robot: “Completed navigating to room center” | Type 1 response to initiate room scan | 3.5210
4 | Robot: “Started room scan” | Type 2: “No additional action required.” | 3.3425
5 | Robot: “Completed room scan” | Type 1 response to navigate to first painting location (x: 5.2, y: 5.38) | 4.5994
6 | Robot: “Start navigating to location 1” | Type 2: “No additional action required.” | 4.9233
7 | Robot: “Completed navigating to location 1” | Type 1 response to start painting at first location | 3.7831
8 | Robot: “Painting started” | Type 2: “No additional action required.” | 4.4186
9 | Robot: “Painting completed at location 1” | Type 1 response to navigate to second painting location (x: 5.2, y: 7.3) | 4.5604
10 | Robot: “Start navigating to location 2” | Type 2: “No additional action required.” | 4.9233
11 | Robot: “Completed navigating to location 2” | Type 1 response to start painting at second location | 3.7013
12 | Robot: “Painting started” | Type 2: “No additional action required.” | 3.7013
13 | Robot: “Painting completed at location 2” | Type 1 response to navigate to third painting location (x: 6.46, y: 7.32) | 4.5947
14 | Robot: “Start navigating to location 3” | Type 2: “No additional action required.” | 4.7253
15 | Robot: “Completed navigating to location 3” | Type 1 response to start painting at third location | 3.5528
16 | Robot: “Painting started” | Type 2: “No additional action required.” | 3.8591
17 | Robot: “Painting completed at location 3” | Type 1 response to navigate to fourth painting location (x: 10.7, y: 7.32) | 5.6109
18 | Robot: “Start navigating to location 4” | Type 2: “No additional action required.” | 5.2085
19 | Robot: “Completed navigating to location 4” | Type 1 response to start painting at fourth location | 3.0943
20 | Robot: “Painting started” | Type 2: “No additional action required.” | 4.0332
21 | Robot: “Painting completed at location 4” | Type 1 response to navigate to fifth painting location (x: 12.29, y: 5.88) | 4.4736
22 | Robot: “Start navigating to location 5” | Type 2: “No additional action required.” | 5.1159
23 | Robot: “Completed navigating to location 5” | Type 1 response to start painting at fifth location | 3.4591
24 | Robot: “Painting started” | Type 2: “No additional action required.” | 4.1301
25 | Robot: “Painting completed at location 5” | Type 1 response to navigate to sixth painting location (x: 12.3, y: 5.2) | 5.2085
26 | Robot: “Start navigating to location 6” | Type 2: “No additional action required.” | 5.2176
27 | Robot: “Completed navigating to location 6” | Type 1 response to start painting at sixth location | 3.5528
28 | Robot: “Painting started” | Type 2: “No additional action required.” | 3.7013
29 | Robot: “Painting completed at location 6” | Type 1 response to navigate to seventh painting location (x: 10.86, y: 5.2) | 4.8954
30 | Robot: “Start navigating to location 7” | Type 2: “No additional action required.” | 4.7253
31 | Robot: “Completed navigating to location 7” | Type 1 response to start painting at seventh location | 3.8591
32 | Robot: “Painting started” | Type 2: “No additional action required.” | 3.5528
33 | Robot: “Painting completed at location 7” | Type 1 response to navigate to eighth painting location (x: 7.86, y: 5.2) | 5.2176
34 | Robot: “Start navigating to location 8” | Type 2: “No additional action required.” | 5.6109
35 | Robot: “Completed navigating to location 8” | Type 1 response to start painting at eighth location | 3.6812
36 | Robot: “Painting started” | Type 2: “No additional action required.” | 4.0332
37 | Robot: “Painting completed at location 8” | Type 1 response to navigate to ninth painting location (x: 5.42, y: 5.2) | 5.3178
38 | Robot: “Start navigating to location 9” | Type 2: “No additional action required.” | 5.3178
39 | Robot: “Completed navigating to location 9” | Type 1 response to start painting at ninth location | 4.0332
40 | Robot: “Painting started at final location” | Type 2: “No additional action required.” | 2.9016
41 | Robot: “Painting in progress at final location” | Type 2: “No additional action required.” | 4.0332
42 | Robot: “Painting completed at final location” | Type 1: “All painting tasks successfully completed.” | 3.2565
43 | Superintendent: “Summarize works completed” | Response from ChatGPT: “The robot ‘Husky_painter’ has successfully completed all the assigned wall painting tasks...” | 5.5387
# | Prompt | Response | Response Time (s)
---|---|---|---
1 | Robot: “Now, I start working. What is my first task?” | Type 1 response to navigate to center coordinates | 4.4922
2 | Robot: “start navigating to room center” | Type 2: “No additional action required.” | 4.8897
3 | Superintendent: “Skip painting locations 1, 2, 3” | Type 3: “the robot is replanning the sequence...” | 6.5843
4 | Superintendent: “Reduce navigation speed to half at location 5, increase after location 7” | Type 3 response: “Navigation speed reduced for location 5...” | 6.1421
5 | Robot: “completed navigating to room center” | Type 1 response to initiate room scan | 2.7408
6 | Robot: “started room scan” | Type 2: “No additional action required.” | 3.5857
7 | Robot: “completed room scan” | Type 1 response to navigate to fourth painting location | 5.1834
8 | Robot: “start navigating to location 4” | Type 2: “No additional action required.” | 5.8110
9 | Robot: “completed navigating to location 4” | Type 1 response to start painting at fourth location | 3.4877
10 | Robot: “painting started” | Type 2: “No additional action required.” | 3.2121
11 | Robot: “painting completed” | Type 1 response to navigate to fifth painting location & reduce navigation speed | 5.3238
12 | Robot: “start navigating to location 5” | Type 2: “No additional action required.” | 4.5958
13 | Robot: “completed navigating to location 5” | Type 1 response to start painting at fifth location | 3.6059
14 | Robot: “painting started” | Type 2: “No additional action required.” | 4.0348
15 | Robot: “painting completed” | Type 1 response to navigate to sixth painting location | 5.7541
16 | Robot: “start navigating to location 6” | Type 2: “No additional action required.” | 4.6873
17 | Robot: “completed navigating to location 6” | Type 1 response to start painting at sixth location | 3.7263
18 | Robot: “painting started” | Type 2: “No additional action required.” | 3.3481
19 | Robot: “painting completed” | Type 1 response to navigate to seventh painting location | 5.3984
20 | Robot: “start navigating to location 7” | Type 2: “No additional action required.” | 5.0772
21 | Robot: “completed navigating to location 7” | Type 1 response to start painting at seventh location | 3.7815
22 | Robot: “painting started” | Type 2: “No additional action required.” | 3.0984
23 | Robot: “painting completed” | Type 1 response to navigate to eighth painting location & increase navigation speed | 6.4008
24 | Robot: “start navigating to location 8” | Type 2: “No additional action required.” | 4.5917
25 | Robot: “completed navigating to location 8” | Type 1 response to start painting at eighth location | 4.4586
26 | Robot: “painting started” | Type 2: “No additional action required.” | 4.5595
27 | Robot: “painting completed” | Type 1 response to navigate to ninth painting location | 5.3106
28 | Robot: “start navigating to location 9” | Type 2: “No additional action required.” | 4.8117
29 | Robot: “completed navigating to location 9” | Type 1 response to start painting at ninth location | 3.5661
30 | Robot: “painting started” | Type 2: “No additional action required.” | 3.8647
31 | Robot: “painting completed” | Type 2: “Robot completed all the work.” | 2.9089
32 | Superintendent: “summarize work and interaction” | Type 3 to summarize works done and interaction with superintendent | 16.5739
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Share and Cite
Kim, K.; Ghimire, P.; Huang, P.-C. Framework for LLM-Enabled Construction Robot Task Planning: Knowledge Base Preparation and Robot–LLM Dialogue for Interior Wall Painting. Robotics 2025, 14, 117. https://doi.org/10.3390/robotics14090117