Embodied Intelligence: Physical Human–Robot Interaction

A special issue of Robotics (ISSN 2218-6581). This special issue belongs to the section "AI in Robotics".

Deadline for manuscript submissions: closed (30 April 2026)

Special Issue Editor


Dr. Mingyu Ding
Guest Editor
1. Department of Computer Science, UNC at Chapel Hill, Chapel Hill, NC, USA
2. Department of Mechanical Engineering, UC Berkeley, Berkeley, CA, USA
Interests: robotics; human–robot interaction; embodied artificial intelligence; computer vision; robot learning

Special Issue Information

Dear Colleagues,

Recent advancements in robotics and human–robot interaction (HRI) have led to significant progress in the realm of embodied AI, in which robots demonstrate physical behaviors and engage in complex interactions with humans and the 3D world. This evolving field holds vast potential in industries such as healthcare, manufacturing, and assistive technologies. However, the physical properties of objects, such as friction, soft bodies, and fluids, can significantly affect the precision and adaptability of robotic grasping and manipulation, especially in unstructured environments. Additionally, many existing physics-based grasping techniques rely heavily on single-forward prediction, in which a predetermined grasping pose is executed without real-time exploration or feedback. To address these limitations, it is critical to explore how robots can move beyond predefined motion tasks and intelligently adapt to dynamic conditions using physical sensors such as tactile sensors.

This Special Issue welcomes the submission of papers that explore cutting-edge approaches to embodied AI within the scope of physical human–robot interaction, intelligent grasping, and exploration and feedback mechanisms. Submissions that integrate robot learning, embodied planning, vision-language understanding, tactile sensing, adaptive control strategies, and iterative feedback loops to enhance the manipulation capabilities of robots are highly encouraged. We also welcome theoretical studies that delve into the intersection of embodied AI, dynamics, cognitive science, and interactive systems, as well as real-world applications in healthcare robotics, industrial automation, and beyond.

Dr. Mingyu Ding
Guest Editor

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to this website. Once registered, go to the submission form. Manuscripts can be submitted until the deadline. All submissions that pass pre-check are peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the Special Issue website. Research articles, review articles, and short communications are invited. For planned papers, a title and short abstract (about 250 words) can be sent to the Editorial Office for assessment.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. Robotics is an international peer-reviewed open access monthly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 1800 CHF (Swiss Francs). Submitted papers should be well formatted and use good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Keywords

  • robotics
  • cognitive robotics
  • robot learning
  • embodied AI
  • tactile sensors and feedback
  • physics-based grasping
  • physical inference
  • human–robot interaction
  • physical interaction
  • vision-language understanding
  • soft object manipulation

Benefits of Publishing in a Special Issue

  • Ease of navigation: Grouping papers by topic helps scholars navigate broad scope journals more efficiently.
  • Greater discoverability: Special Issues support the reach and impact of scientific research. Articles in Special Issues are more discoverable and cited more frequently.
  • Expansion of research network: Special Issues facilitate connections among authors, fostering scientific collaborations.
  • External promotion: Articles in Special Issues are often promoted through the journal's social media, increasing their visibility.
  • Reprint: MDPI Books provides the opportunity to republish successful Special Issues in book format, both online and in print.

Further information on MDPI's Special Issue policies is available on the MDPI website.

Published Papers (4 papers)


Research


20 pages, 4999 KB  
Article
Beyond Visual and Force Feedback: Role of Vibrotactile and Auditory Cues in Robot Teleoperated Assembly
by Kaoru Ohno, Hikaru Nagano and Yasuyoshi Yokokohji
Robotics 2026, 15(2), 39; https://doi.org/10.3390/robotics15020039 - 9 Feb 2026
Abstract
Reliable detection of contact states, such as the “mating” of connectors, is crucial for high-quality teleoperated assembly. Conventional systems relying solely on visual and continuous force feedback often fail to convey these discrete high-frequency transients due to their limited high-frequency rendering capabilities. This study investigates the effectiveness of augmenting visual and force feedback with vibrotactile and auditory cues for detecting connector mating. We conducted three experiments: (1) a mating detection task using recorded multimodal data (N=10), (2) a modality contribution analysis (N=10), and (3) a real-time robot connector insertion task (N=10). Results from the real-time task demonstrated that the proposed multimodal feedback significantly reduced the maximum contact force exerted after mating compared to the baseline visual-force condition (p<0.001), thereby enhancing physical safety. Furthermore, vibrotactile and auditory cues were found to be redundant yet complementary, providing robust cues even when one modality is compromised. Although subjective mental workload increased due to sensory integration, the significant improvement in detection clarity and safety justifies the multimodal approach. We conclude that providing transient vibrotactile and auditory cues is a highly effective strategy for compensating for the limitations of conventional force feedback in teleoperated assembly.
(This article belongs to the Special Issue Embodied Intelligence: Physical Human–Robot Interaction)

28 pages, 4319 KB  
Article
Agentic Workflows for Improving Large Language Model Reasoning in Robotic Object-Centered Planning
by Jesus Moncada-Ramirez, Jose-Luis Matez-Bandera, Javier Gonzalez-Jimenez and Jose-Raul Ruiz-Sarmiento
Robotics 2025, 14(3), 24; https://doi.org/10.3390/robotics14030024 - 24 Feb 2025
Abstract
Large Language Models (LLMs) provide cognitive capabilities that enable robots to interpret and reason about their workspace, especially when paired with semantically rich representations like semantic maps. However, these models are prone to generating inaccurate or invented responses, known as hallucinations, that can produce erratic robotic operation. This can be addressed by employing agentic workflows: structured processes that guide and refine the model’s output to improve response quality. This work formally defines and qualitatively analyzes the impact of three agentic workflows (LLM Ensemble, Self-Reflection, and Multi-Agent Reflection) on enhancing the reasoning capabilities of an LLM guiding a robotic system to perform object-centered planning. In this context, the LLM is provided with a pre-built semantic map of the environment and a query, to which it must respond by determining the most relevant objects for the query. This response can be used in a multitude of downstream tasks. Extensive experiments were carried out employing state-of-the-art LLMs and semantic maps generated from the widely used datasets ScanNet and SceneNN. The results show that agentic workflows significantly enhance object retrieval performance, especially in scenarios requiring complex reasoning, with improvements averaging up to 10% over the baseline.

Review


38 pages, 5906 KB  
Review
Perception and Computation for Speed and Separation Monitoring Architectures
by Odysseus Adamides, Karthik Subramanian, Sarthak Arora and Ferat Sahin
Robotics 2025, 14(4), 41; https://doi.org/10.3390/robotics14040041 - 31 Mar 2025
Abstract
Human–Robot Collaboration (HRC) has been a significant research topic within the Industry 4.0 movement over the past decade. Interest in HRC research has continued with the dawn of Industry 5.0 and its focus on worker experience. Within the study of HRC, the collaboration approach of Speed and Separation Monitoring (SSM) has been implemented through various architectures. These configuration strategies involve different perception-sensing modalities, mounting strategies, data filtration, computational platforms, and calibration methods. This paper explores the evolution of the perception architectures used to perform SSM and highlights innovations in sensing and processing technologies that can open the door to significant advancements in this sector of HRC research.

Other


53 pages, 5533 KB  
Systematic Review
Embodied AI with Foundation Models for Mobile Service Robots: A Systematic Review
by Matthew Lisondra, Beno Benhabib and Goldie Nejat
Robotics 2026, 15(3), 55; https://doi.org/10.3390/robotics15030055 - 4 Mar 2026
Abstract
Rapid advancements in foundation models, including Large Language Models, Vision-Language Models, Multimodal Large Language Models, and Vision-Language-Action models, have opened new avenues for embodied AI in mobile service robotics. By combining foundation models with the principles of embodied AI, where intelligent systems perceive, reason, and act through physical interaction, mobile service robots can achieve more flexible understanding, adaptive behavior, and robust task execution in dynamic real-world environments. Despite this progress, embodied AI for mobile service robots continues to face fundamental challenges related to the translation of natural language instructions into executable robot actions, multimodal perception in human-centered environments, uncertainty estimation for safe decision-making, and computational constraints for real-time onboard deployment. In this paper, we present the first systematic review of foundation models in mobile service robotics, following the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) guidelines. Using an OpenAlex literature search, we considered 7506 papers spanning the years 1968–2025. Our detailed analysis identified four main challenges and examined how recent advances in foundation models have addressed them. We further examine real-world applications in domestic assistance, healthcare, and service automation, highlighting how foundation models enable context-aware, socially responsive, and generalizable robot behaviors. Beyond technical considerations, we discuss the ethical, societal, human-interaction, and physical design and ergonomic implications of deploying foundation-model-enabled service robots in human environments. Finally, we outline future research directions emphasizing reliability and lifelong adaptation, privacy-aware and resource-constrained deployment, and the governance and human-in-the-loop frameworks required for safe, scalable, and trustworthy mobile service robotics.
(This article belongs to the Special Issue Embodied Intelligence: Physical Human–Robot Interaction)
