Article

Navigating the Last Mile: A Stakeholder Analysis of Delivery Robot Teleoperation

1 Department of Information Systems, University of Haifa, Haifa 3498838, Israel
2 Data61, Commonwealth Scientific and Industrial Research Organisation (CSIRO), Dutton Park, QLD 4102, Australia
* Author to whom correspondence should be addressed.
These authors contributed equally to this work.
Sustainability 2025, 17(13), 5925; https://doi.org/10.3390/su17135925
Submission received: 12 May 2025 / Revised: 17 June 2025 / Accepted: 23 June 2025 / Published: 27 June 2025

Abstract

The market share of Last-Mile Delivery Robots (LMDRs) has grown rapidly over the past few years. These robots are mostly autonomous and are supported remotely by human operators. As part of a broader shift toward sustainable urban logistics, LMDRs are seen as a promising low-emission alternative to conventional delivery vehicles. While there is a large body of literature about the technology, little is known about the real-world experiences of operating these robots. This study investigates the operational challenges faced by remote operators (ROs) of LMDRs, aiming to enhance their efficiency and safety. Through interviews with industry professionals, we explore the scenarios requiring human intervention, the strategies employed by ROs, and the unique challenges they encounter. Our findings not only identify key intervention scenarios but also provide a thorough examination of the teleoperation ecosystem and operational workflows, and of how these shape the ways ROs manage their interactions with the robots. We found that ROs’ involvement ranges from monitoring to active intervention, supporting the robots in completing their tasks when they face connectivity issues, blocked routes, and various other interruptions on their journeys. The findings highlight the importance of intuitive user interfaces (UIs) and decision-support systems to reduce cognitive load and improve situational awareness. This research contributes to the literature by offering a detailed examination of real-world teleoperation practices and by focusing on the human factors influencing LMDR scalability, sustainability, and integration into future-ready logistics systems.

1. Introduction

As e-commerce continues its rapid expansion, developing faster, more affordable, and more environmentally sustainable last-mile delivery (LMD) solutions becomes an increasingly important challenge. LMD refers to the final logistical step of the delivery process, that is, the movement of goods over the last stretch to the end customer. Customers now demand quicker deliveries within narrow timeframes, placing greater pressure on last-mile logistics. LMD is often the most expensive portion of the e-commerce process, accounting for anywhere from 28% up to 50% of total delivery costs [1]. This is because of the many challenges in LMD, which include driver shortages, damaged products, failed delivery attempts, and increased traffic congestion. However, unlike other parts of the supply chain, the LMD sector has seen little technological change or disruption in recent years. To address these challenges, autonomous LMDRs offer a promising and environmentally sustainable solution [2,3]. LMDRs are mobile, electric units designed to deliver small goods such as groceries, food, and parcels (see Figure 1). Communication with the end customer is typically handled through a smartphone application that allows customers to place orders, receive notifications, track the delivery in real time, and unlock the robot to retrieve their goods. Over the past decade, the market share of LMDRs has steadily increased [4,5], accelerating significantly during the COVID-19 pandemic, when lockdowns and infection concerns spurred online shopping and a growing demand for contactless delivery [6]. LMDRs present numerous benefits, including financial and environmental advantages, while also enhancing the efficiency of LMD [7]. By autonomously navigating congested areas and taking optimal routes, delivery robots can reduce delivery times and costs [8,9,10]. Additionally, removing delivery vehicles from the road can alleviate congestion, improve traffic flow, reduce road crashes, and lower CO2 emissions and noise levels [8,9,11].
Despite their clear advantages, high-scale deployment of LMDRs still faces many challenges. While these robots possess advanced navigation capabilities, they may still encounter challenges in complex urban environments, including road obstructions, adverse weather conditions, congested intersections, and unpredictable human interactions. For instance, a delivery recipient may be unresponsive or misunderstood by the robot, or the final destination may be inaccessible. These situations necessitate human intervention by an RO who can assess the issue and take appropriate action [10]. While autonomous robot technology continues to advance, there is a growing consensus that certain situations will always necessitate human involvement—whether due to the broader decision-making capabilities of humans or communication, ethical, and legal considerations [12,13,14].
This study examines the teleoperation of LMDRs—the remote control and intervention by human operators—in dynamic and often unpredictable environments. Adopting a human–machine interaction (HMI) perspective, we aim to understand the broader ecosystem in which ROs and LMDRs function, focusing on the challenges and varying perspectives surrounding teleoperation. Through in-depth interviews with diverse stakeholders, we investigate real-world scenarios that necessitate human intervention, analyzing the specific roles and responsibilities of ROs and the difficulties they encounter. By identifying key operational challenges and requirements, this research seeks to provide insights into the tools, training, and system improvements needed to enhance the efficiency, safety, and overall sustainability of LMDR teleoperation.

1.1. Last-Mile Delivery Robots

Autonomous LMDRs, like other autonomous vehicles (AVs), can drive on their own and autonomously perceive, recognize, and respond to different objects on and off the roadway [7,10,15]. For efficient commercial-scale operation, however, human supervision and assistance are still needed, and these are provided remotely by ROs.
LMDRs are equipped with various sensors that enable both their autonomous mobility and remote operation. These sensors can include cameras, LIDAR, sonar, and radar for sensing objects in the environment, as well as inertial measurement units (IMUs) and global positioning system (GPS) receivers for navigation [7,10]. Some robots also have microphones and speakers that allow them, or their ROs, to communicate with humans [10].
LMDRs can vary in size, weight, cargo capacity, travel speed, and operational area [16,17,18]. The operational area, which determines travel speed [19,20], is divided into three main categories: pedestrian spaces (sidewalks and crosswalks), bike lanes, and roads. The travel speed of robots is adjusted to match the common speed in these areas, with slower speeds on sidewalks and bike lanes compared to roads [17,18]. Most robots fit into one discrete category, with few exceptions [7].
One of the key distinctions in the deployment of robots is based on their operational or dispatching modes. These modes determine how and from where robots are dispatched to perform their tasks. In the “many-to-many” dispatching mode, robots travel directly from various vendors to numerous customers. This approach is commonly seen in the food delivery industry, where robots might pick up orders from restaurants or grocery shops and deliver them directly to consumers. The “one-to-many” dispatching mode involves robots being dispatched from a single point, such as a logistic hub [19] or a “mothership” vehicle (a vehicle that holds several ground robots to make nearby deliveries [8]), to multiple customer destinations. This method is often employed in logistics and supply chain management, where goods are transported from central warehouses to end-users (e.g., Amazon).
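To make this distinction concrete, the minimal sketch below models a delivery task under each dispatching mode; it is an illustration only, and all names (classes, locations) are hypothetical rather than drawn from any deployed system.

```python
from dataclasses import dataclass
from enum import Enum, auto


class DispatchMode(Enum):
    """How robots are dispatched to perform delivery tasks."""
    MANY_TO_MANY = auto()  # robots travel from various vendors directly to many customers
    ONE_TO_MANY = auto()   # robots depart from a single hub or "mothership" vehicle


@dataclass
class DeliveryTask:
    origin: str       # vendor, hub, or mothership location
    destination: str  # end-customer address
    mode: DispatchMode


# Many-to-many: each task starts at a different vendor (e.g., food delivery).
food_order = DeliveryTask("Main St. Pizzeria", "12 Oak Ave", DispatchMode.MANY_TO_MANY)

# One-to-many: all tasks start at the same logistics hub (e.g., parcel delivery).
parcel = DeliveryTask("Central Hub #3", "45 Elm Rd", DispatchMode.ONE_TO_MANY)
```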

1.2. Remote Operation

1.2.1. Teleoperation of Autonomous Agents (Robots and AVs)

Autonomous and semi-autonomous systems represent higher levels of the automation spectrum, which ranges from fully manual operation to complete autonomy. According to SAE’s taxonomy of driving automation systems, at the higher end (Level 5) the robotic agents are fully automated, while at the lower end (Level 0) agents rely entirely on human operators for decision-making and action; several intermediate levels of automation lie in between [21,22]. A complementary approach was proposed by Bogdoll et al. [23], who suggested a mapping between the aforementioned levels of automation and remote human input systems (RHISs), primarily focusing on remote driving and remote assistance (see Section 3). Although higher levels of automation offer notable advantages such as cost savings, enhanced efficiency, and improved safety, these systems have inherent weaknesses and flaws that may result in substantial safety hazards. One way to mitigate these risks is through teleoperation.
Traditionally, teleoperation involves an RO controlling a robot remotely, often in real time. It utilizes various communication technologies to transmit control signals and sensory data between the RO and the remote system [24,25,26]. This approach combines human decision-making capabilities with robotic execution. Sheridan (2016) defined teleoperation as “the extension of a person’s sensing and manipulation capability to a remote location” [27]. Teleoperation has become increasingly important in various fields, including space exploration [28,29], disaster response [30,31,32], agriculture [33,34], construction [12], mining [35], medical practices [36,37], logistics [38], road transport [39,40] and more. Early teleoperation systems were often characterized by basic mechanical interfaces and minimal sensory feedback. Modern systems employ cutting-edge sensors, state-of-the-art communication technology, and improved human–machine interfaces [41,42,43].
As robots become more prevalent in our daily lives, the need for efficient teleoperation systems continues to grow. The efficiency of teleoperation systems depends on the RO’s ability to understand the robot’s environment, make informed decisions, and issue appropriate commands. Research in human–robot interaction focuses on optimizing situational awareness and minimizing cognitive load [44,45].
The efficiency and safety of robotic teleoperation are heavily influenced by the cognitive aspects of the task and by the perceptual and cognitive challenges faced by the RO. These challenges include latency issues, spatial orientation, object identification, a limited field of view, inaccurate size and distance estimation, degraded motion perception, and RO workload, among others [40,46,47,48]. To address these challenges, various mitigation strategies have been proposed, including the use of multiple cameras and auditory alerts, among other techniques [46,49].
Research in robotic teleoperation has explored a diverse range of UIs and interaction techniques. Rea and Seo [50] discuss the importance of intuitive UIs that provide ROs with comprehensive situational awareness, highlighting that the design of interfaces that enable effective communication between ROs and robots is a critical aspect of teleoperation. Despite the potential benefits of novel interface paradigms, the core challenges in robotic teleoperation, particularly for semi-autonomous systems, appear to be more fundamentally linked to human perception issues. Additionally, key considerations include the allocation of tasks between human and robot, the reliability of autonomous agents, and the development of trust in automation [51]. These factors underscore the complexity of designing effective teleoperation systems that balance technological innovation with human cognitive capabilities and operational requirements.

1.2.2. Teleoperation and the LMDR Ecosystem

Despite the significant advancement of LMDRs’ autonomous capabilities, the complexity of dynamic urban environments necessitates occasional human monitoring and teleoperation [52]. Blocked roads, bad weather conditions, a congested intersection, or an interaction with a human may all require the intervention of a human RO. When an LMDR encounters a problem it cannot resolve in a particular situation, an RO can be called to assess the situation and guide the robot until the problem is resolved. Thus, in a future in which LMDRs become ubiquitous, a scalable and affordable teleoperation solution is essential to support this growth.
LMDRs’ operation requires establishing specific regulations, as they operate mainly on sidewalks and face challenges regarding infrastructure, externalities such as CO2 emissions, and shorter delivery-time requirements [53]. Many jurisdictions have enacted laws governing where and how LMDRs are allowed to operate [18,53,54,55], with varying levels of regulation regarding speed limits, weight restrictions, and operational zones. Attempts have been made to create regulatory frameworks for LMDRs, analyzing political, social, and sustainability perspectives [10]. Yet there is great regulatory variation between countries and even between cities, both in general LMDR regulations and in specific teleoperation guidelines. For instance, the U.S. has a diverse and sometimes contradictory legal landscape regarding robot traffic laws, with individual cities and municipalities creating their own regulations. Regulatory restrictions are heavily influenced by public acceptance [51,53]. As of May 2023, 22 states in the US had passed laws allowing the operation of LMDRs under certain conditions [17]. LMDRs are also allowed in some European countries. Estonia, for example, was one of the first countries to allow them and to enact laws regulating how LMDRs share sidewalks with pedestrians [10].
LMD involves three stakeholders, namely the seller, an intermediary, and the client [10]. Today, most intermediaries are well-established delivery companies, including traditional logistics service providers such as DHL and UPS, as well as a range of new startups focusing on the development of delivery robots (e.g., Coco, Starship, Serve Robotics). LMDRs have also led to the creation of a new industry segment: companies specializing in fleet management platforms and robotic teleoperation services. These firms (e.g., DriveU, Phantom Auto, Ottopia) offer software solutions to logistics service providers (e.g., Magna, Serve Robotics) who handle deliveries for sellers. The sellers supply products to end customers and, in some cases, also take on the role of logistics providers themselves. This ecosystem reflects the complex interplay between technology providers, logistics companies, sellers, and clients in the evolving landscape of automated LMD.
While several studies have examined the deployment and performance of LMDRs, focusing on the role and operation of these stakeholders [7,56,57,58,59], no study that we are aware of has focused on the teleoperation of LMDRs or how to design or implement an LMDR teleoperation solution.

1.3. Research Objectives

While research on autonomous urban agents, such as AVs and LMDRs, has been steadily increasing, the real-world challenges involved in the teleoperation of LMDRs remain an underexplored area. Despite the crucial role ROs play in ensuring the efficiency, safety, and adaptability of these systems, limited studies have examined their work practices, challenges, and requirements. This study aims to bridge this gap by investigating the working environment, responsibilities, and experiences of LMDR ROs. By mapping their workflows and operational challenges, we seek to uncover unresolved issues and unmet needs in the field. These insights will contribute to the development of teleoperation tools that better support ROs, enhance decision-making processes, and inform the design of more intuitive and effective interfaces.
The remainder of this paper is structured as follows: Section 2 details the research methodology, which involved in-depth interviews with key stakeholders from the LMDR teleoperation pipeline. Section 3 presents our findings, including the different teleoperation modes, the ecosystem of remote operation centers (ROCs), and key intervention scenarios. In Section 4, we discuss the broader implications of our findings for system design and policy, as well as directions for future research. Finally, Section 5 provides concluding remarks.

2. Methodology

2.1. Participants

We employed purposive sampling in our recruitment strategy. This ensures a diverse range of perspectives by specifically including informants with different viewpoints on the subject [60]. We targeted industry professionals within the LMD and remote operation sectors, representing various stakeholders with relevant personal experience. Participants, drawn from five countries, represent organizations at varying stages of technological development and fleet sizes. While a minority of participants represented very early-stage startups, a considerable proportion came from startups at the Series A and B funding stages (Series C is typically associated with greater product maturity and operational scale). Series A, B, and C are funding rounds that generally follow the initial (“seed”) investment; the more funding rounds a business has completed, the more mature and established it typically is and the closer it is to going public. Series B typically takes place after a business has made significant progress and demonstrated its ability to generate revenue [61]. Participants were identified and contacted through existing collaborations and social media platforms such as LinkedIn. We also emailed companies in the field, requesting an interview with one of their staff members. Table 1 provides details of the participants, including the company they work for and their role.

2.2. Interview Protocol

Data was collected through semi-structured interviews, which provide depth and flexibility when dealing with complex issues requiring an understanding of participants’ perspectives and insights. This approach is especially appropriate given the diverse backgrounds of the participants. The interview protocol consisted of a set of open-ended questions, with room for follow-up inquiries based on participants’ responses. The overall framework focused on the main challenges that ROs of LMDRs encounter and the use cases in which they are needed to operate. A standardized introduction script was used, but specific questions were adapted to the participants’ unique backgrounds and experiences. Questions focused on the practices of ROs and the challenges they face as part of their work.
Interviews were primarily held online, lasted between 45 and 60 min, and were recorded with the participants’ explicit consent. These recordings were then transcribed for further analysis. In accordance with the ethical clearance, participants consented to participate and were able to stop the recording at any point they wished.

2.3. Data Coding and Thematic Analysis

Data analysis began with the development of a codebook by the first two authors [64]. Consistency between coders was verified through two rounds of independent coding, involving comparisons and discussions to resolve discrepancies and gaps in their interpretations. Once consistency was verified, we generated a codebook that guided the rest of the analysis. The remaining transcripts were divided between the two coders for independent coding. The coding process was performed manually (i.e., without the use of a designated coding software). Regular meetings were held during the coding period to discuss and resolve any questions and ambiguities, and to update the codebook as needed. Next, we searched for recurring themes and clustered the data for further analysis [65].
We observed that no substantially new codes or themes emerged in the later stages of coding, and existing themes were reinforced by multiple participants. This suggests that code saturation was achieved, supporting the adequacy and completeness of our sample for the exploratory goals of this study [66].

3. Findings

The diverse backgrounds of our participants provide a comprehensive overview of various aspects of the work practices of ROs. By investigating the situations that require ROs’ intervention and their working environment, we can better understand the challenges they face. This section begins by describing the different modes of LMDR teleoperation. Next, we explain how teleoperation is utilized in various intervention scenarios, and then describe the remote operation working environment and ecosystem and how it supports ROs in their tasks.

3.1. Teleoperation Modes

The interview data highlights several operational modes implemented to ensure the efficient and safe operation of the robots. Three of these modes are conducted remotely: monitoring, tele-assistance, and tele-driving. Additionally, on-site support is provided by field teams. Each mode represents a distinct operational method tailored to the specific situation of the robots. Figure 2 summarizes the suggested role taxonomy within the ROC. The following sections provide more detail on these modes, offering a comprehensive overview of current practices, challenges, and future directions in LMDR operations.
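As a reading aid, the sketch below encodes the four modes as a simple taxonomy ordered by increasing operator involvement. It is an interpretation of the interview data, not a participant’s implementation; the type and function names are our own assumptions.

```python
from enum import IntEnum


class OperationMode(IntEnum):
    """Operational modes identified in the interviews, ordered by operator involvement."""
    MONITORING = 0       # passive observation of autonomously operating robots
    TELE_ASSISTANCE = 1  # high-level commands (e.g., approve "go", mark a destination or path)
    TELE_DRIVING = 2     # direct remote driving of the robot
    FIELD_TEAM = 3       # on-site support when remote options are exhausted


def next_escalation(mode: OperationMode) -> OperationMode:
    """Return the next, more involved mode; field support is the last resort."""
    return OperationMode(min(mode + 1, OperationMode.FIELD_TEAM))
```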

3.1.1. Remote Monitoring

Monitoring involves the passive observation of robots as they operate autonomously [67]. This is the default mode in many companies with autonomous LMDR fleets to ensure smooth and safe operation and to identify any issues that require intervention and active support from the ROC, as described in the next sections. ROs typically monitor multiple robots by observing video streams and data metrics such as speed, location, battery status and so forth [68,69], as explained by one of the interviewees: “ …a supervisor position where one person… can look at multiple robots at once… and their job is looking at these robots and seeing if something’s going wrong.” (P4). P13 adds that “I can see their statuses at a glance… Green is online… I can actually see in the overview all kinds of parameters, activation time, all kinds of things it gets”.
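The fleet-overview information described by P4 and P13 (status at a glance, speed, location, battery) implies a per-robot status record along the following lines; this is a minimal sketch with illustrative field names, not a reproduction of any company’s platform.

```python
from dataclasses import dataclass


@dataclass
class RobotStatus:
    """Illustrative at-a-glance status record for one robot in a monitored fleet."""
    robot_id: str
    online: bool                   # "Green is online"
    speed_kmh: float
    location: tuple[float, float]  # (latitude, longitude)
    battery_pct: float
    needs_assistance: bool


def flagged_robots(fleet: list[RobotStatus]) -> list[RobotStatus]:
    """Robots a supervisor should look at first: offline, low on battery, or asking for help."""
    return [r for r in fleet if (not r.online) or r.battery_pct < 15 or r.needs_assistance]
```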

3.1.2. Remote Intervention: Tele-Assistance

Tele-assistance employs high-level commands (e.g., “Stop”, “Go”) to remotely operate the robot without directly driving it [49,70,71]. Based on the interview inputs, we identified several ways in which tele-assistance is utilized. The simplest form involves only a ‘go’ command: when a robot encounters a situation that requires intervention, it stops and waits for approval to continue its path autonomously. Another approach to tele-assistance involves marking a destination or path on the screen, typically on a map [72]. The destination-marking option is described as “a point-and-click system where the operator points on a map” (P4). Alternatively, “the operator can draw exactly the path he wants the robot to travel on”, as noted by P7. Informed decisions are made based on relevant information such as camera streams and the locations of vehicles and pedestrians, as indicated by P1: “operators need to have access to camera streams and the locations of vehicles and pedestrians to make informed decisions”.
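The three tele-assistance styles described by participants (a bare ‘go’ approval, point-and-click destination marking, and drawing a path) could be represented as high-level command messages along the lines sketched below; the message names and fields are illustrative assumptions rather than an existing protocol.

```python
from dataclasses import dataclass

Waypoint = tuple[float, float]  # (latitude, longitude)


@dataclass
class GoApproval:
    """The robot has stopped; the operator simply approves continuing autonomously."""
    robot_id: str


@dataclass
class MarkDestination:
    """Point-and-click on a map: the robot plans its own path to the marked point."""
    robot_id: str
    destination: Waypoint


@dataclass
class DrawPath:
    """The operator draws the exact path the robot should follow."""
    robot_id: str
    waypoints: list[Waypoint]


# Union of the high-level commands a tele-assistance UI might emit.
TeleAssistCommand = GoApproval | MarkDestination | DrawPath
```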
Tele-assistance represents a middle ground between fully autonomous robots and remotely driven robots. Companies use this mode when possible as a more efficient solution compared to fully remote-driven robots, which incur higher operational costs: “…at the stage of commercial deployment, we try to put as few sensors as possible and be as much cost-effective as possible, and [companies] won’t add sensors just for the teleoperation [tele-driving], just as they won’t add more crazy communication relay stations just so that the tele-operator can open a camera at low latency. … in extreme cases, there will be a tele-operation [tele-driving].” (P11). Reducing operational costs is also achieved by minimizing human operator involvement through allowing ROs to manage multiple LMDRs during a single teleoperation session [40,51], as P9 commented: “[We try] to reach a situation where the operator hardly does much beyond making high decisions, so that he can multitask, so that one operator… can simultaneously control 10, 20, 30, 50, 100 tasks at the same time. We are [trying] to lower the human overload”. By minimizing direct control, this approach can reduce operating time, reduce cognitive load, simplify control mechanisms, and improve safety compared to remotely driving the robot.

3.1.3. Remote Intervention: Tele-Driving

When tele-assistance fails to resolve the robot’s problem, tele-driving is required. In tele-driving, human ROs take direct control over the robot (i.e., drive it) until the robot can proceed independently. By leveraging tele-driving, LMDRs can benefit from human judgment and adaptability while maintaining the cost-efficiency and automation benefits of robotic systems, compared to fully manually operated robots. The interview data confirm that tele-drivers are often susceptible to detachment from the robots’ surroundings due to the lack of sensory input, which results in impaired situational awareness and significant challenges in terms of ROs’ cognitive load, among other things [40]. One interviewee explained, “…they have to sit in front of a screen to drive. …when you are in front of a computer screen … it requires much higher concentration… It’s not just vision, it’s also the hearing and body sensations and other parameter inputs” (P3).
One way to enhance ROs’ situational awareness and reduce cognitive load is by providing sufficient contextual information. This includes understanding what has transpired up to the point of the ROs’ intervention and identifying key factors relevant to current and future driving conditions, such as permitted speed limits, pedestrians’ presence, and road conditions [73]. As one expert explained, “…just because I’ve gained control of the device doesn’t mean I understand the situation” (P8). Another expert also claimed, “[the person who is] sitting in front of a screen, driving a robot and then switching to another robot, and they need to understand the situation… There is no continuity from the previous session, it’s something else. …it requires a lot of cognitive concentration … to actually focus on the task and do it well.” (P3).

3.1.4. Field Team

Delivery robot companies typically maintain a small number of field engineers who are on standby, ready to intervene when teleoperation is insufficient. This can occur when a robot tips over, its battery is depleted, or a connection cannot be established for other reasons. In such instances, the RO will send a field team to perform a local repair or bring the robot to the nearest support center [67]. The field team is equipped to handle various situations, such as reorienting robots, charging depleted batteries, and clearing physical obstacles. P14 shared that “…at the end of the day we would go around and collect them [the robots] back, bring them to the garage and in the garage, we would clean and essentially return to service all of the active robots.”
Nevertheless, it was indicated that ROs sometimes resort to this option too quickly. P4 described an over-reliance on sending a field team at the expense of troubleshooting via teleoperation: “There’s definitely a lot of operators who … call for on-site technician … too quickly. So instead of doing the full script and making sure everything works, they might get frustrated. They might think that the troubleshooting takes too long time, so they rush to a resolution, or they rush to escalating the issue too fast.”

3.1.5. Response Procedures

Whether the intervention takes the form of tele-assistance, tele-driving, or dispatching a field team, the question arises of how decisions on intervention and escalation are made. While our data confirm that ROs exercise their own judgment, several interviewees mentioned that the response can also be determined by pre-defined scripts that serve as a type of checklist and guide the ROs through the steps they need to take to solve an issue. The interviewees exemplified how these scripts are used to troubleshoot problems. P4 explained that “…[the ROs] basically follow a script. They look at the logs, they look at the symbols and they follow a script [of action]. And the script might be as easy as trying to reset the camera, trying to restart the computer on board, trying to check the battery…”. Nevertheless, as noted earlier, ROs may take shortcuts. Responses to these events depend on the type of incident, its urgency, and environmental conditions, as well as on the RO’s experience and actions.
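The scripts participants describe behave essentially as ordered checklists with escalation as the fallback. The sketch below captures only that logic; the step names in the commented usage are illustrative, loosely following P4’s examples, and the helper functions are hypothetical.

```python
def run_script(robot, steps, escalate):
    """Work through a troubleshooting checklist; escalate only if every step fails.

    `steps` is an ordered list of (name, action) pairs, where each action returns
    True if it resolved the issue. `escalate` dispatches a field team (last resort).
    """
    for name, action in steps:
        if action(robot):
            return f"resolved by: {name}"
    return escalate(robot)


# Illustrative usage (helper functions are hypothetical):
# steps = [
#     ("reset camera", reset_camera),
#     ("restart onboard computer", restart_computer),
#     ("check battery", check_battery),
# ]
# run_script(robot, steps, dispatch_field_team)
```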

3.2. Remote Operation Centers

Several interviewees (P8, P11, P13, P14) underscored the importance of remote operation centers (ROCs) in supporting and facilitating the teleoperation of robotic fleets [70,74]. These centers function as command-and-control hubs from which ROs oversee and manage the robots. Figure 3 depicts how such a ROC operates. The interviewees indicated that ROCs can be operated and structured in different ways, including variations in workflows, organizational affiliations, and physical locations. This section examines these aspects of ROCs and their connection to remote operation work.

3.2.1. ROC Structure

Companies employ different teleoperation structural models that define their work structure, robot-to-operator ratios, and decision-making hierarchies. These models can be broadly classified as hierarchical-vertical or horizontal-flat.
In a hierarchical model, there is a clear chain of command with multiple levels of roles involved in oversight and direct control. This structure emphasizes role specialization and responsibility, allowing for distributed attention and potentially more consistent adherence to protocols. P5 described the hierarchical structure: “In the control center there is a function of an observer, who is someone who only watches [robots] and cannot perform actions, and there is an operator who is actually above him. …there is actually an observer, an operator and a shift manager, each of whom has a role…”. In such a system, each RO has distinct tasks and responsibilities, with roles ordered by discretion levels. For example, those responsible for monitoring do not drive the robots or provide them with assistance. This model is well-suited to larger organizations with higher robot volumes, where separating tasks can improve robot handling, reduce operators’ cognitive load, and provide oversight for risk mitigation. However, this approach may also introduce slower decision-making, as incidents must escalate through multiple layers before action is taken.
On the other hand, in a horizontal model, one person handles both the monitoring and active operational tasks, simplifying the organizational structure. As P13 described: “There’s always a control center… and… there can be one person who controls seven [robots]… He constantly receives events. He knows: ‘now I received an event from here, I go and take care of this robot’”. In this flat structure, the RO receives notifications directly from the robots and acts without intermediary layers. While this model enables faster response times and greater autonomy, it places more demands on the operator’s situational awareness and multitasking abilities. It may also lead to operator fatigue in high-volume scenarios, especially when managing several simultaneous interventions. That said, the horizontal model offers cost-efficiency and agility, which can be advantageous for smaller deployments.
These structural models are not only organizational preferences. They reflect broader trade-offs in scalability, training requirements, autonomy levels, and response speed. Some hybrid variations may also exist, where companies adopt elements from both models depending on the time of day, operational load, or robot type. It is important to note that although the literature, in other operational domains, examines the relative efficiency of these two models [75,76], our qualitative data and the scope of this study do not allow us to make such an assessment.

3.2.2. Robot-to-Operator Ratio

The ratio of operators to robots in ROCs varies. Several interviewees (P3, P4, P5, P10, P13) mentioned a one-to-many relationship between the monitoring person and the robots, though the specific ratio is not well established. One participant noted, “My assessment today is that what is customary in the market is 1 to 3, something like that. In the more advanced companies … for example, the Starship company … they are already at the level of 1 to 7 or something like that…” (P13). However, one interviewee reported that, until recently, his company operated with a one-to-one remote operation ratio (P14). The ratio seems to depend on the autonomy level of the robot, the company size, and the structural model.
In a hierarchical model, the monitoring capacity is significantly higher and can reach more robots per supervisor, given that the monitoring is conducted by a dedicated stakeholder who focuses solely on this task. Either way, it is clear that one of the goals of the LMDR industry is to increase the number of robots per operator, which requires significant advancements in robotic autonomy. In addition, the use of tele-assistance can also potentially increase the number of robots per RO.

3.2.3. Allocating Teleoperation Calls

Several interviewees (P3, P7, P8, P13) shared that the monitoring person manages a queue of remote operation calls or notifications. “The queue of calls is managed on some screen that we call the ‘administrator’ screen … and he can decide which call goes to which tele-operator. Alternatively, the calls might be allocated automatically…” (P7). P13 added that each robot has a clear indication of its current status, for example whether it is connected or needs assistance. Such an indicator can help the monitoring and assisting professionals understand the robot’s state at any time. Incoming calls are allocated to ROs either by a human or by an automatic matching algorithm that considers both the robots’ and the ROs’ characteristics. For instance, P13 explicitly mentioned that he “…want[s] to be able to monitor and manage (e.g., perform software updates) [multiple] robots”.
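The allocation workflow described here (a queue on an “administrator” screen, assigned manually or automatically) can be sketched as a simple matching routine based on call urgency and operator load. The criteria below are illustrative assumptions, not a documented algorithm from any participant’s system.

```python
import heapq
from dataclasses import dataclass, field


@dataclass(order=True)
class AssistanceCall:
    priority: int                       # lower value = more urgent
    robot_id: str = field(compare=False)
    issue: str = field(compare=False)


def assign_next_call(queue: list[AssistanceCall], operators: dict[str, int],
                     max_load: int = 5) -> tuple[str, AssistanceCall] | None:
    """Pop the most urgent call and hand it to the least-loaded available operator."""
    if not queue:
        return None
    call = heapq.heappop(queue)                  # most urgent call first
    candidates = [(load, op) for op, load in operators.items() if load < max_load]
    if not candidates:
        heapq.heappush(queue, call)              # nobody free; keep the call queued
        return None
    _, operator = min(candidates)
    operators[operator] += 1                     # operator now handles one more robot
    return operator, call
```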

3.2.4. ROC Location

Various models exist for ROC locations, differing mainly in the proximity of the ROC to the site where the fleet operates. Some companies have established ROCs in countries other than those in which the robots operate. Such an approach can offer cost benefits but may exacerbate latency issues because the robotic fleet might be geographically far from the ROC. A distributed model, in which ROs of the same ROC are located in various geographic locations, is also possible. For instance, one company had ROs located throughout the United States. As stated by P14, such an approach primarily accommodated different time zones: “…they sat all over the USA mainly due to time zone adaptation…”.
However, the distance between the ROC and the teleoperated robots may impact the latency when transmitting data between the two [77]. For instance, P11 shared that “…the further you are away, the latency increases. …there is a very, very important issue of [geographic] proximity here. This latency is built from network loads and distances…”. Thus, the physical distance between the operational site and the ROC impacts the efficiency and effectiveness of remote operations.
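As a back-of-the-envelope illustration of why geographic proximity matters, the one-way propagation delay over fiber can be approximated as distance divided by roughly two-thirds of the speed of light; this deliberately ignores the network load, routing, and processing that P11 notes also contribute, and it is a lower bound rather than a measured value.

```python
SPEED_OF_LIGHT_KM_S = 300_000  # approximate speed of light in vacuum
FIBER_FACTOR = 2 / 3           # signals in optical fiber travel at roughly 2/3 c


def min_round_trip_ms(distance_km: float) -> float:
    """Lower bound on round-trip propagation delay; real latency is higher."""
    one_way_s = distance_km / (SPEED_OF_LIGHT_KM_S * FIBER_FACTOR)
    return 2 * one_way_s * 1000


print(min_round_trip_ms(50))      # nearby ROC: ~0.5 ms of unavoidable delay
print(min_round_trip_ms(10_000))  # intercontinental ROC: ~100 ms before any network load
```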

3.2.5. ROs’ UI

Effective teleoperation, and the achievable operator-to-robot ratio, depend significantly on intuitive UIs that reduce ROs’ cognitive load and increase their situational awareness. ROs often need to handle large amounts of data, which can create a high cognitive load and consequently affect their performance. P4 mentioned, “With a good UI, a good driver can drive for 12 h and have a great time. However, if you have a poor UI with buttons that are hard to get to, things don’t work, and the system itself has a lot of latency, after 4 h, you’re just completely done for the day … you’re so mentally exhausted…”. The ROs’ station generally includes screen(s) and controllers (see Figure 2). The screens display high-definition video streams from cameras mounted on the robot (front, back, right, and left), transmitted to the ROs’ screen in real time via a cellular connection at a high frame rate (20–30 FPS).
Camera views from the robots are typically shown on the main screen of the UI. Several participants commented that the camera views and angles should enable selection and manipulation by the RO. Additionally, many participants commented that an internal trunk camera is needed to see the goods being delivered. The ROC should also be equipped with a regional map or a 3D LIDAR-based map, enabling the RO to see the robot within its environment based on a GPS signal, including nearby vehicles, humans, other robots, and static infrastructure. P8 explained: “When you are sitting somewhere else, you have no idea where you are [inside the remote environment]…”.
The controllers are used to guide the robot and communicate with its environment. These input devices might include a steering wheel, pedals, a joystick, or a keyboard and mouse (P7, P9, P10). It is important to mention that the input devices are related to the teleoperation modes described in detail in Section 3.1. Effective tele-driving systems may utilize familiar control schemes similar to gaming controllers. For instance, P4 mentioned that “In some cases, he might want to drive it… think of it like an Xbox…”, and P14 said they “…used an off-the-shelf Xbox joystick…”. In other companies, a keyboard and mouse can be used: “In many cases, it’s a keyboard with a mouse. I mean, it’s not even something very specific. You control driving with the arrows…” (P8).
The RO should also have access to the robot’s parts and sensors. For instance, P13 mentioned that the RO should be able to “…open the door [=cover]…” of the robot. An audio channel must also exist to allow interaction with humans around the robot (e.g., when entering a gated community); thus, each ROC is also equipped with headphones and a microphone. Finally, many interviewees (P3, P4, P7, P8, P9, P10, P11, P13, P14, P15) described various on-screen components essential for successful teleoperation of robots, which can complement a real-time video feed. Table 2 summarizes the major components of the RO’s screen and their purpose.

3.3. Key Intervention Scenarios

This section highlights situations in which human remote intervention is required in the operation of robots. Key use cases include connectivity issues, blocked routes, road and environmental conditions (e.g., terrain irregularities, weather conditions), robot malfunctions, and interactions with people (e.g., bystanders, drivers). Figure 4 presents the Key Intervention Scenarios that require teleoperation, along with the sub-cases identified in our study.

3.3.1. Interaction with People

Consistent with a large body of literature [56,58,59], our interviewees identified interaction with road users as one of the main challenges that robots face.
Many interviewees (P4, P5, P10, P11, P12, P13, P14) highlighted that robots often draw attention and spark interest among bystanders. Their presence in the public sphere leads to a range of communication and interaction issues, both favorable and unfavorable, with varying levels of risk to the robots, human road users, and the delivery tasks. Interaction scenarios can be analyzed along various dimensions, including sentiment (positive vs. negative), proximity (close physical contact vs. solely verbal interaction), identity of those at risk (robots vs. humans), participants (e.g., pedestrians, drivers), causes of the interaction (inadvertent or intentional), and impact level of the outcome. Below, we detail a few examples of such interactions.
Negative Response: Intentional Interaction
Multiple interviewees (P4, P10, P11, P12, P13, P14) raised the issue of negative and anti-social behavior towards robots. Typically, this type of conduct involves obstructing the robots’ path, verbally or physically assaulting the robots (e.g., vandalism), attempting to steal deliveries, or a mix of these actions. Blocking the way is a situation where individuals or groups, often younger people, encircle the robot to stop it from moving forward. It is mostly done as a prank, where people amuse themselves by challenging the robots, such as by placing obstacles in their paths, and observing how they respond: “those who want to put a safety cone in the way to see if it stops” (P10).
Confrontations with robots, either verbal or physical, also occur. Several interviewees (P10, P11, P12, P13, P14) identified the issue of people intentionally blocking the path of the robots. P10 shared that “…some kids want to mess with a robot or push it over.” Another participant described that “…there is an ongoing issue of people who deliberately mess with these autonomous robots, put a bucket on their head so they can’t see, and throw a piece of paper on them, let’s call it mild vandalism. It [happens] a lot relatively. It’s not going to disappear… Everyone is very satisfied that they managed to disable an autonomous robot, it is quite simple to disable them today.” (P11). Moreover, it appears that some parts are more susceptible to vandalism than others: “…but the turnstile and the flag are one of the things that the homeless like to break the most and it is also one of the problems… especially in downtown LA where many homeless people are…”.
Intentional Interaction: Positive Intention
Despite occasional harassment, participants generally agreed that robots are usually well received by the public. Respondents (P5, P10, P11, P14) noted that people, especially kids, show enthusiasm when they see robots, greeting them or trying to interact with them in a friendly manner, including taking pictures with them. Assisting robots in trouble is another example of a friendly interaction. If robots encounter an obstacle or topple over, some passersby may help them get back on track: “Kids love the robots… [they] make it special and magical for kids… a lot of people, because they really like it, they put them back up.” (P10).
However, even well-meant interactions can obstruct the robot and impede its movement: “It’s not vandalism, but they are just really curious. …let’s differentiate between those who want to put a safety cone to see if it stops it [the robot], …and those who just look at [it] so much that sometimes they crash with it because they didn’t notice they were on the road, I mean a kind of natural curiosity. …it could also be something that obstructs it, [the] gatherings of people [around it].” (P11).
Robot as an Obstacle/Impact on Human
The interaction between robots and road users impacts not only the robots but also the road users themselves. Participants noted that road users sometimes may not understand the robots’ direction or intention, leading to potential collisions or blocked pathways. A few participants suggested that robots could disrupt regular traffic and occasionally cause interruptions by accidentally blocking the way of other road users. According to interviewees, much of the interaction between pedestrians and drivers traditionally relies on subtle body language, such as nods, facial expressions, eye contact, or hand gestures. These subtle physical cues are social norms that humans, often unconsciously, send or perceive to indicate their own intentions and expectations and to interpret the intentions of others. “You are educating your children to try to make eye contact with the driver when crossing a road and see that they [the driver] really saw them. …But with autonomy obviously… there is no communication…” (P12).
Miscommunication can also occur with drivers. If a driver is unable to predict a robot’s movements, they might hesitate or make incorrect decisions, thereby increasing the risk of accidents. Poor communication between a robot and a driver at pedestrian crossings, for example, could lead to incidents, as explained by P12: “…[when] a robot tries to cross the road it asks [the driver’s] permission by [signaling via] all kinds of lights… But by the time the robot starts driving, a few seconds pass, and it creates some kind of ambiguity for the drivers. I go, it goes, I go, it goes… many times there is a collision.” That is, the car and the robot start moving at the same moment. Just as with pedestrians, the misunderstanding is attributed to the missing eye contact between the human driver and the robot: “Every driver knows that after a very short time [as a licensed driver], you don’t pay attention to the rules and where the cars are, but to the intention of the driver opposite [to] you.” (P12). The interviewee explains that drivers are socialized to adjust their behavior based on a mutual understanding achieved through eye contact. Without this type of communication, misunderstandings inevitably occur.
Many of the challenges associated with these encounters are often attributed to the novelty of robots in public spaces. It is anticipated that as robots become more prevalent through increased commercial deployment and greater exposure, at least some of these issues (e.g., curiosity-related) will naturally diminish, as pointed out by P11: “I think this [the curiosity] is mainly relevant to the early stages. I mean, in commercial deployment, when they are [going to be] constantly on the sidewalk and part of the urban landscape, I think that people will get used to them… naturally, once it becomes part of the urban environment, it becomes a bit less interesting. Once it’s something new, everyone looks at it.”
RO’s Role Within the Interaction
Certain interactions between robots and surrounding road users require RO intervention. Essential intervention is required when there is an apparent risk to humans or robots. This includes, for example, aiding a robot in distress (e.g., stuck or fallen) by seeking help from nearby people, deterring people who are trying to harm the robots or asking people who block the robots’ way to move (P10, P12). Another type of necessary remote intervention is helping end-customers when the delivery arrives (e.g., the storage compartment does not open). ROs may also get involved to enhance positive public experiences. Although these interactions are not intended to resolve an immediate problem, they can improve the public acceptance of robots [53]. Such interventions happen when people try to socially interact with robots. In these situations, ROs may have flexibility in how they respond. One of the participants commented: “Usually an operator, when they see a person that wants to interact with the robot, like you’ll see on the screen a person that’s like ‘hi [ROBOT NAME]’ and they get excited… they’ll start interacting with the person.” (P10). However, we can reasonably assume that such a time-consuming interaction is more likely to occur during the initial stages of deployment or infrequently thereafter, as it may result in undesired delays.

3.3.2. Connectivity

According to our interviewees, connectivity is one of the most frequent issues affecting LMDRs’ work. This challenge was frequently mentioned, with 7 out of the 15 participants identifying communication as a primary obstacle for LMDRs. Connectivity issues refer to communication failures (e.g., disconnections or latency) caused by instability or malfunctions in the cellular networks used by the robots [78].
One primary cause of connectivity problems is the high density of users competing for bandwidth, including smartphones, Internet of Things devices, and AVs. As one interviewee noted: “…if there are now many autonomous agents there will almost certainly be a communication problem… if many operators will open too many cameras at the same time… they are all standing at the same intersection that has a puddle in it. For sure the communication will stutter, in stuttering communication tele-operation cannot work.” (P11). That is, congestion can lead to overloaded networks, resulting in slower data transmission speeds and increased latency. For LMDRs, which rely on real-time data for navigation and communication, such delays can disrupt their operations, causing them to stop unexpectedly or lose connection with their ROC.
Besides overloaded networks, communication issues might also arise due to the physical characteristics of urban environments. Narrow streets and tall buildings (“Urban Canyons”) may cause dropped connections or weak signals. This is particularly problematic for GPS-based navigation systems since it causes difficulties in receiving and sending GPS signals effectively: “…because it was narrow streets and high buildings, we had problems with the GPS Signal… They had to redesign a bit the navigation algorithm not to rely so much on the GPS signal…” (P1).
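Although no participant described their exact implementation, the behavior reported here (robots stopping when the link degrades or drops) is commonly realized with a connection watchdog. The sketch below illustrates that pattern under our own assumptions about thresholds and naming; it is not drawn from any interviewed company’s software.

```python
import time


class ConnectionWatchdog:
    """Trigger a safe stop when heartbeats from the ROC stop arriving or latency spikes."""

    def __init__(self, timeout_s: float = 2.0, max_latency_ms: float = 300.0):
        self.timeout_s = timeout_s            # how long without a heartbeat counts as lost link
        self.max_latency_ms = max_latency_ms  # latency above this makes teleoperation unsafe
        self.last_heartbeat = time.monotonic()
        self.last_latency_ms = 0.0

    def on_heartbeat(self, latency_ms: float) -> None:
        """Record a heartbeat received from the ROC, with its measured latency."""
        self.last_heartbeat = time.monotonic()
        self.last_latency_ms = latency_ms

    def link_ok(self) -> bool:
        """True only if the link is both fresh and fast enough for remote control."""
        stale = time.monotonic() - self.last_heartbeat > self.timeout_s
        return not stale and self.last_latency_ms <= self.max_latency_ms


# Hypothetical use in the robot's control loop:
# if not watchdog.link_ok():
#     robot.safe_stop()   # pull over and wait until connectivity recovers
```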

3.3.3. Blocked Routes

The sidewalks where LMDRs travel often contain various obstructions, including trash cans, parked vehicles, baby strollers, electric scooters, and other objects left by pedestrians (P4, P9, P11, P15). These obstacles are inherent to the urban landscape, making navigation challenging and unpredictable for LMDRs. Typically, robots attempt to navigate around them autonomously by recalculating their route. However, when this is not feasible, the robot sends an alert to the ROC, requesting assistance.

3.3.4. Road and Environmental Conditions

LMDRs frequently encounter challenges due to poor road conditions and environmental factors, such as broken sidewalks, potholes, and uneven terrain. These disruptions can obstruct their path, leading to mobility difficulties or even causing them to become stuck and completely immobilized. For example, if a robot’s wheel gets trapped in a pavement groove and cannot extricate itself autonomously, it will be unable to continue moving forward. Another challenge is adverse weather conditions like hail, ice, and wet surfaces. Snow, for example, can cover up lanes and create dangerous slipping conditions, as mentioned by P2: “After snowing, we cannot see the lane. Moving and slipping is very dangerous…”.
Robots also struggle with changes in elevation, such as stairs or steep slopes, which their engines may not handle effectively: “…the elevation of the ground at a certain angle, and the engines are not just strong enough to carry their robots upstairs or downstairs…” (P4). “These situations are often addressed using tele-driving or tele-assistance…” (P9). In the latter case, for example, the RO may mark the part of the road where the height differs (e.g., marking a ramp) and instruct the robot to continue, taking the new information into account.

3.3.5. Complex Traffic Scenarios

Urban navigation involves complex decision-making at street crossings, intersections, and other traffic scenarios. P7 discussed the ‘go/no-go’ tele-assistance solution at a complicated T-junction, where the robot needs an RO’s approval to proceed. If there is a need to travel outside mapped areas, robots may struggle to navigate safely and effectively, thus necessitating the intervention of a human. Moreover, the ROs themselves, if they operate fleets in different, less familiar areas, may need to adapt to changes in traffic laws and environmental conditions.

3.3.6. Rules and Regulations

Several interviewees highlighted challenges related to regulation and bureaucracy, including area and speed restrictions (such as restrictions in specific neighborhoods) and requirements for a permanent human escort in certain areas (P4, P5, P12) [10]. In terms of teleoperation, certain authorities mandate human intervention in specific scenarios, such as road crossings. As one participant explained: “There are places where in regulation you must intervene. The robot reached the sidewalk and wanted to cross a road, [but] it can’t. According to regulation an autonomous robot is not allowed to cross the road… and therefore, the operator must take control and pass it through…” (P13).

3.3.7. Robot Malfunction

Mechanical failures are common among LMDRs. These issues can result in serious consequences such as failed deliveries, stoppages when trying to cross streets, and significant traffic disruptions. P4 shared that “…a robot is trying to cross the street and gets stuck, and then suddenly you stop all traffic…”. Electrical issues also pose major challenges. Problems with sensors, cameras, and other electrical components can prevent the robot from operating correctly, and even minor electrical issues can complicate and hinder the robot’s operation. For example, something as simple as a low battery level might prevent the robot from completing its journey until the battery is replaced (P10). Battery depletion is currently addressed reactively by dispatching a field team; however, this highlights the need for more proactive power management. Although interviewees did not report formal energy strategies, repeated references to battery-related issues suggest that power management is a critical factor in ensuring fleet reliability and supporting effective task planning. Technical faults requiring remote help might also occur when goods are taken out. If the robot’s door gets stuck or the receiver’s code does not work, it will not be possible to send or receive the shipment.

4. Discussion

Despite the increasing deployment of LMDRs in urban environments, there remains a limited understanding of the scenarios that necessitate human intervention and the challenges ROs face in real-world settings. This study seeks to address this gap by examining the operational challenges encountered by LMDR ROs, shedding light on their working environment and the diverse scenarios they must manage. Our findings provide a comprehensive analysis of the teleoperation ecosystem, key intervention scenarios, operational workflows, and the strategies ROs employ to interact effectively with the robots. By identifying critical challenges and inefficiencies, this research offers valuable insights for the development of more effective teleoperation systems, improved interface designs, and enhanced support tools.
Three main teleoperation modes for LMDRs were identified in the study: monitoring, tele-driving, and tele-assistance. These modes are consistent with the literature [13]. However, we also identified considerable reliance on field teams as a last-resort, resource-intensive option when an operator cannot resolve an issue remotely. Between the two control modes, our findings indicate that most participants preferred tele-assistance over tele-driving, as it leverages human decision-making while minimizing direct robot control, ultimately enabling quicker and safer interventions. This aligns with existing research emphasizing the advantages of integrating human decision-making with autonomous systems [39,79,80]. While the tele-assistance and tele-driving interfaces may be integrated into one teleoperation station, our findings imply significant variations in teleoperation needs across different roles, companies, and environments. These variations stem from three main sources: organizational structure (e.g., a horizontal versus a hierarchical model), robot characteristics (such as different door configurations requiring distinct control interfaces), and operational environments (from university campuses to dense urban areas with varying regulations). This raises a fundamental design question: should teleoperation interfaces be standardized to support all conditions, or should they be tailored to specific ROs, robots, and environments? Based on our findings regarding the significant variation in teleoperation structures, workflows, and UI features across companies, we believe there is a need for a standardized interface that enables a common language, while allowing for minor adjustments based on varying conditions.
Prior research has emphasized the importance of consistency in UI design [81,82]. However, other researchers indicated that adaptability and flexibility in UI design may be necessary to accommodate the unique demands of different operational contexts [82,83]. While standardized interfaces offer clear advantages, their feasibility depends on achieving interoperability and establishing industry-wide standards. Designing a standardized teleoperation interface that applies across companies and countries, enabling global management of LMDR systems, would require integration into national transportation regulations. Such an approach demands international collaboration among transportation ministries, along with careful planning to align interface design with diverse regulatory environments. However, achieving this level of coordination is highly complex. An alternative pathway toward standardization may emerge through industry-led cooperation. If leading companies in the LMDR domain adopt a shared teleoperation and fleet management system, smaller companies may follow suit. While this bottom-up approach avoids bureaucratic delays and promotes organic adoption, it lacks formal regulatory enforcement and may lead to inconsistent implementation or fragmentation over time.
To support standardization, we suggest adopting a risk management approach [84]. Risk management focuses on identifying, understanding, and assessing the risks and issues associated with the robot, along with their potential consequences. Several established frameworks exist for evaluating risk and its impact, including the ISO 31000 guidelines [85,86]. Based on these approaches, the risks associated with LMDRs can be classified as follows:
  • Operational and Financial: Issues such as delays, incomplete deliveries, disrupted planning, repair of damaged robots, the need for substitutions, and customer compensation.
  • Health and Safety: Potential harm caused to road users.
  • Reputational: Impacts on public perception and potential loss of customer trust.
Another relevant framework is an adaptation of the Failure Mode and Effects Analysis (FMEA) approach for mobile service robots, known as Robot-Inclusive FMEA (RIFMEA) [87]. This framework considers the entities that could be impacted, namely robots, humans, and objects in the robot’s working environment. The risk to humans corresponds to the health and safety risks in the previous typology, while the risks to robots and the environment are associated with the operational risks.
Each of the intervention scenarios can be analyzed using these frameworks. For example, robot malfunctions (e.g., a dead battery) involve the robots directly (i.e., operational risks) but can also indirectly pose health and safety risks to humans, for instance, if the robot blocks other road users’ paths. In other scenarios, such as a collision between a robot and a passerby who misread the robot’s intentions, the impact on humans is more direct. These scenarios can be rated by their likelihood and the severity of their impact [88].
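To make this concrete, the sketch below (our illustration, not a procedure used by the interviewed companies) shows how intervention scenarios could be scored with an FMEA-style risk priority number, combining the risk categories above with illustrative likelihood and severity scales.

```python
# A minimal sketch of FMEA-style scoring for LMDR intervention scenarios.
# The scenarios, scales, and numeric values are illustrative assumptions.
from dataclasses import dataclass
from enum import Enum


class RiskCategory(Enum):
    OPERATIONAL_FINANCIAL = "operational/financial"
    HEALTH_AND_SAFETY = "health and safety"
    REPUTATIONAL = "reputational"


@dataclass
class ScenarioRisk:
    scenario: str          # e.g., "dead battery blocking a sidewalk"
    category: RiskCategory
    likelihood: int        # 1 (rare) .. 5 (frequent), illustrative scale
    severity: int          # 1 (negligible) .. 5 (critical), illustrative scale

    @property
    def priority(self) -> int:
        # FMEA-style risk priority number: higher values are handled first.
        return self.likelihood * self.severity


risks = [
    ScenarioRisk("dead battery blocking a sidewalk", RiskCategory.OPERATIONAL_FINANCIAL, 3, 2),
    ScenarioRisk("same dead battery obstructing other road users", RiskCategory.HEALTH_AND_SAFETY, 3, 4),
    ScenarioRisk("collision with a passerby misreading the robot's intent", RiskCategory.HEALTH_AND_SAFETY, 2, 5),
]

# Rank scenarios so the highest-priority risks are mitigated first.
for r in sorted(risks, key=lambda r: r.priority, reverse=True):
    print(f"{r.priority:>2}  [{r.category.value}] {r.scenario}")
```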
Scenarios can be further examined through cognitive load theory [89], which provides a human factors lens (specifically, operator-centric demands) for addressing questions such as “How challenging is this intervention for the operator?”. Each scenario could be rated based on the cognitive load it imposes on the operator. This load can be intrinsic or extraneous: intrinsic load is the inherent difficulty of the problem itself (e.g., diagnosing a complex sensor failure is intrinsically harder than noting a low battery), whereas extraneous load is imposed by the design of the interface (e.g., how many clicks are needed to reach the backup camera view, and whether messages are sufficiently clear).
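The same scenarios could be annotated with their load components. The following sketch is a hypothetical illustration of separating intrinsic from extraneous load; the scales and threshold are assumptions, but a high extraneous share would indicate that the interface, rather than the task itself, is the main source of difficulty.

```python
# A minimal, illustrative sketch (not from the study) separating the intrinsic
# and extraneous load of an intervention scenario.
from dataclasses import dataclass


@dataclass
class InterventionLoad:
    scenario: str
    intrinsic: int    # 1..5, difficulty inherent to the problem (assumed scale)
    extraneous: int   # 1..5, difficulty added by the interface (assumed scale)

    def extraneous_share(self) -> float:
        return self.extraneous / (self.intrinsic + self.extraneous)


scenarios = [
    InterventionLoad("note low battery and schedule a swap", intrinsic=1, extraneous=3),
    InterventionLoad("diagnose an intermittent sensor failure", intrinsic=5, extraneous=2),
]

# Flag scenarios where most of the load is interface-induced and could be
# reduced by UI redesign (fewer clicks, clearer messages).
for s in scenarios:
    flag = "candidate for UI redesign" if s.extraneous_share() > 0.5 else "ok"
    print(f"{s.scenario}: extraneous share {s.extraneous_share():.0%} -> {flag}")
```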
The findings from this study offer valuable insights to inform the design of future LMDR teleoperation interfaces. For example, the identified intervention scenarios can serve as use cases for developing essential tele-assistance commands, similar to prior approaches in AV research [49]. Furthermore, given the intervention scenarios, the interface could help address them by accurately conveying urban space density, indicating navigability through narrow passages, and providing alternative location indicators when GPS is unavailable. Similarly, proper presentation of the robot’s operational space is pivotal for enhancing the ROs’ situational awareness and decision-making, ensuring that obstacles, passage constraints, and location-tracking issues are clearly communicated. Finally, according to the findings, ROs sometimes fail to exhaust all remote solutions and dispatch a field team too early, undermining the efficiency gains that teleoperation is designed to achieve. Possible reasons for such behavior include a tendency to seek quick solutions (the “cognitive miser” effect) [68], decision fatigue [90], limited experience [68], or unclear scripts. Simple UI measures, such as warning the user or making the field-team dispatch command less accessible and reserving it for situations where all other remote options have been exhausted, might help mitigate this issue.
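As one example of such a UI measure, the sketch below illustrates how a teleoperation client could gate the field-team dispatch command until the standard remote options have been attempted. The option names and workflow are our assumptions, not those of any interviewed company.

```python
# A minimal sketch of gating the "dispatch field team" command behind the
# remote options that should be exhausted first. Option names are hypothetical.
REMOTE_OPTIONS = ("reboot", "tele_assist_command", "tele_drive", "contact_bystander")


def can_dispatch_field_team(attempted_options: set[str]) -> tuple[bool, list[str]]:
    """Return whether dispatch is allowed and which remote options remain untried."""
    remaining = [o for o in REMOTE_OPTIONS if o not in attempted_options]
    return (len(remaining) == 0, remaining)


allowed, remaining = can_dispatch_field_team({"reboot", "tele_assist_command"})
if not allowed:
    # The UI would show a warning (or hide the dispatch button) instead of
    # letting the operator escalate immediately.
    print("Warn operator: untried remote options ->", ", ".join(remaining))
```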
A well-designed teleoperation interface is crucial for enabling ROs to effectively assist LMDRs in real-world environments. Our interviewees repeatedly emphasized the importance of quick access to camera views, reliable map-based navigation, and real-time updates on robot and environmental status. These needs should be reflected in the interface design to effectively support situational awareness, enabling ROs to rapidly assess their surroundings, detect obstacles, and anticipate potential challenges. In light of these findings, several actionable UI design principles can be outlined to guide interface development. Interfaces should combine persistent elements, such as the front-facing camera feed, map view, and robot status indicators, with modular components, such as additional sensor views (e.g., LIDAR, rear camera, internal trunk camera) and secondary information panels, which can be resized or hidden depending on the task. The map, in particular, which proved to be very valuable for the interviewees, should always remain visible, though its size or position may be adjusted to accommodate task-specific needs. The use of multiple screens should also be considered, especially in high-demand contexts, to reduce the need for switching views and to support spatial orientation. Visual hierarchy should be applied not only through graphic emphasis (e.g., color, size, position), but also functionally, for example, by displaying context-relevant commands before presenting the full set of available controls. Some interface aspects were explicitly raised by interviewees, such as the need to view the robot’s rear and container cameras. Other design implications emerged from their described needs, including the importance of enabling interaction with bystanders through tools like microphone and speaker controls or predefined voice messages. In addition, a clear distinction between autonomous and manual control states, persistent indicators of battery and network status, and intuitive controls for tele-assistance are all essential for maintaining situational awareness. These principles are aligned with the UI components presented in Table 2 and offer generalizable, practical guidance to enhance situational awareness and operator effectiveness across diverse teleoperation environments.
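To illustrate the persistent-versus-modular split described above, the following sketch (our own illustration, not an existing vendor interface) models a layout in which persistent panels such as the map cannot be hidden, while modular panels can be toggled per task.

```python
# A minimal sketch of a teleoperation layout with persistent and modular panels.
# Panel names and priorities are illustrative assumptions.
from dataclasses import dataclass, field


@dataclass
class Panel:
    name: str
    persistent: bool      # always visible (front camera, map, robot status)
    visible: bool = True
    priority: int = 0     # lower values are shown first (visual hierarchy)


@dataclass
class TeleopLayout:
    panels: list[Panel] = field(default_factory=list)

    def hide(self, name: str) -> None:
        for p in self.panels:
            if p.name == name and not p.persistent:
                p.visible = False  # persistent panels (e.g., the map) stay visible

    def visible_panels(self) -> list[str]:
        return [p.name for p in sorted(self.panels, key=lambda p: p.priority) if p.visible]


layout = TeleopLayout([
    Panel("front_camera", persistent=True, priority=0),
    Panel("map", persistent=True, priority=1),
    Panel("robot_status", persistent=True, priority=2),
    Panel("rear_camera", persistent=False, priority=3),
    Panel("lidar_view", persistent=False, priority=4),
])

layout.hide("lidar_view")
layout.hide("map")  # ignored: the map is persistent and remains visible
print(layout.visible_panels())
```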
Several study participants indicated that the physical distance between ROCs and operational sites creates multiple challenges, including increased latency and connectivity issues that directly impact situational awareness [46]. If the centers are located in a different state or even a different country, this distance might also create gaps in the ROs’ contextual understanding of local infrastructure, traffic regulations, and cultural norms, which could slow down and even compromise their decision-making. Strategies such as virtual training sessions, simulations, and context-specific on-screen guidance can help bridge these gaps.
To improve efficiency, teleoperation systems can implement crowd-wisdom mechanisms that allow solutions to be shared across multiple robots. For example, a command issued by an RO to bypass a roadblock at a certain location can be remembered and reused by the system when another robot arrives at the same place [91]. Additionally, ROs can enhance system knowledge by manually annotating maps with relevant navigation information, marking features or conditions that impact robot movement; this ensures that other LMDRs have access to crucial information. Moreover, areas where RO intervention is likely to be needed, such as crosswalks that are legally restricted from autonomous crossing in certain regions, could also be marked on the map and added to the RO’s list of intervention scenarios whenever they lie on a robot’s route, ensuring proactive management of regulatory constraints. The map could also streamline the RO’s work by supporting route selection that accounts for the cellular network load along candidate routes, since heavy load may cause communication malfunctions such as disconnections and latency.
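A possible realization of this crowd-wisdom mechanism is sketched below; the storage scheme, cell size, and interface are assumptions for illustration only.

```python
# A minimal sketch of a shared annotation store: a resolution an RO applies at
# one location is recorded and surfaced to any robot that later reaches the
# same map cell. Coordinates, cell size, and notes are illustrative.
from collections import defaultdict

CELL_SIZE = 0.001  # roughly 100 m in latitude degrees; illustrative only


def cell(lat: float, lon: float) -> tuple[int, int]:
    """Quantize a GPS position into a coarse map cell used as the lookup key."""
    return (round(lat / CELL_SIZE), round(lon / CELL_SIZE))


class SharedAnnotations:
    def __init__(self) -> None:
        self._store: dict[tuple[int, int], list[str]] = defaultdict(list)

    def record(self, lat: float, lon: float, note: str) -> None:
        self._store[cell(lat, lon)].append(note)

    def lookup(self, lat: float, lon: float) -> list[str]:
        return self._store[cell(lat, lon)]


annotations = SharedAnnotations()
# An RO resolves a blocked sidewalk once...
annotations.record(32.7940, 34.9896, "construction: bypass via adjacent bike lane")
# ...and a later robot approaching the same cell retrieves the hint.
print(annotations.lookup(32.7941, 34.9897))
```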
Finally, our study emphasized the crucial role of the robots’ interactions with bystanders and other road users. In conflict scenarios, a human RO is often called upon to mediate this communication, and the RO’s communication and interpersonal skills play a critical role in shaping public acceptance of these robots. The findings emphasize that social acceptance is vital for companies, a conclusion corroborated by existing studies [53]. As a result, strong communication skills should be considered essential in the hiring process or developed through specialized training programs, ensuring that ROs can positively influence public perceptions of delivery robots. In this context, it is worth noting that the robots are a relatively new addition to the public sphere and not yet ubiquitous. As their presence in the street increases and people become accustomed to them, some of the conflicts between the robots and road users described earlier (mainly those resulting from curiosity) are likely to decrease [92].
The Effect of Workflow and Organizational Structure: Although this topic was not the primary focus of our study, we observed notable variations in how different companies structure the teleoperation workflow. Some organizations adopt a flat or horizontal model, in which each RO is responsible for the full range of remote operational tasks, from monitoring to direct control. In contrast, companies employing a vertical or hierarchical model assign specific roles to ROs based on their position and responsibilities within the organizational structure. The structure of teleoperation centers reflects broader dynamics found in human–robot collaboration, where organizational models influence task allocation, communication flow, and situational awareness between human operators and autonomous systems [93]. Boin and Hart [94] discuss variations in organizational structure in the context of emergency response organizations and argue that there is no single gold standard for how such organizations should work. Nevertheless, these structural differences have important implications for the design of operator interface systems. Interfaces should be tailored to support the specific tasks expected of operators, while avoiding cognitive overload and preserving situational awareness and autonomy. For instance, in a flat-structured operations center, the interface should enable seamless transitions between monitoring and intervention modes, provide rapid access to diagnostic tools, and facilitate communication with field teams. Conversely, in a hierarchical model, monitoring operators may not require access to all intervention-related resources, and their interfaces should be streamlined accordingly.
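One way to operationalize this distinction is a role-based configuration of interface features, as in the hypothetical sketch below; the feature groups and role names are assumptions rather than observed company practice.

```python
# A minimal sketch of tailoring interface features to the organizational model:
# a flat ROC exposes the full toolset to every operator, while a hierarchical
# ROC gives monitoring operators a streamlined subset. Names are illustrative.
FEATURES = {
    "monitoring": {"alerts", "map", "camera_views", "robot_status"},
    "tele_assistance": {"high_level_commands", "bystander_audio"},
    "tele_driving": {"manual_controls", "latency_indicator"},
    "coordination": {"field_team_dispatch"},
}


def interface_features(org_model: str, role: str) -> set[str]:
    if org_model == "flat":
        # Every RO handles the full range of tasks, so expose everything.
        return set().union(*FEATURES.values())
    # Hierarchical model: expose only the feature groups the role needs.
    role_map = {
        "monitor": ["monitoring"],
        "assistant": ["monitoring", "tele_assistance"],
        "driver": ["monitoring", "tele_assistance", "tele_driving"],
        "supervisor": list(FEATURES),
    }
    return set().union(*(FEATURES[g] for g in role_map[role]))


print(sorted(interface_features("hierarchical", "monitor")))
print(sorted(interface_features("flat", "monitor")))
```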

4.1. Limitations

While the study provides valuable insights, it has several limitations. First, the sample size is relatively small. Although it includes participants from organizations operating in diverse domains and at different levels of technological deployment, it may not fully represent the range of stakeholders in the LMDR ecosystem. This is especially true for those working within ROCs, who may have distinct perspectives and experiences not captured in our sample. Similarly, the sample does not include regulators. Additionally, the richness of intervention scenarios may be influenced by deployment characteristics or geographic context. Consequently, our findings may not encompass the full range of possible scenarios and their prevalence. Furthermore, the valuable breadth gained through the relative diversity of stakeholders may come at the cost of depth in understanding specific operational settings (e.g., workflow across scenarios). However, we note that, as with many qualitative studies, the goal was not to generalize statistically but to explore in depth the emerging practices and challenges in LMDR teleoperation. Future work could expand on our findings with broader surveys or quantitative studies to assess prevalence across a larger population.
Related to this, obtaining data on the specifics of the scripts used by the ROs, interface design details, or the frequency of user behaviors (e.g., calling the field team) proved challenging, partially due to confidentiality issues. Consequently, our data are relatively high-level and lack detailed information about some aspects of ROs’ routine procedures. For example, data about the level of detail in the scripts and the extent to which they allow independent decision-making are lacking. However, the diverse backgrounds of our participants provide complementary points of view on the ROs’ work, enriching the data. Future research with access to such materials could offer a more granular understanding of system-level influences on operator behavior.
Another limitation of the study is that our data lacks sufficient information about the distance between ROCs and the operational sites, complicating the assessment of the distance’s impact on workers’ experiences. Nevertheless, it should be noted that commercial deployment is progressing at a relatively slow pace, and there are currently limited use cases of interstate, let alone international, ROCs.

4.2. Future Research

Our study aimed to gain a better understanding of the LMDR teleoperation ecosystem and to inform the design of LMDR teleoperation interfaces. The next step is to design and develop such interfaces and to evaluate them in simulated and real-world environments. While various companies have developed specific interface solutions, their efficiency and effectiveness remain unclear. Research on supervisory control suggests that interface design plays a critical role in managing multiple robots effectively [51]. Further research is needed to understand how design choices, such as the number of camera views, their placement in the interface, and the commands available to the RO, affect situational awareness, cognitive load, and, ultimately, the RO’s effectiveness in resolving the different issues identified in our study.
The current study investigated the experiences, challenges, and perspectives of stakeholders involved in LMDR teleoperation using qualitative insights. Future research should analyze quantitative data gathered in real or simulated settings to examine performance-related metrics such as response time, task success rate, and error frequency. Such studies could also categorize intervention scenarios by frequency, risk level, and level of impact to better support operational prioritization.
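As an illustration of the kind of analysis we envision, the sketch below computes simple per-scenario metrics from a hypothetical intervention log; the log format and values are invented for demonstration.

```python
# A minimal sketch of quantitative metrics computed from logged interventions.
# The log format and numbers are illustrative, not data from the companies.
from statistics import mean

# Each record: (scenario type, response time in seconds, resolved remotely?)
log = [
    ("blocked_route", 42.0, True),
    ("connectivity_loss", 95.0, False),
    ("blocked_route", 31.0, True),
    ("gps_drift", 60.0, True),
]


def summarize(records):
    by_type: dict[str, list[tuple[float, bool]]] = {}
    for scenario, response_time, resolved in records:
        by_type.setdefault(scenario, []).append((response_time, resolved))
    for scenario, rows in sorted(by_type.items()):
        times = [rt for rt, _ in rows]
        success = sum(ok for _, ok in rows) / len(rows)
        print(f"{scenario}: n={len(rows)}, mean response {mean(times):.0f}s, "
              f"remote success rate {success:.0%}")


summarize(log)
```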
Training ROs is another critical area for future research. Our study highlights the importance of skilled and experienced ROs, particularly in situations requiring interaction with bystanders or other road users. Exploring effective training methods, both prior to and during deployment, is essential. For instance, future research could develop techniques to help ROs rapidly familiarize themselves with unfamiliar locations, enabling them to quickly assess and interpret their surroundings. Research could also determine the optimal training duration needed for new ROs to achieve safe operational competency with a standardized teleoperation system.
The integration of artificial intelligence with teleoperation systems presents a valuable opportunity for enhancing RO effectiveness. By using generative AI to help ROs quickly analyze information in unfamiliar situations [95] and applying machine learning to study expert RO behaviors, both robot autonomy and teleoperation interfaces could be continuously improved. Such systems could develop increasingly sophisticated response protocols for common problems. This approach could enable fleet scaling without proportional increases in the number of human operators.
The findings also point to the need for a closer examination of the regulatory aspects of LMDR teleoperation. Future work could assess ways to inform policy design, such as the importance of distinguishing between different modes of teleoperation (monitoring, tele-assistance, tele-driving) when creating intervention rules, the need for regulatory flexibility to accommodate evolving interface capabilities and AI autonomy levels, or the potential benefit of developing interoperability standards for remote operators across robot platforms and urban jurisdictions. We also emphasize that a more structured framework should emerge through collaboration between researchers, industry, and policymakers, building on empirical insights such as those presented in this study.
Finally, the role of human–robot interaction warrants further investigation. Given that ROs must effectively communicate with pedestrians and delivery recipients, future research could explore how ROs’ behavior and robot interface design influence public acceptance. This includes examining how robots are perceived in public spaces [96] based on RO communication styles and robot characteristics. Additionally, evaluating different message delivery formats—such as verbal, written, or visual communication—and their impact on robot personification could provide valuable insights. For instance, interface features that allow ROs to control robots’ movements or adjust visual cues may enhance communication and user acceptance. Such studies would contribute to the development of teleoperation systems that effectively balance operational efficiency with social integration.

5. Conclusions

In conclusion, our study sheds light on the multifaceted challenges and operational dynamics of remotely operating LMDRs. Through comprehensive interviews with industry professionals, we have identified key scenarios necessitating human intervention, explored the strategies employed by ROs, and highlighted the unique challenges they face. Our findings underscore the critical role of teleoperation in ensuring the efficient and safe operation of LMDRs, particularly in complex urban environments that present difficulties not experienced in other contexts of remotely operated robots.
Beyond the descriptive insights, our study offers several conceptual contributions. First, we introduce a framework of LMDR teleoperation modes representing different levels of human involvement. This typology contributes to HCI and remote collaboration literature by informing discussions on human–AI teaming and control delegation, encouraging more refined approaches to interface design and operator support in mixed-initiative systems. Second, our findings also reveal a lack of standardized teleoperation practices across the LMDR industry, despite operational similarities. This variation points to a gap that could be addressed by developing shared frameworks, similar to those found in remote work research, to promote interoperability, consistency, and best practices across organizations. Finally, the intervention scenarios described—ranging from navigation challenges to regulatory constraints—highlight the limits of automation and the continued need for human oversight. These examples reinforce the value of situated decision-making, where human judgment plays a critical role in addressing unpredictable, real-world conditions.
Taken together, these insights advance scholarly understanding of LMDR teleoperation, offering a foundation for both improved system design and deeper conceptual engagement with the evolving dynamics of human–machine interaction.

Author Contributions

Conceptualization, A.B., E.G., F.T. and J.L.; methodology, A.B. and E.G.; formal analysis, A.B. and E.G.; data curation, A.B., E.G. and F.T.; writing—original draft preparation, A.B., E.G., F.T. and J.L.; writing—review and editing, A.B., E.G., F.T. and J.L.; visualization, A.B. and F.T.; supervision, J.L.; project administration, A.B., E.G. and J.L.; funding acquisition, J.L. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the Israeli Smart Transportation Research Center (ISTRC).

Institutional Review Board Statement

The study was conducted in accordance with the Declaration of Helsinki, and approved by the Ethics Committee of University of Haifa (198/23).

Informed Consent Statement

Informed consent was obtained from all subjects involved in the study.

Data Availability Statement

The raw data supporting the conclusions of this article will be made available by the authors on request.

Conflicts of Interest

The authors declare no conflicts of interest.

Abbreviations

The following abbreviations are used in this manuscript:
LMDR: Last-Mile Delivery Robot
RO: Remote Operator
UI: User Interface
HMI: Human–Machine Interface
AV: Autonomous Vehicles
IMU: Inertial Measurement Units
GPS: Global Positioning Systems
ROC: Remote Operation Center

References

  1. Wang, Y.; Zhang, D.; Liu, Q.; Shen, F.; Lee, L.H. Towards enhancing the last-mile delivery: An effective crowd-tasking model with scalable solutions. Transp. Res. Part E Logist. Transp. Rev. 2016, 93, 279–293. [Google Scholar] [CrossRef]
  2. Engesser, V.; Rombaut, E.; Vanhaverbeke, L.; Lebeau, P. Autonomous Delivery Solutions for Last-Mile Logistics Operations: A Literature Review and Research Agenda. Sustainability 2023, 15, 2774. [Google Scholar] [CrossRef]
  3. Mohammad, W.A.; Nazih Diab, Y.; Elomri, A.; Triki, C. Innovative solutions in last mile delivery: Concepts, practices, challenges, and future directions. Supply Chain Forum Int. J. 2023, 24, 151–169. [Google Scholar] [CrossRef]
  4. Fortune Business Insights. Autonomous Last Mile Delivery Market Size, Share & Industry Analysis 2024–2032. FBI105598105598. January 2025. Available online: https://www.fortunebusinessinsights.com/autonomous-last-mile-delivery-market-105598 (accessed on 26 January 2025).
  5. Grand View Research. Autonomous Last Mile Delivery Market Size, Share & Trends Analysis Report, 2023–2030. Grand View Research, GVR-4-68039-204-2. Available online: https://www.grandviewresearch.com/industry-analysis/autonomous-last-mile-delivery-market (accessed on 26 January 2025).
  6. Sahay, R.; Wolff, C. Pandemic, Parcels and Public Vaccination Envisioning the Next Normal for the Last-Mile Ecosystem. World Economic Forum, Insight Report. April 2021. Available online: https://www3.weforum.org/docs/WEF_Pandemic_Parcels_and_Public_Vaccination_report_2021.pdf (accessed on 26 January 2025).
  7. Lemardelé, C.; Pinheiro Melo, S.; Cerdas, F.; Herrmann, C.; Estrada, M. Life-cycle analysis of last-mile parcel delivery using autonomous delivery robots. Transp. Res. Part D Transp. Environ. 2023, 121, 103842. [Google Scholar] [CrossRef]
  8. Chen, C.; Demir, E.; Huang, Y.; Qiu, R. The adoption of self-driving delivery robots in last mile logistics. Transp. Res. Part E Logist. Transp. Rev. 2021, 146, 102214. [Google Scholar] [CrossRef] [PubMed]
  9. De Maio, A.; Ghiani, G.; Laganà, D.; Manni, E. Sustainable last-mile distribution with autonomous delivery robots and public transportation. Transp. Res. Part C Emerg. Technol. 2024, 163, 104615. [Google Scholar] [CrossRef]
  10. Hoffmann, T.; Prause, G. On the Regulatory Framework for Last-Mile Delivery Robots. Machines 2018, 6, 33. [Google Scholar] [CrossRef]
  11. Fordham, C.; Fowler, C.; Kemp, I.; Williams, D. GATEway Safety and Insurance Ensuring Safety for Autonomous Vehicle Trials. TRL UK, PPR859. September 2018. Available online: https://www.trl.co.uk/Uploads/TRL/Documents/D2.2_-Safety-and-Insurance_PPR859_Optimized.pdf (accessed on 26 January 2025).
  12. Lee, J.S.; Ham, Y.; Park, H.; Kim, J. Challenges, tasks, and opportunities in teleoperation of excavator toward human-in-the-loop construction automation. Autom. Constr. 2022, 135, 104119. [Google Scholar] [CrossRef]
  13. Moniruzzaman, M.; Rassau, A.; Chai, D.; Islam, S.M.S. Teleoperation methods and enhancement techniques for mobile robots: A comprehensive survey. Robot. Auton. Syst. 2022, 150, 103973. [Google Scholar] [CrossRef]
  14. Salvini, P.; Reinmund, T.; Hardin, B.; Grieman, K.; Ten Holter, C.; Johnson, A.; Kunze, L.; Winfield, A.; Jirotka, M. Human involvement in autonomous decision-making systems. Lessons learned from three case studies in aviation, social care and road vehicles. Front. Polit. Sci. 2023, 5, 1238461. [Google Scholar] [CrossRef]
  15. Pedestrian and Bicycle Information Center. Sharing Spaces with Robots: The Basics of Personal Delivery Devices; UNC Highway Safety Research Center: Chapel Hill, NC, USA, 2021; Available online: https://www.pedbikeinfo.org/downloads/PBIC_InfoBrief_SharingSpaceswithRobots.pdf (accessed on 23 March 2025).
  16. Buldeo Rai, H.; Touami, S.; Dablanc, L. Autonomous e-commerce delivery in ordinary and exceptional circumstances. The French case. Res. Transp. Bus. Manag. 2022, 45, 100774. [Google Scholar] [CrossRef]
  17. Clamann, M.; Podsiad, K.; Cover, A. Personal Delivery Devices (PDDs) Legislative Tracker. Available online: https://www.pedbikeinfo.org/resources/resources_details.php?id=5314 (accessed on 1 May 2023).
  18. Srinivas, S.; Ramachandiran, S.; Rajendran, S. Autonomous robot-driven deliveries: A review of recent developments and future directions. Transp. Res. Part E Logist. Transp. Rev. 2022, 165, 102834. [Google Scholar] [CrossRef]
  19. Plank, M.; Lemardelé, C.; Assmann, T.; Zug, S. Ready for robots? Assessment of autonomous delivery robot operative accessibility in German cities. J. Urban Mobil. 2022, 2, 100036. [Google Scholar] [CrossRef]
  20. Thiel, M.; Ziegenbein, J.; Blunder, N.; Schrick, M.; Kreutzfeldt, J. From Concept to Reality: Developing Sidewalk Robots for Real-World Research and Operation in Public Space. Logist. J. Proc. 2023, 19, 1–19. [Google Scholar] [CrossRef]
  21. Parasuraman, R.; Sheridan, T.B.; Wickens, C.D. A model for types and levels of human interaction with automation. IEEE Trans. Syst. Man Cybern.-Part A Syst. Hum. 2000, 30, 286–297. [Google Scholar] [CrossRef] [PubMed]
  22. Taxonomy and Definitions for Terms Related to Driving Automation Systems for On-Road Motor Vehicles. J3016_202104. April 2021. Available online: https://www.sae.org/standards/content/j3016_202104/ (accessed on 6 October 2024).
  23. Bogdoll, D.; Orf, S.; Töttel, L.; Zöllner, J.M. Taxonomy and Survey on Remote Human Input Systems for Driving Automation Systems. In Proceedings of the Future of Information and Communication Conference, Online, 3–4 March 2022; Volume 439, pp. 94–108. [Google Scholar]
  24. Caldwell, D.G.; Reddy, K.; Kocak, O.; Wardle, A. Sensory requirements and performance assessment of tele-presence controlled robots. In Proceedings of the IEEE International Conference on Robotics and Automation, Minneapolis, MN, USA, 22–28 April 1996; Volume 2, pp. 1375–1380. [Google Scholar]
  25. Li, Y.Y.; Fan, W.H.; Liu, Y.H.; Cai, X.P. Teleoperation of robots via the mobile communication networks. In Proceedings of the 2005 IEEE International Conference on Robotics and Biomimetics—ROBIO, Hong Kong, China, 5–9 July 2005; pp. 670–675. [Google Scholar]
  26. Luo, J.; He, W.; Yang, C. Combined perception, control, and learning for teleoperation: Key technologies, applications, and challenges. Cogn. Comput. Syst. 2020, 2, 33–43. [Google Scholar] [CrossRef]
  27. Sheridan, T.B. Human–Robot Interaction: Status and Challenges. Hum. Factors J. Hum. Factors Ergon. Soc. 2016, 58, 525–532. [Google Scholar] [CrossRef]
  28. Sheridan, T.B. Space teleoperation through time delay: Review and prognosis. IEEE Trans. Robot. Autom. 1993, 9, 592–606. [Google Scholar] [CrossRef]
  29. Skaar, S.B.; Ruoff, C.F. (Eds.) Teleoperation and Robotics in Space; American Institute of Aeronautics and Astronautics: Washington, DC, USA, 1994. [Google Scholar] [CrossRef]
  30. Isaacs, J.; Knoedler, K.; Herdering, A.; Beylik, M.; Quintero, H. Teleoperation for Urban Search and Rescue Applications. Field Robot. 2022, 2, 1177–1190. [Google Scholar] [CrossRef]
  31. Stopforth, R.; Holtzhausen, S.; Bright, G.; Tlale, N.S.; Kumile, C.M. Robots for Search and Rescue Purposes in Urban and Underwater Environments—A survey and comparison. In Proceedings of the 2008 15th International Conference on Mechatronics and Machine Vision in Practice, Auckland, New Zealand, 2–4 December 2008; pp. 476–480. [Google Scholar]
  32. Waharte, S.; Trigoni, N. Supporting Search and Rescue Operations with UAVs. In Proceedings of the 2010 International Conference on Emerging Security Technologies, Canterbury, UK, 6–7 September 2010; pp. 142–147. [Google Scholar]
  33. Adamides, G.; Katsanos, C.; Parmet, Y.; Christou, G.; Xenos, M.; Hadzilacos, T.; Edan, Y. HRI usability evaluation of interaction modes for a teleoperated agricultural robotic sprayer. Appl. Ergon. 2017, 62, 237–246. [Google Scholar] [CrossRef]
  34. Murakami, N.; Ito, A.; Will, J.D.; Steffen, M.; Inoue, K.; Kita, K.; Miyaura, S. Development of a teleoperation system for agricultural vehicles. Comput. Electron. Agric. 2008, 63, 81–88. [Google Scholar] [CrossRef]
  35. Mining Editor. The Rise of Remote Operating Centres in Mining. Australian Mine Safety Journal. Available online: https://www.amsj.com.au/the-rise-of-remote-operating-centres-in-mining/ (accessed on 26 January 2025).
  36. Harnett, B.M.; Doarn, C.R.; Rosen, J.; Hannaford, B.; Broderick, T.J. Evaluation of Unmanned Airborne Vehicles and Mobile Robotic Telesurgery in an Extreme Environment. Telemed. E-Health 2008, 14, 539–544. [Google Scholar] [CrossRef]
  37. Marescaux, J.; Leroy, J.; Gagner, M.; Rubino, F.; Mutter, D.; Vix, M.; Butner, S.E.; Smith, M.K. Transatlantic robot-assisted telesurgery. Nature 2001, 413, 379–380. [Google Scholar] [CrossRef] [PubMed]
  38. Deckers, L.; Madadi, B.; Verduijn, T. Tele-operated driving in logistics as a transition to full automation: An exploratory study and research agenda. In Proceedings of the 27th ITS World Congress, Hamburg, Germany, 11–15 October 2021; pp. 11–15. [Google Scholar]
  39. Meir, A.; Grimberg, E.; Musicant, O. The human-factors’ challenges of (tele)drivers of Autonomous Vehicles. Ergonomics 2024, 68, 947–967. [Google Scholar] [CrossRef] [PubMed]
  40. Tener, F.; Lanir, J. Driving from a Distance: Challenges and Guidelines for Autonomous Vehicle Teleoperation Interfaces. In Proceedings of the CHI Conference on Human Factors in Computing Systems, New Orleans, LA, USA, 30 April–5 May 2022; ACM: New York, NY, USA, 2022; pp. 1–13. [Google Scholar] [CrossRef]
  41. Ollero, A.; Tognon, M.; Suarez, A.; Lee, D.; Franchi, A. Past, Present, and Future of Aerial Robotic Manipulators. IEEE Trans. Robot. 2022, 38, 626–645. [Google Scholar] [CrossRef]
  42. Sheridan, T.B. Teleoperation, telerobotics and telepresence: A progress report. Control Eng. Pract. 1995, 3, 205–214. [Google Scholar] [CrossRef]
  43. Siciliano, B.; Khatib, O. (Eds.) Springer Handbook of Robotics; Springer International Publishing: Cham, Switzerland, 2016. [Google Scholar] [CrossRef]
  44. Boker, A.; Lanir, J. Bird’s Eye View Effect on Situational Awareness in Remote Driving. In Proceedings of the Adjunct 15th International Conference on Automotive User Interfaces and Interactive Vehicular Applications, Ingolstadt, Germany, 18–21 September 2023; ACM: New York, NY, USA, 2023; pp. 36–41. [Google Scholar]
  45. Steinfeld, A.; Fong, T.; Kaber, D.; Lewis, M.; Scholtz, J.; Schultz, A.; Goodrich, M. Common metrics for human-robot interaction. In Proceedings of the 1st ACM SIGCHI/SIGART Conference on Human-Robot Interaction, Salt Lake City, UT, USA, 2–3 March 2006; ACM: New York, NY, USA, 2006; pp. 33–40. [Google Scholar]
  46. Chen, J.Y.; Haas, E.C.; Barnes, M.J. Human Performance Issues and User Interface Design for Teleoperated Robots. IEEE Trans. Syst. Man Cybern. Part C Appl. Rev. 2007, 37, 1231–1245. [Google Scholar] [CrossRef]
  47. Durantin, G.; Gagnon, J.-F.; Tremblay, S.; Dehais, F. Using near infrared spectroscopy and heart rate variability to detect mental overload. Behav. Brain Res. 2014, 259, 16–23. [Google Scholar] [CrossRef] [PubMed]
  48. Van Erp, J.B.; Padmos, P. Image parameters for driving with indirect viewing systems. Ergonomics 2003, 46, 1471–1499. [Google Scholar] [CrossRef]
  49. Tener, F.; Lanir, J. Devising a High-Level Command Language for the Teleoperation of Autonomous Vehicles. Int. J. Hum.-Comput. Interact. 2024, 41, 5299–5315. [Google Scholar] [CrossRef]
  50. Rea, D.J.; Seo, S.H. Still Not Solved: A Call for Renewed Focus on User-Centered Teleoperation Interfaces. Front. Robot. AI 2022, 9, 704225. [Google Scholar] [CrossRef] [PubMed]
  51. Chen, J.C.Y.; Barnes, M.J.; Harper-Sciarini, M. Supervisory Control of Multiple Robots: Human-Performance Issues and User-Interface Design. IEEE Trans. Syst. Man Cybern. Part C Appl. Rev. 2011, 41, 435–454. [Google Scholar] [CrossRef]
  52. Cooke, N.J. Human Factors of Remotely Operated Vehicles. Proc. Hum. Factors Ergon. Soc. Annu. Meet. 2006, 50, 166–169. [Google Scholar] [CrossRef]
  53. Alverhed, E.; Hellgren, S.; Isaksson, H.; Olsson, L.; Palmqvist, H.; Flodén, J. Autonomous last-mile delivery robots: A literature review. Eur. Transp. Res. Rev. 2024, 16, 4. [Google Scholar] [CrossRef]
  54. Nikkei, A. Allowed on Sidewalks, South Korean Delivery Robots Poised to Take Off. Available online: https://kr-asia.com/allowed-on-sidewalks-south-korean-delivery-robots-poised-to-take-off (accessed on 1 February 2024).
  55. Mike, O. Real Life Robotics Debuts Delivery Robot at Toronto Zoo. Available online: https://www.therobotreport.com/real-life-robotics-debuts-delivery-robot-toronto-zoo/ (accessed on 11 September 2024).
  56. Lim, X.-J.; Chang, J.Y.-S.; Cheah, J.-H.; Lim, W.M.; Kraus, S.; Dabić, M. Out of the way, human! Understanding post-adoption of last-mile delivery robots. Technol. Forecast. Soc. Change 2024, 201, 123242. [Google Scholar] [CrossRef]
  57. Ostermeier, M.; Heimfarth, A.; Hübner, A. Cost-optimal truck-and-robot routing for last-mile delivery. Networks 2022, 79, 364–389. [Google Scholar] [CrossRef]
  58. Patel, S. Human Interaction with Autonomous Delivery Robots: Navigating the Intersection of Psychological Acceptance and Societal Integration. In Proceedings of the 2024 AHFE International Conference on Human Factors in Design, Engineering, and Computing, Honolulu, HI, USA, 20–24 April 2024. [Google Scholar] [CrossRef]
  59. Yu, X.; Hoggenmüller, M.; Tran, T.T.M.; Wang, Y.; Tomitsch, M. Understanding the Interaction between Delivery Robots and Other Road and Sidewalk Users: A Study of User-generated Online Videos. ACM Trans. Hum.-Robot Interact. 2024, 13, 1–32. [Google Scholar] [CrossRef]
  60. Robinson, O.C. Sampling in Interview-Based Qualitative Research: A Theoretical and Practical Guide. Qual. Res. Psychol. 2014, 11, 25–41. [Google Scholar] [CrossRef]
  61. Church, J. Funding Rounds Explained: A Guide for Startups. Medium. 2024. Available online: https://jameschurch.medium.com/funding-rounds-explained-a-guide-for-startups-7d2ba23a4424 (accessed on 16 June 2025).
  62. Dashboard|Tracxn. Available online: https://platform.tracxn.com/a/dashboard (accessed on 16 June 2025).
  63. GlobeNewswire. Delivery Robots Market Report 2025: Bear Robotics, Starship Technologies, Amazon Robotics, and Kiwibot Lead the Space. March 2025. Available online: https://www.globenewswire.com/news-release/2025/03/25/3048871/0/en/Delivery-Robots-Market-Report-2025-Bear-Robotics-Starship-Technologies-Amazon-Robotics-and-Kiwibot-Lead-the-Space.html (accessed on 15 June 2025).
  64. MacQueen, K.M.; McLellan, E.; Kay, K.; Milstein, B. Codebook Development for Team-Based Qualitative Analysis. CAM J. 1998, 10, 31–36. [Google Scholar] [CrossRef]
  65. Braun, V.; Clarke, V. Using thematic analysis in psychology. Qual. Res. Psychol. 2006, 3, 77–101. [Google Scholar] [CrossRef]
  66. Guest, G.; Bunce, A.; Johnson, L. How Many Interviews Are Enough? An Experiment with Data Saturation and Variability. Field Methods 2006, 18, 59–82. [Google Scholar] [CrossRef]
  67. Mutzenich, C.; Durant, S.; Helman, S.; Dalton, P. Updating our understanding of situation awareness in relation to remote operators of autonomous vehicles. Cogn. Res. Princ. Implic. 2021, 6, 9. [Google Scholar] [CrossRef] [PubMed]
  68. Chen, J.C.Y.; Barnes, M.J. Supervisory Control of Multiple Robots: Effects of Imperfect Automation and Individual Differences. Hum. Factors J. Hum. Factors Ergon. Soc. 2012, 54, 157–174. [Google Scholar] [CrossRef]
  69. Wong, J.C. San Francisco Sours on Rampant Delivery Robots: Not Every Innovation Is Great. The Guardian, 10 December 2017. [Google Scholar]
  70. Kettwich, C.; Schrank, A.; Oehl, M. Teleoperation of Highly Automated Vehicles in Public Transport: User-Centered Design of a Human-Machine Interface for Remote-Operation and Its Expert Usability Evaluation. Multimodal Technol. Interact. 2021, 5, 26. [Google Scholar] [CrossRef]
  71. Tener, F.; Lanir, J. Guiding, not Driving: Design and Evaluation of a Command-Based User Interface for Teleoperation of Autonomous Vehicles. arXiv 2025, arXiv:2502.00750. [Google Scholar]
  72. Majstorović, D.; Hoffmann, S.; Pfab, F.; Schimpe, A.; Wolf, M.M.; Diermeyer, F. Survey on teleoperation concepts for automated vehicles. In Proceedings of the 2022 IEEE International Conference on Systems, Man, and Cybernetics (SMC), Prague, Czech Republic, 9–12 October 2022; pp. 1290–1296. [Google Scholar]
  73. Naujoks, F.; Forster, Y.; Wiedemann, K.; Neukum, A. A Human-Machine Interface for Cooperative Highly Automated Driving. In Advances in Human Aspects of Transportation; Stanton, N.A., Landry, S., Di Bucchianico, G., Vallicelli, A., Eds.; Advances in Intelligent Systems and Computing; Springer International Publishing: Cham, Switzerland, 2017; Volume 484, pp. 585–595. [Google Scholar] [CrossRef]
  74. Feiler, J.; Hoffmann, S.; Diermeyer, F. Concept of a Control Center for an Automated Vehicle Fleet. In Proceedings of the 2020 IEEE 23rd International Conference on Intelligent Transportation Systems (ITSC), Rhodes, Greece, 20–23 September 2020; pp. 1–6. [Google Scholar] [CrossRef]
  75. Ikeda, S.; Ito, T.; Sakamoto, M. Discovering the efficient organization structure: Horizontal versus vertical. Artif. Life Robot. 2010, 15, 478–481. [Google Scholar] [CrossRef]
  76. Lee, S. The myth of the flat start-up: Reconsidering the organizational structure of start-ups. Strateg. Manag. J. 2022, 43, 58–92. [Google Scholar] [CrossRef]
  77. Neumeier, S.; Wintersberger, P.; Frison, A.-K.; Becher, A.; Facchi, C.; Riener, A. Teleoperation: The Holy Grail to Solve Problems of Automated Driving? Sure, but Latency Matters. In Proceedings of the 11th International Conference on Automotive User Interfaces and Interactive Vehicular Applications, Utrecht, The Netherlands, 21–25 September 2019; ACM: New York, NY, USA, 2019; pp. 186–197. [Google Scholar]
  78. Luck, J.P.; McDermott, P.L.; Allender, L.; Russell, D.C. An investigation of real world control of robotic assets under communication latency. In Proceedings of the 1st ACM SIGCHI/SIGART Conference on Human-Robot Interaction, Salt Lake City, UT, USA, 2–3 March 2006; ACM: New York, NY, USA, 2006; pp. 202–209. [Google Scholar]
  79. Chiou, M.; Hawes, N.; Stolkin, R. Mixed-initiative Variable Autonomy for Remotely Operated Mobile Robots. ACM Trans. Hum.-Robot Interact. 2021, 10, 37. [Google Scholar] [CrossRef]
  80. Tener, F.; Lanir, J. Investigating intervention road scenarios for teleoperation of autonomous vehicles. Multimed. Tools Appl. 2024, 83, 61103–61119. [Google Scholar] [CrossRef]
  81. Mortimer, M.; Horan, B.; Seyedmahmoudian, M. Building a Relationship between Robot Characteristics and Teleoperation User Interfaces. Sensors 2017, 17, 587. [Google Scholar] [CrossRef]
  82. Nielsen, J.; Molich, R. Heuristic evaluation of user interfaces. In Proceedings of the SIGCHI Conference on Human Factors in Computing Systems Empowering People—CHI ’90, Seattle, WA, USA, 1–5 April 1990; ACM Press: New York, NY, USA, 1990; pp. 249–256. [Google Scholar]
  83. Shneiderman, B.; Plaisant, C. Designing the User Interface: Strategies for Effective Human-Computer Interaction, 4th ed.; Pearson/Addison-Wesley: Boston, MA, USA, 2005. [Google Scholar]
  84. Aven, T. Risk assessment and risk management: Review of recent advances on their foundation. Eur. J. Oper. Res. 2016, 253, 1–13. [Google Scholar] [CrossRef]
  85. Pillay, M. The Utility of M-31000 for Managing Health and Safety Risks: A Pilot Investigation. In Occupational Health and Safety —A Multi-Regional Perspective; Pillay, M., Tuck, M., Eds.; InTech: Houston, TX, USA, 2018. [Google Scholar] [CrossRef]
  86. ISO 31000:2018; Risk Management—Guidelines. International Organization for Standardization: Geneva, Switzerland, 2018. Available online: https://www.iso.org/standard/65694.html (accessed on 16 June 2025).
  87. Ng, Y.J.; Yeo, M.S.K.; Ng, Q.B.; Budig, M.; Muthugala, M.A.V.J.; Samarakoon, S.M.B.P.; Mohan, R.E. Application of an adapted FMEA framework for robot-inclusivity of built environments. Sci. Rep. 2022, 12, 3408. [Google Scholar] [CrossRef]
  88. Fuentes-Bargues, J.L.; Bastante-Ceca, M.J.; Ferrer-Gisbert, P.S.; González-Cruz, M.C. Study of Major-Accident Risk Assessment Techniques in the Environmental Impact Assessment Process. Sustainability 2020, 12, 5770. [Google Scholar] [CrossRef]
  89. Sweller, J. Cognitive Load Theory. In Psychology of Learning and Motivation; Elsevier: Amsterdam, The Netherlands, 2011; Volume 55, pp. 37–76. [Google Scholar] [CrossRef]
  90. Pignatiello, G.A.; Martin, R.J.; Hickman, R.L. Decision fatigue: A conceptual analysis. J. Health Psychol. 2020, 25, 123–135. [Google Scholar] [CrossRef]
  91. Geng, Y.; Zhang, D.; Li, P.; Akcin, O.; Tang, A.; Chinchali, S.P. Decentralized Sharing and Valuation of Fleet Robotic Data. In Proceedings of the 5th Conference on Robot Learning, London, UK, 8–11 November 2021; Volume 164, pp. 1795–1800. Available online: https://proceedings.mlr.press/v164/geng22a.html (accessed on 23 May 2025).
  92. Grimberg, E. Acceptability of Autonomous Vehicles: Does Contextual Consideration Make a Difference? Ph.D. Thesis, The University of Queensland, St Lucia, Australia, 23 September 2022. [Google Scholar]
  93. Goodrich, M.A.; Schultz, A.C. Human-Robot Interaction: A Survey. Found. Trends® Hum.-Comput. Interact. 2007, 1, 203–275. [Google Scholar] [CrossRef]
  94. Boin, A.; Hart, P. Organising for Effective Emergency Management: Lessons from Research. Aust. J. Public Adm. 2010, 69, 357–371. [Google Scholar] [CrossRef]
  95. Hao, X.; Demir, E.; Eyers, D. Exploring collaborative decision-making: A quasi-experimental study of human and Generative AI interaction. Technol. Soc. 2024, 78, 102662. [Google Scholar] [CrossRef]
  96. Gehrke, S.R.; Phair, C.D.; Russo, B.J.; Smaglik, E.J. Observed sidewalk autonomous delivery robot interactions with pedestrians and bicyclists. Transp. Res. Interdiscip. Perspect. 2023, 18, 100789. [Google Scholar] [CrossRef]
Figure 1. Postmates (Serve robotics) Delivery Robot (Reprinted with permission from 10.Robot.Postmates.WDC.25October2017. 2017, Elvert Barnes via https://flic.kr/p/CGiBYu, accessed on 30 April 2025).
Figure 2. Various roles involved in the operation of LMDRs: (a) Monitoring; (b) Tele assistance; (c) Tele driving; (d) Field team.
Figure 3. A schematic diagram depicting a ROC in which LMDRs are matched with ROs.
Figure 4. Key intervention scenarios and sub-cases requiring teleoperation for LMDRs.
Table 1. Participant List.
Participant | Company Products/Services | Role | Deployment Stage/Fleet Size [62]
P1 | European project to pilot and validate a fully autonomous last-mile logistics system | PhD candidate/Project manager | Experimental pilot (few experimental robots)
P2 | Korean AI-powered outdoor robot company | Managing Director | Startup–Early stages (unfunded)
P3 | Software company that offers teleoperation safety solutions and remote operations for logistics vehicles | Ex Co-Founder & COO | Startup–Series B, Soonicorn
P4 | Software company that offers teleoperation safety solutions and remote operations for logistics vehicles | Chief Product Officer | Startup–Series B, Soonicorn
P5 | Technology company that develops AVs and robots tailored for delivery uses | Head of Marketing and Public Relations | NA *
P6 | Autonomous delivery robots | Business development | Startup–Series A
P7 | A company that develops teleoperation software for AVs | Chief Operating Officer (Remote operation) | Startup–Series A
P8 | AI navigation for delivery robots | Co-Founder and CEO | Startup–Series B
P9 | AI-driven drone and robot operating system | Co-Founder & CXO | Startup–Series B
P10 | Interactive urban service robots | Operation manager | Startup–Seed
P11 | Solutions for autonomous applications | VP Business Development | Startup–Series B
P12 | Design and innovation consultancy which was responsible for designing a robotic delivery vehicle | President/Principal designer | Represents a Series C company, one of the top 5 in the category [63]
P13 | Robotics software development platform | CEO | Startup–Series A
P14 | A company that develops LMDRs | Head of Fleet Quality | Startup–Series A (fleet size: >1000 robots, Soonicorn)
P15 | Innovation and investment arm of a large automotive company dealing with AVs | Open Innovation Manager | Open innovation center of an international motor group
* NA—Not available.
Table 2. Various UI components that should appear in the teleoperation UI.
# | UI Component | Purpose
1 | Notifications/alerts | ROs can see various notification types, such as intervention reason-related, latency-related, mission-related, object identification-related, maintenance-related, etc. (P9, P10, P13). Examples of alerts include “human alert,” “dynamic obstacle alert,” etc.
2 | Battery level status | The RO should understand how much energy is left (P14).
3 | Network quality status | At any point, the RO should know the network quality (P3, P9).
4 | Speed | Shows the speed of the LMDR to alleviate the difficulty of physical disconnect from the LMDR (P14).
5 | Mission duration | Since LMDRs deliver goods from origin to destination, the RO needs to know how long the delivery takes from the starting point to the destination (P14).
6 | Control ownership | A status that shows whether the LMDR is in autonomous or manual mode (P9).
7 | On-screen AR layers | Can show the RO possible routes for overtaking an obstacle (P7), or display the LMDR’s width or the distance to a nearby object (P4).
8 | Status of physical components | ROs might want to know and control the status of various robot hardware components, such as the trunk (P13), lights, flag-mast height, emergency lights, etc. (P14).