Social robot navigation is a topic of major interest in the field of autonomous robotics. Robots in scenarios with humans, such as care facilities, must behave in a socially acceptable way, i.e., they must plan paths and navigate according to social rules; e.g., robots should avoid getting too close to people or disturbing people who are not willing to interact with them [24]. This section describes two experimental scenarios for evaluating the proposed CPS-AAL. First, the problem of socially accepted path planning is stated. Next, the use cases in which the CPS-AAL is evaluated are defined, and the social navigation framework used in the CPS is described. Finally, the results are presented and discussed.
4.1. Problem Statement
Traditionally, when a robot navigates in real environments, most algorithms in the literature have treated all obstacles as equally relevant, including people. This reasoning is not valid for a social robot, which must be able to navigate similarly to humans. This requirement implies, among other constraints, accounting for people's comfort when the robot moves near them. In our opinion, two references can offer the context of the problem to the reader [25,26], and more recently [27]. These works describe the specific problem and also the solutions provided by other authors. An interesting approach is the definition of a social map, depicted in [28,29], which extends the concept of the metric and semantic map to include spaces where the robot can navigate without disturbing people.
Consider the two examples described in Figure 8, where the robot must travel from its initial position to a final one. The robot must avoid moving too near people or crossing between people who are interacting with each other (Figure 8a). It should also avoid traversing between people who are interacting with an object (Figure 8b). Consequently, the robot should plan its path according to these constraints. From this perspective arises the need to model a personal space that should be included in the path-planning process to achieve acceptable robot behavior during navigation. Consequently, this article is inspired by the notion of social mapping described in [28]. This map is built in the digital twin model from data acquired from the physical world. To this end, the social navigation framework presented in this work requires the use of physical devices and social behavior models. In our case, these models are based on the theory of proxemics in human relationships and on models of the use of everyday objects [30].
To plan a socially accepted path, as follows from the above, a cyber-world is required that is capable of extracting the positions of people and objects, detecting changes in those positions (tracking objects and people) and, of course, knowing the robot's pose in the physical world at any time. This is not a simple problem: it requires an architecture capable of exchanging and processing information in real time between the different agents, consistency in the data, and the use of multiple sources. It would be impossible to carry out this social navigation using only the robot's sensors, which is why the use of the CPS-AAL is particularly relevant.
4.2. Use-Case Definition
The article presents two use cases in the scenario shown in Figure 9. It consists of a partial view of a caregiving center with two main rooms, a physical therapy room and an occupational therapy room. Additionally, the scenario includes a corridor and a toilet. The SAR and the devices deployed in the CPS-AAL are also shown in the figure. The sensors have been distributed in the physical world according to the following criteria: (1) most of the space must be visible to the RGBD cameras (except for the bathroom, where there is only one camera at the entrance); (2) all rooms must allow human interaction with the CPS-AAL, either through microphones/speakers or touch screens; (3) each room must have temperature-humidity and CO₂ sensors; and (4) the number of installed devices must be the optimal one that meets the above criteria. It is also important to note that the RGBD camera network has been calibrated according to the method described in [31].
The robot has been designed to provide physical and cognitive support to older adults and to help caregivers with their tasks. In particular, it communicates with users through a touch screen and through speakers and microphones for speech synthesis and recognition, respectively. The touch screen, in addition to offering a choice of services, presents physical and cognitive therapies that the elderly can perform in collaboration with the robot. Users can communicate with the robot directly or through the array of microphones deployed in the scenario. To this end, the robot can recognize specific keywords and manage the conversation through a dialogue manager agent.
The cyber-world, in addition to the digital twin version of the physical world, includes all the models and information necessary for the correct development of the activities in the caregiving center. Among the models used in the use cases are those related to social navigation and the construction of the social map of the robot’s environment.
The first use case is described in Table 2 and Figure 10a. In this test, the robot acts as an assistant that warns the users (older adults) that the therapy is over. To perform the simulation, a senior is placed in the occupational therapy room right in front of the television (i.e., the television plays a sequence of movements that the older person repeats). When the therapy is over, the robot navigates from its initial position to a position near the older person. Although warning the elderly could be done with any other device, such as a smartphone or smartwatch, which could be effortlessly integrated into the proposed CPS-AAL, it was decided that the SAR would alert the older adult. The reason is to show the system's ability to adapt the SAR's path to social conventions, since it coincides with the caregiver's protocol: go and warn the user that the therapy is over. This situation produces a short verbal interaction between the human and the caregiver, or the SAR in this case, which reveals how the therapy went and generates a higher degree of adherence and motivation.
The second use case is shown in Table 3 and Figure 10b. In this second test, the robot acts as a virtual physical therapist that navigates to the user and proposes a physical activity. To achieve this, the robot navigates from a starting position to the older person's position. Once there, the robot begins an interaction with the senior and later presents, on its touch screen, a physical therapy that the person must imitate.
In both use cases, the entire CPS-AAL works together toward the same goal: the agents for detecting and tracking people and objects, the human-robot interaction agents, the caregiving center management agent, which is responsible for managing, among other functions, the center's schedule of activities, and finally the social navigation agent.
4.4. Experimental Results and Discussion
The evaluation of the CPS-AAL for the robot's social navigation in caregiving scenarios requires the correct performance of all the architecture's agents. Both the detection of people in the scenario and the detection of changes in the objects' positions are carried out by software agents that use information from the RGBD cameras, which requires that the visual field of the camera network distributed throughout the environment cover most of the scenario [22].
Figure 12 shows images acquired by the camera network deployed in the caregiving center at different instants of time. As the figure shows, there is minimal overlap between cameras, which is needed for monitoring people and the SAR during the activities. The CPS-AAL keeps its information up to date based on the analysis of data provided by the physical world, which has its virtual representation in the digital twin model.
The digital twin model of the physical world for both scenarios is shown in Figure 13. In both cases, the experimental environment consists of four rooms (i.e., toilet, corridor, and occupational and physical therapy rooms) connected to each other according to the design of the caregiving center. The nodes' attributes include not only their geometrical dimensions but also environmental parameters, such as temperature, CO₂ level, or humidity. Depending on the use case, these four nodes are connected to other nodes associated with people and objects through the in edge. Furthermore, people have personal spaces, and each object in a room has its associated affordance space. If a person is interacting with an interactive object, an edge is also drawn in the graph. The same kind of edge is drawn if two people are interacting with each other.
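As an illustration, the room/person/object structure described above can be sketched as a small attributed graph. All node names, attribute names, and values below are hypothetical and much simpler than the actual DSR representation used by CORTEX; the sketch only shows the kind of relations (rooms with environmental attributes, in edges, and interaction edges) that the digital twin maintains.

```python
# Minimal sketch of the digital twin graph described above.
# Node and attribute names are illustrative, not the actual DSR schema.
nodes = {
    "occupational_therapy": {"type": "room", "temp_C": 22.5, "co2_ppm": 610, "humidity_pct": 45},
    "corridor": {"type": "room", "temp_C": 21.0, "co2_ppm": 540, "humidity_pct": 44},
    "person_1": {"type": "person", "pose": (3.2, 1.5, 0.0)},
    "tv_1": {"type": "object", "interactive": True, "pose": (4.0, 0.5, 1.57)},
}

edges = [
    ("person_1", "occupational_therapy", "in"),  # person located in a room
    ("tv_1", "occupational_therapy", "in"),      # object located in a room
    ("person_1", "tv_1", "interacting"),         # activates the TV's affordance space
]

def people_interacting_with_objects(nodes, edges):
    """Return (person, object) pairs whose interaction activates an affordance space."""
    return [(a, b) for a, b, label in edges
            if label == "interacting"
            and nodes[a]["type"] == "person"
            and nodes[b]["type"] == "object"]
```

A query such as `people_interacting_with_objects` is the kind of check the navigation agent needs before deciding which affordance spaces penalize the free-space graph.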
To validate the social navigation of the SAR in each use case, a methodology similar to that proposed in [36,37,38] has been followed; these works established a set of metrics to evaluate the navigation of a robot in human environments: (1) average minimum distance to a human during navigation, $\bar{d}_{min}$; (2) distance traveled, $d_t$; (3) navigation time, $\tau$; (4) cumulative heading changes, CHC; and (5) personal space intrusions, $\Psi$. A brief description of these metrics is provided below:
Average distance to the closest human during navigation: the average distance from the robot pose, $x_i$, to the closest human along the robot's path, $\bar{d}_{min} = \frac{1}{N} \sum_{i=1}^{N} \min_{j} \lVert x_i - p_j \rVert$, where $N$ is the number of points of the path planned by the agent and $p_j$ is the position of the $j$-th human.
Distance traveled: length of the path planned by the navigation framework, in meters.
Navigation time: time elapsed from the instant the robot starts the navigation, $t_0$, until it arrives at the target, $t_f$, i.e., $\tau = t_f - t_0$.
Cumulative Heading Changes (CHC): a measure that accumulates the heading changes of the robot during navigation [38]. Angles are normalized between $-\pi$ and $\pi$.
Personal space intrusions ($\Psi_k$): in this paper, four different areas are defined following proxemics theory: Intimate ($d < 0.45$ m); Personal ($0.45 \leq d < 1.2$ m); Social ($1.2 \leq d < 3.6$ m); and Public ($d \geq 3.6$ m). This metric measures the percentage of the time spent in each area along the robot's path as:
$$\Psi_k = \frac{100}{N} \sum_{i=1}^{N} \mathbb{1}_k(d_i),$$
where $k$ defines the distance range for classification (intimate, personal, social, and public), $d_i$ is the distance from the robot to the closest person at point $i$ of the path, and $\mathbb{1}_k$ is the indicator function.
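The five metrics above can be computed directly from a discretized robot path and the people's positions. The following sketch is an illustration under stated assumptions: a constant sampling period `dt`, fixed human positions, and proxemics thresholds taken from the zone definitions above; the function name and signature are hypothetical, not part of the evaluated framework.

```python
import math

def navigation_metrics(path, humans, dt=0.1):
    """Compute the five social navigation metrics for a discretized path.

    path   : list of (x, y, theta) robot poses, sampled every `dt` seconds
    humans : list of (x, y) positions of the people in the scenario
    """
    n = len(path)
    # (1) average distance to the closest human at each path point
    d_closest = [min(math.hypot(px - hx, py - hy) for hx, hy in humans)
                 for px, py, _ in path]
    d_min_avg = sum(d_closest) / n
    # (2) distance traveled, in meters
    d_t = sum(math.hypot(path[i + 1][0] - path[i][0], path[i + 1][1] - path[i][1])
              for i in range(n - 1))
    # (3) navigation time (constant sampling period assumed)
    tau = (n - 1) * dt
    # (4) cumulative heading changes, angles normalized to (-pi, pi]
    def norm(a):
        return math.atan2(math.sin(a), math.cos(a))
    chc = sum(abs(norm(path[i + 1][2] - path[i][2])) for i in range(n - 1))
    # (5) personal space intrusions: % of path points per proxemic zone
    zones = {"intimate": (0.0, 0.45), "personal": (0.45, 1.2),
             "social": (1.2, 3.6), "public": (3.6, float("inf"))}
    psi = {k: 100.0 * sum(lo <= d < hi for d in d_closest) / n
           for k, (lo, hi) in zones.items()}
    return d_min_avg, d_t, tau, chc, psi
```

For a straight two-meter path passing no closer than 5 m to a single person, the sketch reports zero heading changes and a 100% public-zone occupancy, matching the intent of the definitions above.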
Figure 14 describes the first use case. Figure 14a depicts a 3D view of the scenario with the older adult and the caregiver in the occupational therapy room. Figure 14b illustrates the social interaction spaces of the different agents in the scenario. These social interaction spaces, defined through models in the digital twin world, modify the free-space graph used to plan the path. People add an asymmetric Gaussian-shaped space with different weights depending on whether it is intimate, personal, social, or public space, penalizing the robot's path through these nodes of the graph [30]. Similarly, objects in the environment generate interaction spaces if the caregiving center's users are interacting with them. Thus, the route planned by the robot takes all these values into account, and the navigation agent builds a social path to the target pose, in this case in the occupational therapy room, to communicate the end of the therapy. The route planned by the robot is shown in Figure 14c. This path avoids crossing close to the people in the room, staying as far from them as possible while minimizing the distance traveled. The time it takes for the robot to reach its target increases considerably compared to a classic planner without social behavior, but in return it does not disturb people while they are performing their therapy (see Table 4). The final robot pose is drawn in Figure 14d (readers can watch a video of this use case at https://youtu.be/hJYLT661TqU; the video also shows images acquired from the RGBD camera network). At this point, the robot is close enough to the older person to be heard, and the interaction can begin. The results of this first use case are shown in Table 4, where the metrics for the path planned by a classical Dijkstra planner without social behavior are also detailed. First, as expected, the path planned by the robot without social behavior covers a shorter distance in less time. However, the distances to the people in the room are very small, which can bother the caregiving center's users. The same situation can be observed in the personal space intrusion values, which indicate that the robot invades their personal space. In the case of social navigation, thanks to the CPS-AAL, the robot can plan a socially accepted path, which allows it to reach the target position without bothering anyone, as shown by the intrusion values in Table 4, equal to zero in all cases except for the public area.
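The asymmetric Gaussian personal space mentioned above can be sketched as follows. The function name and parameter values are illustrative assumptions, loosely following the common formulation in the social navigation literature: the cost peaks at the person's position and decays more slowly in the direction the person faces than behind them.

```python
import math

def personal_space_cost(px, py, person_x, person_y, person_theta,
                        sigma_front=2.0, sigma_side=4/3, sigma_back=1.0):
    """Asymmetric 2D Gaussian cost around a person: highest at the person,
    decaying faster behind them than in front. Sigmas are illustrative."""
    dx, dy = px - person_x, py - person_y
    # Angle of the query point relative to the person's heading
    angle = math.atan2(dy, dx) - person_theta
    # Use the larger front variance in the heading half-plane, the back one otherwise
    sigma_x = sigma_front if math.cos(angle) >= 0 else sigma_back
    # Rotate the offset into the person's local frame
    lx = dx * math.cos(person_theta) + dy * math.sin(person_theta)
    ly = -dx * math.sin(person_theta) + dy * math.cos(person_theta)
    return math.exp(-(lx**2 / (2 * sigma_x**2) + ly**2 / (2 * sigma_side**2)))
```

In a planner, this cost is added as a weight to the free-space graph nodes around each person, so that paths crossing the intimate or personal zones become expensive while distant detours stay cheap.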
Figure 15 describes the second use case. In this scenario, two people are interacting with each other, and the robot should avoid passing near them as it moves to the physical therapy room (see Figure 15a). The social interaction spaces are shown in Figure 15b. As in the previous use case, the models of the digital twin world, the affordance spaces for objects, and the asymmetric Gaussian spaces for people modify the free-space graph. The planned route is shown in Figure 15c. In this case, the robot searches for the optimal path that respects the social norms until it reaches the final position, where the interaction with the older adult begins (Figure 15d; a video of this second use case can be found at https://youtu.be/Npb-kfNRLpo).
Table 5 shows the set of metrics obtained with the social navigation framework running within the proposed CPS-AAL. These metrics are compared against a classical Dijkstra path-planning algorithm without social behavior. As in the first test, the results show that the robot's social behavior requires a longer path, and therefore more time to execute it. However, this social behavior prevents the robot from navigating near people, as the distance and personal space intrusion values show.
As a summary of the experiments, it can be concluded that the SAR exhibits notable advantages in its social navigation behavior, avoiding navigating near people (caregivers or older adults) and invading areas where people interact with objects during therapy. All this would be much more complicated without a system that works in a coordinated way and integrates the physical world with specific models and agents that support the whole system. In the solution presented in this work, the cyber-world, built from the digital twin model with a shared working memory, the DSR, and the CORTEX architecture, facilitates the coordinated work of the agents and reduces the complexity of the problems. Finally, the metrics used in this work facilitate the comparison of the proposed approach with other similar works in the literature. The social navigation framework can be effortlessly adapted to changes and modifications, since the essential feature of the complete system is the integration of the two worlds, the physical and the cyber-world, and the architecture presented here meets the desired criteria, including being modular and easily scalable.