Social and Robust Navigation for Indoor Robots Based on Object Semantic Grid and Topological Map

Abstract: For the indoor navigation of service robots, human–robot interaction and adaptation to the environment still need to be strengthened, including determining the navigation goal socially, improving the success rate of passing doors, and optimizing path planning efficiency. This paper proposes an indoor navigation system based on an object semantic grid and topological map to address these problems. First, natural language is used as the human–robot interaction form, from which the target room, object, and spatial relationship are extracted by speech recognition and word segmentation. Then, the robot selects the goal point from the target space according to object affordance theory. To improve the navigation success rate and safety, we generate auxiliary navigation points on both sides of each door to correct the robot trajectory. Furthermore, based on the topological map and auxiliary navigation points, the global path is segmented into each topological area, and path planning is carried out separately in every room, which significantly improves navigation efficiency. The system has been demonstrated to support autonomous navigation based on language interaction and to significantly improve the safety, efficiency, and robustness of indoor robot navigation. It has been successfully tested in real domestic environments.


Introduction
Robotics, as a frontier field of technology, has profoundly affected all aspects of society. Various types of indoor service robots work in human life, such as nanny robots [1] that help disabled people or patients in their daily lives, medical robots [2] that assist the elderly in cognitive activities, guide robots [3] that lead tourists or clients, and security robots [4] that maintain security. To enable indoor robots to complete different tasks and move autonomously, the navigation system must provide high safety, robustness, and natural human–robot interaction. Although robot navigation systems have developed greatly, two critical problems remain: the simple and socially aware selection of the navigation goal, and efficient, safe path planning.
Selection of navigation goal: The shortcomings of navigation goal selection for indoor robots are operational complexity and goal points that violate social norms. For example, the user is required to click the screen to choose a goal point in the robot operating system (ROS) [5], or to set the target point in advance during mapping [6]. Both of these are contrary to the human preference for interacting with robots in natural language, because ordinary users do not have theoretical knowledge related to robotics. In this context, Landsiedel et al. [7] pointed out that natural language is arguably the most common and natural communication channel used and understood by humans, and dialogue systems are an integral part of modern approaches to reducing the user's operating requirements and enhancing the user's interactive experience. Based on this, Tellex et al. [8] created a database for the robot's behavior. On the other hand, people have tried to integrate grid maps and topological maps in mobile robot navigation. Thrun et al. [31] integrated both maps, partitioned the topological maps into coherent regions, and gained better navigation efficiency. Kuipers et al. [32] combined the details of metrical maps in small-scale space with the conciseness of topological maps in large-scale space. Both of the above methods exploited the advantages of topological maps in complex and large environments. Then Martin et al. [33] used AI planning to carry out topological navigation in a complex indoor environment, performing a sequence of navigation actions. It divided the navigation tasks into phases, but it ignored the grid maps and spent a lot of time defining navigation action libraries.
This paper proposes an indoor robot navigation system with an object semantic grid and topological map, which enables robot navigation to select goals automatically and become safer, more robust, and more efficient. First, based on the team's previous work [34,35], the object semantic grid and topological map are built for the indoor environment. The map contains all objects' semantic information, their occupied areas, and the semantic information of their surrounding areas. Moreover, it contains the coordinates of all door endpoints and the room dominant directions. Then, natural language is used for human–robot interaction. The robot extracts the target room, object, and spatial relationship from the voice instructions and selects the navigation goal from the corresponding goal space. Therefore, navigation goal selection becomes more convenient and social. Next, to improve the success rate and safety of navigation, we use the coordinates of the door endpoints and the dominant direction to generate auxiliary navigation points (ANPs) on both sides of each door. The ANPs are used to correct the navigation trajectory, so that the robot crosses the door perpendicularly. In particular, the generation of ANPs is offline. Finally, based on the ANPs and the semantic topological map, global path planning is separated into each topological area and conducted respectively. This provides the correct heuristic direction for global path planning and greatly improves efficiency. Moreover, we verified our system in two real environments and on three types of robots. The results show that the system is efficient, robust, and safe.
In summary, the contributions of our work are as follows:
• We propose an indoor robot navigation system based on the object semantic grid and topological map.
• We use natural language as the human–robot interaction form and realize the autonomous and social selection of the navigation goal point.
• We generate auxiliary navigation points on both sides of the door to improve the success rate of crossing doors and of navigation overall.
• We separate the global path planning into each topological area based on the ANPs and topological map, which improves the efficiency of the navigation system.
• We verify the proposed system in real office and home environments.
The remainder of this paper is organized as follows. The system overview is introduced in Section 2. The object semantic grid and topological map is introduced in Section 3. The automatic selection of navigation goal is detailed in Section 4. Section 5 presents the generation of ANPs and the algorithm of segmented global path planning. Section 6 presents the experimental results and discussion. Finally, Section 7 concludes this paper.

System Overview
The schematic representation of the proposed system is shown in Figure 1. It can be divided into two major modules: (1) automatic selection of the navigation goal; (2) segmented global path planning. The first module selects a navigation goal based on human–robot speech interaction and affordance theory. The second module generates ANPs and segments the global path planning into each topological area. First, the object semantic grid map and semantic topological map are built for the indoor environment. The semantic topological map provides the rooms' semantic and topological information. The object semantic grid map contains the objects' semantic information, occupied space, and the semantic information of the object surrounding space (affordance and nonaffordance space). Then, using the user's natural language as input, the system runs speech recognition and word segmentation to obtain the navigation goal's room, object, and spatial relationship (front or beside). By querying the object semantic grid map and semantic topological map, the location of the target room and object can be determined. Next, according to the target spatial relationship, the robot chooses the affordance or nonaffordance space as the goal space and selects a point from it as the navigation goal point. The orientation of the navigation goal points to the coordinate of the target object. Afterward, to improve the navigation success rate and decrease the effects of sensor noise, localization error, and motion error, we use the coordinates of the door endpoints and the dominant direction of the room, which are generated in the process of building the topological map, to generate auxiliary navigation points (ANPs) on both sides of each door. The ANPs are used to correct the robot's trajectory through the door. Finally, to improve the efficiency of global path planning, we carry out topological path planning on the topological map and obtain the global topological path.
We segment the global path into each topological area and run path planning respectively by using ANPs as the end of every part of the global path.
In addition, considering that local path planning is a well-studied problem, we directly adopt the dynamic window approach algorithm [36] to create the kinematic trajectory and focus on the automatic selection of the navigation goal and segmented global path planning.

Object Semantic Grid and Topological Map
To select navigation goal automatically and perform segmented global path planning, it is essential to add semantic information to the traditional occupied grid map. Based on our previous work [34,35], object semantic grid map and topological map are built for indoor environment, which store: (1) object semantic information, (2) object occupied space, (3) object goal space (semantic information of object surrounding space), and (4) semantic topological information of room.
First, the object semantic grid map is built. Taking the alma scene [37] as an example, we create point clouds with semantic information to represent the types of objects (point clouds with different colors in Figure 2a denote different types of objects). Then, we calculate the minimum bounding rectangle of every point cloud (shown as the blue rectangles in Figure 2a) to represent the object occupied space. Next, to represent the object goal space, the object surrounding space is divided into affordance space (the space related to human activities) and nonaffordance space (the space irrelevant to human activities) according to the definition of [14], as shown in Figure 2b. The spatial relationship between the object goal space and the object occupied space is shown in Figure 2c. Finally, we create the semantic topological information, as shown in Figure 2d, by connected domain analysis and Bayesian analysis. In this process, the room's dominant direction and the coordinates of the door endpoints are also extracted and saved in the map.
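The per-object occupied space described above can be sketched as follows. This is a minimal illustration, not the paper's implementation: it uses an axis-aligned bounding box as a simplification of the minimum bounding rectangle, and the function name and data layout are hypothetical.

```python
from collections import defaultdict

def object_occupied_spaces(labeled_points):
    """Group semantically labeled 2D points by object label and return
    a bounding rectangle per object (axis-aligned simplification).
    labeled_points: iterable of (label, x, y) tuples."""
    clusters = defaultdict(list)
    for label, x, y in labeled_points:
        clusters[label].append((x, y))
    rects = {}
    for label, pts in clusters.items():
        xs = [p[0] for p in pts]
        ys = [p[1] for p in pts]
        # (x_min, y_min, x_max, y_max) approximates the occupied space
        rects[label] = (min(xs), min(ys), max(xs), max(ys))
    return rects
```

In a full implementation, a rotated minimum-area rectangle would fit objects that are not axis-aligned with the map more tightly.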

Automatic Selection of Navigation Goal with Speech Interaction
Natural language is the most common and social communication channel, so it is essential to enable human–robot interaction via natural language to reduce the operating requirements for users.
As shown in Figure 1, the automatic selection of navigation goal needs three inputs: natural language, object semantic grid map, and semantic topological map. The algorithm includes speech recognition, word segmentation, keyword extraction, concept query, goal space selection, and goal point selection, as shown in Algorithm A1 in Appendix A.
Before the navigation, users issue voice instructions to the robot in the form of natural language, such as "go to the side of the fridge in the kitchen", which includes room, object, and location information. First, the language instruction is converted into text through speech recognition, and the text is broken down into phrases through the Chinese word segmentation module. For example, by applying the Jieba algorithm [12], the sentence "go to the side of the fridge in the kitchen" is broken into "go to the/side/of/fridge/in the/kitchen", as shown in Figure 3a. Since the purpose of keyword extraction is to extract phrases related to the navigation goal, we build a database of phrases and their corresponding navigation behavior. Then we query the database for each phrase extracted from the voice instruction, so as to determine the target room, target object, and target spatial relationship. For example, in the case of Figure 3a, the target room is the kitchen, the target object is the fridge, and the target spatial relationship is the side. To confirm whether the navigation goal exists in the indoor environment, we carry out a concept query in the map, that is, we look for the target room and target object in the room lists and object lists saved in the object semantic grid and topological map, as shown in Figure 3b. If the target exists, the navigation path will be planned. We visualize the target room and target object in the map, as shown in Figure 3c.
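The keyword-extraction step above can be sketched as a lookup of each segmented phrase in a phrase database. The `PHRASE_DB` contents and function name below are hypothetical illustrations, not the paper's actual database.

```python
# Hypothetical phrase database mapping segmented phrases to
# navigation slots: target room, target object, spatial relation.
PHRASE_DB = {
    "kitchen": ("room", "kitchen"),
    "bathroom": ("room", "bathroom"),
    "fridge": ("object", "fridge"),
    "sink": ("object", "sink"),
    "front": ("relation", "front"),
    "side": ("relation", "side"),
}

def extract_goal(phrases):
    """Fill the (room, object, relation) slots by querying each phrase."""
    goal = {"room": None, "object": None, "relation": None}
    for phrase in phrases:
        hit = PHRASE_DB.get(phrase)
        if hit:
            slot, value = hit
            goal[slot] = value
    return goal
```

For the segmented example above, `extract_goal("go to the/side/of/fridge/in the/kitchen".split("/"))` fills all three slots with kitchen, fridge, and side; phrases not in the database are simply ignored.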
To make the selection of robot target points more in line with social norms, we select the goal space from the object surrounding area according to the target spatial relationship. Based on [14], we divide the object surrounding area into affordance space and nonaffordance space. The affordance space refers to the area related to human activities (the yellow area in front of the object occupied area in Figure 3d), and the nonaffordance space refers to the area irrelevant to human activities (the red area diagonal to the object occupied area in Figure 3d). In this example, the nonaffordance space is determined as the goal space because the target spatial relationship is the side.
Moreover, it is necessary to filter the candidate points in the target area. The selection criteria are as follows. (1) Hard constraint: the target point must not be in the inflation area of the cost map. (2) Soft constraint: the target point should avoid other objects' affordance spaces. The hard constraint is inviolable, while the soft constraint should be satisfied as far as possible. Thus, the navigation goal point (indicated by the black point in Figure 3e) is selected from the nonaffordance space, and its orientation (indicated by the green arrow in Figure 3f) points to the coordinate of the target object.
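The two-level constraint filtering can be sketched as below. This is a hedged illustration in which the candidate points and the two predicate functions are assumed inputs; it is not the paper's implementation.

```python
def select_goal_point(candidates, in_inflation, in_other_affordance):
    """Pick a goal point from the chosen goal space.
    candidates: iterable of (x, y) points in the goal space.
    in_inflation(p): True if p lies in the costmap inflation area (hard).
    in_other_affordance(p): True if p lies in another object's
    affordance space (soft)."""
    # Hard constraint: discard points inside the inflation area.
    feasible = [p for p in candidates if not in_inflation(p)]
    if not feasible:
        return None  # the hard constraint is inviolable
    # Soft constraint: prefer points outside other objects' affordance
    # spaces, but fall back to any feasible point if none remain.
    preferred = [p for p in feasible if not in_other_affordance(p)]
    return (preferred or feasible)[0]
```

The asymmetry between the two list comprehensions mirrors the text: violating the hard constraint disqualifies a point outright, while the soft constraint only reorders preference.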

Segmented Global Path Planning
In indoor navigation, doors have special significance. The door area connects different topological areas and is unavoidable on the robot's navigation path, so it is natural to use the door area as intermediate guidance in the indoor navigation algorithm. On the other hand, the door area is usually a narrow channel with a corner, making it a high-risk area where robot navigation fails. Therefore, this paper proposes a segmented global path planning method with the topological map and auxiliary navigation points (ANPs) to overcome these two problems.

Door and Auxiliary Navigation Point
The door area is a high-risk area for navigation failure. First, the feasible path can be blocked by spurious obstacles (shown as the dots in the red circle in Figure 4a) caused by sensor noise and calibration error, so that the robot cannot correctly perceive the environment. Second, the shortest path through the door is usually a curved trajectory (the red line in Figure 4a), so motor and control errors such as drift or slippage easily lead to a collision between the robot and the door frame. For the first problem, we found that the noise is cleared when the robot stands directly in front of the door to perceive the environment (as shown in Figure 4b), so that a correct path can be planned. For the second problem, controlling the robot to cross the door along the centerline of the door area ensures the maximum distance between the robot and the door frame and therefore the maximum safety. Therefore, we propose to use the dominant direction of the room and the coordinates of the door endpoints to generate ANPs on both sides of the door and to modify the robot's path to pass through the ANPs. This greatly improves the safety and success rate of the robot crossing doors. The specific algorithm is described in Algorithm A2 in Appendix A.
In the process of building the semantic topological map based on previous work [35], the semantic information of the rooms (Room 1 and Room 2), the dominant direction of the room (L), and the coordinates of the door endpoints (E1 and E2) have been generated, as shown in Figure 5a. Then, we draw the perpendicular line A1A2 through the midpoint O of the line segment E1E2, as shown in Figure 5b. Note that the lengths of OA1 and OA2 are both equal to ΔD, which is related to the door width and the robot size. A1 and A2 are used as the ANPs on both sides of the door. Furthermore, the rooms' semantic information is stored in each ANP, in the form of the room the ANP is in and the room the ANP connects to. Additionally, the orientations of the ANPs are parallel to L but change with the navigation direction.
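The geometric construction of the two ANPs can be sketched as follows. This is a minimal illustration assuming 2D door endpoints in map coordinates; the function name and offset handling are ours, not the paper's Algorithm A2.

```python
import math

def auxiliary_navigation_points(e1, e2, delta_d):
    """Given door endpoints e1, e2 and offset delta_d, return the two
    ANPs A1, A2 on the perpendicular through the door midpoint O."""
    # Midpoint O of the door segment E1E2.
    ox, oy = (e1[0] + e2[0]) / 2.0, (e1[1] + e2[1]) / 2.0
    # Unit normal of the door segment (perpendicular direction).
    dx, dy = e2[0] - e1[0], e2[1] - e1[1]
    length = math.hypot(dx, dy)
    nx, ny = -dy / length, dx / length
    # Offset O by delta_d on both sides of the door.
    a1 = (ox + delta_d * nx, oy + delta_d * ny)
    a2 = (ox - delta_d * nx, oy - delta_d * ny)
    return a1, a2
```

Connecting A1 and A2 yields a crossing segment along the door's centerline, which is what forces the near-perpendicular, centered crossing described above.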

Path Planning Algorithm
The A* algorithm is one of the most widely applied algorithms for global path planning in indoor environments. As a heuristic algorithm, it combines the cost of the path found so far with a heuristic estimate, so each cell n in the grid map is evaluated by the value

f(n) = g(n) + h(n),

where g(n) is the length of the path from the start cell to the current cell through the selected sequence of cells, and h(n) is the heuristic distance from the current cell to the goal. The heuristic provides a search direction pointing to the goal point. However, it does not take the room topological information into account, so there is a serious risk that A* provides the wrong search direction, which leads to large computational time and inefficient navigation, as shown in Figure 6a. Obviously, the map of the indoor environment (Map) is composed of the maps of the different rooms (Map_i):

Map = Map_1 ∪ Map_2 ∪ … ∪ Map_N,

and the robot must pass through a door when it moves from one room to another. Therefore, we propose to separate the global path into each topological area and use the ANPs on both sides of the door as the endpoints of every sub-path, running the path planning respectively, as described in Algorithm A3 in Appendix A. In this context, the heuristic can provide a more accurate search direction.
To obtain the sequence of rooms passed when the robot moves from the start point to the goal point, we carry out topological path planning on the topological map, as shown in Figure 6b. The red dotted line represents the topological path.
Thus, the robot will pass through room B, room D, and room E, which are stored in the room sequence. We infer the corresponding ANP sequence from the room sequence, and the global path is then segmented into five parts. In Figure 6c, the blue dots indicate the ANP sequence, and the red dotted line indicates the global path. Figure 6d-h shows the results of every path planning step and the traversed cells. The number of traversed cells of A* is 30,571, while that of our method is 1694, a decrease of 94.46%. Our method provides a more accurate search direction for path planning in each topological area by using the ANPs, greatly reducing the traversed cells and the computational time.
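The segmented planning idea can be illustrated by running a plain grid A* piecewise between ANP waypoints. This is a simplified sketch (4-connected grid, Manhattan heuristic, hypothetical function names) rather than the paper's Algorithm A3; it also counts expanded cells, the efficiency measure used above.

```python
import heapq

def astar(grid, start, goal):
    """Plain A* on a 4-connected occupancy grid (0 = free, 1 = occupied).
    Returns (path, traversed) where traversed counts expanded cells."""
    rows, cols = len(grid), len(grid[0])
    h = lambda n: abs(n[0] - goal[0]) + abs(n[1] - goal[1])  # Manhattan
    open_set = [(h(start), 0, start)]
    came, g, closed = {}, {start: 0}, set()
    while open_set:
        _, gc, cur = heapq.heappop(open_set)
        if cur in closed:
            continue
        closed.add(cur)
        if cur == goal:  # reconstruct the path back to the start
            path = [cur]
            while cur in came:
                cur = came[cur]
                path.append(cur)
            return path[::-1], len(closed)
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nxt = (cur[0] + dr, cur[1] + dc)
            if 0 <= nxt[0] < rows and 0 <= nxt[1] < cols \
                    and grid[nxt[0]][nxt[1]] == 0 and gc + 1 < g.get(nxt, 1e9):
                g[nxt] = gc + 1
                came[nxt] = cur
                heapq.heappush(open_set, (gc + 1 + h(nxt), gc + 1, nxt))
    return None, len(closed)

def segmented_plan(grid, start, anps, goal):
    """Run A* separately in each segment, using the ANP sequence as
    intermediate endpoints, and concatenate the resulting paths."""
    waypoints = [start] + list(anps) + [goal]
    full_path, total_traversed = [], 0
    for a, b in zip(waypoints, waypoints[1:]):
        path, traversed = astar(grid, a, b)
        if path is None:
            return None, total_traversed + traversed
        full_path.extend(path if not full_path else path[1:])
        total_traversed += traversed
    return full_path, total_traversed
```

Because each segment's heuristic points at a nearby ANP instead of the distant final goal, the search front stays inside the current room, which is the mechanism behind the reduced traversed-cell counts reported above.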

Experimental Platform
For the verification of our system, we selected three experimental platforms of different radii. Figure 7 shows the configurations of these robots. All of them are equipped with a Kobuki base, a Hokuyo UST-10LX 2D lidar sensor, a Kinect V2 RGB-D camera, and a Hasee Z7M-KP7GC laptop with an Intel i7-8750H CPU, a GTX 1050 Ti GPU, and 12 GB of RAM. All of the robots are circular, with maximum radii of 0.18 m, 0.23 m, and 0.28 m respectively, as shown in Figure 7. The navigation algorithm is based on the robot operating system (ROS), which provides various libraries and tools to help create robot applications. The robot pose was estimated using AMCL [38], the ROS implementation of the Monte Carlo localization algorithm [39], which is fed the laser measurement data and odometry data and resamples the particles representing the belief of the robot pose. For the controllers and algorithms that drive the robot, the ROS move_base stack [40] was used, with our segmented global path planner and the DWA local planner. The former segments the global path into each topological area based on the topological map and runs A* path planning in each area respectively. The latter provides the controller that drives the robot in the plane, assessing each local path's cost of traversing the grid cells and determining the linear and angular velocities to send to the robot. Meanwhile, the ROS move_base stack provides the global cost map and local cost map [41]. We configured the 2D costmap with the static layer (the unchanging portion of the costmap generated by SLAM), the obstacle layer (the obstacles created from the sensor data), and the inflation layer (the inflation of the obstacles representing the configuration space of the robot, used to avoid collisions).

Experimental Environment
The experimental environments include a home and an office. The home environment includes one bathroom, two bedrooms, one kitchen, and one living room. All doors are 0.74 m wide. Moreover, according to the results of the semantic segmentation algorithm, there are 11 objects in the home environment. For the convenience of the experiment, we selected five objects as the endpoints of the navigation paths: the sink in the bathroom, the bed in the master bedroom, the bed in the second bedroom, the TV in the living room, and the fridge in the kitchen, as shown in Figure 8a. Figure 8c is the object semantic grid and topological map established for the home environment, which contains the object semantic information, object occupied space, object goal space (affordance and nonaffordance space), semantic topological information of the rooms, the room dominant direction, and the coordinates of the door endpoints. Figure 8e visualizes the door endpoints in the home map, an intermediate product of generating the semantic topological map based on [35].
Similarly, the office environment includes one meeting room, one laboratory, one corridor, and one elevator hall. The door widths in the environment are 0.65-0.96 m. There are 12 objects in the environment, and we selected 5 objects as the endpoints of the navigation paths: the notebook in the meeting room, the desk and microwave in the laboratory, the desk in the elevator hall, and the desk in the corridor, as shown in Figure 8b. Figure 8d shows the object semantic grid and topological map of the office environment. Figure 8f visualizes the door endpoints in the office map (which is cropped and rotated by 90 degrees).

Automatically Selecting Navigation Goal Results
This paper proposes the autonomous and social selection of the navigation goal through natural language interaction and the object semantic grid and topological map. To verify the ability to select goals autonomously by language interaction, the experiment was designed as follows. The operator designated the navigation target by language instructions containing the target room, object, and spatial relationship. Twenty paths were set in the home environment (as shown in Table 1) and 18 paths in the office environment (as shown in Table 2). In particular, both the front space and the side space of every object were set as the goal space at least twice. If the value of cell (i, j) in the table is 1, the voice command is to move from object i to the front of object j, and the robot will select a navigation point from the affordance space of object j. If the value is 2, the voice instruction is to move from object i to the side of object j, and the robot will select a navigation point from the nonaffordance space of object j. In addition, the robot radius was set to 0.18 m, and the path planning algorithm and controller were set to the defaults, so that the experiment focused on autonomous goal selection by language interaction. Only part of the results is illustrated in Table 3 due to space limitations. As shown in the column called "word segmentation", the target information is extracted from the voice instructions (which are originally Chinese and translated into English here). The coordinates of the selected navigation goal point are shown in the column called "target point selection", and all of these pictures were generated by the RViz software during navigation. The red rectangle is the nonaffordance space, the green rectangle is the affordance space, and the blue arrow is the goal pose. The navigation results are shown in the column called "Navigation Result".
The human operator is also shown in the pictures to indicate whether the robot's final position satisfies the requirement of the human-robot coexisting environment.
Multiple conclusions can be extracted from these results:
• The target information, including the target room, object, and spatial relationship, can be extracted accurately through speech recognition [10], word segmentation [12], keyword extraction, and concept query. For example, the command in the first row of Table 3, "go to the front of the sink in the bathroom", can be divided into "go to the/front/of the/sink/in the/bathroom", in which the target room is the bathroom, the target object is the sink, and the target spatial relationship is the front.
• The navigation goal is selected automatically and correctly, and, by qualitative comparison, it conforms well to social norms and object affordance theory. The goal is located in the affordance space if the goal spatial relationship is the front, while it is located in the nonaffordance space if the goal spatial relationship is the side. In addition, the goal location satisfies the hard constraint (not in the inflation area of the cost map) and the soft constraint (avoiding other objects' affordance spaces) described in Section 4.
• The system helps the robot integrate better into the human–robot coexisting environment, according to the navigation results. When the robot needs to operate the target object, such as the TV in the third row or the laptop in the sixth row of Table 3, it reaches a position conducive to completing the operation task. When the robot does not need to operate the object, it reaches the area diagonal to the object, so it does not hinder the user from operating the object, as in the navigation results in the second and fourth rows of Table 3.

Segmented Global Path Planning Results
The segmented global path planning method with the topological map and auxiliary navigation points aims to improve the success rate and safety of domestic navigation and to decrease the computational time. The experiments were grouped into a test of the success rate and a test of the global path planning efficiency to evaluate the method.
To compare the success rates, the crossing door success rate (CDSR) and the navigation success rate (NSR) were employed as metrics. The CDSR is the number of doors passed successfully divided by the total number of door crossings. If the robot collides with the door frame or cannot plan a path in the process of passing a door, it is considered to have failed to pass the door. Furthermore, to capture collisions with other obstacles leading to navigation termination, the NSR is defined as the number of successful navigation paths divided by the total number of navigation paths. A navigation run is successful if the robot reaches the destination without colliding with any object during the whole navigation process. Higher NSR and CDSR values indicate better performance.
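The two metrics above are simple ratios; a trivial sketch with hypothetical function names:

```python
def crossing_door_success_rate(doors_passed, doors_total):
    """CDSR: door crossings completed without collision or planning
    failure, divided by the total number of crossings attempted."""
    return doors_passed / doors_total

def navigation_success_rate(paths_succeeded, paths_total):
    """NSR: navigation runs that reached the goal collision-free,
    divided by the total number of runs."""
    return paths_succeeded / paths_total
```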
To assess the improvement of global path planning efficiency, we used the path length and the number of traversed cells. The shorter the planned path, the better the algorithm; likewise, the less time the algorithm consumes, the better. Considering that different computers have different performance and that the time consumed mainly depends on the number of traversed cells (the cells whose potential has been calculated by the path planning algorithm), we used the number of traversed cells to evaluate the efficiency of our system.

Table 3. Experiment results of automatically selecting navigation goal.


Related Parameters and ANPs
The navigation target objects we selected in the home scene were the bed in the master bedroom (called bed 1), the bed in the second bedroom (called bed 2), the fridge in the kitchen, and the sink in the bathroom (as shown in Figure 8a), all of which are distributed across different topological areas with at least two doors between them. The robot was ordered from one object to another object in a different room. Therefore, there were 12 paths in total: bed 1 ↔ bed 2, bed 1 ↔ fridge, bed 1 ↔ sink, bed 2 ↔ fridge, bed 2 ↔ sink, and fridge ↔ sink. The ANPs of the home scene, generated by Algorithm A2 in Appendix A, are shown in Figure 9a. Similarly, the navigation target objects selected in the office environment were the desk (called desk 1) and the microwave in the lab, the laptop in the meeting room, the desk in the elevator hall (called desk 2), and the desk in the corridor (called desk 3), as shown in Figure 8b. There were 18 paths in total: desk 1 ↔ desk 2, desk 1 ↔ desk 3, desk 1 ↔ laptop, desk 2 ↔ desk 3, desk 2 ↔ microwave, desk 2 ↔ laptop, desk 3 ↔ microwave, desk 3 ↔ laptop, and microwave ↔ laptop. The ANPs of the office scene are shown in Figure 9b.
In order to evaluate different robot dimensions, we used the three platforms with different diameters shown in Figure 7. The robot radii employed in the home scene were 0.18 m, 0.23 m, and 0.28 m, while the radii employed in the office scene were 0.18 m and 0.23 m.

Success Rate
Tables 4 and 5 respectively summarize the experimental results of the CDSR and NSR with different radii in the home scene and the office scene. In the experiments, the ROS move_base stack [40] with the A* global planner and DWA local planner was implemented for comparison with the segmented global path planning method. Multiple conclusions can be extracted from these results:
• The proposed navigation algorithm based on the ANPs can significantly improve the crossing door success rate and navigation success rate for the three robot sizes and two environments, reaching improvements of up to 33.33% (e.g., the 0.23 m radius in the home scene) and 50% (e.g., the 0.23 m radius in the office scene).
• The first reason for the increase in success rate is that the proposed method can eliminate the influence of observation error on navigation. In the process of approaching the door, erroneous obstacles completely obstructed the feasible area for crossing the door, so a valid path could not be planned and the navigation was abandoned. As shown in Figure 10a, when the robot with a radius of 0.23 m approached the bathroom's door (indicated by the red circle), the lidar observation data were noisy, making the door area appear blocked by obstacles. Figure 10b shows our proposed method in the same scenario. The robot was first driven to the auxiliary navigation point (the position where the robot stands in Figure 10b) and observed the door area in the direction parallel to the crossing direction. Thus, the shape of the door area was captured successfully.
• The second reason for the increase in success rate is that, by setting ANPs on both sides of the door, the robot passes through the door in an approximately straight line, reducing the probability of collision. In indoor scenes, the connecting channels between different topological areas usually form a right angle; for example, 20 of the 30 paths in our experiments' home and office scenes involved a right angle. By observing the move_base navigation trajectory (the red line in Figure 10a,c), it can be seen that the crossing door trajectory is circular-like. This makes the distances between the two sides of the robot and the obstacles uneven and increases the probability of collision. The proposed method forces the robot to pass through the door along the centerline of the door area, keeping the maximum distance from the obstacles on both sides, so as to improve the success rate and safety.

• As the robot's radius increases, the success rates of both our method and move_base degrade. A larger radius narrows the feasible area for passing the door, so it becomes more likely that no valid path can be found or that a collision occurs. For example, the door width in the home scene was 0.74 m, so as the robot's diameter increased, the feasible area for passing the door decreased to 0.34 m, 0.24 m, and 0.14 m, respectively.

Global Path Planning Efficiency
Tables 6 and 7 show the global path planning results for robots of different sizes in the home and office environments. Since A* and Dijkstra have been widely implemented and tested as path planning algorithms for mobile robots, we employed them as baselines for our proposed segmented global path planning algorithm. Multiple conclusions can be extracted from these results:
• Compared with A* and Dijkstra, the segmented global path planning algorithm greatly reduces the number of traversed cells for all three robot sizes and both environments, reaching decreases of up to 76.29% over A* (e.g., the 0.18 m radius in the office scene) and 94.54% over Dijkstra (e.g., the 0.23 m radius in the office scene).

• The proposed method improves navigation efficiency because segmented global path planning generates a correct heuristic direction pointing to the ANPs. For simple paths without branches, however, the heuristic directions of A* and of segmented global path planning are both correct and similar. In these scenarios, the improvement of segmented global path planning over A* is not obvious and can even be negative. For example, for three paths in the home environment (bed in the master room to sink, bed in the master room to fridge, and fridge to sink), the numbers of cells traversed by A* are shown in Figure 11; the heuristic search direction of A* was already close to the optimal search direction. In such cases, our method still forces the navigation path to pass through the ANPs, leading to only a slight or even negative improvement over A*.

• The gain in navigation efficiency from our method is more significant in scenes with large areas and complex floor plans. Comparing the experimental data of the two scenes, the reduction in the number of traversed cells was larger in the office scene (up to 30.20% and 85.86% in the home scene, versus 73.68% and 94.54% in the office scene). The office scene had two doors, and its furnishings were arranged in a more scattered and disorderly way, so A* was more likely to follow wrong heuristic search directions, resulting in inefficient cell traversal and long computation times. Meanwhile, the door areas of the two environments were approximately the same size, while the topological areas of the office were larger. We therefore infer that the segmented global path planning method contributes more to navigation efficiency in broad and complex scenes.

• In terms of path length, the paths planned by our method were slightly longer than those of A* and Dijkstra, by 6.42% and 6.25% on average, respectively. This is because our navigation trajectory is forced to pass through the ANPs to improve the success rate and safety of domestic navigation, whereas the A* and Dijkstra trajectories keep close to obstacles to achieve the shortest length.
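The contrast drawn above between segmented planning and a single monolithic search can be sketched as follows. This is a simplified illustration under our own assumptions (a 4-connected grid, Manhattan heuristic, and expanded-cell counting as the efficiency metric), not the paper's implementation: each leg between consecutive ANPs is planned with A* inside its own topological area, and the legs are concatenated.

```python
import heapq

def astar(grid, start, goal):
    """Plain A* on a 4-connected occupancy grid (0 = free, 1 = occupied).

    Returns (path, expanded), where `expanded` counts the traversed
    (expanded) cells, the efficiency metric compared in Tables 6 and 7.
    """
    rows, cols = len(grid), len(grid[0])
    h = lambda p: abs(p[0] - goal[0]) + abs(p[1] - goal[1])  # Manhattan heuristic
    open_set = [(h(start), 0, start, None)]
    came, g_cost, expanded = {}, {start: 0}, 0
    while open_set:
        _, g, cur, parent = heapq.heappop(open_set)
        if cur in came:                      # already expanded
            continue
        came[cur] = parent
        expanded += 1
        if cur == goal:                      # reconstruct the path
            path = []
            while cur is not None:
                path.append(cur)
                cur = came[cur]
            return path[::-1], expanded
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nxt = (cur[0] + dr, cur[1] + dc)
            if (0 <= nxt[0] < rows and 0 <= nxt[1] < cols
                    and grid[nxt[0]][nxt[1]] == 0
                    and g + 1 < g_cost.get(nxt, float("inf"))):
                g_cost[nxt] = g + 1
                heapq.heappush(open_set, (g + 1 + h(nxt), g + 1, nxt, cur))
    return None, expanded

def segmented_plan(grid, waypoints):
    """Plan each leg (start -> ANP -> ... -> goal) independently and concatenate."""
    full_path, total_expanded = [], 0
    for a, b in zip(waypoints, waypoints[1:]):
        leg, expanded = astar(grid, a, b)
        total_expanded += expanded
        if leg is None:
            return None, total_expanded
        full_path.extend(leg if not full_path else leg[1:])  # drop duplicated joint
    return full_path, total_expanded
```

Because each leg's heuristic points at the next ANP rather than at a distant goal behind walls, the per-leg searches expand fewer cells in complex layouts, at the cost of a path constrained to pass through every ANP.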

Conclusions
This paper proposes a social and robust indoor robot navigation system based on an object semantic grid and a topological map, which combines two major modules: automatic selection of the navigation goal and segmented global path planning. The proposed method addresses three critical problems. First, to select the navigation goal more conveniently and socially, natural language is used as the human-robot interaction form, from which the target room, object, and spatial relationship are extracted through speech recognition and word segmentation. The robot then determines the target space and selects the navigation goal point from it. For example, from the voice instruction "go to the side of the laptop in the meeting room," we obtain the target room "meeting room," the target object "laptop," and the target space determined as the non-affordance space. Second, to improve the success rate and safety of indoor navigation, we use the door endpoint coordinates and the room's dominant direction stored in the topological map to generate auxiliary navigation points on both sides of each door, correcting the robot's trajectory. Finally, to improve the efficiency of global path planning, we segment the global path into individual topological spaces and plan the path in each of them based on the auxiliary navigation points. The system was tested in a series of experiments in real indoor environments with multiple robot dimensions. The results show that the selected navigation goals better satisfy social rules, that the auxiliary navigation points significantly improve the crossing door success rate (by 24.22% on average) and the navigation success rate (by 30.56% on average) compared with ROS move_base, and that the number of traversed cells is greatly reduced compared with A* (by 69.85% on average) and Dijkstra (by 93.25% on average), especially in complex and large-scale environments.
In future work, we will improve the system and further exploit its capability for environment representation and reasoning, such as richer associations between voice instructions and spatial relationships, so that the system can be more diverse and social. We will also consider navigation in multi-connected topological areas by imposing distance and social constraints.