An Approach for Opening Doors with a Mobile Robot Using Machine Learning Methods

Abstract: One of the tasks of robotics is to develop a robot's ability to perform specific actions for as long as possible without human assistance. One such action is opening different types of doors. This task is essential for any operation that involves moving a robot from one room to another. This paper proposes a versatile and computationally efficient algorithm for an autonomous mobile robot opening different types of doors using machine learning methods. The latter include the YOLOv5 object detection model, the RANSAC iterative method for estimating the parameters of a mathematical model, and the DBSCAN clustering algorithm. Alternative clustering methods are also compared. The proposed algorithm was explored and tested in simulation and on a real robot manufactured by SOMATIC (Dalek version). The percentage of successfully opened doors out of the total number of attempts was used as an accuracy metric. The proposed algorithm reached an accuracy of 95% in 100 attempts. Testing the door-handle detection algorithm on simulated data yielded an error of 1.98 mm over 10,000 samples; that is, the average distance from the door handle found by the detector to the real one was 1.98 mm. The proposed algorithm has shown high accuracy and the ability to be applied in real time for opening different types of doors.


Introduction
An autonomous mobile robot (AMR) is a machine capable of moving and performing specific tasks without regular human assistance [1]. Each task requires a different degree of autonomy. A high degree of autonomy is required for tasks such as cleaning a building [2], delivering food [3], etc. For an AMR, it is important to perform human-like actions in typical indoor environments in order to support an extensive range of tasks [4,5]. One of these actions is opening various types of doors. This task is important for any operation that involves the robot moving from one room to another.
In most cases, LiDAR sensors are used to localize a robot in a building [6]. Usually, the data from LiDAR sensors are fed to either the Particle Filter (PF) algorithm [7] or the Kalman Filter (KF) algorithm [8]. The problem with low-cost LiDAR sensors is that they are not accurate enough to ensure that the robot's arm reaches the door handle, as an error of just 2-3 cm can cause the end-effector to miss the handle, and therefore, the door will not be opened. This leads to the obvious conclusion that machine learning (ML) algorithms are required for this task [9][10][11], since most AMRs for such operations have only a camera as a sensor to see the door handle, which makes object recognition in the camera image relevant to this task. This means that the position of the door handle relative to the end-effector needs to be constantly updated in order to reach it and successfully open the door.
Wang et al. [12] investigated the control of mobile robotic arms for opening doors. To accomplish this task, they first used an image-recognition method based on YOLOv5 to obtain the position of the door handle. Next, they employed the simulation platform CoppeliaSim to facilitate the interaction between the robotic arm and the environment. Third,

• Proportional-integral-differential (PID) law [17,18]. An example of the usage of this algorithm in robotics is the regulation of the wheel speed to achieve a given linear and angular acceleration of an AMR. At first look, such a task may seem quite simple: we calculate the rotational speeds of each of the four wheels and then apply a certain starting voltage to each wheel, and if any wheel does not rotate at the required frequency, we reduce or increase the voltage in certain steps until the frequency is equal to the one set. In fact, this is a particular case of a PID controller. The PID law is given by Equation (1):

u(t) = K_p e(t) + K_i ∫_0^t e(τ) dτ + K_d de(t)/dt, (1)

where K_p, K_i, K_d represent the coefficients of the proportional, integral, and differential components, respectively; e(t) indicates the deviation of the measured value from the set value at a given time; and u(t) indicates the control function. The PID control law helps to move the AMR on the floor quite accurately, which plays an important role in the door-opening process.
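The PID law above can be sketched as a short discrete-time controller. The gains, time step, and toy first-order wheel model below are illustrative assumptions, not values taken from the Dalek robot.

```python
# Minimal discrete-time PID controller sketch for regulating a wheel's
# rotational speed. Gains and the plant model are illustrative assumptions.

class PID:
    def __init__(self, kp, ki, kd, dt):
        self.kp, self.ki, self.kd, self.dt = kp, ki, kd, dt
        self.integral = 0.0
        self.prev_error = None

    def update(self, setpoint, measured):
        e = setpoint - measured
        self.integral += e * self.dt
        derivative = 0.0 if self.prev_error is None else (e - self.prev_error) / self.dt
        self.prev_error = e
        # u(t) = Kp*e(t) + Ki*integral(e) + Kd*de/dt, as in Equation (1)
        return self.kp * e + self.ki * self.integral + self.kd * derivative

# Toy first-order wheel: its speed relaxes toward the applied voltage.
pid = PID(kp=2.0, ki=0.5, kd=0.1, dt=0.01)
speed = 0.0
for _ in range(2000):
    u = pid.update(10.0, speed)   # target: 10 rad/s
    speed += (u - speed) * 0.05   # crude plant response
```

The integral term drives the steady-state error toward zero, which is what lets the robot hold the commanded wheel frequency despite load disturbances.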
• Inverse kinematics using the Jacobian inverse. In computer animation and robotics, inverse kinematics [19] is the mathematical process of calculating the variable joint parameters required to place the end of a kinematic chain, such as a robot arm's end-effector, in a given position and orientation relative to the start of the chain. Given the joint parameters, the position and orientation of the end of a chain, such as a character's arm or a robot arm, can be calculated directly through a few applications of trigonometric formulas; this process is known as forward kinematics. However, in general, inverse kinematics is much more complicated. There are approaches to solving the inverse kinematics task analytically [20], but iterative methods are used in general cases with many joints. One such method is inverse kinematics using the Jacobian inverse. This is a simple but effective way to solve the inverse kinematics task, and it is the one used in this research.
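A minimal sketch of Jacobian-inverse iteration for a planar two-joint arm (the real robot has seven joints; the link lengths, step size, and iteration count here are illustrative assumptions):

```python
import math

# Jacobian-inverse IK sketch for a planar 2-joint arm.
L1, L2 = 1.0, 1.0  # assumed link lengths

def fk(q):
    """Forward kinematics: joint angles -> end-effector (x, y)."""
    x = L1 * math.cos(q[0]) + L2 * math.cos(q[0] + q[1])
    y = L1 * math.sin(q[0]) + L2 * math.sin(q[0] + q[1])
    return x, y

def jacobian(q):
    """Analytic 2x2 Jacobian of fk."""
    s1, c1 = math.sin(q[0]), math.cos(q[0])
    s12, c12 = math.sin(q[0] + q[1]), math.cos(q[0] + q[1])
    return [[-L1 * s1 - L2 * s12, -L2 * s12],
            [ L1 * c1 + L2 * c12,  L2 * c12]]

def ik(target, q, steps=200, alpha=0.5):
    """Iterate q <- q + alpha * J^{-1} * (target - fk(q))."""
    for _ in range(steps):
        x, y = fk(q)
        ex, ey = target[0] - x, target[1] - y
        J = jacobian(q)
        det = J[0][0] * J[1][1] - J[0][1] * J[1][0]
        if abs(det) < 1e-9:   # near a singularity: stop iterating
            break
        # Invert the 2x2 Jacobian explicitly.
        dq0 = ( J[1][1] * ex - J[0][1] * ey) / det
        dq1 = (-J[1][0] * ex + J[0][0] * ey) / det
        q = [q[0] + alpha * dq0, q[1] + alpha * dq1]
    return q

q = ik((1.2, 0.8), [0.3, 0.5])
```

For a 7-joint arm the Jacobian is not square, so the pseudo-inverse replaces the explicit inverse, but the damped update loop is the same idea.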

• Particle filters, or sequential Monte Carlo methods [21,22], are used to solve filtering problems in signal processing and Bayesian statistical inference. The task of filtering is to estimate internal states in dynamic systems when partial observations are made using sensors that contain a measurement error. In this research, a particle filter is used to estimate the position and orientation of the robot. The sensor data used are the results of a LiDAR scan.
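A toy one-dimensional particle filter illustrates the predict-weight-resample cycle described above; the motion model, noise levels, and corridor setup are illustrative assumptions rather than the robot's actual LiDAR pipeline.

```python
import math
import random

# 1-D particle filter sketch: a robot moves along a corridor and receives
# a noisy measurement of its position each step. All values are assumed.
random.seed(0)

N = 500
true_x = 2.0
particles = [random.uniform(0.0, 10.0) for _ in range(N)]

def weight(p, z, sigma=0.2):
    # Likelihood of measurement z given particle position p.
    return math.exp(-((z - p) ** 2) / (2 * sigma ** 2))

for _ in range(20):
    # Predict: the robot moves +0.1 m, with small motion noise.
    true_x += 0.1
    particles = [p + 0.1 + random.gauss(0, 0.05) for p in particles]
    # Update: weight particles by a noisy position measurement.
    z = true_x + random.gauss(0, 0.2)
    w = [weight(p, z) for p in particles]
    total = sum(w) or 1e-12
    w = [x / total for x in w]
    # Resample (simple multinomial resampling).
    particles = random.choices(particles, weights=w, k=N)

estimate = sum(particles) / N
```

The real filter works in (x, y, θ) and weights particles by how well a simulated LiDAR scan from each pose matches the observed scan, but the cycle is identical.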
One example of a mobile robot capable of opening a door is the SpotMini robot from Boston Dynamics [23]: the small SpotMini robot opens the door (toward itself) with the help of a unique gripper placed on top, and it enters the room while holding the door. Unfortunately, there is no detailed description of this algorithm and only some videos are available, but the robot has a built-in RGBD stereo camera, and it is possible that data from it were used to open the door. The design of the SpotMini robot with a gripper fixed on top is well suited to the task of opening doors. It is quite small and uses legs, which provide improved maneuverability compared with conventional wheels.
Closely related to our work is the door-opening algorithm of the Toyota Human Support Robot [24]. This robot is similar in size and has an RGBD stereo camera, which it uses to find the door handle. This mobile robot uses a door-opening algorithm comparable to that proposed in this paper, but it differs in some respects; for example, its arm has five degrees of freedom.
As mentioned above, this work aims to develop a universal algorithm for building a door-opening trajectory for an autonomous mobile robot with a seven-joint robotic arm, an RGBD camera, and four LiDAR sensors responsible for localization. The focus is on the development and realization of the part of the algorithm responsible for estimating the position and orientation of the door handle, as this is the universal part of the whole algorithm and can be used for many types of AMRs, since the main requirements are the availability of an RGBD stereo camera and a mechanism for finding its position and orientation relative to the robot.
In concept, the realization of the door-opening algorithm can be divided into two isolated stages. The first stage is the construction of the door-opening trajectory based on a priori data on the door handle's position and orientation. It is not considered in detail in this research, since the door-opening trajectory for two different robots will be radically different: one robot may use a robotic arm with seven degrees of freedom while another has six, and this alone introduces major changes to the concept of trajectory construction. Usually, robots differ in much more than the number of joints in the arm, so the realization of this logic will be described in abstract steps. The second stage is finding the position and orientation of the handle using machine learning methods. This is a key part of the work, as the same approach can be used for a robot that is completely different from the one used in this project. All that such a robot needs is an RGBD stereo camera and a mechanism for finding the camera's exact position relative to the robot's coordinates. This second stage can be called the algorithm for refining the position and orientation of the door and door handle, as it uses a priori and sensor data (in our case, only the RGBD stereo camera).
We must have some a priori data about the position and orientation of the door and door handle on the map. That is, there will be a map of the room onto which the door and the position of the handle can be added, and this information will be read out when the algorithm is initialized. Such information is not accurate in the first place, and neither is the robotic arm [25] nor the localization [26]. To enable the robot to open a door, we must continually update the position of both the door and the door handle in relation to the robot, using information obtained from the RGBD stereo camera. To accomplish this, we will utilize an algorithm designed to refine the position and orientation of the door and its handle. The data obtained from this algorithm will then be fed into a separate algorithm that is responsible for constructing a trajectory for the robot's arm to follow while opening the door. The resulting output of this process will be the specific trajectory that the robot will use to successfully open the given door, using its initial position and arm state as additional input parameters.
This article is organized as follows: Section 2 introduces the input and output data, as well as a description of the proposed algorithm. Section 3 presents the results of numerical experiments conducted to test the efficacy of the proposed algorithm. Finally, in Section 4, we draw conclusions from our findings and discuss potential avenues for future research.

Input Data
Let us suppose that we have a certain map of the room, with doors defined by an affine transformation D with a rotation matrix R^(D) and a translation vector T^(D). Note that the orientation of the door along the x and y axes matches the point of attachment of the door to the wall, and along the z axis it matches the floor. The handle of the door will be set by a transformation H_D relative to the transformation of the door. The position of the robot will be set by a transformation M in the global coordinate system, as in the case of the door (see Figure 1).
The starting position of the robotic arm will be set by a vector a = (a_1, a_2, . . . , a_7) of seven elements, since the robotic arm has seven joints. Each element of the vector represents the angle at which the corresponding joint is rotated.
The last thing that needs to be added is the image from the RGBD camera. Let us assume that the camera returns an image and a depth map of n by m pixels. Since the camera will be looking at the door handle for some time, we will obtain l images from it. So, the resulting data object from the camera will be I ∈ R^(l×4×m×n). The system can also implement additional logic, such as filtering out blurry [27] or outdated images. It is also possible to filter out similar images [28], so that only a few of them are left.
Thus, the input to the door-opening algorithm is D, H D , a, M, I.
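For illustration, the inputs D, H_D, a, M, I can be bundled into one structure; the field types and the plain nested-list representation of 4×4 transforms are assumptions made for this sketch.

```python
from dataclasses import dataclass
from typing import List

Mat4 = List[List[float]]  # 4x4 homogeneous transform (rotation + translation)

def identity4() -> Mat4:
    return [[1.0 if r == c else 0.0 for c in range(4)] for r in range(4)]

@dataclass
class DoorOpeningInput:
    D: Mat4            # door pose in the global frame
    H_D: Mat4          # handle pose relative to the door
    a: List[float]     # seven arm joint angles
    M: Mat4            # robot pose in the global frame
    I: list            # l x 4 x m x n stack of RGBD frames

    def __post_init__(self):
        if len(self.a) != 7:
            raise ValueError("the robotic arm has seven joints")

inp = DoorOpeningInput(identity4(), identity4(), [0.0] * 7, identity4(), [])
```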

Output Data
The output for this task will be a set of N pairs T = {(M_1, a_1), (M_2, a_2), . . . , (M_N, a_N)}, where M_i is the robot's position and a_i is the state of the arm joints. In other words, it is the trajectory that the robot should follow, given some physical constraints [29,30], to open the door.

Proposed Algorithm
Before the algorithm starts working, the robot has to perform certain actions, namely, drive as close to the door as possible and point the camera first at the door itself and then at the door handle, in order to fill the object I with images.
Then, the robot's position and orientation M are recorded, as well as the arm's state a, and together with the images I and the a priori positions of the door D and door handle H_D, they are sent to the door-opening algorithm.
It is also important to note that in the process of opening the door, we can receive new images from the camera and re-estimate the position of the door handle, and if the estimate is very different from the one we made before the algorithm started, we can rebuild the opening trajectory in real time. For many robots, this is not possible due to the specifics of their design, because they cannot use the manipulator and point the camera at the same time. However, in general, the architecture of the developed system provides such functionality, so let us introduce a metric for the distance between two affine transformations, d(x, y), and a threshold ξ.
If the new handle estimate differs from the previous one by more than the threshold, i.e., d(H_D, H_D^new) > ξ, then we stop the algorithm and rebuild the trajectory with the new D, H_D, a, M, I.
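One possible choice of the metric d(x, y) between two affine transforms is the translation distance plus a weighted rotation angle; the weight and the threshold ξ below are illustrative assumptions, not the values used on the robot.

```python
import math

def transform_distance(T1, T2, rot_weight=0.1):
    """d(x, y): translation distance plus a weighted rotation angle."""
    # Translation part: last column of the 4x4 homogeneous matrices.
    dt = math.sqrt(sum((T1[i][3] - T2[i][3]) ** 2 for i in range(3)))
    # Rotation part: angle of R1^T R2, recovered from its trace
    # via trace(R1^T R2) = 1 + 2 cos(theta).
    trace = sum(T1[r][c] * T2[r][c] for r in range(3) for c in range(3))
    angle = math.acos(max(-1.0, min(1.0, (trace - 1.0) / 2.0)))
    return dt + rot_weight * angle

def should_rebuild(H_old, H_new, xi=0.02):
    """Rebuild the trajectory if the new handle estimate moved too far."""
    return transform_distance(H_old, H_new) > xi

identity = [[1.0 if r == c else 0.0 for c in range(4)] for r in range(4)]
shifted = [row[:] for row in identity]
shifted[0][3] = 0.05   # handle estimate moved 5 cm along x
```

With xi = 0.02 (2 cm for pure translations), a 5 cm shift in the handle estimate triggers a rebuild, matching the 2-3 cm error tolerance discussed in the introduction.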
During the research, many combinations of different methods were tested to find the door handle. Finally, our proposed algorithm consists of the following steps:
1. RANSAC (Random Sample Consensus) is a statistical algorithm commonly used in computer vision and image processing to estimate the parameters of a mathematical model from a set of observed data points that may contain outliers or errors. The goal of RANSAC is to identify the inliers (data points that fit the model well) and reject the outliers (data points that do not fit the model well). The algorithm works by randomly selecting a subset of data points, fitting a model to them, and then testing the remaining points to see how well they fit the model. If a sufficient number of data points fit the model well, it is considered a good fit and the algorithm terminates. Otherwise, another subset of data points is selected, and the process is repeated.
RANSAC is often used in computer vision tasks such as image registration, stereo vision, and structure from motion, where it is necessary to estimate a mathematical model from noisy data with outliers. RANSAC is popular because it is relatively fast and simple to implement and can provide robust estimates even in the presence of a large number of outliers. In the context of RANSAC, outliers are data points that do not conform to the underlying mathematical model being estimated. Outliers can arise due to measurement errors, noise, or other factors that cause the data to deviate from the expected pattern. In our case, all points further than a fixed distance from the door plane are considered outliers.
RANSAC is designed to handle datasets that contain outliers by iteratively fitting models to subsets of the data and rejecting points that do not fit the model well. In each iteration, the algorithm selects a random subset of points and fits a model to them. It then uses the model to predict the values of the remaining points and computes a measure of how well each point fits the model. Points that are too far from the model (i.e., outliers) are then discarded, and the algorithm repeats the process with a new subset of points.
By repeating this process many times, RANSAC is able to estimate the parameters of the mathematical model in a way that is robust to outliers. The final model is obtained by using all of the inlier points that were selected in the iterations and can be used for further processing or analysis. In our case, it will be used to find the coefficients of the door plane, which directly give the orientation of the door handle, as the handle is attached to the door. The door plane will be located in the pixel coordinate space. That is, we have a depth map D = (d_ij), where d_ij is the depth of pixel (i, j). So, we have the set on which we will execute the RANSAC algorithm:

D̃ = {(i, j, d_ij) | (i, j) ∈ {1, . . . , m} × {1, . . . , n}}.

By performing RANSAC on the set D̃, we get the plane ax + by + cz + d = 0.
2. All points of the point cloud (PC) that are close to the plane will be deleted. Additionally, all points that lie outside the BB that the YOLOv5 ML model finds will be deleted.
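Step 1 can be sketched as a minimal RANSAC plane fit over (i, j, d_ij)-style samples; the iteration count, inlier tolerance, and the synthetic "door with a protruding handle" data are illustrative assumptions.

```python
import random

def plane_from_points(p1, p2, p3):
    """Plane ax + by + cz + d = 0 through three points (unit normal)."""
    u = [p2[k] - p1[k] for k in range(3)]
    v = [p3[k] - p1[k] for k in range(3)]
    a = u[1] * v[2] - u[2] * v[1]          # normal = u x v
    b = u[2] * v[0] - u[0] * v[2]
    c = u[0] * v[1] - u[1] * v[0]
    n = (a * a + b * b + c * c) ** 0.5
    if n < 1e-12:
        return None                        # degenerate (collinear) sample
    a, b, c = a / n, b / n, c / n
    d = -(a * p1[0] + b * p1[1] + c * p1[2])
    return a, b, c, d

def ransac_plane(points, iters=200, tol=0.01, seed=0):
    rng = random.Random(seed)
    best, best_inliers = None, []
    for _ in range(iters):
        plane = plane_from_points(*rng.sample(points, 3))
        if plane is None:
            continue
        a, b, c, d = plane
        inliers = [p for p in points
                   if abs(a * p[0] + b * p[1] + c * p[2] + d) <= tol]
        if len(inliers) > len(best_inliers):
            best, best_inliers = plane, inliers
    return best, best_inliers

# Synthetic "door": a 10x10 grid of points on the plane z = 1,
# plus a column of points sticking out of it (the "handle").
pts = [(x / 10, y / 10, 1.0) for x in range(10) for y in range(10)]
pts += [(0.5, 0.5, 1.0 + k / 20) for k in range(1, 11)]
plane, inliers = ransac_plane(pts)
```

The recovered normal (a, b, c) is what step 1 uses as the door (and hence handle) orientation; the handle points fall outside the tolerance and survive into step 2.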
So, we need to find a rectangle (bounding box, BB) containing the door handle. That is, we are given an image as input, i ∈ R^(4×m×n), and the output is a set B = {(x_j, y_j, h_j, w_j) | j = 1, . . . , k} of k elements, each of which describes the coordinates, height, and width of a BB. The YOLOv5 architecture is used for this purpose [31][32][33]. Object detection aims to locate objects in the input images and then pass these objects through a prediction system to draw frames around the objects and predict their classes. Arduengo et al. [24] proposed a framework to adaptively operate common doors using a manipulator embedded in a mobile robot. The researchers utilized their own dataset to train YOLOv5 for the purpose of door-handle detection. For this study, the aforementioned model was utilized. However, it should be noted that the model typically generates a BB that does not entirely encompass the visual aspect of the door handle (see Figure 2). Therefore, the BB is generally expanded by a fixed number of pixels.
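The bounding-box handling described above can be sketched in a few lines: expand the detector's box by a fixed margin, then keep only point-cloud samples whose pixel falls inside it. The margin value is an illustrative assumption.

```python
def expand_bb(x, y, w, h, margin, img_w, img_h):
    """Grow an (x, y, w, h) box by `margin` pixels, clamped to the image."""
    x2 = max(0, x - margin)
    y2 = max(0, y - margin)
    w2 = min(img_w - x2, w + 2 * margin)
    h2 = min(img_h - y2, h + 2 * margin)
    return x2, y2, w2, h2

def points_in_bb(points, bb):
    """Keep (i, j, depth) samples whose pixel (row i, column j) lies in bb."""
    x, y, w, h = bb
    return [(i, j, d) for (i, j, d) in points
            if x <= j < x + w and y <= i < y + h]

# Example: a 40x20 detection at (100, 50) in a 1280x720 frame.
bb = expand_bb(100, 50, 40, 20, margin=10, img_w=1280, img_h=720)
```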
3. The resulting PC is then fed into the DBSCAN clustering model. We take the largest cluster from it and consider its centroid to be the position of the door handle. We consider the normal vector of the door plane to be the orientation.
In computer vision, DBSCAN can be used for various tasks, such as object detection, image segmentation, and feature extraction. For instance, DBSCAN can be applied to segment an image by clustering pixels based on their similarity in color or texture. Additionally, DBSCAN can also be utilized to detect and group together objects in an image by clustering features such as key points or descriptors. By leveraging the density-based nature of DBSCAN, it can be effective in handling complex and varying object shapes, and it is robust to noise and outliers.
DBSCAN can also be applied to point clouds obtained from depth images, which are commonly used in 3D computer vision tasks. By treating the 3D point cloud as a set of spatial coordinates, DBSCAN can cluster points that are close together in 3D space while considering points that are more sparsely distributed as noise or outliers. This can be particularly useful in tasks such as object recognition and segmentation, where the 3D point cloud of an object can be clustered to identify its boundaries and separate it from the background.
One benefit of using DBSCAN for point cloud segmentation is its ability to handle non-uniformly distributed points, making it suitable for noisy or sparse point clouds. Additionally, DBSCAN can handle point clouds with varying densities, making it an appropriate choice for point clouds that may have missing data or varying resolution.
Note that the DBSCAN algorithm takes as input a proximity matrix and two parameters: the radius ε and the minimum number of neighbors m. Suppose some symmetric distance function ρ(x, y) = ρ(y, x) and constants ε, m are given. We call the set E(x) = {y : ρ(x, y) ≤ ε} the ε-vicinity of the object x. A root object is an object whose ε-vicinity contains at least m elements: |E(x)| ≥ m. An object p is directly reachable from an object q if p ∈ E(q) and q is a root object. An object p is reachable from an object q if there exist p_1, p_2, . . . , p_n, with p_1 = q and p_n = p, such that for all i ∈ 1 . . . n − 1, p_{i+1} is directly reachable from p_i. The algorithm is as follows: first, select any root object from the dataset. Then, we mark this object, select all its directly reachable neighbors, and add them to the traversal list. Then, for each of these vertices that is also a root object, we perform similar operations. Once we have traversed all the points in a given cluster, we can start again from any other root vertex.
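The definitions above can be sketched directly as a compact, unoptimized DBSCAN that then takes the largest cluster's centroid, as in step 3; the synthetic point cloud, ε = 0.5, and m = 4 are illustrative assumptions.

```python
import random

def dbscan(points, eps, m):
    """Label points with cluster ids; -1 means noise."""
    def dist(a, b):
        return sum((ai - bi) ** 2 for ai, bi in zip(a, b)) ** 0.5

    n = len(points)
    # eps-vicinity E(x) for every point (a point is its own neighbor).
    neighbors = [[j for j in range(n) if dist(points[i], points[j]) <= eps]
                 for i in range(n)]
    labels = [-1] * n
    cluster = 0
    for i in range(n):
        if labels[i] != -1 or len(neighbors[i]) < m:
            continue                        # not an unvisited root object
        labels[i] = cluster                 # start a new cluster at root i
        queue = list(neighbors[i])
        while queue:                        # traverse everything reachable
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster
                if len(neighbors[j]) >= m:  # j is itself a root object
                    queue.extend(neighbors[j])
        cluster += 1
    return labels

def largest_cluster_centroid(points, labels):
    counts = {}
    for lab in labels:
        if lab >= 0:
            counts[lab] = counts.get(lab, 0) + 1
    best = max(counts, key=counts.get)
    members = [p for p, lab in zip(points, labels) if lab == best]
    return tuple(sum(c) / len(members) for c in zip(*members))

# Synthetic cloud: 20 points near the origin (the "handle"), 5 points near
# (5, 5, 5), and two isolated noise points.
rng = random.Random(1)
cloud = [(rng.gauss(0, 0.1), rng.gauss(0, 0.1), rng.gauss(0, 0.1)) for _ in range(20)]
cloud += [(5 + rng.gauss(0, 0.1), 5 + rng.gauss(0, 0.1), 5 + rng.gauss(0, 0.1)) for _ in range(5)]
cloud += [(10.0, 0.0, 0.0), (-10.0, 5.0, 2.0)]
labels = dbscan(cloud, eps=0.5, m=4)
centroid = largest_cluster_centroid(cloud, labels)
```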
DBSCAN requires few computing resources, which is important in this task, since we will cluster points thirty times per second, corresponding to the camera frame rate. Next, we consider other clustering methods and compare their complexity.
A relatively simple and popular clustering method is K-Means [34]. Its low computational complexity is essential for our task, since the robot is usually not allowed to stand in one place for a long time while calculating the door-opening trajectory.
This algorithm can be implemented with asymptotic complexity O(nkdi), where n is the number of d-dimensional vectors, k is the number of clusters, and i is the number of iterations required to achieve the desired accuracy. However, as can be seen from this complexity estimate, we have a hyperparameter k responsible for the number of clusters, which we would need to know in advance for our task. If we simply set some number, it will significantly affect the accuracy, since this parameter will be different for each door, depending on its size and the form of the door handle.
As in the case of DBSCAN, another density-based data clustering algorithm is OPTICS (ordering points to identify the clustering structure) [35,36]. In general, the algorithm is very similar to DBSCAN, which was actually used in the final implementation. Still, it solves one of the main weaknesses of DBSCAN: the problem of identifying significant clusters in datasets of varying density. The database objects must be ordered (in linear time) so that objects close to each other are neighbors in the final ordering. In addition, a particular distance is stored for each point, representing the density acceptable for a cluster so that both neighboring points belong to the same group. OPTICS could be used instead of DBSCAN. However, DBSCAN was used because we do not need the additional improvement provided by the OPTICS algorithm: we usually just need to find the largest cluster, and whether all the others are considered clusters or outliers no longer plays a strong role. That is why we chose DBSCAN for this work.
Another alternative to the DBSCAN algorithm could be the EM algorithm [37,38]. This method computes maximum likelihood estimates of the parameters of probabilistic models when the model depends on some hidden variables. Each iteration of this method consists of two steps: expectation and maximization. The E-step (expectation) calculates the expected value of the likelihood function; during this step, hidden variables are treated as observable. The M-step (maximization) calculates the maximum likelihood estimate, increasing the expected likelihood computed in the previous step; this value is then used for the E-step in the next iteration. The algorithm is convergent and runs until a certain accuracy criterion is reached. It can also be used as a clustering method, since the hidden parameters, in this case, are the means and variances of normal distributions. This means that the algorithm searches for the parameters of a mixture of normal distributions from which the data were most likely generated. The reason why this algorithm was not used is, again, the presence of a hyperparameter, as in the case of K-Means. Additionally, this algorithm takes a long time to execute, as it evaluates complex mathematical formulas at each iteration, which usually takes significant time at the program level. Making the algorithm run in fewer iterations is possible, but this requires sacrificing accuracy.
This work also tested many other methods that did not show the necessary efficiency. There were attempts to use image segmentation [39] instead of object detection, but the results did not show the required accuracy. This is primarily due to the need for sufficiently large datasets for training neural networks, and datasets for segmentation are quite difficult to create (much more difficult than for object detection).
Efforts were also made to substitute linear regression for the RANSAC algorithm, but such attempts did not deliver the desired accuracy. The reason for the failure of linear regression in identifying the door plane is that it optimizes the mean squared error, which is not well suited to identifying the plane with the largest number of points around it. Hence, the RANSAC algorithm is better suited for this task.
There were also attempts to skip the clustering step and instead fit the points to a three-dimensional model of the door handle [40]. Such an algorithm is quite accurate but requires a lot of computing resources, which are usually unavailable on an AMR of this class. Therefore, in the end, it was decided to find the centroid of the largest cluster.

Results
This section describes the simulation and experimental results obtained in this paper. The omnidirectional mobile robot Dalek is based on four mecanum wheels. This type of wheel has rollers evenly distributed around the rim, through which it contacts the driving surface. These wheels can be driven independently, providing controlled movement in any direction. This design greatly simplifies movement for a robot operating indoors; for example, it does not need to turn 90 degrees to drive to the left. In tight spaces, the robot often cannot physically turn at all, and this problem is solved by the mecanum wheel design. Four LiDAR sensors mounted at a height of up to 20 cm help localize the robot. These are LiDAR sensors from the Chinese manufacturer SLAMTEC (RPLIDAR A3). They can measure the distance to an object no further than 25 m from the robot, with an error of up to 5 cm. These are two-dimensional LiDAR sensors, and in this robot design, they are installed horizontally, so the output is a slice of points parallel to the floor plane. As for the robot arm, this robot has a UFactory xArm 7. This arm has seven degrees of freedom and weighs 13.7 kg. Its repeatability is one-tenth of a millimeter. A repeatability of one-tenth of a millimeter means that if you put the arm in any position that the limits of the joints allow, record the coordinates of these joints, move the arm to any other position, and then tell it to return to the recorded coordinates, it will do so with an accuracy of one-tenth of a millimeter. This is an important nuance, since experiments have shown that the distance from the end-effector position calculated by forward kinematics to the actual end-effector position is much greater than the repeatability and is close to 1 cm. This is one of the reasons for using the ML model to dynamically find the position of the handle relative to the camera during trajectory execution, since an error of 1 cm can cause a missed hit on the door handle. Note that the maximum speed of the end-effector is 1 m/s. An RGBD stereo camera manufactured by Intel, the RealSense D435i, is mounted on the robot. This camera captures depth images at a frequency of 30 frames per second with a resolution of 1280 by 720 pixels. The camera can calculate image depth for objects no closer than 10 cm and no further than 10 m away. The error of the depth image increases with distance: for example, if the object is 30 cm away from the sensor, the error is plus or minus 1 cm; if the object is 5 m away, the error will be several tens of centimeters.
All software was executed on an NVIDIA Jetson TX2 minicomputer. The proposed door-handle detection algorithm can be used with different types of arms and gripping mechanisms, which depends more on engineering decisions. Still, the robot must be physically able to open the door and must have an RGBD stereo camera to recognize the door plane and the door handle. In this case, we used an Intel RealSense D435i (see Figure 3).
tion is much greater than the repeatability and is close to 1 cm.This is one of the for using the MN to dynamically find the position of the handle relative to the during the trajectory execution since an error of 1 cm can cause a missed hit on handle.Note that the maximum speed of the hand detector is 1 m/s.An RGB camera manufactured by Intel, the RealSense D435i, is mounted on the sen camera captures depth images at a frequency of 30 frames per second with a reso 1280 by 720 pixels.The camera can calculate image depth for objects no closer th and no further than 10 m away.The error of the depth image increases with dista example, if the thing is 30 cm away from the sensor, it is plus or minus 1 cm.If t is 5 m away, the error will be several tens of centimeters.
The developed algorithm was tested on a SOMATIC robot of the Dalek version (see Figure 4). The robotic system under consideration is equipped with a robotic arm that possesses seven degrees of freedom, as well as the RGBD stereo camera that was previously described.
It is important to note that, prior to initiating the door-opening task, a map of the room needs to be generated and the door parameters need to be set. This includes determining the door's width, height, thickness, the height at which the door handle is situated, and other relevant parameters of the door handle. Once the map is created, a basic door-opening plan can be built in the corresponding plan creation menu (see Figure 5) by selecting the appropriate door. In Figure 6, a door is selected to build a trajectory, which can become more complex as the room map becomes more complicated. For example, an approximate trajectory can be previewed to check that the robot will not hit the walls. First, the robot uses the appropriate end effector to grab the door handle (see Figure 7). Then, Figure 8 shows the process of opening the door.

Having tested the proposed algorithm in practice on the newly developed software solution, we present the accuracy figures obtained.
The algorithm was tested in two ways: in simulation and on a real robot. Simulation testing allows us to check whether the algorithm is valid for many combinations of environmental parameters, such as door width, handle length, and whether the door opens clockwise or counterclockwise. Testing on a real robot shows whether the simulation results carry over to the real world for several particular combinations of environmental parameters. Furthermore, the algorithm was tested on the real robot in two variations: with and without ML. In the second case, a priori data on the position and orientation of the door and the door handle were used.

Evaluating Accuracy in Simulation
Testing was performed on 10,000 combinations of door width, door-handle position on the door, handle radius, handle length, opening direction (clockwise or counterclockwise), and several other parameters. The proposed algorithm worked successfully in all cases, but this result is expected in simulation and does not necessarily carry over to the real world. However, we estimated one value that is likely similar to its real-world counterpart: the distance from the door handle's detected position to its actual position. The average distance to the actual position is 1.98 mm. Figure 9 shows the distribution of this value.
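The simulation metric above, the mean distance between detected and actual handle positions, can be reproduced on synthetic data as follows. The noise level and the synthetic positions are assumptions chosen only to illustrate the computation; they do not model the paper's detector.

```python
import math
import random

def mean_detection_error(detected, actual):
    """Average Euclidean distance between detected and true handle positions.

    detected, actual -- equal-length lists of (x, y, z) points in meters.
    This mirrors the paper's simulation metric (1.98 mm over 10,000 samples).
    """
    dists = [math.dist(p, q) for p, q in zip(detected, actual)]
    return sum(dists) / len(dists)

random.seed(0)
# Synthetic ground-truth handle positions (illustrative only):
actual = [(random.uniform(-1, 1), random.uniform(-1, 1), 1.0)
          for _ in range(10000)]
# Simulate a detector with ~2 mm of Gaussian noise per lateral axis:
detected = [(x + random.gauss(0, 0.002), y + random.gauss(0, 0.002), z)
            for x, y, z in actual]
print(f"mean error: {mean_detection_error(detected, actual) * 1000:.2f} mm")
```

Collecting the individual distances rather than only their mean is what allows plotting a distribution like the one in Figure 9.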

Evaluation of Accuracy on a Real Robot
The ratio of successful attempts to the total number of attempts was used to estimate the accuracy in a real environment. We tested two variations of the algorithm: without and with the use of ML.
As we can see from Table 1, the algorithm for detecting the door handle from an RGBD camera image significantly increases the number of successful attempts. Numerical experiments also established that the average running time of the door-handle detection algorithm is 315 ms on the robot and 254 ms in the simulation. The average running time of the collision-free trajectory construction algorithm on the robot is 8.5 s.
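The accuracy metric used here, together with the result reported in the abstract (95 successful openings out of 100 attempts), can be summarized as below. The Wilson confidence interval is our addition for context, not part of the original evaluation.

```python
import math

def success_rate(successes, attempts):
    """Accuracy metric used in the paper: successful openings / total attempts."""
    return successes / attempts

def wilson_interval(successes, attempts, z=1.96):
    """95% Wilson score interval for a success rate -- a standard way to report
    uncertainty for modest trial counts (not part of the original paper)."""
    p = successes / attempts
    denom = 1 + z**2 / attempts
    center = (p + z**2 / (2 * attempts)) / denom
    half = z * math.sqrt(p * (1 - p) / attempts + z**2 / (4 * attempts**2)) / denom
    return center - half, center + half

# The paper reports 95 successful openings in 100 attempts:
print(success_rate(95, 100))  # 0.95
low, high = wilson_interval(95, 100)
print(f"95% CI: [{low:.3f}, {high:.3f}]")
```

The interval makes explicit that a 95% rate measured over 100 trials is consistent with a true success probability roughly between 89% and 98%.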
In Figure 10, the robot executes the door-opening trajectory for a test door for which the accuracy has been estimated. The video [41] shows a short demonstration of this application.
Thus, as shown in the numerical experiments, the algorithm for detecting door handles from RGBD stereo camera data significantly increases the number of successful attempts. It was also shown in simulation that the average error over 10,000 test cases is 1.98 mm; that is, on average, the distance between the actual position of the door handle and the detected one is 1.98 mm.

Figure 1. Schematic view of the door, door handle and robot.

Figure 2. Door-handle detection result using YOLOv5 for one of the test cases.

Figure 5. The menu for creating a plan.

Figure 6. Selected doors for building a door-opening algorithm.

Figure 7. Suitable end effector for gripping door handles.

Figure 8. The first stage of the built trajectory execution.

Figure 9. The distribution of the average distance from the real position of the door handle to the one found (the OX axis is the distance in mm, and the OY axis is the number of cases).

Table 1. Test results on a real robot.
