Model-Based Manipulation of Linear Flexible Objects: Task Automation in Simulation and Real World †

Abstract: Manipulation of deformable objects is a desired skill in making robots ubiquitous in manufacturing, service, healthcare, and security. Common deformable objects (e.g., wires, clothes, bed sheets, etc.) are significantly more difficult to model than rigid objects. In this research, we contribute to the model-based manipulation of linear flexible objects such as cables. We propose a 3D geometric model of the linear flexible object that is subject to gravity and a physical model with multiple links connected by revolute joints and identified model parameters. These models enable task automation in manipulating linear flexible objects both in simulation and real world. To bridge the gap between simulation and real world and build a close-to-reality simulation of flexible objects, we propose a new strategy called Simulation-to-Real-to-Simulation (Sim2Real2Sim). We demonstrate the feasibility of our approach by completing the Plug Task used in the 2015 DARPA Robotics Challenge Finals both in simulation and real world, which involves unplugging a power cable from one socket and plugging it into another. Numerical experiments are implemented to validate our approach.


Introduction
Manipulation of linear flexible objects is of great interest in many applications including service, manufacturing, health, and disaster response. Indeed, the topic has been the focus of many research studies in recent years. Cables, ropes, clothes, organs, and strings are common deformable objects studied in deformable object manipulation [1][2][3][4][5][6]. Popular tasks in linear flexible object manipulation include tying a knot using a rope [7], inserting a string into a hole [8], untangling a rope tie [9], and predicting and controlling the shape of a cable [10,11].
Cables are linear flexible objects that are common in industrial, domestic, and nuclear environments. The 2015 DARPA Robotics Challenge (DRC) Finals was aimed at advancing the capabilities of human-robot teams in responding to natural and man-made disasters. The Plug Task in this challenge required the robot to pull a power cable out of a socket and plug it into another [12]. Among six teams that completed the task, Team WPI-CMU had the fastest completion time with 5 min and 7 s [13]. Their approach used teleoperation by the human operator to grasp the plug and to complete the task [14]. The other five teams also did not automate this task, and they took longer times to perform the task.
In most of the related literature of manipulating cable-like objects, the physical model of the deformed object needs to be known before the manipulation such that the deformation of the object can be predicted. A recent review paper about different physical models of the cable-like deformable linear objects is provided by Lv et al. [15]. Chen and Zheng [16] used a cubic spline function to estimate the contour of a flexible aluminum beam in the 2D plane, and a non-linear model was applied to describe the deformation based on the material characteristics identified by vision sensors. A systematic method to model the bend, twist, and extensional deformations of flexible linear objects was presented in [17], and the stable deformed shape of the flexible object was characterized by minimizing the potential energy under the geometric constraints. Nakagaki et al. [18] extended this work and proposed a method to estimate the force on a wire based on its shape observed by stereo vision. This method was employed for an insertion task of flexible wire into a hole. Linn et al. [19] proposed a discrete model of flexible rods based on Kirchhoff's geometrically exact theory for VR applications. Caldwell et al. [20] presented a technique to model a flexible loop by a chain of rigid bodies connected by torsional springs. Yoshida et al. [21] proposed a method for planning motions of a ring-shaped object based on precise simulation using the Finite Element Method (FEM). Lv et al. [22] proposed a new mass-spring model by adding torsion springs to the cable links to express twisting behavior.
There are also approaches not based on the physical model of the flexible object. Navarro-Alarcon et al. [23][24][25] proposed a framework for automatic manipulation of deformable objects with an adaptive deformation model in 3D space. Cartesian features composed of points, lines, and angles have been used to represent the deformation. Navarro-Alarcon and Liu [26] proposed a representation of the object's shape based on a truncated Fourier series, and this model allows the robotic arm to deform the soft objects to desired contours in the 2D plane. Recently, Zhu et al. [11] extended the work in [26] and used the Fourier-based method to control the shape of a flexible cable in 2D space. Compared with the Fourier-based visual servoing method, the SPR-RWLS method with the Gaussian model proposed by Jin et al. [27] took visual tracking uncertainties into consideration and showed robustness in the presence of outliers and occlusions for cable manipulation.
Inspired by the existing work and our previous research study on the model-based linear flexible object manipulation [28,29], we have the following contributions in this paper: (1) It introduces a novel 3D geometrical model of the linear flexible objects subject to gravity based on 2D models on two projection planes and learned object positions; (2) it presents an autonomous system framework for accomplishing the DRC Plug Task; (3) it introduces a novel strategy, Simulation-to-Real-to-Simulation (Sim2Real2Sim), for bridging the gap between simulation and real world; (4) it presents a physical model of the linear flexible objects and an identification approach of getting the model parameters. It should be noted that DRC Plug Task has been selected as the validation study in this research for two reasons: (i) Our team participated in the DRC Finals [14], and (ii) The DRC Plug Task provides a well-defined set of requirements to generalize the approach presented here to other applications. This paper is organized as follows. In Section 2, we introduce the methodology of automating the DRC Plug Task in the real world. An automation system framework, a reliable geometrical model of the linear flexible objects, and a robust pose alignment controller are covered in this section. Section 3 describes how we complete the DRC Plug Task in simulation. In this section, we introduce how we build a complete simulation environment, a novel Sim2Real2Sim strategy, and a physical model of the linear flexible objects with identified parameters. Section 4 shows the experimental results. Section 5 includes the conclusion and directions for future work.

Task Automation in Real World Based on Our Geometrical Modeling of the Linear Flexible Objects
The DRC Plug Task involves unplugging a power cable from one socket and plugging it into another. The problem involves detecting a flexible cable in a massive environment, grasping the cable with a feasible grasp pose, controlling the shape of the cable to fit the target pose, and inserting the cable tip into the target socket.

Benchmarking Setup and Flow of Our Designed System
The benchmarking setup used in this study is shown in Figure 1. The setup consists of a power cable (Deka Wire DW04914-1) with a plug (Optronics A7WCB) on the tip, two power sockets (McMaster-Carr 7905k35) attached to the wall, and two neodymium disk magnets (DIYMAG HLMAG03) glued to the socket and the plug to provide a suction force [12]. Our system also includes a 6-DOF JACO v2 arm with three fingers and two RGB-D cameras, i.e., a Realsense D435 and a Microsoft Kinect. The Realsense camera is used to estimate and model the shape of the cable, while the Kinect camera is used to estimate the socket pose and filter the point cloud. This scenario corresponds to a humanoid robot equipped with one depth camera at its left arm wrist and another at its head, with one arm performing the task while the other provides the perception feedback. From a system design perspective, to complete the DRC Plug Task autonomously, the system needs to be able to model a linear flexible object, to keep track of the cable configurations and to detect the pose of the target socket through pose estimation, to plan motions for the robot, and to align the cable tip with the target socket by a controller. A flow of the proposed system architecture is presented in Figure 2. We can divide the entire process into five phases: INITIALIZE, GRASP, UNPLUG, PRE-INSERT, and INSERT. Our goal is to give the robot the ability to automatically switch between the phases. From the INITIALIZE phase to the GRASP phase, the system needs to sense the environment, including estimating the socket pose, filtering the point cloud, and modeling the cable. After getting the model of the cable, the system will estimate the poses of the cable and the cable tip and select the grasp point on the cable. Then the robot will go and GRASP the cable. From the UNPLUG phase to the PRE-INSERT phase, the robot needs to align the pose of the cable tip with the target pose. 
From the GRASP phase to the UNPLUG phase, a planned motion away from the target socket and perpendicular to the target socket hole plane is executed by the robot, and another planned motion perpendicular to the socket hole plane but close to the socket is executed by the robot to transfer from the PRE-INSERT phase to the INSERT phase. To automate the DRC Plug Task, the robot needs to know where the target is. We used the Hough Circle Transform [30] to detect the hole (circle) of the target socket. Then the coordinates of the circle center can be obtained. By integrating the depth information at the circle center, we can get the 3D position of the target socket center. Random sample consensus (RANSAC) method [31] is applied to detect the wall plane which is parallel to the socket hole plane. The normal to the wall plane and the 3D position of the target center are used to define the target frame (named "Target_Socket").
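The wall-plane detection step can be illustrated with a minimal RANSAC sketch. This is not the authors' implementation: the function names, the inlier threshold, the iteration count, and the fixed random seed are all assumptions chosen for illustration, and the sketch uses only the Python standard library rather than a point cloud library.

```python
import random


def plane_from_points(p, q, r):
    """Return (unit normal n, offset d) of the plane n.x + d = 0 through three points.

    Returns None when the three points are (numerically) collinear.
    """
    u = [q[i] - p[i] for i in range(3)]
    v = [r[i] - p[i] for i in range(3)]
    n = [u[1] * v[2] - u[2] * v[1],
         u[2] * v[0] - u[0] * v[2],
         u[0] * v[1] - u[1] * v[0]]
    norm = sum(c * c for c in n) ** 0.5
    if norm < 1e-12:
        return None
    n = [c / norm for c in n]
    d = -sum(n[i] * p[i] for i in range(3))
    return n, d


def ransac_plane(points, threshold=0.01, iterations=200, seed=0):
    """Fit a plane to 3D points with RANSAC (threshold in metres, hypothetical default)."""
    rng = random.Random(seed)
    best_model, best_inliers = None, []
    for _ in range(iterations):
        model = plane_from_points(*rng.sample(points, 3))
        if model is None:
            continue  # skip degenerate samples
        n, d = model
        # A point is an inlier when its distance to the candidate plane is small.
        inliers = [pt for pt in points
                   if abs(sum(n[i] * pt[i] for i in range(3)) + d) < threshold]
        if len(inliers) > len(best_inliers):
            best_model, best_inliers = model, inliers
    return best_model, best_inliers


# Synthetic wall at y = 2.0 m plus two outliers in front of it.
wall = [(0.1 * ix, 2.0, 0.1 * iz) for ix in range(10) for iz in range(10)]
cloud = wall + [(0.5, 1.0, 0.5), (0.2, 1.5, 0.3)]
(normal, offset), inliers = ransac_plane(cloud)
```

The recovered normal, together with the 3D position of the detected circle center, is the kind of information the text uses to define the "Target_Socket" frame.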

Geometrical Modeling of the Linear Flexible Objects
It is challenging to find specific irregular objects in a noisy environment. In this task, a real-time object detection method called YOLO [32] is used for detecting the flexible power cable in the 2D images (Figure 3(left)). The process requires training a convolutional neural network (CNN) with self-labeled images. We trained the network for 40,000 steps with 100 labeled images. The output of YOLO is a bounding box that narrows down the search region for the cable. To get the pixels corresponding to the object, we then use the color information and detect the black pixels in this region (Figure 3(middle)). These pixels are stored in a set in the order from top to bottom and left to right, and we define the middle pixel as the center pixel of the object, which is marked as "Center" in Figure 3(left). By integrating the depth information, we can get the 3D position of the object center. A PassThrough filter from the Point Cloud Library (PCL) [33] was used to filter the point cloud along all three dimensions as shown in Figure 3(right). The point cloud in the narrowed bounding box serves to model the cable. It is published through ROS at a frequency of 30 Hz. In Figure 3(right), the remaining points in the filtered point cloud are shown in white with respect to the world frame. The x, y, and z axes of the world frame are represented by the red, green, and blue bars, respectively. Modeling the cable in 3D space is still a research problem in linear flexible object manipulation. On the other hand, modeling a curved line in 2D space is a solved problem. We can exploit this by projecting the 3D curve onto 2D spaces. In this study, we project the 3D point cloud onto the y-x and y-z planes. However, the projection in 2D may have multiple x or z values corresponding to the same y coordinate. For example, if the cable has a "⊃" shape, we filter out the points below the rightmost point in order to get a unique projection model. 
Moreover, we assume that the cable is not in more complex shapes, e.g., "S" or "α" shape. Due to the relatively high stiffness and large bending radius of the power cables, we found that a quadratic polynomial equation is sufficient to represent their shapes:
$$x = a_0 + a_1 y + a_2 y^2, \tag{1}$$
$$z = b_0 + b_1 y + b_2 y^2. \tag{2}$$
The polynomial coefficients $a_0$, $a_1$, $a_2$, $b_0$, $b_1$, and $b_2$ are estimated by using the least squares method, and they continuously change based on the shape of the power cable. In order to model the cable in 3D, we uniformly sample the points in Figure 3(right) along the y-axis. The corresponding x and z values can be calculated by using Equations (1) and (2). This projection method is efficient, and the model can be published through ROS at a frequency of 29.997 Hz. Figure 4 shows an example of a modeling result with 10 sample points. The power cable is visualized in Rviz by markers. Green markers represent the sampling points, consecutive points are connected by blue lines, and vertical upward red lines are used to illustrate the displacements between the points. In order for the robot to detect and track the location of the cable and cable tip, we propose a piece-wise linear model for the object. In Figure 4, we denote the leftmost point as $p_1$ and the remaining points as $p_2, p_3, \ldots, p_N$ from left to right, where $N$ is the number of sample points. We denote the leftmost blue line that connects $p_1$ and $p_2$ as $l_1$ and the rest of the blue lines as $l_2, l_3, \ldots, l_{N-1}$. We assume that the line $l_i$ is the tangent at the point $p_i$. Because the cable tip has a round shape, its roll angle can be neglected (we set it to 0). After setting the point $p_1$ as the origin and defining the x axis along the tangent at the point $p_1$, the "cable_tip" frame is defined in Figure 5(left). Now that the positions of all the sample points and the pose of the cable tip are known with respect to the robot frame, a feasible grasp pose for the robot can be calculated to GRASP the cable. 
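The projection model can be sketched as follows. This is an illustrative sketch only: it fits the two quadratics of Equations (1) and (2) by solving the least-squares normal equations in plain Python and then samples points uniformly along the y-axis. The function names `fit_quadratic` and `model_cable` and the synthetic data are assumptions, not part of the original system.

```python
def fit_quadratic(ys, vs):
    """Least-squares fit of v = c0 + c1*y + c2*y**2 via the normal equations."""
    # Power sums S[k] = sum(y**k) and right-hand side T[k] = sum(v * y**k).
    S = [sum(y ** k for y in ys) for k in range(5)]
    T = [sum(v * y ** k for y, v in zip(ys, vs)) for k in range(3)]
    A = [[S[i + j] for j in range(3)] + [T[i]] for i in range(3)]
    # Gauss-Jordan elimination with partial pivoting on the augmented matrix.
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(3):
            if r != col:
                f = A[r][col] / A[col][col]
                A[r] = [a - f * b for a, b in zip(A[r], A[col])]
    return [A[i][3] / A[i][i] for i in range(3)]


def model_cable(points, n_samples=10):
    """Fit x(y) and z(y) to a filtered point cloud and sample the 3D curve.

    points: iterable of (x, y, z) tuples; returns (a, b, samples).
    """
    ys = [p[1] for p in points]
    a = fit_quadratic(ys, [p[0] for p in points])  # coefficients a0, a1, a2
    b = fit_quadratic(ys, [p[2] for p in points])  # coefficients b0, b1, b2
    y_min, y_max = min(ys), max(ys)
    samples = []
    for i in range(n_samples):
        y = y_min + (y_max - y_min) * i / (n_samples - 1)
        samples.append((a[0] + a[1] * y + a[2] * y * y,
                        y,
                        b[0] + b[1] * y + b[2] * y * y))
    return a, b, samples


# Synthetic cable points generated from known coefficients.
ys_demo = [i / 20 for i in range(21)]
cloud = [(0.1 + 0.2 * y + 0.3 * y * y, y, 0.5 - 0.4 * y - 0.6 * y * y)
         for y in ys_demo]
a, b, samples = model_cable(cloud)
```

With noise-free synthetic data, the fitted coefficients should match the generating ones up to floating-point error; on real point clouds the fit would be refreshed at every update.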
Because the plug is inside the socket, we consider only the sampling points on the cable except the cable tip as the grasp point candidates. To select the grasp point on the cable, the following trade-off needs to be taken into account. As the grasp point gets closer to the cable tip, the risk of the robot colliding with the socket or the wall increases and the number of points in the filtered point cloud reduces. On the other hand, when the grasp point is too far away from the cable tip, the robot cannot align the cable tip pose with the target socket pose because the cable will dangle substantially. Thus, the grasp point needs to be selected empirically to meet these two constraints. For this purpose, we pre-define a minimum distance from the grasp point to the cable tip ($p_1$) along the cable as $d_{\min}$, and a maximum distance as $d_{\max}$. $d_{\min}$ and $d_{\max}$ are pre-defined with the robot holding the cable before the experiment. With this step, the cable deformation characteristics can be ignored for selecting the grasp point, and this vision-only method is sufficient for selecting a feasible grasp point. Since we can calculate the length of the lines $l_1, \ldots, l_{N-1}$, we can also calculate the distance $d_s$ from each sampling point $p_s$ to the cable tip by using:
$$d_s = \sum_{i=1}^{s-1} \lVert l_i \rVert,$$
where $\lVert l_i \rVert$ is the length of the line $l_i$. The grasp point needs to be selected such that $d_s \in [d_{\min}, d_{\max}]$ for the robot to be able to align the cable tip pose with the target pose. By using the same method of estimating the "cable_tip" frame, we can get the "cable" frame (Figure 5(right)), which is used to get the grasp pose of the robot's end-effector. The grasp pose is then used as the target of the motion planner. We implemented both MoveIt! [34] and TrajOpt [35] for the motion planning in this task.
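The grasp point selection can be sketched as below. The cumulative distance follows the piece-wise linear model, and the preference for the candidate closest to the midpoint of the allowed range follows the rule stated in the experiments section. The function names and the straight synthetic cable are illustrative assumptions.

```python
import math


def distances_to_tip(samples):
    """Distance along the cable from each sample point to the tip p_1.

    The distance for the s-th point is the summed length of l_1 .. l_{s-1}.
    """
    d = [0.0]
    for p, q in zip(samples, samples[1:]):
        d.append(d[-1] + math.dist(p, q))
    return d


def select_grasp_index(samples, d_min, d_max):
    """Return the index of the grasp point, or None if no candidate qualifies.

    The cable tip itself (index 0) is never a candidate because the plug is
    inside the socket. Among candidates, the one closest to the midpoint of
    [d_min, d_max] is preferred.
    """
    d = distances_to_tip(samples)
    mid = 0.5 * (d_min + d_max)
    candidates = [i for i in range(1, len(samples)) if d_min <= d[i] <= d_max]
    if not candidates:
        return None
    return min(candidates, key=lambda i: abs(d[i] - mid))


# Straight synthetic cable with 10 sample points spaced 5 cm apart along y.
cable = [(0.0, 0.05 * i, 0.0) for i in range(10)]
grasp = select_grasp_index(cable, d_min=0.18, d_max=0.30)  # power cable limits
```

Here the midpoint of the range is 0.24 m, so the sample roughly 0.25 m from the tip is chosen.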
Once grasping of the cable is completed, a pulling motion along the y axis of the "world" frame is generated to UNPLUG the cable. Figure 6a shows the "cable_tip" frame after the robot unplugs the power cable. We filter out the point cloud of the plug during modeling which improves the accuracy of the model because the plug is a straight and rigid object.

Pose Alignment Controller
In order to finish the INSERTION task, the robot needs to adjust the pose of the cable tip to match with the pose of the target socket. A PRE-INSERT frame right in front of the target socket is defined as our target pose for the pose alignment controller. Because of the continuous visual feedback, we decide to use the visual servoing approach to align the cable tip pose with the target socket pose.
To explain this algorithm in a succinct way, we denote the transformation matrix from the "cable_tip" frame to the PRE-INSERT frame as $T^{pre}_{ct}$, the transformation matrix from the PRE-INSERT frame to the "end-effector" frame as $T^{ee}_{pre}$, and the transformation matrix from the "cable_tip" frame to the "end-effector" frame as $T^{ee}_{ct}$. $[\Delta x\ \Delta y\ \Delta z\ \Delta\alpha\ \Delta\beta\ \Delta\gamma]$ are the translational (x, y, z) and rotational (roll, pitch, yaw) deviations between the "cable_tip" frame and the PRE-INSERT frame under the "end-effector" frame, which can be calculated as
$$[\Delta x\ \Delta y\ \Delta z\ \Delta\alpha\ \Delta\beta\ \Delta\gamma]^T = [x_1 - x_2\ \ y_1 - y_2\ \ z_1 - z_2\ \ \alpha_1 - \alpha_2\ \ \beta_1 - \beta_2\ \ \gamma_1 - \gamma_2]^T,$$
where $x_1$, $y_1$, $z_1$, $\alpha_1$, $\beta_1$, and $\gamma_1$ are translational and rotational terms from the transformation $T^{ee}_{pre}$ under the "end-effector" frame, and $x_2$, $y_2$, $z_2$, $\alpha_2$, $\beta_2$, and $\gamma_2$ are translational and rotational terms from the transformation $T^{ee}_{ct}$ under the "end-effector" frame. Then the linear and angular velocities under the "end-effector" frame can be calculated [36]:
$$[\dot{x}\ \dot{y}\ \dot{z}\ w_x\ w_y\ w_z]^T = \frac{1}{\Delta t}[\Delta x\ \Delta y\ \Delta z\ \Delta\alpha\ \Delta\beta\ \Delta\gamma]^T,$$
where $\Delta t$ is the execution time, and $\dot{x}$, $\dot{y}$, $\dot{z}$, $w_x$, $w_y$, and $w_z$ are linear and angular velocities under the "end-effector" frame, which will be used as inputs for our PD controller. A PD controller was designed to calculate the Cartesian velocity ($\dot{\mathbf{x}}$) for the robot's end-effector:
$$\dot{\mathbf{x}} = K_p e + K_d \dot{e},$$
where $e$ is the error, which is $[\dot{x}\ \dot{y}\ \dot{z}\ w_x\ w_y\ w_z]^T$, and $K_p$ and $K_d$ are the proportional and derivative gains.
Our designed values for $K_p$ and $K_d$ are 2.0 and 0.2. After getting the Cartesian velocity from our PD controller, a velocity controller built into the Jaco robotic arm generates the controlled joint velocity ($\dot{q}$) by using the robot Jacobian matrix ($J$):
$$\dot{q} = J^{-1}\dot{\mathbf{x}}.$$
Algorithm 1 shows the pose alignment control loop. The control loop keeps running until the translational deviations ($\Delta x_e$, $\Delta y_e$, $\Delta z_e$) and the rotational deviations ($\Delta\alpha_e$, $\Delta\beta_e$, $\Delta\gamma_e$) from the "cable_tip" frame to the PRE-INSERT frame ($T^{pre}_{ct}$) satisfy the thresholds, which are 0.01 m and 0.02 rad. A successful example of the robot aligning the cable to the PRE-INSERT frame is shown in Figure 6b. For the final INSERTION step, a translation along the x axis of the "cable_tip" frame is applied. The insertion is facilitated by the magnets on the plug and in the socket.
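The structure of the pose alignment loop can be sketched with a toy simulation. This is a simplified illustration, not Algorithm 1 itself. It assumes that the pose deviation is used directly as the PD error, that the commanded Cartesian velocity moves the cable tip rigidly (no cable deformation and no Jacobian mapping), that angles do not wrap, and that the loop runs at the 30 Hz point cloud rate. The gains (2.0, 0.2) and the stopping thresholds (0.01 m, 0.02 rad) are the values reported in the paper; all function names are hypothetical.

```python
def pd_step(error, prev_error, dt, kp=2.0, kd=0.2):
    """PD law applied element-wise to a 6D error vector."""
    return [kp * e + kd * (e - pe) / dt for e, pe in zip(error, prev_error)]


def aligned(dev, trans_tol=0.01, rot_tol=0.02):
    """True when translational (m) and rotational (rad) deviations are within tolerance."""
    return (all(abs(d) < trans_tol for d in dev[:3])
            and all(abs(d) < rot_tol for d in dev[3:]))


def align(tip, target, dt=1.0 / 30.0, max_steps=1000):
    """Drive a 6D tip pose [x, y, z, roll, pitch, yaw] toward the target pose.

    Returns the final tip pose and the number of control steps used.
    """
    tip = list(tip)
    prev = [t - c for t, c in zip(target, tip)]
    for step in range(max_steps):
        dev = [t - c for t, c in zip(target, tip)]
        if aligned(dev):
            return tip, step
        vel = pd_step(dev, prev, dt)
        # Assumption: the tip follows the commanded velocity exactly.
        tip = [c + v * dt for c, v in zip(tip, vel)]
        prev = dev
    return tip, max_steps


pre_insert = [0.30, -0.20, 0.10, 0.0, 0.5, -0.4]
final_tip, steps = align([0.0] * 6, pre_insert)
final_dev = [t - c for t, c in zip(pre_insert, final_tip)]
```

On the real system the deviation would be re-estimated from the cable model at every point cloud update, which is what makes the loop tolerant to cable deformation and disturbances.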

Task Automation in Simulation Based on Our Physical Model of the Linear Flexible Objects
Simulation plays a very important role in robotics [37]. Various simulators (e.g., Gazebo, OpenRAVE, MuJoCo) are used to design the robot model, create different simulation environments, analyze the kinematics and dynamics of the robot, design different plan or control algorithms, investigate the performances of the system, etc. The construction of a real environment is usually more expensive than building a simulation environment. In simulation, we can build objects with their geometric and physical properties. We can also design new robot models or import existing robot models to complete a series of manipulation tasks. Usually, the robot needs to make physical contact with the environment and objects when completing the manipulation tasks. Learning to use the robot to manipulate objects in simulation can help with avoiding damage to the robot, the environment, and humans.
Given our team's participation in the DRC Finals [14] and the detailed definition and requirements of this task, we decided to build a complete simulation environment and complete this task in simulation for our research study.

Simulation Environment Setup
The simulation setup consists of a wall, two power sockets, a power cable with a plug, a Kinova Jaco V2 robotic arm, and a Kinect camera. Figure 7 shows the simulation setup in the Gazebo simulator [38]. Modeling of rigid objects in the simulator is a solved problem: The wall can be modeled as a box and the plug can be fitted as a cylinder. Modeling objects of irregular shape is relatively difficult, but we can get the model in a solid modeling computer-aided design tool (e.g., SolidWorks) and then use a converter (e.g., SolidWorks to URDF Exporter) to get the model in the desired format. The RGB-D camera in the simulated environment is a Kinect camera. To read the point cloud information, we need to model the Kinect camera as a Depth Camera Plugin in Gazebo. The simulated Jaco arm is cloned from the official Kinova GitHub website. In simulation, we assume that the positions of the sockets are known. Modeling of the cable in the simulator is a challenge and it will be introduced in the next subsection.

Physical Model of the Linear Flexible Objects
Inspired by the "fire hose" model in Gazebo and the Piecewise Constant Curvature (PCC) model [39], a Dynamically-Consistent Augmented Formulation [40] of the PCC cable model is created in Figure 8a. A Gazebo model of the cable is also built based on the PCC model (see Figure 8b). Fifteen links are used to model the cable, each link is 5 cm long and weighs 50 g with the center of mass in the middle of the arc. The plug is also a link with length 10 cm and weight 100 g. These links are connected by revolute joints (except the joint between the plug and link-1) with the roll and pitch rotations. Each joint has stiffness and damping properties. There exists a trade-off in selecting the number of links for the simulated cable. In theory, more links of the cable model will produce more accurate simulation results. On the other hand, more links will cause the computing speed to slow down or even the simulator to crash. We observed that if the number of links is increased to twenty, sometimes there exists a misalignment between the links.

Sim2Real2Sim-Achieving Flexible Object Manipulation in Simulation with Identified Physical Model
The gap between simulation and real world was introduced in Simulation-to-Real (Sim2Real) [41][42][43][44]. Simulation is observed to have inevitable simplifications with heavy optimization. Besides, there exist physical events not modeled in simulators and parameters of the simulated models that need to be identified. Therefore, policies or parameters learned [45] and controllers designed [43] in simulation can not be transferred to the real world directly.
To reduce the gap between simulation and real world especially in flexible object manipulation, we propose a new strategy called Simulation-to-Real-to-Simulation (Sim2Real2Sim). Figure 9 shows the flowchart of the Sim2Real2Sim strategy we will use for bridging the gap. We start with a rough simulation environment with the estimated models. Then we test the system framework in the real world based on the methods developed in simulation and collect the data from the real world. Finally, we go back to the simulation and update the models and methods based on the data from the real world. The research approach from simulation to real world implementation is widely applied in humanoid robot [46], mobile robot [47], robotic arm [48], etc. Normally, we complete a task in simulation first, then transfer the methods used in simulation to the real world. This step is named "simulation to real world" in the Sim2Real2Sim strategy. Building a simulation environment with all rigid objects is well-developed and these objects have nearly the same behavior in simulation and real world. However, simulation of soft objects usually simplifies the model; the gap between simulation and real world can not be ignored. Our proposed Sim2Real2Sim approach is for solving this issue. More specifically, we use real world data to improve the simulated models and we call this step "real world to simulation". Therefore, our idea of Sim2Real2Sim is like a research strategy and a generalized methodology to reduce the gap between simulation and the real world and to obtain a high-fidelity simulation environment for flexible object manipulation.

Simulation to Real World
Led by the Sim2Real2Sim strategy, we build an initial simulation environment based on the task requirements (see Figure 7). Recall that the DRC Plug Task can be divided into five phases (INITIALIZE,  GRASP, UNPLUG, PRE-INSERT, and INSERT). In the INITIALIZE and GRASP phases, we use the same geometrical model of the cable mentioned in Section 2.2 to estimate the cable tip pose and select the grasp point. After completing the GRASP of the cable in simulation with 30 successful experiments, we apply the same motion planning method to the real robot, and as expected, the real robot can GRASP the cable at the grasp point autonomously.
UNPLUGGING the cable from the socket is straightforward with a backward panning motion for the end-effector. After the cable is pulled out of the socket, the weight of the front section (plug) of the cable will cause the cable to dangle. In our model (see Figure 8), this deformation depends on the joint stiffness and damping. We observe that the deformation of the cable varies when the values for the joint stiffness and damping are set randomly. Two special cases are: (1) the cable dangles too much (see Figure 10a); (2) the cable barely dangles (see Figure 10b). However, our geometric model works well in both cases (see Figure 11). To get reasonable values for the joint stiffness and damping and to narrow the gap between simulation and real world, we need to collect data from the real world and go back to simulation to update the models and methods. To automate the Plug Task, we need to figure out how to carry the cable to the PRE-INSERT position after the cable is UNPLUGGED from the original socket. We use the visual servoing approach mentioned in Section 2.3 to reduce the differences between the "cable_tip" frame and the PRE-INSERT frame. After the cable tip moves to the PRE-INSERT position, a forward panning motion for the end-effector will INSERT the cable tip (plug) into the target socket.

Real World to Simulation
After implementing the methods developed in simulation to real world, we realized that the deformation of the cable in simulation and real world is different after the robot unplugs the cable from the socket, which means our physical model in simulation needs to be tuned to match with the real world.
As we mentioned before, the cable is modeled as a rigid manipulator with passive revolute joints (see Figure 8). The dynamics of the cable can be represented as:
$$M\ddot{q} + C\dot{q} + G = \tau + J^T f_{ext} - Kq - D\dot{q}, \tag{9}$$
where $q$, $\dot{q}$, and $\ddot{q}$ represent joint position, velocity, and acceleration. $M$ is the inertia matrix. $C$ is the centrifugal and Coriolis forces matrix. $G$ is the vector of gravitational forces or torques. $J^T$ is the transpose of the robot Jacobian. $f_{ext}$ is the external force. $K$ is the stiffness and $D$ is the damping. $\tau$ is a vector of joint torques, corresponding to the torques and forces applied by the actuators at the joints. Our goal is to estimate the stiffness and damping in this equation.
Recalling that each joint of the cable has roll and pitch rotations (see Figure 8), we simplify the model by only considering the pitch. Our first challenge is to get the joint values. AprilTag [49] is used for getting the joint positions. Figure 12 shows the start and end points of recording the joint positions with a 100 g weight added to the tip. Four AprilTags with side lengths of 2.0 cm [50] are attached to the cable 5 cm apart (same as the link length in Gazebo). The joint position $q$ and the time stamp are recorded for calculating the joint velocity $\dot{q}$ and acceleration $\ddot{q}$. After we get the joint values, the robot Jacobian $J$ can be calculated. Because the joints on the cable are passive, $\tau$ equals 0. The external force $f_{ext}$ is also known by multiplying the mass of the weight (0.1 kg) by the gravitational acceleration (9.8 m/s$^2$). Computing the left three terms ($M\ddot{q} + C\dot{q} + G$) in Equation (9) is our second challenge. We refer to the Recursive Newton Euler (RNE) approach [51][52][53]. The RNE approach includes two recursions: forward recursion for computing velocities and accelerations, and backward recursion for computing forces and torques. Figure 13 shows the parameters of three adjacent links and Algorithm 2 shows the detailed RNE algorithm. The RNE algorithm gives us the torque required for each joint to overcome the gravity of the model and the force generated from the motion if the cable joints have no friction or damping and there is no external force. The left three terms ($M\ddot{q} + C\dot{q} + G$) in Equation (9) can be represented as:
$$M\ddot{q} + C\dot{q} + G = \tau_{RNE}, \tag{10}$$
where $\tau_{RNE}$ equals $[\tau_1, \tau_2, \ldots, \tau_n]^T$ ($n = 4$) calculated from the RNE method. OpenRAVE [54] is used for computing the robot Jacobian $J$ and the inverse dynamic terms $M\ddot{q} + C\dot{q} + G$. Figure 14 shows the physical model built in OpenRAVE. We pick the first four links of the cable for modeling because the cable tip pose is not guaranteed to be aligned with the socket pose if the grasp point is beyond this range. 
The robot Jacobian and inverse dynamics are computed based on this model. At this point, we can rewrite Equation (9) as:
$$Kq + D\dot{q} = J^T f_{ext} - \tau_{RNE}. \tag{11}$$
The terms on the right-hand side of Equation (11) are known. To get the stiffness $K$, we use the robot configuration when the joint velocity $\dot{q}$ equals 0. In our case, we use the joint values when the cable is finally stationary after adding the weight (see Figure 12b). The stiffness $K$ can be computed by using Equation (12) with $\dot{q} = 0$:
$$K = (J^T f_{ext} - \tau_{RNE})\, q^{+}, \tag{12}$$
where $q^{+}$ is the pseudoinverse of $q$.
After getting the stiffness $K$, we substitute it into Equation (11). The damping $D$ can then be calculated as:
$$D = (J^T f_{ext} - \tau_{RNE} - Kq)\, \dot{q}^{+}, \tag{13}$$
where $\dot{q}^{+}$ is the pseudoinverse of $\dot{q}$. Unlike the calculation for stiffness, we use joint values prior to the cable entering its final stationary position to ensure that the joint velocity is not 0. Finally, we update the Gazebo model based on the calculated stiffness and damping. Recall that each joint on the cable also has the roll rotation. We can use the same method to identify the stiffness and damping properties if we can figure out how to add a constant twist torque on the cable tip. This part is left as future work. Currently, we use the joint limits to set the roll rotation based on the observation. Figure 15 shows the deformation of the cable with the updated model parameters in the simulation. A detailed comparison of the model deformation in simulation with the cable deformation in the real world will be discussed in Section 4.
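The stiffness and damping identification can be sketched numerically as follows. This is a sketch under stated assumptions: the right-hand side $J^T f_{ext} - \tau_{RNE}$ is taken as an already computed vector (OpenRAVE is not called here), the pseudoinverse of a nonzero column vector $v$ is taken as $v^T / (v^T v)$, and all numeric values and function names are made up for illustration.

```python
def times_pinv(r, v):
    """Return the matrix r * v^+ for column vectors r and v.

    For a nonzero column vector v, the pseudoinverse is v^T / (v^T v),
    so r * v^+ is a rank-one matrix that maps v back to r.
    """
    vv = sum(x * x for x in v)
    if vv == 0.0:
        raise ValueError("cannot identify parameters from a zero vector")
    return [[ri * vj / vv for vj in v] for ri in r]


def matvec(A, x):
    """Matrix-vector product for a list-of-rows matrix."""
    return [sum(a * b for a, b in zip(row, x)) for row in A]


def identify_stiffness(rhs_static, q_static):
    """K from the stationary configuration, where the joint velocity is zero."""
    return times_pinv(rhs_static, q_static)


def identify_damping(rhs_moving, K, q, q_dot):
    """D from a moving configuration, after removing the stiffness torque K q."""
    residual = [r - k for r, k in zip(rhs_moving, matvec(K, q))]
    return times_pinv(residual, q_dot)


# Hypothetical measurements for the four modeled joints.
q_static = [0.10, 0.20, 0.30, 0.40]    # rad, cable at rest with the weight
rhs_static = [0.50, 0.40, 0.30, 0.20]  # N m, known right-hand side at rest
q_moving = [0.05, 0.10, 0.15, 0.20]    # rad, before reaching rest
qd_moving = [0.30, 0.20, 0.10, 0.05]   # rad/s
rhs_moving = [0.40, 0.30, 0.20, 0.10]  # N m, known right-hand side in motion

K = identify_stiffness(rhs_static, q_static)
D = identify_damping(rhs_moving, K, q_moving, qd_moving)
```

By construction, the identified matrices reproduce the known right-hand side at the two configurations used; how well they generalize to other motions is what the comparison against real cable deformation would assess.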

Experiments and Results
Our system framework for task automation is evaluated by running the DRC Plug Task both in simulation and real world for 20 trials. In simulation, the average completion time is 66.25 s with an average real-time factor: 0.65. Figure 16 shows the robot completing the DRC Plug Task in simulation. In real world, the average completion time is 31.53 s. Figure 17 shows the robot completing the DRC Plug Task in the real world. We also test our automation approach with a different robot platform which is a Kinova Gen3 7-DOF arm. Figure 18 shows the Gen3 robot arm completing the DRC Plug Task in the real world. In order to validate the performance of our geometrical modeling of the cable, pose alignment controller, and the cable model in simulation, we conduct different experiments in each aspect.

Validation of the Geometrical Modeling of the Cable and the Pose Alignment Controller
To validate the performance of our modeling method and pose alignment controller, we implement three experiments. The first experiment is for testing the grasp functionality with different initial poses of the cable. The second experiment shows the performance of the pose alignment controller with two different cables. The last experiment shows the robustness of our pose alignment controller under external disturbances.
The first experiment tests the grasp functionality with different initial poses of the cable, modeled by our geometrical modeling method. Recalling the trade-off mentioned in Section 2.2, we need to pick values of $d_{\min}$ and $d_{\max}$ for each cable. For the power cable (with plug), we set $d_{\min}$ = 18 cm and $d_{\max}$ = 30 cm. For the HDMI cable, we select $d_{\min}$ = 12 cm and $d_{\max}$ = 24 cm. When performing tasks, our method prioritizes the sample point closest to the midpoint of this range as the grasp point. Figure 19 shows that the robot is able to grasp the cable from three different initial states. Figure 20 shows the modeling of the cable for these three initial poses, and Table 1 lists the corresponding model parameters.
The second experiment evaluates the performance of the pose alignment controller with two different cables. Because the point cloud is published at only 30 Hz, we limit the linear and angular velocities in Cartesian space to 1.5 m/s and 0.6 rad/s, respectively, to prevent substantial deformation of the cable between point cloud updates. We run the pose alignment controller with two different cables (10 trials each). Figure 21 shows two successful runs from these experiments. Figure 22 shows the pose differences between the cable tip frame and the pre-insert frame. The pose adjustment control loop ends when the pose differences meet the thresholds, which are 0.01 m for the translational terms and 0.02 rad for the rotational terms. From this experiment, we conclude that the HDMI cable (average elapsed time: 12.26 s) needs more time than the power cable (average elapsed time: 5.59 s) to adjust the pose. A potential reason is that the HDMI cable deforms more than the power cable.
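The termination condition and velocity limits of the pose alignment loop can be sketched as follows. The numerical limits are the ones quoted above; the function names, the 6-vector pose-difference layout, and the choice to apply the velocity limits to the norms of the linear and angular parts are assumptions made for illustration, not details taken from our implementation.

```python
import numpy as np

TRANS_TOL = 0.01   # m, translational threshold quoted in the text
ROT_TOL = 0.02     # rad, rotational threshold quoted in the text
MAX_LIN = 1.5      # m/s, linear velocity limit quoted in the text
MAX_ANG = 0.6      # rad/s, angular velocity limit quoted in the text

def pose_aligned(delta):
    """delta = [dx, dy, dz, droll, dpitch, dyaw] between tip and pre-insert frames."""
    delta = np.asarray(delta, dtype=float)
    return bool(np.all(np.abs(delta[:3]) <= TRANS_TOL)
                and np.all(np.abs(delta[3:]) <= ROT_TOL))

def clamp_twist(twist):
    """Scale the linear and angular parts so that their norms respect the limits."""
    twist = np.asarray(twist, dtype=float).copy()
    for part, limit in ((slice(0, 3), MAX_LIN), (slice(3, 6), MAX_ANG)):
        norm = np.linalg.norm(twist[part])
        if norm > limit:
            twist[part] *= limit / norm   # keep direction, shrink magnitude
    return twist
```

A control loop built on these helpers would recompute `delta` at each point cloud update, command `clamp_twist(...)` of the desired Cartesian velocity, and stop once `pose_aligned(delta)` returns true.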
The last experiment evaluates the robustness of the pose alignment controller by adding disturbances during the pose alignment control loop and testing whether our algorithm can still reach the target. Figure 23 shows the different disturbances applied to the cable tip while the system is in the pose alignment loop, together with the final pose of the cable after the loop, which demonstrates that our pose alignment controller is able to resist such external disturbances. These results demonstrate that our method for modeling linear flexible objects can provide a 3D model with adaptive parameters based on visual feedback. This model is capable of representing the various shapes of the cable during the Plug Task. More importantly, our pose alignment algorithm is robust enough to handle different cable-like objects and is capable of resisting disturbances.

Validation of the Cable Model in Simulation
To validate the model in simulation, we designed two experiments. First, we fix the same link on the cable to the world and compare the joint positions of the cable in simulation and in the real world. Second, we fix different links on the cable horizontally and compare the pose of the cable tip in simulation and in the real world. In simulation, a link of the simulated cable can be fixed to the world by adding a "fixed" joint connecting the "world" link and the link to be fixed. In the real world, we fix the cable with a zip-tie. Figure 24 shows the setup in simulation and in the real world for comparing the cable deformation. AprilTags are attached to the cable tip and to the first four revolute joints to obtain the transformation at each joint. Figure 25 shows the visualized frames in Rviz; the "tag_1" frame is equivalent to the "cable_tip" frame.
Our first experiment compares the deformation of the cable when the same link is fixed horizontally to the world. We use the setup in Figure 24 and obtain the frames ("tag_1", "tag_2", . . . , "tag_5") both in simulation and in the real world (see Figure 25). The transformations of these frames are used to obtain the joint positions for the comparison between simulation and the real world. We first compare the joint positions of the cable with no external force; we then add different weights to the cable tip and compare the differences in the joint positions under these disturbances. Figure 26 shows the comparison result when link-5 (the link after tag-5) on the cable is fixed to the world. From Figure 26, we conclude that the shape of the cable in simulation is close to that of the real cable: the largest difference in joint positions is 0.005 rad, and the largest percent error between simulation and the real world is 2.00%.
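One way to obtain joint positions from the tag frames is sketched below. It assumes that all tag poses are expressed in one common frame as 4x4 homogeneous transforms, that the tags are ordered along the cable, and that each revolute joint rotates about the local z-axis. The helper names and the synthetic angles are hypothetical and serve only to check the computation.

```python
import numpy as np

def link_step(theta, length=0.05):
    """Rotate by theta (rad) about z, then translate along the new x-axis."""
    c, s = np.cos(theta), np.sin(theta)
    T = np.eye(4)
    T[:2, :2] = [[c, -s], [s, c]]
    T[:3, 3] = T[:3, :3] @ np.array([length, 0.0, 0.0])
    return T

def joint_angles_from_tags(tag_poses):
    """Relative z-rotation between consecutive tag frames.

    tag_poses: list of 4x4 transforms of the tags in one common frame,
    ordered along the cable.
    """
    angles = []
    for T_a, T_b in zip(tag_poses[:-1], tag_poses[1:]):
        T_rel = np.linalg.inv(T_a) @ T_b   # pose of frame b seen from frame a
        angles.append(np.arctan2(T_rel[1, 0], T_rel[0, 0]))
    return np.array(angles)

# Synthetic chain of five tag frames with made-up joint angles.
true_angles = [0.10, 0.05, -0.02, 0.03]
poses = [np.eye(4)]
for a in true_angles:
    poses.append(poses[-1] @ link_step(a))
recovered = joint_angles_from_tags(poses)
```

Running the same extraction on the frames from simulation and from the real cable yields two sets of joint positions that can be compared joint by joint.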
To test the simulation model under external disturbances, we add two different weights (50 g and 100 g) to the cable tip and compare the joint positions of the simulated cable with those of the real cable (see Figures 27 and 28). The percent errors are reasonable, which supports the accuracy of our simulated cable model.
Our second experiment fixes different links on the cable horizontally to the world, in simulation and in the real world, to compare the deformation at the cable tip. We compare the rotation angle around the z-axis from the frame fixed to the world to the "tag_1" frame; that is, we compare the sagging angle of the cable tip in simulation and in the real world. As in the first experiment, three comparisons are made for validation. Figure 29 shows the comparison result when different links on the cable are fixed to the world. From Figure 29, we conclude that the difference in the sagging angle between the simulation model and the real cable is within a reasonable range (<0.04 rad), and the largest percent error is 3.38%.
We also run this experiment with two different weights (50 g and 100 g) added to the cable tip. Figures 30 and 31 show the results. In the case of a 50 g weight added to the cable tip, the maximum difference of the cable sagging angle between the simulation and the real world is 0.02 rad, and the largest percent error is 3.12%. If a 100 g weight is added to the cable tip, the maximum difference of the cable sagging angle between the simulation and the real world is 0.048 rad, and the largest percent error is 4.11%.
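The two metrics reported in these comparisons can be computed as in the sketch below. It assumes that the percent error is normalized by the real-world measurement, which is an assumption about the definition; the function name and the sample angles are hypothetical and are not measurements from our experiments.

```python
import numpy as np

def compare_sim_real(sim, real):
    """Return (largest absolute difference, largest percent error).

    Percent error is ASSUMED to be |sim - real| / |real| * 100.
    """
    sim = np.asarray(sim, dtype=float)
    real = np.asarray(real, dtype=float)
    abs_diff = np.abs(sim - real)
    pct_err = 100.0 * abs_diff / np.abs(real)
    return abs_diff.max(), pct_err.max()

# Made-up sagging angles in rad, one per fixed link.
sim_angles = [0.50, 1.02]
real_angles = [0.52, 1.00]
max_diff, max_pct = compare_sim_real(sim_angles, real_angles)
```

Note that the largest absolute difference and the largest percent error need not occur at the same joint or link, since the percent error also depends on the magnitude of the real measurement.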

Conclusions
In this paper, we proposed a geometrical modeling method based on the curves on two projection planes for linear flexible objects subject to gravity. The geometrical model enables tracking the 3D curvature of the linear flexible object, the pose of the tip and the pose of the selected grasp point on the object. A robust pose alignment controller based on the geometrical model with adaptive parameters can bring the cable tip to a desired pre-insert position. We designed and used an autonomous system pipeline with our formulated methods to accomplish the DRC Plug Task autonomously.
We also proposed a new strategy called Sim2Real2Sim. This strategy includes two procedures: sim-to-real and real-to-sim. The sim-to-real procedure organizes the methods used in simulation and applies them to the real world in a reasonable way. The real-to-sim procedure updates the models in simulation, which bridges the gap between simulation and the real world. A novel identification method based on inverse dynamics is implemented to update the parameters of the simulated flexible objects. The comparison of the deformation of the flexible objects in simulation and in the real world supports the accuracy of the model in simulation. The success of the system framework running both in simulation and in the real world demonstrates the practicability and reliability of our proposed methods. Possible future work includes: (1) exploring different model representations of the cable shape for more complex deformations; (2) applying our approach to different robot platforms (e.g., a dual-arm system) or to other flexible object manipulation tasks (e.g., a cable routing assembly task); (3) conducting further evaluations of our Sim2Real2Sim strategy; (4) updating the simulation model of the linear flexible objects by adding more degrees of freedom (e.g., using more AprilTags); (5) bringing material and friction information from the real world into the simulation to obtain a more close-to-reality simulated cable model and a better simulation of the cable making contact with the environment.
Author Contributions: Methodology, P.C. and T.P.; Software, P.C.; Supervision, T.P.; Validation, P.C. and T.P.; Writing-original draft, P.C.; Writing-review & editing, P.C. and T.P. All authors have read and agreed to the published version of the manuscript.

Conflicts of Interest: The authors declare no conflict of interest. The funders had no role in the design of the study; in the collection, analyses, or interpretation of data; in the writing of the manuscript, or in the decision to publish the results.