Multi-Modal Sensing and Robotic Manipulation of Non-Rigid Objects: A Survey

This paper aims to provide a comprehensive survey of recent advancements in modelling and autonomous manipulation of non-rigid objects. It first summarizes the recent advances in sensing and modelling of such objects with a focus on describing the methods and technologies used to measure their shape and estimate their material and physical properties. Formal representations considered to predict the deformation resulting from manipulation of non-rigid objects are then investigated. The third part provides a survey of planning and control strategies exploited to operate dexterous robotic systems while performing various tasks on objects made of different non-rigid materials.


Introduction
In order for robots to reach the next milestone in task flexibility and come closer to a human-like skillset, an important task that remains largely unsolved is the proper handling of non-rigid objects. This includes any object that can change shape while being manipulated, such as rope, wires, metal plates, fabrics, sponges, rubber, organs and living tissues. The intelligent autonomous handling of such objects will allow automating many labour-intensive or delicate tasks, be it in industrial assembly, surgery or general household tasks.
Previous surveys conducted on robotic interaction with non-rigid objects include the work of Khalil and Payeur [1], which provided comprehensive coverage of the subject as well as a comparison with algorithms for handling rigid objects, and gave particular attention to multi-sensory feedback systems. Jiménez [2] presented an overview of modelling and control techniques, with a strong focus on the manipulation of planar objects such as thin sheets and cloth-like material. A very recent review by Sanchez et al. [3] also provides a broad coverage of the topic, focusing on the classification of control tasks based on the type of object being handled as well as on the specific subtask that is performed in the various steps of manipulation and shape control. However, the quantity and variety of recent advances on the topics of sensing and manipulation of non-rigid objects are such that a single review cannot hope to capture them all.
This survey complements the previous works by providing a review of many research works that were not covered in prior surveys. It also provides broad coverage on the topics of sensing, modelling and robotic control of non-rigid objects in a single work. In addition, more attention was given to works on 3D non-rigid objects, which is a research area that had not been widely addressed in surveys but has received important contributions in recent years.
This paper is divided into three parts. Section 2 summarizes the recent advances in sensory systems, with a focus on describing the technologies and configurations used to acquire and process sensor data. This section is organized following the characteristics of the technology used, namely vision sensing collected in the form of images, depth or both (Section 2.1); tactile sensing described by force, torque or any other physical interaction with the environment (Section 2.2); and the combination of various sensing modes (Section 2.3). The strategies used to describe the object are analyzed in Section 3, by considering aspects such as shape measurement and estimation of physical or geometrical properties of objects while investigating ways to represent and predict such behaviours. This section is organized according to approaches that rely on a formal model (Section 3.1) as a main descriptor and those that do not take a model into consideration (Section 3.2). Section 4 provides a survey of planning and control strategies exploited to support a robotic system in performing various tasks with non-rigid objects. The natural classification of research based on the shape of the objects being handled is reflected in the organization of this section. This shape classification includes linear objects such as ropes and wires (Section 4.1); planar objects such as thin plates and cloth-like material (Section 4.2); the outline or 2D projection of objects (Section 4.3); and full 3D objects (Section 4.4). Recent machine learning approaches are highlighted in Section 4.5, while papers that focus on more general or unique tasks are discussed in Section 4.6.
This organization of topics does not aim to build a formal and systematic classification of objects and tasks as was done in [3], but rather to group similar research in order to let the trends emerge naturally. Even though sensing and control are both influenced by the type and geometry of the targeted object, sensing appears to be much less affected by it than control. Moreover, most of the presented control strategies are not inherently dependent on a specific sensing technology or model type, if a formal model is used at all. In addition, several authors presented their works by focusing on either modelling or control in order to describe their proposed solution in a comprehensive way. It is therefore sensible to decouple sensing, modelling and control by organizing models and sensing technologies based on their characteristics while control strategies are grouped based on the task to which they are applied. This promotes a clear logical separation between the core components of any robotic system, i.e. sensing, modelling and control, which makes for a perspective that complements previous reviews and provides support for those who may be interested in a specific component of such a system.

Sensing and Data Processing
All the different approaches that have focused on the robotic manipulation of non-rigid objects have, at some point, had to handle one form or another of data, whether from the environment, the robot, the particular object or the interaction of all of them. The data coming from sensors must capture the intrinsic complexity of these systems, and in many cases the sensory infrastructure is as sophisticated as the systems themselves. Consequently, there exists a variety of aspects to consider [4] when designing this process, such as the sensor technologies used, localization and interaction of sensors with the environment, multi-sensor data fusion, and data sampling techniques. In principle, robotic manipulation of non-rigid objects can be mainly categorized with respect to the sensing technologies used, that is, vision, tactile or derived from different modalities. In the rest of the section, the most relevant aspects regarding the choice of such configurations are covered, describing in particular the scenarios that contemplate non-rigid 3D objects in their systems. Furthermore, an overview of the features of these systems is presented in Table 1 at the end of the section.

Vision-Only Sensing
Güler et al. [5] used a Logitech HD Pro C920 Webcam (Logitech, Newark, CA, USA) to develop 2D deformation tracking under a robotic pushing scenario. The system used a geometric scheme that simulated the physical object behavior based on a position-based dynamics (PBD) model. Similarly, Hui et al. [6] proposed a 2D deformation tracking system and applied it to robotic hand manipulation while also adding capabilities for material classification. They used a three-finger Barrett Hand robot and a Kinect sensor to acquire visual and depth data. However, depth data was only used in the segmentation process of their method.
On the other hand, Jordt et al. [7] proposed a 3D deformation tracking system with high spatial and temporal resolution. Their vision setup combined two techniques: an SR4000 Time-of-Flight (ToF) camera (Mesa Imaging, Zürich, Switzerland) with a CCD camera to acquire colour and depth data at a high frame rate, and a laser line scanner with two CCD cameras to acquire data at high resolution. Later, in [8], the vision setup was replaced by a single Kinect sensor and used in a deformation system similar to their previous work. However, neither was tested in a robotic scenario until Fugl et al. [9] adopted the deformation tracking concept and included a physical model in their system. It also estimated the objects' material properties, for which the reference values were taken by a PASCO PS-2189 force sensor and a 6 DoF (Degree-of-Freedom) UR5 Arm (Universal Robots, Odense, Denmark). Leizea et al. [10] also proposed a 3D tracking method based on physical simulation focused on the manipulation of organs in surgery. The setup consisted of a 7 DoF Mitsubishi robot with an indenter attached to the gripper, a Kinect sensor that acquired colour and depth data and an MCR 301 rheometer (Anton Paar, Graz, Austria) that obtained the material properties of the objects after sampling each material. They combined both visual and material characterization to build the physical model. Additionally, the gripper was equipped with a Mini40 force/torque sensor, but it was only used to monitor the force exerted by the robot. Lin et al. [11] presented a strategy to pick up non-rigid objects resting on a table with a robotic hand. The setup consisted of a Barrett Hand with no tactile sensor installed and a 3D laser scanner from NextEngine. The latter is used to discretize the objects into a tetrahedral mesh using MeshLab software and the Computational Geometry Algorithm Library. Navarro-Alarcon et al. [12] proposed a strategy to control the 3D shape of non-rigid objects by capturing visual features that quantify the deformation. The setup consisted of a 6 DoF Staubli robot and a 3 DoF surgical robot that hold the object while deforming it and a stereo vision system built with two CCD sensors. Similarly, in Alambeigi et al. [13] a deformation control framework used a da Vinci Research Platform. For this setup, the robots consisted of two Patient Side Manipulators equipped with two EndoWrist ProGrasp Forceps (Intuitive Surgical, Inc., California, USA) and one Endoscopic Camera Manipulator equipped with a stereo endoscope.
Pure vision systems for non-rigid 3D objects exhibit a tendency to incorporate some kind of 3D sensor. Until now, low-cost Kinect-type sensors stand as the preferred technology for such applications. In particular, this solution is integrated mainly in situations where a robotic gripper produces a single-sided contact with objects. On the other hand, some works (e.g., [11,12]) incorporate custom 3D vision setups while using a two-finger gripper to perform the deformations.

Tactile-Only Sensing
Drimus et al. [14,15] introduced a tactile array sensor based on a piezoelectric rubber material. Although the main purpose of their work is the description of the sensor, an object classification system is also described which includes a mixture of rigid and non-rigid objects. The robotic setup consists of a Schunk PowerCube Robotic Arm with a Schunk PG70 parallel gripper, which was equipped with the sensor. Mira et al. [16] proposed a grasping strategy by analyzing the tactile data acquired during the task. It was built upon two 7 DoF Mitsubishi manipulators. One robot was equipped with a Shadow five-finger hand with a Tekscan tactile sensor able to record pressure levels throughout an array distributed in the finger sections and the palm. On the other robot, a Kinect sensor was used to recognize the object at the beginning, but it was not used in the grasping strategy. The same configuration was used by Delgado et al. [17] and updated in [18] by replacing the previous robots with two Kuka LWR arms equipped with two Shadow hands. One robot kept the Tekscan sensor, whereas the Kinect sensor of the previous one was replaced with a Biomimetic Tactile sensor from SynTouch. This system is able to record pressure, vibrations and temperature along the contact points and was used to control the movement of the fingers in the hands. Zaidi et al. [19] adopted the same setup as [18]. However, they proposed another grasping approach by considering a physical model that used the information of the contact interaction between the object and the robotic hand. In contrast to other groups, pure tactile systems incorporate mainly multi-finger robotic grippers to analyze their performance. Compared to previous works, there is a tendency to include richer sensory systems in terms of multiple contact capability, high sensitivity and spatial resolution (e.g., Biotac type sensors). A previous survey on tactile sensing with robotic hands [20] also emphasizes the importance of high-resolution sensors in order to detect small variations in the object's shape.

Multi-Modal Sensing
Arriola-Rios and Wyatt [21] integrated force and visual information to develop a multi-modal learning framework to monitor and predict deformations exerted by a robot in a single-point contact scenario. The setup consisted of a DAQ-FT-GAMA force sensor and a colour firewire camera that records perpendicularly to the scene. However, the system is able to capture 2D deformations only. Similarly, Cretu et al. [22] introduced a platform capable of monitoring and predicting the behavior of the object in 2D but applied to robotic hand manipulation. They combined visual, force and position information collected in a setup initially introduced by Khalil et al. [23] that consisted of a three-finger Barrett robotic hand combined with PPS RoboTouch tactile pad sensors and a Point Grey Research Flea 2 firewire camera. The visual information was used in a contour tracking method and associated with the tactile data, which was pre-processed to transform it from the binary representation to a pressure equivalent. This was done through a linear interpolation performed over the calibrated measurements, and then the pressure data was multiplied by the active area of the sensor pad to obtain the final reaction force.
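The conversion from a raw tactile reading to a reaction force described for the setup of Cretu et al. [22] can be illustrated with a short sketch. This is only a minimal illustration under stated assumptions and not the authors' implementation: the calibration table, the pad area and the function name `reaction_force` are hypothetical.

```python
import numpy as np

# Hypothetical calibration table: raw sensor counts and the pressures (kPa)
# measured at those counts. Real values would come from calibrating the pads.
CAL_RAW = np.array([0.0, 200.0, 400.0, 800.0])
CAL_PRESSURE_KPA = np.array([0.0, 10.0, 25.0, 60.0])


def reaction_force(raw_reading, active_area_m2):
    """Convert a raw tactile reading to a force in newtons.

    The raw value is mapped to a pressure by linear interpolation over the
    calibrated measurements, then multiplied by the active pad area.
    """
    pressure_kpa = np.interp(raw_reading, CAL_RAW, CAL_PRESSURE_KPA)
    return pressure_kpa * 1e3 * active_area_m2  # kPa -> Pa, then Pa * m^2 = N


# A reading halfway between two calibration points on a 1 cm^2 pad.
force = reaction_force(300.0, active_area_m2=1e-4)
```

With this made-up table, a reading of 300 falls halfway between the 10 kPa and 25 kPa calibration points, so the interpolated pressure is 17.5 kPa and the force on a 1 cm² pad is 1.75 N.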
Frank et al. [24] presented a method to estimate elastic properties by simulating the deformation of the object, using visual and tactile information that was acquired from multiple views and combined. The setup consisted of a 7 DoF robot manipulator built with Schunk Powercube modules and equipped with a Schunk-FTCL-050 force/torque sensor integrated into the gripper through a wooden stick to minimize occlusions. Either a Bumblebee stereo camera or a PMD-[vision]-O3 ToF camera was attached to the same gripper. The former was employed for textured objects and the latter for uniformly coloured objects since better depth results were obtained. Later, Frank et al. [25] updated the vision system, replacing the stereo camera by a Kinect sensor. This time, the deformation models were learned by the robot and applied to a robotic navigation system. Petit et al. [26] presented a deformation predictor based on a physical model for a situation in which a robot maintains a single-point contact with the object. The setup consisted of a Kuka LWR arm equipped with a force sensor in its gripper and an Asus Xtion 3D camera that recorded the data by moving the sensor around the object. They used the data to estimate the material parameters based on a former approach [27] that derived internal elastic forces within the model; these forces are linearly related to the stiffness matrix, which contains Young's modulus and the Poisson ratio. Gil et al. [28] adopted a framework similar to that of Delgado et al. [17], which consisted of a Shadow robot hand equipped with a Tekscan tactile sensor on the fingertips and a Kinect camera with an eye-in-hand configuration. In this approach, however, visual data from the Kinect sensor are used to detect and track the deformation while the force measurement is used as a complement in the case that a grasp adjustment is needed. Caccamo et al. [29] proposed a system for modeling deformation of elastic surfaces. The setup consisted of a PrimeSense RGB-D camera and a Kinova Jaco24 robotic arm equipped with a three-fingered Kinova KG-3 gripper that carried a 3D OptoForce force sensor.
The trend for visual setups introduced in Section 2.1 tends to continue in a similar direction for multi-sensor systems. On the other hand, tactile sensing has not been integrated with the same emphasis on adding more complex setups as in Section 2.2. Overall, the integration of tactile data provides physical information which is either not considered or estimated somehow in its pure visual counterparts. Therefore, multi-modal systems intend to enrich their models with more diverse information about the environment.

Description of Non-Rigid Objects
This section focuses on the strategies used to describe non-rigid objects in robotic manipulation scenarios, with particular attention to contributions made for 3D objects. Typically, the description of this kind of object is characterized by modelling its shape over time by combining the representation of the surface and its deformation. Conversely, there are other scenarios in which a particular model is not considered and the information of the object acquired from the sensors is processed directly according to the particular application. Thus, the analyzed works are categorized according to whether or not they use a model to describe the objects, that is, whether they are model-based or model-free. In addition, Table 2 at the end of the section provides an overview of the most relevant features of the proposed strategies.

Model-Based Approaches
According to a previous model categorization presented by Montagnat et al. [30] and Salzmann and Fua [31] but adapted to the application of robotic manipulation of non-rigid objects, the deformable modelling strategies used in the subsequent works can be categorized as follows: physical, geometric, learned and hybrid models.

Physical Models
This type of method includes approaches that explicitly represent the model by integrating in some manner Newton's laws of motion. Initially, Fugl et al. [32] proposed a 3D model to describe linear deformations based on the Navier-Cauchy equations and considering an elastic and isotropic material. The mechanical response was formulated from Hooke's law using the Navier-Cauchy equations. In addition, some boundary conditions were defined considering the case of robotic manipulation since the model is used for simulation only. A Finite Difference Method (FDM) is formulated for the Navier-Cauchy equations to be approximated with a discrete set of algebraic equations. Similarly, another model that describes linear deformations was presented by Fugl et al. [9] based on a discretized Euler-Bernoulli model and also developed for an elastic and isotropic material. The physical model is able to compute the deformation curve as a function of Young's modulus, the mesh geometry and the gripper pose. Afterwards, a comparison between the deformed model and the input data is synthesized by an error function.
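For reference, the relations named above can be written in their standard textbook form for a linear elastic, isotropic solid in static equilibrium. The notation is an assumption made here for illustration (displacement field u, body force density f, strain ε, stress σ, Lamé parameters λ and μ, Young's modulus E, Poisson ratio ν) and is not necessarily the notation of [32].

```latex
\begin{aligned}
\boldsymbol{\varepsilon} &= \tfrac{1}{2}\left(\nabla \mathbf{u} + \nabla \mathbf{u}^{\mathsf{T}}\right)
  && \text{(small-strain tensor)} \\
\boldsymbol{\sigma} &= \lambda\,\operatorname{tr}(\boldsymbol{\varepsilon})\,\mathbf{I} + 2\mu\,\boldsymbol{\varepsilon}
  && \text{(Hooke's law, isotropic)} \\
\mathbf{0} &= \mu\,\nabla^{2}\mathbf{u} + (\lambda + \mu)\,\nabla(\nabla \cdot \mathbf{u}) + \mathbf{f}
  && \text{(Navier-Cauchy, static)} \\
\lambda &= \frac{E\,\nu}{(1+\nu)(1-2\nu)}, \qquad \mu = \frac{E}{2(1+\nu)}
  && \text{(Lam\'e parameters)}
\end{aligned}
```

A finite difference scheme replaces the spatial derivatives of u in the third line with differences between neighbouring grid values, which yields the discrete set of algebraic equations mentioned above.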
Frank et al. [25] proposed a 3D model built on a previous work [24] which consists in the estimation of the elasticity properties, described by Young's modulus and the Poisson ratio, considering an elastic and isotropic material. The data from the sensors are used to formalize a linear Finite-Element-Method (FEM) model of the discretized tetrahedral mesh, which relates the external forces acting on the nodes of the mesh and the resulting displacements with the elasticity parameters via the stiffness matrix. The FEM is initialized with a given stiffness matrix and the problem is solved by comparing the observed and simulated displacement and minimizing their difference. The robotic surgery application of Leizea et al. [10] used the acquired colour and depth information as the input to a non-linear FEM model to simulate the deformation behavior. As a preliminary step, the system needs to compute the density, Young's modulus and Poisson ratio parameters of an elastic and isotropic material. Additionally, a set of key-points that relates the input data with a 3D mesh of tetrahedrons is also defined. Afterwards, the model is initialized as a Saint-Venant-Kirchhoff formulation within a FEM structure using a tetrahedral mesh that is updated in every frame to generate a new model shape. This is done by matching selected key-points with the input raw point cloud to find associations between them and discarding those key-points that present no deformations. Thus, the FEM simulation uses the previous association to displace the mesh until it represents the deformed state of the object. In a series of publications, Petit et al. [26,27,33] developed the techniques to represent the deformation of elastic and isotropic material as a linear FEM model of a tetrahedral mesh. Firstly, a graph cut-based method was applied to segment the object from the undesired visual data, then a rigid Iterative Closest Point (ICP) algorithm was used to estimate a transformation from the point cloud to the mesh. Afterwards, a registration procedure is performed by computing external forces exerted by the point clouds with respect to the nodes of the mesh and associating them with the internal forces computed from the visual and force data, as described in Section 2.3. Thus, the estimation of the deformations consisted in solving a dynamic system of linear ordinary differential equations involving the internal and external forces. Lin et al. [11,34] presented a squeeze grasping approach in which the deformation is described by analyzing the displacements of the points in contact with the object. Based on a previous approach [35] but adapted for a 3D representation, they formulated a linear FEM model of a tetrahedral mesh for an elastic and isotropic object. Similarly, a set of external forces is applied to the nodes of the mesh, displacing their location when deformation occurs, although the influence of gravity was also included by combining the proportional contribution of the mass of each tetrahedron.
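The parameter estimation idea shared by the linear FEM approaches above, namely simulating displacements for a candidate stiffness and minimizing the difference to the observed displacements, can be sketched on a deliberately simplified problem. The example below is an assumption-laden illustration rather than any of the cited implementations: it uses a 1D bar of two-node elements instead of a tetrahedral mesh, estimates Young's modulus only, and all function names and numeric values are hypothetical.

```python
import numpy as np


def assemble_bar_stiffness(young, area, length, n_elements):
    """Global stiffness matrix of a 1D bar clamped at its left end."""
    k_e = young * area / (length / n_elements)  # stiffness of one element
    K = np.zeros((n_elements + 1, n_elements + 1))
    for e in range(n_elements):
        K[e:e + 2, e:e + 2] += k_e * np.array([[1.0, -1.0], [-1.0, 1.0]])
    return K[1:, 1:]  # drop the clamped node (Dirichlet boundary condition)


def simulate(young, forces, area=1e-4, length=0.2):
    """Solve K u = f for the displacements of the free nodes."""
    K = assemble_bar_stiffness(young, area, length, len(forces))
    return np.linalg.solve(K, forces)


def estimate_young(observed, forces):
    """Least-squares fit of Young's modulus to observed displacements.

    In a linear model K scales with E, so u(E) = u(E=1) / E and the
    minimizer of ||observed - u(E)||^2 over 1/E has a closed form.
    """
    unit = simulate(1.0, forces)
    compliance = unit @ observed / (unit @ unit)
    return 1.0 / compliance


forces = np.array([0.0, 0.0, 0.0, 5.0])   # 5 N pull on the free end
true_young = 2.0e6                        # Pa, made-up soft material
observed = simulate(true_young, forces)   # stands in for sensor data
estimated = estimate_young(observed, forces)
```

In a full 3D setting the stiffness matrix also depends on the Poisson ratio, so the closed-form fit is no longer available and an iterative optimizer over both parameters would typically be needed.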
On the other hand, the grasping strategy of Zaidi et al. [19] considers known isotropic objects represented as a non-linear Mass-Spring model (MSM) in a tetrahedral mesh, in which lumped masses and non-linear springs are attached to the nodes and edges respectively, as described in [36]. The description of the global and contact areas of deformation is based on the tracking of the positions of the nodes by solving the dynamic equation of Newton's second law.
In general, FEM-based models stand as the most popular class of physical models and provide a well-established methodology to formally describe deformations. This method has been developed in principle to deal with linear and planar objects, but it was also adapted to volumetric objects. However, it has only been explored for 1D or 2D linear elastic deformations, as opposed to MSM-based models as in [19] that address non-linear deformations. Moreover, physical models have a dependency on the material parameters, which are often unknown or cannot be described easily in a more general setting, given the complex material structure of certain objects.
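A mass-spring model of this kind advances node positions by integrating Newton's second law at each node. The sketch below is a minimal illustration under stated assumptions, not the model of [19] or [36]: it uses a 1D chain instead of a tetrahedral mesh, a cubic hardening spring law chosen here for illustration, and hypothetical parameter values.

```python
import numpy as np


def spring_force(extension, k=100.0, k3=1.0e4):
    """Non-linear (cubic hardening) spring law; coefficients are illustrative."""
    return k * extension + k3 * extension**3


def simulate_chain(n_nodes=4, rest=0.05, mass=0.1, damping=2.0,
                   pull=2.0, dt=1e-3, steps=10000):
    """Integrate m * a = F for a 1D chain; node 0 is clamped, the last is pulled."""
    x = rest * np.arange(n_nodes, dtype=float)  # node positions at rest
    v = np.zeros(n_nodes)
    for _ in range(steps):
        tension = spring_force(np.diff(x) - rest)  # one value per edge
        f = -damping * v            # viscous damping on every node
        f[:-1] += tension           # each spring pulls its left node rightwards
        f[1:] -= tension            # and its right node leftwards
        f[-1] += pull               # external load on the free end
        v += dt * f / mass          # semi-implicit Euler: velocity first
        v[0] = 0.0                  # the clamped node never moves
        x += dt * v                 # then position with the new velocity
    return x


x = simulate_chain()
extensions = np.diff(x) - 0.05      # stretch of each spring at steady state
```

Once the motion has damped out, every spring in the chain carries the applied load, so `spring_force(extensions)` should match the 2 N pull on each edge.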

Geometric Models
This category includes approaches with the objective of representing the surface and its evolution over time as a geometrically motivated model, which means that physical information as described in Section 3.1.1 is not included in the scheme. A parametric representation named Non-uniform rational basis spline (NURBS) was used by Jordt et al. [7] to describe and store the object deformation surface. First, they extract a set of 2D KLT features in the low resolution image data to create a dense 2D deformation map that is then fitted into a 2D NURBS function. Afterwards, the 3D surface deformation function is created using the previous 2D NURBS, the depth data and a global pose descriptor, forming a 3D NURBS. Finally, the deformation sequence is propagated to a high resolution mesh by mapping the vertex position in 3D to the corresponding NURBS coordinates. Afterwards, Jordt and Koch [8] improved their model by removing the necessity to explicitly detect tracking features. Instead, the algorithm defines a 3D mesh from the initial frame and registers it into a NURBS surface function, such that the mesh deformation is manipulated by the NURBS control points. Then, it is fitted to the current measurements by minimizing an error function between the mesh and the colour and depth data.
In order to address the stability issues of physics-based models, the system of Güler et al. [5] used a position-based dynamics (PBD) model to simulate the deformation. The model is a geometric simulation method called meshless shape matching, originally presented in [37]. The construction of the model required as input a set of particles with some initial configurations. However, there is no requirement of connectivity information between particles. An optical flow based algorithm is used to spread the set of particles across the object surface in consecutive image frames. In principle, the model is able to simulate several levels of deformation; however, for this work only shear- and stretch-like deformations are considered, which are controlled by a parameter named β. Afterwards, Caccamo et al. [29] added a Gaussian Process Regression (GPR) to the PBD methodology in order to estimate the deformation of an elastic heterogeneous surface. The grasping strategy of Gil et al. [28] used colour and depth data and geometric information to model the shape deformation of a planar object. This method, originally described in [38], consists in clustering the point clouds into subsets named patches, characterized by the size and number of points contained. Then, the curvature variation maps (CVM) between the points of a patch are estimated by analyzing the eigenvalues of the points in a neighborhood using Principal Component Analysis (PCA). Furthermore, a curvature histogram (HCN) is defined and represents the distribution of the curvature variation. It is used to detect singular points, which are those with maximum curvature values. Finally, by combining CVM and HCN, the system is able to detect deformations by finding the critical points at which the variations occur. As can be seen, tactile information was not part of the modelling process; instead, a tactile-vision algorithm is proposed for a grasp planner. Furthermore, the latter deformation descriptor was used in Mateo et al. [39] as a part of their proposed surface supervision method that includes volumetric object representation. Additionally, the curvature surface approach was formalized as "Curvatures Skeletons" and described in detail in [40].
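The step of estimating curvature variation from the PCA eigenvalues of a local neighborhood can be sketched as follows. The exact definition used in [38] is not reproduced here; the example assumes a commonly used surface-variation measure (smallest eigenvalue divided by the sum of eigenvalues) and a brute-force nearest-neighbour search, and the function name and test data are hypothetical.

```python
import numpy as np


def surface_variation(points, k=8):
    """Per-point curvature proxy: smallest PCA eigenvalue over their sum."""
    variation = np.empty(len(points))
    for i, p in enumerate(points):
        dist = np.linalg.norm(points - p, axis=1)
        neighbours = points[np.argsort(dist)[:k]]   # includes the point itself
        # Eigenvalues of the 3x3 covariance matrix, in ascending order.
        eigvals = np.linalg.eigvalsh(np.cov(neighbours.T))
        variation[i] = eigvals[0] / eigvals.sum()
    return variation


# Synthetic data: a flat patch, and the same patch folded along x = 0.
gx, gy = np.meshgrid(np.linspace(-1, 1, 11), np.linspace(-1, 1, 11))
flat = np.column_stack([gx.ravel(), gy.ravel(), np.zeros(gx.size)])
bent = flat.copy()
bent[:, 2] = -np.abs(bent[:, 0])

flat_var = surface_variation(flat)
bent_var = surface_variation(bent)
on_fold = np.isclose(bent[:, 0], 0.0)   # points lying on the fold line
```

On the flat patch the measure is numerically zero everywhere, while on the folded patch the points along the fold have clearly non-zero values, which is the kind of cue a curvature histogram could then use to pick singular points.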
Alternatively, Hui et al. [6,41] used an implicit geometric representation named fast level set for tracking the deformable surface. Initially, the object was segmented from the raw colour and depth data by applying RANSAC and k-d tree search algorithms to remove the undesired surfaces. The segmented mesh was later transformed into the YUV colour space and mapped in a log-polar domain. Thus, the proposed fast level set method was applied to the mapped mesh for detecting and tracking the contour over time. Although this method collected 3D data, the depth information was only used for the segmentation process since the modelling approach was developed on 2D image data.
The main advantage of geometric models is their ability to express a complete deformation while tracking the object's shape. Thus, in principle, the requirement of physical models to know the material properties in advance can be avoided. Overall, there is no clearly dominant methodology used for geometrical models, which mainly address 2D deformations. Moreover, techniques such as 3D NURBS [7,8] and surface curvatures [28,40] appear as the first attempts to describe 3D deformations.

Learned Models
The complexity of representing non-linear effects, added to the need to estimate unknown parameters, inspired the community to propose several methods that use the available information as training data to infer the object shape. In that sense, the deformation monitoring proposed by Cretu et al. [22] introduced a segmentation phase that consists in training a growing neural gas network to cluster regions of an image based on colour and space information given as hue-saturation-value (HSV) components and coordinates of pixels, respectively. The output of the network represents the object of interest and background. Thus, the object is filtered to obtain its contour using the Sobel edge detector. Afterwards, a new neural gas network is applied to reduce the number of points in the contour with the intention of preserving more details where local deformations occur. The reduced points are later used in another neural gas network trained to track the contour over the image sequences. Finally, force and position measurements of the robotic hand are associated with the tracked contour by means of a feedforward neural network that is trained to predict the behavior of the objects when being manipulated.
Later, Tawbe and Cretu [42,43] extended the previous approach to manage 3D deformations. Similarly, 3D data were segmented to obtain a mesh that represents the object of interest, which was done by implementing a RANSAC algorithm to identify flat surfaces (e.g., table and walls) and eventually remove them from the point cloud. In addition, the data that were not captured by the sensor due to occlusion by the robotic tool were filled in using mesh processing software. Furthermore, the remaining meshes were selectively reduced by using in this case a variation of the QSlim algorithm with the aim of simplifying the points that are not in the neighborhood where the interaction occurs. Afterwards, the new mesh is clustered according to the distance with respect to the original mesh, then a stratified sampling technique is employed to only retain a subset of data. Similar to the previous approach, although adapted to manage 3D point clouds, the data are fitted in a neural gas network. Finally, a series of feedforward neural networks is trained, one for each cluster of the object mesh, in order to obtain a relationship between the force data and the position of the neural gas nodes over each cluster.
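The segmentation step that removes flat surfaces such as a table from the point cloud can be sketched with a basic RANSAC plane fit. This is a generic illustration and not the implementation used in [42,43]: the threshold, iteration count, function name and synthetic scene are assumptions made for the example.

```python
import numpy as np


def ransac_plane(points, threshold=0.01, iterations=200, seed=0):
    """Return a boolean mask of the inliers of the dominant plane."""
    rng = np.random.default_rng(seed)
    best_mask = np.zeros(len(points), dtype=bool)
    for _ in range(iterations):
        # Three random points define a candidate plane.
        a, b, c = points[rng.choice(len(points), size=3, replace=False)]
        normal = np.cross(b - a, c - a)
        norm = np.linalg.norm(normal)
        if norm < 1e-12:            # degenerate (collinear) sample
            continue
        # Point-to-plane distance of every point in the cloud.
        distance = np.abs((points - a) @ (normal / norm))
        mask = distance < threshold
        if mask.sum() > best_mask.sum():
            best_mask = mask        # keep the plane with the most inliers
    return best_mask


# Synthetic scene: a table plane at z = 0 plus a small object above it.
rng = np.random.default_rng(1)
table = np.column_stack([rng.uniform(-1, 1, (500, 2)), np.zeros(500)])
obj = rng.uniform([-0.1, -0.1, 0.05], [0.1, 0.1, 0.25], (100, 3))
cloud = np.vstack([table, obj])

table_mask = ransac_plane(cloud)
object_points = cloud[~table_mask]  # what remains after removing the table
```

A scene with walls as well as a table would repeat the fit on the remaining points until no sufficiently large plane is found.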
Arriola-Rios and Wyatt [21] combined two learned models in sequence in order to predict the object's shape and reaction forces. Firstly, a visual tracking system is applied to provide the data to train the models. They proposed a tracking algorithm named "linear snake". It represents the contour as a polygon in which the vertices act as control points adjusted following the deformation. This information is used by the shape predictor (SP) to initialize the mesh of a mass-spring model (MSM), based on a modified version of the model proposed by [44] that considers elastic and plastic deformations. Nevertheless, the focus of this work is to automatically calibrate the MSM model. Thus, an evolutionary algorithm is used to search the parameter space of the model. Similarly, to train the force predictor (FP), the positions and forces from the sensory information are used to obtain a stress-strain diagram, which is approximated by implementing a regression model. The system is able to predict the object behaviour in multiple steps and also classify the material in case a new one is added. However, the system is restricted to modelling 2D deformations only, although the authors discussed the inclusion of a 3D model as future work.
Cherubini et al. [45] proposed a representation framework for the manipulation of a malleable plastic object. In this work, the authors initially defined a set of "actions" (i.e., pushing, tapping and incising), such that each action affects only a subset of the system state. The current state of the object is defined through a function that depends on the previous state and on the external action applied. Thus, the problem is reduced to the minimization of the error between the actual and desired state. However, since no formal description for the state function is provided, a multi-layer perceptron neural network is used to learn this mapping by using image data of humans performing the task.

Hybrid Models
This category includes the approaches that combine several of the previous modelling methods in a unified system. Bodenhagen et al. [46] presented a system for robotic grasping and manipulation of flexible objects in applications such as peg-in-hole and laying-down. They combined several of the methodologies developed in their previous works. For instance, the deformation tracking relies on the NURBS surface approach [7,8] introduced in Section 3.1.2, while simultaneously searching for the material parameter (Young's modulus) under the assumption of a homogeneous object. Moreover, a physical model and deformation prediction are built based on the approach of [9] presented in Section 3.1.1. In addition, they included a non-linear beam model for larger deformations, which linear beam models are not sufficient to capture. A learning phase based on the Kernel Density Estimation (KDE) method is also considered, which involves identifying the most suitable manipulation actions and the current state of the object. Güler et al. [47] updated their previous approach [5], presented in Section 3.1.2, to include a FEM model that generated ground truth data of various deformation parameters. The robustness of the system was improved by adding quadratic deformation (i.e., twist and bend). Volume conservation deformability was also included together with a new control parameter θ, which is estimated by matching the PBD and the FEM deformation models. The latter is built upon Young's modulus and the Poisson ratio for a linear elastic and isotropic material. The experiments show that this new control parameter can be related to the elasticity parameter of the FEM model.

Model-Free Approaches
This category considers approaches that do not require a particular model in their systems. Instead, they use methodologies that perform online estimation of the deformation. For instance, Mira et al. [16] and Delgado et al. [17] presented a strategy to grasp flexible objects by using a task-dependent approach. Instead of using a strategy to model the deformations, their solution relied on an object identification system. Several parameters of the object such as shape, dimension, position and orientation are extracted visually, and then used to search for the best fit in a database. The saved information is later used in a grasp planning algorithm. Navarro-Alarcon et al. [12] did not consider any explicit model for their strategy to control the deformation of non-rigid objects. Instead, they developed an algorithm that estimates a vector of "deformation" parameters. These are essentially a set of feature points defined to characterize the deformation locally by combining information on position and shape. These features are located on the surface of the object and computed using image processing techniques. Nevertheless, these features do not provide enough information to consider multidimensional deformation such as stretching, bending and so on. The system allows monitoring and controlling single-point deformations only. Delgado et al. [18] proposed a controller for local deformations with a goal similar to that of Navarro-Alarcon et al. [12]. However, the system is based on a set of "images" obtained from tactile data from different kinds of sensors. A unique tactile image is created for each finger of the robotic hand in such a way that a virtual map is created with the pressure levels that the fingers can read at each contact point. In this work, instead of using the tactile images to detect features that describe the shape, they are used to obtain position and force data at the contact points. This information is used directly to control the configuration of the fingers. Recently, Alambeigi et al. [13] proposed another deformation control framework capable of handling heterogeneous 3D non-rigid objects. Similar to previous works, a set of feature points is defined in an image, and a non-linear function maps the feature points from the image to the world space. This function is estimated by iteratively solving a system of non-linear equations by means of Broyden's method and is necessary to define the optimization problem. This optimization allows controlling the feature points in the world with respect to the desired points in the image. Results show that the proposed system is also able to handle disturbances during the task (e.g., incisions).
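The core of a Broyden-type estimator is a rank-one correction of the current Jacobian estimate after each observed motion. The sketch below shows only that generic update rule; it is not the controller of [13], and the function name and the interpretation of `dx` (actuator displacement) and `dy` (observed feature displacement) are assumptions for illustration.

```python
def broyden_update(J, dx, dy):
    """Rank-one Broyden update of a Jacobian estimate.

    J  : current estimate, list of rows (m x n)
    dx : last input displacement (length n), e.g. end-effector motion
    dy : observed output displacement (length m), e.g. feature motion
    The returned matrix satisfies the secant condition J_new * dx == dy.
    """
    denom = sum(v * v for v in dx)
    if denom == 0.0:
        # No motion was applied, so there is nothing to learn from this step.
        return [row[:] for row in J]
    # Prediction error of the current estimate for this step.
    residual = [
        dy_i - sum(j_ij * dx_j for j_ij, dx_j in zip(row, dx))
        for row, dy_i in zip(J, dy)
    ]
    return [
        [j_ij + r * dx_j / denom for j_ij, dx_j in zip(row, dx)]
        for row, r in zip(J, residual)
    ]
```

For example, starting from the identity and observing two motions of a system whose true map is [[2, 1], [0, 3]], the estimate converges to that matrix after the two updates. Because the estimate is refreshed at every step, a change in the object (such as an incision) is absorbed by subsequent updates rather than requiring a new model.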
Model-free systems share multiple characteristics with the learned models presented in Section 3.1.3. The main difference is that learned models adopt machine learning to learn the mapping function, whereas model-free systems derive it directly from a mathematical description. Overall, these systems provide a framework that is scalable and generalizes well to more complex non-rigid objects. Despite this, they are still at an early stage and only simple applications have been integrated.

Control of Robotic Manipulation for Non-Rigid Objects
Many control strategies have also been proposed for the safe and efficient handling of non-rigid objects.This part of the survey aims to provide a high-level overview of the different techniques and concepts used to accomplish this task in recent works.As such, there is a focus on the selection and planning of the manipulation task and trajectory rather than on the low-level control and driving of the actuators.Most of this section is organized in terms of the type of object being manipulated.Section 4.1 describes the handling of linear objects such as ropes, wires, cables, and flexible beams, in terms of both tying knots and routing, shaping and otherwise moving them.Section 4.2 groups the research that works with planar objects such as metal and plastic sheets, as well as all cloth-like materials.The tasks discussed in this section include sorting and folding laundry, assisted dressing, and cancelling unwanted deformations in an industrial setting.Section 4.3 presents research that is concerned with shaping the outline, or 2D projection of non-rigid objects.The shaping and handling of 3D objects is discussed in Section 4.4.Recent machine learning approaches are grouped in Section 4.5 and papers that work with more general problems or that can hardly be described by the previous categories are discussed in Section 4.6.A summary of the surveyed control strategies can be found in Table 3 at the end of the section.

Linear Objects
Many automation tasks are concerned with the handling of linear non-rigid objects, such as the knotting and routing of ropes and cables.Research exploring the tying of knots includes the work of Wakamatsu et al. [48], who studied the knotting and unknotting of rope with a topological description of the object's state as a set of oriented crossings.They define four basic manipulation operations to perform state transitions and use them to derive a high level plan to take the object from its current state to an arbitrary target state.From this plan, they computed a sequence of approximate grasp points and directions of motion to perform the knotting/unknotting operation with a single manipulator.They also established a planning method to determine which parts of the rope should be pulled in order to tighten a knot.Saha and Isto [49] built upon a similar topological description and added the possibility of crossings of the rope with a rigid object, e.g., a beam, box or other support.Their planner makes use of optional "sliding supports" (knitting needles) to hold parts of the unfinished knot in place during manipulation.The resulting system uses two collaborating manipulators to tie self knots and knots around static objects.Bell [50] developed a system to tie knots without any sensing, by using static fixtures and a system of tracks to handle stiff linear objects such as steel wire.More recently, Wang [51] explored multiple techniques to tie knots with minimal sensing.These techniques revolve around the use of automatically-designed mechanical fixtures and a local string control strategy using vector fields.They also studied the different constraints required to tie a knot with this system, such as the required number of contact points and regrasp operations.
Works on routing and shaping linear objects include Moll and Kavraki [52], who developed a path planning algorithm for shaping flexible wires with robotic grippers holding both endpoints.They defined that the wire is in a "stable" state when its shape matches a minimal energy curve, and restricted the planner to such stable states.This reduces the size of the state space and ensures that the strain on the wire is minimal throughout the manipulation process.It is then possible to explore the entire space of collision-free states in order to bring the wire to a desired configuration.Tavasoli et al. [53] studied the case of two planar robots cooperatively handling a flexible beam.With the goal of suppressing the unwanted vibrations of the beam, they built upon the observation that the positioning and vibration happen in two different time scales.They decoupled the system into "slow" and "fast" subsystems.The slow system behaves as a fully rigid system and is used for the positioning task while the fast system is only concerned with the vibration of the object.The resulting composite controller is able to perform the position tracking task while suppressing the unwanted vibrations of the beam.Ding et al. [54] also considered a technique for suppressing unwanted vibrations during the manipulation of a flexible beam.Their approach is to decompose the system into actuated and underactuated parts, and use a position-based control strategy with sliding mode control to quickly dampen the vibrations.Shah and Shah [55] considered the task of attaching heavy cable bundles with fragile interconnections to an aircraft fuselage.The fragility of the interlinks adds severe constraints to the manipulation and anchoring tasks.Their simulation-based planner incorporates gravity in the computation of the cable shape, and carefully selects a sequence of grasping and anchoring points to ensure that the interlinks are not damaged in the assembly process.
Overall, recent works on knotting involve complex operations with the added challenges of having an arbitrary initial state [48], interactions with other objects [49], or limited sensing [50,51].Similarly, works on routing cables add the constraint of minimizing strain on the cable or specific parts of it [52,55], while moving a flexible beam requires cancelling its vibrations in an optimal manner.These added constraints on the performed tasks and the small number of recent works indicate that the shaping and handling of linear non-rigid objects is already a topic that is quite well known, and that researchers are moving to more complex manipulation tasks.

Planar Objects
Most of the research effort in handling non-rigid objects is currently concerned with planar objects, i.e., objects that may be modelled as a two-dimensional mesh. This includes fabrics and leather, as well as sheet metal and other thin sheets of flexible material. These materials are often considered to have negligible stretchability, and the deformation is therefore normal to the plane of the resting object.
An everyday application of the automated handling of cloth is folding laundry. Bell [50] derived the number of grasp points necessary to immobilize a polygonal cloth and showed that this number could be greatly reduced by simple manipulations that make use of gravity. They used these findings to develop a system to fold a T-shirt with a fixed rod and a manipulator with few degrees of freedom. Maitin-Shepard et al. [56] used a vision-based geometrical approach to detect and grasp the corners of a towel and fold it with a series of regrasps. Cusumano-Towner et al. [57] furthered this approach by using a Hidden Markov Model to recognize the behaviour of the cloth and detect its type during an initial "disambiguation" manipulation phase, which enables them to bring various articles into a configuration that is suitable for future operations such as folding. Bersch et al. [58] presented a system to find possible grasps on a crumpled shirt and evaluate their chances of success with machine learning. Using folds detected through 2D and 3D vision data, this heuristic strategy fits a simplified gripper model around the point cloud of the desired grasp point to compute the valid gripper poses for holding the shirt. A function learned automatically with a support vector machine is then used to evaluate the quality of the potential grasp point and gripper pose based on geometrical features that enhance the success rate. Their system then iteratively moves its grasp points towards the shoulders of the shirt before executing an open-loop folding routine. Miller et al. [59] presented a motion planning algorithm to perform a user-defined sequence of folds on a cloth lying on a table. Their approach greatly reduces the complexity of the problem by dividing the cloth into two quasi-static polygons: one that lies flat on the table and one that is suspended by the grippers. Willimon [60] considered the problem of flattening a crumpled or folded piece of laundry. Their system is able to grab an item from a pile of laundry, classify it using RGBD data and active manipulation, and unfold it before moving on to the next item. Li et al. [61] used a simulation-based approach to find the optimal end effector trajectory during a single folding motion. Given the start and end positions of the manipulator, an offline simulation explores possible trajectories in order to minimize the error between the result and the user-defined "folded" state. This optimal trajectory avoids errors such as dragging the cloth (if the path is too low) or lifting and piling it up (if the path is too high). The learned trajectory can then be generalized to garments of similar shapes.
In a series of papers which are summarized in [62], Doumanoglou et al. developed a complete system to fold a pile of crumpled clothes with a dual-armed robot.The first step of their pipeline, picking up a single garment from the pile, is done by using a depth camera to detect folds in the pile, which are "the most suitable grasping points even for humans".One of the grippers is then moved to grasp the fold that is the highest in the pile.In the second step, unfolding re-grasps, the pose of the object is defined by the relationship between the current grasp points and the manually selected grasp points which will cause the cloth to unfold due to gravity.This representation allows the grippers to be moved towards the optimal grasps easily.Once the garment is laid on the table, an intermediary step is necessary to completely flatten it.Here, a small brush is used by one of the arms to push the detected folds towards the exterior while the other arm holds the cloth in place.Finally, the folding procedure is performed by matching the detected cloth polygon with a predefined triangular folding move (g-fold).
Sannapaneni et al. [63] developed an algorithm for learning a folding sequence from the visual detection of special markers attached to key points of the cloth during a demonstration.The learned sequence is then generalized to handle different sizes of clothes which have the same shape as in the demonstration.Yang et al. [64] preferred to use a deep learning approach with teleoperation training.The learned model is combined with the robot's sensor data at runtime to generate the folding plan.This combined approach increases the robustness of the system and allows the robot to complete or restart the task when disturbed by an external event.Jia et al. [65] used a "visual feedback dictionary" to map visual features of the cloth material to velocities of the end effector.Combined with their representation of features as "histograms of oriented wrinkles" with high and low-frequency components, this allowed them to complete complex manipulations with little training data.
Some recent work has also tackled the task of assisted dressing, where a person with limited mobility would be helped by a robotic assistant in the everyday task of dressing up.The system of Yamazaki et al. [66] helps a sitting person put on a pair of pants.They used the optical flow to estimate the state of the clothing and developed a path planning algorithm that adapts to the length, size and position of the person's legs.Moreover, the system is able to recover from failures, such as the bottom of the pants getting stuck on a toe, by attempting to revert to the previous state.Gao et al. [67] used an online iterative optimization algorithm to find a person's preferred dressing path for putting on a sleeveless vest.The general human pose and movement space is recognized through vision sensors, while force control is used for local optimization.The initially configured dressing path is therefore iteratively driven towards the path of least resistance, which is considered to be the person's preferred dressing path.Zhang et al. [68] used a hierarchical task structure in which the path planning task is subordinated to the task of minimizing the interference on the user's movements.This system is therefore able to help someone put on a top while taking into account their movements as well as their previously estimated range of motion.The interference minimization is accomplished mostly through force sensing as the occlusions from the robot and clothing prevent accurate real-time visual pose estimation.
In an industrial setting, Flixeder et al. focused on the task of layering sheets of cloth-like material over a rigid shape, as it is a common problem for tasks such as fiberglass reinforcement and shaping leather. In [69], they experimented with multi-arm manipulation and tested various control strategies for position, force-impedance, and parallel position and force control. In [70], they proposed a mechatronic design as well as control and planning strategies for accurate lay-up of flexible strips on a complex mold. The work of Li et al. [71,72] is focused on cancelling the unwanted deformation of a flexible PCB prior to soldering it. This is done with adaptive region control that allows an assistive arm to control the deformation from any point in the desired region. Significant effort is also devoted to ensuring that the deformation is stabilized before the soldering is performed, while completing the task in as little time as possible. Another approach to cancel unwanted deformations in large plates of sheet metal or plastic was explored by Park et al. [73], who embedded a "smart damper" in the gripper. As the end effector cancels most of the deformation, the rest of the manipulator arm can be controlled as if the object was fully rigid.
Some research work attempted to solve more general problems related to the handling of non-rigid sheets rather than focusing on a specific task. Zacharia et al. [74] proposed a model-free approach to handling flexible sheets lying on a table by augmenting fuzzy logic with genetic algorithms for visual servoing. Shibata et al. [75] used simultaneous control of the displacement and deformation of a piece of cloth to perform the "wiping motion", one of the motion primitives used by humans when handling fabrics. Kinio and Patriciu [76] showed that an H∞ controller was superior to a PID controller for indirect target point manipulation in a sheet of silicone. Elbrechter et al. [77] developed a path planning algorithm for anthropomorphic hands with different friction coefficients to fold a piece of paper. This is done using visual and tactile feedback as well as real-time physics modelling of the sheet of paper. In their approach, a hierarchical state machine allows dynamically switching between different controllers, e.g., one to achieve contact with the paper and one to maintain contact. They note that the motion normal to the surface (to maintain contact) is controlled with tactile feedback while the motion in the tangential direction (to fold and crease the paper) is controlled independently with visual feedback. Bodenhagen et al. [46] used a learning approach with multiple learning and estimation phases to compute the ideal actions to apply to tasks such as conveyor grasping, peg-in-hole and laying down of a thick silicone sheet. Kruse et al. [78] used only force and vision feedback for the collaborative human-robot handling of a large sheet of fabric, where the robot follows the motions of the human while keeping the fabric taut. They later expanded this work in [79] by considering a mobile robot, which allows for a greater variety of tasks such as carrying a sheet around a corner.
Other interesting approaches to the manipulation of non-rigid sheets include the work of Dang et al. [80], who proposed a solution to control the general shape of a flexible surface with potential-field-based control of an array of microactuators embedded in the object. Patil and Alterovitz [81] worked on the problem of automated tissue retraction by surgical assistance robots. This task involves grasping and peeling back a thin layer of tissue to make the underlying area accessible to the surgeon. A sampling-based planner is used to explore the space of possible grasp points and paths to minimize the maximum deformation energy, the maximum stress and the total control effort needed to provide sufficient exposure of the underlying area. Their system relies on a physical simulation of the model and can be used to select an initial grasp point, optimize a human-defined retraction path or compute the entire motion sequence while avoiding obstacles. Inahara et al. [82] performed the non-prehensile stretching and compression of a thin wheat dough by controlling the acceleration of a vibrating plate on which the object lies. This was later improved in Higashimori et al. [83] by allowing the object to "jump" from the plate during shaping. From the realm of computer graphics and animation, Bai et al. [84] developed an algorithm to compute the necessary joint torques of simulated anthropomorphic hands in order for a cloth to follow a defined motion. Notably, the task description is given as a path to be followed by any point(s) of the cloth. The forces on the contact points are automatically derived to bridge the state of the cloth and that of the hands, which is formulated as a model-predictive-control problem. Recently, Cocuzza and Yan [85] explored the shaping of a thin rheological object (fondant icing). They developed a method to identify the material's properties from tensile tests and used this model to compute an optimal motion path for a serial manipulator to shape the icing over a cake. The manipulation speed is increased by using the object's elastic deformation properties to control the final plastic deformation.
The handling of planar non-rigid objects is rapidly becoming a well-researched topic, with cloth-like materials receiving the most attention. This is easily explained by the number and variety of applications that they open for industrial and household automation. First, in the timeframe covered by this survey, the automated folding of laundry has moved from open-loop detection and folding of a single towel to full-fledged systems able to autonomously fold a pile of assorted garments. It appears that the main challenge for such a system is the extraction and identification of the item to fold before bringing it into a suitable initial configuration. This is due to the infinite number of poses that clothing may take. Assisted dressing, a second household task involving clothing, has also received some attention. Here, the focus is on eliminating risks of injury and increasing human comfort. This is done by using a combination of vision, tactile and force sensing to deal with occlusions and by developing algorithms that are able to comply with unexpected movements during the dressing session. However, these systems have not yet achieved much flexibility and are only able to handle a single garment and a fixed initial position.
A wider variety of materials is used in non-household environments, with the primary goal of shaping thin sheets of material.The control schemes used in these applications are as different as the tasks to which they are applied, but a common strategy is the use of force control in addition to position control, especially for materials that show some elasticity in the direction of the desired deformation.The use of tactile and force feedback is also a frequent addition to the primarily vision-based systems.Overall, and even though some areas have already been explored, the handling of planar non-rigid objects is an active research field with many unsolved problems and potential improvements.

2D Projection of Objects
Another subject of interest with regards to the manipulation of non-rigid objects is the active shaping of the 2D projection of an object.This includes tasks such as controlling the shape of an object's contour in an image, or indirectly moving internal object points to specific targets.Gopalakrishnan and Goldberg [86] considered the case of grasping an object with two frictionless contacts.They discussed the concept of "deform closure", which is the "deformation space" equivalent to holding a similarly shaped rigid object in form closure with contact points in concavities.They computed the optimal jaw separation by balancing the energy needed to release the object with the energy that would cause a permanent deformation.
In [87], Das explored multiple tasks related to the 2D control of non-rigid objects. First, they developed a planning algorithm to shape the contour of an object into a desired curve with an arbitrary number of planar manipulators. This system also computes the location of the contact points on the original contour from their position on the deformed contour. The second controller they designed is for collaborative target point positioning. In this case, multiple fingers or manipulators work together to bring an internal point to a desired position by applying forces to the object contour. Their final task is a path planning algorithm for inserting and positioning a bevel-tip needle through a soft tissue. The needle is only rotated by increments of 180° so that the deformation of the tissue, and therefore the path of the needle, are kept in the plane. This allows precise positioning of the needle tip.
Higashimori et al. [88] presented a method to actively shape a rheological object with unknown viscoelastic properties by separating the plastic deformation from the elastic deformation.To observe a large deformation and accurately estimate the material's properties without causing unwanted deformation, they set the maximum deformation in the "parameter estimation" phase to the desired deformation.Once the elastic response is known, they used integral force control to drive the plastic deformation.This allows for shorter manipulation times than position based control as it is not necessary to wait for the elastic deformation to dissipate.A similar property was exploited by Yoshimoto et al. [89] to shorten manipulation times.Once the elastic properties of the object are known, they used force control with visual feedback to drive the deformation stress.Once a sufficient force has been applied, the elastic recovery automatically drives the object towards the desired shape without further interaction, even though an excessive deformation was temporarily applied.
In [90], Das and Sarkar presented techniques to shape the outline of a planar rheological object with multiple manipulators.They discretized the object boundary into a number of control points equal to the number of manipulators, and developed an optimized motion planner to pick the initial contact points.This selection is based on minimizing an energy-like parameter between the desired curve and the original one.In [91], they augmented their controller with the shape Jacobian matrix, which maps the general shape of the object with the local shape changes at the control points.This also enables their planner to compute the optimal intermediate shapes as the object deforms.In [92], they expanded this work to move internal control points towards target locations by applying minimal forces to the object boundary.The total forces applied to the object are minimized by monitoring energy dissipation in order to select appropriate actuation points.
Alonso-Mora et al. [93] presented a system to achieve the collaborative manipulation of large flexible objects such as bedsheets with a separate wheeled robot handling each control point.The object is modelled as the triangulation polygon of its 2D projection (top view) with constraints on the minimum and maximum of the variable-length neighbour distances to prevent overstretching and excessive sagging.They used a centralized planner which defines the task as a change of configuration of the object polygon.Low level planning is left to the individual robots, which do not communicate directly.Instead, they observed the position of their neighbours and the force transmitted through the object to determine the object's current configuration.A receding horizon local planner is implemented on each robot to compute its next position and avoid collisions by solving a velocity optimization problem.Recently, Navarro-Alarcon and Liu [94] proposed a new representation of the object contour based on a truncated Fourier series.This representation, while being more compact than spatial-domain approaches such as point cloud, allows them to effectively ignore the high frequency components during visual servoing.This provides their system with more speed and reliability when dealing with objects that have a jagged contour, or if the desired shape is defined by a rough sketch.Instead of computing the object's full deformation model, their algorithm performs the online local estimation of the deformation properties.This information is used to iteratively drive the object towards the desired shape within the possibilities of the predefined control points.
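The benefit of a truncated Fourier representation of a contour can be illustrated with a short sketch. It treats each contour point as a complex number and keeps only the lowest harmonics; this parameterisation, the function names and the choice of harmonics are assumptions made for illustration and are not taken from [94].

```python
import cmath
import math

def fourier_coefficients(contour, max_harmonic):
    """Discrete Fourier coefficients c_k, for k = -max_harmonic..max_harmonic,
    of a closed contour sampled as a list of complex points x + 1j*y."""
    n = len(contour)
    return {
        k: sum(z * cmath.exp(-2j * math.pi * k * i / n)
               for i, z in enumerate(contour)) / n
        for k in range(-max_harmonic, max_harmonic + 1)
    }

def reconstruct(coeffs, n_points):
    """Rebuild a smooth contour from the retained (low-frequency) coefficients."""
    return [
        sum(c * cmath.exp(2j * math.pi * k * i / n_points)
            for k, c in coeffs.items())
        for i in range(n_points)
    ]
```

A unit circle perturbed by a tenth-harmonic ripple, for instance, is reduced to the plain circle when only harmonics up to 2 are kept, and the whole shape is then described by five complex numbers rather than by every sampled point. This is the sense in which high-frequency detail such as a jagged outline can be ignored during servoing.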
2D control of non-rigid objects covers many use cases where the desired deformation and the applied forces exist in the same plane. As such, it may be a reasonable simplification for handling 3D objects when the deformation along a certain dimension can be safely ignored, or when depth information is not available and would be difficult to collect. Once again, the recent research presents a variety of approaches, but a few trends still emerge. First is the prevalent use of multiple manipulators to handle potentially large objects [87,90–93], where sophisticated collaboration schemes are allowed by the reduced computational complexity of a 2D workspace. Another interesting strategy is the use of force control on the elastic deformation of the object in order to induce a desired plastic deformation in a shorter time [88,89]. Overall, it appears that most of the surveyed approaches use 2D control as a computational simplification for handling 3D objects when the depth constraints can be integrated in the 2D space [93] or ignored.

3D Objects
Recently, researchers have turned their attention to the automated handling of 3D non-rigid objects. Earlier works by Navarro-Alarcon et al. focused on the ability to create specific point and angle-based deformation features that can be observed by a 2D vision system. In [95–97], they built a dynamic velocity control law for a single control point on the object, together with an iterative estimation of the deformation Jacobian based on 2D visual feedback only. This technique allows them to avoid the need for prior knowledge and modelling of the object while explicitly controlling the object's elastic deformation through real-time visual servoing. Multiple types of deformation features were explored, namely moving a given point to a target location, creating a certain angle, and changing the distance between points. They expanded this work in [98] by presenting an analytical, energy-based solution for active deformation while retaining the adaptive behaviour of their controller and its ability to function in uncalibrated environments. In the same work, they developed a solution which does not need to compute the deformation optical flow in real time by making use of information from offline deformation tests, and considered curvature-based deformation features. In [99], they developed a controller which is able to make use of all 6 degrees of freedom available to their manipulator, which was not the case with their previous experiments. This larger range of motion allows for more flexibility in the deformation tasks, in terms of both simultaneously controllable features and the number of reachable configurations. More recently, in [12], they used stereoscopic vision feedback and a similar control algorithm for controlling 3D deformation features with two manipulators. Once again, no prior knowledge of the object model is required as the vector of deformation parameters is estimated in real time.
In recent work, Delgado et al. focused on in-hand manipulation of elastic objects with tactile control only. In [100], they started by classifying objects as rigid or non-rigid based on the "sensation of rigidity" and total displacement when varying the contact forces applied by the robotic fingers. Then, they introduced a planning and control system to maintain and adapt the contact forces while performing basic manipulation tasks such as lifting, rotating, squeezing and moving the object. This control strategy is based solely on tactile information and depends neither on an object model nor on knowledge of the object's weight and friction coefficient. In [18], they explored similar tasks while taking into account a minimalist spring-based model of the object. An initial exploration phase is used to find the object's elasticity parameters (Young's modulus) by varying the applied forces and measuring the displacement and stiffness. These data are then integrated in a basic model which connects each contact point to the object's centre of mass with a spring. Afterwards, manipulation tasks are planned and performed mostly by readjusting the fingers' positions and applied forces. In [101], they used tactile images created by dynamic Gaussians as a common representation for tactile data. This representation allows merging data from sensors with different resolutions and provides a high-level interface for controlling the pressure applied by each finger. Their system allows both the creation of tactile images from observed pressure data and the control of the robot fingers based on a desired tactile image. They tested this control strategy with different bimanual tasks where each hand is equipped with a different sensor technology and must respect a global desired tactile image while the handled object is being bent, folded or otherwise moved by the robot.
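A minimal sketch of such a spring-based representation is given below. It is an illustration under stated assumptions rather than the model of [18]: it assumes a single lumped linear stiffness (Hooke's law) fitted by least squares from the exploration data, and the function names `estimate_stiffness` and `contact_forces` are hypothetical.

```python
import math


def estimate_stiffness(forces, displacements):
    """Least-squares slope k of f = k * d through the origin.

    forces and displacements are the samples gathered while varying the
    applied force during the exploration phase (assumed linear response).
    """
    num = sum(f * d for f, d in zip(forces, displacements))
    den = sum(d * d for d in displacements)
    if den == 0.0:
        raise ValueError("need at least one non-zero displacement")
    return num / den


def contact_forces(stiffness, rest_lengths, contacts, centre):
    """Force magnitude at each contact for springs tied to the centre of mass.

    A positive value means the spring is compressed by the finger.
    """
    forces = []
    for rest, point in zip(rest_lengths, contacts):
        length = math.dist(point, centre)
        forces.append(stiffness * (rest - length))
    return forces
```

For example, with samples of 1, 2 and 3 N producing displacements of 1, 2 and 3 mm, the fitted stiffness is about 1000 N/m. A finger that compresses a 50 mm spring to 40 mm would then be predicted to feel about 10 N.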
The research on control of 3D non-rigid objects is quite recent and focuses on basic shaping tasks and in-hand control. Even though the small number of research groups discovered in this survey does not allow for much generalization, some trends may still be noticed. The control tasks are based on feedback from a single type of sensor, resulting in either visual servoing or tactile control. Moreover, they use only a minimal representation of the object, if any model is used at all. This leads to reactive strategies that must be constantly adjusted based on real-time observations. Overall, the manipulation of 3D non-rigid objects is a topic that is still in its infancy.

Learned Control
Given the recent popularity of machine learning algorithms, it is interesting to highlight their applications to the handling of non-rigid objects. Recent research works include Li et al. [61], who used multiple simulation experiments in order for their system to learn an optimal end effector path for a single folding motion. Lee et al. [102] presented a method for learning force-based manipulation "skills" from multiple demonstrations. Their system warps the demonstrated forces and end effector positions to match the current situation, and applies statistical learning to combine multiple demonstrations in order to perform a new task and automatically select a tradeoff between the error in position and the error in force. The demonstrations are done using either teleoperation or direct "kinesthetic teaching". They tested their system with tasks such as tying knots in ropes of various lengths, flattening a towel and erasing a whiteboard. Tang et al. [103] presented a new method to warp learned paths to new situations. In their work, the function relating the trained shape to the test shape is derived in the tangent space instead of in the Cartesian space. Contrary to the "point cloud" Cartesian mapping, this tangent mapping preserves the structural information of the object, therefore eliminating the risk of overstretching the object when warping manipulation paths from the training scene to the test scene. They tested their algorithm by shaping a simulated cable.
Sannapaneni et al. [63] developed a system that learns a folding task from visual demonstrations. Special markers are attached to key points of the object while it is folded by a human operator. The system learns the marker paths and is able to generalize them for folding articles of different sizes but identical shape. Yang et al. [64] used teleoperation training with deep learning in a folding task. The robustness of the system is increased by combining the learned model with sensory data in real time to compute the motion plan and recover from disturbed or interrupted tasks. Langsfeld [104] developed a system that learns multiple new tasks by observing a human performing the task as well as by iterative approximation. Tasks such as pouring a specific volume of fluid in a container and cleaning a compliant part were successfully learned and generalized to handle different parameters. Hu et al. [105] performed the online learning of an object's deformation model by using a Gaussian Process Regression algorithm that selectively ignores uninformative data. This learned model is used to build a visual servoing controller to manage the 3D deformation of various objects. This system is evaluated by performing tasks such as bending a rolled towel or a plastic sheet, folding a towel, and placing a piece of fabric such that pins may be inserted in specific locations.
Machine learning was shown to be a powerful tool when dealing with the complex interactions involved in the control of non-rigid objects. It is especially valuable when it avoids the need to build an explicit control algorithm for tasks that may be difficult to describe formally, or when it is not feasible to capture all of the task parameters. Overall, the most popular learning approach for robotic manipulation of non-rigid objects appears to be learning by demonstration, where the robot learns and generalizes a task by "watching" a human perform it. This is a powerful paradigm as it allows a general-purpose robot to perform multiple tasks with non-rigid objects without the need to build a separate controller for each task.

Other Control Strategies
Even though the categories presented in previous sections cover most use cases and strategies for controlling robotic manipulators handling non-rigid objects, many interesting approaches are not easily related to any of these specific groups. The research presented in this section discusses these exotic control strategies, solutions to general problems which are not limited to a single object geometry, and some unusual applications.
Smolen and Patriciu [106] explored surgical applications with a simulation approach to deformation planning. They used the reproducing kernel particle method to simulate a soft tissue that is manipulated by several control points at its boundary. The goal of the planner is to move internal control points to target positions by applying forces to the external manipulation points. Goldman et al. [107] presented multiple algorithms to enable surgical robots to autonomously map the shape and stiffness of living tissues. Their techniques rely on force and position data to capture the object parameters in the immediate probing region. A hybrid force-motion controller performs the exploration sequence in a user-defined area and uses a recursive algorithm for multiresolution sampling based on the local stiffness differences. They also considered the exploration of "deep" features by following the natural boundaries defined by stiffness segmentation.
Sugaiwa et al. [108] presented an algorithm to set the grasping force for an initially unknown object by measuring its physical properties. Their in-hand sensing approach uses the deflection of a passive mechanical element to detect the moment of deformation or slippage, which they combined with the applied forces and hand configuration to deduce specific properties. First, they measured the forces required to create a dent in the object as well as to prevent slippage when lifting. All measurements are done by incrementally increasing the force applied to the object until it starts to move. The object's stiffness (denting force) is measured by checking for discrete deformations, its weight is computed by attempting to roll it on the table, and the friction coefficient between the object and the fingers is measured by pushing the object "into" the table while holding it by its sides. The lift-off force is then computed based on the object's weight and friction coefficient. The signed difference between the denting force and the lifting force is used to classify the object as rigid, soft or excessively soft, and the grasp force is set to the lifting force for rigid and excessively soft objects, and to the denting force for soft objects. This allows setting the grasping force as high as possible to avoid slippage while minimizing the deformation.
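The selection rule described above can be sketched in a few lines of Python. This is an illustration, not the implementation of [108]. The Coulomb friction formula for the lift-off force, the number of contacts, and the `rigid_margin` threshold separating soft from rigid objects are all assumptions introduced here, since the survey does not state how these quantities are defined.

```python
G = 9.81  # gravitational acceleration, m/s^2


def lift_force(mass_kg, friction_coeff, n_contacts=2):
    """Minimum normal force per contact so that friction carries the weight.

    Assumes a simple Coulomb friction model with identical contacts.
    """
    return mass_kg * G / (n_contacts * friction_coeff)


def select_grasp_force(dent_force, lifting_force, rigid_margin=5.0):
    """Classify the object and pick a grasp force from the signed difference.

    rigid_margin (in newtons) is a hypothetical threshold, not a value
    taken from the surveyed work.
    """
    diff = dent_force - lifting_force
    if diff < 0.0:
        # The object dents before it can be lifted: deformation is unavoidable.
        return "excessively soft", lifting_force
    if diff < rigid_margin:
        # Squeeze as hard as possible without denting the object.
        return "soft", dent_force
    # Denting is far away: the lifting force is sufficient.
    return "rigid", lifting_force
```

With a 0.5 kg object and a friction coefficient of 0.5, `lift_force` gives about 4.9 N per finger. A denting force of 6 N would then classify the object as soft and set the grasp force to 6 N.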
Berenson [109] used the concept of diminishing rigidity to avoid modelling and simulating the objects being manipulated. This principle states that the effect of a force on a non-rigid object diminishes as the distance to the application point increases. This property is used to quickly estimate the deformation Jacobian, which in turn is used to drive internal control points towards targets based on external forces. The technique was applied to tasks such as tying a rope around a cylinder, spreading a cloth on a table, and collaborative folding, all while correcting for overstretching and avoiding obstacles. Frank et al. [25] included the handling of non-rigid objects in the path planning system of a mobile robot. As the robot navigates the environment, it interacts with surrounding objects to estimate their deformation properties and takes them into account when building a model of the environment. This allows the platform to consider passing through a curtain or pushing aside a plant in order to reach an otherwise inaccessible area.
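The diminishing rigidity principle can be illustrated with a small sketch. The example below is written under explicit assumptions and should not be read as the formulation of [109]: it assumes an exponential decay of influence with Euclidean distance, a hypothetical `decay` parameter, and purely translational gripper motion.

```python
import math


def rigidity_weight(distance, decay=2.0):
    """Influence of a gripper on a point: 1 at the gripper, decaying with distance.

    The exponential form and the decay value are illustrative assumptions.
    """
    return math.exp(-decay * distance)


def predict_point_motion(points, gripper_pos, gripper_velocity, decay=2.0):
    """Approximate velocity of each object point caused by a gripper translation.

    Each point follows the gripper, scaled by its rigidity weight. Stacking
    these weights gives a cheap approximation of the deformation Jacobian.
    """
    motions = []
    for point in points:
        w = rigidity_weight(math.dist(point, gripper_pos), decay)
        motions.append(tuple(w * v for v in gripper_velocity))
    return motions
```

A point held by the gripper moves with the full gripper velocity, while a point one metre away moves with only about 14% of it when `decay` is 2.0. No physical model or simulation of the object is needed to obtain this estimate.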
Essahbi et al. [110] addressed the muscle separation process in the meat industry. The goal is to dynamically generate the pulling and cutting tasks for a multi-arm system to separate the meat from the bone. The workpiece is detected and modelled with input from a structured light vision system and force sensors, and the cutting path is selected based on tissue curvature. The object model predicts the tissue behaviour once it is cut and helps in setting the force applied by the cutting and pulling arms. Langsfeld et al. [111,112] automated the bimanual cleaning of a compliant part. Initially, only the geometrical shape of the part is known to the system. The part stiffness is discovered during the cleaning task and integrated as a linear finite element model. The goal is to achieve efficient path planning for the cleaning arm as well as for the grasping arm, as regrasps might be necessary in order to hold the part closer to the area being cleaned and avoid excessive deformation. Wnuk et al. [113] performed a general analysis of a complete bin-picking scenario where the system could handle non-rigid objects through simulation. They described the different steps of the process, namely object localization, approach, grasping, and subsequent manipulation, as well as the different challenges faced during each step. This work allows them to develop the hardware and software requirements for the successful completion of the task, as well as provide a theoretical system architecture to meet these requirements.

Conclusions
This paper presents a review of the latest developments related to the robotic manipulation of non-rigid objects, providing an integrated coverage of the sensing, modelling and control strategies used in these systems.
Data acquisition systems remain either purely vision-based or vision-based with only limited integration of force sensors. Improvements have mainly been seen in the incorporation of depth data through commercial RGB-D cameras in order to deal with 3D representations. Future works are expected to show a greater adoption of solutions based on multi-modal setups, which have shown promising results for robotic perception. Moreover, improvements in data fusion strategies as well as a greater role for tactile data should be explored. Meanwhile, the description of non-rigid objects is still dominated by physical models, most of which can, in principle, handle linear elastic deformation only. While this can be enough for specific objects, it presents a clear limitation with regard to generalization. Nevertheless, there is a limited amount of work, presented in learned and hybrid models as well as model-free approaches, in which the focus has been to increase the complexity and generalization of the representation. However, these systems are still at an early stage of development, since several of them have been developed for 2D deformations only. For this reason, future works should move towards robust 3D representations, for which promising results have been obtained by adding machine learning strategies in the modelling systems.
In terms of control and planning strategies, most of the recent research efforts are focused on the handling of planar non-rigid objects, and particularly on cloth-like materials. However, much research is also concerned with the shape control of objects in a 2D projection, as can be observed with a simple vision system. In contrast, few researchers have yet tackled the problem of manipulating 3D objects. Following the current trend in robotics and computer systems, many of the more recent research approaches address the problem through machine learning techniques, regardless of the specific type of object handled or the task being performed.
Future developments on the control of robotic manipulation of non-rigid objects are expected to follow and expand upon the current trends. While there appears to be little work left to be done on linear objects, there is still much to learn on planar and 3D objects. Even though laundry folding has been widely researched, few complete systems have yet been developed. Following this, it can be expected that more optimizations and robust, complete solutions will be developed in the near future. Assisted dressing systems are similarly expected to become more robust and gain in flexibility and, after compliance with human movements is solved, the next logical step could be the handling of multiple types of clothing in a single system. In industrial settings, more complex applications with varied planar materials are expected to emerge as the control techniques for planar objects improve.
As the research on planar non-rigid objects becomes more abundant, attention will gradually begin to turn to 3D objects. Even if it is clear that the sensing and control simplifications introduced by controlling the 2D projection of the object will continue to be widely useful, some critical applications may not afford them. Eventually, these specialized tasks, which require full control of 3D non-rigid objects, will demand robotic replacements for human workers. More generally, as machine learning techniques and computing power improve, it can be expected that learning approaches and artificial intelligence will take a more prevalent place in robotics, with complex tasks such as controlling non-rigid objects at the forefront. As the control strategies continue to be improved, it is expected that support for non-rigid objects of all types will be integrated in a wider variety of systems and applications, bringing human-like manipulation skills to general-purpose robots.
Overall, this survey aims to orient the reader among the latest developments in robotic sensing, modelling and manipulation of non-rigid objects, an area that is growing in popularity and importance as robotic manipulators are involved in more sophisticated tasks and reaching out of the factory to collaborate with human beings in daily activities.

Table 1. Characteristics of the surveyed sensing systems.

Table 2. Characteristics of the surveyed object description strategies.