Active Prior Tactile Knowledge Transfer for Learning Tactual Properties of New Objects

Reusing the tactile knowledge of previously-explored objects (prior objects) helps us to easily recognize the tactual properties of new objects. In this paper, we enable a robotic arm equipped with multi-modal artificial skin to actively transfer, as humans do, its prior tactile exploratory action experiences when it learns the detailed physical properties of new objects. These experiences, or prior tactile knowledge, are built from the feature observations that the robot perceives from multiple sensory modalities when it applies pressing, sliding, and static contact movements on objects with different action parameters. We call our method Active Prior Tactile Knowledge Transfer (APTKT) and systematically evaluate its performance through several experiments. Results show that the robot improved the discrimination accuracy by around 10% when it used only one training sample together with the feature observations of prior objects. By further incorporating the predictions from the observation models of prior objects as auxiliary features, our method improved the discrimination accuracy by over 20%. The results also show that the proposed method is robust against transferring irrelevant prior tactile knowledge (negative knowledge transfer).


Motivation
We humans perceive tactual properties of an object (e.g., stiffness, texture, temperature, weight) by applying exploratory actions (e.g., pressing, sliding, static contact, lifting) [1]. After applying different exploratory actions on an object, we can obtain its different tactile information. Conversely, making the same exploratory action on different objects produces different tactile observations. Therefore, when we learn about an object, we always link its physical properties with the exploratory actions that we apply on it.
Besides different kinds of exploratory actions, the tactile information we perceive from an object also depends on how we apply an action. Consider the example of pressing on two objects. Object 1 is made of soft sponge, and object 2 is made by covering a solid metal core with a soft sponge surface. When pressing our fingertips on both objects with a small normal force, we perceive similar object deformations. However, if we press with a larger normal force, object 1 deforms much more than object 2, since we have reached the metal part of object 2. A similar situation occurs when we apply the sliding movement on object surfaces with different forces and velocities. As a result, by applying different exploratory actions in different ways, we can build a detailed knowledge of the object's tactual properties, which we call tactile exploratory action experiences.
We humans learn about new objects in an active and incremental way. We actively select the most informative exploratory actions to interact with them [2,3]. More importantly, we relate these new objects with the experiences of exploring objects that we have previously encountered. By transferring the prior tactile knowledge, or prior tactile exploratory action experiences, we can largely reduce the amount of exploratory actions required to discriminate among new objects. In this way, we humans save a lot of time and energy, and recognize new objects with high accuracy [4][5][6][7][8][9][10].
Can robotic systems with a sense of touch also perform like humans to actively transfer the past tactile exploratory action experiences when learning about new objects (transfer learning)?

Background
Over the past decades, researchers have developed various tactile sensors and mounted them on robotic systems (e.g., [11][12][13][14][15][16][17]). In this way, robots with a sense of touch can perceive different objects' tactual properties by applying exploratory actions. For example, a robot can slide its sensory parts on objects to sense their textural properties [18][19][20][21], establish a static contact to estimate the temperature [22], or lift objects to measure their center of mass [23]. Bhattacharjee et al. [24] developed algorithms to classify objects into four categories: (1) Hard-Unmoved; (2) Hard-Moved; (3) Soft-Unmoved; and (4) Soft-Moved, using a One-Nearest-Neighbor classifier, Hidden Markov Models, and Long Short-Term Memory networks based on features of time-varying tactile sensor data (maximum force, contact area, and contact motion). Furthermore, several methods have been proposed for the active object exploration problem, in which the robot actively applies multiple exploratory actions to recognize objects (e.g., [25][26][27][28][29][30][31][32]).
However, the problem of transferring robotic tactile knowledge has rarely been investigated. Even though many transfer learning techniques have been successfully applied in several areas (e.g., Natural Language Processing: [33]; WiFi-based localization: [34]; Computer Vision: [35][36][37][38]; Bio-informatics: [39]), our works were the first to introduce transfer learning to the tactile domain. Previously, Kaboli et al. [20,21] developed a novel textural descriptor. Using this descriptor, a ShadowHand dexterous robotic hand equipped with BioTac sensors on its fingertips could efficiently discriminate among object surface textures. Later, we designed a transfer learning method [40][41][42] so that the robotic hand could reuse the prior texture models from 12 objects to learn about the surface textures of 10 new objects. However, since only the sliding movement was applied, the robot could only transfer the object textural properties.
In our previous works [43,44], we proposed an active touch learning method in which a UR10 robotic arm with an artificial skin on its end-effector or fingertips could apply sliding, pressing, and static contact movements to learn about objects' surface texture, stiffness, and thermal conductivity, respectively. Even though our active learning method enables the robot to efficiently learn about objects, the robot still needs to learn from scratch given a new set of objects. In this regard, we recently proposed, for the first time in the robotics and tactile domains, an algorithm called Active Tactile Transfer Learning (ATTL) [45] to actively transfer multiple physical properties of prior objects. Using ATTL, the UR10 robotic arm could actively select which prior knowledge to transfer (surface texture, stiffness, and thermal conductivity, obtained by applying sliding, pressing, and static contact movements). As a result, the robot could use fewer training samples (even one sample) to achieve a higher recognition rate when learning about new objects.
The robotic systems in the above-mentioned works only applied exploratory actions with fixed action parameters, e.g., sliding with a fixed velocity to perceive surface textures. In order to learn the objects' detailed physical properties (e.g., the vibro-tactile feedback produced by sliding at different speeds) and thus better discriminate among them, the robots should be able to apply exploratory actions with different action parameters.

Contribution
In this paper, we focus on actively transferring the prior tactile exploratory action experiences to learn more details about the physical properties of new objects (see Figure 1). Our contributions are two-fold:
• We enable a robot to apply exploratory actions with multiple action parameters. In this way, the robot gains more detailed tactile information.
• We propose an active tactile transfer learning algorithm so that the robot leverages the previously obtained detailed tactile knowledge (prior tactile exploratory action experiences) while learning about a new set of objects.
In the sequel, we first introduce the robotic system (Section 2). Then, we illustrate how the robot applies exploratory actions and obtains the physical properties of objects (Section 3). Afterwards, we illustrate our proposed tactile transfer learning method in detail (Section 4), followed by a systematic evaluation of the method (Section 5). We finalize this paper with a conclusion and a discussion of future works (Section 6).

Multi-modal Artificial Skin
To enable the robot to perform more human-like behaviours with multiple tactile sensing modalities, we designed and manufactured a multi-modal artificial skin (Figure 2a), made of seven active tactile modules (Figure 2b) [12]. Each module is a small hexagonal printed circuit board equipped with off-the-shelf sensors (one temperature sensor, one accelerometer, three normal force sensors, and one proximity sensor). The complete artificial skin thus contains seven temperature sensors, seven accelerometers, 21 normal force sensors, and seven proximity sensors. It can emulate the human tactile sensing of temperature, vibration, force, and light touch. The technical information of the sensors is summarized in Table 1.

UR10 Robotic Arm
We mounted the multi-modal artificial skin on the end-effector of a Universal Robots UR10 arm with six DoFs (Figure 2a). The UR10 was controlled in collaboration with the artificial skin in order to apply different exploratory actions on objects.

Exploratory Actions Definition
By applying exploratory actions on objects with different action parameters, the robot can attain different feature observations. In this work, we consider three types of exploratory actions: pressing (denoted as P), sliding (denoted as S), and static contact (denoted as C). Formally, we define $N_\alpha$ exploratory actions as $\mathcal{A} = \{\alpha_n^{\theta_n}\}_{n=1}^{N_\alpha}$, where $\theta_n$ denotes the action parameters that define "how" the robot applies the exploratory action. We further define $\theta_n \in \{\theta_P, \theta_S, \theta_C\}$, where $\theta_P$, $\theta_S$, and $\theta_C$ represent the action parameters for the pressing, sliding, and static contact movements, respectively.
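As an illustration, the action set $\mathcal{A}$ and its parameters can be encoded as plain data. The sketch below uses the seven concrete parameterizations that appear later in the experiments (Section 5); the class and field names are our own, not from the original implementation.

```python
from dataclasses import dataclass

# Minimal sketch of the action set A = {alpha_n^theta_n}; names are
# illustrative, not the authors' code.
@dataclass
class ExploratoryAction:
    kind: str     # "P" (pressing), "S" (sliding), or "C" (static contact)
    theta: dict   # action parameters: "how" the action is applied

# The seven parameterizations used in the experiments (Section 5)
ACTIONS = [
    ExploratoryAction("P", {"d_P_mm": 1, "t_P_s": 3}),                  # P1
    ExploratoryAction("P", {"d_P_mm": 2, "t_P_s": 3}),                  # P2
    ExploratoryAction("S", {"F_S_N": 0.1, "v_S_cm_s": 1, "t_S_s": 5}),  # S1
    ExploratoryAction("S", {"F_S_N": 0.1, "v_S_cm_s": 5, "t_S_s": 1}),  # S2
    ExploratoryAction("S", {"F_S_N": 0.2, "v_S_cm_s": 1, "t_S_s": 5}),  # S3
    ExploratoryAction("S", {"F_S_N": 0.2, "v_S_cm_s": 5, "t_S_s": 1}),  # S4
    ExploratoryAction("C", {"d_C_mm": 2, "t_C_s": 15}),                 # C1
]
assert len(ACTIONS) == 7
```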

Pressing
The robotic system presses its sensory part on the object surface in order to perceive its stiffness (see Figure 3a). The pressing movement consists of pressing until a depth of $d_P$ and holding the artificial skin for $t_P$ seconds, i.e., $\theta_P = [d_P, t_P]$. During pressing, the multi-modal artificial skin records the normal force feedback from each normal force sensor, $F_{n_f,n_s} = \{F^m_{n_f,n_s}\}_{m=1}^{t_P \cdot f_s}$, in order to measure the object stiffness. Here $n_f$ is the index of a normal force sensor in one skincell ($n_f = 1, \dots, N_f$; in our case $N_f = 3$), $n_s$ is the index of the skincells in the artificial skin ($n_s = 1, \dots, N_s$; in our case $N_s = 7$), $f_s$ is the sampling rate of the artificial skin, and $m$ the sampling time step. In addition to the normal force feedback, the robot also records the temperature feedback from each temperature sensor in order to measure the object thermal conductivity: $T_{n_t,n_s} = \{T^m_{n_t,n_s}\}_{m=1}^{t_P \cdot f_s}$, $n_t = 1, \dots, N_t$, with $N_t$ being the number of temperature sensors in one skincell (in our case $N_t = 1$).

Sliding
The robot slides the artificial skin on the object surface and perceives its textural properties [18,21] (see Figure 3b). To do this, the robot first establishes contact with the object with a normal force of $F_S$, then linearly slides on the object with a speed of $v_S$ for $t_S$ seconds, $\theta_S = [F_S, v_S, t_S]$. During sliding, the robot collects the outputs of the accelerometers along the three axes $x$, $y$, $z$ and combines these signals together: $a = \{a_{n_a,n_s}\}_{n_a=1,n_s=1}^{N_a,N_s}$ with $a_{n_a,n_s} = [a^{(x)}_{n_a,n_s}, a^{(y)}_{n_a,n_s}, a^{(z)}_{n_a,n_s}]$, $n_a = 1, \dots, N_a$, where $N_a$ is the number of accelerometers in one skincell (in our case $N_a = 1$). Besides, the change of temperature during sliding is also collected as extra information: $T_{n_t,n_s} = \{T^m_{n_t,n_s}\}_{m=1}^{t_S \cdot f_s}$.

Static Contact
The object thermal cues can be attained by the robotic system by applying the static contact movement: the robot presses its sensory part against the object surface until a depth of $d_C$ and maintains the contact for $t_C$ seconds, i.e., $\theta_C = [d_C, t_C]$ (see Figure 3c). The normal force and temperature feedbacks are recorded as in the pressing movement: $F_{n_f,n_s} = \{F^m_{n_f,n_s}\}_{m=1}^{t_C \cdot f_s}$ and $T_{n_t,n_s} = \{T^m_{n_t,n_s}\}_{m=1}^{t_C \cdot f_s}$.

Stiffness
We use the normal force averaged over all normal force sensors and time steps as an indicator of the object stiffness. For the pressing movement with $t_P \cdot f_s$ time steps, the object stiffness can be estimated by $\bar{F} = \frac{1}{N_f \cdot N_s \cdot t_P \cdot f_s} \sum_{n_s=1}^{N_s} \sum_{n_f=1}^{N_f} \sum_{m=1}^{t_P \cdot f_s} F^m_{n_f,n_s}$.
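The averaging above reduces to a single mean over the sensor-by-time force matrix. A minimal sketch, with variable names of our own choosing:

```python
import numpy as np

def stiffness_feature(F):
    """Mean normal force over all sensors and time steps (F-bar).

    F: array of shape (N_f * N_s, t_P * f_s) -- one row per normal force
    sensor (3 sensors x 7 skincells = 21 rows), one column per time step.
    A sketch of the averaging in Section 3; names are illustrative.
    """
    return float(np.mean(F))

# At the same pressing depth, a stiffer object pushes back harder,
# so its averaged force reading is larger.
soft = np.full((21, 30), 0.2)   # toy force readings in Newtons
hard = np.full((21, 30), 1.5)
assert stiffness_feature(hard) > stiffness_feature(soft)
```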

Textural Property
In this work, we use the same textural feature extraction method as in [43]: the vibration signals $a$ from the artificial skin are used to calculate the activity, mobility, and complexity features, denoted as $A(a)$, $M(a)$, $C(a)$. These features represent the object tactile properties in the time domain. We also compute the linear correlation of the accelerometer signals between the different direction pairs ($xy$, $yz$, $xz$), denoted as $L(a)$, since these accelerometer components are correlated with each other during the sliding movement. The final textural descriptor combines activity, mobility, complexity, and linear correlation together [43]: $[A(a), M(a), C(a), L(a)]$.
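The activity, mobility, and complexity features named above are the classical Hjorth parameters of a time-domain signal. The sketch below computes them per axis and appends the inter-axis correlations; function names and the exact concatenation order are our assumptions, not the authors' implementation.

```python
import numpy as np

def hjorth(a):
    """Activity A(a), mobility M(a), complexity C(a) of a 1-D vibration
    signal -- a sketch of the time-domain features named in the text."""
    da, dda = np.diff(a), np.diff(a, n=2)
    activity = np.var(a)
    mobility = np.sqrt(np.var(da) / np.var(a))
    complexity = np.sqrt(np.var(dda) / np.var(da)) / mobility
    return activity, mobility, complexity

def texture_descriptor(ax, ay, az):
    """Concatenate per-axis Hjorth features with the inter-axis linear
    correlations L(a) over the pairs (xy, yz, xz)."""
    feats = [f for axis in (ax, ay, az) for f in hjorth(axis)]
    corr = [np.corrcoef(p, q)[0, 1] for p, q in ((ax, ay), (ay, az), (ax, az))]
    return np.array(feats + corr)

rng = np.random.default_rng(0)
ax, ay, az = rng.normal(size=(3, 500))   # toy accelerometer axes
d = texture_descriptor(ax, ay, az)       # 3 axes x 3 Hjorth + 3 correlations
assert d.shape == (12,)
```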

Thermal Conductivity
To extract the features that describe the object thermal cues, we first calculate the average temperature sequence over all temperature sensors: $\bar{T} = \frac{1}{N_t \cdot N_s} \sum_{n_t=1}^{N_t} \sum_{n_s=1}^{N_s} T_{n_t,n_s}$. We then calculate its gradient at each time step, $\nabla \bar{T}$, and combine it with the average temperature sequence: $[\bar{T}, \nabla \bar{T}]$. To avoid the curse of dimensionality, we further reduce this combination to 10 dimensions via the Principal Component Analysis (PCA) method and use the result as the final feature describing the object thermal conductivity. Table 2 summarizes the exploratory actions, the sensory feedbacks, and the corresponding tactile features.
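The thermal pipeline above can be sketched in a few lines. Note that PCA must be fit across many recordings, so the example fabricates a small batch of toy temperature sequences; the batch setup and all names are our assumptions.

```python
import numpy as np
from sklearn.decomposition import PCA

def temp_vector(T_raw):
    """T_raw: (N_t * N_s, L) raw temperature sequences from all sensors.
    Returns the concatenation [T_bar, grad(T_bar)] described in the text."""
    T_bar = T_raw.mean(axis=0)    # average over the N_t * N_s sensors
    grad = np.gradient(T_bar)     # per-time-step gradient
    return np.concatenate([T_bar, grad])

# Fit PCA across a batch of toy recordings and reduce to 10 dimensions,
# as the paper does (the batch itself is synthetic).
rng = np.random.default_rng(0)
recordings = [rng.normal(size=(7, 50)) for _ in range(20)]  # 7 sensors, 50 steps
X = np.stack([temp_vector(r) for r in recordings])          # shape (20, 100)
features = PCA(n_components=10).fit_transform(X)
assert features.shape == (20, 10)
```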

Exploratory action | Action parameters (θ)       | Sensory feedbacks          | Features
Pressing           | θ_P = [d_P, t_P]            | normal force, temperature  | stiffness, thermal conductivity
Sliding            | θ_S = [F_S, v_S, t_S]       | acceleration, temperature  | surface texture, thermal conductivity
Static contact     | θ_C = [d_C, t_C]            | normal force, temperature  | thermal conductivity

Transferring Prior Tactile Exploratory Action Experiences
This section describes our proposed active prior tactile knowledge transfer algorithm (APTKT) in detail. First, we formulate our problem (Section 4.1) and briefly review Gaussian Process Classification (Section 4.2). Then, we illustrate our transfer learning method, including its process (Section 4.3) and the problems of what to transfer (Section 4.4), how to transfer (Section 4.5), and from where and how much to transfer (Section 4.6). The motivation of our method is illustrated in Figure 1.

Problem Formulation
Assume that a robotic system has gained prior tactile knowledge of some old objects, on which the robot has previously applied different exploratory actions with different action parameters. These prior exploratory action experiences consist of the feature observations perceived by the multiple sensors and observation models from the old objects. Now, the robot is tasked to learn about a set of new objects. Since the old objects might share some similar physical properties with the new objects, by leveraging the related tactile exploratory action experiences, the robot can learn about new objects more efficiently.
We define $N_{new}$ new objects $C^{new} = \{c^{new}_j\}_{j=1}^{N_{new}}$ that the robot is tasked to learn about through different exploratory actions $\mathcal{A} = \{\alpha_n^{\theta_n}\}_{n=1}^{N_\alpha}$ (for simplicity, we denote an exploratory action as $\alpha$ in the rest of the paper). In other words, the robot should actively attain feature observations of the new objects. We further define the robot's prior tactile experience for an exploratory action $\alpha$ as the feature observations $V^{old}_\alpha$ and the observation models of the $N_{old}$ prior objects. These feature observations are collected by the multiple tactile sensors of the artificial robotic skin.
We formulate our problem as the transfer learning in the Gaussian Process Classification (GPC) framework [46], where each object is regarded as a class, and for each exploratory action, a GPC model is built as the observation model. The robot iteratively applies the exploratory actions and leverages prior tactile knowledge to improve the GPC observation models of new objects.

Gaussian Process Classification
The Gaussian Process Classification (GPC) model describes the mapping between the observation set $X$ and the output set $Y$ by $f: X \to Y$. The latent function $g(x)$ in the GPC model is assumed to be sampled from a high-dimensional Gaussian distribution called the GP prior [46]: $g(x) \sim \mathcal{GP}(m(x), K(x, x'))$, where each sample $g(x)$ is a random variable. In this work, we use one-vs-all multi-class classification. For each object class, a binary GPC $f_n(\cdot)$ whose output label is converted to $\{-1, +1\}$ is trained, one for each of the $N$ labels. Given a new sample $x_*$, each binary classifier predicts the observation probability of its label $p(y_n|x_*)$. The sample is assigned to the class with the largest prediction probability: $y_* = \arg\max_{y_n \in Y} p(y_n|x_*)$.
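The one-vs-all GPC prediction rule can be sketched with scikit-learn's `GaussianProcessClassifier`, which implements exactly this one-vs-rest scheme; the toy "objects" (classes) below are synthetic stand-ins for the observation models.

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessClassifier
from sklearn.gaussian_process.kernels import RBF

# Three toy object classes with well-separated 2-D feature observations
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(loc=c, scale=0.3, size=(15, 2)) for c in range(3)])
y = np.repeat([0, 1, 2], 15)

# One binary GPC per class (one-vs-rest), as in the text
gpc = GaussianProcessClassifier(kernel=RBF(1.0), multi_class="one_vs_rest")
gpc.fit(X, y)

x_star = np.array([[2.0, 2.0]])      # new sample near class 2's mean
probs = gpc.predict_proba(x_star)    # normalized p(y_n | x*) per binary model
y_star = probs.argmax(axis=1)        # class with the largest probability
assert y_star[0] == 2
```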

Process
Following our proposed method, the robot first applies each exploratory action once on each new object in order to collect a small number of feature observations $V^{new} = \{V^{new}_{\alpha_n}\}_{n=1}^{N_\alpha}$ (initial data collection). Then, the robot reuses its prior tactile exploratory action experiences to improve the observation models for each new object (initial prior knowledge transfer). During this process, the robot compares the relatedness between its prior tactile exploratory action experiences and the new objects (Section 4.6), and chooses the most related one to transfer the old object feature observations $V^{old}$ (Section 4.5). Afterwards, the robot begins to iteratively collect and combine feature observations and update the prior tactile knowledge in order to improve the observation models. At each iteration of prior tactile knowledge updating, the robot (1) actively selects the next object and the next exploratory action in order to attain a new feature observation, and (2) updates the prior tactile knowledge for the selected exploratory action. The iteration terminates when there is no improvement in the observation models of new objects. Our algorithm is illustrated in Figure 4.

What to Transfer
When the robotic system applies an exploratory action on objects, it perceives multiple feature observations (e.g., by the pressing movement, the robot can perceive both the object stiffness and thermal conductivity). The prior tactile exploratory action experiences consist of the combined feature observations of prior objects from multiple sensory modalities, together with the corresponding GPC observation models of the prior objects.
In order to combine the observations perceived from different tactile sensors, we first define $v_\alpha$ as the feature observation of an exploratory action $\alpha$. It comprises multiple observations $v_\alpha = \{v^{(m_\alpha)}_\alpha\}_{m_\alpha=1}^{M_\alpha}$, where $v^{(m_\alpha)}_\alpha$ is an observation from the sensor modality $m_\alpha$ and $M_\alpha$ is the number of sensing modalities. For the pressing and static contact movements, we use the normal force and temperature sensing; for the sliding movement, the accelerometer and temperature sensing (Table 2). We then assume that for each sensor modality $m_\alpha$, a kernel function $K^{(m_\alpha)}$ is given. To combine multiple feature observations so as to exploit the information from all sensors after applying the exploratory action $\alpha$, we linearly combine the kernels: $K_\alpha = \sum_{m_\alpha=1}^{M_\alpha} \gamma^{(m_\alpha)}_\alpha K^{(m_\alpha)}$, where $\gamma^{(m_\alpha)}_\alpha \geq 0$. This hyper-parameter controls how much the robot can rely on the sensor modality $m_\alpha$. It ranges between 0 and 1, with $\gamma^{(m_\alpha)}_\alpha = 0$ indicating that the sensor feedback is not informative, and $\gamma^{(m_\alpha)}_\alpha = 1$ that it is highly useful. We further constrain these hyper-parameters with the $L_1$ norm: $\sum_{m_\alpha=1}^{M_\alpha} \gamma^{(m_\alpha)}_\alpha = 1$. For each exploratory action, a GPC observation model is built using $K_\alpha$. The hyper-parameters $\gamma$ and the kernel parameters are selected by maximizing the log marginal likelihood [46].
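The weighted kernel combination can be sketched as follows. Each modality contributes its own Gram matrix, and the weights obey the $L_1$ constraint; the paper selects the weights by maximizing the log marginal likelihood, which is omitted here, and all names are illustrative.

```python
import numpy as np
from sklearn.gaussian_process.kernels import RBF

def combined_gram(views, gammas):
    """Gram matrix K_alpha = sum_m gamma_m * K^(m)(X_m, X_m).

    views  : list of (n, d_m) arrays, one per sensor modality
    gammas : non-negative reliance weights with sum(gammas) == 1
    """
    assert np.isclose(sum(gammas), 1.0) and all(g >= 0 for g in gammas)
    return sum(g * RBF(1.0)(v) for g, v in zip(gammas, views))

# Toy force and temperature feature observations for 5 samples
rng = np.random.default_rng(2)
force, temp = rng.normal(size=(5, 3)), rng.normal(size=(5, 4))
K = combined_gram([force, temp], [0.7, 0.3])   # rely more on force
assert K.shape == (5, 5)
```

Since each RBF Gram matrix has unit diagonal, the $L_1$ constraint keeps the combined kernel's diagonal at 1, i.e., the combination behaves like a valid correlation-style kernel.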
How to Transfer

Consider the latent function values of an old object and a new object for the same exploratory action. We assume that these two function values are not independent of each other, but are sampled together from a dependent Gaussian Process (GP) prior. This dependent GP is then used to construct the GPC observation model of the new object, and the latent function is modified accordingly [45]. We further incorporate the relatedness between the prior object and the new object into the dependent GP model by introducing a dependent kernel function in which the cross-covariance between old and new samples is scaled by a relatedness parameter $\lambda$: $K_{dep} = \begin{pmatrix} K(X^{old}, X^{old}) & \lambda K(X^{old}, X^{new}) \\ \lambda K(X^{new}, X^{old}) & K(X^{new}, X^{new}) \end{pmatrix}$. We constrain the range of $\lambda$ within $[0, 1]$. As Chai et al. [47] evaluated, $\lambda = 0$ indicates that the old object and the new object are totally different, while $\lambda = 1$ indicates that the two objects are the same.

From Where and How Much to Transfer
Section 4.5 describes how to transfer the prior tactile knowledge to learn about new objects. This section illustrates how the robotic system selects the most related old object (from where to transfer) and how to determine the relatedness (λ) between two objects (how much to transfer).
To do this, we use our previously proposed method [45]: the prediction probability $p(c^{old}_i | V^{new})$ of the old objects' observation models on the new object's feature observations. A larger value indicates that the two objects are highly similar. Therefore, we can use it to select the most related old object (denoted as $c^{old}_*$) for a new object regarding the exploratory action $\alpha$. Furthermore, to avoid transferring irrelevant tactile information, we add a threshold $\epsilon_{neg}$ that prevents the robot from selecting any old object when the prediction value is smaller than $\epsilon_{neg}$. The final old object selection criterion is: $c^{old}_* = \arg\max_{c^{old}_i} p(c^{old}_i | V^{new})$, subject to $\max_i p(c^{old}_i | V^{new}) \geq \epsilon_{neg}$. Once we select $c^{old}_*$, we further use the prediction from its observation model to determine the object relatedness $\lambda_*$: $\lambda_* = p(c^{old}_* | V^{new})$.
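The selection rule above reduces to a thresholded argmax. A minimal sketch, where the threshold value and function name are our assumptions:

```python
import numpy as np

def select_source(pred_probs, eps_neg=0.3):
    """From-where / how-much-to-transfer rule sketched from the text.

    pred_probs : p(c_old_i | V_new) for each prior object.
    Returns (index of most related old object, relatedness lambda*),
    or (None, 0.0) when no prior object clears eps_neg -- the guard
    against negative transfer. eps_neg=0.3 is illustrative only.
    """
    i = int(np.argmax(pred_probs))
    if pred_probs[i] < eps_neg:
        return None, 0.0            # stop transferring: all priors irrelevant
    return i, float(pred_probs[i])  # lambda* = p(c_old* | V_new)

assert select_source(np.array([0.1, 0.8, 0.1])) == (1, 0.8)
assert select_source(np.array([0.20, 0.25, 0.20])) == (None, 0.0)
```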

Prior Exploratory Action Experiences Update
When the robot updates its prior exploratory action experiences, it needs to iteratively collect a new feature observation by applying an exploratory action on an object. We use our previously proposed active tactile learning algorithm [43] called Active Touch for Learning Physical Properties (AT-LPP). Using our AT-LPP approach, the robot actively decides which object to explore next (denoted as $c^{new}_*$) and which exploratory action to apply next, i.e., which physical property to learn next (denoted as $\alpha_*$). In the following, we briefly summarize the AT-LPP algorithm (Algorithm 2) [43].
The uncertainty $UNC(\alpha_n, c_j)$ measures how uncertain the robot is about object $c_j$ under exploratory action $\alpha_n$, based on the $N^{new}_{\alpha_n,j}$ feature observations collected so far. A large $UNC(\alpha_n, c_j)$ indicates that the robot is uncertain about the object feature observations from the exploratory action $\alpha_n$. As discussed in [43], an efficient next-object and next-action selection process should greedily reduce such uncertainty while still allowing the robot to explore (exploration-exploitation trade-off). In this regard, the next exploratory action $\alpha_*$ and the next object $c^{new}_*$ are determined by $(\alpha_*, c^{new}_*) = \arg\max_{\alpha_n, c_j} UNC(\alpha_n, c_j)$ if $p_{rand} > \epsilon_{explor}$, and chosen randomly otherwise, where $\epsilon_{explor}$ is the exploration rate and $p_{rand}$ is randomly generated following the uniform distribution $U(0, 1)$.
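This epsilon-greedy selection can be sketched in a few lines; the matrix layout and names are our own, and the uncertainty values here are toy numbers.

```python
import numpy as np

def next_action_object(unc, eps_explor, rng):
    """Exploration-exploitation rule sketched from the AT-LPP summary.

    unc : (N_alpha, N_new) matrix of uncertainties UNC(alpha_n, c_j).
    With probability 1 - eps_explor, pick the (action, object) pair with
    the highest uncertainty (exploitation); otherwise pick uniformly at
    random (exploration).
    """
    if rng.uniform() > eps_explor:   # p_rand > eps_explor -> exploit
        return np.unravel_index(np.argmax(unc), unc.shape)
    return (rng.integers(unc.shape[0]), rng.integers(unc.shape[1]))

# With eps_explor = 0 the rule is purely greedy
unc = np.array([[0.1, 0.9],
                [0.3, 0.2]])
a_star, c_star = next_action_object(unc, eps_explor=0.0,
                                    rng=np.random.default_rng(0))
assert (a_star, c_star) == (0, 1)    # largest uncertainty entry
```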

Algorithm 2 Active Touch for Learning Physical Properties
Once the robot collects a new feature observation, it updates the prior tactile exploratory action experiences only for the action $\alpha_*$. This process includes updating the feature observation combination, updating the object relatedness $\lambda$, and transferring the prior feature observations to the observation models of the new objects.

Experimental Objects
In order to evaluate the performance of the proposed active prior tactile knowledge transfer algorithm (APTKT), we deliberately selected 10 daily objects with different physical properties, which served to build the robot's prior tactile exploratory action experiences (see Figure 1, prior objects). Furthermore, we selected five new objects about which the robotic system was tasked to learn (Figure 1, new objects). For each new object, there existed one or more old objects that shared similar physical properties. For example, both the rough sponge and the smooth sponge are soft; the paper box and the hard box have similar surface textures; the metal toolbox and the biscuit box have high thermal conductivity. In this way, when learning about new objects based on their physical properties, the robot can leverage the related prior tactile knowledge.

Exploratory Action Determination and Test Data Collection
In our experiment, we defined seven exploratory actions from the pressing, sliding, and static contact movements with various action parameters (Pressing: P1, d_P = 1 mm, t_P = 3 s; P2, d_P = 2 mm, t_P = 3 s. Sliding: S1, F_S = 0.1 N, t_S = 5 s, v_S = 1 cm/s; S2, F_S = 0.1 N, t_S = 1 s, v_S = 5 cm/s; S3, F_S = 0.2 N, t_S = 5 s, v_S = 1 cm/s; S4, F_S = 0.2 N, t_S = 1 s, v_S = 5 cm/s. Static Contact: C1, d_C = 2 mm, t_C = 15 s). Before applying any of the seven exploratory actions, the robot established light contact with the objects; contact was detected once the total normal force on the artificial skin rose above 0.05 N. Furthermore, after applying an exploratory action, the robot raised its end-effector for 30 s so that the temperature sensors could return to the ambient temperature.
We evaluated the performance of our proposed method on a test dataset built by the robot applying each action 20 times on each object. During this process, objects were manually shifted and rotated so that the data was robust against variations in the contact locations between the objects and the artificial skin.

Evaluation of Multiple Feature Observations Combination Method
We first evaluated the performance of our proposed robotic multiple feature observation combination algorithm. To do this, the robot selected 10 groups of objects (shown in Figure 1) to construct the GPC observation models for each of the seven exploratory actions. Each group contained five objects that were selected randomly both from the old and new object lists, following a uniform distribution. The algorithm performance was evaluated by the discrimination accuracy of the test dataset predicted by the GPC models with the growing number of feature observations. We compared our method with the baseline methods that built the GPC models using only a single sensor modality.
The experiments were conducted 10 times for each object group. For a fair comparison, we used the RBF kernel [46] for each sensor modality. Results are plotted in Figure 6. For all seven exploratory actions, our proposed algorithm either took advantage of combining different sensor modalities to reach the best discrimination accuracy (P1, P2, C1, S4 in Figure 6), or performed as well as the best single-sensor result (S1, S2, S3 in Figure 6), indicating that the robot actively selected the most informative sensory feedback to learn about objects.
Figure 6. Multiple feature observations combination results for exploratory actions P1, P2, C1, S1, S2, S3, S4 and the averaged result. STIF: GPC observation model built on object stiffness; Thermal-C: thermal conductivity; Texture: object surface textural properties; Multi: combined feature observations. The horizontal axis represents the number of feature observations; the vertical axis represents the discrimination accuracy on the test dataset.

Evaluation of the Transfer Learning Method with Different Groups of Prior Objects
In this experiment, we evaluated the performance of our proposed transfer learning method (APTKT) to learn the five new objects (see new objects in Figure 1) with different groups of prior objects (see prior objects in Figure 1). To start the learning process, the robot applied each of the seven actions once on each new object. When the robot iteratively learned the new objects' physical properties, it updated both the multiple feature observations combination and the prior tactile knowledge built by the dependent GPC models with all the feature observations collected so far. At each learning iteration, we measured the object discrimination accuracy of the test dataset. The transfer learning performance was compared with the baseline learning method that combined multiple feature observations without transferring any prior tactile knowledge.
We randomly shuffled the prior objects into ten groups following a uniform distribution. Each group consisted of the feature observations and the observation models of three prior objects. We conducted the experiment with five trials for each group. In each trial, the robot followed both the transfer learning approach and the no-transfer approach to collect 40 feature observations in total, allowing a fair comparison between the learning strategies. Figure 7 illustrates that, with the help of prior knowledge, the robot consistently outperformed the learning process without prior knowledge by around 10% in discrimination accuracy. In order to further evaluate the robustness of APTKT, the robot was then tasked to learn about objects via only one of the exploratory actions. The experimental procedure was the same as the one described above. As the results in Figure 8 show, the robot achieved a larger improvement with actions P1, P2, and C1 than with actions S1, S2, S3, and S4. For example, the robot increased the discrimination accuracy by 25% when it reused the prior tactile instance knowledge from the movement P2. However, when learning about objects via actions S1 and S4, little improvement was seen. This was because different exploratory actions produced different object feature observations: for action P2, more highly related prior tactile knowledge existed than for S1 and S4, and the robot could benefit more from it. In all scenarios, using our proposed transfer learning algorithm, the robot achieved a higher discrimination accuracy than the baseline method with the same number of feature observations. Therefore, we conclude that APTKT helps the robot build reliable observation models of new objects with fewer training samples, even when only one kind of exploratory action is applied.

Increasing the Number of Prior Objects
We further evaluated the performance of our proposed method with an increasing number of prior tactile experiences. Intuitively, as the number of old objects grows, it is more likely that the robot can find highly-related prior tactile knowledge, so that the learning performance continues to improve. In this regard, the robot was asked to learn about new objects via all seven exploratory actions, with the number of old objects increasing from 5 and 7 to 10. We followed the same experimental procedure described above, and conducted each experiment with five trials. Unexpectedly, as Figure 9a-d show, the growing amount of prior tactile knowledge reduced the transfer learning improvement. This was because, as the number of prior objects grew, it became more difficult for the robot to classify them. As a result, the object relatedness $\lambda$ predicted by the old object GPC models fell below the threshold $\epsilon_{neg}$, making the robot stop transferring prior knowledge. To compensate for this, we used our previously proposed feature augmentation trick [45]. We defined $p(c^{old}_i|v)$ as the prediction probability that a feature observation $v$ from the new object is assigned to the old object $c^{old}_i$. Then we augmented a feature observation $v$ from a new object with the predictions from the old objects' observation models: $\tilde{v} = [v, p(c^{old}_1|v), \dots, p(c^{old}_{N_{old}}|v)]$.
The auxiliary features $[p(c^{old}_1|v), \dots, p(c^{old}_{N_{old}}|v)]$ encode the knowledge of all prior objects. They represent the relatedness between the prior objects and the new object, and thus can help the robotic system distinguish among new objects. Furthermore, since the auxiliary features can be regarded as being perceived from an auxiliary sensor, we directly applied our proposed multiple feature observation combination method to the augmented feature observations by casting a weight $\gamma$ to their kernel. The augmented feature observations were then used to build the dependent GPC models of the new objects.
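The augmentation is a simple concatenation of the raw observation with the prior models' prediction probabilities. A minimal sketch, where the callable standing in for the old objects' GPC predictions is our own toy construct:

```python
import numpy as np

def augment(v, old_model_probs):
    """Feature augmentation trick: append the prior objects' prediction
    probabilities p(c_old_i | v) to the raw observation v.

    old_model_probs : callable returning the N_old probabilities for v
                      (here a toy stand-in for the old GPC models).
    """
    return np.concatenate([v, old_model_probs(v)])

# Toy stand-in for three old objects' observation-model predictions
probs = lambda v: np.array([0.6, 0.3, 0.1])
v_aug = augment(np.array([1.0, 2.0]), probs)   # 2 raw dims + 3 auxiliary
assert v_aug.shape == (5,)
```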
We tested our proposed feature augmentation technique when the robot leveraged the tactile knowledge of 3, 5, 7, and 10 prior objects to learn about new objects via all seven actions. The learning performance is shown by the green curves in Figure 9a-d. Clearly, by introducing the probability predictions as auxiliary features, the robot was able to reuse the prior tactile knowledge again: it achieved a similar improvement of discrimination accuracy for 3 prior objects, and a higher improvement for 5, 7, and 10 prior objects compared to the other methods. Specifically, when reusing 10 prior objects, the robot achieved a 20% higher discrimination accuracy than the baseline method when only one new feature observation was collected, showing one-shot learning behaviour. This experiment also indicates that with a further growing number of prior objects, a further improvement of discrimination accuracy is achievable.

Negative Prior Tactile Knowledge Transfer Testing
When the constructed prior tactile exploratory action experiences are not relevant to the new objects, a brute-force transfer may degrade the learning performance, resulting in the negative knowledge transfer phenomenon. In this case, the transfer learning algorithm should stop leveraging the irrelevant prior knowledge.
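The guard against negative transfer reduces to a threshold test on the predicted relatedness, as a minimal sketch (names are illustrative; the text denotes the relatedness λ and the threshold λ_neg):

```python
import numpy as np

def select_transferable_priors(relatedness, threshold_neg):
    """Return indices of prior objects whose predicted relatedness
    (lambda, e.g. from the old objects' GPC models) exceeds the
    negative-transfer threshold lambda_neg. An empty result means no
    prior knowledge is transferred and the robot learns from scratch.
    (Illustrative sketch, not the paper's exact decision rule.)"""
    relatedness = np.asarray(relatedness)
    return np.flatnonzero(relatedness >= threshold_neg)

# Example: only the first prior object is related enough to transfer.
idx = select_transferable_priors([0.8, 0.1, 0.05], threshold_neg=0.5)
```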
In order to evaluate our proposed transfer learning method (APTKT) against negative tactile knowledge transfer, we deliberately selected irrelevant prior objects and compared the transfer learning performance with the baseline method, following the same experimental process described in Section 5.4. To determine which objects were relevant (or irrelevant) to each other, we built object confusion matrices to roughly estimate the object similarity. For each of the seven exploratory actions, we trained a Gaussian Mixture Model (GMM) and calculated the object confusion matrix. To do this, we first used the GMM to cluster all the samples from the dataset, with the hyper-parameters optimized by the Expectation-Maximization (EM) algorithm. The number of clusters was set to the number of objects (in our case, 15), and each cluster centroid was initialized as the mean value of all data samples that belonged to an object. The maximum number of EM iterations was set to 100, with a convergence threshold of 0.001. We further calculated the confusion matrix averaged over all exploratory actions. These matrices indicated the averaged similarity between objects. We rescaled their values to lie within 0-1, where 0 means that two objects are totally dissimilar and 1 that they are identical. The objects which had low similarity values with the target objects were selected as irrelevant objects. The similarity results are shown in Figure 10. The results in Figure 11 illustrate that the discrimination accuracy achieved by APTKT was similar to that of the baseline method, whether the robot applied one or all seven exploratory actions. This indicates that our proposed algorithm stopped transferring negative prior tactile instance knowledge.

(Figure 11: Negative prior tactile knowledge transfer testing. The prior objects that were unrelated to the new objects were deliberately selected.)
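The confusion-matrix construction described above can be sketched as follows. This is a simplified, assumed implementation using scikit-learn's `GaussianMixture` (EM-fit, one component per object, centroids initialized at the per-object means, max 100 iterations, tolerance 0.001); the paper's exact pipeline may differ.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

def object_confusion_matrix(X, labels, n_objects, seed=0):
    """Cluster all samples with a GMM whose component count equals the
    number of objects and whose means are initialized at the per-object
    sample means, then count how often samples of object i are assigned
    to the cluster of object j. Rows are normalized to 0-1, so high
    off-diagonal entries indicate similar (confusable) objects."""
    means_init = np.stack([X[labels == i].mean(axis=0)
                           for i in range(n_objects)])
    gmm = GaussianMixture(n_components=n_objects, max_iter=100, tol=1e-3,
                          means_init=means_init, random_state=seed)
    assign = gmm.fit(X).predict(X)
    conf = np.zeros((n_objects, n_objects))
    for i in range(n_objects):
        counts = np.bincount(assign[labels == i], minlength=n_objects)
        conf[i] = counts / counts.sum()  # rescale each row to 0-1
    return conf

# Toy usage: three well-separated synthetic "objects" in 2-D feature space.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(loc=5 * i, scale=0.1, size=(20, 2))
               for i in range(3)])
labels = np.repeat(np.arange(3), 20)
C = object_confusion_matrix(X, labels, n_objects=3)
```

Averaging such per-action matrices over all seven exploratory actions then yields the overall similarity estimate used to pick irrelevant prior objects.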

Conclusions
In this work, we proposed a transfer learning method for a robot equipped with multi-modal artificial skin to actively reuse prior tactile exploratory action experiences when learning about the detailed physical properties of new objects. These prior action experiences are built from the feature observations collected when the robotic arm applies the pressing, sliding, and static contact movements with different action parameters on the previously-explored objects (prior objects). The feature observations are perceived from multiple sensory modalities. Using our proposed tactile transfer learning method, the robot gets a "warm start" to the learning process: it applies fewer exploratory actions while gaining detailed tactile knowledge of new objects (e.g., normal force feedback at different pressing depths).
One limitation of our work is that performing the static contact movement took 15 s, which prevented rapid transfer learning. Furthermore, due to the limitations of our artificial skin, the robot could only interact with objects that have flat surfaces. In the future, we will extend our method to more exploratory actions (such as tapping and lifting), so that the robot can transfer more exploratory action experiences to learn more physical properties of an object, such as auditory feedback and center of mass. Another interesting topic would be how to transfer the prior tactile knowledge across different exploratory actions, e.g., transferring the tactile knowledge from pressing to the static contact movement.