A Latent State-Based Multimodal Execution Monitor with Anomaly Detection and Classification for Robot Introspection
Round 1
Reviewer 1 Report
I think the paper is a thorough and important analysis of robot programming. The system uses HMM models to identify and handle anomalies in a robotic task.
The authors present previous approaches and the theory on which their system is based adequately.
The example is good but the results are a bit difficult to understand. The description of the setup is good enough to reconstruct such an experiment if necessary.
The reviewer thinks that the paper is important and well written. The English is adequate. It can be improved but it is understandable. I am not sure other methods should be compared here.
I would add to the discussion and results the practicality of the suggested method, an analysis of the program running and algorithm handling times.
The paper should be accepted with minor corrections.
Technical remarks:
- Fig 9, Fig 8 – legend in caption not on each subgraph
Author Response
Response to Reviewer 1 Comments
I think the paper is a thorough and important analysis of robot programming. The system uses HMM models to identify and handle anomalies in a robotic task.
The authors present previous approaches and the theory on which their system is based adequately.
Point 1: The example is good but the results are a bit difficult to understand. The description of the setup is good enough to reconstruct such an experiment if necessary.
Response 1: To present the results in a more intuitive and accessible fashion, we revised the results section by repositioning the figures and adding further explanations. The results are now organized into the following two subsections:
· Anomaly detection
o The illustration for constructing a hidden state-based anomaly detector, which includes three steps:
1. representing multimodal observations through latent states;
2. concatenating the derived log-likelihood values of a latent state;
3. calculating the anomaly detection threshold.
o We compared the performance of the proposed method against a baseline that uses a static threshold for each skill, based on the kitting experiment, which is composed of 6 skills (anomalies were induced for each skill during testing).
· Anomaly classification
o We explain that each anomaly sample was extracted around the anomaly-triggered flag using a given window_size of 2 seconds.
o We present an analysis and comparison of the modeling performance among:
1. parametric HMMs and non-parametric HMMs,
2. independent observations,
3. linear regressive observations,
4. and various inference methods.
o We compared the results on our anomaly dataset for anomaly classification in Table 2.
o The results show that the proposed method for anomaly classification outperforms the baseline methods.
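The three detector-construction steps listed under "Anomaly detection" can be illustrated with a minimal Python sketch. It is not the implementation from the paper: the function names, the multiplier c, and the use of per-time-step log-likelihood values are assumptions made for illustration, and the threshold rule (mean minus a multiple of the standard deviation, computed separately for each latent state) is only one plausible choice.

```python
from collections import defaultdict
from statistics import fmean, pstdev


def state_thresholds(states, loglik, c=2.0):
    """Group nominal log-likelihood values by latent state and return,
    for each state, the threshold mean - c * std (hypothetical rule)."""
    grouped = defaultdict(list)
    for state, value in zip(states, loglik):
        grouped[state].append(value)
    return {s: fmean(v) - c * pstdev(v) for s, v in grouped.items()}


def detect_anomalies(states, loglik, thresholds):
    """Return the time indices whose log-likelihood falls below the
    threshold of the latent state active at that time step."""
    return [
        t
        for t, (state, value) in enumerate(zip(states, loglik))
        if value < thresholds[state]
    ]


# Nominal (training) executions: latent state labels and log-likelihood values.
nominal_states = [0, 0, 0, 1, 1, 1]
nominal_loglik = [-1.0, -1.2, -0.8, -5.0, -5.5, -4.5]
thresholds = state_thresholds(nominal_states, nominal_loglik, c=2.0)

# Test execution: the value -3.0 is unremarkable for state 1 but is
# far below what was seen for state 0, so only index 2 is flagged.
flagged = detect_anomalies([0, 1, 0], [-1.1, -5.2, -3.0], thresholds)
```

The point of the sketch is the contrast with a single static threshold: a value that is normal for one latent state can be anomalous for another. A latent state that never appears in the nominal data has no threshold here and would raise a KeyError.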
Point 2: The reviewer thinks that the paper is important and well written. The English is adequate. It can be improved but it is understandable. I am not sure other methods should be compared here.
Response 2: We would like to emphasize that we performed intuitive comparisons and analysis of the proposed anomaly detection and anomaly classification methods. For anomaly detection, we compared the sensitivity and detection accuracy with the method described in [2]. For anomaly classification, we compared the classification accuracy between the proposed multiclass classifier and a set of baselines that include nonparametric and parametric methods. Based on this outline, it is our estimation that we have a comprehensive comparative analysis and that no further comparisons are needed.
Point 3: I would add to the discussion and results the practicality of the suggested method, an analysis of the program running and algorithm handling times.
Response 3: In line with the background and motivation of this work, we updated the potential impact and applications in the Discussion section:
· Development of robot introspection is expected to have a direct impact on a large variety of practical applications.
o Helping systems prevent failures in robot manipulation tasks.
o Improving safety in human-robot collaboration by assessing the quality of the learned internal models for each skill. Such assessment can speed up recovery from anomalies and/or the repair process by providing detailed skill identification and anomaly monitoring.
The paper should be accepted with minor corrections.
Technical remarks:
Point 4: - Fig 9, Fig 8 – legend in a caption not on each subgraph
Response 4: We updated the legends of Fig. 8 and Fig. 9.
Author Response File: Author Response.docx
Reviewer 2 Report
The language of the paper is weak, and therefore it is difficult in many points to clearly understand what the authors mean and want to communicate.
The paper claims that the proposed method is based on "robot introspection." However, this is a big overclaim. In fact, their method examines the parameters of an HMM. For a detailed description of introspection methods, see, for example, Chella et al.: "A cognitive architecture for robot self-consciousness."
The paper is mostly based on the approach proposed by Park, Kemp, and colleagues. A similar method has been suggested by Guo, Liao, Wang, Yu, Ji, Li: "Multidimensional Time Series Anomaly Detection: A GRU-based Gaussian Mixture Variational Autoencoder Approach."
The authors should better clarify the novelty of their approach.
Although the paper proposes several experiments, it does not offer any comparison with related methods.
Author Response
Response to Reviewer 2 Comments
Point 1: The language of the paper is weak, and therefore it is difficult in many points to clearly understand what the authors mean and want to communicate.
Response 1: Grammar and word usage were improved throughout the paper.
Point 2: The paper claims that the proposed method is based on "robot introspection." However, this is a big overclaim. In fact, their method examines the parameters of an HMM. For a detailed description of introspection methods, see, for example, Chella et al.: "A cognitive architecture for robot self-consciousness."
Response 2: In our case, we refer to a type of physical robot introspection. We define physical introspection as the ability of the robot to understand and assess its physical actions. We have been working along this line of research for some time. See, for example, related ideas in the 2017 IROS Workshop “Introspective Methods for Reliable Autonomy” [http://130.243.105.49/Agora/IROS2017_Introspection/].
Robot introspection for predicting and explaining the behavior of the robot in subsequent executions of a task using a traditional hidden Markov model was first proposed by Maria Fox in “Robot introspection through learned hidden Markov models”, which is cited in the paper as [20].
Thus, the robot introspection considered in this paper is different from robot consciousness.
Point 3: The paper is mostly based on the approach proposed by Park, Kemp, and colleagues. A similar method has been suggested by Guo, Liao, Wang, Yu, Ji, Li: "Multidimensional Time Series Anomaly Detection: A GRU-based Gaussian Mixture Variational Autoencoder Approach."
Response 3: The proposed methods belong to the field of Bayesian nonparametric methods. Over the last two decades, these methods have been shown to perform robustly in learning complex dynamical phenomena [25,26,30,32,33,54]. In particular, note the following applications: speaker diarization by Emily B. Fox [26], robot process monitoring by Enrico Di Lello [38,39], and human motion segmentation by Michael C. Hughes [54].
With respect to the works referenced by the reviewer (Park et al. and Guo et al.), please note that all such works make use of deep learning methods. While deep learning has clearly been groundbreaking, we believe that Bayesian non-parametric methods currently have relative advantages, in particular for anomalous instances in robots:
Bayesian nonparametric models (BN) vs. deep learning methods (DL)
Pros of BN:
1) A good model can be trained with far less data, which matters because data from failed tasks are currently hard to obtain in robotics.
2) Data of various lengths can be processed.
3) The size of the hidden state space is learned automatically from the input data without suffering as much from biases.
4) The model is computationally efficient.
In contrast, DL suffers from comparative weaknesses:
1) It requires large amounts of data to train an adequate model;
2) The training data must have equal length;
3) The hidden state space size must be provided manually before training, which can lead to overfitting;
4) The computational complexity is much higher, making it hard to run online.
Point 4: The authors should better clarify the novelty of their approach.
Response 4: The novelty of this paper is four-fold (see the second-to-last paragraph of the Introduction section):
1. We consider non-parametric HMM methods for learning dynamical models of time series with complex and uncertain behavior patterns in robot manipulation tasks. Specifically, we present how Bayesian non-parametric methods can be used to provide a flexible and computationally efficient structure for modeling multivariate time series and addressing the problem of anomaly detection and classification.
2. This work constitutes the first attempt to examine the hidden state space and the prediction of the log-likelihood associated with nominal executions for multimodal anomaly detection using a non-parametric autoregressive HMM.
3. A multiclass classifier based on Bayesian non-parametric HMMs, using memorized variational inference with scalable adaptation, is used for robust anomaly classification, even when trained with few samples.
4. The multimodal robot introspection system is open source, which could make it a valuable reference for other researchers. In particular, we explore the multiple roles of non-parametric methods, which lead to much faster training and require no domain knowledge.
Point 5: Although the paper proposes several experiments, it does not offer any comparison with related methods.
Response 5: We performed intuitive comparisons and analysis of the proposed anomaly detection and anomaly classification methods.
For anomaly detection, we compared the detection accuracy and sensitivity between the proposed hidden-state-based dynamic threshold method and our previous gradient-based static threshold method described in [2]. Specifically, our submitted paper “Fast, Robust, and Versatile Event Detection through HMM Belief State Gradient Measures” showed that our gradient-based static threshold outperforms the related methods of Enrico Di Lello [38,39], which set the threshold at the mean minus several standard deviations.
For anomaly classification, we compared the classification accuracy between the proposed multiclass classifier and a set of baselines that include nonparametric and parametric methods (see Section 7.2). Based on this outline, it is our estimation that we have a comprehensive comparative analysis.
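As a rough illustration of the classification pipeline referred to in these responses (extract a sample around the anomaly flag, then pick the class whose model explains it best), a minimal Python sketch follows. It is not the classifier from the paper: the per-class Gaussian scorers are hypothetical stand-ins for the trained non-parametric HMMs, the class names are invented, and reading the 2-second window as symmetric around the flag is an assumption.

```python
import math


def extract_window(timestamps, samples, flag_time, window_size=2.0):
    """Keep the samples recorded within window_size seconds before or
    after the anomaly flag (symmetric window is an assumption)."""
    return [
        x for t, x in zip(timestamps, samples)
        if abs(t - flag_time) <= window_size
    ]


def gaussian_scorer(mean, std):
    """Hypothetical stand-in for a trained per-class model: returns a
    function giving the total Gaussian log-likelihood of a window."""
    def score(window):
        return sum(
            -0.5 * math.log(2 * math.pi * std ** 2)
            - (x - mean) ** 2 / (2 * std ** 2)
            for x in window
        )
    return score


def classify(window, scorers):
    """Return the label whose model assigns the highest log-likelihood."""
    return max(scorers, key=lambda label: scorers[label](window))


# One-dimensional signal sampled once per second, anomaly flagged at t = 5 s.
timestamps = [float(t) for t in range(10)]
samples = [0.0, 0.0, 0.0, 9.0, 10.0, 11.0, 10.0, 9.0, 0.0, 0.0]
window = extract_window(timestamps, samples, flag_time=5.0, window_size=2.0)

# Invented anomaly classes, each with its own stand-in model.
scorers = {
    "collision": gaussian_scorer(mean=10.0, std=1.0),
    "slip": gaussian_scorer(mean=0.0, std=1.0),
}
label = classify(window, scorers)
```

With one model per anomaly class, classification reduces to a maximum-likelihood decision, so any model that returns a log-likelihood for a window can replace the Gaussian stand-ins.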
Author Response File: Author Response.docx
Round 2
Reviewer 2 Report
The authors fully acknowledge my previous remarks.