Article
Peer-Review Record

Modeling Interaction in Human–Machine Systems: A Trust and Trustworthiness Approach

Automation 2022, 3(2), 242-257; https://doi.org/10.3390/automation3020012
by Alessandro Sapienza *, Filippo Cantucci and Rino Falcone
Reviewer 1: Anonymous
Reviewer 2: Anonymous
Submission received: 10 February 2022 / Revised: 22 March 2022 / Accepted: 22 March 2022 / Published: 25 March 2022

Round 1

Reviewer 1 Report

General review comments: This manuscript covered a study of modeling interaction in the human-machine system with a trust and trustworthiness approach. This study considered the different types of interactions with a system among the different parts instead of a single part of the possible interaction. The advantages and challenges are addressed for the presented theoretical formalization and an agent simulation. It is mentioned that the findings are crucial for evaluating the evolution of human-machine interface (HMI) models. The experimental setting results based on the number of agents for each category and their performance for each task are presented. Although this manuscript is well written, it needs revision before publication. The following reviews and comments need to be addressed.

Reviews and comments: 

  • Nomenclature/acronyms are required for all the abbreviations used in the manuscript.
  • The reference style needs revision. The reference number should start with number 1. This type of error arises from converting LaTeX name-based references to number-based ones. The \bibliographystyle{unsrt} might be useful.
  • The introduction section is comprehensive. However, the research gap(s) need to be specific. How about other models? Is there a similar model/framework developed by the previous researchers? Please clarify the research gaps, aims, tasks, and novelty of this work. You have already added the contribution; that’s supportive. Please consider combining the introduction and related works. The introduction is missing references for many scientific/research statements. The references used in a group, like on page 4, lines 127, 141, and 143, need to be used separately in each statement for more accessible source tracking.
  • Provide an overview of the paper in the last part of the introduction. Please give an outline of the paper, which section covers what briefly.
  • The background and review section provided only qualitative information. It would help to add quantitative values (for example, performance/accuracy) from the previous research works reviewed.
  • The theoretical formulation required a supportive mathematical formulation model system architecture/block diagram. Adding the experimental setting in the main section(s) of the manuscript instead of using it in Appendix A would support the reader, as it carries important information of the presented model formulation.
  • Justification is needed for the simulation models and results. Why are only 10% and 25% request probability selected? Why do you start measuring values after a transient phase of 50 time units? Is it an optimized time unit value? What is the basis of selecting specific agent numbers and categories? How is the standard deviation measured (a mathematical formulation is needed)?
  • Figure 1 presents the relationship between trustworthiness and trust. It shows that this is a one-way approach. Please cross-check if any feedback or interaction is possible/available in your presented model/system/approach. It would support seeing the difference between the general approach and your presented one (if there are any).
  • Figures 2 to 7 should be reproduced/revised with standard tools (no black color background), using different line styles and markers to be distinguishable in black-white printing to the readers. Figure titles should include the subfigure number. Use single parenthesis for the subfigure number (a) and (b).
  • Please separate the discussion and conclusion parts. A conclusion should be concise, and it should focus on the contribution, novelty, key results, challenges, and future research directions.

Author Response

Dear Reviewers,

We would like to sincerely thank you for the valuable comments and for giving us the opportunity to improve our work. In this new version of the paper we have introduced significant changes guided by your comments and observations, in order to make the contribution of our study clearer and more precise.

As requested by the Editors, all the revisions have been clearly highlighted in the text, with the exception of the Introduction, which has been extensively modified.

Best regards

Alessandro, Filippo, and Rino

Reviewer #1:

General review comments: This manuscript covered a study of modeling interaction in the human-machine system with a trust and trustworthiness approach. This study considered the different types of interactions with a system among the different parts instead of a single part of the possible interaction. The advantages and challenges are addressed for the presented theoretical formalization and an agent simulation. It is mentioned that the findings are crucial for evaluating the evolution of human-machine interface (HMI) models. The experimental setting results based on the number of agents for each category and their performance for each task are presented. Although this manuscript is well written, it needs revision before publication. The following reviews and comments need to be addressed.

Reviews and comments:

  1. Nomenclature/acronyms are required for all the abbreviations used in the manuscript.

ANSWER: The “Abbreviations” section has been expanded to include the abbreviations that were missing.

  1. The reference style needs revision. The reference number should start with number 1. This type of error arises from converting LaTeX name-based references to number-based ones. The \bibliographystyle{unsrt} might be useful.

ANSWER: Reference order has been revised.

  1. The introduction section is comprehensive. However, the research gap(s) need to be specific. How about other models? Is there a similar model/framework developed by the previous researchers? Please clarify the research gaps, aims, tasks, and novelty of this work. You have already added the contribution; that’s supportive. Please consider combining the introduction and related works.

ANSWER: As the reviewer requested, the Introduction and related works have been combined. The Introduction section has been completely rewritten in order to clearly highlight the research gaps (most of the existing approaches focus on just one dimension of the trust relationship, either trust or trustworthiness; trust models often limit the role of trustor and trustee specifically to humans or artificial systems) and the novelty of our contribution:

“In order to address this research gap, in this work we try to address the issue of trusted collaboration between agents (be they human or artificial) in its most complete application: trustor and trustee can be interchangeably an artificial or a human agent.

Therefore, we will not simply address the problem of artificial systems' trustworthiness, but also that of their trust (what characteristics must an artificial system have to trust another system?) and their reciprocal relationships.”

  1. The introduction is missing references for many scientific/research statements. The references used in a group, like on page 4, lines 127, 141, and 143, need to be used separately in each statement for more accessible source tracking.

ANSWER: We have now introduced a separate discussion in order to demonstrate the importance of each contribution and the type of approach it uses.

  1. Provide an overview of the paper in the last part of the introduction. Please give an outline of the paper, which section covers what briefly.

ANSWER: We added an outline of the paper at the end of the introduction.

  1. The background and review section provided only qualitative information. It would help to add quantitative values (for example, performance/accuracy) from the previous research works reviewed.

ANSWER: As this is more a conceptual than a quantitative work, this type of data is not available in this case. Therefore, in order to satisfy the reviewer's request, we improved the discussion in the introduction of the problems highlighted in the current literature and how they affect trust models.

  1. The theoretical formulation required a supportive mathematical formulation model system architecture/block diagram. Adding the experimental setting in the main section(s) of the manuscript instead of using it in Appendix A would support the reader, as it carries important information of the presented model formulation.

ANSWER: Appendix A has now been integrated in the Simulations Section.

  1. Justification is needed for the simulation models and results. Why are only 10% and 25% request probability selected? Why do you start measuring values after a transient phase of 50 time units? Is it an optimized time unit value? What is the basis of selecting specific agent numbers and categories? How is the standard deviation measured (a mathematical formulation is needed)?

ANSWER: To answer this request, we introduced a comprehensive justification in Section 3.

a) The values of 10% and 25% were chosen to represent, respectively, an underloaded and an overloaded network. On average, agents take 3 time units to perform a task. With a 10% request probability, a device receives on average one task every 5 time units (half from humans and half from other devices), so the estimated average load is 3/5 = 60%. With a 25% request probability, a device receives a task every 2 time units, so the estimated average load is 3/2 = 150%. We need the first scenario to investigate the behavior of the model in an ideal case, that is, when the devices are almost always available. The second scenario represents a more interesting situation, in which the devices have to manage conflicting situations. Of course, these are average expected values, generated randomly, so moments of higher or lower load may occur.

b) A preliminary analysis of the framework highlighted strong fluctuations in the output values at the beginning of the simulation. This is because, when the simulation starts, the agents do not possess precise knowledge of the actual capabilities and availability of their partners, or of the system load. To overcome this problem and to eliminate the random effects on the output, we considered a transient phase of 50 time units. This time window allows us to analyze stable results without such variability.

c) The choice of agent distribution was guided by a preliminary context analysis. Generally, in a hospital the number of patients is greater than the number of nurses, which is greater than the number of doctors. As the National Nurses United association reports (https://www.nationalnursesunited.org/ratios, accessed on 13/12/2021), the recommended nurse-to-patient ratio should range from 1:1 to 1:4. Similarly, the doctor-to-nurse ratio fluctuates from 1:2 to 1:4 (https://www.oecd.org/coronavirus/en/data-insights/number-of-medical-doctors-and-nurses, accessed on 13/12/2021). Certainly, these numbers vary, both by country and by department. Therefore, in order to stick to the proportions identified, we chose to consider 1 doctor, 3 nurses and 5 patients. As far as devices are concerned, the most important factor is not how many devices there are, but what the system load is. Thus, for the sake of simplicity, the system is sized assuming a human-device ratio equal to 1. This allows us to better interpret the effect of the request probability.
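
The load estimate in point a) can be reproduced with a few lines of arithmetic. The following is only a sketch of that reasoning, not the authors' simulation code: the function name and the `requester_groups` parameter are hypothetical, and the factor of 2 assumes that each device is asked by one human and one other device per time unit on average (consistent with the human-device ratio of 1 stated in point c).

```python
def expected_load(request_prob, mean_task_duration=3.0, requester_groups=2):
    """Estimate the average load on a device.

    request_prob: probability that a requester issues a task in a time unit.
    mean_task_duration: average time units needed per task (3 in the response).
    requester_groups: assumed number of requesters per device (humans + devices).
    """
    # Expected number of tasks arriving at a device per time unit.
    arrival_rate = request_prob * requester_groups
    # Load = work arriving per time unit, as a fraction of device capacity.
    return arrival_rate * mean_task_duration


low_load = expected_load(0.10)   # one task every 5 time units
high_load = expected_load(0.25)  # one task every 2 time units
print(f"10% request probability: load = {low_load:.0%}")
print(f"25% request probability: load = {high_load:.0%}")
```

A load above 100% means tasks arrive faster than a device can finish them on average, which is why the second scenario produces refusals and interruptions.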

  1. Figure 1 presents the relationship between trustworthiness and trust. It shows that this is a one-way approach. Please cross-check if any feedback or interaction is possible/available in your presented model/system/approach. It would support seeing the difference between the general approach and your presented one (if there are any).

ANSWER: We are grateful to the reviewer for raising this point, as it gives us the opportunity to provide an important clarification. “Regarding the relationship between trust and trustworthiness, it is important to underline that this relationship is indeed reciprocal and not one-way. On the one hand, the trustor modifies its trust based on the trustworthiness of the trustee. However, the trustee can also adjust its trustworthiness in response to the trustor's trust. For example, if the trustee believes that the trustor has too low a level of trust, it may act by trying to determine the conditions under which it is more trustworthy. In other words, it is a two-way relationship, in which trustor and trustee can influence each other.”

This clarification has been added to the Introduction.

  1. Figures 2 to 7 should be reproduced/revised with standard tools (no black color background), using different line styles and markers to be distinguishable in black-white printing to the readers. Figure titles should include the subfigure number. Use single parenthesis for the subfigure number (a) and (b).

ANSWER: All the figures have been modified according to the reviewer’s request.

  1. Please separate the discussion and conclusion parts. A conclusion should be concise, and it should focus on the contribution, novelty, key results, challenges, and future research directions.

ANSWER: Discussion and Conclusions have been separated into two different sections, in order to give a more precise structure to the contribution.

Reviewer 2 Report

The simulation method should be described in more detail. Although the experimental setting is described in Appendix A, it seems to me that the explanation of the reason and background for expressing competence, willingness, and trustworthiness in those equations is lacking. Is there any evidence that trustworthiness can be explained by the value of the sum of the linear combination of competence and willingness divided by 2? The parameter settings are also one of the important ones, and I would like to see a detailed description of them.

Table A1: what do the labels A, B, and C mean? I could not catch the meaning of those labels. In addition, I could not catch the meaning of the label tuser.

I could not understand what oldcompetence and oldwillingness meant ... . 

As you mentioned the two dimensions (competence, and willingness) in Section 3.1, competence represents the set of qualities making the trustee good for the task t: skills, know how, expertise, knowledge, self-esteem, self-confidence. I did not know how those characteristics were reflected in the simulation. So, I want to recommend you explain the detailed description of them in simulation methodology.

What is written at the top of the results section should be moved to the simulation section. The measurements and scenarios should be written in the simulation. The structure of the simulation and results sections needs to be reorganized.

The authors also need to correct some typographical errors: 

line 313: t1 e t2, what is e?

line 318: t1, t2, t2, o tuser.

willigness -> willingness

All graphs are difficult to read. The background of the graph should be white. The labels and units of the vertical axis are missing.

Author Response

Dear Reviewers,

We would like to sincerely thank you for the valuable comments and for giving us the opportunity to improve our work. In this new version of the paper we have introduced significant changes guided by your comments and observations, in order to make the contribution of our study clearer and more precise.

As requested by the Editors, all the revisions have been clearly highlighted in the text, with the exception of the Introduction, which has been extensively modified.

Best regards

Alessandro, Filippo, and Rino

Reviewer #2:

  1. The simulation method should be described in more detail. Although the experimental setting is described in Appendix A, it seems to me that the explanation of the reason and background for expressing competence, willingness, and trustworthiness in those equations is lacking. Is there any evidence that trustworthiness can be explained by the value of the sum of the linear combination of competence and willingness divided by 2? The parameter settings are also one of the important ones, and I would like to see a detailed description of them.

ANSWER: As far as the parameters are concerned, a detailed description has been added in Section 3:

a) The values of 10% and 25% were chosen to represent, respectively, an underloaded and an overloaded network. On average, agents take 3 time units to perform a task. With a 10% request probability, a device receives on average one task every 5 time units (half from humans and half from other devices), so the estimated average load is 3/5 = 60%. With a 25% request probability, a device receives a task every 2 time units, so the estimated average load is 3/2 = 150%. We need the first scenario to investigate the behavior of the model in an ideal case, that is, when the devices are almost always available. The second scenario represents a more interesting situation, in which the devices have to manage conflicting situations. Of course, these are average expected values, generated randomly, so moments of higher or lower load may occur.

b) A preliminary analysis of the framework highlighted strong fluctuations in the output values at the beginning of the simulation. This is because, when the simulation starts, the agents do not possess precise knowledge of the actual capabilities and availability of their partners, or of the system load. To overcome this problem and to eliminate the random effects on the output, we considered a transient phase of 50 time units. This time window allows us to analyze stable results without such variability.

c) The choice of agent distribution was guided by a preliminary context analysis. Generally, in a hospital the number of patients is greater than the number of nurses, which is greater than the number of doctors. As the National Nurses United association reports (https://www.nationalnursesunited.org/ratios, accessed on 13/12/2021), the recommended nurse-to-patient ratio should range from 1:1 to 1:4. Similarly, the doctor-to-nurse ratio fluctuates from 1:2 to 1:4 (https://www.oecd.org/coronavirus/en/data-insights/number-of-medical-doctors-and-nurses, accessed on 13/12/2021). Certainly, these numbers vary, both by country and by department. Therefore, in order to stick to the proportions identified, we chose to consider 1 doctor, 3 nurses and 5 patients. As far as devices are concerned, the most important factor is not how many devices there are, but what the system load is. Thus, for the sake of simplicity, the system is sized assuming a human-device ratio equal to 1. This allows us to better interpret the effect of the request probability.
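
The transient-phase handling described in point b) amounts to discarding the first 50 time units of each output series before computing summary statistics. The sketch below illustrates that idea under stated assumptions: the function name is hypothetical, and the population standard deviation is used only for illustration, since the response does not say which estimator the authors applied.

```python
from statistics import mean, pstdev

TRANSIENT = 50  # time units discarded at the start, as stated in the response


def steady_state_stats(series, transient=TRANSIENT):
    """Return (mean, standard deviation) of a series after the transient phase.

    series: one output value per time unit, in chronological order.
    """
    steady = series[transient:]
    if not steady:
        raise ValueError("series is not longer than the transient phase")
    # pstdev is an assumption; the source does not specify the estimator.
    return mean(steady), pstdev(steady)


# Illustrative data only: 50 noisy warm-up values, then a stable plateau.
example_series = [0.0] * 50 + [0.8] * 100
steady_mean, steady_sd = steady_state_stats(example_series)
```

Without the cut, the warm-up values would pull the mean down and inflate the standard deviation, which is the variability the authors wanted to exclude.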

The discussion on competence, willingness and trustworthiness has also been improved in Section 3:

“As far as trustworthiness assessment is concerned, despite the very rich literature on this topic, there is no standard solution to this problem. First of all, there are many theoretical models in the literature. We refer to the model proposed by Castelfranchi and Falcone, which identifies the core trust in the two components of competence and willingness (see Equation 1).

Nevertheless, these models must be instantiated in the specific domains. As far as we are concerned, we represent trust operationally with a value in [0,1], where 0 means absolute distrust and 1 means maximum trust. It depends on various components.

In other words, we need to discuss the nature of the function f. The model proposed by Castelfranchi and Falcone is general, so it can be instantiated in different formulations, be they linear or non-linear. We choose to refer to linear models. Due to their intrinsic characteristics and potential, linear models have been widely used within social science to reproduce mental processes. In more detail, they allow the study of a particular behavior or social phenomenon by relating it to the cognitive variables or environmental factors that influence or determine it. For example, it is possible to model the propensity to trust. Among the many works, in (Bulińska-Stangrecka and Bagieńska, 2018) the authors investigate the links between interpersonal trust and competences, relations, and cooperation in Polish telecommunications companies. As a further example, in (Mashinchi et al., 2011) a variation of linear regression analysis is considered to model trust ratings of delivered services as a function of QoS (Quality of Service).

It therefore remains to be discussed what weight to assign to the two variables, competence and willingness. They do not necessarily have the same importance. For example, there may be contexts in which willingness is taken for granted. Furthermore, since trust is a subjective dimension, it certainly depends on the trustor, who may consider one aspect more important than the other. The same holds for the trustee, the context, etc. However, in our specific case, both have an effect on the trustworthiness of the agents and we have no reason to prefer one component over the other. For this reason, we choose to give them equal importance.”

This choice has also been specified in the Conclusions as a limitation of our approach.
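
The equal-weight choice discussed above can be written as a special case of a general linear combination. The weights w_C and w_W are notation introduced here for illustration only; the response states just that the two components are given equal importance, which matches Reviewer 2's description of the sum divided by 2.

```latex
\[
  \mathit{Trustworthiness}
    = w_C \cdot \mathit{Competence} + w_W \cdot \mathit{Willingness},
  \qquad w_C + w_W = 1, \quad w_C, w_W \ge 0
\]
\[
  w_C = w_W = \tfrac{1}{2}
  \;\Longrightarrow\;
  \mathit{Trustworthiness}
    = \frac{\mathit{Competence} + \mathit{Willingness}}{2}
\]
```

Since competence and willingness both lie in [0,1] and the weights sum to 1, the result stays in [0,1], in line with the operational definition of trust quoted above.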

  1. Table A1: what do the labels A, B, and C mean? I could not catch the meaning of those labels. In addition, I could not catch the meaning of the label tuser.

ANSWER: There was a mistake reporting labels between section 3 and Appendix A. This has now been corrected. A, B, and C stood for E1, E2, and E3. As for tuser, this represented the task specific for each category of human, i.e. tasks that can only be performed by a specific category of humans. In order to clarify this point, it has been divided into the three tasks t_doctor, t_nurse and t_patients.

  1. I could not understand what oldcompetence and oldwillingness meant.

ANSWER: OldCompetence and OldWillingness represent, respectively, the trustor’s estimates of competence and willingness before the execution of the new task. We use them to update the estimates of Competence and Willingness. This has been clarified in the text:

“The two sub-dimensions of competence and willingness are computed each time a task is completed, interrupted or refused. Equation 1 shows how competence is updated. Specifically, OldCompetence represents the belief of the trustor about the competence of the trustee before requesting the execution of the new task. At the beginning of the simulation, rather than starting from a situation of complete uncertainty, its initial value is determined from the category the trustee belongs to. Performance represents the task outcome, i.e. how well the trustee executed the assigned task. To summarize, Competence is updated as the weighted mean of its previous value OldCompetence and the recent performance Performance. In a similar way, in Equation 2, OldWillingness represents the estimate of willingness before the execution of the new task. Its initial value is equal to 1, since agents are assumed to be always willing to accept task requests. Availability is a boolean variable: it is equal to 1 if the trustee successfully accomplishes the task, or 0 if it refuses or interrupts the task.”
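
The update rule described in this passage can be sketched as follows. This is an illustration under stated assumptions, not the authors' code: the weight `alpha` is hypothetical (the response says only "weighted mean" and gives no value), willingness is assumed to use the same weighted-mean form as competence, and the equal-weight trustworthiness follows the choice quoted earlier in the response.

```python
def update_estimate(old_value, observation, alpha=0.5):
    """Weighted mean of the previous estimate and the newest observation.

    alpha is the (assumed) weight of the old estimate, between 0 and 1.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    return alpha * old_value + (1.0 - alpha) * observation


def trustworthiness(competence, willingness):
    """Equal-weight combination of the two sub-dimensions."""
    return (competence + willingness) / 2.0


# Competence: previous belief 0.6, the trustee performs the task with outcome 0.8.
competence = update_estimate(0.6, 0.8)
# Willingness: starts at 1; Availability is 0 because the task was refused.
willingness = update_estimate(1.0, 0.0)
trust_value = trustworthiness(competence, willingness)
```

A larger `alpha` makes the trustor slower to revise its beliefs after a single task, while a smaller one makes the estimate track recent outcomes more closely.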

  1. As you mentioned the two dimensions (competence, and willingness) in Section 3.1, competence represents the set of qualities making the trustee good for the task t: skills, know how, expertise, knowledge, self-esteem, self-confidence. I did not know how those characteristics were reflected in the simulation. So, I want to recommend you explain the detailed description of them in simulation methodology.

ANSWER: We have now clarified, in Section 3, how these components affect the simulation implementation:

“Table 2 reports the average performance of the categories for each task. These values, together with a standard deviation of 20%, are used to generate the average performance of each category member, i.e. this dimension determines the competence of the agents in the various tasks. Of course, this is an objective and internal property of the agent. As such, this value cannot be accessed directly, but only subjectively estimated.

Unlike competence, willingness values are determined only by the specific system conditions. Of course, in general an agent has an inherent predisposition to be willing/unwilling to perform a task for a specific trustor. However, in this specific case, we assume that agents are always well disposed towards other partners. Therefore, the only limitations in terms of willingness are due to the effective availability of the agents, i.e. whether or not they are engaged in other tasks.”
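
One way to read the description above is that each agent's true average performance is drawn around its category mean with a standard deviation of 20%. The sketch below illustrates that reading only: the category means are placeholders (the values of Table 2 are not reproduced in this record), the 20% is interpreted as an absolute 0.2 on the [0,1] scale, and clipping to [0,1] is an added assumption.

```python
import random


def sample_agent_performance(category_mean, sd=0.20, rng=random):
    """Draw one agent's average performance around its category mean."""
    value = rng.gauss(category_mean, sd)
    # Clip to [0, 1]; this bound is an assumption, not stated in the source.
    return min(1.0, max(0.0, value))


# Placeholder category means for one task; NOT the values of Table 2.
category_means = {"doctor": 0.9, "nurse": 0.7, "patient": 0.4}

rng = random.Random(42)  # fixed seed so the draw is reproducible
agent_performance = {
    category: sample_agent_performance(mean_value, rng=rng)
    for category, mean_value in category_means.items()
}
```

The sampled value plays the role of the hidden, objective competence; trustors never read it directly and can only estimate it from observed task outcomes.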

  1. What is written at the top of the results section should be moved to the simulation section. The measurements and scenarios should be written in the simulation. The structure of the simulation and results sections needs to be reorganized.

ANSWER: We reorganized the structure of the simulation and results sections according to the reviewer’s request.

  1. The authors also need to correct some typographical errors:

line 313: t1 e t2, what is e?

line 318: t1, t2, t2, o tuser.

willigness -> willingness

ANSWER: We corrected these errors.

  1. All graphs are difficult to read. The background of the graph should be white. The labels and units of the vertical axis are missing.

ANSWER: All the figures have been modified according to the reviewer’s request.

Round 2

Reviewer 1 Report

Thanks for the review and comments. 

Author Response

Dear Reviewer,

We are grateful to you for your important work, which has allowed us to greatly improve our contribution.

Best regards Alessandro, Filippo, and Rino 

Reviewer 2 Report

Clarity and readability have been improved, particularly in the method section. I have no further major comments.

line 326: t1, t2, t2, t_doctor, ... -> t1, t2, t3, t_doctor, ...

line 401: e -> and.

Author Response

Dear Reviewer,

As requested, we have corrected the typos in the text.

We are grateful to you for your important work, which has allowed us to greatly improve our contribution.

Best regards Alessandro, Filippo, and Rino 
