Measuring Trust with Psychophysiological Signals: A Systematic Mapping Study of Approaches Used

Abstract: Trust plays an essential role in all human relationships. However, measuring trust remains a challenge for researchers exploring psychophysiological signals. Therefore, this article aims to systematically map the approaches used in studies assessing trust with psychophysiological signals. In particular, we examine the numbers and frequency of combined psychophysiological signals, the primary outcomes of previous studies, and the types and most commonly used data analysis techniques for analyzing psychophysiological data to infer a trust state. For this purpose, we employ a systematic mapping review method, through which we analyze 51 carefully selected articles (studies focused on trust using psychophysiology). Two significant findings are as follows: (1) [...] spatial resolution. (2) Regarding outcomes: there is only one tool proposed for assessing trust in an interpersonal context, excluding trust in a technology context. Moreover, no stable and accurate ensemble models have been developed to assess trust; all prior attempts led to unstable but fairly accurate models or did not satisfy the conditions for combining several algorithms (ensemble). In conclusion, the extent to which trust can be assessed using psychophysiological measures during user interactions (in real time) remains unknown, as there are several issues, such as the lack of a stable and accurate ensemble trust classifier model, that require urgent research attention. Although this topic is relatively new, much work has been done. However, more remains to be done to provide clarity on this topic.


Introduction
Trust is an essential factor in all interrelationships [1,2], e.g., relationships between humans within an organization or between humans and a technical artifact [3][4][5].
The outcomes of trust include technology adoption, giving up control, being more willing to take risks, the facilitation of innovation, cooperation, investment, and, sometimes, influences on user performance [6,7]. Conversely, the absence of trust yields a negative user experience, low adoption, investment loss, etc. [8][9][10].
Although trust as a construct has been widely researched in several disciplines, such as sociology, psychology, cognitive science, and human-computer interaction, among others, effectively measuring user trust remains a challenge.
For instance, the self-reporting survey instruments that are predominantly used for measuring trust, such as those developed in [10][11][12], are subjective. Moreover, they cannot assess trust in real time because they are applied only after one or more tasks/interactions have been completed [13,14].
However, advances in sensing technologies, leading to the development of low-cost and effective research-grade psychophysiological sensors, have resulted in a shift from the use of self-reporting instruments and methods to objective methods for assessing trust using psychophysiological signals. Typical examples of some standard psychophysiological signals are electroencephalogram (EEG), electrocardiography (ECG), electrodermal activity (EDA), audio and electromyography (EMG), and biochemistry markers (e.g., testosterone and oxytocin) (see [14] for a detailed review).
Assessing trust with a psychophysiological signal involves measuring human physiological responses to psychological states by identifying the underlying physiological patterns during episodes of psychological experiences (e.g., the decision to rely or not rely on an artificially intelligent agent). Such signals provide an unobtrusive way to measure user trust implicitly and objectively during an interaction, thereby giving trust researchers the opportunity to tap into the human mind based on the mind-body connection [15,16].
However, the use of psychophysiological signals for assessing trust can be elusive and multifaceted [17]. For instance, there is no one-to-one mapping between trust states and psychophysiological signals. Instead, mappings can be one-to-many (one trust state correlating with several physiological responses) or many-to-one (several trust states correlating with one physiological response). This elusive relationship further complicates the interpretation of the relevant signals [18,19]. Moreover, there are several limitations and advantages associated with each psychophysiological signal (see [15,16] for more details).
Therefore, it is clear that researchers should approach this topic (assessing trust with psychophysiological signals) in different ways, for example, by adjusting the trust measurement scale used, the number and types of physiological signals used, the type of trust relationship studied, the analysis technique/algorithms used to analyze the data to infer the trust state, and the inference validation technique used.
These different approaches used in studies assessing trust with psychophysiological signals offer diverse perspectives on how to use psychophysiological signals to assess user trust. However, without a structured overview of the approaches adopted, this diversity makes the use of psychophysiological signals for assessing trust unclear and renders it more challenging for academics (e.g., human factors, multimodal technologies interaction (MTI) and human computer interaction (HCI)) and industry practitioners to engage in fruitful discussions and compare findings across studies. Therefore, the primary motivation and aim of this study is to present an extensive survey and in-depth analysis of systematically selected publications to provide a bird's-eye view of the approaches employed in studies assessing trust with psychophysiological signals. In particular, we examine the outcomes (both the least and most explored) and the use of psychophysiological signals, inference validation, and analysis techniques. The rest of the article is, therefore, divided into sections describing the backgrounds, methods, discussions, future research, and conclusions.

Trust Definition
The definition of trust differs from discipline to discipline. Despite these wide disparities, several comprehensive reviews have offered conceptually similar definitions of trust. For instance, the authors in [20] define trust as a trustor (evaluator) voluntarily taking a risk (e.g., reliance) based on the subjective belief that a trustee (evaluatee) will behave as anticipated by the trustor under situations of uncertainty (e.g., a lack of information or inconsistent information). Moreover, the authors in [21,22] defined trust as a behavior that makes the trustor vulnerable based on the actions of the trustee.
The trustor's belief that the trustee possesses trustworthy characteristics then influences the trustor's behavior and intentions. Furthermore, the comprehensive literature review in [10,23,24] describes trust as a belief, attitude, intention, or behavior.
Following the above definitions and drawing upon previous research [25], we define trust as a subconscious compound cognitive process (mental deliberation, reasoning, and mental processing, involving memory, learning, and accumulated knowledge). This subconscious cognitive process (trust) is compound because of the numerous related cognitive activities (see Figure 1), henceforth referred to as the net cognitive state [26]. By this definition, trust, as a subconscious net cognitive state, informs what the trustor knows about the trustee (i.e., the trustor's beliefs) and elicits conscious and observable events, such as intentions/behaviors (e.g., the decision to use, cooperation, and dependence) that the trustor exhibits towards the trustee, acting as bystanders to the trust state. Therefore, a trustor's trust towards a trustee can be assessed by obtaining a self-reported response (trust belief) or by observing (e.g., counting the frequencies of) the resulting intentions/behaviors (e.g., reliance, cooperation, or the decision to use or not use) of the trustor towards the trustee. Notably, however, these beliefs and intentions/behaviors are only a means for inferring and quantifying the trustor's trust state towards the trustee and do not replace the meaning of trust.

Psychophysiology, Trust Assessment, and Challenges
Psychophysiology is the study of human physiological responses to psychological states (e.g., emotions and trust), thereby leading to the term psycho + physiology [27]. Psychophysiology involves recording human physiological changes (psychophysiological signals) using physiological sensors during episodes of psychological experiences (e.g., emotions and trust) [15,19,28].
The physiological sensors continuously monitor and record changes at two broad levels of human physiology, the peripheral nervous system (e.g., galvanic skin response and heart rate variability) and the central nervous system (e.g., brain) [15]. Some examples of these physiological sensors include the following:
• Electroencephalogram (EEG): EEG measures neurophysiological activities by recording post-neuronal electrical signals using electrodes (dry/wet) connected to an individual's scalp. EEGs are currently available in different form factors ranging from wired to wireless and from simple headbands (e.g., muse and neuro link) to complex full head scalp systems with 64 or more channels.
• Functional magnetic resonance imaging (fMRI): fMRI is a neurophysiological imaging technology that monitors changes in brain regions using the simple principle that active brain regions will receive more blood flow than others. This method's primary strength is its ability to accurately pinpoint the region of the brain that is activated. This advantage has made it an excellent choice for researchers trying to understand the neurophysiological mechanisms modulating cognitive and psychological experiences such as trust and emotion. However, its main limitation is its form factor. An fMRI machine is the size of a room or half a room. Moreover, fMRI machines require an individual to lie flat inside their encasements. Above all, an fMRI machine costs more than an EEG system.
• Functional Near-Infrared Spectroscopy (fNIRS): fNIRS measures neurophysiological activation and deactivation by emitting a near-infrared beam into the scalp; the resulting resistivity to the emitted beam, which depends on the amount of oxygenated and de-oxygenated blood in the brain, indicates the neural activation related to the tasks/activities that the individual is experiencing or performing. Its functioning is similar to that of fMRI as an imaging technology but differs in its form factor, cost, and underlying operational mechanisms. For instance, some fNIRS systems are sold for a few hundred dollars and are mobile and portable. However, these systems are unable to pinpoint the region of the brain that is active as accurately as fMRI.
• Photoplethysmography: This method measures changes in vital physiology, such as heart rate and blood pressure, using contactless and non-invasive optical lights emitted onto the skin's surface from an optical device (e.g., a camera).
The use of psychophysiological signals for assessing trust is motivated by the fact that trust is a subconscious net cognitive state modulated by human biological systems (central and peripheral nervous systems). Consequently, the use of psychophysiological signals indeed captures the objective nature of trust concerning events over time [20,29].
This development has aided researchers in achieving groundbreaking discoveries. For example, in the context of human-computer interaction (HCI), researchers have developed intelligent interfaces that enable the adaptation of machines to a user's trust state using one or more physiological signals [13,30].
Moreover, HCI research has demystified the relationship between human physiological responses and user trust states during their interactions with technical artifacts [31]. Furthermore, in the context of human-human interactions, researchers have unraveled several neurological cortical patterns that modulate user trust [32].
However, the use of psychophysiological signals for assessing users' experiences (e.g., trust and emotions) is still in its infancy and includes numerous challenges accompanied by several solutions that are not yet standardized. For instance, the authors in [16] suggest that most psychophysiological signals, except for EMG, are inconsistent and unreliable for measuring users' experiences (emotions) during interactions, especially when users interact naturally (e.g., with the free body movements of participants) compared to non-natural interactions (e.g., with restricted body movement). The authors in [16] note that noise arising from body movement (e.g., eye movement for EEGs and electrode placement for other psychophysiological sensors) can corrupt psychophysiological signals. Although there are robust filtering algorithms that can effectively denoise these signals, the question of which filtering algorithm to use and under what circumstances remains solely at the researcher's discretion.
Moreover, the authors in [33] suggest that the choice of location for psychophysiological sensor electrode placement has a significant impact on the data quality. Depending on the psychophysiological sensor, there are several electrode placement locations, which are chosen solely at the researcher's discretion based on ethics. For example, placing an ECG sensor electrode on the chest area is most viable for medical experts, while electrode placement on the arms is most viable in human-computer interactions.
Furthermore, because of the numerous factors associated with the use of psychophysiological signals, the authors in [34] proposed a guideline for (HCI) researchers using psychophysiological signals. The guideline includes studying the differences between individual participants (e.g., age and gender) and bio-signal characteristics (wet or dry conductivity, electrode position); using a theory that appropriately maps the phenomenon (e.g., trust) to the physiological responses that are measured; applying sufficient contextual information to explain noise in the psychophysiological signal; combining two or more algorithms for data analysis to infer a user's state (i.e., an ensemble technique); using inference validation to ensure that the stimuli for the phenomenon yield the desired effect; and using multiple psychophysiological signals justified by the appropriate theory and supported by suitable methods for data integration and model training.
Moreover, because there is no consensus on a single perfect measurement system (psychophysiological or self-reporting instrument), the authors in [33] strongly recommend a combination of several measurement techniques (e.g., self-reporting and psychophysiological signals).
Despite the objectivity and novelty of the findings made using psychophysiological signals, it is clear that researchers employ diverse approaches to assess trust using psychophysiological signals. However, it is unclear how much progress this diversity has yielded, especially considering the research gaps left unattended due to this diversity.

Related Work and the Need for a Systematic Mapping Study
To clarify and enhance the understanding among academics on the topic of assessing trust using psychophysiological signals, some researchers have reviewed prior studies exploring the physiological correlates of trust by addressing different issues.
Firstly, to establish what trust is and is not, the authors in [25] conducted a review, which established that trust goes beyond rational decision making and includes subconscious processes that inform and elicit rational activities, such as decision making. This led to the conclusion that trust involves a highly subconscious process that informs rational activities like decision making, which act as mere bystanders.
Moreover, the authors in [14] conducted a review on extensive biological evidence (fMRI, oxytocin, and testosterone) of trust to establish that trust is a behavior and not just a mere perception, as widely acclaimed by some researchers who mostly assess trust using self-reporting instruments. Therefore, since behavior is strongly rooted in human biology, trust should be measured using psychophysiological signals.
However, these reviews have not investigated the approaches used in studies assessing trust with psychophysiological signals (i.e., fMRI, fNIRS, EEG, ECG, EDA, audio, etc.). By "approaches", we refer to:
• The nature and choice of data analysis techniques used to analyze the data to infer trust in studies assessing trust with psychophysiological signals, such as ensemble (the combination of two or more machine learning algorithms), dynamic (single machine learning algorithm), or static (a basic statistical test for the patterns, significance, or relationships between variables) methods [17,35,36].
• The measurement scale used for measuring trust in studies assessing trust with psychophysiological signals. As trust and distrust are opposite [37][38][39], each is measured differently on a scale from low to high.
• The most frequently used and maximum number of combined psychophysiological signals used in prior studies assessing trust with psychophysiological signals, as the trade-off between the accuracy and obtrusiveness of physiological sensors could influence the researcher's choice of physiological sensors and, ultimately, the results [17,40].
• The number and type of inference validation methods for validating the psychophysiological signals used in studies assessing trust with psychophysiological signals, as finding appropriate reference measurements is likely to remain an application-specific affair [17,40].
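The three categories of data analysis technique named above can be expressed as a small decision rule. The following is a minimal sketch, assuming that a study is coded only by the number of machine learning algorithms it combined; the names `AnalysisType` and `classify_analysis` are hypothetical and not taken from the reviewed studies.

```python
from enum import Enum


class AnalysisType(Enum):
    STATIC = "static"      # basic statistical test only, no machine learning
    DYNAMIC = "dynamic"    # a single machine learning algorithm
    ENSEMBLE = "ensemble"  # two or more machine learning algorithms combined


def classify_analysis(num_ml_algorithms: int) -> AnalysisType:
    """Map the number of ML algorithms a study used to one of the three categories."""
    if num_ml_algorithms < 0:
        raise ValueError("num_ml_algorithms must be non-negative")
    if num_ml_algorithms == 0:
        return AnalysisType.STATIC
    if num_ml_algorithms == 1:
        return AnalysisType.DYNAMIC
    return AnalysisType.ENSEMBLE


# Hypothetical coding of three studies by the number of algorithms they used.
for n in (0, 1, 3):
    print(n, classify_analysis(n).value)
```

A coding rule of this kind is deliberately coarse: it does not check whether the combined algorithms actually satisfy the conditions for an ensemble, which would require reading each study.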
Therefore, the goal of this review is to show the current gaps in approaches for the use of psychophysiological signals to assess trust. Identifying these gaps will help further the understanding of trust assessment using psychophysiological signals among researchers by providing a bird's-eye view of the approaches used in studies assessing trust via psychophysiological signals.

Materials and Methods
We followed the guidelines for systematic mapping studies used in [41] and adopted from [42][43][44] to conduct this study, as illustrated in Figure 2.
Systematic mapping studies are preferred over systematic reviews or scoping studies because systematic mapping can help examine prior studies and provide a bird's-eye view of existing issues (e.g., gaps in the approaches used in previous studies focused on assessing trust with psychophysiological signals) [43]. This supports our primary goal, which is to provide a visual summary of the approaches used in studies exploring psychophysiological signals for assessing trust. The outcomes of this study will further fruitful discussion among researchers and academics across disciplines.
Figure 2. The systematic mapping study process adopted from [41], as adopted from [42][43][44].

Identifying the Research Questions
The research questions in this study are based on the four components in [45] (also used in [42]).
• Population: Peer-reviewed literature that attempts to assess trust using psychophysiological signals.
• Intervention: Empirical and non-empirical studies that attempt to assess trust using psychophysiological signals (i.e., context (human-human or human-technology relationships), the number and types of physiological sensors, analysis techniques used, and inference validation techniques).
• Comparison: Similarity of approaches used in studies that attempt to assess trust using psychophysiological signals, that is, the most and least frequently used psychophysiological signals, inference validation techniques, and analysis techniques.
• Outcomes: Types of outcomes from studies that attempt to assess trust using psychophysiological signals to identify research gaps.
The questions raised for this study based on the four components above are presented in Table 1. These questions are aimed at providing a bird's-eye view of the approaches used by researchers assessing trust using psychophysiological signals.

RQ1
Question: What are the most frequently used and maximum numbers of combined psychophysiological signals in studies assessing trust with psychophysiological signals?
Rationale: To provide a broad overview of the most commonly used psychophysiological signals and the minimum and maximum number of psychophysiological signals in studies assessing trust with psychophysiological signals.

RQ2
Question: What are the most frequently used and maximum numbers of combined inference validation methods in studies assessing trust with psychophysiological signals?
Rationale: To provide a broad overview of the most commonly used inference validation methods and the minimum and maximum number of inference validation methods in studies assessing trust with psychophysiological signals.

RQ3
Question: What are the types and most common data analysis techniques used to analyze psychophysiological data to infer trust in studies assessing trust with psychophysiological signals?
Rationale: To provide a broad overview of the types of data analysis and/or fusion algorithms used to infer trust.

RQ4
Question: What are the scales used in studies assessing trust with psychophysiological signals?
Rationale: To provide a broad overview of the scales used to measure trust to establish if these scales are consistent or inconsistent across studies that assess trust with psychophysiological signals.

RQ5
Question: What are the outcomes from studies assessing trust with psychophysiological signals?
Rationale: To provide an overview of the outcomes (e.g., models, tools, and processes) of previous studies assessing trust with psychophysiological signals and to identify the gaps in existing contributions.

Identifying Relevant Studies
We searched seven electronic databases using the keywords outlined in Table 2, which were derived from our research questions. Trust and psychophysiology are the most crucial concepts in this study, from which our keywords were derived. The search included literature published between January 1970 and April 2019 to identify the origin of this subject. The search results and search strings are detailed in Table 3.
Table 3. Papers retrieved from the selected digital libraries.

Study Selection
To be included, a paper had to be peer-reviewed and published between January 1970 and April 2019, which enabled us to investigate the origin of the topic. Moreover, the paper had to be empirical, written in English, and focused on assessing trust with psychophysiological signals. Since the task was to analyze the approaches used in studies assessing trust with psychophysiological signals and to identify existing gaps in the literature, we excluded papers that were not written in English; that assessed trust using only self-reporting instruments or only biochemical features (such as oxytocin or testosterone); or that were tutorials, panel discussions, short papers (<=3 pages), newsletters, magazine articles, or personal blogs. We also excluded studies that, although relevant, could not be accessed, as well as non-empirical papers, e.g., experience papers, opinion papers, and philosophical papers.
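The inclusion and exclusion criteria above can be read as a single predicate over a candidate paper. The following is a minimal sketch under that reading; the `Paper` record, its field names, and the example values are hypothetical and serve only to mirror the stated criteria.

```python
from dataclasses import dataclass, replace

# Venue types the text lists as excluded (short papers are handled via page count).
EXCLUDED_VENUES = {"tutorial", "panel discussion", "newsletter",
                   "magazine article", "personal blog"}


@dataclass(frozen=True)
class Paper:
    # All field names are hypothetical; they only mirror the stated criteria.
    year: int
    month: int
    peer_reviewed: bool
    empirical: bool
    language: str
    pages: int
    venue_type: str
    uses_psychophysiological_signals: bool
    only_biochemical_markers: bool  # e.g., only oxytocin or testosterone
    accessible: bool


def is_included(p: Paper) -> bool:
    """Return True if the paper passes every stated inclusion/exclusion rule."""
    in_window = (1970, 1) <= (p.year, p.month) <= (2019, 4)
    return (
        in_window
        and p.peer_reviewed
        and p.empirical
        and p.language == "English"
        and p.pages > 3  # short papers (<=3 pages) are excluded
        and p.venue_type not in EXCLUDED_VENUES
        and p.uses_psychophysiological_signals  # rules out self-report-only studies
        and not p.only_biochemical_markers
        and p.accessible
    )


# Hypothetical candidate paper, not one of the reviewed studies.
candidate = Paper(2015, 6, True, True, "English", 10, "journal", True, False, True)
print(is_included(candidate))                    # True
print(is_included(replace(candidate, pages=3)))  # False
```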
The inclusion and exclusion criteria were applied to all five stages of the study screening method outlined in Table 4 (adopted from [46]). This method was also used in [41]. Screening of all 585 papers found in the automatic search by title resulted in 105 articles. This screening ensured that the titles included one or all the keywords and that the other inclusion criteria were satisfied. A further manual search through a backward and forward examination of the selected studies' related work sections and reference lists yielded 24 more studies, leading to a total of 129 articles. After excluding duplicate articles (32), the list was reduced to 97 articles. After reading the abstracts of the remaining articles (97), the articles were reduced to 51. Finally, we thoroughly read through these 51 articles, all of which were found to be relevant, leaving a total of 51 articles included in this study.
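The screening counts reported above can be checked with simple arithmetic. The sketch below only restates the numbers given in the text; the function name `screening_funnel` and the stage labels are illustrative assumptions.

```python
def screening_funnel():
    """Return (stage, remaining articles) pairs using the counts reported in the text."""
    automatic_hits = 585    # papers found by the automatic search
    after_title = 105       # remaining after screening by title
    manual_additions = 24   # added by backward/forward manual search
    duplicates = 32         # duplicate articles removed
    after_abstract = 51     # remaining after reading abstracts
    after_full_text = 51    # remaining after full-text reading

    combined = after_title + manual_additions
    deduplicated = combined - duplicates
    return [
        ("automatic search", automatic_hits),
        ("after title screening", after_title),
        ("after manual search", combined),
        ("after duplicate removal", deduplicated),
        ("after abstract screening", after_abstract),
        ("after full-text reading", after_full_text),
    ]


stages = screening_funnel()
for name, remaining in stages:
    print(f"{name}: {remaining}")
```

The derived intermediate totals (129 and 97) agree with the figures stated in the text.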

Classification Themes, Keywords, and Scheming Process
We followed the key-wording process adopted in [41,43] to build a classification scheme by reading the abstract and entire content of each included primary study. For each article we read, the relevant keywords (see Table 5) were extracted and used to build the themes in the schema. We continued to update the schema containing the classification schemes until all articles had been read completely.

Data Extraction
Based on the themes and keywords in Table 5 that we extracted during the schema process described above, we created a spreadsheet to record the following extracted data, a method also employed in previous research [41]:
• Publication year and type/venue.
• The data analysis technique or algorithm used to infer trust.
• The type and number of inference validation methods used.
• The type and number of psychophysiological signals used.
• The type of contribution made.
• The type of scale used for assessing trust with psychophysiological signals.
• The type of trust relationship being studied to classify articles as either interpersonal trust (human-human or human-organizational) or trust in technology (human-computer-human or human-computer).
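One row of such a spreadsheet could be modeled as a small record type. The following is a sketch only: the class `ExtractionRecord`, its field names, and the example values are assumptions made for illustration, and the sole rule taken from the text is the grouping of relationship types into interpersonal trust and trust in technology.

```python
from dataclasses import dataclass
from typing import List

# Relationship types as grouped in the text.
INTERPERSONAL = {"human-human", "human-organizational"}
TECHNOLOGY = {"human-computer", "human-computer-human"}


def trust_context(relationship: str) -> str:
    """Classify a trust relationship as interpersonal trust or trust in technology."""
    if relationship in INTERPERSONAL:
        return "interpersonal trust"
    if relationship in TECHNOLOGY:
        return "trust in technology"
    raise ValueError(f"unknown relationship type: {relationship!r}")


@dataclass
class ExtractionRecord:
    # Hypothetical field names mirroring the extracted data items listed above.
    study_id: str
    year: int
    venue_type: str
    analysis_technique: str
    validation_methods: List[str]
    signals: List[str]
    contribution: str
    trust_scale: str
    relationship: str

    @property
    def context(self) -> str:
        return trust_context(self.relationship)


# Invented example row, not data from any reviewed study.
row = ExtractionRecord(
    study_id="example-1",
    year=2016,
    venue_type="journal",
    analysis_technique="static",
    validation_methods=["self-report"],
    signals=["EEG", "EDA"],
    contribution="model",
    trust_scale="low-high trust",
    relationship="human-computer",
)
print(row.context, len(row.signals), len(row.validation_methods))
```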

Data Synthesis
Data synthesis was done in two ways, manually and automatically, as suggested in [41]. The manual data synthesis involved an iterative classification of studies into groups (e.g., the type of data analysis technique, the scale used for assessing trust, and the type of trust relationship studied), while the automatic process involved coding the study data numerically into an Excel spreadsheet to enable a basic descriptive statistical analysis and to produce some of the charts reported in this study.

Origin and Publication Venue/Year
Figure 3 shows that the use of psychophysiological signals for assessing trust began in 2002. However, the trend for the publication years in Figure 3 indicates that there was an increase in contributions or research interest from 2007 onward.
This increase in the number of publications (2007 to present) could be the result of more disciplines becoming involved in the study of trust using psychophysiological signals: sociology, psychology, and cognitive science, which seek to foster trust relationships between humans (interpersonal trust: human-human and human-organization), and HCI research and human factors engineering, which seek to foster trust relationships between users and technological artifacts (trust in technology: human-technology and human-technology-human).
Therefore, this topic is relatively new and highly multi-disciplinary. However, since our search ended in April 2019, the publication trends in Figure 3 do not include papers published after April 2019.
Furthermore, as illustrated in Figure 4, most scientific contributions that employ psychophysiological signals for assessing trust (58.8% of the 51 included articles) are journal publications. In comparison, 41.2% of the articles were published in conference/symposium proceedings.
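The venue shares can be cross-checked against the total of 51 articles. The sketch below assumes that 21 articles appeared in conference/symposium proceedings, which is the whole-number count consistent with 41.2% of 51; the absolute counts are an inference and are not stated in the text.

```python
def share(count: int, total: int) -> float:
    """Percentage of total, rounded to one decimal place."""
    return round(100 * count / total, 1)


TOTAL_ARTICLES = 51
conference_articles = 21  # assumption: the count consistent with the reported 41.2%
journal_articles = TOTAL_ARTICLES - conference_articles

print(share(conference_articles, TOTAL_ARTICLES))  # 41.2
print(share(journal_articles, TOTAL_ARTICLES))     # 58.8
```

Under this assumption, the two venue types account for all 51 articles, with 30 journal publications.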

RQ1: Most Frequently Used and Combined Psychophysiological Signals Used for Assessing Trust
Regarding central nervous system psychophysiological signal sensors, Figure 5 shows that the most frequently used type of psychophysiological signal for assessing trust is EEG, followed by fMRI. EOG and fNIRS come next but appear to be rarely used. These findings are most likely due to the wide variety and availability of low-cost and effective EEG systems, such as those from Muse, Neuro-kit, Brain-amp, and G.tech. This is consistent with the authors in [14], who, despite suggesting that fMRI signals are the most popularly used psychophysiological signals in studies assessing trust, also anticipated an increase in the use of other neurophysiological devices, such as EEG.
Furthermore, regarding peripheral nervous system psychophysiological signal sensors (e.g., EDA, ECG, and eye tracker), the most frequently used types of psychophysiological signals for assessing trust are electrocardiography (ECG) and electrodermal activity (EDA). Audio and video are the second most frequently used psychophysiological signals in studies assessing trust with psychophysiological signals, and eye tracking is the third. Other seldom-used methods include photoplethysmography (PPP), impedance cardiography (ICG), piezo-electric belt pneumatoraces, and potentiometers.
These peripheral psychophysiological signals have become predominant in studies assessing trust, likely due to their unobtrusiveness, low cost, ease of accessibility, and simplicity in signal analysis.
Furthermore, as shown in Figure 5, irrespective of context, EEG is clearly the most widely used type of psychophysiological signal in studies assessing trust with psychophysiological signals. Moreover, fMRI is mostly used in studies assessing trust with psychophysiological signals in the context of interpersonal trust. This result could be due to the large size of fMRI machines, which makes them unsuitable for assessing trust in a technological context (human-computer interactions).
Figure 6 identifies the number and type of the most commonly combined and not yet combined psychophysiological signals based on the 51 included primary studies that assessed trust with psychophysiological signals. One reason for combining psychophysiological signals is that they have varying temporal resolutions that ultimately influence the depth and breadth of the results obtained [15].
The maximum number of psychophysiological signals combined in any study that assessed trust with psychophysiological signals was three (3). For instance, combinations of ECG + EDA + eye tracker and photoplethysmography (PPP) + a piezo-electric belt pneumatorace + a potentiometer were reported in two studies (SM11 and SM38). However, all the combined psychophysiological signals are peripheral nervous system monitoring signals that are limited in temporal resolution [15].
Combinations of two psychophysiological signals (EDA, ECG, eye tracker, audio, EOG, and EEG) are the next most commonly found in studies assessing trust with psychophysiological signals. Some studies combined two peripheral nervous system psychophysiological signal sensors: two studies (SM43 and SM48) combined EDA and ECG, one study (SM15) combined ECG and eye tracker, one study (SM16) combined ECG and video, five studies (SM3, SM6, SM26, SM44, and SM50) combined audio and video, one study (SM5) combined EDA and EMG, and one study (SM20) combined ECG and impedance cardiography (ICG). Other studies combined psychophysiological signals monitoring both the peripheral and central nervous systems: only two studies (SM4 and SM8) combined EEG and EDA, four studies (SM2, SM25, SM27, and SM33) combined EEG and EOG, and one study (SM31) combined eye tracker and fMRI.
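The two-signal combinations listed above can be tallied directly from the study identifiers given in the text. The following sketch uses only those identifiers; the variable names are illustrative, and the tally covers pairs only, not the three-signal combinations or single-signal studies.

```python
from collections import Counter

# Pairs of combined signals and the studies reporting them, as listed in the text.
PAIRS = {
    ("EDA", "ECG"): ["SM43", "SM48"],
    ("ECG", "eye tracker"): ["SM15"],
    ("ECG", "video"): ["SM16"],
    ("audio", "video"): ["SM3", "SM6", "SM26", "SM44", "SM50"],
    ("EDA", "EMG"): ["SM5"],
    ("ECG", "ICG"): ["SM20"],
    ("EEG", "EDA"): ["SM4", "SM8"],
    ("EEG", "EOG"): ["SM2", "SM25", "SM27", "SM33"],
    ("eye tracker", "fMRI"): ["SM31"],
}

# Number of studies per pair.
pair_counts = Counter({pair: len(studies) for pair, studies in PAIRS.items()})

# How often each individual signal appears within a two-signal combination.
signal_counts = Counter()
for pair, studies in PAIRS.items():
    for signal in pair:
        signal_counts[signal] += len(studies)

print("studies with two combined signals:", sum(pair_counts.values()))
print("most common pair:", pair_counts.most_common(1)[0])
print("signal frequency within pairs:", dict(signal_counts))
```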
These findings suggest that very few studies have attempted to provide a comprehensive assessment of trust based on the brain-body relationship (the central and peripheral nervous systems) by combining a central nervous system psychophysiological signal (e.g., EEG, fMRI, fNIRS) with more than one peripheral nervous system psychophysiological signal when assessing trust. Indeed, each psychophysiological signal has a varying temporal resolution and measures unique physiological factors. For example, EDA measures skin conductance, ECG measures heart rate related to stress, EEG measures cognitive activities, and facial expressions indicate valence [15].
Moreover, because only one study attempted to use audio signals to assess trust, little is known about the potential of audio psychophysiological signals for assessing trust. The same is true for signals from eye tracking, among others.

RQ2: Most Frequently Used and Combined Inference Validation Methods Used in Studies Assessing Trust with Psychophysiological Signals
As illustrated in Figure 7, of the fifty-one included primary studies that assessed trust using psychophysiological signals, 47.1% used only self-reporting inference validation methods, while 33.3% used only behavioral inference. Only 11.8% combined behavioral inference (i.e., a task/activity such as decision making) with self-reporting inference validation methods. The remaining 7.8% used neither the self-reporting method nor the behavioral inference method.
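The shares reported above can be cross-checked against the 51 included studies. The following is a minimal sketch that assumes each percentage is a rounded fraction of 51; the per-category study counts it produces are inferred for illustration and are not stated in the text, and all names in the code are hypothetical.

```python
# Reported shares of inference validation methods (RQ2), in percent.
TOTAL_STUDIES = 51
reported_shares = {
    "self-report only": 47.1,
    "behavioral only": 33.3,
    "self-report + behavioral": 11.8,
    "neither": 7.8,
}


def implied_count(share_percent: float, total: int = TOTAL_STUDIES) -> int:
    """Return the whole number of studies closest to the reported share."""
    return round(share_percent / 100 * total)


# Inferred counts (an assumption of this sketch, not reported values).
implied_counts = {name: implied_count(p) for name, p in reported_shares.items()}

for name, count in implied_counts.items():
    # Recompute the share from the inferred count to confirm the rounding.
    print(f"{name}: {count} studies ({count / TOTAL_STUDIES:.1%})")
```

Under this assumption, the four categories account for all 51 studies, which is consistent with the percentages summing to 100%.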
This finding suggests that it is a common practice among researchers investigating trust with psychophysiological signals to complement psychophysiological signals with other data sources to confirm that the recorded signals represent actual human cognitive or affective states, such as trust. Further contextual analysis, as illustrated in Figure 8, showed that interpersonal trust studies employed more self-reporting inference validation methods than behavioral inference methods. In contrast, there is little difference between the frequency of usage of self-reporting and behavioral inference validation methods among trust in technology studies.
Further, more trust in technology studies than interpersonal trust studies employed no inference validation method at all, as illustrated in Figure 8. This could be because the topic of trust originated from psychology and cognitive science. Considering that trust in technology researchers are mostly HCI or human factors researchers, it can also be assumed that there is a lack of methodology transferability from classical psychology and cognitive science to applied domains, such as HCI. Moreover, the authors in [33] suggest that the entry barrier could be a factor responsible for the less-common usage of psychophysiological signals in HCI research. This is further reinforced by the greater number of interpersonal trust studies than trust in technology studies that combined inference validation methods, as outlined in Figure 8 below. To ameliorate this issue, the scientific community could investigate the transferability of methodologies between psychology/cognitive science, HCI/human factors science, and physiological computing in assessing trust using psychophysiological signals.

RQ3: Analysis Techniques Used for Analyzing and Inferring Trust from Psychophysiological Signal Data
As illustrated in Figure 9, most (74.5%) studies assessing trust with psychophysiological signals adopted static techniques like basic statistics (e.g., t-test or ANOVA) for analyzing psychophysiological data to infer a trust state. However, only 17.6% of the 51 included primary studies employed dynamic techniques such as those commonly used in the machine learning or data science fields for high-level predictive analytics in either classification or regression problems, for example, support vector machines (SVM), linear/quadratic discriminant analysis (L/QDA), logistic regression (LR), and naive Bayes (NB) classification.
Further, only 7.8% of the 51 included primary studies used ensemble techniques, which aim to overcome the limitations of individual dynamic algorithms by combining two or more of them to enhance the accuracy of the predictions in the classification or regression problems associated with assessing trust [30,47].
This finding further suggests that most researchers assessing trust with psychophysiological signals either lack sufficient knowledge on data science/machine learning methods to explore the psychophysiological signal data values or have low awareness of how to use data science/machine learning methods. This provides an opportunity to examine challenges in the transferability of data science/machine learning methodologies between researchers, such as HCI/human factor scientists, psychological researchers, cognitive science researchers, sociologists, etc., to assess trust with psychophysiological signals. A potential review of the dynamic and ensemble algorithms used in studies assessing trust with psychophysiological signals is also desirable to help trust researchers identify the best algorithms to adopt in their research.
Moreover, as illustrated in Figure 9, interpersonal trust studies employed more static techniques for analyzing psychophysiological signal data to infer trust than dynamic techniques. In contrast, trust in technology studies employed more dynamic techniques for analyzing psychophysiological signal data to infer trust than static techniques/algorithms.
Trust in technology studies also used more (3) ensemble techniques for analyzing psychophysiological signals to infer trust compared to interpersonal trust studies.
This result further re-affirms the lack of knowledge or awareness of data science/machine learning methods among researchers using psychophysiological signals to assess trust. However, this gap is predominant among non-HCI/human factor researchers, as HCI/human factor researchers also have more technical skills compared to researchers in psychology or the cognitive science field.

RQ4: Trust Scales Used for Inferring Trust from Psychophysiological Signal Data
As explained in Figure 10 below, among the 51 included primary studies assessing trust with psychophysiological signals, 29.4% used a low-high trust scale, 35.3% used a trust-distrust scale, 13.7% did not define (undefined) any scale, and 5.9% used a low-neutral-high trust scale. Low-medium-high, trust-neutral-distrust, initial-later, low-medium-neutral-high, 0 to 7, 0 to 100%, over-under, and trust-mistrust trust scales were each used by 2% of the 51 included primary studies assessing trust with psychophysiological signals. However, the most appropriate scale to be used by researchers investigating trust with psychophysiological signals remains uncertain. To clarify the issue of scales, below we outline the definitions of some vital keywords:
• Misdistrust: Distrusting a trustworthy person/trustee [48]. This is also commonly referred to as under-trusting.
• Mistrust: Trusting an untrustworthy person/trustee [49], also commonly referred to as over trust resulting from misinformation.
• Untrust: Being unable to establish trust [49].
• Undistrust: Being unable to establish a distrust state [50].
• Trust involves desirable expectations based on beliefs that contribute to behavior. That is, trust verifies trustworthiness [51]. However, distrust involves undesirable expectations. For example, a trustor may not trust a trustee due to a lack of information, but this does not imply that the trustor distrusts the trustee. Moreover, users may not use the object of trust (e.g., the technical artifact) a few times during a series of interactions, which does not mean that the user distrusts the object of trust (e.g., the technical artifact) [52]. However, when entity A believes that entity B presents potential undesirable consequences, a distrust state is formed.
Moreover, using the summary of numerous trust scales from different studies presented in [20], we present a comprehensive trust scale in Figure 11. This scale is used to clarify the consistency of the diverse trust scales found in studies assessing trust with psychophysiological signals.
As illustrated in Figure 11 below, the first scale shows that trust can be measured on a scale of trust-untrust, while distrust can be measured on a scale of undistrust-distrust.
The second scale is a multi-class scale that measures trust on three categorical levels, each further subdivided into three subscales (high, medium, and low). Similarly, the second scale measures distrust on a scale of three categorical levels that are further divided into three subscales (high, medium, and low).
The third scale measures trust on six categorical levels (CT = complete trust, VHT = very high trust, HT = high trust, HMT = high medium trust, LMT = low medium trust, and LT = low trust). Distrust is similarly measured using six categorical levels (CD = complete distrust, VHD = very high distrust, HD = high distrust, HMD = high medium distrust, LMD = low medium distrust, and LD = low distrust).
The fourth scale is a discrete scale that measures trust from 0 to 6 and distrust from 0 to −6. The fifth scale measures trust on a continuous scale from 0 to 1, while distrust is measured as a continuous scale from 0 to −1. The sixth scale is a binary scale that measures trust as 1 and distrust as 0.
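To make the relationship between the fourth, fifth, and sixth scales concrete, the sketch below converts a score on the continuous scale (−1 to 1) to the discrete scale (−6 to 6) and to the binary scale. The linear mapping, the rounding rule, and the zero threshold are assumptions made for illustration only; the function names are hypothetical and do not come from [20] or from any reviewed study.

```python
import math


def to_discrete_scale(score: float) -> int:
    """Map a continuous score in [-1, 1] to the discrete scale -6..6.

    Assumption: a linear mapping with halves rounded away from zero.
    """
    if not -1.0 <= score <= 1.0:
        raise ValueError("score must lie in [-1, 1]")
    magnitude = math.floor(abs(score) * 6 + 0.5)
    return int(math.copysign(magnitude, score))


def to_binary_scale(score: float) -> int:
    """Map a continuous score in [-1, 1] to the binary scale.

    Assumption: any positive score counts as trust (1); zero and negative
    scores count as distrust (0), so a neutral state cannot be expressed.
    """
    if not -1.0 <= score <= 1.0:
        raise ValueError("score must lie in [-1, 1]")
    return 1 if score > 0 else 0


for value in (0.75, 0.2, 0.0, -0.3, -1.0):
    print(value, to_discrete_scale(value), to_binary_scale(value))
```

Note that the binary conversion collapses every degree of trust into a single value and every degree of distrust (together with the neutral point) into another, which illustrates why the choice among these scales is not interchangeable.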
Based on the above definitions of terms and the descriptions of the comprehensive trust scales, only the sixth scale remains contentious among trust researchers. Some researchers suggest that trust and distrust are opposite ends of the same scale and are not conceptually different [37,39,53]. Others disagree, arguing that trust and distrust are not inverses of each other and are not conceptually the same: trust violations weigh more heavily on humans than trust-building processes, which implies that the way trust is learned/built differs from the way distrust is learned/built, as explained by the principle of symmetry [38,54].
Therefore, the validity of the results reported in the 37.3% of the 51 included primary studies that assessed trust on scales of trust-distrust and trust-neutral-distrust is unclear due to their scale disparities, despite the quality of their research. Moreover, the results of the 4% of studies that assessed trust on scales of trust-mistrust and over-undertrust remain unclear because mistrust is not a variation of trust and does not belong to the same evaluative category as trust. Similarly, over-trust and under-trust, also referred to as mistrust and misdistrust, respectively, are not the same as trust or a variation of trust. A potential research opportunity exists in investigating these various scales and areas of potential convergence.
Interestingly, the scales used in the other 43.3% of the 51 included primary studies, namely low-high (29.4%), low-neutral-high (5.9%), low-medium-high (2%), low-medium-neutral-high (2%), continuous scales from 0 to 7, and percentages from 0 to 100, are consistent with the conceptual definition of trust, as each fits into one of the scales numbered 1 to 5 (see Figure 11). Furthermore, 13.7% of the 51 included primary studies had no defined scale. This is a further testament to the lack of understanding of what trust means and how trust should be scaled and measured.
Furthermore, as illustrated in Figure 12 below, the issue of appropriate scale utilization is not unique to trust in either an interpersonal or a technological context, as there are more studies in both contexts that utilize scales such as trust-distrust, trust-neutral-distrust, trust-mistrust, and over-undertrust (also referred to as mistrust and misdistrust).

Figure 11. Comprehensive trust scale adopted from [20,55]: H = high, M = medium, L = low, CD = complete distrust, CT = complete trust, VHD = very high distrust, VHT = very high trust, HD = high distrust, HT = high trust, HMD = high medium distrust, HMT = high medium trust, LMD = low medium distrust, LMT = low medium trust, LD = low distrust, and LT = low trust.

Figure 12. Frequency of usage of different trust scales in studies assessing trust with psychophysiological signals by context (trust in technology and interpersonal trust).

RQ5: Contributions and Potential Research Gaps
As illustrated in Figure 13, there are no studies focused on bridging the gap between the various disciplines (physiological computing, psychology, cognitive science, HCI/human factor science, among others) by identifying the challenges in the transferability of methods/knowledge. This is evident in the fact that none of the 51 included primary studies contributed to the methodology, irrespective of context (trust in technology or interpersonal trust).
Moreover, only one study (SM46 in Appendix A) made a process-related contribution, which focused on how to foster trust between two humans during their interactions with the use of psychophysiological signals. However, this study did not measure trust with psychophysiological signals; instead, it presented the physiological signals of both interacting individuals.
In addition, only one study (SM47) proposed the development of a tool for assessing trust. This particular study proposed the development of a real-time trust assessment tool that can help a potential investor assess the trustworthiness of an investment adviser's information.
Furthermore, 17 of the 51 included primary studies assessing trust with psychophysiological signals developed a model as their contribution. In the context of interpersonal trust, five studies (SM21, SM40, SM44, SM28, and SM36) developed models using static techniques (i.e., statistical tests such as ANOVA or the t-test), two studies (SM26 and SM39) developed models using dynamic techniques (i.e., algorithms such as SVM or LDA), and one study (SM50) developed an ensemble model (i.e., a combination of two or more dynamic algorithms). In contrast, in the context of trust in technology, two studies (SM19 and SM13) developed models using static techniques, three studies (SM12, SM18, and SM1) developed dynamic models, and four studies (SM3, SM4, SM6, and SM8) developed ensemble models.
However, the ensemble models reported in the five studies that employed ensemble techniques for analyzing psychophysiological signals to infer a user trust state were either fairly accurate (with room for significant improvement), suffered instability issues (as reported in SM4), or did not satisfy the primary goals of ensemble algorithms.
For example, using a supervised approach, SM4 (trust in technology context) developed a voting ensemble trust classifier model that combined five different algorithms using EDA and EEG psychophysiological signals. The resulting classifier model was fairly accurate but unstable (minimum accuracies of 43% and 46%, and maximum accuracies of 61% and 72%), despite being trained with both the first 40 signal samples and all 100 signal samples.
Likely as a result of the challenges faced by SM4 when combining several distinct algorithms, other researchers developing trust classifier models with the ensemble method have only combined the same algorithms. Consequently, the resulting models do not satisfy the reasons for combining multiple algorithms (e.g., reducing prediction errors, increasing generalizability, decreasing bias, improving variance sensitivity, overcoming the limitations associated with each algorithm, and increasing accuracy) [17].
Furthermore, using an unsupervised approach, SM50 (interpersonal context) developed a voting ensemble neuro-fuzzy neural network-based trust classifier model. The resulting model accuracy ranged between 45% and 98%, depending on the trust factor with the same psychophysiological data. This also indicates that the model is unstable. SM6 (interpersonal context) also developed an ensemble neuro-fuzzy neural network trust classifier model. Despite optimizing the classifier model with an evolutionary algorithm, the resulting model attained an accuracy of only 66.8%; stability was neither assessed nor reported. There is, therefore, substantial room for improvement, and the resulting model is also subject to neural network algorithm limitations. Moreover, SM3 developed an ensemble neural network trust classifier model optimized with a genetic algorithm. Although the resulting model attained an accuracy of 83%, its stability was not assessed, and it is unclear what accuracy value was reported (minimum, mean, or maximum), as the cross-validation method was used for model training and validation. Moreover, this model is subject to the limitations of the neural network algorithm.
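Because a single accuracy figure hides whether a cross-validated model is stable, it helps to report the spread alongside the mean. The sketch below is one possible way to do this, not a method taken from any reviewed study; the function name is hypothetical and the fold accuracies are invented purely for illustration.

```python
from statistics import mean, pstdev


def summarize_accuracies(fold_accuracies):
    """Summarize per-fold accuracies so that spread is reported with the mean."""
    if not fold_accuracies:
        raise ValueError("at least one fold accuracy is required")
    lowest = min(fold_accuracies)
    highest = max(fold_accuracies)
    return {
        "min": lowest,
        "mean": mean(fold_accuracies),
        "max": highest,
        "range": highest - lowest,          # a simple indicator of instability
        "std": pstdev(fold_accuracies),     # population standard deviation
    }


# Hypothetical accuracies from a 5-fold cross-validation (illustrative only).
example_folds = [0.50, 0.60, 0.70, 0.60, 0.60]
summary = summarize_accuracies(example_folds)
for key, value in summary.items():
    print(f"{key}: {value:.3f}")
```

Reporting the minimum, mean, and maximum together removes the ambiguity noted above about which accuracy value a study is quoting.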
Therefore, a significant gap regarding the extent to which trust can be assessed in real-time remains unaddressed. Bridging this gap will require the availability of a stable, accurate, bias-free, and variance-sensitive ensemble trust classifier model suitable for the automatic classification of user trust from psychophysiological signals.
Addressing the issue of stability and accuracy could require more data samples, as the study in SM4 was short-lived and used a limited sample size. In addition, SM18 showed that the temporal characteristics of psychophysiological signals are another potentially important factor, as psychophysiological signals are continuous and not discrete like the data obtained with self-reporting instruments. Moreover, four ensemble techniques exist: voting (soft and hard), boosting, bagging, and stacking. However, it is not clear which ensemble method is most suitable, because all prior studies used the voting ensemble method [47,56,57]. Further, several feature selection techniques are employed (e.g., filter, wrapper, embedded, and hybrid techniques), which could result in variable feature sets. Therefore, finding the most useful feature selection method is another factor that could influence ensemble model performance.
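The difference between hard and soft voting can be illustrated with a small, library-free sketch. It is a generic illustration of the two voting rules, not the implementation used in any of the reviewed studies; the class labels, function names, and classifier outputs are all hypothetical.

```python
from collections import Counter


def hard_vote(predicted_labels):
    """Majority vote over the class labels predicted by each base classifier."""
    counts = Counter(predicted_labels)
    # most_common(1) returns the most frequent label; ties resolve in favor
    # of the label that was encountered first.
    return counts.most_common(1)[0][0]


def soft_vote(predicted_probabilities):
    """Average per-class probabilities across classifiers and pick the largest."""
    classes = list(predicted_probabilities[0].keys())
    n = len(predicted_probabilities)
    averaged = {c: sum(p[c] for p in predicted_probabilities) / n for c in classes}
    return max(averaged, key=averaged.get)


# Hypothetical outputs of three base classifiers for one signal epoch.
probabilities = [
    {"high_trust": 0.90, "low_trust": 0.10},
    {"high_trust": 0.40, "low_trust": 0.60},
    {"high_trust": 0.45, "low_trust": 0.55},
]
# Each classifier's hard label is its most probable class.
labels = [max(p, key=p.get) for p in probabilities]

hard_result = hard_vote(labels)
soft_result = soft_vote(probabilities)
print("hard voting:", hard_result)
print("soft voting:", soft_result)
```

In this invented example the two rules disagree: two of the three classifiers favor low trust, but one classifier is confident enough about high trust to dominate the averaged probabilities. This is one reason the choice of ensemble rule can affect the resulting trust classification.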
Furthermore, it can be inferred from Figure 13 below that the use of psychophysiological signals for assessing trust, irrespective of context (trust in technology or interpersonal trust), is generally perceived as simply another measurement instrument. This is reinforced by the fact that 32 of the 51 studies contributed metrics, possibly because most studies used static techniques, as expected. Studies in the context of interpersonal trust that employed static techniques are twice as common as studies on trust in technology that used static methods. This further highlights the issue of methodological knowledge transferability between the various disciplines researching trust using psychophysiological signals.

Limitations
The main limitations of this review study are outlined below:
• The search period did not include articles that were published after April 2019.
• Only seven electronic databases were included in the search.
• Focus on trust could have excluded some cybersecurity and network communication articles that also deal with trust topics.
• Our search was not exhaustive for all existing electronic databases.

Future Work
Trust definitions differ among researchers. Some define trust as a belief or intention and measure trust by operationalizing some of its characteristics with sets of questions that result in self-reporting instruments. Other researchers define trust as a rational and cautious decision-making process and measure it implicitly with psychophysiological signals or behavioral inferences during controlled experiments. Irrespective of the schools of thought, there are no generally accepted self-reporting instruments, psychophysiological signals, or inference validations for assessing trust. This could be the reason for the lack of a generally acceptable scale for measuring trust when using psychophysiological signals, as highlighted in RQ4. As weight is measured in kilograms on a continuous decimal scale, so too should a trust scale be defined and validated. Therefore, a potential future research opportunity exists in investigating the most appropriate scale for assessing trust when using psychophysiological signals.
Furthermore, the lack of a general definition and understanding of what trust means could also explain why the majority of self-reporting trust measurement instruments differ significantly (i.e., in what trust-related attributes they measure). This could also explain why some studies assessing trust with psychophysiological signals combined both inference validation and self-reporting methods, used a single inference validation method, or did not use any inference validation method.
Therefore, a potential research gap remains in developing self-reporting trust assessment instruments that are generally acceptable from context to context (human-human, human-computer-human, and human-computer relationships) and place to place (surveys, lab experiments, and the wild). This type of instrument would serve as a more reliable inference validation to complement the use of psychophysiological signals. In addition, the behavioral inference methods commonly used in experimental studies (e.g., economic investment games) are yet to be standardized. For example, it is unclear if behavior such as cooperation, dependence, and decision to use all elicit the same evoked potential in psychophysiological signals. That is, the changes in ECG signals (e.g., heart rate variability, peaks, and troughs) when a user exhibits high trust during various experiments (e.g., through economic trust games or human-computer simulation) could be compared to see if the dependence resulting from trust is the same irrespective of context.
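One way to make such cross-experiment comparisons concrete is to reduce each ECG recording to a standard heart rate variability statistic. The sketch below computes RMSSD (the root mean square of successive differences between RR intervals), a common time-domain measure chosen here as an assumption; the reviewed studies may have used other measures. The function name is hypothetical and the RR-interval values are invented for illustration.

```python
import math


def rmssd(rr_intervals_ms):
    """Root mean square of successive differences between RR intervals (ms)."""
    if len(rr_intervals_ms) < 2:
        raise ValueError("at least two RR intervals are required")
    # Differences between each RR interval and the one that follows it.
    diffs = [b - a for a, b in zip(rr_intervals_ms, rr_intervals_ms[1:])]
    return math.sqrt(sum(d * d for d in diffs) / len(diffs))


# Hypothetical RR intervals (ms) from two different experimental settings.
economic_game_rr = [800, 810, 790, 805]
simulation_rr = [800, 802, 799, 801]

print(f"economic game RMSSD: {rmssd(economic_game_rr):.2f} ms")
print(f"simulation RMSSD: {rmssd(simulation_rr):.2f} ms")
```

Computing the same statistic for recordings from different paradigms would give a common basis for asking whether a high-trust state produces comparable physiological changes across contexts.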
The continuous disagreement on trust definitions/meanings has far-reaching consequences, as evident in the slow uptake of implicit measurements of trust using psychophysiological signals. Although the topic of assessing trust with psychophysiological signals is relatively new (see Figure 3), most psychophysiological signals were invented long ago. Therefore, an important unanswered question is why the uptake of the objective measurement of trust (psychophysiology) has been slow and how we can improve this uptake among research communities (HCI, MTI, and human factors). Although psychophysiological signal techniques require specialized skillsets to use, these knowledge gaps could be bridged through conference workshops and tutorial sessions/papers. However, without a clear distinction between what trust is and how to measure trust, the issue of uptake will remain unsolved.
Furthermore, there is a consistent imbalance in the usage frequency of psychophysiological signals for assessing trust, for example, when comparing the commonly used techniques of EEG and fMRI to other psychophysiological signals, such as audio, EDA, eye tracking, and video. These findings could be due to the topic of trust assessment using psychophysiological signals still being relatively new and attracting growing research interest. A potential opportunity for future research could be to empirically investigate which psychophysiological signal is most suitable for assessing trust, irrespective of context. Moreover, the maximum number of psychophysiological signals ever combined in studies assessing trust with psychophysiological signals (based on the 51 included primary studies) is three, all of which were peripheral nervous system signal monitors. Another important research opportunity exists in investigating the significant differences (if any exist) when one or more signals are combined in studies assessing user trust.
An investigation that combines one central nervous system signal monitor with two or more peripheral nervous system signal monitors is especially necessary. This method could provide a broader, more comprehensive understanding of user trust based on the brain-body relationship.
In addition, there are no stable or accurate ensemble trust classifier models, irrespective of the context (i.e., interpersonal trust or human-computer trust), or methods for model development (supervised and unsupervised). This leaves the question of to what extent trust can be assessed with psychophysiological signals unanswered. This factor is important because a stable, accurate, unbiased, and highly generalizable ensemble trust classifier model is required to achieve real-time trust assessment. However, to address this question, substantial and diverse research efforts will be required, such as investigations into psychophysiological signal pre-processing methods (e.g., electrode references, placement locations, and signal filtering frequencies), ensemble methods (e.g., voting, stacking, bagging, and boosting), feature selection methods (e.g., filter, wrapper, embedded, and hybrid methods), psychophysiological signal epoch lengths, and predictive machine learning methods (supervised vs. unsupervised and classification vs. regression). Without addressing all these issues, it is highly unlikely that we will be able to develop a tool for assessing trust in real-time.
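Epoch length is one of the pre-processing choices listed above, and a short sketch shows what varying it means in practice. The code below splits a sampled signal into fixed-length, non-overlapping epochs; dropping the incomplete trailing epoch is an assumption of this sketch, as is every name and value in it, and none of it is taken from a reviewed study.

```python
def split_into_epochs(samples, sampling_rate_hz, epoch_seconds):
    """Split a sampled signal into fixed-length, non-overlapping epochs.

    Samples that do not fill a complete final epoch are discarded
    (an assumption of this sketch; padding is another option).
    """
    epoch_len = int(round(sampling_rate_hz * epoch_seconds))
    if epoch_len <= 0:
        raise ValueError("epoch length must cover at least one sample")
    n_epochs = len(samples) // epoch_len
    return [samples[i * epoch_len:(i + 1) * epoch_len] for i in range(n_epochs)]


# Hypothetical signal: 10 samples recorded at 2 Hz, cut into 2-second epochs.
signal = list(range(10))
epochs = split_into_epochs(signal, sampling_rate_hz=2, epoch_seconds=2)
print(epochs)
```

Each epoch would then be passed to feature extraction and classification, so the chosen epoch length determines both how many training examples are available and how much temporal context each example carries.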

Conclusions
In conclusion, this systematic mapping review provided a bird's-eye view of the approaches used in studies assessing trust with psychophysiological signals, in particular the number of psychophysiological signals combined, the most frequently used psychophysiological signals, the inference validation methods used, the types of analysis techniques employed, and the outcomes of prior studies. The goal was to provide a quick fact book to facilitate entry for researchers seeking to use psychophysiological signals to assess trust, in addition to identifying some potential gaps. Some of the most intriguing findings and potential future research opportunities include the following: (1) The lack of a stable and accurate ensemble trust classifier model to enable real-time trust assessment with psychophysiological signals leaves unanswered the question of the extent to which trust can be measured in real time. (2) Little is known about the potential of trust assessment with psychophysiological signals such as audio, despite their being less obtrusive. (3) The lack of a common understanding of what trust is, and of a distinction between this and how to measure trust, has led to several issues: the lack of methodology transfer between the various disciplines involved in trust research using psychophysiological signals, the low uptake of psychophysiological signals among researchers seeking to assess users' trust objectively despite the fact that these signals were invented several decades ago, and the lack of common scales to adopt when assessing trust with psychophysiological signals.

Conflicts of Interest:
The authors declare no conflict of interest. The funders had no role in the design of the study; in the collection, analyses, or interpretation of data; in the writing of the manuscript, or in the decision to publish the results.

Hu, X., Xu, Z., Li, Y., & Mai, X. (2018). The impact of trust decision-making on outcome processing: Evidence from brain potentials and neural oscillations. Neuropsychologia, 119, 136-144.