Objective Detection of Trust in Automated Urban Air Mobility: A Deep Learning-Based ERP Analysis

Abstract: Urban Air Mobility (UAM) has emerged in response to increasing traffic demands. As UAM involves commercial flights in complex urban areas, well-established automation technologies are critical to ensure a safe, accessible, and reliable flight. However, the current level of acceptance of automation is insufficient. Therefore, this study sought to objectively detect the degree of human trust toward UAM automation. Electroencephalography (EEG) signals, specifically Event-Related Potentials (ERP), were employed to analyze and detect operators’ trust towards automated UAM, providing insights into cognitive processes related to trust. A two-dimensional convolutional neural network integrated with an attention mechanism (2D-ACNN) was also established to enable the end-to-end detection of trust through EEG signals. The results revealed that our proposed 2D-ACNN outperformed other state-of-the-art methods. This work contributes to enhancing the trustworthiness and popularity of UAM automation, which is essential for the widespread adoption and advances in the UAM domain.


Introduction
Some urban areas are facing increasingly serious traffic congestion with the development of society. Conventional ground transportation methods might be insufficient to address this congestion [1]. To provide more possibilities for urban transport, the idea of "Urban Air Mobility (UAM)" was introduced, initially put forth by the National Aeronautics and Space Administration (NASA) [2]. UAM involves flying in complex urban airspace and densely populated areas. Many stakeholders have been attempting to develop UAM. For example, the Federal Aviation Administration (FAA) provided a broader UAM blueprint [3] and instructions for vertiport design [4]. Rahman et al. explored how to integrate vertiports with existing public transport systems, as well as design guidelines for vertiports [5]. Rostami et al. [6] and Kadhiresan [7] investigated ways of designing the configuration of Electric Vertical Take-off and Landing (eVTOL) aircraft.
However, just like operators of other vehicles, human operators of UAM vehicles are prone to fatigue and mental workload in complicated conditions, leading to decreased cognitive abilities and human errors [8]. Additionally, UAM involves commercial flights and has broad applications in the mass market [9]. Hence, well-established automation technologies are critical to ensure safe, accessible, and reliable UAM. However, public trust in automated vehicles remains precarious at present [10,11]. Concerns about the reliability and safety of current automation, a lack of understanding of automation decisions, and past accidents and incidents involving automated vehicles all hinder people's trust in automation [12]. People's trust in automation can affect their acceptance and use of the system, such as how long they use the automated system and the level at which they use it [13]. Ascertaining the operator's trust in UAM automation can effectively prevent misuse or abuse of the automation, thereby improving UAM safety, increasing public acceptance of UAM vehicles, and thus motivating the development of UAM.
Therefore, it is imperative to ascertain the level of human trust in UAM automation. The UAM Coordination and Assessment Team (UCAT) of NASA categorized UAM automation into five levels [14], as shown in Figure 1. The Assistive level represents a lower tier of automation, where operational safety is entirely entrusted to humans. The Comprehensive Safety-assurance level involves automation offering safety-related aid, like collision avoidance, yet humans remain entirely accountable for operational safety. Meanwhile, the Collaborative and Responsible level entails the automation system being proficient in executing specific functions, so humans can be relieved and training on such functions can be reduced. However, human supervision of the entire system remains crucial to ensure safety, which means that human involvement is still required. The Highly-integrated Automated Networks level no longer requires real-time human involvement but can be improved through human intervention. At the System-wide Automation Optimization level, human supervision and intervention in the system are no longer required [15].
Aerospace 2024, 11, x FOR PEER REVIEW
Currently, automation technology in the aviation domain has reached a certain maturity level. The aircraft can fly automatically in most flight phases, leading to a reduced workload for pilots and improved flight efficiency and accuracy [16]. However, current automation technology still has certain limitations. For example, there are potential safety risks [17]. Additionally, the ethical criteria are unclear, such as how automation should set priorities when continuing to drive may threaten other lives but swerving may threaten the lives of the onboard passengers [18]. Hence, it is generally believed that the Collaborative and Responsible level could be gradually realized in the next few decades [19]. This study therefore mainly focuses on detecting trust in automation at the Collaborative and Responsible level.
It is widely believed that the safety of automation systems plays a crucial role in their acceptance, as automated systems involve human safety [20]. The safety assessment is based on comprehensive safety data throughout the entire flight process, including flight data, training data, procedures, forecasting data, etc. Another important factor that influences human trust towards automation is the user's psychological perception of automated systems [21]. It emphasizes the user's immediate perceptions when utilizing automation. This paper primarily investigates the latter aspect, delving into objective measurements beyond the traditional subjective evaluations of user trust in automation systems.
Traditionally, the measurement of trust levels is commonly conducted through subjective surveys [22], including questions about the acceptance level [23], usability [24], and expectations of future use [25]. There are also studies employing objective real-time measurements of human trust. For example, Highland et al. [12] used objective indicators, including eye-tracking data, mental workload, and task performance, to measure human trust in autonomy in dog-fighting. Dsouza et al. [26] adopted electroencephalography (EEG) to evaluate passenger trust in drivers. He et al. [27] employed electrocardiogram (ECG) and pupil diameter to investigate drivers' risk perception and trust under Society of Automotive Engineers (SAE) Level 2 automation. Perello-March et al. [28] measured functional near-infrared spectroscopy (fNIRS), establishing the connection between human neural activities and trust within a highly automated driving context. However, most such studies focus on ground vehicle drivers or fighter pilots, and research on the trust of commercial UAM operators remains largely unexplored. Compared to ground vehicles and traditional aircraft, the detection of trust for UAM vehicles is much more challenging. On the one hand, UAM vehicles usually run in more complex scenarios involving three-dimensional flying in busy urban areas, so the cognitive process for trusting the automation is more sophisticated and the key features are more difficult to capture [29]. On the other hand, since operators of UAM vehicles tend not to be as well trained as traditional civil or military aviation pilots, their misuse or abuse of automation may be greater [30].
Trust is a cognitive process with neural correlates related to emotion, cognition, and decision-making [31]. In this regard, trust can be linked to physiological indicators, including eye-tracking data, ECG, EEG, fNIRS, etc. [12,26-28]. Among a broad spectrum of physiological signals, EEG is considered to be the most responsive indicator of neural activities [32] and has been used in many studies [33-35]. It is believed that trust and distrust are different cognitive processes because different brain patterns are activated [36]. Given that trust is related to attention, perception, and cognition, it is a complex concept involving multidimensional information. By leveraging the advantages of neural networks for non-linear and complex hidden feature extraction, effective detection of trust might be enabled [37]. Deep learning (DL) has found extensive application across diverse domains owing to its good performance and has shown satisfying results in applications such as mental workload assessment [38], motor imagery [39], and emotion recognition [40].
This research aims to apply such methods to detect human trust in automated UAM within the realm of human-computer interaction by analyzing Event-Related Potentials (ERP), an important component of EEG. In other words, the aim of this study is to establish a trust detection framework in automated UAM by developing an attention-based convolutional neural network through ERP. Subsequently, the efficacy of the proposed framework will be validated by conducting a performance comparison with three state-of-the-art deep learning models. In summary, the research questions are:
➢ Does the ERP reflect human trust in automated systems in flight scenarios?
➢ Is deep learning superior to traditional statistical analysis or machine learning for processing and analyzing EEG signals? If so, for what reasons?
➢ Would the integration of the attention mechanism and 2D convolution optimize the deep learning model? If so, for what reasons?

Methodology
Trust refers to a complex psychosocial concept that involves the reliance and dependence of one entity on another [41]. In this study, trust is defined as the acceptance of and attitude towards automation, rather than dependence on the system. ERP has demonstrated its ability to reflect social trust [26,42]. ERPs typically occur within 600 ms of stimulus onset and reflect the brain's response [43]. This study specifically focuses on the P300 component, a positive potential that peaks at approximately 300-600 ms after a stimulus and is regarded as closely associated with emotional and cognitive states [44]. By analyzing its latency, amplitude, and spectrum, insights into the neural processes related to human trust in automated UAM can be gained. A two-dimensional convolutional neural network integrated with the attention mechanism (2D-ACNN) was also developed to further detect trust. The overall experiment setup is shown in Figure 2 [45].

Data Description
The data for this study are part of the Simplified Vehicle Operations (SVO) project [47], which has been adapted for further research beyond its original goals. The experiments were conducted using a simulator with three screens, a pushrod to control the throttle, a joystick to control direction, an avionics instrument display, a cockpit seat, and a flight model of the aeroG Aviation aG-4 "Liberty" Electric Vertical Take-off and Landing (eVTOL) aircraft, as shown in the upper left part of Figure 2. Physiological data were collected by the BIOPAC system [48], and EEG signals were acquired from 21 scalp locations with a dry electrode cap following the international 10-20 system. The electrode distribution is shown in Figure 3.
This project involved 40 participants to ensure a sufficient amount of data. They were asked to operate the simulated aircraft to take off from the suburbs, fly through the outskirts to the urban airspace, and eventually land on the top floor of a skyscraper. The flight duration was approximately 40 min, as depicted in Figure 4.
Flying the eVTOL in a city scenario is a challenging task because the pilot needs to control the aircraft in a complex urban environment, dealing with various factors such as other aircraft operating nearby (including other eVTOLs or helicopters), authorized and unauthorized flying objects (including drones), and the urban architectural landscape. Vertical take-off and landing also pose a great challenge to the pilot, as they require highly precise control of the aircraft's attitude and position. To reduce the pilot's workload, many eVTOLs are equipped with advanced automation systems such as automatic obstacle avoidance and automatic cruise [49]. However, crewless flight can currently only be achieved in a controlled environment.
Before the experiment, participants were trained to operate the eVTOL and practiced the secondary task, an oddball task [50], used to evaluate workload. In this project, half of the participants had to fly manually. The rest were engaged in flight with a highly automated system, wherein they only needed to monitor the flight situation and did not need to fly manually. As this paper aims to analyze trust in automated UAM, we mainly focus on participants who flew with the automated system. Although the automated system is introduced to reduce the pilot's workload, automation failures were induced artificially. In other words, the automation system could randomly fail and trigger an alarm, requiring participants to press a button and take control of the aircraft. Additionally, to investigate the effects of automation failure on operators' behavior, 10 participants experienced a "catastrophic accident" during the automated flight, where the automation failed and caused the aircraft to crash 10 min into the flight. After the crash, the aircraft was respawned at the same location and participants were asked to continue the experiment. During the flight, participants were also required to perform a non-driving-related oddball task to better evoke ERPs [50]. Within the oddball paradigm, participants received a sequence of auditory stimuli at 5 s intervals, of which 85% were standard stimuli and 15% were oddball stimuli. Participants were asked to step on a sensor under their feet when an oddball stimulus occurred. The ERP waveforms elicited by the oddball paradigm convey crucial information for understanding cognitive mechanisms regarding trust [51]. Since participants were asked to allocate their cognitive resources primarily to controlling the simulator, it was assumed that the oddball paradigm would not impose an additional flight workload on them.
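As a concrete illustration, the oddball schedule described above can be sketched as follows. The stimulus counts are derived from the stated 5 s interval, the 85/15 split, and the roughly 40 min flight, so they are assumptions for illustration rather than values reported for the SVO experiment:

```python
import numpy as np

# Assumed schedule parameters (derived from the description, not reported data).
FLIGHT_SECONDS = 40 * 60
ISI_SECONDS = 5
ODDBALL_RATIO = 0.15

n_stimuli = FLIGHT_SECONDS // ISI_SECONDS          # 480 stimuli in total
n_oddball = int(round(n_stimuli * ODDBALL_RATIO))  # 72 oddball stimuli

# Build the stimulus sequence (1 = oddball, 0 = standard) and shuffle it
# reproducibly so oddballs occur at unpredictable positions.
rng = np.random.default_rng(seed=0)
sequence = np.array([1] * n_oddball + [0] * (n_stimuli - n_oddball))
rng.shuffle(sequence)

# Stimulus onset times in seconds from the start of the flight.
onsets = np.arange(n_stimuli) * ISI_SECONDS
print(n_stimuli, n_oddball)
```

The onset times of the oddball entries of `sequence` are what the ERP epochs would later be time-locked to.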
During the experiment, there were regular pauses every 10 min, during which participants were interviewed about their subjective workload (NASA Task Load Index), ease of operation, and feelings about the automation [52].

EEG Signal Processing
For this study, EEG data from the 20 participants who utilized the automation system for flying were processed and analyzed to demonstrate the feasibility of applying ERP to detect trust in UAM automation. These participants were divided into two groups because different levels of trust can be induced through the automation reliability perceived by naive participants [28]. Participants who experienced a "catastrophic accident" were regarded as the Low Trust (LT) group, while the rest were assigned to the High Trust (HT) group. For the LT group, we only analyzed data after the "catastrophic accident". A t-test on the subjective questionnaires was conducted to confirm that this categorization was acceptable [53].
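The group comparison can be sketched with an independent-samples t-test; the trust ratings below are synthetic placeholders on an assumed 1-7 scale, not the actual SVO questionnaire responses:

```python
from scipy import stats

# Hypothetical post-flight trust ratings (1-7 Likert scale); synthetic
# placeholder values, not the actual SVO questionnaire data.
high_trust_scores = [6, 7, 6, 5, 7, 6, 6, 5, 7, 6]
low_trust_scores = [3, 2, 4, 3, 2, 3, 4, 3, 2, 3]

# Independent-samples t-test: a small p-value supports treating the two
# groups as distinct High Trust / Low Trust conditions.
t_stat, p_value = stats.ttest_ind(high_trust_scores, low_trust_scores)
print(f"t = {t_stat:.2f}, p = {p_value:.4f}")
```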
The EEG data were processed with the MNE-Python package [54]. Initially, the data were filtered with a second-order Butterworth filter in the range of 0.1-40 Hz. Following this, noise separation was performed through Independent Component Analysis (ICA) [55]. The EEG signals before and after processing are shown in Figure 5. As the P300 waveform is most prominent in the frontal and parietal lobes [51], we selected data from these regions, namely the CPz, Pz, Fz, FCz, Cz, P3, F3, P4, and F4 electrodes, for further statistical analysis. The EEG data were then segmented into 800 ms epochs time-locked to the oddball stimulus, spanning from 200 ms before the stimulus to 600 ms after it. For each epoch, the data from the first 200 ms were used as the baseline to normalize the subsequent 600 ms.
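A minimal sketch of the filtering, epoching, and baseline-correction steps, using SciPy and NumPy in place of MNE-Python and synthetic data in place of the recordings; the 500 Hz sampling rate is an assumption, and the ICA step is omitted for brevity:

```python
import numpy as np
from scipy.signal import butter, sosfiltfilt

FS = 500  # assumed sampling rate in Hz (not specified in the text)

# Second-order Butterworth band-pass filter, 0.1-40 Hz, applied zero-phase.
sos = butter(2, [0.1, 40.0], btype="bandpass", fs=FS, output="sos")

rng = np.random.default_rng(seed=1)
raw = rng.standard_normal((9, FS * 60))   # 9 channels x 60 s of synthetic EEG
filtered = sosfiltfilt(sos, raw, axis=-1)

def epoch(data, onset_samples, fs=FS):
    """Cut -200 ms..+600 ms epochs around each stimulus and subtract the
    mean of the 200 ms pre-stimulus interval as baseline."""
    pre, post = int(0.2 * fs), int(0.6 * fs)
    epochs = []
    for s in onset_samples:
        seg = data[:, s - pre : s + post]
        baseline = seg[:, :pre].mean(axis=1, keepdims=True)
        epochs.append(seg - baseline)
    return np.stack(epochs)  # (n_epochs, n_channels, n_samples)

onsets = [FS * 5, FS * 10, FS * 15]       # illustrative stimulus onsets
epochs = epoch(filtered, onsets)
print(epochs.shape)
```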
The processed EEG data were exported into Excel files, including the EEG information for each selected channel, the trust condition (HT or LT), and the participant IDs. The MNE-Python package was adopted to compute the average ERP waveform for each participant to enhance the stability and reliability of the waveform [56]. Overlaying the ERPs reduces the effect of random noise and better reflects the neural responses, so the average ERP waveforms of all participants were then overlaid for each trust condition. Finally, the amplitude and latency of the ERP waveforms were measured and compared across trust conditions.
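The averaging and peak-measurement steps can be sketched as follows; the per-participant waveforms are synthetic stand-ins with an assumed P300-like positive peak near 350 ms, and the 500 Hz sampling rate is again an assumption:

```python
import numpy as np

FS = 500  # assumed sampling rate in Hz

# Synthetic per-participant averaged ERPs on a 0-600 ms post-stimulus axis:
# a Gaussian-shaped positive peak near 350 ms plus small noise, standing in
# for the real averaged epochs.
t = np.arange(0, 0.6, 1 / FS)
rng = np.random.default_rng(seed=2)
participants = np.stack([
    np.exp(-((t - 0.35) ** 2) / (2 * 0.05 ** 2)) * 5e-6
    + rng.standard_normal(t.size) * 2e-7
    for _ in range(10)
])

# Grand-average ERP across participants, then P300 amplitude and latency
# measured in the 300-600 ms window described in the text.
grand_avg = participants.mean(axis=0)
window = (t >= 0.3) & (t <= 0.6)
peak_idx = np.argmax(grand_avg[window])
p300_amplitude = grand_avg[window][peak_idx]
p300_latency = t[window][peak_idx]
print(f"P300: {p300_amplitude * 1e6:.2f} uV at {p300_latency * 1000:.0f} ms")
```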

Trust Detection Framework
Although there are some studies analyzing human trust using EEG, few have devoted effort to developing DL frameworks for trust detection [37]. The key to utilizing DL for trust detection is to efficiently extract crucial EEG features and perform classification. Shafiei et al. used a linear SVM in combination with kernel-target alignment (KTA) and kernel class separability (KCS) criteria to detect a surgeon's trust in a robot-assisted interface during robot-assisted surgery (RAS) [57]. The method is simple and effective but performs well only in binary classification. Choo and Nam utilized a convolutional neural network (CNN) to estimate operators' trust in automation while performing the Air Force multi-attribute task battery (AF-MATB) by extracting and analyzing image features of EEG signals [37]. This network showed great results. However, the experimental data for this network were collected under specific tasks where the EEG waveforms were more pronounced. It may not be as effective in a perturbed real flight scenario, and modification is required.
For this research endeavor, our plan involves the utilization of a convolutional neural network (CNN) to accomplish this task. A CNN is a deep neural network with a convolutional structure and multiple building blocks, which serves the dual purpose of feature extraction and classification [58,59]. As a CNN requires minimal preprocessing of the input data, it is well suited for EEG signal analysis and has been utilized in tasks such as brain disease detection [60], emotion classification [61], and BCI identification [62]. We believe such strengths could also apply to trust detection.
Typically, deep learning studies for EEG signal recognition mostly utilize Conv1D or Long Short-Term Memory (LSTM) networks [63]. However, these structures can only extract features along a single direction, potentially ignoring useful spatial information. Conv2D, which possesses the capability to extract interconnections among multiple EEG channels, not solely concurrently but also across neighboring time segments, was therefore utilized for trust detection [64]. The EEG signals, consisting of electrical activities at different electrode positions, are transformed into a time series of 2D data, represented as Equation (1):

X = [x_i(τ)], i = 1, …, C; τ = t, …, t+n, (1)

where i signifies the i-th channel (of C channels), t corresponds to the starting time, and t+n is the ending time, so that each row of X holds one channel's samples. Subsequently, 2D convolution and pooling techniques [65] were adopted to automatically derive temporal and spatial features from the EEG data. It is noteworthy that EEG signals encompass an extensive array of multi-dimensional information, and the aim is to extract only the information relevant to trust [66]. Hence, we integrated an attention module, enabling the network to adaptively adjust its attention to different features [67]. This attention mechanism helps emphasize crucial features while ignoring noise or unimportant information.
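A small demonstration of why the 2D representation matters: a single 2D kernel convolved over the channels × time matrix of Equation (1) mixes neighboring channels and neighboring time steps at once, which a purely temporal 1D convolution cannot do. Array sizes and kernel values here are illustrative, not the model's actual parameters:

```python
import numpy as np
from scipy.signal import convolve2d

# Channels x time matrix X of Equation (1): row i holds channel i's samples
# from t to t+n (here 9 selected electrodes x 400 samples, values synthetic).
rng = np.random.default_rng(seed=3)
X = rng.standard_normal((9, 400))

# One illustrative 3x3 kernel: the 2D convolution combines information from
# 3 adjacent channels and 3 adjacent time steps at every output position.
kernel = rng.standard_normal((3, 3))
feature_map = convolve2d(X, kernel, mode="valid")
print(feature_map.shape)
```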
In general, the proposed trust detection framework consists of three parts: CNN feature extraction modules, attention module, and classification module, as illustrated in Figure 6.
To be more specific, the feature extraction module comprises three blocks, each encompassing a sequence of components: a 2D convolutional layer, a 2D max-pooling layer, and a batch normalization layer. This arrangement is illustrated in Figure 7. The convolution kernel glides through the time steps, capturing time-domain features, and performs convolution operations at different spatial locations to extract spatial features. The max-pooling layers incorporate 2 × 2 max-pooling filters with a stride of 2. Batch normalization is utilized to prevent overfitting. More intricate details of the convolution and pooling methodologies can be found in [68]. After processing by the feature extraction module, several features are extracted.
After the feature extraction module, an attention module is introduced. By calculating the correlation between the Query and Key, the attention mechanism determines the importance of each feature and concentrates more attention on crucial ones. The Value is then weighted and summed according to the computed attention weights. In this way, the model automatically fuses information from different time steps and channels based on the correlation between Q and K, obtaining more informative feature representations. The
The expression of the attention weights is represented by Equation (2):

$$\mathrm{Attention}(Q, K, V) = \mathrm{softmax}\!\left(\frac{f_Q(Y')\,f_K(Y')^{\top}}{\sqrt{d_k}}\right) f_V(Y') \qquad (2)$$

where Y' is the feature vector; f_Q(Y'), f_K(Y'), and f_V(Y') represent three different linear transformations; and d_k is the length of the feature vector. Finally, fully connected layers with 512 and 64 hidden neurons are used for the final classification. The output layer is a Dense layer with Softmax activation.
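The architecture described above can be sketched in PyTorch. The three conv-pool-batch-norm blocks, the scaled dot-product attention of Equation (2), and the 512/64-neuron fully connected layers follow the text; the kernel sizes, channel counts, ReLU activations, and input dimensions (32 electrodes × 128 time samples) are illustrative assumptions, not the authors' exact configuration:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ConvBlock(nn.Module):
    """One feature-extraction block: 2D conv -> 2x2 max-pooling (stride 2) -> batch norm."""
    def __init__(self, in_ch, out_ch):
        super().__init__()
        self.conv = nn.Conv2d(in_ch, out_ch, kernel_size=3, padding=1)
        self.pool = nn.MaxPool2d(kernel_size=2, stride=2)
        self.bn = nn.BatchNorm2d(out_ch)

    def forward(self, x):
        return self.bn(self.pool(F.relu(self.conv(x))))

class TwoDACNN(nn.Module):
    """Sketch of the 2D-ACNN: three conv blocks, scaled dot-product attention, FC classifier."""
    def __init__(self, d_model=64, n_classes=2):
        super().__init__()
        self.features = nn.Sequential(
            ConvBlock(1, 16), ConvBlock(16, 32), ConvBlock(32, d_model))
        # Q, K, V: three different linear transformations of the features, as in Equation (2)
        self.f_q = nn.Linear(d_model, d_model)
        self.f_k = nn.Linear(d_model, d_model)
        self.f_v = nn.Linear(d_model, d_model)
        self.classifier = nn.Sequential(
            nn.Linear(d_model, 512), nn.ReLU(),
            nn.Linear(512, 64), nn.ReLU(),
            nn.Linear(64, n_classes))  # Softmax is applied by the cross-entropy loss at train time

    def forward(self, x):                     # x: (batch, 1, electrodes, time)
        y = self.features(x)                  # (batch, d_model, H, W)
        b, c, h, w = y.shape
        y = y.flatten(2).transpose(1, 2)      # (batch, H*W, d_model): one token per location
        q, k, v = self.f_q(y), self.f_k(y), self.f_v(y)
        attn = torch.softmax(q @ k.transpose(1, 2) / (c ** 0.5), dim=-1)  # Equation (2)
        y = (attn @ v).mean(dim=1)            # weighted sum of V, pooled over tokens
        return self.classifier(y)

model = TwoDACNN()
out = model(torch.randn(4, 1, 32, 128))       # 4 epochs of 32-electrode, 128-sample ERP windows
print(out.shape)                              # (4, n_classes)
```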
During model training, the batch size was set to 128, the initial learning rate was set to 0.001, and the Adam optimizer was utilized [69]. Early stopping was adopted: training stops when the model's accuracy no longer improves over 50 epochs, and the optimal model is saved [70]. The model was implemented using PyTorch.
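A minimal training-loop sketch with these settings (Adam, initial learning rate 0.001, batch size 128, early stopping with 50-epoch patience). The toy data and stand-in model are placeholders for the real ERP inputs and the 2D-ACNN:

```python
import torch
import torch.nn as nn
from torch.utils.data import DataLoader, TensorDataset

# Toy stand-ins for the real ERP data and model (assumptions for illustration only)
X = torch.randn(512, 16)
y = (X[:, 0] > 0).long()
model = nn.Sequential(nn.Linear(16, 64), nn.ReLU(), nn.Linear(64, 2))

loader = DataLoader(TensorDataset(X, y), batch_size=128, shuffle=True)
optimizer = torch.optim.Adam(model.parameters(), lr=0.001)   # initial learning rate 0.001
criterion = nn.CrossEntropyLoss()

best_acc, best_state, patience, stall = 0.0, None, 50, 0
for epoch in range(1000):
    correct = 0
    for xb, yb in loader:
        optimizer.zero_grad()
        logits = model(xb)
        loss = criterion(logits, yb)
        loss.backward()
        optimizer.step()
        correct += (logits.argmax(1) == yb).sum().item()
    acc = correct / len(X)
    if acc > best_acc:                    # accuracy improved: save the optimal model
        best_acc, best_state, stall = acc, model.state_dict(), 0
    else:
        stall += 1
        if stall >= patience:             # no improvement over 50 epochs: stop early
            break

model.load_state_dict(best_state)
print(f"best training accuracy: {best_acc:.3f}")
```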

Evaluation Method
To gauge the efficacy of the proposed DL model, we employ Accuracy, Recall, Precision, and F1 score as evaluation metrics, following the approaches outlined in [71,72]. These metrics are defined as in Equation (3):

$$\mathrm{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN}, \qquad \mathrm{Precision} = \frac{TP}{TP + FP},$$
$$\mathrm{Recall} = \frac{TP}{TP + FN}, \qquad F_1 = \frac{2 \times \mathrm{Precision} \times \mathrm{Recall}}{\mathrm{Precision} + \mathrm{Recall}} \qquad (3)$$

where TP stands for True Positive, TN signifies True Negative, FP denotes False Positive, and FN represents False Negative. Furthermore, to gauge the efficacy of the proposed trust detection framework, a comparative analysis was carried out, incorporating several state-of-the-art models with regard to the above metrics. The compared models are as follows:
• IIFTWSVM [73]: a modified version of the Intuitionistic Fuzzy Twin Support Vector Machines (IFTWSVM) method. This approach integrates both membership and non-membership weights to minimize the impact of outliers, and introduces diverse Lagrangian functions to circumvent matrix inverse computations.
• Cluster-based KNN [74]: this method calculates the distance of a test sample from the clustered means. Test samples are categorized according to the majority vote of their neighboring points, determined by the first K rows in ascending order of distance.
• Bi-LSTM [75]: bi-directional LSTM trains two LSTM layers on the input sequence instead of one. The first recurrent layer in the network is duplicated, creating two layers placed side by side.
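The metrics of Equation (3) can be computed directly from the confusion-matrix counts; a small self-contained sketch with hypothetical counts:

```python
def classification_metrics(tp, tn, fp, fn):
    """Accuracy, Precision, Recall, and F1 score from confusion-matrix counts (Equation (3))."""
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return accuracy, precision, recall, f1

# Illustrative counts, not results from the paper
acc, prec, rec, f1 = classification_metrics(tp=80, tn=90, fp=10, fn=20)
print(f"Accuracy={acc:.3f} Precision={prec:.3f} Recall={rec:.3f} F1={f1:.3f}")
# → Accuracy=0.850 Precision=0.889 Recall=0.800 F1=0.842
```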

Results
In this section, we investigated participants' subjective perceptions of trust towards the automated system and their performance in the oddball task. As the dataset was originally collected for workload analysis, we further analyzed it to determine whether the data were capable of reflecting trust. The ERPs from the HT (high-trust) and LT (low-trust) groups were also compared. Furthermore, the performance of the proposed trust detection model was analyzed and compared against other DL models.

Subjective Trust Rating
Prior to the experiment and every 10 min during the experiment, participants were requested to rate their trust toward the automation and workload.It was anticipated that variations in flight scenarios and disparate levels of automation trustworthiness would yield distinct ratings among participants.
The outcomes are presented in Figure 8.It is evident that the subjective trust ratings of both groups differed only by 0.15 before the experiment.However, after the "catastrophic accident", the trust rating in the LT group dropped sharply, followed by a slow recovery, but remained lower than at the beginning.On the other hand, the trust level in the HT group exhibited fluctuations and gradually increased after the start of the experiment.

Oddball Task Performance
As trust level would affect participants' cognitive resources, it was assumed that oddball task performance might vary accordingly. The reaction time and error rate of the oddball task for the two groups are shown in Figure 9. An independent samples t-test indicated a significant difference between the two trust groups: participants exhibited generally longer reaction times and higher error rates when their trust towards automation was lower.

ERP Results
The ERPs were analyzed for both trust conditions. Figure 10A exhibits ERP waveforms synchronized with the initiation of oddball stimuli for each channel within the two groups. Topographic maps illustrating the P300 effect can be observed in Figure 10B. Figure 10C showcases superimposed waveforms of the P300 response across various channels within both groups. Figure 10D illustrates the average amplitude and latency of the P300 response. It can be observed that, for the HT group, the P300 response exhibited a more positive amplitude and shorter latency.
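Average P300 amplitude and latency, as reported in Figure 10D, can be obtained from epoched EEG by averaging trials and measuring within a post-stimulus window; a minimal sketch using synthetic data, an assumed 250 Hz sampling rate, and a typical 250-500 ms P300 window:

```python
import numpy as np

fs = 250                                     # assumed sampling rate (Hz)
t = np.arange(-0.2, 0.8, 1 / fs)             # epoch time axis: -200 ms to 800 ms
rng = np.random.default_rng(0)

# Synthetic stand-in epochs (trials x samples): a positive deflection near 300 ms plus noise
epochs = 5e-6 * np.exp(-((t - 0.3) / 0.05) ** 2) + 1e-6 * rng.normal(size=(40, t.size))

erp = epochs.mean(axis=0)                    # grand-average ERP waveform
win = (t >= 0.25) & (t <= 0.5)               # typical P300 window: 250-500 ms post-stimulus
amplitude = erp[win].mean()                  # mean amplitude in the window
latency = t[win][np.argmax(erp[win])]        # time of the positive peak
print(f"P300 amplitude {amplitude * 1e6:.2f} uV at {latency * 1000:.0f} ms")
```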

Training Results
The data described in Section 2.2 were used as the training set for the proposed 2D-ACNN. Train_Test_Split [76] was used to randomly divide the data, allocating 80% for the training set and reserving 20% for the test set. Several models, including IIFTWSVM, Cluster-based KNN, and Bi-LSTM, were compared using the evaluation method described above. Table 1 and Figure 11 show the five-fold cross-validation results of these models. It can be observed that the 2D-ACNN outperforms the other classifiers across all evaluation metrics. Furthermore, we performed an ablation study to demonstrate the efficacy of the attention block. Table 2 shows the model performance with and without the attention module. Figure 12 demonstrates feature maps with and without the attention module: the left map is the output feature map of the attention layer, and the right map is the output feature map of the third convolutional layer. Although it is difficult to directly discern the usefulness of the attention module from the feature maps alone, the feature maps with and without the attention module differ. Coupled with the fact that the model with the attention module outperforms the one without, as shown in Table 2, we hypothesize that the inclusion of the attention module helped the model to focus on key features.
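The evaluation protocol (a random 80/20 hold-out split followed by five-fold cross-validation) can be sketched as follows. The synthetic data and the nearest-centroid stand-in classifier are illustrative assumptions, standing in for the real ERP features and the compared models:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 8))                # stand-in feature matrix (samples x features)
y = (X[:, 0] > 0).astype(int)                # stand-in binary trust labels

# Shuffle once, then split 80% training / 20% test (as Train_Test_Split does)
idx = rng.permutation(len(X))
split = int(0.8 * len(X))
train_idx, test_idx = idx[:split], idx[split:]

# Five-fold cross-validation over the training indices
folds = np.array_split(train_idx, 5)
scores = []
for k in range(5):
    val = folds[k]
    tr = np.concatenate([folds[j] for j in range(5) if j != k])
    # Nearest-centroid stand-in classifier for illustration
    c0 = X[tr][y[tr] == 0].mean(axis=0)
    c1 = X[tr][y[tr] == 1].mean(axis=0)
    pred = (np.linalg.norm(X[val] - c1, axis=1) < np.linalg.norm(X[val] - c0, axis=1)).astype(int)
    scores.append((pred == y[val]).mean())

print(f"5-fold CV accuracy: {np.mean(scores):.3f}")
```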

ERP Features for Trust
This study analyzes the EEG signals of two groups with different trust levels toward an automated UAM vehicle. The ERP results revealed that when participants have higher trust in the automation system, their P300 responses are more positive (15.06 μV) than those of participants with lower trust (12.51 μV). In addition, as trust decreases, participants' risk perception increases, leading to heightened vigilance and a higher workload. Consequently, participants required more time to respond to stimuli, resulting in longer P300 latency in the LT condition (330 ms) than in the HT condition (284 ms). These findings align with earlier studies showing that lower trust levels can induce increased cognitive load and negative emotions [77].

Trust Detection Based on DL
A DL framework for trust detection in automated UAM using ERP was developed.Both Bi-LSTM and 2D-ACNN showed good performance in detecting trust, suggesting that there is a correlation between trust and the ERP components and that this correlation can be captured by DL models.Considering that trust is a complicated neural process, the relationship between trust and ERP is likely to be complex and non-linear.The DL models, consisting of multiple layers, can learn hierarchical and abstract representations, which is crucial when dealing with the intricacies of trust and its neural correlates.By leveraging the deep architecture, the non-linear relationship between trust and ERP can be effectively modeled, thereby achieving better performance in trust detection.

The 2D-ACNN Structure
The proposed 2D-ACNN exhibits better performance than the other DL methods mentioned, perhaps because it decodes ERP signals more effectively. The utilization of 2D convolution in the feature extraction module facilitates the extraction of temporal and spatial attributes at multiple scales. Furthermore, a scaled dot-product attention module is integrated with the CNN, which can appropriately assign weights; this suggests that paying more attention to specific electrodes and time steps may contribute to enhancing classification performance.

Conclusions
This study utilizes a DL framework to detect operators' trust in automated UAM through objective neurophysiological methods. Operators' trust was greatly reduced after a "catastrophic accident" caused by the automation. As trust decreased, the P300 waveform amplitude also decreased, indicating that ERP patterns can reveal an operator's trust in automated UAM. The proposed 2D-ACNN can effectively extract key features from ERPs using an attention mechanism, while preserving both temporal and spatial features, enabling trust detection. It outperforms other DL classifiers with respect to accuracy, recall, precision, and F1 score: the accuracy of the 2D-ACNN is 94.12%, which is 5.66% higher than that of the widely used Bi-LSTM model. The research findings contribute to the following areas:

• Demonstrating the neurocognitive differences between trust and distrust, and revealing the close association between trust and risk perception during automated UAM;
• Proposing a novel 2D-ACNN model for trust detection through EEG signals, exhibiting superior performance to other models due to its strong feature extraction;
• Integrating the attention mechanism with a 2D-CNN, thus achieving better trust detection by focusing on key time points and EEG channels.
In summary, the detection of trust can help operators avoid over-trust or distrust in automation, thereby optimizing the design of automated vehicles. It is also an important step toward promoting UAM development. In future work, the limitations of this study will be addressed, including increasing the sample size for analysis and training, investigating and modifying more advanced classification networks, and performing data augmentation.

Figure 4 .
Figure 4. Flight path, take-off point, and landing point for the experiment.

Figure 5 .
Figure 5.The EEG signals before and after processing.

Aerospace 2024, 11, x FOR PEER REVIEW

Figure 7 .
Figure 7.The architecture of the feature extraction module [64].

Figure 8 .
Figure 8.The subjective rating of trust and workload at 0 min, 10 min, 20 min, and 30 min during the experiment.

Figure 9 .
Figure 9.The reaction time and error rate of the oddball task for HT and LT.

Figure 12 .
Figure 12.The feature maps with the attention module (left) and without the attention module (right).

Table 2 .
Five-fold cross-validation results with and without attention module.
