Methodological Considerations Concerning Motion Sickness Investigations during Automated Driving

Automated vehicles will allow all occupants to spend their time on various non-driving related tasks like relaxing, working, or reading during the journey. However, a significant percentage of people are susceptible to motion sickness, which limits the comfort of engaging in those tasks during automated driving. Therefore, it is necessary to investigate the phenomenon of motion sickness during automated driving and to develop countermeasures. As most existing studies concerning motion sickness are fundamental research studies, a methodology for driving studies is still missing. This paper discusses methodological aspects of investigating motion sickness in the context of driving, including measurement tools, test environments, sample, and ethical restrictions. Additionally, methodological considerations guided by different underlying research questions and hypotheses are provided. Selected results from our own studies concerning motion sickness during automated driving, which were conducted in a motion-based driving simulation and a real vehicle, are used to support the discussion.


Introduction and Overview
Motion sickness is well known amongst users of any kind of transportation. Sea sickness, airplane sickness, and even space sickness have been investigated over the past 100 years [1]. Today, depending on the considered reference, up to 60% of Americans suffer from car sickness [2,3]. At the same time, original equipment manufacturers (OEMs) are moving towards automated driving, which allows drivers to hand over full control to the vehicle and thereby engage in non-driving related tasks while driving. Moreover, automated vehicles may include new cabin designs that enable different human postures and thus support the execution of non-driving related activities. With that, the possibility of suffering from motion sickness expands from passengers to drivers. To use the "value of time" generated by automated vehicles, users expect to engage in a large variety of tasks during driving, including reading, working, playing video games, watching movies, and many more [4,5]. However, while enabling such tasks in the vehicle is promising in terms of user satisfaction, it is exactly those activities that increase the probability of motion sickness [2,6]. Hence, within the context of automated driving, higher incidence numbers and more severe symptoms of motion sickness can be expected [7,8], which will impair the user experience in the vehicle. Besides negative subjective experiences, it is still unclear how motion sickness influences take-over and driving performance when the automation system reaches its limits and drivers have to take over the driving task. Regarding professionals at sea, a study found up to 60% impaired performance due to sea sickness [7]. Hence, motion sickness could lead not only to a decreased acceptance of automated vehicles but also to a decrease in driving safety. Consequently, there has been an increasing demand for research on motion sickness in the context of automated driving.
In particular, two research questions are of interest: (1) The first goal is to get valid estimations of the prevalence, symptoms, and symptom evolution, to understand the influencing factors and development over time. (2) The second main issue is the development of countermeasures for motion sickness that are applicable in the vehicle.
To answer these research questions, controlled empirical studies are necessary. Yet, there has only been a small amount of research conducted in realistic vehicle settings.
This paper discusses basic methodological considerations for designing and conducting empirical studies on motion sickness in the vehicle context. It is intended to help applied researchers decide on the right methods, measures, samples, and ethical considerations. The paper furthermore includes unpublished data supporting the methodological approaches.
The data presented in the subsequent chapters are based on a study conducted in the high-level driving simulator of the Wuerzburg Institute for Traffic Sciences (WIVW GmbH) and in an Audi A8 series-production vehicle. A total of N = 24 participants took part in the study. The study had a within-subjects design, i.e., every participant took part in four separate sessions, of which two were drives in the real vehicle and two were drives in the driving simulator. In the real driving part, an Audi A8L equipped with SAE Level 2 functionality was used as the test vehicle. A trained experimenter drove the car on an Autobahn track while the participant was sitting in the front passenger seat. The experimenter used the Level 2 functions whenever possible. Passing maneuvers were performed in a manner similar to an autonomous vehicle. The simulator runs the driving simulation software SILAB®. The motion system uses a hexapod with six degrees of freedom and can briefly render linear accelerations of up to 5 m/s² and rotational accelerations of up to 100 °/s². It consists of six electro-pneumatic actuators (stroke ±60 cm; inclination ±10°). The mockup is a BMW 520i with automatic transmission. As the visual system of the WIVW simulator is designed for the driver only, the participants were sitting in the driver's seat during the run in the driving simulator. The driving behavior of the simulated vehicle was configured to be as comparable as possible to that of the real vehicle. Additionally, the road geometry of the real Autobahn track was implemented precisely in the driving simulation. The participant's task was to watch a video during the rides of approx. 40 min. The four runs occurred in a counterbalanced order with a minimum of two days between days of participation. In the study, motion sickness was measured via the misery-scale (MISC) [9] every two minutes during the run. After the run, a symptom questionnaire was used. It included a list of symptoms from the motion sickness questionnaire (MSQ) [10] and from the simulator sickness questionnaire [11], which were rated on a scale with four categories ranging from "none" to "severe". After the last run, the participants had to compare both test settings (real vehicle vs. driving simulator) by answering several questions. Physiological data (participants' temperature, electrodermal activity, electrogastrogram) were recorded with a Varioport Polygraph (Becker Meditec).

Symptoms, Prevalence, and Time Course of Motion Sickness
The main symptom of motion sickness is nausea leading up to vomiting [12]. However, nausea is typically preceded and accompanied by symptoms like burping, (cold) sweat, pallor, fatigue, headache, or dizziness [9,13,14]. The appearance and chronology of the symptoms vary considerably between persons and between the different types of motion sickness (e.g., carsickness, seasickness, simulator sickness) [7]. For example, oculomotor symptoms (such as eye strain or difficulties in focusing) are found more often in situations in which sickness is induced by visual stimuli (e.g., simulators) than in situations in which sickness is primarily elicited by movements (e.g., sea travel) [12].
It is difficult to make statements about the prevalence of motion sickness because its occurrence depends on multiple factors such as the means of transport (car, bus, ship, train, etc.) but also on the duration and intensity of provocation (driving through curves vs. driving on a straight highway). However, it has repeatedly been demonstrated in laboratory studies that motion sickness occurrence and intensity highly depend on the frequency of the accelerations which act upon the passenger. Frequencies of about 0.16 to 0.2 Hz are particularly provocative [15][16][17][18][19][20][21][22][23][24]. Furthermore, there is an effect of age: young children up to two years are immune to motion sickness [25]. Afterwards, mean motion sickness susceptibility rises to a peak at the age of 16 to 20 years [26]. Subsequently, the susceptibility decreases with increasing age. Concerning gender, women are generally more prone to motion sickness than men [25][27][28][29][30][31]. The reasons for the gender difference are not clear. However, hormonal factors or a lower threshold to admit motion sickness symptoms in women are discussed as possible explanations [25][27][28][29][30][31].
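As a rough illustration of how the frequency dependence could be checked against logged vehicle data, the following minimal Python sketch estimates the dominant frequency of an acceleration trace with a plain discrete Fourier transform and tests whether it lies in the 0.16 to 0.2 Hz band quoted above. The function names, the synthetic trace, and the 2 Hz sampling rate are assumptions made for this example only; they do not describe the analysis used in the study.

```python
import cmath
import math

# Band quoted in the text as particularly provocative (Hz).
PROVOCATIVE_BAND_HZ = (0.16, 0.20)


def dominant_frequency(samples, sample_rate_hz):
    """Return the frequency (Hz) of the largest non-DC DFT component."""
    n = len(samples)
    mean = sum(samples) / n
    centered = [s - mean for s in samples]  # remove the constant offset
    best_bin, best_magnitude = 0, 0.0
    for k in range(1, n // 2 + 1):
        coeff = sum(
            x * cmath.exp(-2j * math.pi * k * i / n)
            for i, x in enumerate(centered)
        )
        if abs(coeff) > best_magnitude:
            best_bin, best_magnitude = k, abs(coeff)
    return best_bin * sample_rate_hz / n


def in_provocative_band(freq_hz, band=PROVOCATIVE_BAND_HZ):
    """True if freq_hz lies inside the (inclusive) provocative band."""
    return band[0] <= freq_hz <= band[1]


# Hypothetical data: 100 s of lateral acceleration oscillating at 0.18 Hz,
# sampled at 2 Hz. Real logs would be noisier and need windowing.
sample_rate = 2.0
trace = [math.sin(2 * math.pi * 0.18 * i / sample_rate) for i in range(200)]
freq = dominant_frequency(trace, sample_rate)
print(freq, in_provocative_band(freq))
```

The direct DFT is quadratic in the number of samples, which is acceptable for short traces; for long recordings an FFT library would be the more practical choice.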
Similarly to the prevalence, the time course of motion sickness also depends on various factors like the individual susceptibility as well as the type and intensity of sickness provocation. The first symptoms may be perceived immediately after onset in highly provoking conditions. In less severe conditions, motion sickness may occur after 10 to 20 min in susceptible participants. Depending on the study design, symptoms often intensify linearly with progressing provocation duration and decrease rapidly after offset [32][33][34].

Motion Sickness Theories
There are various models and theories trying to explain the mechanisms leading to motion sickness, e.g., the toxin-hypothesis [35], postural instability theory [36], negative reinforcement model [37], or the rule of thumb [38]. More popular than these models is the theory of sensory conflict and rearrangement [14] or its revision, the neural mismatch theory [39]. It states that motion sickness occurs if there is a discord between different sensory inputs, e.g., between the visual and the vestibular system. For example, if a passenger is reading a book during a drive, the eyes register a static environment and give feedback that the person is not moving. However, the vestibular organs register the longitudinal and lateral accelerations of the vehicle and give feedback that the person is moving. In this situation, the probability of motion sickness is higher than for a passenger who looks ahead and thus has no contradictory impressions [34]. In addition, motion sickness depends on the type of task [40,41].
Reason and Brand extended the theory by the component of expected sensory impressions [14]. The sensory rearrangement theory states that motion sickness increases when the actual visual and vestibular impressions differ from the expected ones, i.e., when future movements cannot be anticipated. Within this context, effects of habituation may also be relevant (neural mismatch theory) [39]. In general, the better the passenger's view ahead, the lower the risk of motion sickness [34]. For these reasons, motion sickness is more likely to occur when the passenger is sitting in the back seat compared to sitting in the front seat.

The Study Setting
In general, two study methods are applicable for motion sickness studies concerning autonomous driving: field experiments with real vehicles and studies in a driving simulator.
In field experiments, the participant is a passenger in a real vehicle in a realistic road environment or on a test track. Naturally, the experimenter does not have full control over the dynamic events and the experiences of the participants in a field study, which reduces internal validity. However, there are different ways to control for this. First of all, the driving style of the vehicle should be standardized. In the future, this can be realized by using automated functions that perform driving maneuvers in the same way with high reliability. Until such automated vehicles are commonly available for this kind of study, human drivers need to drive the test vehicles. High levels of realism for future automated vehicles can be achieved by Wizard-of-Oz settings, in which the automation is simulated by a human driver, e.g., [42][43][44]. It is necessary that human drivers are instructed or trained towards a specified and thus reproducible driving style [45]. In the presented setting (cf. Chapter 1), we used assistance systems like adaptive cruise control (ACC) and lane keeping for standardization. The trained drivers learned how to perform lane change maneuvers with the necessary sequence of actions (for example, setting the indicator before moving the steering wheel, changing lanes in six to seven seconds). In addition, the number of experimental drivers should be kept low in order to avoid inter-individual differences in driving style between experimenters. Finally, after completing data collection, it is recommended to analyze the dynamic driving data to identify any anomalies in the driving behavior actually realized. If possible, the data can be systematically compared to the vehicle dynamics measured (1) within the same study to check for internal validity and (2) in other settings or situations in order to check for external validity.
However, even if the vehicle dynamics are kept as standardized as possible, external factors like traffic or weather conditions cannot be kept constant or manipulated deliberately. These aspects can nevertheless affect motion sickness: A high traffic density can lead to an increased number of braking and overtaking maneuvers due to slower vehicles. This driving behavior can lead to stronger symptoms of motion sickness. In contrast, a low traffic density enables homogeneous driving with fewer accelerations and decelerations, which reduces the probability of motion sickness.
In contrast to field studies, driving simulators enable conducting studies in a highly controlled environment. They have been used since the 1960s to investigate driving performance and behavior and are classified into three categories [46]:

• High-level simulators incorporate a motion system and full vehicle cabs;
• Mid-level simulators are static simulators with a full vehicle cab;
• Low-level simulators are built around simple components such as game controllers and computer monitors.
As most researchers attribute motion sickness in vehicles to contradictory impressions between the vestibular system (which perceives motion) and the visual system (which perceives no motion, e.g., while reading a book), the use of a high-level simulator with a motion system is recommended. In mid-level and low-level simulators, in contrast, only visually induced motion sickness can be investigated. In principle, research questions concerning countermeasures or physiological correlates are conceivable in these simulators. However, it remains unclear how the results of studies in simulators without a motion system would transfer to automated driving.
The most common motion platform of high-level simulators is a hexapod which provides motion in six degrees of freedom (x, y, z, roll, pitch, yaw). Compared to travelling in a real vehicle, longitudinal and lateral accelerations are different. The impression of realistic accelerations is generated by techniques like tilting the presented scenery. More elaborate simulators mount the hexapod on an x-y table on which the simulation cabin is moved to produce more realistic accelerations. According to Carsten and Jamson [47], however, even a large motion system is not capable of providing realistic accelerations in special driving situations like negotiating a long curve.
Probably the most important benefit of driving simulation is the ability to create repeatable scenarios which are tailored to a certain research question. Depending on the research question, motion-sickness provoking scenarios with many strong lateral and longitudinal accelerations are possible as well as more homogeneous driving scenarios with only a few accelerations (e.g., highway scenarios). Additionally, the researcher is free in the selection of the driving behavior of the autonomous vehicle: each imaginable driving style is feasible even if this driving behavior is not yet possible in a real autonomous vehicle. Another benefit of driving simulation is the availability of data: the simulator provides all data that would be provided by a real test vehicle (e.g., velocity, acceleration) as well as data of the traffic environment (e.g., surrounding traffic, road geometry). Besides, the participant's behavior (e.g., head movement, glance behavior) and physiological data can be monitored and recorded in a simple way: the laboratory conditions make video recordings easier due to constant light conditions and physiological data recording more precise due to fewer disturbing artifacts of the environment (e.g., temperature, humidity). On the other hand, there are disadvantages of driving simulation. Some participants of simulator studies suffer from simulator sickness, which is a subtype of motion sickness in simulated environments. The phenomenon occurs in all types of simulators; it also appears in fixed-base simulators without a motion system due to visual stimuli only. Similar to motion sickness, it is caused by a mismatch between the visual perception and the vestibular sensation of acceleration and deceleration [14,48]. For a motion sickness study, this means that the results for motion sickness can be confounded with simulator sickness.
For studies regarding prevalence or development of motion sickness it is recommended to exclude participants who have shown symptoms of simulator sickness in previous studies in order to diminish this artifact. However, as simulator sickness and motion sickness are related and show similar symptoms due to similar reasons, it is possible that some countermeasures are effective against both symptoms. Therefore, it has to be discussed if participants with simulator sickness problems are allowed in a study concerning motion sickness countermeasures. However, this issue has to be decided for each countermeasure or research question separately.
An important issue of driving simulation is validity. A distinction is made between absolute and relative simulator validity [49]. Relative validity exists when effects in the simulator and in the real vehicle under the same road conditions are in the same order and direction. In contrast, absolute validity is present when the numerical values are about equal in both systems. Many validation studies have been carried out in various simulators. They compared various parameters of the driver's behavior (e.g., velocity, lateral displacement, braking behavior, gaze direction) between driving in a simulator and driving in a real vehicle. In most cases, the studies showed that relative validity exists while absolute validity was only rarely verified [50]. However, these results do not provide evidence that validity is given for motion sickness studies. In a motion sickness study, the behavior of a driver is not relevant; rather, the occupants' visual and vestibular perceptions are important.
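The distinction between the two validity concepts can be made concrete with a minimal Python sketch. It treats relative validity as "the conditions rank in the same order in both settings" and absolute validity as "the values differ by no more than a tolerance". The function names, the tolerance, and the example ratings are hypothetical and chosen for illustration only; the source defines absolute validity merely as values being "about equal", so any concrete tolerance is an assumption.

```python
def relative_validity(sim_means, real_means):
    """True if the conditions are ordered identically in both settings."""
    conditions = list(sim_means)
    sim_order = sorted(conditions, key=lambda c: sim_means[c])
    real_order = sorted(conditions, key=lambda c: real_means[c])
    return sim_order == real_order


def absolute_validity(sim_means, real_means, tolerance):
    """True if every condition differs by at most `tolerance` (assumed)."""
    return all(
        abs(sim_means[c] - real_means[c]) <= tolerance for c in sim_means
    )


# Invented mean sickness ratings per condition, not data from the study.
sim = {"baseline": 4.0, "countermeasure": 2.5}
real = {"baseline": 3.0, "countermeasure": 1.5}

print(relative_validity(sim, real))        # same ordering in both settings
print(absolute_validity(sim, real, 0.5))   # values differ by more than 0.5
```

In this invented example the countermeasure lowers the rating in both settings, so relative validity holds, while the simulator values are offset by a full scale point, so absolute validity does not.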
To the authors' knowledge, there have not yet been studies comparing an occupant's motion sickness in a driving simulator to his/her motion sickness in a real vehicle. Therefore, we conducted the study described above.
The results showed that the progression of motion sickness was comparable in both conditions. After a general rise at the beginning of the run (approx. first 12 min), the sickness ratings increased more slowly in the second and last third. Self-reported motion sickness was slightly higher in the simulation than in the real vehicle (Figure 1). However, the maximum sickness values during the runs did not differ significantly (Wilcoxon signed-rank test: Z = 1.40, p = 0.162). The sessions of n = 3 participants had to be aborted due to high sickness ratings in the simulator. In the field study, the run of n = 1 participant was terminated before the end of the test course. According to the symptom questionnaire, most symptoms occurred with similar frequency and intensity in both runs (Figure 2 left). However, three symptoms differed significantly in their intensity: in the driving simulator, participants had higher general discomfort, more difficulties concerning focusing, and increased appetite (Figure 2 right). In a final interview after both runs, the participants stated that the motion sickness symptoms were more distinct in the driving simulator than in the real vehicle (t(23) = 5.65, p < 0.001).
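For readers who want to reproduce this kind of paired comparison, the following minimal Python sketch implements a Wilcoxon signed-rank test with the normal approximation (tie-corrected variance, no continuity correction, zero differences dropped). These choices are assumptions of the sketch; the source does not state which variant or software was used, and the example ratings are invented.

```python
import math
from collections import Counter


def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while (end + 1 < len(order)
               and values[order[end + 1]] == values[order[start]]):
            end += 1
        mean_rank = (start + end) / 2 + 1
        for pos in range(start, end + 1):
            ranks[order[pos]] = mean_rank
        start = end + 1
    return ranks


def two_sided_p(z):
    """Two-sided p-value of a standard normal test statistic."""
    return math.erfc(abs(z) / math.sqrt(2))


def wilcoxon_signed_rank(x, y):
    """Return (W+, z, p) for paired samples x and y."""
    diffs = [a - b for a, b in zip(x, y) if a != b]  # drop zero differences
    n = len(diffs)
    if n == 0:
        raise ValueError("all paired differences are zero")
    ranks = average_ranks([abs(d) for d in diffs])
    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    mean = n * (n + 1) / 4
    ties = Counter(abs(d) for d in diffs)
    tie_term = sum(t ** 3 - t for t in ties.values()) / 48
    sd = math.sqrt(n * (n + 1) * (2 * n + 1) / 24 - tie_term)
    z = (w_plus - mean) / sd
    return w_plus, z, two_sided_p(z)


# Invented peak MISC ratings per participant (simulator vs. real vehicle).
sim_peak = [3, 5, 2, 6, 4]
real_peak = [2, 3, 2, 3, 5]
w_plus, z, p = wilcoxon_signed_rank(sim_peak, real_peak)
print(w_plus, round(z, 2), round(p, 3))
```

Under the normal approximation, `two_sided_p(1.40)` evaluates to roughly 0.162, which is consistent with the Z and p values reported above. For small samples an exact test may be preferable to the approximation sketched here.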
These results indicate that relative validity is given for the high-level simulator of the WIVW GmbH concerning motion sickness as the progression during the runs was comparable and the occurrence of frequent symptoms was similar. In contrast, absolute validity cannot be verified, as some of the self-reported symptoms were more distinct in the simulator.
The recommendation for the most appropriate study setting depends on the research questions: A field experiment offers the highest validity and should be used for studies which investigate the prevalence and the development of motion sickness. In this case, the study should be conducted on public roads. The realistic test track could represent a highway, rural road or inner-city track. Previous studies used driving on highways and inner-city roads to identify if and how strongly motion sickness occurs. In these studies, the participants performed different tasks in the vehicle [51]. In contrast, if the research question covers the investigation of countermeasures avoiding or reducing the symptoms of motion sickness, it is crucial to choose a test setting that causes motion sickness in the participants quickly and with a high probability. In the vehicle context, this setting was mainly realized on test tracks, on which highly provoking maneuvers were driven by the experimenters (e.g., driving in the shape of an eight, or constant stop and go). Other researchers placed the participant facing rearwards in a vehicle driving on urban roads [51,52].
Within this setting, a comparison between a baseline trial and a repetition of the same condition with potential countermeasures makes it possible to investigate their effectiveness in avoiding symptoms. In particular, considering the effort put into these kinds of participant studies, an efficient and reliable creation of provoking situations needs to be considered in the study design. Besides, a simulator study using a motion sickness provoking scenario can also be conducted when investigating countermeasures. A requirement for this option is the validity of the driving simulator. The presented study shows that a high-level driving simulator without an x-y table can also offer relative validity. However, as driving simulators differ considerably, this has to be tested for each simulator individually.

The Participant's Task
In general, automated driving will enable the driver to engage in various non-driving related activities. In motion sickness research, one relevant research question concerns the potential of different non-driving related tasks (NDRTs) to cause motion sickness. In respective investigations, subjects could either be free to engage in realistic everyday NDRTs of their choice or be presented with a specific NDRT. While many standardized tasks exist in the context of manual driving, such standardization is widely missing in the context of automated driving. Therefore, it would be desirable to also evaluate secondary tasks that cover certain groups of conceivable NDRTs in the future. For our setting, we chose a naturalistic NDRT. Based on previous research [5], it can be expected that the use case of watching a video in an automated vehicle has some external validity.
Concerning other research questions such as the evaluation of countermeasures, it may also be relevant to induce motion sickness in a targeted manner or to investigate an extreme scenario. In this case, NDRTs that are characterized by highly limited peripheral and external vision of motion are required as hints about the vehicle's future motions can counteract motion sickness [53][54][55]. Therefore, a mainly visual NDRT should be presented in a way that assures gazing away from the road scene. To ensure standardization of the amount of peripheral vision across participants, visual material should be presented at a fixed location, e.g., by means of displays instead of providing handheld devices such as tablets. Naturally, fixed display positions also lead to more standardized participant movements. Since peripheral vision can be manipulated by both display position and size [7], to prevent the participant from using peripheral vision, a visual NDRT could be presented at a downward angle or on a large display. Further, to promote continuous task engagement, it is recommended to choose an NDRT that is difficult to interrupt or provides instructions and incentives for subjects to focus on the task and refrain from road glances (e.g., concentrating on visual tasks like reading or watching a movie during the drive increases the risk of motion sickness). Artificial, standardized NDRTs can therefore be suitable for this. Please note that engaging in a visual NDRT may cause visual problems such as strained eyes or blurred vision, which cannot be differentiated from symptoms of motion sickness. To better control for this, visual task characteristics and the duration of the task engagement may be considered. Similarly, fatigue may occur due to the experimental session's duration or as a motion sickness symptom. Further research should examine both the relationship between motion sickness and fatigue as well as methods to control for confounding effects.
For the selection of an adequate task for empirical studies on motion sickness, classifications of NDRTs provide relevant dimensions such as the primary modality, the locality, the possibility of road glances, the need for sustained attention, and incentives to continue the task [56]. In addition, the presented material should be controlled for emotionality of content when motion sickness is measured using physiological correlates. Therefore, in our study, subjects watched a movie on a display positioned below the central information display. We further instructed subjects to refrain from road glances. The videos were documentaries, which were interesting but not emotionally arousing. Other examples of such tasks may be reading a text or answering a quiz that is presented visually. Finally, participant posture should be considered in motion sickness studies given that the risk of motion sickness is also higher when the passenger is sitting on a rearward-facing seat compared to a forward-facing seat [52]. Moreover, for postures facing in the driving direction, a regular driving posture may increase the risk of motion sickness compared to a reclined posture [57].

Sample and Recruitment
In order to investigate motion sickness in autonomous vehicles a participant study is recommended. The requirements for the recruitment depend on the study's research question.
For a large variety of research questions, it is necessary that a significant part of the sample suffers from motion sickness during the study. For example, the effect of a countermeasure for motion sickness during travelling can only be demonstrated when a control condition exists, in a between- or within-subjects design, in which motion sickness occurs. In contrast, people who are not susceptible to motion sickness do not need countermeasures and are not relevant for the study question. It is only possible to identify physiological correlates of motion sickness when the participants have phases with and without motion sickness. Therefore, the selection of participants is crucial for the study's success as not all people are susceptible to motion sickness. This consideration leads to the next question regarding participants' recruitment: how to identify participants who are susceptible to motion sickness?
A common instrument to predict motion sickness susceptibility is the MSSQ (Motion Sickness Susceptibility Questionnaire) [14,58]. This tool queries how often several means of transport (e.g., cars, buses, airplanes) and amusement rides (e.g., carousels, rollercoasters) were used in the past and how often sickness occurred. The answers result in a motion sickness susceptibility score. However, the results of our own study indicate that the MSSQ total score is not appropriate to identify subjects who are susceptible to motion sickness while travelling in a car. There was no significant correlation (Spearman r(24) = 0.266; p = 0.210) between the MSSQ total score and the suffered motion sickness (measured via a misery scale according to [9]) in a real driving study on the Autobahn in which the N = 24 participants were passengers and had to watch a video during the drive (see Figure 3 left). The MSSQ probably covers too many means of transport: respondents with no motion sickness problems in cars can also achieve high MSSQ scores when having symptoms, for instance, in trains and airplanes. In contrast, respondents who compensate for their motion sickness in real driving situations might reach lower MSSQ scores than would be expected: people who know that they are susceptible to motion sickness might not engage in NDRTs in provoking situations and therefore did not experience any severe motion sickness in the past years.
However, the more specific MSSQ item "Over the last 10 years, how often you felt sick or nauseated in cars?" also showed no significant correlation (Spearman r(24) = 0.212; p = 0.319) with the suffered motion sickness in the study (see Figure 3 right). The question is very imprecise as it does not differentiate between driving in an urban or rural area or on a highway. In addition, it lumps together travelling in a car while reading or texting on the back seat and being a co-driver who is attentive to the traffic situation. As the prevalence depends on the individual threshold to motion stimulation and varies under different situations [59], a curvy rural road can lead to symptoms for some people while other people suffer from motion sickness in urban scenarios only. Therefore, it is recommended to use a highly specific question with the exact test scenario as a screening question for the participants' recruitment (e.g., "Do you get symptoms of motion sickness as a co-driver while reading on the Autobahn?").
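The rank correlation used for this screening check can be sketched in a few lines of Python. The sketch below computes Spearman's coefficient as the Pearson correlation of average ranks, which handles tied scores; it does not compute a p-value. The function names and the example scores are hypothetical and are not data from the study.

```python
import math


def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while (end + 1 < len(order)
               and values[order[end + 1]] == values[order[start]]):
            end += 1
        mean_rank = (start + end) / 2 + 1
        for pos in range(start, end + 1):
            ranks[order[pos]] = mean_rank
        start = end + 1
    return ranks


def spearman_rho(x, y):
    """Spearman rank correlation: Pearson correlation of the rank vectors."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mean_x, mean_y = sum(rx) / n, sum(ry) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    var_x = sum((a - mean_x) ** 2 for a in rx)
    var_y = sum((b - mean_y) ** 2 for b in ry)
    return cov / math.sqrt(var_x * var_y)


# Invented screening scores and peak sickness ratings for six participants.
screening_score = [12.0, 30.5, 8.0, 22.0, 15.5, 40.0]
peak_rating = [2, 1, 0, 6, 3, 2]
rho = spearman_rho(screening_score, peak_rating)
print(round(rho, 3))
```

A coefficient close to zero, as reported above for the MSSQ, would suggest that the screening score carries little information about sickness in the specific test scenario.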
Concerning other research questions, a more general sample is required. A representative sample is necessary to investigate the prevalence of motion sickness. The sample should be representative concerning all aspects which can affect the prevalence of motion sickness, e.g., age [60,61] and gender [27,29].

Subjectively Perceived Motion Sickness
Subjective participant ratings via questionnaires are the most common method to measure motion sickness and to validate other measurement tools like physiological or behavioral measures. Within the subjective measurement approaches, there are two basic principles: either the participants are asked to evaluate their overall motion sickness in a single rating or the participants are questioned in detail about multiple or even all potential motion sickness symptoms and their intensity. Short questionnaires allow for a continuous online assessment of motion sickness during the test drive, which enables describing the time course of motion sickness development. In contrast, detailed questionnaires are suitable for pre-post evaluations to determine if and to what extent a certain condition has led to motion sickness.
One example of a short overall rating is the fast motion sickness scale (FMS) [62]. The FMS is a verbal rating scale ranging from 0 (no motion sickness at all) to 20 (severe sickness). Participants are asked to evaluate their current motion sickness and to focus on nausea, general discomfort, and stomach problems. However, the scale of the FMS is unanchored. Hence, it is not possible to describe verbally what the distinct values on the scale stand for. Further, it is uncertain whether the values on the scale actually represent the same degree of subjectively perceived motion sickness for each participant. It thus remains unclear whether, e.g., a value of 15 is associated with nausea and whether this holds for every participant in the sample. Therefore, unanchored scales do not deliver information about the characteristics of motion sickness. Due to its unspecific character, the rating may further be biased by other factors that restrict comfort, like boredom or fatigue.
Another tool to quickly measure subjective motion sickness is the misery-scale (MISC) [9]. It is an 11-point scale trying to capture the quantitative and qualitative degree of motion sickness within one combined rating. For this purpose, the scale's numeric values are assigned to more or less specific motion sickness symptoms and their intensity. The scale comprises the following gradation: 0 (no problems), 1 (uneasiness without specific symptoms), 2-5 (slightly to severely perceived specific symptoms like dizziness, headache, stomach awareness, etc.), 6-9 (nausea from slight to severe/retching), and 10 (vomiting). Thus, in contrast to scales like the FMS, the MISC values can be interpreted descriptively and it is assumed that every single value is interpreted similarly by all participants. Like the FMS, the MISC is able to assess motion sickness quickly, in short intervals and during motion sickness induction. The MISC suggests that nausea is perceived as more inconvenient than all other motion sickness symptoms. This, however, neglects that other symptoms like severe headache may also be perceived as very unpleasant. Without experiencing nausea, the MISC does not allow the participant to reach high motion sickness scores, even if the driving comfort has largely decreased. Therefore, strictly speaking MISC data cannot be considered as interval scaled. This impedes the analysis and interpretation of the results.
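The gradation of the MISC described above can be written down as a small lookup, which makes its ordinal banding explicit. The following is a minimal sketch; the function name `misc_category` and the wording of the returned labels are illustrative assumptions, and only the numeric bands are taken from the description above.

```python
def misc_category(score: int) -> str:
    """Map a MISC rating (0-10) to the symptom band described in the text.

    The labels are paraphrased for illustration; only the numeric bands
    (0, 1, 2-5, 6-9, 10) come from the scale description.
    """
    if not 0 <= score <= 10:
        raise ValueError("MISC ratings range from 0 to 10")
    if score == 0:
        return "no problems"
    if score == 1:
        return "uneasiness without specific symptoms"
    if score <= 5:
        return "specific symptoms (slight to severe)"
    if score <= 9:
        return "nausea (slight to severe/retching)"
    return "vomiting"
```

The lookup also illustrates the limitation discussed above: a participant with a severe headache but no nausea cannot be placed above 5, so equal numeric steps do not correspond to equal steps in discomfort.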
For these reasons, it may be useful to let the participants evaluate different specific symptoms on separate Likert scales. In addition to nausea, it would be plausible to include headache, general discomfort, dizziness, and, depending on the study design, also fatigue (especially during long or uneventful drives). In our study, these symptoms were either observed frequently after a 40-min Autobahn drive (71% of participants reported general discomfort, 96% fatigue) or are assumed to be perceived as particularly inconvenient (nausea, headache, dizziness). However, it should be ensured that the questioning remains short.
In contrast to these quick and efficient methods, the motion sickness questionnaire (MSQ) [10] represents an approach to capture multiple or even all potential motion sickness symptoms and their intensity. There are different versions of the MSQ with different numbers of items [11]. The questionnaire consists of a checklist with items that are evaluated either concerning their presence (symptom present vs. not present) or concerning their intensity (none, slight, moderate, severe). Thus, the MSQ provides an extensive impression of the participant's current motion sickness. However, completing the questionnaire is relatively time-consuming, so it is not suitable for frequent motion sickness ratings. It is, therefore, recommended to use it at the end of the driving study or during breaks (directly after provocation offset). Hence, the scale is rather suitable for pre-post evaluations and may be combined with a short online questionnaire like the FMS, the MISC, or symptom-specific Likert scales. A comparative overview of the four discussed tools is given in Table 1.
It is important to add that subjective ratings may be prone to several biases, such as demand characteristics or social desirability, as discussed in Chapter 4. Further, the participants' mental model of their own susceptibility may affect the ratings (i.e., a self-fulfilling prophecy). For example, participants who believe they are highly susceptible may give higher motion sickness ratings, not only because they feel motion sick, but also because they expect to do so and thereby confirm their own beliefs. In addition, directly asking participants about their motion sickness symptoms may lead to a very conscious introspection of perceived symptoms. Thus, participants may "discover" symptoms which they would not otherwise have perceived consciously. Further research is needed to determine if and to what extent these potential biases affect subjective motion sickness ratings.
Nonetheless, we consider it important to ask participants directly about their sickness symptoms because motion sickness and discomfort depend highly on the subjective evaluation.

Physiological Correlates
Because subjective ratings may be prone to biases, research has tried to measure motion sickness objectively. Over the last decades, there have been many attempts to describe motion sickness with physiological correlates. Among others, heart rate, blood pressure, respiration rate, gastrointestinal reactions, and skin conductance parameters have been investigated, e.g., [63-67]. However, until now there has been no reliable success in correlating physiological measures with subjectively perceived motion sickness. Reasons are the high variability of motion sickness provoking stimuli as well as the high individual specificity of reactions. For example, the correlations between subjectively reported motion sickness and heart rate or blood pressure appear to be specific to the individual [68].
Three measures for which a correlation with motion sickness has been shown across multiple laboratory studies are body temperature [69], skin conductance [69,70], and the electrogastrogram [71,72]. The following sections discuss to what extent these three measures are applicable to capture motion sickness in a driving study under naturalistic conditions.

Temperature
In previous studies, it was shown that motion sickness affects human thermoregulation [69]. Nobel and colleagues demonstrated that in cold water body temperature decreases faster in participants with induced motion sickness than in control participants [73]. Similarly, in a thermo-neutral environment body temperature was lower in motion sick participants than in control participants [74]. In the latter study, for example, the mean difference was about 0.4 °C between control participants and those who stated that they were "very nauseous/almost vomiting". In the cited studies, body temperature was measured by a rectal thermistor. Not surprisingly, this procedure is perceived as an unreasonable imposition by many participants and may be doubtful for ethical reasons. One of multiple alternatives to make temperature measurement more convenient for the participants is to place the thermistor under the armpit. The participants should not move their arm during the measurement. It should be considered that mean axillary temperature is a few tenths of a degree Celsius lower than rectal body temperature [75]. With this procedure, body and skin temperature cannot be clearly distinguished, although they should not be equated. In some previous studies, differences in body temperature were not necessarily accompanied by significant differences in skin temperature [67,73]. Further, skin temperature can be biased, e.g., by perspiration, environmental temperature, or participants' clothing (warm/light). However, most biases can be controlled easily by the experimenter. Temperature and ventilation in the test vehicle can be held constant by air conditioning, and participants can be instructed to wear comparable types of warm/light clothes. Further, the measured signal can be checked easily by the experimenter since the range of values is relatively constant across participants (approx. between 36 and 38 °C), which makes it easy to detect technical signal disturbances.
Moreover, the signal is relatively stable and hardly susceptible to artifacts (e.g., movements, speaking; see Figure 4). As body temperature seems to react relatively slowly to influences, it is to be expected that it also reacts slowly to motion sickness. Consequently, to detect potential effects, heavy provocation and/or a long measurement period might be necessary.
Figure 6. Exemplary raw electrodermal activity (EDA) data of a participant during a 44-min drive as a passenger on a German Autobahn. The numerous peaks in the chart indicate external events like braking, participant's movements, and motion sickness rating procedures. Since these events are not necessarily related to motion sickness in a naturalistic test setting, these peaks should be considered as artifacts.
In our study, the median temperature was calculated for each two-minute interval and served as the dependent measure for the subsequent analyses. The temperature was recorded under the armpit and correlated with the MISC ratings, which were likewise recorded every two minutes. Because a high inter-individual variability was expected [68], the number of significant positive or negative correlations between temperature and the subjective motion sickness ratings was counted for each participant and each run (two-tailed testing). In 57.6% of all cases, a significant positive (i.e., temperature increases with motion sickness rating) or negative correlation (i.e., temperature decreases with an increasing motion sickness rating) between temperature and subjective motion sickness was observed (Figure 5).
To estimate whether the correlations are stable within each participant, we checked whether they could be replicated across test drives. However, only n = 3 participants showed significant negative correlations between temperature and motion sickness in more than two test drives (i.e., temperature decreased with increasing motion sickness ratings). The results indicate not only a high inter-individual, but also a high intra-individual variability of the correlations. The variability may also derive from confounding factors like driving time or time of day.
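The per-run analysis described above (two-minute medians, rank correlation with the MISC ratings, and counting significant positive or negative correlations) might be sketched as follows. This is a minimal illustration under stated assumptions, not the analysis code of the study: all function names are hypothetical, and the permutation test is an assumed stand-in, since the text does not state how significance was determined.

```python
import random
from statistics import median


def window_medians(samples, window_s=120.0):
    """Median per consecutive time window.

    samples: iterable of (time_s, value) pairs.
    Returns {window_index: median}; windows without samples are absent.
    """
    buckets = {}
    for t, v in samples:
        buckets.setdefault(int(t // window_s), []).append(v)
    return {k: median(vals) for k, vals in sorted(buckets.items())}


def average_ranks(values):
    """1-based ranks; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman correlation, computed as Pearson correlation of the ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant series carries no rank information
    return cov / (var_x * var_y) ** 0.5


def permutation_p(x, y, n_perm=2000, seed=0):
    """Two-tailed permutation p-value for rho (assumed test, see lead-in)."""
    rng = random.Random(seed)
    observed = abs(spearman_rho(x, y))
    shuffled = list(y)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if abs(spearman_rho(x, shuffled)) >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)


def classify_run(temperature_samples, misc_by_window, alpha=0.05):
    """Label one test drive as 'positive', 'negative', or 'none'."""
    medians = window_medians(temperature_samples)
    shared = sorted(set(medians) & set(misc_by_window))
    temps = [medians[k] for k in shared]
    misc = [misc_by_window[k] for k in shared]
    if permutation_p(temps, misc) >= alpha:
        return "none"
    return "positive" if spearman_rho(temps, misc) > 0 else "negative"
```

Calling `classify_run` once per participant and test drive and tallying the labels yields the kind of count reported above. Stability within a participant can then be checked by counting how many of that participant's drives share the same label.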

Electrodermal Activity
Another measure which has frequently been investigated with regard to motion sickness is skin conductance. Derived from the observations of "cold sweating" [76], a positive correlation between motion sickness and electrodermal activity (EDA) seems quite plausible and has been shown in several studies [69,70]. Like temperature measurement, EDA recording is technically simple. The procedure is hardly unpleasant for the participants because the electrodes are fixed on the hands (frequently on the index and middle finger). The electrodes can be attached by the experimenter, which ensures that they are placed correctly and identically across all participants. The experimenter can also monitor the measurement, because the raw signal shows whether the recording is working properly.
However, EDA is very susceptible to external influences and artifacts. This is a major obstacle in recording EDA under natural driving conditions. Unexpected stimuli strongly affect the EDA. These include, for example, motion perceptions resulting from longitudinal and lateral accelerations, which emerge naturally during driving. Additionally, EDA is affected by speaking and movements of the participants (see Figure 6). Therefore, participants should not move or speak during the drive; this should particularly be considered when asking participants about their current motion sickness. Instead of answering questions orally, participants can respond via, e.g., a numeric keypad. Alternatively, intervals of motion sickness provocation and intervals of questioning can be separated, and the latter excluded from the statistical analysis. However, recording subjective and physiological data at separate times impairs correlation analyses. Besides artifacts, effects deriving from the driving time can bias EDA.
EDA measurement and analysis distinguishes two components. The first is the (tonic) skin conductance level (SCL), which describes the slowly changing conductance of the skin and can be analyzed by computing and comparing means or medians per time interval. The tonic level is overlaid by the second component, the (phasic) skin conductance reactions (SCR), which are responses to discrete stimuli (e.g., sound, motion perception) and can be seen as sudden peaks in the raw signal. In a naturalistic setting, these phasic reactions frequently represent artifacts which are not directly associated with motion sickness but rather with surprise or arousal [77] and are therefore not a suitable measure to detect motion sickness in driving. Therefore, the more robust SCL should be analyzed if EDA is recorded.
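As a rough illustration of why a per-interval median is a reasonable summary of the tonic level, the following sketch computes SCL per interval and optionally drops samples from excluded periods (e.g., questioning intervals). The function name, the units, and the sample values are hypothetical and chosen for illustration only.

```python
from statistics import mean, median


def scl_per_interval(samples, interval_s=120.0, excluded=()):
    """Tonic skin conductance level as the median per time interval.

    samples:  iterable of (time_s, conductance) pairs.
    excluded: iterable of (start_s, end_s) periods to drop, end exclusive.
    Returns {interval_index: median}; fully excluded intervals are absent.
    """
    buckets = {}
    for t, value in samples:
        if any(start <= t < end for start, end in excluded):
            continue
        buckets.setdefault(int(t // interval_s), []).append(value)
    return {k: median(vals) for k, vals in sorted(buckets.items())}


# Invented example: one phasic peak (9.0) inside the first interval.
example = [(0, 2.0), (30, 2.1), (60, 9.0), (90, 2.2), (110, 2.1),
           (130, 3.0), (170, 3.4), (200, 3.2), (250, 2.5)]

levels = scl_per_interval(example)
levels_without_second_interval = scl_per_interval(example, excluded=[(120, 240)])
first_interval_mean = mean(v for t, v in example if t < 120)
```

In this invented example the median of the first interval stays at the baseline value, whereas the mean of the same interval is pulled upward by the single peak. That is the reason for preferring the median when phasic reactions are treated as artifacts.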
To assess if the EDA is associated with motion sickness, our study also investigated the effects of motion sickness on skin conductance. EDA was recorded on the participants' index and middle fingers (left hand in simulator, right hand in real vehicle). The median EDA was calculated for each two-minute interval and served as the dependent measure for the correlations with the MISC ratings. A rise of the EDA was observed at the beginning of the test drive. Therefore, the first eight minutes of the 40- to 45-min test drive were excluded. Additionally, intervals with tight curves were excluded from the analysis to minimize artifacts deriving from the traffic scenario. As in the temperature analysis, for each participant and each condition it was counted whether there was a significant positive (i.e., EDA increases with motion sickness rating) or negative correlation (i.e., EDA decreases with increasing motion sickness rating) with subjectively measured motion sickness. In 38.6% of all cases, a significant positive or negative correlation between EDA and motion sickness was observed. Again, we checked whether the correlations could be replicated in order to estimate if they are stable within each participant. However, as shown in Figure 7, no participant showed replicable positive or negative correlations between EDA and motion sickness in more than two test drives. Again, the results indicate not only a high inter-individual but also a high intra-individual variability of the correlations. As described above, we observed that SCL rose at the beginning of the test drive (probably due to excitement) and then fell over time, independently of perceived motion sickness (probably due to habituation to the study setting).
Thus, contrary to the temperature findings, it is highly probable that the observed variability derives from confounding factors like driving time or the appearance of external events (e.g., sudden braking), which emerge naturally during a realistic test drive. These biases may conceal potential effects of motion sickness on SCL. Altogether, there are several confounding effects which affect EDA in a natural driving setting. These should be considered and carefully controlled within the study.

Electrogastrography
Electrogastrography (EGG) is another method which has been investigated to measure motion sickness. The EGG measures pacemaker potentials in the stomach which coordinate the gastric contractions [78]. Thus, the EGG does not capture the actual motility of the stomach but rather the electrical activity that initiates it. Corresponding to typical motion sickness symptoms like nausea or awareness of the stomach, Stern and colleagues found changes in this pacemaker potential, namely a decrease in amplitude and an increase in frequency from 3 to 5-7 cycles per minute in motion sick participants [71,72]. Even if this correlation seems to be rather individual [79], it has nonetheless been shown across different studies, for example [71,72,79-81]. Therefore, the EGG seems to be a promising signal for a physiological measurement of motion sickness.
The EGG is a very weak signal which is easily overlaid by movements (e.g., of the abdominal muscles; see Figure 8) [78,82]. Therefore, it is very important that participants do not move or speak during EGG recording. Stern and colleagues [71,72] used an optokinetic drum to induce motion sickness by vection. With this method it is possible to induce motion sickness without participants moving or being moved. In the context of driving, however, the application of EGG is naturally more challenging. In a naturalistic drive, participants are moved by the vehicle. The resulting acceleration forces may elicit unconscious movements of the participants, e.g., muscle tension to compensate for centrifugal forces in a curve. As with SCL, the requirement that participants should not speak or move makes it difficult to ask them about their current motion sickness. However, motion artifacts affect EGG analysis differently than SCL analysis. SCL is analyzed by computing and comparing means or medians, so motion artifacts reduce the interpretability of the results. In contrast, EGG is analyzed by spectral analysis, which can be rendered entirely unusable by frequent or unnoticed motion artifacts [78,82]. In addition, the EGG raw signal is overlaid by other signals (e.g., from respiration, activity of the intestine, etc.) [78] which are filtered out later. Hence, the experimenter cannot monitor any interpretable raw signal during the test drive.
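To make the spectral approach concrete, the following sketch estimates the dominant EGG frequency in cycles per minute (cpm) by evaluating the signal power on a fixed frequency grid. It is a simplified illustration under stated assumptions: the function name, the frequency grid, and the synthetic sine test signals are hypothetical, and real EGG processing would additionally require filtering and artifact handling, which are omitted here.

```python
import cmath
import math


def dominant_frequency_cpm(signal, fs_hz, f_min=1.0, f_max=10.0, step=0.25):
    """Return the grid frequency (cycles per minute) with the highest power.

    signal: sequence of equally spaced samples.
    fs_hz:  sampling rate in Hz.
    The power is evaluated by a direct Fourier sum at each grid frequency.
    """
    n = len(signal)
    offset = sum(signal) / n
    centered = [s - offset for s in signal]  # remove the constant component
    best_f, best_power = None, -1.0
    n_steps = int(round((f_max - f_min) / step))
    for i in range(n_steps + 1):
        f_cpm = f_min + i * step
        omega = 2 * math.pi * (f_cpm / 60.0) / fs_hz  # radians per sample
        acc = sum(x * cmath.exp(-1j * omega * k) for k, x in enumerate(centered))
        power = abs(acc) ** 2
        if power > best_power:
            best_f, best_power = f_cpm, power
    return best_f


def in_elevated_band(f_cpm):
    """True if the frequency lies in the 5-7 cpm range named in the text."""
    return 5.0 <= f_cpm <= 7.0


# Synthetic 4-min signals sampled at 1 Hz (invented test data).
normal = [math.sin(2 * math.pi * (3 / 60) * t) for t in range(240)]
elevated = [math.sin(2 * math.pi * (6 / 60) * t) for t in range(240)]
```

Because the result is simply the largest spectral peak, any movement artifact with more power than the gastric rhythm would be reported as the dominant frequency, which illustrates why frequent or unnoticed artifacts can invalidate the analysis.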
[Figure caption, beginning missing] ... Figure 6), the peaks in the chart indicate artifacts. In EGG data, these derive mainly from participant's movements.
In our study, EGG was recorded on the participant's abdominal surface. The electrodes were positioned according to the recommendations of Yin and Chen [82] and were attached by the participants themselves. Despite the instruction not to move, we found a high number and frequency of motion artifacts in most participants (an example is given in Figure 8). Therefore, a meaningful analysis was not possible and we refrain from reporting results.
Besides these methodological issues, some ethical aspects should be considered when EGG is recorded. The restriction not to move or speak might deter participants from reporting when they feel very ill or when they wish to quit the study. In addition, some participants may find it uncomfortable to have electrodes placed on the abdominal surface by an experimenter. To avoid this, it is possible to let the participants attach the electrodes themselves. Then, however, the experimenter has no control over whether the electrodes are placed correctly. Preparing the skin for attaching the electrodes [82] can also result in unpleasant feelings for participants. Additionally, the amount and time of the last meal have to be controlled because they affect the stomach's activity and the development of motion sickness [83]. To control for this, participants can be asked to fast before the EGG recording, or a standardized meal can be provided at a fixed time before the start of the test drive.
Altogether, the EGG is hardly suitable to be applied in motion sickness studies under naturalistic driving conditions from the standpoint of current measurement techniques.

Data Analyses in General
For ethical reasons (see Chapter 4), participants must be able to terminate participation at any stage of the study. Furthermore, the experimenter has to terminate the session if the participant is conspicuously suffering. Therefore, a researcher has to expect dropouts when conducting a motion sickness study. In driving studies concerning other topics (e.g., acceptance of a new driver assistance system), dropouts are often replaced by other participants so that each condition contains a sufficient and equal number of data sets, which facilitates the statistical analysis. In a motion sickness study, however, the occurrence of a dropout is itself very important, as it indicates that motion sickness became too severe.
Concerning post-study questionnaires (e.g., MSQ), dropouts are not a problem for data analysis as all participants-regardless of cancelling or completing the session-can fill it out. However, all data collected during the runs are sensitive to dropouts during the session. On the one hand, this influences the statistical data analysis and might necessitate the usage of tests which can handle dropouts and missing data. On the other hand, however, researchers can use dropout rates as dependent variables, investigating which conditions caused how many people to abort the trials due to sickness. Furthermore, dropouts enable time-based parameters describing the progress of motion sickness: How long does it take until the dropouts occur? Does this time differ between the test conditions? Therefore, researchers should not see dropouts as a problem (like in other research issues), but rather as an increase of information.
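To illustrate how dropouts can be treated as information rather than as missing data, the following minimal sketch computes a dropout rate and a time-to-dropout curve for one condition. The Kaplan-Meier estimator is one common choice for such data and is our assumption here, not a method prescribed by the studies discussed; the function names and all data values are hypothetical. Participants who completed the run are treated as censored at the end of the run.

```python
from collections import Counter


def dropout_rate(dropped_out):
    """Share of participants whose run ended in a dropout."""
    return sum(dropped_out) / len(dropped_out)


def kaplan_meier(durations, dropped_out):
    """Estimate the share of participants still in the run over time.

    durations   -- minutes each participant stayed in the run
    dropped_out -- True if the run ended in a dropout, False if the
                   participant completed it (treated as censored)
    Returns a list of (time, share_remaining) at each dropout time.
    """
    dropouts_at = Counter(t for t, d in zip(durations, dropped_out) if d)
    remaining = 1.0
    curve = []
    for t in sorted(dropouts_at):
        at_risk = sum(1 for u in durations if u >= t)  # still in the run at t
        remaining *= 1.0 - dropouts_at[t] / at_risk
        curve.append((t, remaining))
    return curve


# Hypothetical data for one condition with 20-minute runs.
baseline_minutes = [8, 12, 20, 20, 15, 20]
baseline_dropout = [True, True, False, False, True, False]

baseline_rate = dropout_rate(baseline_dropout)
baseline_curve = kaplan_meier(baseline_minutes, baseline_dropout)
```

Computing such a curve per test condition gives a direct view of whether dropouts occur earlier in one condition than in another; a formal comparison would additionally require a suitable statistical test (for example, a log-rank test), which is beyond this sketch.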
In general, time-based parameters describing the progress of motion sickness are important for motion sickness studies: if a continuous online assessment of motion sickness is conducted (e.g., via FMS, MISC, or symptom-specific Likert scales), it is possible to use parameters which define the time until a participant reaches a specific symptom (e.g., "time to nausea" or "time to sweating"). These data are helpful for the description of motion sickness and the effect of countermeasures.
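A parameter such as "time to nausea" can be derived from a continuous rating series as sketched below. This is a minimal illustration under stated assumptions: ratings are taken at known time points on a numeric scale where higher values mean stronger symptoms, and the threshold value, variable names, and data are hypothetical and would have to be chosen for the specific scale used.

```python
def time_to_threshold(times_min, ratings, threshold):
    """Return the first time at which the rating reaches the threshold.

    Returns None if the threshold is never reached during the run.
    """
    for t, rating in zip(times_min, ratings):
        if rating >= threshold:
            return t
    return None


# Hypothetical run: one rating every two minutes on a numeric scale.
times_min = [0, 2, 4, 6, 8, 10]
ratings = [0, 1, 1, 3, 6, 7]
NAUSEA_THRESHOLD = 6  # hypothetical cut-off, depends on the scale used

time_to_nausea = time_to_threshold(times_min, ratings, NAUSEA_THRESHOLD)
never_reached = time_to_threshold(times_min, ratings, 9)
```

Returning None when the threshold is never reached keeps such runs distinguishable from runs in which the symptom occurred, so they can be handled as censored observations in a time-based analysis.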

Ethics in Motion Sickness Studies
The American Psychological Association has released a code of conduct that is relevant to research in psychology and other sciences [84]. It includes five fundamental principles which define how to treat participants in scientific investigations. The first principle, "beneficence and nonmaleficence", states that researchers should take care of their participants and their wellbeing. This principle is violated by studies concerning motion sickness, as unpleasant symptoms like headache, nausea, or sweating are provoked in these studies. In this respect, motion sickness research resembles pain research: research on the topic requires undesirable physical effects and uncomfortable situations for the study participants. For pain studies, the Committee on Ethical Issues of the International Association for the Study of Pain (IASP) has published ethical guidelines [85]. According to the authors, "health, safety and dignity of human subjects have the highest priority in pain research"; this applies equally to motion sickness research. Researchers of motion sickness can align their procedures with these guidelines, in particular with the following principles:
"Potential participants should be informed fully of the goals, procedures, and risks of the study before giving their consent". In a motion sickness study, participants must sign an informed consent form prior to the study. In particular, research on motion sickness has to be named as the study's aim (i.e., no cover story), and the participant has to be informed that undesirable physical effects of the study (e.g., headache, sickness, sweating) are likely.
"Participants must be able to decline, or to terminate, participation at any stage without risk or penalty. Stimuli should never exceed a subject's tolerance limit and subjects should be able to escape or terminate a painful stimulus at will". In a motion sickness study, the participant is allowed to leave the study at any time, and the experimenter has to stop the run immediately or as soon as possible. The experimenter must not exert pressure on participants to continue the test session.
"The minimal intensity of noxious stimulus necessary to achieve goals of the study should be established and not exceeded." In a motion sickness study, the researcher must define criteria for when to break off a session: is it really necessary for participants to develop strong motion sickness up to the point of vomiting? For most research questions (e.g., the evaluation of an intervention's effect), it should be sufficient that participants feel first or moderate symptoms of nausea, as several studies have shown that motion sickness progresses linearly with further provocation [32,33]. Besides, even weaker symptoms are experienced as uncomfortable and are not desired during autonomous driving. The break-off criterion could be a predefined participant judgement on a scale measuring well-being, which is given regularly during the session. Additionally, continuous monitoring by the experimenter can help to evaluate the participants' well-being: in cases of conspicuous suffering (e.g., moaning, convulsing), the experimenter has to terminate the experiment.
After deciding to stop the experiment, whether because of the participant's wish, a rating above a predefined threshold, or conspicuous suffering, the experimenter must stop the session immediately. After the participant has left the sickness-provoking situation, the experimenter has to offer the participant various options to relieve her/his motion sickness, e.g., breathing fresh air, visiting a restroom, or having a cold or warm drink; for emergencies like a circulatory collapse, the participant should have the option to lie down.
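The three break-off conditions named above can be written down as an explicit rule that is fixed before the study starts. The following is only a sketch under stated assumptions: the rating scale is taken to run from low (no symptoms) to high (strong symptoms), reaching the threshold is treated as sufficient for a stop, and the class name, field names, and threshold value are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Observation:
    rating: int                  # participant's latest rating (higher = worse)
    wants_to_stop: bool          # participant asked to end the session
    conspicuous_suffering: bool  # experimenter observed e.g. moaning


def must_stop(obs, rating_threshold):
    """True if any of the three break-off conditions is met."""
    return (
        obs.wants_to_stop
        or obs.conspicuous_suffering
        or obs.rating >= rating_threshold
    )


RATING_THRESHOLD = 6  # hypothetical value, to be fixed before the study

continue_case = must_stop(Observation(3, False, False), RATING_THRESHOLD)
threshold_case = must_stop(Observation(6, False, False), RATING_THRESHOLD)
request_case = must_stop(Observation(1, True, False), RATING_THRESHOLD)
```

Writing the rule as a disjunction makes explicit that the participant's wish and the experimenter's observation each suffice on their own, independent of the reported rating.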
At the end of the study, the participants' well-being should be evaluated again. If participants still suffer from motion sickness symptoms, they should be strongly encouraged not to drive a car for safety reasons. In this case, the researcher should provide a shuttle back home or organize a taxi transfer and cover its costs.
The experimenter must be trained in all of these aspects to ensure good treatment of the participant. A high degree of empathy and training in the detection of motion sickness signals is especially important in order to avoid artefacts of the study situation. Some participants might play down the symptom severity because (1) they form an interpretation of the experiment's purpose and adjust their judgments to fit that interpretation (demand characteristics) or (2) they see high severity judgments as an indicator of weakness (social desirability). In both cases, the experimenter must break off the experiment to prevent further suffering of the participant.

Conclusions
Automated vehicles have the potential to provide significant benefits for the occupants as they can spend their time with various non-driving related activities during the journey. However, this scenario increases the risk of motion sickness and requires an investigation of the phenomenon of motion sickness in the context of automated driving. The present paper discusses methodological aspects for studies investigating the two main research questions: (1) what is the prevalence of motion sickness in a specific scenario (e.g., autonomous driving on a highway) and how do the symptoms develop? (2) Which countermeasures are effective in the prevention and reduction of motion sickness?
If researchers are interested in the prevalence and development of motion sickness in a specific scenario, we suggest conducting a field study in a setting which is as natural as possible. The test vehicle should drive autonomously or be operated by a trained experimenter (Wizard-of-Oz setting) on public roads in order to achieve external validity. The participants should engage in an NDRT which is likely to be used in a future autonomous vehicle (e.g., reading or texting). This task should be self-paced so that the participants can interrupt it whenever they want and are able to glance up at the road. As the prevalence of motion sickness in this scenario is of interest, the researchers should select a sample that is representative with respect to all aspects which can affect the prevalence of motion sickness, e.g., age and gender.
In contrast, the setting of a study investigating countermeasures for motion sickness is more standardized. This is necessary as the comparison between runs with the countermeasure (treatment run) and runs without the countermeasure (baseline run) has to be conducted under controlled conditions in order to achieve a high degree of internal validity. The influence of extraneous variables on the measurement should be minimized or removed. Therefore, the study must be conducted in a standardized setting, either on a test track or in a driving simulator. The scenario should provoke motion sickness in the baseline run, as a positive effect in the treatment run can only be detected under these conditions. On a test track, standardized maneuvers like driving in a figure eight or constant stop-and-go are recommended. The maneuvers should be driven by a trained experimenter. In the driving simulator, a more naturalistic test course like a winding rural road is possible. The participants should engage in a standardized NDRT which controls glances at the road or prevents them entirely. The researchers should select participants who are susceptible to motion sickness in the investigated setting. For this purpose, specific screening questions are more useful than general tools like the MSSQ. Table 2 gives an overview of the recommendations for studies concerning the two main research questions. Of course, the two research questions concerning prevalence/development and countermeasures are not distinct opposites which require an "either-or decision" in the study design. Mixed research questions are conceivable, e.g., when investigating which of two countermeasures is the more effective one in a naturalistic setting. These studies require a mix of methods from both directions.
Independent of the research question, subjective measurement tools like questionnaires and inquiries are necessary to determine motion sickness. Quick and efficient tools like the MISC scale or symptom-specific Likert scales are recommended to assess the intensity of the symptoms during driving. In contrast, comprehensive questionnaires like the MSQ are appropriate to capture a wide range of motion sickness symptoms and their intensity after a run. The usage of physiological measurements to detect motion sickness is difficult under non-laboratory conditions. Existing literature reports a high degree of inter-individual variance in physiological reactions. Additionally, we found a high intra-individual variance during the study with four test sessions. Furthermore, most data are affected by external events like braking or a change of posture. It will be challenging to detect physiological correlates of motion sickness which can be assessed reliably and practicably during autonomous driving in realistic settings.
When planning a study concerning motion sickness during autonomous driving, it is imperative that the researchers consider ethical principles. In particular, comprehensive informed consent, predefined break-off criteria, and protective treatment by trained experimenters are necessary to conduct the study in an appropriate manner.
In sum, more research is necessary for the investigation of motion sickness and possible countermeasures. This paper contributes to solving methodological questions during this research.