Developing a Multimodal HMI Design Framework for Automotive Wellness in Autonomous Vehicles

Abstract: With the development of autonomous technology, research into multimodal human-machine interaction (HMI) for autonomous vehicles (AVs) has attracted extensive attention, especially in automotive wellness. To support the design of HMIs for automotive wellness in AVs, this paper proposes a multimodal design framework. First, three elements of the framework were envisioned based on the typical composition of an interactive system. Second, a five-step process for utilizing the proposed framework was suggested. Third, the framework was applied in a design education course for exemplification. Finally, the AttrakDiff questionnaire was used to evaluate the resulting interactive prototypes with 20 participants who had an affinity for HMI design. The questionnaire responses showed that the overall impression was positive and that this framework can help design students to effectively identify research gaps and expand design concepts in a systematic way. The proposed framework offers a design approach for the development of multimodal HMIs for automotive wellness in AVs.


Introduction
With the rapid development of autonomous driving technology, the market for autonomous vehicles (AVs) is growing rapidly [1]. At present, assisted driving technology has become relatively mature [2] and could substantially change how people travel by reducing the burden of driving [3,4]. Consequently, non-driving-related activities in vehicles may become more diverse [5–7]. Recent studies have shown that passengers want to engage in activities that promote health and wellness [3,8]. According to Singleton et al. [9,10], the notion of automotive wellness can be described as maintaining physical health and positive mental wellbeing in vehicles. To date, several explorations have aimed to improve different aspects of automotive wellness through the development of various human-machine interaction (HMI) systems. For example, Krome et al. [11] integrated an exercise bike into the car context, using exergame elements to turn traffic jams into fitness opportunities. Bito et al. [12] developed an automatic in-vehicle video system to improve the driving experience by generating highlights of a trip. Deserno et al. [13] embedded medical sensors into a bus to give the vehicle diagnostic functions for stroke prevention. From a social wellness perspective, Lakier et al. [14] proposed a cross-car game concept for semi-autonomous driving.
In society's transition towards automated driving, HMI researchers should focus on how to design and develop AV applications that shed light on future car scenarios [33,34]. Despite the growing research effort devoted to automotive wellness, there is a lack of design frameworks that help designers efficiently find niches in this context and guide them in developing design concepts. To address this gap, this paper aims to develop a multimodal HMI design framework for automotive wellness. Based on theoretical work by Benyon et al. [35], we argue that the subjects of the system (i.e., AVs, passengers), the target activities related to wellness, and system interactivity (i.e., system input, system output) should be systematically considered as key elements when developing multimodal HMIs for automotive wellness in AVs. Accordingly, we proposed a design process to help designers adopt these three elements in their practice. We examined the proposed framework in a design education course and used the students' works to exemplify its usefulness.

Background
Design methods that inspire and guide interaction design have a long history. For example, participatory design methods directly involve users as partners in the design process [36]. Cultural probes [37] aim to inspire new design ideas, focusing on creative, engaging, and playful applications of technologies. Ethnography is used to understand the social and situated use of technologies in particular environments [38]. These methods certainly play an important role in inspiration. Nevertheless, one of their weaknesses is that researchers applying them tend to be trapped in a narrow perspective, without holistic design considerations.
In addition, a well-established design process can help designers produce results efficiently. For instance, the Double Diamond design model, proposed by the British Design Council, has been widely used in design research and practice [39]. In this model, a linear design process is clearly divided into four parts: discover, define, develop, and deliver. To our knowledge, many variants of the Double Diamond design model have evolved to suit different kinds of design projects [40]. Similarly, in this paper, we sought to leverage such a linear design process to support the design and development of HMIs for automotive wellness in AVs, which involves the composition of the interactive system.
Benyon et al. [35] used People, Activities, Contexts, and Technologies as the four elements of a framework for interactive system design. First, people are regarded as the essential part of the interactive system, for which individual differences in physical, psychological, and usage characteristics cannot be ignored. Second, ten important characteristics of activities were listed that need to be considered, especially the content of the activity: frequency; working well at both peaks and troughs of working; resuming after interruption with no mistakes; response time of the system; communication and coordination with others; well-defined tasks; safety; dealing with errors; data requirements and input devices; and media. Third, contexts can be distinguished into the organizational context, the social context, and the physical circumstances. Context can sometimes be seen as the surroundings of an activity, and it is responsible for gluing activities together. Fourth, and last but not least, technologies are an essential element of an interactive system. According to Benyon et al. [35], activities and contexts determine the requirements for technology development, whereas the deployment of new technologies changes activities and contexts.
While some existing models support HMI design for AVs (e.g., [41]), they lack thorough consideration of how to improve automotive wellness. Therefore, based on the aforementioned design methods and models, we propose a new design framework and a matching process in this paper.

Elements of the Framework
According to Benyon et al. [35], an interactive system consists of the following elements: people, contexts, activities, and technologies. We therefore mapped the key HMI design considerations for automotive wellness in AVs to these elements to support designers in exploring relevant themes. As shown in Figure 1, a circular model with color-coded elements was proposed, with three main aspects: subjects, target activities, and system interactivity. Specifically, in an AV environment, we deemed passengers to be the people and vehicles to be the contexts of the interactive system. They were grouped into the category of subjects (in yellow) to represent the information senders (vehicles) and receivers (passengers) of the system.
In addition, we adopted the concept of activities from [3] and dedicated it to in-vehicle activities related to health and wellbeing. Following previous categorizations of in-vehicle activities [42,43], in this study, we summarized target activities (in green) as health, entertainment, work, and social communication. However, the types of target activities might be expanded in the future.
Lastly, we proposed the system interactivity of HMIs, which corresponds to technologies and can be further divided into inputs and outputs from a human-computer interaction perspective. As for system inputs, owing to advances in ubiquitous sensing [13,44–46], it is common to track user behaviors and physiological status as inputs for interactive technologies, which may significantly expand the modalities of data for future HMIs. As for system outputs, given the growing consideration of multiple sensory channels in HMI design [47,48], we envisioned that different modalities (e.g., visual, auditory, haptic, etc.) could be leveraged to provide proper feedback in the interactive system of HMIs.
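To make the structure of the framework concrete, its three aspects can be sketched as a small data model. The following Python sketch is illustrative only: the class and field names (`Subjects`, `HMIConcept`, `is_multimodal`) and the example concept are assumptions introduced here, not part of the framework itself, and the enumerated values simply mirror the categories named in the text.

```python
from dataclasses import dataclass
from enum import Enum


class TargetActivity(Enum):
    # The four target activities summarized in the framework.
    HEALTH = "health"
    ENTERTAINMENT = "entertainment"
    WORK = "work"
    SOCIAL_COMMUNICATION = "social communication"


class SystemInput(Enum):
    # Input types named in the text: user behaviors and physiological status.
    BEHAVIOR = "behavior"
    PHYSIOLOGICAL_STATUS = "physiological status"


class OutputModality(Enum):
    # Example modalities from the text; the list is open-ended ("etc.").
    VISUAL = "visual"
    AUDITORY = "auditory"
    HAPTIC = "haptic"


@dataclass(frozen=True)
class Subjects:
    passengers: str  # who receives information from the system
    vehicle: str     # the AV usage scenario that sends information


@dataclass(frozen=True)
class HMIConcept:
    name: str
    subjects: Subjects
    activity: TargetActivity
    inputs: frozenset   # set of SystemInput values
    outputs: frozenset  # set of OutputModality values

    def is_multimodal(self) -> bool:
        # A concept counts as multimodal here if it uses more than one output modality.
        return len(self.outputs) > 1


# Hypothetical concept, used only to show how the elements fit together.
concept = HMIConcept(
    name="example concept",
    subjects=Subjects(passengers="commuter", vehicle="one-way commute"),
    activity=TargetActivity.HEALTH,
    inputs=frozenset({SystemInput.PHYSIOLOGICAL_STATUS}),
    outputs=frozenset({OutputModality.VISUAL, OutputModality.AUDITORY}),
)
print(concept.is_multimodal())  # True
```

Keeping the categories as enumerations reflects the remark that the types of target activities might be expanded in the future: a new category can be added without changing the rest of the model.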

The Process for Utilizing the Proposed Framework
Following Morrison et al. [49], we proposed a five-step design process that could support HMI researchers in discovering research questions, defining target subjects, and developing and delivering design concepts, where each step serves as the premise of the next. These five steps can be described as follows.

1. Find research gaps. Researchers need to extract passengers, AVs, target activities, system inputs, and system outputs based on desktop research and/or user studies. It is helpful to find gaps by contrasting these elements.
2. Specify the subjects. In this step, researchers are required to specify the personal information of passengers and scenarios of vehicle usage.
3. Specify target activities. This step is to help researchers clarify which aspects of wellness they hope to solve through HMI design. Passengers' demands should be sufficiently considered.
4. Specify system interactivity. In this step, system inputs and system outputs need to be considered systematically. Explicitly, researchers are required to compare which combinations of system inputs and system outputs can achieve the target activities more effectively.
5. Design HMI for automotive wellness in AVs based on pre-defined elements. All the elements of the interactive system have been defined in the previous steps. Thus, specific products need to be designed within the limits of these elements.
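Step 1 suggests finding gaps by contrasting the framework elements. One way to picture this is to enumerate every combination of target activity, input type, and output modality, and then remove the combinations already covered by existing work. The sketch below assumes this reading of the step; the function name `find_gaps` and the entries in `covered` are hypothetical placeholders, not survey results from this paper.

```python
from itertools import product

# Categories named in the framework description.
ACTIVITIES = ("health", "entertainment", "work", "social communication")
INPUTS = ("behavior", "physiological status")
OUTPUTS = ("visual", "auditory", "haptic")


def find_gaps(covered):
    """Return every (activity, input, output) combination not in `covered`."""
    all_combinations = product(ACTIVITIES, INPUTS, OUTPUTS)
    return [combo for combo in all_combinations if combo not in covered]


# Hypothetical outcome of desktop research: placeholder entries only.
covered = {
    ("health", "physiological status", "visual"),
    ("entertainment", "behavior", "visual"),
}

gaps = find_gaps(covered)
print(len(gaps))  # 4 * 2 * 3 = 24 combinations, minus 2 covered = 22
```

Such a listing only shows which combinations are unexplored; whether a given gap is worth pursuing still depends on passengers' demands, as steps 2 to 4 make clear.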

Model Application
We applied the proposed framework in a second-year bachelor design course. In this course, we defined the subjects according to the framework as follows: passengers were defined as middle-aged commuters with a family of procreation, and the AV usage scenario was defined as a 1.5-h one-way commute. Next, we illustrate the applicability of this framework with five student cases from this course.

Social Communication-Behavior-Multichannel Feedback
"Personal Interaction Devices in Car" is a set of interactive devices for parent-child communication in the car, as shown in Figure 2. During the design, this group further specified the commuters as parents who take their children with them to work. Thus, passengers in this project referred to parents and children commuting together, and AVs referred to the commute shared with children. For target activities, this group chose social communication, aiming to help parents perceive their children's status in time and to keep children from feeling worried. Finally, for system interactivity, this group regarded the children's patting and pressing of the doll as system inputs, which triggered the tactile feedback (warming) of the toy as well as the visual feedback (light) and auditory feedback (music) of the display device as system outputs.

Social Communication-Behavior-Visual Feedback
The "Time Hobbyhorse" is a device that helps commuting parents and their children waiting at home to communicate in real time and share the journey, as shown in Figure 3. This group expanded passengers to include both commuters and children waiting at home. For target activities, this group also chose social communication, aiming to enhance communication between children and parents by showing children the position of their parents during the commute. Finally, for system interactivity, this group regarded the parents' position status as system inputs, which triggered visual feedback (rotation of the Time Hobbyhorse and lighting) as system outputs.

Entertainment-Behavior-Multichannel Feedback
"Carfee Break" is a device designed to give passengers in AVs a more comfortable rest environment, as shown in Figure 4. For target activities, this group chose entertainment, aiming to relieve passengers' fatigue during commuting. Finally, for system interactivity, this group regarded the behavior of putting a hot drink into the cup holder as the system input, which triggered visual feedback (lighting) and olfactory feedback (pleasant aroma) as system outputs.

Entertainment-Explicit Behavior-Visual Feedback
"E-Car" is a modular intelligent manual double-trigger mode switch, as shown in Figure 5. This group expanded AVs into four modes: Business Mode for commuting, Didi Mode for sharing, Couple Mode for couples' travel, and Tourism Mode for traveling. For target activities, this group also chose entertainment, aiming to improve comfort. Finally, for system interactivity, this group regarded the different gestures for particular modes as system inputs, which triggered visual feedback (lighting) as system outputs.

Health-Physiological Status-Visual Feedback
"Health Tester" is an auxiliary product that can monitor the physical condition of passengers in real time, as shown in Figure 6. For target activities, this group chose health, aiming to help passengers become aware of their health status. Finally, for system interactivity, this group considered the physiological data of passengers monitored by sensors as system inputs, which triggered visual feedback (lighting) as system outputs. Specifically, the device was divided into four parts: the eyes in the upper left corner corresponded to the passengers' fatigue; the heart in the lower left corner corresponded to the passengers' heart and other important body organs; the human figure in the middle, representing the muscles in different parts of the body, showed the degree of muscle fatigue; and the bar on the right showed the temperature of the passengers.

Evaluation
In order to effectively evaluate these five design concepts, we conducted an evaluation with 20 participants. The main purpose of this test was to examine the attractiveness of these design prototypes, in order to provide evidence of the effectiveness of our proposed framework for multimodal HMI designs.

Participants
A total of 20 participants (13 males, 7 females) aged 23 to 45 years (M = 29, SD = 7.5) were recruited for this design evaluation through targeted invitation. All participants had an educational background in industrial design or interaction design and were very familiar with the theories, methodologies, and case studies of HMI. Specifically, four university lecturers in design, four design practitioners with more than three years of work experience, four Ph.D. candidates in design, and eight postgraduate students in design took part in the test.

Setup
The test was conducted online. Five video showcases and five design images were prepared to introduce the concepts of each work, respectively. An online survey was also prepared based on the AttrakDiff questionnaire [50], which is helpful to evaluate the attractiveness of products [51]. AttrakDiff has been widely applied for heuristic testing of HMI designs as a single evaluation method [52–55]. Similarly, we conducted single evaluations on the five designs and adopted the AttrakDiff online tool [56] to analyze our research data.

Procedure
Before the experiment, we briefly introduced this project and demonstrated the concept and functionality of each prototype through video showcases and design images. Participants were then allowed to ask questions about the prototypes to help them understand the concepts thoroughly. Subsequently, they were invited to fill in the questionnaire.

Data Collection and Analysis
The questionnaire data were collected through a Chinese online survey service provider called Wen Juan Xing, which provides functions equivalent to Amazon Mechanical Turk [57]. All the obtained data were imported into the AttrakDiff online tool for analysis, which automatically calculates the Hedonic Quality (HQ) and the Pragmatic Quality (PQ) of the evaluated concept design in order to reveal its attractiveness (ATT). Specifically, HQ is an index that describes the originality and beauty of a product, whereas PQ indicates usability aspects, i.e., efficiency, effectiveness, and learnability [58]. The higher the HQ and PQ values, the higher the ATT of the evaluated product. The values of HQ, PQ, and ATT all range from −3 (negative) to 3 (positive). The online tool also generates a visual analysis diagram in which the vertical axis represents the HQ value (bottom = low extent) and the horizontal axis represents the PQ value (left = low extent) of a specific design. In the diagram, a rectangle shows the questionnaire results, and its size is inversely related to the confidence of the data: the larger the rectangle, the lower the confidence.
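The relationship between ratings, scale values, and the confidence rectangle can be illustrated with a short sketch. The text does not specify how the AttrakDiff online tool computes its values, so the code below is an assumption: it treats a scale value as the mean of ratings coded from −3 to 3, and the "confidence" as the half-width of a 95% confidence interval under a normal approximation. The function names and the ratings are hypothetical.

```python
from statistics import NormalDist, mean, stdev


def scale_scores(responses):
    """Average each participant's item ratings (coded -3..3) into one score."""
    for ratings in responses:
        if any(r < -3 or r > 3 for r in ratings):
            raise ValueError("ratings must lie between -3 and 3")
    return [mean(ratings) for ratings in responses]


def summarize(scores, level=0.95):
    """Return (mean, half-width) of a normal-approximation confidence interval."""
    z = NormalDist().inv_cdf(0.5 + level / 2)
    half_width = z * stdev(scores) / len(scores) ** 0.5
    return mean(scores), half_width


def confidence_rectangle(pq, hq):
    """Rectangle spanned by the PQ (horizontal) and HQ (vertical) intervals."""
    (pq_mean, pq_hw), (hq_mean, hq_hw) = pq, hq
    return {
        "pq_range": (pq_mean - pq_hw, pq_mean + pq_hw),
        "hq_range": (hq_mean - hq_hw, hq_mean + hq_hw),
    }


# Hypothetical ratings: three participants, three items per scale.
pq = summarize(scale_scores([[1, 2, 3], [0, 1, 2], [2, 2, 2]]))
hq = summarize(scale_scores([[1, 1, 1], [2, 0, 1], [3, 2, 1]]))
rect = confidence_rectangle(pq, hq)
print(rect)
```

Under this reading, a wider interval on either axis enlarges the rectangle, which matches the idea that a larger rectangle reflects less agreement among participants.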

Results
We obtained 20 responses to the 5 concept designs and imported them into AttrakDiff for data analysis. As shown in Figure 7, the values of all students' designs were close to the "desired" area and their ATT values were greater than 1.0. That is to say, the attractiveness of these prototypes was rated positively.
Specifically, Figure 7a shows that the HQ value of Personal Interaction Devices in Car was 0.86 (confidence = 0.51) and the PQ value was 1.02 (confidence = 0.47), with an ATT value of 1.32. Figure 7b reveals that Time Hobbyhorse achieved higher results, with an HQ value of 1.23 (confidence = 0.42) and a PQ value of 1.43 (confidence = 0.38). The ATT value of this design was as high as 1.85, indicating strong attractiveness. By comparison, Personal Interaction Devices in Car was less desirable than Time Hobbyhorse. This may be because social interactions within vehicles could raise safety concerns. By enabling lightweight, single-direction communication between passengers and their families at home, Time Hobbyhorse may therefore have been deemed more acceptable and novel by our participants. This finding implies that, when developing multimodal HMIs for automotive wellness, the design framework should also support designers in brainstorming possibilities beyond the context of the vehicle alone.
According to Figure 7c, Carfee Break had an HQ value of 0.75 (confidence = 0.35) and a PQ value of 1.14 (confidence = 0.33), with an overall ATT value of 1.17. The result for E-Car (Figure 7d) showed an HQ value of 0.83 (confidence = 0.20), a PQ value of 1.26 (confidence = 0.26), and an ATT value of 1.16. Both Carfee Break and E-Car had strong task-oriented features, because they selected entertainment as the target activity, aiming to provide passengers with a relaxing ambience.
As shown in Figure 7e, Health Tester was evaluated with an HQ value of 1.08 (confidence = 0.26) and a PQ value of 0.38 (confidence = 0.31), with an ATT value of 1.19. Of all these works, Health Tester was the least task-oriented, which can be attributed to the fact that it focused on helping passengers learn their health conditions within the car, through timely visualization of their physiological data. Therefore, rather than focusing on specific HMI tasks, Health Tester was applied in the context of an ambient display to provide a reminder of unhealthy conditions.
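For reference, the reported values can be gathered in one place and the comparisons made in this section checked directly. The numbers below are those given in the text for Figure 7a to 7e; the variable names are illustrative.

```python
# (HQ, PQ, ATT) values as reported in the text for Figure 7a-e.
results = {
    "Personal Interaction Devices in Car": (0.86, 1.02, 1.32),
    "Time Hobbyhorse": (1.23, 1.43, 1.85),
    "Carfee Break": (0.75, 1.14, 1.17),
    "E-Car": (0.83, 1.26, 1.16),
    "Health Tester": (1.08, 0.38, 1.19),
}

# Every design is reported with an ATT value above 1.0.
all_att_above_one = all(att > 1.0 for _, _, att in results.values())

# Design with the highest attractiveness and design with the lowest pragmatic quality.
most_attractive = max(results, key=lambda name: results[name][2])
lowest_pq = min(results, key=lambda name: results[name][1])

print(all_att_above_one)  # True
print(most_attractive)    # Time Hobbyhorse
print(lowest_pq)          # Health Tester
```

The lowest PQ value belongs to Health Tester, consistent with its description as the least task-oriented of the five works.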

Discussion and Conclusions
In this paper, we have proposed a multimodal HMI design framework for automotive wellness in AVs based on the notion of Benyon et al. [35]. In this framework, three elements were proposed and divided into five dimensions. Accordingly, we developed a process for utilizing this framework and examined it in a design course to exemplify its effectiveness. We learned that the proposed framework could support the development of novel HMI concepts for automotive wellness, owing to its clearly classified target activities. Moreover, we learned that the lists of system inputs and system outputs in the framework helped the students turn their concept designs into feasible engineering developments. Based on the application throughout the design course, we also found that one advantage of the linear process was that it provided clear guidance for exploring HMI design. All the students' works were evaluated with the AttrakDiff questionnaire by 20 participants who had design backgrounds. The positive AttrakDiff results (all with ATT > 1) indicate that the proposed framework and design process could effectively support students in developing desirable multimodal HMIs for automotive wellness.
Although the proposed framework proved useful in practice, the course also showed that it is not yet sufficiently detailed. Students may feel that some elements are not clearly defined, which requires us to reflect on the framework. The development of multimodal HMI design for automotive wellness in AVs requires researchers to consider various factors in the interactive system. First, each element needs to be expanded for design ideation. For example, target activities have been summarized as health, entertainment, work, and social communication. However, the actual demands of passengers may become more diverse with increased attention to wellness [59,60]. Thus, the types of target activities should be reclassified and expanded. System inputs, in turn, can be connected with specific technologies. For example, it would be worth compiling an index of existing sensors used to monitor different physiological measures [13]. Second, the proposed framework was aimed at relatively simple tasks. A remaining challenge is whether it is still applicable in complex in-vehicle situations with multiple passengers [61,62]. This is one of the directions we need to study next.