User Experience of Multi-Mode and Multitasked Extended Reality on Different Mobile Interaction Platforms

Abstract: “Extended Reality (XR)” refers to a unified platform or content that supports all forms of “reality”, e.g., 2D, 3D virtual, augmented, and augmented virtual. We explore how the mobile device can support such a concept of XR. We evaluate the XR user experiences of multi-mode usage and multitasking on three mobile platforms: (1) the bare smartphone (PhoneXR), (2) the standalone mobile headset unit (ClosedXR), and (3) the smartphone with clip-on lenses (LensXR). Two use cases were considered: (a) Experiment 1: using and switching among different modes within a single XR application while multitasking with a smartphone app, and (b) Experiment 2: general multitasking among different “reality” applications (e.g., 2D app, AR, VR). Results showed that users generally valued the immersive experience over usability: ClosedXR was clearly preferred over the others. Despite potentially offering a balanced level of immersion and usability with its touch-based interaction, LensXR was not generally well received. PhoneXR was not rated particularly advantageous over ClosedXR, even though the latter required a controller. ClosedXR's usability suffered only when long text had to be entered. Thus, improving the 1D/2D operations in ClosedXR for operation and multitasking would be one way to weave XR into our lives with smartphones.


Introduction
Recently, the phrase "Extended Reality" (or XR in short) has become a general term representing the different individual modes of reality, where "X" is replaceable with "virtual", "augmented", "physical", etc. [1][2][3]. Another intended definition of XR refers to a single platform or content that supports or uses various forms of reality (2D/3D virtual, augmented, augmented virtual) together [1,2,4]. That is, an XR platform should support multiple modes of operation within a single application or across different applications. Thus, we claim that an important and desirable feature of an XR platform is the convenient, quick use of, and seamless switching among, the different modes of operation. However, the potential of such a conceptualization has not been demonstrated much, let alone its design space explored. In this paper, we explore the possibility of extending the mobile devices we are now so accustomed to using to support such a concept of XR as well. That is, as a platform, the mobile device should support multi-mode and multitasking usage not only among XR modes (e.g., 2D, AR, VR) but also regular non-XR modes/apps (e.g., texting). We conducted a study to evaluate and compare the XR user experiences among three mobile platforms: (1) the bare smartphone, (2) the standalone mobile headset unit, and (3) the smartphone with clip-on lenses (see Figure 1). User experience in general refers to how the user interacts with and experiences a product, system or service, and includes, among others, the user's perceptions of utility, usability and task efficiency. For XR, it can be further extended to include the sense of immersion and presence (or object presence in the case of AR).
The study also illustrates and highlights the usage of a multi-mode XR application. We consider two use cases in separate experiments: (1) Experiment 1: using and switching among different modes of operation within a single XR application while multitasking with a regular smartphone app, and (2) Experiment 2: general multitasking among different "reality" applications (e.g., 2D app, AR, VR). The former focuses on the user experience of mode switching within a single multi-mode XR application, while the latter focuses on multitasking among separate XR applications. The contributions of this work can be summarized as follows:
• Demonstrated multi-mode XR content and multitasking on three different types of mobile platforms (PhoneXR, ClosedXR and LensXR);
• Experimentally confirmed the importance of ensuring a minimum level of presence and immersion in an XR platform, even at the expense of usability problems;
• Derived requirements for a unified mobile platform for further acceptability.

Related Work
Most works in virtual/augmented reality operate in a single mode. There are not many examples of systems or content that operate in multiple modes on a single platform [4][5][6][7][8].
One of the early examples is the AR-based picture book called the MagicBook [9]. In this work, the main operating mode is AR where a marker on a picture book is used to make fairy tale characters pop out of the book, but the user is also allowed to dive into the virtual picture book environment, using the same headset with the video see-through set-up. The mobile video see-through platform (e.g., Cardboard VR using the smartphone camera) would certainly allow similar functionality, but 2D-oriented tasks could not be accomplished by the more natural touch interface [10]. The 2D tasks in mobile AR/VR including the "managerial" ones (like system-level commands or option selection) ordinarily rely on 2D interfaces (e.g., floating menus) situated in 3D space [11][12][13][14].
The popular AR game PokemonGo also has a "virtual" mode in which the user can search for Pokemons on a 2.5D map from a bird's eye viewpoint [15]. In fact, many AR applications make use of a non-AR mode for managing miscellaneous and secondary functions. For example, IKEA's interior design app first has the user select the furniture through regular 2D browsing (which is obviously deemed more convenient), then place and view the selected item in the AR space. This is feasible because the platform is based on the (bare) smartphone, with which a 2D touch interface is readily available. Sugiura et al. presented a system called Dollhouse VR, which combined a table-top 2D display with immersive VR for architectural design collaboration [16]. While this was a rare XR system with two "major" functional and display modes, each mode operated on a separate platform used by a different user.
Applications that operate in different reality modes are also starting to appear. For example, Facebook Social XR [17] and Spatial.io [18] feature a shared space in both a VR mode, in which the users are transported to an imaginary virtual place, and an AR mode, in which the users are situated in a real location context. However, these modes operate as separate applications, i.e., neither in association with one another nor in a multitasked fashion. SpaceTop was also an application that fused 2D and spatial 3D interactions in a single desktop workspace. It extended the traditional desktop interface with interaction technology and visualization techniques that enabled seamless transitions between 2D and 3D manipulation [19].
While it may take some time until the smartphone, mobile VR headset, and AR glasses converge into a single XR wearable, in the meantime the most natural choices for an integrated mobile XR platform are the bare smartphone and the standalone all-in-one VR system, with the display, processor and sensors (including the camera for video see-through AR capability) integrated into one wearable mobile unit [11]. Given the recent emergence of standalone mobile XR platforms, we exclude the cardboard type [20] from our comparison study because it would be too cumbersome and inconvenient to insert/take the smartphone into/out of the headset unit to switch between the different modes.
Another lesser known, yet viable alternative is the clip-on lenses used with the smartphone [21]. The clip-on lenses are compact, foldable and easy to carry. The act of clipping is also simple and more convenient than, e.g., using cardboard. As the design allows the fingers to access the touchscreen, the users can also use the touchscreen buttons for various interactions, including the mode switch and regular smartphone usage (see Figure 2). Although the smartphone display is not completely isolated (the peripheral view into the outside world remains visible), it has been shown that the immersive experience is comparable to that with closed VR headsets, and, as expected, significantly higher than with (bare) smartphone VR [22]. In our study, we include this platform in the comparison of the user experience. Multitasking, especially in the context of smartphone usage, has mostly been studied in terms of its hazards of lowered overall productivity [23]. We are not aware of any in-depth research on user behaviors in operating XR applications in multi-mode, nor on multitasking of XR and non-XR applications.

Experimental Design
The first experiment assesses the XR experience of using and switching among different modes of operation within a single XR application while multitasking with a regular smartphone app, such as responding to an incoming text (with more emphasis on the former). The platforms considered are (1) purely smartphone-based ("PhoneXR"), (2) standalone headset ("ClosedXR") and (3) smartphone with clip-on lenses ("LensXR"), as illustrated in Table 1. For PhoneXR and ClosedXR, all modes and apps are operated and switched within the bare phone or standalone headset. For LensXR, however, the 2D texting and the 2D mode of the multi-mode XR application are carried out with the lenses clipped off (i.e., in the bare smartphone configuration). For the AR/VR modes, the lenses are clipped back on (see the illustration of the mode change for LensXR in Figure 1, bottom). Our pilot interviews and trials revealed that operating the 2D app and the 2D XR mode was preferred with the bare phone touchscreen. For simpler interactions within AR/VR, touchscreen interaction with the lenses clipped on was used. When touching the screen with the lenses clipped on, the finger appears blurred (hence called "Blurry touch"). While we have previously shown that accurate touch selection (more so than with ray casting) is still quite possible this way [24], it is deemed more difficult than regular touch because the finger is seen blurred and the imagery is magnified by the lenses. We expect PhoneXR to exhibit the highest usability, as all interactions extend from the usual smartphone touch, but the lowest-rated XR experience, in the sense that it would feel like using just another smartphone app without much of the immersive quality of VR and AR. ClosedXR would suffer in the usability aspect because the headset has to be worn and interaction is carried out indirectly using ray casting with the controller.
LensXR is expected to be received as a balanced solution (and possibly offer the best overall experience), which has been shown to provide reasonable immersion and presence, yet allow the usual touchscreen-based interaction for the AR/VR modes [22,24].
The experiment is a one-factor (platform type), three-level, within-subject repeated-measures design. Figure 1 illustrates the three test platforms. Dependent variables assessed the XR user experience as a combination of the individual modal experiences and usability, the mode/app switch, as well as the total mobile platform/combination experience (see Section 3.5).

MonsterGO: Concept Multi-Mode XR App
We developed a concept multi-mode XR app called "MonsterGo", not only to carry out the experiment but also to showcase such an application possibility. MonsterGo is modeled after the storyline of Pokemon [15], in which the user roams around the world, real or virtual, and searches for and captures monsters. By design, it operates in three different modes. Each mode offers a unique experience, and the content attempts to match the operations to the characteristics of the media type (2D, VR, or AR) in hopes of promoting a total, combined mobile XR experience (see Figure 3).
In the 2D mode, a 2D map of the virtual world and the user character in it are shown on the display, and the user can search and travel by controlling the character; the specific interface could be direct touch, a virtual joystick, gaze/time-out or other means depending on the actual platform used (see Figure 4).
Figure 3. The multi-mode XR application MonsterGo, which can be multitasked with regular apps (like texting and making phone calls). It has three different modes: 2D for map-based exploration and search, 3D VR for first-person battles, and AR for interaction in the real environment.
As the user roams around the world in the 2D navigational/map mode, one might find portals to the 3D virtual world for completing certain missions (e.g., initiating a battle with another monster) in the first-person viewpoint immersively. The user would enter and exit the VR space and interact (e.g., battle) by various means, again depending on the given platform.
The user might also come across portals to the real/AR world and enter them for completing another type of mission-e.g., capturing a monster situated in a real location as an AR experience. The user would enter and exit the AR space and interact (e.g., capture) by various means, again depending on the given platform. The AR mode content was implemented using an image marker.

Interaction and Mode Switch and Multitasking
To use a multi-mode XR application and multitask with a regular smartphone app on a given mobile platform, a means of mode switching is needed in addition to that for the task interaction itself. Multitasking support, e.g., for suspending and reactivating apps, is often provided at the operating system (or system software) level. For example, the Android operating system provides various facilities for multitasking through notifications, alarms, split-screen, and app switch mechanisms [25]. However, no current support is available for "directly" switching among different modes within the same application. In Experiment 1, a simple selection of a button (or equivalently an icon/portal/notification) for the mode/app switch is implemented and built into the MonsterGo and texting applications.
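The in-app mode switch described above can be pictured as a small state machine that remembers the interrupted mode so the user can be returned to it automatically (as happens after sending a text in Experiment 1). The following is a minimal Python sketch of that idea; the class and method names are hypothetical and are not taken from the actual MonsterGo implementation.

```python
from enum import Enum, auto

class Mode(Enum):
    MAP_2D = auto()
    VR = auto()
    AR = auto()
    TEXTING = auto()

class ModeManager:
    """Tracks the active mode and remembers where to return after an interruption."""

    def __init__(self):
        self.active = Mode.MAP_2D  # MonsterGo starts in the 2D map mode
        self._previous = None

    def switch(self, target):
        # Direct switch, triggered by selecting a button/icon/portal/notification.
        self._previous = self.active
        self.active = target

    def finish_interruption(self):
        # E.g., pressing "send" in the texting app restores the prior AR/VR mode.
        if self._previous is not None:
            self.active, self._previous = self._previous, None

mgr = ModeManager()
mgr.switch(Mode.VR)       # user enters a VR portal from the 2D map
mgr.switch(Mode.TEXTING)  # an incoming-text notification is selected
mgr.finish_interruption() # sending the reply returns the user to VR
assert mgr.active is Mode.VR
```

The point of the sketch is that a multi-mode application handles the "return to previous mode" bookkeeping itself, rather than relying on the operating system's app switcher.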

Experimental Set-Up and Task
The three tested XR mobile platforms were set up as follows. MonsterGo ran on PhoneXR just like any smartphone game app, operated by holding the phone in one or two hands and interacting with the usual touch. Mode/app-specific notification and status information was shown in small insets on the screen that could be touched for the mode/app switch. To use the ClosedXR platform, the user wore the Vive Pro unit and carried out the interactive experimental task using the controller held in one hand. LensXR was operated without the lenses for the 2D mode and with them for the AR/VR modes; the mode switch was made with a touch button, after which the user clipped the lenses on or off oneself. Such actions may seem somewhat cumbersome compared to using the lenses for all the modes without ever needing to clip them on or off; however, as indicated already, our preliminary interviews found that the subjects much preferred this configuration. The Samsung Galaxy S8 smartphone [26], running Android version 9, was used for PhoneXR and LensXR. The POCKET-VR clip-on magnifying lenses were used for LensXR [21]. As for ClosedXR, while ideally a standalone VR unit (with the processing unit and display all integrated into one) should be used, for operational reasons we used a Vive Pro tethered to a PC and emulated the situation.
The experimental task started in the 2D map navigation mode of the MonsterGo application; thus, for LensXR, the lenses were initially not clipped onto the phone. The user was then asked to freely explore the scene for five minutes while switching between the different modes on the given platform. In the 2D mode, the user could navigate the map of the scene, moving through the pathways by touch, and would thereby find portals to switch to the AR or VR mode. The mode switch and other simple interactions were carried out, depending on the platform, by touch (PhoneXR), Blurry touch (LensXR) or ray-casting selection (ClosedXR). Interactive tasks included simple engagement (e.g., fight, feed) with the virtual character or content object. In the AR/VR modes, the user could also view the scene by rotating either the phone (PhoneXR, LensXR) or the head (ClosedXR). Once in a while, a text message arrived and a notification appeared at the top of the screen. The 2D texting app could be invoked by selecting the notification (changing the mode/setting), and the letters were entered by touching/selecting them on the virtual keyboard. Completing the text (by pressing the send button) would automatically take the user back to the previous mode (AR or VR). The subject was asked to go back and forth (or selectively reactivate) among the three XR app modes at least three times and to respond to the occasional (three) incoming text messages.

Experimental Procedure
Experiment 1 started by collecting the basic subject information and familiarizing the subject with the three platforms, MonsterGo and the texting app, and the overall experimental task. 24 subjects participated in this experiment (mean age: 25.5, SD = 4.06). 18 and 17 of the 24 participants had previous experience with AR and VR content, respectively, on various platforms.
In the main experimental session, the user experienced all three different platform treatments, each of which was presented in a balanced order. The whole session took about an hour. After each session, the user filled out an extended user experience survey which assessed the individual modal experiences as well as the total mobile experience. Naturally, the survey combines the questions for VR (e.g., presence, immersion, realism, distraction), AR (e.g., object presence, naturalness), mobile (e.g., usability, mode switch and situation awareness) and for the total user experience (general preference, satisfaction, and balance). See Appendix A for the details.

Results
One-way ANOVA with Tukey HSD pairwise comparisons was applied to the survey data for analysis. Statistically significant effects were found for a number of UX questions. Q1 and Q2 asked about the UX of the mode switching and multitasking with the texting app: naturalness/flow and ease of use, respectively (see Figure 5). In both cases, the general trend was PhoneXR being the most natural or easiest, followed by ClosedXR (without any statistical difference), then by LensXR (with a statistically significant difference to PhoneXR; p = 0.009 for Q1, p = 0.011 for Q2). Q3 and Q4 assessed the ease of use for the 2D modes, namely the 2D map operation in MonsterGo and texting. Most obviously, PhoneXR showed superior usability for the 2D tasks, but with a statistically significant difference for the texting operation only (p = 0.044 vs. LensXR; p < 0.001 vs. ClosedXR), as it (entering letters) involved much more physical load (see Figure 6).
Q5-Q8 assessed the UX for the AR experience and Q9-Q12 for the VR experience. For AR, except for Q5, which evaluated the ease of use and in which LensXR exhibited lower usability than PhoneXR (p = 0.007), no significant differences were found among the three platforms in terms of object presence, felt immersion and distraction level (see Figure 7). This result is consistent with that for the 2D operation usability. For VR, however, the general trend was that ClosedXR was the most satisfying in terms of overall presence, immersion, and distraction level, followed by LensXR (without a statistical difference), then PhoneXR (with significant differences; presence: p = 0.002 vs. ClosedXR; immersion: p < 0.001 vs. ClosedXR; distraction: p = 0.035 vs. ClosedXR). The image magnification and isolation contributed to the higher levels of presence, immersion and concentration for ClosedXR and LensXR (see Figure 8).
Finally, for the overall ease of multi-mode and multitasking usage (Q14), PhoneXR came out on top with a significant difference to LensXR (p = 0.028), but not to ClosedXR. The preference data, however, showed ClosedXR to be the clear top choice, with PhoneXR and LensXR a distant second and third (see Figure 9).
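The analysis pipeline used above (one-way ANOVA followed by Tukey HSD pairwise comparisons across the three platform conditions) can be sketched as follows in Python with SciPy. The ratings below are synthetic placeholders, not the study data (which are available from the authors on request); note also that this plain one-way ANOVA treats the groups as independent, whereas the study used a within-subject design.

```python
import numpy as np
from scipy.stats import f_oneway, tukey_hsd

# Hypothetical 7-point Likert ratings for one UX question, n = 24 subjects.
rng = np.random.default_rng(0)
phone_xr = rng.integers(3, 8, size=24)
closed_xr = rng.integers(2, 7, size=24)
lens_xr = rng.integers(1, 6, size=24)

# One-way ANOVA across the three platform conditions.
f_stat, p_value = f_oneway(phone_xr, closed_xr, lens_xr)
print(f"ANOVA: F = {f_stat:.2f}, p = {p_value:.3f}")

# Tukey HSD post-hoc pairwise comparisons (PhoneXR vs. ClosedXR vs. LensXR).
result = tukey_hsd(phone_xr, closed_xr, lens_xr)
print(result)  # table of pairwise mean differences, p-values and CIs
```

Running this per survey question yields the pairwise p-values of the kind reported in Figures 5-9.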

Experimental Design
Experiment 2 was conducted mostly similarly to Experiment 1. The difference was that instead of mainly looking into the usage of multiple modes in a single XR application (MonsterGo) with the direct mode switch interface built into the application itself, the second experiment focused on multitasking among several single-mode XR and non-XR applications and tasks (see Table 2). The experiment is again a one-factor (platform type), three-level, within-subject repeated-measures design. Dependent variables were the same as in Experiment 1. Based on the results of Experiment 1, we expected "ClosedXR" to exhibit the most satisfying overall user experience because of its superior level of presence and immersion for XR; its interaction usability might again not be as high as that of PhoneXR, but acceptable given the relative ease and low frequency of the interactions tested in this experiment. Note that in Experiment 1, texting, with the entering of many letters, was regarded as relatively demanding despite being an occasional task. On the other hand, the application switch or 2D app operation in this experiment, as in most usual cases, requires only a few actions, so the difference between using the familiar touch and ray-casting selection should be small.

Multitasking
Multitasking was carried out through the management functions of, in this case, the Android mobile operating system [25]. Specifically, the common 3-button navigation (Android version 9) at the bottom (in portrait mode) or side (in landscape mode) of the main Android display (see Figure 10a) was used [27]: the rightmost '<' button goes back to the previous application, the middle (rounded square) home button displays the app menu, and the leftmost button (three vertical bars) invokes the thumbnail display of the currently executing applications, which can be touch-selected for reactivation or focusing (App overview). As the XR applications essentially require the use of the whole screen, multitasking with the split-screen option was not used. The UX of the same three mobile platforms was compared.

Experimental Set-Up and Task
The three tested XR mobile platforms were set up in the same way as in Experiment 1. The user was asked to multitask among several applications/tasks each of which operated in three different modes-2D, AR, and VR.
The 2D tasks included web browsing and information search for particular topics like the weather, stock prices, zip codes, and particular news articles. They involved typing the web address (into the browser) and search keywords (into the search engine), as well as clicking on various buttons and icons. The AR applications involved directing the platform toward pre-prepared markers and viewing various augmented objects. The VR applications were viewing 360-degree videos and playing a simple tunnel navigation game. All applications were readily available through the standard Android Gridview, which laid out the application icons on the screen for the initial selection. Figure 10 shows the application menu and scenes from some of the multitasked applications. Subjects used the aforementioned 3-button navigation interface to select, reactivate and switch between the applications according to the instructions given by the experimenter. The instructed experimental task started by executing the mobile web browser (2D app); thus, for LensXR, the lenses were initially not clipped onto the phone. The subject was asked to go back and forth (or selectively reactivate) among the several applications 2-3 times.

Experimental Procedure
Experiment 2 also started by collecting the basic subject information and familiarizing the subject with the three platforms, the applications, and the experimental task. 18 subjects participated in this experiment (mean age: 23.7, SD = 3.34). 11 and 17 of the 18 participants had previous experience with AR and VR content, respectively, on various platforms.
In the main experimental session, the user experienced all three different platform treatments, each of which was presented in a balanced order. The whole session took about an hour. After each session, the user filled out the same user experience survey used in Experiment 1 (a few questions that were relevant only to Experiment 1 were excluded).

Results
Contrary to our expectation, Experiment 2 showed ClosedXR to be the most natural (Q1) and easiest-to-use (Q2) platform for multitasking among the three applications (see Figure 11). For Q1, ClosedXR showed the highest level of naturalness, with a significant difference from PhoneXR (p = 0.028). This was surprising given that multitasking management involves mostly 1D or 2D operations, which would normally be thought easier with PhoneXR.
Figure 11. Survey results regarding the multitasking. Statistical differences are indicated by color grouping and the "*" mark. The vertical axes represent the subjective levels on a 7-point Likert scale.
Q3 assessed the ease of use for the 2D application, namely the web browsing task. This too involved mostly 1D or 2D operations; in this case, however, as in Experiment 1, the expected result was obtained: PhoneXR and LensXR showed the highest usability for the 2D task, with statistically significant differences to ClosedXR (p < 0.001 vs. both PhoneXR and LensXR) (see Figure 12). Q5-Q8 assessed the UX for the AR experience and Q9-Q12 for the VR experience (Figures 13 and 14). For AR, similarly to Experiment 1, no statistically significant effects were found for any of the UX questions. The VR results also mirrored those of Experiment 1: ClosedXR was the most satisfying in terms of overall presence, immersion, and distraction level, followed by LensXR (without statistical differences), then PhoneXR (presence: p = 0.003 vs. ClosedXR; immersion: p = 0.005 vs. ClosedXR; distraction: p = 0.08 vs. ClosedXR). Finally, regarding the overall ease of multitasking usage, PhoneXR came out on top with significant differences to LensXR (p < 0.001) and to ClosedXR (p < 0.001). The preference data, however, again showed the contrary, as in Experiment 1: ClosedXR was the clear top choice, with PhoneXR and LensXR a distant second and third. See Figure 15.

Discussion and Conclusions
In this paper, we investigated how current mobile devices could support the concept of XR as a unified multi-mode platform. We conducted a study to evaluate and compare the XR user experiences of multi-mode and multitasking usage among three mobile platforms, namely PhoneXR, ClosedXR, and LensXR.
In summary, Experiments 1 and 2 showed perhaps the easily expected results: 2D operations were easier and more efficient with PhoneXR, and the immersive VR experience was best achieved with ClosedXR. The potential of LensXR as a balanced platform (between immersion and usability) was not borne out, mostly because the "Blurry" touch, that is, having to touch the screen while looking at it through the lenses, was more difficult than previously reported [24]. One alternative we have not explored is the use of a separate controller (as in ClosedXR) with LensXR. While this could improve the interactivity, the user would have difficulty holding both the mobile LensXR device and the controller in two hands. Nevertheless, interestingly, for both the simple multi-mode operation and the somewhat more involved multitasking, users apparently preferred ClosedXR over the others. Users saw LensXR not as a balanced but rather as a compromised solution that failed to fulfill any particular objective.
In general, it seems that subjects desired and assigned high value to clear support for immersion and presence on the given mobile platform, and were able to tolerate a moderate amount of usability problems, such as having to wear the headset and the somewhat inefficient 2D operations needed for simple app-specific interactions and mode/app switches. The biggest complaints arose when a relatively long text had to be entered using ray casting in ClosedXR.
It goes without saying that using separate platforms, e.g., the smartphone for regular 2D apps and a standalone headset for XR apps, would be very difficult and cumbersome. In a single integrated platform configuration, PhoneXR almost seems to lose its meaning, lacking the capability to provide sufficient immersion. Instead, improving the 1D/2D operations in the standalone ClosedXR headset for operating and multitasking with non-XR apps would be one way to weave XR into our lives. One good strategy is, as was done partly in this work, to use built-in multi-mode and multitasking switch functions, at least for those frequently multitasked apps (like texting, phone, and web search). Such an XR platform could potentially coexist or even compete with regular smartphones for serious XR users.

Institutional Review Board Statement: Ethical review and approval were not required for the study on human participants in accordance with the local legislation and institutional requirements.
Informed Consent Statement: A consent form was provided to and signed by the participants of the experiment.

Data Availability Statement:
The data presented in this study are available on request from the corresponding author.