Towards a Taxonomy of Feedback Factors Affecting the User Experience of Augmented Reality Exposure Therapy Systems for Small-Animal Phobias †

Small-animal phobias have been treated using in vivo exposure therapies (IVET) and virtual reality exposure therapies (VRET). Recently, augmented reality for exposure therapies (ARET) has also been presented and validated as a suitable tool. In this work we identify an ensemble of feedback factors that affect the user experience of patients using ARET systems for the treatment of small-animal phobias, and propose a taxonomy to characterize this kind of application according to the feedback factors it uses. Further, we present a customized version of the taxonomy by considering factors/attributes specific to the visual stimuli. To the best of our knowledge, no other work has identified or provided an explicit classification or taxonomy of factors that affect the user experience of patients using this kind of system for the treatment of small-animal phobias. Our final aim is two-fold: (i) to provide a tool for the design, classification and evaluation of this kind of system, and (ii) to inspire others to conduct further work on this topic.


Introduction
A specific phobia (or simply phobia) is defined as a persisting fear of an object, situation or activity that does not ordinarily justify fear [1]. Specific phobia is the most common lifetime anxiety disorder, with prevalence rates in the United States estimated to be as high as 15.6 percent [2]. According to the American Psychiatric Association [3], in the case of small-animal phobias, prevalence rates are between 3.3% and 7%. An individual suffering from a phobia experiences excessive anxiety when exposed to a given stimulus; the trigger stimulus may be a specific entity (e.g., an animal) or situation (e.g., being in a closed place). In any case, excessive and unrealistic fear of the stimulus can lead to avoidance behaviors that interfere with the subject's life.
Several studies suggest that exposure-based treatment is effective in the treatment of phobic fear and avoidance behavior (e.g., [4,5]). The exposure component of therapy generally involves a gradual hierarchical exposure to the object of fear, in a safe and controlled manner. The purpose of the exposure is to help the patient learn convincingly that the consequences s/he fears do not necessarily happen. Historically, this type of therapy is performed in vivo (In Vivo Exposure Therapy, IVET), with the patient facing the real stimulus or a physical representation of it. However, the main limitation of this technique is that a patient may not want to face the object of his/her fear in that way. In fact, it has been reported that when patients find out that the therapy involves facing the threat, about 25% of them refuse to perform the therapy or terminate it prematurely [6][7][8].
The use of technology has enabled the search for less threatening and more practical alternatives to IVET, which has led to exploring the use of Virtual Reality (VR) and Augmented Reality (AR) in exposure therapies. With respect to their effectiveness, several studies have shown that both VRET and ARET are as effective as IVET for the treatment of specific phobias [9][10][11][12][13]. In VRET and ARET systems, stimuli are generated by means of virtual elements that are used to quantify and qualify the experience of use [14]. The objective of these stimuli is that the patient perceives that the object s/he fears is present [15].
Unlike the IVET modality, in VRET and ARET systems the presence of the object can be generated through gradual and systematic exposure in order to reduce the probability of the patient leaving the therapy; this is achieved by means of the system feedback, either virtually or through augmented reality [16]. The generation of adequate feedback motivates the user to interact with a system [17]. According to Vitense et al. [18], the auditory, haptic and visual modalities are the most prominent types of feedback in augmented reality systems.
In this work, our objective is to identify some design factors that allow generating the presence of objects of fear in ARET systems. In particular, we are interested in (i) identifying feedback factors that affect the experience of use of ARET systems for the treatment of small-animal phobias, and (ii) defining a taxonomy to characterize this type of application according to the type of feedback factors used, which can eventually support the design, classification and evaluation of this type of system.

Mixed Reality Technologies for Exposure Therapies
Various proposals for exposure therapies using VR and AR technologies, which fall under what are called Mixed Reality (MR) technologies, have been reported in the literature (see Figure 1). On the one hand, in exposure therapies with VR (VRET) the patient is immersed in a virtual environment (VE) where s/he faces a virtual representation of the object of his/her fear [12,19]. In contrast to VRET, which generates a complete virtual environment, in AR exposure therapies the real environment is augmented by introducing synthetic or virtual elements [20][21][22][23][24]. That is, while VR substitutes the physical environment with a virtual one, AR uses virtual elements to build on the existing physical environment. A fourth case, which we do not address in this work but that arises from this characterization, is Augmented Virtuality (AVET), where the user is immersed in a virtual environment into which real objects are introduced. In addition to having the advantages of IVET, such as the continuous exposure of individuals to (a virtual representation of) the object of their fear, VRET also presents other advantages such as better control of anxiety, and the absence of a real threat (e.g., a virtual spider cannot sting) [25]. For instance, different VRET systems have been proposed for the treatment of arachnophobia [25][26][27][28][29]. In the aforementioned works, the main findings suggest that VRET improves users' interest in therapy with respect to IVET [29], that it significantly reduces fear with short- and long-term benefits when using 4 different spiders [28], and that there are significant differences in the Spider Beliefs Questionnaire subscale (SBQ-F) after treatment with VRET with respect to IVET [27].
ARET presents the same advantages as VRET, "that is, total control over the way the exposure is conducted, easier access to the threatening stimuli, no risk of real danger to the patient, the possibility of going beyond reality, and confidentiality, but it can be less expensive than VR because it is not necessary to model the whole environment" [11]. The evaluation results of [11] indicate that users accepted the ARET and that the treatment was effective. Besides, the main result of [30], in which a projector-based ARET system for the treatment of cockroach phobia is used, indicates that when compared to IVET and VRET systems, the ARET system has some advantages in terms of communication between the patient and the therapist, and of natural interaction with the system. On the other hand, the evaluation of an ARET system for the treatment of cockroach phobia presented in [22] shows that it induces similar sensations of presence and anxiety when using either a typical head mounted display or an optical head mounted display. According to Juan and Joele [15], the goal of exposure systems must be to generate real or virtual stimuli so that the patient perceives the presence of the object of his/her fear (animal or situation).
Under this perspective, as shown in Figure 2, an exposure therapy system allows a patient to experience the presence of an object of fear in a particular environment through exposure to it. Moreover, exposure is achieved through the interaction of the subject by manipulating the environment and the object of fear and by receiving the stimuli resulting from that manipulation through the feedback they provide. It should be noted that, considering the use of VR and AR, the environment, the object of fear and the feedback can be real and/or virtual. It should also be mentioned that manipulation can be implicit (without the need for explicit action on the part of the patient) or explicit, as well as direct (when interacting with a real physical object) or mediated by technology.

Method
In order to identify the interaction design characteristics that allow generating the presence of objects of fear, a qualitative systematic review of the literature was conducted (following the guide presented in [31]), focusing on works that report studies of mobile AR applications for the treatment of small-animal phobias.
This led to the following research questions (RQs):
• RQ1: What are the design features of AR mobile applications for small-animal phobia treatment?
• RQ2: How do these attributes impact the experience of use of the participants?
According to the research questions, the keywords identified were: mobile applications, small-animal phobia, user experience, and design guidelines, which were grouped to formulate the search strings. The search was performed in different databases, namely Science Direct, Springer, ACM, PUBMED, and IEEE. In total, 3459 results were obtained.
The works were pre-selected based on their titles and abstracts, and in some cases on the full text. The inclusion criteria considered were:
- Studies that present mobile applications for the treatment of small-animal phobias.
- Studies that indicate or evaluate design aspects that impact the user experience.
The following criteria state when a study was excluded:
- Studies that do not explain the design characteristics of the applications or that are focused on the effectiveness of the treatment.
- Mobile applications for other phobias.
- Studies presenting non-peer-reviewed material.
The inclusion and exclusion criteria were applied independently by two of the authors and their results were later compared. In total, 63 works were pre-selected by applying the inclusion criteria. When the results of both researchers were compared, 13 studies had been selected by both. These works were then analyzed and 5 were eliminated because they contained repeated results, leaving a total of 8 studies.
Finally, to extract the data from the chosen studies, the works were classified by the title of the article, the name of the authors, the year of publication, the approach and the design characteristics observed. The design characteristics were extracted and classified using grounded theory [32] individually by two researchers, then compared and refined. The classification of design factors was made according to the feedback modality, e.g., visual, auditory, haptic. Finally, the proposed taxonomy was generated considering the sense of presence of the virtual elements that are used to quantify and qualify the user experience. The identified studies are briefly explained next.

Augmented Reality Exposure Therapy Systems for Small-Animal Phobias
In Juan, Botella et al. [33], a system is described in which different characteristics of the object of fear (i.e., cockroach) can be set so that the treatment can be progressive. For example, it considers the size of the cockroach, the number of cockroaches, and the way the cockroaches look (zoom in/zoom out). Cockroaches move repetitively and differently depending on where they appear, and return to their initial position. During the therapy, the user can kill the cockroaches with a flyswatter and a cockroach killer, and listen to the corresponding sound. The evaluation was conducted with a 26-year-old female who showed the maximum anxiety level on a ten-point scale before and after the session (measured using the subjective units of discomfort scale, SUDS [34]). Evaluation results suggest that the use of the AR system generates anxiety in the patient given the presence of the object of fear.
In addition, Juan, Alcaniz et al. [35] present a system in which the type of object of fear (i.e., spider, cockroach) and its size (small, medium, and large) can be set, with the most realistic appearance possible, so that its structure, movement and texture can be analyzed. There are three states or movements that can be observed in the animals: static, moving, and dead. The spider moves its legs, while the cockroach moves its feelers and legs. There are three markers that the system recognizes: insecticide, flyswatter, and dustpan. During the execution of the therapy the user can kill the animals (using the insecticide or flyswatter) and throw the animal into the dustpan. Nine participants evaluated the system, 5 with cockroach phobia and 4 with spider phobia, with a severe phobia diagnosis of 6 to 10 (on a scale from 0 to 10). The results on experience of use suggest that participants feel the presence of the fear objects, and perceive them as real.
Moreover, Juan and Joele [15] describe a system that has the same characteristics as the one presented in [35]; however, it also includes the option to visualize the virtual elements without using visible markers. The results of an evaluation with 25 subjects with a diagnostic lower than 97 using the fear and avoidance questionnaire (scale 0 to 16) [36] suggest that it is better to use invisible markers in AR systems, given that 100% of the participants indicated that they were more surprised when observing the object of fear.
Additionally, in Botella, Breton-López et al. [16] the "Cockroach Game" is presented. The system is a puzzle game that uses a mobile phone as the application device, and its main objective is for the user to interact with the object of fear (i.e., cockroaches) while matching the pieces of a puzzle. The game has two scenarios with different levels of difficulty. In the first scenario, the screen option, the user can see the cockroaches on various surfaces that are displayed on the phone screen; the virtual insects are shown on the patient's winter shoes (closed toe) in the first level, on the patient's summer shoes (open toe) in the intermediate level, and on a patient's hand in the advanced level. The second scenario includes a camera option which allows the users to see the virtual cockroaches on real surfaces (e.g., on their real clothes, on their real hands, etc.). In order to complete the puzzle, the user must obtain the pieces that comprise it by "killing" cockroaches, after interacting with them. In addition, the application can play sounds such as applause while the user acquires puzzle pieces during the game. Evaluation results regarding the experience of use of a subject with a score higher than 4 in phobia avoidance (scale 0 to 8) indicate that the anxiety level decreased, which suggests that the subject became better able to endure the presence of the object of fear.
Next, in Botella, Pérez-Ara et al. [11] we found an application that uses markers consisting of white squares with black borders that contain symbols or letters. When the camera finds a marker in the real world, the program recognizes it and activates the virtual objects of fear (cockroaches). The virtual insects have structure, movements and texture similar to real cockroaches. The system includes 3D spiders and cockroaches and enables real-time interactivity, so that participants can see, through the display device, the actual place where they are located and the feared stimuli (spiders and cockroaches) in the same place. The system allows several variables to be modulated: the number of animals, the movements of the animals, their size (small, medium and large), the type of spider, and, finally, the possibility of displaying the animal on various surfaces. All of these combined options enable the therapist to apply the treatment progressively. The system also allows participants to "kill" cockroaches. The results of an evaluation with 64 subjects with fear and avoidance scores of at least 4 (scale 0 to 8) show that IVET and ARET treatments generate similar fear and avoidance scores. This suggests that in both treatments the perception of the presence of the object of fear is similar.
In addition, in Ramírez-Fernández et al. [23] we found a system that, during therapy execution, allows the user to interact with a virtual object of fear (i.e., animal) and to receive visual, auditory and haptic feedback. The TPAD Phone is used as a haptic device to provide tactile stimulus to the user when interacting with the animal. The interaction varies depending on the level of phobia detected in the diagnosis, divided into three levels of challenge: (i) image of a playful animal in 2D (for severe phobia), (ii) image of a real animal in 2D (for moderate phobia), and (iii) 3D augmented reality animal (for mild phobia). The results of an evaluation with 14 participants with a diagnosis of mild to severe phobia suggest that the automatic assignment of the challenge level of the interaction based on the diagnosis can be useful to improve the experience of using AR systems.
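The diagnosis-based assignment described for [23] can be sketched as a simple lookup. This is only an illustration of the three-level mapping stated above; the names `Severity`, `CHALLENGE_BY_SEVERITY` and `assign_challenge` are hypothetical and are not taken from the cited system.

```python
from enum import Enum


class Severity(Enum):
    """Diagnosed phobia level (hypothetical identifiers)."""
    MILD = "mild"
    MODERATE = "moderate"
    SEVERE = "severe"


# Mapping taken from the three challenge levels described in the text:
# the more severe the phobia, the less realistic the stimulus.
CHALLENGE_BY_SEVERITY = {
    Severity.SEVERE: "2D image of a playful animal",
    Severity.MODERATE: "2D image of a real animal",
    Severity.MILD: "3D augmented reality animal",
}


def assign_challenge(severity: Severity) -> str:
    """Return the challenge level for a diagnosed severity."""
    return CHALLENGE_BY_SEVERITY[severity]


if __name__ == "__main__":
    for level in Severity:
        print(level.value, "->", assign_challenge(level))
```

How the severity itself is obtained from the diagnosis instrument is not specified here and is left outside the sketch.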
Additionally, Twiz Spider Phobia [37] is a system that has two versions: the non-immersive ARET version only requires a smartphone, while the immersive ARET version requires AR lenses. During therapy execution, the virtual object of fear (i.e., spider) is visualized in the area in which it was called. It moves its limbs based on the proximity of the user. In addition, the duration of exposure to the spider depends on the user's abilities or the specialist's assignment.
Finally, Rich Apps 4D Spider On a Hand Simulator [38] is a system that allows the user to select among 5 variants of the object of fear (i.e., spider). During the execution of the therapy, the selected spider is automatically displayed in augmented form and moves while emitting sounds and simulating a bite on the user every 3 s. The user can finish the execution of the therapy when required.
Later, these systems were analyzed to identify design features that could affect the experience of use of the patients. The observed characteristics were obtained by classifying the design factors using grounded theory [32]. From axial coding, three categories were identified according to the modality of feedback: visual, auditory, or haptic, which, according to Vitense et al. [18], are the most prominent types of feedback in AR systems.
Visual feedback provides information on the identification and location of the objects of fear (real or virtual), as well as on the actions allowed in the interaction with those objects [39], among others.
Auditory feedback enriches the information transferred to the user when interacting with AR systems [17]. The advantages of auditory feedback are evident when manipulating any object that can produce sound, e.g., a bell, a door, a mouse. For instance, if we press a button that we expect to produce a sound and it does not produce any, we try to press it again until we hear that our request took effect.
Haptic feedback is a crucial sensory modality in the interactions of VR or AR systems [40].It involves force feedback, e.g., the hardness of an object, the weight, and inertia; tactile feedback, e.g., the surface contact geometry, smoothness, sliding, and temperature; and perceptual feedback, e.g., detection of the user's body position, and posture [41,42].
Table 1 shows some of the design characteristics identified in the analysis. It includes the reference to the system, the type of configuration (established manually by the therapist or automatically by the system), the level of exposure to the object of fear (single level or adaptable according to configuration) based on the user's diagnosis (established using e.g., fear and avoidance scales, DSM-IV instrument), and the kind of feedback (visual, auditory and haptic).
Configuration Type: manual by therapist (MT), automatic by the system (AS).
Level of Exposition: according to configuration (AC), according to patient's answers (ApA), single level (SL).
Visual: two dimensions (2D), three dimensions (3D), real (R), cartoon (C), 2D image (I), augmented immersive (AI), augmented non-immersive (AnI), not animated (NA), random fixed (RF), random responsive (RR), diverse sizes (DS), unique size (US), fixed (F).
Auditory: enjoyable (E), natural (i.e., like the animal) (N), static (S), not included (NI).
Haptic: not included (NI), simulated (Si), static (S).
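As an aid to reading Table 1, the legend can be written as a lookup structure. The following is a minimal sketch that only transcribes the abbreviations listed above; the `LEGEND` and `decode` names, and the assumption that a table cell holds comma-separated codes, are illustrative and not part of the original work. Codes are keyed by column because some abbreviations (S, NI) are reused across the auditory and haptic columns.

```python
# Legend of Table 1, transcribed per column (hypothetical structure).
LEGEND = {
    "configuration": {
        "MT": "manual by therapist",
        "AS": "automatic by the system",
    },
    "exposure": {
        "AC": "according to configuration",
        "ApA": "according to patient's answers",
        "SL": "single level",
    },
    "visual": {
        "2D": "two dimensions", "3D": "three dimensions",
        "R": "real", "C": "cartoon", "I": "2D image",
        "AI": "augmented immersive", "AnI": "augmented non-immersive",
        "NA": "not animated", "RF": "random fixed",
        "RR": "random responsive", "DS": "diverse sizes",
        "US": "unique size", "F": "fixed",
    },
    "auditory": {
        "E": "enjoyable", "N": "natural",
        "S": "static", "NI": "not included",
    },
    "haptic": {
        "NI": "not included", "Si": "simulated", "S": "static",
    },
}


def decode(column: str, cell: str) -> list[str]:
    """Expand a cell of comma-separated codes (assumed format) into labels."""
    table = LEGEND[column]
    return [table[code.strip()] for code in cell.split(",")]


if __name__ == "__main__":
    print(decode("visual", "3D, R, AnI"))
    print(decode("haptic", "Si"))
```

An unknown code raises a `KeyError`, which makes transcription mistakes in a table row visible rather than silently ignored.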

Proposed Taxonomy of Factors that Affect the Experience of Use of ARET Systems for Small-Animal Phobias
According to Baus and Bouchard [14], in VRET and ARET exposure systems, stimuli are generated by means of virtual elements that are used to quantify and qualify the user experience. The objective of these stimuli is that the patient perceives the object of his/her fear as present in the environment [15]. The experience of use of the patient is affected by the perception of the presence of the feared object and the immersion in the environment surrounding that object. In this proposal, however, we have mainly focused on characterizing the design elements related to generating the presence of the object of fear.
Considering the use of ARET technologies, the presence of the object of fear can be achieved through gradual and systematic exposure in order to reduce the probability of the patient abandoning the therapy; this is achieved by means of feedback from the system, either virtually or through AR [16].
As shown in Figure 3, the Presence of the object of fear regarding experience of use depends on the following three elements: Realism, Intensity and Interaction [14,43,44,45]. A brief description of these elements, along with that of the additional elements they comprise, is provided next.
(a) Realism refers to the degree of convergence between the user's expectations and the current experience of the virtual element [46]. It is defined by: (i) the number of dimensions in which the object is presented, and (ii) the appearance of the object, which may correspond to reality or be an abstraction. For example, in Juan, Alcaniz et al. [35] cockroaches are presented in 3D using realistic images, while in Ramírez-Fernández et al. [23] the spider can be presented in 2D or 3D and as a cartoon or as a picture of a real spider.
(b) Interaction refers to the presence or absence of interactive elements that react (or not) to the actions of the user. Thus, the interaction is defined primarily by the absence or presence of behavior. If there is behavior, it can be: (i) fixed and with a unique pattern; (ii) random; (iii) controlled by the therapist; or (iv) responsive, that is, responding dynamically to the patient's actions. For example, in Botella, Breton-López et al. [16], during the game the objects of fear (spiders or cockroaches) appear in a predefined way. On the other hand, in Botella, Pérez-Ara et al. [11] the objects of fear are activated when the camera detects the included markers, and in Twiz Spider Phobia [37], the spider moves its extremities according to the detected proximity of the user.
(c) Intensity refers to the levels of variability of the object that is used as a stimulus. This includes: (i) the number of instances, (ii) the size of the instance, (iii) the type or aversiveness of the instance (e.g., a little mouse, a young rat, or an old and dirty rat), and (iv) the type of behavior or aggressiveness level of the instance (e.g., a fearful dog, a happy dog, or an aggressive dog). For example, as described in Botella, Pérez-Ara et al. [11], in several of the systems we found that the type (e.g., tarantula or black widow), size (e.g., large, medium, small) and number of objects (e.g., from 1 to 60) can be specified by the therapist to progressively adapt the treatment, gradually modifying the intensity of its effect.
As an instance of this general taxonomy, consider the customization presented in Figure 4, where attributes specific to visual stimuli have been included. Concerning Realism, as can be seen in the figure, visual elements may include 2D or 3D images, or even movies or animations (4D). Further, these may represent abstract elements, such as icons or cartoons, or realistic elements, such as captured photos or photorealistic images (e.g., Figure 5). Regarding Interaction, the visual elements may present no behavior, such as still images, or may present behaviors with movement along a fixed predefined path, or even along random paths. Furthermore, these behaviors can be more sophisticated by responding to actions that are specified by the therapist at run-time or even responding in a congruent manner to the actions of the patient while s/he performs the therapy.
Finally, concerning Intensity, there could be 1, or 2-5 (few) instances of the object of fear, or even 60 or one hundred of them (many). Further, each instance of the object of fear can be represented in a small, medium, large or huge size. Additionally, the visual representation of the object of fear may be more aversive (e.g., an "ugly" adult spider) or less aversive (e.g., a "cute" baby spider) (see Figure 6). Finally, this visual representation may present a very aggressive behavior (e.g., a spider that is continuously attempting to bite the patient) or a less aggressive behavior (e.g., a spider that is asleep). Similar customization exercises should be conducted in order to identify attributes specific to the auditory and haptic stimuli.
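To make the visual customization of the taxonomy concrete, it can be expressed as a small data structure. The following is a minimal sketch under stated assumptions: all class and field names are hypothetical, aversiveness and aggressiveness are reduced to two levels for brevity, and the treatment of instance counts between 6 and 59 as "many" is our own assumption, since the text only names 1, 2-5 (few) and 60 or one hundred (many).

```python
from dataclasses import dataclass
from enum import Enum


class Dimensions(Enum):          # Realism: number of dimensions
    TWO_D = "2D"
    THREE_D = "3D"
    FOUR_D = "4D"                # movies or animations


class Appearance(Enum):          # Realism: appearance
    ABSTRACT = "abstract"        # icons, cartoons
    REALISTIC = "realistic"      # photos, photorealistic images


class Behavior(Enum):            # Interaction
    NONE = "none"                # still image
    FIXED_PATH = "fixed path"
    RANDOM_PATH = "random path"
    THERAPIST_CONTROLLED = "therapist controlled"
    RESPONSIVE = "responsive"    # reacts to the patient's actions


class Size(Enum):                # Intensity: size of each instance
    SMALL = "small"
    MEDIUM = "medium"
    LARGE = "large"
    HUGE = "huge"


class Level(Enum):               # Intensity: simplified two-level scale
    LOW = "low"
    HIGH = "high"


@dataclass(frozen=True)
class VisualStimulus:
    dimensions: Dimensions
    appearance: Appearance
    behavior: Behavior
    instances: int
    size: Size
    aversiveness: Level
    aggressiveness: Level

    def __post_init__(self) -> None:
        if self.instances < 1:
            raise ValueError("at least one instance is required")

    def instance_bucket(self) -> str:
        """Classify the number of instances as one, few or many."""
        if self.instances == 1:
            return "one"
        if self.instances <= 5:
            return "few"
        return "many"  # assumption: anything above 5 counts as many


if __name__ == "__main__":
    spider = VisualStimulus(Dimensions.THREE_D, Appearance.REALISTIC,
                            Behavior.RESPONSIVE, 3, Size.MEDIUM,
                            Level.LOW, Level.LOW)
    print(spider.instance_bucket())
```

A structure of this kind could be used to describe and compare existing systems along the same attributes; equivalent structures for auditory and haptic stimuli would require the additional customization exercises mentioned above.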

Discussion, Conclusions and Future Work
In this work, we present the results of our work-in-progress proposal of a taxonomy of factors that affect the user experience of patients using ARET systems for the treatment of small-animal phobias. The taxonomy organizes an ensemble of general factors and attributes identified from evidence found in the literature and from our previous work on PhobyTherapy [23].
In the current version of the taxonomy we propose three main categories of factors for the concept of Presence of the object of fear in exposure therapy: Realism, Interaction and Intensity. Further, we have presented a customized version of the taxonomy by considering factors/attributes specific to the visual stimuli. To the best of our knowledge, no other work has identified or provided an explicit classification or taxonomy of factors that affect the user experience of patients using this kind of system for the treatment of small-animal phobias.
The importance of this version of the taxonomy is that it can now be used as a foundation to classify, compare and even inform the design of existing or new ARET systems for the treatment of small-animal phobias, particularly from a visual stimuli perspective. In addition, this taxonomy can be used to facilitate communication with specialists for the collaborative design of new tools that consider these aspects from a more clinical point of view according to their experience with traditional therapies. For instance, once factors and attributes such as these have been identified, it is possible to establish an adaptation mechanism for the impact of the provided visual stimuli, increasing or decreasing the possible impact by moving along the "continuum" specified by the different values of each factor. As an example, consider the representations in Figure 5; according to the specialists consulted during the design of PhobyTherapy, the aversiveness of the smaller (baby) spider representation is lower than that of the larger (adult) spider representation. A similar example can be given considering the realism of the abstract and realistic representations of the spiders in Figure 4.
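The idea of moving along the "continuum" of a factor can be illustrated with a short sketch. It assumes that the values of a factor can be placed in an ordered list from lowest to highest impact; the ordering of sizes below follows the text, while the `step` function and its clamping behavior are hypothetical design choices rather than a mechanism reported in the reviewed systems.

```python
# Ordered values of one factor, from lowest to highest expected impact.
SIZE_CONTINUUM = ["small", "medium", "large", "huge"]


def step(continuum: list[str], current: str, delta: int) -> str:
    """Move `delta` positions along a factor's continuum.

    Positive deltas increase the expected impact of the stimulus and
    negative deltas decrease it. The result is clamped to the ends of
    the continuum (an assumption made for this sketch).
    """
    index = continuum.index(current) + delta
    index = max(0, min(len(continuum) - 1, index))
    return continuum[index]


if __name__ == "__main__":
    print(step(SIZE_CONTINUUM, "medium", +1))  # one step more intense
    print(step(SIZE_CONTINUUM, "small", -1))   # already at the lowest value
```

The same function could be applied to any other factor whose values admit an ordering, such as the number of instances; whether and when to step up or down would remain a clinical decision of the therapist.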
The main limitation of our work is that the current version of the taxonomy has been developed focusing chiefly on the visual stimuli. Similar customization exercises should be conducted in order to identify attributes specific to the auditory and haptic stimuli and thus complement the proposed taxonomy. Further, it should be highlighted that the proposed taxonomy is not intended to be comprehensive, either in the categories considered (three of them) or in the factors identified for each category. These limitations are proposed as directions for future work, as our final aim is, on the one hand, to furnish a tool for the design, classification and evaluation of this kind of system, and on the other, to attract attention and inspire others to conduct further work on this topic.

Figure 1. User experience in exposure therapies with Mixed Reality technologies.

Figure 2. Interaction cycle (Manipulation-Feedback) between the subject, the environment and the object of fear in the exposure therapy system.

Figure 3. Proposed taxonomy of factors that affect the experience of use of ARET systems for small-animal phobia.

Figure 4. The proposed taxonomy including attributes specific to visual stimuli.

Figure 5. Example of abstract and realistic images for a spider.

Figure 6. Example of intensity: number of instances and size of the instance.

Table 1. Design features observed in the augmented reality systems for the treatment of small-animal phobias.