Designing for Comfort in VR Public Speaking: How Avatar Realism and Natural Environments Shape User Experience and Stress Responses
Round 1
Reviewer 1 Report
Comments and Suggestions for Authors
I will provide my comments and suggestions below:
In the Introduction, where public speaking within VR-based training is discussed (lines 56-58), the sentence stating that studies have shown that VR exposure can effectively reduce public speaking anxiety and improve communication skills is not entirely clear. What I miss here is a reference to the great strength of VR, namely that what is rehearsed or trained in VR has the potential to transfer to the real scenario. Speakers will feel calmer or less anxious not only within VR, but also when speaking to a real audience. To me, this is not clear here. I would also add more detail: VR training has been shown to significantly improve how persuasive a message is perceived to be and how charismatic the speaker is perceived to be when speaking to a real audience after having trained in VR; see (Valls-Ratés, Ï., Niebuhr, O., & Prieto, P. (2023). Encouraging participant embodiment during VR-assisted public speaking training improves persuasiveness and charisma in secondary school students. Frontiers in Virtual Reality, 4. DOI: 10.3389/frvir.2023.1074062).
In line 52, and in some places before and after it, the reference Anderson et al., 2013 includes the authors' initials.
Regarding the Social Evaluative Threat part of the Theoretical Framework, I keep wondering: with VR we can actually decide what it is that we wish to trigger in the user. For high-anxiety speakers, we might want to trigger the anxiety that they would feel in reality, so that they can manage it and work with it. So here, it will very much depend on what our goal is with the anxiety. The aim will of course be to reduce it so that it interferes as little as possible with the speaker's performance in front of a real audience, but does it really need to be the same in the virtual world? Again, it will depend on what we really want to accomplish. Or maybe the very first sessions are aimed at making the speaker feel a very similar type of anxiety, and later sessions gradually make them feel more comfortable by switching to another kind of environment or type of avatar. I think it also makes sense to comment on this topic in lines 170-172.
Regarding Avatar Realism, there are also studies showing that it is not so much the realism as the stance or attitude of the avatars that triggers more anxiety in the speaker. Thus, a negative audience (distracted, not looking at the speaker, interrupting or making noise) will have a negative effect on the speaker's anxiety and therefore on the production of the speech; see (Rodero, E. & Larrea, O. (2022). Virtual reality with distractors to overcome public speaking anxiety in university students. Comunicar. 30. 10.3916/C72-2022-07.).
In the Methods section, page 268, when referring to Figure 1, I think that naming the different steps of the procedure exactly as they are named in the figure would be much clearer. Also, the figure is inserted in such a way that the last syllable of the last word in the line right above it is stranded below the figure, and the Figure 1 caption ends up on the next page.
The pictures in Figure 2 should be bigger, because as they stand we cannot actually see the differences between the realistic and stylised audiences. Also, please be consistent in whether or not the words in the figure are capitalised.
Questions I have:
Why did the authors conduct qualitative interviews instead of using a validated questionnaire on participants' experience with VR?
Why did the authors not ask participants for their self-assessed anxiety before giving the speeches, so that it could be compared with their physiological arousal results and with their experience at the end?
Why did the authors not have external raters assess the quality of the speeches, to examine whether lower arousal is related to better performance, or the other way around?
In my view, the purpose of using VR for public speaking in this experiment is not clear enough. Is it to make people more at ease in front of a real audience? Is it a one-time experience with VR and nothing more? I am not sure I understand the real purpose of having so many participants undergo the public speaking experiment, or how that purpose is stated in both the Design and procedure section and the Conclusion. In the Design and procedure section, it would also help to know what the participants were told about the experiment in the debriefing. Were they told that it was an experiment to see how much X they would feel? In the Conclusion, it would also help to clarify that, depending on the purpose of each VR training, the environments need to be adapted in order to achieve something specific. Using VR does not necessarily mean that the goal is to calm people down.
Author Response
Please see the attachment.
Author Response File:
Author Response.pdf
Reviewer 2 Report
Comments and Suggestions for Authors
This paper examines how avatar realism and environmental context in virtual reality influence user experience and physiological stress during public speaking tasks. Using a 2 × 2 experimental design, the study finds that natural environments improve subjective experience, while stylized avatars reduce physiological stress, indicating a dissociation between cognitive and physiological responses.
Although the paper is well written and addresses an interesting and relevant topic, I have several questions and concerns that should be addressed to strengthen the manuscript:
- Insufficient citations for key claims
There are multiple instances, particularly in the Introduction and Theoretical Framework sections, where claims are presented without adequate supporting references. For example, statements such as line 67 (“environmental context may influence emotional and stress-related responses…”) and line 120 (“social evaluative threat involves both conscious appraisal and automatic physiological activation…”) require proper citations. The authors should carefully review the manuscript and ensure that all claims not directly supported by their empirical results are grounded in prior literature.
- Justification of social evaluative threat sources
In line 116, the authors state that social evaluative threat in VR is “primarily conveyed through virtual human characters.” This claim requires further justification. Why would environmental context not also serve as a source of evaluative threat? The authors should provide theoretical or empirical evidence to support this claim or revise the statement to better reflect the broader literature.
- Inconsistency in theoretical logic
The Theoretical Framework suggests that both avatar realism (lines 139–143) and environmental context (lines 171–179) may influence social evaluative threat. However, the proposed dual-pathway framework later separates these effects, attributing subjective experience to environmental context and physiological stress to avatar realism (lines 190–194). This separation is not sufficiently justified. The authors should provide clearer reasoning and theoretical support for distinguishing these pathways.
- Weak theoretical grounding of the dual-pathway framework
While the dual-pathway framework is interesting, it currently appears somewhat ad hoc and not strongly derived from established theory. The authors are encouraged to strengthen their theoretical foundation by drawing on relevant frameworks, such as dual-process theories or social presence theory. In particular, the manuscript should clarify:
Why would avatar realism primarily operate through implicit (physiological) pathways? And why would the environmental context not directly influence physiological responses?
- Definition and measurement of user experience
In Hypothesis 1, “user experience” is not clearly defined. The authors should specify how this construct is operationalized and which survey components (e.g., satisfaction, usability, etc.) contribute to it.
- Participant recruitment and sample characteristics
The Methods section lacks sufficient detail regarding participant recruitment. The authors should clarify:
- Whether participants were volunteers or recruited through coursework
- The proportion of undergraduate vs. graduate students
- Whether differences in experience levels may influence results
- The reason for the gender imbalance (73.5% female) and whether this may introduce bias
- It may also be valuable to explore whether gender plays a moderating or mediating role in the observed effects.
- Clarification of measurement construction
It is unclear how composite variables such as UX, US, TU, perceived support, and perceived threat were calculated from the survey items. The authors should explicitly describe how these measures were derived (e.g., item averaging, factor scores) and ensure transparency, particularly regarding Table 1.
- Scope of mediation analysis
The mediation analysis only examines perceived audience support and perceived threat as mediators of avatar style. Given that both avatar realism and environmental context are manipulated variables, it is unclear why environmental context was not included in the mediation analysis. The authors should justify this choice or consider extending the analysis.
- Practical implications could be strengthened
The Discussion section would benefit from more concrete design recommendations. For example, under what conditions should designers use stylized versus realistic avatars? What trade-offs exist between realism, immersion, and user comfort?
- Future research directions: audience size
While the study controls for audience size (line 286), this factor may itself significantly influence stress responses. Future research could investigate audience size as an additional variable, since perceived stress likely varies with the number of observers.
Overall, this is a promising study with a solid experimental design and meaningful contributions. Addressing the concerns above would significantly strengthen the theoretical rigor, methodological transparency, and practical impact of the paper.
Author Response
Please see the attachment.
Author Response File:
Author Response.pdf
Reviewer 3 Report
Comments and Suggestions for Authors
Virtual reality (VR) is increasingly used in public speaking training. This article investigates the influence of the avatar's visual style (realistic or stylized) and the type of scene (natural or indoor) on subjective experience and physiological stress, and the respective roles of the environmental context and the design of the virtual audience.
The introduction sets the scene and presents previous research, highlighting its gaps and limitations. The section on the theoretical framework explores aspects related to the feeling of social evaluation threat, the impact of avatar realism on social presence, and its influence on negative or positive perception, drawing on the uncanny valley hypothesis. It also addresses the impact of the nature of the environment (natural, urban, professional, etc.). The use of a dual-track framework is proposed, suggesting that public speaking experiences in virtual reality could be shaped by two partially independent mechanisms.
Three relevant hypotheses are proposed. 1) Natural environments in virtual reality will induce a more positive subjective user experience; 2) Stylized virtual audiences will elicit weaker physiological stress responses; 3) Scene type and avatar realism are expected to have largely independent effects on user experience and physiological stress responses, with limited or insignificant interactions.
The virtual experience was completed by 132 participants in a 2 × 2 between-subjects design. The design and methodology are clearly outlined. The experimental setup and virtual environments are well described.
Subjective experience was assessed using standardized questionnaires, while physiological responses were measured by electrodermal activity and heart rate variability, supplemented by post-experimental interviews.
The measurements taken and the data processing are clearly specified. The analysis of the results is described in great detail and with relevance.
The ensuing discussion builds upon the preceding sections and offers conclusions that confirm the first two hypotheses. Regarding the third, the result is less clear: contrary to expectations concerning interaction effects, the results did not reveal a significant interaction between scene type and avatar style. Rather than considering this a null result, the authors argue that this finding empirically supports the choice of the dual-channel model.
The authors then objectively outline some limitations related to the low diversity in the test population, with rather static situations leading to low expressiveness, and the fact that the physiological measurements were based on short-term responses. Suggestions for future research are offered. In conclusion, the main results presented in the article are summarized. These results expand upon previous findings and may be useful for future work in broader application areas.
Note: line 267: the end of the sentence is truncated (no text before 4.3)
Author Response
Please see the attachment.
Author Response File:
Author Response.pdf
Round 2
Reviewer 1 Report
Comments and Suggestions for Authors
I thank the authors for taking the comments and suggestions seriously and for improving the manuscript based on them.
I only see minor things, especially regarding the in-text citations. Some brackets have no space after the last letter of the preceding word, and some citations with more than one reference use separate brackets for each reference cited. In the final reference list, I see inconsistency in how journals are reported: some names are abbreviated, while others are not.
Regarding the title of the article, I would not have the word REALISM hyphenated, but moved to the next line of the title.
Line 546: two words are run together.
Regarding Figure 2, I still believe that something more can be done so that the differences between stylised and realistic avatars and between natural and indoor environments can be seen. Maybe the speech tasks could be placed above the stimulus and audiences?
Thank you and good job!
Author Response
Please see the attachment.
Author Response File:
Author Response.pdf
Reviewer 2 Report
Comments and Suggestions for Authors
The revision looks good to me. Nice job!
Author Response
Thank you very much for your positive comment and recognition of our revised manuscript. We really appreciate your time and valuable efforts in reviewing our work.
