Article

Exploratory Research on the Potential of Human–AI Interaction for Mental Health: Building and Verifying an Experimental Environment Based on ChatGPT and Metaverse

1 School of Human Sciences, Waseda University, Tokorozawa 359-1192, Japan
2 Faculty of Human Sciences, Waseda University, Tokorozawa 359-1192, Japan
3 IMSL Shenzhen Key Lab, PKU-HKUST Shenzhen Hong Kong Institute, Shenzhen 518057, China
* Author to whom correspondence should be addressed.
Appl. Sci. 2025, 15(20), 11209; https://doi.org/10.3390/app152011209
Submission received: 21 August 2025 / Revised: 10 October 2025 / Accepted: 13 October 2025 / Published: 20 October 2025

Abstract

The demand for mental health support has highlighted the potential of conversational AI and immersive metaverses. However, each of these technologies has weaknesses: AI agents are intelligent but often disembodied, while metaverse environments provide a sense of presence but typically lack dynamic, intelligent responsiveness. To address this gap, we designed and verified an experimental environment that integrates a conversational AI agent, powered by ChatGPT, into a metaverse platform. We conducted a within-subjects experiment with 15 participants who interacted with the agent in both the immersive metaverse and a standard text-chat interface to investigate user preferences and subjective experiences. After the experiment, participants completed a questionnaire whose scores capture their preferences and subjective experiences. The results showed only slight differences in scores between the two conditions. Notably, however, qualitative feedback revealed that all participants subjectively rated the AI-Metaverse condition as better. This exploratory study demonstrates the potential of human–AI interaction for mental health support, which should be investigated further.

1. Introduction

In recent years, the global demand for mental health services has increased significantly. According to the WHO Report [1], over 150 million people in the European Region were living with a mental health condition in 2021. This trend was exacerbated by the COVID-19 pandemic, as changes to social isolation and connection patterns increased the risk of mental disorders such as depression, anxiety, and loneliness [2]. However, traditional mental healthcare systems often struggle to meet this growing demand due to significant barriers, including high costs [3], long waiting times for professional services [4], and patient concerns over privacy protection [5]. As a result, many individuals in need of help do not receive timely and effective support. This challenge has catalyzed a turn toward digital technologies as researchers and clinicians seek scalable, affordable, and accessible solutions. In the past several years, two of the most promising frontiers in this area have been Artificial Intelligence (AI) and immersive environments such as the metaverse and virtual reality (VR) [6]. In fact, the convergence of these technologies, which integrates AI into metaverse platforms, is now seen as a major direction for the future of digital healthcare, with potential applications ranging from mental healthcare to anti-aging [7].
One approach involves conversational AI powered by advanced Large Language Models (LLMs) like ChatGPT. These AI systems can hold remarkably human-like conversations, offering a scalable method for providing support. Much research provides strong support for this approach in reducing symptoms of depression and anxiety and improving treatment outcomes [8,9]. For example, Furukawa et al. [8] developed a smartphone-based cognitive behavioral therapy (CBT) app that implements five representative CBT skills to mitigate subthreshold depressive symptoms, i.e., symptoms that fall short of the depression diagnostic threshold. Furthermore, De Freitas et al. [9] demonstrated that AI companions can reduce feelings of loneliness as effectively as human interaction. However, a significant barrier remains with most text-based AI: the lack of a physical form. This disembodied interaction can create a sense of detachment and artificiality for the individual, and the perception of communicating with a machine can hinder the formation of a deep and authentic emotional bond, which is crucial for effective mental health support in the real world.
A second approach involves the metaverse and virtual reality (VR). The power of these technologies lies in their ability to create a strong sense of presence, the subjective feeling of being physically located within a virtual environment [10,11,12,13]. This feature allows for the creation of safe and controllable social scenarios in which individuals can practice social skills and overcome real-world anxieties [12]. However, these immersive platforms have their own limitations. The environment and any non-player characters within it are typically pre-designed and static; they lack the capacity to dynamically adapt to an individual’s unique emotional state or conversational needs. This environmental rigidity creates a different kind of barrier to forming a meaningful and effective emotional connection.
While these technological domains have shown considerable potential on their own, they each possess distinct weaknesses. On one hand, conversational AI is intelligent and responsive, but its lack of a physical body can lead to user distrust and a sense of artificiality [14]. On the other hand, the metaverse provides an embodied experience, but its environment lacks intelligence and personalized responsiveness [13]. This analysis leads to a critical question regarding how to overcome these limitations. The proposed solution involves integrating an intelligent, conversational AI agent within an immersive metaverse. Based on HCI principles, we hypothesize that the synergy between an AI agent’s responsive intelligence and the metaverse’s embodied presence can create a positive user experience. This may help users feel truly seen and understood, thereby enhancing user engagement.
To investigate this question, we build and verify a human–AI interaction environment to explore its potential for mental health. First, we design and build an experimental environment that integrates an AI agent, powered by ChatGPT, into a custom-designed metaverse. The AI is embodied as a physical avatar, and individuals can converse with it in real time. Next, we conduct an experiment in which we recruit participants to interact with the AI agent within this environment. Thereafter, we use a seven-point Likert scale questionnaire to assess user preferences and subjective experiences. From a practical standpoint, this work introduces and validates a new type of digital approach that is scalable and engaging, which could be helpful for individuals who face barriers to accessing conventional face-to-face mental health support. From a theoretical standpoint, this work provides exploratory insight into how human–AI interaction affects human emotion and social well-being, offering design guidelines for the future development of AI agents in mental healthcare.
The remainder of this paper is organized as follows. Section 2 reviews related work in AI and metaverse for mental health support. Section 3 describes the design and implementation of an experimental environment. The experiment conditions and results are introduced and discussed in Section 4. Finally, Section 5 concludes this study and highlights future research directions.

2. Related Work

In this section, we introduce recent literature on the use of conversational AI as a tool for mental health support and discuss how immersive environments like the metaverse are being used to foster social connection. Finally, we clarify the position of this study.

2.1. AI as an Emotional Support Companion

The concept of automated therapeutic conversation dates back to early systems such as ELIZA in the 1960s. Since then, the application of AI in mental healthcare has significantly evolved, moving from task-oriented tools for clinical screening [15] to intelligent AI companions designed to foster psychological well-being and social connection [3,6,16,17]. This shift has been largely driven by the advent of advanced Large Language Models (LLMs), which enable AI agents to engage in the adaptive, human-like conversations necessary for relationship building [18]. Recent studies confirm their potential as a powerful intervention for loneliness [9,19]. For example, De Freitas et al. [9] demonstrated that interaction with an AI companion can reduce loneliness as effectively as interaction with a human, establishing that AI can serve an important relational function.
The mechanism behind this effectiveness can be partially explained by social penetration theory, which posits that the disclosure of personal emotion plays a crucial role in enhancing intimacy and trust within relationships. Based on this theory, AI’s ability to appropriately perform emotional disclosure is critical for building user trust and satisfaction [20]. A chatbot that discloses humanlike emotions has more positive effects in the context of mental health counseling than a chatbot that discloses only factual information. This sense of trust is further enhanced by the non-judgmental nature of AI. A core principle of human-centered AI design [21] is to create a safe and private space for users. Croes et al. [14] confirmed that the participants in their experiment felt no fear of being judged when talking to a chatbot, which encourages more open and honest self-disclosure when seeking mental health support.
However, the effectiveness of these interactions is highly dependent on communication modality. Croes et al. [14] found that voice-based interactions were more effective at building trust than text-based interactions alone. This finding highlights a broader principle: more intelligent and more human-like modalities are more effective at fostering connection. It also points to a key limitation of current systems: disembodiment. The experience of interacting with an AI that lacks physical form can create a sense of artificiality and detachment for the user. Providing AI with some form of physical presence is a critical next step to enhance its potential in emotional support. This has led researchers to explore new paradigms, such as the combination of AI with spatial computing [22], to create more present and effective interventions. In this context, spatial computing refers to technologies that embody the AI as an avatar within a virtual space, enabling interaction through non-verbal cues such as gestures and thereby creating a sense of physical co-presence. This suggests that while AI’s intelligence is crucial, physical presence is equally important for establishing the emotional connection required to reduce loneliness.

2.2. Metaverse for Avatar-Based Immersive Interactions

The need for the embodiment of AI-driven support leads to the exploration of immersive environments such as the metaverse and VR. Ifdil et al. [10] proposed that the metaverse offers a new approach to global mental health challenges, particularly in the post-COVID-19 pandemic era. The potential of this new field is widely recognized, with researchers such as Govindankutty and Gopalan [11] analyzing the benefits and challenges to mental health, while Al Falahi et al. [23] have outlined the future direction. The therapeutic uses of immersive techniques are wide-ranging. For example, Wang et al. [24] confirmed the effectiveness of VR in providing established interventions such as meditation and mindfulness to reduce stress and anxiety. These technologies have revealed their ability to induce a strong sense of presence, which is a user’s subjective feeling of being physically within the virtual world.
The effectiveness of VR in creating a powerful sense of presence is well documented. For instance, Wang et al. [12] demonstrated that immersive environments are effective for complex motor learning, such as Taichi, suggesting a deep integration between the user’s body movements and their virtual representations. This embodied presence underlies the metaverse’s potential for social connection. In a comparative study, Cho et al. [13] investigated the effectiveness of metaverse-based counseling versus traditional face-to-face counseling. While their findings showed no significant difference in reducing psychological symptoms, they established the metaverse as a clinically viable alternative. They found that the metaverse group reported significantly higher working alliance and counseling satisfaction. This suggests that the embodiment and psychological safety of avatar-based interactions can foster a unique and powerful emotional bond.

2.3. Position of This Study

Section 2.1 and Section 2.2 outline two promising but ultimately incomplete technological trajectories. Conversational AI offers advanced dialogic intelligence but lacks a body, which can hinder the formation of a deep emotional bond. Conversely, the metaverse provides a powerful platform for embodied interaction that enhances therapeutic effect, but it lacks dynamic and intelligent responsiveness from the environment or the agents.
Recent work has begun to explore this exact synergy. For example, Fang et al. [25] developed a social simulation system using VR, AR, and LLMs to help users practice stress relief in a controlled, interactive environment. This study provides crucial validation for the technical approach of integrating advanced AI with immersive platforms for a therapeutic purpose.
The feasibility of such systems is emerging. However, it is currently unknown whether the integration of an LLM-based AI agent and the metaverse-based social presence can contribute to mental health support. This study is designed to investigate the potential of human–AI interaction through building and verifying an experimental environment based on ChatGPT and the Metaverse.

3. Design and Implementation of an Experimental Environment

This section details the design of the experiment environment developed for this study. The system architecture is designed to facilitate an embodied and conversation-based interaction between an individual and an AI agent within a metaverse environment. The purpose of this system is to serve as an experimental environment for investigating the potential of human–AI interaction.

3.1. Basic Architecture Based on ChatGPT and Metaverse

The basic architecture of our experiment environment is the integration of an advanced conversational AI with an immersive social environment, which is the metaverse. This configuration allows the AI agent to have a physical presence, which addresses the key limitation of disembodiment identified in Section 2.
The system is designed to operate within a social VR platform. The specific platform used in our implementation is VRChat [26], a widely used metaverse application that enables users to interact in shared 3D spaces. An illustration of this environment is shown in Figure 1. Within the metaverse environment, the AI agent is embodied as a custom-designed 3D avatar of a friendly robot, as shown in Figure 2. This provides the user with a visible entity to interact with, creating a sense of co-presence, the feeling of sharing a physical space with another being. The interaction takes place in a calm and aesthetically pleasing virtual room designed to be a comfortable and healing space.
The connection between our AI agent and the metaverse environment is achieved using a Virtual Audio Cable (VB-Cable) (https://vb-audio.com/Cable/index.htm (accessed on 13 August 2025)). This software creates a virtual bridge, allowing the AI agent to hear the user’s voice from the metaverse environment and to speak back into it through a virtual microphone, enabling real-time, voice-based interaction with the embodied AI agent. The technical details of this process are described in Section 3.3.

3.2. AI Agent Persona and Behavioral Configuration

The personality of the AI agent is defined by a set of rules that govern its conversational behavior. The AI agent is configured to act as an excellent psychological counselor, and its behavior is governed by the four key principles listed below.
  • Primary Role: The AI agent’s main purpose is to provide comfort and listen empathetically to the user’s concerns.
  • Conversational Style: It is instructed to communicate like a gentle and kind friend, using concise responses to maintain a natural conversational rhythm.
  • Engagement Strategy: The AI agent actively fosters dialog by occasionally asking open-ended questions to explore topics further.
  • Error Handling: It has built-in strategies to politely ask for clarification when it encounters potential speech recognition failures.
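These four principles can be expressed as a system prompt for the underlying LLM. The following is a minimal sketch, assuming the OpenAI Chat Completions message format; the exact prompt wording and the `build_messages` helper are our illustrative assumptions, not the study’s actual configuration.

```python
# Hypothetical sketch: the four behavioral principles encoded as a system
# prompt (wording assumed, not the study's actual prompt).
PERSONA_PROMPT = (
    "You are an excellent psychological counselor. "
    "Primary role: provide comfort and listen empathetically to the user's concerns. "
    "Conversational style: speak like a gentle, kind friend; keep responses concise "
    "to maintain a natural conversational rhythm. "
    "Engagement: occasionally ask open-ended questions to explore topics further. "
    "Error handling: if an utterance seems garbled by speech recognition, "
    "politely ask the user to repeat it."
)

def build_messages(history, user_utterance):
    """Prepend the persona prompt to the running conversation history."""
    return (
        [{"role": "system", "content": PERSONA_PROMPT}]
        + list(history)
        + [{"role": "user", "content": user_utterance}]
    )

msgs = build_messages([], "I've been feeling lonely lately.")
```

Placing the persona in the system role keeps it active across every turn, so the counselor behavior persists even as the user-visible conversation grows.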

3.3. Functional Modules and Data Processing

The architecture of the experiment environment is composed of three modules that operate sequentially in a data processing pipeline. The pipeline manages the real-time flow of information from the user’s speech to the AI agent’s audible response, enabling a natural, voice-based conversation. The process is illustrated in Figure 3 and detailed below.

3.3.1. Speech Recognition Module

This module is responsible for capturing the user’s voice and converting it into machine-readable text. It executes a three-step process.
  • Voice Input: The process begins with the system receiving the user’s voice from the metaverse environment, which is routed into the application via the first virtual audio device (the cable output).
  • Voice Capture: The incoming audio stream is continuously monitored by a voice activity detection (VAD) algorithm, which analyzes the audio waveform’s intensity in real time using the Root Mean Square (RMS). Recording begins when the intensity exceeds a threshold and ceases after a set period of silence is detected, ensuring the complete capture of the user’s utterance.
  • Speech-to-Text (STT) Conversion: The recorded audio segment is processed by OpenAI’s Whisper model (Large V3) [27]. As a state-of-the-art Automatic Speech Recognition (ASR) model, Large V3 is known for its high accuracy and robustness in handling various accents and moderate background noise. This capability is critical for the experiment, as it helps to ensure the user feels accurately heard and understood, a foundational element for building a connection and minimizing interactional friction.
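The voice-capture step above can be sketched as follows. This is a minimal illustration of RMS-based voice activity detection over PCM sample frames; the threshold and silence-window values are illustrative, not the study’s actual parameters.

```python
import math

def rms(frame):
    """Root Mean Square intensity of one audio frame (a list of PCM samples)."""
    return math.sqrt(sum(s * s for s in frame) / len(frame))

def capture_utterance(frames, threshold=500.0, silence_frames=3):
    """Collect the frames forming one utterance: recording starts when the
    RMS exceeds `threshold` and stops after `silence_frames` consecutive
    quiet frames (values here are illustrative, not the study's)."""
    recorded, recording, quiet = [], False, 0
    for frame in frames:
        loud = rms(frame) >= threshold
        if not recording:
            if loud:           # speech onset detected
                recording = True
                recorded.append(frame)
        else:
            recorded.append(frame)
            quiet = 0 if loud else quiet + 1
            if quiet >= silence_frames:  # trailing silence: utterance ended
                break
    return recorded
```

In the real pipeline the frames would arrive from the VB-Cable output device in real time; here they are plain lists so the boundary logic is easy to follow.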

3.3.2. Conversational Logic Module

This module processes the transcribed text to generate a coherent and empathetic response, serving as the brain of the AI agent.
  • Language Processing: The text string from the previous module is sent to GPT-4 [28], a powerful Large Language Model (LLM). This module is responsible for all higher-level cognitive tasks, including understanding the user’s intent, recalling past turns in the conversation to maintain context, and generating empathetic and context-aware responses. The ability to process language allows the AI agent to engage in the kind of supportive, multi-turn dialog that is essential for building a sense of companionship.
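As a minimal sketch of this module, the loop below maintains a bounded conversation history and delegates generation to a pluggable `reply_fn`, which in the study’s setup would wrap a GPT-4 Chat Completions call; the class name, turn limit, and trimming policy are our assumptions.

```python
from collections import deque

MAX_TURNS = 10  # keep only recent turns to stay within the context window

class ConversationLogic:
    """Maintains multi-turn context and produces each reply (sketch)."""

    def __init__(self, system_prompt, reply_fn):
        # `reply_fn(messages) -> str` wraps the actual LLM call, e.g.
        # openai.chat.completions.create(model="gpt-4", messages=messages)
        self.system_prompt = system_prompt
        self.reply_fn = reply_fn
        # deque(maxlen=...) silently drops the oldest turns when full
        self.history = deque(maxlen=2 * MAX_TURNS)

    def respond(self, user_text):
        self.history.append({"role": "user", "content": user_text})
        messages = [{"role": "system", "content": self.system_prompt}] + list(self.history)
        reply = self.reply_fn(messages)
        self.history.append({"role": "assistant", "content": reply})
        return reply
```

Keeping past turns in `history` is what lets the agent recall context across the dialog; the deque bound is one simple way to avoid exceeding the model’s context window in a long session.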

3.3.3. Speech Synthesis Module

This module gives the AI agent a voice and is critical for creating personality and a sense of presence. This process involves two final steps as follows.
  1. Text-to-Speech (TTS) Synthesis: The text response generated by GPT-4 is transformed into a natural-sounding audio file by the VOICEVOX [29] engine, a speech synthesis tool that provides a variety of expressive Japanese voices.
  2. Voice Output: The synthesized audio is routed back into the metaverse environment through the second virtual audio device (the cable input), which functions as the AI agent’s microphone. This transforms the AI agent from a disembodied text generator into an audible character within the virtual world, allowing the user to hear the voice directly from the AI agent’s avatar.
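As a hedged sketch of the TTS step, the helpers below build requests for the VOICEVOX engine’s local HTTP API, which by default exposes an `/audio_query` endpoint to build a synthesis query and a `/synthesis` endpoint to render it to WAV on port 50021; the host, speaker id, and helper names are illustrative assumptions, not the study’s exact configuration.

```python
from urllib.parse import urlencode

VOICEVOX_HOST = "http://127.0.0.1:50021"  # VOICEVOX engine's default local port

def audio_query_url(text, speaker=1):
    """Step 1: URL to build a synthesis query from the GPT-4 text reply."""
    return f"{VOICEVOX_HOST}/audio_query?{urlencode({'text': text, 'speaker': speaker})}"

def synthesis_url(speaker=1):
    """Step 2: URL to render the query JSON (POST body) to WAV audio."""
    return f"{VOICEVOX_HOST}/synthesis?{urlencode({'speaker': speaker})}"

# The returned WAV would then be played on the VB-Cable input device so the
# voice emerges from the avatar's microphone inside the metaverse.
```

The two-step query/synthesis split is what lets VOICEVOX expose pitch and speed parameters between the steps, should the agent’s voice need tuning.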

4. Experiment Results and Discussion

We conducted an experiment based on our designed environment, as proposed in Section 3. This section details the experiment design and experiment results and then provides a discussion of the findings and the limitations.

4.1. Experiment Overview

4.1.1. Participants

A total of 15 participants (11 males and 4 females) were recruited for the experiment. The participants were all in their 20s, with a mean age of 23.53 years. All participants were fluent in Japanese, and an exclusion criterion was any self-reported diagnosed mental illness. The experiment was conducted in November 2023. All participants provided written informed consent. This process informed participants that they had the right to withdraw from the study at any time during the sessions, for any reason, and without disadvantage, should they feel any discomfort. Additionally, they could unconditionally withdraw their consent and request the deletion of their data after the experiment was completed.

4.1.2. Experiment Conditions

The experiment employed a within-subjects design in which each participant experienced two experimental conditions.
  • AI-Metaverse Condition: A voice-based conversation with the AI agent in the immersive metaverse environment.
  • AI-Text Condition: A text-based conversation with the same AI agent through a chat interface.
The experimental environment was implemented under technical specifications shown in Table 1.
The experiment followed a structured, four-step procedure for each participant.
  • Pre-experiment Questionnaire: First, participants complete a questionnaire to gather baseline data on their current feelings and their prior experience with AI agents and the metaverse.
  • VR Test: Before the AI-Metaverse condition, we provide instructions on how to use the Meta Quest 2. Participants then perform a brief body synchronization exercise in the metaverse environment, which involves moving their head and hands and following the AI avatar around the room to establish a sense of presence.
  • Interaction Session: Each participant then completes both the 10 min AI-Metaverse session and the 10 min AI-Text session. To ensure privacy and encourage open expression, each participant is alone in the room during the interaction sessions.
  • Post-experiment Questionnaire: Following each session, participants complete a detailed questionnaire to assess their user experience in that specific condition.
Data are collected via these questionnaires using a seven-point Likert scale, where lower scores indicate a more positive outcome (one being very satisfied and seven being not at all satisfied). The questionnaires assessed key metrics including overall satisfaction, ease of use, perceived support for worry, perceived support for loneliness, perceived support for healing, and intention for future use, among other factors. The questionnaires are attached in Appendix A and Appendix B.

4.2. Experiment Results

4.2.1. Analysis of User Experience

The descriptive statistics for the user experience ratings are presented using minimums, maximums, medians, modes, and ranges, which are appropriate for ordinal Likert scale data, as shown in Table 2.
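These ordinal summaries can be computed as follows; the sample ratings are hypothetical and are not the study’s data.

```python
from statistics import median, mode

def describe_ordinal(ratings):
    """Summary statistics appropriate for ordinal Likert data: no mean or
    standard deviation, only order-based and frequency-based measures."""
    return {
        "min": min(ratings),
        "max": max(ratings),
        "median": median(ratings),
        "mode": mode(ratings),
        "range": max(ratings) - min(ratings),
    }

# Hypothetical 7-point Likert ratings from five participants (lower = better)
stats = describe_ordinal([2, 3, 3, 4, 5])
```

Because Likert responses are ordered categories rather than interval values, the median and mode are the defensible central-tendency measures, which is why Table 2 reports them instead of means.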
For overall satisfaction, ease of use, perceived support for worry, AI response, and intention for future use, the central tendency was similar between the “AI-Metaverse” condition and the “AI-Text” condition, with the same median. The primary reasons cited for being willing to reuse the metaverse service were the non-everyday, appealing environment and the appearance and voice of the AI agent. However, in the ratings for perceived support for loneliness and perceived support for calming, the “AI-Metaverse” condition received a more positive median rating than the “AI-Text” condition. Moreover, participant feedback showed that all 15 participants subjectively felt the “AI-Metaverse” condition was better than the “AI-Text” condition.
The drawback of the “AI-Metaverse” condition was technical performance. Understanding and naturalness of the conversation received better median ratings in the “AI-Text” condition than in the “AI-Metaverse” condition. Participants’ feedback indicated that speech recognition errors were the cause of unnatural conversation in the metaverse.

4.2.2. Analysis Based on Pre-Psychological State

To further investigate the effects of the “AI-Metaverse” environment, we divided participants into two subgroups based on their self-reported psychological states before the experiment, namely whether they felt worried. The descriptive statistics for these comparisons are shown in Table 3. For overall satisfaction, ease of use, and perceived support for worry, the median rating was the same for both groups. However, differences were observed in perceived support for loneliness and intention for future use, where the “Feels Worried” group reported a less positive median rating than the “Does Not Feel Worried” group.

4.2.3. Analysis Based on VR Experience

Then, we investigated whether user preference for the “AI-Metaverse” environment depends on an individual’s VR experience, based on their answers to the pre-experiment questionnaire. Participants were divided into three groups: “Novice” (no more than one prior VR experience), “Intermediate” (several prior VR experiences), and “Experienced” (continual VR use). The descriptive statistics are shown in Table 4.
As shown in Table 4, the results revealed a consistent central tendency for overall satisfaction, with all three groups reporting a median score of 3.00. However, for ease of use, perceived support for loneliness, and intention for future use, the “Intermediate” group reported the most positive median score compared to the “Novice” and “Experienced” groups. In contrast, for perceived support for worry, the “Intermediate” group reported the least positive median rating (Median = 5.00). Given the small subgroup sample sizes, these patterns should be treated as preliminary observations.
To investigate whether prior VR experience influenced perceived support for loneliness, we first categorized the 7-point Likert scale ratings into three distinct groups: “Perceived Support” (ratings 1–3), “Neutral” (rating 4), and “Did Not Perceive Support” (ratings 5–7). The frequency distribution across the VR experience subgroups is shown in Table 5. Fisher’s Exact Test was conducted on this table to determine whether there was a statistically significant association between a participant’s VR experience level and their perceived support for loneliness. The result was not statistically significant, with a p-value of 0.590. This suggests that, within the sample of this experiment, a participant’s prior VR experience did not influence whether they perceived the “AI-Metaverse” condition as supportive in alleviating loneliness.
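To illustrate the test, the sketch below implements two-sided Fisher’s exact test for a 2×2 table from first principles; the study’s 3×3 table would require a generalized version (e.g., R’s `fisher.test` or a network-algorithm implementation), and the example counts are hypothetical, not the study’s data.

```python
import math

def fisher_exact_2x2(table):
    """Two-sided Fisher's exact test for a 2x2 contingency table.
    Returns the p-value: the total probability, under the hypergeometric
    distribution with fixed margins, of all tables at least as extreme
    as the observed one."""
    (a, b), (c, d) = table
    row1, row2 = a + b, c + d
    col1 = a + c
    n = row1 + row2
    denom = math.comb(n, col1)

    def prob(k):
        # Hypergeometric probability that cell (0, 0) equals k
        return math.comb(row1, k) * math.comb(row2, col1 - k) / denom

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(col1, row1)
    # Sum over all tables whose probability does not exceed the observed one
    return sum(p for p in (prob(k) for k in range(lo, hi + 1)) if p <= p_obs + 1e-12)

# Hypothetical counts: rows = VR experience split, cols = support vs. not
p = fisher_exact_2x2([[3, 1], [1, 3]])
```

Fisher’s exact test is preferred over a chi-square test here because the expected cell counts in small subgroups (N = 15 split three ways) are far below the chi-square approximation’s validity threshold.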

4.2.4. Analysis Based on Consultation Frequency

To further understand user preferences, we categorized participants based on their experience of consulting others about their worries, as asked in the pre-experiment questionnaire. Participants were divided into three groups: “Frequently”, “Never”, and “Case-by-Case”. The results are shown in Table 6.
For overall satisfaction, the medians were 3.00 for the “Never” and “Case-by-Case” groups and 4.00 for the “Frequently” group. Remarkably, the “Never” group (those who do not consult others) reported the most positive ratings for perceived support for loneliness, with a median of 2.00 and a mode of 2, slightly lower than the “Case-by-Case” group (median of 3.00, mode of 3) and the “Frequently” group (those who frequently consult others; median of 5.00, mode of 5). Furthermore, the “Never” group reported the highest intention to reuse the service (median of 1.00, mode of 1), followed by the “Case-by-Case” group (median of 2.00, mode of 2). The “Frequently” group, who are accustomed to seeking support, showed the lowest intention for future use (median of 4.00, mode of 4). These results provide a preliminary suggestion that a user’s pre-existing consultation habits may influence their preferences, and they suggest a hypothesis for future investigation: the “AI-Metaverse” condition may be particularly appealing to individuals who do not typically seek support from others.
We also investigated whether a participant’s prior consultation experience influenced their perceived support in alleviating loneliness. The frequency distribution for these subgroups is shown in Table 7. The p-value of Fisher’s Exact Test was 0.048, which indicates that a participant’s prior experience with seeking support had a significant effect on whether they perceived the “AI-Metaverse” condition as supportive in alleviating loneliness.

4.2.5. Qualitative Analysis of Intention for Future Use

As mentioned in Section 4.2.1, the intention to use the AI-Metaverse condition was slightly higher than that for the AI-Text condition. To explore the detailed reasons, we summarized the qualitative feedback provided by participants.
As shown in Table 8, participants who reported that they intended to reuse the AI-Metaverse condition most frequently cited its affective and experiential qualities. The top reasons were attraction to the non-everyday space and the AI’s cute appearance, followed by the AI’s voice and its utility as a private space to discuss personal worries. Conversely, the reason for not intending to reuse the AI-Metaverse service was technical performance, specifically that the voice recognition was not accurate.
In contrast, the main reason for intending to reuse the AI-Text condition was the functional precision, with participants stating that the content recognition was accurate. However, the drawback of the text-based interaction was the lack of psychological presence, with the reasons for not intending to reuse it being that it did not feel like counseling, and the participants did not intend to consult non-humans.

4.3. Discussion

4.3.1. The Potential of Human–AI Interaction on Loneliness Alleviation

This study explored the potential of human–AI interaction using an AI agent in the metaverse. Our findings provide preliminary insights into the user experience and highlight future directions. However, the experimental design confounds several factors. The AI-Metaverse condition combined embodiment with voice interaction and a specific persona, while the AI-Text condition was text-based only. Therefore, it is difficult to fully attribute the observed subjective preference to the Metaverse environment, as the effect could also result from the modality, such as the differences between voice and text, or the AI agent’s persona.
The result of our experiment showed a user preference for the AI-Metaverse environment. While the satisfaction scores were similar between the two conditions, all 15 participants reported that the AI-Metaverse condition was better than the AI-Text condition. This suggests that the subjective experience of being in a shared virtual space with an AI agent is a powerful factor for users.
Finally, our analysis suggests that the AI-Metaverse was appealing to individuals who reported that they never seek emotional support from other people. This group reported the highest subjective preference and the strongest intention to use the system again. This suggests that AI agents could serve as a valuable and accessible first step for individuals who are hesitant to seek human support.

4.3.2. Limitations

With a sample size of N = 15, the statistical analysis in this study may be underpowered, and its findings should be considered preliminary, requiring validation in future work. Furthermore, our experimental design lacked counterbalancing: all participants experienced the two conditions in the same sequence. This introduces the possibility of order effects; for example, participants’ experience in the second condition may have been influenced by practice or novelty effects from the first. In addition, we did not record dialog logs or objective interaction metrics, such as the number of conversational turns, utterance lengths, or topic depth, even though these are basic mediators of perceived support.
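Such interaction metrics are straightforward to compute once dialog logs are retained. The sketch below illustrates one possible approach; the log format and the utterances are hypothetical, not data from this study:

```python
from statistics import mean

# Hypothetical dialog log as (speaker, utterance) pairs; invented example text.
log = [
    ("user", "I have been feeling lonely lately."),
    ("agent", "I'm sorry to hear that. Would you like to talk about it?"),
    ("user", "Yes. It started when I moved to a new city."),
    ("agent", "Moving can be isolating. What has helped you feel connected before?"),
]

user_utterances = [text for speaker, text in log if speaker == "user"]
metrics = {
    "conversational_turns": len(log),
    "user_turns": len(user_utterances),
    # Utterance length measured in words here; characters would also work.
    "mean_user_utterance_length": mean(len(u.split()) for u in user_utterances),
}
print(metrics)
```

Persisting such a structure per session would let turn counts and utterance lengths be examined as mediators of perceived support in follow-up analyses.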
In addition, each aspect of subjective experience was measured with a single question written specifically for this experimental environment, without validation. The study would have been more rigorous if we had employed standardized, multi-item scales to measure user experience. Finally, since Likert scale data are ordinal, the results of our statistical tests should be interpreted as exploratory, and corrections for multiple comparisons should be applied in future work.
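For paired ordinal ratings such as ours, one distribution-free option is the sign test, which makes no assumptions about interval scaling. The sketch below implements a two-sided sign test using only the standard library; the scores are hypothetical 7-point Likert values (1 = most positive), not data from this study:

```python
from math import comb

def sign_test(x, y):
    """Two-sided sign test for paired samples; tied pairs are discarded."""
    diffs = [a - b for a, b in zip(x, y) if a != b]
    n = len(diffs)
    k = sum(d > 0 for d in diffs)  # number of positive differences
    tail = min(k, n - k)
    # Double the smaller binomial tail under Binomial(n, 0.5), capped at 1.
    return min(2 * sum(comb(n, i) for i in range(tail + 1)) / 2**n, 1.0)

# Hypothetical paired ratings for the two conditions; invented for illustration.
metaverse = [2, 3, 2, 4, 3, 2, 3, 2, 3, 4, 2, 3, 2, 3, 3]
text_chat = [3, 3, 2, 5, 4, 3, 3, 2, 4, 4, 3, 3, 3, 4, 3]
print(f"sign test p = {sign_test(metaverse, text_chat):.5f}")
```

The sign test discards ties and uses only the direction of each paired difference, so it remains valid when the Likert response scale cannot be treated as interval data.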

5. Conclusions

In this study, we explored the potential of human–AI interaction by designing and demonstrating the feasibility of an experimental environment that integrates a conversational AI agent, powered by ChatGPT, into a metaverse platform. We conducted a within-subjects experiment with 15 participants who interacted with the AI agent in both the immersive metaverse and a standard text-chat interface to investigate user preferences and subjective experience. The results showed a unanimous subjective preference for the AI-Metaverse interaction. Furthermore, our analysis suggests that in the field of mental health support, this interaction paradigm may be appealing to individuals who do not typically seek emotional support from others.
Future work will employ more rigorous experimental designs with baseline comparisons, such as a voice-only condition, to isolate the effect of each component. We will also employ validated instruments, such as the UCLA Loneliness Scale, in a pre- and post-measurement design to assess changes in user experience, and record objective interaction metrics, such as the number of conversational turns, utterance lengths, and topic depth, to enhance internal validity. Furthermore, we will investigate how to improve the technical performance of AI agents by reducing latency and enhancing speech recognition accuracy, and we will record key metrics, such as per-stage latency, errors, and failure rates, together with a more detailed system prompt design, to better understand their impact on the user experience and enhance the replicability of our study. Finally, future work should employ statistical methods suited to ordinal data, such as chi-square tests or ordinal logistic regression, to ensure a more robust analysis.
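As a concrete illustration of the chi-square approach, the sketch below applies a test of independence to a contingency table modeled on Table 7; the all-zero Neutral column is dropped, since expected counts of zero are not admissible. With expected counts this small, Fisher’s exact test would in practice be preferable, so the code is illustrative only:

```python
from math import exp

# Contingency table modeled on Table 7 (Neutral column dropped):
# rows = consultation experience (Never, Case-by-Case, Frequently),
# columns = (perceived support, did not perceive support).
observed = [[5, 0], [5, 1], [1, 3]]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
total = sum(row_totals)

# Pearson chi-square statistic: sum of (O - E)^2 / E over all cells.
chi2 = sum(
    (observed[i][j] - row_totals[i] * col_totals[j] / total) ** 2
    / (row_totals[i] * col_totals[j] / total)
    for i in range(len(observed))
    for j in range(len(observed[0]))
)
df = (len(observed) - 1) * (len(observed[0]) - 1)  # = 2
# For df = 2 the chi-square survival function has the closed form exp(-x/2).
p = exp(-chi2 / 2)
print(f"chi2 = {chi2:.2f}, df = {df}, p = {p:.3f}")
```

The closed-form p-value is a convenience specific to two degrees of freedom; for other table shapes a general chi-square survival function (or an exact test) would be needed.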

Author Contributions

Conceptualization, P.C.; methodology, P.C.; software, P.C.; validation, P.C.; formal analysis, P.C.; investigation, P.C. and R.C.; resources, P.C.; data curation, P.C.; writing—original draft preparation, P.C. and R.C.; writing—review and editing, R.C.; visualization, R.C.; supervision, L.Y. and Q.J.; project administration, Q.J.; funding acquisition, L.Y. and Q.J. All authors have read and agreed to the published version of the manuscript.

Funding

The work of the third and last authors was supported in part by the 2023–2025 Shenzhen Science and Technology Program under Grant GJHZ20220913144201002.

Institutional Review Board Statement

The study was conducted in accordance with the Declaration of Helsinki. According to the Waseda University Ethics Committee on Research with Human Subjects Application Guidelines, the experiment conducted solely by an undergraduate student (the first author in this paper) that was inherently non-invasive and did not have a conflict of interest with any other institution did not, in principle, require assessment, if the supervisor (the last author) confirmed in advance that the research plan did not have any issues including ethical issues and ensured that the research was conducted responsibly.

Informed Consent Statement

Informed consent was obtained from all subjects involved in the study.

Data Availability Statement

The deidentified quantitative data and qualitative feedback from the questionnaires of this study will be made available by the corresponding author on request.

Acknowledgments

We sincerely thank all the participants in our experiment for their time and contributions. The authors used AI tools (ChatGPT, Gemini, and DeepL) solely to improve language expression and enhance readability and clarity. All content was carefully reviewed and edited by the authors, who take full responsibility for the final manuscript.

Conflicts of Interest

The authors declare no conflicts of interest.

Appendix A

Table A1. Post-Experiment Questionnaire Items for AI-Metaverse Condition.
No. | Item
Q1 | Rate your overall satisfaction with the AI-Metaverse Counseling Room service.
Q2 | Rate the ease of use of the AI-Metaverse Counseling Room interface.
Q3 | Rate the alleviation of your worries after using the AI-Metaverse Counseling Room.
Q4 | Rate the reduction of your loneliness after using the AI-Metaverse Counseling Room.
Q5 | Rate the extent to which you gained a deeper understanding of your feelings after using the AI-Metaverse Counseling Room.
Q6 | Rate the healing effect of the AI-Metaverse environment.
Q7 | Rate the AI agent’s response time in the AI-Metaverse environment.
Q8 | Rate the naturalness of the conversation with the AI agent in the AI-Metaverse environment.
Q8.1 | [For respondents who rated Q8 as scale 5–7 (Not natural) only] What aspects of the conversation felt unnatural? (Multiple selections allowed)
Q8.2 | [For respondents who rated Q8 as scale 5–7 (Not natural) only] If the conversation with the AI had been more natural and fluent, do you believe the reduction of loneliness and worries would have been more effective?
Q9 | Rate your intention to use this kind of AI-Metaverse Counseling Room in the future.
Q9.1 | [For respondents who rated Q9 as scale 1–4 (Intending to use) only] Why would you want to use it again? (Multiple selections allowed, up to three)
Q9.2 | [For respondents who rated Q9 as scale 5–7 (Intending not to use) only] Why would you not want to use it again? (Multiple selections allowed, up to three)
Q9.3 | [For respondents who rated Q9 as scale 5–7 (Intending not to use) only] How would you like the AI-Metaverse Counseling Room to be improved for you to want to use it? (Multiple selections allowed)
Q10 | What emotions did you experience while using the AI-Metaverse Counseling Room? (Multiple selections allowed)
Q11 | Regarding this experiment, please provide any advice for service improvement, or any other comments or opinions. (Optional)

Appendix B

Table A2. Post-Experiment Questionnaire Items for AI-Text Condition.
No. | Item
Q1 | Rate your overall satisfaction with the AI-Text Counseling Room service.
Q2 | Rate the ease of use of the AI-Text Counseling Room interface.
Q3 | Rate the alleviation of your worries after using the AI-Text Counseling Room.
Q4 | Rate the reduction of your loneliness after using the AI-Text Counseling Room.
Q5 | Rate the extent to which you gained a deeper understanding of your feelings after using the AI-Text Counseling Room.
Q6 | Rate the healing effect of the AI-Text environment.
Q7 | Rate the AI agent’s response time in the AI-Text environment.
Q8 | Rate the naturalness of the conversation with the AI agent in the AI-Text environment.
Q8.1 | [For respondents who rated Q8 as scale 5–7 (Not natural) only] What aspects of the conversation felt unnatural? (Multiple selections allowed)
Q8.2 | [For respondents who rated Q8 as scale 5–7 (Not natural) only] If the conversation with the AI had been more natural and fluent, do you believe the reduction of loneliness and worries would have been more effective?
Q9 | Rate your intention to use this kind of AI-Text Counseling Room in the future.
Q9.1 | [For respondents who rated Q9 as scale 1–4 (Intending to use) only] Why would you want to use it again? (Multiple selections allowed, up to three)
Q9.2 | [For respondents who rated Q9 as scale 5–7 (Intending not to use) only] Why would you not want to use it again? (Multiple selections allowed, up to three)
Q9.3 | [For respondents who rated Q9 as scale 5–7 (Intending not to use) only] How would you like the AI-Text Counseling Room to be improved for you to want to use it? (Multiple selections allowed)
Q10 | Assuming an AI agent of the same performance was used in both conditions, which environment would you choose?
Q11 | Regarding this experiment, please provide any advice for service improvement, or any other comments or opinions. (Optional)

Figure 1. An Image Illustration of the Integration of ChatGPT and Metaverse Environment.
Figure 2. An AI Agent in the Integration of ChatGPT and Metaverse Environment.
Figure 3. Functional Modules and Data Processing of the ChatGPT and Metaverse Environment.
Table 1. Technical Specifications of the Experimental Environment.
Component | Version
OS | Microsoft Windows 10 Pro 64bit
CPU | AMD Ryzen 7 3700X 8-Core Processor
GPU | NVIDIA GeForce RTX 2060 Graphics card
Headset | Meta Quest 2
Python | 3.10.11
Visual Studio Code | 1.85.1
Table 2. Basic Statistics of Overall Satisfaction.
Item | Condition | Min | Max | Median | Mode | Range
Overall Satisfaction | AI-Metaverse | 2 | 6 | 3.00 | 3 | 2~6
Overall Satisfaction | AI-Text | 1 | 5 | 3.00 | 2 | 1~5
Ease of Use | AI-Metaverse | 1 | 5 | 2.00 | 1 | 1~5
Ease of Use | AI-Text | 1 | 4 | 2.00 | 2 | 1~4
Perceived Support for Worry | AI-Metaverse | 2 | 7 | 3.00 | 3 | 2~7
Perceived Support for Worry | AI-Text | 1 | 6 | 3.00 | 2 | 1~6
Perceived Support for Loneliness | AI-Metaverse | 2 | 5 | 3.00 | 2 | 2~5
Perceived Support for Loneliness | AI-Text | 1 | 6 | 4.00 | 4 | 1~6
Understanding of Feelings | AI-Metaverse | 2 | 7 | 3.00 | 3 | 2~7
Understanding of Feelings | AI-Text | 1 | 3 | 2.00 | 2 | 1~3
Perceived Support for Calming | AI-Metaverse | 1 | 5 | 2.00 | 1 | 1~5
Perceived Support for Calming | AI-Text | 1 | 6 | 3.00 | 2 | 1~6
AI Response Time | AI-Metaverse | 3 | 7 | 5.00 | 5 | 3~7
AI Response Time | AI-Text | 3 | 7 | 5.00 | 5 | 3~7
Conversation Naturalness | AI-Metaverse | 2 | 6 | 4.00 | 5 | 2~6
Conversation Naturalness | AI-Text | 1 | 4 | 2.00 | 2 | 1~4
Intention for Future Use | AI-Metaverse | 1 | 5 | 2.00 | 2 | 1~5
Intention for Future Use | AI-Text | 1 | 7 | 2.00 | 2 | 1~7
Table 3. Results on Different Psychological State Subgroups.
Item | Group | Min | Max | Median | Mode | N
Overall Satisfaction | Feels Worried | 2 | 5 | 3.00 | 3 | 8
Overall Satisfaction | Does Not Feel Worried | 3 | 6 | 3.00 | 3 | 7
Ease of Use | Feels Worried | 1 | 5 | 2.00 | 1 | 8
Ease of Use | Does Not Feel Worried | 1 | 5 | 2.00 | 2 | 7
Perceived Support for Worry | Feels Worried | 2 | 7 | 3.00 | 3 | 8
Perceived Support for Worry | Does Not Feel Worried | 2 | 7 | 3.00 | 3 | 7
Perceived Support for Loneliness | Feels Worried | 2 | 5 | 3.00 | 3 | 8
Perceived Support for Loneliness | Does Not Feel Worried | 2 | 5 | 2.00 | 2 | 7
Intention for Future Use | Feels Worried | 1 | 4 | 3.50 | 4 | 8
Intention for Future Use | Does Not Feel Worried | 1 | 5 | 2.00 | 2 | 7
Table 4. Results on Different VR Experience Subgroups.
Item | Group | Min | Max | Median | Mode | N
Overall Satisfaction | Novice | 2 | 5 | 3.00 | 3 | 8
Overall Satisfaction | Intermediate | 2 | 6 | 3.00 | N/A | 3
Overall Satisfaction | Experienced | 3 | 3 | 3.00 | 3 | 4
Ease of Use | Novice | 1 | 5 | 2.00 | 1 | 8
Ease of Use | Intermediate | 1 | 2 | 1.00 | 1 | 3
Ease of Use | Experienced | 1 | 3 | 2.00 | 2 | 4
Perceived Support for Worry | Novice | 2 | 7 | 3.00 | 3 | 8
Perceived Support for Worry | Intermediate | 3 | 7 | 5.00 | N/A | 3
Perceived Support for Worry | Experienced | 2 | 4 | 3.50 | 4 | 4
Perceived Support for Loneliness | Novice | 2 | 5 | 3.00 | 2 | 8
Perceived Support for Loneliness | Intermediate | 2 | 5 | 2.00 | 2 | 3
Perceived Support for Loneliness | Experienced | 2 | 3 | 3.00 | 2 | 4
Intention for Future Use | Novice | 1 | 5 | 3.00 | 4 | 8
Intention for Future Use | Intermediate | 1 | 3 | 1.00 | 1 | 3
Intention for Future Use | Experienced | 2 | 4 | 2.50 | 2 | 4
Table 5. Frequency Distribution of Perceived Support for Loneliness by VR Experience Subgroup.
Group | Perceived Support | Neutral | Did Not Perceive Support
Novice | 5 | 0 | 3
Intermediate | 2 | 0 | 1
Experienced | 4 | 0 | 0
Table 6. Results on Different Consultation Experience Subgroups.
Item | Group | Min | Max | Median | Mode | N
Overall Satisfaction | Never | 2 | 6 | 3.00 | 2 | 5
Overall Satisfaction | Case-by-case | 3 | 6 | 3.00 | 3 | 6
Overall Satisfaction | Frequently | 3 | 5 | 4.00 | 4 | 4
Ease of Use | Never | 1 | 3 | 1.00 | 1 | 5
Ease of Use | Case-by-case | 1 | 4 | 3.00 | 3 | 6
Ease of Use | Frequently | 1 | 5 | 2.50 | N/A | 4
Perceived Support for Worry | Never | 2 | 7 | 3.00 | 3 | 5
Perceived Support for Worry | Case-by-case | 2 | 4 | 3.00 | 3 | 6
Perceived Support for Worry | Frequently | 3 | 7 | 4.50 | N/A | 4
Perceived Support for Loneliness | Never | 2 | 3 | 2.00 | 2 | 5
Perceived Support for Loneliness | Case-by-case | 2 | 5 | 3.00 | 3 | 6
Perceived Support for Loneliness | Frequently | 3 | 5 | 5.00 | 5 | 4
Intention for Future Use | Never | 1 | 4 | 1.00 | 1 | 5
Intention for Future Use | Case-by-case | 2 | 4 | 2.00 | 2 | 6
Intention for Future Use | Frequently | 3 | 5 | 4.00 | 4 | 4
Table 7. Frequency Distribution of Perceived Support for Loneliness by Consultation Experience Subgroup.
Group | Perceived Support | Neutral | Did Not Perceive Support
Never | 5 | 0 | 0
Case-by-Case | 5 | 0 | 1
Frequently | 1 | 0 | 3
Table 8. The Summary of Qualitative Feedback from Participants.
Condition | Intention | Reason (Multiple Selections Allowed) | Count
AI-Metaverse | Use | Attracted by the non-everyday space | 8
AI-Metaverse | Use | The AI’s appearance was cute | 8
AI-Metaverse | Use | The AI’s voice was cute | 6
AI-Metaverse | Use | Can discuss worries you can’t tell people | 5
AI-Metaverse | Use | The space was comfortable | 4
AI-Metaverse | Use | It had an effect on reducing loneliness | 4
AI-Metaverse | Use | There was a sense of immersion | 1
AI-Metaverse | Use | Could converse fluently | 1
AI-Metaverse | Not to Use | The voice recognition was not accurate | 3
AI-Metaverse | Not to Use | Don’t intend to consult non-humans | 1
AI-Metaverse | Not to Use | Dislike speaking out loud | 1
AI-Metaverse | Not to Use | It had no effect on reducing loneliness | 1
AI-Text | Use | The content recognition was accurate | 6
AI-Text | Use | Don’t have to speak out loud | 2
AI-Text | Use | It had an effect on reducing loneliness | 2
AI-Text | Use | It was fun to talk about my hobbies | 1
AI-Text | Not to Use | Didn’t feel like counseling | 4
AI-Text | Not to Use | Don’t intend to consult non-humans | 2
AI-Text | Not to Use | Text input was inconvenient | 1
AI-Text | Not to Use | Might become aggressive in text | 1
AI-Text | Not to Use | It had no effect on reducing loneliness | 1
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

MDPI and ACS Style

Chung, P.; Cong, R.; Yao, L.; Jin, Q. Exploratory Research on the Potential of Human–AI Interaction for Mental Health: Building and Verifying an Experimental Environment Based on ChatGPT and Metaverse. Appl. Sci. 2025, 15, 11209. https://doi.org/10.3390/app152011209
