1. Introduction
Virtual reality (VR) technology has shown great potential in various fields, including education, healthcare, and technical training. Unlike traditional learning methods, VR can present scenes that people have never witnessed and could not otherwise experience, such as exploring the microscopic world or traveling through space to Mars. This distinctive experience makes VR an effective tool for broadening the scope of human cognition [1]. To help researchers and educators better understand cognitive processes in VR environments and to promote the development of cognitive science and education [2], it is necessary to explore how visual information is processed in VR scenes.
Currently, many studies have investigated the effects of VR-based learning. Lv et al. [3] employed VR technology to construct an immersive geographical learning environment to enhance learning. Liang et al. [4] reported that competition in virtual environments inhibited spatial memory, whereas cooperation enhanced sustained attention. Makransky et al. [5] proposed the Cognitive Affective Model of Immersive Learning (CAMIL), emphasizing immersion, interactivity, and factors such as motivation and embodiment in virtual learning. Chen et al. [6] found that VR technology significantly improves the usefulness and novelty of creative design while increasing extraneous and germane cognitive loads. In addition, EEG measurement technology has been used to assess neural responses in VR environments. Baceviciute et al. [7] analyzed frequency-domain changes in EEG signals and observed increased theta wave activity along with decreased alpha and beta wave activity in VR, indicating higher cognitive processing. Marcolin et al. [8] conducted a study using machine learning methods to identify participants’ stress states based on EEG signals in VR environments.
Although research on VR has yielded many findings, these studies have mainly focused on immersion and embodiment experiences. A critical gap remains in understanding how novel VR environments influence the cognitive processes of individuals without prior knowledge. Prior knowledge plays a crucial role in visual cognition. Knowledge stored in the brain is used to interpret current visual stimuli, thereby facilitating knowledge acquisition and comprehension [9,10]. Without prior information, novices’ comprehension of visual information is often limited to surface-level attributes, and they struggle to grasp the core content depicted in graphics [10]. In immersive VR visual environments, the influence of prior knowledge on learners may be amplified. The spatial and dynamic characteristics of VR place higher demands on learners’ ability to extract key information, filter out irrelevant stimuli, and maintain focused attention. Lacking established cognitive frameworks, learners without prior knowledge may be more susceptible to interference from the abundance of visual information, which may lead to an imbalanced distribution of cognitive resources. Moreover, learners tend to engage predominantly in perceptual-level information processing in VR, which may hinder deeper cognitive processing. Conversely, the unique visual features of VR may also enhance learners’ interest and promote exploratory behavior. For instance, Wu et al. [11] reported that although novices in virtual environments struggled to clearly understand problems and propose targeted solutions due to their lack of prior knowledge, they displayed greater learning motivation and curiosity. Given this uncertainty regarding both the cognitive challenges and the potential advantages that learners without prior knowledge may experience in VR environments, it is important to investigate the cognitive mechanisms of VR learning without prior knowledge.
In response to this gap, this study aims to investigate how VR affects the visual cognitive processes of learners without prior knowledge. Based on the dual coding theory, we proposed the Framework of Human Visual Cognition without Prior Knowledge as an interpretative framework that organizes visual cognitive processing into four critical links: memory link, analogy link, logical thinking link, and creative thinking link. In addition, inspired by the theoretical framework, we designed an exploratory cognitive experiment to explore the effects of VR on learners without prior knowledge. We innovatively employed a Mars exploration scenario, a topic entirely unfamiliar to the participants. Drawing on the above research background and theoretical analysis, we proposed the following two hypotheses:
H1. The VR environment significantly influences learning outcomes (the four cognitive links) in individuals without prior knowledge.
H2. The VR environment significantly influences cognitive engagement in individuals without prior knowledge. Specifically, we focus on differences in attention, meditation metrics, and EEG characteristics of learners between the VR and the PC groups.
To test these hypotheses, we conducted a between-subjects, cross-medium comparative experiment that integrated questionnaire assessments with measurements of attention, meditation, and EEG. To eliminate non-physiological artifacts (e.g., power line interference) and physiological artifacts (e.g., ocular and muscular artifacts), EEG signals were preprocessed using filters and the stationary wavelet transform (SWT) with the Coif4 wavelet at decomposition level 8, combined with statistical thresholding. Subsequently, a short-time Fourier transform (STFT) was applied to analyze brainwave activity. This study provides an interpretative theoretical framework and preliminary empirical evidence for understanding learning-related cognitive processes in VR environments, offering theoretical references for future research in cognitive science and VR-based learning. Moreover, the findings offer insights into designing VR scenes that better align with human cognitive patterns.
2. Research Background
Vision is the primary channel of information transmission in VR environments, and information is typically presented in the form of images and texts, which are the main focus of current cognitive research on VR [7,12]. Dual coding theory is one of the most widely accepted theoretical models for explaining the cognitive processing of images and texts [13]. As shown in Figure 1, the most basic hypothesis of the dual coding theory is that cognitive abilities are divided into two independent neural subsystems: the verbal coding system and the non-verbal coding system, which process text and image information, respectively [14].
In addition to dual coding theory, Piaget’s cognitive development theory elucidates the mechanism of cognitive evolution. It describes the four stages of children’s cognitive development: sensorimotor, preoperational, concrete operational, and formal operational [15]. Each stage develops progressively, with cognitive abilities evolving from lower to higher levels. Although Piaget’s cognitive development theory was originally developed to study children’s cognitive growth, its description of cognitive development mechanisms and knowledge construction patterns provides important references for cognitive science research.
Based on these theories, this study explored the development of cognitive abilities by introducing the concept of prior knowledge from cognitive theory, aiming to investigate how individuals process visual information in VR environments without prior knowledge, and its impact on cognitive performance. Dual coding theory provides a theoretical paradigm for establishing the mechanism of visual information processing in environments without prior knowledge, while Piaget’s cognitive development theory offers a theoretical basis for the staged division of cognitive abilities. This research provides a new perspective for understanding visual cognition in VR-based instruction and, through cognitive experiments, proposes preliminary directions for VR education and skill research, such as enhancing cognitive performance by optimizing visual scenes in VR.
3. Theoretical Research
VR provides an immersive learning environment without prior knowledge, where participants encounter entirely new visual information with limited activation of task-specific prior knowledge. To explore VR’s unique characteristics in this regard, we first conducted theoretical research on the mechanisms of visual cognition without prior information in VR environments. We proposed a conceptual theoretical framework: the Framework of Human Visual Cognition without Prior Knowledge. This framework aims to conceptually elucidate the cognitive processes involved in the brain’s processing of novel visual input.
The Framework of Human Visual Cognition without Prior Knowledge developed in this paper serves as a supplementary and modified version of the dual coding theory. It explains how individuals process entirely new visual information in environments without prior knowledge, particularly in VR, and how this process contributes to the construction of cognitive frameworks. The Framework of Human Visual Cognition without Prior Knowledge is shown in Figure 2.
In the information processing part, we adopt the structure of dual coding theory. Information received through visual channels is processed separately in the verbal coding system for words and texts, and the non-verbal coding system for images or videos [14,16,17]. Information is first represented as different representation units in the corresponding systems: words and texts are represented as logogens, and images and videos are represented as imagens. Subsequently, visual information is processed in three forms, representational, referential, and associative, which enable the mutual transformation of information. The representational type is denoted as “what you see is what you get”; the referential type is represented as “logogens and imagens can be associated with each other”; and the associative type is expressed as “logogens and imagens can be associated within themselves”. Finally, mental models are constructed to develop cognition [14,16,17,18].
In addition, based on Piaget’s cognitive development theory [19,20], which emphasizes the gradual development of thinking skills from simple perception to mature logical thinking, we conceptually organize four key links involved in visual knowledge construction: memory, analogy, logical thinking, and creative thinking. Although cognitive abilities involve several other aspects, this paper primarily focuses on the construction of visual cognition and the relationships among these cognitive links, especially in VR learning scenarios.
Memory is the ability of humans to encode, store, and retrieve the content and experiences of sensory input and is conceptually positioned as the first link of cognitive learning [21]. In the absence of prior knowledge, visual working memory is first stimulated when people learn new knowledge, and the input visual information is stored in the form of logogens and imagens by activating the memory link.
Analogy refers to the ability to transfer the conclusion about one object to another based on the similarities between the two objects [22]. Importantly, the transfer of information is facilitated by the analogy link in an associative and referential manner. After the accumulation of memories related to a “white cat”, the analogy link prompts the brain to evoke the memory of the “white cat” image when individuals encounter a “black cat” for the first time. This constitutes the association between images. Meanwhile, individuals tend to label the observed object as a “black cat” by comparing the “white cat” in their memory with the “black cat” they observe. This constitutes the mutual reference between images and words. The formation of the analogy link relies on the existing memory storage and can be activated directly by the two subsystems, and is conceptually positioned as the second link in cognitive learning.
Logical thinking is a way of thinking that produces new cognitive insights by judging, reasoning, and summarizing the nature of things and the laws of development. The logical thinking link delves into the abstract essence and internal connections [23]. It helps us draw general conclusions from individual pieces of knowledge and thereby explain why a certain phenomenon emerges. Logical thinking plays an extremely important role in the process of understanding objective truth and is conceptually positioned as the third link of cognitive learning.
Creative thinking is the process that involves the spontaneous modification and recombination of existing knowledge, leading to novel perspectives. Human wisdom is predominantly reflected in individuals’ creative thinking ability, that is, the ability to think of completely new ideas to solve problems [24]. The establishment of the creative thinking link is conceptually described as being primarily influenced by three elements: memory, analogy, and logical thinking, and it is conceptually positioned as the final link of cognitive learning.
In general, this model proposes a conceptual framework for understanding visual information processing without prior knowledge. Based on the dual coding theory, the present model further extends this perspective by organizing four conceptually related cognitive links into a sequence that covers key aspects of the process from information input to cognitive construction. This organization conceptually illustrates a possible progression from information input to cognitive construction, emphasizing a theoretical view beyond static encoding. Specifically, the immersive and highly novel environment of VR provides ideal conditions for exploring the core element of this model—the absence of prior knowledge. In such environments, learners are less influenced by prior knowledge and are more likely to construct new knowledge structures. Accordingly, this framework serves as a theoretical reference for examining visual cognition in learning situations characterized by absent prior knowledge, with particular relevance to VR-based learning scenarios.
4. Research Method
This study designed an exploratory experiment to examine the impact of VR environments without prior knowledge on learners’ cognitive processing. The experiment used traditional cognitive measurement methods to assess learning-related performance across four links inspired by the theoretical framework: memory, analogy, logical reasoning, and creative thinking. EEG measurement tools were incorporated as complementary indicators to capture learners’ neurophysiological states during learning without prior knowledge.
4.1. Research Participants
We posted a recruitment notice at a university, inviting applicants to complete a questionnaire comprising 9 multiple-choice questions related to Mars (e.g., “Have you ever watched videos of Mars?” and “What is the primary color of the Martian surface?”). Students who scored low on this questionnaire (three or fewer correct answers) and reported limited exposure to Mars-related content were selected as participants with limited prior knowledge. Ultimately, 54 eligible participants were recruited. Their screening scores ranged from 0 to 3 (M = 1.33, SD = 1.08), indicating generally low familiarity with the topic. Participants were randomly assigned to an experimental (VR) group (N = 27) and a control (PC) group (N = 27). To further ensure equivalence in residual prior knowledge, an independent samples t-test was conducted on screening scores. Results indicated no significant difference between the VR group (M = 1.44, SD = 1.12) and the PC group (M = 1.22, SD = 1.05), t = 0.752, p = 0.456, suggesting comparable baseline familiarity across groups. Upon completion of the experiment, participants received a corresponding reward.
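The baseline comparison can be approximated from the reported summary statistics alone. The following Python sketch assumes a pooled-variance (Student) independent samples t-test, which the text does not specify, and the function name is hypothetical. Because it starts from the rounded means and standard deviations, it yields a t value of roughly 0.74 rather than exactly the reported 0.752.

```python
import math

def pooled_t_from_summary(m1, sd1, n1, m2, sd2, n2):
    """Student's t statistic and degrees of freedom from group summaries.

    Assumes equal variances (pooled estimate); this is an assumption,
    since the variant of the t-test used is not stated.
    """
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * sd1**2 + (n2 - 1) * sd2**2) / df
    se = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (m1 - m2) / se, df

# Reported screening scores: VR group vs. PC group.
t_stat, df = pooled_t_from_summary(1.44, 1.12, 27, 1.22, 1.05, 27)
print(f"t({df}) = {t_stat:.3f}")
```

A value this small relative to 52 degrees of freedom is consistent with the non-significant difference reported for the two groups.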
4.2. Research Material
4.2.1. Learning Content
This study used a Mars exploration learning task to investigate the impact of VR on human cognitive processes without prior knowledge. For most students, Mars is an unusual and specialized topic that differs greatly from everyday experience and basic instructional content. Students lack both life experience and relevant background knowledge, which makes it possible to isolate cognitive processing from the effects of prior knowledge as far as possible. At the same time, the content of Mars exploration is highly visual and low in abstraction, making it well-suited for knowledge visualization through VR technology. The immersive environment can directly translate the learning content into learners’ sensory input, without relying on complex abstract reasoning (e.g., the learning content in mathematics), thereby encouraging learners without prior knowledge to focus their cognitive resources on constructing representational units and developing cognitive links. Moreover, the topic of Mars exploration aligns with the prerequisites of meaningful learning. On the one hand, the learning content itself has a clear logical structure, which can support the development of complete process links and meet the requirements of this study for measuring cognitive links. On the other hand, as a frontier topic of wide public interest, the unknown nature of Mars exploration and the practical significance of future human habitation on Mars can motivate learners to engage in learning. This engagement is not merely a result of curiosity effects but reflects a cognitive drive toward active cognitive processing, prompting learners to invest sustained cognitive resources to achieve knowledge construction [25].
The experimental group used the PICO Pro VR device (PICO Interactive Co., Ltd., Beijing, China) to receive VR learning guidance, immersively watching a 360-degree video about the Mars environment provided by the PICO Video application (see the left panel of Figure 3). The video was sourced directly from the built-in PICO Video application and was created based on real Mars exploration images, with text descriptions synchronized with the visuals. To ensure consistency of learning content, the control group used a standard PC to receive Mars scene learning guidance (see the right panel of Figure 3). The video, sourced from https://www.youtube.com/watch?v=ZEyAs3NWH4A (accessed on 1 July 2024), was selected from similar Mars exploration image material. It was appropriately edited to match the VR video in terms of duration (4 min), visual content, pacing, and subtitles.
4.2.2. Pre-Test
Before the formal experiment, each participant was required to complete a cognitive ability measurement scale as a benchmark to evaluate the overall cognitive ability levels in both the experimental and control groups. This scale comprised 11 items across three dimensions: memory, logical thinking, and creative thinking, using a 5-point Likert scale for scoring, ranging from never (1) to always (5) [26,27,28,29].
Reliability analysis of the collected data indicated that Cronbach’s α coefficients for the three dimensions were 0.727, 0.855, and 0.770, respectively. Factor analysis indicated that the explained variance was 67.83%. The scale demonstrated good reliability and validity.
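For readers unfamiliar with the reliability coefficient reported here, the following Python sketch shows how Cronbach’s α is conventionally computed for one dimension of a Likert scale. It is an illustration only: the function name and the response matrix are hypothetical, and the values reported above were presumably obtained with a statistics package rather than with this code.

```python
from statistics import variance

def cronbach_alpha(rows):
    """Cronbach's alpha for a list of respondents, each a list of k item scores."""
    k = len(rows[0])
    items = list(zip(*rows))                      # one tuple of scores per item
    item_var_sum = sum(variance(col) for col in items)
    total_var = variance([sum(r) for r in rows])  # variance of the sum scores
    return k / (k - 1) * (1 - item_var_sum / total_var)

# Hypothetical responses on a 5-point scale (4 respondents, 3 items).
demo = [[2, 3, 3], [4, 4, 5], [1, 2, 2], [5, 4, 5]]
alpha = cronbach_alpha(demo)
print(round(alpha, 3))
```

The coefficient approaches 1 when the items of a dimension rise and fall together across respondents, which is why values above roughly 0.7 are usually read as acceptable internal consistency.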
4.2.3. Learning Assessment Instruments
With reference to the proposed cognitive framework, we developed four tests based on the video course content to evaluate the participants’ learning performance related to memory, analogy, logical thinking, and creative thinking. Each cognitive link was operationalized through a corresponding performance-based assessment and designed based on previous research. Specifically, memory was measured through recall tasks involving both textual and visual information derived from the instructional content; analogy was assessed through proportional reasoning and transfer-based comparison tasks; logical thinking was evaluated via open-ended inference questions involving deduction, induction, and abduction; and creative thinking was measured using an expert-rated design task assessing novelty and feasibility. These four instruments represent observable indicators corresponding to the four conceptual links of the framework and allow partial empirical examination between the two groups under different learning environments. The details are as follows.
Memory performance test: To assess participants’ memory of the learning content, this study referenced the design of memory tests in previous research [7,30,31] and built memory test items based on both textual and scene-related information from the instructional material. The memory test consisted of 15 questions, including 7 multiple-choice questions (e.g., “What is the color of the surface of Mars? (A) Blue, (B) Yellow, (C) Red [correct answer], (D) Gray”) and 5 fill-in-the-blank questions (e.g., “Evidence of water on Mars is _______”) (each worth 1 point). In addition, participants completed one sequencing task, in which they were asked to recall the order of scenes presented in the video (worth 4 points). Finally, there were 2 short-answer questions (e.g., “Which rovers mentioned in the video could not operate on Mars? What were the reasons for their failure?”) (worth a maximum of 9 points). All questions were based on the content mentioned in the video. Participants were allowed a maximum response time of 3 min.
Analogy performance test: The analogy ability test was designed based on previous studies [32,33,34]. It included six proportional analogical reasoning problems (e.g., “Earth:Tornado::Mars:(Sandstorm)”) (each worth 1 point), and three short-answer questions (e.g., “Based on the Mars scene video, compare the climate on Mars with that on Earth”) (each worth 3 points). The test aimed to assess participants’ analogical transfer ability in different learning environments. Participants were allowed a maximum response time of 3 min.
Logical thinking performance test: The logical thinking ability test was designed based on deduction, induction, and abduction [35,36]. It included 8 short-answer questions (e.g., “What features do you think transportation vehicles on Mars should have to cope with the challenges of the Martian environment?”) (each worth 5 points). This test evaluated participants’ ability to make sound inferences based on the knowledge they acquired and to solve specific problems. Participants were allowed a maximum response time of 8 min. Responses were scored based on the following criteria, which assessed their reasoning and completeness:
- (1) 5 points: Fully consistent with Mars’ environmental characteristics, covers ≥3 core points, logical analysis is clear, and meets all question requirements;
- (2) 3–4 points: Addresses the question core, includes 2–3 key points with reasonable logic, or aligns with Mars’ environment but is not comprehensive;
- (3) 1–2 points: Mentions 1 relevant key point without analysis, logic is vague, or partially deviates from requirements;
- (4) 0 points: Unanswered, completely irrelevant, or contradicts basic Mars-related common sense.
Creative thinking performance test: An open-ended challenge was designed to measure creative thinking abilities [37,38]. Based on the Mars knowledge previously learned, participants were asked to design a vehicle for living on Mars, which must be distinct from all existing vehicles on Earth and adaptable to Mars’s complex geographical environment. Participants were required to draw a simple design on paper and provide a brief explanation of their design, including but not limited to functionality, materials, appearance, dimensions, transmission methods, and power source. Participants were given 8 min to complete their designs. We invited two experts to evaluate the vehicle designs on two dimensions, novelty and feasibility, using a 5-point Likert scale [37,39]. The final score for each dimension was calculated by averaging the two experts’ ratings. Novelty refers to the degree to which the design surpasses existing Mars rover designs, and feasibility assesses the likelihood of implementing the design. The consistency of ratings between the two experts was evaluated using IBM SPSS Statistics 27. The intraclass correlation coefficients (ICC; two-way mixed-effects model, consistency type) for novelty and feasibility were 0.710 (95% CI: [0.547–0.821]) and 0.653 (95% CI: [0.469–0.783]), respectively, indicating moderate inter-rater consistency [40].
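As an illustration of the consistency-type ICC under a two-way model, the following Python sketch computes both the single-measure and average-measure forms from the standard mean-square formulas. The text does not state which of the two forms SPSS reported, and the function name and rating matrix below are hypothetical.

```python
def icc_consistency(ratings):
    """Single- and average-measure consistency ICC for an n x k rating table.

    ratings: list of n rated designs, each a list of k rater scores.
    """
    n, k = len(ratings), len(ratings[0])
    grand = sum(map(sum, ratings)) / (n * k)
    row_means = [sum(r) / k for r in ratings]
    col_means = [sum(r[j] for r in ratings) / n for j in range(k)]

    # Two-way ANOVA decomposition without replication.
    ss_total = sum((x - grand) ** 2 for r in ratings for x in r)
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_err = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)
    ms_err = ss_err / ((n - 1) * (k - 1))
    single = (ms_rows - ms_err) / (ms_rows + (k - 1) * ms_err)
    average = (ms_rows - ms_err) / ms_rows
    return single, average

# Hypothetical 5-point ratings of four designs by two experts.
demo = [[3, 4], [2, 2], [5, 4], [1, 2]]
icc_single, icc_average = icc_consistency(demo)
```

Because the consistency type removes the rater (column) effect, two experts who differ only by a constant offset still obtain an ICC of 1; this is the property that distinguishes it from the absolute-agreement type.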
All scorers were blinded to the participants’ group (VR vs. PC) during the entire scoring process, which effectively mitigated potential scoring bias.
4.2.4. EEG Measurement Tool
To further examine the impact of different learning environments on cognitive function, we used the NeuroSky MindWave Mobile device (NeuroSky Inc., San Jose, CA, USA) to collect participant EEG data from the frontal Fp1 electrode, which plays a crucial role in cognitive attention, working memory, and learning-related psychological states [41]. Previous research [42] reported that different cognitive states induce distinct power changes in specific EEG frequency bands. These changes exhibit significant discriminability in the Fp1 channel, which can be used to reflect individuals’ different cognitive states (e.g., meditation). That study further supported the effectiveness of using the NeuroSky MindWave Mobile device to assess cognitive states. The NeuroSky MindWave, with its lightweight and wireless design, has been widely used in VR educational research and cognitive neuroscience [43,44,45]. Compared to multi-channel systems, the single-channel NeuroSky MindWave minimized user discomfort and reduced signal interference caused by prolonged use or electrode displacement when worn concurrently with heavy VR headsets, thereby improving both the feasibility of the experiment and the stability of the collected data. In addition, previous research has shown its good reliability and stability in capturing EEG signals from the Fp1 electrode and estimating attention and meditation levels [46,47,48,49]. In this experiment, participants in the VR group wore both a VR headset and the EEG device simultaneously. The device captures raw EEG at 512 Hz and employs an algorithm called eSense to measure meditation and attention levels on a scale of 0 to 100, transmitting the data via Bluetooth to the paired computer for further analysis [48].
Previous research has indicated that alpha, beta, and theta signals are closely associated with cognitive learning activities [7,37,50]. Therefore, this study focused specifically on changes in the activity of these three band signals. The frequency bands used were defined as follows: theta (4–7 Hz), alpha (8–12 Hz), and beta (13–30 Hz).
It is worth noting that participants in the VR environment wore a VR headset and the EEG device simultaneously, which may increase the risk of artifacts in the collected signals, including ocular artifacts from blinks, facial electromyography (EMG), and headset pressure on the scalp. To mitigate such risks, participants were asked to minimize excessive body movements and unnecessary muscle contractions during the experiment. In addition, strict preprocessing procedures were applied to the raw EEG signals to suppress the interference.
Before analysis, raw EEG data were preprocessed to remove artifacts and obtain relatively clean signals. We applied filters to eliminate non-physiological artifacts, including a 50 Hz IIR notch filter to suppress power line interference and a 5th-order zero-phase Butterworth bandpass filter (0.5–32 Hz) to remove low-frequency and high-frequency noise components. Then, to mitigate transient spikes caused by muscle activity and eye movements, we adopted the SWT, a widely used denoising technique for EEG [51,52]. SWT is a multi-resolution signal decomposition method that maps time-domain signals to the time-frequency domain using a mother wavelet, decomposing the signal into approximation coefficients ($cA_j$) and detail coefficients ($cD_j$). The $cA_j$ primarily reflects the low-frequency, sustained components of the valid EEG, while the $cD_j$ contains high-frequency information across different scales. Transient artifacts manifest as instantaneous spikes in the time domain, which appear as localized, high-amplitude outliers in the $cD_j$ after SWT decomposition [53]. Compared with the conventional discrete wavelet transform, SWT offers translation-invariance and is widely applicable to denoising in single-channel EEG systems (e.g., NeuroSky MindWave) [52,54]. Moreover, a previous study compared multiple denoising methods and demonstrated that the approach combining SWT with statistical thresholding exhibits excellent performance in preserving neural components intact during the denoising process of single-channel EEG signals [55]. Therefore, we applied a statistical threshold to selectively process these high-amplitude outliers in the $cD_j$, eliminating artifacts while preserving the meaningful, smoothly distributed high-frequency information. Specifically, in our study, the Coif4 wavelet was selected as the mother wavelet (due to its waveform similarity to ocular artifacts) [53,54], and the decomposition level was set to 8 to obtain the $cA_j$ and $cD_j$. The statistical threshold given by (1) was then applied to the $cD_j$ at levels 4–8 (corresponding to 16–32 Hz, 8–16 Hz, 4–8 Hz, 2–4 Hz, and 1–2 Hz). For each level, we calculated the standard deviation of the $cD_j$ at that level and used an empirical factor of 1.5 to adjust the strictness of the threshold, thereby obtaining the threshold $T_j$ for the $cD_j$ at that level. This empirical threshold has been widely used in EEG artifact removal to effectively separate high-amplitude ocular/muscle artifacts from low-amplitude neural activity [54,55,56]. After this, a hard-thresholding operation was performed: the coefficients with absolute values greater than the threshold were set to zero [54,55]. Finally, the inverse SWT was conducted to reconstruct the denoised signal. SWT decomposition and inverse transformation were implemented via the swt() and iswt() functions in MATLAB R2022a. A comparison between the raw signal and the preprocessed reconstructed signal is shown in Figure 4.
$$T_j = 1.5 \times \mathrm{std}(cD_j) \qquad (1)$$
where $cD_j$ denotes the detail coefficients at the $j$th level, and std(·) represents the standard deviation (threshold factor = 1.5).
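The thresholding rule described above can be sketched in a few lines of Python. This is a minimal illustration rather than the MATLAB implementation used in the study: it assumes the detail coefficients have already been produced by an SWT routine, it uses the sample standard deviation (the default of MATLAB’s std), and the function names and example coefficients are hypothetical.

```python
from statistics import stdev

def suppress_outliers(detail_coeffs, factor=1.5):
    """Zero every detail coefficient whose magnitude exceeds factor * std."""
    threshold = factor * stdev(detail_coeffs)
    cleaned = [0.0 if abs(c) > threshold else c for c in detail_coeffs]
    return cleaned, threshold

def denoise_levels(details_by_level, levels=range(4, 9), factor=1.5):
    """Apply the rule to the chosen levels only; other levels pass through."""
    return {
        level: suppress_outliers(coeffs, factor)[0] if level in levels else list(coeffs)
        for level, coeffs in details_by_level.items()
    }

# Hypothetical coefficients: level 3 is left alone, level 5 has one spike.
demo = {3: [0.3, -9.0, 0.2], 5: [0.1, -0.2, 0.15, -0.1, 5.0, 0.05]}
cleaned = denoise_levels(demo)
```

In this example only the isolated spike at level 5 is set to zero, while the small coefficients around it and the untouched level 3 are preserved, which mirrors the intent of removing transient artifacts without flattening the remaining activity.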
After preprocessing to remove noise from the EEG signals, time–frequency analysis was performed for each subject using the STFT with a 1 s Hann window and 50% overlap to calculate the power spectral density (PSD). To reduce inter-individual variability, the relative power of each frequency band (theta: 4–7 Hz, alpha: 8–12 Hz, beta: 13–30 Hz) was further computed to analyze changes in brain activity across bands. The calculation method is as follows:
where $RP_{band}$ denotes the relative power within the corresponding frequency band, and $P_{band}$ represents the power value.
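The relative power computation can be illustrated with the following Python sketch, which uses NumPy rather than the MATLAB code of the study. Two points are assumptions made for illustration: each band’s power is normalized by the sum of the theta, alpha, and beta powers (the normalizing term is not recoverable from the text here), and the segment spectra are simply averaged over time. Spectral scaling constants are omitted because they cancel in the ratio.

```python
import numpy as np

FS = 512  # sampling rate reported for the device (Hz)
BANDS = {"theta": (4, 7), "alpha": (8, 12), "beta": (13, 30)}

def relative_band_power(signal, fs=FS):
    """Relative theta/alpha/beta power from a 1 s Hann-window STFT, 50% overlap."""
    signal = np.asarray(signal, dtype=float)
    win_len, hop = fs, fs // 2
    window = np.hanning(win_len)
    freqs = np.fft.rfftfreq(win_len, d=1 / fs)

    # Average the windowed periodograms over all full segments.
    starts = range(0, len(signal) - win_len + 1, hop)
    psd = np.mean(
        [np.abs(np.fft.rfft(signal[s:s + win_len] * window)) ** 2 for s in starts],
        axis=0,
    )

    power = {
        name: psd[(freqs >= lo) & (freqs <= hi)].sum()
        for name, (lo, hi) in BANDS.items()
    }
    total = sum(power.values())  # assumption: normalize by the three-band sum
    return {name: p / total for name, p in power.items()}

# Synthetic check: a pure 10 Hz tone should fall almost entirely in alpha.
t = np.arange(0, 8, 1 / FS)
rp = relative_band_power(np.sin(2 * np.pi * 10 * t))
```

With a 1 s window the frequency resolution is 1 Hz, so the integer band edges listed above map directly onto FFT bins.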
All computations were implemented in MATLAB R2022a. Data were analyzed using SPSS software to compare the performance differences between the experimental and control groups. Furthermore, we analyzed participants’ meditation and attention metrics during the tasks.
4.3. Experimental Procedures
This study conducted a cognitive experiment focused on Mars scene learning. Participants were randomly assigned to either the experimental group or the control group and completed the experiment individually in a laboratory. Before the experiment began, we provided a brief overview of the study, informing participants that the session would last approximately 40 min and that they would wear an EEG device. After viewing the Mars scene, participants were required to complete tests related to the video content. The specific steps of the experiment were as follows: (1) Signing the informed consent form; (2) Completing the cognitive ability pre-test; (3) Wearing the EEG device; (4) Familiarization with the equipment and experimental environment; (5) Watching the VR/PC Mars scenes; (6) Memory performance test; (7) Analogy performance test; (8) Logical thinking performance test; (9) Creative thinking performance test; (10) watching the VR/PC Mars scenes again; (11) Modify and supplement four tests. One VR participant with abnormal EEG data was excluded, including the attention and meditation measurements during video viewing, but the results on four cognitive tests after the video were retained. Specifically, the abnormal EEG data of this participant were objectively identified during the post-experimental data preprocessing phase. During the experiment, the EEG electrode pad attached to the participant’s scalp became loose, which impaired the stability of EEG signal acquisition: only the first 131 s of valid EEG signals were collected during the Mars video viewing task. Moreover, in the preprocessing of the raw EEG signals, invalid acquisition segments caused by poor electrode contact were clearly observed, which compromised the integrity of the experimental data. 
This EEG signal acquisition abnormality was an independent hardware contact failure, unrelated both to the VR experimental conditions (e.g., headset fit, movement, sickness) and to the participant’s own cognitive and behavioral states. In addition, the participant completed all the cognitive tests smoothly, with complete responses and no abnormal performance. This study was approved by the institutional ethics committee, and informed consent was obtained from all participants.
5. Results
5.1. Pre-Test Results
Before the experiment, we assessed the baseline cognitive abilities of both groups to check for any initial differences. Data normality was tested using the Shapiro–Wilk test, and some groups did not meet the normality assumption (p < 0.05). We applied the Mann–Whitney U test for analysis. As shown in Table 1, no significant differences were found between the VR-based experimental group and the PC-based control group in memory (U = 336.0, p = 0.612), logical thinking (U = 330.0, p = 0.547), and creative thinking abilities (U = 300.5, p = 0.264).
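As an illustration of the group comparison described above, the following is a minimal sketch of the Mann–Whitney U statistic with a two-sided p-value from the normal approximation. It is not the SPSS procedure used in this study: it omits the tie and continuity corrections that statistical packages may apply, the function names are hypothetical, and the scores in the usage lines are invented placeholders rather than study data.

```python
from math import sqrt, erfc

def midranks(values):
    """Return 1-based ranks; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def mann_whitney_u(group_a, group_b):
    """Return (U for group_a, Z, two-sided p) using the normal approximation.

    Assumption: no tie correction and no continuity correction are applied.
    """
    n_a, n_b = len(group_a), len(group_b)
    ranks = midranks(list(group_a) + list(group_b))
    rank_sum_a = sum(ranks[:n_a])
    u_a = rank_sum_a - n_a * (n_a + 1) / 2
    mean_u = n_a * n_b / 2
    sd_u = sqrt(n_a * n_b * (n_a + n_b + 1) / 12)
    z = (u_a - mean_u) / sd_u
    p_two_sided = erfc(abs(z) / sqrt(2))
    return u_a, z, p_two_sided

# Hypothetical pre-test scores (placeholders, not data from this study).
vr_scores = [12, 15, 11, 14, 13, 16]
pc_scores = [13, 14, 12, 17, 15, 16]
u_stat, z_stat, p_value = mann_whitney_u(vr_scores, pc_scores)
print(f"U = {u_stat:.1f}, Z = {z_stat:.3f}, p = {p_value:.3f}")
```

Because packages differ in which group's U they report and in the corrections applied, values from this sketch may differ slightly from SPSS output for the same data.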
5.2. Test Outcomes
To determine the impact of different learning environments on learning outcomes without prior information, we first used the Shapiro–Wilk test to assess the normality of the data in each group, and some data did not meet the assumption of normality (p < 0.05). We conducted the Mann–Whitney U test to examine the scores of the two groups of participants in the four cognitive dimensions: memory, analogy, logical thinking, and creative thinking.
The results are presented in Table 2. To limit the Type I error, a Bonferroni correction for multiple comparisons was specifically applied to cognitive tests involving two repeated measurements for the same dimension (e.g., Memory: 0.05/2 = 0.025). H1 predicted that the VR environment significantly influences learning outcomes (the four cognitive links) in individuals without prior knowledge. As shown in Table 2, there were no significant differences in learning outcomes between the VR environment and the traditional PC environment in the first test results (p > 0.025). In the second test results, the VR group’s scores on the memory test were significantly lower than those of the PC group (U = 222.0, p = 0.013, |r| = 0.337), indicating a medium effect size. No significant differences were observed in analogy, logical, and creative thinking tests (p > 0.025).
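The corrected threshold and the effect size reported above can be checked with a short sketch. It assumes the common convention |r| = |Z|/√N, with |Z| recovered from the rounded two-sided p-value, and N = 54 (all recruited participants, since the behavioral data were retained for everyone). The paper does not state which formula was used, so both the convention and the function names are assumptions.

```python
from math import sqrt
from statistics import NormalDist

def bonferroni_alpha(alpha, n_tests):
    """Per-test significance threshold after Bonferroni correction."""
    return alpha / n_tests

def effect_size_r_from_p(p_two_sided, n_total):
    """Approximate |r| = |Z| / sqrt(N), recovering |Z| from a two-sided p-value."""
    z_abs = NormalDist().inv_cdf(1 - p_two_sided / 2)
    return z_abs / sqrt(n_total)

# Two repeated measurements per cognitive dimension, as described in the text.
alpha_per_test = bonferroni_alpha(0.05, 2)

# Second-round memory test: reported p = 0.013 (rounded), assumed N = 54.
p_memory_round2 = 0.013
significant = p_memory_round2 < alpha_per_test
r_approx = effect_size_r_from_p(p_memory_round2, n_total=54)

print(f"alpha per test = {alpha_per_test:.3f}, significant: {significant}")
print(f"approximate |r| = {r_approx:.3f}")
```

Because the p-value is rounded to three decimals, the recovered |r| is only approximate; it lands close to the reported 0.337, which falls in the range conventionally labeled a medium effect.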
5.3. Attention and Meditation Results
To investigate the effects of the VR environment on participants’ attention and meditation without prior information, we compared the differences between the two groups during their initial viewing of the Mars video. The Shapiro–Wilk test indicated that the data in each group followed a normal distribution (all p > 0.05). Therefore, we employed independent samples t-tests to examine the group differences. The results are presented in Table 3, and the corresponding boxplots are shown in Figure 5.
H2 predicted that the VR environment significantly influences cognitive engagement in individuals without prior knowledge, including attention and meditation. As shown in Table 3, both attention and meditation metrics in the VR group were lower than those in the PC group, with meditation demonstrating a significant difference (t = −2.017, p = 0.049, Cohen’s d = 0.554).
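The relationship between the reported t statistic and Cohen’s d can be illustrated with a brief sketch. For an independent samples t-test with pooled variance, |d| = |t|·√(1/n₁ + 1/n₂). The group sizes below (26 VR and 27 PC participants) are an assumption: they follow from 54 recruited participants and one excluded VR EEG record only if allocation was equal, which this section does not state explicitly.

```python
from math import sqrt

def cohens_d_from_t(t_value, n_a, n_b):
    """Magnitude of Cohen's d implied by an independent-samples t statistic."""
    return abs(t_value) * sqrt(1 / n_a + 1 / n_b)

# Assumed group sizes (not stated explicitly in this section).
N_VR, N_PC = 26, 27

# Meditation metric: reported t = -2.017.
d_meditation = cohens_d_from_t(-2.017, N_VR, N_PC)
print(f"Cohen's d (meditation) = {d_meditation:.3f}")
```

Under these assumed group sizes the sketch reproduces the reported effect size to three decimals; the reported d is given as a magnitude, while the negative t indicates that the VR group mean was the lower one.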
5.4. EEG Measurement Results
To further address H2, which predicted that the VR environment significantly influences EEG characteristics in individuals without prior knowledge, we compared the relative power of each frequency band between the two groups during their initial viewing of the Mars video. The Shapiro–Wilk test indicated that all data were normally distributed (all p > 0.05). Therefore, independent samples t-tests were conducted to examine group differences. The results are presented in Table 4, and the corresponding boxplots are shown in Figure 6.
Regarding H2, significant differences were observed in EEG activity between the two groups under different learning environments. The VR group exhibited significantly higher relative power in the beta band than the PC group (t = 2.574, p = 0.013, Cohen’s d = 0.707), whereas it showed significantly lower relative power in the theta band (t = −2.694, p = 0.010, Cohen’s d = 0.740).
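For readers unfamiliar with the measure, relative band power is commonly defined as the absolute power in one band divided by the total power across all analyzed bands. The sketch below illustrates that general definition only; it does not reproduce the MATLAB pipeline used in this study, and the band set and power values are hypothetical placeholders.

```python
def relative_band_power(band_power):
    """Divide each band's absolute power by the total across all bands."""
    total = sum(band_power.values())
    if total <= 0:
        raise ValueError("total power must be positive")
    return {band: power / total for band, power in band_power.items()}

# Hypothetical absolute band powers (arbitrary units, not study data).
example_power = {"delta": 40.0, "theta": 20.0, "alpha": 25.0, "beta": 15.0}
relative = relative_band_power(example_power)

for band, share in relative.items():
    print(f"{band}: {share:.2f}")
```

Because the shares sum to one, a higher relative beta power in one group necessarily comes with lower relative power in at least one other band, which is worth keeping in mind when interpreting the opposite-signed theta and beta differences.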
6. Discussion
This study aims to explore the impact of a VR visual environment on cognitive processes without prior knowledge. Based on classical cognitive theory, this paper proposed an interpretative framework of visual cognition without prior knowledge. Inspired by the theoretical model, an exploratory cognitive experiment was designed and conducted to examine two core hypotheses concerning the effects of VR learning conditions on cognitive performance. The overall experimental results indicate that, under the present experimental conditions, support for H1 was largely limited, with only the second-round memory measure showing a statistically significant effect. Empirical support for H2 was tentative, as indicated by the results related to meditation metrics and EEG spectral characteristics. Specifically, the empirical contributions of this study are mainly reflected in the following aspects:
The results of the memory test showed that there was no significant difference between the two groups after the first learning session. However, when learners were asked to review the learning content and modify their answers, the VR group performed worse in memory. This may suggest that the lack of prior knowledge is the primary factor affecting memory when first learning entirely new knowledge, and the VR setting did not significantly enhance knowledge retention. This is consistent with our theoretical research, where the lack of prior knowledge is a major limiting factor for the first cognitive link—memory. However, after repeated learning of the same content, participants in the PC group were able to accurately locate and remember key information. In contrast, participants in the VR group seemed to be more focused on the VR experience itself and more noticeably influenced by the learning environment, and they faced challenges in identifying and remembering key knowledge.
No significant differences were found between the two groups in the analogy, logical thinking, or creative thinking tests. This indicates that, within the scope of the present experimental design, these cognitive processes appear relatively stable and unaffected by changes in the learning environment.
In addition, we observed that both attention and meditation metrics were lower in the VR group, with meditation metrics showing a significant difference. This finding appears to contrast with the findings of Wang et al. [37], who reported increased attention in immersive VR environments, and Yang et al. [38], who found that attention levels increased while meditation levels decreased. These studies typically used familiar scenes to construct their VR environments, whereas our study additionally introduced the absence of prior information. These differences in results may reflect the influence of prior knowledge on learners’ cognitive states in such learning contexts. Specifically, in the unfamiliar Mars setting designed for this study, VR did not appear to produce the commonly reported positive effects on attention or relaxation. Instead, the prior-free setting in VR may make it more challenging for learners to maintain focused attention and achieve a relaxed state. As mentioned by Makransky et al. [30], VR environments may distract learners and impair cognitive processing. This pattern is also broadly consistent with the lower memory performance observed in the VR group.
We also observed significant differences in the relative power of the theta and beta bands at Fp1. Previous research by Takahashi et al. [57] reported that theta activity is associated with meditation, with a marked increase in frontal theta activity observed during meditation. In addition, Liang et al. [50] and Lee et al. [58] reported associations between beta wave activity and cognitive tension as well as increased cognitive load. Moreover, previous research on VR [59] found that theta waves in the Fp1 region were enhanced in immersive virtual learning environments, which was interpreted as being consistent with patterns linked to maintaining attention and suppressing distractions, thus facilitating the learning process. However, in our study, a decrease in theta wave activity was observed under the VR Mars environment. Given that participants had no prior knowledge and found the content unfamiliar, this change in theta activity may reflect differences in cognitive processing compared with more familiar learning contexts. Learners may have found it more difficult to maintain focus and relaxation under such conditions. This interpretation appears to align with our behavioral data: the VR group exhibited significantly lower meditation metrics and memory performance. As noted by Monteiro and Liang [60], VR may not be ideal for short-term learning because the brain is already taxed by the need to adapt to the VR environment, potentially limiting the cognitive resources available for concurrent learning-related processing. In the context of our experimental setup, learners’ cognitive resources may have been redirected from the effective encoding of key learning information to managing a prior-free environment, which appears to be compatible with patterns associated with higher cognitive load as reflected in the observed beta activity. This explanation aligns with Makransky et al. [5], who suggested that VR learning may increase extraneous cognitive load, resulting in less learning retention. As described in the theoretical framework of this study, this shift may have weakened their ability to capture and encode key information, thereby affecting the development of the memory link. Although changes in beta waves are also associated with positive emotions [61], studies have shown that learning efficiency in VR is generally lower and often requires more learning time [7].
Although previous research has shown that VR performs well in supporting learning, enhancing creativity, and increasing focus, these benefits were not fully demonstrated in the present experimental conditions. This finding highlights the potential role of prior knowledge under the VR conditions used in this study, suggesting that VR learning may be more dependent on the learner’s prior knowledge. While Kim et al. [62] mentioned that VR can significantly improve learning experiences for individuals lacking experience, such studies are still limited to everyday, familiar scenes, which are not entirely devoid of prior knowledge. The VR Mars environment used in this study may have reduced the positive effects on meditation, memory, knowledge transfer, and creativity. According to Sweller’s Cognitive Load Theory, learners without prior knowledge are susceptible to the influence of the learning environment, and their limited cognitive resources may have been increasingly allocated to processing the learning environment itself rather than task-relevant information [63]. In this Mars-based VR environment, learners’ ability to extract and process relevant information may be restricted, posing greater demands on information processing resources. The VR research of Kim [64] on naturalistic environments also points to potential challenges for tasks that require sustained cognitive focus: rich visual stimuli may enhance the processing of external attention, while undermining the sustained processing of task-relevant information. As Makransky et al. [5] noted, attention to irrelevant information may trigger extraneous cognitive load, making relaxation and focused learning more difficult. Therefore, under VR learning scenarios involving highly unfamiliar content and limited prior knowledge, as examined in the present study, it may be important for VR designs to give careful consideration to complex visual and environmental factors. As highlighted by Khan et al. [65], the design of the VR environment should consider the control of cognitive load. Complex designs, although providing a stronger sense of presence, do not necessarily lead to better learning outcomes, especially in unfamiliar environments, where attention to unnecessary information may be triggered, potentially suppressing memory retention, knowledge transfer, and creative potential.
7. Limitations and Future Research
This study acknowledges several limitations and offers directions for future research. First, although we recruited 54 college students from the same institution as participants, the sample size remains relatively small, and the participant group lacks diversity. Future studies should consider including a larger and more diverse sample, encompassing different age groups and educational backgrounds, to improve the generalizability of the findings. It should be noted that one participant’s EEG data were excluded due to hardware contact failure, while the valid behavioral data were retained. This asymmetrical handling of physiological and behavioral data may affect the robustness of the results. Future studies should conduct robustness analyses by reporting results both with and without this participant included.
Secondly, this study selected a Mars exploration scenario to create a learning environment without prior knowledge. However, the curiosity effect of the Mars scenario may have introduced additional influences on learners’ cognitive processes. Although efforts were made during the experimental design to ensure consistent learning content between the PC and VR groups and to give the VR group time to adapt to the VR environment, in order to minimize biases caused by curiosity, the specific and relatively niche theme may still limit the generalizability of the results. In addition, differences in visual presentation across platforms (such as frame composition, color balance, and perceptual intensity) may have affected participants’ responses, potentially impacting stimulus equivalence between the groups. Future research could consider other topics and incorporate methods such as interviews, as well as more careful matching of video materials across platforms, to more comprehensively assess the effects of VR on individuals’ cognitive processes. This would help to validate and extend the findings of this study and provide support for the broader application of VR.
Thirdly, the fixed sequence of the four cognitive tests (memory, analogy, logical thinking, and creative thinking) may have introduced potential confounds, such as learning effects, fatigue, or the inadvertent influence of one task on subsequent ones. Moreover, while these tests were designed to align with the proposed cognitive framework and referenced prior literature, the validity evidence for some constructed measures—particularly analogy and logical thinking tasks—remains limited. Future studies should adopt counterbalanced test sequences and incorporate validated assessment instruments to isolate the effects of the learning environment.
Fourthly, the VR Mars scenario used in this study simultaneously incorporated multiple inherent attributes of VR, such as immersion, motion, and headset discomfort. The present study did not disentangle the independent cognitive effects of these attributes. Therefore, the cognitive states and learning performance observed in the VR group should be understood as the combined effects of these interacting factors. Future research could employ controlled experimental designs in which only a single VR attribute is manipulated, thereby clarifying the independent effects of different VR components on cognitive processing and learning outcomes.
Additionally, the potential relationships among the cognitive links described in the conceptual framework were not empirically examined in the present study. Future research could be designed to empirically investigate the development of the cognitive links proposed in the theoretical model, while collecting dynamic EEG data to provide more robust empirical support for the model.
Furthermore, this study utilized the NeuroSky MindWave, which reduces wearing discomfort when combined with the VR headset. However, single-channel EEG devices are unable to capture rich EEG signals from other brain regions and therefore cannot reflect neural activity in the entire brain. Meanwhile, the study did not include formal validation checks or sensitivity analyses for the EEG methodology, and the proprietary attention and meditation metrics lack extensive validation within cognitive neuroscience research. In addition, although Bonferroni correction was applied to repeated measures within the same construct, which aligns with common practices in cognitive and neurophysiological research, multiple comparisons across behavioral and EEG indicators that support a shared inferential narrative were not corrected for familywise error. Accordingly, the EEG findings, including the attention and meditation metrics and the rhythmic signals, should be interpreted primarily as indicative rather than diagnostic. Future research could consider combining a more convenient, non-head-mounted display (non-HMD), surround-screen projection-based VR system with a multi-channel EEG system and adopting more stringent statistical procedures. This combination could mitigate the adverse effects of wearing multiple devices on EEG signal capture and participant comfort. Meanwhile, capturing EEG signals from multiple brain regions would provide more comprehensive neurophysiological data for VR-based cognitive research.
Finally, subsequent research can also focus on the design of diverse VR learning environments, exploring how different design elements influence cognitive processes. This may help identify optimal design strategies that enhance user experiences and improve learning outcomes in VR. Such efforts will contribute to a deeper understanding and application of VR technology, offering empirical support for the integration of modern digital information technologies with human cognitive systems.
8. Conclusions
This paper proposes a conceptual theoretical framework to describe visual cognition in VR environments without prior knowledge. The cognitive process of visual information was organized as four related links: memory link, analogy link, logical thinking link, and creative thinking link. These links are intended to characterize key cognitive processes engaged during learning. The framework provides a conceptual perspective for understanding how visual information in VR may contribute to constructing abstract cognition.
In addition, a cognitive experiment inspired by the theoretical framework of this study showed that, under the conditions of the present design, the VR group exhibited lower memory performance in the second round and reduced meditation metrics. In contrast, no significant differences were observed in analogy, logical thinking, or creative thinking abilities. The experiment also revealed significant differences in theta and beta activity. These results highlight the potential importance of prior knowledge in such highly unfamiliar and exploratory VR-based learning contexts: learners without prior knowledge may be more susceptible to the characteristics of the VR environment, which could distract them and reduce relaxation, potentially hindering memory performance in such contexts. Moreover, under such conditions, the findings suggest that analogy, logical thinking, and creative thinking are more strongly related to overall cognitive abilities than to variations in the learning environment.
In conclusion, this study offers a theoretical framework for understanding visual cognitive processes in VR learning without prior knowledge. The proposed framework extends established cognitive theories within a specific learning condition, offering a conceptual perspective for interpreting how learners engage with highly novel visual information in VR environments. It represents an exploratory attempt to address a gap in current theoretical research concerning the characteristics of VR learning under conditions of absent prior knowledge. By situating cognitive mechanisms within this special but increasingly relevant scenario, the study offers a focused theoretical reference for future research at the intersection of cognitive theory and immersive learning technologies, while calling for further empirical research to validate the proposed framework and extend these findings to broader contexts.
In terms of practical implications, unlike previous research that emphasizes the benefits of VR in education, this study highlights potential challenges of VR-based learning for learners without prior knowledge engaging with unfamiliar topics. The present results offer preliminary implications that call for a more critical reflection on the broader application of VR in education and training, as well as the need for further investigation. The present study focuses on introductory VR learning scenarios involving novice learners and unfamiliar subject matter. From this perspective, the findings suggest that the role of visually rich but task-irrelevant elements in such contexts deserves closer examination in future research, especially regarding how learners allocate limited cognitive resources during learning. These findings should be interpreted within the exploratory scope of this study, constrained by the experimental design and measurement approach. The study examined only a single VR scene, device, and task, and therefore represents an initial step toward informing future investigations of instructional and design-related considerations in VR-based learning and training without prior knowledge. Further research is required before generalizable principles can be established.