1. Introduction
Population aging and the growing long-term burden of brain-health conditions motivate the development of low-burden, repeatedly applicable, non-invasive approaches that can be deployed in everyday contexts. Complementary stimulation paradigms have been explored, including rTMS combined with cognitive training in Alzheimer’s disease [1]. Rhythmic sensory stimulation paradigms incorporating gamma-frequency components have also been presented through exploratory studies and longer-term case reports [2,3]. More recently, prospective work has directly evaluated the safety and acceptability of 40 Hz auditory stimulation, extending discussions on real-world feasibility [4]. These studies are cited here to contextualize the motivation for repeated-use-oriented designs rather than to position the present healthy-cohort pilot as a clinical efficacy trial.
Global demographic trends further underscore the importance of scalable formats that can support sustained use. According to a United Nations report, the worldwide population aged 65 years and older is projected to increase substantially by 2050 [5]. Alongside demographic aging, Alzheimer’s disease and dementia remain major contributors to global disease burden [6,7]. Mental health conditions represent another major global health burden [8]. The WHO emphasizes prevention, early intervention, risk-factor management, and long-term national-level responses for dementia and related brain-health conditions [9]. Together, these trends motivate research that establishes objective neurophysiological evidence of engagement while also considering delivery formats that can realistically support repeated exposure in everyday environments.
A key scientific rationale for sensory-stimulation approaches is neural entrainment. Gamma-band activity (approximately 30–70 Hz) has been linked to higher-order cognitive processes such as perception, attention, and memory, and accumulating evidence suggests gamma-band abnormalities in neurological conditions, including Alzheimer’s disease. On this basis, inducing gamma entrainment at specific frequencies, particularly around 40 Hz, has been proposed as a means to modulate neurophysiological responses and potentially influence disease-relevant processes [10,11]. Preclinical work has reported that 40 Hz acoustic stimulation may reduce pathological markers and modulate brain rhythms in an Alzheimer’s disease animal model [10], and human EEG research has suggested enhanced neural responses during 40 Hz auditory entrainment under eyes-closed conditions [11]. This line of work supports the plausibility of 40 Hz auditory stimulation as a non-invasive, sensory-based approach while also highlighting the importance of quantitative validation of the delivered stimulus in humans [10,11].
At the same time, delivery design is a practical constraint for repeated use. Presenting a standalone pure 40 Hz tone may elicit adverse perceptual experiences such as discomfort or tinnitus-like sensations, which can limit formats intended for prolonged listening [12]. Integrating 40 Hz components into existing audio content (e.g., music) has been proposed to mitigate this burden; however, music-based approaches can be preference-dependent and may introduce perceived audio-quality changes that affect listening experience. Therefore, there is a need for auditory formats that (i) embed a controlled 40 Hz component in an everyday-compatible listening context with relatively neutral content and (ii) provide objective, EEG-based evidence that the resulting composite stimulus produces detectable frequency-specific signatures around 40 Hz.
The present study is an exploratory pilot quantitative EEG investigation designed to test whether a nature-based soundscape can serve as a low-burden carrier for a frequency-specific 40 Hz component that remains EEG-quantifiable in a realistic listening context. Critically, the 40 Hz OFF (soundscape-only) condition is not positioned as a competing intervention; rather, it functions as a contrast control to isolate the incremental contribution of additively layering a pure 40 Hz sine component within an otherwise identical soundscape experience (same content, preprocessing/mastering chain, and within-participant session structure). Within this contrast-control framework, we examine whether the 40 Hz ON condition yields stronger 40 Hz centered EEG signatures—operationalized as narrowband 40 Hz power and frequency-domain SNR around 40 Hz—and whether it is accompanied by supportive changes in phase-based synchronization indices (phase-locking value, PLV) in a narrow gamma band (35–45 Hz).
Hypotheses. Relative to the 40 Hz OFF contrast control, the 40 Hz ON condition will:
Hypothesis 1 (H1). Increase EEG signatures centered at 40 Hz (narrowband power and SNR around 40 Hz).
Hypothesis 2 (H2). Show an ON > OFF tendency in phase-based synchronization (PLV) in the 35–45 Hz band.
These hypotheses concern stimulus-locked, EEG-quantifiable signatures of the embedded 40 Hz layer, rather than clinical or behavioral efficacy. As an exploratory pilot, statistical tests are reported to flag hypothesis-generating signals rather than to support confirmatory inference.
3. Materials and Methods
3.1. Study Design Overview
This pilot quantitative EEG study examined whether a nature-based soundscape combined with a pure 40 Hz sine wave via additive layering elicits stronger narrowband gamma responses around 40 Hz than a soundscape-only contrast control. We used a single-blind, randomized-order, within-participant crossover design in which each participant completed both conditions, 40 Hz OFF (contrast control) and 40 Hz ON (experimental), within the same assigned soundscape set (Figure 1). To reduce content-specific bias, participants were assigned to one of two soundscape sets (Waves or Forest; between-participants factor), and the within-participant condition order was counterbalanced (Table 1).
Figure 1 summarizes the overall workflow from recruitment and set assignment to the listening session, washout, and EEG analysis.
3.2. Ethics
The study protocol was approved by the Institutional Review Board (IRB No. KMU-202509-HR-503). Written informed consent was obtained from all participants after they were informed of the study objectives, procedures, potential risks, and the processing and protection of personal data. Participants were free to withdraw at any time without penalty and received compensation after completing the study.
3.3. Participants
Adults aged ≥40 years living in Seoul, Republic of Korea, were recruited via online channels. We restricted eligibility to adults aged ≥40 years to align the pilot with a midlife-to-older-adult listening context and to reduce developmental heterogeneity associated with younger adults, while maintaining recruitment feasibility for an initial within-participant EEG study. Inclusion criteria were: (1) age ≥40 years; (2) no substantial difficulty in everyday listening; and (3) ability to complete the listening and EEG procedures. Hearing status was screened by pure-tone audiometry; eligibility required thresholds ≤25 dB HL at 0.5–4 kHz in both ears, consistent with commonly used criteria for normal hearing.
Exclusion criteria included neurological disorders (e.g., epilepsy, Parkinson’s disease), ongoing psychiatric treatment or psychoactive medication, severe tinnitus, implanted medical devices, pregnancy or breastfeeding, and medications known to affect brain activity.
A total of 11 participants were enrolled. Two participants (P01 and P04) were excluded from quantitative EEG analyses because the experimenter observed repeated drowsiness/sleep during the listening blocks, indicating non-adherence to the wakefulness requirement. Accordingly, quantitative EEG analyses were conducted on nine participants (Table 1).
Table 1 summarizes participant characteristics, set assignment, within-participant condition order, and inclusion status (with exclusion reasons).
3.4. Conditions and Counterbalancing
Two conditions were tested within each assigned soundscape set:
40 Hz OFF (contrast control): soundscape-only, using an identical preprocessing/mastering chain.
40 Hz ON (experimental): the same soundscape with an additively layered pure 40 Hz sine component (not amplitude-modulated).
Condition order was randomized and counterbalanced across participants. Participants were not informed whether a given condition contained the 40 Hz component (single-blind).
3.5. Auditory Stimuli
3.5.1. Stimulus Structure
Stimuli followed a soundscape-by-layer structure: soundscape content (Waves vs. Forest) and presence of the 40 Hz layer (OFF vs. ON). For the Waves set, OFF and ON were denoted as A′ and A, respectively; for the Forest set, OFF and ON were denoted as B′ and B. Each participant was assigned to one set (between-participants factor) and completed both OFF and ON within that set (within-participant crossover).
3.5.2. Audio Preprocessing and Additive 40 Hz Layering
All auditory stimuli were rendered as stereo WAV files and followed a unified preprocessing and mastering chain across conditions. To stabilize low-frequency energy and reduce uncontrolled variability that can complicate stimulus characterization (e.g., large, content-dependent sub-bass fluctuations), we applied a steep high-pass filter (HPF; 78 Hz cutoff, 96 dB/oct) to the base soundscape signal. Because 40 Hz lies roughly one octave below the 78 Hz cutoff, the nominal 96 dB/oct slope strongly attenuates carrier energy around 40 Hz (on the order of 90 dB in nominal terms); intrinsic soundscape energy near 40 Hz was thereby minimized, reducing potential masking/overlap with the added 40 Hz line component. The HPF was applied offline using a fixed, identical processing preset for all base soundscape files (both OFF and ON) to ensure consistent magnitude characteristics across stimuli. The resulting attenuation profile was verified via file-based spectral inspection on the rendered WAV outputs (Figure 2a). The cutoff and slope were selected through iterative internal pilot checks as a pragmatic compromise: sufficiently attenuating low-frequency energy that could obscure or confound interpretation of a 40 Hz-specific component while keeping the soundscape perceptually acceptable for the study’s listening context. This preprocessing step was intended primarily to reduce uncontrolled low-frequency masking near 40 Hz for stimulus verification and interpretation; perceptual naturalness was informally checked in internal pilot listening.
We selected additive layering (rather than amplitude modulation, AM) as an engineering choice to preserve the perceptual fidelity of the naturalistic soundscape carrier. Applying AM to the entire soundscape can introduce salient envelope “pumping” artifacts and global timbral changes that may reduce naturalness and make the manipulation more perceptually salient. In contrast, additive layering keeps the carrier waveform and its dynamics intact while enabling precise control and straightforward file-level verification of the intended 40 Hz component as a narrowband line feature. Additive layering can render a tonal cue perceptually salient if the relative level is set too high; therefore, we used conservative embedding and post-export verification to minimize clipping and unintended loudness shifts (Table 2). We also did not use binaural beats because their perceptual strength depends on stable dichotic presentation and varies substantially across listeners, and the resulting “beat” is an illusory percept rather than a physically present 40 Hz line component. In addition, binaural-beat paradigms introduce dependence on stereo separation, ear-canal asymmetries, and headphone fit, complicating reproducible stimulus verification at the file level. Because our primary goal was an EEG-quantifiable, frequency-specific 40 Hz component embedded into an everyday soundscape with straightforward stimulus verification, additive layering offered the most direct and controllable approach for this pilot.
For the 40 Hz ON condition, a pure 40 Hz sine component was additively layered onto the HPF-processed soundscape. Importantly, the HPF was applied only to the base soundscape signal, and the 40 Hz sine component was generated and added after the HPF stage. Let the left and right HPF-processed soundscape channels be $s_L(t)$ and $s_R(t)$. The additively layered 40 Hz component was defined as

$$x_{40}(t) = A \sin(2\pi f_0 t), \qquad f_0 = 40\ \text{Hz},$$

and the final stereo outputs are given by

$$y_L(t) = s_L(t) + x_{40}(t), \qquad y_R(t) = s_R(t) + x_{40}(t).$$

Here, $A$ is a predefined mixing gain applied to the 40 Hz sine component to keep the added component at a consistent, conservative level relative to the carrier soundscape at the file level. The same predefined value of $A$ and the same rendering chain were applied consistently when producing the ON stimuli, thereby fixing the soundscape-to-40 Hz level relationship across files. Spectrogram examples confirm that the 40 Hz OFF stimuli do not show a narrowband 40 Hz line component, whereas the corresponding 40 Hz ON stimuli exhibit a distinct narrowband line at 40 Hz (Figure 2b). Post-export file-based verification further documented integrated loudness (LUFS), true peak level (dBTP), DC offset, clipping, and loudness range (LRA), indicating small loudness shifts after layering and no clipping across files (Table 2). Together, these steps support that OFF/ON stimuli differed primarily by the intended additive 40 Hz layer under an otherwise consistent production chain.
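The additive layering step can be illustrated with a short sketch. This is a minimal Python example under stated assumptions, not the production chain used for the study: the sample rate `FS`, the gain value `A = 0.05` (the actual value is not reported here), and the helper names `layer_40hz` and `bin_power` are introduced only for illustration, and low-level random noise stands in for the HPF-processed soundscape channels. The same 40 Hz sine is added to both channels, and a single-bin DFT checks that only the ON version carries a line component at 40 Hz.

```python
import math
import random

FS = 8000    # assumed sample rate in Hz (illustrative only)
F0 = 40.0    # frequency of the added sine component in Hz
A = 0.05     # assumed mixing gain; the value used in the study is not reported


def layer_40hz(left, right, gain=A, f0=F0, fs=FS):
    """Add the same sine component to both (already high-passed) channels."""
    tone = [gain * math.sin(2.0 * math.pi * f0 * n / fs) for n in range(len(left))]
    on_left = [s + x for s, x in zip(left, tone)]
    on_right = [s + x for s, x in zip(right, tone)]
    return on_left, on_right


def bin_power(signal, freq, fs=FS):
    """Squared magnitude of one DFT bin, normalised by the squared length."""
    n_samples = len(signal)
    re = sum(v * math.cos(2.0 * math.pi * freq * n / fs) for n, v in enumerate(signal))
    im = sum(v * math.sin(2.0 * math.pi * freq * n / fs) for n, v in enumerate(signal))
    return (re * re + im * im) / (n_samples * n_samples)


rng = random.Random(0)
n = FS  # one second of audio
off_left = [rng.uniform(-0.01, 0.01) for _ in range(n)]   # stand-in for the soundscape
off_right = [rng.uniform(-0.01, 0.01) for _ in range(n)]
on_left, on_right = layer_40hz(off_left, off_right)

print("40 Hz bin power, OFF:", bin_power(off_left, F0))
print("40 Hz bin power, ON: ", bin_power(on_left, F0))
```

With an integer number of 40 Hz cycles in the analysed window, the ON bin power is close to A²/4, while the OFF value stays at the noise floor; this mirrors the file-level spectrogram check in spirit only.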
3.5.3. Block Structure and Duration
Each condition comprised seven cycles of 50 s playback followed by 10 s silence, yielding 420 s (~7 min) per condition. Quantitative EEG analyses included playback segments only (50 s) and excluded the silence intervals (10 s).
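The block arithmetic can be made explicit with a small sketch. It assumes that each 60 s cycle starts with playback and ends with silence, and it uses the 250 Hz EEG sampling rate reported in Section 3.8; the function name `playback_windows` is hypothetical.

```python
N_CYCLES = 7      # cycles per condition
PLAY_S = 50       # playback duration per cycle (s)
SILENCE_S = 10    # silence duration per cycle (s)
FS_EEG = 250      # EEG sampling rate (Hz), as reported in Section 3.8


def playback_windows(n_cycles=N_CYCLES, play_s=PLAY_S, silence_s=SILENCE_S, fs=FS_EEG):
    """Return (start, stop) sample indices of each playback segment.

    Assumes each cycle begins with playback and ends with silence.
    """
    cycle = (play_s + silence_s) * fs
    return [(k * cycle, k * cycle + play_s * fs) for k in range(n_cycles)]


windows = playback_windows()
total_s = N_CYCLES * (PLAY_S + SILENCE_S)
analysed_samples = sum(stop - start for start, stop in windows)

print(total_s)                      # 420 s per condition
print(analysed_samples / FS_EEG)    # 350.0 s of playback entering the analysis
print(analysed_samples)             # 87500 samples per channel
```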
3.5.4. Post-Export File Verification and Loudness Consistency Checks
Waveform statistics were extracted from the final rendered WAV files to document integrated loudness (LUFS), true-peak level (dBTP), clipping, DC offset, and loudness range (LRA). Overall loudness changes after layering were small (Waves: +0.1 LUFS; Forest: +0.3 LUFS, ON relative to OFF), and no clipping was detected in any file (Table 2). These file-level checks document consistency of the rendered stimuli (e.g., integrated loudness/true peak/no clipping) but do not substitute for participant-level ear-canal SPL calibration.
Table 2 retains the same underlying values but is presented with a slightly revised layout to facilitate EEG-oriented interpretation, including condition-wise comparisons.
3.6. Procedure
Sessions were conducted in a quiet indoor room. After fitting the EEG cap and adjusting electrode impedances, participants listened to the two conditions within the assigned soundscape set in a randomized and counterbalanced order. Each condition lasted ~7 min, and a 10 min washout period was provided between conditions (Figure 1). The washout was intended to minimize potential carryover related to short-term sensory adaptation, fatigue, or state drift between blocks. Our primary EEG endpoints quantify stimulus-locked steady-state-like signatures during playback; auditory steady-state responses (ASSRs) are elicited by temporally structured stimulation and typically track the frequency/phase of the ongoing stimulus [27]. Although post-stimulus (“OFF”) responses have been reported in some paradigms, a 10 min interval is expected to substantially reduce immediate carryover in this context [28]. During listening, participants were instructed to keep their eyes closed, minimize movement (particularly jaw and facial muscle tension), and maintain a stable seated posture. During washout, participants were allowed to open their eyes and were reminded to remain awake. Throughout the session, the experimenter monitored participants’ vigilance (e.g., head nodding and reduced responsiveness) and provided verbal reminders when drowsiness was suspected. In addition, session video recordings (collected with participant consent for monitoring purposes) were reviewed to corroborate drowsiness-related observations; this served as supplementary verification rather than an objective physiological vigilance measure. No objective vigilance or eye-state channels (e.g., EOG) or standardized behavioral probes were implemented in this pilot. Nevertheless, clear drowsiness was observed during the listening blocks for P01 and P04, and their concurrent EEG traces were deemed unsuitable for reliable estimation of narrowband 40 Hz signatures. These datasets were therefore excluded from the quantitative EEG analyses according to the predefined quality-control criteria.
3.7. Playback Equipment
Stimuli were presented from a laptop computer positioned approximately 3 m away from the EEG system to reduce potential electromagnetic interference. Audio was delivered via wired in-ear earphones (XBA-A2; Sony Corporation, Tokyo, Japan). Individual ear-canal sound pressure level (SPL) was not measured in this pilot study. However, the rendered stimulus files used a fixed file-level embedding ratio between the carrier soundscape and the 40 Hz component, controlled by a predefined mixing gain (A) applied consistently during stimulus rendering (Section 3.5.2; Table 2). During data collection, the same playback device, earphones, and player/software settings were used for all participants: Windows 11 system volume was fixed at 90/100 and the media player volume at 60/100 (both on the software scale), and these settings were kept unchanged across participants and across both conditions. We did not perform individual loudness normalization or coupler/ear-canal SPL calibration; thus, between-participant differences in delivered ear-canal SPL may remain due to insertion depth, ear-canal acoustics, and sealing. Given the within-participant crossover design and the contrast-control pairing of OFF vs. ON within the same soundscape set, unmeasured between-participant SPL variability is unlikely to explain ON–OFF directionality, but it may contribute to inter-subject variability in response magnitude.
3.8. EEG Acquisition
EEG was recorded using a multichannel system (BIOS-S series, BioBrain Inc., Daejeon, Republic of Korea). Twenty-one electrodes were placed according to the international 10–20 system (Fp1, Fpz, Fp2, F7, F3, Fz, F4, F8, T3, C3, Cz, C4, T4, T5, P3, Pz, P4, T6, O1, Oz, and O2). Signals were sampled at 250 Hz, impedances were kept below 10 kΩ, and the system’s default reference/ground configuration was used.
Although O1 was recorded in the 10–20 montage, persistent noise/unstable contact was repeatedly observed at this channel; therefore, O1 was excluded from quantitative analyses. Accordingly, all EEG metrics (power, SNR, and PLV) were computed using the remaining 20 electrodes.
3.9. EEG Preprocessing
All EEG signals were processed in MATLAB (R2024b). Channel-wise mean removal was first applied to correct DC offsets. To attenuate 60 Hz power-line interference, a zero-phase two-pass IIR notch filter was applied. For spectral analyses (40 Hz power and SNR), a 4th-order two-pass IIR band-pass filter (0.5–50 Hz) was used. For PLV analyses, a 4th-order two-pass IIR band-pass filter (35–45 Hz) was applied, and the instantaneous phase was computed using the Hilbert transform.
Automated component-level cleaning (e.g., ICA) and formal epoch-rejection pipelines were not applied in this pilot because the primary goal was feasibility-oriented detection of stimulus-locked, narrowband signatures under a constrained sample size and a controlled listening procedure. We prioritized procedural control and conservative quality assurance: participants were instructed to keep their eyes closed and minimize movement, vigilance was monitored during the session, and participants with clear noncompliance (repeated drowsiness/sleep; P01, P04) were excluded (Section 4.1). In addition, raw EEG traces were visually checked for gross recording failures (e.g., persistent saturation or disconnection), and an electrode with persistent instability (O1) was excluded from quantitative analyses (Section 3.8). Under the within-participant contrast-control design (ON vs. OFF) with identical procedures and preprocessing across conditions, broadband artifacts such as motion/EMG would likely affect wider frequency ranges and both conditions similarly; therefore, the primary interpretation is restricted to candidate stimulus-locked, frequency-specific differences around 40 Hz rather than to generalized changes in signal quality. This consideration is particularly important for phase-based connectivity (e.g., PLV), which can be more sensitive to residual EMG, common-source effects, and referencing choices than narrowband power alone. Nevertheless, we acknowledge that ICA- or epoch-based cleaning could further improve robustness and is recommended for future confirmatory studies with larger samples and preregistered preprocessing plans.
3.10. Outcome Measures
3.10.1. 40 Hz Power Quantification
The 40 Hz response was quantified from the power spectral density (PSD) as narrowband 40 Hz power, defined as the PSD value at the frequency bin nearest to 40 Hz (in μV²/Hz). Channel-level outcomes were computed for all valid electrodes that passed quality control (O1 excluded as described in Section 3.8), and region-level summaries (frontal, temporal, central, and parietal–occipital) were calculated for descriptive comparison.
This narrowband metric was intended to reflect frequency-specific responses centered at 40 Hz rather than broadband power differences.
3.10.2. Frequency-Domain SNR
We quantified frequency-domain signal-to-noise ratio (SNR) around 40 Hz from the power spectral density (PSD) by comparing the PSD at the target frequency bin to the mean PSD of neighboring bins. Specifically, for each channel, SNR at frequency $f_0$ was computed as:

$$\mathrm{SNR}(f_0) = \frac{P(f_0)}{\bar{P}_{\mathrm{nb}}(f_0) + \varepsilon},$$

where $P(f_0)$ denotes the PSD value at the FFT bin closest to $f_0$, and $\bar{P}_{\mathrm{nb}}(f_0)$ denotes the mean PSD of the $\pm 5$ neighboring bins around the center bin (i.e., 10 bins total), excluding the center bin itself. A small constant $\varepsilon$ was added to the denominator to avoid numerical instability when the neighborhood power approached zero. For the main analysis, SNR was evaluated at $f_0 = 40$ Hz.
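The neighbor-bin SNR can be sketched as follows. This is an illustrative Python example, not the MATLAB code used for the analysis: the function name `snr_at`, the stabilizing constant `eps = 1e-12`, and the linear (non-dB) ratio are assumptions, and the toy spectrum is synthetic.

```python
def snr_at(psd, freqs, f0=40.0, n_side=5, eps=1e-12):
    """Ratio of the PSD at the bin nearest f0 to the mean of its neighbours.

    psd, freqs : equal-length lists (PSD values and bin frequencies in Hz)
    n_side     : neighbouring bins taken on each side (5 + 5 = 10 in total)
    eps        : small stabilising constant (value assumed for illustration)
    """
    center = min(range(len(freqs)), key=lambda k: abs(freqs[k] - f0))
    lo, hi = center - n_side, center + n_side
    if lo < 0 or hi >= len(psd):
        raise ValueError("not enough bins around the target frequency")
    # Neighbourhood excludes the centre bin itself.
    neighbours = psd[lo:center] + psd[center + 1:hi + 1]
    return psd[center] / (sum(neighbours) / len(neighbours) + eps)


# Toy spectrum with 0.5 Hz resolution: flat floor of 1.0 and a peak of 6.0 at 40 Hz.
freqs = [0.5 * k for k in range(101)]          # 0 to 50 Hz
psd = [1.0] * len(freqs)
psd[80] = 6.0                                  # bin 80 corresponds to 40.0 Hz

print(round(snr_at(psd, freqs), 6))            # 6.0
print(round(snr_at(psd, freqs, f0=30.0), 6))   # 1.0
```

A value near 1 indicates that the target bin does not stand out from its local noise floor, whereas values well above 1 indicate a narrowband peak.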
3.10.3. Topographic Mapping
Condition-wise, channel-averaged 40 Hz power was visualized as scalp topographies using standard 10–20 coordinates with interpolation. Identical color scales were applied across conditions.
3.10.4. Phase-Locking Value (PLV)
Phase-based connectivity was quantified using the phase-locking value (PLV) in a narrow gamma band centered on the stimulation frequency. For each electrode pair, EEG signals were bandpass-filtered to 35–45 Hz (4th-order zero-phase bidirectional IIR), and the instantaneous phase was obtained via the Hilbert transform. PLV was computed for all unique channel pairs among the analyzed electrodes as:

$$\mathrm{PLV} = \left| \frac{1}{T} \sum_{t=1}^{T} e^{\,i\,\Delta\phi(t)} \right|,$$

where $T$ denotes the number of time samples and $\Delta\phi(t)$ is the instantaneous phase difference between the two channels at time $t$. The 35–45 Hz band was selected to center on 40 Hz while allowing a margin for inter-individual variability in how narrowband phase synchronization expresses around the target frequency. PLV was computed over the concatenated playback samples within each condition (seven 50 s playback segments per condition; total 350 s), which provides a longer effective estimation window to reduce instability associated with short segments. Because the O1 electrode was excluded due to persistent noise/artifacts, PLV analyses were conducted on the remaining 20 electrodes, yielding 190 unique pairs (20 × 19/2).
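The PLV definition can be illustrated with a minimal sketch that starts from instantaneous phase series; the band-pass filtering and Hilbert-transform steps are assumed to have been done upstream and are not reproduced. The function name `plv` and the synthetic phase ramps are illustrative assumptions.

```python
import cmath
import math


def plv(phase_a, phase_b):
    """Phase-locking value between two equal-length phase series (radians)."""
    if len(phase_a) != len(phase_b) or not phase_a:
        raise ValueError("phase series must be non-empty and of equal length")
    total = sum(cmath.exp(1j * (a - b)) for a, b in zip(phase_a, phase_b))
    return abs(total) / len(phase_a)


fs = 250                       # EEG sampling rate (Hz)
n = 2 * fs                     # two seconds of samples
ref = [2.0 * math.pi * 40.0 * t / fs for t in range(n)]        # 40 Hz phase ramp
locked = [p + 0.7 for p in ref]                                # constant lag
drifting = [2.0 * math.pi * 41.0 * t / fs for t in range(n)]   # 1 Hz frequency offset

print(round(plv(ref, locked), 6))     # 1.0 (constant phase difference)
print(round(plv(ref, drifting), 6))   # 0.0 (phase difference sweeps full cycles)

n_channels = 20
print(n_channels * (n_channels - 1) // 2)   # 190 unique pairs
```

Note that a constant non-zero lag still yields a PLV of 1: the metric reflects the consistency of the phase difference, not its size.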
3.11. Statistical Analysis and Multiple Testing Considerations
This study was designed as an exploratory pilot investigation to assess whether a nature-based soundscape can serve as a low-burden carrier for a frequency-specific 40 Hz component that remains EEG-quantifiable in a realistic listening context. Importantly, the 40 Hz OFF condition (soundscape-only) was treated as a contrast control to isolate the incremental contribution of the additively layered 40 Hz sine component within an otherwise identical listening context (same content set, preprocessing/mastering chain, playback procedure, and within-participant block structure). Accordingly, the primary analytic goal was to examine detectability, directionality, and spatial tendencies of 40 Hz–centered EEG signatures in 40 Hz ON relative to 40 Hz OFF, rather than to make competitive efficacy claims about the soundscape-only condition.
Planned analyses. The primary planned endpoints were 40 Hz–centered narrowband power and frequency-domain SNR around 40 Hz (OFF vs. ON) computed during playback epochs, summarized at both electrode and region levels (including grand average). Supportive analyses included descriptive scalp topographies of 40 Hz power/SNR and a narrow gamma-band (35–45 Hz) PLV analysis, which are explicitly interpreted as exploratory/hypothesis-generating without multiplicity-aware inference.
Post hoc/exploratory analyses. Any additional inspections beyond these endpoints (e.g., additional descriptive checks for interpretation) were conducted post hoc and are labeled as exploratory where reported.
Within-participant condition differences (40 Hz ON vs. 40 Hz OFF) were evaluated using two-sided Wilcoxon signed-rank tests. For channel-wise outcomes (40 Hz power and frequency-domain SNR), tests were conducted across the analyzed electrodes; because the O1 electrode (left occipital) was excluded due to persistent noise, channel-wise inference was based on the remaining 20 electrodes. For phase-based connectivity, PLV was computed in the 35–45 Hz band for all unique channel pairs among the analyzed electrodes (20 electrodes, 190 unique pairs), and condition differences were evaluated per pair using the same two-sided nonparametric framework.
Given the exploratory nature of this pilot study, no multiplicity control (e.g., FDR) was applied across electrodes or channel pairs. Therefore, all reported p-values are uncorrected and are presented to indicate where signals may concentrate as hypothesis-generating evidence, not to support confirmatory inference. Consistent with this rationale, any visual markers or maps derived from nominal p-values (e.g., asterisks, −log10(p) panels, or p < 0.05 masks) should be interpreted as uncorrected reference cues only and not as multiplicity-validated findings. Future confirmatory work should preregister primary endpoints, incorporate multiplicity-aware inference (or cluster-based/permutation approaches for spatial/connectivity structure), and include calibrated playback-level documentation (e.g., coupler-based SPL estimation) to strengthen reproducibility and external validity.
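To make the scale of uncorrected testing concrete, the following sketch implements an exact two-sided Wilcoxon signed-rank test for small paired samples. It is an illustration under stated assumptions rather than the analysis code: the function name `wilcoxon_signed_rank_exact` is hypothetical, the ON/OFF values are made-up placeholders, and the example treats participants (n = 9) as the paired units.

```python
from itertools import product


def wilcoxon_signed_rank_exact(on, off):
    """Exact two-sided Wilcoxon signed-rank test for small paired samples.

    Zero differences are dropped and tied absolute differences share mid-ranks.
    Returns (W_plus, p_value). The enumeration is 2**n, so keep n small.
    """
    diffs = [a - b for a, b in zip(on, off) if a != b]
    n = len(diffs)
    if n == 0:
        raise ValueError("all paired differences are zero")
    order = sorted(range(n), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        while j + 1 < n and abs(diffs[order[j + 1]]) == abs(diffs[order[i]]):
            j += 1
        mid = (i + j) / 2.0 + 1.0          # mid-rank shared by a tie group
        for k in range(i, j + 1):
            ranks[order[k]] = mid
        i = j + 1
    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    centre = sum(ranks) / 2.0              # null expectation of W+
    observed = abs(w_plus - centre)
    extreme = sum(
        1
        for signs in product((0, 1), repeat=n)
        if abs(sum(r for r, s in zip(ranks, signs) if s) - centre) >= observed - 1e-12
    )
    return w_plus, extreme / 2 ** n


# Placeholder values (arbitrary units) in which ON exceeds OFF for all nine pairs.
on = [21, 18, 26, 15, 30, 22, 19, 24, 28]
off = [12, 11, 14, 13, 16, 10, 15, 12, 13]
w_plus, p_all_on = wilcoxon_signed_rank_exact(on, off)

print(w_plus, p_all_on)        # 45.0 0.00390625
print(round(190 * 0.05, 2))    # 9.5
```

With nine pairs, the smallest attainable exact two-sided p-value is 2/2⁹ ≈ 0.0039, and across 190 channel pairs up to roughly 9.5 nominal p < 0.05 results would be expected by chance alone under a global null, which is why the uncorrected maps are treated as reference cues only.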
For planning future confirmatory work, paired-sample effect sizes (e.g., Cohen’s d_z for ON–OFF differences on a prespecified primary endpoint) can be used to estimate the required sample size; we provide a brief planning reference in the Discussion.
5. Discussion and Conclusions
This pilot EEG study evaluated the plausibility of a soundscape-based auditory format as a practical carrier for a frequency-specific 40 Hz component that is EEG-quantifiable in a realistic listening context. The central contribution is not a competitive comparison between “soundscape-only” and “soundscape-plus,” but a contrast-control test that isolates the incremental contribution of an additively layered 40 Hz sine component within an otherwise identical soundscape experience. From this perspective, the observed ON > OFF directionality across multiple EEG readouts suggests a cautious, design-oriented interpretation: a naturalistic soundscape can accommodate a controlled 40 Hz layer in a way that yields detectable 40 Hz–centered EEG signatures, thereby motivating more rigorous confirmatory work. The stimulus-production chain also reflected a pragmatic trade-off between suppressing low-frequency variability that could mask a 40 Hz–centered signal and preserving perceived soundscape naturalness, as described in Section 3.5.2.
Across complementary EEG readouts, the 40 Hz ON soundscape showed consistent ON > OFF directionality in 40 Hz–centered signatures. This convergence matters from an applied-science perspective because it indicates that the embedded component remains detectable at the EEG level while the everyday-compatible listening context is preserved. The appropriate inference therefore concerns the feasibility of EEG quantification and pattern plausibility, not definitive localization, generalizable effect sizes, or functional/clinical benefit. Any apparent scalp localization should be treated as exploratory and may not generalize without multiplicity-aware replication and uncertainty quantification (e.g., bootstrap-based topographies).
The exploratory connectivity (PLV) findings should be interpreted with particular caution. Pairwise connectivity involves many simultaneous tests; without multiplicity-aware inference, connectivity summaries can be over-interpreted. In this study, PLV results are best treated as a hypothesis-generating map of where larger ON–OFF phase-consistency differences may concentrate, motivating preregistered, multiplicity-aware replication and network-level statistical approaches. Moreover, sensor-level PLV can be inflated by volume conduction and other common-source effects and is sensitive to residual noise and referencing choices; therefore, it should not be interpreted as evidence of true inter-regional communication in this pilot. Future work should consider connectivity metrics designed to mitigate zero-lag coupling (e.g., imaginary coherence, PLI/wPLI, and/or source-space connectivity), together with preregistered, multiplicity-aware inference.
Several limitations delimit interpretation and define next steps. As a transparency note, this pilot study was not preregistered; therefore, Section 3.11 explicitly distinguishes prespecified primary endpoints from supportive/exploratory analyses. First, the study was not powered for confirmatory inference; future work should increase sample size, preregister primary endpoints, and adopt multiplicity-aware (or cluster-/permutation-based) inference for spatial and connectivity structure.
To make the pilot nature of this work more actionable for future confirmatory studies, we provide a simple sample-size planning reference based on paired-sample effect sizes (Cohen’s d_z) from an ON–OFF within-participant design. These values are provided solely for rough planning and should not be treated as definitive requirements. For a two-sided α = 0.05 paired t-test (a common approximation for planning even when nonparametric tests are used in analysis), achieving 80% power requires approximately n = 34 for d_z = 0.5 (moderate), n = 24 for d_z = 0.6, n = 19 for d_z = 0.7, and n = 15 for d_z = 0.8 (large); for 90% power, the corresponding values are approximately n = 44, 32, 24, and 19, respectively. Because effect sizes estimated from small pilots can be unstable, future studies should select a single prespecified primary endpoint (e.g., an ROI summary SNR@40 metric), estimate d_z from the pilot cautiously, and power the confirmatory study toward the conservative end of the plausible range.
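The planning values above can be approximated with a short calculation. The sketch below uses a normal approximation with a z²/2 small-sample correction rather than an exact noncentral-t computation, so it is a rough cross-check only; the function name `paired_n` is hypothetical.

```python
import math
from statistics import NormalDist


def paired_n(d_z, power=0.80, alpha=0.05):
    """Approximate n for a two-sided paired t-test.

    Normal approximation plus a z**2 / 2 small-sample correction;
    not an exact noncentral-t calculation.
    """
    z_alpha = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    z_beta = NormalDist().inv_cdf(power)
    return math.ceil(((z_alpha + z_beta) / d_z) ** 2 + z_alpha ** 2 / 2.0)


for power in (0.80, 0.90):
    print(power, [paired_n(d, power) for d in (0.5, 0.6, 0.7, 0.8)])
```

This prints 34, 24, 18, 15 for 80% power and 44, 32, 24, 19 for 90% power. The one difference from the values quoted above (18 versus 19 at d_z = 0.7 and 80% power) reflects the approximation, which can deviate by about one participant from exact calculations.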
The age range was intentionally broad within a ≥40 years eligibility window to support recruitment feasibility; however, age-related heterogeneity could have contributed to between-participant variability in EEG responsiveness and listening-related factors. With n = 9, we were not positioned to conduct age-stratified analyses or robust covariate modeling. Future confirmatory studies should either narrow the age band or prespecify age-adjusted/stratified analyses to quantify and control potential age effects.
Generalizability is also restricted by the participant profile: this pilot enrolled middle-aged to older adults with normal hearing, and the findings may not extend to younger listeners, individuals with hearing loss, or clinical populations with cognitive impairment. Auditory aging and hearing status can alter effective stimulus delivery and neural responsiveness (e.g., audibility, salience, and cortical synchrony), and cognitive status may further modulate state dependence and variability in gamma-band signatures. Future confirmatory studies should explicitly include broader demographic and clinical strata (including hearing-screened younger adults and target populations such as MCI) and examine whether calibration and embedding parameters require adaptation across groups.
An additional limitation is the absence of objective vigilance/eye-state monitoring. Although drowsiness-related exclusions were based on direct behavioral observation and corroborated by post-session video review, this does not replace objective physiological vigilance measures, and residual state fluctuations may have affected EEG responsiveness. Future confirmatory studies should incorporate standardized vigilance control (e.g., EOG-based eye-state verification and/or brief behavioral probes) and prespecify exclusion criteria to minimize subjective decisions.
Second, we did not implement ICA- or epoch-based artifact-rejection pipelines in this pilot. While we emphasized procedural control and conservative quality checks (eyes-closed, minimal-movement instructions; vigilance monitoring; exclusion of participants with repeated drowsiness/sleep; visual screening for gross recording failures; and exclusion of an unstable channel), residual physiological and environmental noise may remain and could influence effect estimates. Future confirmatory studies should prespecify and apply standardized artifact-cleaning procedures (including ICA and/or epoch rejection) and report their impact on the primary 40 Hz centered endpoints.
Third, individual ear-canal SPL was not calibrated in this pilot. Because ASSR-like steady-state responses can depend on stimulus intensity, the absence of calibration may have introduced uncontrolled between-participant heterogeneity in the effective “dose” of the 40 Hz component, which could inflate inter-subject variability in SNR and attenuate apparent effect sizes. In practice, even when the file-level embedding ratio is fixed (via a predefined A), the delivered ear-canal SPL can vary with insertion depth, ear-canal acoustics, sealing/attenuation, and individual loudness preference. Importantly, the within-participant contrast-control design reduces the likelihood that such variability alone explains ON–OFF directionality; however, calibrated SPL documentation (e.g., coupler-based estimation and explicit gain-setting records) will be essential for reproducibility, cross-study comparability, and interpretation of individual differences in response magnitude.
Fourth, the present outcomes quantify EEG-level signatures and do not establish cognitive or clinical benefit; integrating behavioral proxies and longer-term feasibility protocols will be essential to connect EEG detectability to functional relevance. In addition, we did not collect concurrent subjective measures such as listening comfort, perceived naturalness, or perceptual detectability of the embedded 40 Hz layer. Future confirmatory studies should incorporate standardized self-reports and/or brief in-session checks alongside EEG to evaluate long-term real-world feasibility and to relate subjective acceptability to EEG detectability.
Finally, future studies should systematically parameterize the embedding strategy (e.g., relative 40 Hz level, personalization, audibility/comfort constraints) to derive robust design rules that maximize EEG detectability while maintaining a natural listening experience.
In conclusion, using a soundscape-only contrast control to isolate the incremental contribution of a 40 Hz layer, this pilot EEG study provides hypothesis-generating evidence that a soundscape can serve as a feasible carrier for an embedded 40 Hz component that remains EEG-quantifiable; it does not evaluate therapeutic efficacy.