1. Introduction
Over the centuries, the Church has served as a vital institution in civilization. It is a place strongly associated with religious beliefs, where believers seek spiritual release and tranquility. In addition to serving as a means to enhance people’s spiritual well-being, these spaces also provide opportunities for verbal address, the subject of which should be clearly communicated to the audience. Furthermore, churches often hold a variety of art and music events. These include recitals of liturgical music, chamber concerts, symphonic performances, and celebrations of popular or seasonal music.
The concept of “Acoustic Heritage” encompasses the unique sonic environments that emerge from the architectural, liturgical, and communal uses of sacred spaces. It recognizes that the auditory experience is central to spiritual, artistic, and cultural expression, shaping collective memory and identity. Churches located in historic districts are an invaluable cultural heritage. They integrate elements of architecture and urban design. However, the acoustic quality parameters of churches are often not considered during this architectural process.
Restoration efforts have traditionally focused on the rehabilitation of elements such as altarpieces, altars, floors, coffered ceilings, and their decorative or pictorial elements. Although the use of sound reinforcement systems may enhance speech intelligibility, the intrinsic acoustic response of these spaces—shaped by their geometry, materials, and volume—is often overlooked, despite its greater significance for the acoustic assessment of worship venues. Church acoustics represent a form of heritage that should be preserved with the same emphasis as architecture and artistic elements.
The study of soundscapes in religious heritage buildings has garnered significant attention over the past decade, driven by the interdisciplinary convergence of acoustics, architecture, and cultural heritage preservation [1,2]. Religious spaces, such as churches, mosques, and temples, serve as places of worship and repositories of intangible cultural heritage, where sound significantly influences spiritual and communal experiences [3,4].
The acoustic characteristics of places of worship can be evaluated using a variety of methods, ranging from the numerical quantification of sound energy-related parameters to binaural recording techniques and ambisonic impulse response measurements for different combinations of sound sources and receivers. The complex architecture of churches often forces evaluators to use diverse and numerous source-receiver positions, resulting in non-comparable results. Therefore, a comprehensive understanding of the spatial variation in acoustic parameters and the selection of the most suitable technological methods are essential for an accurate acoustic assessment [
5]. A literature review [
6] summarized decades of research on the acoustics of Western Christian churches, mainly focusing on heritage sites. This review highlights the work of research groups in several European countries, providing details of experimental procedures, results, and interpretations of how sound propagates, as well as subjective auditory experiences and computer-assisted simulation techniques. While most acoustic studies have been conducted in European churches, notable research has also been carried out in the Americas.
In this light, church music represents a form of intangible heritage that should be preserved just as much as architectural and performance aspects. The acoustic characteristics of worship spaces are therefore not only contextually significant for the intelligibility of speech and the appreciation of music during religious services but also acquire a semiotic value specific to each geographic and cultural setting. Measurements of monaural and binaural impulse responses in distinct areas and for various liturgical functions within several cathedrals have confirmed that the spatial distribution of reflected sound energy, reverberation time, and acoustic strength contribute meaningfully to the semiological dimension of the experience, even if these physical descriptors are not explicitly recognized by the performers themselves [
7]. Another study examined the acoustic properties of the Toledo Cathedral in Spain, showing it to be a highly reverberant space with poor speech intelligibility. However, an analysis of its liturgical uses suggested that the cathedral is not a single acoustic space, as evidenced by the differences between the chapels and the choir [
8].
One study measured the acoustic energy relationships in Gothic-Mudejar churches in southern Spain and compared the results of these measurements with theoretical models [
9]. The authors proposed an analytical model based on the measured musical clarity (C80) values. An alternative approach was presented in a study of eight Roman Catholic churches in Poland, where a comprehensive acoustic quality indicator was created by comparing several variables. This global indicator offers a user-friendly tool for both scientific research and practical decision-making [
10].
As technology has progressed over the years, so have the methods for evaluating acoustics. The initial characterizations were concerned with reverberation time (RT), early decay time (EDT), and speech intelligibility using the Speech Transmission Index (STI). Subsequent additions included further energy-related parameters, such as strength (G), lateral energy measures (LF and LG), and speech and music clarity (C50 and C80). More recently, the interaural cross-correlation coefficient (IACC) has been used to quantify how similar the sound arriving at the listener’s two ears is, which calls for binaural or ambisonic recording equipment, as specified in the standards [
11,
12].
In addition to Christian churches, soundscapes for Buddhist temples have been evaluated, revealing sounds that disturb music listening and preferred sound sources for meditating. A study found that sound sharpness is most strongly correlated with visitors’ sound preferences [
13]. Recently, Ma et al. [
14] proposed a subjective scale for evaluating the sound quality in buildings. On this scale, the three principal latent factors in the human perceptual dimensions of sound are overall sound evaluation (E: Evaluation), energy content (P: Potency), and temporal-spectral content (A: Activity). This EPA approach (Evaluation, Potency, Activity) considers the interaction between the on-site listener and the measurement, thereby providing a valid and reliable method for quantitative analysis [
14].
Despite the growing body of research, a critical gap remains in the application of advanced statistical methods, such as structural equation modeling (SEM), to analyze the relationships between acoustic and perceptual variables in religious heritage sites. Originally developed in fields such as psychology and marketing [
15], SEM enables the modeling of complex interactions between observed variables and latent perceptual constructs. Recently, interdisciplinary research has demonstrated its potential to bridge objective acoustic measurements and subjective experiences in architectural and urban soundscapes [14,16,17,18]. However, its application to sacred architectural heritage, particularly in Latin America, remains largely unexamined.
Considering this background, the present study applies SEM to analyze both objective and subjective acoustic assessments in three colonial churches in Quito, which are relevant components of the city’s historical heritage. By integrating perceptual models with sociodemographic moderators such as age, gender, and residential location, this study provides new insights into the human-centered evaluation of acoustic heritage. Combining objective measurements with subjective assessments offers a comprehensive framework for understanding how individuals experience these historically and culturally significant soundscapes, while also informing future conservation and adaptive reuse strategies.
2. Evaluated Churches
In this study, three significant heritage churches in Quito were examined: (1) the Company of Jesus, (2) the Metropolitan Cathedral, and (3) the Sagrario.
The Church of the Company of Jesus is one of the most representative examples of Baroque architecture in Quito and in Latin America. Built between 1605 and 1765, the church is an extraordinary blend of Baroque, Moorish, and Neoclassical elements [
19]. Located in Quito’s city center, this religious building is best known for its ornate interior, which features gilded wood carvings, intricate altarpieces, and gold-leaf decorations. The Jesuits constructed the church as a multifaceted center for religion, education, and cultural activities. This reflects the Jesuits’ educational and spiritual values, as well as the widespread artistic and theological interests of the Counter-Reformation era. The aesthetics of the Quito School of Art, a Spanish- and indigenous-influenced regional style, can still be observed in the form of many church frescoes, sculptures, and ornamental details [
20]. Conservation and restoration efforts carried out over the last few decades have enabled the preservation of its historical and artistic heritage, making it a reference point for researchers and experts in heritage protection [
21].
The Metropolitan Cathedral is another important religious site in Quito, located in the city’s main square in the colonial quarter. Constructed from 1562 to 1806, the cathedral reflects the evolution of architectural styles from the colonial period to the early republican era. It features a blend of Gothic rib vaults, Baroque ornamentation, Mudejar ceilings, and neoclassical facades. This diversity of styles demonstrates how the European architectural model was gradually adapted to local building practices. Since the Spanish conquest, the cathedral has remained an integral part of the city’s religious, cultural, and political life [
22]. This richness of artistry also extends to the interior, which features masterful woodwork and stone carvings, as well as several Quito School artworks. The Cathedral also contains the tombs of several significant historical figures. Following several earthquakes in the area, the building itself has undergone several careful renovations, and its construction has been reinforced, preserving its spiritual and architectural essence.
Another well-known colonial religious structure is the Sagrario church, also located in Quito’s Old Town. Initially envisioned as a subsidiary chapel of the Metropolitan Cathedral, the Sagrario developed its own architectural identity and artistic importance as a separate parish church. It was built between the 16th and 18th centuries, with Renaissance construction and Baroque decoration, and its composition is coherent, with a harmonious yet dynamic architecture. The façade features classical-style columns and intricate iconographic details carved from polished volcanic rock. Inside, the visitor finds a rich ensemble of gilded altarpieces, mural paintings, and a soaring dome that filters the sunlight. The Sagrario is a key element of Ecuador’s national heritage, providing examples of the art and cultural interactions that characterized Andean colonial architecture [
23].
Table 1 shows the general attributes of the studied churches, Figure 1 presents a schematic diagram of the churches, and Figure 2 shows general views of the sites evaluated.
3. Methodology
The methodology includes three key components: acoustic evaluation through in-situ measurements, subjective assessment via opinion surveys, and the development of a structural equation model based on the EPA approach to analyze the relationships between perceptual and acoustic variables.
3.1. Acoustic Assessment
Acoustic evaluation followed a rigorous methodology to analyze the sound characteristics of the selected religious sites. To ensure controlled conditions, acoustic measurements were conducted during technical visits when the space was unoccupied. An omnidirectional sound source (CESVA BP012) was used, driven by a dedicated amplifier and connected to an M-Audio MTrack II interface, as shown in Figure 3. Calibration was performed in situ before each measurement using software (EASERA v1.2) to ensure consistent excitation levels. The source was positioned at the altar, which is the central location for religious ceremonies.
Multiple receiver points (15–22 per site) were distributed along the nave, following the ISO 3382 recommendations [
12]. Symmetrical pairings were explicitly avoided to mitigate potential bias from architectural asymmetry. Each point was measured using a microphone (Beyerdynamic MM1) and a binaural dummy head (Neumann KU100). The microphone was primarily used to evaluate temporal acoustic parameters, including RT, EDT, C50, C80, Definition, sound pressure level within the first 80 ms after the initial arrival of the signal (L80), and STI. At the same time, the binaural dummy head captured spatial attributes, including the IACC in its full, early, and late modes.
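As a rough illustration of how the monaural energy parameters listed above follow from a measured impulse response, the Python sketch below computes C50, C80, and Definition (D50) as early-to-late and early-to-total energy ratios. It is not the EASERA implementation: the function name, the onset detection (strongest peak taken as the direct sound), and the synthetic exponential decay with an assumed 2.4 s decay time are illustrative assumptions only.

```python
import numpy as np

def energy_ratios(h, fs):
    """Return (C50 in dB, C80 in dB, D50 as a fraction) for impulse response h at fs Hz."""
    h = np.asarray(h, dtype=float)
    onset = int(np.argmax(np.abs(h)))   # assumption: direct sound is the strongest peak
    energy = h[onset:] ** 2             # squared pressure from the direct sound onward
    n50 = int(round(0.050 * fs))        # samples in the first 50 ms
    n80 = int(round(0.080 * fs))        # samples in the first 80 ms
    total = energy.sum()
    early50 = energy[:n50].sum()
    early80 = energy[:n80].sum()
    c50 = 10.0 * np.log10(early50 / (total - early50))
    c80 = 10.0 * np.log10(early80 / (total - early80))
    d50 = early50 / total
    return c50, c80, d50

# Synthetic check (hypothetical data): an ideal exponential decay whose level
# drops by 60 dB after 2.4 s, sampled at 44.1 kHz for 5 s.
fs = 44100
decay_time = 2.4
t = np.arange(int(5 * fs)) / fs
h = np.exp(-3.0 * np.log(10.0) * t / decay_time)
c50, c80, d50 = energy_ratios(h, fs)
print(f"C50 = {c50:.2f} dB, C80 = {c80:.2f} dB, D50 = {d50:.2f}")
```

For an ideal exponential decay these ratios depend only on the decay time, so a reverberation time above 2 s necessarily gives negative clarity values and a low Definition; real impulse responses deviate from this because of discrete early reflections.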
Acoustic excitation was performed using an exponential sine sweep generated by software (EASERA v1.2) at a sampling rate of 44.1 kHz. Measurements captured broadband impulse responses across the full audible frequency range, with particular emphasis on mid-to-high frequencies due to their relevance for speech intelligibility, musical clarity, and spatial impression in worship contexts. Each receiver position was measured five times to ensure the reliability and accuracy of the data. The software automatically processed the impulse responses and calculated the acoustic parameters. Statistical post-processing involved averaging five measurements per point and aggregating data across three predefined zones—near, mid, and far (relative to the altar)—to derive the representative values for each area and the entire venue.
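The excitation signal described above can be sketched in a few lines of Python. This is a generic exponential sine sweep, not the signal generated by EASERA; the start and stop frequencies (20 Hz and 20 kHz) and the 10 s duration are assumptions, since only the 44.1 kHz sampling rate is stated in the text.

```python
import numpy as np

def instantaneous_frequency(t, f1, f2, duration):
    """Frequency of the sweep at time t: rises exponentially from f1 to f2."""
    return f1 * np.exp(t * np.log(f2 / f1) / duration)

def exponential_sweep(f1, f2, duration, fs):
    """Exponential sine sweep from f1 to f2 Hz lasting `duration` seconds."""
    t = np.arange(int(duration * fs)) / fs
    rate = np.log(f2 / f1)
    # Phase is the time integral of 2*pi*instantaneous_frequency(t).
    phase = 2.0 * np.pi * f1 * duration / rate * (np.exp(t * rate / duration) - 1.0)
    return np.sin(phase)

fs = 44100                      # sampling rate reported in the text
f1, f2, duration = 20.0, 20000.0, 10.0   # assumed sweep limits and length
sweep = exponential_sweep(f1, f2, duration, fs)
print(sweep.size, instantaneous_frequency(duration, f1, f2, duration))
```

Because the sweep spends equal time in each octave, low frequencies receive more energy than with a linear sweep, which is one reason this excitation is common in room impulse response measurements.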
Acoustic measurements were conducted under the normal conditions of each space, without modifications. To minimize external noise interference and environmental fluctuations, several measures were implemented, including in situ calibration, closure of the venue to the public during measurement sessions, and scheduling the data collection during periods of minimal or no external activity. These procedures provided a comprehensive and reliable assessment of the acoustic environment within the analyzed space.
3.2. Subjective Assessment
To explore the perceptual and affective responses to the acoustic environment of the heritage churches under study, a subjective assessment was conducted through on-site surveys. Participants completed questionnaires immediately after visiting the churches, which included a range of contextual scenarios. Specifically, participants were surveyed following religious services, guided or independent tourist visits, and informal religious or touristic visits (e.g., private prayers or casual entry). This approach ensured the inclusion of a broad spectrum of acoustic perceptions shaped by different functional and experiential conditions within sacred spaces.
3.2.1. Participants’ Sample Size
Applying the guideline of five subjects per parameter resulted in a minimum sample size of 115 participants per site for this study. To enhance the consistency of the perceptual data and meet the statistical requirements of the analysis methods used, we distributed a greater number of surveys: at least 400 for each religious site [
24]. This decision followed methodological recommendations indicating that larger sample sizes enhance the stability of factor analysis results and support the practical application of the semantic differential technique.
In contrast, the EPA approach specified in this study consisted of three latent variables, each measured through two observed indicators, yielding a total of six observed variables [
14]. Following the best practices in SEM, the minimum sample size was estimated based on two commonly accepted criteria: at least ten respondents per observed variable and at least five respondents per estimated parameter.
Thus, a total of 1214 visitors participated in the survey, distributed across the sites as follows: the Cathedral (n = 409), the Company of Jesus (n = 403), and the Sagrario (n = 402). The participants included individuals visiting for both religious and touristic purposes, representing a balanced distribution of male and female respondents. Age groups ranged from 15–25 years to over 65 years, allowing for diverse understanding of sound perception across different life stages. Data were collected through face-to-face surveys conducted using digital forms between March and November 2024, ensuring comprehensive capture of visitor experiences under varying temporal and environmental conditions. Considering that the model includes six factor loadings, six error variances, and three latent covariances, a total of 23 parameters were estimated (see
Table 2, Parts II and III). Accordingly, the minimum recommended sample size was calculated to be 60 based on the parameter-based criterion and 60 based on the variable-based criterion. The final sample size far exceeded the minimum requirements, thereby ensuring a robust model estimation and high statistical power.
3.2.2. Questionnaire Design
The survey was structured into three main sections to capture the participants’ profiles and experiences with sound perception, as presented in
Table 2. The first section gathered personal and sociodemographic information, including age, gender, purpose of visit (touristic or religious), and frequency of visits to the religious sites. The second section focused on general perceptions of the acoustic environment, using a 5-point Likert scale where 1 represented “Strongly Disagree” and 5 “Strongly Agree.” The third section explored perceptions associated with the latent variables defined in the EPA approach. This section contained 18 semantic differential indicators, all rated on a 5-point scale. For example, for the indicator “Pleasantness,” participants rated their experience from 1 (“Not pleasant at all”) to 5 (“Very pleasant”). This structured approach enabled a detailed assessment of how visitors perceive the sound environment within religious spaces, providing insights into both subjective evaluations and broader perceptual patterns.
Table 3 presents the complete set of input variables used to construct and estimate the structural equation model designed to explore the relationships between architectural and acoustic indicators, perceptual sound descriptors, and sociodemographic factors. It contains three main thematic blocks: (1) demographic and socioeconomic aspects, (2) architectural/acoustic perception ratings (General Opinions), and (3) perceptual responses subdivided according to the circumplex structure of soundscape dimensions—Evaluation, Potency, and Activity. Each variable was categorized by its thematic module and included the corresponding labels and coding scales.
3.3. Structural Equation Modeling
To investigate the relationships between objective acoustic behavior and subjective sound quality perception in churches, the current study employs SEM with the three latent variables of the EPA approach [14]: Evaluation characterizes the global affective judgment of the acoustic scene (e.g., Pleasant-Unpleasant, Relaxed-Tense), Potency indicates the perceived intensity and dominance of the sound in the environment (e.g., Weak-Strong, Quiet-Loud), and Activity represents the temporal-spectral content (e.g., Flat-Sharp, Low-High).
An exploratory factor analysis (EFA) was performed to extract the primary latent constructs from the set of measured indicators without imposing a preconceived structure. Next, confirmatory factor analyses (CFAs) were carried out to assess the discriminant and factorial validity of the constructs, supported by the composite reliability (CR) and average variance extracted (AVE). Discriminant validity was established because the maximum shared variance (MSV) for each factor was lower than its AVE. SEM was then applied to explore the hypothesized relationships between latent perceptual constructs and objective acoustic measures while accounting for sociodemographic moderators. The model estimation followed a multi-step validation process including EFA, CFA, and full SEM path modeling using the lavaan package [25] in RStudio (v2025.05). SEM enables the modeling of both direct and indirect (mediated) effects, offering a scientifically robust framework beyond simple correlational analyses.
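The validity checks named above can be made concrete with the usual formulas for standardized loadings: AVE is the mean squared loading, CR is the squared sum of loadings divided by that quantity plus the summed error variances, and MSV is the largest squared correlation between a factor and any other factor. The Python sketch below is illustrative only; the study itself used lavaan in R, and the loadings and factor correlations shown here are hypothetical values, not results from the paper.

```python
import numpy as np

def average_variance_extracted(loadings):
    """AVE: mean of the squared standardized loadings of one factor."""
    lam = np.asarray(loadings, dtype=float)
    return float(np.mean(lam ** 2))

def composite_reliability(loadings):
    """CR, assuming standardized loadings and uncorrelated errors."""
    lam = np.asarray(loadings, dtype=float)
    squared_sum = lam.sum() ** 2
    error_variance = (1.0 - lam ** 2).sum()
    return float(squared_sum / (squared_sum + error_variance))

def maximum_shared_variance(factor_corr, index):
    """MSV: largest squared correlation between factor `index` and the others."""
    row = np.delete(np.asarray(factor_corr, dtype=float)[index], index)
    return float(np.max(row ** 2))

# Hypothetical standardized loadings (two indicators per factor) and
# hypothetical latent correlations, for illustration only.
loadings = {"Activity": [0.90, 0.80], "Potency": [0.85, 0.75], "Evaluation": [0.88, 0.82]}
factor_corr = [[1.00, 0.60, -0.50],
               [0.60, 1.00, -0.55],
               [-0.50, -0.55, 1.00]]

for i, (name, lam) in enumerate(loadings.items()):
    ave = average_variance_extracted(lam)
    cr = composite_reliability(lam)
    msv = maximum_shared_variance(factor_corr, i)
    print(f"{name}: AVE={ave:.3f} CR={cr:.3f} MSV={msv:.3f} discriminant_ok={msv < ave}")
```

With these conventions, convergent validity is usually read from AVE above 0.50 and reliability from CR above 0.70, while the discriminant check is the comparison of MSV against AVE.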
Several studies have applied latent variable analysis within SEM frameworks to assess auditory perception in urban soundscapes [
16,
17,
18,
26] and building environments [
14]. However, this methodology remains underexplored, particularly in the context of heritage churches. Recent contributions have sought to integrate objective acoustic measurements with subjective assessments of auditory perception [
27,
28] and user-reported perceptions of churches as tourist destinations [
29]. Based on the EPA approach proposed for assessing sound quality in buildings, the present study tests a conceptual SEM based on the three hypotheses illustrated in
Figure 4, specifically adapted to the context of heritage churches:
HA: The perceived Activity inside churches is positively correlated with the Potency of sounds.
HB: The perceived Potency of sounds influences sound evaluation.
HC1: Participants’ sociodemographic characteristics influence the Activity, Potency, and Evaluation of sounds in churches.
HC2: Participants’ geographical residential location influences the Activity, Potency, and Evaluation of sounds in churches.
Figure 4.
Proposed theoretical model.
4. Results
Table 4 summarizes the acoustic analysis of the three heritage churches, where different reverberation and clarity behaviors were observed. RT varied between 2.1 s (the Company of Jesus) and 2.6 s (the Sagrario), and the Company of Jesus also showed the lowest EDT (2.4 s), which might denote slightly better sound decay control. The clarity indices were negative at all sites, with the Company of Jesus exhibiting the highest music clarity (C80 = −2.0), indicating a better musical acoustic environment. The definition values were consistent at 0.3, and the L80 levels were highest in the Company of Jesus (100.0 dB), suggesting a stronger early energy contribution. The STI and RASTI values were generally moderate (STI ≈ 0.4), as were the %AlCons values (17.8–20.2%), suggesting moderate levels of speech intelligibility. IACC values were relatively stable for the early and late reflections; the full-range IACC was lowest in the Company of Jesus (0.4), indicating a somewhat stronger spatial impression.
Figure 5 shows the RT values across 1/1 octave bands, capturing the frequency-dependent decay characteristics of the sound field. Complementarily,
Figure 6 shows the IACCs, also in 1/1 octave bands, which provide insights into the spatial immersion of the acoustic environment from a binaural perspective.
The Metropolitan Cathedral exhibited the highest RT across most frequency bands, peaking at 3.2 s at 500 Hz, which suggests a particularly reverberant space, well suited to liturgical music but detrimental to the intelligibility of speech. The Sagrario presented an intermediate acoustic profile, with reverberation times comparable to those of the Cathedral at mid-to-high frequencies but slightly lower at low frequencies. The Company of Jesus presented the lowest reverberation times overall, especially at low frequencies (1.9 s at 250 Hz), indicating a more damped acoustic environment that may favor the clarity of spoken words. These differences likely arise from variations in the architectural and material properties of the three buildings.
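Reverberation times such as those discussed above are commonly estimated from an impulse response by Schroeder backward integration followed by a linear fit to the decay curve. The sketch below shows that general technique in Python for a broadband response; it is not the procedure implemented in EASERA, the −5 dB to −35 dB fitting range (a T30-style evaluation) is an assumption, and the test signal is a synthetic decay rather than measured data. Per-band values would additionally require octave-band filtering, which is omitted here.

```python
import numpy as np

def reverberation_time(h, fs, start_db=-5.0, stop_db=-35.0):
    """Estimate RT (s) by fitting the Schroeder decay curve between two levels."""
    energy = np.asarray(h, dtype=float) ** 2
    edc = np.cumsum(energy[::-1])[::-1]          # backward-integrated energy
    edc_db = 10.0 * np.log10(edc / edc[0])       # decay curve, 0 dB at t = 0
    t = np.arange(edc.size) / fs
    mask = (edc_db <= start_db) & (edc_db >= stop_db)
    slope, _ = np.polyfit(t[mask], edc_db[mask], 1)   # slope in dB per second
    return -60.0 / slope                          # time to decay by 60 dB

# Synthetic check (hypothetical data): ideal decay reaching -60 dB at 2.4 s.
fs = 8000
true_rt = 2.4
t = np.arange(int(5 * fs)) / fs
h = np.exp(-3.0 * np.log(10.0) * t / true_rt)
estimated_rt = reverberation_time(h, fs)
print(f"estimated RT = {estimated_rt:.2f} s")
```

Fitting over a restricted level range keeps the estimate away from both the direct sound and the noise floor, which matters far more for measured responses than for this idealized decay.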
For the IACC, lower values indicate greater auditory spaciousness, because the signals reaching the two ears are less correlated. The Company of Jesus generally shows the lowest IACC across most frequencies, including the mid-to-high range, where values are as low as 0.4–0.5. These results indicate better performance in generating an enveloping sound field, which increases the perception of spatiality. In contrast, the IACC values are higher in the Metropolitan Cathedral and the Sagrario; specifically, below 1 kHz the values are highest in the Cathedral, while above 4 kHz the Sagrario exhibits the highest values, denoting a more directional sound field and a weaker sense of space. These data support the acoustic singularity of the Company of Jesus in terms of spatial sensations.
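For reference, the IACC is the maximum absolute value of the normalized cross-correlation between the left-ear and right-ear impulse responses for lags within ±1 ms, so a value near 1 means nearly identical ear signals and a value near 0 means nearly unrelated ones. The Python sketch below implements that definition for whole signals; it is an illustrative assumption-laden version (no octave-band filtering, no early or late time windowing, random noise instead of measured binaural responses), not the processing used in the study.

```python
import numpy as np

def iacc(left, right, fs, max_lag_ms=1.0):
    """Maximum absolute normalized interaural cross-correlation within +/- max_lag_ms."""
    left = np.asarray(left, dtype=float)
    right = np.asarray(right, dtype=float)
    n = min(left.size, right.size)
    left, right = left[:n], right[:n]
    norm = np.sqrt(np.sum(left ** 2) * np.sum(right ** 2))
    max_lag = int(round(max_lag_ms * 1e-3 * fs))
    best = 0.0
    for lag in range(-max_lag, max_lag + 1):
        if lag >= 0:
            value = np.dot(left[: n - lag], right[lag:])   # right delayed by `lag`
        else:
            value = np.dot(left[-lag:], right[: n + lag])  # left delayed by `-lag`
        best = max(best, abs(value))
    return best / norm

# Hypothetical signals: identical ears versus statistically independent ears.
fs = 44100
rng = np.random.default_rng(1)
noise = rng.standard_normal(4410)
identical = iacc(noise, noise, fs)
independent = iacc(noise, rng.standard_normal(4410), fs)
print(f"identical ears: {identical:.3f}, independent ears: {independent:.3f}")
```

Restricting the signals to the first 80 ms after the direct sound, or to the part after 80 ms, would give early and late variants of the same quantity.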
Principal component analysis with varimax rotation was conducted to identify orthogonal factors and their primary indicators. The Kaiser–Meyer–Olkin Index (KMO = 0.94), along with the measure of sampling adequacy (MSA) for each indicator, was used to assess sample adequacy [
30]. All indicators surpassed the acceptable threshold of 0.7 based on the MSA values, confirming their suitability for inclusion in the analysis.
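The sampling adequacy statistics mentioned above compare ordinary correlations with partial correlations: when indicators share common factors, partial correlations are small and the index approaches 1. The Python sketch below computes the overall KMO and the per-indicator MSA from a data matrix using that definition. The simulated one-factor data set is hypothetical and serves only to exercise the function; it does not reproduce the survey data.

```python
import numpy as np

def kmo(data):
    """Return (overall KMO, per-variable MSA) for an (n_samples, n_variables) array."""
    corr = np.corrcoef(data, rowvar=False)
    inv = np.linalg.inv(corr)
    scale = np.sqrt(np.diag(inv))
    partial = -inv / np.outer(scale, scale)   # partial correlations given all others
    np.fill_diagonal(corr, 0.0)               # keep off-diagonal terms only
    np.fill_diagonal(partial, 0.0)
    r2 = corr ** 2
    p2 = partial ** 2
    msa = r2.sum(axis=0) / (r2.sum(axis=0) + p2.sum(axis=0))
    overall = r2.sum() / (r2.sum() + p2.sum())
    return float(overall), msa

# Hypothetical data: six indicators driven by one common factor plus noise.
rng = np.random.default_rng(0)
factor = rng.standard_normal((500, 1))
data = factor + 0.5 * rng.standard_normal((500, 6))
overall, msa = kmo(data)
print(f"KMO = {overall:.2f}")
print("MSA per indicator:", np.round(msa, 2))
```

Indicators whose MSA falls below the chosen threshold would normally be dropped before factor extraction.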
The SEM development underwent a multi-stage validation process, which included EFA, CFA, and SEM path modeling techniques. The EPA approach determined that Activity (η1), Potency (η2), and Evaluation (η3) represent the fundamental latent variables characterizing the perceived acoustic quality within heritage churches.
Several models were tested for the EFA, and
Table 5 presents the results of the EFA that showed the highest association. Three clear factors emerged from the EFA, with eigenvalues above 1, explaining 74% of the total variance. The variance distribution was 35% for Factor 1 (Activity), 19% for Factor 2 (Potency), and 20% for Factor 3 (Evaluation). The principal factors extracted from the EFA aligned with those identified in previous studies [
31]. Activity showed high loadings from the Noise, Tense, and Unpleasantness indicators, which exceeded 0.89, while Potency revealed strong loadings from the Loudness, Reverberation, and Strength indicators. The Evaluation dimension relied on the indicators Speech and Music Clarity, Brightness, and Understandability.
CFA confirmed the structure because every standardized factor loading was statistically significant and surpassed the advised threshold of 0.70, except for one marginal instance.
Table 6 demonstrates that the AVE for each construct exceeded 0.50, thereby confirming convergent validity. The measurement of internal consistency proved strong, as the CR and Cronbach’s alpha values exceeded the accepted 0.70 standard [
32].
To test the hypotheses, the structural relationships between the latent and observed variables were analyzed, as presented in
Figure 7 and
Table 7. The goodness-of-fit metrics for the conceptual model are presented in
Table 8. The regression results (
Table 7) indicate that all structural paths are statistically significant (
p < 0.001). For example, Activity was a significant predictor of Potency, and Potency significantly influenced Evaluation, suggesting a mediating mechanism whereby negative emotional responses (e.g., Noise, Tense) intensify the perceived acoustic environment, subsequently shaping global evaluative judgments of sound quality.
Table 8 shows an acceptable model fit. Although the Tucker–Lewis Index (TLI) and the root mean square error of approximation (RMSEA) values are slightly outside the ideal ranges, they still fall within acceptable limits, given the model complexity and large sample size. The TLI, which penalizes overly complex models, is the index that most penalizes the fit, suggesting that the model does not capture all the variability in the data. However, the main paths are consistent with the theory [33]. The standardized root mean square residual (SRMR) fell within the recommended values. Overall, the model exhibited a satisfactory fit to the data.
The SEM results confirm the hypothesized relationships among the latent constructs derived from the EPA model (see
Table 9). Specifically, perceived activity had a significant and positive influence on potency (β = 0.881,
p < 0.001), while potency had a strong adverse effect on evaluation (β = −0.872,
p < 0.001). This suggests a mediating mechanism through which the spectral characteristics of sound modulate the perceived intensity and, ultimately, the affective appraisal of the acoustic environment in heritage churches. Sociodemographic factors such as age and frequency of visits to the assessed churches were also significant. Adults and seniors reported higher perceptions of activity and potency but tended to evaluate the soundscape more negatively (e.g., adults→evaluation: β = −0.378,
p < 0.001). Additionally, the impact of residential location was a significant factor; the results suggest a significant heterogeneity effect among the residential locations of participants in Quito’s main districts (north, center, and south). Participants living in the north and south consistently perceived higher levels of activity (north location→activity: β = 0.581,
p < 0.001; south location→activity: β = 0.06,
p < 0.05) than those living in the central zone where the churches were located. Participants living in the north also rated potency higher than those living in the center (north location→potency: β = 0.665,
p < 0.001). However, participants living in the north and south declared a negative evaluation of sounds inside buildings when compared with people living in the city center (north location→evaluation: β = −0.59,
p < 0.001; south location→evaluation: β = 0.06,
p < 0.05). These findings highlight significant differences in how participants evaluate sound environments in religious spaces and underline the role of both physical acoustic parameters and listener context in shaping subjective sound quality evaluations.
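The mediation reported in this section can be quantified with the product-of-coefficients rule: the indirect effect of Activity on Evaluation through Potency is the product of the two standardized path coefficients. The short Python sketch below applies that rule to the reported estimates. It assumes the model contains no additional direct path from Activity to Evaluation, which the text does not state explicitly, and it says nothing about the standard error of the indirect effect.

```python
# Standardized path coefficients reported in the text.
beta_activity_to_potency = 0.881
beta_potency_to_evaluation = -0.872

# Product-of-coefficients estimate of the indirect (mediated) effect.
indirect_effect = beta_activity_to_potency * beta_potency_to_evaluation
print(f"Indirect effect of Activity on Evaluation via Potency: {indirect_effect:.3f}")
```

Under that assumption, a one standard deviation increase in perceived Activity corresponds to a decrease of roughly 0.77 standard deviations in Evaluation, transmitted entirely through Potency.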
5. Discussion
The research findings demonstrate the efficacy of the EPA approach in studying acoustic perception within heritage churches. The latent construct Activity captures negative emotional responses (such as Noise and Tense), Potency measures the perceived intensity of sound (such as Loudness and Reverberation), and Evaluation assesses the overall perceived sound quality (such as Clarity and Brightness). This structure illustrates how acoustic perception in sacred spaces is formed through physical sound characteristics together with emotional and cognitive reactions to the auditory surroundings.
The acoustic measurements revealed that the three churches exhibited moderately long RTs ranging from 2.1 s to 2.6 s, with the Company of Jesus demonstrating slightly superior acoustic behavior across several parameters, including C80 and IACC. These values contrast with those reported in [7], where excessively high RTs were observed in Andalusian cathedrals (exceeding 8 s in some cases), although that study also highlighted the significance of architectural features and spatial source positioning (e.g., retro-choir) in mitigating these effects. In this study, the placement of the sound source at the altar, together with the natural absorption of the materials, likely contributed to the comparatively better speech intelligibility and spatial impression. However, the churches analyzed in this study were also significantly smaller.
Interestingly, despite low STI values (≈0.4), the perceptual clarity reported by participants was comparatively high, particularly in the Company of Jesus. This decoupling between physical and subjective intelligibility reinforces the observations made by Abadía [
3], who found low D50 values (<50%) across several Jesuit churches, yet moderate to high subjective ratings for clarity. This discrepancy highlights the influence of non-acoustic factors, including visual cues, source distance, and cognitive adaptation, in shaping sound perception in sacred contexts.
The SEM analysis confirmed the structural validity of the model, which explained 74% of the total variance. High factor loadings, composite reliability, and satisfactory goodness-of-fit indices support the robustness of the model. Activity significantly predicted Potency, which in turn strongly influenced Evaluation, validating the hypothesized mediation pathway. This mediation suggests that emotionally charged reactions (e.g., stress and discomfort) intensify perceived sound dominance, which subsequently shapes the global evaluative judgment. Such mediation aligns with the conceptual structure proposed in [14] and is further supported by the high AVE and CR values observed in the CFA.
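For readers less familiar with the AVE and CR indices mentioned above, the short sketch below shows how they are conventionally computed from standardized factor loadings. The loadings are hypothetical values chosen for illustration only and are not the estimates obtained in this study; the function names are likewise illustrative.

```python
def average_variance_extracted(loadings):
    """AVE: mean of the squared standardized loadings of one construct."""
    return sum(l * l for l in loadings) / len(loadings)

def composite_reliability(loadings):
    """CR: (sum of loadings)^2 / ((sum of loadings)^2 + sum of error variances).

    For standardized loadings, each indicator's error variance is 1 - loading^2.
    """
    s = sum(loadings)
    error_variance = sum(1.0 - l * l for l in loadings)
    return s * s / (s * s + error_variance)

# Hypothetical standardized loadings for a single latent construct
hypothetical_loadings = [0.82, 0.78, 0.85, 0.74]

ave = average_variance_extracted(hypothetical_loadings)
cr = composite_reliability(hypothetical_loadings)
print(f"AVE = {ave:.3f}, CR = {cr:.3f}")
```

Commonly cited rules of thumb treat AVE of at least 0.5 and CR of at least 0.7 as indicating adequate convergent validity and internal consistency, respectively.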
The path from Potency to Evaluation suggests that the mere intensity or dominance of sound may not be inherently negative; instead, it becomes evaluatively salient when modulated by emotional or contextual cues. This finding aligns with Abadía’s study [34], which suggests that spatial volume and materiality play a defining role in listener impressions, independent of the reverberation time alone. Notably, a high spatial impression (IACC) may enhance immersive experiences even in moderately reverberant spaces, reinforcing the semiotic and affective importance of envelopment in sacred acoustics [35].
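The mediation pathway from Activity through Potency to Evaluation can be made concrete with a small numerical sketch. In a simple mediation chain, the indirect effect is the product of the two path coefficients, and a Sobel-type statistic gives a rough indication of its significance. The coefficients and standard errors below are hypothetical placeholders, not the estimates from this study, and the Sobel test is offered only as one common approximation rather than the procedure the authors applied.

```python
import math

def indirect_effect(a, b):
    """Indirect effect of X on Y through a mediator M: a (X -> M) times b (M -> Y)."""
    return a * b

def sobel_z(a, se_a, b, se_b):
    """First-order Sobel z statistic for the indirect effect a * b."""
    return (a * b) / math.sqrt(b * b * se_a * se_a + a * a * se_b * se_b)

# Hypothetical standardized path coefficients and standard errors
a, se_a = 0.60, 0.10  # Activity -> Potency (assumed values)
b, se_b = 0.70, 0.10  # Potency -> Evaluation (assumed values)

print(f"indirect effect = {indirect_effect(a, b):.2f}")
print(f"Sobel z = {sobel_z(a, se_a, b, se_b):.2f}")
```

With real data, bootstrap confidence intervals for the indirect effect are often preferred to the Sobel approximation, because the product of two coefficients is generally not normally distributed.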
This study reinforces the concept that the acoustic quality of heritage churches cannot be fully understood using objective metrics alone. Perceptual experience is shaped by a complex interplay of physical, emotional, and contextual factors that SEM can effectively capture. These insights have practical implications for acoustic heritage conservation, soundscape-based assessments, and the integration of perceptual models into building acoustic standards. Although focused on colonial churches, this conceptual framework could be extended to other heritage venues, such as ancient theaters or classical opera houses, provided that appropriate contextual adaptations are made to the perceptual instruments.
While this study provides valuable insights into the perceptual and acoustic dynamics of heritage churches, it has several limitations that must be acknowledged. First, the acoustic measurements were taken under unoccupied conditions, which may not accurately reflect how sound absorption and diffusion behave with live audiences, although controlled unoccupied measurements provide a stable reference for subsequent perceptual assessments. Even though statistical averaging and zone-based aggregation were applied to approximate typical usage scenarios, dynamic occupancy remains a variable that could affect the generalizability of the results. Additionally, perceptual data were gathered immediately after individual visits rather than during services, potentially underrepresenting the full range of soundscape experiences.
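The likely direction of the occupancy effect can be sketched with Sabine's formula, RT = 0.161 V / A, where V is the room volume in cubic meters and A is the total equivalent absorption area in square meters. All numbers below (volume, unoccupied absorption, number of occupants, and absorption per person) are hypothetical and are not taken from the measured churches; they only show how added audience absorption would shorten the reverberation time.

```python
def sabine_rt(volume_m3, absorption_m2):
    """Sabine reverberation time in seconds for volume V (m^3) and absorption A (m^2)."""
    return 0.161 * volume_m3 / absorption_m2

# Hypothetical room: chosen so that the unoccupied RT is 2.3 s
volume_m3 = 10_000.0
absorption_unoccupied_m2 = 700.0

# Hypothetical congregation: 200 people at an assumed 0.45 m^2 of absorption each
n_people = 200
absorption_per_person_m2 = 0.45
absorption_occupied_m2 = absorption_unoccupied_m2 + n_people * absorption_per_person_m2

rt_unoccupied = sabine_rt(volume_m3, absorption_unoccupied_m2)
rt_occupied = sabine_rt(volume_m3, absorption_occupied_m2)
print(f"unoccupied RT = {rt_unoccupied:.2f} s, occupied RT = {rt_occupied:.2f} s")
```

Sabine's formula assumes a diffuse sound field and moderate absorption, so this estimate is only indicative; measurements under occupied conditions would be needed to quantify the actual change.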
Furthermore, although SEM demonstrated strong explanatory power, it was limited to three latent constructs (EPA) and may therefore not account for other dimensions, such as cultural context, visual-auditory interaction, or temporal variation in sound perception.
Future research should incorporate different occupancy scenarios and expand the model by including additional variables such as spatial congruence, expectation bias, and ritual familiarity. These enhancements could improve our understanding of how sacredness, history, and religious significance influence soundscape evaluation. Longitudinal studies tracking acoustic perception over time—or during different liturgical events—could also offer deeper insights into the dynamic relationship between sound environments and human experience. Further integration of binaural recordings, immersive auralizations, and virtual reality simulations could help assess design interventions or material additions in a non-invasive way, particularly in sensitive heritage contexts.
6. Conclusions
The acoustic characterization of three heritage churches in Quito not only revealed the objective physical conditions of these spaces, as identified by reverberation times, clarity indices, and speech intelligibility values, but also provided valuable insights into how these conditions are perceived by users/visitors through systematic subjective evaluation. The EPA approach to auditory perceptions revealed that, although the spaces were generally perceived as imposing and culturally resonant (Potency), the listeners found them somewhat confusing or lacking in clarity (Evaluation) and experienced a dense or saturated acoustic environment (Activity). These subjective perceptions complement the quantitative acoustic metrics and confirm the limitations posed by speech intelligibility, as well as the perceived magnificence and envelopment of sound, hallmarks of sacred spaces.
The proposed methodology provides evidence for experimental replicability in future research involving the use of surveys and perceptual questionnaires, which collect sociodemographic information (e.g., gender and occupational background), behavioral patterns (e.g., frequency and purpose of visits), and participants’ insights into sound assessment using soundscape indicators focused on historical building environments. The information was examined using sophisticated data analysis tools, such as the structural equation model, which integrates several multivariate techniques into a single comprehensive model. The findings reported in this research complement the understanding of complex path models with three latent variables: activity, potency, and evaluation. Only a few SEM studies have been described in the literature concerning the analysis of the relationship between the three perception components, with the added benefit of being assessed using sociodemographic and location characteristics in the context of historic church buildings.
The SEM examines the complex relationships between sound attributes and perceptual indicators, such as soundscape dimensions. First, the SEM approach analyzes perceptual indicator information to test explanatory conceptual models or latent constructs. Sociodemographic attributes are then categorized to test their relevance and the interrelationships between them. Finally, this multivariate statistical method can be used to measure indirectly the structured categories of attributes, the latent constructs, and their interrelations, and to test inferences about the different categories when the latent variables are observed independently. Previous studies evaluating sound quality in buildings have provided limited evidence, examining only the relationships between subjective factors [14]. The present study employed an advanced multivariate statistical tool to analyze the data and examine the influences of sociodemographic and location characteristics. The gathered survey information on activity, potency, and evaluation can therefore be integrated into an analytical tool for assessing historical churches that takes individual attributes and location characteristics into account.
To investigate the relationship between acoustic and soundscape perceptual descriptors, the SEM approach was employed as a robust analytical tool to examine the intricate nature of ecclesiastical acoustics. Key results from the SEM demonstrated statistically confirmed pathways between sociodemographic variables and latent perceptual constructs, supporting the notion that acoustic quality is not entirely captured by classical measures of intelligibility or clarity. Potency was a positive predictor of Evaluation, which suggests that loudness and spatial spread can be beneficial within a liturgical space. In contrast, Activity, which describes auditory dynamism and temporal density, was negatively correlated with Evaluation. This study demonstrates that the combined use of quantitative measures and auditory perception is crucial when examining acoustic heritage. In addition to being a robust approach, SEM provides a conceptual framework for bridging the gap between measurable acoustic events and human sensory perception. The methodology developed in this study can be applied to future research on the architectural acoustics of culturally significant contexts. It advocates refined interdisciplinary frameworks that recognize the diverse acoustical identities embedded in historical sacred spaces.
The findings of this study demonstrate that sound assessment design strategies should consider the functional aspects of places such as heritage church buildings. Further studies should investigate the practical applications of these strategies in different urban and architectural contexts. Additionally, a model describing the relationship between soundscape factors and physical acoustic parameters should be developed to create practical sound assessment indicators for designing urban acoustic environments based on soundscape perception.
This study demonstrates the scientific relevance of applying the SEM to acoustic assessments. The SEM provides a conceptual framework that links measurable acoustic events to human sensory perception, addressing the limitations of relying solely on traditional acoustic metrics. The integrated approach combining objective measures, perceptual evaluations, and sociodemographic moderators offers valuable insights into heritage conservation, adaptive reuse, and acoustic interventions in culturally significant spaces.