Sensory Heritage Is Vital for Sustainable Cities: A Case Study of Soundscape and Smellscape at Wong Tai Sin
Round 1
Reviewer 1 Report
Comments and Suggestions for Authors
The document represents a significant and original contribution to the literature on sensory heritage and urban sustainability. The methodologies are adequate and the results convincing. The suggested revisions will improve the clarity, rigour and impact of the work without requiring substantial changes to the structure or conclusions.
The olfactory assessment methodology is less standardised than the soundscape approach and requires improvement in two key areas.
- Details of the olfactometry team's training should be provided.
- A more in-depth discussion of the limitations of the 24 Zarzo scales is needed. Several scales showed minimal variance at this site (shown in Figure 13), and the relatively high missing data rate (12.8%) requires recognition. Discuss the site-specific applicability of these scales and consider the potential benefits of developing temple-specific olfactory descriptors for future research.
The current structure of the results section needs to be reorganized to improve logical flow and readability. Implement the following sequence:
Acoustic environment - Characteristics of the soundscape - Characteristics of the acoustic landscape - Analysis of interviews
This progression from objective measurements to subjective perceptions to stakeholder experiences will improve understanding and create a more coherent narrative structure throughout the presentation of results.
Author Response
Please see attachment
Author Response File:
Author Response.pdf
Reviewer 2 Report
Comments and Suggestions for Authors
Recommendation: Major Revision
Manuscript Number: sustainability-3728975
Title: Sensory Heritage is Vital for Sustainable Cities: A Case Study of Soundscape and Smellscape at Wong Tai Sin
- Overview and general recommendation
The manuscript has done a good preliminary work. I think that the descriptions of some points were inadequate. I recommend that a major revision is warranted. I explain my concerns in more detail below. I ask that the authors specifically address each of my comments in their response.
- Major comments
1) What were the research questions in this manuscript? Currently, the description of the research question is somewhat vague. It is recommended that the authors refine the research question more precisely, clarifying whether the core of the study is to explore the internal mechanisms of a specific phenomenon or to solve a particular practical problem.
2) It is recommended that the systematic and comprehensive review of existing research would be conducted further, which can provide a strong basis for the proposal of research questions.
3) It is suggested that further explanations be made on the origin and construction of the theory of sensory heritage. How to guide the soundscapes and smellscapes?
4) In the manuscript, only this temple of Wong Tai Sin was analyzed. Can the case of this single temple present the primary significance in Chinese temple architecture as described in lines 560-561? Is this case of a temple typical and representative?
5) In the manuscript, how is the construction process of grounded theory?
6) How can the results of this manuscript be supported in the existing theories and literature? It is recommended to discuss further.
7) It is suggested to further explain the innovativeness and significance of knowledge contributions of this manuscript.
8) It is recommended to discuss Limitations of this manuscript.
- Minor comments
1) For Figure 4, it is suggested that the functional layout diagram in Figure 4 be redrawn. It is recommended to use simple color schemes to further clarify the purpose of each function.
2) The "Method" from lines 288 to 292 of result 3.4 is basically the same as the text from lines 157 to 163 of result 2.2.3. It is recommended to organize and modify it.
3) It is suggested that the writing format of references be further unified.
Author Response
Rev2
Author Notes to Reviewer
Comment 1: (The English is fine and does not require any improvement.)
Response 1: Thank you for these words. Nevertheless, while revising we have found ample room for improvement to language and style, and have combed the text for typos.
Comment 2: (The manuscript has done a good preliminary work. I think that the descriptions of some points were inadequate. I recommend that a major revision is warranted. I explain my concerns in more detail below. I ask that the authors specifically address each of my comments in their response.)
Response 2: Thank you. The revised manuscript addresses each of the comments and we hope the improvements are satisfactory.
Comment 3 ((1) What were the research questions in this manuscript? Currently, the description of the research question is somewhat vague. It is recommended that the authors refine the research question more precisely, clarifying whether the core of the study is to explore the internal mechanisms of a specific phenomenon or to solve a particular practical problem.)
Response 3: Thank you for pointing this out. The research questions, in three groups, are now formulated towards the end of the Introduction, as follows:
“With the present case study we placed the searchlight on Wong Tai Sin Temple, a unique but at the same time representative site in Hong Kong’s cityscape, intertwining culture, religion, tourism, and sensory experiences. Our approach was a mixed-methods approach. The first set of research questions related to descriptive characteristics of the chosen site. What is the place of Wong Tai Sin in the physical cityscape fabric? What is its role in the cultural landscape? How does it contribute to creating the identity of the city? What is the acoustic environment like and how is the soundscape perceived? How can we effectively characterise its smellscape? Measurements and observations were complemented by a set of interviews with visitors of varying backgrounds, to bring a deeper understanding of sensory heritage from the perspective of stakeholders. Through this, we addressed research questions about how people relate sensory perceptions with cultural practices: What specific sounds and smells contribute to a general appreciation of the site? How do soundscape and smellscape relate to, give shape to, and inform us about cultural and religious practices? Finally, we formulated questions to contextualise the study: What general conclusions can we draw from Wong Tai Sin towards other sites in Hong Kong and elsewhere? How can soundscape and smellscape be leveraged for the development of sensory heritage policy?”
Comment 4 ((2) It is recommended that the systematic and comprehensive review of existing research would be conducted further, which can provide a strong basis for the proposal of research questions. … (3) It is suggested that further explanations be made on the origin and construction of the theory of sensory heritage. How to guide the soundscapes and smellscapes?)
Response 4: Thank you for these comments which have prompted a thorough revision and extension of the Introduction. We hope that both queries are satisfactorily addressed:
“In cities such as Hong Kong, there is a constant struggle between human traditions, forces of technologically driven desires (‘smart cities’), and natural contextual constraints (such as climate change). Further emphasised by the high population density, these struggles and constraints place strong conditions on sustainable urban development of the city. While security, population health, and economy, are highest on the list of concerns by residents and decision-makers, matters such as general well-being, cultural identity, and heritage, are not far behind, especially given that they invariably interact with the more fundamental and measurable matters. There is much to be done to increase the awareness of the value that these qualities offer.
City spaces that are culturally valued typically present an “intertwined tangible-intangible duality, expressed both as a physical construction and as a set of social, traditional practices” (Lenzi, Sabada, & Lindborg 2021). In this context, sensory heritage is vital. The origin of this concept might be traced to sensory history and anthropology (Classen et al. 1987, Augé 1992; Bembibre 2024). Multisensory approaches have increasingly gained importance in heritage studies (Parker et al. 2023; Ppali et al. 2024; Xu et al. 2022). In this body of research, the understanding of what constitutes sensory heritage builds on two well established concepts, namely tangible and intangible heritage. The former refers to physical artefacts in the built environment, emphasising “architecture, their homogeneity or their place in the landscape” (UNESCO 1972). The intangible cultural heritage (ICH; UNESCO 2003), is constituted by any "non-corporeal manifestation of tradition-based creativity [that reflects] the community's social or cultural identity. It includes… the social, intellectual and cultural processes that… have made possible the development of a distinct cultural tradition whose preservation and protection is important…" (UNESCO 2021b p. 5). In Hong Kong, these forms of heritage fall, respectively, under the purview of the Antiquities and Monuments Office and the Intangible Cultural Heritage Office.
However, these frameworks have been criticised (Caust & Vecco 2017; Lindborg et al. 2024) for an intrinsic hierarchisation of the senses, so that items appreciable by sight (e.g. monument, bridge, dance performance) are prioritised over those accessible through touch (e.g. sculpture, textile) or hearing (e.g. soundscape), and again over smell or taste. The concept of sensory heritage captures something that goes beyond the UNESCO definitions, which are framed as physical objects and tradition-based practices. To identify what this ‘thing’ might be, we may start by acknowledging that the specific urban sites that a community values combine physically persistent and ephemeral qualities. Two observations can be made to show gaps in the definitions stated above. Firstly, some aspects of the experience when within physical environments, for example the sounds and smells that they contain, might be ephemeral in the sense that the individual object whence they originate may quickly vanish, and yet they might be more or less continually replenished so as to create a long-term, steady-state presence. Hence, sounds and smells can take on a physical character (cf. Firat 2021). Secondly, the actions that give rise to those sounds and smells might not be fully intentional (as opposed to the clear ICH situation of, say, a traditional dance performance), but rather, the side-product of any kind of action, be it ‘cultural’ or not (cf. semi-designed environments, Lindborg 2015). The action itself might be everyday-ish or menial, but what matters here is that the perception of such sounds and smells can take on significance (perhaps over time) for a community. Hence sounds and smells take on a cultural character. Thus, the concept of sensory heritage is novel and vital to understand urban sites of value.
The focus that we are here placing on sounds and smells is certainly a simplification compared to the multisensory complexity experienced in natural environments. As noted above, vision plays a primary role in the appreciation of cultural heritage, and intermodal relationships between visual, auditory, and olfactory senses combine to great effect (Spence, 2020, p. 2; Lindborg & Liew 2021). Visual and cognitive factors can dominate smell perception. For example, a noteworthy experiment showed that the smell of white wine that had been dyed red was described with red wine descriptors by students of œneology (Brochet & Dubourdieu 2001).
Throughout our lives, multi-sensing is the normal condition, and therefore lived experiences are encoded in multisensory memory. There is ample evidence that smells can trigger episodic memory recall (Herz 2011), and non-olfactory stimuli can trigger smell memories, true or not (Keller & Vosshall 2004). If not lived and true as memories, they might be projections of imagined smells (Young 2020; Lindborg & Liew 2021). Both smells and sounds can facilitate spatial recall through activities such as visualisation by sketching, or physicalisation through clay-making, thus bringing a sense of tangibility to recalled memories (Xiao & Lindborg 2025).
Inching towards a definition of sensory heritage, we look towards research in urban soundscape and smellscape. The International Organisation for Standardisation defines soundscape as constituted by multiple sounds in the acoustic environment, perceived and understood by individuals or groups of people (ISO 2018: 12913-1). Soundscape can have positive cultural value as well as negative impact on community health. Similarly, smellscape is the perceived olfactory environment, resulting from a complex mixing of volatile olfactory compounds. While smell has always been a part of human experience and informal research for hundreds of years, smellscape research only started to emerge in the 1980s (Porteous 1985; Henshaw 2013; Xiao 2018). As pointed out by Bembibre, “the significance of smell in connection with heritage is rarely recognized. This is caused by 1) fragmented knowledge of the sensory worlds of the past and the present, 2) the low awareness of the importance of smells and olfaction in intangible heritage practices, and 3) the lack of adequate methods to identify, record and safeguard smells” (Bembibre 2024).
With the above in mind, we tentatively define sensory heritage as follows: the sum total of the culturally valued sensorial experiences of a community, manifested as sights, sounds, smells, tastes, and textures, enabled through practices, rituals, and everyday activities, together with narratives and memories.
Sensory heritage is starting to be recognised in national legislation, with France leading the way (Morel-À-L’Huissier 2020). To advance the development of a legal framework, policy research should not only take into account the existing legislation that serves to regulate nuisance (such as noise and malodour) but also identify valued exponents of the sensory heritage (e.g. positive sounds and smells) that need to be protected and are beneficial to promote. Sensory heritage is a vital component of people’s connection to their environment, both current and past through memory, stories, and history. While the concept need not abide by a strict definition, increased awareness and attention to it will generate wider benefits in the economy. Everyday activities and narratives shape people’s identity, and may offer added value to the lived experiences of people. To illustrate how sensory heritage can be understood as the intersection between everyday practices, legislative ordinances, soundscape, smellscape, built heritage and ICH, refer to the schematic overview in Figure 1.”
Comment 5 (In the manuscript, only this temple of Wong Tai Sin was analyzed. Can the case of this single temple present the primary significance in Chinese temple architecture as described in lines 560-561? Is this case of a temple typical and representative?)
Response 5: Yes, we argue that Wong Tai Sin is representative of Chinese temples in Hong Kong. We have carefully edited the section describing the site for clarity and to avoid redundancy. Under ‘Materials and methods’, there is a new section, “Context”, followed by “Site characteristics” (from line 94):
“Context
The term ‘Chinese temples’ refers to places of worship for Chinese religions, including folk religion, Buddhism, Taoism, and Confucianism. In Cantonese, the main language of Hong Kong, terms like 廟 (miu6 in Jyutping romanization) or 寺廟 (zi6 miu6) are used. A liberal definition would also include monasteries 觀 (gun3), nunneries 庵 (am1), and shrines built for religious purposes, even if they are strictly speaking not temples in the context of Mandarin or Cantonese languages. The legal definition follows the more liberal usage of the term (Chinese Temples Ordinance). Hong Kong has more than 300 registered Chinese temples. The Chinese Temples Committee, established in 1928 under the Chinese Temples Ordinance, is the largest managerial body. Most temples were built before the 1950s on sites that were originally on seafronts or grounds relatively empty and far from residential areas. With the city’s extensive urban development, they are now typically surrounded by residential, commercial, and industrial buildings (Cai & Wong, 2021; Lindborg et al., 2024; Lam et al., 2024; 2025). Today many local residents are fairly secularised, and attending Chinese temples might be more of an expression of respect for tradition, which is an important matter, rather than an activity borne out of religious fervour (Liu, 2003).
Site characteristics
One of the oldest Chinese temples in Hong Kong is Wong Tai Sin Temple 黃大仙祠 (wong4 daai6 sin1 ci4; henceforth abbreviated WTS). The name literally means ‘yellow great immortal shrine’. It is a well-known establishment that is in many ways representative of Hong Kong’s Chinese temples. It is the largest of its kind and has grown to become a tourist attraction. As such, different groups of people may relate to the temple in differing ways, while co-existing within the establishment. Typical elements of Chinese temples in regard to sounds and smells include the playing of music, the sounds (and smells) of rituals, and the burning of incense. Whilst these activities form an integral part of cultural and religious practices, they also bring nuisance or even negative health impacts (such as respiratory diseases) to individuals (Lam et al. 2024; Cai & Wong 2021; Wang et al. 2007).
WTS is situated in the northern part of Kowloon peninsula towards Lion Rock Country Park (Figure 2). The first structures on the site were erected in the early 1920s. Today, it covers an area of approximately 18,000 m², with multiple buildings, water features, and a park. The architecture is traditional, with red pillars, gold-coloured roof with blue friezes, yellow latticework, and multi-coloured carvings (Figures 3-4). With more than 10,000 visitors each day, WTS is a prominent and bustling shrine, renowned for its claim to "make every wish come true upon request" 有求必應 (jau5 kau4 bit1 jing3). The photos in Figures 5-7 give an impression of colours, layout, and activities. The temple is unique in that it embraces three major religions: Taoism, Buddhism, and Confucianism, with halls dedicated to deities from each, such as the Three Saints Hall 三聖堂 (saam3 sing3 tong4) which dates from 1972. WTS is notable for being the only temple in Hong Kong authorised to conduct Taoist wedding ceremonies, and its ‘beliefs and customs’ have been listed as an Intangible Cultural Heritage of China (Sik Sik Yuen 2025). The complex is managed by the religious charitable organisation Sik Sik Yuen 嗇色園 (sik1 sik1 jyun4), promoting its function as a symbol of Hong Kong identity (Guo 2024).”
Comment 6 (In the manuscript, how is the construction process of grounded theory?)
Response 6: To clarify the process of interviews, transcription, and text analysis, we have revised the section ‘Interviews’ as follows (from line 170):
“We developed a method for ad hoc street interviews with visitors to understand their experiences, memories, and perspectives on the soundscape and smellscape (Flick 2000). Members of the team individually approached all kinds of visitors to the site, inside or in the immediate vicinity of the temple. After consenting to being interviewed and audio recorded, the conversation followed a semi-structured format (see Supplementary Materials 4), typically 4-6 minutes in duration, covering four thematic areas, namely: the interviewee, their expectations, their thoughts and feelings about sounds and soundscape, and about smells and smellscape. The recordings were post-processed for clarity and transcribed. By interviewing different stakeholders (e.g., local residents/visitors, tourists, worshippers, staff members, merchants, etc.), our aim was to understand a range of perspectives regarding the sounds and smells of WTS in particular, and of Chinese temples in general.
The analysis of interviews was based on grounded theory using both classical content analysis (Bauer and Gaskell 2000) and LLM-supported BERTopic modelling (Grootendorst 2022). Sentiment analysis was likewise made both through interpretative ratings by the researchers and automatically using natural language processing tools. During analysis, our approach alternated between interpretative and computational methods. Details are given in the Results section.”
Comment 7 (How can the results of this manuscript be supported in the existing theories and literature? It is recommended to discuss further.)
Response 7: Thank you for this comment. The revised Introduction brings in existing literature and discusses how the UNESCO definitions of built and intangible heritage leave a gap that must be filled by some form of theoretical framework for sensory heritage. We still know too little to venture very far in this direction, but we propose a set of concepts that are related to sensory heritage.
In the Discussion section this thread is brought back and put in relation to the higher-level findings of the present study. At the start:
“In this study, we attempted to build an empirically based understanding of how the complementary duality of the physical and ephemeral might be evidenced.
As we have evidenced, Wong Tai Sin is representative of Chinese temples in Hong Kong yet has a unique role to play in the physical cityscape fabric as well as the ephemeral cultural landscape, as a place and guarantor preserving deep traditions for residents and believers, and promoting tourism by giving visitors opportunities for authentic experiences. As one of the oldest Chinese temples in Hong Kong, it is a cornerstone for creating the city’s identity.
Its soundscape is busy and city-like, with high sound pressure levels due to both visitors’ voices, piped background music, and surrounding traffic noise. Natural sounds such as birdsong are present in some areas of the site, such as the small park within the compound and in a line of trees on the plaza outside the main gate (see Figure 7). Specifically, the smellscape is dominated by smoky incense from the religious practice of burning joss sticks during prayers. Visitors often express an emotional attachment to this smellmark, but at the same time recognise the smoke as potentially harmful. In response to these concerns, Sik Sik Yuen has modernised by installing a water mist smoke suppression system at the main altar, and offering free incense sticks that produce less smoke. Unfortunately, the latter effort is straining relations with the local ‘free business people’ who sell sticks outside the temple compound: often women with limited means of gaining an income.”
Comment 8 (It is suggested to further explain the innovativeness and significance of knowledge contributions of this manuscript.)
Response 8: Thank you. We have developed the Conclusion section in this direction, and it now reads as follows:
“The aim of this study has been to contribute knowledge to sensory heritage research in general and to Hong Kong’s cultural landscape in particular. Through measurements, observations, ratings, and interviews with diverse stakeholders, this paper identifies and summarizes a range of varied perspectives on the sensory experience at Wong Tai Sin Temple. Analysis of interview responses has highlighted its defining characteristics, in comparison with similar sites in Hong Kong and abroad. People often express their experiences in emotional terms, and they spontaneously report memories, often from childhood, that put present-day experiences in relief. While specific sounds (such as the background music piped throughout the site) and smells (dominated by incense burning) are valuable to some people, others express concerns about noise and air pollution. The conflicting perspectives show that the field of sensory heritage is still in an early phase. More data needs to be collected from different stakeholder perspectives to identify the right questions, to refine the theoretical framework and create a unifying, productive definition. This, we believe, will stimulate policy development that goes beyond regulation of the negative impact of nuisances and strives to actively preserve and promote sensory heritage in sustainable and inclusive ways. The implications are clear for urban designers, travel agencies, and companies and government agencies that work with branding and virtual tourism. Our study explicates the links between soundscape, smellscape, and culture, and lends evidence to the importance of sensory heritage for sustainable cities.”
Comment 9 (It is recommended to discuss Limitations of this manuscript.)
Response 9: You are very right to point this out. Your comment here, and comments from the other reviewers, have prompted us to add a section, ‘Limitations of the study’, at the end of the Discussion. It reads as follows:
“The last point highlights the fact that the present study has important limitations. We acknowledge that the exclusion of video recordings from the analysis in the present study leaves many important aspects unexplored. Even when the focus is on hearing and smelling, sight will most certainly play a role in people’s appreciation of a complex multimodal environment. This limitation should be considered when interpreting the results. The rich audiovisual materials that we have captured at Wong Tai Sin, but not yet had time to explore, might form the basis for a future study.
Another limitation to consider concerns the ‘snapshot’ character of the study. The team visited the site on several occasions during the winter months outside of religious festivals. The weather was mild and dry on the days we collected field data. Given that the weather in Hong Kong varies quite dramatically over the year, for example with hot and rainy months in the summer, the changing conditions are likely to affect both soundscape and smellscape strongly. Religious festivals, which give rise to increased activity and more visitors, are another seasonal matter to consider. In order to form a full picture, data collection should be repeated several times over a longer period. We hope that the present study might provide indications of which tools and methods would be the most effective to employ in a larger and potentially more costly undertaking.
Finally, we must recognise that the field data that we were able to collect are only from the outdoor areas of WTS, while many of the religious activities are performed inside the temple buildings. At the outset, Sik Sik Yuen gave us permission to record audio and video only outdoors, and this restriction shaped the study design. Only towards the very end of the process, after gaining trust from representatives of the management, were we granted permission to record inside the Main Hall. Alas, this material has had to be set aside for future work.”
Comment 10 (minor 1) (For Figure 4, it is suggested that the functional layout diagram in Figure 4 be redrawn. It is recommended to use simple color schemes to further clarify the purpose of each function.)
Response 10: We considered this option, but since the 10 functional areas are all different in character, we have found it clearer to stick with the simple labels. The map is in fact similar to information placards found near Wong Tai Sin.
Comment 11 (minor 2) (The "Method" from lines 288 to 292 of result 3.4 is basically the same as the text from lines 157 to 163 of result 2.2.3. It is recommended to organize and modify it.)
Response 11: Thank you for pointing this out. Based on your comment and comments from the other reviewers, we have thoroughly revised section 2.2.3 (Methods/Smellscape ratings), and the corresponding paragraph in the Results has been reduced to avoid redundancy. The extended section 2.2.3 (Smellscape ratings) now reads as follows:
“Humans have a physiological capacity to detect and distinguish a vast number of smells. However, people's capacity to detect, identify, and describe smells (and smell mixtures) is relatively limited (Keller & Vosshall 2004). Dravnieks and collaborators (1978, 1984) arrived at a large ‘Smell Atlas’ giving the hedonic value (pleasantness) of odorous stimuli, including food items (such as lemon, dill, fried chicken, sour milk), non-food items (such as paint, cat urine, leather, tar), and also adjectives not specific to source (such as warm, stale, heavy). Zarzo (2021) re-analysed the atlas by having a large number of subjects rate 160 stimuli using a set of 146 descriptors. The high-dimensionality dataset was subjected to a series of principal component analyses seeking to categorise smell compounds in a reasonable number of classes. They arrived at a parsimonious model with 24 categories, and it forms the basis of our field ratings of environmental smells. The scales in this operational protocol are of three kinds. Firstly, Pleasant, is a single scale to estimate the general pleasantness of the smellscape. Secondly, a set of 10 scales cover non-food smell sources: Floral, Musk, Woody, Camphoraceous, Chemical solvent, Burnt, Sulfidic, Animal, Sickening, and Fetid decay. Thirdly, 13 scales cover food-related smell sources: Fruity, Citrus, Spicy, Balsamic.vanilla, Balsamic.caramel, Herbaceous, Green, Buttery, Nutty, Cooked.meat, Fatty, Fishy, and Sour. The presence of each smell is rated on a 9-step Likert scale (0 = Not present, 2 = A little, 4 = Some, 6 = A lot, 8 = Extremely much). Given the constraints of how the 24-categories model was constructed, the ‘Zarzo scales’ that we have employed here are not expected to provide a detailed description of complex smellscapes in general. It is likely that the characteristics of any specific site may warrant a more narrow or expanded set of descriptors and rating scales, which may be determined before data collection (e.g. based on previous studies) or afterwards (e.g. based on further dimensionality reduction).
To improve the quality of smellscape annotations using this operational protocol, the researchers have dedicated significant time to structured ‘olfactometry panel training’, following recommendations by Belgiorno and collaborators (2013). The individual skills and nose sensitivity of the team’s members were developed in around 15 training sessions conducted in the laboratory, approximately two hours every other week, prior to and during the period of the present case study. Each session consisted of multiple tasks. Firstly: blind testing (smelling) of smell compounds, both with self-collected samples (such as soil, fruit, cardboard, and coffee powder) and with the ‘scratchable’ booklets of the University of Pennsylvania Smell Identification Test (UPSIT; Doty et al. 1984). The accuracy of smell identification was tracked for each member and evidenced a general improvement over time. Secondly: group discussions about these smell experiences to develop a shared descriptive vocabulary and deeper understanding of the smells, their chemical nature, and cultural usage. The training increased the members’ awareness of smells and individual confidence in conducting smellscape annotations. Thirdly, we developed a web-based interface for field annotation of the smellscape (see Supplementary materials 2).”
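For illustration only, the rating protocol quoted above (24 scales, each rated from 0 to 8) could be represented in Python roughly as follows. This is a minimal sketch and not the authors' web-based annotation interface: the names `SCALES`, `validate`, and `missing_rate` are hypothetical, and recording an unrated scale as `None` is an assumption made here so that a missing data rate of the kind raised by Reviewer 1 can be computed.

```python
from typing import Dict, List, Optional

# Scale names as listed in the quoted section 2.2.3 (1 + 10 + 13 = 24 scales).
GENERAL = ["Pleasant"]
NON_FOOD = ["Floral", "Musk", "Woody", "Camphoraceous", "Chemical solvent",
            "Burnt", "Sulfidic", "Animal", "Sickening", "Fetid decay"]
FOOD = ["Fruity", "Citrus", "Spicy", "Balsamic.vanilla", "Balsamic.caramel",
        "Herbaceous", "Green", "Buttery", "Nutty", "Cooked.meat", "Fatty",
        "Fishy", "Sour"]
SCALES = GENERAL + NON_FOOD + FOOD

# Verbal anchors given in the text for the 9-step scale (integer steps 0..8).
ANCHORS = {0: "Not present", 2: "A little", 4: "Some", 6: "A lot",
           8: "Extremely much"}

# One annotation maps scale names to a rating; None marks a missing rating
# (an assumption of this sketch, not something stated in the manuscript).
Annotation = Dict[str, Optional[int]]


def validate(annotation: Annotation) -> None:
    """Raise ValueError if a scale name is unknown or a rating is not 0..8."""
    for name, value in annotation.items():
        if name not in SCALES:
            raise ValueError(f"unknown scale: {name}")
        if value is None:
            continue
        if not isinstance(value, int) or not 0 <= value <= 8:
            raise ValueError(f"rating out of range for {name}: {value}")


def missing_rate(annotations: List[Annotation]) -> float:
    """Fraction of scale ratings that are absent or None across annotations."""
    total = len(annotations) * len(SCALES)
    if total == 0:
        return 0.0
    missing = sum(1 for a in annotations for s in SCALES if a.get(s) is None)
    return missing / total
```

A check such as `missing_rate` would make it straightforward to report, per site or per scale, how much of the rating grid was left unfilled.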
Comment 12 (minor 3) (It is suggested that the writing format of references be further unified.)
Response 12: We agree with this comment and are paying close attention to reference formatting in the revision process.
Author Response File:
Author Response.pdf
Reviewer 3 Report
Comments and Suggestions for Authors
Lines 27-35: The concept of sensory heritage is explained solely through the listing of the five senses (visual, auditory, olfactory, gustatory, and tactile) without engaging sufficiently with established cultural heritage frameworks, such as UNESCO’s concept of Intangible Cultural Heritage. It is recommended that the authors incorporate relevant theoretical references and refine the definition to make the conceptual explanation clearer and more accessible.
Line 16: The presentation of research results is unclear.
Line 26: While the manuscript emphasizes auditory and olfactory perception as central to the study of sensory heritage, it overlooks the role of vision, which is widely recognized as the dominant modality in human sensory experience. The authors are encouraged to at least briefly acknowledge the intermodal relationship between visual, auditory, and olfactory stimuli, and clarify whether the exclusion of vision from the analysis may have implications for the interpretation of results.
Line 26: The research problem and objectives are not explicitly stated. Please consider clearly defining the research questions at the end of the introduction.
Lines 27–28: The content is largely repetitive with the Abstract. Please consider removing or rephrasing to avoid redundancy.
Lines 34–35: The conceptual description of sensory heritage would benefit from more scholarly references. Currently, the claims lack empirical support.
Lines 26-55: The research objective outlined in the abstract appears overly broad and does not clearly define the specific problem the study aims to address. It covers both olfactory and auditory aspects, as well as topics such as culture, health, and policy recommendations. The content is wide-ranging and fragmented, making it difficult to be effectively supported within the limited word count of a short paper.
It is recommended to include the core findings of the study in the abstract.
Figure 1: The conceptual diagram lacks depth and integration. It only lists related terms but does not provide a literature-based analytical framework. The authors are encouraged to supplement it with a critical review of how sensory heritage connects to broader theories in urban studies, memory, and cultural identity.
Lines 88–93: The practical and academic significance of the case study is not clearly articulated.
Line 94: It is recommended to divide the “Methodology” section into two parts, “Soundscape and Smellscape Data Collection” and “Interview Analysis and Topic Modelling”, to improve clarity and readability.
Line 94: The research methodology and analytical process involve multiple steps. It is recommended to include a flowchart illustrating the logical sequence (from field data collection → sensory rating → interviews → topic modeling → integrated interpretation) to help readers better understand the overall structure of the study.
Line 136: It is unclear whether measurement instruments were calibrated or error-checked before data collection.
Lines 138 and 148: The selection criteria and classification of participants should be clearly described. Were they laypeople, tourists, experts, or local residents? How were they recruited, and was any demographic balancing considered?
Line 174: The rationale for the interview topics remains unclear. It is suggested to supplement relevant theoretical foundations or references.
Lines 187–190: Since data collection occurred mainly during weekday afternoons in winter, the temporal and seasonal representativeness is limited. This limitation should be explicitly acknowledged and discussed in the discussion section.
Line 205: The study does not conduct a quantitative correlation analysis between the acoustic measurement data (e.g., SPL values) and the interview transcripts. Moreover, there appears to be a disconnection between the acoustic measurements and the interviews in terms of time, space, and data integration. It is recommended to clarify how the acoustic data could be associated with the subjective interview results or explain why such integration was not established.
Lines 207-210: The authors chose to fill in "NA" entries in the soundscape and smellscape ratings using the lowest value. This approach may introduce systematic bias. It is recommended that the authors consider using more neutral imputation methods, such as K-Nearest Neighbors (KNN) imputation or Multiple Imputation, for comparative validation. Alternatively, a sensitivity analysis could be conducted to determine whether different imputation strategies significantly affect the soundscape or smellscape evaluation metrics and subsequent statistical results, thereby enhancing the robustness of the model.
Lines 212-214: The use of "iTestMic2 and Type 2 sound level meters" is mentioned. It is recommended to supplement the equipment consistency test results to ensure data reliability.
Figure 8 is missing axis labels.
Lines 255-261: Regarding the non-parametric test, the specific differences between groups are not explained. It is advisable to add detailed results of post-hoc tests.
Line 304: The author uses 24 Zarzo indicators for factor analysis but does not explain why these specific 24 items were chosen or how their validity was verified. It is recommended to provide the rationale for this methodological choice in the methods section or literature review.
Line 319: The interpretation of latent variables such as “Sour” and “Incense” in the factor analysis lacks physiological or cultural justification (e.g., why “sulfidic smell” and “sourness” are grouped into the same category).
Line 337: It is suggested to elaborate on the sampling strategy for interviewees (e.g., whether random sampling, stratified sampling, etc. was used), and to provide additional sociological characteristics of the participants (such as gender, age, education level, visit frequency, etc.) to enhance the representativeness and credibility of the sample.
Line 378: The terms “BERtopic” and “BERTopic” appear in different capitalizations throughout the text. It is recommended to standardize the formatting of this term for consistency across the manuscript.
Line 571: For Figure 20, it is recommended to provide a clearer explanation, especially regarding how the strength of associations between different topics was calculated, along with the specific quantitative results.
Line 575: For example, while the paper highlights the significant connections between smellscape, religion, and emotion, it does not further explore the cultural and psychological mechanisms that explain why certain smells (such as incense) are closely associated with emotional and religious experiences.
Lines 743-746 The conclusion is vague and does not specifically correspond to the research objectives.
Lines 752-756 The contributions of the research lack differentiation from existing studies, and the discussion on limitations is one-sided, failing to cover the limitations of the research method itself.
Full text: It is recommended to use smellscape consistently throughout the paper, except when referring specifically to objective environmental factors, in which case olfactory environment may be used.
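As a hedged illustration of the sensitivity analysis suggested for Lines 207-210 above, the sketch below fills "NA" entries with three strategies (lowest value, per-scale median, and a simplified nearest-neighbour fill) and compares the resulting per-scale means. The ratings, the 1-5 scale, and all function names are hypothetical and not taken from the manuscript; the nearest-neighbour routine is a simplified stand-in for a library KNN imputer, not a replacement for one.

```python
from statistics import mean, median

# Hypothetical ratings on an assumed 1-5 scale; None marks an "NA" entry.
ratings = [
    [4, 3, None],
    [5, 3, 2],
    [4, None, 2],
    [2, 1, 1],
    [None, 2, 1],
    [3, 2, 2],
]

def observed(rows, j):
    """Observed (non-missing) values of scale j."""
    return [r[j] for r in rows if r[j] is not None]

def impute_lowest(rows, lowest=1):
    """Replace NA with the lowest scale value (the strategy under discussion)."""
    return [[lowest if v is None else v for v in r] for r in rows]

def impute_median(rows):
    """Replace NA with the median of the observed values on that scale."""
    meds = [median(observed(rows, j)) for j in range(len(rows[0]))]
    return [[meds[j] if v is None else v for j, v in enumerate(r)] for r in rows]

def impute_knn(rows, k=2):
    """Replace NA with the mean of that scale over the k most similar rows.

    Similarity is the mean squared difference over scales observed in both rows.
    """
    filled = []
    for r in rows:
        new = list(r)
        for j, v in enumerate(r):
            if v is not None:
                continue
            candidates = []
            for other in rows:
                if other is r or other[j] is None:
                    continue
                shared = [(a - b) ** 2 for a, b in zip(r, other)
                          if a is not None and b is not None]
                if shared:
                    candidates.append((mean(shared), other[j]))
            candidates.sort(key=lambda pair: pair[0])
            nearest = [value for _, value in candidates[:k]]
            # Fall back to the scale median if no comparable row exists.
            new[j] = mean(nearest) if nearest else median(observed(rows, j))
        filled.append(new)
    return filled

def scale_means(rows):
    """Mean rating per scale."""
    return [mean([r[j] for r in rows]) for j in range(len(rows[0]))]

strategies = {"lowest": impute_lowest, "median": impute_median, "knn": impute_knn}
results = {name: scale_means(fn(ratings)) for name, fn in strategies.items()}

for name, means in results.items():
    shift = [m - b for m, b in zip(means, results["lowest"])]
    print(name, [round(m, 2) for m in means], "shift vs lowest:", [round(s, 2) for s in shift])
```

If the shifts are small relative to the differences of interest between locations or scales, the choice of imputation strategy is unlikely to drive the conclusions; large shifts would suggest reporting results under more than one strategy.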
Comments on the Quality of English Language
The English expression of this article is smooth and professional.
Author Response
Rev3
Author Notes to Reviewer
Comment 1: (The English could be improved to more clearly express the research.…The English expression of this article is smooth and professional.)
Response 1: Thank you for the kind words, though perhaps the comments were contradictory. In the revision we have found ample room for improvement to language and style, and have combed the text for typos. We hope the English in the revised manuscript is fully satisfactory.
Comment 2: (Lines 27-35: The concept of sensory heritage is explained solely through the listing of the five senses (visual, auditory, olfactory, gustatory, and tactile) without engaging sufficiently with established cultural heritage frameworks, such as UNESCO’s concept of Intangible Cultural Heritage. It is recommended that the authors incorporate relevant theoretical references and refine the definition to make the conceptual explanation clearer and more accessible.)
Response 2: We agree with this comment which has prompted us to thoroughly revise and extend the Introduction. In particular, the relevant UNESCO frameworks are now referenced and we highlight how and why they should lead us to pay more attention to sensory heritage. The Introduction now reads as follows:
“In cities such as Hong Kong, there is a constant struggle between human traditions, forces of technologically driven desires (‘smart cities’), and natural contextual constraints (such as climate change). Further emphasised by the high population density, these struggles and constraints place strong conditions on sustainable urban development of the city. While security, population health, and economy, are highest on the list of concerns by residents and decision-makers, matters such as general well-being, cultural identity, and heritage, are not far behind, especially given that they invariably interact with the more fundamental and measurable matters. There is much to be done to increase the awareness of the value that these qualities offer.
City spaces that are culturally valued typically present an “intertwined tangible-intangible duality, expressed both as a physical construction and as a set of social, traditional practices” (Lenzi, Sabada, & Lindborg 2021). In this context, sensory heritage is vital. The origin of this concept might be traced to sensory history and anthropology (Classen et al. 1987, Augé 1992; Bembibre 2024). Multisensory approaches have increasingly gained importance in heritage studies (Parker et al. 2023; Ppali et al. 2024; Xu et al. 2022). In this body of research, the understanding of what constitutes sensory heritage builds on two well established concepts, namely tangible and intangible heritage. The former refers to physical artefacts in the built environment, emphasising “architecture, their homogeneity or their place in the landscape” (UNESCO 1972). The intangible cultural heritage (ICH; UNESCO 2003), is constituted by any "non-corporeal manifestation of tradition-based creativity [that reflects] the community's social or cultural identity. It includes… the social, intellectual and cultural processes that… have made possible the development of a distinct cultural tradition whose preservation and protection is important…" (UNESCO 2021b p. 5). In Hong Kong, these forms of heritage fall, respectively, under the purview of the Antiquities and Monuments Office and the Intangible Cultural Heritage Office.
However, these frameworks have been criticised (Caust & Vecco 2017; Lindborg et al. 2024) for an intrinsic hierarchisation of the senses, so that items appreciable by sight (e.g. monument, bridge, dance performance) are prioritised over those accessible through touch (e.g. sculpture, textile) or hearing (e.g. soundscape), and again over smell or taste. The concept of sensory heritage captures something that goes beyond the UNESCO definitions, which are framed as physical objects and tradition-based practices. To identify what this ‘thing’ might be, we may start by acknowledging that the specific urban sites that a community values combine physically persistent and ephemeral qualities. Two observations can be made to show gaps in the definitions stated above. Firstly, some aspects of the experience when within physical environments, for example the sounds and smells that they contain, might be ephemeral in the sense that the individual object whence they originate may quickly vanish, and yet they might be more or less continually replenished so as to create a long-term, steady-state presence. Hence, sounds and smells can take on a physical character (cf. Firat 2021). Secondly, the actions that give rise to those sounds and smells might not be fully intentional (as opposed to the clear ICH situation of, say, a traditional dance performance), but rather, the side-product of any kind of action, be it ‘cultural’ or not (cf. semi-designed environments, Lindborg 2015). The action itself might be everyday-ish or menial, but what matters here is that the perception of such sounds and smells can take on significance (perhaps over time) for a community. Hence sounds and smells take on a cultural character. Thus, the concept of sensory heritage is novel and vital to understand urban sites of value.
The focus that we are here placing on sounds and smells is certainly a simplification compared to the multisensory complexity experienced in natural environments. As noted above, vision plays a primary role in the appreciation of cultural heritage, and intermodal relationships between visual, auditory, and olfactory senses combine to great effect (Spence, 2020, p. 2; Lindborg & Liew 2021). Visual and cognitive factors can dominate smell perception. For example, a noteworthy experiment showed that the smell of white wine that had been dyed red was described with red wine descriptors by students of œneology (Brochet & Dubourdieu 2001).
Throughout our lives, multi-sensing is the normal condition, and therefore lived experiences are encoded in multisensory memory. There is ample evidence that smells can trigger episodic memory recall (Herz 2011), and non-olfactory stimuli can trigger smell memories, true or not (Keller & Vosshall 2004). If not lived and true as memories, they might be projections of imagined smells (Young 2020; Lindborg & Liew 2021). Both smells and sounds can facilitate spatial recall through activities such as visualisation by sketching, or physicalisation through clay-making, thus bringing a sense of tangibility to recalled memories (Xiao & Lindborg 2025).
Inching towards a definition of sensory heritage, we look towards research in urban soundscape and smellscape. The International Organization for Standardization defines soundscape as constituted by multiple sounds in the acoustic environment, perceived and understood by individuals or groups of people (ISO 2018: 12913-1). Soundscape can have positive cultural value as well as negative impact on community health. Similarly, smellscape is the perceived olfactory environment, resulting from a complex mixing of volatile olfactory compounds. While smell has always been a part of human experience and informal research for hundreds of years, smellscape research only started to emerge in the 1980s (Porteous 1985; Henshaw 2013; Xiao 2018). As pointed out by Bembibre, “the significance of smell in connection with heritage is rarely recognized. This is caused by 1) fragmented knowledge of the sensory worlds of the past and the present, 2) the low awareness of the importance of smells and olfaction in intangible heritage practices, and 3) the lack of adequate methods to identify, record and safeguard smells” (Bembibre 2024).
With the above in mind, we tentatively define sensory heritage as follows: the sum total of the culturally valued sensorial experiences of a community, manifested as sights, sounds, smells, tastes, and textures, enabled through practices, rituals, and everyday activities, together with narratives and memories.
Sensory heritage is starting to be recognised in national legislation, with France leading the way (Morel-À-L’Huissier 2020). To advance the development of a legal framework, policy research should not only take into account the existing legislation that serves to regulate nuisance (such as noise and malodour) but also identify valued exponents of the sensory heritage (e.g. positive sounds and smells) that need to be protected and are beneficial to promote. Sensory heritage is a vital component of people’s connection to their environment, both current and past, through memory, stories, and history. While the concept need not abide by a strict definition, increased awareness and attention to it will generate wider benefits in the economy. Everyday activities and narratives shape people’s identity, and may offer added value to the lived experiences of people. Figure 1 gives a schematic overview of how sensory heritage can be understood as the intersection between everyday practices, legislative ordinances, soundscape, smellscape, built heritage and ICH.”
Comment 3 (Line 16: The presentation of research results is unclear.)
Response 3: We agree. The revised Abstract now reads:
“Sensory heritage is constituted by culturally valued practices, rituals, and everyday activities as experienced through all the senses. While sight is our dominant sense, hearing and smelling are the two most immersive and pervasive senses. Soundscape is a major field within urban studies but smellscape is not fully appreciated. This study is part of Multimodal Hong Kong, a project that aims to document sensory cultural heritage across the city by capturing the complex interplay between soundscape, smellscape, urban experiences, everyday activities, and memory. We investigated the multisensory environment at Wong Tai Sin Temple through acoustic measurements and perceptual ratings of soundscape and smellscape within and around the site (197 locations), and conducted semi-structured interviews with visitors (N = 54, 15015 words) that were analysed through content analysis and natural language processing. Results show that elevated noise levels depend largely on human voices and pipe music within the compound, and traffic around it. The smell of incense dominates near the temple altars, while natural, grassy smells are present in the park area. Interview responses confirm that incense burning is a traditional religious activity that creates a smellmark for Chinese temples, but at the same time, is perceived as having negative implications on health. The study contributes to the growing body of sensory heritage research, underscoring the importance of soundscape and smellscape for culturally inclusive, vibrant, and sustainable cities. ”
Comment 4: (Line 26: While the manuscript emphasizes auditory and olfactory perception as central to the study of sensory heritage, it overlooks the role of vision, which is widely recognized as the dominant modality in human sensory experience. The authors are encouraged to at least briefly acknowledge the intermodal relationship between visual, auditory, and olfactory stimuli, and clarify whether the exclusion of vision from the analysis may have implications for the interpretation of results.)
Response 4: We recognise the important role of vision and intermodal relationships. Given a complex site such as Wong Tai Sin, a full multimodal analysis of visual appreciation and interactions between vision and other sensory modalities is beyond the remit of this study. In the Introduction, we have highlighted this, as well as the perspective that memory and imagination bring to perception, with references from the literature. (Introduction, about half-way):
“The focus that we are here placing on sounds and smells is certainly a simplification compared to the multisensory complexity experienced in natural environments. As noted above, vision plays a primary role in the appreciation of cultural heritage, and intermodal relationships between visual, auditory, and olfactory senses combine to great effect (Spence, 2020, p. 2; Lindborg & Liew 2021). Throughout our lives, multi-sensing is the normal condition, and information is encoded in multisensory memory. There is ample evidence that non-olfactory stimuli can trigger smell memories that are projected as imagined smells (Young 2020), and vice versa, that smells can trigger episodic memory recall (Herz 2011). Smells and sounds can facilitate spatial recall, for example through sketching, and clay-making can bring a sense of tangibility to recalled memories (Xiao & Lindborg 2025).”
Comment 4 (Line 26: The research problem and objectives are not explicitly stated. Please consider clearly defining the research questions at the end of the introduction.)
Response 4: Thank you for pointing this out. The research questions, in three groups, are formulated towards the end of this section, as follows:
“With the present case study we placed the searchlight on Wong Tai Sin Temple, a unique but at the same time representative site in Hong Kong’s cityscape, intertwining culture, religion, tourism, and sensory experiences. Our approach was a mixed-methods approach. The first set of research questions related to descriptive characteristics of the chosen site. What is the place of Wong Tai Sin in the physical cityscape fabric? What is its role in the cultural landscape? How does it contribute to creating the identity of the city? What is the acoustic environment like and how is the soundscape perceived? How can we effectively characterise its smellscape? Measurements and observations were complemented by a set of interviews with visitors of varying backgrounds, to bring a deeper understanding of sensory heritage from the perspective of stakeholders. Through this, we addressed research questions about how people relate sensory perceptions with cultural practices: What specific sounds and smells contribute to a general appreciation of the site? How do soundscape and smellscape relate to, give shape to, and inform us about cultural and religious practices? Finally, we formulated questions to contextualise the study: What general conclusions can we draw from Wong Tai Sin towards other sites in Hong Kong and elsewhere? How can soundscape and smellscape be leveraged for the development of sensory heritage policy?”
Comment 5: (Lines 27–28: The content is largely repetitive with the Abstract. Please consider removing or rephrasing to avoid redundancy.)
Response 5: This is true. We have avoided such redundancy in the revised and extended Introduction. Please see Comment/Response 2 for details.
Comment 6: (Lines 34–35: The conceptual description of sensory heritage would benefit from more scholarly references. Currently, the claims lack empirical support.)
Response 6: We agree that there is a gap between concept, theory, and empirical data in the field of sensory heritage research. In the present manuscript, regarding underlying concepts and theory, we would like to refer to Comment/Response 2 above, reporting the thoroughly revised and extended Introduction section. To our best knowledge, there is no published literature where a conceptual description of sensory heritage makes testable claims. We suggest that at this moment in the development of the field, studies that are based on new, empirical data which are subjected to exploratory analysis are helpful, in that such research might generate theory, from the bottom up. The diagram of ‘concepts related to sensory heritage’ that we have given in Figure 1 proposes some of the components that a theory of sensory heritage might include.
Comment 7: (Lines 26-55: The research objective outlined in the abstract appears overly broad and does not clearly define the specific problem the study aims to address. It covers both olfactory and auditory aspects, as well as topics such as culture, health, and policy recommendations. The content is wide-ranging and fragmented, making it difficult to be effectively supported within the limited word count of a short paper.)
Response 7: This is a point that we largely agree with. Hopefully the introduction, which has been thoroughly revised and extended, may adequately address this point: see Comment/Response 2 above. The objectives as outlined in the Abstract (especially the last two sentences) have been clarified.
Comment 8: (It is recommended to include the core findings of the study in the abstract.)
Response 8: We totally agree. Please see Comment/Response 3 for the revised Abstract.
Comment 9: (Figure 1: The conceptual diagram lacks depth and integration. It only lists related terms but does not provide a literature-based analytical framework. The authors are encouraged to supplement it with a critical review of how sensory heritage connects to broader theories in urban studies, memory, and cultural identity.)
Response 9: To improve the depth and integration, we have thoroughly revised the Introduction to address theory and conceptual development of sensory heritage, especially considering UNESCO definitions and recent critiques. There are also references to multisensory perception and recent policy research, to frame the need for raising awareness and impact of positive, culturally valued sounds and smells. Please see Comment/Response 2.
Comment 10: (Lines 88–93: The practical and academic significance of the case study is not clearly articulated.)
Response 10: We agree, and have emphasised the practical and academic significance at the end of the Abstract:
“The study contributes to the growing body of sensory heritage research, underscoring the importance of soundscape and smellscape for culturally inclusive, vibrant, and sustainable cities.”
The matter is also addressed in the revised Conclusion, as follows:
“The aim of this study has been to contribute knowledge to sensory heritage research in general and to Hong Kong’s cultural landscape in particular. Through measurements, observations, ratings, and interviews with diverse stakeholders, this paper identifies and summarizes a range of varied perspectives on the sensory experience at Wong Tai Sin Temple. Analysis of interview responses has highlighted its defining characteristics, in comparison with similar sites in Hong Kong and abroad. People often express their experiences in emotional terms, and they spontaneously report memories, often from childhood, that put present-day experiences in relief. While specific sounds (such as the background music piped throughout the site) and smells (dominated by incense burning) are valuable to some people, others express concerns about noise and air pollution. The conflicting perspectives show that the field of sensory heritage is still in an early phase. More data needs to be collected from different stakeholder perspectives to identify the right questions, to refine the theoretical framework and create a unifying, productive definition. This, we believe, will stimulate policy development that goes beyond regulation of the negative impact of nuisances and strives to actively preserve and promote sensory heritage in sustainable and inclusive ways. The implications are clear for urban designers, travel agencies, and companies and government agencies that work with branding and virtual tourism. Our study explicates the links between soundscape, smellscape, and culture, and lends evidence to the importance of sensory heritage for sustainable cities.”
Comment 11: (Line 94: It is recommended to divide the “Methodology” section into two parts, “Soundscape and Smellscape Data Collection” and “Interview Analysis and Topic Modelling”, to improve clarity and readability.)
Response 11: Thank you for this suggestion, which we have included in the revision.
Comment 12: (Line 94: The research methodology and analytical process involve multiple steps. It is recommended to include a flowchart illustrating the logical sequence (from field data collection → sensory rating → interviews → topic modeling → integrated interpretation) to help readers better understand the overall structure of the study.)
Response 12: This is a very constructive suggestion, thank you. We have created a new Figure 6 for the revision, to be inserted at the start of the section for Materials and Methods. To keep the total number of images manageable, we have removed one of the photos of the sites, and slightly rearranged the (proposed) layout of Figures 4 and 5.
Comment 13: (Line 136: It is unclear whether measurement instruments were calibrated or error-checked before data collection.)
Response 13: Yes, they were checked and calibrated before being put to use. This has been clarified in the revision.
Comment 14: (Lines 138 and 148: The selection criteria and classification of participants should be clearly described. Were they laypeople, tourists, experts, or local residents? How were they recruited, and was any demographic balancing considered?)
Response 14: Thank you for highlighting this. The description of how we selected participants and conducted interviews has now been clarified (Methods/2.2.5. Interviews, line 169), see below. The demographics of participants had been reported in Section 3.5, line 338 and we hope it is adequate. Note also that Supplementary materials 5 lists details about each of the participants.
“We developed a method for interviews with visitors to understand their experiences, memories, and opinions about WTS (Sprague et al. 2024; Flick 2000). Through convenience sampling, members of the team individually approached all kinds of visitors, inside or in the immediate vicinity of the temple (see Figure 4). After consenting to being interviewed and audio recorded, a conversation ensued, where the interviewer followed a semi-structured format to cover four thematic areas, namely: about the interviewee themselves, their expectations, their thoughts and feelings about sounds and soundscape, and about smells and smellscape (see Supplementary Materials 4). The conversation was always held in a language that was comfortable for the interviewee. Participants could opt out of the interview at any time, and they were not reimbursed. The interviews lasted typically 4-6 minutes in duration. Recordings were post-processed for clarity and transcribed. By approaching different stakeholders (e.g., local residents/visitors, tourists, worshippers, staff members, merchants, etc.), our aim was to understand a range of perspectives regarding the sounds and smells of WTS in particular, and of Chinese temples in general.”
Comment 15: (Line 174: The rationale for the interview topics remains unclear. It is suggested to supplement relevant theoretical foundations or references.)
Response 15: The design of semi-structured interviews in four themes was made with the specifics of the chosen site in mind and limitations such as the available time and resources. The method was based on our understanding of handbook recommendations (in particular, Bauer & Gaskell 2000), some of which we have applied in our own previous studies (e.g. Lindborg & Friberg 2015; Yue, Lindborg et al 2025). See also the response above.
Comment 16: (Lines 187–190: Since data collection occurred mainly during weekday afternoons in winter, the temporal and seasonal representativeness is limited. This limitation should be explicitly acknowledged and discussed in the discussion section.)
Response 16: Thank you for pointing this out. Your comment here, and comments from the other reviewers, have prompted us to add a section, ‘Limitations of the study’, at the end of the Discussion. It reads as follows:
“The last point highlights the fact that the present study has important limitations. We acknowledge that the exclusion of video recordings from the analysis in the present study leaves many important aspects unexplored. Even when the focus is on hearing and smelling, sight will most certainly play a role in people’s appreciation of a complex multimodal environment. This limitation should be considered when interpreting the results. The rich audiovisual materials that we have captured at Wong Tai Sin, but not yet had time to explore, might form the basis for a future study.
Another limitation to consider concerns the ‘snapshot’ character of the study. The team visited the site on several occasions during the winter months outside of religious festivals. The weather was mild and dry on the days we collected field data. Given that the weather in Hong Kong varies quite dramatically over the year, for example with hot and rainy months in the summer, the changing conditions are likely to affect both soundscape and smellscape strongly. Another seasonal matter to consider is religious festivals, which give rise to increased activity and more visitors. In order to form a full picture, data collection should be repeated several times over a longer period. We hope that the present study might provide indications of which tools and methods would be the most effective to employ in a larger and potentially more costly undertaking.
Finally, we must recognise that the field data that we were able to collect are only from the outdoors areas of WTS, while much of the religious activities are performed inside the temple buildings. At the outset, Sik Sik Yuen gave us permission to record audio and video only outdoors, and this restriction shaped the study design. Only towards the very end of the process, after gaining trust from representatives of the management, were we granted permission to record inside the Main Hall. Alas, this material has had to be set aside for future work.”
Comment 17: (Line 205: The study does not conduct a quantitative correlation analysis between the acoustic measurement data (e.g., SPL values) and the interview transcripts. Moreover, there appears to be a disconnection between the acoustic measurements and the interviews in terms of time, space, and data integration. It is recommended to clarify how the acoustic data could be associated with the subjective interview results or explain why such integration was not established.)
Response 17: This is an intriguing observation; thank you for pointing it out. It might be hard to conduct a quantitative analysis between these sets of data, partly because one is quantifiable (with well-defined variables) and the other essentially qualitative (with themes and topics emerging through the analytical process rather than ‘variables’). Also, the two datasets were collected at slightly different locations (though all within the same site) and times (though on the same days). Their differing sizes might present a technical hurdle. In place of a ‘summative quantitative integration’, the Discussion serves to bring the quantitative and qualitative perspectives together.
However, your comment has stimulated another and more fundamental analysis, which we had initially not considered. This is a canonical correlation analysis, now completed and included in the revised manuscript. It is a concluding step of the analysis of field data (so, not the interviews). It is inserted at the end of the section ‘Analysis and Results from field data’, and reads as follows:
“We turned to Canonical Correlation Analysis (CCA; Thompson 2000; Lindborg & Liew 2021) to explore the degrees of overall association between the three different kinds or modalities of data that the analysis had yielded so far, i.e. acoustic measurements, soundscape ratings, and smellscape ratings.
Like the exploratory factor analysis (EFA) employed in the previous step, CCA is a multivariate dimensionality reduction technique to identify linear combinations of observed variables that represent underlying latent variables (factors or canonical variates). While EFA examines relationships within a single set of variables, CCA compares two sets that may have differing numbers of variables, and finds the projection that accounts for most of the covariance between the two. The set of canonical correlations indicates the strength of association between two sets of variables. The first canonical correlation captures the strongest linear relationship between the two variable sets and is usually the most important.
The present dataset had 197 rows corresponding to the data collection locations at WTS. Acoustic measurements had two variables (LAeq and LCeq), for soundscape ratings we included six variables (Pleasantness, Eventfulness, and four Sound types: Human, Nature, Traffic, Other), and for smellscape four variables (Pleasant and the three latent factors: Off-putting/chemical, Incense, Grassy). The analyses were conducted with the cancor function in R (R Core Team 2025).
In the first CCA, we compared acoustic measurements with soundscape ratings. The first canonical correlation was 0.62, pointing to an overall strong relationship. In the pattern of covariances (between the variables of each matrix and the first canonical variate), it was found that loadings were high onto LAeq (0.73) and LCeq (0.95), and at the same time high onto Traffic (0.86) and medium negatively onto Nature (-0.46). This indicates that higher noise levels were strongly associated with more Traffic sounds and to some extent also with fewer Nature sounds.
In the next case, we compared acoustic measurements with smellscape ratings. The first canonical correlation was 0.39, indicating a weak relationship. In the pattern of covariances, the loading was high onto LAeq (0.95) and at the same time high negatively onto Grassy (-0.73). This indicates that lower noise levels were found at locations with more Grassy smells, but note that the overall association was weak.
Lastly, we compared soundscape and smellscape ratings. The first canonical correlation was 0.64, indicating a strong overall relationship with an interesting pattern of covariances. Loadings were high negatively for Traffic (-0.86) and Other sounds (-0.55, mostly music), and at the same time high negatively for Off-putting/chemical smells (-0.89) and medium for Pleasant smells (0.48). The signs for these loadings indicate that locations with more Traffic and Other sounds also had more Off-putting/chemical smells that were less Pleasant.”
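For readers less familiar with CCA, the quantity reported in the passage above can be written out as follows. This is the standard textbook formulation, not a transcription of the authors' R code. The symbols X, Y, a, and b are introduced here for illustration, and reading the reported 'loadings' as correlations between each variable and the first canonical variate is an assumption based on the wording of the passage.

```latex
% X (n x p) and Y (n x q): column-centred data matrices, here n = 197.
% \Sigma_{XX}, \Sigma_{YY}, \Sigma_{XY}: sample covariance blocks.
\[
\rho_1 \;=\; \max_{a \in \mathbb{R}^{p},\; b \in \mathbb{R}^{q}}
\operatorname{corr}(Xa,\, Yb)
\;=\; \max_{a,\,b}\;
\frac{a^{\top}\Sigma_{XY}\,b}
     {\sqrt{a^{\top}\Sigma_{XX}\,a}\;\sqrt{b^{\top}\Sigma_{YY}\,b}}
\]
% First pair of canonical variates and the (assumed) loadings.
\[
u_1 = X a_1, \qquad v_1 = Y b_1, \qquad
\ell^{X}_{j} = \operatorname{corr}(x_j,\, u_1), \qquad
\ell^{Y}_{k} = \operatorname{corr}(y_k,\, v_1)
\]
```

In the first comparison, p = 2 (LAeq, LCeq) and q = 6 (the soundscape variables), so at most two canonical correlations exist, and 0.62 is the first of them.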
Comment 18: (Line 207-210: The authors chose to fill in "NA" entries in the soundscape and smellscape ratings using the lowest value. This approach may introduce systematic bias. It is recommended that the authors consider using more neutral imputation methods, such as K-Nearest Neighbors (KNN) imputation or Multiple Imputation, for comparative validation. Alternatively, a sensitivity analysis could be conducted to determine whether different imputation strategies significantly affect the soundscape or smellscape evaluation metrics and subsequent statistical results, thereby enhancing the robustness of the model.)
Response 18: In fact, our use of the expression ‘missing data’ was unclear. Because of the design of the web interface for ratings, ‘none detected’ and ‘NA’ entries were not handled separately, and this created uncertainty for the analysis. However, we believe it is a relatively minor flaw, and would like to explain and justify the imputation methods (line 205–) as follows:
“For 18 locations, an SPL meter was not available but smellscape and soundscape ratings were made. Together with one technical dropout, these missing data (9.6% of LA and LC measurements) were imputed with median values across the site (since dB is a logarithmic scale). In ratings of the soundscape, usage of the web interface may have caused some of the missing data. Annotators occasionally used the NA option (i.e. ‘no response’) in the sense of ‘Do not hear [this sound type] at all’. Missing data for Sound type (9.0% of ratings on five scales) were imputed by means. The comparatively few missing ratings for Pleasantness and Eventfulness (5.4%, on eight semantic scales) were also imputed by means. In smellscape ratings the imprecision in our web interface design may have caused most of the missing data (12.8% of ratings on 24 Zarzo scales). The relatively high percentage was likely due to some team members using the NA option (‘no response’) as a shorthand for ‘[this smell is] Not present’. The inconsistency in usage within the team was discovered about halfway into the data collection, but not seen as a problem. Therefore, these data points have been imputed by 0 (zero) in the analysis.”
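The three imputation rules described in the passage above (median for SPL readings, mean for soundscape ratings, zero for smellscape ratings) can be sketched in a few lines of Python. This is a minimal illustration only: the function names `impute` and `missing_rate` are hypothetical, the numbers are toy values rather than data from the study, and the authors' own analysis was carried out in R.

```python
from statistics import mean, median


def missing_rate(values):
    """Return the share of entries that are missing (None)."""
    return sum(v is None for v in values) / len(values)


def impute(values, strategy):
    """Replace None entries using the 'median', 'mean', or 'zero' rule."""
    observed = [v for v in values if v is not None]
    if strategy == "median":
        fill = median(observed)
    elif strategy == "mean":
        fill = mean(observed)
    elif strategy == "zero":
        fill = 0
    else:
        raise ValueError(f"unknown strategy: {strategy}")
    return [fill if v is None else v for v in values]


# Toy values for illustration only (not data from the study).
laeq = [62.0, None, 58.5, 71.0, 60.0]   # SPL readings in dB
human_sound = [3, None, 1, 4]           # a Sound type rating scale
sulfidic = [0, None, 2, None]           # a Zarzo smell scale, NA read as 'Not present'

laeq_filled = impute(laeq, "median")          # site-wide median for SPL
human_filled = impute(human_sound, "mean")    # mean for soundscape ratings
sulfidic_filled = impute(sulfidic, "zero")    # zero for smellscape ratings
```

Re-running the downstream statistics with a different `strategy` for the same variable would be one simple way to carry out the sensitivity check that the reviewer suggests.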
Comment 19: (Lines 212-214: The use of "iTestMic2 and Type 2 sound level meters" is mentioned. It is recommended to supplement the equipment consistency test results to ensure data reliability.)
Response 19: To avoid redundancy and potential confusion, the information about SPL meter calibration is now only given in the Methods/Acoustic measurements section, and the section reporting results is tidied up for clarity.
Comment 20: (Figure 8 is missing axis labels.)
Response 20: As these are histograms, the height of the bars represents counts or relative frequency, and it is not strictly necessary to give a unit for the y-axis. We believe leaving it out actually gives the histogram a cleaner visual appearance. If the journal’s standard is otherwise, we can add it.
Comment 21: (Line255-261 Regarding the non-parametric test, the specific differences between groups are not explained. It is advisable to add detailed results of post-hoc tests.)
Response 21: Thank you for pointing this out. The differences between groupings found to be significant in post-hoc tests, as expressed in step units on the Likert scales, have been added, both in Section 3.3.2 line 254– and in Section 3.5.2 line 513–.
Comment 22: (Line 304: The author uses 24 Zarzo indicators for factor analysis but does not explain why these specific 24 items were chosen or how their validity was verified. It is recommended to provide the rationale for this methodological choice in the methods section or literature review.)
Response 22: Thank you for highlighting the importance of the ‘Zarzo scales’. The revision gives a fuller description of how they came about and their limitations in the context of a field case study. The following paragraph is added (inserted at line 150):
“Humans have a physiological capacity to detect and distinguish a vast number of smells. However, people's capacity to detect, identify, and describe smells (and smell mixtures) is relatively limited (Keller & Vosshall 2004). Dravnieks and collaborators (1978, 1984) arrived at a large ‘Smell Atlas’ giving the hedonic value (pleasantness) of odorous stimuli, including food items (such as lemon, dill, fried chicken, sour milk), non-food items (such as paint, cat urine, leather, tar), and also adjectives not specific to source (such as warm, stale, heavy). Zarzo (2021) re-analysed the atlas by having a large number of subjects rate 160 stimuli using a set of 146 descriptors. The high-dimensionality dataset was subjected to a series of principal component analyses seeking to categorise smell compounds in a reasonable number of classes. They arrived at a parsimonious model with 24 categories, and it forms the basis of our field ratings of environmental smells. The scales in this operational protocol are of three kinds. Firstly, Pleasant is a single scale to estimate the general pleasantness of the smellscape. Secondly, a set of 10 scales cover non-food smell sources: Floral, Musk, Woody, Camphoraceous, Chemical solvent, Burnt, Sulfidic, Animal, Sickening, and Fetid decay. Thirdly, 13 scales cover food-related smell sources: Fruity, Citrus, Spicy, Balsamic.vanilla, Balsamic.caramel, Herbaceous, Green, Buttery, Nutty, Cooked.meat, Fatty, Fishy, and Sour. The presence of each smell is rated on a 9-step Likert scale (0 = Not present, 2 = A little, 4 = Some, 6 = A lot, 8 = Extremely much). Given the constraints of how the 24-category model was constructed, the ‘Zarzo scales’ that we have employed here are not expected to provide a detailed description of complex smellscapes in general. It is likely that the characteristics of any specific site may warrant a narrower or expanded set of descriptors and rating scales, which may be determined before data collection (e.g. based on previous studies) or afterwards (e.g. based on further dimensionality reduction).”
Comment 23: (Line 319:The interpretation of latent variables such as “Sour” and “Incense” in the factor analysis lacks physiological or cultural justification (e.g., why “sulfidic smell” and “sourness” are grouped into the same category).)
Response 23: Thank you for highlighting the need to introduce well-justified semantic labels, since they reflect the interpretation made and might influence the way results are understood by readers. Considering previous research, including by Zarzo, who introduced the underlying terms (from Dravnieks and others), we have changed the name of the first latent factor to be somewhat more neutral in character. For the second factor, however, we believe the label we originally gave it can be justified. Mentions of the renamed factor have been updated elsewhere in the revision. The revised paragraph (in section 3.4) now reads:
“The first latent factor explained 20.8% of the total variance and was mainly defined by ratings on four rating scales: Sulfidic (loading = 0.93), Sour (0.74), Chemical (0.68), and Sickening (0.45). As discussed by Zarzo (2021), these descriptors have in common that they relate to unpleasant, pungent, or irritating sensory qualities, and most likely unpleasant odours that provoke discomfort. Therefore, the latent factor was labelled Off-putting/chemical. The second latent factor, explaining 16.8%, was defined by Burnt (0.79), Woody (0.78), and Musk (0.45). These descriptors are generally related to warm, natural, and smoky aromas. In the present context they were clearly reflective of the practice of burning incense, and therefore we labelled this latent factor Incense.”
Comment 24: (Line 337: It is suggested to elaborate on the sampling strategy for interviewees (e.g., whether random sampling, stratified sampling, etc. was used), and to provide additional sociological characteristics of the participants (such as gender, age, education level, visit frequency, etc.) to enhance the representativeness and credibility of the sample.)
Response 24: We agree that this was not clear, and the matter has been addressed in the revision. Please see Comment/Response 14 for details.
Comment 25: (Line 378: The terms “BERtopic” and “BERTopic” appear in different capitalizations throughout the text. It is recommended to standardize the formatting of this term for consistency across the manuscript).
Response 25: Thank you for spotting this typo, which is now corrected.
Comment 26: (Line 571: For Figure 20, it is recommended to provide a clearer explanation, especially regarding how the strength of associations between different topics was calculated, along with the specific quantitative results.)
Response 26: Thank you for pointing this out. We have clarified the explanation in a new Section with the heading ‘Association between topics’. We have also introduced a new table to clearly list values for all the associations, which are discussed with examples in the section that follows. The new passage in the revised manuscript is as follows:
“Finally, we estimated the association between each of the two main topics, Soundscape and Smellscape, and each of the 12 topics emerging from the interpretative content analysis. Two values were produced to express the association, namely mean pleasantness and word count. To recall, Pleasantness was estimated in the interpretative sentiment analysis described above. Word count was the total number of lemmatized words shared between the main and the emergent topic. The values are given in Table 4. See also Figure 19 for boxplots of the distributions of pleasantness, and Supplementary materials 6 for details.
Table 4. Associations between Soundscape, Smellscape, and twelve emergent topics. Values outside parentheses represent mean pleasantness across statements that make the association (range: -2…+2), and values within parentheses represent the total number of words that the associated topics share.”
Comment 27: (Line 575:For example, while the paper highlights the significant connections between smellscape, religion, and emotion, it does not further explore the cultural and psychological mechanisms that explain why certain smells (such as incense) are closely associated with emotional and religious experiences.)
Response 27: This is true. However, we feel that a full treatment of these important questions might take the present study too far, and in a potentially distracting direction. A deeper exploration of these mechanisms, and especially the role of memory, is envisaged for a future study (for which we have already conducted a pilot interview study). Meanwhile, the revised Introduction provides a framing of the matter, via the following paragraph:
“Throughout our lives, multi-sensing is the normal condition, and therefore lived experiences are encoded in multisensory memory. There is ample evidence that smells can trigger episodic memory recall (Herz 2011), and non-olfactory stimuli can trigger smell memories, true or not (Keller & Vosshall 2004). If not lived and true as memories, they might be projections of imagined smells (Young 2020; Lindborg & Liew 2021). Both smells and sounds can facilitate spatial recall through activities such as visualisation by sketching, or physicalisation through clay-making, thus bringing a sense of tangibility to recalled memories (Xiao & Lindborg 2025).”
Comment 28: (Lines 743-746 The conclusion is vague and does not specifically correspond to the research objectives.)
Response 28: Thank you for pointing this out. As mentioned above, the research questions are now clearly stated (please see Comment/Response 4 for the details). Each question is retraced in the Discussion section, either at the top, or indeed in the Conclusion, which has been revised and extended.
Comment 29: (Lines 752-756 The contributions of the research lack differentiation from existing studies, and the discussion on limitations is one-sided, failing to cover the limitations of the research method itself.)
Response 29: Thank you for pointing this out. We recognise and discuss the main limitations of the study at the end of the Discussion. See Comment/Response 16 for details.
Comment 30: (Full text:It is recommended to use smellscape consistently throughout the paper, except when referring specifically to objective environmental factors, in which case olfactory environment may be used.)
Response 30: Thank you for pointing this out. We have followed this recommendation in the revision.
Author Response File:
Author Response.pdf
Round 2
Reviewer 3 Report
Comments and Suggestions for Authors
The quality of the manuscript is good after this revision.

