Article

Nonlinear Perceptual Thresholds and Trade-Offs of Visual Environment in Historic Districts: Evidence from Street View Images in Shanghai

1
College of Communication and Art Design, University of Shanghai for Science and Technology, Shanghai 200093, China
2
Landscape Planning Laboratory, Graduate School of Horticulture, Chiba University, Matsudo Campus, B-Building, 648, Matsudo 271-8510, Japan
*
Author to whom correspondence should be addressed.
Sustainability 2025, 17(24), 11075; https://doi.org/10.3390/su172411075
Submission received: 13 November 2025 / Revised: 1 December 2025 / Accepted: 7 December 2025 / Published: 10 December 2025
(This article belongs to the Section Tourism, Culture, and Heritage)

Abstract

Historic districts, as important spatial units that carry urban cultural memory and everyday social life, play a crucial role in shaping residents’ spatial identity, emotional attachment, and perceptual experience. Although quantitative research on built environments and perception has advanced considerably in recent years, the mechanisms through which perception is formed in historic districts, particularly the nonlinear threshold effects and perceptual trade-off patterns that arise under conditions of high-density and mixed land use, remain insufficiently examined. To address this gap, this study develops an analytical framework that integrates spatial attributes with multidimensional subjective perceptions. Focusing on six historic districts in central Shanghai, the study combines micro-scale environmental indicators extracted from street-view imagery, POI data, and public perceptual evaluations and employs an XGBoost model to identify the nonlinear response patterns, threshold effects, and perceptual trade-offs across seven perceptual dimensions. The results show that natural elements such as visual greenery and sky openness generate significant threshold-based enhancement effects: once they reach a certain level of visibility, they substantially increase positive perceptions including beauty, safety, and cleanliness. By contrast, commercial and traffic-related facilities exhibit dual and competing perceptual influences. Moderate densities enhance liveliness, whereas high concentrations tend to induce perceptual fatigue and intensify negative emotional responses. Overall, perceptual quality in historic districts does not arise from linear accumulation but is shaped by dynamic perceptual trade-offs among natural features, functional elements, and cultural symbolism. The study thus reveals the coupling mechanism between spatial renewal and perceptual experience amid the pressures of urban modernization.
It also demonstrates that increasing visible greenery (e.g., planting street trees, incorporating micro-green spaces, improving façade greening), enhancing street openness (e.g., optimizing view corridors, reducing visual obstruction, implementing moderate setback adjustments), guiding a moderate mix and spatial distribution of commercial and service functions, and strengthening the perceptibility of cultural landscape elements (e.g., façade restoration, streetscape coordination, and improved signage systems) are concrete and effective planning and design actions for improving landscape quality and enhancing the experiential quality of historic districts.

1. Introduction

1.1. Background

Urban perception refers to the subjective psychological experience individuals form within urban spaces, encompassing multiple dimensions such as aesthetic evaluation, spatial cognition, and emotional responses [1,2]. Serving as a critical link between the built environment and human behavior, it not only influences individuals’ understanding and preference formation regarding urban spaces but also implicitly shapes behavioral decisions and patterns of social interaction [3,4].
Within the complex context of urban life, perceptual mechanisms provide a fundamental basis for understanding the interaction between humans and their surrounding environments. Existing research has demonstrated that the physical appearance of cities and the resulting perceptual experiences have significant impacts on residents’ mental health, behavioral patterns, and overall well-being [5]. Positive perceptions foster community attachment and social participation, whereas negative perceptions may lead to psychological stress and social isolation [6,7,8,9]. Therefore, systematically measuring urban perception not only contributes to understanding the semantic structure and spatial heterogeneity of the environment but also offers a novel perspective for explaining the psychological foundations and social consequences of human activities [10].
With advances in technology and data analytics, research on urban perception has undergone a methodological shift—from qualitative interviews to visual and quantitative analyses. Scholars have increasingly employed image recognition, sensor-based data, and machine learning techniques to explore the complex relationships between the built environment and human perception from broader and more multidimensional perspectives. This methodological transformation has given rise to perceptual urbanism, a paradigm in which cities are no longer understood merely as physical spaces, but as dynamic perceptual systems [7].
However, existing studies still exhibit notable disciplinary limitations. Most research has focused on traditional historic towns or general urban environments, while systematic and in-depth investigations of historic districts within modern metropolitan contexts remain limited. Compared with traditional ancient cities, historic districts in modern metropolitan contexts often possess two types of attributes. On the one hand, they carry the aesthetic value of cultural heritage, including architectural character, material texture, and cultural symbolism. On the other hand, they also bear the functional pressures of metropolitan operations, including transport circulation, commercial activities, and everyday residential needs [11]. This dual attribute creates a core tension that has become increasingly prominent. The challenge lies in how to preserve the aesthetic integrity of cultural heritage while simultaneously meeting the high-density and highly functional demands of contemporary cities. Different districts display strong heterogeneity in visual characteristics, functional configurations, and social activity rhythms, and these differences form a critical contextual basis for how residents develop spatial preferences and experiential judgments. However, existing studies often focus on a single area or a specific street segment, making it difficult to reveal how cross-district variations influence perceptual mechanisms.
Building on this foundation, the study adopts perceptual balance as the core of its analytical framework. Perceptual balance focuses on how individuals integrate and weigh heritage-oriented aesthetic features such as façade continuity, historical texture, and green landscapes together with function-oriented environmental cues such as commercial density, service facilities, and traffic organization. By examining how people reconcile heritage aesthetics and urban functionality at both cognitive and emotional levels, the study identifies the spatial cues that shape this balance and the differentiated mechanisms through which they operate, thereby laying the foundation for subsequent theoretical and methodological exploration.

1.2. Literature Review

1.2.1. Definition and Significance of Historic Districts

Historic districts are generally defined as contiguous urban areas that preserve substantial historical remnants and authentically reflect the traditional character and local identity of a specific historical period [12]. Within these spaces, tangible heritage—such as street layouts, courtyard typologies, and architectural façades—intertwines with intangible elements, including traditional craftsmanship and local rituals, to form a multilayered cultural landscape. Such districts not only document the process of urban evolution but also embody residents’ collective memory and sense of identity. However, under the pressures of rapid urbanization, capital-driven redevelopment has exposed many historic districts to the dual risks of cultural heritage loss and landscape degradation. Renovations that lack contextual sensitivity, or that promote excessive commercialization, often undermine spatial quality and contribute to community hollowing, thereby weakening both the cultural continuity and everyday vitality of these areas [13,14].
Against this backdrop, an emerging line of research has begun to integrate living heritage renewal with public perception. Zheng et al. (2024) proposed the concept of progressive urban regeneration, which introduces slow-mobility systems and public activity spaces through carefully controlled spatial scales and context-sensitive design [15]. This approach aims to maintain livability and openness while preserving the traditional urban fabric. Meanwhile, the application of street-view image analysis and deep learning techniques has further advanced the quantitative assessment of urban perception. Recent findings indicate that the proportion of historical façades, the perceived sense of historical character, and the richness of hardscape elements significantly influence the intensity of public cultural perception, offering quantitative evidence to inform façade restoration and landscape interventions [14]. X. Gao et al. (2025) further emphasized that enhancing visual comfort requires the careful integration of natural elements and cultural symbols within high-density built environments, in order to mitigate perceptual stress and strengthen psychological identity [16]. Similarly, M. Li et al. (2021) combined sensor data with the Google Earth Engine modeling framework to reveal the nonlinear mechanisms through which building height, street connectivity, functional diversity, and human-scale landscape characteristics influence neighborhood vitality [17]. Their results also highlighted a pronounced winter effect, suggesting that design strategies should consider seasonal variations and incorporate microclimatic regulation.
Different types of historic districts exhibit distinct requirements for morphological conservation and functional renewal. A study of Suzhou’s ancient city revealed that street connectivity, visual openness, and functional diversity jointly shape the overall quality of spatial perception. Building on these findings, a hierarchical renewal framework—organized around streets, courtyards, and canals—has been proposed to reinforce the role of public experience in achieving organic heritage conservation [18]. In contrast, research on Beijing Road in Guangzhou demonstrated that maintaining a balance among authenticity, theatricality, and legitimacy enhances commercial vitality while preserving cultural value, offering valuable insights for the moderate renovation of high-footfall historic commercial streets [19].
Although empirical studies on traditional historic cities and commercial districts have become increasingly abundant, systematic investigations into small-scale and functionally complex “interstitial” historic districts within modern metropolises remain limited. These districts not only sustain the continuity of traditional landscapes but also accommodate diverse urban functions such as offices, residences, and transportation interchanges. Their internal spatial structures often involve pluralistic property ownership and ongoing processes of residential turnover, while progressively integrating digital governance systems and emerging economic activities. Nevertheless, current research remains insufficient in addressing the coordination among cultural preservation, economic utility, and social equity. Most multi-scale assessments are still confined to single street segments or individual buildings, making it difficult to simultaneously reconcile the interrelationships among heritage value, spatial vitality, and urban infrastructure networks.

1.2.2. Mechanisms Linking the Built Environment and Human Perception

Individual perception of the environment is a core component in understanding human–environment relationships and evaluating spatial coherence, making it a central concern in urban design and spatial planning [20]. Since Lynch (1964) proposed the “five elements” framework of urban imageability, scholars have increasingly recognized that urban space is not merely a passive physical container, but a contextual system that can be perceived, remembered, and reinterpreted [21]. Research in environmental psychology has further revealed that human cognition, emotion, and behavior are closely interrelated within spatial contexts. Environmental characteristics influence psychological states not only through their physical form but also by evoking emotional resonance, which shapes individuals’ subjective experiences of place [22].
At the cognitive level, researchers have investigated how the spatial structure of the built environment influences perceptual quality. S. Li et al. (2022) found through street-space analysis that the height-to-width ratio and the diversity of points of interest jointly determine spatial legibility and recognizability, whereas excessively elongated streets weaken human-scale perception and reduce walking intention [23]. Iamtrakul et al. (2023) further argued that transportation accessibility and street design not only affect commuting efficiency but also shape residents’ subjective evaluations of safety and convenience, thereby influencing overall life satisfaction [24]. Complementarily, Q. Chen et al. (2022) revealed the systemic effects of environmental structure, showing that street connectivity, functional diversity, and green coverage collectively shape residential experience and exhibit pronounced spatial heterogeneity [25]. Collectively, these studies indicate that the built environment influences individuals’ cognitive processing of spatial quality through multiple physical dimensions.
At the preference level, environmental factors shape individuals’ emotional experiences and behavioral choices through affective pathways. Safety and pleasantness are key psychological variables in travel decision-making, jointly influenced by factors such as lighting, greenery, and traffic density. For example, cyclists tend to choose routes that are shorter, flatter, less congested, and well-lit, with some showing a preference for streets featuring mixed land use or abundant vegetation [26]. Herrmann-Lunecke et al. (2021) found in their walking study that vegetation, open spaces, and continuous sidewalks evoke positive emotions, whereas noise and surface deterioration induce anxiety and avoidance [27]. Findings from Gao et al. (2024) in high-density urban contexts present a more nuanced picture [22]. Concentrated functional clusters and compact street networks can enhance perceived vitality, yet may simultaneously induce sensory fatigue. These findings suggest that emotional responses exhibit nonlinear characteristics, and that different types of spatial stimuli may produce opposite outcomes under varying contextual conditions.
Collectively, these studies demonstrate that the relationship between the built environment and human perception is not unidirectional, but rather constitutes a dynamic system mutually coupled across cognitive, emotional, and behavioral dimensions. Urban space, therefore, should no longer be viewed as a passive backdrop, but as an active medium that shapes human experiences and social interactions. For historic districts, this understanding holds critical implications. Previous regeneration practices have often prioritized morphological restoration and commercial revitalization, while neglecting the subjective experiences of residents and visitors [28]. However, the perceptual quality of public spaces directly influences users’ behaviors and evaluations, as modifications to landscape features can significantly affect visitors’ dwelling patterns and psychological responses [13]. Hence, perception should not be regarded as an auxiliary dimension, but rather as a key pathway through which historic district conservation and renewal contribute to cultural resilience and social sustainability.

1.2.3. Evolution of Street-View Imagery and Perception Analysis Methods

The widespread availability of street-view imagery has introduced a novel perspective for studying urban perception. With the advancement of geolocation technologies and mapping services, geographically tagged street-view images have increasingly become a vital data source for exploring the relationship between the built environment and human perception [7,29]. Compared to traditional field surveys, such data offer distinct advantages in cost efficiency, spatial coverage, and visualization, enabling low-cost representation of urban spatial structures and visual characteristics [30]. Consequently, street-view imagery is increasingly serving as a fundamental data source for urban perception research, enabling the large-scale and continuous acquisition and analysis of environmental features.
However, data-driven approaches are not without limitations. While machine learning models have significantly improved analytical efficiency and comparability, perception—as a complex psychological experience—cannot be fully captured through quantification [31]. X. Li et al. (2015) noted that crowdsourced ratings are prone to subjective bias and semantic inconsistencies, particularly in cross-cultural contexts [32]. Similarly, Lou et al. (2024) emphasized that while models can effectively identify visual differences, they often fail to capture genuine emotions and lived experiences [31]. At the same time, the “black-box” nature of deep learning further constrains interpretability. While these models can efficiently extract image features, they often struggle to explain which specific visual elements actually drive perceptual differences, thereby posing challenges for translating research findings into planning and design practice. Striking a balance between model performance and interpretability has thus become a central issue in computational perception research. More critically, the participant composition of the Place Pulse 2.0 dataset exhibits a pronounced geographic bias, with most evaluators concentrated in the United States, India, the United Kingdom, Brazil, and Canada [33]. As a result, their perceptual standards align more closely with Western urban contexts, limiting the applicability and cultural interpretability of the findings in Asian settings.
In summary, the integration of street-view imagery and machine learning technologies has significantly expanded the scope of urban perception research, enabling quantitative analysis and visual representation of environmental features. At the same time, concerns regarding data representativeness and cultural adaptability underscore the need for a more critical and context-sensitive understanding of visual perception. Future studies should aim to strike a balance between algorithmic modeling and contextual interpretation by incorporating regional cultural differences and local perceptual characteristics, thereby establishing evaluation frameworks that are both generalizable and interpretatively robust.

1.3. Research Objectives and Questions

Urban historic districts have long faced the dual challenges of preserving cultural traditions while accommodating contemporary functional demands amid ongoing modernization. These districts are expected to maintain the authenticity of their historical character while engaging in moderate commercial adaptation to support socio-economic development. However, there remains no clear consensus on how to strike an appropriate balance between the traditional urban fabric and modern requirements. Although previous studies have examined environmental perception in historic districts from both macro-level perspectives (e.g., urban planning and policy) and micro-level aspects (e.g., street interfaces and architectural characteristics), limited attention has been paid to the perceptual mechanisms of historic districts within contemporary urban contexts. This study aims to integrate architectural functions and micro-environmental features to analyze the interactive mechanisms between the built environment and human perception. Accordingly, it addresses the following research questions:
  • How do built-environment elements in historic districts influence pedestrians’ perceptions of these areas?
  • How do different functional attributes and environmental features exhibit nonlinear and threshold effects across various perceptual dimensions?
  • Within a given perceptual dimension, which environmental and functional attributes play critical roles, and how can optimizing macro-level functional configuration and micro-level spatial quality enhance pedestrians’ positive perceptions and willingness to engage with the space?

2. Methods

2.1. Study Area

As China’s economic and cultural center—as well as a leading global metropolis—Shanghai has developed a multilayered spatial structure and a distinctive historical character throughout its modern urbanization process. The historic districts embedded within the city’s urban fabric not only document the social life and architectural forms of various historical periods but also bear witness to Shanghai’s transformation from a traditional port city into an international metropolis [11]. These areas serve not only as carriers of tangible heritage, but also as vital sources of collective memory and local identity for residents.
However, in the context of rapid urbanization and spatial restructuring, Shanghai’s historic districts face multiple challenges. Some areas are experiencing infrastructure deterioration, demographic shifts, and a weakening sense of cultural identity—factors that contribute to the gradual erosion of the social functions and symbolic significance of traditional spaces. Balancing the preservation of historical patterns with the demands of contemporary urban development has thus become a central issue in urban governance and design practice.
Based on archival research and field investigations, this study selected six historic districts within Shanghai’s inner ring—each retaining characteristic historical features—as the research objects (Figure 1). These districts exhibit significant differences in spatial morphology, architectural style, and functional composition, encompassing both high-density commercial zones and residential or culture-oriented living areas (Appendix A). By conducting cross-sectional comparisons of built-environment characteristics and perceptual differences across these various district types, the study aims to provide a more comprehensive understanding of the spatial expressions and perceptual mechanisms of historic districts in the context of a modern metropolis.

2.2. Analytical Framework

This study establishes an integrated analytical framework that combines spatial data, street-view imagery, and subjective perception evaluations to identify the mechanisms through which built-environment elements influence perceptual differences in historic districts (Figure 2). The framework follows a three-tier logic: spatial, visual, and perceptual. First, the study delineates spatial boundaries and collects objective environmental data; second, visual features are extracted from street-view images using deep learning models; and finally, subjective perceptual responses are quantified through questionnaire-based evaluations. This approach enables the objective environmental characteristics to be related to the subjective perception outcomes.
First, the boundaries of the six selected historic districts in Shanghai were determined using data from the China Historical Architecture Conservation Network (http://www.aibaohu.com/). Subsequently, the street network for each district was extracted from OpenStreetMap (www.openstreetmap.org), and a grid of sampling points was generated in ArcGIS Pro 3.4 at 50-m intervals to ensure spatial uniformity and representativeness. Based on these sampling points, the Baidu Map API (https://lbsyun.baidu.com/) was employed to collect information on POIs and corresponding street-view images within the study area, thereby constructing the foundational dataset for analysis.
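The 50-m sampling step can be illustrated with a minimal sketch. It assumes that sampling points are placed along street centrelines and that coordinates are already in a projected, metre-based system; the function name and the example street are hypothetical and do not reproduce the ArcGIS Pro workflow used in the study.

```python
import math

def sample_along_polyline(vertices, spacing=50.0):
    """Return points every `spacing` metres along a polyline.

    `vertices` is a list of (x, y) tuples in a projected, metre-based CRS.
    The start vertex is always included.
    """
    if not vertices:
        return []
    points = [vertices[0]]
    next_at = spacing   # distance along the line of the next sample
    travelled = 0.0     # total length of the segments walked so far
    for (x0, y0), (x1, y1) in zip(vertices, vertices[1:]):
        seg_len = math.hypot(x1 - x0, y1 - y0)
        while seg_len > 0 and next_at <= travelled + seg_len + 1e-9:
            t = (next_at - travelled) / seg_len  # fraction along this segment
            points.append((x0 + t * (x1 - x0), y0 + t * (y1 - y0)))
            next_at += spacing
        travelled += seg_len
    return points

# Hypothetical L-shaped street, 210 m long in total.
street = [(0.0, 0.0), (120.0, 0.0), (120.0, 90.0)]
samples = sample_along_polyline(street)
```

For this example the sketch yields five points: the start vertex plus samples at 50, 100, 150, and 200 m along the line. Each point would then serve as a query location for street-view images and POIs.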
Second, the street-view images were processed using the PSPNet semantic segmentation model to identify and quantify 150 categories of micro-environmental elements. These elements include buildings, vegetation, roads, advertisements, vehicles, pedestrians, and others. The proportion of pixels corresponding to each category was used to describe the visual composition of the streetscape, serving as input variables for subsequent perception modeling.
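The pixel-proportion indicator described above can be sketched as follows. The example assumes the segmentation output is available as a 2-D grid of integer class ids; the four class names and the tiny mask are hypothetical stand-ins for the 150 ADE20K categories and a real image.

```python
from collections import Counter

def class_proportions(mask, class_names):
    """Share of pixels per class in a 2-D label mask (list of rows of int ids)."""
    counts = Counter(label for row in mask for label in row)
    total = sum(counts.values())
    return {
        class_names[i]: counts.get(i, 0) / total
        for i in range(len(class_names))
    }

names = ["building", "sky", "tree", "road"]  # hypothetical subset of classes
mask = [
    [1, 1, 1, 1],
    [0, 2, 2, 1],
    [0, 2, 2, 0],
    [3, 3, 3, 3],
]
props = class_proportions(mask, names)
```

Here `props["tree"]` would play the role of a green-view share and `props["sky"]` that of sky openness for one image; the proportions for a single image sum to 1.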
To obtain the public’s subjective perceptions of the district environment, an online anonymous questionnaire survey was designed and administered. Participants were required to be at least eighteen years old and to evaluate the environment in the street-view images across seven perceptual dimensions. Commonly used perceptual dimensions include safety, liveliness, beauty, wealthiness, depression, and boredom [10]. The historic districts examined in this study are high-density everyday urban spaces with strong residential and local functions. In such contexts, cleanliness and orderliness have been repeatedly identified in environmental psychology and urban governance research as important determinants of comfort, safety, and place image [34]. Therefore, this study includes cleanliness as one of the indicators to more accurately reflect the structure of environmental experience within the local context (Table 1).
All questions were measured using a five-point Likert scale, and the survey design and implementation were approved by the Ethics Committee of the University of Shanghai for Science and Technology. A total of 835 valid responses were collected and combined with 296,657 POI records and 3371 street-view images to establish a comprehensive multi-source database.
Finally, the study employed the XGBoost model, using micro-environmental elements and POI features as independent variables and perceptual ratings as dependent variables, to identify the importance and direction (positive or negative) of various built-environment characteristics across different perceptual dimensions. This framework enabled systematic modeling of the relationship between objective environmental attributes and subjective perception outcomes, providing an operational pathway for quantifying environmental experience and informing design optimization strategies in historic districts.
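To make the notion of a threshold effect concrete, the sketch below searches for the single cut point on one feature that best separates low from high ratings. This is a depth-1 regression tree (a "stump"), the kind of split from which boosted tree ensembles are built; it is not the XGBoost pipeline used in the study, and the data values are invented for illustration only.

```python
def best_split(xs, ys):
    """Find the threshold on one feature that minimises squared error
    when each side is predicted by its own mean (a depth-1 regression tree)."""
    pairs = sorted(zip(xs, ys))

    def sse(values):
        # Sum of squared deviations from the mean of `values`.
        if not values:
            return 0.0
        mean = sum(values) / len(values)
        return sum((v - mean) ** 2 for v in values)

    best = None
    for i in range(1, len(pairs)):
        if pairs[i][0] == pairs[i - 1][0]:
            continue  # no threshold can separate equal feature values
        left = [y for _, y in pairs[:i]]
        right = [y for _, y in pairs[i:]]
        error = sse(left) + sse(right)
        threshold = (pairs[i - 1][0] + pairs[i][0]) / 2
        if best is None or error < best[0]:
            best = (error, threshold)
    return best[1] if best else None

# Hypothetical data: green-view share vs. mean "beautiful" rating (1 to 5).
green = [0.02, 0.05, 0.08, 0.12, 0.18, 0.22, 0.30, 0.35]
beauty = [2.1, 2.0, 2.2, 2.1, 3.6, 3.8, 3.7, 3.9]
threshold = best_split(green, beauty)
```

On these invented values the best cut falls midway between 0.12 and 0.18, which is the sort of step-like response that a threshold effect describes. A full model would combine many such splits across all features, typically with regularization.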

2.3. Dependent Variable

To evaluate pedestrians’ subjective perceptions of different district environments, an anonymous online questionnaire survey was conducted from 19 May to 18 August 2025. Participants were invited to rate the presented street-view images across seven perceptual dimensions: beautiful, boring, depressing, lively, safe, wealthy, and clean. The questionnaire adopted an open-ended rating mechanism, allowing participants to evaluate only the dimensions they found relevant, rather than responding to all items sequentially. This design aimed to encourage respondents to answer based on their genuine impressions, thereby enhancing the authenticity and psychological validity of the perceptual responses.
Building on previous research, the Likert scale records subjective intensity in an ordinal format, allowing respondents to express both the direction and magnitude of their evaluations. It is particularly well-suited for quantifying continuous subjective experiences such as emotions and aesthetic judgments. The five-level progression from low to high enables participants to form stable judgments within a short time, thereby reducing cognitive load and improving response consistency. The Likert scale has been widely applied in studies of environmental perception, walking experience, and urban imagery, demonstrating strong comparability and interpretability. Accordingly, this study adopts the Likert scale and subsequently tests the reliability and validity of the scale items. For robustness analysis, ordinal-variable treatment approaches are also considered to minimize measurement error and enhance the stability of the results.
The questionnaire was distributed through a custom-developed program created by the research team. The system randomly selected images from the database and assigned them to each participant, generating a unique anonymous ID for every session. Each image set was linked to the corresponding ID and locked prior to submission to ensure that no image was reused, thereby preventing rating bias. Each participant was required to evaluate five street-view images. This number was determined based on pilot testing, which considered participants’ attention thresholds and response burden. The results indicated that when the number of images exceeded five, both completion rates and rating quality declined significantly, whereas five images effectively balanced data volume with participant focus and response reliability.
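The assignment logic can be sketched roughly as below. The source does not describe the program's internals, so this is only one possible reading: images are drawn at random without repetition within a session, and images held by an open session are unavailable to other sessions until that session is submitted. The class and method names are hypothetical.

```python
import random
import uuid

class ImageAssigner:
    def __init__(self, image_ids, per_session=5, seed=None):
        self.image_ids = list(image_ids)
        self.per_session = per_session
        self.rng = random.Random(seed)
        self.locked = {}  # session id -> image ids held by that open session

    def assign(self):
        """Open a session with an anonymous id and a locked random image set."""
        in_use = {img for imgs in self.locked.values() for img in imgs}
        available = [img for img in self.image_ids if img not in in_use]
        if len(available) < self.per_session:
            raise RuntimeError("not enough unlocked images")
        session_id = uuid.uuid4().hex
        # random.sample draws without replacement, so a set has no repeats.
        self.locked[session_id] = self.rng.sample(available, self.per_session)
        return session_id, list(self.locked[session_id])

    def submit(self, session_id):
        # Assumption: the lock is released once the session is submitted.
        return self.locked.pop(session_id)

assigner = ImageAssigner(range(12), seed=0)
first_id, first_set = assigner.assign()
second_id, second_set = assigner.assign()
```

With two sessions open at once, the two image sets do not overlap; after `assigner.submit(first_id)` the first set becomes available again under the stated assumption.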
The survey followed the principles of voluntary and anonymous participation and complied with ethical review requirements. After excluding invalid, duplicate, or incomplete responses, a total of 835 valid questionnaires were retained. Most participants were university students and faculty members from the Shanghai region, many of whom had lived or studied in the city for an extended period and were therefore familiar with the everyday environments and spatial characteristics of local historic districts. This sample composition reflects a relatively high level of education and strong environmental sensitivity, providing a robust basis for the analysis of urban visual perception and environmental experience in this study.

2.4. Independent Variable: Environmental Features

2.4.1. Baidu Street View Image Environmental Features

Street-view imagery, with its high visual information density and ability to represent spatial structure, has become an important data source for studying the relationships between urban physical environments and spatial perception [35].
In this study, SVIs were obtained in batches from the Baidu Map Open Platform based on the geographic coordinates of the sampling points. To more accurately simulate the human eye–level perspective and observation height, parameters such as azimuth angle, pitch angle, and viewpoint height were specified to ensure consistency and representativeness of the image views. A total of 3371 street-view images were ultimately collected for analysis.
Subsequently, pixel-level semantic segmentation was performed on the street-view images using the PSPNet model in combination with the ADE20K dataset, identifying and extracting 150 feature categories (Figure 3), including micro-environmental elements such as sky, roads, and pedestrians. The PSPNet model is an advanced deep convolutional neural network that incorporates a pyramid pooling module to effectively integrate global and local features, thereby enhancing overall scene interpretation. Through an optimized loss function design, the model achieves state-of-the-art segmentation performance, demonstrating high efficiency and accuracy in street-view image processing tasks [36,37].

2.4.2. POI Data

POIs are a core concept in geographic information systems, encompassing extensive information on public infrastructure and commercial facilities. They are characterized by large data volume, broad spatial coverage, high identification accuracy, and convenient accessibility [38]. In recent years, with the continuous advancement of China’s digital urbanization, urban spatial entities have increasingly been abstracted into “points of interest” and visualized for users through mapping applications. POI data not only reflect the spatial distribution and attribute characteristics of urban functional facilities, but are also closely associated with human perceptions of urban space. As a result, they have been widely applied in studies of urban perception and human behavior [7].
In this study, POI data within the research area were extracted using the Baidu Map API (https://lbsyun.baidu.com/), obtaining information such as name, address, and geographic coordinates. After data cleaning, coordinate transformation, and sorting procedures, a total of 296,657 POI records were collected. Based on the official classification system, the POIs were categorized into eight major types (Table 2): food, shopping, life services, travel sights, leisure and entertainment, institutional and business services, transportation services, and accommodation services.
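One common way to link POI records to sampling points is to count the POIs of each category within a fixed distance of the point. The sketch below illustrates this idea only; the aggregation rule, the 300 m radius, the coordinates, and the category labels are assumptions made for illustration, and the coordinates are assumed to have already been converted to a common geographic reference system.

```python
import math
from collections import Counter


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two latitude/longitude points."""
    radius = 6371000.0  # mean Earth radius in metres
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * radius * math.asin(math.sqrt(a))


def poi_counts(point, pois, radius_m=300.0):
    """Count POIs per category within radius_m of a sampling point."""
    lat, lon = point
    return Counter(
        category
        for (poi_lat, poi_lon, category) in pois
        if haversine_m(lat, lon, poi_lat, poi_lon) <= radius_m
    )


# Hypothetical sampling point and POI records: (latitude, longitude, category).
sampling_point = (31.2200, 121.4600)
toy_pois = [
    (31.2205, 121.4605, "food"),
    (31.2201, 121.4601, "food"),
    (31.2190, 121.4610, "accommodation"),
    (31.2300, 121.4600, "shopping"),  # roughly 1.1 km away, outside the buffer
]

counts = poi_counts(sampling_point, toy_pois)
print(dict(counts))
```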

2.5. Data Processing and Analysis

2.5.1. Reliability and Validity Testing of the Questionnaire

To ensure the validity of the questionnaire, both reliability and validity tests were conducted. For the reliability assessment, Cronbach’s α coefficient was used to evaluate the internal consistency of each set of variables. This coefficient is a widely used metric for assessing the consistency of interval or quasi-interval scale ratings and has been broadly applied in scale reliability evaluations [39,40]. A higher α value indicates stronger measurement reliability; generally, a coefficient above 0.6 is considered acceptable, while values above 0.7 indicate good internal consistency. This study adhered to these standards to ensure that the variable structures demonstrated high reliability [41].
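For reference, Cronbach's α for k items is k/(k − 1) multiplied by one minus the ratio of the summed item variances to the variance of the total score. The following minimal sketch implements this standard formula with sample variances; the score matrix is invented for illustration and is not study data.

```python
from statistics import variance


def cronbach_alpha(rows):
    """Cronbach's alpha; rows is a list of cases, each a list of k item scores."""
    k = len(rows[0])
    item_vars = [variance([row[i] for row in rows]) for i in range(k)]
    total_var = variance([sum(row) for row in rows])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


# Hypothetical scores: five cases rated on three items.
toy_scores = [
    [3, 4, 3],
    [4, 4, 5],
    [2, 3, 2],
    [5, 5, 4],
    [3, 2, 3],
]

alpha = cronbach_alpha(toy_scores)
print(round(alpha, 3))  # about 0.871 for this toy matrix
```

Items that rise and fall together inflate the variance of the total score relative to the summed item variances, which pushes α toward 1.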
To further assess the structural suitability of the variable sets, the Kaiser–Meyer–Olkin (KMO) test and Bartlett’s test of sphericity were conducted as part of the validity analysis. Together, these tests provided a preliminary basis for evaluating the structural validity of the questionnaire.
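Bartlett's test of sphericity compares the correlation matrix R of p variables with an identity matrix using the statistic χ² = −(n − 1 − (2p + 5)/6) · ln|R| with p(p − 1)/2 degrees of freedom. The sketch below computes the statistic and the degrees of freedom with the standard library only; the p-value step (a chi-square tail probability) is omitted, and the example matrix is hypothetical.

```python
import math


def determinant(matrix):
    """Determinant via Gaussian elimination with partial pivoting."""
    a = [row[:] for row in matrix]
    n = len(a)
    det = 1.0
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        if abs(a[pivot][col]) < 1e-12:
            return 0.0
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            det = -det  # a row swap flips the sign of the determinant
        det *= a[col][col]
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            for c in range(col, n):
                a[r][c] -= factor * a[col][c]
    return det


def bartlett_sphericity(corr, n_obs):
    """Return (chi-square statistic, degrees of freedom) for correlation matrix corr."""
    p = len(corr)
    chi2 = -(n_obs - 1 - (2 * p + 5) / 6) * math.log(determinant(corr))
    dof = p * (p - 1) // 2
    return chi2, dof


# Hypothetical 2 x 2 correlation matrix from 100 observations.
chi2, dof = bartlett_sphericity([[1.0, 0.5], [0.5, 1.0]], n_obs=100)
print(round(chi2, 2), dof)
```

With 19 variables the degrees of freedom are 19 × 18 / 2 = 171, which is the value expected for a feature matrix of that size.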

2.5.2. Analyzing the Influence Mechanisms Between Subjective Perception and Environmental Variables Based on the XGBoost Model

To explore the contribution and underlying mechanisms of built-environment variables across different perceptual dimensions, the XGBoost machine learning model was employed for analysis. XGBoost is a gradient boosting–based ensemble learning method that iteratively optimizes a set of weak learners, making it well-suited for research scenarios involving numerous variables, complex feature types, and nonlinear relationships. Its structure effectively captures interaction effects and nonlinear responses among high-dimensional features, and it has demonstrated strong predictive performance and stability in studies of urban spatial analysis, behavioral science, and environmental perception [42,43,44]. Accordingly, XGBoost has been reported to outperform alternative models in analyzing the complex nonlinear relationships that are central to environmental research [45].
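The idea of iteratively optimizing weak learners can be sketched with a deliberately simplified gradient boosting routine: each round fits a one-split regression stump to the current residuals and adds a damped version of it to the ensemble. This is plain squared-error boosting on a single feature, not XGBoost itself, which additionally uses second-order gradient information, regularization, and deeper trees. The data, the number of rounds, and the learning rate are hypothetical.

```python
def fit_stump(x, residuals):
    """Best single-split regression stump on one feature (squared error)."""
    best = None
    for threshold in sorted(set(x))[:-1]:  # keep at least one point on the right
        left = [r for xi, r in zip(x, residuals) if xi <= threshold]
        right = [r for xi, r in zip(x, residuals) if xi > threshold]
        lmean, rmean = sum(left) / len(left), sum(right) / len(right)
        sse = sum((r - lmean) ** 2 for r in left) + sum((r - rmean) ** 2 for r in right)
        if best is None or sse < best[0]:
            best = (sse, threshold, lmean, rmean)
    _, threshold, lmean, rmean = best
    return lambda xi: lmean if xi <= threshold else rmean


def boost(x, y, n_rounds=50, learning_rate=0.3):
    """Fit an additive ensemble of stumps to the residuals, round by round."""
    base = sum(y) / len(y)
    stumps = []
    pred = [base] * len(y)
    for _ in range(n_rounds):
        residuals = [yi - pi for yi, pi in zip(y, pred)]
        stump = fit_stump(x, residuals)
        stumps.append(stump)
        pred = [pi + learning_rate * stump(xi) for pi, xi in zip(pred, x)]
    return lambda xi: base + learning_rate * sum(s(xi) for s in stumps)


def mse(y_true, y_pred):
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


# Hypothetical data: a rating that jumps once a feature passes a threshold.
x = [i / 10 for i in range(10)]
y = [2.0, 2.0, 2.0, 2.1, 3.0, 3.8, 4.0, 4.0, 4.0, 4.0]

model = boost(x, y)
baseline_mse = mse(y, [sum(y) / len(y)] * len(y))
boosted_mse = mse(y, [model(xi) for xi in x])
print(round(baseline_mse, 3), round(boosted_mse, 3))
```

Because each stump is a step function, the summed ensemble can follow threshold-like and saturating responses that a single linear term cannot represent.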
However, for complex models such as XGBoost, the original model structure cannot be directly interpreted. Shapley Additive Explanations (SHAP) can overcome this limitation [46]. Therefore, this study employed the SHAP method to enhance model transparency and result credibility. Originating from Shapley value theory, this approach quantifies the marginal contribution of each variable to the model’s prediction and identifies both the direction—indicating whether the variable promotes or suppresses perceptual ratings—and the relative importance of each factor [47,48].
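The Shapley value underlying SHAP averages a variable's marginal contribution over all subsets of the other variables. The brute-force sketch below applies that definition to a three-variable toy function, replacing "absent" variables with a single baseline value. It is meant only to make the definition concrete: practical SHAP implementations for tree ensembles use much faster algorithms and a background dataset rather than one baseline, and the toy model and inputs here are hypothetical.

```python
from itertools import combinations
from math import factorial


def shapley_values(model, x, baseline):
    """Exact Shapley values of model(x) relative to model(baseline)."""
    n = len(x)

    def value(subset):
        # Features in `subset` take their observed value; the rest take the baseline.
        mixed = [x[i] if i in subset else baseline[i] for i in range(n)]
        return model(mixed)

    phis = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        phi = 0.0
        for size in range(n):
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                phi += weight * (value(s | {i}) - value(s))
        phis.append(phi)
    return phis


def toy_model(row):
    """Hypothetical rating: greenery and dining help, traffic hurts, with one interaction."""
    green, dining, traffic = row
    return 2.0 * green + dining - 1.5 * traffic + green * dining


x_obs = [0.5, 0.4, 0.6]
x_base = [0.0, 0.0, 0.0]
phi = shapley_values(toy_model, x_obs, x_base)
print([round(p, 3) for p in phi])
print(round(sum(phi), 3), round(toy_model(x_obs) - toy_model(x_base), 3))
```

The sign of each value gives the direction of influence, and the values sum to the difference between the prediction and the baseline prediction, which is what allows contributions to be compared across variables.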
Subsequently, the outputs of the XGBoost model were used to evaluate the predictive effects of built-environment variables across different perceptual dimensions, while SHAP analysis was employed to identify key variables and their directions of influence. To further reveal the nonlinear structures of these features, Partial Dependence Plots (PDPs) were used to illustrate the average marginal effects of environmental variables on model outputs across different value ranges. PDPs are particularly suitable for exploring non-monotonic relationships within high-dimensional and multifaceted data structures [49]. The method isolates the target feature by fixing it at each value of interest and averaging the model predictions over the observed values of the other variables, making it possible to observe how variations in that feature influence perceptual ratings and thereby revealing potential threshold effects, saturation points, or nonlinear patterns [50,51]. SHAP and PDPs can be used to quantify feature importance and to visualize domain-specific outcomes, which supports the identification of the most effective interventions [52]. This process facilitates an understanding of the sensitive intervals and critical turning points of specific environmental factors from the perspectives of urban design and spatial governance, providing empirical evidence for informed policy recommendations and design strategies.
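A partial dependence curve can be computed directly from its definition: for each grid value, the target feature is overwritten in every observation, the model is evaluated, and the predictions are averaged. The sketch below shows this with a stand-in model; `toy_model`, its coefficients, and the sample rows are hypothetical and chosen only to produce a threshold-then-plateau shape.

```python
def partial_dependence(model, rows, feature_index, grid):
    """Average model prediction when one feature is forced to each grid value."""
    curve = []
    for value in grid:
        preds = []
        for row in rows:
            modified = list(row)
            modified[feature_index] = value  # overwrite only the target feature
            preds.append(model(modified))
        curve.append(sum(preds) / len(preds))
    return curve


def toy_model(row):
    """Hypothetical rating: flat, then rising, then saturating in the first feature."""
    green, dining = row
    green_effect = 4.0 * min(max(green - 0.3, 0.0), 0.25)
    return 2.0 + green_effect + 0.5 * dining


# Hypothetical observations: (green view ratio, dining share).
rows = [[0.1, 0.2], [0.4, 0.6], [0.7, 0.4]]
grid = [0.0, 0.3, 0.55, 0.8]

curve = partial_dependence(toy_model, rows, feature_index=0, grid=grid)
print([round(v, 2) for v in curve])
```

Because the other variables keep their observed values, the curve reflects the average response over the sample rather than the response at one fixed configuration.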

3. Results

3.1. Questionnaire Validation Results

Before conducting the model analysis, a reliability test was performed on the questionnaire data. The model inputs consisted of two categories of variables: the first included eight POI indicators, and the second comprised eleven micro-level visual elements derived from street-view imagery. In total, these 19 variables were used to construct the built-environment feature matrix.
First, the Cronbach’s α coefficients (Table 3) show that the POI variables achieved an α value of 0.789, exceeding the commonly accepted threshold of 0.7 and indicating stable internal consistency among these variables. The α value for the micro-level visual elements was 0.632, slightly below the 0.7 benchmark but above the 0.6 level treated as acceptable in Section 2.5.1. When all 19 variables were combined, the overall α coefficient reached 0.729, suggesting that the overall variable structure demonstrated good internal reliability.
It is important to note that environmental feature variables differ substantially from conventional psychological scales. Elements such as vegetation, building façades, public facilities, and open-space components in urban environments exhibit high heterogeneity in both type and spatial distribution; therefore, a level of internal consistency comparable to that of attitudinal scales should not be expected. Previous studies have pointed out that the diversity and complexity of urban environmental features often lead to relatively lower internal consistency, which in fact reflects the genuine heterogeneity of spatial landscapes rather than deficiencies in scale quality. Accordingly, this study considers the obtained reliability results to be both reasonable and valid, providing a stable data foundation for subsequent model analyses.
Second, as shown in Table 4, the KMO measure of sampling adequacy was 0.753, exceeding the commonly accepted threshold of 0.7 and indicating that the variables share enough common variance to be suitable for factor analysis. In addition, Bartlett’s test of sphericity yielded a significant result (χ² = 8821.612, df = 171, p < 0.001), rejecting the hypothesis that the correlation matrix is an identity matrix and supporting the structural validity of the variable set. These findings demonstrate that the data structure provides a solid statistical foundation to support subsequent model inference.
The implications of these results extend beyond statistical verification. In the context of urban spatial research, the reliability and validity of variables determine whether model inputs can accurately represent environmental structures and district characteristics. The higher internal consistency observed among POI variables suggests that urban functional facilities follow clear clustering logics and planning orientations, forming relatively stable spatial-functional patterns. In contrast, the more dispersed structural characteristics of micro-level visual elements indicate that environmental details within streetscapes are more diverse and open, jointly shaped by multiple factors such as building age, material selection, greenery maintenance, and community governance. This distinction reflects two complementary dimensions of the urban environment. On the one hand, functional facilities are more likely to achieve standardization and order through institutional planning. On the other hand, street-level details tend to embody locality and everydayness, exhibiting visual and perceptual richness as well as uniqueness. Overall, the reliability and validity tests demonstrate that the variable system employed in this study possesses a strong data foundation and robust statistical applicability, thereby supporting subsequent perception modeling and contribution analysis based on machine learning. The model performance metrics for each perceptual dimension (MSE, RMSE, MAE, and R²) are summarized in Appendix B, Table A2 for reference.
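For readers unfamiliar with the four performance metrics named above, the following minimal sketch shows how they are computed from observed and predicted ratings. The numbers are invented for illustration and do not correspond to the values reported in Table A2.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return MSE, RMSE, MAE, and R2 for paired observed and predicted values."""
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]
    mse = sum(e * e for e in errors) / n
    mae = sum(abs(e) for e in errors) / n
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    ss_res = mse * n
    return {
        "MSE": mse,
        "RMSE": math.sqrt(mse),
        "MAE": mae,
        "R2": 1 - ss_res / ss_tot,
    }


# Hypothetical observed and predicted perception scores.
observed = [3.0, 4.0, 2.0, 5.0]
predicted = [2.5, 4.0, 2.5, 4.0]

metrics = regression_metrics(observed, predicted)
print({name: round(value, 3) for name, value in metrics.items()})
```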

3.2. Correlation Characteristics and Mechanistic Interpretation of Perceptual Dimensions

The correlation matrix reveals distinct clustering patterns among the built-environment elements of historic districts (Figure 4), with variables primarily forming two internally coupled structures. The first cluster comprises commercial and service-oriented functional elements, while the second cluster consists of spatial-scene and visual composition factors. This structure reflects the composite characteristics of functional density and scene representation within historic districts and further illustrates the symbiotic relationship between diverse urban activities and spatial semantics.
Within the functional cluster, the high correlations among shop, dining, leisure entertainment (leisure), transportation (trans), and institutional business (instBus) confirm the spatial logic of co-location between commercial and service facilities. Multiple urban functions within historic districts often overlap and interact, jointly shaping a commercial atmosphere characterized by high pedestrian density and a strong sense of vitality. This phenomenon may result not only from planning orientation but also from the combined effects of land-tenure configurations and concentrated consumer demand in high-density districts. However, some of these strong correlations may merely reflect co-location effects rather than direct causal linkages between functions; therefore, the direction and magnitude of such influences require further validation through subsequent modeling.
Variables related to spatial scene and structural composition exhibit another form of consistency. Building components (bldComp), street furniture (stFurn), sky, and road surface (roadSurf) show strong correlations, indicating that spatial openness, façade continuity, and the distribution of street facilities typically influence visual experience in an integrated manner. Road surface and sky are compressed together in narrow alleys and expand together in wider streets, creating a pattern of structural co-variation among these variables. The moderate positive correlation between green view (grnView) and natural surface (natSurf) also suggests a co-existence pattern among nature-related elements; however, it should be noted that these two variables partially overlap in their semantic definitions, and therefore the strength of their correlation should be interpreted with caution.
Cross-cluster relationships are generally weak, yet several connections hold interpretive significance. The correlations among human presence, dining, and commercial advertisements (commAds) may suggest the attraction effect of commercial facilities, but they may also simply reflect statistical co-occurrence patterns in densely populated areas. The moderate coupling between trans and multiple scene-related variables forms a structural linkage, indicating that transportation infrastructure serves as a critical mediator between functional clustering and spatial perception. Meanwhile, the slight negative correlations between grnView and both dining and trans are more likely to represent spatial differentiation among district segments rather than environmental conflicts, reflecting semantic distinctions between pedestrian-oriented zones and green, residential street environments.
The above correlation structures reveal the parallel variation patterns and semantic coupling logic among environmental elements in historic districts. Commercial functions, activity density, and visual characteristics jointly contribute to the construction of the perceptual framework of historic districts, whereas natural elements exhibit a more dispersed and selectively embedded pattern within the spatial structure (see Appendix C for details).
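The pairwise structure discussed in this section can be reproduced in outline with a small correlation routine. The sketch assumes Pearson's coefficient, which the text does not specify, and uses invented values for three of the variable names; it illustrates only how co-variation (for example, sky and roadSurf rising together) appears in such a matrix.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def correlation_matrix(columns):
    """Return a nested dict of pairwise correlations for named columns."""
    names = list(columns)
    return {a: {b: pearson(columns[a], columns[b]) for b in names} for a in names}


# Hypothetical pixel shares at four sampling points.
toy_columns = {
    "sky": [0.10, 0.20, 0.30, 0.40],
    "roadSurf": [0.20, 0.25, 0.35, 0.40],
    "grnView": [0.40, 0.30, 0.20, 0.10],
}

matrix = correlation_matrix(toy_columns)
for name, row in matrix.items():
    print(name, {other: round(r, 2) for other, r in row.items()})
```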

3.3. Key Factors Influencing Subjective Perception

3.3.1. Major Variables Across Perceptual Dimensions

The variable importance results reveal a differentiated structural pattern across the seven perceptual dimensions. First, in the beautiful dimension (Figure 5a), grnView exhibits a substantially higher contribution than all other variables. This finding indicates that natural elements play a decisive role in shaping visual preference, aligning with previous research emphasizing that “visual greenery is a key source of aesthetic experience”. Notably, greenery is perceived not only as an ecological component but also as a symbolic indicator of a pleasant and high-quality environment. This suggests that, within the visual evaluation of historic districts, greenery remains the most immediate aesthetic cue.
In contrast, the lively dimension is primarily driven by functional elements such as dining, leisure, accommodation (accom), instBus, and trans (Figure 5b), while showing relatively weak dependence on natural landscapes. The results indicate that perceptions of urban vitality are mainly associated with the density of commercial facilities, the availability of activity opportunities, and spatial accessibility, rather than with green coverage. This finding resonates with a long-standing discussion in urban planning—that a pleasant urban experience arises not only from greenery but also from the richness of urban services and activity settings.
The safe, clean, and wealthy dimensions exhibit highly similar structural patterns (Figure 6). All three are primarily supported by elements associated with order and management, including instBus, lifeSvc, accom, shop, and commAds, while the contribution of grnView remains relatively limited. In other words, residents and visitors tend to form impressions of safety, cleanliness, and affluence based more on social order, governance performance, and consumer symbolism than on the presence of natural environments. This pattern suggests that perceptions of urban quality are shaped largely by institutional capacity and commercial atmosphere rather than by greenery itself.
Among the emotion-related dimensions, boring and depressing display distinct underlying logics. In the boring dimension (Figure 6d), grnView again demonstrates significant explanatory power, suggesting that the absence of natural elements tends to evoke feelings of monotony and emptiness. In contrast, in the depressing dimension (Figure 6e), dining and trans dominate, implying that high-intensity environments characterized by commercial and traffic activities may induce crowding and psychological stress—particularly in historic districts where spatial carrying capacity is limited. This phenomenon indicates that urban vitality and psychological comfort do not follow a linear correspondence but instead exhibit potential perceptual thresholds.
Overall, distinct differentiation is observed across perceptual dimensions in terms of natural elements, functional facilities, and social-order cues. Natural landscapes enhance aesthetic experience, yet their positive effects are not universally applicable. Functional facilities improve convenience and vitality, but in high-density environments, they may also lead to spatial compression and psychological fatigue. Governance and consumer symbols contribute to perceptions of safety and quality, yet they may simultaneously diminish natural affinity and social warmth. These results highlight that spatial strategies in historic districts should vary according to perceptual objectives. Aesthetics, vitality, and psychological comfort cannot be enhanced through a single type of element alone; rather, they depend on the combined configuration and proportional balance of multiple environmental factors (see Appendix D for details).

3.3.2. Positive and Negative Contribution Characteristics of Variables Across Perceptual Dimensions

The positive and negative contributions of variables reveal the differentiated roles of various environmental elements in perceptual formation. Overall, elements related to naturalness, functionality, and orderliness exhibit pronounced nonlinear and context-dependent characteristics across different perceptual dimensions, reflecting the complexity and multidimensional interactions underlying perceptual mechanisms in historic districts.
As shown in Figure 7a, within the beautiful dimension, grnView exhibits the strongest positive contribution, with aesthetic perception rising sharply after moderate green coverage and approaching saturation in the higher-value range. This trend highlights the central role of green landscapes in shaping aesthetic experience, consistent with the existing theoretical consensus on the link between visual greenery and pleasurable perception. In contrast, the positive effects of dining and leisure stem primarily from social interaction and atmospheric liveliness rather than from visual greenery itself. A similar pattern of green enhancement can also be observed in the higher-value ranges of the clean and wealthy dimensions (Appendix E), though with weaker magnitudes (Appendix F). This indicates that the advantage of greenery is largely confined to aesthetic contexts, serving more as a supportive cue in perceptions of quality and affluence. Conversely, instBus and trans show negative effects in high-density areas, suggesting that institutional buildings and transport facilities may induce visual pressure and perceptual disturbance. This dual mechanism of “green enhancement and institutional suppression” reveals the underlying tension between aesthetic experience and spatial order.
Distinct from aesthetic experience, the lively dimension demonstrates a functionality-driven vitality mechanism (Figure 7b). Dining and leisure significantly enhance street vitality, indicating that social interaction and activity diversity are key sources of emotional restoration and environmental attractiveness. However, when instBus and trans occupy excessively high proportions, social spaces become compressed by formal functions, leading to a decline in perceived liveliness. This inverse effect suggests that urban vitality does not simply depend on usage intensity, but rather on the balance between activity types and opportunities for social interaction. Notably, within the lively dimension, the contribution of grnView is markedly weaker, implying that greenery, when detached from social or activity contexts, offers limited psychological appeal. Correspondingly, in the boring dimension (Appendix E), the absence of commercial facilities and pedestrians significantly increases perceived monotony, while green view can alleviate dullness in low-activity settings, though its influence remains weaker than that of social and commercial stimuli (Appendix F). Together, these patterns illustrate a “social attraction and spatial emptiness inhibition” mechanism, highlighting the compensatory dynamics between human activity and environmental openness.
The safe, clean, and wealthy dimensions exhibit a highly consistent structural pattern (Appendix E). The clustering of instBus, accom, shop, and stFurn collectively supports judgments of order and quality, while the role of green view remains relatively secondary (Appendix F). Notably, in the safe dimension, trans and human variables exert positive effects within moderate ranges, reflecting a safety mechanism driven by appropriate levels of pedestrian presence and public surveillance. In contrast, the depressing dimension (Figure 7c) reveals that high-density traffic and institutional spaces intensify negative emotions, forming a dual manifestation of “order and pressure”. This inverse relationship is evident in both the SHAP decision pathways and local response intervals (Appendix F). Meanwhile, dining and shop variables mitigate negative emotions when moderately distributed, indicating that commercial and social activities serve as psychological buffers under certain conditions. These findings suggest that vitality and depression are often co-shaped by the opposite effects of the same types of elements operating at different density ranges, underscoring the nonlinear and context-dependent nature of environmental perception in historic districts.
Overall, green elements dominate the aesthetic dimension, social and recreational functions drive the vitality dimension, and transportation and institutional spaces reinforce the negative emotion dimension. This cross-dimensional contrast reveals the nonlinear logic of perceptual mechanisms: the positive effects of greenery are not universal; functional elements can both stimulate vitality and generate burden; and order-related facilities may shift from cues of attraction to sources of oppression depending on context. Such interwoven relationships underscore the complexity of perceptual structures in historic districts and provide insights for future spatial optimization. Only through a dynamic balance among greenery, order, and safety can the visual and emotional attractiveness of historic districts be sustained over time.

3.4. Analyzing Feature Effects Using Partial Dependence Plots

The results of the Partial Dependence analysis reveal that the relationships between built-environment elements and multidimensional perceptions generally exhibit nonlinear response patterns characterized by threshold, saturation, and reversal effects. This indicates that environmental perception does not accumulate linearly; rather, it is significantly activated within specific value ranges and tends to show marginal diminishing or even inverse effects at higher levels.
In the beautiful dimension (Figure 8), grnView exhibits the most typical threshold pattern: when the green view ratio is below approximately 0.30, its effect is limited; once this threshold is exceeded, perceived beauty increases significantly and then plateaus around 0.55. Moderate clustering of dining and leisure (0.2–0.4) likewise enhances pleasantness, but excessive concentration (>0.6) leads to aesthetic fatigue, illustrating a balance between functional density and visual experience. In contrast, the boring dimension shows an inverse diversity–monotony pattern (Appendix G). Moderate clustering of dining and lifeSvc (around 0.3) reduces the sense of emptiness, whereas excessive density of instBus and trans (>0.5) results in boredom. GrnView values above 0.4 continue to mitigate monotony, extending its positive aesthetic effect (Appendix H). In the clean dimension, lifeSvc and accom increase perceived cleanliness within the 0.2–0.3 range, but their influence weakens when the proportion exceeds 0.6. Conversely, higher densities of transportation (>0.4) and commAds (>0.5) reduce the impression of cleanliness, reflecting the visual and cognitive burden caused by functional overconcentration (Appendix H).
The lively and depressing dimensions form a striking mirror relationship. The former demonstrates a positive threshold effect of moderate enhancement, whereas the latter exhibits a negative threshold pattern characterized by overconcentration reversal. In the lively dimension (Figure 9a), the proportions of dining and leisure significantly increase perceived vibrancy within the 0.3–0.5 range, peaking around 0.6–0.7 before reaching saturation. In contrast, trans turns negative beyond 0.5, indicating that vitality does not increase monotonically with density, but rather arises from a balanced distribution of pedestrian flow and activity intensity. Conversely, in the depressing dimension (Figure 9b), instBus and transportation intensify depressive feelings when their proportions exceed 0.5. Accom and lifeSvc provide mild buffering effects at moderate levels, but likewise turn negative under high-density conditions. This reversal relationship suggests that when institutional or transport-oriented spaces become overly concentrated, environmental attractiveness may shift into psychological pressure.
A similar nonlinear threshold effect is also observed in the safe dimension (Figure 9c). When grnView coverage exceeds 0.3, perceived safety increases significantly, suggesting that natural elements enhance the perception of order by improving visual openness and comfort. Shop and lifeSvc generate stable positive effects within the 0.2–0.4 range, whereas instBus and commAds above 0.5 noticeably weaken the sense of safety. These findings indicate that visual cleanliness and functional order constitute the perceptual foundation of safety. In comparison, the perception of affluence in the wealthy dimension is primarily shaped by commercial and accommodation elements (Appendix G). Moderate levels of commAds and shop (approximately 0.3–0.5), together with accom (approximately 0.4–0.6), enhance perceived prosperity, whereas excessive density reverses this effect—extending the previously identified order–perception consistency pattern (Appendix H).
Overall, although different perceptual dimensions emphasize distinct aspects, they all follow a common threshold–saturation–reversal pattern. The grnView variable produces significant perceptual enhancement once it exceeds approximately 0.3–0.4 (30–40%), a threshold shared across perceptual dimensions, whereas high-density zones of instBus and trans consistently generate negative experiences. These findings suggest that spatial optimization in historic districts should not pursue mere quantitative accumulation of elements, but rather aim for balanced functional configuration based on threshold identification, thereby maximizing psychological comfort and perceptual quality while preserving historical character.
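Threshold and saturation points of the kind summarized here can be read from a partial dependence curve by inspecting its local slope. The sketch below is one simple heuristic for doing so, not the procedure used in this study: it reports the first grid value at which the slope exceeds a cutoff and the first later value at which it falls back. The curve values and the cutoff of 1.0 are hypothetical.

```python
def find_threshold_and_plateau(grid, curve, slope_cutoff=1.0):
    """Return (threshold, plateau) grid values based on the local slope.

    threshold: first grid value whose forward slope exceeds slope_cutoff.
    plateau:   first later grid value whose forward slope is at or below it.
    Either entry is None if no such point exists.
    """
    slopes = [
        (curve[i + 1] - curve[i]) / (grid[i + 1] - grid[i])
        for i in range(len(grid) - 1)
    ]
    threshold = plateau = None
    for i, slope in enumerate(slopes):
        if threshold is None and slope > slope_cutoff:
            threshold = grid[i]
        elif threshold is not None and slope <= slope_cutoff:
            plateau = grid[i]
            break
    return threshold, plateau


# Hypothetical partial dependence curve: flat, steep rise, then flat again.
grid = [i / 10 for i in range(9)]
curve = [2.20, 2.21, 2.22, 2.25, 2.60, 3.00, 3.18, 3.20, 3.20]

threshold, plateau = find_threshold_and_plateau(grid, curve)
print(threshold, plateau)
```

The result depends on the grid spacing and the chosen cutoff, so any threshold obtained this way is best reported as an approximate range rather than a single exact value.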

4. Discussion

This study develops a systematic understanding of perceptual mechanisms in historic districts through the integration of multisource spatial data. Its significance lies not only in identifying the directional influence of key environmental variables but also in revealing how these factors jointly structure the ways in which people experience historic districts. By integrating POI functional attributes, micro-scale environmental indicators, and subjective perceptions, the study proposes a perceptual balance framework centered on the triad of natural characteristics, functional characteristics, and orderliness (Figure 10). This framework explains why historic districts can produce markedly different experiences of aesthetics, liveliness, safety, and comfort. In contrast to traditional approaches that emphasize the effects of single variables, the study demonstrates that experiential quality in historic districts arises from the proportional and structural coordination among these elements rather than from the simple accumulation of individual factors.
Natural characteristics, functional characteristics, and orderliness play distinct roles in shaping experiences within historic districts, yet their mechanisms of influence exhibit clear complementarities. Natural characteristics provide the foundation for positive perception. Their presence enhances aesthetic pleasure and improves overall experience by moderating microclimatic conditions and enriching visual depth. Functional characteristics form the driving system of street life, enabling diverse everyday scenarios such as social interaction, consumption, strolling, and lingering. Orderliness maintains the structural stability of street experience by shaping environmental clarity, visual cleanliness, and spatial legibility. The coexistence of these three dimensions allows historic districts to preserve cultural texture while maintaining everyday spatial flexibility, and to support commercial activities while still offering a pleasant pedestrian environment. It is this structural balance, rather than the intensification of any single dimension, that produces street experiences that are simultaneously dynamic and cohesive.
The threshold effects identified in this study further deepen this understanding. The findings indicate that the influences of natural characteristics, functional characteristics, and orderliness do not increase linearly with intensity but instead exert significant effects within specific sensitivity ranges. Once natural characteristics surpass an initial level of visibility, their contributions to aesthetic enhancement and emotional improvement become most pronounced. Functional characteristics most effectively stimulate street vitality when present at moderate levels of mix and density, yet when their concentration exceeds a certain upper limit, they tend to induce visual fatigue, noise disturbance, and social pressure. This aligns with the concepts of moderate complexity and activity load thresholds widely discussed in urban design research. Orderliness strengthens perceptions of safety, comfort, and spatial legibility within an appropriate range, but excessive formalization can suppress social interaction and spatial openness. This result corresponds with recent research emphasizing that environmental order is most conducive to safety and comfort when maintained within a moderate range.
These findings encourage a reconsideration of spatial quality and governance strategies in historic districts. Traditional renewal practices often emphasize isolated improvements such as adding greenery, beautifying façades, or enhancing commercial functions. However, the results of this study indicate that the key to experiential quality lies in the structural coordination among elements across different scales rather than in the enhancement of any single component. Increasing commercial facilities may raise activity density, but it can also disrupt natural characteristics and orderliness, thereby pushing perceptual responses beyond critical thresholds. In contrast, smaller spaces with appropriate internal proportions of natural characteristics, functional characteristics, and orderliness often exhibit greater attractiveness than larger spaces that lack such balance.
Therefore, the perceptual balance model proposed in this study not only offers a new theoretical perspective for explaining experiential differences in historic districts but also provides a structured cognitive basis for urban design. At the macro level, functional configurations should avoid excessive concentration. At the meso level, orderliness should remain clear without becoming overly restrictive. At the micro level, natural elements should maintain a stable level of visual presence. This perspective shifts governance strategies from material upgrading toward experiential enhancement and from localized fixes toward structural adjustment. It also establishes a framework with stronger theoretical depth and practical relevance for the future renewal of historic districts grounded in cultural continuity, everyday life, and resilience.

4.1. Environmental Elements and Mechanisms of Perceptual Formation

The perceptual experience of historic districts is not a direct reflection of their physical attributes, but rather a process through which social meanings are generated in the interaction between people and the environment. The results indicate that grnView exerts a consistent positive effect on the beautiful, safe, and clean dimensions, suggesting that naturalness serves as a fundamental basis for shaping aesthetic experience and psychological safety [53,54]. This finding aligns with the Attention Restoration Theory [55], which posits that natural elements can promote the formation of pleasure and a sense of order by reducing cognitive load. Notably, this effect exhibits a threshold pattern: when the green coverage ratio exceeds approximately 0.2–0.3, positive perceptions increase significantly, echoing the concept of a “recognizable threshold effect” in perceptual psychology. Within the context of historic districts, this naturalness effect is particularly crucial. Because these areas are constrained by spatial morphology and conservation regulations, greenery typically appears in micro-scale forms such as street trees, courtyard vegetation, or façade greening, whose visibility and recognizability become key determinants of psychological comfort. Natural elements provide visual relief and emotional regulation within dense historical fabrics, enabling residents to achieve a balance between a sense of history and a sense of everyday life.
At the same time, certain functional elements display reversal effects. Dining and leisure facilities enhance vibrancy and social interaction when maintained at moderate densities, yet their excessive concentration can induce a sense of pressure and fatigue. This phenomenon can be interpreted through anthropological and social-psychological perspectives: individual experiences of space are not passively received, but are shaped through the cultural meanings attributed to environments. When a place becomes overly shaped by the logic of the experience economy, its original social rhythms and everyday interactions are replaced by commercial imperatives, leading to place alienation and psychological detachment. For historic districts, such alienation is especially pronounced, as their uniqueness stems from heterogeneous rhythms of daily life and layers of cultural memory. When these qualities are replaced by commodified experiences, the complexity of perception reflects an ongoing tension between urban modernity and everyday authenticity, representing a core contradiction that defines the challenges of contemporary historic district renewal.

4.2. Functional Attributes and Social Imagery

Functional attributes not only determine spatial morphology but also profoundly shape the social imagery and cultural identity of urban districts. The significant influence of commercial advertisements and retail facilities on the wealthy and lively dimensions reveals the reproductive mechanism of consumer landscapes in shaping urban imagery. Through visual symbols and consumer atmospheres, commercial elements construct symbolic representations of “prosperity” and “modernity”, transforming historic districts into stages for the display of aesthetic and consumer identities. However, such symbolic prosperity is often accompanied by social exclusion and the erosion of everyday diversity, leading to a typical phenomenon of symbolic gentrification [56]. Hence, while functional renewal can stimulate economic vitality, it may simultaneously erode the social memory and cultural continuity of historic districts.
This phenomenon becomes even more complex in the context of historic districts. The decline of traditional industries and the influx of high-end commerce often reshape the cultural representation of place, transforming the original spaces of living into spaces of display. From a tourism studies perspective, historic districts are no longer merely residential or commercial areas but have evolved into “places to be gazed upon” [57]. Pedestrians experience “staged authenticity” through representational symbols, yet the cultural landscapes thus presented are frequently detached from residents’ real lives, creating a dual structure of experience. This tendency toward cultural commodification reveals the dilemma of heritage preservation in globalized cities: the challenge of maintaining economic sustainability while preventing spatial landscapes from being reduced to superficial experiential goods. In other words, functional differentiation is not simply a matter of economic stratification, but represents a redistribution of social identity, cultural capital, and symbolic power within the urban hierarchy.

4.3. Interactions Among Environmental Elements and Mechanisms of Perceptual Balance

Within the same perceptual dimension, the interactions among different environmental elements reveal the complex mechanisms underlying perceptual optimization. Greenery and spatial openness enhance visual permeability and psychological restoration, whereas order-related facilities [58], while reinforcing a sense of safety, may simultaneously weaken social interaction and spatial intimacy. This indicates that safety and aesthetics do not inherently coexist, but must be balanced within the tension between regulation and freedom. When urban governance overemphasizes order, it risks fostering the expansion of a “managed aesthetics”, causing districts to lose their everyday flexibility and social vitality.
This issue is particularly pronounced in historic districts. As spatial typologies that simultaneously pursue heritage preservation and modern governance, their renewal processes often encounter tensions among protection, regeneration, and management. The observed perceptual reversals—where order-related facilities contribute positively to the safe dimension but negatively to the depressing one—reflect the latent risks of over-rationalized spaces. Such spaces may appear clean and orderly on the surface, yet they often lack social warmth and interpersonal accessibility. Therefore, spatial governance in historic districts should move beyond a purely safety-oriented logic toward a model of perceptual balance planning, emphasizing micro-scale functional diversity, interface vitality, and maintenance of heterogeneity. Only by preserving social interaction and emotional attachment while safeguarding the historical fabric can historic districts truly achieve the coexistence of safety, aesthetics, and liveability, sustaining their temporal depth and social resilience as “living heritage”.

4.4. Limitations and Future Research Directions

Although this study develops a perception model for historic districts using multisource data, current technological conditions still impose several constraints on urban visual datasets. These limitations do not arise from the research itself but stem from practical restrictions related to data availability and methodological applicability. At the same time, these constraints provide clear directions for methodological refinement and theoretical advancement in the next stage of research.
First, existing studies largely rely on street-view images that are static, captured during daytime, and collected from vehicle-mounted cameras. These forms of data make it difficult to capture the dynamic experiences of historic districts as they change across time and weather conditions. In reality, the atmosphere of a district and the emotional responses it evokes can vary significantly with differences in time of day, seasonal transitions, and climatic situations. However, most available street-view images are provided by commercial mapping platforms whose update frequency is limited. These platforms also face practical constraints in capturing images at night or under special weather conditions due to technical and operational limitations. Future research may explore more flexible image collection approaches, such as crowdsourcing to build street-view databases that cover multiple times of day and various weather situations, including additional nighttime images, seasonal updates, and visual information captured under different climatic conditions. At the same time, low-cost mobile data collection methods and community-based image generation approaches may be explored to overcome the structural limitations of commercial platforms in terms of timing, frequency, and acquisition conditions. Such supplementary data would help to provide a more comprehensive depiction of how historic districts are experienced under different situational contexts.
Second, the vehicle-mounted viewpoint has inherent limitations in viewing height and framing scope, making it difficult to capture spatial details at the pedestrian scale. Many key cues related to perceived safety, comfort, and everyday life—such as façade details, street-level activities, and community interactions—are often not clearly visible in vehicle-based imagery. Future research should incorporate pedestrian-perspective data sources such as wearable cameras, mobile mapping imagery, and immersive virtual reality roaming experiments. Integrating these visual materials with physiological responses such as eye tracking, skin conductance, and heart rate, and further combining them with behavioral trajectory data, can help establish a more robust connection between online visual evaluations and actual walking experiences.
Third, the subjective evaluation data used in this study were collected from participants who were able to take part in an online survey and were familiar with visual assessment tasks. This group demonstrated clear advantages in task comprehension, rating consistency, and cooperation with online experimental procedures, which helped ensure the quality and reliability of the data collection process. However, the social and cultural backgrounds of this sample group are relatively concentrated, which makes it difficult to fully represent the cognitive and experiential diversity of the broader population. In reality, differences in culture, lifestyle, and social experience can strongly shape how people understand environmental concepts such as safety, beauty, wealth, and boredom. Future research should gradually establish a comparative framework across cultures, regions, and communities, including expanding sample sources, developing multilingual systems, and conducting cultural calibration analyses. By creating standardized perception benchmarks across cities, the generalizability and theoretical explanatory power of the research can be further strengthened.
Fourth, the current infrastructure for urban data remains fragmented, which limits the multimodal integration of urban experience. Street-view imagery, soundscapes, microclimatic data, and human behavioral signals differ substantially in sampling frequency, spatial scale, and semantic representation, creating bottlenecks for cross-modal alignment and coordinated analysis. An important direction for future research is therefore the development of an aligned multimodal perception database, which would allow for a more systematic representation of experiential trajectories and deepen theoretical understanding of the mechanisms through which perception is formed.
Finally, advancing urban perception research to a higher level requires the coordinated development of technological systems, institutional arrangements, and data platforms. Establishing open data standards, building shared perception databases across cities, developing unified evaluation protocols for virtual and augmented reality, and linking these tools with urban computing frameworks such as digital twins will provide the foundation for comparative studies across cities and cultures. These efforts will further shift the renewal strategies of historic districts from evaluations focused on static landscapes toward integrated governance approaches that emphasize experience, culture, and resilience.

5. Conclusions

This study focused on six historic districts in Shanghai and constructed a framework linking environmental elements and human perception by integrating street-view imagery, POI data, and subjective evaluations. The results indicate that the experience of historic districts is not a direct projection of single physical attributes, but rather a dynamic equilibrium formed among naturalness, functionality, and orderliness. Green view enhances aesthetic appeal and comfort; commercial and service elements support vitality and opportunities for use; and institutional and transportation factors reinforce cues of order and safety. These three categories of elements exhibit directional variations under different intensities and combinations, suggesting that the perceptual structure of historic districts reflects multi-element synergies rather than simple additive effects.
At the theoretical level, this study first established a verifiable explanatory chain linking urban imagery, environmental elements, and psychological perception, thereby clarifying the primary cues and relative weights of different perceptual dimensions. Further results reveal that aesthetics and vitality are not necessarily aligned: while an increase in functional intensity may create more opportunities for social interaction, it can also induce crowding and stress, exposing a common experiential tension within historic districts. Hence, the attractiveness of historic districts arises from the combined effects of openness, activity, and cultural continuity, rather than from a unidirectional increase in either naturalness or functionality.
At the methodological level, this study established a scalable and reproducible quantitative pathway supported by multi-source spatial data and interpretable machine learning. Street-view pixel composition and POI functional structures were integrated into a unified modeling framework, in which feature contributions and local responses jointly provided directional and interval information for each variable, allowing the relationships between environmental factors and perceptual dimensions to be traced and verified. Compared with conventional appearance assessments that rely heavily on expert judgment, this framework ensures consistent measurement standards across multiple spatial scales.
At the practical level, this study highlights that the design and management of historic districts should pay close attention to threshold effects and proportional relationships among environmental cues. Based on the quantitative results, several practice-oriented principles can be proposed. First, visual greenery within the approximate range of 0.30 to 0.55 produces the most stable improvements in aesthetic quality and comfort. This suggests that historic districts should promote an appropriate level of visible greenery, primarily through street trees, green seams, and micro green spaces, while avoiding excessive coverage that obscures façades and historical patterns. Second, dining, leisure, and neighborhood service facilities support urban vitality most effectively when their combined presence falls within the approximate range of 0.20 to 0.40. When densities exceed about 0.60, issues such as noise, crowding, and visual fatigue are more likely to occur, indicating the need for micro-scale spatial arrangements and functional layering to regulate density. Third, institutional and traffic-related elements can enhance the perception of order when their visibility remains within moderate levels of approximately 0.20 to 0.50. However, excessive exposure of these features can weaken the sense of spatial affinity, and they should therefore be moderated through façade detailing, street furniture, and material integration. Overall, these principles converge on a central understanding: enhancing the experiential quality of historic districts does not rely on maximizing any single environmental element, but rather on maintaining an appropriate proportional balance among greenery, activity intensity, and spatial order. This proportional relationship can be translated directly into design control parameters, offering quantitative guidance for architectural renovation, streetscape improvement, façade conservation, and functional allocation. 
In doing so, historic districts can accommodate contemporary everyday use while simultaneously sustaining and enriching their cultural and experiential significance.
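The approximate ranges quoted in this section can be read as simple control bands. The following minimal sketch assumes that each indicator is expressed as a normalized proportion between 0 and 1 (the text does not state the units explicitly) and checks one street segment against those bands. The names `RANGES`, `ACTIVITY_OVERLOAD`, and `assess`, as well as the sample segment values, are hypothetical and serve only as illustration.

```python
# Approximate ranges quoted in the conclusions; all names here are hypothetical.
RANGES = {
    "greenery": (0.30, 0.55),  # visual greenery
    "activity": (0.20, 0.40),  # dining, leisure, neighborhood services
    "order": (0.20, 0.50),     # institutional and traffic-related elements
}
ACTIVITY_OVERLOAD = 0.60  # above this, noise, crowding, and visual fatigue become more likely


def assess(indicator: str, value: float) -> str:
    """Classify a normalized indicator value against its quoted range."""
    low, high = RANGES[indicator]
    if indicator == "activity" and value > ACTIVITY_OVERLOAD:
        return "overload"
    if value < low:
        return "below"
    if value > high:
        return "above"
    return "within"


# Illustrative segment values (assumed, not taken from the study's data).
segment = {"greenery": 0.42, "activity": 0.65, "order": 0.15}
report = {name: assess(name, value) for name, value in segment.items()}
print(report)
```

A check of this kind only flags where a segment sits relative to each band; it does not capture the interactions among elements that the study emphasizes.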
Under current conditions, this study still faces several domain-wide limitations. These constraints do not alter the directional validity of the conclusions but may affect the extent and granularity of their generalization. The street-view imagery used in this research originates from platform-based periodic collections, which lack sufficient coverage of nighttime scenes, extreme weather, and festive events—thus limiting the capture of temporal rhythms and group-specific variations. The update frequency and visibility rules of POI data are influenced by platform operations, leading to inconsistencies in timeliness and accuracy across business types. Moreover, most images reflect vehicular perspectives, potentially underrepresenting pedestrian micro-spaces and inner-block environments. The subjective evaluation samples were primarily collected from participants capable of engaging online, whose cultural backgrounds and lifestyles may influence their perceptual boundaries of safety, wealth, and cleanliness. Comprehensive improvements in temporal coverage, scene detail, and participant diversity would require higher data collection costs, cross-departmental collaboration, and the establishment of more mature open-data standards.
Building on the above understanding, future work should advance progressively rather than seeking immediate comprehensiveness. When conditions permit, temporal street-view data, pedestrian-perspective imagery, soundscape information, and microclimatic data can be incorporated to depict experiential trajectories across day–night cycles, seasons, and activity periods. Experimental settings could further integrate eye-tracking, physiological responses, and virtual reality roaming to compare the consistency between online image-based evaluations and real-world experiences. Additionally, developing standardized perception databases across cities and population groups would allow cultural and social heterogeneity to be embedded within the indicator system. These three research pathways—corresponding, respectively, to the temporal, sensory, and population dimensions—complement and naturally extend the analytical framework proposed in this study.
Overall, this study identified the key factors and principal ranges influencing the multidimensional perception of historic districts, providing quantitative evidence linking spatial composition to experiential quality. The value of this research lies not in resolving all issues at once, but in establishing a sustainable and extensible foundation for both technical exploration and cognitive understanding. With the continued advancement of data collection, open standards, and urban governance, the proportional relationships among greenery, activity, and order will be more precisely delineated, enabling the renewal of historic districts to evolve beyond physical restoration toward the integrated enhancement of experiential quality and cultural continuity.

Author Contributions

Conceptualization, Z.W., Y.H. and W.Z.; methodology, Y.H. and W.Z.; software, Y.H.; validation, Y.H.; formal analysis, Y.H. and W.Z.; investigation, Y.H. and W.Z.; resources, Y.H. and W.Z.; data curation, Y.H. and W.Z.; writing—original draft preparation, Y.H. and W.Z.; writing—review and editing, Y.H.; visualization, W.Z.; supervision, Y.H.; project administration, Y.H. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Institutional Review Board Statement

The study was conducted in accordance with the Declaration of Helsinki, and approved by the Institutional Review Board (Ethics Committee) of the University of Shanghai for Science and Technology (protocol code USST20250519 and date of approval: 19 May 2025).

Informed Consent Statement

Informed consent was obtained from all subjects involved in the study.

Data Availability Statement

The data presented in this study are available on request from the corresponding author.

Conflicts of Interest

The authors declare no conflicts of interest.

Abbreviations

The following abbreviations are used in this manuscript:
| Abbreviation | Definition |
|---|---|
| POI | Point-of-Interest |
| SVIs | street-view images |
| KMO | Kaiser–Meyer–Olkin |
| SHAP | Shapley Additive Explanations |
| PDPs | Partial Dependence Plots |
| leisure | leisure entertainment |
| trans | transportation |
| instBus | institutional business |
| bldComp | building components |
| stFurn | street furniture |
| roadSurf | road surface |
| grnView | green view |
| natSurf | natural surface |
| commAds | commerce ads |
| accom | accommodation |

Appendix A

Table A1. Basic information of the six historic districts in central Shanghai.
| Code | Neighborhood Name | Age of Construction | Featured Elements | Typical Architectural Styles |
|---|---|---|---|---|
| 1 | The Bund | 1900–1941 | The historic architectural ensemble of the Bund, skyline silhouette, and the spatial configuration of the surrounding streets | European Neoclassicism, Eclecticism, Greek Revival, Gothic, Baroque, Spanish, etc. |
| 2 | Nanjing West Road | 1899–1941 | Residential and public buildings | Art Deco, Neoclassical, Spanish, Russian Classical styles, etc. |
| 3 | Yuyuan Road | 1919–1941 | Gardens, lilong residences, and educational buildings | British Gothic Revival style, British detached garden style, Neoclassical style, British countryside villa style, etc. |
| 4 | Hengshan Road–Fuxing Road | 1919–1941 | Garden houses, lilong residences, and apartments | Traditional architectural styles from Spain, Britain, France, Norway, and Germany; British classical garden and vernacular styles; British Classical and Palladian styles; French Renaissance style; Modernist style; Shikumen style, etc. |
| 5 | Shanyin Road | 1900–1941 | Revolutionary historical sites, gardens, and lilong residences | Architectural features of traditional Chinese palatial structures, British architectural style, pseudo-Japanese style, Art Deco style, British Neoclassical style, Late Renaissance style, etc. |
| 6 | The Tilan Bridge | 1901–1945 | Special-purpose buildings, lilong residences, and religious sites | Art Deco style, Jewish architectural decorative style, Queen Anne style of British architecture, etc. |

Appendix B

Table A2. Model Performance Metrics Across Perceptual Dimensions.
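| Target | Region | Samples | MSE | RMSE | MAE | R² |
|---|---|---|---|---|---|---|
| beautiful | all | 834 | 0.2594 | 0.5093 | 0.3095 | 0.8498 |
| boring | all | 834 | 0.2825 | 0.5315 | 0.3294 | 0.8172 |
| clean | all | 834 | 0.4786 | 0.6918 | 0.4736 | 0.7992 |
| depressing | all | 834 | 0.3330 | 0.5770 | 0.3743 | 0.8194 |
| lively | all | 834 | 0.3868 | 0.6219 | 0.3832 | 0.8079 |
| safe | all | 834 | 0.2606 | 0.5105 | 0.2922 | 0.8201 |
| wealthy | all | 834 | 0.1497 | 0.3869 | 0.2002 | 0.8623 |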
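As a quick consistency check on Table A2, the RMSE column should equal the square root of the MSE column up to rounding. The short sketch below copies the reported MSE and RMSE values and verifies this relationship; the tolerance `tol` and the helper name `rmse_matches` are assumptions introduced here for illustration.

```python
import math

# (MSE, RMSE) pairs copied from Table A2.
table_a2 = {
    "beautiful": (0.2594, 0.5093),
    "boring": (0.2825, 0.5315),
    "clean": (0.4786, 0.6918),
    "depressing": (0.3330, 0.5770),
    "lively": (0.3868, 0.6219),
    "safe": (0.2606, 0.5105),
    "wealthy": (0.1497, 0.3869),
}


def rmse_matches(mse: float, rmse: float, tol: float = 5e-4) -> bool:
    """Return True if rmse equals sqrt(mse) within a rounding tolerance."""
    return abs(math.sqrt(mse) - rmse) <= tol


checks = {target: rmse_matches(mse, rmse) for target, (mse, rmse) in table_a2.items()}
for target, ok in checks.items():
    print(f"{target:<11} RMSE consistent with MSE: {ok}")
```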

Appendix C

Correlation Patterns and Mechanistic Interpretation Among Environmental Variables

Commercial and service-related variables exhibit a strong degree of internal coupling, with correlation coefficients of 0.85 between shop and dining, 0.79 between leisure and dining, 0.73 between shop and leisure, and as high as 0.80 between trans and instBus. This high consistency confirms the composite spatial distribution pattern of commercial functions: dining, retail, and transport nodes frequently cluster together, forming high-frequency co-occurrence business zones. However, such strong correlations may also reflect the high-density land-use configuration characteristic of historic districts rather than genuine functional interdependence. In other words, statistical co-occurrence does not necessarily imply mechanistic coupling, a distinction that has often been highlighted in related research as a potential source of interpretive bias.
Scene- and space-composition variables exhibit a different pattern. The strongest correlation occurs between sky and roadSurf (r = 0.86), reflecting the natural co-occurrence between street openness and sky visibility. The correlations between bldComp and stFurn (r = 0.62), as well as between bldComp and sky (r = 0.62), indicate the synchronized variation among building façades, street amenities, and visual openness. This consistency partly arises from the perspective characteristics of street-view imagery: in narrow alleyways, both road and sky views contract simultaneously, which may amplify the apparent correlation strength while masking fine-grained spatial differences. The correlation between grnView and natSurf is moderate (r = 0.56), which is numerically reasonable; however, given the semantic overlap in their definitions, caution should be exercised in interpreting their relationship.
Although cross-cluster relationships are generally weak, they provide important clues for understanding the bridging mechanisms among spatial elements. The correlations between human and both dining and commAds are 0.43, indicating that human activity patterns partially correspond to the spatial distribution of commercial and advertising environments. This association may reflect the attraction effect of commerce, or it may simply represent a statistical outcome of population concentration. The correlations of stFurn with sky and roadSurf (both r = 0.45) could suggest a latent structural consistency between furniture placement and spatial openness, though such alignment may also result from the contingencies of historic spatial layouts. The correlations between trans and both sky and roadSurf are modest (approximately 0.38), yet trans acts as a bridging node within the network structure, linking the functional and scene-related clusters. Negative relationships are generally weak, appearing only between dining–grnView (−0.15) and trans–grnView (−0.12). This trend may indicate that high-green-view areas tend to have relatively limited commercial and transport activities, though it more likely reflects spatial zoning effects rather than causal conflicts.
Overall, the correlation matrix reveals two highly interrelated clusters—the commercial and service group and the scene and composition group—as well as several moderate cross-cluster connections. The most prominent co-occurrences are observed between shop–dining, leisure–dining, trans–instBus, and sky–roadSurf. Cross-cluster variables play a bridging role within the overall structure, while negative relationships remain relatively weak. These findings indicate parallel variations among environmental elements within historic districts and raise an important question: to what extent do these statistical couplings reflect actual spatial processes? The answer likely depends on a combination of factors, including imagery scale, indicator definitions, and spatial distribution characteristics. This issue should be further clarified through multi-level data validation in future studies.
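The coefficients reported in this appendix can be grouped by strength to make the cluster structure easier to inspect. The sketch below uses only the pairwise values quoted in the text; the cut-offs of 0.70 and 0.40 and the helper name `strength` are assumptions chosen for illustration, not thresholds used in the study.

```python
# Pairwise correlation coefficients quoted in Appendix C.
reported_r = {
    ("shop", "dining"): 0.85,
    ("leisure", "dining"): 0.79,
    ("shop", "leisure"): 0.73,
    ("trans", "instBus"): 0.80,
    ("sky", "roadSurf"): 0.86,
    ("bldComp", "stFurn"): 0.62,
    ("bldComp", "sky"): 0.62,
    ("grnView", "natSurf"): 0.56,
    ("human", "dining"): 0.43,
    ("human", "commAds"): 0.43,
    ("stFurn", "sky"): 0.45,
    ("stFurn", "roadSurf"): 0.45,
    ("trans", "sky"): 0.38,       # reported as "approximately 0.38"
    ("trans", "roadSurf"): 0.38,  # reported as "approximately 0.38"
    ("dining", "grnView"): -0.15,
    ("trans", "grnView"): -0.12,
}


def strength(r: float) -> str:
    """Label a coefficient by absolute size (cut-offs are illustrative assumptions)."""
    size = abs(r)
    if size >= 0.70:
        return "strong"
    if size >= 0.40:
        return "moderate"
    return "weak"


groups = {"strong": [], "moderate": [], "weak": []}
for pair, r in reported_r.items():
    groups[strength(r)].append(pair)

for label, pairs in groups.items():
    print(label, pairs)
```

Under these assumed cut-offs, the strong group contains exactly the within-cluster pairs highlighted in the text, while the cross-cluster and negative pairs fall into the moderate and weak groups.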

Appendix D

Variable Importance Across Seven Perceptual Dimensions

The beautiful dimension exhibits the strongest concentrated effect, with the contribution of grnView reaching 0.228, far exceeding all other variables and almost dominating the explanatory power of aesthetic judgment. This is followed by dining (0.135), instBus (0.110), trans (0.105), and leisure (0.100), while natSurf (0.001) shows an almost negligible influence. This unimodal structure demonstrates the overwhelming explanatory strength of visual greenery in shaping aesthetic perception, whereas the marginal effects of other factors remain relatively limited. In contrast, the lively dimension displays a more dispersed pattern of variable contributions, with multiple functional elements entering the upper tier simultaneously. Dining (0.138) and leisure (0.130) rank highest, followed by accom (0.121), instBus (0.117), and trans (0.111), while grnView (0.056) declines markedly. This suggests that the sense of urban vitality relies primarily on the combined density of service facilities and activity functions, rather than on greenery itself.
Across the safe, clean, and wealthy dimensions, the ranking of variables shows a high degree of consistency, reflecting the dominant role of order and management cues in shaping these perceptions. In the safe dimension, instBus (0.134) ranks highest, followed by shop (0.118), trans (0.114), and lifeSvc (0.113), while grnView (0.067) carries a comparatively lower weight. In the clean dimension, lifeSvc (0.152), accom (0.150), and instBus (0.147) form the core supporting structure, followed by dining (0.126), trans (0.116), and commAds (0.107). In the wealthy dimension, the top variable is again instBus (0.135), followed by accom (0.098), commAds (0.088), dining (0.085), and shop (0.084), while grnView (0.051) declines further. This structure indicates that when forming impressions of safety, cleanliness, and prosperity, individuals rely primarily on facilities, services, and consumption-related symbols as indicators of order and management, whereas natural elements exert comparatively limited explanatory power.
In the emotion-related dimensions, the variation in variable contributions appears more complex. In the boring dimension, dining (0.148) and instBus (0.135) remain highly influential, while grnView (0.119) rises again, indicating that natural elements regain explanatory power in mitigating boredom. Other variables, including accom (0.115), lifeSvc (0.110), and trans (0.109), fall within the moderate range. In contrast, within the depressing dimension, dining (0.146), trans (0.141), shop (0.125), and lifeSvc (0.125) show the highest contributions, whereas grnView (0.050) drops to the lowest level. These patterns suggest that commercial and transportation activities tend to reinforce feelings of pressure and congestion, exhibiting an opposite directional effect compared with the lively dimension.
From a cross-dimensional perspective, the importance of variables exhibits clear differentiation and context dependency. Dining consistently ranks among the top five across all seven dimensions, with contribution values ranging from 0.085 to 0.148, demonstrating the broadest explanatory range. InstBus remains highly influential in six dimensions (0.103–0.147), showing particularly strong effects in the safe and clean categories. GrnView displays the most pronounced contextual variability, peaking in beautiful (0.228), maintaining a moderate effect in boring (0.119), and stabilizing between 0.050 and 0.067 in other dimensions. LifeSvc and accom maintain relatively high values in the clean and wealthy dimensions, while commAds demonstrates selective influence, being significant only in clean (0.107) and wealthy (0.088). NatSurf shows extremely low weights across all dimensions (0–0.011), indicating a minimal contribution to explaining perceptual differences.
Overall, the variable structures across different dimensions can be categorized into three distinct patterns. The first is a unimodal structure dominated by grnView, represented by the beautiful dimension. The second is a multi-core structure driven by functional facilities, exemplified by lively. The third is a stable structure centered on order and governance cues, encompassing safe, clean, and wealthy. Emotion-related dimensions such as boring and depressing fluctuate between the first two patterns, exhibiting alternating dominance of green restoration and functional pressure. Collectively, these results complement the main-text analysis by providing more detailed quantitative evidence, supporting a multi-dimensional interpretation of the perceptual formation mechanisms within historic districts.
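The contrast between a unimodal and a multi-core importance structure can be made concrete with a simple dominance ratio, defined here as the top contribution divided by the second-highest one. The sketch below applies it to the importance values reported above for the beautiful and lively dimensions; the ratio itself, the 1.5 cut-off, and the function names are assumptions introduced for illustration rather than measures used in the study.

```python
# Importance values reported in this appendix for two dimensions.
importance = {
    "beautiful": {"grnView": 0.228, "dining": 0.135, "instBus": 0.110,
                  "trans": 0.105, "leisure": 0.100, "natSurf": 0.001},
    "lively": {"dining": 0.138, "leisure": 0.130, "accom": 0.121,
               "instBus": 0.117, "trans": 0.111, "grnView": 0.056},
}


def dominance_ratio(weights: dict) -> float:
    """Ratio of the largest contribution to the second largest."""
    ranked = sorted(weights.values(), reverse=True)
    return ranked[0] / ranked[1]


def structure(weights: dict, cutoff: float = 1.5) -> str:
    """Label a dimension as unimodal or multi-core (cut-off is an assumption)."""
    return "unimodal" if dominance_ratio(weights) >= cutoff else "multi-core"


labels = {dim: structure(weights) for dim, weights in importance.items()}
for dim, weights in importance.items():
    top = max(weights, key=weights.get)
    print(f"{dim}: top={top}, ratio={dominance_ratio(weights):.2f}, {labels[dim]}")
```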

Appendix E

Importance of Variable Features Across Dimensions: Boring, Clean and Safe

Figure A1. SHAP summary plots showing positive and negative contributions of environmental features to different perception dimensions: boring.
Figure A2. SHAP summary plots showing positive and negative contributions of environmental features to different perception dimensions: clean.
Figure A3. SHAP summary plots showing positive and negative contributions of environmental features to different perception dimensions: safe.

Appendix F

Distribution Range of SHAP Contribution Values Between Perceived Dimensions and Thresholds

Consistent with the overall perceptual mechanism emphasizing the importance of green elements, the SHAP value range for grnView in the beautiful dimension (−0.10 to +1.25) shows a typical threshold-based enhancement trend, indicating that aesthetic evaluations increase significantly once the green-view ratio reaches a moderate level. Meanwhile, dining (−0.08 to +0.95) and leisure (−0.05 to +0.70) also exhibit stable positive effects, suggesting that urban life and social activities can enhance aesthetic pleasure through atmospheric experience, rather than relying solely on natural scenery. However, when the values of instBus and trans exceed certain thresholds, their SHAP contributions shift sharply into the negative range, implying that formalized spatial configurations may erode aesthetic perception. This highlights a tension between visual pleasure and institutional interference within the aesthetic dimension.
This tension further evolves within the boring dimension. Although dining and lifeSvc initially help alleviate spatial monotony—with peak SHAP values of +1.05 and +0.65 respectively—their increasing spatial concentration and functional homogeneity ultimately reduce environmental diversity and experiential interest, demonstrating a functional saturation–perceptual decline mechanism. By contrast, grnView (−0.25 to +0.80) maintains a steady positive influence under low-stimulation contexts, indicating that natural elements, while not the strongest drivers, still play a regulatory and compensatory role when activity or commercial stimuli are lacking. This reinforces the complementary pathway between natural and social stimuli in shaping environmental perception.
Beyond aesthetics and interest, environmental cleanliness represents another distinct perceptual dimension. The variable structure of the clean dimension highlights the mechanism of order formation. LifeSvc (−0.30 to +1.40) and accom (−0.25 to +1.15) play the most prominent roles in shaping perceptions of cleanliness, indicating the importance of residential and service-related functions in sustaining visual order. However, instBus (−0.35 to +0.95) and dining (−0.28 to +0.90) exhibit a downward trend under high-density conditions, revealing a tension between order and intensity. Particularly notable is the nonlinear reversal observed in the trans variable (−0.40 to +0.85), which underscores the critical role of moderate usage in maintaining a coherent impression of cleanliness.
This perceptual pressure becomes further amplified within the depressing dimension. High-density trans and instBus emerge as the primary triggers of negative emotions, with SHAP value ranges reaching up to +0.90 and +0.85, respectively. These results suggest that excessive management-oriented and mobility-related environments may translate into experiences of psychological oppression. In contrast, dining (−0.30 to +0.85) and shop (−0.25 to +0.80) can mitigate negative emotions under moderate distributions, indicating that social and commercial functions exert a certain psychological buffering effect. However, this alleviating influence is also limited by spatial density; once commercial elements become overly concentrated, their positive effects tend to diminish.
In contrast, the lively dimension reflects a positive coupling pathway between perceived vitality and spatial sociability. Dining (−0.20 to +1.35) and leisure (−0.25 to +1.20) serve as the primary driving variables, further confirming the positive effects of high accessibility and activity density on emotional restoration and spatial attractiveness. However, instBus and trans—as formal functional facilities—display a clear downward trend once exceeding specific density thresholds, suggesting that while such elements enhance organization, they may also suppress vitality by limiting informal social interactions. This phenomenon forms a structural contrast with the depressing dimension, where the same variables contribute positively to negative emotions, highlighting the context-dependent dual effects of functional elements.
This dual characteristic is also evident in the safe dimension. InstBus (−0.20 to +1.10) and shop (−0.15 to +0.95) enhance the perception of safety by reinforcing a sense of spatial order, while lifeSvc and dining under moderate distribution conditions contribute to a “watchful presence” effect through pedestrian flow and public visibility. At the same time, the SHAP range of trans (−0.35 to +0.85) exhibits a trend of low-value promotion and high-value attenuation, suggesting that the influence of traffic flow on perceived safety may shift with increasing density. This pattern reflects a delicate balance between accessibility and surveillance. Some of these mechanisms partially overlap with those identified in the clean dimension, further supporting the interactive framework between functional density and social order.
Finally, the wealthy dimension emphasizes the constructed relationship between commercial symbolism and perceived spatial prestige. InstBus (−0.15 to +1.05) and accom (−0.20 to +0.95) form the fundamental symbolic framework, while commAds and shop (with maximum values of +0.85 and +0.75, respectively) enhance the perception of prosperity through the visual manifestation of commercial landscapes. The role of dining (−0.25 to +0.80) is more bidirectional, with its influence largely depending on the hierarchical level of business types. High-end restaurants reinforce impressions of refinement, whereas the clustering of low-end venues may diminish that perception. This pattern parallels the social-atmosphere mechanism identified in the lively dimension, underscoring the dual function of commercial spaces in shaping both emotional experience and social symbolism.
In summary, the analysis of variable ranges reveals the nonlinear logic and interactive mechanisms underlying perceptual structures in historic districts. Green and natural elements consistently enhance aesthetic and experiential enjoyment in the beautiful and boring dimensions, while consumption and social functions construct social symbolism and emotional appeal within lively and wealthy. Conversely, transportation and institutional elements exhibit dual, tension-driven effects between safe and depressing. It is precisely this interwoven multidimensional perceptual structure that forms the complex foundation of historic district experiences, providing a key reference for achieving a dynamic spatial balance among greenery, order, and safety in future urban renewal practices.
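The ranges quoted in this appendix are easier to compare when tabulated. The following sketch is not part of the study's pipeline. It transcribes the (min, max) SHAP ranges reported above for three of the dimensions and ranks the variables in each by range width. The helper names `shap_range` and `widest_variable` are illustrative, and the last lines use invented toy numbers only to show how such a range would be read off a column of per-sample SHAP values.

```python
# SHAP ranges (min, max) transcribed from the text of this appendix.
REPORTED_RANGES = {
    "beautiful": {"grnView": (-0.10, 1.25), "dining": (-0.08, 0.95),
                  "leisure": (-0.05, 0.70)},
    "clean": {"lifeSvc": (-0.30, 1.40), "accom": (-0.25, 1.15),
              "instBus": (-0.35, 0.95), "dining": (-0.28, 0.90),
              "trans": (-0.40, 0.85)},
    "lively": {"dining": (-0.20, 1.35), "leisure": (-0.25, 1.20)},
}


def shap_range(values):
    """Return (min, max) of one feature's per-sample SHAP values."""
    return min(values), max(values)


def widest_variable(ranges):
    """Return (name, width) of the variable with the widest SHAP range."""
    name = max(ranges, key=lambda k: ranges[k][1] - ranges[k][0])
    low, high = ranges[name]
    return name, round(high - low, 2)


widest = {dim: widest_variable(r) for dim, r in REPORTED_RANGES.items()}
for dim, (name, width) in widest.items():
    print(f"{dim}: widest range is {name} (width {width})")

# Toy per-sample SHAP values (illustrative only, not from the study).
toy_column = [-0.05, 0.10, 0.42, -0.02, 0.31]
toy_low, toy_high = shap_range(toy_column)
```

A wide range indicates that a variable can move the prediction strongly in at least some samples. It says nothing about where along the variable's value axis the shift occurs, which is what the partial dependence analysis addresses.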

Appendix G

Partial Dependence Plots of Key Environmental Variables for “Boring”, “Clean”, and “Wealthy” Perceptions

Figure A4. PDP analysis of environmental features across multiple perception dimensions: boring, clean, wealthy.

Appendix H

Detailed Nonlinear Threshold Patterns Across Perceptual Dimensions

The partial dependence analysis further reveals the nonlinear response patterns of built-environment elements across different perceptual dimensions, illustrating the specific sensitive ranges and saturation thresholds of individual variables. In the beautiful dimension, grnView exhibits the most representative threshold pattern. When its coverage falls below approximately 0.25–0.30, aesthetic responses remain weak; once this threshold is exceeded, aesthetic ratings increase sharply and plateau around 0.55–0.60. This indicates that the perception of natural scenery is not a linear accumulation but is instead triggered once a recognizable proportion is reached. Similarly, moderate clustering of dining and leisure (0.2–0.4) enhances visual richness, yet proportions above 0.6 induce aesthetic fatigue, reflecting a subtle balance between vitality and order. In contrast, instBus and trans show negative effects at medium-to-high proportions, confirming that intensely functional spaces are more likely to be perceived as visually cluttered or oppressive.
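For readers unfamiliar with partial dependence, a curve of this kind is obtained by fixing one variable at each value of a grid, leaving all other variables as observed, and averaging the model's predictions. A minimal sketch follows. It does not use the study's trained XGBoost model: `toy_predict` is a hypothetical stand-in whose grnView response is hard-coded to be flat below 0.25 and above 0.60, loosely echoing the pattern described above, and the two sample rows and the dining coefficient are invented.

```python
LOW, HIGH = 0.25, 0.60  # assumed activation and plateau points for the toy model


def toy_predict(row):
    """Hypothetical stand-in for a trained model (not the study's XGBoost model)."""
    activation = min(max((row["grnView"] - LOW) / (HIGH - LOW), 0.0), 1.0)
    return activation + 0.5 * row["dining"]


def partial_dependence(predict, rows, feature, grid):
    """Average prediction over all rows with `feature` forced to each grid value."""
    curve = []
    for value in grid:
        total = 0.0
        for row in rows:
            modified = dict(row)       # copy so the original row is untouched
            modified[feature] = value  # override only the feature of interest
            total += predict(modified)
        curve.append(total / len(rows))
    return curve


# Invented sample rows (illustrative only).
rows = [
    {"grnView": 0.10, "dining": 0.20},
    {"grnView": 0.50, "dining": 0.40},
]
grid = [0.10, 0.25, 0.425, 0.60, 0.80]
curve = partial_dependence(toy_predict, rows, "grnView", grid)
for value, mean_prediction in zip(grid, curve):
    print(f"grnView={value:.3f} -> mean prediction {mean_prediction:.3f}")
```

Because the other variables keep their observed values, the curve isolates the average effect of the chosen variable. Strong correlations between variables can still make some grid combinations unrealistic.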
This pattern of “moderate optimality” is inversely reflected in the boring dimension. While the beautiful dimension follows an upward curve of pleasure from low to high values, boring reveals a reverse relationship—one that shifts from richness to monotony. When everyday functional facilities are sparse, their influence is minimal; however, as the proportion of dining or lifeSvc approaches 0.3–0.35, the streetscape transitions from “empty” to “diverse”. Once instBus and trans exceed 0.5, this diversity gives way to an experience combining boredom and spatial pressure. It is worth noting that grnView continues to mitigate monotony beyond 0.4, indicating that the aesthetic effect of natural elements extends positively across perceptual dimensions—enhancing beauty while simultaneously reducing boredom.
Building on this, the clean dimension continues the previously identified order–perception logic, though in a more nuanced manner. LifeSvc and accom significantly enhance perceptions of cleanliness within the 0.2–0.3 range, yet their positive effects weaken beyond 0.6. Trans and commAds reduce perceptions of cleanliness once exceeding 0.4–0.5, suggesting that when functional or visual information becomes overloaded, the sense of order is replaced by visual fatigue.
When this saturation–reversal trend becomes further intensified, it produces the mirror effect observed in the depressing dimension. Once the proportions of instBus and trans exceed 0.5, feelings of oppression increase markedly, indicating that excessive concentrations of institutional and transportation spaces generate psychological pressure. In contrast, accom and lifeSvc provide mild buffering effects at moderate proportions (around 0.3–0.4), yet they also turn negative under high-density conditions. In other words, from beautiful to depressing, the perceptual curve shifts from a positive threshold of pleasure activation to a negative threshold of pressure accumulation, forming a complete psychological symmetry.
In contrast to this negative mirror pattern, the lively dimension reconstructs a positive perceptual equilibrium within the midrange. Dining and leisure significantly enhance liveliness between 0.3 and 0.5, peaking at 0.6–0.7 before reaching saturation. StFurn strengthens vibrancy once exceeding 0.2, though excessive presence leads to visual clutter. Meanwhile, trans becomes negative beyond 0.5, suggesting that the sense of vitality does not arise from sheer density but depends on balanced social interaction and rhythmic spatial flow. If depressing represents the psychological cost of over-functionalization, then lively embodies its optimal counterstate, together constituting a dynamic perceptual balance within historic district environments.
The other side of this equilibrium is reflected in the safe dimension. The perception of safety depends not only on functional density but also on visual order and environmental visibility. When grnView coverage exceeds 0.3, perceived safety increases significantly, suggesting that natural elements enhance the sense of order through open views and comfort. Shop and lifeSvc also produce stable positive effects within the 0.2–0.4 range but tend to plateau beyond 0.6. In contrast, instBus and commAds exhibit strong negative effects above 0.5, indicating that visual cleanliness and perceived safety share a closely aligned psychological basis. Thus, the safe dimension structurally inherits the cognitive burden pathway of clean, yet transforms it into a positive perception of social order.
Finally, the wealthy dimension extends this order–comfort continuum, representing an elevation of spatial quality to a symbolic level. CommAds and shop enhance perceptions of prestige within the 0.3–0.5 range but display diminishing or even reversed effects beyond 0.6. Accom maintains a stable positive influence between 0.4 and 0.6, whereas high-density instBus and trans weaken impressions of affluence. This pattern closely mirrors the curve observed in the beautiful dimension, suggesting that the sense of affluence psychologically stems from moderate order and visual relaxation. In other words, from safety to wealth, perceptual upgrading within historic districts depends on the stability of environmental order and the control of stimulus intensity.
In summary, the seven perceptual dimensions collectively form a continuous progression from aesthetic enhancement to functional saturation, followed by vitality equilibrium and ultimately order refinement. Their shared pattern follows a typical “low insensitivity–midrange activation–high saturation or reversal” trajectory, reflecting the nonlinear nature of environmental perception. Among the variables, grnView consistently acts as a universal positive threshold, triggering favorable perceptions once coverage exceeds approximately 30–40%. In contrast, high-density instBus and trans environments continuously evoke negative experiences, highlighting the perceptual costs of over-intensified institutional and transport spaces. Taken together, these findings indicate that optimizing historic districts should move beyond the additive accumulation of physical elements and instead pursue threshold-based balance and spatial coordination. Only by achieving such equilibrium can planners realize the dynamic integration of historical character, functional vitality, and psychological comfort within complex urban contexts.
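The insensitivity, activation, and saturation phases summarized above can be read off a partial dependence curve with a simple rule. The sketch below is one possible heuristic, not the procedure used in the study: it treats the first grid point as the baseline, reports the first grid value at which the curve rises above the baseline by more than a tolerance, and the first grid value at which it comes within that tolerance of its peak. The 5% tolerance and the curve values are invented for illustration.

```python
def activation_and_saturation(grid, curve, tol=0.05):
    """Locate where a PDP curve leaves its baseline and where it nears its peak.

    The first curve value is taken as the baseline. `tol` is the fraction of
    the baseline-to-peak span treated as noise. Returns (activation,
    saturation) grid values, or (None, None) if the curve never rises.
    """
    baseline, peak = curve[0], max(curve)
    span = peak - baseline
    if span <= 0:
        return None, None
    activation = next(x for x, y in zip(grid, curve) if y - baseline > tol * span)
    saturation = next(x for x, y in zip(grid, curve) if peak - y <= tol * span)
    return activation, saturation


# Invented curve: flat, then rising, then flat (illustrative only).
grid = [i / 10 for i in range(9)]  # 0.0, 0.1, ..., 0.8
curve = [0.0, 0.0, 0.0, 0.1, 0.4, 0.8, 1.0, 1.0, 1.0]
activation, saturation = activation_and_saturation(grid, curve)
print(activation, saturation)
```

The activation value returned is the first grid point already above the baseline, so the threshold itself lies between that point and the previous one; a finer grid narrows the bracket. For curves that reverse at high values, the rule reports only the first point near the peak, and the reversal would need a separate check.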

References

  1. Wedyan, M.; Saeidi-Rizi, F. Assessing the Impact of Urban Environments on Mental Health and Perception Using Deep Learning: A Review and Text Mining Analysis. J. Urban Health 2024, 101, 327–343. [Google Scholar] [CrossRef] [PubMed]
  2. Yao, Y.; Liang, Z.; Yuan, Z.; Liu, P.; Bie, Y.; Zhang, J.; Wang, R.; Wang, J.; Guan, Q. A Human-Machine Adversarial Scoring Framework for Urban Perception Assessment Using Street-View Images. Int. J. Geogr. Inf. Sci. 2019, 33, 2363–2384. [Google Scholar] [CrossRef]
  3. Ewing, R.; Handy, S. Measuring the Unmeasurable: Urban Design Qualities Related to Walkability. J. Urban Des. 2009, 14, 65–84. [Google Scholar] [CrossRef]
  4. Porzi, L.; Rota Bulò, S.; Lepri, B.; Ricci, E. Predicting and Understanding Urban Perception with Convolutional Neural Networks. In Proceedings of the 23rd ACM International Conference on Multimedia, Brisbane Australia, 26–30 October 2015; Association for Computing Machinery: New York, NY, USA, 2015; pp. 139–148. [Google Scholar]
  5. Cao, Y.; Yang, P.; Xu, M.; Li, M.; Li, Y.; Guo, R. A Novel Method of Urban Landscape Perception Based on Biological Vision Process. Landsc. Urban Plan. 2025, 254, 105246. [Google Scholar] [CrossRef]
  6. Farahani, M.; Razavi-Termeh, S.V.; Sadeghi-Niaraki, A.; Choi, S.-M. A Hybridization of Spatial Modeling and Deep Learning for People’s Visual Perception of Urban Landscapes. Sustainability 2023, 15, 10403. [Google Scholar] [CrossRef]
  7. Ji, H.; Qing, L.; Han, L.; Wang, Z.; Cheng, Y.; Peng, Y. A New Data-Enabled Intelligence Framework for Evaluating Urban Space Perception. ISPRS Int. J. Geo-Inf. 2021, 10, 400. [Google Scholar] [CrossRef]
  8. Luo, J.; Liu, P.; Xu, W.; Zhao, T.; Biljecki, F. A Perception-Powered Urban Digital Twin to Support Human-Centered Urban Planning and Sustainable City Development. Cities 2025, 156, 105473. [Google Scholar] [CrossRef]
  9. Zhang, J.; Yu, Z.; Li, Y.; Wang, X. Uncovering Bias in Objective Mapping and Subjective Perception of Urban Building Functionality: A Machine Learning Approach to Urban Spatial Perception. Land 2023, 12, 1322. [Google Scholar] [CrossRef]
  10. Zhang, F.; Zhou, B.; Liu, L.; Liu, Y.; Fung, H.H.; Lin, H.; Ratti, C. Measuring Human Perceptions of a Large-Scale Urban Region Using Machine Learning. Landsc. Urban Plan. 2018, 180, 148–160. [Google Scholar] [CrossRef]
  11. Lyu, Y.; Abd Malek, M.I.; Ja`afar, N.H.; Sima, Y.; Han, Z.; Liu, Z. Unveiling the Potential of Space Syntax Approach for Revitalizing Historic Urban Areas: A Case Study of Yushan Historic District, China. Front. Archit. Res. 2023, 12, 1144–1156. [Google Scholar] [CrossRef]
  12. Li, Y.; Xu, C. Empirical Research on Protection and Regeneration of Historic District in Hankou Old Settlement the Case of Li Huangpi Road Historic District in Hankou, Wuhan. In Proceedings of the 2011 International Conference on Electric Technology and Civil Engineering (ICETCE), Lushan, China, 22–24 April 2011; IEEE: New York, NY, USA, 2011; pp. 4121–4126. [Google Scholar]
  13. Ding, W.; Wei, Q.; Jin, J.; Nie, J.; Zhang, F.; Zhou, X.; Ma, Y. Research on Public Space Micro-Renewal Strategy of Historical and Cultural Blocks in Sanhe Ancient Town under Perception Quantification. Sustainability 2023, 15, 2790. [Google Scholar] [CrossRef]
  14. Ye, Y.; Zhao, T.; Shi, X.; Liu, P.; Fei, T. The Identification of Cultural Genes in Historic Districts and Their Influences on Cultural Perception—Case Study in Central Street in Harbin, China. J. Asian Archit. Build. Eng. 2025, 24, 2430–2446. [Google Scholar] [CrossRef]
  15. Zheng, S.; Zhang, J.; Zu, R.; Li, Y. Visual Perception Differences and Spatiotemporal Analysis in Commercialized Historic Streets Based on Mobile Eye Tracking: A Case Study in Nanchang Wanshou Palace, China. Buildings 2024, 14, 1899. [Google Scholar] [CrossRef]
  16. Gao, X.; Wang, H.; Zhao, J.; Wang, Y.; Li, C.; Gong, C. Visual Comfort Impact Assessment for Walking Spaces of Urban Historic District in China Based on Semantic Segmentation Algorithm. Environ. Impact Assess. Rev. 2025, 114, 107917. [Google Scholar] [CrossRef]
  17. Li, M.; Liu, J.; Lin, Y.; Xiao, L.; Zhou, J. Revitalizing Historic Districts: Identifying Built Environment Predictors for Street Vibrancy Based on Urban Sensor Data. Cities 2021, 117, 103305. [Google Scholar] [CrossRef]
  18. Xu, J.; Wang, J.; Zuo, X.; Han, X. Spatial Quality Optimization Analysis of Streets in Historical Urban Areas Based on Street View Perception and Multisource Data. J. Urban Plan. Dev. 2024, 150, 05024036. [Google Scholar] [CrossRef]
  19. Xie, Q.; Hu, L.; Wu, J.; Shan, Q.; Li, W.; Shen, K. Investigating the Influencing Factors of the Perception Experience of Historical Commercial Streets: A Case Study of Guangzhou’s Beijing Road Pedestrian Street. Buildings 2024, 14, 138. [Google Scholar] [CrossRef]
  20. Deng, Z.; Chen, D.; Qin, X.; Wang, S. Comprehensive Assessment to Residents’ Perceptions to Historic Urban Center in Megacity: A Case Study of Yuexiu District, Guangzhou, China. J. Asian Archit. Build. Eng. 2021, 20, 566–580. [Google Scholar] [CrossRef]
  21. Lynch, K. The Image of the City; MIT Press: Cambridge, MA, USA, 1964; ISBN 978-0-262-62001-7. [Google Scholar]
  22. Gao, W.; Sun, X.; Zhao, M.; Gao, Y.; Ding, H. Evaluate Human Perception of the Built Environment in the Metro Station Area. Land 2024, 13, 90. [Google Scholar] [CrossRef]
  23. Li, S.; Ma, S.; Tong, D.; Jia, Z.; Li, P.; Long, Y. Associations between the Quality of Street Space and the Attributes of the Built Environment Using Large Volumes of Street View Pictures. Environ. Plan B Urban Anal. City Sci. 2022, 49, 1197–1211. [Google Scholar] [CrossRef]
  24. Iamtrakul, P.; Chayphong, S.; Kantavat, P.; Hayashi, Y.; Kijsirikul, B.; Iwahori, Y. Exploring the Spatial Effects of Built Environment on Quality of Life Related Transportation by Integrating GIS and Deep Learning Approaches. Sustainability 2023, 15, 2785. [Google Scholar] [CrossRef]
  25. Chen, Q.; Yan, Y.; Zhang, X.; Chen, J. A Study on the Impact of Built Environment Elements on Satisfaction with Residency Whilst Considering Spatial Heterogeneity. Sustainability 2022, 14, 15011. [Google Scholar] [CrossRef]
  26. Chen, P.; Shen, Q.; Childress, S. A GPS Data-Based Analysis of Built Environment Influences on Bicyclist Route Preferences. Int. J. Sustain. Transp. 2018, 12, 218–231. [Google Scholar] [CrossRef]
  27. Herrmann-Lunecke, M.G.; Mora, R.; Vejares, P. Perception of the Built Environment and Walking in Pericentral Neighbourhoods in Santiago, Chile. Travel Behav. Soc. 2021, 23, 192–206. [Google Scholar] [CrossRef]
  28. Jiang, S.; Liu, J. Comparative Study of Cultural Landscape Perception in Historic Districts from the Perspectives of Tourists and Residents. Land 2024, 13, 353. [Google Scholar] [CrossRef]
  29. Huang, G.; Yu, Y.; Lyu, M.; Sun, D.; Zeng, Q.; Bart, D. Using Google Street View Panoramas to Investigate the Influence of Urban Coastal Street Environment on Visual Walkability. Environ. Res. Commun. 2023, 5, 065017. [Google Scholar] [CrossRef]
  30. Gebru, T.; Krause, J.; Wang, Y.; Chen, D.; Deng, J.; Aiden, E.L.; Li, F.-F. Using Deep Learning and Google Street View to Estimate the Demographic Makeup of Neighborhoods across the United States. Proc. Natl. Acad. Sci. USA 2017, 114, 13108–13113. [Google Scholar] [CrossRef]
  31. Lou, S.; Stancato, G.; Piga, B.E.A. Assessing In-Motion Urban Visual Perception: Analyzing Urban Features, Design Qualities, and People’s Perception. In Advances in Representation; Giordano, A., Russo, M., Spallone, R., Eds.; Digital Innovations in Architecture, Engineering and Construction; Springer Nature: Cham, Switzerland, 2024; pp. 691–706. ISBN 978-3-031-62962-4. [Google Scholar]
  32. Li, X.; Zhang, C.; Li, W. Does the Visibility of Greenery Increase Perceived Safety in Urban Areas? Evidence from the Place Pulse 1.0 Dataset. Int. J. Geo-Inf. 2015, 4, 1166–1183. [Google Scholar] [CrossRef]
  33. Rui, J.; Cai, C. Plausible or Misleading? Evaluating the Adaption of the Place Pulse 2.0 Dataset for Predicting Subjective Perception in Chinese Urban Landscapes. Habitat Int. 2025, 157, 103333. [Google Scholar] [CrossRef]
  34. Lansing, J.B.; Marans, R.W. Evaluation of Neighborhood Quality. J. Am. Inst. Plan. 1969, 35, 195–199. [Google Scholar] [CrossRef]
  35. He, J.; Zhang, J.; Yao, Y.; Li, X. Extracting Human Perceptions from Street View Images for Better Assessing Urban Renewal Potential. Cities 2023, 134, 104189. [Google Scholar] [CrossRef]
  36. Gong, Z.; Ma, Q.; Kan, C.; Qi, Q. Classifying Street Spaces with Street View Images for a Spatial Indicator of Urban Functions. Sustainability 2019, 11, 6424. [Google Scholar] [CrossRef]
  37. Zhao, H.; Shi, J.; Qi, X.; Wang, X.; Jia, J. Pyramid Scene Parsing Network. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA, 21–26 July 2017; pp. 2881–2890. [Google Scholar]
  38. Fang, Y.-N.; Zeng, J.; Namaiti, A. Landscape Visual Sensitivity Assessment of Historic Districts—A Case Study of Wudadao Historic District in Tianjin, China. ISPRS Int. J. Geo-Inf. 2021, 10, 175. [Google Scholar] [CrossRef]
  39. Cronbach, L.J. Coefficient Alpha and the Internal Structure of Tests. Psychometrika 1951, 16, 297–334. [Google Scholar] [CrossRef]
  40. Shao, Y.; Yin, Y.; Xue, Z.; Ma, D. Assessing and Comparing the Visual Comfort of Streets across Four Chinese Megacities Using AI-Based Image Analysis and the Perceptive Evaluation Method. Land 2023, 12, 834. [Google Scholar] [CrossRef]
  41. Wen, Z.; Zhao, J.; Li, M. A Study on the Influencing Factors of the Vitality of Street Corner Spaces in Historic Districts: The Case of Shanghai Bund Historic District. Buildings 2024, 14, 2947. [Google Scholar] [CrossRef]
  42. Lee, J.; Kim, D.; Park, J. A Machine Learning and Computer Vision Study of the Environmental Characteristics of Streetscapes That Affect Pedestrian Satisfaction. Sustainability 2022, 14, 5730. [Google Scholar] [CrossRef]
  43. Xu, H.; Jiang, Y.; Xue, T.; Wang, Z.; Fang, Y.; Huang, X. Exploring the Impact of Objective Features and Subjective Perceptions of Street Environment on Cycling Preferences. Cities 2026, 168, 106434. [Google Scholar] [CrossRef]
  44. Yao, T.; Xu, Y.; Sun, L.; Liao, P.; Wang, J. Application of Machine Learning and Multi-Dimensional Perception in Urban Spatial Quality Evaluation: A Case Study of Shanghai Underground Pedestrian Street. Land 2024, 13, 1354. [Google Scholar] [CrossRef]
  45. Pande, C.B.; Egbueri, J.C.; Costache, R.; Sidek, L.M.; Wang, Q.; Alshehri, F.; Din, N.M.; Gautam, V.K.; Chandra Pal, S. Predictive Modeling of Land Surface Temperature (LST) Based on Landsat-8 Satellite Data and Machine Learning Models for Sustainable Development. J. Clean. Prod. 2024, 444, 141035. [Google Scholar] [CrossRef]
  46. Zhu, J.; Wang, S.; Ma, H.; Shan, T.; Xu, D.; Sun, F. Nonlinear Effect of Urban Visual Environment on Residents’ Psychological Perception—An Analysis Based on XGBoost and SHAP Interpretation Model. City Environ. Interact. 2025, 27, 100202. [Google Scholar] [CrossRef]
  47. Barredo Arrieta, A.; Díaz-Rodríguez, N.; Del Ser, J.; Bennetot, A.; Tabik, S.; Barbado, A.; Garcia, S.; Gil-Lopez, S.; Molina, D.; Benjamins, R.; et al. Explainable Artificial Intelligence (XAI): Concepts, Taxonomies, Opportunities and Challenges toward Responsible AI. Inf. Fusion 2020, 58, 82–115. [Google Scholar] [CrossRef]
  48. Zeng, Q.; Gong, Z.; Wu, S.; Zhuang, C.; Li, S. Measuring Cyclists’ Subjective Perceptions of the Street Riding Environment Using K-Means SMOTE-RF Model and Street View Imagery. Int. J. Appl. Earth Obs. Geoinf. 2024, 128, 103739. [Google Scholar] [CrossRef]
  49. Rui, J. Exploring the Association between the Settlement Environment and Residents’ Positive Sentiments in Urban Villages and Formal Settlements in Shenzhen. Sustain. Cities Soc. 2023, 98, 104851. [Google Scholar] [CrossRef]
  50. Dongfang, H. Perceived Traffic Safety Assessment for Pedestrians and Cyclists in Helsinki Using Street View Imagery. Master’s Thesis, University of Helsinki, Helsinki, Finland, 2025. [Google Scholar]
  51. Greenwell, B.M. Pdp: An R Package for Constructing Partial Dependence Plots. R J. 2017, 9, 421. [Google Scholar] [CrossRef]
  52. Wu, S.; Wu, S.; Chen, J.; Pan, C. An Interpretable Machine Learning Approach to Studying Environmental Safety Perception Among Elderly Residents in Pocket Parks. Buildings 2025, 15, 3411. [Google Scholar] [CrossRef]
  53. Ogawa, Y.; Oki, T.; Zhao, C.; Sekimoto, Y.; Shimizu, C. Evaluating the Subjective Perceptions of Streetscapes Using Street-View Images. Landsc. Urban Plan. 2024, 247, 105073. [Google Scholar] [CrossRef]
  54. Syamili, M.S.; Takala, T.; Korrensalo, A.; Tuittila, E.-S. Happiness in Urban Green Spaces: A Systematic Literature Review. Urban For. Urban Green. 2023, 86, 128042. [Google Scholar] [CrossRef]
  55. Kaplan, S. The Restorative Benefits of Nature: Toward an Integrative Framework. J. Environ. Psychol. 1995, 15, 169–182. [Google Scholar] [CrossRef]
  56. Smith, N. New Globalism, New Urbanism: Gentrification as Global Urban Strategy. Antipode 2002, 34, 427–450. [Google Scholar] [CrossRef]
  57. Heilman, S.C.; MacCannell, D. The Tourist: A New Theory of the Leisure Class. Soc. Forces 1977, 55, 1104. [Google Scholar] [CrossRef]
  58. Liu, Y.; Zhang, J.; Liu, C.; Yang, Y. A Review of Attention Restoration Theory: Implications for Designing Restorative Environments. Sustainability 2024, 16, 3639. [Google Scholar] [CrossRef]
Figure 1. Study area.
Figure 2. Research workflow of this study.
Figure 3. Segmented street view images based on PSPNet.
Figure 4. Correlation heatmap.
Figure 5. Feature importance ranking of environmental factors across perception dimensions: (a) beautiful, (b) lively.
Figure 6. Feature importance ranking of environmental factors across perception dimensions: (a) safe, (b) clean, (c) wealthy, (d) boring, (e) depressing.
Figure 7. SHAP summary plots showing positive and negative contributions of environmental features to different perception dimensions: (a) beautiful, (b) lively, (c) depressing.
Figure 8. PDP analysis of environmental features for each perception dimension.
Figure 9. PDP analysis of environmental features across multiple perception dimensions: (a) lively, (b) depressing, (c) safe.
Figure 10. Perceptual balance model of historic districts.
Table 1. Definitions of the Seven Perceptual Dimensions.

| Dimension | Academic Definition |
|---|---|
| Safe | Reflects the perceived level of personal security within the environment, encompassing cues such as visibility, openness, lighting, and the perceived likelihood of crime or social disorder. |
| Lively | Captures the sense of vibrancy and activity in a place, often associated with pedestrian presence, commercial functions, social interactions, and dynamic street use. |
| Beautiful | Refers to the aesthetic quality of the environment, including visual coherence, architectural attractiveness, presence of natural elements, and overall landscape appeal. |
| Wealthy | Indicates perceptions of economic prosperity or material affluence, often inferred from building conditions, commercial types, maintenance level, and visible investments in the urban environment. |
| Depressing | Represents negative emotional responses triggered by environmental cues such as decay, neglect, lack of activity, or visual disorder, often linked to feelings of sadness or discomfort. |
| Boring | Reflects the perceived lack of stimulation, novelty, or diversity in the environment, typically associated with monotony, low activity levels, or homogenous visual forms. |
| Clean | Measures the perceived degree of environmental orderliness, hygiene, and maintenance, including the absence of litter, pollution, graffiti, and signs of physical deterioration. |
Table 2. Categories of Points of Interest.

| Code | POI Category | Sub-Category |
|---|---|---|
| 1 | Transportation Services | Transportation infrastructure |
| 2 | Accommodation Services | Accommodation |
| 3 | Travel Sight | Culture and media, Tourist attractions, Green space, Natural features |
| 4 | Institutional And Business Services | Companies and enterprises, Government agencies, Administrative divisions, Administrative landmarks |
| 5 | Life Services | Beauty and personal care, Property management, Educational institutions, Automotive services, Daily living facilities, Financial services |
| 6 | Shopping | Retail and commerce |
| 7 | Leisure And Entertainment | Recreational venues, Sports and fitness |
| 8 | Food | Catering services |
Table 3. Cronbach’s alpha internal consistency coefficient.
Item | Cronbach’s Alpha | Number of Items
POI + Micro-level visual elements | 0.729 | 19
POI | 0.789 | 8
Micro-level visual elements | 0.632 | 11
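Coefficients of the kind reported in Table 3 are conventionally obtained from the standard Cronbach's alpha formula: k/(k − 1) multiplied by one minus the ratio of the summed item variances to the variance of the total score. The following is a minimal sketch of that formula only. The function name `cronbach_alpha` and the toy data are hypothetical and are not taken from the study, which does not state in this section how the coefficient was computed or whether the indicators were standardized first.

```python
from statistics import variance


def cronbach_alpha(items):
    """Cronbach's alpha for a list of item columns.

    `items` is a list of k columns; each column holds the scores of one
    indicator across the same observations (all columns equal length).
    """
    k = len(items)
    if k < 2:
        raise ValueError("need at least two items")
    # Sum of the sample variances of the individual items.
    item_var_sum = sum(variance(col) for col in items)
    # Sample variance of the per-observation total score.
    totals = [sum(row) for row in zip(*items)]
    total_var = variance(totals)
    return k / (k - 1) * (1 - item_var_sum / total_var)


# Toy data (hypothetical, not from the study): 3 items, 5 observations.
toy = [
    [1, 2, 3, 4, 5],
    [2, 3, 4, 5, 6],
    [1, 3, 3, 5, 5],
]
alpha = cronbach_alpha(toy)
print(round(alpha, 3))
```

Because the 8 POI counts and the 11 visual-element proportions are measured on different scales, a practical analysis would probably standardize each column before calling such a function; that step is an assumption here and is left out of the sketch.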
Table 4. KMO measure of sampling adequacy and Bartlett’s test of sphericity.
KMO Measure of Sampling Adequacy | Bartlett’s Test of Sphericity: Approx. Chi-Square | Degrees of Freedom (df) | Significance (p-Value)
0.753 | 8821.612 | 171.000 | 0.000 ***
*** p < 0.001.
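The degrees of freedom in Table 4 are consistent with the 19 indicators listed in Table 3: Bartlett's test on p variables has p(p − 1)/2 degrees of freedom, and 19 × 18 / 2 = 171. The sketch below illustrates this check together with the textbook form of the test statistic, −(n − 1 − (2p + 5)/6) · ln|R|, where R is the correlation matrix. All function names, the 2 × 2 example matrix, and the value of `n_obs` are hypothetical; the study's correlation matrix and sample size are not given in this section, so the reported chi-square is not reproduced here.

```python
import math


def bartlett_df(p):
    """Degrees of freedom of Bartlett's test of sphericity for p variables."""
    return p * (p - 1) // 2


def log_det(matrix):
    """Log-determinant of a positive-definite matrix via Gaussian elimination."""
    a = [row[:] for row in matrix]  # work on a copy
    n = len(a)
    total = 0.0
    for i in range(n):
        pivot = a[i][i]
        if pivot <= 0:
            raise ValueError("matrix is not positive definite")
        total += math.log(pivot)
        # Eliminate column i from the rows below the pivot.
        for r in range(i + 1, n):
            factor = a[r][i] / pivot
            for c in range(i, n):
                a[r][c] -= factor * a[i][c]
    return total


def bartlett_chi_square(corr, n_obs):
    """Bartlett's statistic for correlation matrix `corr` and `n_obs` observations."""
    p = len(corr)
    return -(n_obs - 1 - (2 * p + 5) / 6) * log_det(corr)


print(bartlett_df(19))  # 171, matching the df column of Table 4

# Hypothetical 2 x 2 correlation matrix and sample size, for illustration only.
example_corr = [[1.0, 0.5], [0.5, 1.0]]
example_stat = bartlett_chi_square(example_corr, n_obs=100)
```

An identity correlation matrix gives a statistic of zero, which is the null hypothesis of the test; the larger the statistic, the stronger the evidence that the indicators are correlated enough for factor-based analysis.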

Wang, Z.; Zhang, W.; Huang, Y. Nonlinear Perceptual Thresholds and Trade-Offs of Visual Environment in Historic Districts: Evidence from Street View Images in Shanghai. Sustainability 2025, 17, 11075. https://doi.org/10.3390/su172411075
