Developing a Replicable ESG-Based Framework for Assessing Community Perception Using Street View Imagery and POI Data

Xie, Jingxue; Liu, Zhewei; Wang, Jue

doi:10.3390/ijgi14090338


by Jingxue Xie ^1,2, Zhewei Liu ^3 and Jue Wang ^1,3,*

^1 Department of Geography and Planning, University of Toronto, Toronto, ON M5S 1A1, Canada

^2 United Graduate School of Agricultural Science, Tokyo University of Agriculture and Technology, Tokyo 183-8509, Japan

^3 Department of Geography, Geomatics and Environment, University of Toronto Mississauga, Mississauga, ON L5L 1C6, Canada

^* Author to whom correspondence should be addressed.

ISPRS Int. J. Geo-Inf. 2025, 14(9), 338; https://doi.org/10.3390/ijgi14090338

Submission received: 11 July 2025 / Revised: 25 August 2025 / Accepted: 29 August 2025 / Published: 31 August 2025

(This article belongs to the Special Issue Spatial Information for Improved Living Spaces)


Abstract

Urban livability and sustainability are increasingly studied at the neighborhood scale, where built, social, and governance conditions shape residents’ everyday experiences. Yet existing assessment frameworks often fail to integrate subjective perceptions with multi-dimensional environmental indicators in replicable and scalable ways. To address this gap, this study develops an Environmental, Social, and Governance (ESG)-informed framework for evaluating perceived environmental quality in urban communities. Using Baidu Street View imagery—selected due to its comprehensive coverage of Chinese urban areas—and Point of Interest (POI) data, we analyze seven communities in Shenyang, China, selected for their diversity in built form and demographic context. Kernel Density Analysis and Exploratory Factor Analysis (EFA) are applied to derive latent ESG-related spatial dimensions. These are then correlated with Place Pulse 2.0 perception scores using Spearman analysis to assess subjective livability. Results show that environmental and social factors—particularly greenery visibility—are strongly associated with favorable perceptions, while governance-related indicators display weaker or context-specific relationships. The findings highlight the differentiated influence of ESG components, with environmental openness and walkability emerging as key predictors of perceived livability. By integrating pixel-level spatial features with perception metrics, the proposed framework offers a scalable and transferable tool for human-centered neighborhood evaluation, with implications for planning strategies that align with how residents experience urban environments.

Keywords: ESG framework; street-view imagery analysis; community sustainability; urban livability

1. Introduction

Over the past decade, community development has been shaped by rapid urbanization, demographic transitions, and the growing imperative of climate adaptation, resilience building, and social inclusion in cities worldwide. As cities continue to evolve, communities are increasingly recognized as vital subunits for evaluating urban sustainability and livability. This shift in perspective has prompted a broader redefinition of urban planning and management goals [1]. No longer limited to infrastructure provision and economic growth, contemporary planning agendas now place increasing emphasis on environmental protection, social equity, and transparent governance at the community scale [2,3,4].

In this context, the Environmental, Social, and Governance (ESG) framework—originally developed as a corporate sustainability evaluation tool—offers conceptual and operational advantages for community-scale analysis. Beyond its traditional role in assessing corporate risk and responsibility [5,6], ESG has evolved into a versatile analytical lens for evaluating environmental performance, social well-being, and governance capacity in urban settings [7,8]. Recent studies have applied ESG principles to urban resilience planning, social equity auditing, and environmental justice assessment, highlighting its potential to align planning strategies with community needs [9,10].

Nevertheless, the operationalization of ESG at the community level remains limited. Existing urban ESG assessments often rely on aggregate indicators such as city-wide carbon footprints or governance indices, which obscure local variation and experiential dimensions [11,12]. At the same time, studies leveraging street view imagery, POI data, or perception surveys tend to focus on isolated dimensions such as walkability, safety, or greenery [13,14]. This fragmentation reflects a broader theoretical gap: the lack of a comprehensive structure capable of integrating environmental, social, and governance factors with human-scale perceptions.

Prior studies have begun to bridge objective environmental indicators with subjective perceptions, demonstrating that greenery, walkability, and spatial openness significantly shape how residents evaluate urban quality. For instance, Helbich showed that exposure to urban greenery was positively linked with perceived mental well-being across European cities [15]. Li et al. similarly found that neighborhood vegetation and green view indices correlated with residents’ satisfaction and social cohesion [16]. Mouratidis highlighted how compact urban form and access to parks enhanced perceived liveliness but also produced trade-offs in safety [17]. Despite these advances, such studies remain fragmented and have rarely been situated within an integrated ESG framework.

In this study, the term community-scale refers to the neighborhood unit where daily activities, social interaction, and governance mechanisms intersect. Operationally, we delineate communities by a 1-km buffer around their geometric centroid, which approximates a 15-min walking radius—a benchmark increasingly used in planning practice [18,19]. This definition balances administrative boundaries with functional morphology, ensuring comparability across cases while aligning with contemporary discourses on accessibility and livability.

While numerous studies have leveraged Place Pulse or Google/Baidu Street View imagery to quantify perceptual attributes such as safety, walkability, and greenery, these efforts typically analyze single perceptual or environmental dimensions without embedding them in an integrated Environmental–Social–Governance (ESG) framework at the neighborhood scale [13,20,21,22]. By contrast, the ESG literature has expanded rapidly but remains predominantly macro-scale, emphasizing corporate or city-level indicators rather than micro-scale, perception-linked diagnostics [8,11,23]. Addressing this gap, our study bridges urban design (public-space configuration and spatial equity), computational geography (pixel-level street-view analysis and factor extraction), and community-scale planning practice (ESG-informed interpretation), thereby operationalizing ESG at the micro scale and linking it to residents’ perceptions in a unified analytical structure.

To address this gap, this study develops a replicable ESG-based framework that integrates street-level imagery and POI data for multi-dimensional community evaluation. By combining perceptual metrics with environmental and governance indicators, the framework seeks to bridge top-down ESG objectives with bottom-up lived experiences, offering a scalable and transferable tool for human-centered neighborhood assessment.

Specifically, this research aims to answer three key questions:

(1): How does ESG help build a comprehensive framework for analyzing the spatial environment around urban communities?
(2): How are these ESG-related spatial structures reflected in residents’ subjective perceptions of their neighborhoods?
(3): In what ways can a perception-informed interpretation of ESG dimensions contribute to more inclusive, responsive, and sustainable approaches to community-scale urban planning?

In response to these questions, this study introduces a novel ESG-based analytical framework that integrates pixel-level street view analysis and POI data to comprehensively evaluate diverse community environmental characteristics. Exploratory Factor Analysis (EFA) is employed to distill the most representative factors within environmental, social, and governance dimensions from numerous metrics. Subsequently, subjective perception scores derived from the Place Pulse 2.0 model are analyzed using Spearman correlation analysis to reveal intrinsic relationships between community micro-environmental characteristics and resident perceptions. Specifically, the environmental dimension addresses street-level visibility and greenery; the social dimension focuses on pedestrian activity and traffic interference; and the governance dimension evaluates infrastructure management through indicators such as sidewalk and roadway configurations.

Employing this framework, we selected seven communities in the central urban area of Shenyang, China, as case studies, taking into account their construction year, population size, and surrounding amenities to ensure the representativeness and generalizability of the research findings. The analysis reveals marked inter-community variation in green visibility, traffic density, and street configuration—spatial features that collectively shape residents’ subjective evaluations across all six Place Pulse dimensions: “Safety”, “Lively”, “Wealthy”, “Beautiful”, “Depressing”, and “Boring”. Notably, the study identifies three recurrent spatial shortcomings that negatively affect perceptual outcomes and governance performance: unbalanced pedestrian space allocation, homogeneous greening schemes, and perceptual fatigue triggered by hyper-dense commercial activity. Building on these contributions, the following section outlines the methodological design of the study.

2. Materials and Methods

The methodological design of this study is grounded in three key theoretical frameworks: the Socio-Ecological Model, the Broken Windows Theory, and Place-making Theory. Drawing on these frameworks, the study emphasizes micro-scale spatial characteristics such as walkway proportions, green view indices, and crowd aggregation patterns as key components in fostering livable and socially engaging community environments. Together, these three theories provide a robust theoretical foundation for the selection of ESG variables, the construction of factor dimensions, and the interpretation of how community-scale features influence human perception (Figure 1).

To operationalize the ESG framework at the neighborhood scale, this study first defines and explains specific indicators representing environmental, social, governance, and functional dimensions based on observable features from street-level imagery and POI data.

Environmental indicators include sky rate—a measure of visible sky area in street view imagery widely used to assess street enclosure and openness [24,25]; visual permeability—capturing unobstructed sightlines and façade openness, which is linked to perceived safety and spatial legibility [26]; and green vision rate—the proportion of visible vegetation pixels in imagery, a validated proxy for urban greenery exposure and visual ecological quality [22,27]. Social indicators include mobility traffic interference index, measuring the proportion of vehicular presence in carriageway areas, an approach adapted from prior street-level walkability and traffic safety studies [28,29]; and walkway ratio, the proportion of pedestrian path pixels relative to total street surface, following established walkability indices from street view data [13,30]. Governance indicators include relative walking width—the ratio of pedestrian path width to carriageway width, linked to equitable street space allocation [31,32]; crowd aggregation, calculated as the share of human presence in imagery, an indicator for public space use intensity and management needs [33]; and walking space ratio, the share of pedestrian areas within total visible ground surface, reflecting infrastructure inclusivity [34]. The POI-based indicator—total number of POI business types—measures functional diversity in neighborhoods, a concept rooted in urban vitality and land-use mix literature [35,36].
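The pixel-share indicators described above can be sketched as simple ratios over a segmentation mask. This is a minimal illustration, not the authors' implementation: it assumes the mask uses Cityscapes train IDs (road 0, sidewalk 1, vegetation 8, terrain 9, sky 10, person 11, vehicles 13 to 18), and the function name and the exact denominators are assumptions inferred from the verbal definitions. Visual permeability and relative walking width are omitted because they require geometric measurements rather than pixel counts.

```python
from collections import Counter

# Cityscapes train IDs (assumed label scheme of the segmentation output).
ROAD, SIDEWALK, VEGETATION, TERRAIN, SKY, PERSON = 0, 1, 8, 9, 10, 11
VEHICLES = {13, 14, 15, 16, 17, 18}  # car, truck, bus, train, motorcycle, bicycle


def safe_ratio(numerator, denominator):
    """Return numerator / denominator, or 0.0 when the denominator is zero."""
    return numerator / denominator if denominator else 0.0


def street_indicators(mask):
    """Compute pixel-share indicators from a 2-D list of class IDs (hypothetical helper)."""
    counts = Counter(label for row in mask for label in row)
    total = sum(counts.values())
    vehicles = sum(counts[c] for c in VEHICLES)
    street_surface = counts[ROAD] + counts[SIDEWALK]
    ground_surface = street_surface + counts[TERRAIN]
    return {
        "sky_rate": safe_ratio(counts[SKY], total),
        "green_vision_rate": safe_ratio(counts[VEGETATION], total),
        "crowd_aggregation": safe_ratio(counts[PERSON], total),
        "walkway_ratio": safe_ratio(counts[SIDEWALK], street_surface),
        "walking_space_ratio": safe_ratio(counts[SIDEWALK], ground_surface),
        # Assumed reading: vehicle pixels relative to the carriageway
        # (visible road pixels plus the vehicles occupying it).
        "traffic_interference": safe_ratio(vehicles, vehicles + counts[ROAD]),
    }
```

Averaging these per-image values over the four headings at each sampling point would give one value per point, which is the form needed for the later density mapping.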

To account for spatial distribution and aggregation effects, Kernel Density Analysis is applied to all ESG-related indicators prior to factor extraction. This step helps smooth local variations, reveal spatial patterns, and ensure comparability across communities.

Next, EFA is conducted to identify latent components within each ESG category and reduce multicollinearity among spatial variables. These factor scores serve as the basis for subsequent perceptual analysis.

Subjective perception data, which cover dimensions such as safety, liveliness, and beauty, are extracted using the Place Pulse 2.0 model. Spearman correlation analysis is then used to examine the relationship between ESG-based spatial factors and perception scores, revealing which environmental, social, or governance components most strongly influence residents’ lived experiences at the neighborhood scale.

2.1. Study Area

As one of China’s major northeastern cities undergoing a critical phase of post-industrial transition and urban renewal, Shenyang offers a distinctive urban context for exploring the integration of ESG-related environmental and perceptual indicators at the community level [37]. Despite its status as a provincial capital with a sizable population and extensive infrastructure, Shenyang remains underrepresented in studies that apply fine-grained spatial and perception-based analysis—especially when compared to megacities like Beijing or Shanghai [38]. This relative research gap, coupled with the city’s complex mixture of aging industrial districts, new residential zones, and evolving green infrastructure, makes it a compelling case for examining the spatial heterogeneity and livability challenges of contemporary Chinese urban communities [39].

This study focuses on seven communities in the core urban area of Shenyang, chosen according to completion year, surrounding environmental factors, and geographical location. These communities were selected because together they represent a wide range of characteristics within the city. Due to data availability constraints at the community scale, demographic variables such as age, income, and occupation could not be included, as these statistics are only reported at the district or city level in official sources.

Table 1 provides a detailed description of the seven selected communities, emphasizing their distinct spatial and functional characteristics—such as the presence of overpasses, riverside landscapes, core metro hubs, government office zones, and large-scale parks—alongside differences in completion year, residential density, and urban location, which jointly reflect variations in environmental context, infrastructural governance, and socio-spatial composition. The data were sourced from the Lianjia Real Estate Brokerage Company, a leading property information platform in China [40].

2.2. Data Sources

This study primarily utilized Baidu Street View imagery captured in January 2023, supplemented by point-of-interest (POI) data from Gaode Maps dated to 2021, and road network information sourced from OpenStreetMap. The key areas of interest are seven communities in the core urban area of Shenyang, China. In this study, a 1-km buffer zone was delineated around the geometric centroid of each selected community to define its walkable service area. This spatial unit approximates a 15-min walking radius, which has been widely adopted as a planning benchmark for neighborhood-scale accessibility and livability assessments [19,41]. The buffer serves as the analytical boundary for aggregating spatial indicators and capturing the environmental and functional context that residents routinely experience in daily life. Using ArcGIS, street coordinates were extracted at 50-m intervals within the 1-km buffer zones, as can be seen in Figure 2. These captured coordinates were then used to collect street-level imagery for review.
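The sampling step can be illustrated with a short sketch that places points every 50 m along a road polyline and keeps those inside the 1-km buffer. The study performed this step in ArcGIS; the code below is only an assumed equivalent in plain Python, with hypothetical function names, and it presumes coordinates in a projected system measured in metres.

```python
import math


def sample_along_polyline(vertices, spacing=50.0):
    """Return points every `spacing` metres along a polyline of (x, y) vertices."""
    points = [vertices[0]]
    next_at = spacing  # distance along the line at which the next point is due
    walked = 0.0       # distance covered up to the start of the current segment
    for (x0, y0), (x1, y1) in zip(vertices, vertices[1:]):
        seg_len = math.hypot(x1 - x0, y1 - y0)
        while seg_len > 0 and next_at <= walked + seg_len:
            t = (next_at - walked) / seg_len  # fractional position on this segment
            points.append((x0 + t * (x1 - x0), y0 + t * (y1 - y0)))
            next_at += spacing
        walked += seg_len
    return points


def within_buffer(points, centroid, radius=1000.0):
    """Keep only points no farther than `radius` metres from the centroid."""
    cx, cy = centroid
    return [(x, y) for x, y in points if math.hypot(x - cx, y - cy) <= radius]
```

For a straight 1200-m road starting at the community centroid, `sample_along_polyline` yields 25 points, and `within_buffer` discards those beyond the buffer radius.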

To ensure transparency and reproducibility, all datasets employed in this study are summarized in Table 2, including the specific study objects, acquisition periods, and exact sources. This structured overview clarifies the provenance of each dataset used in the ESG-based framework, covering both primary spatial layers and supplementary attribute information.

2.3. Methodology

2.3.1. Evaluation Framework

To construct a multidimensional evaluation framework tailored to community-scale ESG analysis, this study draws upon established theoretical perspectives that link urban form with individual perceptions and behaviors. First, the Socio-Ecological Model posits that individual behavior is shaped by interactions across multiple environmental levels, including physical settings, social systems, and institutional structures [42]. This model provides a conceptual foundation for the ESG-based framework adopted in this study, enabling a multidimensional analysis of how street-level features such as green visibility, pedestrian density, and the spatial configuration of roads affect residents’ perceptions and social behaviors. Second, the Broken Windows Theory emphasizes how signs of physical disorder in the community environment (e.g., deteriorating infrastructure, chaotic traffic, or spatial imbalance) can lead to negative perceptions of safety and governance, thereby undermining residents’ sense of satisfaction and community cohesion [43]. This perspective informs the operationalization of governance-related indicators in the study, such as the “road extremization factor,” and underpins the use of Spearman correlation analysis to assess their associations with perceived safety and governance efficiency. Lastly, Place-making Theory highlights the role of design interventions and citizen participation in shaping public spaces that are vibrant, inclusive, and supportive of place identity [44].

The theoretical basis of the study combines principles of urban design with pedestrians’ subjective perceptions of the environment. Drawing on the Place Pulse 2.0 framework, we translated subjective psychological perceptions of the street environment into quantifiable dimensions. These were classified according to the ESG framework, as shown in Table 3, with each dimension contributing differently to perceived environmental quality.

In this study, “crowd aggregation” is classified under the governance dimension rather than the social dimension because it functions as an indicator of public space management and spatial equity, both of which are core concerns in urban governance. While crowd density reflects the level of social interaction, persistent concentration in specific public areas often signals governance-related issues such as unequal distribution of pedestrian infrastructure, imbalanced allocation of open space, or insufficient regulatory oversight of public realm use. Prior urban governance scholarship highlights that the organization, accessibility, and safety of public spaces are shaped not only by social behaviors but also by the capacity of governance systems to manage spatial resources effectively and equitably [45,46]. From this perspective, crowd aggregation serves as a proxy for evaluating how well a community’s governance structure ensures balanced and inclusive use of shared spaces.

This study draws upon previous research on environmental assessment in urban studies to systematically define and calculate ESG-related environmental indicators. These indicators are integrated into the proposed ESG framework as ESG-mapped dimensions, each further characterized by its underlying interpretive mechanism. They were selected based on their interpretability and applicability as demonstrated in prior perception-based and spatial analysis studies [13,21,30].

2.3.2. Image Segmentation

Image segmentation plays a foundational role in urban informatics by enabling the extraction of meaningful spatial features from street view imagery. Each segment represents a specific feature of the urban environment. While image segmentation can be performed using methods ranging from traditional thresholding and edge detection to advanced CNN-based models, this study adopts deep learning techniques due to their superior ability to capture complex spatial features and object boundaries in street-level imagery [47]. Specifically, we employed the Pyramid Scene Parsing Network (PSPNet), an advanced CNN architecture designed for semantic segmentation [47]. When paired with the Cityscapes Dataset, this architecture allows for fine-grained classification of urban features (such as roads, buildings, and vehicles), achieving high accuracy in dense urban image segmentation and supporting a refined level of spatial analysis [48].

In this study, we gathered 35,588 street view images from 8897 sampling points using Baidu Street View. We set the vertical angle to 0 degrees and captured images in four directions with heading = 0, 90, 180, 270 at each sampling point. Figure 3 below is an example of the visualization we used in our study.
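The relationship between sampling points and images (8897 points × 4 headings = 35,588 images) can be made explicit with a small sketch that enumerates one capture task per point and heading. The dictionary keys are hypothetical field names chosen for illustration; they are not parameters of the Baidu Street View service.

```python
from itertools import product

HEADINGS = (0, 90, 180, 270)  # four horizontal viewing directions, in degrees
PITCH = 0                     # vertical angle reported in the text


def capture_tasks(sampling_points):
    """Pair every (point_id, lon, lat) with each heading; one task equals one image."""
    return [
        {"point_id": pid, "lon": lon, "lat": lat, "heading": heading, "pitch": PITCH}
        for (pid, lon, lat), heading in product(sampling_points, HEADINGS)
    ]
```

Keeping the point identifier in every task makes it straightforward to average the four directional images back to a single value per sampling point.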

To assess the model’s applicability to the Shenyang context, we relied on PSPNet pre-trained on the Cityscapes dataset without additional fine-tuning. All Baidu Street View images were preprocessed by resizing to 1024 × 512 pixels, normalizing to the ImageNet mean and standard deviation, and center-cropping to maintain consistent framing across headings. PSPNet achieved a mean Intersection over Union (mIoU) of 85.4% on the PASCAL VOC 2012 dataset and 80.2% on the Cityscapes dataset [47,48]. These official benchmarks provide a reliable reference, and our local validation confirmed that segmentation accuracy was sufficient to support the extraction of ESG-related spatial indicators in the Shenyang case study. Future work will incorporate systematic fine-tuning and quantitative evaluation (mIoU, pixel accuracy) on manually labeled subsets of Baidu Street View imagery.
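The two evaluation metrics named above can be computed as follows once a manually labeled subset is available. This is a generic sketch of the standard definitions, with a hypothetical function name; it takes flattened sequences of class IDs and is not tied to any particular segmentation library.

```python
def segmentation_scores(true_labels, pred_labels, num_classes):
    """Return (pixel_accuracy, mean_iou) for two flat sequences of class IDs."""
    intersection = [0] * num_classes
    true_count = [0] * num_classes
    pred_count = [0] * num_classes
    correct = 0
    for t, p in zip(true_labels, pred_labels):
        true_count[t] += 1
        pred_count[p] += 1
        if t == p:
            intersection[t] += 1
            correct += 1
    ious = []
    for c in range(num_classes):
        union = true_count[c] + pred_count[c] - intersection[c]
        if union:  # skip classes absent from both prediction and ground truth
            ious.append(intersection[c] / union)
    pixel_accuracy = correct / len(true_labels)
    mean_iou = sum(ious) / len(ious)
    return pixel_accuracy, mean_iou
```

Pixel accuracy rewards getting large classes such as road and sky right, whereas mIoU weights every class equally, so small classes such as persons affect it more strongly.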

2.3.3. Kernel Density Analysis

We extracted the raw data needed for this study: the proportion of different features in each street view image. Each segmented category was matched to analytically relevant ESG indicators (Table 3). To capture the spatial distribution patterns of these features, we employed kernel density analysis, which is well suited to the inherently complex and heterogeneous spatial structure of urban environments. This method identifies local intensity variations and spatial clustering, offering a clearer understanding of how specific urban features are distributed across the study area [49]. Applied here, it reveals the complex spatial distribution of specific environmental features across a variety of urban settings [50].

Kernel Density Analysis was implemented using the ArcGIS Pro “Kernel Density” tool. We applied the default quartic kernel function, and the search radius (bandwidth) was set automatically by the software according to the spatial distribution of the input point data. No manual adjustment of the smoothing factor was introduced in this study, ensuring that results are fully reproducible using ArcGIS’s default settings. Future studies may further test alternative bandwidths to evaluate the robustness of clustering patterns.
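For readers without ArcGIS, the quartic kernel can be sketched directly. The function below uses the standard two-dimensional quartic (biweight) kernel, K(u) = (3/π)(1 − u²)² for u < 1, which integrates to one over the unit disk. It is an illustrative re-implementation under stated assumptions, not the ArcGIS tool itself: the search radius is passed explicitly rather than derived automatically, and using the indicator value at each sampling point as the weight is an assumption.

```python
import math


def quartic_density(x, y, points, radius, weights=None):
    """Quartic-kernel density at (x, y) from weighted points within `radius`."""
    if weights is None:
        weights = [1.0] * len(points)
    total = 0.0
    for (px, py), w in zip(points, weights):
        dist = math.hypot(px - x, py - y)
        if dist < radius:  # points at or beyond the radius contribute nothing
            total += (3.0 / math.pi) * w * (1.0 - (dist / radius) ** 2) ** 2
    return total / radius ** 2
```

Evaluating this function on a regular grid of cell centres produces a density surface; repeating the calculation with several radii is one way to carry out the bandwidth robustness check mentioned above.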

2.3.4. Factor Analysis and Place Pulse Perception Scoring

EFA is used here to identify and extract the major latent factors within the ESG dimensions of the selected communities. The method is especially suited to investigating data structures when the underlying factors are not predetermined, enabling common latent traits to be identified efficiently.

After extraction, the factors were rotated with the Varimax method, which maximizes the variance of the squared loadings so that each observed variable relates clearly to one factor. This rotation highlighted the distinct ESG-related dimensions and their individual contributions to the community environment.
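To make the rotation step concrete, the sketch below applies Varimax to the two-factor case, where the optimal angle has a closed form (Kaiser's planar rotation). It is a minimal illustration under stated assumptions: it uses raw Varimax without Kaiser normalization, handles exactly two factors, and the function name is hypothetical. The source does not state which software or normalization option was used.

```python
import math


def varimax_two_factors(loadings):
    """Rotate a list of (x, y) loading pairs by the raw varimax angle."""
    p = len(loadings)
    u = [x * x - y * y for x, y in loadings]
    v = [2.0 * x * y for x, y in loadings]
    a, b = sum(u), sum(v)
    c = sum(ui * ui - vi * vi for ui, vi in zip(u, v))
    d = 2.0 * sum(ui * vi for ui, vi in zip(u, v))
    # Closed-form angle that maximizes the variance of squared loadings.
    phi = 0.25 * math.atan2(d - 2.0 * a * b / p, c - (a * a - b * b) / p)
    cos_phi, sin_phi = math.cos(phi), math.sin(phi)
    rotated = [(x * cos_phi + y * sin_phi, -x * sin_phi + y * cos_phi)
               for x, y in loadings]
    return rotated, phi
```

Because the rotation is orthogonal, each variable's communality (the sum of its squared loadings) is unchanged; only the distribution of loadings between the two factors shifts toward a simpler structure.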

We use ResNet50 [51,52], a deep convolutional neural network, and the Place Pulse 2.0 dataset to calculate the perceptual scores from street view images. Place Pulse 2.0, developed at MIT, contains over 100,000 images from 56 global regions rated by participants across six perceptual dimensions (e.g., safety, liveliness, beauty) via a web-based crowdsourcing platform. This approach enables reliable perception predictions, supporting analysis of how ESG factors shape community perceptions [20].

3. Results

3.1. ESG Factors Distribution via Kernel Density Analysis

Data derived from image segmentation at the street view sampling points (specifically those corresponding to the Secondary Dimension indicators in Table 3, such as sky rate, green vision rate, and walkway ratio) were visualized and analyzed in ArcGIS as georeferenced point features to examine their spatial distribution across communities. Kernel Density Analysis was applied to map the results for each dimension and show their spatial distribution in detail (Figure 4, Figure 5 and Figure 6). This approach emphasizes intra-dimensional variations and reveals spatial patterns of ESG-related indicators across the studied communities. The kernel density analysis results show that features such as walkway ratio and green vision rate tend to cluster around riverside areas and large parks, whereas traffic interference and crowding indicators are more concentrated near metro hubs and commercial centers. These spatial insights contribute to the subsequent discussion by illustrating how the distribution of environmental and social features may influence residents’ perceptual experiences and local livability.

These spatial distribution patterns also reflect the influence of physical layout on social dynamics. For example, communities located adjacent to rivers or large parks (e.g., C and E) exhibited concentrated clusters of greenery and walkability, supporting pedestrian interaction and community cohesion. By contrast, commercial and metro-centered communities (e.g., B and F) showed pronounced clustering of traffic interference and crowd aggregation, creating vibrant but sometimes stressful conditions for everyday social interaction. This comparison illustrates how proximity to ecological or commercial nodes not only structures environmental quality but also shapes the lived dynamics of urban communities.

3.2. Exploratory Factor Analysis of Community-Level ESG Factors

To reduce dimensionality and identify latent structures among the numerous ESG-related indicators, factor analysis was conducted. This technique groups highly correlated variables into underlying composite factors, thereby simplifying the complexity of multivariate data and facilitating interpretation across dimensions. First, the adequacy of the dataset for factor analysis was checked. The Kaiser-Meyer-Olkin (KMO) statistic, which evaluates the proportion of variance among variables that may be common variance, was 0.686. Because this value exceeds the commonly accepted threshold of 0.6, the sample is adequate for factor analysis [53]. Bartlett’s test of sphericity [54] returned a p-value below 0.05, indicating that the correlations among the indicators are large enough for factor analysis.
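For reference, the two adequacy checks can be written in their standard textbook form. The notation is introduced here for illustration and does not appear in the original: r_ij are the pairwise correlations between indicators, a_ij the corresponding partial correlations, R the correlation matrix, n the number of observations, and p the number of indicators.

```latex
\mathrm{KMO} = \frac{\sum_{i \neq j} r_{ij}^{2}}
                    {\sum_{i \neq j} r_{ij}^{2} + \sum_{i \neq j} a_{ij}^{2}},
\qquad
\chi^{2} = -\left(n - 1 - \frac{2p + 5}{6}\right) \ln \lvert R \rvert,
\qquad
\mathrm{df} = \frac{p(p - 1)}{2}
```

KMO approaches 1 when partial correlations are small relative to the raw correlations, meaning most shared variance is common rather than pairwise. Bartlett's statistic grows as the determinant of R moves away from 1, that is, as the correlation matrix departs from an identity matrix.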

In factor analysis, factor loadings represent the correlation coefficients between observed variables and underlying latent factors [55]. To improve the interpretability of the factor structure, rotation techniques such as Varimax (orthogonal) or Promax (oblique) are often applied after factor extraction. These rotations redistribute the factor loadings without altering the underlying solution, aiming to simplify the structure so that each variable loads strongly onto one factor and weakly onto others. Rotated factor loadings therefore clarify the contribution of each variable to specific factors, facilitating clearer interpretation and labeling of latent constructs [56].

For the Environmental dimension (Table 4 and Table 5), Factor 1 is designated the Environmental Openness Factor (E Factor 1). It is strongly positively correlated with Sky Rate (0.962) and Visual Permeability (0.876), suggesting that this factor essentially measures the openness of the sky and visual permeability within a community. Conversely, its negative correlations with POI Business Completeness (−0.925) and Total Number of POIs (−0.957) suggest that denser commercial facilities are associated with lower environmental openness.

In contrast, Factor 2 is highly positively correlated with the Green Vision Rate (0.972), which reflects the extent of green vegetation within the community and the visible presence of trees or other natural elements in street views. This factor is therefore named the Green Ecological Factor (E Factor 2), since it captures the presence and visibility of natural greenery and its contribution to environmental quality in the urban environment.

As indicated by the factor analysis results for the Social dimension in Table 6 and Table 7, Factor 1 is a Traffic and Commercial Density Factor (S Factor 1). Its scores denote the degree of traffic interference and the density of commercial functions within a community, reflecting the effect of transportation and commercial facilities on the social environment.

Factor 2 reflects walkable community environments. A higher walkway ratio indicates more pedestrian facilities, better walkability, and more opportunities for social interaction. Factor 2 is therefore termed the Walkability Factor (S Factor 2), since it addresses a pedestrian-friendly and socially conducive environment.

According to the factor analysis results for the Governance dimension in Table 8 and Table 9, Factor 1 is labeled the Crowd Aggregation Factor (G Factor 1). It measures the level of crowding and the concentration of commercial facilities in the community. Higher scores indicate more severe crowding with greater commercial activity and higher foot traffic.

Factor 2 represents the road extremization phenomenon, whereby pedestrian space and motor vehicle lanes show an inverse relationship. A high score therefore indicates a pronounced imbalance in the distribution of pedestrian space: smaller streets have very little pedestrian area, while large roads have disproportionately large open spaces for pedestrians. This inequality creates differences in the walking environment throughout the community. The factor is accordingly identified as the Road Extremization Factor (G Factor 2), which forms the basis for assessing equity in road space distribution related to walkability.

The eigenvalues, explained variance, and factor loadings with weights for the Environmental, Social, and Governance dimensions are summarized in Table 10. These results clarify the contribution of each factor within its respective ESG dimension and provide a transparent basis for the subsequent correlation analysis linking ESG factors to Place Pulse perceptions. It should be noted that the EFA was conducted solely on ESG-related indicators, and not on perception scores. The explained variance values therefore represent the proportion of variance captured within the ESG indicator sets in Table 10. The subsequent Spearman correlation analysis links these ESG factors to Place Pulse perception outcomes such as safety and liveliness.

3.3. Multidimensional Perception Analysis Using Place Pulse 2.0

A multidimensional perception analysis of street view images from seven communities in Shenyang was conducted using the Place Pulse 2.0 model, which evaluates each image based on six perception dimensions: “Safety”, “Lively”, “Wealthy”, “Beautiful”, “Depressing”, and “Boring”.

As shown in Figure 7, the boxplots reveal significant variation in perceptual scores across the seven communities, indicating that residents’ cognitive responses to the built environment are far from homogeneous. For instance, the mean Place Pulse scores for the “Lively” dimension range from over 80 in Community B and Community F to below 50 in Community E, suggesting substantial inter-community disparity in perceived liveliness. Similarly, standard deviations within individual communities remain high, exceeding 20 in most cases, which implies pronounced internal heterogeneity. These results highlight both spatial inequality in urban perception and micro-environmental fragmentation that may not be captured through aggregated indicators alone.
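The community-level means and standard deviations discussed above can be computed from image-level scores along the following lines. This is a sketch only: the function name, the data layout (a mapping from community label to a list of scores), and the example values are assumptions for illustration, not data from the study.

```python
from statistics import mean, stdev


def summarize_scores(scores_by_community):
    """Return {community: (mean, sample standard deviation)} for one dimension."""
    return {
        name: (mean(scores), stdev(scores))
        for name, scores in scores_by_community.items()
        if len(scores) >= 2  # the sample standard deviation needs two or more images
    }


# Hypothetical image-level "Lively" scores; not values from the study.
lively = {"B": [90.0, 70.0, 80.0], "E": [20.0, 60.0, 40.0]}
summary = summarize_scores(lively)
```

Comparing the means across communities addresses inter-community disparity, while the within-community standard deviation addresses internal heterogeneity.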

Such divergence in subjective evaluation points to potential socio-spatial mechanisms shaping the lived experience, possibly driven by contextual factors such as traffic density, pedestrian accessibility, or land-use diversity. These hypotheses will be further investigated in the subsequent correlation analysis section, which examines how ESG-related spatial features relate to perceptual outcomes.

3.4. Spearman Correlation Coefficient Analysis

After obtaining the perception scores, the six perceptual dimensions were combined with the latent factors extracted through factor analysis, enabling an integrated view of how objective urban features align with subjective perceptions. Spearman correlation coefficients were then computed to examine the direction and strength of association between these variables (Figure 8). The results reveal several notable relationships. For example, S Factor 2 (Walkability) shows a strong positive correlation with G Factor 2 (Road Extremization), suggesting that communities with more balanced pedestrian infrastructure tend to exhibit better governance in terms of equitable road allocation. This finding implies a notable connection between walkability and spatial justice in street design.
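For readers unfamiliar with the statistic, Spearman's coefficient is the Pearson correlation of the rank-transformed variables. The sketch below is a self-contained illustration with tied values given average ranks; the function names are hypothetical, and it is not the software actually used in the study.

```python
from math import sqrt


def average_ranks(values):
    """Rank values from 1 to n, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman's rho: the Pearson correlation of the two rank vectors."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        raise ValueError("rho is undefined when one variable is constant")
    return cov / sqrt(var_x * var_y)
```

Because only ranks enter the calculation, the coefficient captures monotonic association and does not assume that factor scores and perception scores are linearly related.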

In addition, E Factor 2 (Green Ecological Quality) is positively correlated with both “More Beautiful” and “More Depressing” perceptions. This dual effect may indicate that while greenery enhances aesthetic appeal, excessive or poorly integrated green space may also induce feelings of monotony or isolation. Another key insight is the moderate positive correlation between S Factor 1 (Traffic and Commercial Density) and “More Boring”, reflecting the paradoxical role of dense traffic and commercial zones in both increasing urban vibrancy and diminishing perceptual diversity. Taken together, these findings underscore the complex, and sometimes counterintuitive, ways in which spatial design influences subjective experiences—highlighting the importance of context-sensitive ESG strategies that balance functionality, aesthetics, and emotional impact in future community planning.

4. Discussion

4.1. Identification of Latent Urban Structure Through Factor Analysis

This study used factor analysis to examine the environmental, social, and governance dimensions of seven communities in Shenyang, and found that building layout, green infrastructure, traffic flow characteristics, and governance mechanisms exert complex influences on perception and on the sustainable development of the communities (Figure 9). In this research, the Environmental dimension refers to the features that affect physical openness and ecological quality, such as sky visibility, vegetation coverage, and view corridors. The Social dimension focuses on elements that influence pedestrian experience and social interaction, such as walkway availability and traffic interference. The Governance dimension relates to spatial equity and management efficiency, reflected in road allocation balance, commercial crowding, and the distribution of walkable space. These dimensions, as operationalized through factor analysis, form the empirical basis for interpreting the urban qualities of each community in the subsequent sections.

4.1.1. Factor Analysis of Environmental Dimension in ESG

Our results indicate that building density, green space design, and geographic location are key factors that significantly affect the environmental openness and green ecological quality of the communities. Highly built-up communities, such as Community A and Community F (Table 1, Figure 2), had low environmental openness because overpasses and high commercial density create visual obstruction; the high commercial density is reflected in their significantly higher number of POIs compared with other communities. This echoes previous research showing that higher building densities suppress the visual permeability of an area [57].

Conversely, riverside communities like C and E (Table 1, Figure 2) demonstrated better results in green ecological indicators, owing to both higher green coverage and natural scenic views. However, Community G, despite being adjacent to a large park, failed to integrate external natural resources into its internal design, falling short in leveraging its ecological potential. These findings suggest that proximity to nature alone is insufficient: internal spatial planning must actively incorporate and extend ecological assets to unlock their full environmental potential. Urban planners should prioritize integrated green design, both within community boundaries and in connection with surrounding ecological infrastructure, to enhance perceived environmental quality and long-term sustainability.

4.1.2. Factor Analysis of Social Dimension in ESG

Traffic and commercial densities act as a double-edged sword for the social dimension: while they introduce vibrancy, they also introduce challenges. For example, communities B and F are central metro hubs and commercial centers, and they are highly socially vibrant. However, the dense commercial facilities and heavy traffic interference degrade the pedestrian experience. This supports previous research reporting that highly dense areas often have poor walkability [29].

In contrast, well-designed walkways and adequate pedestrian spaces make communities D and G convenient and conducive to social interaction among residents. This again affirms the important role that pedestrian-friendly space plays in strengthening the social dimension of communities [58]. Optimizing the social dimension thus calls for the right balance between commercial density and the design of pedestrian space, so as to meet residents’ needs for both access and comfort. Effective social design must go beyond promoting activity density and instead optimize the spatial configuration of walkways and interaction zones to foster positive and equitable social experiences.

4.1.3. Factor Analysis of Governance Dimension in ESG

The governance dimension was most affected by the interaction between road allocation and population concentration. Communities A and F, characterized by road extremization—an imbalance favoring motor vehicle lanes over pedestrian zones—experienced weaker governance performance in terms of spatial equity and mobility management. In contrast, communities C and D exhibited more balanced road layouts, contributing to better pedestrian experiences and perceived governance efficiency.

Notably, Community B presented a governance paradox: despite high crowd density and commercial activity, it struggled with space management, underscoring the complexity of governing hyper-dense environments. This supports prior work identifying crowd aggregation and spatial pressure as major governance challenges in dense urban contexts [45]. Strengthening governance in urban neighborhoods requires proactive management of spatial equity, including balanced allocation of road space and strategic control of crowding. Our findings highlight the need for fine-grained, design-sensitive governance tools rather than one-size-fits-all approaches. In practical terms, road extremization in communities A and F manifests in narrow, congested sidewalks where pedestrian flows are frequently obstructed by parked bicycles or informal vendor stalls, forcing walkers into carriageways and raising safety risks. Conversely, in the high-density commercial hub of Community B, crowd aggregation is visible in overcrowded intersections and metro exits, where large volumes of foot traffic converge with limited spatial management, leading to congestion, longer waiting times, and reduced comfort in daily mobility. These examples illustrate how governance shortcomings directly affect everyday experiences of safety, accessibility, and spatial equity in the studied communities.

4.2. Linking ESG-Based Spatial Factors with Subjective Perceptions

This section explores how latent ESG-based spatial factors—derived through factor analysis—correlate with residents’ subjective perceptions of their communities, using six evaluative dimensions from the Place Pulse 2.0 model. By applying correlation analysis, we identify meaningful associations that shed light on the mechanisms through which specific urban forms shape perceived environmental quality and livability.

A prominent finding is the strong positive correlation between S Factor 2 (Walkability) and G Factor 2 (Road Extremization), indicating that imbalanced road allocation—where motor vehicle lanes dominate over pedestrian space—is directly linked to reduced walkability and governance inefficiency. The rapid growth of motor vehicles has disrupted pedestrian networks and made city streets less friendly for pedestrians [32]. Communities with poorly proportioned pedestrian environments tend to perform worse in governance-related dimensions, particularly in delivering equitable and accessible mobility. This suggests that walkability should be reframed not merely as a mobility concern, but as a governance issue essential to inclusive urban design.

In terms of greenery, the results reveal a nuanced pattern. E Factor 2 (Green Ecological Quality) shows a dual relationship: it is positively correlated with both “More Beautiful” and “More Depressing” perceptions. Existing studies consistently report positive effects of urban greenery on emotional wellbeing. For instance, urban greenery has been associated with significantly improved emotional wellbeing, supporting relaxation and stress relief among urban residents [59]. Green views from a residence have likewise been positively correlated with community satisfaction, which implies that merely viewing greenery can enhance emotional wellbeing by increasing satisfaction with one’s living environment [60]. The dual relationship observed here reflects the ambivalent psychological effects of green space: while it enhances aesthetic value and may promote calmness, overly uniform or isolated greenery may also induce feelings of dullness or emotional detachment. This echoes recent findings that vegetation affects both ecological functioning and behavioral responses, including crime avoidance and stress reduction [33]. The implication is clear: the design of urban green spaces must go beyond quantity to consider spatial integration, diversity, and emotional resonance. It should be noted that this dual relationship was observed as a general pattern across all communities in our sample, rather than stratified by demographic characteristics. Due to the absence of community-level demographic data (e.g., age structure or socioeconomic composition), we were unable to examine subgroup variations. Future studies integrating such data could reveal whether different population groups perceive greenery differently, thereby offering more targeted guidance for urban planners.

Another revealing correlation is between S Factor 1 (Traffic and Commercial Density) and both G Factor 1 (Crowd Aggregation) and the perception dimensions “Livelier” and “More Boring”. This paradox illustrates a core challenge in urban planning: high density and commercial vibrancy can enhance energy and activity [61], yet without adequate spatial variety and leisure infrastructure, such environments may be perceived as monotonous. Urban vibrancy, in other words, does not automatically translate to experiential richness. Spatial diversity and informal social spaces must be designed in tandem with commercial hubs to avoid perceptual fatigue.

Lastly, E Factor 1 (Environmental Openness) was negatively associated with perceptions of safety and liveliness. Contrary to findings in European contexts where openness is positively received [62], our results suggest that in high-density East Asian cities, expansive but poorly programmed spaces may create discomfort or insecurity [63]. This points to a perceptual mismatch: increasing visual openness without corresponding social activation may reduce rather than enhance community vitality.

In sum, the discussion in the previous sections underscores a central argument of this study: urban form and perceptual quality are linked through multidimensional trade-offs, not linear relationships. ESG-based spatial features, especially walkability, greenery, and commercial density, exert complex and sometimes conflicting effects on how communities are perceived. Designing livable cities thus requires not only physical improvements but also a careful calibration of spatial equity, sensory experience, and psychological comfort.

4.3. Policy Implications and Urban Renewal Strategies

The findings of this study have direct implications for community-scale urban policy and renewal initiatives in rapidly transforming cities such as Shenyang. First, enhancing walkability should be prioritized as both a mobility and governance goal. This can be achieved by reallocating street space to expand pedestrian zones, retrofitting narrow sidewalks, and introducing traffic-calming measures to mitigate vehicular interference. International best practices, such as complete street design [31], can be adapted to local contexts to balance motorized and non-motorized mobility.

Second, green space integration must move beyond quantity-based targets toward design approaches that diversify vegetation types, improve spatial connectivity, and incorporate greenery into street-level infrastructure. Linear green corridors, pocket parks, and green façades can be deployed to maximize visual permeability while avoiding the monotony that emerged as a potential drawback in our results.

Third, commercial and social vibrancy management should be incorporated into neighborhood revitalization plans. High-density commercial areas benefit from programmed public spaces—such as small plazas, shaded seating, and cultural amenities—that counteract perceptual fatigue and enrich experiential diversity. Urban design guidelines should include a diversity index for land-use mix to ensure that vibrancy does not translate into visual or functional monotony.
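One possible form for such a diversity index is the normalized Shannon entropy of POI categories within a neighborhood. The sketch below is an assumption offered for illustration: the study does not specify an index, the function name is hypothetical, and normalizing by the number of observed categories (rather than a fixed category list) is a design choice that a guideline would need to settle.

```python
from collections import Counter
from math import log


def land_use_mix(categories):
    """Normalized Shannon entropy of category labels, between 0 and 1.

    0 means a single category (or no data); 1 means all observed
    categories are equally represented.
    """
    counts = Counter(categories)
    k = len(counts)
    if k < 2:
        return 0.0
    n = sum(counts.values())
    entropy = -sum((c / n) * log(c / n) for c in counts.values())
    return entropy / log(k)  # divide by the maximum entropy for k categories


# Hypothetical POI category labels for two neighborhoods.
mono = land_use_mix(["retail"] * 4)
mixed = land_use_mix(["retail", "park", "school", "clinic"])
skewed = land_use_mix(["retail", "retail", "retail", "park"])
```

Under this definition, a commercial hub dominated by one POI type scores low even when its total POI count is high, which is the distinction between vibrancy and variety drawn above.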

Fourth, data-driven governance can leverage ESG-based spatial indicators to target resource allocation and monitor policy outcomes. Routine integration of street view imagery and POI analytics into municipal planning systems can provide up-to-date, fine-grained diagnostics of community conditions, enabling rapid response to emerging issues.

Finally, participatory planning mechanisms should be embedded in community renewal projects. By aligning top-down ESG objectives with residents’ lived experiences—captured through perception surveys and digital engagement platforms—policy interventions can achieve greater legitimacy, equity, and long-term sustainability.

Collectively, these strategies provide a roadmap for translating the ESG-based analytical framework into actionable urban policy. Their implementation requires cross-sectoral coordination between municipal agencies, private developers, and community organizations, supported by transparent monitoring and iterative feedback loops.

4.4. Limitations and Future Work

This study is not without limitations. One notable constraint lies in the spatial definition of community boundaries. While a 1-km circular buffer was adopted to approximate a 15-min walkable environment, this uniform approach may not accurately capture the actual scale and morphological diversity of each community, particularly given their varying shapes and built environments. Future research could explore more flexible or data-driven boundary delineation methods to better reflect residents’ lived spatial experience. Additionally, although street view imagery has proven effective for extracting environmental indicators, it does not perfectly simulate the pedestrian perspective. The inherent differences between vehicle-mounted and eye-level pedestrian viewpoints may influence visual interpretation, particularly in aspects such as enclosure, obstruction, and perceived safety. Addressing the vehicle–pedestrian visual gap will be a central focus of our subsequent studies, which aim to develop more refined methods for human-scale environmental assessment. Potential methodological improvements include the integration of pedestrian-collected street view imagery using portable 360-degree cameras, low-altitude drone imagery for areas with limited walkability, and augmented/virtual reality tools to better simulate human-scale perspectives. These approaches can help bridge the current gap between vehicle-mounted imagery and lived pedestrian experiences, thereby improving the contextual validity of perception-based analyses.

Although the Place Pulse 2.0 model has demonstrated robust performance in predicting perceptual attributes from street view imagery across multiple global contexts, its application to Baidu Street View images in Shenyang may introduce transferability limitations. Differences in cultural context, urban morphology, and image acquisition parameters (e.g., camera height, field of view) between Google Street View and Baidu Street View could affect the model’s perception predictions. Such variations may influence how certain environmental cues—such as signage, building facades, or public space configurations—are interpreted in terms of safety, liveliness, or beauty. While prior research has applied cross-platform transfer of perception models with reasonable accuracy [64,65], future studies could benefit from local calibration or domain adaptation techniques to mitigate these potential biases and improve the contextual validity of perception scoring.

The POI dataset used in this study was last updated in 2021, whereas the street view imagery was captured in 2023. While the functional composition of communities in Shenyang’s urban core tends to remain relatively stable over short periods, some localized changes—such as new developments, business turnover, or land-use conversions—may have occurred during this two-year interval. As our POI analysis focuses on aggregated structural diversity rather than the operational status of individual establishments, we expect the temporal mismatch to have a limited impact on the interpretation of co-located spatial features. Nonetheless, this difference is acknowledged as a potential limitation, and future work could incorporate temporally aligned datasets or multi-year POI updates to improve accuracy in fast-evolving urban areas. Another limitation is the lack of community-scale building density indicators such as floor area ratio (FAR). While FAR values are available at the district or city level through official planning documents, they are not publicly reported for individual residential communities in Shenyang. As our unit of analysis is the community, we were unable to incorporate FAR into Table 1. Instead, we used the total number of POIs as a proxy for commercial density. Future research could enrich the ESG–perception framework by integrating community-level building density metrics once such data become accessible through local planning bureaus or detailed cadastral datasets.

The use of a fixed 1-km buffer to delineate neighborhood units provides a standardized measurement framework aligned with established walkability and “15-min city” planning benchmarks [19,41], and supports comparability across case study sites. However, the morphological diversity of the selected communities may influence the proportion and type of functional areas included within the buffer. Given that our analysis emphasizes relative differences in spatial indicators rather than absolute counts, we expect this effect to be limited. Nonetheless, future research could incorporate sensitivity checks—such as varying buffer radii or employing network-based accessibility polygons—to evaluate the robustness of results across neighborhoods with differing spatial forms.
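A simple version of the suggested radius sensitivity check could look like the sketch below, which counts POIs inside several straight-line buffers around a community centroid. The function names, coordinates, and radii are hypothetical, and the coordinates are assumed to be latitude/longitude pairs in degrees already expressed in a common coordinate system. Network-based accessibility polygons would additionally require a street graph and are not covered here.

```python
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_M = 6_371_000  # mean Earth radius in metres


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two latitude/longitude points."""
    p1, p2 = radians(lat1), radians(lat2)
    dphi = p2 - p1
    dlmb = radians(lon2 - lon1)
    a = sin(dphi / 2) ** 2 + cos(p1) * cos(p2) * sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * asin(sqrt(a))


def poi_counts_by_radius(centroid, pois, radii_m):
    """Count POIs inside each straight-line buffer radius around a centroid."""
    lat0, lon0 = centroid
    dists = [haversine_m(lat0, lon0, lat, lon) for lat, lon in pois]
    return {r: sum(d <= r for d in dists) for r in radii_m}


# Hypothetical centroid and POIs offset northward by 0, ~556 m and ~1334 m.
centroid = (41.800, 123.400)
pois = [(41.800, 123.400), (41.805, 123.400), (41.812, 123.400)]
counts = poi_counts_by_radius(centroid, pois, radii_m=[500, 1000, 1500])
```

Repeating the factor and correlation analysis on indicators recomputed for each radius would show whether the reported associations are stable with respect to the buffer size.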

While PSPNet was trained on the Cityscapes dataset, which contains high-resolution street-level imagery from European cities [48], some visual features in Chinese urban settings—such as signage, pavement materials, and street furniture—differ from those in the training set. These differences may lead to occasional misclassification of certain object classes, particularly where visual cues are culturally or regionally specific. To mitigate this, we manually inspected a subset of segmentation outputs to ensure plausibility of the extracted indicators. Nonetheless, future work could apply domain adaptation techniques [66] or fine-tune the model using locally annotated Baidu Street View images to further improve segmentation accuracy and reduce potential bias.
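A manual plausibility check of this kind can be supported by per-image class shares computed from the segmentation label map, as in the sketch below. The function name and the toy label map are hypothetical; the class ids follow the Cityscapes training-id convention (0 road, 1 sidewalk, 8 vegetation, 10 sky), which is assumed here and should be verified against the label mapping of the model actually used.

```python
from collections import Counter

# Assumed Cityscapes training ids; confirm against the model's label mapping.
ROAD, SIDEWALK, VEGETATION, SKY = 0, 1, 8, 10


def class_shares(label_map):
    """Return {class_id: fraction of pixels} for a 2-D grid of class ids."""
    counts = Counter(pixel for row in label_map for pixel in row)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("label_map is empty")
    return {cls: n / total for cls, n in counts.items()}


# Hypothetical 2 x 4 label map standing in for one segmented image.
toy_map = [
    [SKY, SKY, VEGETATION, VEGETATION],
    [ROAD, ROAD, SIDEWALK, VEGETATION],
]
toy_shares = class_shares(toy_map)
```

Images whose shares look implausible for the scene, such as a near-zero sidewalk share on a street with visibly wide pavements, could then be prioritized for visual inspection.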

Another limitation concerns the adoption of a uniform 1 km circular buffer around each community centroid to approximate a 15-min walkable environment. While widely used in accessibility studies, this approach ignores actual network distances and physical barriers, potentially distorting both environmental and social indicators. For example, in communities bounded by expressways or rivers lacking pedestrian bridges, the 1 km Euclidean buffer may include areas that are not truly walkable, thereby inflating walkway ratio and misrepresenting walkability–perception correlations. Conversely, in communities with highly connected street grids, a 1 km straight-line buffer might underestimate the accessible area, omitting relevant amenities or green spaces just beyond the circle. Similar effects could occur for greenery metrics if significant parks or waterfronts lie marginally outside the buffer, dampening their observed associations with aesthetic or liveliness perceptions. Future research should conduct sensitivity analyses using alternative spatial units—such as network-based service areas, isochrone polygons, or buffers adjusted for physical barriers—to assess the robustness of ESG–perception linkages to neighborhood boundary definitions.

5. Conclusions

This study develops and tests a novel ESG-based community assessment framework to reveal how surrounding environmental characteristics of seven representative communities in Shenyang influence residents’ subjective perceptions. The framework integrates pixel-level analysis of street-view imagery and POI data, utilizing EFA to identify the most representative key factors within environmental, social, and governance dimensions. Furthermore, resident perception data derived from the Place Pulse 2.0 model were analyzed using Spearman correlation analysis to elucidate intrinsic relationships between community environmental features and subjective perceptions.

The findings confirm that the proposed micro-scale ESG framework effectively captures and explains differences among communities concerning environmental quality, social interaction, and governance mechanisms. Within the environmental dimension, the factor of “Environmental Openness,” positively correlated with sky visibility and visual permeability, was negatively influenced by the density of commercial facilities. Meanwhile, the “Green Ecological” factor effectively reflected the influence of internal and external greenery on residents’ environmental perception. The social dimension analysis indicates that higher density of traffic and commercial facilities, although enhancing community vitality, simultaneously negatively affects the pedestrian experience. Conversely, the proper allocation of pedestrian spaces significantly enhances community livability and facilitates social interaction. Regarding the governance dimension, polarization in roadway space and unequal pedestrian-space distribution directly affected governance efficiency and residents’ satisfaction with community management.

Further perceptual analysis demonstrated a significant correlation between walkability and the polarization of road space. Communities with imbalanced spatial distribution typically exhibited deficiencies in governance efficiency. The “Green Ecological” factor displayed a complex perceptual impact, wherein communities with higher greenery coverage demonstrated strong aesthetic appeal, yet certain design elements might simultaneously induce feelings of monotony or depression, highlighting the necessity of detailed green-space design. Additionally, the dual impacts of traffic and commercial density warrant attention, as these factors, despite contributing to community vitality, may lead to perceptions of monotony due to insufficient spatial diversity. Finally, the study found a negative correlation between environmental openness and residents’ sense of safety and vitality, a finding notably divergent from conclusions commonly drawn in European and American contexts [67], warranting further exploration within China’s specific urban context.

Overall, by employing a micro-scale ESG framework coupled with refined pixel-level analytical methods, this research systematically illustrates how multidimensional community environmental factors collectively shape residents’ subjective perceptions. The outcomes not only provide new theoretical insights and analytical approaches for academia but also offer critical empirical evidence and practical tools for urban planners and policymakers aiming for finer-grained governance and sustainable urban planning practices. Ultimately, this contributes to developing more equitable, livable, and sustainable community environments in urban development. In practical terms, the ESG-based framework can inform urban regeneration policies in rapidly transforming Chinese cities by highlighting the importance of rebalancing pedestrian and vehicular space, integrating diverse greenery into street-level design, and ensuring functional land-use variety in commercial hubs. These guidelines are particularly relevant for post-industrial cities such as Shenyang, where large-scale renewal initiatives are underway, and can help planners implement regeneration strategies that are both sustainable and responsive to residents’ lived experiences.

Author Contributions

Conceptualization, Jingxue Xie and Jue Wang; methodology, Jingxue Xie; formal analysis, Jingxue Xie; data curation, Jingxue Xie; writing—original draft preparation, Jingxue Xie; writing—review and editing, Jingxue Xie, Zhewei Liu and Jue Wang; visualization, Jingxue Xie; supervision, Jue Wang; project administration, Jue Wang. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the Japan Science and Technology Agency (JST) SPRING Program [JPMJSP2116], and additionally supported by the Computational and Quantitative Social Sciences Grant [DSI-CQSSY3R1P08] from the Data Sciences Institute at the University of Toronto, as well as the Black, Indigenous, and Racialized Scholar/Research Grant from the University of Toronto Mississauga [209552].

Data Availability Statement

Restrictions apply to the availability of the datasets used in this study. Street view imagery was obtained from Baidu Map, and point-of-interest (POI) data were sourced from Gaode Map, both of which are third-party platforms subject to their respective terms of service. Due to licensing and access limitations, these datasets cannot be shared publicly. Derived features and spatial indicators generated from these data sources are available from the corresponding author upon reasonable request. The analysis pipeline and processing steps are fully described in the Methods section to support reproducibility.

Conflicts of Interest

The authors declare no conflicts of interest.

References

Hamman, P. Definitions and Redefinitions of Urban Sustainability: A Bibliometric Approach. Environ. Urbain 2017, 11, 1–2.
Tan, S.Y.; Taeihagh, A. Smart City Governance in Developing Countries: A Systematic Literature Review. Sustainability 2020, 12, 899.
McPhearson, T.; Cook, E.M.; Berbés-Blázquez, M.; Cheng, C.; Grimm, N.B.; Andersson, E.; Barbosa, O.; Chandler, D.G.; Chang, H.; Chester, M.V.; et al. A Social-Ecological-Technological Systems Framework for Urban Ecosystem Services. One Earth 2022, 5, 505–518.
Teklemariam, N. Sustainable Development Goals and Equity in Urban Planning: A Comparative Analysis of Chicago, São Paulo, and Delhi. Sustainability 2022, 14, 13227.
Cornell, B. ESG Preferences, Risk and Return. Eur. Financ. Manag. 2021, 27, 12–19.
Ahmad, H.; Yaqub, M.; Lee, S.H. Environmental-, Social-, and Governance-Related Factors for Business Investment and Sustainability: A Scientometric Review of Global Trends. Environ. Dev. Sustain. 2024, 26, 2965–2987.
Friede, G.; Busch, T.; Bassen, A. ESG and Financial Performance: Aggregated Evidence from More than 2000 Empirical Studies. J. Sustain. Financ. Invest. 2015, 5, 210–233.
Teixeira Dias, F.; de Aguiar Dutra, A.R.; Vieira Cubas, A.L.; Ferreira Henckmaier, M.F.; Courval, M.; de Andrade Guerra, J.B.S.O. Sustainable Development with Environmental, Social and Governance: Strategies for Urban Sustainability. Sustain. Dev. 2023, 31, 528–539. [Google Scholar] [CrossRef]
Huang, C.C.; Chan, Y.K.; Hsieh, M.Y. The Determinants of ESG for Community LOHASism Sustainable Development Strategy. Sustainability 2022, 14, 11429. [Google Scholar] [CrossRef]
Lu, P.; Hamori, S.; Tian, S. Can ESG Investments and New Environmental Law Improve Social Happiness in China? Front. Environ. Sci. 2023, 11, 1089486. [Google Scholar] [CrossRef]
Gallucci, C.; Santulli, R.; Lagasio, V. The Conceptualization of Environmental, Social and Governance Risks in Portfolio Studies A Systematic Literature Review. Socio-Econ. Plan. Sci. 2022, 84, 101382. [Google Scholar] [CrossRef]
Bosi, M.K.; Lajuni, N.; Wellfren, A.C.; Lim, T.S. Sustainability Reporting through Environmental, Social, and Governance: A Bibliometric Review. Sustainability 2022, 14, 12071. [Google Scholar] [CrossRef]
Zhou, H.; He, S.; Cai, Y.; Wang, M.; Su, S. Social Inequalities in Neighborhood Visual Walkability: Using Street View Imagery and Deep Learning Technologies to Facilitate Healthy City Planning. Sustain. Cities Soc. 2019, 50, 101605. [Google Scholar] [CrossRef]
Wang, S.; Cai, W.; Sun, Q.C.; Wu, C.Y.H.; Huang, X.; Giannopoulos, I.; Alinaghi, N.; Liu, Z. Human-Perceived vs Actual Built Environment: Using Human-Centred GeoAI and Street View Images to Support Urban Planning in Australia. J. Environ. Manag. 2025, 389, 126070. [Google Scholar] [CrossRef] [PubMed]
Helbich, M. Dynamic Urban Environmental Exposures on Depression and Suicide (NEEDS) in the Netherlands: A Protocol for a Cross-Sectional Smartphone Tracking Study and a Longitudinal Population Register Study. BMJ Open 2019, 9, e030075. [Google Scholar] [CrossRef] [PubMed]
Li, X.; Zhang, C.; Li, W.; Ricard, R.; Meng, Q.; Zhang, W. Assessing Street-Level Urban Greenery Using Google Street View and a Modified Green View Index. Urban For. Urban Green. 2015, 14, 675–685. [Google Scholar] [CrossRef]
Mouratidis, K. Built Environment and Social Well-Being: How Does Urban Form Affect Social Life and Personal Relationships? Cities 2018, 74, 7–20.
Mehaffy, M.W.; Porta, S.; Romice, O. The “Neighborhood Unit” on Trial: A Case Study in the Impacts of Urban Morphology. J. Urban. Int. Res. Placemaking Urban Sustain. 2015, 8, 199–217.
Moreno, C.; Allam, Z.; Chabaud, D.; Gall, C.; Pratlong, F. Introducing the “15-Minute City”: Sustainability, Resilience and Place Identity in Future Post-Pandemic Cities. Smart Cities 2021, 4, 93–111.
Dubey, A.; Naik, N.; Parikh, D.; Raskar, R.; Hidalgo, C.A. Deep Learning the City: Quantifying Urban Perception at a Global Scale. In Computer Vision – ECCV 2016, Lecture Notes in Computer Science, Proceedings of the 14th European Conference, Amsterdam, The Netherlands, 11–14 October 2016; Leibe, B., Matas, J., Sebe, N., Welling, M., Eds.; Springer: Cham, Switzerland, 2016; pp. 196–212.
Kang, Y.; Kim, J.; Park, J.; Lee, J. Assessment of Perceived and Physical Walkability Using Street View Images and Deep Learning Technology. ISPRS Int. J. Geo-Inf. 2023, 12, 186.
Lu, Y.; Ferranti, E.J.S.; Chapman, L.; Pfrang, C. Assessing Urban Greenery by Harvesting Street View Data: A Review. Urban For. Urban Green. 2023, 83, 127917.
Li, T.T.; Wang, K.; Sueyoshi, T.; Wang, D.D. ESG: Research Progress and Future Prospects. Sustainability 2021, 13, 11663.
Ewing, R.; Handy, S. Measuring the Unmeasurable: Urban Design Qualities Related to Walkability. J. Urban Des. 2009, 14, 65–84.
Ye, Y.; Richards, D.; Lu, Y.; Song, X.; Zhuang, Y.; Zeng, W.; Zhong, T. Measuring Daily Accessed Street Greenery: A Human-Scale Approach for Informing Better Urban Planning Practices. Landsc. Urban Plan. 2019, 191, 103434.
Stamps, A.E., III. Enclosure and Safety in Urbanscapes. Environ. Behav. 2005, 37, 102–133.
Li, X.; Ratti, C.; Seiferling, I. Quantifying the Shade Provision of Street Trees in Urban Landscape: A Case Study in Boston, USA, Using Google Street View. Landsc. Urban Plan. 2018, 169, 81–91.
Marshall, W.E.; Piatkowski, D.P.; Garrick, N.W. Community Design, Street Networks, and Public Health. J. Transp. Health 2014, 1, 326–340.
Lu, Y.; Chen, H.M. Using Google Street View to Reveal Environmental Justice: Assessing Public Perceived Walkability in Macroscale City. Landsc. Urban Plan. 2024, 244, 104995.
Ki, D.; Chen, Z.; Lee, S.; Lieu, S. A Novel Walkability Index Using Google Street View and Deep Learning. Sustain. Cities Soc. 2023, 99, 104896.
Pojani, D.; Stead, D. Sustainable Urban Transport in the Developing World: Beyond Megacities. Sustainability 2015, 7, 7784–7805.
Ma, Y.; Jiao, H. Quantitative Evaluation of Friendliness in Streets’ Pedestrian Networks Based on Complete Streets: A Case Study in Wuhan, China. Sustainability 2023, 15, 10317.
Wu, D.Y.; Wang, J. Analyzing Urban Crime Through Street View Imagery: Insights from Urban Micro Built Environment and Perceptions. Urban Sci. 2024, 8, 247.
Forsyth, A.; Hearst, M.; Oakes, J.M.; Schmitz, K.H. Design and Destinations: Factors Influencing Walking and Total Physical Activity. Urban Stud. 2008, 45, 1973–1996.
Jacobs, J. The Death and Life of Great American Cities; Random House: New York, NY, USA, 1961.
Frank, L.D.; Schmid, T.L.; Sallis, J.F.; Chapman, J.; Saelens, B.E. Linking Objectively Measured Physical Activity with Objectively Measured Urban Form: Findings from SMARTRAQ. Am. J. Prev. Med. 2005, 28, 117–125.
Li, W.; Yi, P.; Zhang, D. Sustainability Evaluation of Cities in Northeastern China Using Dynamic TOPSIS-Entropy Methods. Sustainability 2018, 10, 4542.
Leng, A.; Wang, K.; Bai, J.; Gu, N.; Feng, R. Analyzing Sustainable Development in Chinese Cities: A Focus on Land Use Efficiency in Production-Living-Ecological Aspects. J. Clean. Prod. 2024, 448, 141461.
Gu, H.; Huan, C.; Yang, F. Spatiotemporal Dynamics of Ecological Vulnerability and Its Influencing Factors in Shenyang City of China: Based on SRP Model. Int. J. Environ. Res. Public Health 2023, 20, 1525.
Lianjia (Shenyang). Lianjia Real Estate Data. 2023. Available online: https://sy.lianjia.com/ (accessed on 20 June 2025).
Handy, S. Is Accessibility an Idea Whose Time Has Finally Come? Transp. Res. Part D Transp. Environ. 2020, 83, 102319.
Stokols, D. Establishing and Maintaining Healthy Environments: Toward a Social Ecology of Health Promotion. Am. Psychol. 1992, 47, 6–22.
Kelling, G.L.; Wilson, J.Q. Broken Windows. Atl. Mon. 1982, 249, 29–38.
Gehl, J. Life Between Buildings: Using Public Space; Island Press: Washington, DC, USA, 2011.
Visagie, J.; Turok, I. Getting Urban Density to Work in Informal Settlements in Africa. Environ. Urban. 2020, 32, 351–370.
Green, J.N. On the Plaza: The Politics of Public Space and Culture, by Setha M. Low. Austin: University of Texas Press (2000). Reviewed by James N. Green. J. Political Ecol. 2001, 8, 77.
Zhao, H.; Shi, J.; Qi, X.; Wang, X.; Jia, J. Pyramid Scene Parsing Network. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA, 21–26 July 2017; pp. 2881–2890.
Cordts, M.; Omran, M.; Ramos, S.; Rehfeld, T.; Enzweiler, M.; Benenson, R.; Franke, U.; Roth, S.; Schiele, B. The Cityscapes Dataset for Semantic Urban Scene Understanding. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA, 27–30 June 2016; pp. 3213–3223.
Lee, J. Spatiotemporal Analytics; CRC Press: Boca Raton, FL, USA, 2023.
Vacková, J.; Bukáček, M. Kernel Estimates as General Concept for the Measuring of Pedestrian Density. Transp. A Transp. Sci. 2023, 21, 2236236.
Wei, J.; Yue, W.; Li, M.; Gao, J. Mapping Human Perception of Urban Landscape from Street-View Images: A Deep-Learning Approach. Int. J. Appl. Earth Obs. Geoinf. 2022, 112, 102886.
He, K.; Zhang, X.; Ren, S.; Sun, J. Deep Residual Learning for Image Recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA, 27–30 June 2016; pp. 770–778.
Kaiser, H.F. An Index of Factorial Simplicity. Psychometrika 1974, 39, 31–36.
Tobias, S.; Carlson, J.E. Brief Report: Bartlett’s Test of Sphericity and Chance Findings in Factor Analysis. Multivar. Behav. Res. 1969, 4, 375–377.
Kaiser, H.F. The Varimax Criterion for Analytic Rotation in Factor Analysis. Psychometrika 1958, 23, 187–200.
Abdi, H. Factor Rotations in Factor Analysis. In Encyclopedia for Research Methods for the Social Sciences; Sage: Thousand Oaks, CA, USA, 2003.
Swietek, A.R.; Zumwald, M. Visual Capital: Evaluating Building-Level Visual Landscape Quality at Scale. Landsc. Urban Plan. 2023, 240, 104880.
Fraser, T.; Feeley, O.; Ridge, A.; Cervini, A.; Rago, V.; Gilmore, K.; Worthington, G.; Berliavsky, I. How Far I’ll Go: Social Infrastructure Accessibility and Proximity in Urban Neighborhoods. Landsc. Urban Plan. 2024, 241, 104922.
Roberts, H.; Sadler, J.; Chapman, L. The Value of Twitter Data for Determining the Emotional Responses of People to Urban Green Spaces: A Case Study and Critical Evaluation. Urban Stud. 2019, 56, 818–835.
Fonteyn, P.; Daniels, S.; Malina, R.; Lizin, S. In Plain Sight: Green Views from the Residence and Urbanites’ Neighborhood Satisfaction. Landsc. Urban Plan. 2024, 245, 105021.
Mouratidis, K.; Poortinga, W. Built Environment, Urban Vitality and Social Cohesion: Do Vibrant Neighborhoods Foster Strong Communities? Landsc. Urban Plan. 2020, 204, 103951.
Wartmann, F.M.; Frick, J.; Kienast, F.; Hunziker, M. Factors Influencing Visual Landscape Quality Perceived by the Public. Results from a National Survey. Landsc. Urban Plan. 2021, 208, 104024.
Ogawa, Y.; Oki, T.; Zhao, C.; Sekimoto, Y.; Shimizu, C. Evaluating the Subjective Perceptions of Streetscapes Using Street-View Images. Landsc. Urban Plan. 2024, 247, 105073.
Naik, N.; Philipoom, J.; Raskar, R.; Hidalgo, C. Streetscore – Predicting the Perceived Safety of One Million Streetscapes. In Proceedings of the 2014 IEEE Conference on Computer Vision and Pattern Recognition Workshops, Columbus, OH, USA, 23–28 June 2014; pp. 793–799.
Kang, Y.; Zhang, F.; Peng, W.; Gao, S.; Rao, J.; Duarte, F.; Ratti, C. Understanding House Price Appreciation Using Multi-Source Big Geo-Data and Machine Learning. Land Use Policy 2021, 111, 104919.
Hoffman, J.; Tzeng, E.; Park, T.; Zhu, J.Y.; Isola, P.; Saenko, K.; Efros, A.; Darrell, T. CyCADA: Cycle-Consistent Adversarial Domain Adaptation. In Proceedings of the 35th International Conference on Machine Learning, Stockholm, Sweden, 10–15 July 2018; PMLR: Birmingham, UK, 2018; pp. 1989–1998.
Garau, C.; Annunziata, A. A Method for Assessing the Vitality Potential of Urban Areas. The Case Study of the Metropolitan City of Cagliari, Italy. City Territ. Archit. 2022, 9, 7.

Figure 1. Workflow of the study integrating ESG evaluation and perceptual analysis. The diagram also presents the ESG-based indicator framework, showing Environmental (E), Social (S), and Governance (G) dimensions together with POI-based functional indicators. This framework is incorporated within the methodological design to ensure clarity and coherence.

Figure 2. Spatial distribution of the seven study communities (A–G) in the core urban area of Shenyang. The map overlays road networks and community boundaries (white dashed lines) to provide detailed spatial context. Each community label corresponds to its entry in Table 1. A uniform 1 km circular buffer is delineated around the centroid of each community to approximate a 15-min walkable radius, following established planning benchmarks [19,41].

Figure 3. Examples of Baidu Street View image recognition and semantic segmentation.

Figure 4. Mapping Environmental Dimensions of ESG: Kernel Density Results.

Figure 5. Mapping Social Dimensions of ESG: Kernel Density Results.

Figure 6. Mapping Governance Dimensions of ESG: Kernel Density Results.

Figure 7. Boxplots of Place Pulse 2.0 scores across seven communities in Shenyang for six perceptual dimensions. The plots illustrate considerable inter-community variation in subjective perceptions of the urban environment.

Figure 8. Spearman correlation matrix between ESG-related latent factors (extracted through factor analysis) and six perceptual dimensions from Place Pulse 2.0. Strong and significant associations are observed, such as the positive correlation between Walkability (S Factor 2) and Road Extremization (G Factor 2), and the dual relationship of Green Ecological Quality (E Factor 2) with both “More Beautiful” and “More Depressing” perceptions. All coefficients are significant at the 0.001 level.
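As a rough illustration of the statistic behind Figure 8, the sketch below computes a Spearman coefficient as the Pearson correlation of average ranks. It is a minimal, standard-library-only example; the function names and the toy inputs are assumptions for illustration, not the authors' code or data.

```python
def average_ranks(values):
    """Return 1-based ranks, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the average ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical factor scores and perception scores (illustrative values only).
factor_scores = [0.1, 0.4, 0.4, 0.9]
perception = [2.0, 3.5, 3.0, 4.2]
rho = spearman(factor_scores, perception)
```

Working on ranks makes the coefficient insensitive to the differing scales of factor scores and perception scores, which is one common reason to prefer it over Pearson correlation in this kind of comparison.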

Figure 9. Factor scores for each community in the Environmental, Social and Governance dimensions.

Table 1. Overview of the seven study communities.

Community	Year of Completion	Scale of Residents (Households)	Main Environmental Factors	Geographical Location	Total Number of POIs
A	2007	305	Overpasses	Northwestern	8916
B	1993	481	Core Metro Hub	Center	12,508
C	2018	2277	River	Southwestern	2515
D	2007	2096	Government Departments	Northern	2850
E	2016	1309	River	Southern	3529
F	1990	718	Core Business District	East center	12,484
G	2007	1361	Large-scale Park	Northeastern	8214

This table summarizes key attributes of the seven sampled communities, including POI counts derived from Gaode Map data (2021).

Table 2. Summary of datasets used in the study, including study objects, acquisition period, and exact sources.

Dataset	Study Object	Acquisition Period	Source
Baidu Street View Imagery	Street-level panoramic images covering roads within 1 km buffer of each community	January 2023	Baidu Maps Street View API [https://map.baidu.com (accessed on 12 January 2023)]
Gaode Maps POI Data	Geocoded points of commercial, public service, and recreational facilities	2021	Gaode Maps API [https://lbs.amap.com (accessed on 9 December 2021)]
OpenStreetMap Road Network	Major and minor road centerlines	January 2023	OpenStreetMap contributors [https://www.openstreetmap.org (accessed on 12 January 2023)]
Community Boundaries	Polygon shapefiles delineating seven study communities	January 2024	Derived by authors from Baidu Maps satellite layer
Lianjia Real Estate Data	Community-level attributes: completion year, household numbers	June 2023	Lianjia Real Estate Platform [https://sy.lianjia.com (accessed on 10 June 2023)]

Table 3. Formula and meaning of each indicator, grouped by ESG dimension.

ESG-Mapped Dimension	Interpretive Dimension	Formula	Description
Environmental	Sky rate	$S_{v} = S / A$	S: sky pixels; A: total pixels
Environmental	Visual permeability	$I_{p} = WL / B$	WL: wall pixels; B: building pixels
Environmental	Green vision rate	$G_{v} = G / A$	G: vegetation pixels; A: total pixels
Social	Mobility traffic interference index	$V_{i} = V / M$	V: vehicle pixels; M: carriageway pixels
Social	Walkway ratio	$P_{r} = W / R$	W: walkway pixels; R: walkway + carriageway pixels
Governance	Relative walking width	$W_{w} = W / M$	W: walkway pixels; M: carriageway pixels
Governance	Crowd aggregation	$H_{r} = H / A$	H: person pixels; A: total pixels
Governance	Walking space	$W_{s} = W / A$	W: walkway pixels; A: total pixels
POI	Total number of POI business	$P_{a}$	$P_{a}$: total facilities in POI
POI	POI business completeness	C	C: number of facility function types
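The pixel-ratio formulas in Table 3 can be sketched as a single function over a segmented image. This is a minimal illustration that assumes each pixel already carries a class name; the class names used here (`sky`, `wall`, `building`, `vegetation`, `vehicle`, `road`, `sidewalk`, `person`) and the function name are hypothetical choices for the example, not the label set or code used in the study.

```python
from collections import Counter


def esg_indicators(labels):
    """Compute the Table 3 pixel ratios from a flat list of per-pixel class names."""
    counts = Counter(labels)
    total = len(labels)

    def ratio(num, den):
        # Return NaN when the denominator class is absent from the image.
        return num / den if den else float("nan")

    walkway, road = counts["sidewalk"], counts["road"]
    return {
        "sky_rate": ratio(counts["sky"], total),                      # S / A
        "visual_permeability": ratio(counts["wall"], counts["building"]),  # WL / B
        "green_vision_rate": ratio(counts["vegetation"], total),      # G / A
        "traffic_interference": ratio(counts["vehicle"], road),       # V / M
        "walkway_ratio": ratio(walkway, walkway + road),              # W / R
        "relative_walking_width": ratio(walkway, road),               # W / M
        "crowd_aggregation": ratio(counts["person"], total),          # H / A
        "walking_space": ratio(walkway, total),                       # W / A
    }


# Hypothetical 100-pixel image, for illustration only.
pixels = (["sky"] * 30 + ["vegetation"] * 20 + ["building"] * 20 + ["wall"] * 5
          + ["road"] * 15 + ["sidewalk"] * 5 + ["vehicle"] * 3 + ["person"] * 2)
indicators = esg_indicators(pixels)
```

Returning NaN for an empty denominator (for example, an image with no carriageway pixels) keeps such images from being silently counted as zero when values are later aggregated per community.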

Table 4. Rotated factor loadings for ESG Environmental Factor.

Items	Factor 1	Factor 2	Communalities
Green Vision Rate	0.221	0.972	0.994
Sky rate	0.962	0.089	0.934
Visual permeability	0.876	0.374	0.908
POI business completeness	−0.925	−0.356	0.982
Total number of POI business	−0.957	−0.238	0.972

Rotated: Varimax.
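As a reading aid for the Communalities column, the standard relation for an orthogonal (varimax) solution is that each communality equals the sum of the squared rotated loadings across the retained factors. The worked check below uses two rows of Table 4; the symbols $h_i^2$ and $\lambda_{ij}$ are notation introduced here, and small discrepancies in other rows would be expected from rounding of the reported loadings.

```latex
\begin{align*}
h_i^{2} &= \lambda_{i1}^{2} + \lambda_{i2}^{2} \\
\text{Green Vision Rate:}\quad h^{2} &= 0.221^{2} + 0.972^{2} = 0.0488 + 0.9448 = 0.9936 \approx 0.994 \\
\text{POI business completeness:}\quad h^{2} &= (-0.925)^{2} + (-0.356)^{2} = 0.8556 + 0.1267 = 0.9823 \approx 0.982
\end{align*}
```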

Table 5. Coefficient Matrix of ESG Environmental Component (Factor) Score.

Items	Component 1	Component 2
Green Vision Rate	−0.282	1.024
Sky rate	0.364	−0.269
Visual permeability	0.219	0.089
POI business completeness	−0.246	−0.051
Total number of POI business	−0.305	0.097
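A score coefficient matrix such as Table 5 is conventionally applied to standardized indicators: each component score is the coefficient-weighted sum of the z-scores. The sketch below illustrates that convention under stated assumptions; the indicator values for the three communities are hypothetical, the key and function names are illustrative, and the use of the sample standard deviation is an assumption rather than something the table specifies.

```python
from statistics import mean, stdev

# Score coefficients copied from Table 5: (Component 1, Component 2).
COEFFS = {
    "green_vision_rate": (-0.282, 1.024),
    "sky_rate": (0.364, -0.269),
    "visual_permeability": (0.219, 0.089),
    "poi_completeness": (-0.246, -0.051),
    "poi_total": (-0.305, 0.097),
}


def standardize(values):
    """Z-score a list of values (sample standard deviation; an assumption)."""
    m, s = mean(values), stdev(values)
    return [(v - m) / s for v in values]


def factor_scores(raw):
    """raw maps indicator name -> list of values, one per community."""
    z = {name: standardize(vals) for name, vals in raw.items()}
    n = len(next(iter(raw.values())))
    scores = []
    for k in range(n):
        f1 = sum(COEFFS[name][0] * z[name][k] for name in COEFFS)
        f2 = sum(COEFFS[name][1] * z[name][k] for name in COEFFS)
        scores.append((f1, f2))
    return scores


# Hypothetical indicator values for three communities (illustration only).
raw = {
    "green_vision_rate": [0.10, 0.20, 0.30],
    "sky_rate": [0.30, 0.25, 0.35],
    "visual_permeability": [0.20, 0.40, 0.30],
    "poi_completeness": [10, 14, 12],
    "poi_total": [2500, 12000, 8000],
}
scores = factor_scores(raw)
```

Because the inputs are z-scored, each component's scores sum to zero across communities, so a positive score reads as above the sample average on that component.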

Table 6. Rotated Factor Loadings for ESG Social Factor.

Items	Factor 1	Factor 2	Communalities
Walkway ratio	0.086	0.995	0.997
Mobility Traffic Interference Index	0.974	−0.067	0.953
POI business completeness	0.975	0.164	0.978
Total number of POI business	0.973	0.207	0.989

Rotated: Varimax.

Table 7. Coefficient Matrix of ESG Social Component (Factor) Score.

Items	Component 1	Component 2
Walkway ratio	−0.099	0.971
Mobility Traffic Interference Index	0.367	−0.195
POI business completeness	0.337	0.033
Total number of POI business	0.331	0.076

Table 8. Rotated Factor Loadings for ESG Governance Factor.

Items	Factor 1	Factor 2	Communalities
Crowd Aggregation	0.950	−0.092	0.912
Relative walking width	0.088	0.911	0.838
Walking Space	−0.045	−0.911	0.832
POI business completeness	0.967	0.148	0.957
Total number of POI business	0.985	0.139	0.989

Rotated: Varimax.

Table 9. Coefficient Matrix of ESG Governance Component (Factor) Score.

Items	Component 1	Component 2
Crowd Aggregation	0.350	−0.118
Relative walking width	−0.029	0.538
Walking Space	0.044	−0.541
POI business completeness	0.340	0.024
Total number of POI business	0.347	0.018

Table 10. Factor loadings, eigenvalues, and variance explained for Environmental (E), Social (S), and Governance (G) dimensions.

Dimension/Variable	Factor 1	Factor 2	Composite Coefficient	Weight (%)
Environmental (E)
Eigenvalue (after rotation)	3.513	1.276
Variance explained (%)	70.26	25.52
Green Vision Rate	0.1177	0.8606	0.3157	15.62
Sky rate	0.5134	0.0788	0.3976	19.68
Visual permeability	0.4676	0.3308	0.4311	21.33
POI business completeness	0.4933	0.3153	0.4459	22.07
Total number of POI business	0.5105	0.2104	0.4305	21.30
Social (S)
Eigenvalue (after rotation)	2.853	1.064
Variance explained (%)	71.32	26.59
Walkway ratio	0.0509	0.9646	0.2991	18.25
Mobility Traffic Interference Index	0.5767	−0.0649	0.4025	24.56
POI business completeness	0.5773	0.1588	0.4636	28.29
Total number of POI business	0.5758	0.2003	0.4738	28.91
Governance (G)
Eigenvalue (after rotation)	2.818	1.710
Variance explained (%)	56.37	34.19
Crowd Aggregation	0.5661	0.0706	0.3790	21.53
Relative walking width	0.0521	0.6969	0.2956	16.79
Walking Space	0.0267	0.6966	0.2797	15.88
POI business completeness	0.5761	0.1129	0.4012	22.79
Total number of POI business	0.5867	0.1062	0.4053	23.02
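The Composite Coefficient and Weight columns of Table 10 appear to be reproducible from the other columns: a variance-weighted average of the Factor 1 and Factor 2 values gives the composite coefficient, and dividing each composite by the column total gives the weight. This is an inference from the published numbers rather than a procedure stated in the table, so the sketch below should be read as a consistency check for the Environmental block, with illustrative function and variable names.

```python
# Variance explained (%) by Factor 1 and Factor 2, Environmental block of Table 10.
VARIANCE = (70.26, 25.52)

# Factor 1 and Factor 2 columns for the Environmental block of Table 10.
COEFFS = {
    "Green Vision Rate": (0.1177, 0.8606),
    "Sky rate": (0.5134, 0.0788),
    "Visual permeability": (0.4676, 0.3308),
    "POI business completeness": (0.4933, 0.3153),
    "Total number of POI business": (0.5105, 0.2104),
}


def composite_and_weights(coeffs, variance):
    """Variance-weighted composite coefficient per indicator, then weights in percent."""
    total_var = sum(variance)
    composite = {
        name: sum(c * v for c, v in zip(cs, variance)) / total_var
        for name, cs in coeffs.items()
    }
    total = sum(composite.values())
    weights = {name: 100 * c / total for name, c in composite.items()}
    return composite, weights


composite, weights = composite_and_weights(COEFFS, VARIANCE)
for name in COEFFS:
    print(f"{name}: composite={composite[name]:.4f}, weight={weights[name]:.2f}%")
```

Under this reading, the computed values agree with the tabulated composite coefficients and weights to within rounding of the inputs, and the same calculation could be repeated for the Social and Governance blocks using their own variance shares.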

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2025 by the authors. Published by MDPI on behalf of the International Society for Photogrammetry and Remote Sensing. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

Share and Cite

MDPI and ACS Style

Xie, J.; Liu, Z.; Wang, J. Developing a Replicable ESG-Based Framework for Assessing Community Perception Using Street View Imagery and POI Data. ISPRS Int. J. Geo-Inf. 2025, 14, 338. https://doi.org/10.3390/ijgi14090338

