1. Introduction
Land surface temperature (LST) is a critical variable for understanding physical processes on the Earth’s surface [1,2], with wide applications across domains. It is vital for public health, environmental monitoring studies [3], and urban climate change studies [4]. Rapid urbanization has substantially altered the natural state of the land surface. The proliferation of urban agglomerations and densely packed buildings induces airflow obstruction [5] and increases the probability of multiple scattering, leading to persistently elevated surface temperatures and enhanced outward radiative energy emission. Urban land surface temperature (ULST) is a key indicator of urban sustainability. Accurate estimation of ULST is crucial for understanding the urban heat island (UHI) effect [6,7], managing energy consumption, and assessing human thermal comfort.
Satellite remote sensing is an effective tool for continuous, large-scale monitoring of ULST [8], offering substantial advantages over traditional ground-based measurements in spatial coverage, timeliness, and cost-effectiveness [9]. However, cities, which are typical heterogeneous landscapes with structural and material differences from natural surfaces, present significant challenges for accurate ULST retrieval from remote sensing data [10]. These challenges stem from urban geometry interfering with radiative transfer processes. Urban geometry not only induces a “cavity effect” through multiple scattering [10] but also makes the thermal radiation contribution from adjacent pixels (i.e., the adjacency effect) a non-negligible source of error [11].
The influence of urban geometry on LST is pronounced. Conventional LST retrieval algorithms (e.g., single-channel methods, split-window algorithms) [12,13] assume surfaces are “flat, homogeneous, and isothermal” [14], which contradicts urban environmental reality and compromises ULST retrieval accuracy. Studies have demonstrated that LST retrieved without considering geometric effects is typically higher than LST retrieved with those effects taken into account [10]. This discrepancy arises because multiple scattering and reflections within urban areas (i.e., the cavity effect) elevate the effective emissivity above the inherent material emissivity, leading to temperature differences. Yang [10] proposed the UEM-SCM method, which quantifies geometric influences via a radiative transfer model that incorporates the sky view factor (SVF) into a split-window algorithm, thereby distinguishing geometric effects from atmospheric effects. That study found that the uncorrected split-window algorithm misattributes part of the geometric contribution to atmospheric effects, resulting in biases of up to 1.5–2.0 K in built-up areas.
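To make the cavity effect concrete, the following minimal sketch evaluates a textbook-style approximation of effective emissivity for an idealized cavity. It assumes isothermal, Lambertian surfaces of a single material, with radiation that does not escape to the sky striking another surface of the same emissivity. The function name and the formula are illustrative assumptions; this is not the formulation used in the UEM-SCM method cited above.

```python
def effective_emissivity(material_emissivity: float, sky_view_factor: float) -> float:
    """Effective emissivity of an idealized isothermal Lambertian cavity.

    Assumption (not from the paper): a fraction (1 - SVF) of emitted or
    reflected radiation hits another surface of the same material, and a
    fraction (1 - emissivity) of that is reflected again. Summing the
    resulting geometric series gives the expression below.
    """
    if not 0.0 < material_emissivity <= 1.0:
        raise ValueError("material_emissivity must be in (0, 1]")
    if not 0.0 <= sky_view_factor <= 1.0:
        raise ValueError("sky_view_factor must be in [0, 1]")
    reflectance = 1.0 - material_emissivity
    return material_emissivity / (1.0 - reflectance * (1.0 - sky_view_factor))


if __name__ == "__main__":
    # Open, flat surface (SVF = 1) keeps the material emissivity;
    # a deeper cavity (smaller SVF) raises the effective emissivity.
    for svf in (1.0, 0.7, 0.4):
        print(svf, round(effective_emissivity(0.9, svf), 4))
```

Under these assumptions the effective emissivity equals the material emissivity for a flat surface (SVF = 1) and increases as the sky view factor decreases, which is the qualitative behaviour described in the paragraph above.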
The thermal radiation from adjacent pixels significantly impacts ULST retrieval. This effect, characterized by inter-pixel interactions such as the mixed pixel problem and edge effect interference, introduces retrieval errors. Chen [11] confirmed that adjacent pixel thermal radiation substantially influences ULST retrieval from high-spatial-resolution thermal infrared (TIR) imagery and proposed a view-factor-based parameterization method. Ru [15] and Zhong [5] have also conducted parameterization studies on the adjacency effect using view factors. Zhong [5] investigated the adjacency effect using high-resolution drone data, revealing that its impact on retrieved LST ranges from 0.0 to 3.0 K. These findings underscore that the adjacency effect, arising from complex urban geometry, cannot be neglected. Although current methods for the adjacency effect are based on view factors and the average radiance of adjacent pixels, a systematic investigation into how urban geometry and materials influence this effect remains lacking.
Beyond the commonly discussed impacts of urban geometry and materials, the advancement of thermal infrared (TIR) remote sensing technology and the proliferation of high-resolution sensors (e.g., GF-5, SDGSAT-1) have increasingly highlighted the role of the adjacency effect in ULST retrieval. For instance, while GF-5 satellite TIR data (40 m resolution) capture finer details of the urban thermal field, radiation interference from adjacent buildings can significantly increase retrieval errors [11]. This demonstrates that estimating the adjacency effect necessitates consideration of spatial resolution [11]. Coarser spatial resolution averages out the thermal contrast between adjacent pixels, so the adjacency effect is more pronounced in high-resolution imagery. The current availability of data across various spatial resolutions raises a critical question: which spatial resolution is optimal for ULST retrieval? Zhen [16] investigated the optimal spatial resolution for ULST retrieval using the TES algorithm. Numerical experiments utilizing the DART model suggested that 30 m represents a key threshold for the existence of pure pixels in cities. Nevertheless, the spatial resolution at which the adjacency effect can be neglected remains unclear. Therefore, this study systematically analyzes the influence of the adjacency effect under varying spatial resolutions, material properties, and thermal conditions.
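The smoothing of thermal contrast at coarser resolution can be illustrated with a small sketch. The 4 × 4 radiance grid below is hypothetical (arbitrary units, not taken from the simulations), and the helper names are illustrative. The sketch block-averages the grid and compares the largest difference between neighbouring pixels before and after aggregation.

```python
def aggregate(grid, factor):
    """Average a square 2D list over non-overlapping factor x factor blocks."""
    size = len(grid)
    if size % factor != 0:
        raise ValueError("grid size must be divisible by factor")
    coarse = []
    for r in range(0, size, factor):
        row = []
        for c in range(0, size, factor):
            block = [grid[i][j]
                     for i in range(r, r + factor)
                     for j in range(c, c + factor)]
            row.append(sum(block) / len(block))
        coarse.append(row)
    return coarse


def max_neighbor_contrast(grid):
    """Largest absolute difference between horizontally or vertically adjacent cells."""
    best = 0.0
    for i, row in enumerate(grid):
        for j, value in enumerate(row):
            if j + 1 < len(row):
                best = max(best, abs(value - row[j + 1]))
            if i + 1 < len(grid):
                best = max(best, abs(value - grid[i + 1][j]))
    return best


# Hypothetical fine-resolution radiances: warm (9.0) and cool (7.0) facets.
fine = [
    [9.0, 7.0, 9.0, 7.0],
    [7.0, 7.0, 7.0, 9.0],
    [9.0, 7.0, 7.0, 7.0],
    [7.0, 9.0, 7.0, 9.0],
]
coarse = aggregate(fine, 2)

if __name__ == "__main__":
    print(max_neighbor_contrast(fine), max_neighbor_contrast(coarse))
```

For this particular grid the neighbour contrast shrinks after aggregation. The amount of smoothing in real scenes depends on how warm and cool facets are distributed relative to the pixel grid.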
This research employs numerical experiments based on the Discrete Anisotropic Radiative Transfer (DART) model to construct simulated urban scenes with diverse geometric and material characteristics, enabling systematic simulation and analysis of the adjacency effect. The DART model stands as one of the most comprehensive physically based 3D models for Earth–atmosphere radiative transfer, covering the spectral domain from ultraviolet to thermal infrared wavelengths [17]. It simulates the optical 3D radiation budget (RB) and can compute optical signals captured by proximal, aerial, and satellite imaging spectrometers and laser scanners; it is applicable to any urban or natural landscape and any experimental or instrumental configuration [18]. The DART model has been validated through a series of RAMI (RAdiation transfer Model Intercomparison) experiments [19], with its 3D radiative transfer capabilities verified in both the visible and thermal infrared domains. Leveraging the DART model, this study systematically simulated and analyzed the adjacency effect: scenarios were constructed combining different spatial resolutions (1–120 m), building geometric parameters (building height, BH; roof area index; and surrounding obstruction view factor, SVF_Obs.), and material properties (reflectance R = 0.05, 0.10, 0.15) to quantify their impact on adjacency effects. The research aims to (1) elucidate the mechanisms by which geometric parameters (BH, roof area index, SVF_Obs.) and material properties (ε = 1 − R) govern the intensity of the adjacency effect; (2) determine the critical range of spatial resolution where the adjacency effect becomes negligible; and (3) evaluate the influence of surface component temperature differences.
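As a rough illustration of how such a factorial scenario set could be enumerated, the sketch below builds the combinations of spatial resolution, building height, and reflectance, deriving emissivity as ε = 1 − R. The reflectance values come from the text. The resolution list contains only values named elsewhere in this paper and may not be the full simulated set. The building heights and all identifiers are hypothetical placeholders, not the actual DART configuration.

```python
from itertools import product

REFLECTANCES = (0.05, 0.10, 0.15)        # stated in the text
RESOLUTIONS_M = (1, 3, 30, 60, 90, 120)  # values named in the text; possibly incomplete
BUILDING_HEIGHTS_M = (10, 30, 50)        # hypothetical placeholder values


def build_scenarios():
    """Return one dict per (resolution, building height, reflectance) combination."""
    scenarios = []
    for resolution, height, reflectance in product(
            RESOLUTIONS_M, BUILDING_HEIGHTS_M, REFLECTANCES):
        scenarios.append({
            "resolution_m": resolution,
            "building_height_m": height,
            "reflectance": reflectance,
            # Opaque surface, Kirchhoff's law: emissivity = 1 - reflectance.
            "emissivity": round(1.0 - reflectance, 2),
        })
    return scenarios


if __name__ == "__main__":
    all_scenarios = build_scenarios()
    print(len(all_scenarios), all_scenarios[0])
```

Keeping each scenario as a plain dictionary makes it straightforward to add further factors, such as the roof area index or component temperatures, without changing the enumeration logic.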
This study employs the DART model to simulate the brightness temperature differences of target pixels under isolated and adjacent conditions, systematically analyzing the influence of spatial resolution, urban geometric structures, and other factors on the adjacency effects in urban land surface temperature (ULST) retrieval. The anticipated contributions include the following: (1) determining the critical spatial resolution range where adjacency effects become negligible, thereby providing a scientific basis for spatial resolution selection in current ULST retrieval processes; (2) clarifying the relative importance and individual effects of different factors on adjacent pixel interference, addressing existing gaps in factor analysis within current research; and (3) validating the conclusion that temperature differences among urban components exert minimal influence on adjacency effects, which offers theoretical support for simplifying ULST retrieval models. The findings are expected to provide a key foundation for developing more accurate ULST retrieval algorithms and optimizing remote sensing data selection (spatial resolution).
Section 2 describes the models used, the setup of the simulated scenes, and the employed formulas. Section 3 presents a summary and visualization of the experimental results, assessing the influence of different factors on the adjacency effect. Discussion of the results is provided in Section 4, and conclusions are drawn in Section 5.
4. Discussion
In Section 3, we present the relationship between adjacency effects and various factors in radiative simulations of the urban scenario described in Section 2.3. However, these simulations employ symmetrically and regularly distributed buildings, which can only yield simplified relationships and fail to fully capture the complexity of real-world urban environments. Therefore, in this section, we adjust multiple factors (incorporating additional spatial resolutions, introducing real DSM data, and setting temperature differences among urban component surfaces) to analyze whether the aforementioned findings remain applicable in the investigation of complex urban scenarios.
4.1. Impacts of Multiple Factors on Adjacent Pixel Effects
Previous research on retrieving ULST from thermal infrared (TIR) imagery has highlighted the non-negligible influence of adjacent pixels in high-spatial-resolution remote sensing data [29]. This study aims to investigate the impact of various factors on adjacent pixel effects. By employing the DART model, we simulated the brightness temperature difference (ΔTb) of a target pixel induced by adjacent pixels under varying spatial resolutions, urban geometric structures, and material surface properties. The simulations, interpreted from a radiative transfer perspective, elucidate how energy from adjacent geometric structures influences the target pixel under different factors. Compared to prior studies, this work further reveals the influence mechanisms of spatial resolution, 3D structural parameters (building height, roof area index, and obstruction level), and material properties (emissivity and surface temperature).
4.1.1. Discussions on Other Spatial Resolutions
In Section 3.1, we simulated the target pixel’s ΔTb across 1–30 m spatial resolutions, with ΔTb serving as the quantitative metric for adjacent pixel effects. The target pixel’s sensitivity to energy from adjacent pixels was found to be less pronounced at 30 m than at finer resolutions. Figure 12 illustrates the frequency distribution of ΔTb across different spatial resolutions, which directly reflects variations in adjacency effect intensity. The 30 m resolution demarcates higher and lower resolutions based on ΔTb magnitude: in high ΔTb intervals (>1.5 K), frequencies exceed 20% for resolutions finer than 30 m but fall below 10% for coarser resolutions. Conversely, in low ΔTb intervals, approximately 50% of cases fall within 0–1 K, decreasing to 23% in the 1–1.5 K range, showing significant cross-interval fluctuation. Compared to high resolutions (e.g., 1 m, 3 m), the 30 m resolution exhibits a steeper gradient in ΔTb frequency distribution across intervals, validating the findings of Zhen [16], which showed that 30 m represents the spatial resolution where adjacent pixel effects exhibit the most significant variation within geometrically complex mixed pixels. However, as indicated by the fluorescent yellow polyline in Figure 12, ΔTb exceeding 1 K accounted for 68.75% of all simulated cases at 30 m resolution. Notably, a substantial portion reached 2 K, with specific geometric configurations (minimum building density and maximum building height) yielding ΔTb > 3 K. This indicates that 30 m resolution is not yet the critical threshold where adjacent pixel effects can be neglected in ULST retrieval. This result aligns with Chen [11], who concluded that adjacent pixel effects must be considered for ULST retrieval from high-resolution TIR data like GF-5’s 40 m imagery.
To more precisely identify the critical spatial resolution range where adjacent pixel effects significantly impact the target pixel, we extended the simulations to include 60 m, 90 m, and 120 m resolutions. Under isothermal conditions with reflectances of R = 0.05, 0.1, and 0.15, frequency statistics of the target pixel’s ΔTb at 60 m, 90 m, and 120 m are presented in Figure 12. At these lower spatial resolutions, over 70% of target pixels exhibited ΔTb < 1 K. At 90 m resolution (Figure 12), adjacent pixel effects induced ΔTb > 1 K only under exceptionally prominent urban geometric structures; in all other cases, ΔTb remained below 1 K. We therefore regard the 60–90 m range as the threshold at which the adjacency-induced error is attenuated to a level commensurate with the intrinsic uncertainty of current mainstream ULST algorithms (1–2 K for split-window methods) and thus acceptable for many urban heat island analyses.
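The interval statistics discussed above can be reproduced for any set of simulated ΔTb values with a few lines of code. The sketch below is a minimal illustration: the interval boundaries (0–1 K, 1–1.5 K, >1.5 K) follow the text, while the function name and the sample ΔTb values are hypothetical and are not the simulation outputs behind Figure 12.

```python
def delta_tb_frequency(delta_tb_values):
    """Percentage of cases per ΔTb interval: <1 K, 1-1.5 K (inclusive), >1.5 K."""
    if not delta_tb_values:
        raise ValueError("at least one ΔTb value is required")
    counts = {"0-1 K": 0, "1-1.5 K": 0, ">1.5 K": 0}
    for value in delta_tb_values:
        if value < 1.0:
            counts["0-1 K"] += 1
        elif value <= 1.5:
            counts["1-1.5 K"] += 1
        else:
            counts[">1.5 K"] += 1
    total = len(delta_tb_values)
    return {label: 100.0 * count / total for label, count in counts.items()}


# Hypothetical ΔTb values (K) for one spatial resolution.
sample = [0.2, 0.8, 1.2, 1.6, 2.4, 0.5, 0.9, 3.1]

if __name__ == "__main__":
    print(delta_tb_frequency(sample))
```

Running the same function once per spatial resolution would give the per-resolution frequency profiles that a figure such as Figure 12 compares.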
To substantiate the proposed critical range, an additional experiment was conducted on a randomly selected real study area (Hong Kong, China), with target pixels at spatial resolutions of 40 m and 80 m. The experiment is explicitly designed to verify the aforementioned critical range of spatial resolution; accordingly, material reflectance is fixed at 0.1 to eliminate its confounding influence. This choice is made for two reasons. First, it follows the common reflectance values of urban materials (0.05/0.1/0.15) statistically summarized in Section 2.4. Second, concrete and cement account for a high proportion of local building exterior walls and structures, and their corresponding parameters (ε = 0.9, R = 0.1) can represent the mainstream properties of urban surfaces. The building surface inputs for the experiment were derived from Hong Kong DSM data sourced from a spatial data sharing platform (https://portal.csdi.gov.hk/csdi-webpage, accessed on 29 August 2025). The adjacency effect was characterized by the simulated ΔTb under isothermal and homogeneous conditions (T = 300 K, R = 0.1). Figure 13 shows that at the 40 m spatial resolution, the Tb of the target pixel in an isolated state was 296.72 K, while under the influence of the surrounding environment, it was 297.49 K, resulting in a ΔTb of 0.77 K due to geometric and adjacency effects. At the 80 m spatial resolution, the Tb of the target pixel in isolation was 297.07 K, increasing to 297.37 K when influenced by the surrounding environment, yielding a ΔTb of 0.30 K. These results are consistent with the findings in Section 3.1: a higher spatial resolution (40 m) captures a more pronounced ΔTb induced by geometric and adjacency effects. At finer resolutions, pixels represent smaller ground areas, enabling a more detailed representation of ground heterogeneity and making the influence of heat exchange between environments more evident. Conversely, at coarser resolutions (80 m), the larger pixel coverage dilutes the manifestation of these effects. That is, pixels are more significantly impacted by their surroundings at high spatial resolutions, while at lower resolutions, they are less disturbed and approach their isolated Tb state. Under actual complex geometric structures, the result at 80 m is consistent with the proposed critical range of 60–90 m. Although the 80 m resolution in this experiment may not represent an exact threshold for neglecting adjacency effects, it still provides a valuable reference for subsequent research.
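The ΔTb values quoted above follow directly from the reported brightness temperatures. The short sketch below recomputes them; the Tb values are those given in the text for the Hong Kong experiment, and the function and variable names are illustrative.

```python
def delta_tb(tb_with_surroundings_k: float, tb_isolated_k: float) -> float:
    """Brightness temperature difference attributed to geometric and adjacency effects."""
    return tb_with_surroundings_k - tb_isolated_k


# Resolution (m) -> (isolated Tb, Tb with surroundings), both in K, as reported in the text.
reported_tb = {
    40: (296.72, 297.49),
    80: (297.07, 297.37),
}

if __name__ == "__main__":
    for resolution_m, (isolated, with_surroundings) in reported_tb.items():
        print(resolution_m, round(delta_tb(with_surroundings, isolated), 2))
```

The difference at 40 m is roughly 2.5 times that at 80 m, in line with the statement that the coarser pixel dilutes the influence of its surroundings.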
4.1.2. Analysis of Geometric Structure on Adjacency Effects
Regarding urban geometric structure, the results in Section 3.2 are based on simulations assuming uniformly and regularly distributed buildings. This controlled setup facilitated a clearer analysis of the correlation between the target pixel’s structural parameters and adjacent pixel effects. The findings in Section 3.2 indicate that adjacent pixel effects become more pronounced with increasing building height and decreasing building density. These trends are related to the void space within the target pixel (the 3D space formed between building facets and the urban ground plane). When adjacent geometric structures reflect energy towards the target pixel, not all reflected energy is received by the target pixel; only a portion enters and persistently influences it. Energy entering the target pixel’s internal structure undergoes multiple reflections between walls and the ground due to the void space. During these reflections, energy is partially absorbed by the internal materials due to their absorption properties. This ultimately manifests as an increase in the total absorbed radiance observed for the target pixel. Lower building density widens the “pathway” for reflected energy from adjacent structures to enter the target pixel. Increased building height provides more opportunities for the energy entering the target pixel to be reflected, accompanied by partial absorption, thereby increasing the overall radiance.
Furthermore, while the UCM-RT model proposed by Chen [11] introduced SVF_in and SVF_adj as fundamental descriptors for energy exchange between the target pixel and its surroundings (variables also commonly used in most studies of adjacent pixel effects), this study diverges by employing the sky view factor obstructed by adjacent structures (SVF_Obs.). The sky view factor of a target pixel depends on both its internal structure and external obstruction by surrounding structures. Therefore, we directly utilize this external obstruction metric to describe and quantify the portion influenced by adjacent pixels. Greater obstruction of the target pixel by adjacent structures (higher SVF_Obs.) corresponds to more energy reflected back to the target pixel, leading to more significant adjacent pixel effects. This aligns with the findings of Chen [11] regarding the significant impact of SVF_Obs. on ground temperature and Top-Of-Atmosphere (TOA) brightness temperature, where a decrease in SVF_adj (implying reduced sky view for adjacent pixels and thus more reflection towards the target) increases the contribution of scattered radiance from adjacent pixels.
However, the findings presented in Section 3 of this study were derived from simulations under idealized, symmetric urban building structures. To investigate the relationship between geometric structure and adjacency effects under more realistic conditions, we further examined the relationship between the surrounding obstruction view factor (SVF_Obs.) and the resulting brightness temperature change (ΔTb) of the target pixel due to adjacency effects, under isothermal and homogeneous conditions at a spatial resolution of 40 m. This experiment focuses on the impact of geometric structure on the adjacency effect. To avoid interference, all experiments on real urban surfaces use uniform material properties so that geometric structure is the only variable; the surface reflectance is therefore again fixed at 0.1, on the same basis as described in Section 4.1.1. The building surface inputs for this experiment utilized DSM data of Hong Kong, sourced from a spatial data sharing platform (https://portal.csdi.gov.hk/csdi-webpage, accessed on 29 August 2025). Five randomly selected areas (40 m × 40 m), designated HK_1 to HK_5, served as the study objects. Consistent with the findings in Section 3.2 across various spatial resolutions, an increase in SVF_Obs. was found to correspond to an increase in ΔTb. This indicates that greater obstruction of the target pixel by surrounding structures leads to a more pronounced influence from adjacency effects.
Figure 14 confirms that this influence pattern also holds at the 40 m resolution. Overall, ΔTb exhibits an increasing trend with higher SVF_Obs. Nevertheless, within these five randomly selected study areas, ΔTb does not demonstrate a strictly monotonic relationship with SVF_Obs. This deviation contrasts with the results in Section 3.1 and Section 3.2, which were obtained under regular urban scenarios exhibiting relatively distinct linear relationships. The irregular urban surfaces examined here involve more complex thermal interactions between components, so the relationship between SVF_Obs. and ΔTb shows localized deviations from monotonicity in real urban environments. Although the overall trend remains largely consistent, irregular building geometries (such as asymmetrical heights, misaligned orientations, and abrupt variations in obstruction angles) disrupt the idealized multiple-scattering pathways. These structural heterogeneities alter radiation propagation within the urban canopy, leading to complex trajectories of scattered energy and localized fluctuations in ΔTb, even under uniform surface temperatures and material properties. Notably, such non-monotonic behavior often occurs among pixels with similar SVF_Obs. values, underscoring the critical importance of 3D morphological variations beyond SVF_Obs. itself. Therefore, accurate simulation in real urban settings requires geometric descriptors capable of capturing both intricate spatial arrangements and mutual obstructions among buildings.
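One simple way to quantify "increasing overall but not strictly monotonic" is to pair a rank correlation with a strict monotonicity check. The sketch below does this in plain Python. The five (SVF_Obs., ΔTb) pairs are hypothetical stand-ins for areas such as HK_1 to HK_5 and are not the values plotted in Figure 14. The Spearman formula used here assumes no tied values.

```python
def ranks(values):
    """Rank positions (1 = smallest); assumes all values are distinct."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, index in enumerate(order, start=1):
        result[index] = rank
    return result


def spearman_rho(x, y):
    """Spearman rank correlation for samples without ties."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length of at least 2")
    d_squared = sum((a - b) ** 2 for a, b in zip(ranks(x), ranks(y)))
    return 1.0 - 6.0 * d_squared / (n * (n ** 2 - 1))


def is_strictly_increasing(x, y):
    """True if y increases at every step when the pairs are sorted by x."""
    paired = sorted(zip(x, y))
    return all(later[1] > earlier[1] for earlier, later in zip(paired, paired[1:]))


# Hypothetical values for five study areas (not the paper's data).
svf_obs = [0.10, 0.18, 0.25, 0.31, 0.40]
delta_tb_k = [0.35, 0.60, 0.52, 0.81, 0.95]

if __name__ == "__main__":
    print(spearman_rho(svf_obs, delta_tb_k), is_strictly_increasing(svf_obs, delta_tb_k))
```

With these illustrative numbers the rank correlation is high while the strict check fails, mirroring the situation described above: a clear overall trend with local reversals among areas of similar obstruction.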
4.1.3. Analysis of Non-Isothermal Surfaces on Adjacency Effects
Regarding material properties, Zhong [5], utilizing UAV data, indicated that adjacent pixel effects are influenced by the target pixel’s emissivity and the surrounding environment, with pixels of lower emissivity being more significantly affected. Under isothermal conditions in Section 3.3 of this study, we simulated different material reflectances (R = 0.05, 0.1, 0.15). The results corroborated the above conclusion, demonstrating the universality of this pattern across different urban morphologies. Critically, the simulations revealed that material properties exert a stronger influence on adjacent pixel effects than urban structural parameters.
The preceding discussions were conducted under isothermal conditions. However, in reality, component temperatures exhibit significant non-isothermicity due to factors such as solar radiation incidence, orientation, and surface area. To investigate the applicability of the isothermal findings and quantify the impact of surface component non-isothermicity on adjacent pixel effects across spatial resolutions, we introduced temperature differences between components. Typically, roofs exhibit higher temperatures, walls are cooler, and ground surfaces fall in between. Consequently, temperatures were set to 315 K for roofs, 310 K for ground surfaces, and 300 K for walls as inputs for the DART simulations. The results demonstrated that differences in component temperatures have a negligible impact on adjacent pixel effects (<0.5 K ΔTb, mostly <0.1 K at lower resolutions), indicating that the conclusions derived from isothermal surfaces remain applicable. Figure 15 illustrates this for the same scene, comparing isothermal and non-isothermal surfaces. Although warmer ground surfaces radiate more energy than cooler walls, potentially increasing the radiation impacting adjacent pixels, the resulting difference in adjacent pixel effects due to the temperature contrast between 3D building facets and ground surfaces proved minimal and comparable to the isothermal case. Furthermore, the disparity in adjacent pixel effects between the two scenarios (isothermal vs. non-isothermal) is very small at low spatial resolutions (coarser than 30 m). Therefore, within the identified critical spatial resolution range (60–90 m), the isothermal assumption can reasonably be employed to simplify the quantification of adjacent pixel effects.
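To give a sense of how much the emitted radiance differs between the three component temperatures, the sketch below evaluates the Planck function at 10 µm and inverts it to a brightness temperature. The component temperatures are those stated above. The 10 µm wavelength, the gray-body emissivity of 0.9, and the omission of reflected and atmospheric terms are simplifying assumptions for illustration, so this does not reproduce the DART simulation.

```python
import math

C1 = 1.191042972e8   # 2*h*c^2 in W µm^4 m^-2 sr^-1
C2 = 1.438776877e4   # h*c/k in µm K


def planck_radiance(wavelength_um: float, temperature_k: float) -> float:
    """Blackbody spectral radiance in W m^-2 sr^-1 µm^-1."""
    return C1 / (wavelength_um ** 5 * (math.exp(C2 / (wavelength_um * temperature_k)) - 1.0))


def brightness_temperature(wavelength_um: float, radiance: float) -> float:
    """Temperature (K) of a blackbody emitting the given spectral radiance."""
    return C2 / (wavelength_um * math.log(C1 / (wavelength_um ** 5 * radiance) + 1.0))


WAVELENGTH_UM = 10.0   # assumed single-band wavelength
EMISSIVITY = 0.9       # assumed gray-body emissivity (R = 0.1)
component_temperature_k = {"roof": 315.0, "ground": 310.0, "wall": 300.0}

if __name__ == "__main__":
    for name, temperature in component_temperature_k.items():
        emitted = EMISSIVITY * planck_radiance(WAVELENGTH_UM, temperature)
        # Emission-only brightness temperature; reflected radiation is ignored here.
        print(name, round(emitted, 3), round(brightness_temperature(WAVELENGTH_UM, emitted), 2))
```

The emission-only brightness temperature is lower than the kinetic temperature because ε < 1; in a real scene, radiation reflected from the sky and surrounding facets partly compensates for this deficit, and that compensation is the contribution the adjacency analysis seeks to quantify.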
4.2. Significance of Simulation and Potential Improvements
High-spatial-resolution remote sensing imagery now offers finer details of the thermal environment. However, for large-scale thermal environment assessment, evaluating the adjacent pixel effects remains crucial. The preceding analysis, conducted on regularized scenes with uniform building height, shows that, within the 60–90 m resolution window, the adjacency-induced error drops below 1 K in the majority of cases, rendering the effect operationally manageable rather than negligible. We emphasize that these results are most valid for relatively homogeneous urban layouts and should be re-evaluated for highly heterogeneous environments before wider application. Furthermore, material emissivity was found to exert a more pronounced influence on the magnitude of adjacent pixel effects than geometric structure. Although the relationships between various factors and adjacent pixel effects were established through simulation, several key aspects require further consideration and refinement in future work.
First, in existing simulations, material spectra and building morphology are often simplified to a “single emissivity + regular array” configuration, which significantly deviates from the complex underlying surfaces of real urban areas. To quantitatively evaluate the uncertainty introduced by such simplifications, we further incorporated random materials (multiple samples labeled as artificial materials from the ASTER spectral library, which exhibit diverse optical properties in the 10 µm simulation band) and more realistic building data, conducting a preliminary simulation of the adjacency effect at a spatial resolution of 30 m. Even under the isothermal surface assumption, the results in Figure 16 show that the temperature differences for most target pixels reached 2–3 K, with a full range of 0–15 K, significantly higher than the idealized results (0–4 K) presented in the yellow box in Figure 7. The simulation in Figure 7 is based on the assumptions of “isothermal, homogeneous materials, and regular geometry,” whereas the irregular substrates, varying building heights, and the resulting spatial heterogeneity in SVF in real urban areas markedly enhance the multiple scattering and reflection processes of thermal radiation [30]. Moreover, unifying material emissivity diminishes the radiative difference signals between adjacent pixels, leading to a systematic underestimation of the adjacency effect, which should otherwise be pronounced. Such temperature differences, co-modulated by “material-geometry” details, are typical features revealed in refined urban thermal environment studies. Therefore, future research should utilize high-resolution 3D building models or aerial LiDAR data to more accurately characterize urban structural details. Simultaneously, a more comprehensive urban feature spectral library should be introduced, coupled with laboratory-measured hyperspectral emissivity data, to assign more precise optical and thermal parameters to surfaces of different materials and aging stages, thereby enhancing the model’s universality and retrieval accuracy for real urban underlying surfaces.
Second, the urban thermal environment is shaped by multiple underlying surfaces such as buildings, vegetation, and water bodies. This study intentionally focuses on the built-up structures in urban areas, where the 3D configuration of buildings, the materials they are made of, and the spatial resolution-induced spectral mixing of these structures in imagery are the core factors to be analyzed. For the purposes of this research, vegetation and water bodies are temporarily simplified as background elements, as they generally exhibit relatively flat morphology and lack the “cavity structures” (which are present in buildings) that generate multiple scattering. Nevertheless, transpiration from vegetation and the thermal inertia of water bodies continue to influence the temperature field through distinct thermal radiation and energy exchange processes. Therefore, incorporating a wider range of surface components would enable a more comprehensive characterization of how adjacent pixels influence the target pixel.
Finally, validation against field measurements such as ground-based thermal infrared observations or UAV remote sensing remains an indispensable yet missing step of the present study. However, the DART model employed in our simulations has already undergone a three-way validation covering flux, brightness temperature, and directionality. For example, Widlowski [31] cross-compared DART with eleven 3D Monte Carlo models within the RAMI-III exercise over heterogeneous vegetation and urban structures and reported relative reflectance discrepancies ≤1%, and Sobrino [32] further validated DART-derived brightness temperatures over bare soil, grassland, and urban scenes using airborne AHS and satellite ASTER data together with ground stations, achieving root-mean-square errors <2 K. These results confirm the applicability of DART to real 3D urban structures, hence its usability in thermal infrared remote sensing. Nevertheless, direct comparison between simulations and real-world scenes still confronts several practical barriers. First, temporal mismatch and consistency issues (such as asynchronous satellite overpass and ground observation times, divergent atmospheric conditions, and inconsistent spectral/field-of-view characteristics of instruments) introduce systematic biases. Second, inherent confounding factors in imagery, e.g., uncertainties in surface emissivity retrieval and unresolved mixed pixels, further complicate the validation. Consequently, future multi-source data fusion strategies (e.g., coupling mobile ground campaigns with coordinated satellite acquisitions) are required to address these challenges. Moreover, rigorous validation itself remains a topic that warrants continuous investigation, demanding iterative refinement of ULST retrieval algorithms and a deeper interpretation of radiative transfer processes.