1. Introduction
Building patterns typically refer to visually significant structures formed by a single building or a group of buildings, which often contain rich spatial information and hold considerable social significance in modern urban planning and spatial analysis [
1,
2,
3]. Beyond their geometric appearance, letter-shaped building patterns such as I-, C-, and E-shaped layouts reflect typical spatial organization logics in urban environments. These patterns are commonly associated with residential compounds, campuses, and industrial blocks, and are closely related to land-use intensity, internal circulation structures, and spatial enclosure characteristics. Therefore, identifying such building patterns provides useful intermediate-scale indicators for analyzing urban spatial structure, diagnosing development constraints, and supporting planning decisions regarding future urban expansion [
4]. In the construction of smart cities, the automated identification and analysis of building patterns offer a scientific basis for urban decision-making [
5]. Automatically extracting, simplifying, and retaining the core information of building patterns enables a more intuitive representation of significant building features in geographic space [
6,
7,
8]. Thus, building pattern recognition is also essential for enhancing the efficiency and accuracy of map generalization.
Contemporary research on building pattern recognition falls broadly into two approaches: shape recognition for individual buildings and arrangement recognition for clusters of buildings. Shape recognition focuses on analyzing geometric features and parameters of buildings, employing techniques such as corner functions [
9], Fourier descriptors [
10], and graph convolutional autoencoders [
11,
12]. Conversely, arrangement recognition leverages template matching [
13], structural rules [
14], and machine learning techniques [
15] to decode the layouts of building groups. Within this body of work, I-shaped building patterns, noted for their distinct geometric configurations and significant spatial relationships, have emerged as a focal area of study. They represent a type of building pattern that is commonly found in real cities and possesses a distinct geometric structure, making it highly visible within urban layouts. Additionally, focusing on this specific pattern enables a deeper exploration of the performance and advantages of recognition methods when applied to building groups with evident geometric relationships, with the potential for subsequent extension to the recognition of other building patterns [
14]. Nonetheless, traditional techniques primarily focus on the geometric attributes of individual objects, which, while effective at a certain level, often lack the precision and robustness needed for recognizing complex building groupings in urban settings and across varied spatial scales. As illustrated in
Figure 1, existing building pattern recognition methods can successfully identify I-shaped patterns that exhibit clear and stable structural characteristics at a single scale, such as the cases shown in
Figure 1a,c. In contrast, the patterns shown in
Figure 1b,d rely on building configurations that are distributed across multiple scales, where the I-shaped structure is fragmented or ambiguous when observed at any individual scale. As a result, these patterns are difficult to reliably identify by conventional single-scale methods. In this article, the patterns in
Figure 1b,d are referred to as potential building patterns.
Multi-scale data, which partially reflect human insights into the clustering and segmentation of building groups [
16,
17,
18], can facilitate the identification of latent building patterns that conform to human visual principles. For instance, in the case where the buildings in
Figure 1a,c are identified as I-shaped building patterns, the potential I-shaped building pattern in
Figure 1b can be recognized through the “one-to-many” relationship between the buildings in
Figure 1a,b, or the “many-to-one” relationship between the buildings in
Figure 1c,b. Similarly, the potential building pattern in
Figure 1d can be recognized using the same method. The use of multi-scale data effectively addresses the limitations of single-scale data in complex pattern recognition. By fusing cross-scale information, it becomes possible to capture a broader range of potential patterns. In urban architecture and cartography, researchers have integrated multi-source, multi-scale building data with geometry-based matching methods and geospatial ontologies to model multi-scale building representations and spatial relationships [
19]. Related studies have also investigated changes in building representations induced by map generalization, distinguishing temporal updates from representational changes across scales [
20].
However, multi-scale building data are inherently heterogeneous and structurally complex, which poses significant challenges for their integrated analysis. Existing approaches based on spatial databases are mainly designed for static data storage and geometric queries, and thus have limited capability to represent complex spatial semantics, manage multi-scale relationships, or support dynamic reasoning. As a result, effectively integrating and reasoning over building data across different scales remains difficult using conventional spatial database technologies.
Knowledge graphs (KGs), as graph-based structures for modeling multi-relational networks, are well-suited for representing both the geometric features and semantic relationships of buildings. By organizing entities as nodes and relationships as edges, KGs support explicit and interpretable rule-based reasoning, enabling dynamic inference with fully traceable reasoning processes. These properties make KGs particularly effective for integrating heterogeneous multi-scale building data and reasoning about complex spatial relationships [
21,
22].
Consequently, this paper utilizes a knowledge graph to articulate the multi-scale shape features, spatial relationships, and semantic attributes of buildings, thus introducing a novel approach for recognizing I-shaped building patterns within a cohesive framework. This method uses knowledge graphs to discern I-shaped arrangement patterns within building clusters and utilizes recognition outcomes from multi-scale data to enhance the identification of potential I-shaped building configurations. Moreover, individual buildings characterized by I-shaped designs have previously been identified through human visual recognition techniques.
The rest of this paper is organized as follows: In
Section 2, we introduce related work and concepts.
Section 3 details the knowledge graph construction process for recognizing I-shaped building patterns.
Section 4 outlines how rule-based reasoning is applied using the constructed knowledge graph.
Section 5 presents the experimental results following the workflow.
Section 6 explores related topics. Finally,
Section 7 offers a conclusion to the paper.
3. Construction of a Knowledge Graph for Recognizing I-Shaped Building Patterns
3.1. Defining Spatial Relationships in I-Shaped Building Group Arrangements
The spatial arrangement of I-shaped building patterns typically comprises three distinct buildings: one centrally located building flanked by two others, forming the I-shape. The two side buildings are parallel to each other and perpendicular to the central building. The definition of an I-shaped arrangement therefore includes “proximity”, “fully parallel”, and “fully perpendicular” relationships between the buildings. To refine these definitions, this paper introduces the concept of “relative positional relationships” to more precisely delineate the spatial positioning of each building relative to others. Moreover, given that the central positioning of the middle building is crucial for pattern recognition quality, a “center distance constraint” is applied to enhance the alignment of recognition outcomes with human visual perception principles.
3.1.1. Proximity Relationships
In the context of building groups, this study addresses the proximity relationships between spatially adjacent yet unconnected buildings using the Constrained Delaunay Triangulation (CDT) skeleton method. This approach employs shared boundaries or corners to establish the proximity relationships between two buildings on a map. As illustrated in
Figure 2a, buildings sharing the same CDT skeleton are considered adjacent. Subsequently, a proximity graph is constructed to visually represent these relationships, as depicted in
Figure 2b. Within the knowledge graph, the proximity between buildings Bi and Bj is represented by a ‘Neighbor’ type relationship between the respective entities.
3.1.2. Relative Positional Relationships
This study employs the Smallest Bounding Rectangle (SBR) method to describe the relative positions between two buildings, denoted as the reference building (Ba) and the target building (Bb), with their corresponding SBRs referred to as SBRa and SBRb, respectively [14]. By orienting the longer and shorter axes of SBRa as the X and Y axes, respectively, the eight-directional neighborhood of SBRa is established, as shown in Figure 3. This framework allows for the analysis of Ba and Bb’s relative positions by evaluating where SBRb is located within SBRa’s designated eight-directional neighborhood.
When SBRb is projected onto the long or short axis of SBRa, their spatial relationship can be characterized using 13 fundamental relative positions. These relations include before, meets, overlaps, starts, during, finishes, equal, and the corresponding inverse relations, forming a set of 13 mutually exclusive and jointly exhaustive spatial relations. In this study, these relations are adopted to discretize and model the relative positional relationships between building entities, thereby facilitating rule-based reasoning within the knowledge graph. Because SBRb is projected onto both the long and short axes of SBRa, the relative position between the two can be denoted as r_{i,j}(a, b). This notation covers the 13 × 13 = 169 potential directional relationships in which the target building Bb is positioned relative to the reference building Ba, as shown in Figure 4. In this figure, orange indicates the reference building Ba, while blue signifies the target building Bb.
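The discretization described above can be illustrated with a minimal Python sketch. It assumes that each SBR has already been projected onto the long and short axes of SBRa as a (start, end) interval; the function names and the names chosen for the inverse relations are illustrative and do not come from the original method.

```python
def allen_relation(ref, tgt):
    """Relation of the target interval relative to the reference interval.

    ref, tgt: (start, end) tuples with start < end, measured along one axis
    of the reference SBR. Returns one of 13 relation names.
    """
    a0, a1 = ref
    b0, b1 = tgt
    if b1 < a0:
        return "before"
    if b1 == a0:
        return "meets"
    if b0 > a1:
        return "after"            # inverse of before
    if b0 == a1:
        return "met_by"           # inverse of meets
    if b0 == a0 and b1 == a1:
        return "equal"
    if b0 == a0:
        return "starts" if b1 < a1 else "started_by"
    if b1 == a1:
        return "finishes" if b0 > a0 else "finished_by"
    if b0 > a0 and b1 < a1:
        return "during"
    if b0 < a0 and b1 > a1:
        return "contains"         # inverse of during
    return "overlaps" if b0 < a0 else "overlapped_by"


def relative_position(ref_x, ref_y, tgt_x, tgt_y):
    """Pair of relations (long axis, short axis), i.e. one of 13 x 13 = 169 cases."""
    return allen_relation(ref_x, tgt_x), allen_relation(ref_y, tgt_y)


# Target lies beyond the reference along the long axis and within it on the short axis.
print(relative_position((0, 10), (0, 4), (12, 20), (1, 3)))
```

With real coordinates, exact equality tests would normally be replaced by comparisons within a small tolerance, which is left out here for brevity.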
In this study, buildings with overlapping geometries, as highlighted by the red boxes in Figure 4, are excluded from the relative positional analysis due to the ambiguity they introduce in defining spatial relationships. Moreover, buildings with excessively complex shapes are generally not considered suitable for forming regular patterns. Before a building’s shape is approximated by its SBR, its shape complexity is assessed by calculating its rectangularity, defined as the ratio of the building’s area to the area of its SBR. A rectangularity greater than 0.6 suggests that the building’s shape approximates a rectangle and is therefore simple. Buildings with a lower rectangularity, indicative of greater shape complexity, are excluded from the analysis of relative positional relationships and are not considered part of the I-shaped building pattern formation [
5].
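The rectangularity test can be sketched as follows. This is a minimal illustration, assuming that the SBR is the minimum-area bounding rectangle of the footprint and that footprints are simple polygons given as coordinate lists; the helper names are hypothetical, and the 0.6 cutoff is the value stated above.

```python
import math


def polygon_area(pts):
    """Shoelace area of a simple polygon given as [(x, y), ...]."""
    total = 0.0
    for i in range(len(pts)):
        x0, y0 = pts[i]
        x1, y1 = pts[(i + 1) % len(pts)]
        total += x0 * y1 - x1 * y0
    return abs(total) / 2.0


def convex_hull(pts):
    """Andrew's monotone chain; hull vertices in counter-clockwise order."""
    pts = sorted(set(pts))
    if len(pts) <= 2:
        return pts

    def cross(o, a, b):
        return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

    lower, upper = [], []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]


def sbr_area(pts):
    """Area of the minimum-area bounding rectangle.

    One side of that rectangle is collinear with a convex-hull edge, so each
    hull edge is tried as the rectangle's orientation.
    """
    hull = convex_hull(pts)
    best = math.inf
    for i in range(len(hull)):
        x0, y0 = hull[i]
        x1, y1 = hull[(i + 1) % len(hull)]
        ang = math.atan2(y1 - y0, x1 - x0)
        c, s = math.cos(-ang), math.sin(-ang)
        xs = [x * c - y * s for x, y in hull]   # rotate so the edge is horizontal
        ys = [x * s + y * c for x, y in hull]
        best = min(best, (max(xs) - min(xs)) * (max(ys) - min(ys)))
    return best


def rectangularity(pts):
    """Footprint area divided by SBR area (assumes a non-degenerate polygon)."""
    return polygon_area(pts) / sbr_area(pts)


l_shape = [(0, 0), (4, 0), (4, 1), (1, 1), (1, 4), (0, 4)]
print(rectangularity(l_shape) > 0.6)  # the L-shaped footprint is treated as complex
```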
3.1.3. Fully Parallel Relationships
Fully parallel relationships occur when two buildings, Ba and Bb, not only share similar dimensions and shapes but are also aligned parallel to each other along their main orientation, with their long axes partially overlapping [14]. To quantify the extent of this overlap along the long axes of Ba and Bb, this study introduces the long axis overlap ratio (LR), as described in Equation (1). This ratio is critical for assessing the degree of alignment necessary to classify the relationship as fully parallel.
In this formula, LR(Ba, Bb) quantifies how much the projection of the target building Bb on the reference building Ba’s long axis overlaps with Ba’s axis itself, with ‘Overlap’ indicating the intersection area and ‘Merge’ indicating the combined area.
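As an illustration of this ratio, the sketch below treats the two projections as one-dimensional intervals on the long axis of Ba and takes 'Overlap' as the length of their intersection and 'Merge' as the length of their combined extent. This interval-based reading and the function name are assumptions made for the example.

```python
def long_axis_overlap_ratio(ref, tgt):
    """LR for two (start, end) extents measured along the reference long axis."""
    overlap = min(ref[1], tgt[1]) - max(ref[0], tgt[0])
    if overlap <= 0:
        return 0.0                     # projections do not overlap
    merge = max(ref[1], tgt[1]) - min(ref[0], tgt[0])
    return overlap / merge


print(long_axis_overlap_ratio((0, 10), (5, 15)))   # partial overlap
print(long_axis_overlap_ratio((0, 10), (0, 10)))   # identical extents
```

Under this reading, LR ranges from 0 for disjoint projections to 1 for identical ones.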
The criteria for a fully parallel relationship are outlined in Equation (2).
In this formula, A is the building area, O is the primary orientation of the building’s SBR, LR is the Long Axis Overlap Ratio, and θ1, θ2, θ3 are the thresholds that determine the precise conditions under which the relationship is considered fully parallel. The ratio Aa/Ab is a positive value, and the closer this ratio is to 1, the more similar the areas of the two buildings are. Subtracting 1 shifts the ratio so that perfect area similarity corresponds to SimA = 0. Taking the absolute value ensures that SimA is non-negative whether Ba is larger or smaller than Bb.
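A possible reading of these criteria is sketched below. The pairing of each threshold with a condition (SimA with θ1, orientation difference with θ2, LR with θ3), the direction of each inequality, and the treatment of orientations as undirected axes in degrees are assumptions; the default values are those reported in Section 5.2.

```python
def sim_a(area_a, area_b):
    """Area similarity: 0 means identical areas."""
    return abs(area_a / area_b - 1.0)


def pal_o(o_a, o_b):
    """Difference between two undirected axis orientations, in degrees within [0, 90]."""
    d = abs(o_a - o_b) % 180.0
    return min(d, 180.0 - d)


def is_full_pal(area_a, area_b, o_a, o_b, lr, theta1=2.0, theta2=15.0, theta3=0.4):
    """True when all three assumed conditions hold."""
    return (
        sim_a(area_a, area_b) <= theta1     # similar size
        and pal_o(o_a, o_b) <= theta2       # nearly parallel main orientations
        and lr >= theta3                    # sufficient long-axis overlap
    )


print(is_full_pal(100.0, 120.0, 10.0, 20.0, 0.5))
print(is_full_pal(100.0, 120.0, 10.0, 40.0, 0.5))
```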
3.1.4. Full Perpendicular Relationships
A full perpendicular relationship is established between buildings Ba and Bb when their principal directions are perpendicular to each other, and the projection of Bb on Ba aligns precisely with Ba’s long axis. This condition is represented as Full_Per(Ba, Bb), indicating a strictly unidirectional relationship from the reference building Ba to the target building Bb. The definition of this relationship is specified in Equation (3).
In this formula, O is the primary orientation of the building’s SBR, and PerO quantifies how close the orientations of two buildings are to being perpendicular. When the angle difference between Oa and Ob approaches 90°, the value of PerO decreases and approaches zero, indicating that the two buildings are nearly perpendicular in orientation. When the angle difference deviates from 90°, the value of PerO increases, reflecting a weaker perpendicular relationship between the two buildings.
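The behaviour of PerO described above can be reproduced with a short sketch. The exact expression is an assumption: orientations are taken as undirected axes in degrees, PerO is the deviation of their difference from 90°, and the tolerance `max_dev` is a hypothetical parameter. The projection condition on the long axis of Ba is not modelled here.

```python
def per_o(o_a, o_b):
    """Deviation from perpendicularity in degrees: 0 when the axes are at 90 degrees."""
    d = abs(o_a - o_b) % 180.0
    d = min(d, 180.0 - d)          # axis difference folded into [0, 90]
    return 90.0 - d


def is_nearly_perpendicular(o_a, o_b, max_dev=15.0):
    """Orientation part of the perpendicularity check (max_dev is an assumed tolerance)."""
    return per_o(o_a, o_b) <= max_dev


print(per_o(0.0, 90.0), per_o(10.0, 95.0), per_o(0.0, 0.0))
```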
3.1.5. Center Distance Constraint
The I-shaped building pattern is characterized by parallel wing buildings with a central building that is perpendicular to these wings and positioned at their midpoint. The centrality of this middle building is an essential factor that influences its recognition as part of an I-shaped pattern, based on human visual perception principles. This paper establishes a center distance evaluation system by measuring the relative distance between the centroid of the middle building and that of the entire building group, as specified in Equation (4).
In this formula, A is the central building, while B1 and B2 are the wing buildings on either side. dis(x,y) denotes the Euclidean distance between the centroids of buildings x and y. This distance is normalized and expressed in a relative form, and the subtraction of 1 is applied to shift the ideal distance condition to zero. As a result, smaller values indicate that the spatial separation between A and B1 is closer to the expected configuration, while larger values reflect a greater deviation from this ideal distance.
3.2. Definitions of Building Matching Relationships at Different Scales
Map data at smaller scales is commonly derived through the generalization of larger-scale data, essentially an information abstraction process. When buildings from a large-scale map are updated to a smaller-scale map, they might be simplified in shape, merged with other buildings, or even removed entirely [
1]. Such changes can lead to substantial discrepancies in the shape and size of the same building on different scales, or result in its absence on a smaller scale. To address this, this study defines six matching relationships to associate the same building entities across scales: one-to-one (1:1), one-to-many (1:M), many-to-one (M:1), many-to-many (M:M), one-to-none (1:0), and none-to-one (0:1), as detailed in
Figure 5. These relationships facilitate the consistent representation of buildings despite scale-driven alterations in their depiction.
The determination of matching relationships between corresponding buildings at different scales employs methodologies outlined in recent studies [
19,
45]. Utilizing Equation (5), the overlap ratio, Roverlap, is calculated efficiently. This ratio is crucial for identifying whether two instances of a building,
Bb and
Bs, at different scales represent the same structure. Specifically, following previous studies [
19], if the Roverlap between
Bb and
Bs is 0.3 or higher, they are considered to represent the same building at different scales. The type of matching relationship is then inferred from the number of matched nodes that each node has at the other scale in the knowledge graph.
In this formula, Ab and As represent the areas of buildings Bb and Bs, respectively, while A(Bb ∩ Bs) denotes the area where Bb and Bs overlap.
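The inference of matching types from matched-node counts can be sketched as below. The sketch assumes that the overlap ratios have already been computed for candidate pairs and that types are written in the order large-scale : small-scale. Typing each pair from the two node degrees is a simplification, and the identifiers are hypothetical.

```python
from collections import defaultdict


def match_types(overlaps, threshold=0.3):
    """overlaps: {(large_id, small_id): overlap_ratio} -> {(large_id, small_id): type}."""
    large_to_small = defaultdict(set)
    small_to_large = defaultdict(set)
    for (large, small), ratio in overlaps.items():
        if ratio >= threshold:                 # same building at two scales
            large_to_small[large].add(small)
            small_to_large[small].add(large)

    result = {}
    for large, smalls in large_to_small.items():
        for small in smalls:
            many_small = len(smalls) > 1                    # one large-scale node, several small-scale nodes
            many_large = len(small_to_large[small]) > 1     # several large-scale nodes, one small-scale node
            if many_small and many_large:
                kind = "M:M"
            elif many_small:
                kind = "1:M"
            elif many_large:
                kind = "M:1"
            else:
                kind = "1:1"
            result[(large, small)] = kind
    return result


ratios = {("L1", "S1"): 0.9, ("L2", "S2"): 0.5, ("L3", "S2"): 0.4, ("L4", "S3"): 0.1}
print(match_types(ratios))
```

Pairs below the threshold produce no entry, which mirrors the 1:0 and 0:1 cases in which a building has no counterpart at the other scale.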
3.3. Construction of the Knowledge Graph
Compared to the proximity graph model, the attribute graph model has shown greater prominence and broader adoption in knowledge graph applications, as noted by [
46]. This paper utilizes the attribute graph model (more commonly called the property graph model) in the Neo4j graph database to describe the graph structure of building groups with nodes and edges, representing the attribute graph of the building group as: Graph = (B, R).
In this framework, B represents a set of nodes, defined as B = {B1, B2, …, Bn}. Each node corresponds to an independent building entity, and these nodes are categorized based on their scale with labels such as “BuildingBig” for large-scale buildings and “BuildingSmall” for smaller-scale structures. Each building entity within the graph is detailed with four key attributes: “bID” which serves as a unique identifier for each building; “Area” which specifies the building’s total surface area; “Orientation” which describes the principal direction that the building faces; and “Pattern” which indicates whether the building is part of an I-shaped building arrangement.
R represents a set of edges, R = {Rm(Bi, Bj), Bi ∈ B, Bj ∈ B}, where Rm(Bi, Bj) indicates the relationship between entities Bi and Bj. The proximity between buildings Ba and Bb is represented in the knowledge graph as a relationship between the corresponding entities, denoted as Ba-(Neighbor)-Bb. The relative position relationship is represented as a unidirectional relationship from the reference building entity Ba to the target building entity Bb, denoted as Ba-(Position)->Bb. This relative position relationship has two relationship attributes: “P_type” indicates the type of relative position relationship, and “LR” represents the degree of overlap of the long axes of the two buildings. From the size, main direction, proximity, and relative position relationships between buildings, the bidirectional fully parallel relationship and the unidirectional fully perpendicular relationship between building entities Ba and Bb are inferred, represented as Ba-(Full_Pal)-Bb and Ba-(Full_Per)->Bb, respectively. Additionally, the center distance between building entities can be deduced from the centroid distance between the buildings, with the centroid distance between entities Ba and Bb represented as dis(Ba, Bb).
The matching relationships between buildings across different scales can be represented as a “Match” relationship with an attribute “M_type”, where the value of “M_type” corresponds to the type of relationship, including one-to-one (1:1), one-to-many (1:M), many-to-one (M:1), and many-to-many (M:M) types. The one-to-none (1:0) and none-to-one (0:1) relationships defined in Section 3.2 do not need to be explicitly represented in the knowledge graph.
Entities and relationships in the knowledge graph are shown in
Table 2. This paper utilizes Neo4j to store the knowledge graph and uses the Cypher query language to define rules for building pattern recognition, thereby enhancing rule-based reasoning.
4. Rule-Based Reasoning for Recognizing I-Shaped Building Patterns
This paper applies rule-based reasoning over the constructed knowledge graph to identify I-shaped building patterns in two steps. First, I-shaped patterns are identified at each individual spatial scale. Second, multi-scale data are integrated into the recognition process: patterns recognized at one scale are used, through the cross-scale matching relationships, to detect potential patterns at the other scale, which improves the accuracy and robustness of the overall result.
4.1. Recognizing I-Shaped Building Patterns at a Single Spatial Scale
This paper utilizes the Neo4j graph database to host the knowledge graph and employs the Cypher language to express the rules for reasoning about and recognizing I-shaped building patterns at a given spatial scale. The identification process involves discerning key relationships such as Neighbor, Full_Pal (Fully Parallel), Full_Per (Fully Perpendicular), and Cen_D (Center Distance). Essential prerequisites, as specified by Equations (2)–(4), include verifying similarities between building entities, assessing parallelism or perpendicularity in their principal orientations, and measuring centroid distances. These steps facilitate the logical deduction of Fully Parallel and Fully Perpendicular relationships, alongside the Center Distance constraint, which are essential for constructing an accurate I-shaped building pattern recognition framework, as demonstrated in Equation (6).
It is worth mentioning that the specific steps for inferring the Center constraint based on the centroid distances between building entities are as follows: (1) Assume there are three buildings Bm, B1, B2, and infer the possibility of forming an I-shaped pattern based on their proximity relationships, relative positions, and relationships of being fully parallel and fully perpendicular. (2) Calculate their center distances using the distances dis(Bm, B1), dis(Bm, B2), and dis(B1, B2), as specified in Equation (4). (3) Compare the calculated center distance with a threshold; if it is below the threshold, then modify the attribute “Pattern” of the three entities to “True”, indicating that the building entities form an I-shaped pattern after meeting the center distance constraint. Otherwise, they are deemed not to form an I-shaped pattern under human visual recognition principles. Importantly, the center distance constraint is not explicitly represented in the knowledge graph.
Consequently, the rule-based reasoning process for identifying I-shaped building patterns comprises four sequential steps: (1) Validate relationships concerning area similarity (SimA), parallel orientation (PalO), and perpendicular orientation (PerO); (2) Deduce the Full_Pal and Full_Per relationships; (3) Confirm compliance with the center distance constraint (Cen_D); (4) Recognize and confirm the I-shaped building pattern configuration. This methodical approach is illustrated in
Figure 6.
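The four steps can be mirrored outside the graph database with a small in-memory sketch. It is not the Cypher rule used in this paper: the relationship sets are assumed to be precomputed, Full_Per is assumed to point from the middle building to each wing, and the center distance is supplied as a hypothetical callable because its exact form is given by Equation (4). The default threshold of 0.2 is the value selected in Section 5.2.

```python
from itertools import combinations


def find_i_patterns(neighbor, full_pal, full_per, cen_d, theta4=0.2):
    """Return (middle, wing1, wing2) triples that satisfy all rules.

    neighbor, full_pal: sets of frozenset({a, b}) (undirected relationships)
    full_per: set of (reference, target) pairs (directed relationship)
    cen_d: callable (middle, wing1, wing2) -> float, smaller is more central
    """
    buildings = set().union(*neighbor) if neighbor else set()
    found = []
    for middle in sorted(buildings):
        # Candidate wings are adjacent to the middle building and perpendicular to it.
        wings = sorted(
            b for b in buildings
            if frozenset((middle, b)) in neighbor and (middle, b) in full_per
        )
        for w1, w2 in combinations(wings, 2):
            if frozenset((w1, w2)) in full_pal and cen_d(middle, w1, w2) < theta4:
                found.append((middle, w1, w2))
    return found


neighbor = {frozenset(("m", "w1")), frozenset(("m", "w2"))}
full_pal = {frozenset(("w1", "w2"))}
full_per = {("m", "w1"), ("m", "w2")}
print(find_i_patterns(neighbor, full_pal, full_per, lambda m, a, b: 0.1))
```

In the knowledge graph, the triples returned by such a rule would correspond to the entities whose “Pattern” attribute is set to “True”.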
4.2. Integrating Multi-Scale Data in Recognizing I-Shaped Building Patterns
Research into enhancing pattern recognition with multi-scale data encompasses two primary strategies. The first strategy involves transitioning from larger to smaller scales, essentially treating groups of buildings at larger scales as decomposed into their component structures at smaller scales. For instance, a building or group of buildings identified with an I-shaped pattern at a larger scale may suggest the potential for similar patterns among corresponding smaller-scale buildings [17]. For example, if three buildings (such as B3, B4, B5) at a larger scale are recognized as forming an I-shaped pattern, these may represent a breakdown into constituent buildings at a smaller scale. Specifically, B5 corresponds directly to B2 in a one-to-one (1:1) relationship, while B3 and B4 collectively relate to B1 in a many-to-one (M:1) relationship, potentially leading B1 and B2 to manifest an I-shaped pattern at a smaller scale. This method of enhancing recognition from larger to smaller scales is depicted in Figure 7b, emphasizing the dynamic flow and transformation of building patterns across different scales.
The second approach involves scaling from smaller to larger scales, whereby a building group at a smaller scale is considered as an aggregation of corresponding larger-scale structures. If a group of buildings is identified with an I-shaped pattern at a smaller scale, it suggests that the associated group at a larger scale might also exhibit an I-shaped pattern [13]. For instance, at a smaller scale, if three buildings (such as B3, B4, B5) are recognized as forming an I-shaped configuration, their corresponding entities at a larger scale could be their aggregated counterparts, still likely to form an I-shaped pattern. Specifically, B3 is linked to B6 and B7 in a many-to-one (M:1) relationship, whereas B4 and B5 are each linked to B8 and B9 in one-to-one (1:1) relationships, respectively. This setup suggests that at the larger scale, buildings B6, B7, B8, and B9 could also establish an I-shaped pattern. This process of enhancing recognition from smaller to larger scales is demonstrated in Figure 7c, effectively illustrating how scale transitions influence pattern identification.
Therefore, the process of leveraging multi-scale data to enhance pattern recognition can be delineated into three primary steps: Firstly, cross-scale matching relationships are established for buildings that have been identified as forming I-shaped patterns at specific scales. Secondly, the relationships between these buildings and their corresponding entities at other scales are automatically assessed through rule-based reasoning within the knowledge graph, where the enhancement strategy (large-to-small or small-to-large) is determined based on predefined cross-scale matching types (e.g., one-to-one, one-to-many, many-to-one, and many-to-many), without manual intervention. Finally, the selected enhancement strategy is applied to identify potential I-shaped patterns across scales. This cross-scale matching and pattern transformation process improves the overall accuracy and effectiveness of building pattern recognition.
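The propagation step can be illustrated with a minimal sketch that follows the two worked examples above. It assumes that the cross-scale Match relationships are available as a dictionary from each building to its counterparts at the other scale; the function name and data layout are hypothetical, and the sketch only collects candidate groups rather than re-checking them.

```python
def enhance(patterns, matches):
    """Map recognized I-shaped groups to candidate groups at the other scale.

    patterns: list of tuples of building ids recognized at the source scale
    matches: {source_id: set of counterpart ids at the other scale}
    Returns a list of frozensets of counterpart ids.
    """
    candidates = []
    for group in patterns:
        counterpart = set()
        complete = True
        for building in group:
            targets = matches.get(building, set())
            if not targets:            # no counterpart at the other scale
                complete = False
                break
            counterpart |= targets
        if complete:
            candidates.append(frozenset(counterpart))
    return candidates


# Small-to-large example from the text: B3 maps to B6 and B7, B4 to B8, B5 to B9.
small_to_large = {"B3": {"B6", "B7"}, "B4": {"B8"}, "B5": {"B9"}}
print(enhance([("B3", "B4", "B5")], small_to_large))
```

The same function covers the large-to-small direction when the dictionary maps large-scale buildings to their small-scale counterparts.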
5. Experiments and Analysis
5.1. Dataset
The experimental data is sourced from the National Basic Geographic Information Database of China, including buildings from the Lanzhou City (LZ) area at spatial scales of 1:10,000 and 1:5000. Although only two vector scales are used in this study due to the limited availability of reliable multi-scale cadastral datasets, they are sufficient to fully instantiate and validate the proposed cross-scale enhancement mechanism, which operates bidirectionally (large-to-small or small-to-large) through rule-based reasoning over predefined cross-scale matching types.
To assess the accuracy and robustness of the experimental method in pattern recognition across urban environments of varying complexity, buildings from London (LD) were also used as a control group, as shown in
Figure 8. The selected datasets are categorized into two scale levels, which correspond to two spatial resolutions: large-scale data (SL) at approximately 1:5000 and small-scale data (SS) at approximately 1:10,000. The dataset from Lanzhou encompasses about 2 square kilometers and includes 654 buildings at the larger scale and 395 at the smaller scale. Conversely, the London dataset covers about 1.6 square kilometers and comprises 1343 buildings at the larger scale and 724 at the smaller scale. These datasets provide a comprehensive basis for evaluating the method’s effectiveness in identifying building patterns within significantly varied urban settings.
The dataset encompasses buildings with a diverse array of shapes, sizes, and orientations, distinctly partitioned by roads, presenting a complex array of relative positions among the structures. Within this dataset, the majority of buildings at both considered scales typically feature fewer than 8 edges, and most possess high simplicity in form, as indicated by a significant measure of rectangularity (often greater than 0.7). The comprehensive analysis of building features, as demonstrated in
Figure 9, highlights that complex buildings—characterized by having more than 10 edges and a rectangularity of less than 0.6—make up a minor portion (marked in red in
Figure 9). This implies that most buildings in the research area are straightforward in shape, and readily approximated by the SBR, which facilitates the identification of I-shaped patterns. The diversity in orientation and spatial relations suggests a potential for more intricate, “composite” building patterns such as I-shaped configurations, thus enhancing the ability to effectively assess the efficacy of the methodologies described in this study.
5.2. Parameter Settings
The building pattern recognition methodology employed in this study is based on Gestalt principles, with visual variables such as position, shape, size, orientation, and center distance. Variations in the threshold values of these parameters will influence the recognition results differently. As defined in
Section 2 of this paper, the spatial relationships between buildings at a given scale are initially extracted. Based on a comprehensive consideration of experimental analysis and relevant literature [
14,
27], the threshold values in Equations (2) and (3) are determined as follows: θ1 = 2, θ2 = 15, θ3 = 0.4.
At the same time, in order to define the center distance constraint, the performance of the model at different center distance thresholds is evaluated using the ROC-AUC metric under the three threshold settings mentioned above. A set of 100 threshold values ranging from 0 to 1 with a step size of 0.01 is used, and the ROC-AUC curve is plotted, as shown in
Figure 10, with an AUC value of 0.90. The threshold θ4 in Equation (4) is selected as 0.2, which satisfies the human visual principle.
The comparison of recognition effects when θ4 is 0.1, 0.2, and 0.5 is shown in
Figure 11.
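The threshold sweep behind the ROC-AUC evaluation can be sketched as follows. The sketch assumes that each candidate group has a center distance value and a binary label from human visual recognition, and that a group is predicted as I-shaped when its value is below the threshold; the sample values are invented for illustration only.

```python
def roc_points(values, labels, thresholds):
    """(false positive rate, true positive rate) pairs over a threshold sweep."""
    pos = sum(labels)
    neg = len(labels) - pos            # both classes are assumed to be present
    points = {(0.0, 0.0), (1.0, 1.0)}
    for t in thresholds:
        tp = sum(1 for v, y in zip(values, labels) if v < t and y)
        fp = sum(1 for v, y in zip(values, labels) if v < t and not y)
        points.add((fp / neg, tp / pos))
    return sorted(points)


def auc(points):
    """Trapezoidal area under the sorted ROC points."""
    return sum((x1 - x0) * (y0 + y1) / 2.0
               for (x0, y0), (x1, y1) in zip(points, points[1:]))


thresholds = [i / 100 for i in range(100)]   # 100 values, 0.00 to 0.99 in steps of 0.01
values = [0.1, 0.4, 0.2, 0.5]                # illustrative center distance values
labels = [1, 1, 0, 0]                        # 1 = I-shaped according to the reference
print(auc(roc_points(values, labels, thresholds)))
```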
5.3. The Process of Constructing the Knowledge Graph
The building relationships within and across different scales are mapped and represented as a knowledge graph, which is maintained within Neo4j version 4.4.26. An illustrative example of this can be seen in
Figure 12, which shows the knowledge graph constructed using data from London City. To facilitate the recognition of I-shaped patterns, the Cypher query language is utilized to perform rule-based reasoning based on the arrangement analysis rules described in
Section 4.1.
Using this approach, I-shaped patterns are first identified at specified scales through rule-based verification of building arrangements. Based on these recognized patterns, the subsequent enhancement recognition (ER) further explores potential I-shaped configurations at other scales by leveraging the cross-scale matching relationships represented in the knowledge graph. This process enables comprehensive pattern detection and analysis by combining rule-based reasoning with cross-scale information propagation supported by Neo4j.
5.4. Analysis of Experimental Results and Evaluation of Methodology
The I-shaped patterns identified through the method proposed in this study are depicted in
Figure 13. A total of 82 groups of I-shaped patterns were successfully detected across two regions.
To assess the efficacy of the proposed method, outcomes are analyzed across four scenarios: (1) Results from recognizing individual building shapes and their arrangements (SR + AR); (2) Integrated analysis including shape recognition, arrangement assessment, and center distance evaluation (SR + AR + CR); (3) Comprehensive analysis incorporating shape recognition, arrangement determination, and multi-scale enhancement (SR + AR + ER); (4) A full suite of analyses combining shape recognition, arrangement verification, center distance checks, and multi-scale enhancement (SR + AR + CR + ER).
In addition, comparative experiments using the template matching (TM) method were conducted on the same dataset. The TM method follows the canonical definition of I-shaped building patterns reported in the literature [
34] and employs a corresponding template consisting of two parallel wing buildings and one perpendicular central building. A graph-based matching approach was applied to identify building patterns conforming to this template. The results of both the template matching method and human visual recognition are shown in
Figure 14.
These analyses are benchmarked against human visual recognition outcomes sourced from seven graduate and doctoral students in cartography and geographic information systems, well-versed in how buildings may be decomposed or aggregated to form recognizable patterns. Inconsistencies in recognition are resolved through a voting mechanism. It should be noted that, although the National Basic Geographic Information Database provides authoritative building footprint and cadastral information, it does not explicitly include ground-truth annotations for building pattern configurations (e.g., I-shaped building groups). As a result, such authoritative datasets cannot be directly used as reference ground truth for building pattern recognition. In this context, expert-based visual interpretation is adopted as the gold-standard reference, which is a common and accepted practice in studies on building pattern recognition and cartographic generalization [
29,
43]. Based on this human visual interpretation process, 50 I-shaped building patterns in the Lanzhou dataset and 182 I-shaped building patterns in the London dataset were identified as the gold standard. In the reported results, detections that match human recognition are counted as true positives (Tr), detections that do not match as false positives (Fr), and gold-standard patterns that were not detected as misses (Mr). For each method configuration, the sum of true positives (Tr) and misses (Mr) equals the total number of gold-standard I-shaped patterns, whereas the number of false positives (Fr) is not fixed and depends on the number of incorrectly identified candidate structures. The effectiveness of each recognition method is quantified by precision and recall, computed using the formula given by Zhang [
37].
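As a small worked illustration, the following Python sketch computes both measures from the three counts, assuming the standard definitions (precision = Tr / (Tr + Fr), recall = Tr / (Tr + Mr)); the exact formula in the cited reference is not reproduced here. The function name `precision_recall` and the counts in the example are hypothetical and are not taken from Table 3; only the gold-standard total of 50 for Lanzhou comes from the text.

```python
def precision_recall(tr, fr, mr):
    """Precision and recall from true positives, false positives, and misses.

    Assumes the standard definitions; returns 0.0 for an empty denominator.
    """
    precision = tr / (tr + fr) if (tr + fr) else 0.0
    recall = tr / (tr + mr) if (tr + mr) else 0.0
    return precision, recall


# Hypothetical counts for illustration only. Tr + Mr must equal the number
# of gold-standard patterns (50 for Lanzhou), while Fr is unconstrained.
GOLD_TOTAL = 50
tr, fr, mr = 40, 10, 10
assert tr + mr == GOLD_TOTAL

precision, recall = precision_recall(tr, fr, mr)
print(f"precision={precision:.2%}, recall={recall:.2%}")
```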
Table 3 below summarizes the effectiveness of different recognition methods applied to datasets from Lanzhou (LZ) and London (LD).
From the data presented in
Figure 13,
Figure 14, and
Table 3, it is clear that integrating multi-scale data substantially improves recall. Specifically, recall increased by 24% for the Lanzhou City data and by 52.75% for the London City data, compared with the configurations that do not use multiple scales. This improvement suggests that multi-scale pattern enhancement helps detect additional potential building patterns, particularly in London City, where high building density and complex building forms pose significant challenges. The gain can be attributed to the complex spatial relationships present in dense urban environments, which give rise to numerous potential building patterns. The multi-scale approach is better able to identify these patterns, addressing a weakness of traditional methods in areas with dense construction and complex building forms.
Furthermore, according to the Gestalt principles of human visual recognition [
47], which emphasize the importance of proximity, similarity, continuity, closure, and good form in recognized objects, conventional methods that are weakly linked to physical entities often fall short in identifying all potential patterns. For instance, as illustrated in
Figure 13, the I-shaped patterns G2 and G6 cannot be recognized at a larger scale, and G5 and G8 at a smaller scale, while G1 and G4 are only partially recognizable at a certain scale. However, with the application of pattern enhancement techniques, the methodology developed in this study is able to more comprehensively recognize these potential I-shaped patterns, demonstrating its effectiveness and adaptability in complex urban settings.
The unique structural characteristics of I-shaped patterns mean that the positioning of the central building relative to the overall pattern’s center critically affects whether recognition results are consistent with human visual perception. For instance, in
Figure 13, the building groups G3 and G7 adhere to the rule-based definitions of I-shaped patterns but do not meet the criteria for human visual recognition. To address this, this study introduces a center distance evaluation system, which has proven effective in enhancing the precision of I-shaped pattern recognition. Implementing this system resulted in a significant improvement in precision: for the data from Lanzhou City, precision increased by 22.45%, and for London City, by 26.98%.
This improvement underscores the importance of considering spatial relationships and central alignment in building pattern recognition, particularly in complex urban settings where conventional rule-based approaches might overlook subtleties recognized by human observers. The center distance evaluation system offers a quantifiable means to align rule-based recognition more closely with human visual standards, thereby bridging a critical gap between automated processes and intuitive human judgments.
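A center distance check of this kind can be sketched as follows. This is only one plausible formulation, not the definition used in this study: it assumes that the pattern center is the midpoint of the bounding box of all member footprints, that the offset of the central building's centroid is normalized by the bounding-box diagonal, and that a fixed cutoff separates acceptable from unacceptable layouts. The helper names, the `THRESHOLD` value, and the toy footprints are hypothetical.

```python
from math import hypot


def rect(x0, y0, x1, y1):
    """Axis-aligned rectangle footprint as a counter-clockwise vertex list."""
    return [(x0, y0), (x1, y0), (x1, y1), (x0, y1)]


def polygon_centroid(ring):
    """Area-weighted centroid of a simple, non-degenerate polygon."""
    area2 = cx = cy = 0.0
    for (x0, y0), (x1, y1) in zip(ring, ring[1:] + ring[:1]):
        cross = x0 * y1 - x1 * y0
        area2 += cross
        cx += (x0 + x1) * cross
        cy += (y0 + y1) * cross
    return cx / (3.0 * area2), cy / (3.0 * area2)


def center_offset_ratio(center_building, members):
    """Offset of the central building from the pattern center, normalized
    by the diagonal of the bounding box of all member footprints."""
    xs = [x for ring in members for x, _ in ring]
    ys = [y for ring in members for _, y in ring]
    mid_x = (min(xs) + max(xs)) / 2.0
    mid_y = (min(ys) + max(ys)) / 2.0
    diagonal = hypot(max(xs) - min(xs), max(ys) - min(ys))
    cx, cy = polygon_centroid(center_building)
    return hypot(cx - mid_x, cy - mid_y) / diagonal


THRESHOLD = 0.15  # hypothetical cutoff, not a value from this study

wings = [rect(0, 0, 10, 2), rect(0, 8, 10, 10)]
centered = rect(4, 2, 6, 8)   # central building in the middle (I-like)
shifted = rect(0, 2, 2, 8)    # central building at one end (C-like)

for name, stem in (("centered", centered), ("shifted", shifted)):
    ratio = center_offset_ratio(stem, wings + [stem])
    print(name, round(ratio, 3), "accept" if ratio <= THRESHOLD else "reject")
```

With these toy footprints, the centered layout has a ratio of 0 and is accepted, while the shifted layout has a ratio of about 0.28 and is rejected, even though both satisfy the same parallel and perpendicular relations.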
Comparison with traditional methods shows that the proposed method addresses the low recall of template matching and similar approaches when recognizing typical letter-shaped building patterns. For template matching, recall on the London dataset, where building density and building form complexity are higher, was 18.79% lower than on the Lanzhou dataset. In contrast, the proposed method substantially improves recall while maintaining high precision, and achieves strong recognition results on both datasets.
These results confirm that the proposed method identifies the majority of I-shaped patterns within the experimental regions, with recognition results that align closely with human visual perception. The high precision and recall obtained on two distinct datasets further indicate its accuracy and robustness.
6. Discussion
6.1. Misrecognition Due to Scale Variability
The approach we have developed leverages a knowledge graph to integrate data across multiple scales, significantly improving the precision and adaptability of building pattern recognition. By associating data from various scales, our method captures the complex, multi-dimensional characteristics of buildings, enhancing the robustness of the recognition process. However, despite its effectiveness, this technique can occasionally lead to errors in pattern recognition. Such errors often arise from variations in how buildings are represented at different scales. As identified by Chen, Zhang, and Lin [
20], the variability in map objects across multiple scales is influenced primarily by two factors: map generalization, which involves the simplification or abstraction of map information at various scales, and the genuine changes in geographical features, reflecting actual alterations that occur in buildings and other geographic entities over time or due to environmental factors.
Studies involving multi-scale map data reveal that map generalization strategies at various scales significantly influence how geographic entities are represented. At smaller scales, details may be simplified or omitted, leading to a less detailed depiction, whereas larger scales tend to retain and display greater complexity and details, as indicated by the blue arrows in
Figure 15. Concurrently, the real-world conditions of building entities are subject to continual evolution due to dynamic environmental changes, such as the demolition or reconstruction of buildings as depicted by the orange arrows in
Figure 15. This dynamic nature adds layers of complexity to the task of pattern recognition. Thus, when employing multi-scale data for building recognition, it is essential not only to manage the inconsistencies in information across different scales but also to account for the temporal dynamics of building entities. This sets a direction for future research on how to more effectively merge multi-scale data to capture the dynamic changes in buildings accurately. Future strategies should also explore utilizing advanced knowledge graph technologies to mitigate the impacts of recognition errors, thereby enhancing the reliability and accuracy of building pattern analysis.
6.2. Factors Influencing the Visual Recognition of Building Patterns
The shape, proportion, and spatial arrangement of building clusters are key visual factors in building pattern recognition. Additionally, contextual influences such as the surrounding environment and building density play significant roles in determining recognition accuracy. While much existing research focuses on the impact of individual object parameters on visual recognition, there is less emphasis on quantitatively describing the morphologies and special structures of building clusters within complex urban settings. Some researchers have integrated principles of spatial cognition and psychology into spatial data mining, incorporating factors like position, orientation, and size. This integration has led to enhancements in the recognition of building distribution patterns by introducing the concept of “visual distance”, thus aligning pattern mining more closely with human visual perception [
47]. Nevertheless, there remains a need for further studies on the effects of other Gestalt principles, such as closure and continuity, on spatial pattern recognition.
Figure 16 illustrates that the position and orientation of the central building significantly influence the recognition of I-shaped building patterns. Although the layout on the left side of
Figure 16a adheres to the I-shaped arrangement rules identified by [
14], it more closely resembles a C-shaped pattern visually. Conversely, a deviation in the central building’s orientation on the right side results in a Z-shaped pattern. Similarly, the E-shaped pattern in
Figure 16b is altered due to positional shifts among three parallel buildings.
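The effect of the central building's orientation can be made concrete with a short Python sketch. It assumes, purely for illustration, that a building's main orientation is the direction of its longest footprint edge, and that deviation from perpendicularity is measured as the difference between 90 degrees and the acute angle between the central building and a wing. The function names and the toy footprints are hypothetical and do not reproduce the geometry in Figure 16.

```python
from math import atan2, degrees, hypot


def main_orientation(ring):
    """Direction, in degrees within [0, 180), of the polygon's longest edge."""
    edges = list(zip(ring, ring[1:] + ring[:1]))
    (x0, y0), (x1, y1) = max(
        edges, key=lambda e: hypot(e[1][0] - e[0][0], e[1][1] - e[0][1])
    )
    return degrees(atan2(y1 - y0, x1 - x0)) % 180.0


def angle_between(ring_a, ring_b):
    """Acute angle (0 to 90 degrees) between two buildings' main orientations."""
    diff = abs(main_orientation(ring_a) - main_orientation(ring_b)) % 180.0
    return min(diff, 180.0 - diff)


wing = [(0, 0), (10, 0), (10, 2), (0, 2)]        # horizontal wing
upright = [(4, 2), (6, 2), (6, 8), (4, 8)]       # perpendicular central building
skewed = [(2, 2), (4, 2), (8, 8), (6, 8)]        # rotated central building

for name, stem in (("upright", upright), ("skewed", skewed)):
    angle = angle_between(wing, stem)
    print(name, "deviation from perpendicular:", round(90.0 - angle, 2))
```

A deviation near zero corresponds to the canonical I-shaped layout, whereas a large deviation, as for the skewed footprint here, is the kind of configuration that tends to read visually as a Z-shaped pattern.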
This study successfully implements a quantitative description of the I-shaped pattern by integrating multi-scale data and applying a center distance constraint to define the recognition rules for this pattern. The findings highlight the centrality of the middle building as a crucial factor influencing recognition accuracy. Quantitatively characterizing this centrality allows for a more thorough and precise capture of the buildings’ visual features. Nonetheless, navigating complex urban settings and responding to dynamic changes remains challenging. Future research should continue to evolve more comprehensive recognition models and methods to address these complexities effectively, thus enhancing the adaptability and accuracy of building pattern recognition in varied urban landscapes.
6.3. Recognition of Other Structural Patterns
Robustness and scalability are critical considerations for rule-based building pattern recognition approaches, particularly when applied to complex urban environments and multi-scale datasets. The proposed method is designed to address these challenges through a combination of explicit relational modeling, multi-constraint reasoning, and cross-scale consistency analysis.
Regarding robustness, rule-based systems are often considered sensitive to ambiguous configurations or local noise when individual rules are applied in isolation. In the proposed framework, however, pattern recognition does not rely on a single geometric or relational constraint. Instead, building patterns are identified only when multiple structural conditions—such as shape characteristics, relative positional relationships, neighborhood configurations, and cross-scale matching relations—are simultaneously satisfied. This multi-constraint reasoning strategy effectively reduces the influence of local irregularities or noisy building geometries. Moreover, cross-scale enhancement recognition contributes to robustness by propagating building patterns identified at one scale to other related scales through cross-scale matching relationships. This mechanism enables the detection of potential building patterns that may be ambiguous or difficult to identify when relying on single-scale analysis alone, thereby improving the recall of pattern recognition in complex urban environments.
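The multi-constraint strategy amounts to a conjunction of independent checks, which the following Python sketch illustrates. The `Candidate` record, its field names, and the constraint labels are hypothetical simplifications; in the actual framework each condition is derived from the knowledge graph rather than supplied as a precomputed flag.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    """Hypothetical summary of one candidate building group."""
    shapes_ok: bool        # member shapes fit the parts of the pattern
    arrangement_ok: bool   # relative positions satisfy the arrangement rule
    neighbors_ok: bool     # neighborhood configuration is consistent
    cross_scale_ok: bool   # matched counterparts at another scale agree


# Each named constraint is an independent predicate over the candidate.
CONSTRAINTS = {
    "shape": lambda c: c.shapes_ok,
    "arrangement": lambda c: c.arrangement_ok,
    "neighborhood": lambda c: c.neighbors_ok,
    "cross_scale": lambda c: c.cross_scale_ok,
}


def evaluate(candidate):
    """Accept only if every constraint holds; also report which ones failed."""
    failed = [name for name, check in CONSTRAINTS.items() if not check(candidate)]
    return (not failed), failed


print(evaluate(Candidate(True, True, True, True)))
print(evaluate(Candidate(True, False, True, True)))
```

Returning the list of failed constraints alongside the decision keeps the outcome interpretable, since a rejected candidate can be traced to the specific condition it violated.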
To further examine robustness under more challenging conditions, additional experiments were conducted in a newly selected area of the Lanzhou dataset characterized by higher building density, greater structural heterogeneity, and more irregular spatial arrangements. In this complex region, the proposed framework was applied at two map scales (1:5000 and 1:10,000) to recognize multiple building patterns, including C-shaped and F-shaped configurations. The results indicate that the framework maintains stable performance even in densely built and structurally diverse environments, as visually demonstrated in
Figure 17, suggesting that the combined use of relational constraints and cross-scale reasoning provides a degree of robustness against ambiguity and noise commonly encountered in real-world urban data.
In terms of scalability and generality, the proposed framework separates generic relational modeling from pattern-specific semantic constraints. Core components such as multi-scale representation, inter-building relational modeling, knowledge graph construction, and rule-based reasoning are pattern-agnostic and remain unchanged when extending the method to new building patterns. While recognizing additional patterns requires the definition of pattern-specific semantic rules, this process primarily involves recombining existing relational primitives rather than redesigning the overall framework. The successful recognition of C-shaped and F-shaped building patterns in the additional experiments demonstrates that the proposed method provides a scalable foundation for building pattern recognition beyond the I-shaped case.
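The separation between a generic matcher and pattern-specific rules can be sketched in Python as follows. The relation names (`PARALLEL`, `MEETS_AT_MIDDLE`, `MEETS_AT_END`), the role names, and in particular the rule given for the C-shaped pattern are hypothetical placeholders chosen for illustration; they are not the semantic rules used in this study. The point of the sketch is only that the matching function stays unchanged when a new pattern is registered.

```python
from itertools import permutations

# Pattern-specific rules expressed as data over shared relational primitives.
# All relation and role names below are illustrative assumptions.
PATTERN_RULES = {
    "I": {
        "roles": ("wing1", "stem", "wing2"),
        "constraints": [
            ("PARALLEL", "wing1", "wing2"),
            ("MEETS_AT_MIDDLE", "stem", "wing1"),
            ("MEETS_AT_MIDDLE", "stem", "wing2"),
        ],
    },
    "C": {
        "roles": ("arm1", "spine", "arm2"),
        "constraints": [
            ("PARALLEL", "arm1", "arm2"),
            ("MEETS_AT_END", "spine", "arm1"),
            ("MEETS_AT_END", "spine", "arm2"),
        ],
    },
}


def match(pattern, facts, buildings):
    """Generic matcher: bind buildings to roles so that all constraints hold."""
    rule = PATTERN_RULES[pattern]
    roles = rule["roles"]
    hits = []
    for combo in permutations(buildings, len(roles)):
        binding = dict(zip(roles, combo))
        if all((kind, binding[a], binding[b]) in facts
               for kind, a, b in rule["constraints"]):
            hits.append(binding)
    return hits


# Toy relational facts, stored as directed (relation, source, target) triples.
FACTS = {
    ("PARALLEL", "b1", "b3"),
    ("MEETS_AT_MIDDLE", "b2", "b1"),
    ("MEETS_AT_MIDDLE", "b2", "b3"),
    ("PARALLEL", "c1", "c3"),
    ("MEETS_AT_END", "c2", "c1"),
    ("MEETS_AT_END", "c2", "c3"),
}
BUILDINGS = ["b1", "b2", "b3", "c1", "c2", "c3"]

print(match("I", FACTS, BUILDINGS))
print(match("C", FACTS, BUILDINGS))
```

Supporting a further pattern in this sketch would mean adding one entry to `PATTERN_RULES`, which mirrors the claim that extension involves recombining relational primitives rather than redesigning the framework. The exhaustive search over permutations is adequate only for toy inputs; a graph database query would restrict candidates through the graph structure instead.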
Nevertheless, it should be acknowledged that the effectiveness of rule-based recognition depends on the completeness and correctness of the encoded pattern semantics. Extremely irregular building morphologies or patterns lacking clear geometric organization may remain challenging. Future work may explore the integration of data-driven or probabilistic components to complement the current rule-based framework. However, within the scope of this study, the proposed method offers a robust, interpretable, and scalable solution for multi-scale building pattern recognition in complex urban environments.