Building an Urban Spatial Structure from Urban Land Use Data: An Example Using Automated Recognition of the City Centre

Abstract: It has been suggested that the method of constructing an urban spatial structure typically follows a forward process from planning and design up to expression, as reflected in both graphic and text descriptions of urban planning. Although unorthodox, the original status structures can be extracted and constructed from an existing urban land-use map. This approach not only provides the methodological foundation for urban spatial structure evolution and allows for a comparative and quantitative analysis between the existing and planned conditions, but also lays a theoretical basis for failure in scientific decision making during the planning phase. This study attempts to achieve this by identifying the city centre (a typical element of the urban spatial structure) from urban land use data. The city centre is a special region consisting of several units with particular spatial information, including geometric attributes, topological attributes, and thematic attributes. In this paper, we develop a methodology to support the delineation of the city centre, considering these factors. First, using commercial land data, we characterise the city centre as units based on a series of indicators, including geometric and thematic attributes, and integrate them into a composite index of "urban centrality". Second, a graph-based spatial clustering method that considers both topological proximity and attribute similarity is designed and used to identify the city centre. The precise boundary of the city centre is subsequently delimited using a shape reconstruction method based on the cluster results. Finally, we present a case study to demonstrate the effectiveness and practicability of the methodology.


Introduction
Urban spatial structure is viewed as a generalized description of elements arranged in geographic space and the process of element interactions [1]. Many studies have been made on the spatial structure of cities, and we divide them into two main forms. One characterises the city as a node for the study of the urban external spatial structure in a "macro" depiction, including the urban system and urban agglomerations; the other treats the city as a surface for the study of the urban internal spatial structure in a "micro" depiction, including urban land-use patterns, the central business district (CBD), and urban image space. In the field of urban planning, the construction of urban internal spatial structures typically follows a forward process from planning to expression; i.e., city planners aim to design a reasonable urban spatial structure to guide the development of a city, which is reflected in a graphic description of urban planning that emphasizes urban spatial structure. For example, according to the Comprehensive Plan of Shenzhen City (2011-2020) in China, Shenzhen's urban spatial structure will be a polycentric structure (with seven planned city centres). City planners design this structure to guide the development of Shenzhen. It focuses on the future of urban planning. However, status structures (i.e., urban spatial structures at a certain time during the planning period) are also important to policymakers and urban planning. Status structures not only provide the methodological foundation of urban spatial structure evolution for comparative and quantitative analysis between existing and planned conditions, but also lay a theoretical basis for failure in scientific decision making during the planning phase.
Can an urban spatial structure be analysed in the context of the status of a city? Most previous studies have focused specifically on forms of land use. The earliest and best-known studies formed the Agricultural Location Theory, Industrial Location Theory, and Central Place Theory; these theories have all successfully reflected the distribution and evolution of urban spatial elements, based on location models of land utilization. Many studies on urban land use, including urban morphology [2,3], city landscape pattern [4,5], urban land-use allocation [6,7], and urban expansion simulation [8,9], further confirm this fact, revealing hidden urban information from the distribution of various types of urban land. Thus, studying urban spatial structures using urban land use is an objective and feasible approach.
The Urban Planning Bureau in each city of China maintains and disseminates urban land datasets at a very fine scale. Designed as general-purpose products, these datasets offer a wealth of information (primarily geometry and land type) about individual land parcels. However, they do not model higher-order geographic phenomena [10]. For example, we can easily obtain the spatial distribution of various land types, but not the districts of functional areas (e.g., industrial areas) within cities; we can get the location and height of hills, but not the geographic extent of the hills. Providing more of the higher-level semantics in urban land datasets could allow the Urban Planning Bureau to respond better to user requirements. It would also allow the representation of city space to be closer to the way it is conceptualized by people [11] and not simply reflected in graphic descriptions of urban planning that emphasize the urban spatial structure.
As a developing country with a very large population, the change in central urban landscapes and functions is critical for promoting sustainable progress in China, because the city centre is considered to be the economic and social centre of the city [12]. In this paper, we use the city centre derived from urban land use as an example to attempt to build an urban spatial structure from urban land use. The city centre is a typical element of the urban spatial structure [13]. Moreover, as cities become more complex and some cities are polycentric in design, the recognition of city centre areas is an important issue for urban planning, and draws the attention of policymakers and scholars [14,15]. Although human cognition might be used to sketch a city centre, many researchers have attempted to develop quantitative methods for its delineation (see Section 2). Most methods focus primarily on spatial patterns or non-spatial inequality. A city is usually divided into subareas, such as census tracts and school zones, which usually have irregular shapes and carry different spatial information, including geometric attributes, topological attributes, and thematic attributes [16]. Similarly, the city centre is a special region that consists of several units (commercial land in this paper) with different characteristics.
Based on the questions above, this study primarily aimed to establish an applicable methodology oriented towards urban land use for identifying city centre areas. In this context, we addressed the two following questions:
(1) How is the city centre characterised based on the existing urban land use?
The city centre is significantly influenced by economic aggregation, and its core function is commerce. The city centre is therefore strongly related to commercial land. A city is always divided into a series of spatial statistical units with different characteristics for analysing urban spatial structure. In this study, we characterise the city centre from commercial land data using a series of indicators, including geometric and thematic attributes, and integrate them into a composite index of "urban centrality".
(2) How can we delineate the city centre?
We proposed a spatial clustering method based on graph theory and attribute information to delineate the city centre. Spatial clustering generally involves two steps. First, we constructed spatial proximity relationships among objects (commercial lands), which yields a clustering result based on the geometric properties of the objects. Second, we clustered these objects based on attribute similarity (i.e., the composite index of urban centrality). This research contributes to determining the city centre by considering not only geometric and topological attributes, but also socio-economic attributes.

Related Work
The delineation of the city centre is strongly related to data sources [13,17] such as unit postcode [18], socio-economic data [16,19], geographic grid [20], mobility data (such as point of interest [21], travel flows [22], mobile phone positioning [23]), and remote sensing data [24]. Since the 1950s, many studies have sought to recognize the city centre quantitatively using different data sources. These approaches can be divided into two primary categories: (1) index-based, and (2) clustering methods. The index-based method represents the concentration of city centre areas using a simple index or by constructing an evaluation index system. In the early stages, Murphy and Vance [13] proposed the Central Business Height Index (CBHI) and Central Business Intensity Index (CBII) to gauge the degree of the CBD. Consecutive block units that satisfy the degree (CBII ≥ 50%, CBHI ≥ 1) are defined as central areas. Galster et al. [25] and Lee [26] used the Gini coefficient and delta index to measure the unequal distribution of population or employment based on spatial units in a city, which facilitated the understanding of monocentricity and polycentricity patterns. Pereira et al. [27] introduced the urban centrality index (UCI), based on the location coefficient [28] and proximity index [29], to measure the degree of urban centrality. The UCI is able to reflect spatial concentration, considers a centrality scale that varies from monocentricity to polycentricity, and can quantify the change in urban spatial structure.
The clustering method recognizes city centre areas based on statistical analysis; that is, the objects in the same class are similar to one another, and dissimilar to those in different classes. One prevalent approach is the use of spatial analysis to identify the city centre. Tsai [30] adopted the global Moran coefficient by population and employment to characterise urban forms at the metropolitan level. The Moran coefficients of monocentric, polycentric, and decentralized sprawling forms are high, intermediate, and close to zero, respectively. A similar study was performed by Zhou [23], who used mobile phone positioning data to identify the city centre. An alternative method is to use kernel density estimation (KDE) modelling to create continuous surface representations of indicators such as urban road network [31,32], housing price [33], geo-referenced images [34], central activities and functions [35], and economic activity [36]. The KDE method transforms these indicators from point objects into continuous surfaces of spatial densities. Statistical analyses are then conducted on these surfaces to evaluate the borderline of the city centre with high density values. Spatial clustering is another primary technique for urban spatial structure analysis. Hu [37] defined urban areas of interest (AOI) as the areas within a city that attract the attention of people, such as prominent landmarks, commercial zones, and scenic views. In that study, the improved DBSCAN clustering algorithm (DBSC) was used to extract AOI from Flickr photo data. To identify multiple city centres (namely, a polycentric pattern) and delineate their precise boundaries, Sun [38] combined the DBSCAN clustering algorithm and the Voronoi graph, using location-based social network data.
Several advanced processing techniques, such as spatial cognition [10,39] and remote sensing [24,40], have also evolved to provide powerful tools that can be used in the quantitative study of the city centre. Montello [39] reported a study in which participants drew lines around the areas that they believed constituted downtown Santa Barbara. Vagueness in the boundaries was elicited in two ways: by comparing the variation in boundary locations across the participants, and by having participants draw different boundaries to indicate their varying confidence in regional membership for different parts of the area. The results provided evidence that the method is a viable approach to externalizing people's representations of vague cognitive regions. Taubenboeck [24] presented a conceptual framework to define the CBD using physical and morphological parameters, and developed a transferable method to detect and delineate CBDs over larger areas from a combination of Cartosat-1 digital surface models and multispectral Landsat ETM+ imagery.
The extent of city centre areas derived by most of the above-mentioned methods (e.g., [23,27,31,32]) is indeterminate. Identifying the city centre with an accurate boundary is vital in many applications, ranging from urban planning to epidemiology [38]. Moreover, the city centre is a special region that consists of several units (commercial land in this paper) with different spatial information. The present methods primarily focus on spatial patterns (e.g., [37,38]) or non-spatial inequality (e.g., [33,36]). In this paper, we characterise the city centre from commercial land data relating to a series of socio-economic indicators, and propose a methodology that considers spatial proximity and attribute similarity for delineating the city centre from commercial land use.

Methodology
Figure 1 shows an overview of the proposed methodology for delineating a city centre. The methodology involves three parts. (1) Detect clusters (see Section 3.1). We proposed a spatial clustering method that begins by constructing spatial proximity relationships among objects (commercial lands) and then clusters these objects based on attribute similarity (composite index of urban centrality). In this way, we could detect clusters of commercial lands that are adjacent to each other and have higher urban centrality. (2) Characterise the city centre (see Section 3.2). In this study, we characterised the city centre from commercial land data relating to a series of indicators, and these indicators were then integrated into a comprehensive index to measure the urban centrality of commercial lands. (3) Extract the city centre area (see Section 3.3). We could derive the city centre from the clustered commercial lands, and the city centre boundary was then obtained by applying the shape reconstruction method (triangle filtration) to discrete commercial lands with higher urban centrality.

Detecting Clusters Using a Graph-Based Spatial Clustering Algorithm
The city centre is a special region that consists of commercial lands; if there is a spatially local cluster composed of commercial lands (1) which are adjacent to one another (i.e., spatial proximity) and (2) which have higher urban centrality (see Section 3.2), this cluster might exist in a city centre.
To detect this class of cluster, a spatial clustering algorithm should consider both spatial proximity and attribute similarity. We adopted a global and a local constraint to model the spatial proximity relationships among objects. An adaptive spatial clustering method, based on attribute information entropy, was developed for clustering spatial objects with attribute similarity after constructing spatial proximity relationships.

Construction of Spatial Proximity Relationships
Delaunay triangulation is an effective way to express spatial proximity relationships [41]. Delaunay-based algorithms can discover clusters of arbitrary shapes, and require few input parameters. However, they may not be reliable when the density varies between clusters [42]; namely, they are inaccurate near edges occurring in the gap between low- and high-density regions. In this paper, a two-level strategy was used to construct spatial proximity relationships among objects. First, we used the centroid to represent each commercial land, and constructed a Delaunay triangulation from these points. Second, the AUTOCLUST algorithm [41] was adopted to remove the long edges in the Delaunay triangulation at the global level. Finally, inaccurate edges were further removed at the local level. After this two-level strategy, objects sharing an edge were regarded as spatial neighbours. The two levels are defined as follows:
Definition 1. Global edge-length constraint [41]:
Suppose D is a spatial database, and DT(D) is the Delaunay triangulation of D. For each point P ∈ D, the neighbourhood N(P) is the set of Delaunay edges incident to P. The global edge-length constraint, Global_Length_Constraint(P), is defined in terms of the following quantities: Local_Mean_Length(P) is the mean length of edges directly incident to P, Local_SD(P) is the standard deviation of the length of edges directly incident to P, Global_SD is the standard deviation of the length of all edges in the Delaunay triangulation, d(P) is the number of edges directly incident to P, |e_i| is the length of an edge directly incident to P, and N is the number of points in D.
If the length of an edge directly incident to P is larger than Global_Length_Constraint(P), the edge is removed from the Delaunay triangulation. Through the global edge-length constraint, many inappropriate long edges were removed, as shown in Figure 2c. After filtering the long edges from the Delaunay triangulation, some inaccurate edges (red circled regions in Figure 2c) remained in local areas. A local edge-length constraint was developed to remove edges at the local level.
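The global trimming step can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes an AUTOCLUST-style criterion in which an edge is dropped when its length exceeds Local_Mean_Length(P) + Global_SD at either endpoint, and it takes Global_SD as the mean of Local_SD(P) over all points. Both choices, and the function name `trim_long_edges`, are assumptions made for the example; the edge list is assumed to come from a Delaunay triangulation computed elsewhere.

```python
import math
from collections import defaultdict


def trim_long_edges(points, edges):
    """Drop edges that are long relative to the local mean at either endpoint.

    points: dict id -> (x, y); edges: iterable of (id_a, id_b) pairs.
    Assumed criterion: an edge incident to P is removed when its length
    exceeds Local_Mean_Length(P) + Global_SD, with Global_SD taken here as
    the mean of Local_SD over all points (an assumption of this sketch).
    """
    length = {tuple(sorted(e)): math.dist(points[e[0]], points[e[1]]) for e in edges}

    # Lengths of the edges incident to each point.
    incident = defaultdict(list)
    for (a, b), d in length.items():
        incident[a].append(d)
        incident[b].append(d)

    local_mean = {p: sum(ds) / len(ds) for p, ds in incident.items()}
    local_sd = {
        p: math.sqrt(sum((d - local_mean[p]) ** 2 for d in ds) / len(ds))
        for p, ds in incident.items()
    }
    global_sd = sum(local_sd.values()) / len(local_sd)

    # Keep an edge only if it is acceptable for both of its endpoints.
    kept = []
    for (a, b), d in length.items():
        if d <= local_mean[a] + global_sd and d <= local_mean[b] + global_sd:
            kept.append((a, b))
    return kept
```

With two tight triangles joined by one long bridge edge, the bridge is the only edge removed, which mirrors the removal of long inter-cluster edges described above.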
As observed from Figure 2, the length variations of inter-cluster edges incident to points on the cluster borders tended to be relatively large, because both short and long edges existed. To exploit this characteristic, we adopted a statistical variable, F(P), to detect inter-cluster edges.

Definition 2. Local edge-length constraint:
In DT(S), for a point P ∈ S, the neighbourhood N(P) is the set of edges incident to point P, and its local constraint is expressed as

$$\mathrm{Sets} = \{F(P) \mid F(P) \le \gamma\} \qquad (5)$$

where γ denotes the threshold value of F(P), and can be obtained using a heuristic method in [43].
The F(P) value of each point inside the clusters was small because the lengths of their incident edges varied minimally. In contrast, the points on the cluster borders had large variations; thus, their F(P) value was large. The final spatial clustering result (Figure 2d) was composed of all connected data points with F(P) ≤ γ and all data points belonging to their neighbourhoods.
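One way to read the last sentence is sketched below, under stated assumptions: F(P) is treated as a precomputed input (its formula is not reproduced here), points with F(P) ≤ γ act as "core" points that keep expanding a cluster, and their neighbours are attached to the cluster without expanding it further. This interpretation and the name `clusters_from_local_constraint` are illustrative, not taken from the source.

```python
from collections import defaultdict, deque


def clusters_from_local_constraint(edges, f_values, gamma):
    """Group points into spatial clusters after local trimming.

    edges: (a, b) pairs kept by the global step.
    f_values: dict id -> F(P), computed elsewhere (opaque input here).
    gamma: threshold on F(P).
    Returns dict id -> cluster label; ids absent from the result are unassigned.
    """
    neighbours = defaultdict(set)
    for a, b in edges:
        neighbours[a].add(b)
        neighbours[b].add(a)

    core = {p for p, f in f_values.items() if f <= gamma}
    labels, cluster_id = {}, 0
    for seed in sorted(core):
        if seed in labels:
            continue
        labels[seed] = cluster_id
        queue = deque([seed])
        while queue:
            p = queue.popleft()
            for q in neighbours[p]:
                if q in labels:
                    continue
                labels[q] = cluster_id  # neighbours of a core point join its cluster
                if q in core:           # only core points keep expanding the cluster
                    queue.append(q)
        cluster_id += 1
    return labels
```

A border point that is adjacent to two clusters is assigned to whichever cluster reaches it first; that tie-break is a simplification of this sketch.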

Clustering Commercial Lands with Attribute Similarity
After the global and local trimming operations, a modified Delaunay triangulation, C-DT, was obtained. Based on C-DT, attribute similarity was used to identify clusters with higher urban centrality.
Many studies have proposed the clustering of spatial objects with attribute similarity. Although DBSCAN, kernel density estimation and Local Getis-Ord are widely used with attribute inequality to derive the city centre, they share two common drawbacks. First, the clustering quality depends heavily on user-defined parameters, which vary with the attribute distribution and cannot be assigned easily. Second, their similarity measure considers only the immediate similarity between two objects, which cannot capture their real differences because it ignores the clustering tendency of geographical phenomena. Compared to the above methods, DBSC [42] does not require any parameters, and the similarity between an object and a set of objects can consider both local and global differences. The good performance of the DBSC algorithm has been demonstrated on both simulated and actual datasets. However, the threshold used to determine the similarity of attributes in DBSC is fixed, which directly leads to unsatisfactory clustering results when attributes are non-homogeneous in geographical space.
Considering these facts, we introduced information entropy [45] to measure attribute similarity among commercial lands, and a corresponding threshold considering local information was designed. The proposed method was derived from the DBSC algorithm, but used a different measurement strategy. Some basic concepts were defined and used to explain the proposed clustering algorithm.
Definition 3. Attribute information entropy:
For spatial objects (commercial lands) in A = {x_1, x_2, ..., x_n}, the attribute (composite index of urban centrality) of A is denoted as R = {r_1, r_2, ..., r_n}. The attribute probability of an object x_k ∈ A in R can then be expressed as:

$$p(x_k) = \frac{r_k}{\sum_{i=1}^{n} r_i}$$

The attribute information entropy of A is expressed as:

$$H(M_r) = -\sum_{k=1}^{n} p(x_k)\,\log p(x_k)$$

According to the Maximum Entropy Theory, when the attribute values of all objects in a dataset are equal, the information entropy of the dataset reaches its maximum value; this is known as the principle of "equal probability maximum entropy". The more similar the spatial objects in A are to one another, the higher the value of H(M_r). Conversely, when the spatial objects in A show great differences, the value of H(M_r) is lower. Clustering follows the principle of "equal probability maximum entropy", such that a spatial object is partitioned into the most similar cluster.
Definition 4. Directly adjacent neighbour:
For an object P_1 in C-DT, the spatial neighbours of P_1 contain all the objects that directly link to P_1, denoted by Directly adjacent neighbour(P_1), abbreviated DN(P_1).
Definition 5. Indirectly adjacent neighbour:
If a chain P_1, P_2, ..., P_{n−1}, P_n meets the requirements that P_n only belongs to DN(P_{n−1}), P_k belongs to {DN(P_{k−1}) ∩ DN(P_{k+1}), 1 < k < n}, and P_2 only belongs to DN(P_1), then P_3, P_4, ..., P_n are the indirectly adjacent neighbours of P_1.
Definition 6. Information entropy measurement:
In C-DT, for spatial objects x_1 and x_2 ∈ Directly adjacent neighbour(x_1), x_i and x_j denote their attributes, respectively. The similarity between x_i and x_j is defined by Formula (9). Formula (9) considers only the similarity between two objects; it faces difficulties in capturing their actual differences, because the binary relationship may conceal the tendency of attribute similarity in the spatial distribution. When comparing the similarity between an object and a cluster, many objects should be used in the similarity measure. Information entropy is introduced to overcome the defects of the similarity measure with the binary relationship, because it can consider both local (two objects) and global (one object and a set of objects) differences.
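The behaviour described in Definition 3 can be checked numerically. The sketch below assumes the Shannon form of entropy over attribute values normalised to sum to one, with natural logarithms; the logarithm base and the function name `attribute_entropy` are assumptions of this example, and attribute values are assumed to be positive.

```python
import math


def attribute_entropy(values):
    """Shannon entropy of attribute values normalised to sum to one.

    values: positive attribute values r_1..r_n of the objects in a set.
    Natural logarithms are used here; the base is an assumption.
    """
    total = sum(values)
    probs = [v / total for v in values if v > 0]
    return -sum(p * math.log(p) for p in probs)


# Equal attribute values give the maximum entropy, log(n).
equal = attribute_entropy([2.0, 2.0, 2.0, 2.0])
# One dissimilar object lowers the entropy of the set.
mixed = attribute_entropy([1.0, 1.0, 1.0, 9.0])
```

Here `equal` matches `math.log(4)` up to rounding, and `mixed` is smaller, which is the "equal probability maximum entropy" behaviour that the clustering relies on.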
For a set of spatial objects B (containing at least two objects) in C-DT with attributes R = {r_1, r_2, ..., r_m}, the similarity between an object Q (with attribute r_Q) and cluster B is denoted by S(Q, B).
Definition 7. Neighbourhood entropy:
For an object P in C-DT, P together with all of its directly adjacent neighbours is defined as Sub_DN(P). The neighbourhood entropy of P, H_near(P), is defined from H_{n=|Sub_DN(P)|}(M_r), the information entropy of Sub_DN(P), where |Sub_DN(P)| is the number of objects in Sub_DN(P). Neighbourhood entropy measures the similarity between P and its directly adjacent neighbours; if P is similar to its directly adjacent neighbours, the value of H_near(P) is high.
Definition 8. Clustering centre point:
After computing H_near(P) for all objects and sorting the values in descending order, the object with the maximum value is extracted as the clustering centre point, Cluster_Point(P). The clustering centre point is not fixed; when a cluster is formed, the clustering centre point is updated among the objects that are not yet clustered.
Definition 9. Cluster:
For a Cluster_Point(P), breadth-first search [46] is used to visit its directly and indirectly adjacent neighbours. One cluster is formed if they satisfy the threshold (see Section 3.1.3) and no new object is added to the cluster.
Definition 10. Noise:
Given an object P, if P does not belong to any cluster, P is identified as noise.

Algorithm Description

Determination of the Attribute Clustering Threshold
From Definitions 3 and 6, when the attribute values in one cluster are equal, H(M) reaches its maximum value, denoted by H_max(M), and the cluster achieves the maximum similarity. If an object Q is similar to a known object or cluster, then the difference between the information entropy of the new cluster including Q and H_max(M) of the new cluster will be small, according to the monotonicity of information entropy. Otherwise, the difference will be large. Therefore, an object Q is similar to an object or cluster if Formula (15) is satisfied, where H(M′) is the information entropy of the new cluster and H_max(M′) is the maximum information entropy of the new cluster. An intermediate value for the parameter θ, 0 < θ < 1, is defined a priori, which can adapt to a range of different clusterings.
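As a short worked step on why a threshold in (0, 1) is natural, the following assumes the Shannon form of attribute entropy and a new cluster M′ with n′ objects. When all attribute values are equal, every attribute probability is 1/n′ and the entropy reduces to log n′. Comparing H(M′) with H_max(M′) as a ratio is an assumption of this sketch; the exact form of Formula (15) is not reproduced here.

```latex
\begin{aligned}
p_k &= \frac{1}{n'} \qquad (k = 1, \dots, n') \\
H_{\max}(M') &= -\sum_{k=1}^{n'} \frac{1}{n'} \log \frac{1}{n'} = \log n' \\
0 &\le \frac{H(M')}{H_{\max}(M')} \le 1
\end{aligned}
```

Under this reading, a ratio close to 1 means that adding Q leaves the cluster nearly homogeneous, which is consistent with a value such as θ = 0.98 being used as a cut-off.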
To obtain a suitable θ, the values in the same cluster should be similar to one another, and dissimilar to those in different clusters. The partitioning best method (PBM) index [47] provides a measure of how "neatly split" the clusters are, and is defined in terms of the following quantities: N_c is the number of clusters, N_i is the total number of points in cluster C_i, v_i is the centre of cluster C_i, and D_{N_c} measures the maximum separation between a pair of clusters. When this index is large, the rule of k standard deviations obtains a better result.
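A minimal sketch of the PBM index for one-dimensional attribute values is given below. It assumes the form commonly given in the clustering-validity literature, PBM = ((1 / N_c) · (E_1 / E_{N_c}) · D_{N_c})², where E_1 is the summed distance of all values to the overall centre and E_{N_c} is the summed distance of values to their own cluster centres. Whether the source uses exactly this form is an assumption, as is the function name `pbm_index`.

```python
def pbm_index(clusters):
    """PBM validity index for clusters of 1-D attribute values.

    clusters: list of lists of numbers, with at least two clusters.
    Assumes at least one value differs from its cluster centre, otherwise
    the within-cluster term is zero and the index is undefined.
    """
    centres = [sum(c) / len(c) for c in clusters]
    all_values = [v for c in clusters for v in c]
    grand_centre = sum(all_values) / len(all_values)

    # E_1: spread of all values around the overall centre.
    e_1 = sum(abs(v - grand_centre) for v in all_values)
    # E_Nc: spread of values around their own cluster centres.
    e_nc = sum(abs(v - centre) for c, centre in zip(clusters, centres) for v in c)
    # D_Nc: maximum separation between a pair of cluster centres.
    d_nc = max(abs(a - b) for i, a in enumerate(centres) for b in centres[i + 1:])

    return ((e_1 / e_nc) * d_nc / len(clusters)) ** 2
```

One plausible use, not confirmed by the source, is to run the clustering for several candidate values of θ and keep the value whose partition gives the largest index.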

Detection of Clusters Using Our Spatial Clustering Algorithm
The proposed methodology for detecting clusters involves four distinct stages.
Step 1. Construct spatial proximity relationships among commercial land blocks. This step can be implemented by the following operations:
(1) Convert commercial land blocks into discrete points, and then construct the Delaunay triangulation for these points.
(2) Remove the edges from the Delaunay triangulation using global edge-length constraints.
(3) Remove the edges from the Delaunay triangulation using local edge-length constraints.
Step 2. Compute the neighbourhood entropy. For each commercial land, calculate its neighbourhood entropy and then sort the commercial land blocks in descending order of neighbourhood entropy.
Step 3. Implement attribute clustering. This step involves the following operations:
(1) Select the object with the maximum neighbourhood entropy as Cluster_Point(P).
(2) Use breadth-first search to visit the directly and indirectly adjacent neighbours of P in descending order of their neighbourhood entropy. The cluster is formed when they satisfy Formula (15) and no new object can be added to the cluster; its members are then marked as clustered.
(3) Traverse all points that are not clustered by iterating operations (1)-(2). When clustering is finished, any point that does not belong to a cluster is identified as noise.
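Steps 2 and 3 can be sketched as below. This is an illustrative reading rather than the authors' code. It assumes that neighbourhood entropy is the Shannon entropy of an object and its direct neighbours divided by log(n), and that the similarity test compares the same normalised entropy of the trial cluster against θ. Both forms, and the names `normalised_entropy` and `entropy_clustering`, are assumptions. The adjacency dictionary is assumed to come from the trimmed triangulation of Step 1.

```python
import math
from collections import deque


def normalised_entropy(values):
    """Shannon entropy of values normalised to sum to one, divided by log(n)."""
    if len(values) < 2:
        return 1.0
    total = sum(values)
    h = -sum((v / total) * math.log(v / total) for v in values if v > 0)
    return h / math.log(len(values))


def entropy_clustering(neighbours, attrs, theta):
    """Cluster objects by breadth-first growth from high-entropy centre points.

    neighbours: dict id -> set of directly adjacent ids.
    attrs: dict id -> positive attribute value.
    Returns dict id -> cluster label; ids missing from the result are noise.
    """
    # Step 2: neighbourhood entropy of each object, sorted in descending order.
    h_near = {
        p: normalised_entropy([attrs[p]] + [attrs[q] for q in neighbours.get(p, ())])
        for p in attrs
    }
    order = sorted(attrs, key=lambda p: h_near[p], reverse=True)

    labels, next_id = {}, 0
    for centre in order:  # Step 3 (1): next unclustered centre point
        if centre in labels:
            continue
        members, claimed = [centre], {centre}
        queue = deque([centre])
        while queue:  # Step 3 (2): breadth-first growth of the cluster
            p = queue.popleft()
            candidates = sorted(
                (q for q in neighbours.get(p, ()) if q not in labels and q not in claimed),
                key=lambda q: h_near[q],
                reverse=True,
            )
            for q in candidates:
                trial = [attrs[m] for m in members] + [attrs[q]]
                if normalised_entropy(trial) >= theta:  # assumed similarity test
                    members.append(q)
                    claimed.add(q)
                    queue.append(q)
        if len(members) > 1:
            for m in members:
                labels[m] = next_id
            next_id += 1
    return labels
```

On a chain of seven objects whose attributes fall into a high group, a low group, and one outlier, this sketch returns two clusters and leaves the outlier unlabelled, i.e., as noise.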

Algorithm Analysis

Implementation Procedure of the Algorithm
The implementation procedure of our algorithm is further illustrated through the simulated data in Figure 3.The attribute values (i.e., comprehensive index) of objects (i.e., centroids of commercial lands) were labelled in Figure 3a.Delaunay triangulation with global and local edge-length (γ = 0.2365) constraints was utilised to model spatial proximity relationships, and the clustering result is shown in Figure 3b.The next step computed the neighbourhood entropy of all objects based on the spatial proximity relationships.P1 was first selected as the clustering centre point (see Figure 3c), and the neighbours of P1 were detected as the first cluster C1 (θ = 0.98).P2 was then identified as the new clustering centre point, and the neighbours of P2 formed the second cluster C2.Finally, P3 was selected as the Clustering Center Point, and the neighbours of P3 were classified as cluster C3.The clustering results of simulated datasets (see Figure 3d) illustrated that objects in the same cluster are indeed similar in both the spatial and attribute domains.

Algorithm Validation
To validate the proposed algorithm, a 3-D (1-D attribute) simulated dataset was used. The attribute value (AV) range of the objects in each predefined cluster (C1-C5) is labelled in Figure 4a. The attribute values of objects in each cluster were randomly assigned within a certain range. To ensure randomization, 20 replications of the attribute value were generated for each object, and the average of these 20 replications was set as the final attribute value of the object.
The two-level strategy algorithm was first used to construct spatial proximity relationships among 2-D planar points (the result can be seen in Figure 2), and attribute clustering was then implemented. The attributes were non-homogeneous in geographical space: the attribute value range in C2 was different from the others. The clustering results obtained using the DBSC algorithm and our method are shown in Figure 4b,c. Our method separated the five clusters and the isolated noise very well, but the DBSC algorithm detected 17 clusters. To observe this type of non-homogeneous effect, we broadened the attribute value range in C2 (see Figure 4d). As shown in Figure 4e,f, our method still performed well, while the DBSC algorithm misclassified some objects in C2 as noise. The DBSC algorithm could not provide satisfactory results, primarily because the global parameter could not adapt to the non-homogeneous phenomena. In addition, Figure 4g shows the sensitivity test to outliers, where the attributes were homogeneous in geographical space (the attribute difference in each cluster is similar). Some clusters were assigned outliers (denoted by a symbol in the figure), with the attribute values of the outliers randomly set from 40 to 60. As seen in Figure 4h,i, both our method and the DBSC algorithm separated the five clusters and the outliers very well.

Characterising the City Centre
Delineating the city centre by using spatial information requires not only geometric and topological attributes (spatial proximity relationships in Section 3.2.1), but also socio-economic attributes. According to a related analysis of the city centre for China's urban study, we suggested three key factors for characterising the city centre: geometry (area and aggregation degree), economy (employment density [48] and land price [12,49]), and traffic accessibility (road network density and distance from the nearest road [50]). These were then combined into a comprehensive index to measure the urban centrality of commercial lands.

Extraction of Indicators
In this study, several indicators, such as aggregation degree, employment density, road network density, and distance from the nearest road, could not be obtained directly because commercial land data features high spatial resolution. Thus, we used a geographic information system (GIS) to support the analysis.
(1) Aggregation degree of commercial land. The distribution of commercial land in city centre areas tends to be clustered, and the area of each commercial plot alone cannot measure the unequal distribution of commercial land. Therefore, we considered both the area of and the space occupied by commercial land to measure the aggregation degree of commercial land. We used an approach provided by Ai and Van Oosterom [51] to compute the space occupied by commercial land. A geometric construction similar to a Voronoi diagram was created based on the skeleton of the Delaunay triangulation built using the polygon (commercial land) boundaries, as shown in Figure 5a. The aggregation degree of commercial land is expressed as
$$\mathrm{Den}(s_i) = \frac{AP_i}{AV_i} \tag{1}$$
where s_i denotes a commercial plot, Den(s_i) denotes the aggregation degree of s_i, AP_i denotes the area of s_i, and AV_i denotes the area of the space occupied by s_i. The range of Den(s_i) is 0 to 1. If Den(s_i) tends to 1, then the distribution of commercial land is clustered. The higher clustering of commercial plots was in the city centre areas, as shown in Figure 5b (Region 1). In contrast, when Den(s_i) tended to 0, the distribution of commercial land was decentralized, as shown in Figure 5b (Region 2).
(2) Road network density and distance from the nearest road. Many urban districts with high commercial functions have formed around the road network. Those regions feature high accessibility; that is, the commercial lands located in the city centre areas show higher accessibility. Road network density and distance from the nearest trunk road are important indicators for measuring accessibility. To compute the road network density for a commercial land, the buffer operation of GIS was used to measure the total length of the roads located within a circular area with a radius of 1 km around the commercial land, as shown in Figure 6. Therefore,
$$\mathrm{RoadDen}(s_i) = \frac{LP_i}{AB_i} \tag{2}$$
where s_i denotes a commercial plot, RoadDen(s_i) denotes the road network density of s_i, AB_i denotes the buffer area of s_i (the buffer radius is usually set to 1 km), and LP_i denotes the total length of the roads located inside the circular area of the commercial plot.
We also adopted the nearest neighbour analysis of GIS to compute the distance from the commercial land to the nearest road, including trunk roads, road intersections, and subways.
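The road network density computation can be sketched without a GIS package by clipping each road segment to a circle and dividing the clipped length by the circle area. This is a minimal sketch that assumes the buffer is a circle centred on the plot centroid and that roads are polylines given as coordinate lists; the function names and coordinates are hypothetical.

```python
import math


def length_inside_circle(p, q, centre, radius):
    """Length of the part of segment p-q that lies inside the circle."""
    dx, dy = q[0] - p[0], q[1] - p[1]
    fx, fy = p[0] - centre[0], p[1] - centre[1]
    a = dx * dx + dy * dy
    if a == 0:
        return 0.0
    b = 2 * (fx * dx + fy * dy)
    c = fx * fx + fy * fy - radius * radius
    disc = b * b - 4 * a * c
    if disc <= 0:  # the supporting line misses or only touches the circle
        return 0.0
    root = math.sqrt(disc)
    t_in = max(0.0, (-b - root) / (2 * a))
    t_out = min(1.0, (-b + root) / (2 * a))
    return (t_out - t_in) * math.sqrt(a) if t_out > t_in else 0.0


def road_density(roads, centre, radius=1.0):
    """Total road length inside the buffer divided by the buffer area."""
    total = sum(length_inside_circle(p, q, centre, radius)
                for road in roads for p, q in zip(road, road[1:]))
    return total / (math.pi * radius ** 2)


# Hypothetical roads (km): one crossing the buffer, one inside, one far away.
roads = [[(-2.0, 0.0), (2.0, 0.0)], [(0.0, 0.0), (0.0, 0.5)], [(5.0, 5.0), (6.0, 6.0)]]
density = road_density(roads, centre=(0.0, 0.0), radius=1.0)
print(density)
```

Here 2 km of the first road and 0.5 km of the second fall inside the 1 km buffer, so the density is 2.5 km divided by the buffer area of π km².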
(3) Employment density. The distribution of employment data has been widely used to identify city centres. Conventionally, however, employment density is obtained by census tract, which does not match our statistical zone (i.e., commercial land). We therefore computed the employment density of a commercial land as the density of the census tract into which the primary part of the commercial land fell. As shown in Figure 7, the employment densities of commercial lands 1, 2 and 3 were the densities of CT-1, CT-2 and CT-3, respectively. The value of commercial land 4 was the same as that of commercial land 2 because the primary part of this commercial land fell into CT-2.
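The "primary part" rule can be illustrated with a short Python sketch. For simplicity it assumes that plots and census tracts are axis-aligned rectangles, whereas real data would require polygon intersection in a GIS. The tract names follow Figure 7, but the coordinates and density values are hypothetical.

```python
def overlap_area(a, b):
    """Overlap of two rectangles given as (xmin, ymin, xmax, ymax)."""
    width = min(a[2], b[2]) - max(a[0], b[0])
    height = min(a[3], b[3]) - max(a[1], b[1])
    return width * height if width > 0 and height > 0 else 0.0


def assign_density(plot, tracts):
    """Return the density of the tract that contains the largest part of the plot."""
    best = max(tracts, key=lambda name: overlap_area(plot, tracts[name][0]))
    if overlap_area(plot, tracts[best][0]) == 0:
        return None  # the plot touches no census tract
    return tracts[best][1]


# Hypothetical tracts: name -> (rectangle, employment density).
tracts = {
    "CT-1": ((0.0, 0.0, 10.0, 10.0), 120.0),
    "CT-2": ((10.0, 0.0, 20.0, 10.0), 300.0),
}
inside_ct1 = assign_density((1.0, 1.0, 3.0, 3.0), tracts)
straddling = assign_density((8.0, 2.0, 14.0, 6.0), tracts)  # mostly in CT-2
print(inside_ct1, straddling)
```

The straddling plot overlaps CT-1 by 8 area units and CT-2 by 16, so it receives the CT-2 density, mirroring the treatment of commercial land 4.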

Index of Urban Centrality
The next step combined these indicators into a comprehensive index to represent the "urban centrality" of each commercial land. Factor analysis [52] was originally developed by psychologists to reduce many variables to a smaller number of underlying factors, dimensions, or components. In this study, we adopted this method to reduce the dimensions of multidimensional attribute data, aggregating geometric and thematic attributes into a comprehensive index of "urban centrality". Using the variance contribution rate of the factors as weight, factor analysis was applied to formulate the comprehensive index value F, expressed as
$$F = \frac{\sum_{i=1}^{m} \omega_i F_i}{\sum_{i=1}^{m} \omega_i} \tag{3}$$
where F_i (i = 1, ..., m) denotes the extracted factors, ω_i denotes the variance contribution rate of factor F_i, and \sum_{i=1}^{m} ω_i denotes the total variance contribution rate.
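The weighting step can be sketched in a few lines of Python. The factor scores and variance contribution rates below are hypothetical, and the factor extraction itself is not shown. The case study later rescales the index to the range 0 to 10; min-max scaling is assumed here for that step.

```python
def comprehensive_index(factor_scores, variance_rates):
    """Weighted sum of factor scores, weights being variance contribution shares."""
    total = sum(variance_rates)
    return sum(w / total * f for w, f in zip(variance_rates, factor_scores))


def rescale(values, low=0.0, high=10.0):
    """Min-max rescaling (an assumed choice) of the index values to [low, high]."""
    v_min, v_max = min(values), max(values)
    return [low + (v - v_min) * (high - low) / (v_max - v_min) for v in values]


variance_rates = [50.0, 30.0, 20.0]  # hypothetical contribution rates (%)
plots = {                            # hypothetical factor scores per plot
    "plot A": [1.0, 0.0, -1.0],
    "plot B": [-0.5, 0.5, 0.5],
    "plot C": [1.2, 0.8, 0.3],
}
raw = {name: comprehensive_index(s, variance_rates) for name, s in plots.items()}
scaled = dict(zip(raw, rescale(list(raw.values()))))
print(raw, scaled)
```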

Extracting the City Centre Area
After spatial clustering that considered both spatial proximity and attribute similarity, we could detect clusters with different urban centrality values. The next step was to derive the city centre from these clustered commercial lands. First, we computed the mean value of each cluster and used positive standard deviation values [12,35,53] to delimit the city centre areas showing higher urban centrality. Second, polygons were used to determine the precise boundaries of the city centre areas consisting of commercial lands. In this study, we utilised Delaunay-based shape reconstruction to generate the city centre area from a set of commercial land clusters.
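The first of these two operations can be sketched as follows. The sketch assumes that the threshold is the mean of the cluster mean values plus k standard deviations of those means, which is one reading of "positive standard deviation values". The cluster names and values are hypothetical, and k = 1 is used only because the toy example has very few clusters.

```python
import statistics


def high_centrality_clusters(clusters, k=3.0):
    """Select clusters whose mean exceeds mean + k * std of all cluster means."""
    means = {name: statistics.fmean(vals) for name, vals in clusters.items()}
    mean_values = list(means.values())
    threshold = statistics.fmean(mean_values) + k * statistics.pstdev(mean_values)
    selected = [name for name, m in means.items() if m > threshold]
    return threshold, selected


# Hypothetical clusters of urban centrality values.
clusters = {
    "C1": [2.0, 2.2],
    "C2": [3.0, 3.1, 2.9],
    "C3": [8.0, 8.4],
    "C4": [2.5, 2.7],
}
threshold, selected = high_centrality_clusters(clusters, k=1.0)
print(round(threshold, 2), selected)
```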
Many methods have been proposed in the literature for boundary representation. The Delaunay-based method captures geometric and topological information of the objects well, and it is simpler and more systematic than other methods. As depicted in Figure 8b, we initially constructed the Delaunay triangulation of the boundary points of the commercial lands. Before triangulation, the boundary points should be densified according to the Gestalt law of proximity (Figure 8a). Triangle filtration following Peethambaran and Muthuganapathy [54] was subsequently employed to extract the boundaries of the city centres (Figure 8c). This method was selected because it can derive an acute and unique shape without any external parameters. Triangle filtration proceeds by iteratively removing all thin boundary triangles for which the CIRCUMCENTER and REGULARITY constraints are satisfied (for additional details on triangle filtration, see [54]).

Application
The proposed methodology was applied to delineate city centres in two cities in China, Yangzhou and Wenshan, which differ markedly. Yangzhou is located in the Yangtze River Delta, the largest economic zone in China. With its development in recent years, Yangzhou has evolved into a polycentric city. In contrast, Wenshan is a new and small city located in the southwest basin region of China. The two cities represent two typical urban spatial structures in China, i.e., polycentric and monocentric cities. The analysis of the city centre could be broadened to other cities in China.
Since the algorithm steps were the same in both cases, the first application is presented as the primary example. The urban land use dataset for the first application was provided by the Yangzhou Planning Bureau for the year 2012. The dataset is composed of 735 records of commercial lands, and their spatial distributions are shown in Figure 9. The different stages of the recognition process and their outputs for the presented case study are illustrated in the following.

Construction of Spatial Proximity Relationships
As shown in Figure 10, the centroid was used to represent each commercial land, and Delaunay triangulation was introduced to construct the spatial proximity among these points. Delaunay triangulation with global and local edge-length (γ = 0.4638) constraints was utilised to model the spatial proximity relationships, and the clustering results are shown in Figure 10a-d.

Using the Factor Analysis Method for the Index of Urban Centrality
Factor analysis can be implemented through several methods; the principal axis factor method was applied in this study. We considered the geometric and thematic attributes discussed above, and a 6 × 6 matrix of correlation coefficients was created. Three primary factors with eigenvalues over 1.0 were extracted using the principal axis factor and maximum variance rotation methods. Table 1 provides the factor loadings, and the cumulative variance reached 91.957%. According to Formula (3), the comprehensive index of urban centrality F for each commercial land is represented as a function of the factor scores, in which f_1 denotes the first factor score and value is the indicator value. To facilitate the calculation, we normalize the F value to the range of (0, 10).
After the global and local trimming operations, we applied the clustering method based on information entropy to the comprehensive index of urban centrality. Clusters were identified, and noise points were removed. Figure 11a,b show the clustering results using our method and the DBSC algorithm (different colours represent different clusters). Basic statistical information from the clustering results obtained using the two approaches is also provided (see Figure 12 and Table 2), including the number of clusters and noise points, the mean and standard deviation of each cluster, the trend of the mean values of the clusters, and the coefficient of variation (CV) of the clusters. A simple comparison shows that both algorithms presented a general attribute distribution with similar patterns, as reflected by the trend of the mean values of the clusters. However, some cluster details from our method and DBSC were rather different. As shown in Table 2, our method discovered 72 clusters and 26 noise points. Figure 12c shows that there was a significant difference between adjacent clusters, and that the variation within each cluster was small (see Figure 12a). The DBSC algorithm generated more clusters and noise points than our method, leading directly to a smaller difference between adjacent clusters (i.e., the CV of the mean values of the clusters was smaller than ours).
The DBSC algorithm could not provide satisfactory results, primarily because the global parameter used for clustering commercial lands did not adapt to the local variation of the attributes and therefore failed to recognize similar clusters. We handled this local variation by setting the relative variation proportion θ, with the PBM index being used to find a suitable θ value. Figure 13 shows the resulting curve, with the θ value on the x-axis and the PBM index on the y-axis. The maximum value of the PBM index was achieved when θ equalled 0.986. Notably, the PBM index began to change only when θ equalled 0.96, because the clustering results of spatial proximity and attribute similarity were considered the same in the interval (0, 0.958].

Results and Validation
To delimit the geographical extent of the city centres, we further examined the tail values of the cluster results, using positive standard deviation values. Studies on delimiting the city centre often suggest a value of three standard deviations [35,53]. Figure 14a,c show the urban centrality classification of the commercial lands using a standard deviation of 3 with our cluster results and the DBSC cluster results (i.e., the mean value of each cluster). The red commercial lands in Figure 14 were concentrated in two areas of the city, forming a polycentric structure. We confirmed that the two areas with an urban centrality value over 7.28 (tail value of our method) or 7.38 (tail value of DBSC) had a larger population density, a higher road network density, and a higher land price. As polygon clusters had been identified from commercial lands, we used the method discussed above (see Section 3.3) to derive the boundaries of the city centres from the clustered polygons. Figure 14b,d map the two identified city centres (centre 1 and centre 2) with precise boundaries using the two alternative methods.
We validated the computed boundaries by comparing them with city centres known from prior knowledge. According to the Comprehensive Plan of Yangzhou City (1996-2010), the functional core was primarily located in two regions: the traditional Wen Changge centre and the new He Dong centre. The comparative city centre representations were provided as narrative descriptions of the extents of the Yangzhou city centres, and these descriptions were mapped with precise boundaries (see the blue areas in Figure 15a). Figure 15a,b show the computed city centres (purple areas) using the two alternative methods versus the comparative city centres (blue areas). We further used the F1-Score, which is the harmonic mean of precision and recall, for a quantitative comparison of the overlap between the computed and comparative city centre areas. The evaluation indicator is calculated as follows:
$$\mathrm{Precision} = \frac{a_{overlap}}{a_{computed}}, \qquad \mathrm{Recall} = \frac{a_{overlap}}{a_{comparative}}, \qquad F_1 = \frac{2 \cdot \mathrm{Precision} \cdot \mathrm{Recall}}{\mathrm{Precision} + \mathrm{Recall}}$$
where a_computed is the city centre area as delimited by the algorithm, a_comparative is the area of the comparative city centre, and a_overlap is the area where the computed and comparative city centres overlap.
The evaluation results for Figure 15 are presented in Table 3. We found that the F1-Score of our result reached 82.19% and 54.84%, and all indicators were larger than the corresponding indicators from the DBSC results, i.e., 64.38% and 41.11%. The He Dong centre computed using the algorithms was smaller than the comparative city centre. The computational models omitted the northern part in Figures 14 and 15 because these areas contain many instances of residential land and lack commercial land. Given the results from the year 2012, the urban development and its functions evolved as originally planned. This evaluation illustrates that our methodology based on commercial land has the potential to yield satisfactory results, possibly leading to the development of a new method of confirming the recognition of the city centre.
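The overlap-based evaluation can be reproduced with a few lines of Python. The area values below are hypothetical and serve only to show the arithmetic. In practice the three areas would come from a polygon overlay in a GIS.

```python
def overlap_scores(a_computed, a_comparative, a_overlap):
    """Precision, recall and F1-Score from the three areas."""
    precision = a_overlap / a_computed
    recall = a_overlap / a_comparative
    if precision + recall == 0:
        return precision, recall, 0.0  # no overlap at all
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical areas in square kilometres.
precision, recall, f1 = overlap_scores(a_computed=4.0, a_comparative=5.0, a_overlap=3.0)
print(precision, recall, round(f1, 4))
```

A computed centre that is too large lowers precision, while one that is too small lowers recall, so the F1-Score penalises both kinds of mismatch.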

Case study of Wenshan
In this section, we verified the feasibility of the proposed method using a different city in China, i.e., Wenshan. Figure 16a presents the spatial distribution of commercial lands for the year 2013. Delaunay triangulation with global and local edge-length constraints was utilised to model the spatial proximity relationships, and the clustering results are shown in Figure 16b,c. Factor analysis was then applied to derive the comprehensive index representing the "urban centrality" of each commercial land, and the cumulative variance reached 86.764%. As before, two different methods were used and compared to detect clusters with higher urban centrality. Figures 17a and 18a illustrate the cluster results detected by the two methods. To verify the feasibility of our method, the final city centre was computed using the classification of a standard deviation of 3 (see Figures 17b and 18b). The red commercial land areas in Figures 17b and 18b were concentrated in one area of the city, forming a monocentric structure. According to the Comprehensive Plan of Wenshan City (2006-2025), we mapped the geographic extent of the planned city centre (see Figures 17c and 18c). Based on the results, we compared our method and the DBSC algorithm using the F1-Score. Table 4 presents the evaluation results. Compared to our method, the DBSC method generated a larger space for the city centre, as reflected by its lower precision. The evaluation indicator shows that the proposed method is also feasible for different cities.
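The proximity-modelling step can be illustrated with a minimal sketch. It assumes the Delaunay edges have already been computed elsewhere and are passed in as index pairs, and it uses a simple stand-in for the global constraint (drop every edge longer than the mean edge length plus `k` standard deviations). This rule, the parameter `k`, and the function name are assumptions made for illustration; they are not the exact global and local constraints used in this study.

```python
import math
from statistics import mean, pstdev


def cluster_by_edge_length(points, edges, k=1.0):
    """Group points into clusters after removing unusually long edges.

    points: list of (x, y) centroids.
    edges:  list of (i, j) index pairs, assumed to come from a Delaunay
            triangulation computed elsewhere.
    k:      hypothetical tuning factor for the global cutoff.
    """
    lengths = {(i, j): math.dist(points[i], points[j]) for i, j in edges}
    adjacency = {i: set() for i in range(len(points))}
    if lengths:
        values = list(lengths.values())
        # Assumed global rule: keep edges no longer than mean + k * std.
        cutoff = mean(values) + k * pstdev(values)
        for (i, j), length in lengths.items():
            if length <= cutoff:
                adjacency[i].add(j)
                adjacency[j].add(i)

    # Connected components of the pruned graph are the spatial clusters.
    seen, clusters = set(), []
    for start in range(len(points)):
        if start in seen:
            continue
        seen.add(start)
        stack, component = [start], []
        while stack:
            node = stack.pop()
            component.append(node)
            for neighbour in adjacency[node]:
                if neighbour not in seen:
                    seen.add(neighbour)
                    stack.append(neighbour)
        clusters.append(sorted(component))
    return clusters


# Two tight groups of centroids joined by two long edges (made-up data).
points = [(0, 0), (1, 0), (0, 1), (10, 10), (11, 10), (10, 11)]
edges = [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (1, 3), (2, 3)]
print(cluster_by_edge_length(points, edges))
```

A local constraint would repeat the same pruning inside each resulting component, using statistics of the edges around each point rather than of the whole triangulation.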

Conclusions and Outlooks
Because the construction of an urban spatial structure in the planning and design stages follows a forward process from planning to expression, this research contributes a new method for constructing an urban spatial structure from urban land use data. Using the city centre as an example, this study presents a methodology to delineate the city centre from commercial land data. The advantage of using urban land as a primary data source is that it has a much higher spatial resolution and finer spatial granularity than conventional statistical units. This advantage makes it possible to gain an immediate impression of the distribution of geographic phenomena and to achieve high precision in the statistical agglomeration of the city centre. The research considers the city centre to be a special region consisting of several commercial lands with different characteristics. The proposed methodology determines the city centre by considering not only geometric and topological attributes, but also socio-economic attributes. Two types of cities were used to validate the effectiveness and practicability of our methodology, and the results showed that constructing an urban spatial structure from urban land use is feasible, and that our method provides a methodological foundation for analysing city centre evolution between the existing and planned conditions.
In this paper, we characterised the city centre using commercial land data as units described by a series of indicators, and integrated these indicators into a composite index of "urban centrality". Factor analysis was applied to formulate this comprehensive index, and it was accepted on the basis of a statistical test, i.e., a cumulative variance ≥ 80%. When different indicators or cities are considered, the question is whether this way of formulating the comprehensive index remains appropriate for characterising the city centre. The weighted sum calculation model is another method for determining the comprehensive index in urban analysis [16,31]. This method determines a weight for each indicator, and the weighted indicators are then summed into a single index. In contrast to factor analysis, determining the weights is a user-driven process, not a data-driven one. A previous study [16] showed that there is little difference between the two methods in urban structure analysis. Further research might suggest that, when the data cannot satisfy the requirements of factor analysis, the weighted sum calculation model can be used instead.
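The weighted sum alternative can be sketched in a few lines. The example below is a minimal illustration under stated assumptions: each indicator is rescaled to [0, 1] by min-max normalisation (one common choice, not necessarily that of [16,31]), the weights are user-supplied and sum to 1, and the function name, indicator columns, and weights are all hypothetical.

```python
def weighted_sum_index(indicators, weights):
    """Combine indicator columns into one composite index per land unit.

    indicators: list of rows, one row of indicator values per land unit.
    weights:    one user-chosen weight per indicator column, summing to 1.
    """
    if abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("weights must sum to 1")
    columns = list(zip(*indicators))
    if len(columns) != len(weights):
        raise ValueError("need exactly one weight per indicator column")

    normalised = []
    for column in columns:
        low, high = min(column), max(column)
        span = high - low
        # Assumed min-max scaling; a constant column contributes zero.
        normalised.append([0.0 if span == 0 else (v - low) / span
                           for v in column])

    return [
        sum(w * column[row] for w, column in zip(weights, normalised))
        for row in range(len(indicators))
    ]


# Three land units with two made-up indicators (e.g., area and density).
units = [[0, 10], [5, 20], [10, 30]]
print(weighted_sum_index(units, [0.5, 0.5]))
```

Because the weights are chosen by the analyst, changing them changes the ranking of land units directly, which is what makes this a user-driven rather than a data-driven procedure.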
To delineate a precise boundary for the city centre, we proposed an adaptive spatial clustering method that considers both spatial proximity and attribute similarity. Although DBSCAN, kernel density estimation, and Local Getis-Ord are widely used to derive the city centre, their results are highly influenced by their parameters, especially when there is little prior knowledge about the cities. In addition, their similarity measurement considers only the immediate similarity between two objects, which cannot differentiate their real differences. The DBSC algorithm was proposed to overcome these drawbacks; however, it is limited by its reliance on a global parameter. The final city centre boundaries depend directly on the clustering process. Our proposed method addresses these constraints of the DBSC algorithm in order to obtain a more precise clustering result. To test its validity, we compared our method with the DBSC algorithm on simulated datasets and on city centres. Our method obtained a more accurate clustering result and was more suitable for cities with different urban spatial structures.
As urban land use data are not widely available in China, we used two cities with different structures (i.e., polycentric and monocentric) to validate our methodology. We see two main extensions of our work in future research. One is that the same experiments should be carried out for more cities in China to better represent and confirm our work. The other is that additional or different thematic attributes should be considered when indicating the urban centrality of other cities.

Figure 1. Overview of the procedure for delineating the city centre.

ISPRS Int. J. Geo-Inf. 2017, 6, 122
selected as the Clustering Center Point, and the neighbours of P3 were classified as cluster C3. The clustering results of simulated datasets (see Figure 3d) illustrated that objects in the same cluster are indeed similar in both the spatial and attribute domains.

Figure 3. Spatial clustering implemented using our algorithm: (a) simulated data with attributes; (b) spatial clustering considering spatial proximity; (c) the clustering centre point; (d) spatial clustering considering spatial proximity and attribute similarity.

Figure 5. (a) Similar Voronoi polygon (black bold line) of the surface entity (gray block) built using Delaunay triangulation (black line); (b) aggregation degree of commercial land.

Figure 6. Road network density obtained from the buffer operation of GIS.

Figure 8. Extracting the city centre areas from commercial lands using triangle filtration: (a) boundary points with increased density; (b) Delaunay triangulation of boundary points for commercial lands; (c) boundary generated using the triangle filtration method.

Figure 9. Spatial distribution of commercial land in Yangzhou.
Delaunay triangulation clustering can discover clusters with different shapes and is robust to outliers. The final result reflected the spatial distribution pattern of commercial land in Yangzhou.

Figure 10. The process of spatial clustering considering spatial proximity: (a) the centroids of commercial land; (b) Delaunay triangulation of centroids; (c) result using the Global constraint; (d) result using the Local constraint (γ = 0.4638).

Figure 12. Comparison of spatial clustering results between the DBSC algorithm and our method: (a) standard deviation of each cluster using our method; (b) standard deviation of each cluster using DBSC; (c) mean value of each cluster using our method, with the trend of the mean values of clusters marked; (d) mean value of each cluster using DBSC, with the trend of the mean values of clusters marked.

Figure 13. Curve plot of the θ value and the PBM index.

Figure 14. Extraction results of city centre areas in Yangzhou using urban land as the primary data source: (a) city centres derived using our method; (b) geographical extent of city centres using triangle filtration; (c) city centres derived by DBSC; (d) geographical extent of city centres using triangle filtration.

Figure 15. Comparison of the boundaries of city centres delineated by the two algorithms (in purple) with comparative city centres (in blue): (a) our result; (b) DBSC result. Background mapping from OpenStreetMap.

Figure 16. The process of spatial clustering considering spatial proximity: (a) the distribution of commercial land; (b) result using the Global constraint; (c) result using the Local constraint (γ = 0.4213).

Figure 17. The identified city centres generated using our method: (a) spatial clustering results for the urban centrality index (θ = 0.98); (b) the derived city centre areas; (c) the computed city centre (in purple) versus the comparative city centre (in blue). Background mapping from OpenStreetMap.

Figure 18. The identified city centres generated using the DBSC method: (a) spatial clustering results for the urban centrality index (T = 0.436); (b) the derived city centre areas; (c) the computed city centre (in purple) versus the comparative city centre (in blue). Background mapping from OpenStreetMap.

Table 2. Statistics for clustering results in Figure 10.

Table 3. Comparison of overlap between the computed and comparative city centres.

Table 4. Comparison of the overlap between the computed and comparative city centres.
