Automatic Delineation of Urban Growth Boundaries Based on Topographic Data Using Germany as a Case Study

Abstract: Urban Growth Boundary (UGB) is a growth management policy that designates specific areas where growth should be concentrated in order to avoid urban sprawl. The objective of such a boundary is to protect agricultural land, open spaces and the natural environment, as well as to use existing infrastructure and public services more efficiently. Due to the inherent heterogeneity and complexity of settlements, UGBs in Germany are currently created manually by experts. Therefore, every dataset is linked to a specific area, investigation period and dedicated use. Clearly, up-to-date, homogeneous, meaningful and cost-efficient delineations created automatically are needed to avoid this reliance on manually or semi-automatically generated delineations. Here, we present an aggregative method to produce UGBs using building footprints and generally available topographic data as inputs. It was applied to study areas in Frankfurt/Main, the Hanover region and rural Brandenburg while taking full account of Germany’s planning and legal framework for spatial development. Our method is able to compensate for most of the weaknesses of available UGB data and to significantly raise the accuracy of UGBs in Germany. Therefore, it represents a valuable tool for generating basic data for future studies. Application elsewhere is also conceivable by regionalising the employed parameters.


Introduction
Urban growth, encompassing both land take and urban sprawl, is one of the main challenges facing urban planners. By 2050, nearly seven out of ten people in the world will live in cities. The urban population will double its current size by then. The challenge is that urban areas are currently growing at an average rate twice as fast as the population [1]. Urban population growth, together with economic development, is expected to add 1.2 million km² of new urban built-up area to the world in the next three decades [2]. The reasons for urbanisation are manifold and are related to a variety of factors that are difficult to observe comprehensively at the global level, including international capital flows, the informal economy, land use policy and generalized transport costs [3]. While there are international studies on the causes of land take [4], there have only been a few studies observing regional urbanisation trends [5,6] and examining the causes of land take in Germany [7].
The consequences of land take are manifold, particularly regarding its diverse negative effects on the non-renewable and limited resources of land and fertile soil. Furthermore, urban growth undermines biodiversity by fragmenting and shrinking habitats and biotopes [8][9][10]. Other negative impacts include increased flood risk, reduced underground reservoirs as a result of soil sealing [11,12] and rising greenhouse gas emissions as by-products of the associated increase in affluence that usually accompanies urbanisation [13,14]. Moreover, urban sprawl has social and economic repercussions such as increased traffic, less attractive landscapes [15], rising public costs [16] and higher costs of living [9], as well as socio-residential segregation [17,18]. On the other hand, urbanisation offers developing countries in particular the opportunity for more sustainable development [14]. Accompanying and steering this process involves harnessing the growth and development benefits of urbanisation while proactively managing its negative effects [19].
Therefore, sustainable land use development has become one of the guiding principles of spatial planning, formalised in policy goals on international [20], supranational [21] and national levels [22]. For example, the European Union has set a target of reducing net land take to zero by 2050. Individual countries have defined their own objectives in this regard [23]. Germany, for instance, has set the limit of land take for buildings and transport infrastructure to 30 ha per day by 2020 [24]; in France the rate of agricultural land take is about to be reduced by 50% [13] and in the UK, 60% of new housing must be constructed on brownfield sites [25].
To achieve these goals, appropriate measures and tools are required to realise sustainable forms of land use [26]. Here, robust monitoring systems with up-to-date indicators are required to determine the rate of progress towards the chosen goals and to help policymakers evaluate the effectiveness of measures [27,28]. Spatial data and analysis of urban and inter-city levels are needed for at least half of the indicators stipulated in Goal 11 of the UN's Sustainable Development Goals [27].

Urban Growth Boundaries
Settlements are each unique in their form, fragmentation and structure. The standard distinction between urban and rural is, in fact, spatially and functionally blurred [27]. Urban settlements can be defined in many ways: for example, as large and densely populated regions with a special administrative, legal or historical status [29]. Furthermore, settlements, and especially cities, can be considered economic or commercial hubs, where benefits are realised through the sharing of natural resources and reduced transportation costs [30] or as places with 'physical, social, economic and cultural dimensions' [31]. Determining the extent of a city by means of boundaries is essential for comparative studies that seek to measure change over time or when conducting environmental impact assessments or land use mapping/classification. Furthermore, boundaries can then serve as auxiliary geometries for objects within urban settlements to identify the partonomic relationships between different classes of geographic elements such as transportation networks, green spaces and rivers. They are features that help to characterise any given city [29]. Urban growth boundaries (UGBs), also known as urban edge strategies [32,33], are closely linked to this point. UGB is a growth management policy concept that designates certain areas where growth should occur in order to avoid urban sprawl [34,35]. The objective of such a boundary is to protect agricultural land, open spaces and the natural environment as well as to make more efficient use of existing infrastructure and public services. In addition, UGBs are used to promote infill development and redevelopment [36]. There are no common criteria for delineating UGBs, with each planning authority defining boundaries according to local needs and requirements. UGBs are not static: they can cover a planning period of 20 to 30 years, during which regular monitoring and reviews are necessary to adapt them to unforeseen changes [37].
UGBs have proved themselves to be effective tools in managing urban development and preventing the negative impact of urban sprawl [38][39][40]. Consequently, they have been widely adopted in many countries, such as the United States [41][42][43], the UK [44], Saudi Arabia [45], Canada [46], Australia [47], Korea [48] and Germany [38]. Great attention has been paid to UGBs in China in view of the country's rapid urbanisation over the last decades [37,49]. According to He et al. [50], China's Ministry of Housing and the Ministry of Land and Resources intend to expand this planning approach to cover more than 600 cities. A more recent publication even mentions extending this to over 3000 of the country's towns and cities [51]. However, their use is also the subject of controversy. Negative effects associated with UGBs are connected to "leapfrogging" as a result of increasing land prices and housing prices, where land developers and households move beyond the contained area [38]. Such spillover effects can lead to, for example, suboptimal commuting patterns or increasing public service costs. However, studies note that these negative effects are not inherent to UGBs. Rather, flexible and proper management and implementation of UGBs, with sufficient development reserves inside the contained zone and similar policies across administrative boundaries, are the biggest success factors [52,53].
Since the creation and revision of UGBs is a task that requires both expertise and human capacity, more and more studies have recently investigated how creating and drawing UGBs can be supported by automated processes. Urban growth models, such as Ideal Urban Radial Proximity [54], artificial neural networks [55], spatial-logistic regression-UGB [56], rule-based cellular automata [57], hybrid models [58], UBEM [50] and the weight-of-evidence method [49], have been used to delineate UGBs.

Automatic Delineation of Settlements
There is a wide range of methods for the automatic delineation of settlement boundaries. These may be classified in terms of the different source data used in the analysis. Such inputs may include:

1.
Remote-sensing data. Many approaches are based on classification methods using spectral information from satellite data of Landsat, Sentinel-1, Sentinel-2 or SPOT5/SPOT6 [59][60][61]. The Global Human Settlement Layer (GHSL) provides a global dataset for the analysis of built-up areas for the years 1990, 2000 and 2014. This was created by applying supervised data classification algorithms to Landsat images [62]. The High-Resolution Settlement Layer (HRSL) combines image classification and convolutional neural networks [63]. Recent approaches have attempted to combine deep learning (DeepVGI) with crowdsourcing (MapSwipe) [64]. Esch et al. made use of TanDEM-X and TerraSAR-X radar images and a fully automated processing framework to create the Global Urban Footprint (GUF) raster map [65]. However, the low-resolution datasets still show great variation between different regions and geographic settings [66]. They are also unable to identify the finer details needed to investigate urban dynamics. A less common approach is to make use of nightlight data. These data strongly correlate with economic activity and population density. Due to overglow and saturation, however, they are only partly suitable for delineating settlement boundaries [67,68].

2.
Road network data. In contrast to raster-based methods, various studies have proposed using spatial vector data from official topographic or cadastral databases or VGI platforms such as OpenStreetMap. Walter et al. investigated the density and layout of road networks to define settlement boundaries [69,70]. A mathematical model based on the clustering of vertices and the edges of a street network was applied by Zhou et al. [71], and Masucci et al. [72] evaluated density-, intersection- and street block-based approaches to delineate built-up areas using road networks. Jiang and Jia derived 'natural cities' by clustering street nodes [73].

3.
Settlement and building data. Only a few studies have attempted to utilise building footprints. Most of them utilise vector datasets from National Mapping Agencies from digital landscape models or cadaster data. For instance, Li et al. merged and generalised these footprints via Thiessen polygons and the rules of Gestalt theory until they were reduced to an outline of the settlement [74]. Chaudhry and Mackaness created settlement boundaries with a multi-stage approach directly derived from building footprints [29]. Arribas-Bel et al. created city boundaries based on a machine learning algorithm that groups buildings by means of an adopted DBSCAN algorithm [75]. Tannier and Thomas used a fractal-based method to generate boundaries from building footprints [76]. Muhs et al. extracted building data from topographic maps to delineate the extent of built-up land using digital image processing [77].
De Bellefon et al. calculated a raster from building footprint densities, from which they derived urban areas [78]. Harig et al. presented a method that used a supervised parameter optimisation along with a buffer-based method to assess the quality of the delineation [79]. This approach was developed and evaluated in terms of its potential use in the spatial sciences to monitor built-up areas at a very fine-grained level. Schumacher follows a similar approach by aggregating buildings into a so-called 'urban mask' [80].

4.
Census data. Rozenfeld et al. applied a city clustering algorithm to census data to derive city boundaries by clustering populated sites [81,82].

Institutional Framework in Germany and Urban Growth Boundaries
While there is a strong tradition in Germany of growth management at regional and local levels, the country's federal structure limits the power of the national government to regulate urban development and land use. Nonetheless, regional and urban planning is anchored in national legislation through the Federal Spatial Planning Act (Raumordnungsgesetz) and the Federal Building Act (Baugesetzbuch, BauGB). These transfer the responsibilities for spatial planning and growth management to state governments, regional planning bodies and local governments (local land use planning). At a local level, growth management is realised by 'preparatory land use plans' (§ 5 BauGB) and 'legally binding land use plans' (§ 9 BauGB) of the municipalities. These plans have to be consistent with the stipulations of higher-tiered regional plans. The BauGB also regulates designated areas. The criteria for these areas, as formalised in BauGB § 34, are largely identical to the objectives of a UGB. They direct the reuse of brownfield areas and infill development in already urbanised areas and protect valuable open space, for example, prime farmland or environmentally sensitive areas. In undesignated areas, all construction projects are thus checked in advance to see whether they fit into the existing settlement structure or contribute to an organic settlement development. The municipality can enact statutes to dispel doubts about the admissibility of the building plot [83]. Construction is permitted within the limits of these statutes. The limits described therein need to represent the decision that would have been taken in any individual construction project. However, the municipality may sporadically include areas that do not meet these criteria in order to enable settlement development or to bridge boundaries (Figure 1).
Excluded from consideration are areas for recreational use such as allotments and weekend cottages, parks, sports facilities or water bodies and other undeveloped areas in the settlement. In addition, buildings that are not part of residential areas, for example, farms, are also excluded. The areas of a settlement that meet these criteria, as well as areas subject to legally-binding land use plans, are also known as the inner zone (so-called Innenbereich) [84]. The special significance of the inner zones is their link to national sustainability goals. One example is the implementation of guidelines for sustainable urban development such as 'development inside before development outside' [22]. These generally formulated goals necessitate a wide range of research methods, including the development of indicators to measure progress towards this goal. They all require spatially comparable and homogeneous delineations of inner zones to enable data collection for the monitoring of land take [84], to study settlement dynamics [5] or to estimate infill development potentials [85].
The data situation regarding urban development in Germany is still poor. Arguments in the ongoing debate are typically based more on assumptions and speculations than on clear empirical facts [6]. The relatively weak information base provided by the official land use statistics is a key reason for this. First, the data refer to administrative units such as municipalities, counties or states. Second, different types of urban land use are grouped into a single class called 'urbanised area' ('Siedlungs- und Verkehrsfläche'). These aggregated data do not reflect the fine-grained pattern of land use change. These changes, small in their environmental, social and economic impacts, add up to large-scale changes as the result of hundreds or thousands of individual land use decisions [6]. Up-to-date and high-resolution geometries that reflect the status quo of settlement development in Germany exist only to a very limited extent. For this reason, geometries of settlement bodies (ATKIS®-Ortslage) from the nationally available Authoritative Topographic-Cartographic Information System (ATKIS® Base DLM) have been used to determine UGBs [85,86]. However, the geometry of the settlement bodies from a digital landscape model is suboptimal, being only a rough approximation of inner zones/UGBs. Figure 1 shows the key problem of the current data situation: the geometry of the settlement bodies (ATKIS®-Ortslage) includes many areas that do not belong to the UGB as determined by expert delineation (ED). These are, for example, allotments or large facilities for livestock breeding as well as agricultural land, green areas and parks. On the other hand, the ED (which is the most exact form of manually created UGB) is not available for the entire settlement, that is, parts towards the east and north are excluded. While the ED is only indicated for part of the settlement, the ATKIS®-Ortslage includes almost all built-up areas.
Additionally, the highly detailed delineation of the ED can be clearly seen. The boundary is generally located directly behind buildings. The ED is very fine-grained and selective in terms of the inclusion and exclusion of certain buildings and areas.

In comparison to UGBs as defined in other countries [41][42][43][44][45][46][47], inner zones (Innenbereiche) are more restrictive and detailed. Single buildings can influence the boundary (cf. Figure 1). However, there is no uniform procedure to define a UGB. In addition, different settlement structures make it difficult to establish uniform rules for delineation. Due to their heterogeneity and complexity, UGBs are usually created manually by experts [87][88][89][90]. Each delineation is a case-by-case decision in space at a particular time.

Aim and Research Questions
Using Germany as an example, it could be shown that UGBs are urgently needed in science and planning. Previous approaches are only suitable to a limited extent. The majority of algorithms for the delineation of settlements aim at a cartographic generalisation, or at a mostly simplified representation of the built-up area. They are therefore suitable as a methodological starting point, but not as an adequate method for mapping settlement development in a high level of detail.
Therefore, the aim of this study was to develop a new approach to automatically delineating UGBs using the morphological characteristics of settlements derived from commonly available topographic data at a very fine-grained level. Furthermore, we investigated how the quality of such delineations can be assessed in Germany as a case study. The method should provide answers to the following research questions: How can UGBs be automatically delimited on the basis of building footprints under German standards? How should the resulting geometries be validated if no comparative dataset is available in many cases? How can under- and over-delineations in the comparative datasets be identified and quantified? How does settlement structure (local, regional) affect the accuracy of the delineation procedure?

Method
Our approach presents a way of extracting settlement boundaries, in particular urban growth boundaries (UGBs). As inputs, the automated procedure uses commonly available topographical data, particularly building footprints, road networks or land use data. It is primarily adapted to the (legislative) situation in Germany. Before the method is described in detail in the following sections, Figure 2 provides a brief overview. Initially, the building footprint data are used to partition the study area. Before the buildings can be aggregated within these partitions, several initial processing steps are necessary. These include identifying the city and street blocks from the road network data and auxiliary data, calculating a building coverage threshold and filtering out buildings that do not meet the criteria for the UGB. Subsequently, the aggregation is carried out iteratively. After several refinement steps, the resulting geometries of the UGBs are saved and the next partition is calculated. Finally, the results are evaluated by comparison with expert delineations.

Partitioning
Our model is designed to delineate UGBs for larger regions or even entire states. Since the topographic data used can easily contain millions of objects, it may be necessary to split the dataset into smaller partitions for computationally intensive processes. In addition to meeting this technical necessity, partitioning can identify compact building clusters or settlement bodies and thereby generate partonomic information that is important for later aggregation and the associated threshold calculations. The partitioning must be done in such a way that settlement bodies (Siedlungskörper) are not divided. Municipal boundaries can only help to a limited extent, which is why partitioning is based solely on building data.
In a first step, a point density raster map is calculated based on the centroids of the building polygons using the point density function implemented in ArcGIS for Desktop by default [91]. A grid spacing of 100 m and a neighbourhood radius of 200 m were used to calculate density, as these were found to be reasonable for different types of settlements. In the next step, a point dataset is generated from the centres of the cells (Figure 3a); those points with a density value < 0.00001 are eliminated (white points in Figure 3a), leaving only the points with higher density values (greyish points in Figure 3a). Here the density threshold was defined by empirical testing in order to identify areas that contain no or very few buildings. Subsequently, a triangulated irregular network and a Voronoi diagram are generated from the remaining points (Figure 3b), which form clusters of square cells. Starting from the Voronoi diagram, all lines that have the same length as the defined cell size of the density map (100 m) are deleted, as are all dangling lines. The remaining lines form a mesh, which is then converted into polygons (grey lines in Figure 3c). In the case of very large and compact settlements, such as large cities, it may be necessary to divide the resulting partitions manually so that the computer is able to carry out the subsequent calculations. However, this is always accompanied by undesirable edge effects.
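The density filter at the start of this step can be illustrated with a minimal sketch. It assumes that the point density is expressed as building centroids per square metre within a circular neighbourhood; the function names and the plain-Python grid are illustrative and do not reproduce the ArcGIS workflow used in the study.

```python
import math

CELL_SIZE = 100.0      # grid spacing in metres (from the text)
RADIUS = 200.0         # neighbourhood radius in metres (from the text)
MIN_DENSITY = 0.00001  # threshold from the text; unit assumed: points per m^2


def point_density(cell_centre, centroids, radius=RADIUS):
    """Centroids within `radius` of the cell centre, divided by the circle area."""
    cx, cy = cell_centre
    count = sum(1 for x, y in centroids if math.hypot(x - cx, y - cy) <= radius)
    return count / (math.pi * radius ** 2)


def dense_cell_centres(centroids, xmin, ymin, ncols, nrows, cell=CELL_SIZE):
    """Return the centres of all grid cells whose density reaches the threshold."""
    kept = []
    for row in range(nrows):
        for col in range(ncols):
            centre = (xmin + (col + 0.5) * cell, ymin + (row + 0.5) * cell)
            # Points with a density below the threshold are eliminated.
            if point_density(centre, centroids) >= MIN_DENSITY:
                kept.append(centre)
    return kept
```

Under this unit assumption, a single building within 200 m gives 1/(π · 200²) ≈ 8.0 × 10⁻⁶, which falls below the threshold, whereas two buildings give about 1.6 × 10⁻⁵ and are kept. This would be consistent with the stated aim of removing areas that contain no or very few buildings.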

Creating Street Blocks and City Blocks
From an urban planning point of view, the city block is the smallest urban unit and an important reference point for spatial analyses [92]. We make use of two definitions of city blocks. Conzen [93] describes a city block as a parcel or as a group of adjacent parcels that are partially or completely surrounded by roads. He calls these city blocks street blocks. In the following, we will use this term for polygon shapes derived from the road network data and the partition polygons. Only areas that are completely surrounded by roads, streets and the outline of the partition polygon can form a street block. Luft and Bender state that city blocks may also be part of a predominant built-up area enclosed by topographical lines, in particular by roads or paths [94]. In our model, railway lines, prominent vegetation (e.g., forest and bog) and water bodies are considered to be topographical elements. Topographical features such as geophysical obstacles, elevations or even slopes (dams, embankments, ditches, rivers and the like) and forest edges regularly form boundaries for a UGB [83,95]. In our study we call these more structured blocks city blocks. They are created by merging the outlines or polylines of the corresponding data with the outlines of the partition polygons. Afterwards, these polylines are converted into polygons as described above. In the respective data sets, all polygons that do not contain buildings are deleted.

Calculation of Building Coverage Threshold
For morphological agglomerations, it can be useful to set a contiguity constraint or distance threshold to provide an objective, reproducible and comparable boundary [96]. On the other hand, a constant distance threshold will not precisely reflect the often variable settlement structures and spacing of neighbouring buildings [29]. To overcome this difficulty, 'local' thresholds are often calculated on the basis of the data to hand [29].
Our method uses the indicator building coverage ratio to help delineate city boundaries. This is the ratio of the summed area of building footprints to the area of a corresponding built-up area (m²/m²). It is a common indicator in studies in this field and is also known as the building footprint density [36,84]. Here street blocks are used as the spatial reference unit for calculating building coverage. As street blocks at the edge of settlements are very large (due to the transition into open space) and would thus distort the calculation of the threshold value, only blocks located in the centre of the settlement are considered. Initially, all buildings are buffered to a width of 100 m. Subsequently, overlapping buffers are dissolved and blocks that lie completely within each buffer area are selected. However, only blocks containing at least 20 buildings and building sections are selected for the calculation (Figure 4). Since there are also partitions for which no ratio can be calculated, due to their shape or because there are not enough buildings to form a settlement, a base value (global building coverage ratio) is determined at the beginning by applying the same procedure to the entire study area. Otherwise, the building coverage ratio is recalculated for each partition before each run (local building coverage ratio).
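The threshold calculation with its global fallback can be sketched as follows. The sketch assumes that the central street blocks have already been selected with the 100 m buffer and that the block values are combined as a ratio of sums. The text does not state how the individual blocks are combined, so this is an assumption, as are all function names.

```python
MIN_BUILDINGS = 20  # minimum number of buildings per block (from the text)


def coverage_ratio(blocks):
    """Summed footprint area over summed block area (m^2/m^2).

    `blocks` is a list of (block_area, [footprint_area, ...]) tuples for the
    central street blocks. Returns None if no block has enough buildings.
    """
    usable = [(area, fps) for area, fps in blocks if len(fps) >= MIN_BUILDINGS]
    if not usable:
        return None
    built = sum(sum(fps) for _, fps in usable)
    total = sum(area for area, _ in usable)
    return built / total  # assumption: ratio of sums, not mean of block ratios


def local_threshold(partition_blocks, global_ratio):
    """Local ratio of a partition, falling back to the global base value."""
    local = coverage_ratio(partition_blocks)
    return global_ratio if local is None else local
```

For example, `local_threshold([(10000.0, [100.0] * 25)], 0.2)` yields 0.25, while a partition without any qualifying block returns the global value of 0.2.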

Semantic and Spatial Filtering
Not all buildings are relevant for determining the UGB. In order to identify the relevant buildings, a three-stage filtering is applied. In a first step, all buildings are identified whose function is usually not assigned to the UGB. According to § 35 of the German Federal Building Code (BauGB), building projects that serve certain functions are allowed outside a UGB. In most cases, these are functions that are undesirable within settlements, for example, sewage treatment plants, windmills or animal husbandry facilities. In addition, buildings are found outside the UGB, which, according to their function, do not contribute to the intended development of a settlement, such as allotments or weekend houses (cf. Table 1 and Figure 1). These building functions are added to a negative building filter (cf. Table 1). However, this negative building filter cannot be viewed as a reliable criterion because some uses occur both inside and outside the UGB (cf. Figure 5). For example, while barns and stables represent a typical use of buildings in outlying areas, in Europe they are also part of the typical townscape of rural settlements [89]. Therefore, a second filtering step is necessary to avoid deleting buildings that belong to the UGB. As residential buildings, commercial buildings and public buildings are mainly located within the UGB [95] (see Table 1), this group of buildings (positive building filter) is used to create a sub-selection polygon. However, these buildings must have a certain level of significance to be considered part of a settlement [95]. All positive filter buildings are then converted into a point layer. Using the function described in Section 2.1, a density raster is generated from this layer, which is then converted into a point grid showing these density values. In the example, all points in this grid with a density value of less than 0.0003 were deleted, and the remainder buffered with a 50 m radius. 
The density value is determined empirically in such a way that individual residential buildings, or those at some distance to each other, do not contribute to the generation of the buffer polygons. Since a distance of 100 m was also chosen for the point grid here, the 50 m buffer forms an almost closed geometry. Next, the previously described buildings that are allowed in the outlying area, and that are outside these buffer polygons, are deleted. Finally, small buildings and building annexes which are not relevant to the creation of a UGB are also omitted. For this purpose, objects are filtered using threshold values. In the case at hand, a threshold of 56.8 m² was used for all detached buildings, and a threshold of 35 m² for all non-detached buildings and annexes. The thresholds were determined empirically and are taken from [97].
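The interplay of the three filter stages can be summarised in a short sketch. The building function labels are hypothetical stand-ins for the entries of Table 1, and the point-in-buffer test is assumed to have been carried out beforehand. Treating the size thresholds as inclusive lower bounds is likewise an assumption.

```python
MIN_AREA_DETACHED = 56.8  # m^2, detached buildings (from the text)
MIN_AREA_ATTACHED = 35.0  # m^2, non-detached buildings and annexes (from the text)


def keep_building(function, area_m2, detached, inside_positive_buffer,
                  negative_functions):
    """Decide whether a building stays in the dataset used for the UGB.

    `negative_functions` is a set of (hypothetical) function labels from the
    negative building filter; `inside_positive_buffer` states whether the
    building lies inside the buffer polygons derived from the positive filter.
    """
    # Stages 1 and 2: negative-filter buildings are dropped only if they lie
    # outside the dense core described by the positive building filter.
    if function in negative_functions and not inside_positive_buffer:
        return False
    # Stage 3: small buildings and annexes are omitted (inclusive bound assumed).
    limit = MIN_AREA_DETACHED if detached else MIN_AREA_ATTACHED
    return area_m2 >= limit
```

A barn of 120 m² is therefore kept when it lies inside the buffer polygons and dropped when it lies outside them, which mirrors the distinction between rural townscapes and outlying areas made above.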

Identification of Densely Developed Blocks
In this step, all those blocks are identified that are so densely built up that they are assumed to be entirely within the UGB. To this end, the building coverage ratio is determined for all city blocks, and the blocks with a building coverage ratio greater than 18 percent are preselected for refinement. Our own investigations with the help of expert delineations from Brandenburg (see Section 3) have shown that a whole block that meets this classification criterion lies within a UGB with a probability of 95 percent. Less dense blocks are passed to the next processing step.

Minimum Spanning Tree Based Aggregation
Based on the buildings in the remaining city blocks, a Delaunay triangulation is performed on the centroids of the building polygons. The edges of the graph are weighted by the distance from building edge to building edge. Afterwards, the weighted graph is used to create a minimum spanning tree (MST). The Kruskal algorithm (part of the Python package NetworkX) is used to calculate the MST [99,100]. Subsequently, all edges of the graph that are crossed by roads longer than 50 m are deleted. This creates smaller trees. Aggregation is done by grouping several buildings together using an edge-weighted minimum bounding rectangle (MBR) along the branches of the tree. The self-developed algorithm for the creation of the MBR is described in Algorithm 1. The resulting rectangles are oriented along the dominant edges of the building group (cf. Figure 6b). This special form of a minimum bounding rectangle geometry was chosen because UGBs in Germany generally end behind the last building of continuous developments [95]. In fact, the UGB also includes the associated open space, frequently used as a garden or courtyard [83]. The resulting area of buildings and associated open spaces often corresponds to the land parcel as designated in the local cadastre. In Germany, these are predominantly rectangular in shape, running perpendicularly or in parallel to the street or buildings. Although the use of parcel boundaries is generally considered to be an ambiguous criterion for delineation of a UGB [83,95], they often represent land use boundaries. In the case of uneven development on the fringe of a town or village, building groups or individual buildings have to be separately delimited [83]. Both the MBR algorithm and the MST-based aggregation algorithm described below are based on these principles.
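The construction of the tree and the subsequent cut at roads can be sketched without external dependencies. The following pure-Python Kruskal implementation is a stand-in for the NetworkX routine named above, not the code used in the study. The edge list is assumed to come from the Delaunay triangulation, and the dictionary of crossing road lengths is a hypothetical precomputed input.

```python
def kruskal_mst(n_nodes, edges):
    """Minimum spanning tree (or forest) from (weight, u, v) edges."""
    parent = list(range(n_nodes))

    def find(i):
        # Union-find lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    tree = []
    for weight, u, v in sorted(edges):
        root_u, root_v = find(u), find(v)
        if root_u != root_v:  # the edge joins two separate components
            parent[root_u] = root_v
            tree.append((weight, u, v))
    return tree


def cut_at_roads(tree, crossed_road_length, max_road=50.0):
    """Delete tree edges that are crossed by a road longer than `max_road` m.

    `crossed_road_length` maps (u, v) to the length of the longest road
    crossing that edge (hypothetical input; edges not listed are uncrossed).
    """
    return [(w, u, v) for w, u, v in tree
            if crossed_road_length.get((u, v), 0.0) <= max_road]
```

Removing edges from the tree leaves a forest, that is, the smaller trees mentioned above, each of which is aggregated separately.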
The aim of the self-developed aggregation algorithm (cf. Algorithm 2) is to combine buildings into groups if the ratio of the sum of the floor areas of these buildings to the area of the MBR is greater than the local building coverage ratio (threshold). In a first step, the MST is transformed into a list of node pairs with length attributes. The list is then sorted by length in ascending order. Starting with the first element in the list, that is, the pair of buildings with the shortest distance, the algorithm determines whether one of the two nodes is already a member of a group (group members are noted on a group list). If this is the case, the two new nodes are temporarily added to the relevant group. Then all buildings are delimited according to the nodes of the group using the MBR algorithm and checked against the threshold value. If the building coverage ratio is above the threshold value of the local building coverage ratio, the group is saved with the new members in the group list. Otherwise, the old group is retained. If neither of the two nodes is a member of a group in the group list or the additional node could not be added to a group, a new group is created. Here, the MBR is also formed by the two members and the building coverage ratio is determined. If it is above the threshold value, the group is stored in the list of groups. This procedure is carried out for all elements in the list, that is, for all edges or pairs of buildings. The end result of the algorithm is a list containing sublists of building groups. MBRs are then formed and output for each group.
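The grouping logic described in this paragraph can be sketched as follows. Two simplifications are assumptions and not part of the original method: buildings are given as axis-aligned rectangles (xmin, ymin, xmax, ymax), and an axis-aligned bounding box replaces the oriented, edge-weighted MBR of Algorithm 1, which is not reproduced here. Only the first group containing one of the two nodes is tried, which is also an assumption.

```python
def bbox_area(rects):
    """Axis-aligned bounding box area; a simple stand-in for the oriented MBR."""
    xmin = min(r[0] for r in rects)
    ymin = min(r[1] for r in rects)
    xmax = max(r[2] for r in rects)
    ymax = max(r[3] for r in rects)
    return (xmax - xmin) * (ymax - ymin)


def coverage(ids, footprints):
    """Building coverage ratio of a group: summed footprints over box area."""
    rects = [footprints[i] for i in ids]
    built = sum((r[2] - r[0]) * (r[3] - r[1]) for r in rects)
    return built / bbox_area(rects)


def aggregate(mst_edges, footprints, threshold):
    """Group buildings along MST edges, shortest edge first."""
    groups = []
    for _, u, v in sorted(mst_edges):
        merged = False
        for idx, group in enumerate(groups):
            if u in group or v in group:
                candidate = group | {u, v}  # temporary extension of the group
                if coverage(candidate, footprints) > threshold:
                    groups[idx] = candidate  # keep the extended group
                    merged = True
                break  # assumption: only the first matching group is tried
        if not merged:
            pair = {u, v}  # try the two buildings as a new group
            if coverage(pair, footprints) > threshold:
                groups.append(pair)
    return groups
```

With three buildings standing 2 m apart and a fourth one 166 m away, the first three are merged into one group, while the distant building lowers the coverage ratio below the threshold and remains ungrouped.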

Refinement
In the refinement process (cf. Figure 7), the MBR of large individual buildings (building footprint > 300 m²) and the resulting rectangles are snapped to the road network and merged with the densely developed block polygons. Gaps and holes are then closed. Finally, areas too small to form a UGB are deleted.
Snapping is performed by creating polygons between each rectangle and neighbouring roads. Since, depending on the situation, a rectangle may have to be connected to several roads, sometimes in different directions, a flexible procedure is needed. For this purpose, the shortest possible polylines are formed from the corners of the rectangle to the road network. These lines are grouped according to their orientation.
Next, the average length for each group is calculated. If the average length of a group is more than 1.5 times the average length of the group with the shortest lines, the lines within this group are deleted. In our case study, the multiplier was empirically determined so as to ensure that when a group of buildings is (more or less) equidistant from several roads, the resulting polylines are not deleted. Polygons are generated from the remaining lines, the edges of the rectangle and the road section. All polygons that are more than five times larger than the rectangle itself or larger than 4900 m² (see below, gap threshold) are deleted. This operation is performed for all rectangles. Finally, rectangles and remaining polygons are dissolved to single-part polygons.
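The two filters of the snapping step can be sketched as follows. The sketch assumes that the connecting lines have already been grouped by orientation and are represented only by their lengths in metres. The function names, the dictionary layout and the treatment of values exactly at a limit are assumptions made for illustration.

```python
from statistics import mean


def filter_snap_lines(lengths_by_orientation, multiplier=1.5):
    """Keep only orientation groups whose mean line length is at most
    `multiplier` times the mean length of the shortest group."""
    means = {key: mean(lengths)
             for key, lengths in lengths_by_orientation.items() if lengths}
    shortest = min(means.values())
    return {key: lengths_by_orientation[key]
            for key, value in means.items() if value <= multiplier * shortest}


def keep_snap_polygon(polygon_area, rectangle_area,
                      gap_threshold=4900.0, factor=5.0):
    """A snapping polygon is kept unless it exceeds five times the rectangle
    area or the gap threshold (both in square metres)."""
    return polygon_area <= factor * rectangle_area and polygon_area <= gap_threshold


# Hypothetical line lengths (m) from the rectangle corners to nearby roads.
lines = {"north": [10, 12], "east": [14, 15], "south": [40, 44]}
kept = filter_snap_lines(lines)
```

Here the group pointing south (mean 42 m) is dropped because it exceeds 1.5 times the shortest mean (11 m), whereas the nearly equidistant roads to the north and east are both retained.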
The general consensus is that an undeveloped area between existing buildings on the edge of the settlement is within the UGB if that area is no more than two plots wide [95]. After calculating all the boundaries in the study area, they are checked again to close narrow gaps between them. If the area cannot be allocated to the UGB, it forms a gap. A gap can be a notch in a polygon or the area between two polygons. In contrast to a hole, it is not completely enclosed by a geometry. Double buffering is used to detect gaps: in a first step, the outline of the dissolved polygons is buffered by 15 m. Second, the outline of the resulting polygons is again buffered by 15 m. By subtracting these buffered lines from the initially buffered polygons, new polygons are created that represent gaps. Gap polygons smaller than 200 m² are deleted, since they mostly occur in corners as a side effect of buffering. According to Bukies et al. [89], an undeveloped area of two to three building plots, that is, approximately 50 to 60 m, will generally be assumed to be inside the UGB; even an open space of up to 90 m will not necessarily interrupt the built-up area [83]. In order to avoid favouring the dispersed settlement structure over compact and space-saving settlement forms, gaps between buildings were added to the UGB up to an extension of 70 m, that is, up to 4900 m² in the case of a square [89]. Gap polygons smaller than this so-called gap threshold are retained if at least 70% of their perimeter borders the dissolved polygon, since a land parcel is normally assigned to a UGB if it is surrounded by buildings on at least three sides [98]. However, numerous undeveloped areas between buildings exist within the UGB, which cannot be classified as built-up area due to their size. Especially in the case of undeveloped blocks, parks, green space or meadows, a delimitation towards the inside might be appropriate.
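The decision rule for gap polygons can be written as a small predicate. This sketch covers only the thresholds named above, not the double-buffering geometry itself. It assumes that the area, the perimeter and the length of perimeter shared with the dissolved polygon have already been measured. The function name and the handling of values exactly at a threshold are assumptions.

```python
MIN_GAP_AREA = 200.0          # m², smaller polygons are side effects of buffering
GAP_THRESHOLD = 70.0 * 70.0   # m², a square gap with an extension of 70 m
MIN_SHARED_PERIMETER = 0.70   # share of the perimeter bordering the dissolved polygon


def gap_is_added(area, perimeter, shared_perimeter):
    """Return True if a gap polygon is added to the UGB."""
    if area < MIN_GAP_AREA:
        return False  # buffering artefact, deleted
    if area >= GAP_THRESHOLD:
        return False  # too large to count as a gap within the settlement
    return shared_perimeter / perimeter >= MIN_SHARED_PERIMETER
```

For example, a 30 m x 40 m gap (area 1200 m², perimeter 140 m) that borders the dissolved polygon on three sides (105 m, or 75% of its perimeter) would be added, whereas the same gap bordered on only two sides (70 m, or 50%) would not.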
If these undeveloped areas form holes that are larger than 1 ha, they are deducted from the settlement area [89]. The same applies to splinter areas. According to Long et al., polygons with low compactness and a small area (<1 ha) should be eliminated as they are not feasible for urban development [57]. Bukies et al. also stated that regardless of the qualitative aspect of contiguity, the number of existing residential buildings must have a certain weight [89]. In terms of the different settlement structures, while there is no generally accepted minimum, usually about 20 to 25 residential buildings are considered sufficient to constitute an independent UGB [89]. Accordingly, splinter areas smaller than 1 ha and with fewer than 20 buildings were omitted in our case.
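The final clean-up rules for holes and splinter areas reduce to two threshold checks, sketched below. The record layout (`area` in m², `buildings` as a count) and the function names are hypothetical. The combination of both splinter criteria with a logical AND follows the wording above.

```python
HECTARE = 10_000.0  # m²


def hole_is_deducted(hole_area):
    """Holes larger than 1 ha are deducted from the settlement area."""
    return hole_area > HECTARE


def splinter_is_omitted(area, n_buildings, min_buildings=20):
    """Splinter areas below 1 ha with fewer than 20 buildings are omitted."""
    return area < HECTARE and n_buildings < min_buildings


# Hypothetical candidate polygons after gap closing.
candidates = [
    {"id": "a", "area": 6_500.0, "buildings": 8},    # small and sparse
    {"id": "b", "area": 7_200.0, "buildings": 24},   # small, but enough buildings
    {"id": "c", "area": 35_000.0, "buildings": 12},  # large enough in area
]
retained = [c["id"] for c in candidates
            if not splinter_is_omitted(c["area"], c["buildings"])]
```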

Exemplary Comparison of Expert Delineations and Urban Growth Boundaries
A simple way to compare two boundaries is to consider their intersection and the measurement of areas between them (so-called distortion polygons) [101,102]. For this purpose, areas of the UGBs were overlaid with the comparative geometries to give intersection areas (Equation (1)).
Here, it is necessary to differentiate between area-positive and area-negative deviations. Area-positive deviations are present in the generated UGBs, but not in the EDs. In contrast, area-negative deviations are not defined in the UGBs, but are present in the EDs. In order to analyse the deviations more precisely, we defined eight area-positive (cf. Figure 8) and eight area-negative types of deviations (cf. Figure 9). Each area was assigned to only one type of deviation.
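The overlay logic can be illustrated with a simplified raster sketch in which each area is a set of unit grid cells rather than a vector polygon. This is a stand-in for the polygon overlay used in the study and not a reproduction of Equation (1). Relating each share to the union of both geometries is an assumption, and all names are hypothetical.

```python
def cells(xmin, ymin, xmax, ymax):
    """Set of unit grid cells covered by an axis-aligned rectangle."""
    return {(x, y) for x in range(xmin, xmax) for y in range(ymin, ymax)}


def compare(ugb, ed):
    """Shares of intersection, area-positive and area-negative deviations."""
    total = len(ugb | ed)  # assumption: shares refer to the union of UGB and ED
    return {
        "intersection": len(ugb & ed) / total,
        "area_positive": len(ugb - ed) / total,  # in the UGB, not in the ED
        "area_negative": len(ed - ugb) / total,  # in the ED, not in the UGB
    }


# Hypothetical generated boundary and expert delineation, shifted by two cells.
ugb = cells(0, 0, 8, 10)
ed = cells(2, 0, 10, 10)
shares = compare(ugb, ed)
```

In this example 60% of the combined area is shared, while 20% each falls into the area-positive and area-negative deviations.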
Each class of area-positive deviations corresponds to an area-negative class. The classification was based on area size, building coverage ratio and underlying land use (Table 2). The classes were defined in such a way that the research questions could be answered by the evaluation. Building coverage thresholds (BC) are based on values taken from the literature [36,103]: empty patches (BC < 3%), low BC patches (BC 3% to <15%), areas with high BC (BC > 15%). Finally, the area sum of the types is set in relation to the total area. The criteria for the detection of patches with mostly industrial or commercial areas (A + IndComm, A − IndComm) are an area size of at least 1 ha, a building coverage ratio >15% and a share of at least 50% of the patch area designated as an industrial or commercial area in the ATKIS ® Base DLM. Residential areas that are not included in the ED are referred to as A + Resid; those that are not included in the UGB as A − Resid. In order to detect these areas, patches classified at least 50% as residential areas, combined use areas or areas with specific functional characteristics are preselected. Those with a minimum size of 1 ha and a building coverage ratio of >15% are assigned to the class. Areas with individual buildings or groups of buildings at the fringe of a settlement represent another category of classes: if the building coverage ratio is >15%, they are allocated to class A + BdgEdge or A − BdgEdge. If the building coverage ratio of the patch is >3% and ≤15%, these areas are assigned to class A + LowDensBdgGrp or A − LowDensBdgGrp. Areas with a building coverage ratio of ≤3% are regarded as undeveloped patches and assigned to class A + EmptyArea or A − EmptyArea. Patches with a degree of coverage <3% that are completely enclosed by the UGB, thus representing holes, are recorded as class A + Holes or A − Holes.
Classes A + LargeEmptyArea and A − LargeEmptyArea comprise large patches > 1 ha which do not contain any buildings or only individual buildings (building coverage ratio <3%). All other areas without buildings at the edge of settlements are classified as class A + EmptyArea or A − EmptyArea. If a settlement body is completely missing in the UGB, the area is assigned to type A − SettBody. If there are additional settlement bodies, these are classified as A + SettBody. Since many sliver polygons are created during the process of intersection, which would falsify the results, patches < 250 m² are deleted. This value has also been used in other studies as a threshold for areas that can be safely disregarded [85,90].
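The class definitions can be condensed into a decision function such as the following sketch. The order in which the rules are tested, the handling of values exactly at 3% and 15%, and all parameter names are assumptions. The authoritative definitions are those in Table 2.

```python
HECTARE = 10_000.0  # m²
SLIVER_LIMIT = 250.0  # m², smaller patches are discarded as slivers


def classify_patch(sign, area, bc, share_indcomm=0.0, share_resid=0.0,
                   enclosed=False, whole_settlement=False):
    """Assign a deviation patch to one class.

    sign: '+' for area-positive, '-' for area-negative deviations
    area: patch area in m²
    bc: building coverage ratio as a fraction (0.15 means 15%)
    share_indcomm / share_resid: land-use shares of the patch as fractions
    enclosed: patch is completely enclosed (a hole)
    whole_settlement: patch is an entire missing or additional settlement body
    """
    if area < SLIVER_LIMIT:
        return None  # sliver polygon, left unclassified
    if whole_settlement:
        name = "SettBody"
    elif bc > 0.15:
        if area >= HECTARE and share_indcomm >= 0.5:
            name = "IndComm"
        elif area >= HECTARE and share_resid >= 0.5:
            name = "Resid"
        else:
            name = "BdgEdge"
    elif bc > 0.03:
        name = "LowDensBdgGrp"
    elif enclosed:
        name = "Holes"
    elif area > HECTARE:
        name = "LargeEmptyArea"
    else:
        name = "EmptyArea"
    return f"A{sign}{name}"
```

For instance, `classify_patch("+", 20_000.0, 0.30, share_indcomm=0.7)` yields `A+IndComm`, while a 1.5 ha area-negative patch with 1% building coverage is assigned to `A-LargeEmptyArea`.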

Study Areas and Data
In Germany, there are large regional differences in settlement density, settlement structure and settlement dynamics [5][6][7]. This has a direct influence on how UGBs are delineated. To cover the corresponding development scenarios we applied our method to three regions in Germany for which suitable ED data were available: the State of Brandenburg, the Hanover region and Frankfurt/Main.
Brandenburg has an area of 29,654 km² and a population of 2.5 million. With a population density of 85 inhabitants/km², it is the second most sparsely populated federal state. The state is divided into 413 municipalities [104]. With 1.1 million people and an area of 2299 km², the Hanover region is the second smallest of the three study areas. The population density is 492 inhabitants/km², which is around the average value for Germany's regions [105]. The region consists of 21 municipalities. The city of Frankfurt/Main, with its well-known skyscraper skyline, is one of Europe's most important financial hubs. It spans an area of almost 248 km² and has a population of 750,000. The population density is 3008 inhabitants/km² [106].
The study areas refer only to sub-areas within the administrative boundaries of these regions or city (Figure 10). They differ considerably in their urbanity and spatial extent (Table 3). The study area in Brandenburg covers 2887 km², which is about 10% of the state's territory. It is divided into three largely rural sub-areas, which are located in the east, southwest and west of the state. The average area of a settlement is 9 ha. A total of 618 individual settlements or parts of settlements were investigated in Brandenburg. The study areas in the Hanover region are located in a ring around the city of Hanover and have a total area of 47 km². A total of 164 settlements with an average area of 20 ha are of interest here. The study area of Frankfurt/Main has a spatial extent of 144 km², covering 58% of the city. It consists of 48 individual areas with an average size of 118 ha.
The settlements studied were selected according to spatial criteria (e.g., location within the study area) and the availability of ED data, since an ED was not available for every settlement within a region. The specific configuration of the selected areas depended on the existing settlement structures and the partitioning (see Section 2.1).
In this study, we used the nationwide data product Official Building Polygons (HU-DE) of Germany from 2011 to 2019. The HU-DE contains geo-referenced polygons representing building footprints derived from the country's official digital real estate map. The polygons do not contain any attributive information apart from an Official Municipality Key [107]. The HU-DE data set is updated every year. To enrich the data with further attributes, we used 3D building data from 2016 that includes object height and identifier, indications of quality and building function [108]. Finally, we utilised data on street and road networks as well as topographic data, for example, on forests, heath, bog and marshes from 2011 to 2017 as auxiliary data of the Authoritative Topographic-Cartographic Information System (ATKIS ® Base DLM). These datasets are available for the whole of Germany [109]. Table 4 provides an overview of the data used and their sources for the different study areas.
Table 4. Overview of input data.
As described above, useful and up-to-date UGB data are currently limited. To evaluate our results, we thus utilised comparative data created by experts. These expert delineations (EDs) were acquired from regional planning authorities in each of the three regions. In the case of the state of Brandenburg, the ED is a collection of statutes according to the Federal Building Code § 34. They were enacted by the municipalities in the period from 2009 to 2019.
For Frankfurt/Main, the boundaries were created manually in 2011. Here, experts integrated diverse sources of data: zoning of residential and mixed building areas stipulated in regional land use plan, the urban precincts as shown in aerial photographs and information on development areas. These boundaries were originally defined for the detection of infill development potentials [90].
Similarly, the comparative data for the Hanover region was created manually in 2005 from an automated land survey map and orthophotos as part of the Regional Planning Programme to record all small settlements with fewer than 2000 inhabitants [89]. Therefore, the data cover only parts of the region.
EDs are usually created in such a way that they cover the entire settlement area of the respective settlement. However, none of these EDs is complete in the sense of a UGB. The data for Hanover and Frankfurt considered only residential areas and not commercial and industrial zones; the focus here was on the detection of infill developments in these residential areas. For Brandenburg, the statutes do not reflect the complete settlement area as they do not cover the area of a legal land-use plan and individual municipalities are not obliged to pass any statutes. The EDs were provided by the respective authorities (see Table 4).

Results
The method described above was applied to the three study areas Brandenburg, Hanover region and Frankfurt/Main. The delineation results were compared with expert delineations (EDs). Table 5 shows the classified deviations, the number of features of a geometry, the corresponding area and the proportion of this area to the total.
The proportion of the intersection area is the same for Hanover and Frankfurt, namely 75.8%. For Brandenburg the value is 61.0%. Area-positive deviations have a larger total proportion than area-negative deviations. Here the values for Hanover (18.0%, 6.0%) and Frankfurt (17.2%, 6.8%) are also similar. The values for Brandenburg are 24.1% and 14.7%. Not quite half of all the areas are smaller than 250 m² (unclassified areas). A closer look at the individual deviation classes shows larger proportions for industrial and commercial areas, residential areas, areas with a low building coverage ratio, large empty areas and small empty areas. The proportions of the area-positive industrial and commercial areas (A + IndComm) and area-positive residential areas (A + Resid) are 9.5% and 5.0%, respectively, for Frankfurt and 4.7% and 0.7% for Hanover. For the highly rural areas in Brandenburg, the corresponding values are 4.0% and 3.5%. Industrial and commercial areas are typically allocated to the UGB if they also include at least some social buildings, offices or administration buildings [95]. However, these are frequently excluded from the EDs, which are often created only for residential areas [89] (cf. Figure 8 A + IndComm). Therefore, these areas are not to be regarded as errors in delineating a UGB. The same applies to the residential areas (A + Resid) (cf. Figure 8 A + Resid). In such cases, it is likely that a settlement development which was not covered by the respective statutes took place between the time the ED was drawn up and the date of recording the building footprints. Additional large empty areas (A + LargeEmptyArea) play only a minor role in all study areas. The area-positive large areas represent areas on the edge of settlements that could have been integrated into the UGB due to their building coverage ratio, but were deliberately excluded in the ED. Large undeveloped areas (A − LargeEmptyArea) are only found in Brandenburg (3.1%) and Frankfurt (3.6%).
These are an expression of a politically desired settlement development, as large, undeveloped areas have been enclosed here in the ED (cf. Figures 1 and 9). Since these areas are located on the edge of the settlement and are undeveloped, they are not to be assigned to the settlement in the narrower sense [95]. Estimating the accuracy of the delineation is difficult because correctly and incorrectly delineated areas are included in all classes. The accuracy can therefore only be approximated. The accuracy of the method for the individual study area is determined by the sum of the intersections, the industrial and commercial areas (A + IndComm), the residential areas (A + Resid) and the large undeveloped areas on the settlement fringe (A − LargeEmptyArea). Calculated in this way, the overall accuracy for Brandenburg is 74.6%, for the Hanover region 81.6% and for Frankfurt/Main 93.9% (cf. Table 6). With regard to the settlement structure, the following conclusion can be drawn. The biggest deviations exist for groups of buildings in area-positive patches with low building coverage (A + LowDensBdgGrp) with 10.8% in Brandenburg. In relative terms, this also applies to Hanover and Frankfurt, yet at only 5.2% and 1.8% respectively. The proportion of area-negative low building coverage is only significant in Brandenburg at 5.2%. On the one hand, this might be due to the relatively broad definition of these classes. On the other hand, these patches are difficult to assign because of the dispersed, sprawling settlement structures that cause their low building coverage. These classes occur more often in regions with disparate settlement forms. Thus, these errors are rather frequent in Brandenburg due to the many small and heterogeneous settlements in this region. The percentage of peripheral areas is higher and thus also the number of special cases that cannot be reproduced by the algorithm. The situation for the settlements in the Hanover region is similar.
In contrast, the UGB of Frankfurt, which is much more compact overall, shows only relatively small deviations apart from the deviation caused by the lack of industrial and commercial areas. The results should therefore also be considered in light of the number, average settlement size and total length of the edges of the respective settlement bodies studied (see Table 3), which to a certain extent relativise the differences between the individual study areas. Small empty patches (A + EmptyArea , A − EmptyArea ) play a major role in relation to their frequency. However, their share of the total area is only 2.7% and 3.9% in Brandenburg, 3.1% and 3.8% in the Hanover region and 0.4% and 1.1% in Frankfurt/Main. These values, together with the classes not addressed thus far, are an expression of the inherent fuzziness in the procedure. These represent areas of transition from compact development to increasingly sprawling and dispersed urban development. In summary, it can be stated that UGBs can be delineated at fine-grained level with the presented method. The evaluation also allows statements to be made about where the delimitation is reliable and where it is associated with greater uncertainty. The highest accuracy was observed in the Frankfurt/Main region with 93.9%. An accuracy of 81.6% was achieved for the Hanover region and 74.6% for the rural settlements in Brandenburg.
Overall, it is difficult to realise a clear delineation of UGBs in fragmented areas, where only a few buildings with larger spaces in between exist.
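The accuracy approximation used in this section is a plain sum of four area shares. The sketch below reproduces it for Frankfurt/Main from the percentages quoted in the text. The function and parameter names are hypothetical, and the other study areas would require the unrounded values of Table 5.

```python
def approximate_accuracy(intersection, a_plus_indcomm, a_plus_resid,
                         a_minus_large_empty):
    """Sum of the shares (in %) that are not regarded as delineation errors."""
    total = intersection + a_plus_indcomm + a_plus_resid + a_minus_large_empty
    return round(total, 1)


# Frankfurt/Main: intersection 75.8%, A+IndComm 9.5%, A+Resid 5.0%,
# A-LargeEmptyArea 3.6% (values as quoted in the text).
frankfurt = approximate_accuracy(75.8, 9.5, 5.0, 3.6)
```

The result of 93.9% matches the overall accuracy reported for Frankfurt/Main.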

Discussion
In the above we have presented a new method to automatically generate urban growth boundaries (UGBs) for large areas using commonly available geodata. Compared to previous approaches to delineating city boundaries based on building data [29,[74][75][76], a higher resolution is achieved. In combination with the adaptation to Germany, this enables detailed analyses of land use change processes. The method can be used to generate UGBs for rural settlements (Figure 11a-c) as well as large cities (Figure 11d), thereby providing a source of previously unavailable data for future studies and practical applications. Furthermore, the generated boundaries enable the evaluation of existing geometries such as expert delineations (EDs) and ATKIS ® -Ortslage. In general, we can say that this method delineates compact settlement areas, such as those found in Frankfurt, more precisely than the highly heterogeneous and dispersed settlements in the rural areas of Brandenburg and Hanover. In particular, it is difficult to delineate those parts of the settlement with low building densities or gaps in the development.
These are the parts of the boundary where the algorithm cannot replace delineation by experts, who are able to place the boundaries within an overall context and respond much better to the local characteristics in interpreting the available data. This is particularly noticeable in the transition from fragmented development which is assigned to the UGB to fragmented development which is excluded. The same applies to small undeveloped areas. However, these are the areas for which even experts have difficulty in correctly delineating the settlement [95].
Since most EDs do not cover all relevant areas (cf. Figure 1), it is not possible to determine the accuracy of an automatically generated UGB exactly. EDs are created with diverse objectives in mind and at specific dates, and they may include undeveloped areas or exclude built-up areas. For example, EDs designated by statute never cover areas of an existing legal land use plan. The algorithm equalises these deviations by always applying the same objective standards for a delimitation. Therefore, not every deviation is an error, even though real errors may of course occur. For this reason, the results require a deeper consideration and more detailed evaluation. Up to now, the differences between large cities, small towns and rural settlements have only been taken into account by adjusting the building coverage ratio. The inclusion of the respective building heights, footprint area, function and the grouping of buildings and parts of buildings into functional groups could improve results.
The previous study by Harig et al. [79], which dealt with the automated delineation of built-up areas based on topographic data in Germany, used given EDs to calibrate the delineation algorithm. However, this presupposes the existence of such geometries and severely limits the applicability of the approach.
With regard to an international application, the requirements and expectations for UGBs are just as different as the settlement dynamics and the legal and planning framework conditions in the respective countries. While UGBs in the international context also serve to control new land take, the focus in Germany is more on densification and the use of infill development potentials. Other instruments are available for the declaration of new development areas. Since the approach was primarily adapted to conditions in Germany, which are presumably also present in Central Europe, a transfer to areas outside of Europe is only possible to a limited extent.
While there is still room to optimise the algorithm for the delineation of building groups and more complex street constellations, even in its current form the method already offers a wide range of applications, for example in assessing regional or interregional in-fill development potentials [85], in tracking densification processes [119] and in calculating related indicators such as the ratio of inner/outer zone development [84]. Further, the UGBs can play an important role in the development and testing of new planning instruments such as building certificates [87]. Clearly, to meet the aim of promoting in-fill development over development on the urban fringes and to boost the development of integrated urban areas, it is first essential to define a geometry of the UGB [120].
Hitherto, the status of and changes in land use have generally been assessed by means of zonal statistical data and administrative boundaries (municipalities, counties, etc.). The advantage of this is the easy coupling of the corresponding data with socio-economic data [5]. UGBs provide a database to allow the analysis of essential structural characteristics of urban use patterns and their changes, independent of administrative boundaries. The advantage here is that the intersection with specialised data is easier, and analyses are possible below the aggregation levels set by the administrative boundaries. In addition, small-scale developments can be made visible and measurable for local administrations, which would otherwise remain hidden due to strong aggregation of the available data [6]. The successful mobilisation of infill development potentials also largely depends on the communication of urban development goals to citizens or stakeholders. The visualisation of the individual local situation is a supporting factor in all spatial planning. Frequently, geographical and topographical features cannot be adequately described verbally. Therefore, a detailed and accurate representation of UGBs and inner-city potentials is an important basis for planning work [88]. Furthermore, the presented method can be used for the historical reconstruction of settlement development. There already exist some methods for automatic building extraction [121] and building block extraction [77] based on topographic maps. Through the automatic delineation of settlement boundaries, dynamic urban factors such as densification, growth/sprawl or shrinkage can be examined on different scales and used as input data for simulations. Finally, the presented method for automated delineation is suitable for managing and limiting the designation of building land by means of quotas, as is already the case in regional plans drawn up in the states of Hesse and Berlin-Brandenburg [122,123]. 
A prototype of this algorithm is already being used for this purpose by the Brandenburg State Office of Construction and Transport.

Conclusions
The determination of the extent of a city is important for systematic measurements in comparative studies, in governance and urban planning, to measure change over time, in environmental impact assessment as well as in mapping and land classification. Because of the inherent heterogeneity and complexity of urban space, such delineations are usually created manually by experts. Therefore, every resulting dataset is linked to a specific area, investigation period and dedicated use. In order to obtain up-to-date, homogeneous, meaningful and cost-efficient data as well as to avoid the use of manually or semi-automatically generated delineations, this study presents a method to produce UGBs using building footprints and generally available topographic data as inputs. It was applied to areas in Frankfurt/Main, the Hanover region and rural Brandenburg while taking account of the demands of Germany's planning and legal framework for spatial development. In order to make an evaluation of the UGBs possible, we developed an automated procedure on the basis of expert delineations. This takes into account the weaknesses of existing UGB data by forming classes according to building coverage, size and use. In this way, over- and under-delineations can be distinguished and analysed. The results show that the method can generally compensate for the weaknesses of currently used or available UGB data and enable a spatially inclusive and comprehensive delineation of UGBs. Better results are achieved for compact urban settlements such as Frankfurt/Main than for rural, highly dispersed settlements such as those in Brandenburg. In particular, settlements with complex structures or with low building coverage are rather difficult to delineate precisely.
Therefore, future research should focus on improving the delineation of low-density areas at the edge of the settlement, as these are currently the areas with the greatest uncertainty, but also with great potential for sustainable settlement development. The approach is generally transferable internationally as long as input data of sufficient availability and quality are given.
The approach can be an interesting tool for the spatial sciences but also for planners and administrations. One possible area of application is the analysis of urban land use patterns and their changes, independent of administrative boundaries, at a spatial resolution never achieved before. The resulting geometry can also be utilised for the visualisation of city boundaries to support planning. For example, it is suitable for controlling and limiting the designation of building land by means of quotas, and it helps to assess regional or interregional in-fill development potentials. Based on this, indicators such as the ratio of inner/outer zone development can be calculated. Furthermore, the results can serve as basic data for the development and testing of new planning instruments such as building certificates. The tool should therefore be further tested and applied in science and practice.
Author Contributions: Oliver Harig developed the method, implemented the procedure, analysed the data and wrote the initial manuscript. Robert Hecht served as the subject advisor for this work, discussed the method, revised the structure of the manuscript. Serving as doctoral advisors, Dirk Burghardt and Gotthard Meinel made useful contributions at all stages, including discussing the results and proofreading and substantially improving the manuscript. All authors have read and agreed to the published version of the manuscript.
Funding: This research received no external funding and is part of a self-financed dissertation. The costs of the publication are covered by the Leibniz Institute of Ecological Urban and Regional Development.
Brandenburg State Office of Construction and Transportation and Elend from the Regional Authority FrankfurtRheinMain for providing these data. Further, the authors would like to thank colleagues at IOER for their support. We also thank the editors and the anonymous reviewers for their valuable comments and suggestions.

Conflicts of Interest:
The authors declare no conflict of interest.

Abbreviations
The following abbreviations are used in this manuscript: