The spatial distribution of dwelling units and population is highly relevant for addressing general questions of resource management and spatial planning [1
]. Currently, high-resolution population distributions are especially gaining importance for risk assessment and disaster management with regard to technical or natural hazards [4
]. Detailed information on building stocks and inhabitants is also becoming important for research on energy consumption and demand modelling [5
]. Beyond current spatial distributions, those of the past are also relevant. Because they provide information on dynamics, they open pathways for the retrospective analysis of long-term changes in land use and settlement structures [7
]. This information is of interest not only for analysing the causes and trajectories of land use policies but also for modelling the distribution of building ages within the current urban fabric [8
]. Crols et al. [3
] also highlight the relevance of deducing historical information for validating urban dynamic models, such as in the field of urban and regional economics [9
Given these needs for historical geographic information, constraints arise when it comes to data sources. For most countries, national census data provide homogeneous figures on population and housing with full coverage at regular temporal intervals. However, the main drawbacks of these data are the low spatial resolution and the rather long (usually decadal) update cycles. Furthermore, demographic and socio-economic statistical data are often aggregated to large spatial units, for example at the ward, municipal or state level, owing to privacy or administrative concerns [10
]. A further challenge, particularly in multi-temporal analysis, is the limited comparability over time due to changing administrative units [3
]. At the local level, some major cities hold a wide range of socio-economic data at the city block level, but these data are very heterogeneous with respect to structure and quality. Moreover, such data are often not available for smaller towns and rural settlements, so that spatial coverage remains incomplete. Hence, in many countries, official data remain limited, which prevents a fine-grained analysis of settlement development.
To overcome these limitations, research groups in remote sensing and cartography/GIScience have been working on the development of methods for mapping populations for years [11
]. As a result, techniques of dasymetric mapping (sometimes called disaggregation) were established, which allow census data to be transformed to “finer map units” by means of ancillary data. Eicher and Brewer [12
] and Maantay, Morocco and Herrmann [13
] give a good overview of different dasymetric methods, allowing for a disaggregation through methods such as areal weighted transformation. Further, ancillary land-use or land-cover data are used to link population to effectively occupied areas [14
]. This land use information is derived through multispectral satellite imagery [12
]. In the work of Gallego et al. [19
] a 100 m population density grid for Europe has been estimated using CORINE Land Cover data. Even higher resolution at the urban block, plot or even building level is obtained using cadastral data [13
] and digital topographic maps at a scale of 1:50,000 [20
] or 1:25,000 [21
]. Zoraghein & Leyk [22
] deal with various interpolation methods for a refined dasymetric mapping in order to generate temporally consistent population compositions.
Digital building footprint data play a special role in the endeavour to further develop these modelling approaches, as residential population strongly relates to the function, the size and the height of buildings [23
]. They obtained the best results using a volumetric method, which considers the number of building floors, instead of an areametric method. Ural, Hussain and Shan [24
] later modified the approach by introducing a weighting scheme for different housing types such as “houses” and “apartments.” Meinel, Hecht and Herold [21
] differentiate between eight residential building types and apply empirically determined average dwelling and population densities. Non-residential uses within buildings have also been integrated to provide more accurate estimations [23
]. Such models also open a way for multi-temporal estimations of populations as information on buildings and functions allows respective changes in population distributions to be estimated during the day and night time [27
] or different seasons due to tourism [29
]. Analyses of long-term dynamics in populations mostly rely on remote sensing data [30
]. However, for the United States address point locations and address information from the private Zillow Transaction and Assessment Database (ZTRAX) have recently been used for characterizing settlement activities at a very fine spatial and temporal resolution [31
]. On this basis, it was possible to generate settlement layers for the United States over 200 years with a spatial resolution of 250 m.
In this paper, urban morphologies, which are derived from topographic data and historical maps, are a starting point to further develop the long-term applicability of building-based models. Herold [32
] proposes an approach to automatically extract building information from historical maps. As topographic maps were periodically compiled, dynamics in urban morphologies can be derived. Hence, combining this approach of map extraction with information on buildings offers a promising way to generate large-scale population and dwelling estimations for long time periods. We thereby increase the spatial and temporal extent of building-based methodologies for dasymetric mapping. We thus address the following research questions: How can historical data and dynamics on the population and the number of dwellings be derived by analysing multi-temporal urban morphologies through the combined processing of geographic vector data, statistical data and historical topographic raster maps? What accuracies can be achieved with automatic derivation?
To answer these questions, we first present a methodological framework (Section 2) for the automated derivation of this information, based solely on conventional data available in European and other developed countries. We then present the study area and the data used to test our approach (Section 3). The estimation results are presented in Section 4 and critically discussed in Section 5, with particular focus on possible sources of error. The article ends with a conclusion (Section 6).
2. Methodological Framework
This section describes the proposed framework for mapping the dynamics of population and dwellings by means of a spatiotemporal analysis of the urban morphology. Data sources and methods as well as the associated processing steps will be presented in detail.
2.1. Data Sources
This study proposes a methodology which makes use of topographic data and maps to derive the urban morphology. These data are used in order to estimate the historical distribution of dwellings and inhabitants. First, the question of appropriate data sources needs to be clarified. For the description of the current built environment (urban morphology), several data sources can be used but differ in spatial resolution, data quality, data availability and data costs. A very popular data source in earth observation of fast-growing cities is remote sensing data such as aerial imagery, very high-resolution (VHR) satellite imagery or airborne light detection and ranging (LiDAR) data. Nowadays, digital spatial vector data products and services from National Mapping and Cadastral Agencies (NMCAs) offer a variety of products (e.g., cadastral data, digital landscape models, 3D city models, topographic maps) that describe the urban morphology at a very detailed level. Furthermore, volunteered geographic information (VGI) platforms, such as OpenStreetMap, have recently become increasingly important for urban morphology analysis.
In our approach, we use official spatial data from NMCAs for the following reasons. NMCAs offer nationwide, well-structured spatial vector data in which relevant objects such as buildings, roads and land use information are explicitly modelled. In remote sensing data, these objects are depicted only implicitly (in different data scenes with varying acquisition conditions), and additional resources for image interpretation are necessary to extract the relevant objects and structure them explicitly in a database. A further disadvantage of remote sensing data is the significantly higher data cost, particularly for large areas under investigation. OpenStreetMap is a good option for cities, but its data quality is currently not sufficient for analysing the urban morphology of larger regions due to incomplete building footprints in rural areas [33
The workflow of our approach requires the following spatial data:
Topographic raster maps (historical data): The maps should be at a scale of 1:25,000 or larger and should contain the building footprints cartographically represented as solid, hatched or coloured areal symbols.
Land use data (current data): A polygon data set for the seamless description of land use at the urban block level. This data set can usually be derived from digital landscape models from NMCAs at a scale of 1:10,000 to 1:25,000.
Building model (current data): 2D building footprints or 3D building model in Level of Detail 1 (LoD1).
Furthermore, official statistical data on the number of inhabitants, the number of dwellings and the average household size at the municipal level are required to generate the parameters for the modelling. An additional building sample is required to obtain locally valid building type-specific assumptions on the story height (in m) and the dwelling unit size (in m²). For this purpose, the number of stories, the number of dwelling units as well as the building type are collected. If no local sample data are available, standard parameters from the urban planning literature can also be used.
2.2. General Workflow
The workflow and processing steps of our methodology are depicted in Figure 1. The individual steps are described in more detail in the next subsections. Required input data are a digital landscape model and a building model of the current time (t₀) as well as topographic maps of n historical time points (t₋₁, …, t₋ₙ) for which the spatial population and dwelling distribution is estimated.
2.3. Pre-Processing of the Topographic Maps
The pre-processing consists of the following steps: (a) scanning of analogue topographic maps for digital processing, (b) geo-referencing of the digital raster maps, (c) transformation, (d) extraction of the black layer through binarization and finally (e) mosaicking of the images.
In the scanning step (a), the analogue map is converted into a digital image. Here, a scan density of 508 dpi (50 μm pixel spacing) is used. This value has proven to be sufficient for the automatic interpretation of digitized topographic maps [34
]. Once the topographic maps at a scale of 1:25,000 have been scanned and georeferenced, one pixel corresponds to an area of 1.25 m by 1.25 m on the ground.
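The relation between scan density, map scale and ground resolution stated above can be checked with a short calculation. The following is a minimal sketch; the function names are illustrative and not part of the original workflow.

```python
MM_PER_INCH = 25.4


def pixel_spacing_um(dpi):
    """Pixel spacing on the paper map in micrometres for a given scan density."""
    return MM_PER_INCH / dpi * 1000.0


def ground_pixel_size_m(dpi, scale_denominator):
    """Edge length in metres that one scanned pixel covers on the ground."""
    spacing_m = MM_PER_INCH / dpi / 1000.0  # pixel spacing on paper, in metres
    return spacing_m * scale_denominator


# Values from the text: 508 dpi scan of a 1:25,000 map.
spacing = pixel_spacing_um(508)            # approximately 50 micrometres
ground = ground_pixel_size_m(508, 25000)   # approximately 1.25 metres
print(round(spacing, 3), round(ground, 3))
```

The same functions can be used to estimate the ground resolution for other map scales or scan densities before adapting later processing parameters.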
With regard to the automated building extraction, not only the spatial but also the spectral resolution is crucial. The scanned colour maps are stored as 24-bit RGB colour images. Next, the georeferencing process (b) assigns real-world coordinates to each pixel of the scanned raster map. Usually this is done manually by means of Ground Control Points (GCPs). Herold et al. [36
] have introduced a procedure for the automatic georeferencing of topographic maps for the operational acquisition of spatiotemporal data on urban growth. This procedure can be integrated into the workflow, especially if a large number of map sheets need to be processed. Subsequently, all maps were harmonized in a geodetic transformation process (c). As a common reference system, we used the projected Cartesian coordinate system DHDN/3-degree Gauss-Krueger zone 3 (EPSG code 31467). In the binarization process (d), the black layer is extracted from the raster images, so that all map elements shown in black are represented in a binary raster (1: black, 0: rest). Here we applied an unsupervised classification approach (Iso Cluster Unsupervised Classification with 12 classes). According to the map key, the classes reflect the various colour-coded map objects such as black objects (e.g., buildings, street outlines), green objects (e.g., vegetation) and blue objects (e.g., water bodies). In preparation for the building footprint retrieval, the individual raster maps were then merged into one raster file for each time step. After this mosaicking step (e), a raster data set was available for each of the years 1950, 1960, 1970, 1980, 1990 and 2000.
2.4. Building Footprint Retrieval
The raster data sets contain the required map information in implicit form. That is, in order to make use of the historical information such as building footprints, the data need to be transformed. The transformation is achieved by employing a region-oriented segmentation approach. Region-oriented segmentation is defined as the spatial partitioning of an image into sets of regions that meet a defined homogeneity criterion. For the data set at hand, we used the object morphologies as homogeneity criteria. To address the shape (and size) of the image objects, methods of mathematical morphology are applied. The basic morphological operators which are used in this work can be conceptualized as non-linear image filters [32
]. The most important filters are, according to [37
], opening and closing, which are sequences of the basic operators erosion and dilation. In order to separate compact objects (building footprints) from linear and less compact objects (such as street lines), we employed a sequence of these morphological operations with the following parameters: a circular structuring element (SE) with a radius of 3 pixels (3.75 m on the ground) is used for the opening. The size of the SE depends on the resolution and needs to be adjusted according to the map image resolution (here 508 dpi). It also depends on the quality of the map scan and the segmentation process. Hence, the size of the SE cannot be determined solely deductively; the optimal size needs to be adapted empirically for other map scales, resolutions and scan qualities. In a second step, all other information with morphological characteristics similar to those of buildings, such as certain letters or map symbols, needs to be detected and removed from the building data by means of an appropriate pattern recognition algorithm, which should be designed to be adaptable to the map layout. An adaptable pattern recognition approach for the maps at a scale of 1:25,000 used herein is described in Meinel et al. [21
]. Subsequently, the built-up area from the current land use data set is used to mask the settlement area as a region-of-interest (ROI). The result is the extracted building layer (Figure 2
). This procedure reduces both the computational load and the misclassification of non-building objects (false positives). A detailed and more generalized description of this complete auxiliary process of building footprint retrieval from maps can be found in References [21
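To illustrate how a morphological opening with a circular structuring element separates compact objects from thin linear ones, the following minimal sketch applies erosion followed by dilation to a small synthetic binary raster. It is written in plain Python for clarity and is not the implementation used in the study; the raster layout and all function names are assumptions for illustration only.

```python
def disk_offsets(radius):
    """Pixel offsets (dy, dx) of a circular structuring element."""
    return [(dy, dx)
            for dy in range(-radius, radius + 1)
            for dx in range(-radius, radius + 1)
            if dy * dy + dx * dx <= radius * radius]


def erode(img, offsets):
    """Keep a pixel only if the whole SE fits into the foreground around it."""
    h, w = len(img), len(img[0])
    return [[int(all(0 <= y + dy < h and 0 <= x + dx < w and img[y + dy][x + dx]
                     for dy, dx in offsets))
             for x in range(w)]
            for y in range(h)]


def dilate(img, offsets):
    """Stamp the SE onto every foreground pixel."""
    h, w = len(img), len(img[0])
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            if img[y][x]:
                for dy, dx in offsets:
                    if 0 <= y + dy < h and 0 <= x + dx < w:
                        out[y + dy][x + dx] = 1
    return out


def opening(img, radius):
    """Morphological opening: erosion followed by dilation with the same SE."""
    se = disk_offsets(radius)
    return dilate(erode(img, se), se)


# Synthetic 20 x 20 black layer: a 9 x 9 "building" and a 1-pixel-wide "street".
grid = [[0] * 20 for _ in range(20)]
for y in range(5, 14):
    for x in range(5, 14):
        grid[y][x] = 1
for x in range(20):
    grid[17][x] = 1

opened = opening(grid, radius=3)  # radius of 3 pixels, as in the text
```

In this toy example the street line disappears because the structuring element never fits inside it, whereas the compact block survives with slightly rounded corners. Objects smaller than the SE would also be removed, which is one reason why the SE size has to be tuned empirically.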
2.5. Derivation of Building Age
This processing step serves to determine the period of construction for each building. This can only be determined at the level of the urban blocks due to the spatial positional discrepancies of the extracted buildings from different time periods. The following steps were necessary for performing the time series analysis:
Calculating the built-up coverage: For each urban block and time slice, the built-up coverage (proportion of building area in the block area) is calculated based on a spatial intersection of the extracted building footprints from the topographic maps and the urban block geometry taken from the ATKIS® Basic DLM. The calculated built-up coverages over time form the basis for determining the time of first construction of an urban block. The block-based calculation avoids building-by-building comparisons over time, which may be impossible due to the varying map quality, layout and positional inaccuracies across the time series.
Determination of a threshold value: To distinguish between built-up and not built-up blocks at a specific time, a threshold value needs to be applied. The optimal threshold value was determined by performing a Receiver Operating Characteristic (ROC) analysis on given reference data. The reference data include a map of built-up areas for the year 1970. Together with the calculated built-up coverages, they served as input data for the ROC analysis. In this way, an optimum threshold value of 0.025 was determined, that is, an urban block is regarded as built-up only if its built-up coverage is 2.5% or more (with a sensitivity of 0.948 and a specificity of 0.785).
Historical settlement layer: In this step, the urban blocks of all time slices are classified according to built-up and not built-up by applying the threshold value to the built-up coverage values. Subsequently, the time of first construction is determined on the basis of the development pattern. The result is the historical settlement layer.
Historical urban morphology at the building level: In a final step, the buildings are intersected with the historical settlement layer at the urban block level and the age data are attached to the building layer (see Figure 3).
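The block-level steps above can be sketched as follows. This is a simplified illustration under stated assumptions: the time of first construction is taken to be the earliest time slice in which the built-up coverage reaches the threshold, whereas the study derives it from the development pattern, which may involve further rules. All function names and sample values are hypothetical.

```python
THRESHOLD = 0.025  # optimum threshold reported in the text


def built_up_coverage(building_area_m2, block_area_m2):
    """Proportion of the urban block area covered by extracted buildings."""
    return building_area_m2 / block_area_m2


def first_construction_year(coverage_by_year, threshold=THRESHOLD):
    """Earliest time slice classified as built-up, or None (simplifying assumption)."""
    for year in sorted(coverage_by_year):
        if coverage_by_year[year] >= threshold:
            return year
    return None


def sensitivity_specificity(coverages, reference, threshold):
    """Evaluate one candidate threshold against reference labels (True = built-up)."""
    tp = sum(1 for c, r in zip(coverages, reference) if c >= threshold and r)
    fn = sum(1 for c, r in zip(coverages, reference) if c < threshold and r)
    tn = sum(1 for c, r in zip(coverages, reference) if c < threshold and not r)
    fp = sum(1 for c, r in zip(coverages, reference) if c >= threshold and not r)
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical coverage time series for one urban block.
block = {1950: 0.0, 1960: 0.01, 1970: 0.08, 1980: 0.15}
year = first_construction_year(block)

# Hypothetical reference sample for assessing a threshold.
sample_cov = [0.0, 0.01, 0.03, 0.2, 0.02, 0.3]
sample_ref = [False, False, False, True, True, True]
sens, spec = sensitivity_specificity(sample_cov, sample_ref, THRESHOLD)
```

Repeating `sensitivity_specificity` over a range of candidate thresholds yields the points of a ROC curve, from which an optimum can be chosen according to the criterion preferred by the analyst.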
2.6. Classification of Building Footprints
In this processing step, the building footprints are classified according to a set of predefined building types. For this purpose, we apply the building classification approach with a supervised learning strategy introduced by Hecht et al. [38
]. In this strategy, a classifier is trained on the basis of a given set of training samples (buildings with class labels and object features) using a machine learning technique. The Random Forest (RF) algorithm by Breiman [39
] is used as a machine learning algorithm that constructs a large number of decision trees, each tree being trained with a random bootstrap sample of the training data. By comparing RF with 15 other algorithms, this ensemble-based classifier has been proven to be the best method for building classification in terms of accuracy and efficiency [40
]. The object features required for the classification are geometrical, topological, semantic and statistical measures. These have been calculated using digital image processing techniques and spatial analysis on the basis of the building geometry as well as urban block geometry taken from a digital landscape model. Before the classification approach was applied, small polygons (<10 m²) were removed from the data set. Using the automatic building classification approach, buildings are differentiated into 11 building types. A distinction is made between multi-family houses (MFH), single-family houses (SFH), rural houses (RH) and non-residential houses, which are further subdivided (see Figure 4
). Observations in the RF that are not part of the bootstrap sample can be used as out-of-bag (OOB) observations to estimate the prediction error. The estimated OOB error in the study area was 7.7% (corresponding to a total accuracy of 92.3%), which is comparable with the findings in Reference [38
2.7. Calculation of the Number of Dwellings and Inhabitants
The derivation of the dwelling units and inhabitants at the level of individual buildings is based on various model assumptions. The assumptions are based on figures from the literature and our own empirical analysis and, where possible, were made for each building type. The following assumptions are required for the processing:
Mean story height (sh)
Conversion factor (cf)
Dwelling unit size (dz)
Household size (hh)
The first relevant parameter is based on an assumption about the mean story height (sh) per building type in order to estimate the number of stories for each building (sn). This is necessary because the number of stories is usually not given in 3D building models. If the height of the building is also unknown (e.g., in the case of 2D building footprints), assumptions can be made about the typical number of stories, taking into account the type of building and the settlement structure [21
]. By multiplying the number of stories by the area of the building footprint, the total floor area (fa) in m² can be calculated. In order to derive the actual living space (ls) in m², the fa value is reduced by applying a conversion factor (cf), since not all of the floor area of a building is usable for living (e.g., stairways). The urban planning literature in Germany proposes a conversion factor of 0.8 [41
]. Once the living space has been calculated, the number of dwelling units can be derived by dividing it by the average dwelling unit size (dz) for each building type. The number of inhabitants can then be estimated by multiplying the number of dwelling units by the average household size (hh). Statistical data on average household sizes are provided by the statistical offices at the municipal level and also exist for historical time periods. Values for mean story heights and mean dwelling sizes can only be derived indirectly from official statistics and literature. Since these can differ considerably between regions, we propose working with an empirical sample in the study area.
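The chain of assumptions described in this section can be summarized in a short worked example. The sketch below uses the symbols from the text (sh, sn, fa, cf, ls, dz, hh); the parameter values in the example call are hypothetical and not taken from the study, and rounding the number of stories to a whole number with a minimum of one is an additional assumption.

```python
import math


def estimate_building(footprint_m2, height_m, sh, dz, hh, cf=0.8):
    """Estimate dwellings and inhabitants for one residential building.

    sh: mean story height (m), dz: mean dwelling unit size (m²),
    hh: average household size, cf: conversion factor floor area -> living space.
    """
    # Number of stories from building height (rounding is an assumption here).
    sn = max(1, math.floor(height_m / sh + 0.5))
    fa = sn * footprint_m2      # total floor area in m²
    ls = fa * cf                # living space in m²
    dwellings = ls / dz         # number of dwelling units
    inhabitants = dwellings * hh
    return {"sn": sn, "fa": fa, "ls": ls,
            "dwellings": dwellings, "inhabitants": inhabitants}


# Hypothetical multi-family house: 200 m² footprint, 12 m high,
# assumed sh = 3.0 m, dz = 75 m², hh = 2.0 persons per dwelling.
result = estimate_building(200.0, 12.0, sh=3.0, dz=75.0, hh=2.0)
print(result)
```

With these illustrative values the building has 4 stories, 800 m² of floor area and 640 m² of living space, which corresponds to roughly 8.5 dwelling units and 17 inhabitants. In practice sh and dz would be set per building type from the empirical sample, and hh per municipality and time period.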
As a result of the previous step (Section 2.6), the current or historical figures on the population and the number of dwellings are available at the building level for each time period. In a final step, the figures can be aggregated to any spatial level and the dynamics can be calculated by measuring the change in the total and relative values.
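The final aggregation and change calculation might look like the following minimal sketch. The record layout (a `block` identifier and an `inhabitants` value per building) and the sample figures are hypothetical; any other spatial unit could serve as the aggregation key.

```python
from collections import defaultdict


def aggregate(buildings, unit_key, value_key):
    """Sum a building-level value per spatial unit (e.g., urban block)."""
    totals = defaultdict(float)
    for b in buildings:
        totals[b[unit_key]] += b[value_key]
    return dict(totals)


def change(old_totals, new_totals):
    """Absolute and relative change per unit between two time slices."""
    result = {}
    for unit in sorted(set(old_totals) | set(new_totals)):
        old = old_totals.get(unit, 0.0)
        new = new_totals.get(unit, 0.0)
        # Relative change is undefined where the earlier value is zero.
        relative = (new - old) / old if old else None
        result[unit] = {"absolute": new - old, "relative": relative}
    return result


# Hypothetical building-level estimates for two time slices.
buildings_1970 = [{"block": "A", "inhabitants": 40.0},
                  {"block": "A", "inhabitants": 60.0},
                  {"block": "B", "inhabitants": 50.0}]
buildings_2000 = [{"block": "A", "inhabitants": 120.0},
                  {"block": "B", "inhabitants": 50.0},
                  {"block": "C", "inhabitants": 10.0}]

totals_1970 = aggregate(buildings_1970, "block", "inhabitants")
totals_2000 = aggregate(buildings_2000, "block", "inhabitants")
changes = change(totals_1970, totals_2000)
```

Units that were not built-up in the earlier time slice have no defined relative change, which is why the sketch returns `None` in that case rather than dividing by zero.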