2.2. Satellite Derived Products
For the area under study, the ESM dated 2012, downloaded from the European Copernicus Services, includes only buildings at 10 m spatial resolution [14]. Such a map represents the percentage of built-up area coverage per spatial unit. It was used by JRC to produce the spatial distribution of regular migrant population with the 2011 census data.
The present study first updated the settlement map of the area under study up to 2018 by using Sentinel-2 imagery. The data selected were multiseasonal, cloud-free Sentinel-2 images, downloaded freely from the European Space Agency (ESA) portal [23]. The dataset consisted of four images acquired on 30 January, 25 May, 29 July and 17 October 2018. All 10 image spectral bands were considered at 10 m spatial resolution.
Depending on in situ reference data availability, two different automatic classification procedures, data-driven pixel-based or, alternatively, knowledge-driven object-based, can be exploited to update the settlement map. Both approaches were implemented in this work. The Food and Agriculture Organization Land Cover Classification System (FAO–LCCS) taxonomy was considered for class discrimination [24].
More details about the two different classification approaches are provided hereafter:
Data-driven pixel-based approach. A pixel-based Support Vector Machine (SVM) classifier [25] with a Radial Basis Function (RBF) kernel was applied as the data-driven classifier. The 10 image spectral bands were fed as input to the classifier along with the slope derived from a Digital Elevation Model (DEM) available at 8 m. The slope was added to discriminate built-up objects from extraction sites. The following final classes were considered: B15/A1 (Artificial Structures/Built Up); B15/A2A6 (Artificial Structures/Extraction sites); (B27 or B28)/A1 (Artificial or Natural Waterbodies/Water); A11/(A1 or A2)A9 or A12/(A3 or A4)E1 (Cultivated Lands or Natural Vegetation/(Trees or Shrubs) Evergreen); A11/(A1 or A2)A10 or A12/(A3 or A4)E2 (Cultivated Lands or Natural Vegetation/(Trees or Shrubs) Deciduous); A12/A2 (Natural Vegetation/Herbaceous); A11/A3 (Cultivated Lands/Herbaceous). The output LC map was produced by training the SVM classifier with 11,965 reference pixels. Next, the freely available 5 m-buffered road map [26], dated 2018, was used to discriminate buildings from roads within the B15/A1 (built-up) output map layer.
Knowledge-driven object-based approach. This approach includes two steps: a preliminary segmentation followed by a classification [27]. The hierarchical scheme adopted by the FAO–LCCS taxonomy was implemented for class discrimination. First, Vegetated or Not-Vegetated objects were recognized (LCCS Level 1) and then, within each of them, Terrestrial or Aquatic areas (LCCS Level 2). Finally, within the Artificial Structures layer, buildings were discriminated from roads using the Length/Width morphological feature, which exploits the elongated dominant shape that characterizes roads but not buildings.
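The Length/Width rule can be illustrated with a minimal sketch. It assumes that each segmented object is available as a list of (row, col) pixel coordinates and approximates the feature with the axis-aligned bounding box of the object. The function names and the threshold value are hypothetical and are not taken from the study.

```python
def length_width_ratio(pixels):
    """Length/width ratio of the axis-aligned bounding box of an object.

    pixels: list of (row, col) tuples belonging to one segmented object.
    """
    rows = [r for r, _ in pixels]
    cols = [c for _, c in pixels]
    height = max(rows) - min(rows) + 1
    width = max(cols) - min(cols) + 1
    return max(height, width) / min(height, width)


def label_artificial_object(pixels, ratio_threshold=4.0):
    """Label an Artificial Structures object as 'road' or 'building'.

    ratio_threshold is a hypothetical value chosen for illustration only.
    """
    if length_width_ratio(pixels) >= ratio_threshold:
        return "road"
    return "building"


# A 2 x 10 pixel strip (elongated) versus a 3 x 3 pixel block (compact).
strip = [(r, c) for r in range(2) for c in range(10)]
block = [(r, c) for r in range(3) for c in range(3)]
print(label_artificial_object(strip))  # road
print(label_artificial_object(block))  # building
```

An axis-aligned box underestimates the elongation of diagonal roads, so an operational implementation would more likely rely on an oriented bounding box or on the feature provided by the segmentation software.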
The final output binary settlement maps for year 2018, at 10 m spatial resolution, obtained by the two approaches were validated through the same set of reference samples (3659 pixels). To obtain a sample population that could best represent the entire population under study, both training and validation reference pixels were selected through a stratified random sampling. This sampling technique was used to ensure that the actual population class segments were neither over-represented nor under-represented [28].
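As an illustration of stratified random sampling, the sketch below draws the same fraction of pixels from every class, so that each class keeps its share of the reference set. The proportional allocation, the function name and the toy data are assumptions made for this example; the study does not report how the strata were sized.

```python
import random
from collections import defaultdict


def stratified_sample(labelled_pixels, fraction, seed=0):
    """Draw the same fraction of pixels from every class.

    labelled_pixels: list of (pixel_id, class_label) tuples.
    fraction: share of each class to draw, between 0 and 1.
    """
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for pixel_id, label in labelled_pixels:
        by_class[label].append(pixel_id)

    sample = {}
    for label, ids in by_class.items():
        # Keep at least one pixel per class so small classes are not dropped.
        n_draw = max(1, round(len(ids) * fraction))
        sample[label] = rng.sample(ids, n_draw)
    return sample


# Hypothetical reference set: 80 built-up pixels and 20 water pixels.
reference = [(i, "built-up") for i in range(80)]
reference += [(i, "water") for i in range(80, 100)]
validation = stratified_sample(reference, fraction=0.25)
print({label: len(ids) for label, ids in validation.items()})
# {'built-up': 20, 'water': 5}
```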
2.3. Updated Statistical Census Data: Collection and Harmonization
The Italian National Institute of Statistics (ISTAT) publishes data about resident migrant population regularly in order to contribute to Eurostat International Migration statistics. Every year, the publication concerns the total number of regular migrant residents in each municipality classified by sex, nationality and age; every 10 years the publication provides data on the number of regular migrants residing in each municipality specified per census area. The latest updated census data relate to the 2011 National Census Collection.
However, at the local level, the Demographic Service Office of each municipality is in charge of collecting the distribution of regular migrants every year. Since the present study uses the dasymetric method to establish the distribution of the regular migrant population within the Bari metropolitan area, the relevant data could only be acquired through collaboration with the Demographic Service Offices of the municipalities. Among the municipalities neighboring Bari, however, ancillary data for the year 2018 could be found only for Modugno.
The data obtained from the Bari and Modugno offices were highly fragmented and lacked a common template. The Bari office contributed data aggregated per census area, while the Modugno office provided data by address. Hence, efforts were needed to harmonize the information available for both municipalities and group it by census area. The tabular data were transposed onto maps in the Universal Transverse Mercator coordinate system (European Terrestrial Reference System 1989) and ingested in the open source QGIS environment (Figure 2).
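The grouping of address-level records by census area can be sketched as follows. The sketch assumes that a lookup from address to census-area code is already available (for instance from a geocoding step, which is not shown); the function name, addresses and codes are hypothetical.

```python
from collections import Counter


def aggregate_by_census_area(address_records, address_to_area):
    """Sum resident counts per census area from address-level records.

    address_records: list of (address, resident_count) tuples.
    address_to_area: dict mapping an address to a census-area code.
    Returns (totals per census area, addresses without a match).
    """
    totals = Counter()
    unmatched = []
    for address, count in address_records:
        area = address_to_area.get(address)
        if area is None:
            unmatched.append(address)  # left for manual checking
        else:
            totals[area] += count
    return dict(totals), unmatched


# Hypothetical records and lookup table.
records = [("Via A 1", 3), ("Via A 2", 2), ("Via B 5", 4), ("Via C 9", 1)]
lookup = {"Via A 1": "CA-001", "Via A 2": "CA-001", "Via B 5": "CA-002"}
totals, unmatched = aggregate_by_census_area(records, lookup)
print(totals)     # {'CA-001': 5, 'CA-002': 4}
print(unmatched)  # ['Via C 9']
```

Returning the unmatched addresses separately keeps the totals traceable, since records that cannot be assigned to a census area are not silently dropped.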
2.6. Updating Spatial Distribution of Resident Migrant Population: the Dasymetric Mapping Method
Demographic data are usually represented by a choropleth map [31], where the statistical data are aggregated by census area. This type of representation is reported to have several limitations with respect to spatial analysis and population distribution [32]. On the other hand, dasymetric maps have the advantage of “representing population density, irrespective of any administrative boundary, as distributed in reality, i.e., by natural spots of concentration and rarefaction” [10].
JRC dasymetric mapping of the resident migrant population was implemented by applying Equation (1) to each element of a uniform output grid with 100 × 100 m cells:

$$P_{c} = P_{z} \, \frac{S_{c}}{S_{z}} \qquad (1)$$

where:
$P_{c}$ is the population density assigned to each output cell;
$P_{z}$ is the population density of resident migrants in the relevant input census section;
$S_{c}$ is the built-up footprint area occupied in the output cell, weighted for building use;
$S_{z}$ is the built-up footprint area occupied in the input census zone, weighted for building use.
To be reliable, this formula requires information on the location and the spatial size of the buildings included in the output cell. The minimum structure or object to be identified is defined as the building footprint area [33].
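A minimal numerical sketch of the allocation expressed by Equation (1) is given below. It assumes that the value of the census zone is scaled by the ratio between the weighted built-up area falling in the cell and the weighted built-up area of the whole zone, and it works with population counts rather than densities. The function name and the numbers are illustrative only.

```python
def allocate_to_cell(zone_population, weighted_area_in_cell, weighted_area_in_zone):
    """Share of a census-zone population assigned to one output cell."""
    if weighted_area_in_zone <= 0:
        return 0.0  # no built-up area in the zone: nothing to distribute
    return zone_population * weighted_area_in_cell / weighted_area_in_zone


# Hypothetical zone: 120 residents and 2000 m2 of weighted built-up area,
# of which 500 m2 fall inside the output cell.
print(allocate_to_cell(120, 500.0, 2000.0))  # 30.0
```

Because the cell shares of the weighted built-up area add up to the zone total, the values allocated to all the cells of a zone add up to the zone population.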
As input for the dasymetric map, JRC adopted the ESM 2012 map (10 m) to extract built-up areas [9]. In our study, the extraction of built-up areas for 2018 was based on a LC map from high-resolution Sentinel-2 satellite data (10 m). Once extracted, built-up elements were weighted, for the building use, through the same weight values as the ones adopted by JRC (Table 1). The rationale behind this weighting procedure is to account for the fact that, due to multifloor buildings, population density values are generally higher in urban areas than in rural contexts.
Figure 6 presents the vector approach used in our study to update the spatial distribution of resident migrants in the Bari and Modugno area during 2018.
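The specific formula (Equation (2)) implemented in Python within the QGIS environment is:

$$P_{c} = \sum_{j=1}^{m} P_{j} \, \frac{\sum_{i=1}^{n} w_{i} S_{i}}{\sum_{l=1}^{k} w_{l} S_{l}} \qquad (2)$$

where:
$P_{c}$ is the resident migrant population assigned to the output cell considered;
$P_{j}$ is the resident migrant population of the j-th census zone falling in the cell;
$S_{i}$ is the built-up footprint area in a generic subelement i of a census zone, with different building use, included in the output cell considered;
$S_{l}$ is the built-up footprint area in a generic subelement l of a census zone having different building use;
$w_{i}$ and $w_{l}$ are the building-use weights of subelements i and l (Table 1);
m is the number of the census zones in a cell;
n is the number of census zone subelements, with different building use, within a grid cell;
k is the number of census zone subelements with different building use.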
The denominator of Equation (2), $\sum_{l=1}^{k} w_{l} S_{l}$ (with $S_{l}$ the built-up footprint area of subelement l and $w_{l}$ its building-use weight), is equivalent to the weighted built-up footprint area of the input census zone found in Equation (1) used by JRC. This term corresponds to the sum of the areas occupied by built-up structures in each subelement of the specific census zone, each multiplied by the correction factor reflecting building use (see Table 1). To compute this term, an intersection between the choropleth map of regular migrant population distribution in census zones and building use information was needed.
Figure 7 shows a graphical example of the computation steps performed.
The map thus obtained was then overlaid with an output grid (100 × 100 m). As a result, each cell in the grid could be associated with more than one census zone and be divided into elements indicating different building use (Figure 8). When the cell belonged to a single census zone, the numerator of Equation (2), $\sum_{i=1}^{n} w_{i} S_{i}$, was equal to the weighted built-up footprint area occupied in the output cell (Equation (1)); otherwise, a separate computation was required for each census zone part within the cell.
The resident migrant population of each census zone in the cell, $P_{j}$, was thus weighted by the factor:

$$\frac{\sum_{i=1}^{n} w_{i} S_{i}}{\sum_{l=1}^{k} w_{l} S_{l}} \qquad (3)$$

The sum of the products between the populations and the respective weighting factors in Equation (3) allowed computation of the total resident migrant population to be assigned to each cell of the output map.
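The cell-level computation can be sketched in plain Python as follows. This is not the code used in the study: it assumes that the intersection between census zones, building-use polygons and grid cells has already been carried out, so that each zone part is described by (area, weight) pairs. The data structure, function names and weight values are hypothetical (the weights are not the Table 1 values).

```python
def weighted_area(parts):
    """Sum of built-up areas multiplied by their building-use weights.

    parts: list of (built_up_area_m2, use_weight) tuples.
    """
    return sum(area * weight for area, weight in parts)


def cell_population(zones_in_cell):
    """Population assigned to one grid cell that may span several census zones.

    zones_in_cell: list of dicts with keys
      'population'    : resident migrants in the whole census zone,
      'parts_in_cell' : (area, weight) subelements of the zone inside the cell,
      'parts_in_zone' : (area, weight) subelements of the whole zone.
    """
    total = 0.0
    for zone in zones_in_cell:
        denominator = weighted_area(zone["parts_in_zone"])
        if denominator == 0:
            continue  # a zone without built-up area contributes nothing
        factor = weighted_area(zone["parts_in_cell"]) / denominator
        total += zone["population"] * factor
    return total


# Hypothetical cell intersecting two census zones.
cell = [
    {"population": 100,
     "parts_in_cell": [(200.0, 1.0), (100.0, 0.5)],
     "parts_in_zone": [(800.0, 1.0), (400.0, 0.5)]},
    {"population": 40,
     "parts_in_cell": [(300.0, 1.0)],
     "parts_in_zone": [(600.0, 1.0)]},
]
print(cell_population(cell))  # 45.0
```

In the example, the first zone contributes 100 × 250/1000 = 25 residents and the second 40 × 300/600 = 20 residents, so 45 residents are assigned to the cell.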
The processing chain was implemented in the QGIS environment (version 3.4) and was compatible with all QGIS versions supporting Python 3.7.
In order to quantify the dasymetric modeling uncertainty related to the output map, two indicators were used, i.e., the Mean Absolute Error (MAE) and the Total Absolute Error (TAE), as described elsewhere [10,34]. These indicators were computed by comparing the input population value provided for each census zone ($P_{in}$) with the sum of all pixel population output values included in each census zone ($P_{out}$).
TAE and MAE were computed using Equations (4) and (5):

$$TAE = \sum_{i=1}^{N} \left| P_{in,i} - P_{out,i} \right| \qquad (4)$$

$$MAE = \frac{TAE}{N} \qquad (5)$$

where N is the total number of cells considered.
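The two indicators can be computed with a few lines of Python. The sketch assumes the usual definitions, i.e., TAE as the sum of the absolute differences between paired input and output values and MAE as TAE divided by the number of compared units; the function name and the sample values are illustrative.

```python
def total_and_mean_absolute_error(input_population, output_population):
    """Return (TAE, MAE) for paired input and output population values."""
    if len(input_population) != len(output_population):
        raise ValueError("both sequences must have the same length")
    if not input_population:
        raise ValueError("at least one pair of values is required")
    tae = sum(abs(p_in - p_out)
              for p_in, p_out in zip(input_population, output_population))
    mae = tae / len(input_population)
    return tae, mae


# Hypothetical census values and corresponding sums of output cell values.
census_values = [120, 45, 0, 30]
mapped_values = [118.5, 46.0, 0.0, 30.5]
tae, mae = total_and_mean_absolute_error(census_values, mapped_values)
print(tae, mae)  # 3.0 0.75
```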
A validation strategy based on in situ reference data was not applied because of the confidentiality of the information involved at the local (intraurban) scale.
2.7. SDG Indicators Implementation
The updated settlement (built-up) map and the updated spatial distribution of resident migrant population thus obtained were used to implement: (a) SDG 11.1.1 indicator, i.e., proportion of urban population living in slums, informal settlements or inadequate housing [18]; (b) SDG 11.3.1 indicator, i.e., ratio of Land Consumption Rate to Population Growth Rate (LCRPGR) [19]. For both indicators, vector maps, in shapefile format, were produced.
Specifically, based on data availability, indicator SDG 11.1.1 was split into two components related to structural and quality criteria (a.1) or location criteria (a.2), respectively:
(a.1) Proportion of urban population living in inadequate housing households (IHH_1);
(a.2) Proportion of urban population living in households considered inadequate due to their residing on or close to hazardous areas (IHH_2).
Inadequate housing is defined with respect to structural conditions, location and durability of the building. A house is considered as “durable” if it is built in a nonhazardous location and has a permanent and adequate structure able to protect its inhabitants from climatic extremes such as heavy rain, heat, cold and humidity [18].
For component (a.1) of SDG 11.1.1, related to structural quality criteria, under the hypothesis that people are equally distributed in all the cell settlements, the indicator, according to [18], was computed as:
where the number of people living in inadequate housing households (
) per each output cell was provided as:
with
representing the total regular migrant population in the output cell considered.
Information about the position of inadequate housing introduced in Section 2.5 (Figure 4) was used.
For component (a.2) of SDG 11.1.1 related to location criteria, the indicator formula was computed as:
where the number of people living in households on or close to hazardous areas (
) per each output cell was provided as:
Concerning the SDG 11.3.1 indicator, the formula applied is described in [19]. This formula, already applied by [6] at the global scale, is based on the ratio between the Land Consumption Rate (LCR) and the Population Growth Rate (PGR) as follows:

$$LCRPGR = \frac{LCR}{PGR}$$

with:

$$LCR = \frac{\ln\left(Urb_{t+n} / Urb_{t}\right)}{n}$$

and:

$$PGR = \frac{\ln\left(Pop_{t+n} / Pop_{t}\right)}{n}$$

where:
$Urb_{t+n}$ is the surface occupied by urban areas, in the output cell considered, at the final year (t+n);
$Urb_{t}$ is the surface occupied by urban areas, in the output cell considered, at the initial reference year (t);
$Pop_{t+n}$ is the regular migrant population living in urban areas, in the output cell considered, at the final year (t+n);
$Pop_{t}$ is the regular migrant population living in urban areas, in the output cell considered, at the initial reference year (t);
n is the number of years between the initial and final dates considered.
In this study, the starting year (t) was 2011 and the final year (t+n) was 2018, according to census data availability. Thus, the settlement map obtained from Copernicus ESM and the one obtained from the classification of 2018 Sentinel-2 images were considered as the Urb measures at the two study dates, with n = 7 years.
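The indicator can be computed per output cell as in the following sketch. It assumes the logarithmic form of the two rates given in the UN metadata for SDG 11.3.1, i.e., the natural logarithm of the ratio between final and initial values divided by the number of years. The function name and the sample values are illustrative and are not results of the study.

```python
import math


def lcrpgr(urb_t, urb_tn, pop_t, pop_tn, n_years):
    """Ratio of Land Consumption Rate to Population Growth Rate.

    urb_t, urb_tn: urban surface at the initial and final year.
    pop_t, pop_tn: population at the initial and final year.
    Returns None when the population did not change (ratio undefined).
    """
    if min(urb_t, urb_tn, pop_t, pop_tn) <= 0 or n_years <= 0:
        raise ValueError("surfaces, populations and n_years must be positive")
    lcr = math.log(urb_tn / urb_t) / n_years
    pgr = math.log(pop_tn / pop_t) / n_years
    if pgr == 0:
        return None
    return lcr / pgr


# Hypothetical cell: built-up surface grows by 10% and population by 21%
# between 2011 and 2018 (n = 7 years), giving a ratio close to 0.5.
ratio = lcrpgr(urb_t=1000.0, urb_tn=1100.0, pop_t=200.0, pop_tn=242.0, n_years=7)
print(round(ratio, 3))
```

Cells with no built-up surface or no population at either date cannot be evaluated with the logarithmic form, which is why the sketch rejects non-positive inputs instead of returning a value.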