1. Introduction
Over the past decades, rapid population growth and urbanization have taken place at an unprecedented rate all over the world. For instance, more than half of the world’s population lived in urban areas in 2008, and this figure is projected to rise to 70% by 2050, with most of the growth occurring in developing countries [1,2]. Urbanization, as an important process of landscape pattern change, is rapidly transforming natural land-cover into anthropogenic urban land use [3]. However, rapid urbanization leads to a series of environmental and development problems, e.g., deforestation [4], agricultural land loss [5], air pollution [6], and the urban heat island effect [7]. Therefore, it is crucial to understand the trends in urban sprawl and to detect land-cover change [8], particularly in rapidly urbanizing regions, for urban planning and environmental sustainability.
Satellite remote sensing, such as Moderate Resolution Imaging Spectroradiometer (MODIS) observation [9], Landsat observation [10,11,12] and Sentinel observation [13], has been used for urban land-cover mapping at different scales. With the development of remote sensing image processing, land-cover datasets produced from satellite images have been widely used in earth surface process modeling, sustainable development planning and so on. Several global land-cover products have arisen from the efforts of international scientific organizations during the past decades [14,15]. Most of them were derived from satellite images at 300–1000 m spatial resolution, such as the University of Maryland land-cover dataset (UMD) [16], the MODIS land-cover product (MOD12Q1 and MCD12Q1) [17] and the Climate Change Initiative land-cover product (CCI-LC) from the European Space Agency (ESA), and did not provide sufficient thematic detail or change information [14], especially in urban areas. More recent research on global urban mapping aims to provide spatially more accurate urban thematic layers based on finer-resolution satellite images. A multiscale high-resolution image processing framework supporting the Global Human Settlement Layer (GHSL) was presented in [18], with operational test results covering the mapping of 24.3 million km² of urban areas. The Urban Footprint Processor (UFP) [19], a fully automated method for delineating human settlements from SAR data, was used to produce a new global urban dataset called the Global Urban Footprint (GUF) raster map [20]. To overcome spectral confusion between urban and nonurban land-cover classes, a hierarchical object-based texture (HOTex) approach was developed and applied to produce a new global map of built-up and settlement extent [21]. It is worth mentioning that the GlobeLand30 land-cover product, with improved spatial resolution and accuracy, has been developed by the National Geomatics Center of China [10,22]. This 30 m product is based on Landsat and similar satellite images, so the continuous observation of Landsat allows its land-cover maps to be updated and reconstructed. However, because many of these products present land-cover maps at only a few static time instants, their timeliness is far from satisfactory for many applications in urban areas. Accurate and timely information about land-cover change is vital to the comprehension and management of urbanization dynamics [23].
Most current approaches update land-cover maps using supervised classification, which requires large numbers of training samples. Generating training samples is a time-intensive, expensive, and subjective task, so collecting a sufficient number of them for operational land-cover mapping is not always feasible [24,25,26]. Several training data automation methods have been proposed as an alternative to manual selection. For example, Training Data Automation (TDA) methods for forest cover mapping and built-up area extraction were designed in [27] and [28], respectively. An automatic approach was proposed in [29] to classify vegetation, water, impervious surface areas, and bare land with a training data automation procedure based on several spectral indices. However, most current TDA methods focus on a few specific land-cover types, and some important land-cover types (e.g., cultivated land and wetland) are not considered. Moreover, when the training dataset is selected from an image different from the one used for classification, spectral shifts between the two distributions are likely to make the classification model fail. Such shifts, caused by differences in acquisition and atmospheric conditions or by changes in the landscape, are defined as data-shift problems [30]. The data-shift problem is unavoidable when land-cover maps are updated using time series images. Thus, a method capable of exploiting the relational knowledge between existing land-cover products and time series images is appealing for automatically updating large-scale land-cover maps.
To solve the data-shift problems in multi-temporal and multi-modality remote sensing imagery, many approaches have been proposed in recent decades. Popular solutions include absolute and relative image normalization [31], histogram matching [32], Principal Component Analysis (PCA) [33] and the Tasselled Cap (TC) transformation [34]. In contrast with these and other conventional solutions, the Transfer Learning (TL) approach has gained increasing attention in recent years. TL is defined as follows [35]: given the data and learning task of a source domain and the data and learning task of a target domain, transfer learning aims to improve the learning of the target predictive function in the target domain using the knowledge in the source domain and its learning task. Domain Adaptation (DA) is a particular form of TL. In the remote sensing community, the two domains are associated with two satellite images acquired over different areas or over the same area at different times. TL techniques aim to adapt the prior information of the source domain to train a classifier that predicts labels in the target domain [30]. To this end, a widely used approach adapts a classifier with source-domain samples and labeled or unlabeled target-domain samples. In this approach, training samples from the source domain initialize the learning task, and the data of the target domain are used to adapt the model through methods and strategies such as the Domain Adaptation Support Vector Machine (DASVM) [36], the Active Learning (AL) technique [37], Change-Detection-driven Transfer Learning (CDTL) [38], the Geodesic Flow Kernel Support Vector Machine (GFKSVM) [39] and an iterative source sample reweighting strategy [40]. Another popular approach searches for a shared and invariant feature subset. Its main idea is to find an appropriate subspace by feature selection methods such as Transfer Component Analysis (TCA) [41] and Manifold Alignment (MA) [42].
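Of the conventional solutions listed above, histogram matching is perhaps the simplest to illustrate. The following is a minimal NumPy sketch for a single band, assuming both bands are plain arrays without no-data values; the function name `match_histogram` is hypothetical and the code is not claimed to reproduce the procedure of [32].

```python
import numpy as np

def match_histogram(source_band, reference_band):
    """Remap source_band so that its empirical CDF follows that of reference_band."""
    src = np.asarray(source_band)
    ref = np.asarray(reference_band)

    # Unique values, per-pixel index into those values, and value counts.
    src_values, src_index, src_counts = np.unique(
        src.ravel(), return_inverse=True, return_counts=True
    )
    ref_values, ref_counts = np.unique(ref.ravel(), return_counts=True)

    # Empirical cumulative distribution of each band.
    src_cdf = np.cumsum(src_counts) / src.size
    ref_cdf = np.cumsum(ref_counts) / ref.size

    # For each source quantile, look up the reference value at the same quantile.
    matched_values = np.interp(src_cdf, ref_cdf, ref_values)
    return matched_values[src_index].reshape(src.shape)
```

Such a remapping only corrects a global, monotonic radiometric difference between two acquisitions. It cannot account for genuine land-cover change between them, which is one reason the transfer learning approaches discussed above are attractive.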
According to the relevant remote sensing literature [30,40,43], TL approaches are usually grouped into four categories: instance transfer, parameter transfer, feature representation transfer and relational knowledge transfer. Relational knowledge transfer approaches reuse the knowledge acquired in the source domain to solve the learning problem in a related target domain [43]. Land-cover products such as GlobeLand30 provide land-cover knowledge with sufficient sample labels. Because GlobeLand30 is produced mainly from Landsat images, GlobeLand30 and its corresponding Landsat image can be defined as the source domain, and other Landsat images acquired over the same geographical area can be defined as the target domain. In theory, the land-cover maps of the target domain can then be updated by a relational knowledge transfer approach. Based on this idea, a new automatic updating approach tailored to rapidly urbanizing regions was explored. The method was designed to leverage multi-modality remote sensing datasets, capture high-quality sample labels from the land-cover product, and derive the updated land-cover map by an unsupervised knowledge transfer procedure. In addition, a novel sample selection strategy is designed as an alternative to the traditional random selection method.
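For orientation, the traditional random selection baseline mentioned above can be sketched as per-class random sampling of pixel positions from an existing land-cover raster. This is only an illustration of the baseline, not the sample selection strategy proposed in this paper; the function name `sample_labels_per_class` and the class codes in the usage note are hypothetical, and the land-cover raster is assumed to be co-registered with the image to be classified.

```python
import numpy as np

def sample_labels_per_class(label_map, samples_per_class, seed=0):
    """Randomly pick up to samples_per_class pixel positions for every class in label_map."""
    rng = np.random.default_rng(seed)
    labels = np.asarray(label_map)
    rows, cols, classes = [], [], []
    for class_id in np.unique(labels):
        # Positions of all pixels carrying this class label.
        r, c = np.nonzero(labels == class_id)
        n = min(samples_per_class, r.size)
        # Draw n distinct positions without replacement.
        chosen = rng.choice(r.size, size=n, replace=False)
        rows.append(r[chosen])
        cols.append(c[chosen])
        classes.append(np.full(n, class_id))
    return np.concatenate(rows), np.concatenate(cols), np.concatenate(classes)
```

With a toy map such as `np.array([[1, 1, 2], [1, 2, 2], [3, 1, 1]])`, the returned rows and columns index pixels whose labels match the returned class array, so the spectral features of a co-registered target image could be read as `image[rows, cols]` to form a training set.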
The paper is organized as follows: Section 2 introduces the study areas and data sources; Section 3 introduces the workflow of the proposed approach; Section 4 evaluates the model performance with reference data; finally, Section 5 concludes the paper.
2. Study Areas and Materials
Two cities within the Yangtze River Delta city cluster were selected for updating land-cover maps (Figure 1). One study area is the urban area of Nanjing and its surroundings, and the other is the urban area of Hangzhou and its surroundings. Each study area covers 8100 km². Nanjing is the capital of Jiangsu Province and the second largest city in the East China region. Hangzhou is the capital and most populous city of Zhejiang Province and sits at the head of Hangzhou Bay, which separates Shanghai and Ningbo. Both Nanjing and Hangzhou are important economic hubs of the Yangtze River Delta city cluster, which is planned to become a world-class city cluster. In recent years, land-cover in these two regions has changed rapidly under the influence of urban expansion.
The nighttime light (NTL) data were obtained from version 4 of the DMSP/OLS NTL cloud-free annual composites. These data were observed by six satellites over 22 years, from 1992 to 2013. The main dataset used in this paper is the average stable light product with a spatial resolution of 30 arc seconds. This product, which contains lights from areas of human activity with continuous nighttime lighting, has demonstrated potential for mapping and monitoring urban areas [23,44]. We chose this product to separate artificial surfaces from other land-cover types. To obtain quality-enhanced NTL data, the original DN values were calibrated using the ridgeline sampling regression method [45].
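As a rough illustration of how stable-light values might help separate artificial surfaces from other land-cover types, the sketch below keeps an artificial-surface label only where the NTL value reaches a threshold. The class code, the threshold value and the function name are hypothetical placeholders rather than the rules used in this study, and the NTL raster is assumed to have been resampled to the grid of the land-cover map beforehand.

```python
import numpy as np

ARTIFICIAL_CODE = 80   # hypothetical class code for artificial surfaces
NTL_THRESHOLD = 20.0   # hypothetical DN threshold, chosen for illustration only

def reliable_artificial_mask(label_map, ntl,
                             threshold=NTL_THRESHOLD,
                             artificial_code=ARTIFICIAL_CODE):
    """Return a boolean mask of artificial-surface pixels that are also lit at night."""
    labels = np.asarray(label_map)
    lights = np.asarray(ntl, dtype=float)
    if labels.shape != lights.shape:
        raise ValueError("label_map and ntl must share the same grid")
    # A pixel is kept only if the map labels it artificial AND it is bright enough.
    return (labels == artificial_code) & (lights >= threshold)
```

Pixels labeled as artificial surfaces but falling below the threshold would be treated as uncertain under this sketch and could be excluded from the training samples.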
The Landsat satellite data have been archived by the US Geological Survey (USGS) since 1972 and are suitable for historical studies of land-cover change. The remote sensing images chosen for this study were Landsat 5 and Landsat 8 surface reflectance products distributed through USGS EarthExplorer. For each experimental area, we created a cloud-free Landsat image dataset for circa 2010 and for the year to be updated. Table 1 lists the acquisition dates of the Landsat images used for the study areas, and Figure 2 shows these images.
The GlobeLand30 datasets are open-access maps of the Earth’s land-cover and comprise ten land-cover types, including cultivated land, forest, grassland, water bodies, wetland, artificial surfaces and so on. These datasets were produced by a pixel-object-knowledge-based (POK-based) classification approach for the years 2000 and 2010 [10,22]. More than 20,000 Landsat and Chinese HJ-1 satellite images were used to produce land-cover maps at 30 m resolution. The overall accuracy of GlobeLand30 for 2010 reaches 80.33%, assessed with over 150,000 test samples. In this study, GlobeLand30 for 2010 served as the original land-cover map from which land-cover information was obtained.
To facilitate the land-cover classification process, ancillary geospatial products, namely a digital surface model (DSM) and a digital elevation model (DEM), were collected and preprocessed for this research. The DSM was taken from the precise global digital 3D map “ALOS World 3D” developed by the Japan Aerospace Exploration Agency. The DEM was taken from the ASTER Global Digital Elevation Model developed jointly by the Ministry of Economy, Trade, and Industry (METI) of Japan and NASA. Both datasets have a horizontal resolution of approximately 30 m (1 arc second in latitude and longitude).
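The angular resolutions quoted in this section can be related to ground distances with a short calculation. The sketch below assumes a spherical Earth with a mean radius of 6371 km, which is a simplification; the function name `arcsecond_spacing_m` is hypothetical.

```python
import math

EARTH_RADIUS_M = 6_371_000.0  # mean Earth radius (spherical approximation)

def arcsecond_spacing_m(arcseconds, latitude_deg=0.0):
    """Return (north_south, east_west) ground spacing in metres for an angular pixel size."""
    angle_rad = math.radians(arcseconds / 3600.0)
    # Spacing along a meridian does not depend on latitude on a sphere.
    north_south = EARTH_RADIUS_M * angle_rad
    # Spacing along a parallel shrinks with the cosine of the latitude.
    east_west = north_south * math.cos(math.radians(latitude_deg))
    return north_south, east_west
```

Under this approximation, 1 arc second corresponds to about 30.9 m in the north-south direction, and to about 26 m in the east-west direction at a latitude of roughly 32°N, close to that of Nanjing. The 30 arc second NTL grid corresponds to a little over 900 m north-south, so the NTL data are far coarser than the 30 m land-cover and elevation products.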
5. Conclusions
For rapidly urbanizing regions, land-cover mapping is of great importance to urban planning and management. However, land-cover products often have a long renewal cycle and cannot provide timely information on land-cover change. This study presented a novel automatic approach for updating land-cover maps by relational knowledge transfer. It classifies a new satellite image using the knowledge transferred from an existing land-cover product and its corresponding image. The proposed approach consists of three steps. The first step obtains reliable land-cover information from GlobeLand30 by establishing a set of decision rules based on multi-modality RS datasets to optimize the land-cover product. The second step applies the proposed knowledge transfer procedure to the source and target domains, transferring land-cover knowledge from the modified historical map to the new images. In the third step, the multi-feature combination classification procedure classifies the target image to produce the land-cover map. Two typical experimental areas, Nanjing and Hangzhou, were selected to demonstrate the effectiveness of the approach. The classification results indicated that these steps perform well.
Compared with traditional methods for land-cover mapping, the proposed approach does not involve manual selection of samples or adjustment of variable parameters. Therefore, the method can quickly and effectively produce new land-cover maps, which gives it significant application potential for rapidly urbanizing regions. The characteristics of our study are as follows: (1) compared with existing research on land-cover product updating, the uncertainty of the existing land-cover map is fully considered, and an optimized map is produced instead of directly updating the original map; (2) the knowledge transfer approach in this paper is an unsupervised transfer learning procedure, in which the training sets for new images are transferred from the historical land-cover map automatically; (3) the proposed approach provides a general and simple solution to the data-shift problem, which has potential for large-scale operational applications.
In the future, more types of geographic information datasets, not limited to remote sensing datasets, could be incorporated into our method for transferring land-cover information. Potential datasets include regional thematic maps, OpenStreetMap and socio-economic data. By combining more land-cover knowledge, the updated maps will hopefully contain richer and more precise land-cover information.