Extraction of Urban Built-Up Area Based on Deep Learning and Multi-Sources Data Fusion—The Application of an Emerging Technology in Urban Planning

Abstract: With the rapid expansion of urban built-up areas in recent years, it has become urgent to develop a fast, accurate and widely applicable method for extracting urban built-up areas. As the direct carrier of urban regional relationships, the urban built-up area is an important reference for judging the level of urban development. Accurate extraction of urban built-up areas plays an important role in formulating scientific planning and thus in promoting the healthy development of both urban and rural areas. Although nighttime light (NTL) data have been used to extract urban built-up areas in previous studies, NTL data alone have certain shortcomings for this purpose. Point of interest (POI) data and population migration data represent different attributes of urban space and can help to correct the deficiencies of NTL data from the perspective of static and dynamic spatial elements, respectively, so as to improve the extraction accuracy of urban built-up areas. Therefore, this study proposes a feasible method to modify NTL data by fusing Baidu migration (BM) data and POI data, and uses it to extract urban built-up areas in Guangzhou with a U-net deep learning network. The maximum built-up area extracted in the study is 1103.45 km², accounting for 95.21% of the total built-up area, with a recall of 0.8905, a precision of 0.8121 and an F1 score of 0.8321. Using POI data and BM data to modify NTL data did not significantly improve the extraction results, because the more data are fused, the more noise is introduced, which ultimately affects the results. This study analyzes the feasibility and limitations of using big data to modify NTL data through data fusion and feature extraction, which has important theoretical and practical significance for future studies on urban built-up areas and urban development.


Introduction
As an objective reflection of the geographical distribution of urban construction development, urban built-up area refers to the relatively concentrated distribution area that is actually built or in the process of being built within the administrative scope of urban area [1], including the centralized contiguity part of the urban area and other urban construction land dispersed to the inner suburbs but closely connected with the urban area [2]. As the core of urban public activity system and an important part of urban function, although the urban center is the most concentrated area of urban political, economic, cultural and other public activities, the urban center is still only a small part of the urban built-up area. Spatially, the urban built-up area includes the urban center [3]. Meanwhile, as the spatial carrier of all urban activities, urban built-up areas can more directly reflect the urbanization degree of a city, making the size of urban built-up areas an important indicator to judge urban development [4][5][6]. With the acceleration of urbanization, a series of urban problems resulted from urban expansion such as land use contradiction, uncoordinated economic development and complex population flow components require more attention in future urban management and social governance [7,8]. Therefore, it is of great importance to continuously identify and monitor urban built-up areas, which can play a greater role in urban planning and management, territorial space planning and rural revitalization, etc. [9].
The data used in previous studies on extracting urban built-up areas are highly dependent on administrative indicators and statistical yearbooks. These data are updated slowly, and the extraction areas are often larger than the actual urban built-up areas (mainly reflected in the fact that local governments exaggerate the local urban development level to obtain more development funds) [10,11]. Therefore, it is particularly important and urgent to carry out accurate identification and extraction of urban built-up areas [12]. The accurate identification and extraction can help to assess the urban development level in a finer way [13]; it can also help to provide a scientific basis for the formulation of urban planning to promote high-quality urban development [14].
In the study on the extraction of urban built-up areas, the statistics of surface information provided by remote sensing image has become a feasible and universal method [15], and the results of urban built-up area identified by this method have also been adopted by the government and research institutions. As a kind of remote sensing data, NTL (nighttime light) data mainly reflects the relationship between economic distribution and population agglomeration in urban space by capturing nighttime lights generated by urban construction, population activities, etc. [16][17][18][19][20]. For studies using NTL data as prominent study data, it is necessary to determine the threshold of light through different methods [21]. In the previous studies, the most common methods are the dichotomy method and the threshold method for segmentation and extraction [22]; that is, on the premise of knowing the area of the built-up area, the brightness value of nighttime light is continuously segmented, so that the finally extracted brightness range is close to the area of the built-up area [23]. However, the operation of such methods is too complicated and difficult to be popularized. Therefore, the current mainstream method is object-oriented image segmentation [24]. Image semantic segmentation algorithm has strong extraction ability for spectral and spatial features, which makes it more commonly used in the study of remote sensing image classification and information extraction [25,26]. However, the shortcomings of NTL data itself include certain spillover effects (such as Defense Meteorological Satellite Program/Operational Linescan System), excessively extensive resolution, and discontinuous variation of light brightness in the rural-urban boundary, making it gradually difficult to accurately extract urban built-up area with single-source NTL data [27].
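The dichotomy approach described above can be illustrated with a minimal Python sketch: the brightness threshold is bisected until the area of pixels at or above it is as close as possible to a known reference area. The function name `find_threshold`, the flat list of pixel brightness values and the uniform `pixel_area` are hypothetical simplifications, not the procedure of any cited study.

```python
def find_threshold(brightness, pixel_area, target_area, tol=1e-3):
    """Bisect a brightness threshold so the lit area approaches target_area.

    brightness  : list of pixel brightness values (hypothetical input)
    pixel_area  : area covered by one pixel, in the same unit as target_area
    target_area : known reference area of the built-up area
    """
    lo, hi = min(brightness), max(brightness)
    while hi - lo > tol:
        mid = (lo + hi) / 2
        # Area of all pixels that would be classified as built-up.
        area = sum(pixel_area for v in brightness if v >= mid)
        if area > target_area:
            lo = mid  # too much area extracted: raise the threshold
        else:
            hi = mid  # not more than the target: lower the threshold
    return hi


# Toy example: ten pixels of unit area, reference area of three units.
pixels = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
threshold = find_threshold(pixels, pixel_area=1.0, target_area=3.0)
```

The need for a known reference area and for repeated segmentation is what makes this family of methods hard to generalize.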
In recent years, the application of urban big data, including mobile phone signaling, integrated circuit card, POI (point of interest) and population migration data, has become more and more popular, which provides a new observation way for urban-related studies. These big data have unique advantages in the studies of urban space thanks to their faster collection speed, shorter update cycle and wider sample range [28][29][30][31]. Researchers have also found that there is a significant spatial correlation between big data and NTL data, and they tried to improve the image quality of NTL data by fusing big data [32]. At present, it is widely used to extract urban built-up areas through the fusion of POI data, population migration data with NTL data [33]. Among them, POI data reflect the spatial distribution of infrastructure in urban geographical space [34], while population migration data reflect the differences of population change in urban and rural areas [35]. Both of them play an important role in modifying NTL data and extracting urban built-up areas [36].
Currently, the more commonly used image fusion levels mainly include data level image fusion, feature level image fusion and decision level image fusion [37]. Among them, data level fusion, also known as pixel level fusion, refers to the process of directly processing the data collected by the sensor to obtain the fused image, which is one of the key points of current image fusion study [38]. The advantage of pixel level fusion is that it can maintain as much original data as possible and provide subtle information that cannot be provided by other fusion levels [39]. Pixel-level fusion includes spatial domain algorithm and transform domain algorithm, and there are many fusion rule methods in spatial domain algorithm, such as logical filtering method, gray weighted average method and contrast modulation method [40]. In the transform domain, there are pyramid decomposition and fusion method and wavelet transform method [41]. Among them, wavelet transform is the most important and commonly used pixel-level fusion method at present [42].
The extraction of urban built-up areas based on remote sensing and urban big data is essentially the extraction of image feature information, and deep learning has unique advantages in this field [43]. As a new direction in machine learning, deep learning learns the internal rules and representation levels of sample data [44,45]. Researchers extract different image features through deep learning algorithms; for example, land type samples are used to train convolutional neural networks (CNN), which can then extract land use types at a large regional scale [46]. Compared with previous studies, results obtained with deep learning methods not only have higher accuracy but are also produced with much faster processing. In addition, many studies have found that deep learning performs far better in image segmentation and feature extraction than earlier methods [47].
Currently, among the relevant studies on the extraction of urban built-up areas, the method with the highest accuracy first fuses NTL data and POI data and then extracts the urban built-up area with an object-oriented method. Although the highest accuracy obtained in this way generally exceeds 92% [48], the method still has some shortcomings. Firstly, both NTL data and POI data describe static attributes of urban space, while different urban built-up areas should have a certain dynamic relationship within space [49]. Secondly, whether fusing different data produces different results for the extraction of urban built-up areas still needs to be tested. Thirdly, the threshold selection of object-oriented image segmentation methods, such as multiresolution segmentation, is too subjective, resulting in large differences between the study results of different areas; additionally, whether different data have a significant impact on the results still needs further study [50]. Therefore, this study modifies NTL data at the static and dynamic levels by fusing POI data and population migration data, respectively, and then compares the differences between the different data fusions. Finally, urban built-up areas are extracted using deep learning with training samples. Different from previous studies, this study analyzes the influence of multi-source data fusion on the extraction results of urban built-up areas under a deep learning method, further compares the differences between data from different sources, and discusses the influence of fusing them on the extracted built-up areas. This study proposes a new idea and method of urban built-up area extraction, so as to extract urban built-up areas more accurately and serve subsequent urban planning and urban development more scientifically.
The rest of the paper is organized as follows. The second section presents the materials and methods, including an overview of the study area, the sources of study data and the study methods. The third section presents the study results, including the differences between urban built-up areas extracted from different data. The fourth section is the discussion, and the fifth section is the conclusion.

Study Area
Located at 112°57′ to 114°3′ east longitude and 22°26′ to 23°56′ north latitude (Figure 1), Guangzhou is not only the capital city of Guangdong Province but also one of the core cities of the Guangdong-Hong Kong-Macao Greater Bay Area urban agglomeration (GBA). With the rapid construction of the GBA, Guangzhou's urbanization rate reached 85% by the end of 2020. According to the ranking of urban built-up areas officially released by China in 2020, the area of urban built-up areas in Guangdong Province ranks first in China, while Guangzhou ranks third in China. Therefore, taking Guangzhou as an example to explore a feasible extraction method for urban built-up areas can not only provide a decision-making basis for Guangzhou's urban planning and management, but also provide a good reference for other cities.

Study Data
The data used in this study mainly include NTL data, POI data and BM (Baidu Migration) data. NTL data are derived from Luojia-01, POI data from Application Programming Interface service provided by Amap [51], and BM data from Application Programming Interface service of Baidu [52].
Previous studies have shown that the NTL data of the Defense Meteorological Satellite Program/Operational Linescan System and the National Polar-orbiting Partnership/Visible Infrared Imaging Radiometer are more suitable for studies at large regional scales, such as an urban agglomeration, a country or a region, while Luojia-01 NTL data are more suitable for studies at the scale of a single city and can achieve more accurate results than other data. Therefore, Luojia-01 NTL data are used in this study. The Luojia-01 NTL data, with a wavelength range from 480 to 800 nm, a spatial resolution of 130 m, and a swath width of 260 km, are provided by the Luojia-01 experimental satellite launched by Wuhan University, China in 2018. Luojia-01 NTL data can be downloaded for free at http://59.175.109.173:8888/, accessed on 30 April 2022. In this study, the NTL data of Guangzhou from October 2018 to March 2019 were downloaded on 1 May 2022; after radiation correction, radiance conversion and multi-period averaging, the spatial distribution of nighttime lights in Guangzhou was obtained, as shown in Figure 2.
POI data are abstract representations of real-world entities in virtual geographical space. Amap's POI service provides comparatively comprehensive and accurate POI navigation and regional search functions, which gives Amap's POI data an advantage. POI data mainly include four basic attributes: name, category, coordinates (longitude and latitude) and address. By the end of December 2021, 22 types of POI data in Guangzhou had been obtained through the API service of Amap, with a total of 953,201 records. After deduplication and screening, 881,233 POI records were retained, and their spatial distribution is shown in Figure 3.
BM data are data on changes in population location between regions within a certain period, obtained from Baidu Map. By analyzing the regional changes of population location, the quantitative relationship of the migrating population between different regions can be obtained. Considering the impact of COVID-19 on large-scale population migration, we obtained the migration data of one random week per month from January 2021 to December 2021 through the Baidu Open API, to avoid large fluctuations in population migration caused mainly by COVID-19 outbreaks. After averaging the 12 weeks of BM data, it is found that the regional flow of population generally stabilizes over one year, which helps to reduce the impact of COVID-19. The distribution of population migration in Guangzhou in 2021 is shown in Figure 4.

Kernel Density Analysis (KDA)
KDA reflects the spatial agglomeration form of point elements by calculating the distribution of different points in geographical space [53,54]. The formula of KDA is:

$$p_i = \frac{1}{n \pi R^2} \sum_{j=1}^{n} k_j \left(1 - \frac{D_{ij}^2}{R^2}\right)^2 \tag{1}$$

where $p_i$ is the kernel density value of spatial position $i$, $D_{ij}$ is the distance between spatial point $i$ and study object $j$, $n$ is the number of study objects whose distance $D_{ij}$ is less than or equal to $R$, $k_j$ is the spatial weight, and $R$ is the search radius. The search radius of kernel density analysis is:

$$R = 0.9 \times \min\left(SD, \sqrt{\frac{1}{\ln 2}} \times D_m\right) \times n^{-0.2} \tag{2}$$

where $SD$ is the standard distance, $D_m$ is the median distance, and $n$ is the number of points.
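As an illustration, the following minimal Python sketch evaluates a kernel density at one location together with a default search radius. It assumes the common quartic form p_i = 1/(nπR²) Σ k_j (1 − D_ij²/R²)² and the search radius R = 0.9 × min(SD, √(1/ln 2) × D_m) × n^(−0.2). The function names, the use of the mean center when computing SD and D_m, and the sample coordinates are hypothetical and are not taken from the study.

```python
import math
from statistics import median


def search_radius(points):
    """Default bandwidth: 0.9 * min(SD, sqrt(1/ln 2) * D_m) * n^-0.2."""
    n = len(points)
    cx = sum(x for x, _ in points) / n
    cy = sum(y for _, y in points) / n
    # Distance of every point from the mean center (an assumption here).
    dists = [math.hypot(x - cx, y - cy) for x, y in points]
    sd = math.sqrt(sum(d * d for d in dists) / n)  # standard distance
    d_m = median(dists)                            # median distance
    return 0.9 * min(sd, math.sqrt(1 / math.log(2)) * d_m) * n ** -0.2


def kernel_density(location, points, radius, weights=None):
    """Quartic-kernel density at `location` from points within `radius`."""
    if weights is None:
        weights = [1.0] * len(points)  # unweighted points
    lx, ly = location
    total, n = 0.0, 0
    for (x, y), k in zip(points, weights):
        d = math.hypot(x - lx, y - ly)
        if d <= radius:
            n += 1
            total += k * (1 - d * d / radius ** 2) ** 2
    if n == 0:
        return 0.0  # no point falls inside the search radius
    return total / (n * math.pi * radius ** 2)


# Hypothetical POI coordinates in projected units (e.g. kilometres).
pois = [(0.0, 0.0), (2.0, 0.0), (0.0, 2.0), (2.0, 2.0)]
radius = search_radius(pois)
density = kernel_density((1.0, 1.0), pois, radius)
```

In practice the density is evaluated on every cell of a raster grid, which turns the POI points into a continuous surface that can be compared with the NTL image.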

Image Fusion Based on Multi-Scale Transform
The multi-scale image fusion method can fuse the image based on the feature information of the image by scaling the image at different scales, and the wavelet transform is a classic multi-scale image fusion method [55]. Images fused by wavelet transform can reflect information of different images at the same time, thus improving the efficiency of data use [56]. In addition, compared with other data fusion methods, wavelet transform not only contains high and low frequency information in horizontal, vertical and diagonal directions, but also can better capture the singularity of the image signal by providing a dynamic time-frequency change window, so as not to lose the original information of the image.
where f(t) is the signal vector, ϕ(t) is the wavelet function, α controls the scaling of the wavelet function, τ controls the translation of the wavelet function, and b is the parameter. The basic principle of the wavelet transform is to translate the wavelet function by τ and then take its inner product with the analysis signal f(t) at different scales α, which realizes the multi-scale fusion of images.

Image Feature Extraction Based on Deep Learning
As a learning method based on data representation in machine learning, deep learning has the advantages of semi-supervised feature learning and hierarchical feature extraction. Using efficient algorithms, deep learning can greatly improve the accuracy of data processing while saving processing time, making image feature extraction simpler and more efficient [57,58]. Although CNN and fully convolutional networks (FCN) are more commonly used in deep learning, their shortcomings cannot be ignored: pixel position information is lost, only images of a fixed size can be input, there are few skip-layer structures, the segmentation results are blurred, and training takes relatively long. This study therefore uses the U-net neural network. As a segmentation network evolved from FCN, U-net considers both the global information and the detailed information of the image and splices the results of each encoder layer into the decoder, which greatly improves segmentation accuracy [59,60]. The processing flow of U-net in extracting image feature information is shown in Figure 5.
The U-net adopted in this study has five levels. In the down-sampling part, each level consists of two 3 × 3 convolution operations, followed by an activation function and a max pooling operation with a 2 × 2 kernel. In this way, the number of feature channels increases and the height and width of the feature maps decrease at each down-sampling level. In the up-sampling part, a 2 × 2 kernel is used for the deconvolution operation at each level, and the result is combined with the feature map of the corresponding down-sampling level. Then, two 3 × 3 convolution layers with the ReLU activation function are applied. After the multi-level deconvolution operations, the image segmentation result is finally output.
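The encoder-decoder structure can be traced with a small, framework-free Python sketch that follows the feature-map shape through a five-level U-net. The 256 × 256 input, the 64 base channels, the doubling of channels per level and the use of padded convolutions are illustrative assumptions; the study does not report these values.

```python
def unet_shapes(size=256, base_channels=64, levels=5):
    """Trace (stage, channels, height/width) through a U-net.

    Assumes padded 3x3 convolutions (spatial size unchanged), 2x2 max
    pooling (size halved) and 2x2 transposed convolutions (size doubled).
    `size` is assumed to be divisible by 2 ** (levels - 1).
    """
    trace = []
    skips = []
    channels = base_channels
    # Down-sampling path: two 3x3 convolutions per level, then pooling.
    for level in range(levels):
        trace.append((f"down{level + 1}", channels, size))
        if level < levels - 1:
            skips.append((channels, size))  # kept for the skip connection
            size //= 2       # 2x2 max pooling halves height and width
            channels *= 2    # assumed channel doubling per level
    # Up-sampling path: deconvolution, concatenation, two 3x3 convolutions.
    for level in range(levels - 1, 0, -1):
        skip_channels, skip_size = skips.pop()
        size *= 2            # 2x2 deconvolution doubles height and width
        assert size == skip_size  # shapes must match to concatenate
        channels = skip_channels  # convolutions reduce the merged channels
        trace.append((f"up{level}", channels, size))
    return trace


shapes = unet_shapes()
for stage, channels, size in shapes:
    print(f"{stage}: {channels} channels, {size} x {size}")
```

The trace makes the symmetry visible: every up-sampling stage returns to the resolution of the down-sampling stage whose feature map it is combined with, which is how positional detail lost by pooling is restored.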

Accuracy Verification
The accuracy of the extracted built-up area should be verified from two aspects: the extracted built-up area should not only be close to the existing built-up area in size, but also overlap closely with it in spatial position. Therefore, the traditional area ratio and random point sampling can no longer adequately verify the accuracy of built-up area extraction. For this reason, the accuracy of this study is verified by comparing the degree of spatial overlap between the extracted built-up area and the existing built-up area [61]:

$$\mathrm{precision} = \frac{a_{overlap}}{a_{computed}} \tag{4}$$

$$\mathrm{recall} = \frac{a_{overlap}}{a_{comparative}} \tag{5}$$

$$F_1 = \frac{2 \times \mathrm{precision} \times \mathrm{recall}}{\mathrm{precision} + \mathrm{recall}} \tag{6}$$

where the $F_1$ score is the harmonic average of recall and precision, ranging from 0 to 1, and the greater the value, the higher the accuracy. $a_{overlap}$ is the total area of the overlap between the extracted built-up area obtained in this study and the reference built-up area, $a_{computed}$ is the total area of the extracted built-up area, and $a_{comparative}$ is the total area of the reference built-up area. In addition, the accuracy value refers to the extracted urban built-up area divided by the reference built-up area, or the urban built-up area divided by the cross-verified built-up area.
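A minimal Python sketch of these overlap-based metrics follows. It assumes that precision is the overlap area divided by the extracted area, that recall is the overlap area divided by the reference area, and that the F1 score is their harmonic mean; the function name and the areas in the example are hypothetical and are not results of the study.

```python
def overlap_metrics(a_overlap, a_computed, a_comparative):
    """Return (precision, recall, F1) from three areas in the same unit.

    a_overlap     : area shared by the extracted and reference built-up areas
    a_computed    : total area of the extracted built-up area
    a_comparative : total area of the reference built-up area
    """
    precision = a_overlap / a_computed
    recall = a_overlap / a_comparative
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical areas in km^2, chosen only to illustrate the arithmetic.
precision, recall, f1 = overlap_metrics(800.0, 1000.0, 900.0)
```

Because both ratios share the overlap area as numerator, an extraction that is too large lowers precision while one that is too small lowers recall, so the F1 score penalizes errors in either direction.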
In order to verify the accuracy of the built-up area extraction method proposed in this study, the official land use urban built-up area distribution in Guangzhou published in 2020 is used as the training and test data set of this study (Figure 1). In addition, this study is cross-verified with the urban built-up areas extracted from high-resolution remote sensing images of Google Earth to further verify the accuracy of the results of this study.
In this study, NTL data, POI data and BM data are fused based on multi-scale wavelet transform; then, the urban built-up area is extracted from the fused image through deep learning and U-net, and the accuracy of the results is verified finally. The flow diagram of the study is shown in Figure 6. All the methods used in this study are shown in Table 1.

The Modification of NTL Data by Fusing Multi-Sources Big Data
Existing studies have shown that extracting urban built-up areas from single-source NTL data introduces errors caused by many factors, including the spillover effect of the NTL data itself, differences in spatial resolution and differences in threshold extraction algorithms. Additionally, there is a strong spatial correlation between POI data, BM data and nighttime lights in urban space: the number of POIs, the brightness of night lights and the volume of population migration all decline gradually from the urban center to the urban edge and finally to the rural area. At the same time, these three data sources represent different urban attributes, so POI data and population migration data can be fused with NTL data to modify the NTL data, in the hope of improving the accuracy of built-up area extraction.
In this study, the wavelet transform is carried out based on OpenCV. Multi-scale wavelet decomposition, one of the common processing steps in computer vision (CV), decomposes the original image into components at different scales and is therefore better able to recognize, at multiple scales, features that are otherwise not obvious. In the process of image fusion, the image can essentially be regarded as an N × N two-dimensional signal matrix. As a method that fuses images through multi-scale transformation, the wavelet transform decomposes the original two-dimensional signal matrix into four two-dimensional sub-matrices of equal size, each of which contains different wavelet coefficients corresponding to different signal characteristics. The new data are then obtained by comparing the details of the different images in the wavelet domain and carrying out the inverse transformation. The comparison and process before and after data fusion are shown in Figure 7.
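The decompose, compare and invert sequence can be sketched in plain Python with a single-level Haar wavelet. This is not the OpenCV-based implementation used in the study: the Haar basis, the single decomposition level and the fusion rule (average the approximation band, keep the larger-magnitude detail coefficient) are assumptions chosen for brevity, and all function names are hypothetical.

```python
def haar_dwt2(img):
    """Single-level 2D Haar transform of an even-sized list-of-lists image.

    Returns four sub-bands (one approximation and three detail bands),
    each with half the height and width of the input.
    """
    rows, cols = len(img) // 2, len(img[0]) // 2
    ll = [[0.0] * cols for _ in range(rows)]
    lh = [[0.0] * cols for _ in range(rows)]
    hl = [[0.0] * cols for _ in range(rows)]
    hh = [[0.0] * cols for _ in range(rows)]
    for i in range(rows):
        for j in range(cols):
            a, b = img[2 * i][2 * j], img[2 * i][2 * j + 1]
            c, d = img[2 * i + 1][2 * j], img[2 * i + 1][2 * j + 1]
            ll[i][j] = (a + b + c + d) / 2  # local average (approximation)
            lh[i][j] = (a - b + c - d) / 2  # left-right difference
            hl[i][j] = (a + b - c - d) / 2  # top-bottom difference
            hh[i][j] = (a - b - c + d) / 2  # diagonal difference
    return ll, lh, hl, hh


def haar_idwt2(ll, lh, hl, hh):
    """Invert haar_dwt2 and return the reconstructed image."""
    rows, cols = len(ll), len(ll[0])
    img = [[0.0] * (2 * cols) for _ in range(2 * rows)]
    for i in range(rows):
        for j in range(cols):
            s, h, v, x = ll[i][j], lh[i][j], hl[i][j], hh[i][j]
            img[2 * i][2 * j] = (s + h + v + x) / 2
            img[2 * i][2 * j + 1] = (s - h + v - x) / 2
            img[2 * i + 1][2 * j] = (s + h - v - x) / 2
            img[2 * i + 1][2 * j + 1] = (s - h - v + x) / 2
    return img


def fuse(img_a, img_b):
    """Fuse two equally sized images in the Haar wavelet domain."""
    bands_a, bands_b = haar_dwt2(img_a), haar_dwt2(img_b)
    # Approximation band: average the two images (assumed rule).
    ll = [[(p + q) / 2 for p, q in zip(row_a, row_b)]
          for row_a, row_b in zip(bands_a[0], bands_b[0])]
    # Detail bands: keep the coefficient with the larger magnitude.
    details = [
        [[p if abs(p) >= abs(q) else q for p, q in zip(row_a, row_b)]
         for row_a, row_b in zip(band_a, band_b)]
        for band_a, band_b in zip(bands_a[1:], bands_b[1:])
    ]
    return haar_idwt2(ll, *details)


# Toy rasters standing in for normalized NTL and POI density images.
ntl = [[1, 2], [3, 4]]
poi = [[4, 3], [2, 1]]
fused = fuse(ntl, poi)
```

With real rasters, the inputs would first be resampled to a common grid and normalized to a common value range, since the fusion rule compares coefficients directly.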
Land 2022, 11, x FOR PEER REVIEW 11 of 20

Urban Built-Up Area Extracted by Different Data
Geospatial difference is caused by the difference in the aggregation degree of geographical events in a certain geographical space, which represents the formation process of some geographical events. Within the urban space, there are obvious geospatial differences in different geographical ranges, which are specifically manifested in the excessive differences in regional development level and spatial structure between urban and rural areas. Thus, the differences of NTL data, POI data and BM data between urban and rural areas form the basis for the extraction of urban built-up areas in this study.

Urban Built-Up Area Extracted by NTL Data
NTL data distinguish urban built-up areas and non-built-up areas according to the brightness difference of nighttime lights. From the preprocessing results of NTL data (Figure 2), it can be seen that the high values of NTL data are mainly distributed in Haizhu District, Liwan District, Yuexiu District, Huangpu District and Panyu District, while the values of NTL data in Zengcheng District, Conghua District and Baiyun District are relatively lower. Therefore, the spatial distribution of Guangzhou urban built-up area can be preliminarily judged by the high and low values of NTL data.
As shown in Figure 8, the urban built-up area is obtained by taking the built-up area in the land use classification in Figure 1 as training samples and then using U-net to perform image segmentation of the nighttime light image. It can be seen from Figure 8 that the urban built-up area extracted by NTL data is 1148.75 km², accounting for 84.09% of the built-up area of Guangzhou. The urban built-up area extracted by NTL data has two characteristics. First, in terms of the extracted built-up areas, the built-up areas are mainly distributed in Haizhu District, [...] while the patches in other areas are relatively fragmented (in China's official land survey, the minimum area of a built-up area patch is 400 m², so the patches smaller than 400 m² extracted in this study do not correspond to real built-up area patches). Additionally, there is no obvious connection between the built-up area patches, and the extracted urban built-up area boundary lacks complete and detailed information. The extraction of a considerable number of road patches further complicates the boundary. On the whole, NTL data yield more patches, which makes the built-up areas fragmented and the boundary information complex, so that the extraction results differ considerably from the actual situation. Therefore, the results of extracting urban built-up areas using NTL data alone need to be further improved.
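The 400 m² minimum patch area mentioned above suggests a simple post-processing check. The following sketch flags and removes connected patches below that area in a binary built-up mask. It is an illustration only and is not described in this study: the function name, the 4-connectivity rule and the pixel area passed in are assumptions.

```python
from collections import deque


def remove_small_patches(mask, pixel_area_m2, min_area_m2=400.0):
    """Return a copy of a binary mask (list of rows of 0/1) in which
    4-connected patches smaller than min_area_m2 are set to 0."""
    rows, cols = len(mask), len(mask[0])
    out = [row[:] for row in mask]
    seen = [[False] * cols for _ in range(rows)]
    for r0 in range(rows):
        for c0 in range(cols):
            if mask[r0][c0] != 1 or seen[r0][c0]:
                continue
            # Breadth-first search collects one connected patch.
            patch, queue = [], deque([(r0, c0)])
            seen[r0][c0] = True
            while queue:
                r, c = queue.popleft()
                patch.append((r, c))
                for nr, nc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
                    if (0 <= nr < rows and 0 <= nc < cols
                            and mask[nr][nc] == 1 and not seen[nr][nc]):
                        seen[nr][nc] = True
                        queue.append((nr, nc))
            # Drop the patch if its total area is below the minimum.
            if len(patch) * pixel_area_m2 < min_area_m2:
                for r, c in patch:
                    out[r][c] = 0
    return out


# Hypothetical 10 m x 10 m pixels (100 m^2 each), chosen only for illustration.
example = [
    [1, 1, 0, 0],
    [1, 1, 0, 1],
    [0, 0, 0, 0],
    [0, 1, 1, 0],
]
cleaned = remove_small_patches(example, pixel_area_m2=100.0)
```

In this toy mask the four-pixel block (400 m²) is kept, while the isolated pixel and the two-pixel patch are removed.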

Urban Built-Up Area Extracted by POI_NTL Data
As with NTL data, the number of POI data also differs significantly between levels of urban development. Unlike NTL data, however, POI data mainly reflect the construction of infrastructure within the city, so the concentration of POI data can help to distinguish urban built-up areas from non-built-up areas. The spatial distribution of POI data is roughly similar to that of the high and low values of NTL data; that is, high values are mainly distributed in Haizhu District, Liwan District, Yuexiu District and other central urban areas. However, compared with NTL data, the number of POI data decreases more continuously at the edges between urban built-up and non-built-up areas, indicating that the POI_NTL data obtained by fusing POI data with NTL data may perform better in extracting the edge details of urban built-up areas.
Using the same training samples as for NTL data in the previous section (derived from the land use classification in Figure 1), the urban built-up area extracted from the POI_NTL data is obtained as shown in Figure 8. The urban built-up area extracted by POI_NTL data is 1251.03 km², accounting for 91.58% of the built-up area of Guangzhou. Like the result for NTL data, the urban built-up area extracted by POI_NTL data has two characteristics. First, in terms of extent, the built-up area extracted by POI_NTL data is mainly concentrated in the central urban area, which is similar to the coverage of the urban built-up area extracted by NTL data alone. Second, in terms of patches, POI_NTL data show a slight decrease in the number and degree of fragmentation of built-up area patches. Although there is still no connection between patches, the extracted boundary information of the urban built-up area is improved, and the complexity of the extracted main road patches is also reduced. On the whole, the results extracted by POI_NTL data differ from those extracted by NTL data alone, indicating that POI data play a certain role in improving the urban built-up area extracted by NTL data.

Urban Built-Up Area Extracted by BM_NTL Data
The regional population is one of the important indicators to reflect the macro level of urban development, while the regional population migration is the micro embodiment of the development differences within urban area. Due to the differences in industry and urban construction between urban built-up areas and non-built-up areas, the regional flow of population is obviously different, which is also the reason why the flow of regional population can be used to assist in judging the spatial distribution of urban built-up areas. Figure 3 shows that although population migration is mainly concentrated in central urban areas, compared with NTL data and POI data, population migration between central urban areas and surrounding areas leads to obvious spatial connections between regions.
The urban built-up area extracted from BM_NTL data using U-net is shown in Figure 8. It covers 1233.16 km², accounting for 90.27% of the built-up area of Guangzhou. In terms of extent, the urban built-up area extracted in the central city is smaller than that extracted by POI_NTL data, but the urban built-up area extracted between the urban center and Zengcheng, Conghua and other regions is larger than that extracted by NTL and POI_NTL data. In terms of patches and boundaries, there is no significant difference between the urban built-up area extracted by BM_NTL data and that extracted by POI_NTL data. On the whole, although the results extracted by BM_NTL data are similar to those extracted by POI_NTL data, BM_NTL data extract more built-up areas with spatial linkage attributes between different urban groups, especially between the central urban areas, Zengcheng and Conghua. However, such built-up areas do not necessarily exist in physical space, which requires further verification.

Urban Built-Up Area Extracted by POI_BM_NTL Data
Although NTL data, POI data and BM data express three different attributes of urban space, NTL data and POI data in essence describe static space, while BM data reflect the dynamic relationships between patches of different urban built-up areas. Since POI data and BM data represent static and dynamic attributes, respectively, this study first fuses POI data and BM data with NTL data separately; the three datasets are then fused together to improve the extraction accuracy of the urban built-up area and to compare the effect of the different data fusions. Fusing NTL data, POI data and BM data yields POI_BM_NTL data. The extracted urban built-up area is used as the training sample, and U-net is used to extract the urban built-up area from the POI_BM_NTL data; the results are shown in Figure 8. The urban built-up area extracted is 1298.23 km², accounting for 95.04% of the total built-up area of Guangzhou. Compared with the other data fusion results, POI_BM_NTL first extracts a larger spatial range of urban built-up area. Second, in addition to the central urban area, the urban built-up area patches extracted in Zengcheng, Conghua and other areas are more obvious, and there is an obvious spatial correlation between the urban patches. However, there is no significant change in the number of patches or in the boundary of the built-up areas.
In general, the urban built-up area patches extracted by NTL data are complex and fragmented, while the degree of fragmentation of the built-up area extracted by POI_NTL data is improved; on the other hand, BM_NTL data tend to highlight the connections between different urban patches, while POI_BM_NTL data integrate the characteristics of the three kinds of data, making not only the degree of fragmentation but the spatial connection of the extracted built-up areas significantly improved. However, the accuracy of urban built-up areas extracted from different data needs to be further verified.
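The four area and percentage pairs reported above can be cross-checked with simple arithmetic. The sketch below divides each extracted area by its reported share to recover the reference built-up area that the percentages imply. That reference area is not stated in this section, so the result is a derived consistency check rather than a reported figure, and the function name is hypothetical.

```python
# Extracted area (km^2) and reported share of Guangzhou's built-up area,
# as quoted in the text for each input dataset.
reported = {
    "NTL": (1148.75, 0.8409),
    "POI_NTL": (1251.03, 0.9158),
    "BM_NTL": (1233.16, 0.9027),
    "POI_BM_NTL": (1298.23, 0.9504),
}


def implied_reference_area(area_km2, share):
    """Reference area for which `area_km2` equals `share` of the total."""
    return area_km2 / share


implied = {name: implied_reference_area(area, share)
           for name, (area, share) in reported.items()}
# Difference between the largest and smallest implied reference area.
spread = max(implied.values()) - min(implied.values())

for name, value in implied.items():
    print(f"{name}: implied reference area = {value:.1f} km^2")
```

All four pairs imply a reference built-up area of about 1366 km², so the reported areas and percentages are mutually consistent.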

Comparison of the Urban Built-Up Area Extracted by Different Data
The urban built-up areas extracted by NTL, POI_NTL, BM_NTL and POI_BM_NTL data are mainly concentrated in the central urban areas of Haizhu District, Liwan District and Tianhe District; fewer are distributed in Panyu District and Nansha District, and only a small number of built-up area groups are identified in Conghua District and Zengcheng District. The urban built-up areas extracted by different data (Figure 8) show that the area extracted by NTL data is seriously fragmented, with complex boundaries and a large number of voids, because NTL data have only the single attribute of nighttime light brightness. After the fusion of POI data, not only is the distribution of urban infrastructure taken into account, but the concentration of POI quantity is also considered at the boundaries and in the voids of the built-up area, which significantly improves the extracted built-up area. After the fusion of BM data with NTL data, a small number of built-up areas between urban patches are extracted by considering the potential connections between different urban patches, especially between the central urban area and peripheral areas. Finally, POI data, BM data and NTL data are fused to extract the urban built-up area, which considers both the distribution of urban infrastructure and the connections among urban patches, making the extracted urban built-up area more consistent with the actual development of the urban area.
From the extraction results, although the fusion of big data with NTL data largely makes up for the deficiency of extracting urban built-up areas from NTL data alone, the effect of POI_NTL data extraction is better than that of BM_NTL and POI_BM_NTL data, since population flow does not exist in physical geographic space and POI data and BM data represent different attributes. In theory, the accuracy of the urban built-up area extracted by POI_BM_NTL data, which fuse more data, should be significantly better than the results extracted by the other data, but the result does not achieve the expected effect. This may be due to the noise of the data itself, indicating that fusing too many data may amplify the shortcomings of the data.

Accuracy Verification of Urban Built-Up Area
This study compares the accuracy of the urban built-up areas extracted by different data in two ways: the overlap ratio between the built-up area extracted from each dataset and the built-up area of the existing land use classification, and a comparison between the built-up area extracted from each dataset and the urban built-up area extracted from high-resolution Google Earth images. The spatial distribution of the existing built-up area comes from the 2020 land use classification data of the Guangzhou built-up area, which is of high accuracy. The verification results of the built-up areas extracted by different data are shown in Table 2. For NTL data, the highest recall is 0.7768, the highest precision is 0.6544, and the highest F1 score is 0.7103. For POI_NTL data, the highest recall is 0.8811, the highest precision is 0.8084, and the highest F1 score is 0.8432. For BM_NTL data, the highest recall is 0.8905, the highest precision is 0.8053, and the highest F1 score is 0.8458. For POI_BM_NTL data, the highest recall is 0.8530, the highest precision is 0.8121, and the highest F1 score is 0.8321. The verification results based on land use and on Google Earth are similar, indicating that the urban built-up area extracted in this study has high accuracy. The results based on Google Earth are slightly higher than those based on land use, which may be because the resolution of the land use data used in this study is 30 m, while that of Google Earth is 5 m. The higher resolution of Google Earth gives it a better verification effect for small built-up patches.
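For reference, precision, recall and F1 score follow the standard definitions based on pixel-wise agreement between an extracted mask and a reference mask. The sketch below is a minimal illustration with hypothetical function names and toy masks, not the verification code of this study. It also recomputes F1 from the precision and recall values quoted above, which agree with the reported F1 scores to within rounding.

```python
def confusion_counts(predicted, reference):
    """Count true positives, false positives and false negatives
    between two binary masks given as lists of rows of 0/1."""
    tp = fp = fn = 0
    for p_row, r_row in zip(predicted, reference):
        for p, r in zip(p_row, r_row):
            if p and r:
                tp += 1
            elif p and not r:
                fp += 1
            elif r and not p:
                fn += 1
    return tp, fp, fn


def f1_score(precision, recall):
    """Harmonic mean of precision and recall (0.0 if both are zero)."""
    total = precision + recall
    return 2 * precision * recall / total if total else 0.0


def precision_recall_f1(tp, fp, fn):
    """Standard definitions: P = TP/(TP+FP), R = TP/(TP+FN)."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall, f1_score(precision, recall)


# Toy masks for illustration only.
predicted = [[1, 1, 0], [0, 1, 0]]
reference = [[1, 0, 0], [0, 1, 1]]
toy_metrics = precision_recall_f1(*confusion_counts(predicted, reference))

# (precision, recall, reported F1) as quoted in the text from Table 2.
table2 = {
    "NTL": (0.6544, 0.7768, 0.7103),
    "POI_NTL": (0.8084, 0.8811, 0.8432),
    "BM_NTL": (0.8053, 0.8905, 0.8458),
    "POI_BM_NTL": (0.8121, 0.8530, 0.8321),
}
recomputed = {name: f1_score(p, r) for name, (p, r, _) in table2.items()}
```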
From the proportion of the extracted built-up area, the extraction area of POI_NTL data is larger than that of BM_NTL data. Additionally, the proportion of built-up area extracted by NTL and BM_NTL data is smaller than that of POI_NTL and POI_BM_NTL data. From the proportion of area that coincides with the existing built-up area, the overlap area of POI_BM_NTL extracted is much larger than that of single NTL data, but the accuracy of POI_BM_NTL extracted is not significantly improved compared with POI_NTL data and BM_NTL data. On the whole, data fusion can effectively improve the accuracy of extracting urban built-up area by single NTL data.

Discussion
The traditional extraction of urban built-up areas mainly relies on statistical survey data and remote sensing data. Statistical survey data are too subjective and irreproducible to yield accurate results, while remote sensing data represented by NTL data are often affected by problems inherent in the data, such as the spillover effect of lights [21]. In short, there is a lack of scientific, objective and popularized methods for extracting urban built-up areas. In this study, POI data and BM data are first fused with NTL data to modify the NTL data; the characteristics of the different images are then extracted through deep learning to obtain different urban built-up areas. Finally, the accuracy of the different built-up areas is analyzed through accuracy verification, with the aim of exploring a simple and reliable method system for urban built-up area extraction. NTL data are used to extract urban built-up areas by capturing differences in light brightness between areas. Although previous studies that used NTL data as the main data source achieved good results, NTL data alone have gradually become unable to reflect changes in built-up areas as those areas change. Therefore, relatively few current studies use NTL data alone to extract urban built-up areas [21,62,63]; most use POI data to modify NTL data so as to obtain higher accuracy [64]. Even so, NTL data and POI data are both static data, so this study also fuses BM data to modify NTL data and explores the modification effect of data from different sources on NTL data, which has seldom been considered in previous studies [65].
Many studies have proved that the fusion of POI data and population migration data with NTL data could achieve better results in the extraction of urban built-up areas, but the fusion of more data may not necessarily lead to more accurate results, which is very important for the study of using data fusion to extract urban built-up areas.
At present, the extraction of built-up areas mainly uses the dichotomy method and the threshold method to segment image pixels, which has a high time cost and low accuracy and is especially difficult to replicate and popularize [66]. In essence, however, the extraction of built-up areas is the extraction of image features. Therefore, this study uses deep learning to learn continuously from built-up area samples so as to extract urban built-up areas more accurately [67,68]. Current methods for extracting urban built-up areas include indices such as the normalized difference vegetation index (NDVI) and the normalized difference building index (NDBI), which inevitably suffer from misjudgment when different objects share the same spectrum or the same object shows different spectra [69][70][71], thus affecting the extraction accuracy. Model-based methods such as random forest and support vector machine have the disadvantages of a limited number of training samples, simple model structure and insufficient classification accuracy in extracting urban built-up areas [72][73][74]. Therefore, compared with non-deep learning methods, the deep learning method used in this study has the advantages of higher operability, replicability and accuracy, which plays an important role in promoting the formation of a method system for urban built-up area extraction.
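To make the index-and-threshold approach concrete, the sketch below computes NDVI and NDBI from band reflectances using their standard definitions, NDVI = (NIR − Red)/(NIR + Red) and NDBI = (SWIR − NIR)/(SWIR + NIR), and applies a simple threshold rule. The threshold values and the decision rule are hypothetical and are not taken from this study or from the cited works.

```python
def normalized_difference(band_a, band_b):
    """(a - b) / (a + b), or 0.0 where the denominator is zero."""
    total = band_a + band_b
    return (band_a - band_b) / total if total else 0.0


def ndvi(nir, red):
    """Normalized difference vegetation index."""
    return normalized_difference(nir, red)


def ndbi(swir, nir):
    """Normalized difference building (built-up) index."""
    return normalized_difference(swir, nir)


def is_built_up(red, nir, swir, ndbi_threshold=0.0, ndvi_threshold=0.2):
    """Hypothetical rule: a pixel is flagged as built-up when NDBI is
    above its threshold and NDVI is below its threshold."""
    return ndbi(swir, nir) > ndbi_threshold and ndvi(nir, red) < ndvi_threshold


# Illustrative reflectance values only.
urban_like = is_built_up(red=0.20, nir=0.25, swir=0.35)
vegetation_like = is_built_up(red=0.05, nir=0.50, swir=0.20)
```

Because the rule depends only on per-pixel spectra, any two surfaces with similar band values receive the same label, which is the misjudgment problem described above.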
Compared with traditional studies on the extraction of urban built-up areas, the innovation of this study is mainly reflected in two aspects. First, dynamic and static data are each fused with NTL data, and the differences between the urban built-up areas extracted from the different fused data are then analyzed. In previous studies, we achieved good results by fusing both dynamic and static data with NTL data, and previous studies have suggested that the more data sources are fused, the greater the improvement in the final results [75,76]. However, in our comparative analysis of the urban built-up areas extracted from the different fused data, we found that fusing more data does not necessarily improve the accuracy of the extraction results. This finding was not obtained in previous multi-source data fusion studies [77,78] and may represent important progress in data fusion studies [17]. Second, in previous studies on urban built-up area extraction, including those of our team, the extraction methods still rely mainly on indexes and thresholds [66,79]. By introducing a deep learning method trained on a large number of samples that represent built-up area features, this study develops a built-up area extraction method system that is accurate, simple to operate and easy to popularize, which is also one of the possible innovations compared with the non-deep learning methods we used in the past.
However, there are still some deficiencies in this study, mainly reflected in the uncertainty of the BM data used [80]. First, BM data mainly come from smartphone users and therefore cannot reflect the movement of the elderly and children [81,82]. Second, the population migration data are essentially a simulation of population flow and differ somewhat from the actual flow of population between and within cities, which may also affect the results [83,84]. Finally, the data used differ in acquisition time. For example, the POI data and BM data used in this study are from 2021, while the Luojia-01 NTL data are from 2019. This temporal inconsistency will affect the accuracy of the results to a certain extent. Subsequent studies will therefore unify the time scale of the data in order to further improve the credibility of the results. In addition, subsequent studies will carry out spatial correction of the population migration data by means of questionnaires to further improve their quality [85]. Overall, this study proposes a new feasible method of urban built-up area extraction, which has obvious practical significance for future studies of urban built-up areas.

Conclusions
Since the urban built-up area is the direct manifestation of urbanization, accurately extracting it is of great practical value for judging the urbanization process and evaluating urban development. This study extracts urban built-up areas by fusing POI data and BM data with NTL data to modify the NTL data. The highest accuracy of the extracted urban built-up areas is 95.21%, with a highest recall of 0.8905, a highest precision of 0.8121 and a highest F1 score of 0.8458. The comparison of the different data fusion results also shows that fusing more data does not always increase the accuracy of built-up area extraction; on the contrary, it can impair the extraction. In general, this study proposes and demonstrates a feasible way of extracting urban built-up areas based on deep learning by fusing big data to modify NTL data, so that the extraction of urban built-up areas no longer relies on subjective judgment of the data. The accurate extraction of urban built-up areas not only helps to evaluate the level of urban development but also has guiding significance for subsequent urban planning.
Funding: This research is supported by the 13th Graduate School of architecture and planning of Yunnan University (no. yj20210024).

Data Availability Statement:
The data presented in this study are publicly available data sources stated in the citation. Please contact the corresponding author regarding data availability.