Land Use and Land Cover Mapping in the Era of Big Data

: We are currently living in the era of big data. The volume of collected or archived geospatial data for land use and land cover (LULC) mapping including remotely sensed satellite imagery and auxiliary geospatial datasets is increasing. Innovative machine learning, deep learning algorithms, and cutting-edge cloud computing have also recently been developed. While new opportunities are provided by these geospatial big data and advanced computer technologies for LULC mapping, challenges also emerge for LULC mapping from using these geospatial big data. This article summarizes the review studies and research progress in remote sensing, machine learning, deep learning, and geospatial big data for LULC mapping since 2015. We identiﬁed the opportunities, challenges, and future directions of using geospatial big data for LULC mapping. More research needs to be performed for improved LULC mapping at large scales.


Introduction
Accurate and timely land use and land cover (LULC) maps are important for a variety of applications such as urban and regional planning, disasters and hazards monitoring, natural resources and environmental management, and food security [1][2][3].LULC mapping may help tackle many significant large-scale challenges, such as global warming, the accelerating loss of species habitat, unprecedented population migration, increasing urbanization, and growing inequalities within and between nations [4,5].Therefore, it is important to produce accurate LULC maps.
The land use concept and the land cover concept, though related, are distinctly different [6].Land cover mainly refers to direct observations of terrestrial ecosystems, natural resources, and habitats on the Earth's surface, while land use generally describes a certain land type produced, changed or maintained by the arrangements, activities, and inputs of people.Land use relates to the purpose for which land is utilized by people, but land cover specifies landscape patterns and characteristics.Examples of land use may include multi-family residential homes, state parks, reservoirs, and shopping centers.In contrast, examples of land cover may include forests, wetlands, built areas, water, and grasslands.However, land use and land cover are often used as interchangeable terms in existing research literature.In this article, we discuss LULC mapping without making a specific differentiation between land use and land cover, and the mapping includes both.
Remotely sensed satellite imagery is a valuable source for LULC mapping [7][8][9].Many studies have attempted to extract LULC information from remotely sensed imagery [2,10].Advances in remote sensing technologies have resulted in improvements in spectral, spatial, and temporal resolutions of satellite imagery, all of which benefit LULC mapping.LULC mapping is currently experiencing a transformation from the coarse and moderate scales to much finer scales in order to provide more precise land knowledge.Although remotely sensed imagery has been used in LULC mapping since the launch of Landsat 1 in 1972 [11], Land 2022, 11, 1692 2 of 22 it is still difficult to capture complex and diverse LULC information and patterns by using remotely sensed imagery alone [12].Ancillary data are typically needed as a supplement to remotely sensed imagery in order to accurately identify LULC information, especially the land use information related to socioeconomic aspects [13].
With the development of GPS and data acquisition techniques, the merging of big data with spatial location information-such as social media data, mobile phone tracking data, public transport smart card data, Wi-Fi access point data, wireless sensor networks, and other sensing information generated by Internet of Things devices-may provide useful ancillary data for LULC mapping [14].Compared to traditional geospatial data acquisition, these geospatial big data are normally obtained at a lower cost and have different coverages and better spatio-temporal resolutions.They contain abundant human activity information and may thus be used to compensate for the lack of socioeconomic attributes of the remotely sensed imagery data for accurate LULC mapping [15].In fact, the aforementioned geospatial big data were integrated with remotely sensed imagery and other source data for accurate LULC mapping in many studies [16,17].
We are currently living in the era of big data.The volume of collected or archived geospatial data, including remotely sensed data, is increasing from terabytes to petabytes and even to exabytes [18].For example, the European Space Agency (ESA), the National Aeronautics and Space Administration (NASA), the United States Geological Survey (USGS), and the National Oceanic and Atmospheric Administration (NOAA) provide a huge amount of freely available remotely sensed data and other Geographic Information System (GIS) data for LULC mapping.Social media sites, such as Facebook, Twitter, and Instagram, are generating an enormous volume of data with geospatial location information that can be used for LULC mapping nowadays [19].Progress in data access and algorithm development in the era of big data provides opportunities for developing improved LULC maps [20].Figure 1 illustrates the major opportunities of LULC mapping in the era of big data.Databases that offer free access to LULC maps at the global scale have emerged.For example, as a free search engine, "Collect Earth" developed by the Food and Agriculture Organization (FAO) can help derive past and present LULC change information [21].
Land 2022, 11, x FOR PEER REVIEW 2 of 22 Landsat 1 in 1972 [11], it is still difficult to capture complex and diverse LULC information and patterns by using remotely sensed imagery alone [12].Ancillary data are typically needed as a supplement to remotely sensed imagery in order to accurately identify LULC information, especially the land use information related to socioeconomic aspects [13].
With the development of GPS and data acquisition techniques, the merging of big data with spatial location information-such as social media data, mobile phone tracking data, public transport smart card data, Wi-Fi access point data, wireless sensor networks, and other sensing information generated by Internet of Things devices-may provide useful ancillary data for LULC mapping [14].Compared to traditional geospatial data acquisition, these geospatial big data are normally obtained at a lower cost and have different coverages and better spatio-temporal resolutions.They contain abundant human activity information and may thus be used to compensate for the lack of socioeconomic attributes of the remotely sensed imagery data for accurate LULC mapping [15].In fact, the aforementioned geospatial big data were integrated with remotely sensed imagery and other source data for accurate LULC mapping in many studies [16,17].
We are currently living in the era of big data.The volume of collected or archived geospatial data, including remotely sensed data, is increasing from terabytes to petabytes and even to exabytes [18].For example, the European Space Agency (ESA), the National Aeronautics and Space Administration (NASA), the United States Geological Survey (USGS), and the National Oceanic and Atmospheric Administration (NOAA) provide a huge amount of freely available remotely sensed data and other Geographic Information System (GIS) data for LULC mapping.Social media sites, such as Facebook, Twitter, and Instagram, are generating an enormous volume of data with geospatial location information that can be used for LULC mapping nowadays [19].Progress in data access and algorithm development in the era of big data provides opportunities for developing improved LULC maps [20].Figure 1 illustrates the major opportunities of LULC mapping in the era of big data.Databases that offer free access to LULC maps at the global scale have emerged.For example, as a free search engine, "Collect Earth" developed by the Food and Agriculture Organization (FAO) can help derive past and present LULC change information [21].While these geospatial big data provide new opportunities, challenges remain in storing, managing, analyzing, and visualizing these data for LULC mapping [22].Geospatial big data not only have various forms but are also often associated with unstructured data that are difficult to manage [23].It is extremely difficult to integrate, While these geospatial big data provide new opportunities, challenges remain in storing, managing, analyzing, and visualizing these data for LULC mapping [22].Geospatial big data not only have various forms but are also often associated with unstructured data that are difficult to manage [23].challenges in dealing with these large and complex geospatial big data.New approaches are needed to efficiently process and analyze these data to reveal patterns, trends, and associations related to LULC mapping [24].
Lately, advanced machine learning techniques, especially deep learning (DL), have been developed for large-scale LULC mapping based on multispectral and hyperspectral satellite images or the integration of satellite imagery with other geospatial big data [25].Deep learning has demonstrated better performance compared to traditional methods, such as random forest (RF) and support vector machine (SVM), e.g., [26][27][28].Nevertheless, there are still many issues in applying advanced machine learning or deep learning for accurate LULC mapping using geospatial big data.
The Special Issue "Feature Papers for Land Innovations-Data and machine learning" targets contributions to spatial data science for obtaining, processing, analyzing, harnessing, and visualizing social, economic, environmental, and other land-related data.Particularly, the Special Issue focuses on research in geospatial artificial intelligence and machine learning techniques for dealing with spatial big data.This includes remotely sensed data and social media data.A number of literature review articles related to LULC mapping have been published in the fields of remote sensing, machine learning, deep learning, and geospatial big data since 2015, e.g., [12,13,15,16,18,[29][30][31][32][33][34][35][36][37][38][39][40][41][42][43][44][45][46][47].This article summarizes these recent review studies and recent research progress in remote sensing, machine learning, deep learning, and geospatial big data for the Special Issue.As a review article for this Special Issue, the purpose of this paper is to briefly review LULC mapping in the big data era.The method and materials are briefly introduced in Section 2. LULC mapping using remotely sensed imagery is reviewed in Section 3. LULC mapping by integrating geospatial big data and remotely sensed imagery is examined in Section 4. Advanced machine learning, deep learning, and cloud computing for large-scale LULC mapping are summarized in Section 5. Challenges and future directions for the use of geospatial big data for LULC mapping are identified in Section 6.Finally, a brief conclusion is provided in Section 7 at the end of the paper.

Method and Materials
We conducted a search of "review articles" from 2015 to now using Google Scholar with combinations of the following keywords: "land use", "land cover", "mapping", "classification", "remote sensing", "geospatial big data", "deep learning", "machine learning", "cloud computing", and "cyberinfrastructure".In case we missed some review articles, we also carried out another search of "any type articles" from 2015 to now using Google Scholar based on combinations of the above keywords.We sorted the search results by relevance.We went through the searched results and selected papers based on their scopes, objectives, and characteristics.Table S1 lists the cited publications since 2015 based on five grouped themes associated with LULC mapping: meta-analysis, remote sensing, big data, machine learning (especially deep learning), and advanced cyberinfrastructure (especially cloud computing).Please note: * indicates a review paper.

LULC Mapping from Remotely Sensed Imagery Data
As mentioned previously, remote sensing has become one of the most important methods for LULC mapping [48,49].Many existing LULC maps were made by the classification of remotely sensed satellite imagery data [50].Remotely sensed data have multi-source, multi-scale, high-dimension, and non-linear characteristics [51].Since the advent of remote sensing technology, many satellites have been launched.Every day, a large set of spaceborne and airborne sensors provide a massive amount of remotely sensed data.At present, there are more than 200 on-orbit satellite sensors capturing a large amount of multi-temporal and multi-scale remotely sensed data.For example, NASA's Earth Observing System Data and Information System (EOSDIS) managed more than 7.5 petabytes of archived remotely sensed data and archived a daily data increase of four TB in 2013 [52].Many satellite imagery data providers release timely remotely sensed data to the public In the past, many LULC maps were made from coarse spatial resolution satellite imagery data such as advanced very-high-resolution radiometer (AVHRR) and moderateresolution imaging spectroradiometer (MODIS) [53].Advances in remote sensing technology and the launch of sensors with moderate spatial resolutions, such as Landsat, Satellite Pour l'Observation de la Terre (SPOT), and Advanced Spaceborne Thermal Emission and Reflection Radiometer (ASTER), have contributed to enhanced LULC mapping, e.g., [54,55].Lately, detailed LULC maps have been produced from high-resolution imagery data such as QuickBird, IKONOS, and WorldView, which can provide more detailed spatial and spectral information for LULC mapping, e.g., [56][57][58].With these high-resolution remotely sensed data, it is possible to identify the detailed geometries, textures, sizes, locations, and adjacent information of ground objects at a much finer scale for LULC mapping [59].
In addition to the different spatial resolutions, the remotely sensed data for LULC mapping also have different spectral and temporal resolutions.Many satellite sensors produce imagery data with very-high spectral resolutions [32].For example, the Hyperion sensor consists of 220 spectral bands, the AVIRIS system provides 224 spectral bands, the WIS instrument has 812 bands, and the hyperspectral imager equipped in HJ-1A has 128 bands.Furthermore, remotely sensed data may come from different types of satellites.Some satellites use optical sensors such as SPOT, Landsat, and IKONOS; some use microwave synthetic aperture radar (SAR) sensors such as TerraSAR, Envisat, and RADARSAT; while others use multi-mode sensors such as MODIS.While the optical satellite imagery data face challenges in producing LULC maps under cloudy weather conditions, microwave SAR data allow LULC mapping under all weather conditions, including the constantly cloudy weather situation [60][61][62].From a temporal resolution perspective, these satellites also have different capabilities to revisit an observation area.Some satellites have a short revisit period of one day (e.g., MODIS and WorldView), while other satellites have a long revisit period of 16 days (e.g., Landsat).Figure 2 shows different types of remotely sensed satellite imagery for LULC mapping.Teeuw et al. [63], Navin and Agilandeeswari [64], and Pandey et al. [40] provided detailed tables for the characteristics of different types of remotely sensed data.
Earth Observing System Data and Information System (EOSDIS) managed more than 7.5 petabytes of archived remotely sensed data and archived a daily data increase of four TB in 2013 [52].Many satellite imagery data providers release timely remotely sensed data to the public without any cost.USGS, NASA, NOAA, IPMUS Terra, NEO, and Copernicus open access hubs are among the most popular open access remotely sensed data providers.
In the past, many LULC maps were made from coarse spatial resolution satellite imagery data such as advanced very-high-resolution radiometer (AVHRR) and moderateresolution imaging spectroradiometer (MODIS) [53].Advances in remote sensing technology and the launch of sensors with moderate spatial resolutions, such as Landsat, Satellite Pour l'Observation de la Terre (SPOT), and Advanced Spaceborne Thermal Emission and Reflection Radiometer (ASTER), have contributed to enhanced LULC mapping, e.g., [54,55].Lately, detailed LULC maps have been produced from highresolution imagery data such as QuickBird, IKONOS, and WorldView, which can provide more detailed spatial and spectral information for LULC mapping, e.g., [56][57][58].With these high-resolution remotely sensed data, it is possible to identify the detailed geometries, textures, sizes, locations, and adjacent information of ground objects at a much finer scale for LULC mapping [59].
In addition to the different spatial resolutions, the remotely sensed data for LULC mapping also have different spectral and temporal resolutions.Many satellite sensors produce imagery data with very-high spectral resolutions [32].For example, the Hyperion sensor consists of 220 spectral bands, the AVIRIS system provides 224 spectral bands, the WIS instrument has 812 bands, and the hyperspectral imager equipped in HJ-1A has 128 bands.Furthermore, remotely sensed data may come from different types of satellites.Some satellites use optical sensors such as SPOT, Landsat, and IKONOS; some use microwave synthetic aperture radar (SAR) sensors such as TerraSAR, Envisat, and RADARSAT; while others use multi-mode sensors such as MODIS.While the optical satellite imagery data face challenges in producing LULC maps under cloudy weather conditions, microwave SAR data allow LULC mapping under all weather conditions, including the constantly cloudy weather situation [60][61][62].From a temporal resolution perspective, these satellites also have different capabilities to revisit an observation area.Some satellites have a short revisit period of one day (e.g., MODIS and WorldView), while other satellites have a long revisit period of 16 days (e.g., Landsat).Figure 2 shows different types of remotely sensed satellite imagery for LULC mapping.Teeuw et al. [63], Navin and Agilandeeswari [64], and Pandey et al. [40] provided detailed tables for the characteristics of different types of remotely sensed data.As illustrated in Figure 3, more types of remotely sensed data have emerged to provide additional observations for LULC mapping [51].These remotely sensed data provide observations to differentiate LULC types with complex structures, which are difficult to Land 2022, 11, 1692 5 of 22 differentiate in the past.For example, as a unique measure of human activities and socioeconomic attributes, remote sensing-based nighttime lights (NTL) imagery is especially useful for urban LULC mapping at different spatial and temporal scales, e.g., [65,66].Light detection and ranging (Lidar) is another type of remotely sensed data for detailed LULC mapping, e.g., [67].Unlike optical data, airborne Lidar data can capture highly accurate structural information to differentiate LULC types with different structures, components, and compositions [68].In addition, street-view imagery from Google, Baidu, and Tencent also functions as an additional type of remotely sensed data for LULC mapping, e.g., [69,70].In contrast to the overhead view captured by most other remote sensing methods, streetview imagery data provide street-level or eye-level observations along the road networks.By providing information about what people typically see at street level on ground, streetview imagery data provide crucial information on the functions of objects conventionally hidden from the view above, e.g., [71].For example, street-view imagery data have been used for level II or III land use classification (e.g., differentiation of commercial buildings and residential buildings by using text information on buildings from street-view imagery data [72].Street-view imagery can also be used for ground truth purposes.Recently, unoccupied aerial system (UAS) platforms with small-sized and high-detection-precision sensors have also started producing massive high-resolution images as well, and have been extensively used for high-resolution LULC mapping, e.g., [73,74].Currently, the amount of data collected by UAS is about to explode.As illustrated in Figure 3, more types of remotely sensed data have emerged to provide additional observations for LULC mapping [51].These remotely sensed data provide observations to differentiate LULC types with complex structures, which are difficult to differentiate in the past.For example, as a unique measure of human activities and socio-economic attributes, remote sensing-based nighttime lights (NTL) imagery is especially useful for urban LULC mapping at different spatial and temporal scales, e.g., [65,66].Light detection and ranging (Lidar) is another type of remotely sensed data for detailed LULC mapping, e.g., [67].Unlike optical data, airborne Lidar data can capture highly accurate structural information to differentiate LULC types with different structures, components, and compositions [68].In addition, street-view imagery from Google, Baidu, and Tencent also functions as an additional type of remotely sensed data for LULC mapping, e.g., [69,70].In contrast to the overhead view captured by most other remote sensing methods, street-view imagery data provide street-level or eye-level observations along the road networks.By providing information about what people typically see at street level on ground, street-view imagery data provide crucial information on the functions of objects conventionally hidden from the view above, e.g., [71].For example, street-view imagery data have been used for level II or III land use classification (e.g., differentiation of commercial buildings and residential buildings by using text information on buildings from street-view imagery data [72].Street-view imagery can also be used for ground truth purposes.Recently, unoccupied aerial system (UAS) platforms with small-sized and high-detection-precision sensors have also started producing massive high-resolution images as well, and have been extensively used for high-resolution LULC mapping, e.g., [73,74].Currently, the amount of data collected by UAS is about to explode.Because of the diversity and high dimensionality of remotely sensed data, LULC mapping from remotely sensed big data becomes complex.It is challenging to identify the right datasets and combine them to make detailed LULC maps at large scales.Although the multi-source optical and microwave remotely sensed data allow us to obtain LULC information from multiple viewpoints, they sometimes cause confusion in deciding which type is the most appropriate for particular LULC mapping.In addition, because of the data representation challenge, it is difficult to integrate the various remotely sensed data with different features (e.g., spectral signatures in optical imagery and electromagnetic radiations in microwave imagery) from various sources.Traditional pixel-level, feature- Because of the diversity and high dimensionality of remotely sensed data, LULC mapping from remotely sensed big data becomes complex.It is challenging to identify the right datasets and combine them to make detailed LULC maps at large scales.Although the multi-source optical and microwave remotely sensed data allow us to obtain LULC information from multiple viewpoints, they sometimes cause confusion in deciding which type is the most appropriate for particular LULC mapping.In addition, because of the data representation challenge, it is difficult to integrate the various remotely sensed data with different features (e.g., spectral signatures in optical imagery and electromagnetic radiations in microwave imagery) from various sources.Traditional pixel-level, featurelevel, and decision-level fusion cannot be used to integrate remotely sensed imagery with different scales and/or formats [18].New approaches need to be developed to fuse remotely sensed imagery with other geospatial big data, such as photos from a social network and crowdsourcing spatial data, for LULC mapping.

LULC Mapping from Integration of Geospatial Big Data and Remotely Sensed Imagery Data
Although remotely sensed data have become one of the most important data sources for LULC mapping, these have limitations [42,75].Remotely sensed data are valuable to extract natural and physical land cover information based on spectrum, texture, geometry, context, and temporal information, but they have limitations in capturing the patterns of human activities and socioeconomic environments and describing indirect anthropogenic differences among different land use classes [13].For example, while the spectral information of remotely sensed imagery data is effective to extract land cover information such as water area, forest land, and built-up area, it is almost impossible to distinguish some land use classes such as some industrial land, residential land, and commercial land using the spectral information of remotely sensed data alone [72].
With the development of mobile positioning techniques, wireless communication, and the Internet of Things, new emerging types of social sensing big data are providing complementary information to differentiate some land use classes caused by human activities and socioeconomic environments [9].Examples of these emerging social sensing big data include mobile phone data, geo-tagged photos, social media data, traffic trajectories, and volunteered geographic information (VGI) data [76].These emerging social sensing big data are able to more effectively capture human activities and dynamic socioeconomic environments, and are regarded as complements of remotely sensed imagery data for effectively LULC mapping [77,78].For example, Geo-Wiki is a crowdsourcing platform for LULC mapping and other tasks, which was used to derive the global LULC reference data via four campaigns [79].Flickr offers online services for the sharing of digital photos with geographic locations based on social networks, which was used to identify socioeconomic and human activities in LULC mapping [19].OpenStreetMap (OSM) (as a VGI database) allows the adding, editing, and updating of basic geographic map information with users' experience and knowledge, which was also used to uncover some land use types and patterns, e.g., [80,81].The points of interest (POIs), as one of the most common categories of crowdsourced data, were explored for land use classification by many scholars, e.g., [82][83][84].In addition, a large amount of GPS traffic trajectory data also further enriched the remotely sensed data in excavating human activities at a fine scale for accurate LULC mapping [85].
These emerging social sensing big data improved the existing LULC maps by providing more detailed socioeconomic information and finer spatio-temporal resolutions [86].Many studies have been conducted to integrate the social sensing big data with remotely sensed data for LULC mapping at different scales and locations, e.g., [87][88][89].For example, Hu et al. [90] developed a protocol to identify urban land use functions over large areas using satellite images and open social data.Yin et al. [91] employed both the decision-level integration and feature-level integration of remotely sensed data with social sensing big data for urban land use mapping.Integrating data from these social sensing big data with remotely sensed data may provide a more comprehensive picture of LULC patterns, as shown in Figure 4.
In addition to the integration of remotely sensed data and social sensing big data for LULC mapping, other auxiliary datasets may also be used for LULC mapping [92].For example, census data including demography, employment, education, housing, and income information may provide valuable information to reveal spatial differences in socioeconomic statures across different land use types, e.g., [24].Municipal data such as water consumption data may offer important information to identify the socioeconomic functions of land uses and help classify mixed patterns of land uses [93].In addition, topographic information such as elevation, slope, and aspect information extracted from digital elevation or digital terrain models (DEMs/DTMs) may also be combined with remotely sensed data to increase the accuracy of urban land use classification, e.g., [94].
Ubiquitous sensor networks can constantly obtain spatio-temporal data in days, hours, minutes, seconds, or even milliseconds.These spatio-temporal data allow people to acquire multi-dimensional dynamic information about various land entities and human activities, which may be used for making or updating LULC maps.LULC mapping is expanding from professional aspects to public aspects with the development of The Internet of Things (IoT) and Volunteered Geographic Information (VGI), as evidenced by Geo-Wiki and My Maps feature in Google Maps.However, the non-professional characteristics of IoT and VGI often make the data obtained from them contain data uncertainty such as data loss, noise, inconsistency, and ambiguity [95].Therefore, it is important to develop quality assurance procedures such as data cleaning and quality inspection for high-quality LULC mapping.In addition to the integration of remotely sensed data and social sensing big data for LULC mapping, other auxiliary datasets may also be used for LULC mapping [92].For example, census data including demography, employment, education, housing, and income information may provide valuable information to reveal spatial differences in socioeconomic statures across different land use types, e.g., [24].Municipal data such as water consumption data may offer important information to identify the socioeconomic functions of land uses and help classify mixed patterns of land uses [93].In addition, topographic information such as elevation, slope, and aspect information extracted from digital elevation or digital terrain models (DEMs/DTMs) may also be combined with remotely sensed data to increase the accuracy of urban land use classification, e.g., [94].
Ubiquitous sensor networks can constantly obtain spatio-temporal data in days, hours, minutes, seconds, or even milliseconds.These spatio-temporal data allow people to acquire multi-dimensional dynamic information about various land entities and human activities, which may be used for making or updating LULC maps.LULC mapping is expanding from professional aspects to public aspects with the development of The Internet of Things (IoT) and Volunteered Geographic Information (VGI), as evidenced by Geo-Wiki and My Maps feature in Google Maps.However, the non-professional characteristics of IoT and VGI often make the data obtained from them contain data uncertainty such as data loss, noise, inconsistency, and ambiguity [95].Therefore, it is important to develop quality assurance procedures such as data cleaning and quality inspection for high-quality LULC mapping.
It is still challenging to integrate multi-source remotely sensed data, social sensing big data, and other auxiliary datasets for LULC mapping because of intensive computing and the heterogeneity in spatial data structures, formats, resolutions, scales, and data quality.Novel machine learning including deep learning and cloud computing approaches are urgently needed for LULC mapping.It is still challenging to integrate multi-source remotely sensed data, social sensing big data, and other auxiliary datasets for LULC mapping because of intensive computing and the heterogeneity in spatial data structures, formats, resolutions, scales, and data quality.Novel machine learning including deep learning and cloud computing approaches are urgently needed for LULC mapping.

Machine Learning and Cloud Computing for LULC Mapping
Machine learning is a data analysis method and a subset of artificial intelligence based on the idea that computer systems can learn from data to identify patterns and make decisions with minimal human intervention.There are many different machine learning approaches for LULC mapping [96,97], such as support vector machine (SVM), random forest (RF), and K-nearest neighborhood (KNN).The strengths of machine learning include the capacity to handle data of high dimensionality and to map LULC classes with very complex characteristics.With growing volumes and varieties of the available aforementioned remotely sensed imagery and geospatial big data, cheaper and more powerful computational processing tools, and affordable data storage, machine learning has become more popular than ever for analyzing bigger and more complex data and delivering more accurate LULC mapping results at larger scales [10].Machine learning provides the foundation for autonomously solving data-based LULC mapping problems [98].
Supervised learning, unsupervised learning, and semi-supervised learning are the three main types of machine learning methods for LULC mapping, as shown in Figure 5. Supervised learning algorithms are trained using labeled LULC examples and apply what has been learned in the labeled LULC example data to predict the labels of new LULC data.By inferring methods such as regression and gradient boosting, supervised learning methods use patterns to predict the values of the labels on unlabeled LULC data [99].Popular supervised learning methods include support vector machine (SVM), random forest (RF), classification and regression tree (CART), radial basis function (RBF), decision tree (DT), multilayer perception (MLP), naive Bayes (NB), maximum likelihood classifier (MLC), and fuzzy logic.Unsupervised learning algorithms are used with data that have no historical LULC labels and computers infer a function to describe a hidden structure from unlabeled LULC data.Unsupervised learning methods are used when it is unclear what the LULC mapping results will look like and computers need to dig through hidden layers of LULC data and cluster data together based on the similarities or differences of LULC classes.Popular unsupervised learning methods include self-organizing maps, k-means clustering, nearest-neighbor mapping, affinity propagation (AP) cluster algorithm, ISODATA (iterative self-organizing data), and fuzzy c-means algorithms.Semi-supervised learning is similar to supervised learning.However, it uses both labeled and unlabeled data for training-usually a small amount of labeled data with a large amount of unlabeled data.more accurate LULC mapping results at larger scales [10].Machine learning provides the foundation for autonomously solving data-based LULC mapping problems [98].
Supervised learning, unsupervised learning, and semi-supervised learning are the three main types of machine learning methods for LULC mapping, as shown in Figure 5. Supervised learning algorithms are trained using labeled LULC examples and apply what has been learned in the labeled LULC example data to predict the labels of new LULC data.By inferring methods such as regression and gradient boosting, supervised learning methods use patterns to predict the values of the labels on unlabeled LULC data [99].Popular supervised learning methods include support vector machine (SVM), random forest (RF), classification and regression tree (CART), radial basis function (RBF), decision tree (DT), multilayer perception (MLP), naive Bayes (NB), maximum likelihood classifier (MLC), and fuzzy logic.Unsupervised learning algorithms are used with data that have no historical LULC labels and computers infer a function to describe a hidden structure from unlabeled LULC data.Unsupervised learning methods are used when it is unclear what the LULC mapping results will look like and computers need to dig through hidden layers of LULC data and cluster data together based on the similarities or differences of LULC classes.Popular unsupervised learning methods include self-organizing maps, kmeans clustering, nearest-neighbor mapping, affinity propagation (AP) cluster algorithm, ISODATA (iterative self-organizing data), and fuzzy c-means algorithms.Semi-supervised learning is similar to supervised learning.However, it uses both labeled and unlabeled data for training-usually a small amount of labeled data with a large amount of unlabeled data.Recent advances in machine learning for LULC mapping have been accomplished via deep learning approaches [100,101].As illustrated in Figure 6, deep learning is a subfield of machine learning.All deep learning is machine learning, but not all machine learning is deep learning.Deep learning emerged because shallow machine learning cannot successfully analyze big data for LULC mapping.While basic machine learning models do become progressively better at performing their specific functions as they take in new emergent data, they still need some human intervention.Deep learning algorithms in layers can build an "artificial neural network" (Figure 7) that is able to learn and make intelligent classification decisions on its own [102].Figure 8 illustrates the differences between traditional machine learning and deep learning.For traditional machine learning, feature extraction and classification are separate processes and humans are needed to perform feature extraction.With a deep learning model, feature extraction is integrated with classification and a classification algorithm can determine whether a class prediction is accurate through its own neural network-beyond the training data and without requiring human help.Deep learning algorithms can be considered as a both sophisticated and mathematically complex evolution of machine learning algorithms [103].Deep learning algorithms analyze data with a logic structure similar to how a human would draw LULC mapping conclusions.When fed training data, deep learning algorithms would eventually learn from their own errors whether a LULC class prediction is good or whether it needs to adjust.
shape, size, or color of the objects are.Therefore, CNNs are well suited for LULC mapping [106].Some popular CNN architectures used in the literature are LeNet5, AlexNet, VGG-Net, CaffeNet, GoogLeNet, and ResNet models.RNNs have built-in feedback loops that allow the algorithms to "remember" past data points.RNNs can use this memory of past events to inform their understanding of current events or even predict the future.RNNs are mainly designed to process time series data and are suitable to detect LULC changes [29].By performing complex abstractions over data through a hierarchical learning process, deep leaning algorithms have shown great potential for analyzing big datasets for LULC mapping [16].The hidden layers in deep leaning approaches can discover class structures and patterns in big data and extract valuable class knowledge.Deep learning   By performing complex abstractions over data through a hierarchical learning process, deep leaning algorithms have shown great potential for analyzing big datasets for LULC mapping [16].The hidden layers in deep leaning approaches can discover class structures and patterns in big data and extract valuable class knowledge.Deep learning In recent studies, deep learning outperformed other machine learning algorithms in some LULC mapping problems, particularly in detecting fine-scale types such as small artificial objects [29,104,105].Deep learning algorithms have been used to automatically extract spatial features from very-high-resolution satellite images such as IKONOS, WorldView-3, and SPOT-5, e.g., [101].
There are several different types of deep learning algorithms for LULC mapping, among which the most popular algorithms include convolutional neural networks (CNNs) and recurrent neural networks (RNNs).CNNs are some of the most popular neural network architectures because they can extract low-level features with a high-frequency spectrum, such as the edges, angles, and outlines of LULC objects, whatever the shape, size, or color of the objects are.Therefore, CNNs are well suited for LULC mapping [106].Some popular CNN architectures used in the literature are LeNet5, AlexNet, VGGNet, CaffeNet, GoogLeNet, and ResNet models.RNNs have built-in feedback loops that allow the algorithms to "remember" past data points.RNNs can use this memory of past events to inform their understanding of current events or even predict the future.RNNs are mainly designed to process time series data and are suitable to detect LULC changes [29].
By performing complex abstractions over data through a hierarchical learning process, deep leaning algorithms have shown great potential for analyzing big datasets for LULC mapping [16].The hidden layers in deep leaning approaches can discover class structures and patterns in big data and extract valuable class knowledge.Deep learning is also able to handle nonlinear and highly complex big data more effectively than conventional machine learning methods [43,100].However, compared to traditional machine learning approaches, deep learning requires a vast amount of training data and substantial computing power [27].A deep learning algorithm requires much more data than a traditional machine learning algorithm to properly conduct LULC mapping.Due to the complex multi-layer structure, a deep learning system needs a large training dataset to eliminate fluctuations and make high-quality class interpretations [43].Without a large set of training data, deep learning may show a similar or worse performance than classical machine learning techniques such as SVM [107].
The emergence of cloud computing infrastructure and high-performance GPUs (graphic processing units, used for faster calculations) helped to solve the expensive computational problem faced by deep learning [108].The storage and processing requirements of big data for LULC mapping are greater than that available in traditional computer systems and technologies [109].The existing cluster-based high-performance computing (HPC) with plenty of computational capacities can be used for storing large remotely sensed data and other big data for LULC mapping.However, it is still challenging to process these big remotely sensed data and other big data for large-scale LULC mapping because system architectures and the tools of the existing cluster-based HPC have not been optimized to process such data.The cluster systems or peta-scale supercomputers are not good at loading, transferring, and processing extremely big remotely sensed data and other data for LULC mapping.A potential solution to this problem is cloud computing.Cloud computing satisfies the two main requirements of LULC mapping using big data analytics solutions: (1) scalable storage that can accommodate growing data; and (2) a high processing capability that can run complex LULC mapping tasks in a timely manner.Cloud computing makes deep learning more accessible, making it easier to manage large datasets and train algorithms for distributed hardware, and deploy them efficiently [110].It provides access to special hardware configures, including GPUs, field-programmable gate arrays (FPGAs), TensorFlow processing units (TPUs), and massively parallel high-performance computing (HPC) systems.
Cloud computing has been used for storing big remotely sensed data and other data for LULC mapping with good scalability [111].Three main types of cloud computing services have been used [112]: (1) infrastructure-as-a-service (IaaS), which allows renting IT infrastructures.Servers, virtual machines with storages, networks, and operating systems are completely provided and managed by a cloud provider.Users can pay for what they use; (2) platforms-as-a-service (PaaS): this service is an on-demand style of service where users can obtain a complete development environment required for software applications; (3) software-as-a-service (SaaS): using SaaS, it is possible to deliver software applications over the Internet, such as the 'on demand' or 'subscription' services.
Google AppEngine, Microsoft Azure, and Amazon EC2 are the most popular cloud providers and offer pay-as-you-go clouding computing for storing, processing, and visualizing big remotely sensed data and other data for LULC mapping.GoogleTM developed a geospatial data analysis platform-Google Earth Engine (GEE)-capable of storing and analyzing vast amounts of remotely sensed data for rapid LULC mapping at large scales [113,114].GEE provides users with free access to numerous remotely sensed datasets including Landsat, Sentinel, and MODIS images.GEE has already proven its capacities for LULC classification and change detection, e.g., [115][116][117][118][119][120][121][122][123][124].Microsoft Azure Cloud Services and Amazon Web Services (AWS) have also been used to improve LULC mapping and monitoring [125].Microsoft Azure Cloud Services have established artificial intelligence (AI) for an Earth initiative to address environmental challenges.However, Azure only offers Landsat and Sentinel-2 products for North America and MODIS imagery.Amazon Web Services offer open data from more satellites such as Sentinel-1, Sentinel-2, Landsat-8, and China-Brazil Earth Resources Satellite program (CBERS-4), NOAA image datasets, as well as global model outputs.
In addition to machine learning and cloud computing approaches, other advanced cyberinfrastructure techniques, such as novel scalable parallel file systems capable of storing and managing massive data, and NoSQL (Not Only SQL) databases for managing big unstructured or non-relational data have also been developed for LULC mapping with complex characteristics [126].

Challenges and Future Research Directions
Despite recent progress, LULC mapping continues to face challenges.There are still many issues remaining to be further explored for LULC mapping, as shown in Figure 9. First, fine-scale LULC maps with global coverage remain scarce, particularly for developing countries, many of which are experiencing rapid LULC changes.Although many global LULC maps have been developed by different agencies, most of these maps have coarse spatial resolutions [127,128].For example, the NASA MCD12Q1 dataset has a 500 m resolution, the LULC maps from the European Space Agency (ESA) Climate Change Initiative (CCI) dataset have a 300 m resolution, and the Copernicus Global Land Service (CGLS) Land Cover dataset has a 100 m resolution.There is a lack of detailed First, fine-scale LULC maps with global coverage remain scarce, particularly for developing countries, many of which are experiencing rapid LULC changes.Although many global LULC maps have been developed by different agencies, most of these maps have coarse spatial resolutions [127,128].For example, the NASA MCD12Q1 dataset has a 500 m resolution, the LULC maps from the European Space Agency (ESA) Climate Change Initiative (CCI) dataset have a 300 m resolution, and the Copernicus Global Land Service (CGLS) Land Cover dataset has a 100 m resolution.There is a lack of detailed LULC maps at the global scale.To the best of our knowledge, the Esri global LULC Maps and the European Space Agency (ESA) WorldCover LULC Maps are the only available global LULC maps with a high (10 m) resolution.The Esri global LULC Maps were derived from ESA optical earth observation data Sentinel-2 imagery for 2018~2022.The ESA WorldCover LULC Maps were produced based on the use of both SAR data Sentinel-1 and optical data Sentinel-2 for 2020.However, there were no historical global LULC maps with a high resolution.Spatial resolution influences various aspects of landscape classification and may significantly affect landscape metrics and landscape pattern analysis [129,130].Although very-high-resolution images are available for developing detailed LULC maps, they have proven challenging for creating global maps because of the high cost associated with these very-high-resolution images and their incomplete data coverage and small spatial extent (one image only covers a very small study area).In addition, variations of radiometric properties among different sensors, the influence of different acquisition conditions, and different classic atmospheric perturbations also cause challenges in using these very-high-resolution images for detailed LULC mapping at the global scale.Obtaining cloud-free images is challenging and there are often unavailable data in certain seasons, times, or locations.Therefore, there is a lack of well-annotated fine-scale LULC maps at the global scale and even at the country level for some developing countries.Because high-resolution images have only been available recently, there is a lack of well-annotated fine-scale historical LULC maps for change detection.
Second, existing LULC datasets are often inconsistent and variable in time, space, formats, formal validation, or map legends [23,131].There are various or inconsistent definitions of LULC classes.Different methods and incompatible classification systems are used by different source agencies [132].It is difficult to compare different legend information from various classification schemes.The land surface is heterogeneous and the mapping standards to acquire, represent, and generalize land characteristics are about as diverse as the land surface itself [133,134].For example, according to Cruz et al. [135] in Wisconsin, counties and municipalities may maintain different land use codes for their land parcels.This case is particularly interesting because both the city of Madison and the Fitchburg Township are in the same county-Dane county.LULC data have variations in the semantic contents among different research projects and research teams [133].In addition, different spectral and spatial sensor characteristics, acquisition geometries or illumination conditions, or atmospheric settings also lead to inconsistencies in developing LULC products derived from multi-sensor approaches.It is, therefore, still challenging to combine different LULC products for practices or other applications.
Third, despite more observations from very-high-resolution satellite images, crowd sources and other geospatial big data, accurate training sample data for advanced machine learning or deep learning still remain comparably scarce [43,136].Currently, field surveys, the visual interpretation of high-resolution images, crowdsourcing technology, and existing labeled land use datasets are common methods to obtain reference or training sample data for LULC mapping.However, all of these methods have some limitations.For example, many sample data generated through visual interpolation contain errors.Field surveys are the most accurate method of generating training sample data; however, it is a labor-intensive and time-consuming task.It is expensive to obtain ground truth data via field survey.In addition, the field survey cannot be conducted in all locations because of inaccessibility issues.
Crowdsourcing technology is one of the latest techniques to obtain sample data, but the spatial and temporal sample data created by this technology may be ad hoc, the quality may be highly variable, and some sample data may contain many uncertainties and errors.The existing labeled LULC datasets can help obtain sample data but they normally have low accuracies and coarse resolutions.Therefore, all of the common methods of obtaining sample data for LULC mapping have some issues.There is a lack of reliable training sample data for LULC mapping using advanced machine learning or deep learning approaches for many locations of the world, especially for some developing countries.In addition, some existing sample data may have imbalance issues [137].For example, the sample data produced by crowdsourcing technology or geospatial big data such as social media data or Google Street View data only represent human activities well in urban areas [138].These would perform poorly in measuring land use classes that represent the human activity characteristics in rural areas, where there are low populations and activity densities.Nevertheless, advanced machine learning or deep learning classification methods need a large set of reliable and balanced training sample data, which should cover different classes and areas well, to produce accurate maps, because these methods need to train, test, and classify LULC classes based on the training sample data.A large number of parameters used by these advanced classification methods need to be fine-tuned using a great amount of training sample data.However, obtaining sufficient high-quality training sample data remains a critical issue.
Fourth, mixed LULC pixels also pose great challenges for LULC mapping [16].Because of the insufficient spatial resolution of remotely sensed imagery, it is quite difficult to differentiate the mixed LULC pixels in the past.With advances in remote sensing techniques, the very-high-resolution images with distinct spatial, temporal, spectral, radiometric, and angular characteristics are emerging and they are available for detailed LULC mapping.However, the mixed LULC problem still exists because LULC classification algorithms compatible with these super-high-resolution multispectral images are still underdeveloped.Various subpixel analysis approaches, such as variations in spectral mixture analysis (SMA), support vector machine (SVM), import vector machine (IVM), convolutional neural networks (CNNs), and deep learning-based subpixel mapping network (DLSMNet) have been developed to solve the mixed LULC problem, e.g., [139][140][141][142].However, these subpixel analysis approaches have still experienced some difficulties in handling the spectral heterogeneity of diverse landscape features.Recent studies have been working on addressing the mixed LULC problem by combining remotely sensed imagery with other ancillary data such as road network data, social sensing data, as well as other environmental and socioeconomic data [143].However, the data availability limited the large-scale implementation of these kinds of approaches.Although advanced deep learning approaches can transform multispectral image pixels into high-level abstract features and thus may reduce some mixed pixel problems, the model interpretability of these deep learning approaches is a big challenge due to the "black box" nature of the training procedure.In addition, the model transferability is also an important issue faced by these advanced machine learning or deep learning approaches.Because LULC classes are different across different regions, the model developed for one study area using local training data may not be appropriate for classification in other study areas.
Based on the review, although a large set of LULC maps exist, these maps are often connected to considerable uncertainty due to the positional inaccuracy, unreliable input data, and processing algorithms limitations.More research is needed in the future to develop improved LULC maps.A few general recommendations for future research directions of LULC mapping, as illustrated in Figure 10, are suggested in the paragraphs below.
First, more research on fusing multiple big datasets from different sources and across scales for detailed LULC mapping at large scales needs to be done.Specifically, the joint use of multipl150e remotely sensed datasets (lidar, radar, and optical data), big geospatial data such as social media data, other physical and socioeconomic big data such as census data, and sampling survey data may help improve LULC mapping at multiple scales.Big data fusion is necessary for obtaining the full picture of LULC situation.The fused big datasets will typically have better spatial, temporal, spectral, as well as radiometric resolutions and coverages.However, these multiple big datasets usually have the aforementioned heterogeneous problems, such as different spatial and temporal resolutions, formats, and semantics.Therefore, it is usually difficult to ensure that the satisfactory mapping results in a heterogeneous environment.Because of the variances in shape, color, size, and other properties of objects for each class, it is a challenge to use a universal scale parameter for classification.A cross-scale mapping strategy that can is compatible across pixel-, object-, and parcel-scale may be highly necessary.A cross-scale mapping strategy may solve the aforementioned mixed LULC problem via diverse outputs of classification maps that can include the compositions and proportions of mixed LULC pixels.The cross-scale approach, however, may require a great deal of time and effort to determine the appropriate parameters.More research needs to be performed using the cross-scale mapping strategy to fuse multiple big datasets from different sources.
combining remotely sensed imagery with other ancillary data such as road network data, social sensing data, as well as other environmental and socioeconomic data [143].However, the data availability limited the large-scale implementation of these kinds of approaches.Although advanced deep learning approaches can transform multispectral image pixels into high-level abstract features and thus may reduce some mixed pixel problems, the model interpretability of these deep learning approaches is a big challenge due to the "black box" nature of the training procedure.In addition, the model transferability is also an important issue faced by these advanced machine learning or deep learning approaches.Because LULC classes are different across different regions, the model developed for one study area using local training data may not be appropriate for classification in other study areas.
Based on the review, although a large set of LULC maps exist, these maps are often connected to considerable uncertainty due to the positional inaccuracy, unreliable input data, and processing algorithms limitations.More research is needed in the future to develop improved LULC maps.A few general recommendations for future research directions of LULC mapping, as illustrated in Figure 10, are suggested in the paragraphs below.First, more research on fusing multiple big datasets from different sources and across scales for detailed LULC mapping at large scales needs to be done.Specifically, the joint use of multipl150e remotely sensed datasets (lidar, radar, and optical data), big geospatial data such as social media data, other physical and socioeconomic big data such as census data, and sampling survey data may help improve LULC mapping at multiple scales.Big data fusion is necessary for obtaining the full picture of LULC situation.The fused big datasets will typically have better spatial, temporal, spectral, as well as radiometric resolutions and coverages.However, these multiple big datasets usually have the aforementioned heterogeneous problems, such as different spatial and temporal resolutions, formats, and semantics.Therefore, it is usually difficult to ensure that the satisfactory mapping results in a heterogeneous environment.Because of the variances in shape, color, size, and other properties of objects for each class, it is a challenge to use a universal scale parameter for classification.A cross-scale mapping strategy that can is compatible across pixel-, object-, and parcel-scale may be highly necessary.A cross-scale mapping strategy may solve the aforementioned mixed LULC problem via diverse Second, novel approaches need to be developed to make full use of existing labeled maps for detailed LULC mapping.Numerous global LULC maps often can be accessed for free.However, most LULC maps typically have a much lower resolution than the current very-high-resolution satellite imagery and also contain many noises; thus, they cannot be directly considered as ground truth or training sample data for detailed LULC mapping using advanced machine learning or deep learning approaches.However, these coarse resolution maps do contain some valuable class information, and thus they may be utilized as indirect auxiliary training data for the same purpose.However, current studies to utilize these existing LULC products are still limited and more strategies or methods need to be designed and developed for this purpose.
Third, as mentioned before, it is still a challenge to obtain enough high-quality ground truth data or training sample data for using advanced machine learning or deep learning for LULC mapping.Existing sample data are not only limited in number but are also limited in terms of variety.The sample data are often not sufficient to train a generalized machine learning or deep learning model, because they are specific to time and location.The classifier trained using one dataset normally does not perform well over other datasets.To improve the performance of a deep learning model, image transformations such as flip, translation, and rotation may be adopted to generate additional and more diversified training data from original data.Transfer learning is another way proposed to deal with the challenge of limited training data [144].The transfer learning method employs a pre-trained LULC classifier to extract an initial set of representations for a new LULC dataset.Unsupervised learning can also be used to tackle the problem of lacking labelled LULC training data [43].The use of crowdsourcing sample collections or an open science framework that supports the integration of citizen science and IoT may as well generate more labeled samples.However, all of these methods have limitations.More research needs to be performed to explore these various approaches to obtain enough sample data for LULC mapping.In addition, more research is necessary to determine how to keep the good performances of deep learning methods using fewer training samples.Additionally, more attention from the scientific community is also needed to address the LULC class imbalance issue.LULC classes are normally imbalanced with some majority classes dominating a study area while some minority classes only occasionally occur at some locations.Therefore, some of the classes may have fewer samples than the others, and it is easy to obtain a sample dataset that misses some minority classes.The class imbalance issue may affect the classification accuracy of advanced machine learning or deep learning methods.Future research should pay more attention to the class imbalance issue.
Fourth, more research needs to be performed for automatically creating and updating spatially explicit LULC maps with moderate or high resolutions at the global scale.Currently, most of the available moderate-resolution LULC maps are only available for limited spatial and temporal coverage [3], such as USGS NLCD and LCMAP, BaseVue, Globel-Land30, and GlobeLand10.Although the Esri global LULC Maps and the ESA WorldCover LULC Maps have high 10 m resolution and the recent iMap series of products are available globally at a seasonal cadence with a 30 m resolution, these maps have limited temporal coverage: the Esri Maps cover time periods of 2018~2022 and the ESA Maps only cover the year 2020.In addition, these maps only generated broad classes such as water, trees, grass, crops, scrub/shrub, built area, bare ground, snow/ice, and clouds (unclassified).They lack detailed LULC information, such as Level II LULC classes, to differentiate residential, commercial, and industrial land uses.More research is necessary to generate historical global LULC maps or detailed global LULC maps with Level II or higher level LULC information and fill the data gaps in some locations, particularly those regions or countries where data are extremely deficient.
Fifth, more research is necessary to validate big LULC mapping results.At present, it is still difficult to compare the different LULC mapping results.There is no specifically acceptable accuracy assessment metrics or standards for the evaluation of LULC mapping results.The overall accuracy, user accuracy, and the Kappa coefficient of the confusion matrix are popular methods to validate the mapping results with the ground truth data.However, these evaluation methods are imperfect and only provide evaluations from some perspectives while ignoring other perspectives.For example, many existing LULC mapping studies are only concerned with overall accuracy while ignoring the poor accuracies of rare classes [145].The confusion matrix is entirely devoid of spatial context [146].Kappa indices may be misleading and/or flawed for some practical applications in remote sensing [147].Accuracy validation efforts and standard assessment systems are needed to accurately assess LULC mapping using heterogeneous big data sources.
Finally, research on data access using advanced cyberinfrastructure technologies is also needed.Efforts to develop protocols and platforms to compile, share, visualize, and distribute large LULC datasets including their associated biases and errors over the Internet are urgently needed.Traditional cyberinfrastructure technologies have met a few limits due to the constant growth of data.For example, when LULC data become extremely large, human eyes have difficulty in extracting meaningful information for visualization.It is challenging to present too many data on a limited screen.It is even more challenging to present huge data on mobile devices due to smaller screens and resolutions.The traditional methods of building tile and pyramids are not efficient for visualization at a satisfying speed [148].We need to explore novel ways to display and visualize large data using various abstraction techniques.
The field of computer vision has seen rapid progress in the last decade, which is in a large part growing to the growth of deep learning.Computer vision tasks such as image classification, object detection, and image segmentation also saw prominent achievements brought by deep learning techniques.Although computer vision tools and methods have been adopted for a wide variety of applications, these tools and methods have yet to be unified and integrated with traditional methods of spatial analysis to deal with geographic/spatial data [149].Because several factors such as the lack of largescale annotated sample data and disparate object sizes of remote sensing imagery, the applications of advanced computer vision techniques to the remote sensing domain has lagged behind greatly [150].The extraction of meaningful LULC information from remotely sensed imagery can be aided by techniques in computer vision [151].Computer vision methods with neural networks as the underlying framework can be used to identify LULC spatial features and patterns.Computer vision algorithms may also be used to remove noise and enhance satellite and aerial imagery data for LULC classification, change detection, and data fusion.Novel tools and approaches that combine cutting-edge computer vision technology and remotely sensed imagery need to be developed in the near future for LULC mapping.

Conclusions
Accurately mapping LULC information is important for many applications such as natural resource and environmental management, urban planning, biodiversity conservation, and health promotion.With the advent of remote sensing and computer technologies, massive data have been generated.New and improved remote sensing earth observations and emerging social sensing big data and auxiliary crowdsourcing datasets together offer great data sources for LULC mapping [152,153].The recent innovative machine learning and deep learning algorithms as well as cutting-edge cloud computing have proven their powerful capabilities to process big remotely sensed data and other geospatial big data of high dimensionality for LULC mapping [154,155].
Despite considerable recent progress, LULC mapping still has challenges to deal with, including data gaps, inconsistent and heterogeneous data, imbalanced and scarce sample data, mixed LULC, and the model interpretability and transferability issues of advanced machine learning or deep learning approaches.More research is necessary in the future in the following fields for improved LULC mapping: fusing multiple big datasets from different sources using cross-scale approaches, developing novel approaches to make full use of the existing LULC maps, obtaining high-quality sample data, automatically creating and updating LULC maps with moderate or high resolution at the global scale, improving accuracy assessment methods, and developing advanced cyberinfrastructure technologies for sharing, visualizing, and distributing large LULC datasets over the Internet for various applications.

Figure 1 .
Figure 1.Major opportunities of LULC mapping in the era of big data.

Figure 1 .
Figure 1.Major opportunities of LULC mapping in the era of big data.
It is extremely difficult to integrate, analyze, and transform these heterogeneous geospatial big data from different sources into useful values for LULC mapping.Traditional LULC classification or mapping solutions and software face excessive Land 2022, 11, 1692 3 of 22 without any cost.USGS, NASA, NOAA, IPMUS Terra, NEO, and Copernicus open access hubs are among the most popular open access remotely sensed data providers.

Figure 2 .
Figure 2. Different types of remotely sensed satellite imagery for LULC mapping.

Figure 2 .
Figure 2. Different types of remotely sensed satellite imagery for LULC mapping.

Figure 3 .
Figure 3.More types of remotely sensed imagery for LULC mapping.

Figure 3 .
Figure 3.More types of remotely sensed imagery for LULC mapping.

Land 2022 , 22 Figure 4 .
Figure 4. LULC mapping from the integration of geospatial big data and remotely sensed data.

Figure 4 .
Figure 4. LULC mapping from the integration of geospatial big data and remotely sensed data.

Figure 5 .
Figure 5. Types of traditional machine learning approaches for LULC mapping.Figure 5. Types of traditional machine learning approaches for LULC mapping.

Figure 5 .
Figure 5. Types of traditional machine learning approaches for LULC mapping.Figure 5. Types of traditional machine learning approaches for LULC mapping.

Figure 6 .
Figure 6.Deep learning is a subfield of machine learning.Figure 6. Deep learning is a subfield of machine learning.

Figure 6 .
Figure 6.Deep learning is a subfield of machine learning.Figure 6. Deep learning is a subfield of machine learning.Land 2022, 11, x FOR PEER REVIEW 10 of 22

Figure 7 .
Figure 7.A simple artificial neural network.

Figure 8 .
Figure 8. Differences between traditional machine learning and deep learning.

Figure 7 .
Figure 7.A simple artificial neural network.

Figure 7 .
Figure 7.A simple artificial neural network.

Figure 8 .
Figure 8. Differences between traditional machine learning and deep learning.

Figure 8 .
Figure 8. Differences between traditional machine learning and deep learning.

Figure 10 .
Figure 10.Future research directions of LULC mapping.

Figure 10 .
Figure 10.Future research directions of LULC mapping.