Data Descriptor

Tracking U.S. Land Cover Changes: A Dataset of Sentinel-2 Imagery and Dynamic World Labels (2016–2024)

by Antonio Rangel 1, Juan Terven 1,*, Diana-Margarita Córdova-Esparza 2, Julio-Alejandro Romero-González 2, Alfonso Ramírez-Pedraza 1,3, Edgar A. Chávez-Urbiola 1, Francisco J. Willars-Rodríguez 1 and Gendry Alfonso-Francia 1,4

1 CICATA-Qro, Instituto Politecnico Nacional, Queretaro 76090, Mexico
2 Facultad de Informática, Universidad Autónoma de Querétaro, Queretaro 76230, Mexico
3 Secretaría de Ciencia, Humanidades, Tecnología e Innovación (SECIHTI), IxM, Alvaro Obregón, Mexico City 03940, Mexico
4 Facultad de Ingeniería, Universidad Autónoma de Querétaro, Queretaro 76010, Mexico
* Author to whom correspondence should be addressed.
Data 2025, 10(5), 67; https://doi.org/10.3390/data10050067
Submission received: 7 March 2025 / Revised: 22 April 2025 / Accepted: 3 May 2025 / Published: 4 May 2025

Abstract

Monitoring land cover changes is crucial for understanding how natural processes and human activities such as deforestation, urbanization, and agriculture reshape the environment. We introduce a publicly available dataset covering the entire United States from 2016 to 2024, integrating six spectral bands (Red, Green, Blue, NIR, SWIR1, and SWIR2) from Sentinel-2 imagery with pixel-level land cover annotations from the Dynamic World dataset. This combined resource provides a consistent, high-resolution view of the nation’s landscapes, enabling detailed analysis of both short- and long-term changes. To ease the complexities of remote sensing data handling, we supply comprehensive code for data loading, basic analysis, and visualization. We also demonstrate an example application—semantic segmentation with state-of-the-art models—to evaluate dataset quality and reveal challenges associated with minority classes. The dataset and accompanying tools facilitate research in environmental monitoring, urban planning, and climate adaptation, offering a valuable asset for understanding evolving land cover dynamics over time.

1. Introduction

Land use land cover (LULC) change detection is essential for understanding how both natural events and human activities shape environments over time [1,2]. Such information is critical for examining deforestation [3,4], urbanization [5,6], agricultural expansion [7,8], and the broader impacts of climate change [9,10] on ecosystems and biodiversity. Reliable land cover data also guide land management decisions [11,12], disaster response [13], carbon accounting [11], and policy making aimed at mitigating environmental harm [14].
Recent advances in remote sensing, notably through Sentinel-2 [15], enable large-scale landscape monitoring, allowing the detection of both short- and long-term changes. However, researchers often face a steep learning curve when handling satellite data. Challenges include managing substantial volumes of spatial data, mastering specialized preprocessing steps, and finding suitable labeled datasets that maintain temporal consistency and extensive geographic coverage.
To address these challenges, we introduce a publicly available dataset covering the entire United States from 2016 to 2024, combining multi-band Sentinel-2 [15] imagery with pixel-level annotations from the Dynamic World dataset [16,17], as shown in Figure 1. This resource is accessible at https://doi.org/10.6084/m9.figshare.28520114.v1 (accessed on 2 March 2025) and is accompanied by ready-to-use code for loading the data and performing basic analyses (https://github.com/AntonioRangel7/US-LandCover-Dataset) (accessed on 26 February 2025). By facilitating data access and preprocessing, we aim to lower barriers to entry in remote sensing research, accelerating the development of effective solutions for environmental monitoring, urban planning, and land management.
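As a concrete illustration, the sketch below reads one downloaded tile and computes its class distribution. It is a minimal example, not the repository's actual loader: the file name and band ordering are assumptions, so the GitHub code remains the authoritative reference for the layout.

```python
# Minimal sketch: reading one dataset tile with rasterio.
# Assumptions (not from the paper): the file name and the band order
# (R, G, B, NIR, SWIR1, SWIR2, label) are illustrative only.
import numpy as np
import rasterio

with rasterio.open("tile_example_2020.tif") as src:
    data = src.read()             # shape: (bands, height, width)
    transform = src.transform     # affine georeferencing (EPSG:4326)

rgb = data[:3]                        # assumed R, G, B positions
labels = data[-1].astype(np.uint8)    # assumed Dynamic World label band

# Quick class-balance check: fraction of pixels per land cover class
classes, counts = np.unique(labels, return_counts=True)
print(dict(zip(classes.tolist(), (counts / counts.sum()).round(4).tolist())))
```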

2. Related Works

This section reviews previous research relevant to land use land cover (LULC) mapping, focusing on existing datasets and advanced mapping methodologies. First, we compare significant global, regional, and national LULC datasets, highlighting differences in spatial resolution, temporal frequency, and coverage, and demonstrate how our proposed dataset addresses current gaps. Next, we discuss advances in LULC mapping techniques, emphasizing semantic segmentation approaches based on convolutional neural networks (CNNs), recurrent neural networks (RNNs), and transformer architectures. Finally, we explore recent research dedicated to segmentation-based LULC change detection, underlining the necessity of consistent, temporally rich datasets for effectively tracking and analyzing land cover dynamics.

2.1. Comparison of Major LULC Datasets

Recent years have seen the development of diverse LULC datasets, balancing spatial resolution, temporal frequency, and geographic extent. Global datasets such as GlobeLand30 [18] provide comprehensive land cover at a 30 m resolution for 2000, 2010, and 2020. While GlobeLand30 achieves high accuracy (∼85% for 2020), its infrequent updates overlook interannual changes. Other global products, such as the ESA Climate Change Initiative [19] and Copernicus Global Land Cover Maps [20], offer broad coverage but at coarser resolutions and with annual or less frequent updates. In contrast, Google's Dynamic World produces near-real-time global labels at a 10 m resolution using Sentinel-2 imagery and deep learning, demonstrating the advantages of moderate resolution and frequent revisits for capturing detailed changes.
Regional and task-specific datasets often provide finer details or specialized classifications. LandCoverNet [21], a 10 m resolution Sentinel-2-based dataset, labels pixels into seven broad classes based on annual time series but offers limited geographic coverage. BigEarthNet [22] contains multi-label scene classifications for 590,000 Sentinel-2 image patches across Europe, annotated with CORINE’s 43 land cover categories, but lacks pixel-level segmentation. SEN12MS [23] provides paired radar/optical imagery globally with coarse MODIS-derived labels, limiting its usability for precise ground-truth mapping. Similarly, EuroSAT [24] offers Sentinel-2-based scene classification but is limited in spatial coverage and temporal depth.
National datasets, particularly in the United States, vary significantly in resolution and update frequency. The USDA Cropland Data Layer (CDL) [25], focused on crop classification, offers annual national coverage at 30 m (recently improved to 10 m), distinguishing numerous crop types. However, its agricultural focus limits broader applicability. The USGS National Land Cover Database (NLCD) provides 30 m land cover maps at roughly five-year intervals, making it suitable for long-term trends but insufficient for capturing yearly dynamics. High-resolution products like NAIP aerial imagery (∼1 m RGB) have enabled detailed initiatives such as the Chesapeake Conservancy [26], yet the substantial cost and effort limit scalability to national extents.
Research demonstrates the benefits of integrating multi-satellite data collections via cloud platforms for efficient and accurate LULC mapping. For example, Nasiri et al. [27] used Google Earth Engine to evaluate Sentinel-2 and Landsat-8 data, finding seasonal median composites particularly effective for capturing phenological characteristics of croplands and forests.
Our proposed Sentinel-2- and Dynamic World-based dataset uniquely combines high spatial resolution (10 m) with consistent annual temporal coverage (2016–2024) for the entire United States. It surpasses existing products by providing detailed pixel-level labels across multiple years, enabling robust year-to-year land cover change analyses. The consistent nine-class Dynamic World schema further facilitates comprehensive change detection, bridging the gap between high-resolution local maps and coarse, infrequently updated national products.

2.2. Advances in LULC Mapping and Semantic Segmentation Techniques

Advancements in LULC mapping closely align with progress in computer vision, transitioning from per-pixel classifiers to sophisticated deep semantic segmentation methods. Initial deep learning approaches utilized Fully Convolutional Networks (FCNs) [28] for pixel-level classification of satellite images. Subsequent architectures, such as U-Net [29] and SegNet [30], improved segmentation accuracy by capturing complex land cover patterns through hierarchical spatial features and skip connections. State-of-the-art models like DeepLabv3+ [31], incorporating atrous convolutions and spatial pyramid pooling, are effective for handling heterogeneous landscapes and have become prevalent in operational LULC mapping pipelines [32].
Multi-temporal segmentation methods extend these models by incorporating temporal information, thereby enhancing classification accuracy and facilitating explicit change detection. Approaches combining CNN encoders with recurrent neural networks (RNNs), such as Long Short-Term Memory (LSTM) networks, leverage seasonal variations to distinguish land cover types effectively. CNN–RNN hybrid models have demonstrated superior performance in crop classification and other temporally sensitive tasks.
Recently, transformer-based models have gained attention for remote sensing applications due to their robust handling of long-range spatial and temporal dependencies [33]. Vision Transformers adapted to satellite imagery (SITS-Former) have achieved superior accuracy compared to traditional CNN and CNN–RNN models, highlighting the strengths of self-attention mechanisms for capturing extensive temporal and spatial contexts [34]. Hybrid architectures combining CNN spatial encoders with temporal attention mechanisms represent the forefront of current research, offering optimized models for our dataset’s extensive temporal and spatial dimensions.
Current trends also emphasize enhancing model interpretability and performance. Techniques such as SHAP have been applied to CNN-based LULC models to interpret predictions, thereby improving model transparency and user trust [35]. Comparative studies, such as those by Zafar et al. [36], provide insights into trade-offs among traditional classifiers, reinforcing the need for advanced models in varied landscapes.

Segmentation-Based LULC Change Detection

A substantial research area involves segmentation-based LULC change tracking. Multi-temporal datasets enable change detection through comparative analysis of segmentation outputs across different periods. Traditional change detection methods, like BFAST [37], identify land cover transitions by comparing annual maps. Advanced approaches, including Siamese convolutional networks [38], directly produce detailed change maps from paired images, streamlining detection.
The proliferation of datasets such as Dynamic World, ESA WorldCover [39,40], and Esri’s annual global maps [32] has significantly accelerated progress in segmentation-based change detection. Our dataset, providing nine consecutive years (2016–2024) of consistent 10 m resolution labels across the U.S., facilitates robust training of temporal segmentation models capable of reliably tracking annual and seasonal land cover transitions. This consistency minimizes false detections and enhances the reliability of change analysis, enabling detailed and large-scale monitoring of environmental and anthropogenic land cover dynamics. A minimal example of such comparative analysis is sketched below.
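The following sketch differences two annual label maps for the same tile and tallies class transitions. The file names and single-band layout are hypothetical placeholders; any pair of per-pixel label rasters on matching grids would work the same way.

```python
# Sketch: segmentation-based change detection by comparing two annual
# Dynamic World label maps. File names are hypothetical placeholders.
import numpy as np
import rasterio

def read_labels(path: str) -> np.ndarray:
    with rasterio.open(path) as src:
        return src.read(1).astype(np.uint8)  # assumes labels in band 1

labels_2016 = read_labels("tile_example_2016_labels.tif")
labels_2024 = read_labels("tile_example_2024_labels.tif")

changed = labels_2016 != labels_2024           # boolean change mask
print(f"Changed pixels: {changed.mean():.2%}")

# 9x9 transition matrix: rows = 2016 class, columns = 2024 class
transitions = np.zeros((9, 9), dtype=np.int64)
np.add.at(transitions, (labels_2016.ravel(), labels_2024.ravel()), 1)
```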

3. Methods

3.1. Public Data Sources

Our dataset combines two publicly available and highly reputable resources: the Dynamic World dataset [17] and the Top-of-Atmosphere (TOA) Sentinel-2 satellite imagery [15]. By integrating these resources, we created a detailed, multi-year land cover dataset covering the United States from 2016 to 2024. The dataset leverages the strengths of these sources to provide consistent, high-resolution, temporally rich data suitable for environmental and urban studies.

3.1.1. Dynamic World

The Dynamic World dataset [16], produced through a collaboration between Google, the National Geographic Society, and the World Resources Institute [41], offers global land cover data at a high spatial resolution of 10 m. The dataset is continuously updated in near-real time, generating approximately 5000 new images daily, with update frequencies varying between two and five days depending on the geographic location. Each pixel in the dataset includes probabilities assigned across nine distinct land cover classes, allowing for detailed, probabilistic analyses of landscape composition. Table 1 summarizes these land cover classes and provides representative examples.
The Dynamic World data are hosted on Google Earth Engine (GEE) and Google’s AI Platform [42]. The data can be accessed under the public Image Collection “GOOGLE/DYNAMICWORLD/V1” [43]. The model runs on historical and newly acquired Sentinel-2 TOA imagery. Additionally, the model, training data, and inference examples are available under a Creative Commons BY-4.0 license, with the source code and documentation provided on GitHub [44,45].
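For orientation, a minimal Earth Engine Python API query against that collection might look as follows; the date range is an arbitrary example, not a parameter from the paper.

```python
# Minimal sketch: querying the Dynamic World collection on GEE.
import ee

ee.Initialize()

dw = (
    ee.ImageCollection("GOOGLE/DYNAMICWORLD/V1")
    .filterDate("2020-07-01", "2020-09-30")  # example window
    .select("label")  # integer class band; per-class probability bands also exist
)
print("Images in range:", dw.size().getInfo())
```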

3.1.2. ESA Sentinel-2

The Sentinel-2 mission [15], part of the Copernicus Program led by the European Space Agency, delivers high-resolution optical imagery for diverse applications in land monitoring, agriculture, and disaster management. Sentinel-2A was launched on 23 June 2015, followed by Sentinel-2B on 7 March 2017, and Sentinel-2C on 5 September 2024 [15].
Equipped with a Multispectral Instrument (MSI) covering visible (VNIR), near-infrared, and shortwave-infrared (SWIR) wavelengths, Sentinel-2 collects imagery at three spatial resolutions: 10 m (visible/NIR), 20 m (red-edge/SWIR), and 60 m (atmospheric bands) [46]. The satellites are phased 180 degrees apart, ensuring a revisit time of about five days at the equator.
The open-access Sentinel-2 data support numerous applications. In agriculture, the data help optimize fertilizer and irrigation use by enabling near-real-time monitoring of crop health [15]. In forestry, the data support the mapping and tracking of deforestation, which is crucial to combating climate change. The data are also pivotal in disaster response, offering updated flood and wildfire maps for rapid mitigation [15]. In general, the free data policy has driven extensive innovation, making Sentinel-2 a critical resource for the sustainable management of natural resources.

3.2. Data Acquisition

The dataset creation involved comprehensive data acquisition from Google Earth Engine (GEE) using its Python API and was structured into four distinct phases, which are detailed below.

3.2.1. Phase One: Sentinel-2 Collection Filtering

Initially, we accessed the Harmonized Sentinel-2 MSI: Multispectral Instrument, Level 1C collection available on Google Earth Engine. Images were systematically filtered using several built-in GEE functions to ensure optimal data quality and coverage. We used the filterDate() function to select images acquired annually between 1 July and 30 September, capitalizing on the summer season’s clearer skies to obtain higher-quality, cloud-free imagery. To ensure comprehensive geographic coverage, we employed the filterBounds() function to retrieve images spanning the entire territory of the United States.
Specific spectral bands, including Red, Green, Blue, NIR, SWIR1, and SWIR2, were selected explicitly for their relevance in land use and land cover analyses. A strict cloud-coverage threshold of less than 5% was enforced using the CLOUDY_PIXEL_PERCENTAGE metadata attribute, a crucial step to maintain imagery clarity.
Furthermore, to harmonize spatial resolutions across bands, the built-in resample() function was utilized, applying bilinear interpolation specifically to resample the SWIR1 and SWIR2 bands from their native 20 m resolution down to a 10 m resolution. Bilinear interpolation calculates the new pixel value by performing linear interpolation, first in one dimension and then in the other, effectively smoothing transitions and maintaining spatial accuracy. Mathematically, bilinear interpolation is expressed as follows:
$$P(x, y) = (1-\Delta x)(1-\Delta y)\,P_{00} + \Delta x\,(1-\Delta y)\,P_{10} + (1-\Delta x)\,\Delta y\,P_{01} + \Delta x\,\Delta y\,P_{11},$$
where $P(x, y)$ is the interpolated pixel value, $\Delta x$ and $\Delta y$ are the fractional distances from the original pixel locations, and $P_{00}, P_{10}, P_{01}, P_{11}$ are the values of the nearest neighboring pixels. This resampling ensured spatial alignment and consistency across the dataset.
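A condensed sketch of this filtering and resampling phase is shown below. The bounding rectangle is a rough stand-in for the actual U.S. geometry and the year shown is arbitrary; band names follow Table 2.

```python
# Sketch of Phase One: filtering the harmonized Sentinel-2 Level-1C
# collection and bilinearly resampling bands to a common 10 m grid.
import ee

ee.Initialize()

# Rough CONUS bounding box, illustrative only (the paper uses a polygon)
usa = ee.Geometry.Rectangle([-125.0, 24.0, -66.0, 50.0])

s2 = (
    ee.ImageCollection("COPERNICUS/S2_HARMONIZED")
    .filterDate("2020-07-01", "2020-09-30")              # summer window
    .filterBounds(usa)
    .filter(ee.Filter.lt("CLOUDY_PIXEL_PERCENTAGE", 5))  # strict cloud threshold
    .select(["B2", "B3", "B4", "B8", "B11", "B12"])
)

# Bilinear resampling so the 20 m SWIR bands align with the 10 m bands
s2 = s2.map(lambda img: img.resample("bilinear"))
```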

3.2.2. Phase Two: Dynamic World Mask Selection

Subsequently, the corresponding Dynamic World land cover masks were retrieved, taking advantage of the matching naming conventions between the Dynamic World masks and Sentinel-2 imagery. This facilitated accurate pairing between satellite images and their respective land cover annotations.

3.2.3. Phase Three: Composite Image Generation

Composite images were then created to effectively summarize the extensive satellite data. We generated three different composites: an RGB composite using Sentinel-2 visible bands, an SWIR-NIR composite to highlight vegetation and moisture content, and a land cover label composite derived from the Dynamic World data. The Sentinel-2 bands were aggregated using a statistical median reducer, significantly reducing the influence of transient atmospheric disturbances such as clouds or shadows. Mathematically, this process is defined as
$$\text{Composite Pixel Value} = \operatorname{median}\{x_1, x_2, \ldots, x_n\},$$
where $x_i$ represents the individual pixel reflectance values across $n$ satellite images. For land cover classification, we employed a mode reducer to determine the most frequently occurring label per pixel:
$$L_{\text{composite}} = \operatorname{mode}\{L_1, L_2, \ldots, L_n\},$$
where $L_i$ represents the observed land cover classes.
For the land cover labels, the ee.Reducer.mode() function was used to identify the most frequent class per pixel by selecting the value with the highest frequency. Figure 1 illustrates these three composites.
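The two reducers translate directly into Earth Engine calls, as in the sketch below; here `s2` and `dw` stand for the filtered Sentinel-2 and Dynamic World collections from the previous phases.

```python
# Sketch of Phase Three: median composites for the Sentinel-2 bands and
# a mode composite for the Dynamic World labels.
rgb_composite = s2.select(["B4", "B3", "B2"]).median()
swir_nir_composite = s2.select(["B11", "B12", "B8"]).median()
label_composite = dw.select("label").reduce(ee.Reducer.mode())
```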

3.2.4. Phase Four: Data Export and Storage

Finally, due to export limitations inherent to the GEE platform (maximum of 10,000 pixels per side per export), we divided the United States into 797 manageable one-degree square tiles. Each tile was exported individually as a GeoTIFF file, including the RGB, SWIR-NIR, and Dynamic World label composites. This organized, tile-based approach effectively circumvented platform limitations, facilitating streamlined data management, distribution, and subsequent analyses. Figure 2 visually depicts this tiling strategy.
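A per-tile export might be scripted as below. The one-degree rectangle, task description, and maxPixels value are illustrative stand-ins; the composites come from the previous sketch.

```python
# Sketch of Phase Four: exporting one tile as a GeoTIFF via GEE.
composite = (
    rgb_composite.addBands(swir_nir_composite).addBands(label_composite)
)

# One hypothetical one-degree tile out of the 797 covering the U.S.
tile = ee.Geometry.Rectangle([-100.0, 40.0, -99.0, 41.0])

task = ee.batch.Export.image.toDrive(
    image=composite.clip(tile),
    description="tile_w100_n40_2020",  # illustrative name
    region=tile,
    scale=10,               # 10 m pixels
    crs="EPSG:4326",
    fileFormat="GeoTIFF",
    maxPixels=1e10,
)
task.start()
```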

4. Results

The dataset comprises Sentinel-2A imagery and corresponding Dynamic World masks for the entire continental United States, covering 1 July to 30 September of each year from 2016 to 2024. Data were obtained using the Google Earth Engine (GEE) Python API and stored in the EPSG:4326 coordinate reference system. Six spectral bands were selected from Sentinel-2A (red, green, blue, NIR, SWIR1, and SWIR2), along with a label band from Dynamic World. Table 2 outlines the bands and their respective spatial resolutions.
Figure 3 illustrates annual land cover changes in the United States from 2016 to 2024. Water (blue) remained largely stable, while trees (green) showed a modest increase, which may indicate reforestation efforts. Grass (light green) and crops (orange) fluctuated from year to year, in line with changing agricultural practices. Built-up areas (purple) exhibited a steady rise, pointing to ongoing urbanization. Shrub and scrub (brown) and bare ground (tan) varied over time, influenced by natural processes and anthropogenic actions. Flooded vegetation (light blue) remained low, suggesting minimal wetland changes, and snow and ice (light gray) were confined to select regions.
Figure 4 focuses on built-up areas, revealing a gradual increase from about 2.25% of total land cover in 2016 to around 3.12% in 2024. This rise of nearly one percentage point reflects population growth, economic activity, and policy decisions that drive urban development. Although the numeric change may appear small, it can substantially affect ecosystems, agriculture, and resource planning.

4.1. Semantic Segmentation Application

To demonstrate a potential application of this dataset and obtain an initial assessment of its quality, we performed semantic segmentation experiments on multiple land cover classes using well-known segmentation architectures.

4.1.1. Data Preprocessing

We preprocessed the Sentinel-2 RGB composites by converting them into .jpg files with a resolution of 520 × 520, reducing the file size while preserving the spatial structure. These converted images were then stored as NumPy arrays in LMDB files. After conversion, we split the images into three subsets: 107,269 images for training (80%), 13,408 images for validation (10%), and 13,410 images for testing (10%), allowing us to track model performance and tune hyperparameters.
To address class distribution imbalances, we applied data augmentation to classes 3, 6, and 8 using the Mosaic function [47] to create new samples of minority classes. This approach yielded a more balanced dataset, as shown in Table 3.
All pixel intensities were standardized using Z-score normalization:
$$Z = \frac{X - \mu}{\sigma},$$
where μ and σ are the mean and standard deviation of the pixel values in the training set.
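The normalization itself is a one-liner once the training statistics are fixed; the sketch below uses a random array as a stand-in for the actual image tensors.

```python
# Sketch: Z-score normalization with statistics taken from the training
# split only, then reused for validation and test data to avoid leakage.
import numpy as np

def zscore(images: np.ndarray, mu: float, sigma: float) -> np.ndarray:
    """Standardize pixel values using precomputed training statistics."""
    return (images.astype(np.float32) - mu) / sigma

train = np.random.randint(0, 256, size=(8, 520, 520, 3))  # stand-in data
mu, sigma = float(train.mean()), float(train.std())
train_norm = zscore(train, mu, sigma)
```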

4.1.2. Model Training and Evaluation

We evaluated four semantic segmentation architectures: Fully Convolutional Networks (FCN) [48] with a ResNet-50 [49] encoder, Lightweight Atrous Spatial Pyramid Pooling (LRASPP) [50], U-Net++ [51] with a ResNeXt-101 32×8d [52] encoder, and DeepLabV3+ [53] with a ResNet-50 [49] encoder.
All encoders were initialized with pretrained weights from ImageNet to leverage transfer learning. The pretrained models were obtained from both the segmentation_models.pytorch library [54] and the official torchvision model zoo provided by the PyTorch framework [55]. The models were trained using the Adam optimizer [56] with a learning rate of $10^{-3}$ and the cross-entropy loss function over 50 epochs. Validation metrics were monitored at each epoch to mitigate overfitting and guide early stopping with a patience of 15 epochs. A configuration sketch follows.
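The code below sketches one of the four setups (DeepLabV3+ with a ResNet-50 encoder) using segmentation_models.pytorch. Data loading, scheduling, and early stopping are omitted, and the training step shown is a generic single iteration rather than the authors' exact script.

```python
# Sketch: DeepLabV3+ (ResNet-50 encoder) configured as described in the
# text: ImageNet pretraining, Adam at 1e-3, cross-entropy loss.
import torch
import segmentation_models_pytorch as smp

model = smp.DeepLabV3Plus(
    encoder_name="resnet50",
    encoder_weights="imagenet",  # transfer learning from ImageNet
    in_channels=3,               # RGB composites
    classes=9,                   # nine Dynamic World classes
)

optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
criterion = torch.nn.CrossEntropyLoss()

def train_step(images: torch.Tensor, masks: torch.Tensor) -> float:
    """One optimization step; images (N,3,H,W) float, masks (N,H,W) int64."""
    optimizer.zero_grad()
    logits = model(images)           # (N, 9, H, W) class scores
    loss = criterion(logits, masks)
    loss.backward()
    optimizer.step()
    return loss.item()
```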
Table 4 summarizes the global performance metrics for all evaluated models, highlighting LRASPP as the best-performing architecture, with an IoU of 0.71, an accuracy of 0.89, and an F1 score of 0.82. Table 5 compares the per-class F1 scores across models. LRASPP achieved the best F1 score in seven of the nine classes: water (class 0: 0.945), trees (class 1: 0.941), grass (class 2: 0.786), crops (class 4: 0.881), shrub and scrub (class 5: 0.845), bare ground (class 7: 0.857), and snow and ice (class 8: 0.786). These results suggest that LRASPP effectively captures both prevalent and infrequent land cover types.
DeepLabV3+ stood out in the built-up area class (class 6), achieving the highest F1 score (0.897). It also performed the best in the flooded vegetation class (class 3), which proved to be the most challenging category. Although the F1 score for this class was relatively low (0.447), DeepLabV3+ still outperformed the other models, which struggled even more, likely due to the extreme class imbalance, as flooded vegetation represents only 0.1% of the entire dataset.
Overall, these results emphasize that while LRASPP demonstrates the most consistent and balanced performance across land cover classes, specialized architectures like DeepLabV3+ may excel in specific and challenging scenarios.

5. Discussion

Our longitudinal description of U.S. land cover change combines two advances:
  • A nationwide, multispectral Sentinel-2 archive that is harmonized to a 10 m resolution;
  • Yearly Dynamic World annotations that give nine-class pixel-level labels from 2016 to 2024.
The trends extracted from this resource are meaningful only when interpreted in light of those design choices.

5.1. Dataset Construction and Its Analytical Pay-Off

The strict July–September window, five-percent cloud mask, and median-mode compositing ensure that each annual tile is radiometrically comparable. This consistency is what allows the stacked-area analysis (Figure 3) to reveal real increases in built-up areas (+1 pp in eight years) and modest tree gains, instead of artifacts caused by atmospheric noise. Likewise, the 10 m alignment of the SWIR1–SWIR2 bands enables the moisture-sensitive vegetation dynamics discussed in Section 4. These examples show that careful curation amplifies the scientific value of otherwise raw remote sensing products.

5.2. Methodological Innovation

Our segmentation benchmark contributes two novelties. First, we publish the full preprocessing and tiling pipeline, making it easy for readers to reproduce or extend the experiments with alternative backbones (e.g., ConvNeXt- or ViT-based decoders). Second, by reporting the performance of all nine Dynamic-World classes—rather than the usual water/vegetation/urban subset—we surface the limitations of popular architectures on minority classes (e.g., flooded vegetation and snow and ice; see Table 5). This exposes open methodological questions, such as class imbalance-aware loss functions or curriculum learning for multi-temporal data.

5.3. Practical Prospects

A 10 m, nationwide, annual LULC cube opens the door to the following:
  • Urban growth modeling. The steady 0.11 pp·yr⁻¹ increase in built-up areas can feed spatial interaction models and inform infrastructure stress tests.
  • Carbon-budget accounting. Year-to-year tree-cover recovery (+0.25 Mha over the study period) can be cross-checked against county-level emissions inventories.
  • Emergency management. Near-continuous water masks make it possible to validate flood-extent predictions within days rather than months.

5.4. Limitations

Despite its scale, the dataset inherits two constraints:
  • Generalizability. Climatic zones beyond the U.S. (e.g., equatorial forests and arid deserts) may exhibit spectral signatures not captured in our median composites.
  • Scalability. The one-degree tiling scheme produces 797 GeoTIFFs per epoch. Although convenient for batch training on HPC clusters, it can hinder interactive exploration on modest hardware. Native cloud-optimized GeoTIFFs or STAC catalogs are an obvious next step.

5.5. Future Work

Future research will proceed along three complementary lines. First, we will incorporate Sentinel-1 backscatter to reduce cloud-induced data gaps and improve the detection of flooded vegetation. Second, we will investigate multi-resolution fusion (10 m, 30 m, and 1 km) to generate hierarchical, uncertainty-aware LULC products. Finally, we plan to publish a benchmark of change detection models (e.g., Siamese U-Net, TempCNN, and Swin Transformer variants) that uses the annual composites introduced here as explicit source–target pairs.

6. Conclusions

We introduced an open-license multi-band, nine-class, 10 m resolution LULC time series for the entire United States (2016–2024) and demonstrated its utility through state-of-the-art semantic segmentation baselines. The key findings are as follows:
  • Built-up areas expanded from 2.25% to 3.12%, evidencing rapid urbanization;
  • Segmentation models that excel on dominant classes still struggle with minority ones, signaling the need for class imbalance-aware learning strategies.
Although the resource already enables nationwide environmental analytics, its broader impact will come from (i) integrating radar, elevation, and socioeconomic layers; (ii) converting the tiles to cloud-optimized formats; and (iii) launching a public leaderboard to accelerate methodological advances in spatiotemporal land cover mapping. By lowering the technical entry barrier, we hope to catalyze evidence-based policy and more resilient land management practices.

Author Contributions

Conceptualization, A.R. and J.T.; methodology, J.T.; software, A.R.; validation, D.-M.C.-E., J.-A.R.-G. and G.A.-F.; formal analysis, F.J.W.-R.; investigation, A.R.; resources, E.A.C.-U.; data curation, A.R.; writing—original draft preparation, J.T.; writing—review and editing, D.-M.C.-E., J.-A.R.-G., G.A.-F., A.R.-P. and E.A.C.-U.; visualization, A.R. and J.T.; supervision, J.T., D.-M.C.-E., E.A.C.-U. and G.A.-F.; project administration, J.T. and E.A.C.-U.; funding acquisition, J.T. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the Instituto Politecnico Nacional under grant number SIP-20250165.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The original data presented in the study are openly available in FigShare at https://doi.org/10.6084/m9.figshare.28520114.v1 (accessed on 2 March 2025).

Acknowledgments

During the preparation of this manuscript, the authors used grammar tools to improve grammar, clarity, and the overall readability of the text, and AI to assist in writing and proofreading. The authors have reviewed and edited the content and assume full responsibility for the content of this publication.

Conflicts of Interest

The authors declare no conflicts of interest.

References

  1. Mas, J.F. Monitoring land-cover changes: A comparison of change detection techniques. Int. J. Remote Sens. 1999, 20, 139–152. [Google Scholar] [CrossRef]
  2. Lu, D.; Mausel, P.; Brondizio, E.; Moran, E. Change detection techniques. Int. J. Remote Sens. 2004, 25, 2365–2401. [Google Scholar] [CrossRef]
  3. Ygorra, B.; Frappart, F.; Wigneron, J.P.; Catry, T.; Pillot, B.; Pfefer, A.; Courtalon, J.; Riazanoff, S. A near-real-time tropical deforestation monitoring algorithm based on the CuSum change detection method. Front. Remote Sens. 2024, 5, 1416550. [Google Scholar] [CrossRef]
  4. Kaselimi, M.; Voulodimos, A.; Daskalopoulos, I.; Doulamis, N.; Doulamis, A. A vision transformer model for convolution-free multilabel classification of satellite imagery in deforestation monitoring. IEEE Trans. Neural Netw. Learn. Syst. 2022, 34, 3299–3307. [Google Scholar] [CrossRef] [PubMed]
  5. Montes, A.B.; Salas, J.; Garcia, E.A.V.; Suarez, R.R.; Wood, D. Assessing human settlement sprawl in mexico via remote sensing and deep learning. IEEE Lat. Am. Trans. 2024, 22, 174–185. [Google Scholar] [CrossRef]
  6. Srivastava, S.; Ahmed, T. An Approach to Monitor Urban Growth through Deep Learning based Change Detection Technique using Sentinel-2 Satellite Images. In Proceedings of the 2023 10th International Conference on Computing for Sustainable Global Development (INDIACom), New Delhi, India, 15–17 March 2023; pp. 832–838. [Google Scholar]
  7. Setiawan, Y.; Kustiyo, K.; Hudjimartsu, S.A.; Purwanto, J.; Rovani, R.; Tosiani, A.; Usman, A.B.; Kartika, T.; Indriasari, N.; Prasetyo, L.B.; et al. Evaluating Visible–Infrared Imaging Radiometer Suite Imagery for Developing Near-Real-Time Nationwide Vegetation Cover Monitoring in Indonesia. Remote Sens. 2024, 16, 1958. [Google Scholar] [CrossRef]
  8. Nazarova, T.; Martin, P.; Giuliani, G. Monitoring vegetation change in the presence of high cloud cover with Sentinel-2 in a lowland tropical forest region in Brazil. Remote Sens. 2020, 12, 1829. [Google Scholar] [CrossRef]
  9. Schroeder, T.A.; Moisen, G.G.; Healey, S.P.; Cohen, W.B. Adding value to the FIA inventory: Combining FIA data and satellite observations to estimate forest disturbance. In Moving From Status to Trends; U.S. Department of Agriculture: Newtown Square, PA, USA, 2012. [Google Scholar]
  10. Toromade, A.S.; Chiekezie, N.R. GIS-driven agriculture: Pioneering precision farming and promoting sustainable agricultural practices. World J. Adv. Sci. Technol. 2024, 6, 57–72. [Google Scholar] [CrossRef]
  11. Mikhailova, E.A.; Lin, L.; Hao, Z.; Zurqani, H.A.; Post, C.J.; Schlautman, M.A.; Post, G.C. Massachusetts Roadmap to Net Zero: Accounting for Ownership of Soil Carbon Regulating Ecosystem Services and Land Conversions. Laws 2022, 11, 27. [Google Scholar] [CrossRef]
  12. David Raj, A.; Kumar, S.; Sooryamol, K.R.; Mariappan, S.; Kalambukattu, J.G. 137Cs radiotracer in investigating influence of hillslope positions and land use on soil erosion and soil organic carbon stock—A case study in the Himalayan region. Soil Use Manag. 2024, 40, e13099. [Google Scholar] [CrossRef]
  13. Amuti, T.; Xinguo, L. Land cover change detection in oasis of Hotan River Basin in Northwestern China. In Proceedings of the Future Control and Automation: Proceedings of the 2nd International Conference on Future Control and Automation (ICFCA 2012)-Volume 1, Changsha, China, 1–2 July 2012; Springer: Berlin/Heidelberg, Germany, 2012; pp. 45–51. [Google Scholar]
  14. Li, Y.; Zhang, L.; Qiu, J.; Yan, J.; Wan, L.; Wang, P.; Hu, N.; Cheng, W.; Fu, B. Spatially explicit quantification of the interactions among ecosystem services. Landsc. Ecol. 2017, 32, 1181–1199. [Google Scholar] [CrossRef]
  15. European Space Agency. Sentinel-2. 2024. Available online: https://www.esa.int/Applications/Observing_the_Earth/Copernicus/Sentinel-2 (accessed on 20 October 2024).
  16. Brown, C.F.; Brumby, S.P.; Guzder-Williams, B.; Birch, T.; Hyde, S.B.; Mazzariello, J.; Czerwinski, W.; Pasquarella, V.J.; Haertel, R.; Ilyushchenko, S.; et al. Dynamic World, Near real-time global 10 m land use land cover mapping. Sci. Data 2022, 9, 251. [Google Scholar] [CrossRef]
  17. Dynamic World—10 m Global Land Cover Dataset in Google Earth Engine. 2023. Available online: https://dynamicworld.app/ (accessed on 9 July 2024).
  18. Chen, J.; Chen, J.; Liao, A.; Cao, X.; Chen, L.; Chen, X.; He, C.; Han, G.; Peng, S.; Lu, M.; et al. Global land cover mapping at 30 m resolution: A POK-based operational approach. ISPRS J. Photogramm. Remote Sens. 2015, 103, 7–27. [Google Scholar] [CrossRef]
  19. European Space Agency. ESA Climate Change Initiative: Land Cover Project. 2025. Available online: https://climate.esa.int/en/projects/land-cover/ (accessed on 6 April 2024).
  20. Copernicus Land Monitoring Service. Global Dynamic Land Cover Product. 2025. Available online: https://land.copernicus.eu/en/products/global-dynamic-land-cover (accessed on 6 April 2024).
  21. Alemohammad, H.; Booth, K. LandCoverNet: A Global Benchmark Land Cover Classification Training Dataset. arXiv 2020, arXiv:2012.03111. [Google Scholar]
  22. Sumbul, G.; Charfuelan, M.; Demir, B.; Markl, V. Bigearthnet: A large-scale benchmark archive for remote sensing image understanding. In Proceedings of the IGARSS 2019—2019 IEEE International Geoscience and Remote Sensing Symposium, Yokohama, Japan, 28 July–2 August 2019; pp. 5901–5904. [Google Scholar]
  23. Schmitt, M.; Hughes, L.H.; Qiu, C.; Zhu, X.X. SEN12MS—A Curated Dataset of Georeferenced Multi-Spectral Sentinel-1/2 Imagery for Deep Learning and Data Fusion. arXiv 2019, arXiv:1906.07789. [Google Scholar] [CrossRef]
  24. Helber, P.; Bischke, B.; Dengel, A.; Borth, D. Eurosat: A novel dataset and deep learning benchmark for land use and land cover classification. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 2019, 12, 2217–2226. [Google Scholar] [CrossRef]
  25. United States Department of Agriculture, National Agricultural Statistics Service. Cropland Data Layer. 2025. Available online: https://nassgeodata.gmu.edu/CropScape/ (accessed on 6 April 2024).
  26. Conservancy, C. CBP Land Use/Land Cover Data Project. 2025. Available online: https://www.chesapeakeconservancy.org/projects/cbp-land-use-land-cover-data-project (accessed on 6 April 2024).
  27. Nasiri, V.; Deljouei, A.; Moradi, F.; Sadeghi, S.M.M.; Borz, S.A. Land use and land cover mapping using Sentinel-2, Landsat-8 Satellite Images, and Google Earth Engine: A comparison of two composition methods. Remote Sens. 2022, 14, 1977. [Google Scholar] [CrossRef]
  28. Long, J.; Shelhamer, E.; Darrell, T. Fully convolutional networks for semantic segmentation. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Boston, MA, USA, 7–12 June 2015; pp. 3431–3440. [Google Scholar]
  29. Ronneberger, O.; Fischer, P.; Brox, T. U-net: Convolutional networks for biomedical image segmentation. In Proceedings of the Medical Image Computing and Computer-Assisted Intervention–MICCAI 2015: 18th International Conference, Munich, Germany, 5–9 October 2015; proceedings, part III 18. Springer International: Cham, Switzerland, 2015; pp. 234–241. [Google Scholar]
  30. Badrinarayanan, V.; Kendall, A.; Cipolla, R. Segnet: A deep convolutional encoder-decoder architecture for image segmentation. IEEE Trans. Pattern Anal. Mach. Intell. 2017, 39, 2481–2495. [Google Scholar] [CrossRef] [PubMed]
  31. Chen, L.C.; Zhu, Y.; Papandreou, G.; Schroff, F.; Adam, H. Encoder-decoder with atrous separable convolution for semantic image segmentation. In Proceedings of the European conference on computer vision (ECCV), Munich, Germany, 8–14 September 2018; pp. 801–818. [Google Scholar]
  32. Karra, K.; Kontgis, C.; Statman-Weil, Z.; Mazzariello, J.C.; Mathis, M.; Brumby, S.P. Global land use/land cover with Sentinel 2 and deep learning. In Proceedings of the 2021 IEEE international geoscience and remote sensing symposium IGARSS, Brussels, Belgium, 11–16 July 2021; pp. 4704–4707. [Google Scholar]
  33. Garnot, V.S.F.; Landrieu, L.; Giordano, S.; Chehata, N. Satellite image time series classification with pixel-set encoders and temporal self-attention. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Seattle, WA, USA, 13–19 June 2020; pp. 12325–12334. [Google Scholar]
  34. Cheng, X.; Sun, Y.; Zhang, W.; Wang, Y.; Cao, X.; Wang, Y. Application of deep learning in multitemporal remote sensing image classification. Remote Sens. 2023, 15, 3859. [Google Scholar] [CrossRef]
  35. Temenos, A.; Temenos, N.; Kaselimi, M.; Doulamis, A.; Doulamis, N. Interpretable deep learning framework for land use and land cover classification in remote sensing using SHAP. IEEE Geosci. Remote Sens. Lett. 2023, 20, 1–5. [Google Scholar] [CrossRef]
  36. Zafar, Z.; Zubair, M.; Zha, Y.; Fahd, S.; Nadeem, A.A. Performance assessment of machine learning algorithms for mapping of land use/land cover using remote sensing data. Egypt. J. Remote Sens. Space Sci. 2024, 27, 216–226. [Google Scholar] [CrossRef]
  37. Verbesselt, J.; Hyndman, R.; Newnham, G.; Culvenor, D. Detecting trend and seasonal changes in satellite image time series. Remote Sens. Environ. 2010, 114, 106–115. [Google Scholar] [CrossRef]
  38. Daudt, R.C.; Le Saux, B.; Boulch, A. Fully convolutional siamese networks for change detection. In Proceedings of the 2018 25th IEEE International Conference on Image Processing (ICIP), Athens, Greece, 7–10 October 2018; pp. 4063–4067. [Google Scholar]
  39. Zanaga, D.; Van De Kerchove, R.; De Keersmaecker, W.; Souverijns, N.; Brockmann, C.; Quast, R.; Wevers, J.; Grosu, A.; Paccini, A.; Vergnaud, S.; et al. ESA WorldCover 10m 2020 v100; European Space Agency. 2021. Available online: https://doi.org/10.5281/zenodo.5571936 (accessed on 6 April 2024).
  40. Zanaga, D.; Van De Kerchove, R.; Daems, D.; De Keersmaecker, W.; Brockmann, C.; Kirches, G.; Wevers, J.; Cartus, O.; Santoro, M.; Fritz, S.; et al. ESA WorldCover 10m 2021 v200; European Space Agency. 2022. Available online: https://zenodo.org/records/7254221 (accessed on 6 April 2024). [CrossRef]
  41. World Resources Institute. 2024. Available online: https://www.wri.org (accessed on 18 November 2024).
  42. Introduction to AI Platform|Google Cloud. 2023. Available online: https://cloud.google.com/ai-platform/docs (accessed on 9 July 2023).
  43. World Resources Institute and Google. Dynamic World V1: 10 m Near-Real-Time Global Land Use/Land Cover Dataset. 2022. Available online: https://developers.google.com/earth-engine/datasets/catalog/GOOGLE_DYNAMICWORLD_V1 (accessed on 9 July 2024).
  44. GitHub—Google/Dynamicworld. 2023. Available online: https://github.com/google/dynamicworld (accessed on 9 July 2023).
  45. Tait, A.; Brumby, S.; Hyde, S.; Mazzariello, J.; Corcoran, M. Dynamic World training dataset for global land use and land cover categorization of satellite imagery. PANGAEA 2021, 933475. [Google Scholar]
  46. Sentinel Hub. Sentinel-2 L2A. 2024. Available online: https://docs.sentinel-hub.com/api/latest/data/sentinel-2-l2a/ (accessed on 20 October 2024).
  47. Bochkovskiy, A.; Wang, C.Y.; Liao, H.Y.M. Yolov4: Optimal speed and accuracy of object detection. arXiv 2020, arXiv:2004.10934. [Google Scholar]
  48. Shelhamer, E.; Long, J.; Darrell, T. Fully Convolutional Networks for Semantic Segmentation. IEEE Trans. Pattern Anal. Mach. Intell. 2014, 39, 640–651. [Google Scholar] [CrossRef]
  49. He, K.; Zhang, X.; Ren, S.; Sun, J. Deep Residual Learning for Image Recognition. arXiv 2015, arXiv:1512.03385. [Google Scholar]
  50. Howard, A.; Sandler, M.; Chen, B.; Wang, W.; Chen, L.C.; Tan, M.; Chu, G.; Vasudevan, V.; Zhu, Y.; Pang, R.; et al. Searching for MobileNetV3. In Proceedings of the IEEE International Conference on Computer Vision, Seoul, Korea, 27–28 October 2019; pp. 1314–1324. [Google Scholar] [CrossRef]
  51. Zhou, Z.; Siddiquee, M.M.R.; Tajbakhsh, N.; Liang, J. UNet++: A Nested U-Net Architecture for Medical Image Segmentation. arXiv 2018, arXiv:1807.10165. [Google Scholar]
  52. Xie, S.; Girshick, R.; Dollár, P.; Tu, Z.; He, K. Aggregated Residual Transformations for Deep Neural Networks. arXiv 2017, arXiv:1611.05431. [Google Scholar]
  53. Chen, L.; Zhu, Y.; Papandreou, G.; Schroff, F.; Adam, H. Encoder-Decoder with Atrous Separable Convolution for Semantic Image Segmentation. arXiv 2018, arXiv:1802.02611. [Google Scholar]
  54. Iakubovskii, P. Segmentation Models Pytorch. 2019. Available online: https://github.com/qubvel/segmentation_models.pytorch (accessed on 1 February 2025).
  55. TorchVision maintainers and contributors. TorchVision: PyTorch’s Computer Vision Library. 2016. Available online: https://github.com/pytorch/vision (accessed on 1 February 2025).
  56. Kingma, D.P.; Ba, J.L. Adam: A Method for Stochastic Optimization. In Proceedings of the 3rd International Conference on Learning Representations, ICLR 2015—Conference Track Proceedings, San Diego, CA, USA, 7–9 May 2014. [Google Scholar]
Figure 1. Sentinel-2 and Dynamic World composites for the USA: (a) RGB composite using Red, Green, and Blue bands; (b) SWIR-NIR composite (SWIR1, SWIR2, and NIR) highlighting vegetation and soil moisture; (c) Dynamic World composite showing the most frequent land cover type per pixel.
Figure 2. Study area covering the contiguous United States. The red outline shows the original polygon used to define the region of interest; this polygon is subsequently tessellated into 797 square tiles. Partitioning the dataset in this way keeps each Google Earth Engine export request below the platform’s pixel-count limit while preserving full spatial coverage.
Figure 3. Change in land cover categories over time. This stacked area plot illustrates the temporal dynamics of various land cover categories in the USA from 2016 to 2024.
Figure 4. Change in built-up area land cover percentage over time (2016–2024). This line graph shows the consistent rise in urbanization, highlighting infrastructure expansion in the U.S.
Table 1. Dynamic World land cover classes.
Class ID | LULC Type | Examples
0 | Water | Rivers, ponds, lakes, oceans, flooded saltpans
1 | Trees | Wooded vegetation, dense green shrubs, plantations
2 | Grass | Natural meadows, fields, parks, pastures
3 | Flooded vegetation | Flooded mangroves, emergent vegetation
4 | Crops | Corn, wheat, hay plots
5 | Shrub and scrub | Sparse shrubs, savannas, exposed soil
6 | Built-up areas | Clusters of houses, roads, highways, urban areas
7 | Bare ground | Exposed rock, deserts, sand dunes
8 | Snow and ice | Glaciers, snowfields, permanent snowpack
Table 2. Bands selected from Sentinel-2A and Dynamic World, along with band labels and spatial resolutions.
Identification | Band | Spatial Resolution (m)
B2 | Blue (Sentinel-2A) | 10
B3 | Green (Sentinel-2A) | 10
B4 | Red (Sentinel-2A) | 10
B8 | NIR (Sentinel-2A) | 10
B11 | SWIR 1 (Sentinel-2A) | 10
B12 | SWIR 2 (Sentinel-2A) | 10
08 | Label (Dynamic World) | 10
Table 3. Pixel-level class distribution of the training data before and after Mosaic augmentation.
Class | Original | Augmented
0 | 7.6% | 7.3%
1 | 35.4% | 34.3%
2 | 8.7% | 8.3%
3 | 0.1% | 0.1%
4 | 18.9% | 18.0%
5 | 15.7% | 14.9%
6 | 4.1% | 7.7%
7 | 9.3% | 8.9%
8 | 0.1% | 0.3%
Table 4. Overall performance metrics for all evaluated semantic segmentation models. Metrics include Intersection over Union (IoU), overall accuracy, and F1 score. The best results for each metric are highlighted in bold.
Metric | FCN | U-Net++ | DeepLabV3+ | LRASPP
IoU | 0.68 | 0.64 | 0.69 | 0.71
Accuracy | 0.88 | 0.86 | 0.88 | 0.89
F1 score | 0.79 | 0.76 | 0.80 | 0.82
Table 5. Per-class F1 score comparison across semantic segmentation models for each land use land cover (LULC) class. The best-performing score for each class is highlighted in bold.
Class ID | LULC Type | DeepLabV3+ | FCN | LRASPP | UNet++
0 | Water | 0.940 | 0.931 | 0.945 | 0.921
1 | Trees | 0.940 | 0.936 | 0.941 | 0.936
2 | Grass | 0.767 | 0.779 | 0.786 | 0.694
3 | Flooded Veg. | 0.447 | 0.344 | 0.414 | 0.245
4 | Crops | 0.869 | 0.861 | 0.881 | 0.841
5 | Shrub and Scrub | 0.818 | 0.823 | 0.845 | 0.808
6 | Built-Up Areas | 0.897 | 0.895 | 0.894 | 0.894
7 | Bare Ground | 0.814 | 0.835 | 0.857 | 0.794
8 | Snow and Ice | 0.738 | 0.706 | 0.786 | 0.667

