Dense Time Series of Harmonized Landsat Sentinel-2 and Ensemble Machine Learning to Map Coffee Production Stages

Parreiras, Taya Cristo; Santos, Claudinei de Oliveira; Bolfe, Édson Luis; Sano, Edson Eyji; Leandro, Victória Beatriz Soares; Bayma, Gustavo; Silva, Lucas Augusto Pereira da; Furuya, Danielle Elis Garcia; Romani, Luciana Alvim Santos; Morton, Douglas

doi:10.3390/rs17183168

by Taya Cristo Parreiras ¹,*, Claudinei de Oliveira Santos ², Édson Luis Bolfe ¹,³, Edson Eyji Sano ⁴, Victória Beatriz Soares Leandro ¹, Gustavo Bayma ⁵, Lucas Augusto Pereira da Silva ⁶, Danielle Elis Garcia Furuya ³, Luciana Alvim Santos Romani ³ and Douglas Morton ⁷

¹ Graduate Programme in Geography, State University of Campinas, Unicamp, Campinas 13083-855, SP, Brazil
² ACELEN Renováveis, São Paulo 04794-000, SP, Brazil
³ Brazilian Agricultural Research Corporation, Embrapa Digital Agriculture, Campinas 13083-886, SP, Brazil
⁴ Brazilian Agriculture Research Corporation, Embrapa Cerrados, Planaltina 73310-970, DF, Brazil
⁵ Brazilian Agriculture Research Corporation, Embrapa Environment, Jaguariúna 13918-110, SP, Brazil
⁶ Department of Geography, State University of Paraíba, Guarabira 58200-000, PB, Brazil
⁷ NASA Goddard Space Flight Center, Greenbelt, MD 20771, USA
* Author to whom correspondence should be addressed.

Remote Sens. 2025, 17(18), 3168; https://doi.org/10.3390/rs17183168

Submission received: 23 June 2025 / Revised: 6 September 2025 / Accepted: 8 September 2025 / Published: 12 September 2025

(This article belongs to the Section Remote Sensing in Agriculture and Vegetation)

Highlights

What are the main findings?

Coffee plantations in Brazil were mapped with unprecedented sensitivity and specificity (>95%) using a dense Harmonized Landsat Sentinel-2 time series and a hierarchical ensemble of Random Forest and XGBoost models.
Four phenological stages of coffee production—planting, producing, skeleton pruning, and renovation—were accurately distinguished, with balanced accuracies from 77% to 95%, even in fragmented smallholder landscapes.

What is the implication of the main finding?

Provides a scalable, open-source framework for monitoring climate-resilient coffee management practices and supporting smallholder decision-making.
Enables better access to credit, risk mitigation tools, and operational crop management insights in other coffee-producing regions using globally available EO data.

Abstract

Coffee demand continues to rise, while producing countries face increasing challenges and yield losses due to climate change. In response, farmers are adopting agricultural practices capable of boosting productivity. However, these practices increase intercrop variability, making coffee mapping more challenging. In this study, a novel approach is proposed to identify coffee cultivation considering four phenological stages: planting (PL), producing (PR), skeleton pruning (SK), and renovation with stumping (ST). A hierarchical classification framework was designed to isolate coffee pixels and identify their respective stages in one of Brazil’s most important coffee-producing regions. A dense time series of multispectral bands, spectral indices, and texture metrics derived from Harmonized Landsat Sentinel-2 (HLS) imagery, with an average revisit time of ~3 days, was employed. This data was combined with an ensemble learning approach based on decision-tree algorithms, specifically Random Forest (RF) and Extreme Gradient Boosting (XGBoost). The results achieved unprecedented sensitivity and specificity for coffee plantation detection with RF, consistently exceeding 95%. The classification of coffee phenological stages showed balanced accuracies of 77% (ST) and from 93% to 95% for the other classes. These findings are promising and provide a scalable framework to monitor climate-resilient coffee management practices.

Keywords: remote sensing; Coffea arabica; HLS; perennial crops; classification

1. Introduction

Brazil is one of the global leaders in food production and exports, with coffee playing a key role in the country’s agricultural sector. In 2023, Brazil accounted for approximately 32% of the global food production [1]. Brazilian coffee contributes to the international recognition of the national agricultural sector and boosts the economies of regions that depend on this product [2]. In the 2023/2024 harvest, Brazilian coffee production reached a historic 3 million tons of coffee beans, exported to more than 120 countries [3]. Minas Gerais, Espírito Santo, and São Paulo are the states that play a significant role in Brazil’s coffee production. São Paulo is expected to contribute approximately 269 thousand tons to Brazil’s coffee production in the 2025 harvest [4].

Climate change poses a significant threat to global coffee production. Changes in the pattern and distribution of precipitation and temperature are expected to increase the occurrence of pests and pathogens and reduce the yield and availability of suitable planting areas [5]. Previous studies have projected significant reductions in suitable lands for coffee cultivation in Latin America, Africa, and Asia [6,7]. This shift not only increases social vulnerability, since most of the production depends on small-scale farmers, but also exacerbates environmental risks. Expanding coffee plantations into higher altitudes may lead to protected forest encroachment and increased competition with other economic activities [6,7].

Coffee producers are increasingly adopting conservation practices in response to consumers’ demand for products with lower environmental impacts. These practices include reducing soil erosion and chemical inputs, preserving water resources, and supplying organic beans [8,9]. Given coffee’s high economic and cultural relevance in numerous tropical countries and the growing threat of climate change to global agriculture, it is imperative to develop effective methods for monitoring coffee-growing regions. Such efforts are critical for assessing the effectiveness of public policies, ensuring the long-term sustainability of production, and boosting consumers’ confidence [10].

The remote sensing literature on coffee cultivation reveals that, while many studies focus on assessing yield, nutritional status, disease and pest detection, water stress, and other biophysical issues, detailed mapping of coffee fields and production stages remains underexplored [11,12]. Mapping fields cultivated with perennial crops presents additional challenges compared to annual crops [13]. This is particularly true for coffee, a perennial crop with a long lifespan, a biennial yield cycle, and substantial intra-field variability driven by management practices to improve productivity, control shade, and renew plantations [11,12,13,14].

Many other factors contribute to this heterogeneity, including the production system (irrigated or rainfed, sun-grown or shade-grown), plantation age, and shading effects influenced by row orientation, topography, and the presence of surrounding vegetation. Rainfed coffee is typically cultivated at higher elevations, often in hilly or mountainous landscapes, and frequently integrated into agroforestry systems. Additionally, coffee production is concentrated in tropical and equatorial regions, where persistent cloud cover, particularly during the wet season, poses significant challenges for satellite-based monitoring and data acquisition [11,12,13,14].

Several strategies have been explored for mapping coffee plantations, with varying results. Over the last five years, pixel-by-pixel classification approaches have prevailed, mainly based on supervised learning with non-parametric algorithms such as Random Forest (RF), Support Vector Machines (SVM), and k-Nearest Neighbors (KNN) [11,12]. These classifiers are often applied to multitemporal images using cloud computing resources, such as Google Earth Engine (GEE), as demonstrated by Kelley et al. [15] and Manoel et al. [16]. The images are typically processed using spectral indices, linear spectral mixture models, in addition to terrain and weather-related variables such as elevation, slope, and land surface temperature (LST) [11,12]. Synthetic aperture radar (SAR) images and Gray-Level Co-Occurrence Matrix (GLCM) texture metrics have also been explored both individually [17] and in combination with optical data [14]. However, the review paper from Escobar-López et al. [11] suggested that the performance of radar-based imagery has not significantly improved over optical imagery in mapping coffee plantations.

Despite significant efforts to identify optimal attributes and techniques for mapping coffee plantations, improvements in producer’s accuracy (PA) and user’s accuracy (UA) remain limited, regardless of sensor spatial resolution [11,12]. A leading hypothesis is that increasing the temporal resolution of medium- and high-resolution satellite imagery may improve the classification of coffee areas by capturing phenological transitions associated with the crop’s biennial cycle and varied production stages [15]. While some progress has been made using Sentinel-2 time series, many studies still rely on seasonal composites, resulting in models with limited temporal fidelity [15,18]. Chaves & Sanches [19] showed that 5-day revisit intervals of Sentinel-2 data improved UA by up to 17% in large-scale irrigated coffee plantations in western Bahia, Brazil. However, this approach remains underexplored in more heterogeneous and topographically complex regions where manual or semi-mechanized production prevails.

The municipality of Caconde in São Paulo State, Brazil, is a leading coffee-producing region and a national model of sustainable, smallholder-based cultivation. The combination of high altitudes and favorable climatic conditions supports the production of high-quality coffee beans. Local producers, many of whom are part of women-led cooperatives, are increasingly adopting tools to adapt to climate change while maintaining productivity and environmental standards [20]. Caconde’s coffee was selected by the Brazilian delegation for the G20 summit in 2024 [21]. As Brazil expands trade agreements with China [22], this region is well-positioned to gain importance in global supply chains. Therefore, enhancing coffee monitoring systems at the regional scale holds strategic value for national agricultural governance and rural development.

This study applies a dense time series of Harmonized Landsat Sentinel-2 (HLS) imagery [23,24,25] to map coffee plantations and production stages at a 30 m resolution in Caconde. The novelty of this study lies in a hierarchical classification framework that maps not only coffee plantations but also four production stages (planting, producing, skeleton pruning, and renovation by stumping) using dense HLS time series data processed by an ensemble of machine learning algorithms (RF and XGBoost). The main objectives of this workflow are (a) to identify coffee areas among other land use and land cover (LULC) types; and (b) to classify coffee plantations into these four production stages.

2. Materials and Methods

2.1. Study Area

The selected study area is Caconde, a 468-km² municipality in northern São Paulo State in Brazil (Figure 1). This municipality is designated as an Agrotechnological District (DAT) within the Science Center for Development in Digital Agriculture (Semear Digital), a national, multi-institutional effort to expand digital agriculture solutions for small-scale farmers in Brazil [26].

Caconde is located in the Atlantic Rainforest biome, characterized by extensive humid forests and recognized as one of the world’s biodiversity hotspots. The municipality’s elevation ranges from 710 to 1400 m, with most of the region concentrated between 800 and 1000 m (Figure 2). These topographic characteristics contribute to the region’s distinct slope and soil conditions. For example, higher elevations are associated with more undulating terrains, which favor the formation of deep Argisols. The climate of this region is classified as humid subtropical (Cwb) [27], with an average annual precipitation of 1489 mm, ranging from 1094 mm to 1964 mm. Monthly precipitation typically drops to <100 mm from April to September.

These environmental characteristics support various LULC classes, such as forest formations, pastures, crop-livestock integration, silviculture, and perennial crops [28]. Coffee production is of unique importance in the Caconde region, which ranks among the largest coffee-producing areas in Brazil. Coffee is the primary source of income for the majority of rural properties in this municipality (72% of 2504 rural properties) [29].

2.2. Sampling Strategy and Classification Scheme

The sample acquisition process involved two strategies: remote inspection and ground survey. A set of 787 samples was uniformly distributed across the study area using a 4 km × 4 km grid, with six sampling points assigned to each grid cell. Class assignments were made using the open-source Temporal Visual Inspection (TVI) tool, designed for inspecting time series of satellite images [30,31] and available at https://tvi.lapig.iesa.ufg.br/. TVI is an online tool that streamlines the point-based inspection of Landsat image time series over a pre-established time interval. It also provides time series charts of MODIS NDVI values and precipitation data from the Tropical Rainfall Measuring Mission (TRMM) and the Global Precipitation Measurement (GPM) mission, as well as access to images available on the Google Earth platform [30,31]. Minor adjustments to the sample locations were made to avoid edge effects, non-vegetated areas, and spectral mixing. During this class assignment process, two analysts inspected samples independently, and only those with mutual agreement were retained for further analysis.

A ground survey was conducted on 23–24 October 2023 to collect on-site samples for the fourth classification level, which distinguishes the four coffee management stages: recently planted (PL), in production (PR), skeletonized (SK), and renovated with stumping (ST), as shown in Figure 3. These stages were identified in the field with the assistance of local farmers. Although several other cultural practices are involved in coffee production, these stages are the most relevant for stakeholders and have a clear influence on satellite images acquired over the study area. A key detail is that Caconde’s diverse geomorphology has a direct impact on coffee production. Farmers cultivate coffee in agroforestry systems alongside native forests on steep, undulating, and mountainous terrain. On gentler slopes, coffee is typically grown in full sunlight. To increase the data volume required for machine learning algorithms, up to three samples were collected per inspected plot or from neighboring plots that presented the same management pattern.

The planting stage refers to recently planted coffee (~2–3 years old), where plants are still underdeveloped, and the open canopy causes significant soil background influence in remote sensing images. The producing stage includes mature coffee plants undergoing flowering and fruit development. These plants are typically taller and may or may not have undergone skeleton pruning or renovation with stumping. Skeleton pruning involves intensive removal of low-yielding branches, retaining only the lateral branches of each plant [32]. This technique is mainly used for shade management and pest and disease control, although it results in low productivity conditions in the first year. In Caconde, skeleton pruning is also employed to intensify production by increasing plant density, as it reduces shading effects between plants. Renovation with stumping, in turn, involves a more radical intervention. It consists of cutting plants down to a height of 30–100 cm, leaving only the main trunk. This method seeks to rejuvenate plants suffering from structural damage due to adverse weather, aging, or poor management in previous years. It aims to revitalize plantations, adjust plant density and spacing, and create access corridors for mechanization. As a result, the ground surface, including soil, dry biomass, and herbaceous plants, is left fully exposed [33].

A hierarchical classification scheme was designed to identify coffee-related pixels, structured into four levels (Figure 4). Level 1 distinguished anthropic vegetation from native vegetation. Water bodies and non-vegetated areas were masked using the most recent data from the MapBiomas Project [28]. Level 2 classified anthropic vegetation into pastures, perennial crops, and annual crops. At Level 3, perennial crops were divided into coffee plantations and forestry. Given the initially limited sample size for silviculture (11 samples), an additional 19 samples were manually collected through visual inspection of a 4.77 m spatial resolution PlanetScope mosaic acquired in October 2023 (Imagery © Planet 2023 Inc., Planet Labs PBC, San Francisco, CA, USA; Figure 1). At Level 4, the different stages of coffee production, i.e., planting, producing, skeleton pruning, and renovation with stumping, were mapped. Figure 5 illustrates the spatial distribution of sample points used by the four classification levels (TVI samples for Levels 1, 2, and 3, and ground samples for Level 4).

2.3. Remote Sensing Data Processing

This study was conducted using surface reflectance and spectral indices derived from HLS satellite images. HLS is an initiative from the National Aeronautics and Space Administration (NASA) designed to improve the frequency of observations by integrating data from Landsat 8/9 (HLS.L30) and Sentinel-2 (HLS.S30) satellites [23,24]. Both products are resampled to a consistent 30 m resolution grid based on the Military Grid Reference System (MGRS) and undergo geometric and radiometric harmonization. The production of HLS data involves atmospheric correction using the Land Surface Reflectance Code (LaSRC), spatial co-registration through the Automated Registration and Orthorectification Package (AROP), bidirectional reflectance distribution function (BRDF) normalization using the c-factor technique, and bandpass adjustment by linear fitting with estimated slope and offset coefficients. For HLS.L30, data are resampled using cubic convolution and aligned to the MGRS grid despite the different UTM registration conventions between Landsat and Sentinel-2. HLS.S30, in turn, employs area-weighted averaging to deal with 10 m, 20 m, and 60 m input resolutions. For older Sentinel-2 products (prior to baseline 2.04), co-registration was performed with AROP, followed by cubic convolution before averaging [23,24].

A total of 184 HLS.L30 and HLS.S30 observations, each with a spatial resolution of 30 m, were obtained from NASA’s Earthdata platform using an R-based workflow. The data were filtered to remove cloud-contaminated pixels, retaining only those acquired under clear-sky conditions. The F-mask quality band, available for each satellite overpass, was used for this purpose. Only pixels with F-mask values of 64, 128, and 192 were retained, as these values correspond to reliable surface reflectance observations unaffected by clouds, cloud shadows, or adjacent artifacts [23]. Then, a simple temporal linear interpolation was applied across layers to fill gaps in the time series.
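The quality filtering described above can be illustrated with a minimal Python sketch. It is not the authors' R workflow: the function name `mask_reflectance`, the list-of-lists tile layout, and the toy values are assumptions made for illustration. Only the set of retained F-mask values (64, 128, and 192) comes from the text.

```python
# F-mask values that the text reports as retained (clear observations).
CLEAR_FMASK_VALUES = {64, 128, 192}


def mask_reflectance(reflectance, fmask):
    """Return a copy of `reflectance` with non-clear pixels set to None.

    Both inputs are 2-D lists (rows of pixels) with identical shapes.
    """
    return [
        [value if flag in CLEAR_FMASK_VALUES else None
         for value, flag in zip(refl_row, flag_row)]
        for refl_row, flag_row in zip(reflectance, fmask)
    ]


# Toy 2 x 3 tile with hypothetical values (not taken from the study).
# The flags 66 and 72 stand in for pixels whose quality value is not retained.
reflectance = [[0.12, 0.30, 0.25], [0.18, 0.22, 0.27]]
fmask = [[64, 66, 128], [192, 72, 64]]

masked = mask_reflectance(reflectance, fmask)
print(masked)
```

Pixels set to `None` here correspond to the gaps that the temporal interpolation step is later asked to fill.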

Previous studies have used advanced interpolation strategies, such as nonlinear methods based on harmonic functions and RF algorithms for time series reconstruction in cases with significant gaps between valid observations (e.g., around 24 observations per year) [34,35]. In our case, however, the high temporal resolution of the HLS data resulted in much smaller gaps: fewer than 1% of pixels had temporal gaps longer than 20 days (after F-mask filtering), and 74% of pixels had gaps of 10 days or less. This high-frequency imaging schedule reduces the loss of temporal information. Consequently, simple linear interpolation was sufficient to maintain the phenological patterns expected for the vegetation types in the study area. Only images with less than 90% cloud cover were gap-filled using the approximate function (Equation (1)) from the terra package in R version 4.3.0 [36]. This method interpolates values by assigning timestamps that represent the temporal distance (in days) between layers (z vector).

$$\rho_{t} = \rho_{t_1} + \frac{t - t_1}{t_2 - t_1} \times \left(\rho_{t_2} - \rho_{t_1}\right) \tag{1}$$

where $\rho_{t}$ is the interpolated reflectance at time $t$ (within the interval between $t_1$ and $t_2$); $\rho_{t_1}$ is the observed reflectance before $t$, at $t_1$; $\rho_{t_2}$ is the observed reflectance after $t$, at $t_2$; $t$ is the interpolation date; and $t_1$ and $t_2$ are the closest known dates to $t$, corresponding to the available reflectance data.
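As a worked illustration of Equation (1), the following Python sketch fills gaps in a single-pixel time series. It is a stand-in for the terra-based R implementation rather than a reproduction of it. The function name `interpolate_series` and the sample values are hypothetical, and leaving leading or trailing gaps unfilled is an assumption of this sketch.

```python
def interpolate_series(days, values):
    """Fill None gaps in `values` by linear interpolation over `days`.

    Gaps before the first or after the last valid observation stay None,
    because Equation (1) needs a valid observation on both sides.
    """
    filled = list(values)
    valid = [i for i, v in enumerate(values) if v is not None]
    for left, right in zip(valid, valid[1:]):
        t1, t2 = days[left], days[right]
        rho1, rho2 = values[left], values[right]
        for i in range(left + 1, right):
            weight = (days[i] - t1) / (t2 - t1)
            filled[i] = rho1 + weight * (rho2 - rho1)
    return filled


# Hypothetical NIR reflectance for one pixel; days are days of year.
days = [1, 4, 6, 11, 14]
nir = [0.30, None, None, 0.40, None]

filled = interpolate_series(days, nir)
print(filled)
```

For day 4, the weight is (4 - 1) / (11 - 1) = 0.3, so the filled value is approximately 0.30 + 0.3 × 0.10 = 0.33; for day 6 it is approximately 0.35. The final observation has no later neighbour and is left unfilled.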

To assess the quality of the temporal interpolation, an empirical validation was performed by simulating gaps in valid observations. A total of 10,000 pixels were randomly sampled and evaluated five times each, with random temporal gaps introduced across the time series. This process was repeated over five independent, spatially stratified replicates. The results showed high consistency, with R² values exceeding 0.99 for all spectral bands, as detailed in the Supplementary Materials.
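The logic of this gap-simulation check can be sketched in Python under simplifying assumptions: a single synthetic, smoothly varying series instead of 10,000 real pixels, one random draw of withheld dates, and the hypothetical helper names `linear_fill`, `r_squared`, and `simulate_gap_validation`. The resulting value only illustrates the procedure and says nothing about the study's own R² figures.

```python
import math
import random


def linear_fill(t, t1, rho1, t2, rho2):
    """Equation (1): linear interpolation between two valid observations."""
    return rho1 + (t - t1) / (t2 - t1) * (rho2 - rho1)


def r_squared(observed, predicted):
    """Coefficient of determination between withheld and predicted values."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1 - ss_res / ss_tot


def simulate_gap_validation(days, values, n_gaps, seed):
    """Withhold random interior observations, re-predict them from the
    nearest retained neighbours, and return R2 for the withheld points."""
    rng = random.Random(seed)
    hidden = set(rng.sample(range(1, len(days) - 1), n_gaps))
    kept = [i for i in range(len(days)) if i not in hidden]
    observed, predicted = [], []
    for i in sorted(hidden):
        left = max(k for k in kept if k < i)
        right = min(k for k in kept if k > i)
        observed.append(values[i])
        predicted.append(
            linear_fill(days[i], days[left], values[left],
                        days[right], values[right])
        )
    return r_squared(observed, predicted)


# Synthetic seasonal reflectance sampled every 3 days (illustrative only).
days = list(range(0, 365, 3))
values = [0.3 + 0.1 * math.sin(2 * math.pi * d / 365) for d in days]

r2 = simulate_gap_validation(days, values, n_gaps=20, seed=42)
print(round(r2, 4))
```

Because the synthetic signal varies slowly relative to the gap lengths, the withheld points are recovered almost exactly, which is the same reasoning the text gives for preferring simple linear interpolation on a dense series.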

The HLS time series for 2023 comprised 184 observations, 128 from the HLS.S30 product and 56 from HLS.L30. Thirty-seven dates, approximately three per month, contained only HLS.L30 data. The average cloud cover during the dry season (April to September) was 41%, increasing to 69% during the rainy season. Although no observations in 2023 were fully cloud-free, 51 of the 128 HLS.S30 dates presented cloud cover below 10%, with 41 occurring in the dry season and 10 in the rainy season; 20% of them corresponded to HLS.L30 products. On average, each pixel had between 2 and 9 cloud-free observations per month throughout the year.

The temporal resolution of HLS varied according to the season. During the rainy season, the mean revisit interval was 4.03 days, while in the dry season, it improved to 1.97 days. This high revisit frequency highlights the advantages of using HLS products over native image acquisition intervals of Sentinel-2 and Landsat 8/9, which provide observations every five and eight days, respectively.

Feature Space Combinations

A dense feature space was constructed for the mapping process, incorporating all available blue, green, red, near-infrared (NIR), and shortwave infrared (SWIR) bands from the HLS.L30 and HLS.S30 datasets. These spectral bands were used to derive the normalized difference vegetation index (NDVI), soil-adjusted vegetation index (SAVI), normalized difference water index (NDWI), and green-normalized difference vegetation index (GNDVI) (Table 1), which have shown high performances in LULC classification and in monitoring agricultural dynamics in Brazil [19,37,38].

Additionally, given the complexity involved in coffee plantation mapping at Level 4, GLCM textural metrics [39] and LST data were incorporated as additional variables. The LST data consisted of monthly median values from the dry-season Landsat 8/9 images. The texture features were computed for all spectral bands and images using the glcm function from the glcm R package, v 1.6.5. Each band was processed individually using a 7 × 7 pixel moving window, equivalent to a spatial domain of 210 × 210 m at 30 m resolution, and quantized into 64 gray levels. The following textural metrics were extracted for each band: variance, contrast, dissimilarity, correlation, second moment, entropy, and homogeneity.
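To make the texture step concrete, the sketch below computes a gray-level co-occurrence matrix and a few of the listed metrics for one small window in plain Python. It uses common textbook definitions and a single horizontal offset, so it is not guaranteed to match the R package's defaults (directions, normalization, or rounding). The function names and the 3 × 3, two-level toy window are assumptions chosen for readability, whereas the study used 7 × 7 windows and 64 gray levels.

```python
import math


def quantize(window, levels, vmin, vmax):
    """Map values in [vmin, vmax] to integer gray levels 0..levels-1."""
    scale = (levels - 1) / (vmax - vmin)
    return [
        [min(levels - 1, max(0, round((v - vmin) * scale))) for v in row]
        for row in window
    ]


def glcm_horizontal(gray, levels):
    """Symmetric, normalized co-occurrence matrix for horizontal neighbours."""
    counts = [[0.0] * levels for _ in range(levels)]
    for row in gray:
        for a, b in zip(row, row[1:]):
            counts[a][b] += 1
            counts[b][a] += 1
    total = sum(map(sum, counts))
    return [[c / total for c in row] for row in counts]


def texture_metrics(p):
    """Contrast, dissimilarity, homogeneity, second moment, and entropy."""
    n = len(p)
    cells = [(i, j, p[i][j]) for i in range(n) for j in range(n) if p[i][j] > 0]
    return {
        "contrast": sum(v * (i - j) ** 2 for i, j, v in cells),
        "dissimilarity": sum(v * abs(i - j) for i, j, v in cells),
        "homogeneity": sum(v / (1 + (i - j) ** 2) for i, j, v in cells),
        "second_moment": sum(v * v for _, _, v in cells),
        "entropy": -sum(v * math.log(v) for _, _, v in cells),
    }


# Toy checkerboard window of hypothetical reflectance values.
window = [[0.1, 0.5, 0.1], [0.5, 0.1, 0.5], [0.1, 0.5, 0.1]]
gray = quantize(window, levels=2, vmin=0.1, vmax=0.5)
metrics = texture_metrics(glcm_horizontal(gray, levels=2))
print(metrics)
```

In the checkerboard every horizontal pair joins different gray levels, so contrast is at its maximum for two levels (1.0) and homogeneity is low (0.5). A perfectly uniform window would instead give a contrast of 0 and a homogeneity of 1, which is the kind of structural difference the texture variables are meant to capture.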

Although more commonly applied to radar or high-resolution satellite data, GLCM textures from multispectral bands provide a valuable means of capturing local spatial patterns that are often indistinguishable in spectral bands alone. In LULC mappings, they are particularly useful for improving class separability in sites with structurally heterogeneous LULC classes or phenologically similar crops, although they require robust computational resources [40,41]. Other studies have also used GLCM variables, with significant improvements in classifying coffee plantation stages [11,42]. Texture analysis utilizes metrics that capture intensity variations between neighboring pixels, thereby facilitating the identification of spatial details such as edges, irregularities, and zones of interest [39]. LST captures changes in vegetation structure, especially those related to management practices that can affect thermal properties [43,44].

Table 1. Spectral indices used in this study. ρ = surface reflectance; NIR = near-infrared; SWIR = shortwave infrared; and L = soil factor.

Spectral Index	Equation	Reference
Normalized Difference Vegetation Index (NDVI)	$\frac{\rho_{NIR} - \rho_{Red}}{\rho_{NIR} + \rho_{Red}}$	Rouse et al. [45]
Normalized Difference Water Index (NDWI)	$\frac{\rho_{NIR} - \rho_{SWIR}}{\rho_{NIR} + \rho_{SWIR}}$	Gao [46]
Green Normalized Difference Vegetation Index (GNDVI)	$\frac{\rho_{NIR} - \rho_{Green}}{\rho_{NIR} + \rho_{Green}}$	Gitelson et al. [47]
Soil Adjusted Vegetation Index (SAVI)	$\frac{\rho_{NIR} - \rho_{Red}}{\rho_{NIR} + \rho_{Red} + L} (1 + L)$	Huete [48]
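The four equations in Table 1 can be written as a short Python sketch. The function names and the sample reflectances are hypothetical, and the soil factor L = 0.5 is an assumed default, since the value used in the study is not stated here.

```python
def normalized_difference(a, b):
    """Generic normalized difference (a - b) / (a + b)."""
    return (a - b) / (a + b)


def spectral_indices(green, red, nir, swir, soil_factor=0.5):
    """Compute the Table 1 indices from surface reflectance values.

    soil_factor is the L term of SAVI; 0.5 is an assumption of this sketch.
    """
    return {
        "NDVI": normalized_difference(nir, red),
        "NDWI": normalized_difference(nir, swir),
        "GNDVI": normalized_difference(nir, green),
        "SAVI": (nir - red) / (nir + red + soil_factor) * (1 + soil_factor),
    }


# Hypothetical reflectances for a densely vegetated pixel.
indices = spectral_indices(green=0.06, red=0.04, nir=0.36, swir=0.16)
for name, value in indices.items():
    print(f"{name}: {value:.3f}")
```

With these inputs NDVI is 0.32 / 0.40 = 0.8, while SAVI is lower (about 0.533) because the L term in the denominator damps the contribution of the soil background.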

Three feature sets were therefore considered: multispectral bands (MS), spectral indices (SIs), and the combination of both (MS + SIs). Each feature set was created for the whole year and for the dry season (April to September) to verify whether reducing the dataset’s dimensionality can maintain classification accuracy. At Levels 1 to 3, the three feature sets were used for both the all-year and dry-season periods. At Level 4, these combinations were enriched with LST data and GLCM attributes (Figure 6).

2.4. Classification Algorithms and Accuracy Assessment

The classification framework was implemented using the caret [49] and randomForest [50] packages available in the R software (version 4.3.0). Samples from the LULC classes defined at different hierarchical levels were used to extract values of surface reflectance, vegetation indices, land surface temperature, and texture indicators that constitute the dense feature space. These samples were divided into training and testing sets with four different ratios (40/60, 50/50, 60/40, and 70/30), primarily to address sample imbalance between certain classes. For each split, stratified random sampling was applied to preserve class proportions in both subsets, minimizing bias in model performance due to class imbalance. Additionally, ten repetitions were performed for each partition to account for variability arising from random sampling. Model training and hyperparameter tuning were performed within the training sets, while the final model performance was evaluated on the corresponding independent test sets. Within the training set, a 5-fold cross-validation with 10 repetitions was applied to assess internal model consistency and prevent overfitting.
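The partitioning scheme above (four training fractions, stratified by class, ten repetitions each) can be sketched in Python as follows. The helper `stratified_split`, the seeds, and the class counts are hypothetical and only illustrate the idea; the study itself used R, and the exact splitting function is not stated.

```python
import random
from collections import defaultdict


def stratified_split(labels, train_fraction, seed):
    """Return (train_idx, test_idx) that preserve class proportions."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)
    train, test = [], []
    for indices in by_class.values():
        rng.shuffle(indices)
        cut = round(train_fraction * len(indices))
        train.extend(indices[:cut])
        test.extend(indices[cut:])
    return sorted(train), sorted(test)


# Hypothetical, imbalanced sample counts per coffee stage.
labels = ["PL"] * 10 + ["PR"] * 40 + ["SK"] * 20 + ["ST"] * 10

partitions = {}
for fraction in (0.4, 0.5, 0.6, 0.7):      # training share of each split
    for repetition in range(10):           # ten random repetitions
        partitions[(fraction, repetition)] = stratified_split(
            labels, fraction, seed=repetition
        )

train, test = partitions[(0.7, 0)]
print(len(partitions), len(train), len(test))
```

Splitting within each class keeps rare stages such as PL and ST represented in both subsets, which a purely random split over all samples would not guarantee.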

The study was carried out using two algorithms: RF, which was used at all levels, and XGBoost, which was added at Level 4. We implemented an additional algorithm at Level 4 because of the novelty and complexity of the task. RF relies on bootstrap aggregation: multiple decision trees are built, each trained on a different bootstrap sample of the training data, and their predictions are combined by majority vote. This randomization reduces the chances of overfitting in the final model [51,52]. XGBoost, a more recent classifier developed as an improvement of the Gradient Boosting Machine, has been shown to enhance classification results, particularly in distinguishing crop types and agricultural management contexts [53,54,55]. This algorithm builds trees sequentially, with each new tree fitted to correct the errors of the previously trained trees.

At Level 4, texture and LST variables were added to the spectral data to account for the complexity of coffee phenological stages. This significantly increased the dataset, resulting in a total of 5520 variables. To reduce dimensionality, the Recursive Feature Elimination (RFE) function from caret [49], adjusted by the RF algorithm, was applied. This approach iteratively removes variables by evaluating their relationships, reducing redundancies, and minimizing their impact on model performance [56]. Additionally, the varImp function was used to rank the most important variables at each level across the different dataset combinations. For RF, the importance was measured by the Mean Decrease in Accuracy (MDA) metric, which measures the reduction in model accuracy when the values of a specific variable are randomly permuted, using out-of-bag (OOB) samples for validation. For XGBoost, the normalized importance score (ranging from 0 to 100) was used to indicate the degree of importance of each variable.

In summary, the final workflow comprised 24 models across Levels 1–3 (RF × three dataset combinations × four split sizes × two periods) and eight models for Level 4 [(RF + XGB) × four split sizes]. The RF models were trained using a 5-fold cross-validation with ten repetitions. Since hyperparameter tuning does not significantly alter the results of RF, the parameters were maintained as in the package default, with 500 trees (ntree), and the mtry set as the square root of the number of variables. On the other hand, at Level 4, a grid search was performed for different combinations of XGBoost parameters to optimize the results. The evaluated configurations were: 6, 8, and 10 for the maximum tree depth (max_depth); 0.01, 0.1, and 0.2 for the learning rate (eta); 0.5, 0.6, and 0.7 for the fraction of variables (colsample_bytree) and samples (subsample) at each iteration; and 1–5 for the minimum leaf weight (min_child_weight). The gamma factor was fixed at 0, keeping the model free to perform splits that produce additional information gain, and the number of iterations (nrounds) was fixed at 100. Model performance was assessed using accuracy, sensitivity (Equation (2)), and specificity (Equation (3)) for each class.
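The model counts and the size of the XGBoost search space follow from simple enumeration, sketched below in Python. Two points are assumptions of this sketch rather than statements from the text: that colsample_bytree and subsample were varied independently, and that min_child_weight took the integer values 1 to 5.

```python
from itertools import product

# Model counts stated in the text.
levels_1_to_3_models = 1 * 3 * 4 * 2   # RF x datasets x split sizes x periods
level_4_models = 2 * 4                 # (RF + XGB) x split sizes

# Hyperparameter values listed in the text for XGBoost.
grid = {
    "max_depth": [6, 8, 10],
    "eta": [0.01, 0.1, 0.2],
    "colsample_bytree": [0.5, 0.6, 0.7],
    "subsample": [0.5, 0.6, 0.7],
    "min_child_weight": [1, 2, 3, 4, 5],  # assumed integer steps
}
fixed = {"gamma": 0, "nrounds": 100}

# Full factorial grid: one dict per candidate configuration.
candidates = [
    dict(zip(grid, combo), **fixed) for combo in product(*grid.values())
]

print(levels_1_to_3_models, level_4_models, len(candidates))
print(candidates[0])
```

Under these assumptions the full factorial grid contains 3 × 3 × 3 × 3 × 5 = 405 candidate configurations, each evaluated with the resampling scheme described above.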

$$\mathrm{Sensitivity} = \frac{A}{A + C} \tag{2}$$

$$\mathrm{Specificity} = \frac{D}{B + D} \tag{3}$$

where A is the number of true positives, B is the number of false positives, C is the number of false negatives, and D is the number of true negatives.
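A brief Python sketch of these per-class metrics is given below, together with balanced accuracy, taken here as the mean of sensitivity and specificity. The function names and the confusion counts are hypothetical.

```python
def sensitivity(true_positives, false_negatives):
    """Share of reference positives that the model labels as positive."""
    return true_positives / (true_positives + false_negatives)


def specificity(true_negatives, false_positives):
    """Share of reference negatives that the model labels as negative."""
    return true_negatives / (true_negatives + false_positives)


def balanced_accuracy(sens, spec):
    """Mean of sensitivity and specificity."""
    return (sens + spec) / 2


# Hypothetical one-vs-rest counts for a single class.
sens = sensitivity(true_positives=40, false_negatives=2)
spec = specificity(true_negatives=57, false_positives=1)
print(round(sens, 3), round(spec, 3), round(balanced_accuracy(sens, spec), 3))
```

Applying `balanced_accuracy` to the Level 2 values reported in Section 3.1 for perennial crops (sensitivity 0.952 and specificity 0.983) gives 0.9675, in line with the balanced accuracy of 0.967 reported there.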

2.5. Spatial Predictions

Spatial predictions were carried out for each of the 24 models from Levels 1, 2, and 3, and for the eight models from Level 4. The pixels from each model were then constrained to the boundaries of the target class, as defined by the corresponding map from the immediately preceding level. Next, an additional ensemble learning strategy was applied to calculate the modal map from the four maps corresponding to the split ratios (0.4, 0.5, 0.6, and 0.7) of each dataset combination (MS, SIs, and MS + SIs), for both the all-year and dry-season datasets. By combining predictions from multiple models, an ensemble learning approach offers advantages, despite being more labor-intensive. It results in outcomes that are less prone to overfitting, bias, or unwanted variability, thus enhancing generalization capacity [57,58]. The spatial prediction process was concluded before selecting the best available combinations.

The final step involved the assessment of the model’s performance to select the final classification map for each level, i.e., the modal map resulting from the combination that achieved the best balance between accuracy, sensitivity, and specificity. At Level 4, the modal map was created by aggregating the predictions from both the RF and XGBoost classifiers. In the case of classification ties, the class from the model with the highest accuracy was assigned. A final refinement step was applied to ensure spatial consistency over the classification levels. This involved confining the final map for each level to the pixels of the target classes identified in the map from the immediately preceding level, following the same logic used for producing the initial maps. For example, coffee and forestry areas identified in the final map from Level 3 were confined to the perennial crop pixels identified by the final map from Level 2. This hierarchical confinement was necessary to ensure spatial consistency in delineating agricultural areas, as the best-performing combinations may vary between levels. Figure 7 shows the processes described in this section.
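The modal-map and confinement logic described in this section can be sketched in Python as follows. The function names, the toy one-row maps, and the accuracy values are hypothetical. The tie rule implemented is the one stated above: the class voted by the most accurate model wins.

```python
from collections import Counter


def modal_class(votes, accuracies):
    """Most frequent class; ties go to the vote of the most accurate model."""
    counts = Counter(votes)
    top = max(counts.values())
    tied = {label for label, n in counts.items() if n == top}
    if len(tied) == 1:
        return tied.pop()
    best_first = sorted(
        range(len(votes)), key=lambda k: accuracies[k], reverse=True
    )
    return next(votes[k] for k in best_first if votes[k] in tied)


def modal_map(maps, accuracies, parent_mask):
    """Pixel-wise modal map, confined to pixels where parent_mask is True."""
    rows, cols = len(maps[0]), len(maps[0][0])
    return [
        [
            modal_class([m[r][c] for m in maps], accuracies)
            if parent_mask[r][c] else None
            for c in range(cols)
        ]
        for r in range(rows)
    ]


# Four hypothetical Level 4 predictions for a 1 x 3 strip of pixels.
maps = [
    [["PR", "SK", "PL"]],
    [["PR", "SK", "PR"]],
    [["PR", "PR", "PL"]],
    [["SK", "PR", "PR"]],
]
accuracies = [0.90, 0.93, 0.88, 0.91]     # hypothetical model accuracies
coffee_mask = [[True, True, False]]       # coffee pixels from the prior level

final_map = modal_map(maps, accuracies, coffee_mask)
print(final_map)
```

The first pixel is a clear majority (PR), the second is a 2–2 tie resolved in favour of SK because the most accurate model voted SK, and the third is dropped because it lies outside the class mapped at the preceding level.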

3. Results

3.1. Accuracy Assessment and Spatial Predictions

Level 1 distinguished anthropic vegetation from natural vegetation (see complete results and average confusion matrix in Supplementary Tables S2 and S3). Not surprisingly, all models performed well at this level. Throughout the year, multispectral attributes performed best, with average accuracies, sensitivities, and specificities of 0.973, 0.975, and 0.966, respectively. At this level, the RF algorithm estimated that Caconde has 36,933 ha of anthropic vegetation and 7737 ha of native vegetation, accounting for 78.8% and 16.9% of the municipality’s total area, respectively. It is important to emphasize that, at this level, no spatial filters were applied. Since the region has high landscape fragmentation, filters could eliminate small patches of native vegetation.

Level 2 focused on detecting perennial crops, primarily coffee plantations, whose distinction from forest plantations was further refined at Level 3. At Level 2, the best-performing input configuration was the combination of multispectral bands and spectral indices across the entire year (MS + SIs, all-year), which yielded a superior balance between sensitivity (0.952) and specificity (0.983) for perennial crops, resulting in a balanced accuracy of 0.967 (Table 2 and Table 3). Of the 36,933 ha of anthropic areas, 53.2% are perennial crops, 37.2% pastures, and 9.6% annual crops (Supplementary Figure S3).
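For reference, balanced accuracy is the arithmetic mean of sensitivity and specificity. The short sketch below recomputes it from the rounded Level 2 values quoted above; the function name is an illustrative choice, and the small difference from the reported 0.967 presumably reflects rounding of the inputs.

```python
def balanced_accuracy(sensitivity, specificity):
    """Mean of the true-positive rate and the true-negative rate."""
    return (sensitivity + specificity) / 2

# Rounded Level 2 values for perennial crops (MS + SIs, all-year)
ba_level2 = balanced_accuracy(0.952, 0.983)
print(round(ba_level2, 4))
```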

At Level 3, an additional classification step was applied to separate coffee plantations from forestry, particularly eucalyptus plantations, which often exhibit spectral confusion with coffee. The model using only multispectral bands (MS all-year) achieved the highest accuracy (0.996), outperforming both SIs and MS + SIs combinations. The refined map indicated that coffee plantations accounted for 97.3% (19,764 ha) of the area of perennial crops mapped in Level 2. In comparison, forestry occupied only 2.7% (551 ha) (see full results and final map in Supplementary Table S4 and Figure S3).

At Level 4, the production stages of coffee plantations, namely planting (PL), producing (PR), skeleton pruning (SK), and renovation with stumping (ST), were classified as distinct classes. Notably, this level is more challenging to map because of the increased class detail. The results of all models are shown in Table 4 and Table 5, and the spatial prediction produced from all eight models is shown in Figure 8.

Due to the increased complexity at Level 4, texture and LST data were incorporated, which expanded the database to 5520 variables. After applying RFE, the database was reduced to 1500 covariates, and this refined set was used to train both the RF and XGBoost algorithms. As expected, the increase in the complexity of Level 4 led to a decrease in accuracy compared to the other levels. Nevertheless, the results were consistent, with average accuracies of 0.835 for RF and 0.838 for XGBoost. The analysis also revealed that the training/testing split ratio affected performance. The two best models were obtained with 70% (0.7) and 60% (0.6) of the samples in the training set, while the two worst models were obtained with only 40% (0.4) and 50% (0.5).
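The four training fractions can be illustrated with a minimal sketch of repeated sample splitting. Whether the original splits were stratified by class is not stated in this section, so the stratification, the function name `stratified_split`, the fixed seed, and the toy sample of ten labels per class are all assumptions made for illustration.

```python
import random
from collections import defaultdict

def stratified_split(labels, train_fraction, seed=0):
    """Split sample indices into train/test, preserving class proportions."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, lab in enumerate(labels):
        by_class[lab].append(idx)
    train, test = [], []
    for lab in sorted(by_class):
        idxs = by_class[lab][:]
        rng.shuffle(idxs)
        cut = round(len(idxs) * train_fraction)
        train.extend(idxs[:cut])
        test.extend(idxs[cut:])
    return sorted(train), sorted(test)

# Toy sample: ten reference points for each Level 4 class
labels = ["PL"] * 10 + ["PR"] * 10 + ["SK"] * 10 + ["ST"] * 10
splits = {f: stratified_split(labels, f) for f in (0.4, 0.5, 0.6, 0.7)}
for fraction, (train, test) in splits.items():
    print(fraction, len(train), len(test))
```

One model would then be trained per fraction, which gives the four maps per classifier that enter the modal-map step.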

Considering the metrics per model and per class, XGBoost slightly outperformed RF. The highest balanced accuracies were achieved for PR (0.948 in RF) and SK (0.947 in XGBoost). On the other hand, ST presented lower metrics, with a balanced accuracy of 0.78 in both RF and XGBoost. These results highlight the persistent challenges of mapping coffee plantations at different production stages. Nonetheless, the findings also indicate that HLS time series are a promising approach to overcome some of these limitations, offering improved consistency and potential for more accurate classification.

3.2. Feature Importance Analysis

The variable importance was analyzed by averaging the Mean Decrease Accuracy (MDA) metrics obtained from the varImp function in the randomForest package [50] for Levels 1 to 4, and by the normalized importance metrics for XGBoost models at Level 4. Only the variables used to produce the final maps were included in the analysis. We present the complete results of the variable importance for Levels 2 and 4 in the main text, while the results for Levels 1 and 3 are available in the Supplementary Materials.

At Level 2, the months with the highest importance (MDA values) were November (149) and July (145), while January (70) and February (61) presented the lowest values (Figure 9). Among the spectral bands, the green and red bands were the most important, with total MDA values of 282 and 171 (sum of importance for all elements), respectively. The green band occupied five of the top six positions among the most influential variables, showing prominence across several months. GNDVI was the most important spectral index at this level. Together, GNDVI and the green bands account for 39% of the total variable importance in the Level 2 dataset (Figure 10).

At Level 4, the application of RFE resulted in a feature space comprising 1500 variables, including 896 texture features, 314 spectral indices, and 290 bands. No LST features were present in the final set. In both classifiers, RF and XGBoost, October had the highest total importance, while July and August contributed the least to model performance (Figure 11). Although the GLCM-based texture dataset contributed with the highest number of variables, its average importance was inferior to that of the spectral bands and indices. Among the top 20 ranked variables in both RF and XGBoost models, GLCMs were present on an average of 0.75 per model. On average, 583 texture-based variables in the RF models and 512 in the XGBoost models presented importance values equal to or less than zero. These results indicate the importance of carefully selecting texture features to prevent compromising model performance and spatialization accuracy. In terms of the total importance by month and the ranking of the top 10 estimators, XGBoost showed higher overall importance (Figure 12).

When considering only variables with metrics greater than 0, texture features showed an average importance of 1.02 based on MDA and 2.05 for overall normalized importance in the XGBoost models. Spectral indices and bands showed higher importance, with average MDA values of 1.17 and 1.15, respectively, and overall normalized importance values of 6.21 and 5.42, respectively. Spectral indices emerged as the most important features for both algorithms. NDWI and SAVI presented the highest average importance in the RF models, with MDA values of 1.22 and 1.18, respectively. In the XGBoost models, spectral indices also dominated variable importance: GNDVI, NDVI, and SAVI had the highest mean overall scores (7.74, 7.36, and 6.04, respectively).

Among spectral bands, Red was the most informative in both algorithms, with an average MDA of 1.10 in RF and a mean overall importance of 5.62 in XGBoost. By comparison, the second-highest band was Blue (RF: mean MDA = 1.04; XGBoost: mean overall = 2.05). Texture (GLCM) metrics derived from these bands were also prominent: in XGBoost, GLCM features from Red/Blue ranked the most relevant, whereas in RF the GLCM features from Blue and Green contributed most. Figure 13 presents the temporal signature of the most relevant variables in each class at each classification level, highlighting the surface phenology and the potential for separability in each case.

Although GLCM-based textural features derived from spectral bands are computationally intensive, they were expected to improve the discrimination of LULC classes by introducing spatial context into the dataset. However, selecting relevant texture features is still challenging. In general, the most relevant GLCM textures differ between the algorithms. In XGBoost, the average importance is clearly higher for variance (mean = 2.62; sum ≈ 1402) and correlation (2.10; sum ≈ 1091), followed by dissimilarity (2.06) and contrast (1.98), indicating that measures sensitive to variation and linear co-occurrence carry more signal in this model. In RF models, the pattern is more homogeneous: angular second moment (1.05) and homogeneity (1.04) stand out as the highest means; variance has the largest total sum (≈681) because it appears very frequently, although its average per feature is similar to the others (≈1.03). In summary, XGBoost concentrates importance on variation and correlation metrics, while RF distributes importance more evenly, with a slight advantage for angular second moment and homogeneity, and a consistently large total for variance due to its many positive occurrences.

In the combined analysis, the strongest mean effects come from homogeneity from May (mean ≈ 5.05, total ≈ 50.5), dissimilarity from May (≈3.63, 72.5) and August (≈3.35, 33.5), and variance from February (≈3.13, 266.3). Within the late-season window, correlation from October (≈2.83, 324.8) and variance from December (≈2.79, 501.3) stand out, with dissimilarity from December also high (≈2.58, 361.8). Overall, this shows that while Oct–Dec texture features (especially correlation and variance) remain highly influential due to large totals and many features with positive influence on the models, the largest per-feature means occur in May (homogeneity, dissimilarity) and early season. A practical strategy is to prioritize Oct–Dec for broad coverage, complemented by targeted May/early-season textures that deliver strong average importance. The total and relative importance of each type of texture metric is displayed in Figure 14.
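The contrast between per-feature means and totals can be made explicit by back-calculating how many importance values stand behind each figure. The sketch below uses only the rounded means and totals quoted above; the implied counts (total divided by mean) are therefore approximate, and the variable names are illustrative.

```python
# (texture metric, month): (mean importance, total importance), as quoted above
reported = {
    ("homogeneity", "May"): (5.05, 50.5),
    ("dissimilarity", "May"): (3.63, 72.5),
    ("dissimilarity", "August"): (3.35, 33.5),
    ("variance", "February"): (3.13, 266.3),
    ("correlation", "October"): (2.83, 324.8),
    ("variance", "December"): (2.79, 501.3),
    ("dissimilarity", "December"): (2.58, 361.8),
}

# Approximate number of contributing importance values per group
implied_counts = {
    key: round(total / mean) for key, (mean, total) in reported.items()
}

top_by_mean = max(reported, key=lambda k: reported[k][0])
top_by_total = max(reported, key=lambda k: reported[k][1])

for key in reported:
    print(key, implied_counts[key])
print("highest mean:", top_by_mean)
print("highest total:", top_by_total)
```

Under these figures, the May groups rest on roughly 10 to 20 values each, whereas the October and December groups rest on well over 100, which is why the two rankings diverge.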

4. Discussion

4.1. Advantages of HLS Data for Class Separability

One of the main advantages of remote sensing data integration through harmonization and fusion is the improvement in temporal resolution. This enhancement enables the precise capture of spectral signatures, representing the diverse attributes of native and anthropogenic vegetation, including seasonal variability in aboveground biomass, leaf area index, soil moisture, and management practices. These attributes are closely related to land surface phenology and plant health, which are critical for mapping and monitoring vegetation dynamics [19,59,60].

In this study, the temporal spectral behavior of the classes was displayed using monthly averages of the most important variables, as shown in Figure 13: the red band for Level 1, GNDVI for Level 2, the SWIR band for Level 3, and SAVI for Level 4. Examining these signatures over the year clarified specific confusion patterns and revealed the complexity of certain classes, such as the similarity between stumped and skeleton-pruned coffee in Level 4.
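As a minimal sketch of how such monthly signatures can be derived, the code below computes GNDVI and SAVI from surface reflectance using their widely used formulations and averages observations by month. The soil adjustment factor of 0.5, the function names, and the reflectance values are assumptions for illustration and may differ from the exact settings used in this study.

```python
from collections import defaultdict

def gndvi(nir, green):
    """Green Normalized Difference Vegetation Index."""
    return (nir - green) / (nir + green)

def savi(nir, red, soil_factor=0.5):
    """Soil-Adjusted Vegetation Index; soil_factor=0.5 is an assumed default."""
    return (1 + soil_factor) * (nir - red) / (nir + red + soil_factor)

def monthly_means(observations):
    """Average (month, value) pairs into a {month: mean} dictionary."""
    grouped = defaultdict(list)
    for month, value in observations:
        grouped[month].append(value)
    return {m: sum(v) / len(v) for m, v in sorted(grouped.items())}

# Hypothetical reflectance triplets (month, NIR, red) for one class
samples = [(4, 0.40, 0.05), (4, 0.38, 0.06), (5, 0.35, 0.07)]
savi_profile = monthly_means([(m, savi(nir, red)) for m, nir, red in samples])
print(savi_profile)
```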

Although native vegetation and anthropized cover were generally well separated in Level 1, the temporal curves highlighted how seasonal amplitude in vegetation vigor drives class differentiation. In Caconde, Atlantic Forest native formations consistently maintained lower red reflectance during the year, reflecting high photosynthetic activity. In contrast, anthropized vegetation achieved the peak of vigor at the onset of the rainy season. This seasonal effect can lead to misclassification in some regions, as observed in Brazilian Savanna (Cerrado Biome) shrublands, where native cover may resemble pastures in spectral terms [37,61,62].

At Level 2, the GNDVI profiles revealed that pasture and annual crop classes shared significant similarities throughout the year, particularly in smallholder landscapes. Field surveys in 2023 showed that, beyond soybean fields (800 ha), maize dominated annual cropping (300 ha), alongside a mosaic of sugarcane (140 ha), sorghum (130 ha), beans (35 ha), potatoes (20 ha), and onions (10 ha) [63]. This fine-grained heterogeneity, typical of small plots supplying local markets, amplifies spectral confusion and challenges mapping efforts even at 30 m resolution [64].

Level 4 analyses identified two key periods for detecting skeletonized coffee: (i) April–May, during fruit maturation, when SAVI declines; and (ii) post-harvest through late dry season (September–November), when producing coffee recovers vigor more quickly than skeletonized stands. Newly planted, skeletonized, and stumped coffee maintained consistently lower SAVI values, aiding class discrimination from producing areas.

Across the four levels, the importance rankings consistently reveal the role of the rainy season months (October–March). Such periods are often dismissed for their higher cloud cover, but they captured essential phenological transitions linked to agricultural practices in tropical systems. Especially in Levels 3 and 4, these transitions proved critical for distinguishing structurally similar perennial crops and highlight the added value of dense HLS time series in complex agricultural landscapes.

The fusion of reflectance data and the integration of multisensor, multiscalar data have led to superior performance in mapping coffee in studies such as Tridawati et al. [42] and Souza et al. [65]. Likewise, in this study, the integration of Landsat 8/9 and Sentinel-2 was highly effective for detecting coffee plantations and for characterizing, primarily, areas in production and newly planted fields.

4.2. Performance of Mappings and Impacts of Variables for Levels 1, 2, and 3

Level 1 presented the least difficulty in our mapping, reaching an average accuracy above 0.97. This is likely because this level involves only two classes, native and anthropic vegetation. These results are consistent with other studies that used a hierarchical mapping approach to analyze agricultural dynamics [37,59,62]. The main confusion with native vegetation occurred in areas of perennial crops due to their similar spectral characteristics, a common challenge in vegetation classification [66,67].

The all-year multispectral dataset performed best for mapping Level 1, particularly using the red and SWIR bands for the rainy season (September, November, and December) in Caconde. This region presents dense rainforests and land uses adapted to the humid conditions. Therefore, the red and SWIR bands effectively capture reflectance variations related to chlorophyll content and moisture levels in the vegetation canopy [46], making it easier to discriminate between natural and anthropogenic vegetation. Previous studies have also shown the effectiveness of red and SWIR bands in separating natural and anthropogenic vegetation in other regions [19,37,58,62].

At Level 2, the mapping process aimed at distinguishing perennial crops from annual crops and pastures was most effective when using the all-year dataset combining multispectral bands and spectral indices (MS + SIs), achieving an average accuracy of 0.939. These results align with previous findings from Bendini et al. [59] and Parreiras et al. [62], who demonstrated that combining spectral indices with multispectral bands substantially improved the performance of agricultural mapping.

The most prominent variables for Level 2 were the green band and GNDVI during the rainy months (November and December) and the dry month (July). This shows the importance of water availability throughout the year for separating agricultural uses. During the rainy season, water availability supports the metabolic activities of vegetation, resulting in high reflectance in the green spectrum [68]. Conversely, despite reduced precipitation in the dry season, perennial crops typically maintain higher biomass levels. This contrasts with pastures, which quickly respond to water scarcity with a decline in vigor [69], and non-irrigated annual crops, which usually complete their growth cycle and are harvested before the end of the rainy season. Thus, during the dry season, the higher green reflectance in perennial crops, compared to pastures and annual crops, facilitated the separability of these classes. Previous studies in Brazil [61] and Iran [70] have also shown the importance of the green band and green-related indices, primarily due to their sensitivity to variations in chlorophyll pigments and vegetation vigor.

At Level 3, the SWIR band effectively captured key phenological attributes of both coffee and eucalyptus plantations. Coffee plantations exhibited significant variations in vegetation stages, with periods of increased canopy density during flowering and grain formation and phases of reduced vegetation cover during planting, skeleton pruning, and renovation with stumping, which expose the soil background [32]. In contrast, eucalyptus has a longer phenological cycle with a denser and more stable leaf structure, retaining higher water content throughout the year, resulting in consistently lower SWIR reflectance (Figure 13). These contrasting phenological characteristics lead to distinct SWIR behaviors, enabling the differentiation between coffee and silviculture.

4.3. Level 4: Mapping the Stages of Coffee Production, Advances, and Challenges

Historically, permanent crops have been overlooked in remote sensing research, often due to difficulties in monitoring or lack of economic interest [14]. Mapping perennial crops such as coffee presents unique difficulties due to factors like landscape variability, which arises from differences in production systems (e.g., agroforestry vs. monoculture), varying management practices, and different crop development stages. Additional complexities include steep terrain, persistent cloud cover, and spectral similarities with other classes, all of which complicate the mapping process [11,12]. Despite these challenges, a growing number of studies have focused on developing improved techniques for coffee crop mapping. A summary of these efforts is provided in Table 6.

As highlighted in previous studies, seasonality is an essential attribute for coffee mapping. However, earlier studies often relied on restricted approaches, such as using a small number of images from only the dry season or converting multitemporal images into seasonal metrics [14,15,16,65,66,67]. In contrast, this study captured seasonality from a dense time series of spectral bands and indices from HLS products.

Although still challenging, Level 4 models yielded promising results in distinguishing coffee production stages. For both RF and XGBoost, the PL and PR stages were the best classified, with errors of at most 10.7%. In contrast, previous studies in Zimbabwe [66] and Vietnam [14] reported that RF models had low separability between young and mature plantations. When confusion occurred, ST was the class assigned in 100% of the cases for PL, and 83% for PR. ST was the most challenging class for both models, consistently showing the lowest sensitivity. XGBoost maintained very high specificity, with errors occurring mainly by omission. RF followed a similar pattern, with no real improvement in ST detection and only a slight drop in specificity. Despite these difficulties, the study introduced a relevant advancement by isolating and mapping the class of ST coffee trees, previously grouped with newly planted areas in other studies [66,67]. These results provide proof of concept and highlight directions for future data collection and model refinement in complex classification scenarios.

Management practices alter canopy structure and reflectance patterns over time, introducing variability that complicates class separation. In contrast to previous findings [72,73], which reported substantial gains in class discrimination through the inclusion of texture metrics, particularly those derived from microwave or very high-resolution imagery, no such improvement was observed in this study. Here, GLCM features were extracted from medium-resolution optical data. Although their inclusion initially increased the number of features to 5520, the application of RFE allowed a ~73% reduction, retaining the most relevant variables. The 896 texture-derived features selected accounted for only 33.4% of the total variable importance, whereas the remaining 604 features, based on spectral bands and indices, contributed 66.6%. This difference suggests that, despite their numerical prevalence, the optical GLCM texture features added limited discriminatory power. While RFE does not directly enhance model accuracy, it streamlines the dataset and improves interpretability, reinforcing the value of dimensionality reduction in coffee mapping using dense time series.
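The figures in this paragraph can be checked with a few lines of arithmetic. The sketch below uses only the counts and percentages quoted above; the per-feature shares and their ratio are derived quantities computed here for illustration, not values reported by the study.

```python
n_initial, n_selected = 5520, 1500
n_texture, n_spectral = 896, 604          # 604 = 314 indices + 290 bands
share_texture, share_spectral = 33.4, 66.6  # % of total variable importance

# Fraction of variables removed by RFE
reduction = 1 - n_selected / n_initial

# Average share of total importance carried by a single feature of each type
per_feature_texture = share_texture / n_texture
per_feature_spectral = share_spectral / n_spectral
ratio = per_feature_spectral / per_feature_texture

print(f"RFE reduction: {reduction:.1%}")
print(f"importance per texture feature: {per_feature_texture:.3f}%")
print(f"importance per spectral feature: {per_feature_spectral:.3f}%")
print(f"spectral-to-texture ratio: {ratio:.2f}")
```

Read this way, an average spectral band or index carried roughly three times the importance of an average texture feature, which is consistent with the limited discriminatory power attributed to the optical GLCM features.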

Although GNDVI presented high average importance in both RF and XGBoost models at Level 2, its contribution to improving classification accuracy at Level 4 was limited. This discrepancy may be attributed to GNDVI’s high sensitivity to chlorophyll content, which enhances the detection of productive coffee trees, combined with its limited sensitivity to subtle structural changes in the canopy, such as those resulting from pruning [74,75]. Its high importance likely reflects its effectiveness in distinguishing producing areas. However, confusion with other classes suggests the need for broader temporal coverage (e.g., two to three years), larger sample sizes, or the inclusion of additional indices that may be more sensitive to canopy structural changes resulting from management practices. Such additions may better capture subtle variations in photosynthetic activity during early recovery stages. In any case, it is also important to note that a variable’s high importance does not necessarily translate to improved class separability, especially when dealing with complex classes in heterogeneous landscapes, where classification accuracy tends to decrease. Nevertheless, the spatial occurrence of these classes should not be overlooked.

The hierarchical approach helped minimize confusion between similar land use types by breaking the problem into simpler, more sequential decisions [59,62]. Reducing complex classification tasks into smaller subtasks has proven particularly beneficial in multiclass analysis based on a deep learning approach, as it helps reduce computational costs [76]. In the context of LULC and crop type mapping using machine learning, hierarchical systems offer several advantages, including interoperability across different scales, flexibility in incorporating new classes, and interdisciplinarity, as the classification legend can be applied across various analytical frameworks [77,78]. Furthermore, hierarchical systems can enhance the ecological relevance of each class, promoting more accurate analyses of global LULC dynamics [78] and facilitating comparison with official statistics from different sources and purposes. However, a typical pattern in the literature is to classify coffee plantations as a single class within a broader land use classification, typically ranging from five to ten categories.

Several studies and meta-analyses, such as those conducted by Chaves et al. [79], Fang et al. [80], Bolfe et al. [37], and Escobar-López et al. [11], have shown that the choice of algorithm for well-established non-parametric classifiers like SVM, KNN, and decision trees often has little impact on overall performance. However, differences may arise in specific classes or contexts. Likewise, this study found that XGBoost presented performance similar to RF, but with a significantly higher computational cost. XGBoost performed slightly better, especially in zero-harvest classes, with greater sensitivity to PL and SK, and greater specificity to PL and ST. However, with a much lower operating cost, RF consistently performed at the same level as XGBoost, yielding improved classifications for the PR stage. Therefore, integrating models with different structures, such as bagging (RF) and boosting (XGBoost), leverages the complementary strengths of both classifiers, and by aggregating predictions through methods such as stacking or voting, a consensus map is produced, further confirming the robustness of the classifications [81].

The models developed in this study were trained exclusively on data from 2023, a period marked by average rainfall within a broader trend of decreasing precipitation in the region. While the dataset used in this study offered sufficient spectral and spatial variability for the models, the exclusive use of data from a single, climatically favorable year can represent a limitation regarding temporal generalization [82,83]. Coffee crop characteristics, such as canopy vigor, spectral index values, and texture patterns, are known to be sensitive to interannual climate variability, particularly rainfall anomalies [82,83]. In drier or drought-affected years, NDVI values tend to decline due to reduced chlorophyll content, lower leaf area, or increased senescence. In this sense, models trained with NDVI data from rainy years may perform poorly in drier years, mistakenly interpreting low NDVI as a non-productive area. This discrepancy can lead to underperformance or biased predictions. Likewise, texture metrics derived from stressed canopies may differ markedly from those observed under normal conditions. Therefore, caution is needed when applying the models created in this study in years with markedly different environmental conditions. We therefore highlight the importance of incorporating multi-year datasets or climate-sensitive predictors in further studies [82,83].

Finally, no study has employed deep neural networks for segmentation in coffee mapping. This is a promising strategy for future studies. In particular, convolutional neural networks have shown good performance in mapping crops in complex environments [84,85]. However, it is important to note that these networks require a large volume of training data and substantial computational resources, particularly in regional applications.

5. Conclusions

This study presents a novel methodological framework for coffee mapping, introducing several innovations rarely addressed in the literature. We leverage dense, high-frequency time series from HLS images, used here for the first time in this context, without applying temporal reducers, thereby allowing full exploitation of phenological patterns. Tested in a structurally fragmented landscape dominated by smallholder farms, our approach accurately distinguishes four stages of coffee production using in situ data, thereby adding operational value to crop management. A four-level hierarchical ensemble classification was implemented by stacking RF models trained with multiple training/testing splits (40/60%, 50/50%, 60/40%, and 70/30%). At Level 4, four RF and four XGBoost models were combined into an eight-model ensemble, with majority voting and a tiebreaker strategy. The models achieved high accuracy, sensitivity, and specificity (>0.90) in the first three levels, outperforming comparable studies. Although performance declined moderately at the final level, which focused on production stages, the ensemble still yielded balanced accuracy ranging from 0.78 (renovation with stumping) to 0.95 (producing).

The results of this study, in both the identification and characterization of coffee pixels, are unprecedented in the literature. There is marked variability in variable importance across RF and XGBoost, especially among spectral indices and GLCM texture metrics. Spectral indices were the most important variables across the entire study. Overall, temporal sampling (image-acquisition period) and training-data volume mattered more than the specific indices or texture features themselves. While RF and XGBoost generally performed at similar levels, RF had a much lower computational cost. In specific configurations with larger training sets (70% of the data), XGBoost achieved up to ~5% higher accuracy in coffee characterization. This indicates that the performance gain of XGBoost may not always offset its greater computational demands, depending on the application context.

The methodological framework supports targeted decision-making for smallholders regarding management practices, while enhancing access to credit and risk mitigation tools. It was developed to be technically scalable to other coffee-producing regions, as it relies on globally available Earth Observation data and open-source tools. However, its successful replication depends on the availability of representative in situ samples, particularly for less frequent production stages. In actively productive areas, where phenological signals are stronger and more consistent, the approach has proven especially effective. All codes and procedures are publicly shared, and future work will focus on improving classification accuracy for underrepresented classes and supporting broader operational implementation.

Supplementary Materials

The following supporting information can be downloaded at: https://www.mdpi.com/article/10.3390/rs17183168/s1, Figure S1: Scatter plots between the values of the multispectral bands in the original and interpolated datasets; Figure S2: Level 1 spatial prediction and average confusion matrix of the Random Forest with all-year multispectral models; Figure S3: Level 3 spatial prediction and average confusion matrix of the Random Forest with all-year multispectral models; Figure S4: Total importance per type of variable in Level 1 (left) and Level 3 (right), considering the all-year multispectral (MS) dataset; Figure S5: Total importance of variables per month at Level 1 (left), and Top 10 ranked variables based on average Mean Decrease Accuracy across models using the all-year multispectral dataset (right); Figure S6: Total importance of variables by month at Level 3 (left), and Top 10 ranked variables based on average Mean Decrease Accuracy across models using the multispectral dataset (right). Table S1: Validation results of the gap-filling procedure with linear interpolation of the Harmonized Landsat Sentinel-2 (HLS) multispectral bands via cross-validation; Table S2: Random Forest results at Level 1 with all-year and dry-season datasets in the following combinations: multispectral bands (MS), spectral indices (SIs), and multispectral bands and spectral indices (MS + SIs); Table S3: Random Forest results at Level 3 with all-year and dry-season datasets in the following combinations: multispectral bands (MS), spectral indices (SIs), and multispectral bands and spectral indices (MS + SIs).

Author Contributions

Conceptualization, C.d.O.S., T.C.P., É.L.B., E.E.S. and G.B.; methodology, C.d.O.S., G.B. and T.C.P.; software, C.d.O.S. and T.C.P.; validation, C.d.O.S. and T.C.P.; formal analysis, T.C.P., C.d.O.S., É.L.B., E.E.S., V.B.S.L., G.B., L.A.P.d.S., D.E.G.F., L.A.S.R. and D.M.; investigation, C.d.O.S., T.C.P., E.E.S. and É.L.B.; resources, É.L.B.; data curation, C.d.O.S. and T.C.P.; writing—original draft preparation, T.C.P., C.d.O.S., É.L.B., E.E.S., V.B.S.L., G.B., L.A.P.d.S., D.E.G.F., L.A.S.R. and D.M.; writing—review and editing, T.C.P., C.d.O.S., É.L.B., E.E.S., V.B.S.L., G.B., L.A.P.d.S., D.E.G.F., L.A.S.R. and D.M.; visualization, T.C.P., C.d.O.S., É.L.B., E.E.S., V.B.S.L., G.B., L.A.P.d.S., D.E.G.F., L.A.S.R. and D.M.; supervision, É.L.B., E.E.S. and L.A.S.R.; project administration, É.L.B. and L.A.S.R.; funding acquisition, É.L.B. and L.A.S.R. All authors have read and agreed to the published version of the manuscript.

Funding

This research and the APC were funded by the São Paulo Research Foundation (FAPESP), grant numbers 2022/09319-9 (Semear Digital Project), 2024/13150-5 (T.C.P.), 2023/04008-8 (C.d.O.S.), 2024/05205-4 (D.E.G.F.), and 2025/01750-0 (V.B.S.L.), and by the National Council for Scientific and Technological Development (CNPq)/Research Productivity Fellowship (PQ) (É.L.B. and E.E.S.).

Data Availability Statement

The data supporting the findings of this study were deposited in the Embrapa Research Data Repository, Redape, at https://doi.org/10.48432/4HRRJQ.

Acknowledgments

The authors acknowledge the support of the Rural Union of Caconde during the field visits, with special thanks to Ademar Pereira (President), Valéria Franco de Melo, and Fabrício Fagundes (Agronomists), as well as the rural producers who kindly granted access to their properties. During the preparation of this work, the authors used automated tools to improve the grammatical quality and fluency of the text. The authors have reviewed and edited the output and take full responsibility for the content of this publication.

Conflicts of Interest

The authors declare no conflicts of interest.

References

1. CONAB. Monitoring the Brazilian Harvest: 3rd Survey 2024. Coffee Harvest. Available online: https://www.gov.br/conab/pt-br/atuacao/informacoes-agropecuarias/safras/safra-de-cafe/3o-levantamento-de-cafe-safra-2024/boletim-cafe-setembro-2024 (accessed on 25 April 2025).
2. MAPA. Ministério de Agricultura e Pecuária. Brasil é o Maior Produtor Mundial e o Segundo Maior Consumidor de Café. Available online: https://www.gov.br/agricultura/pt-br/assuntos/noticias/brasil-e-o-maior-produtor-mundial-e-o-segundo-maior-consumidor-de-cafe (accessed on 25 April 2025).
3. CECAFÉ. Brazilian Coffee Exporters Council (Conselho dos Exportadores de Café do Brasil). Monthly Report: June 2024 (Relatório Mensal: Junho 2024). Available online: http://www.consorciopesquisacafe.com.br/images/stories/noticias/2021/2024/Junho/CECAFE_Relatorio_Mensal_JUNHO_2024.pdf (accessed on 25 April 2025).
4. Martins, V.A.; Fredo, C.E.; da Silva Lago Baptistella, C.; de Camargo Bini, D.L.; Pinatti, E.; de Camargo, F.P.; Miura, M.; Coelho, P.J.; Nakama, L.M.; Ferreira, T.T. Previsões e estimativas de safra do Estado de São Paulo, ano agrícola 2023/24. Análises e Indicadores do Agronegócio 2024, 19, 1–12.
5. Pham, Y.; Reardon-Smith, K.; Mushtaq, S.; Cockfield, G. The impact of climate change and variability on coffee production: A systematic review. Clim. Chang. 2019, 156, 609–630.
6. Bilen, C.; El Chami, D.; Mereu, V.; Trabucco, A.; Marras, S.; Spano, D. A systematic review on the impacts of climate change on coffee agrosystems. Plants 2022, 12, 102.
7. Byrareddy, V.M.; Kath, J.; Kouadio, L.; Mushtaq, S.; Geethalakshmi, V. Assessing scale-dependency of climate risks in coffee-based agroforestry systems. Sci. Rep. 2024, 14, 8028.
8. Jones, K.; Njeru, E.M.; Garnett, K.; Girkin, N. Assessing the impact of voluntary certification schemes on future sustainable coffee production. Sustainability 2024, 16, 5669.
9. Martinez, H.E.P.; Andrade, S.A.L.; Santos, R.H.S.; Baptistella, J.L.C.; Mazzafera, P. Agronomic practices toward coffee sustainability. A review. Sci. Agric. 2024, 81, e20220277.
10. Kittichotsatsawat, Y.; Jangkrajarng, V.; Tippayawong, K.Y. Enhancing coffee supply chain towards sustainable growth with big data and modern agricultural technologies. Sustainability 2021, 13, 4593.
11. Escobar-López, A.; Castillo-Santiago, M.Á.; Mas, J.F.; Hernández-Stefanoni, J.L.; López-Martínez, J.O. Identification of coffee agroforestry systems using remote sensing data: A review of methods and sensor data. Geocarto Int. 2024, 39, 1.
12. Hunt, D.A.; Tabor, K.; Hewson, J.H.; Wood, M.A.; Reymondin, L.; Koenig, K.; Schmitt-Harsh, M.; Follett, F. Review of remote sensing methods to map coffee production systems. Remote Sens. 2020, 12, 2041.
13. Bégué, A.; Arvor, D.; Bellon, B.; Betbeder, J.; de Abelleyra, D.; Ferraz, R.P.D.; Lebourgeois, V.; Lelong, C.; Simões, M.; Verón, S.R. Remote sensing and cropping practices: A review. Remote Sens. 2018, 10, 99.
14. Maskell, G.; Chemura, A.; Nguyen, H.; Gornott, C.; Mondal, P. Integration of Sentinel optical and radar data for mapping smallholder coffee production systems in Vietnam. Remote Sens. Environ. 2021, 266, 112709.
15. Kelley, L.C.; Pitcher, L.; Bacon, C. Using Google Earth Engine to map complex shade-grown coffee landscapes in Northern Nicaragua. Remote Sens. 2018, 10, 952.
16. Manoel, M.C.; Rosa, M.R.; Queiroz, A.P. Analysis of the biennial productivity of arabica coffee with Google Earth Engine in the Northeast region of São Paulo, Brazil. Remote Sens. 2024, 16, 3833.
17. Silva, W.F.; Rudorff, B.F.T.; Formaggio, A.R.; Paradella, W.R.; Mura, J.C. Discrimination of agricultural crops in a tropical semi-arid region of Brazil based on L-band polarimetric airborne SAR data. ISPRS J. Photogramm. Remote Sens. 2009, 64, 458–463.
18. Le, Q.T.; Dang, K.B.; Giang, T.L.; Tong, T.H.A.; Nguyen, V.G.; Nguyen, T.D.L. Deep learning model development for detecting coffee tree changes based on Sentinel-2 imagery in Vietnam. IEEE Access 2022, 10, 109097–109107.
19. Chaves, M.E.D.; Sanches, I.D. Improving crop mapping in Brazil’s Cerrado from a data cubes-derived Sentinel-2 temporal analysis. Remote Sens. Appl. Soc. Environ. 2023, 32, 101014.
20. Chies, V. Small Properties Predominate in Caconde, the Largest Coffee Producer in SP. Available online: https://www.embrapa.br/busca-de-noticias/-/noticia/83223396/pequenas-propriedades-predominam-em-caconde-maior-produtor-de-cafe-de-sp (accessed on 25 April 2025).
21. ASN. Caconde Coffee Produced by Women Trained by Sebrae-SP Is Delivered to World Leaders at the G20. Available online: https://sp.agenciasebrae.com.br/Cultura-Empreendedora/Cafe-de-Caconde-Produzido-Por-Mulheres-Capacitadas-Pelo-Sebrae-Sp-e-Entregue-a-Liderancas-Mundiais-No-G20 (accessed on 25 April 2025).
22. SECOM. Vice President Signs Agreements to Promote Brazilian Coffee in China’s Largest Coffee Shop Chain. Available online: https://www.gov.br/secom/pt-br/assuntos/noticias/2024/06/vice-presidente-assina-acordos-para-promocao-do-cafe-brasileiro-na-maior-rede-de-cafeterias-da-china (accessed on 25 April 2025).
23. Ju, J.; Zhou, Q.; Freitag, B.; Roy, D.P.; Zhang, H.K.; Sridhar, M.; Mandel, J.; Arab, S.; Schmidt, G.; Crawford, C.J.; et al. The Harmonized Landsat and Sentinel-2 Version 2.0 surface reflectance dataset. Remote Sens. Environ. 2025, 324, 114723.
24. Claverie, M.; Ju, J.; Masek, J.G.; Dungan, J.L.; Vermote, E.F.; Roger, J.-C.; Skakun, S.V.; Justice, C. The Harmonized Landsat and Sentinel-2 surface reflectance data set. Remote Sens. Environ. 2018, 219, 145–161.
25. NASA. Harmonized Landsat and Sentinel-2. Available online: https://hls.gsfc.nasa.gov/ (accessed on 25 April 2025).
26. EMBRAPA. Semear Digital Project. Available online: https://www.semear-digital.cnptia.embrapa.br/ (accessed on 25 April 2025).
27. Alvares, C.A.; Stape, J.L.; Sentelhas, P.C.; De Moraes Gonçalves, J.L.; Sparovek, G. Köppen’s Climate Classification Map for Brazil. Meteorol. Z. 2013, 22, 711–728.
28. MapBiomas. MapBiomas Project (Projeto MapBiomas). Available online: https://plataforma.brasil.mapbiomas.org/cobertura?activebasemap (accessed on 25 April 2025).
29. Ronquim, C.C.; Rodrigues, C.A.G.; Franzin, J.P.; Scarazatti, B.; Alvarez, I.A.; Garçon, E.A.M. Spatial characterization and distribution of coffee and native forest areas in Caconde, SP. In Proceedings of the XX Brazilian Symposium on Remote Sensing, INPE, Florianópolis, Brazil, 2–5 May 2023; pp. 700–703, ISBN 978-65-89159-04-9.
30. Lopes, V.C.; Parente, L.L.; Baumann, L.R.F.; Miziara, F.; Ferreira, L.G. Land-use dynamics in a Brazilian agricultural frontier region, 1985–2017. Land Use Policy 2020, 97, 104740.
31. Nogueira, S.H.M.; Parente, L.L.; Ferreira, L.G. Temporal Visual Inspection: A Tool for the Visual Inspection of Points in Remote Sensing Time Series; Laboratory for Image Processing and Geoprocessing (LAPIG), Federal University of Goiás: Goiânia, Brazil, 2025; Available online: https://www.lapig.iesa.ufg.br/lapig/ (accessed on 25 April 2025).
32. Adami, M.; Moreira, M.A.; Barros, M.A.; Martins, V.A.; Friedrich, B.; Rudorff, T. Avaliação da Exatidão do Mapeamento da Cultura do Café No Estado de Minas Gerais. Available online: http://www.dsr.inpe.br/laf/cafesat/artigos/AvaliacaoExatidaoMapeamentoCafe.pdf (accessed on 25 April 2025).
33. Arantes, K.R.; de Faria, M.A.; Rezende, F.C. Recuperação do cafeeiro (Coffea arabica L.) após recepa, submetido a diferentes lâminas de água e parcelamentos da adubação. Acta Sci. Agron. 2009, 31, 313–319.
34. Wang, Q.; Wang, L.; Zhu, X.; Ge, Y.; Tong, X.; Atkinson, P.M. Remote sensing image gap filling based on spatial-spectral random forests. Sci. Remote Sens. 2022, 5, 100048.
35. Zhou, Q.; Xian, G.; Shi, H. Gap fill of land surface temperature and reflectance products in Landsat analysis ready data. Remote Sens. 2020, 12, 1192.
36. Hijmans, R.; Bivand, R.; Cordano, E.; Dyba, K.; Pebesma, E.; Sumner, M. Package ‘Terra’ (1.8-5). Available online: https://cran.r-project.org/web/packages/terra/index.html (accessed on 25 April 2025).
37. Bolfe, É.L.; Parreiras, T.C.; Silva, L.A.P.; Sano, E.E.; Bettiol, G.M.; Victoria, D.C.; Sanches, I.D.; Vicente, L.E. Mapping agricultural intensification in the Brazilian savanna: A machine learning approach using harmonized data from Landsat Sentinel-2. ISPRS Int. J. Geo-Inf. 2023, 12, 263.
38. Sano, E.E.; Bolfe, É.L.; Parreiras, T.C.; Bettiol, G.M.; Vicente, L.E.; Sanches, I.D.; Victoria, D.C. Estimating double cropping plantations in the Brazilian Cerrado through PlanetScope monthly mosaics. Land 2023, 12, 581.
39. Haralick, R.M.; Dinstein, I.; Shanmugam, K. Textural features for image classification. IEEE Trans. Syst. Man Cybern. 1973, SMC-3, 610–621.
40. Chen, S.; Useya, J.; Mugiyo, H. Decision-level fusion of Sentinel-1 SAR and Landsat 8 OLI texture features for crop discrimination and classification: Case of Masvingo, Zimbabwe. Heliyon 2020, 6, e05358.
41. Salas, E.A.L.; Boykin, K.G.; Valdez, R. Multispectral and texture feature application in image-object analysis of summer vegetation in eastern Tajikistan Pamirs. Remote Sens. 2016, 8, 78.
42. Tridawati, A.; Wikantika, K.; Susantoro, T.M.; Harto, A.B.; Darmawan, S.; Yayusman, L.F.; Ghazali, M.F. Mapping the distribution of coffee plantations from multi-resolution, multi-temporal, and multi-sensor data using a Random Forest algorithm. Remote Sens. 2020, 12, 3933.
43. Duveiller, G.; Hooker, J.; Cescatti, A. The mark of vegetation change on Earth’s surface energy balance. Nat. Commun. 2018, 9, 679.
44. Değermenci, A.S.; Zengin, H. Determination of land surface temperatures for some oak stands with Landsat 8 OLI satellite images: A case study from Turkey. Environ. Monit. Assess. 2024, 196, 1107.
45. Rouse, J.W.; Haas, R.H.; Schell, J.A.; Deering, D.W. Monitoring vegetation systems in the Great Plains with ERTS. NASA Tech. Rep. 1974. Available online: https://ntrs.nasa.gov/citations/19740022614 (accessed on 25 April 2025).
46. Gao, B. NDWI—A normalized difference water index for remote sensing of vegetation liquid water from space. Remote Sens. Environ. 1996, 58, 257–266.
47. Gitelson, A.A.; Kaufman, Y.J.; Merzlyak, M.N. Use of a green channel in remote sensing of global vegetation from EOS-MODIS. Remote Sens. Environ. 1996, 58, 289–298.
48. Huete, A.R. A soil-adjusted vegetation index (SAVI). Remote Sens. Environ. 1988, 25, 295–309.
49. Kuhn, M. Package “Caret”: Classification and Regression Training. Available online: https://cran.r-project.org/web/packages/caret/index.html (accessed on 25 April 2025).
50. Breiman, L.; Cutler, A.; Liaw, A.; Wiener, M. randomForest: Breiman and Cutler’s Random Forests for Classification and Regression. R Package Version 4.7-1.1. Available online: https://cran.r-project.org/package=randomForest (accessed on 13 June 2025).
51. Breiman, L. Random forests. Mach. Learn. 2001, 45, 5–32.
52. Biau, G.; Scornet, E. A Random Forest Guided Tour. TEST 2016, 25, 197–227.
53. Saini, R.; Ghosh, S.K. Crop classification in a heterogeneous agricultural environment using ensemble classifiers and single-date Sentinel-2A imagery. Geocarto Int. 2019, 36, 2141–2159.
54. Rafif, R.; Kusuma, S.S.; Saringatin, S.; Nanda, G.I.; Wicaksono, P.; Arjasakusuma, S. Crop intensity mapping using dynamic time warping and machine learning from multi-temporal PlanetScope data. Land 2021, 10, 1384.
55. Goldberg, K.; Herrmann, I.; Hochberg, U.; Rozenstein, O. Generating up-to-date crop maps optimized for Sentinel-2 imagery in Israel. Remote Sens. 2021, 13, 3488.
56. Chen, C.; Liang, J.; Sun, W.; Yang, G.; Meng, X. An automatically recursive feature elimination method based on threshold decision in random forest classification. Geo-Spat. Inf. Sci. 2024, 28, 1494–1519.
57. Domingos, P. A few useful things to know about machine learning. Commun. ACM 2012, 55, 78–87.
58. Nguyen, L.H.; Henebry, G.M. Characterizing land use/land cover using multi-sensor time series from the perspective of land surface phenology. Remote Sens. 2019, 11, 1677.
59. Bendini, H.N.; Fonseca, L.M.G.; Schwieder, M.; Körting, T.S.; Rufin, P.; Sanches, I.D.; Leitão, P.J.; Hostert, P. Detailed agricultural land classification in the Brazilian Cerrado based on phenological information from dense satellite image time series. Int. J. Appl. Earth Obs. Geoinf. 2019, 82, 101872.
60. Joshi, A.; Pradhan, B.; Gite, S.; Chakraborty, S. Remote-sensing data and deep-learning techniques in crop mapping and yield prediction: A systematic review. Remote Sens. 2023, 15, 2014.
61. Parreiras, T.C.; Bolfe, E.L.; Sano, E.S.; Victoria, D.C.; Sanches, I.D.; Vicente, L.E. Exploring the Harmonized Landsat Sentinel (HLS) datacube to map an agricultural landscape in the Brazilian savanna. ISPRS Arch. Photogramm. Remote Sens. Spatial Inf. Sci. 2022, XLIII-B3-2022, 967–973.
62. Parreiras, T.; Bolfe, É.; Chaves, M.; Sanches, I.; Sano, E.; Victoria, D.; Bettiol, G.; Vicente, L. Hierarchical classification of soybean in the Brazilian savanna based on Harmonized Landsat Sentinel data. Remote Sens. 2022, 14, 3736.
63. IBGE. Brazilian Institute of Geography and Statistics (Instituto Brasileiro de Geografia e Estatística). Municipal Agricultural Research (Pesquisa Agrícola Municipal). Available online: https://sidra.ibge.gov.br/pesquisa/pam/tabelas (accessed on 25 April 2025).
64. Lin, C.; Zhong, L.; Song, X.-P.; Dong, J.; Lobell, D.B.; Jin, Z. Early- and in-season crop type mapping without current-year ground truth: Generating labels from historical information via a topology-based approach. Remote Sens. Environ. 2022, 274, 112994.
65. Souza, C.G.; Arantes, T.B.; Carvalho, L.M.T.; Aguiar, P. Multitemporal variables for the mapping of coffee cultivation areas. Pesqui. Agropecu. Bras. 2019, 54, e00017.
66. Chemura, A.; Mutanga, O.; Dube, T. Integrating age in the detection and mapping of incongruous patches in coffee (coffea arabica) plantations using multi-temporal Landsat 8 NDVI anomalies. Int. J. Appl. Earth Obs. Geoinf. 2017, 57, 1–13.
67. Kawakubo, F.S.; Machado, R.P.P. Mapping coffee crops in Southeastern Brazil using spectral mixture analysis and data mining classification. Int. J. Remote Sens. 2016, 37, 3414–3436.
68. Xie, Y.; Sha, Z.; Yu, M. Remote sensing imagery in vegetation mapping: A review. J. Plant Ecol. 2008, 1, 9–23.
69. Veloso, G.A.; Ferreira, M.E.; Ferreira Júnior, L.G.; da Silva, B.B. Modelling gross primary productivity in tropical savanna pasturelands for livestock intensification in Brazil. Remote Sens. Appl. Soc. Environ. 2020, 17, 100288.
70. Akbari, E.; Boloorani, A.D.; Samany, N.N.; Hamzeh, S.; Soufizadeh, S.; Pignatti, S. Crop mapping using Random Forest and particle swarm optimization based on multi-temporal Sentinel-2. Remote Sens. 2020, 12, 1449.
71. Gaertner, J. Vegetation classification of coffea on Hawaii island using WorldView-2 satellite imagery. J. Appl. Remote Sens. 2017, 11, 046005.
72. Liu, J.; Liu, H.; Lv, Y.; Xue, X. Classification of high resolution imagery based on fusion of multiscale texture features. IOP Conf. Ser. Earth Environ. Sci. 2014, 17, 012217.
73. Gao, T.; Zhu, J.; Zheng, X.; Shang, G.; Huang, L.; Wu, S. Mapping spatial distribution of larch plantations from multi-seasonal Landsat-8 OLI imagery and multi-scale textures using random forests. Remote Sens. 2015, 7, 1702–1720.
74. Perna, C.; Pagliai, A.; Sarri, D.; Lisci, R.; Vieri, M. Can a LiDAR and multispectral sensor discriminate canopy structure changes due to pruning in olive growing? A field experimentation. Sensors 2024, 24, 7894.
75. Zou, X.; Mõttus, M. Sensitivity of common vegetation indices to the canopy structure of field crops. Remote Sens. 2017, 9, 994.
76. Psaltakis, G.; Rogdakis, K.; Loizos, M.; Kymakis, E. One-vs-one, one-vs-rest, and a novel outcome-driven one-vs-one binary classifiers enabled by optoelectronic memristors towards overcoming hardware limitations in multiclass classification. Discov. Mater. 2024, 4, 7.
77. Garcia, A.S.; Vilela, V.M.F.N.; Rizzo, R.; West, P.; Gerber, J.S.; Engstrom, P.M.; Ballester, M.V.R. Assessing land use/cover dynamics and exploring drivers in the Amazon’s arc of deforestation through a hierarchical, multi-scale and multi-temporal classification approach. Remote Sens. Appl. Soc. Environ. 2019, 15, 100233.
78. Li, J.; Zhang, B.; Huang, X. A hierarchical category structure based convolutional recurrent neural network (HCS-ConvRNN) for land-cover classification using dense MODIS time-series data. Int. J. Appl. Earth Obs. Geoinf. 2022, 108, 102744.
79. Chaves, M.E.D.; Picoli, M.C.A.; Sanches, I.D. Recent applications of landsat 8/oli and sentinel-2/msi for land use and land cover mapping: A systematic review. Remote Sens. 2020, 12, 3062.
80. Fang, P.; Zhang, X.; Wei, P.; Wang, Y.; Zhang, H.; Liu, F.; Zhao, J. The classification performance and mechanism of machine learning algorithms in winter wheat mapping using Sentinel-2 10 m resolution imagery. Appl. Sci. 2020, 10, 5075.
81. Li, R.; Gao, X.; Shi, F. A framework for subregion ensemble learning mapping of land use/land cover at the watershed scale. Remote Sens. 2024, 16, 3855.
82. da Silva Júnior, C.A.; Santi, A.L.; Uribe-Opazo, M.A.; da Silva, A.P.; Bazzi, C.L.; Bazzi, H. Coffee-Yield Estimation Using High-Resolution Time-Series Satellite Images and Machine Learning. Agriengineering 2022, 4, 888–902.
83. Martello, M.; Ferreira, M.P.; Ruhoff, A.L.; Zanon, J.A.; Dalmolin, A.C. Integrated approach for modeling soybean yield using remote sensing data and machine learning algorithms. PLoS ONE 2020, 15, e0230013.
84. Liu, Z.; Li, N.; Wang, L.; Zhu, J.; Qin, F. A multi-angle comprehensive solution based on deep learning to extract cultivated land information from high-resolution remote sensing images. Ecol. Indic. 2022, 141, 108961.
85. Kou, W.; Shen, Z.; Zhang, Y.; Wang, H.; Ji, P.; Huang, L.; Zhang, C.; Ma, Y. Spatio-temporal analysis of agroforestry systems in Hotan using multi-source remote sensing and deep learning. Smart Agric. Technol. 2024, 9, 100641.

Figure 1. (A) Location map of the municipality of Caconde on the border between the states of São Paulo (SP) and Minas Gerais (MG), Brazil. Neighboring municipality names are shown in Portuguese (original language). (B) A true-color (Red-Green-Blue) PlanetScope mosaic acquired over the municipality of Caconde in June 2024.

Figure 2. Maps of elevation (A), terrain slope (B), and soil types (C) of the municipality of Caconde, São Paulo State, Brazil. The slope map was produced based on the Copernicus DEM GLO-30, available in the Google Earth Engine platform. The soil map was obtained from the São Paulo State Government.

Figure 3. Field photos showing the coffee production stages (planting, producing, skeleton pruning, and renovation with stumping) found in the municipality of Caconde, São Paulo State, Brazil, in October 2023.

Figure 4. The hierarchical classification scheme adopted in this study. Water bodies and non-vegetated areas were masked based on the MapBiomas Project [28] data from 2023. The color scheme is for visualization purposes only and does not imply any intrinsic classification meaning. Arrows indicate the hierarchical relationship between classification levels, where classes at higher levels are progressively subdivided into more detailed categories at subsequent levels.

Figure 5. Spatial distribution of training and validation samples used in the four classification levels (Level 1—L1, Level 2—L2, Level 3—L3, and Level 4—L4) involving land use and land cover classes, as well as coffee production stages.

Figure 6. Variable and temporal configurations tested in the classification framework. Each row represents a type of input data or temporal subset: multispectral bands (MS), spectral indices (SIs), land surface temperature (LST), texture metrics from Gray-Level Co-occurrence Matrix (GLCM), and the full-year and dry-season datasets. Columns correspond to the different test configurations. Green cells (1) indicate the inclusion of a variable group or time window in the corresponding test, while gray cells (0) indicate exclusion. Tests 1 to 6 were applied to classification Levels 1 to 3, while Test 7 was specifically designed for Level 4 (highlighted by the red box), focused on coffee phenological stages. MS includes the blue (B), green (G), red (R), near-infrared (NIR), and shortwave infrared (SWIR) bands; spectral indices (SIs) include normalized difference vegetation index (NDVI), normalized difference water index (NDWI), green normalized difference vegetation index (GNDVI), and soil-adjusted vegetation index (SAVI); LST refers to monthly median LST; and GLCM texture metrics included contrast (CON), dissimilarity (DIS), homogeneity (HOM), angular second moment (SEM), entropy (ENT), variance (VAR), and correlation (COR).
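
The four spectral indices listed in the caption can be sketched as follows. This is a minimal illustration based on the commonly cited formulations (Rouse et al. for NDVI, Gao for NDWI, Gitelson et al. for GNDVI, Huete for SAVI), not the authors' code. The function name `spectral_indices`, the choice of a single SWIR band for NDWI, and the soil factor of 0.5 are assumptions made for the example.

```python
import numpy as np

def spectral_indices(green, red, nir, swir, soil_factor=0.5):
    """Return NDVI, NDWI, GNDVI and SAVI from surface reflectance arrays.

    Assumptions: reflectance is scaled to 0-1, NDWI follows the NIR/SWIR
    form, and soil_factor=0.5 is the commonly used SAVI adjustment.
    """
    green, red, nir, swir = (
        np.asarray(band, dtype=float) for band in (green, red, nir, swir)
    )

    def normalized_difference(a, b):
        total = a + b
        # Return NaN where both bands are zero (for example, masked pixels).
        return np.divide(a - b, total,
                         out=np.full_like(total, np.nan),
                         where=total != 0)

    return {
        "NDVI": normalized_difference(nir, red),
        "NDWI": normalized_difference(nir, swir),
        "GNDVI": normalized_difference(nir, green),
        "SAVI": (1.0 + soil_factor) * (nir - red) / (nir + red + soil_factor),
    }
```

Applied to each monthly composite, a function of this kind would yield one layer per index and month, which is the shape of the SIs dataset described above.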

Figure 7. Schematic diagram illustrating the hierarchical classification workflow used to generate the final maps for Levels 1 to 4. At each level, multiple classification models were trained using different combinations (COMB. 1, 2, and 3) of input variables, multispectral bands (MS), spectral indices (SIs), their combination (MS + SIs), Gray-Level Co-occurrence Matrix (GLCM) texture features, and land surface temperature (LST), and different train/test split ratios (0.4, 0.5, 0.6, 0.7, which means 40/60%, 50/50%, 60/40%, and 70/30%, respectively). For Levels 1 to 3, Random Forest (RF) models were applied to both all-year and dry-season datasets (MS, SIs, and MS + SIs), and the best-performing combination (BEST COMB.) was selected to generate a modal (Mo) map. At Level 4, the full dataset (MS + SIs + GLCM + LST) was subjected to recursive feature elimination (RFE), and the final classification was obtained through the ensemble (mode) of all RF and Extreme Gradient Boosting (XGBoost) spatial predictions. In all cases, the final map at each level was spatially masked to retain only the pixels identified as belonging to the target class in the previous classification level. The color scheme is for visualization purposes only and does not imply any intrinsic classification meaning. The horizontal arrows indicate the transition from model training and selection (left) to the generation of the final spatial prediction using the selected combination (right). The vertical arrows represent the hierarchical flow between classification levels, where the output of each level constrains the subsequent level.
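
The two operations that recur at every level of this workflow, the per-pixel mode across model predictions and the masking by the previous level, can be sketched as below. This is an illustrative sketch and not the authors' implementation: the function names, the integer class coding, the no-data value of -1, and the tie-breaking rule (lowest class id wins) are all assumptions.

```python
import numpy as np

def modal_map(predictions):
    """Per-pixel mode of integer class maps stacked along axis 0."""
    stack = np.asarray(predictions)
    n_classes = int(stack.max()) + 1
    # Count the votes each class receives at every pixel.
    votes = np.stack([(stack == c).sum(axis=0) for c in range(n_classes)])
    # argmax returns the lowest class id when votes are tied (assumed rule).
    return votes.argmax(axis=0)

def mask_to_parent(child_map, parent_map, target_class, nodata=-1):
    """Keep child-level labels only where the parent level predicted target_class."""
    return np.where(np.asarray(parent_map) == target_class, child_map, nodata)

# Three hypothetical model outputs for a 2 x 2 tile.
preds = [
    [[0, 1], [2, 2]],
    [[0, 1], [1, 2]],
    [[1, 1], [2, 0]],
]
level_map = modal_map(preds)
parent = np.array([[1, 1], [0, 1]])   # hypothetical previous-level map
constrained = mask_to_parent(level_map, parent, target_class=1)
```

With four models per level, ties are possible, so whatever tie-breaking rule is used in practice should be stated alongside the modal map.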

Figure 8. Final spatial predictions for levels 1–4 and the resulting Land Use/Land Cover map of the municipality of Caconde in 2023. The Level 1 map was generated as the mode of four Random Forest models trained with the all-year multispectral (MS) dataset. The Level 2 map corresponds to the mode of four Random Forest models using the all-year multispectral plus spectral indices dataset (MS + SIs). The Level 3 map represents the mode of four Random Forest models based on the all-year MS dataset. The Level 4 map is the ensemble (mode) of eight models, combining Random Forest and Extreme Gradient Boosting. The Final Map is an overlay of the maps from levels 1–4 plus the areas previously masked with MapBiomas data [28].

Figure 9. Total variable importance by month at Level 2 (left), and the top 10 ranked variables (right) based on the average Mean Decrease Accuracy across models trained on the all-year combined multispectral and spectral index (MS + SIs) dataset.

Figure 10. Total variable importance per type of variable at Level 2, considering the multispectral bands and spectral indices datasets. Colors are for illustrative purposes only.

Figure 11. Total variable importance by month at Level 4 models for Random Forest (left) and Extreme Gradient Boosting (right). For each month, the total importance corresponds to the sum of the importances of all variables measured in that month.
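
The monthly totals described in this caption amount to a grouped sum. The sketch below assumes, purely for illustration, that each variable is named `<variable>_<month>` (for example `NDVI_Jun`); the naming scheme and the function name are hypothetical and are not taken from the study.

```python
from collections import defaultdict

def importance_by_month(importances):
    """Sum variable importances per month for names shaped like 'NDVI_Jun'."""
    totals = defaultdict(float)
    for name, value in importances.items():
        # The month is assumed to be the token after the last underscore.
        month = name.rsplit("_", 1)[-1]
        totals[month] += value
    return dict(totals)

# Hypothetical importances, not values from the paper.
example = {"NDVI_Jun": 0.2, "SWIR_Jun": 0.1, "NDVI_Jul": 0.4}
monthly = importance_by_month(example)
```

Because Random Forest and XGBoost importances are on different scales (Mean Decrease Accuracy versus normalized gain), such totals are comparable between months within one model type rather than between the two panels.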

Figure 12. Top 10 ranked variables based on average Mean Decrease Accuracy and overall normalized importance across Random Forest (left) and Extreme Gradient Boosting (right) models. Variables marked with * are based on texture analysis. SEM = angular second moment; and VAR = variance.

Figure 13. Temporal profiles of the monthly averages of the most important variables for the mapped classes: Red band at Level 1 (A); GNDVI at Level 2 (B); SWIR band at Level 3 (C); and NDVI at Level 4 (D). These profiles highlight variations in vegetative vigor and plant moisture, as represented by the red band, Green Normalized Difference Vegetation Index (GNDVI), Soil-Adjusted Vegetation Index (SAVI), and shortwave infrared (SWIR) band. Month abbreviations in blue indicate the rainy season (October to March), while those in red indicate the dry season (April to September). The color scheme is for visualization purposes only and does not imply any intrinsic classification meaning.

Figure 14. Total and relative importance of texture variables derived from the Gray-Level Co-occurrence Matrix (GLCM) in Level 4 models applied to Random Forest (RF) (A) and Extreme Gradient Boosting (XGBoost) (B) classifiers. In the RF models, variable importance was measured using the Mean Decrease Accuracy (MDA). In contrast, in the XGBoost models, it was based on overall normalized importance, representing the information gain contributed by each variable during training. The analysis considered the following texture features: angular second moment (ASM), contrast (CON), correlation (COR), dissimilarity (DIS), entropy (ENT), homogeneity (HOM), and variance (VAR).
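
For readers unfamiliar with GLCM texture, the sketch below computes five of the listed metrics from a small quantized image using the standard Haralick-style definitions. It is a simplified illustration under stated assumptions: a single horizontal offset of one pixel, a symmetric and normalized matrix, and no moving window. The study's window size, offsets, and quantization are not given in this section, and variance and correlation are omitted for brevity.

```python
import numpy as np

def glcm_features(image, levels):
    """Texture metrics from a symmetric, normalized GLCM for the (0, 1) offset.

    `image` must hold integer gray levels in the range 0..levels-1.
    """
    img = np.asarray(image)
    glcm = np.zeros((levels, levels), dtype=float)
    # Count horizontally adjacent gray-level pairs.
    np.add.at(glcm, (img[:, :-1].ravel(), img[:, 1:].ravel()), 1.0)
    glcm += glcm.T               # make the matrix symmetric
    p = glcm / glcm.sum()        # convert counts to joint probabilities
    i, j = np.indices(p.shape)
    nonzero = p[p > 0]
    return {
        "CON": float((p * (i - j) ** 2).sum()),             # contrast
        "DIS": float((p * np.abs(i - j)).sum()),            # dissimilarity
        "HOM": float((p / (1.0 + (i - j) ** 2)).sum()),     # homogeneity
        "ASM": float((p ** 2).sum()),                       # angular second moment
        "ENT": float(-(nonzero * np.log(nonzero)).sum()),   # entropy
    }

uniform_rows = glcm_features([[0, 0], [1, 1]], levels=2)
alternating = glcm_features([[0, 1]], levels=2)
```

In the first toy image every neighboring pair has equal gray levels, so contrast is zero and homogeneity is one; in the second the only pair differs by one level, which raises contrast and lowers homogeneity.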

Table 2. Random Forest classification results for Level 2 with all-year and dry-season datasets. Three variable combinations were evaluated: multispectral bands (MS), spectral indices (SIs), and their combination (MS + SIs).

Datasets	Split	Accuracy (All-Year)	Accuracy (Dry-Season)	Significance (All-Year)	Significance (Dry-Season)
MS + SIs	0.4	0.956	0.930	***	***
MS + SIs	0.5	0.933	0.926	***	***
MS + SIs	0.6	0.920	0.942	***	***
MS + SIs	0.7	0.947	0.929	***	***
Average		0.939	0.932
MS	0.4	0.938	0.921	***	***
MS	0.5	0.951	0.905	***	***
MS	0.6	0.929	0.938	***	***
MS	0.7	0.924	0.912	***	***
Average		0.935	0.919
SIs	0.4	0.933	0.918	***	***
SIs	0.5	0.923	0.930	***	***
SIs	0.6	0.938	0.903	***	***
SIs	0.7	0.947	0.924	***	***
Average		0.935	0.918

Level of significance: *** p ≤ 0.001.

Table 3. Random Forest sensitivity and specificity results for all-year (AY) and dry-season (DS) models from Level 2, considering dataset combinations. MS = multispectral; SIs = spectral indices; MS + SIs = combination of multispectral bands and spectral indices.

Datasets	Split	Sensitivity						Specificity
		Perennial Crops		Pasture		Annual Crops		Perennial Crops		Pasture		Annual Crops
		AY	DS	AY	DS	AY	DS	AY	DS	AY	DS	AY	DS
MS + SIs	0.4	0.977	0.924	0.966	0.972	0.813	0.719	0.986	0.986	0.969	0.926	0.977	0.971
MS + SIs	0.5	0.972	0.972	0.939	0.959	0.741	0.556	0.977	0.954	0.949	0.934	0.969	0.984
MS + SIs	0.6	0.920	0.989	0.966	0.966	0.667	0.619	0.978	0.986	0.889	0.935	0.985	0.980
MS + SIs	0.7	0.938	0.969	0.955	0.989	0.938	0.438	0.990	0.990	0.951	0.901	0.974	0.981
Average		0.952	0.964	0.957	0.972	0.790	0.583	0.983	0.979	0.940	0.924	0.976	0.979
MS	0.4	0.962	0.927	0.983	0.966	0.594	0.481	0.976	0.966	0.908	0.882	0.997	0.981
MS	0.5	0.991	0.989	0.980	0.949	0.630	0.667	0.971	0.957	0.941	0.944	0.996	0.990
MS	0.6	0.977	0.985	0.966	0.944	0.524	0.438	0.978	0.962	0.917	0.889	0.980	0.987
MS	0.7	0.969	0.924	0.966	0.989	0.500	0.500	0.971	0.967	0.889	0.896	0.994	0.987
Average		0.975	0.956	0.974	0.962	0.562	0.522	0.974	0.963	0.914	0.903	0.992	0.986
SIs	0.4	0.924	0.874	0.972	0.958	0.750	0.714	0.976	0.971	0.920	0.889	0.984	0.971
SIs	0.5	0.991	0.938	0.939	0.978	0.556	0.563	0.931	0.962	0.971	0.914	0.977	0.987
SIs	0.6	0.954	0.924	0.975	0.972	0.667	0.719	0.978	0.986	0.917	0.926	0.990	0.971
SIs	0.7	0.923	0.972	0.989	0.959	0.813	0.556	0.990	0.954	0.938	0.934	0.981	0.984
Average		0.948	0.927	0.969	0.967	0.697	0.638	0.969	0.968	0.937	0.916	0.983	0.978
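
Per-class sensitivity and specificity of the kind reported in Table 3 are one-vs-rest quantities derived from a confusion matrix. The sketch below shows the standard calculation. The matrix orientation (reference classes in rows, predictions in columns), the function name, and the counts are assumptions for illustration and do not come from the study.

```python
import numpy as np

def sensitivity_specificity(confusion):
    """Per-class one-vs-rest sensitivity and specificity.

    `confusion[r][c]` is the number of samples of reference class r
    that were predicted as class c.
    """
    cm = np.asarray(confusion, dtype=float)
    tp = np.diag(cm)                  # correctly labeled samples per class
    fn = cm.sum(axis=1) - tp          # class samples labeled as something else
    fp = cm.sum(axis=0) - tp          # other samples labeled as this class
    tn = cm.sum() - tp - fn - fp      # everything else
    return tp / (tp + fn), tn / (tn + fp)

# Hypothetical three-class confusion matrix (not data from the paper).
cm = [[8, 2, 0],
      [1, 9, 0],
      [0, 1, 9]]
sens, spec = sensitivity_specificity(cm)
```

A class with few validation samples, such as annual crops here, can show large swings in sensitivity between splits because a single misclassified sample shifts the ratio considerably.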

Table 4. Accuracy metrics of all Random Forest (RF) and Extreme Gradient Boosting (XGBoost) models for Level 4, with features selected by recursive feature elimination (RFE).

Model	Split	Accuracy	p
RF	0.4	0.863	***
RF	0.5	0.839	***
RF	0.6	0.878	***
RF	0.7	0.861	***
Average	-	0.835
XGBoost	0.4	0.84	***
XGBoost	0.5	0.871	***
XGBoost	0.6	0.857	***
XGBoost	0.7	0.917	***
Average	-	0.838

Level of significance: *** p ≤ 0.001.

Table 5. Sensitivity and Specificity metrics from the Level 4 classes: PL = planting; PR = producing; SK = skeleton pruning; and ST = renovation with stumping.

		Sensitivity				Specificity
Model	Split	PL	PR	SK	ST	PL	PR	SK	ST
RF	0.4	0.75	0.917	1	0.667	1	0.98	0.902	0.931
RF	0.5	0.9	0.95	0.895	0.538	0.962	0.976	0.907	0.939
RF	0.6	0.875	1	1	0.5	0.976	0.97	0.912	0.974
RF	0.7	0.833	0.917	0.909	0.714	1	0.875	0.96	0.966
Average		0.840	0.946	0.951	0.605	0.985	0.950	0.920	0.953
XGBoost	0.4	0.917	0.833	0.955	0.667	1	0.98	0.882	0.931
XGBoost	0.5	0.9	0.95	1	0.538	1	0.929	0.884	1
XGBoost	0.6	0.875	1	1	0.4	1	0.939	0.853	1
XGBoost	0.7	0.833	1	1	0.714	1	0.917	1	0.966
Average		0.881	0.946	0.989	0.580	1.000	0.941	0.905	0.974

Table 6. Summary of data, methods, and results from remote sensing and machine learning studies on coffee mapping.

Classes	Satellite	Resolution	Scale; n Images	Methods; Variables	Classification	Main Results	Reference
Coffee (sun-grown)	Landsat 8 OLI, Landsat 8 TIRS	30 m	Regional; 429 (reduced to 3 medians)	Bands, LST, Tasseled Cap, SRTM	Supervised, RF, pixel-based	μ_CE: 35%, μ_OE: 31%	Manoel et al. [16]
Coffee	Sentinel-2 MSI	10 m	Regional	14 spectral indices with 5 and 16-day resolution	Supervised, RF, pixel-based	PA: 100%; UA: 75%	Chaves & Sanches [19]
Sun-grown, intercropped (shade), newly planted	Sentinel-1 SAR, Sentinel-2 MSI	10 m	Regional; 66 (reduced to 2 seasonal medians)	Bands, SIs, SAR metrics, SRTM	Supervised, RF, pixel-based	PA: 56%, 52%, 65%; UA: 65%, 56%, 71%	Maskell et al. [14]
Coffee (agroforestry/shade)	GeoEye-1; Sentinel-2	0.5 m (resampled)	Local; 3	NDVI, Tasseled Cap, SRTM, GLCM	Supervised, RF, pixel-based	PA: 92%, UA: 91%	Tridawati et al. [42]
Coffee (sun-grown)	RapidEye, Landsat 5 TM	5 m, 30 m	Regional; 195 (reduced into metrics)	NDVI and GetStatistic: IAV, stdIAV, AAT, MAC, SSA	Supervised, SVM, GEOBIA	PA: 94%; UA: 90%	Souza et al. [65]
Coffee (shade)	Landsat 8 OLI, Landsat 8 TIRS	30 m	Regional; 143 (reduced to 3 seasonal medians)	KT metrics, LST, SRTM	Supervised, RF, pixel-based	PA: 86%; UA: 80%	Kelley et al. [15]
Coffee (sun-grown)	WorldView-2	1.85 m	Regional; 2	Bands	Supervised; (1) pixel-based + MLC; (2) GEOBIA + SVM	(1) PA: 72%, UA: 69%; (2) PA: 72%, UA: 94%	Gaertner et al. [71]
Young, mature, old (sun-grown)	Landsat 8 OLI	30 m	Local; 2	Bands	Supervised, RF, pixel-based	PA: 70%, 80%, 78%; UA: 81%, 70%, 72%	Chemura et al. [66]
Producing, old-pruned, mixed (sun-grown)	IRS, Resourcesat 2 LISS-3	23.5 m	Local; 1	SMA; PCA	Supervised, DT, GEOBIA	PA: 74%, 79%, 71%	Kawakubo & Machado [67]

AAT—Annual Aggregated Time-Series (NDVI-based GetStatistic metric); CE—Commission Error; DT—Decision Tree classifier; ETM+—Enhanced Thematic Mapper; GEOBIA—Geographic Object-Based Image Analysis; GLCM—Gray-Level Co-occurrence Matrix; IAV—Inter-Annual Variability (NDVI-based GetStatistic metric); IRS—Indian Remote Sensing; KT—Kauth–Thomas transformation (metrics: greenness, brightness, and wetness); LISS—Linear Imaging Self Scanning; MAC—Mean Annual Cycle (NDVI-based GetStatistic metric); MLC—Maximum Likelihood Classifier; MSI—Multispectral Instrument; NDVI—Normalized Difference Vegetation Index; OLI—Operational Land Imager; PA—Producers’ Accuracy; PCA—Principal Component Analysis; RF—Random Forest; SAR—Synthetic Aperture Radar; SMA—Spectral Mixture Analysis; SRTM—Shuttle Radar Topography Mission (elevation, slope, and aspect); SSA—Singular Spectrum Analysis (NDVI-based GetStatistic metric); stdIAV—Standard Deviation of Inter-Annual Variability (NDVI-based GetStatistic metric); SVM—Support Vector Machines; TIRS—Thermal Infrared Sensor; TM—Thematic Mapper; UA—Users’ Accuracy; OE—Omission Error.
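
Table 6 mixes two reporting conventions: most rows give Producers' and Users' Accuracy, while the first row gives commission and omission errors. The sketch below converts between them using the standard complements (UA = 100% − CE, PA = 100% − OE). It assumes that the μ prefix in the first row denotes a mean error, which is not stated in the table; the helper name `error_to_accuracy` is hypothetical.

```python
def error_to_accuracy(error_pct: float) -> float:
    """Convert a commission or omission error (in %) to its complementary accuracy (in %)."""
    if not 0.0 <= error_pct <= 100.0:
        raise ValueError("error must be a percentage between 0 and 100")
    return 100.0 - error_pct


# Values from the first row of Table 6 (assumed here to be mean errors).
mean_commission_error = 35.0
mean_omission_error = 31.0

users_accuracy = error_to_accuracy(mean_commission_error)      # complement of CE
producers_accuracy = error_to_accuracy(mean_omission_error)    # complement of OE

print(f"UA: {users_accuracy:.0f}%, PA: {producers_accuracy:.0f}%")
```

Under that assumption, the first row corresponds to a UA of 65% and a PA of 69%, which places it on the same scale as the remaining rows.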

© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
